Text block is ignored when reading file to xml

  • Last Post 26 September 2013
  • Topic Is Solved
als2002 posted this 12 September 2013

Good day!

I have a strange problem with text recognition for a file containing table-like diagrams. The source .png file, the resulting .xml, and the job data can be downloaded from https://dl.dropboxusercontent.com/u/25356815/ocr.zip

While the result file shows a pretty good level of recognition, it completely misses the "Seafood" section in the lower right part of the page. The food names there do not differ much from the rest of the page in terms of image quality.

Is this some kind of test drive limitation, or is this a bug?

Hope to hear from You soon,


ADDED: I have more samples of similar documents with the same problem...

als2002 posted this 24 September 2013

Is there any chance of getting an answer?

Anastasia Galimova posted this 24 September 2013

Dear Alexander,

The issue occurs because of the complex document structure. We recommend using the TextExtraction profile for documents of this type.
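
For readers wondering how the profile is selected, here is a minimal sketch. It assumes the service is a REST-style OCR API where the profile is passed as a `profile` query parameter with the value `textExtraction` on a `processImage` call; the thread itself does not confirm the endpoint URL, the parameter names, or the exact spelling of the value, so check them against the documentation for your SDK. The helper `build_process_image_url` is hypothetical and only builds the URL, without sending anything.

```python
from urllib.parse import urlencode

# Assumption: endpoint and parameter names are illustrative, not taken from the thread.
BASE_URL = "https://cloud.ocrsdk.com/processImage"


def build_process_image_url(profile="textExtraction",
                            language="English",
                            export_format="xml"):
    """Build a request URL that asks for the given processing profile.

    Hypothetical helper: it only assembles the query string and does not
    contact any server.
    """
    query = urlencode({
        "language": language,
        "exportFormat": export_format,
        "profile": profile,  # the setting recommended in the reply above
    })
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    print(build_process_image_url())
```

The image itself would then be sent in the body of an authenticated POST request to that URL, using the credentials for your own account.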

als2002 posted this 26 September 2013

That did the trick! Thank You!