Good day!

I have a strange problem with text recognition for a file containing table-like diagrams. The source .png file, as well as the resulting .xml and job data, can be downloaded from https://dl.dropboxusercontent.com/u/25356815/ocr.zip

While the result file shows a pretty good level of recognition, it completely misses the "Seafood" section in the lower right part of the page. The food names there do not differ much from the rest of the page in terms of image quality.

Is this some kind of test drive limitation, or is this a bug?

Hope to hear from You soon,

Alexander

ADDED: I have more sample documents of a similar type that show the same problem.

asked 12 Sep '13, 22:25

als2002

edited 13 Sep '13, 20:46

Is there any chance of getting an answer?

(24 Sep '13, 12:12) als2002

Dear Alexander,

The issue occurs because of the complex document structure. We recommend using the TextExtraction profile for this type of document.
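The thread does not say which API the job was submitted through, so the following is only a minimal sketch. It assumes the Cloud OCR SDK `processImage` method, where the profile is passed as a `profile` query parameter with the value `textExtraction`. The endpoint URL, the helper name `build_process_image_url`, and the other parameter values are assumptions for illustration and should be checked against the documentation for your account.

```python
from urllib.parse import urlencode

# Assumed endpoint of the processImage method; verify it for your account.
PROCESS_IMAGE_URL = "https://cloud.ocrsdk.com/processImage"


def build_process_image_url(profile="textExtraction",
                            language="English",
                            export_format="xml"):
    """Build a processImage URL that requests the given processing profile.

    This only constructs the URL; it does not send any request.
    """
    params = {
        "language": language,
        "profile": profile,          # the profile recommended in the answer
        "exportFormat": export_format,
    }
    return f"{PROCESS_IMAGE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_process_image_url())
```

The image itself would then be sent as the body of a POST request to this URL, authenticated with your application credentials. That step is left out here because it depends on the account setup.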

answered 24 Sep '13, 17:28

Anastasia Ga... ♦♦

That did the trick! Thank You!

(26 Sep '13, 13:28) als2002

Asked: 12 Sep '13, 22:25

Seen: 1,234 times

Last updated: 26 Sep '13, 13:31

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal