Hi,

I'm developing an application that takes a photo of a page of text with an iPhone, sends the photo to the OCR SDK, and uses the extracted text to perform searches. Since this is a real-time system, speed is important. However, the OCR results will be used to select the best-matching entry from a product database, so accuracy is even more important.

Which parameter settings would optimize accuracy for this type of use case?

Specifically, what should the settings be for the following parameters?

  • profile: which one of these: documentConversion, documentArchiving, or textExtraction?
  • textType: assuming it should be normal?
  • imageSource: assuming it should be photo?
  • correctOrientation: assuming true, since the iPhone can be held in any orientation?
  • correctSkew: assuming true, since the user can take a picture at any angle?

I'm also assuming that none of the other parameters will affect accuracy.

asked 21 Jul '14, 11:58

hizzo

edited 21 Jul '14, 23:11


Hi,

If you need to extract all text from the input image, including small, low-quality text areas, set profile=textExtraction. With this profile the document's appearance and structure are ignored, pictures and tables are not detected, and processing speed increases.

If you are recognizing ordinary typographic text, textType=normal is the most suitable choice. Please see more information about text types here.

The other settings (imageSource, correctOrientation, and correctSkew) can have the values you assumed.

You could also set readBarcodes=false (it defaults to true if you export to XML).
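
For illustration, here is a minimal sketch of how the settings discussed in this thread might be combined into a single request URL. The parameter names and values come from the question and answer above; the endpoint URL and the helper name build_process_image_url are assumptions made for this example, so please check them against the documentation for your account. The sketch only builds the URL and does not upload an image or handle authentication.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration only; verify it for your own setup.
PROCESS_IMAGE_URL = "https://cloud.ocrsdk.com/processImage"


def build_process_image_url(base_url=PROCESS_IMAGE_URL):
    """Return a request URL carrying the settings suggested in this thread.

    Hypothetical helper: it only assembles the query string.
    """
    params = {
        "profile": "textExtraction",   # extract all text, ignore layout
        "textType": "normal",          # ordinary typographic text
        "imageSource": "photo",        # image comes from a phone camera
        "correctOrientation": "true",  # phone may be held in any orientation
        "correctSkew": "true",         # picture may be taken at an angle
        "readBarcodes": "false",       # barcodes are not needed here
    }
    return base_url + "?" + urlencode(params)


url = build_process_image_url()
print(url)
```

Keeping the settings in one dictionary makes it easy to change a single value, for example the profile, and compare recognition results on the same set of test photos.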

answered 24 Jul '14, 09:55

Natalia Kara...


© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal