The pages I will be doing OCR on might not always be scanned exactly the same way, although the format on the printed page is the same. This means that the data will have x-y and angular bias. Is there any way to detect the x-y bias - maybe through detecting the location of some constant text within the scanned images?

Thanks, Adam

asked 04 Jul '12, 01:41



edited 05 Jul '12, 10:32


Vasily Panferov ♦♦

If you are performing full-page recognition via processImage or processDocument calls, you can get this information from the XML output. The coordinates of the recognized characters are given in the original image's coordinate system, so you need to find the same text areas on different pages and compare their coordinates.
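As a rough illustration of looking up a known piece of text in such output, here is a minimal Python sketch. It assumes the XML contains `line` elements holding one `charParams` element per character, each with `l`, `t`, `r`, `b` pixel attributes; check that against the XML you actually receive. The `SAMPLE` document, its placeholder namespace, and the helper name `find_anchor` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the assumed shape; the namespace is a placeholder.
SAMPLE = """<document xmlns="urn:example:ocr">
<page><block><text><par>
<line l="100" t="50" r="130" b="70"><formatting><charParams l="100" t="50" r="110" b="70">I</charParams><charParams l="112" t="50" r="122" b="70">D</charParams><charParams l="124" t="52" r="130" b="70">:</charParams></formatting></line>
</par></text></block></page></document>"""


def local(tag):
    """Strip an XML namespace prefix such as '{urn:example:ocr}line'."""
    return tag.rsplit("}", 1)[-1]


def find_anchor(xml_text, anchor):
    """Return (l, t, r, b) of the first occurrence of `anchor` within a
    single text line, or None if it is not found."""
    root = ET.fromstring(xml_text)
    for line in root.iter():
        if local(line.tag) != "line":
            continue
        chars = [c for c in line.iter() if local(c.tag) == "charParams"]
        # One character per element keeps string indices aligned with `chars`.
        text = "".join((c.text or " ")[:1] for c in chars)
        start = text.find(anchor)
        if start < 0:
            continue
        hit = chars[start:start + len(anchor)]
        return (
            min(int(c.get("l")) for c in hit),
            min(int(c.get("t")) for c in hit),
            max(int(c.get("r")) for c in hit),
            max(int(c.get("b")) for c in hit),
        )
    return None
```

Calling `find_anchor` on a reference page and on a newly scanned page, with the same anchor string, gives two rectangles whose difference is the x-y bias at that anchor.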

UPDATE: You are performing field-level recognition via processTextField. In your case this can't be done in a single step, since you need to know the exact positions of the different fields. Once a page has been recognized via processImage or processDocument, its subsequent recognition via any method, including processFields, is free. So you need to run processImage or processDocument with XML output, determine the coordinates of the fields, and then run processTextField on the same file, which will be free.

However, to avoid abuse of the service, we have implemented a limit on free pages per day: 300 free pages per application per day. Beyond that limit, all subsequent re-recognitions are billed.

Another issue is document rotation. In processImage and processDocument it is corrected automatically. In the process*Field methods we cannot do that, because all the coordinates the user passed to us would become invalid.

Right now I don't see an elegant solution to your problem, so any suggestions are welcome.


answered 04 Jul '12, 11:47


Vasily Panferov ♦♦

edited 05 Jul '12, 10:30


I am using the processFields method. Does this mean that I will need to process every page twice (once for bias estimation and once for OCR) every time? That means I would be charged for two pages for every page I need to read.

Thanks, Adam

(04 Jul '12, 18:24) greatrat00

I thought your scenario was a bit different: just recognizing the text and getting its position.

Could you please describe in more detail what you are trying to achieve? You have many pages; what are you going to do with them?

(04 Jul '12, 18:30) Vasily Panferov ♦♦


My scenario is as follows -

I have an n-page document where the format of each page is the same; that is, page #1 is always formatted the same, page #2 is always formatted the same, etc. However, the images for each page are scans of printed pages that were printed on different printers, so the locations of the fields are biased. That is, the first page in document A is formatted the same as the first page in document B, but page 1 in document B has a bias with respect to page 1 in document A. So I need to correct for this bias when I set the coordinate parameters of the fields.

(04 Jul '12, 19:50) greatrat00

Your solution seems to work, but it requires two processing passes per page: one to get the bias (by observing the coordinates of a priori known text) and another to get the actual field values using the processFields API.

Thanks, Adam

(04 Jul '12, 19:50) greatrat00

I forgot to mention that besides x-y correction I also need angular skew correction. But I need to do this with the processFields method.

(04 Jul '12, 23:05) greatrat00

Thanks for your answer.

I would suggest doing something on your side like what I'm planning on doing until you reach an elegant solution.

I'm selecting two "anchors": pieces of text that are unique and will always be present on every page, preferably far apart from each other on the x-axis. I run processImage on the page first and look at where the anchor points land with respect to where the same anchor points are on a reference page (the same reference page used to select the field positions). These two anchor points give me the page's skew.

(05 Jul '12, 18:59) greatrat00

Next, the x-y bias of each field position can be estimated by linear interpolation, using the x-y bias of both anchor points.

That's what I'm doing on my end. My suggestion is to include some sort of anchor point text and positions in the call to processFields and you guys can do this for us in one step instead of two.

(05 Jul '12, 19:01) greatrat00
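For readers who want to try the two-anchor idea, here is a minimal Python sketch. Note that it swaps in a slightly different technique from the linear interpolation described above: it fits a single similarity transform (rotation, uniform scale, and shift) from the two anchor pairs and applies it to every field rectangle. All function names are hypothetical, and each anchor is assumed to be reduced to one (x, y) point in pixels, for example the top-left corner of its bounding box.

```python
import cmath
import math


def fit_similarity(ref_a, ref_b, obs_a, obs_b):
    """Fit z' = a*z + b (points as complex numbers x + y*1j) that maps the
    two reference anchors onto the two observed anchors."""
    p1, p2 = complex(*ref_a), complex(*ref_b)
    q1, q2 = complex(*obs_a), complex(*obs_b)
    if p1 == p2:
        raise ValueError("reference anchors must be distinct")
    a = (q2 - q1) / (p2 - p1)  # rotation and scale
    b = q1 - a * p1            # translation
    return a, b


def skew_degrees(a):
    """Rotation angle of the fitted transform. Image y grows downward, so a
    positive angle appears clockwise on screen."""
    return math.degrees(cmath.phase(a))


def map_region(rect, a, b):
    """Map a reference (left, top, right, bottom) rectangle onto the scanned
    page and return the axis-aligned box enclosing the rotated corners."""
    left, top, right, bottom = rect
    corners = [a * complex(x, y) + b for x in (left, right) for y in (top, bottom)]
    xs = [c.real for c in corners]
    ys = [c.imag for c in corners]
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


# Example: the scan is shifted 10 px right and 5 px up, with no rotation.
a, b = fit_similarity((100, 100), (900, 100), (110, 95), (910, 95))
shifted = map_region((200, 300, 400, 350), a, b)
```

Because field regions have to stay axis-aligned, the enclosing box grows as the skew increases, so for strongly rotated scans it may pick up neighbouring text. Choosing anchors that are far apart, as suggested above, makes the angle estimate less sensitive to small errors in the anchor coordinates.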

Asked: 04 Jul '12, 01:41

Seen: 2,757 times

Last updated: 05 Jul '12, 19:01

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal