Set anchor points

  • Last Post 05 July 2012
greatrat00 posted this 04 July 2012

Hi,

The pages I will be doing OCR on might not always be scanned exactly the same way, although the format on the printed page is the same. This means that the data will have x-y and angular bias. Is there any way to detect the x-y bias - maybe through detecting the location of some constant text within the scanned images?

Thanks, Adam

Vasily Panferov posted this 04 July 2012

If you are performing full-page recognition via processImage or processDocument calls, you can get this information from the XML output. The coordinates of recognized characters are reported in the original image's coordinate system. So you need to find the same text areas on different pages and compare their coordinates.
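As a rough illustration of comparing coordinates, the sketch below searches the XML output for a known piece of text and returns its bounding box. It assumes the XML contains `line` elements whose characters are individual `charParams` elements carrying `l`, `t`, `r`, `b` pixel attributes; check that layout against your actual output. The helper name `find_anchor` is made up for this example.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    # Drop any "{namespace}" prefix so the lookup works with or without one.
    return tag.rsplit("}", 1)[-1]


def find_anchor(xml_text, anchor):
    """Return (l, t, r, b) of the first line-level match of `anchor`, or None.

    Assumption: each character is a <charParams l= t= r= b=> element
    nested somewhere inside a <line> element.
    """
    root = ET.fromstring(xml_text)
    for line in root.iter():
        if _local(line.tag) != "line":
            continue
        chars = [c for c in line.iter() if _local(c.tag) == "charParams"]
        # One text character per element keeps string indices aligned with `chars`.
        text = "".join((c.text or " ")[0] for c in chars)
        pos = text.find(anchor)
        if pos < 0:
            continue
        hit = chars[pos:pos + len(anchor)]
        return (
            min(int(c.get("l")) for c in hit),
            min(int(c.get("t")) for c in hit),
            max(int(c.get("r")) for c in hit),
            max(int(c.get("b")) for c in hit),
        )
    return None
```

Running this on both the reference page and a newly scanned page gives two boxes for the same text; the difference between them is the x-y bias at that spot.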

UPDATE: You are performing field-level recognition via processTextField. In your case this can't be done in a single step, since you need to know the exact positions of the different fields. After a page is recognized via processImage or processDocument, its subsequent recognition via any method, including processFields, will be free. So you need to run processImage or processDocument with XML output, determine the coordinates of the fields, then call processTextField for the same file, and it will be free.

However, to avoid service abuse we implemented a limit on free pages per day: 300 free pages per application per day. Beyond that amount, all subsequent re-recognitions will be billed.

Another issue is document rotation. In processImage and processDocument it is corrected automatically. In the process*Field methods we cannot do that, because otherwise all the coordinates the user passed us would become invalid.

Right now I don't see an elegant solution to your problem, so all your suggestions are welcome.

greatrat00 posted this 04 July 2012

Hi,

I am using the processFields method. Does this mean that I will need to scan every page twice, once for bias estimation and once for OCR? That means I would be charged for two page scans for every page I need to read.

Thanks, Adam

Vasily Panferov posted this 04 July 2012

I thought your scenario was a bit different: just recognize the page and get the text positions.

Could you please describe in more detail what you are trying to achieve? You have many pages; what are you going to do with them?

greatrat00 posted this 04 July 2012

Hi,

My scenario is as follows -

I have an n-page document where the format of each page is the same; that is, page #1 is always formatted the same, page #2 is always formatted the same, etc. However, the images for each page are scans of printed pages which were printed on different printers, and therefore the locations of the fields are biased. That is, the first page in document A is formatted the same as the first page in document B, but page 1 in document B has a bias with respect to page 1 in document A. So I need to correct for this bias when I set the coordinate parameters of the fields.

greatrat00 posted this 04 July 2012

Your solution seems to work, but it requires two processes per page: one to get the bias (by observing the coordinates of a priori known text) and another to get the actual field values using the processFields API.

Thanks, Adam

greatrat00 posted this 04 July 2012

I forgot to mention that besides x-y correction I also need angular skew correction. But I need to do this with the processFields method.

greatrat00 posted this 05 July 2012

Thanks for your answer.

I would suggest doing something on your side like what I'm planning on doing until you reach an elegant solution.

I'm selecting two "anchors": pieces of text that are unique and will always be on every single page, preferably far apart from each other on the x-axis. I'm running processImage on the page first and comparing where the anchor points land with where the same anchor points are on a reference page (the same reference page used to select the field positions). These two anchor points give me the page's skew.
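A minimal sketch of how two anchors can yield the skew: take the direction of the line from one anchor to the other on the reference page and on the scanned page, and subtract the two angles. The function name `skew_angle` and the (x, y) tuple layout are made up for illustration, with x to the right and y downward as in image coordinates.

```python
import math


def skew_angle(ref_a, ref_b, obs_a, obs_b):
    """Rotation (radians) of the scanned page relative to the reference page.

    ref_a, ref_b: anchor positions (x, y) on the reference page.
    obs_a, obs_b: positions of the same anchors on the scanned page.
    """
    ref_dir = math.atan2(ref_b[1] - ref_a[1], ref_b[0] - ref_a[0])
    obs_dir = math.atan2(obs_b[1] - obs_a[1], obs_b[0] - obs_a[0])
    diff = obs_dir - ref_dir
    # Wrap the difference into (-pi, pi] so small skews stay small.
    return math.atan2(math.sin(diff), math.cos(diff))
```

Anchors that are far apart make the angle less sensitive to a pixel or two of error in the recognized positions, which is why widely separated anchors are preferable.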

greatrat00 posted this 05 July 2012

Next, the x-y bias of each field position can be estimated by linear interpolation, using the x-y bias of both anchor points.
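One way this interpolation could look is sketched below. The post does not say which parameter the interpolation runs over; this version assumes it is the field's x position between the two anchors, and the name `interpolate_bias` is made up for the example.

```python
def interpolate_bias(field_xy, ref_a, ref_b, obs_a, obs_b):
    """Shift a reference field position by the bias interpolated between anchors.

    field_xy: (x, y) of the field on the reference page.
    ref_a, ref_b: anchor positions on the reference page.
    obs_a, obs_b: positions of the same anchors on the scanned page.
    Returns the estimated (x, y) of the field on the scanned page.
    """
    span = ref_b[0] - ref_a[0]
    if span == 0:
        raise ValueError("anchors must differ in x to interpolate along x")

    # Observed x-y bias at each anchor.
    bias_a = (obs_a[0] - ref_a[0], obs_a[1] - ref_a[1])
    bias_b = (obs_b[0] - ref_b[0], obs_b[1] - ref_b[1])

    # Fractional position of the field between the anchors (assumption: along x).
    t = (field_xy[0] - ref_a[0]) / span

    dx = bias_a[0] + t * (bias_b[0] - bias_a[0])
    dy = bias_a[1] + t * (bias_b[1] - bias_a[1])
    return (field_xy[0] + dx, field_xy[1] + dy)
```

Because the interpolation only follows x, it does not capture how a rotation shifts fields that sit far above or below the anchor line; for small skews that error stays small.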

That's what I'm doing on my end. My suggestion is to include some sort of anchor point text and positions in the call to processFields and you guys can do this for us in one step instead of two.
