Retrieving coordinates of OCRed text.

  • Last Post 23 January 2012
  • Topic Is Solved
John BadFox posted this 23 January 2012

I'm developing an iPhone app and need to highlight, for example, every occurrence of the word CRM, as shown below:

[image: example of highlighted text]

How do I retrieve the coordinates of the word CRM?

Nikolay_Kh posted this 23 January 2012

First of all, you need to specify the data export format by adding a parameter to the processImage call: cloud.ocrsdk.com/processImage?exportFormat=xml
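As a small sketch, this is one way to build that URL in Python. The https scheme and the helper name build_process_image_url are my assumptions, and the request itself (authentication, uploading the image, fetching the result) is deliberately left out:

```python
from urllib.parse import urlencode

# Assumption: the https scheme; the endpoint in the post is given without one.
BASE_URL = "https://cloud.ocrsdk.com/processImage"


def build_process_image_url(export_format: str = "xml") -> str:
    """Return the processImage URL with the exportFormat query parameter set."""
    return BASE_URL + "?" + urlencode({"exportFormat": export_format})


url = build_process_image_url()
```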

You'll get an XML response that contains a charParams element for each recognized character, as follows:

<charParams l="35" t="39" r="73" b="83" charConfidence="100">M</charParams>
<charParams l="77" t="39" r="117" b="83" charConfidence="100">o</charParams>
<charParams l="120" t="40" r="164" b="83" charConfidence="100">b</charParams>
<charParams l="165" t="40" r="204" b="83" charConfidence="100">i</charParams>
<charParams l="211" t="40" r="225" b="83" charConfidence="100">l</charParams>
<charParams l="231" t="40" r="276" b="84" charConfidence="100">e</charParams>

The XML you get is synthesised according to this schema.

The "l", "t", "r" and "b" attributes stand for left, top, right and bottom. Together they describe the bounding rectangle of each character by its top-left and bottom-right corners. I believe that's exactly what you are looking for.
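A minimal sketch of reading those rectangles, in Python for brevity (the same idea carries over to any XML parser on the iPhone). The names CharBox, local_name and parse_char_boxes are hypothetical, and the wrapping line element in the sample is only there to make the fragment well-formed; it is not taken from a real response:

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple


class CharBox(NamedTuple):
    char: str
    left: int
    top: int
    right: int
    bottom: int


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix, in case the response uses an XML namespace."""
    return tag.rsplit("}", 1)[-1]


def parse_char_boxes(xml_text: str) -> List[CharBox]:
    """Collect one CharBox per charParams element, in document order."""
    root = ET.fromstring(xml_text)
    boxes = []
    for elem in root.iter():
        if local_name(elem.tag) != "charParams":
            continue
        boxes.append(CharBox(
            char=elem.text or " ",  # treat an empty element as a space
            left=int(elem.get("l")),
            top=int(elem.get("t")),
            right=int(elem.get("r")),
            bottom=int(elem.get("b")),
        ))
    return boxes


# Assumption: a made-up wrapper element around two characters from the post.
SAMPLE = """<line>
<charParams l="35" t="39" r="73" b="83" charConfidence="100">M</charParams>
<charParams l="77" t="39" r="117" b="83" charConfidence="100">o</charParams>
</line>"""

boxes = parse_char_boxes(SAMPLE)
```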

Each charParams element gives its coordinates in page pixels. The same XML also contains a page element:

<page width="..." height="..." resolution="..." originalCoords="...">

where the image width and height are stored. So, for each charParams element, l and r are in the range 0..width-1 and t and b are in the range 0..height-1 of the corresponding page.
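To get from single characters to a word such as CRM, one approach is to join the recognized characters into a string, search it, and take the union of the matching character rectangles. This is only a sketch: find_word_boxes and fits_page are hypothetical helpers, every entry is assumed to hold exactly one character, and a match that spans a line break or sits inside a longer word is not handled specially:

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]            # (left, top, right, bottom)
CharBox = Tuple[str, int, int, int, int]   # (char, left, top, right, bottom)


def find_word_boxes(chars: List[CharBox], word: str) -> List[Box]:
    """Return one bounding box per occurrence of `word` in the character run."""
    text = "".join(c[0] for c in chars)
    boxes = []
    start = text.find(word)
    while start != -1:
        run = chars[start:start + len(word)]
        boxes.append((
            min(c[1] for c in run),   # leftmost edge
            min(c[2] for c in run),   # topmost edge
            max(c[3] for c in run),   # rightmost edge
            max(c[4] for c in run),   # bottommost edge
        ))
        start = text.find(word, start + 1)
    return boxes


def fits_page(box: Box, width: int, height: int) -> bool:
    """Check the box against the 0..width-1 and 0..height-1 ranges of the page."""
    left, top, right, bottom = box
    return 0 <= left <= right <= width - 1 and 0 <= top <= bottom <= height - 1


# The characters from the XML fragment above.
MOBILE = [
    ("M", 35, 39, 73, 83), ("o", 77, 39, 117, 83), ("b", 120, 40, 164, 83),
    ("i", 165, 40, 204, 83), ("l", 211, 40, 225, 83), ("e", 231, 40, 276, 84),
]

word_boxes = find_word_boxes(MOBILE, "Mob")
```

The page width and height passed to fits_page would come from the page element of the same XML.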

It's also worth mentioning explicitly that all coordinates are in pixels; they are completely resolution-agnostic. This is why you have to take zoom into account whenever you highlight anything on an image: your device software will likely not always display the image at its original size but will downscale it, so you have to map page coordinates onto the coordinates of the zoomed-out image and highlight accordingly.
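The mapping itself is a plain proportional scaling. A rough sketch, assuming the whole page is drawn into a rectangle of known size and position on screen (scale_box and its parameters are hypothetical names, not part of any SDK):

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def scale_box(box: Box,
              page_size: Tuple[int, int],
              display_size: Tuple[float, float],
              offset: Tuple[float, float] = (0.0, 0.0)) -> Box:
    """Map a box in page pixels onto the displayed (zoomed) image.

    page_size is (width, height) from the page element, display_size is the
    on-screen size of the image, offset is where its top-left corner is drawn.
    """
    page_w, page_h = page_size
    disp_w, disp_h = display_size
    sx = disp_w / page_w   # horizontal zoom factor
    sy = disp_h / page_h   # vertical zoom factor
    left, top, right, bottom = box
    ox, oy = offset
    return (ox + left * sx, oy + top * sy, ox + right * sx, oy + bottom * sy)


# A 1000x2000 page shown at half size: every coordinate is halved.
scaled = scale_box((35, 39, 73, 83), (1000, 2000), (500, 1000))
```

If the image is shown with its aspect ratio preserved, sx and sy come out equal and a single factor is enough.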
