Character Coordinates incorrect

  • Last Post 28 March 2016
philhawthorn posted this 25 March 2016

Using cloudocr, the coordinates supplied in the XML output file appear to be incorrect. It correctly calculates the size of my page as A4:

<page width="2480" height="3508" resolution="300" originalCoords="1">

But then beyond that, everything else seems to be incorrect.

I printed out the original invoice and used a ruler to measure where items are, then converted the XML coordinates to millimetres for comparison.

Original coords in pixels:

<charParams l="283" t="164" r="306" b="191">C</charParams>

in mm (original coord / 300 * 25.4):

<charParams l="23.961" t="13.885" r="25.908" b="16.171">C</charParams>
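For anyone who wants to reproduce the conversion, here is a minimal sketch in Python. It assumes the `resolution` attribute on the page element is in dots per inch; the helper names `px_to_mm` and `char_box_mm` are made up for illustration and are not part of any SDK.

```python
import xml.etree.ElementTree as ET

MM_PER_INCH = 25.4


def px_to_mm(px, dpi):
    """Convert a pixel coordinate to millimetres at the given resolution."""
    return px / dpi * MM_PER_INCH


def char_box_mm(char_xml, dpi):
    """Parse one charParams element and return its box edges in mm (3 decimals)."""
    elem = ET.fromstring(char_xml)
    return {edge: round(px_to_mm(int(elem.get(edge)), dpi), 3)
            for edge in ("l", "t", "r", "b")}


# Values taken from the XML output quoted in this post.
box = char_box_mm('<charParams l="283" t="164" r="306" b="191">C</charParams>', dpi=300)
print(box)

# Sanity check on the page size: 2480 x 3508 px at 300 dpi is close to A4 (210 x 297 mm).
page_mm = (px_to_mm(2480, 300), px_to_mm(3508, 300))
print(page_mm)
```

The same function applied to the page dimensions gives roughly 210 x 297 mm, which is why the page size itself looks right.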

Actual coords in mm (approx, from print out):

<charParams l="30" t="18" r="32" b="21">C</charParams>

What is the reason for this? Thanks!

Oksana Serdyuk posted this 28 March 2016

The difference may be related to automatic skew correction. Did you use the correctSkew option in your tests? Also, please note that two XML export formats are available:

•xml - all coordinates written into the output XML file relate to the non-deskewed image plane,

•xmlForCorrectedImage - the same as xml, but all coordinates written into the output XML file relate to the corrected (deskewed, rotated, etc.) image, not the original one.

philhawthorn posted this 28 March 2016

I should also have mentioned that it is a PDF that I am recognising, so no skew or correction...

Oksana Serdyuk posted this 28 March 2016

The coordinates in the results should be correct.

In my opinion, comparing the results against a printed copy of your invoice measured with a ruler is not an accurate method. Differences of a few millimetres are quite possible in a printout; for example, the sheet of paper can shift a little during printing. Also note that the difference between your ruler measurements and the calculated coordinates is roughly consistent along the horizontal and vertical axes:

l = 6.039 mm and r = 6.092 mm
t = 4.115 mm and b = 4.829 mm
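These differences can be recomputed with a short Python sketch. It only subtracts the calculated millimetre values from the approximate ruler readings quoted earlier in the thread; the function name `edge_offsets` is hypothetical, and reading the result as a uniform shift is an interpretation rather than a confirmed diagnosis.

```python
# Calculated coordinates (pixels / 300 * 25.4) and approximate ruler readings, both in mm.
ocr_mm = {"l": 23.961, "t": 13.885, "r": 25.908, "b": 16.171}
ruler_mm = {"l": 30.0, "t": 18.0, "r": 32.0, "b": 21.0}


def edge_offsets(measured, computed):
    """Return measured minus computed for each box edge, rounded to 3 decimals."""
    return {edge: round(measured[edge] - computed[edge], 3) for edge in computed}


offsets = edge_offsets(ruler_mm, ocr_mm)
print(offsets)  # {'l': 6.039, 't': 4.115, 'r': 6.092, 'b': 4.829}

# Spread of the offsets within each axis; a small spread points to a shift, not a scale error.
horizontal_spread = round(abs(offsets["r"] - offsets["l"]), 3)
vertical_spread = round(abs(offsets["b"] - offsets["t"]), 3)
print(horizontal_spread, vertical_spread)
```

The offsets on each axis differ by well under a millimetre, which is within what a ruler reading can resolve, so the numbers look more like a displacement of the printed content than wrong character boxes.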

philhawthorn posted this 28 March 2016

Agreed, it is not likely to be absolutely accurate but I wouldn't expect that much of a difference. I also noticed a difference on a photographed receipt - but that could be affected by some of the issues in your comments above.