Using cloudocr, the coordinates supplied in the XML output file appear to be incorrect. It correctly calculates the size of my page as A4:

<page width="2480" height="3508" resolution="300" originalCoords="1">

But then beyond that, everything else seems to be incorrect.

I printed out the original invoice and used a ruler to measure where items are, then converted the XML coordinates to millimetres for comparison.

Original coords in pixels:

<charParams l="283" t="164" r="306" b="191">C</charParams>

in mm (original coord / 300 * 25.4):

<charParams l="23.961" t="13.885" r="25.908" b="16.171">C</charParams>
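
For reference, the conversion above can be reproduced with a short script. This is only a sketch: the helper names `px_to_mm` and `char_box_mm` are made up for illustration and are not part of any SDK, and it assumes the `resolution` attribute of the page element is in dots per inch.

```python
import xml.etree.ElementTree as ET

MM_PER_INCH = 25.4


def px_to_mm(px, dpi):
    """Convert a pixel coordinate to millimetres at the given resolution."""
    return px / dpi * MM_PER_INCH


def char_box_mm(char_xml, dpi):
    """Hypothetical helper: read l/t/r/b from one charParams element
    and return them in millimetres, rounded to 3 decimals."""
    elem = ET.fromstring(char_xml)
    return {
        key: round(px_to_mm(int(elem.attrib[key]), dpi), 3)
        for key in ("l", "t", "r", "b")
    }


# Values taken from the XML in the question; 300 is the page resolution.
box = char_box_mm('<charParams l="283" t="164" r="306" b="191">C</charParams>', 300)
print(box)

# Sanity check on the page itself: 2480 x 3508 px at 300 dpi
# should come out close to A4 (210 x 297 mm).
page_mm = (round(px_to_mm(2480, 300), 1), round(px_to_mm(3508, 300), 1))
print(page_mm)
```

The page check comes out within a tenth of a millimetre of A4, so the scale factor itself looks right.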

Actual coords in mm (approx, from print out):

<charParams l="30" t="18" r="32" b="21">C</charParams>

What is the reason for this? Thanks!

asked 25 Mar '16, 22:18

philhawthorn


The difference in measurement may be connected with automatic skew correction. Did you use the correctSkew option in your tests? Also please note that there are two available XML export formats:

• xml - all coordinates written into the output XML file relate to the original, non-deskewed image plane,

• xmlForCorrectedImage - the same as xml, but all coordinates written into the output XML file relate to the corrected (deskewed, rotated, etc.) image, not the original one.

(28 Mar '16, 12:10) Oksana Serdyuk ♦♦

I should also have mentioned that it is a PDF that I am recognising, so no skew or correction...

(28 Mar '16, 12:25) philhawthorn

The coordinates in the results should be correct.

In my opinion, comparing the results against a printed copy of your invoice measured with a ruler is not an accurate method. Differences of a few millimetres are quite possible once the document has been printed. For example, the sheet of paper could have shifted a little during printing. We can also see that the difference between your ruler measurements and the calculated coordinates is roughly consistent along the horizontal and vertical axes:

l = 6.039 mm and r = 6.092 mm
t = 4.115 mm and b = 4.829 mm
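
These differences can be recomputed from the numbers in the question. The snippet below is just an illustrative sketch; the variable names are arbitrary, and the "measured" values are the approximate ruler readings, so the result is only as precise as those.

```python
# Coordinates from the OCR XML, already converted to millimetres.
computed = {"l": 23.961, "t": 13.885, "r": 25.908, "b": 16.171}
# Approximate ruler measurements taken from the printout.
measured = {"l": 30.0, "t": 18.0, "r": 32.0, "b": 21.0}

# Per-edge difference: measured minus computed.
offsets = {key: round(measured[key] - computed[key], 3) for key in computed}
print(offsets)

# Average shift per axis, to see whether it looks like one uniform offset.
shift_x = round((offsets["l"] + offsets["r"]) / 2, 2)
shift_y = round((offsets["t"] + offsets["b"]) / 2, 2)
print(shift_x, shift_y)
```

The left and right edges are off by nearly the same amount, and so, more loosely, are the top and bottom edges, which is what a shift of the whole page would produce rather than an error in individual character coordinates.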

answered 28 Mar '16, 14:57

Oksana Serdyuk ♦♦

Agreed, it is not likely to be absolutely accurate, but I wouldn't expect that much of a difference. I also noticed a difference on a photographed receipt, but that could be affected by some of the issues in your comments above.

(28 Mar '16, 15:04) philhawthorn
© 2016 ABBYY. All rights Reserved. | Privacy Policy | Legal