I have images containing this kind of content:
and I would like to get the recognised output in the same order. Instead the API gives me a TXT file with column-based results, something like:
which makes me lose all the correspondence between texts and numbers. I can no longer tell which number belongs to which text description.
I just want to read the text line by line. What is the easiest way to achieve this?
My problem occurs when using TXT as the output format. When I ask for XLSX and open the resulting file, the row and column information is there, but XLSX is not a format that a program can consume easily. CSV would be good but does not seem to be available.
So, what is the best way to get the plain text from my image, in a line by line fashion?
Thanks in advance.
asked 14 Oct '13, 16:25
You could export to XML and process the words together with their coordinates on your side. Alternatively, the region parameter of the processTextField method could be set to the coordinates corresponding to each line.
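To illustrate the "process words with their coordinates" part, here is a minimal sketch of how words could be regrouped into lines once you have their bounding boxes. It does not parse the XML export itself: the `Word` tuple, the `group_words_into_lines` function and the `overlap` threshold are hypothetical names chosen for this example, and you would have to fill the `Word` values from whatever coordinate attributes your XML output actually contains.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """A recognised word with its bounding box (hypothetical structure)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def group_words_into_lines(words: List[Word], overlap: float = 0.5) -> List[str]:
    """Group words whose boxes overlap vertically, then order each group left to right."""
    lines: List[List[Word]] = []
    # Visit words from top to bottom, using the vertical centre of each box.
    for word in sorted(words, key=lambda w: (w.top + w.bottom) / 2):
        for line in lines:
            line_top = min(w.top for w in line)
            line_bottom = max(w.bottom for w in line)
            # Height of the vertical band shared by the word and the line.
            shared = min(word.bottom, line_bottom) - max(word.top, line_top)
            height = min(word.bottom - word.top, line_bottom - line_top)
            if height > 0 and shared / height >= overlap:
                line.append(word)
                break
        else:
            # No existing line overlaps enough, so start a new one.
            lines.append([word])
    lines.sort(key=lambda line: min(w.top for w in line))
    return [
        " ".join(w.text for w in sorted(line, key=lambda w: w.left))
        for line in lines
    ]


# Made-up example: words arrive column by column, as in the TXT output.
sample = [
    Word("Coffee", 10, 10, 80, 30),
    Word("Tea", 10, 50, 50, 70),
    Word("2.50", 300, 12, 350, 31),
    Word("1.80", 300, 51, 350, 71),
]
for text_line in group_words_into_lines(sample):
    print(text_line)
```

The overlap threshold of 0.5 is an arbitrary starting point; skewed scans or tightly spaced lines may need a different value or a deskewing step first.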
answered 21 Oct '13, 10:47