When a receipt is sent whose columns are sometimes far apart, the service scans each column on its own and then returns them one after the other.
The behaviour we'd like is for each horizontal line to be parsed on its own, with the lines returned in top-to-bottom order.
asked 09 Apr '13, 22:25
The OCR engine works on all kinds of documents, and behaviour that seems correct for one complicated layout may not be correct for others. The engine cannot know in advance which reading order is right for a particular document, so it has been tuned to strike a reasonable balance that works acceptably in most cases.
My recommendation would be to use XML output instead of TXT, and to use the text coordinate information when parsing the receipt. That way you can decide for yourself what the correct reading order is.
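
As a rough illustration of the coordinate-based approach, here is a minimal Python sketch. It assumes you have already pulled each text fragment and its bounding box out of the XML; the `Fragment` class, its field names, and the `group_into_rows` helper are hypothetical and not part of the service, and the exact XML elements that carry the coordinates depend on the output schema. The grouping rule (a fragment joins a row if its vertical centre falls inside the row's first fragment) is just one simple heuristic and may need tuning for skewed scans.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    """Hypothetical container for one piece of recognized text and its box."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def group_into_rows(fragments: List[Fragment]) -> List[str]:
    """Rebuild horizontal lines from fragments that may arrive column by column."""
    rows: List[List[Fragment]] = []
    # Visit fragments from top to bottom by their vertical centre.
    for frag in sorted(fragments, key=lambda f: (f.top + f.bottom) / 2):
        centre = (frag.top + frag.bottom) / 2
        # Same row if the centre lies within the vertical span of the row's first fragment.
        if rows and rows[-1][0].top <= centre <= rows[-1][0].bottom:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    # Within each row, order fragments left to right and join them.
    return [" ".join(f.text for f in sorted(row, key=lambda f: f.left)) for row in rows]


# Fragments in the column-by-column order described in the question.
sample = [
    Fragment("Coffee", 10, 100, 80, 120),
    Fragment("Bagel", 10, 140, 70, 160),
    Fragment("2.50", 400, 102, 450, 122),
    Fragment("1.75", 400, 141, 450, 161),
]
for line in group_into_rows(sample):
    print(line)
```

With the sample data above, the item names and prices that were returned as two separate columns are paired up again line by line.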