Missing set of chars in receipt processing

  • Last Post 12 September 2012
G Moore posted this 08 September 2012


I'm currently evaluating the Cloud OCR product and have been generally pleased with the results.

However, I've experienced one significant issue which I was hoping you could shed some light on/look into. See this particular image (http://i.imgur.com/0vTsy.jpg) and the xml that was returned for it (http://ge.tt/4WPV8MN/v/0 - note I've trimmed some bits out, so the xml might not be valid, but everything you need to see is there). For the item "2 Basmati Rice", it seems that the corresponding price item ("6.00") has been missed entirely. We can't find it anywhere in the results. Everything else seems to be present.

Why is this, and how might we be able to fix it/ensure something like this doesn't happen repeatedly in future queries?

Thanks in advance for your assistance.

Edit: realized I should probably indicate what settings I ran the OCR with. Just copying and pasting the request url below for expediency:

$url = 'http://cloud.ocrsdk.com/processImage?language=English&profile=documentConversion&imageSource=photo&exportFormat=xml&xml:writeRecognitionVariants=true';

I should also note that the linked image was cropped a bit, but the images we're sending over are max quality stills from an iPhone 4 (AVCaptureSession), so 2592x1936px.

Vasily Panferov posted this 10 September 2012

Updated answer:

The results can be improved by implementing the following steps:

Try to capture better images:

  1. Try better lighting conditions. When your captured image is dark, the results can be worse.
  2. Avoid having flash reflection on the captured receipt.

Modify processing settings:

  1. Enable the "profile=textExtraction" option in the RESTful call to processImage. This setting usually breaks the linear text structure, but extracts more text from the document.
  2. Make sure that your XML parsing supports table blocks. Their structure differs from that of text blocks, so text that was detected as a table can be lost during the parsing step.
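As a rough illustration of the first setting, here is a minimal Python sketch that rebuilds the request URL quoted in the question with the profile switched to textExtraction. The endpoint and parameter names are copied from that URL; the function name build_process_image_url is hypothetical, and authentication and the image upload itself are left out.

```python
from urllib.parse import urlencode

# Endpoint taken from the request URL quoted in the question.
BASE_URL = "http://cloud.ocrsdk.com/processImage"


def build_process_image_url(profile="textExtraction"):
    """Return a processImage URL using the given recognition profile.

    Hypothetical helper: every parameter other than `profile` is copied
    unchanged from the URL in the original question.
    """
    params = {
        "language": "English",
        "profile": profile,
        "imageSource": "photo",
        "exportFormat": "xml",
        "xml:writeRecognitionVariants": "true",
    }
    # safe=":" keeps the colon in "xml:writeRecognitionVariants" unescaped.
    return BASE_URL + "?" + urlencode(params, safe=":")


print(build_process_image_url())
```

Passing profile="documentConversion" reproduces the original request, which makes it easy to compare the two profiles on the same image.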

And the last hint for debugging:

  1. If something goes wrong with the output XML or whatever format you use, try processing your image to txt. The txt output is much easier for a human to inspect and contains all the text information contained in the XML.
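The txt check can also be scripted. The following is only a sketch in Python: it assumes you already have the txt result as a string and a list of strings you expect to find on the receipt. The function name missing_tokens and the sample text are made up for illustration and are not real OCR output.

```python
def missing_tokens(txt_output, expected):
    """Return the expected strings that do not appear in the OCR txt output."""
    # Collapse runs of whitespace so that column spacing does not matter.
    normalized = " ".join(txt_output.split())
    return [token for token in expected if token not in normalized]


# Hypothetical txt result in which the price for the rice line was dropped.
sample_txt = """2 Basmati Rice
1 Naan            2.50
"""

print(missing_tokens(sample_txt, ["Basmati Rice", "6.00", "2.50"]))
# prints ['6.00']
```

If a value is missing from the txt output as well, the problem lies in recognition rather than in XML parsing.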

G Moore posted this 10 September 2012

What do you mean by our "xml transformation routine"? The xml I linked to is the raw result from you guys, we haven't processed it at all. And I understand things can appear in different block types and that in this particular case these receipt items are in a table block, but still, I don't see it in the xml regardless. Could you point to a line number where you claim to see it?

Vasily Panferov posted this 11 September 2012

Ok, you got the following xml: http://ge.tt/22ux8VN/v/0?c for your application bring10_test. Please take a look at line 1111.

Anyway, the best way to check if text is recognized or not is to convert your image to txt, not to xml.

If that's not the case, please let me know the task id of your submitted image.

G Moore posted this 12 September 2012

Ok, lost the task id of the original test I was posting about, but I tried again and the results got even worse - literally all of the price items on the right side of the page seem to be missing. Granted this image is perhaps a little lower quality, but it still got the left side of the page quite well so the missing price items don't make sense. The task id is: cb861010-de14-4157-b5a9-ed6379a1e89a. Let me know if you need any additional info to look into this. Thanks a lot

Vasily Panferov posted this 12 September 2012

Yes, the results became worse because your receipt became worse :)

I have modified the original answer. There are three ways the result can be improved:

  1. Use better receipts (not worn) and better lighting conditions. Try disabling the flash.
  2. Enable the profile=textExtraction option.
  3. Support "Table" blocks in your XML processing routine.
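To illustrate the point about "Table" blocks, here is a minimal Python sketch that collects recognized lines from both text and table blocks. The sample XML is hypothetical and heavily simplified, and the element names (block, blockType, row, cell, line, charParams) are an assumption about the general shape of the result, so check them against your actual output. The sketch also ignores recognition variants.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified result: one text block and one table block.
SAMPLE_XML = """
<document>
  <page>
    <block blockType="Text">
      <text><par><line><formatting>
        <charParams>H</charParams><charParams>i</charParams>
      </formatting></line></par></text>
    </block>
    <block blockType="Table">
      <row><cell><text><par><line><formatting>
        <charParams>6</charParams><charParams>.</charParams>
        <charParams>0</charParams><charParams>0</charParams>
      </formatting></line></par></text></cell></row>
    </block>
  </page>
</document>
"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}block' down to 'block'."""
    return tag.rsplit("}", 1)[-1]


def lines_by_block_type(xml_text):
    """Return (blockType, line text) pairs for every line in every block."""
    root = ET.fromstring(xml_text)
    result = []
    for block in root.iter():
        if local_name(block.tag) != "block":
            continue
        block_type = block.get("blockType", "")
        # Search the whole subtree, so lines nested inside row/cell
        # elements of a table are found as well as plain text lines.
        for line in block.iter():
            if local_name(line.tag) != "line":
                continue
            chars = [cp.text or "" for cp in line.iter()
                     if local_name(cp.tag) == "charParams"]
            result.append((block_type, "".join(chars)))
    return result


print(lines_by_block_type(SAMPLE_XML))
# prints [('Text', 'Hi'), ('Table', '6.00')]
```

Walking the whole subtree of each block, rather than a fixed path, is what keeps text inside table cells from being skipped.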

G Moore posted this 12 September 2012

Ok, that was just an example though. The real problem is that the first image I was having issues with had decent quality/lighting and it recognized every single piece of text on the receipt except that one, single price item. Is there nothing you can do to track down why that was missed? Is there a way to recover the task id, perhaps by searching for the returned xml (if you guys save all that)? Regarding your other points, we need the flash because we must support low light conditions, and it has been working fine like that; and our parsing does account for both text and table blocks.