Recognition of impact/dot-matrix printer receipts

  • Last Post 10 April 2015
mkommar posted this 02 April 2015

I have several receipts that recognize nearly perfectly. Specifically, ones that are from laser printers with normal looking fonts. However, this one type of receipt (impact printer... seems like dot-matrix with a "System" font) doesn't seem to recognize correctly. Is there any advice for getting this to output something useful (even if partially) or can I talk to support somehow?

Thanks!

You can see the image at: http://picpaste.com/pics/pfchangs-goIGHs9T.1427929919.jpg

Natalia Karaseva posted this 02 April 2015

Please specify the product. Which settings are you using for receipt processing?

Oksana Serdyuk posted this 02 April 2015

As far as we understand, you are using ABBYY Cloud OCR SDK. In this case, for receipt capture we recommend using the processImage method with the textExtraction profile. This profile is suitable for extracting all text from the input image, including small text areas of low quality.

Also, please pay attention to the quality of your input image. Note that your image resolution is 96 dpi, which is quite low for recognition purposes. Image resolution has a real impact on the OCR quality that can be achieved. Moreover, the image is fuzzy and has a background that adds a lot of noise and lowers the recognition quality. It is very important that the image contains only the text you want to recognize. Please refer to the Best Practices section, where you can find more recommendations on how to scan and photograph documents to achieve the best recognition results.

We have tested your image and managed to achieve a more or less acceptable recognition result using the recommendations above:

  • we changed the image resolution to a more suitable value (we used ABBYY FineReader 12 for image preprocessing);
  • we cropped the image so that it contains only the receipt, without the background;
  • we processed the image using the following recognition settings: ".../processImage?language=English&profile=textExtraction&imageSource=Auto&exportFormat=txt"
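
For anyone scripting this, the recognition settings above can be assembled as a query string. The following is only a minimal Python sketch: the function name `build_process_image_query` is hypothetical, and the host and authentication details are deliberately left out because they are not given in this thread.

```python
from urllib.parse import urlencode


def build_process_image_query(language="English",
                              profile="textExtraction",
                              image_source="Auto",
                              export_format="txt"):
    """Build the processImage path and query string from the settings
    quoted in the post above.

    Hypothetical helper: the service host and credentials are not part
    of this thread, so the caller has to prepend and supply them.
    """
    # A list of pairs keeps the parameters in the same order as the post.
    params = [
        ("language", language),
        ("profile", profile),
        ("imageSource", image_source),
        ("exportFormat", export_format),
    ]
    return "processImage?" + urlencode(params)
```

Called with its defaults, the helper reproduces the settings string quoted above; changing `export_format` or `language` is then a one-argument change.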

We will send you our results by e-mail.

mkommar posted this 10 April 2015

Thanks! I will try the suggestions. I think there was a bit of miscommunication about the image. If you saw a small image on a webpage, click on it and you'll get the full-sized image on its own. It's a picture taken with a cellphone, about 5,312px × 2,988px.

That might make the request seem a little more sane!

That's the use case. I can do some detection of the boundaries of the receipt and crop it accordingly and perhaps even some perspective correction.
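
A rough sketch of the boundary detection and cropping step, in plain Python. It assumes the image has already been loaded as rows of grayscale values (0 to 255) and that the receipt is brighter than the background; both the threshold and the function names are illustrative assumptions, and perspective correction is not covered.

```python
def receipt_bounding_box(gray, threshold=128):
    """Return (top, left, bottom, right) around all pixels brighter than
    `threshold`, or None if there are none.

    Assumption: the receipt paper is brighter than the background.
    `bottom` and `right` are exclusive, like Python slice ends.
    """
    # Rows that contain at least one bright pixel.
    rows = [y for y, row in enumerate(gray)
            if any(v > threshold for v in row)]
    if not rows:
        return None
    # Columns that contain at least one bright pixel.
    width = len(gray[0])
    cols = [x for x in range(width)
            if any(row[x] > threshold for row in gray)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(gray, box):
    """Cut the boxed region out of the grayscale image."""
    top, left, bottom, right = box
    return [row[left:right] for row in gray[top:bottom]]
```

On a real photo, a single bright spot in the background would stretch the box, so some noise filtering before this step would probably be needed.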

rainerp posted this 08 July 2015

Hi,

in BETA we have a method that extracts the data from receipts and returns it in an XML structure.

Cheers,

Rainer

rainerp posted this 06 September 2016

The method for receipt capture is now officially released for the USA. For other countries it is still in beta. Please see more information here: http://ocrsdk.com/documentation/apireference/processReceipt/

and here

https://www.abbyy.com/receipt-capture-ocr/
