Hi,

I'm trying to process a receipt and am getting very poor results on a particular image. I'm calling processImage with exportFormat set to txt, correctSkew set to false, and imageSource set to scanner. Below is the image I'm processing

alt text

and the results being returned are

alt text

As you can see, a lot of the item descriptions are missing, some of the amounts don't have values for cents, there are extra spaces in the return, among other issues. What can I do to get better results?

asked 23 Jun '15, 00:03

ppunzalan

edited 30 Jun '15, 00:44


Hi,

Please try the textExtraction profile for your scenario. This profile is suitable for extracting all text from the input image.

Note that the red oval prevents ABBYY Cloud OCR SDK from accurately recognizing the text above and below the line "Age Confirmed - 12/12/1912". This is expected behavior.
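For illustration, a request with this profile might be assembled as in the minimal Python sketch below. It assumes the service is reached at http://cloud.ocrsdk.com and that processImage takes language, profile, and exportFormat as query parameters. The helper name build_process_image_url and the default language value are hypothetical choices made for this example. The image itself would be sent as the POST body together with your application credentials, which is left out here.

```python
from urllib.parse import urlencode

# Assumed service endpoint; check the address that applies to your account.
BASE_URL = "http://cloud.ocrsdk.com"


def build_process_image_url(profile="textExtraction",
                            export_format="txt",
                            language="English"):
    """Build a processImage URL with the recognition profile set explicitly.

    Hypothetical helper: it only assembles the query string and does not
    send anything over the network.
    """
    params = {
        "language": language,
        "profile": profile,
        "exportFormat": export_format,
    }
    return f"{BASE_URL}/processImage?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL that the image would be POSTed to.
    print(build_process_image_url())
```

Keeping the profile as a function argument makes it easy to submit the same receipt with and without textExtraction and compare the two outputs side by side.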

answered 23 Jun '15, 13:18

Oksana Serdyuk ♦♦

Hi Oksana,

Please see my answer below, as I cannot include images when commenting on your answer (limitation of the forum).

Thanks.

(23 Jun '15, 19:42) ppunzalan

Hi Oksana,

I tried adding profile=textExtraction and this particular receipt is getting better results. Here is what was returned:

alt text

However, other receipts are getting bad results with profile=textExtraction. For example, when I submit this image

alt text

I was getting these results (without profile=textExtraction)

alt text

but now I'm getting these results (with profile=textExtraction)

alt text

As you can see, I'm losing the Subtotal, the line item amounts (and those that are read are still read incorrectly), the Total amount, etc. Is there one call to read a receipt that will work on all receipts?

answered 23 Jun '15, 19:43

ppunzalan

We have tested your images and sent our results and recommendations to you by e-mail.

(24 Jun '15, 15:49) Oksana Serdyuk ♦♦

Hi Oksana,

As suggested in your email response (which I've attached below), I have already tried setting profile=textExtraction, with mixed results. You also state "try to find more optimal recognition settings for your kind of images", but that's what I'm asking your advice on. What would those settings be?

You also suggest using better image quality, but I'm trying to process receipts that clients will photograph with their mobile phones and then email to a server for processing. I believe your ABBYY FineReader 12 is a desktop application, which isn't an option since all processing is online. Is there a parameter that can be passed to ABBYY Cloud OCR SDK to make the SDK increase the image quality?

Are there any other suggestions you might have to make ABBYY Cloud OCR SDK work for me?

Thanks.


Hi Pamela,

Thank you for your interest in our product.

We are writing to you regarding your question on the ABBYY Cloud OCR SDK forum. To achieve better recognition results, we advise you to take care of the quality of the source images and try to find more optimal recognition settings for your kind of images. Below you can find our recommendations, which you can use as a starting point.

First, note that your images have quite a low resolution for recognition. Image resolution has a real impact on the OCR quality that can be achieved. We changed the resolution of your image to more suitable values using ABBYY FineReader 12 (Image Editor -> the Resolution tool). Please review the "OCR - Optimal Image Resolution" article to learn more about the recommended resolution values for OCR purposes.

Also, as we have already written on our forum, we usually recommend the textExtraction profile for your usage scenario. This profile is better suited to receipt processing, as it provides better results in both recognition quality and processing speed. Moreover, it is suitable for extracting all text from the input image, including small text areas of low quality.

We have tested your images and managed to achieve quite good recognition results using the recommendations above. Please find our results in the attachment:

The Images folder contains your original image and our images after FineReader 12 image preprocessing. The Results folder contains two subfolders, textExtraction and documentConversion, which hold the OCR results we obtained using the processImage method with the corresponding profiles.

Hope the information is useful.

If you have any technical issues, please visit our Developer Forum to get fast help from ABBYY Cloud OCR SDK developers’ community. Follow us on Twitter to get the latest news.

Kind regards,
Oksana Serdyuk
Technical Support Engineer
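The resolution point in the email can be checked with simple arithmetic before an image is submitted. The Python sketch below estimates the effective resolution of a receipt photo from its pixel width and the physical width of the paper, and reports how far it falls short of a target. The 300 dpi target is a commonly cited figure for OCR and is an assumption here, not a value stated in the email; the function names are hypothetical. Resampling an image to a higher resolution does not add detail that the camera did not capture.

```python
def estimated_dpi(pixel_width, physical_width_inches):
    """Approximate dots per inch of a photo cropped to the receipt edges."""
    if physical_width_inches <= 0:
        raise ValueError("physical width must be positive")
    return pixel_width / physical_width_inches


def upscale_factor(pixel_width, physical_width_inches, target_dpi=300):
    """Scale factor needed to reach target_dpi (never below 1.0).

    target_dpi=300 is an assumed target, not a value taken from the thread.
    """
    dpi = estimated_dpi(pixel_width, physical_width_inches)
    return max(1.0, target_dpi / dpi)


if __name__ == "__main__":
    # Hypothetical example: a 480-pixel-wide crop of a receipt about 3.15 in wide.
    print(round(estimated_dpi(480, 3.15)), "dpi")
    print(round(upscale_factor(480, 3.15), 2), "x")
```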

answered 30 Jun '15, 19:28

ppunzalan

edited 30 Jun '15, 19:38

We also offer ABBYY Mobile Imaging SDK, which you can use for image preprocessing on mobile devices.

answered 01 Jul '15, 16:59

Oksana Serdyuk ♦♦

That will not work for us since we need to streamline the process for our end users. Do you have plans to support imaging in the API in the future? Is there someone I can contact directly at ABBYY to speak to about this issue?

(01 Jul '15, 19:27) ppunzalan

So far there are no plans to support image preprocessing in ABBYY Cloud OCR SDK. In any case, I've forwarded your contact info to my colleagues in our office in your region. They will contact you soon to discuss the issue.

(02 Jul '15, 13:16) Oksana Serdyuk ♦♦

Hi Pamela,

In beta, we have a method that extracts the data from receipts and returns it in an XML structure.

Cheers,

Rainer

answered 08 Jul '15, 15:57

rainerp

edited 14 Aug '15, 15:40

Oksana Serdyuk ♦♦

Thanks Rainer,

I was recently told about this module and after performing some testing, I've found it does a much better job than the processImage option.

(08 Jul '15, 19:09) ppunzalan

The method for receipt capture is now officially released for the USA. For other countries, it is still in beta. Please see more information here: http://ocrsdk.com/documentation/apireference/processReceipt/

and here

https://www.abbyy.com/receipt-capture-ocr/
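As a rough sketch of how the receipt method might be driven from Python, the example below builds a processReceipt URL and parses a task status response. It assumes the service is reached at http://cloud.ocrsdk.com, that the country is passed as a query parameter named country, and that the status response is an XML document with a task element carrying id, status, and resultUrl attributes. The helper names and the sample XML are made up for illustration; the documentation linked above is the authority on the real parameters and response layout.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_process_receipt_url(country="usa",
                              base_url="http://cloud.ocrsdk.com"):
    """Build a processReceipt URL (parameter name and value are assumptions)."""
    return f"{base_url}/processReceipt?{urlencode({'country': country})}"


def parse_task(xml_text):
    """Extract id, status, and result URL from an assumed task response."""
    task = ET.fromstring(xml_text).find("task")
    if task is None:
        raise ValueError("no <task> element in response")
    return {
        "id": task.get("id"),
        "status": task.get("status"),
        "result_url": task.get("resultUrl"),  # None until the task completes
    }


# Hypothetical response body, written by hand for this example.
SAMPLE_RESPONSE = (
    '<response>'
    '<task id="hypothetical-task-id" status="Completed" '
    'resultUrl="http://example.com/result.xml"/>'
    '</response>'
)

if __name__ == "__main__":
    print(build_process_receipt_url())
    print(parse_task(SAMPLE_RESPONSE))
```

In a real client, the image would be POSTed to the built URL with the application credentials, and the task would be polled until its status indicates completion before the result is downloaded.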

answered 06 Sep '16, 17:41

rainerp

Asked: 23 Jun '15, 00:03

Seen: 2,504 times

Last updated: 06 Sep '16, 17:41

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal