I'm quite new to the OCR SDK and have been experimenting with it for the past few days using a Node.js application. I managed to successfully scan 50 documents and have been reviewing the results.
I decided to throw a more complicated document at it, so I grabbed 50 JPEG images of flyers from a website and ran them through the OCR SDK. The results weren't what I was hoping for, so I wanted to reach out and get feedback on what I can do to make them a lot more accurate.
Here's a sample flyer I scanned which turned up very few results: http://toronto.flyerland.ca/new/flyers/view/96593/ON/0/0/0/0/rexall-pharma-plus/1081/8//1372947590
Are there settings in the API that I can trigger, or are there any best-practice recommendations I can follow to get the quality up? I realize that OCR is nowhere near perfect, but I'm curious as to what else I can do.
asked 04 Jul '13, 18:24
We have processed the sample flyers from the mentioned website. As we can see, they contain text in different font sizes, and the text color is inverted in several text blocks. To get correct recognition results for such images, it is necessary to tune the resolution.
So that we can help you and investigate this issue in more detail, please send the following information to CloudOcrSdk@abbyy.com:
1) the settings you used to process the sample flyers;
2) a description of how the results should look.
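As a rough illustration of the resolution tuning mentioned above, here is a minimal Node.js sketch that estimates how much to upscale a web image before sending it for recognition. It assumes the flyers were saved at screen resolution (often somewhere around 72 to 96 DPI). The function names, the 300 DPI target, and the 4x cap are hypothetical values chosen for illustration, not SDK settings. The actual resizing would be done with an image library of your choice.

```javascript
// Hypothetical helper: estimate the upscale factor for a low-resolution image.
// sourceDpi: assumed resolution of the downloaded flyer (an estimate, not read from the file).
// targetDpi: resolution to aim for; 300 is a common rule of thumb, used here as an assumption.
// maxFactor: cap so that a very low source DPI does not produce a huge image.
function upscaleFactor(sourceDpi, targetDpi = 300, maxFactor = 4) {
  if (!(sourceDpi > 0) || !(targetDpi > 0)) {
    throw new RangeError("DPI values must be positive numbers");
  }
  const factor = targetDpi / sourceDpi;
  // Never downscale (minimum 1) and never exceed the cap.
  return Math.min(Math.max(factor, 1), maxFactor);
}

// Hypothetical helper: pixel dimensions to resize to before upload.
function targetSize(width, height, sourceDpi, targetDpi = 300) {
  const f = upscaleFactor(sourceDpi, targetDpi);
  return { width: Math.round(width * f), height: Math.round(height * f) };
}
```

For example, `targetSize(800, 1000, 100)` suggests resizing an 800x1000 image to 2400x3000 pixels. Upscaling cannot restore detail that is not in the source image, so a higher-resolution original is preferable when one is available.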
answered 22 Jul '13, 17:34