Hi there,

I'm quite new to the OCR SDK and have been experimenting with it for the past few days using a Node.js application. I managed to successfully scan 50 documents and have been reviewing the results.

I decided to throw a more complicated document at it -- so I grabbed 50 JPEG images of flyers from a website and ran them through the OCR SDK. The results weren't what I was hoping for, so I wanted to reach out and get feedback on what I can do to make them a lot more accurate.

  1. Almost all the images I scanned missed the large block numbers that most coupons have. I'm wondering if it's an issue with typefaces that have strokes on them or are set in different colours.
  2. Some of the lower resolution images (minimum 1024x768) returned no text at all. They are quite readable, so I can't imagine why the text wouldn't be recognized.
  3. Almost all scripted typefaces were ignored.

Here's a sample flyer I scanned which turned up very few results: http://toronto.flyerland.ca/new/flyers/view/96593/ON/0/0/0/0/rexall-pharma-plus/1081/8//1372947590

Are there settings in the API that I can trigger, or are there any best practice recommendations I can follow to get the quality up? I realize that OCR is nowhere near perfect, but I'm curious as to what else I can do.

Thanks,

Dave

asked 04 Jul '13, 18:24

reds


Hi Dave!

We have processed the sample flyers from the website you mentioned. As far as we can see, they contain text in different font sizes, and the text color is inverted in several text blocks. To get correct recognition results for such images, it is necessary to tune the resolution.
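As a rough illustration of what tuning the resolution can involve on the client side, the sketch below estimates the effective DPI of a flyer image and the upscale factor needed to reach a target DPI before the image is sent for recognition. The function name `estimateUpscale`, the assumed physical width of the flyer, and the 300 dpi default (a common rule of thumb for OCR input) are assumptions made for illustration; none of them is part of the OCR SDK API.

```javascript
// Hypothetical helper, not part of the OCR SDK: estimates how much an image
// should be upscaled so that it reaches a target resolution before OCR.
function estimateUpscale(widthPx, assumedWidthInches, targetDpi = 300) {
  if (!(widthPx > 0) || !(assumedWidthInches > 0) || !(targetDpi > 0)) {
    throw new RangeError("all arguments must be positive numbers");
  }
  // Effective resolution if the image were printed at the assumed width.
  const effectiveDpi = widthPx / assumedWidthInches;
  // Never downscale: images already at or above the target keep factor 1.
  const scale = Math.max(1, targetDpi / effectiveDpi);
  return { effectiveDpi, scale };
}

// Example: a 1024-pixel-wide flyer image assumed to be 8.5 inches wide.
const { effectiveDpi, scale } = estimateUpscale(1024, 8.5);
console.log(`~${Math.round(effectiveDpi)} dpi, upscale by ~${scale.toFixed(2)}x`);
```

The resizing itself would be done with whatever image library you already use in Node.js. Upscaling cannot restore detail that is missing from the source image, so a higher-resolution original is preferable where one is available.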

So that we can help you and investigate this issue in more detail, please send the following information to CloudOcrSdk@abbyy.com:

1) the settings you used to process the sample flyers;

2) a description of how the results should look.

Thank you.

Best regards,

Natalia.

answered 22 Jul '13, 17:34

SDK_support


© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal