I'm running the ABBYY OCR engine in C# to process receipts, and it's skipping large portions of receipts that are clearly readable. I've created user-trained patterns for the receipts that have this problem, but that only produced 100% accuracy for the portions the engine already processed. The lines of text it was skipping are still skipped after training.

I'm using the DocumentConversion_Accuracy profile like this:

        // Start from the predefined accuracy-oriented profile.
        _engine.LoadPredefinedProfile("DocumentConversion_Accuracy");
        FRDocument document = _engine.CreateFRDocument();
        //
        // load some images here...
        //
        // Point the recognizer at the user-trained pattern file.
        const string patternFile = "C:\\abbyy_pattern.ptn";
        DocumentProcessingParams docParams = _engine.CreateDocumentProcessingParams();
        docParams.PageProcessingParams.RecognizerParams.UserPatternsFile = patternFile;
        // Run analysis and recognition on all loaded pages.
        document.Process(docParams);

Here is a sample image illustrating the problem. The colored rectangles mark text that ABBYY processed successfully. As you can see, a large portion of the receipt has no OCR data.

[Image: sample receipt scan with colored rectangles over the recognized text]

As a test, I modified the scanned receipt in Photoshop to increase the line spacing in the problem area. After I added extra space between the lines, ABBYY started generating OCR data for that area.

Obviously, I can't Photoshop every scan that has this problem, and it shows up in a significant percentage of our scans.
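
The manual line-spacing test could in principle be automated as a preprocessing step before the image reaches the engine. Here is a minimal sketch of that idea, assuming the scan has already been binarized into a `bool[][]` where `true` marks a dark pixel. The names `LineSpacing`, `AddLineSpacing`, and `minInkPixels` are hypothetical and are not part of the ABBYY API. It only helps when at least one blank pixel row separates adjacent text lines, and it has not been verified against the engine.

```csharp
using System;
using System.Collections.Generic;

public static class LineSpacing
{
    // Returns a copy of the image with extraRows blank rows inserted
    // after every text line. ink[y][x] is true where the pixel is dark.
    public static bool[][] AddLineSpacing(bool[][] ink, int extraRows, int minInkPixels = 1)
    {
        var result = new List<bool[]>();
        bool previousRowHadInk = false;

        foreach (bool[] row in ink)
        {
            bool rowHasInk = HasInk(row, minInkPixels);

            // A text line ends where an inked row is followed by a blank row.
            if (previousRowHadInk && !rowHasInk)
            {
                for (int i = 0; i < extraRows; i++)
                {
                    result.Add(new bool[row.Length]); // new arrays are all false (blank)
                }
            }

            result.Add(row);
            previousRowHadInk = rowHasInk;
        }

        return result.ToArray();
    }

    // A row counts as inked once it contains at least minInkPixels dark pixels,
    // so a few specks of scanner noise can be ignored by raising the threshold.
    private static bool HasInk(bool[] row, int minInkPixels)
    {
        int count = 0;
        foreach (bool pixel in row)
        {
            if (pixel && ++count >= minInkPixels)
            {
                return true;
            }
        }
        return false;
    }
}
```

Calling `LineSpacing.AddLineSpacing(ink, 4)` would pad each gap between text lines with four extra blank rows. Converting the scan to and from the `bool[][]` form is left out here because it depends on the imaging library in use.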

I have read over the documentation multiple times and cannot find any setting that looks like a solution to this problem.

Can anyone assist me?

Thanks,

asked 14 May '14, 19:54

ThinkingMedia

Dear sir, please send sample images and a description of this issue to dev_support@abbyyusa.com.

answered 21 May '14, 15:47

SDK_support ♦♦