OCR errors on numbers

  • Last Post 10 September 2014
  • Topic Is Solved
Vitalie posted this 09 September 2014

Hello,

We are extracting text from scanned documents with the following parameters: language=english, profile=textExtraction, imageSource=scanner, correctSkew=true, exportFormat=pdf, pdf:writeTags=yes.

The quality of the images being processed is good.

The result is generally very good, but parts of some numbers contain OCR errors. For example, where the page shows 41,917.94, the result is -41/9V7.94, which seems a very strange result to me.
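
One way to spot this kind of damage automatically is to check every digit-bearing token in the extracted text against the expected number format. The sketch below is only an illustration: the pattern assumes numbers written like 41,917.94 (comma thousands separators, optional decimals, optional leading minus), and `suspicious_numbers` is a hypothetical helper, not part of the OCR SDK.

```python
import re

# Assumed format: optional minus, digits with optional comma thousands
# separators, optional decimal part (e.g. 41,917.94 or 1200 or -3.5).
NUMBER = re.compile(r"-?(\d{1,3}(,\d{3})+|\d+)(\.\d+)?")


def suspicious_numbers(text):
    """Return tokens that contain a digit but do not match NUMBER."""
    flagged = []
    for token in text.split():
        # Ignore trailing sentence punctuation before checking the token.
        core = token.rstrip(".,;:")
        if any(ch.isdigit() for ch in core) and not NUMBER.fullmatch(core):
            flagged.append(core)
    return flagged


print(suspicious_numbers("Total 41,917.94 but OCR gave -41/9V7.94"))
```

A check like this only flags candidates for manual review; it cannot recover the correct value.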

I have just sent the files with the errors to the ocrsdk e-mail address, for the attention of Anastasia Galimova.

Can I have some feedback from ABBYY to resolve this problem?

Many thanks, Vitalie

Andrey Isaev posted this 10 September 2014

Strange, but the Demo page shows correct results on the numbers and just one mistake in the text. You should probably play around with the settings and choose the most suitable ones.

Vitalie posted this 10 September 2014

Hello Andrey,

I tried the demo page with the same settings (English, Text extraction, Scanner). It works fine on the first two strings, but it also fails on the last two strings of numbers. Please see the attached image. This is very strange to me. I think there is something to examine here. How can I improve text extraction if I get OCR errors even on the Demo page?

Andrey Isaev posted this 10 September 2014

Thank you, Vitalie, that's a different story. Now that we can reproduce this, support will investigate it.

Vitalie posted this 10 September 2014

Thank you, Andrey, I will be waiting for the results of the investigation.

Oksana Serdyuk posted this 16 September 2014

Please try to use profile=documentConversion.
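
As a rough sketch of what that change looks like in a request, the snippet below builds the query string from the settings listed in the first post with only the profile swapped. The parameter names and values are copied from this thread; the `processImage` endpoint URL and the helper name `build_url` are assumptions for illustration, so check them against the current API reference before use.

```python
from urllib.parse import urlencode

# Settings from the first post, names and values as written in this thread.
BASE_SETTINGS = {
    "language": "english",
    "profile": "textExtraction",
    "imageSource": "scanner",
    "correctSkew": "true",
    "exportFormat": "pdf",
    "pdf:writeTags": "yes",
}

# Assumed endpoint; not confirmed anywhere in this thread.
ENDPOINT = "https://cloud.ocrsdk.com/processImage"


def build_url(overrides=None):
    """Return the request URL with any overrides applied to BASE_SETTINGS."""
    settings = {**BASE_SETTINGS, **(overrides or {})}
    # safe=":" keeps the colon in "pdf:writeTags" unescaped.
    return ENDPOINT + "?" + urlencode(settings, safe=":")


print(build_url({"profile": "documentConversion"}))
```

Keeping the original settings in one place and overriding a single key makes it easy to compare the two profiles on the same scan.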
