ABBYY FineReaderEngine 11: OCR on picture: last line ignored

  • Last Post 21 June 2016
anthonyd posted this 20 June 2016

Hello,

Moving from ABBYY OCR SDK v.10 to v.11 involved quite a few updates to the processing and analysis parameters. I have tried, without success, to find the parameters that ensure OCR on a picture like this one (http://goo.gl/i8wMY8) doesn't omit the last line, namely "aabc". My current result:

"Abe Def
GHIJK
Lmmmmnn ooo ppppq r
"

Going through the user guide, I thought that setting EnableAggressiveTextExtraction and/or DetectTextOnPictures to true, or EnableTextExtractionMode to true, would solve the issue. Unfortunately, it didn't. There are a ton of parameters to try, and despite their descriptions in the user guide, I haven't a clue which of them could help.

Thanks in advance for the help, Anthony.

ABBYY SDK 11.00.64.00

Oksana Serdyuk posted this 21 June 2016

Please try to use one of the following predefined profiles:

  • TextExtraction_Accuracy
  • TextExtraction_Speed

These profiles are designed for situations where you need to retrieve all data from the image. They are suitable for extracting all text from the input image, including small text areas of low quality. Note that the document appearance and structure are ignored, and pictures and tables are not detected. The first profile (TextExtraction_Accuracy) is optimized for accuracy, and the second (TextExtraction_Speed) for processing speed. When I use these profiles, the output is the following:

Abe Def               
GHIJK                 
Lmmmmnn ooo ppppq r   
 aabc

By the way, when I simply use the following processing settings:

[ObjectsExtractionParams]
DetectTextOnPictures = TRUE 
EnableAggressiveTextExtraction = TRUE

[PageAnalysisParams]
EnableTextExtractionMode = TRUE

the result is also the same.
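
For anyone who wants to generate such a settings file from a script, here is a minimal Python sketch. It only builds the text of the profile shown above using the standard configparser module; the helper name build_profile and the output file name are hypothetical, and loading the resulting file into the engine is left out because that depends on the SDK binding you use.

```python
import configparser
import io


def build_profile(settings):
    """Render a dict of {section: {key: value}} as INI-style profile text.

    Hypothetical helper for illustration; it is not part of the ABBYY SDK.
    """
    parser = configparser.ConfigParser()
    # configparser lowercases keys by default; keep the original case,
    # since the parameter names in the thread are mixed-case.
    parser.optionxform = str
    for section, options in settings.items():
        parser[section] = options
    buffer = io.StringIO()
    parser.write(buffer)  # writes "Key = Value" lines under each [Section]
    return buffer.getvalue()


# The same three settings quoted in the post above.
SETTINGS = {
    "ObjectsExtractionParams": {
        "DetectTextOnPictures": "TRUE",
        "EnableAggressiveTextExtraction": "TRUE",
    },
    "PageAnalysisParams": {
        "EnableTextExtractionMode": "TRUE",
    },
}

profile_text = build_profile(SETTINGS)

if __name__ == "__main__":
    # "text_extraction.ini" is an assumed file name, chosen for this example.
    with open("text_extraction.ini", "w", encoding="ascii") as handle:
        handle.write(profile_text)
```

Keeping the settings in a dictionary makes it easy to toggle one parameter at a time and compare the recognized text, which is useful when checking which option recovers the missing line.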
