Extracting Known Blocks with Minimum Preprocessing

  • Last Post 17 April 2017
  • Topic Is Solved
ahmed.hani posted this 14 April 2017

I have health card scans that were preprocessed with our own proprietary software. All the images have the same resolution, size, and orientation, so no further enhancement is required.

I've created an XML template file for the Cloud OCR service, and it worked perfectly for extracting the name, card number, etc.

However, for practical reasons, we are switching to the offline SDK on Linux, and the process there is not as straightforward as with the Cloud SDK. The documentation does talk about blocks, layouts, regions, etc., but I still don't know the minimum needed to extract text from blocks with known characteristics and coordinates, with as little further processing as possible. I hope you can help me a little with coding this part.

 1- My C++ code so far (I'm experienced in C++)

void processImage( CBstr imageName )
{
    wprintf( L"Processing image %ls\n", (wchar_t*)imageName );

    // Create document from image file
    displayMessage( L"Loading image..." );
    CBstr imagePath = imageName; // Concatenate( GetSamplesFolder(), Concatenate( L"/SampleImages/", imageName ) );
    CSafePtr<IFRDocument> frDocument = 0;
    CheckResult( FREngine->CreateFRDocumentFromImage( imagePath, 0, frDocument.GetBuffer() ) );

    // Tune FineReader Engine for field-level recognition
    CheckResult( FREngine->LoadPredefinedProfile( L"FieldLevelRecognition" ) );

    // Left null here, so Process() runs with default parameters
    CSafePtr<IDocumentProcessingParams> documentProcessingParams;

    // Recognize document
    displayMessage( L"Recognizing..." );
    CheckResult( frDocument->Process( documentProcessingParams ) );
}




2- Sample XML template

<document xmlns="http://ocrsdk.com/schema/taskDescription-1.0.xsd"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://ocrsdk.com/schema/taskDescription-1.0.xsd http://ocrsdk.com/schema/taskDescription-1.0.xsd">
    <fieldTemplates>
        <text id="textoCampo">
            ...
        </text>
        <text id="codigoIdentificacaoCampo">
            ...
        </text>
    </fieldTemplates>
    <page applyTo="0">
        <text id="codigo_identificacao" template="codigoIdentificacaoCampo" left="275" top="125" right="565" bottom="155" />
        <text id="local_plano" template="textoCampo" left="30" top="175" right="270" bottom="210" />
        <text id="tipo_plano" template="textoCampo" left="30" top="250" right="900" bottom="280" />
        ...
    </page>
</document>



Oksana Serdyuk posted this 17 April 2017

Hi Ahmed,

Do I understand correctly that you use the processFields method in Cloud OCR SDK and want to replicate a similar algorithm in FineReader Engine?

Firstly, ABBYY FineReader Engine 11 for Linux and ABBYY Cloud OCR SDK are based on the same generation of OCR technology. However, the versions are different, so some minor differences in the recognition results are possible.

Secondly, to get similar recognition results you should use the same recognition settings in both cases. The FineReader Engine equivalent of the processFields method in Cloud OCR SDK is to manually create text blocks with the required coordinates and recognition settings and add them to the layout; this layout should then be recognized and synthesized. For example, the code for your XML template may look like this:

1) Please do not use the FieldLevelRecognition predefined profile; it is suitable for recognizing short text fragments. You can try either the DocumentConversion_Accuracy profile or the TextExtraction_Accuracy profile. Please test the different variants to find the one most appropriate for your scenario.
2) Creating one block with the same recognition settings as for “local_plano”:


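 // Add image to document
 WriteToLog( "Loading image...\n" );
 CheckResult( frDocument->AddImageFile( imageFilePath, 0, 0 ) );

 // Get the layout of the first page
 CSafePtr<IFRPages> pages;
 CheckResult( frDocument->get_Pages( &pages ) );
 CSafePtr<IFRPage> firstPage;
 CheckResult( pages->Item( 0, &firstPage ) );
 CSafePtr<ILayout> layout;
 CheckResult( firstPage->get_Layout( &layout ) );

 // Convert the block coordinates from the base image to the modified image
 CSafePtr<IImageDocument> imgDoc;
 CheckResult( firstPage->get_ImageDocument( &imgDoc ) );
 CSafePtr<ICoordinatesConverter> converter;
 // Assumed step (not shown in the post): take the converter from the image document
 CheckResult( imgDoc->get_CoordinatesConverter( &converter ) );
 int left = 30;
 int top = 175;
 int right = 270;
 int bottom = 210;
 CheckResult( converter->ConvertCoordinates( IT_Base, IT_Modified, &left, &top ) );
 CheckResult( converter->ConvertCoordinates( IT_Base, IT_Modified, &right, &bottom ) );

 // Build the region and add a text block for it
 CSafePtr<IRegion> region;
 // Assumed step (not shown in the post): create the region through the engine
 // object, named FREngine as in the question's code
 CheckResult( FREngine->CreateRegion( &region ) );
 CheckResult( region->AddRect( left, top, right, bottom ) );
 CSafePtr<ILayoutBlocks> blocks;
 CheckResult( layout->get_Blocks( &blocks ) );
 CSafePtr<IBlock> newBlock;
 CheckResult( blocks->AddNew( BT_Text, region, 0, &newBlock ) );

 // Set recognition settings
 CSafePtr<ITextBlock> textBlock;
 // Assumed steps (not shown in the post): get the text block and its recognizer parameters
 CheckResult( newBlock->GetAsTextBlock( &textBlock ) );
 CSafePtr<IRecognizerParams> recParams;
 CheckResult( textBlock->get_RecognizerParams( &recParams ) );
 CheckResult( recParams->put_TextTypes( TT_Normal ) );
 // Collection of block indices; the calls that fill and use it are not shown in the post
 CSafePtr<IIntsCollection> indices;

 // Read the text of the first paragraph
 CSafePtr<IText> text;
 // Assumed step (not shown in the post): get the block text
 CheckResult( textBlock->get_Text( &text ) );
 CSafePtr<IParagraphs> pars;
 CheckResult( text->get_Paragraphs( &pars ) );
 CSafePtr<IParagraph> par;
 CheckResult( pars->Item( 0, &par ) );
 CBstr parText;
 CheckResult( par->get_Text( &parText ) );

 // Recognize (the recognition and synthesis calls are not shown in the post)
 WriteToLog( "Processing...\n" );

 // Export to TXT
 WriteToLog( "Saving results (TXT)...\n" );
 CBstr resultFilePath = resultsPath + imageFileName + L".txt";
 CheckResult( frDocument->Export( resultFilePath, FEF_TextUnicodeDefaults, 0 ) );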

3) By analogy you can create and add to the layout as many text blocks as you need.

ahmed.hani posted this 17 April 2017

It worked with some tweaks.

I wish the SDK came with more samples.

Thank you very much for your help.