[FR Engine 11 SDK] Strange Behaviour when tuning document processing params after loading predefined profile

  • Last Post 26 April 2015
  • Topic Is Solved
maol posted this 20 April 2015

Hello,

I just noticed something strange.

If I do this:

IDocumentProcessingParams docProcessingParams = engine.CreateDocumentProcessingParams();
docProcessingParams.getPageProcessingParams().getPagePreprocessingParams().setCorrectOrientation(true);

engine.LoadPredefinedProfile("TextExtraction_Accuracy");

document.Process(docProcessingParams);

it gives a totally different result than if I do this:

engine.LoadPredefinedProfile("TextExtraction_Accuracy");

IDocumentProcessingParams docProcessingParams = engine.CreateDocumentProcessingParams();
docProcessingParams.getPageProcessingParams().getPagePreprocessingParams().setCorrectOrientation(true);

document.Process(docProcessingParams);

The first example does preserve the layout of the document. The second example does not preserve the layout of the document.

This is quite noticeable when there are tables.

Is this a bug, or is the order important?

Natalia Karaseva posted this 20 April 2015

Yes, the order is important. As stated in Developer's Help -> Specifications -> Predefined Profiles Specification: "All objects created after the profile is loaded will have these properties set to the specified values".

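So when you load the TextExtraction_Accuracy profile before creating IDocumentProcessingParams, the behavior you see is correct: all the settings from the profile are used for processing, and as a result the layout is not preserved. In your first example the params object was created before the profile was loaded, so it does not pick up the profile's values.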
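To make the ordering rule concrete, here is a minimal toy model in plain Java. It is not the ABBYY API: `ToyEngine`, `Params`, `loadProfile` and `createParams` are hypothetical stand-ins, and the assumption that the setting defaults to false is made up for the illustration. It only mimics the documented rule that objects created after a profile is loaded receive the profile's values.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (NOT the ABBYY API) of "profile values apply to objects created afterwards". */
public class ProfileOrderDemo {

    /** Hypothetical stand-in for a processing-params object. */
    public static class Params {
        final Map<String, Boolean> settings = new HashMap<>();
    }

    /** Hypothetical stand-in for the engine, which remembers the loaded profile. */
    public static class ToyEngine {
        private final Map<String, Boolean> profileValues = new HashMap<>();

        void loadProfile(Map<String, Boolean> values) {
            profileValues.clear();
            profileValues.putAll(values);
        }

        Params createParams() {
            Params p = new Params();
            // Assumed built-in default for this toy model only.
            p.settings.put("EnableTextExtractionMode", false);
            // Profile values are copied in at creation time, never later.
            p.settings.putAll(profileValues);
            return p;
        }
    }

    private static Map<String, Boolean> toyProfile() {
        Map<String, Boolean> profile = new HashMap<>();
        profile.put("EnableTextExtractionMode", true);
        return profile;
    }

    /** Create the params first, then load the profile: the params keep their defaults. */
    public static boolean createThenLoad() {
        ToyEngine engine = new ToyEngine();
        Params params = engine.createParams();
        engine.loadProfile(toyProfile());
        return params.settings.get("EnableTextExtractionMode");
    }

    /** Load the profile first, then create the params: the params carry the profile values. */
    public static boolean loadThenCreate() {
        ToyEngine engine = new ToyEngine();
        engine.loadProfile(toyProfile());
        Params params = engine.createParams();
        return params.settings.get("EnableTextExtractionMode");
    }

    public static void main(String[] args) {
        System.out.println("create then load: " + createThenLoad());
        System.out.println("load then create: " + loadThenCreate());
    }
}
```

In this toy model the two orderings give different values for the same setting, which mirrors the difference between the two snippets in the question.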

maol posted this 22 April 2015

Ok I see ;)

Still, is it possible to get the benefits of the TextExtraction_Accuracy profile AND preserve the layout?

Natalia Karaseva posted this 26 April 2015

Well, the TextExtraction_Accuracy profile contains some settings, such as EnableTextExtractionMode=true, which significantly improve text recognition quality but also affect layout preservation.

You could go through the settings in the TextExtraction_Accuracy profile and choose the ones that have a positive influence on recognition quality. All of the profile's settings are listed in the above-mentioned article.

In addition, I would recommend taking a look at the "Improving Recognition Quality" article in Developer's Help. I hope it will be useful as well.
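The suggestion to pick individual settings from the profile can be pictured with a small, SDK-independent sketch. It treats a profile as a plain map of setting names to values and keeps only the entries you decide to apply by hand. `ProfileSubsetDemo` and `pick` are hypothetical names, and `HypotheticalOtherSetting` is a placeholder rather than a real profile entry; the actual list of settings is in the Predefined Profiles Specification.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy sketch (NOT the ABBYY API): keep only chosen settings from a profile. */
public class ProfileSubsetDemo {

    /** Returns the entries of profile whose names are in keep, in the profile's order. */
    public static Map<String, Boolean> pick(Map<String, Boolean> profile, Set<String> keep) {
        Map<String, Boolean> chosen = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> entry : profile.entrySet()) {
            if (keep.contains(entry.getKey())) {
                chosen.put(entry.getKey(), entry.getValue());
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        Map<String, Boolean> profile = new LinkedHashMap<>();
        profile.put("EnableTextExtractionMode", true);  // named in this thread
        profile.put("HypotheticalOtherSetting", true);  // placeholder, not a real setting

        // Leave out the setting that affects layout and keep the rest.
        Set<String> keep = Collections.singleton("HypotheticalOtherSetting");
        System.out.println(pick(profile, keep));
    }
}
```

With the real SDK, each kept entry would then be set manually on the params objects instead of loading the whole profile; which setters to call depends on where each property lives, so check the specification.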
