I just noticed something strange.
If I do this:
it gives a totally different result than if I do this:
The first example preserves the layout of the document. The second one does not.
This is quite noticeable when there are tables.
Is it a bug, or is the order important?
asked 20 Apr '15, 14:00
Well, the TextExtraction_Accuracy profile contains some settings, such as EnableTextExtractionMode=true, which significantly improve text recognition quality but also affect layout preservation.
You could investigate the settings in the TextExtraction_Accuracy profile and choose the ones that have a positive influence on recognition quality. All of the profile's settings are listed in the Predefined Profiles Specification article.
In addition, I would recommend taking a look at the "Improving Recognition Quality" article in Developer's Help. I hope it will also be useful.
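The idea of picking individual settings out of a profile can be sketched roughly as follows. This is a plain Python illustration, not FineReader Engine code: `pick_settings` is a hypothetical helper, and apart from EnableTextExtractionMode the setting names are made-up placeholders, not the real contents of the profile.

```python
# Hypothetical sketch: this dictionary is a placeholder, not the real
# contents of the TextExtraction_Accuracy profile.
profile = {
    "EnableTextExtractionMode": True,  # the setting named in this answer
    "PlaceholderSettingA": True,       # made-up name, for illustration only
    "PlaceholderSettingB": False,      # made-up name, for illustration only
}


def pick_settings(profile, keep):
    """Return only the profile settings whose names are listed in `keep`."""
    return {name: value for name, value in profile.items() if name in keep}


# Keep the settings judged helpful and leave out the one that affects layout.
chosen = pick_settings(profile, keep={"PlaceholderSettingA", "PlaceholderSettingB"})
print(sorted(chosen))
```

Which settings are actually worth keeping has to be determined by testing on your own documents; the sketch only shows the selection step.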
answered 26 Apr '15, 11:18
Yes, the order is important. As stated in Developer's Help->Specifications->Predefined Profiles Specification: "All objects created after the profile is loaded will have these properties set to the specified values".
So when you load the TextExtraction profile before creating IDocumentProcessingParams, the result you see is the expected behavior. All the settings from this profile are used for processing, and as a result the layout is not preserved.
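The ordering rule quoted above can be illustrated with a small toy model. This is plain Python and not the FineReader Engine API: `ToyEngine`, `load_profile`, and `create_params` are hypothetical names, and the model only assumes that a profile changes the defaults picked up by objects created afterwards.

```python
# Toy model only: ToyEngine and its methods are hypothetical stand-ins,
# not the FineReader Engine API.
class ToyEngine:
    def __init__(self):
        # Defaults that newly created parameter objects start from.
        self._defaults = {"EnableTextExtractionMode": False}

    def load_profile(self, settings):
        # Loading a profile changes the defaults for objects created later.
        self._defaults.update(settings)

    def create_params(self):
        # Each new object gets a snapshot of the defaults at creation time.
        return dict(self._defaults)


profile = {"EnableTextExtractionMode": True}

# Order A: create the params first, then load the profile.
engine_a = ToyEngine()
params_a = engine_a.create_params()
engine_a.load_profile(profile)

# Order B: load the profile first, then create the params.
engine_b = ToyEngine()
engine_b.load_profile(profile)
params_b = engine_b.create_params()

print(params_a["EnableTextExtractionMode"])  # False: created before the profile
print(params_b["EnableTextExtractionMode"])  # True: created after the profile
```

In this model the object created before the profile is loaded keeps the old defaults, which matches the difference between the two orderings described in the question.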
answered 20 Apr '15, 19:33
Ok I see ;)
Still, is it possible to get the benefits of the TextExtraction_Accuracy profile AND preserve the layout?
answered 22 Apr '15, 14:15