Recognized text layout position

  • Last Post 06 October 2017
  • Topic Is Solved
michele_malaguti posted this 20 September 2017

When recognizing handwritten text, I noticed that the position of the resulting text in the exported PDF often differs from its position in the original.

This becomes visible, for instance, when exporting both the text and its underlying image, such as a grid.

What rule does ABBYY FineReader use to decide the final position of the text in the exported document? Can I control it through code parameters?

Can I be sure that if I draw, in the exported PDF, a block area with the same coordinates as in the source PDF, the recognized text will fall inside it?

Thanks, Michele

IvanPopov posted this 02 October 2017

Could you please provide a few examples of input images and output PDF files that illustrate the issue? You can attach the files to your post or send them to SDK_Support@abbyy.com.

Please note that ABBYY FineReader Engine only supports recognition of so-called "handprinted" text; see this article: http://knowledgebase.ocrsdk.com/article/1099.

In general, the outline of the resulting recognized text would not perfectly match the outline of handprinted text on the original images. While letters belonging to the same word on the original image may be of different size and not be perfectly aligned, the same letters in the exported PDF will all be the same size and positioned in a single line.

michele_malaguti posted this 02 October 2017

Here is an example of handprinted text processing using the VisualComponents example.
On the right side, the relative position of the recognized text differs depending on the size of the block area drawn around the handprinted zones.
So I wonder whether I can somehow control the position (or even the font) of the extracted text.
Thanks, Michele.

IvanPopov posted this 06 October 2017

To change the formatting of the recognized text, you could use the CharParams object (please see the Developer's Help article API Reference → Text-Related Objects → CharParams). The ICharParams::SetFont() method lets you change the font used in the output for individual characters.

As for the text outline, on our side the outline of the underlying recognized text in the exported PDF file appeared to match the text on the image.
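
For anyone who wants to check the coordinate question from the first post themselves, here is a minimal Python sketch of the unit conversion involved. It is not part of the FineReader Engine API; the function names are hypothetical. It assumes the block rectangle is given in image pixels with the origin at the top-left corner, that the image resolution (DPI) is known, and that the exported PDF page has the same physical size as the image with no rotation or cropping. PDF default user space uses points (1/72 inch) with the origin at the bottom-left corner, so the vertical axis has to be flipped.

```python
def image_rect_to_pdf_rect(left, top, right, bottom, dpi, page_height_px):
    """Convert a pixel rectangle (top-left origin) to PDF points (bottom-left origin).

    Hypothetical helper for illustration; assumes no rotation, cropping, or scaling
    between the source image and the exported PDF page.
    """
    # 72 PDF points per inch, dpi pixels per inch.
    x0 = left * 72.0 / dpi
    x1 = right * 72.0 / dpi
    # Flip the vertical axis: the image's bottom edge becomes the lower y value.
    y0 = (page_height_px - bottom) * 72.0 / dpi
    y1 = (page_height_px - top) * 72.0 / dpi
    return (x0, y0, x1, y1)


def rect_contains(outer, inner, tolerance=0.0):
    """Return True if `inner` lies within `outer`, allowing `tolerance` points of slack."""
    ox0, oy0, ox1, oy1 = outer
    ix0, iy0, ix1, iy1 = inner
    return (
        ix0 >= ox0 - tolerance
        and iy0 >= oy0 - tolerance
        and ix1 <= ox1 + tolerance
        and iy1 <= oy1 + tolerance
    )


# Example: a block drawn on a 300 dpi scan that is 3300 pixels tall (11 inches).
block_in_pdf = image_rect_to_pdf_rect(300, 300, 600, 450, dpi=300, page_height_px=3300)
```

Comparing the converted block with the bounding box of the text actually found in the exported PDF (read with whatever PDF tool you use) shows whether the recognized text stays inside the original block. A small tolerance is usually sensible, since, as noted above, handprinted letters are normalized to one size and a single baseline on export.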
