Text recognition within defined ROIs

  • Last Post 29 June 2017
imenmelki posted this 14 June 2017

Hello,

I have an image containing several text zones, and I would like to run OCR on each zone separately. How can I do this? Also, when OCR is applied per ROI, how are the license units deducted: per ROI, or per image (containing several ROIs)?

Thanks for your help!

Diana Khammatova posted this 15 June 2017

Hello,

If you want to recognize specific regions of an image, you can place blocks on the image manually. To do that, instead of using the Analyze method (which performs automatic layout analysis), you first need to create TextBlock objects corresponding to the areas that contain text and add these blocks to the layout of the document page where the areas are located. You can then use the RecognizerParams property of the newly created text blocks to change recognition settings. In C#, the code would look something like this:

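// Define a rectangular region; Left, Top, Right and Bottom are placeholders for your own coordinates.
FREngine.Region region = Engine.CreateRegion();
region.AddRect(Left, Top, Right, Bottom);
// Add a text block covering that region to the layout of the page.
FREngine.IBlock block = page.Layout.Blocks.AddNew(FREngine.BlockTypeEnum.BT_TextBlock, region, 0);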

In this code snippet, page is an FRPage object (a particular page of the FRDocument object). The parameters of the AddRect method of the Region object specify the coordinates of the left, top, right and bottom borders of the area that you want to recognize; you need to set them manually. After that, call the Recognize, Synthesize and Export methods of the FRDocument object. Please refer to the Help file for additional information about the objects and methods used above.

Regarding license units: the page counter depends on the size of the recognized page. You can find more information about how the page counter is calculated in the article http://knowledgebase.abbyy.com/article/1373.

imenmelki posted this 16 June 2017

Thanks for your response. I have a further question: is there any way to know which text was recognized in which region?

Diana Khammatova posted this 16 June 2017

The Region property of the Block object determines the block's location on the page for the final output (Help → API Reference → Supplementary Objects and Methods → Region).

To get the text from a block, use the Paragraphs property of the Text object. Each item of Paragraphs has a Words property that provides access to the collection of words in that paragraph (more details in Help → Guided Tour → Advanced Techniques → Working with Text).

imenmelki posted this 27 June 2017

Thanks for your response!

I wonder whether it is possible to get the recognized text of a given block without having to process each paragraph of the block separately. I would like to get the whole text of a block in one single container.

Diana Khammatova posted this 29 June 2017

Recognized text can only be obtained through the Paragraphs of the Text object. The KnowledgeBase article "How to get recognized text?" contains a C# code snippet that shows how to get all the text from TextBlock objects and write it to a string.
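
To illustrate the idea of collecting a block's paragraphs into one container, here is a minimal C# sketch. It is not the KnowledgeBase snippet and does not call the engine: ParagraphStub and BlockTextJoiner are hypothetical stand-ins, and the assumption that each paragraph exposes its recognized text as a string should be checked against the Help file. With the real engine you would iterate over the Paragraphs collection of the block's Text object instead.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an engine paragraph; this is NOT an FREngine type.
public sealed class ParagraphStub
{
    public ParagraphStub(string text) { Text = text; }

    // Assumption: each paragraph exposes its recognized text as a string.
    public string Text { get; }
}

public static class BlockTextJoiner
{
    // Joins the text of all paragraphs of one block into a single string,
    // separating paragraphs with the given separator (a newline by default).
    public static string Join(IEnumerable<ParagraphStub> paragraphs, string separator = "\n")
    {
        return string.Join(separator, paragraphs.Select(p => p.Text));
    }
}
```

Calling BlockTextJoiner.Join on the paragraphs of one block gives a single string per block, so you can keep one string per region and still know which text came from which ROI.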
