How does ABBYY perform OCR on table structures in a PDF?

  • Last Post 20 September 2016
Sandip Bhavsar posted this 15 September 2016

We are running OCR on PDF files and exporting the result to XML. We do not understand how ABBYY treats tables that reside in the PDF.

Oksana Serdyuk posted this 16 September 2016

If you use the documentConversion profile, tables are detected during the document analysis stage. When you export the recognition result to XML, the output is described by the following XML schema. In this case, the recognized text is presented in a proper hierarchy: document > page > block > region > etc. The block tag has a blockType attribute, which denotes the type of the block: Text, Table, Picture, Barcode, Separator, or SeparatorsBox.
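As a rough illustration of how the blockType attribute could be used on the exported file, here is a minimal Python sketch. It relies only on the block tag and the blockType attribute mentioned above. The sample XML, the text child element, and the helper names (local_name, blocks_by_type) are assumptions made for the example; a real export is much richer and normally carries an XML namespace, which the sketch strips before comparing tag names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily simplified sample. A real export has a namespace,
# coordinates, and many more nested elements than shown here.
SAMPLE_XML = """<document>
  <page>
    <block blockType="Text"><region/><text>Invoice summary</text></block>
    <block blockType="Table"><region/><text>Item Qty Price</text></block>
    <block blockType="Picture"><region/></block>
  </page>
</document>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def blocks_by_type(xml_text):
    """Return (blockType, flattened text) pairs for every block element."""
    root = ET.fromstring(xml_text)
    result = []
    for elem in root.iter():
        if local_name(elem.tag) == "block":
            # Join all nested text and collapse runs of whitespace.
            text = " ".join("".join(elem.itertext()).split())
            result.append((elem.get("blockType"), text))
    return result


blocks = blocks_by_type(SAMPLE_XML)
type_counts = Counter(block_type for block_type, _ in blocks)
table_blocks = [text for block_type, text in blocks if block_type == "Table"]

print(type_counts)
print(table_blocks)
```

Filtering on blockType == "Table" is one simple way to separate table content from ordinary text blocks before any further processing of the table's inner structure.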

Sandip Bhavsar posted this 20 September 2016

Thanks for your answer, Oksana.