We are running OCR on PDF files and exporting the results to XML. We cannot work out how ABBYY handles tables that appear in the PDF.

asked 15 Sep '16, 14:13

Sandip Bhavsar


If you use the documentConversion profile, tables are detected during the document analysis stage. When you export the recognition result to XML, the output is described by an XML schema. The recognized text is presented as a hierarchy: document > page > block > region > etc. The block tag has a blockType attribute, which denotes the type of the block: Text, Table, Picture, Barcode, Separator, or SeparatorsBox.
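To see which blocks were classified as tables, you could scan the exported XML for block elements and read their blockType attribute. The snippet below is only a rough sketch in Python using the standard library: the function names, the sample XML, and its namespace URL are made up for illustration, and it assumes nothing about the schema beyond the document > page > block nesting and the blockType attribute described above. It matches on local tag names so it does not depend on the exact namespace in your export.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def blocks_by_type(xml_text):
    """Return (page_number, blockType) for every block, in document order.

    Page numbers are 1-based positions of the page elements in the file,
    not values read from the XML.
    """
    root = ET.fromstring(xml_text)
    pages = [el for el in root.iter() if local_name(el.tag) == "page"]
    found = []
    for page_no, page in enumerate(pages, start=1):
        for el in page.iter():
            if local_name(el.tag) == "block":
                found.append((page_no, el.get("blockType")))
    return found


# Hypothetical, heavily trimmed stand-in for a real export.
# The namespace URL is a placeholder, not the real schema location.
SAMPLE = """<document xmlns="http://example.com/hypothetical-schema">
  <page>
    <block blockType="Text"/>
    <block blockType="Table"/>
  </page>
  <page>
    <block blockType="Picture"/>
    <block blockType="Table"/>
  </page>
</document>"""

if __name__ == "__main__":
    found = blocks_by_type(SAMPLE)
    tables = [page_no for page_no, kind in found if kind == "Table"]
    print("Pages containing a table block:", tables)
```

For a real file, read the exported XML into a string and pass it to blocks_by_type in place of SAMPLE. If a table you expect is reported as Text or Picture, that suggests the analysis stage did not detect it as a table.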

answered 16 Sep '16, 12:40

Oksana Serdyuk ♦♦

Thanks for your answer, Oksana.

(20 Sep '16, 13:08) Sandip Bhavsar

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal