Hi,

I'm using the XML output format to detect and parse tables (blockType="Table") in PDFs that have not been OCRed, and it's very reliable. I also have a bunch of PDFs that already have a text layer, so no OCR is required on them.
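
To illustrate what parsing tables from the XML output can look like, here is a minimal sketch. It assumes the exported XML marks tables as `block` elements with `blockType="Table"` that contain `row` and `cell` children. The `row` and `cell` element names, the sample document, and the helpers `local_name` and `find_tables` are assumptions made for illustration; they are not taken from the original files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, loosely modelled on an OCR XML export.
# Real output may use a namespace and carry many more attributes.
SAMPLE_XML = """<document>
  <page>
    <block blockType="Text"/>
    <block blockType="Table">
      <row><cell/><cell/></row>
      <row><cell/><cell/></row>
    </block>
  </page>
</document>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def find_tables(xml_text):
    """Return, for each table block, a list with the cell count of every row."""
    root = ET.fromstring(xml_text)
    tables = []
    for elem in root.iter():
        if local_name(elem.tag) == "block" and elem.get("blockType") == "Table":
            rows = [child for child in elem if local_name(child.tag) == "row"]
            shape = [
                sum(1 for c in row if local_name(c.tag) == "cell")
                for row in rows
            ]
            tables.append(shape)
    return tables


print(find_tables(SAMPLE_XML))
```

Matching on the local tag name keeps the sketch independent of whichever XML namespace the export declares.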

Is there a way to process such a document and specify that only the cell detection should be done?

Best regards,

Sam

asked 30 Aug '13, 16:07

samuelcossette
12

Unfortunately, I do not fully understand the question. Could you please explain in other words what you mean by "to only do the cell detection"?

Do you mean that you want to recognize only the tables in your PDF files and skip the other text?

(02 Sep '13, 21:04) Anastasia Ga... ♦♦
Last updated: 05 Sep '13, 00:15

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal