Having looked at "PDF-Text Extraction from Text Layer", I can see that FineReader Engine 10 can return the underlying text of a PDF document. Is this also possible via ABBYY Cloud OCR SDK?

asked 23 Aug '13, 17:05

ScraperDragon

edited 27 Aug '13, 14:54

Anastasia Ga... ♦♦


Unfortunately, this feature is not implemented in ABBYY Cloud OCR SDK.

answered 27 Aug '13, 14:54

Anastasia Ga... ♦♦

Hi,

If you want to extract the text layer, you can use a PDF lib like Poppler or PDFMiner.
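As a rough illustration of the Poppler route, here is a minimal Python sketch that shells out to Poppler's `pdftotext` command-line tool. It assumes `pdftotext` is installed and on your PATH; the helper names `build_pdftotext_command` and `extract_text_layer` are made up for this example and are not part of any library.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, keep_layout=False):
    """Build the argument list for Poppler's pdftotext CLI (hypothetical helper)."""
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if keep_layout:
        # -layout asks pdftotext to keep the physical arrangement of the text.
        cmd.append("-layout")
    # "-" as the output file sends the extracted text to stdout.
    cmd += [pdf_path, "-"]
    return cmd


def extract_text_layer(pdf_path, keep_layout=False):
    """Return the existing text layer of a PDF as a string (no OCR is performed)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext (Poppler utilities) was not found on PATH")
    result = subprocess.run(
        build_pdftotext_command(pdf_path, keep_layout),
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode("utf-8", errors="replace")
```

Calling `extract_text_layer("document.pdf")` returns whatever text is already stored in the file. A scanned PDF with no text layer will come back empty or nearly so, since nothing here does recognition.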

All the best,

Sam

answered 27 Aug '13, 15:49

samuelcossette

Sadly, what I'm really interested in is automatic layout detection, which is hard to come by.

(28 Aug '13, 13:59) ScraperDragon

What do you mean by "automatic layout detection"? Could you give an example or more detail?

(28 Aug '13, 14:01) samuelcossette

Asked: 23 Aug '13, 17:05

Seen: 2,292 times

Last updated: 28 Aug '13, 14:01

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal