I want to extract text directly from the text layer of a PDF without applying OCR to it. The following lines from the FREngine10UserGuide (page 386) suggest that this is possible:

IsFromSourceContent - Specifies whether the character has been extracted from the text content of the input file without recognition. For example, it can be extracted from a PDF file with a text layer.

Can anyone tell me how to do this using FineReader?

asked 08 Aug '13, 11:21

Hello Sham!

If you set IObjectsExtractionParams::SourceContentReuseMode to CRM_ContentOnly, only the text layer of the source PDF file is used and the image layer is ignored. Note, however, that if a text line contains characters that are not included in the alphabet of the selected recognition languages, that text cannot be written to the result and the line has to be re-recognized.

For details, please see page 754 of the FREngine10UserGuide.
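
The re-recognition caveat above can be illustrated with a small Python sketch. It does not call the FineReader Engine API: `split_reusable_lines` and `ENGLISH_ALPHABET` are hypothetical names, and the alphabet is only an assumption standing in for the alphabet of the selected recognition languages. The sketch only models which text-layer lines could be taken as they are and which would have to be re-recognized.

```python
import string

# Assumption: a stand-in for the alphabet of the selected recognition
# languages (here: ASCII letters, digits and punctuation).
ENGLISH_ALPHABET = set(string.ascii_letters + string.digits + string.punctuation)


def split_reusable_lines(text_layer_lines, alphabet):
    """Split text-layer lines into reusable lines and lines to re-recognize.

    A line is reusable only if every non-whitespace character belongs to
    the alphabet; otherwise it is reported with its offending characters.
    """
    reusable = []
    rerecognize = []
    for line in text_layer_lines:
        # Whitespace is ignored in the alphabet check (an assumption).
        unknown = {ch for ch in line if not ch.isspace() and ch not in alphabet}
        if unknown:
            rerecognize.append((line, sorted(unknown)))
        else:
            reusable.append(line)
    return reusable, rerecognize


if __name__ == "__main__":
    lines = ["Invoice No. 42", "Total: 10 €"]
    reusable, rerecognize = split_reusable_lines(lines, ENGLISH_ALPHABET)
    print("reused from text layer:", reusable)
    print("would be re-recognized:", rerecognize)
```

In the Engine itself the behaviour is selected through the SourceContentReuseMode property; the sketch is only meant to show why some lines may still go through recognition even when the text layer is preferred.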

Best regards, Natalia.


answered 20 Aug '13, 17:14

SDK_support ♦♦

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal