Convert PDF to Text using ABBYY without OCR

  • Last Post 13 September 2018
Viji posted this 13 September 2018

Hi,

I am currently using the processanddownload method to convert a full-text searchable PDF to text using AbbyyOCR.

Kindly let me know if there is a way in ABBYY to convert a full-text searchable PDF to text without running OCR.

Many thanks in advance!

Regards,

Vijayalakshmi

Koen de Leijer posted this 13 September 2018

Hi

In one of your other posts I saw that you are using Python.
If so, you do not necessarily need ABBYY to convert a searchable PDF to text.

Take a look at the pyPdf library (open source, and free for commercial use as well):
https://pypi.org/project/pyPdf/
https://stackoverflow.com/questions/34837707/extracting-text-from-a-pdf-file-using-python
http://www.blog.pythonlibrary.org/tag/pypdf/
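
To give an idea of what this looks like, here is a minimal sketch. It assumes the maintained successor package pypdf (installed with pip install pypdf) rather than the original pyPdf linked above. The function names pdf_to_text_pypdf and join_page_texts are made up for this example.

```python
def join_page_texts(page_texts):
    """Join per-page strings; a page with no extractable text becomes ''."""
    return "\n\n".join((t or "").strip() for t in page_texts)


def pdf_to_text_pypdf(pdf_path):
    """Read the embedded text layer page by page. No OCR is performed."""
    # Third-party import kept local so the helper above works without it.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return join_page_texts(page.extract_text() for page in reader.pages)
```

Calling pdf_to_text_pypdf("searchable.pdf") (a placeholder file name) should return the text layer as one string. For a scanned PDF without a text layer the result will be mostly empty, which is the case where OCR is still needed.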

Or at PDFMiner (also open source):
https://github.com/euske/pdfminer
https://github.com/pdfminer/pdfminer.six
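
A minimal sketch with pdfminer.six (pip install pdfminer.six), using its high-level extract_text function. The names pdf_to_text_pdfminer and split_pages are made up for this example, and splitting on form feeds assumes that the extracted text carries a form feed character at the end of each page, so check that against your own files.

```python
def split_pages(text):
    """Split extracted text into pages on form feeds, dropping empty pages."""
    return [page.strip() for page in text.split("\f") if page.strip()]


def pdf_to_text_pdfminer(pdf_path, page_numbers=None):
    """Return the embedded text of a searchable PDF. No OCR is performed.

    page_numbers is an optional list of zero-based page indexes.
    """
    # Third-party import kept local so split_pages works without it.
    from pdfminer.high_level import extract_text

    return extract_text(pdf_path, page_numbers=page_numbers)
```

For example, split_pages(pdf_to_text_pdfminer("searchable.pdf")) would give one string per non-empty page, where "searchable.pdf" is a placeholder file name.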

I also use Python with both ABBYY and PDFMiner:
ABBYY FineReader for PDFs that need OCR, and PDFMiner for PDFs that do not.
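
The split between the two tools can be sketched roughly as below. This is only an illustration, not my actual script: extract_text_layer and run_ocr are placeholder callables you would supply yourself (for instance a PDFMiner wrapper and your existing ABBYY call), and the threshold of 20 visible characters is an arbitrary assumption.

```python
def has_text_layer(extracted_text, min_chars=20):
    """Treat a PDF as searchable if enough non-whitespace text came out."""
    visible = "".join(extracted_text.split())
    return len(visible) >= min_chars


def convert_pdf(pdf_path, extract_text_layer, run_ocr, min_chars=20):
    """Try the embedded text first and fall back to OCR only when needed.

    extract_text_layer and run_ocr are hypothetical callables that take a
    path and return a string. Returns (text, method_used).
    """
    text = extract_text_layer(pdf_path)
    if has_text_layer(text, min_chars):
        return text, "text-layer"
    return run_ocr(pdf_path), "ocr"
```

This way the OCR service is only called for documents that really need it.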

Best regards
Koen de Leijer
