Recognize password-protected PDF

viji posted this 07 February 2018

Hi,

I have a password-protected PDF file.

Is there any way to recognize the PDF and convert it to a Word document using the ABBYY OCR SDK?

Please help.

Thanks!

Koen de Leijer posted this 07 February 2018

Hi

I had the same question and found no solution in FineReader other than supplying a password file that contains all known passwords.
See help: AddImageFileWithPassword Method of the FRDocument Object
For why ABBYY does not bypass the protection itself, see: https://forum.ocrsdk.com/thread/5149-ocrsdk-has-access-restrictions-and-cannot-be-added-to-the-document/?order=all#comment-9b723d1e-13b3-4ebf-b74c-a74500c14a36

If you are programming in Java and really want to remove the protection, you could run Apache PDFBox on the file before passing it to ABBYY's FineReader.
See: https://pdfbox.apache.org/docs/2.0.2/javadocs/org/apache/pdfbox/pdmodel/PDDocument.html

Here is one of my samples. It creates a new PDF as a copy of the original, but without protection:

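        // Use Apache PDFBox (2.x API) to remove protection from a PDF.
        // Needs: java.io.File, java.io.IOException,
        //        org.apache.pdfbox.pdmodel.PDDocument
        File in = new File(inputPath);
        // try-with-resources closes the document even if save() fails
        try (PDDocument doc = PDDocument.load(in)) {
            // load(File) opens with an empty password; if the file needs a
            // password to open, use PDDocument.load(in, password) instead
            if (doc.isEncrypted()) {
                doc.setAllSecurityToBeRemoved(true);
            }
            doc.save(outputPath);
        } catch (IOException e) {
            e.printStackTrace();
        }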

Keep in mind that a legal document might no longer be valid once its protection has been removed.
So you should always keep a backup of the original file.
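
The backup step could look roughly like the sketch below. It uses only the Java standard library (Java 7 or later), and `PdfBackup` and `backupOriginal` are hypothetical names made up for illustration; they are not part of PDFBox or the ABBYY SDK. The `.bak` suffix is likewise just an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of PDFBox or the ABBYY SDK.
final class PdfBackup {

    private PdfBackup() {
    }

    /**
     * Copies the original file to "<name>.bak" in the same directory and
     * returns the path of the backup. Because REPLACE_EXISTING is not passed,
     * the copy fails with FileAlreadyExistsException if a backup is already
     * there, so an earlier backup is never silently overwritten.
     */
    static Path backupOriginal(Path original) throws IOException {
        Path backup = original.resolveSibling(original.getFileName() + ".bak");
        // COPY_ATTRIBUTES asks the file system to carry over attributes such
        // as the last-modified time where it supports that.
        return Files.copy(original, backup, StandardCopyOption.COPY_ATTRIBUTES);
    }
}
```

Calling `PdfBackup.backupOriginal(Paths.get(inputPath))` before the PDFBox step would leave the protected original untouched next to the unprotected copy.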

Best regards
Koen de Leijer
