Cloud OCR SDK (Python) returns binary to .txt file instead of text

  • Last Post 18 June 2019
lschmidt posted this 16 April 2019

I tried using the Python OCR SDK with a PNG image file, like this:

$ python3 process.py capture.png result.txt

That worked: result.txt contained normal, readable text.

 

However, when I use a static PDF file as input, the OCR SDK writes unreadable binary data to result.txt. I used this command:

$ python3 process.py -pdf test.pdf result.txt

I can't find any help in the documentation, so I'm guessing I'm using the tool incorrectly. I'm attaching the result.txt produced by the PDF command.


David Klotz posted this 18 June 2019

I don't know much about the Python specifics, but the -pdf option you used seems to select the output format, not the input format. At least, the "txt" file you attached is a valid PDF.
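
If you want to confirm this on your side, here is a minimal sketch that checks whether a file is really a PDF regardless of its extension. It relies only on the fact that PDF files begin with the bytes %PDF-; the function name looks_like_pdf is made up for this example and is not part of the OCR SDK.

```python
def looks_like_pdf(path):
    """Return True if the file at `path` starts with the PDF signature.

    Hypothetical helper for illustration only, not part of the OCR SDK.
    PDF files begin with the five bytes b"%PDF-", followed by a version.
    """
    with open(path, "rb") as f:
        # Read just the first five bytes; an empty or short file
        # returns fewer bytes and therefore compares unequal.
        return f.read(5) == b"%PDF-"
```

Calling looks_like_pdf("result.txt") on the attached file should return True if it is a PDF saved under a .txt name. In that case, running the script without -pdf may give plain text, assuming text is the script's default output format, which I have not verified.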
