Cloud OCR SDK (Python) returns binary to .txt file instead of text

  • Last Post 18 June 2019
lschmidt posted this 16 April 2019

I tried using the Python OCR SDK with a PNG image file like below:


$ python3 capture.png result.txt


and it worked: result.txt contained normal, readable text.


However, when I use a static PDF file as input, the OCR SDK writes unreadable binary data into result.txt. I used this command:


$ python3 -pdf test.pdf result.txt


I can't find any help in the documentation, so I'm guessing I'm using the tool wrong here. I will attach the result.txt from the PDF command.


David Klotz posted this 18 June 2019

I don't know much about the Python specifics, but the -pdf option you used seems to select the output format, not the input format. At least, the "txt" file you attached is a valid PDF.
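
If you want to confirm this on your side, a quick check is to look at the first bytes of the output file: PDF files begin with the signature %PDF-. Below is a minimal sketch of that check using only the Python standard library. The helper name looks_like_pdf is made up for illustration and is not part of the OCR SDK.

```python
def looks_like_pdf(path):
    """Return True if the file starts with the PDF signature b'%PDF-'.

    Hypothetical helper for illustration; not part of the OCR SDK.
    """
    with open(path, "rb") as f:
        # A PDF file starts with the five bytes "%PDF-", followed by a version number.
        return f.read(5) == b"%PDF-"
```

Calling looks_like_pdf("result.txt") on the file produced by the second command should return True if the tool really wrote a PDF. In that case, renaming the output to result.pdf and opening it in a PDF viewer should show the recognized document.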