OCR SDK output format different from OCR online

  • Last Post 21 April 2017
Pradeep Raje posted this 21 April 2017


I need to batch-process a set of newspaper articles (as JPEGs) in Python.

I decided to try the ocrsdk.com site because the text output from finereaderonline.com was perfect: the newspaper's column-wise text came out as running text, not as columnar data.

With the SDK, however, the text comes out in column format. Why the difference, and what is the solution?

Attached are test6 from the finereaderonline site and temp6 from the OCR SDK output.



Oksana Serdyuk posted this 21 April 2017

I can't find any attachments, but please try using the processImage method with the textExtraction profile.

If it does not help, could you be so kind as to share your source image and the result from FineReader Online? You can send this info to CloudOCRSDK@abbyy.com.
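
For anyone scripting this from Python, here is a minimal sketch of how the suggested processImage call with the textExtraction profile might be made. Several things are assumptions, not taken from this thread: the host name in BASE_URL, the use of HTTP Basic authentication with an application ID and password, sending the image as the raw request body, the plain-text export format, and the shape of the XML task response. The helper names (build_process_image_url, parse_task, submit_image) are hypothetical. Polling for completion and downloading the result are left out.

```python
import base64
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed API host; check your account's documentation for the right one.
BASE_URL = "https://cloud.ocrsdk.com"


def build_process_image_url(language="English", export_format="txt",
                            profile="textExtraction"):
    """Build a processImage URL that requests the textExtraction profile."""
    query = urllib.parse.urlencode({
        "language": language,
        "profile": profile,
        "exportFormat": export_format,
    })
    return f"{BASE_URL}/processImage?{query}"


def parse_task(xml_data):
    """Pull id, status and resultUrl out of a task response.

    Assumes the response contains a <task .../> element carrying
    these values as attributes.
    """
    root = ET.fromstring(xml_data)
    for element in root.iter():
        # endswith() tolerates a namespace prefix on the tag, if any.
        if element.tag.endswith("task"):
            return {
                "id": element.get("id"),
                "status": element.get("status"),
                "resultUrl": element.get("resultUrl"),
            }
    raise ValueError("no task element found in response")


def submit_image(path, app_id, password):
    """Upload one image and return the parsed task description.

    Assumes Basic authentication and a raw binary request body.
    """
    with open(path, "rb") as image_file:
        body = image_file.read()
    request = urllib.request.Request(
        build_process_image_url(), data=body, method="POST")
    token = base64.b64encode(f"{app_id}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/octet-stream")
    with urllib.request.urlopen(request) as response:
        return parse_task(response.read())
```

For a batch of JPEGs, submit_image could be called once per file, for example over pathlib.Path("articles").glob("*.jpg"), where the folder name is just a placeholder.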