XML:Writingformat returns an error with the OCR SDK engine in Python

  • Last Post 10 May 2018
Badr Mansour posted this 04 May 2018

Hey, 

I am using Python to convert a PDF to an XML file. The XML I get has no font attributes such as font size, font type, etc. I tried adding the XML:Writingformat option, but the request returns a 450 error (invalid parameter). How can I add this option to my code without getting an error?

 

from ABBYY import CloudOCR

ocr_engine = CloudOCR(application_id='', password='')
input_file = open('no5.pdf', 'rb')
post_file = {input_file.name: input_file}
result = ocr_engine.process_and_download(post_file, exportFormat='xml,pdfTextAndImages', language='English')

# Write each returned export format to its own output file
for format, content in result.items():
    output_filename = '{name}.{extension}'.format(name='.'.join(input_file.name.split('.')[:-1]), extension=format)
    # The with block closes the file, so no explicit close() is needed
    with open(output_filename, 'wb') as output_file:
        output_file.write(content.read())

Oksana Serdyuk posted this 10 May 2018

According to the description of the processImage method, the name of this parameter is xml:writeFormatting and not XML:Writingformat. Please try to use the exact name of the parameter.
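
As a rough illustration of the point above: the name contains a colon, so it cannot be written as an ordinary Python keyword argument and has to be supplied through a dictionary. The sketch below only builds and URL-encodes the parameters. The helper name build_process_image_params is made up for this example, and the commented-out call assumes that process_and_download forwards extra keyword arguments to processImage, which should be checked against the wrapper you are using.

```python
from urllib.parse import urlencode


def build_process_image_params(export_format, language, write_formatting=True):
    """Hypothetical helper: collect processImage parameters in a dict."""
    return {
        'exportFormat': export_format,
        'language': language,
        # Exact, case-sensitive name from the processImage description
        'xml:writeFormatting': 'true' if write_formatting else 'false',
    }


params = build_process_image_params('xml,pdfTextAndImages', 'English')

# What the parameters look like once URL-encoded into a query string
query = urlencode(params)
print(query)

# Assumption: the wrapper passes unknown keyword arguments through to the
# server. If so, the dict can be unpacked into the call:
# result = ocr_engine.process_and_download(post_file, **params)
```

Unpacking with ** is what lets a key such as 'xml:writeFormatting' reach the call, since xml:writeFormatting='true' is not valid Python syntax.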

To debug your code, we recommend Fiddler. It lets you see which requests are sent to the server and what responses come back. It can be downloaded for free from www.telerik.com/fiddler
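
If installing a separate tool is not convenient, a similar view of the traffic can often be obtained from Python itself. This is an alternative to Fiddler, not a replacement for it, and it only helps if the client library sends its requests through the standard http.client module (as requests and urllib do). The function name enable_http_debug is made up for this sketch.

```python
import http.client
import logging


def enable_http_debug():
    """Print outgoing request lines and headers, plus response headers."""
    # http.client prints raw request/response details when debuglevel > 0
    http.client.HTTPConnection.debuglevel = 1
    # urllib3 (used by requests) reports connection activity through logging
    logging.basicConfig(level=logging.DEBUG)
    logging.getLogger('urllib3').setLevel(logging.DEBUG)


# Call this once, before creating the OCR client and sending any request
enable_http_debug()
```

The printed request shows the exact parameter names sent to the server, which makes a misspelled name easy to spot.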

If you still cannot resolve the issue, please send these logs to CloudOCRSDK@abbyy.com so that we can investigate the issue in detail.
