Hi Team,

I am trying to convert a PDF file to XML.

I do get XML output, but it is not well formatted.

Can you please help me?

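Code here:

from ABBYY import CloudOCR  # Assumption: CloudOCR is imported from the ABBYY package

ocr_engine = CloudOCR(application_id='XX', password='XX')

# Assumption: the wrapper expects a {name: file object} dict as its first argument
with open('file.pdf', 'rb') as pdf:
    file = {pdf.name: pdf}
    result = ocr_engine.process_and_download(file, exportFormat='xml', language='English')

# Only 'xml' is requested, so this loop writes a single file
for format, content in result.items():
    with open('final_xml_file13.xml', 'wb') as output_file:
        output_file.write(content.read())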

A short excerpt of the output XML is attached.
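
If "not well formatted" means the XML arrives without line breaks or indentation (an assumption, since the attachment is not shown here), a minimal sketch for re-indenting it with Python's standard library might look like the following. The helper name prettify_xml and the sample bytes are hypothetical and are not part of the OCR wrapper.

```python
import xml.dom.minidom


def prettify_xml(raw: bytes) -> str:
    """Return an indented copy of a well-formed XML document."""
    document = xml.dom.minidom.parseString(raw)
    pretty = document.toprettyxml(indent="  ")
    # toprettyxml may emit whitespace-only lines when the input already
    # contains line breaks, so drop them to keep the output compact.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


if __name__ == "__main__":
    # Hypothetical stand-in for the bytes returned by content.read()
    sample = b"<document><page><text>Hello</text></page></document>"
    print(prettify_xml(sample))
```

In the download loop, the bytes from content.read() could be passed through such a helper and the result written to the output file opened in text mode. This only changes whitespace; if the XML is actually malformed (for example, unclosed tags), parseString raises an error instead.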