PDF to Excel conversion

  • Last Post 3 days ago
  • Topic Is Solved
SamAshwinPaul posted this 3 weeks ago


I am trying out your Cloud OCR SDK (Java). I downloaded the test code from GitHub and am trying to convert a PDF to an Excel file. The input is a financial statement (PDF). I tried converting it to CSV, but the result was not satisfactory, so I tried creating an XML file, but that did not work either. Is this a limitation of the software, or am I doing something wrong? If so, please point me in the right direction.

I have uploaded the input file.


Nikolay Krivchanskiy posted this 2 weeks ago


Please note that .csv output is supported only for the processBusinessCard method. Please see the list of supported output formats on our official website.

We managed to obtain a great result in xlsx format using the processImage method with the following parameters: language=English&exportFormat=xlsx&profile=documentConversion&imageSource=scanner

You can also get an XML output file by substituting xml for xlsx in the line above.
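
For illustration, here is a minimal Java sketch of how that parameter string could be assembled, with the export format passed in so that xlsx can be swapped for xml. The class and method names (ProcessImageQuery, conversionParams, buildQuery) are hypothetical and are not part of the SDK sample code; the sketch only builds the query string and leaves out the service URL, authentication, and the file upload itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: assembles the processImage query string quoted above.
class ProcessImageQuery {

    // Parameters from the reply above; only exportFormat varies ("xlsx" or "xml").
    static Map<String, String> conversionParams(String exportFormat) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("language", "English");
        params.put("exportFormat", exportFormat);
        params.put("profile", "documentConversion");
        params.put("imageSource", "scanner");
        return params;
    }

    // Joins the parameters as key=value pairs separated by '&', URL-encoding each part.
    static String buildQuery(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            joiner.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        return joiner.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(conversionParams("xlsx")));
        System.out.println(buildQuery(conversionParams("xml")));
    }
}
```

The resulting string would be appended after a "?" to the processImage request URL used by the sample code you downloaded.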

Please contact us again if you have any further issues.

SamAshwinPaul posted this 3 days ago

Thank you. The output is nearly perfect. Some of the data is misread when there is background noise in the scan. Is there any way to improve the quality of the conversion? I have attached the input PDF file and the output Excel file.