Invoice processing request: OCR to XML format

  • Last Post 18 October 2018
AjijulMondal posted this 08 April 2015

We use OCR for business card processing, and it works fine. We are now working on invoice processing and need the processed data in XML format. Do you have any facility to process an invoice using OCR and return the response in XML format?

Oksana Serdyuk posted this 08 April 2015

To process invoices in Cloud OCR SDK you can try one of the following ways:

Field-level recognition

This mode should be used if you can specify the exact coordinates of the text fields you want to recognize. The processFields method allows you to recognize several fields in a document. The result of processing is returned as an XML file whose structure is described by an XSD schema.

Full-page recognition

This mode should be used if you don’t know the coordinates of the text fields you need. In this case you can recognize all the text on a page and extract the information you need on your side. To get the output in XML format, set the exportFormat=xml parameter of the processImage method.
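
As a rough illustration of the full-page option, the request URL could be assembled like this. It is a minimal sketch: only processImage and exportFormat=xml come from the answer above; the host name, the language parameter, and the helper name build_process_image_url are assumptions made for the example.

```python
from urllib.parse import urlencode

# Assumption: illustrative host; use the service URL given for your own account.
BASE_URL = "https://cloud.ocrsdk.com"


def build_process_image_url(language="English", export_format="xml"):
    """Return a processImage URL that asks for XML output."""
    # exportFormat=xml is the setting named in the answer above;
    # the language parameter is an assumption for this sketch.
    query = urlencode({"language": language, "exportFormat": export_format})
    return f"{BASE_URL}/processImage?{query}"


print(build_process_image_url())
```

The image itself would be sent in the body of an authenticated request to this URL; those steps are left out of the sketch.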

rmakam posted this 05 October 2018

Hi,

I have a similar request. I am trying to use ocrsdk from Node.js to extract invoice information. I cannot specify the exact coordinates of the text fields I want to extract, since the invoices can come in different formats.

Requirement #1:

Below is what I am expecting in the output XML:

<invoice_data>
  <invoice_number>12345</invoice_number>
  <invoice_date>10-JAN-2018</invoice_date>
  <supplier>ABCD Corp</supplier>
  <invoice_amount>100</invoice_amount>
</invoice_data>

Please confirm how we can get output in this form.

Requirement #2:

I would also like support for some fuzzy matching. For example, if an invoice has "Success factors Corporation" as the supplier name and my application has that supplier registered as "Success-factors Corp.", it should return the closest match instead of requiring an exact match.

 

Thanks

Rathnam
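
The closest-match behaviour described in Requirement #2 can be approximated on the application side, after recognition. The following is a minimal sketch using Python's standard difflib module; it is not a Cloud OCR SDK feature, and the helper names normalize and closest_supplier as well as the 0.6 cutoff are assumptions chosen for illustration.

```python
import difflib
import re


def normalize(name):
    """Lowercase a name and replace punctuation runs with single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def closest_supplier(ocr_name, registered_names, cutoff=0.6):
    """Return the registered name most similar to ocr_name, or None."""
    # Compare normalized forms, but hand back the name as it was registered.
    by_normalized = {normalize(n): n for n in registered_names}
    matches = difflib.get_close_matches(
        normalize(ocr_name), list(by_normalized), n=1, cutoff=cutoff
    )
    return by_normalized[matches[0]] if matches else None


# Supplier names taken from the examples in this post.
registered = ["Success-factors Corp.", "ABCD Corp"]
print(closest_supplier("Success factors Corporation", registered))
```

Raising the cutoff makes matching stricter; a name with no sufficiently similar registered entry yields None, so the application can fall back to manual review.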

Helen Osetrova posted this 18 October 2018

Hi Rathnam,

Field recognition in Cloud OCR SDK requires you to specify the exact coordinates of each field. If the source documents come in various formats, you can create several XML templates that specify the processing parameters for each type of invoice separately.

Please review the XML Parameters of Field Recognition article for more details and sample XML definitions. 
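
Whichever template is used, the recognized values still have to be mapped to the <invoice_data> layout requested above on the client side. Here is a minimal sketch of that last step, assuming the field values have already been read from the recognition result into a plain dictionary; the function name build_invoice_xml and the dictionary keys are hypothetical and simply mirror the element names from the question.

```python
import xml.etree.ElementTree as ET

# Element names copied from the XML requested in the question.
INVOICE_TAGS = ("invoice_number", "invoice_date", "supplier", "invoice_amount")


def build_invoice_xml(fields):
    """Serialize recognized field values into the <invoice_data> layout."""
    root = ET.Element("invoice_data")
    for tag in INVOICE_TAGS:
        # A field that was not recognized is written as an empty element.
        ET.SubElement(root, tag).text = fields.get(tag, "")
    return ET.tostring(root, encoding="unicode")


# Sample values from the question; real values would come from the OCR result.
recognized = {
    "invoice_number": "12345",
    "invoice_date": "10-JAN-2018",
    "supplier": "ABCD Corp",
    "invoice_amount": "100",
}
print(build_invoice_xml(recognized))
```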

 
