ABBYY Cloud OCR Hand Printed Detection XML Format

  • Last Post 23 March 2016
bdinka posted this 22 March 2016

I’m currently evaluating the ABBYY Cloud OCR SDK and am having difficulty getting, in the XML output, a “Signature Block” value that indicates whether the block was signed. Is there a parameter I can pass to the ABBYY Cloud OCR SDK that would help determine whether the Signature Block is signed? I downloaded the ABBYY Cloud OCR Java examples from the following site: after registering for a trial license.

I ran the Java code samples for the “recognize” method call, and the results I’m getting are inconsistent depending on the output format I specify (XML, PDF, DOCX, or TXT). The PDF and DOCX outputs contain the hand-printed text. Unfortunately, the XML and TXT outputs do not.

I was hoping someone from the ABBYY OCR software developers group could guide me on which method call I should make to get the hand-printed text in XML format. I look forward to hearing back from you at your earliest convenience.

Oksana Serdyuk posted this 22 March 2016

Based on the files you provided, I have answered you in detail by e-mail.

In general, there is no feature in Cloud OCR SDK that can determine whether a field is signed. You can try to implement this check on your side. For example, you can try to recognize any text in the signature field: if any characters are found there (that is, the field is not empty), the field is considered to be signed.
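
The check suggested above could be sketched roughly as follows in Java. This is only an illustration: `SignatureHeuristic`, `isLikelySigned`, and the `minChars` threshold are hypothetical names that are not part of Cloud OCR SDK, and the sketch assumes you have already extracted the recognized text of the signature field from the SDK result as a plain string.

```java
// Hypothetical helper, not part of ABBYY Cloud OCR SDK.
// Assumption: recognizedText is the text the SDK returned for the
// signature field, already extracted from the result by the caller.
final class SignatureHeuristic {

    private SignatureHeuristic() {
    }

    // Returns true when the recognized text contains at least minChars
    // letters or digits. Whitespace and punctuation (for example the
    // printed signature line) are not counted.
    static boolean isLikelySigned(String recognizedText, int minChars) {
        if (recognizedText == null) {
            return false;
        }
        int count = 0;
        for (int i = 0; i < recognizedText.length(); ) {
            int codePoint = recognizedText.codePointAt(i);
            if (Character.isLetterOrDigit(codePoint)) {
                count++;
            }
            i += Character.charCount(codePoint);
        }
        return count >= minChars;
    }
}
```

A call such as `SignatureHeuristic.isLikelySigned(fieldText, 2)` treats the field as signed once two or more letters or digits are recognized. The threshold is a guess and would need tuning on real scans, since stray marks in an empty field can occasionally be recognized as characters.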

bdinka posted this 22 March 2016

Oksana, thanks for your reply! I've attached two image files to this message. The first file (20160223141620196-signed-dated-1.pdf-150-1_2.jpg) is the input image with the HANDPRINTED text, while the second file (HandPrintedTextOnly_1.png) is a screen capture of the output returned by ABBYY Cloud OCR SDK. As you can see (in the highlighted RED BOX with the ARROW), the Cloud OCR SDK did not catch the HANDPRINTED text block of the signature field. Instead, it picked up the typed text at the top of the page and recognized that as HANDPRINTED text. The command-line arguments I passed to the Java sample for the "processTextField" RESTful API are the following:

"java textField --options=textType=handprinted "C:\Users\bdinka\Desktop\NSC Documents\Automated Consent\ABBYY Cloud OCR SDK User\20160223141620196-signed-dated-1.pdf-150-1.jpg" "C:\Users\bdinka\Desktop\NSC Documents\Automated Consent\ABBYY Cloud OCR SDK User\result.xml"

As you can see, I pass the correct parameters to the Cloud OCR SDK to get the HANDPRINTED text. I was hoping you could guide me on what I have to do to make the Cloud OCR SDK find the actual HANDPRINTED text. Thank you for all the help on this.

Oksana Serdyuk posted this 23 March 2016

The reason you cannot see all of the text in the output is that a single text field value can contain at most 200 characters. This is an internal restriction of ours and cannot be changed.