Adding a checkbox symbol to the alphabet

  • Last Post 15 January 2016
michaeln posted this 13 January 2016

I'd like to add a symbol that looks like a checkbox to the alphabet. It seems ridiculous that rather than just teaching the reader to recognize a new symbol, one has to define specific page regions to specifically look for certain types of checkboxes. My narratives come in all sizes (though the checkboxes always look the same), and I simply need to know if a box is checked or not, so that I can use regular expressions to extract the text next to the checkbox. Can someone please tell me if it is possible to simply add a symbol to the alphabet?

Another possibility is to use another software platform just to search and locate checkboxes, and then slice the image up so that only text next to checked boxes is sent to the OCR reader for processing. This seems ridiculous though. There HAS to be more intelligent software out there...
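As a rough illustration of the "locate the boxes first, then decide which ones are checked" idea above, here is a minimal sketch in plain Python. It assumes the page is already available as a 2D list of grayscale values (0 = black, 255 = white) and that the box coordinates have been found by some other step. The names `dark_ratio`, `is_checked`, and `make_demo_box`, and the threshold values, are hypothetical choices for this example and do not come from any ABBYY product.

```python
from typing import List, Tuple

Image = List[List[int]]          # rows of grayscale pixels, 0 = black, 255 = white
Box = Tuple[int, int, int, int]  # left, top, right, bottom (right/bottom exclusive)


def dark_ratio(image: Image, box: Box, border: int = 2, dark_below: int = 128) -> float:
    """Fraction of dark pixels inside the box, ignoring a margin for the frame."""
    left, top, right, bottom = box
    # Shrink the box so the printed frame itself is not counted as ink.
    left, top, right, bottom = left + border, top + border, right - border, bottom - border
    if right <= left or bottom <= top:
        raise ValueError("box is too small for the given border")
    total = (right - left) * (bottom - top)
    dark = sum(
        1
        for y in range(top, bottom)
        for x in range(left, right)
        if image[y][x] < dark_below
    )
    return dark / total


def is_checked(image: Image, box: Box, min_fill: float = 0.1) -> bool:
    """Treat the box as checked when enough of its interior is dark."""
    return dark_ratio(image, box) >= min_fill


def make_demo_box(checked: bool, size: int = 12) -> Image:
    """Build a synthetic box with a 1-pixel frame and, optionally, an X mark."""
    img = [[255] * size for _ in range(size)]
    for i in range(size):
        # Draw the frame.
        img[0][i] = img[size - 1][i] = img[i][0] = img[i][size - 1] = 0
        if checked:
            # Draw both diagonals of the X.
            img[i][i] = img[i][size - 1 - i] = 0
    return img


if __name__ == "__main__":
    box = (0, 0, 12, 12)
    print(is_checked(make_demo_box(True), box))
    print(is_checked(make_demo_box(False), box))
```

A fill-ratio test like this only answers "checked or not" for a box whose position is already known. Finding the boxes on pages of varying size is a separate problem that this sketch does not address.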

Oksana Serdyuk posted this 14 January 2016

There is no need to add a checkbox symbol to the alphabet. ABBYY technologies can recognize different types of checkmarks, including checkmarks with “corrections” made by hand:

  1. ABBYY FlexiCapture (for offline recognition) is a ready-to-use product for data capture and document processing.
  2. ABBYY FlexiCapture Engine (also for offline recognition) is our data capture SDK, which lets you develop solutions for extracting data from forms and documents.
  3. ABBYY Cloud OCR SDK (for online recognition) is our online recognition service, which provides a Web API for OCR. Its processCheckmarkField method extracts the value (checked or not) of a checkmark on an image. If you are interested in Cloud OCR SDK, start by reading How to Work with Cloud OCR SDK.

michaeln posted this 15 January 2016

Thank you. From what I saw though, you have to tell it specifically where the checkboxes are located on the page in order for it to recognize them. Is that true?

Oksana Serdyuk posted this 15 January 2016

If you use Cloud OCR SDK, then yes, you need to specify the region of the checkmark field on the image.

For Data Capture products the situation is different. First, you create Document Definitions based on several template images. You can then use these Document Definitions to process any similar forms.
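For the Cloud OCR SDK case mentioned above, specifying the region might look roughly like the sketch below, which only builds a request URL and sends nothing. The host name, the `region` parameter name, and its left,top,right,bottom pixel order are assumptions here and should be checked against the Cloud OCR SDK documentation. `checkmark_request_url` is a hypothetical helper, not part of any ABBYY library.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the Cloud OCR SDK documentation.
ASSUMED_ENDPOINT = "https://cloud.ocrsdk.com/processCheckmarkField"


def checkmark_request_url(left: int, top: int, right: int, bottom: int,
                          endpoint: str = ASSUMED_ENDPOINT) -> str:
    """Build a URL for one checkmark field, given its pixel rectangle."""
    if not (0 <= left < right and 0 <= top < bottom):
        raise ValueError("region must satisfy 0 <= left < right and 0 <= top < bottom")
    # Assumed format: a comma-separated rectangle in a single 'region' parameter.
    query = urlencode({"region": f"{left},{top},{right},{bottom}"})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    print(checkmark_request_url(40, 120, 64, 144))
```

A real call would also need authentication and the image itself in the request body. Both are left out of this sketch.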