I am going to capture data from invoices, bills, questionnaires, application forms, and some other documents. Should I perform OCR on each document and then search the recognized text for field labels?

asked 23 Jan '12, 17:30

Singlecon


There is no need to recognize the whole document and then search for the data in it. Instead, you can recognize only certain text fields of a document and capture the data from these fields directly into an information system or database. Please refer to the "How to Recognize Text Fields" article.


answered 23 Jan '12, 17:32


Nikolay_Kh ♦♦

Is there an example of how to implement this? I think example code and tutorials are lacking in your documentation.

(29 Apr '13, 13:50) James

To test field-level recognition, you can use the ConsoleTest application from the .NET sample code.

To recognize a single text field, call:

ConsoleTest.exe --asTextField [common options] <source_dir|file> <target_dir>

It performs recognition via the processTextField call.

Common options description:

--lang=<languages>: Recognize with the specified language or a comma-separated list of languages. Examples: --lang=English or --lang=English,German,French

--out=<output format>: Create output in specified format: txt, rtf, docx, xlsx, pptx, pdfSearchable, pdfTextAndImages, xml

--options=<string>: Pass additional arguments.

For example:

ConsoleTest.exe --asTextField --lang=English --options=region=0,0,200,50 D:\1.jpg D:\result
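If a document has several fields at known positions, the single-field command above can be generated once per field. Below is a minimal Python sketch of that idea. It uses only the flags shown in this answer; the helper names, the field names and their coordinates are hypothetical, and writing each field's result to its own subfolder is an assumption rather than something ConsoleTest requires.

```python
from pathlib import PureWindowsPath


def build_text_field_command(image, target_dir, region,
                             languages=("English",), exe="ConsoleTest.exe"):
    """Return the argument list for one --asTextField run (hypothetical helper)."""
    if len(region) != 4:
        raise ValueError("region must contain exactly four coordinates")
    # Four comma-separated integers, as in --options=region=0,0,200,50
    region_arg = ",".join(str(int(value)) for value in region)
    return [
        exe,
        "--asTextField",
        "--lang=" + ",".join(languages),
        "--options=region=" + region_arg,
        str(image),
        str(target_dir),
    ]


# Hypothetical layout: field name -> region coordinates on the scanned image.
FIELDS = {
    "invoice_number": (0, 0, 200, 50),
    "total": (300, 700, 500, 750),
}


def build_commands(image, target_root, fields):
    """Build one command per field; each field gets its own target subfolder (assumption)."""
    return {
        name: build_text_field_command(
            image, PureWindowsPath(target_root) / name, region
        )
        for name, region in fields.items()
    }


if __name__ == "__main__":
    for name, command in build_commands(r"D:\1.jpg", r"D:\result", FIELDS).items():
        print(name, " ".join(command))
```

Each argument list can be passed to subprocess.run(command, check=True) on a machine where ConsoleTest.exe is built and configured.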

To recognize several text fields in one request, call:

ConsoleTest.exe --asFields <source_file> <settings.xml> <target_dir>

It performs recognition via the processFields call. Processing settings should be specified in an XML file. A sample XML file can be found on GitHub.


answered 06 May '13, 14:39

Anastasia Galimova ♦♦

Can I have the same code for Android?

(23 Jun '14, 09:51) abhinandnandu
(26 Jun '14, 06:50) Maksim Rodkin



Asked: 23 Jan '12, 17:30

Seen: 2,933 times

Last updated: 26 Jun '14, 17:34

© 2016 ABBYY. All rights reserved. www.ABBYY.com