How to recognize specific text fields?

  • Last Post 26 June 2014
Singlecon posted this 23 January 2012

I am going to capture data from invoices, bills, questionnaires, application forms, and some other documents. Should I perform OCR on the whole document and then search the result for field labels?

Nikolay_Kh posted this 23 January 2012

There is no need to recognize the whole document and search for the data in it. Instead you can recognize only certain text fields of a document and directly capture data from these fields into an information system or database. Please refer to the "How to Recognize Text Fields" article.

James posted this 29 April 2013

Is there an example of how to implement this? I think example code and tutorials are lacking in your documentation.

Anastasia Galimova posted this 06 May 2013

To test field-level recognition, you can use the ConsoleTest application from the .NET sample code.

To recognize a single text field, call:

ConsoleTest.exe --asTextField [common options] <source_dir|file> <target_dir>

It performs recognition via the processTextField call.

Common options description:

--lang=<languages>: Recognize with the specified language or languages. Examples: --lang=English or --lang=English,German,French

--out=<output format>: Create output in the specified format: txt, rtf, docx, xlsx, pptx, pdfSearchable, pdfTextAndImages, xml

--options=<string>: Pass additional arguments.

For example:

ConsoleTest.exe --asTextField --lang=English --options=region=0,0,200,50 D:\1.jpg D:\result
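If you need the same single-field call for several fields of one form, the command line above can be assembled per field by a small script. The following Python sketch only builds the argument lists; it uses nothing but the flags quoted in this post. The field names and all coordinates except 0,0,200,50 are hypothetical, and the left,top,right,bottom ordering of the region values is an assumption that should be checked against the processTextField documentation.

```python
import ntpath
from typing import Dict, List, Sequence, Tuple

# Assumed coordinate order: left, top, right, bottom (in pixels).
Region = Tuple[int, int, int, int]


def text_field_command(source: str, target_dir: str, region: Region,
                       languages: Sequence[str] = ("English",),
                       exe: str = "ConsoleTest.exe") -> List[str]:
    """Build the argument list for one --asTextField call."""
    left, top, right, bottom = region
    if right <= left or bottom <= top:
        raise ValueError("region must have positive width and height")
    return [
        exe,
        "--asTextField",
        "--lang=" + ",".join(languages),
        "--options=region={},{},{},{}".format(left, top, right, bottom),
        source,
        target_dir,
    ]


def commands_for_fields(source: str, target_root: str,
                        fields: Dict[str, Region]) -> Dict[str, List[str]]:
    """One command per named field; each field gets its own output folder."""
    return {
        name: text_field_command(source, ntpath.join(target_root, name), region)
        for name, region in fields.items()
    }


# Hypothetical layout of an invoice template: field name -> region.
INVOICE_FIELDS: Dict[str, Region] = {
    "invoice_number": (0, 0, 200, 50),
    "total": (400, 900, 600, 950),
}

if __name__ == "__main__":
    for name, cmd in commands_for_fields(r"D:\1.jpg", r"D:\result",
                                         INVOICE_FIELDS).items():
        print(name, "->", " ".join(cmd))
```

On a machine where ConsoleTest.exe is available, each list can be passed to subprocess.run(cmd, check=True). Running one request per field is simple but costs one call per region; for many fields on the same page, the --asFields mode described next is probably the better fit.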

To recognize several text fields in one request, call:

ConsoleTest.exe --asFields <source_file> <settings.xml> <target_dir>

It performs recognition via the processFields call. The processing settings must be specified in an XML file. A sample XML file can be found on GitHub.

abhinandnandu posted this 23 June 2014

Can I have the same code for Android?

Maksim Rodkin posted this 26 June 2014