I am going to capture data from invoices, bills, questionnaires, application forms, and some other documents. Should I perform OCR on the whole document and then search the result for field labels?
How to recognize specific text fields?
- Last Post 26 June 2014
There is no need to recognize the whole document and search for the data in it. Instead you can recognize only certain text fields of a document and directly capture data from these fields into an information system or database. Please refer to the "How to Recognize Text Fields" article.
Is there an example of how to implement this? I think example code and tutorials are lacking in your documentation.
To test field-level recognition, you can use the ConsoleTest application from the .NET sample code.
To recognize a single text field, call:
ConsoleTest.exe --asTextField [common options] <source_dir|file> <target_dir>
It performs recognition via the processTextField call.
Description of the common options:
--lang=<languages>: Recognize with specified language. Example:
--out=<output format>: Create output in specified format: txt, rtf, docx, xlsx, pptx, pdfSearchable, pdfTextAndImages, xml
--options=<string>: Pass additional arguments.
ConsoleTest.exe --asTextField --lang=English --options=region=0,0,200,50 D:\1.jpg D:\result
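If you have several fields on the same image, one way to drive the single-field command above is a small wrapper script. The following is only a rough sketch in Python, not part of the sample code: the field names and coordinates are made up, the region order (left, top, right, bottom in pixels) is my understanding of the region parameter and should be checked against the processTextField documentation, and writing each field to its own target folder is an assumption so that results do not overwrite each other.

```python
from pathlib import PureWindowsPath


def region_option(left, top, right, bottom):
    """Format a rectangle as the value passed to --options, e.g. 'region=0,0,200,50'.

    Assumption: coordinates are left, top, right, bottom in pixels.
    """
    return "region={},{},{},{}".format(left, top, right, bottom)


def text_field_command(image, target_dir, region, lang="English", exe="ConsoleTest.exe"):
    """Build the argument list for one --asTextField call (same shape as the example above)."""
    return [
        exe,
        "--asTextField",
        "--lang=" + lang,
        "--options=" + region_option(*region),
        image,
        target_dir,
    ]


# Hypothetical field layout for one invoice template; replace with your own coordinates.
FIELDS = {
    "invoice_number": (0, 0, 200, 50),
    "total": (0, 60, 200, 110),
}

# One command per field, each with its own target folder (an assumption, see above).
commands = {
    name: text_field_command(r"D:\1.jpg", str(PureWindowsPath(r"D:\result") / name), box)
    for name, box in FIELDS.items()
}
```

Each entry in `commands` can then be passed to `subprocess.run(cmd, check=True)` on a machine where ConsoleTest.exe is built and configured with your application ID and password.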
To recognize several text fields in one request, call:
ConsoleTest.exe --asFields <source_file> <settings.xml> <target_dir>
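Since the --asFields form above takes a single source file, processing a whole folder of scans that share one layout needs a loop. A minimal sketch in Python, assuming the same settings.xml applies to every image; the list of image extensions is my own guess and not something ConsoleTest defines.

```python
from pathlib import Path

# Assumption: which file extensions count as images to submit.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def fields_commands(source_dir, settings_xml, target_dir, exe="ConsoleTest.exe"):
    """Return one --asFields argument list per image found in source_dir."""
    commands = []
    for image in sorted(Path(source_dir).iterdir()):
        # Skip anything that does not look like an image (case-insensitive match).
        if image.suffix.lower() in IMAGE_SUFFIXES:
            commands.append(
                [exe, "--asFields", str(image), str(settings_xml), str(target_dir)]
            )
    return commands
```

Running each returned list with `subprocess.run(cmd, check=True)` submits the images one at a time and stops at the first failure.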
Anastasia, thanks a lot for the sample data at https://github.com/abbyysdk/ocrsdk.com/tree/master/SampleData!