How to restrict document processing to specific languages, or support only specific languages for processing?

Dipal Darji posted this 24 April 2015

We are using ABBYY FineReader Engine 11 in our Java application. We want OCR to support specific languages only. For example, we want only English and German documents to be processed. Can we do that? If yes, can you please give us an example?

maol posted this 29 April 2015

Hello,

You can do something like this to tell the engine that you are processing documents in English and German:

// Create the processing parameters and get the recognizer settings from them
IDocumentProcessingParams docProcessingParams = engine.CreateDocumentProcessingParams();
IRecognizerParams recognizerParams = docProcessingParams.getPageProcessingParams().getRecognizerParams();
// Build a compound text language that covers English and German
ILanguageDatabase languageDatabase = engine.CreateLanguageDatabase();
ITextLanguage textLanguage = languageDatabase.CreateCompoundTextLanguage("English,German");
recognizerParams.setTextLanguage(textLanguage);
...
// Process the document with these parameters
document.Process(docProcessingParams);

Dipal Darji posted this 01 May 2015

@maol: Thank you for your answer. I implemented your code, but it still processes documents in all languages. I have tried Arabic, Chinese, Dutch, French, and Italian. Please help me.

Oksana Serdyuk posted this 06 May 2015

You can also call the SetPredefinedTextLanguage method with a parameter that contains several language names separated by commas, for example "English,German". The program will then automatically select the language of the document from the specified set. Please note, however, that recognition quality is generally better when fewer languages are specified.
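
If the set of supported languages comes from configuration rather than a hard-coded string, the comma-separated parameter could be built with a small helper like the one below. This is only a sketch: the LanguageList class and its join method are hypothetical names, not part of the FineReader Engine API, and the commented-out call assumes that recognizerParams is the object from the earlier reply and that it exposes the SetPredefinedTextLanguage method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the FineReader Engine API.
public class LanguageList {

    // Joins language names into the comma-separated form, e.g. "English,German".
    // Names are trimmed, duplicates are dropped, and empty names are rejected.
    public static String join(List<String> names) {
        List<String> cleaned = new ArrayList<>();
        for (String name : names) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                throw new IllegalArgumentException("Empty language name");
            }
            if (trimmed.contains(",")) {
                throw new IllegalArgumentException("Language name must not contain a comma: " + trimmed);
            }
            if (!cleaned.contains(trimmed)) {
                cleaned.add(trimmed);
            }
        }
        if (cleaned.isEmpty()) {
            throw new IllegalArgumentException("At least one language is required");
        }
        return String.join(",", cleaned);
    }

    public static void main(String[] args) {
        String languages = join(Arrays.asList("English", " German "));
        System.out.println(languages); // prints "English,German"

        // Assumed usage with the engine object from the earlier reply:
        // recognizerParams.SetPredefinedTextLanguage(languages);
    }
}
```

Keeping the list short is in line with the note above that recognition quality tends to be better with fewer languages.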
