Recognizing Arabic text with User Dictionary

dgriffith posted this 06 February 2018

I want to use a list of Arabic words to supplement the built-in Arabic dictionary.

I performed these steps:

LoadEngine();

var languageDatabase = _engineLoader.Engine.CreateLanguageDatabase();

Dictionary dict = languageDatabase.OpenDictionaryExtension(LanguageIdEnum.LI_ArabicSaudiArabia);

dict.Edit();   // at this point, I imported a text file with 9,000,000 Arabic words, then exported it to:

// ...\ABBYY SDK\11\FineReader Engine\Data\ExtendedDictionaries\Arabic.txt

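// note: the declarations of enumer and conf are not shown here

string s = "x";

int i = 0;   // counts the calls to Next(), including the final one that returns null

while (s != null)

{

s = enumer.Next(out conf);

i++;

}

// at this point i was around 4.2 million, so I assume that I successfully created a user dictionary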

UnloadEngine();

Later, I performed OCR:

LoadEngine();

var languageDatabase = _engineLoader.Engine.CreateLanguageDatabase();

Dictionary dict = languageDatabase.OpenDictionaryExtension(LanguageIdEnum.LI_ArabicSaudiArabia);

var document = _engineLoader.Engine.CreateFRDocument();

document.AddImageFile(imageFile);

var recognitionParameters = _engineLoader.Engine.CreateRecognizerParams();

recognitionParameters.SetPredefinedTextLanguage("Arabic");

var extractionParameters = _engineLoader.Engine.CreateObjectsExtractionParams();

extractionParameters.DetectTextOnPictures = true;

extractionParameters.RemoveGarbage = true;

extractionParameters.EnableAggressiveTextExtraction = true;

var documentParams = _engineLoader.Engine.CreateDocumentProcessingParams();

documentParams.PageProcessingParams.ObjectsExtractionParams = extractionParameters;

documentParams.PageProcessingParams.RecognizerParams = recognitionParameters;

document.Process(documentParams);

// and then go on to write results to PDF, RTF & XML

Have I performed the steps needed to use both the standard and user Arabic dictionaries?

If not, what is missing?

Thank you for your help.

David Griffith


Kseniya Leontyeva posted this 28 February 2018

Hi David,

No, the dictionary won't be added to the language automatically that way. If you want to edit a dictionary, please do it through user dictionaries.

There are two ways to add a user dictionary to the language:

  • Using the AddWord or AddWords methods of the Dictionary object.
  • Using the *.ame file of a dictionary extension. This file can be generated in FineReader 12 or earlier.

You can find an example implementation of the first option in the CustomLanguage sample.

Also, you can read more about this topic in Help → Guided Tour → Advanced Techniques → Working with Dictionaries.
