The problem is that we cannot parse the English and Japanese text in a single execution cycle. We first have to run recognition for the English text and then run it again for the Japanese text, as in the snippet below. Is it possible to use only one OCR read cycle for both kinds of text?


            //ENGLISH CALL
            PageProcessingParams pageprocessingParamsEng = engineLoader.Engine.CreatePageProcessingParams();
            pageprocessingParamsEng.RecognizerParams.SetPredefinedTextLanguage("English");

            SynthesisParamsForDocument synthesisParamsForDocument = engineLoader.Engine.CreateSynthesisParamsForDocument();
            synthesisParamsForDocument.CleanRecognizedTextFontNames();
            synthesisParamsForDocument.AddRecognizedTextFontName("MS UI Gothic");
            document.Process(pageprocessingParamsEng, null, synthesisParamsForDocument);

            for (int i = 0; i < document.Pages.Count; i++)
            {
                calculateStatisticsForLayout(document.Pages[i].Layout);
            }

            //JAPANESE CALL
            PageProcessingParams pageprocessingParams = engineLoader.Engine.CreatePageProcessingParams();
            pageprocessingParams.RecognizerParams.SetPredefinedTextLanguage("Japanese");

            document.Process(pageprocessingParams, null, synthesisParamsForDocument);

            ///

            for (int i = 0; i < document.Pages.Count; i++)
            {
                calculateStatisticsForLayout(document.Pages[i].Layout);
            }

asked 27 Nov '12, 11:50

gursharan

What happens if you call pageprocessingParams.RecognizerParams.SetPredefinedTextLanguage("English,Japanese"); ?

(27 Nov '12, 13:00) Dmitry Me ♦♦
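
Applied to the snippet from the question, that suggestion would look roughly like the sketch below. It reuses only calls that already appear in the question; `engineLoader`, `document`, and `calculateStatisticsForLayout` are the asker's own objects and are not defined here, so this is a fragment rather than a standalone program. Whether a given Engine build accepts the comma-separated language string is an assumption to check against its documentation.

```csharp
// Fragment: engineLoader, document and calculateStatisticsForLayout come from
// the question's code and are not defined in this sketch.
PageProcessingParams pageProcessingParams = engineLoader.Engine.CreatePageProcessingParams();

// Assumption taken from the comment above: both predefined languages are
// passed in one comma-separated string instead of two separate calls.
pageProcessingParams.RecognizerParams.SetPredefinedTextLanguage("English,Japanese");

// Same synthesis settings as in the question.
SynthesisParamsForDocument synthesisParamsForDocument = engineLoader.Engine.CreateSynthesisParamsForDocument();
synthesisParamsForDocument.CleanRecognizedTextFontNames();
synthesisParamsForDocument.AddRecognizedTextFontName("MS UI Gothic");

// One recognition pass instead of one pass per language.
document.Process(pageProcessingParams, null, synthesisParamsForDocument);

// Statistics are collected once, after the single pass.
for (int i = 0; i < document.Pages.Count; i++)
{
    calculateStatisticsForLayout(document.Pages[i].Layout);
}
```

The only change from the original two-pass code is the language string and the removal of the second `Process` call and its loop.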

Thank you for the response, but it still does not pick up the English text. If I remove the Japanese font here, the English text is read but the Japanese shows up as junk. Please help.

    SynthesisParamsForDocument synthesisParamsForDocument = engineLoader.Engine.CreateSynthesisParamsForDocument();
    synthesisParamsForDocument.CleanRecognizedTextFontNames();
    synthesisParamsForDocument.AddRecognizedTextFontName("MS UI Gothic");
    document.Process(pageprocessingParamsEng, null, synthesisParamsForDocument);

(27 Nov '12, 13:39) gursharan

Thank you for your question! Could you please attach a sample image for which the issue occurs and tell us the build number of your ABBYY FineReader Engine package so that we can reproduce the issue?

To determine the build number, please see http://knowledgebase.ocrsdk.com/article/1116

(12 Dec '12, 15:05) Anastasia Ga... ♦♦
