I have a scenario where I need to OCR a file that has more than 150 pages. To speed up the process, is there a way to split the single file into pages and process them with multithreading?
Multithreading a single document when the file size is large
- Last Post 17 June 2016
In your case we would suggest that you consider using parallel processing. This will allow you to recognize the pages of a document in parallel and thus decrease the overall processing time. For a detailed description of the possible ways to implement parallel processing in FineReader Engine 11, please refer to the Developer’s Help file, the article Guided Tour→Advanced Techniques→Parallel Processing.
You could also take a look at the MultiProcessingRecognition demo tool for an example of parallel processing in FREngine. This demo tool can be found at:
- %ALLUSERSPROFILE%\Application Data\ABBYY\SDK\10\FineReader Engine\Samples\DemoTools — for Windows XP, Windows Server 2003;
- %ProgramData%\ABBYY\SDK\10\FineReader Engine\Samples\DemoTools — for Windows Vista, Windows Server 2008, Windows 7, Windows 8, Windows Server 2012.
In our scenario we are processing a single file with multiple pages (around 150 pages). Is there any way to split the pages into batches of 20, run OCR on each batch, and then merge the results?
The algorithm you are describing will be performed automatically if you use multiprocessing. As recommended in the Help file, use the FRDocument object for parallel processing of multi-page documents. It is the easiest multiprocessing approach to code, because you do not have to implement any additional interfaces. Please find the usage details in the 'Processing with FRDocument object' section of the article Guided Tour→Advanced Techniques→Parallel Processing.
Also, please note that to use multiprocessing, your license must allow at least 2 CPU cores (see the Productivity property → CPU cores).
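For readers who want to see the split, process, and merge idea from the question in isolation, here is a minimal generic sketch in Python. It does not use the FineReader Engine API, and it is not how FRDocument works internally. The names `split_into_batches`, `recognize_batch`, and `recognize_document` are hypothetical, and `recognize_batch` is only a placeholder standing in for a real OCR call.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(page_count, batch_size):
    """Return page-index ranges that together cover pages 0..page_count-1."""
    return [range(start, min(start + batch_size, page_count))
            for start in range(0, page_count, batch_size)]


def recognize_batch(pages):
    # Hypothetical placeholder: a real implementation would call
    # the OCR engine for each page in this batch.
    return [f"text of page {p + 1}" for p in pages]


def recognize_document(page_count, batch_size=20, workers=4):
    """Process batches concurrently and merge the results in page order."""
    batches = split_into_batches(page_count, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order the batches were
        # submitted, so the merged list keeps the original page order.
        results = pool.map(recognize_batch, batches)
        return [text for batch in results for text in batch]


if __name__ == "__main__":
    texts = recognize_document(150)
    print(len(split_into_batches(150, 20)), "batches,", len(texts), "pages")
```

With 150 pages and a batch size of 20, this gives seven full batches plus a final batch of 10 pages. A thread pool is used here only to keep the sketch short. CPU-bound recognition would typically need separate processes, which is what the engine's own multiprocessing support provides.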