Cloud OCR for extracting text from a specific area and from a particular page in a PDF

Vineet More posted this 3 weeks ago

Hi,

I have a very large PDF (136 MB) and I want to extract the text from a particular page and a particular area of that PDF.

I am very new to ABBYY and need help understanding the following:

 

a. How can I process a large PDF using Cloud OCR SDK? Whenever I upload it, I get an error.

b. Is there some sample code or a reference on how to target a particular page and zone using the OCR API?

 

Thanks in advance.

Helen Osetrova posted this 3 weeks ago

Hi Vineet,

 

The file size limit in Cloud OCR SDK is 30 MB. Please review all the supported input image formats here. To extract data from a particular page, you can create a PDF file containing only the page that should be processed (for example, by printing that page to a file with virtual printer software).
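
As a rough illustration of the two steps above, here is a minimal Python sketch that checks a file against the 30 MB limit and writes a single page to its own PDF. It is not part of Cloud OCR SDK: the function names are hypothetical, the limit is assumed to mean binary megabytes, and the page extraction assumes the third-party pypdf package as a scripted alternative to a virtual printer.

```python
import os

# 30 MB limit quoted above; treating "MB" as binary megabytes is an assumption.
MAX_UPLOAD_BYTES = 30 * 1024 * 1024


def fits_upload_limit(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit


def extract_single_page(src_path, dst_path, page_number):
    """Write page `page_number` (1-based) of `src_path` to `dst_path` as a new PDF.

    Relies on the third-party pypdf package (an assumption; install it with
    `pip install pypdf`). It is imported lazily so the size check above works
    without it.
    """
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(src_path)
    if not 1 <= page_number <= len(reader.pages):
        raise ValueError(f"page {page_number} is out of range")

    writer = PdfWriter()
    writer.add_page(reader.pages[page_number - 1])  # pypdf pages are 0-based
    with open(dst_path, "wb") as out:
        writer.write(out)
```

A typical use would be to call extract_single_page on the large file, then confirm with fits_upload_limit that the resulting one-page PDF is small enough before uploading it.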

 

Zonal recognition in Cloud OCR SDK is possible with the processFields method. A detailed guide to using this method is available here. Please note that a text field should not be longer than 200 characters.
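
To give an idea of what a zone description might look like, here is a minimal Python sketch that builds an XML document with one text field per zone. The element and attribute names (document, page, applyTo, text, left, top, right, bottom, language) are assumptions about the field-description format and should be checked against the processFields guide; the function name build_fields_xml and the example coordinates are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_fields_xml(zones, page_index=0, language="English"):
    """Build a field-description XML string for one page.

    zones: iterable of (field_id, left, top, right, bottom) tuples giving
    the zone rectangle in image pixels.
    NOTE: all tag and attribute names here are assumptions; verify them
    against the processFields guide before sending a real request.
    """
    document = ET.Element("document")
    page = ET.SubElement(document, "page", {"applyTo": str(page_index)})

    for field_id, left, top, right, bottom in zones:
        # Reject rectangles that are empty or inverted.
        if not (0 <= left < right and 0 <= top < bottom):
            raise ValueError(f"invalid zone for field {field_id!r}")
        field = ET.SubElement(
            page,
            "text",
            {
                "id": field_id,
                "left": str(left),
                "top": str(top),
                "right": str(right),
                "bottom": str(bottom),
            },
        )
        ET.SubElement(field, "language").text = language

    return ET.tostring(document, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical zone: a 300x40 pixel rectangle on the first page.
    print(build_fields_xml([("invoice_number", 100, 50, 400, 90)]))
```

The resulting string would be what gets passed to processFields as the description of the fields to recognize. Keeping each zone tight around a single short value also helps stay within the 200-character limit mentioned above.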

 

The following related topics on our forum may also help you:

How to perform fields recognition in a compiled sample for Windows

How to specify coordinates for zonal recognition

 

Hope this information will be useful!

 
