Recognize fields in multipage

danyolgiax posted this 13 July 2015

I have a PDF document where every page has the same template. I need to recognize a single field from each page.

Do I need to process the whole page, or can text field recognition work with multipage documents?

Oksana Serdyuk posted this 14 July 2015

For your scenario you can use the processFields method. It lets you specify the coordinates of each field on each page in an XML file, for example:

<?xml version="1.0" encoding="utf-8"?>
<document xmlns="http://ocrsdk.com/schema/taskDescription-1.0.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://ocrsdk.com/schema/taskDescription-1.0.xsd http://ocrsdk.com/schema/taskDescription-1.0.xsd">
  <fieldTemplates />
  <page applyTo="0,1">
    <text id="Field1" left="395" top="105" right="1047" bottom="157">
      <language>English</language>
      <textType>normal</textType>
      <oneTextLine>true</oneTextLine>
    </text>
  </page>
  <page applyTo="2">
    ...
  </page>
  ...
  <page applyTo="N">
    ...
  </page>
</document>

danyolgiax posted this 14 July 2015

I considered it, but I don't know in advance how many pages the document has. Do I need to dynamically generate the configuration XML file by reading the total page count from the PDF document?

Oksana Serdyuk posted this 14 July 2015

Yes, you should, because the "applyTo" attribute is mandatory for the "page" element.
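
For readers who want to see what generating the file dynamically might look like, here is a minimal Python sketch. It is not part of the SDK: the function name build_task_xml and its parameters are hypothetical, and it assumes that page indexes are zero-based and that one comma-separated "applyTo" list can cover every page, as the "0,1" value in the sample above suggests. The page count is passed in as a plain number; obtain it with whatever PDF library you already use.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample task description above.
NS = "http://ocrsdk.com/schema/taskDescription-1.0.xsd"
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def build_task_xml(page_count, field, language="English"):
    """Build a task description that applies one text field to every page.

    Hypothetical helper: `field` is a dict with the keys
    id, left, top, right and bottom (same meaning as in the sample XML).
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")

    # Serialize the task namespace as the default one and keep the xsi prefix.
    ET.register_namespace("", NS)
    ET.register_namespace("xsi", XSI)

    root = ET.Element(f"{{{NS}}}document")
    root.set(f"{{{XSI}}}schemaLocation", f"{NS} {NS}")
    ET.SubElement(root, f"{{{NS}}}fieldTemplates")

    # Assumption: pages are numbered from 0, as in applyTo="0,1" above.
    page = ET.SubElement(root, f"{{{NS}}}page")
    page.set("applyTo", ",".join(str(i) for i in range(page_count)))

    text = ET.SubElement(page, f"{{{NS}}}text")
    for key in ("id", "left", "top", "right", "bottom"):
        text.set(key, str(field[key]))

    # Child settings mirror the sample field definition.
    ET.SubElement(text, f"{{{NS}}}language").text = language
    ET.SubElement(text, f"{{{NS}}}textType").text = "normal"
    ET.SubElement(text, f"{{{NS}}}oneTextLine").text = "true"

    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body


if __name__ == "__main__":
    field1 = {"id": "Field1", "left": 395, "top": 105,
              "right": 1047, "bottom": 157}
    # The page count 3 is only an example value.
    print(build_task_xml(3, field1))
```

The generated string can be saved to a file and submitted with the processFields request in place of a hand-written XML file. If the fields differ from page to page, the same approach works with one "page" element per index instead of a single shared list.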
