scan not recognized as text

  • Last Post 12 January 2018
  • Topic Is Solved
b-t-o posted this 05 January 2018


What can I do if a single page of a pdf invoice is not fully recognized as text?

<block blockType="Text" blockName="" l="278" t="2872" r="360" b="2926"><region><rect l="353" t="2872" r="360" b="2878"/><rect l="278" t="2878" r="360" b="2919"/><rect l="353" t="2919" r="360" b="2926"/></region>
<par lineSpacing="-1"></par>
<block blockType="Text" blockName="" l="539" t="198" r="567" b="233"><region><rect l="560" t="198" r="567" b="199"/><rect l="539" t="199" r="567" b="232"/><rect l="539" t="232" r="560" b="233"/></region>
<par lineSpacing="940">
<line baseline="226" l="546" t="206" r="560" b="226"><formatting lang="EnglishUnitedStates">
<charParams l="546" t="206" r="560" b="226" suspicious="1">1</charParams></formatting></line></par>
<block blockType="Picture" blockName="" l="395" t="235" r="2867" b="1492"><region><rect l="1614" t="235" r="1851" b="236"/><rect l="559" t="236" r="2867" b="237"/><rect l="395" t="237" r="2867" b="1014"/><rect l="395" t="1014" r="2866" b="1487"/><rect l="395" t="1487" r="2867" b="1489"/><rect l="395" t="1489" r="2867" b="1490"/><rect l="826" t="1490" r="2507" b="1491"/><rect l="2252" t="1491" r="2408" b="1492"/></region>
<block blockType="Text" blockName="" l="394" t="1609" r="2270" b="1675"><region><rect l="1016" t="1609" r="1916" b="1610"/><rect l="394" t="1610" r="2119" b="1611"/><rect l="394" t="1611" r="2260" b="1612"/><rect l="394" t="1612" r="2270" b="1673"/><rect l="394" t="1673" r="2270" b="1674"/><rect l="822" t="1674" r="2270" b="1675"/></region>
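
One way to check how a page was segmented is to list the blocks in the XML result and look for large areas classified as Picture. The following is only a minimal sketch: the SAMPLE string is a simplified, well-formed stand-in built from the coordinates in the excerpt above, and the helper names and the min_area threshold are assumptions made for illustration, not part of the OCR service.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the XML result (assumption: the real
# export wraps blocks in page/document elements and may use an XML namespace).
SAMPLE = """<page>
  <block blockType="Text" l="539" t="198" r="567" b="233"/>
  <block blockType="Picture" l="395" t="235" r="2867" b="1492"/>
  <block blockType="Text" l="394" t="1609" r="2270" b="1675"/>
</page>"""


def list_blocks(xml_text):
    """Return (blockType, (l, t, r, b)) for every block element."""
    root = ET.fromstring(xml_text)
    blocks = []
    for elem in root.iter():
        # Strip a possible "{namespace}" prefix before comparing the tag name.
        if elem.tag.rsplit("}", 1)[-1] != "block":
            continue
        box = tuple(int(elem.get(key)) for key in ("l", "t", "r", "b"))
        blocks.append((elem.get("blockType"), box))
    return blocks


def large_picture_blocks(xml_text, min_area=100_000):
    """Return bounding boxes of Picture blocks covering at least min_area pixels."""
    result = []
    for block_type, (left, top, right, bottom) in list_blocks(xml_text):
        area = (right - left) * (bottom - top)
        if block_type == "Picture" and area >= min_area:
            result.append((left, top, right, bottom))
    return result


if __name__ == "__main__":
    for block_type, box in list_blocks(SAMPLE):
        print(block_type, box)
    print("Large picture areas:", large_picture_blocks(SAMPLE))
```

A large Picture block in a place where the source page shows a table is a hint that the page analysis treated the table as an image, so its text was never recognized.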

But that area of the page contains a table with text (invoice elements), not an unrecognizable image.
How can I prevent this in the future? I am currently using processImage?language=german,latin,english&exportFormat=xml.
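
For reference, the request above can be assembled programmatically so that further recognition settings are easy to add later. This is a rough sketch only: the base URL is an assumption about the service endpoint, build_process_image_url is a hypothetical helper, and the extra keyword arguments are a generic placeholder, since no additional parameter is named in this post.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust to the server your account actually uses.
BASE_URL = "https://cloud.ocrsdk.com/processImage"


def build_process_image_url(languages, export_format, base_url=BASE_URL, **extra):
    """Build a processImage URL from a language list, an export format,
    and any additional query parameters passed as keyword arguments."""
    params = {
        "language": ",".join(languages),
        "exportFormat": export_format,
    }
    params.update(extra)
    # Keep commas unescaped so the language list stays readable.
    return base_url + "?" + urlencode(params, safe=",")


if __name__ == "__main__":
    print(build_process_image_url(["german", "latin", "english"], "xml"))
```

The image itself would then be sent in the body of an authenticated POST request to that URL; that step is left out here because it depends on account credentials.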

Best wishes


Oksana Serdyuk posted this 10 January 2018

Could you please share your source PDF file or send it to us, so that we can find more appropriate recognition settings for your scenario?

Oksana Serdyuk posted this 12 January 2018

Hi Marc,

Thank you for the provided document. Please try the following recognition settings:


b-t-o posted this 12 January 2018

Hi Oksana,

Thank you for the new parameter.
It works well.

Thank you.

Best wishes