Column data is merged in ABBYY 10.5

Anurag Singh posted this 3 weeks ago

I have a table that is being read as text data using the ABBYY engine. Item (say 1001) and material code (20015689) are two separate columns. When ABBYY extracts the data, it merges both values into a single cell, as shown below:

 

<cell colSpan="2" leftBorder="White" rightBorder="White" bottomBorder="White" width="91" height="19">

<text>

<par>

<line baseline="328" l="38" t="321" r="112" b="329"><formatting lang="EnglishUnitedKingdom">1001  20015689</formatting></line></par>

</text></cell>

I have seen some suggestions online (http://knowledgebase.abbyy.com/article/698), which I tried to apply, but I was unable to get them working.

I am using Java to call ABBYY OCR.

 

Can you please help me with this?

Thanks a lot.

Oksana Serdyuk posted this 3 weeks ago

Hi,

Please try enabling the AggressiveTableDetection property of the PageAnalysisParams object. If you set it to TRUE, FineReader Engine tries to find as many tables as possible on the page.

If this setting does not help, please send:

  • your serial number
  • the build number of FineReader Engine 10.5
  • the image illustrating the issue

to your regional Technical Support. You can find all ABBYY contacts here: https://www.abbyy.com/contacts/

Anurag Singh posted this 2 weeks ago

Hi Oksana,

Thanks for your reply.

I will have to get permission from my client before providing the sample and license. Once I get that, I will send you the sample and serial number.

I have tried the AggressiveTableDetection property, but it does not work either, because the table has no vertical lines separating the cells. If the data in two adjacent cells is very close together, ABBYY merges them into one cell.
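
As a stopgap while the detection issue is open, the merged value could be split after extraction. This is only a minimal sketch in Java, and it assumes that neither column value contains whitespace (true for the item and material code shown earlier, but not for columns in general). `MergedCellSplitter` and `splitColumns` are hypothetical names used for illustration and are not part of the ABBYY API:

```java
import java.util.ArrayList;
import java.util.List;

class MergedCellSplitter {

    // Splits the text of a merged cell into one value per column.
    // Assumption: column values never contain whitespace themselves,
    // so any run of whitespace marks a column boundary.
    static List<String> splitColumns(String mergedText) {
        List<String> values = new ArrayList<>();
        for (String part : mergedText.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                values.add(part);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Text taken from the <formatting> element of the merged cell.
        String merged = "1001  20015689";
        System.out.println(splitColumns(merged)); // prints [1001, 20015689]
    }
}
```

This does not fix the table detection itself; it would break for columns that hold free text with spaces, so it is only suitable for code-like values such as these.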

Do you have any suggestions for this?

Just to let you know, when I used the Visual Components with vertical lines separating the table cells, all the data was extracted perfectly.

I have two questions here:

1. Are ABBYY Visual Components and the ABBYY engine used together for good extraction?

2. Do companies use both of them together for extraction, for example by first pre-processing the image with Visual Components and then passing it to the ABBYY engine, or can the same result be achieved programmatically using only the ABBYY engine?

 

Thanks a lot.

Regards,

Anurag

 

 

 

 
