PDF, replacing the image after the text has been added.

  • Topic Is Locked
  • Last Post 09 September 2013
robin posted this 07 September 2013


I have documents whose headings and margin notes affect how cleanly they are OCR'd. Even when that text is recognized correctly, it sits so close to the main body of text that it gets "run into" it. I have found that the best way to get clean OCR is to "mask out" the text I don't care about before recognition (I have an automatic process that can do this); the masked image is the same size as the original, unmasked image. Obviously, if I request a PDF document I get the masked image with the text underlay. What I want is the original image with the text underlay.

Is there an easy way to swap the masked image in the PDF for the original image, or an easy way to use the XML output to add a text underlay onto the original image? All the box positions will be the same.

I should say that what I am after is the "pdfSearchable" style of PDF (but with the original image).



Anastasia Galimova posted this 09 September 2013

Unfortunately, this goes beyond the functionality of ABBYY products.
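
For anyone who wants to try the XML route from the question outside the ABBYY tools, here is a minimal Python sketch of the coordinate step only. It is not an ABBYY feature, and everything in it is an assumption for illustration: the simplified XML layout (a `page` element with `width`, `height` and `resolution`, and `line` elements with `l`/`t`/`r`/`b` pixel boxes), the sample values, and the helper names `pdf_escape` and `text_underlay_ops`. Real OCR XML output will need its own parsing (namespaces, nested blocks, per-character boxes). The sketch converts each top-left-origin pixel box into PDF points with a bottom-left origin and emits text operators using text rendering mode 3 (`3 Tr`), which PDF defines as invisible text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified OCR XML (not the real export schema): one <page>
# with its pixel size and resolution, and <line> elements carrying l/t/r/b
# pixel boxes measured from the top-left corner of the image.
SAMPLE_XML = """
<page width="2480" height="3508" resolution="300">
  <line l="300" t="600" r="1500" b="660">First line of body text</line>
  <line l="300" t="690" r="1400" b="750">Second line (body)</line>
</page>
"""


def pdf_escape(text):
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def text_underlay_ops(xml_text, font_name="/F1"):
    """Return (page_width_pt, page_height_pt, list of content-stream ops)."""
    page = ET.fromstring(xml_text)
    scale = 72.0 / float(page.get("resolution"))  # pixels -> PDF points
    page_w = float(page.get("width")) * scale
    page_h = float(page.get("height")) * scale

    ops = []
    for line in page.iter("line"):
        left, top, bottom = (float(line.get(k)) for k in ("l", "t", "b"))
        x = left * scale
        y = page_h - bottom * scale    # PDF origin is bottom-left
        size = (bottom - top) * scale  # rough font size taken from box height
        # "3 Tr" selects invisible text, so only the page image is seen.
        ops.append(
            f"BT {font_name} {size:.2f} Tf 3 Tr {x:.2f} {y:.2f} Td "
            f"({pdf_escape(line.text or '')}) Tj ET"
        )
    return page_w, page_h, ops


if __name__ == "__main__":
    width_pt, height_pt, operators = text_underlay_ops(SAMPLE_XML)
    print(f"page: {width_pt:.2f} x {height_pt:.2f} pt")
    for op in operators:
        print(op)
```

The operators would still have to be written into a page content stream, together with the original image scaled to the same page size, using whatever PDF library you prefer. Because the masked and original images are the same size, the same boxes apply to both. Text width is not fitted to the box here, so text selection in a viewer would only line up approximately.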
