Export pdf document with custom layout

  • Last Post 20 February 2017
Angelo posted this 16 February 2017

I need to export a subset of user-created areas to a PDF file, keeping their original appearance as much as possible. I've used the FineReader Engine VisualComponents sample as a starting point, since it can export a document that has been fully processed by the engine to RTF.

I managed to export the full document as PDF, but I can only ever export the whole file as it was synthesized, or part of it by manually removing auto-recognized blocks. However, I need to add blocks manually from scratch and export a PDF file containing only the specified blocks.

I set up a simple function which creates a document, programmatically adds a text block to every page and runs recognition on the blocks. But if I try to export the document to a PDF file, I get the error "The logical structure of the document is invalid. Please, perform document synthesis". And if I perform synthesis of the document, my custom blocks are removed and the full document is exported.

How can I export only the blocks I need?

Oksana Serdyuk posted this 17 February 2017

I suppose that you are exporting the result to searchable PDF (TextExportMode=PEM_ImageOnText by default), where the whole page image is saved as a picture and the recognized text is saved as text and placed under the image. The image covers all the recognized text, which is why you can't see what has actually been exported. Please try one of these export modes: PEM_TextOnly or PEM_TextWithPictures. See also the Developer's Help → API Reference → PDFExportParams Object.

Angelo posted this 20 February 2017

No, sorry, as English is not my native language I may have explained my issue unclearly.

I just can't export the PDF, because of an exception thrown by the call

frDocument.Export(filename, FileExportFormatEnum.FEF_PDF, null);

which throws an exception with the message

"The logical structure of the document is invalid. Please, perform document synthesis"

My goal is to export some user-selected parts of a PDF, keeping their appearance as close as possible to the original document. The FRDocument.Export function does this, but for the whole document, while I need only some parts of it.

Oksana Serdyuk posted this 20 February 2017

I think that I understand you correctly. There are some things you need to pay attention to:

  1. By default, if you export the resulting file to PDF, it is saved in searchable PDF format, i.e. pdfExportParams.TextExportMode = FREngine.PDFExportModeEnum.PEM_ImageOnText. This means that the entire image is saved as a picture and the recognized text is put under it.
  2. Document synthesis must be performed before you export the recognition result to any of the supported formats except TXT and PDF ImageOnly, which do not use synthesis information.
  3. You get the "The logical structure of the document is invalid. Please, perform document synthesis" error because you are trying to export the result to the PDF ImageOnText format, which requires that document synthesis has been performed.
  4. If you perform document synthesis and export to PDF with the default settings, you will indeed get the full document as a result, because the entire image is saved as a picture and covers all the recognized text.

Thus, if I modify our standard VisualComponents sample in the following way:

...
DialogResult result = saveFileDialog.ShowDialog();
if ( result == DialogResult.OK ) 
{
   // Synthesis is required before export to any format except TXT and image-only PDF
   Document.Synthesize(Synchronizer.ProcessingParams.SynthesisParamsForDocument);

   // Override the default PEM_ImageOnText mode so the page image does not cover the exported text
   FREngine.PDFExportParams pdfExportParams = engineLoader.Engine.CreatePDFExportParams();
   pdfExportParams.TextExportMode = FREngine.PDFExportModeEnum.PEM_TextWithPictures;

   Document.Export(saveFileDialog.FileName, FREngine.FileExportFormatEnum.FEF_PDF, pdfExportParams);
}
...

The result is the following:

[Two screenshots of the result]
