Unfortunately, we don't have an XSLT sample for the ABBYY XML → HTML transformation. Please check our XML export schema for a full description of the ABBYY XML format.
Here is a copy of the referenced knowledge base article.
In some cases, you may receive a corrupt layout because the tables in the document were not detected.
1. First of all, make sure that your images are of sufficient quality. We recommend 300 dpi color or grayscale images.
2. If your images are of good quality, make sure that you did not set either of the parameters below, because both turn off table detection:
- IPageAnalysisParams.EnableTextExtractionMode = true;
- IPageAnalysisParams.DetectTables = false;
3. You may use the following parameter to make table detection a priority for the Analyser:
- IPageAnalysisParams.AggressiveTableDetection = true;
4. In rare cases, FineReader Engine cannot detect tables even if forced. For example, this happens if your table has a lot of decorative formatting, lacks clear separators, or uses decorative fonts that are not recognized clearly.
There is one last method of table recognition, applicable only to pages that consist of a table alone (no pictures or text blocks outside the table). You may create a table block covering the whole page area and force analysis of that block. A C# code sample is below:
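The parameter settings from steps 2 and 3 could be applied roughly as in the fragment below. This is only a sketch, not an official sample: it assumes that `engine` is an already initialized FREngine.IEngine, that `document` is an FREngine.IFRDocument with pages already added, and that the analysis parameters are reached through a document processing parameters object. The exact object path may differ between FineReader Engine versions.

```csharp
// Sketch only. Assumptions: `engine` is an initialized FREngine.IEngine and
// `document` is an FREngine.IFRDocument that already contains the pages.
FREngine.IDocumentProcessingParams processingParams = engine.CreateDocumentProcessingParams();
FREngine.IPageAnalysisParams analysisParams = processingParams.PageProcessingParams.PageAnalysisParams;

// Step 2: keep table detection switched on.
analysisParams.EnableTextExtractionMode = false;
analysisParams.DetectTables = true;

// Step 3: make table detection a priority for the analyzer.
analysisParams.AggressiveTableDetection = true;

// Analyze and recognize the document with these settings.
document.Process(processingParams);
```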
// Build a region covering the whole page (assumes the table is on the first page, index 0).
FREngine.IFRPage page = document.Pages[0];
FREngine.IRegion wholePageRegion = engineLoader.Engine.CreateRegion();
wholePageRegion.AddRect(0, 0, page.ImageDocument.BlackWhiteImage.Width, page.ImageDocument.BlackWhiteImage.Height);
// Add a table block over that region and get its table-specific interface.
FREngine.IBlock block = page.Layout.Blocks.AddNew(FREngine.BlockTypeEnum.BT_Table, wholePageRegion);
FREngine.ITableBlock tableBlock = block.GetAsTableBlock();