Columned documents

  • Last Post 01 January 2019 posted this 28 November 2018

I have a document with two columns per page, and I need to OCR it so that each column ends up on its own single-column page (the source page 1 has two columns; in the OCR'd document, page 1 holds column A and page 2 holds column B). Is this possible using the Cloud SDK, and how would I go about doing it?

Nadezhda A. Solovyeva posted this 01 January 2019

Dear David,

Yes, this is possible. Cloud OCR SDK supports column detection. If the columns are detected successfully, they appear in the resulting XML as separate text blocks, A and B. You can then apply any transformation to the exported XML, including extracting the formatted text so that block B follows block A.
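
As a rough illustration of that last step, here is a minimal Python sketch that reads an exported XML result and returns one text chunk per detected text block, ordered left to right on each page, so that column A comes before column B. It is not an official sample. It assumes the export uses `page`, `block` (with `blockType="Text"` and an `l` left-coordinate attribute), `line`, `formatting` and optional `charParams` elements; please check these names against your actual export. The function names `columns_to_pages` and `line_text` and the `SAMPLE` data are made up for this example. Sorting by the left coordinate is a simple heuristic that assumes each column was detected as a single block.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return the tag name without its XML namespace prefix."""
    return tag.rsplit("}", 1)[-1]


def line_text(line):
    """Collect the text of one <line> from its <formatting> children.

    Assumption: text is stored either directly inside <formatting> or as
    one character per <charParams> child element.
    """
    parts = []
    for fmt in line.iter():
        if local(fmt.tag) != "formatting":
            continue
        chars = [c for c in fmt if local(c.tag) == "charParams"]
        if chars:
            parts.append("".join(c.text or "" for c in chars))
        else:
            parts.append(fmt.text or "")
    return "".join(parts)


def columns_to_pages(xml_text):
    """Return one string per text block, left-most block first on each page."""
    root = ET.fromstring(xml_text)
    output_pages = []
    for page in root.iter():
        if local(page.tag) != "page":
            continue
        blocks = [
            b for b in page.iter()
            if local(b.tag) == "block" and b.get("blockType") == "Text"
        ]
        # Heuristic: the left coordinate decides the column order.
        blocks.sort(key=lambda b: int(b.get("l", "0")))
        for block in blocks:
            lines = [line_text(ln) for ln in block.iter() if local(ln.tag) == "line"]
            output_pages.append("\n".join(lines))
    return output_pages


# Hypothetical, hand-written sample; the right-hand column is listed first
# on purpose to show that the ordering comes from the coordinates.
SAMPLE = """<document xmlns="http://www.abbyy.com/FineReader_xml/FineReader10-schema-v1.xml">
  <page width="1000" height="1400">
    <block blockType="Text" l="520" t="100" r="950" b="1300">
      <text><par><line><formatting>Column B line 1</formatting></line></par></text>
    </block>
    <block blockType="Text" l="50" t="100" r="480" b="1300">
      <text><par>
        <line><formatting>Column A line 1</formatting></line>
        <line><formatting>Column A line 2</formatting></line>
      </par></text>
    </block>
  </page>
</document>"""

if __name__ == "__main__":
    for number, text in enumerate(columns_to_pages(SAMPLE), start=1):
        print(f"--- output page {number} ---")
        print(text)
```

Each returned string can then be written out as its own page in whatever output format you need. If a column is split into several blocks, you would need a smarter grouping rule than the plain left-to-right sort used here.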