Secondly, in the XML output it would be useful if blockType had an "image" option. Currently all images use blockType=Text. Another attribute would also work.
The XML output format that Cloud OCR SDK creates is the same one that FineReader Engine creates with default options. It contains information on text and characters plus basic information about blocks.
There is a special block type for images, blockType="Picture", and it is already available in the output. You should get such blocks after a processImage call on an image containing pictures, using the default processing profile.
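To check whether Picture blocks are actually present in a result, something along these lines could work. It is only a sketch: the sample XML is made up for illustration, the function name find_picture_blocks is hypothetical, and both the block element name and the l, t, r, b coordinate attributes are assumptions about the output. Only the blockType="Picture" attribute comes from the answer itself.

```python
import xml.etree.ElementTree as ET


def find_picture_blocks(xml_text):
    """Return the attributes of every element that looks like a Picture block.

    Assumption: blocks are elements named "block" carrying a blockType
    attribute. Any XML namespace on the tag is ignored, so the function
    does not depend on a particular schema URI.
    """
    root = ET.fromstring(xml_text)
    pictures = []
    for elem in root.iter():
        # ElementTree writes namespaced tags as "{uri}name"; keep only the name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "block" and elem.get("blockType") == "Picture":
            pictures.append(dict(elem.attrib))
    return pictures


# Made-up sample, not real service output. The coordinate attributes
# (l, t, r, b) are an assumption used only for illustration.
SAMPLE = """<document>
  <page>
    <block blockType="Text" l="10" t="10" r="200" b="50"/>
    <block blockType="Picture" l="10" t="60" r="200" b="300"/>
  </page>
</document>"""

if __name__ == "__main__":
    for attrs in find_picture_blocks(SAMPLE):
        print(attrs)
```

If this returns an empty list for a page that clearly contains images, comparing the blockType values that do appear in the file is a reasonable next step.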
We'll probably enrich the default output format soon to add some additional information about block coordinates.
If you need special XML output like the one described in the link you provided, please let us know and we'll add processing parameters to create such XML.
answered 10 May '12, 17:04
Vasily Panferov ♦♦