Hello,

Which way do you recommend to efficiently check whether a document has text? We want to retrieve the OCR results in XML format, so our first attempt is to look for blocks whose blockType attribute is "Text". Is there a flag somewhere? Is our method reliable?

Thanks!

asked 04 Apr '12, 20:17

rblasco

edited 05 Apr '12, 12:52

Nikolay_Kh ♦♦


Correct. To check whether the document contains text, recognize it with XML output and look for a <block blockType="Text"> element. If there are no such elements in the document, no text was recognized in it. There can still be image blocks, barcodes, etc.
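
The check described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names `has_text_block` and `local_name` and the two sample documents are made up for this example, and the sample XML is a simplified stand-in rather than real recognition output. Because the exact XML namespace may differ between schema versions, the sketch compares only the local tag name. It streams the file and stops at the first matching block, which should keep the check cheap for documents that do contain text.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified samples; real exported XML has more structure
# and normally declares a namespace on the root element.
SAMPLE_WITH_TEXT = (
    b'<document><page>'
    b'<block blockType="Picture"/>'
    b'<block blockType="Text"><text/></block>'
    b'</page></document>'
)
SAMPLE_WITHOUT_TEXT = (
    b'<document><page><block blockType="Picture"/></page></document>'
)


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def has_text_block(source):
    """Return True if the XML contains a <block blockType="Text"> element.

    `source` is a file name or a binary file object. Parsing is incremental
    and returns as soon as the first text block is seen.
    """
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            # Attributes are already available on a "start" event.
            if local_name(elem.tag) == "block" and elem.get("blockType") == "Text":
                return True
        else:
            # Drop the content of finished elements to limit memory use.
            elem.clear()
    return False


if __name__ == "__main__":
    print(has_text_block(io.BytesIO(SAMPLE_WITH_TEXT)))
    print(has_text_block(io.BytesIO(SAMPLE_WITHOUT_TEXT)))
```

In practice you would pass the path of the exported XML file instead of an in-memory sample, for example `has_text_block("result.xml")`, where `result.xml` is a placeholder name.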

answered 05 Apr '12, 11:52

Vasily Panferov ♦♦

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal