Just wondering what the precise difference is between the serifProbability and charConfidence parameters in the XML output, and what exactly each signifies?

Thanks!

asked 01 Apr '13, 07:22

G Moore

Hello G Moore,

Thank you for your question! Please see the detailed description below.

  • CharConfidence (integer)

Stores the character confidence: an estimate, in percentage points, of how confident the recognizer is in the character. The value ranges from 0 to 100; -1 means the confidence is undefined. The greater the value, the greater the confidence. Characters extracted from the source PDF file without recognition have a character confidence of 100.

  • SerifProbability (integer)

Specifies the probability that a character is written in a serif font. The value ranges from 0 to 100; 255 means the probability is undefined.
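As a rough illustration of how the two attributes and their "undefined" sentinels might be read from the XML output, here is a minimal Python sketch. The sample XML is hypothetical and simplified: the `charParams` element name and the `line` wrapper are assumptions, and real output carries an XML namespace and more attributes. Treating a missing attribute as undefined is also an assumption made here, not documented behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample. Element names are assumptions;
# real output also uses an XML namespace and extra attributes.
SAMPLE = """<line>
  <charParams charConfidence="87" serifProbability="12">A</charParams>
  <charParams charConfidence="-1" serifProbability="255">B</charParams>
  <charParams charConfidence="100" serifProbability="100">C</charParams>
</line>"""

# Sentinel values described in the answer above.
CONFIDENCE_UNDEFINED = -1
SERIF_UNDEFINED = 255


def read_char_metrics(xml_text):
    """Return (char, confidence, serif_probability) tuples.

    Undefined values (-1 for confidence, 255 for serif probability)
    are mapped to None so they are not mistaken for real scores.
    """
    results = []
    for node in ET.fromstring(xml_text).iter("charParams"):
        # Assumption: a missing attribute is treated as undefined.
        confidence = int(node.get("charConfidence", CONFIDENCE_UNDEFINED))
        serif = int(node.get("serifProbability", SERIF_UNDEFINED))
        results.append((
            node.text,
            None if confidence == CONFIDENCE_UNDEFINED else confidence,
            None if serif == SERIF_UNDEFINED else serif,
        ))
    return results


for char, confidence, serif in read_char_metrics(SAMPLE):
    print(char, confidence, serif)
```

Mapping the sentinels to None keeps 255 from being read as a very high serif probability, and -1 from being read as a very low confidence, when the values are later compared or averaged.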

answered 02 Apr '13, 18:56

Anastasia Ga... ♦♦

Okay, that's what I thought. We're seeing an interesting pattern emerge. When a character is incorrectly identified but the correct character is among the charRecVariants, the serif probability of the character that ended up being used always seems to be lower than that of the correct charRecVariant, whereas the charConfidences are the opposite (i.e. the incorrect character has the higher confidence, which I guess is why it was used). Would there be any reason for this? It seems to happen too regularly to be a coincidence.

(09 Apr '13, 22:46) G Moore

Could you please share the images you are recognizing and the settings you use? You can send them to CloudOCRSDK@abbyy.com.

(11 Apr '13, 17:23) Anastasia Ga... ♦♦
© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal