Difference between serifProbability and charConfidence

G Moore posted this 01 April 2013

Just wondering what the precise difference is between the serifProbability and charConfidence parameters in the XML output, and what exactly each one signifies?

Thanks!

Anastasia Galimova posted this 02 April 2013

Hello G Moore,

Thank you for your question! Please see the detailed description below.

  • CharConfidence (integer)

Stores the character confidence. The value ranges from 0 to 100; -1 means that the confidence is undefined. It is an estimate of the recognition confidence of a character, in percentage points: the greater the value, the greater the confidence. Characters extracted from the source PDF file without recognition have a character confidence of 100.

  • SerifProbability (integer)

The value of this property specifies the probability that a character is written in a serif font. The value ranges from 0 to 100; 255 means that the probability is undefined.
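
As a minimal sketch of how these sentinel values might be handled in client code, the raw integers can be mapped to nullable values so that "undefined" is never mistaken for a real score. The class and method names below are hypothetical and are not part of any ABBYY API; the sketch only encodes the value ranges described above.

```csharp
using System;

// Hypothetical helper (not an ABBYY type): encodes the ranges described above.
public static class CharParamValues
{
    public const int UndefinedConfidence = -1;         // CharConfidence "undefined" marker
    public const int UndefinedSerifProbability = 255;  // SerifProbability "undefined" marker

    // Returns the confidence (0..100), or null when it is undefined (-1).
    public static int? ParseCharConfidence(int raw)
    {
        if (raw == UndefinedConfidence) return null;
        if (raw < 0 || raw > 100)
            throw new ArgumentOutOfRangeException(nameof(raw), raw, "Expected 0..100 or -1.");
        return raw;
    }

    // Returns the serif probability (0..100), or null when it is undefined (255).
    public static int? ParseSerifProbability(int raw)
    {
        if (raw == UndefinedSerifProbability) return null;
        if (raw < 0 || raw > 100)
            throw new ArgumentOutOfRangeException(nameof(raw), raw, "Expected 0..100 or 255.");
        return raw;
    }
}
```

With this, `CharParamValues.ParseCharConfidence(-1)` yields `null` instead of a number that could accidentally be compared against a threshold.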

G Moore posted this 09 April 2013

Okay, that's what I thought. We're seeing an interesting pattern emerge in cases where a character was incorrectly identified but the correct character is among the charRecVariants. The serif probability of the character that ended up being used always seems to be lower than the serif probability of the charRecVariant that is correct and should have been used, whereas the charConfidences are the opposite (i.e. the incorrect character has the higher confidence, which I guess is why it was used). Would there be any reason for this? It seems to happen too regularly to be a coincidence.
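
To check whether the pattern is systematic rather than anecdotal, something like the following could be run over the values read from the XML. This is only a sketch: `CharVariant` and `VariantPatternCheck` are made-up names for illustration, not ABBYY types, and the -1 and 255 checks rely on the "undefined" values described earlier in the thread.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical container for one character (or variant) read from the XML output.
public sealed class CharVariant
{
    public string Text { get; }
    public int CharConfidence { get; }
    public int SerifProbability { get; }

    public CharVariant(string text, int charConfidence, int serifProbability)
    {
        Text = text;
        CharConfidence = charConfidence;
        SerifProbability = serifProbability;
    }
}

public static class VariantPatternCheck
{
    // Returns the variants that have a LOWER confidence but a HIGHER serif
    // probability than the chosen character. Undefined values are skipped.
    public static List<CharVariant> SuspectVariants(
        CharVariant chosen, IEnumerable<CharVariant> variants)
    {
        if (!IsDefined(chosen)) return new List<CharVariant>();
        return variants
            .Where(v => IsDefined(v)
                        && v.CharConfidence < chosen.CharConfidence
                        && v.SerifProbability > chosen.SerifProbability)
            .ToList();
    }

    // -1 (confidence) and 255 (serif probability) mean "undefined".
    private static bool IsDefined(CharVariant v) =>
        v.CharConfidence != -1 && v.SerifProbability != 255;
}
```

Counting how often the known-correct character shows up in the returned list, across many misrecognized characters, would show how regular the pattern really is.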

Anastasia Galimova posted this 11 April 2013

Could you please share the images you are recognizing and the settings you use? You can send them to CloudOCRSDK@abbyy.com.

jedarc posted this 16 October 2017

Hello, I know this thread has been around for a long time, but I'm reading it now because I've started using FlexiCapture Engine 11.

I'm getting -1 in CharConfidence for all my results, what makes this happen? That sounds bad to me, am I doing something wrong or has the tool changed the way it works?

Nikolay Krivchanskiy posted this 20 October 2017

Hi,

The way this works has not changed, so you should get a reasonable result. Please note, however, that this property is only valid for characters from documents that were recognized with FlexiCapture Processor.

Another reason might be that your license does not support such export. You can check this in License Manager as follows:

1. Open License Manager.

2. Choose your license.

3. Click the “License Parameters” button.

4. Find the Export tab.

5. Find the “ABBYY XML” parameter under that tab.

A value of “Yes” means that you have this feature; “No” means that you don’t.

If you do not have this feature but need it, please contact your local manager.

jedarc posted this 3 weeks ago

Many thanks for your reply, in fact my license does not support XML export.

But the CharConfidence property can only be obtained through XML export?

Here's how I'm trying to get it via C#:

// I get the IDocument object correctly.
// ...

// Access a field that I'm sure exists (I can get the other properties correctly)
var field = document.Section[sectionIndex].Children[fieldIndex];
var fieldValue = field.Value;

var confidenceLevel = fieldValue.AsText.DefaultCharParams.CharConfidence;

The returned value is always -1.

So the property exists, but it is not supported? How useful is that? Am I doing something wrong?

And how does this property work for FieldValueTypeEnum equal to FVT_Choice, for example? (I saw that it does not exist)

Diana Khammatova posted this 2 weeks ago

According to the FlexiCapture Engine documentation, the CharConfidence property of the CharParams object is only valid for characters from documents recognized with FlexiCapture Processor. Could you please make sure that you use the FlexiCaptureProcessor object in your scenario?

As for the second question, as you correctly pointed out, FVT_Choice fields do not support retrieving the confidence.

jedarc posted this 2 weeks ago

Many thanks for the feedback.
I'm getting the processor through a pool, because I'm using the FlexiCapture Engine in a multithreaded context in a web service. So I'm forced to use the OutprocLoader() method.
I have already seen in another question that this method does not support some features, and I have concluded that it is quite limited.

Are there plans to implement a little more support in this method in the future?

Do you intend to implement confidence for other types of fields in the future?
(I believe that every field should have the confidence level)

Diana Khammatova posted this 2 weeks ago

Please check that the EnableRecognitionVariants property of the Engine object is set to TRUE in your application and that the license supports "Verification". Then please see the Help → Guided Tour → Advanced Techniques → Working with Recognition Variants article and follow the "How to retrieve recognition variants for a word or character" instructions to get the CharConfidence property.

As for the OutprocLoader() method, please specify what functionality should be supported in future versions of the library in addition to the GetPicture() method.

Could you please describe why your scenario requires the confidence of FVT_Choice fields? We will try to discuss this feature with our analysts.
