Hello,

I'm just getting started with the OCR SDK. I'm using PHP to POST an image of a serial number, and I'm testing the letterSet parameter. Despite setting it as follows:

http://cloud.ocrsdk.com/processTextField?language=english&letterSet=0123456789*

I still get letters returned in the XML result set. I've also been testing the regExp parameter, and that also seems to be ignored (letters are returned where the regExp allows only digits). In that test, I expand letterSet to include all letters and digits and add this regExp parameter:

&regExp=[A-Z]?[A-Z][0-9]{4}

What I am trying to do is have the OCR recognize a serial number in the format A?ANNNN (A=letter, N=digit): a one- or two-letter prefix (A-Z) followed by exactly four digits.

I assume that the parameters for processTextField are sent in the URL string (GET) as opposed to sending with the POST along with the image?
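For reference, here is a minimal Python sketch of the request layout described in this question: the recognition parameters go in the query string and the image bytes go in the POST body. The thread does not confirm this layout, so treat it as an assumption. The HTTP Basic authentication with an application ID and password is also an assumption, and `build_url` and `build_request` are hypothetical helper names. The sketch only constructs the request and does not send it.

```python
import base64
import urllib.request
from urllib.parse import urlencode

# Endpoint as quoted in the question.
ENDPOINT = "http://cloud.ocrsdk.com/processTextField"


def build_url(params):
    """Append the recognition parameters to the endpoint as a query string.

    urlencode percent-encodes characters such as '[', ']', '?', '{' and '*',
    so a regExp or letterSet value survives the trip intact.
    """
    return ENDPOINT + "?" + urlencode(params)


def build_request(image_bytes, params, app_id, password):
    """Build (but do not send) a POST request with the image as the body.

    Assumption: the service uses HTTP Basic authentication.
    """
    token = base64.b64encode(f"{app_id}:{password}".encode()).decode()
    return urllib.request.Request(
        build_url(params),
        data=image_bytes,
        headers={"Authorization": "Basic " + token},
        method="POST",
    )
```

Building the URL with `urlencode` rather than by string concatenation avoids the case where an unescaped `?` or `[` in the regExp is misread by the server or an intermediate proxy.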

I did see the post about using the "Digits" language, but my requirements are more than what is contained in that language.

Thanks.

asked 11 Mar '14, 07:06

HankLloydRight

So that we can test it, could you please share the image you are recognizing, or send it to CloudOCRSDK@abbyy.com?

(11 Mar '14, 17:15) Anastasia Ga... ♦♦

I sent a detailed message to that email address. thanks.

(12 Mar '14, 00:30) HankLloydRight

Thank you. We have received your message and will reply tomorrow.

(18 Mar '14, 02:59) Anastasia Ga... ♦♦

The issue occurs because the OCR technology is not well trained for this font. This should be fixed in the future.

We found out that both of your images can be recognized completely with the following URL: http://cloud.ocrsdk.com/processTextField?textType=handprinted

Thank you for your patience!

answered 18 Mar '14, 17:30

Anastasia Ga... ♦♦

Thanks for your reply.

I had tried "handprinted" as well as all the other textType types during testing, but handprinted failed on many more of the other images I tested.

I found that using "textType=normal,typewriter" generated the smallest number of OCR errors for my images. Really, the only one image that failed with "textType=normal,typewriter" was the one I emailed you.

Can you explain how the regExp parameter works? ABBYY still returns values that would not pass the regExp I'm using.

In the meantime, I'll just write some code on my end to detect misreads that violate the regExp, and try to correct them before passing the values to my application.

Thanks again.

(18 Mar '14, 19:12) HankLloydRight
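The client-side check mentioned in the comment above could look roughly like this in Python. It is only a sketch: the look-alike tables (`TO_DIGIT`, `TO_LETTER`) and the function names are hypothetical and would need tuning against real misreads. The pattern is the A?ANNNN format from the question.

```python
import re

# Serial format from the question: one or two letters, then four digits.
SERIAL_RE = re.compile(r"[A-Z]?[A-Z][0-9]{4}")

# Hypothetical look-alike tables; adjust to the confusions actually observed.
TO_DIGIT = {"O": "0", "Q": "0", "I": "1", "L": "1", "Z": "2", "S": "5", "B": "8"}
TO_LETTER = {"0": "O", "1": "I", "2": "Z", "5": "S", "8": "B"}


def is_valid(text):
    """Return True when the whole string matches the serial format."""
    return SERIAL_RE.fullmatch(text) is not None


def try_correct(text):
    """Return a corrected serial, or None when no safe correction exists.

    The last four characters are forced toward digits and the remaining
    prefix toward letters, using the look-alike tables above.
    """
    text = text.strip().upper()
    if is_valid(text):
        return text
    if len(text) not in (5, 6):
        return None
    prefix, digits = text[:-4], text[-4:]
    fixed = "".join(TO_LETTER.get(c, c) for c in prefix)
    fixed += "".join(TO_DIGIT.get(c, c) for c in digits)
    return fixed if is_valid(fixed) else None
```

Results that come back as `None` are better flagged for manual review than guessed at, since a wrong serial number is usually worse than a missing one.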

We have found two bugs that should be fixed in the near future and can be worked around for now:

  1. The regular expression does not work when the language is specified explicitly. We recommend not specifying the language in the URL (letterSet and regExp are enough).

  2. Something is wrong with the asterisk in the letterSet: when it is used with the handprinted text type, an error occurs. If all of your expressions end with an asterisk, you can probably recognize only the text before it.

(18 Mar '14, 20:01) Anastasia Ga... ♦♦
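The two workarounds in the comment above could be applied to a parameter dictionary before the request URL is built. This Python sketch assumes the parameters are held in a plain dict keyed by the names used in this thread (`language`, `letterSet`, `regExp`, `textType`); `apply_workarounds` is a hypothetical helper, not part of any SDK.

```python
def apply_workarounds(params):
    """Return a copy of params adjusted for the two bugs described above."""
    cleaned = dict(params)

    # Bug 1: regExp is ignored when a language is given, so drop the language.
    if "regExp" in cleaned:
        cleaned.pop("language", None)

    # Bug 2: an asterisk in letterSet causes an error with handprinted text.
    text_types = cleaned.get("textType", "").split(",")
    if "handprinted" in text_types and "letterSet" in cleaned:
        cleaned["letterSet"] = cleaned["letterSet"].replace("*", "")

    return cleaned
```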

Also, the syntax you use is slightly different from the one described in the manual: http://ocrsdk.com/documentation/specifications/regular-expressions/

For this text


FNNNNNNNNB or AFNNNNNNNNB

where

  • A=letters A thru M
  • F=letters A thru L
  • B=letters A thru Z (excluding letters "O" and "Z")
  • N=digits 0-9

you can use, for example, this regExp:

(|[A-M])[A-L][0-9]{8}([A-N]|[P-Y])

(18 Mar '14, 20:03) Anastasia Ga... ♦♦
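The pattern in the comment above can be checked locally before it is sent to the service. The sketch below uses Python's `re` module; the expression happens to be valid Python syntax too, but the ABBYY regExp dialect is documented separately, so identical behavior on the server is an assumption.

```python
import re

# Pattern copied from the comment above: optional A-M prefix, then A-L,
# eight digits, and a final letter that is neither "O" nor "Z".
PATTERN = re.compile(r"(|[A-M])[A-L][0-9]{8}([A-N]|[P-Y])")


def matches_format(text):
    """Return True when the whole string fits FNNNNNNNNB or AFNNNNNNNNB."""
    return PATTERN.fullmatch(text) is not None


samples = ["A12345678B", "MA12345678B", "A12345678O", "A12345678Z", "M12345678B"]
for sample in samples:
    print(sample, matches_format(sample))
```

The first two samples fit the format. The next two end in an excluded letter, and the last one starts with "M", which is allowed only in the optional prefix position, so those three are rejected.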
Asked: 11 Mar '14, 07:06

Seen: 1,021 times

Last updated: 18 Mar '14, 20:04

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal