The image of a phonebook page column is here: http://digitalfire.com/culiacan/pictures/326.jpg. It was scanned at 600 dpi, auto-leveled, and resized in Photoshop to 300 dpi (without resampling). The parameters are: language=Spanish&exportFormat=txt&imageSource=scanner&correctSkew=false
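For reference, the parameter string quoted above can be assembled programmatically. This is only a minimal sketch: the four parameter names and values are copied from the question, while the base URL in the usage line is a made-up placeholder and not the real service address.

```python
from urllib.parse import urlencode

# Parameters exactly as quoted in the question.
PARAMS = {
    "language": "Spanish",
    "exportFormat": "txt",
    "imageSource": "scanner",
    "correctSkew": "false",
}


def build_query(params):
    """Encode the parameters as a URL query string, keeping their order."""
    return urlencode(params)


def build_url(base_url, params):
    """Append the encoded query string to a base URL."""
    return f"{base_url}?{build_query(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint, used for illustration only.
    print(build_url("https://ocr.example.invalid/processImage", PARAMS))
```

Keeping the parameters in a dictionary makes it easy to flip a single option, such as correctSkew, between test runs.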

The 18th line from the end is missing; it starts 'Clz Heroico ...'. Also, we continue to get a lot of errors with '-0' (e.g. '7144)556' instead of '714-0556').

asked 17 Apr '12, 08:14

thansen

edited 09 Jun '12, 11:28

Vasily Panferov ♦♦


Hello thansen,

I opened your image in FineReader 11 and, for some reason, the text lines are skewed (even though the image looks correct in the browser); I have no idea why. After deskewing within FineReader, the recognition was much better.

Before: 11% (624/5877) uncertain characters. After: 7% (431/6244) uncertain characters. Because of the skew, some areas were not correctly identified as text, which is why the absolute character counts differ so much.
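The percentages can be checked from the raw counts. A small sketch, using only the figures quoted above (the function name is illustrative):

```python
def uncertain_rate(uncertain, total):
    """Return the share of uncertain characters as a percentage."""
    return 100.0 * uncertain / total


# (uncertain characters, total characters) as reported in this answer.
runs = {
    "before deskew": (624, 5877),
    "after deskew": (431, 6244),
}

if __name__ == "__main__":
    for label, (uncertain, total) in runs.items():
        print(f"{label}: {uncertain_rate(uncertain, total):.1f}%")
```

Note that the totals differ between runs, so the rates are comparable but the absolute counts are not.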

I also changed the resolution from 600 dpi (4.78 cm x 24.45 cm) to 300 dpi (9.55 cm x 48.9 cm), without recalculating the pixels, and saved the image as a BMP. The resolution change (a virtual enlargement) and the now unskewed text lines made it easier for the OCR to analyse the image. The resulting 8% (472/6207) uncertain characters is at the same level as the image corrected in FineReader.
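The physical sizes follow directly from the pixel dimensions and the dpi tag. A minimal sketch, assuming the 1128x5776 pixel size mentioned elsewhere in this thread applies to this image:

```python
CM_PER_INCH = 2.54


def print_size_cm(width_px, height_px, dpi):
    """Physical size in cm implied by a pixel size and a dpi tag."""
    width_cm = round(width_px / dpi * CM_PER_INCH, 2)
    height_cm = round(height_px / dpi * CM_PER_INCH, 2)
    return (width_cm, height_cm)


if __name__ == "__main__":
    # Same pixels, different dpi tag: only the implied physical size changes.
    for dpi in (600, 300):
        print(dpi, "dpi:", print_size_cm(1128, 5776, dpi), "cm")
```

Halving the dpi tag without resampling doubles the implied physical size, which is the "virtual enlargement" described above.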

Note: I am not recommending scanning at 300 dpi; you need 600 dpi to get enough pixels from the rather small characters.

The number of characters is not always the same because the '.......' areas were sometimes interpreted as image snippets.

BR

Michael

answered 17 Apr '12, 10:52

MFu

Next step to improve:

Double the pixel size of the image and set it to 600 dpi (from 1128x5776 to 2256x11552). When submitting to the server, use all the previous options: scanner and deskew.
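One possible way to do the enlargement outside Photoshop is sketched below. The use of Pillow, the Lanczos filter, and the function names are all assumptions; the thread does not say which tool or interpolation method to use.

```python
def doubled_size(width_px, height_px):
    """Return the pixel dimensions after doubling both sides."""
    return (width_px * 2, height_px * 2)


def upscale_to_600_dpi(src_path, dst_path):
    """Resample an image to twice its pixel size and tag it as 600 dpi.

    Assumption: Pillow is installed. It is imported lazily so that
    doubled_size() can still be used without it.
    """
    from PIL import Image

    with Image.open(src_path) as img:
        # The resampling filter is an illustrative choice, not from the thread.
        enlarged = img.resize(doubled_size(*img.size), resample=Image.LANCZOS)
        # The output format is inferred from the dst_path extension.
        enlarged.save(dst_path, dpi=(600, 600))


if __name__ == "__main__":
    print(doubled_size(1128, 5776))
```

Unlike the earlier dpi change, this step does resample the image, so the pixel count actually grows.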

answered 17 Apr '12, 16:55

Vasily Panferov ♦♦

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal