I am currently evaluating your Cloud OCR SDK to see whether it is viable for automating accounts payable (AP) invoice processing in our Salesforce application.

I am successfully using the processImage method to upload the image and receive the document back as XML, which gives me all of the text along with its coordinates. What I would like to know is whether it is possible to build templates to 'recognise' certain fields. As you know, invoices come in all shapes and sizes, and I would like to cut down on the amount of human intervention. For example, two invoices from the same supplier have similar, but not identical, coordinates for their text fields, and where things end up on the page also depends on how many product lines are invoiced.
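To make the starting point concrete, here is a minimal Python sketch of turning that kind of XML into a list of text lines with bounding boxes. The sample XML, the `line` and `formatting` element names, the `l`/`t`/`r`/`b` attributes and the namespace URL are all assumptions made up for illustration; check them against the XML your own processImage call actually returns.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the real export may use different names and nesting.
SAMPLE_XML = (
    '<document xmlns="http://example.com/ocr"><page>'
    '<line l="100" t="50" r="400" b="80">'
    '<formatting>Invoice No 12345</formatting></line>'
    '<line l="100" t="700" r="460" b="725">'
    '<formatting>Total: 1,250.00</formatting></line>'
    '</page></document>'
)


def local_name(tag):
    """Drop a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def extract_lines(xml_text):
    """Return [(text, (left, top, right, bottom)), ...] for every line element."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "line":
            continue
        # Concatenate all text nested inside the line element.
        text = "".join(elem.itertext()).strip()
        box = tuple(int(elem.attrib[key]) for key in ("l", "t", "r", "b"))
        lines.append((text, box))
    return lines


if __name__ == "__main__":
    for text, box in extract_lines(SAMPLE_XML):
        print(box, text)
```

Once the text is in this shape, any field-matching logic can work on plain tuples rather than on the XML itself.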

I looked at the processFields and processTextFields methods, but I have to supply coordinates and, as I mentioned, these aren't reliable enough to pinpoint a particular field. On top of that, I would need to create a different template for each supplier.

So, my questions are these:

  1. Is it realistic to implement my own templating type functionality? Would that generally be expected functionality?
  2. If so, is it possible to use some kind of tolerance for the coordinates - or do they have to be exact?
  3. Or, is this too ambitious a goal given the methods available in the Cloud OCR SDK currently?

Thanks!

asked 10 Aug '15, 22:05

philhawthorn
114

I've forwarded your contact info to my colleagues at our office in your region. They will contact you soon to discuss your questions and find an individual approach to the issue.

(11 Aug '15, 15:42) Oksana Serdyuk ♦♦

I am attempting the same thing. Can I discuss with someone, or can an answer be put here?

Thanks

(13 Aug '15, 18:29) Jamesw

My colleagues from your regional office will contact you soon to find out the details of your usage scenario.

(14 Aug '15, 12:32) Oksana Serdyuk ♦♦

Added my thoughts/findings below @Jamesw

(14 Aug '15, 12:45) philhawthorn

Thanks for the reply. I've also done some thinking on this.

FlexiLayouts is of course a good feature that would be great for cloud, but for now I believe that it's not too ambitious to "template" a supplier's layout in a very basic way. However, it's worth noting that our solution is not for a one-off customer, so we are willing to throw more resources at it.

From a business point of view, the extracted values would still need verification (for example, before posting to your accounts package), regardless of the accuracy of the OCR. You would therefore ALWAYS be providing a "best efforts" solution to the accounting department.

The templates themselves would require a well-designed, highly responsive interface so that even less computer-literate users can make a new template in minutes. Click and drag to select areas, specify which data sits in each selected area. Save, done.
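As a rough illustration of that kind of basic per-supplier template, here is a minimal Python sketch. It assumes the OCR result has already been parsed into a list of (text, box) pairs; `Box`, `extract_fields` and the `tolerance` parameter are hypothetical names of my own, not part of the Cloud OCR SDK. Each template region is grown by a pixel tolerance, and a word counts as belonging to a field if its centre falls inside the grown region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    right: int
    bottom: int

    def expanded(self, tol):
        """Return a copy of this box grown by tol pixels on every side."""
        return Box(self.left - tol, self.top - tol,
                   self.right + tol, self.bottom + tol)

    def contains_center_of(self, other):
        """True if the centre point of `other` lies inside this box."""
        cx = (other.left + other.right) / 2
        cy = (other.top + other.bottom) / 2
        return self.left <= cx <= self.right and self.top <= cy <= self.bottom


def extract_fields(template, words, tolerance=15):
    """template: {field name: Box}; words: [(text, Box), ...].

    Words matched to a field are joined left to right, so this
    only suits single-line fields.
    """
    result = {}
    for name, region in template.items():
        search = region.expanded(tolerance)
        hits = [(box.left, text) for text, box in words
                if search.contains_center_of(box)]
        result[name] = " ".join(text for _, text in sorted(hits))
    return result


if __name__ == "__main__":
    template = {"invoice_number": Box(400, 50, 560, 80)}
    # On this invoice the number has drifted a few pixels down the page.
    words = [("INV-0042", Box(450, 72, 570, 96)),
             ("Total", Box(100, 700, 160, 725))]
    print(extract_fields(template, words))
```

A fixed tolerance like this copes with small drift between scans, but not with a field that moves down the page as more product lines are added.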

answered 14 Aug '15, 13:20

Jamesw
212

I agree to some extent. I can certainly have users 'click' on the Invoice #, Invoice Total etc., but I wanted more automation than that. I played with two invoices from the same supplier: the pixel locations were similar but not identical, so even locating fields by pixel location would involve some effort, and then you'd have to do this for each different format. Maybe I'm making too many assumptions about what FlexiLayouts would give you, but I assumed that it was more intelligent than that. Also, can you confirm which platform you're developing on? Thanks

(14 Aug '15, 13:27) philhawthorn
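One way to reduce the dependence on pixel locations, sketched here in Python purely as an illustration, is to anchor on a printed label rather than on absolute coordinates: find the label word (for example "Total") and take the nearest word to its right on the same row. This assumes the OCR output has already been parsed into (text, (left, top, right, bottom)) tuples; `find_value_right_of` and `row_tolerance` are hypothetical names, and the sample coordinates are made up. It is not how FlexiLayouts works, just a simple heuristic.

```python
def find_value_right_of(label, words, row_tolerance=10):
    """Return the text of the nearest word to the right of `label`.

    words: [(text, (left, top, right, bottom)), ...]
    A word is on the same row if its vertical midpoint is within
    row_tolerance pixels of the label's vertical midpoint.
    Only single-word labels are handled.
    """
    for text, (left, top, right, bottom) in words:
        if text.strip(":").lower() != label.lower():
            continue
        label_mid = (top + bottom) / 2
        candidates = [
            (w_left, w_text)
            for w_text, (w_left, w_top, w_right, w_bottom) in words
            if w_left >= right
            and abs((w_top + w_bottom) / 2 - label_mid) <= row_tolerance
        ]
        if candidates:
            # Smallest left edge means closest word to the right of the label.
            return min(candidates)[1]
    return None


if __name__ == "__main__":
    # Two invoices from one supplier; the total sits at different heights.
    invoice_a = [("Total:", (300, 700, 360, 720)),
                 ("1,250.00", (380, 701, 460, 721)),
                 ("Thanks", (100, 760, 170, 780))]
    invoice_b = [("Total:", (310, 842, 370, 862)),
                 ("980.50", (392, 840, 455, 861))]
    print(find_value_right_of("total", invoice_a))
    print(find_value_right_of("total", invoice_b))
```

Because the lookup is relative to the label, it still finds the value when extra product lines push the total further down the page, although each supplier's label wording would still need to be configured.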

Here is what I have learnt so far, in answer to my own questions:

Q: Is it realistic to implement my own templating type functionality? Would that generally be expected functionality?

A: Not really. The effort involved would not be practical. I have been advised that there is other ABBYY technology (FlexiLayouts?) that achieves this, but it is not available via the Cloud OCR SDK.

Q: If so, is it possible to use some kind of tolerance for the coordinates - or do they have to be exact?

A: Not applicable, given the answer to the first question.

Q: Or, is this too ambitious a goal given the methods available in the Cloud OCR SDK currently?

A: From what I have been led to believe, I would conclude that yes, it is currently too ambitious.

answered 14 Aug '15, 12:44

philhawthorn
114

edited 14 Aug '15, 15:33

Oksana Serdyuk ♦♦
1.5k16


Asked: 10 Aug '15, 22:05

Seen: 1,097 times

Last updated: 14 Aug '15, 15:33

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal