We are testing the ABBYY Cloud OCR API with the demo images and some of our own test documents, using a trial account. From the sample code and documentation, we observed that once a processing request is submitted to the ABBYY server, the client is expected to wait at least 2 seconds before requesting the task status; otherwise an "Invalid Task" exception is thrown.
But we found this behavior inconsistent: for a slightly larger document (14 pages), even a 5-second wait was not enough. Could you let us know the best way to deal with this?
Also, what will the wait times for tasks be in the production (paid) version? Will they be the same or shorter? Is overall document processing faster in production than in the trial environment? If so, how much faster?
Sample code we used:
asked 23 Nov '16, 00:33
ABBYY Cloud OCR SDK runs on Windows Azure in a datacenter located in the Netherlands. There is no difference between the paid and trial versions. It is an online service and works in exactly the same way in both cases.
The processing time depends on many factors, such as the image you recognize, the settings you use and the internet connection speed. It is calculated in the following way:
Processing time = Time for transferring a file over the Internet + Time in queue + Time for OCR
Note that if you process a multipage document, the pages are recognized sequentially, so the processing time of such a document roughly equals the processing time of one page multiplied by the number of pages. So it is expected that multipage documents take longer to process.
To identify why recognition of your images takes a long time, please first review the following FAQ section: "My images are processed too slow, what's wrong?"
And to get additional recommendations from us, please send several images for our tests, together with the processing settings you used, to CloudOCRSDK@abbyy.com.
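As a rough illustration of the formula above, here is a minimal Python sketch. The function name and every number in it are made up for the example; they are not measurements of the service. It also assumes that only the OCR term scales with the page count, which is one reading of the note about multipage documents.

```python
def estimate_processing_seconds(transfer_s: float, queue_s: float,
                                ocr_per_page_s: float, pages: int) -> float:
    """Processing time = transfer time + time in queue + time for OCR.

    Assumption: pages are recognized one after another, so the OCR term
    is approximated as (time for one page) * (number of pages).
    """
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return transfer_s + queue_s + ocr_per_page_s * pages


# Purely illustrative figures for a 14-page document.
one_page = estimate_processing_seconds(3.0, 2.0, 1.5, pages=1)
fourteen_pages = estimate_processing_seconds(3.0, 2.0, 1.5, pages=14)
print(one_page, fourteen_pages)
```

With these placeholder values the single page comes out at 6.5 seconds and the 14-page document at 26.0 seconds, which shows why a fixed wait tuned on single-page images can be too short for longer documents.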
answered 24 Nov '16, 15:29
Oksana Serdyuk ♦♦
Our question was not about overall performance, but specifically about this REST API call: http(s)://cloud.ocrsdk.com/getTaskStatus. If this call returns an HTTP response code that is not 200/401/407, with an error saying "Task Invalid", what does that mean?
The task ID that we send as a parameter to getTaskStatus is valid, because it was obtained from a previous call to processImage/processDocument.
So the question is: if processImage/processDocument has already given us a valid task ID and we pass that same ID to getTaskStatus, why would the API call fail?
It seems to us that this API call fails when we query too early (say, after a 3-second delay), but when we query after a 5- or 7-second delay, we get a proper response. Why does this happen?
Why wouldn't getTaskStatus return "Queued", "Submitted", or one of the other valid statuses?
What is a safe delay before we can call getTaskStatus without receiving an error?
answered 25 Nov '16, 00:05
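Until a safe delay is confirmed, one client-side workaround is to stop relying on a fixed wait and poll with an increasing delay, treating an early error as "not ready yet" up to an overall timeout. The following is only a sketch under stated assumptions, not ABBYY's recommended method: `fetch_status` is a hypothetical callable that you would implement around getTaskStatus, `TaskStatusError` is a hypothetical exception it raises when the call is rejected, and the set of terminal status names is an assumption to check against the API documentation.

```python
import time
from typing import Callable, FrozenSet

# Assumed terminal status names; verify against the service documentation.
TERMINAL = frozenset({"Completed", "ProcessingFailed", "Deleted", "NotEnoughCredits"})


class TaskStatusError(Exception):
    """Hypothetical error raised by fetch_status when the server rejects the call."""


def wait_for_task(fetch_status: Callable[[str], str], task_id: str, *,
                  initial_delay: float = 2.0, max_delay: float = 30.0,
                  timeout: float = 600.0, terminal: FrozenSet[str] = TERMINAL,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the task reaches a terminal status, doubling the delay each round."""
    delay = initial_delay
    deadline = clock() + timeout
    while True:
        sleep(delay)  # always wait before asking, including the first time
        try:
            status = fetch_status(task_id)
        except TaskStatusError:
            status = None  # treat an early rejection as "not ready yet"
        if status in terminal:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout} s")
        delay = min(delay * 2, max_delay)  # exponential backoff with a cap
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting. A persistent error is not retried forever: once the timeout passes, the function raises `TimeoutError`, so a genuinely invalid task ID still surfaces as a failure.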