We are testing the ABBYY Cloud OCR API with demo images and some of our own test documents, using a trial account. From the sample code and documentation, we understood that once a processing request is submitted to the ABBYY server, the code is expected to wait at least 2 seconds before requesting the task status; otherwise an "Invalid Task" exception is thrown.

But we found this behavior inconsistent: for a slightly larger document (14 pages), even a 5-second wait was not enough. Could you let us know the best way to deal with this?

Also, we would like to know what the wait times (for tasks) will be in the production/paid version. Will they be the same or shorter? Is overall document processing faster in production than in the trial environment? If yes, how much faster?

Sample code we used:

    // initialize abbyy client
    RestClient restClient = new RestClient();
    restClient.serverUrl = abbyyServerURL;
    restClient.applicationId = appId;
    restClient.password = password;

    ProcessingSettings.OutputFormat outputFormat = outputFormatByFileExt(outputFile);

    // prepare settings object for processing
    ProcessingSettings settings = new ProcessingSettings();
    settings.setLanguage(recognitionLang);
    settings.setOutputFormat(outputFormat);

    Task task = null;
    System.out.println("Uploading file... " + inputFile);
    task = restClient.processImage(inputFile, settings);

    System.out.println("Got task Id: " + task.Id);
    Thread.sleep(5000);
    task = restClient.processDocument(task.Id, settings);

    waitAndDownloadResult(task, outputFile, restClient);

asked 23 Nov '16, 00:33

Choon_L

ABBYY Cloud OCR SDK runs on Windows Azure in a datacenter located in the Netherlands. There is no difference between the paid and trial versions: it is an online service and works in exactly the same way in either case.

The processing time depends on many factors, such as the image you recognize, the settings you use and the internet connection speed. It is calculated in the following way:

Processing time = Time for transferring a file over the Internet + Time in queue + Time for OCR

Note that if you process a multipage document, the pages are recognized sequentially, so the processing time of such a document roughly equals the processing time of one page multiplied by the number of pages. It is therefore expected that multipage documents take longer to process.
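
As a rough illustration of the formula above, the following sketch adds up the three components, scaling the OCR part by the page count. The class name, method name, and all numbers are hypothetical placeholders chosen for the example, not measured values or part of any ABBYY API.

```java
public class ProcessingTimeEstimate {

    /**
     * Rough estimate following the formula in the post:
     * transfer time + queue time + (per-page OCR time * number of pages).
     * All inputs are hypothetical values in milliseconds.
     */
    static long estimateMillis(long transferMs, long queueMs, long ocrPerPageMs, int pages) {
        if (pages < 1) {
            throw new IllegalArgumentException("pages must be >= 1");
        }
        return transferMs + queueMs + ocrPerPageMs * pages;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 1 s transfer, 0.5 s in queue, 3 s OCR per page.
        long onePage = estimateMillis(1_000, 500, 3_000, 1);
        long fourteenPages = estimateMillis(1_000, 500, 3_000, 14);
        System.out.println("1 page:   " + onePage + " ms");
        System.out.println("14 pages: " + fourteenPages + " ms");
    }
}
```

With these placeholder inputs, the OCR term dominates for the 14-page document, which is consistent with a fixed short wait being sufficient for one page but not for a longer file.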

To identify why your images are recognized slowly, please first review the following FAQ section: My images are processed too slow, what's wrong?

To get additional recommendations from us, please send several images for our tests, along with the processing settings you used, to CloudOCRSDK@abbyy.com.


answered 24 Nov '16, 15:29

Oksana Serdyuk ♦♦

Our question was not about overall performance, but specifically about this REST API call: http(s)://cloud.ocrsdk.com/getTaskStatus. If this call returns an HTTP response code other than 200/401/407, with an error saying "Task Invalid", what does that mean?

The task ID that we send as a parameter to the getTaskStatus method is valid, because it is obtained from previous calls to the processImage/processDocument methods.

So the question is: if processImage/processDocument have already given us a valid task ID to query with, and we use that same ID in getTaskStatus, why would the API call fail?

It seems to us that this API call fails when we query too early (say, after a 3-second delay), but when we query after a 5- or 7-second delay, we get a proper response. Why does this happen?

Why wouldn't getTaskStatus return "Queued", "Submitted", or one of the other valid statuses?

What is a safe delay time before we can call getTaskStatus without receiving an error on this API call?
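
Until a safe delay is confirmed, one possible alternative to a fixed `Thread.sleep` is to retry the status request with an increasing delay. The following is only a minimal sketch under that assumption: `TaskPoller` and `pollWithBackoff` are hypothetical names, not part of the ABBYY sample client, and the status request is passed in as a generic `Callable` so that no ABBYY method signature is assumed. Whether retrying is the right response to "Invalid Task" is itself an assumption.

```java
import java.util.concurrent.Callable;

public class TaskPoller {

    /**
     * Calls fetch until it succeeds, sleeping between failed attempts.
     * The delay starts at initialDelayMs and doubles after each failure,
     * capped at maxDelayMs. Hypothetical helper, not an ABBYY API.
     */
    static <T> T pollWithBackoff(Callable<T> fetch, int maxAttempts,
                                 long initialDelayMs, long maxDelayMs) throws Exception {
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetch.call();
            } catch (InterruptedException e) {
                // Do not swallow interruption; let the caller handle it.
                throw e;
            } catch (Exception e) {
                last = e; // remember the failure and retry
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delay);
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
        throw new Exception("Giving up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real status request: fails twice, then succeeds.
        int[] calls = {0};
        String status = pollWithBackoff(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("Invalid Task");
            }
            return "Queued";
        }, 5, 100, 1_000);
        System.out.println("Status after " + calls[0] + " calls: " + status);
    }
}
```

In real code, the lambda would wrap whatever call retrieves the task status, and the attempt limit and delays would need tuning to the document sizes involved.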


answered 25 Nov '16, 00:05

Choon_L

OK, now I see. Could you please send a Fiddler log file (or the output of any equivalent HTTP debugger) demonstrating this issue? Meanwhile, I will ask the developers.

(25 Nov '16, 15:28) Oksana Serdyuk ♦♦

© 2016 ABBYY. All rights Reserved. www.ABBYY.com | Privacy Policy | Legal