1) Does ABBYY Cloud guarantee a constant response time for a given document at any point in time? To elaborate: say I have a PDF document (xyz.pdf) about 5 MB in size. If I upload it to ABBYY at 6 in the morning and receive the PDF conversion in two minutes, will it take the same two minutes if I upload it at 6 pm? Will it also be the same if the document is one of 10,000 PDFs that I submit for OCR in parallel? If the time is not constant, what kind of deviation could I expect? Would the document above take 3 minutes at peak times, or longer?
2) What factors determine how long OCR takes for a document? Does it depend on the size of the document, or on its content? If all the documents I upload are purely textual (scanned images of text without any graphics), would they take less time? What is the maximum time any document can take? Is it 10 minutes, or 15 minutes?
asked 09 Nov '16, 22:36
Hello Choon L,
Re 1) Because this is a cloud-based service with a varying processing load and different document types, it is not really possible to guarantee a certain processing time for a single document. On a statistical level, however, you will have good numbers after processing a few hundred or a few thousand documents.
The overall service is built so that it scales up automatically when the number of incoming tasks increases. You can find more details on scalability on the ABBYY Technology Portal: https://abbyy.technology/en:products:cloud-ocr:cloud-scalability
The OCR back end also publishes its current status at http://status.ocrsdk.com/; further details and explanations are documented here: https://abbyy.technology/en:products:cloud-ocr:cloud-service-status-indicators
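To make the statistical point under 1) concrete, here is a minimal sketch of how you could summarize your own measurements once you have timed a batch of documents. It is not part of any ABBYY SDK; the function name `summarize_durations` and the sample numbers are made up for illustration, and it only assumes you have recorded one processing time in seconds per document.

```python
import statistics


def summarize_durations(seconds):
    """Summarize observed processing times (one value per document, in seconds)."""
    if len(seconds) < 2:
        raise ValueError("need at least two measurements")
    ordered = sorted(seconds)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    # method="inclusive" keeps the estimate within the observed range.
    p95 = statistics.quantiles(ordered, n=20, method="inclusive")[-1]
    return {
        "median": statistics.median(ordered),
        "p95": p95,
        "max": ordered[-1],
    }


# Hypothetical measurements for five uploads of similar documents.
summary = summarize_durations([100, 110, 120, 130, 140])
print(summary)
```

Comparing the median with the 95th percentile and the maximum, separately for off-peak and peak submissions, gives you the deviation you asked about for your own document mix.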
Re 2) Multiple factors influence the processing time: image/text quality, document layout complexity, number of pages, processing task, etc. Here is a collection of articles that give more insight: https://abbyy.technology/en:features:ocr:quality-speed
Both the Cloud OCR and the FineReader Engine OCR SDK offer pre-tuned profiles, which you should test: http://ocrsdk.com/documentation/specifications/processing-profiles/
The maximum time? That is hard to say. In theory it can be several hours or even a day (e.g. XXL PDF documents with 5,000+ pages, or large construction plans). With the on-premises SDKs or Recognition Server you have different settings to define a timeout. In the cloud, the back end will probably stop the task and return an error or more information in the job status: http://ocrsdk.com/documentation/specifications/task-statuses/
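Since there is no fixed upper bound on the cloud side, you may want your own client-side limit. The following is only a sketch under assumptions: `get_status` is a hypothetical callable that you would implement yourself to ask the service for the task status, and the status names in `TERMINAL` are my assumption of what the task-status page linked above lists, so please verify them there.

```python
import time

# Assumed terminal status names; verify against the task-status documentation.
TERMINAL = {"Completed", "ProcessingFailed", "Deleted", "NotEnoughCredits"}


def wait_for_task(get_status, timeout_s=900.0, poll_s=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until a terminal status or until timeout_s has elapsed.

    get_status is a hypothetical callable returning the status as a string.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout_s} s")
        sleep(poll_s)


# Usage with a fake status source standing in for the real service call.
fake = iter(["Queued", "InProgress", "Completed"])
print(wait_for_task(lambda: next(fake), sleep=lambda s: None))
```

The timeout here only stops your client from waiting; it does not cancel the task on the server, so pick a value based on the processing times you have measured for your documents.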
I hope the linked articles give you enough background info.
answered 10 Nov '16, 11:06