This article explains the factors that determine the licensing and hardware specification requirements for high-volume Umango installations.
Many factors affect processing times for documents handled by Umango. For this reason, it is not practical to offer a one-size-fits-all formula for calculating server size and the number of Umango document processors required. Many of the factors that determine processing speed are not easily measured. We therefore recommend running throughput tests using typical document types, at the volumes expected during peak periods, on the server specification you anticipate using. Once throughput is determined, it is a matter of doing the math based on the outcomes.
A very important point in this equation is that Umango processes documents in batches, and it is each batch that is assigned its own document processor. 8 documents processed in 1 batch (regardless of the number of pages) will consume 1 Umango processor (regardless of the number licensed). However, 8 documents processed in 8 batches will consume up to 8 processors simultaneously (if at least 8 processors are licensed). When a batch is ready to be processed and no processors are available, the batch is added to a queue and waits until a license becomes available. The key takeaway: don't confuse Umango document processors with processing threads. They are not the same thing. Adding a Umango processor will typically not increase the processing speed of each individual document. Processors increase bandwidth, not processing power (although there can be a marginal increase in some configurations).
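The batch-to-processor behaviour described above can be illustrated with a small model. This is a minimal sketch, not Umango code: the function names are hypothetical, and it assumes that all batches arrive at the same time, that each batch holds one processor for its full duration, that queued batches are served in submission order, and that the server has enough resources for batches to run in parallel without slowing each other down.

```python
import heapq


def simulate(batch_durations, licensed_processors):
    """Return a (start, finish) pair for each batch, in submission order.

    Hypothetical model: every batch arrives at time 0, occupies exactly one
    processor for its whole duration, and waits in a queue when no
    processor is free.
    """
    if licensed_processors < 1:
        raise ValueError("at least one licensed processor is required")

    # Min-heap holding the time at which each licensed processor is next free.
    free_at = [0.0] * licensed_processors
    heapq.heapify(free_at)

    schedule = []
    for duration in batch_durations:
        start = heapq.heappop(free_at)   # earliest available processor
        finish = start + duration
        heapq.heappush(free_at, finish)  # processor is busy until 'finish'
        schedule.append((start, finish))
    return schedule


def makespan(batch_durations, licensed_processors):
    """Total time until the last batch has finished."""
    return max(finish for _, finish in simulate(batch_durations, licensed_processors))


# Assumed figure for illustration only: each document takes 10 seconds.
one_batch_of_eight = [80.0]        # 8 documents in 1 batch
eight_batches_of_one = [10.0] * 8  # 8 documents in 8 batches

print(makespan(one_batch_of_eight, 8))    # one processor used, seven idle
print(makespan(eight_batches_of_one, 8))  # all eight processors used
print(makespan(eight_batches_of_one, 4))  # four batches wait in the queue
```

In this model the time taken by any single batch never changes as processors are added. Only the time spent waiting in the queue shrinks, which is the sense in which processors add bandwidth rather than processing power.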
Bear in mind that some processor consumption cannot be resolved by increasing the document processor count. For example, in a device-based scanning scenario, a customer may have 50 devices and regularly have users standing at all 50 devices, scanning and validating data simultaneously. In this example the processor consumption is demand based and cannot be queued. In such cases the load can only be handled by adding server resources, so that the Umango server is not overwhelmed. The OCR process is extremely CPU intensive and can easily consume large amounts of server resources.
The net effect of the points above is that your calculations should be based on periods of peak document volume, not on how long it takes to process a single document.
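As a rough illustration of "doing the math" for queued batch workloads, the sketch below estimates a processor count from a measured throughput test. It is not an official Umango formula: the function name, the 80% target utilisation and the example figures are assumptions for illustration, and it does not cover demand-based workloads such as device scanning, which depend on server resources rather than processor count.

```python
import math


def processors_needed(peak_batches_per_hour, measured_seconds_per_batch,
                      target_utilisation=0.8):
    """Estimate the document processors needed to keep up at peak.

    peak_batches_per_hour      -- batches submitted during the busiest hour
    measured_seconds_per_batch -- average batch time from your own throughput
                                  test on the intended server specification
    target_utilisation         -- assumed headroom factor (0.8 leaves 20% spare)
    """
    if not 0 < target_utilisation <= 1:
        raise ValueError("target_utilisation must be in the range (0, 1]")

    # Total processor-seconds of work arriving in the peak hour.
    busy_seconds = peak_batches_per_hour * measured_seconds_per_batch

    # Seconds of work one processor can absorb per hour at the target load.
    capacity_per_processor = 3600 * target_utilisation

    return math.ceil(busy_seconds / capacity_per_processor)


# Assumed example figures: 600 batches in the peak hour, 45 seconds per batch.
print(processors_needed(600, 45))
```

The result is only as good as the measured batch time, so the throughput test should use typical documents and run on the server specification you intend to deploy.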
If you'd like to better understand document processors and their use within Umango, you may find it helpful to read this knowledge article.
Link to this article http://umango.com/KB?article=108