Most commercial installations have a requirement for batch processing. Batch processing is typically done at night, after the daily online workload has diminished. This is usually the practice for two reasons: to consolidate the day's transactions into reports, and to prevent batch workloads from impacting the online load.
See Examples for an illustration of a hierarchy to control the environment in which batch jobs are run; this section also covers the Solaris Resource Manager commands used in the process.
Batch workloads are a hybrid between online transaction processing (OLTP) and decision support system (DSS) workloads, and their effect on the system lies somewhere between the two. A batch workload can consist of many repetitive transactions to a database, with some heavy computational work for each. A simple example would be the calculation of total sales for the day. In this case, the batch process would retrieve every sales transaction for the day from the database, extract the sales amount, and maintain a running sum.
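The daily sales total described above can be sketched as a short program. This is only an illustration, not part of Solaris Resource Manager: the table name sales, its columns sale_date and amount, and the in-memory SQLite database are all assumptions chosen to keep the example self-contained.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory database with a hypothetical 'sales' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("2024-01-02", 10.0), ("2024-01-02", 5.5), ("2024-01-03", 99.0)],
    )
    conn.commit()
    return conn


def total_sales_for_day(conn, day):
    """Retrieve every sale for one day and keep a running sum of the amounts."""
    running_total = 0.0
    cursor = conn.execute(
        "SELECT amount FROM sales WHERE sale_date = ?", (day,)
    )
    # One row is fetched per transaction, so I/O grows with the day's volume.
    for (amount,) in cursor:
        running_total += amount
    return running_total


if __name__ == "__main__":
    conn = make_demo_db()
    print(total_sales_for_day(conn, "2024-01-02"))
```

The loop deliberately fetches each row rather than asking the database for SUM(amount), because it mirrors the retrieve, extract, and accumulate pattern described in the text. Each retrieval costs both CPU and database I/O.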
Batch processing typically places high demands on both processor and I/O resources: the batch process and the database each consume a large amount of CPU, and every transaction retrieved generates many I/O operations against the backend database.
A batch workload is controlled effectively by limiting its rate of consumption of both CPU and I/O. Solaris Resource Manager allows fine-grained resource control of CPU, but I/O resources must be managed by allocating different I/O devices to each workload.
Two methods are typically used to isolate batch resource impact:
Make a copy of the database on a separate system and run the batch and reporting workloads on that separate system. (Note, however, that in most situations, the batch process updates parts of the online database and cannot be separated from it.)
Use CPU resource control.
Because the amount of I/O generated by a batch workload is proportional to the amount of CPU it consumes, limits on CPU cycles can be used to control the I/O rate of the batch workload indirectly. Note, however, that care must be taken to ensure that excessive I/O is not generated by workloads that have very light CPU requirements.
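The proportionality argument can be made concrete with a rough model. The function below is a sketch under a strong assumption: that a workload issues a fixed number of I/O operations per CPU second it consumes. The parameter names and all figures are hypothetical and would have to be measured on a real system.

```python
def estimated_io_rate(cpu_fraction, io_per_cpu_second, cpus=1):
    """Estimate I/O operations per second for a workload under a CPU cap.

    cpu_fraction      -- share of total CPU the workload may consume (0.0 to 1.0)
    io_per_cpu_second -- assumed I/O operations issued per CPU second consumed
    cpus              -- number of processors in the system
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0.0 and 1.0")
    # Linear model: halving the CPU available halves the I/O generated.
    return cpu_fraction * cpus * io_per_cpu_second


if __name__ == "__main__":
    # A CPU-heavy batch job: a 25% cap on 4 CPUs at 2000 I/Os per CPU second.
    print(estimated_io_rate(0.25, 2000, cpus=4))
    # A CPU-light job at the same cap on 1 CPU, but 40000 I/Os per CPU second.
    print(estimated_io_rate(0.25, 40000, cpus=1))
```

The second call illustrates the caveat in the text: a workload that needs very little CPU per I/O can still generate a high I/O rate under a tight CPU limit, so a CPU cap alone may not protect the I/O subsystem.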
By definition, a batch workload is a workload that runs unconstrained, and it will attempt to complete in the shortest time possible. This makes batch the worst resource consumer, because it will take all the resources it needs until it is constrained by a system bottleneck (generally the smallest dataflow point in the system).
Batch presents two problems for system managers: it can impact other batch jobs running concurrently, and it can never be run together with the online portion of the workload during business hours.
Even if the batch jobs are scheduled to run during off-hours, for example, from 12:00 a.m. to 6:00 a.m., a system problem or a day of high sales could cause the batch workload to spill over into business hours. Although not quite as bad as downtime, having a batch workload still running at 10:30 a.m. the next day could make online customers wait several minutes for each transaction, ultimately leading to fewer transactions.
Using resource allocation will limit the amount of resources available to the batch workloads and constrain them in a controlled manner.
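The effect of such an allocation can be sketched with a simple proportional-share calculation. This is a simplified model, not the Solaris Resource Manager scheduler itself: it assumes that CPU is divided among the currently active workloads in proportion to their shares, and the workload names and share values are hypothetical.

```python
def cpu_entitlements(shares, active):
    """Split CPU among active workloads in proportion to their shares.

    shares -- mapping of workload name to its allocated shares
    active -- names of the workloads currently competing for CPU
    Returns a mapping of workload name to its fraction of the CPU.
    """
    total = sum(shares[name] for name in active)
    if total == 0:
        return {name: 0.0 for name in active}
    return {name: shares[name] / total for name in active}


if __name__ == "__main__":
    shares = {"online": 80, "batch": 20}  # hypothetical share values
    # Business hours: both workloads compete, so batch is held to its share.
    print(cpu_entitlements(shares, ["online", "batch"]))
    # Overnight: only batch is active, so in this model it may use the whole CPU.
    print(cpu_entitlements(shares, ["batch"]))
```

Under this model, a batch job that spills over into business hours is constrained to its proportional share while the online workload is active, instead of competing with it for the whole machine.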