Some key concepts that affect performance tuning are:
- User load
- Application scalability
- Margins of safety
The following table describes these concepts and how they are measured in practice. The leftmost column names the general concept, the second column gives its practical ramifications, the third column describes the measurements, and the rightmost column describes the value sources.
Table 1–2 Factors That Affect Performance

| Concept | In practice | Measurement | Value sources |
|---|---|---|---|
| User load | Concurrent sessions at peak load | Transactions Per Minute (TPM); Web Interactions Per Second (WIPS) | (Max. number of concurrent users) * (expected response time) / (time between clicks). Example: (100 users * 2 sec) / 10 sec = 20 |
| Application scalability | Transaction rate measured on one CPU | TPM or WIPS | Measured from workload benchmark. Perform at each tier. |
| Vertical scalability | Increase in performance from additional CPUs | Percentage gain per additional CPU | Based on curve fitting from benchmark. Perform tests while gradually increasing the number of CPUs. Identify the “knee” of the curve, where additional CPUs are providing uneconomical gains in performance. Requires tuning as described in this guide. Perform at each tier and iterate if necessary. Stop here if this meets performance requirements. |
| Horizontal scalability | Increase in performance from additional servers | Percentage gain per additional server process and/or hardware node. | Use a well-tuned single application server instance, as in previous step. Measure how much each additional server instance and hardware node improves performance. |
| Margins of safety | High availability requirements | If the system must cope with failures, size the system to meet performance requirements assuming that one or more application server instances are nonfunctional | Different equations used if high availability is required. |
| Margins of safety | Excess capacity for unexpected peaks | It is desirable to operate a server at less than its benchmarked peak, for some safety margin | 80% system capacity utilization at peak loads may work for most installations. Measure your deployment under real and simulated peak loads. |
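The user load formula and the 80% utilization guideline from the table can be sketched in a few lines of Python. This is only an illustration: the function names `concurrent_sessions` and `capacity_with_headroom` are hypothetical, and the 0.80 default is the table's rule of thumb, not a measured value for any particular deployment.

```python
def concurrent_sessions(max_users, response_time_s, think_time_s):
    """User load per the table:
    (max concurrent users * expected response time) / (time between clicks)."""
    if think_time_s <= 0:
        raise ValueError("time between clicks must be positive")
    return max_users * response_time_s / think_time_s


def capacity_with_headroom(peak_load, utilization=0.80):
    """Capacity to provision so that peak load consumes only
    `utilization` of it (0.80 is the table's rule of thumb)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return peak_load / utilization


# The example from the table: 100 users, 2 sec response, 10 sec between clicks.
sessions = concurrent_sessions(100, 2, 10)
print(sessions)  # 20.0
print(capacity_with_headroom(sessions))
```

The second call shows how much benchmarked capacity would be needed for that peak load to stay at 80% utilization. Real deployments should still be measured under real and simulated peak loads.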
The previous discussion guides you towards defining a deployment architecture. However, you determine the actual size of the deployment by a process called capacity planning. Capacity planning enables you to predict:
- The performance capacity of a particular hardware configuration.
- The hardware resources required to sustain specified application load and performance.
You can estimate these values through careful performance benchmarking, using an application with realistic data sets and workloads.
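As a rough illustration of the second prediction, the sketch below estimates how many server nodes a given peak load might require. It is not a formula from this guide. It assumes that per-node throughput and a horizontal scaling efficiency factor have already been benchmarked, that capacity scales roughly linearly with node count, and that spare nodes are simply added to cover failed instances. The function name `nodes_required` and all of its parameters are hypothetical.

```python
import math


def nodes_required(peak_tpm, node_tpm, scaling_efficiency=1.0,
                   utilization=0.80, spare_nodes=0):
    """Estimate the node count for a peak transaction rate.

    Assumptions (not from the guide): throughput scales linearly with
    node count, reduced by a benchmarked `scaling_efficiency`; each node
    runs at no more than `utilization` of its benchmarked peak; and
    `spare_nodes` extra nodes cover instances that may be down.
    """
    if peak_tpm <= 0 or node_tpm <= 0:
        raise ValueError("rates must be positive")
    usable_tpm = node_tpm * scaling_efficiency * utilization
    return math.ceil(peak_tpm / usable_tpm) + spare_nodes


# Illustrative placeholder numbers, not benchmark results.
print(nodes_required(10_000, 3_000, scaling_efficiency=0.9, spare_nodes=1))  # 6
```

Treat the result as a starting point for benchmarking, not as a substitute for it.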
To Determine Capacity

1. Determine performance on a single CPU.

   First determine the largest load that a single processor can sustain. You can obtain this figure by measuring the performance of the application on a single-processor machine. Either leverage the performance numbers of an existing application with similar processing characteristics or, ideally, use the actual application and workload in a testing environment. Make sure that the application and data resources are tiered exactly as they would be in the final deployment.

2. Determine vertical scalability.

   Determine how much additional performance you gain when you add processors. That is, you are indirectly measuring the amount of shared resource contention that occurs on the server for a specific workload. Either obtain this information based on additional load testing of the application on a multiprocessor system, or leverage existing information from a similar application that has already been load tested.

   Running a series of performance tests on one to eight CPUs, in incremental steps, generally provides a sense of the vertical scalability characteristics of the system. Be sure to properly tune the application, Application Server, backend database resources, and operating system so that they do not skew the results.

3. Determine horizontal scalability.

   If sufficiently powerful hardware resources are available, a single hardware node may meet the performance requirements. However, for better availability, you can cluster two or more systems. Employing external load balancers and workload simulation, determine the performance benefits of replicating one well-tuned application server node, as determined in step 2.
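The vertical scalability step can be illustrated with a small Python sketch that computes the percentage gain per additional CPU from a series of benchmark runs and picks the point after which gains fall below a chosen threshold. The sample throughput figures are made-up placeholders, not measured results. The threshold `min_gain_pct` and the function names are hypothetical, and what counts as an "uneconomical" gain is a judgment for each deployment.

```python
def marginal_gains(throughput_by_cpus):
    """Percentage throughput gain per added CPU between consecutive runs.

    `throughput_by_cpus` maps CPU count to measured throughput (TPM or WIPS).
    """
    counts = sorted(throughput_by_cpus)
    gains = {}
    for prev, cur in zip(counts, counts[1:]):
        before = throughput_by_cpus[prev]
        after = throughput_by_cpus[cur]
        # Relative gain over the previous run, spread over the CPUs added.
        gains[cur] = 100.0 * (after - before) / before / (cur - prev)
    return gains


def find_knee(throughput_by_cpus, min_gain_pct):
    """Largest CPU count reached before the per-CPU gain drops
    below `min_gain_pct` (an assumed, deployment-specific threshold)."""
    counts = sorted(throughput_by_cpus)
    gains = marginal_gains(throughput_by_cpus)
    knee = counts[0]
    for count in counts[1:]:
        if gains[count] < min_gain_pct:
            break
        knee = count
    return knee


# Placeholder benchmark data: CPU count -> throughput.
sample = {1: 100, 2: 190, 4: 340, 8: 400}
print(find_knee(sample, min_gain_pct=10.0))  # 4
```

The same comparison could be applied to horizontal scalability by replacing CPU counts with server instance or node counts.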
Application end-users generally have some performance expectations. Often you can numerically quantify them. To ensure that customer needs are met, you must understand these expectations clearly, and use them in capacity planning.
Consider the following questions regarding performance expectations:
What do users expect the average response times to be for various interactions with the application? What are the most frequent interactions? Are there any extremely time-critical interactions? What is the length of each transaction, including think time? In many cases, you may need to perform empirical user studies to get good estimates.
What are the anticipated steady-state and peak user loads? Are there any particular times of the day, week, or year when you observe or expect to observe load peaks? While there may be several million registered customers for an online business, at any one time only a fraction of them are logged in and performing business transactions. A common mistake during capacity planning is to use the total size of the customer population as the basis, rather than the average and peak numbers of concurrent users. The number of concurrent users may also exhibit patterns over time.
What are the average and peak amounts of data transferred per request? These values are also application-specific. Good estimates for content size, combined with other usage patterns, will help you anticipate network capacity needs.
What is the expected growth in user load over the next year? Planning ahead will help you avoid crisis situations and system downtime for upgrades.
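Answers to the last two questions translate into simple arithmetic. The sketch below, with hypothetical function names and placeholder inputs, estimates network bandwidth from the request rate and the data transferred per request, and projects peak user load under an assumed compound annual growth rate. Real traffic is rarely this uniform, so treat the results as first-order estimates.

```python
def bandwidth_mbps(requests_per_second, bytes_per_request):
    """Approximate network bandwidth in megabits per second
    (1 Mbit = 1,000,000 bits), ignoring protocol overhead."""
    return requests_per_second * bytes_per_request * 8 / 1_000_000


def projected_peak_users(current_peak_users, annual_growth_rate, years=1):
    """Peak concurrent users after `years`, assuming compound growth
    (annual_growth_rate=0.5 means 50% growth per year)."""
    return current_peak_users * (1 + annual_growth_rate) ** years


# Placeholder inputs: 200 requests/sec at 50,000 bytes per request.
print(bandwidth_mbps(200, 50_000))        # 80.0
# Placeholder inputs: 1,000 peak users growing 50% over one year.
print(projected_peak_users(1_000, 0.5))   # 1500.0
```

Run the calculation with both average and peak values, since the peak figures usually drive network and hardware sizing.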