Capacity Planning and Performance Tuning


Capacity Planning Process

The capacity planning process involves several activities. The following sections describe these activities:

  • Design the WLI Application
  • Tune the Environment
  • Prepare the Application for Performance Testing
  • Design the Workload
  • Define the Unit of Work and SLA
  • Design the Load Generation Script
  • Configure the Test Environment
  • Run Benchmark Tests
  • Run Scalability Tests
  • Estimate Resource Requirement

Note: The tests described in this guide were conducted in a controlled environment; the numbers presented here may not match the results that you get when you run the tests in your environment. The numbers are meant to illustrate the capacity planning process.

 


Design the WLI Application

Following are some of the performance-related design issues that architects and developers must keep in mind while designing WLI applications:

Note: For more information about design considerations that may affect performance, see Best Practices for WebLogic Integration and WLI Tuning.

 


Tune the Environment

Performance of a WLI application depends not just on the design of the application, but also on the environment in which it runs.

The environment includes the WLI server, the database, the operating system and network, and the JVM. All of these components must be tuned appropriately to extract good performance from the system.

 


Prepare the Application for Performance Testing

You may need to make minor changes to the application so that it can be run in performance tests and invoked through load generator scripts.

The extent of change depends on the nature of the application, the capabilities of the load generator, and the outcome expected from the capacity planning process.

Following are examples of the changes that may be required:

 


Design the Workload

The quality of the results of any performance test depends on the workload used.

The workload is the amount of processing that the system is expected to complete. It consists of the applications running in the system and the number of users connecting to and interacting with those applications.

The workload must be designed to match the production workload as closely as possible.

The following parameters must be considered while designing the workload:

The next step is to define the unit of work and SLA.

 


Define the Unit of Work and SLA

A Service Level Agreement (SLA) is a contract between the service provider and the service consumer that defines acceptable (and unacceptable) levels of service. The SLA is typically defined in terms of response time or throughput (transactions per second).

For the purpose of capacity planning, it is important to define the unit of work (that is, the set of activities included in each transaction) before using it to define the SLA.
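
For illustration, an SLA stated in terms of response time and throughput can be checked as in the following Java sketch. The class name, thresholds, and method are hypothetical and are not part of WLI.

// Hypothetical SLA check. The class name, thresholds, and unit of work are
// illustrative and not part of WLI; they simply show an SLA expressed in
// terms of response time and throughput.
public class SlaCheck {
    static final double MAX_AVG_RESPONSE_MS = 2000.0; // assumed response-time target
    static final double MIN_THROUGHPUT_TPS  = 50.0;   // assumed throughput target

    // Returns true if the measured values for one unit of work meet the SLA.
    static boolean meetsSla(double avgResponseMs, double throughputTps) {
        return avgResponseMs <= MAX_AVG_RESPONSE_MS
                && throughputTps >= MIN_THROUGHPUT_TPS;
    }

    public static void main(String[] args) {
        System.out.println(meetsSla(1800.0, 55.0)); // true: within the SLA
        System.out.println(meetsSla(2500.0, 55.0)); // false: response time too high
    }
}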

Consider the purchase order application shown in the following figure.

Figure 2-2 Unit of Work: Purchase Order Application


Each node is a JPD. All of these JPDs are required for processing the purchase order. In this scenario, the unit of work (transaction) can be defined as either of the following:

  • Each individual JPD
  • The entire flow of business operations (all the JPDs together)

It is recommended that the entire flow of business operations, rather than each JPD, be considered a single unit of work.

The next step is to design the load generation script.

 


Design the Load Generation Script

A load generation script is required to load the server with the designed workload while running the tests.

Note: For information about running the tests, see Run Benchmark Tests and Run Scalability Tests.

While writing the load generation script, you must keep the following points in mind:

If the rate at which requests are sent is not controlled, requests may continue to arrive faster than the system can process them, leading to issues such as queue overflow.

The following figure depicts a single user sending the next request only after the previous request is processed by the server.

Figure 2-3 Balanced Load Generation Script


With this approach, the arrival rate (load) on the system can be increased by increasing the number of concurrent users, without affecting the system adversely; therefore, the capacity of the system can be measured accurately.

The following figure depicts a single user sending new requests without waiting for the server to finish processing previous requests.

Figure 2-4 Non-blocking Script With Arrival Rate More Than Throughput


This approach could cause issues such as queue overflow and lead to misinterpretation of capacity.

A balanced load generation script is recommended.
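
The following Java sketch illustrates the closed-loop behavior of a balanced script: each virtual user sends its next request only after the previous response has been received. The endpoint URL, user count, and class name are assumptions for illustration; an actual load generator such as LoadRunner provides this behavior through its own scripting facilities.

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative closed-loop ("balanced") load generator. Each virtual user
// sends its next request only after the previous response has been read
// completely, so the arrival rate can never exceed the rate at which the
// server completes work.
public class BalancedLoadGenerator {
    public static void main(String[] args) {
        final String targetUrl = "http://wli-server:7001/purchaseOrder"; // assumed endpoint
        final int concurrentUsers = 10;                                  // assumed number of virtual users

        for (int i = 0; i < concurrentUsers; i++) {
            new Thread(() -> {
                while (true) {
                    try {
                        HttpURLConnection conn =
                                (HttpURLConnection) new URL(targetUrl).openConnection();
                        conn.setRequestMethod("GET");
                        try (InputStream in = conn.getInputStream()) {
                            // Drain the response; the next request is sent only
                            // after this blocking read completes.
                            while (in.read() != -1) { /* discard */ }
                        }
                        conn.disconnect();
                    } catch (Exception e) {
                        e.printStackTrace(); // a real script would record the failure
                    }
                }
            }).start();
        }
    }
}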

 


Configure the Test Environment

The test environment must be configured as described in this section to ensure that the results of the tests are reliable and not affected by external factors.

 


Run Benchmark Tests

Benchmark tests help in identifying system bottlenecks and tuning the system appropriately.

The tests involve increasing the load on the system in gradual steps until the throughput stops increasing.

Note: For the purpose of benchmark tests, load is any aspect of the WLI application under test - number of concurrent users, document size, and so on - that demands system resources.
Note: The load must be increased gradually to ensure that the system has adequate warm-up time.
Note: Benchmark tests are run with no think time and with a single WLI machine.

When the throughput stops increasing, one of the following may have occurred:

  • Utilization of one of the hardware resources (CPU, memory, disk, or network) has peaked.
  • A bottleneck exists in the application or in the system parameters.

The following figure depicts a Mercury LoadRunner ramp-up schedule in which the initial 10 minutes are for warm-up, with 10 concurrent users. Subsequently, the load is increased at the rate of 10 additional users every 15 minutes.

Figure 2-5 Test Ramp-up Schedule

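The ramp-up schedule shown in the figure can be expressed as a simple calculation. The following Java sketch encodes one interpretation of that schedule (10 users during the 10-minute warm-up, then 10 additional users every 15 minutes); the class and method names are illustrative.

// One interpretation of the ramp-up schedule in Figure 2-5: 10 concurrent
// users during the 10-minute warm-up, then 10 additional users every
// 15 minutes. The class and method names are illustrative.
public class RampUpSchedule {
    static int activeUsers(int elapsedMinutes) {
        if (elapsedMinutes < 10) {
            return 10;                                   // warm-up phase
        }
        int completedSteps = (elapsedMinutes - 10) / 15; // full 15-minute steps since warm-up ended
        return 10 + 10 * (completedSteps + 1);           // each step adds 10 users
    }

    public static void main(String[] args) {
        for (int minute = 0; minute <= 60; minute += 5) {
            System.out.println("minute " + minute + ": " + activeUsers(minute) + " users");
        }
    }
}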

The following data must be recorded while running the tests:

  • Number of concurrent users
  • Throughput (average TPS)
  • Response time
  • Utilization of hardware resources (CPU, memory, disk, and network)

The following figure shows the result of a benchmark test.

Figure 2-6 Results of Benchmark Test


As users are added, the average TPS increases. When utilization of one of the hardware resources (in this case, CPU) reaches 100%, the average TPS peaks. The response time at this point is the optimal result. When further users are added to the system, the TPS starts diminishing.

This pattern of results indicates a system where resources are utilized to the maximum.
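
The following Java sketch illustrates how the recorded samples might be scanned to locate the peak average TPS and the corresponding load; the data points are made up for illustration and do not come from the tests in this guide.

// Illustrative analysis of benchmark samples: locate the load at which the
// average TPS peaks. The sample values below are made up for illustration.
public class BenchmarkAnalysis {
    public static void main(String[] args) {
        int[]    users = { 10,   20,   30,   40,   50 };
        double[] tps   = { 12.0, 23.5, 33.0, 34.1, 31.8 };  // hypothetical average TPS
        double[] cpu   = { 22.0, 45.0, 71.0, 98.0, 100.0 }; // hypothetical CPU utilization (%)

        int peak = 0;
        for (int i = 1; i < tps.length; i++) {
            if (tps[i] > tps[peak]) {
                peak = i;    // remember the sample with the highest average TPS
            }
        }
        System.out.println("Peak TPS " + tps[peak] + " at " + users[peak]
                + " concurrent users (CPU " + cpu[peak] + "%)");
    }
}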

The next activity in the capacity planning process is to validate the results of the benchmark tests.

Validating the Results Using Little's Law

Before analyzing the test results, you must validate them using Little's Law, to identify bottlenecks in the test setup. The test results must not deviate significantly from the result that is obtained when Little's Law is applied.

The response-time formula for a multi-user system can be proved by using Little's Law. Consider n users with an average think time of z connected to an arbitrary system with response time r. Each user cycles between thinking and waiting-for-response; so the total number of jobs in the meta-system (consisting of users and the computer system) is fixed at n.

n = x (z + r)

r = n/x - z

where n is the number of users (the average load on the meta-system), x is the throughput, r is the average response time, and z is the average think time; z + r is the average time each user takes to complete one think-and-response cycle.
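
The following Java sketch illustrates the validation: it computes the response time predicted by Little's Law from the measured load, throughput, and think time, and compares it with the measured response time. The numbers are illustrative.

// Validate a measured run against Little's Law: n = x (z + r).
// The values below are illustrative, not taken from the tests in this guide.
public class LittlesLawCheck {
    public static void main(String[] args) {
        double n = 50.0;          // concurrent users (average load)
        double x = 20.0;          // measured throughput, in transactions per second
        double z = 1.0;           // average think time, in seconds
        double measuredR = 1.4;   // measured average response time, in seconds

        double expectedR = n / x - z;  // response time predicted by Little's Law
        double deviation = Math.abs(measuredR - expectedR) / expectedR;

        System.out.println("Expected response time: " + expectedR + " s");
        System.out.println("Deviation from measurement: " + (deviation * 100) + " %");
        // A large deviation suggests a bottleneck in the test setup itself,
        // for example an overloaded load generator.
    }
}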

Interpreting the Results

While interpreting the results, take care to consider only the steady-state values of the system. Do not include ramp-up and ramp-down time in the performance metrics.

When the throughput saturates, utilization of a resource - CPU, memory, hard disk, or network - must have peaked. If utilization has not peaked for any of the resources, analyze the system for bottlenecks and tune it appropriately.

Tips for Analyzing Bottlenecks and Tuning

If no resource has reached peak utilization at the point when throughput saturates, bottlenecks could exist in the application or in the system parameters. These bottlenecks could be caused by any of the following:

 


Run Scalability Tests

An application can be considered scalable when it can handle increased load without degradation in performance. To handle the increased load, hardware resources may need to be added.

Applications can be scaled horizontally by adding machines and vertically by adding resources (such as CPUs) to the same machine.

Horizontal and Vertical Scaling

The following table compares the relative advantages of horizontal and vertical scaling:

Table 2-1 Relative Advantages of Horizontal and Vertical Scaling

Vertical Scaling (more resources in a single machine):
  • Facilitates easy administration.
  • Provides more effective interconnection between system resources.

Horizontal Scaling (more machines):
  • Improves manageability.
  • Offers better load balancing and high availability.

When an application needs to be scaled, you may opt for horizontal scaling, vertical scaling, or a combination, depending on your requirements.

The following figure shows a comparison between WLI running on a single non-clustered 4-CPU machine (vertical scaling) and on two clustered 2-CPU machines (horizontal scaling).

Figure 2-7 Horizontal and Vertical Scaling


Performance in the horizontal scaling scenario (two 2-CPU machines) is slightly lower than in the vertical scaling scenario (single 4-CPU machine) because of the additional load-balancing and clustering overhead.

Conducting Scalability Tests

Scalability tests help you find out how the application scales when additional resources are added in the system - horizontally and vertically. This information is useful for estimating the additional hardware resources required for a given scenario.

The scalability test involves increasing the load, in gradual steps, until the SLA is achieved or the target resource utilization is reached, whichever occurs first.

Note: In contrast, benchmark tests involve increasing the load till the throughput stops increasing.

For running scalability tests, the workload must be designed to emulate, as closely as possible, the production scenario. If no human user interaction is necessary and if the process invocations happen programmatically, it is recommended that you use a zero-think-time approach, similar to the approach for benchmark tests.

If the target resource utilization level is reached before the SLA is achieved, additional resources must be added to the system. The additional resources (vertical scaling) or machines (horizontal scaling) must be added in the order 1, 2, 4, 8, and so on.

Note: A minimum of three data points must be used to derive the equation for estimating capacity.

All the data that was recorded while running benchmark tests must be captured while running the scalability test. For more information, see Run Benchmark Tests.

Note: Only the data that is recorded when the resource utilization is closest to the target level must be used to estimate the additional resource requirement.

After running the tests, validate and analyze the results as described for benchmark tests, and then, if required, estimate the additional resource requirement as described in the next section.

 


Estimate Resource Requirement

If the required SLA is not achieved, you can fit a curve to the test results, derive an equation for the curve, and use that equation to estimate the additional hardware resources required. Techniques such as linear regression and curve fitting can be used to predict the required resources; such techniques can be implemented using spreadsheet applications such as Microsoft Excel.

The following figure shows the results of a horizontal scalability test.

Figure 2-8 Capacity Estimation: Horizontal Scaling


The graph shows the average number of transactions per second (TPS) at 70% CPU utilization for clusters with varying numbers of nodes.

For the results of this scalability test, a linear equation is the best fit. For a best-fit curve, R² must approach unity (the value 1).

The equation is y = 12.636x + 4.065, where y is the average TPS and x is the number of nodes.
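
The following Java sketch illustrates this estimation technique: it fits a straight line to a set of (nodes, TPS) data points by least squares and then solves for the number of nodes needed to reach a target TPS. The data points and target are made up for illustration (chosen so that the fitted line approximates the equation above); a spreadsheet such as Microsoft Excel can produce the same fit.

// Illustrative capacity estimation: fit a straight line (least squares) to
// (nodes, TPS) data points and solve for the number of nodes that meets a
// target TPS. The data points and target below are made up for illustration.
public class CapacityEstimate {
    // Least-squares fit of y = slope * x + intercept; returns { slope, intercept }.
    static double[] fitLine(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }
        double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        double intercept = (sy - slope * sx) / n;
        return new double[] { slope, intercept };
    }

    public static void main(String[] args) {
        double[] nodes = { 1, 2, 4 };              // at least three data points
        double[] tps   = { 16.8, 29.2, 54.7 };     // hypothetical average TPS at 70% CPU
        double[] line  = fitLine(nodes, tps);

        double targetTps = 100.0;                  // assumed SLA throughput
        double requiredNodes = (targetTps - line[1]) / line[0];

        System.out.printf("Fitted line: y = %.3fx + %.3f%n", line[0], line[1]);
        System.out.println("Nodes required for " + targetTps + " TPS: "
                + (int) Math.ceil(requiredNodes));
    }
}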

Note: Though adding additional resources horizontally or vertically can result in a higher TPS, this may not be useful if the objective is to achieve a certain response time. In such cases, consider using faster CPUs.

Based on the results of the scalability tests and the tuning that is necessary for achieving the required results, you must configure the application for deployment to the production environment.

Note: If the resources that you decide to purchase for the production environment are not of the same type or model as those used for the scalability tests, you can estimate the resource requirement by using the following formula:

E x T1 / T2

where:
E = Estimation from scalability tests
T1 = SPECint rate of the machine on which the test was executed
T2 = SPECint rate of the machine that you want to purchase

This formula is applicable only if the scaling is based on the number of CPUs. For more information about SPECint rates, see http://www.spec.org.
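
The following Java sketch applies the formula in the note above; the SPECint rates and the initial estimate are placeholders, not published figures.

// Illustrative application of the adjustment formula E x T1 / T2 from the
// note above. The SPECint rates below are placeholders, not published figures.
public class SpecIntAdjustment {
    public static void main(String[] args) {
        double e  = 8.0;   // estimate from the scalability tests (for example, CPUs or nodes)
        double t1 = 40.0;  // SPECint rate of the machine used for the tests (placeholder)
        double t2 = 60.0;  // SPECint rate of the machine to be purchased (placeholder)

        double adjusted = e * t1 / t2;
        System.out.println("Adjusted estimate: " + Math.ceil(adjusted) + " units");
    }
}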
