2 Planning for Load Testing

This chapter provides a basic methodology for testing the scalability and performance of Web applications throughout the life cycle. It outlines the process for selecting the appropriate tools and the recommended steps to perform effective scalability testing. This chapter is broadly divided into the following sections:

Goals and Requirements of Scalability Testing: What you should aim to accomplish through scalability testing in each phase of Web application development.
Methodology: The process and the steps that are required to ensure performance and scalability throughout the application life cycle.
Test Planning and Execution: How you should plan and execute scalability testing during each phase of development.

2.1 Goals of Scalability Testing

The primary goals of a load test are as follows:

Determine the user limit for the Web application.
- The user limit is the maximum number of concurrent users that the system can support while remaining stable and providing reasonable response time to users as they perform a variety of typical business transactions.
- The user limit should be higher than the required number of concurrent users that the application must support when it is deployed.
Determine client-side degradation and end user experience under load.
- Can users get to the Web application in a timely manner?
- Are users able to conduct business or perform a transaction within an acceptable time?
- How does the time of day, number of concurrent users, transactions and usage affect the performance of the Web application?
- Is the degradation "graceful?" Under heavy loading conditions, does the application behave correctly in "slow motion," or do components crash or send erroneous/incomplete pages to the client?
- What is the failure rate that users observe? Is it within acceptable limits? Under heavy loading conditions do most users continue to complete their business transactions or do a large number of users receive error messages?
Determine server-side robustness and degradation.
- Does my Web server crash under heavy load?
- Does my application server crash under heavy load?
- Do other middle-tier servers crash or slow down under heavy load?
- Does my database server crash under heavy load?
- Does my system require load balancing or, if a load balancing system is already in place, is it functioning correctly?
- Can my current architecture be fine-tuned to extract better performance?
- Should hardware changes be made for improved performance?
- Are there any resource deadlocks in my system?
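
The user limit described in the first goal above can be derived from the results of a stepped load test. The following is a minimal sketch in Python, not a feature of any particular tool. The LoadStep record, the find_user_limit function, the thresholds, and the sample figures are all hypothetical and only illustrate the idea: the user limit is the highest user count reached before response time or failure rate first exceeds what you consider acceptable.

```python
from dataclasses import dataclass


@dataclass
class LoadStep:
    users: int             # concurrent virtual users during this step
    avg_response_s: float  # mean transaction response time, in seconds
    failure_rate: float    # fraction of failed transactions, 0.0 to 1.0


def find_user_limit(steps, max_response_s, max_failure_rate):
    """Return the highest user count reached before the first failing step."""
    limit = 0
    for step in sorted(steps, key=lambda s: s.users):
        too_slow = step.avg_response_s > max_response_s
        too_many_failures = step.failure_rate > max_failure_rate
        if too_slow or too_many_failures:
            break  # stop at the first step that violates a criterion
        limit = step.users
    return limit


# Hypothetical measurements from three load steps.
steps = [
    LoadStep(users=100, avg_response_s=1.2, failure_rate=0.00),
    LoadStep(users=200, avg_response_s=1.9, failure_rate=0.01),
    LoadStep(users=300, avg_response_s=4.5, failure_rate=0.08),
]

user_limit = find_user_limit(steps, max_response_s=3.0, max_failure_rate=0.05)
required_users = 150  # hypothetical deployment requirement
print(user_limit, user_limit > required_users)
```

With these sample figures the limit is 200 users, which is above the assumed requirement of 150 concurrent users.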

2.2 Phases of Scalability Testing

The following are the different phases of load and scalability testing for a Web application:

Architecture Validation - tests the scalability of the architecture early in the development of a Web application, presumably after a prototype of the application has been created that can generate transactions to touch all tiers of the application. This allows the engineering organization to determine the viability of the architectural framework selected to build the Web application.

Performance Benchmarking - sets and creates the benchmark tests for the initial version of the application for all business transactions and gives the engineering and the quality assurance groups a set of metrics to quantify the scalability of the application. Based on the requirements specified, the development group will either maintain this scalability or improve upon it through the subsequent milestones.

Performance Regression - is the phase where the Web application is tested with the established benchmarks to ensure that the changes made to the application do not result in degradation of scalability. These tests are executed when key milestones have been reached or architectural modifications have been made during the development of the application. It is also common that the benchmark tests and the metrics originally set for the application be replaced or augmented with additional tests and newer metrics to reflect the improvements made to the application.

Acceptance and Scalability Fine Tuning - is the final load testing phase prior to the official launch of the Web application where all the different pieces of the Web application - including the hardware, load balancing components and all software components - are integrated and the scalability is validated. Different scenarios of real-life usage are emulated and the scalability of the final configuration is validated. These different scenarios are also used to configure the hardware and software components to yield optimal performance.

24x7 Performance Monitoring - after the application is deployed, it is essential to monitor the performance of the system under the real load generated by actual users so that crashes or slow-downs can be spotted before they become problematic. In this phase, data pertaining to real life usage can be collected to help refine future scalability tests for accurate emulation of load.
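
The Performance Regression phase above amounts to comparing each new run with the established benchmark. The following Python sketch shows one way to make that comparison. The find_regressions function, the 10% tolerance, and the transaction names and timings are assumptions chosen for illustration; they are not taken from any tool.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return transactions that are slower than baseline by more than tolerance.

    baseline and current map a transaction name to its response time in seconds.
    """
    regressions = {}
    for name, base_time in baseline.items():
        new_time = current.get(name)
        if new_time is None:
            continue  # transaction was not measured in the current run
        if new_time > base_time * (1 + tolerance):
            regressions[name] = (base_time, new_time)
    return regressions


# Hypothetical benchmark and current-milestone measurements.
baseline = {"login": 1.0, "check_balance": 2.0}
current = {"login": 1.05, "check_balance": 2.5}

regressions = find_regressions(baseline, current)
print(regressions)
```

Here only check_balance is reported, because login is still within the assumed 10% tolerance of its benchmark.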

2.3 Criteria for Accurate Scalability Testing

In order to emulate a realistic load that will correlate with real-life usage of the application, a load-testing tool must:

produce load that stresses all tiers of a multi-tier application;
allow for the simulation of a realistic mix of different groups performing different types of business activities on the site during peak periods;
emulate page and resource request patterns produced by popular browsers such as Internet Explorer and Netscape;
validate the responses coming back from the Web server for each of the thousands of concurrent users to ensure that the correct pages are being returned by the Web application under stress;
allow for easy maintenance of the scripts as the application changes so that scalability can be re-verified each time that the system is changed.

In addition, the following criteria are also important:

Dynamic Dial-up of users - this capability allows you to add new users to the load test without stopping the current test. For example, if you are running a 100 user load test and you want to add another 100 users dynamically, you don't have to stop the load test and start a new test with 200 users;
Real-Time Virtual User Debugger - the load testing tool should have some capability to allow you to visually monitor the progress of a user at any given point in time when the load test is in progress;
Real-Time Graphs that allow you to understand the scalability characteristics of the application as the load test is in progress;
allow for distributing load tests from a number of machines on the LAN/WAN with a central point of control;
allow the load tests to be executed with recorded think times, random think times (following some kind of statistical distribution), and with no think times;
measure response times of entire business transactions in addition to individual objects on pages such as sub-frames and images;
allow for simulation of different types of caching behaviors;
run data-driven tests to allow for unique concurrent users on the system;
allow for complex scheduling to allow for different scenarios of starting, stopping, and ramp-up;
provide reports and a performance database to allow for post-run analysis and comparison with previously established benchmarks.

2.4 Determine Additional Tools Required to Perform Testing and Diagnosis

Scalability testing requires several types of software tools for the various levels of testing and reporting/analysis of results. Before you can perform the testing, you need to familiarize yourself with the following software tools that will be used:

Oracle OpenScript - used to create the various scripts that the virtual users will run. The scripts specify the actual steps used in performing the business transactions in the application. Thorough testing of an application may require many different scripts that exercise different areas (that is, the different business transactions). Scripts are easily maintained over time as the application changes.

Oracle Load Testing - used to define virtual user profiles and scenarios and to perform the load test. Oracle Load Testing runs the scripts as multiple virtual users that test the scalability of the application. Oracle Load Testing defines the number and types of virtual users and which script(s) different virtual users run. It also provides real-time reports and graphs for evaluating the progress of the tests and post-analysis reports and graphs for post-run analysis of test results.

Oracle Load Testing ServerStats - provides real-time system monitoring for a variety of data sources so that you can monitor the impact of the load test on the individual servers (Web server, database server, application server, and system counters). For example, the load test may require monitoring software for a Netscape Web server, a ColdFusion application server, a Tuxedo server, and a mainframe database server.

In addition to Oracle Application Testing Suite, you may need additional software tools for other specialized monitoring or reporting.

Other System Monitoring Tools - You should determine what other software tools may be necessary for monitoring the load test.

Logging Tools - You should determine what software tools will be used for logging transaction and performance data or logging errors.

In addition, you will also need to gather data from other tools that monitor the state of different components in the application architecture. This will allow you to correlate the client-side degradation noticed during the scalability tests to one or more scalability problems with specific components of the application.

2.5 Determining the Hardware Needed to Execute the Tests

To execute a scalability test effectively, the appropriate hardware needed to run the test tools must be procured and configured.

In order to generate the load on the Web application using thousands of concurrent users you must consider the following:

Load Distribution Capability - Does the load test tool allow for load to be generated from multiple machines and controlled from a central point?

Operating System - What operating systems does the load test master and the load generation agents run under?

Processor - What type of CPU is required for the master and virtual user agent?

Memory - How much memory is required for the master and virtual user agent?

To ensure that you have the appropriate hardware to execute scalability tests, ask your load testing tool vendor to provide you with the hardware requirements for the load test master and agent machines.

General Rules of Thumb:

Windows NT 4.0/2000/2003 are better suited to run load test virtual users than Windows 98/XP, as they are more scalable and more stable operating systems.
If the CPU utilization of any workstation in the load test - be it the load master or the load agents running concurrent users - is higher than 70-80%, or the memory consumption is over 85%, the processes running on that workstation will experience operating system resource conflicts and the performance results from that test station will be skewed.
You should consider running the load master on a separate machine if possible. The machine that serves as the master typically needs a high performance CPU. The virtual users can run on one or more machines. These machines need to have sufficient memory to run a large number of virtual users.
To determine the number of virtual users that can be run on a machine, you can estimate based on the amount of memory each virtual user would consume. If the virtual users are running as threads within a process, on average they consume 300-500 KB of memory each. If the virtual users are running as separate processes, on average they consume 1024-2048 KB of memory each.
A hardware configuration document is usually available from each test vendor that explains the hardware requirements for a particular load testing setup.
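
The memory-based estimate in the rules of thumb above can be worked through as follows. This Python sketch is only a rough sizing aid. The max_virtual_users function, the 4096 MB machine, and the 1024 MB reserved for the operating system and the agent itself are hypothetical; the per-user memory figures and the 85% memory ceiling come from the rules of thumb.

```python
def max_virtual_users(total_memory_mb, reserved_mb, kb_per_user, memory_ceiling=0.85):
    """Estimate how many virtual users fit on one agent machine.

    total_memory_mb: physical memory of the machine
    reserved_mb:     memory assumed to be used by the OS and the agent itself
    kb_per_user:     memory consumed by one virtual user
    memory_ceiling:  fraction of memory that may be used before results skew
    """
    usable_kb = (total_memory_mb * memory_ceiling - reserved_mb) * 1024
    if usable_kb <= 0:
        return 0
    return int(usable_kb // kb_per_user)


# Use the upper end of each range for a conservative estimate.
threaded = max_virtual_users(4096, 1024, kb_per_user=500)
per_process = max_virtual_users(4096, 1024, kb_per_user=2048)
print(threaded, per_process)
```

Treat the result as a starting point only, and confirm it by watching CPU and memory on the agent machine during a trial run.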

2.6 Who Should be Responsible for Load Testing?

The following groups should have active participation in the load test:

Development Engineers and Architecture Groups - design and perform architecture validation tests and benchmarking criteria for the Web application. They work with the quality assurance groups to fine-tune the application and deployment architecture to perform optimally under load.

In some large software organizations, dedicated performance architecture groups exist that are charged with building and maintaining scalable frameworks.

Quality Assurance Organizations - design and execute development tests to verify correct operation of the application and acceptable performance.

Integration and Acceptance Organizations - design and perform integration tests that ensure all tiers and hardware operate together correctly before acceptance and deployment of the application. In most organizations, the quality assurance groups are charged with this responsibility as well.

Monitoring and Operations Groups - design and perform monitoring tests to ensure that the deployed application is available 24x7 and is not degrading under conditions of load or over long periods of regular usage.

2.7 What to Avoid When Testing for Scalability

Organizations performing scalability testing should avoid common pitfalls that guarantee incorrect results and potential failure. These include the following:

performing load tests on applications that are changing even as the tests are being performed;
performing load tests with applications that are not functionally tested so that even basic capabilities are not operational;
performing load tests on certain parts of the application that work and extrapolating the results to the entire application;
performing load tests with a smaller number of concurrent users and extrapolating the result for larger numbers.

2.8 Performing Scalability Testing

The general process for performing scalability testing on a Web application is as follows:

Define a process that is repeatable for executing scalability tests throughout the application life-cycle.
Define the criteria for scalability.
Determine the software tools required to run the load test.
Determine and configure the hardware and environment needed to execute the scalability tests.
Plan the scalability tests.
Plan the test scenarios.
Create and verify the scripts.
Create and verify the load test scenarios.
Execute the tests.
Evaluate the results against the defined criteria.
Generate required reports.

The details for the above steps are explained in the following sections.

2.8.1 Define the Process

Once the requirements for a load testing effort are defined, test planners need to define the process. In defining the process, test planners should consider the following issues and questions:

Required Applications - What application(s) will the load testing be performed against?

Scheduling - When will the testing be performed? What are the dates, times, build availability, and testing milestones that need to be met?

Personnel - Who will perform the analysis, planning, test development, test execution, and evaluation? Which internal department personnel (for example, business analysts, network specialists, quality assurance engineers, and developers) will be involved? Will any third-party personnel (for example, tools vendor, Internet Service Provider, or testing lab) be required?

Location - Where will the testing be performed? Will testing be performed internally or at an external location such as at an Internet Service Provider or testing lab?

Testing Environment - What software and hardware environment will the load tests be run against? When specifying the testing environment, you should look for and avoid the following common pitfalls:

Application stability - Make sure that the application is not being changed even as the load test is being undertaken. Quite frequently the entire application or parts of it are changed even as the application is being load tested.
Deployment environment - Make sure that the environment under which the application is operating when the load test is being performed is very close to the real deployment environment, if not exactly the same. For example, if your requirement states that the load test has to be performed against an HTTPS server that is configured with enough horsepower to sustain heavy loads, you should not run it against a smaller server that is used by the development group.
Acceptance environment - As part of the acceptance tests that are run prior to shipping the product, you must make sure that the environment used to perform the load test is exactly the same as the live production environment (as defined in a specification document).

Hardware Allocation - Is the required hardware (network, master load-test computer, agent computers, etc.) allocated and available for use? Testing vendors should be able to help determine the necessary hardware based on some of the following information:

number of virtual users or the desired throughput for the application as a whole (Transactions per second);
maximum or acceptable duration for each business transaction;
maximum or acceptable duration for delay between business transactions.

2.8.2 Define the Criteria

Before you can begin planning for load testing, you need to define the criteria that indicate whether or not the application will be accepted and ready for live deployment. When defining the criteria, you should specify the following:

Load to be Simulated - What number of virtual users need to be emulated? This indicates the number of concurrent users on the Web server.

Number of Business Transactions to Simulate - How many business transactions are to be simulated for the load test? This is determined by the analysis of the application during requirements planning and may be specified as transactions-per-second (TPS), transactions-per-hour, or simultaneous user sessions.

Types of Business Transactions to be Simulated - What are the business transactions that need to be simulated (for example, read an account balance, make an account transaction, check account details, check contributions, etc.)?

Criteria for Each Business Transaction - For each business transaction you should determine the following:

Acceptable response time under various loads - What is an acceptable response time under various conditions of load? For example, what is the acceptable response time when running 100 virtual users? 200 virtual users? Also, what is the acceptable response time when running the maximum limit of virtual users?
Acceptable failure rate - What is the acceptable failure rate for all of the transactions and for each business transaction when under load? For example, zero failures allowed for up to 100 virtual users, 5% failures for 200 virtual users, etc.
Categories of users - What are the categories of users to simulate in the various transactions? Are they first time users or are they repeat users? First time users have a higher overhead on the Web server since all the images must be downloaded. You can design and develop tests for both types of users and run combinations of load tests under differing conditions.
SSL and HTTP - Does testing require a combination of SSL and plain HTTP, only SSL, or only HTTP?
Browsers to simulate - What browsers will be simulated in the load test? Will testing simulate Internet Explorer or Netscape (or both)?
Pacing mode - What is the virtual user pacing that will be used for the load test? Will the testing be performed using recorded "think times" (that is, running with delays between pages that correspond to the same natural pauses that occurred while recording the script)? Should you try a worst-case stress test with no delays between pages? Or alternatively, should you use a random distribution of delays representing a range of user-speeds from expert users on T3 connections to novice users on slow modems?
Delay Between Business Transaction Runs - What delay time will be included between business transaction tests, if any?
With or without images - Will virtual users run with images or without images? Images constitute an additional load on the Web server. In many cases, you may want to perform load testing both with and without images for comparison.
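
The three pacing modes in the list above can be illustrated with a short Python sketch. The think_time function and its mode names are hypothetical, and the uniform distribution used for random pacing is an assumption; a load testing tool may use a different statistical distribution.

```python
import random


def think_time(mode, recorded_s, rng, min_s=1.0, max_s=10.0):
    """Return the delay in seconds before a virtual user requests the next page."""
    if mode == "recorded":
        return recorded_s                  # replay the pause captured while recording
    if mode == "random":
        return rng.uniform(min_s, max_s)   # assumed uniform spread of user speeds
    if mode == "none":
        return 0.0                         # worst-case stress test, no pauses
    raise ValueError(f"unknown pacing mode: {mode}")


rng = random.Random(42)  # seeded so that a run can be repeated
delays = {mode: think_time(mode, 4.0, rng) for mode in ("recorded", "random", "none")}
print(delays)
```

Seeding the random generator keeps a random-pacing run repeatable, which helps when you compare results between builds.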

Overall Transactions-Per-Second Throughput Required - What is the overall transactions per second (TPS) throughput required for the load test? This can be computed based on the number of simultaneous business transactions and the duration of typical transactions.
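
As a rough worked example of the computation described above, the following Python sketch divides the number of simultaneous users by the time each user needs to complete one transaction cycle. The required_tps function and the sample numbers are hypothetical, and the sketch assumes that every virtual user repeats one business transaction back to back.

```python
def required_tps(concurrent_users, transaction_duration_s, delay_between_runs_s=0.0):
    """Estimate overall throughput in business transactions per second."""
    cycle_s = transaction_duration_s + delay_between_runs_s
    if cycle_s <= 0:
        raise ValueError("transaction cycle time must be positive")
    return concurrent_users / cycle_s


# 200 users, each taking 30 seconds per transaction with a 10 second delay between runs.
tps = required_tps(200, transaction_duration_s=30.0, delay_between_runs_s=10.0)
print(tps)  # 5.0
```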

Type of Error Handling - What type of error handling is required when executing the load test? Should the load test stop when it encounters certain types of errors, or should it just log the error and continue? What types of error logging need to be enabled for each concurrent user and for the different components in the application architecture?

Type of Transaction and Performance Data Logging - What type of transaction and performance data needs to be logged for the various scripts?
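
The error-handling choice described above (stop on certain types of errors, or log the error and continue) can be written down as a simple policy. The following Python sketch is illustrative only. The error names and the handle_errors function are hypothetical and do not correspond to settings in any specific tool.

```python
# Hypothetical error types that should end the test immediately.
FATAL_ERRORS = {"connection_refused", "server_crash"}


def handle_errors(errors, fatal=FATAL_ERRORS):
    """Log errors in order; report whether the test had to be stopped.

    Returns a tuple of (logged errors, stopped flag).
    """
    log = []
    for error in errors:
        log.append(error)      # every error is logged, fatal or not
        if error in fatal:
            return log, True   # stop the test at the first fatal error
    return log, False


print(handle_errors(["timeout", "http_500"]))
print(handle_errors(["timeout", "server_crash", "timeout"]))
```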

2.8.3 Planning the Scalability Tests

Developing detailed test plans before you actually create the tests is an important step in making sure the tests conform to the business analysis of the application and the defined criteria.

For each test that will perform a business transaction you need to plan and define the following information:

Steps for Scripts - Each script should have a detailed sequence of steps that define the exact actions a user would perform. Multiple scripts can be used. For example, you can define a specific script that performs user login, several scripts that perform specific business transactions, and another script that logs users off. For each script, you should define the expected results. Oracle OpenScript lets you quickly and easily record scripts that emulate a user's actions.

Run-Time Data - The test plan should specify any run-time data that is needed to interact with the application, for example, login user IDs, passwords, and other run-time data specific to the application.

Data Driven Tests - If the scripts require varying data at run-time, you'll need to have an understanding of all the fields that require this data. You also need to define the data sources and any tool(s) needed to either create fictitious data or extract real data from existing databases.

Oracle OpenScript Data Bank Wizard lets you specify and connect external data sources to the scripts.
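
If you need fictitious run-time data, a small script can generate it. The following Python sketch writes unique login records as comma-separated values. The column names, the user-name prefix, and the password pattern are hypothetical, and the layout is only an assumption about what an external data source might look like; check the requirements of your tool before using generated data.

```python
import csv
import io


def write_login_data(stream, count, prefix="loadtest_user"):
    """Write `count` unique, fictitious login records as CSV rows."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["username", "password"])  # header row
    for i in range(1, count + 1):
        writer.writerow([f"{prefix}{i:04d}", f"Pw-{i:04d}"])


# Write to an in-memory buffer here; pass an open file object to save to disk.
buffer = io.StringIO()
write_login_data(buffer, 3)
rows = buffer.getvalue().splitlines()
print(rows)
```

Generating one record per virtual user lets each concurrent user log in with a unique identity.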

2.8.4 Planning the Load Test Scenarios

In addition to the business transaction details for each script, the test plan should also specify the different user groups and test scenarios that will be required for load testing. For each test scenario you need to plan and define the following information:

Type of User - Is this user a first-time user of the application or a repeat user? This is important if the application responds differently for a first-time user than it does for a repeat user and places more stress on the server. Oracle Load Testing scenarios can specify either a first-time user or a repeat user.

Transactions to Perform - Which business transaction(s) will this user perform? In what sequence? If the application requires a first-time user to perform some type of registration, then the user profile for first-time users should include a registration script.

Number of Users - How many virtual users with this user profile will run over the same time interval? Oracle Load Testing lets you specify the number of virtual users for each test scenario.

Which System - Which specific computer(s) will be used to generate the load for this user group? Oracle Load Testing can run virtual users on a single system or on multiple, distributed systems running Oracle Load Testing agents. Oracle Load Testing can specify which virtual user scenarios run on which workstations.

Which Browser - Which browser will this user group emulate? Oracle Load Testing can specify that virtual user scenarios emulate either Internet Explorer or Netscape.

Pacing mode - What pacing mode will be used for the user group? Will the testing be performed using recorded think times, a range of times, or as fast as possible? Oracle Load Testing virtual user scenarios let you specify recorded, random, or no pacing.

Delay Between Business Transaction Runs - What delay time will be included between business transactions, if any? Oracle Load Testing lets you specify the amount of delay time between transaction runs.

With or Without Images - Will the user group run with images or without images? You may want to create different user groups that perform load testing both with and without images for comparison. Oracle Load Testing provides this capability.

2.8.5 Create and Verify the Test Scripts

After planning the scripts, you will use Oracle OpenScript to create and verify each script.

Create the Scripts - This process is defined by Oracle OpenScript (recording user actions) and the individual test plans for each script. When creating the script, you specify the following information as defined in the test plan:

user actions to perform
timers
tests to perform
data sources

Verify the Scripts - Once each script is created, you should verify that the script performs as expected and produces the desired result. Each script should be verified independent of any other scripts and in a controlled manner to simplify script debugging.

2.8.6 Create and Verify the Load Test Scenarios

Once the individual scripts have been created and verified, you can create and verify the load test scenarios. It will save you a lot of time and aggravation if you perform a number of simple verification steps before your full-blown load test.

Verify scripts with Multiple Virtual Users - Before combining multiple scripts into a single load test scenario, you should verify that you can successfully run a single script as multiple virtual users. Each script should perform as expected as defined by the criteria for the application.

Oracle Load Testing Autopilot lets you run multiple scenarios with different virtual user characteristics.

Verify distributed test execution on multiple machines - You should verify the load test tool's ability to execute the individual scripts properly in a distributed environment if you plan to use multiple machines for load generation. This usually involves a master system controlling the virtual user execution on multiple workstations on the network. This can help you isolate any installation or networking-related issues.

Verify real-life scenarios that include one of each user group - Before executing the full load test you should create and verify a scenario that includes one virtual user of each user group you wish to run at the same time. That is, before you run a test with 20 VUs of group A, 20 VUs of group B, and 60 VUs of group C, you should first run one VU of group A, one VU of group B, and one VU of group C, and make sure that the results are as expected.

Create real-life scenarios - This process should be defined in the test plans for each scenario. When creating the individual scenarios, you specify the following information as defined in the test plan:

type of user
pacing mode
navigation/transactions to perform
delay between transaction runs
number of users of each type
with or without images
system used for load generation
error log settings
browser emulation
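
The "one of each user group" verification described above can be derived directly from the full scenario definition. The following Python sketch assumes a scenario is represented as a simple mapping of group name to virtual user count. That representation and the smoke_scenario function are hypothetical; the group sizes are the ones used in the example above.

```python
# Full scenario from the example: 20 + 20 + 60 virtual users.
full_scenario = {"group_a": 20, "group_b": 20, "group_c": 60}


def smoke_scenario(scenario):
    """Return a scenario with one virtual user for each group that is in use."""
    return {group: 1 for group, count in scenario.items() if count > 0}


smoke = smoke_scenario(full_scenario)
print(smoke)
```

Deriving the verification scenario from the full one ensures that no user group is left out of the preliminary run.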

2.8.7 Execute the Tests

Once you have created and verified the basic load test scenarios above, you can begin to run the load test scenarios with many virtual users, and expect that the test results will be valid.

Run basic tests to ensure scaling - Run tests with a small number of virtual users to ensure that the system scales up correctly.

Run individual business transactions - Run each of the different business transactions starting with 10 virtual users scaling up to 25 - 50 virtual users.
Run combinations of business transactions - Run a combination of different business transaction scripts starting with 5 virtual users scaling up to 25 virtual users.

If the above two scenarios execute without any problems, the next step is to execute the full load test with the full number of virtual users of each user-group type.
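
The stepped ramp-ups for the two preliminary scenarios above can be planned in advance. The following Python sketch lists the load levels for each run. The ramp_steps function and the step sizes are assumptions; the start and end values come from the scenarios above, using the upper figure of 50 users for individual business transactions.

```python
def ramp_steps(start, end, step):
    """Return the virtual user counts to run, from start up to and including end."""
    if start <= 0 or step <= 0 or end < start:
        raise ValueError("expected 0 < start <= end and step > 0")
    levels = list(range(start, end, step))
    levels.append(end)  # always finish on the target load
    return levels


individual = ramp_steps(10, 50, 10)  # each business transaction on its own
combined = ramp_steps(5, 25, 5)      # combination of business transactions
print(individual, combined)
```

Checking the results at each level before moving to the next makes it easier to see where scaling problems first appear.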

Run the real-life scenarios - Run each of the real scenarios as outlined in the previous steps:

Increase the scenarios up to the required number of simultaneous virtual users;
Monitor for any errors in the system.

Re-Run these scenarios with a real user - While the load test is running, a real person should access the system through a standard browser and report performance observations:

Observe the degradation times for a real user;
Observe any errors if they are being reported back in the browser.

2.8.8 Evaluate the Results

For each of the load test scenarios, examine the following performance data and validate the results against the expected criteria:

Response times for groups of users at the different numbers of virtual users;
System throughput at various numbers of virtual users;
Any errors that may have occurred.
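
Validating the results against the expected criteria can be automated once both are written down per load level. The following Python sketch is a minimal illustration. The evaluate function, the data layout, the response-time limits, and the measured values are hypothetical; the failure-rate limits follow the example given earlier (zero failures up to 100 virtual users, 5% at 200 virtual users).

```python
# Acceptance criteria per number of virtual users.
criteria = {
    100: {"max_response_s": 2.0, "max_failure_rate": 0.00},
    200: {"max_response_s": 4.0, "max_failure_rate": 0.05},
}

# Hypothetical measurements from a load test run.
results = {
    100: {"avg_response_s": 1.4, "failure_rate": 0.00},
    200: {"avg_response_s": 4.6, "failure_rate": 0.03},
}


def evaluate(criteria, results):
    """Return, for each load level, the list of criteria that were not met."""
    report = {}
    for users, limits in criteria.items():
        measured = results.get(users)
        if measured is None:
            report[users] = ["no results recorded"]
            continue
        problems = []
        if measured["avg_response_s"] > limits["max_response_s"]:
            problems.append("response time")
        if measured["failure_rate"] > limits["max_failure_rate"]:
            problems.append("failure rate")
        report[users] = problems
    return report


report = evaluate(criteria, results)
print(report)
```

An empty list for a load level means that all criteria were met at that level.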

View Run graph options let you evaluate performance in real-time.

Save the erroneous HTML when problems occur to help the development group debug the errors.

2.8.9 Generate Analysis Reports

Document the performance by generating the various reports that may be required for acceptance and deployment of the application. The following are some examples of the types of reports that can be generated from a load test:

Performance vs. Time
Statistics vs. Time
Users vs. Time
Errors vs. Users
Statistics vs. Users
Errors vs. Time
Any other error reports that may be required for the development group to debug and fix any problems that may have occurred.
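
As an illustration of what a report such as Errors vs. Users contains, the following Python sketch aggregates raw transaction samples by load level. The sample format (number of virtual users, success flag), the sample data, and the errors_vs_users function are hypothetical; a load testing tool normally produces this report for you.

```python
from collections import defaultdict

# Hypothetical samples: (virtual users active, transaction succeeded).
samples = [
    (50, True), (50, True),
    (100, True), (100, False),
    (200, False), (200, False),
]


def errors_vs_users(samples):
    """Return the error rate observed at each load level, ordered by user count."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for users, succeeded in samples:
        totals[users] += 1
        if not succeeded:
            errors[users] += 1
    return {users: errors[users] / totals[users] for users in sorted(totals)}


error_report = errors_vs_users(samples)
print(error_report)
```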

Oracle Load Testing graphs in the Create Reports tab let you view performance and error data from the load test in multiple formats.

2.9 Summary

Load testing throughout the development cycle has become an essential part of the process of designing scalable, reliable Web applications. Developers and QA professionals now rely on load testing tools as a means to validate system architectures, tune applications for maximum performance, and assess the impact of hardware upgrades. Consequently, it is critical that the load test results can be used with confidence as the basis for key decisions about application readiness and potential changes to the system's hardware and software. Using the methodology embodied in this guide along with accurate load testing tools such as Oracle Load Testing, you now have a systematic approach to ensure the performance of your Web applications. With load testing established as a routine part of the application lifecycle you can be sure to avoid costly "scalability surprises" when your application goes live for the first time or after any subsequent release.