1
Performance Overview

This chapter discusses performance and tuning concepts, and briefly describes Oracle9i Application Server architecture.

This chapter contains the following sections:

Performance Terms

Following are performance terms used in this book:

concurrency

The ability to handle multiple requests simultaneously. Threads and processes are examples of concurrency mechanisms.

contention

Competition for resources.

hash

A number generated from a string of text by an algorithm. The hash value is substantially smaller than the text itself. Hash values are used for security and for faster access to data.

latency

The time that one system component spends waiting for another component in order to complete the entire task. Latency can be defined as wasted time. In networking contexts, latency is defined as the travel time of a packet from source to destination.

response time

The time between the submission of a request and the receipt of the response.

scalability

The ability of a system to provide throughput in proportion to, and limited only by, available hardware resources.
A scalable system is one that can handle increasing numbers of requests without adversely affecting response time and throughput.

service time

The time between the receipt of a request and the completion of the response to the request.

think time

The time the user is not engaged in actual use of the processor.

throughput

The number of requests processed per unit of time.

wait time

The time between the submission of the request and the start of processing of the request.

What is Performance Tuning?

Performance must be built in. You must anticipate performance requirements during application analysis and design, and balance the costs and benefits of optimal performance. This section introduces some fundamental concepts:

Response Time
System Throughput
Wait Time
Critical Resources
Effects of Excessive Demand
Adjustments to Relieve Problems

See Also:
"Setting Performance Targets" for a discussion on performance requirements and determining what parts of the system to tune.

Response Time

Because response time equals service time plus wait time, you can increase performance in this area by:

Reducing wait time
Reducing service time

Figure 1-1 illustrates ten independent tasks competing for a single resource.

Figure 1-1 Sequential processing of independent tasks

In this example, only task 1 runs without waiting. Task 2 must wait until task 1 has completed; task 3 must wait until tasks 1 and 2 have completed, and so on. (Although the figure shows the independent tasks as the same size, the size of the tasks will vary.)

In parallel processing with multiple resources, more resources are available to the tasks. Each independent task executes immediately using its own resource: no wait time is involved.
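The relationship between wait time, service time, and response time can be sketched in a few lines of Python. This is only an illustration: the function names and the one-second service time are assumptions chosen for the example, not values from this guide.

```python
def sequential_response_times(service_times):
    """Return (wait, response) pairs for tasks run one at a time, in order."""
    results = []
    clock = 0.0  # time at which the single resource becomes free
    for service in service_times:
        wait = clock               # the task waits for all earlier tasks
        response = wait + service  # response time = wait time + service time
        results.append((wait, response))
        clock += service
    return results


def parallel_response_times(service_times):
    """With one resource per task nothing waits: response equals service time."""
    return [(0.0, service) for service in service_times]


tasks = [1.0] * 10  # ten tasks of one second each (illustrative values)
sequential = sequential_response_times(tasks)
parallel = parallel_response_times(tasks)

print(sequential[0])  # (0.0, 1.0): task 1 runs without waiting
print(sequential[9])  # (9.0, 10.0): task 10 waits for the nine tasks ahead of it
print(parallel[9])    # (0.0, 1.0): no wait when each task has its own resource
```

In the sequential case the service time of every task is unchanged; only the wait time grows with position in the queue.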

System Throughput

System throughput is the amount of work accomplished in a given amount of time. You can increase throughput by:

Reducing service time
Reducing overall response time by increasing the amount of scarce resources available. For example, if the system is CPU bound, you can add more CPUs.
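As a rough sketch of the two options above, the following Python function estimates throughput for a batch of identical tasks. It assumes a deliberately simplified model (identical tasks, identical resources, no scheduling overhead); the function name and the numbers are hypothetical.

```python
import math


def throughput(num_tasks, service_time, num_resources):
    """Tasks completed per second for identical tasks on identical resources."""
    # Each resource handles one task at a time, so the work finishes in rounds.
    rounds = math.ceil(num_tasks / num_resources)
    elapsed = rounds * service_time
    return num_tasks / elapsed


base = throughput(10, 1.0, 1)       # one resource, one-second tasks
faster = throughput(10, 0.5, 1)     # service time cut in half
more_cpus = throughput(10, 1.0, 2)  # a second resource added

print(base, faster, more_cpus)  # 1.0 2.0 2.0
```

Under these assumptions, halving the service time and doubling the scarce resource give the same throughput gain. In a real system the two options differ in cost and in how well the workload can be spread across resources.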

Wait Time

While the service time for a task may stay the same, wait time lengthens with increased contention. If many users are waiting for a service that takes one second, the tenth user must wait nine seconds. Figure 1-2 shows the relationship between wait time and resource contention.

Figure 1-2 Wait time rising with increased contention for a resource

Critical Resources

Resources such as CPU, memory, I/O capacity, and network bandwidth are key to reducing service time. Adding resources increases throughput and reduces response time. Performance depends on these factors:

How many resources are available?
How many clients need the resource?
How long must they wait for the resource?
How long do they hold the resource?

Figure 1-3 shows that as the number of units requested rises, the time to service completion rises.

Figure 1-3 Time to service completion vs. demand rate

To manage this situation, you have two options:

Limit demand rate to maintain acceptable response times
Add resources

Effects of Excessive Demand

Excessive demand increases response time and reduces throughput, as shown in Figure 1-4. If there is any possibility of the demand rate exceeding the achievable throughput, then determine which parameters should be adjusted (such as ThreadsPerChild in the Oracle HTTP Server and security.maxConnections in JServ) and change the configuration accordingly.

Figure 1-4 Increased Demand/Reduced Throughput
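The effect of demand exceeding achievable throughput on response time can be sketched with a simple per-second model. This is an assumption-laden illustration, not a model of Oracle HTTP Server or JServ: the function name, rates, and one-second time step are hypothetical, and the model caps throughput at capacity rather than reproducing the decline in throughput shown in the figure.

```python
def simulate_backlog(demand_rate, capacity, seconds):
    """Track queued requests when demand_rate requests arrive and at most
    capacity requests complete in each one-second step."""
    backlog = 0
    history = []
    for _ in range(seconds):
        backlog += demand_rate           # new requests arrive
        served = min(backlog, capacity)  # the server cannot exceed its capacity
        backlog -= served
        est_wait = backlog / capacity    # seconds a new arrival would wait
        history.append((served, backlog, est_wait))
    return history


within = simulate_backlog(demand_rate=80, capacity=100, seconds=5)
over = simulate_backlog(demand_rate=120, capacity=100, seconds=5)

print(within[-1])  # (80, 0, 0.0): demand below capacity, no queue builds up
print(over[-1])    # (100, 100, 1.0): backlog and wait keep growing each second
```

When demand stays above capacity, the backlog grows without bound, which is why limiting the number of requests admitted at once can keep response times acceptable for the requests that are accepted.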

Adjustments to Relieve Problems

Performance problems can be relieved by making adjustments in the following areas:

unit consumption

Reducing the resource (CPU, memory) consumption of each request can improve performance. This might be achieved by pooling and caching.

functional demand

Rescheduling or redistributing the work will relieve some problems.

capacity

Increasing or reallocating resources (such as CPUs) relieves some problems.
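As one illustration of reducing unit consumption through caching, the following Python sketch caches the result of a repeated computation with the standard library's functools.lru_cache. The render_page function and its output are hypothetical stand-ins for expensive per-request work; they are not part of any Oracle9i Application Server component.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive work actually runs


@lru_cache(maxsize=128)
def render_page(page_id):
    """Stand-in for an expensive, repeatable computation (hypothetical)."""
    global call_count
    call_count += 1
    return f"<html>page {page_id}</html>"


# One thousand requests for the same page.
for _ in range(1000):
    render_page(42)

print(call_count)                     # 1: the work ran once
print(render_page.cache_info().hits)  # 999: the remaining requests hit the cache
```

Caching trades memory for CPU time, so it helps only when the same result is requested repeatedly and the cached copies fit in available memory.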

Setting Performance Targets

Whether you are designing or maintaining a system, you should set specific performance goals so that you know how and what to optimize. If you alter parameters without a specific goal in mind, you can waste time tuning your system without significant gain.

An example of a specific performance goal is an order entry response time under three seconds. If the application does not meet that goal, identify the cause (for example, I/O contention), and take corrective action. During development, test the application to determine if it meets the designed performance goals.

Tuning usually involves a series of trade-offs. Once you have determined the bottlenecks, you may have to modify performance in some other areas to achieve the desired results. For example, if I/O is a problem, you may need to purchase more memory or more disks. If a purchase is not possible, you may have to limit the concurrency of the system to achieve the desired performance. However, if you have clearly defined goals for performance, the decision on what to trade for higher performance is simpler because you have identified the most important areas.

Setting User Expectations

Application developers, database administrators, and system administrators must be careful to set appropriate performance expectations for users. When the system carries out a particularly complicated operation, response time may be slower than when it is performing a simple operation. Users should be made aware of which operations might take longer.

Evaluating Performance

With clearly defined performance goals, you can readily determine when performance tuning has been successful. Success depends on the functional objectives you have established with the user community, your ability to measure whether or not the criteria are being met, and your ability to take corrective action to overcome any exceptions.

Ongoing performance monitoring enables you to maintain a well-tuned system. Keeping a history of the application's performance over time enables you to make useful comparisons. With data about actual resource consumption for a range of loads, you can conduct objective scalability studies and from these predict the resource requirements for anticipated load volumes.

Performance Methodology

Achieving optimal effectiveness in your system requires planning, monitoring, and periodic adjustment. The first step in performance tuning is to determine the goals you need to achieve and to design effective usage of available technology into your applications. After implementing your system, you must periodically monitor and adjust it. For example, one goal might be to ensure that 90% of the users experience response times no greater than 5 seconds and that the maximum response time for all users is 20 seconds. Usually, it's not that simple. Your application may include a variety of operations with differing characteristics and acceptable response times. You will need to set measurable goals for each of these.
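A goal such as "90% of users within 5 seconds, no user above 20 seconds" can be checked mechanically against measured response times. The following Python sketch is one way to do so; the function name, the nearest-rank percentile method, and the sample measurements are assumptions made for the example.

```python
import math


def meets_goals(response_times, percentile=90, percentile_limit=5.0, max_limit=20.0):
    """Return True if `percentile` percent of the samples are within
    percentile_limit seconds and no sample exceeds max_limit seconds."""
    if not response_times:
        raise ValueError("no response times recorded")
    ordered = sorted(response_times)
    # Nearest-rank percentile: the smallest sample with at least
    # `percentile` percent of all samples at or below it.
    rank = math.ceil(percentile * len(ordered) / 100)
    percentile_value = ordered[rank - 1]
    return percentile_value <= percentile_limit and ordered[-1] <= max_limit


# Made-up measurements, in seconds.
good = [1.2, 0.8, 2.5, 3.1, 4.9, 1.0, 2.2, 3.3, 4.0, 12.0]
slow_tail = [1.2, 0.8, 2.5, 3.1, 6.0, 1.0, 2.2, 3.3, 4.0, 12.0]
outlier = [1.2, 0.8, 2.5, 3.1, 4.9, 1.0, 2.2, 3.3, 4.0, 25.0]

print(meets_goals(good))       # True
print(meets_goals(slow_tail))  # False: the 90th percentile is 6.0 seconds
print(meets_goals(outlier))    # False: one response took 25 seconds
```

When operations have different acceptable response times, the same check can be run separately for each operation with its own limits.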

You will also need to determine variances in the load. For example, users might access the system heavily between 9:00am and 10:00am and then again between 1:00pm and 2:00pm, as shown in Figure 1-5. If your peak load occurs on a regular basis, for example, daily or weekly, the conventional wisdom is to configure and tune systems to meet your peak load requirements. The lucky users who access the application in off-time will experience better response times than your peak-time users. If your peak load is infrequent, you may be willing to tolerate higher response times at peak loads for the cost savings of smaller hardware configurations.

Figure 1-5 Adjusting Capacity and Functional Demand
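To find when peak load occurs, request timestamps can be grouped by hour of day. The following Python sketch shows the idea; the function names and the sample timestamps are invented for illustration and do not come from a real access log.

```python
from collections import Counter
from datetime import datetime


def requests_per_hour(timestamps):
    """Count requests in each hour of the day (0 to 23)."""
    return Counter(ts.hour for ts in timestamps)


def peak_hours(counts, top=2):
    """Return the `top` busiest hours, busiest first."""
    return [hour for hour, _ in counts.most_common(top)]


# Made-up request times for a single day.
sample = [
    datetime(2001, 6, 4, 9, 5),
    datetime(2001, 6, 4, 9, 20),
    datetime(2001, 6, 4, 9, 45),
    datetime(2001, 6, 4, 11, 30),
    datetime(2001, 6, 4, 13, 10),
    datetime(2001, 6, 4, 13, 40),
]

counts = requests_per_hour(sample)
print(peak_hours(counts))  # [9, 13]
```

Comparing these counts across days or weeks shows whether the peaks recur regularly, which bears on whether to size the system for peak load.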

Factors in Improving Performance

Performance spans several areas:

Application design: Designing applications that efficiently utilize hardware resources and handle increasing numbers of users effectively.
Sizing and configuration: Determining the type of hardware needed to support your performance goals. See Chapter 3, "Sizing and Configuration".
Parameter tuning: Setting configurable parameters to achieve the best performance for your application. See Chapter 5, "Optimizing Apache JServ" and Chapter 4, "Optimizing HTTP Server Performance".
Performance monitoring: Determining what hardware resources are being used by your application and what response time your users are experiencing. See Chapter 2, "Monitoring Your Web Server".
Troubleshooting: Diagnosing why an application is using excessive hardware resources, or why the response time exceeds the desired limit.

See Also:

Chapter 3, "Sizing and Configuration", for more information on sizing and configuration

Chapter 4, "Optimizing HTTP Server Performance", and Chapter 5, "Optimizing Apache JServ", for more information on parameter tuning

Chapter 2, "Monitoring Your Web Server", for more information on performance monitoring

Architecture

Figure 1-6 shows the architecture of Oracle9i Application Server.

This guide addresses the performance and configuration of these components:

Oracle HTTP Server powered by Apache
Apache JServ
OracleJSP

See Also:
The Oracle9i Application Server Overview Guide for a list of publications that describe other components.

Figure 1-6 Oracle9i Application Server architecture
