Sun Java System Application Server Enterprise Edition 8.2 Deployment Planning Guide

Establishing Performance Goals

At its simplest, high performance means maximizing throughput and reducing response time. Beyond these basic goals, you can establish specific goals by determining the following:

You can calculate some of these metrics using a remote browser emulator (RBE) tool, or web site performance and benchmarking software that simulates expected application activity. Typically, RBE and benchmarking products generate concurrent HTTP requests and then report the response time for a given number of requests per minute. You can then use these figures to calculate server activity.

The results of the calculations described in this chapter are not absolute. Treat them as reference points to work against, as you fine-tune the performance of the Application Server and your applications.

This section discusses the following topics:

Estimating Throughput

In broad terms, throughput measures the amount of work performed by Application Server. For Application Server, throughput can be defined as the number of requests processed per minute per server instance. High availability applications also impose throughput requirements on HADB, since they save session state data periodically. For HADB, throughput can be defined as volume of session data stored per minute, which is the product of the number of HADB requests per minute, and the average session size per request.

As described in the next section, Application Server throughput is a function of many factors, including the nature and size of user requests, number of users, and performance of Application Server instances and back-end databases. You can estimate throughput on a single machine by benchmarking with simulated workloads.

High availability applications incur additional overhead because they periodically save data to HADB. The amount of overhead depends on the amount of data, how frequently it changes, and how often it is saved. The first two factors depend on the application in question; the last is also affected by server settings.

HADB throughput can be defined as the number of HADB requests per minute multiplied by the average amount of data per request. Higher HADB throughput means that more HADB nodes and a larger store size are needed.
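The HADB throughput definition above can be expressed as a short calculation. The following Python sketch is illustrative only: the function name and the sample figures (6,000 HADB requests per minute, 20 KB average session size) are assumptions, not values from this guide.

```python
def hadb_throughput_kb_per_minute(hadb_requests_per_minute, avg_session_size_kb):
    """Volume of session data stored per minute, in kilobytes.

    Product of the number of HADB requests per minute and the
    average amount of session data per request.
    """
    return hadb_requests_per_minute * avg_session_size_kb


# Hypothetical workload: 6,000 HADB requests per minute, 20 KB per request.
volume_kb = hadb_throughput_kb_per_minute(6000, 20)
print(volume_kb)
```

Substitute figures measured from your own benchmark runs before using the result for HADB sizing.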

Estimating Load on Application Server Instances

Consider the following factors to estimate the load on Application Server instances:

Maximum Number of Concurrent Users

Users interact with an application through a client, such as a web browser or Java program. Based on the user’s actions, the client periodically sends requests to the Application Server. A user is considered active as long as the user’s session has neither expired nor been terminated. When estimating the number of concurrent users, include all active users.

The following figure illustrates a typical graph of requests processed per minute (throughput) versus number of users. Initially, as the number of users increases, throughput increases correspondingly. However, as the number of concurrent requests increases, server performance begins to saturate, and throughput begins to decline.

Identify the point at which adding concurrent users reduces the number of requests that can be processed per minute. This point marks optimal performance; beyond it, throughput starts to degrade. Generally, strive to operate the system at optimal throughput as much as possible. You might need to add processing power to handle additional load and increase throughput.
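One way to locate this point from benchmark output is to pick the sample with the highest measured throughput. The following Python sketch assumes the benchmark results are available as (concurrent users, requests per minute) pairs; the function name and the sample data are hypothetical.

```python
def peak_throughput_point(samples):
    """Return the (concurrent_users, requests_per_minute) pair with the
    highest throughput.

    samples: iterable of (concurrent_users, requests_per_minute) tuples.
    """
    return max(samples, key=lambda sample: sample[1])


# Hypothetical benchmark results: throughput rises, then declines.
samples = [(100, 6000), (200, 11500), (400, 21000), (800, 30000), (1600, 27000)]
users, rpm = peak_throughput_point(samples)
print(users, rpm)
```

With real measurements, consider repeating each load level several times, because a single noisy sample can shift the apparent peak.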

Figure 2–1 Typical Profile of Throughput Versus Concurrent Users


Think Time

A user does not submit requests continuously. A user submits a request, the server receives and processes the request, and then returns a result, at which point the user spends some time before submitting a new request. The time between one request and the next is called think time.

Think time depends on the type of user. For example, machine-to-machine interaction, such as a web service call, typically has a shorter think time than a human user. You might have to consider a mix of machine and human interactions to estimate think time.

Determining the average think time is important. You can use this duration to calculate the number of requests that need to be completed per minute, as well as the number of concurrent users the system can support.
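When the workload mixes machine and human clients, one simple way to estimate the average think time is a weighted mean, weighting each client type by the number of requests it submits. The following Python sketch illustrates this approach; the weighting scheme, the function name, and the sample figures are assumptions rather than a method prescribed by this guide.

```python
def average_think_time(groups):
    """Weighted mean think time in seconds.

    groups: list of (request_count, think_time_seconds) tuples,
    one entry per client type.
    """
    total_requests = sum(count for count, _ in groups)
    if total_requests <= 0:
        raise ValueError("at least one request is required")
    weighted_sum = sum(count * think for count, think in groups)
    return weighted_sum / total_requests


# Hypothetical mix: 2,000 web service requests with a 0.5-second think time
# and 8,000 browser requests with a 10-second think time.
t_think = average_think_time([(2000, 0.5), (8000, 10.0)])
print(t_think)
```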

Average Response Time

Response time refers to the amount of time Application Server takes to return the results of a request to the user. The response time is affected by factors such as network bandwidth, number of users, number and type of requests submitted, and average think time.

In this section, response time refers to the mean, or average, response time. Each type of request has its own minimal response time. However, when evaluating system performance, base the analysis on the average response time of all requests.

The faster the response time, the more requests per minute are being processed. However, as the number of users on the system increases, the response time starts to increase as well, even though the number of requests per minute declines, as the following diagram illustrates:

Figure 2–2 Response Time with Increasing Number of Users


A system performance graph similar to this figure indicates that after a certain point, requests per minute are inversely proportional to response time. The sharper the decline in requests per minute, the steeper the increase in response time (represented by the dotted line arrow).

In the figure, peak load is the point at which requests per minute start to decline. Before this point, response time calculations are not necessarily accurate because they do not use peak numbers in the formula. After this point, because requests per minute and response time are inversely proportional, the administrator can calculate response time more accurately using the maximum number of users and requests per minute.

Use the following formula to determine Tresponse, the response time (in seconds) at peak load:

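Tresponse = n/r - Tthink

where n is the number of concurrent users, r is the number of requests per second that the server receives, and Tthink is the average think time in seconds.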


Example 2–1 Calculation of Response Time

If the following conditions exist:

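Maximum number of concurrent users, n, at peak load is 5,000.

Maximum number of requests, r, that the system processes at peak load is 1,000 per second.

Average think time, Tthink, is three seconds per request.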

Thus, the calculation of response time is:

Tresponse = n/r - Tthink = (5000/1000) - 3 sec. = 5 - 3 sec. = 2 sec.

Therefore, the response time is two seconds.
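The same calculation can be scripted so that it is easy to repeat with other benchmark figures. This is a minimal Python sketch; the function and parameter names are illustrative, with n as the number of concurrent users, r as requests per second, and t_think as the average think time in seconds.

```python
def response_time_seconds(n, r, t_think):
    """Response time at peak load: Tresponse = n/r - Tthink."""
    if r <= 0:
        raise ValueError("r must be a positive number of requests per second")
    return n / r - t_think


# Values from Example 2-1: 5,000 users, 1,000 requests per second,
# three seconds of think time.
print(response_time_seconds(5000, 1000, 3))  # 2.0
```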

After the system’s response time has been calculated, particularly at peak load, compare it to the acceptable response time for the application. Response time, along with throughput, is one of the main factors critical to the Application Server performance.

Requests Per Minute

If you know the number of concurrent users at any given time, the response time of their requests, and the average user think time, then you can calculate the number of requests per minute. Typically, start by estimating the number of concurrent users that are on the system.

For example, after running web site performance software, the administrator concludes that the average number of concurrent users submitting requests on an online banking web site is 3,000. This number depends on the number of users who have signed up to be members of the online bank, their banking transaction behavior, the time of the day or week they choose to submit requests, and so on.

Therefore, knowing this information enables you to use the requests per minute formula described in this section to calculate how many requests per minute your system can handle for this user base. Since requests per minute and response time become inversely proportional at peak load, decide if fewer requests per minute is acceptable as a trade-off for better response time, or alternatively, if a slower response time is acceptable as a trade-off for more requests per minute.

Experiment with the requests per minute and response time thresholds that are acceptable as a starting point for fine-tuning system performance. Thereafter, decide which areas of the system require adjustment.

Solving for r in the equation in the previous section gives:

r = n/(Tresponse + Tthink)

Example 2–2 Calculation of Requests Per Second

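For the values:

n = 2,800 concurrent users

Tresponse = 1 second (average response time per request)

Tthink = 3 seconds (average think time)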

The calculation for the number of requests per second is:

r = 2800 / (1+3) = 700

Therefore, the number of requests per second is 700 and the number of requests per minute is 42,000.
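The following Python sketch reproduces this calculation and converts the result to requests per minute. The function and variable names are illustrative assumptions; the input values are those used in Example 2-2.

```python
def requests_per_second(n, t_response, t_think):
    """Requests per second: r = n / (Tresponse + Tthink)."""
    cycle_time = t_response + t_think
    if cycle_time <= 0:
        raise ValueError("response time plus think time must be positive")
    return n / cycle_time


# 2,800 concurrent users, 1-second response time, 3-second think time.
r = requests_per_second(2800, 1, 3)
requests_per_minute = r * 60
print(r, requests_per_minute)  # 700.0 42000.0
```

Rerunning the function with a different target response time shows the trade-off described in this section: a shorter acceptable response time for the same number of users requires the system to sustain more requests per second.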

Estimating Load on the HADB

To calculate load on the HADB, consider the following factors:

For instructions on configuring session persistence, see Chapter 9, Configuring High Availability Session Persistence and Failover, in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

HTTP Session Persistence Frequency

The number of requests per minute received by the HADB depends on the persistence frequency. Persistence frequency determines how often Application Server saves HTTP session data to the HADB.

The persistence frequency options are:

The following table summarizes the advantages and disadvantages of persistence frequency options.

Table 2–1 Comparison of Persistence Frequency Options

Persistence Frequency Option: web-method

Advantage: Guarantees that the most up-to-date session information is available.

Disadvantage: Potentially increased response time and reduced throughput.

Persistence Frequency Option: time-based

Advantage: Better response time and potentially better throughput.

Disadvantage: Less guarantee that the most up-to-date session information is available after the failure of an Application Server instance.

HTTP Session Size and Scope

The session size per request depends on the amount of session information stored in the session.

Tip: To improve overall performance, reduce the amount of information in the session as much as possible.

It is possible to fine-tune the session size per request through the persistence scope settings. Choose from the following options for HTTP session persistence scope:

To use this option, the application must:

Table 2–2 Comparison of Persistence Scope Options

Persistence Scope Option: modified-session

Advantage: Provides improved response time for requests that do not modify session state.

Disadvantage: During the execution of a web method, typically doGet() or doPost(), the application must call a session method:

  • setAttribute() if the attribute was changed

  • removeAttribute() if the attribute was removed

Persistence Scope Option: session

Advantage: No constraint on applications.

Disadvantage: Potentially poorer throughput and response time compared to the modified-session and modified-attribute options.

Persistence Scope Option: modified-attribute

Advantage: Better throughput and response time for requests in which the percentage of session state modified is low.

Disadvantage: As the percentage of session state modified for a given request nears 60%, throughput and response time degrade. In such cases, performance is worse than with the other options because of the overhead of splitting the attributes into separate records.

Stateful Session Bean Checkpointing

For SFSB session persistence, the load on HADB depends on the following:

Checkpointing generally occurs after any transaction involving the SFSB is completed (even if the transaction rolls back).

For better performance, specify a small set of methods for checkpointing. The size of the data that is being checkpointed and the frequency of checkpointing determine the additional overhead in response time for a given client interaction.