Estimating Load on Application Server Instances (Sun Java System Application Server 9.1 Deployment Planning Guide)

Sun Java System Application Server 9.1 Deployment Planning Guide

Estimating Load on Application Server Instances

Consider the following factors to estimate the load on Application Server instances:

Maximum Number of Concurrent Users

Users interact with an application through a client, such as a web browser or Java program. Based on the user’s actions, the client periodically sends requests to the Application Server. A user is considered active as long as the user’s session has neither expired nor been terminated. When estimating the number of concurrent users, include all active users.

The following figure illustrates a typical graph of requests processed per minute (throughput) versus number of users. Initially, as the number of users increases, throughput increases correspondingly. However, as the number of concurrent requests increases, server performance begins to saturate, and throughput begins to decline.

Identify the point at which adding concurrent users reduces the number of requests that can be processed per minute. This point indicates when optimal performance is reached and beyond which throughput start to degrade. Generally, strive to operate the system at optimal throughput as much as possible. You might need to add processing power to handle additional load and increase throughput.

Think Time

A user does not submit requests continuously. A user submits a request, the server receives and processes the request, and then returns a result, at which point the user spends some time before submitting a new request. The time between one request and the next is called think time.

Think times are dependent on the type of users. For example, machine-to-machine interaction such as for a web service typically has a lower think time than that of a human user. You may have to consider a mix of machine and human interactions to estimate think time.

Determining the average think time is important. You can use this duration to calculate the number of requests that need to be completed per minute, as well as the number of concurrent users the system can support.

Average Response Time

Response time refers to the amount of time Application Server takes to return the results of a request to the user. The response time is affected by factors such as network bandwidth, number of users, number and type of requests submitted, and average think time.

In this section, response time refers to the mean, or average, response time. Each type of request has its own minimal response time. However, when evaluating system performance, base the analysis on the average response time of all requests.

The faster the response time, the more requests per minute are being processed. However, as the number of users on the system increases, the response time starts to increase as well, even though the number of requests per minute declines, as the following diagram illustrates:

Sorry: graphics not available in this version of the manual.

A system performance graph similar to this figure indicates that after a certain point, requests per minute are inversely proportional to response time. The sharper the decline in requests per minute, the steeper the increase in response time (represented by the dotted line arrow).

In the figure, the point of the peak load is the point at which requests per minute start to decline. Prior to this point, response time calculations are not necessarily accurate because they do not use peak numbers in the formula. After this point, (because of the inversely proportional relationship between requests per minute and response time), the administrator can more accurately calculate response time using maximum number of users and requests per minute.

Use the following formula to determine T_response, the response time (in seconds) at peak load:

T_response = n/r - T_think

where

n is the number of concurrent users
r is the number requests per second the server receives
T_think is the average think time (in seconds)

To obtain an accurate response time result, always include think time in the equation.

Example 2–1 Calculation of Response Time

If the following conditions exist:

Maximum number of concurrent users, n, that the system can support at peak load is 5,000.
Maximum number of requests, r, the system can process at peak load is 1,000 per second.

Average think time, T_think, is three seconds per request.

Thus, the calculation of response time is:

T_response = n/r - T_think = (5000/ 1000) - 3 sec. = 5 - 3 sec.

Therefore, the response time is two seconds.

After the system’s response time has been calculated, particularly at peak load, compare it to the acceptable response time for the application. Response time, along with throughput, is one of the main factors critical to the Application Server performance.

Requests Per Minute

If you know the number of concurrent users at any given time, the response time of their requests, and the average user think time, then you can calculate the number of requests per minute. Typically, start by estimating the number of concurrent users that are on the system.

For example, after running web site performance software, the administrator concludes that the average number of concurrent users submitting requests on an online banking web site is 3,000. This number depends on the number of users who have signed up to be members of the online bank, their banking transaction behavior, the time of the day or week they choose to submit requests, and so on.

Therefore, knowing this information enables you to use the requests per minute formula described in this section to calculate how many requests per minute your system can handle for this user base. Since requests per minute and response time become inversely proportional at peak load, decide if fewer requests per minute is acceptable as a trade-off for better response time, or alternatively, if a slower response time is acceptable as a trade-off for more requests per minute.

Experiment with the requests per minute and response time thresholds that are acceptable as a starting point for fine-tuning system performance. Thereafter, decide which areas of the system require adjustment.

Solving for r in the equation in the previous section gives:

r = n/(T_response + T_think)

Example 2–2 Calculation of Requests Per Second

For the values:

n = 2,800 concurrent users
T_response = 1 (one second per request average response time)
T_think = 3, (three seconds average think time)

The calculation for the number of requests per second is:

r = 2800 / (1+3) = 700

Therefore, the number of requests per second is 700 and the number of requests per minute is 42000.