At its simplest, high performance means maximizing throughput and reducing response time. Beyond these basic goals, you can establish specific goals by determining the following:
What types of applications and services are deployed, and how do clients access them?
Which applications and services need to be highly available?
Do the applications have session state or are they stateless?
What request capacity or throughput must the system support?
How many concurrent users must the system support?
What is an acceptable average response time for user requests?
What is the average think time between requests?
You can calculate some of these metrics using a remote browser emulator (RBE) tool, or web site performance and benchmarking software that simulates expected application activity. Typically, RBE and benchmarking products generate concurrent HTTP requests and then report the response time for a given number of requests per minute. You can then use these figures to calculate server activity.
The results of the calculations described in this chapter are not absolute. Treat them as reference points to work against, as you fine-tune the performance of the Enterprise Server and your applications.
In broad terms, throughput measures the amount of work performed by Enterprise Server. For Enterprise Server, throughput can be defined as the number of requests processed per minute per server instance. High availability applications also impose throughput requirements on HADB, since they save session state data periodically. For HADB, throughput can be defined as the volume of session data stored per minute, which is the product of the number of HADB requests per minute and the average session size per request.
As described in the next section, Enterprise Server throughput is a function of many factors, including the nature and size of user requests, number of users, and performance of Enterprise Server instances and back-end databases. You can estimate throughput on a single machine by benchmarking with simulated workloads.
High availability applications incur additional overhead because they periodically save data to HADB. The amount of overhead depends on the amount of data, how frequently it changes, and how often it is saved. The first two factors depend on the application in question; the latter is also affected by server settings.
HADB throughput can be defined as the number of HADB requests per minute multiplied by the average amount of data per request. Larger throughput to HADB implies that more HADB nodes and a larger store size are needed.
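The HADB throughput definition above reduces to a simple multiplication. A minimal sketch in Python; the request rate and session size used here are hypothetical illustration values, not measurements:

```python
def hadb_throughput(requests_per_minute, avg_session_size_bytes):
    """Session data stored per minute: HADB requests/min x average session size."""
    return requests_per_minute * avg_session_size_bytes

# Hypothetical workload: 42,000 HADB requests/min, 8 KB average session data.
stored_per_minute = hadb_throughput(42_000, 8 * 1024)
print(stored_per_minute / (1024 * 1024), "MB of session data per minute")
```

Estimates like this help size the HADB store and decide how many HADB nodes are needed.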
Consider the following factors to estimate the load on Enterprise Server instances:
Users interact with an application through a client, such as a web browser or Java program. Based on the user’s actions, the client periodically sends requests to the Enterprise Server. A user is considered active as long as the user’s session has neither expired nor been terminated. When estimating the number of concurrent users, include all active users.
Initially, as the number of users increases, throughput increases correspondingly. However, as the number of concurrent requests increases, server performance begins to saturate, and throughput begins to decline.
Identify the point at which adding concurrent users reduces the number of requests that can be processed per minute. This point indicates when optimal performance is reached, beyond which throughput starts to degrade. Generally, strive to operate the system at optimal throughput as much as possible. You might need to add processing power to handle additional load and increase throughput.
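One way to locate that point is to scan benchmark results for the user count beyond which throughput stops increasing. A sketch, assuming you have (concurrent users, requests per minute) pairs from an RBE or benchmarking run; the sample data below is hypothetical:

```python
def peak_load_point(samples):
    """samples: list of (concurrent_users, requests_per_minute) pairs,
    ordered by increasing user count. Returns the pair with the highest
    throughput -- the point beyond which adding users degrades it."""
    return max(samples, key=lambda pair: pair[1])

# Hypothetical benchmark data: throughput saturates around 5,000 users.
data = [(1000, 12000), (2500, 28000), (5000, 42000), (7500, 39000)]
users, rpm = peak_load_point(data)
print(f"Optimal load: {users} users at {rpm} requests/min")
```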
A user does not submit requests continuously. A user submits a request, the server receives and processes the request, and then returns a result, at which point the user spends some time before submitting a new request. The time between one request and the next is called think time.
Think times are dependent on the type of users. For example, machine-to-machine interaction such as for a web service typically has a lower think time than that of a human user. You may have to consider a mix of machine and human interactions to estimate think time.
Determining the average think time is important. You can use this duration to calculate the number of requests that need to be completed per minute, as well as the number of concurrent users the system can support.
Response time refers to the amount of time Enterprise Server takes to return the results of a request to the user. The response time is affected by factors such as network bandwidth, number of users, number and type of requests submitted, and average think time.
In this section, response time refers to the mean, or average, response time. Each type of request has its own minimal response time. However, when evaluating system performance, base the analysis on the average response time of all requests.
The faster the response time, the more requests per minute are being processed. However, as the number of users on the system increases beyond a certain point, response time increases while the number of requests per minute declines.
A system performance graph similar to this figure indicates that after a certain point, requests per minute are inversely proportional to response time. The sharper the decline in requests per minute, the steeper the increase in response time (represented by the dotted line arrow).
In the figure, the peak load is the point at which requests per minute start to decline. Prior to this point, response time calculations are not necessarily accurate because they do not use peak numbers in the formula. Beyond this point, because requests per minute and response time are inversely proportional, the administrator can more accurately calculate response time using the maximum number of users and requests per minute.
Use the following formula to determine Tresponse, the response time (in seconds) at peak load:
Tresponse = n/r - Tthink
where
n is the number of concurrent users
r is the number of requests per second the server receives
Tthink is the average think time (in seconds)
To obtain an accurate response time result, always include think time in the equation.
If the following conditions exist:
Maximum number of concurrent users, n, that the system can support at peak load is 5,000.
Maximum number of requests, r, the system can process at peak load is 1,000 per second.
Average think time, Tthink, is three seconds per request.
Thus, the calculation of response time is:
Tresponse = n/r - Tthink = (5000 / 1000) - 3 = 5 - 3 = 2 seconds
Therefore, the response time is two seconds.
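The worked example above can be checked mechanically. A minimal sketch in Python of the Tresponse = n/r - Tthink formula, using the numbers from the example:

```python
def response_time(n_users, requests_per_second, think_time_s):
    """Average response time (seconds) at peak load: n/r - Tthink."""
    return n_users / requests_per_second - think_time_s

# Values from the example: 5,000 users, 1,000 requests/sec, 3 s think time.
print(response_time(5000, 1000, 3))  # 2.0 seconds
```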
After the system’s response time has been calculated, particularly at peak load, compare it to the acceptable response time for the application. Response time, along with throughput, is one of the main factors critical to Enterprise Server performance.
If you know the number of concurrent users at any given time, the response time of their requests, and the average user think time, then you can calculate the number of requests per minute. Typically, start by estimating the number of concurrent users that are on the system.
For example, after running web site performance software, the administrator concludes that the average number of concurrent users submitting requests on an online banking web site is 3,000. This number depends on the number of users who have signed up to be members of the online bank, their banking transaction behavior, the time of the day or week they choose to submit requests, and so on.
Knowing this information enables you to use the requests per minute formula described in this section to calculate how many requests per minute your system can handle for this user base. Since requests per minute and response time become inversely proportional at peak load, decide if fewer requests per minute is acceptable as a trade-off for better response time, or alternatively, if a slower response time is acceptable as a trade-off for more requests per minute.
Experiment with the requests per minute and response time thresholds that are acceptable as a starting point for fine-tuning system performance. Thereafter, decide which areas of the system require adjustment.
Solving for r in the equation in the previous section gives:
r = n/(Tresponse + Tthink)
For the values:
n = 2,800 concurrent users
Tresponse = 1 (one second per request average response time)
Tthink = 3 (three seconds average think time)
The calculation for the number of requests per second is:
r = 2800 / (1+3) = 700
Therefore, the number of requests per second is 700 and the number of requests per minute is 42,000.
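The rearranged formula can be checked in code as well. A sketch using the values above:

```python
def requests_per_second(n_users, response_time_s, think_time_s):
    """r = n / (Tresponse + Tthink)."""
    return n_users / (response_time_s + think_time_s)

# Values from the example: 2,800 users, 1 s response time, 3 s think time.
r = requests_per_second(2800, 1, 3)
print(r, "requests/sec =", r * 60, "requests/min")  # 700.0 and 42000.0
```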
To calculate load on the HADB, consider the following factors:
For instructions on configuring session persistence, see Chapter 9, Configuring High Availability Session Persistence and Failover, in Sun GlassFish Enterprise Server v2.1.1 High Availability Administration Guide.
The number of requests per minute received by the HADB depends on the persistence frequency, which determines how often Enterprise Server saves HTTP session data to the HADB.
The persistence frequency options are:
web-method (default): the server stores session data with every HTTP response. This option guarantees that stored session information will be up to date, but leads to high traffic to HADB.
time-based: the session is stored at the specified time interval. This option reduces the traffic to HADB, but does not guarantee that the session information will be up to date.
The following table summarizes the advantages and disadvantages of persistence frequency options.
Table 2–1 Comparison of Persistence Frequency Options
| Persistence Frequency Option | Advantages | Disadvantages |
|---|---|---|
| web-method | Guarantees that the most up-to-date session information is available. | Potentially increased response time and reduced throughput. |
| time-based | Better response time and potentially better throughput. | Less guarantee that the most up-to-date session information is available after the failure of an application server instance. |
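The traffic difference between the two options can be estimated roughly: under web-method, HADB receives one store per HTTP response, while under time-based it receives roughly one store per active session per interval. A sketch with hypothetical numbers:

```python
def web_method_stores_per_minute(http_responses_per_minute):
    """web-method: session data is stored with every HTTP response."""
    return http_responses_per_minute

def time_based_stores_per_minute(active_sessions, interval_seconds):
    """time-based: each active session is stored once per interval."""
    return active_sessions * (60 / interval_seconds)

# Hypothetical: 42,000 responses/min, 3,000 active sessions, 60 s interval.
print(web_method_stores_per_minute(42_000))    # 42000 stores/min
print(time_based_stores_per_minute(3000, 60))  # 3000.0 stores/min
```

Comparisons like this show why time-based persistence reduces HADB load at the cost of less up-to-date stored sessions.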
The session size per request depends on the amount of session information stored in the session.
To improve overall performance, reduce the amount of information in the session as much as possible.
It is possible to fine-tune the session size per request through the persistence scope settings. Choose from the following options for HTTP session persistence scope:
session: The server serializes and saves the entire session object every time it saves session information to HADB.
modified-session: The server saves the session only if the session has been modified. It detects modification by intercepting calls to the session’s setAttribute() method. This option does not detect direct modifications to inner objects, so in such cases the application must call setAttribute() explicitly.
modified-attribute: The server saves only those attributes that have been modified (inserted, updated, or deleted) since the last time the session was stored. This has the same drawback as modified-session but can significantly reduce HADB write throughput requirements if properly applied.
To use this option, the application must:
Call setAttribute() or removeAttribute() every time it modifies session state.
Make sure there are no cross references between attributes.
Distribute the session state across multiple attributes, or at least between a read-only attribute and a modifiable attribute.
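The potential write savings of modified-attribute can be estimated by comparing a full-session write with a write of only the changed attributes. A sketch with hypothetical attribute sizes:

```python
def session_scope_write(attr_sizes):
    """session scope: the entire serialized session is written every store."""
    return sum(attr_sizes.values())

def modified_attribute_write(attr_sizes, modified_names):
    """modified-attribute scope: only attributes changed since the last
    store are written."""
    return sum(attr_sizes[name] for name in modified_names)

# Hypothetical session: a large read-only profile plus small mutable state.
attrs = {"profile": 6 * 1024, "cart": 512, "prefs": 256}
print(session_scope_write(attrs))                 # 6912 bytes per store
print(modified_attribute_write(attrs, ["cart"]))  # 512 bytes per store
```

This illustrates why distributing session state across attributes, with read-only data separated from modifiable data, pays off under modified-attribute scope.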
The following table summarizes the advantages and disadvantages of the persistence scope options.

| Persistence Scope Option | Advantages | Disadvantages |
|---|---|---|
| session | The entire session object is saved, so stored state is always complete. | The server serializes and saves the entire session on every store, increasing the session size per request. |
| modified-session | Saves the session only if it has been modified, reducing the amount of data stored. | Does not detect direct modifications to inner objects; the application must call setAttribute() explicitly in such cases. |
| modified-attribute | Saves only modified attributes, which can significantly reduce HADB write throughput. | Same inner-object drawback as modified-session; the application must follow the coding requirements listed above. |
For SFSB session persistence, the load on HADB depends on the following:
Number of SFSBs enabled for checkpointing.
Which SFSB methods are selected for checkpointing, and how often they are used.
Size of the session object.
Which methods are transactional.
Checkpointing generally occurs after any transaction involving the SFSB is completed (even if the transaction rolls back).
For better performance, specify a small set of methods for checkpointing. The size of the data that is being checkpointed and the frequency of checkpointing determine the additional overhead in response time for a given client interaction.
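The checkpointing overhead described above can be expressed as a rough estimate: the data volume written to HADB is the checkpoint size times how often checkpoints occur. A sketch with hypothetical figures (the transaction rate, checkpointed fraction, and checkpoint size are illustration values):

```python
def checkpoint_load_per_minute(transactions_per_minute, checkpoint_fraction,
                               avg_checkpoint_size_bytes):
    """Approximate SFSB checkpoint volume written to HADB per minute.
    checkpoint_fraction: the share of transactions that complete a
    checkpointed method (checkpointing occurs after such transactions,
    even if they roll back)."""
    return (transactions_per_minute * checkpoint_fraction
            * avg_checkpoint_size_bytes)

# Hypothetical: 6,000 transactions/min, 25% checkpointed, 4 KB per checkpoint.
load = checkpoint_load_per_minute(6000, 0.25, 4 * 1024)
print(load / (1024 * 1024), "MB/min of SFSB checkpoint data")
```

Reducing the set of checkpointed methods or the size of the session object lowers this figure directly, which is why both are tuning levers.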