Sun Java System Application Server Enterprise Edition 8.2 Performance Tuning Guide


Chapter 6 Tuning for High-Availability

This chapter discusses the following topics:

Tuning HADB

The Application Server uses the high-availability database (HADB) to store persistent session state data. To optimize performance, tune the HADB according to the load of the Application Server. The data volume, transaction frequency, and size of each transaction can affect the performance of the HADB, and consequently the performance of Application Server.

This section discusses the following topics:

Disk Use

This section discusses how to calculate HADB data device size and explains the use of separate disks for multiple data devices.

Calculating HADB Data Device Size

When you create the HADB database, specify the number and size of the data devices. These devices must have room for all the user data to be stored. In addition, allocate extra space to account for internal overhead, as discussed in the following section.

If the database runs out of device space, the HADB returns error codes 4593 or 4592 to the Application Server.

Note –

See Sun Java System Application Server Enterprise Edition 8.2 Error Message Reference for more information on these error messages.

HADB also writes these error messages to the history files. When device space is exhausted, HADB blocks any client requests to insert or update data. However, it still accepts delete operations.

HADB stores session states as binary data. It serializes the session state and stores it as a BLOB (binary large object). It splits each BLOB into chunks of approximately 7KB each and stores each chunk as a database row (in this context, row is synonymous with tuple or record) in pages of 16KB.

There is some small memory overhead for each row (approximately 30 bytes). With the most compact allocation of rows (BLOB chunks), two rows are stored in a page. Internal fragmentation can result in each page containing only one row. On average, 50% of each page contains user data.
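
The chunking arithmetic above can be sketched in a few lines of Python. This is an illustrative estimate only: the function name is hypothetical, and the 7000-byte chunk size is an assumption standing in for the approximate 7KB figure.

```python
import math

CHUNK_BYTES = 7000      # assumed stand-in for the approximate 7KB chunk size
PAGE_BYTES = 16 * 1024  # 16KB pages, as described above


def rows_and_pages(session_bytes):
    """Return (rows, min_pages, max_pages) for one serialized session."""
    rows = math.ceil(session_bytes / CHUNK_BYTES)
    min_pages = math.ceil(rows / 2)  # most compact case: two rows per page
    max_pages = rows                 # fragmented case: one row per page
    return rows, min_pages, max_pages


rows, low, high = rows_and_pages(24 * 1024)
print(rows, low * PAGE_BYTES, high * PAGE_BYTES)
```

Under these assumptions, a 24KB session is split into four rows that occupy between two and four pages.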

For availability in case of node failure, HADB always replicates user data. An HADB node stores its own data, plus a copy of the data from its mirror node. Hence, all data is stored twice. Since 50% of the space on a node is user data (on average), and each node is mirrored, the data devices must have space for at least four times the volume of the user data.

In the case of data refragmentation, HADB keeps both the old and the new versions of a table while the refragmentation operation is running. All application requests are performed on the old table while the new table is being created. Assuming that the database is primarily used for one huge table containing BLOB data for session states, this means the device space requirement must be multiplied by another factor of two. Consequently, if you add nodes to a running database, and want to refragment the data to use all nodes, you must have eight times the volume of user data available.

You must also account for the device space that HADB reserves for its internal use (four times the LogBufferSize). HADB uses this disk space for temporary storage of the log buffer during high load conditions.
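
As a rough sizing aid, the factors described in this section can be combined as follows. The sketch is written in Python; the function and parameter names are hypothetical, and it assumes that user data is spread evenly across nodes and that the internal reserve of four times LogBufferSize applies to each node.

```python
def device_space_per_node_mb(user_data_mb, nodes, log_buffer_mb,
                             refragment=False):
    """Estimate the data device space needed on each node, in MB."""
    # 50% average page fill (x2) and mirroring (x2) give a factor of four.
    factor = 4
    if refragment:
        # Old and new table versions coexist during refragmentation (x2).
        factor *= 2
    # Assumption: user data is spread evenly across all nodes.
    data_mb = factor * user_data_mb / nodes
    # Internal reserve of four times the log buffer size (assumed per node).
    reserve_mb = 4 * log_buffer_mb
    return data_mb + reserve_mb


# Hypothetical input values, chosen only to illustrate the arithmetic.
print(device_space_per_node_mb(1000, nodes=4, log_buffer_mb=48))
print(device_space_per_node_mb(1000, nodes=4, log_buffer_mb=48,
                               refragment=True))
```

Treat the result as a lower bound and compare it against the hadbm deviceinfo output on a running system.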

Tuning Data Device Size

To increase the size of the HADB data devices, use the following command:

hadbm set TotalDatadeviceSizePerNode

This command restarts all the nodes, one by one, to apply the change. For more information on using this command, see Configuring HADB in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

Note –

hadbm does not add data devices to a running database instance.

Placing HADB files on Physical Disks

For best performance, data devices should be allocated on separate physical disks. This applies if there are nodes with more than one data device, or if there are multiple nodes on the same host.

Place devices belonging to different nodes on different physical disks. Doing this is especially important for Red Hat AS 2.1, because HADB nodes have been observed to wait for asynchronous I/O when the same disk is used for devices belonging to more than one node.

An HADB node writes information, warnings, and errors to the history file synchronously, rather than asynchronously, as output devices normally do. Therefore, HADB behavior and performance can be affected any time the disk waits when writing to the history file. This situation is indicated by the following message in the history file:

BEWARE - last flush/fputs took too long

To avoid this problem, keep the HADB executable files and the history files on physical disks different from those of the data devices.

Memory Allocation

It is essential to allocate sufficient memory for HADB, especially when it is co-located with other processes.

The HADB Node Supervisor Process (NSUP) tracks the time elapsed since the last time it performed monitoring. If the time exceeds a specified maximum (2500 ms, by default), NSUP restarts the node. This situation is likely when other processes in the system compete for memory, causing swapping and multiple page faults. When the blocked node restarts, all active transactions on that node are aborted.

If Application Server throughput slows and requests abort or time out, make sure that swapping is not the cause. To monitor swapping activity on Unix systems, use this command:

vmstat -S

In addition, look for this message in the HADB history files. It is written when the HADB node is restarted, where M is greater than N:

Process blocked for .M. sec, max block time is .N. sec

Aborted transactions are signaled by one of the following error messages:

HADB00224: Transaction timed out
HADB00208: Transaction aborted

Performance

For best performance, all HADB processes (clu_xxx_srv) must fit in physical memory. They should not be paged or swapped. The same applies for shared memory segments in use.

You can configure the size of some of the shared memory segments. If these segments are too small, performance suffers, and user transactions are delayed or even aborted. If the segments are too large, then the physical memory is wasted.

You can configure the following parameters:

DataBufferPoolSize

The HADB stores data on data devices, which are allocated on disks. The data must be in the main memory before it can be processed. The HADB node allocates a portion of shared memory for this purpose. If the allocated database buffer is small compared to the data being processed, then disk I/O will waste significant processing capacity. In a system with write-intensive operations (for example, frequently updated session states), the database buffer must be big enough that the processing capacity used for disk I/O does not hamper request processing.

The database buffer is similar to a cache in a file system. For good performance, the cache must be used as much as possible, so there is no need to wait for a disk read operation. The best performance is when the entire database contents fit in the database buffer. However, in most cases, this is not feasible. Aim to have the “working set” of the client applications in the buffer.

Also monitor the disk I/O. If HADB performs many disk read operations, this means that the database is low on buffer space. The database buffer is partitioned into blocks of size 16KB, the same block size used on the disk. HADB schedules multiple blocks for reading and writing in one I/O operation.

Use the hadbm deviceinfo command to monitor disk use. For example, hadbm deviceinfo --details will produce output similar to this:

NodeNo   TotalSize       FreeSize        Usage
0        512             504             1%
1        512             504             1%

The columns in the output are:

TotalSize: Size of device, in MB.
FreeSize: Free size, in MB.
Usage: Percent used.

Use the hadbm resourceinfo command to monitor resource usage. For example, the following command displays data buffer pool information:

%hadbm resourceinfo --databuf
NodeNo   Avail     Free     Access          Misses          Copy-on-write
0        32        0        205910260       8342738         400330
1        32        0        218908192       8642222         403466

The columns in the output are:

Avail: Size of buffer, in Mbytes.
Free: Free size, when the data volume is larger than the buffer. (The entire buffer is used at all times.)
Access: Number of times that blocks have been accessed in the buffer.
Misses: Number of block requests that “missed the cache” (user had to wait for a disk read).
Copy-on-write: Number of times the block has been modified while it is being written to disk.

For a well-tuned system, the number of misses (and hence the number of reads) must be very small compared to the number of writes. The example numbers above show a miss rate of about 4% (roughly 200 million accesses and 8 million misses). The acceptability of these figures depends on the client application requirements.
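
The miss rate quoted above can be reproduced from the Access and Misses columns. The following Python sketch uses the figures from the example output; the function name is hypothetical.

```python
def miss_rate(accesses, misses):
    """Fraction of buffer accesses that had to wait for a disk read."""
    return misses / accesses


# (Access, Misses) per node, copied from the example output above.
example = {
    0: (205910260, 8342738),
    1: (218908192, 8642222),
}

for node, (accesses, misses) in example.items():
    print(f"node {node}: miss rate {miss_rate(accesses, misses):.2%}")
```

Both nodes come out at roughly 4%, which matches the figure given in the text.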

Tuning DataBufferPoolSize

To change the size of the database buffer, use the following command:

hadbm set DataBufferPoolSize

This command restarts all the nodes, one by one, for the change to take effect. For more information on using this command, see Configuring HADB in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

LogBufferSize

Before it executes them, HADB logs all operations that modify the database, such as inserting, deleting, updating, or reading data. It places log records describing the operations in a portion of shared memory referred to as the (tuple) log buffer. HADB uses these log records for undoing operations when transactions are aborted, for recovery in case of node crash, and for replication between mirror nodes.

The log records remain in the buffer until they are processed locally and shipped to the mirror node. The log records are kept until the outcome (commit or abort) of the transaction is certain. If the HADB node runs low on tuple log, the user transactions are delayed, and possibly timed out.

Tuning LogBufferSize

Begin with the default value. Look for HIGH LOAD informational messages in the history files. All the relevant messages will contain tuple log or simply log, and a description of the internal resource contention that occurred.

Under normal operation the log is reported as 70 to 80% full. This is because space reclamation is said to be “lazy.” HADB requires as much data in the log as possible, to recover from a possible node crash.

Use the following command to display information on log buffer size and use:

hadbm resourceinfo --logbuf

For example, output might look like this:

Node No.     Avail         Free Size
0            44            42
1            44            42

The columns in the output are:

Node No.: The node number.
Avail: Size of buffer, in megabytes.
Free Size: Free size, in MB, when the data volume is larger than the buffer. The entire buffer is used at all times.

Change the size of the log buffer with the following command:
```
hadbm set LogbufferSize
```
This command restarts all the nodes, one by one, for the change to take effect. For more information on using this command, see Configuring HADB in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

InternalLogbufferSize

The node internal log (nilog) contains information about physical (as opposed to logical, row-level) operations at the local node. For example, it records disk block allocations and deallocations, and B-tree block splits. This buffer is maintained in shared memory, and is also checkpointed to disk (a separate log device) at regular intervals. The page size of this buffer and of the associated data device is 4096 bytes.

Large BLOBs necessarily allocate many disk blocks, and thus create a high load on the node internal log. This is normally not a problem, since each entry in the nilog is small.

Tuning InternalLogbufferSize

Begin with the default value. Look out for HIGH LOAD informational messages in the history files. The relevant messages contain nilog, and a description of the internal resource contention that occurred.

Use the following command to display node internal log buffer information:

hadbm resourceinfo --nilogbuf

For example, the output might look something like this:

Node No.     Avail         Free Size
0            11            11
1            11            11

To change the size of the nilog buffer, use the following command:

hadbm set InternalLogbufferSize

This command restarts all the nodes, one by one, for the change to take effect. For more information on using this command, see Configuring HADB in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

Note –

If the size of the nilog buffer is changed, the associated log device (located in the same directory as the data devices) also changes. The size of the internal log buffer must be equal to the size of the internal log device. The command hadbm set InternalLogbufferSize enforces this requirement. It stops a node, increases the InternalLogbufferSize, reinitializes the internal log device, and brings up the node. This sequence is performed on all nodes.

NumberOfLocks

Each row-level operation requires a lock in the database. Locks are held until a transaction commits or rolls back. Locks are set at the row (BLOB chunk) level, which means that a large session state requires many locks. Locks are needed for both primary and mirror node operations. Hence, a BLOB operation allocates the same number of locks on two HADB nodes.

When a table refragmentation is performed, HADB needs extra lock resources. Thus, ordinary user transactions can only acquire half of the locks allocated.

If the HADB node has no lock objects available, errors are written to the log file. For more information, see Chapter 14, HADB Error Messages, in Sun Java System Application Server Enterprise Edition 8.2 Error Message Reference.

Calculating the number of locks

To calculate the number of locks needed, estimate the following parameters:

Number of concurrent users that request session data to be stored in HADB (one session record per user)
Maximum size of the BLOB session
Persistence scope (maximum session data size in the case of session or modified-session scope, and maximum number of attributes in the case of modified-attribute scope). The modified-attribute scope requires setAttribute() to be called every time the session data is modified.

If:
x is the maximum number of concurrent users, that is, x session data records are present in the HADB, and
y is the session size (for session/modified session) or attribute size (for modified attribute),

Then the number of records written to HADB is:

xy/7000 + 2x

Record operations such as insert, delete, update and read will use one lock per record.

Note –

Locks are held for both primary records and hot-standby records. Hence, for insert, update and delete operations a transaction will need twice as many locks as the number of records. Read operations need locks only on the primary records. During refragmentation and creation of secondary indices, log records for the involved table are also sent to the fragment replicas being created. In that case, a transaction needs four times as many locks as the number of involved records. (Assuming all queries are for the affected table.)

Summary

If refragmentation is performed, the number of locks to be configured is:

N_locks = 4x (y/7000 + 2) = 2xy/3500 + 8x

Otherwise, the number of locks to be configured is:

N_locks = 2x (y/7000 + 2) = xy/3500 + 4x
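
The lock formulas in this section can be evaluated with a short Python sketch. The function names are hypothetical; the constants come directly from the formulas above (one record per 7000 bytes of session data, plus two records per user, multiplied by two or by four).

```python
import math

CHUNK_BYTES = 7000  # divisor used in the record-count formula above


def records_written(users, session_bytes):
    """Number of records written to HADB: xy/7000 + 2x."""
    return users * session_bytes / CHUNK_BYTES + 2 * users


def locks_needed(users, session_bytes, refragmentation=False):
    """Locks to configure: 2x(y/7000 + 2), or 4x(y/7000 + 2) when
    refragmentation is performed. Rounded up to a whole lock."""
    multiplier = 4 if refragmentation else 2
    return math.ceil(multiplier * records_written(users, session_bytes))


# Hypothetical workload: 1000 concurrent users, 14000-byte sessions.
print(locks_needed(1000, 14000))
print(locks_needed(1000, 14000, refragmentation=True))
```

For this hypothetical workload, the sketch yields 8000 locks without refragmentation and 16000 with it.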

Tuning NumberOfLocks

Start with the default value. Look for exceptions with the indicated error codes in the Application Server log files. Remember that under normal operations (no ongoing refragmentation) only half of the locks might be acquired by the client application.

To get information on allocated locks and locks in use, use the following command:

hadbm resourceinfo --locks

For example, the output displayed by this command might look something like this:

Node No.     Avail             Free            Waits
0            50000             50000           na
1            50000             50000           na

Avail: Number of locks available.
Free: Number of locks not in use.
Waits: Number of transactions that have waited for a lock. The value is “na” (not applicable) if all locks are available.

To change the number of locks, use the following command:
```
hadbm set NumberOfLocks
```
This command restarts all the nodes, one by one, for the change to take effect. For more information on using this command, see Configuring HADB in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

Timeouts

This section describes some of the timeout values that affect performance.

JDBC connection pool timeouts

These values govern how much time the server waits for a connection from the pool before it times out. In most cases, the default values work well. For detailed tuning information, see Tuning JDBC Connection Pools.

Load Balancer timeouts

Some values that may affect performance are:

response-timeout-in-seconds - The time for which the load balancer plug-in waits for a response before it declares an instance dead and fails over to the next instance in the cluster. Make this value large enough to accommodate the maximum latency for a request from the server instance under the worst (high load) conditions.
health checker: interval-in-seconds - Determines how frequently the load balancer pings the instance to see if it is healthy. The default value is 30 seconds. If response-timeout-in-seconds is optimally tuned, and the server doesn’t have too much traffic, then the default value works well.
health checker: timeout-in-seconds - How long the load balancer waits for a response after “pinging” an instance. The default value is 10 seconds.

The combination of the health checker’s interval-in-seconds and timeout-in-seconds values determines how much additional traffic goes from the load balancer plug-in to the server instances.

For more information on configuring the load balancer plug-in, see Configuring the Load Balancer in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

HADB timeouts

The sql_client timeout value may affect performance.

Operating System Configuration

The following section describes configuration of the operating system.

Semaphores

If the number of semaphores is too low, HADB can fail and display this error message:

No space left on device

This can occur either while starting the database, or during run time. Since the semaphores are provided as a global resource by the operating system, the configuration depends on all processes running on the host, and not the HADB alone. In Solaris, configure the semaphore settings by editing the /etc/system file.

To run NNODES nodes (the number of nodes given implicitly by the --hosts option to HADB) and NCONNS connections (the HADB configuration parameter NumberOfSessions, which defaults to 100) per host, use the following semaphore settings:

set semsys:seminfo_semmap = <default=10> + NNODES
set semsys:seminfo_semmni = <default=10> + NNODES
set semsys:seminfo_semmns = <default=60> + (NNODES * 8)
set semsys:seminfo_semmnu = <default=30> + NNODES + NCONNS

If you plan to run multiple nodes per host, make sure semmap = NNODES. Use the sysinfo and sysdef commands to inspect the settings.
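
To make the semaphore arithmetic concrete, the following Python sketch generates the /etc/system lines for a given node and connection count. The function name is hypothetical, and the base values are the defaults shown in the settings above; if your system already uses different values, pass them in instead.

```python
def semaphore_settings(nnodes, nconns=100, base=None):
    """Return /etc/system lines for the semaphore settings listed above."""
    # Defaults as shown in the settings above; override via `base` if needed.
    current = {"semmap": 10, "semmni": 10, "semmns": 60, "semmnu": 30}
    if base:
        current.update(base)
    values = {
        "semmap": current["semmap"] + nnodes,
        "semmni": current["semmni"] + nnodes,
        "semmns": current["semmns"] + nnodes * 8,
        "semmnu": current["semmnu"] + nnodes + nconns,
    }
    return [f"set semsys:seminfo_{name} = {value}"
            for name, value in values.items()]


# Hypothetical host running two HADB nodes with the default 100 sessions.
for line in semaphore_settings(nnodes=2):
    print(line)
```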

Shared Memory

Set the maximum shared memory size to the total amount of physical RAM. Additionally, set the maximum number of shared memory segments per process to six or more to accommodate the HADB processes. Set the number of system-wide, shared memory identifiers based on the number of nodes running on the host.

Solaris

In Solaris 9, because of kernel changes, the shmsys:shminfo_shmseg variable is obsolete. In Solaris 8, add the following settings to the /etc/system file:

set shmsys:shminfo_shmmax = 0xffffffff
set shmsys:shminfo_shmseg = <default=6>
set shmsys:shminfo_shmmni = <default=100> + (6 * NNODES)

Default values are for Solaris 8. Add HADB resource requirements to the previous value of the variables regardless of whether they are the default values.

Note –

You must reboot the host after changing these settings.

Linux

To increase the shared memory to 512 MB, run the following commands:

echo 536870912 > /proc/sys/kernel/shmmax
echo 536870912 > /proc/sys/kernel/shmall

Here, the file shmmax contains the maximum size of a single shared memory segment, and shmall contains the total shared memory to be made available.

This value is large enough for a standard HADB node that uses default values. If the default values are changed, consider changing these values, as well.

To make these changes permanent, add these lines to /etc/rc.local on your Linux machine. With Red Hat Linux, you can also modify sysctl.conf to set the kernel parameters.

Tuning the Application Server for High-Availability

This section discusses how you can configure the high availability features of Application Server. This section discusses the following topics:

Tuning Session Persistence Frequency
Session Persistence Scope
Session Size
Checkpointing Stateful Session Beans
Configuring the JDBC Connection Pool
Descriptor configuration in the web application

To ensure highly available web applications with persistent session data, the high availability database (HADB) provides a backend store to save HTTP session data. However, there is an overhead involved in saving the data to HADB and reading it back. Understanding the different schemes of session persistence and their impact on performance and availability will help you make decisions in configuring Application Server for high availability.

In general, maintain twice as many HADB nodes as there are application server instances. Every application server instance requires two HADB nodes.

Tuning Session Persistence Frequency

The Application Server provides HTTP session persistence and failover by writing session data to HADB. You can control the frequency at which the server writes to HADB by specifying the persistence frequency.

Specify the persistence frequency in the Admin Console under Configurations > config-name > Availability Service (Web Container Availability).

Persistence frequency can be set to:

web-method
time-based

All else being equal, time-based persistence frequency provides better performance but less availability than web-method persistence frequency. This is because the session state is written to the persistent store (HADB) at the time interval specified by the reap interval (default is 60 seconds). If the server instance fails within that interval, the session state will lose any updates since the last time the session information was written to HADB.

Web-method

With web-method persistence frequency, the server writes the HTTP session state to HADB before it responds to each client request. This can have an impact on response time that depends on the size of the data being persisted. Use this mode of persistence frequency for applications where availability is critical and some performance degradation is acceptable.

For more information on web-method persistence frequency, see Configuring Availability for the Web Container in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

Time-based

With time-based persistence frequency, the server stores session information to the persistence store at a constant interval, called the reap interval. You specify the reap interval under Configurations > config-name > Web Container (Manager Properties), where config-name is the name of the configuration. By default, the reap interval is 60 seconds. Every time the reap interval elapses, a special thread “wakes up,” iterates over all the sessions in memory, and saves the session data.

In general, time-based persistence frequency will yield better performance than web-method, since the server’s responses to clients are not held back by saving session information to the HADB. Use this mode of persistence frequency when performance is more important than availability.

Session Persistence Scope

In addition to the persistence frequency, you can specify the persistence scope. Both are set on the same page in the Admin Console: Configurations > config-name > Availability Service (Web Container Availability).

For a detailed description of the different persistence scopes, see Chapter 9, Configuring High Availability Session Persistence and Failover, in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

Persistence scope can be one of:

session
modified-session
modified-attribute

session

With the session persistence scope, the server writes the entire session data to HADB—regardless of whether it has been modified. This mode ensures that the session data in the backend store is always current, but it degrades performance, since all the session data is persisted for every request.

modified-session

With the modified-session persistence scope, the server examines the state of the HTTP session. If and only if the data has been modified, the server saves the session data to HADB. This mode yields better performance than session mode, because calls to HADB to persist data occur only when the session is modified.

modified-attribute

The modified-attribute persistence scope is intended for applications in which session attributes do not cross-reference one another and in which the application manipulates HTTP session data only through setAttribute() and getAttribute(). Applications written this way can take advantage of this scope to obtain better performance.

Session Size

It is critical to be aware of the impact of HTTP session size on performance. Performance has an inverse relationship with the size of the session data that needs to be persisted. Session data is stored in HADB in a serialized manner. There is an overhead in serializing the data and inserting it as a BLOB and also deserializing it for retrieval.

Tests have shown that for a session size up to 24KB, performance remains unchanged. When the session size exceeds 100KB, and the same back-end store is used for the same number of connections, throughput drops by 90%.

Keep the HTTP session size in mind when you design the application. If you are creating large HTTP session objects, size the HADB as discussed in Tuning HADB.

Checkpointing Stateful Session Beans

Checkpointing saves a stateful session bean (SFSB) state to the HADB so that if the server instance fails, the SFSB is failed over to another instance in the cluster and the bean state recovered. The size of the data being checkpointed and the frequency at which checkpointing happens determine the additional overhead in response time for a given client interaction.

You can enable SFSB checkpointing at several levels:

For the entire server instance or EJB container
For the entire application
For a specific EJB module
Per method in an individual EJB module

For best performance, specify checkpointing only for methods that alter the bean state significantly, by adding the <checkpointed-methods> tag in the sun-ejb-jar.xml file.

For more information, see Using Session Beans in Sun Java System Application Server Enterprise Edition 8.2 Developer’s Guide.

Configuring the JDBC Connection Pool

The Application Server uses JDBC to store and retrieve HADB data. For best performance, configure the JDBC connection pool for the fastest possible HADB read/write operations.

Configure the JDBC connection pool in the Admin Console under Resources > JDBC > Connection Pools > pool-name. The connection pool configuration settings are:

Initial and Minimum Pool Size: Minimum and initial number of connections maintained in the pool (default is 8)
Maximum Pool Size: Maximum number of connections that can be created to satisfy client requests (default is 32)
Pool Resize Quantity: Number of connections to be removed when idle timeout timer expires
Idle Timeout: Maximum time (seconds) that a connection can remain idle in the pool (default is 300)
Max Wait Time: Amount of time (milliseconds) caller waits before connection timeout is sent

For optimal performance, use a pool with eight to 16 connections per node. For example, if you have four nodes configured, then the steady-pool size must be set to 32 and the maximum pool size must be 64. Adjust the Idle Timeout and Pool Resize Quantity values based on monitoring statistics.
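
The sizing rule above is simple multiplication, sketched here in Python. The function name is hypothetical; the per-node figures of eight and 16 connections come from the guideline above.

```python
def pool_sizes(hadb_nodes, min_per_node=8, max_per_node=16):
    """Return (steady_pool_size, max_pool_size) for the JDBC pool."""
    return hadb_nodes * min_per_node, hadb_nodes * max_per_node


steady, maximum = pool_sizes(4)
print(f"steady pool size: {steady}, maximum pool size: {maximum}")
```

With four nodes this reproduces the values from the example: a steady pool size of 32 and a maximum pool size of 64.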

For the best performance, use the following settings:

Connection Validation: Required
Validation Method: metadata
Transaction Isolation Level: repeatable-read

In addition to the standard attributes, add the two following properties:

cacheDatabaseMetaData: false
eliminateRedundantEndTransaction: true

To add a property, click the Add Property button, then specify the property name and value, and click Save.

For more information on configuring the JDBC connection pool, see Tuning JDBC Connection Pools.

Configuring the Load Balancer

The Application Server provides a load balancer plug-in that can balance the load of requests among multiple instances that are part of the cluster. For more information on configuring the load balancer, see Configuring the Load Balancer in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.

Note –

The following section assumes that the server is tuned effectively to service incoming requests.

Enabling the Health Checker

The load balancer periodically checks all the configured Application Server instances that are marked as unhealthy, based on the values specified in the health-checker element in the loadbalancer.xml file. Enabling the health checker is optional. If the health checker is not enabled, periodic health check of unhealthy instances is not performed.

The load balancer’s health check mechanism communicates with the application server instance using HTTP. The health checker sends an HTTP request to the URL specified and waits for a response. The instance is considered healthy if the status code in the HTTP response header is between 100 and 500.

To enable the health checker, edit the following properties:

url: Specifies the listener’s URL that the load balancer checks to determine its state of health.
interval-in-seconds: Specifies the interval at which health checks of instances occur. The default is 30 seconds.
timeout-in-seconds: Specifies the timeout interval within which a response must be obtained for a listener to be considered healthy. The default is 10 seconds.

If the typical response from the server takes n seconds and under peak load takes m seconds, then set the timeout-in-seconds property to m + n, as follows:

<health-checker 
url="http://hostname.domain:port" 
interval-in-seconds="n" 
timeout-in-seconds="m+n"/>
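
As an illustration, the element above can be generated from measured response times with a few lines of Python. This is a sketch only: the function name, the host name, and the timing values are hypothetical, and the attribute values follow the rule above (interval set to n, timeout set to m + n).

```python
import xml.etree.ElementTree as ET


def health_checker_element(url, typical_s, peak_s):
    """Build a health-checker element from typical (n) and peak (m) times."""
    element = ET.Element("health-checker", {
        "url": url,
        "interval-in-seconds": str(typical_s),
        "timeout-in-seconds": str(typical_s + peak_s),
    })
    return ET.tostring(element, encoding="unicode")


# Hypothetical measurements: 2 seconds typical, 5 seconds under peak load.
print(health_checker_element("http://host.example.com:8080", 2, 5))
```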

For more information, see Configuring the Load Balancer in Sun Java System Application Server Enterprise Edition 8.2 High Availability Administration Guide.
