Sun Java System Web Server 6.1 SP6 Performance Tuning, Sizing, and Scaling Guide

Threads, Processes, and Connections

Connection-Handling Overview

In Sun Java System Web Server, acceptor threads on a listen socket accept connections and put them into a connection queue. Session threads then pick up connections from the queue and service the requests. The session threads post more session threads if required at the end of the request. The policy for adding new threads is based on the connection queue state:

Each time a new connection is returned, the number of connections waiting in the queue (the backlog of connections) is compared to the number of session threads already created. If it is greater than the number of threads, more threads are scheduled to be added the next time a request completes.
The previous backlog is tracked, so that if it is seen to be increasing over time, and if the increase is greater than the ThreadIncrement value, and the number of session threads minus the backlog is less than the ThreadIncrement value, then another ThreadIncrement number of threads are scheduled to be added.
The process of adding new session threads is strictly limited by the RqThrottle value.
To avoid creating too many threads when the backlog increases suddenly (such as at the startup of benchmark loads), the decision as to whether more threads are needed is made only once every 16 or 32 connections, depending on how many session threads already exist.
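
The policy above can be approximated with a small model. The following Python sketch is not the server's implementation: the function name, the choice to add one ThreadIncrement batch when the backlog exceeds the thread count, and the sample values are assumptions made for illustration. The once-every-16-or-32-connections sampling is not modeled.

```python
def threads_to_schedule(backlog, prev_backlog, threads,
                        thread_increment, rq_throttle):
    """Return how many session threads to schedule (simplified model)."""
    wanted = 0

    # Rule 1: more connections are waiting than there are session threads.
    # Assumption: grow by one ThreadIncrement batch; the guide only says
    # that "more threads" are scheduled.
    if backlog > threads:
        wanted += thread_increment

    # Rule 2: the backlog grew by more than ThreadIncrement since the last
    # check, and fewer than ThreadIncrement threads are spare.
    growth = backlog - prev_backlog
    if growth > thread_increment and (threads - backlog) < thread_increment:
        wanted += thread_increment

    # RqThrottle is a strict upper bound on the total number of threads.
    return max(0, min(wanted, rq_throttle - threads))


# Hypothetical values: 48 threads, ThreadIncrement 10, RqThrottle 128.
print(threads_to_schedule(60, 40, 48, 10, 128))     # 20
print(threads_to_schedule(200, 100, 120, 10, 128))  # 8 (capped by RqThrottle)
```

In the second call both rules fire, but only 8 threads can be added before the RqThrottle limit is reached.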

The following directives that affect the number and timeout of threads, processes, and connections can be tuned in the Magnus Editor or magnus.conf:
AcceptTimeout
ConnQueueSize
HeaderBufferSize
KeepAliveThreads
KeepAliveTimeout
KernelThreads
ListenQ
MaxKeepAliveConnections
MaxProcs (UNIX Only)
PostThreadsEarly
RcvBufSize
RqThrottle
RqThrottleMin
SndBufSize
StackSize
StrictHttpHeaders
TerminateTimeout
ThreadIncrement
UseNativePoll (UNIX only)

For detailed information about these directives, see the Sun Java System Web Server 6.1 SP6 Administrator’s Configuration File Reference.

Process Modes

You can run Sun Java System Web Server in one of the following two modes:

Single-Process Mode

In single-process mode, the server receives requests from web clients in a single process. Inside that process, many threads run and wait for new requests to arrive. When a request arrives, it is handled by the thread that receives it. Because the server is multi-threaded, all NSAPI extensions written for the server must be thread-safe. This means that if an NSAPI extension uses a global resource, such as a shared reference to a file or a global variable, then use of that resource must be synchronized so that only one thread accesses it at a time. All plugins provided by Netscape/Sun Java System are thread-safe and thread-aware, providing good scalability and concurrency. However, your legacy applications may be single-threaded. When the server runs such an application, it can execute only one request at a time, which leads to performance problems under load. Unfortunately, in the single-process design, there is no real workaround.
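
The synchronization requirement can be illustrated outside of NSAPI. The following Python sketch is only an analogy (NSAPI extensions are written in C, and all names here are hypothetical): several threads update a shared global counter, and a lock ensures that only one thread touches it at a time.

```python
import threading

hit_count = 0                      # shared global resource (hypothetical)
hit_count_lock = threading.Lock()  # guards every access to hit_count


def handle_request():
    """Stand-in for an extension function called by many session threads."""
    global hit_count
    with hit_count_lock:           # only one thread updates the counter at a time
        hit_count += 1


def run_requests(thread_count, requests_per_thread):
    """Run handle_request concurrently and return the final counter value."""
    global hit_count
    hit_count = 0

    def worker():
        for _ in range(requests_per_thread):
            handle_request()

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return hit_count


print(run_requests(8, 1000))  # 8000
```

Without the lock, the read-modify-write on the counter could interleave between threads and lose updates.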

Multi-Process Mode

You can configure the server to handle requests using multiple processes with multiple threads in each process. This flexibility provides optimal performance for sites using threads, and also provides backward compatibility to sites running legacy applications that are not ready to run in a threaded environment. Because applications on Windows are generally already written for a multi-threaded environment, this feature applies only to UNIX/Linux platforms.

The advantage of multiple processes is that legacy applications that are not thread-aware or thread-safe can be run more effectively in Sun Java System Web Server. However, because all of the Netscape/Sun ONE extensions are built to support a single-process threaded environment, they may not run in the multi-process mode, and the Search plugins will fail on startup if the server is in multi-process mode.

In the multi-process mode, the server spawns multiple server processes at startup. Each process contains one or more threads (depending on the configuration) that receive incoming requests. Since each process is completely independent, each one has its own copies of global variables, caches, and other resources. Using multiple processes requires more resources from your system. Also, if you try to install an application that requires shared state, it has to synchronize that state across multiple processes. NSAPI provides no helper functions for implementing cross-process synchronization.

When you specify a MaxProcs value greater than 1, the server relies on the operating system to distribute connections among multiple server processes (see MaxProcs (UNIX/Linux) for information about the MaxProcs directive). However, many modern operating systems will not distribute connections evenly, particularly when there are a small number of concurrent connections.

Because Sun Java System Web Server cannot guarantee that load is distributed evenly among server processes, you may encounter performance problems if you specify RqThrottle 1 and MaxProcs greater than 1 to accommodate a legacy application that is not thread-safe. The problem will be especially pronounced if the legacy application takes a long time to respond to requests (for example, if the legacy application contacts a backend database). In this scenario, it may be preferable to use the default value for RqThrottle and serialize access to the legacy application using thread pools. For more information about creating a thread pool, refer to the description of the thread-pool-init SAF in the Sun Java System Web Server 6.1 NSAPI Programmer's Guide.
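
The idea of serializing access through a thread pool can be sketched in general terms. The following Python example is an analogy only and does not use the thread-pool-init SAF; every name in it is hypothetical. Many session threads accept requests, but calls into the non-thread-safe code are funneled through a pool with a single worker thread.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical one-thread pool: everything submitted here runs serially.
legacy_pool = ThreadPoolExecutor(max_workers=1)

_in_legacy = 0          # how many callers are inside legacy_app right now
max_seen_in_legacy = 0  # highest concurrency ever observed inside legacy_app


def legacy_app(request_id):
    """Stand-in for a legacy application that is not thread-safe."""
    global _in_legacy, max_seen_in_legacy
    _in_legacy += 1
    max_seen_in_legacy = max(max_seen_in_legacy, _in_legacy)
    result = f"response-{request_id}"
    _in_legacy -= 1
    return result


def handle_request(request_id):
    # Called from many session threads; the legacy call itself always
    # executes on the single thread owned by legacy_pool.
    return legacy_pool.submit(legacy_app, request_id).result()


def serve(request_ids, session_threads=8):
    """Handle requests on several session threads, in request order."""
    with ThreadPoolExecutor(max_workers=session_threads) as sessions:
        return list(sessions.map(handle_request, request_ids))


responses = serve(range(20))
print(len(responses), max_seen_in_legacy)  # 20 1
```

The session threads stay free to accept and parse requests in parallel, while the legacy code never runs on more than one thread at once.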

If you are not running any NSAPI in your server, you should use the default settings: one process and many threads. If you are running an application that is not scalable in a threaded environment, you should use a few processes and many threads, for example, 4 or 8 processes and 128 or 512 threads per process.

MaxProcs (UNIX/Linux)

Use this directive to set your UNIX/Linux server in multi-process mode, which may allow for higher scalability on multi-processor machines. If you set the value to less than 1, it will be ignored and the default value of 1 will be used. See Multi-Process Mode for a discussion of the performance implications of setting this to a value greater than 1.

Tuning

You can set the value for MaxProcs by:

Editing the MaxProcs parameter in magnus.conf
Setting or changing the MaxProcs value in the Magnus Editor of the Server Manager

Note –

You will receive duplicate startup messages when running your server in MaxProcs mode.

Listen Socket Acceptor Threads

You can specify how many threads you want in accept mode on a listen socket at any time. It’s a good practice to set this to less than or equal to the number of CPUs in your system.
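
As a rough aid, the rule of thumb above can be expressed as a small helper. This Python sketch is illustrative only: the function name is hypothetical, and it simply caps a requested value at the CPU count reported by the operating system.

```python
import os


def suggested_acceptor_threads(requested, cpu_count=None):
    """Cap a requested acceptor-thread count at the number of CPUs."""
    # os.cpu_count() can return None, so fall back to 1 in that case.
    cpus = cpu_count or os.cpu_count() or 1
    return max(1, min(requested, cpus))


print(suggested_acceptor_threads(4, cpu_count=2))  # 2
print(suggested_acceptor_threads(1, cpu_count=8))  # 1
```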

Tuning

You can set the number of listen socket acceptor threads by:

Editing the server.xml file
Entering the number of acceptor threads you want in the Number of Acceptor Threads field of the Edit Listen Socket page of the Server Manager

Maximum Simultaneous Requests

The RqThrottle parameter in the magnus.conf file specifies the maximum number of simultaneous transactions the Web Server can handle. The default value is 128. Changes to this value can be used to throttle the server, minimizing latencies for the transactions that are performed. The RqThrottle value acts across multiple virtual servers, but does not attempt to load balance.

To compute the number of simultaneous requests, the server counts the number of active requests, adding one to the number when a new request arrives, subtracting one when it finishes the request. When a new request arrives, the server checks to see if it is already processing the maximum number of requests. If it has reached the limit, it defers processing new requests until the number of active requests drops below the maximum amount.
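
The counting scheme described above can be modeled in a few lines. The following Python sketch is a simplified, single-threaded model with hypothetical names, not the server's code: it increments a counter when a request starts, decrements it when one finishes, and defers new requests while the limit is reached.

```python
from collections import deque


class RequestThrottle:
    """Toy model of a maximum-simultaneous-requests limit."""

    def __init__(self, limit):
        self.limit = limit
        self.active = 0
        self.deferred = deque()  # requests waiting for a free slot

    def arrive(self, request_id):
        """Return True if the request starts now, False if it is deferred."""
        if self.active < self.limit:
            self.active += 1
            return True
        self.deferred.append(request_id)
        return False

    def finish(self):
        """Finish one request; return the deferred request resumed, if any."""
        self.active -= 1
        if self.deferred:
            self.active += 1
            return self.deferred.popleft()
        return None


throttle = RequestThrottle(limit=2)
print(throttle.arrive("a"), throttle.arrive("b"), throttle.arrive("c"))
# True True False
print(throttle.finish())  # c
```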

In theory, you could set the maximum simultaneous requests to 1 and still have a functional server. Setting this value to 1 would mean that the server could only handle one request at a time, but since HTTP requests for static files generally have a very short duration (response time can be as low as 5 milliseconds), processing one request at a time would still allow you to process up to 200 requests per second.
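
The arithmetic behind the 200 requests per second figure is shown below as a short Python sketch. The function name is hypothetical, and the calculation assumes that every request takes the same time and that no thread is ever idle.

```python
def max_requests_per_second(simultaneous_requests, response_time_ms):
    """Upper bound on throughput when all requests take the same time."""
    return simultaneous_requests * 1000 / response_time_ms


print(max_requests_per_second(1, 5))  # 200.0
```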

However, in actuality, Internet clients frequently connect to the server and then do not complete their requests. In these cases, the server waits 30 seconds or more for the data before timing out. You can define this timeout period using the AcceptTimeout directive in magnus.conf. The default value is 30 seconds. By setting it to less than the default you can free up threads sooner, but you might also disconnect users with slower connections. Also, some sites perform heavyweight transactions that take minutes to complete. Both of these factors add to the maximum simultaneous requests that are required. If your site is processing many requests that take many seconds, you may need to increase the number of maximum simultaneous requests. For more information about AcceptTimeout, see the Sun Java System Web Server 6.1 SP6 Administrator’s Configuration File Reference.
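
A rough way to see how slow clients and long transactions raise the required limit is to multiply the arrival rate by the time each request occupies a thread (Little's law). The following Python sketch uses hypothetical traffic figures and is an estimate only, not a sizing formula from this guide.

```python
import math


def threads_needed(requests_per_second, mean_duration_ms):
    """Estimate the threads busy at once: arrival rate times duration."""
    return math.ceil(requests_per_second * mean_duration_ms / 1000)


# Hypothetical traffic mix.
fast = threads_needed(200, 5)         # static files at 5 ms each
stalled = threads_needed(2, 30000)    # clients that hit the 30-second timeout
heavy = threads_needed(1, 120000)     # transactions that take two minutes

print(fast, stalled, heavy, fast + stalled + heavy)  # 1 60 120 181
```

In this example the fast requests need almost no capacity; the stalled and long-running requests account for nearly all of the simultaneous requests.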

Suitable RqThrottle values range from 100 to 500, depending on the load.

RqThrottleMin is the minimum number of threads the server initiates upon startup. The default value is 48. RqThrottle represents a hard limit for the maximum number of active threads that can run simultaneously, which can become a bottleneck for performance. The default value is 128.

Note –

If you are using older NSAPI plugins that are not reentrant, they will not work with the multi-threading model described in this document. To continue using them, you should revise them so that they are reentrant. If this is not possible, you can configure your server to work with them by setting RqThrottle to 1, and then using a high value for MaxProcs, such as 48 or greater, but this will adversely impact your server’s performance.

Note –

When configuring Sun Java System Web Server to be used with SNCA (the Solaris Network Cache and Accelerator), setting the RqThrottle and ConnQueueSize parameters to 0 provides better performance. Because SNCA manages the client connections, it is not necessary to set these parameters. These parameters can also be set to 0 with non-SNCA configurations, especially for cases in which short latency responses with no keep-alives must be delivered. It is important to note that RqThrottle and ConnQueueSize must both be set to 0.

For more information about RqThrottle and ConnQueueSize, see the chapter pertaining to magnus.conf in the Sun Java System Web Server 6.1 SP6 Administrator’s Configuration File Reference. Also consult the RqThrottle and ConnQueueSize entries in the index in this book. For information about using SNCA, see Using the Solaris Network Cache and Accelerator (SNCA).

Tuning

You can tune the number of simultaneous requests by:

Editing RqThrottleMin and RqThrottle in the magnus.conf file
Entering or changing values for the RqThrottleMin and RqThrottle fields in the Magnus Editor of the Server Manager
Entering the desired value in the Maximum Simultaneous Requests field from the Performance Tuning page under Preferences in the Server Manager

Keep-Alive Subsystem Tuning

The keep-alive (or HTTP/1.1 persistent connection handling) subsystem in Sun Java System Web Server 6.1 is designed to be massively scalable. The out-of-the-box configuration can be less than optimal if the workload is non-persistent (that is, HTTP/1.0 without the KeepAlive header), or for a lightly loaded system that’s primarily servicing keep-alive connections.

There are several tuning parameters that can help improve performance. Those parameters are listed below:

acceptorthreads: Number of threads waiting to accept incoming connections on a given network port. This is specified per the listen socket (LS) element in server.xml.
ConnQueueSize: Size of the queue of active, ready-to-process connections.
RqThrottle: Maximum number of worker threads in the server. Worker threads, in contrast with acceptor threads, service requests: each thread parses and services a request from an active connection. For more information, see Maximum Simultaneous Requests.
MaxKeepAliveConnections: This controls the maximum number of keep-alive connections the Web Server can maintain at any time. The default is 256. The range is 0 to 32768.
KeepAliveTimeout: This directive determines the maximum time (in seconds) that the server holds open an HTTP keep-alive connection or a persistent connection between the client and the server. The default is 30 seconds, so by default a connection times out if it is idle for more than 30 seconds. The maximum is 300 seconds (5 minutes).
KeepAliveThreads: This directive determines the number of threads in the keep-alive subsystem. It is recommended that this number be a small multiple of the number of processors on the system (for example, a 2 CPU system should have 2 or 4 keep-alive threads). The default is 1.
KeepAliveQueryMaxSleepTime: Specifies an upper limit to the time slept (in milliseconds) after polling keep-alive connections for further requests. The default is 100. On lightly loaded systems that primarily service keep-alive connections, you can lower this number to enhance performance. Doing so can increase CPU usage, however.
KeepAliveQueryMeanTime: Specifies the desired keep-alive latency in milliseconds. The default value of 100 is appropriate for almost all installations. Note that CPU usage will increase with lower KeepAliveQueryMeanTime values.

For more information about the Web Server’s keep-alive subsystem, see Keep-Alive/Persistent Connection Information.

For information about connection queue sizing, see Connection Queue Information.

HTTP/1.0-style Workload

Since HTTP/1.0 results in a large number of new incoming connections, the default acceptor threads of 1 per listen socket would be suboptimal. Increasing this to a higher number should improve performance for HTTP/1.0-style workloads. For instance, for a system with 2 CPUs, you may want to set it to 2.

Example

In the following example, acceptor threads are increased, and keep-alive connections are reduced:

In magnus.conf:
MaxKeepAliveConnections 0
RqThrottle 128
RcvBufSize 8192
In server.xml:
<SERVER legacyls="ls1">
     <LS id="ls1" ip="0.0.0.0" port="8080" security="off" blocking="no"
        acceptorthreads="2"/>
</SERVER>

HTTP/1.0-style workloads would have many connections established and terminated.

If users are experiencing connection timeouts from a browser to Sun Java System Web Server when the server is heavily loaded, you can increase the size of the HTTP listener backlog queue by setting the ListenQ parameter in the magnus.conf file to:

ListenQ  8192

The ListenQ parameter specifies the maximum number of pending connections on a listen socket. Connections that time out on a listen socket whose backlog queue is full will fail.
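
The pending-connection limit described above is the same idea as the backlog argument of the sockets listen() call. The following Python sketch shows that call on its own; the function name is hypothetical, and whether the server passes ListenQ through unchanged is an assumption. The operating system may also cap the backlog at its own limit.

```python
import socket


def open_listener(backlog=8192, host="127.0.0.1", port=0):
    """Open a TCP listen socket with the requested backlog."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 asks the OS for any free port
    sock.listen(backlog)     # the OS may silently reduce this value
    return sock


listener = open_listener()
print(listener.getsockname()[1] > 0)  # True
listener.close()
```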

HTTP/1.1-style Workload

Tuning the server's persistent connection handling is generally a tradeoff between throughput and latency. The KeepAliveQuery* directives (KeepAliveQueryMeanTime and KeepAliveQueryMaxSleepTime) control latency. Lowering the values of these directives is intended to lower latency on lightly loaded systems (for example, reduce page load times). Increasing the values of these directives is intended to raise aggregate throughput on heavily loaded systems (for example, increase the number of requests per second the server can handle). However, if there's too much latency and too few clients, aggregate throughput will suffer as the server sits idle unnecessarily. As a result, the general keep-alive subsystem tuning rules at a particular load are as follows:

If there's idle CPU time, decrease KeepAliveQueryMeanTime and/or KeepAliveQueryMaxSleepTime.
If there's no idle CPU time, increase KeepAliveQueryMeanTime and/or KeepAliveQueryMaxSleepTime.
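
The two rules above amount to a simple feedback loop. The following Python sketch expresses one adjustment step; the function name, the step size, and the lower bound are hypothetical choices for illustration, and the resulting value would still need to be written to magnus.conf by hand.

```python
def adjust_keepalive_query_time(current_ms, idle_cpu_percent,
                                step_ms=10, floor_ms=1):
    """Suggest the next KeepAliveQueryMeanTime or MaxSleepTime value."""
    if idle_cpu_percent > 0:
        # Idle CPU is available: spend it on lower keep-alive latency.
        return max(floor_ms, current_ms - step_ms)
    # No idle CPU: trade latency for throughput.
    return current_ms + step_ms


print(adjust_keepalive_query_time(100, idle_cpu_percent=35.0))  # 90
print(adjust_keepalive_query_time(100, idle_cpu_percent=0.0))   # 110
```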

For more information about these directives, see Keep-Alive Subsystem Tuning.

Also, chunked encoding could affect performance for HTTP/1.1 workloads. Tuning the response buffer size could improve performance: a higher UseOutputStreamSize for a plugin results in the server sending a Content-Length header instead of chunking the response.

Example

In the following example, MaxKeepAliveConnections is increased, as is UseOutputStreamSize for the nsapi_test Service function:

In magnus.conf:
MaxKeepAliveConnections 8192
KeepAliveThreads 2
UseNativePoll 1
RqThrottle 128
RcvBufSize 8192
In obj.conf:
<Object name="nsapitest">
ObjectType fn="force-type" type="magnus-internal/nsapitest"
Service method=(GET) type="magnus-internal/nsapitest" fn="nsapi_test"
    UseOutputStreamSize=8192
</Object>