Quality of Service refers to the performance limits you set for a server instance, virtual server class, or virtual server. For example, if you are an ISP, you might want to charge different amounts of money for virtual servers depending on how much bandwidth you allow them. You can limit two areas: the amount of bandwidth and the number of connections.
You can enable these settings for the entire server or for a class of virtual servers in the Server Manager from the Monitor tab. However, you can override these server or class-level settings for an individual virtual server. For more information on setting quality of service limits for an individual server, see Configuring Virtual Server Quality of Service Settings.
Two settings govern how traffic is counted and how often the bandwidth is recomputed: the recompute interval and the metric interval. The recompute interval is how often (in milliseconds) the bandwidth is computed. The metric interval is the period (in seconds) over which traffic data is used in the bandwidth calculation.
The following example shows how the quality of service information is collected and computed:
The server has a metric interval of 30 seconds.
The server starts up at a time of 0 seconds.
At time 1 second, an HTTP connection generates 5000 bytes of traffic to/from the server.
No more connections are made after that. At 30 seconds, the total traffic for the last 30 seconds is 5000 bytes.
At 32 seconds, the traffic sample from 1 second is discarded, since it is older than the 30 seconds of the metric interval. The total traffic for the last 30 seconds is now 0.
The recompute interval works similarly. The server’s recompute interval is 100 milliseconds.
Continuing with the example, the bandwidth gets recomputed periodically every 100 milliseconds. The calculation is based on the amount of traffic as well as the metric interval.
At time 0 seconds, the bandwidth is calculated for the first time. The total traffic is zero; dividing it by the metric interval of 30 seconds gives a bandwidth of zero.
At 1 second, the bandwidth is calculated for the 10th time (1000 milliseconds / 100 milliseconds). The total traffic is 5000 bytes, which is divided by 30 seconds. The bandwidth is 5000/30, or approximately 166 bytes per second.
At 30 seconds, the bandwidth is calculated for the 300th time. The total traffic is 5000 bytes, which is divided by 30 seconds. The bandwidth is 5000/30, or approximately 166 bytes per second.
At 32 seconds, the bandwidth is calculated for the 320th time. The total traffic is now 0, since the one connection that generated traffic is too old to be counted; dividing it by 30 seconds gives a bandwidth of 0 bytes per second.
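The walkthrough above can be reproduced with a short sketch. It illustrates the sliding-window arithmetic only and is not the server's actual implementation. The names TrafficWindow, record, and bandwidth are hypothetical. The sketch assumes that a sample is kept while its age is at most the metric interval, and that the result is truncated to whole bytes per second, as in the example.

```python
from collections import deque


class TrafficWindow:
    """Hypothetical sliding-window bandwidth calculator (illustration only)."""

    def __init__(self, metric_interval_s):
        self.metric_interval_s = metric_interval_s
        self.samples = deque()  # (timestamp in seconds, bytes transferred)

    def record(self, now_s, nbytes):
        """Store a traffic sample observed at time now_s."""
        self.samples.append((now_s, nbytes))

    def bandwidth(self, now_s):
        """Return whole bytes per second averaged over the metric interval."""
        # Assumption: a sample is discarded once it is older than the interval.
        while self.samples and now_s - self.samples[0][0] > self.metric_interval_s:
            self.samples.popleft()
        total = sum(nbytes for _, nbytes in self.samples)
        return total // self.metric_interval_s


window = TrafficWindow(metric_interval_s=30)
results = {0: window.bandwidth(0)}   # first computation, no traffic yet
window.record(1, 5000)               # 5000 bytes of HTTP traffic at 1 second
for t in (1, 30, 32):
    results[t] = window.bandwidth(t)
print(results)  # {0: 0, 1: 166, 30: 166, 32: 0}
```

The sketch evaluates the bandwidth only at the four times discussed in the example. A server with a 100-millisecond recompute interval would perform the same calculation ten times per second.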
To configure the quality of service settings for a server instance or a class of virtual servers, you need to configure the settings through the user interface. To actually enforce your quality of service settings, you must also set up Server Application Functions (SAFs) in your obj.conf file.
To configure quality of service, perform the following steps:
From the Server Manager, click the Monitor tab.
Click Quality of Service.
A page appears listing general settings for quality of service, followed by a list containing the server instance and each class of virtual servers.
To enable quality of service for the server instance, click Enable.
By default quality of service is enabled. Enabling quality of service increases server overhead slightly.
Choose the Recompute Interval.
The recompute interval is the number of milliseconds between each computation of the bandwidth for all servers, classes, and virtual servers. The default is 100 milliseconds.
Choose the Metric Interval.
The metric interval is the interval in seconds during which the traffic is measured. The default value is 30 seconds. All bandwidth measured during this time is averaged to give the bytes per second.
If your site has a lot of large file transfers, use a large value (several minutes or more) for this field. A large file transfer might take up all the allowed bandwidth for a short metric interval, and result in connections being denied if you’ve enforced the maximum bandwidth setting. Since the bandwidth is averaged over the metric interval, a longer interval smooths out spikes caused by large files.
If the bandwidth limit is much lower than available bandwidth (for example, 1 MB-per-second bandwidth limit but with a 1 GB-per-second connection to the backbone), the metric interval must be shortened.
Please note that if you have large static file transfers and a bandwidth limit that is much lower than available bandwidth, you have to decide which situation to tune for, because the problems require opposite solutions.
Enable quality of service for the server instance and/or the virtual server classes.
The lower portion of the screen lists the server instance and server classes. Choose Enable as the action next to the items for which you want to enable quality of service.
Set the maximum bandwidth, in bytes per second.
Choose whether or not to enforce the maximum bandwidth setting.
If you choose to enforce the maximum bandwidth, additional connections are refused once the server reaches its bandwidth limit.
If you do not enforce the maximum bandwidth, the server logs a message to the error log when the maximum is exceeded.
Choose the maximum number of connections allowed.
This number is the number of concurrent requests processed.
Choose whether or not to enforce the maximum connections setting.
If you choose to enforce the maximum connections, additional connections are refused once the server reaches its limit.
If you do not enforce the maximum connections, when the maximum is exceeded the server logs a message to the error log.
Click OK.
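The metric interval guidance in the steps above comes down to simple averaging. The following sketch compares the default 30-second interval with a 5-minute interval. The numbers are assumptions chosen for illustration: one 60,000,000-byte transfer counted within the window, and a maximum bandwidth of 1,000,000 bytes per second. The function name averaged_bandwidth is hypothetical and is not part of the server.

```python
def averaged_bandwidth(bytes_in_window, metric_interval_s):
    """Bytes per second as averaged over the metric interval."""
    return bytes_in_window / metric_interval_s


LIMIT_BPS = 1_000_000        # assumed maximum bandwidth setting
TRANSFER_BYTES = 60_000_000  # assumed size of one large file transfer

for interval_s in (30, 300):
    bps = averaged_bandwidth(TRANSFER_BYTES, interval_s)
    status = "over limit" if bps > LIMIT_BPS else "within limit"
    print(f"{interval_s:>3} s interval: {bps:,.0f} bytes/s ({status})")
```

With the 30-second interval, the single transfer alone yields 2,000,000 bytes per second and exceeds the assumed limit. With the 300-second interval, the same transfer averages 200,000 bytes per second and stays within it.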
To enable quality of service, you must include directives in your obj.conf to invoke two Server Application Functions (SAFs): an AuthTrans qos-handler and an Error qos-error.
The qos-handler AuthTrans directive must be the first AuthTrans configured in the default object in order to work properly. The role of the quality of service handler is to examine the current statistics for the virtual server, virtual server class, and global server, and enforce the limits by returning an error.
Sun Java System Web Server includes a built-in sample quality of service handler SAF, called qos-handler. This SAF logs when limits are reached, and returns 503 "Server busy" to the server so that it can be processed by NSAPI.
Sun Java System Web Server also includes a built-in sample error SAF called qos-error which returns an error page stating which limits caused the 503 error and the value of the statistic that triggered the limit. You may want to alter the sample code to provide different error information.
These samples are available at server_root/plugins/nsapi/examples/qos.c. You can use these samples, or you can write your own SAFs.
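The decision that a quality of service handler makes can be sketched as follows. This is a Python illustration of the logic described above, not the C code in qos.c. The QosStats fields, the check_limits function, and the use of a strict "greater than" comparison are all assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class QosStats:
    """Hypothetical snapshot of statistics and limits for one scope."""
    name: str
    bandwidth_bps: int
    max_bandwidth_bps: int
    enforce_bandwidth: bool
    connections: int
    max_connections: int
    enforce_connections: bool


def check_limits(scopes, log):
    """Return 503 if any enforced limit is exceeded, otherwise None.

    Every exceeded limit is logged, whether or not it is enforced.
    """
    status = None
    for s in scopes:
        # Assumption: a limit counts as exceeded when the value is above it.
        if s.bandwidth_bps > s.max_bandwidth_bps:
            log.append(f"{s.name}: bandwidth limit exceeded ({s.bandwidth_bps})")
            if s.enforce_bandwidth:
                status = 503
        if s.connections > s.max_connections:
            log.append(f"{s.name}: connection limit exceeded ({s.connections})")
            if s.enforce_connections:
                status = 503
    return status


log = []
scopes = [
    QosStats("virtual server", 500, 1000, True, 12, 10, False),
    QosStats("server instance", 1500, 1000, True, 40, 100, True),
]
status = check_limits(scopes, log)
print(status, len(log))  # 503 2
```

In this run the virtual server exceeds its connection limit, which is only logged because enforcement is off. The server instance exceeds its enforced bandwidth limit, so the request is answered with 503.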
For more information on these SAFs and how to use them, see the Sun Java System Web Server 6.1 SP6 NSAPI Programmer’s Guide.
When you use the quality of service features, keep in mind the following limitations:
The connection and bandwidth statistics are not shared across server processes, for performance reasons. In other words, the setting of MaxProcs is not accounted for. All the limits apply individually to each server process, not to the aggregate of all processes. For more information on MaxProcs and multiple processes, see the Sun Java System Web Server 6.1 SP6 Performance Tuning, Sizing, and Scaling Guide.
The quality of service features only measure the HTTP bandwidth at the application level. The HTTP bandwidth can differ from the actual TCP network bandwidth for a variety of reasons:
If SSL is enabled, handshakes and client certificate exchanges add to the traffic but are not measured.
If chunked encoding is enabled in one of the directions or both, the chunking layer removes the chunk headers and they are not counted in the traffic. Other headers or protocol items are counted.
The quality of service features cannot accurately measure traffic from PR_TransmitFile calls. For basic I/O operations such as PR_Send()/net_write or PR_Recv()/net_read, the data transferred can be quickly accounted for by the bandwidth manager, since the number of bytes transferred in one system call is usually the size of a buffer and the I/O call returns quickly. This works very well to measure the instantaneous bandwidth of dynamic content applications. However, because the amount of data transferred from PR_TransmitFile is only known at the end of the transfer, it cannot be measured before the transfer completes.
If the PR_TransmitFile is short, the quality of service features perform adequately. However, if the PR_TransmitFile is long, such as in the case of a long file downloaded by a dialup user, the whole amount of data transferred is counted at completion time. When the bandwidth manager recomputes bandwidth after the next recompute interval period starts, the bandwidth computed will go up significantly because of that recent large PR_TransmitFile. This case could cause the server to deny all requests until the next metric interval, when the bandwidth manager will "expire" the transmit file operation, since it is too old, and thus the bandwidth value will go back down. If your site has a lot of very long static file downloads, you should increase the metric interval from the default 30 seconds.
The bandwidth computed is always an approximation because it is not measured instantaneously, but is recomputed at regular intervals and over a certain period. For example, if the metric interval is the default 30 seconds and the server is idle for 29 seconds, the next second, a client could potentially use 30 times the bandwidth limit in one second.
The quality of service bandwidth statistics are lost whenever the server is reconfigured dynamically. In addition, the quality of service limitations are not enforced in threads that have connections on an older, inactive configuration, because the bandwidth manager thread only computes bandwidth statistics for the active configuration. Potentially, a client that does not close its socket for a long time and remains active so that the server does not time it out would not be subject to the quality of service limitations after a server dynamic reconfiguration.
The concurrent connections are computed with a different granularity for virtual servers than for virtual server classes and the global server instance. The connection counter for an individual virtual server is incremented atomically immediately after the request is parsed and routed to the virtual server. It is also decremented atomically at the end of the response processing for that request. This means that the virtual server connection statistics are always exact at any instant.
However, the connection statistics for the virtual server class and global server instance are not updated instantly. They are updated by the bandwidth manager thread every recompute interval. The connection count for the virtual server class is the sum of the connections on all virtual servers of that class. The global server instance connection count is the sum of connections on all virtual server classes.
Because of the way these values are computed, the number of connections for a virtual server is always correct. If you have enforced a limit on the number of connections, you can never have more than the limit. The virtual server class and server instance values are not quite as accurate, since they are only computed at intervals.
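The difference in granularity between the two kinds of connection counters can be sketched as follows. The class names VirtualServer and ClassCounter are hypothetical. The sketch uses plain integers where the server uses atomic increments and decrements, and it calls recompute by hand where the bandwidth manager thread would run once per recompute interval.

```python
class VirtualServer:
    """Hypothetical virtual server with an exact connection counter."""

    def __init__(self, name):
        self.name = name
        self.connections = 0  # updated at the start and end of every request


class ClassCounter:
    """Hypothetical class-level count that is refreshed only on recompute."""

    def __init__(self, servers):
        self.servers = servers
        self.connections = 0

    def recompute(self):
        # Sum of the connections on all virtual servers of the class.
        self.connections = sum(vs.connections for vs in self.servers)


vs_a, vs_b = VirtualServer("a"), VirtualServer("b")
vs_class = ClassCounter([vs_a, vs_b])

vs_a.connections += 3          # three requests routed to virtual server a
vs_b.connections += 2          # two requests routed to virtual server b
stale = vs_class.connections   # class count before the next recompute
vs_class.recompute()
fresh = vs_class.connections   # class count after the recompute
print(stale, fresh)  # 0 5
```

Between recomputes the class-level count lags behind the per-virtual-server counters, which is why a limit enforced at the class or instance level can be briefly exceeded.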