Tuning the server's persistent connection handling is generally a tradeoff between throughput and latency. The KeepAliveQuery* directives (KeepAliveQueryMeanTime and KeepAliveQueryMaxSleepTime) control latency. Lowering their values is intended to lower latency on lightly loaded systems (for example, to reduce page load times). Raising their values is intended to raise aggregate throughput on heavily loaded systems (for example, to increase the number of requests per second the server can handle). However, if latency is too high and there are too few clients, aggregate throughput suffers because the server sits idle unnecessarily. As a result, the general keep-alive subsystem tuning rules at a particular load are as follows:
If there's idle CPU time, decrease KeepAliveQueryMeanTime and/or KeepAliveQueryMaxSleepTime.
If there's no idle CPU time, increase KeepAliveQueryMeanTime and/or KeepAliveQueryMaxSleepTime.
For more information about these directives, see Keep-Alive Subsystem Tuning.
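The two rules above can be sketched as a small helper that suggests the direction in which to move both directives. This is only an illustration: the function name, the 5% idle threshold, and the 50% step size are assumptions made for the example, not values taken from the server documentation, and the inputs are whatever values the directives currently have on the system being tuned.

```python
def suggest_keepalive_tuning(idle_cpu_percent, mean_time, max_sleep_time,
                             idle_threshold=5.0, step=0.5):
    """Suggest new values for the two KeepAliveQuery* directives.

    idle_threshold and step are hypothetical knobs chosen for this sketch;
    the rule itself is only "idle CPU -> decrease, no idle CPU -> increase".
    """
    if idle_cpu_percent > idle_threshold:
        # Idle CPU time is available: lower the values to reduce latency.
        factor = 1.0 - step
    else:
        # No idle CPU time: raise the values to favor aggregate throughput.
        factor = 1.0 + step
    return {
        "KeepAliveQueryMeanTime": max(1, round(mean_time * factor)),
        "KeepAliveQueryMaxSleepTime": max(1, round(max_sleep_time * factor)),
    }


if __name__ == "__main__":
    # Lightly loaded system: plenty of idle CPU, so both values go down.
    print(suggest_keepalive_tuning(40.0, mean_time=100, max_sleep_time=100))
    # Saturated system: no idle CPU, so both values go up.
    print(suggest_keepalive_tuning(0.0, mean_time=100, max_sleep_time=100))
```

In practice the adjustment would be repeated at the load of interest, re-measuring idle CPU time after each change rather than applying a single step.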
Chunked encoding can also affect performance for HTTP/1.1 workloads, and tuning the response buffer size can improve it. With a higher OutputStreamSize, a plugin's response is more likely to fit in the buffer, so the server can send a Content-Length: header instead of chunking the response.
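The relationship between buffer size and chunking can be illustrated with a simplified model. The sketch below is not the server's implementation; frame_response and its buffer_size parameter are hypothetical names standing in for the response buffer. It only shows the HTTP/1.1 framing difference: a body that fits in the buffer has a known length before any headers are sent, while a larger body has to be flushed piece by piece as chunks.

```python
def frame_response(body: bytes, buffer_size: int) -> bytes:
    """Frame a response body the way a buffering HTTP/1.1 server might.

    buffer_size plays the role of the response buffer; this is a model
    for illustration, not the actual server code path.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")

    if len(body) <= buffer_size:
        # The whole body is buffered, so its length is known up front.
        return b"Content-Length: %d\r\n\r\n" % len(body) + body

    # The body overflows the buffer: each full buffer is flushed as one chunk
    # (hex size, CRLF, data, CRLF), ending with a zero-length chunk.
    parts = [b"Transfer-Encoding: chunked\r\n\r\n"]
    for start in range(0, len(body), buffer_size):
        chunk = body[start:start + buffer_size]
        parts.append(b"%x\r\n" % len(chunk) + chunk + b"\r\n")
    parts.append(b"0\r\n\r\n")
    return b"".join(parts)


if __name__ == "__main__":
    print(frame_response(b"hello world", buffer_size=64))  # fits: Content-Length
    print(frame_response(b"hello world", buffer_size=4))   # overflows: chunked
```

Each chunk adds its own size line and trailing CRLF, so a small buffer means more framing overhead and more writes for the same payload.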