Check how many simultaneous requests are typically being handled when a large number of clients are trying to access your application. Thread dumps can be useful to see where these threads are waiting. If too many threads are waiting, your site’s performance may be impaired by thread context switching. If your server spends too much time context switching between requests, you might see throughput decrease as load increases. Check the percentage of System CPU time consumed by your JVM. If this is more than 10% to 20%, it is potentially a problem. Thread context switching also depends in part on how your JVM schedules threads with different priorities.
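As a rough complement to a full thread dump, the sketch below counts live JVM threads by state from inside the application. The class and method names (ThreadStateSummary, countByState) are hypothetical and not part of ATG. It only reports how many threads are in each state, not where they are waiting, so a real thread dump (for example, from a tool such as jstack) is still needed for that.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, not part of ATG: summarizes thread states.
class ThreadStateSummary {

    // Counts the given threads by their current Thread.State.
    static Map<Thread.State, Integer> countByState(Iterable<Thread> threads) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : threads) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Thread.getAllStackTraces() returns a snapshot of all live threads.
        Map<Thread.State, Integer> counts =
                countByState(Thread.getAllStackTraces().keySet());
        counts.forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

A large and growing number of threads in the BLOCKED or WAITING states under load is a hint to look at a full thread dump to see what they are waiting for.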
You can also reduce overhead from thread context switching by making sure you have at least one CPU for each process involved in handling the majority of requests: one CPU for your HTTP server, one for ATG, one for the database server.
You might also see throughput go down as load increases when all of your request handler threads are busy waiting for some resource at the same time. For example, suppose your server has 40 request handler threads and one page on your site makes a very long-running database query. If you increase the number of clients well beyond 40, you might see all 40 threads waiting for the response to this query. At this point, your throughput goes down because your CPU is idle. You should either speed up the slow requests (perhaps by caching the results of these queries) or increase the number of request threads to increase the parallelism. Of course, at some point, the database may become the bottleneck of your site (which is likely before you have 40 simultaneous queries running).
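To make the trade-off concrete, here is a simplified back-of-the-envelope model: when every handler thread is tied up for the full duration of a request, throughput cannot exceed the number of threads divided by the time per request. The 2-second query time is an assumed figure for illustration only, the class name is hypothetical, and the model ignores the point at which the database itself becomes the bottleneck.

```java
// Hypothetical illustration, not part of ATG.
class BlockedThreadThroughput {

    // Upper bound on requests per second when each handler thread is
    // occupied for secondsPerRequest (mostly waiting on the slow query).
    static double maxRequestsPerSecond(int handlerThreads, double secondsPerRequest) {
        return handlerThreads / secondsPerRequest;
    }

    public static void main(String[] args) {
        // Baseline: 40 threads, assumed 2-second query.
        System.out.println(maxRequestsPerSecond(40, 2.0));
        // Remedy 1: speed up the slow request (for example, by caching).
        System.out.println(maxRequestsPerSecond(40, 0.5));
        // Remedy 2: increase the number of request threads.
        System.out.println(maxRequestsPerSecond(80, 2.0));
    }
}
```

Under these assumptions, both remedies raise the ceiling, but adding threads only helps as long as the database can keep up with the extra simultaneous queries.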
Context switching can also occur when you have a network protocol that synchronizes too often (for example, by sending a request and then waiting for the response before continuing).
Typically, these context switches can be overcome by increasing the parallelism in your site. If there are just too many of these synchronization points, though, this won’t work. For example, if you make 40 synchronous RPC calls for each HTTP request and handle one request at a time, you need 80 process context switches for each request (two for each call). If you handled 2 requests at a time, you’d cut the number of context switches per request in half. These handlers are in addition to the handlers you’d need to hide any I/O or database activity, so the number can add up fast.
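The arithmetic in this example can be sketched as follows. This assumes the simple model implied by the text: each synchronous RPC call costs two process context switches, and handling several requests at once spreads that cost evenly across them. The class and method names are hypothetical.

```java
// Hypothetical illustration of the estimate in the text, not part of ATG.
class ContextSwitchEstimate {

    // Estimated process context switches per HTTP request: two per
    // synchronous RPC call, amortized over the requests handled at once.
    static double switchesPerRequest(int rpcCallsPerRequest, int concurrentRequests) {
        return 2.0 * rpcCallsPerRequest / concurrentRequests;
    }

    public static void main(String[] args) {
        // 40 RPC calls, one request at a time.
        System.out.println(switchesPerRequest(40, 1));
        // 40 RPC calls, two requests at a time: half as many per request.
        System.out.println(switchesPerRequest(40, 2));
    }
}
```

Real systems will not amortize this cleanly, so treat the result as an order-of-magnitude estimate of how many handlers the synchronization points alone would require.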

