Check how many simultaneous requests your application typically handles when a large number of clients are accessing it. When the site is under load, go to the DRP Server page of the Dynamo administration interface, at http://hostname:port/nucleus/atg/dynamo/server/DrpServer, and see how many handlers are active.

Thread dumps are also useful for seeing where these threads are waiting. If too many threads are active at once, your site's performance may be impaired by thread context switching, and you might see throughput decrease as load increases because the server is spending too much time switching between requests. Check the percentage of System CPU time consumed by your JVM; if it is more than 10% to 20%, context switching is potentially a problem. If a thread dump shows several threads in a runnable state, you may want to try lowering the number of DrpServer handler threads.

You should also verify that the priorityDelta property of the /atg/dynamo/server/DrpConnectionAcceptor component is a negative number. This setting should ensure that Dynamo finishes processing requests that already have work to do before it picks up a new request, which reduces thread context switching. However, the effect also depends in part on how your JVM schedules threads with different priorities.
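For example, assuming the standard Nucleus convention of overriding component properties in your localconfig directory, a configuration layer like the following would set a negative priority delta. The value -1 is only an illustration; the recommendation above requires only that priorityDelta be negative.

    # localconfig/atg/dynamo/server/DrpConnectionAcceptor.properties
    # Sketch only: -1 is an assumed value; any negative number
    # satisfies the recommendation above.
    priorityDelta=-1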

You can also reduce overhead from thread context switching by making sure you have at least one CPU for each process involved in handling the majority of requests: one CPU for your HTTP server, one for Dynamo, one for the database server.

You might see throughput go down as load increases if all of your DRP handler threads are busy waiting for some resource at the same time. For example, you might have one page on your site that makes a very long-running database query. If you increase the number of clients well beyond 40, you might see all 40 handler threads waiting for the response to this query. At that point, your throughput goes down because your CPU is idle. You should either speed up the slow requests (perhaps by caching the results of these queries) or increase the number of DRP handler threads to increase parallelism. Of course, at some point the database may become the bottleneck of your site (which is likely to happen before you have 40 simultaneous queries running).
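As a sketch of the caching approach, the class below memoizes the result of a slow query so that at most one handler thread runs the query for a given key, while other threads asking for the same key wait for that result instead of issuing duplicate queries. This is a generic, hypothetical example: QueryCache and Query are not Dynamo classes, and in practice you might instead use whatever caching facilities Dynamo provides for your data source.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Future;
    import java.util.concurrent.FutureTask;

    /**
     * Hypothetical sketch of caching a slow query result so that repeated
     * requests for the same data do not each occupy a DRP handler thread
     * waiting on the database. None of these names come from Dynamo; they
     * stand in for your own data-access code.
     */
    public class QueryCache<K, V> {
        private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
        private final Query<K, V> query;

        public QueryCache(Query<K, V> query) {
            this.query = query;
        }

        /**
         * Returns the cached result, running the slow query at most once per
         * key. Threads that ask for a key already being computed wait for
         * that result rather than running a duplicate query.
         */
        public V get(K key) throws InterruptedException, ExecutionException {
            Future<V> future = cache.get(key);
            if (future == null) {
                FutureTask<V> task = new FutureTask<>(() -> query.run(key));
                future = cache.putIfAbsent(key, task);
                if (future == null) {   // this thread won the race; run the query
                    future = task;
                    task.run();
                }
            }
            return future.get();
        }

        /** Stand-in for whatever long-running database query the page makes. */
        public interface Query<K, V> {
            V run(K key);
        }
    }

Storing Future objects rather than finished values means that concurrent misses on the same key block on one in-flight query instead of each tying up a handler thread with its own copy of the query.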

Note that if you are using green threads rather than native threads, thread context switching shows up as user time rather than as System CPU time.

Context switching can also occur when you have a network protocol that synchronizes too often (such as sending a request and then waiting for the response before continuing).

Typically, you can overcome these context switches by increasing the parallelism in your site (that is, by increasing the number of DrpServer handler threads). If there are simply too many of these synchronization points, though, this approach breaks down. For example, if you make 40 synchronous RPC calls for each HTTP request, you need to switch between processes 80 times per request (once to the other process and once back for each call) if you handle one request at a time. If you handle two requests at a time, each switch can carry work for both outstanding requests, so you cut the number of context switches per request in half. The handler threads you add to cover these synchronization points are in addition to the ones you need to hide any I/O or database activity, so the total number of handler threads can add up fast.
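To make that arithmetic concrete, the short program below simply restates the example's numbers: two process switches per synchronous call, amortized over however many requests are in flight at once. The figures are the ones from the example above, under the idealized assumption that each trip to the other process can serve every outstanding request; they are not measurements.

    /** Restates the context-switch arithmetic from the example above. */
    public class ContextSwitchMath {
        public static void main(String[] args) {
            int rpcCallsPerRequest = 40; // synchronous RPC calls per HTTP request
            int switchesPerCall = 2;     // one switch to the other process, one back

            for (int concurrentRequests : new int[] { 1, 2, 4 }) {
                int switchesPerRequest =
                    rpcCallsPerRequest * switchesPerCall / concurrentRequests;
                System.out.println(concurrentRequests + " request(s) at a time -> "
                        + switchesPerRequest + " process switches per request");
            }
        }
    }

Run as written, this prints 80 switches per request when requests are handled one at a time and 40 when two are handled concurrently, matching the example.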

 