Sun Java System Portal Server 7 Deployment Planning Guide

mpstat

The mpstat utility is a useful tool to monitor CPU utilization, especially with multithreaded applications running on multiprocessor machines, which is a typical configuration for enterprise solutions.

Use mpstat with an interval argument of between 5 and 10 seconds.

An interval shorter than 5 seconds might be more difficult to analyze. A longer interval smooths the data by averaging out spikes that could skew the results.
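The smoothing effect can be illustrated with a small sketch. The one-second usr readings below are made-up values, and average_windows is a hypothetical helper that averages consecutive readings the way a longer reporting interval roughly would.

```python
def average_windows(samples, window):
    """Average consecutive, non-overlapping groups of `window` samples."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]

# Hypothetical one-second usr percentages with a single spike (95).
usr_1s = [40, 42, 95, 41, 39, 43, 40, 38, 44, 41]

# Averaged over 5-second windows, the spike is absorbed into the first value.
print(average_windows(usr_1s, 5))  # prints [51.4, 41.2]
```

The spike still raises the first window's average, but it no longer dominates the picture as it does in the one-second data.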

Output

# mpstat 10

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
 0    1   0 5529   442  302  419  166   12  196    0   775   95   5   0   0
 1    1   0  220   237  100  383  161   41   95    0   450   96   4   0   0
 4    0   0   27   192  100  178   94   38   44    0   100   99   1   0   0
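For scripted analysis, output like the above can be turned into per-CPU records. The following is a minimal sketch, not part of the product: parse_mpstat is a hypothetical helper, it assumes every column after the header is an integer, and it treats each repeated header line as the start of a new report.

```python
SAMPLE = """\
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
 0    1   0 5529   442  302  419  166   12  196    0   775   95   5   0   0
 1    1   0  220   237  100  383  161   41   95    0   450   96   4   0   0
 4    0   0   27   192  100  178   94   38   44    0   100   99   1   0   0
"""

def parse_mpstat(text):
    """Return a list of reports; each report is a list of per-CPU dicts."""
    reports = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "CPU":
            # A header line starts a new report (mpstat repeats it per interval).
            header = fields
            reports.append([])
        elif header is not None and len(fields) == len(header):
            reports[-1].append(
                {name: int(value) for name, value in zip(header, fields)}
            )
    return reports

reports = parse_mpstat(SAMPLE)
for row in reports[0]:
    print(row["CPU"], row["intr"], row["ithr"], row["usr"], row["sys"])
```

Because the column names are read from the header line, the sketch does not depend on a fixed column order.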

What to Look For

Note the much higher intr and ithr values for certain CPUs. Solaris selects some CPUs to handle the system interrupts. Which CPUs are chosen, and how many, depends on the I/O devices attached to the system, the physical location of the devices, and whether interrupts have been disabled on a CPU (psradm command).

- intr - Interrupts.
- ithr - Thread interrupts (not including the clock interrupts).
- csw - Context switches (voluntary and involuntary combined). When this number slowly increases, and the application is not I/O bound, it may indicate mutex contention.
- icsw - Involuntary Context switches. When this number increases past 500, the system is under a heavy load.
- smtx - Spins on mutexes (lock not acquired on the first try). A sharp increase, for example from 50 to 500, is a sign of a system resource bottleneck (for example, network or disk).
- usr, sys, and idl - Together, these three columns represent CPU saturation. A well-tuned application under full load (0% idle) should spend 80% to 90% of its time in usr and, correspondingly, 20% to 10% in sys. A smaller percentage value for sys reflects more time for user code and less preemption, which results in greater throughput for the Portal application.
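The rules of thumb in this list can be sketched as a simple check on two consecutive samples for one CPU. This is an illustration only: review_cpu is a hypothetical helper, the default thresholds are taken from the examples in the text (500 for icsw, a tenfold smtx jump as in "50 to 500", 20% sys at full load), and the second sample is made up.

```python
def review_cpu(prev, cur, icsw_limit=500, smtx_factor=10, sys_limit=20):
    """Compare two consecutive mpstat rows for one CPU and list findings."""
    findings = []
    if cur["icsw"] > icsw_limit:
        findings.append("icsw above %d: heavy load" % icsw_limit)
    if prev["smtx"] > 0 and cur["smtx"] >= prev["smtx"] * smtx_factor:
        findings.append("smtx rose sharply: possible resource bottleneck")
    if cur["idl"] == 0 and cur["sys"] > sys_limit:
        findings.append("sys above %d%% at full load" % sys_limit)
    return findings

# prev uses the CPU 0 values from the sample output; cur is a made-up later sample.
prev = {"icsw": 166, "smtx": 196, "usr": 95, "sys": 5, "idl": 0}
cur = {"icsw": 620, "smtx": 2100, "usr": 70, "sys": 30, "idl": 0}

for finding in review_cpu(prev, cur):
    print(finding)
```

Treat the thresholds as starting points to adjust for your own system rather than fixed limits.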

Considerations

Make your application available to as many CPUs as it can efficiently use. As an example, suppose testing shows that one instance performs best on 2 CPUs. On a machine with 28 CPUs, you can then expect that creating 14 2-CPU processor sets would yield the best performance.
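The arithmetic behind this example can be sketched as follows. The helper plan_processor_sets is hypothetical, and the 28 contiguous CPU IDs are an assumption made to match 14 sets of 2 CPUs; real CPU IDs on Solaris are not always contiguous, so read them from your own system.

```python
def plan_processor_sets(cpu_ids, cpus_per_instance):
    """Group CPU IDs into equal-sized sets; leftover CPUs stay unassigned."""
    usable = len(cpu_ids) - len(cpu_ids) % cpus_per_instance
    sets = [
        cpu_ids[i:i + cpus_per_instance]
        for i in range(0, usable, cpus_per_instance)
    ]
    return sets, cpu_ids[usable:]

# Assumed machine: 28 CPUs with IDs 0..27, one instance per 2-CPU set.
sets, leftover = plan_processor_sets(list(range(28)), 2)
print(len(sets), "sets; first:", sets[0], "last:", sets[-1], "unassigned:", leftover)
```

Each resulting group corresponds to one processor set to which a single instance would be bound.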

An increasing csw value often accompanies an increase in network use. A common cause of a high csw value is having created too many socket connections, either by not pooling connections or by handling new connections inefficiently. If this is the case, you would also see a high TCP connection count when executing netstat -a | wc -l. For more information, refer to netstat.
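A raw line count includes headers and non-TCP entries, so it can help to count connections by state instead. The sketch below is an assumption-laden illustration: count_tcp_states is a hypothetical helper, the sample lines are made up, and it assumes the TCP state is the last field on each line of netstat -a output.

```python
from collections import Counter

TCP_STATES = {
    "ESTABLISHED", "TIME_WAIT", "CLOSE_WAIT", "FIN_WAIT_1", "FIN_WAIT_2",
    "SYN_SENT", "SYN_RCVD", "SYN_RECV", "LAST_ACK", "CLOSING", "LISTEN",
}

def count_tcp_states(netstat_text):
    """Count lines whose last field is a known TCP state."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts

# Made-up sample lines; real input would be the saved output of netstat -a.
SAMPLE = """\
portal.8080   client1.40312   49640  0 49640  0 ESTABLISHED
portal.8080   client2.51877   49640  0 49640  0 ESTABLISHED
portal.8080   client3.33910   49640  0 49640  0 TIME_WAIT
      *.8080        *.*           0  0 49152  0 LISTEN
"""

counts = count_tcp_states(SAMPLE)
print(counts["ESTABLISHED"], counts["TIME_WAIT"], counts["LISTEN"])  # prints 2 1 1
```

A large and growing number of short-lived connections, for example many entries in TIME_WAIT, would be consistent with connections not being pooled.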

Do you observe increasing icsw? A common cause of this is preemption, most likely because a thread reached the end of its time slice on the CPU.