When you are doing clock-based profiling, the data collected depends on the metrics provided by the operating system.
In clock-based profiling under the Solaris OS, the state of each LWP is stored at regular time intervals. This time interval is called the profiling interval. The information is stored in an integer array: one element of the array is used for each of the ten microaccounting states maintained by the kernel. The data collected is converted by the Performance Analyzer into times spent in each state, with a resolution of the profiling interval. The default profiling interval is approximately 10 milliseconds (10 ms). The Collector provides a high-resolution profiling interval of approximately 1 ms and a low-resolution profiling interval of approximately 100 ms, and, where the OS permits, allows arbitrary intervals. Running the collect command with no arguments prints the range and resolution allowable on the system on which it is run.
The metrics that are computed from clock-based data are defined in the following table.
Table 2–1 Solaris Timing Metrics
| Metric | Definition |
|---|---|
| User CPU time | LWP time spent running in user mode on the CPU. |
| Wall time | LWP time spent in LWP 1. This is usually the “wall clock time”. |
| Total LWP time | Sum of all LWP times. |
| System CPU time | LWP time spent running in kernel mode on the CPU or in a trap state. |
| Wait CPU time | LWP time spent waiting for a CPU. |
| User lock time | LWP time spent waiting for a lock. |
| Text page fault time | LWP time spent waiting for a text page. |
| Data page fault time | LWP time spent waiting for a data page. |
| Other wait time | LWP time spent waiting for a kernel page, or time spent sleeping or stopped. |
For multithreaded experiments, times other than wall clock time are summed across all LWPs. Wall time as defined is not meaningful for multiple-program multiple-data (MPMD) programs.
Timing metrics show where your program spent its time, broken down into the categories above, and can be used to improve its performance.
High user CPU time tells you where the program did most of the work. Use it to find the parts of the program that would gain the most from redesigning the algorithm.
High system CPU time tells you that your program is spending a lot of time in calls to system routines.
High wait CPU time tells you that there are more threads ready to run than there are CPUs available, or that other processes are using the CPUs.
High user lock time tells you that threads are unable to obtain the lock that they request.
High text page fault time means that the code generated by the linker is organized in memory so that calls or branches cause a new page to be loaded. Creating and using a mapfile (see “Generating and Using a Mapfile” in the Performance Analyzer online help) can fix this kind of problem.
High data page fault time indicates that access to the data is causing new pages to be loaded. Reorganizing the data structure or the algorithm in your program can fix this problem.
Under the Linux OS, the only metric available is User CPU time. Although the total CPU utilization time reported is accurate, the Analyzer cannot determine the proportion that is actually System CPU time as reliably as it can on the Solaris OS. Although the Analyzer displays the information as if the data were for a lightweight process (LWP), in reality there are no LWPs on a Linux OS; the displayed LWP ID is actually the thread ID.