CHAPTER 5
This chapter describes how you can use the Sun Studio performance tools to profile the kernel while the Solaris OS is running a load. Kernel profiling is available if you are running Sun Studio software on the Solaris 10 OS.
You can record kernel profiles with the er_kernel utility.
The er_kernel utility uses the DTrace driver, a comprehensive dynamic tracing facility that is built into the Solaris 10 OS.
The er_kernel utility captures kernel profile data and records the data as an Analyzer experiment in the same format as a user profile. The experiment can be processed by the er_print utility or the Performance Analyzer. A kernel experiment can show function data, caller-callee data, instruction-level data, and a timeline, but not source-line data (because most Solaris OS modules do not contain line-number tables).
Before you can use the er_kernel utility for kernel profiling, you need to set up access to the DTrace driver.
Normally, the DTrace driver is restricted to user root. To run the er_kernel utility as a user other than root, you must have specific privileges assigned and be a member of group sys. To assign the necessary privileges, add the following line to the file /etc/user_attr:
To add yourself to the group sys, add your user name to the sys line in the file /etc/group.
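Before running the er_kernel utility as a non-root user, it can help to confirm that the group change took effect. The following is a minimal sketch, not part of the Sun Studio tools: the helper name is_group_member is hypothetical, and it assumes the standard /etc/group line layout of name:password:gid:member-list. It checks only the supplementary member list, not a primary group set in /etc/passwd.

```python
def is_group_member(group_file_text: str, group: str, user: str) -> bool:
    """Return True if `user` appears in the member list of `group`.

    Assumes each line has the form  name:password:gid:user1,user2,...
    (hypothetical helper, for illustration only).
    """
    for line in group_file_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) < 4 or fields[0] != group:
            continue  # malformed line, or a different group
        members = [m.strip() for m in fields[3].split(",") if m.strip()]
        return user in members
    return False  # group not found in the file
```

To check a real system, pass the contents of /etc/group as the first argument, the string "sys" as the group, and your own user name as the user.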
You can run the er_kernel utility to profile only the kernel or both the kernel and the load you are running. For a complete description of the er_kernel command, see the er_kernel(1) man page.
1. Collect the experiment by typing:
2. Run whatever load you want in a separate shell.
3. When the load completes, terminate the er_kernel utility by typing Ctrl-C.
4. Load the resulting experiment, named ktest.1.er by default, into the Performance Analyzer or the er_print utility.
Kernel clock profiling produces one performance metric, labeled KCPU Cycles. In the Performance Analyzer, it is shown for kernel functions in the Functions Tab, for callers and callees in the Caller-Callee Tab, and for instructions in the Disassembly Tab. The Source Tab does not show data, because kernel modules, as shipped, do not usually contain file and line symbol table information (stabs).
You can replace the -p on argument to the er_kernel utility with -p high for high-resolution profiling or -p low for low-resolution profiling. If you expect the run of the load to take 2 to 20 minutes, the default clock profiling is appropriate. If you expect the run to take less than 2 minutes, use -p high; if you expect the run to take longer than 20 minutes, use -p low.
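The duration guideline above can be written as a small decision function. This is only a sketch of the rule of thumb stated in the text. The function name choose_profile_arg is hypothetical, and treating exactly 2 and exactly 20 minutes as part of the default range is an assumption.

```python
def choose_profile_arg(expected_minutes: float) -> str:
    """Return the value to pass after -p for an expected load duration.

    Hypothetical helper that encodes the guideline in the text:
      under 2 minutes  -> "high"  (high-resolution profiling)
      2 to 20 minutes  -> "on"    (default clock profiling)
      over 20 minutes  -> "low"   (low-resolution profiling)
    """
    if expected_minutes < 0:
        raise ValueError("expected_minutes must not be negative")
    if expected_minutes < 2:
        return "high"
    if expected_minutes > 20:
        return "low"
    return "on"
```

For a load expected to run for about 45 minutes, the function returns "low", which corresponds to the -p low argument.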
You can add a -t n argument, which will cause the er_kernel utility to terminate itself after n seconds.
You can add the -v argument if you want more information about the run printed to the screen. The -n argument lets you see a preview of the experiment that would be recorded, without actually recording anything.
By default, the experiment generated by the er_kernel utility is named ktest.1.er; the number is incremented for successive runs.
If you have a single command, either a program or a script, that you wish to use as a load:
1. Collect the experiment by typing:
2. Analyze the experiment by typing:
The er_kernel utility forks a child process and pauses for a quiet period, and then the child process runs the specified load. When the load terminates, the er_kernel utility pauses again for a quiet period and then exits. The experiment shows the behavior of the Solaris OS during the running of the load, and during the quiet periods before and after. You can specify the duration of the quiet period in seconds with the -q argument to the er_kernel command.
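When er_kernel runs are scripted, the argument list can be assembled programmatically. The sketch below only builds the list and does not launch anything. The function name build_er_kernel_argv and its parameter names are hypothetical. It uses only the -p, -t, -q, -v, and -n arguments described in this chapter, and it assumes that options precede the load command on the command line; check the er_kernel(1) man page before relying on that order.

```python
from typing import List, Optional, Sequence


def build_er_kernel_argv(
    load: Optional[Sequence[str]] = None,
    profile: str = "on",
    quiet_seconds: Optional[int] = None,
    duration_seconds: Optional[int] = None,
    verbose: bool = False,
    dry_run: bool = False,
) -> List[str]:
    """Assemble an er_kernel argument list (hypothetical helper).

    Assumption: options come first, followed by the load command
    and its own arguments.
    """
    if profile not in ("on", "high", "low"):
        raise ValueError("profile must be 'on', 'high', or 'low'")
    argv = ["er_kernel", "-p", profile]
    if duration_seconds is not None:
        argv += ["-t", str(duration_seconds)]  # stop after n seconds
    if quiet_seconds is not None:
        argv += ["-q", str(quiet_seconds)]  # quiet period around the load
    if verbose:
        argv.append("-v")  # print more information about the run
    if dry_run:
        argv.append("-n")  # preview only, record nothing
    if load:
        argv += list(load)  # the load command and its arguments
    return argv
```

The resulting list can be passed to subprocess.run on a system where er_kernel is installed and DTrace access has been set up.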
If you have a single program that you wish to use as a load, and you are interested in seeing its profile in conjunction with the kernel profile:
1. Collect both a kernel profile and a user profile by typing both the er_kernel command and the collect command:
2. Analyze the two profiles together by typing:
The data displayed by the Analyzer shows both the kernel profile from ktest.1.er and the user profile from test.1.er. The timeline allows you to see correlations between the two experiments.
You can invoke the er_kernel utility with one or more -T arguments to specify profiling for specific processes or threads:
The target threads must have been created before you invoke the er_kernel utility for them.
When you give one or more -T arguments, an additional metric, labeled Kthr Time, is produced. Data is captured for all profiled threads, whether running on a CPU or not. Special single-frame call stacks indicate that the process is suspended (the function <SLEEPING>) or waiting for the CPU (the function <STALLED>).
Functions with high Kthr Time metrics but low KCPU Cycles metrics are functions in which the profiled threads spend a lot of time waiting for other events.
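The comparison between the two metrics can be applied mechanically to exported function data. The following sketch is illustrative only. The function name mostly_waiting, the field names kthr_pct and kcpu_pct, and both thresholds are hypothetical. It assumes each metric has already been expressed as a percentage of its experiment total.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, float]]


def mostly_waiting(
    rows: List[Row],
    min_kthr_pct: float = 10.0,
    max_kcpu_pct: float = 1.0,
) -> List[str]:
    """Return names of functions with high Kthr Time but low KCPU Cycles.

    Hypothetical helper: each row is a dict with the keys "name",
    "kthr_pct", and "kcpu_pct". Results are ordered by Kthr Time,
    largest first.
    """
    flagged = [
        r for r in rows
        if r["kthr_pct"] >= min_kthr_pct and r["kcpu_pct"] <= max_kcpu_pct
    ]
    flagged.sort(key=lambda r: r["kthr_pct"], reverse=True)
    return [str(r["name"]) for r in flagged]
```

The thresholds are arbitrary starting points. What counts as high or low depends on the load being profiled.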
A few of the recorded fields in kernel experiments have a different meaning from the same fields in user-mode experiments. A user-mode experiment contains data for a single process ID only; a kernel experiment has data that may apply to many different process IDs. To better present that information, some of the field labels in the Analyzer have different meanings in the two types of experiments:
For example, in a kernel experiment, if you want to filter to only a few process IDs, enter the PIDs of interest in the LWP filter field in the Filter Data dialog box.