CHAPTER 5

Kernel Profiling

This chapter describes how you can use the Sun Studio performance tools to profile the kernel while the Solaris OS is running a load. Kernel profiling is available if you are running Sun Studio software on the Solaris 10 OS.


Kernel Experiments

You can record kernel profiles with the er_kernel utility.

The er_kernel utility uses the DTrace driver, a comprehensive dynamic tracing facility that is built into the Solaris 10 OS.

The er_kernel utility captures kernel profile data and records the data as an Analyzer experiment in the same format as a user profile. The experiment can be processed by the er_print utility or the Performance Analyzer. A kernel experiment can show function data, caller-callee data, instruction-level data, and a timeline, but not source-line data (because most Solaris OS modules do not contain line-number tables).


Setting Up Your System for Kernel Profiling

Before you can use the er_kernel utility for kernel profiling, you need to set up access to the DTrace driver.

Normally, the DTrace driver is restricted to user root. To run the er_kernel utility as a user other than root, you must have specific privileges assigned and be a member of group sys. To assign the necessary privileges, add the following line to the file /etc/user_attr:


username::::defaultpriv=basic,dtrace_kernel,dtrace_proc

To add yourself to the group sys, add your user name to the sys line in the file /etc/group.
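
The two edits described above can be sanity-checked with a small shell sketch. The helper names user_attr_line and group_line_has_user are hypothetical and are not part of the Solaris OS or Sun Studio; the user name jdoe and the sample group line are illustrative only. The sketch only formats and inspects text, so it does not modify /etc/user_attr or /etc/group.

```shell
#!/bin/sh
# Hypothetical helpers: they only format or inspect text and never edit /etc files.

# Print the /etc/user_attr entry described above for the given user name.
user_attr_line() {
    printf '%s::::defaultpriv=basic,dtrace_kernel,dtrace_proc\n' "$1"
}

# Succeed if the user appears in the member list (fourth field)
# of a single line in /etc/group format.
group_line_has_user() {
    # $1 = one group-file line, $2 = user name
    members=$(printf '%s\n' "$1" | cut -d: -f4)
    case ",$members," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example with an illustrative user name and group line.
user_attr_line jdoe
if group_line_has_user 'sys::3:root,bin,adm,jdoe' jdoe; then
    echo "jdoe is listed in group sys"
fi
```

On a real system you would pass the actual sys line from /etc/group instead of the sample string.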


Running the er_kernel Utility

You can run the er_kernel utility to profile only the kernel or both the kernel and the load you are running. For a complete description of the er_kernel command, see the er_kernel(1) man page.

Profiling the Kernel

1. Collect the experiment by typing:


% er_kernel -p on

2. Run whatever load you want in a separate shell.

3. When the load completes, terminate the er_kernel utility by typing Ctrl-C.

4. Load the resulting experiment, named ktest.1.er by default, into the Performance Analyzer or the er_print utility.

Kernel clock profiling produces one performance metric, labeled KCPU Cycles. In the Performance Analyzer, it is shown for kernel functions in the Functions Tab, for callers and callees in the Caller-Callee Tab, and for instructions in the Disassembly Tab. The Source Tab does not show data, because kernel modules, as shipped, do not usually contain file and line symbol table information (stabs).

You can replace the -p on argument to the er_kernel utility with -p high for high-resolution profiling or -p low for low-resolution profiling. If you expect the run of the load to take 2 to 20 minutes, the default clock profiling is appropriate. If you expect the run to take less than 2 minutes, use -p high; if you expect the run to take longer than 20 minutes, use -p low.
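
The guidance on choosing a profiling resolution can be written as a short shell sketch. The function name profile_resolution is hypothetical, and the sketch assumes the expected run length is given in whole minutes. It only prints the command line that would be used and does not run er_kernel.

```shell
#!/bin/sh
# Hypothetical helper: map an expected run length (whole minutes) to a -p value,
# following the 2-minute and 20-minute thresholds given in the text.
profile_resolution() {
    minutes=$1
    if [ "$minutes" -lt 2 ]; then
        echo high    # short runs: high-resolution profiling
    elif [ "$minutes" -gt 20 ]; then
        echo low     # long runs: low-resolution profiling
    else
        echo on      # 2 to 20 minutes: default clock profiling
    fi
}

# Print the command line for a load expected to run 45 minutes, without running it.
echo "er_kernel -p $(profile_resolution 45)"
```

For a 45-minute load, the sketch prints a command line that uses -p low.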

You can add a -t n argument, which will cause the er_kernel utility to terminate itself after n seconds.

You can add the -v argument if you want more information about the run printed to the screen. The -n argument lets you see a preview of the experiment that would be recorded, without actually recording anything.

By default, the experiment generated by the er_kernel utility is named ktest.1.er; the number is incremented for successive runs.

Profiling Under Load

If you have a single command, either a program or a script, that you wish to use as a load:

1. Collect the experiment by typing:


% er_kernel -p on load

2. Analyze the experiment by typing:


% analyzer ktest.1.er 

The er_kernel utility forks a child process and pauses for a quiet period, and then the child process runs the specified load. When the load terminates, the er_kernel utility pauses again for a quiet period and then exits. The experiment shows the behavior of the Solaris OS during the running of the load, and during the quiet periods before and after. You can specify the duration of the quiet period in seconds with the -q argument to the er_kernel command.

Profiling the Kernel and Load Together

If you have a single program that you wish to use as a load, and you are interested in seeing its profile in conjunction with the kernel profile:

1. Collect both a kernel profile and a user profile by typing both the er_kernel command and the collect command:


% er_kernel collect load

2. Analyze the two profiles together by typing:


% analyzer ktest.1.er test.1.er 

The data displayed by the Analyzer shows both the kernel profile from ktest.1.er and the user profile from test.1.er. The timeline allows you to see correlations between the two experiments.



Note - To use a script as the load, and profile the various parts of it, prepend the collect command, with the appropriate arguments, to the various commands within the script.



Profiling a Specific Process or Kernel Thread

You can invoke the er_kernel utility with one or more -T arguments to specify profiling for specific processes or threads.

The target threads must have been created before you invoke the er_kernel utility for them.

When you give one or more -T arguments, an additional metric, labeled Kthr Time, is produced. Data is captured for all profiled threads, whether running on a CPU or not. Special single-frame call stacks indicate that the process is suspended (the function <SLEEPING>) or waiting for the CPU (the function <STALLED>).

Functions with high Kthr Time metrics but low KCPU Cycles metrics are functions in which the profiled threads spend a lot of time waiting for other events.
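
The comparison of the two metrics can be illustrated with a small shell sketch. Everything in it is an assumption made for illustration: the helper name mostly_waiting, the three-column input format (function name, KCPU value, Kthr value, both taken to be in seconds), the 0.25 cutoff, and the sample numbers. It is not the output format of the er_print utility or the Performance Analyzer.

```shell
#!/bin/sh
# Hypothetical helper: read lines of "function KCPU-seconds Kthr-seconds" on stdin
# and print the functions whose KCPU value is a small fraction of their Kthr value,
# that is, functions where the profiled threads were mostly waiting.
# The 0.25 cutoff is an arbitrary illustration, not a tool default.
mostly_waiting() {
    awk '$3 > 0 && ($2 / $3) < 0.25 { print $1 }'
}

# Made-up sample values, for illustration only.
printf '%s\n' 'func_a 4.0 5.0' 'func_b 0.5 9.0' | mostly_waiting
```

With the sample values, only func_b is printed, because its KCPU value is a small fraction of its Kthr value.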


Analyzing a Kernel Profile

A few of the recorded fields in kernel experiments have a different meaning from the same fields in user-mode experiments. A user-mode experiment contains data for a single process ID only; a kernel experiment has data that may apply to many different process IDs. To better present that information, some of the field labels in the Analyzer have different meanings in the two types of experiments:


TABLE 5-1 Field Label Meanings for Kernel Experiments in the Analyzer

Analyzer Label    Meaning in User-mode Experiments    Meaning in Kernel Experiments
LWP               User process LWP ID                 Process PID; 0 for kernel threads
Thread            Thread ID within process            Kernel TID; kernel DID for kernel threads


For example, in a kernel experiment, if you want to filter to only a few process IDs, enter the PIDs of interest in the LWP filter field in the Filter Data dialog box.