Oracle Solaris Studio 12.3: Performance Analyzer

Running the er_kernel Utility

You can run the er_kernel utility to profile only the kernel, or both the kernel and the load you are running. For a complete description of the er_kernel command, see the er_kernel(1) man page.

To display a usage message, run the er_kernel command without arguments.

Profiling the Kernel

  1. Collect the experiment by typing:
    % er_kernel -p on
  2. Run whatever load you want in a separate shell.
  3. When the load completes, terminate the er_kernel utility by typing Ctrl-C.
  4. Load the resulting experiment, named ktest.1.er by default, into the Performance Analyzer or the er_print utility.

    Kernel clock profiling produces two metrics: KCPU Cycles (metric name kcycles), for clock profile events recorded in the kernel founder experiment, and KUCPU Cycles (metric name kucycles), for clock profile events recorded in user process subexperiments when the CPU is in user mode. In the Performance Analyzer, the metrics are shown for kernel functions in the Functions tab, for callers and callees in the Callers-Callees tab, and for instructions in the Disassembly tab. The Source tab does not show data, because kernel modules, as shipped, do not usually contain file and line symbol table information (stabs).

    You can replace the -p on argument to the er_kernel utility with -p high for high-resolution profiling or -p low for low-resolution profiling. If you expect the run of the load to take 2 to 20 minutes, the default clock profiling is appropriate. If you expect the run to take less than 2 minutes, use -p high; if you expect the run to take longer than 20 minutes, use -p low.
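
The rule of thumb above can be written down as a small helper. The following Python sketch is illustrative only: the function name profiling_rate_for is hypothetical, and treating runs of exactly 2 or 20 minutes as part of the default range is an assumption, since the text does not say which side those boundaries fall on.

```python
def profiling_rate_for(expected_minutes: float) -> str:
    """Return the value to pass to `er_kernel -p` for a load of the given expected length."""
    if expected_minutes < 2:
        return "high"   # short run: high-resolution profiling
    if expected_minutes > 20:
        return "low"    # long run: low-resolution profiling
    return "on"         # 2 to 20 minutes: default clock profiling


# Build the command line for a load expected to take about 45 seconds.
command = ["er_kernel", "-p", profiling_rate_for(0.75)]
print(" ".join(command))
```

The helper only picks the -p value; the expected run length still has to come from your own knowledge of the load.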

    You can add a -t duration argument, which will cause the er_kernel utility to terminate itself according to the time specified by duration.

    The duration can be specified as a single number, with an optional m or s suffix, to indicate the time in minutes or seconds at which the experiment should be terminated. Without a suffix, the number is interpreted as seconds. The duration can also be specified as two such numbers separated by a hyphen, in which case data collection is paused until the first time elapses and then begins. When the second time is reached, data collection terminates. If the second number is zero, data is collected from the end of the initial pause until the end of the program's run. Even if the experiment is terminated, the target process is allowed to run to completion.
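
To make the duration forms concrete, the following Python sketch interprets a -t value according to the rules described above. It is not the parser that er_kernel uses. The function names are hypothetical, and two points are assumptions: only whole numbers are accepted, and both times are measured from the start of the run.

```python
import re
from typing import Optional, Tuple

# One time value: digits with an optional m (minutes) or s (seconds) suffix.
_TIME = re.compile(r"(\d+)([ms]?)")


def _to_seconds(text: str) -> int:
    match = _TIME.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid time value: {text!r}")
    value, suffix = int(match.group(1)), match.group(2)
    return value * 60 if suffix == "m" else value  # no suffix means seconds


def interpret_duration(spec: str) -> Tuple[int, Optional[int]]:
    """Return (start, end) of data collection in seconds from the start of the run.

    end is None when collection continues until the run ends.
    """
    if "-" not in spec:
        return 0, _to_seconds(spec)  # collect from the start, stop at the given time
    first, second = spec.split("-", 1)
    end = _to_seconds(second)
    return _to_seconds(first), (end if end != 0 else None)


for spec in ("90", "5m", "30s-2m", "10-0"):
    print(spec, "->", interpret_duration(spec))
```

For example, 30s-2m is read as "pause for 30 seconds, then collect until the 2 minute mark", and 10-0 as "pause for 10 seconds, then collect until the run ends".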

    If no time duration or interval is specified, er_kernel will run until terminated. You can terminate it by pressing Ctrl-C (SIGINT), or by using the kill command and sending SIGINT, SIGQUIT, or SIGTERM to the er_kernel process. The er_kernel process terminates the experiment and runs er_archive (unless -A off is specified) when any of those signals is sent to the process. The er_archive utility reads the list of shared objects referenced in the experiment, and constructs an archive file for each object.
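
A script that starts er_kernel itself can end the experiment with one of the same signals. The following Python sketch uses only the standard subprocess and signal modules and assumes a POSIX system. The helper name stop_collector and the 60 second timeout are arbitrary choices for illustration.

```python
import signal
import subprocess

# Signals that, per the text above, cause er_kernel to terminate the experiment.
TERMINATING_SIGNALS = (signal.SIGINT, signal.SIGQUIT, signal.SIGTERM)


def stop_collector(process: subprocess.Popen,
                   sig: int = signal.SIGINT,
                   timeout: float = 60.0) -> int:
    """Send one of the terminating signals and wait for the process to exit.

    Returns the exit status reported by Popen.wait().
    """
    if sig not in TERMINATING_SIGNALS:
        raise ValueError("use SIGINT, SIGQUIT, or SIGTERM")
    process.send_signal(sig)
    return process.wait(timeout=timeout)


# Example (not run here; requires er_kernel and sufficient privileges):
#   collector = subprocess.Popen(["er_kernel", "-p", "on"])
#   ... run the load ...
#   stop_collector(collector)
```

The wait gives er_kernel time to finish writing the experiment and to run er_archive before the script continues.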

    You can add the -v argument if you want more information about the run printed to the screen. The -n argument lets you see a preview of the experiment that would be recorded, without actually recording anything.

    By default, the experiment generated by the er_kernel utility is named ktest.1.er; the number is incremented for successive runs.
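
Because the number changes with each run, a script that loads results may need to find the most recent default-named experiment. The following Python sketch picks the name with the highest number from a list of directory entries. The function name is hypothetical, and the sketch does not model how er_kernel itself chooses the next number.

```python
import re
from typing import Iterable, Optional

# Default kernel experiment names look like ktest.<number>.er
_KTEST = re.compile(r"ktest\.(\d+)\.er")


def latest_kernel_experiment(names: Iterable[str]) -> Optional[str]:
    """Return the default-named experiment with the highest run number, or None."""
    numbered = []
    for name in names:
        match = _KTEST.fullmatch(name)
        if match is not None:
            numbered.append((int(match.group(1)), name))
    return max(numbered)[1] if numbered else None


print(latest_kernel_experiment(["ktest.1.er", "ktest.2.er", "test.1.er"]))
```

In practice, the list of names could come from os.listdir() on the directory in which er_kernel was run.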

Profiling Under Load

If you have a single command, either a program or a script, that you wish to use as a load:

  1. Collect the experiment by typing:
    % er_kernel -p on load

    If load is a script, it should wait for any commands it spawns to terminate before exiting, or the experiment might be terminated prematurely.

  2. Analyze the experiment by typing:
    % analyzer ktest.1.er

    The er_kernel utility forks a child process and pauses for a quiet period, and then the child process runs the specified load. When the load terminates, the er_kernel utility pauses again for a quiet period and then exits. The experiment shows the behavior of the Oracle Solaris kernel during the running of the load, and during the quiet periods before and after. You can specify the duration of the quiet period in seconds with the -q argument to the er_kernel command.

Profiling the Kernel and Load Together

If you have a single program that you wish to use as a load, and you are interested in seeing its profile in conjunction with the kernel profile:

  1. Collect both a kernel profile and a user profile by typing both the er_kernel command and the collect command:
    % er_kernel collect load
  2. Analyze the two profiles together by typing:
    % analyzer ktest.1.er test.1.er

    The data displayed by the Analyzer shows both the kernel profile from ktest.1.er and the user profile from test.1.er. The Timeline tab allows you to see correlations between the two experiments.


    Note - To use a script as the load and separately profile various parts of the script, prefix each command within the script that you want to profile with the collect command and its appropriate arguments.


Profiling the Kernel for Hardware Counter Overflows

The er_kernel utility can collect hardware counter overflow profiles for the kernel using the DTrace cpc provider, which is available only on systems running Oracle Solaris 11.

You can perform hardware counter overflow profiling of the kernel with the -h option of the er_kernel command, as you do with the collect command. However, dataspace profiling is not supported, so dataspace requests are ignored by er_kernel.

As with the collect command, if you use the -h option without explicitly specifying a -p option, clock-based profiling is turned off. To collect both hardware counter data and clock-based data, you must specify the -h option and the -p option.

To display hardware counters on a machine whose processor supports hardware counter overflow profiling, run the er_kernel -h command with no additional arguments.

If the overflow mechanism on the chip allows the kernel to tell which counter overflowed, you can profile as many counters as the chip provides; otherwise, you can specify only one counter. The er_kernel -h output specifies whether you can use more than one counter by displaying a message such as "specify HW counter profiling for up to 4 HW counters."
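
The interaction between -h, -p, and the counter limit can be sketched as a helper that assembles an er_kernel argument list. The following Python code is illustrative only: the function name is hypothetical, counter_a and counter_b are placeholders rather than real counter names, and the limit must be taken from the er_kernel -h output on your own system. Joining the definitions with commas follows the -h counter_definition_1...[,counter_definition_n] form.

```python
from typing import List, Optional, Sequence


def er_kernel_args(counters: Sequence[str],
                   clock_rate: Optional[str] = None,
                   max_counters: int = 1) -> List[str]:
    """Build an er_kernel argument list for hardware counter profiling.

    max_counters should come from the message printed by `er_kernel -h`
    on the target machine; 1 is the conservative default used here.
    """
    if not counters:
        raise ValueError("at least one counter definition is required")
    if len(counters) > max_counters:
        raise ValueError(f"this system allows at most {max_counters} counter(s)")
    args = ["er_kernel", "-h", ",".join(counters)]
    if clock_rate is not None:
        # Without an explicit -p, using -h turns clock-based profiling off.
        args += ["-p", clock_rate]
    return args


# counter_a and counter_b are placeholders, not real counter names.
print(er_kernel_args(["counter_a", "counter_b"], clock_rate="on", max_counters=4))
```

Passing clock_rate="on" adds the -p option, so that clock-based data is collected alongside the hardware counter data. Omitting it collects hardware counter data only.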

The system hardware counter mechanism can be used by multiple processes for user profiling, but cannot be used for kernel profiling if any user process, or the cputrack utility, or another er_kernel process is using the mechanism. If another process is using hardware counters, er_kernel will report "HW counter profiling is not supported on this system."

For more information about hardware counter profiling, see the sections "Hardware Counter Overflow Profiling Data" and "-h counter_definition_1...[,counter_definition_n]".

Also see the er_print man page for more information about hardware counter overflow profiling.

Profiling Kernel and User Processes

The er_kernel utility enables you to perform profiling of the kernel and applications. You can use the -F option to control whether or not application processes should be followed and have their data recorded.

When you use the -F on or -F all options, er_kernel records experiments on all application processes as well as the kernel. User processes that are detected while collecting an er_kernel experiment are followed, and a subexperiment is created for each of the followed processes.

Many subexperiments might not be recorded if you run er_kernel as a non-root user because unprivileged users usually cannot read anything about another user's processes.

Assuming sufficient privileges, the user process data is recorded only when the process is in user mode, and only the user call stack is recorded. The subexperiments for each followed process contain data for the kucycles metric. The subexperiments are named using the format _process-name_PID_process-pid.1.er. For example, the subexperiment for an sshd process might be named _sshd_PID_1264.1.er.
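
When post-processing an experiment, it can be handy to recover the process name and PID from a subexperiment name. The following Python sketch parses the naming format described above. The function name is hypothetical, and the pattern assumes that the name always ends in .1.er, as in the example.

```python
import re
from typing import Optional, Tuple

# _<process-name>_PID_<process-pid>.1.er
_SUBEXP = re.compile(r"_(?P<name>.+)_PID_(?P<pid>\d+)\.1\.er")


def parse_subexperiment(dirname: str) -> Optional[Tuple[str, int]]:
    """Return (process_name, pid) for a subexperiment name, or None if it does not match."""
    match = _SUBEXP.fullmatch(dirname)
    if match is None:
        return None
    return match.group("name"), int(match.group("pid"))


print(parse_subexperiment("_sshd_PID_1264.1.er"))
```

Names that do not follow the format, such as the founder experiment ktest.1.er, yield None.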

To follow only some user processes, you can specify a regular expression using -F =regexp to record experiments on processes whose name or PID matches the regular expression.

For example, er_kernel -F =synprog follows processes of a program called synprog.

See the regexp(5) man page for information about regular expressions.

The -F off option is set by default so that er_kernel does not perform user process profiling.


Note - The -F option of er_kernel is different from the -F option of collect. The collect -F command is used to follow only processes that are created by the target specified in the command line, while er_kernel -F is used to follow any or all processes currently running on the system.