Oracle Solaris Studio 12.2: Performance Analyzer

Collecting Data Using the collect Command

To run the Collector from the command line using the collect command, type the following.

% collect collect-options program program-arguments

Here, collect-options are the collect command options, program is the name of the program you want to collect data on, and program-arguments are the program's arguments. The target program is typically a binary executable. However, if you set the environment variable SP_COLLECTOR_SKIP_CHECKEXEC, you can specify a script as the target.

If no collect-options are given, the default is to turn on clock-based profiling with a profiling interval of approximately 10 milliseconds.

To obtain a list of options and a list of the names of any hardware counters that are available for profiling, type the collect command with no arguments.

% collect

For a description of the list of hardware counters, see Hardware Counter Overflow Profiling Data. See also Limitations on Hardware Counter Overflow Profiling.

Data Collection Options

These options control the types of data that are collected. See What Data the Collector Collects for a description of the data types.

If you do not specify data collection options, the default is -p on, which enables clock-based profiling with the default profiling interval of approximately 10 milliseconds. The default is turned off by the -h option but not by any of the other data collection options.

If you explicitly disable clock-based profiling, and do not enable tracing or hardware counter overflow profiling, the collect command prints a warning message, and collects global data only.

-p option

Collect clock-based profiling data. The allowed values of option are:

Collecting clock-based profiling data is the default action of the collect command.

-h counter_definition_1...[,counter_definition_n]

Collect hardware counter overflow profiling data. The number of counter definitions is processor-dependent.

This option is available on systems running the Linux operating system if you have installed the perfctr patch, which you can download from http://user.it.uu.se/~mikpe/linux/perfctr/2.6/ . Instructions for installation are contained within the tar file. The user-level libperfctr.so libraries are searched for using the value of the LD_LIBRARY_PATH environment variable, then in /usr/local/lib, /usr/lib, and /lib for the 32-bit versions, or /usr/local/lib64, /usr/lib64, and /lib64 for the 64-bit versions.
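The search order described above can be sketched as a small Python helper. This is only an illustration of the documented order, not the Collector's actual lookup code; the function name perfctr_search_dirs is hypothetical, and skipping empty LD_LIBRARY_PATH entries is an assumption.

```python
def perfctr_search_dirs(bits, ld_library_path=""):
    """List the directories searched for libperfctr.so, in order.

    bits is 32 or 64; ld_library_path is the colon-separated value of
    the LD_LIBRARY_PATH environment variable (empty entries are skipped,
    which is a simplifying assumption).
    """
    dirs = [d for d in ld_library_path.split(":") if d]
    if bits == 32:
        dirs += ["/usr/local/lib", "/usr/lib", "/lib"]
    elif bits == 64:
        dirs += ["/usr/local/lib64", "/usr/lib64", "/lib64"]
    else:
        raise ValueError("bits must be 32 or 64")
    return dirs


print(perfctr_search_dirs(64, "/opt/perfctr/lib"))
```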

To obtain a list of available counters, type collect with no arguments in a terminal window. A description of the counter list is given in the section Hardware Counter Lists. On most systems, even if a counter is not listed, you can still specify it by a numeric value, either in hexadecimal or decimal.

A counter definition can take one of the following forms, depending on whether the processor supports attributes for hardware counters.

[+]counter_name[/register_number][,interval]

[+]counter_name[~attribute_1=value_1]...[~attribute_n=value_n][/register_number][,interval]

The processor-specific counter_name can be one of the following:

If you specify more than one counter, they must use different registers. If they do not use different registers, the collect command prints an error message and exits.

If the hardware counter counts events that relate to memory access, you can prefix the counter name with a + sign to turn on searching for the true program counter address (PC) of the instruction that caused the counter overflow. This backtracking works on SPARC processors, and only with counters of type load, store, or load-store. If the search is successful, the virtual PC, the physical PC, and the effective address that was referenced are stored in the event data packet.

On some processors, attribute options can be associated with a hardware counter. If a processor supports attribute options, then running the collect command with no arguments lists the counter definitions including the attribute names. You can specify attribute values in decimal or hexadecimal format.

The interval (overflow value) is the number of events or cycles counted at which the hardware counter overflows and the overflow event is recorded. The interval can be set to one of the following:

The default is the normal threshold, which is predefined for each counter and which appears in the counter list. See also Limitations on Hardware Counter Overflow Profiling.

If you use the -h option without explicitly specifying a -p option, clock-based profiling is turned off. To collect both hardware counter data and clock-based data, you must specify both a -h option and a -p option.
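The interaction between the -h and -p options can be modeled with a short Python sketch. It is not part of the collect command; the function name clock_profiling_enabled and the (flag, value) representation are hypothetical, and the sketch assumes that at most one -p option is given.

```python
def clock_profiling_enabled(options):
    """Return True if clock-based profiling would be on.

    options is a list of (flag, value) pairs, for example
    [("-h", "counter_name"), ("-p", "on")].
    Assumption: at most one -p option appears in the list.
    """
    p_values = [value for flag, value in options if flag == "-p"]
    if p_values:
        # An explicit -p decides the matter; "off" disables profiling.
        return p_values[0] != "off"
    # No explicit -p: the default is on, unless -h turned it off.
    return not any(flag == "-h" for flag, _ in options)


# -h alone turns clock-based profiling off; adding -p keeps both.
print(clock_profiling_enabled([("-h", "counter_name")]))
print(clock_profiling_enabled([("-h", "counter_name"), ("-p", "on")]))
```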

-s option

Collect synchronization wait tracing data. The allowed values of option are:

Synchronization wait tracing data cannot be recorded for Java programs; specifying it is treated as an error.

On Solaris, the following functions are traced:

mutex_lock()
rw_rdlock()
rw_wrlock()
cond_wait()
cond_timedwait()
cond_reltimedwait()
thr_join()
sema_wait()
pthread_mutex_lock()
pthread_rwlock_rdlock()
pthread_rwlock_wrlock()
pthread_cond_wait()
pthread_cond_timedwait()
pthread_cond_reltimedwait_np()
pthread_join()
sem_wait()

On Linux, the following functions are traced:

pthread_mutex_lock()
pthread_cond_wait()
pthread_cond_timedwait()
pthread_join()
sem_wait()

-H option

Collect heap tracing data. The allowed values of option are:

Heap tracing is turned off by default. Heap tracing is not supported for Java programs; specifying it is treated as an error.

-M option

Specify collection of an MPI experiment. The target of the collect command must be the mpirun command, and its options must be separated from the target programs to be run by the mpirun command by a -- option. (Always use the -- option with the mpirun command so that you can collect an experiment by prepending the collect command and its option to the mpirun command line.) The experiment is named as usual and is referred to as the founder experiment; its directory contains subexperiments for each of the MPI processes, named by rank.

The allowed values of option are:

By default, collection of an MPI experiment is turned off. When collection of an MPI experiment is turned on, the default setting for the -m option is changed to on.

The supported versions of MPI are printed when you type the collect command with no options, or if you specify an unrecognized version with the -M option.
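As a rough illustration of prepending the collect command to an mpirun command line, the following Python sketch builds the argument list and checks for the -- separator. The helper name collect_mpi_command, the target ./a.out, and the process count are hypothetical, and MPI-version is left as a placeholder for one of the versions that collect reports as supported.

```python
def collect_mpi_command(mpirun_argv, mpi_version, collect_options=()):
    """Prepend collect and its options to an existing mpirun command line.

    mpirun_argv must contain "--" between the mpirun options and the
    target program, as the -M option requires.
    """
    if "--" not in mpirun_argv:
        raise ValueError("separate the mpirun options from the target with --")
    return ["collect", "-M", mpi_version, *collect_options, *mpirun_argv]


# Hypothetical job: four processes running ./a.out.
argv = collect_mpi_command(["mpirun", "-np", "4", "--", "./a.out"], "MPI-version")
print(" ".join(argv))
```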

-m option

Collect MPI tracing data. The allowed values of option are:

MPI tracing is turned off by default unless the -M option is enabled, in which case MPI tracing is turned on by default. Normally MPI experiments are collected with the -M option, and no user control of MPI tracing is needed. If you want to collect an MPI experiment, but not collect MPI tracing data, use the explicit options -M MPI-version -m off.

See MPI Tracing Data for more information about the MPI functions whose calls are traced and the metrics that are computed from the tracing data.

-S option

Record sample packets periodically. The allowed values of option are:

By default, periodic sampling at 1 second intervals is enabled.

-c option

Record count data, for Solaris systems only.

The allowed values of option are:

By default, collection of count data is turned off. Count data cannot be collected with any other type of data.

-I directory

Specify a directory for bit instrumentation. This option is available only on Solaris systems, and is meaningful only when the -c option is also specified.

-N library_name

Specify a library to be excluded from bit instrumentation, whether the library is linked into the executable or loaded with dlopen(). This option is available only on Solaris systems, and is meaningful only when the -c option is also specified. You can specify multiple -N options.

-r option

Collect data for data race detection or deadlock detection for the Thread Analyzer. The allowed values are:

For more information about the collect -r command and Thread Analyzer, see the Oracle Solaris Studio 12.2: Thread Analyzer User’s Guide and the tha(1) man page.

Experiment Control Options

These options control aspects of how the experiment data is collected.

-F option

Control whether or not descendant processes should have their data recorded. The allowed values of option are:

The -F on option is set by default so that the Collector follows processes created by calls to the functions fork(2), fork1(2), fork(3F), vfork(2), and exec(2) and its variants. The call to vfork is replaced internally by a call to fork1.

For MPI experiments, descendants are also followed by default.

If you specify the -F all option, the Collector follows all descendant processes including those created by calls to system(3C), system(3F), sh(3F), posix_spawn(3p), posix_spawnp(3p), and popen(3C), and similar functions, and their associated descendant processes.

If you specify the -F '=regexp' option, the Collector follows all descendant processes. The Collector creates a subexperiment when the descendant name or subexperiment name matches the specified regular expression. See the regexp(5) man page for information about regular expressions.

When you collect data on descendant processes, the Collector opens a new experiment for each descendant process inside the founder experiment. These new experiments are named by adding an underscore, a letter, and a number to the experiment suffix, as follows:

For example, if the experiment name for the initial process is test.1.er, the experiment for the child process created by its third fork is test.1.er/_f3.er. If that child process execs a new image, the corresponding experiment name is test.1.er/_f3_x1.er. If that child creates another process using a popen call, the experiment name is test.1.er/_f3_x1_c1.er.
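The naming pattern in this example can be reproduced with a few lines of Python. This is a sketch based only on the names shown above; the function name descendant_experiment and the lineage representation are hypothetical, and the letters f, x, and c are taken from the fork, exec, and popen cases in the example.

```python
def descendant_experiment(founder, lineage):
    """Build the path of a descendant experiment inside the founder.

    lineage is a list of (letter, number) pairs describing how the
    process was created, oldest ancestor first, for example
    [("f", 3), ("x", 1)] for the exec of the third fork.
    """
    suffix = "".join(f"_{letter}{number}" for letter, number in lineage)
    return f"{founder}/{suffix}.er"


print(descendant_experiment("test.1.er", [("f", 3)]))
print(descendant_experiment("test.1.er", [("f", 3), ("x", 1), ("c", 1)]))
```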

The Analyzer and the er_print utility automatically read experiments for descendant processes when the founder experiment is read, and show descendants in the data display.

To select the data for display from the command line, specify the path name explicitly to either er_print or analyzer. The specified path must include the founder experiment name, and descendant experiment name inside the founder directory.

For example, here’s what you specify to see the data for the third fork of the test.1.er experiment:

er_print test.1.er/_f3.er

analyzer test.1.er/_f3.er

Alternatively, you can prepare an experiment group file with the explicit names of the descendant experiments in which you are interested.

To examine descendant processes in the Analyzer, load the founder experiment and select Filter Data from the View menu. A list of experiments is displayed with only the founder experiment checked. Uncheck it and check the descendant experiment of interest.


Note - If the founder process exits while descendant processes are being followed, collection of data from descendants that are still running will continue. The founder experiment directory continues to grow accordingly.


You can also collect data on scripts and follow descendant processes of scripts. See Collecting Data From Scripts for more information.

-j option

Enable Java profiling when the target program is a JVM. The allowed values of option are:

The -j option is not needed if you want to collect data on a .class file or a .jar file, provided that the path to the java executable is in either the JDK_HOME environment variable or the JAVA_PATH environment variable. You can then specify the target program on the collect command line as the .class file or the .jar file, with or without the extension.

If you cannot define the path to the java executable in the JDK_HOME or JAVA_PATH environment variables, or if you want to disable the recognition of methods compiled by the Java HotSpot virtual machine, you can use the -j option. If you use this option, the program specified on the collect command line must be a Java virtual machine whose version is not earlier than JDK 6, Update 18. The collect command verifies that program is a JVM, and is an ELF executable; if it is not, the collect command prints an error message.

If you want to collect data using the 64-bit JVM, you must not use the -d64 option to the java command for a 32-bit JVM. If you do so, no data is collected. Instead you must specify the path to the 64-bit JVM either in the program argument to the collect command or in the JDK_HOME or JAVA_PATH environment variable.

-J java_argument

Specify additional arguments to be passed to the JVM used for profiling. If you specify the -J option, but do not specify Java profiling, an error is generated, and no experiment is run. The java_argument must be enclosed in quotation marks if it contains more than one argument. It must consist of a set of tokens separated by blanks or tabs. Each token is passed as a separate argument to the JVM. Most arguments to the JVM must begin with a “-” character.
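The tokenization rule for java_argument can be sketched in Python as follows. This only mirrors the description above (tokens separated by blanks or tabs, each passed as a separate argument); the function name split_java_arguments is hypothetical, and the JVM options in the usage line are just sample values.

```python
import re


def split_java_arguments(java_argument):
    """Split a quoted -J value into the separate arguments passed to the JVM.

    Tokens are separated by one or more blanks or tabs; empty tokens
    from leading or trailing separators are dropped.
    """
    return [token for token in re.split(r"[ \t]+", java_argument) if token]


print(split_java_arguments("-Xmx100m\t-Xms50m  -verbose"))
```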

-l signal

Record a sample packet when the signal named signal is delivered to the process.

You can specify the signal by the full signal name, by the signal name without the initial letters SIG, or by the signal number. Do not use a signal that is used by the program or that would terminate execution. Suggested signals are SIGUSR1 and SIGUSR2. SIGPROF can be used, even when clock-profiling is specified. Signals can be delivered to a process by the kill command.

If you use both the -l and the -y options, you must use different signals for each option.

If you use this option and your program has its own signal handler, you should make sure that the signal that you specify with -l is passed on to the Collector’s signal handler, and is not intercepted or ignored.

See the signal(3HEAD) man page for more information about signals.
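The three accepted spellings of a signal, and the rule that -l and -y must use different signals, can be illustrated with Python's standard signal module on a POSIX system. The function names resolve_signal and check_sample_and_pause_signals are hypothetical, and the sketch says nothing about how collect itself parses the argument.

```python
import signal


def resolve_signal(spec):
    """Map 'SIGUSR1', 'USR1', or a signal number string to one signal."""
    if spec.isdigit():
        return signal.Signals(int(spec))
    name = spec if spec.startswith("SIG") else "SIG" + spec
    return signal.Signals[name]


def check_sample_and_pause_signals(l_spec, y_spec):
    """Raise ValueError if -l and -y would name the same signal."""
    if resolve_signal(l_spec) == resolve_signal(y_spec):
        raise ValueError("-l and -y must use different signals")


# The full name and the name without SIG resolve to the same signal.
print(resolve_signal("SIGUSR1") == resolve_signal("USR1"))
check_sample_and_pause_signals("USR1", "SIGUSR2")
```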

-t duration

Specify a time range for data collection.

The duration can be specified as a single number, with an optional m or s suffix, to indicate the time in minutes or seconds at which the experiment should be terminated. By default, the duration is in seconds. The duration can also be specified as two such numbers separated by a hyphen, which causes data collection to pause until the first time elapses, and at that time data collection begins. When the second time is reached, data collection terminates. If the second number is a zero, data will be collected after the initial pause until the end of the program's run. Even if the experiment is terminated, the target process is allowed to run to completion.
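The duration syntax can be made concrete with a small Python parser. It is a sketch of the rules stated above, not the parser used by collect; the function names are hypothetical, and the sketch assumes whole-number values.

```python
def _to_seconds(token):
    """Convert one number with an optional m or s suffix to seconds."""
    if token.endswith("m"):
        return int(token[:-1]) * 60
    if token.endswith("s"):
        return int(token[:-1])
    return int(token)  # no suffix: seconds by default


def parse_duration(spec):
    """Return (start, end) in seconds for a -t duration value.

    start is when data collection begins; end is when the experiment
    terminates, or None if collection runs until the program ends.
    """
    if "-" in spec:
        first, second = spec.split("-", 1)
        end = _to_seconds(second)
        return _to_seconds(first), (None if end == 0 else end)
    return 0, _to_seconds(spec)


print(parse_duration("5m"))     # a single number with a minutes suffix
print(parse_duration("10-60"))  # pause for 10 seconds, stop at 60
```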

-x

Leave the target process stopped on exit from the exec system call in order to allow a debugger to attach to it. If you attach dbx to the process, use the dbx commands ignore PROF and ignore EMT to ensure that collection signals are passed on to the collect command.

-y signal[,r]

Control recording of data with the signal named signal. Whenever the signal is delivered to the process, it switches between the paused state, in which no data is recorded, and the recording state, in which data is recorded. Sample points are always recorded, regardless of the state of the switch.

The signal can be specified by the full signal name, by the signal name without the initial letters SIG, or by the signal number. Do not use a signal that is used by the program or that would terminate execution. Suggested signals are SIGUSR1 and SIGUSR2. SIGPROF can be used, even when clock-profiling is specified. Signals can be delivered to a process by the kill command.

If you use both the -l and the -y options, you must use different signals for each option.

When the -y option is used, the Collector is started in the recording state if the optional r argument is given, otherwise it is started in the paused state. If the -y option is not used, the Collector is started in the recording state.

If you use this option and your program has its own signal handler, make sure that the signal that you specify with -y is passed on to the Collector’s signal handler, and is not intercepted or ignored.

See the signal(3HEAD) man page for more information about signals.
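The switching behavior of the -y option amounts to a two-state toggle, which the following Python sketch simulates. The function name recording_states is hypothetical; True stands for the recording state and False for the paused state.

```python
def recording_states(started_with_r, signal_count):
    """Return the sequence of Collector states under the -y option.

    started_with_r is True if the optional r argument was given, so
    the Collector starts in the recording state; otherwise it starts
    paused. Each delivered signal flips the state.
    """
    state = started_with_r
    states = [state]
    for _ in range(signal_count):
        state = not state
        states.append(state)
    return states


# Without r: paused, then recording after the first signal, and so on.
print(recording_states(False, 3))
```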

Output Options

These options control aspects of the experiment produced by the Collector.

-o experiment_name

Use experiment_name as the name of the experiment to be recorded. The experiment_name string must end in the string “.er”; if not, the collect utility prints an error message and exits.

If you do not specify the -o option, the collect command gives the experiment a name of the form stem.n.er, where stem is a string, and n is a number. If you have specified a group name with the -g option, stem is set to the group name without the .erg suffix. If you have not specified a group name, stem is set to the string test.

If you invoke the collect command from one of the commands used to run MPI jobs, for example, mpirun, but without the -M MPI-version option and the -o option, the value of n used in the name is taken from the environment variable that defines the MPI rank of that process. Otherwise, n is set to one greater than the highest integer currently in use.

If the name is not specified in the form stem.n.er, and the given name is in use, an error message is displayed and the experiment is not run. If the name is of the form stem.n.er and the name supplied is in use, the experiment is recorded under a name corresponding to one greater than the highest value of n that is currently in use. A warning is displayed if the name is changed.
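The default numbering rule for stem.n.er names can be sketched in Python as follows. The function name next_experiment_name and the existing_names list are hypothetical; the sketch does not cover the MPI rank case, and it assumes that numbering starts at 1 when no matching experiment exists.

```python
import re


def next_experiment_name(existing_names, stem="test"):
    """Pick the next stem.n.er name, one greater than the highest n in use.

    existing_names is a list of names already present in the target
    directory. Assumption: n starts at 1 if no name matches.
    """
    pattern = re.compile(re.escape(stem) + r"\.(\d+)\.er")
    used = [int(m.group(1)) for name in existing_names
            if (m := pattern.fullmatch(name))]
    return f"{stem}.{max(used, default=0) + 1}.er"


print(next_experiment_name(["test.1.er", "test.3.er", "other.7.er"]))
```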

-d directory-name

Place the experiment in directory directory-name. This option only applies to individual experiments and not to experiment groups. If the directory does not exist, the collect utility prints an error message and exits. If a group is specified with the -g option, the group file is also written to directory-name.

For the lightest-weight data collection, it is best to record data to a local file, using the -d option to specify a directory in which to put the data. However, for MPI experiments on a cluster, the founder experiment must be available at the same path for all processes to have all data recorded into the founder experiment.

Experiments written to long-latency file systems are especially problematic, and might progress very slowly, particularly if Sample data is collected (the -S on option, which is the default). If you must record over a long-latency connection, disable Sample data.

-g group-name

Make the experiment part of experiment group group-name. If group-name does not end in .erg, the collect utility prints an error message and exits. If the group exists, the experiment is added to it. If group-name is not an absolute path, the experiment group is placed in the directory directory-name if a directory has been specified with -d, otherwise it is placed in the current directory.
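The placement rule for the group file can be expressed as a short Python function. This is a sketch of the rule as described, assuming POSIX path conventions; the function name group_file_path and the sample paths are hypothetical.

```python
import os.path


def group_file_path(group_name, directory=None):
    """Return where the experiment group file would be placed.

    directory is the value given with -d, or None if -d was not used.
    """
    if not group_name.endswith(".erg"):
        raise ValueError("group name must end in .erg")
    if os.path.isabs(group_name):
        return group_name  # an absolute path is used as given
    # Relative name: use the -d directory, else the current directory.
    base = directory if directory is not None else os.curdir
    return os.path.join(base, group_name)


print(group_file_path("runs.erg", "/tmp/exp"))
```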

-A option

Control whether or not load objects used by the target process should be archived or copied into the recorded experiment. The allowed values of option are:

If you expect to copy experiments to a machine other than the one on which they were recorded, or to read the experiments from a different machine, specify -A copy. Using this option does not copy any source files or object (.o) files into the experiment. Ensure that those files are accessible and unchanged on the machine on which you are examining the experiment.

-L size

Limit the amount of profiling data recorded to size megabytes. The limit applies to the sum of the amounts of clock-based profiling data, hardware counter overflow profiling data, and synchronization wait tracing data, but not to sample points. The limit is only approximate, and can be exceeded.

When the limit is reached, no more profiling data is recorded but the experiment remains open until the target process terminates. If periodic sampling is enabled, sample points continue to be written.

To impose a limit of approximately 2 Gbytes, for example, specify -L 2000. The size specified must be greater than zero.

By default, there is no limit on the amount of data recorded.

-O file

Append all output from collect itself to the named file, but do not redirect the output from the spawned target. If file is set to /dev/null, suppress all output from collect, including any error messages.

Other Options

These collect command options are used for miscellaneous purposes.

-P process_id

Write a script for dbx to attach to the process with the given process_id, collect data from it, and then invoke dbx on the script. You can specify only profiling data, not tracing data, and timed runs (-t option) are not supported.

-C comment

Put the comment into the notes file for the experiment. You can supply up to ten -C options. The contents of the notes file are prepended to the experiment header.

-n

Do not run the target but print the details of the experiment that would be generated if the target were run. This option is a dry run option.

-R

Display the text version of the Performance Analyzer Readme in the terminal window. If the readme is not found, a warning is printed. No further arguments are examined, and no further processing is done.

-V

Print the current version of the collect command. No further arguments are examined, and no further processing is done.

-v

Print the current version of the collect command and detailed information about the experiment being run.

Collecting Data From a Running Process Using the collect Utility

In the Solaris OS only, the -P pid option can be used with the collect utility to attach to the process with the specified PID, and collect data from the process. The other options to the collect command are translated into a script for dbx, which is then invoked to collect the data. Only clock-based profile data (-p option) and hardware counter overflow profile data (-h option) can be collected. Tracing data is not supported.

If you use the -h option without explicitly specifying a -p option, clock-based profiling is turned off. To collect both hardware counter data and clock-based data, you must specify both a -h option and a -p option.

To Collect Data From a Running Process Using the collect Utility

  1. Determine the program’s process ID (PID).

    If you started the program from the command line and put it in the background, its PID will be printed to standard output by the shell. Otherwise you can determine the program’s PID by typing the following.

    % ps -ef | grep program-name

  2. Use the collect command to enable data collection on the process, and set any optional parameters.

    % collect -P pid collect-options

    The collect options are described in Data Collection Options. For information about clock-based profiling, see -p option. For information about hardware counter overflow profiling, see -h option.
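The restrictions on attaching with -P can be summarized in a small Python sketch that builds the command line and rejects options this section describes as unsupported. The function name attach_command and the option table are hypothetical, and the sketch only covers the tracing and timed-run options named in this document.

```python
# Options this document describes as unsupported together with -P.
UNSUPPORTED_WITH_ATTACH = {
    "-s": "synchronization wait tracing",
    "-H": "heap tracing",
    "-m": "MPI tracing",
    "-t": "timed runs",
}


def attach_command(pid, collect_options=()):
    """Build a collect -P command line for a running process."""
    for option in collect_options:
        if option in UNSUPPORTED_WITH_ATTACH:
            reason = UNSUPPORTED_WITH_ATTACH[option]
            raise ValueError(f"{option} ({reason}) is not supported with -P")
    return ["collect", "-P", str(pid), *collect_options]


# Hypothetical PID 1234 with clock-based profiling requested explicitly.
print(" ".join(attach_command(1234, ["-p", "on"])))
```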