Skip Navigation Links | |
Exit Print View | |
Oracle Solaris Studio 12.3: Performance Analyzer Oracle Solaris Studio 12.3 Information Library |
1. Overview of the Performance Analyzer
3. Collecting Performance Data
Preparing Your Program for Data Collection and Analysis
Using Dynamically Allocated Memory
Program Control of Data Collection
The C, C++, Fortran, and Java API Functions
Limitations on Data Collection
Limitations on Clock-Based Profiling
Runtime Distortion and Dilation with Clock-profiling
Limitations on Collection of Tracing Data
Runtime Distortion and Dilation with Tracing
Limitations on Hardware Counter Overflow Profiling
Runtime Distortion and Dilation With Hardware Counter Overflow Profiling
Limitations on Data Collection for Descendant Processes
Limitations on OpenMP Profiling
Experiments for Descendant Processes
Experiments on the Kernel and User Processes
Estimating Storage Requirements
Collecting Data Using the collect Command
-h counter_definition_1...[,counter_definition_n]
Collecting Data From a Running Process Using the collect Utility
To Collect Data From a Running Process Using the collect Utility
Collecting Data Using the dbx collector Subcommands
To Run the Collector From dbx:
Experiment Control Subcommands
Collecting Data From a Running Process With dbx on Oracle Solaris Platforms
To Collect Data From a Running Process That is Not Under the Control of dbx
Collecting Tracing Data From a Running Program
Collecting Data From MPI Programs
Running the collect Command for MPI
4. The Performance Analyzer Tool
5. The er_print Command Line Performance Analysis Tool
6. Understanding the Performance Analyzer and Its Data
You can collect and analyze data for a program compiled with almost any compiler option, but some choices affect what you can collect or what you can see in the Performance Analyzer. The issues that you should take into account when you compile and link your program are described in the following subsections.
To see source code in annotated Source and Disassembly analyses, and source lines in the Lines analyses, you must compile the source files of interest with the -g compiler option (-g0 for C++ to enable front-end inlining) to generate debug symbol information. The format of the debug symbol information can be either DWARF2 or stabs, as specified by -xdebugformat=(dwarf|stabs) . The default debug format is dwarf.
To prepare compilation objects with debug information that allows dataspace profiles, currently only for SPARC processors, compile by specifying -xhwcprof and any level of optimization. (Currently, this functionality is not available without optimization.) To see program data objects in Data Objects analyses, also add -g (or -g0 for C++) to obtain full symbolic information.
For memoryspace profiling of precise hardware counters on some SPARC processors, you do not need to compile with -xhwcprof and optimization. See Dataspace Profiling and Memoryspace Profiling for more information.
Executables and libraries built with DWARF format debugging symbols automatically include a copy of each constituent object file’s debugging symbols. Executables and libraries built with stabs format debugging symbols also include a copy of each constituent object file’s debugging symbols if they are linked with the -xs option, which leaves stabs symbols in the various object files as well as the executable. The inclusion of this information is particularly useful if you need to move or remove the object files. With all of the debugging symbols in the executables and libraries themselves, it is easier to move the experiment and the program-related files to a new location.
When you compile your program, you must not disable dynamic linking, which is done with the -dn and -Bstatic compiler options. If you try to collect data for a program that is entirely statically linked, the Collector prints an error message and does not collect data. The error occurs because the collector library, among others, is dynamically loaded when you run the Collector.
Do not statically link any of the system libraries. If you do, you might not be able to collect any kind of tracing data. Also, do not link to the Collector library, libcollector.so.
Normally the collect command causes data to be collected for all shared objects in the address space of the target, whether they are on the initial library list, or are explicitly loaded with dlopen(). However, under some circumstances some shared objects are not profiled:
When the target program is invoked with lazy loading. In such cases, the library is not loaded at startup time, and is not loaded by explicitly calling dlopen(), so shared object is not included in the experiment, and all PCs from it are mapped to the <Unknown> function. The workaround is to set the LD_BIND_NOW environment variable, which forces the library to be loaded at startup time.
When the executable was built with the -B option. In this case, the object is dynamically loaded by a call specifically to the dynamic linker entry point of dlopen()(), and the libcollector interposition is bypassed. The shared object name is not included in the experiment, and all PCs from it are mapped to the <Unknown>() function. The workaround is to not use the -B option.
If you compile your program with optimization turned on at some level, the compiler can rearrange the order of execution so that it does not strictly follow the sequence of lines in your program. The Performance Analyzer can analyze experiments collected on optimized code, but the data it presents at the disassembly level is often difficult to relate to the original source code lines. In addition, the call sequence can appear to be different from what you expect if the compiler performs tail-call optimizations. See Tail-Call Optimization for more information.
No special action is required for compiling Java programs with the javac command.