Skip Navigation Links | |
Exit Print View | |
Oracle Solaris Studio 12.3: Performance Analyzer Oracle Solaris Studio 12.3 Information Library |
1. Overview of the Performance Analyzer
3. Collecting Performance Data
4. The Performance Analyzer Tool
5. The er_print Command Line Performance Analysis Tool
6. Understanding the Performance Analyzer and Its Data
Interpreting Performance Metrics
Hardware Counter Overflow Profiling
Dataspace Profiling and Memoryspace Profiling
Call Stacks and Program Execution
Single-Threaded Execution and Function Calls
Function Calls Between Shared Objects
Overview of Java Technology-Based Software Execution
Java Call Stacks and Machine Call Stacks
Clock-based Profiling and Hardware Counter Overflow Profiling
User View Mode of Java Profiling Data
Expert View Mode of Java Profiling Data
Machine View Mode of Java Profiling Data
Overview of OpenMP Software Execution
User View Mode of OpenMP Profile Data
Expert View Mode of OpenMP Profiling Data
Machine View Mode of OpenMP Profiling Data
Mapping Addresses to Program Structure
Static Functions From Stripped Shared Libraries
Fortran Alternate Entry Points
Compiler-Generated Body Functions
Dynamically Compiled Functions
The <no Java callstack recorded> Function
The <Truncated-stack> Function
Functions Related to Hardware Counter Overflow Profiling
Mapping Performance Data to Index Objects
Mapping Data Addresses to Program Data Objects
The <Unknown> Data Object and Its Elements
Mapping Performance Data to Memory Objects
The output from a data collection run is an experiment, which is stored as a directory with various internal files and subdirectories in the file system.
All experiments must have three files:
A log file (log.xml), an XML file that contains information about what data was collected, the versions of various components, a record of various events during the life of the target, and the word size of the target.
A map file (map.xml), an XML file that records the time-dependent information about what load objects are loaded into the address space of the target, and the times at which they are loaded or unloaded.
An overview file; a binary file containing usage information recorded at every sample point in the experiment.
In addition, experiments have binary data files representing the profile events in the life of the process. Each data file has a series of events, as described below under Interpreting Performance Metrics. Separate files are used for each type of data, but each file is shared by all threads in the target.
For clock-based profiling, or hardware counter overflow profiling, the data is written in a signal handler invoked by the clock tick or counter overflow. For synchronization tracing, heap tracing, MPI tracing, or OpenMP tracing, data is written from libcollector routines that are interposed by the LD_PRELOAD environment variable on the normal user-invoked routines. Each such interposition routine partially fills in a data record, then invokes the normal user-invoked routine, and fills in the rest of the data record when that routine returns, and writes the record to the data file.
All data files are memory-mapped and written in blocks. The records are filled in such a way as to always have a valid record structure, so that experiments can be read as they are being written. The buffer management strategy is designed to minimize contention and serialization between threads.
An experiment can optionally contain an ASCII file with the filename of notes. This file is automatically created when using the -C comment argument to the collect command. You can create or edit the file manually after the experiment has been created. The contents of the file are prepended to the experiment header.
Each experiment has an archives directory that contains binary files describing each load object referenced in the map.xml file. These files are produced by the er_archive utility, which runs at the end of data collection. If the process terminates abnormally, the er_archive utility may not be invoked, in which case, the archive files are written by the er_print utility or the Analyzer when first invoked on the experiment.
Subexperiments are created when multiple processes are profiled, such as when you follow descendent processes, collect an MPI experiment, or profile the kernel with user processes.
Descendant processes write their experiments into subdirectories within the founder-process’ experiment directory. These new subexperiments are named to indicate their lineage as follows:
An underscore is appended to the creator's experiment name.
One of the following code letters is added: f for fork, x for exec, and c for other descendants.
A number to indicate the index of the fork or exec is added after the code letter. This number is applied whether the process was started successfully or not.
The experiment suffix, .er is appended to complete the experiment name.
For example, if the experiment name for the founder process is test.1.er, the experiment for the child process created by its third fork is test.1.er/_f3.er. If that child process executes a new image, the corresponding experiment name is test.1.er/_f3_x1.er. Descendant experiments consist of the same files as the parent experiment, but they do not have descendant experiments (all descendants are represented by subdirectories in the founder experiment), and they do not have archive subdirectories (all archiving is done into the founder experiment).
Data for MPI programs are collected by default into test.1.er, and all the data from the MPI processes are collected into subexperiments, one per rank. The Collector uses the MPI rank to construct a subexperiment name with the form M_rm.er, where m is the MPI rank. For example, MPI rank 1 would have its experiment data recorded in the test.1.er/M_r1.er directory.
Experiments on the kernel by default are named ktest.1.er rather than test.1.er. When data is also collected on user processes, the kernel experiment contains subexperiments for each user process being followed. The kernel subexperiments are named using the format _process-name_PID_process-id.1.er. For example an experiment run on a sshd process running under process ID 1264 would be named ktest.1.er/_sshd_PID_1264.1.er.
An experiment where the target creates dynamic functions has additional records in the map.xml file describing those functions, and an additional file, dyntext, containing a copy of the actual instructions of the dynamic functions. The copy is needed to produce annotated disassembly of dynamic functions.
A Java experiment has additional records in the map.xml file, both for dynamic functions created by the JVM software for its internal purposes, and for dynamically-compiled (HotSpot) versions of the target Java methods.
In addition, a Java experiment has a JAVA_CLASSES file, containing information about all of the user’s Java classes invoked.
Java tracing data is recorded using a JVMTI agent, which is part of libcollector.so. The agent receives events that are mapped into the recorded trace events. The agent also receives events for class loading and HotSpot compilation, that are used to write the JAVA_CLASSES file, and the Java-compiled method records in the map.xml file.
You can record an experiment on a user-mode target in three different ways:
With the collect command
With dbx creating a process
With dbx creating an experiment from a running process
The Performance Collect window in the Analyzer GUI runs a collect experiment.
When you use the collect command to record an experiment, the collect utility creates the experiment directory and sets the LD_PRELOAD environment variable to ensure that libcollector.so and other libcollector modules are preloaded into the target’s address space. The collect utility then sets environment variables to inform libcollector.so about the experiment name, and data collection options, and executes the target on top of itself.
libcollector.so and associated modules are responsible for writing all experiment files.
When dbx is used to launch a process with data collection enabled, dbx also creates the experiment directory and ensures preloading of libcollector.so. Then dbx stops the process at a breakpoint before its first instruction, and calls an initialization routine in libcollector.so to start the data collection.
Java experiments can not be collected by dbx, since dbx uses a Java Virtual Machine Debug Interface (JVMDI) agent for debugging, and that agent can not coexist with the Java Virtual Machine Tools Interface (JVMTI) agent needed for data collection.
When dbx is used to start an experiment on a running process, it creates the experiment directory, but cannot use the LD_PRELOAD environment variable. dbx makes an interactive function call into the target to open libcollector.so, and then calls the libcollector.so initialization routine, just as it does when creating the process. Data is written by libcollector.so and its modules just as in a collect experiment.
Since libcollector.so was not in the target address space when the process started, any data collection that depends on interposition on user-callable functions (synchronization tracing, heap tracing, MPI tracing) might not work. In general, the symbols have already been resolved to the underlying functions, so the interposition can not happen. Furthermore, the following of descendant processes also depends on interposition, and does not work properly for experiments created by dbx on a running process.
If you have explicitly preloaded libcollector.so before starting the process with dbx, or before using dbx to attach to the running process, you can collect tracing data.