JavaScript is required to for searching.
Skip Navigation Links
Exit Print View
Oracle Solaris Studio 12.3: Performance Analyzer     Oracle Solaris Studio 12.3 Information Library
search filter icon
search icon

Document Information

Preface

1.  Overview of the Performance Analyzer

2.  Performance Data

3.  Collecting Performance Data

4.  The Performance Analyzer Tool

5.  The er_print Command Line Performance Analysis Tool

6.  Understanding the Performance Analyzer and Its Data

How Data Collection Works

Experiment Format

The archives Directory

Subexperiments

Dynamic Functions

Java Experiments

Recording Experiments

collect Experiments

dbx Experiments That Create a Process

dbx Experiments on a Running Process

Interpreting Performance Metrics

Clock-Based Profiling

Accuracy of Timing Metrics

Comparisons of Timing Metrics

Hardware Counter Overflow Profiling

Dataspace Profiling and Memoryspace Profiling

Synchronization Wait Tracing

Heap Tracing

MPI Tracing

Call Stacks and Program Execution

Single-Threaded Execution and Function Calls

Function Calls Between Shared Objects

Signals

Traps

Tail-Call Optimization

Explicit Multithreading

Overview of Java Technology-Based Software Execution

Java Call Stacks and Machine Call Stacks

Clock-based Profiling and Hardware Counter Overflow Profiling

Java Profiling View Modes

User View Mode of Java Profiling Data

Expert View Mode of Java Profiling Data

Machine View Mode of Java Profiling Data

Overview of OpenMP Software Execution

User View Mode of OpenMP Profile Data

Artificial Functions

User Mode Call Stacks

OpenMP Metrics

Expert View Mode of OpenMP Profiling Data

Machine View Mode of OpenMP Profiling Data

Incomplete Stack Unwinds

Intermediate Files

Mapping Addresses to Program Structure

The Process Image

Load Objects and Functions

Aliased Functions

Non-Unique Function Names

Static Functions From Stripped Shared Libraries

Fortran Alternate Entry Points

Cloned Functions

Inlined Functions

Compiler-Generated Body Functions

Outline Functions

Dynamically Compiled Functions

The <Unknown> Function

OpenMP Special Functions

The <JVM-System> Function

The <no Java callstack recorded> Function

The <Truncated-stack> Function

The <Total> Function

Functions Related to Hardware Counter Overflow Profiling

Mapping Performance Data to Index Objects

Mapping Data Addresses to Program Data Objects

Data Object Descriptors

The <Total> Data Object

The <Scalars> Data Object

The <Unknown> Data Object and Its Elements

Mapping Performance Data to Memory Objects

7.  Understanding Annotated Source and Disassembly Data

8.  Manipulating Experiments

9.  Kernel Profiling

Index

How Data Collection Works

The output from a data collection run is an experiment, which is stored as a directory with various internal files and subdirectories in the file system.

Experiment Format

All experiments must have three files:

In addition, experiments have binary data files representing the profile events in the life of the process. Each data file has a series of events, as described below under Interpreting Performance Metrics. Separate files are used for each type of data, but each file is shared by all threads in the target.

For clock-based profiling, or hardware counter overflow profiling, the data is written in a signal handler invoked by the clock tick or counter overflow. For synchronization tracing, heap tracing, MPI tracing, or OpenMP tracing, data is written from libcollector routines that are interposed by the LD_PRELOAD environment variable on the normal user-invoked routines. Each such interposition routine partially fills in a data record, then invokes the normal user-invoked routine, and fills in the rest of the data record when that routine returns, and writes the record to the data file.

All data files are memory-mapped and written in blocks. The records are filled in such a way as to always have a valid record structure, so that experiments can be read as they are being written. The buffer management strategy is designed to minimize contention and serialization between threads.

An experiment can optionally contain an ASCII file with the filename of notes. This file is automatically created when using the -C comment argument to the collect command. You can create or edit the file manually after the experiment has been created. The contents of the file are prepended to the experiment header.

The archives Directory

Each experiment has an archives directory that contains binary files describing each load object referenced in the map.xml file. These files are produced by the er_archive utility, which runs at the end of data collection. If the process terminates abnormally, the er_archive utility may not be invoked, in which case, the archive files are written by the er_print utility or the Analyzer when first invoked on the experiment.

Subexperiments

Subexperiments are created when multiple processes are profiled, such as when you follow descendent processes, collect an MPI experiment, or profile the kernel with user processes.

Descendant processes write their experiments into subdirectories within the founder-process’ experiment directory. These new subexperiments are named to indicate their lineage as follows:

For example, if the experiment name for the founder process is test.1.er, the experiment for the child process created by its third fork is test.1.er/_f3.er. If that child process executes a new image, the corresponding experiment name is test.1.er/_f3_x1.er. Descendant experiments consist of the same files as the parent experiment, but they do not have descendant experiments (all descendants are represented by subdirectories in the founder experiment), and they do not have archive subdirectories (all archiving is done into the founder experiment).

Data for MPI programs are collected by default into test.1.er, and all the data from the MPI processes are collected into subexperiments, one per rank. The Collector uses the MPI rank to construct a subexperiment name with the form M_rm.er, where m is the MPI rank. For example, MPI rank 1 would have its experiment data recorded in the test.1.er/M_r1.er directory.

Experiments on the kernel by default are named ktest.1.er rather than test.1.er. When data is also collected on user processes, the kernel experiment contains subexperiments for each user process being followed. The kernel subexperiments are named using the format _process-name_PID_process-id.1.er. For example an experiment run on a sshd process running under process ID 1264 would be named ktest.1.er/_sshd_PID_1264.1.er.

Dynamic Functions

An experiment where the target creates dynamic functions has additional records in the map.xml file describing those functions, and an additional file, dyntext, containing a copy of the actual instructions of the dynamic functions. The copy is needed to produce annotated disassembly of dynamic functions.

Java Experiments

A Java experiment has additional records in the map.xml file, both for dynamic functions created by the JVM software for its internal purposes, and for dynamically-compiled (HotSpot) versions of the target Java methods.

In addition, a Java experiment has a JAVA_CLASSES file, containing information about all of the user’s Java classes invoked.

Java tracing data is recorded using a JVMTI agent, which is part of libcollector.so. The agent receives events that are mapped into the recorded trace events. The agent also receives events for class loading and HotSpot compilation, that are used to write the JAVA_CLASSES file, and the Java-compiled method records in the map.xml file.

Recording Experiments

You can record an experiment on a user-mode target in three different ways:

The Performance Collect window in the Analyzer GUI runs a collect experiment.

collect Experiments

When you use the collect command to record an experiment, the collect utility creates the experiment directory and sets the LD_PRELOAD environment variable to ensure that libcollector.so and other libcollector modules are preloaded into the target’s address space. The collect utility then sets environment variables to inform libcollector.so about the experiment name, and data collection options, and executes the target on top of itself.

libcollector.so and associated modules are responsible for writing all experiment files.

dbx Experiments That Create a Process

When dbx is used to launch a process with data collection enabled, dbx also creates the experiment directory and ensures preloading of libcollector.so. Then dbx stops the process at a breakpoint before its first instruction, and calls an initialization routine in libcollector.so to start the data collection.

Java experiments can not be collected by dbx, since dbx uses a Java Virtual Machine Debug Interface (JVMDI) agent for debugging, and that agent can not coexist with the Java Virtual Machine Tools Interface (JVMTI) agent needed for data collection.

dbx Experiments on a Running Process

When dbx is used to start an experiment on a running process, it creates the experiment directory, but cannot use the LD_PRELOAD environment variable. dbx makes an interactive function call into the target to open libcollector.so, and then calls the libcollector.so initialization routine, just as it does when creating the process. Data is written by libcollector.so and its modules just as in a collect experiment.

Since libcollector.so was not in the target address space when the process started, any data collection that depends on interposition on user-callable functions (synchronization tracing, heap tracing, MPI tracing) might not work. In general, the symbols have already been resolved to the underlying functions, so the interposition can not happen. Furthermore, the following of descendant processes also depends on interposition, and does not work properly for experiments created by dbx on a running process.

If you have explicitly preloaded libcollector.so before starting the process with dbx, or before using dbx to attach to the running process, you can collect tracing data.