Performance Analyzer

Sun™ Studio 11

819-3687-10



Contents

Figures

Tables

Before You Begin

How This Book Is Organized

Typographic Conventions

Supported Platforms

Shell Prompts

Accessing Sun Studio Software and Man Pages

Accessing Sun Studio Documentation

Accessing Related Solaris Documentation

Resources for Developers

Contacting Sun Technical Support

Sending Your Comments

1. Overview of the Performance Analyzer

Starting the Performance Analyzer From the Integrated Development Environment

The Tools of Performance Analysis

The Collector Tool

The Performance Analyzer Tool

The er_print Utility

The Performance Analyzer Window

2. Performance Data

What Data the Collector Collects

Clock Data

Hardware Counter Overflow Profiling Data

Synchronization Wait Tracing Data

Heap Tracing (Memory Allocation) Data

MPI Tracing Data

Global (Sampling) Data

How Metrics Are Assigned to Program Structure

Function-Level Metrics: Exclusive, Inclusive, and Attributed

Interpreting Attributed Metrics: An Example

How Recursion Affects Function-Level Metrics

3. Collecting Performance Data

Compiling and Linking Your Program

Source Code Information

Static Linking

Optimization

Compiling Java Programs

Preparing Your Program for Data Collection and Analysis

Using Dynamically Allocated Memory

Using System Libraries

Using Signal Handlers

Using setuid

Program Control of Data Collection

The C, C++, Fortran, and Java API Functions

Dynamic Functions and Modules

Limitations on Data Collection

Limitations on Clock-Based Profiling

Limitations on Collection of Tracing Data

Limitations on Hardware Counter Overflow Profiling

Runtime Distortion and Dilation With Hardware Counter Overflow Profiling

Limitations on Data Collection for Descendant Processes

Limitations on Java Profiling

Runtime Performance Distortion and Dilation for Applications Written in the Java Programming Language

Where the Data Is Stored

Experiment Names

Moving Experiments

Estimating Storage Requirements

Collecting Data

Collecting Data Using the collect Command

Data Collection Options

Experiment Control Options

Output Options

Other Options

Collecting Data Using the dbx collector Subcommands

Data Collection Subcommands

Experiment Control Subcommands

Output Subcommands

Information Subcommands

Collecting Data From a Running Process

Collecting Data From MPI Programs

Storing MPI Experiments

Running the collect Command Under MPI

Collecting Data by Starting dbx Under MPI

Using collect With ppgsz

4. The Performance Analyzer Tool

Starting the Performance Analyzer

Analyzer Options

Performance Analyzer GUI

The Menu Bar

Toolbar

Analyzer Data Displays

Setting Data Presentation Options

Finding Text and Data

Showing or Hiding Functions

Filtering Data

Experiment Selection

Sample Selection

Thread Selection

LWP Selection

CPU Selection

Recording Experiments

Generating Mapfiles and Function Reordering

Defaults

5. Kernel Profiling

Kernel Experiments

Setting Up Your System for Kernel Profiling

Running the er_kernel Utility

Profiling the Kernel

Profiling Under Load

Profiling the Kernel and Load Together

Profiling a Specific Process or Kernel Thread

Analyzing a Kernel Profile

6. The er_print Command Line Performance Analysis Tool

er_print Syntax

Metric Lists

Commands That Control the Function List

Commands That Control the Callers-Callees List

Commands That Control the Leak and Allocation Lists

Commands That Control the Source and Disassembly Listings

Commands That Control the Data Space List

Commands That Control Memory Object Lists

Commands That List Experiments, Samples, Threads, and LWPs

Commands That Control Filtering of Experiment Data

Specifying a Filter Expression

Selecting Samples, Threads, LWPs, and CPUs for Filtering

Commands That Control Load Object Expansion and Collapse

Commands That List Metrics

Commands That Control Output

Commands That Print Other Information

Commands That Set Defaults

Commands That Set Defaults Only For the Performance Analyzer

Miscellaneous Commands

Expression Grammar

Examples

7. Understanding the Performance Analyzer and Its Data

How Data Collection Works

Experiment Format

Recording Experiments

Interpreting Performance Metrics

Clock-Based Profiling

Synchronization Wait Tracing

Hardware Counter Overflow Profiling

Heap Tracing

Dataspace Profiling

MPI Tracing

Call Stacks and Program Execution

Single-Threaded Execution and Function Calls

Explicit Multithreading

Overview of Java Technology-Based Software Execution

Java Processing Representations

Overview of OpenMP Software Execution

Incomplete Stack Unwinds

Mapping Addresses to Program Structure

The Process Image

Load Objects and Functions

Aliased Functions

Non-Unique Function Names

Static Functions From Stripped Shared Libraries

Fortran Alternate Entry Points

Cloned Functions

Inlined Functions

Compiler-Generated Body Functions

Outline Functions

Dynamically Compiled Functions

The <Unknown> Function

New and OpenMP Special Functions

The <JVM-System> Function

The <no Java callstack recorded> Function

The <Truncated-stack> Function

The <Total> Function

Functions Related to Hardware Counter Overflow Profiling

Mapping Data Addresses to Program Data Objects

Data Object Descriptors

8. Understanding Annotated Source and Disassembly Data

Annotated Source Code

Performance Analyzer Source Tab Layout

Annotated Disassembly Code

Interpreting Annotated Disassembly

Special Lines in the Source, Disassembly and PCs Tabs

Outline Functions

Compiler-Generated Body Functions

Dynamically Compiled Functions

Java Native Functions

Cloned Functions

Static Functions

Inclusive Metrics

Branch Target

Viewing Source/Disassembly Without An Experiment

9. Manipulating Experiments

Manipulating Experiments

Copying Experiments With the er_cp Utility

Moving Experiments With the er_mv Utility

Deleting Experiments With the er_rm Utility

Other Utilities

The er_archive Utility

The er_export Utility

A. Profiling Programs With prof, gprof, and tcov

Using prof to Generate a Program Profile

Using gprof to Generate a Call Graph Profile

Using tcov for Statement-Level Analysis

Creating tcov Profiled Shared Libraries

Locking Files

Errors Reported by tcov Runtime Functions

Using tcov Enhanced for Statement-Level Analysis

Creating Profiled Shared Libraries for tcov Enhanced

Locking Files

tcov Directories and Environment Variables

Index