Program Performance Analysis Tools
Forte Developer 7
816-2458-10
Contents
Accessing Forte Developer Development Tools and Man Pages
Accessing Forte Developer Documentation
Accessing Related Solaris Documentation
1. Overview of Program Performance Analysis Tools
2. Learning to Use the Performance Tools
Setting Up the Examples for Execution
Choosing Alternative Compiler Options
Basic Features of the Performance Analyzer
Example 1: Basic Performance Analysis
Extension Exercise for Simple Metric Analysis
Metric Attribution and the gprof Fallacy
Loading Dynamically Linked Shared Objects
Example 2: OpenMP Parallelization Strategies
Comparing Parallel Sections and Parallel Do Strategies
Comparing Critical Section and Reduction Strategies
Example 3: Locking Strategies in Multithreaded Programs
How Locking Strategies Affect Wait Time
How Data Management Affects Cache Performance
Extension Exercises for mttest
Example 4: Cache Behavior and Optimization
Program Structure and Cache Behavior
Program Optimization and Performance
What Data the Collector Collects
Hardware-Counter Overflow Data
Synchronization Wait Tracing Data
Heap Tracing (Memory Allocation) Data
How Metrics Are Assigned to Program Structure
Function-Level Metrics: Exclusive, Inclusive, and Attributed
Interpreting Function-Level Metrics: An Example
How Recursion Affects Function-Level Metrics
4. Collecting Performance Data
Preparing Your Program for Data Collection and Analysis
Controlling Data Collection From Your Program
Compiling and Linking Your Program
Limitations on Data Collection
Limitations on Clock-based Profiling
Limitations on Collection of Tracing Data
Limitations on Hardware-Counter Overflow Profiling
Limitations on Data Collection for Descendant Processes
Estimating Storage Requirements
Collecting Data Using the collect Command
Collecting Data From the Integrated Development Environment
Collecting Data Using the dbx collector Subcommands
Experiment Control Subcommands
Collecting Data From a Running Process
Collecting Data From MPI Programs
Running the collect Command Under MPI
Collecting Data by Starting dbx Under MPI
5. The Performance Analyzer Graphical User Interface
Running the Performance Analyzer
The Performance Analyzer Displays
Using the Performance Analyzer
Selecting the Data to Be Displayed
Searching for Names or Metric Values
Generating and Using a Mapfile
6. The er_print Command Line Performance Analysis Tool
Source and Disassembly Listing Commands
Memory Allocation List Commands
7. Understanding the Performance Analyzer and Its Data
Interpreting Performance Metrics
Hardware-Counter Overflow Profiling
Call Stacks and Program Execution
Single-Threaded Execution and Function Calls
Parallel Execution and Compiler-Generated Body Functions
Mapping Addresses to Program Structure
Static Functions From Stripped Shared Libraries
Fortran Alternate Entry Points
Compiler-Generated Body Functions
Dynamically Compiled Functions
8. Manipulating Experiments and Viewing Annotated Code Listings
Viewing Annotated Code Listings With er_src
A. Profiling Programs With prof, gprof, and tcov
Using prof to Generate a Program Profile
Using gprof to Generate a Call Graph Profile
Using tcov for Statement-Level Analysis
Creating tcov Profiled Shared Libraries
Errors Reported by tcov Runtime Functions
Using tcov Enhanced for Statement-Level Analysis
Creating Profiled Shared Libraries for tcov Enhanced
Copyright © 2002, Sun Microsystems, Inc. All rights reserved.