Analyzing Program Performance with Sun WorkShop
Contents
 Preface
1.  Overview of Performance Profiling and Analysis Tools
2.  Tutorial: Using the Sampling Collector and Analyzer
- Example 1: synprog
- Copying synprog
- Building synprog
- Collecting Data About synprog
- Analyzing synprog Performance Metrics
- Example 2: omptest
- Copying omptest
- Building omptest
- Collecting Data About omptest
- Analyzing omptest Performance Metrics
- Example 3: mttest
- Copying mttest
- Building mttest
- Collecting and Analyzing Data About mttest
3.  Sampling Collector Reference
- What the Sampling Collector Collects
- Exclusive, Inclusive, and Attributed Metrics
- Clock-Based Profiling Data
- Thread Synchronization Wait Tracing
- Hardware-Counter Overflow Profiling
- Global Information
- Collecting Performance Data in Sun WorkShop
- Starting a Process Under the Collector in dbx
- Attaching to a Running Process
- Using the Collector for Programs Written with MPI
4.  Sampling Analyzer Reference
- Starting the Analyzer and Loading an Experiment
- Analyzer Command-Line Options
- Exiting the Analyzer
- The Analyzer Window
- Examining Metrics for Functions and Load Objects
- Viewing Metrics for Functions and Load Objects
- Understanding the Metrics Displayed
- Selecting Metrics and Sort Order for Functions and Load Objects
- Viewing Summary Metrics for a Function or Load Object
- Searching for a Function or Load Object
- Examining Caller-Callee Metrics for a Function
- Selecting Metrics and Sort Order in the Callers-Callees Window
- Examining Annotated Source Code and Disassembly Code
- Choosing a Text Editor
- Filtering Information
- Selecting Load Objects
- Selecting Samples, Threads, and LWPs
- Generating and Using a Mapfile
- Using the Data Option List to Access Other Data Displays
- Examining Sample Overview Information
- Examining Address-Space Information
- Examining Execution Statistics
- Adding Experiments to the Analyzer
- Dropping Experiments from the Analyzer
- Printing the Display
5.  er_print Reference
- er_print Syntax
- Options
- er_print Commands
- Function List Commands
- Callers-Callees List Commands
- Source and Disassembly Listing Commands
- Selectivity Commands: Samples, Threads, LWPs, and Load Objects
- Metric Commands
- Output Commands
- Miscellaneous Commands
6.  Advanced Topics: Understanding the Sampling Analyzer and Its Data
- Event-Specific Data and What It Means
- Clock-Based Profiling
- Synchronization Wait Tracing
- Hardware-Counter Overflow Profiling
- Call Stacks and Program Execution
- Single-Threaded Execution and Function Calls
- Explicit Multithreading
- Parallel Execution and Compiler-Generated Body Functions
- Unwinding the Stack
- Mapping Addresses to Program Structure
- The Process Image
- Load Objects and Functions
- The Callers-Callees Window
- Annotated Source Code and Disassembly Code
- Annotated Source Code
- Annotated Disassembly
- Understanding Performance Costs
- Performance at the Function Level
- Performance at the Source Line Level
- Performance at the Instruction Level
7.  Loop Analysis Tools
- Basic Concepts
- Setting Up Your Environment
- Creating a Loop Timing File
- Other Compilation Options
- Running the Program
- Starting LoopTool
- Using LoopTool
- Opening Files
- Creating a Report on All Loops
- Printing the LoopTool Graph
- Choosing an Editor
- Editing Source Code and Getting Hints
- Starting LoopReport
- Timing File
- Fields in the Loop Report
- Compiler Hints
- 0. No hint available
- 1. Loop contains procedure call
- 2. Compiler generated two versions of this loop
- 3. The variable(s) "list" cause a data dependency in this loop
- 4. Loop was significantly transformed during optimization
- 5. Loop may or may not hold enough work to be profitably parallelized
- 6. Loop was marked by user-inserted pragma, DOALL
- 7. Loop contains multiple exits
- 8. Loop contains I/O, or other function calls, that are not MT safe
- 9. Loop contains backward flow of control
- 10. Loop may have been distributed
- 11. Two or more loops may have been fused
- 12. Two or more loops may have been interchanged
- How Optimization Affects Loops
- Inlining
- Loop Transformations: Unrolling, Jamming, Splitting, and Transposing
- Parallel Loops Nested Inside Serial Loops
A.  Traditional Profiling Tools
- Basic Concepts
- Using prof to Generate a Program Profile
- Output Example
- Sample prof Output
- Using gprof to Generate a Call Graph Profile
- Using tcov for Statement-Level Analysis
- Compiling for tcov
- Creating tcov Profiled Shared Libraries
- Locking Files
- Errors Reported by tcov Runtime Routines
- Using tcov Enhanced for Statement-Level Analysis
- Advantages of tcov Enhanced
- Compiling for tcov Enhanced
- Creating Profiled Shared Libraries
- Locking Files
- tcov Directories and Environment Variables
 Index
Sun Microsystems, Inc. All rights reserved.