Sun Studio 12 Update 1: Performance Analyzer

Chapter 1 Overview of the Performance Analyzer

Developing high performance applications requires a combination of compiler features, libraries of optimized functions, and tools for performance analysis. The Performance Analyzer manual describes the tools that are available to help you assess the performance of your code, identify potential performance problems, and locate the part of the code where the problems occur.

Starting the Performance Analyzer From the Integrated Development Environment

For information on starting the Performance Analyzer from the Integrated Development Environment (IDE), see the Performance Analyzer Readme at http://developers.sun.com/sunstudio/documentation/ss12u1/mr/READMEs/analyzer.html

The Tools of Performance Analysis

This manual describes the Collector and Performance Analyzer, a pair of Sun Studio tools that you use to collect and analyze performance data for your application. Both tools can be used from the command line or from a graphical user interface.

The Collector and Performance Analyzer are designed for use by any software developer, even if performance tuning is not the developer’s main responsibility. These tools provide a more flexible, detailed, and accurate analysis than the commonly used profiling tools prof and gprof, and are not subject to the attribution error that affects gprof.
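The attribution error is not spelled out in this overview. A common explanation, stated here as an assumption rather than something this chapter confirms, is that gprof divides a function's time among its callers in proportion to call counts, as if every call cost the same. The following Python sketch uses made-up numbers and hypothetical function names (fast_caller, slow_caller, work) to show how that estimate can differ from the time each caller is actually responsible for.

```python
# Hypothetical measurements: work() is called once from each caller,
# but the call from slow_caller does ten times as much work.
CALLS = {"fast_caller": 1, "slow_caller": 1}            # calls into work()
ACTUAL_MS = {"fast_caller": 100, "slow_caller": 1000}   # time work() spent per caller


def by_call_count(calls, total_ms):
    """Split total_ms among callers in proportion to their call counts."""
    total_calls = sum(calls.values())
    return {caller: total_ms * n / total_calls for caller, n in calls.items()}


total = sum(ACTUAL_MS.values())
estimated = by_call_count(CALLS, total)

for caller in CALLS:
    print(f"{caller:12s} actual={ACTUAL_MS[caller]:5d} ms  "
          f"call-count estimate={estimated[caller]:7.1f} ms")
```

In this sketch both callers are charged the same amount even though one is responsible for ten times the work. A profile built from recorded call stacks does not need the equal-cost assumption, because each sample records which caller was on the stack.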

The Collector and Performance Analyzer tools help to answer the following kinds of questions:

The Collector Tool

The Collector tool collects performance data using a statistical method called profiling and by tracing function calls. The data can include call stacks, microstate accounting information (on Solaris platforms only), thread synchronization delay data, hardware counter overflow data, Message Passing Interface (MPI) function call data, memory allocation data, and summary information for the operating system and the process. The Collector can collect all kinds of data for C, C++ and Fortran programs, and it can collect profiling data for applications written in the Java™ programming language. It can collect data for dynamically generated functions and for descendant processes. See Chapter 2, Performance Data for information about the data collected and Chapter 3, Collecting Performance Data for detailed information about the Collector. The Collector can be run from the Performance Analyzer GUI, from the IDE, from the dbx command line tool, or with the collect command.
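As a minimal sketch of the command-line route, the following Python helper assembles a collect invocation for a clock-profiling run. The helper name collect_command, the target ./a.out, and the experiment name test.1.er are illustrative assumptions. Only the -p and -o options are used. See Chapter 3, Collecting Performance Data for the full option list.

```python
import shlex


def collect_command(target, experiment="test.1.er", target_args=()):
    """Build the argv list for a clock-profiling run of `target`."""
    # Assumption: experiment names are expected to end in ".er".
    if not experiment.endswith(".er"):
        raise ValueError("experiment name should end in .er")
    # -p on : clock-based profiling at the default interval
    # -o    : name of the experiment to write
    return ["collect", "-p", "on", "-o", experiment, target, *target_args]


argv = collect_command("./a.out")
print(shlex.join(argv))
```

The printed line can be pasted into a shell, or the list can be passed to subprocess.run on a system where the collect command is installed.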

The Performance Analyzer Tool

The Performance Analyzer tool displays the data recorded by the Collector, so that you can examine the information. The Performance Analyzer processes the data and displays various metrics of performance at the level of the program, the functions, the source lines, and the instructions. These metrics are classed into five groups:

The Performance Analyzer also displays the raw data in a graphical format as a function of time. The Performance Analyzer can create a mapfile that you can use to change the order of function loading in the program’s address space, to improve performance.

See Chapter 4, The Performance Analyzer Tool and the online help in the IDE or the Performance Analyzer GUI for detailed information about the Performance Analyzer.

Chapter 5, Kernel Profiling describes how you can use the Sun Studio performance tools to profile the kernel while the Solaris™ Operating System (Solaris OS) is running a load.

Chapter 6, The er_print Command Line Performance Analysis Tool describes how to use the er_print command line interface to analyze the data collected by the Collector.

Chapter 7, Understanding the Performance Analyzer and Its Data discusses how data collection works, interpreting performance metrics, call stacks and program execution, and annotated code listings. Annotated source code listings and disassembly code listings that include compiler commentary but do not include performance data can be viewed with the er_src utility (see Chapter 8, Understanding Annotated Source and Disassembly Data for more information).

Chapter 8, Understanding Annotated Source and Disassembly Data explains the annotated source and disassembly listings, including the different types of index lines and compiler commentary that the Performance Analyzer displays.

Chapter 9, Manipulating Experiments describes how to copy, move, delete, archive, and export experiments.

The er_print Utility

The er_print utility presents in plain text all the displays that are presented by the Performance Analyzer, with the exception of the Timeline display, the MPI Timeline display, and the MPI Chart display.
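For example, a plain-text function list can be requested by giving er_print commands as options on the command line. The Python sketch below only assembles such an invocation. The helper name er_print_command and the experiment name test.1.er are illustrative assumptions. Chapter 6 describes the full command set.

```python
import shlex


def er_print_command(experiment, limit=10):
    """Build the argv list for a short function-list report."""
    # -limit N   : restrict the reports that follow to N lines
    # -functions : print the function list with its metrics
    return ["er_print", "-limit", str(limit), "-functions", experiment]


argv = er_print_command("test.1.er")
print(shlex.join(argv))
```

The commands are processed in the order given, which is why the limit is placed before the report it applies to.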

The Performance Analyzer Window

Note – The following is a brief overview of the Performance Analyzer window. See Chapter 4, The Performance Analyzer Tool and the online help for a complete and detailed discussion of the functionality and features of the tabs discussed below.

The Performance Analyzer window consists of a multi-tabbed display, with a menu bar and a toolbar. The tab that is displayed when the Performance Analyzer is started shows a list of functions for the program with exclusive and inclusive metrics for each function. The data can be filtered by load object, by thread, by lightweight process (LWP), by CPU, and by time slice.
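The difference between exclusive and inclusive metrics can be illustrated with a small Python sketch. It assumes a profile made of sampled call stacks taken at a fixed interval. The sample data, the 10 ms interval, and the function names are invented for illustration, and the code is not the Analyzer's own implementation. An exclusive metric counts only samples in which the function itself was executing. An inclusive metric also counts samples in which the function was anywhere on the stack.

```python
from collections import Counter

# Hypothetical profile samples: each tuple is one call stack,
# outermost caller first, currently executing function last.
SAMPLES = [
    ("main", "parse"),
    ("main", "compute", "dot"),
    ("main", "compute", "dot"),
    ("main", "compute"),
    ("main",),
]
INTERVAL_MS = 10  # assumed time represented by each sample


def metrics(samples, interval_ms):
    """Return (exclusive, inclusive) time per function in milliseconds."""
    exclusive = Counter()
    inclusive = Counter()
    for stack in samples:
        exclusive[stack[-1]] += interval_ms   # leaf frame only
        for func in set(stack):               # once per sample, even if recursive
            inclusive[func] += interval_ms
    return exclusive, inclusive


exclusive, inclusive = metrics(SAMPLES, INTERVAL_MS)
for func in sorted(inclusive, key=inclusive.get, reverse=True):
    print(f"{func:8s} excl={exclusive[func]:3d} ms  incl={inclusive[func]:3d} ms")
```

In this data main has the largest inclusive time because it is on every stack, while dot has the largest exclusive time because it is most often the function actually running.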

For a selected function, another tab displays the callers and callees of the function. This tab can be used to navigate the call tree, in search of high metric values, for example.
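As a rough illustration of what a callers and callees view summarizes, the Python sketch below counts, for one selected function, the frames immediately above and below it in a set of sampled call stacks. The stacks and function names are invented, and the simple sample counts stand in for the attributed metrics that the Performance Analyzer actually reports.

```python
from collections import Counter

# Hypothetical sampled call stacks, outermost caller first.
SAMPLES = [
    ("main", "load", "parse"),
    ("main", "compute", "dot"),
    ("main", "compute", "dot"),
    ("main", "report", "compute", "norm"),
    ("main", "compute"),
]


def callers_callees(samples, selected):
    """Count immediate callers and callees of `selected` across all samples."""
    callers, callees = Counter(), Counter()
    for stack in samples:
        for i, func in enumerate(stack):
            if func != selected:
                continue
            if i > 0:
                callers[stack[i - 1]] += 1    # frame that called `selected`
            if i + 1 < len(stack):
                callees[stack[i + 1]] += 1    # frame that `selected` called
    return callers, callees


callers, callees = callers_callees(SAMPLES, "compute")
print("callers:", dict(callers))
print("callees:", dict(callees))
```

Repeating the query with the heaviest caller or callee as the new selection is one way to walk up or down the call tree toward the high metric values.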

Two other tabs display source code that is annotated line-by-line with performance metrics and interleaved with compiler commentary, and disassembly code that is annotated with metrics for each instruction and interleaved with both source code and compiler commentary if they are available.

The performance data is displayed as a function of time in another tab.

MPI tracing data is displayed as processes, messages, and functions in one tab, and as charts in another tab.

OpenMP parallel regions are displayed on one tab, and OpenMP tasks on another.

Other tabs show details of the experiments and load objects, summary information for a function, memory leaks, and statistics for the process.

Other tabs show Index Objects, Memory Objects, Data Objects, Data Layout, Lines, and PCs. See the Analyzer Data Displays for more information about each tab.

For experiments that have recorded Thread Analyzer data, tabs for data races and deadlocks are also available. Tabs are shown only if the loaded experiments have data supporting them.

See the Sun Studio 12: Thread Analyzer User’s Guide for more information about Thread Analyzer.

You can navigate the Performance Analyzer from the keyboard as well as with a mouse.