Developing high performance applications requires a combination of compiler features, libraries of optimized routines, and tools for performance analysis.
Sun Studio software provides a sophisticated pair of tools for collecting and analyzing program performance data:
The Collector collects performance data using a statistical method called profiling. The data can include call stacks, microstate accounting information, thread-synchronization delay data, hardware-counter overflow data, address space data, and summary information for the operating system.
The Performance Analyzer processes the data recorded by the Collector and displays it so that you can examine it. It presents performance metrics at the program, function, caller-callee, source-line, and disassembly-instruction levels. These metrics fall into three groups: clock-based metrics, synchronization delay metrics, and hardware counter metrics.
The Performance Analyzer can also help you fine-tune your application’s performance by creating a mapfile that you can use to improve the order in which functions are loaded into the application address space.
These two tools help to answer the following kinds of questions:
How much of the available resources does the program consume?
Which functions or load objects are consuming the most resources?
Which source lines and disassembly instructions consume the most resources?
How did the program arrive at this point in the execution?
Which resources are being consumed by a function or load object?
The main window of the Performance Analyzer displays a list of the program's functions, with exclusive and inclusive metrics for each function. The list can be filtered by load object, by thread, by lightweight process (LWP), and by time slice. For a selected function, a subsidiary window displays the function's callers and callees. This window can be used to navigate the call tree, for example in search of high metric values. Two more windows display source code annotated line by line with performance metrics and interleaved with compiler commentary, and disassembly code annotated with metrics for each instruction. In the disassembly window, source code and compiler commentary are interleaved with the instructions if available.
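The difference between exclusive and inclusive metrics can be illustrated with a small sketch. It is not the Analyzer's implementation. It assumes only the usual definitions: an exclusive metric counts what is consumed in the function itself, and an inclusive metric also counts what is consumed in everything the function calls. The sampled call stacks and the function names (main, parse, read_token, compute, sqrt_approx) are hypothetical.

```python
from collections import defaultdict

# Hypothetical profile samples. Each sample is one call stack, listed from
# the outermost caller to the function that was executing when the sample
# was taken (the leaf).
SAMPLES = [
    ("main", "parse", "read_token"),
    ("main", "parse", "read_token"),
    ("main", "parse"),
    ("main", "compute"),
    ("main", "compute", "sqrt_approx"),
]


def attribute(samples, tick=1):
    """Return (exclusive, inclusive) metric totals per function.

    tick is the amount of the metric that one sample represents.
    """
    exclusive = defaultdict(int)
    inclusive = defaultdict(int)
    for stack in samples:
        # Exclusive: only the leaf function was actually running.
        exclusive[stack[-1]] += tick
        # Inclusive: every function on the stack was active. Using a set
        # counts a recursive function once per sample.
        for func in set(stack):
            inclusive[func] += tick
    return dict(exclusive), dict(inclusive)


exclusive, inclusive = attribute(SAMPLES)

# Print functions ordered by inclusive metric, highest first.
for func in sorted(inclusive, key=inclusive.get, reverse=True):
    print(f"{func:12s} excl={exclusive.get(func, 0)} incl={inclusive[func]}")
```

In this sketch main has a high inclusive value and no exclusive value, which is the pattern that suggests following the call tree downward to find where the resources are really consumed.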
The Collector and Analyzer are designed for use by any software developer, even if performance tuning is not the developer’s main responsibility. They provide a more flexible, detailed, and accurate analysis than the commonly used profiling tools prof and gprof, and they are not subject to the attribution error that affects gprof.
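The text does not spell out the attribution error. The sketch below assumes it refers to the well-known gprof behavior of dividing a function's time among its callers in proportion to call counts, which is misleading when calls differ widely in cost. The sketch compares that estimate with attribution based on recorded call stacks. The caller names (fast_path, slow_path) and the timings are hypothetical.

```python
# Hypothetical record of calls to one function, work(): the caller of each
# call and the time, in milliseconds, spent in work() during that call.
CALLS = [
    ("fast_path", 100),
    ("fast_path", 100),
    ("fast_path", 100),
    ("slow_path", 9700),
]


def by_call_count(calls):
    """Split the total time among callers in proportion to call counts.

    This models an estimate that knows only how often each caller made
    the call and the total time spent in the callee.
    """
    total = sum(ms for _, ms in calls)
    counts = {}
    for caller, _ in calls:
        counts[caller] = counts.get(caller, 0) + 1
    return {caller: total * n / len(calls) for caller, n in counts.items()}


def by_call_stack(calls):
    """Charge each caller with the time actually spent on its behalf."""
    totals = {}
    for caller, ms in calls:
        totals[caller] = totals.get(caller, 0) + ms
    return totals


estimated = by_call_count(calls=CALLS)
measured = by_call_stack(calls=CALLS)

for caller in measured:
    print(f"{caller}: estimated {estimated[caller]:.0f} ms, "
          f"measured {measured[caller]} ms")
```

Under these assumptions the call-count estimate charges most of the time to fast_path, although nearly all of it was spent on behalf of slow_path. Attribution from recorded call stacks avoids this distortion.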
Command-line equivalents of the Collector and Analyzer are available:
Data collection can be done with the collect(1) command.
The Collector can be run from dbx using the collector subcommands.
The command-line utility er_print(1) prints an ASCII version of the various Analyzer displays.
The command-line utility er_src(1) displays source and disassembly code listings annotated with compiler commentary but without performance data.
Details can be found in the Sun Studio Program Performance Analysis Tools manual.