Sun Studio 12: Performance Analyzer

Preparing Your Program for Data Collection and Analysis

You do not need to do anything special to prepare most programs for data collection and analysis. You should read one or more of the subsections below if your program does any of the following:

Also, if you want to control data collection from your program, you should read the relevant subsection.

Using Dynamically Allocated Memory

Many programs rely on dynamically allocated memory, using features such as:

You must take care to ensure that a program does not rely on the initial contents of dynamically allocated memory, unless the memory allocation method is explicitly documented as setting an initial value. For example, compare the descriptions of calloc and malloc in the malloc(3C) man page.

Occasionally, a program that uses dynamically allocated memory might appear to work correctly when run alone, but might fail when run with performance data collection enabled. Symptoms might include unexpected floating-point behavior, segmentation faults, or application-specific error messages.

Such behavior might occur if the uninitialized memory is, by chance, set to a benign value when the application is run alone, but is set to a different value when the application is run in conjunction with the performance data collection tools. In such cases, the performance tools are not at fault. Any application that relies on the contents of dynamically allocated memory has a latent bug: an operating system is at liberty to provide any content whatsoever in dynamically allocated memory, unless explicitly documented otherwise. Even if an operating system happens to always set dynamically allocated memory to a certain value today, such latent bugs might cause unexpected behavior with a later revision of the operating system, or if the program is ported to a different operating system in the future.
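The difference between the two allocation routines can be sketched as follows. This is a minimal illustration, not code from the Sun Studio documentation; the histogram functions and their names are hypothetical. The first function shows the latent bug, and the other two show fixes that rely only on documented behavior.

```c
#include <stddef.h>
#include <stdlib.h>

/* Latent bug (shown for contrast only): malloc() makes no promise about
 * the bytes it returns, so counters allocated this way may start at any
 * value. Incrementing them without initialization is an error. */
int *histogram_uninitialized(size_t bins)
{
    return malloc(bins * sizeof(int));
}

/* Fix 1: calloc() is documented to return zero-filled memory. */
int *histogram_calloc(size_t bins)
{
    return calloc(bins, sizeof(int));
}

/* Fix 2: keep malloc() but initialize every element explicitly. */
int *histogram_malloc_init(size_t bins)
{
    int *h = malloc(bins * sizeof *h);
    if (h != NULL) {
        for (size_t i = 0; i < bins; i++)
            h[i] = 0;
    }
    return h;
}

/* Count how often each value in [0, bins) occurs in data.
 * Correct only if every counter in h started at zero. */
void histogram_add(int *h, size_t bins, const int *data, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= 0 && (size_t)data[i] < bins)
            h[data[i]]++;
    }
}
```

With either fix, the result of histogram_add() no longer depends on what the allocator happened to leave in memory, so it behaves the same with or without the performance tools loaded.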

The following tools may help in finding such latent bugs:

Using System Libraries

The Collector interposes on functions from various system libraries, to collect tracing data and to ensure the integrity of data collection. The following list describes situations in which the Collector interposes on calls to library functions.

Under some circumstances the interposition does not succeed:

The failure of interposition by the Collector can cause loss or invalidation of performance data.

Using Signal Handlers

The Collector uses two signals to collect profiling data: SIGPROF for all experiments and SIGEMT for hardware counter experiments only. The Collector installs a signal handler for each of these signals. The signal handler intercepts and processes its own signal, but passes other signals on to any other signal handlers that are installed. If a program installs its own signal handler for these signals, the Collector reinstalls its signal handler as the primary handler to guarantee the integrity of the performance data.

The collect command can also use user-specified signals for pausing and resuming data collection and for recording samples. The Collector does not protect these signals, although it writes a warning to the experiment if a user handler is installed. It is your responsibility to ensure that the Collector's use of the specified signals does not conflict with the application's use of the same signals.

The signal handlers installed by the Collector set a flag that ensures that system calls are not interrupted for signal delivery. This flag setting could change the behavior of the program if the program’s signal handler sets the flag to permit interruption of system calls. One important example of a change in behavior occurs for the asynchronous I/O library, which uses SIGPROF for asynchronous cancel operations, and which does interrupt system calls. If the collector library is installed, the cancel signal invariably arrives too late to cancel the asynchronous I/O operation.
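The text above does not name the flag. In the POSIX sigaction interface, the flag with this effect is SA_RESTART, and the sketch below assumes that is the one meant. It shows how a program can install a handler with or without the flag and check which setting is currently in effect for a signal. The function names are hypothetical, and SIGUSR1 is used in the usage note only to keep the example away from the profiling signals.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Trivial handler used only so that there is something to install. */
static void on_signal(int sig)
{
    (void)sig;
}

/* Install on_signal for sig. If restart_syscalls is nonzero, SA_RESTART
 * is set, so interrupted system calls are restarted rather than failing
 * with EINTR. Returns 0 on success, -1 on failure. */
int install_handler(int sig, int restart_syscalls)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = restart_syscalls ? SA_RESTART : 0;
    return sigaction(sig, &sa, NULL);
}

/* Query the handler currently installed for sig without changing it.
 * Returns 1 if it restarts system calls, 0 if not, -1 on error. */
int handler_restarts_syscalls(int sig)
{
    struct sigaction current;
    if (sigaction(sig, NULL, &current) != 0)
        return -1;
    return (current.sa_flags & SA_RESTART) != 0;
}
```

Calling handler_restarts_syscalls(SIGUSR1) after install_handler(SIGUSR1, 0) reports that system calls can be interrupted. A program that depends on that interruption could make the same query for SIGPROF to detect that a different handler has replaced its own.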

If you attach dbx to a process without preloading the collector library and enable performance data collection, and the program subsequently installs its own signal handler, the Collector does not reinstall its own signal handler. In this case, the program’s signal handler must ensure that the SIGPROF and SIGEMT signals are passed on so that performance data is not lost. If the program’s signal handler interrupts system calls, both the program behavior and the profiling behavior are different from when the collector library is preloaded.
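One way for a program's handler to pass a signal on is to save the previously installed action when installing its own handler, and to call that action from inside the handler. The following is a minimal sketch under that assumption, not the Collector's own mechanism. The stand-in profiler handler only plays the role of a handler that was installed earlier; all names are hypothetical. SIGEMT would be handled the same way on systems that define it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

static struct sigaction previous_action;        /* handler installed before ours */
static volatile sig_atomic_t program_hits;      /* signals seen by the program */
static volatile sig_atomic_t profiler_hits;     /* signals seen by the stand-in */

/* Stand-in for a profiling handler that is already installed. */
static void stand_in_profiler_handler(int sig)
{
    (void)sig;
    profiler_hits++;
}

int install_stand_in_profiler(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = stand_in_profiler_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* The program's handler: do its own work, then pass the signal on. */
static void program_handler(int sig, siginfo_t *info, void *context)
{
    program_hits++;
    if (previous_action.sa_flags & SA_SIGINFO) {
        if (previous_action.sa_sigaction != NULL)
            previous_action.sa_sigaction(sig, info, context);
    } else if (previous_action.sa_handler != SIG_DFL &&
               previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(sig);
    }
}

/* Install program_handler and remember whatever was installed before. */
int install_chaining_handler(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = program_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    return sigaction(sig, &sa, &previous_action);
}
```

After install_stand_in_profiler(SIGPROF) and install_chaining_handler(SIGPROF), each delivery of SIGPROF reaches both handlers, so the earlier handler does not lose signals.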

Using setuid

Restrictions enforced by the dynamic loader make it difficult to use setuid(2) and collect performance data. If your program calls setuid or executes a setuid file, it is likely that the Collector cannot write an experiment file because it lacks the necessary permissions for the new user ID.

You can work around this issue by ensuring that your umask is set to give write permission to any UIDs or GIDs that the process may run under. Each of these IDs must have write permission for the experiments.

Program Control of Data Collection

If you want to control data collection from your program, the Collector shared library contains API functions that you can use. The functions are written in C. A Fortran interface is also provided. Both C and Fortran interfaces are defined in header files that are provided with the library.

The API functions are defined as follows.

void collector_sample(char *name);
void collector_pause(void);
void collector_resume(void);
void collector_thread_pause(unsigned int t);
void collector_thread_resume(unsigned int t);
void collector_terminate_expt(void);

Similar functionality is provided for Java™ programs by the CollectorAPI class, which is described in The Java Interface.

The C and C++ Interface

There are two ways to access the C and C++ interface:

The Fortran Interface

The Fortran API libfcollector.h file defines the Fortran interface to the library. The application must be linked with -lcollectorAPI to use this library. (An alternate name for the library, -lfcollector, is provided for backward compatibility.) The Fortran API provides the same features as the C and C++ API, excluding the dynamic function and thread pause and resume calls.

Insert the following statement to use the API functions for Fortran:

include "libfcollector.h"

Note –

Do not link a program in any language with -lcollector. If you do, the Collector can exhibit unpredictable behavior.

The Java Interface

Use the following statement to import the CollectorAPI class and access the Java API. Note, however, that your application must be invoked with a classpath pointing to /installation_directory/lib/collector.jar, where installation_directory is the directory in which the Sun Studio software is installed.


The Java CollectorAPI methods are defined as follows:

CollectorAPI.sample(String name)
CollectorAPI.threadPause(Thread thread)
CollectorAPI.threadResume(Thread thread)

The Java API includes the same functions as the C and C++ API, excluding the dynamic function API.

The C include file libcollector.h contains macros that bypass the calls to the real API functions if data is not being collected. In this case the functions are not dynamically loaded. However, using these macros is risky because the macros do not work well under some circumstances. It is safer to use collectorAPI.h because it does not use macros. Rather, it refers directly to the functions.

The Fortran API subroutines call the C API functions if performance data is being collected, otherwise they return. The overhead for the checking is very small and should not significantly affect program performance.

To collect performance data you must run your program using the Collector, as described later in this chapter. Inserting calls to the API functions does not enable data collection.

If you intend to use the API functions in a multithreaded program, you should ensure that they are only called by one thread. With the exception of collector_thread_pause() and collector_thread_resume(), the API functions perform actions that apply to the process and not to individual threads. If each thread calls the API functions, the data that is recorded might not be what you expect. For example, if collector_pause() or collector_terminate_expt() is called by one thread before the other threads have reached the same point in the program, collection is paused or terminated for all threads, and data can be lost from the threads that were executing code before the API call.

To control data collection at the level of the individual threads, use the collector_thread_pause() and collector_thread_resume() functions. There are two viable ways of using these functions: by having one master thread make all the calls for all threads, including itself; or by having each thread make calls only for itself. Any other usage can lead to unpredictable results.
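The process-wide calls, made from a single thread, might be used as in the following sketch. It assumes that collector_pause() and collector_resume() bracket a region whose data is not wanted and that collector_sample() records a sample labeled with its argument. The HAVE_COLLECTOR_API macro, the stand-in function bodies, the workload functions, and the sample labels are all hypothetical. The stand-ins exist only so that the sketch compiles without the Collector header; they are not the library's implementation.

```c
#include <stdio.h>

#ifdef HAVE_COLLECTOR_API
#include "collectorAPI.h"   /* header supplied with the Collector library */
#else
/* Stand-ins with the prototypes listed earlier in this section. */
static int samples_recorded;
static int collection_paused;
void collector_sample(char *name)
{
    samples_recorded++;
    printf("sample: %s\n", name);
}
void collector_pause(void)  { collection_paused = 1; }
void collector_resume(void) { collection_paused = 0; }
#endif

#define TABLE_SIZE 100
static long table[TABLE_SIZE];

/* Setup phase that is not of interest for profiling. */
static void build_table(void)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        table[i] = i + 1;
}

/* The phase whose performance data is wanted. */
static long sum_table(void)
{
    long total = 0;
    for (int i = 0; i < TABLE_SIZE; i++)
        total += table[i];
    return total;
}

/* Call from one thread only: these calls affect the whole process. */
long run_workload(void)
{
    collector_pause();              /* skip data for the setup phase */
    build_table();
    collector_resume();

    collector_sample("compute-start");
    long result = sum_table();
    collector_sample("compute-end");
    return result;
}
```

When the program is not run under the Collector, the real API calls do nothing useful on their own, because inserting them does not enable data collection.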

The C, C++, Fortran, and Java API Functions

The descriptions of the API functions follow.

Dynamic Functions and Modules

If your C or C++ program dynamically compiles functions into the data space of the program, you must supply information to the Collector if you want to see data for the dynamic function or module in the Performance Analyzer. The information is passed by calls to collector API functions. The definitions of the API functions are as follows.

void collector_func_load(char *name, char *alias,
    char *sourcename, void *vaddr, int size, int lntsize,
    Lineno *lntable);
void collector_func_unload(void *vaddr);

You do not need to use these API functions for Java methods that are compiled by the Java HotSpot™ virtual machine, for which a different interface is used. The Java interface provides the name of the method that was compiled to the Collector. You can see function data and annotated disassembly listings for Java compiled methods, but not annotated source listings.

The descriptions of the API functions follow.


collector_func_load()

Pass information about dynamically compiled functions to the Collector for recording in the experiment. The parameter list is described in the following table.

Table 3–1 Parameter List for collector_func_load()

name
The name of the dynamically compiled function that is used by the performance tools. The name does not have to be the actual name of the function. The name need not follow any of the normal naming conventions of functions, although it should not contain embedded blanks or embedded quote characters.

alias
An arbitrary string used to describe the function. It can be NULL. It is not interpreted in any way, and can contain embedded blanks. It is displayed in the Summary tab of the Analyzer. It can be used to indicate what the function is, or why the function was dynamically constructed.

sourcename
The path to the source file from which the function was constructed. It can be NULL. The source file is used for annotated source listings.

vaddr
The address at which the function was loaded.

size
The size of the function in bytes.

lntsize
A count of the number of entries in the line number table. It should be zero if line number information is not provided.

lntable
A table containing lntsize entries, each of which is a pair of integers. The first integer is an offset, and the second integer is a line number. All instructions between an offset in one entry and the offset given in the next entry are attributed to the line number given in the first entry. Offsets must be in increasing numeric order, but the order of line numbers is arbitrary. If lntable is NULL, no source listings of the function are possible, although disassembly listings are available.
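The attribution rule for the line number table can be sketched as a lookup: an instruction offset belongs to the last entry whose offset does not exceed it. The code below is an illustration of that rule, not the Collector's implementation. The Lineno layout and its field names are assumptions based on the "pair of integers" description. Treating offsets as strictly increasing, and attributing everything after the last offset to the last entry, are also assumptions.

```c
#include <stddef.h>

/* Stand-in for the Collector's Lineno type (layout assumed). */
typedef struct {
    unsigned int offset;   /* byte offset from the start of the function */
    unsigned int lineno;   /* source line for instructions from this offset */
} Lineno;

/* Return 1 if the offsets are in increasing order, 0 otherwise. */
int lntable_is_ordered(const Lineno *table, int lntsize)
{
    for (int i = 1; i < lntsize; i++) {
        if (table[i].offset <= table[i - 1].offset)
            return 0;
    }
    return 1;
}

/* Return the line number attributed to the instruction at offset,
 * or -1 if the table is missing, empty, or starts after offset. */
int line_for_offset(const Lineno *table, int lntsize, unsigned int offset)
{
    int line = -1;
    if (table == NULL)
        return -1;
    for (int i = 0; i < lntsize; i++) {
        if (table[i].offset > offset)
            break;                      /* later entries start past offset */
        line = (int)table[i].lineno;    /* last entry at or before offset */
    }
    return line;
}
```

For a table with the entries (0, 10), (8, 12), (20, 11), and (36, 15), offsets 0 through 7 map to line 10, offsets 8 through 19 map to line 12, and so on. The line numbers themselves need not be ordered.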


collector_func_unload()

Inform the Collector that the dynamic function at the address vaddr has been unloaded.
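A program that generates code at run time might register and unregister a function as in the following sketch. The prototypes are the ones given earlier in this section. The HAVE_COLLECTOR_API macro, the header choice, the Lineno layout, the stand-in function bodies, the buffer, and all names and strings passed as arguments are hypothetical. The stand-ins only let the sketch compile without the Collector header.

```c
#include <stddef.h>

#ifdef HAVE_COLLECTOR_API
#include "libcollector.h"   /* assumed to declare Lineno and both functions */
#else
/* Stand-in type and functions; not the library's implementation. */
typedef struct {
    unsigned int offset;
    unsigned int lineno;
} Lineno;
static int functions_loaded;
static void *last_loaded_addr;
void collector_func_load(char *name, char *alias,
    char *sourcename, void *vaddr, int size, int lntsize,
    Lineno *lntable)
{
    (void)name; (void)alias; (void)sourcename;
    (void)size; (void)lntsize; (void)lntable;
    functions_loaded++;
    last_loaded_addr = vaddr;
}
void collector_func_unload(void *vaddr)
{
    if (vaddr == last_loaded_addr)
        functions_loaded--;
}
#endif

/* Pretend this buffer holds machine code produced at run time. */
static unsigned char code_buffer[64];

/* Offsets increase; line numbers may appear in any order. */
static Lineno code_lines[] = { {0, 40}, {16, 42}, {48, 41} };

/* Tell the Collector about the generated function. */
void *register_generated_function(void)
{
    collector_func_load("gen_kernel_0",             /* no blanks or quotes */
                        "kernel specialized for n = 8",
                        "kernel_template.c",        /* hypothetical path */
                        code_buffer,
                        (int)sizeof code_buffer,
                        (int)(sizeof code_lines / sizeof code_lines[0]),
                        code_lines);
    return code_buffer;
}

/* Tell the Collector before the buffer is reused or freed. */
void unregister_generated_function(void *vaddr)
{
    collector_func_unload(vaddr);
}
```

Passing the same address to both calls lets the Collector match the unload to the earlier load.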