The Sampling Analyzer measures, records, and analyzes the performance of an application. It can also compute an improved load order for functions in your application's address space and help you rebuild a tuned application.
Experiment menu | Provides commands for loading, exporting, printing, and deleting experiments and for creating mapfiles. |
View menu | Provides commands for selecting, sorting, finding, and showing data. |
Options menu | Provides commands for altering column widths and histogram names. |
Help menu | Provides online help. |
Data list box | Determines the kind of performance data to be analyzed. |
Display list box | Sets the display method for the data being analyzed. |
Unit radio buttons | Select the type of unit to view in the display pane. |
Average legend | Displays the average percentage of time spent in performance problem areas contained in experiment samples. |
Sample display pane | Contains graphical analyses of collected data. |
Include Samples text field | Displays samples, sample ranges, and/or numbers of displayed samples. The text box is editable. |
Arrow buttons | Let you step through an experiment, incrementing or decrementing the sample number by one per click, and view program behavior at each sample. |
Message area | Displays information about current actions. |
The Sampling Analyzer examines an experiment record written by the Sampling Collector and displays it graphically on screen. The er_export utility converts the data of the experiment record to ASCII format, and the er_print utility prints the data of the current display to a file or printer. These two utilities are invoked from the Export and Print entries in the Experiment menu. They are not normally run from the command line.
You can load an experiment either as the Sampling Analyzer opens or after it is already open. By default, experiment data is shown in the Overview display, but you can change the view to a Histogram, Cumulative, Address Space, or Statistics display, depending on the nature of the data.
To load an experiment as the Sampling Analyzer opens, double-click the experiment name in the Load Experiment dialog box that is displayed when the Sampling Analyzer window opens.
Or, navigate to the experiment name you want, and double-click it.
To load an experiment after the Sampling Analyzer is already open:
Choose Experiment > Load.
Type the name of the experiment in the Name text box or double-click its entry in the file filter.
Or, navigate to the experiment name you want, and double-click it.
The Sampling Analyzer allows you to view different types of collected data. You can specify the kind of data that would help you improve your application's performance.
Select one of the following data types from the Data list box:
Process Times | Summary of process state transitions |
User Time | Time spent in the user process state from the execution of instructions |
System Wait Time | Time the process is sleeping in the kernel but is not in the suspend, idle, lock wait, text fault, or data fault state |
System Time | Time the operating system spends executing system calls |
Text Page Fault Time | Time spent faulting in text pages |
Data Page Fault Time | Time spent faulting in data pages |
Program Sizes | Sizes in bytes of the functions, modules, and segments of your application. Used in conjunction with Address Space data, this lets you examine the size of your application and helps you establish specific memory requirements |
Address Space | Reference behavior of both text pages and data pages. Used in conjunction with Program Sizes data, it lets you examine the size of your application and helps you establish specific memory requirements |
Execution Statistics | Overall statistics on the execution of the application |
Each data type can be viewed only in displays appropriate to its nature. Table 2-1 lists the display options associated with each data type:
Table 2-1 Data Types and Corresponding Display Options
Data Type | Display Option(s) |
---|---|
Process Times | Overview |
User Time | Histogram; Cumulative |
System Wait Time | Histogram; Cumulative |
System Time | Histogram; Cumulative |
Text Page Fault Time | Histogram; Cumulative |
Data Page Fault Time | Histogram; Cumulative |
Program Sizes | Histogram |
Address Space | Address Space |
Execution Statistics | Statistics |
The Sampling Analyzer associates each data type with one or two display options, depending on the nature of the actual data.
Select one of the display options shown in Table 2-2 from the Display list box.
Table 2-2 Display Options for Specific Data Types
Display Option | Information Presented |
---|---|
Overview | The default display gives a high-level overview of performance behavior |
Histogram | Summary of the amount of time spent executing functions, files, and load objects |
Cumulative | Cumulative amount of time spent by a function, file, or load object, including the time spent in called functions, files, or segments |
Address Space | Information about memory usage |
Statistics | Aggregate data about performance and system resource usage |
For each sample, the Overview display (see Figure 2-4) shows the amount of time the application spends in different process states. The Sampling Collector always gathers this data during the data collection process, so the Overview display appears by default whenever an experiment is loaded into the Sampling Analyzer.
The Overview display option:
Provides a high-level overview of the performance behavior of an application
Provides data about how your application's execution time breaks down into different performance areas, helping you identify CPU bottlenecks, I/O bottlenecks, and paging bottlenecks
Shows how application performance changes during execution (for example, early parts of the execution might be I/O-bound, while later parts might be CPU-bound)
The Overview display contains numbered sample columns made up of segmented bars. Each column represents individual samples collected during an experiment.
The segments inside each column represent different performance areas. The height of each segment is proportional to the time spent in each performance area.
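As a rough illustration of how segment heights relate to the underlying times, the following Python sketch converts the per-state times of one sample into percentages of the sample duration. The state names follow the performance areas used in this chapter; the function name `segment_percentages` and the numeric values are hypothetical and are not taken from a real experiment.

```python
def segment_percentages(state_times):
    """Return each state's share of the sample duration, in percent."""
    total = sum(state_times.values())
    if total == 0:
        # An empty sample has no meaningful proportions.
        return {state: 0.0 for state in state_times}
    return {state: 100.0 * t / total for state, t in state_times.items()}


# Hypothetical times (in seconds) for a single sample.
sample = {"User": 6.0, "System": 1.5, "Text Fault": 0.5, "I/O": 2.0}
shares = segment_percentages(sample)
for state, pct in shares.items():
    print(f"{state:<10} {pct:5.1f}%")
```

In this made-up sample, the User segment would occupy 60 percent of the column height, which would suggest a CPU-bound interval.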
The shade that represents a specific performance area is consistent across all the sample columns in the experiment and across other experiments as well.
A transparent segment is a segment the same color as the foreground of the display pane. It represents performance areas too small to display individually. To see exactly which performance areas are contained in a transparent segment, click the segment's column and choose View > Show Details to open the Sample Details dialog box (see Figure 2-5).
The fields in the dialog box contain the following information about the selected samples.
Samples | Samples currently selected and the percentage of the experiment they represent |
Start Time | Start time of the sample |
End Time | End time of the sample |
Duration | Duration of the sample |
User | Time spent executing application instructions |
System | Time the operating system spent executing system calls |
Trap | Time spent executing traps (automatic exceptions or memory faults) |
Text Fault | Time spent faulting in text pages |
Data Fault | Time spent faulting in data pages |
I/O | Time spent in program I/O |
Lock Wait | Time spent waiting for lightweight process locks to be released |
Sleep | Time the program spent sleeping (due to any cause other than Text Fault, Data Fault, System Wait, or Lock Wait) |
Suspend | Time spent suspended (including time spent in the debugger when it encounters breakpoints) |
Idle | Time spent idle |
Parameters | List of the data parameters collected for each sample (set in the Sampling Collector before beginning the experiment) |
The Histogram display (see Figure 2-6) shows how much time an application spends executing functions, files, or load objects.
The Histogram display option is available for the following data types:
User Time
System Wait Time
System Time
Text Page Fault Time
Data Page Fault Time
To view your application's performance data at various levels of compilation granularity, choose one of the following unit types:
Function | Time your application spent executing functions |
File | Time spent executing file-level units. This view is useful if your application has a large number of functions. All data for a single source file is displayed together. Note: If any part of the executable (including shared libraries) is not compiled with the -g option, the Sampling Collector may not have enough information to associate functions with their containing files. |
Load Object | Time spent executing text segments. |
You can select which samples to include in the Histogram display in three ways:
Type sample numbers directly into the Include Samples text field: separate numbers with commas (1,3,6), and define ranges using a hyphen (1-6).
Select the columns containing those samples while still in the Overview display.
Choose either View > Select or View > Select None while still in the Overview display to include or exclude all samples in the experiment.
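The comma and hyphen notation accepted by the sample text field can be pictured with a small parser. The sketch below is only an illustration of the notation, not the Sampling Analyzer's own code; the function name `parse_samples` is hypothetical, and the assumption that commas and ranges can be mixed in one entry (for example, 1-3,8,11) goes beyond what this chapter states.

```python
def parse_samples(text):
    """Expand a string such as "1,3,6" or "1-6" into a sorted list of sample numbers."""
    selected = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid range: {part}")
            selected.update(range(first, last + 1))
        else:
            selected.add(int(part))
    return sorted(selected)


print(parse_samples("1,3,6"))     # [1, 3, 6]
print(parse_samples("1-6"))       # [1, 2, 3, 4, 5, 6]
print(parse_samples("1-3,8,11"))  # [1, 2, 3, 8, 11] (mixed form is an assumption)
```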
To select which segments to include in the Histogram display, choose View > Segments Included from Files to open the Segments Included from Files dialog. Click any segments and click Apply, or click the Select All button to select all segments.
To sort the Histogram display, choose View > Sort by and select either Values (descending by time value) or Names (alphabetically).
To search for specific names, choose View > Find to open the Find dialog box. Enter the search string in the text field and click Apply.
The Cumulative display (see Figure 2-7) shows the total execution time spent by a function, file, or load object, including time spent in called functions, files, or segments. All execution time accumulated in a descendant function is attributed to the parent function.
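The difference between the time a function spends in its own code and the cumulative time attributed to it can be sketched as follows. The call tree, the function names, and the times are all hypothetical, and the sketch assumes a simple tree in which no function is recursive and each function has a single caller.

```python
# Hypothetical call tree: each function maps to the functions it calls.
calls = {
    "main": ["parse", "compute"],
    "parse": [],
    "compute": ["sqrt_table"],
    "sqrt_table": [],
}

# Hypothetical time (in milliseconds) spent in each function's own code.
own_time = {"main": 200, "parse": 1000, "compute": 3000, "sqrt_table": 800}


def cumulative_time(func):
    """Own time of func plus the cumulative time of everything it calls."""
    return own_time[func] + sum(cumulative_time(child) for child in calls[func])


for name in calls:
    print(f"{name:<12} own={own_time[name]:>5}  cumulative={cumulative_time(name):>5}")
```

Here `compute` has 3000 ms of its own time but 3800 ms of cumulative time, because the 800 ms spent in `sqrt_table` is attributed to its parent as well.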
The Cumulative display is available for the following data types:
User Time
System Wait Time
System Time
Text Page Fault Time
Data Page Fault Time
To view data at various levels of compilation granularity, choose one of the following unit types:
Function | Time your application spent executing functions |
File | Time spent executing file-level units. This view is useful if your application has a large number of functions. All data for a single source file is displayed together. Note: If any part of the executable (including shared libraries) is not compiled with the -g option, the Sampling Collector may not have enough information to associate functions with their containing files. |
Load Object | Time spent executing text segments. |
You can select which samples to include in the Cumulative display in three ways:
Type sample numbers directly into the Include Samples text field: separate numbers with commas (1,3,6), and define ranges using a hyphen (1-6).
Select the columns containing those samples while still in the Overview display.
Choose either View > Select or View > Select None while still in the Overview display to include or exclude all samples in the experiment.
To select which segments to include in the Cumulative display, choose View > Segments Included from Files to open the Segments Included from Files dialog. Click any segments and click Apply, or click the Select All button to select all segments.
To sort the Cumulative display, choose View > Sort by and select either Values (descending by time value) or Names (alphabetically).
To search for specific names, choose View > Find to open the Find dialog box. Enter the search string in the text field and click Apply.
The Address Space display (see Figure 2-8) helps you identify memory that is most heavily used by your application (modified and referenced pages). This display option also identifies memory that is unused because the experiment did not exercise all of your application's functionality, or because your application has dead code or memory allocation problems.
The Address Space display option shows data only if you collect address-space data. If no address-space data was collected, a message to that effect will appear at the bottom of the Sampling Analyzer screen.
The Address Space display divides memory used by your application into the following categories:
Modified | A page written on during the execution of the application; may or may not be referenced |
Referenced | A page read by your application or containing instructions that have been executed by your application |
Unreferenced | A page neither modified nor referenced by the application |
The Address Space display (see Figure 2-8) is laid out in rows and columns that are made up of individual squares (pages) or rectangles (segments). The rows and columns are numbered to describe their address in memory. Gaps (shown as white space) represent a region of the address space that was not used by the application.
Sun systems use either 4-Kbyte or 8-Kbyte pages. The address of a page is a multiple of 0x1000 (4 Kbytes in hexadecimal) or 0x2000 (8 Kbytes in hexadecimal).
To verify the page size of your system, go to a prompt and type:
% pagesize
The pagesize command returns the page size in bytes:
4096 (4-Kbyte pages)
8192 (8-Kbyte pages)
If the page size is 4 Kbytes, the number of pages per row is 16. If the page size is 8 Kbytes, the number of pages per row is 8.
You can determine the address of a page by combining the hexadecimal values of the row and column that contains the page. For example, if the page you are examining is in the fourth row (0004_ _00) and the third column (20), then the address of that page is 00042000.
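The row and column arithmetic can be checked with a short Python sketch. It rests on two inferences from the figures given here rather than on a documented rule: that each row covers 64 Kbytes (16 pages of 4 Kbytes, or 8 pages of 8 Kbytes), and that a column label is the page's offset within the row with its two trailing zero digits dropped. The function names are hypothetical.

```python
ROW_SPAN = 0x10000  # bytes per row, inferred: 16 x 4 Kbytes = 8 x 8 Kbytes


def pages_per_row(page_size):
    """Number of pages displayed in one row for the given page size."""
    return ROW_SPAN // page_size


def column_labels(page_size):
    """Two-digit hexadecimal column labels for one row (assumed labeling scheme)."""
    return [f"{(i * page_size) >> 8:02x}" for i in range(pages_per_row(page_size))]


def page_address(row_prefix, column_label):
    """Combine a row prefix such as "0004" with a column label such as "20"."""
    address = (int(row_prefix, 16) << 16) | (int(column_label, 16) << 8)
    return f"{address:08x}"


print(pages_per_row(4096), pages_per_row(8192))  # 16 8
print(column_labels(4096)[2])                    # 20 (third column, 4-Kbyte pages)
print(page_address("0004", "20"))                # 00042000
```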
To view memory units at various levels of granularity in the Address Space display, select Page or Segment in the Unit type area.
Selected pages and segments are shadowed and raised to the left. If you keep the right mouse button pressed down over a selected page, the segment containing that page is also displayed and shadowed; likewise, if you keep the right mouse button pressed down over a selected segment, the pages contained within that segment are also displayed and selected.
To view information about the properties of a selected page or segment, choose View > Show Details to open either the Page Properties or Segment Properties dialog, which displays the following information:
Address
Size of the page or size range of the segment in bytes
Functions contained in the page or segment
Name of the segment
To select which samples to include in the Address Space display, you can:
Type sample numbers directly into the Include Samples text field: separate numbers with commas (1,3,6), and define ranges using a hyphen (1-6).
Select the columns containing those samples while still in the Overview display.
Choose either View > Select or View > Select None while still in the Overview display to include or exclude all samples in the experiment.
The Statistics display (see Figure 2-9) provides data about your application's overall performance and system resource usage (as opposed to the Histogram, Cumulative, and Address Space display options, which show data broken down by program components such as functions and pages). The information provided by the Statistics display is useful when you want to compare actual numerical values against any previous estimates you may have made.
The information needed to produce the Statistics display is always collected by the Sampling Collector during the data collection process, so you do not need to specify any particular data type to view information in this display. The Statistics display shows:
Minor Page Faults | The number of page faults serviced that do not require any physical I/O activity |
Major Page Faults | The number of page faults serviced that require physical I/O activity (if non-zero, the Overview display shows text page or data page fault wait time) |
Process swaps | The number of times a process is swapped out of main memory |
Input blocks | The number of times a read() system call is performed on a non-character or special file |
Output blocks | The number of times a write() system call is performed on a non-character or special file |
Messages sent | The number of messages sent over sockets |
Messages received | The number of messages received from sockets |
Signals handled | The number of signals delivered or received |
Voluntary context switches | The number of times a context switch occurred because a process voluntarily gave up the processor before its allotted time was completed, to wait for availability of a resource |
Involuntary context switches | The number of times a context switch occurred because a higher-priority process became runnable, or because the current process exceeded its allotted time |
System calls | The total number of system calls |
Characters of I/O | The number of characters transferred in or out to a character device or file by read and write calls |
Total address space size | Total size of the address space (in pages) |
Maximum address space size | Maximum size of the address space (pages per sample) |
Minimum address space size | Minimum size of the address space (pages per sample) |
Average address space size | Average size of the address space (pages per sample) |
Total text address space size | Total size of the text address space (pages) |
Maximum text address space size | Maximum size of the text address space (pages per sample) |
Minimum text address space size | Minimum size of the text address space (pages per sample) |
Average text address space size | Average size of the text address space (pages per sample) |
Total non-text address space size | Total size of the non-text address space (pages) |
Maximum non-text address space size | Maximum size of the non-text address space (pages per sample) |
Minimum non-text address space size | Minimum size of the non-text address space (pages per sample) |
Average non-text address space size | Average size of the non-text address space (pages per sample) |
Workset sizes will be non-zero only if address-space data was collected.
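Several of the counters in this list have close analogues in the resource usage record that UNIX-like systems keep for every process. The Python sketch below reads those analogues for the current process through the standard `resource` module. It is offered only to make the meaning of the counters concrete; it is an analogy and does not describe how the Sampling Collector gathers its data, and the pairing of display names with `getrusage` fields is an assumption.

```python
import resource

# Resource usage accumulated so far by the current process.
usage = resource.getrusage(resource.RUSAGE_SELF)

# Assumed pairing of Statistics display names with getrusage fields.
counters = {
    "Minor page faults": usage.ru_minflt,
    "Major page faults": usage.ru_majflt,
    "Process swaps": usage.ru_nswap,
    "Input blocks": usage.ru_inblock,
    "Output blocks": usage.ru_oublock,
    "Messages sent": usage.ru_msgsnd,
    "Messages received": usage.ru_msgrcv,
    "Signals handled": usage.ru_nsignals,
    "Voluntary context switches": usage.ru_nvcsw,
    "Involuntary context switches": usage.ru_nivcsw,
}

for name, value in counters.items():
    print(f"{name:<30} {value}")
```

Some systems do not maintain every field and report zero for the ones they do not track.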
You can select which samples to include in the Statistics display in three ways:
Type sample numbers directly into the Include Samples text field: separate numbers with commas (1,3,6), and define ranges using a hyphen (1-6).
Select the columns containing those samples while still in the Overview display.
Choose either View > Select or View > Select None while still in the Overview display to include or exclude all samples in the experiment.
You might wish to reorder your application if (and only if) text page faults are consuming a large percentage of its time.
After the behavior data is collected, you can use the Sampling Analyzer to generate a mapfile containing an improved ordering of functions. The -M option passes the mapfile to the linker, which then relinks your application and produces a new executable application with a smaller text address space size.
After you have reordered your application, you can run a new experiment and compare the original version with the reordered one.
To reorder an application:
Compile the application using the -xF option.
The -xF option is required for reordering. This option causes the compiler to generate functions that can be relocated independently.
For C applications, type:
cc -xF -c a.c b.c
cc -o application_name a.o b.o
For C++ applications, type:
CC -xF -c a.cc b.cc
CC -o application_name a.o b.o
For Fortran applications, type:
f77 -xF -c a.f b.f
f77 -o application_name a.o b.o
If you see the following warning message, check any files that are statically linked, such as unshared object and library files, because these files may not have been compiled with the -xF option:
ld: warning: mapfile: text: .text% :function_name
object_file_name:
Entrance criteria not met, the named file, function_name, has not been compiled with the -xF option
Load the application in Sun WorkShop for debugging.
Activate the Sampling Collector to collect performance data by choosing Windows > Sampling Collector from the Debugging window. Be sure to enable Address Space data collection.
Run the application in Sun WorkShop.
Load the specified experiment into the Sampling Analyzer.
Create a reordered map in the Sampling Analyzer by choosing Experiment > Create Mapfile. In the file chooser, enter the samples to be used, the mapfile directory, and the name of the mapfile to be created; and click OK.
The mapfile contains names of functions that have user CPU time associated with them. It specifies a function ordering that reduces the size of the text address space by sorting profiling data and function sizes in descending order. All functions not listed in the mapfile are placed after the listed functions.
Link the application using the new mapfile.
For C applications, type:
cc -Wl,-M,mapfile_name a.o b.o
For C++ applications, type:
CC -M mapfile_name a.o b.o
For C applications, the -M option causes the compiler to pass -M mapfile_name to the linker.
For Fortran applications, type:
f77 -M mapfile_name a.o b.o
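The function ordering recorded in the mapfile during this procedure can be pictured with the following Python sketch. The exact sorting rule is not spelled out in this chapter, so the sketch assumes that functions are ordered by user CPU time in descending order, with ties broken by function size in descending order. The profile data and the name `mapfile_order` are hypothetical, and the sketch produces only an ordered list of names, not the mapfile syntax itself.

```python
# Hypothetical profile: function name -> (user CPU time in ms, size in bytes).
profile = {
    "render": (4200, 8192),
    "parse_args": (0, 1024),
    "load_data": (1500, 4096),
    "checksum": (1500, 2048),
    "usage": (0, 512),
}


def mapfile_order(profile):
    """List functions that have user CPU time, most expensive first (assumed rule)."""
    used = [(name, t, size) for name, (t, size) in profile.items() if t > 0]
    # Descending by time, then descending by size; name keeps the order stable.
    used.sort(key=lambda item: (-item[1], -item[2], item[0]))
    return [name for name, _, _ in used]


print(mapfile_order(profile))  # ['render', 'load_data', 'checksum']
```

Functions with no user CPU time, such as `parse_args` and `usage` in this made-up profile, are left out of the list; as described above, the linker places them after the listed functions.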
The Sampling Analyzer lets you simultaneously view data in multiple displays, so you can compare samples in an experiment. With multiple displays, you can:
View different sets of samples in the same display option. For example, you can compare Histogram displays of sample 8 and sample 11.
View one set of samples in different display options. For example, you can view samples 1-6 in the Histogram display and, in a second window, view the same samples in the Cumulative display.
Compare samples from different experiments.
To view multiple displays:
Choose View > New Window to open a second Sampling Analyzer window.
In the new Sampling Analyzer window, choose data types, displays, and samples to examine, or load a second experiment if you wish.
The new window does not inherit the settings of the first Sampling Analyzer window; it is set to the defaults with which the original Sampling Analyzer window started. Also, if you close or quit the original Sampling Analyzer window, all windows opened from that window close as well.
If you want to save a record of an experiment, you can print experiment data to either a printer or a file. The Sampling Analyzer allows you to print:
A plain-text version of the current display
A text summary of the experiment that gives average sample times for each data type and shows how frequently functions, modules, and segments are used
To print a plain-text version of the current display:
Choose Experiment > Print.
Select whether the data should be printed to a printer or a file, and indicate the printer name and number of copies, if applicable.
Click OK.
To print a plain-text summary of the experiment:
Choose Experiment > Print Summary.
Select whether the summary data should be printed to a printer or a file, and indicate the printer name and number of copies, if applicable.
Click OK.
The Sampling Analyzer allows you to export experiment data to an ASCII file to be used later by other programs.
To export experiment data to an ASCII file: