Oracle Solaris Studio 12.2 DLight Tutorial

September 2010

This short tutorial takes you through the process of running a program in the DLight observability tool and looking at the resulting data displays.

Introduction
Building the Sample Application
Starting DLight
Running the Sample Application
Using the Indicator Controls
Exploring Thread Microstates
Exploring CPU Usage
Exploring Memory Usage
Exploring Thread Usage
Exploring I/O Usage

Introduction

DLight is an interactive graphical observability tool for C/C++ developers based on Oracle Solaris Dynamic Tracing (DTrace) technology. You can use DLight on Solaris platforms to analyze data from multiple DTrace scripts in a synchronized fashion to trace a runtime problem in an application to the root cause.

You must have root privileges on the system where you run DLight. If you are not logged in as root when you start DLight, you are prompted for your root password the first time you run your program.

DLight includes five profiling tools for C, C++, and Fortran programs:

Thread Microstates
CPU Usage
Memory Usage
Thread Usage
I/O Usage

DLight also includes three profiling tools for processes in the AMP (Apache, MySQL, PHP) stack:

Apache Monitor
MySQL Monitor
PHP Monitor

When you run an executable target or attach to a target process, each tools displays usage information in a graph in the Run Monitor window. Each graph includes a button to click for more information. Indicator controls at the bottom of the window let you control your view of the graphs.

Building the Sample Application

The ProfilingDemo application is available in the examples/dlight/ProfilingDemo directory in your installed Oracle Solaris Studio 12.2.

Copy the contents of the ProfilingDemo directory to your own private working area:
```
cp -r installation_directory/examples/dlight/ProfilingDemo ~/ProfilingDemo
```
Build your own copy of the program:
```
cd ~/ProfilingDemo
make
```
The program is built using the -g option, which generates debug information. (If you compile without this option, DLight can collect and display run time data for the program, but the features that let you display the source code for a function will not work.)

Starting DLight

If you do not already have DLight running, start it by typing:

installation_directory/bin/dlight

Running the Sample Application

In DLight, click the New DLight Target button . In the target dialog box:
1. Click the Executable Target tab.
2. Type the pathname to your profilingdemo executable in the Run field, or click Browse to use the Open dialog box to browse to your profilingdemo executable file.
3. The sample application does not take any arguments, so leave the Arguments field blank.
4. You would only use the Trace field if you wanted to trace a child process of the executable, so leave that field blank also.
5. Click Run.
6. If you were not logged in as root when you started DLight, you are prompted for your root password.
The ProfilingDemo application starts executing and the Run Monitor window starts displaying dynamic graphs. In the Output window, the ProfilingDemo program tells you what it is doing so you can match the output to the data represented in the graphs.
Press Enter each time the program requests it until execution is finished.

Using the Indicator Controls

At the bottom of the Run Monitor window, you can see sliders for controlling your view of the graphs: View slider, Details slider, and Time slider. Place your mouse cursor over the end points of the sliders for information about the sliders.
Click and hold your mouse cursor on the Time slider and drag the slider to the left to see the beginning of the run. All the graphs slide in unison so you can see what happened in each area (CPU, memory, threads, I/O) at any given time, and see the relationships between them.
Drag the Time slider from left to right to see the complete run.
Move your mouse cursor to the View slider, the control that overlays the time units. The View slider controls let you select what part of the run time is displayed in the graphs.
Click and drag the left handle of the View slider, the start point, all the way to the beginning of the run. The graphs now show the entire run at once. The effect is similar to zooming out as far as possible. Notice that the Time slider is not functional when you select the complete run time. You are already looking at all the data so there is nothing left to scroll.
Now let's focus our attention closer. Drag the start point of the View slider to the right. As you drag the handle, the graphs zoom in to focus on the area toward the end of the run. Notice that the Time slider can again be used to scroll back and forth in the run time.
Place your mouse cursor over the end points of the orange colored Details slider for a description of how to use the slider. The Details slider controls enable you to select a portion of the run time to examine in detail.
Drag the start point of the Details slider to a later time than the start point of the View slider. Notice that the graphs are grayed out in the area in front of the start point, giving a highlight effect to the graph between the start and end points.
When you click on any of the graphs' detail buttons (Thread Details, Hot Spots, Memory Leaks, Sync Problems, or I/O Details), the data for the highlighted area is shown in a details tab.
Drag the start point of the View slider back to the beginning of the run so you can see all the data.

Exploring Thread Microstates

The Thread Microstates graph shows an overview of the program's threads as they enter various execution states during the program's run. The Solaris microstate accounting feature uses the DTrace facility to provide fine-grained information about the state of each thread as it enters and exits ten different execution states:

User Running	The percentage of time the process has spent in user mode
System Running	The percentage of time the process has spent in system mode
Other running	The percentage of time the process has spent in processing system traps and such
Text page fault	The percentage of time the process has spent in processing text page faults
Data page fault	The percentage of time the process has spent in processing data page faults
Blocked	The percentage of time the process has spent waiting for user locks
Sleeping	The percentage of time the process has spent sleeping
Waiting	The percentage of time the process has spent waiting for CPU

The Thread Microstates tool graphically shows summarized state information for all the threads that are created during the program's run. Only four states are shown: Sleeping, Waiting, Blocked, and Running. These states represent a simplified or summary view of the ten possible microstates, and give an overview of the states of all the threads running in your program. For example, the time spent in the Running state represents all types of running states: running in user mode, running in system calls, running in page faults, running in traps.

Move the left handle of the View slider to the left until the graphs show about 20 seconds of the run time as illustrated below. This image shows the beginning of the program's run, during the SEQUENTIAL DEMO portion, when two tasks are run one after the other in a single thread. The points where the thread is sleeping correspond to the points where the program is waiting for the user to press Enter.
Click and drag the Time slider to the right until the number of threads shown in the Thread Microstates tool jumps to three.
The number of threads jumps to three as the program enters the PARALLEL DEMO portion. The main thread launches two additional threads to run two tasks in parallel, each in its own thread. You can see that considerable time is spent in the Waiting state (yellow) and the Sleeping state (blue), and not as much time spent in the Running state (green). No time is spend in the Blocked state (orange) during the PARALLEL DEMO portion because this portion of the program does not implement any thread synchronization tactics such as mutex locks that would block threads.
Drag the Time slider all the way to the end of the run time. Notice that the Blocked microstate shown in orange appears at the point where the program enters the PTHREAD MUTEX DEMO portion, in which each thread uses mutual exclusion locks to prevent other threads from interfering at certain points. Each thread can run actively only after it obtains the mutex lock. A thread is blocked when it tries to access a locked section of code when another thread owns the mutex lock. The use of mutual exclusion locks prevents the threads from entering a data race condition where threads have overlapping access to the same data.
Click the Thread Details button to display details about the thread microstates. The Thread Details window opens to display a graphical timeline representation of all the threads run in the program along with detailed state information.

The Thread Details window shows the state transitions for each thread during the complete run time of the program.
Put your cursor on one of the colored areas of a thread. A popup window displays details about what is going on with that thread at that particular moment. Details include the time when the data was collected and the percentage of time spent in each thread state at that moment. When you put your cursor over the Summary area at the right side of the window, the popup window displays the percentage in each state for the thread's complete run.
Try using the window's controls to change what is shown:
- By default, the window shows all threads. Click the downward arrow to the right of the Show drop-down list. You can select Live Threads only to show only threads that have not terminated, or Finished Threads only to show only those threads that have terminated during the program run.
- By default, the window shows only the four generalized execution states. Click the downward arrow to the right of the Detail Level drop-down list. You can select Moderate to display more detail on these states, or Advanced to display ten microstates.
- Click an individual thread and notice that the thread is highlighted. If you press Shift and click another thread, a range of threads is selected. To select multiple threads that are not adjacent, press Ctrl before selecting the threads. When the threads you are interested in are highlighted, right-click and select Show Only Selected Threads. To see all the threads again, select All Threads from the Show drop-down list.
Zoom in on the thread graphs to take a closer look by clicking the Zoom In button . This image shows the window zoomed in, with the Detail Level set to Advanced.
Click the Zoom Out button to go back to the previous zoom level.
Click the Show Complete Run button to display the complete run in the Thread Details window at once. (You can click the button again to return to the scrolling view of the thread details.)
Click the second orange rectangle on thread 4. The Thread Call stack tab opens to show the call stack for the thread at this moment. You can expand a node in the stack to see the calls happening in that thread, or right-click the top node and choose Expand All to see the calls in all of the threads.
Double-click the mutex_threadfunc function to open the source file where the function is called. (You can display the calling source file for any function in the stack that is not grayed out.)
Click the Thread Details tab to return to the Thread Details window. Click on a thread. You can navigate along the thread's timeline using your mouse or your keyboard:
- With your mouse, right-click the thread and use the Navigate menu to move the focus to the left or right, set a point on the timeline and update the content of the Thread Call Stack tab, or switch focus to the Thread Call Stack tab.
- To use keyboard shortcuts to navigate the thread, press the following keys:
  - Ctrl+LeftArrow and Ctrl+RightArrow to scroll left and right in the thread timeline
  - Ctrl+DownArrow to select a point in the timeline, which will update the Thread Call Stack
  - Alt+DownArrow to focus input on the Thread Call Stack window
- In the Thread Call Stack you can use the arrow keys and Enter to open source files associated with the functions

Exploring CPU Usage

The CPU Usage graph shows the percentage of the total CPU time used by your application during its run.

Click the Hot Spots button to display details about the CPU time. The CPU Time Per Function tab opens to display the functions of the program, along with the CPU time used by each function. The functions are listed in order of CPU time used, with the functions that use the most time listed first. If the program is still running, the time initially displayed is the amount of time consumed at the moment you clicked the button.
Click the header of the Function Name column to sort the functions alphabetically.
Click the header of the CPU Time (Exclusive) column to sort the functions by the order of time used by the individual functions.
Notice the difference between the two columns of CPU Time. CPU Time (Inclusive) shows the total CPU time spent from the time the function is entered until the time it is exited, including the time of all other functions that are called by the listed function. CPU Time (Exclusive) shows the time used only by the specific function, not including any functions that it calls.
Click the CPU Time (Inclusive) column header to place the most time-consuming function back at the top. Notice that the work_run_usrcpu function used slightly more CPU Time (Inclusive) than CPU Time (Exclusive), which means that a small amount of its CPU time was actually used by other functions that it calls, but that the work_run_usrcpu function itself used most of the time.
Some of the functions are shown in bold text. You can display the source files for those functions. Double-click the work_run_usrcpu function. The common.c file opens, with the cursor resting on the work_run_usrcpu function, line 59. Some numbers are displayed in the left margin for this line.
Place your mouse cursor over the numbers in the left margin. The numbers are the same metrics for exclusive and inclusive CPU time for the function that are displayed in the CPU Time Per Function tab. The metrics are rounded to use less space, but the unrounded values are shown when you mouse over them. Metrics for CPU-consuming lines such as the for loop that does calculations within the work_run_usrcpu function are also shown in the common.c source file.
Change the Time Filter start time in the CPU Time Per Function tab to 0:30 by either typing the time and pressing Enter, or using the arrows to scroll through the seconds. The graphs in the Run Monitor window change just as they did when you moved the handles on Details slider. If you drag the handles, the Time Filter settings in the CPU Time Per Function tab are updated to match. More importantly, the data shown for the functions in the tab is updated to reflect the filter so only the CPU time used during that time period is shown.
You can also filter for data that meets a certain metric. Right-click on the CPU Time (Exclusive) metrics for work_run_usrcpu. Select Show only rows where > CPU Time (Exclusive) == the metric shown for work_run_usrcpu. All the other rows are filtered away, and only the row whose exclusive CPU time is equal to that metric is displayed.

Exploring Memory Usage

The Memory Usage tool shows how your program's memory heap changes over its run time. You can use it to identify memory leaks, which are points in your program where memory that is no longer needed fails to be released. Memory leaks can lead to increased memory consumption in your program. If a program with a memory leak runs long enough, it can eventually run out of usable memory.

Slide the Time slider in the Run Monitor to the left and right to see how the memory heap increases and decreases over time. There are four spikes in usage in our run of the program. The first two occur during the SEQUENTIAL DEMO, the third occurs during the PARALLEL DEMO, and the fourth occurs during the PTHREAD MUTEX DEMO.
Click the Memory Leaks button to display the Memory Leak Details tab, which shows details about which functions exhibit memory leaks. Only functions that are producing memory leaks are listed in the tab. If your program is running when you click the button, the leak locations shown are those that exist at the moment you clicked the button. More leaks may occur as time goes on, so click the Refresh button to update the list. If no memory leaks are detected by the end of the run, the Memory Leak Details tab indicates that no memory leaks were found.
You can filter the data by changing the Start and End times, or by using the Details slider in the Run Monitor window.
In this run, the ProfilingDemo program shows a memory leak associated with a work_run_getmem function. Double-click the work_run_getmem function and the common.c file opens with the cursor at the line where the memory leak occurs in the function.
The memory leak metrics are displayed in the left margin. Mouse over them to display the details as you did with the CPU Usage metrics.
As with the CPU Time Per Function tab, you can right-click the metrics in the Memory Leak Details tab and select a criterion for filtering the data.

Exploring Thread Usage

The Thread Usage Tool shows the number of threads in use by your program, and any moments where a thread has to wait to get a lock in order to proceed with its task. This data is useful for multithreaded applications, which must perform thread synchronization in order to avoid expensive wait times.

Slide the Time slider to the beginning of the run and notice, as you did for the Thread Microstates graph, that the number of threads is one during the SEQUENTIAL DEMO portion of the program but increases to three as the program enters the PARALLEL DEMO portion.
Move the endpoint handle of the View slider so that you can see the data from the beginning of the run until just before the two additional threads are launched.
Look at the CPU Usage and Memory Usage graphs for the same time period and notice that the single thread is performing some activity that uses CPU time and memory. This period corresponds to the SEQUENTIAL DEMO portion of the program, in which the main thread writes to the file and then performs some calculations sequentially. CPU and memory usage both decrease while the program waits for the user to press Enter, and the number of threads remains at one.
Slide the Time slider to the right so that you can see two points where the threads increase to three. The first increase in threads corresponds to the PARALLEL DEMO portion of the program run, where the main thread starts two additional threads to do the work of writing to a file and performing calculations, in parallel. otice that the memory usage and CPU usage are a bit higher during this portion, but thee two tasks are completed in much less time than in the SEQUENTIAL DEMO portion.
Notice that the number of threads returns to one after the PARALLEL DEMO threads finish, and the main thread waits for the user to press Enter.
The thread count increases to three again as the PTHREAD MUTEX DEMO portion of the program runs. Notice that shortly after the thread count increases to three, a lock wait appears, shown in orange. The PTHREAD MUTEX DEMO uses mutual exclusion locks to prevet overlapping access to certain functions by multiple threads, which causes the threads to wait to obtain a lock.
Click the Sync Problems button to display details about thread locks. The Thread Synchronization Details tab opens and lists functions that had to wait to obtain a mutex lock. Also displayed are metrics for the number of milliseconds that the function spent waiting and the number of times the functions had to wait for a lock.
If you click the Sync Problems button while the program is running, you might need to click the Refresh button to update the display with the latest thread lock.
Click the header of the Wait Time column to sort the functions in order of time spent waiting.
Click the header of the Lock Waits column to sort the functions by the umber of times a thread was waiting in the functions.
Double-click the mutex_threadfunc function, which has the most lock waits. The mutex.c source file opens with the cursor at the line where the pthread_mutex_lock function is called. This function is responsible for locking a memory location before the location is read or written, and must wait until no other thread has a lock on that memory location.
The metrics for Wait time and Lock Waits are displayed in the left column margin of the source file. Place your mouse cursor over the metrics to see the details, which match what is shown in the Thread Synchronization Details tab.
Right-click on the metrics and deselect Show Profiler Metrics. The metrics are no longer displayed in the source editor for any of the profiling tools.
Choose View > Show Profiler Metrics to display the metrics again.

Exploring I/O Usage

The I/O Usage tool shows an overview of the program's read and write activity during the run.

The following image shows the I/O usage at the beginning of the run, during the SEQUENTIAL DEMO portion in which two tasks are run one after the other in a single thread. In the first few seconds the program started, and then it was waiting for the user to press Enter. When the user pressed Enter, the program wrote characters to a temporary file. This activity is reflected in the graph as the orange line showing bytes written.
Notice the following:
- The orange line shows the number of bytes written in the last second. The Profiling Demo program itself reports into the Output window that it writes a total of 79984640 bytes or about 76.3M during this run of the SEQUENTIAL DEMO. If you added up all of the data points shown on the orange line, the total would be close to 76.3M.
- The System time shown in the CPU Usage tool is high during this phase because the program makes use of system calls to generate data and write it to the disk.
- The Memory Usage tool shows a steady 8K bytes of memory heap allocated.
Click the I/O Details button. The I/O Usage details tab opens, displaying standard input, standard output, and temporary files that the program reads from and writes to. The files with checkmarks have been closed. The files that are marked with a yellow icon are still open for read and write actions.
Notice that this program requires input from pressing the Enter key only occasionally, so the Bytes Read column is not very useful. You can close the Bytes Read column by clicking the Change Visible Columns button to the right of the column headers. In the Change Visible Columns dialog box, click the checkbox to deselect Bytes Read and click OK.
Suppose that you want more details about how this program uses all of these temporary files. Click the /var/tmp/baa[]... file in the I/O Usage details tab to see the functions that opened and closed the file. The functions are listed in the panel to the right of the file list.
Double-click the work_run_syscpu function in the function list to open the source file for the function.
In the I/O Usage details tab, as in the CPU Time Per Function tab and Thread Synchronization Details tab, you can specify a time interval to look at. In the I/O Usage graph in the Run Monitor window, the write activity starts very near the beginning of the run, at the point where the SEQUENTIAL DEMO portion of the program begins writing to a temporary file.
Isolate the SEQUENTIAL DEMO portion of the run by typing the time that it ended (0:17in the following image) in the End field. The data is filtered in the Run Monitor window and the I/O Usage details tab so that you can focus on the input and output of the SEQUENTIAL DEMO phase.
Notice that one temporary file is shown in the I/O Usage details tab during the sequential demo. A single thread opens, writes to the file, and then closes it. Click the file to see the functions that access it, and double-click a function to see its source code.
In the Run Monitor window, move the Time slider to the right until you see write activity begin again after a pause. The pause in writing occurs during the second task in the sequential demo, which is a calculation task during which no disk writes take place. The renewed write activity shows that the program is entering the PARALLEL DEMO portion, where the user presses Enter to launch the tasks, and the writing to disk and calculation occur simultaneously in separate threads.
Isolate the parallel demo activity by typing the time it starts in the Start field and the time it ends in the End field. For the run shown in this image, the start time is 0:18 and the end time is 0:29.
Notice that several temporary files are shown in the I/O Usage details tab during the PARALLEL DEMO portion of the run. Only one thread is writing to the files because each second it switches to a new file. The calculating task does not write to disk.
Click one of the files and see that the parallel_threadfunc function is opening and closing the files.
Click and drag the right handle of the Details slider to the end of the run. In the following image, you can see that the I/O activity changes again when the program enters the PTHREAD MUTEX DEMO portion and the user presses Enter. Filter the data for this portion of the run by changing the start time to the end time of the parallel demo, in this case 0:24.
The I/O Usage details tab shows that multiple files are open during the PTHREAD MUTEX DEMO portion of the run. One thread is writing to a file, and every second it switches the file it writes to, as in the PARALLEL DEMO portion. However, because mutex locks are used, sometime the writing thread is blocked by the calculating thread and cannot continuously write to files.

Skip Navigation Links
Exit Print View
	Oracle Solaris Studio 12.2 DLight Tutorial