ChorusOS 4.0 Introduction

Part IV Debugging and Performance Profiling

Chapter 10 System and Application Debugging

This chapter presents the source-level debugging architecture in the ChorusOS 4.0 operating system. It explains how to configure the different servers and tools, and how to use them.

Preparing the System for Symbolic Debugging

Compiling for Debugging

In order to use all the debugging features in the XRAY Debugger, you need to generate symbolic debugging information when you compile components. This information is stored in the object files and describes the data type of each variable or function and the correspondence between source line numbers and addresses in the executable code.
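For reference, here is a minimal sketch of what this means at the compiler level (assuming a GNU gcc-based cross toolchain; hello.c is just an illustrative file name, and in practice the build environments described below set the appropriate option for you):


% gcc -g -c hello.c -o hello.o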

How you enable debugging for components depends on which release of ChorusOS 4.0 you have. The binary release of ChorusOS 4.0 includes the source code for the BSP, driver, and example components. You compile these in what is known as an imake build environment, because the imake tool is used to create the Makefiles for these components.

The source release of ChorusOS 4.0 includes everything in the binary release, plus the source code for system components such as the kernel and the operating system. These components are built in an mkmk build environment, in which the mkmk tool is used to build the Makefiles. For more details, see the ChorusOS 4.0 Production Guide.

Enabling Debugging for Components Built with imake

To build all your components with symbolic debugging information turned on:

Other ways can be used to selectively build your components with symbolic debugging information. These are presented below.

To enable symbolic debugging throughout the component directory and its sub-directories:

  1. Edit the Project.tmpl file located in the root of the component source directory, and add the following line to the end:

    DEBUG=$(DEBUG_ON)

  2. Change directory to the root of your build directory and remove all the object files and executables:


    % make clean
    

  3. Rebuild the local Makefile:


    % make Makefile
    

  4. Rebuild the sub-directory Makefiles:


    % make Makefiles
    

  5. Finally, rebuild the component:


    % make
    

To enable symbolic debugging in selected component directories:

  1. Edit the Imakefile within each desired component source directory, and add the following line to the end:

    DEBUG=$(DEBUG_ON)

  2. Change directory to the root of your build directory and remove all the object files and executables:


    % make clean
    

  3. Rebuild the local Makefile:


    % make Makefile
    

  4. Finally, rebuild the component:


    % make
    

If you prefer not to modify the Imakefile or Project.tmpl files, there is an alternative: you can pass the debug option on the make command line itself, as shown below.
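A minimal sketch (the value is quoted so that make, rather than the shell, expands the DEBUG_ON macro defined by the build environment):


% make 'DEBUG=$(DEBUG_ON)'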

Once a component has been compiled in debug mode, rebuild and reboot the system image.

Enabling Debugging for Components Built with mkmk

To enable symbolic debugging for system components:

  1. Change to your build directory and remove all the object files and executables:


    % make clean
    

  2. Create a mkmk build definition file:


    % echo 'FREMOTEDEB=ON' > filename.df
    

    filename can be a name of your choice.

  3. Rebuild the system component:


    % make makemk
    

Configuring the Debug Agent

The DebugAgent is activated by enabling the DEBUG_SYSTEM feature with the configurator(1CC) command:


% configurator -set DEBUG_SYSTEM=true


Note -

The DEBUG_SYSTEM feature is set to true by default.


When the DebugAgent is activated, communications on the serial line are performed in binary mode.

The DebugAgent has eight tunable options that you can configure with ews or configurator. The following three tunables control the behavior of the DebugAgent when it is enabled (DEBUG_SYSTEM=true):

The following five tunables control the serial line used by the DebugAgent.


Note -

When the DebugAgent is not active (DEBUG_SYSTEM=false), the serial line is used by the system debugging console, and the five tunables control the serial device and speed.


Application Debugging Architecture

This section describes the components within the application debugging architecture.

Architecture Overview

The application debugging architecture has two components: the XRAY Debugger for ChorusOS, which runs on the host, and the remote debug server (RDBC), which runs on the target and communicates with XRAY over the Ethernet. This is illustrated in Figure 10-1.

Figure 10-1 Application Debugging Architecture

Graphic

Application debugging is intended for debugging user applications and dynamically loaded actors, as well as certain supervisor actors. It is not possible to debug the Actor Management (AM) or I/O Management (IOM) components, the kernel, or the system drivers. Application debugging relies on the RDBC supervisor actor, which uses the services of the AM, the IOM, the kernel, and system drivers such as the Ethernet driver. When an application is debugged, only that application is affected. Other applications, as well as the operating system itself, keep running.

Setting up a Debugging Session

To begin an application debugging session, follow these steps:

  1. Ensure that your target is connected to your network.

  2. Prepare the system for symbolic debugging. See "Preparing the System for Symbolic Debugging" for information on how to do this.

  3. Configure and start rdbc, the ChorusOS remote debug server. See "RDBC Configuration and Usage" and rdbc(1CC).

  4. Configure and start the XRAY Debugger for ChorusOS. See "Sample XRAY Start-up Script".

RDBC Configuration and Usage

The RDBC server can be started automatically or manually.
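For example, a manual start might look like the following sketch (this assumes rdbc is present in the target file system and can be loaded dynamically with arun, in the same way as the PROF actor in Chapter 11; target is the network name of your target):


% rsh target arun rdbc &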

To stop RDBC, use the akill command. First identify the actor process ID (aid):


% rsh name aps

Then kill the RDBC process:


% rsh name akill aid


Note -

Your XRAY application debug session will be lost if you stop RDBC.


Information about which targets are available to XRAY is held in the chorusos.brd file. It has four columns: the first column specifies the machine on which RDBC executes, the second specifies the slot number, and the last two are comments. XRAY interprets an integer value between 0 and 25 in the second column as a slot number, and a larger value as a TCP/IP port number, adapting its connection to the server accordingly. The default TCP/IP port number of RDBC is 2072.

Here is an example chorusos.brd file:

         target-i386   2072  "i386"  "Application debug of target-i386"
         target-ppc    2072  "ppc"   "Application debug of target-ppc"

The entries specify the application debug of actors running on target-i386 and target-ppc respectively, and require that RDBC be running on both machines.

Two or more RDBC servers can be run on the same target to provide you with a separate console for each program being debugged.

See rdbc(1CC) for more information.


Note -

The name and port number specified in chorusos.brd have different meanings:


System Debugging Architecture

This section describes the components within the system debugging architecture.

Architecture Overview

The system debugging architecture has the following components: the XRAY Debugger for ChorusOS, rdbs (the ChorusOS debug server for the XRAY Debugger), and chserver (the ChorusOS DebugServer), which run on the host, and the DebugAgent, which runs on the target and communicates with the ChorusOS DebugServer through a serial connection. This is illustrated in Figure 10-2.

Figure 10-2 System Debugging Architecture

Graphic

A more detailed description of the debugging architecture can be found in Chapters 2 and 3 of the ChorusOS Debug Architecture and API Specifications document (/opt/SUNWconn/SEW/4.0/chorus-doc/pdf/DebugApi.pdf).

System debugging is intended to be used for debugging the different parts of the ChorusOS operating system. This includes the kernel, the system drivers, the BSP, and those supervisor actors that cannot be debugged with application debugging, such as the AM and the IOM. System debugging also allows you to debug interrupt and exception handlers. During system debugging, the whole operating system is affected.

Setting up a Debugging Session

To begin your first system debugging session, follow these steps:

  1. Connect a serial cable between the host and target.

  2. Prepare the system for symbolic debugging. See "Preparing the System for Symbolic Debugging" for information on how to do this.

  3. Start and configure the ChorusOS DebugServer chserver. See "Setting up a Debugging Session" and chserver(1CC).

  4. Register the target with chadmin, the ChorusOS DebugServer administration tool. See "Registering a Target".

  5. Configure and start rdbs, the ChorusOS debug server for the XRAY Debugger. See "DebugServer Configuration File" and rdbs(1CC).

  6. Start the ChorusOS debug console chconsole. See the chconsole(1CC) man page for further details.

  7. Configure and start the XRAY Debugger for ChorusOS. See "Sample XRAY Start-up Script".


Note -

If you do not start chserver you will not be able to use chconsole. However, you can still view the system console using the tip or cu commands. See tip(1) and cu(1C) for more details.


For subsequent debugging sessions, you need only perform the following steps:

  1. Start the ChorusOS DebugServer chserver, if it is not already running.

  2. Start rdbs, the ChorusOS debug server for the XRAY debugger, if it is not already running.

  3. Start the ChorusOS debug console chconsole.

  4. Start the XRAY Debugger for ChorusOS.

Starting and Configuring the ChorusOS DebugServer

Identifying the Serial Device

The ChorusOS DebugServer chserver communicates with the target through a serial cable connection and must be run on the host to which the target is connected.

To identify the serial device, look in the /etc/remote file. This file contains references to remote systems that you can access through your local serial line. For more details, see remote(4). The device is usually named /dev/ttya or /dev/ttyb and will be the same device used by the tip or cu tools. The device must be readable and writable by all users.

DebugServer Slot Numbers

The DebugServer is a Sun RPC server that is registered with the rpcbind server. When you require more than one debug server to run on the same host, assign a unique slot number (in the range 0..65535) to each of them so that individual debug servers can be identified. If only one debug server is started on a given host, it is not necessary to allocate a slot number as 0 will be used by default.

If you decide to assign a slot number to your DebugServer, use the DebugServer environment variable CHSERVER_HOST.

DebugServer Environment Variable

The DebugServer, as well as all the other tools based on the Debug Library, uses the optional environment variable CHSERVER_HOST. This environment variable indicates the host on which the DebugServer runs and, optionally, the slot number assigned to it.

The format of the environment string is host[:slot-id]. For example:


%  setenv CHSERVER_HOST  jeriko
%  setenv CHSERVER_HOST  concerto:3

DebugServer Configuration File

Configuration information about targets is held in a special file which the DebugServer reads every time you run it. For each target, the configuration file contains the name of the target, the serial device to which it is connected, the serial line parameters (such as baud rate and parity), and the path of its layout.xml file.

When a new target is registered (see "Registering a Target" for details of how to do this), the configuration file is modified.

Starting the DebugServer

On the host to which your target or targets are connected, type the following command:


% chserver

This will start the DebugServer as a background process. An empty configuration file called dbg_config.xml is copied into your home directory the first time you run the DebugServer.

If you have assigned a slot number n and have not set the CHSERVER_HOST environment variable, you can start the DebugServer as follows:


% chserver -slot n

A complete description of the DebugServer is given in the chserver(1CC) man page.


Note -

If chserver is run on a host other than the one on which the system image was built, particularly on a shared file system with a different view of the build directory, the tool will not be able to access the necessary source files during system debugging. This problem is NFS related: a symbolic link created on one host may not be valid on another, and the layout.xml file contains relative file references. There are two solutions to the problem:


Stopping the DebugServer

Stop the DebugServer by using the chadmin tool.


% chadmin -shutdown

Registering a Target

Before registering a target, you need to know the name you want to give the target, the serial device to which it is connected, and the absolute path of its layout.xml file.

Now you can register the target by typing:


% chadmin -add-serial-target name \
           -device device \
           -layout-file layout_file

name is the name of your target, device is the serial device that you have identified, and layout_file is the absolute path of the layout.xml file.
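For example, to register a target named target-i386 connected to /dev/ttya (a sketch; replace build_dir with the absolute path of your build directory, and adjust the layout.xml path to match your system image configuration):


% chadmin -add-serial-target target-i386 \
           -device /dev/ttya \
           -layout-file build_dir/image/RAM/chorus/layout.xml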

You only need to register a target once as configuration information is saved in your dbg_config.xml file.

Updating Target Information

Use chadmin to update the information that you gave during the registration of your target.

The following example sets the baud rate to 38400, the parity to none, and uses the device /dev/ttya for the target name:


% chadmin -baud 38400 -parity none -device /dev/ttya name

If you wish to specify a new layout.xml file because the path has changed (see the note below), use the following command to inform the DebugServer of the new path:


% chadmin -layout-file path/layout.xml name


Note -

When you change the mode of system image booting (using the BOOT_MODE global variable) or select a different system image configuration (using the SYSTEM global variable), the path to the layout.xml file will change.

For example, if you type:


% configurator -set SYSTEM=kts

The path will change to build_dir/image/RAM/kts/layout.xml.

If you then type:


% configurator -set SYSTEM=chorus

The path will change to build_dir/image/RAM/chorus/layout.xml.

Similarly, if you change the mode of system image booting:


% configurator -set BOOT_MODE=ROM

The path will change to build_dir/image/ROM/chorus/layout.xml.


Deactivating a Target

A target can be deactivated to disconnect the DebugServer from the DebugAgent and release the serial device used by the DebugServer. When a target is deactivated, the DebugAgent switches to stand-alone mode. chconsole can no longer be used, because the DebugServer no longer reads the serial line. Instead, use the tip(1) or cu(1C) tools to gain access to the system debugging console.

A target may be temporarily removed (deactivated) with the following command:


% chadmin -deactivate name

name is the name of your target.

When a target is deactivated, it is not removed from the DebugServer configuration file, so it can be reactivated later.

Reactivating a Target

Before the target can be reactivated, you must stop any tip or cu tools which may be using the serial line.

The target is reactivated with this command:


% chadmin -activate name

name is the name of your target.

The DebugServer will synchronize with the DebugAgent and the DebugAgent will switch to a binary protocol mode. At this stage, you must use the chconsole to gain access to the system debugging console.

Removing a Target

A target may be permanently removed by first deactivating it (see "Deactivating a Target") then unregistering it with this command:


% chadmin -remove-target name

name is the name of your target.

Provided you have deactivated the target first, its configuration information is deleted from the configuration file.

RDBS Configuration and Usage

If RDBS is started without any parameters, it connects by default to the first target available on the DebugServer. However, you can specify a target name on the command line, provided the name is registered with the DebugServer.


Note -

This name is unrelated to the name under which the target might be known on the TCP/IP network (through another connection). It only identifies the serial line connecting the target with the DebugServer.


The complete set of command-line parameters is documented in rdbs(1CC).

Several RDBS servers may be run on one machine to debug several targets, provided you define a different slot for each server.

Information about which targets are available to XRAY is held in the chorusos.brd file. It has four columns: the first column specifies the machine on which RDBS executes, the second specifies the slot number, and the last two are comments. XRAY interprets an integer value between 0 and 25 in the second column as a slot number, and a larger value as a TCP/IP port number, adapting its connection to the server accordingly.

Here is an example chorusos.brd file:

         rdbshost      0     "i386"  "System debug of target-i386"
         rdbshost      1     "ppc"   "System debug of target-ppc"

The entries specify that two copies of RDBS will run on the rdbshost machine (a Solaris workstation): one on slot 0, configured to debug the target-i386 target, and another on slot 1, configured to debug the target-ppc target.


Note -

The name and port number specified in chorusos.brd have different meanings:


Concurrent System and Application Debugging

Combine the example chorusos.brd files given in "RDBC Configuration and Usage" and "RDBS Configuration and Usage":

         rdbshost      0     "i386"  "System debug of target-i386"
         rdbshost      1     "ppc"   "System debug of target-ppc"
         target-i386   2072  "i386"  "Application debug of target-i386"
         target-ppc    2072  "ppc"   "Application debug of target-ppc"

The first two entries specify that two copies of RDBS will run on the rdbshost machine (a Solaris workstation): one on slot 0, configured to debug the target-i386 target, and another on slot 1, configured to debug the target-ppc target.

The last two entries specify the application debug of actors running on target-i386 and target-ppc respectively, and require that RDBC be running on both machines.

By attaching to the first and third targets, you can carry out application and system debugging on the same target concurrently. However, while the system is stopped it is not possible to carry out application debugging, because halting the system halts RDBC as well as the application itself.

Example XRAY/RDBS Debug Session

In this session the target is named target-i386, the workstation is named workstation1 and all host tools are available. Several actors and drivers have been added to the system, and they have been compiled for system debugging.

Make sure you have enabled system debugging during system generation (see "Compiling for Debugging"), then run the DebugServer (see "Starting the DebugServer") and connect a console to target-i386. Run RDBS as follows:


% rdbs target-i386

Run XRAY (by using the "Sample XRAY Start-up Script", for example), then go to the Managers window and select the Connect tab.

Select the Boards->Add or Copy board entry menu option to register your target for system debugging. XRAY opens the Add/Copy Board Entry pop-up dialog:

Graphic

Enter the host name where RDBS is running in the Name of Board field. Enter the slot number used by RDBS (0 by default) in the Port as String field. Leave the other fields blank.


Note -

On Windows NT, XRAY uses native Windows pathnames and is not aware of the Cygwin UNIX emulation layer used by the ChorusOS host tools. As a result, pathname translations must be specified so that XRAY can translate the UNIX-like pathnames returned by the DebugServer, or embedded in object modules, into native Windows NT pathnames. Typical pathname translations are /c/=C:\ and /d/=D:\. They must reflect the results of the Cygwin mount command.


After you validate the dialog box, the new board appears in the window. Connect to the RDBS server by double-clicking on the new entry with the left mouse button.

This will connect you to RDBS, and through it to the DebugServer and the target. You can now view the system as it runs.

Enter the following in the command-line area of the Code window to see a list of actors running on the target:


Conn> stat actors

Select the Process tab in the Managers window. The Available Processes list will show a single entry representing the system as process number 1. Double-clicking on it will stop the system and initiate a debugging session.

XRAY will present you with the list of actors for which symbols should be loaded. By default, all actors are selected and you can press the OK button. XRAY will find the executable files automatically. For some of them, however, it may not have the path, and it will prompt for the pathname of the missing executable file. If the actor's binary file is statically linked, you must indicate the path where it is located. If the actor's binary file is relocatable, your only option is to select Cancel, because system debugging does not support the debugging of actors loaded from relocatable binaries.

After all selected actors have been loaded, XRAY will show where the system has been stopped in the Code window. The name of the thread which was executing last, also known as the current thread, will be displayed in the title bar. Thread execution can now be controlled.

Graphic

Choose a function that you want to debug, myFunc() for example, and enter the following command:


Stop> scope myFunc

XRAY displays the source code of the function in the Code window. Place a breakpoint for the current thread by double-clicking with the left mouse button on the chosen line number; this sets a per-thread breakpoint.

If you do not know whether the current thread will execute this function, place a global breakpoint by opening a local menu with the right mouse button and selecting Set Break All Threads. Press the Go button to resume execution.

Once the breakpoint has been reached, examine the values of variables by double-clicking on them with the right mouse button.

If the breakpoint is not reached, and the system continues to run, you can stop it asynchronously using the Stop button. The Code window will show the stop location.


Note -

Due to the way the stop operation is implemented, this will always be the same location inside the clock interrupt handler, unless the system was blocked waiting for console input or performing console output.


You can find the interrupted location by examining the stack with the Up button.

The chls Tool

The chls tool is available from the XRAY command window with the dchls command. You can use this command to display values which are not directly visible in the XRAY windows. For example, to look at the processor specific registers, type the following command:


Stop> dchls -special-regs

Graphic

Troubleshooting

If the DebugServer process is terminated, RDBS will attempt to reconnect to a new DebugServer process automatically. If there was a debugging session open at the time, the single process representing the ChorusOS operating system will be killed, and the debugging context lost. You will need to re-grab the process after RDBS has reconnected to the new DebugServer. If this does not work, then kill and restart RDBS.

If the target is rebooted, the single process representing the ChorusOS operating system will appear in the XRAY output window first as killed, and shortly after as restarted. XRAY will then attempt to reinsert all previously set breakpoints and promote them from thread-specific to global. Any breakpoints that cannot be reinserted will be deleted.

If you stop the system while it is waiting for console input, it will not resume until you provide some keyboard input.

Currently, the DebugServer does not offer access to the target's ChorusOS IPC ports. RDBS will report this by printing a warning message on start-up.

If a given symbol is present in several actors, or in several modules in a single actor (a static symbol, for example), you can use the ps /f symbol_name command to display all of the occurrences of the symbol, complete with a full pathname. The full pathname is of the @binary_file\\module\symbol_name form. Use this full pathname to reference symbols which are not in the current scope.

Because XRAY asks for a thread list each time a debugged process stops, and generating the full list takes a long time during system debugging, the thread list shown in the Threads Manager is simplified. It does not include field names, such as actor names, and is only updated when the current thread changes. The full thread list is available from the Resource Viewer or through the dallthreads command. Per-actor threads can be displayed with the dthreads=aid command. You can force the full thread list to be displayed, both in the Resource Viewer and in the Threads Manager, by leaving the Resource Viewer window permanently open on the thread list.
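For example, from a stopped session (the actor identifier 5 here is purely illustrative):


Stop> dallthreads
Stop> dthreads=5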

Sample XRAY Start-up Script

A sample script for setting up and starting XRAY is provided below.

Create a sub-directory within your system image directory to hold the script. For example, if your system image is called chorus.RAM, the directory would be image/RAM/chorus/bin.
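A sketch of creating this directory for the chorus.RAM example (run from the root of your build directory):


% mkdir -p image/RAM/chorus/bin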

Remember to modify the line which initializes the XRAY_INSTALL_DIR environment variable to point to the directory where XRAY is installed. This directory also contains the bin, docs, docschxx, license, master, xraycore and xrayrdb sub-directories. The script assumes you have put the license.dat file into this directory.

This shell script works whether you use a time-limited license for XRAY or a license locked to your machine. Please refer to the XRAY documentation for details of the other options available.

#!/bin/sh
set +x

XRAY_INSTALL_DIR=<xray_install_dir>

# Clean up possible crash
/bin/rm -f core /tmp/.MasterMsg/.MasterSock.$DISPLAY*

# Prepare environment variables
XRAYMASTER=$XRAY_INSTALL_DIR/master
export XRAYMASTER

USR_MRI=$XRAY_INSTALL_DIR
export USR_MRI

LD_LIBRARY_PATH=$XRAYMASTER/lib
export LD_LIBRARY_PATH

LM_LICENSE_FILE=$XRAY_INSTALL_DIR/license.dat
export LM_LICENSE_FILE

# If you use a license server, the following line starts it
# $XRAY_INSTALL_DIR/bin/lmgrd
# Then we change the LM_LICENSE_FILE variable to point to the server
# LM_LICENSE_FILE=port_number@machine_name

# Run XRAY itself
$XRAY_INSTALL_DIR/master/bin/xray -VABS=rdb $*

Chapter 11 Performance Profiling

This chapter explains how to analyze the performance of a ChorusOS operating system and its applications by generating a performance profile report.

Introduction to Performance Profiling

The ChorusOS operating system performance profiling system is a set of tools that facilitate the analysis and optimization of the performance of the ChorusOS operating system and its applications. These tools apply only to components that share the system address space, that is, the ChorusOS operating system components and supervisor application actors. The tool set is composed of a profiling server, libraries for building profiled actors, a target controller, and a host utility.

Software performance profiling consists of collecting data about the dynamic behavior of the software, to gain knowledge of the time distribution within the software. For example, the performance profiling system is able to report the time spent within each procedure, as well as providing a dynamically constructed call graph.

The typical steps of an optimization project are:

  1. To benchmark a set of typical applications, with the ChorusOS operating system and the applications running at peak performance. The selection of these applications is critical, as the system will eventually be tuned for this type of application.

  2. To evaluate and record the output of the benchmarks.

  3. To use the performance profiling system to collect raw data about the dynamic behavior of the applications.

  4. To generate, evaluate and record the performance profiling reports.

  5. To plan and implement optimizations such as rewriting certain time-critical routines in assembly language, using in-line functions or tuning algorithms.

The performance profiling tools provide two different classes of service, depending on the way in which the software being measured has been prepared: components compiled with the performance profiling option yield full report forms, while non-profiled components yield only simple report forms.


Note -

The standard (binary) version of the ChorusOS operating system is not compiled with the performance profiling option: profiling the system will only generate a simple report form. Non-profiled components (or components for which a simple report form is sufficient) do not need to be compiled with the performance profiling option.


In order to obtain full report forms for ChorusOS operating system components, the source product distribution is needed. In this case, it is necessary to regenerate the system components with the performance profiling option set.

Preparing to Create a Performance Profile

Configuring the System

In order to perform system performance profiling using the ChorusOS Profiler, a ChorusOS target system must include the ACTOR_EXTENDED_MNGT and NFS_CLIENT feature options.
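A sketch of enabling these feature options with the configurator(1CC) command, following the same -set syntax used for the DEBUG_SYSTEM feature in Chapter 10 (rebuild the system image afterwards):


% configurator -set ACTOR_EXTENDED_MNGT=true
% configurator -set NFS_CLIENT=true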

Launch the performance profiling server (the PROF actor) dynamically, using:


% rsh -n target arun PROF &

Compiling the Application

If you require full report forms, the profiled components must be compiled with the performance profiling compiler option (usually -p).

If you are using the imake environment provided with the ChorusOS operating system, you can set the profiling option in the Project.tmpl file to profile the whole project hierarchy, or in the Imakefile of each directory that you want to profile to cover only a subset of the hierarchy. In either case, add the following line:

PROF=$(PROF_ON)

You can also add the performance profiling option dynamically by calling make with the compiler profiling option:


% make PROF=-p  

in the directory of the program that is to be performance profiled.

Launching the Performance Profiled Application

In this section, it is assumed that the application consists of a single supervisor actor, the_actor, that the target system is named trumpet, and that the target tree is mounted under the $CHORUS_ROOT host directory.

In order to be performance profiled, an application may either be built into the system image or be loaded dynamically on the running system, as sketched below.
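A sketch of the dynamic case, using the example names above (the actor identifier that arun prints is needed later by profctl -a):


% rsh trumpet arun the_actor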

Running a Performance Profiling Session

Starting the Performance Profiling Session

Performance profiling is initiated by running the profctl utility on the target system, using the -start option. This utility (see "Security" for more details) takes the components to be profiled as arguments.

If the_actor was part of the system image:


% rsh trumpet arun profctl -start -b the_actor

Otherwise, if the_actor was loaded dynamically:


% rsh trumpet arun profctl -start -a the_actor aid

where aid is the numeric identifier of the actor (as returned by the arun or aps commands).


Note -

Several components may be specified to the profctl utility. See "Security" for more details.


Run the application.

Stopping the Performance Profiling Session

Performance profiling is stopped by running the profctl utility again, using the -stop option:


% rsh trumpet arun profctl -stop

When performance profiling is stopped, a raw data file is generated for each profiled component within the /tmp directory of the target file system. The name of the file is the component name with the suffix .prof added. For example, if only the_actor was profiled, the file $CHORUS_ROOT/tmp/the_actor.prof would be created.
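You can check for the raw data file from the host through the mounted target tree (a sketch, using the $CHORUS_ROOT mount point from the example above):


% ls -l $CHORUS_ROOT/tmp/the_actor.prof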

Generating Performance Profiling Reports

Performance profiling reports are generated by the profrpg host utility (see "Security" for details on reporting options).

Use the report generator to produce a report for each profiled component, as follows:


% cd $CHORUS_ROOT/tmp  


% profrpg the_actor > the_actor.rpg 

In order to track the benefits of optimization, the reports should be archived.

Analyzing Performance Profiling Reports

Performance profiling is applied to a user-selected set of components (ChorusOS operating system kernel, supervisor actors). The result of the performance profiling consists of a set of reports, one per profiled component.

A performance profiling report consists of two parts: a flat profile showing how time is distributed across the functions of the component, and a dynamically constructed call graph.

For each function, the performance profile report displays the following information:

Shown below is an example of a profiling report.


  overhead=2.468
  memcpy 4 K=18.834
  memcpy 16 K=51.936
  memcpy 64 K=185.579
  memcpy 256 K=801.300
  sysTime=2.576
  threadSelf=2.210
  thread switch=5.777
  threadCreate (active)=8.062
  threadCreate (active, preempt)=10.071
  threadPriority (self)=3.789
  threadPriority (self, high)=3.195
  threadResume (preempt)=6.999
  threadResume (awake)=4.014
 ...
  ipcCall (null, timeout)=35.732
  ipcSend (null, funcmode)=7.723
  ipcCall (null, funcmode)=31.762
  ipcSend (null, funcumode)=7.924
  ipcCall (null, funcumode)=31.864
  ipcSend (annex)=8.294
  ipcReceive (annex)=7.086
  ipcCall (annex)=33.708
  ipcSend (body 4b)=8.020
  ipcReceive (body 4b)=6.822
  ipcCall (body 4b)=32.558
  ipcSend (annex, body 4b)=8.684
  ipcReceive (annex, body 4b)=7.495
  ipcCall (annex, body 4b)=34.849

Performance Profiler Description

This section provides information about the performance profiling system's design, to help you understand the sequence of events that occurs before the generation of a performance profiling report.

The performance profiling tool set consists of the performance profiling library (linked with profiled actors), the PROF performance profiler server, the profctl target control utility, and the profrpg host report generator.

The Performance Profiling Library

When the performance profiling compiler option (generally -p) is used, the compiler inserts at each function entry point a call to a routine, normally called mcount. For each function, the compiler also sets up a static counter, and passes the address of this counter to mcount. The counter is initialized to zero.

What mcount does is defined by the application. Low-end performance profilers simply count the number of times each routine is called. The ChorusOS Profiler provides a sophisticated mcount routine within the profiled library that constructs the runtime call graph. Note that you can supply your own mcount routine, for example to assert predicates when debugging a component.

The Performance Profiler Server

The profiler server, PROF, is a supervisor actor that can locate and modify static data within the memory context of the profiled actors, using the embedded symbol tables. The profiler server also dynamically creates and deletes the memory regions that are used to construct the call graph and count the profiling ticks (see below).

The Performance Profiling Clock

While the performance profiler is active, the system is regularly interrupted by the profiling clock, which by default is the system clock. At each clock tick, the instruction pointer is sampled, the active procedure is located, and a counter associated with the interrupted procedure is incremented. A high-rate profiling clock can consume a significant amount of system time, making the system appear to run more slowly, and a rapid sampling clock can jeopardize the system's real-time requirements.

Notes About Accuracy

Significant disruptions in the real-time capabilities of the profiled programs must be expected, because performance profiling is performed by software (rather than by hardware with an external bus analyzer or equivalent device). Software-based profiling slows down the processor, and profiled applications may behave differently when being profiled compared to when running at full processor speed.

When profiling, the processor can spend more than fifty percent of its time processing profiling clock interrupts. Similarly, the time spent recording the call graph is significant, and tends to bias the profiling results in a non-linear manner.

The accuracy of the reported percentage of time spent is about five percent when the number of profiling ticks is on the order of ten times the number of bytes in the profiled programs. In other words, in order to profile a program of 1 million bytes with any degree of accuracy, at least 10 million ticks should be collected. This level of accuracy is usually sufficient for planning code optimizations, which is the primary goal of the profiler, but beware of reading significance into all the fractional digits of the reported figures.
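For example, assuming a purely illustrative 100 Hz profiling clock, collecting 10 million ticks would require roughly 10,000,000 / 100 = 100,000 seconds, or about 28 hours, of profiled execution; for large programs, the profiling clock rate or the length of the profiling run may therefore need to be increased.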

If more accuracy is needed, the operator can experiment with different combinations of the rate of the profiling clock, the type of profiling clock and the time spent profiling.