Sun Performance WorkShop Fortran Overview

Chapter 4 Debugging and Tuning

Many variables affect the performance of an application program. The one over which you have the most influence is the design of your program. With debugging and performance tuning, you can make your program efficient, reliable, and fast. Sun Performance WorkShop Fortran includes a variety of performance tools to debug, analyze source code, isolate problems, and provide you with the information you need to finely tune your applications for maximum performance.

Debugging

For your debugging tasks, use one of the two closely related debugging tools contained in Sun Performance WorkShop Fortran: dbx or the debugger.

For Command-Line Debugging, Use dbx

The dbx debugger is an interactive, source-level, command-line tool. You can use it to run a program in a controlled manner and to inspect the state of a stopped program. dbx gives you complete control of the dynamic execution of a program, including the collection of performance data.

Make use of the multithreaded features that are built into the standard dbx. You can identify all known threads, including their current state, base functions, and current functions. You can also examine thread stack traces. To ensure proper execution, you can debug threads by stepping through or over a thread, navigating between threads, and then resuming execution at any time.


Note - If you wish to debug your multithreaded application with dbx, you must include the -lthread option at link time.


If You Prefer a GUI, Use the Debugger

Should you favor a graphical interface to dbx, try the debugger (the Sun WorkShop Debugging window). During program execution, dbx obtains detailed information about program behavior and delivers this information to the debugger through a communications protocol. Debugging is easier because you can enter most commands by clicking redefinable buttons in the GUI.

You can also edit your programs with your favorite editor from the debugger and minimize the need to change tools.

Fix and Continue

With the Fix and Continue feature of the debugger, you can modify source code, recompile the file, and continue program execution, all without leaving the debugger. Using this feature eliminates the need to relink and reload the program.

Runtime Checking

Use runtime checking (RTC) to find elusive memory access violations and memory leaks in both single-threaded and multithreaded applications. With runtime checking, you can detect runtime errors in an application during the development phase. As errors are detected, the debugger interrupts program execution and displays the relevant source code so you can fix bugs as they are found.

Data Visualization

As a scientific or numerical software developer, you work with large volumes of data. To facilitate your analyses, you need to "see" results. Data visualization is a debugging technique that lets you explore and comprehend large and complex data sets, simulate results, and interactively steer computations. During this process, you can update the data on demand, at specified breakpoints, or at specified time intervals.

Debug Faster With Global Program Checking

Use global program checking to facilitate the debugging of your Fortran applications and to analyze source programs for inconsistencies and possible runtime problems. Global program checking also promotes consistency in the definition and use of arguments, commons, and parameters across routines.

Tuning

After you have successfully debugged your program, you can evaluate its performance with the Sampling Analyzer, a program designed to help you tune application performance, including memory allocation. The Sampling Analyzer measures and graphically displays your application's performance profile and suggests ways to improve performance. Its special data collection instrumentation eliminates the need to continually compile and link an application; any program that has been compiled can be analyzed.

Analyze a Variety of Performance Data

The performance data you can examine in the Sampling Analyzer include:

Control Your Analysis

The debugger serves as the data-gathering front end for the Sampling Analyzer. You can control the data collection process with the Sampling Collector window in dbx or the debugger while your program is running. You can collect data only between breakpoints, or you can limit data collection to a particular part of the program. The program run in which you collect data is known as an experiment, and the data file created by the Collector is called the experiment record. You then use the Sampling Analyzer to identify performance bottlenecks in the collected data.


Note - Performance tuning and runtime checking are mutually exclusive processes. You can perform only one or the other at a time. The information you receive from tuning your application can be adversely affected if you try to perform runtime checking simultaneously.


Focus on Problems

Test your hypotheses about a program's behavior by focusing on the areas where performance problems occur. To rebuild your programs with improved performance, use the Sampling Analyzer to identify areas where you can improve ordering for loading functions into the program's address space. In some cases, the Sampling Analyzer can improve performance automatically by creating a mapfile that instructs the linker to remap functions in memory more efficiently.

Find Which Modules Do the Calling

Performance analysis tools provide a range of analysis levels, from simple timing of a command to a statement-by-statement analysis of a program. While a flat profile can provide valuable data for performance improvements, sometimes the data is not sufficient to point out exactly where improvements can be made. You can obtain a more detailed analysis by using the call graph profile to identify which modules are called by other modules, and which modules call other modules.

Multiprocessing and Multithreading

Multiprocessing (MP) is the hardware technology on the SPARC platform that supports tightly coupled multi-CPU systems with shared memory. Multiple CPUs provide more power to drive application performance.

Multithreading (MT) is the software technology that enables the development of parallel applications, whether on single- or multiple-processor systems. Independent threads of execution can be scheduled on multiple CPUs in a multiprocessor system, but they share resources such as memory and files, allowing single applications to execute code in parallel. Threads share resources, synchronize, and communicate with each other through the use of mutual exclusion (mutex) locks provided by the operating system. Multiprocessing and multithreading together give you a scalable solution for higher application performance.
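The mutex-based synchronization described above can be sketched in C with POSIX threads. This is an illustration under assumptions, not code from Sun Performance WorkShop: the names counter, counter_lock, worker, and run_workers are hypothetical, and the POSIX pthread interface stands in for whichever threads library your platform provides.

```c
#include <pthread.h>
#include <stddef.h>

/* Shared state: one counter guarded by one mutex (hypothetical names). */
static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Work description handed to every thread. */
struct job {
    long increments;
};

/* Thread body: add to the shared counter, one locked update at a time. */
static void *worker(void *arg)
{
    const struct job *j = arg;

    for (long i = 0; i < j->increments; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                      /* only one thread is here at a time */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Start nthreads workers (1 to 16), wait for all of them, and return the
   final count. Returns -1 on a bad argument or if a thread cannot be
   created. */
long run_workers(int nthreads, long increments)
{
    pthread_t tid[16];
    struct job j = { increments };
    int started = 0;

    if (nthreads < 1 || nthreads > 16)
        return -1;

    counter = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &j) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return (started == nthreads) ? counter : -1;
}
```

Under these assumptions, a call such as run_workers(4, 10000) returns 40000. Without the lock around counter++, concurrent updates could be lost and the total could come out lower.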

Take Advantage of Parallelism

If your applications use parallelism, use the new multiprocessing systems and multithreaded operating environments to improve performance, responsiveness, and flexibility. With multithreading you can:

Speed Error Detection With Multithreaded Development Tools

Use multithreaded development tools to extend the Sun WorkShop Compilers Fortran (and multiprocessing C compiler) for multiprocessing optimizations. The multiprocessing/multithreading tool set includes multithreaded extensions to the Sun WorkShop debugger and dbx, and two additional tools: LockLint and LoopTool.

Find Inconsistent Lock Use With LockLint

Use LockLint to do static analysis of the use of mutex and read/write locks. In searching for inconsistent lock use, LockLint detects the most common causes of data races and deadlocks.
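One common form of inconsistent lock use is acquiring the same two locks in different orders on different code paths, which can deadlock. The C sketch below shows the kind of discipline that avoids it. It is an illustrative example only, not LockLint input or output: struct account, account_init, and transfer are hypothetical names, and POSIX mutexes are assumed.

```c
#include <pthread.h>
#include <stdint.h>

/* A balance protected by its own mutex (hypothetical type). */
struct account {
    pthread_mutex_t lock;
    long balance;
};

void account_init(struct account *a, long balance)
{
    pthread_mutex_init(&a->lock, NULL);
    a->balance = balance;
}

/* Move amount from src to dst. Both locks are always acquired in address
   order, so two threads transferring in opposite directions take the locks
   in the same order and cannot deadlock on each other.
   Returns 1 on success, 0 if src == dst or funds are insufficient. */
int transfer(struct account *src, struct account *dst, long amount)
{
    struct account *first, *second;
    int ok = 0;

    if (src == dst)
        return 0;

    first  = ((uintptr_t)src < (uintptr_t)dst) ? src : dst;
    second = (first == src) ? dst : src;

    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
    if (src->balance >= amount) {
        src->balance -= amount;
        dst->balance += amount;
        ok = 1;
    }
    pthread_mutex_unlock(&second->lock);
    pthread_mutex_unlock(&first->lock);
    return ok;
}
```

If transfer instead locked src first and dst second, the calls transfer(&a, &b, ...) and transfer(&b, &a, ...) running in two threads could each hold one lock while waiting for the other.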

Analyze Loop Information

Take advantage of LoopTool and LoopReport, performance analysis tools used with the multiprocessing Fortran and C compilers. The compilers automatically parallelize loops when they determine that it is safe and profitable to do so. With LoopTool you can:

Use the LoopReport command-line tool to create a summary table of all loop runtimes correlated with compiler hints about why a loop was not parallelized.
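Whether a loop is safe to parallelize depends largely on whether its iterations are independent. The C sketch below contrasts the two cases. It is an illustration with hypothetical function names (scale, running_sum) and makes no claim about what any particular compiler does with either loop.

```c
#include <stddef.h>

/* Independent iterations: y[i] depends only on x[i]. The restrict
   qualifiers state the assumption that y and x do not overlap, so the
   iterations could run in any order or concurrently. */
void scale(double *restrict y, const double *restrict x, double a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i];
}

/* Loop-carried dependence: iteration i reads s[i - 1], which iteration
   i - 1 has just written. As written, the iterations must run in order. */
void running_sum(double *restrict s, const double *restrict x, size_t n)
{
    if (n == 0)
        return;

    s[0] = x[0];
    for (size_t i = 1; i < n; i++)
        s[i] = s[i - 1] + x[i];
}
```

A dependence like the one in running_sum is a typical reason for a loop to be left serial, and it is the kind of loop worth examining when a report says it was not parallelized.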

For More Details, Use Call Grapher and gprof

You can use the -pg option of the Fortran compilers to compile an application for call graph profiling. Once your program is compiled in this manner, call graph profile data is written to a file called gmon.out after each run. Use the gprof command to interpret the results of the profile.
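To see why a call graph can reveal more than a flat profile, consider a routine shared by several callers. The C sketch below is a hypothetical example (dot, norm_squared, and mat_vec are illustrative names, not part of any Sun library). A flat profile would attribute time to dot as a whole, while a call graph profile separates the time by caller.

```c
#include <stddef.h>

/* Shared leaf routine: a flat profile charges all of its time here,
   regardless of who called it. */
double dot(const double *a, const double *b, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Caller 1: makes a single call to dot. */
double norm_squared(const double *a, size_t n)
{
    return dot(a, a, n);
}

/* Caller 2: makes n calls to dot, one per row of the n-by-n row-major
   matrix m, so it accounts for most of the calls when n is large. */
void mat_vec(double *y, const double *m, const double *x, size_t n)
{
    for (size_t r = 0; r < n; r++)
        y[r] = dot(m + r * n, x, n);
}
```

If these routines were called from a main program built with -pg, running the program would produce gmon.out, and gprof could then show how many of the calls to dot came from each caller.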