The following list points out some of the more frequent oversights that can cause bugs in multithreaded programs.
Accessing global memory (shared changeable state) without the protection of a synchronization mechanism.
Creating deadlocks caused by two threads trying to acquire rights to the same pair of global resources in opposite order, so that one thread controls the first resource, the other controls the second resource, and neither can proceed until the other gives up.
Creating a hidden gap in synchronization protection. This is caused when a code segment protected by a synchronization mechanism contains a call to a function that frees and then reacquires the synchronization mechanism before it returns to the caller. The result is that it appears to the caller that the global data has been protected when it actually has not.
Mixing UNIX signals with threads. It is better to use the sigwait(2) model for handling asynchronous signals.
Failing to reevaluate the conditions after returning from a call to *_cond_wait() or *_cond_timedwait().
Forgetting that, by default, threads are created PTHREAD_CREATE_JOINABLE and must be reclaimed with pthread_join(3THR). Note that pthread_exit(3THR) does not free the exiting thread's storage space.
Making deeply nested, recursive calls and using large automatic arrays. Both can cause problems because multithreaded programs have a more limited stack size than single-threaded programs.
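Three of the items in the list above (unprotected shared state, not reevaluating the condition after a wait, and unreclaimed joinable threads) can be illustrated together. The following is a minimal sketch in C using POSIX threads, not code taken from this guide. The names mailbox_lock, mailbox_ready, has_item, item, producer, consume, and run_demo are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state: a one-slot mailbox. Every access to
 * has_item and item happens while mailbox_lock is held. */
static pthread_mutex_t mailbox_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t mailbox_ready = PTHREAD_COND_INITIALIZER;
static int has_item = 0;   /* the condition that consume() waits for */
static int item = 0;

/* Stores one value in the mailbox and wakes a waiting consumer. */
static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mailbox_lock);
    item = 42;
    has_item = 1;
    pthread_cond_signal(&mailbox_ready);
    pthread_mutex_unlock(&mailbox_lock);
    return NULL;
}

/* Blocks until the mailbox holds a value, then removes and returns it. */
static int consume(void)
{
    int value;

    pthread_mutex_lock(&mailbox_lock);
    /* Use while, not if: the condition is tested again after every
     * return from pthread_cond_wait(), which may wake spuriously. */
    while (!has_item)
        pthread_cond_wait(&mailbox_ready, &mailbox_lock);
    value = item;
    has_item = 0;
    pthread_mutex_unlock(&mailbox_lock);
    return value;
}

/* Creates a default (joinable) producer thread, consumes its value,
 * and reclaims the thread with pthread_join(). */
int run_demo(void)
{
    pthread_t tid;
    int rc, value;

    rc = pthread_create(&tid, NULL, producer, NULL);
    assert(rc == 0);
    value = consume();
    rc = pthread_join(tid, NULL);   /* without this, the thread's storage is not reclaimed */
    assert(rc == 0);
    return value;
}
```

Calling run_demo() from main() returns 42. If the while in consume() were replaced by an if, the program would probably still pass most runs, which is part of what makes this kind of oversight hard to find by testing alone.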
Note that multithreaded programs (especially those containing bugs) often behave differently in two successive runs, given identical inputs, because of differences in the thread scheduling order.
In general, multithreading bugs are statistical rather than deterministic. Tracing is usually a more effective method of finding order-of-execution problems than breakpoint-based debugging.
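To illustrate why a trace helps with order-of-execution problems, the sketch below records events into an in-memory buffer while the threads run, so that the interleaving can be inspected afterward instead of being disturbed by breakpoints. It is a hand-rolled illustration in C and does not use the TNF utilities. All names (trace_event, trace_find, run_traced, dump_trace, and the buffer itself) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define TRACE_MAX 64
#define NTHREADS  4

/* Hypothetical trace record: which worker reached which step. */
struct trace_rec { int worker; int step; };

static pthread_mutex_t trace_lock = PTHREAD_MUTEX_INITIALIZER;
static struct trace_rec trace_buf[TRACE_MAX];
static int trace_len = 0;

/* Appends one event. The array index is the global order of events. */
static void trace_event(int worker, int step)
{
    pthread_mutex_lock(&trace_lock);
    if (trace_len < TRACE_MAX) {
        trace_buf[trace_len].worker = worker;
        trace_buf[trace_len].step = step;
        trace_len++;
    }
    pthread_mutex_unlock(&trace_lock);
}

/* Returns the position of an event in the trace, or -1 if absent.
 * Call only after the traced threads have been joined. */
int trace_find(int worker, int step)
{
    int i;

    for (i = 0; i < trace_len; i++)
        if (trace_buf[i].worker == worker && trace_buf[i].step == step)
            return i;
    return -1;
}

static void *worker(void *arg)
{
    int id = *(int *)arg;

    trace_event(id, 1);   /* entering the work */
    trace_event(id, 2);   /* leaving the work */
    return NULL;
}

/* Runs NTHREADS workers and returns the number of events recorded. */
int run_traced(void)
{
    pthread_t tids[NTHREADS];
    int ids[NTHREADS];
    int i, rc;

    trace_len = 0;
    for (i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        rc = pthread_create(&tids[i], NULL, worker, &ids[i]);
        assert(rc == 0);
    }
    for (i = 0; i < NTHREADS; i++) {
        rc = pthread_join(tids[i], NULL);
        assert(rc == 0);
    }
    return trace_len;
}

/* Prints the recorded interleaving, one event per line. */
void dump_trace(void)
{
    int i;

    for (i = 0; i < trace_len; i++)
        printf("%d: worker %d step %d\n", i, trace_buf[i].worker, trace_buf[i].step);
}
```

Each run records eight events, and each worker's step 1 precedes its step 2, but the order in which different workers interleave can change from run to run. Comparing the output of dump_trace() across runs shows that variation directly. Taking a mutex in the trace path perturbs scheduling somewhat, so a production tracer would likely use a lighter mechanism.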
Use the TNF utilities (included as part of the Solaris system) to trace, debug, and gather performance analysis information from your applications and libraries. The TNF utilities integrate trace information from the kernel and from multiple user processes and threads, and so are especially useful for multithreaded code.
See the TNF manual pages for detailed information on using prex(1) and tnfdump(1).
See truss(1) for information on tracing system calls, signals, and user-level function calls.
The following mdb commands can be used to access the LWPs of a multithreaded program.
Table 7–3 MT mdb Commands
pid:A | Attaches to process # pid. This stops the process and all its LWPs.
:R | Detaches from process. This resumes the process and all its LWPs.
$L | Lists all active LWPs in the (stopped) process.
n:l | Switches focus to LWP # n.
$l | Shows the LWP currently focused.
num:i | Ignores signal number num.
These commands to set conditional breakpoints are often useful.
Table 7–4 Setting mdb Breakpoints
[label],[count]:b [expression] | Breakpoint is detected when expression equals zero.
foo,ffff:b <g7-0xabcdef | Stop at foo when g7 = the hex value 0xABCDEF.
With the dbx utility, you can debug and execute source programs written in C++, ANSI C, and FORTRAN. dbx accepts the same commands as the Debugger, but uses a standard terminal (TTY) interface. Both dbx and the Debugger support debugging multithreaded programs. For a full overview of dbx and Debugger features, see the dbx(1) reference manual page and the Using Sun Workshop user's guide.
All the dbx options listed in Table 7–5 can support multithreaded applications.
Table 7–5 dbx Options for MT Programs