CHAPTER 11

Debugging Multithreaded Applications

dbx can debug multithreaded applications that use either Solaris threads or POSIX threads. With dbx, you can examine stack traces of each thread, resume all threads, step or next a specific thread, and navigate between threads.

dbx recognizes a multithreaded program by detecting whether it utilizes libthread.so. The program uses libthread.so either by explicitly being compiled with -lthread or -mt, or implicitly by being compiled with -lpthread.

This chapter describes how to find information about and debug threads using the dbx thread commands.

This chapter is organized into the following sections:


Understanding Multithreaded Debugging

When it detects a multithreaded program, dbx tries to load libthread_db.so, a special system library for thread debugging located in /usr/lib.

dbx is synchronous; when any thread or lightweight process (LWP) stops, all other threads and LWPs sympathetically stop. This behavior is sometimes referred to as the "stop the world" model.



Note - For information on multithreaded programming and LWPs, see the Solaris Multithreaded Programming Guide.



Thread Information

The following thread information is available in dbx:


(dbx) threads
    t@1 a l@1  ?()  running   in main()
    t@2      ?() asleep on 0xef751450  in_swtch()
    t@3 b l@2  ?()  running in sigwait()
    t@4     consumer()  asleep on 0x22bb0 in _lwp_sema_wait()
  *>t@5 b l@4 consumer()  breakpoint     in Queue_dequeue()
    t@6 b l@5 producer()     running       in _thread_start()
(dbx)

For native code, each line of information is composed of the following:

An 'o' instead of an asterisk indicates that a dbx internal event has occurred.

For Java code, each line of information is composed of the following:

 


TABLE 11-1 Thread and LWP States

Thread and LWP States   Description
suspended               The thread has been explicitly suspended.
runnable                The thread is runnable and is waiting for an LWP as a computational resource.
zombie                  When a detached thread exits (thr_exit()), it is in a zombie state until it has rejoined through the use of thr_join(). THR_DETACHED is a flag specified at thread creation time (thr_create()). A non-detached thread that exits is in a zombie state until it has been reaped.
asleep on syncobj       Thread is blocked on the given synchronization object. Depending on what level of support libthread and libthread_db provide, syncobj might be as simple as a hexadecimal address or something with more information content.
active                  The thread is active on an LWP, but dbx cannot access the LWP.
unknown                 dbx cannot determine the state.
lwpstate                A bound or active thread state has the state of the LWP associated with it.
running                 LWP was running but was stopped in synchrony with some other LWP.
syscall num             LWP stopped on an entry into the given system call #.
syscall return num      LWP stopped on an exit from the given system call #.
job control             LWP stopped due to job control.
LWP suspended           LWP is blocked in the kernel.
single stepped          LWP has just completed a single step.
breakpoint              LWP has just hit a breakpoint.
fault num               LWP has incurred the given fault #.
signal name             LWP has incurred the given signal.
process sync            The process to which this LWP belongs has just started executing.
LWP death               LWP is in the process of exiting.
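To make the "asleep on syncobj" state more concrete, the following is a minimal sketch of a program in which one thread blocks on a synchronization object. It assumes POSIX threads and POSIX semaphores; the names mailbox, consumer, and handoff are hypothetical and do not come from the program shown in the threads listing. How dbx would report the blocked thread depends on the support that libthread and libthread_db provide.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox shared by a producer and a consumer. */
struct mailbox {
    sem_t ready;     /* posted once value has been written */
    int   value;     /* written by the producer */
    int   received;  /* written by the consumer */
};

/* The consumer blocks in sem_wait() until the producer posts.
   While it waits, the thread is asleep on the semaphore. */
static void *consumer(void *arg)
{
    struct mailbox *box = arg;

    /* Retry only if the wait was interrupted by a signal. */
    while (sem_wait(&box->ready) == -1 && errno == EINTR)
        continue;
    box->received = box->value;
    return NULL;
}

/* Hands one value to a consumer thread and returns what it received,
   or -1 if the semaphore or the thread could not be created. */
int handoff(int value)
{
    struct mailbox box;
    pthread_t tid;
    int rc;

    box.value = 0;
    box.received = -1;
    if (sem_init(&box.ready, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, consumer, &box) != 0) {
        sem_destroy(&box.ready);
        return -1;
    }

    box.value = value;            /* publish the value ... */
    rc = sem_post(&box.ready);    /* ... then wake the consumer */
    assert(rc == 0);
    rc = pthread_join(tid, NULL); /* reap the consumer before returning */
    assert(rc == 0);
    (void)rc;
    sem_destroy(&box.ready);
    return box.received;
}
```

Stopping such a program before the call to sem_post() is one way to observe a thread that is blocked rather than running.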


Viewing the Context of Another Thread

To switch the viewing context to another thread, use the thread command. The syntax is:


thread [-blocks] [-blockedby] [-info] [-hide] [-unhide] [-suspend] [-resume] thread_id

To display the current thread, type:


thread

To switch to thread thread_id, type:


thread thread_id

For more information on the thread command, see thread Command.

Viewing the Threads List

To view the threads list, use the threads command. The syntax is:


threads [-all] [-mode [all|filter] [auto|manual]]

To print the list of all known threads, type:


threads

To print threads normally not printed (zombies), type:


threads -all

For an explanation of the threads list, see Thread Information.

For more information on the threads command, see threads Command.

Resuming Execution

Use the cont command to resume program execution. Currently, threads use synchronous breakpoints, so all threads resume execution.


Understanding Thread Creation Activity

You can get an idea of how often your application creates and destroys threads by using the thr_create event and thr_exit event as in the following example:


(dbx) trace thr_create
(dbx) trace thr_exit
(dbx) run
 
trace: thread created t@2 on l@2
trace: thread created t@3 on l@3
trace: thread created t@4 on l@4
trace: thr_exit t@4
trace: thr_exit t@3
trace: thr_exit t@2

Here the application created three threads. Note that the threads exited in reverse order of their creation, which might indicate that, if the application had more threads, they would accumulate and consume resources.

To get more interesting information, you could try the following in a different session:


(dbx) when thr_create { echo "XXX thread $newthread created by $thread"; }
XXX thread t@2 created by t@1
XXX thread t@3 created by t@1
XXX thread t@4 created by t@1

The output shows that all three threads were created by thread t@1, which is a common multithreading pattern.
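For reference, the pattern in which one thread creates all the others can be sketched as follows. This is an illustration only, written with POSIX threads rather than thr_create(); the names worker, spawn_workers, and NUM_WORKERS are hypothetical, and the thread count of three is an assumption chosen to match the trace output.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 3   /* assumption: matches the three threads in the trace */

/* Hypothetical worker: records that it ran, then exits. */
static void *worker(void *arg)
{
    int *ran = arg;

    *ran = 1;
    return NULL;
}

/* Creates NUM_WORKERS threads from the calling thread, joins them,
   and returns how many of them ran. */
int spawn_workers(void)
{
    pthread_t tids[NUM_WORKERS];
    int ran[NUM_WORKERS] = {0};
    int created = 0;
    int count = 0;
    int i;

    for (i = 0; i < NUM_WORKERS; i++) {
        if (pthread_create(&tids[i], NULL, worker, &ran[i]) != 0)
            break;
        created++;
    }

    /* Join every thread that was started, even after a partial failure,
       so that no worker is left pointing at this stack frame. */
    for (i = 0; i < created; i++) {
        int rc = pthread_join(tids[i], NULL);

        assert(rc == 0);
        (void)rc;
        count += ran[i];
    }
    return count;
}
```

If spawn_workers() is called from main(), every creation originates in the initial thread, which is the situation the when thr_create output describes.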

Suppose you want to debug thread t@3 from its outset. You could stop the application at the point that thread t@3 is created as follows:


(dbx) stop thr_create t@3
(dbx) run
t@1 (l@1) stopped in tdb_event_create at 0xff38409c
0xff38409c: tdb_event_create       :    retl     
Current function is main
216       stat = (int) thr_create(NULL, 0, consumer, q, tflags, &tid_cons2);
(dbx)

If your application occasionally spawns a new thread but from thread t@5 instead of thread t@1, you could capture that event as follows:


(dbx) stop thr_create -thread t@5
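A thread created by a thread other than the initial one can arise from code like the following sketch, in which a worker spawns its own helper. It assumes POSIX threads; the names helper, worker, and spawn_nested are hypothetical, and the thread IDs that dbx would assign to these threads are not something the sketch can guarantee.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: started by a worker thread, not by main. */
static void *helper(void *arg)
{
    int *done = arg;

    *done = 1;
    return NULL;
}

/* Hypothetical worker: creates a helper thread of its own and waits
   for it. A filter on the creating thread would match this creation. */
static void *worker(void *arg)
{
    pthread_t helper_tid;

    if (pthread_create(&helper_tid, NULL, helper, arg) == 0)
        pthread_join(helper_tid, NULL);
    return NULL;
}

/* Returns 1 if the helper ran, 0 if it did not, and -1 if the
   worker itself could not be created. */
int spawn_nested(void)
{
    pthread_t worker_tid;
    int helper_done = 0;
    int rc;

    if (pthread_create(&worker_tid, NULL, worker, &helper_done) != 0)
        return -1;
    rc = pthread_join(worker_tid, NULL);
    assert(rc == 0);
    (void)rc;
    return helper_done;
}
```

Here the creation of the helper is performed by the worker thread, so an event qualified with -thread and the worker's ID would be the one to use for catching it.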


Understanding LWP Information

Normally, you need not be aware of LWPs. There are times, however, when thread level queries cannot be completed. In these cases, use the lwps command to show information about LWPs.


(dbx) lwps
    l@1 running in main()
    l@2 running in sigwait()
    l@3 running in _lwp_sema_wait()
  *>l@4 breakpoint in Queue_dequeue()
    l@5 running in _thread_start()
(dbx)

Each line of the LWP list contains the following: