Multithreaded Programming Guide

Extending Traditional Signals

The traditional UNIX signal model is extended to threads in a fairly natural way. The key characteristics are that the signal disposition is process-wide, but the signal mask is per-thread. The process-wide disposition of signals is established using the traditional mechanisms (signal(2), sigaction(2), and so on).

When a signal handler is marked SIG_DFL or SIG_IGN, the action on receipt of the signal (exit, core dump, stop, continue, or ignore) is performed on the entire receiving process, affecting all threads in the process. For these signals that don't have handlers, the issue of which thread picks the signal is unimportant, because the action on receipt of the signal is carried out on the whole process. See signal(5) for basic information about signals.

Each thread has its own signal mask. This lets a thread block some signals while it uses memory or another state that is also used by a signal handler. All threads in a process share the set of signal handlers set up by sigaction(2) and its variants.

A thread in one process cannot send a signal to a specific thread in another process. A signal sent by kill(2) or sigsend(2) to a process is handled by any one of the receptive threads in the process.

Unbound threads cannot use alternate signal stacks. A bound thread can use an alternate stack because the state is associated with the execution resource. An alternate stack must be enabled for the signal through sigaction(2), and declared and enabled through sigaltstack(2).

An application can have per-thread signal handlers based on the per-process signal handlers. One way is for the process-wide signal handler to use the identifier of the thread handling the signal as an index into a table of per-thread handlers. Note that there is no thread zero.
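The table-based dispatch described above might look roughly like the following sketch. It is an illustration only: the table layout and the names register_handler(), dispatch(), count_hit(), and demo_per_thread_handlers() are hypothetical, and because POSIX thread identifiers are opaque, the table is searched with pthread_equal() rather than indexed directly. It also assumes that each thread registers itself before any signal of interest can reach it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

#define MAX_THREADS 8

typedef void (*thread_handler_t)(int);

/* Hypothetical table mapping a thread ID to that thread's own handler. */
struct slot { pthread_t tid; thread_handler_t fn; int used; };
static struct slot table[MAX_THREADS];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t worker_hits = 0;

/* Called by a thread to register its handler; returns 0 on success. */
static int register_handler(thread_handler_t fn) {
    int i, rc = -1;

    pthread_mutex_lock(&table_lock);
    for (i = 0; i < MAX_THREADS; i++) {
        if (!table[i].used) {
            table[i].tid = pthread_self();
            table[i].fn = fn;
            table[i].used = 1;      /* set last, so the handler never sees a half-filled slot */
            rc = 0;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    return rc;
}

/* The one process-wide handler: find the current thread, then forward. */
static void dispatch(int sig) {
    int i;
    pthread_t self = pthread_self();

    for (i = 0; i < MAX_THREADS; i++) {
        if (table[i].used && pthread_equal(table[i].tid, self)) {
            table[i].fn(sig);
            return;
        }
    }
}

static void count_hit(int sig) { (void)sig; worker_hits++; }

static void *worker(void *arg) {
    (void)arg;
    if (register_handler(count_hit) == 0)
        raise(SIGUSR1);             /* delivered to the calling thread */
    return NULL;
}

/* Returns how many times the worker's own handler ran. */
int demo_per_thread_handlers(void) {
    struct sigaction sa;
    pthread_t tid;
    int rc;

    sa.sa_handler = dispatch;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return (int)worker_hits;
}
```

The dispatcher deliberately takes no lock, because a handler that tried to acquire table_lock could deadlock against the thread it interrupted.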

Signals are divided into two categories: traps and exceptions (synchronously generated signals) and interrupts (asynchronously generated signals).

As in traditional UNIX, if a signal is pending, additional occurrences of that signal have no additional effect--a pending signal is represented by a bit, not by a counter. In other words, signal delivery is idempotent.
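The following sketch illustrates the "bit, not counter" behavior, assuming a system that does not queue standard signals such as SIGUSR1. The names on_usr1() and demo_pending_is_a_bit() are made up for the illustration. The signal is raised three times while blocked, yet the handler runs only once when the signal is unblocked.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

static volatile sig_atomic_t deliveries = 0;

static void on_usr1(int sig) { (void)sig; deliveries++; }

/* Returns the number of handler invocations after three blocked raises. */
int demo_pending_is_a_bit(void) {
    struct sigaction sa;
    sigset_t set;
    int rc;

    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    rc = pthread_sigmask(SIG_BLOCK, &set, NULL);
    assert(rc == 0);

    /* All three occurrences collapse into one pending signal. */
    raise(SIGUSR1);
    raise(SIGUSR1);
    raise(SIGUSR1);

    /* Unblocking delivers the single pending occurrence. */
    rc = pthread_sigmask(SIG_UNBLOCK, &set, NULL);
    assert(rc == 0);
    return (int)deliveries;
}
```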

As is the case with single-threaded processes, when a thread receives a signal while blocked in a system call, the thread might return early, either with the EINTR error code, or, in the case of I/O calls, with fewer bytes transferred than requested.
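One common way to cope with both outcomes is to retry on EINTR and to keep reading until the requested count is reached. The helper below is a sketch under that assumption; read_fully() and demo_read_fully() are hypothetical names, not library functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Read up to len bytes, retrying after EINTR and after short reads.
 * Returns the number of bytes read (less than len only at end of file),
 * or -1 on an error other than EINTR. */
ssize_t read_fully(int fd, void *buf, size_t len) {
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: try again */
            return -1;
        }
        if (n == 0)
            break;                  /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Returns 0 when five bytes written in two pieces are read back intact. */
int demo_read_fully(void) {
    int fds[2];
    char buf[5];
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], "he", 2) != 2 || write(fds[1], "llo", 3) != 3)
        return -1;
    n = read_fully(fds[0], buf, sizeof buf);
    close(fds[0]);
    close(fds[1]);
    return (n == 5 && memcmp(buf, "hello", 5) == 0) ? 0 : -1;
}
```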

Of particular importance to multithreaded programs is the effect of signals on condition waits. A wait usually returns in response to a signal or a broadcast on the condition variable, but it can also return when the waiting thread receives a traditional UNIX signal. In that case the Solaris threads call cond_wait(3T) returns the error code EINTR, while the POSIX call pthread_cond_wait(3T) returns zero as a spurious wakeup. See "Interrupted Waits on Condition Variables (Solaris Threads Only)" for more information.

Synchronous Signals

Traps (such as SIGILL, SIGFPE, SIGSEGV) result from something a thread does to itself, such as dividing by zero or explicitly sending itself a signal. A trap is handled only by the thread that caused it. Several threads in a process can generate and handle the same type of trap simultaneously.

Extending the idea of signals to individual threads is easy for synchronous signals--the signal is dealt with by the thread that caused the problem.

If a handler for the signal has been established with sigaction(2), the handler is invoked on the thread that caused the synchronous signal. If no handler has been established, the default action for the signal is applied to the whole process.

Because such a synchronous signal usually means that something is seriously wrong with the whole process, and not just with a thread, terminating the process is often a good choice.

Asynchronous Signals

Interrupts (such as SIGINT and SIGIO) are asynchronous with any thread and result from some action outside the process. They might be signals sent explicitly by other threads, or they might represent external actions such as a user typing Control-c. Dealing with asynchronous signals is more complicated than dealing with synchronous signals.

An interrupt can be handled by any thread whose signal mask allows it. When more than one thread is able to receive the interrupt, only one is chosen.

When multiple occurrences of the same signal are sent to a process, then each occurrence can be handled by a separate thread, as long as threads are available that do not have it masked. When all threads have the signal masked, then the signal is marked pending and the first thread to unmask the signal handles it.

Continuation Semantics

Continuation semantics are the traditional way to deal with signals. The idea is that when a signal handler returns, control resumes where it was at the time of the interruption. This is well suited for asynchronous signals in single-threaded processes, as shown in Example 5-1.

This is also used as the exception-handling mechanism in some programming languages, such as PL/1.


Example 5-1 Continuation Semantics

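#include <stdio.h>
#include <signal.h>

unsigned int nestcount;

/* Ackermann's function: a deliberately long-running computation */
unsigned int A(int i, int j) {
    nestcount++;

    if (i == 0)
        return (j + 1);
    else if (j == 0)
        return (A(i - 1, 1));
    else
        return (A(i - 1, A(i, j - 1)));
}

/* SIGINT handler: report progress, then return so that A() resumes */
void sig(int i) {
    printf("nestcount = %u\n", nestcount);
}

int main(void) {
    sigset(SIGINT, sig);
    A(4, 4);
    return (0);
}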

Operations on Signals

pthread_sigmask(3T)

pthread_sigmask(3T) does for a thread what sigprocmask(2) does for a process--it sets the thread's signal mask. When a new thread is created, its initial mask is inherited from its creator.

The call to sigprocmask() in a multithreaded process is equivalent to a call to pthread_sigmask(). See the sigprocmask(2) page for more information.
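Mask inheritance can be observed with a short sketch like the one below. The names report_mask() and demo_mask_inherited() are hypothetical. The creating thread blocks SIGUSR2, and the new thread queries its own mask without changing it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Thread body: store 1 in *arg if SIGUSR2 is blocked in this thread. */
static void *report_mask(void *arg) {
    sigset_t cur;
    int *blocked = arg;

    /* A null "set" argument leaves the mask alone and only fetches it. */
    pthread_sigmask(SIG_BLOCK, NULL, &cur);
    *blocked = sigismember(&cur, SIGUSR2);
    return NULL;
}

/* Returns 1 if the new thread started with SIGUSR2 blocked. */
int demo_mask_inherited(void) {
    sigset_t set, old;
    pthread_t tid;
    int blocked = 0;
    int rc;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR2);
    rc = pthread_sigmask(SIG_BLOCK, &set, &old);
    assert(rc == 0);

    rc = pthread_create(&tid, NULL, report_mask, &blocked);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    /* Restore the creator's original mask. */
    rc = pthread_sigmask(SIG_SETMASK, &old, NULL);
    assert(rc == 0);
    return blocked;
}
```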

pthread_kill(3T)

pthread_kill(3T) is the thread analog of kill(2)--it sends a signal to a specific thread. This, of course, is different from sending a signal to a process. When a signal is sent to a process, the signal can be handled by any thread in the process. A signal sent by pthread_kill() can be handled only by the specified thread.

Note that you can use pthread_kill() to send signals only to threads in the current process. This is because the thread identifier (type pthread_t) is local in scope--it is not possible to name a thread in any process but your own.

Note also that the action taken (handler, SIG_DFL, SIG_IGN) on receipt of a signal by the target thread is global, as usual. This means, for example, that if you send SIGXXX to a thread, and the SIGXXX signal disposition for the process is to kill the process, then the whole process is killed when the target thread receives the signal.
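As a rough illustration of thread-directed delivery, the sketch below sends SIGUSR1 to one worker thread and lets the handler record which thread ran it. The names on_usr1(), worker(), and demo_pthread_kill() are hypothetical, and the busy-wait with sched_yield() is used only to keep the example short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <signal.h>
#include <stdint.h>

static volatile sig_atomic_t handled = 0;
static pthread_t handled_by;        /* written by the handler before "handled" is set */

static void on_usr1(int sig) {
    (void)sig;
    handled_by = pthread_self();
    handled = 1;
}

/* Worker: wait for the signal, then report whether this thread handled it. */
static void *worker(void *arg) {
    (void)arg;
    while (!handled)
        sched_yield();
    return (void *)(intptr_t)(pthread_equal(handled_by, pthread_self()) != 0);
}

/* Returns 1 if the signal sent with pthread_kill() ran on the target thread. */
int demo_pthread_kill(void) {
    struct sigaction sa;
    pthread_t tid;
    void *result = NULL;
    int rc;

    /* The disposition is process-wide, even though delivery is per-thread. */
    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_kill(tid, SIGUSR1);    /* directed at the worker only */
    assert(rc == 0);
    rc = pthread_join(tid, &result);
    assert(rc == 0);
    return (int)(intptr_t)result;
}
```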

sigwait(2)

For multithreaded programs, sigwait(2) is the preferred interface to use, because it deals so well with asynchronously generated signals.

sigwait() causes the calling thread to wait until any signal identified by its set argument is delivered to the thread. While the thread is waiting, signals identified by the set argument are unmasked, but the original mask is restored when the call returns.

All signals identified by the set argument must be blocked on all threads, including the calling thread; otherwise, sigwait() may not work correctly.

Use sigwait() to separate threads from asynchronous signals. You can create one thread that listens for asynchronous signals while your other threads are created with any asynchronous signals that might be sent to this process blocked.

New sigwait() Implementations

Two versions of sigwait() are available in the Solaris 2.5 release: the new Solaris 2.5 version, and the POSIX.1c version. New applications and libraries should use the POSIX standard interface, as the Solaris version might not be available in future releases.


Note -

The new Solaris 2.5 sigwait() does not override the signal's ignore disposition. Applications that rely on the older sigwait(2) behavior can break unless you install a dummy signal handler. Installing the handler changes the disposition from SIG_IGN to caught, so that calls to sigwait() for this signal catch it.


The syntax for the two versions of sigwait() is shown below.

#include <signal.h>

/* the Solaris 2.5 version*/
int sigwait(sigset_t *set);

/* the POSIX.1c version */
int sigwait(const sigset_t *set, int *sig);

When the signal is delivered, the POSIX.1c sigwait() clears the pending signal and places the signal number in sig. Many threads can call sigwait() at the same time, but only one thread returns for each signal that is received.

With sigwait() you can treat asynchronous signals synchronously--a thread that deals with such signals simply calls sigwait() and returns as soon as a signal arrives. By ensuring that all threads (including the caller of sigwait()) have such signals masked, you can be sure that signals are handled only by the intended handler and that they are handled safely.

By always masking all signals in all threads, and just calling sigwait() as necessary, your application will be much safer for threads that depend on signals.

Usually, you use sigwait() to create one or more threads that wait for signals. Because sigwait() can retrieve even masked signals, be sure to block the signals of interest in all other threads so they are not accidentally delivered.

When the signals arrive, a thread returns from sigwait(), handles the signal, and waits for more signals. The signal-handling thread is not restricted to using Async-Signal-Safe functions and can synchronize with other threads in the usual way. (The Async-Signal-Safe category is defined in "MT Interface Safety Levels".)


Note -

sigwait() should never be used with synchronous signals.


sigtimedwait(2)

sigtimedwait(2) is similar to sigwait(2) except that it fails and returns an error when a signal is not received in the indicated amount of time.
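A minimal sketch of a timed wait follows, assuming the POSIX form of sigtimedwait(), which returns the signal number on success and fails with EAGAIN on timeout. The wrapper wait_for_usr1() and the driver demo_sigtimedwait() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>

/* Wait up to timeout_ms for SIGUSR1, which the caller must have blocked.
 * Returns the signal number, 0 on timeout, or -1 on another error. */
int wait_for_usr1(long timeout_ms) {
    sigset_t set;
    struct timespec ts;
    int sig;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    /* If an unrelated caught signal interrupts the wait, start again.
     * For simplicity the full timeout is reused on each retry. */
    do {
        sig = sigtimedwait(&set, NULL, &ts);
    } while (sig == -1 && errno == EINTR);

    if (sig == -1)
        return (errno == EAGAIN) ? 0 : -1;
    return sig;
}

/* Returns 1 if the first wait times out and the second sees SIGUSR1. */
int demo_sigtimedwait(void) {
    sigset_t set, old;
    int first, second, rc;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    rc = pthread_sigmask(SIG_BLOCK, &set, &old);
    assert(rc == 0);

    first = wait_for_usr1(50);      /* nothing pending: times out */
    raise(SIGUSR1);                 /* blocked, so it becomes pending */
    second = wait_for_usr1(50);     /* consumes the pending signal */

    rc = pthread_sigmask(SIG_SETMASK, &old, NULL);
    assert(rc == 0);
    return first == 0 && second == SIGUSR1;
}
```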

Thread-Directed Signals

The UNIX signal mechanism is extended with the idea of thread-directed signals. These are just like ordinary asynchronous signals, except that they are sent to a particular thread instead of to a process.

Waiting for asynchronous signals in a separate thread can be safer and easier than installing a signal handler and processing the signals there.

A better way to deal with asynchronous signals is to treat them synchronously. By calling sigwait(2), discussed in "sigwait(2)", a thread can wait until a signal occurs.


Example 5-2 Asynchronous Signals and sigwait(2)

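#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <pthread.h>

extern unsigned int nestcount;          /* defined in Example 5-1 */
extern unsigned int A(int i, int j);    /* defined in Example 5-1 */

void *runA(void *arg);

int main(void) {
    sigset_t set;
    pthread_t tid;
    int sig;

    /* block SIGINT here; the new thread inherits this mask */
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    pthread_sigmask(SIG_BLOCK, &set, NULL);
    pthread_create(&tid, NULL, runA, NULL);
    pthread_detach(tid);

    while (1) {
        sigwait(&set, &sig);
        printf("nestcount = %u\n", nestcount);
        printf("received signal %d\n", sig);
    }
}

/* compute thread: run the long calculation, then end the process */
void *runA(void *arg) {
    A(4, 4);
    exit(0);
}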

This example modifies the code of Example 5-1: the main routine masks the SIGINT signal, creates a child thread that calls the function A of the previous example, and then issues sigwait() to handle the SIGINT signal.

Note that the signal is masked in the compute thread because the compute thread inherits its signal mask from the main thread. The main thread is protected from SIGINT while, and only while, it is not blocked inside of sigwait().

Also, note that there is never any danger of having system calls interrupted when you use sigwait().

Completion Semantics

Another way to deal with signals is with completion semantics.

Use completion semantics when a signal indicates that something so catastrophic has happened that there is no reason to continue executing the current code block. The signal handler runs instead of the remainder of the block that had the problem. In other words, the signal handler completes the block.

In Example 5-3, the block in question is the body of the then part of the if statement. The call to sigsetjmp(3C) saves the current register state of the program in jbuf and returns 0, thereby executing the block.


Example 5-3 Completion Semantics

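#include <stdio.h>
#include <signal.h>
#include <setjmp.h>

sigjmp_buf jbuf;

void problem(int sig);

void mult_divide(void) {
    int a, b, c, d;

    sigset(SIGFPE, problem);
    while (1) {
        /* a nonzero second argument also saves the signal mask */
        if (sigsetjmp(jbuf, 1) == 0) {
            printf("Three numbers, please:\n");
            scanf("%d %d %d", &a, &b, &c);
            d = a*b/c;
            printf("%d*%d/%d = %d\n", a, b, c, d);
        }
    }
}

/* SIGFPE handler: abandon the block and jump back to sigsetjmp() */
void problem(int sig) {
    printf("Couldn't deal with them, try again\n");
    siglongjmp(jbuf, 1);
}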

If a SIGFPE (a floating-point exception) occurs, the signal handler is invoked.

The signal handler calls siglongjmp(3C), which restores the register state saved in jbuf, causing the program to return from sigsetjmp() again (among the registers saved are the program counter and the stack pointer).

This time, however, sigsetjmp(3C) returns the second argument of siglongjmp(), which is 1. Notice that the block is skipped over, only to be executed during the next iteration of the while loop.

Note that you can use sigsetjmp(3C) and siglongjmp(3C) in multithreaded programs, but be careful that a thread never does a siglongjmp() using the results of another thread's sigsetjmp().

Also, sigsetjmp() and siglongjmp() save and restore the signal mask, but setjmp(3C) and longjmp(3C) do not.

It is best to use sigsetjmp() and siglongjmp() when you work with signal handlers.

Completion semantics are often used to deal with exceptions. In particular, the Ada® programming language uses this model.


Note -

Remember, sigwait(2) should never be used with synchronous signals.


Signal Handlers and Async-Signal Safety

A concept similar to thread safety is Async-Signal safety. Async-Signal-Safe operations are guaranteed not to interfere with operations that are being interrupted.

The problem of Async-Signal safety arises when the actions of a signal handler can interfere with the operation that is being interrupted.

For example, suppose a program is in the middle of a call to printf(3S) and a signal occurs whose handler itself calls printf(). In this case, the output of the two printf() statements would be intertwined. To avoid this, the handler should not call printf() itself when printf() might be interrupted by a signal.

This problem cannot be solved by using synchronization primitives because any attempted synchronization between the signal handler and the operation being synchronized would produce immediate deadlock.

Suppose that printf() is to protect itself by using a mutex. Now suppose that a thread that is in a call to printf(), and so holds the lock on the mutex, is interrupted by a signal.

If the handler (being called by the thread that is still inside of printf()) itself calls printf(), the thread that holds the lock on the mutex will attempt to take it again, resulting in an instant deadlock.

To avoid interference between the handler and the operation, either ensure that the situation never arises (perhaps by masking off signals at critical moments) or invoke only Async-Signal-Safe operations from inside signal handlers.

Because setting a thread's mask is an inexpensive user-level operation, you can inexpensively make functions or sections of code fit in the Async-Signal-Safe category.
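The masking approach might be sketched as follows. The shared pair field_a and field_b, the handler check_pair(), and the functions update_pair() and demo_masked_update() are all hypothetical. The raise() call inside the update only simulates a signal arriving at the worst moment; because the signal is blocked, the handler runs after the update is complete and never sees the pair half-written.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

/* Invariant the handler relies on: field_a == field_b. */
static volatile sig_atomic_t field_a = 0, field_b = 0;
static volatile sig_atomic_t handler_runs = 0;
static volatile sig_atomic_t saw_mismatch = 0;

static void check_pair(int sig) {
    (void)sig;
    handler_runs++;
    if (field_a != field_b)
        saw_mismatch = 1;
}

/* Update both fields with SIGUSR1 blocked for the duration. */
void update_pair(int v) {
    sigset_t block, old;
    int rc;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    rc = pthread_sigmask(SIG_BLOCK, &block, &old);
    assert(rc == 0);

    field_a = v;
    raise(SIGUSR1);     /* simulated mid-update signal: stays pending */
    field_b = v;

    /* Restoring the old mask lets the pending signal be delivered. */
    rc = pthread_sigmask(SIG_SETMASK, &old, NULL);
    assert(rc == 0);
}

/* Returns 1 if the handler ran exactly once and saw a consistent pair. */
int demo_masked_update(void) {
    struct sigaction sa;
    int rc;

    sa.sa_handler = check_pair;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    update_pair(7);
    return handler_runs == 1 && saw_mismatch == 0;
}
```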

The only routines that POSIX guarantees to be Async-Signal-Safe are listed in Table 5-2. Any signal handler can safely call in to one of these functions.

Table 5-2 Async-Signal-Safe Functions

_exit()         fstat()         read()          sysconf()
access()        getegid()       rename()        tcdrain()
alarm()         geteuid()       rmdir()         tcflow()
cfgetispeed()   getgid()        setgid()        tcflush()
cfgetospeed()   getgroups()     setpgid()       tcgetattr()
cfsetispeed()   getpgrp()       setsid()        tcgetpgrp()
cfsetospeed()   getpid()        setuid()        tcsendbreak()
chdir()         getppid()       sigaction()     tcsetattr()
chmod()         getuid()        sigaddset()     tcsetpgrp()
chown()         kill()          sigdelset()     time()
close()         link()          sigemptyset()   times()
creat()         lseek()         sigfillset()    umask()
dup2()          mkdir()         sigismember()   uname()
dup()           mkfifo()        sigpending()    unlink()
execle()        open()          sigprocmask()   utime()
execve()        pathconf()      sigsuspend()    wait()
fcntl()         pause()         sleep()         waitpid()
fork()          pipe()          stat()          write()
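Because write() appears in the table while printf() does not, a handler that must produce output can restrict itself to write(). The sketch below shows one such handler. The names notify_fd, on_usr1(), and demo_safe_handler() are hypothetical, and a pipe stands in for the output device so that the result can be read back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Descriptor the handler writes to; set before the handler is installed. */
static volatile sig_atomic_t notify_fd = -1;

/* Handler that calls only write(), an Async-Signal-Safe function. */
static void on_usr1(int sig) {
    static const char msg[] = "got SIGUSR1\n";
    int saved_errno = errno;        /* do not disturb the interrupted code */
    ssize_t n;

    (void)sig;
    if (notify_fd >= 0) {
        n = write(notify_fd, msg, sizeof msg - 1);
        (void)n;
    }
    errno = saved_errno;
}

/* Returns 0 if the handler's message arrives intact through the pipe. */
int demo_safe_handler(void) {
    struct sigaction sa;
    int fds[2];
    char buf[32];
    ssize_t n;
    int rc;

    if (pipe(fds) != 0)
        return -1;
    notify_fd = fds[1];

    sa.sa_handler = on_usr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    raise(SIGUSR1);                 /* handler runs before raise() returns */

    n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    close(fds[1]);
    return (n == 12 && memcmp(buf, "got SIGUSR1\n", 12) == 0) ? 0 : -1;
}
```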

Interrupted Waits on Condition Variables (Solaris Threads Only)

When a signal is delivered to a thread while the thread is waiting on a condition variable, the old convention (assuming that the process is not terminated) is that interrupted calls return EINTR.

The ideal new condition would be that when cond_wait(3T) and cond_timedwait(3T) return, the lock has been retaken on the mutex.

This is what is done in Solaris threads: when a thread is blocked in cond_wait() or cond_timedwait() and an unmasked, caught signal is delivered to the thread, the handler is invoked and the call to cond_wait() or cond_timedwait() returns EINTR with the mutex locked.

This implies that the mutex is locked in the signal handler because the handler might have to clean up after the thread. While this is true in the Solaris 2.5 release, it might change in the future, so do not rely upon this behavior.


Note -

In POSIX threads, pthread_cond_wait(3T) also returns when a signal is handled, but this is not reported as an error. Instead, pthread_cond_wait() returns zero, and the return is treated as a spurious wakeup.
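Because a zero return from pthread_cond_wait() does not prove that the condition holds, POSIX code usually rechecks a predicate in a loop. The following is a minimal sketch of that pattern. The flag ready and the functions wait_until_ready(), producer(), and demo_predicate_loop() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static int ready = 0;               /* the predicate, protected by lock */

/* Block until ready is set, tolerating spurious wakeups. */
void wait_until_ready(void) {
    pthread_mutex_lock(&lock);
    while (!ready)                  /* recheck after every return */
        pthread_cond_wait(&ready_cv, &lock);
    pthread_mutex_unlock(&lock);
}

static void *producer(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns 1 once the waiter has observed the predicate. */
int demo_predicate_loop(void) {
    pthread_t tid;
    int rc;

    rc = pthread_create(&tid, NULL, producer, NULL);
    assert(rc == 0);
    wait_until_ready();
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return ready;
}
```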


Handler cleanup is illustrated by Example 5-4.


Example 5-4 Condition Variables and Interrupted Waits

int sig_catcher() {
    sigset_t set;
    void hdlr();

    mutex_lock(&mut);

    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    thr_sigsetmask(SIG_UNBLOCK, &set, NULL);

    if (cond_wait(&cond, &mut) == EINTR) {
        /* signal occurred and lock is held */
        cleanup();
        mutex_unlock(&mut);
        return(0);
    }
    normal_processing();
    mutex_unlock(&mut);
    return(1);
}

void hdlr() {
    /* lock is held in the handler */
    ...
}

Assume that the SIGINT signal is blocked in all threads on entry to sig_catcher() and that hdlr() has been established (with a call to sigaction(2)) as the handler for the SIGINT signal. When an unmasked and caught instance of the SIGINT signal is delivered to the thread while it is in cond_wait(), the thread first reacquires the lock on the mutex, then calls hdlr(), and then returns EINTR from cond_wait().

Note that whether SA_RESTART has been specified as a flag to sigaction() has no effect here; cond_wait(3T) is not a system call and is not automatically restarted. When a caught signal occurs while a thread is blocked in cond_wait(), the call always returns EINTR. Again, the application should not rely on an interrupted cond_wait() reacquiring the mutex, because this behavior could change in the future.