Multithreaded Programming Guide

I/O Issues

One of the attractions of multithreaded programming is I/O performance. The traditional UNIX API gave you little assistance in this area: you either used the facilities of the file system or bypassed the file system entirely.

This section shows how to use threads to get more flexibility through I/O concurrency and multibuffering. This section also discusses the differences and similarities between the approaches of synchronous I/O (with threads) and asynchronous I/O (with and without threads).

I/O as a Remote Procedure Call

In the traditional UNIX model, I/O appears to be synchronous, as if you were placing a remote procedure call to the I/O device. Once the call returns, the I/O has completed, or at least appears to have completed. A write request, for example, might merely transfer the data to a buffer in the operating environment.

The advantage of this model is that it is easy to understand because, as a programmer, you are very familiar with the concept of procedure calls.

An alternative approach, not found in traditional UNIX systems, is the asynchronous model, in which an I/O request merely starts an operation. The program must somehow discover when the operation completes.

This approach is not as simple as the synchronous model, but it has the advantage of allowing concurrent I/O and processing in traditional, single-threaded UNIX processes.

Tamed Asynchrony

You can get most of the benefits of asynchronous I/O by using synchronous I/O in a multithreaded program. Where, with asynchronous I/O, you would issue a request and check later to determine when it completes, you can instead have a separate thread perform the I/O synchronously. The main thread can then check (perhaps by calling pthread_join(3T)) for the completion of the operation at some later time.
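The pattern described above might be sketched as follows. This is a minimal illustration, not the guide's own code: the struct io_request type and the functions read_worker(), start_read(), and finish_read() are hypothetical names introduced here. Only read(2), pthread_create(), and pthread_join() are standard calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request record: describes one read and holds its result. */
struct io_request {
    int     fd;      /* open file descriptor to read from */
    void   *buf;     /* destination buffer */
    size_t  nbytes;  /* maximum number of bytes to read */
    ssize_t result;  /* set by the worker to the read(2) return value */
};

/* Worker thread: performs an ordinary, blocking read(2). */
void *read_worker(void *arg)
{
    struct io_request *req = arg;

    assert(req != NULL);
    req->result = read(req->fd, req->buf, req->nbytes);
    return req;
}

/* Start the read in a separate thread, so the caller can keep working.
   Returns 0 on success or an error number from pthread_create(). */
int start_read(pthread_t *tid, struct io_request *req)
{
    return pthread_create(tid, NULL, read_worker, req);
}

/* Wait for the read to complete. Returns the byte count reported by
   read(2), or -1 if the join itself failed. */
ssize_t finish_read(pthread_t tid)
{
    void *ret = NULL;

    if (pthread_join(tid, &ret) != 0)
        return -1;
    return ((struct io_request *)ret)->result;
}
```

The caller fills in an io_request, calls start_read(), continues with other processing, and calls finish_read() only when it needs the data. The request record and its buffer must remain valid until finish_read() returns.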

Asynchronous I/O

In most situations there is no need for asynchronous I/O, since its effects can be achieved with the use of threads, with each thread doing synchronous I/O. However, in a few situations, threads cannot achieve what asynchronous I/O can.

The most straightforward example is writing to a tape drive so that the drive streams. When a tape drive streams, the tape keeps moving forward at high speed without stopping, because the drive is supplied with a constant stream of data to write.

To do this, the tape driver in the kernel must issue a queued write request as soon as it responds to the interrupt indicating that the previous tape-write operation has completed.

Threads cannot guarantee that asynchronous writes will be ordered, because the order in which threads execute is indeterminate. You cannot, for example, specify the order in which writes issued by separate threads reach the tape.

Asynchronous I/O Operations

#include <sys/asynch.h>

int aioread(int fildes, char *bufp, int bufs, off_t offset,
    int whence, aio_result_t *resultp);

int aiowrite(int fildes, const char *bufp, int bufs,
    off_t offset, int whence, aio_result_t *resultp);

aio_result_t *aiowait(const struct timeval *timeout);
int aiocancel(aio_result_t *resultp);

aioread(3) and aiowrite(3) are similar in form to pread(2) and pwrite(2), except for the additional whence and resultp arguments. Calls to aioread() and aiowrite() result in the initiation (or queueing) of an I/O operation.

The call returns without blocking, and the status of the call is returned in the structure pointed to by resultp. This is an item of type aio_result_t that contains the following:

int aio_return;
int aio_errno;

When a call fails immediately, the failure code can be found in aio_errno. Otherwise, this field contains AIO_INPROGRESS, meaning that the operation has been successfully queued.

You can wait for an outstanding asynchronous I/O operation to complete by calling aiowait(3). This returns a pointer to the aio_result_t structure supplied with the original aioread(3) or aiowrite(3) call.

At this point, the aio_result_t structure contains whatever read(2) or write(2) would have returned if one of them had been called instead of the asynchronous version. If the read() or write() is successful, aio_return contains the number of bytes that were read or written. If it was not successful, aio_return is -1 and aio_errno contains the error code.

aiowait() takes a timeout argument, which indicates how long the caller is willing to wait. As usual, a NULL pointer here means that the caller is willing to wait indefinitely, and a pointer to a timeval structure containing a zero value means that the caller is unwilling to wait at all.

You might start an asynchronous I/O operation, do some work, then call aiowait() to wait for the request to complete. Or you can use SIGIO to be notified, asynchronously, when the operation completes.

Finally, a pending asynchronous I/O operation can be cancelled by calling aiocancel(). This routine is called with the address of the result area as an argument. This result area identifies which operation is being cancelled.

Shared I/O and New I/O System Calls

When multiple threads are performing I/O operations at the same time with the same file descriptor, you might discover that the traditional UNIX I/O interface is not thread-safe. The problem occurs with nonsequential I/O. This uses the lseek(2) system call to set the file offset, which is then used in the next read(2) or write(2) call to indicate where in the file the operation should start. When two or more threads issue lseek() calls on the same file descriptor, a conflict results: one thread can change the shared file offset after another thread has set it but before that thread's read() or write() takes place.

To avoid this conflict, use the pread(2) and pwrite(2) system calls.

#include <sys/types.h>
#include <unistd.h>

ssize_t pread(int fildes, void *buf, size_t nbyte, off_t offset);

ssize_t pwrite(int fildes, const void *buf, size_t nbyte,
    off_t offset);

These behave just like read(2) and write(2) except that they take an additional argument, the file offset. With this argument, you specify the offset without using lseek(2), so multiple threads can use these routines safely for I/O on the same file descriptor.
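As a rough sketch of this usage, the following code has several threads write to different regions of one file through a single shared descriptor, then reads the regions back with pread(). The names struct block_job, write_block(), and shared_fd_demo(), and the constants NTHREADS and BLOCK_SIZE, are hypothetical and chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define NTHREADS   4   /* hypothetical values chosen for this sketch */
#define BLOCK_SIZE 8

/* Hypothetical per-thread job: one block written at its own offset. */
struct block_job {
    int     fd;                /* descriptor shared by all threads */
    off_t   offset;            /* where this thread's block belongs */
    char    data[BLOCK_SIZE];  /* bytes to write */
    ssize_t result;            /* pwrite(2) return value */
};

void *write_block(void *arg)
{
    struct block_job *job = arg;

    assert(job != NULL);
    /* The offset travels with the call, so no lseek(2) is needed and
       the descriptor's shared file offset is never touched. */
    job->result = pwrite(job->fd, job->data, BLOCK_SIZE, job->offset);
    return NULL;
}

/* Returns 0 if every block was written and read back intact, else -1. */
int shared_fd_demo(const char *path)
{
    struct block_job jobs[NTHREADS];
    pthread_t tids[NTHREADS];
    char back[BLOCK_SIZE];
    int fd, i, started = 0, status = 0;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    for (i = 0; i < NTHREADS; i++) {
        jobs[i].fd = fd;
        jobs[i].offset = (off_t)i * BLOCK_SIZE;
        memset(jobs[i].data, 'A' + i, BLOCK_SIZE);
        jobs[i].result = -1;
        if (pthread_create(&tids[i], NULL, write_block, &jobs[i]) != 0) {
            status = -1;
            break;
        }
        started++;
    }

    /* Join only the threads that were actually created. */
    for (i = 0; i < started; i++) {
        if (pthread_join(tids[i], NULL) != 0 ||
            jobs[i].result != BLOCK_SIZE)
            status = -1;
    }

    /* Verify each block with pread(2), again without lseek(2). */
    for (i = 0; status == 0 && i < NTHREADS; i++) {
        if (pread(fd, back, BLOCK_SIZE, jobs[i].offset) != BLOCK_SIZE ||
            memcmp(back, jobs[i].data, BLOCK_SIZE) != 0)
            status = -1;
    }

    close(fd);
    unlink(path);
    return status;
}
```

Because each thread passes its own offset, the order in which the threads run does not affect where the data lands in the file.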

Alternatives to getc(3S) and putc(3S)

An additional problem occurs with standard I/O. Programmers are accustomed to routines such as getc(3S) and putc(3S) being very quick because they are implemented as macros. As a result, they can be used within the inner loop of a program with no concerns about efficiency.

However, when they are made thread-safe they suddenly become more expensive: they now require (at least) two internal subroutine calls, to lock and unlock a mutex.

To get around this problem, alternative versions of these routines are supplied, getc_unlocked(3S) and putc_unlocked(3S).

These do not acquire locks on a mutex and so are as quick as the original, non-thread-safe versions of getc(3S) and putc(3S).

However, to use them in a thread-safe way, you must explicitly lock and release the mutexes that protect the standard I/O streams, using flockfile(3S) and funlockfile(3S). The calls to these latter routines are placed outside the loop, and the calls to getc_unlocked() or putc_unlocked() are placed inside the loop.
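The arrangement described above might look like the following sketch, which counts newline characters on a stream. The function name count_lines() is hypothetical. flockfile(), funlockfile(), and getc_unlocked() are the standard routines named in the text.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Count newline characters on a stream, taking the stream lock once
   for the whole loop instead of once per character. */
long count_lines(FILE *fp)
{
    long lines = 0;
    int c;

    assert(fp != NULL);

    flockfile(fp);                      /* lock outside the loop */
    while ((c = getc_unlocked(fp)) != EOF) {
        if (c == '\n')
            lines++;
    }
    funlockfile(fp);                    /* release after the loop */

    return lines;
}
```

While one thread holds the lock, other threads that call locking stdio routines on the same stream block until funlockfile() is called, so keep the work inside the locked region short.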