I/O Issues

Language:

One of the attractions of multithreaded programming is I/O performance. The traditional UNIX API gave you little assistance in this area. You either used the facilities of the file system or bypassed the file system entirely.

This section shows how to use threads to get more flexibility through I/O concurrency and multibuffering. This section also discusses the differences and similarities between the approaches of synchronous I/O with threads, and asynchronous I/O with and without threads.

I/O as a Remote Procedure Call

In the traditional UNIX model, I/O appears to be synchronous, as if you were placing a remote procedure call to the I/O device. Once the call returns, then the I/O has completed, or at least appears to have completed. A write request, for example, might merely result in the transfer of the data to a buffer in the operating environment.

The advantage of this model is familiar concept of procedure calls.

An alternative approach not found in traditional UNIX systems is the asynchronous model, in which an I/O request merely starts an operation. The program must somehow discover when the operation completes.

The asynchronous model is not as simple as the synchronous model. But, the asynchronous model has the advantage of allowing concurrent I/O and processing in traditional, single-threaded UNIX processes.

Tamed Asynchrony

You can get most of the benefits of asynchronous I/O by using synchronous I/O in a multithreaded program. With asynchronous I/O, you would issue a request and check later to determine when the I/O completes. You can instead have a separate thread perform the I/O synchronously. The main thread can then check for the completion of the operation at some later time perhaps by calling pthread_join().

Asynchronous I/O

In most situations, asynchronous I/O is not required because its effects can be achieved with the use of threads, with each thread execution of synchronous I/O. However, in a few situations, threads cannot achieve what asynchronous I/O can.

The most straightforward example is writing to a tape drive to make the tape drive stream. Streaming prevents the tape drive from stopping while the drive is being written to. The tape moves forward at high speed while supplying a constant stream of data that is written to tape.

To support streaming, the tape driver in the kernel should use threads. The tape driver in the kernel must issue a queued write request when the tape driver responds to an interrupt. The interrupt indicates that the previous tape-write operation has completed.

Threads cannot guarantee that asynchronous writes are ordered because the order in which threads executed is indeterminate. You cannot, for example, specify the order of a write to a tape.

Asynchronous I/O Operations

#include <aio.h>

int aio_read(struct aiocb *aiocbp);

int aio_write(struct aiocb *aiocbp);

int aio_error(const struct aiocb *aiocbp);

ssize_t aio_return(struct aiocb *aiocbp);

int aio_suspend(struct aiocb *list[], int nent,
    const struct timespec *timeout);

int aio_waitn(struct aiocb *list[], uint_t nent, uint_t *nwait,
    const struct timespec *timeout);

int aio_cancel(int fildes, struct aiocb *aiocbp);

aio_read(3RT) and aio_write(3RT) are similar in concept to pread() and pwrite(), except that the parameters of the I/O operation are stored in an asynchronous I/O control block (aiocbp) that is passed to aio_read() or aio_write():

    aiocbp->aio_fildes;    /* file descriptor */
    aiocbp->aio_buf;       /* buffer */
    aiocbp->aio_nbytes;    /* I/O request size */
    aiocbp->aio_offset;    /* file offset */

In addition, if desired, an asynchronous notification type (most commonly a queued signal) can be specified in the 'struct sigevent' member:

    aiocbp->aio_sigevent;  /* notification type */

A call to aio_read() or aio_write() results in the initiation or queueing of an I/O operation. The call returns without blocking.

The aiocbp value may be used as an argument to aio_error() and aio_return() in order to determine the error status and return status of the asynchronous operation while it is proceeding.

Waiting for I/O Operation to Complete

You can wait for one or more outstanding asynchronous I/O operations to complete by calling aio_suspend() or aio_waitn(). Use aio_error() and aio_return() on the completed asynchronous I/O control blocks to determine the success or failure of the I/O operation.

The aio_suspend() and aio_waitn() functions take a timeout argument, which indicates how long the caller is willing to wait. A NULL pointer means that the caller is willing to wait indefinitely. A pointer to a structure containing a zero value means that the caller is unwilling to wait at all.

You might start an asynchronous I/O operation, do some work, then call aio_suspend() or aio_waitn() to wait for the request to complete. Or you can rely on the asynchronous notification event specified in aio_sigevent() to occur to notify you when the operation completes.

Finally, a pending asynchronous I/O operation can be cancelled by calling aio_cancel(). This function is called with the address of the I/O control block that was used to initiate the I/O operation.

Shared I/O and New I/O System Calls

When multiple threads perform concurrent I/O operations with the same file descriptor, you might discover that the traditional UNIX I/O interface is not thread safe. The problem occurs with nonsequential I/O where the lseek() system call sets the file offset. The file offset is then used in the next read() or write() call to indicate where in the file the operation should start. When two or more threads are issuing an lseek() to the same file descriptor, a conflict results.

To avoid this conflict, use the pread() and pwrite() system calls.

#include <sys/types.h>
#include <unistd.h>

ssize_t pread(int fildes, void *buf, size_t nbyte, off_t offset);

ssize_t pwrite(int filedes, void *buf, size_t nbyte,
    off_t offset);

pread() and pwrite() behave just like read() and write() except that pread() and pwrite() take an additional argument, the file offset. With this argument, you specify the offset without using lseek(), so multiple threads can use these routines safely for I/O on the same file descriptor.

Alternatives to `getc()` and `putc()`

An additional problem occurs with standard I/O. Programmers are accustomed to routines, such as getc() and putc(), that are implemented as macros, being very quick. Because of the speed of getc() and putc(), these macros can be used within the inner loop of a program with no concerns about efficiency.

However, when getc() and putc() are made thread safe the macros suddenly become more expensive. The macros now require at least two internal subroutine calls, to lock and unlock a mutex.

To get around this problem, alternative versions of these routines are supplied: getc_unlocked() and putc_unlocked().

getc_unlocked() and putc_unlocked() do not acquire locks on a mutex. These getc_unlocked() or putc_unlocked() macros are as quick as the original, nonthread-safe versions of getc() and putc().

However, to use these macros in a thread-safe way, you must explicitly lock and release the mutexes that protect the standard I/O streams, using flockfile() and funlockfile(). The calls to these latter routines are placed outside the loop. Calls to getc_unlocked() or putc_unlocked() are placed inside the loop.

New System Calls for Reliable Multithreaded Programming

In the Oracle Solaris 11 release, the following new APIs and flags have been added to make multithreaded programs more reliable:

The following *at() functions are versions of file handling system calls which take the working directory for relative pathnames as an argument, to avoid races with chdir() calls between multiple threads.

openat(2)
fchownat(2)
fstatat(2)
utimensat(2)
unlinkat(2)
faccessat(2)
fchmodat(2)
linkat(2)
mkdirat(2)
mknodat(2)
readlinkat(2)
renameat(2)
symlinkat(2)

The following flags close a race condition between the open(), dup(), and dup2() calls and a following call to the fcntl() function to set the FD_CLOEXEC flag by setting it atomically in the creation system call.

O_CLOEXEC flag to open(2)
F_DUPFD_CLOEXEC and F_DUP2FD_CLOEXEC arguments to fcntl(2)

Multithreaded Programming Guide