Programming Interfaces Guide

High Performance I/O

This section describes I/O with real-time processes. In SunOS, the libraries supply two sets of interfaces and calls to perform fast, asynchronous I/O operations. The POSIX asynchronous I/O interfaces are the most recent standard. The SunOS environment also provides file and in-memory synchronization operations and modes to prevent information loss and data inconsistency.

Standard UNIX I/O is synchronous to the application programmer. An application that calls read(2) or write(2) usually waits until the system call has finished.

Real-time applications need asynchronous, bounded I/O behavior. A process that issues an asynchronous I/O call proceeds without waiting for the I/O operation to complete. The caller is notified when the I/O operation has finished.

Asynchronous I/O can be used with any SunOS file. Files are opened synchronously and no special flagging is required. An asynchronous I/O transfer has three elements: call, request, and operation. The application calls an asynchronous I/O interface, the request for the I/O is placed on a queue, and the call returns immediately. At some point, the system dequeues the request and initiates the I/O operation.

Asynchronous and standard I/O requests can be intermingled on any file descriptor. The system does not preserve the order in which read and write requests are issued, and can arbitrarily resequence all pending read and write requests. If the application requires a specific sequence, the application must ensure that prior operations have completed before issuing the dependent requests.

POSIX Asynchronous I/O

POSIX asynchronous I/O is performed using aiocb structures. An aiocb control block identifies each asynchronous I/O request and contains all of the controlling information. A control block can be used for only one request at a time. A control block can be reused after its request has been completed.

A typical POSIX asynchronous I/O operation is initiated by a call to aio_read(3RT) or aio_write(3RT). Either polling or signals can be used to determine the completion of an operation. If signals are used to report completion, each operation can be uniquely tagged. The tag is then returned in the si_value component of the generated signal. See the siginfo(3HEAD) man page.

aio_read

aio_read(3RT) is called with an asynchronous I/O control block to initiate a read operation.

aio_write

aio_write(3RT) is called with an asynchronous I/O control block to initiate a write operation.

aio_return, aio_error

aio_return(3RT) and aio_error(3RT) are called to obtain return and error values, respectively, after an operation is known to have completed.

aio_cancel

aio_cancel(3RT) is called with an asynchronous I/O control block to cancel pending operations. aio_cancel can be used to cancel a specific request, if a request is specified by the control block. aio_cancel can also cancel all of the requests that are pending for the specified file descriptor.

aio_fsync

aio_fsync(3RT) queues an asynchronous fsync(3C) or fdatasync(3RT) request for all of the pending I/O operations on the specified file.
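The following sketch shows one way a write and a synchronization request might be queued back to back. The helper names settle and write_then_sync are hypothetical, and the sketch does not handle short writes.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

/* Block until one request is no longer in progress; return its error status. */
static int settle(struct aiocb *cb)
{
    const struct aiocb *one[1] = { cb };
    int err;

    while ((err = aio_error(cb)) == EINPROGRESS)
        aio_suspend(one, 1, NULL);
    aio_return(cb);          /* retire the request (byte count ignored here) */
    return err;
}

/*
 * Hypothetical helper: queue a write, then queue a data synchronization
 * request for the same file. Returns 0 once both have succeeded, or -1
 * with errno set.
 */
int write_then_sync(int fd, const void *data, size_t len, off_t offset)
{
    struct aiocb wr, sync;
    int werr, serr, saved;

    memset(&wr, 0, sizeof wr);
    wr.aio_fildes = fd;
    wr.aio_buf = (void *)data;       /* aio_buf is not const-qualified */
    wr.aio_nbytes = len;
    wr.aio_offset = offset;
    wr.aio_sigevent.sigev_notify = SIGEV_NONE;
    if (aio_write(&wr) == -1)
        return -1;

    memset(&sync, 0, sizeof sync);
    sync.aio_fildes = fd;
    sync.aio_sigevent.sigev_notify = SIGEV_NONE;
    /* O_DSYNC asks for fdatasync-like behavior, O_SYNC for fsync-like. */
    if (aio_fsync(O_DSYNC, &sync) == -1) {
        saved = errno;
        settle(&wr);                 /* do not abandon the queued write */
        errno = saved;
        return -1;
    }

    werr = settle(&wr);
    serr = settle(&sync);
    if (werr != 0 || serr != 0) {
        errno = werr ? werr : serr;
        return -1;
    }
    return 0;
}
```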

aio_suspend

aio_suspend(3RT) suspends the caller as though one or more of the preceding asynchronous I/O requests had been made synchronously.

Solaris Asynchronous I/O

This section discusses asynchronous I/O operations in the Solaris operating environment.

Notification (SIGIO)

When an asynchronous I/O call returns successfully, the I/O operation has only been queued and waits to be done. The actual operation has a return value and a potential error identifier. This return value and potential error identifier would have been returned to the caller if the call had been synchronous. When the I/O is finished, both the return and error values are stored at a location given by the user at the time of the request as a pointer to an aio_result_t. The structure of the aio_result_t is defined in <sys/asynch.h>:

typedef struct aio_result_t {
    ssize_t aio_return;   /* return value of read or write */
    int     aio_errno;    /* errno generated by the I/O */
} aio_result_t;

When the aio_result_t has been updated, a SIGIO signal is delivered to the process that made the I/O request.

Note that a process with two or more asynchronous I/O operations pending has no certain way to determine the cause of the SIGIO signal. A process that receives a SIGIO should check all its conditions that could be generating the SIGIO signal.

Using aioread

The aioread(3AIO) routine is the asynchronous version of read(2). In addition to the normal read arguments, aioread(3AIO) takes the arguments that specify a file position and the address of an aio_result_t structure. The resulting information about the operation is stored in the aio_result_t structure. The file position specifies a seek to be performed within the file before the operation. Whether the aioread(3AIO) call succeeds or fails, the file pointer is updated.

Using aiowrite

The aiowrite(3AIO) routine is the asynchronous version of write(2). In addition to the normal write arguments, aiowrite(3AIO) takes arguments that specify a file position and the address of an aio_result_t structure. The resulting information about the operation is stored in the aio_result_t structure.

The file position specifies that a seek operation is to be performed within the file before the operation. If the aiowrite(3AIO) call succeeds, the file pointer is updated to the position that would have resulted from a successful seek and write. The file pointer is also updated when a write fails, to allow for subsequent write requests.

Using aiocancel

The aiocancel(3AIO) routine attempts to cancel the asynchronous request whose aio_result_t structure is given as an argument. An aiocancel(3AIO) call succeeds only if the request is still queued. If the operation is in progress, aiocancel(3AIO) fails.

Using aiowait

A call to aiowait(3AIO) blocks the calling process until at least one outstanding asynchronous I/O operation is completed. The timeout parameter points to a maximum interval to wait for I/O completion. A timeout value of zero specifies that no wait is wanted. aiowait(3AIO) returns a pointer to the aio_result_t structure for the completed operation.

Using poll()

To determine the completion of an asynchronous I/O event synchronously, rather than depending on a SIGIO interrupt, use poll(2). You can also poll to determine the origin of a SIGIO interrupt.

poll(2) is slow when used on very large numbers of files. This problem is resolved by poll(7D).
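A minimal sketch of using poll(2) to find which descriptor is ready, for example after a SIGIO interrupt, might look like the following. The helper name first_readable and the limit of 16 descriptors are assumptions made for this example.

```c
#include <poll.h>

#define MAX_POLLED 16    /* arbitrary limit for this sketch */

/*
 * Hypothetical helper: return the index of the first descriptor in fds[]
 * that has data to read, or -1 if none becomes ready within timeout_ms
 * milliseconds or if poll() fails.
 */
int first_readable(const int fds[], int count, int timeout_ms)
{
    struct pollfd pfd[MAX_POLLED];
    int i;

    if (count < 1 || count > MAX_POLLED)
        return -1;

    for (i = 0; i < count; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;      /* interested in readable data only */
        pfd[i].revents = 0;
    }

    if (poll(pfd, (nfds_t)count, timeout_ms) <= 0)
        return -1;                   /* timeout (0) or error (-1) */

    for (i = 0; i < count; i++) {
        if (pfd[i].revents & POLLIN)
            return i;
    }
    return -1;
}
```

A timeout of zero makes the call return immediately, which suits a check made from the code that runs after a SIGIO has been received.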

Using the poll Driver

Using /dev/poll provides a highly scalable way of polling a large number of file descriptors. This scalability is provided through a new set of APIs and a new driver, /dev/poll. The /dev/poll API is an alternative to, not a replacement of, poll(2). See poll(7D) for details and examples of the /dev/poll API. When used properly, the /dev/poll API scales much better than poll(2). This API is especially suited for applications that satisfy the following criteria:

Using close

Files are closed by calling close(2). The call to close(2) cancels any outstanding asynchronous I/O request that can be cancelled. close(2) waits for an operation that cannot be cancelled. For more information, see Using aiocancel. When close(2) returns, no asynchronous I/O is pending for the file descriptor. Only asynchronous I/O requests queued to the specified file descriptor are cancelled when a file is closed. Any pending I/O requests for other file descriptors are not cancelled.

Synchronized I/O

Applications might need to guarantee that information has been written to stable storage, or that file updates are performed in a particular order. Synchronized I/O provides for these needs.

Synchronization Modes

Under SunOS, a write operation succeeds when the system ensures that all written data is readable after any subsequent open of the file. This check assumes no failure of the physical storage medium. Data is successfully transferred for a read operation when an image of the data on the physical storage medium is available to the requesting process. An I/O operation is complete when the associated data has been successfully transferred, or when the operation has been diagnosed as unsuccessful.

An I/O operation has reached synchronized I/O data integrity completion when:

Synchronizing a File

fsync(3C) and fdatasync(3RT) explicitly synchronize a file to secondary storage.

The fsync(3C) routine guarantees that the file is synchronized at the I/O file integrity completion level. fdatasync(3RT) guarantees that the file is synchronized at the level of I/O data integrity completion.
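The difference between the two calls can be illustrated with a short sketch. The helper name save_and_sync and its sync_metadata parameter are hypothetical, and the sketch does not retry a write that is interrupted by a signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a buffer to a file and synchronize the file
 * explicitly before closing it. With sync_metadata nonzero, fsync() is
 * used (file integrity); otherwise fdatasync() is used (data integrity).
 * Returns 0 on success or -1 on failure.
 */
int save_and_sync(const char *path, const void *data, size_t len,
                  int sync_metadata)
{
    const char *p = data;
    ssize_t n;
    int rc;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd == -1)
        return -1;

    while (len > 0) {                /* write() may transfer fewer bytes */
        n = write(fd, p, len);
        if (n == -1) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }

    rc = sync_metadata ? fsync(fd) : fdatasync(fd);
    if (close(fd) == -1)
        rc = -1;
    return rc;
}
```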

Applications can synchronize each I/O operation before the operation completes. Setting the O_DSYNC flag on the file description by using open(2) or fcntl(2) ensures that all I/O writes reach I/O data integrity completion before the operation completes. Setting the O_SYNC flag on the file description ensures that all I/O writes have reached I/O file integrity completion before the operation is indicated as completed. Setting the O_RSYNC flag on the file description ensures that all I/O reads, read(2) and aio_read(3RT), reach the same level of completion that is requested by the descriptor setting. The descriptor setting can be either O_DSYNC or O_SYNC.
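As a sketch of the per-operation approach, the following example opens a file with O_DSYNC so that each write(2) returns only after the data has reached I/O data integrity completion. The helper name append_record_dsync is hypothetical. The flag is set with open(2) here because not every system allows synchronization flags to be changed later with fcntl(2).

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: append one record to a file that is opened with
 * O_DSYNC, so no separate fdatasync() call is needed after the write.
 * Returns 0 if the whole record was written, or -1 otherwise.
 */
int append_record_dsync(const char *path, const void *rec, size_t len)
{
    ssize_t n;
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_DSYNC, 0600);

    if (fd == -1)
        return -1;

    /* With O_DSYNC set, write() does not return until the data is synchronized. */
    n = write(fd, rec, len);

    if (close(fd) == -1)
        return -1;
    return (n == (ssize_t)len) ? 0 : -1;
}
```

Replacing O_DSYNC with O_SYNC in the call to open(2) would request I/O file integrity completion for every write instead.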