
Oracle® Solaris 11.3 Programming Interfaces Guide


Updated: April 2019
 
 

High Performance I/O

This section describes I/O with real-time processes. In Oracle Solaris, the libraries supply two sets of interfaces and calls to perform fast, asynchronous I/O operations. The POSIX asynchronous I/O interfaces are the most recent standard. An Oracle Solaris environment also provides file and in-memory synchronization operations and modes to prevent information loss and data inconsistency.

Standard UNIX I/O is synchronous to the application programmer. An application that calls read(2) or write(2) usually waits until the system call has finished. For more information, see the read(2) and write(2) man pages.

Real-time applications need asynchronous, bounded I/O behavior. A process that issues an asynchronous I/O call proceeds without waiting for the I/O operation to complete. The caller is notified when the I/O operation has finished.

Asynchronous I/O can be used with any Oracle Solaris file. Files are opened synchronously and no special flagging is required. An asynchronous I/O transfer has three elements: call, request, and operation. The application calls an asynchronous I/O interface, the request for the I/O is placed on a queue, and the call returns immediately. At some point, the system dequeues the request and initiates the I/O operation.

Asynchronous and standard I/O requests can be intermingled on any file descriptor. The system maintains no particular sequence of read and write requests and can arbitrarily resequence all pending read and write requests. If the application requires a specific sequence, the application must ensure that prior operations have completed before issuing the dependent requests.
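One way to enforce such an ordering is to wait for the earlier request to complete before queuing the dependent one. The following is a minimal sketch that uses the POSIX interfaces described in the next section. The helper names wait_for(), init_cb(), and write_then_read() are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: block in aio_suspend() until cb completes. */
static ssize_t
wait_for(struct aiocb *cb)
{
	const struct aiocb *const list[1] = { cb };
	int err;

	while (aio_suspend(list, 1, NULL) != 0) {
		if (errno != EINTR)
			return (-1);
	}
	if ((err = aio_error(cb)) != 0) {
		(void) aio_return(cb);	/* release the request */
		errno = err;
		return (-1);
	}
	return (aio_return(cb));
}

/* Hypothetical helper: prepare a control block for offset 0, no signal. */
static void
init_cb(struct aiocb *cb, int fd, void *buf, size_t len)
{
	memset(cb, 0, sizeof (*cb));
	cb->aio_fildes = fd;
	cb->aio_buf = buf;
	cb->aio_nbytes = len;
	cb->aio_offset = 0;
	cb->aio_sigevent.sigev_notify = SIGEV_NONE;
}

/*
 * Write msg at offset 0, wait for the write to complete, and only then
 * queue the read that depends on it. Returns bytes read, or -1.
 */
ssize_t
write_then_read(int fd, const char *msg, char *out, size_t outlen)
{
	struct aiocb wr, rd;
	size_t len;

	assert(msg != NULL && out != NULL);
	len = strlen(msg);

	init_cb(&wr, fd, (void *)msg, len);
	if (aio_write(&wr) != 0 || wait_for(&wr) != (ssize_t)len)
		return (-1);

	/* The write is known to be complete; the read can now be issued. */
	init_cb(&rd, fd, out, outlen);
	if (aio_read(&rd) != 0)
		return (-1);
	return (wait_for(&rd));
}
```

If both requests were queued back to back without the intermediate wait, the read could be serviced before the write.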

POSIX Asynchronous I/O

POSIX asynchronous I/O is performed using aiocb structures. An aiocb control block identifies each asynchronous I/O request and contains all of the controlling information. A control block can be used for only one request at a time. A control block can be reused after its request has been completed.

A typical POSIX asynchronous I/O operation is initiated by a call to aio_read() or aio_write(). Either polling or signals can be used to determine the completion of an operation. If signals are used for completing operations, each operation can be uniquely tagged. The tag is then returned in the si_value component of the generated signal. For more information, see the siginfo(3HEAD) man page.
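The polling approach can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the helper name read_async_polled() and the 1-millisecond polling interval are hypothetical choices.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes at offset off with aio_read(),
 * polling aio_error() until the request is no longer EINPROGRESS.
 * Returns the byte count, or -1 with errno set.
 */
ssize_t
read_async_polled(int fd, void *buf, size_t len, off_t off)
{
	struct aiocb cb;
	struct timespec pause = { 0, 1000000 };	/* 1 ms, arbitrary */
	int err;

	assert(buf != NULL);
	memset(&cb, 0, sizeof (cb));
	cb.aio_fildes = fd;
	cb.aio_buf = buf;
	cb.aio_nbytes = len;
	cb.aio_offset = off;
	cb.aio_sigevent.sigev_notify = SIGEV_NONE;	/* poll, no signal */

	if (aio_read(&cb) != 0)
		return (-1);		/* the request was not queued */

	while ((err = aio_error(&cb)) == EINPROGRESS) {
		/* A real application would do useful work here. */
		(void) nanosleep(&pause, NULL);
	}
	if (err != 0) {
		(void) aio_return(&cb);	/* release the request */
		errno = err;
		return (-1);
	}
	return (aio_return(&cb));	/* call once per completed request */
}
```

After aio_return() has been called, the control block can be reused for another request.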

aio_read()

Is called with an asynchronous I/O control block to initiate a read operation.

aio_write()

Is called with an asynchronous I/O control block to initiate a write operation.

aio_return(), aio_error()

Are called to obtain return and error values, respectively, after an operation is known to have completed.

aio_cancel()

Is called with an asynchronous I/O control block to cancel pending operations. aio_cancel() can be used to cancel a specific request, if a request is specified by the control block. aio_cancel() can also cancel all of the requests that are pending for the specified file descriptor.

aio_fsync()

Queues an asynchronous fsync() or fdatasync() request for all of the pending I/O operations on the specified file.

aio_suspend()

Suspends the caller as though one or more of the preceding asynchronous I/O requests had been made synchronously.
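Completion by signal, with a per-request tag returned in si_value, might look like the following sketch. The helper name read_with_tag(), the choice of SIGUSR1, and the use of sigwaitinfo() to collect the signal synchronously are assumptions made for this example. An application could install a handler with SA_SIGINFO instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue a read from offset 0 that is tagged with an
 * application-chosen integer, wait for the completion signal, and store
 * the tag delivered in si_value in *tag_out.
 * Returns the byte count, or -1 on failure.
 */
ssize_t
read_with_tag(int fd, void *buf, size_t len, int tag, int *tag_out)
{
	struct aiocb cb;
	sigset_t set, oldset;
	siginfo_t info;
	ssize_t result = -1;
	int sig;

	assert(buf != NULL && tag_out != NULL);

	/* Block SIGUSR1 so that it stays pending until sigwaitinfo(). */
	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &set, &oldset) != 0)
		return (-1);

	memset(&cb, 0, sizeof (cb));
	cb.aio_fildes = fd;
	cb.aio_buf = buf;
	cb.aio_nbytes = len;
	cb.aio_offset = 0;
	cb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;
	cb.aio_sigevent.sigev_signo = SIGUSR1;
	cb.aio_sigevent.sigev_value.sival_int = tag;	/* per-request tag */

	if (aio_read(&cb) == 0) {
		/* Wait for the completion signal; retry if interrupted. */
		do {
			sig = sigwaitinfo(&set, &info);
		} while (sig == -1 && errno == EINTR);

		if (sig == SIGUSR1) {
			*tag_out = info.si_value.sival_int;
			result = aio_return(&cb);	/* -1 if the read failed */
		}
	}

	(void) sigprocmask(SIG_SETMASK, &oldset, NULL);
	return (result);
}
```

With several requests outstanding, each request would carry a different tag so that the receiver can tell which one completed.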

Oracle Solaris Asynchronous I/O

This section discusses asynchronous I/O operations in the Oracle Solaris operating environment.

Notification (SIGIO)

When an asynchronous I/O call returns successfully, the I/O operation has only been queued and waits to be done. The actual operation has a return value and a potential error identifier. This return value and potential error identifier would have been returned to the caller if the call had been synchronous. When the I/O is finished, both the return and error values are stored at a location given by the user at the time of the request as a pointer to an aio_result_t. The structure of the aio_result_t is defined in <sys/asynch.h>:

typedef struct aio_result_t {
        ssize_t aio_return;     /* return value of read or write */
        int     aio_errno;      /* errno generated by the IO */
} aio_result_t;

When the aio_result_t has been updated, a SIGIO signal is delivered to the process that made the I/O request.

Note that a process with two or more asynchronous I/O operations pending has no certain way to determine the cause of the SIGIO signal. A process that receives a SIGIO should check all its conditions that could be generating the SIGIO signal.

Using aioread()

The aioread() routine is the asynchronous version of read(). In addition to the normal read arguments, aioread() takes arguments that specify a file position and the address of an aio_result_t structure. The resulting information about the operation is stored in the aio_result_t structure. The file position specifies a seek to be performed within the file before the operation. Whether the aioread() call succeeds or fails, the file pointer is updated.

Using aiowrite()

The aiowrite() routine is the asynchronous version of write(). In addition to the normal write arguments, aiowrite() takes arguments that specify a file position and the address of an aio_result_t structure. The resulting information about the operation is stored in the aio_result_t structure.

The file position specifies that a seek operation is to be performed within the file before the operation. If the call succeeds, the file pointer is updated to the position that would have resulted from a successful seek and write. The file pointer is also updated when a write fails, to allow for subsequent write requests.

Using aiocancel()

The aiocancel() routine attempts to cancel the asynchronous request whose aio_result_t structure is given as an argument. An aiocancel() call succeeds only if the request is still queued. If the operation is in progress, aiocancel() fails.

Using aiowait()

A call to aiowait() blocks the calling process until at least one outstanding asynchronous I/O operation is completed. The timeout parameter points to a maximum interval to wait for I/O completion. A timeout that points to a zero-valued timeval structure specifies that no wait is wanted. A null timeout pointer blocks the caller indefinitely. The aiowait() routine returns a pointer to the aio_result_t structure for the completed operation.

Using poll()

To determine the completion of an asynchronous I/O event synchronously rather than depend on a SIGIO interrupt, use poll(). You can also poll to determine the origin of a SIGIO interrupt. For more information, see the poll(2) man page.
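As a general illustration of the poll(2) calling pattern, the following sketch waits for a single descriptor to become readable. The helper name wait_readable() and its return convention are hypothetical. The sketch does not show how a particular asynchronous request is associated with the polled descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for fd to
 * become readable (-1 waits indefinitely).
 * Returns 1 if readable, 0 on timeout, -1 on error.
 */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd;
	int n;

	assert(timeout_ms >= -1);
	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	/* Retry if a signal, such as SIGIO, interrupts the call. */
	do {
		n = poll(&pfd, 1, timeout_ms);
	} while (n < 0 && errno == EINTR);

	if (n <= 0)
		return (n);		/* 0: timeout, -1: error */
	return ((pfd.revents & POLLIN) ? 1 : -1);
}
```

A timeout of zero turns the call into a non-blocking check, which suits a SIGIO handler that needs to find out which descriptor is ready.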

poll() is slow when used on very large numbers of file descriptors. The /dev/poll driver, which is described in the poll(7d) man page, addresses this problem.

Using the /dev/poll Driver

Using /dev/poll provides a highly scalable way of polling a large number of file descriptors. This scalability is provided through a new set of APIs and a new driver, /dev/poll. The /dev/poll API is an alternative to, and not a replacement for, poll(2). See the poll(7d) man page for details and examples of the /dev/poll API. When used properly, the /dev/poll API scales much better than poll(2). This API is especially suited for applications that satisfy the following criteria:

  • Applications that repeatedly poll a large number of file descriptors

  • Polled file descriptors that are relatively stable, meaning that the descriptors are not constantly closed and reopened

  • The set of file descriptors that actually have polled events pending is small compared to the total number of file descriptors that are being polled

Using close()

Files are closed by calling close(). The call to close() cancels any outstanding asynchronous I/O request that can be cancelled. close() waits for an operation that cannot be cancelled. For more information, see Using aiocancel(). When close() returns, no asynchronous I/O is pending for the file descriptor. Only asynchronous I/O requests queued to the specified file descriptor are cancelled when a file is closed. Any pending I/O requests for other file descriptors are not cancelled. For more information, see the close(2) man page.

Synchronized I/O

Applications might need to guarantee that information has been written to stable storage, or that file updates are performed in a particular order. Synchronized I/O provides for these needs.

Synchronization Modes

In Oracle Solaris, a write operation succeeds when the system ensures that all written data is readable after any subsequent open of the file. This check assumes no failure of the physical storage medium. Data is successfully transferred for a read operation when an image of the data on the physical storage medium is available to the requesting process. An I/O operation is complete when the associated data has been successfully transferred, or when the operation has been diagnosed as unsuccessful.

An I/O operation has reached synchronized I/O data integrity completion when:

  • For reads, the operation has been completed, or diagnosed if unsuccessful. The read is complete only when an image of the data has been successfully transferred to the requesting process. If the synchronized read operation is requested when pending write requests affect the data to be read, these write requests are successfully completed before the data is read.

  • For writes, the operation has been completed, or diagnosed if unsuccessful. The write operation succeeds when the data specified in the write request is successfully transferred. Furthermore, all file system information required to retrieve the data must be successfully transferred.

  • File attributes that are not necessary for data retrieval are not transferred prior to returning to the calling process.

Synchronized I/O file integrity completion requires that all file attributes relative to the I/O operation be successfully transferred before returning to the calling process. Synchronized I/O file integrity completion is otherwise identical to synchronized I/O data integrity completion.

Synchronizing a File

fsync() and fdatasync() explicitly synchronize a file to secondary storage.

The fsync() routine guarantees that the file is synchronized at the level of synchronized I/O file integrity completion. fdatasync() guarantees that the file is synchronized at the level of synchronized I/O data integrity completion. For more information, see the fsync(3C) man page.
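The difference between the two calls can be sketched as follows. The helper name write_durably() and its data_only parameter are hypothetical and serve only to show where each call fits.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes, then force them to stable storage.
 * A nonzero data_only selects fdatasync() (data integrity completion).
 * Otherwise fsync() is used (file integrity completion).
 * Returns 0 on success, or -1 with errno set.
 */
int
write_durably(int fd, const void *buf, size_t len, int data_only)
{
	const char *p = buf;

	assert(buf != NULL);
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}
	return (data_only ? fdatasync(fd) : fsync(fd));
}
```

fdatasync() can be the cheaper call because attributes that are not needed to retrieve the data, such as access times, do not have to be written.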

Applications can synchronize each I/O operation before the operation completes. Setting the O_DSYNC flag on the file description by using open() or fcntl() ensures that all I/O writes reach synchronized I/O data integrity completion before the operation completes. Setting the O_SYNC flag on the file description ensures that all I/O writes reach synchronized I/O file integrity completion before the operation is indicated as completed. Setting the O_RSYNC flag on the file description ensures that all I/O reads, that is, read() and aio_read(), reach the same level of completion that is requested by the descriptor setting. The descriptor setting can be either O_DSYNC or O_SYNC. For more information, see the open(2), fcntl(2), and read(2) man pages.
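Both ways of setting a synchronization flag can be sketched as follows. The helper names open_dsync() and add_sync_flag() are hypothetical. Whether F_SETFL honors the synchronization flags on an already open file is system-dependent, so the second helper assumes a system that behaves as described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path so that each write() reaches
 * synchronized I/O data integrity completion before it returns.
 */
int
open_dsync(const char *path)
{
	assert(path != NULL);
	return (open(path, O_WRONLY | O_CREAT | O_DSYNC, 0600));
}

/*
 * Hypothetical helper: add O_DSYNC, O_SYNC, or O_RSYNC to a descriptor
 * that is already open. Assumption: the system applies these flags
 * through F_SETFL. Some systems accept the call but ignore the flag.
 * Returns 0 on success, or -1 with errno set.
 */
int
add_sync_flag(int fd, int flag)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return (-1);
	return (fcntl(fd, F_SETFL, flags | flag) == -1 ? -1 : 0);
}
```

Setting the flag at open() time is the more predictable choice, because the flag is then in effect from the first write.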