Sun MPI 4.0 Programming and Reference Guide

Data Access

The 35 data-access routines are categorized according to file positioning. Data access can be achieved by any of these methods of file positioning:

- Explicit offsets
- Individual file pointers
- Shared file pointers

In the following subsections, each of these methods is discussed in more detail.

While blocking I/O calls will not return until the request is completed, nonblocking calls do not wait for the I/O request to complete. A separate "request complete" call, such as MPI_Test or MPI_Wait, is needed to confirm that the buffer is ready to be used again. Nonblocking routines have the prefix MPI_File_i, where the i stands for immediate.
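For example, a nonblocking read through MPI_File_iread might be structured as in the following sketch; the file name input.dat, the element count, and the omitted error checking are illustrative assumptions:

    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        MPI_File    fh;
        MPI_Request request;
        MPI_Status  status;
        int         buf[100];

        MPI_Init(&argc, &argv);
        MPI_File_open(MPI_COMM_WORLD, "input.dat",
                      MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

        /* Start the read; the call returns without waiting for the data. */
        MPI_File_iread(fh, buf, 100, MPI_INT, &request);

        /* ... computation that does not touch buf can overlap the I/O ... */

        /* buf must not be used until the request completes. */
        MPI_Wait(&request, &status);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }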

All the nonblocking collective routines for data access are "split" into two routines, each with _begin or _end as a suffix. These split collective routines are subject to the semantic rules described in Section 9.4.5 of the MPI-2 standard.
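A split collective read might look like the following fragment, with computation placed between the _begin and _end calls; the file name and buffer size are assumptions, and MPI_Init/MPI_Finalize are omitted:

    MPI_File   fh;
    MPI_Status status;
    float      buf[1000];

    MPI_File_open(MPI_COMM_WORLD, "input.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    /* Begin the collective read; all processes in the group must call it. */
    MPI_File_read_all_begin(fh, buf, 1000, MPI_FLOAT);

    /* ... computation that does not touch buf may proceed here ... */

    /* Complete the collective read; buf is valid after this returns. */
    MPI_File_read_all_end(fh, buf, &status);

    MPI_File_close(&fh);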

Data Access With Explicit Offsets

Synchronism                       Noncollective coordination    Collective coordination
----------------------------------------------------------------------------------------
Blocking                          MPI_File_read_at              MPI_File_read_at_all
                                  MPI_File_write_at             MPI_File_write_at_all

Nonblocking or split collective   MPI_File_iread_at             MPI_File_read_at_all_begin
                                  MPI_File_iwrite_at            MPI_File_read_at_all_end
                                                                MPI_File_write_at_all_begin
                                                                MPI_File_write_at_all_end

To access data at an explicit offset, specify the position in the file where the next data access for each process should begin. For each call to a data-access routine, a process attempts to transfer a specified number of data items of a specified data type between the file (starting at the specified offset) and a specified user buffer.

The offset is measured in elementary data type (etype) units relative to the current view; moreover, "holes" are not counted when locating an offset. The data is read from (in the case of a read) or written into (in the case of a write) those parts of the file specified by the current view. These routines store the number of buffer elements of a particular data type actually read (or written) in the status object, and all the other fields associated with the status object are undefined. The number of elements that are read or written can be accessed using MPI_Get_count.
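The following fragment sketches this pattern, assuming a file named input.dat and a 100-element block per process; error checking is omitted:

    MPI_File   fh;
    MPI_Status status;
    int        buf[100], count, rank;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_File_open(MPI_COMM_WORLD, "input.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    /* The offset is given in etype units relative to the current view;
       here each rank reads its own 100-element block. */
    MPI_File_read_at(fh, (MPI_Offset)rank * 100, buf, 100, MPI_INT, &status);

    /* Query how many elements were actually read. */
    MPI_Get_count(&status, MPI_INT, &count);

    MPI_File_close(&fh);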

MPI_File_read_at attempts to read from the file via the associated file handle returned from a successful MPI_File_open. Similarly, MPI_File_write_at attempts to write data from a user buffer to a file. MPI_File_iread_at and MPI_File_iwrite_at are the nonblocking versions of MPI_File_read_at and MPI_File_write_at, respectively.

MPI_File_read_at_all and MPI_File_write_at_all are collective versions of MPI_File_read_at and MPI_File_write_at, in which each process provides an explicit offset. The split collective versions of these nonblocking routines are listed in the table at the beginning of this section.
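For illustration, a collective write with explicit offsets might be coded as in this sketch; the file name, block size, and buffer contents are assumptions:

    MPI_File   fh;
    MPI_Status status;
    int        buf[100], rank;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* ... fill buf ... */

    /* Collective call: every process participates, each supplying
       its own rank-dependent offset. */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * 100, buf, 100,
                          MPI_INT, &status);

    MPI_File_close(&fh);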

Data Access With Individual File Pointers

Synchronism                       Noncollective coordination    Collective coordination
----------------------------------------------------------------------------------------
Blocking                          MPI_File_read                 MPI_File_read_all
                                  MPI_File_write                MPI_File_write_all

Nonblocking or split collective   MPI_File_iread                MPI_File_read_all_begin
                                  MPI_File_iwrite               MPI_File_read_all_end
                                                                MPI_File_write_all_begin
                                                                MPI_File_write_all_end

For each open file, Sun MPI I/O maintains one individual file pointer per process per collective MPI_File_open. For these data-access routines, MPI I/O implicitly uses the value of the individual file pointer as the offset. These routines use and update only the individual file pointers maintained by MPI I/O: after an access, the pointer points to the next elementary data item after the last one accessed. The individual file pointer is updated relative to the current view of the file. The shared file pointer is neither used nor updated. (For data access with shared file pointers, see the next section.)

These routines have similar semantics to the explicit-offset data-access routines, except that the offset is defined here to be the current value of the individual file pointer.

MPI_File_read_all and MPI_File_write_all are collective versions of MPI_File_read and MPI_File_write, with each process using its individual file pointer.

MPI_File_iread and MPI_File_iwrite are the nonblocking versions of MPI_File_read and MPI_File_write, respectively. The split collective versions of MPI_File_read_all and MPI_File_write_all are listed in the table at the beginning of this section.
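For instance, two successive reads through the individual file pointer access consecutive parts of the file, because the pointer advances after each call; the file name is assumed and error checking is omitted:

    MPI_File   fh;
    MPI_Status status;
    int        first[50], second[50];

    MPI_File_open(MPI_COMM_WORLD, "input.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    /* Reads elements 0..49 of this process's view; the individual
       file pointer then points at element 50. */
    MPI_File_read(fh, first, 50, MPI_INT, &status);

    /* Continues where the pointer left off: elements 50..99. */
    MPI_File_read(fh, second, 50, MPI_INT, &status);

    MPI_File_close(&fh);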

Pointer Manipulation

MPI_File_seek
MPI_File_get_position
MPI_File_get_byte_offset

Each process can call the routine MPI_File_seek to update its individual file pointer according to the update mode. The update mode has the following possible values:

- MPI_SEEK_SET: The pointer is set to offset.
- MPI_SEEK_CUR: The pointer is set to the current pointer position plus offset.
- MPI_SEEK_END: The pointer is set to the end of the file plus offset.

The offset can be negative for backwards seeking, but you cannot seek to a negative position in the file. The current position is defined as the elementary data item immediately following the last-accessed data item.

MPI_File_get_position returns the current position of the individual file pointer relative to the current displacement and file type.

MPI_File_get_byte_offset converts the offset specified for the current view to the displacement value, or absolute byte position, for the file.
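The fragment below sketches these three routines together, assuming fh is a file handle already opened with some view; the offset values are illustrative:

    MPI_Offset position, disp;

    /* Set the pointer to etype 10, relative to the current view. */
    MPI_File_seek(fh, 10, MPI_SEEK_SET);

    /* Advance it by 5 etypes; it now points at etype 15. */
    MPI_File_seek(fh, 5, MPI_SEEK_CUR);

    /* Returns 15, in etype units relative to the view. */
    MPI_File_get_position(fh, &position);

    /* Convert the view-relative offset to an absolute byte position. */
    MPI_File_get_byte_offset(fh, position, &disp);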

Data Access With Shared File Pointers

Synchronism                       Noncollective coordination    Collective coordination
----------------------------------------------------------------------------------------
Blocking                          MPI_File_read_shared          MPI_File_read_ordered
                                  MPI_File_write_shared         MPI_File_write_ordered
                                                                MPI_File_seek_shared
                                                                MPI_File_get_position_shared

Nonblocking or split collective   MPI_File_iread_shared         MPI_File_read_ordered_begin
                                  MPI_File_iwrite_shared        MPI_File_read_ordered_end
                                                                MPI_File_write_ordered_begin
                                                                MPI_File_write_ordered_end

Sun MPI I/O maintains one shared file pointer per collective MPI_File_open (shared among processes in the communicator group that opened the file). As with the routines for data access with individual file pointers, you can also use the current value of the shared file pointer to specify the offset of data accesses implicitly. These routines use and update only the shared file pointer; the individual file pointers are neither used nor updated by any of these routines.

These routines have similar semantics to the explicit-offset data-access routines, except:

- The offset is defined to be the current value of the shared file pointer.
- The effect of multiple calls to shared file pointer routines is defined to behave as if the calls were serialized.
- Use of the shared file pointer routines is erroneous unless all processes use the same file view.

After a shared file pointer operation is initiated, it is updated, relative to the current view of the file, to point to the elementary data item immediately following the last one requested, regardless of the number of items actually accessed.

MPI_File_read_shared and MPI_File_write_shared are blocking routines that use the shared file pointer to read and write files, respectively. The order of serialization is not deterministic for these noncollective routines, so you need to use other methods of synchronization if you wish to impose a particular order.
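A typical use is appending records from many processes to a common file when the record order does not matter, as in this sketch; the file name and record size are assumptions:

    MPI_File   fh;
    MPI_Status status;
    char       record[80];

    MPI_File_open(MPI_COMM_WORLD, "log.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* ... format the record ... */

    /* Appends after whatever any other rank has already written;
       the relative order of ranks is not deterministic. */
    MPI_File_write_shared(fh, record, 80, MPI_CHAR, &status);

    MPI_File_close(&fh);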

MPI_File_iread_shared and MPI_File_iwrite_shared are the nonblocking versions of MPI_File_read_shared and MPI_File_write_shared, respectively.

MPI_File_read_ordered and MPI_File_write_ordered are the collective versions of MPI_File_read_shared and MPI_File_write_shared. They must be called by all processes in the communicator group associated with the file handle, and the accesses to the file occur in the order determined by the ranks of the processes within the group. After all the processes in the group have issued their respective calls, for each process in the group, these routines determine the position where the shared file pointer would be after all processes with ranks lower than this process's rank had accessed their data. Then data is accessed (read or written) at that position. The shared file pointer is then updated by the amount of data requested by all processes of the group.
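The following sketch writes one block per process in rank order, a common way to obtain a deterministic file layout without computing explicit offsets; the file name and buffer contents are assumptions:

    MPI_File   fh;
    MPI_Status status;
    int        buf[100];

    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* ... fill buf ... */

    /* Collective: rank 0's block lands first in the file,
       then rank 1's, and so on. */
    MPI_File_write_ordered(fh, buf, 100, MPI_INT, &status);

    MPI_File_close(&fh);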

The split collective versions of MPI_File_read_ordered and MPI_File_write_ordered are listed in the table at the beginning of this section.

MPI_File_seek_shared is a collective routine: all processes in the communicator group associated with the particular file handle must call MPI_File_seek_shared with the same file offset and the same update mode. All the processes are synchronized with a barrier before the shared file pointer is updated.

The offset can be negative for backwards seeking, but you cannot seek to a negative position in the file. The current position is defined as the elementary data item immediately following the last-accessed data item, even if that location is a hole.

MPI_File_get_position_shared returns the current position of the shared file pointer relative to the current displacement and file type.
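A brief sketch combining these two routines, assuming fh is an open file handle shared by the communicator group:

    MPI_Offset position;

    /* Collective: every process passes the same offset and update mode. */
    MPI_File_seek_shared(fh, 0, MPI_SEEK_SET);

    /* Returns the shared pointer's position in etype units
       relative to the current view (here, 0). */
    MPI_File_get_position_shared(fh, &position);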