Programming Interfaces Guide

Chapter 2 Remote Shared Memory API for Solaris Clusters

Solaris Cluster OS systems can be configured with a memory-based interconnect such as Dolphin-SCI and layered system software components. These components implement a mechanism for user-level inter-node messaging that is based on direct access to memory residing on remote nodes. This mechanism is referred to as Remote Shared Memory (RSM). This chapter defines the RSM Application Programming Interface (RSMAPI).

Overview of the Shared Memory Model

In the shared memory model, an application process creates an RSM export segment from the process's local address space. One or more remote application processes create an RSM import segment with a virtual connection between export and import segments across the interconnect. All processes make memory references for the shared segment with addresses local to their specific address space.

An application process creates an RSM export segment by allocating locally addressable memory to the export segment. This allocation is done by using one of the standard Solaris interfaces, such as System V Shared Memory, mmap(2), or valloc(3C). The process then calls on the RSMAPI for the creation of a segment, which provides a reference handle for the allocated memory. The RSM segment is published through one or more interconnect controllers. A published segment is remotely accessible. A list of access privileges for the nodes that are permitted to import the segment is also published.

A segment ID is assigned to the exported segment. This segment ID, along with the cluster node ID of the creating process, allows an importing process to uniquely specify an export segment. Successfully creating an export segment returns a segment handle to the process for use in subsequent segment operations.

An application process obtains access to a published segment by using the RSMAPI to create an import segment. After creating the import segment, the application process forms a virtual connection across the interconnect. Successfully creating this import segment returns an RSM import segment handle to the application process for use in subsequent segment import operations. After establishing the virtual connection, the application might request RSMAPI to provide a memory map for local access, if supported by the interconnect. If memory mapping is not supported, the application can use memory access primitives provided by RSMAPI.

The RSMAPI provides a mechanism to support remote access error detection and to resolve write-order memory model issues. This mechanism is called a barrier.

RSMAPI provides a notification mechanism to synchronize local and remote accesses. An export process can call a function to block while an import process finishes a data write operation. When the import process finishes writing, the process unblocks the export process by calling a signal function. Once unblocked, the export process processes the data.

API Framework

The RSM application support components are delivered in software packages as follows:

The user level, which contains API library functions, is connected to the kernel level, which contains the cluster interfaces and kernel agent.

API Library Functions

The API library functions support the following operations:

Interconnect Controller Operations

The controller operations provide mechanisms for obtaining access to a controller. Controller operations can also determine the characteristics of the underlying interconnect. The following list contains information on controller operations:

rsm_get_controller

int rsm_get_controller(char *name, rsmapi_controller_handle_t *controller);

The rsm_get_controller operation acquires a controller handle for the given controller instance, such as sci0 or loopback. The returned controller handle is used for subsequent RSM library calls.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_CTLR_HNDL

Invalid controller handle

RSMERR_CTLR_NOT_PRESENT

Controller not present

RSMERR_INSUFFICIENT_MEM

Insufficient memory

RSMERR_BAD_LIBRARY_VERSION

Invalid library version

RSMERR_BAD_ADDR

Bad address

rsm_release_controller

int rsm_release_controller(rsmapi_controller_handle_t chdl);

This function releases the controller associated with the given controller handle. Each call to rsm_release_controller must have a matching rsm_get_controller. When all the controller handles associated with a controller are released, the system resources associated with the controller are freed. Attempting to access a controller handle, or attempting to access import or export segments on a released controller handle, is not legal. The results of such an attempt are undefined.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_CTLR_HNDL

Invalid controller handle

rsm_get_controller_attr

int rsm_get_controller_attr(rsmapi_controller_handle_t chdl, rsmapi_controller_attr_t *attr);

This function retrieves attributes for the specified controller handle. The following list describes the currently defined attributes for this function:

typedef struct {
     uint_t       attr_direct_access_sizes;
     uint_t       attr_atomic_sizes;
     size_t       attr_page_size;
     size_t       attr_max_export_segment_size;
     size_t       attr_tot_export_segment_size;
     ulong_t      attr_max_export_segments;
     size_t       attr_max_import_map_size;
     size_t       attr_tot_import_map_size;
     ulong_t      attr_max_import_segments;
 } rsmapi_controller_attr_t;

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_CTLR_HNDL

Invalid controller handle

RSMERR_BAD_ADDR

Bad address

Cluster Topology Operations

The key interconnect data required for export operations and import operations are:

As a fundamental constraint, the controller specified for a segment import must have a physical connection with the controller used for the associated segment export. This interface defines the interconnect topology, which helps applications establish efficient export and import policies. The data that is provided includes local node ID, local controller instance name, and remote connection specification for each local controller.

An application component that exports memory can use the data provided by the interface to find the set of existing local controllers. The data provided by the interface can also be used to correctly assign controllers for the creation and publishing of segments. Application components can efficiently distribute exported segments over the set of controllers that is consistent with the hardware interconnect and with the application software distribution.

An application component that is importing memory must be informed of the segment IDs and controllers used in the memory export. This information is typically conveyed by a predefined segment and controller pair. The importing component can use the topology data to determine the appropriate controllers for the segment import operations.

rsm_get_interconnect_topology

int rsm_get_interconnect_topology(rsm_topology_t **topology_data);

This function returns a pointer to the topology data in a location specified by an application pointer. The topology data structure is defined next.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_TOPOLOGY_PTR

Invalid topology pointer

RSMERR_INSUFFICIENT_MEM

Insufficient memory

RSMERR_BAD_ADDR

Bad address

rsm_free_interconnect_topology

void rsm_free_interconnect_topology(rsm_topology_t *topology_data);

The rsm_free_interconnect_topology operation frees the memory allocated by rsm_get_interconnect_topology.

Return Values: None.

Data Structures

The pointer returned from rsm_get_interconnect_topology references an rsm_topology_t structure. This structure provides the local node ID and an array of pointers to a connections_t structure for each local controller.

typedef struct rsm_topology {
   rsm_nodeid_t    local_nodeid;
   uint_t          local_cntrl_count;
   connections_t   *connections[1];
} rsm_topology_t;

Administrative Operations

RSM segment IDs can be specified by the application or generated by the system using the rsm_memseg_export_publish() function. Applications that specify segment IDs require a reserved range of segment IDs to use. To reserve a range of segment IDs, use rsm_get_segmentid_range and define the reserved range of segment IDs in the segment ID configuration file /etc/rsm/rsm.segmentid. The rsm_get_segmentid_range function can be used by applications to obtain the segment ID range that is reserved for the applications. This function reads the segment ID range defined in the /etc/rsm/rsm.segmentid file for a given application ID.

An application ID is a null-terminated string that identifies the application. The application can use any value equal to or greater than baseid and less than baseid+length. If baseid or length is modified, the segment ID returned to the application might be outside the reserved range. To avoid this problem, use an offset within the range of reserved segment IDs to obtain a segment ID.
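The offset approach can be sketched as follows. This is a minimal illustration, not part of RSMAPI: the helper name segid_from_offset is hypothetical, segment IDs are assumed to fit in 32 bits, and baseid and length are assumed to have been obtained already, for example from rsm_get_segmentid_range.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of RSMAPI: derive a segment ID from an
 * offset into the reserved range [baseid, baseid + length).
 * Returns 0 and stores the ID in *segid on success, -1 if the offset
 * falls outside the reserved range or the sum would overflow.
 */
static int
segid_from_offset(uint32_t baseid, uint32_t length, uint32_t offset,
    uint32_t *segid)
{
    if (offset >= length)
        return (-1);    /* outside the reserved range */
    if (offset > UINT32_MAX - baseid)
        return (-1);    /* baseid + offset would wrap around */
    *segid = baseid + offset;
    return (0);
}
```

Because the application stores only offsets, a later change to baseid in the configuration file moves every derived segment ID with it, and a smaller length is caught by the range check.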

Entries in the /etc/rsm/rsm.segmentid file are of the form:


#keyword      appid      baseid       length
reserve       SUNWfoo    0x600000     100

The entries are composed of strings, which can be separated by tabs or blanks. The first string is the keyword reserve, followed by the application identifier, which is a string without spaces. Following the application identifier is the baseid, which is the starting segment ID of the reserved range in hexadecimal. Following the baseid is the length, which is the number of segment IDs that are reserved. Comment lines have a # in the first column. The file should not contain blank or empty lines. Segment IDs that are reserved for the system are defined in the /usr/include/rsm/rsm_common.h header file. The segment IDs that are reserved for the system cannot be used by the applications.
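To make the entry format concrete, the following sketch parses one line of the file. It is not the parser used by rsm_get_segmentid_range; the names reserve_entry_t and parse_reserve_line are hypothetical, and the length field is assumed to be decimal, as in the sample entry.

```c
#include <stdio.h>
#include <string.h>

/* One parsed "reserve" entry; this structure is illustrative only. */
typedef struct {
    char          appid[64];
    unsigned long baseid;
    unsigned long length;
} reserve_entry_t;

/*
 * Parse one line of the form "reserve appid baseid length".
 * Returns 0 on success, 1 for a comment line, -1 for a malformed line.
 * Assumption: baseid is hexadecimal and length is decimal.
 */
static int
parse_reserve_line(const char *line, reserve_entry_t *out)
{
    char keyword[16];

    if (line[0] == '#')
        return (1);     /* comment: '#' in the first column */
    /* Whitespace in the format matches any run of tabs or blanks. */
    if (sscanf(line, "%15s %63s %lx %lu", keyword, out->appid,
        &out->baseid, &out->length) != 4)
        return (-1);
    if (strcmp(keyword, "reserve") != 0)
        return (-1);
    return (0);
}
```

For the sample entry, the sketch yields the application ID SUNWfoo, a base of 0x600000, and a length of 100.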

The rsm_get_segmentid_range function returns 0 to indicate success. If the function fails, the function returns one of the following error values:

RSMERR_BAD_ADDR

The address that is passed is invalid

RSMERR_BAD_APPID

Application ID not defined in the /etc/rsm/rsm.segmentid file

RSMERR_BAD_CONF

The configuration file /etc/rsm/rsm.segmentid is not present or not readable, or the format of the file is incorrect

Memory Segment Operations

An RSM segment represents a set of (generally) non-contiguous physical memory pages mapped to a contiguous virtual address range. RSM segment export and segment import operations enable the sharing of regions of physical memory among systems on an interconnect. A process of the node on which the physical pages reside is referred to as the exporter of the memory. An exported segment that is published for remote access will have a segment identifier that is unique for the given node. The segment ID might be specified by the exporter or assigned by the RSMAPI framework.

Processes of nodes on the interconnect obtain access to exported memory by creating an RSM import segment. The RSM import segment has a connection with an exported segment, rather than local physical pages. When the interconnect supports memory mapping, importers can read and write the exported memory by using the local memory-mapped addresses of the import segment. When the interconnect does not support memory mapping, the importing process uses memory access primitives.

Export-Side Memory Segment Operations

When exporting a memory segment, the application begins by allocating memory in its virtual address space through the normal operating system interfaces such as the System V Shared Memory Interface, mmap, or valloc. After allocating memory, the application calls the RSMAPI library interfaces to create and label a segment. After labelling the segment, the RSMAPI library interfaces bind physical pages to the allocated virtual range. After binding the physical pages, the RSMAPI library interfaces publish the segment for access by importing processes.


Note –

If virtual address space is obtained by using mmap, the mapping must be MAP_PRIVATE.


Export side memory segment operations include:

Memory Segment Creation and Destruction

Establishing a new memory segment with rsm_memseg_export_create enables the association of physical memory with the segment at creation time. The operation returns an export-side memory segment handle to the new memory segment. The segment exists for the lifetime of the creating process or until destroyed with rsm_memseg_export_destroy.


Note –

If the destroy operation is performed before an import-side disconnect, the disconnect is forced.


Segment Creation

int rsm_memseg_export_create(rsmapi_controller_handle_t controller, rsm_memseg_export_handle_t *memseg, void *vaddr, size_t size, uint_t flags);

This function creates a segment handle. After the segment handle is created, the segment handle is bound to the specified virtual address range [vaddr..vaddr+size]. The range must be valid and aligned on the controller's alignment property. The flags argument is a bitmask, which enables:


Note –

The RSM_LOCK_OPS flag is not included in the initial release of RSMAPI.
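An application can check a request against the controller attributes before calling rsm_memseg_export_create. The sketch below is an illustration under stated assumptions: the function and result names are hypothetical, the page size and size limit are assumed to come from the attr_page_size and attr_max_export_segment_size attributes, and only the address is checked for alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative result codes; these are not RSMAPI error values. */
enum export_check {
    EXPORT_OK = 0,
    EXPORT_BAD_LENGTH,      /* zero length or above the controller limit */
    EXPORT_BAD_ALIGNMENT    /* vaddr not on a page boundary */
};

/*
 * Pre-check an export request against two controller attributes.
 * Assumes page_size is nonzero.
 */
static enum export_check
check_export_request(const void *vaddr, size_t size, size_t page_size,
    size_t max_export_segment_size)
{
    if (size == 0 || size > max_export_segment_size)
        return (EXPORT_BAD_LENGTH);
    if ((uintptr_t)vaddr % page_size != 0)
        return (EXPORT_BAD_ALIGNMENT);
    return (EXPORT_OK);
}
```

A check of this kind does not replace the error handling for the create call. It only lets the application report a bad length or a misaligned address before any RSM resources are requested.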


Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_CTLR_HNDL

Invalid controller handle

RSMERR_CTLR_NOT_PRESENT

Controller not present

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_LENGTH

Length zero or length exceeds controller limits

RSMERR_BAD_ADDR

Invalid address

RSMERR_PERM_DENIED

Permission denied

RSMERR_INSUFFICIENT_MEM

Insufficient memory

RSMERR_INSUFFICIENT_RESOURCES

Insufficient resources

RSMERR_BAD_MEM_ALIGNMENT

Address not aligned on page boundary

RSMERR_INTERRUPTED

Operation interrupted by signal

Segment Destruction

int rsm_memseg_export_destroy(rsm_memseg_export_handle_t memseg);

This function deallocates the segment and frees its resources. All importing processes are forcibly disconnected.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_POLLFD_IN_USE

pollfd in use

Memory Segment Publish, Republish, and Unpublish

The publish operation enables the importing of a memory segment by other nodes on the interconnect. An export segment might be published on multiple interconnect adapters.

The segment ID might be specified from within authorized ranges or specified as zero, in which case a valid segment ID is generated by the RSMAPI framework and is passed back.

The segment access control list is composed of pairs of node ID and access permissions. For each node ID specified in the list, the associated read/write permissions are provided by three octal digits for owner, group and other, as with Solaris file permissions. In the access control list, each octal digit can have the following values:

2

Write access.

4

Read only access.

6

Read and write access.

An access permission value of 0624 specifies the following kind of access:
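Assuming the digits are read in owner, group, other order, as with file permissions, the value 0624 breaks down into read and write access for the owner, write access for the group, and read only access for others. The helper names in the following sketch are hypothetical.

```c
/* Map one octal permission digit to the access values listed above. */
static const char *
access_for_digit(unsigned int digit)
{
    switch (digit) {
    case 2:
        return ("write");
    case 4:
        return ("read only");
    case 6:
        return ("read and write");
    default:
        return ("not listed");
    }
}

/* Split a three-digit octal permission into owner, group, and other. */
static void
split_permission(unsigned int perm, unsigned int *owner,
    unsigned int *group, unsigned int *other)
{
    *owner = (perm >> 6) & 07;
    *group = (perm >> 3) & 07;
    *other = perm & 07;
}
```

Calling split_permission with 0624 stores 6, 2, and 4 in owner, group, and other.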

When an access control list is provided, nodes not included in the list cannot import the segment. However, if the access list is null, any node can import the segment. The access permissions on all nodes equal the owner-group-other file creation permissions of the exporting process.


Note –

Node applications have the responsibility of managing the assignment of segment identifiers to ensure uniqueness on the exporting node.


Publish Segment

int rsm_memseg_export_publish(rsm_memseg_export_handle_t memseg, rsm_memseg_id_t *segment_id, rsmapi_access_entry_t access_list[], uint_t access_list_length);

typedef struct {
    rsm_node_id_t       ae_node;    /* remote node id allowed to access resource */
    rsm_permission_t    ae_permissions;    /* mode of access allowed */
} rsmapi_access_entry_t;

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_SEG_ALREADY_PUBLISHED

Segment already published

RSMERR_BAD_ACL

Invalid access control list

RSMERR_BAD_SEGID

Invalid segment identifier

RSMERR_SEGID_IN_USE

Segment identifier in use

RSMERR_RESERVED_SEGID

Segment identifier reserved

RSMERR_NOT_CREATOR

Not creator of segment

RSMERR_BAD_ADDR

Bad address

RSMERR_INSUFFICIENT_MEM

Insufficient memory

RSMERR_INSUFFICIENT_RESOURCES

Insufficient resources

Authorized Segment ID Ranges:

#define RSM_DRIVER_PRIVATE_ID_BASE       0
#define RSM_DRIVER_PRIVATE_ID_END        0x0FFFFF
#define RSM_CLUSTER_TRANSPORT_ID_BASE    0x100000
#define RSM_CLUSTER_TRANSPORT_ID_END     0x1FFFFF
#define RSM_RSMLIB_ID_BASE               0x200000
#define RSM_RSMLIB_ID_END                0x2FFFFF
#define RSM_DLPI_ID_BASE                 0x300000
#define RSM_DLPI_ID_END                  0x3FFFFF
#define RSM_HPC_ID_BASE                  0x400000
#define RSM_HPC_ID_END                   0x4FFFFF

The following range is reserved for allocation by the system when the publish value is zero.

#define RSM_USER_APP_ID_BASE             0x80000000
#define RSM_USER_APP_ID_END              0xFFFFFFFF
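The ranges above can be held in a small table so that an application can check which range a segment ID falls into. This is a sketch only: the table, the labels, and the function name segid_owner are hypothetical, segment IDs are assumed to be 32 bits wide, and the user application range is assumed to end at 0xFFFFFFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* One inclusive range of segment IDs with an illustrative label. */
typedef struct {
    const char *owner;
    uint32_t    base;
    uint32_t    end;
} segid_range_t;

/* Bounds copied from the definitions above; labels are not RSMAPI names. */
static const segid_range_t segid_ranges[] = {
    { "driver private",    0x000000u,   0x0FFFFFu },
    { "cluster transport", 0x100000u,   0x1FFFFFu },
    { "rsmlib",            0x200000u,   0x2FFFFFu },
    { "dlpi",              0x300000u,   0x3FFFFFu },
    { "hpc",               0x400000u,   0x4FFFFFu },
    { "user application",  0x80000000u, 0xFFFFFFFFu }  /* assumed end */
};

/* Return the label of the range that contains id, or "unlisted". */
static const char *
segid_owner(uint32_t id)
{
    size_t i;

    for (i = 0; i < sizeof (segid_ranges) / sizeof (segid_ranges[0]); i++) {
        if (id >= segid_ranges[i].base && id <= segid_ranges[i].end)
            return (segid_ranges[i].owner);
    }
    return ("unlisted");
}
```

IDs that fall between the listed ranges, such as the 0x600000 base used in the sample rsm.segmentid entry, are reported as unlisted by this sketch.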

Republish Segment

int rsm_memseg_export_republish(rsm_memseg_export_handle_t memseg, rsmapi_access_entry_t access_list[], uint_t access_list_length);

This function establishes a new node access list and segment access mode. These changes only affect future import calls and do not revoke already granted import requests.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_SEG_NOT_PUBLISHED

Segment not published

RSMERR_BAD_ACL

Invalid access control list

RSMERR_NOT_CREATOR

Not creator of segment

RSMERR_INSUFFICIENT_MEM

Insufficient memory

RSMERR_INSUFFICIENT_RESOURCES

Insufficient resources

RSMERR_INTERRUPTED

Operation interrupted by signal

Unpublish Segment

int rsm_memseg_export_unpublish(rsm_memseg_export_handle_t memseg);

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_SEG_NOT_PUBLISHED

Segment not published

RSMERR_NOT_CREATOR

Not creator of segment

RSMERR_INTERRUPTED

Operation interrupted by signal

Memory Segment Rebind

The rebind operation releases the current backing store for an export segment. After releasing the current backing store for an export segment, the rebind operation allocates a new backing store. The application must first obtain a new virtual memory allocation for the segment. This operation is transparent to importers of the segment.


Note –

The application has the responsibility of preventing access to segment data until the rebind operation is complete. Retrieving data from a segment during rebinding does not cause a system failure, but the results of such an operation are undefined.


Rebind Segment

int rsm_memseg_export_rebind(rsm_memseg_export_handle_t memseg, void *vaddr, offset_t off, size_t size);

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_LENGTH

Invalid length

RSMERR_BAD_ADDR

Invalid address

RSMERR_REBIND_NOT_ALLOWED

Rebind not allowed

RSMERR_NOT_CREATOR

Not creator of segment

RSMERR_PERM_DENIED

Permission denied

RSMERR_INSUFFICIENT_MEM

Insufficient memory

RSMERR_INSUFFICIENT_RESOURCES

Insufficient resources

RSMERR_INTERRUPTED

Operation interrupted by signal

Import-Side Memory Segment Operations

The following list describes import-side operations:

The connect operation is used to create an RSM import segment and form a logical connection with an exported segment.

Access to imported segment memory is provided by three interface categories:

Memory Segment Connection and Disconnection

Connect to Segment

int rsm_memseg_import_connect(rsmapi_controller_handle_t controller, rsm_node_id_t node_id, rsm_memseg_id_t segment_id, rsm_permission_t perm, rsm_memseg_import_handle_t *im_memseg);

This function connects to segment segment_id on remote node node_id by using the specified permission perm. The function returns a segment handle after connecting to the segment.

The argument perm specifies the access mode requested by the importer for this connection. To establish the connection, the access permissions specified by the exporter are compared to the access mode, user ID, and group ID used by the importer. If the request mode is not valid, the connection request is denied. The perm argument is limited to the following octal values:

0400

Read mode

0200

Write mode

0600

Read/write mode
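The restriction on perm can be checked before the connect call. In the sketch below, both function names are hypothetical. The second function models the permission comparison as a subset test against the octal digit that the exporter granted to the importer, which is an assumption about how the comparison works rather than a statement of the implementation.

```c
/* Return 1 if perm is one of the three octal values accepted for perm. */
static int
is_valid_import_perm(unsigned int perm)
{
    return (perm == 0400 || perm == 0200 || perm == 0600);
}

/*
 * Assumed model of the comparison: the bits requested by the importer
 * must be a subset of the digit (2, 4, or 6) granted by the exporter.
 */
static int
request_within_grant(unsigned int perm, unsigned int granted_digit)
{
    unsigned int requested = (perm >> 6) & 07;

    return ((requested & ~granted_digit) == 0);
}
```

Under this model, a request of 0600 against a read only grant of 4 is refused, while a request of 0400 against a grant of 6 is accepted.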

The specified controller must have a physical connection to the controller that is used in the export of the segment.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_CTLR_HNDL

Invalid controller handle

RSMERR_CTLR_NOT_PRESENT

Controller not present

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_PERM_DENIED

Permission denied

RSMERR_SEG_NOT_PUBLISHED_TO_NODE

Segment not published to node

RSMERR_SEG_NOT_PUBLISHED

No such segment published

RSMERR_REMOTE_NODE_UNREACHABLE

Remote node not reachable

RSMERR_INTERRUPTED

Connection interrupted

RSMERR_INSUFFICIENT_MEM

Insufficient memory

RSMERR_INSUFFICIENT_RESOURCES

Insufficient resources

RSMERR_BAD_ADDR

Bad address

Disconnect from Segment

int rsm_memseg_import_disconnect(rsm_memseg_import_handle_t im_memseg);

This function disconnects a segment and then frees the segment's resources. All existing mappings to the disconnected segment are removed. The handle im_memseg is freed.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_SEG_STILL_MAPPED

Segment still mapped

RSMERR_POLLFD_IN_USE

pollfd in use

Memory Access Primitives

The following interfaces provide a mechanism for transferring between 8 bits and 64 bits of data. The get interfaces use a repeat count (rep_cnt) to indicate the number of data items of a given size the process will read from successive locations. The locations begin at byte offset offset in the imported segment. The data is written to successive locations that begin at datap. The put interfaces use a repeat count (rep_cnt). The count indicates the number of data items the process will read from successive locations. The locations begin at datap. The data is then written to the imported segment at successive locations. The locations begin at the byte offset specified by the offset argument.

These interfaces also provide byte swapping in case the source and destination have incompatible endian characteristics.
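The application does not perform the swap itself. The following sketch only illustrates what byte swapping means for 16-bit and 32-bit items; the function names are hypothetical and are not RSMAPI calls.

```c
#include <stdint.h>

/* Reverse the two bytes of a 16-bit value. */
static uint16_t
swap16(uint16_t v)
{
    return ((uint16_t)((v >> 8) | (v << 8)));
}

/* Reverse the four bytes of a 32-bit value. */
static uint32_t
swap32(uint32_t v)
{
    return ((v >> 24) | ((v >> 8) & 0x0000FF00u) |
        ((v << 8) & 0x00FF0000u) | (v << 24));
}
```

Swapping a value twice returns the original value, so the same operation converts in either direction between the two byte orders.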

Function Prototypes:

int rsm_memseg_import_get8(rsm_memseg_import_handle_t im_memseg, off_t offset, uint8_t *datap, ulong_t rep_cnt);
int rsm_memseg_import_get16(rsm_memseg_import_handle_t im_memseg, off_t offset, uint16_t *datap, ulong_t rep_cnt);
int rsm_memseg_import_get32(rsm_memseg_import_handle_t im_memseg, off_t offset, uint32_t *datap, ulong_t rep_cnt);
int rsm_memseg_import_get64(rsm_memseg_import_handle_t im_memseg, off_t offset, uint64_t *datap, ulong_t rep_cnt);
int rsm_memseg_import_put8(rsm_memseg_import_handle_t im_memseg, off_t offset, uint8_t *datap, ulong_t rep_cnt);
int rsm_memseg_import_put16(rsm_memseg_import_handle_t im_memseg, off_t offset, uint16_t *datap, ulong_t rep_cnt);
int rsm_memseg_import_put32(rsm_memseg_import_handle_t im_memseg, off_t offset, uint32_t *datap, ulong_t rep_cnt);
int rsm_memseg_import_put64(rsm_memseg_import_handle_t im_memseg, off_t offset, uint64_t *datap, ulong_t rep_cnt);

The following interfaces are intended for data transfers that are larger than the ones supported by the segment access operations.

Segment Put

int rsm_memseg_import_put(rsm_memseg_import_handle_t im_memseg, off_t offset, void *src_addr, size_t length);

This function copies data from local memory, specified by the src_addr and length, to the corresponding imported segment locations specified by the handle and offset.

Segment Get

int rsm_memseg_import_get(rsm_memseg_import_handle_t im_memseg, off_t offset, void *dst_addr, size_t length);

This function is similar to rsm_memseg_import_put(), but data flows from the imported segment into the local region defined by the dst_addr and length arguments.

The put and get routines write or read the specified quantity of data at the byte offset specified by the offset argument, measured from the base of the segment. The offset must be aligned on the appropriate boundary. For example, rsm_memseg_import_get64() requires that offset and datap be aligned on a double-word boundary, while rsm_memseg_import_put32() requires an offset that is aligned on a word boundary.
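The alignment rule can be checked in application code before a transfer. The sketch below assumes that the required alignment equals the item size, that is, 8 bytes for the 64-bit routines and 4 bytes for the 32-bit routines. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if both the segment offset and the local buffer are aligned
 * for items of item_size bytes (1, 2, 4, or 8), 0 otherwise.
 */
static int
is_transfer_aligned(size_t offset, const void *datap, size_t item_size)
{
    return (offset % item_size == 0 &&
        (uintptr_t)datap % item_size == 0);
}
```

With an 8-byte item size, an offset of 16 passes the check and an offset of 12 does not. The same offset of 12 passes for a 4-byte item size.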

By default, the barrier mode attribute of a segment is implicit. Implicit barrier mode means that the caller assumes the data transfer has completed or has failed upon return from the operation. Because the default barrier mode is implicit, the application must initialize the barrier. The application initializes the barrier by using the rsm_memseg_import_init_barrier() function before calling put or get routines when using the default mode. To use the explicit operation mode, the caller must use a barrier operation to force the completion of a transfer. After forcing the completion of the transfer, the caller must determine if any errors have occurred as a result of the forced completion.


Note –

An import segment can be partially mapped by passing an offset in the rsm_memseg_import_map() routine. If the import segment is partially mapped, the offset argument in the put or get routines is from the base of the segment. The user must make sure that the correct byte offset is passed to put and get routines.
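One way to keep the offsets consistent is to convert addresses inside the mapped window back to segment offsets in a single place. The helper below is a hypothetical sketch: mapped_base stands for the address returned by the map operation and map_offset for the offset that was passed to rsm_memseg_import_map().

```c
#include <stddef.h>

/*
 * Convert an address inside a partial mapping to the byte offset that
 * the put and get routines expect, measured from the base of the segment.
 */
static size_t
segment_offset_of(const void *mapped_base, size_t map_offset,
    const void *addr)
{
    return (map_offset +
        (size_t)((const char *)addr - (const char *)mapped_base));
}
```

For a window mapped at segment offset 8192, the address 16 bytes into the window corresponds to segment offset 8208.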


Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_ADDR

Bad address

RSMERR_BAD_MEM_ALIGNMENT

Invalid memory alignment

RSMERR_BAD_OFFSET

Invalid offset

RSMERR_BAD_LENGTH

Invalid length

RSMERR_PERM_DENIED

Permission denied

RSMERR_BARRIER_UNINITIALIZED

Barrier not initialized

RSMERR_BARRIER_FAILURE

I/O completion error

RSMERR_CONN_ABORTED

Connection aborted

RSMERR_INSUFFICIENT_RESOURCES

Insufficient resources

Scatter-Gather Access

The rsm_memseg_import_putv() and rsm_memseg_import_getv() functions allow the use of a list of I/O requests instead of a single source and single destination address.

Function Prototypes:

int rsm_memseg_import_putv(rsm_scat_gath_t *sg_io);
int rsm_memseg_import_getv(rsm_scat_gath_t *sg_io);

The I/O vector component of the scatter-gather list (sg_io) enables the specification of local virtual addresses or local_memory_handles. Handles are an efficient way to repeatedly use a local address range. Allocated system resources, such as locked down local memory, are maintained until the handle is freed. The supporting functions for handles are rsm_create_localmemory_handle() and rsm_free_localmemory_handle().

You can gather virtual addresses or handles into the vector in order to write to a single remote segment. You can also scatter the results of reading from a single remote segment to the vector of virtual addresses or handles.

I/O for the entire vector is initiated before returning. The barrier mode attribute of the import segment determines whether the I/O has completed before the function returns. Setting the barrier mode attribute to implicit guarantees that data transfer is completed in the order entered in the vector. An implicit barrier open and close surrounds each list entry. If an error is detected, I/O for the vector is terminated and the function returns immediately. The residual count indicates the number of entries for which the I/O either did not complete or was not initiated.
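The relationship between an error and the residual count can be modeled in a few lines. The following is a toy model, not the library implementation: entries are processed in order, processing stops at the first failure, and the residual count is the number of entries that were not completed. The names entry_io_fn, run_vector, and fail_at are hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-entry transfer callback: returns 0 on success. */
typedef int (*entry_io_fn)(size_t index, void *arg);

/*
 * Process request_count entries in order and stop at the first failure.
 * Returns the residual count: entries that failed or were never started.
 */
static size_t
run_vector(size_t request_count, entry_io_fn io, void *arg)
{
    size_t done = 0;

    while (done < request_count && io(done, arg) == 0)
        done++;
    return (request_count - done);
}

/* Example callback: fail at the index stored in *arg. */
static int
fail_at(size_t index, void *arg)
{
    return (index == *(const size_t *)arg ? -1 : 0);
}
```

With five entries and a failure at index 3, entries 0 through 2 complete and the model reports a residual count of 2.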

You can specify that a notification event be sent to the target segment when a putv or getv operation is successful. To specify the delivery of a notification event, specify the RSM_IMPLICIT_SIGPOST value in the flags entry of the rsm_scat_gath_t structure. The flags entry can also contain the value RSM_SIGPOST_NO_ACCUMULATE, which is passed on to the signal post operation if RSM_IMPLICIT_SIGPOST is set.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SGIO

Invalid scatter-gather structure pointer

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_CTLR_HNDL

Invalid controller handle

RSMERR_BAD_ADDR

Bad address

RSMERR_BAD_OFFSET

Invalid offset

RSMERR_BAD_LENGTH

Invalid length

RSMERR_PERM_DENIED

Permission denied

RSMERR_BARRIER_FAILURE

I/O completion error

RSMERR_CONN_ABORTED

Connection aborted

RSMERR_INSUFFICIENT_RESOURCES

Insufficient resources

RSMERR_INTERRUPTED

Operation interrupted by signal

Get Local Handle

int rsm_create_localmemory_handle(rsmapi_controller_handle_t cntrl_handle, rsm_localmemory_handle_t *local_handle, caddr_t local_vaddr, size_t length);

This function obtains a local handle for use in the I/O vector for subsequent calls to putv or getv. Freeing the handle as soon as possible conserves system resources, notably the memory spanned by the local handle, which might be locked down.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_CTLR_HNDL

Invalid controller handle

RSMERR_BAD_LOCALMEM_HNDL

Invalid local memory handle

RSMERR_BAD_LENGTH

Invalid length

RSMERR_BAD_ADDR

Invalid address

RSMERR_INSUFFICIENT_MEM

Insufficient memory

Free Local Handle

int rsm_free_localmemory_handle(rsmapi_controller_handle_t cntrl_handle, rsm_localmemory_handle_t handle);

This function releases the system resources associated with the local handle. While all handles that belong to a process are freed when the process exits, calling this function conserves system resources.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_CTLR_HNDL

Invalid controller handle

RSMERR_BAD_LOCALMEM_HNDL

Invalid local memory handle

The following example demonstrates the definition of primary data structures.


Example 2–1 Primary Data Structures

typedef void *rsm_localmemory_handle_t;

typedef struct {
    int    io_type;                  /* HANDLE or VA_IMMEDIATE */
    union {
        rsm_localmemory_handle_t    handle;          /* used with HANDLE */
        caddr_t                     virtual_addr;    /* used with VA_IMMEDIATE */
    } local;
    size_t     local_offset;             /* offset from handle base vaddr */
    size_t     import_segment_offset;    /* offset from segment base vaddr */
    size_t     transfer_length;
} rsm_iovec_t;

typedef struct {
    ulong_t    io_request_count;     /* number of rsm_iovec_t entries */
    ulong_t    io_residual_count;    /* rsm_iovec_t entries not completed */
    int        flags;
    rsm_memseg_import_handle_t    remote_handle;    /* opaque handle for import segment */
    rsm_iovec_t    *iovec;           /* pointer to array of rsm_iovec_t */
} rsm_scat_gath_t;
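The following sketch shows how a two-entry list might be filled in. So that the sketch compiles without the RSMAPI headers, it uses local stand-in types that mirror the fields of the structures above. The demo_ names and the io_type values are placeholders, not RSMAPI definitions.

```c
#include <stddef.h>

/* Placeholder io_type values; the real constants come from the headers. */
enum { DEMO_HANDLE = 0, DEMO_VA_IMMEDIATE = 1 };

/* Stand-in for rsm_iovec_t with the same fields. */
typedef struct {
    int    io_type;
    union {
        void *handle;
        char *virtual_addr;
    } local;
    size_t local_offset;
    size_t import_segment_offset;
    size_t transfer_length;
} demo_iovec_t;

/* Stand-in for rsm_scat_gath_t with the same fields. */
typedef struct {
    unsigned long io_request_count;
    unsigned long io_residual_count;
    int           flags;
    void         *remote_handle;
    demo_iovec_t *iovec;
} demo_scat_gath_t;

/* Describe one local buffer by virtual address. */
static demo_iovec_t
demo_va_entry(char *buf, size_t len, size_t seg_off)
{
    demo_iovec_t e;

    e.io_type = DEMO_VA_IMMEDIATE;
    e.local.virtual_addr = buf;
    e.local_offset = 0;
    e.import_segment_offset = seg_off;
    e.transfer_length = len;
    return (e);
}

/* Sum the bytes requested by every entry in the list. */
static size_t
demo_total_length(const demo_scat_gath_t *sg)
{
    size_t total = 0;
    unsigned long i;

    for (i = 0; i < sg->io_request_count; i++)
        total += sg->iovec[i].transfer_length;
    return (total);
}
```

A caller would build one entry per local buffer, store the array in iovec, and set io_request_count to the number of entries before passing the list to a putv or getv call.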

Segment Mapping

Mapping operations are only available for native architecture interconnects such as Dolphin-SCI or NewLink. Mapping a segment grants CPU memory operations access to that segment, saving the overhead of calling memory access primitives.

Imported Segment Map

int rsm_memseg_import_map(rsm_memseg_import_handle_t im_memseg, void **address, rsm_attribute_t attr, rsm_permission_t perm, off_t offset, size_t length);

This function maps an imported segment into the caller address space. If the attribute RSM_MAP_FIXED is specified, the function maps the segment at the value specified in *address.

typedef enum {
    RSM_MAP_NONE  = 0x0,    /* system will choose available virtual address */
    RSM_MAP_FIXED = 0x1,    /* map segment at specified virtual address */
} rsm_map_attr_t;

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_ADDR

Invalid address

RSMERR_BAD_LENGTH

Invalid length

RSMERR_BAD_OFFSET

Invalid offset

RSMERR_BAD_PERMS

Invalid permissions

RSMERR_SEG_ALREADY_MAPPED

Segment already mapped

RSMERR_SEG_NOT_CONNECTED

Segment not connected

RSMERR_CONN_ABORTED

Connection aborted

RSMERR_MAP_FAILED

Error during mapping

RSMERR_BAD_MEM_ALIGNMENT

Address not aligned on page boundary
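
RSMERR_BAD_MEM_ALIGNMENT indicates that an address was not aligned on a page boundary. When RSM_MAP_FIXED is requested, a caller might therefore check the candidate address before calling rsm_memseg_import_map(). The helpers below are a minimal sketch of such a check that uses sysconf(3C). The names is_page_aligned() and page_floor() are illustrative and are not part of RSMAPI.

```c
#include <stdint.h>
#include <unistd.h>

/* Return the system page size in bytes, or 0 if it cannot be determined. */
uintptr_t sketch_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    return sz > 0 ? (uintptr_t)sz : 0;
}

/* Nonzero if addr lies on a page boundary. */
int is_page_aligned(const void *addr)
{
    uintptr_t sz = sketch_page_size();
    return sz != 0 && ((uintptr_t)addr % sz) == 0;
}

/* Round addr down to the start of the page that contains it. */
void *page_floor(const void *addr)
{
    uintptr_t sz = sketch_page_size();
    uintptr_t a = (uintptr_t)addr;

    if (sz == 0)
        return (void *)a;
    return (void *)(a - (a % sz));
}
```

An address that fails is_page_aligned() can be replaced by page_floor() of that address, or the caller can fall back to RSM_MAP_NONE and let the system choose the address.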

Unmap Segment

int rsm_memseg_import_unmap(rsm_memseg_import_handle_t im_memseg);

This function unmaps an imported segment from user virtual address space.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

Barrier Operations

Use barrier operations to resolve order-of-write-access memory model issues. Barrier operations also provide remote memory access error detection.

The barrier mechanism is made up of the initialization, open, close, order, and destroy operations that are described in the following sections.

The open and close operations define a span-of-time interval for error detection and ordering. The initialization operation enables barrier creation for each imported segment, as well as barrier type specification. The only barrier type currently supported has a span-of-time scope per segment. Use a type argument value of RSM_BAR_DEFAULT.

Successfully performing a close operation guarantees the successful completion of covered access operations, which take place between the barrier open and the barrier close. After a barrier open operation, failures of individual data access operations, both reads and writes, are not reported until the barrier close operation.

To impose a specific order of write completion within a barrier's scope, use an explicit barrier-order operation. A write operation that is issued before the barrier-order operation finishes before operations that are issued after the barrier-order operation. Write operations within a given barrier scope are ordered with respect to another barrier scope.

Initialize Barrier

int rsm_memseg_import_init_barrier(rsm_memseg_import_handle_t im_memseg, rsm_barrier_type_t type, rsmapi_barrier_t *barrier);

Note –

At present, RSM_BAR_DEFAULT is the only supported type.


Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_BARRIER_PTR

Invalid barrier pointer

RSMERR_INSUFFICIENT_MEM

Insufficient memory

Open Barrier

int rsm_memseg_import_open_barrier(rsmapi_barrier_t *barrier);

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_BARRIER_PTR

Invalid barrier pointer

Close Barrier

int rsm_memseg_import_close_barrier(rsmapi_barrier_t *barrier);

This function closes the barrier and flushes all store buffers. If the call to rsm_memseg_import_close_barrier() fails, the calling process is expected to retry all remote memory operations that were issued since the last rsm_memseg_import_open_barrier() call.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_BARRIER_PTR

Invalid barrier pointer

RSMERR_BARRIER_UNINITIALIZED

Barrier not initialized

RSMERR_BARRIER_NOT_OPENED

Barrier not opened

RSMERR_BARRIER_FAILURE

Memory access error

RSMERR_CONN_ABORTED

Connection aborted
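
Because a failed close means that the operations issued since the last open must be repeated, applications typically wrap the open, access, and close steps in a retry loop. The following is a minimal sketch of such a loop. The three steps are supplied as callbacks so that the fragment compiles without the RSM library; in a real program they would wrap rsm_memseg_import_open_barrier(), the remote reads and writes, and rsm_memseg_import_close_barrier(). All names in the sketch, including the sim_* functions that simulate a failing close, are hypothetical.

```c
#include <stddef.h>

/* One step of a barrier-protected span; returns 0 on success. */
typedef int (*span_step_fn)(void *ctx);

struct barrier_span {
    span_step_fn open;     /* would wrap rsm_memseg_import_open_barrier()  */
    span_step_fn access;   /* the remote reads and writes being protected  */
    span_step_fn close;    /* would wrap rsm_memseg_import_close_barrier() */
};

/*
 * Run open -> access -> close, repeating the whole span when it fails.
 * Returns 0 on success, or the last nonzero status once max_attempts
 * spans have been tried. Returns -1 if max_attempts is not positive.
 */
int run_span_with_retry(const struct barrier_span *span, void *ctx,
                        int max_attempts)
{
    int rc = -1;

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = span->open(ctx);
        if (rc != 0)
            return rc;                     /* no span was started */

        int access_rc = span->access(ctx);
        int close_rc = span->close(ctx);   /* always close an opened span */

        rc = (access_rc != 0) ? access_rc : close_rc;
        if (rc == 0)
            return 0;                      /* all covered accesses completed */
    }
    return rc;
}

/* Simulated steps for exercising the loop without an interconnect. */
struct sim_state {
    int close_failures_left;   /* how many closes should still fail */
    int spans_run;             /* how many times the access step ran */
};

int sim_open(void *ctx)
{
    (void)ctx;
    return 0;
}

int sim_access(void *ctx)
{
    ((struct sim_state *)ctx)->spans_run++;
    return 0;
}

int sim_close(void *ctx)
{
    struct sim_state *s = ctx;

    if (s->close_failures_left > 0) {
        s->close_failures_left--;
        return 1;   /* placeholder for an error such as RSMERR_BARRIER_FAILURE */
    }
    return 0;
}
```

A production loop would probably stop retrying on errors that cannot clear by themselves, such as RSMERR_CONN_ABORTED, instead of treating every nonzero status alike.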

Order Barrier

int rsm_memseg_import_order_barrier(rsmapi_barrier_t *barrier);

This function flushes all store buffers.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_BARRIER_PTR

Invalid barrier pointer

RSMERR_BARRIER_UNINITIALIZED

Barrier not initialized

RSMERR_BARRIER_NOT_OPENED

Barrier not opened

RSMERR_BARRIER_FAILURE

Memory access error

RSMERR_CONN_ABORTED

Connection aborted

Destroy Barrier

int rsm_memseg_import_destroy_barrier(rsmapi_barrier_t *barrier);

This function deallocates all barrier resources.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_BAD_BARRIER_PTR

Invalid barrier pointer

Set Mode

int rsm_memseg_import_set_mode(rsm_memseg_import_handle_t im_memseg, rsm_barrier_mode_t mode);

This function supports the optional explicit barrier scoping that is available in the put routines. The two valid barrier modes are RSM_BARRIER_MODE_EXPLICIT and RSM_BARRIER_MODE_IMPLICIT. The default value of the barrier mode is RSM_BARRIER_MODE_IMPLICIT. While in implicit mode, an implicit barrier open and barrier close are applied to each put operation. Before setting the barrier mode value to RSM_BARRIER_MODE_EXPLICIT, use the rsm_memseg_import_init_barrier() routine to initialize a barrier for the imported segment im_memseg.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

Get Mode

int rsm_memseg_import_get_mode(rsm_memseg_import_handle_t im_memseg, rsm_barrier_mode_t *mode);

This function obtains the current mode value for barrier scoping in the put routines.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

Event Operations

Event operations enable process synchronization on memory access events. If a process cannot use the rsm_intr_signal_wait() function, it can multiplex event waiting by obtaining a poll descriptor with rsm_memseg_get_pollfd() and using the poll system call.


Note –

Using the rsm_intr_signal_post() and rsm_intr_signal_wait() operations incurs the overhead of ioctl calls to the kernel.


Post Signal

int rsm_intr_signal_post(void *memseg, uint_t flags);

The void pointer *memseg can be type cast to either an import segment handle or an export segment handle. If *memseg refers to an import handle, this function sends a signal to the exporting process. If *memseg refers to an export handle, this function sends a signal to all importers of that segment. Setting the flags argument to RSM_SIGPOST_NO_ACCUMULATE discards this event if an event is already pending for the target segment.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_REMOTE_NODE_UNREACHABLE

Remote node not reachable
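
The effect of RSM_SIGPOST_NO_ACCUMULATE can be pictured with a small model of a per-segment pending-event count. This is only a toy model and not the RSM implementation: it assumes that events posted without the flag accumulate as a count, which the flag name suggests but this chapter does not state. The flag value and all model_* names are hypothetical.

```c
/* Hypothetical flag value for the model; not the RSMAPI definition. */
#define MODEL_SIGPOST_NO_ACCUMULATE 0x1u

struct model_segment {
    unsigned pending;   /* events posted but not yet consumed by a waiter */
};

/* Post an event; returns 1 if it was queued, 0 if it was discarded. */
int model_post(struct model_segment *seg, unsigned flags)
{
    if ((flags & MODEL_SIGPOST_NO_ACCUMULATE) && seg->pending > 0)
        return 0;       /* an event is already pending: drop this one */
    seg->pending++;
    return 1;
}

/* Consume one pending event; returns 1 if there was one, 0 otherwise. */
int model_consume(struct model_segment *seg)
{
    if (seg->pending == 0)
        return 0;
    seg->pending--;
    return 1;
}
```

In this model, a sender that only needs to tell the peer that new data exists can post with the flag set, so that a slow waiter is woken once and is not handed a backlog of identical events.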

Wait for Signal

int rsm_intr_signal_wait(void *memseg, int timeout);

The void pointer *memseg can be type cast to either an import segment handle or an export segment handle. The process blocks for up to timeout milliseconds or until an event occurs. If the timeout value is -1, the process blocks until an event occurs or until interrupted.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMERR_TIMEOUT

Timer expired

RSMERR_INTERRUPTED

Wait interrupted

Get pollfd

int rsm_memseg_get_pollfd(void *memseg, struct pollfd *pollfd);

This function initializes the specified pollfd structure with a descriptor for the specified segment and the singular fixed event generated by rsm_intr_signal_post(). Use the pollfd structure with the poll system call to wait for the event signalled by rsm_intr_signal_post. If the memory segment is not currently published, the poll system call does not return a valid pollfd. Each successful call increments a pollfd reference count for the specified segment.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle
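
Once a pollfd structure has been prepared, waiting on it uses the standard poll(2) interface. The following sketch shows one way to wait on a single descriptor and to retry when the wait is interrupted by a signal. In a real program the structure would be filled in by rsm_memseg_get_pollfd(); in this sketch a pipe stands in for the segment descriptor so that the fragment runs without an interconnect. The function names are illustrative only.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait on a single prepared pollfd.
 * Returns 1 if a requested event arrived, 0 on timeout, -1 on error.
 * timeout_ms follows poll(2): -1 blocks until an event occurs.
 */
int wait_for_segment_event(struct pollfd *pfd, int timeout_ms)
{
    for (;;) {
        pfd->revents = 0;
        int n = poll(pfd, 1, timeout_ms);

        if (n > 0) {
            /* POLLERR, POLLHUP, and POLLNVAL can be set even if not requested. */
            return (pfd->revents & pfd->events) ? 1 : -1;
        }
        if (n == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* Interrupted by a signal: poll again with the full timeout. */
    }
}

/* Stand-in demo: a pipe plays the role of the segment descriptor. */
int demo_pipe_event(void)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;

    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    int before = wait_for_segment_event(&pfd, 10);    /* nothing posted yet */
    int wrote = (write(fds[1], "x", 1) == 1);
    int after = wait_for_segment_event(&pfd, 1000);   /* data is now pending */

    close(fds[0]);
    close(fds[1]);
    return (wrote && before == 0 && after == 1) ? 0 : -1;
}
```

The same helper can be extended to an array of pollfd structures when a process waits on several segments together with ordinary file descriptors.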

Release pollfd

int rsm_memseg_release_pollfd(void *memseg);

This call decrements the pollfd reference count for the specified segment. If the reference count is nonzero, operations that unpublish, destroy, or unmap the segment fail.

Return Values: Returns 0 if successful. Returns an error value otherwise.

RSMERR_BAD_SEG_HNDL

Invalid segment handle

RSMAPI General Usage Notes

These usage notes describe general considerations for the export and import sides of a shared-memory operation. These usage notes also contain general information regarding segments, file descriptors, and RSM configurable parameters.

Segment Allocation and File Descriptor Usage

The system allocates a file descriptor, which is inaccessible to the application importing or exporting memory, for each export operation or import operation. The default limit on file descriptor allocation for each process is 256. The importing or exporting application must adjust the allocation limit appropriately. If the application increases the file descriptor limit beyond 256, the values of the file descriptors that are allocated for export segments and import segments start at 256. These file descriptor values are chosen to avoid interfering with normal file descriptor allocation by the application. This behavior accommodates the use of certain libc functions in 32-bit applications that only work with file descriptor values lower than 256.
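
One way for an application to adjust its file descriptor limit is the standard getrlimit(2) and setrlimit(2) interface. The following sketch raises the soft RLIMIT_NOFILE limit to a requested value without exceeding the hard limit. The helper name ensure_fd_limit() is illustrative, and the value that an application needs depends on how many segments it exports and imports.

```c
#include <sys/resource.h>

/*
 * Make sure the soft RLIMIT_NOFILE limit is at least `wanted`, clamped to
 * the hard limit. On success returns 0 and stores the resulting soft limit
 * in *result; returns -1 if the limit cannot be read or changed.
 */
int ensure_fd_limit(rlim_t wanted, rlim_t *result)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    if (rl.rlim_cur < wanted) {
        /* Raise the soft limit, but never beyond the hard limit. */
        rl.rlim_cur = (wanted < rl.rlim_max) ? wanted : rl.rlim_max;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
    }
    *result = rl.rlim_cur;
    return 0;
}
```

Because the result can be lower than the requested value when the hard limit is smaller, the caller should compare *result with the number of descriptors it expects to need.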

Export-Side Considerations

The application must prevent access to segment data until the rebind operation is complete. Segment data access during rebind does not cause a system failure, but data content results are undefined. The virtual address space must be currently mapped and valid.

Import-Side Considerations

The controller that is specified for a segment import must have a physical connection with the controller that is used in the export of the segment.

RSM Configurable Parameters

The SUNWrsm software package includes an rsm.conf file, the configuration file for RSM, which is located in /usr/kernel/drv. The rsm.conf file can be used to specify values for certain configurable RSM properties. The configurable properties currently defined in rsm.conf include max-exported-memory and enable-dynamic-reconfiguration.

max-exported-memory

This property specifies an upper limit on the amount of exportable memory. The upper limit is expressed as a percentage of total available memory. Giving this property a value of zero indicates that the amount of exportable memory is unlimited.
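
The arithmetic implied by this property can be sketched as follows. The function converts a percentage into a byte limit and treats zero as unlimited. How the RSM driver itself measures total available memory is not described here, so the total_bytes argument, the helper name, and the EXPORT_UNLIMITED sentinel are assumptions made for illustration.

```c
#include <stdint.h>

#define EXPORT_UNLIMITED UINT64_MAX   /* sketch-only sentinel */

/*
 * Translate a max-exported-memory percentage into a byte limit.
 * A percentage of 0 means that exportable memory is unlimited.
 * The sketch assumes that percent is in the range 0 through 100.
 */
uint64_t export_limit_bytes(uint64_t total_bytes, unsigned percent)
{
    if (percent == 0)
        return EXPORT_UNLIMITED;

    /* Split the multiplication to avoid overflow for large totals. */
    return (total_bytes / 100) * percent
        + (total_bytes % 100) * percent / 100;
}
```

For example, a setting of 25 on a node with 1000 bytes of available memory yields a limit of 250 bytes in this model.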

enable-dynamic-reconfiguration

The value of this property indicates whether dynamic reconfiguration is enabled. A value of zero indicates dynamic reconfiguration is disabled. A value of one enables dynamic reconfiguration support. The default value for this property is one.