Programming Interfaces Guide

Chapter 8 Socket Interfaces

This chapter presents the socket interface. Sample programs are included to illustrate key points. The following topics are discussed in this chapter:


Note –

The interface that is described in this chapter is multithread safe. You can call applications that contain socket interface calls freely in a multithreaded application. Note, however, that the degree of concurrency that is available to applications is not specified.


SunOS 4 Binary Compatibility

Two major changes from the SunOS 4 environment hold true for SunOS 5.10 releases. The binary compatibility package enables dynamically linked socket applications that are based on SunOS 4 to run on SunOS 5.10.

Overview of Sockets

Sockets have been an integral part of SunOS releases since 1981. A socket is an endpoint of communication to which a name can be bound. A socket has a type and an associated process. Sockets were designed to implement the client-server model for interprocess communication where:

Sockets make network protocols available while behaving like UNIX files. Applications create sockets as sockets are needed. Sockets work with the close(2), read(2), write(2), ioctl(2), and fcntl(2) interfaces. The operating system differentiates between the file descriptors for files and the file descriptors for sockets.

Socket Libraries

The socket interface routines are in a library that must be linked with the application. The library libsocket.so is contained in /usr/lib with the rest of the system service libraries. Use libsocket.so for dynamic linking.

Socket Types

Socket types define the communication properties that are visible to a user. The Internet family sockets provide access to the TCP/IP transport protocols. The Internet family is identified by the value AF_INET6, for sockets that can communicate over both IPv6 and IPv4. The value AF_INET is also supported for source compatibility with old applications and for raw access to IPv4.

The SunOS environment supports four types of sockets:

See Selecting Specific Protocols for further information.

Interface Sets

The SunOS 5.10 platform provides two sets of socket interfaces. The BSD socket interfaces are provided and, since SunOS version 5.7, the XNS 5 (UNIX03) socket interfaces are also provided. The XNS 5 interfaces differ slightly from the BSD interfaces.

The XNS 5 socket interfaces are documented in the following man pages:

The traditional BSD Socket behavior is documented in the corresponding 3N man pages. In addition, the following new interfaces have been added to section 3N:

See the standards(5) man page for information on building applications that use the XNS 5 (UNIX03) socket interface.

Socket Basics

This section describes the use of the basic socket interfaces.

Socket Creation

The socket(3SOCKET) call creates a socket in the specified family and of the specified type.

s = socket(family, type, protocol);

If the protocol is unspecified, the system selects a protocol that supports the requested socket type. The socket handle is returned. The socket handle is a file descriptor.

The family is specified by one of the constants that are defined in sys/socket.h. Constants that are named AF_suite specify the address format to use in interpreting names:

AF_APPLETALK

Apple Computer Inc. Appletalk network

AF_INET6

Internet family for IPv6 and IPv4

AF_INET

Internet family for IPv4 only

AF_PUP

Xerox Corporation PUP internet

AF_UNIX

UNIX file system

Socket types are defined in sys/socket.h. These types, SOCK_STREAM, SOCK_DGRAM, or SOCK_RAW, are supported by AF_INET6, AF_INET, and AF_UNIX. The following example creates a stream socket in the Internet family:

s = socket(AF_INET6, SOCK_STREAM, 0);

This call results in a stream socket. The TCP protocol provides the underlying communication. Set the protocol argument to 0, the default, in most situations. You can specify a protocol other than the default, as described in Advanced Socket Topics.

Binding Local Names

A socket is created with no name. A remote process has no way to refer to a socket until an address is bound to the socket. Processes that communicate are connected through addresses. In the Internet family, a connection is composed of local and remote addresses and local and remote ports. Duplicate ordered sets, such as: protocol, local address, local port, foreign address, foreign port cannot exist. In most families, connections must be unique.

The bind(3SOCKET) interface enables a process to specify the local address of the socket. This interface forms the local address, local port set. connect(3SOCKET) and accept(3SOCKET) complete a socket's association by fixing the remote half of the address tuple. The bind(3SOCKET) call is used as follows:

bind (s, name, namelen);

The socket handle is s. The bound name is a byte string that is interpreted by the supporting protocols. Internet family names contain an Internet address and port number.

This example demonstrates binding an Internet address.

#include <sys/types.h>
#include <netinet/in.h>
...
struct sockaddr_in6 sin6;
...
s = socket(AF_INET6, SOCK_STREAM, 0);
bzero (&sin6, sizeof (sin6));
sin6.sin6_family = AF_INET6;
sin6.sin6_addr.s6_addr = in6addr_arg;
sin6.sin6_port = htons(MYPORT);
bind(s, (struct sockaddr *) &sin6, sizeof sin6);

The content of the address sin6 is described in Address Binding, where Internet address bindings are discussed.

Connection Establishment

Connection establishment is usually asymmetric, with one process acting as the client and the other as the server. The server binds a socket to a well-known address associated with the service and blocks on its socket for a connect request. An unrelated process can then connect to the server. The client requests services from the server by initiating a connection to the server's socket. On the client side, the connect(3SOCKET) call initiates a connection. In the Internet family, this connection might appear as:

struct sockaddr_in6 server;
...
connect(s, (struct sockaddr *)&server, sizeof server);

If the client's socket is unbound at the time of the connect call, the system automatically selects and binds a name to the socket. For more information, see Address Binding. This automatic selection is the usual way to bind local addresses to a socket on the client side.

To receive a client's connection, a server must perform two steps after binding its socket. The first step is to indicate how many connection requests can be queued. The second step is to accept a connection.

struct sockaddr_in6 from;
...
listen(s, 5);                /* Allow queue of 5 connections */
fromlen = sizeof(from);
newsock = accept(s, (struct sockaddr *) &from, &fromlen);

The socket handle s is the socket bound to the address to which the connection request is sent. The second parameter of listen(3SOCKET) specifies the maximum number of outstanding connections that might be queued. The from structure is filled with the address of the client. A NULL pointer might be passed. fromlen is the length of the structure.

The accept(3SOCKET) routine normally blocks processes. accept(3SOCKET) returns a new socket descriptor that is connected to the requesting client. The value of fromlen is changed to the actual size of the address.

A server cannot indicate that the server accepts connections from only specific addresses. The server can check the from address returned by accept(3SOCKET) and close a connection with an unacceptable client. A server can accept connections on more than one socket, or avoid blocking on the accept(3SOCKET) call. These techniques are presented in Advanced Socket Topics.

Connection Errors

An error is returned if the connection is unsuccessful, but an address bound by the system remains. If the connection is successful, the socket is associated with the server and data transfer can begin.

The following table lists some of the more common errors returned when a connection attempt fails.

Table 8–1 Socket Connection Errors

Socket Errors 

Error Description 

ENOBUFS

Lack of memory available to support the call. 

EPROTONOSUPPORT

Request for an unknown protocol. 

EPROTOTYPE

Request for an unsupported type of socket. 

ETIMEDOUT

No connection established in specified time. This error happens when the destination host is down or when problems in the network cause in lost transmissions. 

ECONNREFUSED

The host refused service. This error happens when a server process is not present at the requested address. 

ENETDOWN or EHOSTDOWN

These errors are caused by status information delivered by the underlying communication interface. 

ENETUNREACH or EHOSTUNREACH

These operational errors can occur because no route to the network or host exists. These errors can also occur because of status information returned by intermediate gateways or switching nodes. The status information that is returned is not always sufficient to distinguish between a network that is down and a host that is down. 

Data Transfer

This section describes the interfaces to send and receive data. You can send or receive a message with the normal read(2) and write(2) interfaces:

write(s, buf, sizeof buf);
read(s,  buf, sizeof buf);

You can also use send(3SOCKET) and recv(3SOCKET):

send(s, buf, sizeof buf, flags);
recv(s, buf, sizeof buf, flags);

send(3SOCKET) and recv(3SOCKET) are very similar to read(2) and write(2), but the flags argument is important. The flags argument, which is defined in sys/socket.h, can be specified as a nonzero value if one or more of the following is required:

MSG_OOB

Send and receive out-of-band data

MSG_PEEK

Look at data without reading

MSG_DONTROUTE

Send data without routing packets

Out-of-band data is specific to stream sockets. When MSG_PEEK is specified with a recv(3SOCKET) call, any data present is returned to the user, but treated as still unread. The next read(2) or recv(3SOCKET) call on the socket returns the same data. The option to send data without routing packets applied to the outgoing packets is currently used only by the routing table management process.

Closing Sockets

A SOCK_STREAM socket can be discarded by a close(2) interface call. If data is queued to a socket that promises reliable delivery after a close(2), the protocol continues to try to transfer the data. The data is discarded if it remains undelivered after an arbitrary period.

A shutdown(3SOCKET) closes SOCK_STREAM sockets gracefully. Both processes can acknowledge that they are no longer sending. This call has the form:

shutdown(s, how);

where how is defined as

0

Disallows further data reception

1

Disallows further data transmission

2

Disallows further transmission and further reception

Connecting Stream Sockets

The following two examples illustrate initiating and accepting an Internet family stream connection.

Figure 8–1 Connection-Oriented Communication Using Stream Sockets

This graphic depicts data flow between a client and a
server, using the accept/connect and read/write function pairs.

The following example program is a server. The server creates a socket and binds a name to the socket, then displays the port number. The program calls listen(3SOCKET) to mark the socket as ready to accept connection requests and to initialize a queue for the requests. The rest of the program is an infinite loop. Each pass of the loop accepts a new connection and removes it from the queue, creating a new socket. The server reads and displays the messages from the socket and closes the socket. The use of in6addr_any is explained in Address Binding.


Example 8–1 Accepting an Internet Stream Connection (Server)

#include <sys/types.h> 
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#define TRUE 1   
/*
 * This program creates a socket and then begins an infinite loop.
 * Each time through the loop it accepts a connection and prints
 * data from it. When the connection breaks, or the client closes
 * the connection, the program accepts a new connection.
*/
main() {
    int sock, length;
    struct sockaddr_in6 server;
    int msgsock;
    char buf[1024];
    int rval;
    /* Create socket. */
    sock = socket(AF_INET6, SOCK_STREAM, 0);
    if (sock == -1) {
        perror("opening stream socket");
        exit(1);
    }
    /* Bind socket using wildcards.*/
    bzero (&server, sizeof(server));
    server.sin6_family = AF_INET6;
    server.sin6_addr = in6addr_any;
    server.sin6_port = 0;
    if (bind(sock, (struct sockaddr *) &server, sizeof server)
            == -1) {
        perror("binding stream socket");
        exit(1);
    }
    /* Find out assigned port number and print it out. */
    length = sizeof server;
    if (getsockname(sock,(struct sockaddr *) &server, &length)
            == -1) {
        perror("getting socket name");
        exit(1);
    }
    printf("Socket port #%d\n", ntohs(server.sin6_port));
    /* Start accepting connections. */
    listen(sock, 5);
    do {
        msgsock = accept(sock,(struct sockaddr *) 0,(int *) 0);
        if (msgsock == -1)
            perror("accept");
        else do {
            memset(buf, 0, sizeof buf);
            if ((rval = read(msgsock,buf, sizeof(buf))) == -1)
                perror("reading stream message");
            if (rval == 0)
                printf("Ending connection\n");
            else
                /* assumes the data is printable */
                printf("-->%s\n", buf);
        } while (rval > 0);
        close(msgsock);
    } while(TRUE);
    /*
     * Since this program has an infinite loop, the socket "sock" is
     * never explicitly closed. However, all sockets are closed
     * automatically when a process is killed or terminates normally.
     */
    exit(0);
}

To initiate a connection, the client program in Example 8–2 creates a stream socket, then calls connect(3SOCKET), specifying the address of the socket for connection. If the target socket exists, and the request is accepted, the connection is complete. The program can now send data. Data is delivered in sequence with no message boundaries. The connection is destroyed when either socket is closed. For more information about data representation routines in this program, such as ntohl(3SOCKET), ntohs(3SOCKET), htons(3SOCKET), and htonl(3XNET), see the byteorder(3SOCKET) man page.


Example 8–2 Internet Family Stream Connection (Client)

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#define DATA "Half a league, half a league . . ."   
/* 
 * This program creates a socket and initiates a connection with 
 * the socket given in the command line. Some data are sent over the 
 * connection and then the socket is closed, ending the connection.
 * The form of the command line is: streamwrite hostname portnumber 
 * Usage: pgm host port 
 */ 
main(int argc, char *argv[])
{
    int sock, errnum, h_addr_index;
    struct sockaddr_in6 server;
    struct hostent *hp;
    char buf[1024];
    /* Create socket. */
    sock = socket( AF_INET6, SOCK_STREAM, 0);
    if (sock == -1) {
        perror("opening stream socket");
        exit(1);
    }
    /* Connect socket using name specified by command line. */
    bzero (&server, sizeof (server));
    server.sin6_family = AF_INET6;
    hp = getipnodebyname(argv[1], AF_INET6, AI_DEFAULT, &errnum);
/*
 * getipnodebyname returns a structure including the network address
 * of the specified host.
 */
    if (hp == (struct hostent *) 0) {
        fprintf(stderr, "%s: unknown host\n", argv[1]);
        exit(2);
    }
    h_addr_index = 0;
    while (hp->h_addr_list[h_addr_index] != NULL) {
        bcopy(hp->h_addr_list[h_addr_index], &server.sin6_addr,
                    hp->h_length);
        server.sin6_port = htons(atoi(argv[2]));
        if (connect(sock, (struct sockaddr *) &server,
                    sizeof (server)) == -1) {
            if (hp->h_addr_list[++h_addr_index] != NULL) {
                /* Try next address */
                continue;
            }
            perror("connecting stream socket");
            freehostent(hp);
            exit(1);
        }
        break;
    }
    freehostent(hp);
    if (write( sock, DATA, sizeof DATA) == -1)
        perror("writing on stream socket");
    close(sock);
    freehostent (hp);
    exit(0);
}

You can add support for one-to-one SCTP connections to stream sockets. The following example code adds the -p to an existing program, enabling the program to specify the protocol to use.


Example 8–3 Adding SCTP Support to a Stream Socket

#include <stdio.h>
#include <netdb.h>
#include <string.h>
#include <errno.h>

int
main(int argc, char *argv[])
{
    struct protoent *proto = NULL;
    int c;
    int s;
    int protocol;

    while ((c = getopt(argc, argv, "p:")) != -1) {
        switch (c) {
        case 'p':
            proto = getprotobyname(optarg);
            if (proto == NULL) {
                fprintf(stderr, "Unknown protocol: %s\n",
                        optarg);
                return (-1);
            }
            break;
        default:
            fprintf(stderr, "Unknown option: %c\n", c);
            return (-1);
        }
    }

    /* Use the default protocol, which is TCP, if not specified. */
    if (proto == NULL)
        protocol = 0;
    else
        protocol = proto->p_proto;

    /* Create a IPv6 SOCK_STREAM socket of the protocol. */
    if ((s = socket(AF_INET6, SOCK_STREAM, protocol)) == -1) {
        fprintf(stderr, "Cannot create SOCK_STREAM socket of type %s: "
                "%s\n", proto != NULL ? proto->p_name : "tcp",
                strerror(errno));
        return (-1);
    }
    printf("Success\n");
    return (0);
}

Input/Output Multiplexing

Requests can be multiplexed among multiple sockets or multiple files. Use select(3C) to multiplex:

#include <sys/time.h>
#include <sys/types.h>
#include <sys/select.h>
...
fd_set readmask, writemask, exceptmask;
struct timeval timeout;
...
select(nfds, &readmask, &writemask, &exceptmask, &timeout);

The first argument of select(3C) is the number of file descriptors in the lists pointed to by the next three arguments.

The second, third, and fourth arguments of select(3C) point to three sets of file descriptors: a set of descriptors to read on, a set to write on, and a set on which exception conditions are accepted. Out-of-band data is the only exceptional condition. You can designate any of these pointers as a properly cast null. Each set is a structure that contains an array of long integer bit masks. Set the size of the array with FD_SETSIZE, which is defined in select.h. The array is long enough to hold one bit for each FD_SETSIZE file descriptor.

The macros FD_SET (fd, &mask) and FD_CLR (fd, &mask) add and delete, respectively, the file descriptor fd in the set mask. The set should be zeroed before use and the macro FD_ZERO (&mask) clears the set mask.

The fifth argument of select(3C) enables the specification of a timeout value. If the timeout pointer is NULL, select(3C) blocks until a descriptor is selectable, or until a signal is received. If the fields in timeout are set to 0, select(3C) polls and returns immediately.

The select(3C) routine normally returns the number of file descriptors that are selected, or a zero if the timeout has expired. The select(3C) routine returns -1 for an error or interrupt, with the error number in errno and the file descriptor masks unchanged. For a successful return, the three sets indicate which file descriptors are ready to be read from, written to, or have exceptional conditions pending.

Test the status of a file descriptor in a select mask with the FD_ISSET (fd, &mask) macro. The macro returns a nonzero value if fd is in the set mask. Otherwise, the macro returns zero. Use select(3C) followed by a FD_ISSET (fd, &mask) macro on the read set to check for queued connect requests on a socket.

The following example shows how to select on a listening socket for readability to determine when a new connection can be picked up with a call to accept(3SOCKET). The program accepts connection requests, reads data, and disconnects on a single socket.


Example 8–4 Using select(3C) to Check for Pending Connections

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time/h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#define TRUE 1
/*
 * This program uses select to check that someone is
 * trying to connect before calling accept.
 */
main() {
    int sock, length;
    struct sockaddr_in6 server;
    int msgsock;
    char buf[1024];
    int rval;
    fd_set ready;
    struct timeval to;
    /* Open a socket and bind it as in previous examples. */
    /* Start accepting connections. */
    listen(sock, 5); 
    do {
        FD_ZERO(&ready);
        FD_SET(sock, &ready);
        to.tv_sec = 5;
        to.tv_usec = 0;
        if (select(sock + 1, &ready, (fd_set *)0, 
                   (fd_set *)0, &to) == -1) {
            perror("select");
            continue;
        }
        if (FD_ISSET(sock, &ready)) {
            msgsock = accept(sock, (struct sockaddr *)0, (int *)0);
            if (msgsock == -1)
                perror("accept");
            else do {
                memset(buf, 0, sizeof buf);
                if ((rval = read(msgsock, buf, sizeof(buf))) == -1)
                    perror("reading stream message");
                else if (rval == 0)
                    printf("Ending connection\n");
                else
                    printf("-->%s\n", buf);
            } while (rval > 0);
            close(msgsock);
        } else
            printf("Do something else\n");
        } while (TRUE);
    exit(0);
}

In previous versions of the select(3C) routine, its arguments were pointers to integers instead of pointers to fd_sets. This style of call still works if the number of file descriptors is smaller than the number of bits in an integer.

The select(3C) routine provides a synchronous multiplexing scheme. The SIGIO and SIGURG signals, which is described in Advanced Socket Topics, provide asynchronous notification of output completion, input availability, and exceptional conditions.

Datagram Sockets

A datagram socket provides a symmetric data exchange interface without requiring connection establishment. Each message carries the destination address. The following figure shows the flow of communication between server and client.

The bind(3SOCKET) step for the server is optional.

Figure 8–2 Connectionless Communication Using Datagram Sockets

This graphic depicts data flow between a server and client,
using the sendto and recvfrom functions.

Create datagram sockets as described in Socket Creation. If a particular local address is needed, the bind(3SOCKET) operation must precede the first data transmission. Otherwise, the system sets the local address or port when data is first sent. Use sendto(3SOCKET) to send data.

sendto(s, buf, buflen, flags, (struct sockaddr *) &to, tolen);

The s, buf, buflen, and flags parameters are the same as in connection-oriented sockets. The to and tolen values indicate the address of the intended recipient of the message. A locally detected error condition, such as an unreachable network, causes a return of -1 and errno to be set to the error number.

recvfrom(s, buf, buflen, flags, (struct sockaddr *) &from, &fromlen);

To receive messages on a datagram socket, recvfrom(3SOCKET) is used. Before the call, fromlen is set to the size of the from buffer. On return, fromlen is set to the size of the address from which the datagram was received.

Datagram sockets can also use the connect(3SOCKET) call to associate a socket with a specific destination address. The socket can then use the send(3SOCKET) call. Any data that is sent on the socket that does not explicitly specify a destination address is addressed to the connected peer. Only the data that is received from that peer is delivered. A socket can have only one connected address at a time. A second connect(3SOCKET) call changes the destination address. Connect requests on datagram sockets return immediately. The system records the peer's address. Neither accept(3SOCKET) nor listen(3SOCKET) are used with datagram sockets.

A datagram socket can return errors from previous send(3SOCKET) calls asynchronously while the socket is connected. The socket can report these errors on subsequent socket operations. Alternately, the socket can use an option of getsockopt(3SOCKET), SO_ERROR to interrogate the error status.

The following example code shows how to send an Internet call by creating a socket, binding a name to the socket, and sending the message to the socket.


Example 8–5 Sending an Internet Family Datagram

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#define DATA "The sea is calm, the tide is full . . ."   
/*
 * Here I send a datagram to a receiver whose name I get from
 * the command line arguments. The form of the command line is:
 * dgramsend hostname portnumber
 */
main(int argc, char *argv[])
{
    int sock, errnum;
    struct sockaddr_in6 name;
    struct hostent *hp;
    /* Create socket on which to send. */
    sock = socket(AF_INET6,SOCK_DGRAM, 0);
    if (sock == -1) {
        perror("opening datagram socket");
        exit(1);
    }
    /*
     * Construct name, with no wildcards, of the socket to ``send''
     * to. getinodebyname returns a structure including the network
     * address of the specified host. The port number is taken from
     * the command line.
     */
    hp = getipnodebyname(argv[1], AF_INET6, AI_DEFAULT, &errnum);
    if (hp == (struct hostent *) 0) {
        fprintf(stderr, "%s: unknown host\n", argv[1]);
        exit(2);
    }
    bzero (&name, sizeof (name));
    memcpy((char *) &name.sin6_addr, (char *) hp->h_addr,
       hp->h_length);
    name.sin6_family = AF_INET6;
    name.sin6_port = htons(atoi(argv[2]));
    /* Send message. */
    if (sendto(sock,DATA, sizeof DATA ,0,
        (struct sockaddr *) &name,sizeof name) == -1)
        perror("sending datagram message");
    close(sock);
    exit(0);
}

The following sample code shows how to read an Internet call by creating a socket, binding a name to the socket, and then reading from the socket.


Example 8–6 Reading Internet Family Datagrams

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
/*
 * This program creates a datagram socket, binds a name to it, then
 * reads from the socket.
 */
 main()
{
    int sock, length;
    struct sockaddr_in6 name;
    char buf[1024];
    /* Create socket from which to read. */
    sock = socket(AF_INET6, SOCK_DGRAM, 0);
    if (sock == -1) {
        perror("opening datagram socket");
        exit(1);
    }
    /* Create name with wildcards. */
    bzero (&name, sizeof (name));
    name.sin6_family = AF_INET6;
    name.sin6_addr = in6addr_any;
    name.sin6_port = 0;
    if (bind (sock, (struct sockaddr *)&name, sizeof (name)) == -1) {
        perror("binding datagram socket");
        exit(1);
    }
    /* Find assigned port value and print it out. */
    length = sizeof(name);
    if (getsockname(sock,(struct sockaddr *) &name, &length)
            == -1) {
        perror("getting socket name");
        exit(1);
    }
    printf("Socket port #%d\n", ntohs(name.sin6_port));
    /* Read from the socket. */
    if (read(sock, buf, 1024) == -1 )
        perror("receiving datagram packet");
    /* Assumes the data is printable */
    printf("-->%s\n", buf);
    close(sock);
    exit(0);
}

Standard Routines

This section describes the routines that you can use to locate and construct network addresses. Unless otherwise stated, interfaces presented in this section apply only to the Internet family.

Locating a service on a remote host requires many levels of mapping before the client and server communicate. A service has a name for human use. The service and host names must translate to network addresses. Finally, the network address must be usable to locate and route to the host. The specifics of the mappings can vary between network architectures.

Standard routines map host names to network addresses, network names to network numbers, protocol names to protocol numbers, and service names to port numbers. The standard routines also indicate the appropriate protocol to use in communicating with the server process. The file netdb.h must be included when using any of these routines.

Host and Service Names

The interfaces getaddrinfo(3SOCKET), getnameinfo(3SOCKET), gai_strerror(3SOCKET), and freeaddrinfo(3SOCKET) provide a simplified way to translate between the names and addresses of a service on a host. These interfaces are more recent than the getipnodebyname(3SOCKET), gethostbyname(3NSL), and getservbyname(3SOCKET) APIs. Both IPv6 and IPv4 addresses are handled transparently.

The getaddrinfo(3SOCKET) routine returns the combined address and port number of the specified host and service names. Because the information returned by getaddrinfo(3SOCKET) is dynamically allocated, the information must be freed by freeaddrinfo(3SOCKET) to prevent memory leaks. getnameinfo(3SOCKET) returns the host and services names associated with a specified address and port number. Call gai_strerror(3SOCKET) to print error messages based on the EAI_xxx codes returned by getaddrinfo(3SOCKET) and getnameinfo(3SOCKET).

An example of using getaddrinfo(3SOCKET) follows.

struct addrinfo        *res, *aip;
struct addrinfo        hints;
int                    error;

/* Get host address.  Any type of address will do. */
bzero(&hints, sizeof (hints));
hints.ai_flags = AI_ALL|AI_ADDRCONFIG;
hints.ai_socktype = SOCK_STREAM;

error = getaddrinfo(hostname, servicename, &hints, &res);
if (error != 0) {
    (void) fprintf(stderr, "getaddrinfo: %s for host %s service %s\n",
    gai_strerror(error), hostname, servicename);
    return (-1);
}

After processing the information returned by getaddrinfo(3SOCKET) in the structure pointed to by res, the storage should be released by freeaddrinfo(res).

The getnameinfo(3SOCKET) routine is particularly useful in identifying the cause of an error, as in the following example:

struct sockaddr_storage faddr;
int                     sock, new_sock, sock_opt;
socklen_t               faddrlen;
int                     error;
char                    hname[NI_MAXHOST];
char                    sname[NI_MAXSERV];
...
faddrlen = sizeof (faddr);
new_sock = accept(sock, (struct sockaddr *)&faddr, &faddrlen);
if (new_sock == -1) {
    if (errno != EINTR && errno != ECONNABORTED) {
        perror("accept");
    }
    continue;
}
error = getnameinfo((struct sockaddr *)&faddr, faddrlen, hname, 
    sizeof (hname), sname, sizeof (sname), 0);
if (error) {
    (void) fprintf(stderr, "getnameinfo: %s\n",
        gai_strerror(error));
} else {
    (void) printf("Connection from %s/%s\n", hname, sname);
}

Host Names – hostent

An Internet host-name-to-address mapping is represented by the hostent structure as defined in gethostent(3NSL):

struct hostent {
    char  *h_name;            /* official name of host */
    char  **h_aliases;        /* alias list */
    int   h_addrtype;         /* hostaddrtype(e.g.,AF_INET6) */
    int   h_length;           /* length of address */
    char  **h_addr_list;      /* list of addrs, null terminated */
};
/*1st addr, net byte order*/
#define h_addr h_addr_list[0]
getipnodebyname(3SOCKET)

Maps an Internet host name to a hostent structure

getipnodebyaddr(3SOCKET)

Maps an Internet host address to a hostent structure

freehostent(3SOCKET)

Frees the memory of a hostent structure

inet_ntop(3SOCKET)

Maps an Internet host address to a string

The routines return a hostent structure that contains the name of the host, its aliases, the address type, and a NULL-terminated list of variable length addresses. The list of addresses is required because a host can have many addresses. The h_addr definition is for backward compatibility, and is the first address in the list of addresses in the hostent structure.

Network Names – netent

The routines to map network names to numbers and the reverse return a netent structure:

/*
 * Assumes that a network number fits in 32 bits.
 */
struct netent {
   char     *n_name;      /* official name of net */
   char     **n_aliases;  /* alias list */
   int      n_addrtype;   /* net address type */
   int      n_net;        /* net number, host byte order */
};

getnetbyname(3SOCKET), getnetbyaddr_r(3SOCKET), and getnetent(3SOCKET) are the network counterparts to the host routines previously described.

Protocol Names – protoent

The protoent structure defines the protocol-name mapping used with getprotobyname(3SOCKET), getprotobynumber(3SOCKET), and getprotoent(3SOCKET) and defined in getprotoent(3SOCKET):

struct protoent {
    char    *p_name;       /* official protocol name */
    char    **p_aliases    /* alias list */
    int     p_proto;       /* protocol number */
};

Service Names – servent

An Internet family service resides at a specific, well-known port, and uses a particular protocol. A service-name-to-port-number mapping is described by the servent structure that is defined in getprotoent(3SOCKET):

struct servent {
    char    *s_name;        /* official service name */
    char    **s_aliases;    /* alias list */
    int     s_port;         /* port number, network byte order */
    char    *s_proto;       /* protocol to use */
};

getservbyname(3SOCKET) maps service names and, optionally, a qualifying protocol to a servent structure. The call:

sp = getservbyname("telnet", (char *) 0);

returns the service specification of a telnet server that is using any protocol. The call:

sp = getservbyname("telnet", "tcp");

returns the telnet server that uses the TCP protocol. getservbyport(3SOCKET) and getservent(3SOCKET) are also provided. getservbyport(3SOCKET) has an interface that is similar to the interface used by getservbyname(3SOCKET). You can specify an optional protocol name to qualify lookups.

Other Routines

Several other routines that simplify manipulating names and addresses are available. The following table summarizes the routines for manipulating variable-length byte strings and byte-swapping network addresses and values.

Table 8–2 Runtime Library Routines

Interface 

Synopsis 

memcmp(3C)

Compares byte-strings; 0 if same, not 0 otherwise

memcpy(3C)

Copies n bytes from s2 to s1

memset(3C)

Sets n bytes to value starting at base

htonl(3SOCKET)

32-bit quantity from host into network byte order 

htons(3SOCKET)

16-bit quantity from host into network byte order 

ntohl(3SOCKET)

32-bit quantity from network into host byte order 

ntohs(3SOCKET)

16-bit quantity from network into host byte order 

The byte-swapping routines are provided because the operating system expects addresses to be supplied in network order. On some architectures, the host byte ordering is different from network byte order, so programs must sometimes byte-swap values. Routines that return network addresses do so in network order. Byte-swapping problems occur only when interpreting network addresses. For example, the following code formats a TCP or UDP port:

printf("port number %d\n", ntohs(sp->s_port));

On machines that do not need these routines, the routines are defined as null macros.

Client-Server Programs

The most common form of distributed application is the client-server model. In this scheme, client processes request services from a server process.

An alternate scheme is a service server that can eliminate dormant server processes. An example is inetd(1M), the Internet service daemon. inetd(1M) listens at a variety of ports, determined at startup by reading a configuration file. When a connection is requested on an inetd(1M) serviced port, inetd(1M) spawns the appropriate server to serve the client. Clients are unaware that an intermediary has played any part in the connection. inetd(1M) is described in more detail in inetd Daemon.

Sockets and Servers

Most servers are accessed at well-known Internet port numbers or UNIX family names. The service rlogin is an example of a well-known UNIX family name. The main loop of a remote login server is shown in Example 8–7.

The server dissociates from the controlling terminal of its invoker unless the server is operating in DEBUG mode.

(void) close(0);
(void) close(1);
(void) close(2);
(void) open("/", O_RDONLY);
(void) dup2(0, 1);
(void) dup2(0, 2);
setsid();

Dissociating prevents the server from receiving signals from the process group of the controlling terminal. After a server has dissociated from the controlling terminal, the server cannot send reports of errors to the terminal. The dissociated server must log errors with syslog(3C).

The server gets its service definition by calling getaddrinfo(3SOCKET).

bzero(&hints, sizeof (hints));
hints.ai_flags = AI_ALL|AI_ADDRCONFIG;
hints.ai_socktype = SOCK_STREAM;
error = getaddrinfo(NULL, "rlogin", &hints, &api);

The result, which is returned in api, contains the Internet port at which the program listens for service requests. Some standard port numbers are defined in /usr/include/netinet/in.h.

The server then creates a socket, and listens for service requests. The bind(3SOCKET) routine ensures that the server listens at the expected location. Because the remote login server listens at a restricted port number, the server runs as superuser. The main body of the server is the following loop.


Example 8–7 Server Main Loop

/* Wait for a connection request. */
for (;;) {
    faddrlen = sizeof (faddr);
    new_sock = accept(sock, (struct sockaddr *)api->ai_addr,
            api->ai_addrlen)
    if (new_sock == -1) {
        if (errno != EINTR && errno != ECONNABORTED) {
            perror("rlogind: accept");
        }
        continue;
    }
    if (fork() == 0) {
        close (sock);
        doit (new_sock, &faddr);
    }
    close (new_sock);
}
/*NOTREACHED*/

accept(3SOCKET) blocks messages until a client requests service. Furthermore, accept(3SOCKET) returns a failure indication if accept is interrupted by a signal, such as SIGCHLD. The return value from accept(3SOCKET) is checked, and an error is logged with syslog(3C), if an error occurs.

The server then forks a child process, and invokes the main body of the remote login protocol processing. The socket used by the parent to queue connection requests is closed in the child. The socket created by accept(3SOCKET) is closed in the parent. The address of the client is passed to the server application's doit() routine, which authenticates the client.

Sockets and Clients

This section describes the steps taken by a client process. As in the server, the first step is to locate the service definition for a remote login.

bzero(&hints, sizeof (hints));
hints.ai_flags = AI_ALL|AI_ADDRCONFIG;
hints.ai_socktype = SOCK_STREAM;

error = getaddrinfo(hostname, servicename, &hints, &res);
if (error != 0) {
    (void) fprintf(stderr, "getaddrinfo: %s for host %s service %s\n",
        gai_strerror(error), hostname, servicename);
    return (-1);
}

getaddrinfo(3SOCKET) returns the head of a list of addresses in res. The desired address is found by creating a socket and trying to connect to each address returned in the list until one works.

for (aip = res; aip != NULL; aip = aip->ai_next) {
    /*
     * Open socket.  The address type depends on what
     * getaddrinfo() gave us.
     */
    sock = socket(aip->ai_family, aip->ai_socktype,
          aip->ai_protocol);
    if (sock == -1) {
        perror("socket");
        freeaddrinfo(res);
        return (-1);
    }

    /* Connect to the host. */
    if (connect(sock, aip->ai_addr, aip->ai_addrlen) == -1) {
        perror("connect");
        (void) close(sock);
        sock = -1;
        continue;
    }
    break;
}

The socket has been created and has been connected to the desired service. The connect(3SOCKET) routine implicitly binds sock, because sock is unbound.

Connectionless Servers

Some services use datagram sockets. The rwho(1) service provides status information on hosts that are connected to a local area network. Avoid running in.rwhod(1M) because in.rwho causes heavy network traffic. The rwho service broadcasts information to all hosts connected to a particular network. The rwho service is an example of datagram socket use.

A user on a host that is running the rwho(1) server can get the current status of another host with ruptime(1). Typical output is illustrated in the following example.


Example 8–8 Output of ruptime(1) Program

itchy up 9:45, 5 users, load 1.15, 1.39, 1.31
scratchy up 2+12:04, 8 users, load 4.67, 5.13, 4.59
click up 10:10, 0 users, load 0.27, 0.15, 0.14
clack up 2+06:28, 9 users, load 1.04, 1.20, 1.65
ezekiel up 25+09:48, 0 users, load 1.49, 1.43, 1.41
dandy 5+00:05, 0 users, load 1.51, 1.54, 1.56
peninsula down 0:24
wood down 17:04
carpediem down 16:09
chances up 2+15:57, 3 users, load 1.52, 1.81, 1.86

Status information is periodically broadcast by the rwho(1) server processes on each host. The server process also receives the status information. The server also updates a database. This database is interpreted for the status of each host. Servers operate autonomously, coupled only by the local network and its broadcast capabilities.

Use of broadcast is fairly inefficient because broadcast generates a lot of net traffic. Unless the service is used widely and frequently, the expense of periodic broadcasts outweighs the simplicity.

The following example shows a simplified version of the rwho(1) server. The sample code receives status information broadcast by other hosts on the network and supplies the status of the host on which the sample code is running. The first task is done in the main loop of the program: Packets received at the rwho(1) port are checked to be sure they were sent by another rwho(1) server process and are stamped with the arrival time. The packets then update a file with the status of the host. When a host has not been heard from for an extended time, the database routines assume the host is down and logs this information. Because a server might be down while a host is up, this application is prone to error.


Example 8–9 rwho(1) Server

main()
{
    ...
    sp = getservbyname("who", "udp");
    net = getnetbyname("localnet");
    sin.sin6_addr = inet_makeaddr(net->n_net, in6addr_any);
    sin.sin6_port = sp->s_port;
    ...
    s = socket(AF_INET6, SOCK_DGRAM, 0);
    ...
    on = 1;
    if (setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof on)
            == -1) {
        syslog(LOG_ERR, "setsockopt SO_BROADCAST: %m");
        exit(1);
    }
    bind(s, (struct sockaddr *) &sin, sizeof sin);
    ...
    signal(SIGALRM, onalrm);
    onalrm();
    while(1) {
        struct whod wd;
        int cc, whod, len = sizeof from;
        cc = recvfrom(s, (char *) &wd, sizeof(struct whod), 0,
            (struct sockaddr *) &from, &len);
        if (cc <= 0) {
            if (cc == -1 && errno != EINTR)
                syslog(LOG_ERR, "rwhod: recv: %m");
            continue;
        }
        if (from.sin6_port != sp->s_port) {
            syslog(LOG_ERR, "rwhod: %d: bad from port",
                ntohs(from.sin6_port));
            continue;
        }
        ...
        if (!verify( wd.wd_hostname)) {
            syslog(LOG_ERR, "rwhod: bad host name from %x",
                ntohl(from.sin6_addr.s6_addr));
            continue;
        }
        (void) sprintf(path, "%s/whod.%s", RWHODIR, wd.wd_hostname);
        whod = open(path, O_WRONLY|O_CREAT|O_TRUNC, 0666);
        ...
        (void) time(&wd.wd_recvtime);
        (void) write(whod, (char *) &wd, cc);
        (void) close(whod);
    }
    exit(0);
}

The second server task is to supply the status of its host. This requires periodically acquiring system status information, packaging it in a message, and broadcasting it on the local network for other rwho(1) servers to hear. This task is run by a timer. This task is triggered by a signal.

Status information is broadcast on the local network. For networks that do not support broadcast, use multicast.

Advanced Socket Topics

For most programmers, the mechanisms already described are enough to build distributed applications. This section describes additional features.

Out-of-Band Data

The stream socket abstraction includes out-of-band data. Out-of-band data is a logically independent transmission channel between a pair of connected stream sockets. Out-of-band data is delivered independent of normal data. The out-of-band data facilities must support the reliable delivery of at least one out-of-band message at a time. This message can contain at least one byte of data. At least one message can be pending delivery at any time.

With in-band signaling, urgent data is delivered in sequence with normal data, and the message is extracted from the normal data stream. The extracted message is stored separately. Users can choose between receiving the urgent data in order and receiving the data out of sequence, without having to buffer the intervening data.

Using MSG_PEEK, you can peek at out-of-band data. If the socket has a process group, a SIGURG signal is generated when the protocol is notified of its existence. A process can set the process group or process ID to deliver SIGURG to with the appropriate fcntl(2) call, as described in Interrupt-Driven Socket I/O for SIGIO. If multiple sockets have out-of-band data waiting for delivery, a select(3C) call for exceptional conditions can determine which sockets have such data pending.

A logical mark is placed in the data stream at the point at which the out-of-band data was sent. The remote login and remote shell applications use this facility to propagate signals between client and server processes. When a signal is received, all data up to the mark in the data stream is discarded.

To send an out-of-band message, apply the MSG_OOB flag to send(3SOCKET) or sendto(3SOCKET). To receive out-of-band data, specify MSG_OOB to recvfrom(3SOCKET) or recv(3SOCKET). If out-of-band data is taken in line the MSG_OOB flag is not needed. The SIOCATMARK ioctl(2) indicates whether the read pointer currently points at the mark in the data stream:

int yes;
ioctl(s, SIOCATMARK, &yes);

If yes is 1 on return, the next read returns data after the mark. Otherwise, assuming out-of-band data has arrived, the next read provides data sent by the client before sending the out-of-band signal. The routine in the remote login process that flushes output on receipt of an interrupt or quit signal is shown in the following example. This code reads the normal data up to the mark to discard the normal data, then reads the out-of-band byte.

A process can also read or peek at the out-of-band data without first reading up to the mark. Accessing this data when the underlying protocol delivers the urgent data in-band with the normal data, and sends notification of its presence only ahead of time, is more difficult. An example of this type of protocol is TCP, the protocol used to provide socket streams in the Internet family. With such protocols, the out-of-band byte might not yet have arrived when recv(3SOCKET) is called with the MSG_OOB flag. In that case, the call returns the error of EWOULDBLOCK. Also, the amount of in-band data in the input buffer might cause normal flow control to prevent the peer from sending the urgent data until the buffer is cleared. The process must then read enough of the queued data to clear the input buffer before the peer can send the urgent data.


Example 8–10 Flushing Terminal I/O on Receipt of Out-of-Band Data

#include <sys/ioctl.h>
#include <sys/file.h>
...
oob()
{
    int out = FWRITE;
    char waste[BUFSIZ];
    int mark = 0;
 
    /* flush local terminal output */
    ioctl(1, TIOCFLUSH, (char *) &out);
    while(1) {
        if (ioctl(rem, SIOCATMARK, &mark) == -1) {
            perror("ioctl");
            break;
        }
        if (mark)
            break;
        (void) read(rem, waste, sizeof waste);
    }
    if (recv(rem, &mark, 1, MSG_OOB) == -1) {
        perror("recv");
        ...
    }
    ...
}

A facility to retain the position of urgent in-line data in the socket stream is available as a socket-level option, SO_OOBINLINE. See getsockopt(3SOCKET) for usage. With this socket-level option, the position of urgent data remains. However, the urgent data immediately following the mark in the normal data stream is returned without the MSG_OOB flag. Reception of multiple urgent indications moves the mark, but does not lose any out-of-band data.

Nonblocking Sockets

Some applications require sockets that do not block. For example, a server would return an error code, not executing a request that cannot complete immediately. This error could cause the process to be suspended, awaiting completion. After creating and connecting a socket, issuing a fcntl(2) call, as shown in the following example, makes the socket nonblocking.


Example 8–11 Set Nonblocking Socket

#include <fcntl.h>
#include <sys/file.h>
...
int fileflags;
int s;
...
s = socket(AF_INET6, SOCK_STREAM, 0);
...
if (fileflags = fcntl(s, F_GETFL, 0) == -1)
    perror("fcntl F_GETFL");
    exit(1);
}
if (fcntl(s, F_SETFL, fileflags | FNDELAY) == -1)
    perror("fcntl F_SETFL, FNDELAY");
    exit(1);
}

When performing I/O on a nonblocking socket, check for the error EWOULDBLOCK in errno.h, which occurs when an operation would normally block. accept(3SOCKET), connect(3SOCKET), send(3SOCKET), recv(3SOCKET), read(2), and write(2) can all return EWOULDBLOCK. If an operation such as a send(3SOCKET) cannot be done in its entirety but partial writes work, as when using a stream socket, all available data is processed. The return value is the amount of data actually sent.

Asynchronous Socket I/O

Asynchronous communication between processes is required in applications that simultaneously handle multiple requests. Asynchronous sockets must be of the SOCK_STREAM type. To make a socket asynchronous, you issue a fcntl(2) call, as shown in the following example.


Example 8–12 Making a Socket Asynchronous

#include <fcntl.h>
#include <sys/file.h>
...
int fileflags;
int s;
...
s = socket(AF_INET6, SOCK_STREAM, 0);
...
if (fileflags = fcntl(s, F_GETFL ) == -1)
    perror("fcntl F_GETFL");
    exit(1);
}
if (fcntl(s, F_SETFL, fileflags | FNDELAY | FASYNC) == -1)
    perror("fcntl F_SETFL, FNDELAY | FASYNC");
    exit(1);
}

After sockets are initialized, connected, and made nonblocking and asynchronous, communication is similar to reading and writing a file asynchronously. Initiate a data transfer by using send(3SOCKET), write(2), recv(3SOCKET), or read(2). A signal-driven I/O routine completes a data transfer, as described in the next section.

Interrupt-Driven Socket I/O

The SIGIO signal notifies a process when a socket, or any file descriptor, has finished a data transfer. The steps in using SIGIO are as follows:

  1. Set up a SIGIO signal handler with the signal(3C) or sigvec(3UCB) calls.

  2. Use fcntl(2) to set the process ID or process group ID to route the signal to its own process ID or process group ID. The default process group of a socket is group 0.

  3. Convert the socket to asynchronous, as shown in Asynchronous Socket I/O.

The following sample code enables receipt of information on pending requests as the requests occur for a socket by a given process. With the addition of a handler for SIGURG, this code can also be used to prepare for receipt of SIGURG signals.


Example 8–13 Asynchronous Notification of I/O Requests

#include <fcntl.h>
#include <sys/file.h>
 ...
signal(SIGIO, io_handler);
/* Set the process receiving SIGIO/SIGURG signals to us. */
if (fcntl(s, F_SETOWN, getpid()) < 0) {
    perror("fcntl F_SETOWN");
    exit(1);
}

Signals and Process Group ID

For SIGURG and SIGIO, each socket has a process number and a process group ID. These values are initialized to zero, but can be redefined at a later time with the F_SETOWN fcntl(2) command, as in the previous example. A positive third argument to fcntl(2) sets the socket's process ID. A negative third argument to fcntl(2) sets the socket's process group ID. The only allowed recipient of SIGURG and SIGIO signals is the calling process. A similar fcntl(2), F_GETOWN, returns the process number of a socket.

You can also enable reception of SIGURG and SIGIO by using ioctl(2) to assign the socket to the user's process group.

/* oobdata is the out-of-band data handling routine */
sigset(SIGURG, oobdata);
int pid = -getpid();
if (ioctl(client, SIOCSPGRP, (char *) &pid) < 0) {
    perror("ioctl: SIOCSPGRP");
}

Selecting Specific Protocols

If the third argument of the socket(3SOCKET) call is 0, socket(3SOCKET) selects a default protocol to use with the returned socket of the type requested. The default protocol is usually correct, and alternate choices are not usually available. When using raw sockets to communicate directly with lower-level protocols or lower-level hardware interfaces, set up de-multiplexing with the protocol argument.

Using raw sockets in the Internet family to implement a new protocol on IP ensures that the socket only receives packets for the specified protocol. To obtain a particular protocol, determine the protocol number as defined in the protocol family. For the Internet family, use one of the library routines that are discussed in Standard Routines, such as getprotobyname(3SOCKET).

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
...
pp = getprotobyname("newtcp");
s = socket(AF_INET6, SOCK_STREAM, pp->p_proto);

Using getprotobyname results in a socket s by using a stream-based connection, but with a protocol type of newtcp instead of the default tcp.

Address Binding

For addressing, TCP and UDP use a 4-tuple of:

TCP requires these 4-tuples to be unique. UDP does not. User programs do not always know proper values to use for the local address and local port, because a host can reside on multiple networks. The set of allocated port numbers is not directly accessible to a user. To avoid these problems, leave parts of the address unspecified and let the system assign the parts appropriately when needed. Various portions of these tuples can be specified by various parts of the sockets API:

bind(3SOCKET)

Local address or local port or both

connect(3SOCKET)

Foreign address and foreign port

A call to accept(3SOCKET) retrieves connection information from a foreign client. This causes the local address and port to be specified to the system even though the caller of accept(3SOCKET) did not specify anything. The foreign address and foreign port are returned.

A call to listen(3SOCKET) can cause a local port to be chosen. If no explicit bind(3SOCKET) has been done to assign local information, listen(3SOCKET) assigns an ephemeral port number.

A service that resides at a particular port can bind(3SOCKET) to that port. Such a service can leave the local address unspecified if the service does not require local address information. The local address is set to in6addr_any, a variable with a constant value in <netinet/in.h>. If the local port does not need to be fixed, a call to listen(3SOCKET) causes a port to be chosen. Specifying an address of in6addr_any or a port number of 0 is known as wildcarding. For AF_INET, INADDR_ANY is used in place of in6addr_any.

The wildcard address simplifies local address binding in the Internet family. The following sample code binds a specific port number that was returned by a call to getaddrinfo(3SOCKET) to a socket and leaves the local address unspecified:

#include <sys/types.h>
#include <netinet/in.h>
...
struct addrinfo  *aip;
...
if (bind(sock, aip->ai_addr, aip->ai_addrlen) == -1) {
    perror("bind");
    (void) close(sock);
    return (-1);
}

Each network interface on a host typically has a unique IP address. Sockets with wildcard local addresses can receive messages that are directed to the specified port number. Messages that are sent to any of the possible addresses that are assigned to a host are also received by sockets with wildcard local addresses. To allow only hosts on a specific network to connect to the server, a server binds the address of the interface on the appropriate network.

Similarly, a local port number can be left unspecified, in which case the system selects a port number. For example, to bind a specific local address to a socket, but to leave the local port number unspecified, you could use bind as follows:

bzero (&sin, sizeof (sin));
(void) inet_pton (AF_INET6, "::ffff:127.0.0.1", sin.sin6_addr.s6_addr);
sin.sin6_family = AF_INET6;
sin.sin6_port = htons(0);
bind(s, (struct sockaddr *) &sin, sizeof sin);

The system uses two criteria to select the local port number:

The port number and IP address of the client are found through either accept(3SOCKET) or getpeername(3SOCKET).

In certain cases, the algorithm used by the system to select port numbers is unsuitable for an application due to the two-step creation process for associations. For example, the Internet file transfer protocol specifies that data connections must always originate from the same local port. However, duplicate associations are avoided by connecting to different foreign ports. In this situation, the system would disallow binding the same local address and local port number to a socket if a previous data connection's socket still existed.

To override the default port selection algorithm, you must perform an option call before address binding:

int on = 1;
...
setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
bind(s, (struct sockaddr *) &sin, sizeof sin);

With this call, local addresses already in use can be bound. This binding does not violate the uniqueness requirement. The system still verifies at connect time that any other sockets with the same local address and local port do not have the same foreign address and foreign port. If the association already exists, the error EADDRINUSE is returned.

Socket Options

You can set and get several options on sockets through setsockopt(3SOCKET) and getsockopt(3SOCKET). For example, you can change the send or receive buffer space. The general forms of the calls are in the following list:

setsockopt(s, level, optname, optval, optlen);

and

getsockopt(s, level, optname, optval, optlen);

The operating system can adjust the values appropriately at any time.

The arguments of setsockopt(3SOCKET) and getsockopt(3SOCKET) calls are in the following list:

s

Socket on which the option is to be applied

level

Specifies the protocol level, such as socket level, indicated by the symbolic constant SOL_SOCKET in sys/socket.h

optname

Symbolic constant defined in sys/socket.h that specifies the option

optval

Points to the value of the option

optlen

Points to the length of the value of the option

For getsockopt(3SOCKET), optlen is a value-result argument. This argument is initially set to the size of the storage area pointed to by optval. On return, the argument's value is set to the length of storage used.

When a program needs to determine an existing socket's type, the program should invoke inetd(1M) by using the SO_TYPE socket option and the getsockopt(3SOCKET) call:

#include <sys/types.h>
#include <sys/socket.h>
 
int type, size;
 
size = sizeof (int);
if (getsockopt(s, SOL_SOCKET, SO_TYPE, (char *) &type, &size) < 0) {
    ...
}

After getsockopt(3SOCKET), type is set to the value of the socket type, as defined in sys/socket.h. For a datagram socket, type would be SOCK_DGRAM.

inetd Daemon

The inetd(1M) daemon is invoked at startup time and gets the services for which the daemon listens from the /etc/inet/inetd.conf file. The daemon creates one socket for each service that is listed in /etc/inet/inetd.conf, binding the appropriate port number to each socket. See the inetd(1M) man page for details.

The inetd(1M) daemon polls each socket, waiting for a connection request to the service corresponding to that socket. For SOCK_STREAM type sockets, inetd(1M) accepts (accept(3SOCKET)) on the listening socket, forks (fork(2)), duplicates (dup(2)) the new socket to file descriptors 0 and 1 (stdin and stdout), closes other open file descriptors, and executes (exec(2)) the appropriate server.

The primary benefit of using inetd(1M) is that services not in use do not consume machine resources. A secondary benefit is that inetd(1M) does most of the work to establish a connection. The server started by inetd(1M) has the socket connected to its client on file descriptors 0 and 1. The server can immediately read, write, send, or receive. Servers can use buffered I/O as provided by the stdio conventions, as long as the servers use fflush(3C) when appropriate.

The getpeername(3SOCKET) routine returns the address of the peer (process) connected to a socket. This routine is useful in servers started by inetd(1M). For example, you could use this routine to log the Internet address such as fec0::56:a00:20ff:fe7d:3dd2, which is conventional for representing the IPv6 address of a client. An inetd(1M) server could use the following sample code:

struct sockaddr_storage name;
int namelen = sizeof (name);
char abuf[INET6_ADDRSTRLEN];
struct in6_addr addr6;
struct in_addr addr;

if (getpeername(fd, (struct sockaddr *) &name, &namelen) == -1) {
    perror("getpeername");
    exit(1);
} else {
    addr = ((struct sockaddr_in *)&name)->sin_addr;
    addr6 = ((struct sockaddr_in6 *)&name)->sin6_addr;
    if (name.ss_family == AF_INET) {
            (void) inet_ntop(AF_INET, &addr, abuf, sizeof (abuf));
    } else if (name.ss_family == AF_INET6 &&
               IN6_IS_ADDR_V4MAPPED(&addr6)) {
            /* this is a IPv4-mapped IPv6 address */
            IN6_MAPPED_TO_IN(&addr6, &addr);
            (void) inet_ntop(AF_INET, &addr, abuf, sizeof (abuf));
    } else if (name.ss_family == AF_INET6) {
            (void) inet_ntop(AF_INET6, &addr6, abuf, sizeof (abuf));

    }
    syslog("Connection from %s\n", abuf);
}

Broadcasting and Determining Network Configuration

Broadcasting is not supported in IPv6. Broadcasting is supported only in IPv4.

Messages sent by datagram sockets can be broadcast to reach all of the hosts on an attached network. The network must support broadcast because the system provides no simulation of broadcast in software. Broadcast messages can place a high load on a network because broadcast messages force every host on the network to service the broadcast messages. Broadcasting is usually used for either of two reasons:

To send a broadcast message, create an Internet datagram socket:

s = socket(AF_INET, SOCK_DGRAM, 0);

Bind a port number to the socket:

sin.sin_family = AF_INET;
sin.sin_addr.s_addr = htonl(INADDR_ANY);
sin.sin_port = htons(MYPORT);
bind(s, (struct sockaddr *) &sin, sizeof sin);

The datagram can be broadcast on only one network by sending to the network's broadcast address. A datagram can also be broadcast on all attached networks by sending to the special address INADDR_BROADCAST, which is defined in netinet/in.h.

The system provides a mechanism to determine a number of pieces of information about the network interfaces on the system. This information includes the IP address and broadcast address. The SIOCGIFCONF ioctl(2) call returns the interface configuration of a host in a single ifconf structure. This structure contains an array of ifreq structures. Every address family supported by every network interface to which the host is connected has its own ifreq structure.

The following example shows the ifreq structures defined in net/if.h.


Example 8–14 net/if.h Header File

struct ifreq {
    #define IFNAMSIZ 16
    char ifr_name[IFNAMSIZ]; /* if name, e.g., "en0" */
    union {
        struct sockaddr ifru_addr;
        struct sockaddr ifru_dstaddr;
        char ifru_oname[IFNAMSIZ]; /* other if name */
        struct sockaddr ifru_broadaddr;
        short ifru_flags;
        int ifru_metric;
        char ifru_data[1]; /* interface dependent data */
        char ifru_enaddr[6];
    } ifr_ifru;
    #define ifr_addr ifr_ifru.ifru_addr
    #define ifr_dstaddr ifr_ifru.ifru_dstaddr
    #define ifr_oname ifr_ifru.ifru_oname
    #define ifr_broadaddr ifr_ifru.ifru_broadaddr
    #define ifr_flags ifr_ifru.ifru_flags
    #define ifr_metric ifr_ifru.ifru_metric
    #define ifr_data ifr_ifru.ifru_data
    #define ifr_enaddr ifr_ifru.ifru_enaddr
};

The call that obtains the interface configuration is:

/*
 * Do SIOCGIFNUM ioctl to find the number of interfaces
 *
 * Allocate space for number of interfaces found
 *
 * Do SIOCGIFCONF with allocated buffer
 *
 */
if (ioctl(s, SIOCGIFNUM, (char *)&numifs) == -1) {
    numifs = MAXIFS;
}
bufsize = numifs * sizeof(struct ifreq);
reqbuf = (struct ifreq *)malloc(bufsize);
if (reqbuf == NULL) {
    fprintf(stderr, "out of memory\n");
    exit(1);
}
ifc.ifc_buf = (caddr_t)&reqbuf[0];
ifc.ifc_len = bufsize;
if (ioctl(s, SIOCGIFCONF, (char *)&ifc) == -1) {
    perror("ioctl(SIOCGIFCONF)");
    exit(1);
}

After this call, buf contains an array of ifreq structures. Every network to which the host connects has an associated ifreq structure. The sort order these structures appear in is:

The value of ifc.ifc_len is set to the number of bytes used by the ifreq structures.

Each structure has a set of interface flags that indicate whether the corresponding network is up or down, point-to-point or broadcast, and so on. The following example shows ioctl(2) returning the SIOCGIFFLAGS flags for an interface specified by an ifreq structure.


Example 8–15 Obtaining Interface Flags

struct ifreq *ifr;
ifr = ifc.ifc_req;
for (n = ifc.ifc_len/sizeof (struct ifreq); --n >= 0; ifr++) {
    /*
     * Be careful not to use an interface devoted to an address
     * family other than those intended.
     */
    if (ifr->ifr_addr.sa_family != AF_INET)
        continue;
    if (ioctl(s, SIOCGIFFLAGS, (char *) ifr) < 0) {
        ...
    }
    /* Skip boring cases */
    if ((ifr->ifr_flags & IFF_UP) == 0 ||
            (ifr->ifr_flags & IFF_LOOPBACK) ||
            (ifr->ifr_flags & (IFF_BROADCAST | IFF_POINTOPOINT)) == 0)
        continue;
}

The following example uses the SIOGGIFBRDADDR ioctl(2) command to obtain the broadcast address of an interface.


Example 8–16 Broadcast Address of an Interface

if (ioctl(s, SIOCGIFBRDADDR, (char *) ifr) < 0) {
    ...
}
memcpy((char *) &dst, (char *) &ifr->ifr_broadaddr,
    sizeof ifr->ifr_broadaddr);

You can also use SIOGGIFBRDADDR ioctl(2) to get the destination address of a point-to-point interface.

After the interface broadcast address is obtained, transmit the broadcast datagram with sendto(3SOCKET):

sendto(s, buf, buflen, 0, (struct sockaddr *)&dst, sizeof dst);

Use one sendto(3SOCKET) for each interface to which the host is connected, if that interface supports the broadcast or point-to-point addressing.

Using Multicast

IP multicasting is supported only on AF_INET6 and AF_INET sockets of type SOCK_DGRAM and SOCK_RAW. IP multicasting is only supported on subnetworks for which the interface driver supports multicasting.

Sending IPv4 Multicast Datagrams

To send a multicast datagram, specify an IP multicast address in the range 224.0.0.0 to 239.255.255.255 as the destination address in a sendto(3SOCKET) call.

By default, IP multicast datagrams are sent with a time-to-live (TTL) of 1. This value prevents the datagrams from being forwarded beyond a single subnetwork. The socket option IP_MULTICAST_TTL allows the TTL for subsequent multicast datagrams to be set to any value from 0 to 255. This ability is used to control the scope of the multicasts.

u_char ttl;
setsockopt(sock, IPPROTO_IP, IP_MULTICAST_TTL, &ttl,sizeof(ttl))

Multicast datagrams with a TTL of 0 are not transmitted on any subnet, but can be delivered locally if the sending host belongs to the destination group and if multicast loopback has not been disabled on the sending socket. Multicast datagrams with a TTL greater than one can be delivered to more than one subnet if one or more multicast routers are attached to the first-hop subnet. To provide meaningful scope control, the multicast routers support the notion of TTL thresholds. These thresholds prevent datagrams with less than a certain TTL from traversing certain subnets. The thresholds enforce the conventions for multicast datagrams with initial TTL values as follows:

0

Are restricted to the same host

1

Are restricted to the same subnet

32

Are restricted to the same site

64

Are restricted to the same region

128

Are restricted to the same continent

255

Are unrestricted in scope

Sites and regions are not strictly defined and sites can be subdivided into smaller administrative units as a local matter.

An application can choose an initial TTL other than the ones previously listed. For example, an application might perform an expanding-ring search for a network resource by sending a multicast query, first with a TTL of 0 and then with larger and larger TTLs until a reply is received.

The multicast router does not forward any multicast datagram with a destination address between 224.0.0.0 and 224.0.0.255 inclusive, regardless of its TTL. This range of addresses is reserved for the use of routing protocols and other low-level topology discovery or maintenance protocols, such as gateway discovery and group membership reporting.

Each multicast transmission is sent from a single network interface, even if the host has more than one multicast-capable interface. If the host is also a multicast router and the TTL is greater than 1, a multicast can be forwarded to interfaces other than the originating interface. A socket option is available to override the default for subsequent transmissions from a given socket:

struct in_addr addr;
setsockopt(sock, IPPROTO_IP, IP_MULTICAST_IF, &addr, sizeof(addr))

where addr is the local IP address of the desired outgoing interface. Revert to the default interface by specifying the address INADDR_ANY. The local IP address of an interface is obtained with the SIOCGIFCONF ioctl. To determine if an interface supports multicasting, fetch the interface flags with the SIOCGIFFLAGS ioctl and test if the IFF_MULTICAST flag is set. This option is intended primarily for multicast routers and other system services specifically concerned with Internet topology.

If a multicast datagram is sent to a group to which the sending host itself belongs, a copy of the datagram is, by default, looped back by the IP layer for local delivery. Another socket option gives the sender explicit control over whether subsequent datagrams are looped back:

u_char loop;
setsockopt(sock, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop))

where loop is 0 to disable loopback and 1 to enable loopback. This option provides a performance benefit for applications that have only one instance on a single host by eliminating the overhead of receiving their own transmissions. Applications that can have more than one instance on a single host, or for which the sender does not belong to the destination group, should not use this option.

If the sending host belongs to the destination group on another interface, a multicast datagram sent with an initial TTL greater than 1 can be delivered to the sending host on the other interface. The loopback control option has no effect on such delivery.

Receiving IPv4 Multicast Datagrams

Before a host can receive IP multicast datagrams, the host must become a member of one or more IP multicast groups. A process can ask the host to join a multicast group by using the following socket option:

struct ip_mreq mreq;
setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq))

where mreq is the structure:

struct ip_mreq {
    struct in_addr imr_multiaddr;   /* multicast group to join */
    struct in_addr imr_interface;   /* interface to join on */
}

Each membership is associated with a single interface. You can join the same group on more than one interface. Specify the imr_interface address as INADDR_ANY to choose the default multicast interface. You can also specify one of the host's local addresses to choose a particular multicast-capable interface.

To drop a membership, use:

struct ip_mreq mreq;
setsockopt(sock, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq))

where mreq contains the same values used to add the membership. Closing a socket or killing the process that holds the socket drops the memberships associated with that socket. More than one socket can claim a membership in a particular group, and the host remains a member of that group until the last claim is dropped.

If any socket claims membership in the destination group of the datagram, the kernel IP layer accepts incoming multicast packets. A given socket's receipt of a multicast datagram depends on the socket's associated destination port and memberships, or the protocol type for raw sockets. To receive multicast datagrams sent to a particular port, bind to the local port, leaving the local address unspecified, such as INADDR_ANY.

More than one process can bind to the same SOCK_DGRAM UDP port if the bind(3SOCKET) is preceded by:

int one = 1;
setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one))

In this case, every incoming multicast or broadcast UDP datagram destined for the shared port is delivered to all sockets bound to the port. For backwards compatibility reasons, this delivery does not apply to incoming unicast datagrams. Unicast datagrams are never delivered to more than one socket, regardless of how many sockets are bound to the datagram's destination port. SOCK_RAW sockets do not require the SO_REUSEADDR option to share a single IP protocol type.

The definitions required for the new, multicast-related socket options are found in <netinet/in.h>. All IP addresses are passed in network byte-order.

Sending IPv6 Multicast Datagrams

To send an IPv6 multicast datagram, specify an IP multicast address in the range ff00::0/8 as the destination address in a sendto(3SOCKET) call.

By default, IP multicast datagrams are sent with a hop limit of one, which prevents the datagrams from being forwarded beyond a single subnetwork. The socket option IPV6_MULTICAST_HOPS allows the hop limit for subsequent multicast datagrams to be set to any value from 0 to 255. This ability is used to control the scope of the multicasts:

uint_l;
setsockopt(sock, IPPROTO_IPV6, IPV6_MULTICAST_HOPS, &hops,sizeof(hops))

You cannot transmit multicast datagrams with a hop limit of zero on any subnet, but you can deliver the datagrams locally if:

You can deliver multicast datagrams with a hop limit that is greater than one to more than one subnet if the first-hop subnet attaches to one or more multicast routers. The IPv6 multicast addresses, unlike their IPv4 counterparts, contain explicit scope information that is encoded in the first part of the address. The defined scopes are, where X is unspecified:

ffX1::0/16

Node-local scope, restricted to the same node

ffX2::0/16

Link-local scope

ffX5::0/16

Site-local scope

ffX8::0/16

Organization-local scope

ffXe::0/16

Global scope

An application can, separately from the scope of the multicast address, use different hop limit values. For example, an application might perform an expanding-ring search for a network resource by sending a multicast query, first with a hop limit of 0, and then with larger and larger hop limits, until a reply is received.

Each multicast transmission is sent from a single network interface, even if the host has more than one multicast-capable interface. If the host is also a multicast router, and the hop limit is greater than 1, a multicast can be forwarded to interfaces other than the originating interface. A socket option is available to override the default for subsequent transmissions from a given socket:

uint_t ifindex;
ifindex = if_nametoindex ("hme3");
setsockopt(sock, IPPROTO_IPV6, IPV6_MULTICAST_IF, &ifindex, sizeof(ifindex))

where ifindex is the interface index for the desired outgoing interface. Revert to the default interface by specifying the value 0.

If a multicast datagram is sent to a group to which the sending host itself belongs, a copy of the datagram is, by default, looped back by the IP layer for local delivery. Another socket option gives the sender explicit control over whether to loop back subsequent datagrams:

uint_t loop;
setsockopt(sock, IPPROTO_IPV6, IPV6_MULTICAST_LOOP, &loop, sizeof(loop))

where loop is zero to disable loopback and one to enable loopback. This option provides a performance benefit for applications that have only one instance on a single host (such as a router or a mail demon), by eliminating the overhead of receiving their own transmissions. Applications that can have more than one instance on a single host (such as a conferencing program) or for which the sender does not belong to the destination group (such as a time querying program) should not use this option.

If the sending host belongs to the destination group on another interface, a multicast datagram sent with an initial hop limit greater than 1 can be delivered to the sending host on the other interface. The loopback control option has no effect on such delivery.

Receiving IPv6 Multicast Datagrams

Before a host can receive IP multicast datagrams, the host must become a member of one, or more IP multicast groups. A process can ask the host to join a multicast group by using the following socket option:

struct ipv6_mreq mreq;
setsockopt(sock, IPPROTO_IPV6, IPV6_JOIN_GROUP, &mreq, sizeof(mreq))

where mreq is the structure:

struct ipv6_mreq {
    struct in6_addr    ipv6mr_multiaddr;    /* IPv6 multicast addr */
    unsigned int       ipv6mr_interface;    /* interface index */
}

Each membership is associated with a single interface. You can join the same group on more than one interface. Specify ipv6_interface to be 0 to choose the default multicast interface. Specify an interface index for one of the host's interfaces to choose that multicast-capable interface.

To leave a group, use:

struct ipv6_mreq mreq;
setsockopt(sock, IPPROTO_IPV6, IP_LEAVE_GROUP, &mreq, sizeof(mreq))

where mreq contains the same values used to add the membership. The socket drops associated memberships when the socket is closed, or when the process that holds the socket is killed. More than one socket can claim a membership in a particular group. The host remains a member of that group until the last claim is dropped.

The kernel IP layer accepts incoming multicast packets if any socket has claimed a membership in the destination group of the datagram. Delivery of a multicast datagram to a particular socket is determined by the destination port and the memberships associated with the socket, or by the protocol type for raw sockets. To receive multicast datagrams sent to a particular port, bind to the local port, leaving the local address unspecified, such as INADDR_ANY.

More than one process can bind to the same SOCK_DGRAM UDP port if the bind(3SOCKET) is preceded by:

int one = 1;
setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one))

In this case, all sockets that are bound to the port receive every incoming multicast UDP datagram destined to the shared port. For backward compatibility reasons, this delivery does not apply to incoming unicast datagrams. Unicast datagrams are never delivered to more than one socket, regardless of how many sockets are bound to the datagram's destination port. SOCK_RAW sockets do not require the SO_REUSEADDR option to share a single IP protocol type.

The definitions required for the new, multicast-related socket options are found in <netinet/in.h>. All IP addresses are passed in network byte-order.

Stream Control Transmission Protocol

Stream Control Transmission Protocol (SCTP) is a reliable transport protocol that provides services similar to the services provided by TCP. In addition, SCTP provides network-level fault tolerance. SCTP supports multihoming at either end of an association. The SCTP socket API supports a one-to-one socket style modeled after TCP. The SCTP socket API also supports a one-to-many socket style designed for use with signaling. The one-to-many socket style reduces the number of file descriptors used in a process. You must link the libsctp library to use SCTP function calls.

An SCTP association is set up between two endpoints. The endpoints use a four-way handshake mechanism that uses a cookie to guard against some types of denial-of-service (DoS) attacks. The endpoints can be represented by multiple IP addresses.

SCTP Stack Implementation

This section lists the details of the Solaris implementation of the IETF standard for the Stream Control Transmission Protocol (RFC 2960) and the Stream Control Transmission Protocol Checksum Change (RFC 3309). The table in this section lists all of the exceptions from RFC 2960. The SCTP protocol in the Solaris operating system fully implements RFC 3309 and any section of RFC 2960 that is not explicitly mentioned in the table.

Table 8–3 Solaris SCTP Implementation Exceptions from RFC 2960

RFC 2960 Section 

Exceptions in the Solaris Implementation 

3. SCTP Packet Format 

3.2 Chunk Field Descriptions: Solaris SCTP does not implement the optional ECNE and CWR. 

3.3.2: Solaris SCTP does not implement the Initiation (INIT) Optional ECN, Host Name Address, and Cookie Preserving parameters. 

3.3.3: Solaris SCTP does not implement the Initiation Acknowledgement, Optional ECN, and Host Name Address parameters. 

5. Association Initialization 

5.1.2, Handle Address Parameters: Section (B), Optional Host Name parameter, is not implemented. 

6. User Data Transfer 

6.8, Adler-32 Checksum Calculation: This section has been rendered obsolete by RFC 3309. 

10. Interface with Upper Layer 

Solaris SCTP implements the IETF TSVWG SCTP socket API draft. 


Note –

The Solaris 10 implementation of the TSVWG SCTP socket API is based on a version of the API draft that was published at the time when Solaris 10 first shipped.


SCTP Socket Interfaces

When the socket() call creates a socket for IPPROTO_SCTP, it calls an SCTP-specific socket creation routine. Socket calls made on an SCTP socket call the appropriate SCTP socket routine automatically. In a one-to-one socket, each socket corresponds to one SCTP association. Create a one-to-one socket by calling this function:

socket(AF_INET[6], SOCK_STREAM, IPPROTO_STCP);

In a one-to-many style socket, each socket handles multiple SCTP associations. Each association has an association identifier called sctp_assoc_t. Create a one-to-many socket by calling this function:

socket(AF_INET[6], SOCK_SEQPACKET, IPPROTO_STCP);

sctp_bindx()

int sctp_bindx(int sock, void *addrs, int addrcnt, int flags);

The sctp_bindx() function manages addresses on an SCTP socket. If the sock parameter is an IPv4 socket, the addresses passed to the sctp_bindx() function must be IPv4 addresses. If the sock parameter is an IPv6 socket, the addresses passed to the sctp_bindx() function can be either IPv4 or IPv6 addresses. When the address that is passed to the sctp_bindx() function is INADDR_ANY or IN6ADDR_ANY, the socket binds to all available addresses. Bind the SCTP endpoint with the bind(3SOCKET)

The value of the *addrs parameter is a pointer to an array of one or more socket addresses. Each address is contained in its appropriate structure. If the addresses are IPv4 addresses, they are contained in either sockaddr_in structures or sockaddr_in6 structures. If the addresses are IPv6 addresses, they are contained in sockaddr_in6 structures. The address type's family distinguishes the address length. The caller specifies the number of addresses in the array with the addrcnt parameter.

The sctp_bindx() function returns 0 on success. The sctp_bindx() function returns -1 on failure and sets the value of errno to the appropriate error code.

If the same port is not given for each socket address, the sctp_bindx() function fails and sets the value of errno to EINVAL.

The flags parameter is formed from performing the bitwise OR operation on zero or more of the following currently defined flags:

SCTP_BINDX_ADD_ADDR directs SCTP to add the given addresses to the association. SCTP_BINDX_REM_ADDR directs SCTP to remove the given addresses from the association. The two flags are mutually exclusive. If both are given, the sctp_bindx() fails and sets the value of errno to EINVAL.

A caller may not remove all addresses from an association. The sctp_bindx() function rejects such an attempt by failing and setting the value of errno to EINVAL. An application can use sctp_bindx(SCTP_BINDX_ADD_ADDR) to associate additional addresses with an endpoint after calling the bind() function. An application can use sctp_bindx(SCTP_BINDX_REM_ADDR) to remove addresses associated with a listening socket. After using sctp_bindx(SCTP_BINDX_REM_ADDR) to remove addresses, accepting new associations will not reassociate the removed address. If the endpoint supports dynamic address, using SCTP_BINDX_REM_ADDR or SCTP_BINDX_ADD_ADDR sends a message to the peer to change the peer's address lists. Adding and removing addresses from a connected association is optional functionality. Implementations that do not support this functionality return EOPNOTSUPP.

If the address family is not AF_INET or AF_INET6, the sctp_bindx() function fails and returns EAFNOSUPPORT. If the file descriptor passed to the sctp_bindx() in the sock parameter is invalid, the sctp_bindx() function fails and returns EBADF.

sctp_opt_info()

int sctp_opt_info(int sock, sctp_assoc_id_t id, int opt, void *arg, socklen_t *len);

The sctp_opt_info() function returns the SCTP level options that are associated with the socket described in the sock parameter. If the socket is a one-to-many style SCTP socket the value of the id parameter refers to a specific association. The id parameter is ignored for one-to-one style SCTP sockets. The value of the opt parameter specifies the SCTP socket option to get. The value of the arg parameter is an option-specific structure buffer that is allocated by the calling program. The value of the *len parameter is the length of the option.

The opt parameter can take on the following values:

SCTP_RTOINFO

Returns the protocol parameters that are used to initialize and bind the retransmission timeout (RTO) tunable. The protocol parameters use the following structure:

struct sctp_rtoinfo {
    sctp_assoc_t    srto_assoc_id;
    uint32_t        srto_initial;
    uint32_t        srto_max; 
    uint32_t        srto_min;
};
srto_assoc_id

The calling program provides this value, which specifies the association of interest.

srto_initial

This value is the initial RTO value.

srto_max

This value is the maximum RTO value.

srto_min

This value is the minimum RTO value.

SCTP_ASSOCINFO

Returns the association-specific parameters. The parameters use the following structure:

struct sctp_assocparams {
    sctp_assoc_t    sasoc_assoc_id;
    uint16_t        sasoc_asocmaxrxt;
    uint16_t        sasoc_number_peer_destinations;
    uint32_t        sasoc_peer_rwnd;
    uint32_t        sasoc_local_rwnd;
    uint32_t        sasoc_cookie_life;
};
sasoc_assoc_id

The calling program provides this value, which specifies the association of interest.

sasoc_assocmaxrxt

This value specifies the maximum retransmission count for the association.

sasoc_number_peer_destinations

This value specifies the number of addresses that the peer has.

sasoc_peer_rwnd

This value specifies the current value of the peer's receive window.

sasoc_local_rwnd

This value specifies the last reported receive window that the peer transmitted to.

sasoc_cookie_life

The value specifies the lifetime of the association's cookie. The value is used when issuing cookies.

All parameters that use time values are measured in milliseconds.

SCTP_DEFAULT_SEND_PARAM

Returns the default set of parameters that a call to the sendto(3SOCKET) function uses on this association. The parameters use the following structure:

struct sctp_sndrcvinfo {
    uint16_t        sinfo_stream;
    uint16_t        sinfo_ssn;
    uint16_t        sinfo_flags;
    uint32_t        sinfo_ppid;
    uint32_t        sinfo_context;
    uint32_t        sinfo_timetolive;
    uint32_t        sinfo_tsn;
    uint32_t        sinfo_cumtsn;
    sctp_assoc_t    sinfo_assoc_id;
};
sinfo_stream

This value specifies the default stream for the sendmsg() call.

sinfo_ssn

This value is always zero.

sinfo_flags

This value contains the default flags for the sendmsg() call. This flag can take on the following values:

  • MSG_UNORDERED

  • MSG_ADDR_OVER

  • MSG_ABORT

  • MSG_EOF

  • MSG_PR_SCTP

sinfo_ppid

This value is the default payload protocol identifier for the sendmsg() call.

sinfo_context

This value is the default context for the sendmsg() call.

sinfo_timetolive

This value specifies a time period in milliseconds. After this time period has passed, the message expires if its transmission has not begun. A value of zero indicates that the message does not expire. If the MSG_PR_SCTP flag is set, the message expires when its transmission has not successfully completed within the time period specified in sinfo_timetolive.

sinfo_tsn

This value is always zero.

sinfo_cumtsn

This value is always zero.

sinfo_assoc_id

This value is filled in by the calling application. This value specifies the association of interest.

SCTP_PEER_ADDR_PARAMS

Returns the parameters for a specified peer address. The parameters use the following structure:

struct sctp_paddrparams {
    sctp_assoc_t               spp_assoc_id;
    struct sockaddr_storage    spp_address;
    uint32_t                   spp_hbinterval;
    uint16_t                   spp_pathmaxrxt;
};
spp_assoc_id

The calling program provides this value, which specifies the association of interest.

spp_address

This value specifies the peer's address of interest.

spp_hbinterval

This value specifies the heartbeat interval in milliseconds.

spp_pathmaxrxt

This value specifies the maximum number of retransmissions to attempt on an address before considering the address unreachable.

SCTP_STATUS

Returns the current status information about the association. The parameters use the following structure:

struct sctp_status {
    sctp_assoc_t             sstat_assoc_id;
    int32_t                  sstat_state;
    uint32_t                 sstat_rwnd;
    uint16_t                 sstat_unackdata;
    uint16_t                 sstat_penddata;
    uint16_t                 sstat_instrms;
    uint16_t                 sstat_outstrms;
    uint32_t                 sstat_fragmentation_point;
    struct sctp_paddrinfo    sstat_primary;
};
sstat_assoc_id

The calling program provides this value, which specifies the association of interest.

sstat_state

This value is the association's current state. The association can take on the following states:

SCTP_IDLE

The SCTP endpoint does not have any association associated with it. Immediately after the call to the socket() function opens an endpoint, or after the endpoint closes, the endpoint is in this state.

SCTP_BOUND

An SCTP endpoint is bound to one or more local addresses after calling the bind().

SCTP_LISTEN

This endpoint is waiting for an association request from any remote SCTP endpoint.

SCTP_COOKIE_WAIT

This SCTP endpoint has sent an INIT chunk and is waiting for an INIT-ACK chunk.

SCTP_COOKIE_ECHOED

This SCTP endpoint has echoed the cookie that it received from its peer's INIT-ACK chunk back to the peer.

SCTP_ESTABLISHED

This SCTP endpoint can exchange data with its peer.

SCTP_SHUTDOWN_PENDING

This SCTP endpoint has received a SHUTDOWN primitive from its upper layer. This endpoint no longer accepts data from its upper layer.

SCTP_SHUTDOWN_SEND

An SCTP endpoint that was in the SCTP_SHUTDOWN_PENDING state has sent a SHUTDOWN chunk to its peer. The SHUTDOWN chunk is sent only after all outstanding data from this endpoint to its peer is acknowledged. When this endpoint's peer sends a SHUTDOWN ACK chunk, this endpoint sends a SHUTDOWN COMPLETE chunk and the association is considered closed.

SCTP_SHUTDOWN_RECEIVED

An SCTP endpoint has received a SHUTDOWN chunk from its peer. This endpoint no longer accepts new data from its user.

SCTP_SHUTDOWN_ACK_SEND

An SCTP endpoint in the SCTP_SHUTDOWN_RECEIVED state has sent the SHUTDOWN ACK chunk to its peer. The endpoint only sends the SHUTDOWN ACK chunk after the peer acknowledges all outstanding data from this endpoint. When this endpoint's peer sends a SHUTDOWN COMPLETE chunk, the association is closed.

sstat_rwnd

This value is the association peer's current receive window.

sstat_unackdata

This value is the number of unacknowledged DATA chunks.

sstat_penddata

This value is the number of DATA chunks that are awaiting receipt.

sstat_instrms

This value is the number of inbound streams.

sstat_outstrms

This value is the number of outbound streams.

sstat_fragmentation_point

If the combined size of the message, the SCTP headers, and the IP headers exceeds the value of sstat_fragmentation_point, the message fragments. This value is equal to the Path Maximum Transmission Unit (P-MTU) for the packet's destination address

sstat_primary

This value contains information about the primary peer address. This information uses the following structure:

struct sctp_paddrinfo {
    sctp_assoc_t               spinfo_assoc_id;
    struct sockaddr_storage    spinfo_address;
    int32_t                    spinfo_state;
    uint32_t                   spinfo_cwnd;
    uint32_t                   spinfo_srtt;
    uint32_t                   spinfo_rto;
    uint32_t                   spinfo_mtu;
};
spinfo_assoc_id

The calling program provides this value, which specifies the association of interest.

spinfo_address

This value is the primary peer's address.

spinfo_state

This value can take on either of the two values SCTP_ACTIVE or SCTP_INACTIVE.

spinfo_cwnd

This value is the congestion window of the peer address.

spinfo_srtt

This value is the current smoothed round-trip time calculation for the peer address. This value is expressed in milliseconds.

spinfo_rto

This value is the current retransmission timeout value for the peer address. This value is expressed in milliseconds.

spinfo_mtu

This value is the P-MTU for the peer address.

The sctp_opt_info() function returns 0 on success. The sctp_opt_info() function returns -1 on failure and sets the value of errno to the appropriate error code. If the file descriptor passed to the sctp_opt_info() in the sock parameter is invalid, the sctp_opt_info() function fails and returns EBADF. If the file descriptor passed to the sctp_opt_info() function in the sock parameter does not describe a socket, the sctp_opt_info() function fails and returns ENOTSOCK. If the association ID is invalid for a one-to-many style SCTP socket, the sctp_opt_info() function fails and sets the value of errno to EINVAL. If the input buffer length is too short for the option specified, the sctp_opt_info() function fails and sets the value of errno to EINVAL. If the address family for the peer's address is not AF_INET or AF_INET6, the sctp_opt_info() function fails and sets the value of errno to EAFNOSUPPORT.

sctp_recvmsg()

ssize_t sctp_recvmsg(int s, void *msg, size_t len, struct sockaddr *from, socklen_t *fromlen, struct sctp_sndrcvinfo *sinfo, int *msg_flags);

The sctp_recvmsg() function enables receipt of a message from the SCTP endpoint specified by the s parameter. The calling program can specify the following attributes:

msg

This parameter is the address of the message buffer.

len

This parameter is the length of the message buffer.

from

This parameter is a pointer to an address that contains the sender's address.

fromlen

This parameter is the size of the buffer associated with the address in the from parameter.

sinfo

This parameter is only active if the calling program enables sctp_data_io_events. To enable sctp_data_io_events, call the setsockopt() function with the socket option SCTP_EVENTS. When sctp_data_io_events is enabled, the application receives the contents of the sctp_sndrcvinfo structure for each incoming message. This parameter is a pointer to a sctp_sndrcvinfo structure. The structure is populated upon receipt of the message.

msg_flags

This parameter contains any message flags that are present.

The sctp_recvmsg() function returns the number of bytes it receives. The sctp_recvmsg() function returns -1 when an error occurs.

If the file descriptor passed in the s parameter is not valid, the sctp_recvmsg() function fails and sets the value of errno to EBADF. If the file descriptor passed in the s parameter does not describe a socket, the sctp_recvmsg() function fails and sets the value of errno to ENOTSOCK. If the msg_flags parameter includes the value MSG_OOB, the sctp_recvmsg() function fails and sets the value of errno to EOPNOTSUPP. If there is no established association, the sctp_recvmsg() function fails and sets the value of errno to ENOTCONN.

sctp_sendmsg()

ssize_t sctp_sendmsg(int s, const void *msg, size_t len, const struct sockaddr *to, socklen_t tolen, uint32_t ppid, uint32_t flags, uint16_t stream_no, uint32_t timetolive, uint32_t context);

The sctp_sendmsg() function enables advanced SCTP features while sending a message from an SCTP endpoint.

s

This value specifies the SCTP endpoint that is sending the message.

msg

This value contains the message sent by the sctp_sendmsg() function.

len

This value is the length of the message. This value is expressed in bytes.

to

This value is the destination address of the message.

tolen

This value is the length of the destination address.

ppid

This value is the application-specified payload protocol identifier.

stream_no

This value is the target stream for this message.

timetolive

This value is the time period after which the message expires if it has not been successfully sent to the peer. This value is expressed in milliseconds.

context

This value is returned if an error occurs during the sending of the message.

flags

This value is formed from applying the logical operation OR in bitwise fashion on zero or more of the following flag bits:

MSG_UNORDERED

When this flag is set, the sctp_sendmsg() function delivers the message unordered.

MSG_ADDR_OVER

When this flag is set, the sctp_sendmsg() function uses the address in the to parameter instead of the association's primary destination address. This flag is only used with one-to-many style SCTP sockets.

MSG_ABORT

When this flag is set, the specified association aborts by sending an ABORT signal to its peer. This flag is only used with one-to-many style SCTP sockets.

MSG_EOF

When this flag is set, the specified association enters graceful shutdown. This flag is only used with one-to-many style SCTP sockets.

MSG_PR_SCTP

When this flag is set, the message expires when its transmission has not successfully completed within the time period specified in the timetolive parameter.

The sctp_sendmsg() function returns the number of bytes it sent. The sctp_sendmsg() function returns -1 when an error occurs.

If the file descriptor passed in the s parameter is not valid, the sctp_sendmsg() function fails and sets the value of errno to EBADF. If the file descriptor passed in the s parameter does not describe a socket, the sctp_sendmsg() function fails and sets the value of errno to ENOTSOCK. If the flags parameter includes the value MSG_OOB, the sctp_sendmsg() function fails and sets the value of errno to EOPNOTSUPP. If the flags parameter includes the values MSG_ABORT or MSG_EOF for a one-to-one style socket, the sctp_sendmsg() function fails and sets the value of errno to EOPNOTSUPP. If there is no established association, the sctp_sendmsg() function fails and sets the value of errno to ENOTCONN. If the socket is shutting down, disallowing further writes, the sctp_sendmsg() function fails and sets the value of errno to EPIPE. If the socket is nonblocking and the transmit queue is full, the sctp_sendmsg() function fails and sets the value of errno to EAGAIN.

If the control message length is incorrect, the sctp_sendmsg() function fails and sets the value of errno to EINVAL. If the specified destination address does not belong to the association, the sctp_sendmsg() function fails and sets the value of errno to EINVAL. If the value of stream_no is outside the number of outbound streams that the association supports, the sctp_sendmsg() function fails and sets the value of errno to EINVAL. If the address family of the specified destination address is not AF_INET or AF_INET6, the sctp_sendmsg() function fails and sets the value of errno to EINVAL.

sctp_send()

ssize_t sctp_send(int s, const void *msg, size_t len, const struct sctp_sndrcvinfo *sinfo, int flags);

The sctp_send() function is usable by one-to-one and one-to-many style sockets. The sctp_send() function enables advanced SCTP features while sending a message from an SCTP endpoint.

s

This value specifies the socket created by the socket() function.

msg

This value contains the message sent by the sctp_send() function.

len

This value is the length of the message. This value is expressed in bytes.

sinfo

This value contains the parameters used to send the message. For a one-to-many style socket, this value can contain the association ID to which the message is being sent.

flags

This value is identical to the flags parameter in the sendmsg() function.

The sctp_send() function returns the number of bytes it sent. The sctp_send() function returns -1 when an error occurs.

If the file descriptor passed in the s parameter is not valid, the sctp_send() function fails and sets the value of errno to EBADF. If the file descriptor passed in the s parameter does not describe a socket, the sctp_send() function fails and sets the value of errno to ENOTSOCK. If the sinfo_flags field of the sinfo parameter includes the value MSG_OOB, the sctp_send() function fails and sets the value of errno to EOPNOTSUPP. If the sinfo_flags field of the sinfo parameter includes the values MSG_ABORT or MSG_EOF for a one-to-one style socket, the sctp_send() function fails and sets the value of errno to EOPNOTSUPP. If there is no established association, the sctp_send() function fails and sets the value of errno to ENOTCONN. If the socket is shutting down, disallowing further writes, the sctp_send() function fails and sets the value of errno to EPIPE. If the socket is nonblocking and the transmit queue is full, the sctp_send() function fails and sets the value of errno to EAGAIN.

If the control message length is incorrect, the sctp_send() function fails and sets the value of errno to EINVAL. If the specified destination address does not belong to the association, the sctp_send() function fails and sets the value of errno to EINVAL. If the value of stream_no is outside the number of outbound streams that the association supports, the sctp_send() function fails and sets the value of errno to EINVAL. If the address family of the specified destination address is not AF_INET or AF_INET6, the sctp_send() function fails and sets the value of errno to EINVAL.

Branched-off Association

Applications can branch an established association on a one-to-many style socket into a separate socket and file descriptor. A separate socket and file descriptor is useful for applications that have a number of sporadic message senders or receivers that need to remain under the original one-to-many style socket. The application branches off associations that carry high volume data traffic into separate socket descriptors. The application uses the sctp_peeloff() call to branch off an association into a separate socket. The new socket is a one-to-one style socket. The syntax for the sctp_peeloff() function is as follows:

int sctp_peeloff(int sock, sctp_assoc_t id);
sock

The original one-to-many style socket descriptor returned from the socket() system call

id

The identifier of the association to branch off to a separate file descriptor

The sctp_peeloff() function fails and returns EOPTNOTSUPP if the socket descriptor passed in the sock parameter is not a one-to-many style SCTP socket. The sctp_peeloff() function fails and returns EINVAL if the value of id is zero or if the value of id is greater than the maximum number of associations for the socket descriptor passed in the sock parameter. The sctp_peeloff() function fails and returns EMFILE if the function fails to create a new user file descriptor or file structure.

sctp_getpaddrs()

The sctp_getpaddrs() function returns all peer addresses in an association.

int sctp_getpaddrs(int sock, sctp_assoc_t id, void **addrs);

When the sctp_getpaddrs() function returns successfully, the value of the **addrs parameter points to a dynamically allocated packed array of sockaddr structures of the appropriate type for each address. The calling thread frees the memory with the sctp_freepaddrs() function. The **addrs parameter cannot have a value of NULL. If the socket descriptor given in sock is for an IPv4 socket, the sctp_getpaddrs() function returns IPv4 addresses. If the socket descriptor given in sock is for an IPv6 socket, the sctp_getpaddrs() function returns a mix of IPv4 and IPv6 addresses. For one-to-many style sockets, the id parameter specifies the association to query. The sctp_getpaddrs() function ignores the id parameter for one-to-one style sockets. When the sctp_getpaddrs() function returns successfully, it returns the number of peer addresses in the association. If there is no association on this socket, the sctp_getpaddrs() function returns 0 and the value of the **addrs parameter is undefined. If an error occurs, the sctp_getpaddrs() function returns -1 and the value of the **addrs parameter is undefined.

If the file descriptor passed to the sctp_getpaddrs() function in the sock parameter is invalid, the sctp_getpaddrs() function fails and returns EBADF. If the file descriptor passed to the sctp_getpaddrs() function in the sock parameter does not describe a socket, the sctp_getpaddrs() function fails and returns ENOTSOCK. If the file descriptor passed to the sctp_getpaddrs() function in the sock parameter describes a socket that is not connected, the sctp_getpaddrs() function fails and returns ENOTCONN.

sctp_freepaddrs()

The sctp_freepaddrs() function frees all of the resources that were allocated by a previous call to the sctp_getpaddrs(). The syntax for the sctp_freepaddrs() function is as follows:

void sctp_freepaddrs(void *addrs);

The *addrs parameter is an array that contains the peer addresses that are returned by the sctp_getpaddrs() function.

sctp_getladdrs()

The sctp_getladdrs() function returns all locally bound addresses on a socket. The syntax for the sctp_getladdrs() function is as follows:

int sctp_getladdrs(int sock, sctp_assoc_t id, void **addrs);

When the sctp_getladdrs() function returns successfully, the value of addrs points to a dynamically allocated packed array of sockaddr structures. The sockaddr structures are of the appropriate type for each local address. The calling application uses the sctp_freeladdrs() function to free the memory. The value of the addrs parameter must not be NULL.

If the socket referenced by the sd parameter is an IPv4 socket, the sctp_getladdrs() function returns IPv4 addresses. If the socket referenced by the sd parameter is an IPv6 socket, the sctp_getladdrs() function returns a mix of IPv4 or IPv6 addresses as appropriate.

When the sctp_getladdrs() function is invoked for a one-to-many style socket, the value of the id parameter specifies the association to query. The sctp_getladdrs() function ignores the id parameter when the function is operating on a one-to-one socket.

When the value of the id parameter is zero, the sctp_getladdrs() function returns locally bound addresses without regard to any particular association. When the sctp_getladdrs() function returns successfully, it reports the number of local addresses bound to the socket. If the socket is unbound, the sctp_getladdrs() function returns 0 and the value of *addrs is undefined. If an error occurs, the sctp_getladdrs() function returns -1 and the value of *addrs is undefined.

sctp_freeladdrs()

The sctp_freeladdrs() function frees all of the resources that were allocated by a previous call to the sctp_getladdrs(). The syntax for the sctp_freeladdrs() function is as follows:

void sctp_freeladdrs(void *addrs);

The *addrs parameter is an array that contains the peer addresses that are returned by the sctp_getladdrs() function.

Code Examples of SCTP Use

This section details two uses of SCTP sockets.


Example 8–17 SCTP Echo Client

/*
 * Copyright 2004 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

/* To enable socket features used for SCTP socket. */
#define _XPG4_2
#define __EXTENSIONS__

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <netinet/sctp.h>
#include <netdb.h>

#define BUFLEN 1024

static void
usage(char *a0)
{
    fprintf(stderr, "Usage: %s <addr>\n", a0);
}

/*
 * Read from the network.
 */
static void
readit(void *vfdp)
{
    int   fd;
    ssize_t   n;
    char   buf[BUFLEN];
    struct msghdr  msg[1];
    struct iovec  iov[1];
    struct cmsghdr  *cmsg;
    struct sctp_sndrcvinfo *sri;
    char   cbuf[sizeof (*cmsg) + sizeof (*sri)];
    union sctp_notification *snp;

    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);

    fd = *(int *)vfdp;

    /* Initialize the message header for receiving */
    memset(msg, 0, sizeof (*msg));
    msg->msg_control = cbuf;
    msg->msg_controllen = sizeof (*cmsg) + sizeof (*sri);
    msg->msg_flags = 0;
    cmsg = (struct cmsghdr *)cbuf;
    sri = (struct sctp_sndrcvinfo *)(cmsg + 1);
    iov->iov_base = buf;
    iov->iov_len = BUFLEN;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;

    while ((n = recvmsg(fd, msg, 0)) > 0) {
        /* Intercept notifications here */
        if (msg->msg_flags & MSG_NOTIFICATION) {
            snp = (union sctp_notification *)buf;
            printf("[ Receive notification type %u ]\n",
                snp->sn_type);
            continue;
        }
        msg->msg_control = cbuf;
        msg->msg_controllen = sizeof (*cmsg) + sizeof (*sri);
        printf("[ Receive echo (%u bytes): stream = %hu, ssn = %hu, "
            "flags = %hx, ppid = %u ]\n", n,
            sri->sinfo_stream, sri->sinfo_ssn, sri->sinfo_flags,
            sri->sinfo_ppid);
    }

    if (n < 0) {
        perror("recv");
        exit(1);
    }

    close(fd);
    exit(0);
}

#define MAX_STREAM 64

static void
echo(struct sockaddr_in *addr)
{
    int     fd;
    uchar_t     buf[BUFLEN];
    ssize_t    n;
    int    perr;
    pthread_t   tid;
    struct cmsghdr   *cmsg;
    struct sctp_sndrcvinfo  *sri;
    char    cbuf[sizeof (*cmsg) + sizeof (*sri)];
    struct msghdr   msg[1];
    struct iovec   iov[1];
    int    ret;
    struct sctp_initmsg   initmsg;
    struct sctp_event_subscribe events;

    /* Create a one-one SCTP socket */
    if ((fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP)) == -1) {
        perror("socket");
        exit(1);
    }

    /*
     * We are interested in association change events and we want
     * to get sctp_sndrcvinfo in each receive.
     */
    events.sctp_association_event = 1; 
    events.sctp_data_io_event = 1; 
    ret = setsockopt(fd, IPPROTO_SCTP, SCTP_EVENTS, &events,
        sizeof (events));
    if (ret < 0) {
        perror("setsockopt SCTP_EVENTS");
        exit(1);
    }

    /*
     * Set the SCTP stream parameters to tell the other side when
     * setting up the association.
     */
    memset(&initmsg, 0, sizeof(struct sctp_initmsg));
    initmsg.sinit_num_ostreams = MAX_STREAM;
    initmsg.sinit_max_instreams = MAX_STREAM;
    initmsg.sinit_max_attempts = MAX_STREAM;
    ret = setsockopt(fd, IPPROTO_SCTP, SCTP_INITMSG, &initmsg,
        sizeof(struct sctp_initmsg));
    if (ret < 0) {
        perror("setsockopt SCTP_INITMSG");
        exit(1);
    }

    if (connect(fd, (struct sockaddr *)addr, sizeof (*addr)) == -1) {
        perror("connect");
        exit(1);
    }

    /* Initialize the message header structure for sending. */
    memset(msg, 0, sizeof (*msg));
    iov->iov_base = buf;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    msg->msg_control = cbuf;
    msg->msg_controllen = sizeof (*cmsg) + sizeof (*sri);
    msg->msg_flags |= MSG_XPG4_2;

    memset(cbuf, 0, sizeof (*cmsg) + sizeof (*sri));
    cmsg = (struct cmsghdr *)cbuf;
    sri = (struct sctp_sndrcvinfo *)(cmsg + 1);

    cmsg->cmsg_len = sizeof (*cmsg) + sizeof (*sri);
    cmsg->cmsg_level = IPPROTO_SCTP;
    cmsg->cmsg_type  = SCTP_SNDRCV;

    sri->sinfo_ppid   = 1;
    /* Start sending to stream 0. */
    sri->sinfo_stream = 0;

    /* Create a thread to receive network traffic. */
    perr = pthread_create(&tid, NULL, (void *(*)(void *))readit, &fd);

    if (perr != 0) {
        fprintf(stderr, "pthread_create: %d\n", perr);
        exit(1);
    }

    /* Read from stdin and then send to the echo server. */
    while ((n = read(fileno(stdin), buf, BUFLEN)) > 0) {
        iov->iov_len = n;
        if (sendmsg(fd, msg, 0) < 0) {
            perror("sendmsg");
            exit(1);
        }
        /* Send the next message to a different stream. */
        sri->sinfo_stream = (sri->sinfo_stream + 1) % MAX_STREAM;
    }

    pthread_cancel(tid);
    close(fd);
}

int
main(int argc, char **argv)
{
    struct sockaddr_in addr[1];
    struct hostent *hp;
    int error;

    if (argc < 2) {
        usage(*argv);
        exit(1);
    }

    /* Find the host to connect to. */ 
    hp = getipnodebyname(argv[1], AF_INET, AI_DEFAULT, &error);
    if (hp == NULL) {
        fprintf(stderr, "host not found\n");
        exit(1);
    }

    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = *(ipaddr_t *)hp->h_addr_list[0];
    addr->sin_port = htons(5000);

    echo(addr);

    return (0);
}


Example 8–18 SCTP Echo Server

/*
 * Copyright 2004 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

/* To enable socket features used for SCTP socket. */
#define _XPG4_2
#define __EXTENSIONS__

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdlib.h>
#include <unistd.h>
#include <netinet/sctp.h>

#define BUFLEN 1024

/*
 * Given an event notification, print out what it is.
 */
static void
handle_event(void *buf)
{
    struct sctp_assoc_change *sac;
    struct sctp_send_failed  *ssf;
    struct sctp_paddr_change *spc;
    struct sctp_remote_error *sre;
    union sctp_notification  *snp;
    char    addrbuf[INET6_ADDRSTRLEN];
    const char   *ap;
    struct sockaddr_in  *sin;
    struct sockaddr_in6  *sin6;

    snp = buf;

    switch (snp->sn_header.sn_type) {
    case SCTP_ASSOC_CHANGE:
        sac = &snp->sn_assoc_change;
        printf("^^^ assoc_change: state=%hu, error=%hu, instr=%hu "
            "outstr=%hu\n", sac->sac_state, sac->sac_error,
            sac->sac_inbound_streams, sac->sac_outbound_streams);
        break;
    case SCTP_SEND_FAILED:
        ssf = &snp->sn_send_failed;
        printf("^^^ sendfailed: len=%hu err=%d\n", ssf->ssf_length,
            ssf->ssf_error);
        break;
    case SCTP_PEER_ADDR_CHANGE:
        spc = &snp->sn_paddr_change;
        if (spc->spc_aaddr.ss_family == AF_INET) {
            sin = (struct sockaddr_in *)&spc->spc_aaddr;
            ap = inet_ntop(AF_INET, &sin->sin_addr, addrbuf,
                INET6_ADDRSTRLEN);
        } else {
            sin6 = (struct sockaddr_in6 *)&spc->spc_aaddr;
            ap = inet_ntop(AF_INET6, &sin6->sin6_addr, addrbuf,
                INET6_ADDRSTRLEN);
        }
        printf("^^^ intf_change: %s state=%d, error=%d\n", ap,
            spc->spc_state, spc->spc_error);
        break;
    case SCTP_REMOTE_ERROR:
        sre = &snp->sn_remote_error;
        printf("^^^ remote_error: err=%hu len=%hu\n",
            ntohs(sre->sre_error), ntohs(sre->sre_length));
        break;
    case SCTP_SHUTDOWN_EVENT:
        printf("^^^ shutdown event\n");
        break;
    default:
        printf("unknown type: %hu\n", snp->sn_header.sn_type);
        break;
    }
}

/*
 * Receive a message from the network.
 */
static void *
getmsg(int fd, struct msghdr *msg, void *buf, size_t *buflen,
    ssize_t *nrp, size_t cmsglen)
{
    ssize_t  nr = 0;
    struct iovec iov[1];

    *nrp = 0;
    iov->iov_base = buf;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;

    /* Loop until a whole message is received. */
    for (;;) {
        msg->msg_flags = MSG_XPG4_2;
        msg->msg_iov->iov_len = *buflen;
        msg->msg_controllen = cmsglen;

        nr += recvmsg(fd, msg, 0);
        if (nr <= 0) {
            /* EOF or error */
            *nrp = nr;
            return (NULL);
        }

        /* Whole message is received, return it. */
        if (msg->msg_flags & MSG_EOR) {
            *nrp = nr;
            return (buf);
        }

        /* Maybe we need a bigger buffer, do realloc(). */
        if (*buflen == nr) {
            buf = realloc(buf, *buflen * 2);
            if (buf == 0) {
                fprintf(stderr, "out of memory\n");
                exit(1);
            }
            *buflen *= 2;
        }

        /* Set the next read offset */
        iov->iov_base = (char *)buf + nr;
        iov->iov_len = *buflen - nr;

    }
}

/*
 * The echo server.
 */
static void
echo(int fd)
{
    ssize_t   nr;
    struct sctp_sndrcvinfo *sri;
    struct msghdr  msg[1];
    struct cmsghdr  *cmsg;
    char   cbuf[sizeof (*cmsg) + sizeof (*sri)];
    char   *buf;
    size_t   buflen;
    struct iovec  iov[1];
    size_t   cmsglen = sizeof (*cmsg) + sizeof (*sri);

    /* Allocate the initial data buffer */
    buflen = BUFLEN;
    if ((buf = malloc(BUFLEN)) == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }

    /* Set up the msghdr structure for receiving */
    memset(msg, 0, sizeof (*msg));
    msg->msg_control = cbuf;
    msg->msg_controllen = cmsglen;
    msg->msg_flags = 0;
    cmsg = (struct cmsghdr *)cbuf;
    sri = (struct sctp_sndrcvinfo *)(cmsg + 1);

    /* Wait for something to echo */
    while ((buf = getmsg(fd, msg, buf, &buflen, &nr, cmsglen)) != NULL) {

        /* Intercept notifications here */
        if (msg->msg_flags & MSG_NOTIFICATION) {
            handle_event(buf);
            continue;
        }

        iov->iov_base = buf;
        msg->msg_iov = iov;
        msg->msg_iovlen = 1;
        iov->iov_len = nr;
        msg->msg_control = cbuf;
        msg->msg_controllen = sizeof (*cmsg) + sizeof (*sri);

        printf("got %u bytes on stream %hu:\n", nr,
               sri->sinfo_stream);
        write(0, buf, nr);

        /* Echo it back */
        msg->msg_flags = MSG_XPG4_2;
        if (sendmsg(fd, msg, 0) < 0) {
            perror("sendmsg");
            exit(1);
        }
    }

    if (nr < 0) {
        perror("recvmsg");
    }
    close(fd);
}

int
main(void)
{
    int    lfd;
    int    cfd;
    int    onoff = 1;
    struct sockaddr_in  sin[1];
    struct sctp_event_subscribe events;
    struct sctp_initmsg  initmsg;

    if ((lfd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP)) == -1) {
        perror("socket");
        exit(1);
    }

    sin->sin_family = AF_INET;
    sin->sin_port = htons(5000);
    sin->sin_addr.s_addr = INADDR_ANY;
    if (bind(lfd, (struct sockaddr *)sin, sizeof (*sin)) == -1) {
        perror("bind");
        exit(1);
    }

    if (listen(lfd, 1) == -1) {
        perror("listen");
        exit(1);
    }

    (void) memset(&initmsg, 0, sizeof(struct sctp_initmsg));
    initmsg.sinit_num_ostreams = 64;
    initmsg.sinit_max_instreams = 64;
    initmsg.sinit_max_attempts = 64;
    if (setsockopt(lfd, IPPROTO_SCTP, SCTP_INITMSG, &initmsg,
           sizeof(struct sctp_initmsg)) < 0) { 
        perror("SCTP_INITMSG");
        exit (1);
    }

    /* Events to be notified for */
    (void) memset(&events, 0, sizeof (events));
    events.sctp_data_io_event = 1;
    events.sctp_association_event = 1;
    events.sctp_send_failure_event = 1;
    events.sctp_address_event = 1;
    events.sctp_peer_error_event = 1;
    events.sctp_shutdown_event = 1;

    /* Wait for new associations */
    for (;;) {
        if ((cfd = accept(lfd, NULL, 0)) == -1) {
            perror("accept");
            exit(1);
        }

        /* Enable ancillary data */
        if (setsockopt(cfd, IPPROTO_SCTP, SCTP_EVENTS, &events,
               sizeof (events)) < 0) {
            perror("setsockopt SCTP_EVENTS");
            exit(1);
        }
        /* Echo back any and all data */
        echo(cfd);
    }
}