CHAPTER 4
Driver API
This chapter describes the driver application programming interface (API), which consists of the Sun Netra DPS Crypto and Hashing API and the Ethernet API.
The Sun Netra DPS Crypto and Hashing API is an interface that enables you to access the cryptographic and hash hardware functions supported by UltraSPARC T2-based platforms.
Note - The Sun Netra DPS Cryptography API requires the SUNWndpsc Cryptography Driver package.
You do not need to know how the crypto and hash APIs are implemented to use them.
The Sun Netra DPS reference application, IPSec Gateway, is an example of how to use this API. The SUNWndpsc package, which requires export clearance, contains this API.
You must include the following header files under src/libs/ndps_crypto_api/ in the application:
The SPU driver is provided in binary format in SUNWndpsc: lib/n2cp/lwrten2cp.o
You must link the driver into the application.
Creates a context for the crypto or hash task to be submitted to the UltraSPARC T2 crypto engine. The caller supplies the cipher or hash algorithm and the mode, both of which must be supported by the UltraSPARC T2 crypto engine. This function allocates the resources needed to fulfill the crypto or hash task, such as the SPU (Stream Processing Unit) CG devices.
NDPS_crypto_ctx_t NDPSCreateCryptoContext (
const NDPS_CIPHER cipher, int mode);
cipher - An algorithm supported in UltraSPARC T2. See ndpscrypt. Possible ciphers include AES/DES/3DES/RC4 and MD5/SHA1/SHA256.
mode - The variation for each cipher, such as ECB/CBC/CTR for AES and ECB/CBC/CFB for DES/3DES
Returns the opaque handle NDPS_crypto_ctx_t to the available hardware CG and SPU devices.
A CG device is used for symmetric key encryption and hashing. You must have NDPS_crypto_ctx_t to be able to use other API functions.
To implement this, use the following approach. Because each core owns one SPU, all strands on the same core access the same SPU. Therefore, the routine you use to access the SPU depends on the strand on which your code is running. For UltraSPARC T2 platforms, the CG being accessed = strand# % 8. If the application has only one strand per core that accesses the SPU on that core, no other action is required. If two functions run on different strands of the same core and both access the SPU, you must either place a mutex around the crypto function that accesses the SPU, or dedicate one strand to accessing the SPU. The mutex then serves the callers in round-robin fashion to avoid locking.
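The mutex option can be sketched as a ticket lock, which serves the strands that share an SPU in arrival order. That is one possible reading of the round-robin behavior described above, not the documented mechanism. This is a minimal sketch, assuming C11 atomics are available in the build environment; spu_ticket_lock_t, spu_lock_init, spu_lock_acquire, and spu_lock_release are hypothetical names and are not part of the Sun Netra DPS API.

```c
#include <stdatomic.h>

/* Hypothetical ticket lock; not part of the Sun Netra DPS API. */
typedef struct {
    atomic_uint next_ticket;  /* next ticket to hand out */
    atomic_uint now_serving;  /* ticket currently allowed to use the SPU */
} spu_ticket_lock_t;

static void spu_lock_init(spu_ticket_lock_t *lock)
{
    atomic_init(&lock->next_ticket, 0u);
    atomic_init(&lock->now_serving, 0u);
}

/* Take a ticket and spin until it is served; returns the ticket number. */
static unsigned spu_lock_acquire(spu_ticket_lock_t *lock)
{
    unsigned ticket = atomic_fetch_add(&lock->next_ticket, 1u);

    while (atomic_load(&lock->now_serving) != ticket) {
        /* busy-wait: another strand on this core holds the SPU */
    }
    return ticket;
}

/* Pass the SPU to the strand holding the next ticket. */
static void spu_lock_release(spu_ticket_lock_t *lock)
{
    atomic_fetch_add(&lock->now_serving, 1u);
}
```

A strand would call spu_lock_acquire() before the crypto call that accesses the SPU and spu_lock_release() immediately after it, with one lock instance per core.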
Releases the context used to access the SPU/CG device after the crypto and hash task has finished. The released hardware resources are available for the next caller.
int NDPSDestroyCryptoContext (NDPS_crypto_ctx_t ctx);
The opaque NDPS_crypto_ctx_t handle is allocated through NDPSCreateCryptoContext.
Loads the key length for the crypto operation. The key length can be 128, 192, or 256 bits.
int NDPSCryptKeyLength (NDPS_crypto_ctx_t ctx, int key_len);
ctx - NDPS_crypto_ctx_t handle
int NDPSCryptKeyLoad (NDPS_crypto_ctx_t ctx, NDPS_key_t *key);
ctx - NDPS_crypto_ctx_t handle
Note - The key is not copied. The caller must keep the storage for its key valid until it calls NDPSDestroyCryptoContext().
int NDPSCryptIVLoad (NDPS_crypto_ctx_t ctx, NDPS_iv_t *iv);
ctx - NDPS_crypto_ctx_t handle
Note - The IV is not copied. The caller must keep the storage for its IV valid until it calls NDPSDestroyCryptoContext().
Submits the crypto task with a single data block to the UltraSPARC T2 crypto device.
int NDPSCrypt (NDPS_crypto_ctx_t ctx, int encrypt_flag,
uchar_t *outbuf, int *outlen, uchar_t *inbuf, int inlen);
ctx - NDPS_crypto_ctx_t handle
encrypt_flag = 1 - For encryption
encrypt_flag = 0 - For decryption
inbuf - Text to be encrypted or decrypted
inlen - Length of the input text in bytes
outbuf - Where the encrypted or decrypted data is placed
outlen - Length of the encrypted or decrypted data in bytes
Submits the crypto task with chained multiple data blocks to the UltraSPARC T2 crypto device.
int NDPSCryptMultiple (NDPS_crypto_ctx_t ctx, int encrypt_flag,
int num_blk, uchar_t **outbuf, size_t *outlen, uchar_t **inbuf,
size_t *inlen);
ctx - NDPS_crypto_ctx_t handle
encrypt_flag = 1 - For encryption
encrypt_flag = 0 - For decryption
num_blk - Number of data blocks to be chained
inbuf - Array of the input chained data blocks
inlen - Array of the input lengths of the chained data blocks
outbuf - Array of the chained output data blocks
outlen - Array of the lengths of the chained output data blocks
Submits the Crypto and Hashing tasks with multiple data blocks to the UltraSPARC T2 Crypto device.
int NDPSCryptAndHashMultiple(NDPS_crypto_ctx_t ctx, int encrypt_flag,
int num_blk, char **outbuf, size_t *outlen,
char **inbuf, size_t *inlen, NDPS_crypto_ctx_t
h_ctx, char **h_outbuf, size_t *h_outlen,
char **h_inbuf, size_t *h_inlen)
ctx - NDPS_crypto_ctx_t handle for Crypto
encrypt_flag = 1 - For encrypt and hash
encrypt_flag = 0 - For unhash and decrypt
num_blk - Number of data block CryptHash pairs to be submitted in one request
outbuf - Array of the output data blocks for Crypto
outlen - Array of the lengths of the output data blocks for Crypto
inbuf - Array of the input data blocks for Crypto
inlen - Array of the lengths of the input data blocks for Crypto
h_ctx - NDPS_crypto_ctx_t handle for Hash
h_outbuf - Array of the output data blocks for Hash
h_outlen - Array of the lengths of the output data blocks for Hash
h_inbuf - Array of the input data blocks for Hash
h_inlen - Array of the lengths of the input data blocks for Hash
int NDPSHashLength (NDPS_crypto_ctx_t ctx, int len);
ctx - NDPS_crypto_ctx_t handle
Loads the hash IV (initialization vector).
Note - The IV is not copied. The caller must keep the storage for its IV valid until it calls NDPSDestroyCryptoContext().
int NDPSHashIVLoad(NDPS_crypto_ctx_t ctx, NDPS_iv_t *iv);
ctx - NDPS_crypto_ctx_t handle
Acquires the IV (initialization vector) address for the hash.
int NDPSHashIVGet (NDPS_crypto_ctx_t ctx, NDPS_iv_t **iv);
ctx - NDPS_crypto_ctx_t handle
iv - Pointer to the IV location
Produces the Hash value from the input data with its length. This Hash function does not overwrite the internal IV, but rather does a complete hash operation and stores the result in the provided outbuf.
int NDPSHashDirect (NDPS_crypto_ctx_t ctx, uchar_t *outbuf,
uchar_t *inbuf, int inlen);
ctx - NDPS_crypto_ctx_t handle
inbuf - Input data to be hashed
inlen - Length of data to be hashed
Submits the Hash task with chained multiple data blocks to the UltraSPARC T2 crypto device.
int NDPSHashDirectMultiple (NDPS_crypto_ctx_t ctx, int num_blk,
uchar_t **outbuf, size_t *outlen, uchar_t **inbuf,
size_t *inlen);
ctx - The NDPS_crypto_ctx_t handle
num_blk - Number of data blocks to be chained
outbuf - Array of the chained output data blocks
outlen - Array of the lengths of the chained output data blocks
inbuf - Array of the input chained data blocks
inlen - Array of the input lengths of the chained data blocks
Combines crypto and hash operations in one function call. This API submits both operations to the SPU in a single call, which improves performance.
int NDPSCryptAndHash(NDPS_crypto_ctx_t ctx, int encrypt_flag,
char *outbuf, int *outlen, char *inbuf, int inlen,
NDPS_crypto_ctx_t h_ctx,
char *h_outbuf, int h_outlen, char *h_inbuf, int h_inlen);
ctx - NDPS_crypto_ctx_t handle for crypto
encrypt_flag = 1 - For encryption
encrypt_flag = 0 - For decryption
outbuf - Output data block for crypto
outlen - Length of the output data block for crypto
inbuf - Input data block for crypto
inlen - Length of the input data block for crypto
h_ctx - NDPS_crypto_ctx_t handle for Hash
h_outbuf - Output data block for Hash
h_outlen - Length of the output data block for Hash
h_inbuf - Input data block for Hash
h_inlen - Length of the input data block for Hash
The following APIs support AES-XCBC-MAC-96.
Loads the initial key for AES-XCBC-MAC-96, which is supplied by the caller.
int NDPSAESXCBCMAC96KeyLoad (
NDPS_crypto_ctx_t ctx, NDPS_key_t *key);
ctx - NDPS_crypto_ctx_t handle for crypto
Generates the 96-bit AES-XCBC-MAC-96 authentication value.
int NDPSAESXCBCMAC96AuthGenerate(NDPS_crypto_ctx_t ctx,
uchar_t *inbuf, int inlen, uchar_t **auth_buf, int *auth_len);
ctx - NDPS_crypto_ctx_t handle for crypto
inbuf - Input data for AES-XCBC-MAC-96
inlen - Length of the input data for AES-XCBC-MAC-96
auth_buf - Resulting AES-XCBC-MAC-96 hash value
auth_len - Length of the resulting authentication value (96 bits)
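For orientation, a 96-bit authentication value occupies 12 bytes. AES-XCBC-MAC-96, as defined in RFC 3566, keeps the leftmost 96 bits of the 128-bit AES-XCBC-MAC result. The sketch below illustrates only that truncation step, so that callers can size their buffers accordingly. xcbc_mac96_truncate and the two length constants are hypothetical names, not part of the Sun Netra DPS API, and the source does not describe how the driver produces the value internally.

```c
#include <stdint.h>
#include <string.h>

#define XCBC_MAC_FULL_LEN 16  /* one AES block: 128 bits */
#define XCBC_MAC96_LEN    12  /* 96 bits */

/* Keep the leftmost 96 bits of a full 128-bit XCBC-MAC result. */
static void xcbc_mac96_truncate(const uint8_t full_mac[XCBC_MAC_FULL_LEN],
                                uint8_t auth[XCBC_MAC96_LEN])
{
    memcpy(auth, full_mac, XCBC_MAC96_LEN);
}
```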
The Ethernet API is an interface between the user network application and the device drivers. A Sun Netra DPS application developer should be aware of the device features and capabilities, but does not need to know the detailed implementation of the device driver. TABLE 4-1 shows the relationship among the Ethernet device, device driver, Ethernet API, and user application.
Network applications require network hardware resources. The RLP, IP packet forwarding, and IPSec reference applications are all network applications.
TABLE 4-2 lists the Ethernet devices supported on Sun Netra DPS platforms.
See Note 11 for Ethernet device driver nxge tunables.
The Ethernet API includes the following functions:
Allocates a message block for managing incoming packet data. The allocated entity is returned as a pointer to the buffer block structure (pbuf_t). pbuf_t is a message block structure (mblk) that contains all the pointers and fields necessary for manipulating the data buffer. See mblk_t in the mblk.h header file for the details of the message block. Packet data begins at b_wptr. The size of the mblk must be the size specified as mblk_siz in the eth_open() call. This function is implemented in the user application space (see Note 4). The device driver calls this function.
pbuf_t *eth_pbuf_alloc(void *hook, size_t bufsz, uint16_t pool);
hook - User-provided hook. (See Note 1.)
bufsz - User-provided buffer size to be allocated. (See Note 2.)
pool - DMA channel pool (See Note 3.)
On success, returns pointer to mblk with b_rptr and b_wptr pointing to the start of a valid data buffer. An error returns NULL.
Frees a message block allocated by eth_pbuf_alloc(). This function is implemented by the user and is called by the device driver.
void eth_pbuf_free(void *hook, pbuf_t * mblkp, void *arg,
uint16_t pool);
hook - User-provided hook. (See Note 1.)
mblkp - Pointer to message block to be freed
pool - DMA channel pool. (See Note 3.)
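Because eth_pbuf_alloc() and eth_pbuf_free() are supplied by the application, a minimal pair might look like the following. This is a sketch under loud assumptions: the pbuf_t below is a stand-in that carries only the b_rptr and b_wptr fields named in the text (the real mblk_t in mblk.h has more), and malloc()/free() stand in for whatever memory pools the application actually uses. The hook, arg, and pool arguments are ignored here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the real mblk_t from mblk.h (assumption: only two fields). */
typedef struct pbuf {
    unsigned char *b_rptr;  /* read pointer */
    unsigned char *b_wptr;  /* write pointer: packet data begins here */
} pbuf_t;

/* Allocate the message block header and its data buffer in one piece. */
pbuf_t *eth_pbuf_alloc(void *hook, size_t bufsz, uint16_t pool)
{
    pbuf_t *mp;

    (void)hook;  /* unused in this sketch */
    (void)pool;  /* single pool assumed */

    mp = malloc(sizeof(*mp) + bufsz);
    if (mp == NULL)
        return NULL;  /* an error returns NULL */

    mp->b_rptr = (unsigned char *)(mp + 1);  /* data follows the header */
    mp->b_wptr = mp->b_rptr;
    return mp;
}

/* Release a block obtained from eth_pbuf_alloc(). */
void eth_pbuf_free(void *hook, pbuf_t *mblkp, void *arg, uint16_t pool)
{
    (void)hook;
    (void)arg;
    (void)pool;
    free(mblkp);
}
```

A real application would more likely draw buffers from preallocated per-channel pools selected by the pool argument, since the driver calls these functions on the packet path.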
Allocates a data buffer for storing incoming packet data. The allocated entity is a pointer to the allocated buffer. This function is implemented in the user application space (see Note 4). The device driver calls this function.
char *eth_buf_alloc(void *hook, size_t bufsz, uint16_t pool);
hook - User-provided hook. (See Note 1.)
bufsz - User-provided buffer size to be allocated. (See Note 2.)
pool - DMA channel pool (See Note 3.)
On success, returns the pointer to a valid data buffer. An error returns NULL.
Frees a buffer allocated by eth_buf_alloc(). This function is implemented in the user application space. (See Note 4.) The device driver calls this function.
void eth_buf_free(void *hook, char *buf, void *arg, uint16_t pool);
hook - User-provided hook. (See Note 1.)
buf - Pointer to data buffer to be freed
pool - DMA channel pool. (See Note 3.)
Probes for a network device on the target platform and, if the device is found, initializes it. On successful completion, this function returns an opaque handle, which must be used in other API calls that target a specific device. When multiple ports are opened, eth_open() must be invoked in increasing order of port number (port0, then port1, and so on) during initialization.
ihandle_t eth_open(uint16_t vid, uint16_t did, eth_port_t port,
int num_chans, void *txhook, void* rxhook,
size_t mblk_siz, uint_t mpbase);
vid - Vendor ID of network device
did - Device ID of network device
port - Port number of the Ethernet interface. (See Note 5.)
num_chans - Number of DMA channels to be used. (See Note 9.)
txhook - Application provided hook to tx fastq table. (See Note 6.)
rxhook - Application provided hook to rx fastq table (See Note 7.)
mblk_siz - Size of buffer that is returned by eth_pbuf_alloc()
mpbase - Base index into the mempool type array used in application. (See Note 8.)
On success - Returns a valid opaque device handle (ihandle_t)
On error - Returns INVALID_IHANDLE
Releases the Ethernet interface instance and all resources held by it.
int eth_close(ihandle_t ihandle);
ihandle - Opaque handle returned by eth_open().
Receives messages from the Ethernet interface instance specified by ihandle. This function can be configured to return a chain of packets. The maximum number of packets in the chain is configurable through ETH_IOC_SET_MAX_PKT_CHAIN. This function is nonblocking.
pbuf_t *eth_read(ihandle_t ihandle, eth_chan_t chan_num);
ihandle - Opaque handle returned by eth_open().
chan_num - DMA channel number (See Note 9.)
On success - Returns mblk packet chain containing message
Sends a message which is specified by the message block structure pointer (mblk). This function is nonblocking and can fail if the hardware transmit descriptor ring is full.
int eth_write(ihandle_t ihandle, eth_chan_t chan_num,
pbuf_t * mblkp);
ihandle - Opaque handle returned by eth_open().
chan_num - DMA channel number. (See Note 9.)
mblkp - Message block pointer. This pointer can be a chain. The maximum size of chain supported is implementation-specific and can be discovered via ETH_IOC_GET_MAX_TX_PKT_CHAIN.
Catch-all configuration API that can be used to control the device driver attributes.
int eth_ioc(ihandle_t ihandle, ioc_cmd_t cmd, void *arg);
ihandle - Opaque handle returned by eth_open().
cmd - Command to execute (See Note 10.)
*arg - Argument passed to command
This section specifies the content that needs to be passed into the arg parameter of the IOC command.
Obtain the MAC ID of the Ethernet port from the MAC address hardware registers.
(struct ether_addr *)arg
struct ether_addr {
    ether_addr_t ether_addr_octet;
};
typedef uchar_t ether_addr_t[ETHERADDRL];  /* 6 octets */
Set the MAC ID in the MAC address hardware registers of the Ethernet port.
Check the Ethernet link status. When the link status changes, a message showing the new Ethernet status is displayed on the console.
Obtain the current Ethernet link status and return it in the link_status_ioc_t structure.
(link_status_ioc_t *)arg
typedef struct _link_status_ioc_s {
    int status;  /* 0: Link Down 1: Link Up */
    int speed;   /* Link speed in Mbps 10/100/1000/10000 */
    int duplex;  /* 1: Half 2: Full */
} link_status_ioc_t;
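Once the driver has filled in a link_status_ioc_t, the application has to decode the three integer fields itself. The sketch below turns the structure into a readable string using the encodings given in the comments above. link_status_format is a hypothetical helper, not part of the Ethernet API, and the eth_ioc() call that fills the structure is omitted.

```c
#include <stdio.h>

/* Reproduced from the listing above. */
typedef struct _link_status_ioc_s {
    int status;  /* 0: Link Down 1: Link Up */
    int speed;   /* Link speed in Mbps 10/100/1000/10000 */
    int duplex;  /* 1: Half 2: Full */
} link_status_ioc_t;

/* Hypothetical helper: write a one-line summary of the link state into buf.
 * Returns the snprintf() result. */
static int link_status_format(const link_status_ioc_t *ls,
                              char *buf, size_t buflen)
{
    if (ls->status != 1)
        return snprintf(buf, buflen, "link down");

    return snprintf(buf, buflen, "link up, %d Mbps, %s duplex",
                    ls->speed, ls->duplex == 2 ? "full" : "half");
}
```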
Set the promiscuous mode of the Ethernet port.
(promisc_mode_ioc_t *)arg
typedef struct _promisc_mode_ioc_s {
    promisc_mode_t mode;  /* Promiscuous mode */
    boolean_t enable;     /* B_TRUE: enable B_FALSE: disable */
} promisc_mode_ioc_t;
typedef enum {
    PROMISC_ALL = 0,  /* Accepts all valid frames */
    PROMISC_GRP,      /* Accepts all valid multicast frames */
    PROMISC_VIRT      /* Enable MAC Address Filtering */
} promisc_mode_t;
Set the maximum allowable frame size that the MAC will receive.
(*(int *)arg)
Note - You can declare an integer variable named maxfrmsz and pass the address of the variable as the third argument when calling eth_ioc().
Add a multicast address entry into the hardware multicast hash table.
(struct ether_addr *)arg
Note - See the argument of ETH_IOC_GET_MAC_ADDR.
Delete a multicast address entry from the hardware multicast hash table.
Display the hardware multicast hash table content to the console.
Set up the address filter and the address filter mask hardware registers.
(addr_filter_ioc_t *)arg
typedef struct _addr_filter_ioc_s {
    struct ether_addr addr_filter;  /* Address Filter */
    uint16_t mask_0_15_bits_a0;     /* Bit mask octet4-5 */
    uint8_t mask_0_7_nib_a1_a2;     /* Nibble mask octet0-3*/
} addr_filter_ioc_t;
Obtain the ingress and egress packet count statistics of the Ethernet port.
(nxge_eth_kstat_t *)arg
typedef struct nxge_eth_kstat {
    uint64_t ipackets;  /* Ingress packets count */
    uint64_t opackets;  /* Egress packets count */
    uint64_t rbytes;    /* Ingress packets byte count */
    uint64_t obytes;    /* Egress packets byte count */
} nxge_eth_kstat_t;
Display Ethernet port statistical information to the console.
Set up the hardware MAC table.
(mac_tbl_ioc_t *)arg
typedef struct _mac_tbl_ioc_s {
    mac_addr_type_t addr_type;   /* MAC address type */
    struct ether_addr mac_addr;  /* MAC ID */
    uint8_t rdc_grp_id;          /* RDC Group ID (0~7)*/
    boolean_t enable;            /* enable/disable */
    boolean_t priority;          /* High/Low Priority */
} mac_tbl_ioc_t;
typedef enum {
    MAC_ADDR_SELF = 0,        /* MAC ID of this interface */
    MAC_ADDR_ALT0,            /* Alternate MAC ID #0 */
    MAC_ADDR_ALT1,            /* Alternate MAC ID #1 */
    MAC_ADDR_ALT2,            /* Alternate MAC ID #2 */
    MAC_ADDR_ALT3,            /* Alternate MAC ID #3 */
    MAC_ADDR_ALT4,            /* Alternate MAC ID #4 */
    MAC_ADDR_ALT5,            /* Alternate MAC ID #5 */
    MAC_ADDR_ALT6,            /* Alternate MAC ID #6 */
    MAC_ADDR_ALT7,            /* Alternate MAC ID #7 */
    MAC_ADDR_ALT8,            /* Alternate MAC ID #8 */
    MAC_ADDR_ALT9,            /* Alternate MAC ID #9 */
    MAC_ADDR_ALT10,           /* Alternate MAC ID #10 */
    MAC_ADDR_ALT11,           /* Alternate MAC ID #11 */
    MAC_ADDR_ALT12,           /* Alternate MAC ID #12 */
    MAC_ADDR_ALT13,           /* Alternate MAC ID #13 */
    MAC_ADDR_ALT14,           /* Alternate MAC ID #14 */
    MAC_ADDR_ALT15,           /* Alternate MAC ID #15 */
    MAC_ADDR_RSVD_MULTICAST,  /* Reserved Multicast */
    MAC_ADDR_FILTER,          /* Address Filter */
    MAC_ADDR_FLOW_CTL         /* Flow Control Address */
} mac_addr_type_t;
Display the hardware MAC table content to the console.
Set up the hardware VLAN table.
(vlan_tbl_ioc_t *)arg
typedef struct _vlan_tbl_ioc_s {
    uint16_t vlan_id;    /* VLAN ID (0 ~ 4095) */
    uint8_t rdc_grp_id;  /* RX DMA Channel Group ID (0~7)*/
    boolean_t enable;    /* Enable/Disable */
    boolean_t priority;  /* High/Low priority */
} vlan_tbl_ioc_t;
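The field comments imply range limits (VLAN ID 0 to 4095, RDC group 0 to 7) that the narrow integer types do not enforce by themselves. A small initializer that rejects out-of-range values before the structure is handed to eth_ioc() might look like this. It is a sketch: vlan_tbl_entry_init and the two limit macros are hypothetical names, and the boolean_t definition is a stand-in for the platform's own type.

```c
#include <stdint.h>

/* Stand-in for the platform boolean_t (assumption). */
typedef enum { B_FALSE = 0, B_TRUE = 1 } boolean_t;

/* Reproduced from the listing above. */
typedef struct _vlan_tbl_ioc_s {
    uint16_t vlan_id;    /* VLAN ID (0 ~ 4095) */
    uint8_t rdc_grp_id;  /* RX DMA Channel Group ID (0~7) */
    boolean_t enable;    /* Enable/Disable */
    boolean_t priority;  /* High/Low priority */
} vlan_tbl_ioc_t;

#define VLAN_ID_MAX    4095u  /* hypothetical name for the documented limit */
#define RDC_GRP_ID_MAX 7u     /* hypothetical name for the documented limit */

/* Fill in a VLAN table entry; returns 0 on success, -1 if a value is
 * outside the documented range. The entry is untouched on failure. */
static int vlan_tbl_entry_init(vlan_tbl_ioc_t *entry, unsigned vlan_id,
                               unsigned rdc_grp_id, boolean_t enable,
                               boolean_t priority)
{
    if (vlan_id > VLAN_ID_MAX || rdc_grp_id > RDC_GRP_ID_MAX)
        return -1;

    entry->vlan_id = (uint16_t)vlan_id;
    entry->rdc_grp_id = (uint8_t)rdc_grp_id;
    entry->enable = enable;
    entry->priority = priority;
    return 0;
}
```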
Set up the hardware RDC (Receive DMA Channel) group table.
(rdc_tbl_ioc_t *)arg
typedef struct _rdc_tbl_ioc_s {
    uint8_t rdc_grp_id;            /* RDC Group ID (0~7)*/
    uint8_t rdc[NXGE_NUM_CHANLS];  /* RDC (0~15) */
} rdc_tbl_ioc_t;
Display the hardware RDC group table content to the console.
Bind a default RDC group number to a port.
(uint8_t *)arg /* RDC Group number */
Obtain the port information, and return it to portinfo_ioc_t.
(portinfo_ioc_t *)arg
typedef struct _portinfo_ioc_s {
    struct ether_addr mac_addr;               /* MAC ID */
    struct _link_status_ioc_s link_status;    /* Link status */
    int rx_min_frame_sz;                      /* Rx Minimum Frame Size */
    int tx_min_frame_sz;                      /* Tx Minimum Frame Size */
    int max_frame_sz;                         /* Maximum Frame Size */
    boolean_t tx_en;                          /* Transmit Enable/Disable */
    boolean_t rx_en;                          /* Receive Enable/Disable */
    boolean_t addr_filter_en;                 /* Addr Filtering Ena/Dis */
    boolean_t hash_filter_en;                 /* Hash Filtering Ena/Dis */
    boolean_t promisc_all;                    /* Promiscuous All Ena/Dis */
    boolean_t promisc_grp;                    /* Promiscuous Group Ena/Dis */
    boolean_t rx_strip_crc;                   /* Rx Strip CRC Ena/Dis */
    boolean_t tx_gen_crc;                     /* Tx CRC generation Ena/Dis */
    boolean_t rx_err_chk;                     /* Rx Error Check Ena/Dis */
} portinfo_ioc_t;
Display the port information to the console.
Set classification rules for incoming traffic to the receive port.
(classify_ioc_t *)arg
typedef struct classify_ioc_s {
    uint_t opcode;          /* 1:Add an entry 2:Invalidate an entry */
    uint_t action;          /* 1:Accept on match 2:Discard on match*/
    flow_spec_t flow_spec;  /* Flow specification */
} classify_ioc_t;
typedef struct flow_spec_s {
    uint_t fs_type;  /* Flow Spec Type */
    uint_t index;    /* Index of flow entry */
    uint_t channel;  /* RDC to be selected when match */
    union {
        flow_spec_ipv4_t ip4;  /* IPv4 flow spec */
        flow_spec_ipv6_t ip6;  /* IPv6 flow spec */
        flow_spec_l2_t l2;     /* L2 flow spec */
        uint8_t hd[64];        /* Hex data */
    } ue, um;
} flow_spec_t;
typedef struct flow_spec_ipv4_s {
    uint8_t protocol;
    uint8_t tos;
    union {
        port_t tcp;
        port_t udp;
        spi_port_t spi;
    } port;
    uint32_t src;
    uint32_t dst;
} flow_spec_ipv4_t;
typedef struct flow_spec_ipv6_s {
    uint8_t protocol;
    uint8_t tos;
    union {
        port_t tcp;
        port_t udp;
        spi_port_t spi;
    } port;
    struct ip6_addr src;
    struct ip6_addr dst;
} flow_spec_ipv6_t;
typedef struct flow_spec_l2_s {
    // uint32_t l2_rule; /* l2 classification types */ XXXX
    uint8_t dst[6];    /* MAC address */
    uint8_t src[6];    /* MAC address */
    uint16_t type;     /* Ether type */
    uint16_t vlantag;  /* VLANID|CFI|PRI */
} flow_spec_l2_t;
Check errors on the Ethernet port.
TABLE 4-3 lists a summary of the Ethernet API functions.
This is a catch-all argument for the user application. This argument can be used for any purpose. If not used, pass in NULL.
The size value should be large enough to hold an Ethernet packet.
When using multiple memory pools (one for each DMA channel), pool indicates the ID of the memory pool, which is normally indexed by the DMA channel number. When a single memory pool is used, always pass in a zero.
The user application has the best knowledge and control of how system memory is utilized. The device driver calls this function in the Packet Read routine.
The port number of the device can be determined using the -v option during boot. For example, boot net:,my_binary_file -v.
Part of the console output is similar to the following:
This output indicates that there are two NIU ports. The numbers inside the netdev[] (0 and 1) are the port numbers used in eth_open calls.
For an application that needs to forward packets (for example, RLP, IP packet forwarding, and IPSec applications), the application must pass in the pointer to the transmit fastq allocated by the application.
If nxge_multi_qs is enabled (set to 1), the driver expects the application to pass in the entire free queue array because the driver needs to determine the index of the free queue element to be freed based on the forwarding port number and channel number.
fastq_t txfreeq_dram[NUM_PORTS+START_PORT][NUM_CHANS][NUM_PORTS];
fastq_t rxfreeq_dram[NUM_PORTS+START_PORT][NUM_CHANS][NUM_PORTS];
eth_open(vid, did, port, NUM_CHANS, (void *)txfreeq_dram, (void *)rxfreeq_dram, RX_BUF_SIZE, poolidx);
If nxge_multi_qs is disabled (set to 0), the driver expects the application to pass in the starting element of free queue array indexed by the port. The driver needs to determine only the freeq array element to be freed based on the channel number. The port number in this mode is static.
fastq_t txfreeq_dram[NUM_PORTS+START_PORT][NUM_CHANS];
fastq_t rxfreeq_dram[NUM_PORTS+START_PORT][NUM_CHANS];
eth_open(vid, did, port, NUM_CHANS, (void *)&txfreeq_dram[port], (void *)&rxfreeq_dram[port], RX_BUF_SIZE, poolidx);
For an application that must forward packets (for example, RLP, IP packet forwarding, and IPSec applications), the application must pass in the pointer to the receive fastq allocated by the application.
If nxge_multi_qs is enabled (set to 1), the driver expects the application to pass in the entire free queue array because the driver needs to determine the index of the free queue element to be freed based on the forwarding port number and channel number.
fastq_t txfreeq_dram[NUM_PORTS+START_PORT][NUM_CHANS][NUM_PORTS];
fastq_t rxfreeq_dram[NUM_PORTS+START_PORT][NUM_CHANS][NUM_PORTS];
eth_open(vid, did, port, NUM_CHANS, (void *)txfreeq_dram, (void *)rxfreeq_dram, RX_BUF_SIZE, poolidx);
If nxge_multi_qs is disabled (set to 0), the driver expects the application to pass in the starting element of free queue array indexed by the port. The driver needs to determine only the freeq array element to be freed based on the channel number. The port number in this mode is static.
fastq_t txfreeq_dram[NUM_PORTS+START_PORT][NUM_CHANS];
fastq_t rxfreeq_dram[NUM_PORTS+START_PORT][NUM_CHANS];
eth_open(vid, did, port, NUM_CHANS, (void *)&txfreeq_dram[port], (void *)&rxfreeq_dram[port], RX_BUF_SIZE, poolidx);
When the application is using multiple memory pools, this is the index of the first memory pool used by the device. For example, if the device to be opened has eight memory pools (one for each DMA channel) and the memory pool IDs range from 0 to 7, then the base index is 0.
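Taken together with the pool description in the earlier note, one plausible reading is that a channel's pool ID is the base index plus the DMA channel number. The sketch below shows that arithmetic only. It is an assumption drawn from the text, not a documented formula, and pool_id_for_channel is a hypothetical name.

```c
#include <stdint.h>

/* Assumed mapping: pools for one device are contiguous, starting at mpbase,
 * with one pool per DMA channel. */
static uint16_t pool_id_for_channel(unsigned mpbase, unsigned chan)
{
    return (uint16_t)(mpbase + chan);
}
```

Under this reading, a second device opened with eight channels and a base index of 8 would use pool IDs 8 through 15.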
This is the DMA channel number of the DMA channel receiving the packet. In Sun multithreaded 10GbE with NIU, up to 16 DMA channels can be used. The number of DMA channels to be used is specified when calling eth_open().
In the Sun Netra DPS 2.1 Update 1 release, the following control commands are implemented:
Refer to the reference application (for example, IP packet forwarding) for usage of these commands.
TABLE 4-4 lists the Ethernet device driver nxge tunables.
This note describes how to enable the hardware checksum offload features on the Sun multithreaded 10Gb/NIU Ethernet hardware using the nxge driver.
The following mblk fields are used:
unsigned char b_ick_flag;    /* H/W checksum enable flag : TX */
unsigned char *b_ick_start;  /* Pointer to start offset : TX/RX */
unsigned char *b_ick_stuff;  /* Pointer to stuff offset : TX */
On transmit, if the NXGE_TX_CKENB bit is set in b_ick_flag, the hardware is programmed to compute the checksum. b_ick_start and b_ick_stuff are expected to point to the L4 payload start and stuff offsets, respectively. In addition, the UDP/TCP header checksum field must be filled with the pseudo-header checksum value. The hardware uses this field when computing the full checksum.
On receive, if the NXGE_RX_CKERR bit is set in b_ick_flag, the hardware detected a checksum error in the ingress packet.
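The pseudo-header checksum that the transmit path expects in the UDP/TCP checksum field can be computed as a 16-bit ones' complement sum over the IPv4 source address, destination address, protocol number, and L4 length (the standard Internet checksum arithmetic of RFC 1071). The sketch below is a minimal illustration for IPv4. The function names are hypothetical, the addresses and length are taken in host byte order, and the result is returned uncomplemented in host byte order. The source does not state which form or byte order the nxge hardware expects, so that part is an assumption to verify.

```c
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones' complement sum of the IPv4 pseudo header:
 * source address, destination address, zero + protocol, L4 length.
 * All arguments are in host byte order; the result is NOT complemented. */
static uint16_t ipv4_pseudo_hdr_sum(uint32_t src, uint32_t dst,
                                    uint8_t proto, uint16_t l4_len)
{
    uint32_t sum = 0;

    sum += src >> 16;
    sum += src & 0xFFFFu;
    sum += dst >> 16;
    sum += dst & 0xFFFFu;
    sum += proto;   /* high byte of this 16-bit word is zero */
    sum += l4_len;  /* TCP or UDP header plus payload, in bytes */

    return csum_fold(sum);
}
```

For example, for 192.168.0.1 to 192.168.0.2, TCP (protocol 6), and a 20-byte segment, the function returns 0x816E.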
The vnet driver API is an interface between the user application and the Sun Netra DPS vnet driver. A Sun Netra DPS application developer must be aware of the vnet driver capabilities and features, but does not need to know the detailed implementation of the device driver.
External function to be implemented by the user and called by the library to allocate a message block for incoming packets. The allocated entity is returned as a pointer to the buffer block structure (pbuf_t). pbuf_t is a message block structure (mblk_t) that contains the necessary pointers and fields for manipulating the data buffer. See mblk_t in the mblk.h header file for the details about the message block.
Packet data begins at b_wptr. The library assumes that the mblk returned by this function will point to a valid buffer. The library will start writing packet data at the byte pointed by b_wptr. This function is implemented in the user application space. The device driver calls this function.
pbuf_t *vnet_pbuf_alloc(void *hook, size_t bufsz, uint16_t pool);
hook - currently unused by the vnet device driver; the driver will always pass NULL for this value.
bufsz - buffer size to be allocated
pool - pool id from which buffer is to be allocated
On success, returns a mblk with b_rptr and b_wptr pointing to start of a valid data buffer. On error, returns NULL.
Allocates a data buffer for storing incoming packet data. The allocated entity is a pointer to the allocated buffer. This function is implemented in the user application and called by the device driver.
unsigned char *vnet_buf_alloc(void *hook, size_t bufsz, uint16_t pool);
hook - Currently unused by the vnet device driver. The driver will always pass NULL for this value.
bufsz - Buffer size to be allocated
pool - Pool id from which buffer is to be allocated
On success, returns the pointer to a valid data buffer. On error, returns NULL.
Frees a message block allocated by vnet_pbuf_alloc(). This function is implemented by the user and called by the driver.
void vnet_pbuf_free(void *hook, pbuf_t *mblkp, void *arg, uint16_t pool);
hook - currently unused by the vnet device driver; the driver will always pass NULL for this value.
mblkp - pointer to the message block to be freed
arg - currently unused; the driver will pass NULL for this value
pool - pool id to which the buffer must be freed
Frees a buffer that is allocated using vnet_buf_alloc(). This is implemented by the user and called by the device driver.
int vnet_buf_free(void *hook, unsigned char *buf, void *arg, uint16_t pool);
hook - currently unused by the vnet device driver; the driver will always pass NULL for this value.
buf - pointer to the data buffer to be freed
arg - currently unused by the vnet device driver; the driver will always pass NULL for this value
pool - pool id to which the buffer must be freed
On success, returns 0. On error, returns -1.
Probes a virtual network device in the target platform and if the device is found, the function initializes the virtual network device. On successful completion, this function returns an opaque handle that needs to be used in other API function calls that are targeted to a specific virtual network device.
void *vnet_eth_open(uint16_t vid, uint16_t did, int port, uint_t mpbase, void *receive_packet_queue, void *transmit_free_queue);
vid - vendor ID (see vnet_ethapi.h)
did - device ID (see vnet_ethapi.h)
port - virtual network device instance number; this is the value shown in the output of ldm list-bindings -e ndps-ldom under the DEVICE column. For example, if the output shows network@4, then the user must pass 4 for the port number.
mpbase - base index into the mempool type array used in the application
receive_packet_queue - fastq into which ingress packets are queued by the device driver for processing by the application
transmit_free_queue - fastq into which packets whose transmission is complete are queued by the device driver for processing (freeing) by the application
On success, returns a valid opaque handle. On failure, returns NULL.
Receives messages from the virtual network device instance specified by ihandle. This function can block in some circumstances (see description of vnet_set_rxburst).
int64_t vnet_eth_read(void *ihandle, int chan);
ihandle - opaque handle which was returned by vnet_eth_open()
chan - relative index into memory pool array into which packet must be freed after transmission completes
On success, returns the number of frames received. On failure, returns -1.
Sends a message that is specified by the message block structure pointer. This function is non-blocking.
int vnet_eth_write(void *ihandle, int chan, pbuf_t *txmp);
ihandle - opaque handle which was returned by vnet_eth_open()
chan - relative index into memory pool array into which packet must be freed after transmission completes
On success, returns 0. On failure, returns -1.
A catch-all configuration function that is used to configure and read device driver attributes.
int vnet_eth_ioc(void *ihandle, ioc_cmd_t cmd, void *arg);
ihandle - opaque handle which was returned by vnet_eth_open()
arg - argument passed to the command
On success, returns 0. On error, returns -1.
The vnet device driver uses the same commands as the Sun Netra DPS Ethernet device drivers like nxge and ipge. Some of the commands are not supported by the vnet device driver. For some of the commands, the argument type is also different. The following section outlines the commands and their arguments.
Obtains the MAC address of the vnet port.
Pointer to an array of ETHERADDRL bytes.
Checks the vnet device link status and displays the status of the links on the console.
Obtains the vnet device link status, and returns it in the argument passed.
(vnet_link_status_ioc_t *)arg
typedef struct {
    int ldc_num;
    eth_event_t status;
    int prv_unused;
} vnet_link_status_ele_t;
where,
ldc_num - LDC channel number
status - link status in terms of an eth_event_t value; please see ethapi.h
typedef struct {
    int instance;
    int ele_cnt;
    int error;
    char status_ele[1];
} vnet_link_status_ioc_t;
instance - virtual network device instance; this is the value displayed in the output of ldm list-bindings -e ndps-ldom under the DEVICE column. For example, if the output shows network@4, then the user must pass 4 for the instance number.
ele_cnt - number of vnet_link_status_ele_t elements for which user has allocated memory starting from status_ele
error - indicates any error encountered in the device driver while executing the command. Currently, only one error code is supported. If this value is 1 upon return from the ioctl call, the memory starting from status_ele is insufficient to store the status of all LDC channels in the vnet device. If this value is 0 upon return from the ioctl call, the call was successful.
status_ele - beginning of the memory region where the vnet_link_status_ele_t for each LDC channel of a vnet device is stored. The user must allocate this memory before the call. This memory must accommodate at least ele_cnt elements.
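The variable-length layout described above (a fixed header followed by caller-allocated elements starting at status_ele) can be sized and read as in the following minimal sketch. The two structures are copied from the text. Everything else is an assumption made for illustration: eth_event_t is replaced by a plain int because ethapi.h is not available here, the helper names link_status_req_size(), link_status_req_alloc(), and link_status_get() are hypothetical, and the ioctl call itself is left out.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in only: the real eth_event_t is defined in ethapi.h. */
typedef int eth_event_t;

typedef struct {
    int ldc_num;
    eth_event_t status;
    int prv_unused;
} vnet_link_status_ele_t;

typedef struct {
    int instance;
    int ele_cnt;
    int error;
    char status_ele[1];
} vnet_link_status_ioc_t;

/* Hypothetical helper: bytes needed for a request with room for n elements. */
static size_t link_status_req_size(int n)
{
    return offsetof(vnet_link_status_ioc_t, status_ele)
        + (size_t)n * sizeof(vnet_link_status_ele_t);
}

/* Hypothetical helper: allocate a zeroed request and fill in the header. */
static vnet_link_status_ioc_t *link_status_req_alloc(int instance, int n)
{
    vnet_link_status_ioc_t *req;

    if (n <= 0)
        return NULL;
    req = calloc(1, link_status_req_size(n));
    if (req == NULL)
        return NULL;
    req->instance = instance;   /* e.g. 4 for network@4 */
    req->ele_cnt = n;           /* number of elements we made room for */
    return req;
}

/*
 * Hypothetical helper: copy element i out of the byte region.
 * memcpy() is used so that no alignment of status_ele is assumed.
 */
static int link_status_get(const vnet_link_status_ioc_t *req, int i,
    vnet_link_status_ele_t *out)
{
    const char *base;

    if (i < 0 || i >= req->ele_cnt)
        return -1;
    base = (const char *)req + offsetof(vnet_link_status_ioc_t, status_ele);
    memcpy(out, base + (size_t)i * sizeof(*out), sizeof(*out));
    return 0;
}
```

After the ioctl returns, an application would check the error field first. If it is 1, the request was too small and a larger ele_cnt is needed.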
Adds a multicast address into the multicast table of the vnet device and the connected virtual switch.
Pointer to a byte array of length ETHERADDRL that contains the multicast address.
Deletes a multicast address from the multicast table for the vnet device and the connected virtual switch.
Pointer to a byte array of length ETHERADDRL that contains the multicast address.
Displays or obtains a copy of the vnet device multicast table.
(vnet_mc_table_t *)arg typedef struct { int ele_cnt; int error; boolean_t display; unsigned char pad[4]; char data[1]; } vnet_mc_table_t; where,
ele_cnt - number of MAC addresses for which user has allocated space in memory region starting at data
error - indicates any error encountered in the library while executing the ioctl. Currently, only one error code is supported. If this value is 1 on return from the ioctl call, the memory starting from data is insufficient to store all the MAC addresses in the multicast table. If this value is 0 on return, the call was successful and all entries were copied.
display - If set to B_TRUE by the user, the driver will print the multicast table on the console.
data - beginning of the memory region where all MAC addresses are stored. The user must allocate this memory before the call. This memory must accommodate at least ele_cnt elements.
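Because the error field reports when the caller's buffer is too small, one possible way to fetch the table is to retry with a larger buffer. The following is a sketch under stated assumptions, not the documented procedure: mc_query_fn is a hypothetical callback that would wrap the real vnet_eth_ioc() call, boolean_t is a local stand-in for the system type, ETHERADDRL is taken to be 6, and fake_mc_query() is a test double that pretends the table holds five entries.

```c
#include <stddef.h>
#include <stdlib.h>

#define ETHERADDRL 6    /* assumed Ethernet address length in bytes */

/* Stand-in for the system boolean_t used in the text. */
typedef enum { B_FALSE, B_TRUE } boolean_t;

typedef struct {
    int ele_cnt;
    int error;
    boolean_t display;
    unsigned char pad[4];
    char data[1];
} vnet_mc_table_t;

/* Hypothetical callback that would wrap the real vnet_eth_ioc() call. */
typedef int (*mc_query_fn)(vnet_mc_table_t *tbl);

/*
 * Hypothetical helper: start with room for first_guess addresses and
 * double the room each time the driver reports error == 1, up to max_cnt.
 * Returns a table the caller must free(), or NULL on failure.
 */
static vnet_mc_table_t *mc_table_fetch(mc_query_fn query, int first_guess,
    int max_cnt)
{
    int cnt;

    for (cnt = (first_guess > 0) ? first_guess : 1; cnt <= max_cnt; cnt *= 2) {
        vnet_mc_table_t *tbl;
        int rc;

        tbl = calloc(1, offsetof(vnet_mc_table_t, data)
            + (size_t)cnt * ETHERADDRL);
        if (tbl == NULL)
            return NULL;
        tbl->ele_cnt = cnt;
        tbl->display = B_FALSE;     /* copy the table, do not print it */

        rc = query(tbl);
        if (tbl->error == 1) {      /* buffer too small: grow and retry */
            free(tbl);
            continue;
        }
        if (rc != 0) {              /* any other failure */
            free(tbl);
            return NULL;
        }
        return tbl;
    }
    return NULL;
}

/* Test double only: behaves like a driver whose table holds 5 entries. */
static int fake_mc_query(vnet_mc_table_t *tbl)
{
    tbl->error = (tbl->ele_cnt < 5) ? 1 : 0;
    return 0;
}
```

The text does not say whether the driver rewrites ele_cnt with the number of entries actually copied, so this sketch makes no use of the value after the call.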
Obtains statistics about a vnet instance.
typedef struct {
    /* Link Input/Output stats */
    uint64_t ipackets;            /* # rx packets */
    uint64_t ierrors;             /* # rx error */
    uint64_t opackets;            /* # tx packets */
    uint64_t oerrors;             /* # tx error */
    /* MIB II variables */
    uint64_t rbytes;              /* # bytes received */
    uint64_t obytes;              /* # bytes transmitted */
    uint32_t multircv;            /* # multicast packets received */
    uint32_t multixmt;            /* # multicast packets for xmit */
    uint32_t brdcstrcv;           /* # broadcast packets received */
    uint32_t brdcstxmt;           /* # broadcast packets for xmit */
    uint32_t norcvbuf;            /* # rcv packets discarded */
    uint32_t noxmtbuf;            /* # xmit packets discarded */
    /* Tx Statistics */
    uint32_t tx_no_desc;          /* # out of transmit descriptors */
    uint32_t tx_qfull;            /* pkts dropped due to qfull in vsw */
    uint32_t tx_pri_fail;         /* # tx priority packet failures */
    uint64_t tx_pri_packets;      /* # priority packets transmitted */
    uint64_t tx_pri_bytes;        /* # priority bytes transmitted */
    /* Rx Statistics */
    uint32_t rx_allocb_fail;      /* # rx buf allocb() failures */
    uint32_t rx_vio_allocb_fail;  /* # vio_allocb() failures */
    uint32_t rx_lost_pkts;        /* # rx lost packets */
    uint32_t rx_pri_fail;         /* # rx priority packet failures */
    uint64_t rx_pri_packets;      /* # priority packets received */
    uint64_t rx_pri_bytes;        /* # priority bytes received */
    /* Callback statistics */
    uint32_t callbacks;           /* # callbacks */
    uint32_t dring_data_acks;     /* # dring data acks recvd */
    uint32_t dring_stopped_acks;  /* # dring stopped acks recvd */
    uint32_t dring_data_msgs;     /* # dring data msgs sent */
} vnet_eth_stats_t;
typedef struct {
    int ldc_num;
    unsigned char pad[4];
    vnet_eth_stats_t stats;
} vnet_eth_kstat_ele_t;
stats - statistics for the LDC channel
typedef struct { int instance; int ele_cnt; int error; unsigned char pad[4]; char stats_ele[1]; } vnet_eth_kstat_t;
instance - virtual network device instance. This is the value displayed in the output of
ldm list-bindings -e ndps-ldom under the DEVICE column. For example, if the output shows network@4, then the user must pass 4 for the instance number.
ele_cnt - number of vnet_eth_kstat_ele_t elements for which user has allocated memory starting from stats_ele
error - indicates any error encountered in the library while executing the ioctl. Currently, only one error code is supported. If this value is 1 upon return from the ioctl call, the memory starting from stats_ele is insufficient to store the statistics of all LDC channels in the vnet device. If this value is 0 upon return from the ioctl call, the call was successful.
stats_ele - beginning of the memory region where the vnet_eth_kstat_ele_t for each LDC channel of a vnet device is stored. The user must allocate this memory before the call. This memory must accommodate at least ele_cnt elements.
Displays the vnet device statistics on the console.
Not supported by the vnet device driver.
Obtains the MAC address of a vnet device.
int vnet_eth_get_mac_addr(void *ihandle, unsigned char *addr);
ihandle - opaque handle returned by vnet_eth_open()
addr - pointer to a user space buffer of length ETHERADDRL bytes where device MAC address is copied by the driver
On success, returns 0. On failure, returns -1.
This function accomplishes two important tasks:
1. Flushes packets that are pending transmission in the vnet transmit buffers and frees the associated transmit resources.
2. Performs the signalling needed to re-initialize the vnet device in case it has been reset.
Given the nature of the VIO protocol that is used by vnet drivers, it is possible that when an application is transmitting packets in bursts, at the end of a burst, some packets are left in the transmit buffers of the vnet device waiting for the peer vnet device to consume them. In such a scenario, this function must be called to signal the peer vnet device that there are pending packets for it to consume. When an application is transmitting packets continuously by calling vnet_eth_write(), the same functionality is achieved by vnet_eth_write(). When an application has stopped transmitting packets using vnet_eth_write(), it must call vnet_eth_flush() until it returns 0 to ensure that there are no more pending transmits.
It is also necessary that the application call this function in order to re-initialize the vnet device, in case the device is reset, when the application is not transmitting packets using vnet_eth_write().
If the application is sending packets by calling vnet_eth_write(), the signalling necessary for re-initializing the device is achieved in vnet_eth_write(). If not, the application must call this function in a timely manner.
int vnet_eth_flush(void *ihandle, int chan);
ihandle - opaque handle returned by vnet_eth_open()
chan - relative index into memory pool array into which packet must be freed after transmission completes
If all pending transmissions are completed, returns 0. If more packets are pending transmission, returns 1.
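The rule that an application must keep calling vnet_eth_flush() until it returns 0 can be sketched as a small drain loop. This is a minimal illustration with assumptions: drain_tx() and the flush_fn type are hypothetical names, the retry bound is a design choice added here so that a stalled peer cannot hang the caller, and fake_flush() is a test double that stands in for the driver.

```c
#include <stddef.h>

/* Function-pointer type matching the vnet_eth_flush() prototype. */
typedef int (*flush_fn)(void *ihandle, int chan);

/*
 * Hypothetical helper: call flush until it reports that nothing is
 * pending (0) or until max_tries calls have been made.
 * Returns 0 when drained, 1 when packets are still pending.
 */
static int drain_tx(flush_fn flush, void *ihandle, int chan, int max_tries)
{
    int i;

    for (i = 0; i < max_tries; i++) {
        if (flush(ihandle, chan) == 0)
            return 0;
    }
    return 1;
}

/*
 * Test double only: ihandle points at a count of pending packets and
 * each call lets the peer consume one of them.
 */
static int fake_flush(void *ihandle, int chan)
{
    int *pending = ihandle;

    (void)chan;
    if (*pending > 0)
        (*pending)--;
    return (*pending > 0) ? 1 : 0;
}
```

In an application, vnet_eth_flush itself would be passed as the first argument, together with the handle returned by vnet_eth_open(). When drain_tx() returns 1, the caller can do other work and try again later.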
A vnet device can have several LDC channels to other vnet devices and also to virtual switches.
vnet_eth_read() reads packets from the ingress FIFOs of each of these LDC channels in a round-robin manner. Each vnet device has an associated Receive Burst Size value that determines the budget, in units of packets, available to each LDC channel in one call of vnet_eth_read() by the application.
A non-zero Receive Burst Size value implies that at most Receive Burst Size packets are read from an LDC channel. If more packets than Receive Burst Size are available to be processed, the vnet_eth_read() function stops processing that LDC channel and proceeds to process another LDC channel.
If Receive Burst Size value is non-zero and packets available are fewer than the value, then those packets are processed and vnet_eth_read() will continue to process another LDC channel.
If the Receive Burst Size is zero, then it implies that the vnet_eth_read() function will process the LDC channel until there are no more packets to be processed. That is, this represents an infinite burst size value.
When a vnet device has more than one LDC channel and its Receive Burst Size value is zero, an LDC channel with heavy incoming traffic can starve the other LDC channels. In the same scenario, if an LDC channel is receiving continuous traffic, the vnet_eth_read() function can be blocked processing that LDC channel.
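The effect of the Receive Burst Size budget can be illustrated with a toy model. This is not the driver's code: model_read() is a hypothetical function that assumes one call visits every LDC channel once, with pending[i] holding the number of packets waiting on channel i.

```c
#include <stddef.h>

/*
 * Toy model of one read pass. Each channel gives up at most `burst`
 * packets; burst == 0 means the channel is drained completely.
 * taken[i] receives the number of packets read from channel i.
 * Returns the total number of packets read in this pass.
 */
static unsigned int model_read(unsigned int pending[], size_t nch,
    unsigned int burst, unsigned int taken[])
{
    unsigned int total = 0;
    size_t i;

    for (i = 0; i < nch; i++) {
        unsigned int n = pending[i];

        if (burst != 0 && n > burst)
            n = burst;          /* budget exhausted, move to next channel */
        pending[i] -= n;
        taken[i] = n;
        total += n;
    }
    return total;
}
```

With a non-zero burst size, a busy channel is cut off after its budget and the remaining channels still get served in the same pass. With a burst size of zero, the pass stays on a channel for as long as it has packets, which is the starvation case described above when traffic on that channel never stops.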
This function is used by the user application to set the Receive Burst Size of a vnet device to a desired value.
The default value for Receive Burst Size is zero.
int vnet_set_rxburst(void *ihandle, unsigned int burst_size);
ihandle - opaque handle returned from vnet_eth_open()
burst_size - burst size value for the vnet device
On success, returns 0. On failure, returns -1.
This function is used by the user application to read the current setting of the Receive Burst Size value.
unsigned int vnet_get_rxburst(void *ihandle);
ihandle - opaque value returned by vnet_eth_open()
Returns the current Receive Burst Size value for the vnet device.
vnet_rxoff_var enables the start addresses of packets received over a vnet device to be offset by cache-aligned amounts. Offsets are 64B, 128B, or 192B. Default value is 1 (turned on).
extern uint_t vnet_rxoff_var_n2_2;
vnet_rxoff_var_n2_2 enables N2 2.2 behaviour for start address offsets of packets received over a vnet device. Offsets are 64B, 128B, 320B, and 384B. Default value is 0 (turned off).
When both vnet_rxoff_var and vnet_rxoff_var_n2_2 are set to 0, a constant offset of 128B is used for packets received over all vnet devices being used.
The actual offset in a received packet is the pre-determined packet offset plus 6B. So the actual offsets when vnet_rxoff_var is enabled are 70B, 134B, and 198B. The actual offsets when vnet_rxoff_var_n2_2 is enabled are 70B, 134B, 326B, and 390B.
The offset for received packets is determined per vnet device. So, for a given vnet device, the offset is constant for all packets received. When multiple vnet devices are used, each vnet device can have a different offset. When the number of vnet devices in use is greater than the number of offset values, some of the vnet devices can share the same offset.
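The "pre-determined offset plus 6B" rule can be written out as a short calculation. The sketch below only encodes the base offsets listed in the text. The table names, RXOFF_CONST_BASE, VNET_RX_EXTRA, and the two helper functions are hypothetical names chosen for illustration, and the way the driver assigns an offset to a particular device is not modelled.

```c
#include <stddef.h>

#define VNET_RX_EXTRA    6U     /* the 6B added to every base offset */
#define RXOFF_CONST_BASE 128U   /* base offset when both variables are 0 */

/* Base offsets when vnet_rxoff_var is enabled. */
static const unsigned int rxoff_var_base[] = { 64, 128, 192 };

/* Base offsets when vnet_rxoff_var_n2_2 is enabled. */
static const unsigned int rxoff_var_n2_2_base[] = { 64, 128, 320, 384 };

/* Actual offset of packet data for one base offset. */
static unsigned int rx_actual_offset(unsigned int base)
{
    return base + VNET_RX_EXTRA;
}

/* Fill out[0..n-1] with the actual offsets for a table of base offsets. */
static void rx_actual_offsets(const unsigned int *base, size_t n,
    unsigned int *out)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = rx_actual_offset(base[i]);
}
```

An application that needs to reason about where packet data starts in a buffer could use a calculation like this as a cross-check. The authoritative values remain those used by the driver.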
This note explains the buffer management in the Sun Netra DPS vnet driver. In this note, all references to chained packet imply a packet that is created by linking several mblks using their b_cont field and each mblk points to a data buffer.
The vnet driver will always call vnet_pbuf_alloc() to allocate buffers for packets or frames. It does not support buffer reuse from transmit free queue. All mblks allocated by the vnet driver have their inport field set to INPORT_TYPE_VNET (see vnet_ethapi.h). This field allows applications to identify buffers that are allocated by the vnet driver.
If the user application passes a NULL in the transmit_free_queue argument of vnet_eth_open(), then the vnet driver calls vnet_pbuf_free() to free the buffers that have completed transmission. If the user application has transmitted a chained packet, then the vnet driver will walk the chain and free the individual mblks in the chain.
If the user application passes a non-NULL, valid fast queue in the transmit_free_queue argument of vnet_eth_open(), then the vnet driver en-queues all frames whose transmission is complete into the transmit_free_queue. The user application must de-queue such packets from this fast queue for further processing.
If the user application has transmitted a chained packet, then the head mblk of the chain is en-queued by the driver into the transmit_free_queue. The user is responsible for freeing the individual mblks in this chain by walking the chain.
If the user application transmits an untagged frame that is destined to another vnet device that uses VLAN Tagging, then the vnet driver may prepend an mblk to the frame during the process of tagging. This means that, in this scenario, the vnet driver creates a chain of mblks to complete the transmission. When the transmission is complete, the chained frame is en-queued into the transmit_free_queue. So the user application must always check for a chain before freeing the mblk it dequeues from the transmit_free_queue.
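Walking a chain before freeing can be sketched as follows. This is an illustration under stated assumptions: the mblk_t shown here is a minimal stand-in with only the b_cont link (the real structure has more fields), free_mblk_chain() is a hypothetical helper, and the per-mblk free routine is passed in as a callback because the signature of vnet_pbuf_free() is not shown in this section. heap_free_one() is a test double.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in: only the chain link described in the text. */
typedef struct mblk {
    struct mblk *b_cont;    /* next mblk in the chain, or NULL */
    void *data;             /* placeholder for the data buffer */
} mblk_t;

/* Callback that releases a single mblk. */
typedef void (*mblk_free_fn)(mblk_t *mp);

/*
 * Hypothetical helper: free every mblk in a chain, starting at head.
 * The link is read and cleared before the mblk is released so that a
 * freed block is never dereferenced and never freed twice.
 * Returns the number of mblks released.
 */
static int free_mblk_chain(mblk_t *head, mblk_free_fn free_one)
{
    int n = 0;

    while (head != NULL) {
        mblk_t *next = head->b_cont;

        head->b_cont = NULL;
        free_one(head);
        head = next;
        n++;
    }
    return n;
}

/* Test double only: releases an mblk that came from the C heap. */
static void heap_free_one(mblk_t *mp)
{
    free(mp);
}
```

The same loop handles a single unchained mblk, since its b_cont is NULL. This makes it a safe default for anything de-queued from the transmit_free_queue.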
This note gives an example of using the receive packet queue and the transmit free queue in vnet driver.
fastq_t rxfq_dram[MAX_VNET_DEVS], tx_freeq_dram[MAX_VNET_DEVS];
vnet_eth_open(VNET_VID, VNET_DID, port, poolid, rxfq_dram[port], tx_freeq_dram[port]);
This note explains VLAN support for vnet devices in Sun Netra DPS.
The Sun Netra DPS vnet driver supports VLANs. A Sun Netra DPS vnet interface can be assigned a VLAN by using the respective ldm command. Either Port VLANs or VLAN Tagging can be used for Sun Netra DPS vnet interfaces.
When a Sun Netra DPS vnet interface is connected to a Linux vnet interface, VLAN is not supported, because the Linux vnet driver does not support VLANs. Sun Netra DPS vnet interface initialization is known to fail when the interface is connected to a Linux vnet interface that has VLANs enabled.
When a Sun Netra DPS vnet interface is connected to a Solaris OS vnet interface, VLAN is supported. The Sun Netra DPS vnet driver will support the following functionality:
Copyright © 2010, Oracle and/or its affiliates. All rights reserved.