CHAPTER 8
Receive Packet Classification
This chapter describes the basic functions of the Receive Packet Classification and the Sun Netra DPS software interface. Topics include:
The Sun multithreaded 10GbE with network interface unit (NIU) networking hardware includes a Receive Packet Classifier that performs L2/L3/L4 header parsing, matching, and searching functions. Sun Netra DPS provides the software interface to this hardware mechanism.
Classification is needed for the following reasons:
This classification spreads traffic flows across multiple CPUs so that each CPU hardware strand shares the load of 10 Gbps processing. When the load is spread across at least eight pipelines, packets are processed at 10 Gbps without overloading any single processing unit.
This classification refers to blocking, rerouting, or performing special processing on certain traffic types in the incoming traffic stream.
This classification sustains forwarding of 10 Gbps of incoming traffic with a relatively small packet size from the 10 Gbps Ethernet ingress port to the 10 Gbps egress port. Traffic must be spread into multiple DMA channels for processing.
The following network interfaces support classification:
Sun multithreaded PCIe 10GbE, PCIe 4GbE, and 10GbE NIU support two ways to spread input packets:
Determines the target DMA channel based on a L2 RDC group and then a hash algorithm applied on the defined values of L2/L3/L4 header fields.
Determines the target DMA channel based on the values of L2/L3/L4 header fields with the help of hardware lookup tables and TCAM preprogrammed with matching rules.
In Sun Multithreaded 10-Gb Ethernet and NIU, there are a total of 16 Receive DMA Channels (RDCs) in hardware. These channels are organized into Receive DMA Channel Groups (RDC groups). Each RDC group can have up to 16 RDC entries. During receive, an RDC group (identified by its RDC group number) is selected. For packets that pass through classification successfully, with no L2 CRC error or IP checksum error, the RDC group number and the offset from the hardware classifier are used to select the DMA channel. For packets with checksum errors, the offset is changed to zero to select the default channel within the group. A hardware RDC table holds the content of each RDC group. Each table consists of the following entries:
Where n is any number between 0 and 15.
In the default configuration, each Ethernet port is associated with a default RDC table and all classification results will be based on the value of this RDC table. The RDC used for receive is determined by the RDC table entry that is indexed by the offset value generated by the classifier.
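The selection rule described above can be sketched in C as follows. This is only an illustration of the indexing logic, not the driver's code: the names rdc_group_t and rdc_select are hypothetical, and the group contents used with it would be whatever the RDC table actually holds.

```c
#include <stdbool.h>
#include <stdint.h>

#define RDC_ENTRIES_PER_GROUP 16

/* Hypothetical software model of one RDC group (one RDC table):
 * each entry names a receive DMA channel in the range 0..15. */
typedef struct rdc_group_s {
    uint8_t rdc[RDC_ENTRIES_PER_GROUP];
} rdc_group_t;

/* Return the DMA channel for a packet.
 * offset    - offset value generated by the classifier
 * has_error - true if the packet had an L2 CRC or IP checksum error;
 *             the offset is then forced to zero (the group default). */
uint8_t
rdc_select(const rdc_group_t *grp, unsigned offset, bool has_error)
{
    if (has_error)
        offset = 0;
    /* Keep the index inside the 16-entry table. */
    return grp->rdc[offset % RDC_ENTRIES_PER_GROUP];
}
```

For a group whose 16 entries cycle through RDC#0 to RDC#7, an offset of 12 selects RDC#4, while any packet flagged with an error lands on the channel stored in entry 0.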
The following tables show the contents of the default RDC table for each reference configuration:
In this configuration, the RDC table entry 0 is bound to port0 as the default RDC table entry. All classification results will end up in one of the table entries in this table. The target RDC used to carry traffic will be in a range from RDC#0 to RDC#7.
In this configuration, entry 0 is bound to port0 as the default RDC table entry for port0. Entry 8 is bound to port1 as the default RDC table entry. All classification results will end up in one of the table entries in these two tables. The target RDC used to carry traffic will be in a range from RDC#0 to RDC#7.
In this configuration, entry 0 is bound to port0 as the default RDC table entry for port0. Entry 1 is bound to port1 as the default RDC table entry, and so on, up to 4 ports. All classification results end up in one of the table entries in these four tables. Only one RDC is used for each port.
The following I/O control functions can be used to override the default RDC configuration:
The following I/O control functions show the current RDC group contents and configuration:
Hashing uses a lookup in a hash table that is indexed by the hash key. The hash key is created by applying a hash algorithm to a flow key, and the flow key is generated by extracting certain fields from the Layer 2, Layer 3, and Layer 4 (L2/L3/L4) packet headers.
The header fields in the flow key selections consist of the following individual header fields:
The hashing algorithm is based on polynomial hashing with CRC-32C. The algorithm produces a 32-bit hash value. The last four bits of the value are used to index into a hardware hash table to look up a DMA channel. In a Sun Netra DPS environment, one RDC table is used. Because the DMA channel number corresponds one-to-one to the RDC table entry number, the value of the last four bits equals the DMA channel number.
$x^{32} + x^{28} + x^{27} + x^{26} + x^{25} + x^{23} + x^{22} + x^{20} + x^{19} + x^{18} + x^{14} + x^{13} + x^{11} + x^{10} + x^{9} + x^{8} + x^{6} + 1$
The hash key is generated from a seed value. The following driver parameter can be used to modify the hash key:
It is set to 0xffffffff by default.
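The following C sketch shows one way to reproduce the idea in software. It uses the common reflected, bit-at-a-time CRC-32C convention with a final inversion. Several points are assumptions rather than statements about the hardware: that the seed acts as the initial CRC register value, that the hardware uses the same bit ordering and final inversion, and that "the last four bits" means the four least-significant bits. The function names crc32c and hash_to_channel are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reversed form of the CRC-32C generator polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Bit-at-a-time CRC-32C over a byte buffer.
 * seed is the initial register value (0xffffffff is the usual choice). */
uint32_t
crc32c(const uint8_t *data, size_t len, uint32_t seed)
{
    uint32_t crc = seed;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;   /* conventional final inversion */
}

/* Assumed mapping: the four least-significant bits pick one of 16 entries. */
unsigned
hash_to_channel(uint32_t hash)
{
    return hash & 0xFu;
}
```

With this convention, the nine ASCII bytes "123456789" hash to 0xE3069283 (the standard CRC-32C check value), whose low four bits select entry 3. Changing the seed changes the hash and therefore how flows map to channels.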
Use hashing for general load spreading and load balancing applications. The traffic load of each DMA channel depends on the values in the header fields used for the hash. Because the target DMA channel is determined by a polynomial, the correlation between a header value and its target DMA channel cannot be easily determined. How evenly the load is spread across the DMA channels also depends on the values and range of the header fields. Hashing is considered a general-purpose load spreading scheme.
Hashing is enabled by default. The hash policy is determined by setting the FLOW_POLICY to one of the values shown in TABLE 8-5:
The default FLOW_POLICY is HASH_ALL, meaning that the hardware hash algorithm is applied to all of the above header fields. To disable hashing, set FLOW_POLICY to 0 or TCAM_CLASSIFY. When set to 0, no traffic spreading is performed, and all traffic ends up at a default DMA channel. When set to TCAM_CLASSIFY, traffic spreading is determined by predefined flow specifications.
The layer 2 parser (part of the classification hardware) parses the following information from an Ethernet frame:
1. If the frame is a VLAN packet, the VLAN ID.
2. The Ethernet format, and whether there is an LLC/SNAP field.
Using this information, the classifier selects an RDC table to be used for further classification. L2 classification can be based on the following criteria:
For VLAN frames, the VLAN ID is used to index into a VLAN table to determine the RDC table number to be used for further classification. The VLAN table consists of 4-K entries. Each entry specifies a VLAN ID and its corresponding target RDC table number.
The target RDC table can also be determined based on the MAC address information. This information includes the MAC address type (for example, unicast, multicast, self address, address filter, or flow control) and the address.
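The VLAN-based selection can be pictured as a direct-indexed table. The sketch below is a software model only, assuming the standard IEEE 802.1Q tag layout in which the VLAN ID occupies the low 12 bits of the tag control field. The names vlan_table_set and vlan_to_rdc_table are hypothetical and are not Sun Netra DPS functions.

```c
#include <stdint.h>

#define VLAN_TABLE_SIZE 4096    /* one entry per 12-bit VLAN ID */
#define VLAN_ID_MASK    0x0FFFu /* low 12 bits of the 802.1Q tag */

/* Record which RDC table a VLAN ID should use. */
void
vlan_table_set(uint8_t *table, uint16_t vlan_id, uint8_t rdc_table)
{
    table[vlan_id & VLAN_ID_MASK] = rdc_table;
}

/* Look up the RDC table number for a frame's tag control field.
 * The priority and CFI bits in the upper four bits are ignored. */
uint8_t
vlan_to_rdc_table(const uint8_t *table, uint16_t tag_control)
{
    return table[tag_control & VLAN_ID_MASK];
}
```

Because the table is indexed directly by VLAN ID, the lookup costs the same for every frame regardless of how many VLANs are configured.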
The following I/O Control functions are used for L2 classification setup:
Because both VLAN table and MAC address table can set the preference, the arbitration between VLAN table and MAC address table is done by setting the priority field in each of these two tables.
L3/L4 header classification relies on the TCAM hardware to determine how traffic flows are distributed. Multiple TCAM hardware entries (256 in Sun multithreaded 10GbE, 128 in NIU) are available for specifying flows. CAM lookup key generation uses the concept of packet classes to assemble a key. With the CAM key, a packet goes through a single CAM lookup table for an associative search. L3/L4 header classification starts when the header parser identifies the incoming L2/L3 packet type.
The following packet classes are supported in Sun Netra DPS:
Use flow tables and TCAM to direct a particular type of traffic flow (with different traffic classes) into particular DMA channels. Flow tables and TCAM are ideal for use in load balancing applications.
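A TCAM lookup can be modeled in software as a search over key and mask pairs. The sketch below is illustrative only. It assumes a 64-bit key, a mask convention in which a 1 bit means "don't care", and that the lowest-numbered matching entry wins; the real key layout and match priority are properties of the hardware. The names tcam_entry_t and tcam_lookup are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one TCAM entry. */
typedef struct tcam_entry_s {
    uint64_t key;      /* value to compare against */
    uint64_t mask;     /* 1 bits are "don't care", 0 bits are compared */
    uint8_t  channel;  /* target DMA channel, 0..15 */
    uint8_t  valid;    /* nonzero if the entry is programmed */
} tcam_entry_t;

/* Return the DMA channel of the first matching entry, or -1 if none match. */
int
tcam_lookup(const tcam_entry_t *tbl, size_t n, uint64_t pkt_key)
{
    for (size_t i = 0; i < n; i++) {
        if (!tbl[i].valid)
            continue;
        /* XOR exposes differing bits; the mask hides the ignored ones. */
        if (((pkt_key ^ tbl[i].key) & ~tbl[i].mask) == 0)
            return tbl[i].channel;
    }
    return -1;
}
```

An entry with an all-zero mask matches exactly one key, while an entry that masks out the low 56 bits matches every key sharing its top byte. Hardware performs the comparison against all entries in parallel; the loop here only models the result.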
The interface to the Flow Matching scheme is the ETH_IOC_SET_CLASSIFY "IO Control" command of the Sun Netra DPS Ethernet interface. The following shows the calling convention of the interface:
eth_ioc(ihdlnet[port], ETH_IOC_SET_CLASSIFY, (void *)&clsfy_ioc);
ihdlnet[] is an array of device driver handles indexed by the Ethernet port number [port]. ETH_IOC_SET_CLASSIFY is the set classifier command.
The clsfy_ioc structure is defined as follows:
typedef struct classify_ioc_s {
    uint_t opcode;
    uint_t action;
    flow_spec_t flow_spec;
} classify_ioc_t;
opcode specifies what to do about a new traffic flow. TABLE 8-6 shows possible opcode values:
action specifies what action to take when there is a match. TABLE 8-7 shows possible action values:
flow_spec is the flow specification specifying the characteristics of the IPv4 and IPv6 flow. The following shows the flow_spec structure:
TABLE 8-8 shows the possible values of the traffic flow spec types (fs_type):
This is the index into the TCAM entries (for L3/L4 TCAM classification) or index into the MAC or VLAN table (for L2 MAC/VLAN classification).
Note - The software application must keep track of the index number.
This is the target DMA channel, in the range 0 to 15.
ue is the flow specification to match: the 5-tuple structure for IPv4 or the 4-tuple structure for IPv6 in L3/L4 TCAM classification, or the L2 header structure in L2 classification. um is the bit mask corresponding to ue. Set a mask bit to 1 for don't care (the bit is not compared). Set a mask bit to 0 to compare the bit.
This is the entire 64-bit header.
The following is the IPv4 flow specification structure:
The following is the IPv6 flow specification structure:
typedef struct flow_spec_ipv6_s {
    uint8_t protocol;
    union {
        port_t tcp;
        port_t udp;
        spi_port_t spi;
    } port;
    uint8_t src[16];
    uint8_t dst[16];
} flow_spec_ipv6_t;
This is the L2 header structure as shown below:
typedef struct flow_spec_l2_s {
    uint8_t dst[6];      /* MAC address */
    uint8_t src[6];      /* MAC address */
    uint16_t type;       /* Ether type */
    uint16_t vlantag;    /* VLANID|CFI|PRI */
} flow_spec_l2_t;
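As an illustration of the ue/um convention (mask bit 1 = don't care, mask bit 0 = compare), the following C sketch builds a mask that compares only the destination MAC address and then checks a header against it in software. The helper names l2_mask_dst_only and l2_match are hypothetical and are not part of the Sun Netra DPS interface; the hardware performs the actual comparison.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* L2 header structure, repeated here so the sketch is self-contained. */
typedef struct flow_spec_l2_s {
    uint8_t  dst[6];     /* MAC address */
    uint8_t  src[6];     /* MAC address */
    uint16_t type;       /* Ether type */
    uint16_t vlantag;    /* VLANID|CFI|PRI */
} flow_spec_l2_t;

/* Build a mask (um) that compares only the destination MAC address. */
void
l2_mask_dst_only(flow_spec_l2_t *um)
{
    memset(um, 0xFF, sizeof(*um));            /* 1 = don't care everywhere */
    memset(um->dst, 0x00, sizeof(um->dst));   /* 0 = compare dst bytes */
}

/* Return 1 if pkt matches ue on every bit that um marks as compared. */
int
l2_match(const flow_spec_l2_t *pkt, const flow_spec_l2_t *ue,
    const flow_spec_l2_t *um)
{
    const uint8_t *p = (const uint8_t *)pkt;
    const uint8_t *e = (const uint8_t *)ue;
    const uint8_t *m = (const uint8_t *)um;

    /* The structure has no padding (6 + 6 + 2 + 2 = 16 bytes),
     * so a byte-by-byte walk covers exactly the header fields. */
    for (size_t i = 0; i < sizeof(*pkt); i++) {
        if (((p[i] ^ e[i]) & (uint8_t)~m[i]) != 0)
            return 0;
    }
    return 1;
}
```

With this mask, two headers that share a destination MAC match even if their source address, Ether type, or VLAN tag differ. Clearing more mask bytes narrows the match accordingly.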
Set FLOW_POLICY to a desired policy. For example:
This command tells Sun multithreaded 10GbE with NIU hardware to hash on all L2/L3/L4 header fields.
This example shows how a flow table can be established in the application.
1. Set up an array of flow table entries.
For example, use entries with the following structure:
2. Populate the flow table as shown in the following example.
3. Write a parsing function to parse the entries in the table as shown in the following example.
4. During the build, enable TCAM classification and disable hashing. To do this, type:
This command enables TCAM classification in the Sun multithreaded 10-Gb Ethernet with NIU hardware, using the matching rules set up in Step 1 to Step 3.
Copyright © 2011, Oracle and/or its affiliates. All rights reserved.