CHAPTER 5

Configuring the nxge Device Driver Parameters

The nxge device driver controls the Sun Dual 10GbE interfaces. You can manually set the nxge driver parameters to customize each device in your system.

This chapter lists the available device driver parameters and describes how you can set these parameters.


nxge Hardware and Software Overview

The Sun Dual 10GbE XFP PCIe ExpressModule provides two 10-Gigabit full-duplex networking interfaces. The device driver automatically sets the link speed to 10000 Mbit/sec and conforms to the IEEE 802.3 Ethernet standard. Each interface has 8 receive DMA channels and 12 transmit DMA channels to enable parallel processing of the packets.

The Sun Dual 10GbE XFP PCIe ExpressModule extends CPU and OS parallelism to networking with its support for hardware-based flow classification and multiple DMAs. Using CPU thread affinity to bind a given flow to a specific CPU thread, the EM enables a one-to-one correlation of Rx and Tx packets across the same TCP connection. This functionality can help avoid cross-calls and context switching to deliver greater performance while reducing the need for CPU resources to support I/O processing. The Sun 10-Gigabit Ethernet EM uses the Sun MAC controller to map the 10-Gigabit XAUI interface onto the PCI Express form factor. The EM supports 10 Gbit/sec bandwidth using eight transmit and eight receive lanes.


Setting nxge Driver Parameters on a Solaris Platform

You can set the nxge device driver parameters in two ways:

If you use the ndd utility, the parameters are valid only until you reboot the system. This method is good for testing parameter settings.

To set parameters so they remain in effect after you reboot the system, create a /platform/sun4u/kernel/drv/nxge.conf file and add parameter values to this file when you need to set a particular parameter for a device in the system.


Setting Parameters Using the ndd Utility

Use the ndd utility to configure parameters that are valid until you reboot the system.

The following sections describe how you can use the nxge driver and the ndd utility to modify (with the -set option) or display (without the -set option) the parameters for each nxge device.

Noninteractive and Interactive Modes

You can use the ndd utility in two modes:

In noninteractive mode, you invoke the utility to execute a specific command. Once the command is executed, you exit the utility.

In interactive mode, you can use the utility to get or set more than one parameter value.

Refer to the ndd(1M) man page for more information.


To Specify Device Instances for the ndd Utility

Before you use the ndd utility to get or set a parameter for an nxge device, you must specify the device instance for the utility.

Check the /etc/path_to_inst file to identify the instance associated with a particular device.


# grep nxge /etc/path_to_inst
"/pci@7c0/pci@0/pci@9/network@0" 0 "nxge"
"/pci@7c0/pci@0/pci@9/network@0,1" 1 "nxge" 
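
The instance number on each matching line maps directly to the device node that the ndd utility expects. The following is a minimal sketch, assuming the three-field layout shown above (quoted path, instance number, quoted driver name). The function name nxge_instances is made up for illustration and is not part of the driver package.

```shell
# Print "/dev/nxgeN <hardware path>" for every nxge line read from stdin.
# Assumes the path_to_inst layout: "path" instance "driver".
nxge_instances() {
    awk '$3 == "\"nxge\"" {
        gsub(/"/, "", $1)          # strip the quotes around the path
        print "/dev/nxge" $2, $1
    }'
}

# Sample input copied from the listing above. On a live system you would
# run:  nxge_instances < /etc/path_to_inst
printf '%s\n' \
    '"/pci@7c0/pci@0/pci@9/network@0" 0 "nxge"' \
    '"/pci@7c0/pci@0/pci@9/network@0,1" 1 "nxge"' | nxge_instances
```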


To Specify Parameter Values Using the ndd Utility

This procedure describes how to modify and display parameter values.

1. Modify a parameter value, using the -set option.

If you invoke the ndd utility with the -set option, you must specify a value. The utility passes the value down to the named driver instance, /dev/nxgenumber, and assigns it to the parameter:


# ndd -set /dev/nxgenumber parameter value

where number is the driver instance, for example, /dev/nxge0 or /dev/nxge1.

2. Display the value of a parameter by specifying the parameter name and omitting the value.

When you omit the -set option, the utility queries the named driver instance, retrieves the value associated with the specified parameter, and prints the value:


# ndd /dev/nxgenumber parameter


To Use the ndd Utility in Interactive Mode

1. List all the parameters supported by the nxge driver by typing ?.


# ndd /dev/nxge0
  name to get/set ? ?
  ?                             (read only)
  function_number               (read only)
  fw_version                    (read only)
  adv_autoneg_cap               (read and write)
  adv_10gfdx_cap                (read and write)
  adv_1000fdx_cap               (read and write)
  adv_100fdx_cap                (read and write)
  adv_10fdx_cap                 (read and write)
  adv_pause_cap                 (read and write)
  accept_jumbo                  (read and write)
  rxdma_intr_time               (read and write)
  rxdma_intr_pkts               (read and write)
  class_opt_ipv4_tcp            (read and write)
  class_opt_ipv4_udp            (read and write)
  class_opt_ipv4_ah             (read and write)
  class_opt_ipv4_sctp           (read and write)
  class_opt_ipv6_tcp            (read and write)
  class_opt_ipv6_udp            (read and write)
  class_opt_ipv6_ah             (read and write)
  class_opt_ipv6_sctp           (read and write)

2. Modify a parameter value by specifying ndd /dev/nxgenumber:


# ndd /dev/nxge0
name to get/set? (Enter the parameter name or ? to view all parameters)

After you enter the parameter name, the ndd utility prompts you for the parameter value.
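
When you script changes, it can help to know which parameters accept writes before you call ndd -set. The following sketch filters a parameter listing in the format shown in step 1 and keeps only the names marked (read and write). The function name writable_params is hypothetical, and the sample input is a subset of the listing above.

```shell
# Read an ndd "?" listing on stdin and print the names of the
# parameters marked "(read and write)".
writable_params() {
    awk '/\(read and write\)/ { print $1 }'
}

# Sample lines copied from the listing in step 1. On a Solaris system the
# listing could be piped in noninteractively, for example:
#   ndd /dev/nxge0 \? | writable_params
printf '%s\n' \
    '  function_number               (read only)' \
    '  accept_jumbo                  (read and write)' \
    '  rxdma_intr_time               (read and write)' | writable_params
```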


Setting Parameters Using the nxge.conf File

Specify the driver parameter properties for each device by creating an nxge.conf file in the /kernel/drv directory. Use an nxge.conf file when you need to set a particular parameter for a device in the system.

The man pages for prtconf(1M) and driver.conf(4) include additional details. See To Access a Man Page.


To Access a Man Page

Type the man command plus the name of the man page.

For example, to access the man page for prtconf(1M), type:


% man prtconf


To Set Driver Parameters Using an nxge.conf File

1. Obtain the hardware path names for the nxge devices in the device tree.

a. Check the /etc/driver_aliases file to identify the name associated with a particular device:


# grep nxge /etc/driver_aliases
nxge "pciex108e,abcd"

b. Locate the path names and the associated instance numbers in the /etc/path_to_inst file.


# grep nxge /etc/path_to_inst
"/pci@780/pci@0/pci@8/network@0" 0 "nxge"
"/pci@780/pci@0/pci@8/network@0,1" 1 "nxge"

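In this example, each line gives the hardware path name of the device in double quotes, followed by the instance number and then the driver name in double quotes.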

To identify a PCIe device unambiguously in the nxge.conf file, use the name, parent name, and the unit address for the device. Refer to the pci(4) man page for more information about the PCIe device specification.

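In this example, the name is "pciex108e,abcd", the parent is "/pci@780/pci@0/pci@8/", and the unit address of the first port is "0".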

2. Set the parameters for the nxge devices in the /platform/sun4u/kernel/drv/nxge.conf file.

In the following example, the ports of all Sun Dual 10GbE XFP PCIe ExpressModules are set to load balance Rx traffic based on the IP source address (0x100). The default value is 0xF80, which load balances Rx traffic based on the IP 5-tuple. Include the 0x prefix, because a value without it is read as decimal. Notice the semicolon that ends each entry.


class-opt-ipv4-tcp = 0x100;
class-opt-ipv4-udp = 0x100;

The following example shows ports on two different cards being set. Only one node needs to be specified.


name = "pciex108e,abcd" parent = "/pci@780/pci@0/pci@8/" unit-address = "0" class-opt-ipv4-tcp = 0x100;
 
name = "pciex108e,abcd" parent = "/pci@7c0/pci@0/pci@9/" unit-address = "0" class-opt-ipv4-tcp = 0x40;

3. Save the nxge.conf file.
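
The per-device entries in step 2 can be derived mechanically from a path in /etc/path_to_inst and the name in /etc/driver_aliases. The following sketch shows one way to do that. The helper nxge_conf_entry is hypothetical. It only splits the path at the last "/" and the last "@", and it has been checked only against the first-port example in this section.

```shell
# Build one nxge.conf line from a device path, the name reported in
# /etc/driver_aliases, a property name, and a value.
nxge_conf_entry() {
    path=$1
    drvname=$2
    prop=$3
    value=$4
    parent=${path%/*}/        # everything before the last path component
    unit=${path##*@}          # text after the final "@"
    printf 'name = "%s" parent = "%s" unit-address = "%s" %s = %s;\n' \
        "$drvname" "$parent" "$unit" "$prop" "$value"
}

nxge_conf_entry /pci@780/pci@0/pci@8/network@0 pciex108e,abcd class-opt-ipv4-tcp 0x100
```

The call prints the same line as the first entry in the two-card example. Review the output before you add it to the nxge.conf file.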


Setting Parameters on a Linux Platform

You can use the ethtool utility or the configtool utility to set parameters on a Linux platform.

Using the ethtool Utility to Set Parameters

This section provides useful ethtool commands for setting parameters.


To Determine Available Parameters

Determine which parameters are available using the ethtool utility:


# ethtool -help eth4
ethtool version 1.8
Usage:
       ethtool DEVNAME
       ethtool -a DEVNAME
       ethtool -A DEVNAME \
               [ autoneg on|off ] \
               [ rx on|off ] \
               [ tx on|off ]
       ethtool -c DEVNAME
       ethtool -C DEVNAME \
               [adaptive-rx on|off] \
               [adaptive-tx on|off] \
               [rx-usecs N] \
               [rx-frames N] \
               [rx-usecs-irq N] \
               [rx-frames-irq N] \
               [tx-usecs N] \
               [tx-frames N] \
               [tx-usecs-irq N] \
               [tx-frames-irq N] \
               [stats-block-usecs N] \
               [pkt-rate-low N] \
               [rx-usecs-low N] \
               [rx-frames-low N] \
               [tx-usecs-low N] \
               [tx-frames-low N] \
               [pkt-rate-high N] \
               [rx-usecs-high N] \
               [rx-frames-high N] \
               [tx-usecs-high N] \
               [tx-frames-high N] \
               [sample-interval N]
       ethtool -g DEVNAME
       ethtool -G DEVNAME \
               [ rx N ] \
               [ rx-mini N ] \
               [ rx-jumbo N ] \
               [ tx N ]
       ethtool -i DEVNAME
       ethtool -d DEVNAME
       ethtool -e DEVNAME \
               [ raw on|off ] \
               [ offset N ] \
               [ length N ]
       ethtool -E DEVNAME \
               [ magic N ] \
               [ offset N ] \
               [ value N ]
       ethtool -k DEVNAME
       ethtool -K DEVNAME \
               [ rx on|off ] \
               [ tx on|off ] \
               [ sg on|off ] \
               [ tso on|off ]
       ethtool -r DEVNAME
       ethtool -p DEVNAME [ %d ]
       ethtool -t DEVNAME [online|(offline)]
       ethtool -s DEVNAME \
               [ speed 10|100|1000 ] \
               [ duplex half|full ]    \
               [ port tp|aui|bnc|mii|fibre ] \
               [ autoneg on|off ] \
               [ phyad %d ] \
               [ xcvr internal|external ] \
               [ wol p|u|m|b|a|g|s|d... ] \
               [ sopass %x:%x:%x:%x:%x:%x ] \
               [ msglvl %d ]
       ethtool -S DEVNAME

Following are some common parameters that can be changed:


# ethtool -c eth8
Coalesce parameters for eth8:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
 
rx-usecs: 8
rx-frames: 512
rx-usecs-irq: 0
rx-frames-irq: 512
 
tx-usecs: 0
tx-frames: 0
tx-usecs-irq: 0
tx-frames-irq: 0
 
rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0
 
rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

rx-usecs and rx-frames control the RX interrupt rate for each RX DMA channel. An RX interrupt is generated after rx-frames packets have been received, or after rx-usecs microseconds if fewer than rx-frames packets arrive within that interval. For low-latency applications, set rx-usecs to a smaller value. For bulk traffic, use a larger value of rx-usecs and control the rate with rx-frames.

rx-frames-irq controls the maximum number of RX packets processed with a single RX interrupt.
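
To get a feel for how rx-usecs and rx-frames interact, the following sketch estimates the interrupt rate of one RX DMA channel. It is a rough model, not a description of the driver internals. It assumes a steady packet rate and that an interrupt fires at whichever limit is reached first. The function name est_rx_irq_rate is made up for illustration.

```shell
# Rough per-channel estimate of RX interrupts per second.
# Arguments: packets per second, rx-usecs, rx-frames (rx-frames must be > 0).
est_rx_irq_rate() {
    pps=$1
    usecs=$2
    frames=$3
    by_frames=$(( pps / frames ))          # interrupts triggered by the frame count
    if [ "$usecs" -gt 0 ]; then
        by_timer=$(( 1000000 / usecs ))    # interrupts triggered by the timer
    else
        by_timer=$pps                      # assumption: no delay, one interrupt per packet
    fi
    rate=$by_frames
    if [ "$by_timer" -gt "$rate" ]; then rate=$by_timer; fi
    if [ "$rate" -gt "$pps" ]; then rate=$pps; fi   # cannot exceed the packet rate
    echo "$rate"
}

est_rx_irq_rate 1000000 8 512     # rx-usecs 8
est_rx_irq_rate 1000000 20 512    # rx-usecs 20
```

Under this model, with one million packets per second on a channel, raising rx-usecs from 8 to 20 lowers the estimate from 125000 to 50000 interrupts per second.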


To Change RX Coalesce Parameters

Type the ethtool -C command:


# ethtool -C eth4 rx-usecs 20
# ethtool -c eth4
Coalesce parameters for eth4:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0
 
rx-usecs: 20
rx-frames: 512
rx-usecs-irq: 0
rx-frames-irq: 512
 
tx-usecs: 0
tx-frames: 0
tx-usecs-irq: 0
tx-frames-irq: 0
 
rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0
 
rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0


To Obtain the Status of L4 Hardware

Type the ethtool -k command:


# ethtool -k eth4
Offload parameters for eth4:
Cannot get device tcp segmentation offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: off
tcp segmentation offload: off

Setting Parameters Using the Bundled configtool Utility

This section describes how to use the commands in the configtool utility.


To Obtain a List of Tunable Parameters

Use the nxge_config if-name get command:


# /usr/local/bin/nxge_config eth4 get
The tunable parameters exported by this device are:
 
class_opt_ipv4_tcp                              Read-Write
class_opt_ipv4_udp                              Read-Write
class_opt_ipv4_ah                               Read-Write
class_opt_ipv4_sctp                             Read-Write
class_opt_ipv6_tcp                              Read-Write
class_opt_ipv6_udp                              Read-Write
class_opt_ipv6_ah                               Read-Write
class_opt_ipv6_sctp                             Read-Write

These classification variables define how each IP class is configured. They also control how the flow template is constructed and how packets are distributed within RDC groups.


Configuration bits:
 
       0x0010:         use MAC Port (for flow key)
       0x0020:         use L2DA (for flow key)
       0x0040:         use VLAN (for flow key)
       0x0080:         use proto (for flow key)
       0x0100:         use IP src addr (for flow key)
       0x0200:         use IP dest addr (for flow key)
       0x0400:         use Src Port (for flow key)
       0x0800:         use Dest Port (for flow key)
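
A class_opt value is the bitwise OR of the configuration bits listed above. The following sketch decodes a value into the flow-key fields it selects. The bit meanings come from the list above. The function name and the short field labels are made up for illustration.

```shell
# Print the flow-key fields selected by a class_opt_* value.
decode_class_opt() {
    value=$(( $1 ))            # accepts decimal or 0x-prefixed hexadecimal
    fields=""
    for entry in 0x0010:mac-port 0x0020:l2da 0x0040:vlan 0x0080:proto \
                 0x0100:ip-src 0x0200:ip-dst 0x0400:src-port 0x0800:dst-port; do
        bit=${entry%%:*}       # text before the colon
        name=${entry#*:}       # text after the colon
        if [ $(( value & $bit )) -ne 0 ]; then
            fields="${fields:+$fields }$name"
        fi
    done
    echo "$fields"
}

decode_class_opt 0xF80    # prints: proto ip-src ip-dst src-port dst-port
decode_class_opt 0x100    # prints: ip-src
```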



Note - The classification variables are modified on an EM basis. That is, if any of these variables is modified for one port, the change carries over to all other ports of that EM.



To Obtain a Specific Variable

Type nxge_config if-name get param-name:


# /usr/local/bin/nxge_config eth4 get class_opt_ipv4_udp
  class_opt_ipv4_udp                0xfe3


To Set a Specific Variable

Type nxge_config if-name set param-name value:


# /usr/local/bin/nxge_config eth4 set class_opt_ipv4_tcp 0xfe0
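
The value passed to the set command can be composed from the configuration bits rather than worked out by hand. The following sketch builds the value from a list of field labels and then prints, without running it, the nxge_config command that would apply it. The function name class_opt_value and the field labels are hypothetical. The bit values come from the Configuration bits list.

```shell
# Combine flow-key field labels into a class_opt_* value.
class_opt_value() {
    value=0
    for name in "$@"; do
        case $name in
            mac-port) bit=0x0010 ;;
            l2da)     bit=0x0020 ;;
            vlan)     bit=0x0040 ;;
            proto)    bit=0x0080 ;;
            ip-src)   bit=0x0100 ;;
            ip-dst)   bit=0x0200 ;;
            src-port) bit=0x0400 ;;
            dst-port) bit=0x0800 ;;
            *) echo "unknown field: $name" >&2; return 1 ;;
        esac
        value=$(( value | $bit ))
    done
    printf '0x%x\n' "$value"
}

# Hash on source and destination IP address only, then show the command.
v=$(class_opt_value ip-src ip-dst)
echo "/usr/local/bin/nxge_config eth4 set class_opt_ipv4_tcp $v"
```

Printing the command first gives you a chance to review the value before you apply it, which matters because the change affects every port of the EM.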


Tuning for Maximum Performance on a Linux Platform

The following tunings improve the performance of the Sun x8 Express Dual 10-Gigabit Ethernet device driver on a system running the Linux operating system.


To Tune for Maximum Performance on a Linux Platform

1. Create the conf file that will be called by the sysctl utility.

For example, /etc/sysctl_nxge.conf:


### IPV4 specific settings
# turns TCP timestamp support off, default 1, reduces CPU use
net.ipv4.tcp_timestamps = 0
# turn SACK support off, default on; on systems with a VERY fast
# bus -> memory interface this is the big gainer
net.ipv4.tcp_sack = 0
# sets min/default/max TCP read buffer, default 4096 87380 174760
net.ipv4.tcp_rmem = 10000000 10000000 10000000
# sets min/default/max TCP write buffer, default 4096 16384 131072
net.ipv4.tcp_wmem = 10000000 10000000 10000000
# sets min/pressure/max TCP buffer space, default 31744 32256 32768
net.ipv4.tcp_mem = 10000000 10000000 10000000
 
### CORE settings (mostly for socket and UDP effect)
# maximum receive socket buffer size, default 131071
net.core.rmem_max = 524287
# maximum send socket buffer size, default 131071
net.core.wmem_max = 524287
# default receive socket buffer size, default 65535
net.core.rmem_default = 524287
# default send socket buffer size, default 65535
net.core.wmem_default = 524287
# maximum amount of option memory buffers, default 10240
net.core.optmem_max = 524287
# number of unprocessed input packets before kernel starts dropping
# them, default 300
net.core.netdev_max_backlog = 300000

2. Load the settings from the file with the sysctl utility.


# sysctl -p /etc/sysctl_nxge.conf
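
After you load the file, you might want to confirm that the running kernel holds the values you intended. The following sketch is one way to check, assuming a Linux host where sysctl -n prints the current value of a key. The function names sysctl_settings and check_sysctl are made up for illustration.

```shell
# Print "key=value" for each setting in a sysctl-style file, skipping
# comments and blank lines and trimming the spaces around "=".
sysctl_settings() {
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' \
        -e 's/^[[:space:]]*//' -e 's/[[:space:]]*=[[:space:]]*/=/' \
        -e 's/[[:space:]]*$//' "$1"
}

# Report every key whose running value differs from the file.
# Defined here but not called, because it needs a live Linux system.
check_sysctl() {
    sysctl_settings "$1" | while IFS='=' read -r key want; do
        have=$(sysctl -n "$key" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/ $//')
        [ "$have" = "$want" ] || echo "$key: running '$have', file '$want'"
    done
}
```

For example, check_sysctl /etc/sysctl_nxge.conf prints nothing when every setting in the file matches the running kernel.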