Writing Device Drivers for Oracle® Solaris 11.2

Updated: September 2014
 
 

GLDv3 Data Paths

    Data-path entry points consist of the following components:

  • Callbacks exported by the driver and invoked by the GLDv3 framework for sending packets.

  • GLDv3 framework entry points called by the driver for transmit flow control and for receiving packets.


Note - If a driver implements the rings capability, then all data sent and received by the driver passes through ring-specific entry points.

Transmit Data Path

The transmit entry point that the GLDv3 framework invokes to pass a message block to the driver depends on whether the underlying driver supports MAC_CAPAB_RINGS. If the driver supports the MAC_CAPAB_RINGS capability, then the framework invokes the mri_tx(9E) ring entry point. Otherwise, the framework invokes the mc_tx(9E) entry point.

Accordingly, the device driver must supply a pointer to its transmit entry point as either mc_tx() or mri_tx(). See GLDv3 MAC Registration Data Structures and mr_rget() Entry Point for more information.

Example 19-6  The mc_tx() and mri_tx() Entry Points
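/*
 * Transmit entry point. xx_t, xx_send(), and the xx_ fields are
 * driver-private placeholders.
 */
mblk_t *
xx_m_tx(void *arg, mblk_t *mp)
{
        xx_t    *xxp = arg;
        mblk_t  *nmp;

        mutex_enter(&xxp->xx_xmtlock);

        /* Device is suspended: free the whole chain and count the errors. */
        if (xxp->xx_flags & XX_SUSPENDED) {
                while ((nmp = mp) != NULL) {
                        xxp->xx_carrier_errors++;
                        mp = mp->b_next;
                        freemsg(nmp);
                }
                mutex_exit(&xxp->xx_xmtlock);
                return (NULL);
        }

        /* Send packets one at a time, unlinking each from the chain. */
        while (mp != NULL) {
                nmp = mp->b_next;
                mp->b_next = NULL;

                if (!xx_send(xxp, mp)) {
                        /* Out of resources: relink the packet and stop. */
                        mp->b_next = nmp;
                        break;
                }
                mp = nmp;
        }
        mutex_exit(&xxp->xx_xmtlock);

        /* NULL if every packet was sent, otherwise the unsent sub-chain. */
        return (mp);
}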

The following sections discuss topics related to transmitting data to the hardware.

Flow Control

If the driver cannot send the packets because of insufficient hardware resources, the driver returns the sub-chain of packets that could not be sent. When more descriptors become available at a later time, the driver must invoke mac_tx_update(9F) or mac_tx_ring_update(9F) to notify the framework. Which function the driver invokes depends on whether the driver implements the rings capability.
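The return-and-notify pattern described here can be tried outside the kernel. The following is a minimal user-space sketch, not driver code: sim_mblk_t, sim_tx_t, sim_tx(), and sim_tx_reclaim() are hypothetical stand-ins, and the updates counter only marks the point where a real driver would notify the framework that transmission can resume.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel mblk_t; only the b_next chain link is modeled. */
typedef struct sim_mblk {
        struct sim_mblk *b_next;
} sim_mblk_t;

/* Hypothetical per-device transmit state. */
typedef struct sim_tx {
        int     free_desc;      /* transmit descriptors currently available */
        int     stalled;        /* set when a sub-chain was returned unsent */
        int     updates;        /* number of simulated framework notifications */
} sim_tx_t;

/* Send as many packets as descriptors allow; return the unsent sub-chain. */
sim_mblk_t *
sim_tx(sim_tx_t *tp, sim_mblk_t *mp)
{
        while (mp != NULL) {
                sim_mblk_t *next = mp->b_next;

                if (tp->free_desc == 0) {
                        tp->stalled = 1;        /* remember that we pushed back */
                        break;                  /* mp still heads the unsent chain */
                }
                tp->free_desc--;
                mp->b_next = NULL;              /* packet handed to the "hardware" */
                mp = next;
        }
        return (mp);
}

/* Called when completed transmits free up descriptors. */
void
sim_tx_reclaim(sim_tx_t *tp, int ndesc)
{
        assert(ndesc >= 0);
        tp->free_desc += ndesc;
        if (tp->stalled && tp->free_desc > 0) {
                tp->stalled = 0;
                tp->updates++;  /* a real driver would notify the framework here */
        }
}
```

With two free descriptors and a chain of three packets, sim_tx() returns the third packet. A later call to sim_tx_reclaim() clears the stalled flag and records one notification, after which the returned sub-chain can be submitted again.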

Hardware Checksumming: Hardware

If the driver specified hardware checksum support (see Hardware Checksum Offload), then the driver must do the following tasks:

  • Use mac_hcksum_get(9F) to check every packet for hardware checksum metadata.

  • Program the hardware to perform the required checksum calculation.

Large Segment Offload

If the driver specified LSO capabilities (see Large Segment (or Send) Offload), then the driver must use mac_lso_get(9F) to query whether LSO must be performed on the packet.

Virtual LAN: Hardware

When the administrator configures VLANs, the MAC layer inserts the needed VLAN headers on the outbound packets before they are passed to the driver through the mc_tx() entry point. However, if the hardware supports VLAN tagging, then the tagging is offloaded to the hardware. See mr_gget() Entry Point for more details.

Receive Data Path

The receive data path can be interrupt-driven or poll-driven.

Receive Interrupt Data Path

Note - If the driver does not support the rings capability, then call the mac_rx(9F) function in your driver's interrupt handler to pass a chain of one or more packets up the stack to the MAC layer. Avoid holding mutexes or other locks during the call to mac_rx() or mac_rx_ring(). In particular, do not hold locks that could be taken by a transmit thread during a call to mac_rx() or mac_rx_ring().

In interrupt mode, packet chains are sent up from the driver to the framework whenever they are received by the NIC and available to the driver for pickup. A packet chain consists of one or more mblk_t structures linked to each other through b_next. Passing chains rather than single packets reduces per-packet processing overhead. In interrupt mode, a driver that supports the rings capability passes received packets up to the framework by calling the mac_rx_ring() function.

void mac_rx_ring(mac_handle_t mh, mac_ring_handle_t mrh, mblk_t *mp_chain, int64_t mr_gen_num)

The mh handle corresponds to the MAC handle obtained by the device driver when it registered with the kernel via the mac_register() function. The mrh handle is the framework ring handle that was passed to the driver as part of the mr_rget() call. mr_gen_num must be set to the generation number specified by the framework when the receive ring was started via the mri_start() entry point. The ring generation number provided by the driver is matched against the ring generation number held in the framework. If they do not match, the received packets are considered stale packets coming from an older assignment of the ring, and they are dropped.
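The generation-number check can be modeled in a few lines of user-space C. This is only an illustrative sketch: gen_ring_t, gen_fw_start(), and gen_rx_ring() are hypothetical names, and the framework's real bookkeeping is not shown in this document. The sketch assumes the framework increments its generation number each time it starts the ring and that the driver saves the value it was given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for mblk_t; only the b_next chain link is modeled. */
typedef struct gen_mblk {
        struct gen_mblk *b_next;
} gen_mblk_t;

/* Hypothetical ring state shared by the simulated framework and driver. */
typedef struct gen_ring {
        int64_t         fw_gen;         /* generation held by the framework */
        int64_t         drv_gen;        /* generation saved by the driver at start */
        unsigned        delivered;      /* packets accepted by the framework */
        unsigned        dropped;        /* stale packets discarded */
} gen_ring_t;

/* Framework starts the ring; the driver records the generation it is given. */
void
gen_fw_start(gen_ring_t *rp)
{
        rp->fw_gen++;
        rp->drv_gen = rp->fw_gen;       /* what a ring start routine would save */
}

/*
 * Simulated receive call: accept the chain only if the caller's generation
 * number matches the framework's. Returns the number of packets accepted.
 */
unsigned
gen_rx_ring(gen_ring_t *rp, const gen_mblk_t *chain, int64_t gen_num)
{
        unsigned n = 0;

        for (; chain != NULL; chain = chain->b_next)
                n++;

        if (gen_num != rp->fw_gen) {
                rp->dropped += n;       /* stale: older assignment of the ring */
                return (0);
        }
        rp->delivered += n;
        return (n);
}
```

If the framework's generation number changes while the driver still holds the old value, every chain the driver passes up is counted as dropped until the ring is started again and the driver picks up the new value.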

Receive Polling Data Path

In addition to receiving packets through an interrupt-driven path, the framework also supports a polling-based data path. In polling mode, a kernel thread running in the stack fetches packets from the driver through a polling entry point. This allows the stack to efficiently control when packets are processed and with what priority, while reducing the number of interrupts coming into the system based on actual load. In addition, polling allows the stack to more effectively enforce bandwidth limits on received traffic, which is especially critical in virtualization scenarios.

The host toggles between interrupt and polling mode on demand. While a ring is in polling mode, the driver should not deliver packets received through the receive ring by using the mac_rx_ring() function. This is guaranteed because the ring's interrupts are disabled while it is in polling mode. Instead, the framework calls the mri_poll() entry point that the driver exposed as part of the mac_ring_info structure. See mr_rget() Entry Point for more information.

Switching Between Interrupt and Polling Mode

By default, a ring should be in interrupt mode after it is started. As long as a ring is in interrupt mode, it should pass up received packets in the form of chains through the receive entry points. When the host switches a ring to polling mode, it disables the ring's interrupt by invoking the interrupt-disable entry point in the mac_intr structure, which the driver previously exposed through the mac_ring_info structure.
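The mode switch can be illustrated with a small user-space model. Everything in this sketch is hypothetical (pm_ring_t, pm_intr(), pm_poll(), and the enable and disable helpers), and the byte limit passed to pm_poll() is an assumption about how a polling entry point might bound its work, not the documented mri_poll() signature. The sketch shows only the rule that packets flow up from the interrupt path while interrupts are enabled and are fetched by the poller otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for mblk_t with a chain link and a packet length. */
typedef struct pm_mblk {
        struct pm_mblk  *b_next;
        int             len;
} pm_mblk_t;

/* Hypothetical receive ring. */
typedef struct pm_ring {
        int             intr_enabled;   /* 1 = interrupt mode, 0 = polling mode */
        pm_mblk_t       *hw_head;       /* packets waiting in the "hardware" */
        pm_mblk_t       *hw_tail;
        int             delivered;      /* packets passed up by the interrupt path */
} pm_ring_t;

/* A ring starts in interrupt mode with nothing queued. */
void
pm_ring_start(pm_ring_t *rp)
{
        rp->intr_enabled = 1;
        rp->hw_head = rp->hw_tail = NULL;
        rp->delivered = 0;
}

void pm_intr_disable(pm_ring_t *rp) { rp->intr_enabled = 0; }
void pm_intr_enable(pm_ring_t *rp) { rp->intr_enabled = 1; }

/* The "hardware" receives one packet and queues it on the ring. */
void
pm_hw_receive(pm_ring_t *rp, pm_mblk_t *mp)
{
        mp->b_next = NULL;
        if (rp->hw_tail == NULL)
                rp->hw_head = mp;
        else
                rp->hw_tail->b_next = mp;
        rp->hw_tail = mp;
}

/* Interrupt path: passes up everything queued, but only in interrupt mode. */
int
pm_intr(pm_ring_t *rp)
{
        int n = 0;

        if (!rp->intr_enabled)
                return (0);
        while (rp->hw_head != NULL) {
                pm_mblk_t *mp = rp->hw_head;

                rp->hw_head = mp->b_next;
                mp->b_next = NULL;
                n++;
        }
        rp->hw_tail = NULL;
        rp->delivered += n;
        return (n);
}

/* Polling path: returns a chain whose total length fits within max_bytes. */
pm_mblk_t *
pm_poll(pm_ring_t *rp, int max_bytes)
{
        pm_mblk_t *head = NULL, **tailp = &head;
        int bytes = 0;

        assert(!rp->intr_enabled);      /* polled only while interrupts are off */
        while (rp->hw_head != NULL && bytes + rp->hw_head->len <= max_bytes) {
                pm_mblk_t *mp = rp->hw_head;

                rp->hw_head = mp->b_next;
                mp->b_next = NULL;
                bytes += mp->len;
                *tailp = mp;
                tailp = &mp->b_next;
        }
        if (rp->hw_head == NULL)
                rp->hw_tail = NULL;
        return (head);
}
```

In this model, packets that arrive while the ring is in polling mode stay queued until pm_poll() fetches them or until interrupts are re-enabled and pm_intr() runs again.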

Hardware Checksumming: MAC Layer

If the driver specified hardware checksum support (see Hardware Checksum Offload), then the driver must use the mac_hcksum_set(9F) function to associate hardware checksumming metadata with the packet.

Virtual LAN: MAC Layer

VLAN packets must be passed with their tags to the MAC layer. Do not strip the VLAN headers from the packets. However, if the hardware supports VLAN stripping and the framework has requested that the hardware strip VLAN tags, then the hardware can strip the tags to improve performance. See mr_gget() Entry Point for more information.