CHAPTER 5

Using PCI Busses With Logical Domains Software

This chapter describes how to configure PCI Express busses across multiple logical domains and how to enable the I/O MMU bypass mode on a PCI bus.


Configuring PCI Express Busses Across Multiple Logical Domains



Note - For Sun UltraSPARC T2-based servers, such as the Sun SPARC Enterprise T5120 and T5220 servers, assign a Network Interface Unit (NIU) to the logical domain rather than use this procedure.



The PCI Express (PCIe) bus on a Sun UltraSPARC T1-based server consists of two ports with various leaf devices attached to them. These ports are identified on a server with the names pci@780 (bus_a) and pci@7c0 (bus_b). In a multidomain environment, you can use the Logical Domains Manager to assign each leaf of the PCIe bus to a separate domain. Thus, more than one domain can have direct access to physical devices instead of relying on I/O virtualization.

When the Logical Domains system is powered on, the control (primary) domain uses all the physical device resources, so the primary domain owns both PCIe bus leaves.




Caution - All internal disks on the supported servers are connected to a single leaf. If a control domain is booted from an internal disk, do not remove that leaf from the domain. Also, ensure that you are not removing the leaf with the primary network port. If you remove the wrong leaf from the control or service domain, that domain would not be able to access required devices and would become unusable. If the primary network port is on a different bus than the system disk, then move the network cable to an onboard network port and use the Logical Domains Manager to reconfigure the virtual switch (vsw) to reflect this change.



Create a Split PCI Configuration

The example shown here is for a Sun Fire T2000 server. This procedure can also be used on other Sun UltraSPARC T1-based servers, such as a Sun Fire T1000 server and a Netra T2000 server. The instructions for different servers might vary slightly from these, but you can obtain the basic principles from the example. Mainly, you need to retain the leaf that has the boot disk, remove the other leaf from the primary domain, and assign that leaf to another domain.
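
The rule in the preceding paragraph (keep the leaf that holds the boot disk, move the other) can be sketched in a few lines of Python. This is only an illustration, not part of the Logical Domains software: the function names leaf_of and plan_split are hypothetical, the leaf names are the two that appear in this chapter's listings, and the boot-disk path is the one shown in the example for this procedure.

```python
import re

# The two leaves of the PCIe bus, as named in this chapter's listings.
LEAVES = ("pci@780", "pci@7c0")


def leaf_of(device_path):
    """Return the first top-level PCIe leaf (such as 'pci@7c0') in a device path."""
    match = re.search(r"(?:^|/)(pci@[0-9a-f]+)(?:/|$)", device_path)
    if match is None:
        raise ValueError("no PCIe leaf found in %r" % (device_path,))
    return match.group(1)


def plan_split(required_paths, leaves=LEAVES):
    """Return (keep, movable): leaves the primary domain must keep, and the rest."""
    required = {leaf_of(path) for path in required_paths}
    keep = [leaf for leaf in leaves if leaf in required]
    movable = [leaf for leaf in leaves if leaf not in required]
    return keep, movable


if __name__ == "__main__":
    # Physical path of the boot disk, taken from the example in this procedure.
    boot_disk = "../../devices/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@0,0:a"
    keep, movable = plan_split([boot_disk])
    print("keep in primary:", keep)
    print("candidates to move:", movable)
```

If the physical path of the primary network port is passed in as well and it lies on the other leaf, the sketch reports no movable leaf, which corresponds to the situation described in the Caution above.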

  1. Verify that the primary domain owns both leaves of the PCI Express bus.


    primary# ldm list-bindings primary
    ...
    IO
        DEVICE           PSEUDONYM        OPTIONS
        pci@780          bus_a
        pci@7c0          bus_b
    ...
    

  2. Determine the device path of the boot disk, which needs to be retained.


    primary# df /
    /                  (/dev/dsk/c1t0d0s0 ): 1309384 blocks   457028 files
    

  3. Determine the physical device to which the block device c1t0d0s0 is linked.


    primary# ls -l /dev/dsk/c1t0d0s0
    lrwxrwxrwx   1 root     root          65 Feb  2 17:19 /dev/dsk/c1t0d0s0 -> ../../devices/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@0,0:a
    

    In this example, the physical device for the boot disk for domain primary is under the leaf pci@7c0, which corresponds to bus_b in the earlier listing. This means that you can assign bus_a (pci@780) of the PCIe bus to another domain.

  4. Check /etc/path_to_inst to find the physical path of the onboard network ports.


    primary# grep e1000g /etc/path_to_inst
    

  5. Remove the leaf that does not contain the boot disk (pci@780 in this example) from the primary domain.


    primary# ldm remove-io pci@780 primary
    

  6. Add this split PCI configuration (split-cfg in this example) to the system controller.


    primary# ldm add-config split-cfg
    

    This configuration (split-cfg) is also set as the next configuration to be used after the reboot.



    Note - Currently, there is a limit of 8 configurations that can be saved on the SC, not including the factory-default configuration.



  7. Reboot the primary domain so that the change takes effect.


    primary# shutdown -i6 -g0 -y
    

  8. Add the leaf (pci@780 in this example) to the domain (ldg1 in this example) that needs direct access.


    primary# ldm add-io pci@780 ldg1
    Notice: the LDom Manager is running in configuration mode. Any
    configuration changes made will only take effect after the machine
    configuration is downloaded to the system controller and the
    host is reset.
    

    If you have an InfiniBand card, you might need to enable the bypass mode on the pci@780 bus. See Enabling the I/O MMU Bypass Mode on a PCI Bus for information on whether you need to enable the bypass mode.

  9. Reboot domain ldg1 so that the change takes effect.

    All domains must be inactive for this reboot. If you are configuring this domain for the first time, the domain will be inactive.


    ldg1# shutdown -i6 -g0 -y
    

  10. Confirm that the correct leaf is still assigned to the primary domain and the correct leaf is assigned to domain ldg1.


    primary# ldm list-bindings primary
    NAME          STATE   FLAGS  CONS   VCPU  MEMORY  UTIL  UPTIME
    primary       active  -n-cv  SP     4     4G      0.4%  18h 25m
    ...
    IO
        DEVICE           PSEUDONYM        OPTIONS
        pci@7c0          bus_b
    ...
    ----------------------------------------------------------------
    NAME          STATE   FLAGS  CONS   VCPU  MEMORY  UTIL  UPTIME
    ldg1          active  -n---  5000   4     2G      10%   35m
    ...
    IO
        DEVICE           PSEUDONYM        OPTIONS
        pci@780          bus_a
    ...
    

    This output confirms that the PCIe leaf bus_b and the devices below it are assigned to domain primary, and bus_a and its devices are assigned to ldg1.
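
The check in the last step can also be scripted. The following Python sketch reads text in the layout of the ldm list-bindings listing shown above and collects the IO rows for each domain. It is a minimal illustration under stated assumptions: the function name parse_io_bindings is hypothetical, the sample text is a condensed copy of the listing in this procedure, and the parser assumes that every IO row carries a pseudonym before any options.

```python
# Condensed copy of the listing shown in the last step of the procedure.
SAMPLE_OUTPUT = """\
NAME          STATE   FLAGS  CONS   VCPU  MEMORY  UTIL  UPTIME
primary       active  -n-cv  SP     4     4G      0.4%  18h 25m
...
IO
    DEVICE           PSEUDONYM        OPTIONS
    pci@7c0          bus_b
...
----------------------------------------------------------------
NAME          STATE   FLAGS  CONS   VCPU  MEMORY  UTIL  UPTIME
ldg1          active  -n---  5000   4     2G      10%   35m
...
IO
    DEVICE           PSEUDONYM        OPTIONS
    pci@780          bus_a
...
"""


def parse_io_bindings(text):
    """Map each domain name to a list of (device, pseudonym, options) rows."""
    bindings = {}
    domain = None
    expect_domain = False  # True on the line that follows a NAME header
    in_io = False          # True while inside an IO section
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            in_io = False
        elif fields[0] == "NAME":
            expect_domain, in_io = True, False
        elif expect_domain:
            domain, expect_domain = fields[0], False
            bindings.setdefault(domain, [])
        elif fields == ["IO"]:
            in_io = True
        elif in_io and fields[0] == "DEVICE":
            pass  # column header of the IO section
        elif in_io and domain is not None and fields[0].startswith("pci@"):
            pseudonym = fields[1] if len(fields) > 1 else ""
            bindings[domain].append((fields[0], pseudonym, fields[2:]))
        else:
            in_io = False  # any other line ends the IO section
    return bindings


if __name__ == "__main__":
    for name, rows in parse_io_bindings(SAMPLE_OUTPUT).items():
        print(name, [device for device, _, _ in rows])
```

Comparing the parsed result with the intended assignment (pci@7c0 in primary, pci@780 in ldg1 in this example) gives a quick confirmation that the split is in place.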


Enabling the I/O MMU Bypass Mode on a PCI Bus

If you have an InfiniBand Host Channel Adapter (HCA) card, you might need to turn on the I/O memory management unit (MMU) bypass mode. By default, Logical Domains software controls PCIe transactions so that a given I/O device or PCIe option can access only the physical memory assigned within the I/O domain. Any attempt to access memory of another guest domain is prevented by the I/O MMU. This provides a higher level of security between the I/O domain and all other domains. However, in the rare case where a PCIe or PCI-X option card does not load or operate with the I/O MMU bypass mode off, you can turn the bypass mode on. Be aware that with the bypass mode on, memory accesses from the I/O domain are no longer protected by hardware.

The bypass=on option turns on the I/O MMU bypass mode. This bypass mode should be enabled only if the respective I/O domain and I/O devices within that I/O domain are trusted by all guest domains. This example turns on the bypass mode.


primary# ldm add-io bypass=on pci@780 ldg1

After the bus is added, the ldm list-bindings output shows bypass=on under OPTIONS for that bus.
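
When the ldm add-io command is issued from a script, the trust requirement described above can be made explicit. The following Python sketch only builds the argument list, in the same order as the command shown in this section. The function add_io_command and its trusted_by_all_guests parameter are hypothetical names introduced for illustration; they are not part of the Logical Domains Manager.

```python
def add_io_command(bus, domain, bypass=False, trusted_by_all_guests=False):
    """Build the argument list for 'ldm add-io', optionally with bypass=on."""
    if bypass and not trusted_by_all_guests:
        # Mirrors the guidance in this section: enable bypass mode only when
        # the I/O domain and its devices are trusted by all guest domains.
        raise ValueError(
            "bypass=on requires an I/O domain trusted by all guest domains"
        )
    args = ["ldm", "add-io"]
    if bypass:
        args.append("bypass=on")
    return args + [bus, domain]


if __name__ == "__main__":
    # Prints the command line from the example in this section.
    print(" ".join(add_io_command("pci@780", "ldg1", bypass=True,
                                  trusted_by_all_guests=True)))
```

Returning a list rather than a single string lets the result be handed to a process-launching call without shell quoting concerns, and the explicit flag forces the caller to acknowledge the loss of hardware-enforced memory protection.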