CHAPTER 5
This chapter describes how to configure PCI Express buses across multiple logical domains and how to enable the I/O MMU bypass mode on a PCI bus.
The PCI Express (PCIe) bus on a Sun UltraSPARC T1-based server consists of two ports with various leaf devices attached to them. These are identified on a server with the names pci@780 (bus_a) and pci@7c0 (bus_b). In a multidomain environment, you can use the Logical Domains Manager to assign each leaf of the PCIe bus to a separate domain. This gives more than one domain direct access to physical devices, instead of access through I/O virtualization.
When the Logical Domains system is powered on, the control (primary) domain uses all the physical device resources, so the primary domain owns both the PCIe bus leaves.
The example shown here is for a Sun Fire T2000 server. This procedure can also be used on other Sun UltraSPARC T1-based servers, such as a Sun Fire T1000 server and a Netra T2000 server. The instructions for different servers might vary slightly from these, but the basic principles are the same: retain the leaf that has the boot disk in the primary domain, remove the other leaf from the primary domain, and assign that leaf to another domain.
Verify that the primary domain owns both leaves of the PCI Express bus.
primary# ldm list-bindings primary
...
IO
    DEVICE           PSEUDONYM        OPTIONS
    pci@780          bus_a
    pci@7c0          bus_b
...
Determine the device path of the boot disk, which needs to be retained.
primary# df /
/                  (/dev/dsk/c1t0d0s0 ): 1309384 blocks   457028 files
Determine the physical device to which the block device c1t0d0s0 is linked.
primary# ls -l /dev/dsk/c1t0d0s0
lrwxrwxrwx 1 root root 65 Feb 2 17:19 /dev/dsk/c1t0d0s0 -> ../../devices/pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@0,0:a
In this example, the physical device for the boot disk for domain primary is under the leaf pci@7c0, which corresponds to our earlier listing of bus_b. This means that we can assign bus_a (pci@780) of the PCIe bus to another domain.
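The leaf can also be read off the device path mechanically. The following is a minimal sketch, assuming a POSIX-style shell (for example ksh or bash) and the two-leaf layout described in this chapter. The function names leaf_of_path and other_leaf are hypothetical helpers, not part of the Logical Domains software.

```shell
# Hypothetical helper: print the top-level PCIe leaf (for example pci@7c0)
# from a /devices path or from a symlink target that contains "/devices/".
leaf_of_path() {
    path=$1
    # Drop everything up to and including "devices/", if present.
    case $path in
        */devices/*) path=${path#*devices/} ;;
    esac
    # Drop a leading slash, then keep only the first path component.
    path=${path#/}
    printf '%s\n' "${path%%/*}"
}

# Hypothetical helper: given the leaf that holds the boot disk, print the
# leaf that can be assigned to another domain. Assumes exactly the two
# leaves named in this chapter (pci@780 and pci@7c0).
other_leaf() {
    case $1 in
        pci@780) echo pci@7c0 ;;
        pci@7c0) echo pci@780 ;;
        *) echo "unknown leaf: $1" >&2; return 1 ;;
    esac
}
```

For the link target shown in the ls -l output, leaf_of_path prints pci@7c0, and other_leaf pci@7c0 prints pci@780, which matches the conclusion reached by hand.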
Check /etc/path_to_inst to find the physical path of the onboard network ports.
primary# grep e1000g /etc/path_to_inst
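To see at a glance which leaf each network port sits under, the matching lines can be reduced to an instance name and a leaf. This is a sketch only: it assumes the usual path_to_inst line layout of a quoted physical path, an instance number, and a quoted driver name, and the function name nic_leaves is hypothetical. On Solaris, nawk might be needed in place of awk.

```shell
# Hypothetical helper: read path_to_inst lines on standard input and print
# "<driver><instance> <leaf>" for each line that belongs to the given driver
# (e1000g by default).
# Assumed line layout: "<physical path>" <instance> "<driver>"
nic_leaves() {
    awk -F'"' -v drv="${1:-e1000g}" '
    $4 == drv {
        # The path starts with "/", so parts[1] is empty and parts[2] is the leaf.
        split($2, parts, "/")
        inst = $3
        gsub(/[ \t]/, "", inst)   # strip the blanks around the instance number
        print drv inst, parts[2]
    }'
}
```

A possible invocation is nic_leaves < /etc/path_to_inst. Any port listed under the leaf that you plan to remove will no longer be available to the primary domain after the split.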
Remove the leaf that does not contain the boot disk (pci@780 in this example) from the primary domain.
primary# ldm remove-io pci@780 primary
Add this split PCI configuration (split-cfg in this example) to the system controller.
primary# ldm add-config split-cfg
This configuration (split-cfg) is also set as the next configuration to be used after the reboot.
Note - Currently, there is a limit of 8 configurations that can be saved on the SC, not including the factory-default configuration.
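Before saving a new configuration, you can check how close the SC is to this limit. The sketch below counts configuration names supplied on standard input. It assumes one configuration per line with the name as the first word, which is an assumption about the listing format rather than something shown in this chapter. The function names are hypothetical.

```shell
# Limit stated in the note above (factory-default is not counted).
MAX_SAVED_CONFIGS=8

# Hypothetical helper: count saved configurations listed on standard input,
# one per line, ignoring blank lines and the factory-default entry.
saved_config_count() {
    awk 'NF > 0 && $1 != "factory-default" { n++ } END { print n + 0 }'
}

# Hypothetical helper: succeed only if one more configuration can be saved.
can_add_config() {
    count=$(saved_config_count)
    [ "$count" -lt "$MAX_SAVED_CONFIGS" ]
}
```

If the listing command on your system prints the names in this form, its output can be piped into can_add_config before you run ldm add-config.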
Reboot the primary domain so that the change takes effect.
primary# shutdown -i6 -g0 -y
Add the leaf (pci@780 in this example) to the domain (ldg1 in this example) that needs direct access.
primary# ldm add-io pci@780 ldg1
Notice: the LDom Manager is running in configuration mode. Any configuration
changes made will only take effect after the machine configuration is
downloaded to the system controller and the host is reset.
If you have an InfiniBand card, you might need to enable the bypass mode on the pci@780 bus. See Enabling the I/O MMU Bypass Mode on a PCI Bus to determine whether you need to enable the bypass mode.
Reboot domain ldg1 so that the change takes effect.
All domains must be inactive for this reboot. If you are configuring this domain for the first time, the domain will be inactive.
ldg1# shutdown -i6 -g0 -y
Confirm that the correct leaf is still assigned to the primary domain and the correct leaf is assigned to domain ldg1.
primary# ldm list-bindings primary
NAME     STATE   FLAGS   CONS   VCPU   MEMORY   UTIL   UPTIME
primary  active  -n-cv   SP     4      4G       0.4%   18h 25m
...
IO
    DEVICE           PSEUDONYM        OPTIONS
    pci@7c0          bus_b
...
----------------------------------------------------------------
NAME     STATE   FLAGS   CONS   VCPU   MEMORY   UTIL   UPTIME
ldg1     active  -n---   5000   4      2G       10%    35m
...
IO
    DEVICE           PSEUDONYM        OPTIONS
    pci@780          bus_a
...
This output confirms that the PCIe leaf bus_b and the devices below it are assigned to domain primary, and bus_a and its devices are assigned to ldg1.
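The same check can be scripted by reducing the bindings listing to one line per leaf. This is a minimal sketch that assumes the layout shown above: a NAME header followed by the domain row, and an IO section whose device rows start with pci@. The function name io_leaves is hypothetical.

```shell
# Hypothetical helper: read bindings output on standard input and print
# "<domain> <device> <pseudonym>" for every PCIe leaf in an IO section.
io_leaves() {
    awk '
    $1 == "NAME"               { want_name = 1; in_io = 0; next }  # header row
    want_name                  { domain = $1; want_name = 0; next } # domain row
    $1 == "IO" && NF == 1      { in_io = 1; next }                  # IO section starts
    in_io && $1 == "DEVICE"    { next }                             # column headings
    in_io && $1 ~ /^pci@/      { print domain, $1, $2; next }       # one leaf
    in_io                      { in_io = 0 }                        # section ended
    '
}
```

For the listing in this step, the function prints primary pci@7c0 bus_b and ldg1 pci@780 bus_a, one per line.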
Enabling the I/O MMU Bypass Mode on a PCI Bus
If you have an InfiniBand Host Channel Adapter (HCA) card, you might need to turn on the I/O memory management unit (MMU) bypass mode. By default, Logical Domains software controls PCIe transactions so that a given I/O device or PCIe option can access only the physical memory assigned within the I/O domain. The I/O MMU prevents any attempt to access the memory of another guest domain, which provides a higher level of security between the I/O domain and all other domains. In the rare case where a PCIe or PCI-X option card does not load or operate with the I/O MMU bypass mode off, you can turn the bypass mode on. If you do, memory accesses from the I/O domain are no longer protected by hardware.
The bypass=on option turns on the I/O MMU bypass mode. This bypass mode should be enabled only if the respective I/O domain and I/O devices within that I/O domain are trusted by all guest domains. This example turns on the bypass mode.
primary# ldm add-io bypass=on pci@780 ldg1
Copyright © 2008, Sun Microsystems, Inc. All rights reserved.