This section contains general issues and specific bugs concerning the Oracle VM Server for SPARC 3.6 software.
This section summarizes the bugs that you might encounter when using this version of the software. The most recent bugs are described first. Workarounds and recovery procedures are specified, if available.
Bug ID 28045753:
When a virtual switch has been created without an MTU value on an Oracle VM Server for SPARC 3.4 or 3.5 system, you might see the following error after you upgrade to Version 3.5 or 3.6 and attempt to modify the backend device (net-dev) of the virtual switch.
primary# ldm set-vswitch net-dev=net0 vsw_1
Domain "primary" network device "vsw_1" MTU (0) must be within the backing device's MTU range 1500-15500
Workaround: Ensure that you also specify an MTU value when you modify the virtual switch's backend device:
primary# ldm set-vswitch mtu=1500 net-dev=net0 vsw_1
Bug ID 27974950: You cannot migrate a guest domain that has a vhba instance associated with a virtual SAN with mask=on.
Use the ldm list -o hba command to determine whether your guest domain is affected by this issue. The following example lists the vhba instances in the ldgb guest domain that you want to migrate:
primary# ldm list -o hba ldgb
NAME
ldgb

VHBA
    NAME     VSAN     DEVICE   TOUT   SERVER
    vhba0    vsan0    vhba@0   0      ldga
The vhba instance, vhba0, is associated with vsan0, which executes in the ldga domain. The following command lists the vsan resources in the ldga domain:
primary# ldm list -o san ldga
NAME
ldga

VSAN
    NAME     MASK   DEVICE   IPORT
    vsan0    on     vsan@0   [/pci@300/pci@1/pci@0/pci@4/SUNW,emlxs@0,1/fp@0,0]
    vsan1    off    vsan@1   [/pci@300/pci@1/pci@0/pci@4/SUNW,emlxs@0,1/fp@0,0]
The previous output shows that vsan0 has its mask property set to on, which means that you cannot migrate the ldgb guest domain.
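The check described above can be scripted. The following is a minimal sketch, not an Oracle-provided tool: it assumes that the ldm list -o san output keeps the column order shown in the example (NAME, MASK, DEVICE, IPORT), and the function name masked_vsans is hypothetical.

```python
from typing import List

# Sample text copied from the example output above. In practice you would
# capture the output of "ldm list -o san <domain>" yourself.
SAMPLE = """\
NAME
ldga

VSAN
    NAME     MASK   DEVICE   IPORT
    vsan0    on     vsan@0   [/pci@300/pci@1/pci@0/pci@4/SUNW,emlxs@0,1/fp@0,0]
    vsan1    off    vsan@1   [/pci@300/pci@1/pci@0/pci@4/SUNW,emlxs@0,1/fp@0,0]
"""


def masked_vsans(ldm_san_output: str) -> List[str]:
    """Return the names of vsan rows whose MASK column is 'on'.

    Assumes the column layout shown in the example: NAME MASK DEVICE IPORT.
    """
    masked = []
    in_table = False
    for line in ldm_san_output.splitlines():
        fields = line.split()
        # The table starts after the "NAME MASK ..." header row.
        if fields[:2] == ["NAME", "MASK"]:
            in_table = True
            continue
        if in_table and len(fields) >= 2 and fields[1] == "on":
            masked.append(fields[0])
    return masked


if __name__ == "__main__":
    print(masked_vsans(SAMPLE))
```

If any name in the returned list matches the VSAN column of a vhba instance in the guest domain, that guest domain is affected by this issue.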
Bug ID 27925093: LLDP advertises information throughout a LAN for purposes of topology discovery. Because of bug 28650967 (LLDP lock is preventing offline of pf's, breaking LDoms create-vf), an attempt to create or destroy virtual functions fails in a root domain where the LLDP service is enabled and that also owns the physical function targeted by the create-vf or destroy-vf command. The failure occurs because the service keeps all the physical functions in that domain busy, which prevents the physical function from being taken offline and brought back online as the create or destroy operation requires.
Workaround: Disable the service, create or destroy virtual functions as needed, then reenable the service.
Run the following commands as superuser on the root domain which owns the physical function involved in the create or destroy operation. For example:
# svcadm disable lldp
# ldm create-vf <pf_name>
# svcadm enable lldp
Bug ID 24393532: The fix for bug ID 23591953 disabled both Oracle VM Server for SPARC MIB monitoring, such as listing the Oracle VM Server for SPARC MIB objects by using the snmpwalk command, and trap generation for the ldomHbaTable table. As a result, the Oracle VM Server for SPARC MIB ldomHbaTable table does not show contents.
primary# snmpwalk -v1 -c public localhost SUN-LDOM-MIB::ldomHbaTable
primary#
Workaround: Use the ldm list-hba command to view the HBA information.
Bug ID 23031413: When the target machine's control domain runs out of LDCs during a domain migration, the migration fails with no explanation and the following message is written to the SMF log:
warning: Failed to read feasibility response type (5) from target LDoms Manager
This error is issued when the domain being migrated fails to bind on the target machine. Note that the bind operation might fail for other reasons on the target machine, as well.
Workaround: For the migration to succeed, the number of LDCs must be reduced either in the domain being migrated or in the control domain of the target machine. You can reduce the number of LDCs by reducing the number of virtual devices being used by or being serviced by a domain. For more information about managing LDCs, see Using Logical Domain Channels in Oracle VM Server for SPARC 3.6 Administration Guide.
Bug ID 22828100: In Oracle Solaris 11, the virtual switch is not an actual network device. As such, the value of its linkprop property has no operational impact. However, this property can cause a spurious error message if set to phys-state when you attempt to remove the net-dev backing device by running the ldm set-vsw command.
primary# ldm set-vsw net-dev= vsw0
Failed to modify virtual switch because the linkprop of the virtual switch requires that it has a physical network device assigned
You can avoid this error message by specifying the linkprop= option on the command line:
primary# ldm set-vsw net-dev= linkprop= vsw0
Alternatively, you can ignore this error message. As long as no virtual network devices have the linkprop property set to phys-state, the ldm set-vsw command succeeds.
However, if an attached virtual network device has its linkprop property set to phys-state, the ldm set-vsw command fails with the following error message:
Failed to modify virtual switch because the linkprop of at least one virtual network device requires that the virtual switch has a physical network device assigned
Bug ID 22108218: Oracle VM Server for SPARC migration does not detect a jumbo frame MTU size mismatch before it attempts to migrate a logical domain. Such a migration fails silently, so physical and virtual NICs that use jumbo frames must be set up carefully so that all participating NICs and VNICs can communicate.
To set up jumbo frames for a virtual switch and multiple virtual networks, you must set up the virtual switch first.
Setting Up the Virtual Switch for Jumbo Frames
From an account with root privileges, run dladm to find the max possible MTU size supported by the physical backing NIC device.
# dladm show-linkprop -p mtu net0
LINK   PROPERTY   PERM   VALUE   EFFECTIVE   DEFAULT   POSSIBLE
net0   mtu        rw     1500    1500        1500      46-9194
In this example, the value of the MTU should never exceed 9194 for the backing device (net-dev=net0) on the virtual switch.
Then add the virtual switch with the selected net-dev backing device and an MTU size that does not exceed the max POSSIBLE MTU size displayed in dladm. The best practice for Oracle VM Server for SPARC is to use an MTU size of 9000 or less for jumbo frames.
Setting up Jumbo Frames between a Virtual Switch and other Physical NICs
Determine the max POSSIBLE MTU size for each physical NIC that you want to connect to by using jumbo frames. Then determine MIN(vsw0_nic, vsw1_nic, ..., vswN_nic, nic1, nic2, ..., nicN) across all the virtual switch backing NICs and the NICs that you are connecting. The MIN is the smallest max possible MTU size of all the NICs in the jumbo frame network, and it is the largest MTU size that you can use. Again, the best practice is to use an MTU size of 9000 or less.
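The MIN calculation can be expressed as a short script. This is a minimal sketch under stated assumptions: the POSSIBLE values are copied by hand from dladm show-linkprop -p mtu output in the "low-high" form shown earlier, the function names are hypothetical, and every link value other than the 46-9194 range for net0 is invented for illustration.

```python
from typing import Dict

# Best-practice ceiling for jumbo frames, as stated in the text.
BEST_PRACTICE_MTU = 9000


def parse_possible_max(possible: str) -> int:
    """Return the upper bound of a dladm POSSIBLE range such as '46-9194'."""
    return int(possible.split("-")[-1])


def usable_jumbo_mtu(possible_ranges: Dict[str, str]) -> int:
    """Return the largest MTU that every listed NIC can carry.

    Takes the smallest of the per-NIC maximums, then caps the result at
    the 9000-byte best-practice value.
    """
    if not possible_ranges:
        raise ValueError("at least one NIC is required")
    smallest_max = min(parse_possible_max(r) for r in possible_ranges.values())
    return min(smallest_max, BEST_PRACTICE_MTU)


if __name__ == "__main__":
    # net0 comes from the example above; net1 and net2 are hypothetical.
    nics = {"net0": "46-9194", "net1": "46-9706", "net2": "46-8192"}
    print(usable_jumbo_mtu(nics))
```

The result is a candidate value for the mtu property of the virtual switch and for the MTU of every other NIC in the jumbo frame network.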
Bug ID 21616429: The Oracle VM Server for SPARC 3.3 software introduced socket support for Fujitsu SPARC M12 servers and Fujitsu M10 servers only.
Software running on Oracle SPARC servers and Oracle VM Server for SPARC versions older than 3.3 cannot re-create a domain with socket constraints from an XML file.
Attempting to re-create a domain with socket constraints from an XML file with an older version of the Oracle VM Server for SPARC software or on an Oracle SPARC server fails with the following message:
primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
socket not a known resource
If Oracle VM Server for SPARC 3.2 is running on a Fujitsu SPARC M12 server or Fujitsu M10 server and you attempt to re-create a domain with socket constraints from an XML file, the command fails with various error messages, such as the following:
primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
Unknown property: vcpus

primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
perf-counters property not supported, platform does not have performance register access capability, ignoring constraint setting.
Workaround: Edit the XML file to remove any sections that reference the socket resource type.
Bug ID 17036795: The Oracle Solaris 11.3 SRU 12 OS has merged the ssd and sd driver functionality for Fibre Channel devices on SPARC platforms.
This change affects device node names on the physical device path. The device node names change from ssd@ to disk@. This change also affects device driver bindings from ssd to sd.
This change is not enabled by default for Oracle Solaris 11.3 systems.
You must enable this change to perform live migrations of domains that use virtual HBA and Fibre Channel devices.
Before you enable this change, ensure that MPxIO is already enabled by running the stmsboot -D fp -e command.
Run the format command to determine whether MPxIO is enabled. When MPxIO is enabled, vhci appears in the device names. Alternatively, run the mpathadm list lu command. If its output is empty, no MPxIO devices are enumerated.
Use the beadm command to create a new boot environment (BE). By using BEs, you can roll back easily to a previous boot environment if you experience unexpected problems.
Mount the BE and replace the /etc/devices/inception_points file with the /etc/devices/inception_points.vhba file. The .vhba file includes some feature flags to enable this change.
Finally, reboot after you activate the new BE.
# beadm create BE-name
# beadm mount BE-name /mnt
# cp /mnt/etc/devices/inception_points.vhba /mnt/etc/devices/inception_points
# beadm umount BE-name
# beadm activate BE-name
# reboot
After rebooting, use the prtconf -D | grep driver | grep sd command to verify the change.
If any disks use the ssd driver, there is a problem with the configuration.
You can also use the mpathadm list lu command to show multiple paths to the same disks if virtual HBA and the Fibre Channel virtual function are both configured to see the same LUNs.
Bug ID 16691046: If virtual functions are assigned from the root domain, an I/O domain might fail to provide resiliency in the following hotplug situations:
You add a root complex (PCIe bus) dynamically to the root domain, and then you create the virtual functions and assign them to the I/O domain.
You hot-add an SR-IOV card to the root domain that owns the root complex, and then you create the virtual functions and assign them to the I/O domain.
You replace or add any PCIe card to an empty slot (either through hotplug or when the root domain is down) on the root complex that is owned by the root domain. This root domain provides virtual functions from the root complex to the I/O domain.
Workaround: Perform one of the following steps:
If the root complex already provides virtual functions to the I/O domain and you add, remove, or replace any PCIe card on that root complex (through hotplug or when the root domain is down), you must reboot both the root domain and the I/O domain.
If the root complex does not have virtual functions currently assigned to the I/O domain and you add an SR-IOV card or any other PCIe card to the root complex, you must stop the root domain to add the PCIe card. After the root domain reboots, you can assign virtual functions from that root complex to the I/O domain.
If you want to add a new PCIe bus to the root domain and then create and assign virtual functions from that bus to the I/O domain, perform one of the following steps and then reboot the root domain:
Add the bus during a delayed reconfiguration
Add the bus dynamically
Bug ID 15783031: You might experience problems when you use the ldm init-system command to restore a domain configuration that has used direct I/O or SR-IOV operations.
A problem arises if one or more of the following operations have been performed on the configuration to be restored:
A slot has been removed from a bus that is still owned by the primary domain.
A virtual function has been created from a physical function that is owned by the primary domain.
A virtual function has been assigned to the primary domain, to other guest domains, or to both.
A root complex has been removed from the primary domain and assigned to a guest domain, and that root complex is used as the basis for further I/O virtualization operations.
In other words, you created a non-primary root domain and performed any of the previous operations.
If you have performed any of the previous actions, perform the workaround shown in Oracle VM Server for SPARC PCIe Direct I/O and SR-IOV Features (Doc ID 1325454.1) (https://support.oracle.com/epmos/faces/SearchDocDisplay?amp;_adf.ctrl-state=10c69raljg_77&_afrLoop=506200315473090).
Bug ID 15775637: An I/O domain has a limit on the number of interrupt resources that are available per root complex.
On SPARC T4 servers, the limit is approximately 63 MSI/X vectors. Each igb virtual function uses three interrupts. Each ixgbe virtual function uses two interrupts.
If you assign a large number of virtual functions to a domain, the domain runs out of system resources to support these devices. You might see messages similar to the following:
WARNING: ixgbevf32: interrupt pool too full.
WARNING: ddi_intr_alloc: cannot fit into interrupt pool
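The interrupt arithmetic can be checked before you assign virtual functions. The following is a rough sketch that uses only the figures given above (approximately 63 MSI/X vectors per root complex on SPARC T4 servers, three interrupts per igb virtual function, two per ixgbe virtual function). The function names are hypothetical, and the sketch ignores any interrupts consumed by other devices on the same root complex, so treat the result as an upper bound.

```python
from typing import Dict

# Approximate per-root-complex limit on SPARC T4 servers, per the text.
VECTOR_LIMIT = 63

# Interrupts used by each virtual function, per the text.
INTERRUPTS_PER_VF = {"igb": 3, "ixgbe": 2}


def vectors_needed(vf_counts: Dict[str, int]) -> int:
    """Return the total interrupts used by the given virtual functions."""
    return sum(INTERRUPTS_PER_VF[driver] * count
               for driver, count in vf_counts.items())


def fits_interrupt_budget(vf_counts: Dict[str, int],
                          limit: int = VECTOR_LIMIT) -> bool:
    """Return True if the virtual functions stay within the vector limit."""
    return vectors_needed(vf_counts) <= limit


if __name__ == "__main__":
    plan = {"igb": 10, "ixgbe": 16}   # hypothetical assignment
    print(vectors_needed(plan), fits_interrupt_budget(plan))
```

By this estimate, a single root complex supports at most 21 igb virtual functions or 31 ixgbe virtual functions in one I/O domain before the interrupt pool is exhausted.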
Bug ID 15761509: Use only the PCIe cards that support the Direct I/O (DIO) feature, which are listed in this support document (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=REFERENCE&id=1325454.1).
Workaround: Use the ldm add-io command to add the card to the primary domain again.
Bug ID 15754356: In the root domain, the Oracle Solaris device path for a Fibre Channel virtual function is incorrect.
For example, the incorrect path name is pci@380/pci@1/pci@0/pci@6/fibre-channel@0,2 while it should be pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0,2.
The ldm list-io -l output shows the correct device path for the Fibre Channel virtual functions.
Workaround: None.
Bug ID 15701865: If you attempt a live migration of a domain that depends on an inactive domain on the target machine, the ldmd daemon crashes with a segmentation fault. The ldmd daemon is restarted automatically, but the migration is aborted.
Workaround: Perform one of the following actions before you attempt the live migration:
Remove the guest dependency from the domain to be migrated.
Start the master domain on the target machine.
Bug ID 15696986: If two ldm migrate commands are issued between the same two systems simultaneously in the “opposite direction,” the two commands might hang and never complete. An opposite direction situation occurs when you simultaneously start a migration on machine A to machine B and a migration on machine B to machine A.
The hang occurs even if the migration processes are initiated as dry runs by using the -n option. When this problem occurs, all other ldm commands might hang.
Recovery: Restart the Logical Domains Manager on both the source machine and the target machine:
primary# svcadm restart ldmd
Workaround: None.
Bug ID 15664666: When a reset dependency is created, an ldm stop -a command might result in a domain with a reset dependency being restarted instead of only stopped.
Workaround: First, issue the ldm stop command to the master domain. Then, issue the ldm stop command to the slave domain. If the initial stop of the slave domain results in a failure, issue the ldm stop -f command to the slave domain.
Bug ID 15513998: Occasionally, after a domain has been migrated, it is not possible to connect to the console for that domain.
Note that this problem occurs when the migrated domain is running an OS version older than Oracle Solaris 11.3.
Workaround: Restart the vntsd SMF service to enable connections to the console:
# svcadm restart vntsd
Bug ID 15453968: Simultaneous net installation of multiple guest domains fails on systems that have a common console group.
Workaround: Perform net installations only on guest domains that each have their own console group. This failure is seen only on domains with a common console group shared among multiple net-installing domains.