Oracle® VM Server for SPARC 3.6 Release Notes

Updated: February 2021

Known Issues

This section contains general issues and specific bugs concerning the Oracle VM Server for SPARC 3.6 software.

Bugs Affecting the Oracle VM Server for SPARC Software

This section summarizes the bugs that you might encounter when using this version of the software. The most recent bugs are described first. Workarounds and recovery procedures are specified, if available.

Bugs Affecting the Oracle VM Server for SPARC 3.6 Software

Virtual Switch MTU Value Erroneously Set to Zero After Upgrading to Oracle VM Server for SPARC 3.5 and Oracle VM Server for SPARC 3.6

Bug ID 28045753:

When a virtual switch has been created without an MTU value on an Oracle VM Server for SPARC 3.4 or 3.5 system, you might see the following error after you upgrade to Version 3.5 or 3.6 and attempt to modify the backend device (net-dev) of the virtual switch.

primary# ldm set-vswitch net-dev=net0 vsw_1
Domain "primary" network device "vsw_1" MTU (0) must be within 
the backing device's MTU range 1500-15500

Workaround: Ensure that you also specify an MTU value when you modify the virtual switch's backend device:

primary# ldm set-vswitch mtu=1500 net-dev=net0 vsw_1

ldm add-vsan-dev Does Not Support Domain Migration

Bug ID 27974950: You cannot migrate a guest domain that has a vhba instance associated with a virtual SAN with mask=on.

Use the ldm list -o hba command to determine whether your guest domain is affected by this issue. The following example lists the vhba instances in the ldgb guest domain that you want to migrate:

primary# ldm list -o hba ldgb
NAME
ldgb

VHBA
    NAME         VSAN                   DEVICE  TOUT SERVER
    vhba0        vsan0                  vhba@0  0    ldga

The vhba instance, vhba0, is associated with vsan0, which executes in the ldga domain. The following command lists the vsan resources in the ldga domain:

primary# ldm list -o san ldga
NAME
ldga

VSAN
    NAME         MASK   DEVICE  IPORT
    vsan0        on     vsan@0  
[/pci@300/pci@1/pci@0/pci@4/SUNW,emlxs@0,1/fp@0,0]
    vsan1        off    vsan@1  
[/pci@300/pci@1/pci@0/pci@4/SUNW,emlxs@0,1/fp@0,0]

The previous output shows that vsan0 has its mask property set to on, which means that you cannot migrate the ldgb guest domain.

LLDP SMF Service Can Prevent VFs From Being Created or Destroyed

 

Bug ID 27925093: LLDP advertises information throughout a LAN for purposes of topology discovery. Due to the following issue:

28650967 - LLDP lock is preventing offline of pf's, breaking LDoms create-vf

an attempt to create or destroy virtual functions fails in a root domain on which this service is enabled and which also owns the physical function that is the target of the create-vf or destroy-vf command. The failure occurs because the service keeps all the physical functions in that domain busy, which prevents the offline and online of the physical function that the create or destroy operation requires.

Workaround: Disable the service, create or destroy virtual functions as needed, then reenable the service.

Run the following commands as superuser on the root domain that owns the physical function involved in the create or destroy operation. For example:

# svcadm disable lldp
# ldm create-vf <pf_name>
# svcadm enable lldp

ldomHbaTable Is Empty

 

Bug ID 24393532: The fix for bug ID 23591953 disabled both Oracle VM Server for SPARC MIB monitoring, such as listing the Oracle VM Server for SPARC MIB objects by using the snmpwalk command, and trap generation for the ldomHbaTable table. As a result, the Oracle VM Server for SPARC MIB ldomHbaTable table shows no contents.

primary# snmpwalk -v1 -c public localhost SUN-LDOM-MIB::ldomHbaTable
primary#

Workaround: Use the ldm list-hba command to view the HBA information.
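
For example, assuming a guest domain named ldg1 (a hypothetical name), the following command shows that domain's HBA information:

primary# ldm list-hba ldg1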

Migration Fails When the Target Machine Has Insufficient Free LDCs

 

Bug ID 23031413: When the target machine's control domain runs out of LDCs during a domain migration, the migration fails with no explanation and the following message is written to the SMF log:

warning: Failed to read feasibility response type (5) from target LDoms Manager

This error is issued when the domain being migrated fails to bind on the target machine. Note that the bind operation might fail for other reasons on the target machine, as well.

Workaround: For the migration to succeed, the number of LDCs must be reduced either in the domain being migrated or in the control domain of the target machine. You can reduce the number of LDCs by reducing the number of virtual devices being used by or being serviced by a domain. For more information about managing LDCs, see Using Logical Domain Channels in Oracle VM Server for SPARC 3.6 Administration Guide.
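
The following is a minimal, hedged sketch of freeing LDCs by removing unused virtual devices, assuming a guest domain named ldg1 that has an unneeded virtual network device vnet2 and an unneeded virtual disk vdisk2 (all hypothetical names):

primary# ldm remove-vnet vnet2 ldg1
primary# ldm remove-vdisk vdisk2 ldg1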

ldm set-vsw net-dev= Successfully Removes the Virtual Switch's Backing Device With an Error Message When Virtual Switch's linkprop Is Set With phys-state

Bug ID 22828100: In Oracle Solaris 11, the virtual switch is not an actual network device. As such, the value of its linkprop property has no operational impact. However, this property can cause a spurious error message if set to phys-state when you attempt to remove the net-dev backing device by running the ldm set-vsw command.

primary# ldm set-vsw net-dev= vsw0
Failed to modify virtual switch because the linkprop of the virtual
switch requires that it has a physical network device assigned

You can avoid this error message by specifying the linkprop= option on the command line:

primary# ldm set-vsw net-dev= linkprop= vsw0

Alternatively, you can ignore this error message. As long as no virtual network devices have the linkprop property set to phys-state, the ldm set-vsw command succeeds.

However, if an attached virtual network device has its linkprop property set to phys-state, the ldm set-vsw command issues the following error message and fails:

Failed to modify virtual switch because the linkprop of at least one
virtual network device requires that the virtual switch has a physical
network device assigned

Setting Up Jumbo Frames for Oracle VM Server for SPARC Virtual Network

Bug ID 22108218: Oracle VM Server for SPARC migration does not detect a jumbo frame MTU size mismatch before it attempts to migrate a logical domain. Such a migration fails silently, so physical and virtual NICs that use jumbo frames must be set up carefully so that all of the participating NICs and VNICs can communicate.

To set up jumbo frames for a virtual switch and multiple virtual networks, set up the virtual switch first.

  • Setting Up the Virtual Switch for Jumbo Frames

    From an account with root privileges, run the dladm command to find the maximum possible MTU size that is supported by the physical backing NIC device.

    # dladm show-linkprop -p mtu net0
    LINK PROPERTY PERM VALUE EFFECTIVE DEFAULT POSSIBLE
    net0 mtu      rw   1500  1500      1500    46-9194
    

    In this example, the value of the MTU should never exceed 9194 for the backing device (net-dev=net0) on the virtual switch.

    Then add the virtual switch with the selected net-dev backing device and an MTU size that does not exceed the maximum POSSIBLE MTU size that dladm displays. The best practice for Oracle VM Server for SPARC is to use an MTU size of 9000 or less for jumbo frames (see the sketch after this list).

  • Setting Up Jumbo Frames Between a Virtual Switch and Other Physical NICs

    Determine the maximum POSSIBLE MTU size for each physical NIC that you want to connect by using jumbo frames, and then determine MIN(vsw0_nic, vsw1_nic, ..., vswN_nic, nic1, nic2, ..., nicN) over all of the virtual switch backing NICs and the NICs that you are connecting. This MIN value is the smallest maximum possible MTU size of all of the NICs in the jumbo frame network, and it is the largest MTU size that you can use. Again, the best practice is to use an MTU size of 9000 or less.
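
The following is a minimal sketch of the virtual switch setup that is described in the first item, assuming a backing device named net0, a virtual switch named primary-vsw0 in the primary domain, a guest domain named ldg1, and an MTU size of 9000 (all hypothetical names and values):

primary# ldm add-vsw net-dev=net0 mtu=9000 primary-vsw0 primary
primary# ldm add-vnet mtu=9000 vnet1 primary-vsw0 ldg1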

A Domain That Has Socket Constraints Cannot Be Re-Created From an XML File

 

Bug ID 21616429: The Oracle VM Server for SPARC 3.3 software introduced socket support for Fujitsu SPARC M12 servers and Fujitsu M10 servers only.

Software running on Oracle SPARC servers and Oracle VM Server for SPARC versions older than 3.3 cannot re-create a domain with socket constraints from an XML file.

Attempting to re-create a domain with socket constraints from an XML file with an older version of the Oracle VM Server for SPARC software or on an Oracle SPARC server fails with the following message:

primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
socket not a known resource

If Oracle VM Server for SPARC 3.2 is running on a Fujitsu SPARC M12 server or Fujitsu M10 server and you attempt to re-create a domain with socket constraints from an XML file, the command fails with various error messages, such as the following:

primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
Unknown property: vcpus

primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
perf-counters property not supported, platform does not have
performance register access capability, ignoring constraint setting.

Workaround: Edit the XML file to remove any sections that reference the socket resource type.

Oracle Solaris 11.3 SRU 12: ssd and sd Driver Functionality Is Merged for Fibre Channel Devices on SPARC Platforms

 

Bug ID 17036795: The Oracle Solaris 11.3 SRU 12 OS has merged the ssd and sd driver functionality for Fibre Channel devices on SPARC platforms.

This change affects device node names on the physical device path. The device node names change from ssd@ to disk@. This change also affects device driver bindings from ssd to sd.


Note -  Ensure that any application or client in the Oracle Solaris OS that depends on these device node names or device driver bindings is adjusted.

This change is not enabled by default for Oracle Solaris 11.3 systems.

You must enable this change to perform live migrations of domains that use virtual HBA and Fibre Channel devices.

Before you enable this change, ensure that MPxIO is already enabled by running the stmsboot -D fp -e command.

Run the format command to determine whether MPxIO is enabled. When MPxIO is enabled, you should see vhci in device names. Alternatively, if the mpathadm list lu output is empty, no MPxIO devices are enumerated.
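
For example, the following hedged sketch collects the commands that are described in this section; run them as superuser, and expect the output to vary by configuration:

# stmsboot -D fp -e     # enable MPxIO for Fibre Channel (fp) devices
# format                # MPxIO devices show vhci in their device names
# mpathadm list lu      # empty output means that no MPxIO devices are enumerated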

Use the beadm command to create a new boot environment (BE). By using BEs, you can roll back easily to a previous boot environment if you experience unexpected problems.

Mount the BE and replace the /etc/devices/inception_points file with the /etc/devices/inception_points.vhba file. The .vhba file includes some feature flags to enable this change.

Finally, reboot after you activate the new BE.

# beadm create BE-name
# beadm mount BE-name /mnt
# cp /mnt/etc/devices/inception_points.vhba /mnt/etc/devices/inception_points
# beadm umount BE-name
# beadm activate BE-name
# reboot

After rebooting, use the prtconf -D | grep driver | grep sd command to verify the change.

If any disks use the ssd driver, there is a problem with the configuration.

You can also use the mpathadm list lu command to show multiple paths to the same disks if virtual HBA and the Fibre Channel virtual function are both configured to see the same LUNs.

Resilient I/O Domain Should Support PCI Device Configuration Changes After the Root Domain Is Rebooted

 

Bug ID 16691046: If virtual functions are assigned from the root domain, an I/O domain might fail to provide resiliency in the following hotplug situations:

  • You add a root complex (PCIe bus) dynamically to the root domain, and then you create the virtual functions and assign them to the I/O domain.

  • You hot-add an SR-IOV card to the root domain that owns the root complex, and then you create the virtual functions and assign them to the I/O domain.

  • You replace or add any PCIe card to an empty slot (either through hotplug or when the root domain is down) on the root complex that is owned by the root domain. This root domain provides virtual functions from the root complex to the I/O domain.

Workaround: Perform one of the following steps:

  • If the root complex already provides virtual functions to the I/O domain and you add, remove, or replace any PCIe card on that root complex (through hotplug or when the root domain is down), you must reboot both the root domain and the I/O domain.

  • If the root complex does not have virtual functions currently assigned to the I/O domain and you add an SR-IOV card or any other PCIe card to the root complex, you must stop the root domain to add the PCIe card. After the root domain reboots, you can assign virtual functions from that root complex to the I/O domain.

  • If you want to add a new PCIe bus to the root domain and then create and assign virtual functions from that bus to the I/O domain, perform one of the following steps and then reboot the root domain (see the sketch after this list):

    • Add the bus during a delayed reconfiguration

    • Add the bus dynamically
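
The following is a hedged sketch of the dynamic approach, assuming a PCIe bus named pci_2, a non-primary root domain named rootdom1, an I/O domain named iodom1, and illustrative physical function and virtual function names (all hypothetical):

primary# ldm add-io pci_2 rootdom1                        # add the bus dynamically
primary# ldm stop-domain rootdom1                         # restart the root domain so that
primary# ldm start-domain rootdom1                        #   it configures the new bus
primary# ldm create-vf /SYS/MB/PCIE2/IOVNET.PF0           # create a virtual function
primary# ldm add-io /SYS/MB/PCIE2/IOVNET.PF0.VF0 iodom1   # assign it to the I/O domain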

ldm init-system Command Might Not Correctly Restore a Domain Configuration on Which Physical I/O Changes Have Been Made

 

Bug ID 15783031: You might experience problems when you use the ldm init-system command to restore a domain configuration that has used direct I/O or SR-IOV operations.

A problem arises if one or more of the following operations have been performed on the configuration to be restored:

  • A slot has been removed from a bus that is still owned by the primary domain.

  • A virtual function has been created from a physical function that is owned by the primary domain.

  • A virtual function has been assigned to the primary domain, to other guest domains, or to both.

  • A root complex has been removed from the primary domain and assigned to a guest domain, and that root complex is used as the basis for further I/O virtualization operations.

In other words, the problem arises if you created a non-primary root domain and performed any of the previous operations.

If you have performed any of the previous actions, perform the workaround shown in Oracle VM Server for SPARC PCIe Direct I/O and SR-IOV Features (Doc ID 1325454.1) (https://support.oracle.com/epmos/faces/SearchDocDisplay?amp;_adf.ctrl-state=10c69raljg_77&_afrLoop=506200315473090).

Limit the Maximum Number of Virtual Functions That Can Be Assigned to a Domain

 

Bug ID 15775637: An I/O domain has a limit on the number of interrupt resources that are available per root complex.

On SPARC T4 servers, the limit is approximately 63 MSI/X vectors. Each igb virtual function uses three interrupts. Each ixgbe virtual function uses two interrupts.

If you assign a large number of virtual functions to a domain, the domain runs out of system resources to support these devices. You might see messages similar to the following:

WARNING: ixgbevf32: interrupt pool too full.
WARNING: ddi_intr_alloc: cannot fit into interrupt pool
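
If these messages appear, one possible mitigation (a hedged sketch, assuming an I/O domain named iodom1 and a no-longer-needed virtual function named /SYS/MB/NET0/IOVNET.PF0.VF15, both hypothetical) is to unassign some virtual functions so that the remaining devices fit within the interrupt pool:

primary# ldm remove-io /SYS/MB/NET0/IOVNET.PF0.VF15 iodom1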

ldm remove-io of PCIe Cards That Have PCIe-to-PCI Bridges Should Be Disallowed

 

Bug ID 15761509: Use only the PCIe cards that support the Direct I/O (DIO) feature, which are listed in this support document (https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&doctype=REFERENCE&id=1325454.1).


Note -  The direct I/O feature is deprecated starting with the SPARC T7 series servers and the SPARC M7 series servers.

Workaround: Use the ldm add-io command to add the card to the primary domain again.
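
For example, assuming that the card occupies a slot named /SYS/MB/PCIE5 (a hypothetical name):

primary# ldm add-io /SYS/MB/PCIE5 primary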

Incorrect Device Path for Fibre Channel Virtual Functions in a Root Domain

 

Bug ID 15754356: In the root domain, the Oracle Solaris device path for a Fibre Channel virtual function is incorrect.

For example, the incorrect path name is pci@380/pci@1/pci@0/pci@6/fibre-channel@0,2 while it should be pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0,2.

The ldm list-io -l output shows the correct device path for the Fibre Channel virtual functions.

Workaround: None.

Live Migration of a Domain That Depends on an Inactive Master Domain on the Target Machine Causes ldmd to Fault With a Segmentation Fault

 

Bug ID 15701865: If you attempt a live migration of a domain that depends on an inactive domain on the target machine, the ldmd daemon faults with a segmentation fault and crashes. The ldmd daemon is restarted automatically, but the migration is aborted.

Workaround: Perform one of the following actions before you attempt the live migration:

  • Remove the guest dependency from the domain to be migrated.

  • Start the master domain on the target machine.

Simultaneous Migration Operations in “Opposite Direction” Might Cause ldm to Hang

 

Bug ID 15696986: If two ldm migrate commands are issued between the same two systems simultaneously in the “opposite direction,” the two commands might hang and never complete. An opposite direction situation occurs when you simultaneously start a migration on machine A to machine B and a migration on machine B to machine A.

The hang occurs even if the migration processes are initiated as dry runs by using the -n option. When this problem occurs, all other ldm commands might hang.

Recovery: Restart the Logical Domains Manager on both the source machine and the target machine:

primary# svcadm restart ldmd

Workaround: None.

Using the ldm stop -a Command on Domains in a Master-Slave Relationship Leaves the Slave With the stopping Flag Set

 

Bug ID 15664666: When a reset dependency is created, an ldm stop -a command might result in a domain with a reset dependency being restarted instead of only stopped.

Workaround: First, issue the ldm stop command to the master domain. Then, issue the ldm stop command to the slave domain. If the initial stop of the slave domain results in a failure, issue the ldm stop -f command to the slave domain.
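
For example, assuming a master domain named master-dom and a slave domain named slave-dom (both hypothetical names):

primary# ldm stop master-dom
primary# ldm stop slave-dom
primary# ldm stop -f slave-dom     # only if the previous stop of slave-dom fails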

Cannot Connect to Migrated Domain's Console Unless vntsd Is Restarted

 

Bug ID 15513998: Occasionally, after a domain has been migrated, it is not possible to connect to the console for that domain.

Note that this problem occurs when the migrated domain is running an OS version older than Oracle Solaris 11.3.

Workaround: Restart the vntsd SMF service to enable connections to the console:

# svcadm restart vntsd

Note -  This command will disconnect all active console connections.

Simultaneous Net Installation of Multiple Domains Fails When in a Common Console Group

 

Bug ID 15453968: Simultaneous net installation of multiple guest domains fails on systems that have a common console group.

Workaround: Perform net installations only on guest domains that each have their own console group. This failure is seen only when a common console group is shared among multiple domains that are being net installed.
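
For example, the following hedged sketch places a guest domain in its own console group before the net installation, assuming a guest domain named ldg1 and a virtual console concentrator service named primary-vcc0 (both hypothetical names). The domain might need to be inactive when you change its console configuration.

primary# ldm set-vcons group=ldg1 service=primary-vcc0 ldg1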