This section summarizes the bugs that you might encounter when using this version of the software. The most recent bugs are described first. Workarounds and recovery procedures are specified, if available.
The following Oracle Solaris OS bugs have been fixed in the fully qualified Oracle Solaris OS releases. These bugs might still be present in Oracle Solaris 10 OS versions. To avoid these problems, ensure that you run one of the Oracle Solaris OS versions that is associated with the bug ID.
To obtain details about the bugs in this table, review the bug reports.
Bug ID 21953704: The ldm list-io command might not show the most up-to-date IOV information immediately after running a cfgadm command. You might have to wait as long as four minutes for the updated information to be available.
Workaround: None.
Bug ID 21780045: The ovmtcreate utility generates a NULL string for the Version information in the OVF file if the locale is not the C locale (non-English locale environment).
The values of the Version and FullVersion properties are null, as shown by the empty <ovf:Version> and <ovf:FullVersion> elements in this example:
<ovf:VirtualSystem ovf:id="templates">
<ovf:Info>Oracle VM Template</ovf:Info>
<ovf:ProductSection ovf:class="com.oracle.ovmt">
<ovf:Info>Oracle VM Template</ovf:Info>
<ovf:Product>Oracle VM Template</ovf:Product>
<ovf:Version></ovf:Version>
<ovf:FullVersion></ovf:FullVersion>
When the ovmtdeploy utility uses the templates that you created by using the ovmtcreate utility in the non-C locale environment, a Java exception occurs because the templates include the NULL strings.
# /opt/ovmtutils/bin/ovmtdeploy -d guest10 -o /export/home/ovm \
/export/home/templates.ova
Oracle Virtual Machine for SPARC Deployment Utility
ovmtdeploy Version
Copyright (c) 2014, 2015, Oracle and/or its affiliates. All rights reserved.

STAGE 1 - EXAMINING SYSTEM AND ENVIRONMENT
------------------------------------------
Checking user privilege
Performing platform & prerequisite checks
Checking for required services
Named resourced available

STAGE 2 - ANALYZING ARCHIVE & RESOURCE REQUIREMENTS
---------------------------------------------------
Checking .ova format and contents
Validating archive configuration
Exception in thread "main" java.lang.NullPointerException
        at ovfparse.OvfParse.getTagValue(OvfParse.java:233)
        at ovfparse.VmProduct.<init>(VmProduct.java:33)
        at ovfparse.VmSys.<init>(VmSys.java:72)
        at ovfparse.OvfParse.parseOVFByDOM(OvfParse.java:371)
        at ovfparse.OvfParse.<init>(OvfParse.java:56)
        at ovmtdeploy.Ovmtdeploy.exec(Ovmtdeploy.java:1841)
        at ovmtdeploy.Ovmtdeploy.main(Ovmtdeploy.java:1946)
Workaround: Perform the following steps:
Edit the OVF file to add the version numbers to the contents of the Version and FullVersion properties.
Re-archive the template ova by using the gtar command.
For example:
# /usr/bin/gtar -cf templates.ova templates.ovf templates.mf System.img.gz
Run the ovmtdeploy utility with the -k option to skip checksum verification.
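The steps above can be sketched as follows. This is a minimal illustration run in a scratch directory with stand-in files; the template file names, the version strings, and the deployment options are placeholders, not values taken from a real template:

```shell
# Minimal sketch of the workaround; file names and version strings are
# placeholders. Runs in a scratch directory with stand-in template files.
cd "$(mktemp -d)"
printf '<ovf:Version></ovf:Version>\n<ovf:FullVersion></ovf:FullVersion>\n' > templates.ovf
: > templates.mf
: > System.img.gz

# Step 1: fill in the empty Version and FullVersion elements.
sed -e 's|<ovf:Version></ovf:Version>|<ovf:Version>3.3</ovf:Version>|' \
    -e 's|<ovf:FullVersion></ovf:FullVersion>|<ovf:FullVersion>3.3.0.0</ovf:FullVersion>|' \
    templates.ovf > templates.ovf.tmp && mv templates.ovf.tmp templates.ovf

# Step 2: re-archive the template (use /usr/bin/gtar on Oracle Solaris;
# GNU tar is invoked here by its common Linux name).
tar -cf templates.ova templates.ovf templates.mf System.img.gz

# Step 3 (not run here): deploy with -k, because the manifest checksums no
# longer match the edited OVF file.
# /opt/ovmtutils/bin/ovmtdeploy -k -d guest10 -o /export/home/ovm templates.ova
```

The -k option is needed in the final step because editing the OVF file invalidates the digests recorded in the manifest.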
Bug ID 21674282: After you replace a PCIe card in the same slot, the ldm add-vsan command might fail if you specify an alias for the physical SCSI HBA device (/SYS).
Workaround: Do not specify the device name alias. Instead, specify the full device path name (/pci) for the ldm add-vsan command.
Bug ID 21635033: When a service domain has more than one virtual disk server (vds), running the ovmtcreate utility for a guest domain might fail because the utility checks only the first vds instance in the service domain.
For example, running the ovmtcreate utility for the gdom3 domain fails if the virtual disk is configured as follows:
The primary domain has four virtual disk servers (vds)
The virtual disk server device that corresponds to the virtual disk on the gdom3 domain is associated with vds3
In the following sample output, vds0 is the first virtual disk server, and the virtual disk server device for the gdom3 virtual disk is not vds0 but vds3.
primary# ldm list -l -p -o disk
VERSION 1.15
DOMAIN|name=primary|
VDS|name=vds0|nclients=1
|vol=vol0|opts=|dev=/export/home/ovm/gdom0.img|mpgroup=
VDS|name=vds1|nclients=1
|vol=vol0|opts=|dev=/export/home/ovm/gdom1.img|mpgroup=
VDS|name=vds2|nclients=1
|vol=vol0|opts=|dev=/export/home/ovm/gdom2.img|mpgroup=
VDS|name=cdrom|nclients=3
|vol=1|opts=|dev=/export/home/ovm/sol-113_1.iso|mpgroup=
|vol=2|opts=|dev=/export/home/ovm/sol-113_2.iso|mpgroup=
|vol=3|opts=|dev=/export/home/ovm/sol-113_3.iso|mpgroup=
|vol=4|opts=|dev=/export/home/ovm/sol-113_4.iso|mpgroup=
VDS|name=vds3|nclients=1
|vol=disk0|opts=|dev=/export/home/ovm/gdom3.img|mpgroup=
DOMAIN|name=gdom0|
VDISK|name=vdisk0|vol=vol0@vds0|timeout=|dev=disk@0|server=primary|mpgroup=|id=0
VDISK|name=cdrom|vol=1@cdrom|timeout=|dev=disk@1|server=primary|mpgroup=|id=1
DOMAIN|name=gdom1|
VDISK|name=vdisk0|vol=vol0@vds1|timeout=|dev=disk@0|server=primary|mpgroup=|id=0
VDISK|name=cdrom|vol=2@cdrom|timeout=|dev=disk@1|server=primary|mpgroup=|id=1
DOMAIN|name=gdom2|
VDISK|name=vdisk0|vol=vol0@vds2|timeout=|dev=disk@0|server=primary|mpgroup=|id=0
VDISK|name=cdrom|vol=3@cdrom|timeout=|dev=disk@1|server=primary|mpgroup=|id=1
DOMAIN|name=gdom3|
VDISK|name=vdisk0|vol=disk0@vds3|timeout=|dev=disk@0|server=primary|mpgroup=|id=0
The following ldm list command shows the gdom3 domain status:
primary# ldm list
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  NORM  UPTIME
primary          active     -n-cv-  UART    32    46848M   0.3%  0.3%  1d 51m
gdom0            active     -n----  5000    24    24G      0.0%  0.0%  1d 35m
gdom1            active     -n----  5001    24    24G      0.0%  0.0%  8d 18h 21m
gdom2            active     -n----  5002    24    24G      0.0%  0.0%  8d 17h 43m
gdom3            bound      ------  5003    24    24G
The following command shows the error you receive when running the ovmtcreate command for the gdom3 domain:
# /opt/ovmtutils/bin/ovmtcreate -d gdom3 -o /export/home/ovmt

STAGE 1 - EXAMINING SYSTEM AND ENVIRONMENT
-------------------------------------------
Performing platform & prerequisite checks
Checking user permissions
Checking for required packages
Checking for required services
Checking directory permissions

STAGE 2 - ANALYZING DOMAIN
---------------------------
Retrieving and processing attributes
Checking domain state
Getting domain resource settings
Discovering network topology
Discovering disk topology
ERROR: VDS Device does not exist or not readable
Workaround: Ensure that the service domain has only one virtual disk server before you run the ovmtcreate utility.
Bug ID 21616429: The Oracle VM Server for SPARC 3.3 software introduced socket support for Fujitsu M10 servers only.
Software running on Oracle SPARC systems and Oracle VM Server for SPARC versions older than 3.3 cannot re-create a domain with socket constraints from an XML file.
Attempting to re-create a domain with socket constraints from an XML file with an older version of the Oracle VM Server for SPARC software or on an Oracle SPARC system fails with the following message:
primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
socket not a known resource
If Oracle VM Server for SPARC 3.2 is running on a Fujitsu M10 server and you attempt to re-create a domain with socket constraints from an XML file, the command fails with various error messages, such as the following:
primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
Unknown property: vcpus

primary# ldm add-domain -i ovm3.3_socket_ovm11.xml
perf-counters property not supported, platform does not have
performance register access capability, ignoring constraint setting.
Workaround: Edit the XML file to remove any sections that reference the socket resource type.
Bug ID 21561834: If the number of virtual CPUs in a domain drops below four, DRM might fail to add virtual CPUs to the domain even when utilization significantly exceeds the upper utilization level. If the util-upper property value is greater than the default value of 70, DRM might fail to add virtual CPUs even if the domain has more than four virtual CPUs.
Workaround: Set the DRM policy's elastic-margin property value to at least 15.
primary# ldm set-policy elastic-margin=15 name=policy-name domain-name
If the util-upper property value is greater than 70, set the DRM policy's elastic-margin property value to at least 20.
primary# ldm set-policy elastic-margin=20 name=policy-name domain-name
Bug ID 21527087: In rare cases, using the ldm set-socket command to specify sockets for a running domain might cause the following unexpected behavior:
The Logical Domains Manager might crash
The ldm set-socket command completes but not all of the domain's CPUs and memory are remapped to the specified sockets
In addition, if the physical partition (PPAR) has more than 12 sockets, do not use the ldm set-socket --restored-degraded and ldm set-socket socket_id=id commands while the domain is running. If you run these commands on a running domain, the ldmd state might become corrupted.
Workaround: Stop the domain before executing an ldm set-socket command.
It is always safe to clear an active domain's socket constraints by using the ldm set-socket command to set the socket_id property to a NULL value.
Bug ID 21510615: An attempt to remove one or more PCIe buses might fail persistently with a device busy error from the ldm remove-io command.
Workaround: Check whether the gdm service is running and disable it manually (or find and kill the Xorg process), and then retry the ldm remove-io operation.
# svcs | grep gdm
# svcadm disable -st svc:/application/graphical-login/gdm:default
Or:
# ps -ef | grep Xorg
# pkill Xorg
Bug ID 21367043: In rare circumstances, socket constraints might become out of synchronization with the bound CPU and memory resources of a domain. The ldm rm-vcpu, ldm set-vcpu, ldm rm-core, and ldm set-core commands might cause the Logical Domains Manager to crash with the following error message in the ldmd SMF log:
fatal error: xcalloc(0,4) : one of number or size is <= 0 at line 1183 of affinity_core.c
Workaround: Clear the domain's socket constraints by using the following commands:
primary# ldm list-socket domain-name
primary# ldm set-socket socket_id= domain-name
Bug ID 21369897: Running the ldmpower command while performing administration operations on a guest domain might cause a segmentation fault of the ldmd daemon.
Workaround: Do not execute the ldmpower command while performing addition or removal operations on a guest domain.
Bug IDs 21352084, 21861284, and 21861327: In rare circumstances, a root domain might panic if it receives an I/O error and starts to analyze the error while an I/O domain is reset.
The panic message is similar to the following:
panic[cpu15]/thread=2a1017d3c20: Fatal error has occured in: PCIe fabric.(0x2)(0x245)
The ereports are dumped to the console at the time of the panic. The ereports show that some status register values, including the pcie_ue_status value, are all FFs. After the panic, the root domain reboots itself and recovers.
Workaround: None.
Bug ID 21321166: I/O throughput is sometimes slower when using a virtual SCSI HBA MPxIO path to an offline service domain.
Workaround: Disable the path to the offline service domain by using the mpathadm disable path command until the service domain is returned to service.
Bug ID 21299404: If you use the ldm shrink-socket command to perform a memory DR operation and one of the domain's memory blocks is not 256-Mbyte aligned, the command might remove an additional 256 Mbytes of memory from the active domain. If the domain's memory is fragmented, the ldmd daemon might attempt to further remove additional memory.
Workaround: None.
Bug ID 21283102: The ldm list-rsrc-group command might show the same memory and I/O resource information under both the /SYS/MB (motherboard) and other resource groups. For example:
primary# ldm list-group
NAME                           CORE  MEMORY   IO
/SYS/PM0                       32    64G      4
/SYS/PM1                       32    256G     4
/SYS/PM2                       32    128G     4
/SYS/PM3                       32    128G     4
/SYS/MB                        0     576G     16

primary# ldm list-group -a -l
NAME                           CORE  MEMORY   IO
/SYS/PM0                       32    64G      4

CORE
    CID                                   BOUND
    0, 1                                  primary
    2, 3, 4, 5, 6, 7, 8, 9
    10, 11, 12, 13, 14, 15, 16, 17
    18, 19, 20, 21, 22, 23, 24, 25
    26, 27, 28, 29, 30, 31

MEMORY
    PA               SIZE     BOUND
    0x0              57M      _sys_
    0x3900000        32M      _sys_
    0x5900000        94M      _sys_
    0xb700000        393M     _sys_
    0x24000000       192M     _sys_
    0x30000000       31488M
    0x7e0000000      64M      _sys_
    0x7e4000000      64M      _sys_
    0x7e8000000      384M     _sys_
    0x80000000000    32G

IO
    DEVICE           PSEUDONYM        BOUND
    pci@300          pci_0            primary
    pci@340          pci_1            primary
    pci@380          pci_2            primary
    pci@3c0          pci_3            primary
------------------------------------------------------------------------------
NAME                           CORE  MEMORY   IO
/SYS/PM1                       32    256G     4

CORE
    CID                                   BOUND
    32, 33, 34, 35, 36, 37, 38, 39
    40, 41, 42, 43, 44, 45, 46, 47
    48, 49, 50, 51, 52, 53, 54, 55
    56, 57, 58, 59, 60, 61, 62, 63

MEMORY
    PA               SIZE     BOUND
    0x100000000000   768M
    0x100030000000   24G      primary
    0x100630000000   105728M
    0x180000000000   128G

IO
    DEVICE           PSEUDONYM        BOUND
    pci@400          pci_4            primary
    pci@440          pci_5            primary
    pci@480          pci_6            primary
    pci@4c0          pci_7            primary
------------------------------------------------------------------------------
NAME                           CORE  MEMORY   IO
/SYS/PM2                       32    128G     4

CORE
    CID                                   BOUND
    64, 65, 66, 67, 68, 69, 70, 71
    72, 73, 74, 75, 76, 77, 78, 79
    80, 81, 82, 83, 84, 85, 86, 87
    88, 89, 90, 91, 92, 93, 94, 95

MEMORY
    PA               SIZE     BOUND
    0x200000000000   64G
    0x280000000000   64G

IO
    DEVICE           PSEUDONYM        BOUND
    pci@500          pci_8            primary
    pci@540          pci_9            primary
    pci@580          pci_10           primary
    pci@5c0          pci_11           primary
------------------------------------------------------------------------------
NAME                           CORE  MEMORY   IO
/SYS/PM3                       32    128G     4

CORE
    CID                                   BOUND
    96, 97, 98, 99, 100, 101, 102, 103
    104, 105, 106, 107, 108, 109, 110, 111
    112, 113, 114, 115, 116, 117, 118, 119
    120, 121, 122, 123, 124, 125, 126, 127

MEMORY
    PA               SIZE     BOUND
    0x300000000000   64G
    0x380000000000   64G

IO
    DEVICE           PSEUDONYM        BOUND
    pci@600          pci_12           primary
    pci@640          pci_13           primary
    pci@680          pci_14           primary
    pci@6c0          pci_15           primary
------------------------------------------------------------------------------
NAME                           CORE  MEMORY   IO
/SYS/MB                        0     576G     16

MEMORY
    PA               SIZE     BOUND
    0x0              57M      _sys_
    0x3900000        32M      _sys_
    0x5900000        94M      _sys_
    0xb700000        393M     _sys_
    0x24000000       192M     _sys_
    0x30000000       31488M
    0x7e0000000      64M      _sys_
    0x7e4000000      64M      _sys_
    0x7e8000000      384M     _sys_
    0x80000000000    32G
    0x100000000000   768M
    0x100030000000   24G      primary
    0x100630000000   105728M
    0x180000000000   128G
    0x200000000000   64G
    0x280000000000   64G
    0x300000000000   64G
    0x380000000000   64G

IO
    DEVICE           PSEUDONYM        BOUND
    pci@300          pci_0            primary
    pci@340          pci_1            primary
    pci@380          pci_2            primary
    pci@3c0          pci_3            primary
    pci@400          pci_4            primary
    pci@440          pci_5            primary
    pci@480          pci_6            primary
    pci@4c0          pci_7            primary
    pci@500          pci_8            primary
    pci@540          pci_9            primary
    pci@580          pci_10           primary
    pci@5c0          pci_11           primary
    pci@600          pci_12           primary
    pci@640          pci_13           primary
    pci@680          pci_14           primary
    pci@6c0          pci_15           primary
Workaround: Compare the detailed memory and I/O information in the following columns to determine whether the same resource is shown under more than one resource group:
Memory: PA, SIZE and BOUND
I/O: DEVICE, PSEUDONYM and BOUND
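As an illustration of that comparison (this is not an ldm feature), the following sketch checks whether a memory block with the same PA value appears in the MEMORY rows of two resource groups; the sample rows are placeholders:

```shell
# Hypothetical comparison of MEMORY rows (PA SIZE BOUND) from two resource
# groups; a duplicated PA means the same memory is shown under both groups.
pm1_rows='0x100030000000 24G primary
0x180000000000 128G -'
mb_rows='0x100030000000 24G primary
0x380000000000 64G -'
dups=$(echo "$pm1_rows" | awk '{print $1}' | while read -r pa; do
  echo "$mb_rows" | awk -v pa="$pa" '$1 == pa {print pa}'
done)
echo "duplicated PA values: $dups"
```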
Bug ID 21188211: If LUNs are added to or removed from a virtual SAN after a virtual SCSI HBA is configured, the ldm rescan-vhba command sometimes does not show the new LUN view.
Workaround: Remove the virtual SCSI HBA and then re-add it. Check to see whether the LUNs are seen. If the removal and re-add operations are unsuccessful, you must reboot the guest domain.
Bug ID 21114622: When you execute the ldm create-vf or ldm destroy-vf command, the associated physical function driver is detached and re-attached, which can take a significant but unquantifiable amount of time. The amount of time depends on the number of virtual functions that are involved and on the complexity of the target hardware device.
Running the ldm list-io command might show that the physical function (and its child virtual functions) have the INV (invalid) status.
Currently, the Logical Domains Manager polls the agent for a period of time and then stops polling. If the polling period is too short, the device might show the INV status indefinitely.
Workaround: From the root domain that owns the physical function device, restart the ldoms/agents service.
primary# svcadm restart ldoms/agents
Run this command if the INV status persists for at least six minutes after issuing the ldm create-vf or ldm destroy-vf command.
Bug ID 20951004: The virtual SCSI HBA (vhba) subsystem does not support physical SCSI HBAs when MPxIO is enabled in the service domain.
Workaround: Disable MPxIO for all the initiator ports on the service domain by running the following command:
# stmsboot -d
Bug ID 20882700: When a PCIe device (or an SR-IOV virtual function) is removed from or added to a domain, the Oracle Solaris 11.3 fmd fault management daemon reports the event in exactly the same way as if an FRU had been physically removed or added.
You might see console messages and messages in the /var/adm/messages file similar to the following:
SUNW-MSG-ID: FMD-8000-A0, TYPE: Alert, VER: 1, SEVERITY: Minor
EVENT-TIME: Tue May 19 18:39:41 PDT 2015
PLATFORM: unknown, CSN: unknown, HOSTNAME: starbuck
SOURCE: software-diagnosis, REV: 0.1
EVENT-ID: 5077e6c3-6a15-457e-a55b-cb72ea5f9728
DESC: FRU has been added to the system.
AUTO-RESPONSE: FMD topology will be updated.
IMPACT: System impact depends on the type of FRU.
REC-ACTION: Use fmadm faulty to provide a more detailed view of this event.
Please refer to the associated reference document at
http://support.oracle.com/msg/FMD-8000-A0 for the latest service procedures
and policies regarding this diagnosis.
# fmadm faulty
--------------- ------------------------------------  -----------  ---------
TIME            EVENT-ID                              MSG-ID       SEVERITY
--------------- ------------------------------------  -----------  ---------
Apr 14 10:04:00 2d981602-975c-4861-9f26-e37360eca697  FMD-8000-CV  Minor

Problem Status    : open
Diag Engine       : software-diagnosis / 0.1
System
    Manufacturer  : Oracle Corporation
    Name          : SPARC T7-2
    Part_Number   : T7_2
    Serial_Number : T7_2
    Host_ID       : 86582a8c

----------------------------------------
Suspect 1 of 1 :
   Problem class : alert.oracle.solaris.fmd.fru-monitor.fru-remove
   Certainty     : 100%

   FRU
      Status           : active/not present
      Location         : "/SYS/MB/PCIE1"
      Manufacturer     : unknown
      Name             : unknown
      Part_Number      : unknown
      Revision         : unknown
      Serial_Number    : unknown
      Chassis
         Manufacturer  : Oracle-Corporation
         Name          : SPARC-T7-2
         Part_Number   : T7_2
         Serial_Number : T7_2
   Resource
      Status           : active/not present

Description : FRU '/SYS/MB/PCIE1' has been removed from the system.

Response    : FMD topology will be updated.

Impact      : System impact depends on the type of FRU.

Action      : Use 'fmadm faulty' to provide a more detailed view of this
              event. Please refer to the associated reference document at
              http://support.oracle.com/msg/FMD-8000-CV for the latest
              service procedures and policies regarding this diagnosis.
Workaround: You can ignore these alerts as long as they were generated by explicit administrator actions to add or remove an I/O device from a domain.
Bug ID 20876502: Pulling the SAN cable from a service domain that is part of a virtual SCSI HBA MPxIO guest domain configuration causes the Path State column of the mpathadm output to show incorrect values. In addition, pulling the cable leads to I/O operation failures in the guest domain.
Workaround: Plug in the SAN cable and run the ldm rescan-vhba command for all the virtual SCSI HBAs to the service domain that has the cable attached. After performing this workaround, the guest domain should resume performing I/O operations.
Bug ID 20774477: If you use SES-enabled storage devices, you might see a device busy error when you attempt to remove a PCIe bus that hosts these devices. To determine whether you are using this type of storage device, search for the ses or enclosure string in the ldm list-io -l output for the PCIe bus.
Workaround: Perform one of the following workarounds to remove the PCIe bus:
Dynamically remove the PCIe bus.
Disable the FMD service.
primary# svcadm disable -st svc:/system/fmd
Remove the PCIe bus.
primary# ldm remove-io bus
Re-enable the FMD service.
primary# svcadm enable svc:/system/fmd
Statically remove the PCIe bus.
Place the root domain that has the PCIe bus in a delayed reconfiguration.
primary# ldm start-reconf root-domain
Remove the PCIe bus.
primary# ldm remove-io bus
Perform a reboot from the root domain console.
root-domain# reboot
Bug ID 20619894: If the system/management/hwmgmtd package is not installed, a dynamic bus remove operation causes the rcm_daemon to print the following message on the console:
rcm_daemon[839]: rcm script ORCL,pcie_rc_rcm.pl: svcs: Pattern 'sp/management' doesn't match any instances
Workaround: You can safely ignore this message.
Bug ID 20532270: Be aware of any direct I/O or dynamic bus removal operations that attempt to remove the physical SCSI HBA from the virtual SAN's control.
If you perform an ldm remove-io operation on a PCIe resource that is referenced by a virtual SAN device, that device is unusable if it has never been referenced by an ldm add-vhba command. If the ldm remove-io operation occurs after you run the ldm add-vhba command, the vsan module prevents the PCIe resource from being removed.
Workaround: Delete the virtual SAN.
Bug ID 20425271: When recovery mode is triggered after the system drops to the factory-default configuration, recovery fails if the system boots from a different device than the one that was booted in the previously active configuration. This failure might occur if the active configuration uses a boot device other than the factory-default boot device.
Workaround: Perform the following steps any time you want to save a new configuration to the SP:
Determine the full PCI path to the boot device for the primary domain.
Use this path for the ldm set-var command in Step 4.
Remove any currently set boot-device property from the primary domain.
Performing this step is necessary only if the boot-device property has a value set. If the property does not have a value set, an attempt to remove the boot-device property results in the boot-device not found message.
primary# ldm rm-var boot-device primary
Save the current configuration to the SP.
primary# ldm add-spconfig config-name
Explicitly set the boot-device property for the primary domain.
primary# ldm set-var boot-device=value primary
If you set the boot-device property after saving the configuration to the SP as described, the specified boot device is booted when recovery mode is triggered.
Recovery: If recovery mode has already failed as described, perform the following steps:
Explicitly set the boot device to the one used in the last running configuration.
primary# ldm set-var boot-device=value primary
Reboot the primary domain.
primary# reboot
The reboot enables the recovery to proceed.
Bug ID 20046234: A guest domain panic might occur when MPxIO is enabled and both a virtual SCSI HBA and a Fibre Channel SR-IOV device can view the same LUNs in the guest domain. The panic occurs if the Fibre Channel SR-IOV card is removed from the guest domain and then re-added.
Workaround: Do not configure a guest domain with Fibre Channel SR-IOV and a virtual SCSI HBA when both have MPxIO enabled.
Bug ID 20004281: When a primary domain is power cycled, ixgbevf nodes on the I/O domain might be reported as disabled by the ipadm command, and as nonexistent by the ifconfig command.
Workaround: Re-enable the IP interfaces:
# svcadm restart network/physical:default
Bug ID 19943809: The hxge driver cannot use interfaces inside an I/O domain when the card is assigned by using the direct I/O feature.
The following warning is issued to the system log file:
WARNING: hxge0 : <== hxge_setup_mutexes: failed 0x1
Workaround: Add the following line to the /etc/system file and reboot the system:
set px:px_force_intx_support=1
Bug ID 19932842: An attempt to set an OBP variable from a guest domain might fail if you use the eeprom or the OBP command before one of the following commands is completed:
ldm add-spconfig
ldm remove-spconfig
ldm set-spconfig
ldm bind
This problem might occur when these commands take more than 15 seconds to complete.
# /usr/sbin/eeprom boot-file\=-k
promif_ldom_setprop: promif_ldom_setprop: ds response timeout
eeprom: OPROMSETOPT: Invalid argument
boot-file: invalid property
Recovery: Retry the eeprom or OBP command after the ldm operation has completed.
Workaround: Retry the eeprom or OBP command on the affected guest domain. You might be able to avoid the problem by using the ldm set-var command on the primary domain.
Bug ID 19449221: A domain can have no more than 999 virtual network devices (vnets).
Workaround: Limit the number of vnets on a domain to 999.
Bug ID 19078763: Oracle VM Server for SPARC no longer keeps track of freed MAC addresses. MAC addresses are now allocated by randomly selecting an address and then confirming that address is not used by any logical domains on the local network.
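The allocation scheme can be sketched as a random probe loop. This is only an illustration of the described behavior, not the ldmd implementation; the in-use address list and the 00:14:4f:f8 range are placeholders:

```shell
# Illustrative allocate-by-random-probe: pick a random candidate address and
# retry while it collides with an address already in use on the local network.
# RANDOM requires bash or ksh; the in_use list is a placeholder.
in_use="00:14:4f:f8:aa:01 00:14:4f:f8:02:9c"
while : ; do
  candidate=$(printf '00:14:4f:f8:%02x:%02x' $((RANDOM % 256)) $((RANDOM % 256)))
  case " $in_use " in
    *" $candidate "*) continue ;;   # already taken: pick again
    *) break ;;                     # free: allocate it
  esac
done
echo "allocated $candidate"
```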
Bug ID 18083904: The firmware for Sun Storage 16 Gb Fibre Channel Universal HBA, Emulex cards does not support setting bandwidth controls. The HBA firmware ignores any value that you specify for the bw-percent property.
Workaround: None.
Bug ID 18001028: In the root domain, the Oracle Solaris device path for a Fibre Channel virtual function is incorrect.
For example, the incorrect path name is pci@380/pci@1/pci@0/pci@6/fibre-channel@0,2 while it should be pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0,2.
The ldm list-io -l output shows the correct device path for the Fibre Channel virtual functions.
Workaround: None.
Bug ID 17576087: Performing a power cycle of the system to a saved configuration might not restore the memory after the faulty memory has been replaced.
Workaround: After you replace the faulty memory, perform a power cycle of the system to the factory-default configuration. Then, perform a power cycle of the system to the configuration that you want to use.
You cannot configure a DLMP aggregation on an SR-IOV NIC virtual function or a virtual network device in a guest domain.
Bug ID 17422973: The installation of the Oracle Solaris 11.1 OS on a single-slice disk might fail with the following error on these servers:
A SPARC T4 server that runs at least system firmware version 8.4.0
A SPARC T5, SPARC M5, or SPARC M6 server that runs at least system firmware version 9.1.0
A Fujitsu M10 server that runs at least XCP version 2230
cannot label 'c1d0': try using fdisk(1M) and then provide a specific slice
Unable to build pool from specified devices: invalid vdev configuration
Workaround: Relabel the disk with an SMI label.
Bug ID 17020950: After migrating an active domain from a SPARC T4 platform to a SPARC T5, SPARC M5, or SPARC M6 platform that was bound using firmware version 8.3, performing a memory dynamic reconfiguration might result in a guest domain panic.
Workaround: Before you perform the migration, update the SPARC T4 system with version 8.4 of the system firmware. Then, rebind the domain.
Bug ID 16979993: An attempt to use a dynamic SR-IOV remove operation on an InfiniBand device results in confusing and inappropriate error messages.
Dynamic SR-IOV remove operations are not supported for InfiniBand devices.
Workaround: Remove InfiniBand virtual functions by performing one of the following procedures:
Bug ID 16691046: If virtual functions are assigned from the root domain, an I/O domain might fail to provide resiliency in the following hotplug situations:
You add a root complex (PCIe bus) dynamically to the root domain, and then you create the virtual functions and assign them to the I/O domain.
You hot-add an SR-IOV card to the root domain that owns the root complex, and then you create the virtual functions and assign them to the I/O domain.
You replace or add any PCIe card to an empty slot (either through hotplug or when the root domain is down) on the root complex that is owned by the root domain. This root domain provides virtual functions from the root complex to the I/O domain.
Workaround: Perform one of the following steps:
If the root complex already provides virtual functions to the I/O domain and you add, remove, or replace any PCIe card on that root complex (through hotplug or when the root domain is down), you must reboot both the root domain and the I/O domain.
If the root complex does not have virtual functions currently assigned to the I/O domain and you add an SR-IOV card or any other PCIe card to the root complex, you must stop the root domain to add the PCIe card. After the root domain reboots, you can assign virtual functions from that root complex to the I/O domain.
If you want to add a new PCIe bus to the root domain and then create and assign virtual functions from that bus to the I/O domain, perform one of the following steps and then reboot the root domain:
Add the bus during a delayed reconfiguration
Add the bus dynamically
Bug ID 16659506: A guest domain is in transition state (t) after a reboot of the primary domain. This problem arises when a large number of virtual functions are configured on the system.
Workaround: Retry the OBP disk boot command several times to prevent the domain from booting from the network.
Perform the following steps on each domain:
Access the console of the domain.
primary# telnet localhost 5000
Set the boot-device property.
ok> setenv boot-device disk disk disk disk disk disk disk disk disk disk net
The number of disk entries that you specify as the value of the boot-device property depends on the number of virtual functions that are configured on the system. On smaller systems, you might be able to include fewer instances of disk in the property value.
Verify that the boot-device property is set correctly by using the printenv command.
ok> printenv
Return to the primary domain console.
Repeat Steps 1-4 for each domain on the system.
Reboot the primary domain.
primary# shutdown -i6 -g0 -y
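The boot-device value used in the procedure above can be generated with a hypothetical helper; the count of 10 disk entries is an arbitrary example, to be adjusted to the number of virtual functions on the system:

```shell
# Hypothetical helper: build a boot-device value with n disk entries followed
# by a final net fallback; n=10 is an arbitrary example count.
n=10
val=$(printf 'disk %.0s' $(seq 1 "$n"))net
echo "setenv boot-device $val"
```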
Bug ID 16299053: After disabling a PCIe device, you might experience unexpected behavior. The subdevices under the disabled PCIe device revert to their unassigned names while the PCIe device is still owned by the domain.
Workaround: If you decide to disable a PCIe slot on the ILOM, ensure that the PCIe slot is not assigned to a domain by means of the direct I/O (DIO) feature. That is, first ensure that the PCIe slot is assigned to the corresponding root domain before disabling the slot on the ILOM.
If you disable the PCIe slot on the ILOM while the PCIe slot is assigned to a domain with DIO, stop that domain and reassign the device to the root domain for the correct behavior.
Bug ID 16284767: This warning on the Oracle Solaris console means the interrupt supply was exhausted while attaching I/O device drivers:
WARNING: ddi_intr_alloc: cannot fit into interrupt pool
The hardware provides a finite number of interrupts, so Oracle Solaris limits how many each device can use. A default limit is designed to match the needs of typical system configurations; however, this limit might need adjustment for certain system configurations.
Specifically, the limit may need adjustment if the system is partitioned into multiple logical domains and if too many I/O devices are assigned to any guest domain. Oracle VM Server for SPARC divides the total interrupts into smaller sets given to guest domains. If too many I/O devices are assigned to a guest domain, its supply might be too small to give each device the default limit of interrupts. Thus, it exhausts its supply before it completely attaches all the drivers.
Some drivers provide an optional callback routine which allows Oracle Solaris to automatically adjust their interrupts. The default limit does not apply to these drivers.
Workaround: Use the ::irmpools and ::irmreqs MDB macros to determine how interrupts are used. The ::irmpools macro shows the overall supply of interrupts divided into pools. The ::irmreqs macro shows which devices are mapped to each pool. For each device, ::irmreqs shows whether the default limit is enforced by an optional callback routine, how many interrupts each driver requested, and how many interrupts the driver is given.
The macros do not show information about drivers that failed to attach. However, the information that is shown helps calculate the extent to which you can adjust the default limit. Any device that uses more than one interrupt without providing a callback routine can be forced to use fewer interrupts by adjusting the default limit. Reducing the default limit below the amount that is used by such a device results in freeing of interrupts for use by other devices.
To adjust the default limit, set the ddi_msix_alloc_limit property to a value from 1 to 8 in the /etc/system file. Then, reboot the system for the change to take effect.
To maximize performance, start by assigning larger values and decrease the values in small increments until the system boots successfully without any warnings. Use the ::irmpools and ::irmreqs macros to measure the adjustment's impact on all attached drivers.
For example, suppose the following warnings are issued while booting the Oracle Solaris OS in a guest domain:
WARNING: emlxs3: interrupt pool too full.
WARNING: ddi_intr_alloc: cannot fit into interrupt pool
The ::irmpools and ::irmreqs macros show the following information:
# echo "::irmpools" | mdb -k
ADDR             OWNER   TYPE   SIZE  REQUESTED  RESERVED
00000400016be970 px#0    MSI/X  36    36         36

# echo "00000400016be970::irmreqs" | mdb -k
ADDR             OWNER   TYPE   CALLBACK NINTRS NREQ NAVAIL
00001000143acaa8 emlxs#0 MSI-X  No       32     8    8
00001000170199f8 emlxs#1 MSI-X  No       32     8    8
000010001400ca28 emlxs#2 MSI-X  No       32     8    8
0000100016151328 igb#3   MSI-X  No       10     3    3
0000100019549d30 igb#2   MSI-X  No       10     3    3
0000040000e0f878 igb#1   MSI-X  No       10     3    3
000010001955a5c8 igb#0   MSI-X  No       10     3    3
The default limit in this example is eight interrupts per device, which is not enough to accommodate the attachment of the final emlxs3 device to the system. Assuming that all emlxs instances behave in the same way, emlxs3 probably requested eight interrupts.
By subtracting the 12 interrupts used by all of the igb devices from the total pool size of 36 interrupts, 24 interrupts are available for the emlxs devices. Dividing the 24 interrupts by 4 suggests that 6 interrupts per device would enable all emlxs devices to attach with equal performance. So, the following adjustment is added to the /etc/system file:
set ddi_msix_alloc_limit = 6
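The arithmetic behind this value can be sketched as a small shell calculation. The three input values are taken from the ::irmpools and ::irmreqs output for this example; on another system, substitute the values reported by the macros:

```shell
# Sketch of the limit calculation for this example.
# Inputs come from the ::irmpools and ::irmreqs output shown earlier.
POOL_SIZE=36        # SIZE column from ::irmpools
OTHER_RESERVED=12   # 4 igb devices x 3 interrupts each
NUM_DEVICES=4       # emlxs instances that share the remainder

LIMIT=$(( (POOL_SIZE - OTHER_RESERVED) / NUM_DEVICES ))
echo "set ddi_msix_alloc_limit = $LIMIT"
```

The printed line is the entry to place in the /etc/system file before rebooting.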
When the system successfully boots without warnings, the ::irmpools and ::irmreqs macros show the following updated information:
# echo "::irmpools" | mdb -k
ADDR             OWNER   TYPE   SIZE  REQUESTED  RESERVED
00000400018ca868 px#0    MSI/X  36    36         36

# echo "00000400018ca868::irmreqs" | mdb -k
ADDR             OWNER   TYPE   CALLBACK NINTRS NREQ NAVAIL
0000100016143218 emlxs#0 MSI-X  No       32     8    6
0000100014269920 emlxs#1 MSI-X  No       32     8    6
000010001540be30 emlxs#2 MSI-X  No       32     8    6
00001000140cbe10 emlxs#3 MSI-X  No       32     8    6
00001000141210c0 igb#3   MSI-X  No       10     3    3
0000100017549d38 igb#2   MSI-X  No       10     3    3
0000040001ceac40 igb#1   MSI-X  No       10     3    3
000010001acc3480 igb#0   MSI-X  No       10     3    3
Bug ID 16224353: After rebooting the primary domain, ixgbevf instances in the primary domain might not work.
Workaround: None.
Bug ID 16071170: On a SPARC M5-32 or a SPARC M6-32 system, the internal SAS controllers are exported as SR-IOV-enabled controllers even though these cards do not support SR-IOV.
The Oracle VM Server for SPARC log shows the following messages when attempting to create the physical function on these cards:
Dec 11 04:27:54 warning: Dropping pf pci@d00/pci@1/pci@0/pci@0/pci@0/pci@4/LSI,sas@0: no IOV capable driver
Dec 11 04:27:54 warning: Dropping pf pci@d80/pci@1/pci@0/pci@c/pci@0/pci@4/LSI,sas@0: no IOV capable driver
Dec 11 04:27:54 warning: Dropping pf pci@c00/pci@1/pci@0/pci@c/pci@0/pci@4/LSI,sas@0: no IOV capable driver
Dec 11 04:27:54 warning: Dropping pf pci@e00/pci@1/pci@0/pci@0/pci@0/pci@4/LSI,sas@0: no IOV capable driver
The system has four LSI SAS controller ports, each in one IOU of the SPARC M5-32 and SPARC M6-32 assembly. This error is reported for each port.
Workaround: You can ignore these messages. These messages indicate only that the LSI-SAS controller devices on the system are capable of SR-IOV but that no SR-IOV support is available for this hardware.
Bug ID 16068376: On a T5-8 with approximately 128 domains, some ldm commands such as ldm list might show 0 seconds as the uptime for all domains.
Workaround: Log in to the domain and use the uptime command to determine the domain's uptime.
Bug ID 15812823: In low free-memory situations, not all memory blocks can be used as part of a memory DR operation due to size. However, these memory blocks are included in the amount of free memory. This situation might lead to a smaller amount of memory being added to the domain than expected. No error message is shown if this situation occurs.
Workaround: None.
Bug ID 15783031: You might experience problems when you use the ldm init-system command to restore a domain configuration that has used direct I/O or SR-IOV operations.
A problem arises if one or more of the following operations have been performed on the configuration to be restored:
A slot has been removed from a bus that is still owned by the primary domain.
A virtual function has been created from a physical function that is owned by the primary domain.
A virtual function has been assigned to the primary domain, to other guest domains, or to both.
A root complex has been removed from the primary domain and assigned to a guest domain, and that root complex is used as the basis for further I/O virtualization operations.
In other words, you created a non-primary root domain and performed any of the previous operations.
To ensure that the system remains in a state in which none of the previous actions have taken place, see Using the ldm init-system Command to Restore Domains on Which Physical I/O Changes Have Been Made.
Bug ID 15778392: The control domain requires the lowest core in the system. So, if core ID 0 is the lowest core, it cannot be shared with any other domain if you want to apply the whole-core constraint to the control domain.
For example, if the lowest core in the system is core ID 0, the control domain should look similar to the following output:
# ldm ls -o cpu primary
NAME
primary

VCPU
    VID    PID    CID    UTIL STRAND
    0      0      0      0.4%   100%
    1      1      0      0.2%   100%
    2      2      0      0.1%   100%
    3      3      0      0.2%   100%
    4      4      0      0.3%   100%
    5      5      0      0.2%   100%
    6      6      0      0.1%   100%
    7      7      0      0.1%   100%
Bug ID 15775637: An I/O domain has a limit on the number of interrupt resources that are available per root complex.
On SPARC T3 and SPARC T4 systems, the limit is approximately 63 MSI/X vectors. Each igb virtual function uses three interrupts. Each ixgbe virtual function uses two interrupts.
If you assign a large number of virtual functions to a domain, the domain runs out of system resources to support these devices. You might see messages similar to the following:
WARNING: ixgbevf32: interrupt pool too full.
WARNING: ddi_intr_alloc: cannot fit into interrupt pool
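Given the approximate limit of 63 MSI/X vectors per root complex and the per-virtual-function interrupt costs stated above, you can estimate a rough upper bound on how many virtual functions a domain can support. This is only an illustrative calculation, not an exact capacity guarantee:

```shell
# Rough capacity estimate; 63 is the approximate per-root-complex
# vector limit on SPARC T3/T4 stated above, not an exact figure.
LIMIT=63
echo "approx. max igb VFs:   $(( LIMIT / 3 ))"   # 3 interrupts per igb VF
echo "approx. max ixgbe VFs: $(( LIMIT / 2 ))"   # 2 interrupts per ixgbe VF
```

In practice, other devices in the domain also consume vectors from the same pool, so the usable counts are lower.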
Bug ID 15771384: A domain's guest console might freeze if repeated attempts are made to connect to the console before and during the time the console is bound. For example, this might occur if you use an automated script to grab the console as a domain is being migrated onto the machine.
Workaround: To unfreeze the console, run the following commands on the domain that hosts the domain's console concentrator (usually the control domain):
primary# svcadm disable vntsd
primary# svcadm enable vntsd
Bug ID 15761509: Use only the PCIe cards that support the Direct I/O (DIO) feature, which are listed in this support document.
Workaround: Use the ldm add-io command to add the card to the primary domain again.
Bug ID 15759601: If you issue an ldm stop command immediately after an ldm start command, the ldm stop command might fail with the following error:
LDom domain-name stop notification failed
Workaround: Reissue the ldm stop command.
Bug ID 15750727: A system might panic when you reboot a primary domain that has a very large number of virtual functions assigned to it.
Workaround: Perform one of the following workarounds:
Decrease the number of virtual functions that are assigned to the domain to reduce the number of failed virtual functions. This change might keep the chip responsive.
Create more Interrupt Resource Management (IRM) pools for the ixgbe virtual function because only one IRM pool is created by default for all the ixgbe virtual functions on the system.
Bug ID 15748348: When the primary domain shares the lowest physical core (usually 0) with another domain, attempts to set the whole-core constraint for the primary domain fail.
Workaround: Perform the following steps:
Determine the lowest bound core that is shared by the domains.
# ldm list -o cpu
Unbind all the CPU threads of the lowest core from all domains other than the primary domain.
As a result, CPU threads of the lowest core are not shared and are free for binding to the primary domain.
Set the whole-core constraint by doing one of the following:
Bind the CPU threads to the primary domain, and set the whole-core constraint by using the ldm set-vcpu -c command.
Use the ldm set-core command to bind the CPU threads and set the whole-core constraint in a single step.
Bug ID 15721872: You cannot use Oracle Solaris hot-plug operations to hot-remove a PCIe endpoint device after that device is removed from the primary domain by using the ldm rm-io command. For information about replacing or removing a PCIe endpoint device, see Making PCIe Hardware Changes in Oracle VM Server for SPARC 3.3 Administration Guide .
Bug ID 15701853: A No response message might appear in the Oracle VM Server for SPARC log when a loaded domain's DRM policy expires after the CPU count has been substantially reduced. The ldm list output shows that more CPU resources are allocated to the domain than is shown in the psrinfo output.
Workaround: Use the ldm set-vcpu command to reset the number of CPUs on the domain to the value that is shown in the psrinfo output.
Bug ID 15668368: A SPARC T3-1 system can be installed with dual-ported disks, which can be accessed by two different direct I/O devices. In this case, assigning these two direct I/O devices to different domains enables both domains to use the same disks, so the domains can affect each other depending on how those disks are actually used.
Workaround: Do not assign direct I/O devices that have access to the same set of disks to different I/O domains. To determine whether you have dual-ported disks on a SPARC T3-1 system, run the following command on the SP:
-> show /SYS/SASBP
If the output includes the following fru_description value, the corresponding system has dual-ported disks:
fru_description = BD,SAS2,16DSK,LOUISE
If dual-ported disks are present in the system, ensure that both of the following direct I/O devices are always assigned to the same domain:
pci@400/pci@1/pci@0/pci@4  /SYS/MB/SASHBA0
pci@400/pci@2/pci@0/pci@4  /SYS/MB/SASHBA1
Bug ID 15667770: When multiple NIU nxge instances are plumbed on a domain, the ldm rm-mem and ldm set-mem commands, which are used to remove memory from the domain, might never complete. To determine whether the problem has occurred during a memory removal operation, monitor the progress of the operation with the ldm list -o status command. You might have encountered this problem if the progress percentage remains constant for several minutes.
Workaround: Cancel the ldm rm-mem or ldm set-mem command, and check whether a sufficient amount of memory was removed. If not, a subsequent memory removal command to remove a smaller amount of memory might complete successfully.
If the problem has occurred on the primary domain, do the following:
Start a delayed reconfiguration operation on the primary domain.
# ldm start-reconf primary
Assign the desired amount of memory to the domain.
Reboot the primary domain.
If the problem occurred on another domain, stop the domain before adjusting the amount of memory that is assigned to the domain.
Bug ID 15664666: When a reset dependency is created, an ldm stop -a command might result in a domain with a reset dependency being restarted instead of only stopped.
Workaround: First, issue the ldm stop command to the master domain. Then, issue the ldm stop command to the slave domain. If the initial stop of the slave domain results in a failure, issue the ldm stop -f command to the slave domain.
Bug ID 15631119: If you modify the maximum transmission unit (MTU) of a virtual network device on the control domain, a delayed reconfiguration operation is triggered. If you subsequently cancel the delayed reconfiguration, the MTU value for the device is not restored to the original value.
Recovery: Rerun the ldm set-vnet command to set the MTU to the original value. Resetting the MTU value puts the control domain into delayed reconfiguration mode, which you need to cancel. The resulting MTU value is now the original, correct MTU value.
# ldm set-vnet mtu=orig-value vnet1 primary
# ldm cancel-op reconf primary
Bug ID 15600969: If all the hardware cryptographic units are dynamically removed from a running domain, the cryptographic framework fails to seamlessly switch to the software cryptographic providers, and kills all the ssh connections.
Recovery: Re-establish the ssh connections after all the cryptographic units are removed from the domain.
Workaround: Set UseOpenSSLEngine=no in the /etc/ssh/sshd_config file on the server side, and run the svcadm restart ssh command.
ssh connections will no longer use the hardware cryptographic units (and thus will not benefit from the associated performance improvements), but they will not be disconnected when the cryptographic units are removed.
Bug ID 15597025: When you run the ldm ls-io -l command on a system that has a PCI Express Dual 10-Gigabit Ethernet Fiber card (X1027A-Z) installed, the output might show the following:
primary# ldm ls-io -l
...
pci@500/pci@0/pci@c    PCIE5    OCC    primary
 network@0
 network@0,1
 ethernet
 ethernet
The output shows four subdevices even though this Ethernet card has only two ports. This anomaly occurs because this card has four PCI functions. Two of these functions are disabled internally and appear as ethernet in the ldm ls-io -l output.
Workaround: You can ignore the ethernet entries in the ldm ls-io -l output.
Bug ID 15572184: An ldm command might be slow to respond when several domains are booting. If you issue an ldm command at this stage, the command might appear to hang. Note that the ldm command will return after performing the expected task. After the command returns, the system should respond normally to ldm commands.
Workaround: Avoid booting many domains simultaneously. However, if you must boot several domains at once, refrain from issuing further ldm commands until the system returns to normal. For instance, wait for about two minutes on Sun SPARC Enterprise T5140 and T5240 servers and for about four minutes on the Sun SPARC Enterprise T5440 server or Sun Netra T5440 server.
Bug ID 15560811: In Oracle Solaris 11, zones that are configured with an automatic network interface (anet) might fail to start in a domain that has Logical Domains virtual network devices only.
Workaround 1: Assign one or more physical network devices to the guest domain. Use PCIe bus assignment, the Direct I/O (DIO), or the SR-IOV feature to assign a physical NIC to the domain.
Workaround 2: If the zones configuration requirement is to have interzone communication only within the domain, create an etherstub device. Use the etherstub device as the “lower link” in the zones configuration so that virtual NICs are created on the etherstub device.
Workaround 3: Use exclusive link assignment to assign a Logical Domains virtual network device to a zone. Assign virtual network devices, as needed, to the domain. You might also choose to disable inter-vnet links to be able to create a large number of virtual network devices.
Bug ID 15518409: If you do not have a network configured on your machine and have a Network Information Services (NIS) client running, the Logical Domains Manager will not start on your system.
Workaround: Disable the NIS client on your non-networked machine:
# svcadm disable nis/client
Bug ID 15511551: Sometimes, executing the uadmin 1 0 command from the command line of a Logical Domains system does not leave the system at the ok prompt after the subsequent reset. This incorrect behavior is seen only when the Logical Domains variable auto-reboot? is set to true. If auto-reboot? is set to false, the expected behavior occurs.
Workaround: Use this command instead:
uadmin 2 0
Or, always run with auto-reboot? set to false.
Bug ID 15453968: Simultaneous net installation of multiple guest domains fails on systems that have a common console group.
Workaround: Only net-install on guest domains that each have their own console group. This failure is seen only on domains with a common console group shared among multiple net-installing domains.
Bug ID 15387338: This issue is summarized in Logical Domains Variable Persistence in Oracle VM Server for SPARC 3.3 Administration Guide and affects only the control domain.
Bug ID 15370442: The Logical Domains environment does not support setting or deleting wide-area network (WAN) boot keys from within the Oracle Solaris OS by using the ickey(1M) command. All ickey operations fail with the following error:
ickey: setkey: ioctl: I/O error
In addition, WAN boot keys that are set using OpenBoot firmware in logical domains other than the control domain are not remembered across reboots of the domain. In these domains, the keys set from the OpenBoot firmware are valid only for a single use.
Bug ID 15368170: In some cases, the behavior of the ldm stop-domain command is confusing.
# ldm stop-domain -f domain-name
If the domain is at the kernel module debugger, kmdb(1), prompt, then the ldm stop-domain command fails with the following error message:
LDom <domain-name> stop notification failed