Oracle® VM Server for SPARC 3.3 Release Notes

Updated: October 2015

General Issues

This section describes general known issues about this release of the Oracle VM Server for SPARC software that are broader than a specific bug number. Workarounds are provided where available.

After Canceling a Migration, ldm Commands That Are Run on the Target System Are Temporarily Unresponsive

If you cancel a live migration, the memory contents of the domain instance that is created on the target machine must be “scrubbed” by the hypervisor. This scrubbing process is performed for security reasons and must be complete before the memory can be returned to the pool of free memory. While this scrubbing is in progress, ldm commands become unresponsive. As a result, the Logical Domains Manager appears to be hung.

Recovery: You must wait for this scrubbing request to finish before you attempt to run other ldm commands. This process might take a long time. For example, a guest domain that has 500 Gbytes of memory might complete this process in up to 7 minutes on a SPARC T4 server or up to 25 minutes on a SPARC T3 server.
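
For example, an ldm command that is run on the target machine immediately after the migration is canceled might block until the scrubbing completes. In the following sketch, target is the prompt of the target machine's control domain:

target# ldm list

The command produces no output and then returns normally after the scrubbing finishes.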

SPARC M5-32 and SPARC M6-32: Oracle Solaris OS Might Panic When CPUs Are Added by Using the ldm add-vcpu Command

When using the ldm add-vcpu command to assign CPUs to a domain, the Oracle Solaris OS might panic with the following message:

panic[cpu16]/thread=c4012102c860: mpo_cpu_add: Cannot read MD

This panic occurs if the following conditions exist:

  • Additional DCUs have been assigned to a host

  • The host is started by using a previously saved SP configuration that does not contain all the hardware that is assigned to the host

The target domain of the ldm add-vcpu operation is the domain that panics. The domain recovers with the additional CPUs when it reboots.
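
For example, a command such as the following, which assumes a hypothetical guest domain named ldom1, triggers the panic in that domain when the previously described conditions exist:

primary# ldm add-vcpu 8 ldom1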

Workaround: Do not use configurations that are generated with fewer hardware resources than are assigned to the host.

To avoid the problem, do not add CPUs as described in the problem description, or perform the following steps:

  1. Generate a new SP configuration after the DCUs have been added.

    For example, the following command creates a configuration called new-config-more-dcus:

    primary# ldm add-config new-config-more-dcus
  2. Shut down the domain.

  3. Stop the host.

    -> stop /HOST
  4. Start the host.

    -> start /HOST

Destroying All Virtual Functions and Returning the Slots to the Root Domain Does Not Restore the Root Complex Resources

The resources on the root complex are not restored after you destroy all the virtual functions and return the slots to the root domain.

Recovery: Return all the virtual I/O resources that are associated with the root complex to its root domain.

First, initiate a delayed reconfiguration on the control domain.

primary# ldm start-reconf primary

Return all child PCIe slots to the root domain that owns the pci_0 bus. Then, remove all of the child virtual functions on the pci_0 bus and destroy them.
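
The following sketch shows one way to perform these steps. The slot, virtual function, and domain names are hypothetical; use the ldm list-io output to identify the actual devices on your system:

primary# ldm list-io                                       # identify slots and virtual functions on pci_0
primary# ldm remove-io /SYS/MB/PCIE1 ldom1                 # hypothetical slot and domain names
primary# ldm remove-io /SYS/MB/NET0/IOVNET.PF0.VF0 ldom1   # hypothetical virtual function name
primary# ldm destroy-vf /SYS/MB/NET0/IOVNET.PF0.VF0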

Finally, set iov=off for the pci_0 bus and reboot the root domain.

primary# ldm set-io iov=off pci_0
primary# shutdown -y -g0 -i6

Workaround: Set the iov option to off for the specific PCIe bus.

primary# ldm start-reconf primary
primary# ldm set-io iov=off pci_0

init-system Does Not Restore Named Core Constraints for Guest Domains From Saved XML Files

The ldm init-system command fails to restore the named CPU core constraints for guest domains from a saved XML file.

Workaround: Perform the following steps:

  1. Create an XML file for the primary domain.

    # ldm ls-constraints -x primary > primary.xml
  2. Create an XML file for the guest domain or domains.

    # ldm ls-constraints -x domain-name[,domain-name][,...] > guest.xml
  3. Power cycle the system and boot a factory default configuration.

  4. Apply the XML configuration to the primary domain.

    # ldm init-system -r -i primary.xml
  5. Apply the XML configuration to the guest domain or domains.

    # ldm init-system -f -i guest.xml

Removing a Large Number of CPUs From a Domain Might Fail

You might see the following error message when you attempt to remove a large number of CPUs from a guest domain:

Request to remove cpu(s) sent, but no valid response received
VCPU(s) will remain allocated to the domain, but might
not be available to the guest OS
Resource modification failed

Workaround: Stop the guest domain before you remove more than 100 CPUs from the domain.
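
For example, the following sketch stops a hypothetical guest domain named ldom1 before removing 128 virtual CPUs:

primary# ldm stop-domain ldom1
primary# ldm remove-vcpu 128 ldom1
primary# ldm start-domain ldom1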

Newly Added NIU/XAUI Adapters Are Not Visible to the Host OS If Logical Domains Is Configured

When Logical Domains is configured on a system and you add another XAUI network card, the card is not visible after the machine has undergone a power cycle.

Recovery: To make the newly added XAUI visible in the control domain, perform the following steps:

  1. Set and clear a dummy variable in the control domain.

    The following commands use a dummy variable called fix-xaui:

    # ldm set-var fix-xaui=yes primary
    # ldm rm-var fix-xaui primary
  2. Save the modified configuration to the SP, replacing the current configuration.

    The following commands use a configuration name of config1:

    # ldm rm-spconfig config1
    # ldm add-spconfig config1
  3. Perform a reconfiguration reboot of the control domain.

    # reboot -- -r

    At this time, you can configure the newly available network or networks for use by Logical Domains.

LSI SAS 2008 Cannot Be Added by Dynamic Bus or PCI-Box Hotplug Operations

If you attempt to remove a PCIe bus that hosts LSI SAS HBA devices, you cannot later add the devices by using dynamic bus or PCI-box hotplug operations.
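
For example, the following sketch assumes a hypothetical bus named pci_1 that hosts an LSI SAS 2008 HBA. The dynamic bus removal succeeds, but a subsequent add operation does not restore the HBA devices:

primary# ldm remove-io pci_1 primary
primary# ldm add-io pci_1 primary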

In Certain Conditions, a Guest Domain's Solaris Volume Manager Configuration or Metadevices Can Be Lost

If a service domain is running a version of Oracle Solaris 10 OS prior to Oracle Solaris 10 1/13 OS and is exporting a physical disk slice as a virtual disk to a guest domain, then this virtual disk will appear in the guest domain with an inappropriate device ID. If that service domain is then upgraded to Oracle Solaris 10 1/13 OS, the physical disk slice exported as a virtual disk will appear in the guest domain with no device ID.

This removal of the device ID of the virtual disk can cause problems to applications attempting to reference the device ID of virtual disks. In particular, Solaris Volume Manager might be unable to find its configuration or to access its metadevices.

Workaround: After upgrading a service domain to Oracle Solaris 10 1/13 OS, if a guest domain is unable to find its Solaris Volume Manager configuration or its metadevices, perform the following procedure.

How to Recover a Guest Domain's Solaris Volume Manager Configuration or Metadevices

  1. Boot the guest domain.
  2. Disable the devid feature of Solaris Volume Manager by adding the following lines to the /kernel/drv/md.conf file:
    md_devid_destroy=1;
    md_keep_repl_state=1;
  3. Reboot the guest domain.

    After the domain has booted, the Solaris Volume Manager configuration and metadevices should be available.

  4. Check the Solaris Volume Manager configuration and ensure that it is correct.
  5. Re-enable the Solaris Volume Manager devid feature by removing from the /kernel/drv/md.conf file the two lines that you added in Step 2.
  6. Reboot the guest domain.

    During the reboot, you will see messages similar to this:

    NOTICE: mddb: unable to get devid for 'vdc', 0x10

    These messages are normal and do not report any problems.
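
To verify the configuration in Step 4, standard Solaris Volume Manager commands can be used, for example:

# metastat -p
# metadb

The metastat -p command prints the metadevice configuration in md.tab format, and the metadb command lists the state database replicas.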

Oracle Solaris Boot Disk Compatibility

Historically, the Oracle Solaris OS has been installed on a boot disk configured with an SMI VTOC disk label. Starting with the Oracle Solaris 11.1 OS, the OS is installed on a boot disk that is configured with an extensible firmware interface (EFI) GUID partition table (GPT) disk label by default. If the firmware does not support EFI, the disk is configured with an SMI VTOC disk label instead. This situation applies only to SPARC T4 servers that run at least system firmware version 8.4.0, to SPARC T5, SPARC M5, SPARC M6 servers that run at least system firmware version 9.1.0, and to Fujitsu M10 servers that run at least XCP version 2230.

The following servers cannot boot from a disk that has an EFI GPT disk label:

  • UltraSPARC T2, UltraSPARC T2 Plus, and SPARC T3 servers no matter which system firmware version is used

  • SPARC T4 servers that run system firmware versions prior to 8.4.0

  • SPARC T5, SPARC M5, and SPARC M6 servers that run system firmware versions prior to 9.1.0

  • Fujitsu M10 servers that run XCP versions prior to 2230

As a result, an Oracle Solaris 11.1 boot disk that is created on an up-to-date SPARC T4, SPARC T5, SPARC M5, SPARC M6, or Fujitsu M10 server cannot be used on older servers or on servers that run older firmware.

This limitation restricts your ability to use cold or live migration to move a domain from a recent server to an older server. It also prevents you from using an EFI GPT boot disk image on an older server.

An Oracle Solaris 11.1 boot disk remains compatible with all of these servers and firmware versions when the Oracle Solaris 11.1 OS is installed on a disk that is configured with an SMI VTOC disk label.
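
One way to check which label type a disk currently has is to print its table of contents; the device name in the following sketch is hypothetical:

# prtvtoc /dev/rdsk/c1d0s2

A disk with an SMI VTOC label reports cylinder-based geometry in the output header, whereas a disk with an EFI GPT label reports only sector-based information.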

To maintain backward compatibility with systems that run older firmware, use one of the following procedures. Otherwise, the boot disk uses the EFI GPT disk label by default. These procedures show how to ensure that the Oracle Solaris 11.1 OS is installed on a boot disk with an SMI VTOC disk label on a SPARC T4 server with at least system firmware version 8.4.0, on a SPARC T5, SPARC M5, or SPARC M6 server with at least system firmware version 9.1.0, and on a Fujitsu M10 server with at least XCP version 2230.

  • Solution 1: Remove the gpt property so that the firmware does not report that it supports EFI.

    1. From the OpenBoot PROM prompt, disable automatic booting and reset the system to be installed.

      ok setenv auto-boot? false
      ok reset-all

      After the system resets, it returns to the ok prompt.

    2. Change to the /packages/disk-label directory and remove the gpt property.

      ok cd /packages/disk-label
      ok " gpt" delete-property
    3. Begin the Oracle Solaris 11.1 OS installation.

      For example, perform a network installation:

      ok boot net - install
  • Solution 2: Use the format -e command to write an SMI VTOC label on the disk to be installed with the Oracle Solaris 11.1 OS.

    1. Write an SMI VTOC label on the disk.

      For example, select the label option and specify the SMI label:

      # format -e c1d0
      format> label
      [0] SMI Label
      [1] EFI Label
      Specify Label type[1]: 0
    2. Configure the disk with a slice 0 and slice 2 that cover the entire disk.

      The disk should have no other partitions. For example:

      format> partition
       
      partition> print
      Current partition table (unnamed):
      Total disk cylinders available: 14087 + 2 (reserved cylinders)
      
      Part      Tag    Flag     Cylinders         Size            Blocks
        0       root    wm       0 - 14086      136.71GB    (14087/0/0) 286698624
        1 unassigned    wu       0                0         (0/0/0)             0
        2     backup    wu       0 - 14086      136.71GB    (14087/0/0) 286698624
        3 unassigned    wm       0                0         (0/0/0)             0
        4 unassigned    wm       0                0         (0/0/0)             0
        5 unassigned    wm       0                0         (0/0/0)             0
        6 unassigned    wm       0                0         (0/0/0)             0
        7 unassigned    wm       0                0         (0/0/0)             0
    3. Re-write the SMI VTOC disk label.

      partition> label
      [0] SMI Label
      [1] EFI Label
      Specify Label type[0]: 0
      Ready to label disk, continue? y
    4. Configure your Oracle Solaris Automatic Installer (AI) to install the Oracle Solaris OS on slice 0 of the boot disk.

      Change the <disk> excerpt in the AI manifest as follows:

      <target>
         <disk whole_disk="true">
              <disk_keyword key="boot_disk"/>
              <slice name="0" in_zpool="rpool"/>
         </disk>
      [...]
      </target>
    5. Perform the installation of the Oracle Solaris 11.1 OS.

Sometimes a Block of Dynamically Added Memory Can Be Dynamically Removed Only as a Whole

Due to the way in which the Oracle Solaris OS handles the metadata for managing dynamically added memory, you might later be able to remove only the entire block of memory that was previously dynamically added rather than a proper subset of that memory.

This situation could occur if a domain with a small memory size is dynamically grown to a much larger size, as shown in the following example.

primary# ldm list ldom1
NAME  STATE FLAGS   CONS VCPU MEMORY UTIL UPTIME
ldom1 active -n--   5000 2    2G     0.4% 23h

primary# ldm add-mem 16G ldom1

primary# ldm rm-mem 8G ldom1
Memory removal failed because all of the memory is in use.

primary# ldm rm-mem 16G ldom1

primary# ldm list ldom1
NAME  STATE FLAGS   CONS VCPU MEMORY UTIL UPTIME
ldom1 active -n--   5000 2    2G     0.4% 23h

Workaround: Use the ldm add-mem command to sequentially add memory in smaller chunks rather than in chunks larger than you might want to remove in the future.
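
For example, adding 16 Gbytes to the domain from the previous example as four 4-Gbyte operations preserves the ability to later remove the memory in 4-Gbyte chunks:

primary# ldm add-mem 4G ldom1
primary# ldm add-mem 4G ldom1
primary# ldm add-mem 4G ldom1
primary# ldm add-mem 4G ldom1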

Recovery: Perform one of the following actions:

  • Stop the domain, remove the memory, and then restart the domain.

  • Reboot the domain, which causes the Oracle Solaris OS to reallocate its memory management metadata such that the previously added memory can now be removed dynamically in smaller chunks.
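
The following sketch shows the first recovery action, using the domain from the previous example:

primary# ldm stop-domain ldom1
primary# ldm rm-mem 8G ldom1
primary# ldm start-domain ldom1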