Bugs Affecting the Oracle VM Server for SPARC Software

The following sections summarize the bugs that you might encounter when using each version of the Oracle VM Server for SPARC 3.1 software. Each section lists the bugs found in that release. The bugs might occur in any or all of the Oracle VM Server for SPARC 3.1 releases. The most recent bugs are described first. Workarounds and recovery procedures are specified, if available.

Note - Some of the bugs described in this section have been fixed since the Oracle VM Server for SPARC 3.1 release. These bug writeups remain for those who are still running the Oracle VM Server for SPARC 3.1 release.

Bugs Affecting the Oracle VM Server for SPARC 3.1.1.2 Software

System Crashes When Applying the Whole-Core Constraint to a Partial Core `primary` Domain

Bug ID 19456310: When you use dynamic reconfiguration to apply the whole-core constraint to a primary domain, the removal of partial cores results in an OS panic or a system power cycle.

A partial core is removed if the core is shared with another domain or if one of the free strands in the core is faulty.

Workaround: Use a delayed reconfiguration to apply the whole-core constraint to a primary domain that has partial cores.

Verify that the primary domain does not have the whole-core constraint.
```
primary# ldm list -o resmgmt primary
```
Verify that the primary domain has partial cores.
```
primary# ldm list -o core primary
```
Initiate a delayed reconfiguration on the primary domain.
```
primary# ldm start-reconf primary
```
Apply the whole-core constraint.

For example, the following command assigns two whole cores to the primary domain:
```
primary# ldm set-core 2 primary
```
Reboot the primary domain.
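The delayed-reconfiguration part of this workaround can be sketched as a small shell wrapper. This is only an illustration: `apply_whole_core` and the `LDM` variable are hypothetical names introduced here, not part of the Logical Domains Manager. The wrapper uses only the `ldm start-reconf` and `ldm set-core` commands shown above.

```shell
# LDM defaults to the real ldm command; set LDM=echo to preview the commands.
LDM=${LDM:-ldm}

# apply_whole_core is a hypothetical wrapper name, not an ldm subcommand.
apply_whole_core() {
    cores=$1
    # Start the delayed reconfiguration, then assign whole cores.
    # The && skips set-core if start-reconf fails.
    $LDM start-reconf primary &&
    $LDM set-core "$cores" primary
}
```

Setting `LDM=echo` before calling `apply_whole_core 2` prints the two commands instead of running them. The reboot of the primary domain is deliberately left as a manual step.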

`format` Command Hangs After Having Migrated a Guest Domain or a Guest Domain Console Does Not Take Input

You might encounter the following bugs if your system runs system firmware version 8.5.1.b, 9.2.1.b, or 9.2.1.c. For more information, see Oracle Virtual Machine (OVM) Server for SPARC Guest Domains may not Accept Console Input on SPARC T4/T5/M5/M6 Series Servers Running Sun System Firmware Releases 8.5.1.b and 9.2.1.B/C (Doc ID 1946535.1).

Bug ID 19430884: A guest domain that is configured with 108 virtual disks from two service domains is migrated. After the migration completes successfully, the format command hangs even though the disks are available and can be accessed.

Workaround: Reboot the system.

Bug ID 19388985: Attempting to connect to a guest domain console succeeds but the console does not take input. This situation occurs intermittently after stopping and starting guest domains, rebooting the primary domain, and binding and starting the guest domains.

Workaround: Avoid unbinding and then rebinding the guest domain.

Recovery: Save the configuration of the guest domains and then perform a power cycle.

Kernel Zones Block Live Migration of Guest Domains

Bug ID 18289196: On a SPARC system, a running kernel zone within an Oracle VM Server for SPARC domain blocks live migration of the guest domain. The following error message is displayed:

Live migration failed because Kernel Zones are active.
Stop Kernel Zones and retry.

Workaround: Choose one of the following workarounds:

Stop running the kernel zone.
```
# zoneadm -z zonename shutdown
```
Suspend the kernel zone.
```
# zoneadm -z zonename suspend
```

Bugs Affecting the Oracle VM Server for SPARC 3.1.1.1 Software

Live Migration Might Fail With `Unable to restore ldc resource state on target Domain Migration of LDom failed`

Bug ID 19454837: A live migration of a domain on a system that runs particular versions of SPARC system firmware might fail with the following error message:

system1 # ldm migrate ldg1 system2
Target Password:
Unable to restore ldc resource state on target
Domain Migration of LDom ldg1 failed

The error message occurs after transferring all the domain state to the target machine but before attempting to suspend the domain to be migrated on the source machine. The domain to be migrated continues to run on the source system.

The following are the affected system firmware versions:

SPARC T5, SPARC M5, SPARC M6 – System firmware version 9.2.1
SPARC T4 – System firmware version 8.5.1

Mitigation: Unless you want to take advantage of the new increased LDC limits (and not use the live migration feature), avoid updating your system to system firmware versions 8.5.1 or 9.2.1 until at least versions 8.6 and 9.3 are released.

Recovery: Perform a power cycle of the source machine to permit the live migration of the domain.

Workaround: None.
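As a rough aid, a firmware version string can be checked against the affected versions listed above. The helper name `fw_is_affected` is hypothetical. Treating suffixed versions such as 8.5.1.b as matches is an assumption made for this sketch. You must supply the version string yourself.

```shell
# fw_is_affected is a hypothetical helper: succeeds when the given system
# firmware version is 8.5.1 or 9.2.1, with or without a trailing suffix.
fw_is_affected() {
    case "$1" in
        8.5.1|8.5.1.*|9.2.1|9.2.1.*) return 0 ;;
        *) return 1 ;;
    esac
}

for fw in 8.5.1.b 9.2.1 8.4.0; do
    if fw_is_affected "$fw"; then
        echo "$fw: affected"
    else
        echo "$fw: not listed as affected"
    fi
done
```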

Recovery Mode Fails With `ldmd` in Maintenance Mode When Virtual Switch net-dev Is Missing

Bug ID 18770805: If a virtual switch net-dev is faulty and cannot be validated, the recovery operation fails and the ldmd daemon dumps core.

Recovery: Disable recovery mode and recover the configuration manually.

Migration to a SPARC M5 or SPARC T5 System Might Panic With `suspend: get stick freq failed`

Bug ID 16934400: When you migrate a guest domain to a SPARC M5 or a SPARC T5 system, the OS on the guest domain might panic with the suspend: get stick freq failed message.

Workaround: Add the following line to the /etc/system file in the guest domain to be migrated:

set migmd_buf_addl_size = 0x100000

Reboot the guest domain to make the change take effect.

Logical Domains Manager Does Not Prohibit the Creation of Circular Dependencies

Bug ID 15751041: The Logical Domains Manager permits you to create a circular configuration where two domains provide services to each other. Such a configuration is not recommended because it creates a single point of failure, where an outage in one domain takes down the other domain. In addition, a circular dependency prevents the affected domains from being unbound.

Workaround: If a circular dependency configuration prevents you from unbinding a domain, remove the devices that cause the circular dependency and then retry the unbind operation.

Bugs Affecting the Oracle VM Server for SPARC 3.1.1 Software

Very Large LDC Counts Might Result in Oracle Solaris Issues in Guest Domains

Bug ID 19480835: The following Sun System Firmware versions increase the maximum number of Logical Domain Channels (LDCs) per guest domain:

SPARC T5, SPARC M5, SPARC M6 – 9.2.1
SPARC T4 – 8.5.1

This increase in LDCs per guest domain requires that you run at least Logical Domains Manager 3.1.1.1.

To prevent potential issues when using Logical Domains Manager versions up to and including 3.1.1, do not increase the number of LDCs per guest domain beyond the 768 that is supported by previous system firmware versions. For example, do not add large numbers of virtual disks and virtual network interfaces until after you install at least Logical Domains Manager 3.1.1.1.

You might see the following symptoms when exceeding the limit of 768 LDCs per domain with Oracle VM Server for SPARC versions up to and including 3.1.1:

Dictionary overflow in OBP:

Dictionary overflow - here f21ffe58 limit f2200000

Dictionary overflow - here f21ffe70 limit f2200000
WARNING: /virtual-devices@100/channel-devices@200/disk@5b2: Problem
  creating devalias for virtual device node

Dictionary overflow - here f21ffe70 limit f2200000

Dictionary overflow - here f21ffe70 limit f2200000

Dictionary overflow - here f21ffe70 limit f2200000

Stack Underflow
ok

Panic in vmem_xalloc:

panic[cpu6]/thread=2a10020fc80: vmem_xalloc(1a04610, 29360128, 29360128, 0,
0, 0, 0, 1): parameters inconsistent or invalid

000002a10020f000 genunix:vmem_xalloc+850 (1a04610, 1c00000, 0, 0, 1bfffff, 0)
  %l0-3: 0000000000001fff 0000000000002000 0000000000420000 0000000000000010
  %l4-7: 0000000001c00000 0000000000000008 0000000001c00000 0000000000000000
000002a10020f180 unix:contig_vmem_xalloc_aligned_wrapper+24 (1a04610,
1c00000, 1, 0, 1000000, 1)
  %l0-3: 000002a10020f9a4 0000000000000008 0000000001a4bd90 0000000000000018
  %l4-7: 0000000000000002 ffffffffffffffff 000000000136efe8 00000000013722c0
000002a10020f240 genunix:vmem_xalloc+5c8 (300150c2d98, 1c00000, 0, 0, 80000,
0)
  %l0-3: 00000300150c2ff0 ffffffffffffffff 00000300150c39e0 ffffffffff000000
  %l4-7: 0000000000000000 ffffffffffffffff 0000000001000000 0000000000000004
000002a10020f3c0 unix:contig_mem_span_alloc+24 (300150c2d98, 1000000, 1, 1,
cd4000, 3)
  %l0-3: 00000000000f4000 0000000000000000 0000000000000000 0000000001921897
  %l4-7: 0000000000000006 00000000fe53dce8 00000000fee3a844 000000007ffffa4c
000002a10020f490 genunix:vmem_xalloc+5c8 (300150c4000, cd4000, 0, 0, 80000,
0)
  %l0-3: 00000300150c4258 ffffffffffffffff 00000300150c4c48 ffffffffffffe000
  %l4-7: 0000000000000000 ffffffffffffffff 0000000000002000 0000000000000003
000002a10020f610 unix:contig_mem_alloc_align+28 (cd4000, 2000, 600957feaf8,
1, 600957feaf8, 18e3000)
  %l0-3: 0000000000000001 0000000000003000 00000300051c01d8 0000000000000000
  %l4-7: 0000000000002000 0000000001a29e20 00000300051c01b0 00000300051c0380
000002a10020f6d0 unix:mach_descrip_buf_alloc+8 (cd4000, 2000, 4, 1,
2a10020f838, 10448d0)
  %l0-3: 0000000000000000 0000000000003000 00000300002141d8 0000000000000000
  %l4-7: 0000000000000001 0000000000000100 00000300002141b0 0000030000214380
000002a10020f780 unix:mach_descrip_update+84 (1864c00, 1c00, cd4000, 18e31d8,
0, 0)
  %l0-3: 0000000001864c58 000002a10020f830 0000000000002000 ffffffffffffe000
  %l4-7: 000002a10020f838 0000000000cd27b0 0000000001864c30 00000600957feaf8
000002a10020f840 platsvc:ps_md_data_handler+30 (1a4bcc0, 3003a822be0, 8, 18,
10, 1)
  %l0-3: 0000000000001d03 0000000000420000 0000000000420000 0000000000000010
  %l4-7: 000003003a822bd8 0000000000000008 0000000000000008 000003000d9bb940
000002a10020f900 ds:ds_dispatch_event+30 (6009fef4df8, 1372000, 48, 9, 9,
3003a822bd0)
  %l0-3: 000002a10020f9a4 0000000000000008 0000000001a4bd90 0000000000000018
  %l4-7: 0000000000000002 ffffffffffffffff 000000000136efe8 00000000013722c0
000002a10020f9b0 genunix:taskq_thread+3cc (600957fd390, 600957fd328,
260fe5123efd, 600957fd35a, 260fe5124083, 600957fd35c)
  %l0-3: 00000600957feaf8 00000600957fd358 0000000000000001 0000000000080000
  %l4-7: 00000600957fd348 0000000000010000 00000000fffeffff 00000600957fd350

Fibre Channel Physical Function Is Faulted by FMA and Disabled

Bug IDs 18168525 and 18156291: You must connect the Fibre Channel PCIe card to a Fibre Channel switch that supports NPIV and that is compatible with the PCIe card. If you do not use this configuration, running the format command or creating or destroying a virtual function might cause the physical function to be faulted by FMA and disabled. If this fault occurs, the message is similar to the following:

SUNW-MSG-ID: PCIEX-8000-0A, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: event-time
PLATFORM: platform-type
SOURCE: eft, REV: 1.16
EVENT-ID: event-ID
DESC: A problem was detected for a PCIEX device.
AUTO_RESPONSE: One or more device instances may be disabled
IMPACT: Loss of services provided by the device instances associated with
this fault
REC-ACTION: Use 'fmadm faulty' to provide a more detailed view of this event.
Please refer to the associated reference document at
http://support.oracle.com/msg/PCIEX-8000-0A for the latest service procedures
and policies regarding this diagnosis.

Workaround: If the card has been faulted by FMA, first check its connections and ensure that the card is not directly connected to storage. Then, perform the step that matches your configuration:

Card is directly connected to storage – Correctly configure the Fibre Channel PCIe card by connecting it to a Fibre Channel switch that supports NPIV and is compatible with the PCIe card. Then, run the fmadm repair command to override the FMA diagnosis.
Card is not directly connected to storage – Replace the card.

Virtual Network LDC Handshake Issues Seen When There Are a Large Number of Virtual Network Devices Present

Bug ID 18166010: You might experience virtual network LDC handshake issues if your deployment has a large number of virtual network devices.

Workaround: Perform the following steps:

Increase the number of handshake retries on all domains that have a virtual network device by adding the following entry to the /etc/system file:
```
set vnet:vgen_ldc_max_resets = 25
```
Note that you must reboot any domain on which you updated the /etc/system file for the changes to take effect. For information about /etc/system tunables, see the system(4) man page.
Disable inter-vnet links when a large number of virtual network devices are required in a virtual switch.

If more than eight virtual network devices use a given virtual switch, set the inter-vnet-link property to off. Disabling the inter-vnet-link property avoids the use of N² channels for inter-vnet communications. This change might negatively affect the performance of inter-vnet communications. So, if the guest-to-guest performance is critical for your deployment, create a separate system-private virtual switch (without specifying a net-dev device) that uses only the virtual network devices that require inter-vnet communications.

If your deployment does not require high-performance guest-to-guest communications, set the inter-vnet-link property to off even if eight or fewer virtual network devices use a given virtual switch.
```
primary# ldm set-vsw inter-vnet-link=off vsw0
```
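To get a feel for why the inter-vnet-link setting matters as device counts grow, the following sketch takes the N² figure quoted above at face value. It is an order-of-magnitude illustration only, not an exact LDC count. `inter_vnet_channels` is a hypothetical helper name.

```shell
# inter_vnet_channels is a hypothetical helper: prints N*N for N virtual
# network devices on one virtual switch (the N-squared figure, taken as-is).
inter_vnet_channels() {
    echo $(( $1 * $1 ))
}

for n in 4 8 16 32; do
    echo "$n devices: about $(inter_vnet_channels "$n") channels"
done
```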

If this workaround does not solve your problem, as a last resort, make the following changes to the /etc/system file on all domains that have virtual network and virtual switch devices.

Note that updating the /etc/system file in this way might negatively affect guest-to-guest communication performance.

Add the following entry to the /etc/system file of a domain that has a virtual network device:
```
set vnet:vnet_num_descriptors = 512
```
Add the following entry to the /etc/system file of a domain that has a virtual switch device:
```
set vsw:vsw_num_descriptors = 512
```
Reboot the system for these settings to take effect.

Sun Storage 16 Gb Fibre Channel Universal HBA Firmware Does Not Support Bandwidth Controls

Bug ID 18083904: The firmware for Sun Storage 16 Gb Fibre Channel Universal HBA, Emulex cards does not support setting bandwidth controls. The HBA firmware ignores any value that you specify for the bw-percent property.

Workaround: None.

Adding Memory After Performing a Cross-CPU Migration Might Cause a Guest Domain Panic

Bug ID 18032944: Performing a cross-CPU live migration of a domain from a SPARC T5, SPARC M5, or SPARC M6 machine to a platform that runs a different CPU type succeeds. However, a subsequent memory dynamic reconfiguration operation to increase the memory size of the guest domain might cause a panic similar to the following:

panic[cpu0]/thread=2a1003c9c60: kphysm_add_memory_dynamic(1018000, 200000):
range has 2097152 pages, but memgr p_walk_pfnrange only reported 0
 000002a1003c9500 genunix:kphysm_add_memory_dynamic+254 (1018000, 200000,
12e8000, 3, 1218000, 0)

vpanic(12e8220, 1018000, 200000, 200000, 0, 2a1003c95c8)
kphysm_add_memory_dynamic+0x254(1018000, 200000, 12e8000, 3, 1218000, 0)
dr_mem_configure+0x94(1018000, 2a1003c97b4, fffffff, 2430000000, 1068ac00,
1068ac00)
dr_mem_list_wrk+0x15c(4c01b3382b8, 0, 20, 4c014ba27c8, 1, 1)
dr_mem_data_handler+0xa8(0, 4c01b3382b8, 20, 2a1003c9890, 7bac0644, 16)
ds_dispatch_event+0x2c(4c01ee33478, 7bf888b8, 48, 7bf88800, 9, 9)
taskq_thread+0x3a8(95af9e15e84, 4c010a5caf0, 95af9e15f74, 4c010a5cb22,
4c010a5cb24, 4c01e24d688)
thread_start+4(4c010a5caf0, 0, 0, 0, 0, 0)

This panic occurs when the target system is one of the following:

SPARC T-Series systems that have socket 0 disabled
SPARC M-Series systems that have socket 0 disabled
Physical domains on a SPARC M-Series system that do not contain DCU0

This situation does not affect migrations between systems that have the same CPU type or domains that have cpu-arch=native.

Workaround: After migrating a domain from a system with one of these configurations, you must reboot the guest domain before you attempt to add memory by means of dynamic reconfiguration.

Incorrect Device Path for Fibre Channel Virtual Functions in a Root Domain

Bug ID 18001028: In the root domain, the Oracle Solaris device path for a Fibre Channel virtual function is incorrect.

For example, the incorrect path name is pci@380/pci@1/pci@0/pci@6/fibre-channel@0,2 while it should be pci@380/pci@1/pci@0/pci@6/SUNW,emlxs@0,2.

The ldm list-io -l output shows the correct device path for the Fibre Channel virtual functions.

Workaround: None.

`ldmd` Dumps Core When Attempting to Bind a Domain in Either the Binding or Unbinding State

Bug ID 17796639: When running Oracle Enterprise Manager Ops Center 12c Release 1 Update 4 (12.1.4.0.0), if you attempt a bind, unbind, start, or stop operation on a domain that is in the binding or unbinding state, the ldmd service might dump core and the domain will drop to maintenance mode.

Recovery: If the ldmd service has dumped core already, perform a power cycle of the system to bring the ldmd service online again.

Workaround: Determine whether the domain is in a binding or unbinding state by running the ldm list command. If it is, wait until the process completes and the domain is in the bound or inactive state.

Bugs Affecting the Oracle VM Server for SPARC 3.1 Software

Issues Might Arise When FMA Detects Faulty Memory

Bug IDs 17663828 and 17576087: When FMA attempts to isolate an extremely small range of memory as a percentage of the total memory capacity of the system, the Logical Domains Manager might incorrectly mark a very large range of memory as being blacklisted.

This error can have a significant impact on usable memory capacity, which might lead to the following issues:

The reboot of an affected guest domain might prevent that domain from starting because too much memory has been incorrectly removed.
A very large range of memory might be unavailable for assignment to guest domains if a blacklist request is applied to unbound memory. Thus, if you attempt to use most of the system memory, you might be unable to create guest domains.
The Logical Domains Manager might crash if it restarts prior to the faulty memory being repaired because the blacklisted memory block might not have been properly marked internally.
Performing a power cycle of the system to a saved configuration might not restore the memory after the faulty memory has been replaced.

Workaround: If a significant amount of memory no longer appears in ldm list-devices -a memory output, contact Oracle Service to confirm and identify the DIMM that must be replaced.

After you replace the faulty memory, perform a power cycle of the system to the factory-default configuration. Then, perform a power cycle of the system to the configuration that you want to use.

`ldmd` Service Fails to Start Because of a Delay in Creating `virtual-channel@0:hvctl`

Bug ID 17627526: Sometimes during a system boot, a race condition occurs where the device the ldmd daemon uses to communicate with the hypervisor is not created by the time the svc:/ldoms/ldmd:default SMF service starts. This behavior causes the ldmd SMF service to drop to maintenance mode.

The following error message appears in the ldmd SMF log:

ldmd cannot communicate with the hypervisor as the required device
does not exist:
/devices/virtual-devices@100/channel-devices@200/virtual-channel@0:hvctl

This problem might occur if the control domain is running one of the following OS versions:

At least Oracle Solaris 11.1.12.3.0
At least Oracle Solaris 10 1/13 and patch ID 150840-01

Recovery: Verify that the /devices/virtual-devices@100/channel-devices@200/virtual-channel@0:hvctl device exists and then run the svcadm clear ldmd command.
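The recovery step can be sketched as a guarded command: clear the service only if the device node exists. The function name `clear_ldmd_if_ready` and the `SVCADM` override variable are hypothetical names introduced for this sketch. The device path and the `svcadm clear ldmd` command come from the text above.

```shell
HVCTL_DEV=/devices/virtual-devices@100/channel-devices@200/virtual-channel@0:hvctl
# SVCADM defaults to the real svcadm command; set SVCADM=echo to preview.
SVCADM=${SVCADM:-svcadm}

# clear_ldmd_if_ready is a hypothetical helper. It clears the ldmd service
# only when the hvctl device node (or the path given as $1) exists.
clear_ldmd_if_ready() {
    dev=${1:-$HVCTL_DEV}
    if [ -e "$dev" ]; then
        $SVCADM clear ldmd
    else
        echo "device not present yet: $dev" >&2
        return 1
    fi
}
```

Calling `clear_ldmd_if_ready` with no argument checks the default device path. A nonzero return means that the device is still missing and the service was not cleared.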

Poor Affinity on the Control Domain When You Assign Memory Before You Assign CPUs in a Delayed Reconfiguration

Bug ID 17606070: If you assign memory prior to assigning CPUs to the primary domain while in a delayed reconfiguration, the memory has affinity to the CPUs that are allocated at the time you issue the ldm set-memory command, even if you later run additional ldm set-vcpu or ldm set-core commands. For example, after the following commands, the 16 Gbytes of memory allocated to the primary domain might not have affinity to the eight cores that are subsequently allocated by the ldm set-core command:

primary# ldm start-reconf primary
primary# ldm set-mem 16G primary
primary# ldm set-core 8 primary
primary# reboot

Workaround: Ensure that you assign the cores to the primary domain before you assign the memory. For example, the following commands first assign eight cores to the primary domain and then assign 16 Gbytes of memory:

primary# ldm start-reconf primary
primary# ldm set-core 8 primary
primary# ldm set-mem 16G primary
primary# reboot

Cannot Install the Oracle Solaris 11.1 OS Using an EFI GPT Disk Label on Single-Slice Virtual Disk

Bug ID 17422973: The installation of the Oracle Solaris 11.1 OS on a single-slice disk might fail with the following error on a SPARC T4 server that runs at least system firmware version 8.4.0, a SPARC T5, SPARC M5, or SPARC M6 server that runs at least system firmware version 9.1.0, or a Fujitsu M10 system that runs at least XCP version 2230:

cannot label 'c1d0': try using fdisk(1M) and then provide a specific slice
Unable to build pool from specified devices: invalid vdev configuration

Workaround: Relabel the disk with an SMI label.

After Being Migrated, A Domain Can Panic on Boot After Being Started or Rebooted

Bug ID 17285811: A guest domain that was previously migrated might fail to boot on subsequent reboot or domain start operations because of a kernel panic. The panic occurs as the domain boots. The panic error message is similar to the following message:

panic[cpu0]/thread=10012000: tilelet_assign_cb: assigning pfns [50000, c0000)
 to mgid 1, mnodeid 1: pachunk 1 already assigned to mgid 0, mnodeid 0

Workaround: Do not reboot the domain. First, stop and unbind the domain and then bind and start the domain again. For example:

primary# ldm stop domain
primary# ldm unbind domain
primary# ldm bind domain
primary# ldm start domain

Recovery: When the problem occurs, stop and unbind the domain and then bind and start the domain again.
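The stop, unbind, bind, start sequence can be wrapped so that it halts at the first failing step. This is a minimal sketch: `cycle_domain` and the `LDM` variable are hypothetical names, and the wrapper runs only the four ldm commands shown above.

```shell
# LDM defaults to the real ldm command; set LDM=echo to preview the sequence.
LDM=${LDM:-ldm}

# cycle_domain is a hypothetical wrapper name, not an ldm subcommand.
cycle_domain() {
    domain=$1
    # The && chain stops at the first step that fails.
    $LDM stop "$domain" &&
    $LDM unbind "$domain" &&
    $LDM bind "$domain" &&
    $LDM start "$domain"
}
```

With `LDM=echo` set, `cycle_domain ldg1` prints the four commands in order without touching any domain.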

Size of Preallocated Machine Description Buffer Is Used During Migration

Bug ID 17285745: Migrating a guest domain to a SPARC T5, SPARC M5, or SPARC M6 system might result in a kernel panic on the guest domain with the suspend: get stick freq failed message.

Workaround: Add the following setting to the /etc/system file in the guest domain that is to be migrated. Then, reboot the guest domain.

set migmd_buf_addl_size = 0x100000

Attempting to Resize a Guest Domain's Virtual CPUs After a Successful Core Remap Operation Might Fail

Bug ID 17245915: When FMA detects a faulty core, the Logical Domains Manager attempts to evacuate it by performing a core remap operation if a core is free to use as a target. After the core remap operation succeeds and the faulty core is replaced, attempting to resize a guest domain's virtual CPUs by using the ldm add-vcpu command might fail with an Invalid response error message.

The failure is intermittent and depends on the system configuration.

Workaround: None.

Recovery: Perform the following steps to add more CPUs to the guest domain:

Unbind the guest domain.
Remove all the virtual CPUs.
Add the virtual CPUs again.
Bind the guest domain.

The ability to reliably use DR to add CPUs will be fully restored when the blacklisted CPU resources are repaired.

Oracle Solaris 10: Non-`primary` Root Domain Hangs at Boot on a `primary` Reboot When `failure-policy=reset`

Bug ID 17232035: A slave domain might hang on boot when failure-policy=reset in the master domain. This issue is not reproducible with different settings of the failure-policy property.

Recovery: Stop the I/O domains that are associated with this root domain and start the non-primary root domain.

Workaround: Set the failure-policy property to a value other than reset.

Virtual Network Hang Prevents a Domain Migration

Bug ID 17191488: When attempting to migrate a domain from a SPARC T5-8 to a SPARC T4-4 system, the following error occurs:

primary# ldm migrate ldg1 system2
Target Password:
Timeout waiting for domain ldg1 to suspend
Domain Migration of LDom ldg1 failed

Workaround: To avoid this problem, set extended-mapin-space=on.

Note - This command initiates a delayed reconfiguration if ldom is primary. In all other cases, stop the domain before you perform this command.

primary# ldm set-domain extended-mapin-space=on ldom

`ldmpower` Output Sometimes Does Not Include Timestamps

Bug ID 17188920: The --suppress and --timestamp options do not properly show timestamp values.

Workaround: Include the -r option when using the --suppress and --timestamp options to show the correct output.

`mac_do_softlso` Drops LSO Packets

Bug ID 17182503: mac_do_softlso() drops LSO packets that are generated by the vnet_vlan_insert_tag() and vnet_vlan_remove_tag() functions.

Workaround: To avoid this issue with VLAN tagged LSO packets, disable virtual network LSO capability on all domains that support it.

Append the following lines to the /etc/system file:
```
set vnet_enable_lso = 0
set vsw_enable_lso = 0
```
Reboot.

Verify the changes by using the mdb -k command.

# mdb -k
> vnet_enable_lso/D
vnet_enable_lso:
vnet_enable_lso:0   

> vsw_enable_lso/D
vsw_enable_lso:
vsw_enable_lso: 0

Migration Failure: `Invalid Shutdown-group: 0`

Bug ID 17088083: The migration of a domain that has more than eight virtual CPUs might result in memory corruption if the domain's highest processor group ID increases across a 64-unit multiple. For example, before the migration the highest processor group ID on the domain is 63 and after the migration it is 64.

Use the pginfo command to determine the processor group IDs in a domain. Within a domain, run the following command to print the highest processor group ID:

# pginfo -I|tr ' ' '\n'|sort -n|tail -1

Workaround: Reduce the number of virtual CPUs in the domain to eight before performing the migration. After the migration completes, you can restore the virtual CPU count in the domain to the original value.
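The "64-unit multiple" condition is plain integer arithmetic, which the following sketch illustrates. `crosses_64_boundary` is a hypothetical helper. It takes the highest processor group ID before the migration and the highest one after it. Both values are supplied by hand here, because the value after the migration is not known in advance.

```shell
# crosses_64_boundary is a hypothetical helper. Integer division by 64 gives
# the 64-unit block that an ID falls in; the bug concerns an increase from
# one block to a higher one.
crosses_64_boundary() {
    before=$1
    after=$2
    if [ $((after / 64)) -gt $((before / 64)) ]; then
        echo "crosses"
    else
        echo "same"
    fi
}

crosses_64_boundary 63 64    # the example from the text: 63 -> 64
crosses_64_boundary 10 40    # both IDs are in the first 64-unit block
```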

Autosave Configuration Is Not Updated After the Removal of a Virtual Function or a PCIe Device

Bug ID 17051532: When a PCIe device or a virtual function is removed from a guest domain, the autosave configuration is not updated. This problem might result in the device or virtual function reappearing in the guest domain after you perform an autosave recovery; namely, when autorecovery_policy=3. This problem can also cause the ldm add-spconfig -r command to fail with the Autosave configuration config-name is invalid message if you do not perform another ldm command that causes the autosave to be updated.

Workaround: Perform one of the following workarounds:

Save a new configuration after you remove the PCIe device or virtual function.
```
primary# ldm add-config new-config-name
```
Refresh the saved configuration after you remove the PCIe device or virtual function by removing and re-creating the configuration.
```
primary# ldm rm-config config-name
primary# ldm add-config config-name
```
Note that this bug prevents the ldm add-config -r config-name command from working properly.
Issue another ldm command that causes an autosave update to occur such as ldm set-vcpu, ldm bind, or ldm unbind.

`ldmp2v convert` Command Failure Causes Upgrade Loop

Bug ID 17026219: If an error occurs during the ldmp2v convert command, sometimes the boot-device property for the guest is not set to the guest's boot disk. This error causes the guest domain to boot from the Oracle Solaris Install image again after the Oracle Solaris upgrade finishes.

Workaround: Change the boot-device property on the guest domain from within the control domain. Make this change when you re-enter the Oracle Solaris Installer, and then redo the Oracle Solaris upgrade. The guest domain will then reboot from the upgraded boot disk after the upgrade has finished.

To set the boot device, run the following command on the control domain. This command assumes that the root (/) file system of the original physical system is located on slice 0 of the boot disk. If the original system booted from another slice, adjust the letter after the colon accordingly. For instance, use a for slice 0, b for slice 1, and so on.

primary# ldm set-variable boot-device=disk0:a domain-name
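The slice-to-letter mapping described above can be sketched as a small helper. `slice_to_letter` is a hypothetical name, and the range of slices 0 through 7 is an assumption made for this sketch. Adjust it if your disk label differs.

```shell
# slice_to_letter is a hypothetical helper: maps slice 0 to a, 1 to b, and
# so on, assuming slices 0 through 7.
slice_to_letter() {
    case "$1" in
        [0-7]) echo "abcdefgh" | cut -c "$(( $1 + 1 ))" ;;
        *) echo "slice must be 0-7" >&2; return 1 ;;
    esac
}

# Build the boot-device value for a root file system on slice 1.
slice=1
echo "boot-device=disk0:$(slice_to_letter "$slice")"
```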

Domain Migrations From SPARC T4 Systems That Run System Firmware 8.3 to SPARC T5, SPARC M5, or SPARC M6 Systems Are Erroneously Permitted

Bug ID 17027275: Domain migrations from SPARC T4 systems that run system firmware 8.3 to SPARC T5, SPARC M5, or SPARC M6 systems should not be permitted. Although the migration succeeds, a subsequent memory DR operation causes a panic.

Workaround: Update the system firmware on the SPARC T4 system to version 8.4. See the workaround for Guest Domain Panics at lgrp_lineage_add(mutex_enter: bad mutex, lp=10351178).

Guest Domain Panics at `lgrp_lineage_add(mutex_enter: bad mutex, lp=10351178)`

Bug ID 17020950: After migrating an active domain that was bound using firmware version 8.3 from a SPARC T4 platform to a SPARC T5, SPARC M5, or SPARC M6 platform, performing a memory dynamic reconfiguration might result in a guest domain panic.

Workaround: Before you perform the migration, update the SPARC T4 system with version 8.4 of the system firmware. Then, rebind the domain.

Guest Domains in Transition State After Reboot of the `primary` Domain

Bug ID 17020481: A guest domain is in transition state (t) after a reboot of the primary domain. This problem arises when a large number of virtual functions are configured on the system.

Workaround: Set the boot-device property so that OBP retries the disk boot several times before it falls back to a boot from the network.

Perform the following steps on each domain:

1. Access the console of the domain.
```
primary# telnet localhost domain-name
```
2. Set the boot-device property.
```
ok> setenv boot-device disk disk disk disk disk disk disk disk disk disk net
```
The number of disk entries that you specify as the value of the boot-device property depends on the number of virtual functions that are configured on the system. On smaller systems, you might be able to include fewer instances of disk in the property value.
3. Verify that the boot-device property is set correctly by using the printenv command.
```
ok> printenv
```
4. Return to the primary domain console.
5. Repeat Steps 1-4 for each domain on the system.
6. Reboot the primary domain.
```
primary# shutdown -i6 -g0 -y
```
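Because the number of disk entries varies with the system, the boot-device value can be generated rather than typed by hand. This is a sketch only: `boot_device_value` is a hypothetical helper, and it prints the text of the setenv command without running anything at the OBP prompt.

```shell
# boot_device_value is a hypothetical helper: prints "disk" repeated N times
# followed by "net", matching the form of the boot-device value shown above.
boot_device_value() {
    count=$1
    value=""
    i=0
    while [ "$i" -lt "$count" ]; do
        value="${value}disk "
        i=$((i + 1))
    done
    echo "${value}net"
}

# Ten disk entries, as in the example setenv command.
echo "setenv boot-device $(boot_device_value 10)"
```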

Panic Occurs in Rare Circumstances When the Virtual Network Device Driver Operates in `TxDring` Mode

Bug ID 16991255: A panic occurs in rare circumstances when the virtual network device driver operates in TxDring mode.

Workaround: To avoid this panic, set the extended-mapin-space property value to on.

Note - This command initiates a delayed reconfiguration if ldom is primary. In all other cases, stop the domain before you perform this command.

primary# ldm set-domain extended-mapin-space=on ldom

A Domain That Has Only One Virtual CPU Assigned Might Panic During a Live Migration

Bug ID 16895816: Migrating a domain that has only one virtual CPU assigned to it might cause a panic on the guest domain in the pg_cmt_cpu_fini() function.

Workaround: Assign at least two virtual CPUs to the guest domain before you migrate it. For example, use the ldm add-vcpu 2 domain-name command to increase the number of virtual CPUs that are assigned to the domain-name guest domain.

`ldm migrate -n` Should Fail When Cross-CPU Migration From SPARC T5, SPARC M5, or SPARC M6 System to UltraSPARC T2 or SPARC T3 System

Bug ID 16864417: The ldm migrate -n command does not report failure when attempting to migrate between a SPARC T5, SPARC M5, or SPARC M6 machine and an UltraSPARC T2 or SPARC T3 machine.

Workaround: None.

Recovery Mode Should Support PCIe Slot Removal in Non-`primary` Root Domains

Bug ID 16713362: PCIe slots cannot currently be removed from non-primary root domains during the recovery operation. The PCIe slots remain assigned to the non-primary root domain.

Workaround: The PCIe slots must be removed manually from the non-primary root domain and assigned to the appropriate I/O domain or domains after the recovery operation has finished.

For information about how to remove PCIe slots from a non-primary root domain, see Using Non-primary Root Domains in Oracle VM Server for SPARC 3.1 Administration Guide.

Recovering I/O domains that use PCIe slots owned by non-primary root domains depends on the I/O domain configuration:

If the I/O domain uses only PCIe slots and none of its PCIe slots are available, the I/O domain is not recovered and is left in the unbound state with the PCIe slots marked as evacuated.
If the I/O domain uses SR-IOV virtual functions and PCIe slots, the domain is recovered with the unavailable PCIe slots marked as evacuated.

Use the ldm add-io command to add the PCIe slots to an I/O domain after you have manually removed them from the non-primary root domain.

`ldm list` Does Not Show the `evacuated` Property for Physical I/O Devices

Bug ID 16617981: ldm list output does not show the evacuated property for the physical I/O devices.

Workaround: Use the -p option with any of the ldm list commands to show the evacuated property for physical I/O devices.

Invalid Physical Address Is Received During a Domain Migration

Bug ID 16494899: In rare circumstances, a domain migration is rejected with the following message in the ldmd SMF log:

Mar 08 17:42:12 warning: Received invalid physical address during
migration of domain rztcrmdev2: base RA: 0x400000000, offset: 0x1ffff0000,
PA: 0x87fff0000 size: 0x1001a

Because the migration fails before the domain is suspended on the source system, there is no loss of service.

This failure occurs when the following circumstances cause the migration to be rejected:

The memory contents of the last chunk of memory in the domain are compressed to a size larger than the memory chunk
The ldmd daemon incorrectly determines that data was written to memory outside of the domain on the target

The failure mode depends on the domain workload and the exact memory contents as most chunks are compressed to a smaller size.

Recovery: Although no workaround is guaranteed for this problem, performing a subsequent migration might work if the workload changes and therefore the memory contents change. You might also attempt to use dynamic reconfiguration to modify the memory size of the domain.

`send_mondo_set: timeout` Panic Occurs When Using the `ldm stop` Command on a Guest Domain After Stress

Bug ID 16486383: This problem can occur if you assign a PCI device or bus directly to a guest domain where the domain does not have a core assigned from the /SYS/DCU where the PCI card physically resides. Because the hypervisor resets PCI devices on behalf of guest domains, during each guest domain reboot a domain with cores on the DCU connected to the PCI device might panic. Assigning more PCI devices to non-DCU-local guests increases the possibility of panics.

Workaround: Perform one of the following workarounds:

Ensure that when you assign PCI devices to a guest domain, the card is located physically in the same DCU as the cores.
Manually assign cores for physical card placement flexibility.

As an example, for a PCI device on IOU0 (pci_0 through pci_15), choose a core between 0 and 127, and allocate it to the domain.
```
# ldm add-core cid=16 domain
```
View the system cores by using the following command:
```
# ldm ls-devices -a core
```
For a PCI device on IOU1 (pci_16 through pci_31), choose a core between 128 and 255. For a PCI device on IOU2 (pci_32 through pci_47), choose a core between 256 and 383. For a PCI device on IOU3 (pci_48 through pci_63), choose a core between 384 and 511.
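
The bus-to-core mapping described above follows a regular pattern: 16 PCIe buses and 128 cores per IOU. The following Python sketch only restates that arithmetic. The function name `local_core_range` is hypothetical, and the sketch assumes exactly the four IOUs and the bus naming (pci_0 through pci_63) that the text lists.

```python
# Sketch: map a PCIe bus name to the core ID range of the same IOU.
# Assumes the layout quoted in the text: 16 buses and 128 cores per IOU,
# with IOU0 through IOU3 present.

BUSES_PER_IOU = 16
CORES_PER_IOU = 128
IOU_COUNT = 4


def local_core_range(pci_bus):
    """Return the core IDs that share an IOU with a bus such as 'pci_20'."""
    prefix, _, number = pci_bus.partition("_")
    if prefix != "pci" or not number.isdigit():
        raise ValueError(f"unexpected bus name: {pci_bus!r}")
    iou = int(number) // BUSES_PER_IOU
    if iou >= IOU_COUNT:
        raise ValueError(f"bus {pci_bus!r} is outside IOU0 through IOU3")
    first = iou * CORES_PER_IOU
    return range(first, first + CORES_PER_IOU)


for bus in ("pci_0", "pci_20", "pci_47", "pci_63"):
    cores = local_core_range(bus)
    print(f"{bus}: choose a core between {cores[0]} and {cores[-1]}")
```

Confirm that the chosen core is actually free by checking the ldm ls-devices -a core output before you pass the core ID to ldm add-core.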

Subdevices Under a PCIe Device Revert to an Unassigned Name

Bug ID 16299053: After disabling a PCIe device, you might experience unexpected behavior. The subdevices that are under the disabled PCIe device revert to the non-assigned names while the PCIe device is still owned by the domain.

Workaround: If you decide to disable a PCIe slot on the ILOM, ensure that the PCIe slot is not assigned to a domain by means of the direct I/O (DIO) feature. That is, first ensure that the PCIe slot is assigned to the corresponding root domain before disabling the slot on the ILOM.

If you disable the PCIe slot on the ILOM while the PCIe slot is assigned to a domain with DIO, stop that domain and reassign the device to the root domain for the correct behavior.

`WARNING: ddi_intr_alloc: cannot fit into interrupt pool` Means That Interrupt Supply Is Exhausted While Attaching I/O Device Drivers

Bug ID 16284767: This warning on the Oracle Solaris console means the interrupt supply was exhausted while attaching I/O device drivers:

WARNING: ddi_intr_alloc: cannot fit into interrupt pool

The hardware provides a finite number of interrupts, so Oracle Solaris limits how many each device can use. A default limit is designed to match the needs of typical system configurations; however, this limit might need adjustment for certain system configurations.

Specifically, the limit may need adjustment if the system is partitioned into multiple logical domains and if too many I/O devices are assigned to any guest domain. Oracle VM Server for SPARC divides the total interrupts into smaller sets given to guest domains. If too many I/O devices are assigned to a guest domain, its supply might be too small to give each device the default limit of interrupts. Thus, it exhausts its supply before it completely attaches all the drivers.

Some drivers provide an optional callback routine which allows Oracle Solaris to automatically adjust their interrupts. The default limit does not apply to these drivers.

Workaround: Use the ::irmpools and ::irmreqs MDB macros to determine how interrupts are used. The ::irmpools macro shows the overall supply of interrupts divided into pools. The ::irmreqs macro shows which devices are mapped to each pool. For each device, ::irmreqs shows whether the default limit is enforced by an optional callback routine, how many interrupts each driver requested, and how many interrupts the driver is given.

The macros do not show information about drivers that failed to attach. However, the information that is shown helps calculate the extent to which you can adjust the default limit. Any device that uses more than one interrupt without providing a callback routine can be forced to use fewer interrupts by adjusting the default limit. Reducing the default limit below the amount that is used by such a device frees interrupts for use by other devices.

To adjust the default limit, set the ddi_msix_alloc_limit property to a value from 1 to 8 in the /etc/system file. Then, reboot the system for the change to take effect.

To maximize performance, start by assigning larger values and decrease the values in small increments until the system boots successfully without any warnings. Use the ::irmpools and ::irmreqs macros to measure the adjustment's impact on all attached drivers.

For example, suppose the following warnings are issued while booting the Oracle Solaris OS in a guest domain:

WARNING: emlxs3: interrupt pool too full.
WARNING: ddi_intr_alloc: cannot fit into interrupt pool

The ::irmpools and ::irmreqs macros show the following information:

# echo "::irmpools" | mdb -k
ADDR             OWNER   TYPE   SIZE  REQUESTED  RESERVED
00000400016be970 px#0    MSI/X  36    36         36

# echo "00000400016be970::irmreqs" | mdb -k
ADDR             OWNER   TYPE   CALLBACK NINTRS NREQ NAVAIL
00001000143acaa8 emlxs#0 MSI-X  No       32     8    8
00001000170199f8 emlxs#1 MSI-X  No       32     8    8
000010001400ca28 emlxs#2 MSI-X  No       32     8    8
0000100016151328 igb#3   MSI-X  No       10     3    3
0000100019549d30 igb#2   MSI-X  No       10     3    3
0000040000e0f878 igb#1   MSI-X  No       10     3    3
000010001955a5c8 igb#0   MSI-X  No       10     3    3

The default limit in this example is eight interrupts per device, which is not enough interrupts to accommodate the attachment of the final emlxs3 device to the system. Assuming that all emlxs instances behave in the same way, emlxs3 probably requested 8 interrupts.

By subtracting the 12 interrupts used by all of the igb devices from the total pool size of 36 interrupts, 24 interrupts are available for the emlxs devices. Dividing the 24 interrupts by 4 suggests that 6 interrupts per device would enable all emlxs devices to attach with equal performance. So, the following adjustment is added to the /etc/system file:

set ddi_msix_alloc_limit = 6
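
The calculation in this example can be written out as a short Python sketch. It is not an Oracle tool. The function name `suggest_msix_limit` is hypothetical, and the sketch assumes that devices already below the limit (here, the igb instances) keep their current allocation while the remaining interrupts are split evenly among the devices that the limit constrains.

```python
# Sketch of the arithmetic in the example above.
# max_limit reflects the documented range of 1 to 8 for ddi_msix_alloc_limit.

def suggest_msix_limit(pool_size, fixed_interrupts, device_count, max_limit=8):
    """Split what is left of the pool evenly across the limited devices."""
    available = pool_size - fixed_interrupts
    if device_count <= 0 or available < device_count:
        raise ValueError("not enough interrupts to give each device one")
    return min(max_limit, available // device_count)


igb_total = 4 * 3  # four igb instances, three interrupts each
limit = suggest_msix_limit(pool_size=36, fixed_interrupts=igb_total, device_count=4)
print(f"set ddi_msix_alloc_limit = {limit}")
```

Treat the result as a starting value and confirm it with the ::irmpools and ::irmreqs macros after the reboot.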

When the system successfully boots without warnings, the ::irmpools and ::irmreqs macros show the following updated information:

# echo "::irmpools" | mdb -k
ADDR             OWNER   TYPE   SIZE  REQUESTED  RESERVED
00000400018ca868 px#0    MSI/X  36    36         36
 
# echo "00000400018ca868::irmreqs" | mdb -k
ADDR             OWNER   TYPE   CALLBACK NINTRS NREQ NAVAIL
0000100016143218 emlxs#0 MSI-X  No       32     8    6
0000100014269920 emlxs#1 MSI-X  No       32     8    6
000010001540be30 emlxs#2 MSI-X  No       32     8    6
00001000140cbe10 emlxs#3 MSI-X  No       32     8    6
00001000141210c0 igb#3   MSI-X  No       10     3    3
0000100017549d38 igb#2   MSI-X  No       10     3    3
0000040001ceac40 igb#1   MSI-X  No       10     3    3
000010001acc3480 igb#0   MSI-X  No       10     3    3

SPARC M5-32 and SPARC M6-32: `panic: mpo_cpu_add: Cannot read MD`

Bug ID 16238762: On a SPARC M5-32 or a SPARC M6-32 with at least 2.4 Tbytes of memory, attempting to set the number of CPUs in the primary domain from 6 to 1056 CPUs causes the kernel to panic with the following message:

mpo_cpu_add: Cannot read MD

The following procedure causes the panic:

Power on with a DCU assigned to a host.

For example, assign DCU0 to HOST0.
Create guest domains.
Save a configuration to the SP.
Power off the host.
Assign another DCU to the host.

For example, assign DCU1 to HOST0.
Power on the host.

The firmware verifies that the configuration is “bootable.” This verification ensures that all the CPUs, memory, and I/O that were present at the time the configuration was created are still present. The firmware also generates a new PRI to describe the configuration of the entire system.

The configuration successfully powers on and guest domains are booted.
Attempt to dynamically add a CPU to an existing domain.

A new machine description is generated that reflects correct latency information, but the Oracle Solaris OS cannot parse the new information and panics.

Workaround: To avoid the panic, do not perform the steps that are in the problem description.

If you have already performed these steps and experienced the panic, perform the following steps:

Perform an action after booting a saved configuration from a smaller physical domain. For example, remove a CPU from each active domain.
Reboot the domain.
Unbind the domain.
Rebind any bound domains.
Save a new configuration to the SP.

SPARC M5-32 and SPARC M6-32: Issue With Disks That Are Accessible Through Multiple Direct I/O Paths

Bug ID 16232834: When using the ldm add-vcpu command to assign CPUs to a domain, the Oracle Solaris OS might panic with the following message:

panic[cpu16]/thread=c4012102c860: mpo_cpu_add: Cannot read MD

This panic occurs if the following conditions exist:

Additional DCUs have been assigned to a host
The host is started by using a previously saved SP configuration that does not contain all the hardware that is assigned to the host

The target domain of the ldm add-vcpu operation is the domain that panics. The domain recovers with the additional CPUs when it reboots.

Workaround: Do not use configurations that are generated with fewer hardware resources than are assigned to the host.

To avoid the problem, do not add CPUs as described in the problem description. Or, perform the following steps:

Generate a new SP configuration after the DCUs have been added.

For example, the following command creates a configuration called new-config-more-dcus:
```
primary# ldm add-config new-config-more-dcus
```
Shut down the domain.
Stop the host.
```
-> stop /HOST
```
Start the host.
```
-> start /HOST
```

`ixgbevf` Device in SR-IOV Domains Might Become Disabled When Rebooting the `primary` Domain

Bug ID 16224353: After rebooting the primary domain, ixgbevf instances in the primary domain might not work.

Workaround: None.

Reboot of the Oracle Solaris 10 1/13 `primary` Domain Might Not Automatically Plumb or Assign an IP Address to a Virtual Function Interface

Bug ID 16219069: On a primary domain that runs the Oracle Solaris 10 1/13 OS, the virtual function interfaces might not be automatically plumbed or assigned an IP address based on the /etc/hostname.vf-interface file.

This issue occurs when you boot or reboot a SPARC T3, SPARC T4, or SPARC T5 system that runs the Oracle Solaris 10 1/13 OS on the primary domain. This problem affects virtual functions that have been created both on on-board physical functions and on add-in physical functions. This issue does not occur when you boot a Logical Domains guest domain image.

Oracle Solaris 10 Only: `mutex_enter: bad mutex` Panic in `primary` Domain During a Reboot or Shutdown

Bug ID 16080855: During a reboot or shutdown of the primary domain, the primary domain might experience a kernel panic with a panic message similar to the following:

panic[cpu2]/thread=c40043b818a0: mutex_enter: bad mutex, lp=c4005fa01c88
owner=c4005f70aa80 thread=c40043b818a0

000002a1075c3630 ldc:ldc_mem_rdwr_cookie+20 (c4005fa01c80,
c4004e2c2000,2a1075c37c8, 6c80000, 1, 0)
%l0-3: 00000000001356a4 0000000000136800 0000000000000380
00000000000002ff
%l4-7: 00000000001ad3f8 0000000000000004 00000000ffbffb9c
0000c4005fa01c88
000002a1075c3710 vldc:i_vldc_ioctl_write_cookie+a4 (c4004c400030,
380,ffbff898, 100003, 0, 70233400)
%l0-3: 0000000006c80000 0000000000156dc8 0000000000000380
0000000000100003
%l4-7: 00000000702337b0 000002a1075c37c8 0000000000040000
0000000000000000
000002a1075c37f0 vldc:vldc_ioctl+1a4 (3101, c4004c400030,
ffbff898,c4004c400000, c4004c438030, 0)
%l0-3: 0000000000100003 0000000000000000 000000007b340400
0000c4004c438030
%l4-7: 0000c4004c400030 0000000000000000 0000000000000000
0000000000000000
000002a1075c38a0 genunix:fop_ioctl+d0 (c4004d327800, 0, ffbff898,
100003,c4004384f718, 2a1075c3acc)
%l0-3: 0000000000003103 0000000000100003 000000000133ce94
0000c4002352a480
%l4-7: 0000000000000000 0000000000000002 00000000000000c0
0000000000000000
000002a1075c3970 genunix:ioctl+16c (3, 3103, ffbff898, 3, 134d50, 0)
%l0-3: 0000c40040e00a50 000000000000c6d3 0000000000000003
0000030000002000
%l4-7: 0000000000000003 0000000000000004 0000000000000000
0000000000000000

Recovery: Allow the primary domain to reboot. If the primary domain is configured not to reboot after a crash, manually boot the primary domain.

SPARC M5-32 and SPARC M6-32: LSI-SAS Controller Is Incorrectly Exported With SR-IOV

Bug ID 16071170: On a SPARC M5-32 or a SPARC M6-32 system, the internal SAS controllers are exported as SR-IOV-enabled controllers even though these cards do not support SR-IOV.

The Oracle VM Server for SPARC log shows the following messages when attempting to create the physical function on these cards:

Dec 11 04:27:54 warning: Dropping pf
pci@d00/pci@1/pci@0/pci@0/pci@0/pci@4/LSI,sas@0: no IOV capable driver
Dec 11 04:27:54 warning: Dropping pf
pci@d80/pci@1/pci@0/pci@c/pci@0/pci@4/LSI,sas@0: no IOV capable driver
Dec 11 04:27:54 warning: Dropping pf
pci@c00/pci@1/pci@0/pci@c/pci@0/pci@4/LSI,sas@0: no IOV capable driver
Dec 11 04:27:54 warning: Dropping pf
pci@e00/pci@1/pci@0/pci@0/pci@0/pci@4/LSI,sas@0: no IOV capable driver

The system has four LSI SAS controller ports, each in one IOU of the SPARC M5-32 and SPARC M6-32 assembly. This error is reported for each port.

Workaround: You can ignore these messages. These messages indicate only that the LSI-SAS controller devices on the system are capable of SR-IOV but no SR-IOV support is available for this hardware.

SPARC T5-8: Uptime Data Shows a Value of 0 for Some `ldm` List Commands

Bug ID 16068376: On a SPARC T5-8 system with approximately 128 domains, some ldm commands such as ldm list might show 0 seconds as the uptime for all domains.

Workaround: Log in to the domain and use the uptime command to determine the domain's uptime.

Cannot Set a Jumbo MTU for `sxge` Virtual Functions in the `primary` Domain of a SPARC T5-1B System

Bug ID 16059331: The sxge driver cannot properly set jumbo MTUs for its virtual functions on the primary domain.

Workaround: Manually modify the /kernel/drv/sxge.conf file to set up the jumbo MTU on sxge virtual function interfaces in the guest domain.

`ldmd` Is Unable to Set the `mac-addr` and `alt-mac-addrs` Property Values for the `sxge` Device

Bug ID 15974640: The ldm command fails to set the mac-addr and alt-mac-addrs property values properly for the sxge device. As a result, the ldmd daemon reports an inconsistent MAC address. Also, any link aggregations that are based on the VNIC MAC address fail.

`ldm list-io -d` Output for an `sxge` Device on SPARC T5-1B System Is Missing Two Properties

Bug ID 15974547: When run on a SPARC T5-1B system that has sxge, the ldm list-io -d PF-device output does not show the max-vlans or max-vf-mtu properties. These properties are present on a SPARC T5-1B system with ixgbe as well as on non-blade systems.

The max-vlans property value is missing. The value should be 0 because the sxge device does not support hardware VLAN tagging. The max-vf-mtu property value is fixed at 1500, which prevents the physical function driver from setting the jumbo MTU for virtual functions.

`ldm` Fails to Evacuate a Faulty Core From a Guest Domain

Bug ID 15962837: A core evacuation does not complete when a chip-level fault occurs. An evacuation that follows a core fault works as expected, but an evacuation that follows a chip-level fault does not complete when trying to retire an entire CMP node.

Workaround: None. Schedule a chip replacement when you diagnose a chip-level fault.

Memory DR Operations Hang When Reducing Memory Below Four Gbytes

Bug ID 15942036: If you perform a memory DR operation to reduce memory below four Gbytes, the operation might hang indefinitely. If you issue an ldm cancel-op memdr command on that domain, an incorrect message is issued:

The memory removal operation has completed. You cannot cancel this operation.

Despite the message, the memory DR operation is hung and you might not be able to perform other ldmd operations on that guest domain.

Workaround: Do not attempt to reduce memory in any domain below four Gbytes. If you are already in this state, issue the ldm stop -f command or log in to the domain and reboot it.

CPU DR of Very Large Number of Virtual CPUs Can Appear to Fail

Bug ID 15826354: CPU dynamic reconfiguration (DR) of a very large number of CPUs causes the ldmd daemon to return a failure. Although ldmd times out, the DR operation continues in the background and eventually succeeds. Nevertheless, ldmd is no longer aligned with the resulting domain and subsequent DR operations might not be permitted.

For example:

# ldm ls
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  NORM  UPTIME
primary          active     -n-cv-  UART    7     20G      2.7%  0.4%  1h 41m
ldg0             active     -n----  5000    761   16G       75%   51%  6m

# ldm rm-vcpu 760 ldg0
Request to remove cpu(s) sent, but no valid response received
VCPU(s) will remain allocated to the domain, but might
not be available to the guest OS
Resource removal failed
 
# ldm set-vcpu 1 ldg0
Busy executing earlier command; please try again later.
Unable to remove the requested VCPUs from domain ldg0
Resource modification failed
 
# ldm ls
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  NORM  UPTIME
primary          active     -n-cv-  UART    7     20G      0.9%  0.1%  1h 45m
ldg0             active     -n----  5000    761   16G      100%  0.0%  10m

Workaround: Wait a few minutes and then run the ldm set-vcpu command again:

# ldm set-vcpu 1 ldg0
# ldm ls
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  NORM  UPTIME
primary          active     -n-cv-  UART    7     20G      0.9%  0.1%  1h 50m
ldg0             active     -n----  5000    1     16G       52%  0.0%  15m

Note that 760 exceeds the recommended maximum.
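
If you script this workaround, the wait-and-retry step can be expressed as a small loop. The following Python sketch is one possible shape, not part of the Oracle VM Server for SPARC software. The helper name `retry_while_busy`, the attempt count, and the two-minute delay are assumptions. The only detail taken from the example above is the "Busy executing earlier command" message text.

```python
import time

# Message text quoted from the ldm set-vcpu output in the example above.
BUSY_TEXT = "Busy executing earlier command"


def retry_while_busy(run, attempts=5, delay_seconds=120.0, sleep=time.sleep):
    """Call run() until its output no longer contains the busy message.

    run is a caller-supplied function that executes the ldm command and
    returns its combined output as a string. The return value is the
    1-based number of the attempt that was not rejected as busy.
    """
    for attempt in range(1, attempts + 1):
        output = run()
        if BUSY_TEXT not in output:
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)
    raise TimeoutError(f"ldmd still busy after {attempts} attempts")


# Hypothetical wiring on a system that has the ldm command:
#
#   import subprocess
#
#   def set_vcpu():
#       result = subprocess.run(["ldm", "set-vcpu", "1", "ldg0"],
#                               capture_output=True, text=True)
#       return result.stdout + result.stderr
#
#   retry_while_busy(set_vcpu)
```

Passing the command runner in as a function keeps the retry logic separate from how the command is executed, which also makes the loop easy to exercise without a live system.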

Migration of a Guest Domain With HIO Virtual Networks and `cpu-arch=generic` Times Out While Waiting for the Domain to Suspend

Bug ID 15825538: On a logical domain that is configured with both Hybrid network I/O interfaces (mode=hybrid) and cross-CPU migration enabled (cpu-arch=generic), if a secure live migration is executed (ldm migrate), the migration might time out and leave the domain in a suspended state.

Recovery: Restart the logical domain.

Workaround: Do not use hybrid I/O virtual network devices with secure cross-CPU live migration.

SPARC T4-4: Unable to Bind a Guest Domain

Bug ID 15825330: Oracle VM Server for SPARC appears to hang at startup on some SPARC T4-4 configurations that have only a single processor board.

Workaround: Ensure that a processor board always occupies the slots for processors 0 and 1. Restarting the system in such a configuration enables the Oracle VM Server for SPARC software to start up.

Guest Domain Panics While Changing the `threading` Property Value From `max-throughput` to `max-ipc`

Bug ID 15821246: On a system that runs the Oracle Solaris 11.1 OS, changing the threading property value on a migrated domain from max-ipc to max-throughput can lead to a panic on the guest domain.

Workaround: Do not change the threading status for a migrated guest domain until it is rebooted.

Control Domain Hangs on Reboot With Two Active Direct I/O Domains

Bug ID 15820741: On an Oracle Solaris 11.1 system that has two domains with direct I/O configurations, the control domain might hang when you reboot it.

Recovery: To recover from the reboot hang, reset the control domain by issuing the following command on the SP:

-> reset -f /HOST/domain/control

No Error Message When a Memory DR Add Is Partially Successful

Bug ID 15812823: In low free-memory situations, not all memory blocks can be used as part of a memory DR operation due to size. However, these memory blocks are included in the amount of free memory. This situation might lead to a smaller amount of memory being added to the domain than expected. No error message is shown if this situation occurs.

Workaround: None.

Primary or Guest Domain Panics When Unbinding or Migrating a Guest Domain That Has Hybrid I/O Network Devices

Bug ID 15803617: The primary domain or an active guest domain might panic during an unbind operation or a live migration operation if the domain is configured with hybrid I/O virtual network devices.

Recovery: Restart the affected domain.

Workaround: Do not use hybrid I/O virtual network devices.

Re-creating a Domain That Has PCIe Virtual Functions From an XML File Fails

Bug ID 15783851: You might encounter a problem when attempting to re-create a configuration from an XML file that incorrectly represents virtual function constraints.

This problem occurs when you use the ldm list-constraints -x command to save the configuration of a domain that has PCIe virtual functions.

If you later re-create the domain by using the ldm add-domain -i command, the original virtual functions do not exist, and a domain bind attempt fails with the following error message:

No free matching PCIe device...

Even if you create the missing virtual functions, another domain bind attempt fails with the same error message because the virtual functions are miscategorized as PCIe devices by the ldm add-domain command.

Workaround: Perform the following steps:

Save the information about the virtual functions by using the ldm list-io command.
Destroy each affected domain by using the ldm rm-dom command.
Create all the required virtual functions by using the ldm create-vf command.
Rebuild the domains by using the ldm command.

When you use the ldm add-io command to add each virtual function, it is correctly categorized as a virtual function device, so the domain can be bound.

For information about rebuilding a domain configuration that uses virtual functions, see ldm init-system Command Might Not Correctly Restore a Domain Configuration on Which Physical I/O Changes Have Been Made.

Incorrect Error Message Issued When Changing the Control Domain From Using Whole Cores to Using Partial Cores

Bug ID 15783608: When you change the control domain from using physically constrained cores to using unconstrained CPU resources, you might see the following extraneous message:

Whole-core partitioning has been removed from domain primary,because
dynamic reconfiguration has failed and the domain is now configured
with a partial CPU core.

Workaround: You can ignore this message.

`ldm init-system` Command Might Not Correctly Restore a Domain Configuration on Which Physical I/O Changes Have Been Made

Bug ID 15783031: You might experience problems when you use the ldm init-system command to restore a domain configuration that has used direct I/O or SR-IOV operations.

A problem arises if one or more of the following operations have been performed on the configuration to be restored:

A slot has been removed from a bus that is still owned by the primary domain.
A virtual function has been created from a physical function that is owned by the primary domain.
A virtual function has been assigned to the primary domain, to other guest domains, or to both.
A root complex has been removed from the primary domain and assigned to a guest domain, and that root complex is used as the basis for further I/O virtualization operations.

In other words, you created a non-primary root domain and performed any of the previous operations.

To ensure that the system remains in a state in which none of the previous actions have taken place, see Using the ldm init-system Command to Restore Domains on Which Physical I/O Changes Have Been Made.

Logical Domains Manager Might Crash and Restart When You Attempt to Modify Many Domains Simultaneously

Bug ID 15782994: Logical Domains Manager might crash and restart when you attempt an operation that affects the configuration of many domains. You might see this issue when you attempt to change anything related to the virtual networking configuration and if many virtual network devices in the same virtual switch exist across many domains. Typically, this issue is seen with approximately 90 or more domains that have virtual network devices connected to the same virtual switch, and the inter-vnet-link property is enabled (the default behavior). Confirm the symptom by finding the following message in the ldmd log file and a core file in the /var/opt/SUNWldm directory:

Frag alloc for 'domain-name'/MD memory of size 0x80000 failed

Workaround: Avoid creating many virtual network devices connected to the same virtual switch. If you intend to do so, set the inter-vnet-link property to off on the virtual switch. Be aware that this option might negatively affect network performance between guest domains.

`ldm list -o` Command No Longer Accepts Format Abbreviations

Bug ID 15781142: The ldm list -o format command no longer accepts abbreviations for format.

Although the Oracle VM Server for SPARC 3.0 software enabled you to use the ldm list -o net command to show information about the network, such abbreviations have been removed from the Oracle VM Server for SPARC 3.1 software. In Oracle VM Server for SPARC 3.1, you must use the full version of format in the command: ldm list -o network.

Workaround: Use the format names that are specified in the ldm(1M) man page.

Control Domain Requires the Lowest Core in the System

Bug ID 15778392: The control domain requires the lowest core in the system. So, if core ID 0 is the lowest core, it cannot be shared with any other domain if you want to apply the whole-core constraint to the control domain.

For example, if the lowest core in the system is core ID 0, the control domain should look similar to the following output:

# ldm ls -o cpu primary
NAME
primary

VCPU
VID    PID    CID    UTIL STRAND
0      0      0      0.4%   100%
1      1      0      0.2%   100%
2      2      0      0.1%   100%
3      3      0      0.2%   100%
4      4      0      0.3%   100%
5      5      0      0.2%   100%
6      6      0      0.1%   100%
7      7      0      0.1%   100%
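
A quick way to check captured output against this expectation is to read the CID column of the VCPU table. The following Python sketch is illustrative. The function names are hypothetical, the parser assumes the column layout shown above, and the value of eight strands per core is an assumption that matches this sample output.

```python
# Sketch: confirm that every strand of a given core belongs to the domain
# whose `ldm ls -o cpu` output was captured.

SAMPLE_OUTPUT = """\
NAME
primary

VCPU
VID    PID    CID    UTIL STRAND
0      0      0      0.4%   100%
1      1      0      0.2%   100%
2      2      0      0.1%   100%
3      3      0      0.2%   100%
4      4      0      0.3%   100%
5      5      0      0.2%   100%
6      6      0      0.1%   100%
7      7      0      0.1%   100%
"""


def vcpu_core_ids(ldm_output):
    """Return the CID column of the VCPU table as a list of integers."""
    cids = []
    in_table = False
    for line in ldm_output.splitlines():
        fields = line.split()
        if fields[:3] == ["VID", "PID", "CID"]:
            in_table = True
            continue
        if in_table:
            if len(fields) >= 3 and fields[0].isdigit():
                cids.append(int(fields[2]))
            elif fields:
                # A non-empty line that is not a data row ends the table.
                in_table = False
    return cids


def owns_whole_core(ldm_output, core_id, strands_per_core=8):
    """True if all strands of core_id appear in the domain's VCPU table."""
    return vcpu_core_ids(ldm_output).count(core_id) == strands_per_core


print(owns_whole_core(SAMPLE_OUTPUT, 0))
```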

After Canceling a Migration, `ldm` Commands That Are Run on the Target System Are Unresponsive

Bug ID 15776752: If you cancel a live migration, the memory contents of the domain instance that is created on the target must be “scrubbed” by the hypervisor. This scrubbing process is performed for security reasons and must be complete before the memory can be returned to the pool of free memory. While this scrubbing is in progress, ldm commands become unresponsive. As a result, the Logical Domains Manager appears to be hung.

Recovery: You must wait for this scrubbing request to finish before you attempt to run other ldm commands. This process might take a long time. For example, a guest domain that has 500 Gbytes of memory might complete this process in up to 7 minutes on a SPARC T4 server or up to 25 minutes on a SPARC T3 server.

Some Emulex Cards Do Not Work When Assigned to I/O Domain

Bug ID 15776319: On a system that runs the Oracle Solaris OS on the control domain and an I/O domain, some Emulex cards that are assigned to the I/O domain do not function properly because the cards do not receive interrupts. However, when assigned to the control domain, the same cards work properly.

This problem occurs with the following Emulex cards:

Emulex 2-Gigabit/Sec PCI Express Single and Dual FC Host Adapter (SG-XPCIE1(2)FC-EM2)
Emulex 4-Gigabit/Sec PCI Express Single and Dual FC Host Adapter (SG-XPCIE2FC-EB4-N)
Emulex 4-Gigabit/Sec PCI Express Single and Dual FC Host Adapter (SG-XPCIE1(2)FC-EM4)
Emulex 8-Gigabit/Sec PCI Express Single and Dual FC Host Adapter (SG-XPCIE1(2)FC-EM8-Z)
Emulex 8-Gigabit/Sec PCI Express Single and Dual FC Host Adapter (SG-XPCIE1(2)FC-EM8-N)

Workaround: None.

Guest Domain Panics When Running the `cputrack` Command During a Migration to a SPARC T4 System

Bug ID 15776123: If the cputrack command is run on a guest domain while that domain is migrated to a SPARC T4 system, the guest domain might panic on the target machine after it has been migrated.

Workaround: Do not run the cputrack command during the migration of a guest domain to a SPARC T4 system.

Oracle Solaris 11: DRM Stealing Reports Oracle Solaris DR Failure and Retries

Bug ID 15775668: A domain that has a higher-priority policy can steal virtual CPU resources from a domain with a lower-priority policy. While this “stealing” action is in progress, you might see the following warning messages in the ldmd log every 10 seconds:

warning: Unable to unconfigure CPUs out of guest domain-name

Workaround: You can ignore these misleading messages.

Limit the Maximum Number of Virtual Functions That Can Be Assigned to a Domain

Bug ID 15775637: An I/O domain has a limit on the number of interrupt resources that are available per root complex.

On SPARC T3 and SPARC T4 systems, the limit is approximately 63 MSI/X vectors. Each igb virtual function uses three interrupts. The ixgbe virtual function uses two interrupts.

If you assign a large number of virtual functions to a domain, the domain runs out of system resources to support these devices. You might see messages similar to the following:

WARNING: ixgbevf32: interrupt pool too full.
WARNING: ddi_intr_alloc: cannot fit into interrupt pool
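
As a rough worked example of the arithmetic above, the following sketch divides the approximate 63-vector limit by the number of interrupts that each virtual function uses. The function name `max_vfs` is hypothetical. The result is only an upper bound, and it assumes that no other device on the root complex consumes interrupt vectors.

```shell
# Hypothetical helper: upper bound on virtual functions per root complex.
# $1 = available MSI/X vectors, $2 = interrupts used by each virtual function
max_vfs() {
    echo $(( $1 / $2 ))
}

max_vfs 63 3   # igb: three interrupts per virtual function
max_vfs 63 2   # ixgbe: two interrupts per virtual function
```

With these figures, the bound is 21 igb virtual functions or 31 ixgbe virtual functions. The practical limit might be lower if other devices share the same root complex.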

Guest Domain That Uses Cross-CPU Migration Reports Random Uptimes After the Migration Completes

Bug ID 15775055: After a domain is migrated between two machines that have different CPU frequencies, the uptime reported by the ldm list command might be incorrect. These incorrect results occur because uptime is calculated relative to the STICK frequency of the machine on which the domain runs. If the STICK frequency differs between the source and target machines, the uptime appears to be scaled incorrectly.

The uptime reported and shown by the guest domain itself is correct. Also, any accounting that is performed by the Oracle Solaris OS in the guest domain is correct.

Oracle Solaris 10: `ixgbe` Driver Might Cause a Panic When Booted With an Intel Dual Port Ethernet Controller X540 Card

Bug ID 15773603: When booted with an Intel dual port Ethernet Controller X540 card, the Oracle Solaris 10 ixgbe driver might cause a system panic. This panic occurs because the driver has a high-priority timer that blocks other drivers from attaching.

Workaround: Reboot the system.

Guest Domain Console Randomly Hangs on SPARC T4 Systems

Bug ID 15771384: A domain's guest console might freeze if repeated attempts are made to connect to the console before and during the time the console is bound. For example, this might occur if you use an automated script to grab the console as a domain is being migrated onto the machine.

Workaround: To unfreeze the console, run the following commands on the domain that hosts the domain's console concentrator (usually the control domain):

primary# svcadm disable vntsd
primary# svcadm enable vntsd

Destroying All Virtual Functions and Returning the Slots to the Root Domain Does Not Restore the Root Complex Resources

Bug ID 15765858: The resources on the root complex are not restored after you destroy all the virtual functions and return the slots to the root domain.

Workaround: Set the iov option to off for the specific PCIe bus.

primary# ldm start-reconf primary
primary# ldm set-io iov=off pci_0

`ldm remove-io` of PCIe Cards That Have PCIe-to-PCI Bridges Should Be Disallowed

Bug ID 15761509: Use only the PCIe cards that support the Direct I/O (DIO) feature, which are listed in this support document.

Workaround: Use the ldm add-io command to add the card to the primary domain again.

`ldm stop` Command Might Fail If Issued Immediately After an `ldm start` Command

Bug ID 15759601: If you issue an ldm stop command immediately after an ldm start command, the ldm stop command might fail with the following error:

LDom domain stop notification failed

Workaround: Reissue the ldm stop command.

`init-system` Does Not Restore Named Core Constraints for Guest Domains From Saved XML Files

Bug ID 15758883: The ldm init-system command fails to restore the named CPU core constraints for guest domains from a saved XML file.

Workaround: Perform the following steps:

Create an XML file for the primary domain.

# ldm ls-constraints -x primary > primary.xml

Create an XML file for the guest domain or domains.

# ldm ls-constraints -x ldom[,ldom][,...] > guest.xml

Power cycle the system and boot a factory default configuration.
Apply the XML configuration to the primary domain.
```
# ldm init-system -r -i primary.xml
```
Reboot.
Apply the XML configuration to the guest domain or domains.
```
# ldm init-system -f -i guest.xml
```

System Panics When Rebooting a `primary` Domain That Has a Very Large Number of Virtual Functions Assigned

Bug ID 15750727: A system might panic when you reboot a primary domain that has a very large number of virtual functions assigned to it.

Workaround: Perform one of the following workarounds:

Decrease the virtual function number to reduce the number of failed virtual functions. This change might keep the chip responsive.
Create more Interrupt Resource Management (IRM) pools for the ixgbe virtual function because only one IRM pool is created by default for all the ixgbe virtual functions on the system.

Partial Core `primary` Fails to Permit Whole-Core DR Transitions

Bug ID 15748348: When the primary domain shares the lowest physical core (usually 0) with another domain, attempts to set the whole-core constraint for the primary domain fail.

Workaround: Perform the following steps:

Determine the lowest bound core that is shared by the domains.
```
# ldm list -o cpu
```
Unbind all the CPU threads of the lowest core from all domains other than the primary domain.

As a result, CPU threads of the lowest core are not shared and are free for binding to the primary domain.
Set the whole-core constraint by doing one of the following:
- Bind the CPU threads to the primary domain, and set the whole-core constraint by using the ldm set-vcpu -c command.
- Use the ldm set-core command to bind the CPU threads and set the whole-core constraint in a single step.

`ldm list-io` Command Shows the UNK or INV State After Boot

Bug ID 15738561: The ldm list-io command might show the UNK or INV state for PCIe slots and SR-IOV virtual functions if the command runs immediately after the primary domain is booted. This problem is caused by the delay in the Logical Domains agent's reply from the Oracle Solaris OS.

This problem has been reported only on a few systems.

Workaround: The status of the PCIe slots and the virtual functions is automatically updated after the information is received from the Logical Domains agent.

Migrating a Very Large Memory Domain on SPARC T4-4 Systems Results in a Panicked Domain on the Target System

Bug ID 15731303: Avoid migrating domains that have over 500 Gbytes of memory. Use the ldm list -o mem command to see the memory configuration of your domain. Some memory configurations that have multiple memory blocks that total over 500 Gbytes might panic with a stack that resembles the following:

panic[cpu21]/thread=2a100a5dca0:
BAD TRAP: type=30 rp=2a100a5c930 addr=6f696e740a232000 mmu_fsr=10009

sched:data access exception: MMU sfsr=10009: Data or instruction address
out of range context 0x1

pid=0, pc=0x1076e2c, sp=0x2a100a5c1d1, tstate=0x4480001607, context=0x0
g1-g7: 80000001, 0, 80a5dca0, 0, 0, 0, 2a100a5dca0

000002a100a5c650 unix:die+9c (30, 2a100a5c930, 6f696e740a232000, 10009,
2a100a5c710, 10000)
000002a100a5c730 unix:trap+75c (2a100a5c930, 0, 0, 10009, 30027b44000,
2a100a5dca0)
000002a100a5c880 unix:ktl0+64 (7022d6dba40, 0, 1, 2, 2, 18a8800)
000002a100a5c9d0 unix:page_trylock+38 (6f696e740a232020, 1, 6f69639927eda164,
7022d6dba40, 13, 1913800)
000002a100a5ca80 unix:page_trylock_cons+c (6f696e740a232020, 1, 1, 5,
7000e697c00, 6f696e740a232020)
000002a100a5cb30 unix:page_get_mnode_freelist+19c (701ee696d00, 12, 1, 0, 19, 3)
000002a100a5cc80 unix:page_get_cachelist+318 (12, 1849fe0, ffffffffffffffff, 3,
0, 1)
000002a100a5cd70 unix:page_create_va+284 (192aec0, 300ddbc6000, 0, 0,
2a100a5cf00, 300ddbc6000)
000002a100a5ce50 unix:segkmem_page_create+84 (18a8400, 2000, 1, 198e0d0, 1000,
11)
000002a100a5cf60 unix:segkmem_xalloc+b0 (30000002d98, 0, 2000, 300ddbc6000, 0,
107e290)
000002a100a5d020 unix:segkmem_alloc_vn+c0 (30000002d98, 2000, 107e000, 198e0d0,
30000000000, 18a8800)
000002a100a5d0e0 genunix:vmem_xalloc+5c8 (30000004000, 2000, 0, 0, 80000, 0)
000002a100a5d260 genunix:vmem_alloc+1d4 (30000004000, 2000, 1, 2000,
30000004020, 1)
000002a100a5d320 genunix:kmem_slab_create+44 (30000056008, 1, 300ddbc4000,
18a6840, 30000056200, 30000004000)
000002a100a5d3f0 genunix:kmem_slab_alloc+30 (30000056008, 1, ffffffffffffffff,
0, 300000560e0, 30000056148)
000002a100a5d4a0 genunix:kmem_cache_alloc+2dc (30000056008, 1, 0, b9,
fffffffffffffffe, 2006)
000002a100a5d550 genunix:kmem_cpucache_magazine_alloc+64 (3000245a740,
3000245a008, 7, 6028f283750, 3000245a1d8, 193a880)
000002a100a5d600 genunix:kmem_cache_free+180 (3000245a008, 6028f2901c0, 7, 7,
7, 3000245a740)
000002a100a5d6b0 ldc:vio_destroy_mblks+c0 (6028efe8988, 800, 0, 200, 19de0c0, 0)
000002a100a5d760 ldc:vio_destroy_multipools+30 (6028f1542b0, 2a100a5d8c8, 40,
0, 10, 30000282240)
000002a100a5d810 vnet:vgen_unmap_rx_dring+18 (6028f154040, 0, 6028f1a3cc0, a00,
200, 6028f1abc00)
000002a100a5d8d0 vnet:vgen_process_reset+254 (1, 6028f154048, 6028f154068,
6028f154060, 6028f154050, 6028f154058)
000002a100a5d9b0 genunix:taskq_thread+3b8 (6028ed73908, 6028ed738a0, 18a6840,
6028ed738d2, e4f746ec17d8, 6028ed738d4)

Workaround: Avoid performing migrations of domains that have over 500 Gbytes of memory.

Removing a Large Number of CPUs From a Guest Domain Fails

Bug ID 15726205: You might see the following error message when you attempt to remove a large number of CPUs from a guest domain:

Request to remove cpu(s) sent, but no valid response received
VCPU(s) will remain allocated to the domain, but might
not be available to the guest OS
Resource modification failed

Workaround: Stop the guest domain before you remove more than 100 CPUs from the domain.

Cannot Use Oracle Solaris Hot-Plug Operations to Hot-Remove a PCIe Endpoint Device

Bug ID 15721872: You cannot use Oracle Solaris hot-plug operations to hot-remove a PCIe endpoint device after that device is removed from the primary domain by using the ldm rm-io command. For information about replacing or removing a PCIe endpoint device, see Making PCIe Hardware Changes in Oracle VM Server for SPARC 3.1 Administration Guide.

`nxge` Panics When Migrating a Guest Domain That Has Hybrid I/O and Virtual I/O Virtual Network Devices

Bug ID 15710957: When a heavily loaded guest domain has a hybrid I/O configuration and you attempt to migrate it, you might see an nxge panic.

Workaround: Add the following line to the /etc/system file on the primary domain and on any service domain that is part of the hybrid I/O configuration for the domain:

set vsw:vsw_hio_max_cleanup_retries = 0x200

All `ldm` Commands Hang When Migrations Have Missing Shared NFS Resources

Bug ID 15708982: An initiated or ongoing migration, or any ldm command, hangs forever. This situation occurs when the domain to be migrated uses a shared file system from another system and the file system is no longer shared.

Workaround: Make the shared file system accessible again.

Logical Domains Agent Service Does Not Come Online If the System Log Service Does Not Come Online

Bug ID 15707426: If the system log service, svc:/system/system-log, fails to start and does not come online, the Logical Domains agent service will not come online. When the Logical Domains agent service is not online, the virtinfo, ldm add-vsw, ldm add-vdsdev, and ldm list-io commands might not behave as expected.

Workaround: Ensure that the svc:/ldoms/agents:default service is enabled and online:

# svcs -l svc:/ldoms/agents:default

If the svc:/ldoms/agents:default service is offline, verify that the service is enabled and that all dependent services are online.

Kernel Deadlock Causes Machine to Hang During a Migration

Bug ID 15704500: The migration of an active guest domain might hang and cause the source machine to become unresponsive. When this problem occurs, the following message is written to the console and to the /var/adm/messages file:

vcc: i_vcc_ldc_fini: cannot close channel 15

vcc: [ID 815110 kern.notice] i_vcc_ldc_fini: cannot
close channel 15

Note that the channel number shown is an Oracle Solaris internal channel number that might be different for each warning message.

Workaround: Before you migrate the domain, disconnect from the guest domain's console.

Recovery: Perform a power cycle of the source machine.

DRM and `ldm list` Output Shows a Different Number of Virtual CPUs Than Are Actually in the Guest Domain

Bug ID 15702475: A No response message might appear in the Oracle VM Server for SPARC log when a loaded domain's DRM policy expires after the CPU count has been substantially reduced. The ldm list output shows that more CPU resources are allocated to the domain than is shown in the psrinfo output.

Workaround: Use the ldm set-vcpu command to reset the number of CPUs on the domain to the value that is shown in the psrinfo output.

Live Migration of a Domain That Depends on an Inactive Master Domain on the Target Machine Causes `ldmd` to Fault With a Segmentation Fault

Bug ID 15701865: If you attempt a live migration of a domain that depends on an inactive domain on the target machine, the ldmd daemon faults with a segmentation fault, and the domain on the target machine restarts. Although you can still perform a migration, it will not be a live migration.

Workaround: Perform one of the following actions before you attempt the live migration:

Remove the guest dependency from the domain to be migrated.
Start the master domain on the target machine.

DRM Fails to Restore the Default Number of Virtual CPUs for a Migrated Domain When the Policy Is Removed or Expired

Bug ID 15701853: After you perform a domain migration while a DRM policy is in effect, if the DRM policy expires or is removed from the migrated domain, DRM fails to restore the original number of virtual CPUs to the domain.

Workaround: If a domain is migrated while a DRM policy is active and the DRM policy is subsequently expired or removed, reset the number of virtual CPUs. Use the ldm set-vcpu command to set the number of virtual CPUs to the original value on the domain.

Virtual CPU Timeout Failures During DR

Bug ID 15701258: Running the ldm set-vcpu 1 command on a guest domain that has over 100 virtual CPUs and some cryptographic units fails to remove the virtual CPUs. The virtual CPUs are not removed because of a DR timeout failure. The cryptographic units are successfully removed.

Workaround: Use the ldm rm-vcpu command to remove all but one of the virtual CPUs from the guest domain. Do not remove more than 100 virtual CPUs at a time.
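
The batching that this workaround describes can be sketched as follows. The function name `print_rm_vcpu_batches` and the domain name `ldg1` are hypothetical. The sketch only prints the ldm rm-vcpu commands, so that you can review them before running them on the control domain. It assumes a POSIX shell.

```shell
# Hypothetical helper: print ldm rm-vcpu commands that reduce a domain
# from $1 virtual CPUs to one, removing at most 100 CPUs per command.
# $1 = current virtual CPU count, $2 = domain name
print_rm_vcpu_batches() {
    remaining=$(( $1 - 1 ))          # leave one virtual CPU in the domain
    while [ "$remaining" -gt 0 ]; do
        batch=$remaining
        if [ "$batch" -gt 100 ]; then
            batch=100
        fi
        echo "ldm rm-vcpu $batch $2"
        remaining=$(( remaining - batch ))
    done
}

print_rm_vcpu_batches 256 ldg1
```

For a domain that has 256 virtual CPUs, the sketch prints three commands that remove 100, 100, and 55 virtual CPUs.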

Migration Failure Reason Not Reported When the System MAC Address Clashes With Another MAC Address

Bug ID 15699763: A domain cannot be migrated if it contains a duplicate MAC address. Typically, when a migration fails for this reason, the failure message shows the duplicate MAC address. However, in rare circumstances, this failure message might not report the duplicate MAC address.

# ldm migrate ldg2 system2
Target Password:
Domain Migration of LDom ldg2 failed

Workaround: Ensure that the MAC addresses on the target machine are unique.
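
One way to check for duplicates is sketched below. The function name `find_duplicate_macs` is hypothetical, and the sketch assumes that you have already collected the MAC addresses in use, one per line (for example, by copying them from the ldm list -o network output of each domain). The sample addresses are illustrative only.

```shell
# Hypothetical helper: read one MAC address per line on standard input
# and print each address that occurs more than once (case-insensitive).
find_duplicate_macs() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Illustrative input: the first and third addresses differ only in case.
printf '%s\n' 00:14:4f:f8:76:6d 00:14:4f:f9:86:f3 00:14:4F:F8:76:6D |
    find_duplicate_macs
```

Any address that the sketch prints appears more than once in the input and should be changed before you retry the migration.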

Simultaneous Migration Operations in “Opposite Direction” Might Cause `ldm` to Hang

Bug ID 15696986: If two ldm migrate commands are issued simultaneously in the “opposite direction,” the two commands might hang and never complete. An opposite direction situation occurs when you simultaneously start a migration on machine A to machine B and a migration on machine B to machine A.

The hang occurs even if the migration processes are initiated as dry runs by using the -n option. When this problem occurs, all other ldm commands might hang.

Workaround: None.

Removing a Large Number of CPUs From the Control Domain Fails

Bug ID 15677358: Use a delayed reconfiguration rather than dynamic reconfiguration to remove more than 100 CPUs from the control domain (also known as the primary domain). Use the following steps:

Use the ldm start-reconf primary command to put the control domain in delayed reconfiguration mode.
Remove the desired number of CPU resources.

If you make a mistake while removing CPU resources, do not attempt another request to remove CPUs while the control domain is still in a delayed reconfiguration. If you do so, the commands will fail (see Only One CPU Configuration Operation Is Permitted to Be Performed During a Delayed Reconfiguration). Instead, undo the delayed reconfiguration operation by using the ldm cancel-reconf command, and start over.
Reboot the control domain.

System Running the Oracle Solaris 10 8/11 OS That Has the Elastic Policy Set Might Hang

Bug IDs 15672651 and 15731467: You might experience OS hangs at login or while executing commands when the following conditions are met:

The Oracle Solaris 10 8/11 OS is running on a SPARC sun4v system
The Power Management (PM) Elastic policy is set on the system's ILOM Service Processor

Workaround: Apply patch ID 147149-01.

`pkgadd` Fails to Set ACL Entries on `/var/svc/manifest/platform/sun4v/ldmd.xml`

Bug ID 15668881: When using the pkgadd command to install the SUNWldm.v package from a directory that is exported by means of NFS from a Sun ZFS Storage Appliance, you might see the following error message:

cp: failed to set acl entries on /var/svc/manifest/platform/sun4v/ldmd.xml

Workaround: Ignore this message.

SPARC T3-1: Issue With Disks That Are Accessible Through Multiple Direct I/O Paths

Bug ID 15668368: A SPARC T3-1 system can be installed with dual-ported disks, which can be accessed by two different direct I/O devices. In this case, assigning these two direct I/O devices to different domains can cause both domains to use the same disks and to affect each other, depending on how each domain actually uses those disks.

Workaround: Do not assign direct I/O devices that have access to the same set of disks to different I/O domains. To determine whether you have dual-ported disks on a SPARC T3-1 system, run the following command on the SP:

-> show /SYS/SASBP

If the output includes the following fru_description value, the corresponding system has dual-ported disks:

fru_description = BD,SAS2,16DSK,LOUISE

If dual-ported disks are present in the system, ensure that both of the following direct I/O devices are always assigned to the same domain:

pci@400/pci@1/pci@0/pci@4  /SYS/MB/SASHBA0
pci@400/pci@2/pci@0/pci@4  /SYS/MB/SASHBA1

Memory DR Removal Operations With Multiple Plumbed NIU `nxge` Instances Can Hang Indefinitely and Never Complete

Bug ID 15667770: When multiple NIU nxge instances are plumbed on a domain, the ldm rm-mem and ldm set-mem commands, which are used to remove memory from the domain, might never complete. To determine whether the problem has occurred during a memory removal operation, monitor the progress of the operation with the ldm list -o status command. You might have encountered this problem if the progress percentage remains constant for several minutes.

Workaround: Cancel the ldm rm-mem or ldm set-mem command, and check whether a sufficient amount of memory was removed. If not, a subsequent memory removal command to remove a smaller amount of memory might complete successfully.

If the problem has occurred on the primary domain, do the following:

Start a delayed reconfiguration operation on the primary domain.
```
# ldm start-reconf primary
```
Assign the desired amount of memory to the domain.
Reboot the primary domain.

If the problem occurred on another domain, stop the domain before adjusting the amount of memory that is assigned to the domain.

Using the `ldm stop -a` Command on Domains in a Master-Slave Relationship Leaves the Slave With the `stopping` Flag Set

Bug ID 15664666: When a reset dependency is created, an ldm stop -a command might result in a domain with a reset dependency being restarted instead of only stopped.

Workaround: First, issue the ldm stop command to the master domain. Then, issue the ldm stop command to the slave domain. If the initial stop of the slave domain results in a failure, issue the ldm stop -f command to the slave domain.
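
The ordering in this workaround can be sketched as a small shell function. The function name `stop_master_then_slave`, the `LDM` variable, and the domain names are hypothetical. Setting `LDM` to `echo ldm` turns the sketch into a dry run that prints the commands instead of running them.

```shell
# Hypothetical helper: stop the master domain first, then the slave domain,
# retrying the slave with -f only if its first stop fails.
# $1 = master domain, $2 = slave domain
stop_master_then_slave() {
    ${LDM:-ldm} stop "$1" || return 1
    ${LDM:-ldm} stop "$2" || ${LDM:-ldm} stop -f "$2"
}

LDM="echo ldm"    # dry run: print the commands instead of running them
stop_master_then_slave ldg-master ldg-slave
```

Unset `LDM` to run the real ldm commands on the control domain.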

Migration of a Domain That Has an Enabled Default DRM Policy Results in a Target Domain Being Assigned All Available CPUs

Bug ID 15655513: Following the migration of an active domain, CPU utilization in the migrated domain can increase dramatically for a short period of time. If a dynamic resource management (DRM) policy is in effect for the domain at the time of the migration, the Logical Domains Manager might begin to add CPUs. In particular, if the vcpu-max and attack properties were not specified when the policy was added, the default value of unlimited causes all the unbound CPUs in the target machine to be added to the migrated domain.

Recovery: No recovery is necessary. After the CPU utilization drops below the upper limit that is specified by the DRM policy, the Logical Domains Manager automatically removes the CPUs.

An In-Use MAC Address Can be Reassigned

Bug ID 15655199: Sometimes an in-use MAC address is not detected and it is erroneously reassigned.

Workaround: Manually ensure that an in-use MAC address cannot be reassigned.

`ldmconfig` Cannot Create a Domain Configuration on the SP

Bug ID 15654965: The ldmconfig script cannot properly create a stored domain configuration on the service processor (SP).

Workaround: Do not power cycle the system after the ldmconfig script completes and the domain reboots. Instead, perform the following manual steps:

Add the configuration to the SP.
```
# ldm add-spconfig new-config-name
```
Remove the primary-with-clients configuration from the SP.
```
# ldm rm-spconfig primary-with-clients
```
Power cycle the system.

If you do not perform these steps prior to the system being power cycled, the existence of the primary-with-clients configuration causes the domains to be inactive. In this case, you must bind each domain manually and then start them by running the ldm start -a command. After the guests have booted, repeating this sequence enables the guest domains to be booted automatically after a power cycle.

Uncooperative Oracle Solaris Domain Migration Can Be Blocked If `cpu0` Is Offline

Bug ID 15653424: The migration of an active domain can fail if it is running a release earlier than the Oracle Solaris 10 10/09 OS release and the lowest-numbered CPU in the domain is in the offline state. The operation fails when the Logical Domains Manager uses CPU DR to reduce the domain to a single CPU. In doing so, the Logical Domains Manager attempts to remove all but the lowest-numbered CPU in the domain. Because that CPU is offline, the operation fails.

Workaround: Before attempting the migration, ensure that the lowest-numbered CPU in the domain is in the online state.
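
This check can be sketched by reading psrinfo output in the domain to be migrated. The function name `lowest_cpu_online` is hypothetical, and the sketch assumes the usual psrinfo line layout, in which the first field is the processor ID and the second field is the state. The sample lines are illustrative only.

```shell
# Hypothetical helper: read psrinfo output on standard input and succeed
# only if the lowest-numbered CPU is reported as on-line.
lowest_cpu_online() {
    awk 'NR == 1 || $1 + 0 < min { min = $1 + 0; state = $2 }
         END { exit (state == "on-line" ? 0 : 1) }'
}

# Illustrative sample input; on a real domain, use: psrinfo | lowest_cpu_online
printf '%s\n' \
    '0       off-line  since 07/12/2013 10:03:21' \
    '1       on-line   since 07/12/2013 10:03:21' |
    lowest_cpu_online || echo "lowest-numbered CPU is not on-line"
```

The function returns a nonzero exit status when the lowest-numbered CPU is in any state other than on-line, so it can gate a migration script.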

Memory DR Is Disabled Following a Canceled Migration

Bug ID 15646293: After an Oracle Solaris 10 9/10 domain has been suspended as part of a migration operation, memory dynamic reconfiguration (DR) is disabled. This action occurs not only when the migration is successful but also when the migration has been canceled, despite the fact that the domain remains on the source machine.

Dynamic Reconfiguration of MTU Values of Virtual Network Devices Sometimes Fails

Bug ID 15631119: If you modify the maximum transmission unit (MTU) of a virtual network device on the control domain, a delayed reconfiguration operation is triggered. If you subsequently cancel the delayed reconfiguration, the MTU value for the device is not restored to the original value.

Recovery: Rerun the ldm set-vnet command to set the MTU to the original value. Resetting the MTU value puts the control domain into delayed reconfiguration mode, which you need to cancel. The resulting MTU value is now the original, correct MTU value.

# ldm set-vnet mtu=orig-value vnet1 primary
# ldm cancel-op reconf primary

Migrated Domain With MAUs Contains Only One CPU When Target OS Does Not Support DR of Cryptographic Units

Bug ID 15606220: Starting with the Logical Domains 1.3 release, a domain can be migrated even if it has one or more cryptographic units bound to it.

In the following circumstances, the target machine will contain only one CPU after the migration is completed:

Target machine runs Logical Domains 1.2
Control domain on the target machine runs a version of the Oracle Solaris OS that does not support cryptographic unit DR
You migrate a domain that contains cryptographic units

After the migration completes, the target domain will resume successfully and be operational, but will be in a degraded state (just one CPU).

Workaround: Prior to the migration, remove the cryptographic unit or units from the source machine that runs Logical Domains 1.3.

Mitigation: To avoid this issue, perform one or both of these steps:

Install the latest Oracle VM Server for SPARC software on the target machine.
Install patch ID 142245-01 on the control domain of the target machine, or upgrade to at least the Oracle Solaris 10 10/09 OS.

Confusing Migration Failure Message for Real Address Memory Bind Failures

Bug ID 15605806: In certain situations, a migration fails with the following error message, and ldmd reports that it was not possible to bind the memory needed for the source domain. This situation can occur even if the total amount of available memory on the target machine is greater than the amount of memory being used by the source domain (as shown by ldm ls-devices -a mem).

Unable to bind 29952M memory region at real address 0x8000000
Domain Migration of LDom ldg0 failed

Cause: This failure is due to the inability to meet congruence requirements between the Real Address (RA) and the Physical Address (PA) on the target machine.

Workaround: Stop the domain and perform the migration as a cold migration. You can also reduce the size of the memory on the guest domain by 128 Mbytes, which might permit the migration to proceed while the domain is running.

Dynamically Removing All the Cryptographic Units From a Domain Causes SSH to Terminate

Bug ID 15600969: If all the hardware cryptographic units are dynamically removed from a running domain, the cryptographic framework fails to seamlessly switch to the software cryptographic providers, and kills all the ssh connections.

Recovery: Re-establish the ssh connections after all the cryptographic units are removed from the domain.

Workaround: Set UseOpenSSLEngine=no in the /etc/ssh/sshd_config file on the server side, and run the svcadm restart ssh command.

All ssh connections will no longer use the hardware cryptographic units (and thus not benefit from the associated performance improvements), and ssh connections will not be disconnected when the cryptographic units are removed.

PCI Express Dual 10-Gigabit Ethernet Fiber Card Shows Four Subdevices in `ldm list-io -l` Output

Bug ID 15597025: When you run the ldm ls-io -l command on a system that has a PCI Express Dual 10-Gigabit Ethernet Fiber card (X1027A-Z) installed, the output might show the following:

primary# ldm ls-io -l
...
pci@500/pci@0/pci@c PCIE5 OCC primary
network@0
network@0,1
ethernet
ethernet

The output shows four subdevices even though this Ethernet card has only two ports. This anomaly occurs because this card has four PCI functions. Two of these functions are disabled internally and appear as ethernet in the ldm ls-io -l output.

Workaround: You can ignore the ethernet entries in the ldm ls-io -l output.

Using Logical Domains mpgroup With MPXIO Storage Array Configuration for High-Disk Availability

Bug ID 15591769: When creating a LUN, you can add it to the virtual disk service for both primary and alternate domains by using the same mpgroup. To specify which domain to use first when accessing the LUN, add that virtual disk service device first.

To use the LUN from primary-vds0 first, perform the following commands:

primary# ldm add-vdsdev mpgroup=ha lun1@primary-vds0
primary# ldm add-vdsdev mpgroup=ha lun1@alternate-vds0
primary# ldm add-vdisk disk1 lun1@primary-vds0 gd0

To use the LUN from alternate-vds0 first, perform the following commands:

primary# ldm add-vdsdev mpgroup=ha lun1@alternate-vds0
primary# ldm add-vdsdev mpgroup=ha lun1@primary-vds0
primary# ldm add-vdisk disk1 lun1@alternate-vds0 gd0

`ldm` Commands Are Slow to Respond When Several Domains Are Booting

Bug ID 15572184: An ldm command might be slow to respond when several domains are booting. If you issue an ldm command at this stage, the command might appear to hang. Note that the ldm command will return after performing the expected task. After the command returns, the system should respond normally to ldm commands.

Workaround: Avoid booting many domains simultaneously. However, if you must boot several domains at once, refrain from issuing further ldm commands until the system returns to normal. For instance, wait for about two minutes on Sun SPARC Enterprise T5140 and T5240 servers and for about four minutes on the Sun SPARC Enterprise T5440 server or Sun Netra T5440 server.

Oracle Solaris 11: Zones Configured With an Automatic Network Interface Might Fail to Start

Bug ID 15560811: In Oracle Solaris 11, zones that are configured with an automatic network interface (anet) might fail to start in a domain that has Logical Domains virtual network devices only.

Workaround 1: Assign one or more physical network devices to the guest domain. Use PCIe bus assignment, the Direct I/O (DIO), or the SR-IOV feature to assign a physical NIC to the domain.
Workaround 2: If the zones configuration requirement is to have interzone communication only within the domain, create an etherstub device. Use the etherstub device as the “lower link” in the zones configuration so that virtual NICs are created on the etherstub device.
Workaround 3: Use exclusive link assignment to assign a Logical Domains virtual network device to a zone. Assign virtual network devices, as needed, to the domain. You might also choose to disable inter-vnet links to be able to create a large number of virtual network devices.

Oracle Solaris 10: Virtual Network Devices Are Not Created Properly on the Control Domain

Bug ID 15560201: Sometimes ifconfig indicates that the device does not exist after you add a virtual network or virtual disk device to a domain. This situation might occur as the result of the /devices entry not being created.

Although this problem should not occur during normal operation, the error sometimes occurs when the instance number of a virtual network device does not match the instance number listed in the /etc/path_to_inst file.

For example:

# ifconfig vnet0 plumb
ifconfig: plumb: vnet0: no such interface

The instance number of a virtual device is shown under the DEVICE column in the ldm list output:

# ldm list -o network primary
NAME             
primary          

MAC
 00:14:4f:86:6a:64

VSW
 NAME         MAC               NET-DEV DEVICE   DEFAULT-VLAN-ID PVID VID MTU  MODE  
 primary-vsw0 00:14:4f:f9:86:f3 nxge0   switch@0 1               1        1500        

NETWORK
 NAME   SERVICE              DEVICE    MAC               MODE PVID VID MTU  
 vnet1  primary-vsw0@primary network@0 00:14:4f:f8:76:6d      1        1500

The instance number (0 for both the vnet and vsw shown previously) can be compared with the instance number in the path_to_inst file to ensure that they match.

# egrep '(vnet|vsw)' /etc/path_to_inst
"/virtual-devices@100/channel-devices@200/virtual-network-switch@0" 0 "vsw"
"/virtual-devices@100/channel-devices@200/network@0" 0 "vnet"
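
The lookup in the path_to_inst file can be sketched as follows. The function name `inst_for_path` is hypothetical. The sketch relies on the path_to_inst line layout shown above: a quoted physical path, an instance number, and a quoted driver name. The sample lines are copied from the egrep output above.

```shell
# Hypothetical helper: read path_to_inst lines on standard input and print
# the instance number recorded for the device path given as $1.
inst_for_path() {
    awk -v path="\"$1\"" '$1 == path { print $2 }'
}

# On a real system, use: inst_for_path <device-path> < /etc/path_to_inst
printf '%s\n' \
    '"/virtual-devices@100/channel-devices@200/virtual-network-switch@0" 0 "vsw"' \
    '"/virtual-devices@100/channel-devices@200/network@0" 0 "vnet"' |
    inst_for_path /virtual-devices@100/channel-devices@200/network@0
```

Compare the printed number with the number that follows the @ sign in the DEVICE column of the ldm list -o network output.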

Workaround: If the instance numbers do not match, remove the virtual network or virtual switch device. Then, add the device again, explicitly specifying the required instance number by setting the id property.

You can also manually edit the /etc/path_to_inst file. See the path_to_inst(4) man page.

Caution - Changes should not be made to /etc/path_to_inst without careful consideration.

Newly Added NIU/XAUI Adapters Are Not Visible to the Host OS If Logical Domains Is Configured

Bug ID 15555509: When Logical Domains is configured on a system and you add another XAUI network card, the card is not visible after the machine has undergone a power cycle.

Recovery: To make the newly added XAUI visible in the control domain, perform the following steps:

Set and clear a dummy variable in the control domain.

The following commands use a dummy variable called fix-xaui:
```
# ldm set-var fix-xaui=yes primary
# ldm rm-var fix-xaui primary
```
Save the modified configuration to the SP, replacing the current configuration.

The following commands use a configuration name of config1:
```
# ldm rm-spconfig config1
# ldm add-spconfig config1
```
Perform a reconfiguration reboot of the control domain.
```
# reboot -- -r
```
At this time, you can configure the newly available network or networks for use by Logical Domains.

I/O Domain or Guest Domain Panics When Booting From `e1000g`

Bug ID 15543982: You can configure a maximum of two domains with dedicated PCI-E root complexes on systems such as the Sun Fire T5240. These systems have two UltraSPARC T2 Plus CPUs and two I/O root complexes.

pci@500 and pci@400 are the two root complexes in the system. The primary domain will always contain at least one root complex. A second domain can be configured with an unassigned or unbound root complex.

The pci@400 fabric (or leaf) contains the on-board e1000g network card. The following circumstances could lead to a domain panic:

If the system is configured with a primary domain that contains pci@500 and a second domain that contains pci@400

Note - For some blades, the primary domain (system disk) is on the pci@400 bus by default.

The e1000g device on the pci@400 fabric is used to boot the second domain

Avoid the following network devices if they are configured in a non-primary domain:

/pci@400/pci@0/pci@c/network@0,1
/pci@400/pci@0/pci@c/network@0

When these conditions are true, the domain will panic with a PCI-E Fatal error.

Avoid such a configuration or, if the configuration is used, do not boot from the listed devices.
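
A boot device path can be checked against the two listed devices before it is used. The following is a minimal sketch under stated assumptions: the function name is_listed_e1000g_device is hypothetical, the path is assumed to be a full device path rather than an OpenBoot alias such as net, and an optional argument suffix after a colon is tolerated.

```shell
# Hypothetical helper: succeeds (exit status 0) if the given device path is one
# of the two on-board e1000g devices on the pci@400 fabric listed above.
is_listed_e1000g_device() {
    case "$1" in
        /pci@400/pci@0/pci@c/network@0|/pci@400/pci@0/pci@c/network@0:*)
            return 0 ;;
        /pci@400/pci@0/pci@c/network@0,1|/pci@400/pci@0/pci@c/network@0,1:*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example: warn before booting a non-primary domain from a listed device.
if is_listed_e1000g_device "/pci@400/pci@0/pci@c/network@0,1"; then
    echo "avoid booting a non-primary domain from this device"
fi
```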

Explicit Console Group and Port Bindings Are Not Migrated

Bug ID 15527921: During a migration, any explicitly assigned console group and port are ignored, and a console with default properties is created for the target domain. This console is created using the target domain name as the console group and using any available port on the first virtual console concentrator (vcc) device in the control domain. If there is a conflict with the default group name, the migration fails.

Recovery: To restore the explicit console properties following a migration, unbind the target domain and manually set the desired properties using the ldm set-vcons command.
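
The recovery can be sketched as a short command sequence. The helper below only prints the commands. Its name, print_restore_vcons_cmds, and the sample domain, group, and port values are hypothetical. The stop-domain, bind-domain, and start-domain steps are an assumption added here, because an active domain generally must be stopped before it can be unbound and must be bound again before it can be started.

```shell
# Hypothetical dry-run helper: prints (does not execute) the commands that
# would restore an explicit console group and port on a migrated domain.
print_restore_vcons_cmds() {
    ldom=$1     # migrated domain on the target machine
    group=$2    # console group that was assigned on the source machine
    port=$3     # console port that was assigned on the source machine
    printf 'ldm stop-domain %s\n' "$ldom"
    printf 'ldm unbind-domain %s\n' "$ldom"
    printf 'ldm set-vcons group=%s port=%s %s\n' "$group" "$port" "$ldom"
    printf 'ldm bind-domain %s\n' "$ldom"
    printf 'ldm start-domain %s\n' "$ldom"
}

# Example with assumed values: group "ldg1-cons" and port 5001 for domain ldg1.
print_restore_vcons_cmds ldg1 ldg1-cons 5001
```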

Migration Does Not Fail If a `vdsdev` on the Target Has a Different Back End

Bug ID 15523133: If the virtual disk on the target machine does not point to the same disk back end that is used on the source machine, the migrated domain cannot access the virtual disk using that disk back end. A hang can result when accessing the virtual disk on the domain.

Currently, the Logical Domains Manager checks only that the virtual disk volume names match on the source and target machines. In this scenario, no error message is displayed if the disk back ends do not match.

Workaround: When you are configuring the target domain to receive a migrated domain, ensure that the disk volume (vdsdev) matches the disk back end used on the source domain.

Recovery: Do one of the following if you discover that the virtual disk device on the target machine points to the incorrect disk back end:

Migrate the domain and fix the vdsdev.
1. Migrate the domain back to the source machine.
2. Fix the vdsdev on the target to point to the correct disk back end.
3. Migrate the domain to the target machine again.
Stop and unbind the domain on the target, and fix the vdsdev. If the OS supports virtual I/O dynamic reconfiguration and the incorrect virtual disk is not in use on the domain (that is, it is not the boot disk and is unmounted), do the following:
1. Use the ldm rm-vdisk command to remove the disk.
2. Fix the vdsdev.
3. Use the ldm add-vdisk command to add the virtual disk again.

Migration Can Fail to Bind Memory Even If the Target Has Enough Available

Bug ID 15523120: In certain situations, a migration fails and ldmd reports that it was not possible to bind the memory needed for the source domain. This situation can occur even if the total amount of available memory on the target machine is greater than the amount of memory being used by the source domain.

This failure occurs because migrating the specific memory ranges in use by the source domain requires that compatible memory ranges are available on the target as well. When no such compatible memory range is found for any memory range in the source, the migration cannot proceed.

Recovery: If this condition is encountered, you might be able to migrate the domain if you modify the memory usage on the target machine. To do this, unbind any bound or active logical domain on the target.

Use the ldm list-devices -a mem command to see what memory is available and how it is used. You might also need to reduce the amount of memory that is assigned to another domain.

Logical Domains Manager Does Not Start If the Machine Is Not Networked and an NIS Client Is Running

Bug ID 15518409: If you do not have a network configured on your machine and have a Network Information Services (NIS) client running, the Logical Domains Manager will not start on your system.

Workaround: Disable the NIS client on your non-networked machine:

```
# svcadm disable nis/client
```

Logical Domains Manager Displays Migrated Domains in Transition States When They Are Already Booted

Bug ID 15516245: On occasion, an active logical domain appears to be in the transition state instead of the normal state long after it is booted, or following the completion of a domain migration. This glitch is harmless, and the domain is fully operational. To see which flag is set, check the flags field in the ldm list -l -p command output, or check the FLAGS field in the ldm list command output, which shows -n---- for normal or -t---- for transition.
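
The flag check can be expressed as a small shell function. This is a minimal sketch: the function name domain_state_from_flags is hypothetical, and it looks only at the second character of the FLAGS value, which is the position that distinguishes the two states.

```shell
# Hypothetical helper: classify a domain from its FLAGS string by the second
# character ("n" = normal, "t" = transition); anything else is "unknown".
domain_state_from_flags() {
    case "$1" in
        ?n*) echo normal ;;
        ?t*) echo transition ;;
        *)   echo unknown ;;
    esac
}

# Example with a sample FLAGS value.
domain_state_from_flags "-t----"
```

The FLAGS value itself would come from the ldm list output, where it is the third column if the default column order (NAME, STATE, FLAGS) is in effect.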

Recovery: After the next reboot, the domain shows the correct state.

Cannot Connect to Migrated Domain's Console Unless `vntsd` Is Restarted

Bug ID 15513998: Occasionally, after a domain has been migrated, it is not possible to connect to the console for that domain.

Workaround: Restart the vntsd SMF service to enable connections to the console:

```
# svcadm restart vntsd
```

Note - This command will disconnect all active console connections.

Sometimes, Executing the `uadmin 1 0` Command From a Logical Domains System Does Not Return the System to the OK Prompt

Bug ID 15511551: Sometimes, executing the uadmin 1 0 command from the command line of a Logical Domains system does not leave the system at the ok prompt after the subsequent reset. This incorrect behavior is seen only when the Logical Domains variable auto-reboot? is set to true. If auto-reboot? is set to false, the expected behavior occurs.

Workaround: Use this command instead:

```
uadmin 2 0
```

Or, always run with auto-reboot? set to false.

Logical Domains Manager Can Take Over 15 Minutes to Shut Down a Domain

Bug ID 15505014: A domain shutdown or memory scrub can take over 15 minutes with a single CPU and a very large memory configuration. During a shutdown, the CPUs in a domain are used to scrub all the memory owned by the domain. The time taken to complete the scrub can be quite long if a configuration is imbalanced, for example, a single CPU domain with 512 Gbytes of memory. This prolonged scrub time extends the amount of time needed to shut down a domain.

Workaround: Ensure that large memory configurations (more than 100 Gbytes) have at least one core.

`scadm` Command Can Hang Following an SC or SP Reset

Bug ID 15469227: The scadm command on a control domain running at least the Oracle Solaris 10 5/08 OS can hang following an SC reset. The system is unable to properly re-establish a connection following an SC reset.

Recovery: Reboot the host to re-establish connection with the SC.

Simultaneous Net Installation of Multiple Domains Fails When in a Common Console Group

Bug ID 15453968: Simultaneous net installation of multiple guest domains fails on systems that have a common console group.

Workaround: Perform simultaneous net installations only on guest domains that each have their own console group. This failure is seen only on domains with a common console group shared among multiple net-installing domains.

Guest Domain With Too Many Virtual Networks on the Same Network Using DHCP Can Become Unresponsive

Bug ID 15422900: If you configure more than four virtual networks (vnets) in a guest domain on the same network using the Dynamic Host Configuration Protocol (DHCP), the guest domain can eventually become unresponsive while running network traffic.

Workaround: Set ip_ire_min_bucket_cnt and ip_ire_max_bucket_cnt to larger values. For example, set both to 32 if you have 8 interfaces.

Recovery: Issue an ldm stop-domain ldom command followed by an ldm start-domain ldom command on the guest domain (ldom) in question.

OpenBoot PROM Variables Cannot be Modified by the `eeprom` Command When the Logical Domains Manager Is Running

Bug ID 15387338: This issue is summarized in Logical Domains Variable Persistence and affects only the control domain.

Cannot Set Security Keys With Logical Domains Running

Bug ID 15370442: The Logical Domains environment does not support setting or deleting wide-area network (WAN) boot keys from within the Oracle Solaris OS by using the ickey(1M) command. All ickey operations fail with the following error:

```
ickey: setkey: ioctl: I/O error
```

In addition, WAN boot keys that are set using OpenBoot firmware in logical domains other than the control domain are not remembered across reboots of the domain. In these domains, the keys set from the OpenBoot firmware are valid only for a single use.

Behavior of the `ldm stop-domain` Command Can Be Confusing

Bug ID 15368170: In some cases, the behavior of the ldm stop-domain command is confusing.

```
# ldm stop-domain -f ldom
```

If the domain is at the kernel module debugger, kmdb(1), prompt, then the ldm stop-domain command fails with the following error message:

```
LDom <domain-name> stop notification failed
```

Oracle® VM Server for SPARC 3.1.1.2, 3.1.1.1, 3.1.1, and 3.1 Release Notes