Logical Domains 1.3 Release Notes

Known Issues

This section contains general issues and specific bugs concerning the LDoms 1.3 software.

General Issues

This section describes general known issues about this release of LDoms software that are broader than a specific bug number. Workarounds are provided where available.

Service Processor and System Controller Are Interchangeable Terms

For discussions in Logical Domains documentation, the terms service processor (SP) and system controller (SC) are interchangeable.

Cards Not Supported

The following cards are not supported for this LDoms 1.3 software release:

Caution –

If these unsupported configurations are used with LDoms 1.3, stop and unbind all logical domains before the control domain is rebooted. Failure to do so can result in a system crash causing the loss of all the logical domains that are active in the system.

In Certain Conditions, a Guest Domain's SVM Configuration or Metadevices Can Be Lost

If a service domain is running a version of Solaris 10 OS prior to Solaris 10 10/09 and is exporting a physical disk slice as a virtual disk to a guest domain, then this virtual disk will appear in the guest domain with an inappropriate device ID. If that service domain is then upgraded to Solaris 10 10/09, the physical disk slice exported as a virtual disk will appear in the guest domain with no device ID.

This removal of the device ID of the virtual disk can cause problems for applications that attempt to reference the device ID of virtual disks. In particular, this can cause the Solaris Volume Manager (SVM) to be unable to find its configuration or to access its metadevices.

Workaround: After upgrading a service domain to Solaris 10 10/09, if a guest domain is unable to find its SVM configuration or its metadevices, execute the following procedure.

Procedure: Find a Guest Domain's SVM Configuration or Metadevices

  1. Boot the guest domain.

  2. Disable the devid feature of SVM by adding the following lines to the /kernel/drv/md.conf file:

  3. Reboot the guest domain.

    After the domain has booted, the SVM configuration and metadevices should be available.

  4. Check the SVM configuration and ensure that it is correct.

  5. Re-enable the SVM devid feature by removing from the /kernel/drv/md.conf file the two lines that you added in Step 2.

  6. Reboot the guest domain.

    During the reboot, you will see messages similar to this:

    NOTICE: mddb: unable to get devid for 'vdc', 0x10

    These messages are normal and do not report any problems.

Logical Domain Channels (LDCs) and Logical Domains

There is a limit to the number of LDCs available in any logical domain. For UltraSPARC T2 based platforms, that limit is 512. For UltraSPARC T2 Plus based platforms, that limit is 768. This only becomes an issue on the control domain because the control domain has at least part, if not all, of the I/O subsystem allocated to it. This might also be an issue because of the potentially large number of LDCs that are created for both virtual I/O data communications and the Logical Domains Manager control of the other logical domains.

If you try to add a service, or bind a domain, so that the number of LDCs exceeds the limit on the control domain, the operation fails with an error message similar to the following:

13 additional LDCs are required on guest primary to meet this request,
but only 9 LDCs are available

The following guidelines can help prevent creating a configuration that could overflow the LDC capabilities of the control domain:

  1. The control domain allocates 12 LDCs for various communication purposes with the hypervisor, Fault Management Architecture (FMA), and the system controller (SC), independent of the number of other logical domains configured.

  2. The control domain allocates 1 LDC to every logical domain, including itself, for control traffic.

  3. Each virtual I/O service on the control domain consumes 1 LDC for every connected client of that service.

For example, consider a control domain and 8 additional logical domains. Each logical domain needs the following at a minimum:

Applying the above guidelines yields the following results (numbers in parentheses correspond to the preceding guideline number from which the value was derived):

12(1) + 9(2) + 8 x 3(3)=45 LDCs in total.

Now consider the case where there are 45 domains instead of 8, and each domain includes 5 virtual disks, 5 virtual networks, and a virtual console. Now the equation becomes:

12 + 46 + 45 x 11=553 LDCs in total.

Depending upon the number of supported LDCs of your platform, the Logical Domains Manager will either accept or reject the configurations.
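The arithmetic in the two examples above can be reproduced with a short script. This is a minimal sketch, not part of the Logical Domains software: the function and constant names are hypothetical, and it assumes that every virtual device in a guest domain is a client of a service hosted on the control domain. Whether a configuration that lands exactly on the limit is accepted is also an assumption.

```python
# Illustrative sketch only; names are hypothetical, not an LDoms API.
# Per-platform LDC limits as stated in this section.
LDC_LIMITS = {"UltraSPARC T2": 512, "UltraSPARC T2 Plus": 768}


def control_domain_ldcs(guest_domains, services_per_guest):
    """Estimate the LDCs consumed on the control domain."""
    fixed = 12                                  # guideline 1: hypervisor, FMA, SC
    control = guest_domains + 1                 # guideline 2: one per domain, including itself
    vio = guest_domains * services_per_guest    # guideline 3: one per connected client
    return fixed + control + vio


for guests, services in ((8, 3), (45, 11)):
    total = control_domain_ldcs(guests, services)
    for platform, limit in LDC_LIMITS.items():
        # Assumption: a total equal to the limit is treated as fitting.
        verdict = "fits" if total <= limit else "exceeds the limit"
        print(f"{guests} guests x {services} services = {total} LDCs; "
              f"{platform} (limit {limit}): {verdict}")
```

With these inputs, the second configuration (553 LDCs) exceeds the UltraSPARC T2 limit but fits within the UltraSPARC T2 Plus limit.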

Memory Size Requirements

Logical Domains software does not impose a memory size limitation when creating a domain. The memory size requirement is a characteristic of the guest operating system. Some Logical Domains functionality might not work if the amount of memory present is less than the recommended size. For recommended and minimum size memory requirements, see the installation guide for the operating system you are using.

The OpenBoot™ PROM has a minimum size restriction for a domain. Currently, that restriction is 12 Mbytes. If you have a domain less than that size, the Logical Domains Manager will automatically boost the size of the domain to 12 Mbytes. Refer to the release notes for your system firmware for information about memory size requirements.

Booting a Large Number of Domains

You can boot the following number of domains depending on your platform:

If unallocated virtual CPUs are available, assign them to the service domain to help process the virtual I/O requests. Allocate 4 to 8 virtual CPUs to the service domain when creating more than 32 domains. In cases where maximum domain configurations have only a single CPU in the service domain, do not put unnecessary stress on the single CPU when configuring and using the domain. The virtual switch (vsw) services should be spread over all the network adapters available in the machine. For example, if booting 128 domains on a Sun SPARC Enterprise T5240 server, create 4 vsw services, each serving 32 virtual net (vnet) instances. Do not have more than 32 vnet instances per vsw service because having more than that tied to a single vsw could cause hard hangs in the service domain.

To run the maximum configurations, a machine needs the following amount of memory, depending on your platform, so that the guest domains contain an adequate amount of memory:

Memory and swap space usage increases in a guest domain when the vsw services used by the domain provide services to many virtual networks (in multiple domains). This is due to the peer-to-peer links between all the vnets connected to the vsw. The service domain benefits from having extra memory. Four Gbytes is the recommended minimum when running more than 64 domains. Start domains in groups of 10 or less and wait for them to boot before starting the next batch. The same advice applies to installing operating systems on domains.
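The sizing guidance in this section (no more than 32 vnet instances per vsw service, and starting domains in groups of 10 or less) can be sketched as a small planning helper. This is only an illustration: the function names and the domain names are hypothetical, and it assumes one vnet instance per domain.

```python
import math

# Values taken from the guidance in this section.
MAX_VNETS_PER_VSW = 32
BATCH_SIZE = 10


def vsw_services_needed(vnet_count):
    """Number of vsw services so that none serves more than 32 vnets."""
    return math.ceil(vnet_count / MAX_VNETS_PER_VSW)


def boot_batches(domains, size=BATCH_SIZE):
    """Split the domain list into groups to start one after another."""
    return [domains[i:i + size] for i in range(0, len(domains), size)]


# Hypothetical domain names; assumes one vnet instance per domain.
domains = [f"ldom{i}" for i in range(1, 129)]
print("vsw services needed:", vsw_services_needed(len(domains)))
print("boot batches:", len(boot_batches(domains)))
```

For 128 domains this yields 4 vsw services, which matches the Sun SPARC Enterprise T5240 example above.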

Cleanly Shutting Down and Power Cycling a Logical Domains System

If you have made any configuration changes since last saving a configuration to the SC, before you attempt to power off or power cycle a Logical Domains system, make sure that you save the latest configuration that you want to keep.

Procedure: Power Off a System With Multiple Active Domains

  1. Shut down and unbind all the non-I/O domains.

  2. Shut down and unbind any active I/O domains.

  3. Halt the primary domain.

    Because no other domains are bound, the firmware automatically powers off the system.

Procedure: Power Cycle the System

  1. Shut down and unbind all the non-I/O domains.

  2. Shut down and unbind any active I/O domains.

  3. Reboot the primary domain.

    Because no other domains are bound, the firmware automatically power cycles the system before rebooting it. When the system restarts, it boots into the Logical Domains configuration last saved or explicitly set.
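The ordering used by both procedures (non-I/O domains first, then I/O domains, then the primary domain) can be expressed as a simple sort. The sketch below is illustrative only: the role labels and domain names are hypothetical, and it does not call the Logical Domains Manager.

```python
# Hypothetical role labels; lower rank is shut down first.
SHUTDOWN_RANK = {"non-io": 0, "io": 1, "primary": 2}


def shutdown_order(domains):
    """Return domain names in the order they should be shut down.

    domains is a list of (name, role) pairs. sorted() is stable, so
    domains that share a role keep their original relative order.
    """
    ordered = sorted(domains, key=lambda d: SHUTDOWN_RANK[d[1]])
    return [name for name, _role in ordered]


# Hypothetical inventory of domains on one system.
inventory = [("primary", "primary"), ("iodom1", "io"),
             ("ldg1", "non-io"), ("ldg2", "non-io")]
print(shutdown_order(inventory))
```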

Memory Size Requested Might Be Different From Memory Allocated

Under certain circumstances, the Logical Domains Manager rounds up the requested memory allocation to either the next largest 8-Kbyte or 4-Mbyte multiple. This can be seen in the following example output of the ldm list-domain -l command, where the constraint value is smaller than the actual allocated size:

          Constraints: 1965 M
          raddr          paddr           size
          0x1000000      0x291000000     1968M
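The difference between the constraint and the allocated size in this output is consistent with rounding up to a 4-Mbyte multiple. The following sketch only reproduces that arithmetic. The function name is hypothetical, and the conditions under which the Logical Domains Manager chooses the 8-Kbyte or the 4-Mbyte multiple are not stated here, so the 4-Mbyte case is an assumption for this example.

```python
def round_up(value, multiple):
    """Round value up to the next multiple (unchanged if already aligned)."""
    return -(-value // multiple) * multiple


requested_mb = 1965                         # the "Constraints" value in the output
allocated_mb = round_up(requested_mb, 4)    # assumes the 4-Mbyte multiple applies
print(f"requested {requested_mb}M, allocated {allocated_mb}M")
```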

Logical Domain Variable Persistence

Variable updates persist across a reboot, but not across a power cycle, unless the variable updates are either initiated from OpenBoot firmware on the control domain or followed by saving the configuration to the SC.

In this context, it is important to note that a reboot of the control domain could initiate a power cycle of the system:

LDom variables for a domain can be specified using any of the following methods:

The goal is that variable updates made by using any of these methods always persist across reboots of the domain. The variable updates should also always be reflected in any logical domain configurations that are subsequently saved to the SC.

In LDoms 1.3 software, there are a few cases where variable updates do not persist as expected:

If you are concerned about Logical Domains variable changes, do one of the following:

The following Bug IDs have been filed to resolve these issues: 6520041, 6540368, 6540937, and 6590259.

Sun SNMP Management Agent Does Not Support Multiple Domains

The Sun Simple Network Management Protocol (SNMP) Management Agent does not support multiple domains. Only a single global domain is supported.

Containers, Processor Sets, and Pools Are Not Compatible With CPU Power Management

Using CPU dynamic reconfiguration (DR) to power down virtual CPUs does not work with processor sets, resource pools, or the zone's dedicated CPU feature.

When using CPU power management in elastic mode, the Solaris OS guest sees only the CPUs that are allocated to the domains that are powered on. That means that output from the psrinfo(1M) command dynamically changes depending on the number of CPUs currently power-managed. This causes an issue with processor sets and pools, which require actual CPU IDs to be static to allow allocation to their sets. This can also impact the zone's dedicated CPU feature.

Workaround: Set the performance mode for the power management policy.

Fault Management

There are several issues associated with FMA and power-managing CPUs. If a CPU faults when running in elastic mode, switch to performance mode until the faulted CPU recovers. If all faulted CPUs recover, then elastic mode can be used again.

For more information about faulted resources, see the OpenSolaris Fault Management web page.

Delayed Reconfiguration

When a primary domain is in a delayed reconfiguration state, CPUs are power managed only after the primary domain reboots. This means that CPU power management will not bring additional CPUs online when the domain is experiencing high-load usage until the primary domain reboots, clearing the delayed reconfiguration state.

Domain Migration in Elastic Mode Is Not Supported

Domain migrations are not supported for a source or target machine in elastic mode. If a migration is underway while in performance mode and the power management policy is set to elastic mode, the policy switch is deferred until the migration completes. The migration command returns an error if either the source or target machine is in elastic mode and a domain migration is attempted.

Cryptographic Units

The Solaris 10 10/09 OS introduces the capability to dynamically add and remove cryptographic units from a domain, which is called cryptographic unit dynamic reconfiguration (DR). The Logical Domains Manager automatically detects whether a domain allows cryptographic unit DR, and enables the functionality only for those domains. In addition, CPU DR is no longer disabled in domains that have cryptographic units bound and are running an appropriate version of the Solaris OS.

However, the restriction against enabling power management (PM) elastic mode still applies if there are any domains with cryptographic units bound. This restriction is enforced by the Logical Domains Manager. So, any attempt to change power management's policy to elastic from the SP when there are cryptographic units bound to any domain would be ineffective and is therefore not recommended.

Bugs Affecting LDoms 1.3 Software

This section summarizes the bugs that you might encounter when using this version of the software. The bug descriptions are in numerical order by bug ID. If a workaround and a recovery procedure are available, they are specified.

Logical Domains Manager Does Not Validate Disk Paths and Network Devices

Bug ID 6447740: The Logical Domains Manager does not validate disk paths and network devices.

Disk Paths

If a disk device listed in a guest domain's configuration is either non-existent or otherwise unusable, the disk cannot be used by the virtual disk server (vds). However, the Logical Domains Manager does not emit any warning or error when the domain is bound or started.

When the guest tries to boot, messages similar to the following are printed on the guest's console:

WARNING: /virtual-devices@100/channel-devices@200/disk@0: Timeout
connecting to virtual disk server... retrying

In addition, if a network interface specified using the net-dev= property does not exist or is otherwise unusable, the virtual switch is unable to communicate outside the physical machine, but the Logical Domains Manager does not emit any warning or error when the domain is bound or started.

Procedure: Recover From an Errant net-dev Property Specified for a Virtual Switch

  1. Issue the ldm set-vsw command with the corrected net-dev property value.

  2. Reboot the domain hosting the virtual switch in question.

Procedure: Recover From an Errant Virtual Disk Service Device or Volume

  1. Stop the domain owning the virtual disk bound to the errant device or volume.

  2. Issue the ldm rm-vdsdev command to remove the errant virtual disk service device.

  3. Issue the ldm add-vdsdev command to correct the physical path to the volume.

  4. Restart the domain owning the virtual disk.

Network Devices

If a disk device listed in a guest domain's configuration is being used by software other than the Logical Domains Manager (for example, if it is mounted in the service domain), the disk cannot be used by the virtual disk server (vds), but the Logical Domains Manager does not emit a warning that it is in use when the domain is bound or started.

When the guest domain tries to boot, a message similar to the following is printed on the guest's console:

WARNING: /virtual-devices@100/channel-devices@200/disk@0: Timeout
connecting to virtual disk server... retrying

Procedure: Recover From a Disk Device Being Used by Other Software

  1. Unbind the guest domain.

  2. Unmount the disk device to make it available.

  3. Bind the guest domain.

  4. Boot the domain.

Hang Can Occur With Guest OS in Simultaneous Operations

Bug ID 6497796: Under rare circumstances, when a Logical Domains variable, such as boot-device, is being updated from within a guest domain by using the eeprom(1M) command at the same time that the Logical Domains Manager is being used to add or remove virtual CPUs from the same domain, the guest OS can hang.

Workaround: Ensure that these two operations are not performed simultaneously.

Recovery: Use the ldm stop-domain and ldm start-domain commands to stop and start the guest OS.

Behavior of the ldm stop-domain Command Can Be Confusing

Bug ID 6506494: There are some cases where the behavior of the ldm stop-domain command is confusing.

# ldm stop-domain -f ldom

If the domain is at the kernel module debugger, kmdb(1), prompt, then the ldm stop-domain command fails with the following error message:

LDom <domain name> stop notification failed

Cannot Set Security Keys With Logical Domains Running

Bug ID 6510214: In a Logical Domains environment, there is no support for setting or deleting wide-area network (WAN) boot keys from within the Solaris OS by using the ickey(1M) command. All ickey operations fail with the following error:

ickey: setkey: ioctl: I/O error

In addition, WAN boot keys that are set using OpenBoot firmware in logical domains other than the control domain are not remembered across reboots of the domain. In these domains, the keys set from the OpenBoot firmware are only valid for a single use.

Logical Domains Manager Forgets Variable Changes After a Power Cycle

Bug ID 6590259: This issue is summarized in Logical Domain Variable Persistence.

Using the server-secure.driver With an NIS Enabled System, Whether or Not LDoms Is Enabled

Bug ID 6533696: On a system configured to use the Network Information Service (NIS) or NIS+ naming service, if the Solaris Security Toolkit software is applied with the server-secure.driver, NIS or NIS+ fails to contact external servers. A symptom of this problem is that the ypwhich(1) command (which returns the name of the NIS or NIS+ server or map master) fails with a message similar to the following:

Domain atlas some.atlas.name.com not bound on nis-server-1.c

The recommended Solaris Security Toolkit driver to use with the Logical Domains Manager is ldm_control-secure.driver, and NIS and NIS+ work with this recommended driver.

If you are using NIS as your naming service, you cannot use the Solaris Security Toolkit profile server-secure.driver because you might encounter Solaris OS Bug ID 6557663, IP Filter causes panic when using ipnat.conf. However, the default Solaris Security Toolkit driver, ldm_control-secure.driver, is compatible with NIS.

Procedure: Recover by Resetting Your System

  1. Log in to the system controller by using the ssh command.

  2. Power off the system.

    -> stop /SYS

  3. Power on the system.

    -> start /SYS

  4. Log in to the system console.

    -> start /SP/console

  5. Boot the system.

    ok boot -s

  6. Edit the file /etc/shadow.

    Change the root entry of the shadow file to the following:

  7. Log in to the system and do one of the following:

    • Add file /etc/ipf/ipnat.conf.

    • Undo the Solaris Security Toolkit, and apply another driver.

    # /opt/SUNWjass/bin/jass-execute -ui
    # /opt/SUNWjass/bin/jass-execute -a ldm_control-secure.driver

Network Performance Is Worse in a Logical Domain Guest Than in a Non-LDoms Configuration

Bug ID 6486234: The virtual networking infrastructure adds additional overhead to communications from a logical domain. All packets are sent through a virtual network device, which, in turn, passes the packets to the virtual switch. The virtual switch then sends the packets out through the physical device. The lower performance is seen due to the inherent overheads of the stack.

Workaround: For T2 platforms, you can assign a Network Interface Unit (NIU) to the logical domain.

Logical Domain Time-of-Day Changes Do Not Persist Across a Power Cycle of the Host

Bug ID 6590259: If the time or date on a logical domain is modified, for example using the ntpdate command, the change persists across reboots of the domain but not across a power cycle of the host.

Workaround: For time changes to persist, save the configuration with the time change to the SC and boot from that configuration.

OpenBoot PROM Variables Cannot be Modified by the eeprom(1M) Command When the Logical Domains Manager is Running

Bug ID 6540368: This issue is summarized in Logical Domain Variable Persistence and affects only the control domain.

Logical Domains Manager Does Not Retire Resources On Guest Domain After a Panic and Reboot

Bug ID 6591844: If a CPU or memory fault occurs, the affected domain might panic and reboot. If the Fault Management Architecture (FMA) attempts to retire the faulted component while the domain is rebooting, the Logical Domains Manager is not able to communicate with the domain, and the retire fails. In this case, the fmadm faulty command lists the resource as degraded.

Recovery: Wait for the domain to complete rebooting, and then force FMA to replay the fault event by restarting the fault manager daemon (fmd) on the control domain by using this command:

primary# svcadm restart fmd

Guest Domain With Too Many Virtual Networks on the Same Network Using DHCP Can Become Unresponsive

Bug ID 6603974: If you configure more than four virtual networks (vnets) in a guest domain on the same network using the Dynamic Host Configuration Protocol (DHCP), the guest domain can eventually become unresponsive while running network traffic.

Workaround: Set ip_ire_min_bucket_cnt and ip_ire_max_bucket_cnt to larger values, such as 32, if you have 8 interfaces.

Recovery: Issue an ldm stop-domain ldom command followed by an ldm start-domain ldom command on the guest domain (ldom) in question.

The scadm Command Can Hang Following an SC or SP Reset

Bug ID 6629230: The scadm command on a control domain running at least the Solaris 10 11/06 OS can hang following an SC reset. The system is unable to properly reestablish a connection following an SC reset.

Workaround: Reboot the host to reestablish connection with the SC.

Recovery: Reboot the host to reestablish connection with the SC.

Simultaneous Net-Installation of Multiple Domains Fails When in a Common Console Group

Bug ID 6656033: Simultaneous net installation of multiple guest domains fails on Sun SPARC Enterprise T5140 and Sun SPARC Enterprise T5240 systems that have a common console group.

Workaround: Net-install only on guest domains that each have their own console group. This failure is seen only on domains with a common console group shared among multiple net-installing domains.

SVM Volumes Built on Slice 2 Fail JumpStart When Used as the Boot Device in a Guest Domain

Bug ID 6687634: If the Solaris Volume Manager (SVM) volume is built on top of a disk slice that contains block 0 of the disk, then SVM prevents writing to block 0 of the volume to avoid overwriting the label of the disk.

If an SVM volume built on top of a disk slice that contains block 0 of the disk is exported as a full virtual disk, then a guest domain is unable to write a disk label for that virtual disk, and this prevents the Solaris OS from being installed on such a disk.

Workaround: SVM volumes exported as a virtual disk should not be built on top of a disk slice that contains block 0 of the disk.

A more generic guideline is that slices that start on the first block (block 0) of a physical disk should not be exported (either directly or indirectly) as a virtual disk. Refer to Directly or Indirectly Exporting a Disk Slice in Logical Domains 1.3 Administration Guide.

If the Solaris 10 5/08 OS Is Installed on a Service Domain, Attempting a Net Boot of the Solaris 10 8/07 OS on Any Guest Domain Serviced by It Can Hang the Installation

Bug ID 6705823: Attempting a net boot of the Solaris 10 8/07 OS on any guest domain serviced by a service domain running the Solaris 10 5/08 OS can result in a hang on the guest domain during the installation.

Workaround: Patch the miniroot of the Solaris 10 8/07 OS net install image with Patch ID 127111-05.

Logical Domains Manager Can Take Over 15 Minutes to Shut Down a Logical Domain

Bug ID 6742805: A domain shutdown or memory scrub can take over 15 minutes with a single CPU and a very large memory configuration. During a shutdown, the CPUs in a domain are used to scrub all the memory owned by the domain. The time taken to complete the scrub can be quite long if a configuration is imbalanced, for example, a single CPU domain with 512 Gbytes of memory. This prolonged scrub time extends the amount of time it takes to shut down a domain.

Workaround: Ensure that large memory configurations (>100 Gbytes) have at least one core. This results in a much faster shutdown time.

Sometimes, Executing the uadmin 1 0 Command From an LDoms System Does Not Return the System to the OK Prompt

Bug ID 6753683: Sometimes, executing the uadmin 1 0 command from the command line of an LDoms system does not leave the system at the OK prompt after the subsequent reset. This incorrect behavior is seen only when the LDoms variable auto-reboot? is set to true. If auto-reboot? is set to false, the expected behavior occurs.

Workaround: Use this command instead:

uadmin 2 0

Or, always run with auto-reboot? set to false.

Logical Domains Manager Displays Migrated Domains in Transition States When They Are Already Booted

Bug ID 6760933: On occasion, an active logical domain appears to be in the transition state instead of the normal state long after it is booted or following the completion of a domain migration. This glitch is harmless, and the domain is fully operational. To see what flag is set, check the flags field in the ldm list -l -p command output, or check the FLAGS field in the ldm list command, which shows -n---- for normal or -t---- for transition.

Recovery: The logical domain should display the correct state upon the next reboot.
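As an illustration of reading the FLAGS field described above, the following sketch maps the two values quoted in this section (-n---- and -t----) to a state name. The function name is hypothetical, and the assumption that the second character alone carries the normal or transition indicator is inferred from those two examples only.

```python
# Only the two flag values quoted in this section are mapped.
STATE_BY_FLAG = {"n": "normal", "t": "transition"}


def domain_state(flags):
    """Return the state indicated by a FLAGS string such as '-n----'."""
    # Assumption: the second character is the state indicator.
    return STATE_BY_FLAG.get(flags[1:2], "unknown")


for flags in ("-n----", "-t----"):
    print(flags, "->", domain_state(flags))
```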

Logical Domains Manager Does Not Start If the Machine Is Not Networked and an NIS Client Is Running

Bug ID 6764613: If you do not have a network configured on your machine and have a Network Information Services (NIS) client running, the Logical Domains Manager will not start on your system.

Workaround: Disable the NIS client on your non-networked machine:

# svcadm disable nis/client

Migration Can Fail to Bind Memory Even If the Target Has Enough Available

Bug ID 6772089: In certain situations, a migration fails and ldmd reports that it was not possible to bind the memory needed for the source domain. This can occur even if the total amount of available memory on the target machine is greater than the amount of memory being used by the source domain.

This failure occurs because migrating the specific memory ranges in use by the source domain requires that compatible memory ranges are available on the target, as well. When no such compatible memory range is found for any memory range in the source, the migration cannot proceed.

Recovery: If this condition is encountered, you might be able to migrate the domain if you modify the memory usage on the target machine. To do this, unbind any bound or active logical domain on the target.

Use the ldm list-devices -a mem command to see what memory is available and how it is used. You might also need to reduce the amount of memory that is assigned to another domain.

Migration Does Not Fail If a vdsdev on the Target Has a Different Backend

Bug ID 6772120: If the virtual disk on the target machine does not point to the same disk backend that is used on the source machine, the migrated domain cannot access the virtual disk using that disk backend. A hang can result when accessing the virtual disk on the domain.

Currently, the Logical Domains Manager checks only that the virtual disk volume names match on the source and target machines. In this scenario, no error message is displayed if the disk backends do not match.

Workaround: When you configure the target domain to receive a migrated domain, ensure that the disk volume (vdsdev) matches the disk backend used on the source domain.

Recovery: Do one of the following if you discover that the virtual disk device on the target machine points to the incorrect disk backend:

Constraint Database Is Not Synchronized to Saved Configuration

Bug ID 6773569: After switching from one configuration to another (using the ldm set-config command followed by a power cycle), domains defined in the previous configuration might still be present in the current configuration, in the inactive state.

This is a result of the Logical Domains Manager's constraint database not being kept in sync with the change in configuration. These inactive domains do not affect the running configuration and can be safely destroyed.

Explicit Console Group and Port Bindings Are Not Migrated

Bug ID 6781589: During a migration, any explicitly assigned console group and port are ignored, and a console with default properties is created for the target domain. This console is created using the target domain name as the console group and using any available port on the first virtual console concentrator (vcc) device in the control domain. If there is a conflict with the default group name, the migration fails.

Recovery: To restore the explicit console properties following a migration, unbind the target domain, and manually set the desired properties using the ldm set-vcons command.

VIO DR Operations Ignore the Force (-f) Option

Bug ID 6703127: Virtual input/output (VIO) dynamic reconfiguration (DR) operations ignore the -f (force) option in CLI commands.

libpiclsnmp:snmp_init() Blocks Indefinitely in open() on primary Domain

Bug ID 6736962: Power Management sometimes fails to retrieve policy from the service processor on LDoms startup after the control domain boots. If CPU power management could not retrieve the power management policy from the service processor, it allows LDoms to start up as expected, but logs the error Unable to get the initial PM Policy - timeout to the LDoms log and remains in performance mode.

Workaround: Add the line forceload: drv/ds_snmp to the /etc/system file, and then reboot the control domain.

FMA Status Failures

Bug ID 6759853: The following error message might be written intermittently to the LDoms log when a domain is at the ok prompt:

fma_cpu_svc_get_p_status: Can't find fma_cpu_get_p_status routine error

Workaround: Boot the domain.

ldmconfig Might Cause the Root File System of the Control Domain to Become Full and Halt the System

Bug ID 6848114: ldmconfig can run on a system that does not have file systems of sufficient capacity to contain the virtual disks for the created domains. In this situation, an error message is issued. However, ldmconfig permits you to continue to use the disks that are in /ldoms/disks to deploy the configuration. This situation could cause the root file system of the control domain to become full and halt the system.

Workaround: Do the following:

  1. Exit the Configuration Assistant by typing q or by typing Ctrl-C.

  2. Add more file systems of adequate capacity.

  3. Rerun the ldmconfig command.

Guest Domain Sometimes Fails to Make Proper Domain Services Connection to the Control Domain

Bug ID 6839787: Sometimes, a guest domain that runs at least the Solaris 10 10/08 OS does not make a proper Domain Services connection to a control domain that runs the Solaris 10 5/09 OS.

Domain Services connections enable features such as dynamic reconfiguration (DR), FMA, and power management (PM). Such a failure occurs when the guest domain is booted, so rebooting the domain usually clears the problem.

Workaround: Reboot the guest domain.

ldm: Autosave Feature Should Identify and Allow the Downloading of Damaged Configurations

Bug ID 6840800: An otherwise usable corrupted or damaged autosave configuration cannot be downloaded.

Workaround: Use another, undamaged autosave configuration or SP configuration.

ldmd Dumps Core If a rm-io Operation Is Followed by Multiple set-vcpu Operations

Bug ID 6697096: Under certain circumstances, when an ldm rm-io operation is followed by multiple ldm set-vcpu operations, ldmd might abort and be restarted by SMF.

Workaround: After executing an ldm rm-io operation on a domain, take care when attempting an ldm set-vcpu operation. A single ldm set-vcpu operation will succeed, but a second ldm set-vcpu operation might cause the ldmd daemon to dump core under certain circumstances. Instead, reboot the domain before attempting the second set-vcpu operation.

Virtual Network Devices Are Not Created Properly on the Control Domain

Bug ID 6836587: Sometimes ifconfig indicates that the device does not exist after you add a virtual network or virtual disk device to a domain. This situation might occur as the result of the /devices entry not being created.

Although this should not occur during normal operation, the error was seen when the instance number of a virtual network device did not match the instance number listed in the /etc/path_to_inst file.

For example:

# ifconfig vnet0 plumb
ifconfig: plumb: vnet0: no such interface

The instance number of a virtual device is shown under the DEVICE column in the ldm list output:

# ldm list -o network primary


    primary-vsw0 00:14:4f:f9:86:f3 nxge0   switch@0 1               1        1500        

    NAME   SERVICE              DEVICE    MAC               MODE PVID VID MTU  
    vnet1  primary-vsw0@primary network@0 00:14:4f:f8:76:6d      1        1500

The instance number (0 for both the vnet and vsw shown previously) can be compared with the instance number in the path_to_inst file to ensure that they match.

# egrep '(vnet|vsw)' /etc/path_to_inst
"/virtual-devices@100/channel-devices@200/virtual-network-switch@0" 0 "vsw"
"/virtual-devices@100/channel-devices@200/network@0" 0 "vnet"

Workaround: If the instance numbers do not match, remove the virtual network or virtual switch device. Then, add the device again and explicitly specify the required instance number by setting the id property.

You can also manually edit the /etc/path_to_inst file. See the path_to_inst(4) man page.

Caution –

Be aware of the warning contained in the man page that states “changes should not be made to /etc/path_to_inst without careful consideration.”
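
The instance-number comparison described for this bug can be sketched as a read-only check. The following Python is only an illustration, not part of the LDoms software: the function name find_instance_mismatches is hypothetical, and the sketch assumes that the unit address at the end of each vnet or vsw device path (the number after the final @) is written in the same form as the instance number, as in the egrep output shown earlier. It does not modify /etc/path_to_inst.

```python
import re

# One /etc/path_to_inst entry: "physical-path" instance-number "driver"
ENTRY = re.compile(r'^"([^"]+)"\s+(\d+)\s+"([^"]+)"$')


def find_instance_mismatches(path_to_inst_text, drivers=("vnet", "vsw")):
    """Return (driver, path, unit, instance) for each entry whose unit address
    (the text after the last '@' in the path) differs from its instance number.

    Hypothetical helper for illustration only.
    """
    mismatches = []
    for line in path_to_inst_text.splitlines():
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # comment, blank line, or unexpected format
        path = match.group(1)
        instance = int(match.group(2))
        driver = match.group(3)
        if driver not in drivers:
            continue
        unit = path.rsplit("@", 1)[-1]
        if unit != str(instance):
            mismatches.append((driver, path, unit, instance))
    return mismatches


if __name__ == "__main__":
    # Sample text copied from the egrep output in this section.
    sample = (
        '"/virtual-devices@100/channel-devices@200/virtual-network-switch@0" 0 "vsw"\n'
        '"/virtual-devices@100/channel-devices@200/network@0" 0 "vnet"\n'
    )
    print(find_instance_mismatches(sample))
```

An empty list means that no mismatch was found in the text that was checked. Any entry that is reported is a candidate for the workaround above.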

Cannot Connect to Migrated Domain's Console Unless vntsd Is Restarted

Bug ID 6757486: Occasionally, after a domain has been migrated, it is not possible to connect to the console for that domain.

Workaround: Restart the vntsd SMF service to enable connections to the console:

# svcadm restart vntsd

Note –

This command will disconnect all active console connections.

I/O Domain or Guest Domain Panics When Booting From e1000g

Bug ID 6808832: You can configure a maximum of two domains with dedicated PCI-E root complexes on systems such as the Sun Fire T5240. These systems have two UltraSPARC T2+ CPUs and two I/O root complexes.

pci@500 and pci@400 are the two root complexes in the system. The primary domain will always contain at least one root complex. A second domain can be configured with an unassigned or unbound root complex.

The pci@400 fabric (or leaf) contains the onboard e1000g network card. The following circumstances could lead to a domain panic:

Avoid the following network devices if they are configured in a non-primary domain:


When these conditions are true, the domain will panic with a PCI-E Fatal error.

Avoid such a configuration, or if the configuration is used, do not boot from the listed devices.

Guest Domain Might Fail to Successfully Reboot When a System Is in Power Management Elastic Mode

Bug ID 6853273: While a system is in power management elastic mode, rebooting a guest domain might produce the following warning messages and fail to boot successfully:

WARNING: /virtual-devices@100/channel-devices@200/disk@0:
Sending packet to LDC, status: -1
WARNING: /virtual-devices@100/channel-devices@200/disk@0:
Can't send vdisk read request!
WARNING: /virtual-devices@100/channel-devices@200/disk@0:
Timeout receiving packet from LDC ... retrying
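
To check saved console output for these warnings, a small scan such as the following might help. This is a sketch only: the function name vdisk_warnings is hypothetical, the message strings are copied from the warnings shown above, and how the console output is captured is left as an assumption.

```python
# Message fragments copied from the warnings shown in this section.
VDISK_WARNINGS = (
    "Sending packet to LDC, status: -1",
    "Can't send vdisk read request!",
    "Timeout receiving packet from LDC ... retrying",
)


def vdisk_warnings(console_text):
    """Return the known warning fragments that occur in console_text.

    Hypothetical helper for illustration only.
    """
    return [msg for msg in VDISK_WARNINGS if msg in console_text]


if __name__ == "__main__":
    captured = (
        "WARNING: /virtual-devices@100/channel-devices@200/disk@0:\n"
        "Sending packet to LDC, status: -1\n"
    )
    for msg in vdisk_warnings(captured):
        print("seen:", msg)
```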

Workaround: If you see these warnings, perform one of the workarounds in the following order:

ldm Commands Are Slow to Respond When Several Domains Are Booting

Bug ID 6855079: An ldm command might be slow to respond when several domains are booting. If you issue an ldm command at this stage, the command might appear to hang. Note that the ldm command will return after performing the expected task. After the command returns, the system should respond normally to ldm commands.

Workaround: Avoid booting many domains simultaneously. However, if you must boot several domains at once, refrain from issuing further ldm commands until the system returns to normal. For instance, wait for about two minutes on Sun SPARC Enterprise T5140 and T5240 Servers and for about four minutes on the Sun SPARC Enterprise T5440 Server or Netra T5440 Server.

Spurious ds_ldc_cb: LDC READ event Message Seen When Rebooting the Control Domain or a Guest Domain

Bug ID 6846889: When rebooting the control domain or a guest domain, the following warning message might be logged on the control domain and on the guest domain that is rebooting:

WARNING: ds@0: ds_ldc_cb: LDC READ event while port not up

Workaround: You can ignore this message.

Migrated Domain With MAUs Contains Only One CPU When Target OS Does Not Support DR of Cryptographic Units

Bug ID 6904849: With the release of Logical Domains 1.3, a domain can be migrated even if it has one or more cryptographic units bound to it.

In the following circumstances, the target machine will only contain one CPU after the migration is completed:

After the migration completes, the target domain resumes successfully and is operational, but it runs in a degraded state with only one CPU.

Workaround: Prior to migration, remove the cryptographic unit or units from the source machine that runs Logical Domains 1.3.

Mitigation: To avoid this issue, perform one or both of these steps:

ldm set-vcc Prevented When System Is Already in Delayed Reconfiguration Mode

Bug ID 6852143: When you make an ldm set-vcc request while a delayed reconfiguration is in progress, the request is rejected. Normally, this request should be added to the delayed reconfiguration.

Workaround: Do one of the following:

Confusing Migration Failure Message for Real Address Memory Bind Failures

Bug ID 6904240: In certain situations, a migration fails with the following error message, and ldmd reports that it was not possible to bind the memory needed for the source domain. This situation can occur even if the total amount of available memory on the target machine is greater than the amount of memory being used by the source domain (as shown by ldm ls-devices -a mem).

Unable to bind 29952M memory region at real address 0x8000000
Domain Migration of LDom ldg0 failed

Cause: This failure is due to the inability to meet congruence requirements between the Real Address (RA) and the Physical Address (PA) on the target machine.

Workaround: Stop the domain and perform the migration as a cold migration. You can also reduce the size of the memory on the guest domain by 128 Mbytes, which might permit the migration to proceed while the domain is running.

An Enabled Resource Management Policy Might Not Decrease CPU Count As Expected

Bug ID 6908985: The following problem occurs on large domains that have more than 99 virtual CPUs. When utilization is less than the low water mark, dynamic resource management (DRM) will not release virtual CPUs if the total number of virtual CPUs is greater than 99.

ldm Does Not Make Use of Cryptographic Units Added to the Control Domain After Startup

Bug ID 6880106: When the Logical Domains Manager (ldm) starts up, it initializes any cryptographic units that are present in the control domain for use in securing migration operations. If cryptographic units are added to the control domain at any point after ldm starts, those cryptographic units are not used for migration. This situation might increase the time it takes for a migration operation to complete.

Workaround: Restart ldmd to force it to reinitialize its use of the cryptographic units in the primary domain. Run the following command:

# svcadm restart ldmd

Warm Migration Can Fail With an Unknown migration failure Message

Bug ID 6904238: In rare circumstances, you might see a warm migration operation fail with the following message:

Unknown migration failure

The ldmd log file on the source machine shows the following message:

warning: Failed to read feasibility response type (9) from
target LDoms Manager

This failure can occur when there is a problem migrating the runtime state of the logical domain channels of the guest. This problem has occurred when the migrating domain has an unplumbed virtual network interface or has a sparse memory configuration.

Workaround: If the migrating domain has one or more unplumbed virtual network interfaces, plumb them. If the problem persists, shut down the domain and perform a cold migration.

DRM: Some tod-begin and tod-end Values Cannot Begin With a Leading Zero

Bug ID 6909998: You cannot supply values of 08 or 09 for hours, minutes, or seconds for the tod-begin and tod-end properties. The 08 and 09 values are considered to be invalid octal values.

Workaround: Specify 8 instead of 08 and 9 instead of 09.

For example, to set the beginning of the day to 08:09:01, specify the value of tod-begin as follows:

# ldm set-policy tod-begin=8:9:01 name=drm_policy primary
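
If tod-begin and tod-end values are generated by a script, the workaround can be applied before the value is passed to ldm set-policy. The following Python is a minimal sketch; the function name normalize_tod is hypothetical. It rewrites only the 08 and 09 fields and leaves other fields, such as 01, unchanged, which matches the example above.

```python
def normalize_tod(value):
    """Rewrite 08 and 09 fields of an hh:mm:ss value as 8 and 9.

    Hypothetical helper for illustration only. Other fields are left
    as they are.
    """
    replacements = {"08": "8", "09": "9"}
    return ":".join(replacements.get(field, field) for field in value.split(":"))


if __name__ == "__main__":
    print(normalize_tod("08:09:01"))  # prints 8:9:01
```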

Newly Added NIU/XAUI Adapters Are Not Visible to Host OS If Logical Domains Is Configured

Bug ID 6829016: When Logical Domains is configured on a system and you add another XAUI network card, the card is not visible after the machine is power cycled.

Recovery: To make the newly added XAUI visible in the control domain, perform the following steps:

  1. Set and clear a dummy variable in the control domain.

    The following commands use a dummy variable called fix-xaui:

    # ldm set-var fix-xaui=yes primary
    # ldm rm-var fix-xaui primary

  2. Save the modified configuration to the SP, replacing the current configuration.

    The following commands use a configuration name of config1:

    # ldm rm-spconfig config1
    # ldm add-spconfig config1

  3. Perform a reconfiguration reboot of the control domain.

    # reboot -- -r

    At this time, you can configure the newly available network or networks for use by Logical Domains.
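
The recovery steps above can be collected into an ordered command list for review before anything is run. The following Python is only a sketch: the function name xaui_recovery_commands is hypothetical, the commands are the ones shown in the steps, and the script prints them rather than executing them.

```python
def xaui_recovery_commands(config_name, dummy_var="fix-xaui"):
    """Return the recovery commands from the steps above, in order.

    Hypothetical helper for illustration only. config_name is the name
    of the SP configuration to replace, such as config1.
    """
    return [
        # Step 1: set and clear a dummy variable in the control domain.
        ["ldm", "set-var", dummy_var + "=yes", "primary"],
        ["ldm", "rm-var", dummy_var, "primary"],
        # Step 2: replace the saved SP configuration.
        ["ldm", "rm-spconfig", config_name],
        ["ldm", "add-spconfig", config_name],
        # Step 3: reconfiguration reboot of the control domain.
        ["reboot", "--", "-r"],
    ]


if __name__ == "__main__":
    for command in xaui_recovery_commands("config1"):
        print("# " + " ".join(command))
```

Printing the list first makes it easy to confirm the configuration name before the current SP configuration is removed.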

Dynamically Removing All the Cryptographic Units From a Domain Causes SSH to Terminate

Bug ID 6897743: If all the hardware cryptographic units are dynamically removed from a running domain, the cryptographic framework fails to seamlessly switch to the software cryptographic providers, and kills all the ssh connections.

Recovery: Re-establish the ssh connections after all the cryptographic units are removed from the domain.

Workaround: Set UseOpenSSLEngine=no in the /etc/ssh/sshd_config file on the server side, and run the svcadm restart ssh command.

After this change, ssh connections no longer use the hardware cryptographic units (and thus do not benefit from the associated performance improvements), and ssh connections are not disconnected when the cryptographic units are removed.
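
The configuration change in the workaround can be sketched as a text transformation on the contents of /etc/ssh/sshd_config. The following Python is an illustration only: the function name disable_openssl_engine is hypothetical, and the setting is written exactly as it appears in the workaround (UseOpenSSLEngine=no). Check the sshd_config(4) man page for the syntax that your release accepts. The sketch does not write the file or restart the ssh service.

```python
def disable_openssl_engine(config_text):
    """Return sshd_config text with a single UseOpenSSLEngine=no line.

    Hypothetical helper for illustration only. Existing uncommented
    UseOpenSSLEngine lines are replaced; commented lines are kept.
    """
    setting = "UseOpenSSLEngine=no"  # form taken from the workaround text
    output = []
    replaced = False
    for line in config_text.splitlines():
        stripped = line.strip()
        keyword = ""
        if stripped and not stripped.startswith("#"):
            # The keyword ends at the first whitespace or '=' character.
            parts = stripped.replace("=", " ", 1).split()
            keyword = parts[0].lower() if parts else ""
        if keyword == "useopensslengine":
            if not replaced:
                output.append(setting)
                replaced = True
            continue  # drop any further duplicates
        output.append(line)
    if not replaced:
        output.append(setting)
    return "\n".join(output) + "\n"


if __name__ == "__main__":
    print(disable_openssl_engine("Port 22\nUseOpenSSLEngine yes\n"), end="")
```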

Unexpected Probe-Based IPMP Failures

Bug ID 6888928: If you use probe-based IPMP, the interfaces in the IPMP group might fail abruptly. This might happen shortly after you configure test addresses on the interfaces in the IPMP group for probe-based failure detection. The problem can occur with any network interface, virtual or physical, including Logical Domains virtual network devices.

Workaround: If you still want to use probe-based failure detection, install patch ID 142900-03. Or, you can disable probe-based failure detection by removing the test addresses, and use link-based failure detection. For more information, see IPMP Link-Based Only Failure Detection with Solaris 10 on SunSolve.

Starting with the Logical Domains 1.3 release, you can configure link-based IPMP for virtual network devices. See Using Link-Based IPMP in Logical Domains Virtual Networking in Logical Domains 1.3 Administration Guide.

Documentation Errata

This section contains documentation errors that have been found too late to resolve for the LDoms 1.3 release.

Incorrect Parameter Names in the Input/Output Bus Table

Bug ID 6843196: “Input/Output Bus Table (IOBusTable)” on page 31 of the Logical Domains (LDoms) MIB 1.0.1 Administration Guide shows incorrect parameter names.

IOBusDevName should be IOBusName, and IOBusDevPath should be IOBusPath.

ldmp2v(1M): Incorrect Options Shown for the ldmp2v prepare -R guest-root Command

The ldmp2v prepare -R guest-root command does not support the -m mountpoint:size, -x no-auto-adjust-fs, and -x remove-unused-slices options.