System Administration Guide: Oracle Solaris Containers-Resource Management and Oracle Solaris Zones

Chapter 30 Troubleshooting Miscellaneous Solaris Zones Problems

This chapter is new for the Solaris 10 6/06 release.

For a complete listing of new Solaris 10 features and a description of Solaris releases, see Oracle Solaris 10 9/10 What’s New.

Solaris 10 6/06, Solaris 10 11/06, Solaris 10 8/07, and Solaris 10 5/08: Do Not Place the Root File System of a Non-Global Zone on ZFS

The zonepath of a non-global zone should not reside on ZFS for these releases. Placing the zonepath on ZFS might cause patching problems and might prevent the system from being upgraded to a later Solaris 10 update release.

Note that the root file system of a non-global zone can reside on ZFS starting with the Solaris 10 10/08 release. Solaris Live Upgrade can then be used to upgrade the system.

Exclusive-IP Zone Is Using Device, so dladm reset-linkprop Fails

The following error message might be displayed:


dladm: warning: cannot reset link property 'zone' on 'bge0': operation failed

This message indicates that the attempt to use dladm reset-linkprop, as described in How to Use dladm reset-linkprop, failed. The running zone excl is still using the device, which was assigned by executing ifconfig bge0 plumb inside the zone.

To reset the value, run ifconfig bge0 unplumb inside the zone, and then rerun the dladm command.
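The two steps can be sketched as a small shell function run from the global zone. This is an illustration under stated assumptions, not the documented procedure: the function names run and release_link are hypothetical, zlogin is used here as one way to run the command inside the zone, and the -t option shown in some dladm examples is omitted. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: release a link held by an exclusive-IP zone.
# $1 = zone name (excl in the example above), $2 = link name (bge0)
release_link() {
    # Unplumb the interface inside the running zone so the device is free.
    run zlogin "$1" ifconfig "$2" unplumb || return 1
    # Rerun the reset of the 'zone' link property from the global zone.
    run dladm reset-linkprop -p zone "$2"
}

# Example (not run here):
#   release_link excl bge0
```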

Zone Administrator Mounting Over File Systems Populated by the Global Zone

The presence of files within a file system hierarchy when a non-global zone is first booted indicates that the file system data is managed by the global zone. When the non-global zone was installed, a number of the packaging files in the global zone were duplicated inside the zone. These files must reside under the zonepath directly. If the files reside under a file system created by a zone administrator on disk devices or ZFS datasets added to the zone, packaging and patching problems could occur.

The issue with storing any of the file system data that is managed by the global zone in a zone-local file system can be described by using ZFS as an example. If a ZFS dataset has been delegated to a non-global zone, the zone administrator should not use that dataset to store any of the file system data that is managed by the global zone. Such a configuration cannot be patched or upgraded correctly.

For example, a ZFS delegated dataset should not be used as a /var file system. The Solaris operating system delivers core packages that install components into /var. These packages have to access /var when they are upgraded or patched, which is not possible if /var is mounted on a delegated ZFS dataset.

File system mounts under parts of the hierarchy controlled by the global zone are supported. For example, if an empty /usr/local directory exists in the global zone, the zone administrator can mount other contents under that directory.

You can use a delegated ZFS dataset for file systems that do not need to be accessed during patching or upgrade, such as /export in the non-global zone.

Zone Does Not Halt

In the event that the system state associated with the zone cannot be destroyed, the halt operation will fail halfway. This leaves the zone in an intermediate state, somewhere between running and installed. In this state there are no active user processes or kernel threads, and none can be created. When the halt operation fails, you must manually intervene to complete the process.

The most common cause of a failure is the inability of the system to unmount all file systems. Unlike a traditional Solaris system shutdown, which destroys the system state, zones must ensure that no mounts performed while booting the zone or during zone operation remain once the zone has been halted. Even though zoneadm makes sure that there are no processes executing in the zone, the unmount operation can fail if processes in the global zone have open files in the zone. Use the tools described in the proc(1) man page (see pfiles) and the fuser(1M) man page to find these processes and take appropriate action. After these processes have been dealt with, reinvoking zoneadm halt should completely halt the zone.
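As a minimal sketch of how to locate the file systems that are still mounted for the zone, the following function lists mount points at or below the zone's root directory from a mnttab-format file. The function name zone_mounts and the zonepath /zones/my-zone are assumptions made for illustration; the fuser and pfiles invocations appear only as comments.

```shell
# Hypothetical helper: list mount points at or below a zone's root directory.
# $1 = zone root (zonepath/root)
# $2 = mnttab-format file to read (defaults to /etc/mnttab)
zone_mounts() {
    awk -v root="$1" '$2 == root || index($2, root "/") == 1 { print $2 }' \
        "${2:-/etc/mnttab}"
}

# Example use in the global zone (not run here):
#   for m in $(zone_mounts /zones/my-zone/root); do fuser -c "$m"; done
#   pfiles <pid>               # inspect each process that fuser reports
#   zoneadm -z my-zone halt    # retry after the processes are dealt with
```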

If a zone cannot be halted, then as of the Solaris 10 10/09 release you can migrate a zone that has not been detached by using the zoneadm attach -F option, which forces the attach without validation. The target system must be properly configured to host the zone. An incorrect configuration could result in undefined behavior. Moreover, there is no way to know the state of the files within the zone.

Incorrect Privilege Set Specified in Zone Configuration

If the zone's privilege set contains a disallowed privilege, is missing a required privilege, or includes an unknown privilege name, an attempt to verify, ready, or boot the zone will fail with an error message such as the following:


zonecfg:zone5> set limitpriv="basic"
.
.
.
global# zoneadm -z zone5 boot
 	required privilege "sys_mount" is missing from the zone's privilege set
 	zoneadm: zone zone5 failed to verify
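
The message names the privilege that is missing. As a minimal sketch, a small shell function can extract that name from captured output so that it can be added back to limitpriv. The function name missing_priv is hypothetical, and the correction shown in the comments assumes that the default privilege set is acceptable for the zone.

```shell
# Hypothetical helper: print the privilege named in a
# 'required privilege "..." is missing' line read from standard input.
missing_priv() {
    sed -n 's/.*required privilege "\([^"]*\)" is missing.*/\1/p'
}

# Example use in the global zone (not run here):
#   zoneadm -z zone5 boot 2>&1 | missing_priv
# One possible correction, assuming the default set is acceptable:
#   zonecfg -z zone5 'set limitpriv="default"'
```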

netmasks Warning Displayed When Booting Zone

You might see the following message when you boot the zone as described in How to Boot a Zone:


# zoneadm -z my-zone boot
zoneadm: zone 'my-zone': WARNING: hme0:1: no matching subnet
	found in netmasks(4) for 192.168.0.1; using default of
	255.255.255.0.

The message is only a warning, and the command has succeeded. The message indicates that the system was unable to find the netmask to be used for the IP address specified in the zone's configuration.

To stop the warning from displaying on subsequent reboots, ensure that the correct netmasks databases are listed in the /etc/nsswitch.conf file in the global zone and that at least one of these databases contains the subnet and netmasks to be used for the zone my-zone.

For example, if the /etc/inet/netmasks file and the local NIS database are used for resolving netmasks in the global zone, the appropriate entry in /etc/nsswitch.conf is as follows:

netmasks: files nis

The subnet and corresponding netmask information for the zone my-zone can then be added to /etc/inet/netmasks for subsequent use.
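
As a minimal sketch, the following function checks whether a netmasks-format file already contains an entry for a given network number. The function name has_netmask_entry is hypothetical. The network number 192.168.0.0 in the comments is an assumption derived from the address and default mask in the warning; the actual subnet is site-specific.

```shell
# Hypothetical helper: succeed if a netmasks-format file has an entry
# whose first field is the given network number.
# $1 = network number
# $2 = file to check (defaults to /etc/inet/netmasks)
has_netmask_entry() {
    awk -v net="$1" '$1 == net { found = 1 } END { exit !found }' \
        "${2:-/etc/inet/netmasks}"
}

# Assumed entry for the address in the warning, using the default mask:
#   192.168.0.0    255.255.255.0
# Example check in the global zone (not run here):
#   has_netmask_entry 192.168.0.0 || echo "no entry for 192.168.0.0"
```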

For more information about the netmasks database, see the netmasks(4) man page.

Resolving Problems With a zoneadm attach Operation

Procedure: Patches and Packages Are Out of Sync

The target system must be running the same versions of the following required operating system packages and patches as those installed on the original host.

  1. If packages and patches are different between the original host and the new host, you might see a display similar to the following:


    host2# zoneadm -z my-zone attach
    	These packages installed on the source system are inconsistent with this system:
                SUNWgnome-libs (2.6.0,REV=101.0.3.2005.12.06.20.27) version mismatch
                        (2.6.0,REV=101.0.3.2005.12.19.21.22)
                SUNWudaplr (11.11,REV=2005.12.13.01.06) version mismatch
                        (11.11,REV=2006.01.03.00.45)
                SUNWradpu320 (11.10.0,REV=2005.01.21.16.34) is not installed
                SUNWaudf (11.11,REV=2005.12.13.01.06) version mismatch
                        (11.11,REV=2006.01.03.00.45)
                NCRos86r (11.10.0,REV=2005.01.17.23.31) is not installed
    	These packages installed on this system were not installed on the source system:
                SUNWukspfw (11.11,REV=2006.01.03.00.45) was not installed
                SUNWsmcmd (1.0,REV=2005.12.14.01.53) was not installed
    	These patches installed on the source system are inconsistent with this system:
                120081 is not installed
                118844 is not installed
                118344 is not installed
    	These patches installed on this system were not installed on the source system:
                118669 was not installed
                118668 was not installed
                116299 was not installed
  2. To migrate the zone successfully, use one of the following methods:

Procedure: Operating System Releases Do Not Match

To migrate the zone successfully, install the same Solaris release that is running on the original host on a system with the same architecture.

  1. Verify the Solaris release running on the original system.


    host1# uname -a
    
  2. Install the same release on the new host.

    Refer to the Solaris installation documentation on docs.sun.com.

Procedure: Machine Architectures Do Not Match

To migrate the zone successfully, use the -u option to zoneadm attach.

  1. Verify the system architecture on both systems.


    host1# uname -a
    
  2. If the architectures are different, use the -u option to zoneadm attach to perform the attach.


    host2# zoneadm -z my-zone attach -u
    

    For more information, see How to Migrate A Non-Global Zone.
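
The uname -a comparison used in these procedures can be sketched as a shell function that takes one saved output line from each host and reports differences in the release field (field 3) and the machine field (field 5). The function names are hypothetical, and the field positions assume the usual Solaris uname -a layout. The release field alone might not distinguish Solaris 10 update releases, so treat a match as a first check only.

```shell
# Hypothetical helper: print one whitespace-separated field of a line.
# $1 = field number, $2 = a saved `uname -a` line
uname_field() {
    printf '%s\n' "$2" | awk -v n="$1" '{ print $n }'
}

# Hypothetical helper: report differences in release (field 3) and
# machine (field 5) between two saved `uname -a` lines.
# $1 = line from the original host, $2 = line from the new host
compare_uname() {
    for f in 3 5; do
        a=$(uname_field "$f" "$1")
        b=$(uname_field "$f" "$2")
        [ "$a" = "$b" ] || echo "field $f differs: $a vs $b"
    done
}

# Example (not run here): paste the line captured on each host.
#   compare_uname "<uname -a output from host1>" "<uname -a output from host2>"
```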

Zones With an fs Resource Defined With a Type of lofs Cannot Be Upgraded to the Solaris 10 11/06 Release


Note –

This problem has been corrected in the Solaris 10 8/07 release.


If all non-global zones that are configured with lofs fs resources are mounting directories that exist in the miniroot, the system can be upgraded from an earlier Solaris 10 release to the Solaris 10 11/06 release using standard upgrade. For example, a lofs mounted /opt directory presents no issues for upgrade.

However, if any of your non-global zones are configured with a non-standard lofs mount, such as a lofs mounted /usr/local directory, the following error message is displayed:


The zones upgrade failed and the system needs to be restored
from backup.  More details can be found in the file
/var/sadm/install_data/upgrade_log on the upgrade root file
system.

Although this error message states that the system must be restored from backup, the system is actually fine, and it can be upgraded successfully using the following workaround:

  1. Reboot your system with the installed OS.

  2. Reconfigure the zones, removing the fs resources defined with a type of lofs.

  3. After removing these resources, upgrade the system to Solaris 10 11/06.

  4. Following the upgrade, you can reconfigure your zones again to restore the additional fs resources that you removed.
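
For step 2, the lofs resources can be identified before they are removed. The following sketch reads the output of zonecfg info fs and prints the dir of each fs resource whose type is lofs. The function name lofs_dirs, the zone name my-zone, and the file /var/tmp/my-zone.cfg are assumptions for illustration, and the parsing assumes that zonecfg prints each property as a "name: value" line.

```shell
# Hypothetical helper: read `zonecfg -z <zone> info fs` output on standard
# input and print the dir of every fs resource whose type is lofs.
lofs_dirs() {
    awk '$1 == "dir:" { dir = $2 }
         $1 == "type:" && $2 == "lofs" { print dir }'
}

# Example use in the global zone (not run here):
#   zonecfg -z my-zone export > /var/tmp/my-zone.cfg   # keep a copy for step 4
#   zonecfg -z my-zone info fs | lofs_dirs
#   zonecfg -z my-zone 'remove fs dir=/usr/local'
```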