man pages section 5: Standards, Environments, and Macros

Updated: July 2017
 
 

solaris-kz(5)

Name

solaris-kz - solaris kernel zone

Description

The solaris-kz brand uses the branded zones framework described in brands(5) to run zones with a separate kernel and OS installation from that used by the global zone.

Installation and Update

A solaris-kz installation is independent of that of the global zone; it is not a pkg(5) linked image and can be modified regardless of the global zone content. A solaris-kz zone can be installed in the same manner as other brands directly from the global zone, or via boot media as described below.

When specifying a manifest for installation, the manifest used should be the one suitable for a global zone installation. As kernel zones always install into a known location for the root pool, an installation target disk should not be specified.

If an AI manifest is used to install a different version of Solaris than the one that is installed in the global zone, the installation must be performed using installation media that matches the version of Solaris being installed. A typical command line would resemble:

zoneadm -z kzone1 install -b <ai.iso> -m <manifest.xml>

Boot environment (BE) management is independent of the global zone. BE creation in the global zone does not create a new BE in the zone. For more information, see the beadm(1M) man page.

Process Management and Visibility

Unlike other brands, a solaris-kz zone runs a separate kernel, so some differences are apparent when examining the zone from the global zone.

Processes that are running in a solaris-kz zone are not directly accessible by the global zone. For example, to see the list of processes in a kernel zone named kz-zone, rather than using the ps command with the -z kz-zone option, use the following command:

# zlogin kz-zone ps -e

The global zone and each kernel zone manage their own process ID space. Thus, process 1234 may exist in the global zone and in one or more kernel zones; these are distinct processes. If the global zone administrator wishes to kill process 1234 in kz-zone, it should be done with the following command or an equivalent:

# zlogin kz-zone kill 1234

ps(1) and similar tools run from the global zone will see processes associated with managing a solaris-kz zone instance, such as kzhost and zlogin-kz. This can be useful for debugging, but otherwise they are private implementation details.

Similarly, resource management functionality is different. For example, resource controls such as max-processes are not available when configuring a solaris-kz zone, as they are only meaningful when sharing a single kernel instance. That is, a process running inside a solaris-kz zone cannot take up a process table slot in the global zone, as the kernels are independent.

The zonestat utility displays the resource usage of the zone. The output is generally correct, but may reflect the host values. For example, the resource control values such as lwps show the lwps used on the host, not the ones used inside the zone.

The solaris-kz brand uses certain hardware features which may not be available in older systems, or in virtualized environments. To detect whether a system supports the solaris-kz brand, install the brand-solaris-kz package and then run the virtinfo command.

# virtinfo -c supported list kernel-zone

If kernel-zone is not shown in the supported list, check syslog for more information. Messages pertaining to kernel zones contain the string kernel-zone.
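
The check above can be scripted. The following is a minimal sketch rather than part of the brand tooling: the function name has_kernel_zone is hypothetical, and it assumes only that the virtinfo output contains the string kernel-zone when the brand is supported, as the text describes.

```shell
# Hypothetical helper: succeeds if the text on stdin mentions kernel-zone.
# Assumption: "virtinfo -c supported list kernel-zone" prints a line
# containing the string "kernel-zone" when kernel zones are supported.
has_kernel_zone() {
    grep -q 'kernel-zone'
}

# Intended use on a Solaris host with the brand-solaris-kz package:
#   if virtinfo -c supported list kernel-zone | has_kernel_zone; then
#       echo "kernel zones are supported"
#   else
#       echo "not supported; search syslog for kernel-zone" >&2
#   fi
```

Reading from standard input keeps the helper independent of the host, so it can be tried against saved virtinfo output.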

Stolen time as reported by the mpstat(1M), iostat(1M), vmstat(1M) and other utilities directly reflects the time when the kernel zone could not run as the host might be using CPU resources for other purposes.

Storage Access

A solaris-kz brand zone must reside on one or more devices. A default zfs(1M) volume will be created in the global zone's root zpool if the configuration is not customized prior to installation. The device onto which the zone is installed is specified with device resources that have the bootpri property set to any positive integer value. If a device will not be used as a boot device, it must not have the bootpri property set. To unset bootpri, use clear bootpri while in the device resource scope. If multiple bootable devices are present during installation, the devices will be used for a mirrored root ZFS pool in the zone. The default boot order is determined by sorting devices first by bootpri, then by id if multiple devices have the same bootpri.
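
As an illustration of the default boot order rule, the following sketch sorts bootpri and id pairs the way the text describes: by bootpri first, then by id. It assumes ascending numeric order for both keys, which the text does not state explicitly. The function name boot_order and the device values are made up for the example.

```shell
# Each input line is "bootpri id" for one bootable device (made-up values).
# Sort numerically by bootpri, then by id, and print the ids in that order.
# Assumption: lower values sort first.
boot_order() {
    sort -k1,1n -k2,2n | awk '{ print $2 }'
}

# Three hypothetical devices: id 0 with bootpri 2, ids 5 and 3 with bootpri 1.
printf '%s\n' '2 0' '1 5' '1 3' | boot_order
```

Under the stated assumption, the two devices that share bootpri 1 are ordered by id, and the device with bootpri 2 comes last.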

The zonepath cannot be set for a kernel zone. As an implementation detail, it is set to a fixed location using tmpfs(7FS). It contains no persistent or otherwise user-serviceable data. As the zone root is contained within the root ZFS volume, it is not mounted in the global zone under the zone path, unlike traditional zones. Access to the zone root can only be done via the zone itself, for example zlogin.

A solaris-kz zone cannot directly take advantage of shared-kernel features such as ZFS datasets and file system mounts. Instead, storage is made available to the zone via block devices such as raw disks, ZFS volumes, and lofi devices.

A solaris-kz zone's root is always accessible. Storage can be added by using add device in zonecfg. The device path specified must be a raw device, and either a ZFS volume, a raw disk, or a lofi device. The specified device must be a whole disk or LUN. Use the device path without any partition/slice suffix, for example:

# zonecfg -z myzone
zonecfg:myzone> add device
zonecfg:myzone:device> set match=/dev/rdsk/c4t9d0
zonecfg:myzone:device> set id=4
zonecfg:myzone:device> set bootpri=1

The id can be specified to fix the disk address inside the zone. If not given, it is automatically allocated.

As described in zonecfg(1M), a device resource can also configure a storage URI in order to make the zone's configuration portable to other host systems. See suri(5).

For example:

# zonecfg -z myzone
zonecfg:myzone> add device
zonecfg:myzone:device> set storage=nfs://user1:staff@host1/export/file1
zonecfg:myzone:device> set create-size=4g

To see information about the current configuration for device resources, use the info subcommand. For example:

# zonecfg -z myzone info device
device:
    match not specified
    storage: dev:/dev/zvol/dsk/rpool/VARSHARE/zones/myzone/disk0
    id: 0
    bootpri: 0
device:
    match not specified
    storage: nfs://user1:staff@host1/export/file1
    create-size: 4g
    id: 1
    bootpri not specified

You can also shorten the output by specifying the id:

# zonecfg -z myzone info device id=1
device:
    match not specified
    storage: nfs://user1:staff@host1/export/file1
    create-size: 4g
    id: 1
    bootpri not specified
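
The info output can also be filtered with standard tools. The sketch below lists the id of every device resource that has bootpri set. It relies only on the output layout shown in the examples above, where id precedes bootpri within each device block; the function name bootable_ids is hypothetical.

```shell
# Hypothetical helper: read "zonecfg -z ZONE info device" output on stdin
# and print the id of each device whose bootpri property is set.
# A line of the form "bootpri not specified" is skipped because its first
# field is "bootpri" without the trailing colon.
bootable_ids() {
    awk '
        $1 == "id:"      { id = $2 }
        $1 == "bootpri:" { print id }
    '
}

# Intended use on a Solaris host:
#   zonecfg -z myzone info device | bootable_ids
```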

To install a zone to a non-default location, to an iSCSI logical unit, for example, the device resource for the root disk must be modified. For example:

# zonecfg -z myzone
zonecfg:myzone> select device id=0
zonecfg:myzone:device> set storage=iscsi://host/luname.naa.0000abcd

At least one device must have bootpri set to a positive integer to indicate that it is bootable. Within a kernel zone, all devices that act as mirrors or spares for the root ZFS pool must be bootable.

Only storage devices are supported by add device for the solaris-kz brand.

Network Access

Kernel zones must be exclusive stack. Network access is provided by adding net or anet resources for Ethernet datalinks and by adding anet resources for IPoIB datalinks. The datalink specified by these resources will be used as the backend of the datalinks visible in the zone. Both IPoIB and Ethernet network resources can be specified, and the datalinks visible in the zone will be of the corresponding media type. As with storage devices, an ID may be specified to identify the virtual NIC address inside the zone. Adding InfiniBand network links through net resources is not supported.

Kernel zones may themselves host zones (in which case they play the role of the global zone for those zones). Network access to the hosted zones is provided over Ethernet datalinks only, not over IPoIB datalinks. However, because the networking configuration of the kernel zone is partially defined by its zone configuration, hosted zones are restricted in which MAC addresses may be used. Booting a hosted zone whose mac-address setting is random or a specific MAC address is not permitted.

To supply additional MAC addresses to a kernel zone, add them to the mac-address property for the relevant resource. See zonecfg(1M). This will make that mac-address available as a factory address inside the kernel zone.

A hosted zone may then use that MAC address itself. To do this, configure the mac-address property of the hosted zone to be either the explicit MAC address configured (use mac-address property), or specify auto. See zonecfg(1M) for details of these settings.

Memory Configuration

A fixed amount of host RAM must be allocated to a kernel zone. This is configured by the physical property of the capped-memory resource in zonecfg(1M). The given value must be aligned to a suitable value, depending on the platform and the value of pagesize-policy. The allocated memory is locked, and hence not pageable to a swap device.

When specifying the physical property, you also need to specify the pagesize-policy property of the capped-memory resource in zonecfg(1M). The pagesize-policy property specifies a policy for the solaris-kz brand to use large pages for its physical memory. It can only be used in conjunction with the physical property. The following keywords are accepted for the pagesize-policy property:

largest-only

Only the largest possible page size for the kernel zone's physical memory is allocated. If not all pages can be allocated at that size, the zone fails to boot.

largest-available

Attempts to use the largest possible page size, scaling down to a smaller page size if all physical memory cannot be allocated with a particular page size. The priority is to boot the zone.

smallest-only

The lowest allowable page size required to boot the kernel zone on the particular platform is chosen.

Clearing the pagesize-policy property, or leaving it unset, is necessary to support the older suspend image format. This allows live migration and resume of a kernel zone from newer systems to older systems. In this case, the lowest allowable page size required to boot the kernel zone on the particular platform is chosen.
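
The memory settings described in this section can be collected in a zonecfg command file. The following sketch generates one and rejects any keyword other than the three listed above. The function name capped_memory_cfg is hypothetical, and the sketch assumes the zone configuration already contains a capped-memory resource, so it uses select rather than add.

```shell
# Hypothetical helper: print zonecfg subcommands that set the physical and
# pagesize-policy properties of an existing capped-memory resource.
# $1 = amount of RAM (must satisfy the platform alignment rules)
# $2 = pagesize-policy keyword
capped_memory_cfg() {
    case "$2" in
        largest-only|largest-available|smallest-only) ;;
        *) echo "unknown pagesize-policy: $2" >&2; return 1 ;;
    esac
    printf 'select capped-memory\nset physical=%s\nset pagesize-policy=%s\nend\n' "$1" "$2"
}

# Intended use on a Solaris host (8g is an arbitrary example value):
#   capped_memory_cfg 8g largest-available > /tmp/myzone-mem.cfg
#   zonecfg -z myzone -f /tmp/myzone-mem.cfg
```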

Suspend, Resume, and Warm Migration

Kernel zones may be suspended to disk by the zoneadm suspend command. The running state of the zone is written to the disk. As this includes the entire RAM used by the zone, this can take a significant amount of time and space.

Suspend and resume are supported for a kernel zone only if it has a suspend resource in its configuration. Within a suspend resource, the path or storage (but not both) must be specified. The path property specifies the name of a file that will contain the suspend image. The directory containing the file must exist and be writable by the root user. Any file system that is mounted prior to the start of svc:/system/zones:default may be used. The storage property specifies the storage URI (see suri(5)) of a disk device that will contain the suspend image. The whole device will be used. This device may not be shared with anything else.
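
The rule that a suspend resource takes either path or storage, but not both, can be captured in a small generator for a zonecfg command file. This is a sketch under assumptions: the function name suspend_cfg is hypothetical, the file name in the usage comment is made up, and it assumes the zone has no suspend resource yet (if one exists, select suspend would be used in place of add suspend).

```shell
# Hypothetical helper: print zonecfg subcommands that add a suspend resource.
# $1 = "path" or "storage"; $2 = file name or storage URI.
# Taking a single property name per call means both can never be set.
suspend_cfg() {
    case "$1" in
        path|storage) ;;
        *) echo "first argument must be path or storage" >&2; return 1 ;;
    esac
    printf 'add suspend\nset %s=%s\nend\n' "$1" "$2"
}

# Intended use on a Solaris host (the path is a made-up example):
#   suspend_cfg path /export/suspend/myzone.img > /tmp/myzone-susp.cfg
#   zonecfg -z myzone -f /tmp/myzone-susp.cfg
```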

The suspend image is compressed prior to writing. As such, the size of the suspend image will typically be significantly smaller than the size of the zone's RAM. During suspend, a message is printed and logged to the console log indicating the size of the suspend image.

After compression, the suspend image is encrypted using AES-128-CCM. The encryption key is automatically generated by /dev/random (see random(7D)) and is stored in the keysource resource's raw property.

If a zone is suspended, the zoneadm boot command will resume it. The boot -R option can be used to boot afresh if a resume is not desired.

If the suspend image and the rest of the zone's storage are accessible by multiple hosts (typically by using suspend:storage and device:storage properties), the suspend image can be used to support warm migration. Warm migration follows the usual zone cold migration procedure with zoneadm detach and zoneadm attach, but uses zoneadm suspend instead of zoneadm shutdown as the first step. This avoids any zone startup cost on the destination host, excluding the time spent to resume.
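
The order of the warm migration steps can be sketched as two small shell functions, one per host. This is an illustration under assumptions, not a supported procedure: the function names and the DRY_RUN variable are hypothetical, commands are only printed by default, and the zone is assumed to be configured on the destination host already.

```shell
# Print commands instead of running them unless DRY_RUN=0 (hypothetical knob).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the source host: suspend instead of shutting down, then detach.
source_steps() {
    run zoneadm -z "$1" suspend && run zoneadm -z "$1" detach
}

# On the destination host: attach, then boot, which resumes a suspended zone.
destination_steps() {
    run zoneadm -z "$1" attach && run zoneadm -z "$1" boot
}

source_steps myzone
destination_steps myzone
```

With the default dry-run setting, the script prints the four zoneadm commands in order, which makes it easy to review the sequence before running anything.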

The source and the destination host must be the same platform. On x86, the vendor (AMD/Intel) as well as the CPU model name must match. On SPARC, the hardware platform must be the same. For example, you cannot warm migrate from a T4 host to a T5 host. If you want to migrate between different hardware platforms, you need to specify a migration class in the cpu-arch property appropriately.

In addition to the migration class, you may need to specify a host compatibility level in the host-compatible property to make sure that the hardware features supported by the versions of Oracle Solaris running on the source and target hosts match.

On resume, the current configuration of the zone is used to boot, which allows a new configuration to be specified. However, there are restrictions, as the resuming zone is expecting a particular setup. Any incompatibilities will cause boot to fail. For example, the boot process might fail if:

  • The CPU supports different features (for example, see cpuid(7D))

  • The configuration has a different capped-memory value

  • The configuration defines a different number of virtual CPUs

  • A disk is missing (no device resource with a suitable id property)

  • A virtual NIC is missing (no net or anet resource with a suitable id property)

No specific check for storage identification is done. Note that it is the administrator's responsibility to ensure that the device listed under a particular ID is the one that the zone is expecting to see.
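
One of the failure cases listed above, a missing disk, can be screened for before resuming by comparing device ids. The sketch below compares two saved copies of zonecfg info device output and prints the ids that appear in the first but not in the second. The function names are hypothetical, and the check covers ids only; it cannot confirm that the device behind an id is the one the zone expects.

```shell
# Hypothetical helper: extract device ids from "zonecfg info device" output.
device_ids() {
    awk '$1 == "id:" { print $2 }' | sort
}

# Print ids present in the first file but absent from the second.
# $1 = info output saved before suspend, $2 = info output for the
# configuration that will be used to resume.
missing_device_ids() {
    ids_before=$(mktemp) && ids_now=$(mktemp) || return 1
    device_ids < "$1" > "$ids_before"
    device_ids < "$2" > "$ids_now"
    comm -23 "$ids_before" "$ids_now"
    rm -f "$ids_before" "$ids_now"
}

# Intended use:
#   zonecfg -z myzone info device > /tmp/before.txt    # before suspend
#   zonecfg -z myzone info device > /tmp/now.txt       # before resume
#   missing_device_ids /tmp/before.txt /tmp/now.txt
```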

Live Migration

Kernel zones can be live migrated to compatible hosts by using the zoneadm migrate command, as described in zoneadm(1M).

Live migration has the same compatibility restrictions as described in the Suspend, Resume, and Warm Migration section above.

Auxiliary State

The following auxiliary states (as shown by zoneadm list -is) are defined for this brand:

suspended

The zone has been suspended and will resume on next boot. Note that the zone must be attached before this state is visible.

debugging

The zone is in running state, but the kernel debugger is running within the zone and therefore cannot service network requests etc. Connect to the zone console to interact with the debugger (kmdb).

panicked

The zone is in running state, but the zone has panicked and the host is not affected.

migrating-out

The zone is fully running, but is being migrated to another host.

migrating-in

The zone is booted on the host, and is receiving the migration image, so is not yet fully running until migration is complete.

Host Data

Each of a kernel zone's bootable devices contains state information known as host data. This data keeps track of where a zone is in use, if it is suspended, and other state information. Host data is encrypted and authenticated with AES-128-CCM, using the same encryption key used for the suspend image.

As a kernel zone is readied or booted, the host data is read to determine if the kernel zone's boot storage is in use on another system. If it is in use by another system, the kernel zone will enter the unavailable state and an error message will indicate which system is using it. If it is certain that the storage is not in use on the other system, the kernel zone can be repaired by using the -x force-takeover extended option to zoneadm attach. See the warning below before executing this command.

If the encryption key is inaccessible, the host data and any suspend image will not be readable. In such a circumstance, any attempt to ready or boot the zone will cause the zone to enter the unavailable state. If recovery of the encryption key is not possible, the -x initialize-hostdata extended option to the zoneadm attach subcommand can be used to generate a new encryption key and host data. See the warning below before executing this command.


Note -  WARNING: Forcing a takeover or reinitialization of host data will make it impossible to detect if the zone is in use on any other system. Running multiple instances of a zone that reference the same storage will lead to irreparable corruption of the zone's file systems.

To prevent loss of the encryption key during a warm or cold migration, use zonecfg export on the source system to generate a command file to be used on the destination system. For example:

  root@host1# zonecfg -z myzone export -f /net/.../myzone.cfg
  root@host2# zonecfg -z myzone -f /net/.../myzone.cfg

Because myzone.cfg in this example contains the encryption key, it is important to protect its contents from disclosure.
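
One way to limit exposure of the exported file is to create it with owner-only permissions. The following sketch wraps zonecfg export, writing its standard output to a new file under a restrictive umask. The function name export_private and the ZONECFG override variable are hypothetical; the wrapper refuses to reuse an existing file because a umask does not change the permissions of a file that already exists.

```shell
# Hypothetical wrapper: write a zone's exported configuration to a file
# that only the owner can read, because the export contains the key.
# ZONECFG can be overridden for testing; it defaults to the real command.
export_private() {  # $1 = zone name, $2 = output file (must not exist yet)
    if [ -e "$2" ]; then
        echo "refusing to overwrite $2" >&2
        return 1
    fi
    # The subshell keeps the restrictive umask from leaking to the caller.
    ( umask 077 && "${ZONECFG:-zonecfg}" -z "$1" export > "$2" )
}

# Intended use on the source host (the path is a made-up example):
#   export_private myzone /root/myzone.cfg
```

The file still has to be transferred to the destination system over a channel that protects its contents.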

Configuration

A solaris-kz brand zone can be configured by using the SYSsolaris-kz template.

The following zonecfg(1M) resources and properties are not supported for this brand:

anet:address
capped-memory:locked
capped-memory:swap
dataset
device:allow-partition
device:allow-raw-io
fs
file-mac-profile
fs-allowed
ip-type
limitpriv
global-time
max-lwps
max-msg-ids
max-processes
max-sem-ids
max-shm-memory
rctl:zone.max-lofi
rctl:zone.max-swap
rctl:zone.max-locked-memory
rctl:zone.max-shm-memory
rctl:zone.max-shm-ids
rctl:zone.max-sem-ids
rctl:zone.max-msg-ids
rctl:zone.max-processes
rctl:zone.max-lwps
rootzpool
zpool

The following zonecfg(1M) resources and properties are supported by the live zone reconfiguration for this brand:

anet (with exceptions stated below)
device
net (with exceptions stated below)

The following zonecfg(1M) resources and properties are not supported by the live zone reconfiguration for this brand:

anet:allowed-address
anet:configure-allowed-address
anet:defrouter
capped-cpu (zone.cpu-cap)
capped-memory
cpu-shares (zone.cpu-shares)
dedicated-cpu
hostid
ib-vhca
ib-vhca:port
cpu-arch
keysource
net:allowed-address
net:configure-allowed-address
net:defrouter
pool
rctl
scheduling-class
tenant
virtual-cpu
host-compatible

Any changes made to the listed unsupported resources and properties in the persistent configuration will be ignored by the live zone reconfiguration if they are applied to the running zone.

Any attempts to modify listed unsupported resources and properties in the live configuration will be refused.

Changes made to anet and net properties supported for solaris-kz brand should be for the same media type.

There are specific defaults for properties supported for solaris-kz brand as listed below:

Resource                Property                    Default Value
global                  zonepath                    /system/zones/%{zonename}
                        autoboot                    false
                        ip-type                     exclusive
                        auto-shutdown               shutdown
net                     configure-allowed-address   true
anet                    mac-address                 auto
                        lower-link                  auto
                        link-protection             mac-nospoof
                        linkmode                    cm
anet:mac                mac-address                 auto
ib-vhca                 smi-enabled                 off
ib-vhca:port            pkey                        auto

Sub Commands

The following solaris-kz brand-specific subcommand options are supported by zoneadm(1M).

attach [-x force-takeover | initialize-hostdata]

Attach the specified solaris-kz branded zone image into the zone. The zone's bootable device(s) are assumed to already be populated correctly.

The -x force-takeover extended option clears state information indicating that the zone is installed or running on another system. Use this option with extreme caution: if the same storage is simultaneously used by two instances of a zone, the zone's file systems will be corrupted.

The -x initialize-hostdata extended option reinitializes the encryption key and host data. As with -x force-takeover, ensure the zone is not in use on another system before using this option.

boot [-R] -- [-L | -Z bootenv]

If a zone is suspended, the -R option can be used to ignore the suspended image (which is then deleted), and boot afresh.

The -L option tells the bootloader to list the available boot environments. The BE to boot can be interactively selected.

The -Z option tells the bootloader to boot a particular BE. For example:

# zoneadm -z myzone \
boot -- -Z rpool/ROOT/solaris

clone [-c config_profile.xml | dir]

Provides a profile or a directory of profiles to apply after installation from the repository.

All profiles must have an .xml extension.

For zoneadm clone, if storage is automatically created, it will be created as the same size as the disk in the source zone.

install [-v] [-a archive [-x no-auto-shutdown] | -m manifest.xml] [-c config_profile.xml | dir] [-C install_profile.xml | dir] [-S rootsize] [-b /path/to/media.iso [-x no-auto-shutdown]] [-z archived_zone]
[-x <cert|cacert|key>=path] ...

Kernel zones can be installed with the global zone's publishers and a default AI manifest, with a custom AI manifest, with an ISO image of Solaris installation media, or with a Unified Archive.

Unless the -a, -b, or -m options are used, the default AI manifest, /usr/share/auto_install/manifest/default.xml, and the global zone's pkg(5) publishers are used to perform the installation. The supported media types are the text installer and the automated installer. This allows any supported Oracle Solaris version to be installed. Solaris 11.2 is the first version of Solaris supported in a kernel zone.

If an AI manifest is specified with the -m option, an IPS or Unified Archive installation will be performed, based on the content of the AI manifest. See ai_manifest(4).

If an ISO image of bootable Solaris installation media is provided with the -b option, the kernel zone is booted from the installation media and the install program is run on the zone's console. A console login session is established during installation, allowing for interaction with and/or observation of the install program.

If a Unified Archive is specified with the -a option, the installation is performed from the Unified Archive. If the Unified Archive contains multiple zones (deployable systems in archiveadm info output), the -z option is used to specify which archived zone to install. Unified archives are created with archiveadm(1M).

-a archive

Install from the specified Unified Archive. The archived_zone may be a global zone, a kernel zone, or a solaris brand zone. If the archived zone is a solaris brand zone, a non-global to global pkg(5) image transform is performed. For the transform to be successful, the zone's installation environment must have sufficient network access to allow access to all pkg(5) publishers. This is most easily accomplished by allowing the kernel zone's network to be configured via DHCP.

-b /path/to/media.iso

Boot and install from the given media.

-c config_profile.xml | dir

Provides a profile or a directory of profiles to apply after installation from the repository.

All profiles must have an .xml extension.

-C install_profile.xml | dir

Provides a profile or a directory of profiles to apply to the installation environment when booted to AI media to perform the install.

All profiles must have an .xml extension.

-m manifest.xml

Manifest file to be specified to the automated installer.

-x install-size

Explicitly set the size of the root file system (default is 16g).

-x no-auto-shutdown

Leaves the console login session to the kernel zone in place after installation, allowing for interaction with the install system. This option is only valid with the -a or the -b options.

-x cert=path
-x cacert=path
-x key=path

Use the specified certificate, CA certificate, and/or key when installing from a Unified Archive at an https URI. Only valid with the -a option.

-v

Verbose output from the install process.

-z archived_zone

Install the zone using archived_zone from the Unified Archive. See Deployable Systems in the output of archiveadm(1M) info command for a list of valid values for a particular Unified Archive. Only valid with the -a option.
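
To show how the installation source options combine, the following sketch assembles a zoneadm install command line and enforces the rule that -z is only valid with -a. It is a simplification: the function name install_cmd is hypothetical, it handles a single source option per call, it prints the command rather than running it, and the paths in the usage lines are made up.

```shell
# Hypothetical helper: print a zoneadm install command line.
# usage: install_cmd ZONE SOURCE_FLAG SOURCE [ARCHIVED_ZONE]
install_cmd() {
    zone=$1 flag=$2 src=$3 azone=${4:-}
    case "$flag" in
        -a|-b|-m) ;;
        *) echo "source flag must be -a, -b, or -m" >&2; return 1 ;;
    esac
    # -z selects a zone inside a Unified Archive, so it requires -a.
    if [ -n "$azone" ] && [ "$flag" != "-a" ]; then
        echo "-z is only valid with -a" >&2
        return 1
    fi
    set -- zoneadm -z "$zone" install "$flag" "$src"
    if [ -n "$azone" ]; then
        set -- "$@" -z "$azone"
    fi
    echo "$*"
}

install_cmd kzone1 -a /export/archives/prod.uar webzone
install_cmd kzone1 -b /export/media/ai.iso
```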

See Also

archiveadm(1M), ai_manifest(4), pkg(5)

Notes

VirtualBox can be used on the same host as kernel zones, but must be configured appropriately. See the VirtualBox documentation for more details.

Because kernel zones run in a separate Solaris kernel environment, they can crash and produce a crash dump just as the kernel of a global zone running on bare metal would. In that case, the dump is saved in the kernel zone's storage, in the same place as any Solaris crash dump, subject to the crash dump parameters configured by dumpadm(1M).

A core dump of a kernel zone can also be generated from the host environment by using the zoneadm savecore subcommand. Additionally, if a kernel zone crashes and is unable to save a dump in its own storage, it requests that the host attempt to save a core image, as if a zoneadm savecore subcommand had been issued. The core is saved in a location specified by coreadm(1M). This succeeds only if coreadm(1M) has configured a location for, and enabled, kernel zone core dumps.