Logical Domains 1.2 Administration Guide

Chapter 8 Migrating Logical Domains

This chapter describes how to migrate logical domains from one host machine to another by using the Logical Domains 1.2 software.

This chapter covers the following topics:

Introduction to Logical Domain Migration

Logical Domain Migration provides the ability to migrate a logical domain from one host machine to another. The host where the migration is initiated is referred to as the source machine, and the host where the domain is migrated to is referred to as the target machine. Similarly, once a migration is started, the domain to be migrated is referred to as the source domain and the shell of a domain created on the target machine is referred to as the target domain while the migration is in progress.

Overview of a Migration Operation

The Logical Domains Manager on the source machine accepts the request to migrate a domain and establishes a secure network connection with the Logical Domains Manager running on the target machine. Once this connection has been established, the migration occurs. The migration itself can be broken down into different phases.

Phase 1: After connecting with the Logical Domains Manager running on the target host, information about the source machine and domain is transferred to the target host. This information is used to perform a series of checks to determine whether a migration is possible. The checks differ depending on the state of the source domain. For example, if the source domain is active, a different set of checks is performed than if the domain is bound or inactive.

Phase 2: When all checks in Phase 1 have passed, the source and target machines prepare for the migration. In the case where the source domain is active, this includes shrinking the number of CPUs to one and suspending the domain. On the target machine, a domain is created to receive the source domain.

Phase 3: For an active domain, the next phase is to transfer all the runtime state information for the domain to the target. This information is retrieved from the hypervisor. On the target, the state information is installed in the hypervisor.

Phase 4: Handoff. After all state information is transferred, the handoff occurs when the target domain resumes execution (if the source was active) and the source domain is destroyed. From this point on, the target domain is the sole version of the domain running.

Software Compatibility

For a migration to occur, both the source and target machines must be running compatible software:

Note –

The migration feature was first released with the Logical Domains 1.1 software and corresponding firmware. For information about the latest firmware for your platform, see the Logical Domains 1.2 Release Notes.

Authentication for Migration Operations

Since the migration operation executes on two machines, a user must be authenticated on both the source and target host. In particular, the user must have the solaris.ldoms.write authorization on both machines.

The ldm command line interface for migration allows the user to specify an optional alternate user name for authentication on the target host. If this is not specified, the user name of the user executing the migration command is used. In both cases, the user is prompted for a password for the target machine.
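
One way to confirm the authorization before attempting a migration is to inspect the output of the Solaris auths(1) command on each machine. The following is a minimal sketch, assuming a POSIX-compatible shell such as ksh or bash. The helper name has_ldoms_write is hypothetical, and the sketch assumes that the wildcard entries solaris.* and solaris.ldoms.* also grant the authorization.

```shell
# Hypothetical helper: succeed if a comma-separated authorization list
# (the format printed by auths(1)) grants solaris.ldoms.write.
has_ldoms_write() {
    # Remove any spaces so that every entry is delimited only by commas.
    list=$(printf '%s' "$1" | tr -d ' ')
    case ",$list," in
        *,solaris.ldoms.write,*|*,solaris.ldoms.\*,*|*,solaris.\*,*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Usage sketch (run on both the source and the target machine):
#   has_ldoms_write "$(auths)" && echo "solaris.ldoms.write is granted"
```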

Migrating an Active Domain

For the migration of an active domain to occur with Logical Domains 1.2 software, there is a certain set of requirements and restrictions imposed on the source logical domain, the source machine, and the target machine. The sections following describe these requirements and restrictions for each of the resource types.

Note –

The migration operation speeds up when the primary domains on both the source and target systems have cryptographic units assigned.

Migrating CPUs in an Active Domain

Following are the requirements and restrictions on CPUs when performing a migration.

Migrating Memory in an Active Domain

There must be sufficient free memory on the target machine to accommodate the migration of the source domain. In addition, following are a few properties that must be maintained across the migration:

The layout of the available memory on the target machine must also be compatible with the memory layout of the source domain, or the migration will fail.

In particular, if the memory on the target machine is fragmented into multiple small address ranges, but the source domain requires a single large address range, the migration will fail. The following example illustrates this scenario. The target machine has two Gbytes of free memory in two memory blocks:

# ldm list-devices memory
    PA                   SIZE
    0x108000000          1G
    0x188000000          1G

The source domain, ldg-src, also has two Gbytes of memory, but it is laid out in a single memory block:

# ldm list -o memory ldg-src

    RA               PA               SIZE
    0x8000000        0x208000000      2G

Given this memory layout situation, the migration fails:

# ldm migrate-domain ldg-src dt212-239
Target Password:
Unable to bind 2G memory region at real address 0x8000000
Domain Migration of LDom ldg-src failed
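
This kind of failure can be anticipated by comparing the largest free memory block on the target machine with the largest memory block of the source domain. The following is a rough sketch, assuming a POSIX-compatible shell such as ksh or bash and output in which the size is the last space-separated field of each line, as in the listings above. The helper names to_mb and largest_block_mb are hypothetical, only the M and G suffixes are handled, and the check that the Logical Domains Manager actually performs might consider more than block size.

```shell
# Hypothetical helper: convert a size such as 512M or 2G to megabytes.
# Returns nonzero for anything that is not a number followed by M or G.
to_mb() {
    case "$1" in
        [0-9]*G) echo $(( ${1%G} * 1024 )) ;;
        [0-9]*M) echo "${1%M}" ;;
        *)       return 1 ;;
    esac
}

# Hypothetical helper: read listing lines on standard input and print the
# size, in megabytes, of the largest memory block found.
largest_block_mb() {
    max=0
    while read -r line; do
        size=${line##* }                  # last space-separated field
        mb=$(to_mb "$size") || continue   # skip headers and blank lines
        if [ "$mb" -gt "$max" ]; then
            max=$mb
        fi
    done
    echo "$max"
}

# Usage sketch (first command on the target, second on the source):
#   ldm list-devices memory | largest_block_mb
#   ldm list -o memory ldg-src | largest_block_mb
```

If the value reported for the source domain is larger than the value reported for the target machine, as with the 2-Gbyte block and the two 1-Gbyte blocks in this example, the migration is likely to fail.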

Migrating Physical I/O Devices in an Active Domain

Virtual devices that are backed by physical devices can be migrated. However, domains that have direct access to physical devices cannot be migrated. For instance, you cannot migrate I/O domains.

Migrating Virtual I/O Devices in an Active Domain

All virtual I/O (VIO) services used by the source domain must be available on the target machine. In other words, the following conditions must exist:

Migrating NIU Hybrid Input/Output in an Active Domain

A domain using NIU Hybrid I/O resources can be migrated. A constraint specifying NIU Hybrid I/O resources is not a hard requirement of a logical domain. If such a domain is migrated to a machine that does not have available NIU resources, the constraint is preserved, but not fulfilled.

Migrating Cryptographic Units in an Active Domain

You cannot migrate a logical domain that has bound cryptographic units if it has more than one VCPU. Attempts to migrate such a domain will fail.

Delayed Reconfiguration in an Active Domain

Any active delayed reconfiguration operations on the source or target hosts prevent a migration from starting. Delayed reconfiguration operations are blocked while a migration is in progress.

Operations on Other Domains

While a migration is in progress on a machine, any operation which could result in the modification of the Machine Description (MD) of the domain being migrated is blocked. This includes all operations on the domain itself as well as operations such as bind and stop on other domains on the machine.

Migrating Bound or Inactive Domains

Because a bound or inactive domain is not executing at the time of the migration, there are fewer restrictions than when you migrate an active domain.

The migration of a bound domain requires that the target is able to satisfy the CPU, memory, and I/O constraints of the source domain. Otherwise, the migration will fail. The migration of an inactive domain does not have such requirements. However, the target must satisfy the domain's constraints when the binding occurs. Otherwise, the domain binding will fail.

Migrating CPUs in a Bound or Inactive Domain

You can migrate a bound or inactive domain between machines running different processor types and machines that are running at different frequencies.

The Solaris OS image in the guest must support the processor type on the target machine.

Migrating Virtual Input/Output in a Bound or Inactive Domain

For an inactive domain, no checks are performed against the virtual input/output (VIO) constraints, so the VIO servers do not need to exist for the migration to succeed. As with any inactive domain, the VIO servers must exist and be available at the time the domain is bound.

Performing a Dry Run

When you provide the -n option to the migrate-domain subcommand, migration checks are performed, but the source domain is not migrated. Any requirement that is not satisfied is reported as an error. This allows you to correct any configuration errors before attempting a real migration.

Note –

Because of the dynamic nature of logical domains, it is possible for a dry run to succeed and the subsequent migration to fail, and vice versa.
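
A dry run can be combined with the real migration so that the migration is attempted only when the checks pass. The following is a minimal sketch in which the wrapper name migrate_checked is hypothetical. It assumes that the ldm command exits with a nonzero status when a check fails and that each invocation might prompt for the target password.

```shell
# Hypothetical wrapper: perform a dry run first and migrate only if it passes.
# Assumes that ldm exits with a nonzero status when a migration check fails.
migrate_checked() {
    domain=$1
    target=$2
    if ldm migrate-domain -n "$domain" "$target"; then
        # The checks passed, so attempt the real migration.
        ldm migrate-domain "$domain" "$target"
    else
        echo "Dry run failed; $domain was not migrated" >&2
        return 1
    fi
}

# Usage sketch:
#   migrate_checked ldg1 t5440-sys-2
```

Because conditions can change between the two invocations, a successful dry run does not guarantee that the migration succeeds.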

Monitoring a Migration in Progress

When a migration is in progress, the source and target domains are shown differently in the status output. The output of the ldm list command indicates the state of the migrating domain.

The sixth column in the FLAGS field shows one of the following values:

The following shows that ldg-src is the source domain of the migration:

# ldm list ldg-src
ldg-src    suspended  -n---s          1     1G       0.0%  2h 7m

The following shows that ldg-tgt is the target domain of the migration:

# ldm list ldg-tgt
ldg-tgt    bound      -----t  5000    1     1G
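
The sixth FLAGS character can also be read by a script. The following is a minimal sketch, assuming a POSIX-compatible shell and that FLAGS is the third whitespace-separated field of an ldm list output line, as in the two listings above. The helper name migration_role is hypothetical, and only the s and t values that appear in these listings are recognized.

```shell
# Hypothetical helper: report the migration role of a domain from one line
# of ldm list output, using the sixth character of the FLAGS field.
migration_role() {
    flags=$(printf '%s\n' "$1" | awk '{ print $3 }')
    case "$flags" in
        ?????s*) echo source ;;
        ?????t*) echo target ;;
        *)       echo none ;;
    esac
}

# Usage sketch:
#   migration_role "$(ldm list ldg-src | tail -1)"
```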

In the long form of the status output, additional information is shown about the migration. On the source, the percentage of the operation complete is displayed along with the target host and domain name. Similarly, on the target, the percentage of the operation complete is displayed along with the source host and domain name.

Example 8–1 Monitoring a Migration in Progress

# ldm list -o status ldg-src
    migration    17%         t5440-sys-2
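
The progress value can be extracted from this output for use in a script. The following is a minimal sketch, assuming that the status line begins with the word migration followed by the percentage, as shown in the example. The helper name migration_percent is hypothetical.

```shell
# Hypothetical helper: read ldm list -o status output on standard input and
# print the percentage (without the % sign) from the migration status line.
migration_percent() {
    awk '$1 == "migration" { sub(/%$/, "", $2); print $2 }'
}

# Usage sketch: report progress every 10 seconds until no migration
# status line is shown for the domain.
#   while pct=$(ldm list -o status ldg-src | migration_percent) && [ -n "$pct" ]; do
#       echo "ldg-src: ${pct}% complete"
#       sleep 10
#   done
```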

Canceling a Migration in Progress

Once a migration starts, if the ldm command is interrupted with a KILL signal, the migration is terminated. The target domain is destroyed, and the source domain is resumed if it was active. If the controlling shell of the ldm command is lost, the migration continues in the background.

A migration operation can also be canceled externally by using the ldm cancel-operation command. This terminates the migration in progress, and the source domain resumes as the active domain. The ldm cancel-operation command should be initiated from the source system. On a given system, any migration-related command impacts the migration operation that was started from that system. A system cannot control a migration operation when it is the target system.

Note –

Once a migration has been initiated, suspending the ldm(1M) process does not pause the operation, because the Logical Domains Manager daemons (ldmd) on the source and target machines carry out the migration. The ldm process waits for a signal from ldmd that the migration has completed before returning.

Recovering From a Failed Migration

If the network connection is lost after the source has completed sending all the runtime state information to the target, but before the target can acknowledge that the domain has been resumed, the migration operation terminates, and the source is placed in an error state. This indicates that user interaction is required to determine whether or not the migration was completed successfully. In such a situation, take the following steps.

Migration Examples

Example 8–2 shows how a domain, called ldg1, can be migrated to a machine called t5440-sys-2.

Example 8–2 Migrating a Guest Domain

# ldm migrate-domain ldg1 t5440-sys-2
Target Password:

Example 8–3 shows that a domain can be renamed as part of the migration. In this example, ldg-src is the source domain, and it is renamed to ldg-tgt on the target machine (t5440-sys-2) as part of the migration. In addition, the user name (root) on the target machine is explicitly specified.

Example 8–3 Migrating and Renaming a Guest Domain

# ldm migrate ldg-src root@t5440-sys-2:ldg-tgt
Target Password:
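
The target argument in this example combines an optional user name, the target host, and an optional new domain name. The following is a minimal sketch of how such an argument breaks down, assuming the form [user@]host[:domain] suggested by the examples in this chapter and a POSIX-compatible shell. The helper name parse_target is hypothetical and is not part of the ldm command.

```shell
# Hypothetical helper: split a migration target of the assumed form
# [user@]host[:domain] into its parts and print them on one line.
parse_target() {
    spec=$1
    user=""
    name=""
    case "$spec" in
        *@*) user=${spec%%@*}; spec=${spec#*@} ;;   # text before the first @
    esac
    case "$spec" in
        *:*) name=${spec#*:}; spec=${spec%%:*} ;;   # text after the first :
    esac
    echo "user=${user:-<current>} host=$spec domain=${name:-<unchanged>}"
}

parse_target root@t5440-sys-2:ldg-tgt
# prints: user=root host=t5440-sys-2 domain=ldg-tgt
parse_target t5440-sys-2
# prints: user=<current> host=t5440-sys-2 domain=<unchanged>
```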

Example 8–4 shows a sample failure message if the target machine does not have migration support; that is, if it is running an LDoms version prior to version 1.1.

Example 8–4 Migration Failure Message

# ldm migrate ldg1 t5440-sys-2
Target Password:
Failed to establish connection with ldmd(1m) on target: t5440-sys-2
Check that the 'ldmd' service is enabled on the target machine and
that the version supports Domain Migration. Check that the 'xmpp_enabled'
and 'incoming_migration_enabled' properties of the 'ldmd' service on
the target machine are set to 'true' using svccfg(1M).

Example 8–5 shows how to obtain status on a target domain while the migration is in progress. In this example, the source machine is t5440-sys-1.

Example 8–5 Obtaining Target Domain Status

# ldm list -o status ldg-tgt
    migration    55%         t5440-sys-1

Example 8–6 shows how to obtain parseable status on the source domain while the migration is in progress. In this example, the target machine is t5440-sys-2.

Example 8–6 Obtaining Source Domain Parseable Status

# ldm list -o status -p ldg-src