CHAPTER 8

Migrating Logical Domains

This chapter describes how to migrate logical domains from one host machine to another, a capability introduced in LDoms 1.1 software.


Introduction to Logical Domain Migration

Logical Domain Migration provides the ability to migrate a logical domain from one host machine to another. The host where the migration is initiated is the source machine, and the host to which the domain is migrated is the target machine. Similarly, once a migration starts, the domain being migrated is the source domain, and the shell of a domain created on the target machine is the target domain while the migration is in progress.


Overview of a Migration Operation

The Logical Domains Manager on the source machine accepts the request to migrate a domain and establishes a secure network connection with the Logical Domains Manager running on the target machine. Once this connection has been established, the migration occurs. The migration itself can be broken down into different phases.

Phase 1: After the connection with the Logical Domains Manager running on the target host is established, information about the source machine and domain is transferred to the target host. This information is used to perform a series of checks that determine whether a migration is possible. The checks differ depending on the state of the source domain. For example, if the source domain is active, a different set of checks is performed than if the domain is bound or inactive.

Phase 2: When all checks in Phase 1 have passed, the source and target machines prepare for the migration. In the case where the source domain is active, this includes shrinking the number of CPUs to one and suspending the domain. On the target machine, a domain is created to receive the source domain.

Phase 3: For an active domain, the next phase is to transfer all the runtime state information for the domain to the target. This information is retrieved from the hypervisor. On the target, the state information is installed in the hypervisor.

Phase 4: Handoff. After all state information is transferred, the handoff occurs when the target domain resumes execution (if the source was active) and the source domain is destroyed. From this point on, the target domain is the sole version of the domain running.


Software Compatibility

For a migration to occur, both the source and target machines must be running compatible software:



Note - Since this is the first release of the migration feature, both machines must be running LDoms 1.1 software and up-to-date firmware. Refer to the Logical Domains (LDoms) 1.1 Release Notes for the latest firmware for your platform.




Authentication

Since the migration operation executes on two machines, a user must be authenticated on both the source and target host. In particular, the user must have the solaris.ldoms.write authorization on both machines.

The ldm command line interface for migration allows the user to specify an optional alternate user name for authentication on the target host. If this is not specified, the user name of the user executing the migration command is used. In both cases, the user is prompted for a password for the target machine.
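
Before starting a migration, it can be useful to confirm that the account holds the required authorization on each machine. The following is a minimal sketch, not part of the LDoms software: the function name has_ldoms_write is hypothetical, and it assumes the comma-separated list format printed by auths(1) and the usual wildcard forms (solaris.* and solaris.ldoms.*).

```shell
# Hypothetical helper: succeed if a comma-separated authorization list
# (the format assumed for auths(1) output) grants solaris.ldoms.write,
# either directly or through a wildcard entry.
has_ldoms_write() {
    case ",$1," in
        *,solaris.ldoms.write,*|*,solaris.ldoms.\*,*|*,solaris.\*,*)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Possible usage on each machine (run as the migrating user):
#   has_ldoms_write "`auths`" || echo "solaris.ldoms.write is missing" >&2
```

Run the check on both the source and the target machine, because the authorization is required on each.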


Migrating an Active Domain

Migrating an active domain with LDoms 1.1 software imposes a set of requirements and restrictions on the source logical domain, the source machine, and the target machine. The following sections describe these requirements and restrictions for each resource type.

CPUs

Following are the requirements and restrictions on CPUs when performing a migration.

Memory

There must be sufficient free memory on the target machine to accommodate the migration of the source domain. In addition, following are a few properties that must be maintained across the migration:

Physical Input/Output

The logical domain to be migrated must not contain any physical I/O devices. If a domain has any physical I/O devices, the migration fails.

Virtual Input/Output

All virtual I/O (VIO) services used by the source domain must be available on the target machine. In other words, the following conditions must exist:

NIU Hybrid Input/Output

A domain using NIU Hybrid I/O resources can be migrated. A constraint specifying NIU Hybrid I/O resources is not a hard requirement of a logical domain. If such a domain is migrated to a machine that does not have available NIU resources, the constraint is preserved, but not fulfilled.

Cryptographic Units

You cannot migrate a logical domain that has bound cryptographic units. Attempts to migrate such a domain fail.

Delayed Reconfiguration

Any active delayed reconfiguration operations on the source or target hosts prevent a migration from starting. Delayed reconfiguration operations are blocked while a migration is in progress.

Operations on Other Domains

While a migration is in progress on a machine, any operation that could modify the Machine Description (MD) of the domain being migrated is blocked. This includes all operations on the domain itself, as well as operations such as bind, stop, and start on other domains on the machine.


Migrating Bound or Inactive Domains

Because a bound or inactive domain is not executing at the time of the migration, there are fewer restrictions than when you migrate an active domain.

CPUs

You can migrate a bound or inactive domain between machines running different processor types and machines that are running at different frequencies.

The Solaris OS image in the guest must support the processor type on the target machine.

Virtual Input/Output

For an inactive domain, no checks are performed against the virtual input/output (VIO) constraints, so the VIO servers do not need to exist for the migration to succeed. As with any inactive domain, the VIO servers must exist and be available at the time the domain is bound.


Performing a Dry Run

When you provide the -n option to the migrate-domain subcommand, migration checks are performed, but the source domain is not migrated. Any requirement that is not satisfied is reported as an error. This allows you to correct any configuration errors before attempting a real migration.



Note - Because of the dynamic nature of logical domains, it is possible for a dry run to succeed and a migration to fail, and vice versa.
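
The dry run and the real migration can be chained so that the migration is attempted only when the checks pass. The following is a minimal sketch: the function name migrate_with_dry_run and the LDM variable are hypothetical, and the sketch assumes that ldm returns a nonzero exit status when a dry run reports an error. Expect a password prompt for the target machine on each of the two invocations.

```shell
# LDM is a hypothetical override so that the function can be exercised
# against a stub; by default it runs the real ldm command.
LDM=${LDM:-ldm}

# Run the migration checks first (-n), and migrate only if they pass.
# Assumes a failed dry run is reported through a nonzero exit status.
migrate_with_dry_run() {
    domain=$1
    target=$2
    if "$LDM" migrate-domain -n "$domain" "$target"; then
        "$LDM" migrate-domain "$domain" "$target"
    else
        echo "dry run failed for $domain; migration not attempted" >&2
        return 1
    fi
}

# Possible usage:
#   migrate_with_dry_run ldg1 t5440-sys-2
```

As the preceding note points out, a successful dry run does not guarantee that the subsequent migration succeeds, so the exit status of the second command still needs to be checked.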




Monitoring a Migration in Progress

When a migration is in progress, the source and target domains are displayed differently in the status output. In particular, the short version of the status output shows a new flag indicating the state of the migrating domain. The source domain shows an s to indicate that it is the source of the migration. The target domain shows a t to indicate that it is the target of a migration. If an error occurs that requires user intervention, an e is displayed.

In the long form of the status output, additional information is displayed about the migration. On the source, the percentage of the operation complete is displayed along with the target host and domain name. Similarly, on the target, the percentage of the operation complete is displayed along with the source host and domain name.


EXAMPLE 8-1   Monitoring a Migration in Progress
# ldm ls -o status ldg-src
NAME
ldg-src
 
STATUS
    OPERATION    PROGRESS    TARGET 
    migration    17%         t5440-sys-2 


Canceling a Migration in Progress

Once a migration starts, if the ldm command is interrupted with a KILL signal, the migration is terminated. The target domain is destroyed, and the source domain is resumed if it was active. If the controlling shell of the ldm command is lost, the migration continues in the background.

A migration operation can also be canceled from outside the running ldm command by using the cancel-operation subcommand. This terminates the migration in progress, and the source domain resumes as the master domain.
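
An external cancellation might be scripted as follows. This is a sketch under stated assumptions: the function name cancel_migration and the LDM variable are hypothetical, the argument form "cancel-operation migration domain" should be confirmed against the ldm(1M) man page for your release, and the check for an "|op=migration" line assumes the parseable status format shown in EXAMPLE 8-6.

```shell
# LDM is a hypothetical override so that the function can be exercised
# against a stub; by default it runs the real ldm command.
LDM=${LDM:-ldm}

# Cancel a migration only if the parseable status output reports one.
# Assumes an in-progress migration appears as a line that starts with
# "|op=migration", as in EXAMPLE 8-6.
cancel_migration() {
    domain=$1
    if "$LDM" ls -o status -p "$domain" | grep '^|op=migration' >/dev/null; then
        "$LDM" cancel-operation migration "$domain"
    else
        echo "no migration in progress for $domain" >&2
        return 1
    fi
}

# Possible usage, from a second shell on the source machine:
#   cancel_migration ldg-src
```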



Note - Once a migration has been initiated, suspending the ldm(1M) process does not pause the operation, because the migration is carried out by the Logical Domains Manager daemons (ldmd) on the source and target machines. The ldm process waits for a signal from ldmd that the migration has been completed before returning.




Recovering From a Failed Migration

If the network connection is lost after the source has completed sending all the runtime state information to the target, but before the target can acknowledge that the domain has been resumed, the migration operation terminates, and the source is placed in an error state. This indicates that user interaction is required to determine whether or not the migration was completed successfully. In such a situation, take the following steps.


Examples

EXAMPLE 8-2 shows how a domain, called ldg1, can be migrated to a machine called t5440-sys-2.


EXAMPLE 8-2   Migrating a Guest Domain
# ldm migrate-domain ldg1 t5440-sys-2
Target Password:
#

EXAMPLE 8-3 shows that a domain can be renamed as part of the migration. In this example, ldg-src is the source domain, and it is renamed to ldg-tgt on the target machine (t5440-sys-2) as part of the migration. In addition, the user name (root) on the target machine is explicitly specified.


EXAMPLE 8-3   Migrating and Renaming a Guest Domain
# ldm migrate ldg-src root@t5440-sys-2:ldg-tgt
Target Password:
#
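
Taken together, EXAMPLE 8-2 and EXAMPLE 8-3 suggest that the target argument has the form [user@]host[:domain]. The following sketch splits such an argument into its parts, which can help when a script must log or validate a migration target. The function name parse_target is hypothetical, the general form is inferred from the two examples rather than quoted from the ldm(1M) man page, and the parameter expansions require a POSIX-style shell such as ksh.

```shell
# Hypothetical helper: split a target of the assumed form
# [user@]host[:domain] and print the three parts on one line.
parse_target() {
    spec=$1
    user=
    host=
    domain=
    case $spec in
        *@*) user=${spec%%@*}; spec=${spec#*@} ;;
    esac
    case $spec in
        *:*) host=${spec%%:*}; domain=${spec#*:} ;;
        *)   host=$spec ;;
    esac
    printf 'user=%s host=%s domain=%s\n' "$user" "$host" "$domain"
}

# Possible usage:
#   parse_target root@t5440-sys-2:ldg-tgt
#   parse_target t5440-sys-2
```

An empty user field corresponds to the default described earlier in this chapter, where the user name of the user executing the migration command is used on the target host.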

EXAMPLE 8-4 shows a sample failure message that is displayed if the target machine does not have migration support; that is, if it is running an LDoms version prior to version 1.1.


EXAMPLE 8-4   Migration Failure Message
# ldm migrate ldg1 t5440-sys-2
Target Password:
Failed to establish connection with ldmd(1m) on target: t5440-sys-2
Check that the ’ldmd’ service is enabled on the target machine and
that the version supports Domain Migration. Check that the ’xmpp_enabled’
and ’incoming_migration_enabled’ properties of the ’ldmd’ service on
the target machine are set to ’true’ using svccfg(1M).

EXAMPLE 8-5 shows how to obtain status on a target domain while the migration is in progress. In this example, the source machine is t5440-sys-1.


EXAMPLE 8-5   Obtaining Target Domain Status
# ldm ls -o status ldg-tgt
NAME
ldg-tgt
 
STATUS
    OPERATION    PROGRESS    SOURCE
    migration    55%         t5440-sys-1

EXAMPLE 8-6 shows how to obtain parseable status on the source domain while the migration is in progress. In this example, the target machine is t5440-sys-2.


EXAMPLE 8-6   Obtaining Source Domain Parseable Status
# ldm ls -o status -p ldg-src
VERSION 1.3
DOMAIN|name=ldg-src|
STATUS
|op=migration|progress=42|error=no|target=t5440-sys-2
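
The parseable output lends itself to scripting. The following is a minimal sketch that extracts the progress value from output shaped like EXAMPLE 8-6. The function name migration_progress is hypothetical, and the field layout is assumed from that single example, so verify it against the output of your release.

```shell
# Hypothetical helper: read "ldm ls -o status -p" output on standard
# input and print the value of the progress field of the migration
# line. Prints nothing if no migration line is present.
migration_progress() {
    awk -F'|' '
        /^\|op=migration/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^progress=/) {
                    sub(/^progress=/, "", $i)
                    print $i
                    exit
                }
            }
        }'
}

# Possible usage:
#   ldm ls -o status -p ldg-src | migration_progress
```

The same loop could test the error field instead, which is useful for detecting the error state described in "Recovering From a Failed Migration".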