Overview of a Migration Operation

The Logical Domains Manager on the source machine accepts the request to migrate a domain and establishes a secure network connection with the Logical Domains Manager running on the target machine. Once this connection has been established, the migration occurs. The migration itself proceeds in four phases.

Phase 1: After connecting with the Logical Domains Manager running on the target host, information about the source machine and domain is transferred to the target host. This information is used to perform a series of checks to determine whether a migration is possible. The checks differ depending on the state of the source domain. For example, if the source domain is active, a different set of checks is performed than if the domain is bound or inactive.

Phase 2: When all checks in Phase 1 have passed, the source and target machines prepare for the migration. If the source domain is active, this preparation includes shrinking the number of CPUs to one and suspending the domain. On the target machine, a domain is created to receive the source domain.

Phase 3: For an active domain, the next phase is to transfer all the runtime state information for the domain to the target. This information is retrieved from the hypervisor. On the target, the state information is installed in the hypervisor.

Phase 4: Handoff. After all state information is transferred, the handoff occurs when the target domain resumes execution (if the source was active) and the source domain is destroyed. From this point on, the target domain is the sole version of the domain running.
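
The four phases can be read as a simple sequence in which Phase 3 applies only to an active domain. The following Python sketch is illustrative only. The names `Phase` and `phases_for` are hypothetical and are not part of the Logical Domains Manager, and skipping Phase 3 for bound or inactive domains is an interpretation of the description above.

```python
from enum import Enum


class Phase(Enum):
    CHECKS = 1    # Phase 1: compatibility checks performed for the target host
    PREPARE = 2   # Phase 2: shrink to one CPU and suspend (if active), create target domain
    TRANSFER = 3  # Phase 3: move runtime state through the hypervisor
    HANDOFF = 4   # Phase 4: target resumes (if active), source is destroyed


def phases_for(domain_state):
    """Return the phases that apply to a source domain in the given state."""
    if domain_state not in ("active", "bound", "inactive"):
        raise ValueError("unknown domain state: %r" % domain_state)
    phases = [Phase.CHECKS, Phase.PREPARE]
    if domain_state == "active":
        # Only an executing domain has runtime state to transfer.
        phases.append(Phase.TRANSFER)
    phases.append(Phase.HANDOFF)
    return phases
```

For example, `phases_for("bound")` omits `Phase.TRANSFER`, while `phases_for("active")` returns all four phases in order.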

Software Compatibility

For a migration to occur, both the source and target machines must be running compatible software:

The hypervisor on both the source and target machines must support the most recent version of the LDoms 1.1 firmware.

If you see the following error, either the source or the target machine does not have the correct version of system firmware.

System Firmware version on <downrev machine> does not support Domain Migration
Domain Migration of LDom <source domain> failed

A compatible version of the Logical Domains Manager must be running on both machines.

Note - Since this is the first release of the migration feature, both machines must be running LDoms 1.1 software and up-to-date firmware. Refer to the Logical Domains (LDoms) 1.1 Release Notes for the latest firmware for your platform.

Authentication

Since the migration operation executes on two machines, a user must be authenticated on both the source and target host. In particular, the user must have the solaris.ldoms.write authorization on both machines.

The ldm command line interface for migration allows the user to specify an optional alternate user name for authentication on the target host. If this is not specified, the user name of the user executing the migration command is used. In both cases, the user is prompted for a password for the target machine.

Migrating an Active Domain

To migrate an active domain with LDoms 1.1 software, certain requirements and restrictions apply to the source logical domain, the source machine, and the target machine. The following sections describe these requirements and restrictions for each of the resource types.

CPUs

Following are the requirements and restrictions on CPUs when performing a migration.

The source and target machines must have the same processor type running at the same frequency.
The target machine must have sufficient free strands to accommodate the number of strands in use by the domain. In addition, full cores must be allocated for the migrated domain. If the number of strands in the source is less than a full core, the extra strands are unavailable to any domain until after the migrated domain is rebooted.
After a migration, CPU dynamic reconfiguration (DR) is disabled for the target domain until it has been rebooted. Once a reboot has occurred, CPU DR becomes available for that domain.
Either the source domain must have only a single strand, or the guest OS must support CPU DR, so that the domain can be shrunk to a single strand before migration. Conditions in the guest domain that would cause a CPU DR removal to fail would also cause the migration attempt to fail. For example, processes bound to CPUs within the guest domain, or processor sets configured in the source logical domain, can cause a migration operation to fail.
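
The full-core requirement amounts to rounding the strand count up to a whole number of cores. The following Python sketch shows that arithmetic. The function name `strands_reserved` is hypothetical, and the number of strands per core is an input that depends on the processor; the value 8 used in the example is an assumption for illustration only.

```python
import math


def strands_reserved(strands_in_use, strands_per_core):
    """Return (strands allocated on the target, strands left unavailable).

    Full cores are allocated, so the strand count is rounded up to a
    multiple of strands_per_core. The remainder is unavailable to any
    domain until the migrated domain is rebooted.
    """
    if strands_in_use < 1 or strands_per_core < 1:
        raise ValueError("counts must be positive")
    cores = math.ceil(strands_in_use / strands_per_core)
    allocated = cores * strands_per_core
    return allocated, allocated - strands_in_use
```

With an assumed 8 strands per core, a domain that uses 12 strands needs 2 full cores on the target, so `strands_reserved(12, 8)` returns `(16, 4)`: 16 strands are allocated and 4 of them stay unavailable until the reboot.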

Memory

There must be sufficient free memory on the target machine to accommodate the migration of the source domain. In addition, the following properties must be maintained across the migration:

It must be possible to create the same number of identically-sized memory blocks.
The physical addresses of the memory blocks do not need to match, but the same real addresses must be maintained across the migration.

Physical Input/Output

The logical domain to be migrated must not contain any physical I/O devices. If a domain has any physical I/O devices, the migration fails.

Virtual Input/Output

All virtual I/O (VIO) services used by the source domain must be available on the target machine. In other words, the following conditions must exist:

Each logical volume used in the source logical domain must also be available on the target host and must refer to the same storage.

Caution - If the logical volume used by the source as a boot device exists on the target but does not refer to the same storage, the migration appears to succeed, but the migrated domain is not usable because it cannot access its boot device. The domain has to be stopped, the configuration issue corrected, and then the domain restarted. Otherwise, the domain could be left in an inconsistent state.

For each virtual network device in the source domain, a virtual network switch must exist on the target host, with the same name as the virtual network switch the device is attached to on the source host.

For example, if vnet0 in the source domain is attached to a virtual switch service named switch-y, then there must be a logical domain on the target host providing a virtual switch service named switch-y.

Note - The switches do not have to be connected to the same network for the migration to occur, though the migrated domain can experience networking problems if the switches are not connected to the same network.

MAC addresses used by the source domain that are in the automatically allocated range must be available for use on the target host.

A virtual console concentrator (vcc) service must exist on the target host and have at least one free port. Explicit console constraints are ignored during the migration. The console for the target domain is created using the target domain name as the console group and using any available port on the first vcc device in the control domain. If there is a conflict with the default group name, the migration fails.
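
Some of these conditions can be checked by hand before a migration is attempted. The following Python sketch covers only the virtual switch name and vcc port conditions. It does not examine logical volumes, MAC addresses, or console group name conflicts. The function `vio_problems` and its input data structures are hypothetical; they are not produced by any ldm subcommand and would have to be filled in from your own inventory of both hosts.

```python
def vio_problems(source_vnets, target_switches, target_vcc_free_ports):
    """List unmet virtual switch and vcc conditions.

    source_vnets: dict mapping each virtual network device in the source
        domain to the name of the virtual switch it is attached to.
    target_switches: set of virtual switch service names on the target host.
    target_vcc_free_ports: number of free ports on the target vcc service.
    """
    problems = []
    for vnet, switch in sorted(source_vnets.items()):
        # The target needs a virtual switch with the same name.
        if switch not in target_switches:
            problems.append(
                "%s: no virtual switch named %s on the target host" % (vnet, switch)
            )
    # The target vcc service needs at least one free port.
    if target_vcc_free_ports < 1:
        problems.append("no free port on the target vcc service")
    return problems
```

For the example above, `vio_problems({"vnet0": "switch-y"}, {"switch-y"}, 1)` returns an empty list. An empty list means only that these two conditions hold, not that the migration will succeed.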

NIU Hybrid Input/Output

A domain using NIU Hybrid I/O resources can be migrated. A constraint specifying NIU Hybrid I/O resources is not a hard requirement of a logical domain. If such a domain is migrated to a machine that does not have available NIU resources, the constraint is preserved, but not fulfilled.

Cryptographic Units

You cannot migrate a logical domain that has bound cryptographic units. Attempts to migrate such a domain fail.

Delayed Reconfiguration

Any active delayed reconfiguration operations on the source or target hosts prevent a migration from starting. Delayed reconfiguration operations are blocked while a migration is in progress.

Operations on Other Domains

While a migration is in progress on a machine, any operation which could result in the modification of the Machine Description (MD) of the domain being migrated is blocked. This includes all operations on the domain itself as well as operations such as bind, stop, and start on other domains on the machine.

Migrating Bound or Inactive Domains

Because a bound or inactive domain is not executing at the time of the migration, there are fewer restrictions than when you migrate an active domain.

CPUs

You can migrate a bound or inactive domain between machines running different processor types and machines that are running at different frequencies.

The Solaris OS image in the guest must support the processor type on the target machine.

Virtual Input/Output

For an inactive domain, there are no checks performed against the virtual input/output (VIO) constraints. So, the VIO servers do not need to exist for the migration to succeed. As with any inactive domain, the VIO servers need to exist and be available at the time the domain is bound.

Performing a Dry Run

When you provide the -n option to the migrate-domain subcommand, migration checks are performed, but the source domain is not migrated. Any requirement that is not satisfied is reported as an error. This allows you to correct any configuration errors before attempting a real migration.

Note - Because of the dynamic nature of logical domains, it is possible for a dry run to succeed and the subsequent migration to fail, and vice versa.
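
A script that wraps the ldm command might build the dry-run and real command lines from the same inputs so that the two cannot drift apart. The following Python sketch only constructs the argument list and does not run anything. The function name `migrate_argv` is hypothetical, and placing -n directly after the subcommand name is an assumption based on the usual convention of giving options before operands.

```python
def migrate_argv(domain, target, dry_run=False):
    """Build the argument list for an ldm migrate-domain invocation."""
    argv = ["ldm", "migrate-domain"]
    if dry_run:
        # -n performs the migration checks without migrating the domain.
        argv.append("-n")
    argv.extend([domain, target])
    return argv
```

For example, `migrate_argv("ldg1", "t5440-sys-2", dry_run=True)` yields the list `["ldm", "migrate-domain", "-n", "ldg1", "t5440-sys-2"]`. Because the command prompts for the target password, such a list is best run from an interactive terminal.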

Monitoring a Migration in Progress

When a migration is in progress, the source and target domains are displayed differently in the status output. In particular, the short version of the status output shows a new flag indicating the state of the migrating domain. The source domain shows an s to indicate that it is the source of the migration. The target domain shows a t to indicate that it is the target of a migration. If an error occurs that requires user intervention, an e is displayed.

In the long form of the status output, additional information is displayed about the migration. On the source, the percentage of the operation complete is displayed along with the target host and domain name. Similarly, on the target, the percentage of the operation complete is displayed along with the source host and domain name.

**EXAMPLE 8-1 Monitoring a Migration in Progress**
# `ldm ls -o status ldg-src`
NAME
ldg-src
STATUS
    OPERATION    PROGRESS    TARGET
    migration    17%         t5440-sys-2

Canceling a Migration in Progress

Once a migration starts, if the ldm command is interrupted with a KILL signal, the migration is terminated. The target domain is destroyed, and the source domain is resumed if it was active. If the controlling shell of the ldm command is lost, the migration continues in the background.

A migration operation can also be canceled externally from the ldm command using the cancel-operation subcommand. This terminates the migration in progress, and the source domain resumes as the master domain.

Note - Once a migration has been initiated, suspending the ldm(1M) process does not pause the operation, because the Logical Domains Manager daemons (ldmd) on the source and target machines, not the ldm process, carry out the migration. The ldm process waits for a signal from the ldmd that the migration has been completed before returning.

Recovering From a Failed Migration

If the network connection is lost after the source has completed sending all the runtime state information to the target, but before the target can acknowledge that the domain has been resumed, the migration operation terminates, and the source is placed in an error state. This indicates that user interaction is required to determine whether or not the migration was completed successfully. In such a situation, take the following steps.

Determine whether the target domain has resumed successfully. The target domain will be in one of two states:
- If the migration completed successfully, the target domain is in the normal state.
- If the migration failed, the target cleans up and destroys the target domain.

If the target domain has resumed, it is safe to destroy the source domain that is in the error state. If the target domain is not present, the source domain is still the master version of the domain, and it must be recovered. To do this, cancel the migration on the source machine by using the cancel-operation subcommand. This clears the error state and restores the source domain to its original condition.
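
The recovery decision depends only on whether the target domain exists after the failure. The following Python sketch records that rule in code. The function name `recovery_action` and the returned labels are hypothetical; they describe the action to take by hand and are not ldm subcommands.

```python
def recovery_action(target_domain_present):
    """Choose the recovery step for a source domain left in the error state."""
    if target_domain_present:
        # The target resumed, so the source copy is redundant.
        return "destroy-source-domain"
    # The target was cleaned up, so the source is still the master version.
    return "cancel-migration-on-source"
```

Confirm the state of the target domain on the target machine itself before acting on the result, because destroying the source domain when no target domain exists would leave no copy of the domain.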

Examples

EXAMPLE 8-2 shows how a domain, called ldg1, can be migrated to a machine called t5440-sys-2.

**EXAMPLE 8-2 Migrating a Guest Domain**
# `ldm migrate-domain ldg1 t5440-sys-2`
Target Password:
#

EXAMPLE 8-3 shows that a domain can be renamed as part of the migration. In this example, ldg-src is the source domain, and it is renamed to ldg-tgt on the target machine (t5440-sys-2) as part of the migration. In addition, the user name (root) on the target machine is explicitly specified.

**EXAMPLE 8-3 Migrating and Renaming a Guest Domain**
# `ldm migrate ldg-src root@t5440-sys-2:ldg-tgt`
Target Password:
#
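
The target argument in EXAMPLE 8-3 combines an optional user name, the target host, and an optional new domain name. The following Python sketch shows how a wrapper script might split such an argument. It assumes the shape user@host:new-name seen in that example, with the user and new-name parts optional. The names `MigrationTarget` and `parse_target` are hypothetical and are not part of the ldm command.

```python
from typing import NamedTuple, Optional


class MigrationTarget(NamedTuple):
    user: Optional[str]      # None: the invoking user's name is used
    host: str
    new_name: Optional[str]  # None: the domain keeps its source name


def parse_target(spec):
    """Split a target argument of the assumed form [user@]host[:new-name]."""
    user, at, rest = spec.partition("@")
    if not at:
        # No "@" present, so the whole argument is host[:new-name].
        user, rest = None, spec
    host, colon, new_name = rest.partition(":")
    if not host or (at and not user) or (colon and not new_name):
        raise ValueError("malformed target: %r" % spec)
    return MigrationTarget(user, host, new_name if colon else None)
```

Here `parse_target("root@t5440-sys-2:ldg-tgt")` yields the user root, the host t5440-sys-2, and the new name ldg-tgt, while `parse_target("t5440-sys-2")` leaves both optional parts as `None`.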

EXAMPLE 8-4 shows a sample failure message if the target domain does not have migration support; that is, if you are running an LDoms version prior to version 1.1.

**EXAMPLE 8-4 Migration Failure Message**
# `ldm migrate ldg1 t5440-sys-2`
Target Password:
Failed to establish connection with ldmd(1m) on target: t5440-sys-2
Check that the 'ldmd' service is enabled on the target machine and
that the version supports Domain Migration. Check that the 'xmpp_enabled'
and 'incoming_migration_enabled' properties of the 'ldmd' service on
the target machine are set to 'true' using svccfg(1M).

EXAMPLE 8-5 shows how to obtain status on a target domain while the migration is in progress. In this example, the source machine is t5440-sys-1.

**EXAMPLE 8-5 Obtaining Target Domain Status**
# `ldm ls -o status ldg-tgt`
NAME
ldg-tgt
STATUS
    OPERATION    PROGRESS    SOURCE
    migration    55%         t5440-sys-1

EXAMPLE 8-6 shows how to obtain parseable status on the source domain while the migration is in progress. In this example, the target machine is t5440-sys-2.

**EXAMPLE 8-6 Obtaining Source Domain Parseable Status**
# `ldm ls -o status -p ldg-src`
VERSION 1.3
DOMAIN|name=ldg-src|
STATUS
|op=migration|progress=42|error=no|target=t5440-sys-2
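
The parseable form is intended for scripts. The following Python sketch extracts the key=value fields that follow the STATUS keyword in output like that of EXAMPLE 8-6. The function name `parse_status` is hypothetical, the sample text is copied from that example with assumed line breaks, and the parser relies only on the | separators and key=value pairs visible there, not on a documented format specification.

```python
import re

# Sample text based on EXAMPLE 8-6; the line breaks are an assumption.
SAMPLE = """VERSION 1.3
DOMAIN|name=ldg-src|
STATUS
|op=migration|progress=42|error=no|target=t5440-sys-2
"""


def parse_status(output):
    """Return the key=value fields that follow the STATUS keyword as a dict."""
    _, found, rest = output.partition("STATUS")
    if not found:
        return {}  # no migration status section in this output
    # Fields are separated by "|" and possibly by whitespace or newlines.
    tokens = re.split(r"[|\s]+", rest)
    return dict(token.split("=", 1) for token in tokens if "=" in token)


status = parse_status(SAMPLE)
progress = int(status["progress"])  # percentage complete, 42 in the sample
```

A monitoring script could call such a function on each poll and stop when the STATUS section disappears or when the error field is no longer no.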
