Traditional Data Migration
Traditional file migration typically works in one of two ways: repeated
synchronization or external interposition.
Migration via Synchronization
This method takes an active host X and migrates its data to a new host Y while
X remains in service. Clients continue to read and write the original host
while the migration is underway. Once the initial copy completes, incremental
changes are sent repeatedly until the remaining delta is small enough to
transfer within a single downtime window. At that point the original share is
made read-only, the final delta is sent to the new host, and all clients are
updated to point to the new location. The most common way of accomplishing
this is with the rsync tool, though other integrated tools exist. This
mechanism has several drawbacks:
- The anticipated downtime, while small, is not easily quantified. A large
  burst of changes committed immediately before the scheduled downtime can
  lengthen the window.
- During migration, the new server sits idle. Since new servers typically
  bring new features or performance improvements, this wastes resources for
  the duration of a potentially long migration.
- Coordinating across multiple filesystems is burdensome. When migrating
  dozens or hundreds of filesystems, each migration takes a different amount
  of time, and downtime must be scheduled across the union of all
  filesystems.
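The synchronize-until-small-delta loop can be sketched in a few lines. This is
an illustrative simulation, not real rsync mechanics: the dict-based hosts,
the `sync` helper, and the delta threshold are all assumptions made for the
sketch.

```python
def sync(source: dict, target: dict) -> int:
    """Copy entries that differ from source to target; return the delta size."""
    changed = {k: v for k, v in source.items() if target.get(k) != v}
    target.update(changed)
    return len(changed)

def migrate_via_sync(source: dict, target: dict, pending_writes: list,
                     threshold: int = 1) -> None:
    """Repeat incremental passes until the delta fits a downtime window."""
    while True:
        delta = sync(source, target)      # incremental pass while X is live
        if delta <= threshold:
            break
        if pending_writes:                # clients keep writing to X meanwhile
            source.update(pending_writes.pop(0))
    # Downtime window begins here: the share is made read-only, the final
    # delta is sent, and clients are repointed to the new host.
    sync(source, target)
```

The loop illustrates the first drawback above: how many passes it takes, and
how large the final delta is, depends entirely on the client write rate.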
Migration via External Interposition
This method works by taking an active host X and inserting a new ZFSSA M that
migrates data to a new host Y. All clients are updated at once to point to M,
and data is migrated automatically in the background. This provides more
flexibility in migration options (for example, migrating to yet another server
in the future without downtime) and puts the new server to work for already
migrated data, but it also has significant drawbacks:
- The migration ZFSSA is a new physical machine, with the associated costs
  (initial investment, support, power and cooling) and additional management
  overhead.
- The migration ZFSSA is a new point of failure within the system.
- The migration ZFSSA interposes on already migrated data, incurring extra
  latency, often permanently. These ZFSSAs are typically left in place,
  though it would be possible to schedule another downtime window and
  decommission the migration ZFSSA.
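The interposition pattern can be sketched as a middle host that serves
already-migrated data from Y, pulls everything else from X on first access,
and copies cold data in a background pass. The `Interposer` class and its
dict-based hosts are hypothetical names for illustration, not the actual ZFSSA
implementation.

```python
class Interposer:
    """Hypothetical middle host M standing between clients and hosts X and Y."""

    def __init__(self, old: dict, new: dict):
        self.old = old   # original host X (treated as a read-only source)
        self.new = new   # new host Y (migration target)

    def read(self, key):
        if key not in self.new:            # not yet migrated: pull from X
            self.new[key] = self.old[key]  # migrate on first access
        return self.new[key]               # note: M adds a hop even after migration

    def write(self, key, value):
        self.new[key] = value              # new writes land on Y directly

    def migrate_background(self):
        for key in self.old:               # background pass for cold data
            self.new.setdefault(key, self.old[key])
```

The `read` path makes the third drawback concrete: even once every file lives
on Y, requests still pass through M unless another downtime window is taken to
repoint clients.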