
Administering an Oracle® Solaris Cluster 4.4 Configuration


Updated: March 2019
 
 

Restoring Cluster Files

You can restore the ZFS root file system to a new disk.

You can restore a cluster or node from a unified archive, or you can restore specific files or file systems.

Before you start to restore files or file systems, you need to know the following information.

  • Which tapes you need

  • The raw device name on which you are restoring the file system

  • The type of tape drive you are using

  • The device name (local or remote) for the tape drive

  • The partition scheme of any failed disk, because the partitions and file systems must be exactly duplicated on the replacement disk
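The last prerequisite, duplicating the failed disk's partition scheme, can be sketched as below. This is a minimal sketch using the standard Solaris `prtvtoc`/`fmthard` utilities; the device and file names are illustrative assumptions, not values from this procedure, and the script only prints the command rather than running it:

```shell
# Hypothetical sketch: duplicate the failed disk's partition table (VTOC)
# on the replacement disk. Device and file names are assumptions.
VTOC=/var/tmp/root-disk.vtoc          # VTOC saved earlier with prtvtoc
REPLACEMENT=/dev/rdsk/c1t0d0s2        # assumed replacement disk, slice 2

# Save the table while the failed disk is still readable:
#   prtvtoc /dev/rdsk/c0t0d0s2 > ${VTOC}
# Re-apply it to the replacement disk:
CMD="fmthard -s ${VTOC} ${REPLACEMENT}"
echo "${CMD}"                         # dry run: print, do not execute
```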

Table 20  Task Map: Restoring Cluster Files
Task: For Solaris Volume Manager, restore the ZFS root (/) file system
Instructions: How to Restore the ZFS Root (/) File System (Solaris Volume Manager)

How to Restore the ZFS Root (/) File System (Solaris Volume Manager)

Use this procedure to restore the ZFS root (/) file system to a new disk, such as after replacing a bad root disk. Do not boot the node that is being restored. Ensure that the cluster is running without errors before you perform the restore procedure. UFS is supported, except as a root file system; UFS can be used on metadevices in Solaris Volume Manager metasets on shared disks.
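The pre-restore health check can be sketched as follows. `cluster status` is the standard Oracle Solaris Cluster status command; the dry-run wrapper below only prints the command, since the check itself must be run on a surviving cluster node:

```shell
# Hypothetical pre-check: verify the cluster reports no errors before
# restoring. Dry run only; execute the printed command on the cluster.
CMD="/usr/cluster/bin/cluster status"
echo "${CMD}"
```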


Note -  Because you must partition the new disk by using the same format as the failed disk, identify the partitioning scheme before you begin this procedure, and recreate file systems as appropriate.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Oracle Solaris Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical.

  1. Assume the root role or a role that provides solaris.cluster.modify authorization on a cluster node with access to the disk sets to which the node to be restored is also attached.

    Use a node other than the node that you are restoring.

  2. Remove the hostname of the node that is being restored from all metasets.

    Run this command from a node in the metaset other than the node that you are removing. Because the recovering node is offline, the system will display an RPC: Rpcbind failure - RPC: Timed out error. Ignore this error and continue to the next step.

    # metaset -s setname -f -d -h nodelist
    -s setname

    Specifies the disk set name.

    -f

    Deletes the last host from the disk set.

    -d

    Deletes from the disk set.

    -h nodelist

    Specifies the name of the node to delete from the disk set.

  3. Restore the ZFS root file system (/).

    To recover the ZFS root pool or root pool snapshots, follow the procedure in Replacing Disks in a ZFS Root Pool in Managing ZFS File Systems in Oracle Solaris 11.4.


    Note -  Ensure that you create the /global/.devices/node@nodeid file system.

    If the /.globaldevices backup file exists in the backup directory, it is restored along with ZFS root restoration. The file is not created automatically by the globaldevices SMF service.
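The note above requires the per-node global-devices file system to exist after the restore. A minimal check can be sketched as below; the node ID (1) is an assumption for illustration:

```shell
# Hypothetical check: confirm the per-node global-devices mount point exists
# after restoring root. The node ID (1) is an assumption for this sketch.
MNT="/global/.devices/node@1"
if [ -d "${MNT}" ]; then
  STATUS="present: ${MNT}"
else
  STATUS="missing: ${MNT}"   # create the file system before rebooting
fi
echo "${STATUS}"
```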

  4. Reboot the node in multiuser mode.
    # reboot
  5. Replace the device ID.
    # cldevice repair root-disk
  6. Use the metadb command to recreate the state database replicas.
    # metadb -c copies -af raw-disk-device
    -c copies

    Specifies the number of replicas to create.

    -f raw-disk-device

    Raw disk device on which to create replicas.

    -a

    Adds replicas.

    See the metadb(8) man page for more information.

  7. From a cluster node other than the restored node, add the restored node to all disk sets.
    phys-schost-2# metaset -s setname -a -h nodelist
    -a

    Creates and adds the host to the disk set.

    The node is rebooted into cluster mode. The cluster is ready to use.
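After the node is added back, its disk set membership can be confirmed by listing the set and looking for the node's hostname. This is a sketch with assumed names (the same example values used below); the dry run only prints the command to run on a cluster node:

```shell
# Hypothetical post-check: confirm the restored node appears in the disk set.
# Set and node names are example values, not from this procedure.
SETNAME=schost-1
NODE=phys-schost-1
CMD="metaset -s ${SETNAME} | grep ${NODE}"
echo "${CMD}"                 # dry run: run the printed pipeline on the cluster
```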

Example 82  Restoring the ZFS Root (/) File System (Solaris Volume Manager)

The following example shows the root (/) file system restored to the node phys-schost-1. The metaset command is run from another node in the cluster, phys-schost-2, to remove and later add back node phys-schost-1 to the disk set schost-1. All other commands are run from phys-schost-1. A new boot block is created on /dev/rdsk/c0t0d0s0, and three state database replicas are recreated on /dev/rdsk/c0t0d0s4. For more information on restoring data, see Resolving Data Problems in a ZFS Storage Pool in Managing ZFS File Systems in Oracle Solaris 11.4.

Remove the node from the metaset
phys-schost-2# metaset -s schost-1 -f -d -h phys-schost-1

Replace the failed disk and boot the node
Restore the root (/) and /usr file system using procedures in Oracle Solaris documentation

Reboot the node
# reboot

Replace the device ID
# cldevice repair /dev/dsk/c0t0d0

Re-create state database replicas
# metadb -c 3 -af /dev/rdsk/c0t0d0s4

Add the node back to the metaset
phys-schost-2# metaset -s schost-1 -a -h phys-schost-1