Sun Cluster System Administration Guide for Solaris OS

Chapter 12 Backing Up and Restoring a Cluster

This chapter provides the following sections: Backing Up a Cluster and Restoring Cluster Files.

Backing Up a Cluster

Table 12–1 Task Map: Backing Up Cluster Files

Task: Find the names of the file systems you want to back up
Instructions: How to Find File System Names to Back Up

Task: Calculate how many tapes you need to contain a full backup
Instructions: How to Determine the Number of Tapes Needed for a Full Backup

Task: Back up the root file system
Instructions: How to Back Up the Root (/) File System

Task: Perform online backup for mirrored or plexed file systems
Instructions: How to Perform Online Backups for Mirrors (Solaris Volume Manager) and How to Perform Online Backups for Volumes (Veritas Volume Manager)

Task: Back up the cluster configuration
Instructions: How to Back Up the Cluster Configuration

Task: Back up the disk partitioning configuration for the storage disk
Instructions: See the documentation for your storage disk

How to Find File System Names to Back Up

Use this procedure to determine the names of the file systems that you want to back up.

  1. Display the contents of the /etc/vfstab file.

    You do not need to be superuser or assume an equivalent role to run this command.


    # more /etc/vfstab
    
  2. Look in the mount-point column for the name of the file system that you are backing up.

    Use this name when you back up the file system.

Example 12–1 Finding File System Names to Back Up

The following example displays the names of available file systems that are listed in the /etc/vfstab file.


# more /etc/vfstab
#device             device             mount  FS fsck  mount  mount
#to mount           to fsck            point  type     pass   at boot  options
#
#/dev/dsk/c1d0s2    /dev/rdsk/c1d0s2   /usr     ufs     1      yes      -
 f                  -                  /dev/fd  fd      -      no       -
 /proc              -                  /proc    proc    -      no       -
 /dev/dsk/c1t6d0s1  -                  -        swap    -      no       -
 /dev/dsk/c1t6d0s0  /dev/rdsk/c1t6d0s0 /        ufs     1      no       -
 /dev/dsk/c1t6d0s3  /dev/rdsk/c1t6d0s3 /cache   ufs     2      yes      -
 swap               -                  /tmp     tmpfs   -      yes      -

How to Determine the Number of Tapes Needed for a Full Backup

Use this procedure to calculate the number of tapes that you need to back up a file system.

  1. Become superuser or assume an equivalent role on the cluster node that you are backing up.

  2. Estimate the size of the backup in bytes.


    # ufsdump S filesystem 
    
    S

    Displays the estimated number of bytes needed to perform the backup.

    filesystem

    Specifies the name of the file system you want to back up.

  3. Divide the estimated size by the capacity of the tape to see how many tapes you need.


Example 12–2 Determining the Number of Tapes Needed

In the following example, the file system size of 905,881,620 bytes easily fits on a single 4-Gbyte tape (905,881,620 ÷ 4,000,000,000 ≈ 0.23, so one tape is enough).


# ufsdump S /global/phys-schost-1
905881620
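
The division in Step 3 can also be done with a quick shell calculation. The following is a minimal sketch, assuming the /global/phys-schost-1 file system from the example and a tape capacity of 4,000,000,000 bytes; substitute the capacity of your own media.

[Estimate the backup size, then compute the tape count, rounding up:]
# BYTES=`ufsdump S /global/phys-schost-1`
# TAPE=4000000000
# echo $BYTES $TAPE | awk '{ printf "%d tape(s)\n", int(($1 + $2 - 1) / $2) }'
1 tape(s)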

How to Back Up the Root (/) File System

Use this procedure to back up the root (/) file system of a cluster node. Ensure that the cluster is running without errors before performing the backup procedure.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix B, Sun Cluster Object-Oriented Commands.

  1. Become superuser or assume a role that provides solaris.cluster.modify RBAC authorization on the cluster node that you are backing up.

  2. Switch each running data service from the node to be backed up to another node in the cluster.


    # clnode evacuate node
    
    node

    Specifies the node from which you are switching resource groups and device groups.

  3. Shut down the node.


    # shutdown -g0 -y -i0
    
  4. Reboot the node in noncluster mode.

    • On SPARC based systems, run the following command.


      ok boot -xs
      
    • On x86 based systems, run the following commands.


      phys-schost# shutdown -g0 -y -i0
      
      Press any key to continue
    1. In the GRUB menu, use the arrow keys to select the appropriate Solaris entry and type e to edit its commands.

      The GRUB menu appears similar to the following:


      GNU GRUB version 0.95 (631K lower / 2095488K upper memory)
      +-------------------------------------------------------------------------+
      | Solaris 10 /sol_10_x86                                                  |
      | Solaris failsafe                                                        |
      |                                                                         |
      +-------------------------------------------------------------------------+
      Use the ^ and v keys to select which entry is highlighted.
      Press enter to boot the selected OS, 'e' to edit the
      commands before booting, or 'c' for a command-line.

      For more information about GRUB based booting, see Booting an x86 Based System by Using GRUB (Task Map) in System Administration Guide: Basic Administration.

    2. In the boot parameters screen, use the arrow keys to select the kernel entry and type e to edit the entry.

      The GRUB boot parameters screen appears similar to the following:


      GNU GRUB version 0.95 (615K lower / 2095552K upper memory)
      +----------------------------------------------------------------------+
      | root (hd0,0,a)                                                       |
      | kernel /platform/i86pc/multiboot                                     |
      | module /platform/i86pc/boot_archive                                  |
      +----------------------------------------------------------------------+
      Use the ^ and v keys to select which entry is highlighted.
      Press 'b' to boot, 'e' to edit the selected command in the
      boot sequence, 'c' for a command-line, 'o' to open a new line
      after ('O' for before) the selected line, 'd' to remove the
      selected line, or escape to go back to the main menu.
    3. Add -x to the command to specify that the system boot into noncluster mode.


      [ Minimal BASH-like line editing is supported. For the first word, TAB
      lists possible command completions. Anywhere else TAB lists the possible
      completions of a device/filename. ESC at any time exits. ]
      
      grub edit> kernel /platform/i86pc/multiboot -x
    4. Press the Enter key to accept the change and return to the boot parameters screen.

      The screen displays the edited command.


      GNU GRUB version 0.95 (615K lower / 2095552K upper memory)
      +----------------------------------------------------------------------+
      | root (hd0,0,a)                                                       |
      | kernel /platform/i86pc/multiboot -x                                  |
      | module /platform/i86pc/boot_archive                                  |
      +----------------------------------------------------------------------+
      Use the ^ and v keys to select which entry is highlighted.
      Press 'b' to boot, 'e' to edit the selected command in the
      boot sequence, 'c' for a command-line, 'o' to open a new line
      after ('O' for before) the selected line, 'd' to remove the
      selected line, or escape to go back to the main menu.
    5. Type b to boot the node into noncluster mode.


      Note –

      This change to the kernel boot parameter command does not persist over the system boot. The next time you reboot the node, it will boot into cluster mode. To boot into noncluster mode instead, perform these steps again to add the -x option to the kernel boot parameter command.


  5. Back up the root (/) file system by creating a UFS snapshot.

    1. Make sure the file system has enough disk space for the backing-store file.


      # df -k
      
    2. Make sure that a backing-store file of the same name and location does not already exist.


      # ls /backing-store-file
      
    3. Create the UFS snapshot.


      # fssnap -F ufs -o bs=/backing-store-file /file-system
      
    4. Verify that the snapshot has been created.


      # /usr/lib/fs/ufs/fssnap -i /file-system
      
  6. Reboot the node in cluster mode.


    # init 6
    

Example 12–3 Backing Up the Root (/) File System

In the following example, a snapshot of the /usr file system is created. The backing-store file is /scratch/usr.back.file, and the snapshot device is /dev/fssnap/1.


# fssnap -F ufs -o bs=/scratch/usr.back.file /usr
  /dev/fssnap/1
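
After the snapshot exists, you can back it up and then remove it. The following is a sketch that is not part of the original example; it assumes the snapshot device /dev/fssnap/1 that was created above and a tape drive at /dev/rmt/0.

[Back up the snapshot, using its raw device:]
# ufsdump 0ucf /dev/rmt/0 /dev/rfssnap/1
[Delete the snapshot and remove its backing-store file:]
# fssnap -d /usr
# rm /scratch/usr.back.file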

How to Perform Online Backups for Mirrors (Solaris Volume Manager)

A mirrored Solstice DiskSuite metadevice or Solaris Volume Manager volume can be backed up without unmounting it or taking the entire mirror offline. One of the submirrors must be taken offline temporarily, thus losing mirroring, but it can be placed online and resynchronized as soon as the backup is complete, without halting the system or denying user access to the data. Using mirrors to perform online backups creates a backup that is a “snapshot” of an active file system.

A problem might occur if a program writes data onto the volume immediately before the lockfs command is run. To prevent this problem, temporarily stop all the services running on this node. Also, ensure the cluster is running without errors before performing the backup procedure.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix B, Sun Cluster Object-Oriented Commands.

  1. Become superuser or assume an equivalent role on the cluster node that you are backing up.

  2. Use the metaset(1M) command to determine which node has ownership of the volume that you are backing up.


    # metaset -s setname
    
    -s setname

    Specifies the disk set name.

  3. Use the lockfs(1M) command with the -w option to lock the file system from writes.


    # lockfs -w mountpoint 
    

    Note –

    You must lock the file system only if a UFS file system resides on the mirror. For example, if the Solstice DiskSuite metadevice or Solaris Volume Manager volume is set up as a raw device for database management software or some other specific application, you do not need to use the lockfs command. You might, however, run the appropriate vendor-dependent utility to flush any buffers and lock access.


  4. Use the metastat(1M) command to determine the names of the submirrors.


    # metastat -s setname -p
    
    -p

    Displays the status in a format similar to the md.tab file.

  5. Use the metadetach(1M) command to take one submirror offline from the mirror.


    # metadetach -s setname mirror submirror
    

    Note –

    Reads continue to be made from the other submirrors. However, the offline submirror is unsynchronized as soon as the first write is made to the mirror. This inconsistency is corrected when the offline submirror is brought back online. You do not need to run fsck.


  6. Unlock the file systems and allow writes to continue, using the lockfs command with the -u option.


    # lockfs -u mountpoint 
    
  7. Perform a file system check.


    # fsck /dev/md/diskset/rdsk/submirror
    
  8. Back up the offline submirror to tape or another medium.

    Use the ufsdump(1M) command or the backup utility that you usually use.


    # ufsdump 0ucf dump-device submirror
    

    Note –

    Use the raw device (/rdsk) name for the submirror, rather than the block device (/dsk) name.


  9. Use the metattach(1M) command to place the metadevice or volume back online.


    # metattach -s setname mirror submirror
    

    When the metadevice or volume is placed online, it is automatically resynchronized with the mirror.

  10. Use the metastat command to verify that the submirror is resynchronizing.


    # metastat -s setname mirror
    

Example 12–4 Performing Online Backups for Mirrors (Solaris Volume Manager)

In the following example, the cluster node phys-schost-1 is the owner of the metaset schost-1. Therefore, the backup procedure is performed from phys-schost-1. The mirror /dev/md/schost-1/dsk/d0 consists of the submirrors d10, d20, and d30.


[Determine the owner of the metaset:]
# metaset -s schost-1
Set name = schost-1, Set number = 1
Host                Owner
  phys-schost-1     Yes 
...
[Lock the file system from writes:] 
# lockfs -w /global/schost-1
[List the submirrors:]
# metastat -s schost-1 -p
schost-1/d0 -m schost-1/d10 schost-1/d20 schost-1/d30 1
schost-1/d10 1 1 d4s0
schost-1/d20 1 1 d6s0
schost-1/d30 1 1 d8s0
[Take a submirror offline:]
# metadetach -s schost-1 d0 d30
[Unlock the file system:]
# lockfs -u /global/schost-1
[Check the file system:]
# fsck /dev/md/schost-1/rdsk/d30
[Copy the submirror to the backup device:]
# ufsdump 0ucf /dev/rmt/0 /dev/md/schost-1/rdsk/d30
  DUMP: Writing 63 Kilobyte records
  DUMP: Date of this level 0 dump: Tue Apr 25 16:15:51 2000
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/md/schost-1/rdsk/d30 to /dev/rmt/0.
  ...
  DUMP: DUMP IS DONE
[Bring the submirror back online:]
# metattach -s schost-1 d0 d30
schost-1/d0: submirror schost-1/d30 is attached
[Verify that the submirror is resynchronizing:]
# metastat -s schost-1 d0
schost-1/d0: Mirror
    Submirror 0: schost-1/d10
      State: Okay
    Submirror 1: schost-1/d20
      State: Okay
    Submirror 2: schost-1/d30
      State: Resyncing
    Resync in progress: 42% done
    Pass: 1
    Read option: roundrobin (default)
...

How to Perform Online Backups for Volumes (Veritas Volume Manager)

Veritas Volume Manager refers to each mirror copy of a volume as a plex. A plex can be backed up without unmounting it or taking the entire volume offline. This result is accomplished by creating a snapshot copy of the volume and backing up this temporary volume without halting the system or denying user access to the data.

Ensure that the cluster is running without errors before performing the backup procedure.

The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix B, Sun Cluster Object-Oriented Commands.

  1. Log on to any node in the cluster, and become superuser or assume a role that provides solaris.cluster.admin RBAC authorization on the current primary node for the disk group on the cluster.

  2. List the disk group information.


    # vxprint -g diskgroup
    
  3. Determine which node currently has the disk group imported; that node is the primary node for the disk group.


    # cldevicegroup status
    
  4. Create a snapshot of the volume.


    # vxassist -g diskgroup snapstart volume
    

    Note –

    Creating a snapshot can take a long time, depending on the size of your volume.


  5. Verify that the new volume was created.


    # vxprint -g diskgroup
    

    When the snapshot operation is complete, a status of SNAPDONE appears in the STATE field of the new snapshot plex.

  6. Stop any data services that are accessing the file system.


    # clresourcegroup offline resource-group
    

    Note –

    Stop all data services to ensure that the data file system is properly backed up. If no data services are running, you do not need to perform Step 6 and Step 8.


  7. Create a backup volume named bkup-vol and attach the snapshot volume to it.


    # vxassist -g diskgroup snapshot volume bkup-vol
    
  8. Restart any data services that were stopped in Step 6, using the clresourcegroup command.


    # clresourcegroup online -n node[:zone] resource-group
    
    node

    The name of the node.

    zone

    The name of the global-cluster non-voting node (zone) that can master the resource group. Specify zone only if you specified a non-voting node when you created the resource group.

  9. Verify that the snapshot plex is now attached to the new volume, bkup-vol.


    # vxprint -g diskgroup
    
  10. Register the device group configuration change.


    # cldevicegroup sync diskgroup
    
  11. Check the backup volume.


    # fsck -y /dev/vx/rdsk/diskgroup/bkup-vol
    
  12. Perform a backup to copy the volume bkup-vol to tape or another medium.

    Use the ufsdump(1M) command or the backup utility that you normally use.


    # ufsdump 0ucf dump-device /dev/vx/dsk/diskgroup/bkup-vol
    
  13. Remove the temporary volume.


    # vxedit -rf rm bkup-vol
    
  14. Register the disk group configuration changes.


    # cldevicegroup sync diskgroup
    

Example 12–5 Performing Online Backups for Volumes (Veritas Volume Manager)

In the following example, the cluster node phys-schost-2 is the primary owner of the device group schost-1. Therefore, the backup procedure is performed from phys-schost-2. The volume vol01 is copied and then associated with a new volume, bkup-vol.


[Become superuser or assume a role that provides solaris.cluster.admin RBAC authorization on the primary node.]
[Identify the current primary node for the device group:]
# cldevicegroup status
-- Device Group Servers --
                         Device Group     Primary           Secondary
                         ------------     -------           ---------
 Device group servers:   rmt/1            -                 -
 Device group servers:   schost-1         phys-schost-2     phys-schost-1

-- Device Group Status --
                             Device Group        Status              
                             ------------        ------              
 Device group status:        rmt/1               Offline
 Device group status:        schost-1            Online
[List the device group information:]
# vxprint -g schost-1
TY NAME            ASSOC     KSTATE   LENGTH   PLOFFS STATE   TUTIL0  PUTIL0
dg schost-1       schost-1   -        -        -      -        -      -
  
dm schost-101     c1t1d0s2   -        17678493 -      -        -      -
dm schost-102     c1t2d0s2   -        17678493 -      -        -      -
dm schost-103     c2t1d0s2   -        8378640  -      -        -      -
dm schost-104     c2t2d0s2   -        17678493 -      -        -      -
dm schost-105     c1t3d0s2   -        17678493 -      -        -      -
dm schost-106     c2t3d0s2   -        17678493 -      -        -      -
 
v  vol01          gen        ENABLED  204800   -      ACTIVE   -      -
pl vol01-01       vol01      ENABLED  208331   -      ACTIVE   -      -
sd schost-101-01  vol01-01   ENABLED  104139   0      -        -      -
sd schost-102-01  vol01-01   ENABLED  104139   0      -        -      -
pl vol01-02       vol01      ENABLED  208331   -      ACTIVE   -      -
sd schost-103-01  vol01-02   ENABLED  103680   0      -        -      -
sd schost-104-01  vol01-02   ENABLED  104139   0      -        -      -
pl vol01-03       vol01      ENABLED  LOGONLY  -      ACTIVE   -      -
sd schost-103-02  vol01-03   ENABLED  5        LOG    -        -      -
[Start the snapshot operation:]
# vxassist -g schost-1 snapstart vol01
[Verify the new volume was created:]
# vxprint -g schost-1
TY NAME            ASSOC    KSTATE    LENGTH   PLOFFS STATE   TUTIL0  PUTIL0
dg schost-1       schost-1   -        -        -      -        -      -
  
dm schost-101     c1t1d0s2   -        17678493 -      -        -      -
dm schost-102     c1t2d0s2   -        17678493 -      -        -      -
dm schost-103     c2t1d0s2   -        8378640  -      -        -      -
dm schost-104     c2t2d0s2   -        17678493 -      -        -      -
dm schost-105     c1t3d0s2   -        17678493 -      -        -      -
dm schost-106     c2t3d0s2   -        17678493 -      -        -      -
  
v  vol01          gen        ENABLED  204800   -      ACTIVE   -      -
pl vol01-01       vol01      ENABLED  208331   -      ACTIVE   -      -
sd schost-101-01  vol01-01   ENABLED  104139   0      -        -      -
sd schost-102-01  vol01-01   ENABLED  104139   0      -        -      -
pl vol01-02       vol01      ENABLED  208331   -      ACTIVE   -      -
sd schost-103-01  vol01-02   ENABLED  103680   0      -        -      -
sd schost-104-01  vol01-02   ENABLED  104139   0      -        -      -
pl vol01-03       vol01      ENABLED  LOGONLY  -      ACTIVE   -      -
sd schost-103-02  vol01-03   ENABLED  5        LOG    -        -      -
pl vol01-04       vol01      ENABLED  208331   -      SNAPDONE -      -
sd schost-105-01  vol01-04   ENABLED  104139   0      -        -      -
sd schost-106-01  vol01-04   ENABLED  104139   0      -        -      -
[Stop data services, if necessary:]
# clresourcegroup offline nfs-rg
[Create a copy of the volume:]
# vxassist -g schost-1 snapshot vol01 bkup-vol
[Restart data services, if necessary:]
# clresourcegroup online -n phys-schost-1 nfs-rg
[Verify bkup-vol was created:]
# vxprint -g schost-1
TY NAME           ASSOC       KSTATE   LENGTH   PLOFFS STATE   TUTIL0  PUTIL0
dg schost-1       schost-1    -        -        -      -        -      -
 
dm schost-101     c1t1d0s2    -        17678493 -      -        -      -
...
 
v  bkup-vol       gen         ENABLED  204800   -      ACTIVE   -      -
pl bkup-vol-01    bkup-vol    ENABLED  208331   -      ACTIVE   -      -
sd schost-105-01  bkup-vol-01 ENABLED  104139   0      -        -      -
sd schost-106-01  bkup-vol-01 ENABLED  104139   0      -        -      -
 
v  vol01          gen         ENABLED  204800   -      ACTIVE   -      -
pl vol01-01       vol01       ENABLED  208331   -      ACTIVE   -      -
sd schost-101-01  vol01-01    ENABLED  104139   0      -        -      -
sd schost-102-01  vol01-01    ENABLED  104139   0      -        -      -
pl vol01-02       vol01       ENABLED  208331   -      ACTIVE   -      -
sd schost-103-01  vol01-02    ENABLED  103680   0      -        -      -
sd schost-104-01  vol01-02    ENABLED  104139   0      -        -      -
pl vol01-03       vol01       ENABLED  LOGONLY  -      ACTIVE   -      -
sd schost-103-02  vol01-03    ENABLED  5        LOG    -        -      -
[Synchronize the disk group with cluster framework:]
# cldevicegroup sync schost-1
[Check the file systems:]
# fsck -y /dev/vx/rdsk/schost-1/bkup-vol
[Copy bkup-vol to the backup device:]
# ufsdump 0ucf /dev/rmt/0 /dev/vx/rdsk/schost-1/bkup-vol
  DUMP: Writing 63 Kilobyte records
  DUMP: Date of this level 0 dump: Tue Apr 25 16:15:51 2000
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping /dev/vx/dsk/schost-1/bkup-vol to /dev/rmt/0.
  ...
  DUMP: DUMP IS DONE
[Remove the bkup-volume:]
# vxedit -rf rm bkup-vol
[Synchronize the disk group:]
# cldevicegroup sync schost-1

How to Back Up the Cluster Configuration

To ensure that your cluster configuration is archived and to facilitate easy recovery of your cluster configuration, periodically back up your cluster configuration. Sun Cluster 3.2 provides the ability to export your cluster configuration to an eXtensible Markup Language (XML) file.

  1. Log on to any node in the cluster, and become superuser or assume a role that provides solaris.cluster.read RBAC authorization.

  2. Export the cluster configuration information to a file.


    # /usr/cluster/bin/cluster export -o configfile
    
    configfile

    The name of the XML configuration file that the cluster command is exporting the cluster configuration information to. For information about the XML configuration file, see clconfiguration(5CL).

  3. Verify that the cluster configuration information was successfully exported to the XML file.


    # vi configfile
    
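
Because the exported configuration is an ordinary XML file, this backup is easy to automate and to keep under dated names. The following is a minimal sketch; the /var/cluster/backup directory and the file-naming scheme are assumptions, not requirements.

[Create a directory to hold configuration backups:]
# mkdir -p /var/cluster/backup
[Export the configuration to a dated file:]
# /usr/cluster/bin/cluster export -o /var/cluster/backup/config.`date '+%Y%m%d'`.xml
[Confirm that the file was written:]
# ls -l /var/cluster/backup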

Restoring Cluster Files

The ufsrestore(1M) command copies files to disk, relative to the current working directory, from backups created by using the ufsdump(1M) command. You can use ufsrestore to reload an entire file system hierarchy from a level 0 dump and the incremental dumps that follow it, or to restore one or more single files from any dump tape. If you run ufsrestore as superuser or while assuming an equivalent role, files are restored with their original owner, last modification time, and mode (permissions).

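As a concrete illustration of reloading an entire hierarchy, the following sketch restores a level 0 dump and then a level 1 dump on top of it. The slice and tape device names are assumptions; the restoresymtable file, which ufsrestore creates to track incremental restores, is removed only after the last tape.

[Create and mount a new file system:]
# newfs /dev/rdsk/c0t0d0s5
# mount /dev/dsk/c0t0d0s5 /mnt
# cd /mnt
[Load the level 0 dump, then each incremental dump in order:]
# ufsrestore rvf /dev/rmt/0
# ufsrestore rvf /dev/rmt/0
[Remove the file that ufsrestore uses to track incremental restores:]
# rm restoresymtable
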
Before you start to restore files or file systems, you need to know which tapes you need, the raw device name on which you are restoring the file system, the type of tape drive you are using, and the device name (local or remote) for the tape drive.

Table 12–2 Task Map: Restoring Cluster Files

Task: For Solaris Volume Manager, restore files interactively
Instructions: How to Restore Individual Files Interactively (Solaris Volume Manager)

Task: For Solaris Volume Manager, restore the root (/) file system
Instructions: How to Restore the Root (/) File System (Solaris Volume Manager) and How to Restore a Root (/) File System That Was on a Solstice DiskSuite Metadevice or Solaris Volume Manager Volume

Task: For Veritas Volume Manager, restore a nonencapsulated root (/) file system
Instructions: How to Restore a Nonencapsulated Root (/) File System (Veritas Volume Manager)

Task: For Veritas Volume Manager, restore an encapsulated root (/) file system
Instructions: How to Restore an Encapsulated Root (/) File System (Veritas Volume Manager)

How to Restore Individual Files Interactively (Solaris Volume Manager)

Use this procedure to restore one or more individual files. Ensure that the cluster is running without errors before performing the restore procedure.

  1. Become superuser or assume a role that provides solaris.cluster.admin RBAC authorization on the cluster node you are restoring.

  2. Stop all the data services that are using the files to be restored.


    # clresourcegroup offline resource-group
    
  3. Restore the files.


    # ufsrestore
    
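
In interactive mode, you can browse the dump's directory hierarchy and mark individual files for extraction. The following is a sketch of such a session, assuming the tape device /dev/rmt/0 and a hypothetical file ./etc/nsswitch.conf; because ufsrestore writes files relative to the current directory, change to a scratch directory first.

[Restore relative to a scratch directory:]
# cd /var/tmp
# ufsrestore if /dev/rmt/0
ufsrestore > ls
ufsrestore > add ./etc/nsswitch.conf
ufsrestore > extract
ufsrestore > quit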

How to Restore the Root (/) File System (Solaris Volume Manager)

Use this procedure to restore the root (/) file system to a new disk, such as after replacing a bad root disk. The node being restored should not be booted. Ensure that the cluster is running without errors before performing the restore procedure.


Note –

Because you must partition the new disk by using the same format as the failed disk, identify the partitioning scheme before you begin this procedure, and recreate file systems as appropriate.


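One way to capture a disk's partitioning scheme so that it can be replayed on a replacement disk is with the prtvtoc(1M) and fmthard(1M) commands. The following is a sketch, assuming the root disk c0t0d0 and that a copy of the volume table of contents (VTOC) was saved somewhere safe before the disk failed.

[Before the failure, save the volume table of contents:]
# prtvtoc /dev/rdsk/c0t0d0s2 > /etc/c0t0d0.vtoc
[On the replacement disk, replay the saved partition table:]
# fmthard -s /etc/c0t0d0.vtoc /dev/rdsk/c0t0d0s2
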
The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix B, Sun Cluster Object-Oriented Commands.

  1. Become superuser or assume a role that provides solaris.cluster.modify RBAC authorization on a cluster node with access to the disksets to which the node to be restored is also attached.

    Use a node other than the node that you are restoring.

  2. Remove the hostname of the node being restored from all metasets.

    Run this command from a node in the metaset other than the node that you are removing. Because the recovering node is offline, the system will display an RPC: Rpcbind failure - RPC: Timed out error. Ignore this error and continue to the next step.


    # metaset -s setname -f -d -h nodelist
    
    -s setname

    Specifies the disk set name.

    -f

    Deletes the last host from the disk set.

    -d

    Deletes from the disk set.

    -h nodelist

    Specifies the name of the node to delete from the disk set.

  3. Restore the root (/) and /usr file systems.

    To restore the root and /usr file systems, follow the procedure in Chapter 26, Restoring UFS Files and File Systems (Tasks), in System Administration Guide: Devices and File Systems. Omit the step in the Solaris OS procedure that reboots the system.


    Note –

    Ensure that you create the /global/.devices/node@nodeid file system.


  4. Reboot the node in multiuser mode.


    # reboot
    
  5. Replace the device ID.


    # cldevice repair rootdisk
    
  6. Use the metadb(1M) command to recreate the state database replicas.


    # metadb -c copies -af raw-disk-device
    
    -c copies

    Specifies the number of replicas to create.

    -f raw-disk-device

    Raw disk device on which to create replicas.

    -a

    Adds replicas.

  7. From a cluster node other than the restored node, add the restored node to all disksets.


    phys-schost-2# metaset -s setname -a -h nodelist
    
    -a

    Creates and adds the host to the disk set.

    The node is rebooted into cluster mode. The cluster is ready to use.


Example 12–6 Restoring the Root (/) File System (Solaris Volume Manager)

The following example shows the root (/) file system restored to the node phys-schost-1 from the tape device /dev/rmt/0. The metaset command is run from another node in the cluster, phys-schost-2, to remove and later add back node phys-schost-1 to the disk set schost-1. All other commands are run from phys-schost-1. A new boot block is created on /dev/rdsk/c0t0d0s0, and three state database replicas are recreated on /dev/rdsk/c0t0d0s4.


[Become superuser or assume a role that provides solaris.cluster.modify RBAC authorization on a cluster node other than the node to be restored.]
[Remove the node from the metaset:]
phys-schost-2# metaset -s schost-1 -f -d -h phys-schost-1
[Replace the failed disk and boot the node:]
Restore the root (/) and /usr file systems using the procedure in the Solaris system administration documentation.
[Reboot:]
# reboot
[Replace the disk ID:]
# cldevice repair /dev/dsk/c0t0d0
[Re-create state database replicas:]
# metadb -c 3 -af /dev/rdsk/c0t0d0s4
[Add the node back to the metaset:]
phys-schost-2# metaset -s schost-1 -a -h phys-schost-1

How to Restore a Root (/) File System That Was on a Solstice DiskSuite Metadevice or Solaris Volume Manager Volume

Use this procedure to restore a root (/) file system that was on a Solstice DiskSuite metadevice or a Solaris Volume Manager volume when the backups were performed. Perform this procedure under circumstances such as when a root disk is corrupted and replaced with a new disk. The node being restored should not be booted. Ensure that the cluster is running without errors before performing the restore procedure.


Note –

Because you must partition the new disk by using the same format as the failed disk, identify the partitioning scheme before you begin this procedure, and recreate file systems as appropriate.


The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix B, Sun Cluster Object-Oriented Commands.

  1. Become superuser or assume a role that provides solaris.cluster.modify RBAC authorization on a cluster node with access to the disk set, other than the node that you are restoring.

    Use a node other than the node that you are restoring.

  2. Remove the hostname of the node being restored from all disksets.


    # metaset -s setname -f -d -h nodelist
    
    -s setname

    Specifies the metaset name.

    -f

    Deletes the last host from the disk set.

    -d

    Deletes from the metaset.

    -h nodelist

    Specifies the name of the node to delete from the metaset.

    -m mediator_host_list

    Specifies the name of the mediator host to add to or delete from the disk set. This option is used later, in Step 13.

  3. Replace the failed disk on the node on which the root (/) file system will be restored.

    Refer to disk replacement procedures in the documentation that shipped with your server.

  4. Boot the node that you are restoring.

    • If you are using the Solaris OS CD, note the following:

      • SPARC: Type:


        ok boot cdrom -s
        
      • x86: Insert the CD into the system's CD drive and boot the system by shutting it down and then turning it off and on. In the Current Boot Parameters screen, type b or i.


                             <<< Current Boot Parameters >>>
        Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@
        7,1/sd@0,0:a
        Boot args:
        
        Type b [file-name] [boot-flags] <ENTER> to boot with options
        or   i <ENTER>                          to enter boot interpreter
        or   <ENTER>                            to boot with defaults
        
                         <<< timeout in 5 seconds >>>
        Select (b)oot or (i)nterpreter: b -s
        
    • If you are using a Solaris JumpStart™ server, note the following:

      • SPARC: Type:


        ok boot net -s
        
      • x86: Insert the CD into the system's CD drive and boot the system by shutting it down and then turning it off and on. In the Current Boot Parameters screen, type b or i.


                             <<< Current Boot Parameters >>>
        Boot path: /pci@0,0/pci8086,2545@3/pci8086,1460@1d/pci8086,341a@
        7,1/sd@0,0:a
        Boot args:
        
        Type b [file-name] [boot-flags] <ENTER> to boot with options
        or   i <ENTER>                          to enter boot interpreter
        or   <ENTER>                            to boot with defaults
        
                         <<< timeout in 5 seconds >>>
        Select (b)oot or (i)nterpreter: b -s
        
  5. Create all the partitions and swap space on the root disk by using the format command.

    Re-create the original partitioning scheme that was on the failed disk.

  6. Create the root (/) file system and other file systems as appropriate, by using the newfs command.

    Re-create the original file systems that were on the failed disk.


    Note –

    Ensure that you create the /global/.devices/node@nodeid file system.


  7. Mount the root (/) file system on a temporary mount point.


    # mount device temp-mountpoint
    
  8. Use the following commands to restore the root (/) file system.


    # cd temp-mountpoint
    # ufsrestore rvf dump-device
    # rm restoresymtable
    
  9. Install a new boot block on the new disk.


    # /usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk 
    raw-disk-device
    
  10. Remove the lines in the /temp-mountpoint/etc/system file for MDD root information.


    * Begin MDD root info (do not edit)
    forceload: misc/md_trans
    forceload: misc/md_raid
    forceload: misc/md_mirror
    forceload: misc/md_hotspares
    forceload: misc/md_stripe
    forceload: drv/pcipsy
    forceload: drv/glm
    forceload: drv/sd
    rootdev:/pseudo/md@0:0,10,blk
    * End MDD root info (do not edit)
  11. Edit the /temp-mountpoint/etc/vfstab file to change the root entry from a Solstice DiskSuite metadevice or a Solaris Volume Manager volume to a corresponding normal slice for each file system on the root disk that is part of the metadevice or volume.


    Example: 
    Change from—
    /dev/md/dsk/d10   /dev/md/rdsk/d10    /      ufs   1     no       -
    
    Change to—
    /dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0  /      ufs   1     no       -
  12. Unmount the temporary file system, and check the raw disk device.


    # cd /
    # umount temp-mountpoint
    # fsck raw-disk-device
    
  13. Remove the node being restored from the mediator host list of the disk set.


    # metaset -s setname -f -d -m hostname
    
  14. Reboot the node in multiuser mode.


    # reboot
    
  15. Replace the device ID.


    # cldevice repair rootdisk
    
  16. Use the metadb command to recreate the state database replicas.


    # metadb -c copies -af raw-disk-device
    
    -c copies

    Specifies the number of replicas to create.

    -af raw-disk-device

    Creates initial state database replicas on the named raw disk device.

  17. From a cluster node other than the restored node, add the restored node to all disksets.


    phys-schost-2# metaset -s setname -a -h nodelist
    
    -a

    Adds the host to the disk set, creating the disk set if necessary.

    Set up the metadevice or volume/mirror for root (/) according to the Solstice DiskSuite documentation.

    The node is rebooted into cluster mode. The cluster is ready to use.


Example 12–7 Restoring a Root (/) File System That Was on a Solstice DiskSuite Metadevice or Solaris Volume Manager Volume

The following example shows the root (/) file system restored to the node phys-schost-1 from the tape device /dev/rmt/0. The metaset command is run from another node in the cluster, phys-schost-2, to remove and later add back node phys-schost-1 to the metaset schost-1. All other commands are run from phys-schost-1. A new boot block is created on /dev/rdsk/c0t0d0s0, and three state database replicas are recreated on /dev/rdsk/c0t0d0s4.


[Become superuser or assume a role that provides solaris.cluster.modify RBAC authorization on a cluster node with access to the metaset, other than the node to be restored.]
[Remove the node from the metaset:]
phys-schost-2# metaset -s schost-1 -f -d -h phys-schost-1
[Replace the failed disk and boot the node:]

Boot the node from the Solaris OS CD:


[Use format and newfs to recreate partitions and file systems.]
[Mount the root file system on a temporary mount point:]
# mount /dev/dsk/c0t0d0s0 /a
[Restore the root file system:]
# cd /a
# ufsrestore rvf /dev/rmt/0
# rm restoresymtable
[Install a new boot block:]
# /usr/sbin/installboot /usr/platform/`uname \
-i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0

[Remove the lines in the /temp-mountpoint/etc/system file for MDD root information:]
* Begin MDD root info (do not edit)
forceload: misc/md_trans
forceload: misc/md_raid
forceload: misc/md_mirror
forceload: misc/md_hotspares
forceload: misc/md_stripe
forceload: drv/pcipsy
forceload: drv/glm
forceload: drv/sd
rootdev:/pseudo/md@0:0,10,blk
* End MDD root info (do not edit)
[Edit the /temp-mountpoint/etc/vfstab file]
Example: 
Change from—
/dev/md/dsk/d10   /dev/md/rdsk/d10    /      ufs   1     no       -

Change to—
/dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0  /      ufs   1     no       -
[Unmount the temporary file system and check the raw disk device:]
# cd /
# umount /a
# fsck /dev/rdsk/c0t0d0s0
[Reboot:]
# reboot
[Replace the disk ID:]
# cldevice repair /dev/rdsk/c0t0d0
[Re-create state database replicas:]
# metadb -c 3 -af /dev/rdsk/c0t0d0s4
[Add the node back to the metaset:]
phys-schost-2# metaset -s schost-1 -a -h phys-schost-1

How to Restore a Nonencapsulated Root (/) File System (Veritas Volume Manager)

Use this procedure to restore a nonencapsulated root (/) file system to a node. The node being restored should not be booted. Ensure the cluster is running without errors before performing the restore procedure.


Note –

Because you must partition the new disk using the same format as the failed disk, identify the partitioning scheme before you begin this procedure, and recreate file systems as appropriate.


The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix B, Sun Cluster Object-Oriented Commands.

  1. Replace the failed disk on the node where the root file system will be restored.

    Refer to disk replacement procedures in the documentation that shipped with your server.

  2. Boot the node that you are restoring.

    • If you are using the Solaris OS CD, at the OpenBoot PROM ok prompt, type the following command:


      ok boot cdrom -s
      
    • If you are using a Solaris JumpStart server, at the OpenBoot PROM ok prompt, type the following command:


      ok boot net -s
      
  3. Create all the partitions and swap space on the root disk by using the format command.

    Re-create the original partitioning scheme that was on the failed disk.

  4. Create the root (/) file system and other file systems as appropriate, using the newfs command.

    Re-create the original file systems that were on the failed disk.


    Note –

    Ensure that you create the /global/.devices/node@nodeid file system.


  5. Mount the root (/) file system on a temporary mount point.


    # mount device temp-mountpoint
    
  6. Restore the root (/) file system from backup, and unmount and check the file system.


    # cd temp-mountpoint
    # ufsrestore rvf dump-device
    # rm restoresymtable
    # cd /
    # umount temp-mountpoint
    # fsck raw-disk-device
    

    The file system is now restored.

  7. Install a new boot block on the new disk.


    # /usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk raw-disk-device
    
  8. Reboot the node in multiuser mode.


    # reboot
    
  9. Update the device ID.


    # cldevice repair /dev/rdsk/disk-device
    
  10. Press Control-d to resume in multiuser mode.

    The node reboots into cluster mode. The cluster is ready to use.


Example 12–8 Restoring a Nonencapsulated Root (/) File System (Veritas Volume Manager)

The following example shows a nonencapsulated root (/) file system that is restored to the node phys-schost-1 from the tape device /dev/rmt/0.


[Replace the failed disk and boot the node:]

Boot the node from the Solaris OS CD. At the OpenBoot PROM ok prompt, type the following command:


ok boot cdrom -s
...
[Use format and newfs to create partitions and file systems]
[Mount the root file system on a temporary mount point:]
# mount /dev/dsk/c0t0d0s0 /a
[Restore the root file system:]
# cd /a
# ufsrestore rvf /dev/rmt/0
# rm restoresymtable
# cd /
# umount /a
# fsck /dev/rdsk/c0t0d0s0
[Install a new boot block:]
# /usr/sbin/installboot /usr/platform/`uname \
-i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0

[Reboot:]
# reboot
[Update the disk ID:]
# cldevice repair /dev/rdsk/c0t0d0

How to Restore an Encapsulated Root (/) File System (Veritas Volume Manager)

Use this procedure to restore an encapsulated root (/) file system to a node. The node being restored should not be booted. Ensure the cluster is running without errors before performing the restore procedure.


Note –

Because you must partition the new disk using the same format as the failed disk, identify the partitioning scheme before you begin this procedure, and recreate file systems as appropriate.


The phys-schost# prompt reflects a global-cluster prompt. Perform this procedure on a global cluster.

This procedure provides the long forms of the Sun Cluster commands. Most commands also have short forms. Except for the long and short forms of the command names, the commands are identical. For a list of the commands and their short forms, see Appendix B, Sun Cluster Object-Oriented Commands.

  1. Replace the failed disk on the node where the root file system will be restored.

    Refer to disk replacement procedures in the documentation that shipped with your server.

  2. Boot the node that you are restoring.

    • If you are using the Solaris OS CD, at the OpenBoot PROM ok prompt, type the following command:


      ok boot cdrom -s
      
    • If you are using a Solaris JumpStart server, at the OpenBoot PROM ok prompt, type the following command:


      ok boot net -s
      
  3. Create all the partitions and swap space on the root disk by using the format command.

    Re-create the original partitioning scheme that was on the failed disk.

  4. Create the root (/) file system and other file systems as appropriate, by using the newfs command.

    Re-create the original file systems that were on the failed disk.


    Note –

    Ensure that you create the /global/.devices/node@nodeid file system.


  5. Mount the root (/) file system on a temporary mount point.


    # mount device temp-mountpoint
    
  6. Restore the root (/) file system from backup.


    # cd temp-mountpoint
    # ufsrestore rvf dump-device
    # rm restoresymtable
    
  7. Create an empty install-db file.

    This file puts the node in VxVM installation mode at the next reboot.


    # touch \
    /temp-mountpoint/etc/vx/reconfig.d/state.d/install-db
    
  8. Remove the following entries from the /temp-mountpoint/etc/system file.


    * rootdev:/pseudo/vxio@0:0
    * set vxio:vol_rootdev_is_volume=1
  9. Edit the /temp-mountpoint/etc/vfstab file and replace all VxVM mount points with the standard disk devices for the root disk, such as /dev/dsk/c0t0d0s0.


    Example: 
    Change from—
    /dev/vx/dsk/rootdg/rootvol /dev/vx/rdsk/rootdg/rootvol /      ufs   1     no -
    
    Change to—
    /dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0  / ufs   1     no       -
  10. Unmount the temporary file system and check the file system.


    # cd /
    # umount temp-mountpoint
    # fsck raw-disk-device
    
  11. Install the boot block on the new disk.


    # /usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk raw-disk-device
    
  12. Reboot the node in multiuser mode.


    # reboot
    
  13. Update the device ID.


    # cldevice repair /dev/rdsk/c0t0d0
    
  14. Run the clvxvm encapsulate command to encapsulate the disk and reboot.

  15. If a conflict in minor number occurs with any other system, unmount the global devices and re-minor the disk group.

    • Unmount the global devices file system on the cluster node.


      # umount /global/.devices/node@nodeid
      
    • Re-minor the rootdg disk group on the cluster node.


      # vxdg reminor rootdg 100
      
  16. Shut down and reboot the node in cluster mode.


    # shutdown -g0 -i6 -y
    

Example 12–9 Restoring an Encapsulated Root (/) File System (Veritas Volume Manager)

The following example shows an encapsulated root (/) file system restored to the node phys-schost-1 from the tape device /dev/rmt/0.


[Replace the failed disk and boot the node:]

Boot the node from the Solaris OS CD. At the OpenBoot PROM ok prompt, type the following command:


ok boot cdrom -s
...
[Use format and newfs to create partitions and file systems]
[Mount the root file system on a temporary mount point:]
# mount /dev/dsk/c0t0d0s0 /a
[Restore the root file system:]
# cd /a
# ufsrestore rvf /dev/rmt/0
# rm restoresymtable
[Create an empty install-db file:]
# touch /a/etc/vx/reconfig.d/state.d/install-db
[Edit /etc/system on the temporary file system and 
remove or comment out the following entries:]
	# rootdev:/pseudo/vxio@0:0
	# set vxio:vol_rootdev_is_volume=1
[Edit /etc/vfstab on the temporary file system:]
Example: 
Change from—
/dev/vx/dsk/rootdg/rootvol /dev/vx/rdsk/rootdg/rootvol / ufs 1 no -

Change to—
/dev/dsk/c0t0d0s0 /dev/rdsk/c0t0d0s0  / ufs   1     no       -
[Unmount the temporary file system, then check the file system:]
# cd /
# umount /a
# fsck /dev/rdsk/c0t0d0s0
[Install a new boot block:]
# /usr/sbin/installboot /usr/platform/`uname \
-i`/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0
[Reboot:]
# reboot
[Update the disk ID:]
# cldevice repair /dev/rdsk/c0t0d0
[Encapsulate the disk:]
# vxinstall
Choose to encapsulate the root disk.
[If a conflict in minor number occurs, reminor the rootdg disk group:]
# umount /global/.devices/node@nodeid
# vxdg reminor rootdg 100
# shutdown -g0 -i6 -y

See Also

For instructions about how to mirror the encapsulated root disk, see the Sun Cluster Software Installation Guide for Solaris OS.