Oracle Solaris ZFS Administration Guide

Recovering the ZFS Root Pool or Root Pool Snapshots

The following sections describe how to perform these tasks:

  • How to Replace a Disk in the ZFS Root Pool
  • How to Create Root Pool Snapshots
  • How to Recreate a ZFS Root Pool and Restore Root Pool Snapshots
  • How to Roll Back Root Pool Snapshots From a Failsafe Boot

How to Replace a Disk in the ZFS Root Pool

You might need to replace a disk in the root pool for the following reasons:

  • The root pool is too small and you want to replace it with a larger disk.
  • The root pool disk is failing. If the disk is failing so that the system won't boot, you must boot from alternate media, such as a CD or the network, before you replace the root pool disk.

In a mirrored root pool configuration, you can attempt a disk replacement without booting from alternate media. You can replace a failed disk by using the zpool replace command. Or, if you have an additional disk, you can use the zpool attach command. See the procedure in this section for an example of attaching an additional disk and detaching a root pool disk.
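
For example, if the hardware supports hot-swapping the failed disk without an explicit offline and unconfigure step, the replacement might be as simple as the following sketch. The device name c1t0d0s0 is the one used in the example below; substitute the name of the failed disk in your configuration.


# zpool replace rpool c1t0d0s0
# zpool status rpool
<Let the disk resilver before installing the boot blocks>
SPARC# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t0d0s0
x86# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0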

Some hardware requires that you take a disk offline and unconfigure it before attempting the zpool replace operation to replace a failed disk. For example:


# zpool offline rpool c1t0d0s0
# cfgadm -c unconfigure c1::dsk/c1t0d0
<Physically remove failed disk c1t0d0>
<Physically insert replacement disk c1t0d0>
# cfgadm -c configure c1::dsk/c1t0d0
# zpool replace rpool c1t0d0s0
# zpool online rpool c1t0d0s0
# zpool status rpool
<Let disk resilver before installing the boot blocks>
SPARC# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t0d0s0
x86# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0

On some hardware, you do not have to online or reconfigure the replacement disk after it is inserted.

You must identify the boot device path names of the current disk and the new disk so that you can test booting from the replacement disk, and can also manually boot from the existing disk if the replacement disk fails. In the example in the following procedure, the path name for the current root pool disk (c1t10d0s0) is:


/pci@8,700000/pci@3/scsi@5/sd@a,0

The path name for the replacement boot disk (c1t9d0s0) is:


/pci@8,700000/pci@3/scsi@5/sd@9,0
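
One way to determine these physical path names is to list the /dev/dsk entry for a disk, which is a symbolic link whose target contains the /devices path (minus the trailing slice suffix, such as :a). The output below is abbreviated and will differ on your hardware.


# ls -l /dev/dsk/c1t9d0s0
... /dev/dsk/c1t9d0s0 -> ../../devices/pci@8,700000/pci@3/scsi@5/sd@9,0:a
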
  1. Physically connect the replacement (or new) disk.

  2. Confirm that the new disk has an SMI label and a slice 0.

    For information about relabeling a disk that is intended for the root pool, see the following site:

    http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide

  3. Attach the new disk to the root pool.

    For example:


    # zpool attach rpool c1t10d0s0 c1t9d0s0
    
  4. Confirm the root pool status.

    For example:


    # zpool status rpool
      pool: rpool
     state: ONLINE
    status: One or more devices is currently being resilvered.  The pool will
            continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
     scrub: resilver in progress, 25.47% done, 0h4m to go
    config:
    
            NAME           STATE     READ WRITE CKSUM
            rpool          ONLINE       0     0     0
              mirror-0     ONLINE       0     0     0
                c1t10d0s0  ONLINE       0     0     0
                c1t9d0s0   ONLINE       0     0     0
    
    errors: No known data errors
  5. After the resilvering is completed, apply the boot blocks to the new disk.

    Use syntax similar to the following:

    • SPARC:


      # installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t9d0s0
      
    • x86:


      # installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t9d0s0
      
  6. Verify that you can boot from the new disk.

    For example, on a SPARC based system, you would use syntax similar to the following:


    ok boot /pci@8,700000/pci@3/scsi@5/sd@9,0
    
  7. If the system boots from the new disk, detach the old disk.

    For example:


    # zpool detach rpool c1t10d0s0
    
  8. Set up the system to boot automatically from the new disk, either by using the eeprom command or the setenv command from the SPARC boot PROM, or by reconfiguring the PC BIOS.
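
    For example, on a SPARC based system, one way is to set the boot-device parameter from the running OS with the eeprom command (or with setenv at the ok prompt), using the physical path of the new disk. The path shown is the replacement disk path from this example; adjust it to match your configuration.


    # eeprom boot-device=/pci@8,700000/pci@3/scsi@5/sd@9,0
    # eeprom boot-device
    boot-device=/pci@8,700000/pci@3/scsi@5/sd@9,0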

How to Create Root Pool Snapshots

You can create root pool snapshots for recovery purposes. The best way to create root pool snapshots is to perform a recursive snapshot of the root pool.

The following procedure creates a recursive root pool snapshot and stores the snapshot as a file in a pool on a remote system. If a root pool fails, the remote dataset can be mounted by using NFS, and the snapshot file can be received into the recreated pool. Alternatively, you can store the root pool snapshots as actual snapshots in a pool on a remote system. Sending and receiving the snapshots to and from a remote system is a bit more complicated because you must configure ssh or use rsh while the system to be repaired is booted from the Solaris OS miniroot.
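
If you choose the alternative of storing actual snapshots in a remote pool, a minimal sketch might look like the following, assuming that ssh to the remote system is configured with sufficient privileges and that a dataset named rpool/backup (a hypothetical name) exists on the remote system to hold the received snapshots. The recursive snapshot rpool@0804 is the one created in the procedure below.


local# zfs send -Rv rpool@0804 | ssh remote-system zfs receive -Fdu rpool/backup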

For information about remotely storing and recovering root pool snapshots, and for the most up-to-date information about root pool recovery, go to the following site:

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide

Validating remotely stored snapshots as files or snapshots is an important step in root pool recovery. With either method, snapshots should be recreated on a routine basis, such as when the pool configuration changes or when the Solaris OS is upgraded.
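
For example, one simple validation approach, sketched here with the file name used in the following procedure, is to record a checksum when the snapshot stream file is created and to compare it again before you depend on the file for a recovery:


local# digest -a md5 /net/remote-system/rpool/snaps/rpool.0804 > /net/remote-system/rpool/snaps/rpool.0804.md5
<Later, before recovery, recompute the checksum and compare it with the stored value>
local# digest -a md5 /net/remote-system/rpool/snaps/rpool.0804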

In the following procedure, the system is booted from the zfsBE boot environment.
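
If you are not sure which boot environment the system is currently booted from, one quick check is to display the file system that is mounted at the root; with a ZFS root, the Filesystem column shows the dataset name, in this case rpool/ROOT/zfsBE.


local# df -k /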

  1. Create a pool and file system on a remote system to store the snapshots.

    For example:


    remote# zfs create rpool/snaps
    
  2. Share the file system with the local system.

    For example:


    remote# zfs set sharenfs='rw=local-system,root=local-system' rpool/snaps
    # share
    -@rpool/snaps   /rpool/snaps   sec=sys,rw=local-system,root=local-system   "" 
  3. Create a recursive snapshot of the root pool.


    local# zfs snapshot -r rpool@0804
    local# zfs list
    NAME                        USED  AVAIL  REFER  MOUNTPOINT
    rpool                      6.17G  60.8G    98K  /rpool
    rpool@0804                     0      -    98K  -
    rpool/ROOT                 4.67G  60.8G    21K  /rpool/ROOT
    rpool/ROOT@0804                0      -    21K  -
    rpool/ROOT/zfsBE           4.67G  60.8G  4.67G  /
    rpool/ROOT/zfsBE@0804       386K      -  4.67G  -
    rpool/dump                 1.00G  60.8G  1.00G  -
    rpool/dump@0804                0      -  1.00G  -
    rpool/swap                  517M  61.3G    16K  -
    rpool/swap@0804                0      -    16K  -
  4. Send the root pool snapshots to the remote system.

    For example:


    local# zfs send -Rv rpool@0804 > /net/remote-system/rpool/snaps/rpool.0804
    sending from @ to rpool@0804
    sending from @ to rpool/swap@0804
    sending from @ to rpool/ROOT@0804
    sending from @ to rpool/ROOT/zfsBE@0804
    sending from @ to rpool/dump@0804
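
    After the send completes, you might confirm that the stream file was written on the remote system and has a nonzero size. For example:


    local# ls -l /net/remote-system/rpool/snaps/rpool.0804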

How to Recreate a ZFS Root Pool and Restore Root Pool Snapshots

In this procedure, assume the following conditions:

  • The ZFS root pool cannot be recovered.
  • The ZFS root pool snapshots are stored on a remote system and are shared over NFS, as described in the previous procedure.

All the steps are performed on the local system.

  1. Boot from a CD/DVD or the network.

    • SPARC: Select one of the following boot methods:


      ok boot net -s
      ok boot cdrom -s
      

      If you don't use the -s option, you must exit the installation program.

    • x86: Select the option for booting from the DVD or the network. Then, exit the installation program.

  2. Mount the remote snapshot dataset.

    For example:


    # mount -F nfs remote-system:/rpool/snaps /mnt
    

    If your network services are not configured, you might need to specify the remote-system's IP address.

  3. If the root pool disk is replaced and does not contain a disk label that is usable by ZFS, you must relabel the disk.

    For more information about relabeling the disk, go to the following site:

    http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide

  4. Recreate the root pool.

    For example:


    # zpool create -f -o failmode=continue -R /a -m legacy -o cachefile=/etc/zfs/zpool.cache rpool c1t1d0s0
    
  5. Restore the root pool snapshots.

    This step might take some time. For example:


    # cat /mnt/rpool.0804 | zfs receive -Fdu rpool
    

    Using the -u option means that the restored file systems are not mounted when the zfs receive operation completes.

  6. Verify that the root pool datasets are restored.

    For example:


    # zfs list
    NAME                        USED  AVAIL  REFER  MOUNTPOINT
    rpool                      6.17G  60.8G    98K  /a/rpool
    rpool@0804                     0      -    98K  -
    rpool/ROOT                 4.67G  60.8G    21K  /legacy
    rpool/ROOT@0804                0      -    21K  -
    rpool/ROOT/zfsBE           4.67G  60.8G  4.67G  /a
    rpool/ROOT/zfsBE@0804       398K      -  4.67G  -
    rpool/dump                 1.00G  60.8G  1.00G  -
    rpool/dump@0804                0      -  1.00G  -
    rpool/swap                  517M  61.3G    16K  -
    rpool/swap@0804                0      -    16K  -
  7. Set the bootfs property on the root pool BE.

    For example:


    # zpool set bootfs=rpool/ROOT/zfsBE rpool
    
  8. Install the boot blocks on the new disk.

    • SPARC:


    # installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t1d0s0
    

    • x86:


    # installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t1d0s0
    
  9. Reboot the system.


    # init 6
    
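    After the system reboots, you might verify that the recovery looks correct, using checks similar to the following. Confirm that the pool is online with no known data errors, that the bootfs property points to the boot environment you restored, and that the root file system is mounted from that boot environment.


    # zpool status rpool
    # zpool get bootfs rpool
    # zfs list -r rpool
    # df -k /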

How to Roll Back Root Pool Snapshots From a Failsafe Boot

This procedure assumes that existing root pool snapshots are available. In the example, they are available on the local system.


# zfs snapshot -r rpool@0804
# zfs list
NAME                        USED  AVAIL  REFER  MOUNTPOINT
rpool                      6.17G  60.8G    98K  /rpool
rpool@0804                     0      -    98K  -
rpool/ROOT                 4.67G  60.8G    21K  /rpool/ROOT
rpool/ROOT@0804                0      -    21K  -
rpool/ROOT/zfsBE           4.67G  60.8G  4.67G  /
rpool/ROOT/zfsBE@0804       398K      -  4.67G  -
rpool/dump                 1.00G  60.8G  1.00G  -
rpool/dump@0804                0      -  1.00G  -
rpool/swap                  517M  61.3G    16K  -
rpool/swap@0804                0      -    16K  -
  1. Shut down the system and boot in failsafe mode.


    ok boot -F failsafe
    ROOT/zfsBE was found on rpool.
    Do you wish to have it mounted read-write on /a? [y,n,?] y
    mounting rpool on /a
    
    Starting shell.
  2. Roll back each root pool snapshot.


    # zfs rollback rpool@0804
    # zfs rollback rpool/ROOT@0804
    # zfs rollback rpool/ROOT/zfsBE@0804
    
  3. Reboot to multiuser mode.


    # init 6
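
    After the system boots to multiuser mode, you might confirm that the root file system is the rolled-back boot environment and that the root pool snapshots are still present. For example:


    # df -k /
    # zfs list -t snapshot -r rpool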