The following sections describe how to replace a disk in the ZFS root pool, how to create root pool snapshots, and how to recover a root pool or roll back root pool snapshots.
You might need to replace a disk in the root pool for the following reasons:
The root pool is too small and you want to replace it with a larger disk.
The root pool disk is failing. In a non-redundant pool, if the disk is failing such that the system won't boot, you'll need to boot from alternate media, such as a CD or the network, before you replace the root pool disk.
In a mirrored root pool configuration, you might be able to attempt a disk replacement without having to boot from alternate media. You can replace a failed disk by using the zpool replace command or if you have an additional disk, you can use the zpool attach command. See the steps below for an example of attaching an additional disk and detaching a root pool disk.
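For example, if the replacement disk occupies the same physical position as the failed disk, a one-step replacement in a mirrored root pool might look like the following sketch (the device name is illustrative; the pool continues running from the surviving side of the mirror while the new disk resilvers):

# zpool replace rpool c1t0d0s0
# zpool status rpool
<Let the new disk resilver, then install the boot blocks as described below>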
Some hardware requires that you offline and unconfigure a disk before attempting the zpool replace operation to replace a failed disk. For example:
# zpool offline rpool c1t0d0s0
# cfgadm -c unconfigure c1::dsk/c1t0d0
<Physically remove failed disk c1t0d0>
<Physically insert replacement disk c1t0d0>
# cfgadm -c configure c1::dsk/c1t0d0
# zpool replace rpool c1t0d0s0
# zpool online rpool c1t0d0s0
# zpool status rpool
<Let disk resilver before installing the boot blocks>
SPARC# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t0d0s0
x86# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0
On some hardware, you do not have to online or reconfigure the replacement disk after it is inserted.
Identify the boot device pathnames of the current and new disk so that you can test booting from the replacement disk and, if necessary, manually boot from the existing disk if the replacement disk fails to boot. In the example below, the current root pool disk (c1t10d0s0) is:
/pci@8,700000/pci@3/scsi@5/sd@a,0
In the example below, the replacement boot disk (c1t9d0s0) is:
/pci@8,700000/pci@3/scsi@5/sd@9,0
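One way to map a device name to its physical boot path is to list the /dev/dsk entry, which is a symbolic link to the physical device node (shown here for the replacement disk in this example; output abridged):

# ls -l /dev/dsk/c1t9d0s0
lrwxrwxrwx   1 root     root     ... /dev/dsk/c1t9d0s0 ->
../../devices/pci@8,700000/pci@3/scsi@5/sd@9,0:a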
Physically connect the replacement disk.
Confirm that the replacement (new) disk has an SMI label and a slice 0.
For information about relabeling a disk that is intended for the root pool, see the following site:
http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide
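As a quick check, prtvtoc prints the VTOC of a disk that has an SMI label, so you can confirm that slice 0 is defined; a disk that carries an EFI label instead will show a different partition layout:

# prtvtoc /dev/rdsk/c1t9d0s0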
Attach the new disk to the root pool.
For example:
# zpool attach rpool c1t10d0s0 c1t9d0s0
Confirm the root pool status.
For example:
# zpool status rpool
  pool: rpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 25.47% done, 0h4m to go
config:

        NAME           STATE     READ WRITE CKSUM
        rpool          ONLINE       0     0     0
          mirror       ONLINE       0     0     0
            c1t10d0s0  ONLINE       0     0     0
            c1t9d0s0   ONLINE       0     0     0

errors: No known data errors
After the resilvering is complete, apply the boot blocks to the new disk.
For example:
On a SPARC based system:
# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t9d0s0
On an x86 based system:
# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t9d0s0
Verify that you can boot from the new disk.
For example, on a SPARC based system:
ok boot /pci@8,700000/pci@3/scsi@5/sd@9,0
If the system boots from the new disk, detach the old disk.
For example:
# zpool detach rpool c1t10d0s0
Set up the system to boot automatically from the new disk, either by using the eeprom command or the setenv command from the SPARC boot PROM, or by reconfiguring the PC BIOS.
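For example, on a SPARC based system, a minimal sketch of setting the PROM boot-device variable from the running OS, using the replacement disk's device path from this example:

# eeprom boot-device=/pci@8,700000/pci@3/scsi@5/sd@9,0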
Create root pool snapshots for recovery purposes. The best way to create root pool snapshots is to do a recursive snapshot of the root pool.
The procedure below creates a recursive root pool snapshot and stores the snapshot as a file in a pool on a remote system. In the case of a root pool failure, the remote dataset can be mounted by using NFS and the snapshot file received into the recreated pool. You can also store root pool snapshots as the actual snapshots in a pool on a remote system. Sending and receiving the snapshots from a remote system is a bit more complicated because you must configure ssh or use rsh while the system to be repaired is booted from the Solaris OS miniroot.
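As a sketch of that alternative, once ssh access is configured you could send the recursive snapshot directly into a remote pool instead of a file (backuppool is an illustrative pool name on the remote system):

local# zfs send -Rv mpool@0311 | ssh remote-system zfs receive -Fdu backuppool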
For information about remotely storing and recovering root pool snapshots and the most up-to-date information about root pool recovery, go to this site:
http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide
Validating remotely stored snapshots, whether stored as files or as actual snapshots, is an important step in root pool recovery. With either method, snapshots should be recreated on a routine basis, such as when the pool configuration changes or when the Solaris OS is upgraded.
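For example, if the snapshots are stored as actual snapshots in a remote pool (backuppool is illustrative), you can at least confirm that they are present with zfs list; validating a file-based archive generally means performing a periodic test restore:

remote# zfs list -r -t snapshot backuppool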
In the following example, the system is booted from the zfsnv109BE boot environment.
Create space on a remote system to store the snapshots.
For example:
remote# zfs create rpool/snaps
Share the space to the local system.
For example:
remote# zfs set sharenfs='rw=local-system,root=local-system' rpool/snaps
# share
-@rpool/snaps /rpool/snaps sec=sys,rw=local-system,root=local-system ""
Create a recursive snapshot of the root pool.
In this example, the system has two BEs, zfsnv109BE and zfsnv1092BE. The active BE is zfsnv109BE.
local# zpool set listsnapshots=on mpool
local# zfs snapshot -r mpool@0311
local# zfs list
NAME                                USED  AVAIL  REFER  MOUNTPOINT
mpool                              9.98G  41.2G  22.5K  /mpool
mpool@0311                             0      -  22.5K  -
mpool/ROOT                         7.48G  41.2G    19K  /mpool/ROOT
mpool/ROOT@0311                        0      -    19K  -
mpool/ROOT/zfsnv1092BE               85K  41.2G  7.48G  /tmp/.alt.luupdall.2934
mpool/ROOT/zfsnv1092BE@0311            0      -  7.48G  -
mpool/ROOT/zfsnv109BE              7.48G  41.2G  7.45G  /
mpool/ROOT/zfsnv109BE@zfsnv1092BE  28.7M      -  7.48G  -
mpool/ROOT/zfsnv109BE@0311           58K      -  7.45G  -
mpool/dump                         2.00G  41.2G  2.00G  -
mpool/dump@0311                        0      -  2.00G  -
mpool/swap                          517M  41.7G    16K  -
mpool/swap@0311                        0      -    16K  -
Send the root pool snapshots to the remote system.
For example:
local# zfs send -Rv mpool@0311 > /net/remote-system/rpool/snaps/mpool.0311
sending from @ to mpool@0311
sending from @ to mpool/swap@0311
sending from @ to mpool/dump@0311
sending from @ to mpool/ROOT@0311
sending from @ to mpool/ROOT/zfsnv109BE@zfsnv1092BE
sending from @zfsnv1092BE to mpool/ROOT/zfsnv109BE@0311
sending from @ to mpool/ROOT/zfsnv1092BE@0311
In this scenario, assume the following conditions:
ZFS root pool cannot be recovered
ZFS root pool snapshots are stored on a remote system and are shared over NFS
All steps below are performed on the local system.
Boot from CD/DVD or the network.
On a SPARC based system, select one of the following boot methods:
ok boot net -s
ok boot cdrom -s
If you don't use the -s option, you'll need to exit the installation program.
On an x86 based system, select the option for booting from the DVD or the network. Then, exit the installation program.
Mount the remote snapshot dataset.
For example:
# mount -F nfs remote-system:/rpool/snaps /mnt
If your network services are not configured, you might need to specify the remote-system's IP address.
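For example, with a hypothetical address for the remote system:

# mount -F nfs 192.168.1.10:/rpool/snaps /mnt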
If the root pool disk is replaced and does not contain a disk label that is usable by ZFS, you will have to relabel the disk.
For more information about relabeling the disk, go to the following site:
http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide
Recreate the root pool.
For example:
# zpool create -f -o failmode=continue -R /a -m legacy -o cachefile=/etc/zfs/zpool.cache mpool c1t0d0s0
Restore the root pool snapshots.
This step might take some time. For example:
# cat /mnt/mpool.0311 | zfs receive -Fdu mpool
Using the -u option means that the restored archive is not mounted when the zfs receive operation completes.
(Optional) If you want to modify something in the BE, you will need to explicitly mount the BE datasets, as follows:
Mount the BE components. For example:
# zfs mount mpool/ROOT/zfsnv109BE
Mount everything in the pool that is not part of a BE. For example:
# zfs mount -a mpool
Other BEs are not mounted since they have canmount=noauto, which suppresses mounting when the zfs mount -a operation is done.
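You can confirm this property on an inactive BE with zfs get (output illustrative):

# zfs get canmount mpool/ROOT/zfsnv1092BE
NAME                    PROPERTY  VALUE    SOURCE
mpool/ROOT/zfsnv1092BE  canmount  noauto   local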
Verify that the root pool datasets are restored.
For example:
# zfs list
NAME                                USED  AVAIL  REFER  MOUNTPOINT
mpool                              9.98G  41.2G  22.5K  /mpool
mpool@0311                             0      -  22.5K  -
mpool/ROOT                         7.48G  41.2G    19K  /mpool/ROOT
mpool/ROOT@0311                        0      -    19K  -
mpool/ROOT/zfsnv1092BE               85K  41.2G  7.48G  /tmp/.alt.luupdall.2934
mpool/ROOT/zfsnv1092BE@0311            0      -  7.48G  -
mpool/ROOT/zfsnv109BE              7.48G  41.2G  7.45G  /
mpool/ROOT/zfsnv109BE@zfsnv1092BE  28.7M      -  7.48G  -
mpool/ROOT/zfsnv109BE@0311           58K      -  7.45G  -
mpool/dump                         2.00G  41.2G  2.00G  -
mpool/dump@0311                        0      -  2.00G  -
mpool/swap                          517M  41.7G    16K  -
mpool/swap@0311                        0      -    16K  -
Set the bootfs property on the root pool BE.
For example:
# zpool set bootfs=mpool/ROOT/zfsnv109BE mpool
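To verify the setting (output illustrative):

# zpool get bootfs mpool
NAME   PROPERTY  VALUE                  SOURCE
mpool  bootfs    mpool/ROOT/zfsnv109BE  local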
Install the boot blocks on the new disk.
On a SPARC based system:
# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c1t5d0s0
On an x86 based system:
# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t5d0s0
Reboot the system.
# init 6
This procedure assumes that existing root pool snapshots are available. In this example, they are available on the local system:
# zpool set listsnapshots=on rpool
# zfs snapshot -r rpool/ROOT@0311
# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
rpool                        5.67G  1.04G  21.5K  /rpool
rpool/ROOT                   4.66G  1.04G    18K  /rpool/ROOT
rpool/ROOT@0311                  0      -    18K  -
rpool/ROOT/zfsnv109BE        4.66G  1.04G  4.66G  /
rpool/ROOT/zfsnv109BE@0311       0      -  4.66G  -
rpool/dump                    515M  1.04G   515M  -
rpool/swap                    513M  1.54G    16K  -
Shut down the system and boot failsafe mode.
ok boot -F failsafe
Multiple OS instances were found. To check and mount one of them
read-write under /a, select it from the following list. To not mount
any, select 'q'.

  1  /dev/dsk/c1t1d0s0             Solaris Express Community Edition snv_109 SPARC
  2  rpool:7641827061132033134     ROOT/zfsnv1092BE

Please select a device to be mounted (q for none) [?,??,q]: 2
mounting rpool on /a
Roll back the individual root pool snapshots.
# zfs rollback -rf rpool/ROOT@0311
Reboot back to multiuser mode.
# init 6