Solaris Volume Manager Administration Guide

Replacing Disks

This section describes how to replace disks in a Solaris Volume Manager environment.


Caution - If you have soft partitions on a failed disk or on volumes that are built on a failed disk, you must put the new disk in the same physical location. Also, use the same cntndn number as the disk being replaced.


How to Replace a Failed Disk

  1. Identify the failed disk to be replaced by examining the /var/adm/messages file and the metastat command output.
  2. Locate any state database replicas that might have been placed on the failed disk.

    Use the metadb command to find the replicas.

    The metadb command might report errors for the state database replicas that are located on the failed disk. In this example, c0t1d0 is the problem device.

    # metadb
       flags       first blk        block count
      a m     u        16               1034            /dev/dsk/c0t0d0s4
      a       u        1050             1034            /dev/dsk/c0t0d0s4
      a       u        2084             1034            /dev/dsk/c0t0d0s4
      W   pc luo       16               1034            /dev/dsk/c0t1d0s4
      W   pc luo       1050             1034            /dev/dsk/c0t1d0s4
      W   pc luo       2084             1034            /dev/dsk/c0t1d0s4

    The output shows three state database replicas on slice 4 of each local disk, c0t0d0 and c0t1d0. The W in the flags field for the c0t1d0s4 slice indicates that the device has write errors. The three replicas on the c0t0d0s4 slice are still good.
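The reading of the metadb output described in this step can be scripted. The following is a minimal sketch, not part of Solaris Volume Manager. The function name summarize_replicas is hypothetical, and the parsing assumes the column layout shown in the example: flag characters first, then first blk, block count, and the device path as the last field. Only the W flag discussed in this step is inspected.

```shell
#!/bin/sh
# Hypothetical helper: summarize `metadb` output read from stdin.
# Assumes the layout in the example: flag characters, then first blk,
# block count, and the device path as the last field on each line.
summarize_replicas() {
    awk '
        $NF ~ /^\/dev\// {
            count[$NF]++
            flags = ""
            # All fields before the last three are flag characters.
            for (i = 1; i <= NF - 3; i++) flags = flags $i
            if (flags ~ /W/) bad[$NF] = 1
        }
        END {
            for (dev in count)
                printf "%s %d %s\n", dev, count[dev],
                    ((dev in bad) ? "write-errors" : "ok")
        }
    ' | sort
}
```

On a live system the function would be used as metadb | summarize_replicas. Each output line gives a slice, the number of replicas on it, and whether any of those replicas carries the W flag, which is the information that the next step asks you to record.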

  3. Record the slice name where the state database replicas reside and the number of state database replicas. Then, delete the state database replicas.

    To obtain the number of state database replicas on a slice, count the lines on which that slice appears in the metadb command output. In this example, the three state database replicas that exist on c0t1d0s4 are deleted.

    # metadb -d c0t1d0s4

    Caution - If, after deleting the bad state database replicas, you are left with three or fewer, add more state database replicas before continuing. Doing so helps to ensure that configuration information remains intact.
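The check in this caution can be made before the replicas are deleted. The sketch below is an illustration only: the function name replicas_left_after_delete is hypothetical, it assumes the metadb column layout shown earlier, and it assumes that the slice is given in the short form used with metadb -d (for example, c0t1d0s4).

```shell
#!/bin/sh
# Hypothetical helper: given `metadb` output on stdin and the slice whose
# replicas are about to be deleted, print how many replicas would remain.
# Returns status 1 when three or fewer would remain, 0 otherwise.
replicas_left_after_delete() {
    slice=$1
    awk -v slice="$slice" '
        # Count replica lines that are not on the slice being removed.
        $NF ~ /^\/dev\// && $NF != "/dev/dsk/" slice { left++ }
        END {
            printf "%d\n", left
            exit ((left <= 3) ? 1 : 0)
        }
    '
}
```

For example, metadb | replicas_left_after_delete c0t1d0s4 prints the remaining count. With the example output from step 2, three replicas would remain on c0t0d0s4, so the function returns a nonzero status and the caution applies.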


  4. Locate and delete any hot spares on the failed disk.

    Use the metastat command to find hot spares. In this example, hot spare pool hsp000 included c0t1d0s6, which is then deleted from the pool.

    # metahs -d hsp000 c0t1d0s6
    hsp000: Hotspare is deleted
  5. Replace the failed disk.

    This step might entail using the cfgadm command, the luxadm command, or other commands as appropriate for your hardware and environment. When performing this step, make sure to follow your hardware's documented procedures to properly manipulate the Solaris state of this disk.

  6. Repartition the new disk.

    Use the format command or the fmthard command to partition the disk with the same slice information as the failed disk. If you have the prtvtoc output from the failed disk, you can format the replacement disk with the fmthard -s /tmp/failed-disk-prtvtoc-output command, followed by the raw device name of the replacement disk.
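As an illustration of the prtvtoc and fmthard pairing, the sketch below prints the two commands without running them. The function name print_vtoc_commands is hypothetical. The sketch assumes a disk with a VTOC (SMI) label, where slice 2 conventionally represents the whole disk, and it assumes that the prtvtoc output was saved while the original disk was still readable.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the commands that save a disk's
# partition table and later apply it to the replacement disk.
# Assumes a VTOC-labeled disk whose slice 2 covers the whole disk.
# Nothing is executed; the commands are only echoed for review.
print_vtoc_commands() {
    disk=$1        # for example, c0t1d0
    vtoc_file=$2   # for example, /tmp/failed-disk-prtvtoc-output
    # Run while the original disk is still readable:
    echo "prtvtoc /dev/rdsk/${disk}s2 > $vtoc_file"
    # Run after the replacement disk is installed:
    echo "fmthard -s $vtoc_file /dev/rdsk/${disk}s2"
}
```

Calling print_vtoc_commands c0t1d0 /tmp/failed-disk-prtvtoc-output prints both command lines for the example disk so that they can be reviewed before use.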

  7. If you deleted state database replicas, add the same number back to the appropriate slice.

    In this example, /dev/dsk/c0t1d0s4 is used.

    # metadb -a -c 3 c0t1d0s4
  8. If any slices on the disk are components of RAID-5 volumes or are components of RAID-0 volumes that are in turn submirrors of RAID-1 volumes, run the metareplace -e command for each slice.

    In this example, /dev/dsk/c0t1d0s4 and mirror d10 are used.

    # metareplace -e d10 c0t1d0s4
  9. If any soft partitions are built directly on slices on the replaced disk, run the metarecover -m -p command on each slice that contains soft partitions. This command regenerates the extent headers on disk.

    In this example, /dev/dsk/c0t1d0s4 needs to have the soft partition markings on disk regenerated. The slice is scanned and the markings are reapplied, based on the information in the state database replicas.

    # metarecover c0t1d0s4 -m -p
  10. If any soft partitions on the disk are components of RAID-5 volumes or are components of RAID-0 volumes that are submirrors of RAID-1 volumes, run the metareplace -e command for each slice.

    In this example, /dev/dsk/c0t1d0s4 and mirror d10 are used.

    # metareplace -e d10 c0t1d0s4
  11. If any RAID-0 volumes have soft partitions built on them, run the metarecover command for each RAID-0 volume.

    In this example, RAID-0 volume d17 has soft partitions built on it.

    # metarecover d17 -m -p
  12. Replace hot spares that were deleted, and add them to the appropriate hot spare pool or pools.

    In this example, hot spare pool hsp000 included c0t1d0s6. This slice is added back to the hot spare pool.

    # metahs -a hsp000 c0t1d0s6
    hsp000: Hotspare is added
  13. If soft partitions or nonredundant volumes were affected by the failure, restore data from backups. If only redundant volumes were affected, then validate your data.

    Check the user and application data on all volumes. You might have to run an application-level consistency checker, or use some other method to check the data.
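The order of the Solaris Volume Manager commands in this procedure can be summarized as a dry run. The sketch below is not a replacement for the procedure. The function name print_replacement_plan is hypothetical, it covers only the case of one replica slice, one hot spare slice, and one mirror, and it omits the conditional metarecover steps for soft partitions. It prints the commands and runs none of them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print, in order, the commands this procedure
# uses for one replica slice, one hot spare slice, and one mirror.
# Nothing is executed. The disk swap and repartitioning are hardware
# specific and appear only as a comment line in the output.
print_replacement_plan() {
    replica_slice=$1   # for example, c0t1d0s4
    replica_count=$2   # for example, 3
    hsp=$3             # for example, hsp000
    spare_slice=$4     # for example, c0t1d0s6
    mirror=$5          # for example, d10

    echo "metadb -d $replica_slice"
    echo "metahs -d $hsp $spare_slice"
    echo "# replace the failed disk, then repartition the new disk"
    echo "metadb -a -c $replica_count $replica_slice"
    echo "metareplace -e $mirror $replica_slice"
    echo "metahs -a $hsp $spare_slice"
}
```

Calling print_replacement_plan c0t1d0s4 3 hsp000 c0t1d0s6 d10 reproduces the example commands from the steps above in sequence, which can serve as a checklist while working through the procedure.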