Replacing a Failed Physical Disk

A failed physical disk can reduce performance and data redundancy, so you should replace it with a new disk as soon as possible.

To replace a disk when it fails:

  1. Determine which disk failed.
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status=failed DETAIL
    
             name:                   28:5
             deviceId:               21
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_5
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         A01BC2
             physicalSize:           558.9109999993816G
             slotNumber:             5
             status:                 failed
    

    The slot number shows the location of the disk, and the status shows that the disk failed.

  2. Before you remove the disk, ensure that its blue "OK to Remove" LED is lit.
  3. Replace the physical disk in the storage server, and then wait three minutes. The physical disk is hot-pluggable, so you can replace it while the power is on.
  4. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=28:5 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.

  5. Verify that the firmware is correct:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    

    You can also check the ms-odl.trc file to confirm that the firmware was updated and the logical unit number (LUN) was rebuilt.

  6. Re-create the grid disks and cell disks that existed on the previous disk in that slot. See "About Rebalancing the Data".
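
The checks in steps 1 and 4 can also be scripted. The following Python sketch is illustrative only and is not part of the product. It assumes that the cellcli utility is on the PATH of the storage server and accepts a command through its -e option, that DETAIL output has the "attribute: value" layout shown in step 1, and that a healthy disk reports the status as the single word "normal". The helper names cellcli, parse_detail, and wait_for_status are hypothetical.

```python
import subprocess
import time


def cellcli(command):
    """Run one CellCLI command and return its text output.

    Assumption: the cellcli binary is on PATH and supports `-e <command>`.
    """
    result = subprocess.run(
        ["cellcli", "-e", command],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_detail(output):
    """Parse `LIST ... DETAIL` output into a list of dicts, one per disk."""
    disks, current = [], {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("CellCLI>"):
            continue
        # Split on the first colon only, so a name such as 28:5 stays intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip().strip('"')
        # Assumption: every record starts with its "name" attribute.
        if key == "name" and current:
            disks.append(current)
            current = {}
        current[key] = value
    if current:
        disks.append(current)
    return disks


def wait_for_status(run_query, wanted="normal", timeout_s=300,
                    interval_s=15, sleep=time.sleep):
    """Poll run_query() until its output contains the wanted status word.

    Returns True on success, False once timeout_s seconds have been waited.
    """
    waited = 0
    while True:
        words = run_query().strip().lower().split()
        if wanted in words:
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

For example, parse_detail(cellcli("LIST PHYSICALDISK WHERE diskType=HardDisk AND status=failed DETAIL")) would return one dictionary per failed disk, from which the slotNumber value identifies the drive to pull. After the replacement, wait_for_status(lambda: cellcli("LIST PHYSICALDISK WHERE name=28:5 ATTRIBUTES status")) polls until the disk reports a normal status or the timeout expires. The timeout and polling interval are arbitrary defaults, not documented values.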

See Also: