Removing and Replacing a Drive Module Assembly

E5-APP-B cards are designed for high-availability environments, but hardware failures can still occur. The card is built so that a drive module can be replaced easily. Because its two drive modules are configured as a RAID 1 mirror, if one becomes corrupt the other continues to function. No downtime is required: this procedure can be performed while the system is up and running.

Procedure - Remove and Replace a Drive Module Assembly

  1. List the smartd log directory to verify the drive module names.

    # ls /var/TKLC/log/smartd
    lock log.sda log.sdb sda sdb

    In this example, the drive module names are sda and sdb.
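
The same check can be sketched as a small shell function. This is only an illustration: it assumes, based solely on the listing above, that each drive module appears in the smartd log directory as an entry whose name starts with sd, and smartd_drive_names is a hypothetical helper name, not a platform command.

```shell
#!/bin/sh
# Sketch: print one drive module name per line, taken from the smartd log
# directory. ASSUMPTION: drive entries are the ones named sd* (as in the
# example listing); the lock and log.* entries are ignored.
smartd_drive_names() {
    # $1: directory to inspect; defaults to the path used in step 1.
    dir="${1:-/var/TKLC/log/smartd}"
    for entry in "$dir"/sd*; do
        [ -e "$entry" ] || continue   # glob matched nothing
        basename "$entry"
    done
}
```

Called with no argument, the function reads /var/TKLC/log/smartd; a different directory can be passed as the first argument.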

  2. Display the contents of /proc/mdstat to determine whether a drive module is corrupt:

    # cat /proc/mdstat
    • On a healthy system where both drive modules (sda and sdb) are functioning properly, the mdstat output lists both drive modules in each array, and each status field reads [2/2] [UU]:
      # cat /proc/mdstat
      Personalities : [raid1]
      md1 : active raid1 sdb2[1] sda2[0]
            262080 blocks super 1.0 [2/2] [UU]
      
      md2 : active raid1 sda1[0] sdb1[1]
            292631552 blocks super 1.1 [2/2] [UU]
            bitmap: 2/3 pages [8KB], 65536KB chunk
      
      unused devices: <none>
      
    • On a system where one drive module is healthy and the other is corrupt, only the healthy drive module is listed, and each status field shows a missing member ([2/1] [_U]):
      # cat /proc/mdstat
      Personalities : [raid1]
      md1 : active raid1 sdb2[1]
            262080 blocks super 1.0 [2/1] [_U]
      
      md2 : active raid1 sdb1[1]
            292631552 blocks super 1.1 [2/1] [_U]
            bitmap: 2/3 pages [8KB], 65536KB chunk
      
      unused devices: <none>
      

      In this example, the mdstat output shows only sdb, which indicates that sda is corrupt.
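
The inspection in step 2 can also be scripted. The following is a minimal sketch, assuming the /proc/mdstat layout shown in the examples above, where the status field at the end of each "blocks" line contains an underscore for every missing member. degraded_arrays is a hypothetical helper name, not a platform command.

```shell
#!/bin/sh
# Sketch: print the name of every md array that has a missing member,
# i.e. whose status field (for example [_U]) contains "_".
degraded_arrays() {
    # $1: file to read; defaults to the live kernel status file.
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array
        name != "" && $NF ~ /^\[[U_]+\]$/ {    # last field is [UU], [_U], ...
            if ($NF ~ /_/) print name          # "_" marks a missing member
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

With the degraded example output above, degraded_arrays prints md1 and md2; with the healthy example it prints nothing.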

  3. Log in as root and run the failDisk command to mark the corrupt drive module for replacement.

    # /usr/TKLC/plat/sbin/failDisk <disk to be removed>

    For example:

    # /usr/TKLC/plat/sbin/failDisk /dev/sda
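
Before failing a disk, it may be worth confirming that every array still has an active member on the other drive module, so that the last healthy copy is not the one being removed. The following is a sketch under stated assumptions: safe_to_fail is a hypothetical helper, not part of the platform, and it relies on the /proc/mdstat layout shown in step 2, where active members appear as name[index] on each array line.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) only if every md array has at least one
# active member that is NOT on the named disk.
# ASSUMPTION: members of a disk are named <disk><partition>, e.g. sda1, sda2.
safe_to_fail() {
    # $1: disk name without /dev/ (e.g. sda); $2: optional mdstat file.
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            arrays++
            others = 0
            for (i = 3; i <= NF; i++) {
                # Active members look like sdb2[1]; entries flagged (F) or
                # (S) do not match this pattern and are not counted.
                if ($i ~ /^[a-z0-9]+\[[0-9]+\]$/) {
                    member = $i
                    sub(/\[.*/, "", member)              # sdb2[1] -> sdb2
                    if (index(member, disk) != 1) others++
                }
            }
            if (others == 0) unsafe = 1                  # no other active member
        }
        END { if (arrays == 0 || unsafe) exit 1 }
    ' "${2:-/proc/mdstat}"
}
```

A possible use is to chain it with the command from step 3, for example: safe_to_fail sda && /usr/TKLC/plat/sbin/failDisk /dev/sda. With the degraded example output from step 2, safe_to_fail sda succeeds and safe_to_fail sdb fails.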

  4. After failDisk runs successfully, remove the drive module assembly.

  5. Insert the new drive module assembly.