Removing and Replacing a Drive Module Assembly

E5-APP-B cards are designed for high-availability environments, but hardware failures can still occur. The card is designed so that a drive module can be replaced easily. Each E5-APP-B card has two drive modules configured with RAID, so if one drive module becomes corrupt, the other continues to function. No downtime is required to replace a drive module: this procedure can be performed on a system that is up and running.

Oracle now provides 480G drive modules that allow for a larger data capacity. When upgrading from 300G to 480G drive modules, replace both drive modules, one after the other. The 480G drive modules support the existing data capacity, and no downtime is required. To take advantage of the increased storage capacity of the 480G drive modules, EPAP must be re-installed. For information about increasing the existing data capacity after upgrading to 480G drive modules, see Increasing Data Capacity with 480G Drive Modules.

Procedure - Remove and Replace a Drive Module Assembly

Log in as admusr and list the smartd log directory to verify the drive module names.
```
$ ls /var/TKLC/log/smartd
lock log.sda log.sdb sda sdb
```
In this example, the drive module names are sda and sdb.
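
The same check can be scripted. The following is a minimal sketch, assuming the smartd log directory contains one entry per drive module named sdX, as in the listing above. The helper name drive_names is hypothetical and is not a platform command.

```shell
# Sketch only: drive_names is a hypothetical helper, not part of the platform.
# Prints one drive module name per line (for example, sda and sdb).
drive_names() {
    # $1: directory to inspect; defaults to the smartd log directory
    dir=${1:-/var/TKLC/log/smartd}
    for entry in "$dir"/sd[a-z]; do
        # Skip the unexpanded pattern when no entry matches
        if [ -e "$entry" ]; then
            basename "$entry"
        fi
    done
}
```

Calling `drive_names` with no argument inspects /var/TKLC/log/smartd; entries such as lock and log.sda are ignored because they do not match the sdX pattern.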

Display the contents of /proc/mdstat to determine whether a drive module is corrupt:

```
$ sudo cat /proc/mdstat
```

On a healthy system where both drive modules (sda and sdb) are functioning properly, the mdstat output will include both drive modules:

```
$ sudo cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
      262080 blocks super 1.0 [2/2] [UU]

md2 : active raid1 sda1[0] sdb1[1]
      292631552 blocks super 1.1 [2/2] [UU]
      bitmap: 2/3 pages [8KB], 65536KB chunk

unused devices: <none>
```

On a system where one of the drive modules is healthy and one is corrupt, only the healthy drive module is displayed:

```
$ sudo cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1]
      262080 blocks super 1.0 [2/1] [_U]

md2 : active raid1 sdb1[1]
      292631552 blocks super 1.1 [2/1] [_U]
      bitmap: 2/3 pages [8KB], 65536KB chunk

unused devices: <none>
```

In this example, the mdstat output shows only sdb, which indicates that sda is corrupt.
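
To spot a degraded array without reading the output by eye, the status field can be parsed. The following is a minimal sketch, assuming the mdstat layout shown in the examples above, where each array line is followed by a status line ending in a field such as [UU] or [_U]. The helper name degraded_arrays is hypothetical.

```shell
# Sketch only: degraded_arrays is a hypothetical helper, not part of the platform.
# Prints each md array whose status field contains "_" (a missing member).
degraded_arrays() {
    # $1: file in /proc/mdstat format; defaults to /proc/mdstat
    awk '
        /^md[0-9]+ :/ { array = $1 }              # remember the current array name
        /\[[U_]+\]/ && array != "" {
            status = $NF                          # for example [UU] or [_U]
            if (status ~ /_/) print array, status
            array = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

The function prints nothing when every array reports all members present. For the degraded example above, it would list md1 and md2 with the status [_U].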

Run the failDisk command to mark the appropriate drive module to be replaced.
If you are replacing a healthy drive module with a higher capacity drive module, the force option is required. The force option is not required when replacing a corrupt drive module.
- Replacing a corrupt drive module:
```
$ sudo /usr/TKLC/plat/sbin/failDisk <disk to be removed>
```
  For example:
```
$ sudo /usr/TKLC/plat/sbin/failDisk /dev/sda
```
- Replacing a healthy drive module with a higher capacity drive module:
```
$ sudo /usr/TKLC/plat/sbin/failDisk --force <disk to be removed>
```
  For example:
```
$ sudo /usr/TKLC/plat/sbin/failDisk --force /dev/sda
```
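The choice between the two forms can be captured in a small wrapper. The following is a minimal sketch that only prints the command line for review and does not run failDisk. The helper name faildisk_cmd and its "corrupt" and "upgrade" mode arguments are hypothetical, and the sketch assumes drive modules are named /dev/sdX as in the examples.

```shell
# Sketch only: faildisk_cmd and its mode names are hypothetical.
# Prints the failDisk command line for a disk; it does not execute it.
faildisk_cmd() {
    disk=$1
    mode=${2:-corrupt}    # "corrupt" (default) or "upgrade"
    case "$disk" in
        /dev/sd[a-z]) ;;  # accept only whole-disk names such as /dev/sda
        *) echo "unexpected disk name: $disk" >&2; return 1 ;;
    esac
    if [ "$mode" = "upgrade" ]; then
        # Replacing a healthy drive module requires the force option
        echo "sudo /usr/TKLC/plat/sbin/failDisk --force $disk"
    else
        echo "sudo /usr/TKLC/plat/sbin/failDisk $disk"
    fi
}
```

For example, `faildisk_cmd /dev/sda upgrade` prints the forced form, which can then be reviewed and run manually.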
After failDisk runs successfully, remove the drive module assembly.
See Removing a Drive Module Assembly.
Insert the new drive module assembly.
See Replacing a Drive Module Assembly.
If you are replacing a 300G drive module with a 480G drive module, repeat these steps to replace the other 300G drive module with a 480G drive module.