9 Field Replaceable Units

This chapter describes the components of an E5-APP-B card that can be replaced in the field and includes procedures for replacing each type of field replaceable unit (FRU).

Introduction

Oracle Communications EAGLE Application B Cards (E5-APP-B) are complete application server platforms designed for the high-availability environments required by telephony networks. They are installed in an EAGLE shelf.

Even with the advanced reliability of the E5-APP-B design, hardware failures may still occur. The E5-APP-B card is designed for easy maintenance when replacements are needed.

This chapter highlights the E5-APP-B card components that are field replaceable units (FRU) and provides procedures for replacing them.

This chapter explains how to remove a card from the EAGLE. The procedures include the administrative commands required to take a card out of service and place it back into service.

If a numbered event message is encountered, refer to the appropriate procedure in the Unsolicited Alarm and Information Messages Reference.

Additional information about each command can be found in the EAGLE Commands User's Guide.

E5-APP-B Card FRUs and Part Numbers

The following E5-APP-B card components can be replaced in the field:

  • E5-APP-B cards (P/N 870-3096-01 and P/N 870-3096-02)
  • Drive modules (P/N 870-3097-01 and P/N 870-3097-02)

Removing and Replacing E5-APP-B Cards

This section provides procedures for removing and replacing the E5-APP-B card and its drive modules.

Removing an E5-APP-B Card

Procedure - Remove E5-APP-B card

Note:

The shutdown, init 6, or halt commands will not shut down the E5-APP-B card.
  1. On the E5-APP-B card, slide the Ejector switch (4) up to the UNLOCKED position (see Figure 9-1).

    Caution:

    When the Ejector switch goes from locked to unlocked and the E5-APP-B card is in service, the card will halt.

    Figure 9-1 E5-APP-B Card Eject Hardware Switch, UNLOCKED


    img/card-eject-1.jpg
  2. WAIT for the E5-APP-B Eject Status LED to go from blinking red to a steady red.
    When the Eject Status LED is steady red, the E5-APP-B card is in shutdown state.
    If the Ejector switch is put into the LOCKED position now, the E5-APP-B card will reboot.
  3. Grasp the upper and lower card Inject/Eject (I/E) lever release (3) just underneath the I/E lever, and press it to meet the I/E lever. This is the mechanical interlock for the card.
    See Figure 9-2.

    Figure 9-2 E5-APP-B Card UNLOCKED


    img/card-eject-2.jpg
  4. While holding the I/E interlock and lever, pull the levers (2) away from the shelf until they are parallel to the floor.
  5. Remove the E5-APP-B card from the EAGLE shelf.

Replacing an E5-APP-B Card

Procedure - Replace E5-APP-B card

  1. While holding the I/E interlock and lever, pull the levers (2) away from the card until they are parallel to the floor.
    Figure 9-3 illustrates the angle of the interlocks and levers just before inserting the E5-APP-B card into the EAGLE shelf.

    Figure 9-3 E5-APP-B Card UNLOCKED


    img/card-unlocked.jpg
  2. Insert the E5-APP-B card into the EAGLE shelf.

    Carefully align the edges of the card with the top and bottom card guides. Then, push the card along the length of the card guides until the rear connectors on the card engage the mating connectors on the target shelf backplane.

  3. Push in the top and bottom inject/eject clamps (see Figure 9-4).

    Figure 9-4 E5-APP-B Card Inject Levers


    img/card-inject-1.jpg

    This locks the card in place and ensures a strong connection with the pins on the target shelf backplane.

  4. Slide the E5-APP-B Ejector switch (4) down to the LOCKED position (see Figure 9-5).

    Note:

    When the Ejector switch goes from UNLOCKED to LOCKED, the E5-APP-B Eject Status LED blinks red as the E5-APP-B card goes online.

    Figure 9-5 E5-APP-B Card Inject Hardware Switch, LOCKED


    img/card-inject-2.jpg
  5. WAIT for the E5-APP-B Eject Status LED to go from blinking red to off.

Removing and Replacing a Drive Module Assembly

E5-APP-B cards are designed for high-availability environments, but even with the advanced reliability of the E5-APP-B card, hardware failures can occur. The E5-APP-B card is designed for easy maintenance when a drive module must be replaced. Because the two drive modules in an E5-APP-B card are configured as a RAID 1 mirror, if one drive module becomes corrupt the other continues to function. No downtime is required to replace a drive module; this procedure can be performed on a system that is up and running.
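
Before starting the procedure, it can be useful to confirm which md arrays, if any, are degraded. The following is a minimal sketch and not part of the documented procedure; it assumes the standard mdadm utility is available on the card and uses md1 as an example device name taken from the sample output later in this chapter.

    # grep -B 1 '\[U_\]\|\[_U\]' /proc/mdstat        # list any degraded arrays
    # mdadm --detail /dev/md1 | grep -E 'State :|Failed Devices'

A degraded RAID 1 array appears in /proc/mdstat with a [U_] or [_U] marker in place of [UU], as shown in the examples in the next procedure.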

Procedure - Remove and Replace a Drive Module Assembly

  1. List the smartd log directory to verify the drive module names:
    # ls /var/TKLC/log/smartd
    lock log.sda log.sdb sda sdb
    In this example, the drive module names are sda and sdb.
  2. Check /proc/mdstat to determine whether a drive module is corrupt (see also the optional health-check sketch after this procedure):
     # cat /proc/mdstat
    • On a healthy system where both drive modules (sda and sdb) are functioning properly, the mdstat output will include both drive modules:
       # cat /proc/mdstat
      Personalities : [raid1]
      md1 : active raid1 sdb2[1] sda2[0]
            262080 blocks super 1.0 [2/2] [UU]
      
      md2 : active raid1 sda1[0] sdb1[1]
            292631552 blocks super 1.1 [2/2] [UU]
            bitmap: 2/3 pages [8KB], 65536KB chunk
      
      unused devices: <none>
      
    • On a system where one of the drive modules is healthy and one is corrupt, only the healthy drive module is displayed:
       # cat /proc/mdstat
      Personalities : [raid1]
      md1 : active raid1 sdb2[1]
            262080 blocks super 1.0 [2/1] [_U]
      
      md2 : active raid1 sdb1[1]
            292631552 blocks super 1.1 [2/1] [_U]
            bitmap: 2/3 pages [8KB], 65536KB chunk
      
      unused devices: <none>
      

      In this example, the mdstat output shows only sdb, which indicates that sda is corrupt.

  3. Log in as root and run the failDisk command to mark the appropriate drive module to be replaced.
    # /usr/TKLC/plat/sbin/failDisk <disk to be removed>

    For example:

    # /usr/TKLC/plat/sbin/failDisk /dev/sda
  4. After failDisk runs successfully, remove the drive module assembly as described in Removing a Drive Module Assembly.
  5. Insert the new drive module assembly as described in Replacing a Drive Module Assembly.
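
As a supplement to steps 1 through 3, the SMART health of each drive module can be checked before one is marked as failed. This is a minimal sketch and not part of the documented procedure; it assumes the smartctl utility from smartmontools is installed (the smartd logs listed in step 1 suggest it is) and uses /dev/sda and /dev/sdb as example device names.

    # smartctl -H /dev/sda          # overall SMART health self-assessment
    # smartctl -H /dev/sdb
    # lsblk -d -o NAME,SIZE,MODEL   # confirm how device names map to the installed drives

If smartctl reports a failing drive that matches the degraded array shown by /proc/mdstat, that is the drive module to pass to failDisk.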

Removing a Drive Module Assembly

Procedure - Remove Drive Module Assembly

  1. Verify that the drive module is locked in position and in use: the switch lock release (C) is in the LOCKED position and the Status LED on the E5-APP-B card is OFF.

    Move the switch lock release (C) to the RELEASED position by pressing it in the direction indicated. Refer to Figure 9-6.

    Figure 9-6 Drive Module Released


    img/ssd-eject-1.jpg
  2. Move the drive module locking switch (D) from the LOCKED to the UNLOCKED position and wait for the LED (B) to indicate a steady red state. See Figure 9-7 and Figure 9-8, respectively.
    When the drive module locking switch (D) is moved from LOCKED to UNLOCKED, the LED flashes red to indicate that the drive is unlocked and in the process of shutting down.

    Figure 9-7 Drive Module UNLOCKED


    img/ssd-eject-2.jpg

    Caution:

    Removal of the drive prior to the LED indicating steady red could result in drive corruption.

    Figure 9-8 Drive Module Status


    img/ssd-eject-3.jpg
  3. When the LED indicates a steady red, the drive module can be safely removed.
  4. Loosen the drive module screw (E) (see Figure 9-8).
  5. Grasp the screw (E) and pull the drive out slowly until it is free from the card (see Figure 9-9).

    Figure 9-9 Drive Module Removal


    img/ssd-eject-5.jpg

Replacing a Drive Module Assembly

Procedure - Replace Drive Module Assembly

  1. Slide the new drive module into the drive slot on the card (see Figure 9-10).

    Figure 9-10 Drive Module Replacement


    img/ssd-inject-1.jpg
  2. Gently push the drive (A) in until it is properly seated.
  3. Tighten the mounting screw and wait until the Drive Status LED (B) is in a steady red state (see Figure 9-8).
  4. Move the drive module locking switch (D) from the UNLOCKED to the LOCKED position.

    When the drive module locking switch (D) is moved from UNLOCKED to LOCKED, the LED flashes red to indicate that the drive is locked and in the process of coming online (see Figure 9-11).

    Figure 9-11 Drive Module Locked


    img/ssd-inject-3.jpg
  5. When the LED turns off, log in as admusr or root and run the cpDiskCfg command to copy the partition table from the good drive module to the new drive module.
    As admusr:
    $ sudo /usr/TKLC/plat/sbin/cpDiskCfg <source disk> <destination disk>
    As root:
    # /usr/TKLC/plat/sbin/cpDiskCfg <source disk> <destination disk>
    For example:
    $ sudo /usr/TKLC/plat/sbin/cpDiskCfg /dev/sdb /dev/sda
    or, as root:
    # /usr/TKLC/plat/sbin/cpDiskCfg /dev/sdb /dev/sda
  6. After successfully copying the partition table, use the mdRepair command to replicate the data from the good drive module to the new drive module.
    As admusr:
    $ sudo /usr/TKLC/plat/sbin/mdRepair
    As root:
    # /usr/TKLC/plat/sbin/mdRepair
    This step takes 45 to 90 minutes and runs in the background without impacting functionality.

    Sample output of the command:

    [admusr@recife-b ~]$ sudo /usr/TKLC/plat/sbin/mdRepair
    SCSI device 'sdb' is not currently online
    probing for 'sdb' on SCSI 1:0:0:0
    giving SCSI subsystem some time to discover newly-found disks
    Adding device /dev/sdb1 to md group md1...
    md resync in progress, sleeping 30 seconds...
    md1 is 0.0% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    
    bgRe-installing master boot loader(s)
    
    Adding device /dev/sdb2 to md group md3...
    Adding device /dev/sdb9 to md group md5...
    Adding device /dev/sdb7 to md group md4...
    Adding device /dev/sdb6 to md group md7...
    Adding device /dev/sdb8 to md group md6...
    Adding device /dev/sdb3 to md group md2...
    Adding device /dev/sdb5 to md group md8...
    md resync in progress, sleeping 30 seconds...
    md3 is 3.6% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    
    md resync in progress, sleeping 30 seconds...
    md5 is 27.8% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    md resync in progress, sleeping 30 seconds...
    md4 is 8.9% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    md resync in progress, sleeping 30 seconds...
    md4 is 62.5% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    md resync in progress, sleeping 30 seconds...
    md7 is 14.7% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    md resync in progress, sleeping 30 seconds...
    md7 is 68.3% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    md resync in progress, sleeping 30 seconds...
    md8 is 0.3% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    md resync in progress, sleeping 30 seconds...
    md8 is 1.1% percent done...
    
    This script MUST be allowed to run to completion.  Do not exit.
    
    md resync in progress, sleeping 30 seconds...
    md8 is 2.0% percent done...
    
  7. Use the cat /proc/mdstat command to confirm that the RAID repair is successful. (A simple polling sketch follows this procedure.)

    After the RAID is repaired successfully, the output lists both drive modules in every md group:

    Personalities : [raid1]
    md1 : active raid1 sdb2[1] sda2[0]
          262080 blocks super 1.0 [2/2] [UU]
     
    md2 : active raid1 sda1[0] sdb1[1]
          468447232 blocks super 1.1 [2/2] [UU]
          bitmap: 1/4 pages [4KB], 65536KB chunk
     
    unused devices: <none>
    
    The md group names and sizes vary with the drive module partition layout; output such as the following is also possible:

    Personalities : [raid1]
    md2 : active raid1 sda2[0] sdb2[1]
          26198016 blocks super 1.1 [2/2] [UU]
          bitmap: 1/1 pages [4KB], 65536KB chunk
     
    md1 : active raid1 sda3[0] sdb3[1]
          262080 blocks super 1.0 [2/2] [UU]
     
    md3 : active raid1 sdb1[1] sda1[0]
          442224640 blocks super 1.1 [2/2] [UU]
          bitmap: 1/4 pages [4KB], 65536KB chunk
    
    unused devices: <none>
    

    For comparison, output of cat /proc/mdstat prior to re-mirroring:

    [admusr@recife-b ~]$ sudo cat /proc/mdstat
    Personalities : [raid1] 
    md1 : active raid1 sda1[0]
          264960 blocks [2/1] [U_]
          
    md3 : active raid1 sda2[0]
          2048192 blocks [2/1] [U_]
          
    md8 : active raid1 sda5[0]
          270389888 blocks [2/1] [U_]
          
    md7 : active raid1 sda6[0]
          4192832 blocks [2/1] [U_]
          
    md4 : active raid1 sda7[0]
          4192832 blocks [2/1] [U_]
          
    md6 : active raid1 sda8[0]
          1052160 blocks [2/1] [U_]
          
    md5 : active raid1 sda9[0]
          1052160 blocks [2/1] [U_]
          
    md2 : active raid1 sda3[0]
          1052160 blocks [2/1] [U_]
          
    unused devices: <none>
    
    
    Output of cat /proc/mdstat during the re-mirroring process:
    [admusr@recife-b ~]$ sudo cat /proc/mdstat
    Personalities : [raid1] 
    md1 : active raid1 sdb1[1] sda1[0]
          264960 blocks [2/2] [UU]
          
    md3 : active raid1 sdb2[1] sda2[0]
          2048192 blocks [2/2] [UU]
          
    md8 : active raid1 sdb5[2] sda5[0]
          270389888 blocks [2/1] [U_]
          [=====>...............]  recovery = 26.9% (72955264/270389888) finish=43.8min speed=75000K/sec
          
    md7 : active raid1 sdb6[1] sda6[0]
          4192832 blocks [2/2] [UU]
          
    md4 : active raid1 sdb7[1] sda7[0]
          4192832 blocks [2/2] [UU]
          
    md6 : active raid1 sdb8[1] sda8[0]
          1052160 blocks [2/2] [UU]
          
    md5 : active raid1 sdb9[1] sda9[0]
          1052160 blocks [2/2] [UU]
          
    md2 : active raid1 sdb3[2] sda3[0]
          1052160 blocks [2/1] [U_]
          resync=DELAYED
    
    Output of cat /proc/mdstat upon successful completion of re-mirror:
    [admusr@recife-b ~]$ sudo cat /proc/mdstat
    Personalities : [raid1]
    md1 : active raid1 sdb1[1] sda1[0]
          264960 blocks [2/2] [UU]
    
    md3 : active raid1 sdb2[1] sda2[0]
          2048192 blocks [2/2] [UU]
    
    md8 : active raid1 sdb5[1] sda5[0]
          270389888 blocks [2/2] [UU]
    
    md7 : active raid1 sdb6[1] sda6[0]
          4192832 blocks [2/2] [UU]
    
    md4 : active raid1 sdb7[1] sda7[0]
          4192832 blocks [2/2] [UU]
    
    md6 : active raid1 sdb8[1] sda8[0]
          1052160 blocks [2/2] [UU]
    
    md5 : active raid1 sdb9[1] sda9[0]
          1052160 blocks [2/2] [UU]
    
    md2 : active raid1 sdb3[1] sda3[0]
          1052160 blocks [2/2] [UU]
    
    unused devices: <none>
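
If a quick summary is preferred over reading the full listing, /proc/mdstat can be polled until no array is degraded or resyncing. This is a minimal sketch and not part of the documented procedure; it assumes a standard shell on the card and the /proc/mdstat format shown above.

    # Wait until every md array reports [UU] and no recovery or resync is in progress.
    while grep -qE '\[U_\]|\[_U\]|recovery|resync' /proc/mdstat; do
        sleep 60
    done
    echo "All md arrays report [UU]; re-mirror complete."

The loop only reads /proc/mdstat, so it can be run as admusr without sudo and exits on its own once the re-mirror finishes.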