1.9.2.3 Step 3: Enable the New Disk Controller BBU

Similar to "Step 1: Prepare the Disk Controller BBU for Removal", this section has two subsections:

For Systems with Remote Mount BBU

Perform the steps in this section if your system has a remote mount BBU. In this scenario, the system was not shut down at the end of "Step 1: Prepare the Disk Controller BBU for Removal".

If you are running image version 11.2.3.3.0 or later:

  1. Log in as the celladmin or root user.

  2. Re-enable the BBU.

    # cellcli -e alter cell bbu reenable
    HDD disk controller battery has been reenabled
    
  3. Verify the disk controller BBU battery state is operational.

    # cellcli -e list cell attributes bbustatus
    normal
    

    If the bbustatus value is anything other than "normal", investigate and correct the problem before continuing.
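
The check in step 3 can be scripted so that a maintenance run stops when the status is not "normal". The following is a minimal sketch, not part of the product: check_bbustatus is a hypothetical helper name, and it assumes the CellCLI command prints only the status value, as in the example output above.

```shell
#!/bin/bash
# check_bbustatus is a hypothetical helper, not part of CellCLI.
# It takes the text printed by "list cell attributes bbustatus" as its
# first argument and succeeds only when that value is "normal".
check_bbustatus() {
    local status
    # Strip whitespace so padded output still compares cleanly.
    status=$(printf '%s' "$1" | tr -d '[:space:]')
    if [ "$status" = "normal" ]; then
        echo "BBU status is normal; continue."
        return 0
    fi
    echo "BBU status is '${status}'; investigate before continuing." >&2
    return 1
}

# Typical use on a storage cell (requires CellCLI):
#   check_bbustatus "$(cellcli -e list cell attributes bbustatus)" || exit 1
```

The helper only compares the reported value; it does not diagnose why the status differs from "normal".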

If you are running image version 11.2.3.2.x:

  1. Log in as the root user.

  2. Turn off the server's locate LED.

    # ipmitool chassis identify off
    Chassis identify interval: off
    
  3. Wait approximately 5 minutes for the HBA to recognize and start communicating with the new BBU.

  4. Verify the HBA battery status is Operational and charging.

    # /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -a0
    
  5. Set the cache policy of all logical drives to WriteBack mode.

    # /opt/MegaRAID/MegaCli/MegaCli64 -ldsetprop wb -lall -a0
    
  6. Verify the current cache policy for all logical drives is now using WriteBack cache mode.

    # /opt/MegaRAID/MegaCli/MegaCli64 -ldpdinfo -a0 | grep -i bbu
    Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
    Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
    ... <repeated for each logical volume present>
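
Reading the policy lines by eye is error-prone when many logical drives are present. The following sketch automates the check under the assumption that the output lines look like the example above ("Current Cache Policy: WriteBack, ..."). all_writeback is a hypothetical helper name, not a MegaCli feature.

```shell
#!/bin/bash
# all_writeback is a hypothetical helper. It reads MegaCli -ldpdinfo output
# on stdin and succeeds only if at least one "Current Cache Policy" line is
# present and every such line reports WriteBack.
all_writeback() {
    awk -F': *' '
        /^[[:space:]]*Current Cache Policy/ {
            seen++
            # The policy list follows the colon; its first entry is the write mode.
            if ($2 !~ /^WriteBack/) bad++
        }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}

# Typical use on the server (requires MegaCli):
#   /opt/MegaRAID/MegaCli/MegaCli64 -ldpdinfo -a0 | all_writeback \
#       && echo "All logical drives are in WriteBack mode."
```

Only the "Current Cache Policy" lines are tested, because the default policy can read WriteBack while the controller is temporarily running a drive in write-through mode.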
    

For Systems That Do Not Have Remote Mount BBU

At the end of "Step 1: Prepare the Disk Controller BBU for Removal", systems without a remote mount BBU were shut down. You now have to restart the system.

  1. Power on the server by pressing the power button.

  2. After ILOM has booted, press the power button again if the server has not started, and then connect to the server's console.

    To connect to the console from the ILOM web interface (preferred): Open the "Remote Control -> Redirection" tab and click the "Launch Remote Console" button. On ILOM 3.1.x systems, the remote console can be launched from the initial Summary Information screen.

    To connect to the console from the ILOM CLI:

    > start /SP/console
    
  3. From the server's console, monitor the system as it boots. Watch in particular the LSI controller BIOS while it is loading. If it displays a warning about drives with preserved cache, choose "D" to discard the cache and continue. Discarding the cache is not a problem because Oracle ASM resynchronizes the disks after boot. If it displays a warning that drives are in write-through mode due to a low battery, choose to continue.

    The Exadata boot should then continue normally, showing the Exadata boot splash screen followed by the normal OS boot messages. Note that on the ILOM serial console there may be long pauses between screen outputs during the subsequent boot steps, and the Exadata boot splash screen will not display, because the default console is the graphics console.

  4. Once full boot is completed, log in as the root user and verify the new battery is seen and is charging.

    # /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -a0
    
  5. Set the cache policy of all logical drives to WriteBack mode, which relies on the battery.

    # /opt/MegaRAID/MegaCli/MegaCli64 -ldsetprop wb -lall -a0
    
  6. Verify the current cache policy for all logical drives is now using WriteBack cache mode.

    # /opt/MegaRAID/MegaCli/MegaCli64 -ldpdinfo -a0 | grep BBU
    
  7. Return the cell to service.

    1. Activate the grid disks.

      # cellcli
      CellCLI> alter griddisk all active
      GridDisk DATA_CD_00_dmorlx8cel01 successfully altered
      GridDisk DATA_CD_01_dmorlx8cel01 successfully altered
      GridDisk DATA_CD_02_dmorlx8cel01 successfully altered
      GridDisk RECO_CD_00_dmorlx8cel01 successfully altered
      GridDisk RECO_CD_01_dmorlx8cel01 successfully altered
      GridDisk RECO_CD_02_dmorlx8cel01 successfully altered
      ...etc...
      
    2. Verify that all disks are active.

      CellCLI> list griddisk
      DATA_CD_00_dmorlx8cel01         active
      DATA_CD_01_dmorlx8cel01         active
      DATA_CD_02_dmorlx8cel01         active
      RECO_CD_00_dmorlx8cel01         active
      RECO_CD_01_dmorlx8cel01         active
      RECO_CD_02_dmorlx8cel01         active
      ...etc...
      
    3. Verify that all grid disks have been brought online. Wait until the asmmodestatus attribute is ONLINE for every grid disk. The following is an example of the output early in the activation process.

      CellCLI> list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome
      DATA_CD_00_dmorlx8cel01 active ONLINE Yes
      DATA_CD_01_dmorlx8cel01 active ONLINE Yes
      DATA_CD_02_dmorlx8cel01 active ONLINE Yes
      RECO_CD_00_dmorlx8cel01 active SYNCING Yes
      RECO_CD_01_dmorlx8cel01 active ONLINE Yes
      ...etc...
      

      In the example above, 'RECO_CD_00_dmorlx8cel01' is still in the 'SYNCING' state. Oracle ASM synchronization is complete only when ALL grid disks show 'asmmodestatus=ONLINE'. This can take some time, depending on how busy the machine is now and how busy it was while this server was down for repair.
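
Waiting for synchronization can be scripted instead of rerunning the command by hand. The following is a minimal sketch that assumes the four-column output format shown above. pending_griddisks and wait_for_online are hypothetical helper names, and the 60-second polling interval is an arbitrary choice.

```shell
#!/bin/bash
# pending_griddisks is a hypothetical helper. It reads the output of
#   list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome
# on stdin and prints the name of every grid disk whose third column
# (asmmodestatus) is not ONLINE.
pending_griddisks() {
    awk 'NF >= 3 && $3 != "ONLINE" { print $1 }'
}

# wait_for_online is a hypothetical helper that polls CellCLI until no grid
# disk is pending. The default interval of 60 seconds is an assumption.
wait_for_online() {
    local interval=${1:-60} out pending
    while :; do
        out=$(cellcli -e list griddisk attributes name,status,asmmodestatus,asmdeactivationoutcome) \
            || { echo "cellcli failed" >&2; return 1; }
        # Treat empty output as an error rather than as "nothing pending".
        [ -n "$out" ] || { echo "no grid disks listed" >&2; return 1; }
        pending=$(printf '%s\n' "$out" | pending_griddisks)
        [ -z "$pending" ] && break
        echo "Still waiting on:"
        printf '%s\n' "$pending"
        sleep "$interval"
    done
    echo "All grid disks report asmmodestatus ONLINE."
}

# Typical use on the storage cell (requires CellCLI):
#   wait_for_online 120
```

The parsing is kept separate from the polling loop so that it can be checked against saved output without access to a storage cell.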