12 Maintaining the Recovery Appliance Hardware

This chapter describes how to maintain the Recovery Appliance rack components. It contains the following topics:

See Also:

Replacement Units

12.1 Cautions and Warnings

When maintaining the Recovery Appliance hardware, observe the following precautions:

WARNING:

Do not touch the parts of this product that use high-voltage power. Touching them might result in serious injury.

Caution:

  • Do not power off Recovery Appliance unless there is an emergency. In that case, follow "Emergency Power-Off Procedure".

  • Keep the front and rear cabinet doors closed. Failure to do so might cause system failure or result in damage to the hardware components.

  • Keep the top, front, and back of the cabinets clear to allow proper airflow and to prevent the components from overheating.

12.2 Determining the Server Model

Use the following command to determine the model of a compute server or a storage server:

dmidecode -s system-product-name
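
For example, the output might look like the following; the exact model string depends on your server hardware generation, so treat the value shown here as illustrative only:

# dmidecode -s system-product-name
ORACLE SERVER X5-2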

12.3 Powering On and Off a Recovery Appliance Rack

This section includes the following topics:

12.3.1 Emergency Power-Off Procedure

In an emergency, halt power to Recovery Appliance immediately. The following emergencies might require powering off Recovery Appliance:

  • Natural disasters, such as earthquake, flood, hurricane, tornado, or cyclone

  • Abnormal noise, smell, or smoke coming from the system

  • Threat to human safety

12.3.1.1 Powering Off in an Emergency

In an emergency, do one of the following:

  • Turn off power at the circuit breaker.

  • Pull the emergency power-off switch in the computer room.

After the emergency, contact Oracle Support Services about restoring power to the system.

12.3.1.2 About the Emergency Power-Off Switch

You can use the emergency power-off (EPO) switch to remove power from Recovery Appliance.

EPO switches are required when computer equipment contains batteries capable of supplying more than 750 volt-amperes for more than five minutes. Systems that have these batteries include internal EPO hardware for connecting to a site EPO switch or relay.

12.3.2 Shutting Down Recovery Appliance

Under normal, nonemergency conditions, you can power down the software services and hardware gracefully.

Stop all software services before shutting down the rack components.

12.3.2.1 Stopping Recovery Appliance Services

You must stop the Recovery Appliance services, Oracle Database File System, Oracle Database, and the cluster services.

To stop the Recovery Appliance services:

  1. Log in as oracle to either Recovery Appliance compute server.

  2. Open a SQL connection to Oracle Database as the rasys user:

    $ sqlplus rasys
    
  3. Check the status of the services:

    SQL> SELECT state FROM ra_server;

    STATE
    ------------
    ON

  4. Shut down Recovery Appliance services:

    SQL> exec dbms_ra.shutdown;

  5. Disconnect from Oracle Database:

    SQL> exit
    
  6. If Oracle Secure Backup is configured:

    1. Switch to the root user.

    2. Check the current status of Oracle Secure Backup:

      # $GRID_HOME/bin/crsctl status res osbadmin
        NAME=osbadmin
        TYPE=cluster_resource
        TARGET=ONLINE
        STATE=ONLINE on example01adm04
      
    3. If Oracle Secure Backup is online, then stop it:

      # $GRID_HOME/bin/crsctl stop res osbadmin
      
    4. Switch back to the oracle user.

  7. Check the status of the Oracle Database File System (DBFS) mounts:

    $ $GRID_HOME/bin/crsctl status res ob_dbfs rep_dbfs
    NAME=ob_dbfs
    TYPE=local_resource
    TARGET=ONLINE, ONLINE
    STATE=ONLINE on radb07, ONLINE on radb08

    NAME=rep_dbfs
    TYPE=local_resource
    TARGET=ONLINE, ONLINE
    STATE=ONLINE on radb07, ONLINE on radb08

  8. Stop DBFS:

    $ $GRID_HOME/bin/crsctl stop res ob_dbfs rep_dbfs
    CRS-2673: Attempting to stop 'ob_dbfs' on 'radb08'
    CRS-2673: Attempting to stop 'rep_dbfs' on 'radb07'
    CRS-2673: Attempting to stop 'ob_dbfs' on 'radb07'
    CRS-2673: Attempting to stop 'rep_dbfs' on 'radb08'
    CRS-2677: Stop of 'rep_dbfs' on 'radb07' succeeded
    CRS-2677: Stop of 'ob_dbfs' on 'radb07' succeeded
    CRS-2677: Stop of 'rep_dbfs' on 'radb08' succeeded
    CRS-2677: Stop of 'ob_dbfs' on 'radb08' succeeded

  9. Verify that the DBFS mounts are offline:

    $ $GRID_HOME/bin/crsctl status res ob_dbfs rep_dbfs
    NAME=ob_dbfs
    TYPE=local_resource
    TARGET=OFFLINE, OFFLINE
    STATE=OFFLINE, OFFLINE

    NAME=rep_dbfs
    TYPE=local_resource
    TARGET=OFFLINE, OFFLINE
    STATE=OFFLINE, OFFLINE

  10. Check the status of Oracle Database:

    $ srvctl status database -d zdlra5
    Instance zdlra51 is running on node radb07
    Instance zdlra52 is running on node radb08

  11. Stop Oracle Database:

    $ srvctl stop database -d zdlra5
    
  12. Verify that Oracle Database is stopped:

    $ srvctl status database -d zdlra5
    Instance zdlra51 is not running on node radb07
    Instance zdlra52 is not running on node radb08

  13. Switch to the root user.

  14. Stop the Oracle Clusterware stack on all nodes in the cluster:

    # $GRID_HOME/bin/crsctl stop cluster -all
    CRS-2673: Attempting to stop 'ora.crsd' on 'zdlradb07'
    CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on
    'zdlradb07'
    CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'zdlradb07'
    CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'zdlradb07'
    .
    .
    .
    #

    If the command fails, reenter it with the -f option.

  15. On each compute server, run the following command to stop the Oracle Cluster Ready Services (CRS):

    # $GRID_HOME/bin/crsctl stop crs
    CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'radb08'
    CRS-2673: Attempting to stop 'ora.crf' on 'radb08'
    CRS-2673: Attempting to stop 'ora.mdnsd' on 'radb08'
         .
         .
         .
    CRS-2677: Stop of 'ora.crf' on 'radb08' succeeded
    CRS-2677: Stop of 'ora.mdnsd' on 'radb08' succeeded
    CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'radb08' has completed
    CRS-4133: Oracle High Availability Services has been stopped.
    
  16. Shut down or reboot the hardware as required, in the following order:

    1. Compute servers

    2. Storage servers

    3. Rack and switches

12.3.2.2 Powering Down the Servers

Before powering down a server, stop the services running on it, as described in "Shutting Down Recovery Appliance".

To shut down a compute server or a storage server:

  1. Log in to the server as root.
  2. Stop the operating system:
    # shutdown -h -y now
    

    Or restart the operating system:

    # shutdown -r -y now
    

Use the dcli utility to run the shutdown command on multiple servers simultaneously. Do not run dcli from a server that will be powered off by the command.

The following example shuts down a group of storage servers listed in a file named cell_group:

# dcli -l root -g cell_group shutdown -h -y now

Example 12-1 shows the power-off procedure for the rack when using the dcli utility to shut down multiple servers simultaneously. The commands run from a compute server.

Example 12-1 Powering Off Recovery Appliance Using the dcli Utility

  1. Stop Oracle Clusterware on all compute servers:

    # $GRID_HOME/bin/crsctl stop cluster -all

  2. Shut down the other compute server in the rack:

    # dcli -l root -g ra01adm02 shutdown -h -y now

    In the preceding command, ra01adm02 is the name of the second compute server.

  3. Shut down all storage servers:

    # dcli -l root -g cell_group shutdown -h -y now

    In the preceding command, cell_group is a file that lists all storage servers.

  4. Shut down the local compute server:

    # shutdown -h -y now

  5. Power off the rack.

12.3.2.3 Powering Off the Network Switches

The gateway and spine switches do not have power controls. They power off when power is removed, such as when you turn off a PDU or a breaker in the data center.

12.3.3 Starting Up Recovery Appliance

Turn on the rack components first, then start the software services.

12.3.3.1 Starting Up Recovery Appliance Components

To power on the rack components, use one of the following methods:

  • Press the power button on the front of the component.

  • Log in to Oracle ILOM and apply power to the system. See "Powering On Servers Remotely".

12.3.3.2 Startup Sequence

Power on the rack components in this sequence:

  1. Rack and switches

    Allow the switches a few minutes to initialize, before you start the storage servers.

  2. Storage servers

    Allow 5 to 10 minutes for the storage servers to start all services, and ensure that they finish initializing before you start the compute servers (see the check following this list).

  3. Compute servers

    When a compute server is powered on, the operating system and Oracle Clusterware start automatically. Oracle Clusterware then starts all resources that are configured to start automatically.
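
Before starting the compute servers, you can confirm that a storage server has finished starting its services by checking the cell service status. The following is a minimal sketch; it assumes the standard CellCLI cell status attributes, and the output shown is illustrative:

CellCLI> LIST CELL ATTRIBUTES cellsrvStatus, msStatus, rsStatus
         running         running         running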

12.3.3.3 Powering On Servers Remotely

You can use the Oracle ILOM interface to power on the Recovery Appliance servers remotely. To access Oracle ILOM, use the web console, the command-line interface (CLI), intelligent platform management interface (IPMI), or simple network management protocol (SNMP).

For example, to apply power to server ra01cel01 through its Oracle ILOM using IPMI, use a command like the following:

# ipmitool -H ra01cel01-ilom -U root chassis power on

IPMItool must be installed on the server from which you run the command.
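
You can also check the current power state of a server before or after applying power. The following sketch uses the same assumed Oracle ILOM host name as the previous example; adjust the host name and credentials for your environment:

# ipmitool -H ra01cel01-ilom -U root chassis power status
Chassis Power is on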

See Also:

Oracle Integrated Lights Out Manager (ILOM) 3.0 documentation for additional information about using Oracle ILOM to power on the servers:

http://docs.oracle.com/cd/E19860-01/index.html

12.3.3.4 Starting the Recovery Appliance Software

To start the Recovery Appliance software:

  1. Log in as root to a Recovery Appliance compute server.

  2. Confirm that Oracle Cluster Ready Services (CRS) is running:

    # $GRID_HOME/bin/crsctl status server
    NAME=radb07
    STATE=ONLINE

    NAME=radb08
    STATE=ONLINE

  3. If CRS is not running, then start it:

    # $GRID_HOME/bin/crsctl start cluster -all
    CRS-2672: Attempting to start 'ora.evmd' on 'radb07'
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'radb07'
    CRS-2672: Attempting to start 'ora.cssdmonitor' on 'radb08'
         .
         .
         .
    #
    
  4. Switch to the oracle user.

  5. Check the status of Oracle Database:

    $ srvctl status database -d zdlra5
    Instance zdlra51 is not running on node radb07
    Instance zdlra52 is not running on node radb08
    
  6. If Oracle Database is not running:

    1. Start Oracle Database:

      $ srvctl start database -d zdlra5
      
    2. Confirm that Oracle Database is running:

      $ srvctl status database -d zdlra5
      Instance zdlra51 is running on node radb07
      Instance zdlra52 is running on node radb08
      
  7. Check the status of the Oracle Database File System (DBFS) mounts:

    $ $GRID_HOME/bin/crsctl status res ob_dbfs rep_dbfs
        NAME=ob_dbfs
        TYPE=local_resource
        TARGET=OFFLINE, OFFLINE
        STATE=OFFLINE, OFFLINE
     
        NAME=rep_dbfs
        TYPE=local_resource
        TARGET=OFFLINE, OFFLINE
        STATE=OFFLINE, OFFLINE
    
  8. If DBFS is offline, then start it:

    $ $GRID_HOME/bin/crsctl start res ob_dbfs rep_dbfs
    CRS-2672: Attempting to start 'rep_dbfs' on 'zdlradb07'
    CRS-2672: Attempting to start 'ob_dbfs' on 'zdlradb07'
    CRS-2672: Attempting to start 'ob_dbfs' on 'zdlradb08'
         .
         .
         .
    $
    
  9. Connect to Oracle Database as the RASYS user:

    $ sqlplus rasys
    
  10. Check the status of Recovery Appliance services:

    SQL> SELECT state FROM ra_server;
     
    STATE
    ------------
    OFF
    
  11. If the services are off, then start them:

    SQL> exec dbms_ra.startup;
    
  12. Confirm that the services are started:

    SQL> /
     
    STATE
    ------------
    ON
    

12.4 Replacing the Disk Controller Batteries

The disk controllers in storage servers and compute servers have battery-backed write cache to accelerate write performance. If the battery charge capacity degrades, so that the battery cannot protect the cached data for a power loss of 48 hours or more, then the write cache is disabled and the disk controller switches to write-through mode. Write performance is reduced, but no data is lost.

The battery charge capacity degrades over time, and its life expectancy is inversely proportional to the operating temperature. Table 12-1 shows the worst case life expectancy of a battery in Recovery Appliance.

Table 12-1 Battery Life Expectancy

Inlet Ambient Temperature                         Battery Lifetime
< 25 degrees Celsius (77 degrees Fahrenheit)      3 years
< 32 degrees Celsius (89.6 degrees Fahrenheit)    2 years

12.4.1 Monitoring Batteries in the Compute Servers

To monitor the battery charge capacity in the compute servers:

# /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -a0 | grep "Full Charge" -A5 | sort \
| grep Full -A1

The following is an example of the output from the command:

Full Charge Capacity: 1357 mAh
Max Error: 2 %

You should proactively replace batteries that have a capacity less than 800 milliampere hour (mAh) and a maximum error less than 10%. Immediately replace any battery that has less than 674 mAh or a maximum error greater than 10%.
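
The following is a minimal sketch that applies these capacity thresholds to the MegaCli output shown above. It assumes the "Full Charge Capacity" line appears in the format shown earlier, and it checks only the capacity, not the maximum error; it is illustrative, not part of the standard tooling:

CAP=$(/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -a0 | awk '/Full Charge Capacity/ {print $4}')
if [ "$CAP" -lt 674 ]; then
  echo "Battery capacity ${CAP} mAh: replace immediately"
elif [ "$CAP" -lt 800 ]; then
  echo "Battery capacity ${CAP} mAh: plan proactive replacement"
else
  echo "Battery capacity ${CAP} mAh: OK"
fi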

To monitor the battery temperature:

/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -a0 | grep BatteryType; \
/opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -a0 | grep -i temper

The following is an example of the output from the command:

BatteryType: iBBU08
Temperature: 38 C
  Temperature                  : OK
  Over Temperature        : No

If the battery temperature is greater than or equal to 55 degrees Celsius, then determine the cause, and correct the problem.

Note:

Storage servers generate an alert when the battery charge capacity is insufficient, the temperature is high, or the battery should be replaced.

12.4.2 Replacing Batteries in Disk Controllers

Oracle replaces the failed batteries at no extra charge under these conditions:

  • The battery charge capacity in the disk controllers falls below the minimum threshold

  • The system is covered by Oracle Premier Support for Systems, or the failure occurs during the warranty period.

For customers with Premier Support for Systems, Oracle attempts to proactively replace the batteries in Recovery Appliance before the end of the estimated lifetime, on a best effort basis.

12.5 Replacing a Power Distribution Unit

A power distribution unit (PDU) can be replaced while Recovery Appliance is online. The second PDU in the rack maintains the power to all components in the rack. PDU-A is on the left, and PDU-B is on the right, when viewing the rack from the rear.

12.5.1 PDU Replacement Guidelines

Before replacing a PDU, review the following guidelines to ensure that you can perform the procedure safely and without disrupting availability:

  • Unlatching the InfiniBand cables while removing or inserting PDU-A might remove servers from the cluster and thus make the rack unavailable. Be careful when handling the InfiniBand cables, which are normally latched securely. Do not place excessive tension on the InfiniBand cables by pulling them.

  • Unhooking the wrong power feeds shuts down the rack. Trace the power cables that will be replaced from the PDU to the power source, and only unplug those feeds.

  • Allow time to unpack and repack the PDU replacement parts. Notice how the power cords are coiled in the packaging, so you can repack the failed unit the same way.

  • Removing the side panel decreases the time needed to replace the PDU. However, removing the side panel is optional.

  • Using a cordless drill or power screwdriver decreases the time needed to replace the PDU. Allow more time for the replacement if you use a hand wrench. A screwdriver requires Torx T30 and T25 bits.

  • You might need to remove the server cable arms to move the power cables. In that case, twist the plug connection and flex the cable arm connector, to avoid having to unclip the cable arm. If you must unclip the cable arm, then support the cables with one hand, remove the power cord, and then clip the cable arm. Do not leave the cable arm hanging.

  • When you remove the T30 screws from the L-bracket, do not remove the T25 screws or nuts that attach the PDU to the bracket, until the PDU is out of the rack.

12.5.2 Replacing a PDU

To replace a PDU:

  1. Restart the PDU monitor to identify the network settings:

    1. Press and hold the reset button for 20 seconds, until the display starts to count down from 5 to 0. While it is counting down, release the button, and then press it once.

    2. Record the network settings, firmware version, and so on, displayed on the LCD screen as the monitor restarts.

    If the PDU monitor is not working, then retrieve the network settings by connecting to the PDU over the network, or from the network administrator.

  2. Turn off all PDU breakers.

  3. Unplug the PDU power plugs from the AC outlets.

    If the rack is on a raised floor, then move the power cords out through the floor cutout. You might need to maneuver the rack over the cutout first.

    WARNING:

    If the power cords use overhead routing, then put them in a location where they will not fall or hit anyone.

  4. For replacing PDU-B when there is no side panel access, and the rack does not have an InfiniBand cable harness:

    Note:

    Do not unstrap any cables attached to the cable arms.

    1. Unscrew the T25 screws holding the square cable arms to the rack.

    2. Move the InfiniBand cables to the middle, out of the way.

  5. Unplug all power cables that connect the servers and switches to the PDU. Keep the power cables together in group bundles.

  6. Remove the T30 screws from the top and bottom of the L-bracket, and note where the screws are used.

  7. Note where the PDU sits in the rack frame.

    The PDU is typically an inch back from the rack frame, to allow access to the breaker switches.

  8. Angle and maneuver the PDU out of the rack.

  9. Hold the PDU or put it down, if there is enough room, while maneuvering the AC power cords through the rack. You might need to cut the cable ties that hold the AC cord flush with the bottom side of the PDU.

  10. Pull the cords as near to the bottom or top of the rack as possible. There is more room between the servers to guide the outlet plug through the routing hole.

  11. Remove the smaller Torx T25 screws, and loosen the nut on the top and bottom to remove the PDU from the L-bracket. You do not need to remove the nut.

  12. Attach the L-bracket to the new PDU.

  13. Put the new PDU next to the rack.

  14. Route the AC cords through the rack to the outlets.

    Note:

    Do not cable-tie the AC cord to the new PDU.

  15. Place the new PDU in the rack by angling and maneuvering it until the L-brackets rest on the top and bottom rails.

  16. Line up the holes and slots so that the PDU is about an inch back from the rack frame.

  17. Attach the power cords, using the labels on the cords as a guide. For example, G5-0 indicates PDU group 5 outlet 0 on the PDU.

  18. Attach the InfiniBand cable holders, if you removed them in step 4. Oracle recommends that you first screw in the holders by hand to avoid stripping the screws.

  19. Attach the AC power cords to the outlets.

  20. Turn on the breakers.

  21. Cable and program the PDU monitor for the network, as needed.

    See Also:

    Oracle Sun Rack II Power Distribution Units User's Guide for information about programming the PDU monitor at

    http://docs.oracle.com/cd/E19844-01/html/E23956/index.html

12.6 Resetting a Non-Responsive Oracle ILOM

The Oracle Integrated Lights Out Manager (Oracle ILOM) might become unresponsive. If this happens, then you must manually reset the Service Processor (SP) on Oracle ILOM.

The following procedures describe how to reset Oracle ILOM:

See Also:

Oracle Integrated Lights Out Manager (ILOM) 3.0 documentation at

http://docs.oracle.com/cd/E19860-01/E21549/bbgiedhj.html#z4000b491400243

12.6.1 Resetting Oracle ILOM Using SSH

To reset Oracle ILOM using SSH:

  1. Connect to Oracle ILOM using SSH from another system.
  2. Enter the following command at the ILOM prompt:
    reset /SP
    

12.6.2 Resetting Oracle ILOM Using the Remote Console

If you cannot connect to Oracle ILOM using SSH, then log in to the remote console.

To reset Oracle ILOM using the remote console:

  1. Log in to the Oracle ILOM remote console.
  2. Select Reset SP from the Maintenance tab.
  3. Click Reset SP.

12.6.3 Resetting Oracle ILOM Using IPMItool

If you cannot connect to Oracle ILOM using either SSH or the remote console, then use IPMItool.

To reset Oracle ILOM using IPMItool:

  1. Log in to the local host or another host on the management network.
  2. Use the following IPMItool command:
    • On the local host:

      $ ipmitool mc reset cold
      Sent cold reset command to MC
      
    • On another host:

      $ ipmitool -H ILOM_host_name -U ILOM_user mc reset cold
      Sent cold reset command to MC
      

      In the preceding command, ILOM_host_name is the host name being used, and ILOM_user is the user name for Oracle ILOM.

12.6.4 Resetting Oracle ILOM By Removing Power

If you cannot reset Oracle ILOM using the preceding options:

  1. Unplug the server from the power supply.
  2. Plug the server back into the power supply.

This action power cycles the server and Oracle ILOM.

12.7 Maintaining the Compute Servers

You do not need to shut down Recovery Appliance to repair the physical disks in a compute server. No downtime of the rack is required; however, individual servers might require downtime, and you might need to take them out of the cluster temporarily.

An LSI MegaRAID SAS 9261-8i disk controller manages the disk drives in each compute server. The disks have a RAID-5 configuration. Each compute server has four disk drives. One virtual drive comprises the RAID set.

12.7.1 Verifying the RAID Status of a Compute Server

Oracle recommends that you periodically verify the status of the compute server RAID devices. The impact of this check is minimal. In contrast, the impact of corrective action varies depending on the specific issue uncovered, and can range from a simple reconfiguration to an outage.

Log in to each compute server as root and perform the following procedure.

To verify the RAID status:

  1. Check the current disk controller configuration:
    # /opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL | grep "Device Present" -A 8
    
                    Device Present
                    ================
    Virtual Drives    : 1 
      Degraded        : 0 
      Offline         : 0 
    Physical Devices  : 5 
      Disks           : 4 
      Critical Disks  : 0 
      Failed Disks    : 0 
    

    Verify that the output shows one virtual drive, none degraded or offline, five physical devices (one controller + four disks), four disks, and no critical or failed disks.

    If the output is different, then investigate and correct the problem. Degraded virtual drives usually indicate absent or failed physical disks. Replace critical disks and failed disks immediately. Otherwise, you risk data loss if the number of working disks in the server is less than the number required to sustain normal operation.

  2. Check the current virtual drive configuration:
    # /opt/MegaRAID/MegaCli/MegaCli64 CfgDsply -aALL | grep "Virtual Drive:";    \
    /opt/MegaRAID/MegaCli/MegaCli64 CfgDsply -aALL | grep "Number Of Drives";  \
    /opt/MegaRAID/MegaCli/MegaCli64 CfgDsply -aALL | grep "^State" 
    
    Virtual Drive                 : 0 (Target Id: 0)
    Number Of Drives              : 4
    State                         : Optimal
    

    Verify that virtual device 0 has four drives, and the state is Optimal. If the output is different, then investigate and correct the problem.

  3. Check the current physical drive configuration:
    # /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | grep "Firmware state"
    Firmware state: Online, Spun Up
    Firmware state: Online, Spun Up
    Firmware state: Online, Spun Up
    Firmware state: Online, Spun Up
    

    Ensure that all drives are Online, Spun Up. If the output is different, then investigate and correct the problem.

12.8 Reimaging a Compute Server

If a compute server is irretrievably damaged, then you must replace it and reimage the replacement server. During the reimaging procedure, the other compute servers in the cluster are available. When adding the new server to the cluster, you copy the software from a working compute server to the new server.

The following tasks describe how to reimage a compute server:

12.8.1 Contacting Oracle Support Services

Open a support request with Oracle Support Services. The support engineer identifies the failed server and sends you a replacement. The support engineer also asks for the output from the imagehistory command, run from a working compute server. The output provides a link to the computeImageMaker file that was used to image the original compute server, and is used to restore the system.
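
The following is a sketch of what the imagehistory output can look like; the release string, dates, and field layout vary by image, so treat these values as illustrative only:

# imagehistory
Version                              : 12.1.2.1.0.141206.1
Image activation date                : 2015-01-15 10:12:15 -0800
Imaging mode                         : fresh
Imaging status                       : success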

12.8.2 Downloading the Latest Release of the Cluster Verification Utility

The latest release of the cluster verification utility (cluvfy) is available from My Oracle Support Doc ID 316817.1.

12.8.3 Removing the Failed Compute Server from the Cluster

You must remove the failed compute server from Oracle Real Application Clusters (Oracle RAC).

In these steps, working_server is a working compute server in the cluster, failed_server is the compute server being replaced, and replacement_server is the new server.

To remove a failed compute server from the Oracle RAC cluster:

  1. Log in to working_server as the oracle user.

  2. Disable the listener that runs on the failed server:

    $ srvctl disable listener -n failed_server
    $ srvctl stop listener -n failed_server
    
  3. Delete the Oracle home directory from the inventory:

    $ cd $ORACLE_HOME/oui/bin
    $ ./runInstaller -updateNodeList ORACLE_HOME= \
    /u01/app/oracle/product/12.1.0/dbhome_1 "CLUSTER_NODES=list_of_working_servers"
    

    In the preceding command, list_of_working_servers is a list of the compute servers that are still working in the cluster, such as ra01db02, ra01db03, and so on.

  4. Verify that the failed server was deleted—that is, unpinned—from the cluster:

    $ olsnodes -s -t
    
    ra01db01     Inactive        Unpinned
    ra01db02        Active          Unpinned
    
  5. Stop and delete the virtual IP (VIP) resources for the failed compute server:

    # srvctl stop vip -i failed_server-vip
    PRCC-1016 : failed_server-vip.example.com was already stopped
    
    # srvctl remove vip -i failed_server-vip
    Please confirm that you intend to remove the VIPs failed_server-vip (y/[n]) y
    
  6. Delete the compute server from the cluster:

    # crsctl delete node -n failed_server
    CRS-4661: Node failed_server successfully deleted.
    

    If you receive an error message similar to the following, then relocate the voting disks.

    CRS-4662: Error while trying to delete node ra01db01.
    CRS-4000: Command Delete failed, or completed with errors.
    

    To relocate the voting disks:

    1. Determine the current location of the voting disks. The sample output shows that the current location is DBFS_DG.

      # crsctl query css votedisk
      
      ##  STATE    File Universal Id          File Name                Disk group
      --  -----    -----------------          ---------                ----------
      1. ONLINE   123456789abab (o/192.168.73.102/DATA_CD_00_ra01cel07) [DBFS_DG]
      2. ONLINE   123456789cdcd (o/192.168.73.103/DATA_CD_00_ra01cel08) [DBFS_DG]
      3. ONLINE   123456789efef (o/192.168.73.100/DATA_CD_00_ra01cel05) [DBFS_DG]
      Located 3 voting disk(s).
      
    2. Move the voting disks to another disk group:

      # ./crsctl replace votedisk +DATA
      
      Successful addition of voting disk 2345667aabbdd.
      ...
      CRS-4266: Voting file(s) successfully replaced
      
    3. Return the voting disks to the original location. This example returns them to DBFS_DG:

      # ./crsctl replace votedisk +DBFS_DG
      
    4. Repeat the crsctl command to delete the server from the cluster.

  7. Update the Oracle inventory:

    $ cd $ORACLE_HOME/oui/bin
    $ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/12.1.0/grid \
      "CLUSTER_NODES=list_of_working_servers" CRS=TRUE
    
  8. Verify that the server was deleted successfully:

    $ cluvfy stage -post nodedel -n failed_server -verbose
    
    Performing post-checks for node removal
    Checking CRS integrity...
    The Oracle clusterware is healthy on node "ra01db02"
    CRS integrity check passed
    Result:
    Node removal check passed
    Post-check for node removal was successful.
    

See Also:

Oracle Real Application Clusters Administration and Deployment Guide for information about deleting a compute server from a cluster

12.8.4 Preparing the USB Flash Drive for Imaging

Use a USB flash drive to copy the image to the new compute server.

To prepare the USB flash drive for use:

  1. Insert a blank USB flash drive into a working compute server in the cluster.
  2. Log in as the root user.
  3. Unzip the computeImage file:
    # unzip computeImageMaker_release_LINUX.X64_release_date.platform.tar.zip
    
    # tar -xvf computeImageMaker_release_LINUX.X64_release_date.platform.tar
    
  4. Load the image onto the USB flash drive:
    # cd dl360
    # ./makeImageMedia.sh -dualboot no
    

    The makeImageMedia.sh script prompts for information.

  5. Remove the USB flash drive from the compute server.
  6. Remove the unzipped dl360 directory and the computeImageMaker file from the working compute server. The directory and file require about 2 GB of disk space.

12.8.5 Copying the Image to the New Compute Server

Before you perform the following procedure, replace the failed compute server with the new server. See Expanding a Recovery Appliance Rack with Additional Storage Servers.

To load the image onto the replacement server:

  1. Insert the USB flash drive into the USB port on the replacement server.

  2. Log in to the console through the service processor to monitor progress.

  3. Power on the compute server either by physically pressing the power button or by using Oracle ILOM.

  4. If you replaced the motherboard:

    1. Press F2 during the BIOS boot sequence.

    2. Select BIOS Setup.

    3. Set the USB flash drive as the first boot device, followed by the RAID controller.

    Otherwise, press F8 during the BIOS boot sequence to open the one-time boot selection menu, and choose the USB flash drive.

  5. Allow the system to start.

    As the system starts, it detects the CELLUSBINSTALL media. The imaging process has two phases. Let both phases complete before proceeding to the next step.

    The first phase of the imaging process identifies any BIOS or firmware that is out of date, and upgrades the components to the expected level for the image. If any components are upgraded or downgraded, then the system automatically restarts.

    The second phase of the imaging process installs the factory image on the replacement compute server.

  6. Remove the USB flash drive when the system prompts you.

  7. Press Enter to power off the server.

12.8.6 Configuring the Replacement Compute Server

The replacement compute server does not have host names, IP addresses, DNS, or NTP settings. This task describes how to configure the replacement compute server.

The information must be the same on all compute servers in Recovery Appliance. You can obtain the IP addresses from the DNS. You should also have a copy of the Installation Template from the initial installation.

To configure the replacement compute server:

  1. Assemble the following information:
    • Name servers

    • Time zone, such as America/Chicago

    • NTP servers

    • IP address information for the management network

    • IP address information for the client access network

    • IP address information for the InfiniBand network

    • Canonical host name

    • Default gateway

  2. Power on the replacement compute server. When the system starts, it automatically runs the configuration script and prompts for information.
  3. Enter the information when prompted, and confirm the settings. The startup process then continues.

Note:

  • If the compute server does not use all network interfaces, then the configuration process stops with a warning that some network interfaces are disconnected. It prompts whether to retry the discovery process. Respond with yes or no, as appropriate for the environment.

  • If bonding is used for the ingest network, then it is now set in the default active-passive mode.

12.8.7 Preparing the Replacement Compute Server for the Cluster

The initial installation of Recovery Appliance modified various files.

To modify the files on the replacement compute server:

  1. Replicate the contents of the following files from a working compute server in the cluster:

    1. Copy the /etc/security/limits.conf file.

    2. Merge the contents of the /etc/hosts files.

    3. Copy the /etc/oracle/cell/network-config/cellinit.ora file.

    4. Update the IP address in cellinit.ora to the IP address of the BONDIB0 interface on the replacement compute server.

    5. Copy the /etc/oracle/cell/network-config/cellip.ora file.

    6. Configure additional network requirements, such as 10 GbE.

    7. Copy the /etc/modprobe.conf file.

    8. Copy the /etc/sysctl.conf file.

    9. Restart the compute server, so the network changes take effect.

  2. Set up the Oracle software owner on the replacement compute server by adding the user name to one or more groups. The owner is usually the oracle user.

    1. Obtain the current group information from a working compute server:

      # id oracle
      uid=1000(oracle) gid=1001(oinstall) groups=1001(oinstall),1002(dba),1003(oper),1004(asmdba)
      
    2. Use the groupadd command to add the group information to the replacement compute server. This example adds the groups identified in the previous step:

      # groupadd -g 1001 oinstall
      # groupadd -g 1002 dba
      # groupadd -g 1003 oper
      # groupadd -g 1004 asmdba
      
    3. Obtain the current user information from a working compute server:

      # id oracle
      uid=1000(oracle) gid=1001(oinstall) groups=1001(oinstall),1002(dba),1003(oper),1004(asmdba)

    4. Add the user information to the replacement compute server. This example adds the group IDs from the previous step to the oracle user ID:

      # useradd -u 1000 -g 1001 -G 1001,1002,1003,1004 -m -d /home/oracle -s \
        /bin/bash oracle
      
    5. Create the ORACLE_BASE and Grid Infrastructure directories. This example creates /u01/app/oracle and /u01/app/12.1.0/grid:

      # mkdir -p /u01/app/oracle
      # mkdir -p /u01/app/12.1.0/grid
      # chown -R oracle:oinstall /u01/app
      
    6. Change the ownership of the cellip.ora and cellinit.ora files. The owner is typically oracle:dba.

      # chown -R oracle:dba /etc/oracle/cell/network-config
      
    7. Secure the restored compute server:

      $ chmod u+x /opt/oracle.SupportTools/harden_passwords_reset_root_ssh
      $ /opt/oracle.SupportTools/harden_passwords_reset_root_ssh
      

      The compute server restarts.

    8. Log in as the root user. When you are prompted for a new password, set it to match the root password of the other compute servers.

    9. Set the password for the Oracle software owner. The owner is typically oracle.

      # passwd oracle
      
  3. Set up SSH for the oracle account:

    1. Change to the oracle account on the replacement compute server:

      # su - oracle
      
    2. Create the dcli group file on the replacement compute server, listing the servers in the Oracle cluster.

    3. Run the setssh-Linux.sh script on the replacement compute server. This example runs the script with the -s option:

      $ /opt/oracle.SupportTools/onecommand/setssh-Linux.sh -s
      

      The -s option causes the script to run in silent mode. The script prompts for the oracle password on the servers.

    4. Change to the oracle user on the replacement compute server:

      # su - oracle
      
    5. Verify SSH equivalency:

      $ dcli -g dbs_group -l oracle date
      
  4. Set up or copy any custom login scripts from the working compute server to the replacement compute server:

    $ scp .bash* oracle@replacement_server:. 
    

    In the preceding command, replacement_server is the name of the new server, such as ra01db01.

12.8.8 Applying Patch Bundles to a Replacement Compute Server

Oracle periodically releases software patch bundles for Recovery Appliance. If the working compute server has a patch bundle that is later than the release of the computeImageMaker file, then you must apply the patch bundle to the replacement compute server.

To determine if a patch bundle was applied, use the imagehistory command. Compare the information on the replacement compute server to the information on the working compute server. If the working compute server has a later release, then apply the storage server patch bundle to the replacement compute server.
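
For example, a quick way to compare the active image release on the working and replacement compute servers is the imageinfo command; the release string shown here is illustrative only:

# imageinfo | grep "Active image version"
Active image version: 12.1.2.1.0.141206.1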

12.8.9 Cloning the Oracle Grid Infrastructure

The following procedure describes how to clone the Oracle Grid infrastructure onto the replacement compute server. In the commands, working_server is a working compute server, and replacement_server is the replacement compute server.

To clone the Oracle Grid infrastructure:

  1. Log in as root to a working compute server in the cluster.

  2. Verify the hardware and operating system installation using the cluster verification utility (cluvfy):

    $ cluvfy stage -post hwos -n replacement_server,working_server -verbose
    

    The phrase Post-check for hardware and operating system setup was successful should appear at the end of the report.

  3. Verify peer compatibility:

    $ cluvfy comp peer -refnode working_server -n replacement_server  \
      -orainv oinstall -osdba dba | grep -B 3 -A 2 mismatched
    

    The following is an example of the output:

    Compatibility check: Available memory [reference node: ra01db02]
    Node Name Status Ref. node status Comment
    ------------ ----------------------- ----------------------- ----------
    ra01db01 31.02GB (3.2527572E7KB) 29.26GB (3.0681252E7KB) mismatched
    Available memory check failed
    Compatibility check: Free disk space for "/tmp" [reference node: ra01db02]
    Node Name Status Ref. node status Comment
    ------------ ----------------------- ---------------------- ----------
    ra01db01 55.52GB (5.8217472E7KB) 51.82GB (5.4340608E7KB) mismatched
    Free disk space check failed
    

    If the only failed components are related to the physical memory, swap space, and disk space, then it is safe for you to continue.

  4. Perform the requisite checks for adding the server:

    1. Ensure that the GRID_HOME/network/admin/samples directory has permissions set to 750.

    2. Validate the addition of the compute server:

      $ cluvfy stage -ignorePrereq -pre nodeadd -n replacement_server \
      -fixup -fixupdir  /home/oracle/fixup.d
       
      

      If the only failed component is related to swap space, then it is safe for you to continue.

      You might get an error about a voting disk similar to the following:

      ERROR: 
      PRVF-5449 : Check of Voting Disk location "o/192.168.73.102/ \
      DATA_CD_00_ra01cel07(o/192.168.73.102/DATA_CD_00_ra01cel07)" \
      failed on the following nodes:
      Check failed on nodes: 
              ra01db01
              ra01db01:No such file or directory
      ...
      PRVF-5431 : Oracle Cluster Voting Disk configuration check failed
      

      If this error occurs, then use the -ignorePrereq option when running the addnode script in the next step.

  5. Add the replacement compute server to the cluster:

    $ cd /u01/app/12.1.0/grid/addnode/
    $ ./addnode.sh -silent "CLUSTER_NEW_NODES={replacement_server}" \
      "CLUSTER_NEW_VIRTUAL_HOSTNAMES={replacement_server-vip}"[-ignorePrereq]
    

    The addnode script causes Oracle Universal Installer to copy the Oracle Clusterware software to the replacement compute server. A message like the following is displayed:

    WARNING: A new inventory has been created on one or more nodes in this session.
    However, it has not yet been registered as the central inventory of this
    system. To register the new inventory please run the script at
    '/u01/app/oraInventory/orainstRoot.sh' with root privileges on nodes
    'ra01db01'. If you do not register the inventory, you may not be able to 
    update or patch the products you installed.
    
    The following configuration scripts need to be executed as the "root" user in
    each cluster node:
     
    /u01/app/oraInventory/orainstRoot.sh #On nodes ra01db01
     
    /u01/app/12.1.0/grid/root.sh #On nodes ra01db01
    
  6. Run the configuration scripts:

    1. Open a terminal window.

    2. Log in as the root user.

    3. Run the scripts on each cluster server.

    After the scripts are run, the following message is displayed:

    The Cluster Node Addition of /u01/app/12.1.0/grid was successful.
    Please check '/tmp/silentInstall.log' for more details.
    
  7. Run the orainstRoot.sh and root.sh scripts:

    # /u01/app/oraInventory/orainstRoot.sh
    Creating the Oracle inventory pointer file (/etc/oraInst.loc)
    Changing permissions of /u01/app/oraInventory.
    Adding read,write permissions for group.
    Removing read,write,execute permissions for world.
    Changing groupname of /u01/app/oraInventory to oinstall.
    The execution of the script is complete.
     
    # /u01/app/12.1.0/grid/root.sh
    

    Check the log files in /u01/app/12.1.0/grid/install/ for the output of the root.sh script. The output file reports that the listener resource on the replaced compute server failed to start. This is an example of the expected output:

    /u01/app/12.1.0/grid/bin/srvctl start listener -n ra01db01 \
    ...Failed
    /u01/app/12.1.0/grid/perl/bin/perl \
    -I/u01/app/12.1.0/grid/perl/lib \
    -I/u01/app/12.1.0/grid/crs/install \
    /u01/app/12.1.0/grid/crs/install/rootcrs.pl execution failed
    
  8. Reenable the listener resource that you stopped in "Removing the Failed Compute Server from the Cluster".

    # $GRID_HOME/bin/srvctl enable listener -l LISTENER \
      -n replacement_server

    # $GRID_HOME/bin/srvctl start listener -l LISTENER \
      -n replacement_server
    

12.8.10 Cloning Oracle Database Homes to the Replacement Compute Server

To clone the Oracle Database homes to the replacement server:

  1. Add Oracle Database ORACLE_HOME to the replacement compute server:
    $ cd /u01/app/oracle/product/12.1.0/db_home/addnode/
    $ ./addnode.sh -silent "CLUSTER_NEW_NODES={replacement_server}" -ignorePrereq
    

    The addnode script causes Oracle Universal Installer to copy the Oracle Database software to the replacement compute server.

    WARNING: The following configuration scripts need to be executed as the "root"
    user in each cluster node.
    /u01/app/oracle/product/12.1.0/dbhome_1/root.sh #On nodes ra01db01
    To execute the configuration scripts:
    Open a terminal window.
    Log in as root.
    Run the scripts on each cluster node.
     
    

    After the scripts are finished, the following messages appear:

    The Cluster Node Addition of /u01/app/oracle/product/12.1.0/dbhome_1 was
    successful.
    Please check '/tmp/silentInstall.log' for more details.
    
  2. Run the root.sh script on the replacement compute server:
    # /u01/app/oracle/product/12.1.0/dbhome_1/root.sh
     
    

    Check the /u01/app/oracle/product/12.1.0/dbhome_1/install/root_replacement_server.company.com_date.log file for the output of the script.

  3. Ensure that the instance parameters are set for the replaced database instance. The following is an example for the CLUSTER_INTERCONNECTS parameter.
    SQL> SHOW PARAMETER cluster_interconnects
    
    NAME                                 TYPE        VALUE
    ------------------------------       --------    -------------------------
    cluster_interconnects                string
     
    SQL> ALTER SYSTEM SET cluster_interconnects='192.168.73.90' SCOPE=spfile SID='dbm1';
    
  4. Validate the configuration files, and correct them as necessary (see the sketch after this procedure):
    • The ORACLE_HOME/dbs/initSID.ora file points to the server parameter file (SPFILE) in the Oracle ASM shared storage.

    • The password file that was copied to the ORACLE_HOME/dbs directory is renamed to orapwSID.

  5. Restart the database instance.
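
The following is a minimal sketch of the checks in step 4 and the restart in step 5, assuming the database name zdlra5 and instance zdlra51 used elsewhere in this chapter; the SPFILE path shown is illustrative only:

$ cat $ORACLE_HOME/dbs/initzdlra51.ora
SPFILE='+DATA/ZDLRA5/spfilezdlra5.ora'

$ srvctl stop instance -d zdlra5 -i zdlra51
$ srvctl start instance -d zdlra5 -i zdlra51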

12.9 Maintaining the Storage Servers

This section describes how to perform maintenance on the storage servers. It contains the following topics:

12.9.1 Shutting Down a Storage Server

When performing maintenance on a storage server, you might need to power down or restart the server. Before shutting down a storage server, verify that taking a server offline does not impact Oracle ASM disk group and database availability. Continued database availability depends on the level of Oracle ASM redundancy used on the affected disk groups, and the current status of disks in other storage servers that have mirror copies of the same data.

Caution:

  • If a disk in a different cell fails while the cell undergoing maintenance is not completely back in service on the Recovery Appliance, a double disk failure can occur. If the Recovery Appliance is deployed with NORMAL redundancy for the DELTA disk group and if this disk failure is permanent, you will lose all backups on the Recovery Appliance.

  • Ensure that the cell undergoing maintenance is not offline for an extended period of time. Otherwise, a rebalance operation begins, which can cause issues if there is insufficient space for the operation to complete. By default, the rebalance operation begins 24 hours after the cell goes offline (see the query sketch following this note).
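
The 24-hour default corresponds to the Oracle ASM disk_repair_time disk group attribute. The following query is a sketch for checking the current setting; it assumes you are connected to an Oracle ASM instance with the necessary privileges:

SQL> SELECT dg.name AS disk_group, a.value AS disk_repair_time
     FROM v$asm_diskgroup dg, v$asm_attribute a
     WHERE dg.group_number = a.group_number
     AND a.name = 'disk_repair_time';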

To power down a storage server:

  1. Log in to the storage server as root.
  2. (Optional) Keep the grid disks offline after restarting the storage server:
    CellCLI> ALTER GRIDDISK ALL INACTIVE
    

    Use this command when performing multiple restarts, or to control when the cell becomes active again; for example, so that you can verify that the planned maintenance activity was successful before the server returns to service.

  3. Stop the cell services:
    CellCLI> ALTER CELL SHUTDOWN SERVICES ALL
    

    The preceding command checks whether any disks are offline, in predictive failure status, or must be copied to their mirrors. If Oracle ASM redundancy is intact, then the command takes the grid disks offline in Oracle ASM, and stops the services.

    The following error indicates that stopping the services might cause redundancy problems and force a disk group to dismount:

    Stopping the RS, CELLSRV, and MS services...
    The SHUTDOWN of ALL services was not successful.
    CELL-01548: Unable to shut down CELLSRV because disk group DATA, RECO may be
    forced to dismount due to reduced redundancy.
    Getting the state of CELLSRV services... running
    Getting the state of MS services... running
    Getting the state of RS services... running
    

    If this error occurs, then restore Oracle ASM disk group redundancy. Retry the command when the status is normal for all disks.

  4. Shut down the server. See "Powering Down the Servers".
  5. After you complete the maintenance procedure, power up the server. The services start automatically. During startup, all grid disks are automatically brought online in Oracle ASM.
  6. Verify that all grid disks are online:
    CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
    

    Wait until asmmodestatus shows ONLINE or UNUSED for all grid disks.

  7. If you inactivated the grid disks in step 2, then reactivate them:
    CellCLI> ALTER GRIDDISK ALL ACTIVE
    

    If you skipped step 2, then the grid disks are activated automatically.

See Also:

My Oracle Support Doc ID 1188080.1, "Steps to shut down or reboot an Exadata storage cell without affecting ASM."

12.9.2 Enabling Network Connectivity Using the Diagnostics ISO

You might need to use the diagnostics ISO to access a storage server that fails to restart normally. After starting the server, you can copy files from the ISO to the server, replacing the corrupt files.

The ISO is located on all Recovery Appliance servers at /opt/oracle.SupportTools/diagnostics.iso.

Caution:

Use the diagnostics ISO only after other restart methods, such as using the USB drive, have failed. Contact Oracle Support Services for advice and guidance before starting this procedure.

To use the diagnostics ISO:

  1. Enable a one-time CD-ROM boot in the service processor, using either the web interface or a serial console, such as Telnet or PuTTY. For example, use this command from a serial console:
    set boot_device=cdrom
    
  2. Mount a local copy of diagnostics.iso as a CD-ROM, using the service processor interface.
  3. Use the reboot command to restart the server.
  4. Log in to the server as the root user with the diagnostics ISO password.
  5. To avoid pings:
    alias ping="ping -c"
    
  6. Make a directory named /etc/network.
  7. Make a directory named /etc/network/if-pre-up.d.
  8. Add the following settings to the /etc/network/interfaces file, entering the actual IP address and netmask of the server, and the IP address of the gateway:
    iface eth0 inet static
    address IP address of server
    netmask netmask of server
    gateway gateway IP address of server
    
  9. Start the eth0 interface:
    # ifup eth0
     
    

    Ignore any warning messages.

  10. Use either FTP or the wget command to retrieve the files needed to repair the server.

12.10 Maintaining the Physical Disks of Storage Servers

This section contains the following topics:

See Also:

Oracle Maximum Availability Architecture (MAA) website at http://www.oracle.com/goto/maa for additional information about maintenance best practices

12.10.1 About System Disks and Data Disks

The first two disks of a storage server are system disks. The storage server system software resides on a portion of each system disk. These portions on both system disks are referred to as the system area. The nonsystem area of the system disks, referred to as data partitions, is used for normal data storage. All other disks in a storage server are called data disks.

12.10.2 Monitoring the Status of Physical Disks

You can monitor a physical disk by checking its attributes with the CellCLI LIST PHYSICALDISK command. For example, a physical disk with a status of failed or warning - predictive failure is having problems and probably must be replaced. The disk firmware maintains the error counters, and marks a drive with Predictive Failure when internal thresholds are exceeded. The drive, not the server software, determines if it needs replacement.
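
For example, the following sketch lists only the disks that are not in the normal state, using standard CellCLI physical disk attributes; the output shown is illustrative:

CellCLI> LIST PHYSICALDISK WHERE status != normal ATTRIBUTES name, status, slotNumber
         20:2    warning - predictive failure    2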

The following list identifies the storage server physical disk statuses.

Physical Disk Status for Storage Servers

  • normal
  • normal - dropped for replacement
  • normal - confinedOnline
  • normal - confinedOnline - dropped for replacement
  • not present
  • failed
  • failed - dropped for replacement
  • failed - rejected due to incorrect disk model
  • failed - rejected due to incorrect disk model - dropped for replacement
  • failed - rejected due to wrong slot
  • failed - rejected due to wrong slot - dropped for replacement
  • warning - confinedOnline
  • warning - confinedOnline - dropped for replacement
  • warning - peer failure
  • warning - poor performance
  • warning - poor performance - dropped for replacement
  • warning - poor performance, write-through caching
  • warning - predictive failure, poor performance
  • warning - predictive failure, poor performance - dropped for replacement
  • warning - predictive failure, write-through caching
  • warning - predictive failure
  • warning - predictive failure - dropped for replacement
  • warning - predictive failure, poor performance, write-through caching
  • warning - write-through caching

12.10.3 What Happens When Disk Errors Occur?

Oracle ASM performs bad extent repair for read errors caused by hardware errors. The disks stay online, and no alerts are sent.

When a disk fails:

  • The Oracle ASM disks associated with it are dropped automatically with the FORCE option, and then an Oracle ASM rebalance restores data redundancy.

  • The blue LED and the amber LED are turned on for the drive, indicating that disk replacement can proceed. The drive LED stays on solid. See "LED Status Descriptions" for information about LED status lights during predictive failure and poor performance.

  • The server generates an alert, which includes specific instructions for replacing the disk. If you configured the system for alert notifications, then the alert is sent by email to the designated address.

When a disk has a faulty status:

  • The Oracle ASM disks associated with the grid disks on the physical drive are dropped automatically.

  • An Oracle ASM rebalance relocates the data from the predictively failed disk to other disks.

  • The blue LED is turned on for the drive, indicating that disk replacement can proceed.

When Oracle ASM gets a read error on a physically addressed metadata block, for which there is no mirror copy, the following occurs:

  • Oracle ASM takes the disk offline.

  • Oracle ASM drops the disk with the FORCE option.

  • The storage server software sends an alert stating that the disk can be replaced.

12.10.4 About Detecting Underperforming Disks

ASR automatically identifies and removes a poorly performing disk from the active configuration. Recovery Appliance then runs a set of performance tests. When CELLSRV detects poor disk performance, the cell disk status changes to normal - confinedOnline, and the physical disk status changes to warning - confinedOnline. Table 12-2 describes the conditions that trigger disk confinement:

Table 12-2 Alerts Indicating Poor Disk Performance

Alert Code              Cause
CD_PERF_HANG            Disk stopped responding
CD_PERF_SLOW_ABS        High service time threshold (slow disk)
CD_PERF_SLOW_RLTV       High relative service time threshold (slow disk)
CD_PERF_SLOW_LAT_WT     High latency on writes
CD_PERF_SLOW_LAT_RD     High latency on reads
CD_PERF_SLOW_LAT_RW     High latency on reads and writes
CD_PERF_SLOW_LAT_ERR    Frequent very high absolute latency on individual I/Os
CD_PERF_IOERR           I/O errors

If the problem is temporary and the disk passes the tests, then it is brought back into the configuration. If the disk does not pass the tests, then it is marked poor performance, and ASR submits a service request to replace the disk. If possible, Oracle ASM takes the grid disks offline for testing. Otherwise, the cell disk status stays at normal - confinedOnline until the disks can be taken offline safely. See "Removing an Underperforming Physical Disk".

The disk status change is recorded in the server alert history:

MESSAGE ID date_time info "Hard disk entered confinement status. The LUN
 n_m changed status to warning - confinedOnline. CellDisk changed status to normal
 - confinedOnline. Status: WARNING - CONFINEDONLINE  Manufacturer: name  Model
 Number: model  Size: size  Serial Number: serial_number  Firmware: fw_release 
 Slot Number: m  Cell Disk: cell_disk_name  Grid Disk: grid disk 1, grid disk 2
     .
     .
     .
Reason for confinement: threshold for service time exceeded"

These messages are entered in the storage cell alert log:

CDHS: Mark cd health state change cell_disk_name  with newState HEALTH_BAD_
ONLINE pending HEALTH_BAD_ONLINE ongoing INVALID cur HEALTH_GOOD
Celldisk entering CONFINE ACTIVE state with cause CD_PERF_SLOW_ABS activeForced: 0
inactiveForced: 0 trigger HistoryFail: 0, forceTestOutcome: 0 testFail: 0
global conf related state: numHDsConf: 1 numFDsConf: 0 numHDsHung: 0 numFDsHung: 0
     .
     .
     .

12.10.5 About Rebalancing the Data

After you replace the physical disk, you must re-create the grid disks and cell disks that existed on the previous disk in that slot. If those grid disks were part of an Oracle ASM disk group, then add them back to the disk group and rebalance the data, based on the disk group redundancy and the ASM_POWER_LIMIT parameter.

Oracle ASM rebalance occurs when dropping or adding a disk. To check the status of the rebalance:

  • Did the rebalance operation run successfully?

    Check the Oracle ASM alert logs.

  • Is the rebalance operation currently running?

    Check the GV$ASM_OPERATION view.

  • Did the rebalance operation fail?

    Check the ERROR_CODE column of the V$ASM_OPERATION view.

If the failed physical disk contained Oracle ASM disks from multiple disk groups, then you can run the rebalance operations for those disk groups on different Oracle ASM instances in the same cluster. Each Oracle ASM instance can run one rebalance operation at a time. If all Oracle ASM instances are busy, then the rebalance operations are queued.
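
For example, to check rebalance progress from any Oracle ASM instance in the cluster, you can run a query such as the following. This is a sketch; the available estimate columns can vary by release:

SQL> SELECT inst_id, group_number, operation, state, power, est_minutes, error_code
     FROM gv$asm_operation;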

12.10.6 Monitoring Hard Disk Controller Write-Through Cache Mode

The hard disk controller on each storage server periodically performs a discharge and charge of the controller battery. During the operation, the write cache policy changes from write-back caching to write-through caching. Write-through cache mode is slower than write-back cache mode. However, write-back cache mode risks data loss if the storage server loses power or fails. The operation occurs every three months, for example, at 01:00 on the 17th day of January, April, July and October.

This example shows an informational alert that a storage server generates about the status of the caching mode for its logical drives:

HDD disk controller battery on disk controller at adapter 0 is going into a learn
cycle. This is a normal maintenance activity that occurs quarterly and runs for
approximately 1 to 12 hours. The disk controller cache might go into WriteThrough
caching mode during the learn cycle. Disk write throughput might be temporarily
lower during this time. The message is informational only, no action is required.

Use the following commands to manage the periodic change of the write cache policy:

  • To change the start time for the learn cycle, use a command like the following example:

    CellCLI> ALTER CELL bbuLearnCycleTime="2013-01-22T02:00:00-08:00"
    

    The time reverts to the default learn cycle time after the cycle completes.

  • To see the time for the next learn cycle:

    CellCLI> LIST CELL ATTRIBUTES bbuLearnCycleTime
    
  • To view the status of the battery:

    # /opt/MegaRAID/MegaCli/MegaCli64 -AdpBbuCmd -GetBbuStatus -a0
    
    BBU status for Adapter: 0
     
    BatteryType: iBBU08
    Voltage: 3721 mV
    Current: 541 mA
    Temperature: 43 C
     
    BBU Firmware Status:
    Charging Status : Charging
    Voltage : OK
    Temperature : OK
    Learn Cycle Requested : No
    Learn Cycle Active : No
    Learn Cycle Status : OK
    Learn Cycle Timeout : No
    I2c Errors Detected : No
    Battery Pack Missing : No
    Battery Replacement required : No
    Remaining Capacity Low : Yes
    Periodic Learn Required : No
    Transparent Learn : No
     
    Battery state:
     
    GasGuageStatus:
    Fully Discharged : No
    Fully Charged : No
    Discharging : No
    Initialized : No
    Remaining Time Alarm : Yes
    Remaining Capacity Alarm: No
    Discharge Terminated : No
    Over Temperature : No
    Charging Terminated : No
    Over Charged : No
     
    Relative State of Charge: 7 %
    Charger System State: 1
    Charger System Ctrl: 0
    Charging current: 541 mA
    Absolute state of charge: 0 %
    Max Error: 0 %
     
    Exit Code: 0x00
    

12.10.7 Replacing a Failed Physical Disk

A physical disk outage can reduce performance and data redundancy. Therefore, you should replace a failed disk with a new disk as soon as possible.

To replace a disk when it fails:

  1. Determine which disk failed.
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status=failed DETAIL
    
             name:                   28:5
             deviceId:               21
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_5
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         A01BC2
             physicalSize:           558.9109999993816G
             slotNumber:             5
             status:                 failed
    

    The slot number shows the location of the disk, and the status shows that the disk failed.

  2. Ensure that the blue "OK to Remove" LED on the disk is lit before you remove the disk.
  3. Replace the physical disk on the storage server and wait three minutes. The physical disk is hot pluggable, and you can replace it with the power on.
  4. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=28:5 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.

  5. Verify that the firmware is correct:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    

    You can also check the ms-odl.trc file to confirm that the firmware was updated and the logical unit number (LUN) was rebuilt.

  6. Re-create the grid disks and cell disks that existed on the previous disk in that slot. See "About Rebalancing the Data".
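
    The exact commands depend on the original configuration. The following is a minimal sketch, assuming the default layout of one cell disk per LUN and a hypothetical grid disk prefix of DATA; the disk group name and the Oracle ASM disk path in the last command are also placeholders:

    CellCLI> CREATE CELLDISK ALL HARDDISK
    CellCLI> CREATE GRIDDISK ALL HARDDISK PREFIX=DATA

    If the grid disks belonged to an Oracle ASM disk group, add them back to that disk group:

    SQL> ALTER DISKGROUP DATA ADD DISK 'o/cell_ip_address/DATA_CD_05_cell_name';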


12.10.8 Replacing a Faulty Physical Disk

You might need to replace a physical disk because its status is warning - predictive failure. This status indicates that the physical disk will fail soon, and you should replace it at the earliest opportunity.

If the drive fails before you replace it, then see "Replacing a Failed Physical Disk".

To replace a disk before it fails:

  1. Identify the faulty disk:
    CellCLI> LIST PHYSICALDISK WHERE diskType=HardDisk AND status= \
            "warning - predictive failure" DETAIL
    
             name:                   28:3
             deviceId:               19
             diskType:               HardDisk
             enclosureDeviceId:      28
             errMediaCount:          0
             errOtherCount:          0
             foreignState:           false
             luns:                   0_3
             makeModel:              "SEAGATE ST360057SSUN600G"
             physicalFirmware:       0705
             physicalInterface:      sas
             physicalSerial:         E07L8E
             physicalSize:           558.9109999993816G
             slotNumber:             3
             status:                 warning - predictive failure
    

    In the sample output from the previous command, the slot number shows the location of the disk, and the status shows that the disk is expected to fail.

  2. Ensure that the blue "OK to Remove" LED on the disk is lit before you remove the disk.
  3. Wait while the affected Oracle ASM disks are dropped. To check the status, query the V$ASM_DISK_STAT view on the Oracle ASM instance.

    Caution:

    The disks in the first two slots are system disks, which store the operating system and the Recovery Appliance storage server software. One system disk must be in working condition for the server to operate.

    Before replacing the other system disk, wait until ALTER CELL VALIDATE CONFIGURATION shows no RAID mdadm errors. This output indicates that the system disk resynchronization is complete.

    See Also:

    Oracle Database Reference for information about querying the V$ASM_DISK_STAT view

  4. Replace the physical disk on the storage server and wait three minutes. The physical disk is hot pluggable, and you can replace it when the power is on.
  5. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=28:3 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.

  6. Verify that the firmware is correct:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    
  7. Re-create the grid disks and cell disks that existed on the previous disk in that slot. See "About Rebalancing the Data".


12.10.9 Removing an Underperforming Physical Disk

A bad physical disk can degrade the performance of other good disks. You should remove the bad disk from the system.

To remove a physical disk after identifying the bad disk:

  1. Illuminate the physical drive service LED to identify the drive to be replaced:
    cellcli -e 'alter physicaldisk disk_name serviceled on'
    

    In the preceding command, disk_name is the name of the physical disk to be replaced, such as 20:2.

  2. Identify all grid disks on the bad disk, and direct Oracle ASM to stop using them:
    SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name
    
  3. Ensure that the blue "OK to Remove" LED on the disk is lit.
  4. Query the V$ASM_DISK_STAT view to ensure that the Oracle ASM disks affected by the bad disk were dropped successfully.
  5. Remove the bad disk.

    An alert is sent when the disk is removed.

  6. When a new disk is available, install it in the system. The cell disks and grid disks are created automatically on the new physical disk.
  7. Confirm that the disk is online and its status is NORMAL:
    CellCLI> LIST PHYSICALDISK WHERE name=20:2 ATTRIBUTES status
    

    When you replace a physical disk, the RAID controller must acknowledge the replacement disk before you can use it. Acknowledgment is quick.
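
    If you turned on the drive service LED in step 1, you can turn it off after the replacement. The following is a sketch using the same hypothetical disk name as in step 1:

    cellcli -e 'alter physicaldisk 20:2 serviceled off'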

12.10.10 Moving All Drives from One Storage Server to Another

You might need to move all drives from one storage server to another storage server. This situation might occur when a chassis-level component fails, such as a motherboard or Oracle ILOM, or when you are troubleshooting a hardware problem.

To move the drives between storage servers:

  1. Back up the following files and directories:
    • /etc/hosts

    • /etc/modprobe.conf

    • /etc/sysconfig/network

    • /etc/sysconfig/network-scripts

  2. Inactivate all grid disks and shut down the storage server. See "Shutting Down a Storage Server".
  3. Ensure that the Oracle ASM disk_repair_time attribute is set long enough, so that Oracle ASM does not drop the disks before you can activate the grid disks in another storage server.
  4. Move the physical disks, flash disks, disk controller, and USB flash drive from the original storage server to the new storage server.

    Caution:

    • Ensure that the first two disks, which are the system disks, are installed in the first two slots, as in the original server. Otherwise, the storage server will not function properly.

    • Ensure that the flash cards are installed in the same PCIe slots as in the original storage server.

  5. Power on the new storage server. You can either use the service processor interface or press the power button.
  6. Log in to the console using the service processor.
  7. Check the following files and directories. Restore any corrupted files from the backups.
    • /etc/hosts

    • /etc/modprobe.conf

    • /etc/sysconfig/network

    • /etc/sysconfig/network-scripts

  8. Use the ifconfig command to retrieve the new MAC addresses for eth0, eth1, eth2, and eth3. This example shows that the eth0 MAC address (HWaddr) is 00:14:4F:CA:D9:AE.
    # ifconfig eth0
    eth0      Link encap:Ethernet  HWaddr 00:14:4F:CA:D9:AE
              inet addr:10.204.74.184  Bcast:10.204.75.255  Mask:255.255.252.0
              inet6 addr: fe80::214:4fff:feca:d9ae/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:141455 errors:0 dropped:0 overruns:0 frame:0
              TX packets:6340 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:9578692 (9.1 MiB)  TX bytes:1042156 (1017.7 KiB)
              Memory:f8c60000-f8c80000
    
  9. In the /etc/sysconfig/network-scripts directory, edit the following files to change HWADDR to the value returned in step 8:
    • ifcfg-eth0
    • ifcfg-eth1
    • ifcfg-eth2
    • ifcfg-eth3

    The following example shows the edited ifcfg-eth0 file:

    #### DO NOT REMOVE THESE LINES ####
    #### %GENERATED BY CELL% ####
    DEVICE=eth0
    BOOTPROTO=static
    ONBOOT=yes
    IPADDR=10.204.74.184
    NETMASK=255.255.252.0
    NETWORK=10.204.72.0
    BROADCAST=10.204.75.255
    GATEWAY=10.204.72.1
    HOTPLUG=no
    IPV6INIT=no
    HWADDR=00:14:4F:CA:D9:AE
    
  10. Restart the storage server.
  11. Activate the grid disks:
    CellCLI> ALTER GRIDDISK ALL ACTIVE
    

    If the Oracle ASM disks were not dropped, then they go online automatically and start being used.

  12. Validate the configuration:
    CellCLI> ALTER CELL VALIDATE CONFIGURATION
    
  13. Activate Oracle ILOM for ASR.

12.10.11 Removing and Replacing the Same Physical Disk

If you remove the wrong physical disk and then reinsert it, Recovery Appliance automatically adds the disk back into the Oracle ASM disk group and resynchronizes its data.
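
To confirm that the reinserted disk came back online and that resynchronization completed, you can check Oracle ASM with a query such as the following. This is a sketch; filter on the affected disk name as needed:

SQL> SELECT name, mode_status, state FROM v$asm_disk ORDER BY name;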

Note:

When replacing a faulty or failed disk, look for a lit LED on the disk. The LED is lit to help you locate the bad disk.

12.10.12 Reenabling a Rejected Physical Disk

Recovery Appliance rejects a physical disk when it is in the wrong slot.

Caution:

Reenabling a physical disk removes all data stored on it.

  • To reenable a rejected physical disk, replace hard_disk_name and hard_disk_id with the appropriate values in this command:

    CellCLI> ALTER PHYSICALDISK hard_disk_name/hard_disk_id reenable force
    Physical disk hard_disk_name/hard_disk_id  was reenabled.
    

12.11 Maintaining the Flash Disks of Storage Servers

This section describes how to perform maintenance on flash disks. It contains the following topics:

12.11.1 About the Flash Disks

Recovery Appliance mirrors data across storage servers, and sends write operations to at least two storage servers. If a flash card in one storage server has problems, then Recovery Appliance services the read and write operations using the mirrored data in another storage server. Service is not interrupted.

If a flash card fails, then the storage server software identifies the data in the flash cache by reading the data from the surviving mirror. It then writes the data to the server with the failed flash card. When the failure occurs, the software saves the location of the data lost in the failed flash cache. Resilvering then replaces the lost data with the mirrored copy. During resilvering, the grid disk status is ACTIVE -- RESILVERING WORKING.

Each storage server has four PCIe cards. Each card has four flash disks (FDOMs) for a total of 16 flash disks. The four PCIe cards are located in PCI slot numbers 1, 2, 4, and 5.

To identify a failed flash disk, use the following command:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS=failed DETAIL

         name:                   FLASH_5_3
         diskType:               FlashDisk
         luns:                   5_3
         makeModel:              "Sun Flash Accelerator F40 PCIe Card"
         physicalFirmware:       TI35
         physicalInsertTime:     2012-07-13T15:40:59-07:00
         physicalSerial:         5L002X4P
         physicalSize:           93.13225793838501G
         slotNumber:             "PCI Slot: 5; FDOM: 3"
         status:                 failed

The name and slotNumber attributes identify the PCI slot and the FDOM number.

When the server software detects a failure, it generates an alert that indicates that the flash disk, and the LUN on it, failed. The alert message includes the PCI slot number of the flash card and the exact FDOM number. These numbers uniquely identify the field replaceable unit (FRU). If you configured the system for alert notification, then the alert is sent to the designated address in an email message.

A flash disk outage can reduce performance and data redundancy. Replace the failed disk at the earliest opportunity. If the flash disk is used for flash cache, then the effective cache size for the server is reduced. If the flash disk is used for flash log, then the flash log is disabled on the disk, thus reducing the effective flash log size. If the flash disk is used for grid disks, then the Oracle ASM disks associated with them are automatically dropped with the FORCE option from the Oracle ASM disk group, and an Oracle ASM rebalance starts to restore the data redundancy.
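
To check the current flash cache and flash log configuration on a storage server, you can run CellCLI queries such as the following. The attribute lists are illustrative:

CellCLI> LIST FLASHCACHE ATTRIBUTES name, size, status
CellCLI> LIST FLASHLOG ATTRIBUTES name, size, status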


12.11.2 Faulty Status Indicators

The following status indicators generate an alert. The alert includes specific instructions for replacing the flash disk. If you configured the system for alert notifications, then the alerts are sent by email message to the designated address.

warning - peer failure

One of the flash disks on the same Sun Flash Accelerator PCIe card failed or has a problem. For example, if FLASH_5_3 fails, then FLASH_5_0, FLASH_5_1, and FLASH_5_2 have peer failure status:

CellCLI> LIST PHYSICALDISK
         36:0            L45F3A          normal
         36:1            L45WAE          normal
         36:2            L45WQW          normal
          .
          .
          .
         FLASH_5_0       5L0034XM        warning - peer failure
         FLASH_5_1       5L0034JE        warning - peer failure
         FLASH_5_2       5L002WJH        warning - peer failure
         FLASH_5_3       5L002X4P        failed
warning - predictive failure

The flash disk will fail soon, and should be replaced at the earliest opportunity. If the flash disk is used for flash cache, then it continues to be used as flash cache. If the flash disk is used for grid disks, then the Oracle ASM disks associated with these grid disks are automatically dropped, and Oracle ASM rebalance relocates the data from the predictively failed disk to other disks.

When a flash disk has predictive failure status, its data is copied to other disks. If the flash disk is used for write-back flash cache, then the data is flushed from the flash disk to the grid disks.

warning - poor performance

The flash disk demonstrates extremely poor performance, and should be replaced at the earliest opportunity. If the flash disk is used for flash cache, then flash cache is dropped from this disk, thus reducing the effective flash cache size for the storage server. If the flash disk is used for grid disks, then the Oracle ASM disks associated with the grid disks on this flash disk are automatically dropped with the FORCE option, if possible. If DROP...FORCE cannot succeed because of offline partners, then the grid disks are dropped normally, and Oracle ASM rebalance relocates the data from the poor performance disk to the other disks.

warning - write-through caching

The capacitors used to support data cache on the PCIe card failed, and the card should be replaced as soon as possible.

12.11.3 Identifying Flash Disks in Poor Health

To identify a flash disk with a particular health status, use the LIST PHYSICALDISK command. This example queries for the warning - predictive failure status:

CellCLI> LIST PHYSICALDISK WHERE DISKTYPE=flashdisk AND STATUS=  \
'warning - predictive failure' DETAIL


         name:                   FLASH_5_3
         diskType:               FlashDisk
         luns:                   5_3
         makeModel:              "Sun Flash Accelerator F40 PCIe Card"
         physicalFirmware:       TI35
         physicalInsertTime:     2012-07-13T15:40:59-07:00
         physicalSerial:         5L002X4P
         physicalSize:           93.13225793838501G
         slotNumber:             "PCI Slot: 1; FDOM: 2"
         status:                 warning - predictive failure

12.11.4 Identifying Underperforming Flash Disks

ASR automatically identifies and removes a poorly performing disk from the active configuration. Recovery Appliance then runs a set of performance tests. When CELLSRV detects poor disk performance, the cell disk status changes to normal - confinedOnline, and the physical disk status changes to warning - confinedOnline. Table 12-2 describes the conditions that trigger disk confinement. The conditions are the same for both physical and flash disks.

If the problem is temporary and the disk passes the tests, then it is brought back into the configuration. If the disk does not pass the tests, then it is marked poor performance, and ASR submits a service request to replace the disk. If possible, Oracle ASM takes the grid disks offline for testing. Otherwise, the cell disk status stays at normal - confinedOnline until the disks can be taken offline safely.

The disk status change is recorded in the server alert history:

MESSAGE ID date_time info "Hard disk entered confinement status. The LUN
 n_m changed status to warning - confinedOnline. CellDisk changed status to normal
 - confinedOnline. Status: WARNING - CONFINEDONLINE  Manufacturer: name  Model
 Number: model  Size: size  Serial Number: serial_number  Firmware: fw_release 
 Slot Number: m  Cell Disk: cell_disk_name  Grid Disk: grid disk 1, grid disk 2
 ... Reason for confinement: threshold for service time exceeded"

These messages are entered in the storage cell alert log:

CDHS: Mark cd health state change cell_disk_name  with newState HEALTH_BAD_
ONLINE pending HEALTH_BAD_ONLINE ongoing INVALID cur HEALTH_GOOD
Celldisk entering CONFINE ACTIVE state with cause CD_PERF_SLOW_ABS activeForced: 0
inactiveForced: 0 trigger HistoryFail: 0, forceTestOutcome: 0 testFail: 0
global conf related state: numHDsConf: 1 numFDsConf: 0 numHDsHung: 0 numFDsHung: 0
     .
     .
     .

12.11.5 When Is It Safe to Replace a Faulty Flash Disk?

When the server software detects a predictive or peer failure in a flash disk used for write-back flash cache, and only one FDOM is bad, then the software resilvers the data on the bad FDOM and flushes the data on the other three FDOMs. If there are valid grid disks, then the server software initiates an Oracle ASM rebalance of the disks. You cannot replace the bad disk until these tasks are complete and an alert indicates that the disk is ready for replacement.

An alert is sent when the Oracle ASM disks are dropped, and you can safely replace the flash disk. If the flash disk is used for write-back flash cache, then wait until none of the grid disks are cached by the flash disk.

12.11.6 Replacing a Failed Flash Disk

Caution:

The PCIe cards are not hot pluggable; you must power down a storage server before replacing the flash disks or cards.

Before you perform the following procedure, shut down the server. See "Shutting Down a Storage Server".

To replace a failed flash disk:

  1. Replace the failed flash disk. Use the PCI number and FDOM number to locate the failed disk. A white cell LED is lit to help you locate the affected server.
  2. Power up the server. The services start automatically. As part of the server startup, all grid disks are automatically online in Oracle ASM.
  3. Verify that all grid disks are online:
    CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
    

    Wait until asmmodestatus shows ONLINE or UNUSED for all grid disks.


12.11.7 Replacing a Faulty Flash Disk

Caution:

The PCIe cards are not hot pluggable; you must power down a storage server before replacing the flash disks or cards.

Before you perform the following procedure, review the "When Is It Safe to Replace a Faulty Flash Disk?" topic.

To replace a faulty flash disk:

  1. Use the following command to check the cachedBy attribute of all grid disks.
    CellCLI> LIST GRIDDISK ATTRIBUTES name, cachedBy
    

    The cell disk on the flash disk should not appear in the cachedBy attribute of any grid disk. If the flash disk is used for both grid disks and flash cache, then wait until you receive the alert and the cell disk no longer appears in the cachedBy attribute of any grid disk.

  2. Stop all services:
    CellCLI> ALTER CELL SHUTDOWN SERVICES ALL
    

    The preceding command checks if any disks are offline, in predictive failure status, or must be copied to a mirror. If Oracle ASM redundancy is intact, then the command takes the grid disks offline in Oracle ASM, and then stops the services.

    The following error indicates that it might be unsafe to stop the services, because stopping them might force a disk group to dismount:

    Stopping the RS, CELLSRV, and MS services...
    The SHUTDOWN of ALL services was not successful.
    CELL-01548: Unable to shut down CELLSRV because disk group DATA, RECO may be
    forced to dismount due to reduced redundancy.
    Getting the state of CELLSRV services... running
    Getting the state of MS services... running
    Getting the state of RS services... running
    

    If this error occurs, then restore Oracle ASM disk group redundancy, and retry the command when the disk status is normal for all disks.

  3. Shut down the server.
  4. Replace the failed flash disk. Use the PCI number and FDOM number to locate the failed disk. A white cell LED is lit to help you locate the affected server.
  5. Power up the server. The services start automatically. As part of the server startup, all grid disks are automatically online in Oracle ASM.
  6. Verify that all grid disks are online:
    CellCLI> LIST GRIDDISK ATTRIBUTES name, asmmodestatus
    

    Wait until asmmodestatus shows ONLINE or UNUSED for all grid disks.

The system automatically uses the new flash disk, as follows:

  • If the flash disk is used for flash cache, then the effective cache size increases.

  • If the flash disk is used for grid disks, then the grid disks are re-created on the new flash disk.

  • If the grid disks were part of an Oracle ASM disk group, then they are added back to the disk group. The data is rebalanced on them, based on the disk group redundancy and the ASM_POWER_LIMIT parameter.

12.11.8 Removing an Underperforming Flash Disk

A bad flash disk can degrade the performance of other good flash disks. You should remove a bad flash disk. See "Identifying Underperforming Flash Disks".

To remove an underperforming flash disk:

  1. If the flash disk is used for flash cache:

    1. Ensure that data not synchronized with the disk (dirty data) is flushed from flash cache to the grid disks:

      CellCLI> ALTER FLASHCACHE ... FLUSH
      
    2. Drop the flash cache and create a new one. Do not include the bad flash disk when creating the flash cache.

      CellCLI> DROP FLASHCACHE
      CellCLI> CREATE FLASHCACHE CELLDISK='fd1,fd2,fd3,fd4, ...'
      
  2. If the flash disk is used for grid disks, then direct Oracle ASM to stop using the bad disk immediately:

    SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name FORCE 
    

    Offline partners might cause the DROP command with the FORCE option to fail. If the previous command fails, do one of the following:

    • Restore Oracle ASM data redundancy by correcting the other server or disk failures. Then retry the DROP...FORCE command.

    • Direct Oracle ASM to rebalance the data off the bad disk:

      SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name  NOFORCE
      
  3. Wait until the Oracle ASM disks associated with the bad flash disk are dropped successfully. The storage server software automatically sends an alert when it is safe to replace the flash disk.

  4. Stop the services:

    CellCLI> ALTER CELL SHUTDOWN SERVICES ALL
    

    The preceding command checks if any disks are offline, in predictive failure status, or must be copied to its mirror. If Oracle ASM redundancy is intact, then the command takes the grid disks offline in Oracle ASM, and stops the services.

    The following error indicates that stopping the services might cause redundancy problems and force a disk group to dismount:

    Stopping the RS, CELLSRV, and MS services...
    The SHUTDOWN of ALL services was not successful.
    CELL-01548: Unable to shut down CELLSRV because disk group DATA, RECO may be
    forced to dismount due to reduced redundancy.
    Getting the state of CELLSRV services... running
    Getting the state of MS services... running
    Getting the state of RS services... running
    

    If this error occurs, then restore Oracle ASM disk group redundancy. Retry the command when the status is normal for all disks.

  5. Shut down the server. See "Shutting Down a Storage Server".

  6. Remove the bad flash disk, and replace it with a new flash disk.

  7. Power up the server. The services are started automatically. As part of the server startup, all grid disks are automatically online in Oracle ASM.

  8. Add the new flash disk to flash cache:

    CellCLI> DROP FLASHCACHE
    CellCLI> CREATE FLASHCACHE ALL
    
  9. Verify that all grid disks are online:

    CellCLI> LIST GRIDDISK ATTRIBUTES asmmodestatus
    

    Wait until asmmodestatus shows ONLINE or UNUSED for all grid disks.

The flash disks are added as follows:

  • If the flash disk is used for grid disks, then the grid disks are re-created on the new flash disk.

  • If these grid disks were part of an Oracle ASM disk group and DROP...FORCE was used in Step 2, then they are added back to the disk group, and the data is rebalanced on them, based on the disk group redundancy and the ASM_POWER_LIMIT parameter.

  • If DROP...NOFORCE was used in Step 2, then you must manually add the grid disks back to the Oracle ASM disk group.

12.11.9 About Write-Back Flash Cache

You cannot modify the write-back flash cache settings on Recovery Appliance.
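
To confirm the current flash cache mode on a storage server, you can run a read-only check such as the following:

CellCLI> LIST CELL ATTRIBUTES flashCacheMode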

12.12 Replacing a Disk Controller Battery Backup Unit

The disk controller battery backup unit (disk controller BBU) resides on a drive tray in the compute and storage servers. You can replace the disk controller BBU without downtime. The following procedures describe how to replace the disk controller BBU:

Note:

The procedures in this section do not apply to on-controller battery backup units. Replacement of those units requires a system shutdown, because the system must be opened to access the controller card.

12.12.1 Replacing a Disk Controller BBU on a Compute Server

The following procedure describes how to replace a disk controller BBU on a compute server:

  1. Drop the disk controller BBU for replacement:
    # /opt/oracle.cellos/compmon/exadata_mon_hw_asr.pl -drop_bbu_for_replacement
    
  2. Verify that the disk controller BBU has been dropped for replacement:
    # /opt/oracle.cellos/compmon/exadata_mon_hw_asr.pl -list_bbu_status
    
    BBU status: dropped for replacement.
    
  3. Replace the disk controller BBU by releasing the drive caddy and slowly pulling out the tray, and then sliding the replacement tray into the slot. The disk controller BBU is located in slot 7.
  4. Verify that the new disk controller BBU has been detected. It may take several minutes.
    # /opt/oracle.cellos/compmon/exadata_mon_hw_asr.pl -list_bbu_status
    
    BBU status: present
    
  5. Verify that the current logical disk drive cache policy uses writeback mode.
    # /opt/MegaRAID/MegaCli/MegaCli64 -ldinfo -lall -a0 | egrep         \
    'Default Cache|Current Cache'
    Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if
    Bad BBU
    Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if
    Bad BBU
    

    If the cache policy is not writeback, then go to step 6. Otherwise, go to step 7.

  6. Verify that the battery state is Operational. This step is required only when the cache policy output from step 5 is not writeback.
    # /opt/MegaRAID/MegaCli/MegaCli64 -adpbbucmd -getbbustatus -a0|grep Battery 
    BatteryType: iBBU08
    Battery State : Operational
    Battery Pack Missing : No
    Battery Replacement required : No 
    

    If the battery state is not Operational, then investigate and correct the problem.

  7. Perform battery checks as described in My Oracle Support Doc ID 1274318.1. If the checks return unexpected results, then refer to the note for additional information and instructions.
  8. (Optional) Use the exachk tool to verify the health of the system. See My Oracle Support Doc ID 1070954.1.

12.12.2 Replacing a Disk Controller BBU on a Storage Server

To replace a disk controller BBU on a storage server:

  1. Drop the disk controller BBU for replacement using the following command:
    # cellcli -e alter cell bbu drop for replacement
    
  2. Verify that the disk controller BBU has been dropped for replacement using the following command:
    # cellcli -e list cell attributes bbustatus
    
    BBU status: dropped for replacement.
    
  3. Replace the disk controller BBU by releasing the drive caddy and slowly pulling out the tray, and then sliding the replacement tray into the slot. The disk controller BBU is located in rear slot 1 of the server.
  4. Verify that the disk controller BBU battery state is operational.
    # cellcli -e list cell attributes bbustatus
    
    BBU status: normal
    
  5. Perform battery checks as described in My Oracle Support Doc ID 1274318.1. If the checks return unexpected results, then refer to the note for additional information and instructions.
  6. (Optional) Use the exachk tool to verify the health of the system. The tool is available in My Oracle Support Doc ID 1070954.1.

12.13 Using the Storage Server Rescue Procedure

Each storage server maintains a copy of its software on the CELLBOOT USB flash drive. Whenever the system configuration changes, the server updates this flash drive. You can use the flash drive to recover the server after a hardware replacement or a software failure. Restore the system when the system disks fail, the operating system has a corrupt file system, or the boot area is damaged. You can replace the disks, cards, CPU, memory, and so forth, and then recover the server. If you insert the flash drive into a different server, the rescue procedure re-creates the configuration of the old server on it.

If only one system disk fails, then use CellCLI commands to recover. In the rare event that both system disks fail simultaneously, use the rescue functionality provided on the storage server CELLBOOT USB flash drive.

This section contains the following topics:

12.13.1 First Steps Before Rescuing the Storage Server

Before rescuing a storage server, you must take steps to protect the data that is stored on it. Those steps depend on whether the system is set up with normal redundancy or high redundancy.

12.13.1.1 If the Server Has Normal Redundancy

If you are using normal redundancy, then the server has one mirror copy of the data. The data could be irrecoverably lost if that single mirror also fails during the rescue procedure.

Oracle recommends that you duplicate the mirror copy:

  1. Make a complete backup of the data in the mirror copy.

  2. Take the mirror copy server offline immediately, to prevent any new data changes to it before attempting a rescue.

This procedure ensures that all data residing on the grid disks on the failed server and its mirror copy is inaccessible during the rescue procedure.

The Oracle ASM disk repair timer has a default repair time of 3.6 hours. If you know that you cannot perform the rescue procedure within that time frame, then use the Oracle ASM rebalance procedure to rebalance the disks until you can do the rescue procedure.
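
The following is a minimal sketch of checking and extending the timer; the disk group name DATA and the value 8.5h are placeholders:

SQL> SELECT dg.name, a.value
     FROM v$asm_diskgroup dg, v$asm_attribute a
     WHERE dg.group_number = a.group_number
     AND a.name = 'disk_repair_time';

SQL> ALTER DISKGROUP DATA SET ATTRIBUTE 'disk_repair_time' = '8.5h';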

See Also:

Oracle Exadata Storage Server Software User's Guide for information about resetting the timer

12.13.1.2 If the Server Has High Redundancy

If the server has high redundancy disk groups, so that Oracle ASM has multiple mirror copies of all the grid disks on the failed server, then take the failed server offline. After Oracle ASM times out, it automatically drops the grid disks on the failed server and starts rebalancing the data using the mirror copies.

The default timeout is two hours. If the server rescue takes more than two hours, then you must re-create the grid disks on the rescued server in Oracle ASM.

12.13.2 About the Rescue Procedure

Note the following before using the rescue procedure:

  • The rescue procedure can rewrite some or all of the disks in the cell. If this happens, then you might lose all the content of those disks without the possibility of recovery. Ensure that you complete the appropriate preliminary steps before starting the rescue. See "If the Server Has Normal Redundancy" or "If the Server Has High Redundancy".

  • Use extreme caution when using this procedure, and pay attention to the prompts. Ideally, use the rescue procedure only with assistance from Oracle Support Services, and when you can afford to lose the data on some or all of the disks.

  • The rescue procedure does not destroy the contents of the data disks or the contents of the data partitions on the system disks, unless you explicitly choose to do so during the rescue procedure.

  • The rescue procedure restores the storage server software to the same release, including any patches that existed on the server during the last successful boot.

  • The rescue procedure does not restore these configuration settings:

    • Server configurations, such as alert configurations, SMTP information, administrator email address

    • ILOM configuration. However, ILOM configurations typically remain undamaged even when the server software fails.

  • The recovery procedure does restore these configuration settings:

  • The rescue procedure does not examine or reconstruct data disks or data partitions on the system disks. If there is data corruption on the grid disks, then do not use this rescue procedure. Instead, use the rescue procedures for Oracle Database and Oracle ASM.

After a successful rescue, you must reconfigure the server. If you want to preserve the data, then import the cell disks. Otherwise, you must create new cell disks and grid disks.

See Also:

Oracle Exadata Storage Server Software User's Guide for information on configuring cells, cell disks, and grid disks using the CellCLI utility

12.13.3 Rescuing a Server Using the CELLBOOT USB Flash Drive

Caution:

Follow the rescue procedure with care to avoid data loss.

To rescue a server using the CELLBOOT USB flash drive:

  1. Connect to the Oracle ILOM service processor (SP) of the rescued server. You can use either HTTPS or SSH.
  2. Start the server. As soon as you see the splash screen, press any key on the keyboard. The splash screen is visible for only 5 seconds.
  3. In the displayed list of boot options, select the last option, CELL_USB_BOOT_CELLBOOT_usb_in_rescue_mode, and press Enter.
  4. Select the rescue option, and proceed with the rescue.
  5. At the end of the first phase of the rescue, choose the option to enter the shell. Do not restart the system.
  6. Log in to the shell using the rescue root password.
  7. Use the reboot command from the shell.
  8. Press F8 as the server restarts and before the splash screen appears. Pressing F8 accesses the boot device selection menu.
  9. Select the RAID controller as the boot device. This causes the server to boot from the hard disks.

Note:

Additional options might be available that allow you to enter a rescue mode Linux login shell with limited functionality. Then you can log in to the shell as the root user with the password supplied by Oracle Support Services, and manually run additional diagnostics and repairs on the server. For complete details, contact your Oracle Support Services representative.

12.13.4 Reconfiguring the Rescued Storage Server

After a successful rescue, you must configure the server. If the data partitions were preserved, then the cell disks are imported automatically during the rescue procedure.

  1. For any replaced servers, re-create the cell disks and grid disks.
  2. Log in to the Oracle ASM instance, and set the disks to ONLINE using the following command for each disk group:
    SQL> ALTER DISKGROUP disk_group_name ONLINE DISKS IN FAILGROUP \
    cell_name WAIT; 
    
  3. Reconfigure the cell using the ALTER CELL command. The following example shows the most common parameters:
    CellCLI> ALTER CELL
    smtpServer='my_mail.example.com', -
    smtpFromAddr='john.doe@example.com', -
    smtpFromPwd=email_address_password, -
    smtpToAddr='jane.smith@example.com', -
    notificationPolicy='critical,warning,clear', -
    notificationMethod='mail,snmp'
    
  4. Re-create the I/O Resource Management (IORM) plan.
  5. Re-create the metric thresholds.

See Also:

Oracle Exadata Storage Server Software User's Guide for information about IORM plans and metric thresholds

12.13.5 Recreating a Damaged CELLBOOT USB Flash Drive

If the CELLBOOT USB flash drive is lost or damaged, then you can create another one.

To create a CELLBOOT flash drive:

  1. Log in to the server as the root user.
  2. Attach a new USB flash drive with a capacity of 1 to 8 GB.
  3. Remove any other USB flash drives from the system.
  4. Change directories:
    cd /opt/oracle.SupportTools
    
  5. Copy the server software to the flash drive:
    ./make_cellboot_usb -verbose -force