18 Running Recovery Appliance Checks

Recovery Appliance checks validate that the components of the appliance are in a stable and healthy state.

The checks are available through the RACLI utility and can be run together or individually. The Recovery Appliance component checks include:

  • ZDLRA Services - verifies whether the Recovery Appliance Services (RA Server, DB, CRS) are online.
  • Compute Server Alerts - checks the dbmcli alert history on the compute nodes for alerts with a severity higher than warning.
  • Storage Server Alerts - checks the dbmcli alert history on the storage cells for alerts with a severity higher than warning.
  • Active Incidents in the Database - checks the Recovery Appliance database for active incidents. This check can often be bypassed during patching by specifying '--ignore_incidents' in the patch appliance steps.
  • Invalid Objects in the Database - checks the Recovery Appliance database for invalid objects that need to be recompiled.
  • Consistency between the Deployed vs Installed RA Automation RPM - checks the Recovery Appliance to ensure that the deployed RPM and the installed RPM are consistent.
  • Exadata Image Version Consistency Across All the Hosts - checks the compute nodes and storage cells to ensure that only one (1) image version exists across all hosts.
  • Init Parameter Validation - checks the Recovery Appliance database to confirm that set init parameters are consistent for a Recovery Appliance configuration.
  • Export Bundle Availability - checks the Recovery Appliance to ensure that an export bundle has been taken successfully. In the event of a disaster or crash, the export bundle is used to rebuild the Recovery Appliance to a known working state. The export bundle must be copied to a safe system or location before the Recovery Appliance is rebuilt.
  • Oracle Password Status - checks to ensure the oracle password has not expired.
  • RASYS User Wallet Status - checks the validity of the rasys wallet. This is required for operations including patching, expansion and upgrade.

racli list check

Use the racli list check command to see the exact name and enabled status of each check.

  1. On the compute server, as a member of the raadmin group, list all checks.

    [adminra1@zdlra05 ~]# racli list check --all
    check_image_versions
    check_cell_alerts
    check_appliance_status
    check_compute_alerts
    check_init_parameter
    check_ra_prechecks
    check_active_incidents
    check_oracle_access
    check_invalid_objects
    check_ra_export
    check_ra_version
    [adminra1@zdlra05 ~]#
  2. List which checks are enabled.

    [adminra1@zdlra05 ~]# racli list check --status=enabled
    check_active_incidents
    check_appliance_status
    check_cell_alerts
    check_compute_alerts
    check_image_versions
    check_init_parameter
    check_invalid_objects
    check_ra_export
    check_ra_version
    [adminra1@zdlra05 ~]#
  3. List which checks are disabled.

    [adminra1@zdlra05 ~]# racli list check --status=disabled
    check_ra_prechecks
    [adminra1@zdlra05 ~]# racli list check --status=disabled --verbose
    check_ra_prechecks
    VERSION=1.0.0.0
    GROUP_NAME=DEV
    SCRIPT=/opt/oracle.RecoveryAppliance/bin/check_ra_prechecks.pl
    TYPE=system
    OPTS=''
    ORDER=15
    ENABLED=NO
    DB_USER=''
    [adminra1@zdlra05 ~]#
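
The verbose listing is a check name followed by KEY=VALUE lines, which makes it easy to load into a script. The following is a minimal sketch, assuming the output has the same layout as the transcript above; the function name parse_verbose_checks is hypothetical and not part of RACLI.

```python
def parse_verbose_checks(text):
    """Parse `racli list check ... --verbose` output into {check_name: {KEY: value}}.

    Assumes the layout shown in the transcript: a bare check name on its own
    line, followed by KEY=VALUE attribute lines for that check.
    """
    checks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and shell prompt lines such as "[adminra1@zdlra05 ~]#".
        if not line or line.startswith("["):
            continue
        if "=" in line:
            if current is None:
                continue  # attribute line with no preceding check name
            key, _, value = line.partition("=")
            # Empty values are printed as '' in the listing; drop the quotes.
            checks[current][key] = value.strip("'")
        else:
            # A line without "=" starts the attribute block of a new check.
            current = line
            checks[current] = {}
    return checks
```

Fed the captured output of the disabled, verbose listing, the function returns one dictionary entry per check, so a script can read attributes such as ENABLED or SCRIPT by name instead of relying on line positions.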

racli run check

Recovery Appliance checks can be run individually, as a comma-separated list of several checks, or all at once for every enabled check.

  1. On the compute server, as a member of the raadmin group, run one or more checks by name.

    [adminra1@zdlra05 ~]# racli run check --check_name=check_active_incidents,check_invalid_objects
    Wed Oct 10 13:53:07 2018: Start: racli run check --check_name=check_active_incidents,check_invalid_objects
    HOST: [nnnnnn01.oracle.com]
    
    Created log file scas10adm01.us.oracle.com:/opt/oracle.RecoveryAppliance/log/racli_run_check_20181010.1353.log
    Wed Oct 10 13:53:07 2018: CHECK: Active Incidents - PASS
    Wed Oct 10 13:53:09 2018: CHECK: Invalid Objects - PASS
    Wed Oct 10 13:53:09 2018: End: racli run check --check_name=check_active_incidents,check_invalid_objects
    HOST: [nnnnnn01.oracle.com]
    [adminra1@zdlra05 ~]#
  2. Run all checks that are enabled.

    [adminra1@zdlra05 ~]# racli run check --all
    Wed Oct 10 13:50:28 2018: Start: racli run check --all
    HOST: [nnnnnn01.oracle.com]
    
    Created log file scas10adm01.us.oracle.com:/opt/oracle.RecoveryAppliance/log/racli_run_check_20181010.1350.log
    
    Wed Oct 10 13:50:29 2018: CHECK: RA Services - PASS
    Wed Oct 10 13:50:32 2018: CHECK: Compute Node AlertHistory
    Wed Oct 10 13:50:32 2018: HOST: [nnnnnn01] - PASS
    Wed Oct 10 13:50:32 2018: HOST: [nnnnnn01] - PASS
    Wed Oct 10 13:50:43 2018: CHECK: Storage Cell AlertHistory
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm09] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm05] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm03] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm07] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm01] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm04] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm02] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm06] - PASS
    Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm08] - PASS
    Wed Oct 10 13:50:44 2018: CHECK: ZDLRA Version
    Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm02] - FAIL
    Wed Oct 10 13:50:44 2018:
    Wed Oct 10 13:50:44 2018: CAUSE:
    Wed Oct 10 13:50:44 2018: Unexpected ZDLRA version found.
    Wed Oct 10 13:50:44 2018: For more details, see log file:
    Wed Oct 10 13:50:44 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_ra_versions_20181010.1350.log
    Wed Oct 10 13:50:44 2018:
    Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm01] - FAIL
    Wed Oct 10 13:50:44 2018:
    Wed Oct 10 13:50:44 2018: CAUSE:
    Wed Oct 10 13:50:44 2018: Unexpected ZDLRA version found.
    Wed Oct 10 13:50:44 2018: For more details, see log file:
    Wed Oct 10 13:50:44 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_ra_versions_20181010.1350.log
    Wed Oct 10 13:50:44 2018:
    Wed Oct 10 13:50:53 2018: CHECK: Exadata Image Version - PASS
    Wed Oct 10 13:50:53 2018: CHECK: Active Incidents - PASS
    Wed Oct 10 13:50:56 2018: CHECK: Init Parameters - FAIL
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: CAUSE:
    Wed Oct 10 13:50:56 2018: Init Parameter Error found
    Wed Oct 10 13:50:56 2018: ZDLRA DB Init Parameter Errors:
    Wed Oct 10 13:50:56 2018: For more details, see log file:
    Wed Oct 10 13:50:56 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_init_params_20181010.1350.log
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: Parameter: _report_capture_cycle_time
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: Instance ID: 1
    Wed Oct 10 13:50:56 2018: Recomended Value: N/A
    Wed Oct 10 13:50:56 2018: Actual Value: 0
    Wed Oct 10 13:50:56 2018: Error Text: Init Parameters have non default value
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: Instance ID: 2
    Wed Oct 10 13:50:56 2018: Recomended Value: N/A
    Wed Oct 10 13:50:56 2018: Actual Value: 0
    Wed Oct 10 13:50:56 2018: Error Text: Init Parameters have non default value
    Wed Oct 10 13:50:56 2018:
    Wed Oct 10 13:50:56 2018: Please run dbms_ra_adm.update_init_param
    Wed Oct 10 13:50:56 2018: in SQL env and bounce database to make them
    Wed Oct 10 13:50:56 2018: validate.
    Wed Oct 10 13:50:57 2018: CHECK: Invalid Objects - PASS
    Wed Oct 10 13:50:58 2018: CHECK: Export Backup - PASS
    Wed Oct 10 13:50:58 2018: End: racli run check --all
    HOST: [nnnnnn01.oracle.com]
    [adminra1@zdlra05 ~]#
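
A full run produces a long transcript in which some checks report a single result on the CHECK line and others report one result per host. The following sketch condenses captured output into a per-check summary. It assumes the line formats match the transcripts above, which may differ between releases; the names summarize_run_check and failed_checks are hypothetical and not part of RACLI.

```python
import re

# Patterns assumed from the sample transcripts; other racli releases may differ.
CHECK_RE = re.compile(r"CHECK: (?P<name>.+?)(?: - (?P<result>PASS|FAIL))?\s*$")
HOST_RE = re.compile(r"HOST: \[(?P<host>[^\]]+)\] - (?P<result>PASS|FAIL)\s*$")


def summarize_run_check(text):
    """Return {check name: [(host or None, "PASS" | "FAIL"), ...]}."""
    results = {}
    current = None
    for line in text.splitlines():
        check = CHECK_RE.search(line)
        if check:
            current = check.group("name")
            results[current] = []
            if check.group("result"):
                # Appliance-wide check: the result is on the CHECK line itself.
                results[current].append((None, check.group("result")))
            continue
        host = HOST_RE.search(line)
        if host and current is not None:
            # Per-host result belonging to the most recent CHECK line.
            results[current].append((host.group("host"), host.group("result")))
    return results


def failed_checks(results):
    """List the checks with at least one FAIL, in order of appearance."""
    return [name for name, rows in results.items()
            if any(result == "FAIL" for _, result in rows)]
```

The trailing "HOST: [...]" line that closes each run carries no PASS or FAIL result, so it is ignored. For the run shown above, failed_checks would report the ZDLRA Version and Init Parameters checks; the details for each failure remain in the log files named in the transcript.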