18 Running Recovery Appliance Checks
Recovery Appliance checks validate that the components of the appliance are in a stable and healthy state.
The checks are available through the RACLI utility and can be run together or individually. The Recovery Appliance component checks include:
- ZDLRA Services - verifies whether the Recovery Appliance Services (RA Server, DB, CRS) are online.
- Compute Server Alerts - checks the compute nodes for dbmcli alert history entries with a severity higher than warning.
- Storage Server Alerts - checks the storage cells for dbmcli alert history entries with a severity higher than warning.
- Active Incidents in the Database - checks the Recovery Appliance database for active incidents. This check can often be bypassed during patching by specifying --ignore_incidents in the patch appliance steps.
- Invalid Objects in the Database - checks the Recovery Appliance database for invalid objects that need to be recompiled.
- Consistency between the Deployed vs Installed RA Automation RPM - checks the Recovery Appliance to ensure that the deployed RPM and the installed RPM are consistent.
- Exadata Image Version Consistency Across All the Hosts - checks the compute nodes and storage cells to ensure that only one (1) image version exists across all hosts.
- Init Parameter Validation - checks the Recovery Appliance database to confirm that the init parameters that have been set are consistent with a Recovery Appliance configuration.
- Export Bundle Availability - checks the Recovery Appliance to ensure that an export bundle has been taken successfully. In the event of a disaster or crash, the export bundle is used to rebuild the Recovery Appliance to a known working state. The export bundle must be copied to a safe system or location before the Recovery Appliance is rebuilt.
- Oracle Password Status - checks to ensure the oracle password has not expired.
- RASYS User Wallet Status - checks the validity of the rasys wallet. The wallet is required for operations including patching, expansion, and upgrade.
racli list check
Use the racli list check command to see the exact names of the checks and whether each one is enabled.
- On a compute server, as a member of the raadmin group, list all checks.

      [adminra1@zdlra05 ~]# racli list check --all
      check_image_versions
      check_cell_alerts
      check_appliance_status
      check_compute_alerts
      check_init_parameter
      check_ra_prechecks
      check_active_incidents
      check_oracle_access
      check_invalid_objects
      check_ra_export
      check_ra_version
      [adminra1@zdlra05 ~]#

- List which checks are enabled.

      [adminra1@zdlra05 ~]# racli list check --status=enabled
      check_active_incidents
      check_appliance_status
      check_cell_alerts
      check_compute_alerts
      check_image_versions
      check_init_parameter
      check_invalid_objects
      check_ra_export
      check_ra_version
      [adminra1@zdlra05 ~]#

- List which checks are disabled. Add --verbose to show the details of each check.

      [adminra1@zdlra05 ~]# racli list check --status=disabled
      check_ra_prechecks
      [adminra1@zdlra05 ~]# racli list check --status=disabled --verbose
      check_ra_prechecks
      VERSION=1.0.0.0
      GROUP_NAME=DEV
      SCRIPT=/opt/oracle.RecoveryAppliance/bin/check_ra_prechecks.pl
      TYPE=system
      OPTS=''
      ORDER=15
      ENABLED=NO
      DB_USER=''
      [adminra1@zdlra05 ~]#

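If the verbose listing needs to be inspected from a script, a parser along the following lines may help. This is only a sketch: it assumes the output consists of a bare check name followed by KEY=VALUE fields, as in the sample above, and the helper name parse_verbose_checks is hypothetical, not part of RACLI.

```python
import shlex


def parse_verbose_checks(text):
    """Parse verbose check listing text into {check_name: {KEY: value}}.

    Assumption: a token without '=' is a check name, and every following
    KEY=VALUE token belongs to that check. Quoted empty values such as
    OPTS='' become empty strings.
    """
    checks = {}
    current = None
    for token in shlex.split(text):
        if "=" in token:
            if current is None:
                continue  # field seen before any check name; skip it
            key, value = token.split("=", 1)
            checks[current][key] = value
        else:
            current = token
            checks[current] = {}
    return checks


# Sample text copied from the verbose listing shown above.
sample = """check_ra_prechecks VERSION=1.0.0.0 GROUP_NAME=DEV
SCRIPT=/opt/oracle.RecoveryAppliance/bin/check_ra_prechecks.pl TYPE=system
OPTS='' ORDER=15 ENABLED=NO DB_USER=''"""

info = parse_verbose_checks(sample)
print(info["check_ra_prechecks"]["ENABLED"])
```

Because the parser splits on whitespace rather than on line breaks, it does not depend on how the fields are wrapped in the terminal.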
racli run check
You can run one or more checks by name, or run all enabled checks at once.
- On a compute server, as a member of the raadmin group, run one or more checks by name. Separate multiple check names with commas.

      [adminra1@zdlra05 ~]# racli run check --check_name=check_active_incidents,check_invalid_objects
      Wed Oct 10 13:53:07 2018: Start: racli run check --check_name=check_active_incidents,check_invalid_objects HOST: [nnnnnn01.oracle.com]
      Created log file scas10adm01.us.oracle.com:/opt/oracle.RecoveryAppliance/log/racli_run_check_20181010.1353.log
      Wed Oct 10 13:53:07 2018: CHECK: Active Incidents - PASS
      Wed Oct 10 13:53:09 2018: CHECK: Invalid Objects - PASS
      Wed Oct 10 13:53:09 2018: End: racli run check --check_name=check_active_incidents,check_invalid_objects HOST: [nnnnnn01.oracle.com]
      [adminra1@zdlra05 ~]#

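When the check names come from a script rather than being typed by hand, the command line can be assembled as in the following sketch. It uses only the two forms shown in this section (--check_name with a comma-separated list, and --all). The function name build_run_check_command is hypothetical, and the sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_run_check_command(check_names=None):
    """Return the argument list for a 'racli run check' invocation.

    With no names, fall back to --all (all enabled checks). Names are
    joined with commas, as in the transcript in this section.
    """
    if not check_names:
        return ["racli", "run", "check", "--all"]
    names = [name.strip() for name in check_names]
    for name in names:
        # A comma or space inside a name would corrupt the list argument.
        if not name or "," in name or " " in name:
            raise ValueError(f"invalid check name: {name!r}")
    return ["racli", "run", "check", "--check_name=" + ",".join(names)]


cmd = build_run_check_command(["check_active_incidents", "check_invalid_objects"])
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on a compute server where racli is available to the calling user.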
- Run all checks that are enabled.

      [adminra1@zdlra05 ~]# racli run check --all
      Wed Oct 10 13:50:28 2018: Start: racli run check --all HOST: [nnnnnn01.oracle.com]
      Created log file scas10adm01.us.oracle.com:/opt/oracle.RecoveryAppliance/log/racli_run_check_20181010.1350.log
      Wed Oct 10 13:50:29 2018: CHECK: RA Services - PASS
      Wed Oct 10 13:50:32 2018: CHECK: Compute Node AlertHistory
      Wed Oct 10 13:50:32 2018: HOST: [nnnnnn01] - PASS
      Wed Oct 10 13:50:32 2018: HOST: [nnnnnn01] - PASS
      Wed Oct 10 13:50:43 2018: CHECK: Storage Cell AlertHistory
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm09] - PASS
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm05] - PASS
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm03] - PASS
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm07] - PASS
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm01] - PASS
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm04] - PASS
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm02] - PASS
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm06] - PASS
      Wed Oct 10 13:50:43 2018: HOST: [scyyyyyyyadm08] - PASS
      Wed Oct 10 13:50:44 2018: CHECK: ZDLRA Version
      Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm02] - FAIL
      Wed Oct 10 13:50:44 2018:
      Wed Oct 10 13:50:44 2018: CAUSE:
      Wed Oct 10 13:50:44 2018: Unexpected ZDLRA version found.
      Wed Oct 10 13:50:44 2018: For more details, see log file:
      Wed Oct 10 13:50:44 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_ra_versions_20181010.1350.log
      Wed Oct 10 13:50:44 2018:
      Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm01] - FAIL
      Wed Oct 10 13:50:44 2018:
      Wed Oct 10 13:50:44 2018: CAUSE:
      Wed Oct 10 13:50:44 2018: Unexpected ZDLRA version found.
      Wed Oct 10 13:50:44 2018: For more details, see log file:
      Wed Oct 10 13:50:44 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_ra_versions_20181010.1350.log
      Wed Oct 10 13:50:44 2018:
      Wed Oct 10 13:50:53 2018: CHECK: Exadata Image Version - PASS
      Wed Oct 10 13:50:53 2018: CHECK: Active Incidents - PASS
      Wed Oct 10 13:50:56 2018: CHECK: Init Parameters - FAIL
      Wed Oct 10 13:50:56 2018:
      Wed Oct 10 13:50:56 2018: CAUSE:
      Wed Oct 10 13:50:56 2018: Init Parameter Error found
      Wed Oct 10 13:50:56 2018: ZDLRA DB Init Parameter Errors:
      Wed Oct 10 13:50:56 2018: For more details, see log file:
      Wed Oct 10 13:50:56 2018: - /opt/oracle.RecoveryAppliance/log/racli_check_init_params_20181010.1350.log
      Wed Oct 10 13:50:56 2018:
      Wed Oct 10 13:50:56 2018: Parameter: _report_capture_cycle_time
      Wed Oct 10 13:50:56 2018:
      Wed Oct 10 13:50:56 2018: Instance ID: 1
      Wed Oct 10 13:50:56 2018: Recomended Value: N/A
      Wed Oct 10 13:50:56 2018: Actual Value: 0
      Wed Oct 10 13:50:56 2018: Error Text: Init Parameters have non default value
      Wed Oct 10 13:50:56 2018:
      Wed Oct 10 13:50:56 2018: Instance ID: 2
      Wed Oct 10 13:50:56 2018: Recomended Value: N/A
      Wed Oct 10 13:50:56 2018: Actual Value: 0
      Wed Oct 10 13:50:56 2018: Error Text: Init Parameters have non default value
      Wed Oct 10 13:50:56 2018:
      Wed Oct 10 13:50:56 2018: Please run dbms_ra_adm.update_init_param
      Wed Oct 10 13:50:56 2018: in SQL env and bounce database to make them
      Wed Oct 10 13:50:56 2018: validate.
      Wed Oct 10 13:50:57 2018: CHECK: Invalid Objects - PASS
      Wed Oct 10 13:50:58 2018: CHECK: Export Backup - PASS
      Wed Oct 10 13:50:58 2018: End: racli run check --all HOST: [nnnnnn01.oracle.com]
      [adminra1@zdlra05 ~]#
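In a long run, the failures are easy to miss among the PASS lines. The following sketch extracts only the failed checks from captured output. It assumes the line shapes seen in the run above: a timestamp prefix, then either "CHECK: name - RESULT" or a "CHECK: name" header followed by per-host "HOST: [name] - RESULT" lines. The function name failed_checks is hypothetical, and actual output may use different spacing.

```python
import re

# Timestamp prefix such as "Wed Oct 10 13:50:44 2018:" (assumed format).
TS = re.compile(r"^\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}:\s?(.*)$")
CHECK = re.compile(r"^CHECK:\s+(.*?)(?:\s+-\s+(PASS|FAIL))?\s*$")
HOST = re.compile(r"^HOST:\s+\[([^\]]+)\]\s+-\s+(PASS|FAIL)\s*$")


def failed_checks(output):
    """Return a list of (check_name, host_or_None) for every FAIL result."""
    failures = []
    current = None
    for line in output.splitlines():
        ts = TS.match(line.strip())
        if not ts:
            continue  # prompt lines, "Created log file ...", and so on
        message = ts.group(1).strip()
        check = CHECK.match(message)
        if check:
            current = check.group(1)
            if check.group(2) == "FAIL":
                failures.append((current, None))
            continue
        host = HOST.match(message)
        if host and current and host.group(2) == "FAIL":
            failures.append((current, host.group(1)))
    return failures


# Excerpt of the run shown above.
sample = """Wed Oct 10 13:50:29 2018: CHECK: RA Services - PASS
Wed Oct 10 13:50:44 2018: CHECK: ZDLRA Version
Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm02] - FAIL
Wed Oct 10 13:50:44 2018: HOST: [scyyyyyyyadm01] - FAIL
Wed Oct 10 13:50:56 2018: CHECK: Init Parameters - FAIL
Wed Oct 10 13:50:57 2018: CHECK: Invalid Objects - PASS"""

for name, host in failed_checks(sample):
    print(name, host or "(appliance-wide)")
```

For the run shown above, this reports the ZDLRA Version failures on two hosts and the Init Parameters failure. The log files named in the CAUSE blocks remain the place to look for details.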