This chapter provides information about the Cluster metrics.
For each metric, it provides the following information:
Description
Metric table
The metric table can include some or all of the following: target version, default collection frequency, default warning threshold, default critical threshold, and alert text.
The metrics in this metric category provide an overview of the clusterware status for this cluster, how many nodes in this cluster have problems, and the Cluster Verification (CLUVFY) utility output for all the nodes of this cluster. Generally, the clusterware is up if the clusterware on at least one host is up.
This metric shows the CLUVFY output of clusterware for all nodes of this cluster.
Data Source
The following command is the data source for this metric, where node1, node2 is the node list for the cluster:
cluvfy comp crs -n node1, node2 ...
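As a rough illustration, the command above could be assembled from a node list as follows. This is only a sketch: the helper name `build_cluvfy_crs_command` is hypothetical, and it assumes the node list is passed to `-n` as a single comma-separated argument. How the agent actually invokes the utility is not documented here.

```python
from typing import List


def build_cluvfy_crs_command(nodes: List[str]) -> List[str]:
    """Return an argv list for the CRS component check (hypothetical helper)."""
    if not nodes:
        raise ValueError("at least one node name is required")
    # Assumption: node names are joined with commas into one -n argument.
    return ["cluvfy", "comp", "crs", "-n", ",".join(nodes)]


# Example with the placeholder node names used in this chapter.
print(" ".join(build_cluvfy_crs_command(["node1", "node2"])))
```

Returning an argument list rather than a single string keeps the sketch usable with `subprocess.run` without shell quoting concerns.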
User Action
Search for the Cluster Verification (CLUVFY) utility in the Oracle Clusterware Administration and Deployment Guide.
This metric shows the overall clusterware status for this cluster. The clusterware is up if the clusterware on at least one host is up.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11g, 12c | Every 5 Minutes | 2 | 0 | Clusterware has problems on the master agent host %CRS_output% |
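The rule that the clusterware is up if it is up on at least one host can be sketched as below. The function name and the per-host dictionary are assumptions made for illustration; the sketch does not model the numeric threshold values in the table.

```python
from typing import Mapping


def clusterware_is_up(host_status: Mapping[str, bool]) -> bool:
    """Return True if clusterware is up on at least one host.

    host_status maps a host name to True (up) or False (down).
    An empty mapping yields False, since no host is known to be up.
    """
    return any(host_status.values())


# Hypothetical two-node cluster: one node down, one node up.
print(clusterware_is_up({"node1": False, "node2": True}))
```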
Data Source
The following command is the data source for this metric, where node1, node2 is the node list for the cluster:
cluvfy comp crs -n node1, node2 ...
User Action
Search for the Cluster Verification (CLUVFY) utility in the Oracle Clusterware Administration and Deployment Guide.
The metrics in this metric category provide details about the Cluster Alert Log metrics.
This metric collects certain error messages in the CRS alert log at the cluster level.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | Every 5 Minutes | CRS-1601 | Not Defined | %clusterwareErrStack% See %alertLogName% for details. |
11gR2, 12c | Every 5 Minutes | CRS-(8011\|8013\|8014\|8015) | Not Defined | %clusterwareErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
This metric collects CRS-1607, 1802, 1803, 1804, and 1805 messages from the CRS alert log at the cluster level and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | Every 5 Minutes | CRS-180(2\|3\|4\|5) | CRS-1607 | %nodeErrStack% See %alertLogName% for details. |
11gR2, 12c | Every 5 Minutes | Not Defined | CRS-1607 | %nodeErrStack% See %alertLogName% for details. |
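To make the threshold columns concrete, the sketch below treats the 10gR2/11gR1 warning and critical thresholds from the table above as regular expressions and classifies a log line by the first pattern that matches. That the thresholds are evaluated this way is an assumption, and the function name and sample log lines are hypothetical.

```python
import re
from typing import Optional

# Patterns taken from the 10gR2/11gR1 row of the table above.
WARNING_PATTERN = re.compile(r"CRS-180(2|3|4|5)")
CRITICAL_PATTERN = re.compile(r"CRS-1607")


def classify_alert_line(line: str) -> Optional[str]:
    """Return 'critical', 'warning', or None for one alert log line."""
    # Critical is checked first so it wins if both patterns match.
    if CRITICAL_PATTERN.search(line):
        return "critical"
    if WARNING_PATTERN.search(line):
        return "warning"
    return None


# Hypothetical log lines; the message text is illustrative only.
for sample in ("CRS-1607: example message", "CRS-1803: example message"):
    print(sample, "->", classify_alert_line(sample))
```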
Note:
Do not modify the default warning and critical thresholds for this metric.
This metric collects CRS-1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1010, and 1011 messages from the CRS alert log at the cluster level and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | Every 5 Minutes | CRS-100(1\|2\|3\|4\|5\|7) | CRS-(1006\|1008\|1010\|1011) | %ocrErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
This metric collects CRS-1604, 1605, and 1606 messages from the CRS alert log at the cluster level and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | Every 5 Minutes | Not Defined | CRS-160(4\|5\|6) | %votingErrStack% See %alertLogName% for details. |
11gR2, 12c | Every 5 Minutes | Not Defined | CRS-160(4\|5\|6) | %votingErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
The metrics in this metric category provide information about the Quality of Service (QoS) events.
For a database to be managed by Oracle Database QoS Management, the database must be compliant.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
11gR2, 12c | - | Not Defined | NOT_COMPLIANT | Server pool %wlm_entity_name% has a violation. Please refer to the Grid Operations Manager log for details |
Oracle Database QoS Management detects memory pressure on a server in real time and redirects new sessions to other servers to prevent using all available memory on the stressed server.
This metric indicates that the database server is experiencing memory pressure.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
11gR2, 12c | - | RED | Not Defined | Server %wlm_server% is under elevated memory pressure and services on all instances on this server will be stopped |
This metric displays the reason for a change in the Oracle Database QoS Management state.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
11gR2, 12c | - | USER_DISABLED | EXCEPTION_DISABLED | QoSM service is disabled due to %wlm_qosm_state%. |
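The alert text in these tables contains tokens such as %wlm_qosm_state%. Assuming each token is replaced with the collected value of the same name when the alert is raised, the substitution could be sketched as follows. The function name is hypothetical and this is not the product's implementation.

```python
import re
from typing import Mapping

# A token is a run of word characters between two percent signs.
PLACEHOLDER = re.compile(r"%(\w+)%")


def render_alert_text(template: str, values: Mapping[str, str]) -> str:
    """Replace %name% tokens with values; unknown tokens are left as-is."""
    return PLACEHOLDER.sub(
        lambda match: values.get(match.group(1), match.group(0)),
        template,
    )


template = "QoSM service is disabled due to %wlm_qosm_state%."
print(render_alert_text(template, {"wlm_qosm_state": "USER_DISABLED"}))
```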
The metrics in this metric category provide information about the Cluster Resource State (CRS).
This is the CRS resource status change metric.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
11gR2, 12c | Every 24 Hours | COMPLETE_INTERMEDIATE\|PARTIALLY_UNKNOWN\|PARTIALLY_OFFLINE\|PARTIALLY_INTERMEDIATE | COMPLETE_UNKNOWN\|COMPLETE_OFFLINE\|ADD\|DOWN | %crs_entity_name% has %resource_status_alert_count% instances in %resource_status_alert_state% State %resource_status_additional_mesg% |
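The warning and critical thresholds in this table are lists of alternative state values separated by a vertical bar. Assuming each alternative is compared for an exact match against the reported state, the mapping from state to severity could look like this sketch. The function and constant names are hypothetical.

```python
from typing import Optional

# Threshold strings from the table above, alternatives separated by "|".
WARNING_STATES = (
    "COMPLETE_INTERMEDIATE|PARTIALLY_UNKNOWN|"
    "PARTIALLY_OFFLINE|PARTIALLY_INTERMEDIATE"
)
CRITICAL_STATES = "COMPLETE_UNKNOWN|COMPLETE_OFFLINE|ADD|DOWN"


def severity_for_state(state: str) -> Optional[str]:
    """Return 'critical', 'warning', or None for a resource state value."""
    if state in CRITICAL_STATES.split("|"):
        return "critical"
    if state in WARNING_STATES.split("|"):
        return "warning"
    return None


print(severity_for_state("PARTIALLY_OFFLINE"))
print(severity_for_state("COMPLETE_OFFLINE"))
```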