4 Cluster
For each metric, this chapter provides the following information:
- Description
- Metric table
The metric table can include some or all of the following: target version, default collection frequency, default warning threshold, default critical threshold, and alert text.
Clusterware
The metrics in this metric category provide an overview of the clusterware status for this cluster, how many nodes in this cluster have problems, and the Cluster Verification (CLUVFY) utility output for all the nodes of this cluster. Generally, the clusterware is up if the clusterware on at least one host is up.
Cluster Verification Output
This metric shows the CLUVFY output of clusterware for all nodes of this cluster.
Data Source
The following command is the data source for this metric, where node1, node2 is the node list for the cluster:
cluvfy comp crs -n node1, node2 ...
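As a rough illustration of how this command might be assembled for a given node list, the following Python sketch builds the argument vector. The function name `build_cluvfy_command` and its `cluvfy_path` parameter are hypothetical, and passing the nodes as a single comma-separated argument without spaces is an assumption about how the utility expects the list.

```python
import shlex


def build_cluvfy_command(nodes, cluvfy_path="cluvfy"):
    """Return the argument list for `cluvfy comp crs` over the given nodes.

    `cluvfy_path` is a hypothetical parameter for the location of the utility.
    """
    if not nodes:
        raise ValueError("at least one node name is required")
    # Assumption: the node list is one comma-separated argument, no spaces.
    node_list = ",".join(node.strip() for node in nodes)
    return [cluvfy_path, "comp", "crs", "-n", node_list]


command = build_cluvfy_command(["node1", "node2"])
print(shlex.join(command))
```

On a host where Oracle Clusterware is installed, the resulting list could be passed to `subprocess.run` to capture the utility's output.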
User Action
Search for the Cluster Verification (CLUVFY) utility in the Oracle Clusterware Administration and Deployment Guide.
Clusterware Status
This metric shows the overall clusterware status for this cluster. The clusterware is up if the clusterware on at least one host is up.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11g, 12c | Every 5 Minutes | 2 | 0 | Clusterware has problems on the master agent host %CRS_output% |
Data Source
The following command is the data source for this metric, where node1, node2 is the node list for the cluster:
cluvfy comp crs -n node1, node2 ...
User Action
Search for the Cluster Verification (CLUVFY) utility in the Oracle Clusterware Administration and Deployment Guide.
Alert Log Metrics
The metrics in this metric category provide details about the Cluster Alert Log metrics.
- Alert Log Error: This metric group de-duplicates errors that recur over a period of time and raises a single alert for the same underlying issue. It is enabled by default.
- Clusterware Alert Log: This metric group is the older collection method and raises an alert for every occurrence. It is disabled by default.
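The difference between the two metric groups can be sketched in Python as follows. This is only an illustration of the idea, not the agent's actual logic: the function names, the one-hour window, and the sample events are all assumptions, since the source says only that recurring errors are de-duplicated "over a period of time".

```python
from datetime import datetime, timedelta


def deduplicate(events, window=timedelta(hours=1)):
    """Return one alert per error code per window.

    events: (timestamp, error_code) pairs in time order.
    window: assumed suppression period; the real period is not documented here.
    """
    last_alert = {}  # error_code -> time of the last alert raised for it
    alerts = []
    for timestamp, code in events:
        previous = last_alert.get(code)
        if previous is None or timestamp - previous >= window:
            alerts.append((timestamp, code))
            last_alert[code] = timestamp
    return alerts


def alert_on_every_occurrence(events):
    """Return one alert per event, with no suppression."""
    return list(events)


# Hypothetical sample events.
start = datetime(2024, 1, 1)
events = [
    (start, "CRS-1607"),
    (start + timedelta(minutes=10), "CRS-1607"),  # repeat inside the window
    (start + timedelta(minutes=20), "CRS-1604"),
    (start + timedelta(minutes=90), "CRS-1607"),  # repeat after the window
]
print(len(deduplicate(events)), len(alert_on_every_occurrence(events)))
```

With these sample events, the de-duplicating variant suppresses only the repeat that falls inside the window, while the every-occurrence variant raises an alert for each event.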
Note:
Both the Alert Log Error and Clusterware Alert Log metric groups read the same alert log file. Its path depends on the target version:
Target Version | Alert Log File |
---|---|
10gR2, 11gR1 | %OracleHome%/log/%NodeName%/alert%NodeName%.log |
11gR2 | %OracleHome%/log/%NodeName%/alert%NodeName%.log |
12c, 12cR2 | %AdrHome%/trace/alert.log |
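To make the path templates concrete, the following Python sketch expands the `%Name%` placeholders for a given target version. The names `ALERT_LOG_TEMPLATES` and `alert_log_path` and the sample directory values are hypothetical. Only the templates themselves come from the table.

```python
import re

# Templates as listed per target version.
ALERT_LOG_TEMPLATES = {
    "10gR2": "%OracleHome%/log/%NodeName%/alert%NodeName%.log",
    "11gR1": "%OracleHome%/log/%NodeName%/alert%NodeName%.log",
    "11gR2": "%OracleHome%/log/%NodeName%/alert%NodeName%.log",
    "12c": "%AdrHome%/trace/alert.log",
    "12cR2": "%AdrHome%/trace/alert.log",
}


def alert_log_path(version, values):
    """Expand every %Name% placeholder in the template for `version`.

    Raises KeyError if the version or a placeholder value is missing.
    """
    template = ALERT_LOG_TEMPLATES[version]
    return re.sub(r"%(\w+)%", lambda match: values[match.group(1)], template)


# Hypothetical values for illustration only.
print(alert_log_path("11gR2", {"OracleHome": "/u01/app/grid", "NodeName": "node1"}))
```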
Alert Log Error
The metrics in this metric category provide details about the Alert Log Error metrics.
Clusterware Service Alert Log Error
This metric collects certain error messages in the CRS alert log at the cluster level.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | Every 5 Minutes | CRS-1601 | Not Defined | %clusterwareErrStack% See %alertLogName% for details. |
11gR2, 12c | Every 5 Minutes | CRS-(8011\|8013\|8014\|8015) | Not Defined | %clusterwareErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
Node Configuration Alert Log Error
This metric collects CRS-1607, 1802, 1803, 1804, and 1805 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | Every 5 Minutes | CRS-180(2\|3\|4\|5) | CRS-1607 | %nodeErrStack% See %alertLogName% for details. |
11gR2, 12c | Every 5 Minutes | Not Defined | CRS-1607 | %nodeErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
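The thresholds for this metric are written as patterns over error codes. As a sketch of how such patterns could be applied to alert log lines, the Python example below uses the warning and critical values from the 10gR2, 11gR1 row. The `classify` function, the substring-style matching, and the sample log lines are assumptions made for illustration. The agent's actual matching rules are not described here.

```python
import re

# Threshold patterns from the 10gR2, 11gR1 row.
WARNING_PATTERN = re.compile(r"CRS-180(2|3|4|5)")
CRITICAL_PATTERN = re.compile(r"CRS-1607")


def classify(line):
    """Return 'critical', 'warning', or None for one alert log line.

    Assumption: a line matches when the pattern occurs anywhere in it,
    and the critical threshold is checked before the warning threshold.
    """
    if CRITICAL_PATTERN.search(line):
        return "critical"
    if WARNING_PATTERN.search(line):
        return "warning"
    return None


# Made-up log lines; real alert log messages have a different format.
for sample in ("CRS-1607: sample message", "CRS-1803: sample message", "CRS-1801: sample message"):
    print(sample, "->", classify(sample))
```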
OCR Alert Log Error
This metric collects CRS-1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1010, and 1011 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | Every 5 Minutes | CRS-100(1\|2\|3\|4\|5\|7) | CRS-(1006\|1008\|1010\|1011) | %ocrErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
Voting Disk Alert Log Error
This metric collects CRS-1604, 1605, and 1606 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | Every 5 Minutes | Not Defined | CRS-160(4\|5\|6) | %votingErrStack% See %alertLogName% for details. |
11gR2, 12c | Every 5 Minutes | Not Defined | CRS-160(4\|5\|6) | %votingErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
Clusterware Alert Log Metric
The metrics in this metric category provide details about the Cluster Alert Log metrics.
Clusterware Service Alert Log Error
This metric collects certain error messages in the CRS alert log at the cluster level.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | - | CRS-1601 | Not Defined | %clusterwareErrStack% See %alertLogName% for details. |
11gR2, 12c | - | CRS-(8011\|8013\|8014\|8015) | Not Defined | %clusterwareErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
Node Configuration Alert Log Error
This metric collects CRS-1607, 1802, 1803, 1804, and 1805 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | - | CRS-180(2\|3\|4\|5) | CRS-1607 | %nodeErrStack% See %alertLogName% for details. |
11gR2, 12c | - | Not Defined | CRS-1607 | %nodeErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
OCR Alert Log Error
This metric collects CRS-1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1010, and 1011 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | - | CRS-100(1\|2\|3\|4\|5\|7) | CRS-(1006\|1008\|1010\|1011) | %ocrErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
Voting Disk Alert Log Error
This metric collects CRS-1604, 1605, and 1606 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
10gR2, 11gR1 | - | Not Defined | CRS-160(4\|5\|6) | %votingErrStack% See %alertLogName% for details. |
11gR2, 12c | - | Not Defined | CRS-160(4\|5\|6) | %votingErrStack% See %alertLogName% for details. |
Note:
Do not modify the default warning and critical thresholds for this metric.
QoS Events
The metrics in this metric category provide information about the Quality of Service (QoS) events.
Compliance State
For a database to be managed by Oracle Database QoS Management, the database must be compliant.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
11gR2, 12c | - | Not Defined | NOT_COMPLIANT | Server pool %wlm_entity_name% has a violation. Refer to the Grid Operations Manager log for details. |
Memory Pressure Analysis Risk State
Oracle Database QoS Management detects memory pressure on a server in real time and redirects new sessions to other servers to prevent using all available memory on the stressed server.
This metric indicates that the database server is experiencing memory pressure.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
11gR2, 12c | - | RED | Not Defined | Server %wlm_server% is under elevated memory pressure and services on all instances on this server will be stopped. |
QoSM State Change
This metric displays the reason for a change in the Oracle Database QoS Management state.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
11gR2, 12c | - | USER_DISABLED | EXCEPTION_DISABLED | QoSM service is disabled due to %wlm_qosm_state%. |
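As an illustration of how the threshold values and alert text for this metric fit together, the following Python sketch maps a state to a severity and fills in the alert text. The names `QOSM_SEVERITY` and `evaluate_qosm_state` are hypothetical, and `ENABLED` is used only as an example of a state that matches neither threshold.

```python
# Warning and critical threshold values for the QoSM State Change metric.
QOSM_SEVERITY = {
    "USER_DISABLED": "warning",
    "EXCEPTION_DISABLED": "critical",
}

ALERT_TEXT = "QoSM service is disabled due to %wlm_qosm_state%."


def evaluate_qosm_state(state):
    """Return (severity, alert text) for a state, or None if no threshold matches."""
    severity = QOSM_SEVERITY.get(state)
    if severity is None:
        return None
    return severity, ALERT_TEXT.replace("%wlm_qosm_state%", state)


print(evaluate_qosm_state("EXCEPTION_DISABLED"))
print(evaluate_qosm_state("ENABLED"))  # hypothetical state with no alert
```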
Resource State
The metrics in this metric category provide information about the Cluster Resource State (CRS).
State Change
This metric reports changes in the status of a CRS resource.
Target Version | Evaluation and Collection Frequency | Default Warning Threshold | Default Critical Threshold | Alert Text |
---|---|---|---|---|
11gR2, 12c | Every 24 Hours | COMPLETE_INTERMEDIATE\|PARTIALLY_UNKNOWN\|PARTIALLY_OFFLINE\|PARTIALLY_INTERMEDIATE | COMPLETE_UNKNOWN\|COMPLETE_OFFLINE\|ADD\|DOWN | %crs_entity_name% has %resource_status_alert_count% instances in %resource_status_alert_state% State %resource_status_additional_mesg% |