2 Cluster

The Oracle RAC database metrics provide the following information for each metric:

Description
Metric summary. The metric summary can include some or all of the following: target version, evaluation frequency, collection frequency, upload frequency, operator, default warning threshold, default critical threshold, consecutive number of occurrences preceding notification, and alert text.
Multiple Thresholds (where applicable)
Data source
User action

2.1 Clusterware

The metrics in this category provide an overview of the clusterware status for this cluster, how many nodes in this cluster have problems, and the CLUVFY utility output for all the nodes of this cluster. Generally, the clusterware is up if the clusterware on at least one host is up.

2.1.1 Cluster Verification Output

This metric shows the CLUVFY output of clusterware for all nodes of this cluster.

Data Source

The data comes from the output of the following CLUVFY command:

cluvfy comp crs -n node1, node2 ...

where node1, node2 ... is the node list for the cluster.
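As an illustration of how the node list feeds into the command above, the following minimal Python sketch assembles the argument list. The function name `build_cluvfy_command` is hypothetical, and the sketch assumes the node list is passed to `-n` as a single comma-separated argument. It only builds the command and does not run it.

```python
import shlex


def build_cluvfy_command(nodes):
    """Return the argument list for a CLUVFY clusterware check on the given nodes."""
    if not nodes:
        raise ValueError("at least one node name is required")
    # Assumption: -n takes the node list as one comma-separated argument.
    node_list = ",".join(name.strip() for name in nodes)
    return ["cluvfy", "comp", "crs", "-n", node_list]


# Hypothetical node names, matching the placeholders used in this section.
command = build_cluvfy_command(["node1", "node2"])
print(shlex.join(command))
```

Keeping the command as a list of arguments rather than a single string avoids quoting problems if it is later handed to a process launcher.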

User Action

For information about the CLUVFY utility, see the Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for 10g Release 2.

2.1.2 Clusterware Status

This metric shows the overall clusterware status for this cluster. The clusterware is up if the clusterware on at least one host is up.

Metric Summary

The following table shows how often the metric's value is collected.

Table 2-1 Metric Summary Table

Target Version	Evaluation and Collection Frequency	Upload Frequency	Operator	Default Warning Threshold	Default Critical Threshold	Consecutive Number of Occurrences Preceding Notification	Alert Text
10.2.0.0	Every 5 Minutes	After Every Sample	=	Not Defined	1	1	Clusterware has problems on all hosts of this cluster. %CRS_output%

Note: Although the warning threshold by default is 0, you can change this value to represent how many nodes should have problems before an alert is triggered.

Data Source

The data comes from the output of the following CLUVFY command:

cluvfy comp crs -n node1, node2 ...

User Action

For information about the CLUVFY utility, see the Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for 10g Release 2.

2.1.3 Node(s) with Clusterware Problem

This metric shows how many nodes have clusterware problems.

Data Source

The data comes from the output of the following CLUVFY command:

cluvfy comp crs -n node1, node2 ...

where node1, node2 ... is the node list for the cluster.

Metric Summary

The following table shows how often the metric's value is collected.

Table 2-2 Metric Summary Table

Target Version	Evaluation and Collection Frequency	Upload Frequency	Operator	Default Warning Threshold	Default Critical Threshold	Consecutive Number of Occurrences Preceding Notification	Alert Text
All Versions	Every 5 Minutes	After Every Sample	>	0	Not Defined	1	There are %CRS_failed_node_count% host(s) with Clusterware problems. %CRS_output%

Note: Although the default warning threshold is 0, you can change this value to specify how many nodes must have problems before an alert is triggered.
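The threshold comparison in Table 2-2 can be sketched as follows. This is an illustrative Python sketch, not Enterprise Manager code: the function name `evaluate_failed_nodes` and its parameters are hypothetical, and the alert text simply fills in the two placeholders shown in the table.

```python
def evaluate_failed_nodes(failed_node_count, crs_output="", warning_threshold=0):
    """Return the warning alert text, or None when the threshold is not exceeded.

    Mirrors the table row: operator '>' against a default warning threshold of 0.
    """
    if failed_node_count > warning_threshold:
        return (
            f"There are {failed_node_count} host(s) with Clusterware problems. "
            f"{crs_output}"
        ).strip()
    return None


# With the default threshold, a single failing node raises a warning.
print(evaluate_failed_nodes(1))
# With the threshold raised to 1, a single failing node does not.
print(evaluate_failed_nodes(1, warning_threshold=1))
```

Raising the threshold, as the note describes, means an alert is produced only when more nodes than that value report problems.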

User Action

For information about the CLUVFY utility, see the Oracle Clusterware and Oracle Real Application Clusters Administration and Deployment Guide for 10g Release 2.

2.2 Clusterware Alert Log

The metrics in this category report on the contents of the CRS alert log for the cluster.

2.2.1 Alert Log Name

This metric shows the name and full path of the CRS alert log.

This metric appears in Enterprise Manager Grid Control 10.2.

Metric Summary

The following table shows how often the metric's value is collected.

Target Version	Collection Frequency
CRS Version 10.2	Every 5 Minutes

2.2.2 Clusterware Service Alert Log Error

This metric collects certain error messages in the CRS alert log at the cluster level.

Metric Summary

The following table shows how often the metric's value is collected and compared against the default thresholds. The 'Consecutive Number of Occurrences Preceding Notification' column indicates the consecutive number of times the comparison against thresholds should hold TRUE before an alert is generated.

Table 2-3 Metric Summary Table

Target Version	Evaluation and Collection Frequency	Upload Frequency	Operator	Default Warning Threshold	Default Critical Threshold	Consecutive Number of Occurrences Preceding Notification	Alert Text
All Versions	Every 5 Minutes	After Every Sample	CONTAINS	Not Defined	CRS-	1*	%crsErrStack% See %alertLogName for details.

* After an alert is triggered for this metric, you must manually clear it.
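To make the CONTAINS operator and the 'Consecutive Number of Occurrences Preceding Notification' column more concrete, here is a minimal Python sketch. It assumes CONTAINS is a plain substring test, and the names `contains_threshold` and `ConsecutiveTrigger` are hypothetical. The example uses a required count of 3 only to show the counting behavior; the table above lists 1 for this metric.

```python
def contains_threshold(message, threshold):
    """Assumed CONTAINS semantics: the threshold string occurs in the message."""
    return threshold in message


class ConsecutiveTrigger:
    """Report an alert only after `required` consecutive threshold violations."""

    def __init__(self, required):
        self.required = required
        self.streak = 0

    def observe(self, violated):
        # A sample that does not violate the threshold resets the streak.
        self.streak = self.streak + 1 if violated else 0
        return self.streak >= self.required


trigger = ConsecutiveTrigger(required=3)
# Hypothetical samples: True means the comparison against the threshold held.
for violated in [True, True, False, True, True, True]:
    print(trigger.observe(violated))
```

With a required count of 1, as in the table, the first sample that satisfies the comparison is enough to generate the alert.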

Multiple Thresholds

For this metric, you can set different warning and critical threshold values for each "Time/Line Number" object. If warning or critical threshold values are currently set for any "Time/Line Number" object, you can view these thresholds on the Metric Detail page for this metric.

To specify or change warning or critical threshold values for each "Time/Line Number" object, use the Edit Thresholds page.

2.2.3 Node Configuration Alert Log Error

This metric collects CRS-1607, CRS-1802, CRS-1803, CRS-1804, and CRS-1805 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.

Metric Summary

This metric appears in version 10.2 of Enterprise Manager Grid Control.

Table 2-4 Metric Summary Table

Target Version	Evaluation and Collection Frequency	Upload Frequency	Operator	Default Warning Threshold	Default Critical Threshold	Consecutive Number of Occurrences Preceding Notification	Alert Text
All Versions	Every 5 Minutes	After Every Sample	MATCH	CRS-180(2|3|4|5)	CRS-1607	1*	%nodeErrStack% See %alertLogName for details.

* After an alert is triggered for this metric, you must manually clear it.
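The way the warning and critical patterns in Table 2-4 separate error codes can be sketched in Python. This sketch assumes the MATCH operator behaves like a regular-expression search, which this section does not state explicitly. The function name `classify` and the sample messages are hypothetical.

```python
import re

# Patterns copied from the Default Warning and Default Critical Threshold columns.
WARNING_PATTERN = re.compile(r"CRS-180(2|3|4|5)")
CRITICAL_PATTERN = re.compile(r"CRS-1607")


def classify(message):
    """Return 'critical', 'warning', or None for one alert log message."""
    # Check the critical pattern first so the higher severity wins.
    if CRITICAL_PATTERN.search(message):
        return "critical"
    if WARNING_PATTERN.search(message):
        return "warning"
    return None


# Hypothetical messages; real alert log text will differ.
for sample in ["CRS-1607 sample text", "CRS-1803 sample text", "CRS-1801 sample text"]:
    print(sample, "->", classify(sample))
```

The same approach applies to the other alert log metrics in this category by substituting the patterns from their summary tables.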

Multiple Thresholds

For this metric, you can set different warning and critical threshold values for each "Time/Line Number" object. If warning or critical threshold values are currently set for any "Time/Line Number" object, these thresholds can be viewed on the Metric Detail page for this metric.

To specify or change warning or critical threshold values for each "Time/Line Number" object, use the Edit Thresholds page. See Editing Thresholds for information on accessing the Edit Thresholds page.

2.2.4 OCR Alert Log Error

This metric collects CRS-1001, CRS-1002, CRS-1003, CRS-1004, CRS-1005, CRS-1006, CRS-1007, CRS-1008, CRS-1010, and CRS-1011 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.

Metric Summary

This metric appears in version 10.2 of Enterprise Manager Grid Control.

Table 2-5 Metric Summary Table

Target Version	Evaluation and Collection Frequency	Upload Frequency	Operator	Default Warning Threshold	Default Critical Threshold	Consecutive Number of Occurrences Preceding Notification	Alert Text
All Versions	Every 5 Minutes	After Every Sample	MATCH	CRS-100(1|2|3|4|5|7)	CRS-(1006|1008|1010|1011)	1*	%ocrErrStack% See %alertLogName for details.

* After an alert is triggered for this metric, you must manually clear it.

Multiple Thresholds

For this metric, you can set different warning and critical threshold values for each "Time/Line Number" object. If warning or critical threshold values are currently set for any "Time/Line Number" object, these thresholds can be viewed on the Metric Detail page for this metric.

2.2.5 Voting Disk Alert Log Error

This metric collects CRS-1604, CRS-1605, and CRS-1606 messages from the CRS alert log at the cluster level, and issues alerts based on the error code.

Metric Summary

This metric appears in version 10.2 of Enterprise Manager Grid Control.

Table 2-6 Metric Summary Table

Target Version	Evaluation and Collection Frequency	Upload Frequency	Operator	Default Warning Threshold	Default Critical Threshold	Consecutive Number of Occurrences Preceding Notification	Alert Text
All Versions	Every 5 Minutes	After Every Sample	MATCH	Not Defined	CRS-160(4|5|6)	1*	%votingErrStack% See %alertLogName for details.

* After an alert is triggered for this metric, you must manually clear it.

Multiple Thresholds

For this metric, you can set different warning and critical threshold values for each "Time/Line Number" object. If warning or critical threshold values are currently set for any "Time/Line Number" object, these thresholds can be viewed on the Metric Detail page for this metric.

2.3 Response

This metric category contains the metrics that represent the status of the cluster; that is, whether it is up or down. As long as one of the member hosts is up, the cluster is up.

2.3.1 Status

This metric indicates the overall status of the hosts in the cluster. When all the hosts in the cluster are down, the cluster is considered unreachable.

Metric Summary

The following table shows how often the metric's value is collected and compared against the default thresholds. This metric is evaluated every minute on the OMS side to check if all the members are down.

Table 2-7 Metric Summary Table

Target Version	Evaluation and Collection Frequency	Upload Frequency	Operator	Default Warning Threshold	Default Critical Threshold	Consecutive Number of Occurrences Preceding Notification	Alert Text
All Versions	Every Minute	After Every Sample	=	Not Defined	0	1	Target is down -- all members are down.

Data Source

The calculation is based on the status of each member host. As long as one host is up, the cluster is up.
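The calculation described above can be sketched in a few lines of Python. The function name `cluster_status` is hypothetical, and the numeric encoding is an assumption: 0 for down follows from the critical threshold in Table 2-7, while 1 for up is chosen here for illustration.

```python
def cluster_status(host_is_up):
    """Return 1 if at least one member host is up, otherwise 0.

    host_is_up maps each member host name to True (up) or False (down).
    """
    return 1 if any(host_is_up.values()) else 0


# Hypothetical member hosts.
print(cluster_status({"node1": False, "node2": True}))
print(cluster_status({"node1": False, "node2": False}))
```

Only the second call, where every member host is down, would meet the critical threshold and produce the "Target is down" alert.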

User Action

Check whether the network is down or whether all the hosts in the cluster are shut down.