4.1 Understanding Cluster Health Monitor Services
Cluster Health Monitor uses the system monitor (osysmond) and cluster logger (ologgerd) services to collect diagnostic data.
About the System Monitor Service
The system monitor service (osysmond) is a real-time monitoring and operating system metric collection service that runs on each cluster node. The system monitor service is managed as a High Availability Services (HAS) resource. It forwards the collected metrics to the cluster logger service, ologgerd. The cluster logger service stores the data in the Oracle Grid Infrastructure Management Repository database.
In addition, osysmond persists the collected operating system metrics under a directory in ORACLE_BASE.
The metric repository is managed automatically on the local file system. You can change the location and size of the repository.
- Nodeview samples are continuously written to the repository as JSON records.
- Historical data is automatically archived into hourly zip files.
- Archived files are automatically purged once the retention limit is reached (default: 200 MB).
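The purge behavior described above can be illustrated with a minimal sketch. This is not how osysmond is implemented. It assumes that the retention limit applies to the total size of the archived zip files and that the oldest archives are removed first. The function name `purge_archives` and the directory layout are hypothetical.

```python
from pathlib import Path

# Default retention limit stated in the text: 200 MB.
RETENTION_LIMIT_BYTES = 200 * 1024 * 1024


def purge_archives(archive_dir, limit_bytes=RETENTION_LIMIT_BYTES):
    """Delete the oldest .zip archives until the total size fits the limit.

    Assumption: "oldest" is decided by file modification time.
    Returns the list of deleted paths, oldest first.
    """
    # Sort archives from oldest to newest.
    archives = sorted(Path(archive_dir).glob("*.zip"),
                      key=lambda p: p.stat().st_mtime)
    total = sum(p.stat().st_size for p in archives)

    deleted = []
    for path in archives:
        if total <= limit_bytes:
            break  # Repository is within the limit; stop purging.
        total -= path.stat().st_size
        path.unlink()
        deleted.append(path)
    return deleted
```

For example, calling `purge_archives("/tmp/chm_archive_copy")` on a scratch copy of an archive directory (a hypothetical path) removes files only when the directory exceeds 200 MB.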
About the Cluster Logger Service
The cluster logger service (ologgerd) is responsible for preserving the data collected by the system monitor service (osysmond) in the Oracle Grid Infrastructure Management Repository database. In a cluster, there is one cluster logger service (ologgerd) per 32 nodes. More logger services are spawned for every additional 32 nodes. The additional nodes can be any combination of Hub Nodes and Leaf Nodes. Oracle Clusterware relocates and starts the service on a different node if:
- The logger service fails and is not able to come up after a fixed number of retries
- The node where the cluster logger service is running is down
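The rule of one logger service per 32 nodes can be expressed as a short calculation. The sketch below assumes that a partial group of fewer than 32 additional nodes also receives its own logger service. The text does not state this explicitly, so treat the rounding as an assumption. The function name `logger_service_count` is hypothetical.

```python
# One cluster logger service per 32 nodes, as stated in the text.
NODES_PER_LOGGER = 32


def logger_service_count(node_count):
    """Estimate the number of ologgerd services for a cluster.

    Assumption: a partial group of nodes (fewer than 32) still
    gets its own logger service, so the result is rounded up.
    """
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    # Integer ceiling division.
    return (node_count + NODES_PER_LOGGER - 1) // NODES_PER_LOGGER
```

Under this assumption, a 32-node cluster runs one logger service and a 33-node cluster runs two.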