4.1 Understanding Cluster Health Monitor Services

Cluster Health Monitor uses the system monitor (osysmond) and cluster logger (ologgerd) services to collect diagnostic data.

About the System Monitor Service

The system monitor service (osysmond) is a real-time monitoring and operating system metric collection service that runs on each cluster node. The system monitor service is managed as a High Availability Services (HAS) resource. The system monitor service forwards the collected metrics to the cluster logger service, ologgerd. The cluster logger service stores the data in the Oracle Grid Infrastructure Management Repository database.

In addition, osysmond persists the collected operating system metrics under a directory in ORACLE_BASE.

The metric repository is managed automatically on the local file system. You can change the location and size of the repository.

  • Nodeview samples are continuously written to the repository as JSON records
  • Historical data is automatically archived into hourly zip files
  • Archived files are automatically purged once the retention limit is reached (default: 200 MB)
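
The purge behavior described above can be pictured with a minimal sketch. This is not the osysmond implementation: the function name purge_archives, the oldest-first deletion order, and the reading of 200 MB as 200 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: "200 MB" is interpreted here as 200 * 1024 * 1024 bytes.
DEFAULT_LIMIT_BYTES = 200 * 1024 * 1024


def purge_archives(archive_dir: Path, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> list[Path]:
    """Delete the oldest .zip archives until their total size fits the limit.

    Hypothetical helper: returns the deleted paths, oldest first.
    """
    # Order archives from oldest to newest by modification time.
    archives = sorted(archive_dir.glob("*.zip"), key=lambda p: p.stat().st_mtime)
    total = sum(p.stat().st_size for p in archives)

    deleted = []
    for path in archives:
        if total <= limit_bytes:
            break  # the repository is back within the retention limit
        total -= path.stat().st_size
        path.unlink()
        deleted.append(path)
    return deleted
```

Calling purge_archives(Path("/some/archive/dir")) on a directory of hourly zip files would remove only as many of the oldest files as needed to get back under the limit. The directory path here is a placeholder, not the actual repository location.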

About the Cluster Logger Service

The cluster logger service (ologgerd) is responsible for preserving the data collected by the system monitor service (osysmond) in the Oracle Grid Infrastructure Management Repository database. In a cluster, there is one cluster logger service (ologgerd) per 32 nodes. An additional logger service is spawned for every additional 32 nodes. The additional nodes can be any combination of Hub Nodes and Leaf Nodes. Oracle Clusterware relocates and starts the service on a different node if:

  • The logger service fails and cannot start after a fixed number of retries

  • The node where the cluster logger service is running is down
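
The sizing rule and the relocation conditions above can be written out as a short sketch. It assumes that "one logger per 32 nodes" means rounding up, so that a 33-node cluster runs two logger services. The function names and the max_retries parameter are hypothetical, because the text does not state the actual retry count.

```python
import math

NODES_PER_LOGGER = 32  # from the text: one ologgerd per 32 nodes


def logger_count(total_nodes: int) -> int:
    """Number of logger services for a cluster (Hub and Leaf Nodes combined).

    Assumption: any partial group of 32 nodes gets its own logger service.
    """
    if total_nodes < 1:
        raise ValueError("a cluster has at least one node")
    return math.ceil(total_nodes / NODES_PER_LOGGER)


def should_relocate(node_up: bool, failed_restarts: int, max_retries: int) -> bool:
    """Return True if either relocation condition from the list above holds.

    max_retries is a hypothetical stand-in for the fixed retry count.
    """
    return (not node_up) or failed_restarts >= max_retries
```

Under these assumptions, logger_count(32) gives 1 and logger_count(33) gives 2, and should_relocate reports True as soon as the hosting node is down or the retries are exhausted.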