1.2 Components of Autonomous Health Framework

This section describes the diagnostic components that are part of Oracle Autonomous Health Framework.

1.2.1 Introduction to Oracle Autonomous Health Framework Configuration Audit Tools

Oracle Orachk and Oracle Exachk provide a lightweight and non-intrusive health check framework for the Oracle stack of software and hardware components.

Oracle Orachk and Oracle Exachk:

  • Automate risk identification and proactive notification before your business is impacted
  • Run health checks based on critical and recurring problems
  • Present high-level reports about your system health risks and vulnerabilities to known issues
  • Enable you to drill down into specific problems and understand their resolutions
  • Enable you to schedule recurring health checks at regular intervals
  • Send email notifications and diff reports while running in daemon mode
  • Integrate the findings into Oracle Health Check Collections Manager and other tools of your choice
  • Run in your environment with no need to send anything to Oracle

You have access to Oracle Orachk and Oracle Exachk as a value add-on to your existing support contract. There is no additional fee or license required to run Oracle Orachk and Oracle Exachk.

Use Oracle Exachk for Oracle Engineered Systems except for Oracle Database Appliance. For all other systems, use Oracle Orachk.

Run health checks for Oracle products using the command-line options.

1.2.2 Introduction to Oracle Trace File Analyzer

Oracle Trace File Analyzer is a utility for targeted diagnostic collection. It simplifies gathering diagnostic data for Oracle Clusterware, Oracle Grid Infrastructure, and Oracle Real Application Clusters (Oracle RAC) systems, as well as for single-instance, non-clustered databases.

Enabled by default, Oracle Trace File Analyzer:

  • Provides comprehensive first failure diagnostics collection
  • Efficiently collects, packages, and transfers diagnostic data to Oracle Support
  • Reduces round trips between customers and Oracle

Oracle Trace File Analyzer reduces the time required to obtain the correct diagnostic data, which in turn saves your business money.

For more information, see Oracle Autonomous Health Framework Checks and Diagnostics User's Guide.

New Attention Log for Efficient Critical Issue Resolution

Diagnosability of database issues is enhanced through a new attention log and through classification of the information written to database trace files. The attention log is written in a structured format (XML or JSON) that is much easier to process and interpret, and it contains only information that requires attention from an administrator. Trace files now contain information that makes trace messages much easier to classify, for example by security and sensitivity.

Enhanced diagnosability features simplify database administration and improve data security.

For more information, see Attention Log.

1.2.3 Introduction to AHF Insights

AHF Insights provides a bird's-eye view of the entire system with the ability to drill down further for root cause analysis.

Note:

Starting in AHF 23.8, AHF Insights no longer depends on a CDN to load plotly.js, so customers can use AHF Insights in restrictive environments.

Previously, results from different AHF components were not available in a single dashboard, which made them challenging to combine and correlate. To address this, AHF Insights provides a web-based graphical user interface for all diagnostic data collectors and analyzers that are part of AHF Kit. The interface does not require a web server to host the web pages.

AHF performs a contextual diagnostic collection for a given period to analyze the performance of database systems. The collection includes diagnostic data from various AHF features such as:

  • Configuration
  • Environment Topology
  • Metrics
  • Logs

The diagnostic data collected from the system passes through AHF Insights, which in turn produces an offline report with analysis in the following areas:

  • System Configuration
  • System State
  • Anomalies in the Operating System
  • Best Practices Compliance
  • System Traces
  • Root cause for issues and fixes in some of the anomalous cases

To get started, run the following command:

ahf analysis create --type insights

Example 1-1 ahf analysis create --type insights

[root@node02 ~]# tfactl print status

.-----------------------------------------------------------------------------------------------.
| Host   | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+--------+---------------+--------+------+------------+----------------------+------------------+
| node02 | RUNNING       | 134679 | 5000 | 22.3.0.0.0 | 22300020221031131221 | COMPLETE         |
| node01 | RUNNING       | 128438 | 5000 | 22.3.0.0.0 | 22300020221031131221 | COMPLETE         |
'--------+---------------+--------+------+------------+----------------------+------------------'
[root@node02 ~]# ahf analysis create --type insights --last 2h
Starting analysis and collecting data for insights
Collecting data for AHF Insights (This may take a few minutes per node)
AHF Insights report is being generated for the last 2h
From Date : 11/20/2022 01:16:41 UTC - To Date : 11/20/2022 03:17:15 UTC
Report is generated at : /opt/oracle.ahf/data/repository/collection_Sun_Nov_20_03_16_36_UTC_2022_node_all/cgexa-ogmn12_insights_2022_11_20_03_18_13.zip
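
If you script around this command, you may want to pick up the location of the generated report from the command output. The following is a minimal Python sketch, not part of AHF, that assumes the output contains a "Report is generated at :" line worded exactly as in Example 1-1. Other AHF versions may word this line differently. The function name extract_report_path and the constant names are hypothetical.

```python
import re
from typing import Optional

# Assumption: the report location appears on a line worded as in Example 1-1.
REPORT_LINE = re.compile(r"^Report is generated at\s*:\s*(\S+)\s*$", re.MULTILINE)


def extract_report_path(cli_output: str) -> Optional[str]:
    """Return the report path found in the command output, or None if absent."""
    match = REPORT_LINE.search(cli_output)
    return match.group(1) if match else None


# Output lines copied from Example 1-1.
SAMPLE_OUTPUT = """\
Starting analysis and collecting data for insights
Collecting data for AHF Insights (This may take a few minutes per node)
AHF Insights report is being generated for the last 2h
From Date : 11/20/2022 01:16:41 UTC - To Date : 11/20/2022 03:17:15 UTC
Report is generated at : /opt/oracle.ahf/data/repository/collection_Sun_Nov_20_03_16_36_UTC_2022_node_all/cgexa-ogmn12_insights_2022_11_20_03_18_13.zip
"""

if __name__ == "__main__":
    print(extract_report_path(SAMPLE_OUTPUT))
```

Returning None when the line is missing lets a calling script distinguish a failed or differently worded run from a successful one.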

1.2.4 Introduction to Oracle Cluster Health Advisor

Oracle Cluster Health Advisor continuously monitors cluster nodes and Oracle RAC databases for performance and availability issue precursors to provide early warning of problems before they become critical.

Oracle Cluster Health Advisor does the following:

  • Detects node and database performance problems
  • Provides early-warning alerts and corrective action
  • Supports on-site calibration to improve sensitivity

In Oracle Database 12c release 2 (12.2.0.1), Oracle Cluster Health Advisor supports the monitoring of two critical subsystems of Oracle Real Application Clusters (Oracle RAC): the database instance and the host system. Oracle Cluster Health Advisor determines and tracks the health status of the monitored system. It periodically samples a wide variety of key measurements from the monitored system.

Over a hundred database and cluster node problems have been modeled, and the specific operating system and Oracle Database metrics that indicate the development or existence of these problems have been identified. This information is used to construct a trained, calibrated model that is based on a normal operational period of the target system.

Oracle Cluster Health Advisor runs an analysis multiple times a minute. For each observed input, it estimates an expected value based on the default model, and then performs anomaly detection based on the difference between the observed and expected values. If enough of the inputs associated with a specific problem are abnormal, then Oracle Cluster Health Advisor raises a warning and generates an immediate targeted diagnosis and corrective action.
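
The general idea of comparing observed and expected values can be illustrated with a short Python sketch. This is a simplified illustration only and is not the model that Oracle Cluster Health Advisor uses. The metric names, the problem names, the relative-deviation test, and the tolerance of 0.25 are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class ProblemSignature:
    """Hypothetical description of a problem and the inputs that indicate it."""
    name: str
    inputs: Tuple[str, ...]
    min_abnormal: int  # number of abnormal inputs needed to raise a warning


def abnormal_inputs(observed: Dict[str, float],
                    expected: Dict[str, float],
                    tolerance: float) -> Set[str]:
    """Flag inputs whose relative deviation from the expected value exceeds tolerance."""
    flagged = set()
    for name, obs in observed.items():
        exp = expected[name]
        scale = max(abs(exp), 1e-9)  # avoid division by zero
        if abs(obs - exp) / scale > tolerance:
            flagged.add(name)
    return flagged


def diagnose(observed: Dict[str, float],
             expected: Dict[str, float],
             signatures: List[ProblemSignature],
             tolerance: float = 0.25) -> List[str]:
    """Return the names of problems with enough abnormal associated inputs."""
    flagged = abnormal_inputs(observed, expected, tolerance)
    return [s.name for s in signatures
            if sum(i in flagged for i in s.inputs) >= s.min_abnormal]


# Hypothetical values; in a real system the expected values come from a model.
EXPECTED = {"cpu_util": 40.0, "io_wait_ms": 5.0, "log_sync_ms": 2.0}
OBSERVED = {"cpu_util": 42.0, "io_wait_ms": 20.0, "log_sync_ms": 9.0}
SIGNATURES = [
    ProblemSignature("storage latency", ("io_wait_ms", "log_sync_ms"), 2),
    ProblemSignature("cpu saturation", ("cpu_util",), 1),
]

if __name__ == "__main__":
    print(diagnose(OBSERVED, EXPECTED, SIGNATURES))  # ['storage latency']
```

Requiring several abnormal inputs per problem, rather than one, is what keeps a single noisy metric from raising a warning in this sketch.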

Oracle Cluster Health Advisor models are conservative to prevent false warning notifications. However, the default configuration may not be sensitive enough for critical production systems. Therefore, Oracle Cluster Health Advisor provides an on-site model calibration capability that uses actual production workload data as the basis for its models, which increases the accuracy and sensitivity of the node and database models.

You can also use Oracle Cluster Health Advisor to diagnose and triage past problems. Specify the past dates through the command-line interface CHACTL, AHF Insights, or AHF Scope.

1.2.5 Introduction to AHF Scope

AHF Scope is a standalone, interactive, real-time capable front end to Oracle Cluster Health Advisor (CHA). AHF Scope has a very small footprint on the monitored system.

AHF Scope is invoked using the ahfscope script available in the /opt/oracle.ahf/ahfscope/bin/ directory. AHF Scope is designed primarily for cluster or database experts. It is capable of handling large amounts of data efficiently. Its layout and mode of operation are designed for functional efficiency. Most operations can be executed using a positional pointer and Hot Keys, or a floating menu available at the cursor position.

If Grid Infrastructure Management Repository (GIMR) is configured, AHF Scope connects directly to GIMR using a JDBC connection and reads the current data in real time. AHF Scope can also operate locally, with no connection to GIMR, using a data archive extracted from GIMR.

Note:

GIMR is optionally supported in Oracle Database 19c. However, it's desupported in Oracle Database 23ai. For more information, see Removing Grid Infrastructure Management Repository.

1.2.6 Introduction to AHF Balance

AHF Balance is a command-line utility that analyzes historical CPU consumption data and Database Resource Manager (DBRM) settings for the set of databases running in a cluster.

It assists in understanding the history of CPU-based noisy neighbor problems and recommends appropriate DBRM settings to minimize the risk of noisy neighbor problems.

1.2.7 Introduction to Cluster Health Monitor

Cluster Health Monitor is a component of Oracle Grid Infrastructure that continuously monitors and stores Oracle Clusterware and operating system resource metrics.

Enabled by default, Cluster Health Monitor:

  • Assists with node eviction analysis
  • Logs all process data locally
  • Enables you to define pinned processes
  • Listens to CSS and GIPC events
  • Categorizes processes by type
  • Supports plug-in collectors such as traceroute, netstat, and ping
  • Provides CSV output for ease of analysis

Cluster Health Monitor serves as a data feed for other Oracle Autonomous Health Framework components such as Oracle Cluster Health Advisor.

1.2.8 Introduction to Blocker Resolver

Blocker Resolver is an Oracle Real Application Clusters (Oracle RAC) environment feature that autonomously resolves delays and keeps the resources available.

Enabled by default, Blocker Resolver:

  • Reliably detects database delays and deadlocks
  • Autonomously resolves database delays and deadlocks
  • Logs all detections and resolutions
  • Provides a SQL interface to configure sensitivity (Normal or High) and trace file sizes

A database delay occurs when a session blocks a chain of one or more sessions. The blocking session holds a resource, such as a lock or latch, that prevents the blocked sessions from progressing. The chain of sessions has a root or final blocker session, which blocks all the other sessions in the chain. Blocker Resolver detects these delays and resolves them autonomously.
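
The notion of a root or final blocker in a chain of sessions can be sketched in a few lines of Python. This is an illustration of the concept only and does not reflect how Blocker Resolver is implemented. The blocked_by mapping, the session identifiers, and the function name find_final_blocker are hypothetical.

```python
from typing import Dict, Optional


def find_final_blocker(blocked_by: Dict[int, int], session: int) -> Optional[int]:
    """Follow the chain from a session to the session at its root.

    blocked_by maps a blocked session ID to the ID of the session blocking it.
    Returns the final blocker, the session itself if it is not blocked,
    or None if the chain loops back on itself (a deadlock).
    """
    seen = {session}
    current = session
    while current in blocked_by:
        current = blocked_by[current]
        if current in seen:
            return None  # cycle detected: no single final blocker exists
        seen.add(current)
    return current


# Hypothetical chain: session 30 waits on 20, which waits on 10.
CHAIN = {30: 20, 20: 10}
# Hypothetical deadlock: sessions 1 and 2 wait on each other.
DEADLOCK = {1: 2, 2: 1}

if __name__ == "__main__":
    print(find_final_blocker(CHAIN, 30))     # 10
    print(find_final_blocker(DEADLOCK, 1))   # None
```

Tracking the sessions already visited is what separates a plain chain, which ends at one final blocker, from a deadlock, which has none.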