Changes in this Release

This preface lists changes in the Oracle Autonomous Health Framework Checks and Diagnostics User's Guide 24.6.

Node Eviction Detection Due to Multipath Disk Failures and Resolution

AHF now automatically detects and helps resolve more causes of node eviction, and can also identify some causes of performance problems. In addition, each Problem Summary now includes relevant configuration details.

Since version 24.4, AHF has automatically detected Node Evictions and displayed them in the Detected Problems panel of the Insights dashboard. From there, you can drill down to view the details of a specific node eviction.

This release includes the ability to detect more causes of node eviction and adds Configuration details to the Problem Summary.

The Problem Summary contains:

  • Problem: including which node was restarted and at what time
  • Reason: explaining why the node was restarted
  • Cause: explaining the root cause
  • Evidence: providing a bulleted audit trail of the relevant operating system and database resource metrics that were outside the normal range leading up to the event
  • Configuration: showing database stack configuration
  • Resolution Steps: detailing in simple terms exactly how to resolve the problem

Evidence is expandable, showing charts or log details to confirm the evidence.

AHF will generate a Problem Summary for the following causes of Node Eviction:

  • Memory exhaustion due to:
    • Over-allocated HugePages
    • A Database or Grid Infrastructure process increasing its memory usage
    • A new database being started
  • Multipath disk failures

It will also generate a Problem Summary for hangs and performance issues caused by:

  • Archiver stuck
  • Latch contention due to a misconfigured target_pdbs parameter

Future releases will continue to expand the set of problem causes that AHF can identify.

System Health Metrics Available on First Failure

AHF now automatically collects real-time system health metrics for non-clustered database systems, so that they are available at first failure, and includes them within diagnostic collections.

System health metrics such as CPU, memory, and IO consumers are invaluable to Oracle Support for diagnosing Service Requests. Because AHF now captures these metrics automatically, they are available at the time of a failure and are included within diagnostic collections.

A New Command-Line Option to Save the AHF Installer

A new option, -saveinstaller, has been added to the ahf_setup command to save the AHF installer for later use in case a downgrade is needed.
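
As a rough illustration, the option might be passed to the installer as sketched below. Only the -saveinstaller option name and the ahf_setup command come from this release note. The installer path (./ahf_setup), the AHF_INSTALLER variable, the DRY_RUN switch, and the function name are assumptions made for this example, and the note does not say whether -saveinstaller must be combined with other install options.

```shell
#!/bin/sh
# Hypothetical wrapper around the installer. AHF_INSTALLER and DRY_RUN are
# illustrative names, not part of AHF.
AHF_INSTALLER="${AHF_INSTALLER:-./ahf_setup}"

run_ahf_setup_saving_installer() {
    # By default, only print the command that would be run.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$AHF_INSTALLER -saveinstaller"
        return 0
    fi
    # With DRY_RUN=0, refuse to continue if the installer is missing.
    if [ ! -x "$AHF_INSTALLER" ]; then
        echo "installer not found or not executable: $AHF_INSTALLER" >&2
        return 1
    fi
    "$AHF_INSTALLER" -saveinstaller
}

run_ahf_setup_saving_installer
```

Run the script with DRY_RUN=0 set in the environment to invoke the installer instead of printing the command.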

Component-Level Grouping of Events and Faster Performance

AHF Insights now includes the ability to view events grouped at the component level, and the home dashboard has been optimized for faster performance.

You can now explore timeline events grouped by Components, in addition to the existing Host, Events, and Database groupings. This granular view provides a more comprehensive understanding of how issues impact specific components of the database stack, enabling quicker and easier identification of problem areas.

Insights reports are included within diagnostic collections, or they can be generated on demand by running:
ahf analysis create --type insights
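
When scripting the command above, it can help to confirm that the ahf command-line tool is on the PATH first. The following is a minimal sketch. The function name generate_insights_report is hypothetical, and the only AHF command used is the one shown above.

```shell
#!/bin/sh
# Generate an Insights report on demand, but only if the ahf CLI is available.
# generate_insights_report is an illustrative name, not part of AHF.
generate_insights_report() {
    if ! command -v ahf >/dev/null 2>&1; then
        echo "ahf not found on PATH" >&2
        return 127
    fi
    ahf analysis create --type insights
}
```

Call generate_insights_report from a maintenance script. It returns 127 without running anything if ahf cannot be found.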

System Health Monitor (SHM) Integrated into AHF

In AHF 24.6, System Health Monitor (SHM) is integrated into AHF and enabled by default. AHF now includes the SHM files in its diagnostic collections.

SHM monitors operating system metrics in real time for processes, memory, network, IO, and disk. These metrics help you troubleshoot system performance issues as they occur and perform root cause analysis of past issues. SHM analysis will be available in AHF Insights. For more information, see Explore Diagnostic Insights.

SHM operates as a daemon process, triggered and controlled by AHF. It is available only on single-instance database and non-GI-based systems.

You can also use the ahfctl statusahf command to check the status of System Health Monitor.
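
The status check can be wrapped in a small script, sketched below under stated assumptions. The format of the ahfctl statusahf output is not described here, so the filter on "System Health Monitor" or "SHM" is a guess. If nothing matches, the function prints the full output instead. The function name show_shm_status is hypothetical.

```shell
#!/bin/sh
# Show the System Health Monitor lines from `ahfctl statusahf`.
# Assumption: the status output mentions SHM by name. If no line matches,
# the complete output is printed so nothing is hidden.
show_shm_status() {
    if ! command -v ahfctl >/dev/null 2>&1; then
        echo "ahfctl not found on PATH" >&2
        return 127
    fi
    status_output=$(ahfctl statusahf 2>&1)
    # grep prints matching lines and succeeds only if at least one matched.
    if printf '%s\n' "$status_output" | grep -i -E 'system health monitor|shm'; then
        return 0
    fi
    printf '%s\n' "$status_output"
}
```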

New Oracle Orachk and Oracle Exachk Best Practice Checks

Release 24.6 includes the following new Oracle Orachk and Oracle Exachk best practice checks.

Oracle Exachk Specific Best Practice Checks

  • Exadata Critical Issue EX88
  • Exadata Critical Issue DB53

All checks can be explored in more detail via the Health Check Catalogs: