Database Service Events
The Database Service Events feature notifies you about health issues with your Oracle Databases or other components on the DB system.
Oracle Database or Clusterware may become unhealthy, or various system components may run out of space on the DB system, and without this feature customers are not notified of the situation. The Database Service Events feature generates events for Data Plane operations and conditions, and sends notifications to customers by leveraging the existing OCI Events service and Notification mechanisms in their tenancy. Customers can then create topics and subscribe to these topics through email, functions, or streams.
Note:
The event flow on the DB system depends on Oracle Trace File Analyzer (TFA) and the Oracle Database Cloud Service (DBCS) agent. Ensure that these components are up and running.
Receive Notifications about Database Service Events
To receive notifications, subscribe to Database Service Events using the Oracle Notification service; see Notifications Overview. For more information about Oracle Cloud Infrastructure Events, see Overview of Events.
Events Service - Event Types
- Database - Critical
- DB Node - Critical
- DB Node - Error
- DB Node - Warning
- DB Node - Information
- DB System - Critical
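As an illustration of how these event types might be used when subscribing, the following Python sketch builds a rule condition that matches only critical Database Service events. It assumes the OCI Events rule condition is a JSON document with an `eventType` list, and it uses only the event type strings that appear in the Database Service Events table in this document. The function name `build_rule_condition` is hypothetical.

```python
import json

# Event type strings as they appear in the Database Service Events table.
DB_SERVICE_EVENT_TYPES = [
    "com.oraclecloud.databaseservice.dbnode.critical",
    "com.oraclecloud.databaseservice.dbnode.information",
    "com.oraclecloud.databaseservice.database.critical",
    "com.oraclecloud.databaseservice.database.information",
]


def build_rule_condition(event_types):
    """Return JSON text for a rule condition matching any of the event types.

    Assumption: the rule condition accepts an "eventType" list.
    """
    if not event_types:
        raise ValueError("at least one event type is required")
    # De-duplicate and sort so the generated condition is stable.
    return json.dumps({"eventType": sorted(set(event_types))})


# Keep only the critical event types for an alerting rule.
critical_only = [t for t in DB_SERVICE_EVENT_TYPES if t.endswith(".critical")]
condition = build_rule_condition(critical_only)
print(condition)
```

The resulting text could be pasted into the condition of an Events rule whose action targets a Notifications topic. Information events are left out here so that routine start and stop operations do not page anyone.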
Database Service Event Types
The following table lists the event types that the Database Service emits.
Note:
- Critical events are triggered by several types of critical conditions and errors that disrupt the database and other critical components. Examples include database hang errors and availability errors for databases, database nodes, and database systems, which let you know when a resource becomes unavailable.
- Information events are triggered when the database and other critical components work as expected. For example, a clean shutdown of CRS, CDB, client, or scan listener, or a startup of these components, creates an event with the severity of INFO.
- Threshold limits reduce the number of notifications customers receive for similar incident events while ensuring that they still receive the incident events and are reminded in a timely fashion.
Database Service Events
Table - Database Service Events
Friendly Name | Event Name | Description | Remediation | Event Type | Threshold |
---|---|---|---|---|---|
Resource Utilization - Disk Usage | HEALTH.DB_GUEST.FILESYSTEM.FREE_SPACE | This event is reported when VM guest file system free space falls below 10% free, as determined by the operating system. | HEALTH-DB_GUEST-FILESYSTEM-FREE_SPACE | com.oraclecloud.databaseservice.dbnode.critical | Critical threshold: 90% |
CRS status Up/Down | AVAILABILITY.DB_GUEST.CRS_INSTANCE.DOWN | An event of type CRITICAL is created when the Cluster Ready Service (CRS) is detected to be down. | AVAILABILITY-DB_GUEST-CRS_INSTANCE.DOWN | com.oraclecloud.databaseservice.dbnode.critical (if .DOWN and NOT "user_action") | NA |
 | AVAILABILITY.DB_GUEST.CRS_INSTANCE.DOWN_CLEARED | An event of type INFORMATION is created once it is determined that the event for CRS down has cleared. | NA | com.oraclecloud.databaseservice.dbnode.information (if .DOWN_CLEARED) | NA |
 | AVAILABILITY.DB_GUEST.CRS_INSTANCE.EVICTION | An event of type CRITICAL is created. | AVAILABILITY-DB_GUEST-CRS_INSTANCE-EVICTION | com.oraclecloud.databaseservice.dbnode.critical | NA |
SCAN Listener Up/Down | AVAILABILITY.DB_CLUSTER.SCAN_LISTENER.DOWN | A DOWN event is created when a SCAN listener goes down. The event is of type INFORMATION when a SCAN listener is shut down due to user action, such as with the Server Control Utility. There are three SCAN listeners per cluster called LISTENER_SCAN[1,2,3]. | AVAILABILITY-DB_CLUSTER-SCAN_LISTENER-DOWN | com.oraclecloud.databaseservice.dbnode.critical (if .DOWN and NOT "user_action") | NA |
 | AVAILABILITY.DB_CLUSTER.SCAN_LISTENER.DOWN_CLEARED | An event of type INFORMATION is created once it is determined that the event for SCAN Listener down has cleared. | NA | com.oraclecloud.databaseservice.dbnode.information (if .DOWN_CLEARED) | NA |
Net Listener Up/Down | AVAILABILITY.DB_GUEST.CLIENT_LISTENER.DOWN | A DOWN event is created when a client listener goes down. The event is of type INFORMATION when a client listener is shut down due to user action, such as with the Server Control Utility. There is one client listener per node, each called LISTENER. | AVAILABILITY-DB_GUEST-CLIENT_LISTENER.DOWN | com.oraclecloud.databaseservice.database.critical (if .DOWN and NOT "user_action") | NA |
 | AVAILABILITY.DB_GUEST.CLIENT_LISTENER.DOWN_CLEARED | An event of type INFORMATION is created once it is determined that the event for Client Listener down has cleared. | NA | com.oraclecloud.databaseservice.database.information (if .DOWN_CLEARED) | NA |
CDB Up/Down | AVAILABILITY.DB_GUEST.CDB_INSTANCE.DOWN | A DOWN event is created when a database instance goes down. The event is of type INFORMATION when a database instance is shut down due to user action, such as with the SQL*Plus (sqlplus) or Server Control Utility (srvctl) commands, or any Oracle Cloud maintenance action that uses those commands, such as performing a database home software update. The event is of type CRITICAL when a database instance goes down unexpectedly. A corresponding DOWN_CLEARED event is created when a database instance is started. | AVAILABILITY-DB_GUEST-CDB_INSTANCE-DOWN | com.oraclecloud.databaseservice.database.critical (if .DOWN and NOT "user_action") | NA |
 | AVAILABILITY.DB_GUEST.CDB_INSTANCE.DOWN_CLEARED | An event of type INFORMATION is created once it is determined that the event for the CDB down has cleared. | NA | com.oraclecloud.databaseservice.database.information (if .DOWN_CLEARED) | NA |
Critical DB Errors | HEALTH.DB_CLUSTER.CDB.CORRUPTION | Database corruption has been detected on your primary or standby database. The database alert.log is parsed for any specific errors that are indicative of physical block corruptions, logical block corruptions, or logical block corruptions caused by lost writes. | HEALTH-DB_CLUSTER-CDB-CORRUPTION | com.oraclecloud.databaseservice.database.critical | NA |
Other DB Errors | HEALTH.DB_CLUSTER.CDB.ARCHIVER_HANG | An event of type CRITICAL is created if a CDB is either unable to archive the active online redo log or unable to archive the active online redo log fast enough to the log archive destinations. | HEALTH-DB_CLUSTER-CDB-ARCHIVER_HANG | com.oraclecloud.databaseservice.database.critical | NA |
 | HEALTH.DB_CLUSTER.CDB.DATABASE_HANG | An event of type CRITICAL is created when a process/session hang is detected in the CDB. | HEALTH-DB_CLUSTER-CDB-DATABASE_HANG | com.oraclecloud.databaseservice.database.critical | NA |
Backup Failures | HEALTH.DB_CLUSTER.CDB.BACKUP_FAILURE | An event of type CRITICAL is created if there is a CDB backup with a FAILED status reported in the v$rman_status view. | HEALTH-DB_CLUSTER-CDB-BACKUP_FAILURE | com.oraclecloud.databaseservice.database.critical | NA |
 | HEALTH.DB_CLUSTER.CDB.BACKUP_FAILURE_CLEARED | An event of type INFORMATION is created. | NA | com.oraclecloud.databaseservice.database.information | NA |
Disk Group Usage | HEALTH.DB_CLUSTER.DISK_GROUP.FREE_SPACE | An event of type CRITICAL is created when an ASM disk group reaches space usage of 90% or higher. An event of type INFORMATION is created when the ASM disk group space usage drops below 90%. | HEALTH-DB_CLUSTER-DISK_GROUP-FREE_SPACE | | Notifications are sent when the usage hits 70%, 80%, 90%, and 100% with a corresponding severity of 4, 3, 2, and 1. |
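The disk group thresholds in the last row can be read as a simple lookup from usage to severity. The sketch below is one way to express it in Python. It assumes that a usage value between two thresholds keeps the severity of the highest threshold already reached, which the table does not state explicitly. The name `disk_group_severity` is hypothetical.

```python
# (usage threshold in percent, severity) pairs from the table, highest first.
THRESHOLDS = [(100, 1), (90, 2), (80, 3), (70, 4)]


def disk_group_severity(used_pct):
    """Return the severity for an ASM disk group usage percentage.

    Returns None when usage is below the lowest threshold (70%).
    Assumption: usage between thresholds maps to the highest threshold reached.
    """
    if not 0 <= used_pct <= 100:
        raise ValueError("used_pct must be between 0 and 100")
    for threshold, severity in THRESHOLDS:
        if used_pct >= threshold:
            return severity
    return None


for pct in (65, 70, 85, 90, 100):
    print(pct, disk_group_severity(pct))
```

Severity 1 is the most urgent value in this scheme, so a consumer that sorts or filters on severity should treat lower numbers as more important.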
Temporarily Restrict Automatic Diagnostic Collections for Specific Events
Use the tfactl blackout command to temporarily suppress automatic diagnostic collections.
If you set a blackout for a target, then Oracle Trace File Analyzer stops automatic diagnostic collections if it finds events in the alert logs for that target while scanning. By default, a blackout remains in effect for 24 hours.
You can also restrict automatic diagnostic collection at a granular level, for example, only for ORA-00600 or even only for ORA-00600 with specific arguments.
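The event-string form of a blackout is described in this document as applying to incidents where any part of the line contains one of the specified strings. The Python sketch below models that rule so you can reason about which alert log lines a given event list would cover. It is not Oracle Trace File Analyzer code: the function name `is_blacked_out` is hypothetical, and case-sensitive substring matching is an assumption.

```python
def is_blacked_out(alert_line, event_spec):
    """Return True if alert_line is covered by a blackout event specification.

    event_spec is either "all" or a comma-delimited list of strings.
    Assumption: matching is a case-sensitive substring test.
    """
    if event_spec.strip().lower() == "all":
        return True
    patterns = [p.strip() for p in event_spec.split(",") if p.strip()]
    return any(p in alert_line for p in patterns)


lines = [
    "ORA-00600: internal error code, arguments: [kdsgrp1]",
    "ORA-04031: unable to allocate 4096 bytes of shared memory",
    "ORA-07445: exception encountered",
]
for line in lines:
    print(is_blacked_out(line, "ORA-00600,ORA-04031"), line)
```

Because the list is comma-delimited, a pattern that itself contains a comma cannot be expressed in this model. Narrower patterns, such as an error code together with one of its arguments, would need a string without commas.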
Syntax
tfactl blackout add|remove|print
-targettype host|crs|asm|asmdg|database|dbbackup|db_dataguard|
db_tablespace|pdb_tablespace|pdb|listener|service|os
-target all|name
[-container name]
[-pdb pdb_name]
-event all|"event_str1,event_str2"|availability
[-timeout nm|nh|nd|none]
[-c|-local|-nodes "node1,node2"]
[-reason "reason for blackout"]
[-docollection]
Parameters
Table - Parameters
Parameter | Description |
---|---|
add|remove|print |
Adds, removes, or prints blackout conditions. |
-targettype host|crs|asm|asmdg|database|dbbackup|db_dataguard|db_tablespace|pdb_tablespace|pdb|listener|service|os |
Limits blackout only to the specified target type. |
-target all|name |
Specify the target for blackout. You can specify a comma-delimited list of targets. |
-container name |
Specify the database container name (db_unique_name) where the blackout will take effect (for PDB, DB_TABLESPACE, and PDB_TABLESPACE). |
-pdb pdb_name |
Specify the PDB where the blackout will take effect (for PDB_TABLESPACE only). |
-event all|"event_str1,event_str2"|availability |
Limits blackout only to the availability events, or to event strings, which should not trigger automatic collections or should be marked as blacked out in telemetry JSON.
string: Blackout for incidents where any part of the line contains the strings specified. Specify a comma-delimited list of strings. |
-timeout nm|nh|nd|none |
Specify the duration for blackout in number of minutes, hours, or days before timing out. By default, the timeout is set to 24 hours (24h). |
-c|-local |
Specify whether the blackout is set cluster-wide (-c) or on the local node only (-local). |
-reason comment |
Specify a descriptive reason for the blackout. |
-docollection |
Use this option to do an automatic diagnostic collection even if a blackout is set for this target. |
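When blackouts are set from automation, for example before a patching window, it can help to assemble the command from validated inputs. The following Python sketch builds the argument list for tfactl blackout add from the syntax shown above. The helper name `blackout_add_args` is hypothetical, and it covers only a subset of the options (-targettype, -target, -container, -event, -timeout, -reason).

```python
import re
import shlex

# Target types copied from the syntax section.
TARGET_TYPES = {
    "host", "crs", "asm", "asmdg", "database", "dbbackup", "db_dataguard",
    "db_tablespace", "pdb_tablespace", "pdb", "listener", "service", "os",
}
# Timeout forms from the syntax section: nm, nh, nd, or none.
TIMEOUT_RE = re.compile(r"^(\d+[mhd]|none)$")


def blackout_add_args(targettype, target="all", event=None,
                      timeout=None, reason=None, container=None):
    """Return the argument list for a 'tfactl blackout add' invocation."""
    if targettype not in TARGET_TYPES:
        raise ValueError(f"unknown target type: {targettype!r}")
    if timeout is not None and not TIMEOUT_RE.match(timeout):
        raise ValueError(f"invalid timeout: {timeout!r}")
    args = ["tfactl", "blackout", "add",
            "-targettype", targettype, "-target", target]
    if container:
        args += ["-container", container]
    if event:
        args += ["-event", event]
    if timeout:
        args += ["-timeout", timeout]
    if reason:
        args += ["-reason", reason]
    return args


args = blackout_add_args("host", event="all", timeout="1h",
                         reason="Disabling all events during patching")
# shlex.join quotes arguments that contain spaces for display in a shell.
print(shlex.join(args))
```

Returning a list rather than a single string keeps the reason text as one argument when the list is passed to a process launcher, so no manual quoting is needed.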
Examples
The following examples show how to use the tfactl blackout command.
To blackout event: ORA-00600 on targettype: database, target: mydb
tfactl blackout add -targettype database -target mydb -event "ORA-00600"
To blackout event: ORA-04031 on targettype: database, target: all
tfactl blackout add -targettype database -target all -event "ORA-04031" -timeout 1h
To blackout db backup events on targettype: dbbackup, target: mydb
tfactl blackout add -targettype dbbackup -target mydb
To blackout db dataguard events on targettype: db_dataguard, target: mydb
tfactl blackout add -targettype db_dataguard -target mydb -timeout 30m
To blackout db tablespace events on targettype: db_tablespace, target: system, container: mydb
tfactl blackout add -targettype db_tablespace -target system -container mydb -timeout 30m
To blackout ALL events on targettype: host, target: all
tfactl blackout add -targettype host -event all -target all -timeout 1h -reason "Disabling all events during patching"
To print blackout details:
tfactl blackout print
.-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------.
| myhostname |
+---------------+---------------------+-----------+------------------------------+------------------------------+--------+---------------+--------------------------------------+
| Target Type | Target | Events | Start Time | End Time | Status | Do Collection | Reason |
+---------------+---------------------+-----------+------------------------------+------------------------------+--------+---------------+--------------------------------------+
| HOST | ALL | ALL | Thu Mar 24 16:48:39 UTC 2022 | Thu Mar 24 17:48:39 UTC 2022 | ACTIVE | false | Disabling all events during patching |
| DATABASE | MYDB | ORA-00600 | Thu Mar 24 16:39:03 UTC 2022 | Fri Mar 25 16:39:03 UTC 2022 | ACTIVE | false | NA |
| DATABASE | ALL | ORA-04031 | Thu Mar 24 16:39:54 UTC 2022 | Thu Mar 24 17:39:54 UTC 2022 | ACTIVE | false | NA |
| DB_DATAGUARD | MYDB | ALL | Thu Mar 24 16:41:38 UTC 2022 | Thu Mar 24 17:11:38 UTC 2022 | ACTIVE | false | NA |
| DBBACKUP | MYDB | ALL | Thu Mar 24 16:40:47 UTC 2022 | Fri Mar 25 16:40:47 UTC 2022 | ACTIVE | false | NA |
| DB_TABLESPACE | SYSTEM_CDBNAME_MYDB | ALL | Thu Mar 24 16:45:56 UTC 2022 | Thu Mar 24 17:15:56 UTC 2022 | ACTIVE | false | NA |
'---------------+---------------------+-----------+------------------------------+------------------------------+--------+---------------+--------------------------------------'
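If you want to check active blackouts from a script, the printed table can be parsed into records. The Python sketch below does this for output shaped like the sample above. It assumes the column layout stays as shown and that no cell contains a pipe character. The function name `parse_blackout_print` is hypothetical.

```python
SAMPLE = """\
| myhostname |
+---------------+--------+-----------+
| Target Type | Target | Events | Start Time | End Time | Status | Do Collection | Reason |
+---------------+--------+-----------+
| HOST | ALL | ALL | Thu Mar 24 16:48:39 UTC 2022 | Thu Mar 24 17:48:39 UTC 2022 | ACTIVE | false | Disabling all events during patching |
| DATABASE | MYDB | ORA-00600 | Thu Mar 24 16:39:03 UTC 2022 | Fri Mar 25 16:39:03 UTC 2022 | ACTIVE | false | NA |
"""


def parse_blackout_print(output):
    """Parse 'tfactl blackout print' style output into a list of dicts."""
    header = None
    rows = []
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            # The host name line comes first; wait for the real header row.
            if "Target Type" in cells:
                header = cells
            continue
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


for row in parse_blackout_print(SAMPLE):
    print(row["Target Type"], row["Target"], row["Events"], row["Status"])
```

A caller could filter the records on the Status and End Time columns to confirm that a blackout set before maintenance is still in effect.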
To remove blackout for event: ORA-00600 on targettype: database, target: mydb
tfactl blackout remove -targettype database -event "ORA-00600" -target mydb
To remove blackout for db backup events on targettype: dbbackup, target: mydb
tfactl blackout remove -targettype dbbackup -target mydb
To remove blackout for db tablespace events on targettype: db_tablespace, target: system, container: mydb
tfactl blackout remove -targettype db_tablespace -target system -container mydb
To remove blackout for all events on targettype: host, target: all
tfactl blackout remove -targettype host -event all -target all
Manage Oracle Trace File Analyzer
To check the run status of Oracle Trace File Analyzer, run the tfactl status command as root or a non-root user:
tfactl status
.----------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------+---------------+--------+------+------------+----------------------+------------------+
| node1 | RUNNING | 41312 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE |
| node2 | RUNNING | 272300 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE |
'----------------------------------------------------------------------------------------------'
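Because event flow depends on Oracle Trace File Analyzer running on every node, a monitoring script might scan this output for hosts that are not in the RUNNING state. The Python sketch below parses output shaped like the sample above. The function name `hosts_not_running` is hypothetical, and the parser assumes the column names stay as shown.

```python
SAMPLE = """\
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+-------+---------------+--------+------+------------+----------------------+------------------+
| node1 | RUNNING | 41312 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE |
| node2 | RUNNING | 272300 | 5000 | 22.1.0.0.0 | 22100020220310214615 | COMPLETE |
"""


def hosts_not_running(status_output):
    """Return the hosts whose 'Status of TFA' column is not RUNNING."""
    header = None
    not_running = []
    for line in status_output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            if "Status of TFA" in cells:
                header = cells
            continue
        row = dict(zip(header, cells))
        if row.get("Status of TFA") != "RUNNING":
            not_running.append(row.get("Host"))
    return not_running


print(hosts_not_running(SAMPLE))
```

An empty list means every listed node reports TFA as running. Any host names returned are candidates for the tfactl start command described next.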
To start the Oracle Trace File Analyzer daemon on the local node, run the tfactl start command as the root user:
tfactl start
Starting TFA..
Waiting up to 100 seconds for TFA to be started..
. . . . .
Successfully started TFA Process..
. . . . .
TFA Started and listening for commands
To stop the Oracle Trace File Analyzer daemon on the local node, run the tfactl stop command as the root user:
tfactl stop
Stopping TFA from the Command Line
Nothing to do !
Please wait while TFA stops
Please wait while TFA stops
TFA-00002 Oracle Trace File Analyzer (TFA) is not running
TFA Stopped Successfully
Successfully stopped TFA..
Manage Database Service Agent
View the /opt/oracle/dcs/log/dcs-agent.log file to identify issues with the agent.
To check the status of the Database Service Agent, run the systemctl status command:
systemctl status dbcsagent.service
dbcsagent.service
Loaded: loaded (/usr/lib/systemd/system/dbcsagent.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2022-04-01 13:40:19 UTC; 6min ago
Process: 9603 ExecStopPost=/bin/bash -c kill `ps -fu opc |grep "java.*dbcs-agent.*jar" |
awk '{print $2}' ` (code=exited, status=0/SUCCESS)
Main PID: 10055 (sudo)
CGroup: /system.slice/dbcsagent.service
‣ 10055 sudo -u opc /bin/bash -c umask 077; /bin/java -Doracle.security.jps.config=/opt/oracle/...
To start the agent if it is not running, run the systemctl start command as the root user:
systemctl start dbcsagent.service
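For a scripted health check, the agent state can be read with systemctl is-active, which prints a single state word such as active. The Python sketch below wraps that call. The function name `agent_is_active` and the injectable `run` parameter are illustrative choices, not part of any Oracle tooling. The unit name is the one shown above.

```python
import subprocess


def agent_is_active(run=subprocess.run, unit="dbcsagent.service"):
    """Return True if 'systemctl is-active' reports the unit as active.

    The 'run' parameter exists so the check can be exercised without
    systemd, for example in unit tests.
    """
    result = run(
        ["systemctl", "is-active", unit],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit status only means "not active"
    )
    return result.stdout.strip() == "active"
```

On a DB system node, calling `agent_is_active()` with no arguments runs the real command. If it returns False, start the agent with the systemctl start command shown above and review the agent log file.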