6 cnDBTier Alerts
cnDBTier generates alerts when it meets specified conditions. You can view the alerts on the Prometheus dashboard and take the necessary actions. Prometheus is installed as part of the common services during the vCNE installation. This section provides details about the available cnDBTier alerts.
6.1 cnDBTier Remote Server Backup Transfer Status Alerts
This section provides details about the cnDBTier remote server backup transfer status alerts.
Table 6-1 REMOTE_SERVER_BACKUP_TRANSFER_FAILED
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity when the transfer of backup to a remote server fails. |
| Summary | Secure transfer of backup to remote server failed on cnDBTier site {{ $labels.site_name }} |
| Severity | major |
| Condition | db_tier_remote_server_backup_transfer_status == 1 |
| Expression Validity | NA |
| SNMP Trap ID | 2031 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The transfer of the backup to the remote server failed.
Diagnostic Information: Check the status of the
Recommended Actions: This alert is cleared automatically when the status of the backup transfer to the remote server is updated to success.
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.2 cnDBTier Backup Transfer Status Alerts
This section provides details about the cnDBTier backup transfer status alerts.
Table 6-2 BACKUP_TRANSFER_LOCAL_FAILED
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity when the system fails to transfer the backup from the data node to the replication service pod on the cnDBTier site (db_tier_backup_transfer_status metric value is 2). |
| Summary | Failed to transfer backup from data node to replication service pod on cnDBTier site {{ $labels.site_name }} |
| Severity | major |
| Condition | db_tier_backup_transfer_status == 2 |
| Expression Validity | NA |
| SNMP Trap ID | 2026 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The system failed to transfer a backup from the data node to the replication service pod on a cnDBTier site.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-3 BACKUP_TRANSFER_FAILED
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity when the system fails to transfer the backup from the cnDBTier site to the remote site (db_tier_backup_transfer_status metric value is 3). |
| Summary | Failed to transfer backup to remote site from cnDBTier site {{ $labels.site_name }} |
| Severity | major |
| Condition | db_tier_backup_transfer_status == 3 |
| Expression Validity | NA |
| SNMP Trap ID | 2027 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The system failed to transfer a backup from the cnDBTier site to a remote site.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-4 BACKUP_TRANSFER_IN_PROGRESS
| Field | Details |
|---|---|
| Description | This alert is triggered with info severity when the backup transfer is in progress on the cnDBTier site (db_tier_backup_transfer_status metric value is 1). |
| Summary | Backup Transfer is In Progress on cnDBTier site {{ $labels.site_name }} |
| Severity | info |
| Condition | db_tier_backup_transfer_status == 1 |
| Expression Validity | NA |
| SNMP Trap ID | 2028 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: Backup transfer is in progress on the cnDBTier site.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
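The three backup transfer alerts in this section are all driven by the same metric, so the mapping can be summarized as a small lookup. The following Python sketch is illustrative only: `BACKUP_TRANSFER_ALERTS` and `alert_for_transfer_status` are hypothetical names, and treating any other metric value as "no alert" is an assumption, because this section does not define other values.

```python
from typing import Optional, Tuple

# Metric value -> (alert name, severity), as listed in Tables 6-2 to 6-4.
BACKUP_TRANSFER_ALERTS = {
    1: ("BACKUP_TRANSFER_IN_PROGRESS", "info"),
    2: ("BACKUP_TRANSFER_LOCAL_FAILED", "major"),
    3: ("BACKUP_TRANSFER_FAILED", "major"),
}


def alert_for_transfer_status(value: int) -> Optional[Tuple[str, str]]:
    """Return (alert, severity) for a db_tier_backup_transfer_status value.

    Assumption: values that the tables do not list raise no alert.
    """
    return BACKUP_TRANSFER_ALERTS.get(value)


if __name__ == "__main__":
    for status in (0, 1, 2, 3):
        print(status, alert_for_transfer_status(status))
```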
6.3 cnDBTier Heartbeat Alerts
This section provides details about cnDBTier heartbeat alerts.
Table 6-5 HEARTBEAT_FAILED
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when HeartBeat fails on a remote site. |
| Summary | HeartBeat failed on cnDBTier site {{ $labels.site_name }} connected to mate site {{ $labels.mate_site_name }} on replication channel group id {{ $labels.replchannel_group_id }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | db_tier_heartbeat_failure == 1 |
| Expression Validity | NA |
| SNMP Trap ID | 2025 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The system is unable to connect to the remote site, and the heartbeat failed.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.4 cnDBTier BinLog Injector Thread Alerts
This section provides details about the cnDBTier BinLog injector thread alerts.
Table 6-6 BINLOG_INJECTOR_STOPPED
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when the BinLog Injector stops working. The value of db_tier_binlog_injector_thread or db_tier_binlog_injector_thread_latest_epoch indicates the status of the BinLog Injector. |
| Summary | BinLog Injector Thread is stopped for MySQL node having node id {{ $labels.node_id }} on cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | db_tier_binlog_injector_thread_latest_epoch == 1 or db_tier_binlog_injector_thread == 1 |
| Expression Validity | NA |
| SNMP Trap ID | 2024 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The BinLog Injector thread stalled for the replication SQL node.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.5 cnDBTier Replication Error Skip Alerts
This section provides details about the cnDBTier replication error skip alerts.
Table 6-7 REPLICATION_SWITCHOVER_DUE_CLUSTERDISCONNECT
| Field | Details |
|---|---|
| Description | This alert is triggered when a switchover occurs on an API node due to a configured cluster disconnect error while skip replication error is enabled. |
| Summary | Replication channel on SQL node with node ID {{ $labels.node_id }} had switchover due to cluster disconnect error number {{ $labels.error_number }} |
| Severity | info |
| Condition | db_tier_replication_switchover_due_to_clusterdisconnect == 1 |
| Expression Validity | NA |
| SNMP Trap ID | 2019 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: Skip replication error is enabled on an API node, and a switchover occurred on the node due to a configured cluster disconnect error.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-8 REPLICATION_TOO_MANY_EPOCHS_LOST
| Field | Details |
|---|---|
| Description | This alert is triggered when the number of epochs lost due to skip errors is greater than 10000 and less than or equal to 80000. This alert is cleared one hour after the event. |
| Summary | Too many epochs are lost for skipping replication errors |
| Severity | major |
| Condition | (db_tier_epochs_lost_due_to_skiperror > 10000) and (db_tier_epochs_lost_due_to_skiperror <= 80000) |
| Expression Validity | NA |
| SNMP Trap ID | 2020 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: More than 10000 and at most 80000 epochs are lost due to skip errors.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-9 REPLICATION_SKIP_ERRORS_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered when replication is halted and the skip error count is greater than 0 and less than or equal to 5. This alert is cleared one hour after the event. |
| Summary | Cross-site replication errors are skipped |
| Severity | minor |
| Condition | (db_tier_replication_halted_due_to_skiperror > 0) and (db_tier_replication_halted_due_to_skiperror <= 5) |
| Expression Validity | NA |
| SNMP Trap ID | 2021 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: Replication halted due to five or fewer skip errors.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-10 REPLICATION_SKIP_ERRORS_HIGH
| Field | Details |
|---|---|
| Description | This alert is triggered when replication is halted and the skip error count is greater than 5. This alert is cleared one hour after the event. |
| Summary | Cross-site replication errors skipped are high |
| Severity | major |
| Condition | db_tier_replication_halted_due_to_skiperror > 5 |
| Expression Validity | NA |
| SNMP Trap ID | 2022 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: Replication halted due to more than five skip errors.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
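The two skip error alerts split the same metric at a count of 5. A minimal Python sketch of that split follows; the function name `skip_error_alert` is hypothetical, and the thresholds are copied from the Condition rows of Tables 6-9 and 6-10.

```python
from typing import Optional


def skip_error_alert(halted_count: float) -> Optional[str]:
    """Classify a db_tier_replication_halted_due_to_skiperror value."""
    if halted_count > 5:
        return "REPLICATION_SKIP_ERRORS_HIGH"  # major
    if halted_count > 0:
        return "REPLICATION_SKIP_ERRORS_LOW"   # minor, counts 1 through 5
    return None  # no skip errors, neither condition matches


if __name__ == "__main__":
    for count in (0, 1, 5, 6):
        print(count, skip_error_alert(count))
```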
Table 6-11 REPLICATION_EPOCHS_LOST
| Field | Details |
|---|---|
| Description | This alert is triggered when the number of epochs lost due to skip errors is greater than 0 and less than 2000. This alert is cleared one hour after the event. |
| Summary | Epochs are lost for skipping replication errors |
| Severity | info |
| Condition | db_tier_epochs_lost_due_to_skiperror > 0 and db_tier_epochs_lost_due_to_skiperror <= <Configured epoch interval lower threshold> |
| Expression Validity | NA |
| SNMP Trap ID | 2023 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: Less than 2000 epochs are lost due to skip errors.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
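The two epoch loss alerts cover separate ranges of the same metric. The following Python sketch shows how a value falls into one range or neither. The function `epochs_lost_alert` is a hypothetical helper, and `lower_threshold` stands in for the configured epoch interval lower threshold named in the REPLICATION_EPOCHS_LOST condition. The value 2000 in the usage lines comes from that alert's description and may differ in a given deployment.

```python
from typing import Optional


def epochs_lost_alert(epochs_lost: float, lower_threshold: float) -> Optional[str]:
    """Classify a db_tier_epochs_lost_due_to_skiperror value.

    lower_threshold is deployment-specific; no default is assumed here.
    The first matching range wins.
    """
    if 0 < epochs_lost <= lower_threshold:
        return "REPLICATION_EPOCHS_LOST"           # info
    if 10000 < epochs_lost <= 80000:
        return "REPLICATION_TOO_MANY_EPOCHS_LOST"  # major
    return None  # outside both documented ranges


if __name__ == "__main__":
    for lost in (0, 1500, 5000, 50000, 90000):
        print(lost, epochs_lost_alert(lost, lower_threshold=2000))
```

With these inputs, values between the lower threshold and 10000, and values above 80000, match neither condition as written in the tables.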
6.6 cnDBTier Georeplication Recovery Status Alerts
This section provides details about the cnDBTier georeplication recovery status alerts.
Table 6-12 GEOREPLICATION_RECOVERY_IN_PROGRESS
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when georeplication recovery is in progress. The alert is cleared when georeplication recovery is complete. |
| Summary | Identified cnDBTier Site {{ $labels.site_name }} georeplication recovery is in progress for kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | db_tier_georeplication_recovery_state == 1 |
| Expression Validity | 1m |
| SNMP Trap ID | 2018 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: Georeplication recovery is in progress to recover a failed site from a healthy site.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.7 cnDBTier Cluster Status Alerts
This section provides details about cnDBTier cluster status alerts.
Table 6-13 CLUSTER_DOWN
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when cnDBTier NDB cluster is not UP. |
| Summary | MySQL Cluster is down for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | db_tier_cluster_status == 0 |
| Expression Validity | 1m |
| SNMP Trap ID | 2017 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause:
Diagnostic Information:
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-14 MYSQL_NDB_CLUSTER_DISCONNECT
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when the cnDBTier NDB cluster disconnects one or more times (db_tier_cluster_disconnect metric value is greater than 0). |
| Summary | MySQL NDB Cluster Disconnected {{ $value }} times for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | db_tier_cluster_disconnect > 0 |
| Expression Validity | 1m |
| SNMP Trap ID | 2034 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause:
Diagnostic Information:
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.8 cnDBTier Automated Backup Alerts
This section provides details about the cnDBTier automated backup alerts.
Table 6-15 BACKUP_FAILED
| Field | Details |
|---|---|
| Description | This alert is triggered with minor severity when the backup service fails to complete the backup successfully. |
| Summary | Could not backup database for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | minor |
| Condition | db_tier_backup{status='FAILED'} |
| Expression Validity | N/A |
| SNMP Trap ID | 2011 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause:
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-16 BACKUP_PURGED_EARLY
| Field | Details |
|---|---|
| Description | This alert is triggered with minor severity when the backup service purges old backups earlier than expected to create space for a new backup. |
| Summary | A backup was deleted prematurely for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | minor |
| Condition | db_tier_backup{status='PURGED_EARLY'} |
| Expression Validity | N/A |
| SNMP Trap ID | 2012 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The backup service purged old backups earlier than expected to create space for a new backup.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-17 BACKUP_SIZE_GROWTH
| Field | Details |
|---|---|
| Description | This alert is triggered with minor severity whenever the current backup size exceeds 20% of the average of the previous backups. |
| Summary | Backup size exceeded expected size for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | minor |
| Condition | (db_tier_backup_used_disk_percentage/(avg_over_time(db_tier_backup_used_disk_percentage[5d])))>1.05 |
| Expression Validity | N/A |
| SNMP Trap ID | 2013 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The current backup size exceeds 20% of the average of the previous backups.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-18 BACKUP_STORAGE_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered with minor severity when the total backup size of the data node is >= 70% and < 80% of the total data node disk size. |
| Summary | Disk storage on DATA node with node ID {{ $labels.node_id }} at {{ $value }} percent for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | minor |
| Condition | (avg_over_time(db_tier_backup_used_disk_percentage[5m])>=70) and (avg_over_time(db_tier_backup_used_disk_percentage[5m])<80) |
| Expression Validity | N/A |
| SNMP Trap ID | 2014 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The total backup size of the data node is >= 70% and < 80% of the total data node disk size.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-19 BACKUP_STORAGE_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity when the total backup size of the data node is >= 80% and < 95% of the total data node disk size. |
| Summary | Disk storage on DATA node with node ID {{ $labels.node_id }} at {{ $value }} percent for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | major |
| Condition | (avg_over_time(db_tier_backup_used_disk_percentage[5m])>=80) and (avg_over_time(db_tier_backup_used_disk_percentage[5m])<95) |
| Expression Validity | N/A |
| SNMP Trap ID | 2015 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The total backup size of the data node is >= 80% and < 95% of the total data node disk size.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-20 BACKUP_STORAGE_FULL
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when the total backup size of the data node is >= 95% of the total data node disk size. |
| Summary | Disk storage on DATA node with node ID {{ $labels.node_id }} is full for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | (avg_over_time(db_tier_backup_used_disk_percentage[5m])>=95) |
| Expression Validity | N/A |
| SNMP Trap ID | 2016 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The total backup size of the data node is >= 95% of the total data node disk size.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
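The backup storage alerts in Tables 6-18 to 6-20 apply three bands to a five-minute average of `db_tier_backup_used_disk_percentage`. The Python sketch below illustrates the banding under the assumption that the caller supplies the samples from that window. `backup_storage_alert` is a hypothetical helper, and the plain mean only approximates what `avg_over_time` computes in Prometheus.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple


def backup_storage_alert(samples: Sequence[float]) -> Optional[Tuple[str, str]]:
    """Return (alert, severity) for backup disk usage samples in percent.

    samples are db_tier_backup_used_disk_percentage readings from one
    five-minute window for a single data node.
    """
    if not samples:
        return None  # nothing to average
    avg = mean(samples)
    if avg >= 95:
        return ("BACKUP_STORAGE_FULL", "critical")
    if avg >= 80:
        return ("BACKUP_STORAGE_LOW", "major")
    if avg >= 70:
        return ("BACKUP_STORAGE_LOW", "minor")
    return None  # below every documented threshold


if __name__ == "__main__":
    print(backup_storage_alert([72, 74, 76]))
    print(backup_storage_alert([96, 98]))
```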
Table 6-21 DB_TIER_NDB_BACKUP_IN_PROGRESS
| Field | Details |
|---|---|
| Description | This alert is triggered with minor severity when a data node backup is in progress in the current site. |
| Summary | Indicates that a data node backup process is in progress in the current site. |
| Severity | minor |
| Condition | db_tier_ndb_backup_in_progress == 1 |
| Expression Validity | N/A |
| SNMP Trap ID | 2037 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: A data node backup is in progress in the current site.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.9 cnDBTier Bin Log Usage Alerts
This section provides details about the cnDBTier BinLog usage alerts.
Table 6-22 BINLOG_STORAGE_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered with minor severity when the total BinLog size of the SQL node is >= 70% and < 80% of the total SQL node disk size. |
| Summary | Disk storage on SQL node with node ID {{ $labels.node_id }} at {{ $value }} percent for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | minor |
| Condition | (avg_over_time( db_tier_binlog_used_bytes_percentage[5m]) >= 70) and (avg_over_time( db_tier_binlog_used_bytes_percentage[5m] ) < 80) |
| Expression Validity | 5m |
| SNMP Trap ID | 2007 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The total BinLog size of the SQL node is >= 70% and < 80% of the total SQL node disk size.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-23 BINLOG_STORAGE_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity when the total BinLog size of the SQL node is >= 80% and < 95% of the total SQL node disk size. |
| Summary | Disk storage on SQL node with node ID {{ $labels.node_id }} at {{ $value }} percent for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | major |
| Condition | (avg_over_time( db_tier_binlog_used_bytes_percentage[5m]) >= 80) and (avg_over_time( db_tier_binlog_used_bytes_percentage[5m]) < 95) |
| Expression Validity | 5m |
| SNMP Trap ID | 2036 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The total BinLog size of the SQL node is >= 80% and < 95% of the total SQL node disk size.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-24 BINLOG_STORAGE_FULL
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when the total BinLog size of the SQL node is >= 95% of the total SQL node disk size. |
| Summary | Disk storage on SQL node with node ID {{ $labels.node_id }} is full for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | avg_over_time( db_tier_binlog_used_bytes_percentage[5m]) >= 95 |
| Expression Validity | N/A |
| SNMP Trap ID | 2008 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The total BinLog size of the SQL node is >= 95% of the total SQL node disk size.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.10 cnDBTier Replication Alerts
This section provides details about cnDBTier replication alerts.
Table 6-25 REPLICATION_CHANNEL_DOWN
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity when an ACTIVE channel goes to the FAILED state. |
| Summary | Cross-site replication is down on node {{ $labels.node_id }} for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | major |
| Condition | (db_tier_replication_status{role="failed"} == 0) or (db_tier_replication_status{role="active"} == 0) |
| Expression Validity | N/A |
| SNMP Trap ID | 2005 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: An ACTIVE channel goes to the FAILED state when cross-site replication is down on a node.
Diagnostic Information: The following metrics provide information if the replication channel is down:
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-26 REPLICATION_FAILED
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when all the channels are in the STANDBY or FAILED state. |
| Summary | Cross-site replication is down for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | (count by (site_name, namespace, replchannel_group_id) (db_tier_replication_status) == count by (site_name, namespace, replchannel_group_id) (db_tier_replication_status{role="standby"})) or (count by (namespace, replchannel_group_id, site_name) (db_tier_replication_status) == count by (namespace, replchannel_group_id, site_name) (db_tier_replication_status{role="failed"})) |
| Expression Validity | N/A |
| SNMP Trap ID | 2006 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: All the channels are in the STANDBY or FAILED state because cross-site replication is down for the cnDBTier site.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
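The REPLICATION_FAILED condition compares, per replication channel group, the total number of `db_tier_replication_status` series with the number whose `role` label is `standby`, and separately with the number whose `role` is `failed`. The Python sketch below mirrors that counting logic as a reading aid. `failed_groups` and the input shape are hypothetical, and the sketch ignores sample values because the expression only counts series.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Group = Tuple[str, str, str]  # (site_name, namespace, replchannel_group_id)


def failed_groups(channels: Iterable[Tuple[Group, str]]) -> List[Group]:
    """Return groups where every channel is standby, or every channel is failed.

    channels yields one (group, role) pair per db_tier_replication_status
    series, where role is the value of the series' role label.
    """
    roles_by_group: Dict[Group, Counter] = defaultdict(Counter)
    for group, role in channels:
        roles_by_group[group][role] += 1

    firing = []
    for group, roles in roles_by_group.items():
        total = sum(roles.values())
        # Each side of the "or" in the expression tests a single role.
        if roles["standby"] == total or roles["failed"] == total:
            firing.append(group)
    return firing


if __name__ == "__main__":
    sample = [
        (("site1", "ns", "1"), "standby"),
        (("site1", "ns", "1"), "standby"),
        (("site1", "ns", "2"), "active"),
        (("site1", "ns", "2"), "standby"),
    ]
    print(failed_groups(sample))
```

Read this way, a group that mixes standby and failed channels matches neither side of the `or`, because each comparison counts only one role.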
Table 6-27 REPLICA_REPLICATION_DELAY_HIGH
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity when the last record read by the replica is at least five minutes and less than 48 hours behind the latest record written by the source. |
| Summary | Replica replication on SQL node at {{ $labels.replica_node_ip }} is {{ $value }} seconds behind the source for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | major |
| Condition | avg(avg_over_time(db_tier_replication_replica_delay[5m])) by (source_node_ip,replica_node_ip) >= 300 and avg(avg_over_time(db_tier_replication_replica_delay[5m])) by (source_node_ip,replica_node_ip) < 48*3600 |
| Expression Validity | 1m |
| SNMP Trap ID | 2009 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The last record read by the replica is at least five minutes and less than 48 hours behind the latest record written by the source.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-28 REPLICA_REPLICATION_FAILED
| Field | Details |
|---|---|
| Description | This alert is triggered when the last record read by the replica is more than 48 hours behind the latest record written by the source. |
| Summary | Replica replication has fallen more than 48 hours behind the source. Manual restore from backup may be required for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | avg(avg_over_time(db_tier_replication_replica_delay[5m])) by (source_node_ip,replica_node_ip) >= 48*3600 |
| Expression Validity | 1m |
| SNMP Trap ID | 2010 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The last record read by the replica is more than 48 hours behind the latest record written by the source.
Diagnostic Information: The
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
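The two replica delay alerts share one metric and hand over at 48 hours. The following Python sketch illustrates that hand-over. `replica_delay_alert` is a hypothetical helper, and its input is assumed to be the already averaged `db_tier_replication_replica_delay` value in seconds for one source and replica pair.

```python
from typing import Optional

DELAY_HIGH_SECONDS = 300          # five minutes
DELAY_FAILED_SECONDS = 48 * 3600  # 48 hours


def replica_delay_alert(avg_delay_seconds: float) -> Optional[str]:
    """Classify an averaged replication delay for one source/replica pair."""
    if avg_delay_seconds >= DELAY_FAILED_SECONDS:
        return "REPLICA_REPLICATION_FAILED"      # critical
    if avg_delay_seconds >= DELAY_HIGH_SECONDS:
        return "REPLICA_REPLICATION_DELAY_HIGH"  # major
    return None  # delay is under five minutes


if __name__ == "__main__":
    for delay in (120, 300, 3600, 48 * 3600):
        print(delay, replica_delay_alert(delay))
```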
Table 6-29 GEOREPLICATION_RECOVERY_FAILED
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when georeplication recovery fails on an unhealthy site where georeplication recovery was started. |
| Summary | Georeplication recovery has failed on cnDBTier Site {{ $labels.site_name }} from kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | db_tier_georeplication_recovery_state == 2 |
| Expression Validity | NA |
| SNMP Trap ID | 2033 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: Incorrect disk size, incorrect SSH key configurations, or other similar reasons.
Diagnostic Information: This alert indicates that georeplication recovery failed on an unhealthy site and replication could not be reestablished using the georeplication recovery procedure. This alert requires immediate attention.
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.11 cnDBTier Memory Usage Alerts
This section provides details about the cnDBTier memory usage alerts.
Table 6-30 LOW_MEMORY
| Field | Details |
|---|---|
| Description | This alert is triggered when the RAM usage of any node is greater than or equal to 80%. |
| Summary | Node ID {{ $labels.node_id }}, memory utilization at {{ $value }} percent for memory type {{ $labels.memory_type }} for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | major |
| Condition | ((avg_over_time(db_tier_memory_used_bytes{memory_type="Data memory"}[1m]) / avg_over_time(db_tier_memory_total_bytes{memory_type="Data memory"}[1m])) * 100) >= 80 |
| Expression Validity | 1m |
| SNMP Trap ID | 2003 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The RAM or memory usage of a node reached the major threshold level.
Diagnostic Information: Check whether the memory usage reported by the following metrics is too high:
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-31 OUT_OF_MEMORY
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when the RAM usage of any node is greater than or equal to 90%. |
| Summary | Node ID {{ $labels.node_id }} out of memory for memory type {{ $labels.memory_type }} for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | ((avg_over_time(db_tier_memory_used_bytes{memory_type="Data memory"}[1m]) / avg_over_time(db_tier_memory_total_bytes{memory_type="Data memory"}[1m])) * 100) >= 90 |
| Expression Validity | 1m |
| SNMP Trap ID | 2004 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The RAM or memory usage of a node reached the critical threshold level.
Diagnostic Information: Check whether the memory usage reported by the following metrics is too high:
Recommended Actions:
Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
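Both memory alerts evaluate the same ratio of used to total data memory, and the LOW_MEMORY condition has no upper bound. The Python sketch below shows which conditions hold for a given ratio. `memory_alerts` is a hypothetical helper, and its inputs are assumed to be the one-minute averages of `db_tier_memory_used_bytes` and `db_tier_memory_total_bytes` for a single node.

```python
from typing import List


def memory_alerts(used_bytes: float, total_bytes: float) -> List[str]:
    """Return the memory alerts whose conditions hold for one node."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    percent = used_bytes * 100 / total_bytes
    firing = []
    if percent >= 80:
        firing.append("LOW_MEMORY")     # major
    if percent >= 90:
        firing.append("OUT_OF_MEMORY")  # critical
    return firing


if __name__ == "__main__":
    gib = 1024 ** 3
    print(memory_alerts(5 * gib, 10 * gib))
    print(memory_alerts(85 * gib, 100 * gib))
    print(memory_alerts(92 * gib, 100 * gib))
```

As the conditions are written, usage at or above 90% satisfies both expressions, so the sketch returns both alert names in that case.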
6.12 cnDBTier CPU Usage Alerts
This section provides details about cnDBTier CPU usage alerts.
Table 6-32 HIGH_CPU
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity when the CPU usage of any data node is greater than or equal to 80%, and less than 90%. |
| Summary | Node ID {{ $labels.node_id }} CPU utilization at {{ $value }} for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | major |
| Condition | ((100 - (avg(avg_over_time(db_tier_cpu_os_idle[10m])) by (node_id))) >= 80) and ((100 - (avg(avg_over_time(db_tier_cpu_os_idle[10m])) by (node_id))) < 90) |
| Expression Validity | 1m |
| SNMP Trap ID | 2002 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The CPU utilization of a data node is greater than or equal to 80% and less than 90%.
Diagnostic Information: Check the CPU threshold level status from the cnDBTier worker pod logs.
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-33 HIGH_CPU
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when the CPU usage of any data node is greater than or equal to 90%. |
| Summary | Node ID {{ $labels.node_id }} CPU utilization at {{ $value }} for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | (100 - (avg(avg_over_time(db_tier_cpu_os_idle[10m])) by (node_id))) >= 90 |
| Expression Validity | 1m |
| SNMP Trap ID | 2035 |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: The CPU utilization of a data node is greater than or equal to 90%.
Diagnostic Information: Check the CPU threshold level status from the cnDBTier worker pod logs.
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
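
Both CPU conditions derive utilization as 100 minus the average idle percentage and then apply the 80% and 90% thresholds. The Python sketch below illustrates that calculation under simplifying assumptions: the function names and the sample idle values are hypothetical, and the nested `avg(avg_over_time(...)) by (node_id)` aggregation is reduced to a plain mean over the idle samples of one node.

```python
from statistics import mean


def cpu_usage_percent(idle_samples):
    """Return 100 minus the mean idle percentage, as in the Condition rows."""
    if not idle_samples:
        raise ValueError("at least one idle sample is required")
    return 100.0 - mean(idle_samples)


def cpu_alert_severity(idle_samples):
    """Classify utilization with the thresholds from Tables 6-32 and 6-33.

    The major expression is bounded above by 90, so the two severities
    are mutually exclusive.
    """
    usage = cpu_usage_percent(idle_samples)
    if usage >= 90:
        return "critical"
    if usage >= 80:
        return "major"
    return None


if __name__ == "__main__":
    # Hypothetical idle percentages collected for one data node.
    print(cpu_alert_severity([15, 15, 15]))
```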
6.13 cnDBTier Node Status Alerts
This section provides details about cnDBTier node status alerts.
Table 6-34 NODE_DOWN
| Field | Details |
|---|---|
| Description | This alert is raised with critical severity when the data node is down. db_tier_node_status value: |
| Summary | MySQL {{ $labels.node_type }} node having node id {{ $labels.node_id }} is down for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | db_tier_node_status == 0 |
| Expression Validity | N/A |
| SNMP Trap ID | 2001 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause:
Diagnostic Information:
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.14 cnDBTier Node Data Volume Alerts
This section provides details about cnDBTier node data volume alerts.
Table 6-35 DB_TIER_API_SEND_NODE_DATA_VOLUME_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered when any NDB application node sends less data to NDB when compared to the other NDB application nodes. |
| Summary | Send Node Data Volume Low for API Node ID {{ $labels.remote_node_id }} at kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | (((sum by (remote_node_id,namespace) (avg_over_time(rate(db_tier_node_transporter_bytes_received{node_type=~"ndbapp_node",namespace="<${CNDBTIER_NAMESPACE}>"}[5m])[15m:5m])))/scalar(sum (avg_over_time(rate(db_tier_node_transporter_bytes_received{node_type=~"ndbapp_node",namespace="<${CNDBTIER_NAMESPACE}>"}[5m])[15m:5m]))))*100) < (100/(scalar(count(count by (remote_node_id) (db_tier_node_transporter_bytes_received{node_type=~"ndbapp_node",namespace="<${CNDBTIER_NAMESPACE}>"})))*1.6)) |
| Expression Validity | NA |
| SNMP Trap ID | 3001 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: An NDB application node sends less data to NDB than the other NDB application nodes.
Diagnostic Information: The alert indicates that the NDB application node is slow. Check the underlying infrastructure.
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-36 DB_TIER_API_RECEIVE_NODE_DATA_VOLUME_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered when any NDB sends less data to any specific NDB application node when compared to the other NDB application nodes. |
| Summary | Receive Node Data Volume Low for API Node ID {{ $labels.remote_node_id }} at kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | (((sum by (remote_node_id,namespace) (avg_over_time(rate(db_tier_node_transporter_bytes_sent{node_type=~"ndbapp_node",namespace="<${CNDBTIER_NAMESPACE}>"}[5m])[15m:5m])))/scalar(sum (avg_over_time(rate(db_tier_node_transporter_bytes_sent{node_type=~"ndbapp_node",namespace="<${CNDBTIER_NAMESPACE}>"}[5m])[15m:5m]))))*100) < (100/(scalar(count(count by (remote_node_id) (db_tier_node_transporter_bytes_sent{node_type=~"ndbapp_node",namespace="<${CNDBTIER_NAMESPACE}>"})))*1.6)) |
| Expression Validity | NA |
| SNMP Trap ID | 3002 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: NDB sends less data to a specific NDB application node than to the other NDB application nodes, either because communication is intermittently or fully interrupted or because the underlying platform is slow.
Diagnostic Information: The alert indicates that the NDB application node or the underlying infrastructure is slow, or that communication is intermittently or fully interrupted. Check the underlying infrastructure.
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-37 DB_TIER_SEND_DATA_NODE_DATA_VOLUME_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered when any NDB data node does not send traffic data at the required speed or sends it more slowly than another data node. |
| Summary | Send Data Node Data Volume Low for DATA Node ID {{ $labels.node_id }} at kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | ((sum by (node_id,namespace) (avg_over_time(rate(db_tier_node_transporter_bytes_sent{namespace="<${CNDBTIER_NAMESPACE}>"}[5m])[15m:5m]))/scalar(sum (avg_over_time(rate(db_tier_node_transporter_bytes_sent{namespace="<${CNDBTIER_NAMESPACE}>"}[5m])[15m:5m])))) * 100) < (100/(scalar(count(count by (node_id) (db_tier_node_transporter_bytes_sent{namespace="<${CNDBTIER_NAMESPACE}>"})))*1.6)) |
| Expression Validity | NA |
| SNMP Trap ID | 3003 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: An NDB data node does not send traffic data at the required speed or sends it more slowly than another data node.
Diagnostic Information: The alert indicates that the NDB application node is slow. Check the underlying infrastructure.
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-38 DB_TIER_RECEIVE_DATA_NODE_DATA_VOLUME_LOW
| Field | Details |
|---|---|
| Description | This alert is triggered when any NDB data node does not receive traffic data at the required speed or receives it more slowly than another data node. |
| Summary | Receive Data Node Data Volume Low for DATA Node ID {{ $labels.node_id }} at kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | ((sum by (node_id,namespace) (avg_over_time(rate(db_tier_node_transporter_bytes_received{namespace="<${CNDBTIER_NAMESPACE}>"}[5m])[15m:5m]))/scalar(sum (avg_over_time(rate(db_tier_node_transporter_bytes_received{namespace="<${CNDBTIER_NAMESPACE}>"}[5m])[15m:5m])))) * 100)< (100/(scalar(count(count by (node_id) (db_tier_node_transporter_bytes_received{namespace="<${CNDBTIER_NAMESPACE}>"})))*1.6)) |
| Expression Validity | NA |
| SNMP Trap ID | 3004 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: An NDB data node does not receive traffic data at the required speed or receives it more slowly than another data node.
Diagnostic Information: The alert indicates that the NDB application node is slow. Check the underlying infrastructure.
Recommended Actions:
For any assistance, contact My Oracle Support.
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
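
The four data volume conditions in Tables 6-35 to 6-38 share one shape: a node's percentage share of the total transfer rate is compared against `100 / (N * 1.6)`, where N is the number of nodes. In other words, a node is flagged when its share drops below 62.5% of an equal share. The Python sketch below illustrates that comparison; the function name, the `factor` parameter, and the sample rates are hypothetical, and the input is assumed to hold per-node rates that have already been averaged.

```python
def low_volume_nodes(rates_by_node, factor=1.6):
    """Return the node IDs whose share of the total rate is below the threshold.

    rates_by_node maps a node ID to its averaged bytes-per-second rate.
    The threshold mirrors 100 / (node_count * 1.6) from the Condition rows.
    """
    total = sum(rates_by_node.values())
    if total <= 0:
        # With no traffic at all, no share can be computed and nothing is flagged.
        return []
    threshold = 100.0 / (len(rates_by_node) * factor)
    return sorted(
        node
        for node, rate in rates_by_node.items()
        if rate / total * 100.0 < threshold
    )


if __name__ == "__main__":
    # Four hypothetical nodes: the threshold is 100 / (4 * 1.6) = 15.625%.
    rates = {1: 300.0, 2: 300.0, 3: 300.0, 4: 100.0}
    print(low_volume_nodes(rates))
```

In this example, node 4 carries a 10% share, which is below the 15.625% threshold, while the other nodes carry 30% each.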
Table 6-39 DB_TIER_DATA_NODE_SCAN_FRAGMENT_SLOW
| Field | Details |
|---|---|
| Description | This alert is triggered when any data node scan fragment is slow when compared with other data nodes. |
| Summary | Scan Fragment is Slow for DATA Node ID {{ $labels.node_id }} at kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | (((sum by (node_id,comm_node_id,namespace) (rate(db_tier_tc_time_track_stats_total_scan_fragments_time{path_type="INTERNAL",namespace="<${CNDBTIER_NAMESPACE}>"}[15m]))/sum by (node_id,comm_node_id,namespace) (rate(db_tier_tc_time_track_stats_total_scan_fragments_count{path_type="INTERNAL",namespace="<${CNDBTIER_NAMESPACE}>"}[15m])))/ (scalar(sum(sum by (node_id,comm_node_id,namespace) (rate(db_tier_tc_time_track_stats_total_scan_fragments_time{path_type="INTERNAL",namespace="<${CNDBTIER_NAMESPACE}>"}[15m]))/sum by (node_id,comm_node_id,namespace) (rate(db_tier_tc_time_track_stats_total_scan_fragments_count{path_type="INTERNAL",namespace="<${CNDBTIER_NAMESPACE}>"}[15m]))))))*100) > ((100/scalar(count(sum by (node_id,comm_node_id,namespace)(db_tier_tc_time_track_stats_total_scan_fragments_time{path_type="INTERNAL",namespace="<${CNDBTIER_NAMESPACE}>"}))))*1.6) |
| Expression Validity | NA |
| SNMP Trap ID | 3005 |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The scan fragment for a particular data node is slow.
Diagnostic Information: The alert indicates that the NDB application node is slow. Check the underlying infrastructure.
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
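
The scan fragment condition works in the opposite direction from the data volume conditions: the average scan fragment time of each (node_id, comm_node_id) group is expressed as a percentage of the sum of all group averages, and the alert fires when that share exceeds `(100 / N) * 1.6`, where N is the number of groups. The Python sketch below illustrates the comparison; the function name, the `factor` parameter, and the sample timings are hypothetical, and the input is assumed to already hold the per-group average (time rate divided by count rate).

```python
def slow_scan_groups(avg_scan_time_by_group, factor=1.6):
    """Return the groups whose share of the summed averages exceeds the threshold.

    avg_scan_time_by_group maps a (node_id, comm_node_id) tuple to the
    average scan fragment time of that group.
    """
    total = sum(avg_scan_time_by_group.values())
    if total <= 0:
        # No measurable scan time, so no share can be computed.
        return []
    threshold = 100.0 / len(avg_scan_time_by_group) * factor
    return sorted(
        group
        for group, avg_time in avg_scan_time_by_group.items()
        if avg_time / total * 100.0 > threshold
    )


if __name__ == "__main__":
    # Four hypothetical groups: the threshold is (100 / 4) * 1.6 = 40%.
    timings = {(1, 5): 10.0, (2, 5): 10.0, (3, 5): 10.0, (4, 5): 50.0}
    print(slow_scan_groups(timings))
```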
6.15 cnDBTier Certificate Expiry Alerts
This section provides details about cnDBTier certificate expiry alerts.
Table 6-40 DBTIER_CERTIFICATE_EXPIRY_INFO
| Field | Details |
|---|---|
| Description | This alert is triggered with info severity whenever a cnDBTier certificate is set to expire within the next 90 days. |
| Summary | dbtier Certificate {{ $labels.certType }} for {{ $labels.hostname }} is expiring within 90 days for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | info |
| Condition | (db_tier_cert_expiry / 1000 - time()) > 2592000 and (db_tier_cert_expiry / 1000 - time()) <= 7776000 |
| Expression Validity | NA |
| OID | 1.3.6.1.4.1.323.5.3.50.1.2.2045 |
| Metric Used | db_tier_cert_expiry |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: A cnDBTier certificate is going to expire in the next 90 days.
Diagnostic Information: This alert is triggered with info severity.
Recommended Actions:
|
| Available in OCI | Yes |
Table 6-41 DBTIER_CERTIFICATE_EXPIRY_MAJOR
| Field | Details |
|---|---|
| Description | This alert is triggered with major severity whenever a cnDBTier certificate is set to expire within the next 30 days. |
| Summary | dbtier Certificate {{ $labels.certType }} for {{ $labels.hostname }} is expiring within 30 days for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | major |
| Condition | (db_tier_cert_expiry / 1000 - time()) > 604800 and (db_tier_cert_expiry / 1000 - time()) <= 2592000 |
| OID | 1.3.6.1.4.1.323.5.3.50.1.2.2040 |
| Metric Used | db_tier_cert_expiry |
| Expression Validity | NA |
| Affects Service (Y/N) | N |
| Recommended Action |
Cause: A cnDBTier certificate is going to expire in the next 30 days.
Diagnostic Information: This alert is triggered with major severity.
Recommended Actions:
|
| Available in OCI | Yes |
Table 6-42 DBTIER_CERTIFICATE_EXPIRY_CRITICAL
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity whenever a cnDBTier certificate is set to expire within the next 7 days. |
| Summary | dbtier Certificate {{ $labels.certType }} for {{ $labels.hostname }} is expiring within 7 days for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | (db_tier_cert_expiry / 1000 - time()) > 0 and (db_tier_cert_expiry / 1000 - time()) <= 604800 |
| OID | 1.3.6.1.4.1.323.5.3.50.1.2.2041 |
| Metric Used | db_tier_cert_expiry |
| Expression Validity | NA |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: A cnDBTier certificate is going to expire in the next 7 days.
Diagnostic Information: This alert is triggered with critical severity.
Recommended Actions:
|
| Available in OCI | Yes |
Table 6-43 DBTIER_CERTIFICATE_EXPIRED
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when any cnDBTier certificate has expired. |
| Summary | dbtier Certificate {{ $labels.certType }} for {{ $labels.hostname }} is expired for cnDBTier site {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | (db_tier_cert_expiry / 1000 - time()) <= 0 |
| OID | 1.3.6.1.4.1.323.5.3.50.1.2.2041 |
| Metric Used | db_tier_cert_expiry |
| Expression Validity | NA |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: A cnDBTier certificate has expired.
Diagnostic Information: This alert is triggered with critical severity.
Recommended Actions:
|
| Available in OCI | Yes |
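
The four certificate conditions partition the remaining certificate lifetime into windows: 7776000, 2592000, and 604800 seconds correspond to 90, 30, and 7 days. The Python sketch below illustrates how a given expiry value maps to one of the alerts. The function name and the sample timestamps are hypothetical. Because the expressions divide `db_tier_cert_expiry` by 1000 before subtracting `time()`, the sketch assumes the metric is an epoch timestamp in milliseconds; this is an inference from the expressions, not a documented fact.

```python
SECONDS_PER_DAY = 86400


def certificate_alert(expiry_ms, now_s):
    """Return the alert name matching the remaining lifetime, or None.

    expiry_ms: certificate expiry as epoch milliseconds (assumed unit).
    now_s: current time as epoch seconds, the equivalent of time().
    """
    remaining = expiry_ms / 1000 - now_s
    if remaining <= 0:
        return "DBTIER_CERTIFICATE_EXPIRED"
    if remaining <= 7 * SECONDS_PER_DAY:       # 604800 seconds
        return "DBTIER_CERTIFICATE_EXPIRY_CRITICAL"
    if remaining <= 30 * SECONDS_PER_DAY:      # 2592000 seconds
        return "DBTIER_CERTIFICATE_EXPIRY_MAJOR"
    if remaining <= 90 * SECONDS_PER_DAY:      # 7776000 seconds
        return "DBTIER_CERTIFICATE_EXPIRY_INFO"
    return None


if __name__ == "__main__":
    now = 1_700_000_000  # hypothetical current time in epoch seconds
    expiry = (now + 45 * SECONDS_PER_DAY) * 1000
    print(certificate_alert(expiry, now))
```

The windows do not overlap, so at most one of the four alerts applies to a given certificate at any time.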
6.16 cnDBTier PVC Health Alerts
This section provides details about cnDBTier PVC health alerts.
Table 6-44 PVC_NOT_ACCESSIBLE
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when the value of the db_tier_pvc_is_accesible metric is zero. |
| Summary | PVC is not accessible on cnDBTier site {{ $labels.site_name }} |
| Severity | critical |
| Condition | db_tier_pvc_is_accesible == 0 |
| OID | 1.3.6.1.4.1.323.5.3.50.1.2.2029 |
| Metric Used | db_tier_pvc_is_accesible |
| Expression Validity | 1m |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The PVC is not accessible for read or write operations.
Diagnostic Information: The
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Table 6-45 PVC_STORAGE_FULL
| Field | Details |
|---|---|
| Description | The PVC_STORAGE_FULL alert is triggered with critical severity when a pod's PVC reaches full capacity. |
| Summary | PVC is not accessible on cnDBTier site {{ $labels.site_name }} |
| Severity | critical |
| Condition | db_tier_pvc_is_accesible == 0 |
| OID | 1.3.6.1.4.1.323.5.3.50.1.2.2029 |
| Metric Used | db_tier_pvc_is_accesible |
| Expression Validity | 1m |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: The PVC has reached full capacity, preventing further write operations.
Diagnostic Information: The system detects that the PVC has no available space, leading to storage-related failures.
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.17 cnDBTier Backup Manager Svc Down Alerts
This section provides details about cnDBTier backup manager Svc down alerts.
Table 6-46 DB_BACKUP_MANAGER_SVC_DOWN
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when the db_backup_manager_svc pod is down. |
| Summary | PVC is not accessible on cnDBTier site {{ $labels.site_name }} |
| Severity | critical |
| Condition | kube_deployment_status_replicas_available{deployment=~".*db-backup-manager-svc.*"} == 0 |
| OID | 1.3.6.1.4.1.323.5.3.50.1.2.2039 |
| Metric Used | kube_deployment_status_replicas_available{deployment=~".*db-backup-manager-svc.*"} |
| Expression Validity | 1m |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: When
Diagnostic Information: The system detects that the
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
6.18 cnDBTier Forced Switchover Disabled Alerts
This section provides details about cnDBTier forced switchover disabled alerts.
Table 6-47 DB_TIER_FORCED_SWITCHOVER_DISABLED
| Field | Details |
|---|---|
| Description | This alert is triggered with critical severity when switchover is forcefully disabled. |
| Summary | dbtier switchover is disabled forcefully for cnDBTier {{ $labels.site_name }} and kubernetes namespace {{ $labels.namespace }} |
| Severity | critical |
| Condition | kube_deployment_status_replicas_available{deployment=~".*db-backup-manager-svc.*"} == 0 |
| OID | 1.3.6.1.4.1.323.5.3.50.1.2.2039 |
| Metric Used | kube_deployment_status_replicas_available{deployment=~".*db-backup-manager-svc.*"} |
| Expression Validity | 1m |
| Affects Service (Y/N) | Y |
| Recommended Action |
Cause: Switchover is forcefully disabled.
Diagnostic Information: The alert informs the operator that switchover is currently disabled and needs to be updated.
Recommended Actions:
Note: Use the CNC NF Data Collector tool to capture logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |