7.1.1 NEF Alerts

This chapter includes information about the following NEF alerts:

Note:

  • The performance and capacity of the NEF system may vary based on the call model, feature or interface configuration, and underlying CNE and hardware environment.
  • Due to the unavailability of the required metrics or MQL queries, the following alerts are not supported for OCI:
    • OcnefNfStatusUnavailable
    • OcnefPodsRestart
    • OcnefIngressGatewayServiceDown
    • OcnefApiRouterServiceDown
    • OcnefFiveGcAgentServiceDown
    • OcnefMonitoringEventServiceDown
    • OcnefCCFClientServiceDown
    • OcnefExpiryAuditorServiceDown
    • OcnefQOSServiceDown
    • OcnefTIServiceDown
    • OcnefDTServiceDown
    • OcnefEgressGatewayServiceDown
    • OcnefMemoryUsageCrossedMinorThreshold
    • OcnefMemoryUsageCrossedMajorThreshold
    • FiveGcInvalidConfiguration
    • OcnefAllSiteStatus
    • OcnefDBReplicationStatus

7.1.1.1 System Level Alerts

This section lists the system level alerts for NEF.

7.1.1.1.1 OcnefNfStatusUnavailable

Table 7-1 OcnefNfStatusUnavailable

Field Details
Description 'NEF services unavailable'
Summary "kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : All NEF services are unavailable."
Severity Critical
Condition All the NEF services are unavailable, either because the NEF is being deployed or because it is being purged.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7001
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions The alert is cleared automatically when the NEF services restart.

Steps:

  1. Check for service-specific alerts that may be causing issues with service exposure.
  2. Check the pod status and analyze any pod that is not in the Running state:
    1. Run the following command to check the pod status:
      $ kubectl get po -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  3. Refer to the application logs on Kibana and check for database-related failures such as connectivity issues and invalid secrets. The logs can be filtered based on the services.
  4. Check the Helm status to make sure there are no errors:
    $ helm status <helm release name of the desired NF> -n <namespace>

    If the status is not "DEPLOYED", capture the logs and events again.

  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
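
As a supplementary check, the 'up' metric that drives this alert can be queried directly through the Prometheus HTTP API. This is a minimal sketch; the Prometheus host, port, and the kubernetes_namespace label name are assumptions that may differ in your monitoring setup.

  $ curl -sG 'http://<prometheus-host>:<prometheus-port>/api/v1/query' \
      --data-urlencode 'query=up{kubernetes_namespace="<namespace>"}'

A value of 0, or a missing series, for a NEF pod indicates that the instance is not reported as available.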

7.1.1.1.2 OcnefPodsRestart

Table 7-2 OcnefPodsRestart

Field Details
Description 'Pod <Pod Name> has restarted.'
Summary "kubernetes_namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : A Pod has restarted"
Severity Major
Condition A pod belonging to any of the NEF services has restarted.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7002
Metric Used kube_pod_container_status_restarts_total
Recommended Actions

The alert is cleared automatically if the specific pod is up.

Steps:

  1. Refer to the application logs on Kibana, filter based on the pod name, and check for database-related failures such as connectivity issues and invalid Kubernetes secrets.
  2. To check the orchestration logs for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get po -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  3. Check the database status. For more information, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
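
The following commands can help identify which pod restarted and why. This is a minimal sketch; adjust the pod and namespace names to your deployment.

  $ kubectl get pods -n <namespace> --sort-by='.status.containerStatuses[0].restartCount'
  $ kubectl logs <pod name> -n <namespace> --previous
  $ kubectl get events -n <namespace> --sort-by='.lastTimestamp'

The --previous flag prints the logs of the last terminated container, which usually contains the error that caused the restart.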

7.1.1.1.3 OcnefTotalExternalIngressTrafficRateAboveMinorThreshold

Table 7-3 OcnefTotalExternalIngressTrafficRateAboveMinorThreshold

Field Details
Description OCNEF External Ingress traffic rate is above the configured minor threshold i.e. 800 TPS (current value is: {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic rate is above 80 percent of max TPS (1000)"
Severity Minor
Condition The total NEF External Ingress traffic rate has crossed the configured minor threshold of 800 TPS.

The default trigger point for this alert in the NefAlertrules alert file is 80% of 1000 TPS (the maximum ingress request rate).

OID 1.3.6.1.4.1.323.5.3.39.1.2.7003
Metric Used oc_ingressgateway_http_requests_total
Recommended Actions The alert is cleared either when the total External Ingress traffic rate falls below the minor threshold or when the total traffic rate crosses the major threshold, in which case the OcnefTotalExternalIngressTrafficRateAboveMajorThreshold alert is raised.

Note: The threshold is configurable in the NefAlertrules alert file.

Reassess why the NEF is receiving additional traffic. If this alert is unexpected, contact My Oracle Support.
Steps:
  1. Refer to Grafana to determine which service is receiving high traffic (a sample rate query is shown after these steps).
  2. Refer to the Ingress Gateway section in Grafana to check for an increase in 4xx and 5xx error codes.
  3. Check the Ingress Gateway logs on Kibana to determine the reason for the errors.
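
The current External Ingress request rate can be cross-checked against the metric used by this alert. This is a minimal sketch; the Prometheus host, port, and the 5-minute rate window are assumptions, and the exact expression in the NefAlertrules file may differ.

  $ curl -sG 'http://<prometheus-host>:<prometheus-port>/api/v1/query' \
      --data-urlencode 'query=sum(rate(oc_ingressgateway_http_requests_total[5m]))'

Compare the returned value against the configured minor (800 TPS), major (900 TPS), and critical (950 TPS) thresholds.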

7.1.1.1.4 OcnefTotalFivegcIngressTrafficRateAboveMinorThreshold

Table 7-4 OcnefTotalFivegcIngressTrafficRateAboveMinorThreshold

Field Details
Description OCNEF Fivegc Ingress traffic rate is above the configured minor threshold i.e. 800 TPS (current value is: {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic rate is above 80 percent of max TPS (1000)"
Severity Minor
Condition The total NEF Fivegc Ingress traffic rate has crossed the configured minor threshold of 800 TPS.

The default trigger point for this alert in the NefAlertrules alert file is 80% of 1000 TPS (the maximum ingress request rate).

OID 1.3.6.1.4.1.323.5.3.39.1.2.7004
Metric Used oc_ingressgateway_http_requests_total
Recommended Actions The alert is cleared either when the total Fivegc Ingress traffic rate falls below the minor threshold or when the total traffic rate crosses the major threshold, in which case the OcnefTotalFivegcIngressTrafficRateAboveMajorThreshold alert is raised.

Note: The threshold is configurable in the NefAlertrules alert file.

Reassess why the NEF is receiving additional traffic. If this alert is unexpected, contact My Oracle Support.
Steps:
  1. Refer to Grafana to determine which service is receiving high traffic.
  2. Refer to the Ingress Gateway section in Grafana to check for an increase in 4xx and 5xx error codes.
  3. Check the Ingress Gateway logs on Kibana to determine the reason for the errors.

7.1.1.1.5 OcnefTotalExternalIngressTrafficRateAboveMajorThreshold

Table 7-5 OcnefTotalExternalIngressTrafficRateAboveMajorThreshold

Field Details
Description OCNEF External Ingress traffic rate is above the configured major threshold i.e. 900 TPS (current value is: {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic rate is above 90 percent of max TPS (1000)"
Severity Major
Condition The total NEF External Ingress traffic rate has crossed the configured major threshold of 900 TPS.

The default trigger point for this alert in the NefAlertrules alert file is 90% of 1000 TPS (the maximum ingress request rate).

OID 1.3.6.1.4.1.323.5.3.39.1.2.7005
Metric Used oc_ingressgateway_http_requests_total
Recommended Actions The alert is cleared either when the total External Ingress traffic rate falls below the major threshold or when the total traffic rate crosses the critical threshold, in which case the OcnefTotalExternalIngressTrafficRateAboveCriticalThreshold alert is raised.

Note: The threshold is configurable in the NefAlertrules alert file.

Reassess why the NEF is receiving additional traffic. If this alert is unexpected, contact My Oracle Support.
Steps:
  1. Refer to Grafana to determine which service is receiving high traffic.
  2. Refer to the Ingress Gateway section in Grafana to check for an increase in 4xx and 5xx error codes.
  3. Check the Ingress Gateway logs on Kibana to determine the reason for the errors.

7.1.1.1.6 OcnefTotalFivegcIngressTrafficRateAboveMajorThreshold

Table 7-6 OcnefTotalFivegcIngressTrafficRateAboveMajorThreshold

Field Details
Description OCNEF Fivegc Ingress traffic rate is above the configured major threshold i.e. 900 TPS (current value is: {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic rate is above 90 percent of max TPS (1000)"
Severity Major
Condition The total NEF Fivegc Ingress traffic rate has crossed the configured major threshold of 900 TPS.

The default trigger point for this alert in the NefAlertrules alert file is 90% of 1000 TPS (the maximum ingress request rate).

OID 1.3.6.1.4.1.323.5.3.39.1.2.7006
Metric Used oc_ingressgateway_http_requests_total
Recommended Actions The alert is cleared either when the total Fivegc Ingress traffic rate falls below the major threshold or when the total traffic rate crosses the critical threshold, in which case the OcnefTotalFivegcIngressTrafficRateAboveCriticalThreshold alert is raised.

Note: The threshold is configurable in the NefAlertrules alert file.

Reassess why the NEF is receiving additional traffic. If this alert is unexpected, contact My Oracle Support.
Steps:
  1. Refer to Grafana to determine which service is receiving high traffic.
  2. Refer to the Ingress Gateway section in Grafana to check for an increase in 4xx and 5xx error codes.
  3. Check the Ingress Gateway logs on Kibana to determine the reason for the errors.

7.1.1.1.7 OcnefTotalExternalIngressTrafficRateAboveCriticalThreshold

Table 7-7 OcnefTotalExternalIngressTrafficRateAboveCriticalThreshold

Field Details
Description OCNEF External Ingress traffic rate is above the configured critical threshold i.e. 950 TPS (current value is: {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is above 95 percent of max TPS (1000)"
Severity Critical
Condition The total NEF External Ingress traffic rate has crossed the configured critical threshold of 950 TPS.

The default trigger point for this alert in the NefAlertrules alert file is 95% of 1000 TPS (the maximum ingress request rate).

OID 1.3.6.1.4.1.323.5.3.39.1.2.7007
Metric Used oc_ingressgateway_http_requests_total
Recommended Actions The alert is cleared when the total External Ingress traffic rate falls below the critical threshold.

Note: The threshold is configurable in the NefAlertrules alert file.

Reassess why the NEF is receiving additional traffic. If this alert is unexpected, contact My Oracle Support.
Steps:
  1. Refer to Grafana to determine which service is receiving high traffic.
  2. Refer to the Ingress Gateway section in Grafana to check for an increase in 4xx and 5xx error codes.
  3. Check the Ingress Gateway logs on Kibana to determine the reason for the errors.

7.1.1.1.8 OcnefTotalFivegcIngressTrafficRateAboveCriticalThreshold

Table 7-8 OcnefTotalFivegcIngressTrafficRateAboveCriticalThreshold

Field Details
Description OCNEF Fivegc Ingress traffic rate is above the configured critical threshold i.e. 950 TPS (current value is: {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is above 95 percent of max TPS (1000)"
Severity Critical
Condition The total NEF Fivegc Ingress traffic rate has crossed the configured critical threshold of 950 TPS.

The default trigger point for this alert in the NefAlertrules alert file is 95% of 1000 TPS (the maximum ingress request rate).

OID 1.3.6.1.4.1.323.5.3.39.1.2.7008
Metric Used oc_ingressgateway_http_requests_total
Recommended Actions The alert is cleared when the total Fivegc Ingress traffic rate falls below the critical threshold.

Note: The threshold is configurable in the NefAlertrules alert file.

Reassess why the NEF is receiving additional traffic. If this alert is unexpected, contact My Oracle Support.
Steps:
  1. Refer to Grafana to determine which service is receiving high traffic.
  2. Refer to the Ingress Gateway section in Grafana to check for an increase in 4xx and 5xx error codes.
  3. Check the Ingress Gateway logs on Kibana to determine the reason for the errors.

7.1.1.1.9 OcnefExternalIngressTransactionErrorRateAboveZeroPointOnePercent

Table 7-9 OcnefExternalIngressTransactionErrorRateAboveZeroPointOnePercent

Field Details
Description External Ingress transaction error rate is above 0.1 percent (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction Error rate detected above 0.1 percent of total transactions"
Severity Warning
Condition The number of failed external ingress transactions is above 0.1 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7009
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed External Ingress transactions is below 0.1 percent of the total transactions or when the number of failed transactions crosses the 1% threshold, in which case the OcnefExternalIngressTransactionErrorRateAbove1Percent alert is raised.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method (a sample error-rate query is shown after these steps).
  3. If guidance is required, contact My Oracle Support.
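
The transaction error percentage can be estimated from the same metric that drives this alert. This is a minimal sketch; the Prometheus host and port are placeholders, and the response-code label (shown here as Status) is an assumption, so verify the actual label name exposed on the metric.

  # NOTE: the "Status" label name is an assumption; check the labels on oc_ingressgateway_http_responses_total.
  $ curl -sG 'http://<prometheus-host>:<prometheus-port>/api/v1/query' \
      --data-urlencode 'query=100 * sum(rate(oc_ingressgateway_http_responses_total{Status=~"4..|5.."}[5m])) / sum(rate(oc_ingressgateway_http_responses_total[5m]))'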

7.1.1.1.10 OcnefFivegcIngressTransactionErrorRateAboveZeroPointOnePercent

Table 7-10 OcnefFivegcIngressTransactionErrorRateAboveZeroPointOnePercent

Field Details
Description Fivegc Ingress transaction error rate is above 0.1 percent of total transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 0.1 percent of total transactions"
Severity Warning
Condition The number of failed Fivegc ingress transactions is above 0.1 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7010
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed Fivegc Ingress transactions is below 0.1 percent of the total transactions or when the number of failed transactions crosses the 1% threshold, in which case the OcnefFivegcIngressTransactionErrorRateAbove1Percent alert is raised.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.11 OcnefExternalIngressTransactionErrorRateAbove1Percent

Table 7-11 OcnefExternalIngressTransactionErrorRateAbove1Percent

Field Details
Description External Ingress transaction error rate is above 1 percent of total transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 1 percent of total transactions"
Severity Warning
Condition The number of failed External Ingress transactions is above 1 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7011
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed External Ingress transactions is below 1 percent of the total transactions or when the number of failed transactions crosses the 10% threshold, in which case the OcnefExternalIngressTransactionErrorRateAbove10Percent alert is raised.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.12 OcnefFivegcIngressTransactionErrorRateAbove1Percent

Table 7-12 OcnefFivegcIngressTransactionErrorRateAbove1Percent

Field Details
Description Fivegc Ingress transaction error rate is above 1 percent of total Fivegc Ingress transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 1 percent of total transactions"
Severity Warning
Condition The number of failed Fivegc Ingress transactions is above 1 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7012
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed Fivegc Ingress transactions is below 1 percent of the total transactions or when the number of failed transactions crosses the 10% threshold, in which case the OcnefFivegcIngressTransactionErrorRateAbove10Percent alert is raised.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.13 OcnefExternalIngressTransactionErrorRateAbove10Percent

Table 7-13 OcnefExternalIngressTransactionErrorRateAbove10Percent

Field Details
Description External Ingress transaction error rate is above 10 percent of total External Ingress transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 10 percent of total transactions"
Severity Minor
Condition The number of failed External Ingress transactions is above 10 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7013
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed External Ingress transactions is below 10 percent of the total transactions or when the number of failed transactions crosses the 25% threshold, in which case the OcnefExternalIngressTransactionErrorRateAbove25Percent alert is raised.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.14 OcnefFivegcIngressTransactionErrorRateAbove10Percent

Table 7-14 OcnefFivegcIngressTransactionErrorRateAbove10Percent

Field Details
Description Fivegc Ingress transaction error rate is above 10 percent of total Fivegc Ingress transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 10 percent of total transactions"
Severity Minor
Condition The number of failed Fivegc Ingress transactions is above 10 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7014
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed Fivegc Ingress transactions is below 10 percent of the total transactions or when the number of failed transactions crosses the 25% threshold, in which case the OcnefFivegcIngressTransactionErrorRateAbove25Percent alert is raised.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.15 OcnefExternalIngressTransactionErrorRateAbove25Percent

Table 7-15 OcnefExternalIngressTransactionErrorRateAbove25Percent

Field Details
Description External Ingress transaction error rate detected above 25 percent of total External Ingress transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 25 percent of total transactions"
Severity Major
Condition The number of failed External Ingress transactions is above 25 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7015
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed External Ingress transactions is below 25 percent of the total transactions or when the number of failed transactions crosses the 50% threshold, in which case the OcnefExternalIngressTransactionErrorRateAbove50Percent alert is raised.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.16 OcnefFivegcIngressTransactionErrorRateAbove25Percent

Table 7-16 OcnefFivegcIngressTransactionErrorRateAbove25Percent

Field Details
Description Fivegc Ingress transaction error rate detected above 25 percent of total Fivegc Ingress transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 25 percent of total transactions"
Severity Major
Condition The number of failed Fivegc Ingress transactions is above 25 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7016
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed Fivegc Ingress transactions is below 25 percent of the total transactions or when the number of failed transactions crosses the 50% threshold, in which case the OcnefFivegcIngressTransactionErrorRateAbove50Percent alert is raised.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.17 OcnefExternalIngressTransactionErrorRateAbove50Percent

Table 7-17 OcnefExternalIngressTransactionErrorRateAbove50Percent

Field Details
Description External Ingress transaction error rate detected above 50 percent of total External Ingress transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 50 percent of total transactions"
Severity Critical
Condition The number of failed External Ingress transactions is above 50 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7017
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed External Ingress transactions is below 50 percent of the total transactions.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.18 OcnefFivegcIngressTransactionErrorRateAbove50Percent

Table 7-18 OcnefFivegcIngressTransactionErrorRateAbove50Percent

Field Details
Description Fivegc Ingress transaction error rate detected above 50 percent of total Fivegc Ingress transactions (current value is {{ $value }})
Summary "timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction error rate detected above 50 percent of total transactions"
Severity Critical
Condition The number of failed Fivegc Ingress transactions is above 50 percent of the total transactions.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7018
Metric Used oc_ingressgateway_http_responses_total
Recommended Actions The alert is cleared when the number of failed Fivegc Ingress transactions is below 50 percent of the total transactions.

Steps:

  1. Check the service-specific metrics to identify the service requests that are failing.
  2. Check the metrics per service and per method.
  3. If guidance is required, contact My Oracle Support.

7.1.1.1.19 OcnefEgressGatewayServiceDown

Table 7-19 OcnefEgressGatewayServiceDown

Field Details
Description "NEF Egress-Gateway service {{$labels.app_kubernetes_io_name}} is down"
Summary "kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Egress-Gateway service down"
Severity Critical
Condition None of the pods of the Egress Gateway microservice is available.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7019
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the Egress Gateway service is available.

Note: The threshold is configurable in the NefAlertrules alert file.

Steps:

  1. To check the orchestration logs of the Egress Gateway service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get po -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the Egress Gateway service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Take the appropriate resolution steps based on the failure reason.
  4. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
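
A quick way to confirm whether any Egress Gateway pods are available is to check the deployment and its endpoints. This is a minimal sketch; the assumption that the resource names contain "egress" follows common naming and may differ in your deployment.

  $ kubectl get deployment -n <namespace> | grep -i egress
  $ kubectl get endpoints -n <namespace> | grep -i egress

An Endpoints object with no addresses means that no ready pod is backing the Egress Gateway service.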

7.1.1.1.20 OcnefMemoryUsageCrossedMinorThreshold

Table 7-20 OcnefMemoryUsageCrossedMinorThreshold

Field Details
Description "NEF Memory Usage for pod {{ $labels.pod }} has crossed the configured minor threshold (50%) (value={{ $value }}) of its limit."
Summary "namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Memory Usage of pod exceeded 50% of its limit."
Severity Minor
Condition A pod has reached the configured minor threshold (50%) of its memory resource limits.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7020
Metric Used

'container_memory_usage_bytes'

'container_spec_memory_limit_bytes'

Note: This is a Kubernetes metric used for instance availability monitoring. If the metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions The alert is cleared when the memory utilization falls below the minor threshold or crosses the major threshold, in which case the OcnefMemoryUsageCrossedMajorThreshold alert is raised.

Note: The threshold is configurable in the NefAlertrules alert file.

In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
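
The current memory usage of the affected pod can be compared against its configured limit. This is a minimal sketch; kubectl top requires the Kubernetes Metrics Server (or an equivalent metrics pipeline) to be available in the cluster.

  $ kubectl top pod <pod name> -n <namespace>
  $ kubectl get pod <pod name> -n <namespace> \
      -o jsonpath='{.spec.containers[*].resources.limits.memory}'

If usage stays consistently near the limit, consider reviewing the pod's resource profile or the traffic it is handling.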

7.1.1.1.21 OcnefMemoryUsageCrossedMajorThreshold

Table 7-21 OcnefMemoryUsageCrossedMajorThreshold

Field Details
Description "NEF Memory Usage for pod {{ $labels.pod }} has crossed the configured major threshold (60%) (value = {{ $value }}) of its limit."
Summary "namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Memory Usage of pod exceeded 60% of its limit."
Severity Major
Condition A pod has reached the configured major threshold (60%) of its memory resource limits.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7021
Metric Used

'container_memory_usage_bytes'

'container_spec_memory_limit_bytes'

Note: This is a Kubernetes metric used for instance availability monitoring. If the metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions The alert is cleared when the memory utilization falls below the major threshold or crosses the critical threshold, in which case the OcnefMemoryUsageCrossedCriticalThreshold alert is raised.

Note: The threshold is configurable in the NefAlertrules alert file.

In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.22 OcnefMemoryUsageCrossedCriticalThreshold

Table 7-22 OcnefMemoryUsageCrossedCriticalThreshold

Field Details
Description "NEF Memory Usage for pod {{ $labels.pod }} has crossed the configured major threshold (70%) (value = {{ $value }}) of its limit."
Summary "namespace: {{$labels.namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Memory Usage of pod exceeded 70% of its limit."
Severity Critical
Condition A pod has reached the configured critical threshold (70%) of its memory resource limits.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7022
Metric Used

'container_memory_usage_bytes'

'container_spec_memory_limit_bytes'

Note: This is a Kubernetes metric used for instance availability monitoring. If the metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions The alert is cleared when the memory utilization falls below the critical threshold.

Note: The threshold is configurable in the NefAlertrules alert file.

In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.23 OcnefIngressGatewayServiceDown

Table 7-23 OcnefIngressGatewayServiceDown

Field Details
Description "NEF Ingress-Gateway service {{$labels.app_kubernetes_io_name}} is down"
Summary "kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Ingress-gateway service down"
Severity Critical
Condition None of the pods of the Ingress-Gateway microservice is available.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7023
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the Ingress Gateway service is available.

Steps:

  1. To check the orchestration logs of the Ingress Gateway service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get po -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the Ingress Gateway service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Take the appropriate resolution steps based on the failure reason.
  4. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
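
To confirm that the Ingress Gateway service has ready backends, inspect the gateway pods and the service. This is a minimal sketch; the assumption that the pod names contain "ingress" follows common naming and may differ in your deployment.

  $ kubectl get pods -n <namespace> -o wide | grep -i ingress
  $ kubectl describe svc <ingress gateway service name> -n <namespace>

The Endpoints field in the service description must list at least one pod address for the gateway to receive traffic.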

7.1.1.1.24 OcnefApiRouterServiceDown

Table 7-24 OcnefApiRouterServiceDown

Field Details
Description "NEF API Router service {{$labels.app_kubernetes_io_name}} is down"
Summary "namespace: {{$labels.namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : ApiRouter service down"
Severity Critical
Condition The API Router service is down.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7024
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the NEF API Router service is available.

Steps:

  1. To check the orchestration logs of the ocnef_expgw_apirouter service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get pod -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the ocnef_expgw_apirouter service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Check the DB status. For more information on how to check the DB status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. Take the appropriate resolution steps based on the failure reason.
  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.25 OcnefFiveGcAgentServiceDown

Table 7-25 OcnefFiveGcAgentServiceDown

Field Details
Description "NEF FiveGc Agent service down {{$labels.app_kubernetes_io_name}} is down"
Summary "kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : FiveGc Agent service down"
Severity Critical
Condition The 5GC Agent service is down.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7025
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the NEF 5GC Agent service is available.

Steps:

  1. To check the orchestration logs of the 5gcagent service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get pod -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the 5gcagent service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Check the DB status. For more information on how to check the DB status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. Take the appropriate resolution steps based on the failure reason.
  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.26 OcnefMonitoringEventServiceDown

Table 7-26 OcnefMonitoringEventServiceDown

Field Details
Description "NEF MonitoringEvent service {{$labels.app_kubernetes_io_name}} is down"
Summary "kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : MonitoringEvent service down"
Severity Critical
Condition The Monitoring Event (ME) service is down.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7026
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the NEF Monitoring Event (ME) service is available.

Steps:

  1. To check the orchestration logs of the ocnef_monitoring_events service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get pod -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the ocnef_monitoring_events service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Check the DB status. For more information on how to check the DB status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. Take the appropriate resolution steps based on the failure reason.
  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.27 OcnefCCFClientServiceDown

Table 7-27 OcnefCCFClientServiceDown

Field Details
Description "NEF CCFClient service {{$labels.app_kubernetes_io_name}} is down"
Summary "kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : CCFClient service down"
Severity Critical
Condition The CCF Client service is down.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7027
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the NEF CCF Client service is available.

Steps:

  1. To check the orchestration logs of the ocnef_ccfclient service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get pod -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the ocnef_ccfclient service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Check the DB status. For more information on how to check the DB status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. Take the appropriate resolution steps based on the failure reason.
  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.28 OcnefExpiryAuditorServiceDown

Table 7-28 OcnefExpiryAuditorServiceDown

Field Details
Description "NEF Expiry Auditor service {{$labels.app_kubernetes_io_name}} is down"
Summary "kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Expiry Auditor service down"
Severity Critical
Condition The expiry auditor service is down.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7028
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the NEF Expiry Auditor service is available.

Steps:

  1. To check the orchestration logs of the ocnef-expiry-auditor service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get pod -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the ocnef-expiry-auditor service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Check the DB status. For more information on how to check the DB status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. Take the appropriate resolution steps based on the failure reason.
  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.29 OcnefQOSServiceDown

Table 7-29 OcnefQOSServiceDown

Field Details
Description "NEF QOS service {{$labels.app_kubernetes_io_name}} is down"
Summary namespace: {{$labels.namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : QOS service down
Severity Critical
Condition The QoS service is down.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7029
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the NEF QoS service is available.

Steps:

  1. To check the orchestration logs of the ocnef-qualityofservice service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get pod -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the ocnef-qualityofservice service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Check the DB status. For more information on how to check the DB status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. Take the appropriate resolution steps based on the failure reason.
  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.30 OcnefTIServiceDown

Table 7-30 OcnefTIServiceDown

Field Details
Description OCNEF Traffic Influence service {{$labels.app_kubernetes_io_name}} is down
Summary namespace: {{$labels.namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : TI service down
Severity Critical
Condition Traffic Influence service is down.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7030
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the NEF TI service is available.

Steps:

  1. To check the orchestration logs of the ocnef-trafficinfluence service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get pod -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the ocnef-trafficinfluence service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Check the DB status. For more information on how to check the DB status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. Take the appropriate resolution steps based on the failure reason.
  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.1.31 OcnefDTServiceDown

Table 7-31 OcnefDTServiceDown

Field Details
Description OCNEF Device Trigger service {{$labels.app_kubernetes_io_name}} is down
Summary namespace: {{$labels.namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : DT service down
Severity Critical
Condition Device Trigger service is down.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7031
Metric Used 'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system.

Recommended Actions

The alert is cleared when the NEF DT service is available.

Steps:

  1. To check the orchestration logs of the ocnef-devicetrigger service for liveness or readiness probe failures, do the following:
    1. Run the following command to check the pod status:
      $ kubectl get pod -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter based on the ocnef-devicetrigger service names. Check for ERROR and WARNING logs related to thread exceptions.
  3. Check the DB status. For more information on how to check the DB status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
  4. Take the appropriate resolution steps based on the failure reason.
  5. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

    Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.

7.1.1.2 Application Level Alerts

This section lists the application level alerts for NEF.

7.1.1.2.1 AEFApiRouterOauthValidationFailureRateCrossedThreshold

Table 7-32 AEFApiRouterOauthValidationFailureRateCrossedThreshold

Field Details
Description "Failure Rate of API Router Oauth Validation Is Crossing the Threshold (10%)"
Summary "{{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Failure Rate Of Oauth Validation is above 10 percent of total requests."
Severity Error
Condition The failure rate of the OAuth validations at the API Router has crossed the threshold value (10%).
OID 1.3.6.1.4.1.323.5.3.39.1.2.7032
Metric Used ocnef_aef_apirouter_resp_total
Recommended Actions

The alert is cleared when the failure rate of OAuth validations at API Router is below the threshold.

Steps:
  1. Check the pod logs on Kibana for ERROR and WARN logs (a kubectl alternative is shown after these steps).
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.
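
In addition to Kibana, the pod logs can be scanned directly for errors. This is a minimal sketch; the --since window and the assumption that the API Router pod name contains "apirouter" may need to be adapted to your deployment.

  $ kubectl logs <apirouter pod name> -n <namespace> --since=1h | grep -E 'ERROR|WARN'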

7.1.1.2.2 MEAddSubscriptionFailureRateCrossedThreshold

Table 7-33 MEAddSubscriptionFailureRateCrossedThreshold

Field Details
Description "Failure Rate of ME Subscriptions Is Crossing the Threshold (10%)"
Summary "namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Failure Rate Of ME Subscriptions requests is above 10 percent of total requests."
Severity Error
Condition The failure rate of the Monitoring Event subscription requests has crossed the threshold value (10%).
OID 1.3.6.1.4.1.323.5.3.39.1.2.7033
Metric Used ocnef_me_af_resp_total
Recommended Actions

The alert is cleared when the failure rate of Monitoring Event subscription requests is below the threshold.

Steps:
  1. Check the pod logs on Kibana for ERROR and WARN logs.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.3 MEDeleteSubscriptionFailureRateCrossedThreshold

Table 7-34 MEDeleteSubscriptionFailureRateCrossedThreshold

Field Details
Description "Failure Rate of Delete ME Subscriptions Is Crossing the Threshold (10%)"
Summary "namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Failure Rate Of delete ME Subscriptions requests is above 10 percent of total requests."
Severity Error
Condition The failure rate of the Monitoring Event subscription deletion requests has crossed the threshold value (10%).
OID 1.3.6.1.4.1.323.5.3.39.1.2.7034
Metric Used ocnef_me_af_resp_total
Recommended Actions

The alert is cleared when the failure rate of Monitoring Event subscription deletion requests is below the threshold.

Steps:
  1. Check the pod logs on Kibana for ERROR and WARN logs.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.4 MENotificationFailureRateCrossedThreshold

Table 7-35 MENotificationFailureRateCrossedThreshold

Field Details
Description "Failure Rate of Delete ME Notifications Is Crossing the Threshold (10%)"
Summary "namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Failure Rate Of delete ME Subscriptions requests is above 10 percent of total requests."
Severity Error
Condition The failure rate of the DELETE Monitoring Event notification requests has crossed the threshold value (10%).
OID 1.3.6.1.4.1.323.5.3.39.1.2.7035
Metric Used ocnef_me_af_resp_total
Recommended Actions

The alert is cleared when the failure rate of DELETE ME notification requests is below the threshold.

Steps:
  1. Check the pod logs on Kibana for ERROR and WARN logs.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.5 FiveGcInvalidConfiguration

Table 7-36 FiveGcInvalidConfiguration

Field Details
Description "Invalid Configuration For Five GC Service"
Summary "namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Invalid Configuration For Five GC Service."
Severity Error
Condition Invalid configuration of the 5GCAgent service.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7036
Metric Used ocnef_5gc_invalid_config
Recommended Actions

The alert is cleared when the 5GCAgent service configuration is valid.

Steps:
  1. Check the pod logs on Kibana for ERROR and WARN logs.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.6 QOSAddSubscriptionFailureRateCrossedThreshold

Table 7-37 QOSAddSubscriptionFailureRateCrossedThreshold

Field Details
Description Failure rate of QoS subscriptions is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of QoS subscription requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7037
Metric Used ocnef_qos_af_resp_total
Recommended Actions The alert is cleared when the failure rate of subscription requests is below the failure threshold.

Steps:

  1. Check the pod logs on Kibana for ERROR and WARN logs.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.7 QOSDeleteSubscriptionFailureRateCrossedThreshold

Table 7-38 QOSDeleteSubscriptionFailureRateCrossedThreshold

Field Details
Description Failure rate of delete QoS subscriptions is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of delete QoS subscriptions requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7038
Metric Used ocnef_qos_af_resp_total
Recommended Actions The alert is cleared when the failure rate of subscription requests is below the failure threshold.

Steps:

  1. Check the pod logs on Kibana for ERROR and WARN logs.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.8 QOSNotificationFailureRateCrossedThreshold

Table 7-39 QOSNotificationFailureRateCrossedThreshold

Field Details
Description Failure rate of QoS notifications is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of QoS notifications requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7039
Metric Used ocnef_qos_5g_resp_total
Recommended Actions The alert is cleared when the failure rate of notification requests is below the failure threshold.

Steps:

  1. Check the pod logs on Kibana for ERROR and WARN logs.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.9 OcnefAllSiteStatus

Table 7-40 OcnefAllSiteStatus

Field Details
Description "Alert for any NEF sites status if SUSPENDED in Georedundant setup"
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Alert for any NEF sites status if SUSPENDED in Georedundant setup
Severity Error
Condition An NEF site of a georedundant deployment is in Suspended state.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7040
Metric Used ocnef_all_site_status
Recommended Actions

The alert is cleared when all the sites in a georedundant deployment are UP.

Steps:
  1. Check the pod logs on Kibana for ERROR and WARN logs.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.10 OcnefDBReplicationStatus

Table 7-41 OcnefDBReplicationStatus

Field Details
Description "Alert for NEF sites status if DB Replication down in Georedundant setup"
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Alert for NEF sites status if DB Replication down in Georedundant setup
Severity Error
Condition The database replication channel status between the given site and the georedundant site(s) is inactive. The alert is raised per replication channel. The alarm is raised or cleared only if the georedundancy feature is enabled.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7041
Metric Used ocnef_db_replication_status
Recommended Actions The alert is cleared when the database channel replication status between the given site and the georedundant site(s) is UP. For more information on how to check the database replication status, see Oracle Communications Cloud Native Core, cnDBTier User Guide.
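
The replication status metric that drives this alert can be queried directly to identify which replication channel is down. This is a minimal sketch; the Prometheus host and port are placeholders, and the label set on the metric depends on the NEF release.

  $ curl -sG 'http://<prometheus-host>:<prometheus-port>/api/v1/query' \
      --data-urlencode 'query=ocnef_db_replication_status'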

7.1.1.2.11 MeEPCAddSubscriptionFailureRateCrossedThreshold

Table 7-42 MeEPCAddSubscriptionFailureRateCrossedThreshold

Field Details
Description Failure rate of ME subscriptions to EPC is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of ME EPC subscription requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7042
Metric Used ocnef_me_epc_sub_total
Recommended Actions The alert is cleared when the failure rate of ME EPC subscription requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages (a verification query sketch follows these steps).
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.
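
A query sketch for checking the ME EPC subscription failure rate follows. The status_code label is an assumption; the metric may expose a different result label, so adjust the selector accordingly.

  # Percentage of failed ME EPC subscription responses over the last 5 minutes.
  sum(rate(ocnef_me_epc_sub_total{status_code!~"2.."}[5m]))
    /
  sum(rate(ocnef_me_epc_sub_total[5m])) * 100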

7.1.1.2.12 DiameterGwT6InvocationFailureRateCrossedThreshold

Table 7-43 DiameterGwT6InvocationFailureRateCrossedThreshold

Field Details
Description Failure rate of Diameter Gateway T6x Invocation requests is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Diameter Gateway T6x Invocation requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7043
Metric Used ocnef_diamgw_diam_resp_total
Recommended Actions The alert is cleared when the failure rate of Diameter Gateway T6x Invocation requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages (a verification query sketch follows these steps).
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.
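
The Diameter Gateway invocation and notification alerts in Tables 7-43 through 7-50 are all driven by the same metric, ocnef_diamgw_diam_resp_total, so a single verification query can be reused with different label filters. The label names interface and result_code below are assumptions; substitute the labels the metric actually exposes for the reference point (T6x, T4, Rx, SGd) and the Diameter result.

  # Percentage of failed T6x responses; 2xxx Diameter result codes are treated
  # as success. "interface" and "result_code" are assumed label names.
  sum(rate(ocnef_diamgw_diam_resp_total{interface="T6", result_code!~"2.*"}[5m]))
    /
  sum(rate(ocnef_diamgw_diam_resp_total{interface="T6"}[5m])) * 100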

7.1.1.2.13 DiameterGwT4InvocationFailureRateCrossedThreshold

Table 7-44 DiameterGwT4InvocationFailureRateCrossedThreshold

Field Details
Description Failure rate of Diameter Gateway T4 Invocation requests is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Diameter Gateway T4 Invocation requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7044
Metric Used ocnef_diamgw_diam_resp_total
Recommended Actions The alert is cleared when the failure rate of Diameter Gateway T4 Invocation requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.14 DiameterGwRxInvocationFailureRateCrossedThreshold

Table 7-45 DiameterGwRxInvocationFailureRateCrossedThreshold

Field Details
Description Failure rate of Diameter Gateway Rx Invocation requests is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Diameter Gateway Rx Invocation requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7045
Metric Used ocnef_diamgw_diam_resp_total
Recommended Actions The alert is cleared when the failure rate of Diameter Gateway Rx Invocation requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.15 DiameterGwSgdT4InvocationFailureRateCrossedThreshold

Table 7-46 DiameterGwSgdT4InvocationFailureRateCrossedThreshold

Field Details
Description Failure rate of Diameter Gateway SgdT4 Invocation requests is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Diameter Gateway SgdT4 Invocation requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7046
Metric Used ocnef_diamgw_diam_resp_total
Recommended Actions The alert is cleared when the failure rate of Diameter Gateway SgdT4 Invocation requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.16 DiameterGwT6NotificationFailureRateCrossedThreshold

Table 7-47 DiameterGwT6NotificationFailureRateCrossedThreshold

Field Details
Description Failure rate of Diameter Gateway T6x Notification requests is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Diameter Gateway T6x Notification requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7047
Metric Used ocnef_diamgw_diam_resp_total
Recommended Actions The alert is cleared when the failure rate of Diameter Gateway T6x Notification requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.17 DiameterGwT4NotificationFailureRateCrossedThreshold

Table 7-48 DiameterGwT4NotificationFailureRateCrossedThreshold

Field Details
Description Failure rate of Diameter Gateway T4 Notification requests is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Diameter Gateway T4 Notification requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7048
Metric Used ocnef_diamgw_diam_resp_total
Recommended Actions The alert is cleared when the failure rate of Diameter Gateway T4 Notification requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.18 DiameterGwRxNotificationFailureRateCrossedThreshold

Table 7-49 DiameterGwRxNotificationFailureRateCrossedThreshold

Field Details
Description Failure rate of Diameter Gateway Rx Notification requests is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Diameter Gateway Rx Notification requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7049
Metric Used ocnef_diamgw_diam_resp_total
Recommended Actions The alert is cleared when the failure rate of Diameter Gateway Rx Notification requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.19 DiameterGwSgdT4NotificationFailureRateCrossedThreshold

Table 7-50 DiameterGwSgdT4NotificationFailureRateCrossedThreshold

Field Details
Description Failure rate of Diameter Gateway SgdT4 Notification requests is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Diameter Gateway SgdT4 Notification requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7050
Metric Used ocnef_diamgw_diam_resp_total
Recommended Actions The alert is cleared when the failure rate of Diameter Gateway SgdT4 Notification requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.20 DiameterGwT6TranslationFailureRateCrossedThreshold

Table 7-51 DiameterGwT6TranslationFailureRateCrossedThreshold

Field Details
Description Failure rate of T6x translations in Diameter Gateway is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of T6x translations in Diameter Gateway is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7051
Metric Used ocnef_diamgw_translator_request_total
Recommended Actions The alert is cleared when the failure rate of T6x translation requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages (a verification query sketch follows these steps).
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.
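
The translation alerts in Tables 7-51 through 7-54 share the metric ocnef_diamgw_translator_request_total. The sketch below assumes labels named interface and result; replace them with the labels actually present on the metric.

  # Percentage of failed T6x translation requests over the last 5 minutes.
  # "interface" and "result" are assumed label names.
  sum(rate(ocnef_diamgw_translator_request_total{interface="T6", result="failure"}[5m]))
    /
  sum(rate(ocnef_diamgw_translator_request_total{interface="T6"}[5m])) * 100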

7.1.1.2.21 DiameterGwT4TranslationFailureRateCrossedThreshold

Table 7-52 DiameterGwT4TranslationFailureRateCrossedThreshold

Field Details
Description Failure rate of T4 translations in Diameter Gateway is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of T4 translations in Diameter Gateway is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7052
Metric Used ocnef_diamgw_translator_request_total
Recommended Actions The alert is cleared when the failure rate of T4 translation requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.22 DiameterGwRxTranslationFailureRateCrossedThreshold

Table 7-53 DiameterGwRxTranslationFailureRateCrossedThreshold

Field Details
Description Failure rate of Rx translations in Diameter Gateway is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of Rx translations in Diameter Gateway is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7053
Metric Used ocnef_diamgw_translator_request_total
Recommended Actions The alert is cleared when the failure rate of Rx translation requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.23 DiameterGwSgdT4TranslationFailureRateCrossedThreshold

Table 7-54 DiameterGwSgdT4TranslationFailureRateCrossedThreshold

Field Details
Description Failure rate of SgdT4 translations in Diameter Gateway is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of SgdT4 translations in Diameter Gateway is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7054
Metric Used ocnef_diamgw_translator_request_total
Recommended Actions The alert is cleared when the failure rate of SgdT4 translation requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.24 CHFAddChargingDataRequestFailureRateCrossedErrorThreshold

Table 7-55 CHFAddChargingDataRequestFailureRateCrossedErrorThreshold

Field Details
Description Failure rate of CHF Create Charging Data request is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure rate of CHF Create Charging Data requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7055
Metric Used ocnef_chf_qos_resp_total
Recommended Actions The alert is cleared when the failure rate of CHF Create Charging Data requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages (a verification query sketch follows these steps).
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.
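
The three CHF Create Charging Data alerts (minor at 5%, error at 10%, critical at 25%) differ only in threshold and can be checked with a single query against ocnef_chf_qos_resp_total. The status_code label is an assumption; adjust it to the metric's actual response label and compare the result against the relevant threshold.

  # Percentage of failed CHF Create Charging Data responses over the last 5 minutes.
  sum(rate(ocnef_chf_qos_resp_total{status_code!~"2.."}[5m]))
    /
  sum(rate(ocnef_chf_qos_resp_total[5m])) * 100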

7.1.1.2.25 CHFAddChargingDataRequestFailureRateCrossedCriticalThreshold

Table 7-56 CHFAddChargingDataRequestFailureRateCrossedCriticalThreshold

Field Details
Description Failure rate of CHF Create Charging Data request is crossing the threshold (25%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Critical
Condition Failure rate of CHF Create Charging Data requests is above 25 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7056
Metric Used ocnef_chf_qos_resp_total
Recommended Actions The alert is cleared when the failure rate of CHF Create Charging Data requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.26 CHFAddChargingDataRequestFailureRateCrossedMinorThreshold

Table 7-57 CHFAddChargingDataRequestFailureRateCrossedMinorThreshold

Field Details
Description Failure rate of CHF Create Charging Data request is crossing the threshold (5%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Minor
Condition Failure rate of CHF Create Charging Data requests is above 5 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7057
Metric Used ocnef_chf_qos_resp_total
Recommended Actions The alert is cleared when the failure rate of CHF Create Charging Data requests is below the failure threshold.
Steps:
  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.27 MSISDNLessMoSMSRequestFailureRateCrossedCriticalThreshold

Table 7-58 MSISDNLessMoSMSRequestFailureRateCrossedCriticalThreshold

Field Details
Description Failure rate of MSISDNLess MO SMS notification request is crossing the threshold (25%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Critical
Condition Failure rate of MSISDNLess MO SMS requests is above 25 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7058
Metric Used ocnef_msisdnless_mo_sms_diamgw_notify_resp_total
Recommended Actions The alert is cleared when the failure rate of notification requests is below the failure threshold.

Steps:

  1. Check the pod logs on Kibana for ERROR and WARN messages (a verification query sketch follows these steps).
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.
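
The MSISDNLess MO SMS alerts at the critical (25%), major (10%), and minor (5%) thresholds all use the same notification response metric, so one query covers them. The status_code label is an assumption; adjust it to the labels the metric actually exposes.

  # Percentage of failed MSISDNLess MO SMS notification responses.
  sum(rate(ocnef_msisdnless_mo_sms_diamgw_notify_resp_total{status_code!~"2.."}[5m]))
    /
  sum(rate(ocnef_msisdnless_mo_sms_diamgw_notify_resp_total[5m])) * 100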

7.1.1.2.28 MSISDNLessMoSMSRequestFailureRateCrossedMajorThreshold

Table 7-59 MSISDNLessMoSMSRequestFailureRateCrossedMajorThreshold

Field Details
Description Failure rate of MSISDNLess MO SMS notification request is crossing the threshold (10%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Major
Condition Failure rate of MSISDNLess MO SMS requests is above 10 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7059
Metric Used ocnef_msisdnless_mo_sms_diamgw_notify_resp_total
Recommended Actions The alert is cleared when the failure rate of notification requests is below the failure threshold.

Steps:

  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.29 MSISDNLessMoSMSRequestFailureRateCrossedMinorThreshold

Table 7-60 MSISDNLessMoSMSRequestFailureRateCrossedMinorThreshold

Field Details
Description Failure rate of MSISDNLess MO SMS notification request is crossing the threshold (5%).
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Minor
Condition Failure rate of MSISDNLess MO SMS requests is above 5 percent of total requests.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7060
Metric Used ocnef_msisdnless_mo_sms_diamgw_notify_resp_total
Recommended Actions The alert is cleared when the failure rate of notification requests is below the failure threshold.

Steps:

  1. Check the pod logs on Kibana for ERROR and WARN messages.
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.

7.1.1.2.30 MSISDNLessMoSMSShortCodeConfigMatchFailure

Table 7-61 MSISDNLessMoSMSShortCodeConfigMatchFailure

Field Details
Description Failure when the configured shortcode does not match the shortcode in the incoming request.
Summary namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}
Severity Error
Condition Failure when the configured shortcode does not match the shortcode received from the SMSSC.
OID 1.3.6.1.4.1.323.5.3.39.1.2.7061
Metric Used ocnef_diamgw_http_resp_total
Recommended Actions The alert is cleared when the failure rate of notification requests is below the failure threshold.

Steps:

  1. Check the pod logs on Kibana for ERROR and WARN messages (a verification query sketch follows these steps).
  2. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support.
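
To confirm whether shortcode mismatch failures are still occurring, the HTTP response counter used by this alert can be trended alongside the Kibana logs. The status_code label below is an assumption, and the exact error returned for a shortcode mismatch should be confirmed from the Diameter Gateway logs.

  # Rate of non-2xx responses on the Diameter Gateway HTTP interface; correlate
  # any sustained increase with the shortcode mismatch errors seen in Kibana.
  sum(rate(ocnef_diamgw_http_resp_total{status_code!~"2.."}[5m]))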