5.4.7.5 OcnrfPodCpuUsageInDangerOfCongestionState

Table 5-94 OcnrfPodCpuUsageInDangerOfCongestionState

Field Details
Description 'The pod {{$labels.kubernetes_pod_name}} of service {{$labels.app_kubernetes_io_name}} is in Danger of Congestion state due to CPU usage above threshold'
Summary 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Danger of Congestion state due to CPU usage above threshold'
Severity Major
Condition

A pod of a service is in Danger of Congestion state because its CPU usage is above the configured thresholds.

This alert is raised when the Pod Protection feature is enabled for the NfSubscription service. Currently, this is applicable to the NfSubscription service only.

OID 1.3.6.1.4.1.323.5.3.36.1.2.7080
Metric Used ocnrf_pod_cpu_congestion_state
Recommended Actions The alert is cleared when the CPU usage goes below the configured thresholds for the Danger of Congestion state.

Note: The thresholds can be viewed using the REST API.

Steps:

Reassess whether the NRF is receiving additional traffic.

If this is unexpected, contact My Oracle Support.

  1. Refer to the alert to determine which pod is receiving high traffic. It may be due to a sudden spike in traffic. For example, when one mated site goes down, the NFs move to the given site. Check whether an NF is sending a high number of update, register, or deregister requests.
  2. Check the service pod logs on Kibana to determine the reason for the errors.
  3. If this is expected traffic, then the threshold levels may need to be re-evaluated as per the call rate and reconfigured as described in Oracle Communications Cloud Native Core, Network Repository Function REST Specification Guide.
Available in OCI No
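
Example: The following is a minimal sketch, not an Oracle-provided tool, of how step 1 might be supported by reading the ocnrf_pod_cpu_congestion_state metric from the standard Prometheus HTTP query API (/api/v1/query) and listing the pods that report a given state. It assumes that the metric carries the kubernetes_namespace and kubernetes_pod_name labels shown in the alert summary. The function names, the Prometheus URL, and the numeric value that represents Danger of Congestion are hypothetical and must be confirmed against the deployed alert rule.

```python
import json
import urllib.parse
import urllib.request

# Metric name taken from the "Metric Used" field of this alert.
METRIC = "ocnrf_pod_cpu_congestion_state"


def build_query_url(prometheus_base_url: str, namespace: str) -> str:
    """Build a Prometheus instant-query URL for the metric in one namespace.

    Assumption: the metric exposes a kubernetes_namespace label.
    """
    promql = f'{METRIC}{{kubernetes_namespace="{namespace}"}}'
    query = urllib.parse.urlencode({"query": promql})
    return prometheus_base_url.rstrip("/") + "/api/v1/query?" + query


def fetch(url: str, timeout: float = 10.0) -> dict:
    """Run the query and return the decoded JSON response."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def pods_in_state(response: dict, state_value: float) -> list[str]:
    """Return the sorted pod names whose latest sample equals state_value.

    state_value is supplied by the caller because the numeric encoding of
    the Danger of Congestion state is not given in this section.
    """
    if response.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {response.get('error')}")
    pods = []
    for series in response["data"]["result"]:
        _timestamp, raw_value = series["value"]  # instant vector: [ts, "value"]
        if float(raw_value) == state_value:
            # Assumption: the pod name is in the kubernetes_pod_name label.
            pods.append(series["metric"].get("kubernetes_pod_name", "<unknown>"))
    return sorted(pods)
```

A possible invocation, with a placeholder host and a placeholder state value, is pods_in_state(fetch(build_query_url("http://prometheus.example:9090", "ocnrf")), state_value=1). Keeping the response parsing separate from the HTTP call allows the filter to be checked against a saved response without access to a live Prometheus instance.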