7.1.2.1.19 OccapifEgressGatewayServiceDown

Table 7-80 OccapifEgressGatewayServiceDown

Field        Details
Description  "CAPIF Egress-Gateway service {{$labels.app_kubernetes_io_name}} is down"
Summary      "kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query \"time()\" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Egress-Gateway service down"
Severity     Critical
Condition    None of the pods of the Egress Gateway microservice is available.
OID          1.3.6.1.4.1.323.5.3.39.1.3.5019
Metric Used  'up'

Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric exposed by your monitoring system.
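
For reference, the number of available Egress Gateway instances can be checked manually in the Prometheus expression browser with a query of the following shape; the label matcher is an assumption and must be adapted to the label values used in your deployment:

  sum(up{app_kubernetes_io_name=~".*egress-gateway.*"})

A result of 0 (or an empty result, if the targets are no longer registered) corresponds to the alert condition.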

Recommended Actions

The alert is cleared when the Egress Gateway service becomes available again.

Note: The threshold is configurable in the NefAlertrules alert file.
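
A minimal sketch of what the corresponding rule in the NefAlertrules file may look like; the expression, the label selector, and the 'for' duration are assumptions, so verify them against the alert file shipped with your release:

  - alert: OccapifEgressGatewayServiceDown
    expr: sum by (app_kubernetes_io_name) (up{app_kubernetes_io_name=~".*egress-gateway.*"}) == 0   # assumed selector
    for: 1m   # assumed duration; this is the configurable threshold
    labels:
      severity: critical
    annotations:
      description: 'CAPIF Egress-Gateway service {{$labels.app_kubernetes_io_name}} is down'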

Steps:

  1. To check the orchestration logs of the Egress Gateway service for liveness or readiness probe failures, do the following (a consolidated command sketch follows this list):
    1. Run the following command to check the pod status:
      $ kubectl get po -n <namespace>
    2. Run the following command to analyze the error condition of the pod that is not in the Running state:
      $ kubectl describe pod <pod name not in Running state> -n <namespace>

      Where <pod name not in Running state> indicates the pod that is not in the Running state.

  2. Refer to the application logs on Kibana and filter them based on the Egress Gateway service names. Check for ERROR or WARNING logs related to thread exceptions (a CLI alternative is included in the sketch after this list).
  3. Depending on the failure reason, take the appropriate resolution steps.
  4. If the issue persists, capture the outputs of all the preceding steps and contact My Oracle Support.

    Note: Use the CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide.
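
The following command sketch consolidates the checks from steps 1 and 2. <namespace> is the CAPIF namespace; the pod-name placeholder and the log filter are assumptions and must be adapted to the actual Egress Gateway pod names in your deployment:

  $ kubectl get po -n <namespace>                                            # step 1a: overall pod status
  $ kubectl get po -n <namespace> --field-selector=status.phase!=Running     # list only pods that are not in the Running state
  $ kubectl describe pod <pod name not in Running state> -n <namespace>      # step 1b: events and probe failure details
  $ kubectl logs <egress-gateway pod name> -n <namespace> | grep -E "ERROR|WARN"   # assumed CLI filter for the log check in step 2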