5 NRF Alerts
This chapter includes information about the NRF alerts.
The following table describes the various alert levels generated by NRF:
Table 5-1 Alert Levels or Severity Types
Alert Level/Severity Type | Definition |
---|---|
Critical | Indicates a severe issue that poses a significant risk to safety, security, or operational integrity. It requires an immediate response to address the situation and prevent serious consequences. Raised for conditions that may affect the service of NRF. |
Major | Indicates a significant issue that has an impact on operations or poses a moderate risk. It requires prompt attention and action to mitigate potential escalation. Raised for conditions that may affect the service of NRF. |
Minor | Indicates a situation that is low in severity and does not pose an immediate risk to safety, security, or operations. It requires attention but does not demand urgent action. Raised for conditions that may affect the service of NRF. |
Info or Warn (Informational) | Provides general information or updates that are not related to immediate risks or actions. These alerts are for awareness and do not typically require any specific response. WARN and INFO alerts may not impact the service of NRF. |
Note:
- Summary or dimensions may vary based on deployment.
- The alert triggering time varies as per the environment in which it is deployed.
- The performance and capacity of the NRF system may vary based on the call model, feature or interface configuration, and the underlying CNE and hardware environment.
5.1 System Level Alerts
This section lists the system level alerts.
5.1.1 OcnrfNfStatusUnavailable
Table 5-2 OcnrfNfStatusUnavailable
Field | Details |
---|---|
Description | 'OCNRF services unavailable' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : All OCNRF services are unavailable.' |
Severity | Critical |
Condition | All the NRF services are unavailable, either because NRF is being deployed or because it is being purged. The NRF services considered are nfregistration, nfsubscription, nrfauditor, nrfconfiguration, nfaccesstoken, nfdiscovery, appinfo, ingressgateway, and egressgateway. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7016 |
Metric Used |
'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared automatically when the NRF services restart. Steps:
|
Available in OCI | No |
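The condition above, and the per-service "ServiceDown" conditions later in this section, come down to checking whether any instance of a service reports `up` as 1. The following is a minimal sketch of that check, assuming the `up` samples have already been fetched and grouped by service name. The function names and the sample layout are hypothetical, and the actual expression in the NRF alert file may differ.

```python
# Services listed in the Condition row of OcnrfNfStatusUnavailable.
NRF_SERVICES = [
    "nfregistration", "nfsubscription", "nrfauditor", "nrfconfiguration",
    "nfaccesstoken", "nfdiscovery", "appinfo", "ingressgateway", "egressgateway",
]


def service_down(up_values):
    """Return True when no instance of a service reports up == 1.

    An empty list (no samples at all) is treated as down; this is an
    assumption made for the sketch.
    """
    return not any(value == 1 for value in up_values)


def all_services_unavailable(up_samples):
    """up_samples maps a service name to the list of its `up` values."""
    return all(service_down(up_samples.get(svc, [])) for svc in NRF_SERVICES)


# Hypothetical samples: every service is down except one nfdiscovery pod.
samples = {svc: [0, 0] for svc in NRF_SERVICES}
samples["nfdiscovery"] = [0, 1]
print(all_services_unavailable(samples))  # one pod is up, so this is False
```

With these inputs, `service_down(samples["nfregistration"])` is true on its own, which corresponds to the single-service alerts, while the system-wide check stays false until every listed service is down.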
5.1.2 OcnrfPodsRestart
Table 5-3 OcnrfPodsRestart
Field | Details |
---|---|
Description | 'Pod <Pod Name> has restarted.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : A Pod has restarted' |
Severity | Major |
Condition | A pod belonging to any of the NRF services has restarted. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7017 |
Metric Used | 'kube_pod_container_status_restarts_total'
Note: This is a Kubernetes metric. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared automatically if the specific pod is up. Steps:
|
Available in OCI | No |
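A restart shows up as an increase in the `kube_pod_container_status_restarts_total` counter for a pod. The following is a minimal sketch of that comparison, assuming two successive counter samples per pod are available as plain dictionaries. The function name and the pod names are hypothetical, and the alert file may express the condition differently.

```python
def restarted_pods(previous, current):
    """Return the pods whose restart counter increased between two samples."""
    restarted = []
    for pod, count in current.items():
        # A pod that is absent from the previous sample is assumed to start at 0.
        if count > previous.get(pod, 0):
            restarted.append(pod)
    return sorted(restarted)


# Hypothetical counter values taken a few minutes apart.
previous = {"nfregistration-0": 1, "nfdiscovery-0": 0}
current = {"nfregistration-0": 2, "nfdiscovery-0": 0}
print(restarted_pods(previous, current))
```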
5.1.3 NnrfNFManagementServiceDown
Table 5-4 NnrfNFManagementServiceDown
Field | Details |
---|---|
Description | 'OCNRF Nnrf_Management service <nfregistration|nfsubscription|nrfauditor> is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NFManagement service is down' |
Severity | Critical |
Condition | This alert is raised when either NFRegistration, NFSubscription, or NrfAuditor services are unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7018 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when all the Nnrf_NFManagement services (nfregistration, nfsubscription, and nrfauditor) are available. Steps:
|
Available in OCI | No |
5.1.4 NnrfAccessTokenServiceDown
Table 5-5 NnrfAccessTokenServiceDown
Field | Details |
---|---|
Description | 'OCNRF Nnrf_NFAccessToken service nfaccesstoken is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NFAccessToken service down' |
Severity | Critical |
Condition | This alert is raised when NFAccessToken service is unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7020 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the Nnrf_AccessToken service is available. Steps:
|
Available in OCI | No |
5.1.5 NnrfNFDiscoveryServiceDown
Table 5-6 NnrfNFDiscoveryServiceDown
Field | Details |
---|---|
Description | 'OCNRF Nnrf_NFDiscovery service nfdiscovery is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NFDiscovery service down' |
Severity | Critical |
Condition | NFDiscovery is unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7019 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the Nnrf_NFDiscovery service is available. Steps:
|
Available in OCI | No |
5.1.6 OcnrfRegistrationServiceDown
Table 5-7 OcnrfRegistrationServiceDown
Field | Details |
---|---|
Description | 'OCNRF NFRegistration service nfregistration is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NFRegistration service is down' |
Severity | Critical |
Condition | None of the pods of the NFRegistration microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7021 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the nfregistration service is available. Steps:
|
Available in OCI | No |
5.1.7 OcnrfSubscriptionServiceDown
Table 5-8 OcnrfSubscriptionServiceDown
Field | Details |
---|---|
Description | 'OCNRF NFSubscription service nfsubscription is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NFSubscription service is down' |
Severity | Critical |
Condition | None of the pods of the NFSubscription microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7022 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the nfsubscription service is available. Steps:
|
Available in OCI | No |
5.1.8 OcnrfDiscoveryServiceDown
Table 5-9 OcnrfDiscoveryServiceDown
Field | Details |
---|---|
Description | 'OCNRF NFDiscovery service nfdiscovery is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NFDiscovery service down' |
Severity | Critical |
Condition | None of the pods of the NFDiscovery microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7023 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the nfdiscovery service is available. Steps:
|
Available in OCI | No |
5.1.9 OcnrfAccessTokenServiceDown
Table 5-10 OcnrfAccessTokenServiceDown
Field | Details |
---|---|
Description | 'OCNRF NFAccessToken service nfaccesstoken is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NFAccesstoken service down' |
Severity | Critical |
Condition | None of the pods of the NFAccessToken microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7024 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the nfaccesstoken service is available. Steps:
|
Available in OCI | No |
5.1.10 OcnrfAuditorServiceDown
Table 5-11 OcnrfAuditorServiceDown
Field | Details |
---|---|
Description | 'OCNRF NrfAuditor service nrfauditor is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NrfAuditor service down' |
Severity | Critical |
Condition | None of the pods of the NrfAuditor microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7026 |
Metric Used | 'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the nrfauditor service is available. Steps:
|
Available in OCI | No |
5.1.11 OcnrfConfigurationServiceDown
Table 5-12 OcnrfConfigurationServiceDown
Field | Details |
---|---|
Description | 'OCNRF NrfConfiguration service nrfconfiguration is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NrfConfiguration service down' |
Severity | Critical |
Condition | None of the pods of the NrfConfiguration microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7025 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the nrfconfiguration service is available. Steps:
|
Available in OCI | No |
5.1.12 OcnrfAppInfoServiceDown
Table 5-13 OcnrfAppInfoServiceDown
Field | Details |
---|---|
Description | 'OCNRF Appinfo service appinfo is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Appinfo service down' |
Severity | Critical |
Condition | None of the pods of the appinfo microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7027 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the appinfo service is available. Steps:
|
Available in OCI | No |
5.1.13 OcnrfArtisanServiceDown
Table 5-14 OcnrfArtisanServiceDown
Field | Details |
---|---|
Description | 'OCNRF NrfArtisan service {{$labels.app_kubernetes_io_name}} is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : NrfArtisan service is down' |
Severity | Critical |
Condition | NrfArtisan is unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7056 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the NrfArtisan service is available. Steps:
|
Available in OCI | No |
5.1.14 OcnrfAlternateRouteServiceDown
Table 5-15 OcnrfAlternateRouteServiceDown
Field | Details |
---|---|
Description | 'OCNRF AlternateRoute service {{$labels.app_kubernetes_io_name}} is down' |
Available in OCI | No |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : AlternateRoute service is down' |
Severity | Critical |
Condition | AlternateRoute is unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7057 |
Metric Used |
'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the alternate-route service is available. Steps:
|
5.1.15 OcnrfPerfInfoServiceDown
Table 5-16 OcnrfPerfInfoServiceDown
Field | Details |
---|---|
Description | 'OCNRF Perfinfo service {{$labels.app_kubernetes_io_name}} is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Perfinfo service down' |
Severity | Critical |
Condition | Perfinfo is unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7058 |
Metric Used |
'up' Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the Perfinfo service is available. Steps:
|
Available in OCI | No |
5.1.16 OcnrfIngressGatewayServiceDown
Table 5-17 OcnrfIngressGatewayServiceDown
Field | Details |
---|---|
Description | 'OCNRF Ingress-Gateway service ingressgateway is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Ingress-gateway service down' |
Severity | Critical |
Condition | None of the pods of the Ingress Gateway microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7028 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the Ingress Gateway service is available. Steps:
|
Available in OCI | No |
5.1.17 OcnrfEgressGatewayServiceDown
Table 5-18 OcnrfEgressGatewayServiceDown
Field | Details |
---|---|
Description | 'OCNRF Egress-Gateway service egressgateway is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Egress-Gateway service down' |
Severity | Critical |
Condition | None of the pods of the Egress Gateway microservice is available. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7029 |
Metric Used | 'up'
Note: This is a Prometheus metric used for instance availability monitoring. If this metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions |
The alert is cleared when the Egress Gateway service is available. Steps:
|
Available in OCI | No |
5.1.18 OcnrfTotalIngressTrafficRateAboveMinorThreshold
Table 5-19 OcnrfTotalIngressTrafficRateAboveMinorThreshold
Field | Details |
---|---|
Description | 'Total Ingress traffic Rate is above configured minor threshold. (current value is: {{ $value }})' |
Summary | 'timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is above 80 Percent of Max requests per second' |
Severity | Minor |
Condition |
The total NRF Ingress message rate has crossed the configured minor threshold of 800 TPS. By default, the alert file triggers this alert when the NRF Ingress rate crosses 80% of 1000 TPS (the maximum ingress request rate). |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7001 |
Metric Used | 'oc_ingressgateway_http_requests_total' |
Recommended Actions |
The alert is cleared either when the total Ingress traffic rate falls below the minor threshold or when the total traffic rate crosses the major threshold, in which case the OcnrfTotalIngressTrafficRateAboveMajorThreshold alert is raised. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. Steps:
|
Available in OCI | No |
5.1.19 OcnrfTotalIngressTrafficRateAboveMajorThreshold
Table 5-20 OcnrfTotalIngressTrafficRateAboveMajorThreshold
Field | Details |
---|---|
Description | 'Total Ingress traffic Rate is above major threshold. (current value is: {{ $value }})' |
Summary | 'timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is above 90 Percent of Max requests per second' |
Severity | Major |
Condition |
The total NRF Ingress message rate has crossed the configured major threshold of 900 TPS. By default, the alert file triggers this alert when the NRF Ingress rate crosses 90% of 1000 TPS (the maximum ingress request rate). |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7002 |
Metric Used | 'oc_ingressgateway_http_requests_total' |
Recommended Actions |
The alert is cleared either when the total Ingress traffic rate falls below the major threshold or when the total traffic rate crosses the critical threshold, in which case the OcnrfTotalIngressTrafficRateAboveCriticalThreshold alert is raised. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. Steps:
|
Available in OCI | No |
5.1.20 OcnrfTotalIngressTrafficRateAboveCriticalThreshold
Table 5-21 OcnrfTotalIngressTrafficRateAboveCriticalThreshold
Field | Details |
---|---|
Description | 'Total Ingress traffic Rate is above critical threshold.(current value is: {{ $value }})' |
Summary | 'timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is more than 52000 requests per second' |
Severity | Critical |
Condition |
The total NRF Ingress message rate has crossed the configured critical threshold. By default, the alert file triggers this alert when the NRF Ingress rate crosses 52000 TPS. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7003 |
Metric Used | 'oc_ingressgateway_http_requests_total' |
Recommended Actions |
The alert is cleared when the Ingress traffic rate falls below the critical threshold. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. Steps:
|
Available in OCI | No |
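The three ingress traffic alerts hand over to one another as the rate rises. The following is a minimal sketch of that tiering, using the default values quoted in the Condition rows above (80% and 90% of a 1000 TPS maximum, and 52000 TPS for the critical tier). The constant and function names are hypothetical, and because the thresholds are configurable in the alert file, a given deployment may use different numbers.

```python
MAX_INGRESS_RATE_TPS = 1000      # maximum ingress request rate quoted above
CRITICAL_THRESHOLD_TPS = 52000   # default critical trigger point quoted above


def percent_threshold(percent, max_rate=MAX_INGRESS_RATE_TPS):
    """Convert a percentage of the maximum rate into a TPS threshold."""
    return max_rate * percent // 100


MINOR_THRESHOLD_TPS = percent_threshold(80)  # 800
MAJOR_THRESHOLD_TPS = percent_threshold(90)  # 900


def ingress_alert(rate_tps):
    """Return the name of the single alert expected at this rate, or None."""
    # Check the highest tier first: crossing a higher threshold clears the lower alert.
    if rate_tps > CRITICAL_THRESHOLD_TPS:
        return "OcnrfTotalIngressTrafficRateAboveCriticalThreshold"
    if rate_tps > MAJOR_THRESHOLD_TPS:
        return "OcnrfTotalIngressTrafficRateAboveMajorThreshold"
    if rate_tps > MINOR_THRESHOLD_TPS:
        return "OcnrfTotalIngressTrafficRateAboveMinorThreshold"
    return None


print(ingress_alert(850))
```

Whether a rate exactly equal to a threshold raises the alert is not stated in the tables; the strict comparison here is an assumption.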
5.1.21 OcnrfTransactionErrorRateAbove0Dot1Percent
Table 5-22 OcnrfTransactionErrorRateAbove0Dot1Percent
Field | Details |
---|---|
Description | 'Transaction Error rate is above 0.1 Percent of Total Transactions (current value is {{ $value }})' |
Summary | 'timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction Error Rate detected above 0.1 Percent of Total Transactions' |
Severity | Warning |
Condition | The number of failed transactions is above 0.1 percent of the total transactions. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7004 |
Metric Used | 'oc_ingressgateway_http_responses_total' |
Recommended Actions |
The alert is cleared when the number of failed transactions is below 0.1 percent of the total transactions or when the number of failed transactions crosses the 1 percent threshold, in which case the OcnrfTransactionErrorRateAbove1Percent alert is raised. Steps:
|
Available in OCI | No |
5.1.22 OcnrfTransactionErrorRateAbove1Percent
Table 5-23 OcnrfTransactionErrorRateAbove1Percent
Field | Details |
---|---|
Description | 'Transaction Error rate is above 1 Percent of Total Transactions (current value is {{ $value }})' |
Summary | 'timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction Error Rate detected above 1 Percent of Total Transactions' |
Severity | Warning |
Condition | The number of failed transactions is above 1 percent of the total transactions. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7005 |
Metric Used | 'oc_ingressgateway_http_responses_total' |
Recommended Actions |
The alert is cleared when the number of failed transactions is below 1 percent of the total transactions or when the number of failed transactions crosses the 10 percent threshold, in which case the OcnrfTransactionErrorRateAbove10Percent alert is raised. Steps:
|
Available in OCI | No |
5.1.23 OcnrfTransactionErrorRateAbove10Percent
Table 5-24 OcnrfTransactionErrorRateAbove10Percent
Field | Details |
---|---|
Description | 'Transaction Error rate is above 10 Percent of Total Transactions (current value is {{ $value }})' |
Summary | 'timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction Error Rate detected above 10 Percent of Total Transactions' |
Severity | Minor |
Condition | The number of failed transactions has crossed the minor threshold of 10 percent of the total transactions. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7006 |
Metric Used | 'oc_ingressgateway_http_responses_total' |
Recommended Actions |
The alert is cleared when the number of failed transactions is below 10 percent of the total transactions or when the number of failed transactions crosses the 25 percent threshold, in which case the OcnrfTransactionErrorRateAbove25Percent alert is raised. Steps:
|
Available in OCI | No |
5.1.24 OcnrfTransactionErrorRateAbove25Percent
Table 5-25 OcnrfTransactionErrorRateAbove25Percent
Field | Details |
---|---|
Description | 'Transaction Error rate is above 25 Percent of Total Transactions (current value is {{ $value }})' |
Summary | 'timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction Error Rate detected above 25 Percent of Total Transactions' |
Severity | Major |
Condition | The number of failed transactions has crossed the major threshold of 25 percent of the total transactions. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7007 |
Metric Used | 'oc_ingressgateway_http_responses_total' |
Recommended Actions |
The alert is cleared when the number of failed transactions is below 25 percent of the total transactions or when the number of failed transactions crosses the 50 percent threshold, in which case the OcnrfTransactionErrorRateAbove50Percent alert is raised. Steps:
|
Available in OCI | No |
5.1.25 OcnrfTransactionErrorRateAbove50Percent
Table 5-26 OcnrfTransactionErrorRateAbove50Percent
Field | Details |
---|---|
Description | 'Transaction Error rate is above 50 Percent of Total Transactions (current value is {{ $value }})' |
Summary | 'timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Transaction Error Rate detected above 50 Percent of Total Transactions' |
Severity | Critical |
Condition | The number of failed transactions has crossed the critical threshold of 50 percent of the total transactions. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7008 |
Metric Used | 'oc_ingressgateway_http_responses_total' |
Recommended Actions |
The alert is cleared when the number of failed transactions is below 50 percent of the total transactions. Steps:
|
Available in OCI | No |
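The five transaction error rate alerts form a ladder in which crossing a higher threshold clears the lower alert. The following is a minimal sketch of that ladder, using the thresholds and severities from the tables above. The function name and the input counts are hypothetical, and the sketch works on plain failed and total counts rather than on the `oc_ingressgateway_http_responses_total` series itself.

```python
# (threshold in percent, alert name, severity), highest tier first.
ERROR_RATE_TIERS = [
    (50.0, "OcnrfTransactionErrorRateAbove50Percent", "Critical"),
    (25.0, "OcnrfTransactionErrorRateAbove25Percent", "Major"),
    (10.0, "OcnrfTransactionErrorRateAbove10Percent", "Minor"),
    (1.0, "OcnrfTransactionErrorRateAbove1Percent", "Warning"),
    (0.1, "OcnrfTransactionErrorRateAbove0Dot1Percent", "Warning"),
]


def active_error_rate_alert(failed, total):
    """Return (alert name, severity) for the highest tier crossed, or None."""
    if total == 0:
        return None  # no traffic, so no error rate can be computed
    rate_percent = 100.0 * failed / total
    for threshold, name, severity in ERROR_RATE_TIERS:
        if rate_percent > threshold:
            return name, severity
    return None


# Hypothetical counts: 30 failures out of 100 transactions is a 30 percent error rate.
print(active_error_rate_alert(30, 100))
```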
5.1.26 OcnrfTotalEgressTrafficRateAboveCriticalThreshold
Table 5-27 OcnrfTotalEgressTrafficRateAboveCriticalThreshold
Field | Details |
---|---|
Description | 'Egress traffic rate is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is above 51600 requests per second' |
Severity | Critical |
Condition | This alert is raised when the Egress traffic rate is greater than the configured critical threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7109 |
Metric Used | oc_egressgateway_http_requests_total |
Recommended Actions |
The alert is cleared when the total Egress traffic rate falls below the critical threshold. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. Steps:
|
Available in OCI | No |
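The traffic rate alerts compare a per-second rate against a threshold, while the underlying metrics, such as `oc_egressgateway_http_requests_total`, are cumulative counters. The following is a minimal sketch of how a rate can be derived from two counter samples and compared with the 51600 requests per second figure from the Summary row. The function name and the sample values are hypothetical, counter resets are ignored, and the alert file may compute the rate over a different window.

```python
EGRESS_CRITICAL_TPS = 51600  # figure quoted in the Summary row above


def per_second_rate(count_start, count_end, interval_seconds):
    """Average per-second rate between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (count_end - count_start) / interval_seconds


# Hypothetical samples taken 60 seconds apart.
rate = per_second_rate(1_000_000, 4_120_000, 60)
print(rate, rate > EGRESS_CRITICAL_TPS)
```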
5.1.27 OcnrfTotalForwardingTrafficRateAboveCriticalThreshold
Table 5-28 OcnrfTotalForwardingTrafficRateAboveCriticalThreshold
Field | Details |
---|---|
Description | 'NRF-NRF Forwarding Rate is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: $labels.kubernetes_namespace, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is above 5200 requests per second.' |
Severity | Critical |
Condition | This alert is raised when the NRF-NRF forwarding rate is greater than the configured critical threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7110 |
Metric Used | ocnrf_forward_nfDiscover_tx_requests_total |
Recommended Actions |
The alert is cleared when the total NRF forwarding rate falls below the critical threshold. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. Steps:
|
Available in OCI | No |
5.1.28 OcnrfHeapUsageCrossedMinorThreshold
Table 5-29 OcnrfHeapUsageCrossedMinorThreshold
Field | Details |
---|---|
Description | 'OCNRF Heap Usage for pod {{ $labels.pod }} has crossed the configured minor threshold (50%) (value={{ $value }}) of its limit.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Heap Usage of pod exceeded 50% of its limit.' |
Severity | Minor |
Condition | This alert is raised when the Java memory heap usage of pods exceeds the configured minor threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7126 |
Metric Used | jvm_memory_used_bytes |
Recommended Actions |
The alert is cleared when the heap usage of pods falls below the minor threshold. Note: The threshold is configurable in the alert file. If this alert is unexpected, contact My Oracle Support. Steps:
|
Available in OCI | No |
5.1.29 OcnrfHeapUsageCrossedMajorThreshold
Table 5-30 OcnrfHeapUsageCrossedMajorThreshold
Field | Details |
---|---|
Description | 'OCNRF Heap Usage for pod {{ $labels.pod }} has crossed the configured major threshold (60%) (value={{ $value }}) of its limit.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Heap Usage of pod is more than or equal to 60% and less than 70% of its limit.' |
Severity | Major |
Condition | This alert is raised when the Java memory heap usage of pods exceeds the configured major threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7127 |
Metric Used | jvm_memory_used_bytes |
Recommended Actions |
The alert is cleared when the heap usage of pods falls below the major threshold. Note: The threshold is configurable in the alert file. If this alert is unexpected, contact My Oracle Support. Steps:
|
Available in OCI | No |
5.1.30 OcnrfHeapUsageCrossedCriticalThreshold
Table 5-31 OcnrfHeapUsageCrossedCriticalThreshold
Field | Details |
---|---|
Description | 'OCNRF Heap Usage for pod {{ $labels.pod }} has crossed the configured critical threshold (70%) (value={{ $value }}) of its limit.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.pod}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Heap Usage of pod is more than 70% of its limit.' |
Severity | Critical |
Condition | This alert is raised when the Java memory heap usage of pods exceeds the configured critical threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7128 |
Metric Used | jvm_memory_used_bytes |
Recommended Actions |
The alert is cleared when the heap usage of pods falls below the critical threshold. Note: The threshold is configurable in the alert file. If this alert is unexpected, contact My Oracle Support. Steps:
|
Available in OCI | No |
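The three heap usage alerts can be read as bands over the ratio of used heap to the heap limit. The following is a minimal sketch using the default percentages quoted in the Description rows (50%, 60%, and 70%). The function name is hypothetical. How the alert file obtains the limit is not stated in the tables, so `limit_bytes` is simply passed in here, and the handling of a value exactly on a threshold is an assumption.

```python
# (threshold in percent, alert name), highest tier first.
HEAP_TIERS = [
    (70.0, "OcnrfHeapUsageCrossedCriticalThreshold"),
    (60.0, "OcnrfHeapUsageCrossedMajorThreshold"),
    (50.0, "OcnrfHeapUsageCrossedMinorThreshold"),
]


def heap_alert(used_bytes, limit_bytes):
    """Return the alert name for the highest heap usage tier exceeded, or None."""
    if limit_bytes <= 0:
        raise ValueError("limit_bytes must be positive")
    usage_percent = 100.0 * used_bytes / limit_bytes
    for threshold, name in HEAP_TIERS:
        if usage_percent > threshold:
            return name
    return None


# Hypothetical pod using 650 MiB of a 1000 MiB heap limit (65 percent).
print(heap_alert(650 * 1024 * 1024, 1000 * 1024 * 1024))
```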
5.2 Service Level Alerts
This section lists the service level alerts.
5.2.1 OcnrfAccessTokenRequestsRejected
Table 5-32 OcnrfAccessTokenRequestsRejected
Field | Details |
---|---|
Description | 'AccessToken request(s) have been rejected by OCNRF (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} AccessToken Request has been rejected by OCNRF.' |
Severity | Warning |
Condition | NRF rejected an AccessToken Request |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7014 |
Metric Used | 'ocnrf_accessToken_tx_responses_total' |
Recommended Actions | The alert is cleared automatically.
Steps:
|
Available in OCI | No |
5.2.2 OcnrfAuditorMultiplePodUnavailable
Table 5-33 OcnrfAuditorMultiplePodUnavailable
Field | Details |
---|---|
Description | Ocnrf Auditor Multiple Pods are Unavailable in deployment |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Ocnrf Auditor Multiple Pods are Unavailable' |
Severity | Critical |
Condition | Ocnrf Auditor Multiple Pods are Unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7075 |
Metric Used | NA |
Recommended Actions |
This alert is raised when multiple Auditor pods are unavailable. This alert is cleared automatically when the pods are available. |
Available in OCI | No |
5.2.3 OcnrfAppInfoMultiplePodUnavailable
Table 5-34 OcnrfAppInfoMultiplePodUnavailable
Field | Details |
---|---|
Description | Ocnrf AppInfo Multiple Pods are Unavailable in deployment |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Ocnrf AppInfo Multiple Pods are Unavailable' |
Severity | Critical |
Condition | Ocnrf AppInfo Multiple Pods are Unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7076 |
Metric Used | NA |
Recommended Actions |
This alert is raised when multiple AppInfo pods are unavailable. This alert is cleared automatically when the pods are available. |
Available in OCI | No |
5.2.4 OcnrfPerfInfoMultiplePodUnavailable
Table 5-35 OcnrfPerfInfoMultiplePodUnavailable
Field | Details |
---|---|
Description | Ocnrf PerfInfo Multiple Pods are Unavailable in deployment |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Ocnrf PerfInfo Multiple Pods are Unavailable' |
Severity | Critical |
Condition | Ocnrf PerfInfo Multiple Pods are Unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7077 |
Metric Used | NA |
Recommended Actions |
This alert is raised when multiple PerfInfo pods are unavailable. This alert is cleared automatically when the pods are available. |
Available in OCI | No |
5.2.5 OcnrfTotalSLFRateAboveCriticalThreshold
Table 5-36 OcnrfTotalSLFRateAboveCriticalThreshold
Field | Details |
---|---|
Description | 'NRF-SLF Rate is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: $labels.kubernetes_namespace, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is above 45600 requests per second.' |
Severity | Critical |
Condition | This alert is raised when the request rate between NRF and SLF is greater than the configured critical threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7111 |
Metric Used | ocnrf_SLF_tx_requests_total |
Recommended Actions | The alert is cleared when the total SLF rate falls below the critical threshold.
Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support.
Steps:
|
Available in OCI | No |
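As a rough illustration of how a rate threshold relates to a counter such as ocnrf_SLF_tx_requests_total, the following sketch derives an average per-second rate from two counter samples and compares it with the 45600 requests per second quoted in the Summary. The function names, the sample format, and the 60-second spacing used in the usage note are assumptions made for this example. The actual expression and evaluation window are defined in the alert file.

```python
from typing import Tuple

# Figure quoted in the alert Summary; the real value is configurable in the alert file.
SLF_CRITICAL_THRESHOLD_RPS = 45600.0

# Hypothetical sample format: (unix_timestamp_seconds, counter_value)
Sample = Tuple[float, float]


def per_second_rate(older: Sample, newer: Sample) -> float:
    """Return the average per-second increase of a counter between two samples."""
    elapsed = newer[0] - older[0]
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    increase = newer[1] - older[1]
    if increase < 0:
        # The counter went backwards, which suggests a reset (for example, a pod
        # restart). Count only what has accumulated since the reset.
        increase = newer[1]
    return increase / elapsed


def is_above_threshold(rate: float,
                       threshold: float = SLF_CRITICAL_THRESHOLD_RPS) -> bool:
    """Return True when the rate is strictly greater than the threshold."""
    return rate > threshold
```

For example, two samples taken 60 seconds apart with a counter increase of 3,000,000 give a rate of 50000 requests per second, which is above the quoted threshold.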
5.2.6 OcnrfTotalDiscoveryRateAboveCriticalThreshold
Table 5-37 OcnrfTotalDiscoveryRateAboveCriticalThreshold
Field | Details |
---|---|
Description | 'Total Discovery Rate is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: $labels.kubernetes_namespace, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Traffic Rate is above 51600 requests per second.' |
Severity | Critical |
Condition | This alert is raised when the total discovery rate is greater than the configured critical threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7112 |
Metric Used | ocnrf_nfDiscover_rx_requests_total |
Recommended Actions | The alert is cleared when the total discovery rate falls below the critical threshold.
Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support.
Steps:
|
Available in OCI | No |
5.2.7 OcnrfAccessTokenRequestsAboveThreshold
Table 5-38 OcnrfAccessTokenRequestsAboveThreshold
Field | Details |
---|---|
Description | 'Total Access token request rate is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total Access token request rate is above 5' |
Severity | Critical |
Condition | The alert is raised when the rate of Access Token requests is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7115 |
Metric Used | ocnrf_accessToken_rx_requests_total |
Recommended Actions | The alert is cleared when the total access token request rate falls below the critical threshold.
Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support.
Steps:
|
Available in OCI | No |
5.2.8 OcnrfNfUpdateRequestsAboveThreshold
Table 5-39 OcnrfNfUpdateRequestsAboveThreshold
Field | Details |
---|---|
Description | 'Total NfUpdate request rate is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total NfUpdate request rate is above 5' |
Severity | Critical |
Condition | This alert is raised when the total number of NfUpdate requests is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7116 |
Metric Used | ocnrf_nfUpdate_rx_requests_total |
Recommended Actions | The alert is cleared when the total number of NfUpdate requests falls below the critical threshold.
Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support.
Steps:
|
Available in OCI | No |
5.2.9 OcnrfNfHeartBeatRequestsAboveThreshold
Table 5-40 OcnrfNfHeartBeatRequestsAboveThreshold
Field | Details |
---|---|
Description | 'Total NfHeartBeat request rate is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total NfHeartBeat request rate is above 52' |
Severity | Critical |
Condition | This alert is raised when the total number of NfHeartBeat requests is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7117 |
Metric Used | ocnrf_nfHeartBeat_rx_requests_total |
Recommended Actions | The alert is cleared when the total number of NfHeartBeat requests falls below the critical threshold.
Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support.
Steps:
|
Available in OCI | No |
5.2.10 OcnrfRegisteredNfCountAboveThreshold
Table 5-41 OcnrfRegisteredNfCountAboveThreshold
Field | Details |
---|---|
Description | 'Total Number of active registrations in OCNRF is above critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total Number of active registrations in OCNRF is above 260' |
Severity | Critical |
Condition | The alert is raised when the total number of NFs registered in the set is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7118 |
Metric Used | ocnrf_nf_registered_count |
Recommended Actions | The alert is cleared when the total number of active registrations in NRF falls below the critical threshold.
Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional registrations. If this alert is unexpected, contact My Oracle Support.
Step:
|
Available in OCI | No |
5.2.11 OcnrfNfProfileSizeAboveThreshold
Table 5-42 OcnrfNfProfileSizeAboveThreshold
Field | Details |
---|---|
Description | 'The size of the NF profile is above the critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: The size of the NF profile is above 12kB threshold' |
Severity | Critical |
Condition | This alert is raised when the size of the NF profile is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7119 |
Metric Used | ocnrf_nf_profile_size |
Recommended Actions | The alert is cleared when the size of the NF profile is smaller than the critical threshold.
Note: The threshold is configurable in the alert file.
Step:
|
Available in OCI | No |
5.2.12 OcnrfDiscoveryResponseSizeAboveThreshold
Table 5-43 OcnrfDiscoveryResponseSizeAboveThreshold
Field | Details |
---|---|
Description | 'The size of nfDiscover response is above the critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: The size of nfDiscover response is above 45kB threshold' |
Severity | Critical |
Condition | This alert is raised when the size of the nfDiscover response is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7120 |
Metric Used | ocnrf_nfDiscover_tx_response_size_bytes_max |
Recommended Actions | The alert is cleared when the size of the nfDiscover response is less than the critical threshold.
Note: The threshold is configurable in the alert file.
Step:
|
Available in OCI | No |
5.2.13 OcnrfTotalSubscriptionsAboveThreshold
Table 5-44 OcnrfTotalSubscriptionsAboveThreshold
Field | Details |
---|---|
Description | 'Total Number of active subscriptions in OCNRF is above the critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total Number of active subscriptions in OCNRF is above 1000.' |
Severity | Critical |
Condition | This alert is raised when the total number of active subscriptions in NRF is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7121 |
Metric Used | ocnrf_nfset_active_subscriptions |
Recommended Actions | The alert is cleared when the total number of active subscriptions in NRF is less than the critical threshold.
Note: The threshold is configurable in the alert file. Reassess why the NRF has received additional subscriptions (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support.
Steps:
|
Available in OCI | No |
5.2.14 OcnrfDiscoveryRequestsForUDRAboveThreshold
Table 5-45 OcnrfDiscoveryRequestsForUDRAboveThreshold
Field | Details |
---|---|
Description | 'Total NfDiscover request rate for nfType UDR is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total NfDiscover request rate for nfType UDR is above 700' |
Severity | Critical |
Condition | This alert is raised when the rate of nfDiscover requests for nfType UDR is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7122 |
Metric Used | ocnrf_nfDiscover_rx_requests_total |
Recommended Actions | The alert is cleared when the rate of nfDiscover requests for nfType UDR falls below the critical threshold. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic for UDR. If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.2.15 OcnrfDiscoveryRequestsForUDMAboveThreshold
Table 5-46 OcnrfDiscoveryRequestsForUDMAboveThreshold
Field | Details |
---|---|
Description | 'Total NfDiscover request rate for nfType UDM is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total NfDiscover request rate for nfType UDM is above 46000' |
Severity | Critical |
Condition | This alert is raised when the rate of nfDiscover requests for nfType UDM is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7123 |
Metric Used | ocnrf_nfDiscover_rx_requests_total |
Recommended Actions | The alert is cleared when the rate of nfDiscover requests for nfType UDM falls below the critical threshold. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic for UDM. If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.2.16 OcnrfDiscoveryRequestsForAMFAboveThreshold
Table 5-47 OcnrfDiscoveryRequestsForAMFAboveThreshold
Field | Details |
---|---|
Description | 'Total NfDiscover request rate for nfType AMF is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total NfDiscover request rate for nfType AMF is above 2500' |
Severity | Critical |
Condition | This alert is raised when the rate of nfDiscover requests for nfType AMF is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7124 |
Metric Used | ocnrf_nfDiscover_rx_requests_total |
Recommended Actions | The alert is cleared when the rate of nfDiscover requests for nfType AMF falls below the critical threshold. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic for AMF. If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.2.17 OcnrfDiscoveryRequestsForSMFAboveThreshold
Table 5-48 OcnrfDiscoveryRequestsForSMFAboveThreshold
Field | Details |
---|---|
Description | 'Total NfDiscover request rate for nfType SMF is above the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Total NfDiscover request rate for nfType SMF is above 4500' |
Severity | Critical |
Condition | This alert is raised when the rate of nfDiscover requests for nfType SMF is greater than the configured threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7125 |
Metric Used | ocnrf_nfDiscover_rx_requests_total |
Recommended Actions | The alert is cleared when the rate of nfDiscover requests for nfType SMF falls below the critical threshold. Note: The threshold is configurable in the alert file. Reassess why the NRF is receiving additional traffic for SMF. If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
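The four per-nfType discovery alerts above share one metric and differ only in the nfType and the threshold. The following sketch shows that comparison, using the figures quoted in the Summary fields (700 for UDR, 46000 for UDM, 2500 for AMF, and 4500 for SMF). The dictionary, the function name, and the input format are hypothetical and exist only for this example. The real thresholds are whatever is configured in the alert file.

```python
from typing import Dict, List

# Figures quoted in the Summary fields of the four alerts; configurable in the alert file.
DEFAULT_DISCOVERY_THRESHOLDS: Dict[str, float] = {
    "UDR": 700.0,
    "UDM": 46000.0,
    "AMF": 2500.0,
    "SMF": 4500.0,
}


def nf_types_above_threshold(
    rates: Dict[str, float],
    thresholds: Dict[str, float] = DEFAULT_DISCOVERY_THRESHOLDS,
) -> List[str]:
    """Return, in sorted order, the nfTypes whose discovery rate exceeds its threshold.

    nfTypes without a configured threshold are ignored, because no alert
    is defined for them in this sketch.
    """
    return sorted(
        nf_type
        for nf_type, rate in rates.items()
        if nf_type in thresholds and rate > thresholds[nf_type]
    )
```

A rate equal to the threshold does not count as a breach here, which matches the "greater than the configured threshold" wording of the Condition fields.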
5.3 NfProfile Status Change Alerts
This section lists the alerts raised when there is a status change in NfProfile.
5.3.1 OcnrfRegisteredPCFsBelowCriticalThreshold
Table 5-49 OcnrfRegisteredPCFsBelowCriticalThreshold
Field | Details |
---|---|
Description | 'The number of registered NFs detected below critical threshold (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nftype:{{$labels.RequesterNfType}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: The number of registered NFs detected below critical threshold.' |
Severity | Critical |
Condition |
The number of NFs of NfType PCF currently registered with NRF is below the critical threshold. Note: The operator can add similar alerts for each NfType and configure the corresponding thresholds as required. By default, the alert file triggers this alert when the number of PCFs registered with NRF is below 2. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7009 |
Metric Used | 'ocnrf_active_registrations_count' |
Recommended Actions |
The alert is cleared when the number of registered PCFs is above the critical threshold. Steps:
|
Notes |
|
Available in OCI | No |
5.3.2 OcnrfRegisteredPCFsBelowMajorThreshold
Table 5-50 OcnrfRegisteredPCFsBelowMajorThreshold
Field | Details |
---|---|
Description | 'The number of registered NFs detected below major threshold (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nftype:{{$labels.NfType}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: The number of registered NFs detected below major threshold.' |
Severity | Major |
Condition |
The number of NFs of NfType PCF currently registered with NRF is below the major threshold. Note: The operator can add similar alerts for each NfType and configure the corresponding thresholds as required. By default, the alert file triggers this alert when the number of PCFs registered with NRF is greater than or equal to 2 and below 10. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7010 |
Metric Used | 'ocnrf_active_registrations_count' |
Recommended Actions |
The alert is cleared when the number of registered PCFs is above the major threshold. Steps:
|
Notes |
|
Available in OCI | No |
5.3.3 OcnrfRegisteredPCFsBelowMinorThreshold
Table 5-51 OcnrfRegisteredPCFsBelowMinorThreshold
Field | Details |
---|---|
Description | 'The number of registered NFs detected below minor threshold (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nftype:{{$labels.NfType}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: The number of registered NFs detected below minor threshold.' |
Severity | Minor |
Condition |
The number of NFs of NfType PCF currently registered with NRF is below the minor threshold. Note: The operator can add similar alerts for each NfType and configure the corresponding thresholds as required. By default, the alert file triggers this alert when the number of PCFs registered with NRF is greater than or equal to 10 and below 20. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7011 |
Metric Used | 'ocnrf_active_registrations_count' |
Recommended Actions |
The alert is cleared when the number of registered PCFs is above the minor threshold. Steps:
|
Notes |
|
Available in OCI | No |
5.3.4 OcnrfRegisteredPCFsBelowThreshold
Table 5-52 OcnrfRegisteredPCFsBelowThreshold
Field | Details |
---|---|
Description | 'The number of registered NFs is approaching minor threshold (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nftype:{{$labels.NfType}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: The number of registered NFs approaching minor threshold.' |
Severity | Warning |
Condition |
The number of NFs of NfType PCF currently registered with NRF is approaching the minor threshold. Note: The operator can add similar alerts for each NfType and configure the corresponding thresholds as required. By default, the alert file triggers this alert when the number of PCFs registered with NRF is greater than or equal to 20 and below 30. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7012 |
Metric Used | 'ocnrf_active_registrations_count' |
Recommended Actions |
The alert is cleared when the number of registered PCFs is no longer approaching the minor threshold. Steps:
|
Notes |
|
Available in OCI | No |
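The default trigger points of the four registered-PCF alerts form contiguous bands, so at most one of them applies to a given count. The following sketch maps a registered PCF count to the severity that would apply under those defaults (below 2, 2 to 9, 10 to 19, and 20 to 29). The function name and the lowercase severity strings are assumptions for this example, and the band edges change if the thresholds in the alert file are reconfigured.

```python
from typing import Optional


def pcf_registration_severity(registered_pcfs: int) -> Optional[str]:
    """Return the alert severity for a registered PCF count, or None if no alert applies.

    Bands follow the default trigger points described in the Condition fields:
      count < 2         -> critical
      2 <= count < 10   -> major
      10 <= count < 20  -> minor
      20 <= count < 30  -> warning
    """
    if registered_pcfs < 0:
        raise ValueError("registered_pcfs cannot be negative")
    if registered_pcfs < 2:
        return "critical"
    if registered_pcfs < 10:
        return "major"
    if registered_pcfs < 20:
        return "minor"
    if registered_pcfs < 30:
        return "warning"
    return None
```

Because each band has a lower bound as well as an upper bound, a falling count raises the next, more severe alert while the less severe one stops matching.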
5.3.5 OcnrfTotalNFsRegisteredBelowCriticalThreshold
Table 5-53 OcnrfTotalNFsRegisteredBelowCriticalThreshold
Field | Details |
---|---|
Description | 'Number of active registrations in OCNRF (current value is: {{ $value }}) is below critical threshold' |
Summary | kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Active registrations count. |
Severity | Critical |
Condition | The total number of NFs currently in "REGISTERED" state with the NRF is below the critical threshold.
Note: The threshold values are provided as an example. The user can configure the threshold value as per the requirement. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7042 |
Metric Used | 'ocnrf_active_registrations_count' |
Recommended Actions | The alert is cleared when the number of registered NFs
is above the critical threshold.
Steps:
|
Notes |
|
Available in OCI | Yes |
5.3.6 OcnrfTotalNFsRegisteredBelowMajorThreshold
Table 5-54 OcnrfTotalNFsRegisteredBelowMajorThreshold
Field | Details |
---|---|
Description | 'Number of active registrations in OCNRF (current value is: {{ $value }}) is below major threshold' |
Summary | kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Active registrations count. |
Severity | Major |
Condition | The total number of NFs currently in "REGISTERED" state with the NRF is below the major threshold.
Note: The threshold values are provided as an example. The user can configure the threshold value as per the requirement. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7043 |
Metric Used | 'ocnrf_active_registrations_count' |
Recommended Actions | The alert is cleared when the number of registered NFs
is above the major threshold.
Steps:
|
Notes |
|
Available in OCI | Yes |
5.3.7 OcnrfTotalNFsRegisteredBelowMinorThreshold
Table 5-55 OcnrfTotalNFsRegisteredBelowMinorThreshold
Field | Details |
---|---|
Description | 'Number of active registrations in OCNRF (current value is: {{ $value }}) is below minor threshold' |
Summary | kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Active registrations count. |
Severity | Minor |
Condition | The total number of NFs currently in "REGISTERED" state with the NRF is below the minor threshold.
Note: The threshold values are provided as an example. The user can configure the threshold value as per the requirement. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7044 |
Metric Used | 'ocnrf_active_registrations_count' |
Recommended Actions | The alert is cleared when the number of registered NFs
is above the minor threshold.
Steps:
|
Notes |
|
Available in OCI | Yes |
5.3.8 OcnrfTotalNFsRegisteredApproachingMinorThreshold
Table 5-56 OcnrfTotalNFsRegisteredApproachingMinorThreshold
Field | Details |
---|---|
Description | 'Number of active registrations in OCNRF (current value is: {{ $value }}) is approaching minor threshold' |
Summary | kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Active registrations count. |
Severity | Info |
Condition | The total number of NFs currently in "REGISTERED" state with the NRF is approaching the minor threshold.
Note: The threshold values are provided as an example. The user can configure the threshold value as per the requirement. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7045 |
Metric Used | 'ocnrf_active_registrations_count' |
Recommended Actions | The alert is cleared when the number of registered NFs is no longer approaching the minor threshold.
Steps: No action is required. This is an information alert. |
Notes |
|
Available in OCI | Yes |
5.3.9 OcnrfNFStatusTransitionToRegistered
Table 5-57 OcnrfNFStatusTransitionToRegistered
Field | Details |
---|---|
Description | 'NF with NF profile fqdn {{$labels.NfProfileFqdn}} NF instance id {{$labels.NfInstanceId}} NF type {{$labels.NfType}} is REGISTERED , previous status was {{$labels.PreviousStatus}}' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}},podname: {{$labels.kubernetes_pod_name}},NfInstanceId: {{$labels.NfInstanceId}},NfProfileFqdn: {{$labels.NfProfileFqdn}},NfType: {{$labels.NfType}},PreviousStatus: {{$labels.PreviousStatus}},NewStatus: {{$labels.NewStatus}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} NF is REGISTERED.' |
Severity | Info |
Condition | NF Instance's status transitions to REGISTERED.
Note: When multiple alerts are present for a given NF, the latest alert is always considered. The timestamp can also be seen in the "Active Since" field of the alert in Prometheus. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7046 |
Metric Used | ocnrf_nfInstance_status_change_total |
Recommended Actions | The alert is cleared automatically after a window of 5
minutes.
Steps: No action is required. This is an information alert. |
Available in OCI | Yes |
5.3.10 OcnrfNFServiceStatusTransitionToRegistered
Table 5-58 OcnrfNFServiceStatusTransitionToRegistered
Field | Details |
---|---|
Description | 'NF service {{$labels.NfServiceName}} and service instance id {{$labels.NfServiceInstanceId}} of NF profile fqdn {{$labels.NfProfileFqdn}} and instance id {{$labels.NfInstanceId}} is REGISTERED, previous status was {{$labels.PreviousStatus}}' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}},podname: {{$labels.kubernetes_pod_name}},NfInstanceId: {{$labels.NfInstanceId}},NfServiceName: {{$labels.NfServiceName}},NfServiceInstanceId:{{$labels.NfServiceInstanceId}},NfProfileFqdn: {{$labels.NfProfileFqdn}},NfServiceFqdn: {{$labels.NfServiceFqdn}},PreviousStatus: {{$labels.PreviousStatus}},NewStatus: {{$labels.NewStatus}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} NF service is REGISTERED.' |
Severity | Info |
Condition | Status of an NF Instance's service transitions to
REGISTERED.
Note: When multiple alerts are present for a given NF, the latest alert is always considered. The timestamp can also be seen in the "Active Since" field of the alert in Prometheus. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7047 |
Metric Used | ocnrf_nfService_status_change_total |
Recommended Actions | The alert is cleared automatically after a window of 5
minutes.
Steps: No action is required. This is an information alert. |
Available in OCI | Yes |
5.3.11 OcnrfNFStatusTransitionToSuspended
Table 5-59 OcnrfNFStatusTransitionToSuspended
Field | Details |
---|---|
Description | 'NF with NF profile fqdn {{$labels.NfProfileFqdn}} NF instance id {{$labels.NfInstanceId}} NF type {{$labels.NfType}} is SUSPENDED, previous status was {{$labels.PreviousStatus}}' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}},podname: {{$labels.kubernetes_pod_name}},NfInstanceId: {{$labels.NfInstanceId}},NfProfileFqdn: {{$labels.NfProfileFqdn}},NfType: {{$labels.NfType}},PreviousStatus: {{$labels.PreviousStatus}},NewStatus: {{$labels.NewStatus}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} NF is SUSPENDED.' |
Severity | Major |
Condition | NF Instance's status transitions to SUSPENDED.
Note: When multiple alerts are present for a given NF, the latest alert is always considered. The timestamp can also be seen in the "Active Since" field of the alert in Prometheus. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7048 |
Metric Used | ocnrf_nfInstance_status_change_total |
Recommended Actions | The alert is cleared automatically after a window of 5
minutes.
Steps:
|
Available in OCI | Yes |
5.3.12 OcnrfNFServiceStatusTransitionToSuspended
Table 5-60 OcnrfNFServiceStatusTransitionToSuspended
Field | Details |
---|---|
Description | 'NF service {{$labels.NfServiceName}} and service instance id {{$labels.NfServiceInstanceId}} of NF profile fqdn {{$labels.NfProfileFqdn}} and instance id {{$labels.NfInstanceId}} is SUSPENDED, previous status was {{$labels.PreviousStatus}}' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}},podname: {{$labels.kubernetes_pod_name}},NfInstanceId: {{$labels.NfInstanceId}},NfServiceName: {{$labels.NfServiceName}},NfServiceInstanceId:{{$labels.NfServiceInstanceId}},NfProfileFqdn: {{$labels.NfProfileFqdn}},NfServiceFqdn: {{$labels.NfServiceFqdn}},PreviousStatus: {{$labels.PreviousStatus}},NewStatus: {{$labels.NewStatus}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} NF service is SUSPENDED.' |
Severity | Minor |
Condition | Status of an NF Instance's service transitions to
SUSPENDED.
Note: When multiple alerts are present for a given NF, the latest alert is always considered. The timestamp can also be seen in the "Active Since" field of the alert in Prometheus. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7049 |
Metric Used | ocnrf_nfService_status_change_total |
Recommended Actions | The alert is cleared automatically after a window of 5
minutes.
Steps:
|
Available in OCI | Yes |
5.3.13 OcnrfNFStatusTransitionToUndiscoverable
Table 5-61 OcnrfNFStatusTransitionToUndiscoverable
Field | Details |
---|---|
Description | 'NF with NF profile fqdn {{$labels.NfProfileFqdn}} NF instance id {{$labels.NfInstanceId}} NF type {{$labels.NfType}} is UNDISCOVERABLE, previous status was {{$labels.PreviousStatus}}' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}},podname: {{$labels.kubernetes_pod_name}},NfInstanceId: {{$labels.NfInstanceId}},NfProfileFqdn: {{$labels.NfProfileFqdn}},NfType: {{$labels.NfType}},PreviousStatus: {{$labels.PreviousStatus}},NewStatus: {{$labels.NewStatus}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} NF is UNDISCOVERABLE.' |
Severity | Info |
Condition | NF Instance's status transitions to UNDISCOVERABLE.
Note: When multiple alerts are present for a given NF, the latest alert is always considered. The timestamp can also be seen in the "Active Since" field of the alert in Prometheus. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7050 |
Metric Used | ocnrf_nfInstance_status_change_total |
Recommended Actions | The alert is cleared automatically after a window of 5
minutes.
Steps:
|
Available in OCI | Yes |
5.3.14 OcnrfNFServiceStatusTransitionToUndiscoverable
Table 5-62 OcnrfNFServiceStatusTransitionToUndiscoverable
Field | Details |
---|---|
Description |
'NF service {{$labels.NfServiceName}} and service instance id {{$labels.NfServiceInstanceId}} of NF profile fqdn {{$labels.NfProfileFqdn}} and instance id {{$labels.NfInstanceId}} is UNDISCOVERABLE, previous status was {{$labels.PreviousStatus}}' |
Summary |
'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}},podname: {{$labels.kubernetes_pod_name}},NfInstanceId: {{$labels.NfInstanceId}},NfServiceName: {{$labels.NfServiceName}},NfServiceInstanceId:{{$labels.NfServiceInstanceId}},NfProfileFqdn: {{$labels.NfProfileFqdn}},NfServiceFqdn: {{$labels.NfServiceFqdn}},PreviousStatus: {{$labels.PreviousStatus}},NewStatus: {{$labels.NewStatus}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} NF service is UNDISCOVERABLE.' |
Severity | Info |
Condition | Status of an NF Instance's service transitions to
UNDISCOVERABLE.
Note: When multiple alerts are present for a given NF, the latest alert is always considered. The timestamp can also be seen in the "Active Since" field of the alert in Prometheus. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7051 |
Metric Used | ocnrf_nfService_status_change_total |
Recommended Actions | The alert is cleared automatically after a window of 5
minutes.
Steps:
|
Available in OCI | Yes |
5.3.15 OcnrfNFStatusTransitionToDeregistered
Table 5-63 OcnrfNFStatusTransitionToDeregistered
Field | Details |
---|---|
Description | 'NF with NF profile fqdn {{$labels.NfProfileFqdn}} NF instance id {{$labels.NfInstanceId}} NF type {{$labels.NfType}} is DEREGISTERED, previous status was {{$labels.PreviousStatus}}' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}},podname: {{$labels.kubernetes_pod_name}},NfInstanceId: {{$labels.NfInstanceId}},NfProfileFqdn: {{$labels.NfProfileFqdn}},NfType: {{$labels.NfType}},PreviousStatus: {{$labels.PreviousStatus}},NewStatus: {{$labels.NewStatus}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} NF is DEREGISTERED.' |
Severity | Info |
Condition | NF Instance's status transitions to DEREGISTERED.
Note: When multiple alerts are present for a given NF, the latest alert is always considered. The timestamp can also be seen in the "Active Since" field of the alert in Prometheus. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7052 |
Metric Used | ocnrf_nfInstance_status_change_total |
Recommended Actions | The alert is cleared automatically after a window of 5
minutes.
Steps:
|
Available in OCI | Yes |
5.3.16 OcnrfNFServiceStatusTransitionToDeregistered
Table 5-64 OcnrfNFServiceStatusTransitionToDeregistered
Field | Details |
---|---|
Description | 'NF service {{$labels.NfServiceName}} and service instance id {{$labels.NfServiceInstanceId}} of NF profile fqdn {{$labels.NfProfileFqdn}} and instance id {{$labels.NfInstanceId}} is DEREGISTERED, previous status was {{$labels.PreviousStatus}}' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}},podname: {{$labels.kubernetes_pod_name}},NfInstanceId: {{$labels.NfInstanceId}},NfServiceName: {{$labels.NfServiceName}},NfServiceInstanceId:{{$labels.NfServiceInstanceId}},NfProfileFqdn: {{$labels.NfProfileFqdn}},NfServiceFqdn: {{$labels.NfServiceFqdn}},PreviousStatus: {{$labels.PreviousStatus}},NewStatus: {{$labels.NewStatus}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} NF service is DEREGISTERED.' |
Severity | Info |
Condition | Status of an NF Instance's service transitions to DEREGISTERED.
Note: When multiple alerts are present for a given NF, the latest alert is always considered. The timestamp can also be seen in the "Active Since" field of the alert in Prometheus. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7053 |
Metric Used | ocnrf_nfService_status_change_total |
Recommended Actions | The alert is cleared automatically after a window of 5 minutes. |
Available in OCI | Yes |
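The note that "the latest alert is always considered" can be illustrated with a minimal Python sketch. It assumes the alerts have already been fetched into a list of dictionaries; the `active_since` key is a hypothetical stand-in for the "Active Since" field shown in Prometheus, while `NfInstanceId` and `NewStatus` are the labels used in the alert summaries above.

```python
from datetime import datetime


def latest_alert_per_nf(alerts):
    """Keep only the most recent status-transition alert for each NF instance."""
    latest = {}
    for alert in alerts:
        nf_id = alert["NfInstanceId"]
        # Replace the stored alert only when this one became active later.
        if nf_id not in latest or alert["active_since"] > latest[nf_id]["active_since"]:
            latest[nf_id] = alert
    return latest


# Illustrative data only; "active_since" is a hypothetical field name.
alerts = [
    {"NfInstanceId": "nf-1", "NewStatus": "SUSPENDED",
     "active_since": datetime(2024, 1, 1, 10, 0)},
    {"NfInstanceId": "nf-1", "NewStatus": "DEREGISTERED",
     "active_since": datetime(2024, 1, 1, 10, 3)},
    {"NfInstanceId": "nf-2", "NewStatus": "DEREGISTERED",
     "active_since": datetime(2024, 1, 1, 9, 58)},
]

current = latest_alert_per_nf(alerts)
print(current["nf-1"]["NewStatus"])
```

For `nf-1`, the later DEREGISTERED alert supersedes the earlier SUSPENDED one, which is the state an operator should act on.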
5.4 Feature Specific Alerts
This section lists the feature specific alerts.
5.4.1 KeyID for AccessToken Feature
This section lists the alerts that are specific to KeyID for AccessToken feature. For more information about the feature, see the "Key-ID for AccessToken" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.1.1 OcnrfAccessTokenCurrentKeyIdNotConfigured
Table 5-65 OcnrfAccessTokenCurrentKeyIdNotConfigured
Field | Details |
---|---|
Description | 'AccessToken request(s) have been rejected by OCNRF (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} AccessToken Request has been rejected by OCNRF as Current Key Id is not configured.' |
Severity | Critical |
Condition | NRF Access Token Rejected due to CurrentKeyId not configured |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7033 |
Metric Used | 'ocnrf_accessToken_tx_responses_total' |
Recommended Actions | The alert is raised when NRF receives an Access Token request while no Current Key ID is selected, and it is cleared automatically. For more information about configuring the currentKeyID parameter, see Oracle Communications Cloud Native Core, Network Repository Function REST Specification Guide. |
Available in OCI | No |
5.4.1.2 OcnrfAccessTokenCurrentKeyIdInvalidDetails
Table 5-66 OcnrfAccessTokenCurrentKeyIdInvalidDetails
Field | Details |
---|---|
Description | 'AccessToken request(s) have been rejected by OCNRF (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, KeyType: {{$labels.KeyType}}, RejectionReason: {{$labels.RejectionReason}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} AccessToken Request has been rejected by OCNRF as CurrentKeyId details are invalid.' |
Severity | Critical |
Condition | NRF Access Token Rejected due to token signing details corresponding to CurrentKeyId are invalid. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7034 |
Metric Used | 'ocnrf_accessToken_tx_responses_total' |
Recommended Actions | The alert is raised when NRF receives an Access Token request while the Current Key ID details are invalid, and it is cleared automatically. For more information about configuring the currentKeyID parameter, see Oracle Communications Cloud Native Core, Network Repository Function REST Specification Guide. |
Available in OCI | No |
5.4.1.3 OcnrfOauthCurrentKeyNotConfigured
Table 5-67 OcnrfOauthCurrentKeyNotConfigured
Field | Details |
---|---|
Description | 'OCNRF Oauth Access token Current Key Id is not configured' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Oauth Access token Current Key Id is not configured.' |
Severity | Critical |
Condition | Oauth Current Key ID is not configured |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7035 |
Metric Used | ocnrf_oauth_currentKeyId_configuredStatus |
Recommended Actions | The alert is cleared when the current key ID is configured.
Steps: Configure a valid current key ID in Access Token Configuration. For more information about configuring |
Available in OCI | No |
5.4.1.4 OcnrfOauthCurrentKeyDataHealthStatus
Table 5-68 OcnrfOauthCurrentKeyDataHealthStatus
Field | Details |
---|---|
Description | 'OCNRF Oauth Access token Current Key Id status is not healthy' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, KeyId: {{$labels.KeyId}}, KeyType: {{$labels.KeyType}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Oauth Access token Current Key Id status is not healthy.' |
Severity | Critical |
Condition | Oauth Current Key ID details health is not good. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7036 |
Metric Used | ocnrf_oauth_keyData_healthStatus |
Recommended Actions | The alert is cleared when the current key ID status is healthy. Steps: Rectify the condition by checking the ErrorCondition. For example, for ErrorCondition Invalid_Key_Details, check that the k8SecretName, k8SecretNameSpace, and filename combination exists for both privateKey and certificate. Make sure that the pem file data is not corrupt and that the certificate has not expired. |
Available in OCI | No |
5.4.1.5 OcnrfOauthNonCurrentKeyDataHealthStatus
Table 5-69 OcnrfOauthNonCurrentKeyDataHealthStatus
Field | Details |
---|---|
Description | 'OCNRF Oauth Access token Non current Key Id status is not healthy' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, KeyId: {{$labels.KeyId}}, KeyType: {{$labels.KeyType}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Oauth Access token non current Key Id status is not healthy.' |
Severity | Info |
Condition | Oauth Non Current Key details health is not good |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7037 |
Metric Used | ocnrf_oauth_keyData_healthStatus |
Recommended Actions | The alert is cleared when the non-current key ID status is healthy. Steps: Rectify the condition by checking the ErrorCondition. For example, for ErrorCondition Invalid_Key_Details, check that the k8SecretName, k8SecretNameSpace, and filename combination exists for both privateKey and certificate. Make sure that the pem file data is not corrupt and that the certificate has not expired. |
Available in OCI | No |
5.4.1.6 OcnrfOauthCurrentCertificateExpiringIn1Week
Table 5-70 OcnrfOauthCurrentCertificateExpiringIn1Week
Field | Details |
---|---|
Description | 'OCNRF Oauth Access token current Key Id certificate is expiring in less than 1 week' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, KeyId: {{$labels.KeyId}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Oauth Access token current Key Id certificate is expiring in less than 1 week.' |
Severity | Critical |
Condition | Oauth Current Key ID details are expiring in less than 1 week |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7038 |
Metric Used | ocnrf_oauth_keyData_expiryStatus |
Recommended Actions | The alert is cleared when the key expiry time is more than 1 week away. Steps: Replace the expiring certificate key pair with a new one. For more information on creating a certificate key pair, see Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
5.4.1.7 OcnrfOauthNonCurrentCertificateExpiringIn1Week
Table 5-71 OcnrfOauthNonCurrentCertificateExpiringIn1Week
Field | Details |
---|---|
Description | 'OCNRF Oauth Access token non current Key Id certificate is expiring in less than 1 week' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, KeyId: {{$labels.KeyId}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Oauth Access token non current Key Id certificate is expiring in less than 1 week.' |
Severity | Info |
Condition | Oauth Non Current Key ID details are expiring in less than 1 week |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7039 |
Metric Used | ocnrf_oauth_keyData_expiryStatus |
Recommended Actions | The alert is cleared when the key expiry time is more than 1 week away. Steps: Replace the expiring certificate key pair with a new one. For more information on creating a certificate key pair, see Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
5.4.1.8 OcnrfOauthCurrentCertificateExpiringIn30days
Table 5-72 OcnrfOauthCurrentCertificateExpiringIn30days
Field | Details |
---|---|
Description | 'OCNRF Oauth Access token current Key Id certificate is expiring in less than 30 days' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, KeyId: {{$labels.KeyId}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Oauth Access token current Key Id certificate is expiring in less than 30 days.' |
Severity | Major |
Condition | Oauth Current Key ID details are expiring in more than 24 hours and less than 30 days |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7040 |
Metric Used | ocnrf_oauth_keyData_expiryStatus |
Recommended Actions | The alert is cleared when the expiry time of the certificate for the current key ID is more than 30 days away. Steps: Replace the expiring certificate key pair with a new one. For more information on creating a certificate key pair, see Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
5.4.1.9 OcnrfOauthNonCurrentCertificateExpiringIn30days
Table 5-73 OcnrfOauthNonCurrentCertificateExpiringIn30days
Field | Details |
---|---|
Description | 'OCNRF Oauth Access token non current Key Id certificate is expiring in less than 30 days' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, KeyId: {{$labels.KeyId}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Oauth Access token non current Key Id certificate is expiring in less than 30 days.' |
Severity | Info |
Condition | Oauth Non Current Key ID details are expiring in more than 24 hours and less than 30 days |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7041 |
Metric Used | ocnrf_oauth_keyData_expiryStatus |
Recommended Actions | The alert is cleared when the expiry time of the certificate for the non-current key ID is more than 30 days away. Steps: Replace the expiring certificate key pair with a new one. For more information on creating a certificate key pair, see Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
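The expiry conditions listed for the four certificate alerts can be summarized in a short Python sketch. This is an illustration of the documented conditions only, not the rule NRF actually evaluates: the function name, its parameters, and the way the certificate's expiry time is obtained are all assumptions. Already-expired certificates are left out here, since the health status alerts above cover that case.

```python
from datetime import datetime, timedelta, timezone

ONE_DAY = timedelta(hours=24)
ONE_WEEK = timedelta(days=7)
THIRTY_DAYS = timedelta(days=30)


def expiry_alerts(not_after, now, is_current_key):
    """Return the expiry alert names whose documented condition matches."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        # Assumption: an expired certificate is reported by the
        # key data health status alerts, not by the expiry alerts.
        return []
    prefix = "OcnrfOauthCurrent" if is_current_key else "OcnrfOauthNonCurrent"
    matched = []
    # Documented condition: expiring in less than 1 week.
    if remaining < ONE_WEEK:
        matched.append(prefix + "CertificateExpiringIn1Week")
    # Documented condition: more than 24 hours and less than 30 days.
    if ONE_DAY < remaining < THIRTY_DAYS:
        matched.append(prefix + "CertificateExpiringIn30days")
    return matched


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expiry_alerts(now + timedelta(days=3), now, is_current_key=True))
print(expiry_alerts(now + timedelta(days=20), now, is_current_key=False))
```

Read literally, the two conditions overlap between 24 hours and 1 week, so a certificate in that range matches both alerts in this sketch.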
5.4.2 Overload Control Based on Percentage Discards Feature
This section lists the alerts that are specific to Overload Control Based on Percentage Discards feature. For more information about the feature, see the "Overload Control Based on Percentage Discards" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.2.1 OcnrfMemoryUsageCrossedMinorThreshold
Table 5-74 OcnrfMemoryUsageCrossedMinorThreshold
Field | Details |
---|---|
Description | 'OCNRF Memory Usage for pod <Pod name> has crossed the configured minor threshold (50 %) (value={{ $value }}) of its limit.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Memory Usage of pod exceeded 50% of its limit.' |
Severity | Minor |
Condition | A pod has reached the configured minor threshold (50%) of its memory resource limits. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7030 |
Metric Used | 'container_memory_usage_bytes' and 'container_spec_memory_limit_bytes'
Note: This is a Kubernetes metric used for instance availability monitoring. If the metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions | The alert gets cleared when the memory utilization falls below the minor threshold or crosses the major threshold, in which case the OcnrfMemoryUsageCrossedMajorThreshold alert is raised.
Note: The threshold is configurable in the alerts file. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use the CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Available in OCI | Yes |
5.4.2.2 OcnrfMemoryUsageCrossedMajorThreshold
Table 5-75 OcnrfMemoryUsageCrossedMajorThreshold
Field | Details |
---|---|
Description | 'OCNRF Memory Usage for pod <Pod name> has crossed the major threshold (60%) (value = {{ $value }}) of its limit.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Memory Usage of pod exceeded 60% of its limit.' |
Severity | Major |
Condition | A pod has reached the configured major threshold (60%) of its memory resource limits. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7031 |
Metric Used | 'container_memory_usage_bytes' and 'container_spec_memory_limit_bytes'
Note: This is a Kubernetes metric used for instance availability monitoring. If the metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions | The alert gets cleared when the memory utilization falls below the major threshold or crosses the critical threshold, in which case the OcnrfMemoryUsageCrossedCriticalThreshold alert is raised.
Note: The threshold is configurable in the alerts file. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use the CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Available in OCI | Yes |
5.4.2.3 OcnrfMemoryUsageCrossedCriticalThreshold
Table 5-76 OcnrfMemoryUsageCrossedCriticalThreshold
Field | Details |
---|---|
Description | 'OCNRF Memory Usage for pod <Pod name> has crossed the configured critical threshold (70%) (value = {{ $value }}) of its limit.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Memory Usage of pod exceeded 70% of its limit.' |
Severity | Critical |
Condition | A pod has reached the configured critical threshold (70%) of its memory resource limits. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7032 |
Metric Used | 'container_memory_usage_bytes' and 'container_spec_memory_limit_bytes'
Note: This is a Kubernetes metric used for instance availability monitoring. If the metric is not available, use a similar metric as exposed by the monitoring system. |
Recommended Actions | The alert gets cleared when the memory utilization falls below the critical threshold.
Note: The threshold is configurable in the alerts file. In case the issue persists, capture all the outputs for the above steps and contact My Oracle Support. Note: Use the CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Available in OCI | Yes |
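The relationship between the three memory alerts can be sketched in Python. The 50%, 60%, and 70% values are the defaults quoted in the tables above and are configurable in the alerts file; the function name and the strict "greater than" comparison are assumptions made for illustration, and the actual alert expressions may differ.

```python
# Thresholds in percent of the pod memory limit, highest first.
# Values are the documented defaults; they are configurable.
MEMORY_THRESHOLDS = [
    (70.0, "OcnrfMemoryUsageCrossedCriticalThreshold"),
    (60.0, "OcnrfMemoryUsageCrossedMajorThreshold"),
    (50.0, "OcnrfMemoryUsageCrossedMinorThreshold"),
]


def memory_alert(usage_bytes, limit_bytes):
    """Return the single memory alert that applies, or None.

    usage_bytes corresponds to container_memory_usage_bytes and
    limit_bytes to container_spec_memory_limit_bytes.
    """
    if limit_bytes <= 0:
        # No memory limit is set, so a percentage cannot be computed.
        return None
    percent = 100.0 * usage_bytes / limit_bytes
    for threshold, alert_name in MEMORY_THRESHOLDS:
        if percent > threshold:  # assumption: strict comparison
            return alert_name
    return None


print(memory_alert(650 * 1024**2, 1024**3))
```

Checking the highest threshold first reflects the documented behaviour that a lower-severity alert clears once the next threshold is crossed, so at most one of the three alerts is active for a pod.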
5.4.2.4 OcnrfOverloadThresholdBreachedL1
Table 5-77 OcnrfOverloadThresholdBreachedL1
Field | Details |
---|---|
Description | 'Overload Level of {{$labels.app_kubernetes_io_name}} service is L1' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: Overload Level of {{$labels.app_kubernetes_io_name}} service is L1' |
Severity | Warning |
Condition | An NRF service has breached its configured Level L1 threshold for any of the following metrics: CPU, svc_failure_count, svc_pending_count, and memory. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7059 |
Metric Used | load_level |
Recommended Actions | The alert is cleared when the Ingress Traffic rate falls below the configured L1 threshold. Note: The thresholds can be configured using REST API. |
Available in OCI | Yes |
5.4.2.5 OcnrfOverloadThresholdBreachedL2
Table 5-78 OcnrfOverloadThresholdBreachedL2
Field | Details |
---|---|
Description | 'Overload Level of {{$labels.app_kubernetes_io_name}} service is L2' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: Overload Level of {{$labels.app_kubernetes_io_name}} service is L2' |
Severity | Warning |
Condition | An NRF service has breached its configured Level L2 threshold for any of the following metrics: CPU, svc_failure_count, svc_pending_count, and memory. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7060 |
Metric Used | load_level |
Recommended Actions | The alert is cleared when the Ingress Traffic rate falls below the configured L2 threshold. Note: The thresholds can be configured using REST API. |
Available in OCI | Yes |
5.4.2.6 OcnrfOverloadThresholdBreachedL3
Table 5-79 OcnrfOverloadThresholdBreachedL3
Field | Details |
---|---|
Description | 'Overload Level of {{$labels.app_kubernetes_io_name}} service is L3' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: Overload Level of {{$labels.app_kubernetes_io_name}} service is L3' |
Severity | Warning |
Condition | An NRF service has breached its configured Level L3 threshold for any of the following metrics: CPU, svc_failure_count, svc_pending_count, and memory. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7061 |
Metric Used | load_level |
Recommended Actions | The alert is cleared when the Ingress Traffic rate falls below the configured L3 threshold. Note: The thresholds can be configured using REST API. |
Available in OCI | Yes |
5.4.2.7 OcnrfOverloadThresholdBreachedL4
Table 5-80 OcnrfOverloadThresholdBreachedL4
Field | Details |
---|---|
Description | 'Overload Level of {{$labels.app_kubernetes_io_name}} service is L4' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}: Overload Level of {{$labels.app_kubernetes_io_name}} service is L4' |
Severity | Warning |
Condition | An NRF service has breached its configured Level L4 threshold for any of the following metrics: CPU, svc_failure_count, svc_pending_count, and memory. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7062 |
Metric Used | load_level |
Recommended Actions | The alert is cleared when the Ingress Traffic rate falls below the configured L4 threshold. Note: The thresholds can be configured using REST API. |
Available in OCI | Yes |
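The way a service's overload level follows from per-metric thresholds can be sketched as below. This is a simplified illustration, not NRF's implementation: every threshold value is hypothetical, the metric keys merely mirror the names listed in the Condition rows, and the sketch assumes thresholds increase from L1 to L4.

```python
LEVELS = ["L1", "L2", "L3", "L4"]

# Hypothetical thresholds; real values are configured through the REST API.
OVERLOAD_THRESHOLDS = {
    "cpu": {"L1": 60, "L2": 70, "L3": 80, "L4": 90},
    "memory": {"L1": 60, "L2": 70, "L3": 80, "L4": 90},
    "svc_pending_count": {"L1": 100, "L2": 200, "L3": 300, "L4": 400},
    "svc_failure_count": {"L1": 50, "L2": 100, "L3": 150, "L4": 200},
}


def overload_level(samples, thresholds=OVERLOAD_THRESHOLDS):
    """Return the highest level breached by any metric, or None."""
    highest = None
    for level in LEVELS:
        # A level counts as breached when any single metric reaches it.
        if any(samples.get(metric, 0) >= limits[level]
               for metric, limits in thresholds.items()):
            highest = level
    return highest


def overload_alert(level):
    """Map an overload level to the corresponding alert name."""
    return None if level is None else f"OcnrfOverloadThresholdBreached{level}"


samples = {"cpu": 75, "memory": 40, "svc_pending_count": 10, "svc_failure_count": 0}
print(overload_alert(overload_level(samples)))
```

In this example only CPU is above any threshold, yet the service as a whole is reported at L2, which matches the "any of the metrics" wording of the alert conditions.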
5.4.3 DNS NAPTR Update Feature
This section lists the alerts that are specific to DNS NAPTR Update feature. For more information about the feature, see the "DNS NAPTR Update" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.3.1 OcnrfDnsNaptrFailureResponseStatus
Table 5-81 OcnrfDnsNaptrFailureResponseStatus
Field | Details |
---|---|
Description | OCNRF DNS NAPTR Response status is not healthy |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, NfInstanceId: {{$labels.NfInstanceId}}, NfSetFqdn: {{$labels.NfSetFqdn}}, Replacement: {{$labels.Replacement}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Dns Naptr Response status is not healthy.' |
Severity | Major |
Condition | The DNS NAPTR response towards DNS Server is not successful. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7063 |
Metric Used | ocnrf_dns_naptr_failure_rx_response |
Recommended Actions | This alert is cleared when the DNS NAPTR response is successful, either automatically through service operations or through a manual trigger of the update and delete NAPTR requests. |
5.4.3.2 OcnrfAlternateRouteUpstreamDnsRetryExhausted
Table 5-82 OcnrfAlternateRouteUpstreamDnsRetryExhausted
Field | Details |
---|---|
Description | OCNRF alternate route upstream DNS retry exhausted |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, FQDNS_Name: {{$labels.FQDNS_Name}}, Replacement_Name: {{$labels.Replacement_Name}},timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF alternate route upstream dns retry exhausted' |
Severity | Major |
Condition | The DNS NAPTR retry is exhausted. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7064 |
Metric Used | oc_alternate_route_upstream_dns_retry_exhausted |
Recommended Actions | This alert is cleared automatically in 2 minutes. |
Available in OCI | No |
5.4.4 Notification Retry Feature
This section lists the alerts that are specific to Notification Retry feature. For more information about the feature, see the "Notification Retry" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.4.1 OcnrfNotificationRetryExhausted
Table 5-83 OcnrfNotificationRetryExhausted
Field | Details |
---|---|
Description | 'OCNRF NotificationRetry Exhausted' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, SubscriptionId: {{$labels.SubscriptionId}}, NotificationHostPort: {{$labels.NotificationHostPort}}' |
Severity | Major |
Condition | This alarm is raised when the number of retries is exhausted. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7065 |
Metric Used | ocnrf_nfStatusNotify_rx_responses_total |
Recommended Actions | The alert is cleared automatically after 5 minutes.
Steps: Check logs in NF management pod to check the reason for retry query failures. Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Available in OCI | Yes |
5.4.4.2 OcnrfNotificationFailureOtherThanRetryExhausted
Table 5-84 OcnrfNotificationFailureOtherThanRetryExhausted
Field | Details |
---|---|
Description | 'OCNRF notification failure other than retry exhausted' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, SubscriptionId: {{$labels.SubscriptionId}}, NotificationHostPort: {{$labels.NotificationHostPort}}, NumberOfRetriesAttempted: {{$labels.NumberOfRetriesAttempted}}' |
Severity | Major |
Condition | This alarm is raised when a notification failure occurs for a reason other than retry count exhausted. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7066 |
Metric Used | ocnrf_nfStatusNotify_rx_responses_total |
Recommended Actions | The alert is cleared automatically after 5 minutes.
Steps: Check logs in NF management pod to check the reason for retry query failures. Note: Use CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Available in OCI | Yes |
5.4.5 NRF Message Feed Feature
This section lists the alerts that are specific to NRF Message Feed feature. For more information about the feature, see the "NRF Message Feed" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.5.1 OcnrfIngressGatewayDDUnreachable
Table 5-85 OcnrfIngressGatewayDDUnreachable
Field | Details |
---|---|
Description | OCNRF Ingress Gateway Data Director unreachable |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Ingress Gateway Data Director unreachable' |
Severity | Major |
Condition | This alarm is raised when data director is not reachable from Ingress Gateway. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7067 |
Metric Used | oc_ingressgateway_dd_unreachable |
Recommended Actions | The alert is cleared automatically when the connection with the data director is established. |
Available in OCI | No |
5.4.5.2 OcnrfEgressGatewayDDUnreachable
Table 5-86 OcnrfEgressGatewayDDUnreachable
Field | Details |
---|---|
Description | OCNRF Egress Gateway Data Director unreachable |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} OCNRF Egress Gateway Data Director unreachable' |
Severity | Major |
Condition | This alarm is raised when data director is not reachable from Egress Gateway. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7068 |
Metric Used | oc_egressgateway_dd_unreachable |
Recommended Actions | The alert is cleared automatically when the connection with the data director is established. |
Available in OCI | No |
5.4.6 Subscription Limit Feature
This section lists the alerts that are specific to Subscription Limit feature. For more information about the feature, see the "Subscription Limit" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.6.1 OcnrfSubscriptionGlobalCountWarnThresholdBreached
Table 5-87 OcnrfSubscriptionGlobalCountWarnThresholdBreached
Field | Details |
---|---|
Description | The total number of subscriptions has breached the configured WARN level threshold. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}: The total number of subscriptions has breached the configured WARN level threshold' |
Severity | Warning |
Condition | This alarm is raised when the total number of subscriptions has breached the configured WARN level threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7069 |
Metric Used | ocnrf_nfset_limit_level |
Recommended Actions | The alert is cleared automatically when the count comes down due to unsubscription. Note: The thresholds can be configured using REST API. |
Available in OCI | Yes |
5.4.6.2 OcnrfSubscriptionGlobalCountMinorThresholdBreached
Table 5-88 OcnrfSubscriptionGlobalCountMinorThresholdBreached
Field | Details |
---|---|
Description | The total number of subscriptions has breached the configured MINOR level threshold |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}: The total number of subscriptions has breached the configured MINOR level threshold' |
Severity | Minor |
Condition | This alarm is raised when the total number of subscriptions has breached the configured MINOR level threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7070 |
Metric Used | ocnrf_nfset_limit_level |
Recommended Actions | The alert is cleared automatically when the count comes down due to unsubscription. Note: The thresholds can be configured using REST API. |
Available in OCI | Yes |
5.4.6.3 OcnrfSubscriptionGlobalCountMajorThresholdBreached
Table 5-89 OcnrfSubscriptionGlobalCountMajorThresholdBreached
Field | Details |
---|---|
Description | The total number of subscriptions has breached the configured MAJOR level threshold |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}: The total number of subscriptions has breached the configured MAJOR level threshold' |
Severity | Major |
Condition | This alarm is raised when the total number of subscriptions has breached the configured MAJOR level threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7071 |
Metric Used | ocnrf_nfset_limit_level |
Recommended Actions | The alert is cleared automatically when the count comes down due to unsubscription. Note: The thresholds can be configured using REST API. |
Available in OCI | Yes |
5.4.6.4 OcnrfSubscriptionGlobalCountCriticalThresholdBreached
Table 5-90 OcnrfSubscriptionGlobalCountCriticalThresholdBreached
Field | Details |
---|---|
Description | The total number of subscriptions has breached the configured CRITICAL level threshold |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}: The total number of subscriptions has breached the configured CRITICAL level threshold' |
Severity | Critical |
Condition | This alarm is raised when the total number of subscriptions has breached the configured CRITICAL level threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7072 |
Metric Used | ocnrf_nfset_limit_level |
Recommended Actions | The alert is cleared automatically when the count comes down due to unsubscription. Note: The thresholds can be configured using REST API. |
Available in OCI | Yes |
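The four global subscription count alerts form an ordered ladder of thresholds. A minimal Python sketch of that ladder follows; the threshold numbers are purely hypothetical (the real ones are configured through the REST API), and treating a count equal to a threshold as a breach is an assumption.

```python
from bisect import bisect_right

# Hypothetical WARN, MINOR, MAJOR, and CRITICAL thresholds, in ascending order.
SUBSCRIPTION_LIMITS = [800, 850, 900, 950]
LEVEL_NAMES = ["Warn", "Minor", "Major", "Critical"]


def subscription_alert(total_subscriptions):
    """Return the alert for the highest threshold reached, or None."""
    # bisect_right counts how many thresholds are <= the current total.
    reached = bisect_right(SUBSCRIPTION_LIMITS, total_subscriptions)
    if reached == 0:
        return None
    return f"OcnrfSubscriptionGlobalCount{LEVEL_NAMES[reached - 1]}ThresholdBreached"


print(subscription_alert(875))
```

As subscriptions are removed and the total drops below a threshold, the same lookup yields the next lower alert or none at all, which mirrors the automatic clearing described in the tables above.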
5.4.6.5 OcnrfSubscriptionMigrationInProgressWarn
Table 5-91 OcnrfSubscriptionMigrationInProgressWarn
Field | Details |
---|---|
Description | The subscription migration is pending and subscriptionLimit feature is disabled |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, subscriptionLimitFeatureStatus:{{$labels.subscriptionLimitFeatureStatus}}: The subscription migration is pending and subscriptionLimit feature is disabled' |
Severity | Warning |
Condition | The subscription migration is pending and subscriptionLimit feature is disabled. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7073 |
Metric Used | ocnrf_subscription_migration_status |
Recommended Actions | This alert is cleared automatically when the migration is complete. |
5.4.6.6 OcnrfSubscriptionMigrationInProgressCritical
Table 5-92 OcnrfSubscriptionMigrationInProgressCritical
Field | Details |
---|---|
Description | The subscription migration is pending and subscriptionLimit feature is enabled |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, subscriptionLimitFeatureStatus:{{$labels.subscriptionLimitFeatureStatus}}: The subscription migration is pending and subscriptionLimit feature is enabled' |
Severity | Warning |
Condition | The subscription migration is pending and subscriptionLimit feature is enabled. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7074 |
Metric Used | ocnrf_subscription_migration_status |
Recommended Actions | This alert is cleared automatically when the migration is complete. Steps: Disable the Subscription Limit feature. For more information, see Oracle Communications Cloud Native Core, Network Repository Function REST Specification Guide. |
Available in OCI | No |
5.4.7 Pod Protection Support for NRF Subscription Microservice Feature
This section lists the alerts that are specific to Pod Protection Support for NRF Subscription Microservice feature. For more information about the feature, see the "Pod Protection Support for NRF Subscription Microservice" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.7.1 OcnrfPodInDangerOfCongestionState
Table 5-93 OcnrfPodInDangerOfCongestionState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} of service {{$labels.app_kubernetes_io_name}} is in Danger of Congestion state' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Danger of Congestion state' |
Severity | Major |
Condition | A pod of a service is in Danger of Congestion state. This could be due to CPU Usage or Pending Message Count above configured thresholds.
This alert is raised when the Pod Protection feature is enabled for the nfSubscription service. Currently this is applicable for the NfSubscription service only. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7079 |
Metric Used | ocnrf_pod_congestion_state |
Recommended Actions | The alert is cleared when the CPU usage or Pending Message Count goes below the configured thresholds for the Danger of Congestion state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.7.2 OcnrfPodPendingMessageCountInDangerOfCongestionState
Table 5-94 OcnrfPodPendingMessageCountInDangerOfCongestionState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} of service {{$labels.app_kubernetes_io_name}} is in Danger of Congestion state due to Pending Message Count above threshold' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Danger of Congestion state due to Pending Message Count above threshold' |
Severity | Major |
Condition | A pod of a service is in Danger of Congestion state due to its Pending Message Count above the configured thresholds. Currently this is applicable for the NfSubscription service only. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7081 |
Metric Used | ocnrf_pod_pending_message_count_congestion_state |
Recommended Actions | The alert is cleared when the pending message count goes below the configured thresholds for the Danger of Congestion state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.7.3 OcnrfPodInCongestedState
Table 5-95 OcnrfPodInCongestedState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} of service {{$labels.app_kubernetes_io_name}} is in Congested state' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Congested state' |
Severity | Major |
Condition | One or more pods of a service are in congested state. This could be due to CPU usage or Pending Message Count above configured thresholds. Currently this is applicable for NfSubscription service only. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7082 |
Metric Used | ocnrf_pod_congested_state |
Recommended Actions | The alert is cleared when the CPU usage or Pending Message Count goes below the configured thresholds for the Congested state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.7.4 OcnrfPodCpuUsageInCongestedState
Table 5-96 OcnrfPodCpuUsageInCongestedState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} of service {{$labels.app_kubernetes_io_name}} is in Congested state due to CPU usage above threshold' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Congested state due to CPU usage above threshold' |
Severity | Major |
Condition | A pod of a service is in Congested state due to its CPU Usage above configured thresholds. Currently this is applicable for NfSubscription service only. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7083 |
Metric Used | ocnrf_pod_cpu_congestion_state |
Recommended Actions | The alert is cleared when the CPU usage goes below the configured thresholds for the Congested state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.7.5 OcnrfPodCpuUsageInDangerOfCongestionState
Table 5-97 OcnrfPodCpuUsageInDangerOfCongestionState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} of service {{$labels.app_kubernetes_io_name}} is in Danger of Congestion state due to CPU usage above threshold' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Danger of Congestion state due to CPU usage above threshold' |
Severity | Major |
Condition | A pod of a service is in Danger of Congestion state due to its CPU usage above the configured thresholds. This alert is raised when the Pod Protection feature is enabled for the nfSubscription service. Currently this is applicable for the NfSubscription service only. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7080 |
Metric Used | ocnrf_pod_cpu_congestion_state |
Recommended Actions | The alert is cleared when the CPU usage goes below the configured thresholds for the Danger of Congestion state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.7.6 OcnrfPodPendingMessageCountInCongestedState
Table 5-98 OcnrfPodPendingMessageCountInCongestedState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} of service {{$labels.app_kubernetes_io_name}} is in Congested state due to Pending Message Count above threshold' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Congested state due to Pending Message Count above threshold' |
Severity | Major |
Condition | A pod of a service is in Congested state due to its Pending Message Count above configured thresholds. Currently this is applicable for NfSubscription service only. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7084 |
Metric Used | ocnrf_pod_pending_message_count_congestion_state |
Recommended Actions | The alert is cleared when the pending message count goes below the configured thresholds for the Congested state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
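The relationship between the six Pod Protection alerts above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold values, the `CongestionState` enum, and the rule that the pod state is the worse of the two per-resource states are hypothetical and are not taken from the NRF implementation. The real thresholds are deployment-specific and can be viewed using the REST API.

```python
from enum import IntEnum


class CongestionState(IntEnum):
    """Hypothetical ordering of pod states, from least to most severe."""
    NORMAL = 0
    DANGER_OF_CONGESTION = 1
    CONGESTED = 2


# Illustrative thresholds only; actual values are configured per deployment.
CPU_THRESHOLDS = {"doc": 65.0, "congested": 75.0}       # CPU usage in percent
PENDING_THRESHOLDS = {"doc": 1500, "congested": 2000}   # pending message count


def resource_state(value, thresholds):
    """Classify one resource reading against its two thresholds."""
    if value > thresholds["congested"]:
        return CongestionState.CONGESTED
    if value > thresholds["doc"]:
        return CongestionState.DANGER_OF_CONGESTION
    return CongestionState.NORMAL


def pod_state(cpu_percent, pending_messages):
    """Assume the pod state is the worse of the CPU and pending-message states."""
    return max(
        resource_state(cpu_percent, CPU_THRESHOLDS),
        resource_state(pending_messages, PENDING_THRESHOLDS),
    )
```

Under these assumptions, a pod whose CPU usage is in the Danger of Congestion range while its pending message count is in the Congested range would be reported as Congested overall, with the per-resource alerts identifying which resource caused it.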
5.4.8 Controlled Shutdown of NRF Feature
This section lists the alerts that are specific to Controlled Shutdown of NRF feature. For more information about the feature, see the "Controlled Shutdown of NRF" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.8.1 OcnrfOperationalStateCompleteShutdown
Table 5-99 OcnrfOperationalStateCompleteShutdown
Field | Details |
---|---|
Description | 'The operational state of NRF is Complete Shutdown.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The Operational state of NRF is Complete Shutdown' |
Severity | Warning |
Condition | The operator has changed the operational state of NRF to Complete Shutdown. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7085 |
Metric Used | ocnrf_operational_state |
Recommended Actions | The alert is cleared when the user changes the operational state to NORMAL. |
Available in OCI | No |
5.4.8.2 OcnrfAuditOperationsPaused
Table 5-100 OcnrfAuditOperationsPaused
Field | Details |
---|---|
Description | 'The Audit procedures at NRF have been paused.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The Audit procedures at NRF has been paused' |
Severity | Warning |
Condition | The NrfAuditor microservice has paused all audit procedures. This occurs during any of the following scenarios: |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7086 |
Metric Used | ocnrf_audit_status |
Recommended Actions | The alert is expected to clear automatically after the waiting period, once all the above conditions are resolved. |
Notes | NrfAuditor continues to remain in the paused state for some time, even after |
Available in OCI | No |
5.4.9 Monitoring the Availability of SCP Using SCP Health APIs Feature
This section lists the alerts that are specific to Monitoring the Availability of SCP Using SCP Health APIs feature. For more information about the feature, see the "Monitoring the Availability of SCP Using SCP Health APIs" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.9.1 OcnrfAllSCPsMarkedAsUnavailable
Table 5-101 OcnrfAllSCPsMarkedAsUnavailable
Field | Details |
---|---|
Description | 'All SCPs have been marked unavailable.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : All SCPs have been marked as unavailable' |
Severity | Critical |
Condition | All SCPs have been marked unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7088 |
Metric Used | 'oc_egressgateway_peer_count and oc_egressgateway_peer_available_count' |
Recommended Actions | NF clears the critical alarm when at least one SCP peer in a peerset becomes available, even if all other SCP peers in that peerset are still unavailable. |
Available in OCI | Yes |
5.4.9.2 OcnrfSCPMarkedAsUnavailable
Table 5-102 OcnrfSCPMarkedAsUnavailable
Field | Details |
---|---|
Description | 'An SCP has been marked unavailable.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : One of the SCP has been marked unavailable' |
Severity | Major |
Condition | One of the SCPs has been marked unhealthy. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7087 |
Metric Used | oc_egressgateway_peer_health_status |
Recommended Actions | This alert gets cleared when unavailable SCPs become available. |
Available in OCI | Yes |
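The difference between the two SCP availability alerts above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the `peer_health` mapping, the peer names, and the treatment of all peers as a single peer set are hypothetical, and the actual alert expressions built on the Egress Gateway metrics may differ.

```python
def scp_alerts(peer_health):
    """Return the names of the SCP alerts that would be active.

    peer_health is a hypothetical mapping of SCP peer name to True
    (available) or False (unavailable), treated as one peer set.
    """
    if not peer_health:
        return []  # no SCP peers configured, nothing to report

    unavailable = [name for name, ok in peer_health.items() if not ok]
    alerts = []
    if unavailable:
        # At least one SCP is unavailable: the Major alert applies.
        alerts.append("OcnrfSCPMarkedAsUnavailable")
    if len(unavailable) == len(peer_health):
        # Every SCP is unavailable: the Critical alert applies as well.
        alerts.append("OcnrfAllSCPsMarkedAsUnavailable")
    return alerts
```

In this model, recovering a single peer is enough to clear the Critical alert, while the Major alert stays active until no unavailable peers remain.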
5.4.10 CCA Header Validation in NRF for Access Token Service Operation Feature
This section lists the alerts that are specific to CCA Header Validation in NRF for Access Token Service Operation feature. For more information about the feature, see the "CCA Header Validation in NRF for Access Token Service Operation" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.10.1 OcnrfCcaRootCertificateExpiringIn4Hours
Table 5-103 OcnrfCcaRootCertificateExpiringIn4Hours
Field | Details |
---|---|
Description | 'The CCA Root Certificates expiring in 4 hours'. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : CCA Root Certificate is expiring in 4 Hours' |
Severity | Critical |
Condition | Indicates the expiry dates of the CCA Root certificates that are expiring in four hours. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7091 |
Metric Used | 'oc_ingressgateway_cca_certificate_info' |
Recommended Actions | The alert is cleared when the expiring CCA root certificates are replaced with new ones. Steps: Replace the expiring certificate key pairs with new ones. For more information on creating certificate key pairs, see Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
5.4.10.2 OcnrfCcaRootCertificateExpiringIn1Day
Table 5-104 OcnrfCcaRootCertificateExpiringIn1Day
Field | Details |
---|---|
Description | 'The CCA Root Certificates expiring in 1 day'. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : CCA Root Certificate is expiring in 1 Day' |
Severity | Major |
Condition | Indicates the expiry dates of the CCA Root certificates that are expiring in one day. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7090 |
Metric Used | 'oc_ingressgateway_cca_certificate_info' |
Recommended Actions | The alert is cleared when the expiring CCA root certificates are replaced with new ones. Steps: Replace the expiring certificate key pairs with new ones. For more information on creating certificate key pairs, see Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
5.4.10.3 OcnrfCcaRootCertificateExpiringIn5Days
Table 5-105 OcnrfCcaRootCertificateExpiringIn5Days
Field | Details |
---|---|
Description | 'The CCA Root Certificates expiring in 5 days.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : CCA Root Certificate is expiring in 5 Days' |
Severity | Minor |
Condition | Indicates the expiry dates of the CCA Root certificates that are expiring in five days. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7089 |
Metric Used | 'oc_ingressgateway_cca_certificate_info' |
Recommended Actions | The alert is cleared when the expiring CCA root certificates are replaced with new ones. Steps: Replace the expiring certificate key pairs with new ones. For more information on creating certificate key pairs, see Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
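The three CCA root certificate alerts above escalate as the expiry time approaches. The following Python sketch shows one way to map the remaining validity of a certificate to the corresponding alert. The function name and the choice to report only the most severe matching alert are assumptions made for illustration; they do not describe how the alert rules are actually evaluated.

```python
from datetime import datetime, timedelta, timezone

# Windows taken from the alert names, ordered from most to least severe.
EXPIRY_WINDOWS = [
    (timedelta(hours=4), "Critical", "OcnrfCcaRootCertificateExpiringIn4Hours"),
    (timedelta(days=1), "Major", "OcnrfCcaRootCertificateExpiringIn1Day"),
    (timedelta(days=5), "Minor", "OcnrfCcaRootCertificateExpiringIn5Days"),
]


def cca_expiry_alert(not_after, now):
    """Return (severity, alert name) for a certificate, or None if none applies.

    Assumption: only the most severe matching alert is reported. A
    certificate that has already expired falls into the Critical window.
    """
    remaining = not_after - now
    for window, severity, alert in EXPIRY_WINDOWS:
        if remaining <= window:
            return severity, alert
    return None


# Example usage with a fixed reference time.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(cca_expiry_alert(now + timedelta(hours=12), now))
```

A check of this kind can be run against the certificate inventory ahead of a maintenance window to decide which key pairs to replace first.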
5.4.11 NRF Georedundancy Feature
This section lists the alerts that are specific to NRF Georedundancy feature. For more information about the feature, see the "NRF Georedundancy" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.11.1 OcnrfDbReplicationStatusInactive
Table 5-106 OcnrfDbReplicationStatusInactive
Field | Details |
---|---|
Description | 'The Database Replication Status is currently INACTIVE.' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nftype:{{$labels.NfType}}, nrflevel:{{$labels.NrfLevel}}, remoteNrfInstanceId: {{$labels.nrfInstanceId}}, remoteSiteName: {{$labels.siteName}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: The database replication status is INACTIVE.' |
Severity | Critical |
Condition | The database replication channel status between the given site and the georedundant site(s) is inactive. The alert is raised per replication channel. The alarm is raised or cleared only if the georedundancy feature is enabled. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7013 |
Metric Used | 'ocnrf_dbreplication_status' |
Recommended Actions | The alert is cleared when the database channel replication status between the given site and the georedundant site(s) is up. For more information on how to check the database replication status, see Oracle Communications Cloud Native Core, cnDBTier User Guide. |
Notes | The alarm is included only if the georedundancy feature is enabled. |
Available in OCI | No |
5.4.11.2 OcnrfReplicationStatusMonitoringInactive
Table 5-107 OcnrfReplicationStatusMonitoringInactive
Field | Details |
---|---|
Description | 'OCNRF Replication Status Monitoring Inactive' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Pod {{ $labels.kubernetes_pod_name}} are not monitoring the replication status' |
Severity | Critical |
Condition | This alarm is raised when one or more pods are not monitoring the replication status. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7078 |
Metric Used | ocnrf_replication_status_monitoring_inactive |
Recommended Actions | Resolution Steps:
|
Available in OCI | No |
5.4.12 XFCC Header Validation Feature
This section lists the alert that is specific to XFCC Header Validation feature. For more information about the feature, see the "XFCC Header Validation" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.12.1 OcnrfNfAuthenticationFailureRequestsRejected
Table 5-108 OcnrfNfAuthenticationFailureRequestsRejected
Field | Details |
---|---|
Description | 'Service request(s) received from NF have been rejected by OCNRF (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Request rejected for Nf FQDN based Authentication failure.' |
Severity | Warning |
Condition | NRF rejected a service request due to NF authentication failure |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7015 |
Metric Used | 'ocnrf_nf_authentication_failure_total' |
Recommended Actions | The alert is cleared automatically.
Steps: Filter out nfAccessToken application ERROR logs on Kibana for more details. |
Available in OCI | No |
5.4.13 Enhanced NRF Set Based Deployment (NRF Growth) Feature
This section lists the alerts that are specific to Enhanced NRF Set Based Deployment (NRF Growth) feature. For more information about the feature, see the "Enhanced NRF Set Based Deployment (NRF Growth)" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.13.1 OcnrfRemoteSetNrfSyncFailed
Table 5-109 OcnrfRemoteSetNrfSyncFailed
Field | Details |
---|---|
Description | 'A sync request to the NRF in the remote set has failed.' Note: The alert must be configured only if the NRF Growth feature is enabled. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : A sync request to the NRF in the remote set has failed.' |
Severity | Minor |
Condition | Sync request to the NRF in the remote NRF set has failed. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7098 |
Metric Used | ocnrf_query_remote_cds_responses_total |
Recommended Actions | The alert is cleared when the synchronization with the remote NRF set is successful. Steps: |
5.4.13.2 OcnrfSyncFailureFromAllNrfsOfAnyRemoteSet
Table 5-110 OcnrfSyncFailureFromAllNrfsOfAnyRemoteSet
Field | Details |
---|---|
Description | 'Sync requests to all the NRFs of a remote set has failed.' Note: The alert must be configured only if the NRF Growth feature is enabled. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Sync requests to all the NRFs in any of the remote sets have failed' |
Severity | Major |
Condition | Sync requests to all the NRFs in any of the remote sets have failed. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7099 |
Metric Used | ocnrf_remote_set_unavailable_total |
Recommended Actions | The alert is cleared when synchronization is successful with at least one NRF of the remote NRF set. Steps: |
Available in OCI | No |
5.4.13.3 OcnrfSyncFailureFromAllNrfsOfAllRemoteSets
Table 5-111 OcnrfSyncFailureFromAllNrfsOfAllRemoteSets
Field | Details |
---|---|
Description | 'Sync request to all the NRFs in all the remote sets have failed.' Note: The alert must be configured only if the NRF Growth feature is enabled. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Sync request to all the NRFs in all the remote sets have failed' |
Severity | Critical |
Condition | Sync requests to all the NRFs in all the remote sets have failed. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7100 |
Metric Used | ocnrf_all_remote_sets_unavailable_total |
Recommended Actions | The alert is cleared when synchronization is successful with at least one NRF of the remote set(s). Steps: |
Available in OCI | No |
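The three remote set synchronization alerts above differ only in the scope of the failure. The Python sketch below makes that scoping explicit. The input structure (`sync_ok_by_set`, mapping a remote set name to one boolean per NRF in that set) and the set names are hypothetical, and the sketch assumes the alerts are not mutually exclusive.

```python
def remote_sync_alerts(sync_ok_by_set):
    """Return the sync-failure alerts that would apply.

    sync_ok_by_set is a hypothetical mapping of remote set name to a list
    of booleans, one per NRF in that set (True means the sync succeeded).
    """
    # Ignore remote sets that have no NRFs listed.
    results = [nrfs for nrfs in sync_ok_by_set.values() if nrfs]
    alerts = []
    if any(not ok for nrfs in results for ok in nrfs):
        # At least one NRF in some remote set failed to sync (Minor).
        alerts.append("OcnrfRemoteSetNrfSyncFailed")
    if any(not any(nrfs) for nrfs in results):
        # Every NRF of at least one remote set failed (Major).
        alerts.append("OcnrfSyncFailureFromAllNrfsOfAnyRemoteSet")
    if results and all(not any(nrfs) for nrfs in results):
        # Every NRF of every remote set failed (Critical).
        alerts.append("OcnrfSyncFailureFromAllNrfsOfAllRemoteSets")
    return alerts
```

For example, with two remote sets where one set is fully unreachable and the other is healthy, this model reports the Minor and Major alerts but not the Critical one.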
5.4.13.4 OcnrfCacheDataServiceDown
Table 5-112 OcnrfCacheDataServiceDown
Field | Details |
---|---|
Description | 'OCNRF NrfCacheData service {{$labels.app_kubernetes_io_name}} is down' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : Cache Data Service is down' |
Severity | Critical |
Condition | Cache Data Service is unavailable. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7101 |
Metric Used | up |
Recommended Actions | The alert is cleared when the Cache Data Service (CDS) is available. Steps: Note: Use the CNC NF Data Collector tool for capturing logs. For more information on how to collect logs using the Data Collector tool, see Oracle Communications Cloud Native Core, Network Function Data Collector User Guide. |
Available in OCI | No |
5.4.13.5 OcnrfDatabaseFallbackUsed
Table 5-113 OcnrfDatabaseFallbackUsed
Field | Details |
---|---|
Description | 'A service operation is unable to get data from the Cache Data Service, and hence gets the data from the cnDBTier to fulfill the service operation' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : A service Operation is unable to get data from the Cache Data Service, so falling back to DB' |
Severity | Major |
Condition | When a service operation is unable to get data from the Cache Data Service, and hence gets the data from the database to fulfill the service operation. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7102 |
Metric Used | ocnrf_db_fallback_total |
Recommended Actions | The alert is cleared automatically. Steps: |
Available in OCI | No |
5.4.13.6 OcnrfTotalNFsRegisteredAtSegmentBelowMinorThreshold
Table 5-114 OcnrfTotalNFsRegisteredAtSegmentBelowMinorThreshold
Field | Details |
---|---|
Description | The alert is raised when the number of NFs registered at the segment is below the configured minor threshold. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The number of NFs registered at the segment is below minor threshold' |
Severity | Minor |
Condition | The number of NFs registered at the segment is below the minor threshold. Note: This alert is triggered when the registered NF count is greater than or equal to 10 and below 20. This default value can be modified in the |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7103 |
Metric Used | ocnrf_nf_registered_count |
Recommended Actions | The alert is cleared when the number of registered NFs in the segment is above the minor threshold. Steps: |
Available in OCI | No |
5.4.13.7 OcnrfTotalNFsRegisteredAtSegmentBelowMajorThreshold
Table 5-115 OcnrfTotalNFsRegisteredAtSegmentBelowMajorThreshold
Field | Details |
---|---|
Description | The alert is raised when the number of NFs registered at the segment is below the configured major threshold. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The number of NFs registered at the segment is below major threshold' |
Severity | Major |
Condition | The number of NFs registered at the segment is below the major threshold. Note: This alert is triggered when the registered NF count is greater than or equal to 2 and below 10. This default value can be modified in the |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7104 |
Metric Used | ocnrf_nf_registered_count |
Recommended Actions | The alert is cleared when the number of registered NFs in the segment is above the major threshold. Steps: |
Available in OCI | No |
5.4.13.8 OcnrfTotalNFsRegisteredAtSegmentBelowCriticalThreshold
Table 5-116 OcnrfTotalNFsRegisteredAtSegmentBelowCriticalThreshold
Field | Details |
---|---|
Description | The alert is raised when the number of NFs registered at the segment is below the configured critical threshold. |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The number of NFs registered at the segment is below critical threshold' |
Severity | Critical |
Condition | The number of NFs registered at the segment is below the critical threshold. Note: This alert is triggered when the registered NF count is below 2. This default value can be modified in the |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7105 |
Metric Used | ocnrf_nf_registered_count |
Recommended Actions | The alert is cleared when the number of registered NFs in the segment is above the critical threshold. Steps: |
Available in OCI | No |
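The default ranges quoted in the three segment-level alerts above (below 2, 2 up to 10, and 10 up to 20) partition the registered NF count into severity bands. A minimal Python sketch of that mapping follows. The function and parameter names are hypothetical. Only the default boundary values come from the tables above, and they can be changed in the deployment.

```python
def registered_nf_severity(count, critical=2, major=10, minor=20):
    """Map the number of NFs registered at the segment to an alert severity.

    Default boundaries follow the documented defaults:
      count < 2        -> Critical
      2 <= count < 10  -> Major
      10 <= count < 20 -> Minor
    Returns None when the count is at or above the minor threshold.
    """
    if count < critical:
        return "Critical"
    if count < major:
        return "Major"
    if count < minor:
        return "Minor"
    return None


# Example usage with the default boundaries.
for n in (0, 5, 15, 25):
    print(n, registered_nf_severity(n))
```

Because the bands do not overlap, at most one of the three alerts applies for a given registered NF count under the default values.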
5.4.14 Ingress Gateway Pod Protection Feature
This section lists the alerts that are specific to Ingress Gateway Pod Protection feature. For more information about the feature, see the "Ingress Gateway Pod Protection" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.14.1 OcnrfIngressGatewayPodInDangerOfCongestionState
Table 5-117 OcnrfIngressGatewayPodInDangerOfCongestionState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} is in Danger of Congestion state' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Danger of Congestion state' |
Severity | Major |
Condition | The Ingress Gateway pod is in Danger of Congestion state. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7092 |
Metric Used | oc_ingressgateway_pod_congestion_state |
Recommended Actions | The alert is cleared when the pod is out of Danger of Congestion (DoC) state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.14.2 OcnrfIngressGatewayPodInCongestedState
Table 5-118 OcnrfIngressGatewayPodInCongestedState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} is in Congested state' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Congested state' |
Severity | Critical |
Condition | The Ingress Gateway pod is in Congested state. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7093 |
Metric Used | oc_ingressgateway_pod_congestion_state |
Recommended Actions | The alert is cleared when the pod is out of Congested state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.14.3 OcnrfIngressGatewayPodCpuUsageInCongestedState
Table 5-119 OcnrfIngressGatewayPodCpuUsageInCongestedState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} is in Congested state' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Congested state' |
Severity | Critical |
Condition | The Ingress Gateway pod is in Congested state due to CPU consumption above the configured thresholds. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7094 |
Metric Used | oc_ingressgateway_pod_resource_state |
Recommended Actions | The alert is cleared when the CPU consumption goes below the configured thresholds for the Congested state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.14.4 OcnrfIngressGatewayPodCpuUsageInDangerOfCongestionState
Table 5-120 OcnrfIngressGatewayPodCpuUsageInDangerOfCongestionState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} is in Danger of Congestion state due to CPU usage above threshold' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Danger of Congestion state due to CPU usage above threshold' |
Severity | Major |
Condition | The Ingress Gateway pod is in Danger of Congestion state due to CPU consumption above the configured thresholds. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7095 |
Metric Used | oc_ingressgateway_pod_resource_state |
Recommended Actions | The alert is cleared when the CPU consumption goes below the configured thresholds for the Danger of Congestion state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.14.5 OcnrfIngressGatewayPodPendingMessageInCongestedState
Table 5-121 OcnrfIngressGatewayPodPendingMessageInCongestedState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} is in Congested state' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Congested state' |
Severity | Critical |
Condition | The Ingress Gateway pod is in Congested state due to pending message count above the configured thresholds. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7096 |
Metric Used | oc_ingressgateway_pod_resource_state |
Recommended Actions | The alert is cleared when the pending message count goes below the configured thresholds for the Congested state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.14.6 OcnrfIngressGatewayPodPendingMessageInDangerOfCongestionState
Table 5-122 OcnrfIngressGatewayPodPendingMessageInDangerOfCongestionState
Field | Details |
---|---|
Description | 'The pod {{$labels.kubernetes_pod_name}} is in Danger of Congestion state due to Pending Message above threshold' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}},podname: {{$labels.kubernetes_pod_name}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : The pod is in Danger of Congestion state due to Pending Message above threshold' |
Severity | Major |
Condition | The Ingress Gateway pod is in Danger of Congestion state due to pending message count above the configured thresholds. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7097 |
Metric Used | oc_ingressgateway_pod_resource_state |
Recommended Actions | The alert is cleared when the pending message count goes below the configured thresholds for the Danger of Congestion state. Note: The thresholds can be viewed using REST API. Steps: Reassess if the NRF is receiving additional traffic. If this is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.15 Subscriber Location Function Feature
This section lists the alert that is specific to Subscriber Location Function feature. For more information about the feature, see the "Subscriber Location Function Feature" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.15.1 OcnrfMaxSlfAttemptsExhausted
Table 5-123 OcnrfMaxSlfAttemptsExhausted
Field | Details |
---|---|
Description | 'NF discovery request with fqdn {{$labels.NfProfileFqdn}} NF type {{$labels.NfType}} has exhausted maximum SLF attempts' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, nrflevel:{{$labels.NrfLevel}}, podname: {{$labels.kubernetes_pod_name}}, NfProfileFqdn: {{$labels.NfProfileFqdn}}, NfType: {{$labels.NfType}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: The maximum slf attempts have exhausted.' |
Severity | Critical |
Condition |
NF discovery request with FQDN of the given NFType UDR
has exhausted maximum SLF attempts. This alert is raised when the
Note: This alert is included if SLF selection from registered profiles is enabled. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7054 |
Metric Used | 'ocnrf_max_slf_attempts_exhausted_total' |
Recommended Actions | The alert is cleared automatically after 5 minutes. |
Available in OCI | Yes |
5.4.16 EmptyList in Discovery Response Feature
This section lists the alert that is specific to EmptyList in Discovery Response feature. For more information about the feature, see the "EmptyList in Discovery Response" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.16.1 OcnrfNFDiscoveryEmptyListObservedNotification
Table 5-124 OcnrfNFDiscoveryEmptyListObservedNotification
Field | Details |
---|---|
Description | 'Empty List observed with received discovery request with NfType $labels.NfType Feature Status $labels.FeatureStatus' |
Summary | 'namespace: $labels.namespace, nrflevel:$labels.NrfLevel, podname: $labels.pod, NfType: $labels.NfType, FeatureStatus: $labels.FeatureStatus: Empty List observed with received discovery request' |
Severity | Critical |
Condition | This alert is raised when no profiles match the discovery request. It is also raised when the response to an incoming request contains a SUSPENDED profile and the Empty List feature is enabled. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7055 |
Metric Used | ocnrf_nfDiscover_emptyList_total |
Recommended Actions | The alert is cleared automatically after a duration of 5 minutes. |
Available in OCI | No |
5.4.17 Support for TLS Feature
This section lists the alert that is specific to Support for TLS feature. For more information about the feature, see the "Support for TLS" section in Oracle Communications Cloud Native Core, Network Repository Function User Guide.
5.4.17.1 OcnrfTLSCertificateExpireMinor
Table 5-125 OcnrfTLSCertificateExpireMinor
Field | Details |
---|---|
Description | 'TLS certificate to expire in 6 months.' |
Summary | 'namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : TLS certificate to expire in 6 months' |
Severity | Minor |
Condition | This alert is raised when the TLS certificate is about to expire in six months. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7106 |
Metric Used | security_cert_x509_expiration_seconds |
Recommended Actions |
The alert is cleared when the TLS certificate is renewed. For more information about certificate renewal, see the "Creating Private Keys and Certificate" section in the Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
5.4.17.2 OcnrfTLSCertificateExpireMajor
Table 5-126 OcnrfTLSCertificateExpireMajor
Field | Details |
---|---|
Description | 'TLS certificate to expire in 3 months.' |
Summary | 'namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : TLS certificate to expire in 3 months' |
Severity | Major |
Condition | This alert is raised when the TLS certificate is about to expire in three months. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7107 |
Metric Used | security_cert_x509_expiration_seconds |
Recommended Actions |
The alert is cleared when the TLS certificate is renewed. For more information about certificate renewal, see "Creating Private Keys and Certificate" section in the Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
5.4.17.3 OcnrfTLSCertificateExpireCritical
Table 5-127 OcnrfTLSCertificateExpireCritical
Field | Details |
---|---|
Description | 'TLS certificate to expire in one month.' |
Summary | 'namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }} : TLS certificate to expire in 1 month' |
Severity | Critical |
Condition | This alert is raised when the TLS certificate is about to expire in one month. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7108 |
Metric Used | security_cert_x509_expiration_seconds |
Recommended Actions |
The alert is cleared when the TLS certificate is renewed. For more information about certificate renewal, see "Creating Private Keys and Certificate" section in the Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide. |
Available in OCI | No |
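The three TLS certificate alerts above form a tiered scheme based on the remaining certificate lifetime. The following is a minimal Python sketch of that tiering, not the actual alert expressions (those are defined in the alert rules file). It assumes that `security_cert_x509_expiration_seconds` reports the expiry time as a Unix timestamp and that a month is approximated as 30 days; the function name `tls_expiry_severity` is hypothetical.

```python
from typing import Optional

SECONDS_PER_DAY = 86_400
# Assumed windows: one, three, and six months approximated as 30-day months.
WINDOWS = [("Critical", 30), ("Major", 90), ("Minor", 180)]


def tls_expiry_severity(expiry_epoch: float, now_epoch: float) -> Optional[str]:
    """Return the most severe tier whose window contains the remaining lifetime."""
    remaining_days = (expiry_epoch - now_epoch) / SECONDS_PER_DAY
    for severity, window_days in WINDOWS:  # ordered from most to least severe
        if remaining_days <= window_days:
            return severity
    return None  # more than six months left: no alert expected
```

Under these assumptions, a certificate with 60 days left maps to Major, and renewing it (moving the expiry more than six months out) maps to no alert, which corresponds to the clearing condition described in the tables.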
5.4.18 Egress Gateway Pod Throttling
5.4.18.1 OcnrfEgressPerPodDiscardRateAboveMajorThreshold
Table 5-128 OcnrfEgressPerPodDiscardRateAboveMajorThreshold
Field | Details |
---|---|
Description | 'Egressgateway PerPod Discard Rate is greater than the configured major threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Egressgateway PerPod Discard Rate is more than 1 request per second.' |
Severity | Major |
Condition | This alert is raised when the rate at which the Egress Gateway pods discard traffic due to their request limit is greater than the configured major threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7113 |
Metric Used | oc_egressgateway_podlevel_throttling_discarded_total |
Recommended Actions | The alert is cleared when the rate at which the Egress Gateway pods discard traffic falls below the major threshold. Note: The threshold is configurable in the alert file. Steps: Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.18.2 OcnrfEgressPerPodDiscardRateAboveCriticalThreshold
Table 5-129 OcnrfEgressPerPodDiscardRateAboveCriticalThreshold
Field | Details |
---|---|
Description | 'Egressgateway PerPod Discard Rate is greater than the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Egressgateway PerPod Discard Rate is more than 100 requests per second.' |
Severity | Critical |
Condition | This alert is raised when the rate at which the Egress Gateway pods discard traffic due to their request limit is greater than the configured critical threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7114 |
Metric Used | oc_egressgateway_podlevel_throttling_discarded_total |
Recommended Actions | The alert is cleared when the rate at which the Egress Gateway pods discard traffic falls below the critical threshold. Note: The threshold is configurable in the alert file. Steps: Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.19 Ingress Gateway Pod Protection Using Rate Limiting
5.4.19.1 OcnrfIngressDiscardDueToRateLimitMajorThreshold
Table 5-130 OcnrfIngressDiscardDueToRateLimitMajorThreshold
Field | Details |
---|---|
Description | 'Ingress Gateway discards due to rate limit exceeds the configured major threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Ingressgateway Discard due to Rate Limit is more than or equal to 1 requests per second and less than 100 requests per second.' |
Severity | Major |
Condition | This alert is raised when the rate of requests that the Ingress Gateway discards due to rate limiting exceeds the configured major threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7129 |
Metric Used | oc_ingressgateway_http_request_ratelimit_denied_count_total |
Recommended Actions | The alert is cleared when the rate at which the Ingress Gateway pods discard traffic falls below the major threshold or exceeds the critical threshold. Note: The threshold is configurable in the alert file. Steps: Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.19.2 OcnrfIngressDiscardDueToRateLimitCriticalThreshold
Table 5-131 OcnrfIngressDiscardDueToRateLimitCriticalThreshold
Field | Details |
---|---|
Description | 'Ingress gateway discards due to rate limit exceeds the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Ingressgateway Discard due to Rate Limit is more than or equal to 100 requests per second.' |
Severity | Critical |
Condition | This alert is raised when the rate of requests that the Ingress Gateway discards due to rate limiting exceeds the configured critical threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7130 |
Metric Used | oc_ingressgateway_http_request_ratelimit_denied_count_total |
Recommended Actions | The alert is cleared when the rate at which the Ingress Gateway pods discard traffic falls below the critical threshold. Note: The threshold is configurable in the alert file. Steps: Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.20 Egress Gateway Pod Protection Using Rate Limiting
5.4.20.1 OcnrfEgressDiscardDueToRateLimitMajorThreshold
Table 5-132 OcnrfEgressDiscardDueToRateLimitMajorThreshold
Field | Details |
---|---|
Description | 'Egress Gateway discards due to rate limit exceeds the configured major threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Egressgateway Discard due to Rate Limit is more than or equal to 1 requests per second and less than 100 requests per second.' |
Severity | Major |
Condition | This alert is raised when the rate of requests that the Egress Gateway discards due to rate limiting exceeds the configured major threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7131 |
Metric Used | oc_egressgateway_http_request_ratelimit_denied_count_total |
Recommended Actions | The alert is cleared when the rate at which the Egress Gateway pods discard traffic falls below the major threshold or exceeds the critical threshold. Note: The threshold is configurable in the alert file. Steps: Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
5.4.20.2 OcnrfEgressDiscardDueToRateLimitCriticalThreshold
Table 5-133 OcnrfEgressDiscardDueToRateLimitCriticalThreshold
Field | Details |
---|---|
Description | 'Egress gateway discards due to rate limit exceeds the configured critical threshold. (current value is: {{ $value }})' |
Summary | 'kubernetes_namespace: {{$labels.kubernetes_namespace}}, timestamp: {{ with query "time()" }}{{ . | first | value | humanizeTimestamp }}{{ end }}: Egressgateway Discard due to Rate Limit is more than or equal to 100 requests per second.' |
Severity | Critical |
Condition | This alert is raised when the rate of requests that the Egress Gateway discards due to rate limiting exceeds the configured critical threshold. |
OID | 1.3.6.1.4.1.323.5.3.36.1.2.7132 |
Metric Used | oc_egressgateway_http_request_ratelimit_denied_count_total |
Recommended Actions | The alert is cleared when the rate at which the Egress Gateway pods discard traffic falls below the critical threshold. Note: The threshold is configurable in the alert file. Steps: Reassess why the NRF is receiving additional traffic (for example, the mated site NRF is unavailable in a georedundancy scenario). If this alert is unexpected, contact My Oracle Support. |
Available in OCI | No |
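The Summary text of the rate-limit alerts in sections 5.4.19 and 5.4.20 implies two bands: Major for a discard rate of at least 1 and less than 100 requests per second, and Critical for 100 requests per second or more. The following Python sketch illustrates that banding and why the Major alert clears once the rate reaches the critical band. The threshold values are taken from the Summary text and are configurable in the alert file; the function name `discard_severity` is hypothetical and this is not the alert expression itself.

```python
from typing import Optional

# Values from the alert Summary text; deployments may configure different ones.
MAJOR_THRESHOLD = 1.0       # requests per second
CRITICAL_THRESHOLD = 100.0  # requests per second


def discard_severity(discards_per_second: float) -> Optional[str]:
    """Map a rate-limit discard rate to the alert band it falls into."""
    if discards_per_second >= CRITICAL_THRESHOLD:
        return "Critical"
    if discards_per_second >= MAJOR_THRESHOLD:
        return "Major"  # at or above major, but still below critical
    return None  # below the major threshold: neither alert is expected
```

Because each rate maps to exactly one band, a rate that climbs from 50 to 150 requests per second leaves the Major band and enters the Critical band, so only one of the two alerts is active at a time.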
5.5 NRF Alert Configuration
Follow the steps below for NRF Alert configuration in Prometheus:
Note:
- The Name is the release name used in the helm install command.
- The Namespace is the namespace used in the helm install command. By default, the Namespace for NRF is ocnrf, which must be updated as per the deployment.
- The ocnrf-config-1.1.0.0.0.zip file can be downloaded from OHC. Unzip the ocnrf-config-1.1.0.0.0.zip package after downloading to get the NrfAlertrules.yaml file.
- Take a backup of the current configuration map of Prometheus:
kubectl get configmaps _NAME_-server -o yaml -n _Namespace_ > /tmp/tempConfig.yaml
- Check and add the NRF alert file name inside the Prometheus configuration map:
sed -i '/etc\/config\/alertsnrf/d' /tmp/tempConfig.yaml
sed -i '/rule_files:/a\ \- /etc/config/alertsnrf' /tmp/tempConfig.yaml
- Update the configuration map with the updated file name of the NRF alert file:
kubectl replace configmap _NAME_-server -f /tmp/tempConfig.yaml
- Add the NRF alert rules in the configuration map under the file name of the NRF alert file:
kubectl patch configmap _NAME_-server -n _Namespace_ --type merge --patch "$(cat ~/NrfAlertrules.yaml)"
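The two sed commands in the procedure above first delete any existing reference to /etc/config/alertsnrf and then add it once under rule_files, so the step can be repeated without creating duplicate entries. The following Python sketch reproduces that text transformation on an in-memory string to illustrate the effect. It is an equivalent written for illustration, not part of the procedure; the function name `add_rule_file` is hypothetical, and the indentation of the new list item (copied from the rule_files line) is an assumption.

```python
ALERT_FILE_ENTRY = "/etc/config/alertsnrf"


def add_rule_file(config_text: str, entry: str = ALERT_FILE_ENTRY) -> str:
    """Drop existing references to `entry`, then list it once under rule_files."""
    result = []
    for line in config_text.splitlines():
        if entry in line:
            continue  # like the sed delete: remove stale references first
        result.append(line)
        if "rule_files:" in line:
            # Assumption: the new item reuses the indentation of the rule_files key.
            indent = line[: len(line) - len(line.lstrip())]
            result.append(f"{indent}- {entry}")
    return "\n".join(result) + "\n"
```

Running the function twice on the same text yields the same result as running it once, which is the property the delete-then-append pair provides.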
Note:
The Prometheus server automatically reloads the updated configuration map after approximately 60 seconds. Refresh the Prometheus GUI to confirm that the NRF alerts are loaded.
5.5.1 Disable Alerts
- Edit the NrfAlertrules-25.1.200.yaml file to remove a specific alert.
- Remove the complete content of the specific alert from the NrfAlertrules-25.1.200.yaml file.
For example, to remove the OcnrfTrafficRateAboveMinorThreshold alert, remove the following complete content:
## ALERT SAMPLE START##
- alert: OcnrfTrafficRateAboveMinorThreshold
  annotations:
    description: 'Ingress traffic Rate is above minor threshold i.e. 800 mps (current value is: {{ $value }})'
    summary: 'Traffic Rate is above 80 Percent of Max requests per second(1000)'
  expr: sum(rate(oc_ingressgateway_http_requests_total{app_kubernetes_io_name="ingressgateway",kubernetes_namespace="ocnrf"}[2m])) >= 800 < 900
  labels:
    severity: Minor
## ALERT SAMPLE END##
- Perform Alert configuration. For more information about configuring alerts, see the NRF Alert Configuration section.
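Removing an alert by hand means deleting the `- alert:` line together with every line nested under it. The following Python sketch shows one way to do that on the text of a rules file. It assumes that each rule starts with a `- alert: <name>` list item and that its keys are indented more deeply than that item, as in the sample above; the function name `remove_alert` is hypothetical, and the sketch should be tried on a copy of the file rather than the original.

```python
def remove_alert(rules_text: str, alert_name: str) -> str:
    """Remove one `- alert: <name>` list item and its nested keys."""
    kept = []
    skipping = False
    item_indent = 0
    for line in rules_text.splitlines():
        stripped = line.strip()
        indent = len(line) - len(line.lstrip())
        if skipping:
            # The block ends at the first non-blank line that is not nested deeper.
            if stripped and indent <= item_indent:
                skipping = False
            else:
                continue
        if stripped == f"- alert: {alert_name}":
            skipping = True
            item_indent = indent
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

For example, `remove_alert(text, "OcnrfTrafficRateAboveMinorThreshold")` drops that rule and leaves the neighbouring rules untouched; a name that does not occur leaves the text unchanged.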
5.5.2 Configuring SNMP Notifier
This section describes the procedure to configure SNMP Notifier.
- Run the following command to edit the deployment:
$ kubectl edit deploy <snmp_notifier_deployment_name> -n <namespace>
Example:
$ kubectl edit deploy occne-snmp-notifier -n occne-infra
The SNMP deployment yaml file is displayed.
- Edit the SNMP destination in the deployment yaml file as follows:
--snmp.destination=<destination_ip>:<destination_port>
Example:
--snmp.destination=10.75.203.94:162
- Save the file.
To check the generated traps, view the logs of the trap receiver container, for example:
$ docker logs <trapd_container_id>
2020-04-29 15:34:24 10.75.203.103 [UDP: [10.75.203.103]:2747->[172.17.0.4]:162]:DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (158510800) 18 days, 8:18:28.00 SNMPv2-MIB::snmpTrapOID.0 = OID: SNMPv2-SMI::enterprises.323.5.3.36.1.2.7003 SNMPv2-SMI::enterprises.323.5.3.36.1.2.7003.1 = STRING: "1.3.6.1.4.1.323.5.3.36.1.2.7003[]" SNMPv2-SMI::enterprises.323.5.3.36.1.2.7003.2 = STRING: "critical" SNMPv2-SMI::enterprises.323.5.3.36.1.2.7003.3 = STRING: "Status: critical- Alert: OcnrfActiveSubscribersBelowCriticalThreshold Summary: namespace: ocnrf, nftype:5G_EIR, nrflevel:6faf1bbc-6e4a-4454-a507-a14ef8e1bc5c, podname: ocnrf-nrfauditor-6b459f5db5-4kvt4,
timestamp: 2020-04-29 15:33:24.408 +0000 UTC: Current number of registered NFs detected below critical threshold. Description: The number of registered NFs detected below critical threshold (current value
is: 0)
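In the sample trap above, the OID appears in symbolic form (SNMPv2-SMI::enterprises.323.5.3.36.1.2.7003), while the alert tables in this chapter list it in numeric form (1.3.6.1.4.1.323.5.3.36.1.2.7003). The two are the same identifier, because the enterprises subtree is 1.3.6.1.4.1. The following Python sketch converts between the forms and extracts the alert name from the trap text so that a trap can be matched to its table. The function names are hypothetical, and the `Alert: <name>` pattern is assumed from the sample output only.

```python
import re
from typing import Optional

ENTERPRISES_PREFIX = "SNMPv2-SMI::enterprises."
ENTERPRISES_OID = "1.3.6.1.4.1"  # iso.org.dod.internet.private.enterprises


def to_numeric_oid(symbolic: str) -> str:
    """Expand 'SNMPv2-SMI::enterprises.<suffix>' to its dotted numeric form."""
    if not symbolic.startswith(ENTERPRISES_PREFIX):
        raise ValueError(f"not an enterprises OID: {symbolic!r}")
    return f"{ENTERPRISES_OID}.{symbolic[len(ENTERPRISES_PREFIX):]}"


def trap_alert_name(trap_text: str) -> Optional[str]:
    """Return the alert name following 'Alert:' in the trap text, if present."""
    match = re.search(r"Alert:\s*(\w+)", trap_text)
    return match.group(1) if match else None
```

Applied to the sample, `to_numeric_oid` gives the value shown in the OID row of the corresponding alert table, and `trap_alert_name` gives OcnrfActiveSubscribersBelowCriticalThreshold.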
The following MIB files are used to generate the traps. The user needs to update these files along with the alert file to fetch the traps in their environment.
- ocnrf_mib_tc_25.1.200.mib: This is considered the NRF top-level MIB file, where the objects and their data types are defined.
- ocnrf_mib_25.1.200.mib: This file fetches the objects from the top-level MIB file. Based on the alert notification, these objects can be selected for display.
- toplevel_25.1.200.mib: This defines the OIDs for all NFs.
Note:
MIB files are packaged along with the release package. Download the file from MOS. For more information on downloading the release package, see Oracle Communications Cloud Native Core, Network Repository Function Installation, Upgrade, and Fault Recovery Guide.