10 OCNWDAF Alerts
10.1 OCNWDAF Alert Configuration
This section describes how to configure measurement-based alert rules for OCNWDAF. The Alert Manager raises alerts when the Prometheus metric values reported by the microservices satisfy the conditions defined in the alert rules.
OCNWDAF Alert configuration in Prometheus
The following procedure is used to configure alerts in Prometheus:
- Download the ocn-nwdaf-alerting-rules.yaml file. Edit this file to configure the alert rules. The parameters in the file that can be edited include the name of the alert and the rules for the alert, including the alert name (alert) and the expression (expr) defined to trigger the alert.
- Copy the updated ocn-nwdaf-alerting-rules.yaml file to the Bastion Host.
- Run the following command:
  kubectl apply -f ocn-nwdaf-alerting-rules.yaml -n ocn-nwdaf
- To verify that the Custom Resource Definition (CRD) is created, run the following command:
  kubectl get prometheusrule -n ocn-nwdaf
- Verify the alerts in the Prometheus GUI, where the alert name and expression are listed. See the example below:
Figure 10-1 Prometheus GUI
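The apply and verify steps of this procedure can also be scripted. The following Python sketch only assembles the two kubectl commands shown above and prints them; the helper names rule_commands and run_all are illustrative assumptions, not part of OCNWDAF, and run_all is meant to be called only on a host where kubectl is configured for the cluster.

```python
import subprocess

RULES_FILE = "ocn-nwdaf-alerting-rules.yaml"
NAMESPACE = "ocn-nwdaf"


def rule_commands(rules_file: str, namespace: str) -> list:
    """Return the apply and verify commands from the procedure, in order."""
    return [
        ["kubectl", "apply", "-f", rules_file, "-n", namespace],
        ["kubectl", "get", "prometheusrule", "-n", namespace],
    ]


def run_all(commands, runner=subprocess.run) -> None:
    """Run each command in turn; check=True makes a failing command raise."""
    for cmd in commands:
        runner(cmd, check=True)


# Print the commands instead of running them, so the sketch is safe to try
# on a machine without cluster access.
for cmd in rule_commands(RULES_FILE, NAMESPACE):
    print(" ".join(cmd))
```

On the Bastion Host, calling run_all(rule_commands(RULES_FILE, NAMESPACE)) would execute both steps and stop at the first failure.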

Alert Rules
The alerts are configured on the Prometheus server. Each set of scraped metrics corresponds to a pod that runs a single microservice, so each alert belongs to one of the running pods. Prometheus continuously collects metrics and triggers an alert when the condition of any alerting rule is met. All the alert rules are written in one or more .yml files and deployed as described in the procedure OCNWDAF Alert configuration in Prometheus. The alert rules for the various alerts captured for OCNWDAF are listed below:
Service not running rule:
- name: <ALERT NAME>
  rules:
  - alert: <ALERT NAME>
    expr: up{app="<SERVICE LABEL>"} == 0
Example:
- name: OCN_NWDAF_DATA_COLLECTION_NOT_RUNNING
  rules:
  - alert: OCN_NWDAF_DATA_COLLECTION_NOT_RUNNING
    expr: up{app="ocn-nwdaf-data-collection"} == 0
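The template and the example above differ only in the service label and the alert name derived from it. As a minimal sketch, the following Python function builds one such rule group for a given service label. It assumes the alert name is the label with dashes replaced by underscores and converted to upper case, which matches the example above; the function name not_running_rule is hypothetical.

```python
import json


def not_running_rule(service_label: str) -> dict:
    """Build one rule group that follows the template above."""
    # Assumed naming convention: ocn-nwdaf-x -> OCN_NWDAF_X_NOT_RUNNING
    alert_name = service_label.replace("-", "_").upper() + "_NOT_RUNNING"
    return {
        "name": alert_name,
        "rules": [
            {
                "alert": alert_name,
                # Doubled braces emit literal { and } in an f-string.
                "expr": f'up{{app="{service_label}"}} == 0',
            }
        ],
    }


group = not_running_rule("ocn-nwdaf-data-collection")
print(json.dumps(group, indent=2))
```

The output is JSON rather than YAML so that the sketch needs only the standard library; the keys and values correspond one to one with the YAML fields above.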
Request rate rule:
- name: <ALERT NAME>
  rules:
  - alert: <ALERT NAME>
    expr: >
      sum without(method,status,outcome,exception,app,instance,container,pod,pod_template_hash)
      (rate(http_server_requests_seconds_count{uri="<URI ENDPOINT>"}[1m])) > 1000
Example:
- name: HIGH_ABNORMAL_BEHAVIOUR_REQUEST_RATE
  rules:
  - alert: HIGH_ABNORMAL_BEHAVIOUR_REQUEST_RATE
    expr: sum without(method,status,outcome,exception,app,instance,container,pod,pod_template_hash) (rate(http_server_requests_seconds_count{uri="nnwdaf-analyticsinfo/v1/analytics?event-id=ABNORMAL_BEHAVIOUR"}[1m])) > 1000
Failure rate request rule:
- name: <ALERT NAME>
  rules:
  - alert: <ALERT NAME>
    expr: >
      (sum without(method,outcome,exception,app,instance,container,pod,pod_template_hash)
      (rate(http_server_requests_seconds_count{uri="<URI ENDPOINT>",status=~"[4-5].."}[1m]))
      / ignoring(status)
      sum without(method,status,outcome,exception,app,instance,container,pod,pod_template_hash)
      (rate(http_server_requests_seconds_count{uri="<URI ENDPOINT>"}[1m]))) * 100 > 70
Example:
- name: HIGH_ABNORMAL_BEHAVIOUR_REQUEST_FAILURE_RATE
  rules:
  - alert: HIGH_ABNORMAL_BEHAVIOUR_REQUEST_FAILURE_RATE
    expr: (sum without(method,outcome,exception,app,instance,container,pod,pod_template_hash) (rate(http_server_requests_seconds_count{uri="nnwdaf-analyticsinfo/v1/analytics?event-id=ABNORMAL_BEHAVIOUR",status=~"[4-5].."}[1m]))/ ignoring(status) sum without(method,status,outcome,exception,app,instance,container,pod,pod_template_hash) (rate(http_server_requests_seconds_count{uri="nnwdaf-analyticsinfo/v1/analytics?event-id=ABNORMAL_BEHAVIOUR"}[1m]))) * 100 > 70
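The failure-rate expression above divides the per-second rate of requests that returned a 4xx or 5xx status by the per-second rate of all requests to the same URI, then compares the resulting percentage with 70. The following Python sketch repeats that arithmetic with made-up figures; the function name and the sample rates are illustrative assumptions only.

```python
THRESHOLD = 70  # percent, taken from the rule expression


def failure_percentage(failed_rate: float, total_rate: float) -> float:
    """Return failing requests as a percentage of all requests.

    Mirrors (rate(4xx and 5xx) / rate(all)) * 100 from the rule.
    """
    if total_rate == 0:
        # With no traffic there is nothing to alert on; this sketch
        # reports 0.0 rather than dividing by zero.
        return 0.0
    return failed_rate / total_rate * 100


# Hypothetical sample: 45 failing requests/s out of 60 requests/s.
pct = failure_percentage(45.0, 60.0)
fires = pct > THRESHOLD
print(f"{pct:.1f}% failing, alert fires: {fires}")
```

With these sample rates the percentage is 75, which is above the threshold, so the condition would hold.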
CPU load rule:
- name: <ALERT NAME>
  rules:
  - alert: <ALERT NAME>
    expr: system_cpu_usage{app="<SERVICE LABEL>"} * 100 > 80
Example:
- name: OCN_NWDAF_DATA_COLLECTION_HIGH_CPU_LOAD
  rules:
  - alert: OCN_NWDAF_DATA_COLLECTION_HIGH_CPU_LOAD
    expr: system_cpu_usage{app="ocn-nwdaf-data-collection"} * 100 > 80
JVM heap memory usage rule:
- name: <ALERT NAME>
  rules:
  - alert: <ALERT NAME>
    expr: >
      (sum(avg_over_time(jvm_memory_used_bytes{area="heap",app="<SERVICE LABEL>"}[1m]))
      / sum(avg_over_time(jvm_memory_max_bytes{area="heap",app="<SERVICE LABEL>"}[1m]))) * 100 > 80
Example:
- name: OCN_NWDAF_DATA_COLLECTION_HIGH_JVM_HEAP_MEMORY_USAGE
  rules:
  - alert: OCN_NWDAF_DATA_COLLECTION_HIGH_JVM_HEAP_MEMORY_USAGE
    expr: (sum(avg_over_time(jvm_memory_used_bytes{area="heap",app="ocn-nwdaf-data-collection"}[1m]))/sum(avg_over_time(jvm_memory_max_bytes{area="heap",app="ocn-nwdaf-data-collection"}[1m]))) * 100 > 80
10.1.1 SNMP Support
Simple Network Management Protocol (SNMP) is an application-layer protocol designed for monitoring and managing network devices within a Local Area Network (LAN) or Wide Area Network (WAN).
OCNWDAF forwards the Prometheus alerts as SNMP traps to the southbound SNMP servers. OCNWDAF uses two SNMP MIB files to generate the traps. To configure the alert manager, update the alertmanager.yaml file. In this file, the alerts can be grouped based on podname, alertname, severity, namespace, and so on. The Prometheus Alert Manager is integrated with the Oracle Communications Cloud Native Core, Cloud Native Environment (CNE) snmp-notifier service. The external SNMP servers are set up to receive the Prometheus alerts as SNMP traps. The operator must update the MIB and alert manager files to fetch the SNMP traps in their environment.
Configuring SNMP Support
The alertmanager.yaml file is updated to include additional information for SNMP traps.
Sample of the alertmanager.yaml file:
{{- range $key, $svcName := .Values.global.rules.services }}
- alert: {{ $svcName | replace "-" "_" | upper }}_HIGH_CPU_LOAD
  expr: system_cpu_usage{job={{ $svcName | quote }}, namespace={{ $.Release.Namespace | quote }} } * 100 > 90
  for: 5m
  labels:
    alertname: "OCN_NWDAF_SVC_HIGH_CPU_LOAD"
    oid: "1.3.6.1.4.1.323.5.3.45.1.{{ index $.Values.global.rules.oid $svcName }}.4002"
    severity: critical
    namespace: {{ $.Release.Namespace | quote }}
  annotations:
    namespace: {{ $.Release.Namespace | quote }}
    severity: critical
    summary: "Service {{ "{{$labels.app}}" }} CPU load is high."
    description: "Service {{ "{{$labels.app}}" }} CPU load has been high for more than 5 minutes."
{{- end }}
Configure the SNMP Test Client
Follow the steps below to configure the SNMP Test Client:
- Create a ConfigMap that includes the MIB files. Run the following command:
  kubectl create configmap my-config --from-file=/path/to/mib/files/ -n <namespace>
  Where, my-config is the name of the ConfigMap. The same name must be used in the pod configuration file. The ConfigMap must be in the same namespace where the SNMP client is deployed.
- To start the SNMP trap daemon service, use the service configuration .yaml file. See the example below:
  apiVersion: v1
  kind: Service
  metadata:
    labels:
      name: snmptrapd
    name: snmptrapd
    namespace: performance-idc # namespace in which you want to deploy the service
  spec:
    ports:
    - name: snmptrapd
      port: 162
      protocol: UDP
      targetPort: 162
    selector:
      name: snmptrapd
    sessionAffinity: None
    type: ClusterIP
- Use the following pod deployment configuration to deploy the pod corresponding to the above service. The commands in this file copy the MIB files from the ConfigMap to the pod, after which the SNMP trap daemon service application starts.
  Sample docker file:
  FROM ocr-docker-remote.artifactory.oci.oraclecorp.com/os/oraclelinux:8-slim
  ARG HTTPS_PROXY=http://www-proxy.us.oracle.com:80
  RUN echo -e "[main]\nproxy=${HTTPS_PROXY}" >> /etc/dnf/dnf.conf
  RUN microdnf update -y && microdnf install -y lsof
  RUN microdnf install net-snmp
  ADD snmptrapd.conf /etc/snmp/snmptrapd.conf
  EXPOSE 162
  CMD ["/bin/sh"]
  Sample snmp-pod.yaml file:
  apiVersion: v1
  kind: Pod
  metadata:
    name: snmptrapd
    labels:
      name: snmptrapd
      role: snmptrapd
    namespace: performance-idc # namespace in which you want to deploy the pod
  spec:
    containers:
    - name: snmptrapd
      image: occne-repo-host:5000/snmptrapd:1.1.1 # you need to create your own snmptrapd image using the docker file
      volumeMounts:
      - name: config-volume
        mountPath: /MIB
      imagePullPolicy: IfNotPresent
      command: ["/bin/bash","-c","kill -9 $(lsof -t -i:162); cp /MIB/* /usr/share/snmp/mibs && echo MIB files copied successfully ;snmptrapd -m ALL -f -Of -Lo"]
      ports:
      - containerPort: 162
        protocol: UDP
      resources:
        limits:
          cpu: "1"
          memory: 1Gi
        requests:
          cpu: "1"
          memory: 1Gi
    volumes:
    - name: config-volume
      configMap:
        name: my-config
- Run the following commands to deploy the service and pod in the SNMP Client:
  kubectl apply -f snmp-pod.yaml
  kubectl apply -f snmp-svc.yaml
- Run the following command to view the pod logs:
  $ kubectl logs pod/snmptrapd -n performance-idc
  Sample of the pod logs:
  kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
  MIB files copied successfully
  NET-SNMP version 5.7.2
Integrate the Alert Manager with snmp-notifier Service
Update the SNMP client destination in the occne-snmp-notifier service with the SNMP destination client IP:
$ kubectl edit deployment -n occne-infra occne-snmp-notifier
Update --snmp.destination=<IP>:<port> inside the args of the container and add the SNMP client destination IP as follows:
- --snmp.destination=<fqdn of target receiver>:162
Verify the Traps
Run the following command to verify the traps:
$ kubectl logs pod/snmptrapd -n performance-idc -f
Sample output:
2024-03-11 11:31:10 10-233-87-165.occne-snmp-notifier.occne-infra.svc.blurr8 [UDP: [10.233.87.165]:46951->[10.233.116.34]:162]:
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (147060000) 17 days, 0:30:00.00
.iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 = OID: .iso.org.dod.internet.private.enterprises.tekelecCorp.tekelecProductGroups.tekelecSwitchingGroup.oracleNWDAF.oracleNWDAFMIB.ocnwdafGeoredagent.ocnwdafgeoredagentSvcDown
.iso.org.dod.internet.private.enterprises.tekelecCorp.tekelecProductGroups.tekelecSwitchingGroup.oracleNWDAF.oracleNWDAFMIB.ocnwdafGeoredagent.ocnwdafgeoredagentSvcDown.1 = STRING: "1.3.6.1.4.1.323.5.3.45.1.33.2002[job=ocn-nwdaf-georedagent]"
.iso.org.dod.internet.private.enterprises.tekelecCorp.tekelecProductGroups.tekelecSwitchingGroup.oracleNWDAF.oracleNWDAFMIB.ocnwdafGeoredagent.ocnwdafgeoredagentSvcDown.2 = STRING: "critical"
.iso.org.dod.internet.private.enterprises.tekelecCorp.tekelecProductGroups.tekelecSwitchingGroup.oracleNWDAF.oracleNWDAFMIB.ocnwdafGeoredagent.ocnwdafgeoredagentSvcDown.3 = STRING: "Status: critical
- Alert: OCN_NWDAF_GEOREDAGENT_NOT_RUNNING
Summary: Service is down.
Description: Service has been down for more than 2 minutes."
2024-03-11 11:35:38 10-233-87-165.occne-snmp-notifier.occne-infra.svc.blurr8 [UDP: [10.233.87.165]:56385->[10.233.116.34]:162]:
.iso.org.dod.internet.mgmt.mib-2.system.sysUpTime.sysUpTimeInstance = Timeticks: (147086800) 17 days, 0:34:28.00
.iso.org.dod.internet.snmpV2.snmpModules.snmpMIB.snmpMIBObjects.snmpTrap.snmpTrapOID.0 = OID: .iso.org.dod.internet.private.enterprises.tekelecCorp.tekelecProductGroups.tekelecSwitchingGroup.oracleNWDAF.oracleNWDAFMIB.cap4cModelController.ocnwdafcap4cModelControllerSvcDown
.iso.org.dod.internet.private.enterprises.tekelecCorp.tekelecProductGroups.tekelecSwitchingGroup.oracleNWDAF.oracleNWDAFMIB.cap4cModelController.ocnwdafcap4cModelControllerSvcDown.1 = STRING: "1.3.6.1.4.1.323.5.3.45.1.24.2002[job=cap4c-model-controller]"
.iso.org.dod.internet.private.enterprises.tekelecCorp.tekelecProductGroups.tekelecSwitchingGroup.oracleNWDAF.oracleNWDAFMIB.cap4cModelController.ocnwdafcap4cModelControllerSvcDown.2 = STRING: "critical"
.iso.org.dod.internet.private.enterprises.tekelecCorp.tekelecProductGroups.tekelecSwitchingGroup.oracleNWDAF.oracleNWDAFMIB.cap4cModelController.ocnwdafcap4cModelControllerSvcDown.3 = STRING: "Status: critical
- Alert: CAP4C_MODEL_CONTROLLER_NOT_RUNNING
Summary: Service is down.
Description: Service has been down for more than 2 minutes."
Figure 10-2 Prometheus GUI
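When many traps arrive, it can help to reduce the snmptrapd log to one line per trap. The following Python sketch extracts the alert name, job, and OID from text in the format of the sample output above. The regular expressions are assumptions based only on that sample, and the names SAMPLE_TRAP and summarize are hypothetical.

```python
import re

# Shortened copy of one trap from the sample output. "..." stands for the
# long MIB path prefix, which the patterns below do not depend on.
SAMPLE_TRAP = '''\
...ocnwdafgeoredagentSvcDown.1 = STRING: "1.3.6.1.4.1.323.5.3.45.1.33.2002[job=ocn-nwdaf-georedagent]"
...ocnwdafgeoredagentSvcDown.2 = STRING: "critical"
...ocnwdafgeoredagentSvcDown.3 = STRING: "Status: critical
- Alert: OCN_NWDAF_GEOREDAGENT_NOT_RUNNING
Summary: Service is down.
Description: Service has been down for more than 2 minutes."
'''

# Matches "<dotted OID>[job=<job name>]" inside a quoted STRING value.
OID_AND_JOB = re.compile(r'"([0-9.]+)\[job=([^\]]+)\]"')
# Matches the "- Alert: <NAME>" line inside the multi-line description.
ALERT_NAME = re.compile(r"^- Alert: (\S+)", re.MULTILINE)


def summarize(trap_text: str) -> list:
    """Return (alert name, job, OID) for each trap found in the text."""
    oids = OID_AND_JOB.findall(trap_text)
    alerts = ALERT_NAME.findall(trap_text)
    # Pair the n-th OID with the n-th alert name, in order of appearance.
    return [(alert, job, oid) for (oid, job), alert in zip(oids, alerts)]


for alert, job, oid in summarize(SAMPLE_TRAP):
    print(f"{alert}  job={job}  oid={oid}")
```

The pairing relies on each trap carrying exactly one OID string and one Alert line, as in the sample; log formats that differ would need adjusted patterns.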

OCNWDAF MIB Files
- OCNWDAF-MIB-TC-24.2.0.mib: This is the top-level MIB file, where the objects and their data types are defined.
- OCNWDAF-MIB-24.2.0.mib: This file fetches the objects from the top-level MIB file; the objects to display are selected based on the alert notification.
OID Definition for OCNWDAF Services
OCNWDAF microservices and OID definitions are listed below:
OCNWDAF's OID: 1.3.6.1.4.1.323.5.3.45
Table 10-1 OID Definitions
| Service Name | OID |
|---|---|
| cap4c-api-gateway | 1.3.6.1.4.1.323.5.3.45.1.20 |
| cap4c-capex-optimization-service | 1.3.6.1.4.1.323.5.3.45.1.21 |
| cap4c-configuration-manager-service | 1.3.6.1.4.1.323.5.3.45.1.22 |
| cap4c-kafka-ingestor | 1.3.6.1.4.1.323.5.3.45.1.23 |
| cap4c-model-controller | 1.3.6.1.4.1.323.5.3.45.1.24 |
| cap4c-stream-analytics | 1.3.6.1.4.1.323.5.3.45.1.25 |
| cap4c-stream-transformer | 1.3.6.1.4.1.323.5.3.45.1.26 |
| nwdaf-cap4c-reporting-service | 1.3.6.1.4.1.323.5.3.45.1.27 |
| nwdaf-cap4c-scheduler-service | 1.3.6.1.4.1.323.5.3.45.1.28 |
| nwdaf-cap4c-spring-cloud-config-server | 1.3.6.1.4.1.323.5.3.45.1.29 |
| ocn-nwdaf-analytics-info | 1.3.6.1.4.1.323.5.3.45.1.30 |
| ocn-nwdaf-data-collection-service | 1.3.6.1.4.1.323.5.3.45.1.31 |
| ocn-nwdaf-datacollection-controller | 1.3.6.1.4.1.323.5.3.45.1.32 |
| ocn-nwdaf-georedagent | 1.3.6.1.4.1.323.5.3.45.1.33 |
| ocn-nwdaf-mtlf-service | 1.3.6.1.4.1.323.5.3.45.1.34 |
| ocn-nwdaf-subscription-service | 1.3.6.1.4.1.323.5.3.45.1.35 |
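The OID string carried in a trap is the service OID from Table 10-1 followed by an alert-specific number. For example, the trap sample for ocn-nwdaf-georedagent carries 1.3.6.1.4.1.323.5.3.45.1.33.2002. The following Python sketch shows that composition for a few services from the table; the names SERVICE_IDS and alert_oid are illustrative assumptions.

```python
OCNWDAF_OID = "1.3.6.1.4.1.323.5.3.45"

# Subset of Table 10-1: service name -> last number of the service OID,
# which sits under <OCNWDAF_OID>.1
SERVICE_IDS = {
    "cap4c-model-controller": 24,
    "ocn-nwdaf-georedagent": 33,
    "ocn-nwdaf-subscription-service": 35,
}


def alert_oid(service: str, suffix: int) -> str:
    """Append the alert-specific suffix to the service OID."""
    return f"{OCNWDAF_OID}.1.{SERVICE_IDS[service]}.{suffix}"


# Reproduces the OID string seen in the georedagent trap sample.
print(alert_oid("ocn-nwdaf-georedagent", 2002))
```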
Alerts
OCNWDAF Subscription Alerts
This section lists the OCNWDAF subscription alerts:
Table 10-2 OCNWDAF_SUBSCRIPTION_CREATE
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 2000 |
| Description | Indicates the subscription is successfully created. |
Table 10-3 OCNWDAF_SUBSCRIPTION_CREATE_FAILURE
| Field | Details |
|---|---|
| Severity | Warning |
| OID to be appended | 2001 |
| Description | Indicates an issue in creating the subscription. |
Table 10-4 OCNWDAF_SUBSCRIPTION_DELETE
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 2002 |
| Description | Indicates the subscription is successfully deleted. |
Table 10-5 OCNWDAF_SUBSCRIPTION_UPDATE
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 2003 |
| Description | Indicates the subscription is successfully updated. |
Table 10-6 OCNWDAF_SUBSCRIPTION_DELETE_FAILURE
| Field | Details |
|---|---|
| Severity | Warning |
| OID to be appended | 2004 |
| Description | Indicates an issue in deleting the subscription. |
Table 10-7 OCNWDAF_SUBSCRIPTION_UPDATE_FAILURE
| Field | Details |
|---|---|
| Severity | Warning |
| OID to be appended | 2005 |
| Description | Indicates an issue in updating the subscription. |
Notification Alerts
This section lists the notification alerts:
Table 10-8 OCNWDAF_ABNORMAL_BEHAVIOR_STATISTICS_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3000 |
| Description | Indicates abnormal behavior statistics notification is received. |
Table 10-9 OCNWDAF_ABNORMAL_BEHAVIOR_THRESHOLD_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3001 |
| Description | Indicates abnormal behavior threshold notification is received. |
Table 10-10 OCNWDAF_ABNORMAL_BEHAVIOR_PREDICTION_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3002 |
| Description | Indicates abnormal behavior prediction notification is received. |
Table 10-11 OCNWDAF_NETWORK_PERFORMANCE_STATISTICS_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3003 |
| Description | Indicates network performance statistics notification is received. |
Table 10-12 OCNWDAF_NETWORK_PERFORMANCE_THRESHOLD_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3004 |
| Description | Indicates network performance threshold notification is received. |
Table 10-13 OCNWDAF_NETWORK_PERFORMANCE_PREDICTION_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3005 |
| Description | Indicates network performance prediction notification is received. |
Table 10-14 OCNWDAF_NF_LOAD_STATISTICS_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3006 |
| Description | Indicates NF load statistics notification is received. |
Table 10-15 OCNWDAF_NF_LOAD_THRESHOLD_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3007 |
| Description | Indicates NF load threshold notification is received. |
Table 10-16 OCNWDAF_NF_LOAD_PREDICTION_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3008 |
| Description | Indicates NF load prediction notification is received. |
Table 10-17 OCNWDAF_SLICE_LOAD_STATISTICS_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3009 |
| Description | Indicates slice load statistics notification is received. |
Table 10-18 OCNWDAF_SLICE_LOAD_THRESHOLD_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3010 |
| Description | Indicates slice load threshold notification is received. |
Table 10-19 OCNWDAF_SLICE_LOAD_PREDICTION_NOTIFICATION
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 3011 |
| Description | Indicates slice load prediction notification is received. |
ML Model Alerts
This section lists the ML model alerts:
Table 10-20 OCNWDAF_MODEL_CREATION_FAILURE
| Field | Details |
|---|---|
| Severity | Critical |
| OID to be appended | 4000 |
| Description | Indicates an issue in ML model creation. |
Table 10-21 OCNWDAF_MODEL_CREATION_SUCCESS
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 4001 |
| Description | Indicates ML model is successfully created. |
Data Collection Alerts
This section lists the data collection alerts:
Table 10-22 PRESENCE_IN_AOI_REPORT_RECEIVED
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 5000 |
| Description | Indicates Presence in AOI report is successfully received. |
Table 10-23 LOCATION_REPORT_RECEIVED
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 5001 |
| Description | Indicates location report is successfully received. |
Table 10-24 UES_IN_AREA_REPORT_RECEIVED
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 5002 |
| Description | Indicates UEs in area report is successfully received. |
Table 10-25 NF_LOAD_REPORT_RECEIVED
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 5003 |
| Description | Indicates NF load report is successfully received. |
Table 10-26 SMF_SES_EST_REPORT_RECEIVED
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 5004 |
| Description | Indicates a SMF session established report is successfully received. |
Table 10-27 SMF_SES_REL_REPORT_RECEIVED
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 5005 |
| Description | Indicates a SMF session released report is successfully received. |
Table 10-28 KAFKA_SOURCED_REPORT_RECEIVED
| Field | Details |
|---|---|
| Severity | Info |
| OID to be appended | 5006 |
| Description | Indicates Kafka sourced report is successfully received. |
Operation Alerts
This section lists the operational alerts:
Table 10-29 OCN_NWDAF_SVC_HIGH_CPU_LOAD
| Field | Details |
|---|---|
| Severity | Critical |
| OID to be appended | 6000 |
| Description | Verifies if the CPU usage of a particular service is exceeding 90%. |
Table 10-30 OCN_NWDAF_SVC_HIGH_JVM_MEMORY_USAGE
| Field | Details |
|---|---|
| Severity | Critical |
| OID to be appended | 6001 |
| Description | Verifies if the percentage of heap memory used by a specific JVM instance/service is exceeding 90% over a one minute duration. |
Table 10-31 OCN_NWDAF_SVC_NOT_RUNNING_ALERT
| Field | Details |
|---|---|
| Severity | Critical |
| OID to be appended | 6002 |
| Description | Verifies if there are no instances of the specified service running in the specified Kubernetes namespace or if all instances of the service are not healthy. |
10.2 System Level Alerts
This section lists the system level alerts.
OCN_NWDAF_ANALYTICS_HIGH_CPU_LOAD
Table 10-32 OCN_NWDAF_ANALYTICS_HIGH_CPU_LOAD
| Field | Details |
|---|---|
| Description | CPU load is high at the pod where the microservice is running. |
| Affected Functions | All |
| Cause | CPU load is more than 80% of the allocated resources. |
OCN_NWDAF_COMMUNICATION_HIGH_CPU_LOAD
Table 10-33 OCN_NWDAF_COMMUNICATION_HIGH_CPU_LOAD
| Field | Details |
|---|---|
| Description | CPU load is high at the pod where the microservice is running. |
| Affected Functions | All |
| Cause | CPU load is more than 80% of the allocated resources. |
OCN_NWDAF_CONFIGURATION_SERVICE_HIGH_CPU_LOAD
Table 10-34 OCN_NWDAF_CONFIGURATION_SERVICE_HIGH_CPU_LOAD
| Field | Details |
|---|---|
| Description | CPU load is high at the pod where the microservice is running. |
| Affected Functions | All |
| Cause | CPU load is more than 80% of the allocated resources. |
OCN_NWDAF_DATA_COLLECTION_HIGH_CPU_LOAD
Table 10-35 OCN_NWDAF_DATA_COLLECTION_HIGH_CPU_LOAD
| Field | Details |
|---|---|
| Description | CPU load is high at the pod where the microservice is running. |
| Affected Functions | All |
| Cause | CPU load is more than 80% of the allocated resources. |
OCN_NWDAF_GATEWAY_HIGH_CPU_LOAD
Table 10-36 OCN_NWDAF_GATEWAY_HIGH_CPU_LOAD
| Field | Details |
|---|---|
| Description | CPU load is high at the pod where the microservice is running. |
| Affected Functions | All |
| Cause | CPU load is more than 80% of the allocated resources. |
OCN_NWDAF_MTLF_HIGH_CPU_LOAD
Table 10-37 OCN_NWDAF_MTLF_HIGH_CPU_LOAD
| Field | Details |
|---|---|
| Description | CPU load is high at the pod where the microservice is running. |
| Affected Functions | All |
| Cause | CPU load is more than 80% of the allocated resources. |
OCN_NWDAF_SUBSCRIPTION_HIGH_CPU_LOAD
Table 10-38 OCN_NWDAF_SUBSCRIPTION_HIGH_CPU_LOAD
| Field | Details |
|---|---|
| Description | CPU load is high at the pod where the microservice is running. |
| Affected Functions | All |
| Cause | CPU load is more than 80% of the allocated resources. |
OCN_NWDAF_ANALYTICS_HIGH_JVM_HEAP_MEMORY_USAGE
Table 10-39 OCN_NWDAF_ANALYTICS_HIGH_JVM_HEAP_MEMORY_USAGE
| Field | Details |
|---|---|
| Description | The average of the memory heap usage is high. |
| Affected Functions | All |
| Cause | The heap memory usage is more than 80%. |
OCN_NWDAF_COMMUNICATION_HIGH_JVM_HEAP_MEMORY_USAGE
Table 10-40 OCN_NWDAF_COMMUNICATION_HIGH_JVM_HEAP_MEMORY_USAGE
| Field | Details |
|---|---|
| Description | The average of the memory heap usage is high. |
| Affected Functions | All |
| Cause | The heap memory usage is more than 80%. |
OCN_NWDAF_CONFIGURATION_SERVICE_HIGH_JVM_HEAP_MEMORY_USAGE
Table 10-41 OCN_NWDAF_CONFIGURATION_SERVICE_HIGH_JVM_HEAP_MEMORY_USAGE
| Field | Details |
|---|---|
| Description | The average of the memory heap usage is high. |
| Affected Functions | All |
| Cause | The heap memory usage is more than 80%. |
OCN_NWDAF_DATA_COLLECTION_HIGH_JVM_HEAP_MEMORY_USAGE
Table 10-42 OCN_NWDAF_DATA_COLLECTION_HIGH_JVM_HEAP_MEMORY_USAGE
| Field | Details |
|---|---|
| Description | The average of the memory heap usage is high. |
| Affected Functions | All |
| Cause | The heap memory usage is more than 80%. |
OCN_NWDAF_GATEWAY_HIGH_JVM_HEAP_MEMORY_USAGE
Table 10-43 OCN_NWDAF_GATEWAY_HIGH_JVM_HEAP_MEMORY_USAGE
| Field | Details |
|---|---|
| Description | The average of the memory heap usage is high. |
| Affected Functions | All |
| Cause | The heap memory usage is more than 80%. |
OCN_NWDAF_MTLF_HIGH_JVM_HEAP_MEMORY_USAGE
Table 10-44 OCN_NWDAF_MTLF_HIGH_JVM_HEAP_MEMORY_USAGE
| Field | Details |
|---|---|
| Description | The average of the memory heap usage is high. |
| Affected Functions | All |
| Cause | The heap memory usage is more than 80%. |
OCN_NWDAF_SUBSCRIPTION_HIGH_JVM_HEAP_MEMORY_USAGE
Table 10-45 OCN_NWDAF_SUBSCRIPTION_HIGH_JVM_HEAP_MEMORY_USAGE
| Field | Details |
|---|---|
| Description | The average of the memory heap usage is high. |
| Affected Functions | All |
| Cause | The heap memory usage is more than 80%. |
10.3 Application Level Alerts
This section lists the application level alerts.
OCN_NWDAF_ANALYTICS_NOT_RUNNING
Table 10-46 OCN_NWDAF_ANALYTICS_NOT_RUNNING
| Field | Details |
|---|---|
| Description | The microservice is not available or not reachable. |
| Cause | Microservice ocn-nwdaf-analytics is down. |
OCN_NWDAF_COMMUNICATION_NOT_RUNNING
Table 10-47 OCN_NWDAF_COMMUNICATION_NOT_RUNNING
| Field | Details |
|---|---|
| Description | The microservice is not available or not reachable. |
| Cause | Microservice ocn-nwdaf-communication is down. |
OCN_NWDAF_CONFIGURATION_SERVICE_NOT_RUNNING
Table 10-48 OCN_NWDAF_CONFIGURATION_SERVICE_NOT_RUNNING
| Field | Details |
|---|---|
| Description | The microservice is not available or not reachable. |
| Cause | Microservice ocn-nwdaf-configuration-service is down. |
OCN_NWDAF_DATA_COLLECTION_NOT_RUNNING
Table 10-49 OCN_NWDAF_DATA_COLLECTION_NOT_RUNNING
| Field | Details |
|---|---|
| Description | The microservice is not available or not reachable. |
| Cause | Microservice ocn-nwdaf-data-collection is down. |
OCN_NWDAF_GATEWAY_NOT_RUNNING
Table 10-50 OCN_NWDAF_GATEWAY_NOT_RUNNING
| Field | Details |
|---|---|
| Description | The microservice is not available or not reachable. |
| Cause | Microservice ocn-nwdaf-gateway is down. |
OCN_NWDAF_MTLF_NOT_RUNNING
Table 10-51 OCN_NWDAF_MTLF_NOT_RUNNING
| Field | Details |
|---|---|
| Description | The microservice is not available or not reachable. |
| Cause | Microservice ocn-nwdaf-mtlf is down. |
OCN_NWDAF_SUBSCRIPTION_NOT_RUNNING
Table 10-52 OCN_NWDAF_SUBSCRIPTION_NOT_RUNNING
| Field | Details |
|---|---|
| Description | The microservice is not available or not reachable. |
| Cause | Microservice ocn-nwdaf-subscription is down. |
HIGH_ABNORMAL_BEHAVIOUR_REQUEST_RATE
Table 10-53 HIGH_ABNORMAL_BEHAVIOUR_REQUEST_RATE
| Field | Details |
|---|---|
| Description | The number of requests received per second is high. |
| Cause | Traffic is high, above 1000 requests per second. |
| URI Endpoint | nnwdaf-analyticsinfo/v1/analytics?event-id=ABNORMAL_BEHAVIOUR |
| Affected Functions | ABNORMAL_BEHAVIOUR |
HIGH_UE_MOBILITY_REQUEST_RATE
Table 10-54 HIGH_UE_MOBILITY_REQUEST_RATE
| Field | Details |
|---|---|
| Description | The number of requests received per second is high. |
| Cause | Traffic is high, above 1000 requests per second. |
| URI Endpoint | nnwdaf-analyticsinfo/v1/analytics?event-id=UE_MOBILITY |
| Affected Functions | UE_MOBILITY |
HIGH_EVENT_SUBSCRIPTION_REQUEST_RATE
Table 10-55 HIGH_EVENT_SUBSCRIPTION_REQUEST_RATE
| Field | Details |
|---|---|
| Description | The number of requests received per second is high. |
| Cause | Traffic is high, above 1000 requests per second. |
| URI Endpoint | nnwdaf-eventssubscription/v1/subscriptions |
| Affected Functions | UE_MOBILITY, SLICE_LOAD_LEVEL, ABNORMAL_BEHAVIOUR |
HIGH_ABNORMAL_BEHAVIOUR_REQUEST_FAILURE_RATE
Table 10-56 HIGH_ABNORMAL_BEHAVIOUR_REQUEST_FAILURE_RATE
| Field | Details |
|---|---|
| Description | The number of requests failing per second is high. |
| Cause | The request failure rate is more than 70%. |
| URI Endpoint | nnwdaf-analyticsinfo/v1/analytics?event-id=ABNORMAL_BEHAVIOUR |
| Affected Functions | ABNORMAL_BEHAVIOUR |
HIGH_UE_MOBILITY_REQUEST_FAILURE_RATE
Table 10-57 HIGH_UE_MOBILITY_REQUEST_FAILURE_RATE
| Field | Details |
|---|---|
| Description | The number of requests failing per second is high. |
| Cause | The request failure rate is more than 70%. |
| URI Endpoint | nnwdaf-analyticsinfo/v1/analytics?event-id=UE_MOBILITY |
| Affected Functions | UE_MOBILITY |
HIGH_EVENT_SUBSCRIPTION_REQUEST_FAILURE_RATE
Table 10-58 HIGH_EVENT_SUBSCRIPTION_REQUEST_FAILURE_RATE
| Field | Details |
|---|---|
| Description | The number of requests failing per second is high. |
| Cause | The request failure rate is more than 70%. |
| URI Endpoint | nnwdaf-eventssubscription/v1/subscriptions |
| Affected Functions | UE_MOBILITY, SLICE_LOAD_LEVEL, ABNORMAL_BEHAVIOUR |
10.4 OCNWDAF KPIs
This section provides information about Key Performance Indicators (KPIs) used for Oracle Communications Networks Data Analytics Function (OCNWDAF).
OCNWDAF KPIs are listed below:
Table 10-59 Frontend Reports Received Total
| KPI Detail | Total number of reports received on Front End. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: total_fe_reports_recieved_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:total_fe_reports_recieved_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Source NF: AMF, SMF, NRF, OAM |
Table 10-60 Frontend Bytes Received Total
| KPI Detail | Total number of bytes received on Front End. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: fe_bytes_recieved_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:fe_bytes_recieved_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Source NF: AMF, SMF, NRF, OAM |
Table 10-61 Kafka Sourced Reports Received Total
| KPI Detail | Total number of reports received by NWDAF Front End through Kafka. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: total_kafka_sourced_reports_recieved_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:total_kafka_sourced_reports_recieved_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Source NF: OAM |
Table 10-62 Total Kafka Bytes Received
| KPI Detail | Total number of Kafka bytes received. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: total_kafka_bytes_recieved{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:total_kafka_bytes_recieved[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Source NF: AMF, SMF, NRF, OAM. Report Type: NW_PERF_OAM_REPORT, QOS_OAM_REPORT, UDC_OAM_REPORT |
Table 10-63 Nwdaf Subscriptions Created Total
| KPI Detail | Total number of subscriptions created. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: nwdaf_subscriptions_created_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:nwdaf_subscriptions_created_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Event Name: SLICE_LOAD_LEVEL, NETWORK_PERFORMANCE, NF_LOAD, ABNORMAL_BEHAVIOUR. Notification Method: DESCRIPTIVE, PREDICTIVE, THRSHOLDING. |
Table 10-64 Nwdaf Subscriptions Accepted Total
| KPI Detail | Total number of subscriptions accepted out of the subscriptions created. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: nwdaf_subscriptions_accepted_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:nwdaf_subscriptions_accepted_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Event Name: SLICE_LOAD_LEVEL, NETWORK_PERFORMANCE, NF_LOAD, ABNORMAL_BEHAVIOUR. Notification Method: DESCRIPTIVE, PREDICTIVE, THRSHOLDING. |
Table 10-65 Nwdaf Subscriptions Data Reports Sent
| KPI Detail | Total number of reports or notifications sent. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: nwdaf_subscriptions_data_reports_sent_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:nwdaf_subscriptions_data_reports_sent_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Event Name: SLICE_LOAD_LEVEL, NETWORK_PERFORMANCE, NF_LOAD, ABNORMAL_BEHAVIOUR. Notification Method: DESCRIPTIVE, PREDICTIVE, THRSHOLDING. |
Table 10-66 Nwdaf Subscriptions Threshold Reports Sent
| KPI Detail | Total number of threshold reports or notifications sent out of the total reports. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: nwdaf_subscriptions_threshold_reports_sent_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:nwdaf_subscriptions_threshold_reports_sent_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Event Name: SLICE_LOAD_LEVEL, NETWORK_PERFORMANCE, NF_LOAD, ABNORMAL_BEHAVIOUR. |
Table 10-67 Nwdaf Subscriptions Prediction Reports Sent
| KPI Detail | Total number of predictive reports or notifications sent out of the total reports. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: nwdaf_subscriptions_prediction_reports_sent_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:nwdaf_subscriptions_prediction_reports_sent_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Event Name: SLICE_LOAD_LEVEL, NETWORK_PERFORMANCE, NF_LOAD, ABNORMAL_BEHAVIOUR. |
Table 10-68 Analyticsinfo Request Received Total
| KPI Detail | Total number of analytics information requests received. |
|---|---|
| Metric Used for the KPI (CNE) | PromQL: analyticsinfo_request_received_total{namespace="$NAMESPACE"} |
| Metric Used for the KPI (OCI) | MQL:analyticsinfo_request_received_total[5m]{k8Namespace="$NAMESPACE"}.count() |
| Service Operation | NA |
| Response Code | NA |
| Tags and Values | Event Name: SLICE_LOAD_LEVEL, NETWORK_PERFORMANCE, NF_LOAD, ABNORMAL_BEHAVIOUR. |
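The PromQL expressions in the KPI tables contain the placeholder $NAMESPACE. As a minimal sketch, assuming the placeholder stands for the Kubernetes namespace of the deployment and is substituted before the query is sent, the following Python code fills it in for one of the KPI metrics. The namespace value ocn-nwdaf is taken from the kubectl commands earlier in this chapter and may differ in a given deployment.

```python
from string import Template

# PromQL expression copied from Table 10-63, with its $NAMESPACE placeholder.
KPI_QUERY = Template('nwdaf_subscriptions_created_total{namespace="$NAMESPACE"}')


def query_for(namespace: str) -> str:
    """Return the KPI query with the namespace placeholder filled in."""
    return KPI_QUERY.substitute(NAMESPACE=namespace)


print(query_for("ocn-nwdaf"))
```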