8 OCNADD KPIs

This section provides information about Key Performance Indicators (KPIs) used for Oracle Communications Network Analytics Data Director (OCNADD).

Important:

Update the "namespace" value in each KPI expression to match the namespace used in the OCNADD deployment. Set EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY in the latency KPIs to the configured group message count.
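As an illustration of the substitution described above, the following minimal sketch fills in both placeholders for one of the latency KPI expressions from this section. The namespace "ocnadd-deploy" and the group message count 10 are hypothetical values chosen for the example, not defaults of the product.

```python
from string import Template

# KPI expression taken from the e2e average latency KPI in this section.
# string.Template uses the same $NAME placeholder style as the KPI text.
KPI_TEMPLATE = Template(
    '(sum (irate(ocnadd_egressgateway_e2e_request_processing_latency_seconds_sum'
    '{namespace="$NAMESPACE"}[5m])) by (instance_identifier) / '
    '(sum (irate(ocnadd_egressgateway_e2e_request_processing_latency_seconds_count'
    '{namespace="$NAMESPACE"}[5m])) by (instance_identifier) '
    '* $EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY))'
)

# Hypothetical deployment values; replace them with the real ones.
rendered = KPI_TEMPLATE.substitute(
    NAMESPACE="ocnadd-deploy",
    EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY=10,
)
print(rendered)
```

substitute() raises KeyError if a placeholder is left without a value, which helps catch an expression that was only partially updated.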

The following KPIs are added in OCNADD 22.0.0.

Table 8-1 ocnadd_ingress_record_count_by_service

KPI Detail Measures the total ingress records in Kafka source topics per aggregation service at the current time
Metric Used for the KPI sum by (service)(kafka_stream_processor_node_process_total{namespace="$NAMESPACE"})
Service Operation NA
Response Code NA

Table 8-2 ocnadd_ingress_record_count_total

KPI Detail Measures the total ingress records in Kafka source topics at the current time
Metric Used for the KPI sum (kafka_stream_processor_node_process_total{namespace="$NAMESPACE"})
Service Operation NA
Response Code NA

Table 8-3 ocnadd_ingress_mps_per_service_5mAgg

KPI Detail Measures the total ingress MPS per service aggregated over 5min
Metric Used for the KPI sum by (service)(rate(kafka_stream_processor_node_process_total{namespace="$NAMESPACE"}[5m]))
Service Operation NA
Response Code NA

Table 8-4 ocnadd_total_ingress_mps_5mAgg

KPI Detail Measures the total ingress MPS aggregated over 5min
Metric Used for the KPI sum(rate(kafka_stream_processor_node_process_total{namespace="$NAMESPACE"}[5m]))
Service Operation NA
Response Code NA
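The MPS KPIs above apply rate() to a counter over a 5 minute window. The following simplified sketch shows the idea of a per-second rate computed from counter samples, including handling of a counter reset. It is an approximation for illustration only: the function name and sample values are hypothetical, and the extrapolation to the window boundaries that Prometheus performs is deliberately left out.

```python
def simple_rate(samples):
    """Approximate per-second rate of a counter.

    samples: list of (timestamp_seconds, counter_value), ordered by time.
    Returns None when fewer than two samples are available.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # A drop in a counter is treated as a reset to zero.
            increase += current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Hypothetical samples scraped every 60 seconds.
window = [(0, 100), (60, 400), (120, 700), (180, 1000), (240, 1300)]
print(simple_rate(window))
```

With these samples the counter grows by 1200 records in 240 seconds, which gives 5 messages per second.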

Table 8-5 ocnadd_ingress_mps_per_service_5mAgg_last_24h

KPI Detail Measures the total ingress MPS per service aggregated over 5min for last 24 hours
Metric Used for the KPI sum by (service)(rate(kafka_stream_processor_node_process_total{namespace="$NAMESPACE"}[5m]))[24h:5m]
Service Operation NA
Response Code NA

Table 8-6 ocnadd_ingress_record_count_per_service_5mAgg_last_24h

KPI Detail Measures the total ingress messages per service aggregated over 5min for last 24 hours
Metric Used for the KPI sum by (service)(increase(kafka_stream_processor_node_process_total{namespace="$NAMESPACE"}[5m]))[24h:5m]
Service Operation NA
Response Code NA

Table 8-7 ocnadd_kafka_ingress_record_drop_rate_5minAgg

KPI Detail Measures the total ingress message drop rate aggregated over 5min
Metric Used for the KPI sum(rate(kafka_stream_task_dropped_records_total{namespace="$NAMESPACE"}[5m]))
Service Operation NA
Response Code NA

Table 8-8 ocnadd_kafka_ingress_record_drop_rate_per_service_5minAgg

KPI Detail Measures the total ingress message drop rate per service and pod aggregated over 5min
Metric Used for the KPI sum(rate(kafka_stream_task_dropped_records_total{namespace="$NAMESPACE"}[5m])) by (service,pod)
Service Operation NA
Response Code NA

Table 8-9 ocnadd_egw_request_count_total_by_3rdparty_destination_endpoint

KPI Detail Total egress requests per 3rd party application per destination endpoint
Metric Used for the KPI sum by (instance_identifier,destination_endpoint)(ocnadd_egressgateway_http_requests_total{namespace="$NAMESPACE"})
Service Operation POST
Response Code NA

Table 8-10 ocnadd_egw_response_count_total_by_3rdparty_destination_endpoint

KPI Detail Total egress responses per 3rd party application per destination endpoint
Metric Used for the KPI sum by (instance_identifier,destination_endpoint)(ocnadd_egressgateway_http_responses_total{namespace="$NAMESPACE"})
Service Operation POST
Response Code NA

Table 8-11 ocnadd_egw_failure_count_total_by_3rdparty_destination_endpoint

KPI Detail Total egress failure count per 3rd party application per destination endpoint
Metric Used for the KPI sum by (destination_endpoint,instance_identifier)(ocnadd_egressgateway_connection_failure_total{namespace="$NAMESPACE"})
Service Operation POST
Response Code NA

Table 8-12 ocnadd_egw_request_rate_by_3rdparty_5mAgg

KPI Detail Total egress request rate per 3rd party application in 5min Aggregation
Metric Used for the KPI sum by (instance_identifier)(rate(ocnadd_egressgateway_http_requests_total{namespace="$NAMESPACE"}[5m]))
Service Operation POST
Response Code NA

Table 8-13 ocnadd_egw_failure_rate_by_3rdparty_5mAgg

KPI Detail Total egress failure rate per 3rd party application in 5min Aggregation
Metric Used for the KPI sum by (instance_identifier)(rate(ocnadd_egressgateway_connection_failure_total{namespace="$NAMESPACE"}[5m])) / sum by (instance_identifier)(rate(ocnadd_egressgateway_http_requests_total{namespace="$NAMESPACE"}[5m]))
Service Operation POST
Response Code NA
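The failure rate KPI divides one aggregated vector by another, so a result exists only for label values present on both sides. The sketch below mimics that matching for the instance_identifier label. The function name and the application names are hypothetical. Division by zero follows floating-point behavior, as in PromQL: 0/0 yields NaN and a nonzero value divided by 0 yields +Inf.

```python
import math


def failure_ratio(failure_rates, request_rates):
    """Divide per-instance failure rates by per-instance request rates.

    Both arguments map instance_identifier -> rate (events per second).
    Instances missing from either side produce no result.
    """
    ratios = {}
    for instance, failures in failure_rates.items():
        if instance not in request_rates:
            continue  # no matching series on the right-hand side
        requests = request_rates[instance]
        if requests == 0:
            ratios[instance] = math.nan if failures == 0 else math.inf
        else:
            ratios[instance] = failures / requests
    return ratios


# Hypothetical per-application rates.
failures = {"app-a": 2.0, "app-b": 0.0, "app-c": 1.0}
requests = {"app-a": 40.0, "app-b": 0.0}
result = failure_ratio(failures, requests)
print(result)
```

In this example app-c has failures but no request series, so it does not appear in the result. A dashboard panel may show such an application as having no data.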

Table 8-14 ocnadd_egw_failure_rate_by_3rdparty_per_destination_endpoint_5mAgg

KPI Detail Total egress failure rate per 3rd party application per destination endpoint in 5min Aggregation
Metric Used for the KPI sum by (instance_identifier, destination_endpoint)(rate(ocnadd_egressgateway_connection_failure_total{namespace="$NAMESPACE"}[5m])) / sum by (instance_identifier, destination_endpoint)(rate(ocnadd_egressgateway_http_requests_total{namespace="$NAMESPACE"}[5m]))
Service Operation POST
Response Code NA

Table 8-15 ocnadd_e2e_avg_record_latency_by_3rdparty

KPI Detail Total e2e average latency per 3rd party application in 5min Aggregation
Metric Used for the KPI (sum(irate(ocnadd_egressgateway_e2e_request_processing_latency_seconds_sum{namespace="$NAMESPACE"}[5m])) by (instance_identifier) / (sum(irate(ocnadd_egressgateway_e2e_request_processing_latency_seconds_count{namespace="$NAMESPACE"}[5m])) by (instance_identifier) * $EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY))
Service Operation POST
Response Code NA
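The latency KPI divides the rate of the latency sum by the rate of the request count multiplied by the group message count. The sketch below restates that arithmetic, under the assumption that each egress request carries EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY records, which is what the shape of the expression suggests. The function name and input values are hypothetical.

```python
def avg_record_latency(latency_sum_rate, request_count_rate, group_message_count):
    """Average latency per record, in seconds.

    latency_sum_rate: rate of the *_latency_seconds_sum series (seconds per second)
    request_count_rate: rate of the *_latency_seconds_count series (requests per second)
    group_message_count: records per egress request (assumed meaning of
        EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY)
    """
    records_per_second = request_count_rate * group_message_count
    if records_per_second == 0:
        return None  # no traffic in the window, so no average to report
    return latency_sum_rate / records_per_second


# Hypothetical rates: 0.5 s of accumulated latency per second,
# 10 requests per second, 5 records per request.
latency = avg_record_latency(0.5, 10.0, 5)
print(latency)
```

With these inputs the result is 0.01 seconds per record. If the group message count does not match the configured value, the KPI is scaled by the same factor.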

Table 8-16 ocnadd_e2e_avg_record_latency_by_3rdparty_per_egw_pod

KPI Detail Total e2e average latency per 3rd party application per EGW POD in 5min Aggregation
Metric Used for the KPI (sum(irate(ocnadd_egressgateway_e2e_request_processing_latency_seconds_sum{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) / (sum(irate(ocnadd_egressgateway_e2e_request_processing_latency_seconds_count{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) * $EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY))
Service Operation POST
Response Code NA

Table 8-17 ocnadd_egw_service_processing_avg_record_latency_by_3rdparty_per_egw_pod

KPI Detail Total service processing average latency per 3rd party application per EGW POD in 5min Aggregation
Metric Used for the KPI (sum(irate(ocnadd_egressgateway_service_request_processing_latency_seconds_sum{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) / (sum(irate(ocnadd_egressgateway_service_request_processing_latency_seconds_count{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) * $EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY))
Service Operation POST
Response Code NA

Table 8-18 ocnadd_egw_request_processing_avg_record_latency_by_3rdparty_per_egw_pod

KPI Detail Total request processing average latency per 3rd party application per EGW POD in 5min Aggregation. This includes the network latency added by the response from the 3rd party application.
Metric Used for the KPI (sum(irate(ocnadd_egressgateway_request_latency_seconds_sum{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) / (sum(irate(ocnadd_egressgateway_request_latency_seconds_count{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) * $EGW_GROUP_MESSAGE_COUNT_FOR_LATENCY))
Service Operation POST
Response Code NA

Table 8-19 ocnadd_egw_e2e_avg_record_latency_95percentile_by_3rdparty_per_egw_pod

KPI Detail The 95th percentile of e2e latency in milliseconds for the egress gateway, calculated over a period of 5min
Metric Used for the KPI histogram_quantile(0.95, sum(rate(ocnadd_egressgateway_e2e_request_processing_latency_seconds_bucket{namespace="$namespaces"}[5m])) by (le))
Service Operation POST
Response Code NA
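The percentile KPI relies on histogram_quantile(), which estimates a quantile from cumulative bucket counts by linear interpolation inside the bucket that contains the requested rank. The sketch below is a simplified reimplementation for illustration, assuming 0 < q < 1 and non-negative bucket bounds. The bucket boundaries and counts are hypothetical and do not reflect the actual OCNADD histogram configuration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by upper_bound,
    with the last entry being (math.inf, total_count).
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    for index, (upper, cumulative) in enumerate(buckets):
        if cumulative >= rank:
            break
    if math.isinf(upper):
        # The rank falls in the +Inf bucket: report the highest finite bound.
        return buckets[-2][0]
    lower = buckets[index - 1][0] if index > 0 else 0.0
    below = buckets[index - 1][1] if index > 0 else 0.0
    # Linear interpolation within the selected bucket.
    return lower + (upper - lower) * (rank - below) / (cumulative - below)


# Hypothetical buckets in seconds: le=0.1, 0.5, 1.0, +Inf.
example = [(0.1, 50), (0.5, 90), (1.0, 98), (math.inf, 100)]
p95 = histogram_quantile(0.95, example)
print(p95)
```

Here the rank 95 falls in the bucket between 0.5 and 1.0, and interpolation gives about 0.8125. The estimate is expressed in the unit of the bucket boundaries, so its precision depends on how finely the buckets are spaced.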

Table 8-20 Memory Usage per POD

KPI Detail Measures the memory usage per POD
Metric Used for the KPI sum(container_memory_working_set_bytes{namespace=~"$Namespace",image!=""}/(1024*1024*1024)) by (pod)
Service Operation NA
Response Code NA

Table 8-21 CPU Usage per POD

KPI Detail Measures the CPU usage per POD
Metric Used for the KPI sum(rate(container_cpu_usage_seconds_total{namespace=~"$Namespace",image!=""}[2m])) by (pod) * 1000
Service Operation NA
Response Code NA
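The memory expression divides the working set bytes by 1024*1024*1024, so the result is in GiB, and the CPU expression multiplies the per-second rate of CPU seconds (that is, cores) by 1000, so the result is in millicores. The following sketch restates both conversions. The function names and sample values are hypothetical.

```python
def bytes_to_gib(value_bytes):
    """Convert bytes to GiB, as done in the memory usage expression."""
    return value_bytes / (1024 * 1024 * 1024)


def cores_to_millicores(cores):
    """Convert CPU cores to millicores, as done in the CPU usage expression."""
    return cores * 1000


# Hypothetical pod: 3 GiB working set, a quarter of one core in use.
print(bytes_to_gib(3 * 1024 ** 3))
print(cores_to_millicores(0.25))
```

These units line up with the way Kubernetes resource requests and limits are commonly written, such as 3Gi of memory or 250m of CPU.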

Table 8-22 Service Status

KPI Detail Provides the status of each Data Director service running in the specified namespace
Metric Used for the KPI up{namespace="$NAMESPACE"}
Service Operation NA
Response Code NA