8 OCNADD KPIs
Important:
The "namespace" value in each KPI expression must be updated to match the namespace used in the OCNADD deployment. The following KPIs are added in OCNADD 23.3.0.
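
The namespace substitution described in the note above can be scripted. The following is a minimal Python sketch, not part of OCNADD: the helper name `with_namespace` and the namespace `ocnadd-deploy` are illustrative assumptions. It treats `$NAMESPACE`, `$Namespace`, and `$namespaces` as the same placeholder, since all three spellings appear in the expressions in this chapter.

```python
import re

# Matches the placeholder spellings used in this chapter:
# $NAMESPACE, $Namespace, and $namespaces (case-insensitive).
_PLACEHOLDER = re.compile(r"\$namespaces?\b", re.IGNORECASE)


def with_namespace(query: str, namespace: str) -> str:
    """Return the KPI expression with the namespace placeholder filled in."""
    # A function is used as the replacement so the namespace is inserted
    # literally, without backslash or group-reference interpretation.
    return _PLACEHOLDER.sub(lambda _match: namespace, query)


# "ocnadd-deploy" is a hypothetical namespace used only for illustration.
kpi = (
    'sum(rate(kafka_stream_processor_node_process_total'
    '{namespace="$NAMESPACE",service=~".*aggregation.*"}[5m]))'
)
print(with_namespace(kpi, "ocnadd-deploy"))
```
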
Table 8-1 ocnadd_ingress_record_count_by_service
| KPI Detail | Measures the total ingress records in Kafka source topics per aggregation service at the current time |
|---|---|
| Metric Used for the KPI | sum by (service)(kafka_stream_processor_node_process_total{namespace="$NAMESPACE", service=~".*aggregation.*"}) |
| Service Operation | NA |
| Response Code | NA |
Table 8-2 ocnadd_ingress_record_count_total
| KPI Detail | Measures the total ingress records in Kafka source topics at the current time |
|---|---|
| Metric Used for the KPI | sum (kafka_stream_processor_node_process_total{namespace="$NAMESPACE", service=~".*aggregation.*"}) |
| Service Operation | NA |
| Response Code | NA |
Table 8-3 ocnadd_ingress_mps_per_service_5mAgg
| KPI Detail | Measures the ingress MPS per service aggregated over 5min |
|---|---|
| Metric Used for the KPI | sum by (service)(rate(kafka_stream_processor_node_process_total{namespace="$NAMESPACE",service=~".*aggregation.*"}[5m])) |
| Service Operation | NA |
| Response Code | NA |
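
As a rough illustration of what the `rate(...[5m])` term in the expression above yields, the Python sketch below computes a per-second rate from counter samples inside one window. It is a simplification under stated assumptions: Prometheus additionally extrapolates to the window boundaries, which this sketch omits, and the sample values are made up.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Approximate per-second rate of a counter over the given samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop in value is treated as a counter reset (for example, after
        # a pod restart): the counter restarted from zero, so the whole
        # current value counts as increase.
        increase += (curr - prev) if curr >= prev else curr
    return increase / elapsed


# Made-up samples: 3000 records processed across a 300-second window.
window = [(0.0, 1000.0), (150.0, 2500.0), (300.0, 4000.0)]
print(simple_rate(window))  # 10.0 messages per second
```
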
Table 8-4 ocnadd_ingress_mps_5mAgg
| KPI Detail | Measures the ingress MPS aggregated over 5min |
|---|---|
| Metric Used for the KPI | sum(rate(kafka_stream_processor_node_process_total{namespace="$NAMESPACE",service=~".*aggregation.*"}[5m])) |
| Service Operation | NA |
| Response Code | NA |
Table 8-5 ocnadd_ingress_mps_per_service_5mAgg_last_24h
| KPI Detail | Measures the ingress MPS per service aggregated over 5min for last 24 hours |
|---|---|
| Metric Used for the KPI | sum by (service)(rate(kafka_stream_processor_node_process_total{namespace="$NAMESPACE",service=~".*aggregation.*"}[5m]))[24h:5m] |
| Service Operation | NA |
| Response Code | NA |
Table 8-6 ocnadd_ingress_record_count_per_service_5mAgg_last_24h
| KPI Detail | Measures the ingress messages per service aggregated over 5min for last 24 hours |
|---|---|
| Metric Used for the KPI | sum by (service)(increase(kafka_stream_processor_node_process_total{namespace="$NAMESPACE",service=~".*aggregation.*"}[5m]))[24h:5m] |
| Service Operation | NA |
| Response Code | NA |
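
The MPS and record-count KPIs above differ only in `rate` versus `increase`. In Prometheus, `increase` over a window is the per-second `rate` multiplied by the window length in seconds. A minimal Python sketch of that relationship, with a hypothetical helper name and made-up numbers:

```python
def records_in_window(mps: float, window_seconds: float = 300.0) -> float:
    """Convert a per-second rate into a record count for one window.

    The default of 300 seconds corresponds to the 5m window used in the
    KPI expressions.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return mps * window_seconds


# A service that averaged 10 messages per second over a 5-minute window.
print(records_in_window(10.0))  # 3000.0 records
```
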
Table 8-7 ocnadd_kafka_ingress_record_drop_rate_5minAgg
| KPI Detail | Measures the total ingress message drop rate aggregated over 5min |
|---|---|
| Metric Used for the KPI | sum(rate(kafka_stream_task_dropped_records_total{namespace="$NAMESPACE",service=~".*aggregation.*"}[5m])) |
| Service Operation | NA |
| Response Code | NA |
Table 8-8 ocnadd_kafka_ingress_record_drop_rate_per_service_5minAgg
| KPI Detail | Measures the total ingress message drop rate per service aggregated over 5min |
|---|---|
| Metric Used for the KPI | sum(rate(kafka_stream_task_dropped_records_total{namespace="$NAMESPACE",service=~".*aggregation.*"}[5m])) by (service,pod) |
| Service Operation | NA |
| Response Code | NA |
Table 8-9 ocnadd_egress_request_count_total_by_3rdparty_destination_endpoint
| KPI Detail | Total egress requests per 3rd party application per destination endpoint |
|---|---|
| Metric Used for the KPI | sum by (instance_identifier,destination_endpoint)(ocnadd_egress_requests_total{namespace="$NAMESPACE"}) |
| Service Operation | POST |
| Response Code | NA |
Table 8-10 ocnadd_egress_response_count_total_by_3rdparty_destination_endpoint
| KPI Detail | Total egress responses per 3rd party application per destination endpoint |
|---|---|
| Metric Used for the KPI | sum by (instance_identifier,destination_endpoint)(ocnadd_egress_responses_total{namespace="$NAMESPACE"}) |
| Service Operation | POST |
| Response Code | NA |
Table 8-11 ocnadd_egress_failure_count_total_by_3rdparty_destination_endpoint
| KPI Detail | Total egress failure count per 3rd party application per destination endpoint |
|---|---|
| Metric Used for the KPI | sum by (destination_endpoint,instance_identifier)(ocnadd_egress_failed_request_total{namespace="$NAMESPACE"}) |
| Service Operation | POST |
| Response Code | NA |
Table 8-12 ocnadd_egress_request_rate_by_3rdparty_5mAgg
| KPI Detail | Total egress request rate per 3rd party application in 5min Aggregation |
|---|---|
| Metric Used for the KPI | sum by (instance_identifier)(rate(ocnadd_egress_requests_total{namespace="$NAMESPACE"}[5m])) |
| Service Operation | POST |
| Response Code | NA |
Table 8-13 ocnadd_egress_failure_rate_by_3rdparty_5mAgg
| KPI Detail | Total egress failure rate per 3rd party application in 5min Aggregation |
|---|---|
| Metric Used for the KPI | sum by (instance_identifier)(irate(ocnadd_egress_failed_request_total{namespace="$NAMESPACE"}[5m])) / sum by (instance_identifier)(irate(ocnadd_egress_requests_total{namespace="$NAMESPACE"}[5m])) |
| Service Operation | POST |
| Response Code | NA |
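
The expression above divides two per-application rates, so the result is a ratio (0 to 1 when failures are a subset of requests) rather than a count per second. The Python sketch below mirrors that division. The rates and the application names are made up, and the helper name is hypothetical.

```python
import math
from typing import Dict


def failure_ratio(failed: Dict[str, float], requests: Dict[str, float]) -> Dict[str, float]:
    """Divide failed-request rates by request rates per instance_identifier."""
    ratios: Dict[str, float] = {}
    # Only identifiers present on both sides produce a result, as with
    # one-to-one vector matching in PromQL.
    for ident in failed.keys() & requests.keys():
        num, den = failed[ident], requests[ident]
        if den == 0:
            # Floating-point division semantics: 0/0 is NaN, x/0 is +Inf.
            ratios[ident] = math.nan if num == 0 else math.inf
        else:
            ratios[ident] = num / den
    return ratios


# Hypothetical per-second rates keyed by instance_identifier.
failed_rates = {"app-a": 2.0, "app-b": 0.0}
request_rates = {"app-a": 100.0, "app-b": 50.0, "app-c": 10.0}
for ident, ratio in sorted(failure_ratio(failed_rates, request_rates).items()):
    print(f"{ident}: {ratio:.2%}")
```

In this sketch `app-c` yields no result because it has no entry on the failure side.
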
Table 8-14 ocnadd_egress_failure_rate_by_3rdparty_per_destination_endpoint_5mAgg
| KPI Detail | Total egress failure rate per 3rd party application per destination endpoint in 5min Aggregation |
|---|---|
| Metric Used for the KPI | sum by (instance_identifier, destination_endpoint)(irate(ocnadd_egress_failed_request_total{namespace="$NAMESPACE"}[5m])) / sum by (instance_identifier, destination_endpoint) (irate(ocnadd_egress_requests_total{namespace="$NAMESPACE"}[5m])) |
| Service Operation | POST |
| Response Code | NA |
Table 8-15 ocnadd_e2e_avg_record_latency_by_3rdparty
| KPI Detail | Total e2e average latency per 3rd party application in 5min Aggregation |
|---|---|
| Metric Used for the KPI | (sum (irate(ocnadd_egress_e2e_request_processing_latency_seconds_sum{namespace="$NAMESPACE"}[5m])) by (instance_identifier) / (sum (irate(ocnadd_egress_e2e_request_processing_latency_seconds_count{namespace="$NAMESPACE"}[5m])) by (instance_identifier))) |
| Service Operation | POST |
| Response Code | NA |
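
The average-latency expression above divides the growth of the `_sum` series by the growth of the `_count` series. Because both are measured over the same interval, the elapsed time cancels out and the result is the mean latency per record. A minimal Python sketch with made-up sample values and a hypothetical helper name:

```python
from typing import Optional


def average_latency(sum_start: float, sum_end: float,
                    count_start: float, count_end: float) -> Optional[float]:
    """Mean latency per record between two scrapes of _sum and _count."""
    records = count_end - count_start
    if records <= 0:
        # No records were observed in the interval, so there is no average.
        return None
    return (sum_end - sum_start) / records


# Made-up values: 15 units of total latency accumulated across 3000 records.
print(average_latency(sum_start=120.0, sum_end=135.0,
                      count_start=4000.0, count_end=7000.0))
```

The result is in the same unit that the underlying metric reports.
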
Table 8-16 ocnadd_e2e_avg_record_latency_by_3rdparty_per_adapter_pod
| KPI Detail | Total e2e average latency per 3rd party application per egress adapter POD in 5min Aggregation |
|---|---|
| Metric Used for the KPI | (sum (irate(ocnadd_egress_e2e_request_processing_latency_seconds_sum{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) / (sum (irate(ocnadd_egress_e2e_request_processing_latency_seconds_count{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod))) |
| Service Operation | POST |
| Response Code | NA |
Table 8-17 ocnadd_egress_adapter_processing_avg_record_latency_by_3rdparty_per_adapter_pod
| KPI Detail | Total service processing average latency per 3rd party application per adapter POD in 5min Aggregation |
|---|---|
| Metric Used for the KPI | (sum (irate(ocnadd_egress_service_request_processing_latency_seconds_sum{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) / (sum (irate(ocnadd_egress_service_request_processing_latency_seconds_count{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod))) |
| Service Operation | POST |
| Response Code | NA |
Table 8-18 ocnadd_egress_adapter_request_processing_avg_record_latency_by_3rdparty_per_adapter_pod
| KPI Detail | Total request processing average latency per 3rd party application per adapter POD in 5min Aggregation. This includes the network latency added by the response from the 3rd party application |
|---|---|
| Metric Used for the KPI | (sum (irate(ocnadd_egress_request_latency_seconds_sum{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod) / (sum (irate(ocnadd_egress_request_latency_seconds_count{namespace="$NAMESPACE"}[5m])) by (instance_identifier,pod))) |
| Service Operation | POST |
| Response Code | NA |
Table 8-19 ocnadd_egress_e2e_avg_record_latency_95percentile_by_3rdparty_per_egress_pod
| KPI Detail | The 95th percentile of e2e latency in milliseconds for the egress adapter, calculated over a period of 5min |
|---|---|
| Metric Used for the KPI | histogram_quantile(0.95, sum(rate(ocnadd_egress_e2e_request_processing_latency_seconds_bucket{namespace="$namespaces"}[5m])) by (le)) |
| Service Operation | POST |
| Response Code | NA |
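
To show how `histogram_quantile` arrives at a value from cumulative `le` buckets, the Python sketch below applies linear interpolation inside the bucket that contains the requested rank. It is a simplified illustration, not the Prometheus implementation: it assumes positive bucket bounds (reasonable for latency) and uses made-up bucket counts.

```python
import math
from typing import List, Tuple

Bucket = Tuple[float, float]  # (upper bound "le", cumulative count)


def bucket_quantile(q: float, buckets: List[Bucket]) -> float:
    """Estimate the q-quantile from cumulative histogram buckets."""
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(buckets)
    if not ordered or not math.isinf(ordered[-1][0]):
        raise ValueError("the +Inf bucket is required")
    total = ordered[-1][1]
    if total == 0:
        return math.nan  # no observations in the window
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(ordered) if count >= rank)
    upper, count = ordered[index]
    if math.isinf(upper):
        # The rank falls beyond the last finite bound, so the best
        # available estimate is that highest finite bound.
        return ordered[-2][0] if len(ordered) > 1 else math.nan
    lower = ordered[index - 1][0] if index > 0 else 0.0
    below = ordered[index - 1][1] if index > 0 else 0.0
    in_bucket = count - below
    if in_bucket == 0:
        return lower
    # Assume observations are spread evenly inside the bucket.
    return lower + (upper - lower) * (rank - below) / in_bucket


# Made-up cumulative counts: 100 observations in total.
sample = [(0.1, 50.0), (0.5, 90.0), (1.0, 100.0), (math.inf, 100.0)]
print(bucket_quantile(0.95, sample))  # prints 0.75
```

The estimate can only be as precise as the bucket boundaries allow, which is why the result lands between 0.5 and 1.0 here.
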
Table 8-20 Memory Usage per POD
| KPI Detail | Measures the memory usage per POD |
|---|---|
| Metric Used for the KPI | sum(container_memory_working_set_bytes{namespace=~"$Namespace",image!=""}/(1024*1024*1024)) by (pod) |
| Service Operation | NA |
| Response Code | NA |
Table 8-21 CPU Usage per POD
| KPI Detail | Measures the CPU usage per POD |
|---|---|
| Metric Used for the KPI | sum(rate(container_cpu_usage_seconds_total{namespace=~"$Namespace",image!=""}[2m])) by (pod) * 1000 |
| Service Operation | NA |
| Response Code | NA |
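
The memory and CPU expressions above apply unit conversions: the memory expression divides bytes by 1024*1024*1024, giving GiB, and the CPU expression multiplies CPU-seconds per second (that is, cores in use) by 1000, giving millicores. A small Python sketch of the same arithmetic, with hypothetical helper names and made-up inputs:

```python
def bytes_to_gib(num_bytes: float) -> float:
    """Convert bytes to GiB, matching the /(1024*1024*1024) in the KPI."""
    return num_bytes / (1024 * 1024 * 1024)


def cores_to_millicores(cores: float) -> float:
    """Convert CPU cores to millicores, matching the * 1000 in the KPI."""
    return cores * 1000


print(bytes_to_gib(3221225472))   # 3.0 GiB
print(cores_to_millicores(0.25))  # 250.0 millicores
```
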
Table 8-22 Service Status
| KPI Detail | Provides the status of each Data Director service running in the specified namespace |
|---|---|
| Metric Used for the KPI | up{namespace="$NAMESPACE"} |
| Service Operation | NA |
| Response Code | NA |
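
In Prometheus, `up` is 1 when the last scrape of a target succeeded and 0 when it failed. The Python sketch below lists the targets that are down from an instant-query response in the Prometheus HTTP API JSON format. The response body and pod names are made up, and the use of a `pod` label is an assumption that depends on the scrape configuration.

```python
import json
from typing import List

# Made-up response body in the Prometheus instant-query result format.
SAMPLE_RESPONSE = """
{"status": "success",
 "data": {"resultType": "vector",
          "result": [
            {"metric": {"__name__": "up", "pod": "pod-a"},
             "value": [1700000000, "1"]},
            {"metric": {"__name__": "up", "pod": "pod-b"},
             "value": [1700000000, "0"]}
          ]}}
"""


def down_targets(response_text: str, label: str = "pod") -> List[str]:
    """Return the label values of all series whose `up` value is 0."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    down = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # value arrives as a string
        if float(value) == 0:
            down.append(series["metric"].get(label, "<unknown>"))
    return down


print(down_targets(SAMPLE_RESPONSE))  # ['pod-b']
```
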
Table 8-23 ocnadd_ext_kafka_feed_record_total per external feed rate(MPS)
| KPI Detail | The rate of messages consumed per second per external Kafka consumer, calculated over a period of 5min |
|---|---|
| Metric Used for the KPI | sum(irate(ocnadd_ext_kafka_feed_record_total{namespace="$Namespace"}[5m])) by (feed_name) |
| Service Operation | NA |
| Response Code | NA |