4 Resource Utilization
CNE constrains the resources, such as CPU and RAM, that each common service can use. These constraints help ensure that no service consumes excess resources.
During initial CNE deployment, each service is given an initial CPU and RAM allocation. While it runs, each service can consume each resource (CPU and RAM) up to a specified upper limit.
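The relationship between an initial request and an upper limit can be sketched as follows. This is a minimal illustration, assuming the values map onto the standard Kubernetes container `resources` fields (`requests` and `limits`); the helper name `resources_stanza` is hypothetical and not part of CNE. The sample values are taken from the Prometheus Operator row of Table 4-1.

```python
def resources_stanza(cpu_request_m, cpu_limit_m, ram_request_mi, ram_limit_mi):
    """Build a Kubernetes-style resources mapping from millicore and Mi values.

    Hypothetical helper for illustration only; CNE may configure these
    values through a different mechanism.
    """
    # A request larger than its limit is not a valid combination.
    if cpu_request_m > cpu_limit_m or ram_request_mi > ram_limit_mi:
        raise ValueError("a request must not exceed its limit")
    return {
        "requests": {"cpu": f"{cpu_request_m}m", "memory": f"{ram_request_mi}Mi"},
        "limits": {"cpu": f"{cpu_limit_m}m", "memory": f"{ram_limit_mi}Mi"},
    }


# Prometheus Operator row of Table 4-1: 100m/200m CPU, 100Mi/200Mi RAM.
prometheus_operator = resources_stanza(100, 200, 100, 200)
print(prometheus_operator)
```

The request is the amount reserved for the service when it starts, and the limit is the ceiling it can grow to while running.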
Note:
Observability tools such as Prometheus, OpenSearch, Fluentd, and Jaeger perform resource-intensive operations. Even a small change in event volume can increase resource consumption sharply. For example:
- Changing the log severity level from WARN to INFO increases CPU and memory usage dramatically.
- Adding new metric labels or enabling new monitors has a big impact on metric ingestion performance and can cause resource starvation.
- Adding more traces to Jaeger spikes the ingestion rate in OpenSearch and causes the data nodes to starve.
Table 4-1 CPU and RAM Resource Requests and Limits
Service | CPU Initial Request (m) | CPU Limit (m) | RAM Initial Request (Mi) | RAM Limit (Mi) | Instances |
---|---|---|---|---|---|
Prometheus | 2000 | 4000 | 16384 | 16384 | 2 |
Prometheus Node Exporter | 800 | 800 | 512 | 512 | 1 per node |
Prometheus Operator | 100 | 200 | 100 | 200 | 1 |
Prometheus AlertManager | 20 | 20 | 64 | 64 | 2 |
Prometheus Kube State Metrics | 20 | 20 | 32 | 100 | 1 |
Promxy | 100 | 100 | 512 | 512 | 1 |
OpenSearch Master | 1000 | 1000 | 100 | 2048 | 3 |
OpenSearch Data | 1000 | 4000 | 16384 | 101904 (100Gi) | 7 |
OpenSearch Client | 1000 | 3000 | 100 | 32609 (32Gi) | 3 |
OpenSearch Dashboard | 100 | 100 | 512 | 512 | 1 |
occne-metrics-server | 100 | 100 | 200 | 200 | 1 |
occne-alertmanager-snmp-notifier | 100 | 100 | 128 | 128 | 1 |
Fluentd OpenSearch | 100 | 500 | 128 | 12228 (12Gi) | 1 per worker node |
Jaeger Collector | 500 | 1250 | 512 | 1024 | 1 |
Jaeger Query | 256 | 500 | 128 | 512 | 1 |
MetalLB Controller (for CNLB disabled CNE only) | 100 | 100 | 100 | 100 | 1 |
MetalLB Speaker (for CNLB disabled CNE only) | 100 | 100 | 100 | 100 | 1 per worker node |
LB Controller (vCNE only) (for CNLB disabled CNE only) | 10 | 500 | 128 | 1024 | 1 |
Egress Controller (for CNLB disabled CNE only) | 100 | 1000 | 200 | 500 | 1 per worker node |
Bastion Controller | 10 | 200 | 128 | 256 | 1 |
Kyverno | 100 | 200 | 256 | 512 | 3 |
CNLB-App (for CNLB enabled CNE only) | 500 | 4000 | 1024 | 1024 | 4 |
CNLB-Manager (for CNLB enabled CNE only) | 500 | 4000 | 1024 | 1024 | 1 |
The overall common services resource usage varies on each worker node. The common services listed in Table 4-1 are evenly distributed across all worker nodes in the Kubernetes cluster.
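As a rough illustration of how the per-instance figures in Table 4-1 add up across a cluster, the sketch below totals requests and limits for a few rows of the table. The node counts (7 nodes in total, 4 of them worker nodes), the `ServiceRow` type, and the `cluster_totals` function are assumptions made for this example, not CNE values or APIs. Only a subset of rows is included to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceRow:
    name: str
    cpu_request_m: int
    cpu_limit_m: int
    ram_request_mi: int
    ram_limit_mi: int
    instances: int        # instance count, or instances per node for per-node scopes
    scope: str = "fixed"  # "fixed", "per_node", or "per_worker_node"


# A subset of Table 4-1, copied as listed.
ROWS = [
    ServiceRow("Prometheus", 2000, 4000, 16384, 16384, 2),
    ServiceRow("Prometheus Node Exporter", 800, 800, 512, 512, 1, "per_node"),
    ServiceRow("Fluentd OpenSearch", 100, 500, 128, 12228, 1, "per_worker_node"),
    ServiceRow("Jaeger Collector", 500, 1250, 512, 1024, 1),
]


def instance_count(row, total_nodes, worker_nodes):
    """Resolve the number of running instances for a row."""
    if row.scope == "per_node":
        return row.instances * total_nodes
    if row.scope == "per_worker_node":
        return row.instances * worker_nodes
    return row.instances


def cluster_totals(rows, total_nodes, worker_nodes):
    """Sum requests and limits over all instances of the given rows."""
    totals = {"cpu_request_m": 0, "cpu_limit_m": 0,
              "ram_request_mi": 0, "ram_limit_mi": 0}
    for row in rows:
        n = instance_count(row, total_nodes, worker_nodes)
        totals["cpu_request_m"] += n * row.cpu_request_m
        totals["cpu_limit_m"] += n * row.cpu_limit_m
        totals["ram_request_mi"] += n * row.ram_request_mi
        totals["ram_limit_mi"] += n * row.ram_limit_mi
    return totals


# Hypothetical cluster size, chosen only for the example.
totals = cluster_totals(ROWS, total_nodes=7, worker_nodes=4)
print(totals)
```

Because per-node services scale with the node count while fixed-instance services do not, the share of resources used by common services on any one node depends on the cluster size.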