Managing ECE Composable Service Pods


You can manage the pods in the Oracle Communications Elastic Charging Engine (ECE) composable services by setting up autoscaling.

About Autoscaling Pods for ECE Composable Services

You can use the Kubernetes Horizontal Pod Autoscaler to automatically scale replicas up or down based on CPU or memory utilization. In the ECE composable services, the Horizontal Pod Autoscaler can monitor and scale the following pods:

charging-gateway
nchf-converged-charging

For best results, design your autoscaling strategy to use infrastructure resources efficiently while avoiding frequent ReplicaSet scaling events.

Setting Up Autoscaling of Pods

To set up and enable autoscaling for the ECE composable service pods:

Ensure that your cluster is set up and the system is in the UsageProcessing state.
Open your override-values.yaml file.
Ensure that the Horizontal Pod Autoscaler is enabled for each pod that you want to autoscale. You enable or disable it by using these keys in your override-values.yaml file:
- For the charging-gateway pod, set the cgf.chargingGateway.hpa.enabled key. The default is true.
- For the nchf-converged-charging pod, set the nchfConvergedCharging.nchfConvergedCharging.hpa.enabled key. The default is true.
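As a minimal sketch, the two keys above map to the following nesting in override-values.yaml. The values shown are the documented defaults. Set a key to false to turn off autoscaling for that pod.

```yaml
# override-values.yaml (fragment)
cgf:
  chargingGateway:
    hpa:
      enabled: true    # autoscaling for the charging-gateway pod

nchfConvergedCharging:
  nchfConvergedCharging:
    hpa:
      enabled: true    # autoscaling for the nchf-converged-charging pod
```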
For each pod, specify the minimum and maximum amount of memory and CPU that can be used.

Set these keys under the resources section of each pod. For example, under the nchfConvergedCharging.nchfConvergedCharging.resources section.
- memoryRequest: Set this to the minimum amount of memory that must be available in a Kubernetes node to deploy a pod.
  
  If the minimum amount is not available, the pod's status is set to Pending.
- cpuRequest: Set this to the minimum CPU amount, in millicores, that must be available in a Kubernetes node to deploy a pod. For example, enter 1000m for 1 CPU core.
  
  If the minimum CPU amount is not available, the pod's status is set to Pending.
- memoryLimit: Set this to the maximum amount of memory that a pod can utilize.
- cpuLimit: Set this to the maximum amount of CPU that a pod can utilize.
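The following fragment sketches how these four keys might look for the nchf-converged-charging pod. All of the values are illustrative assumptions, not recommendations, and the memory values assume that the chart accepts standard Kubernetes quantity notation such as Gi.

```yaml
# override-values.yaml (fragment); all values are illustrative assumptions
nchfConvergedCharging:
  nchfConvergedCharging:
    resources:
      memoryRequest: 2Gi   # pod stays Pending if no node has this much memory free
      cpuRequest: 1000m    # 1000m = 1 CPU core
      memoryLimit: 4Gi     # upper bound on memory the pod can use
      cpuLimit: 2000m      # upper bound on CPU the pod can use
```

Note that the Horizontal Pod Autoscaler computes utilization as a percentage of the requested amount, so the cpuRequest value also affects when scaling is triggered.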
For each pod, specify the minimum and maximum number of pod replicas that can be deployed.

Set these keys under the hpa section of each pod. For example, under the nchfConvergedCharging.nchfConvergedCharging.hpa section.
- minReplicas: Set this to the minimum number of pod replicas to deploy when scale-down is triggered.
  
  If the average utilization across a pod’s replicas falls below metrics.cpuAverageUtilization, the Horizontal Pod Autoscaler decreases the number of pod replicas down to this minimum count.
- maxReplicas: Set this to the maximum number of pod replicas to deploy when scale-up is triggered.
  
  If a pod's average utilization goes above metrics.cpuAverageUtilization, the Horizontal Pod Autoscaler increases the number of pod replicas up to this maximum count.
- metrics.cpuAverageUtilization: Set this as a target or threshold for average CPU usage across all of the pod’s replicas for the same deployment. For example, if a cluster has three charging-gateway pod replicas, the average is the sum of CPU usage divided by three. The default is 65%.
  
  The autoscaler increases or decreases the number of pod replicas to maintain the average CPU utilization you specified across all pods.
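As a sketch, the replica bounds and CPU target could be set as follows. The replica counts are illustrative assumptions, and writing the target as a bare integer is also an assumption about the format that the chart expects.

```yaml
# override-values.yaml (fragment); replica counts are illustrative assumptions
nchfConvergedCharging:
  nchfConvergedCharging:
    hpa:
      minReplicas: 2               # never scale below two replicas
      maxReplicas: 6               # never scale above six replicas
      metrics:
        cpuAverageUtilization: 65  # documented default target of 65%
```

Kubernetes calculates the desired replica count as ceil(currentReplicas × currentUtilization / targetUtilization). For example, with three replicas averaging 90% CPU utilization against a 65% target, the result is ceil(3 × 90 / 65) = ceil(4.15) = 5 replicas, which is within the maxReplicas value above.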
For each pod, specify the rules for scaling down pods.

Set these keys under the hpa.scaleDown section of each pod. For example, under the nchfConvergedCharging.nchfConvergedCharging.hpa.scaleDown section.
- selectPolicy: Specifies Min, Max, or Disabled.
  - Min selects the policy with the smallest change in the replica count.
  - Max selects the policy with the largest change in the replica count.
  - Disabled prevents autoscaling in the scale-down direction.
- stabilizationWindowSeconds: Specifies the duration, in seconds, of the stabilization window when scaling down pods.
- periodSeconds: Specifies the number of seconds for which metrics should be collected before scaling.
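A conservative scale-down configuration might look like the following sketch. The values are illustrative assumptions. A stabilization window of 300 seconds matches the Kubernetes default for scale-down and helps prevent replicas from being removed during a brief dip in load.

```yaml
# override-values.yaml (fragment); values are illustrative assumptions
nchfConvergedCharging:
  nchfConvergedCharging:
    hpa:
      scaleDown:
        selectPolicy: Max                # use the policy with the largest change
        stabilizationWindowSeconds: 300  # wait 5 minutes before removing replicas
        periodSeconds: 60                # metrics collection period before scaling
```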
For each pod, specify the rules for scaling up pods.

Set these keys under the hpa.scaleUp section of each pod. For example, under the nchfConvergedCharging.nchfConvergedCharging.hpa.scaleUp section.
- selectPolicy: Specifies Min, Max, or Disabled.
  - Min selects the policy with the smallest change in the replica count.
  - Max selects the policy with the largest change in the replica count.
  - Disabled prevents autoscaling in the scale-up direction.
- stabilizationWindowSeconds: Specifies the duration, in seconds, of the stabilization window when scaling up pods.
- periodSeconds: Specifies the number of seconds for which metrics should be collected before scaling.
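The scale-up rules follow the same structure. The following sketch uses illustrative values that favor a fast reaction to rising load. A stabilization window of 0 seconds matches the Kubernetes default for scale-up.

```yaml
# override-values.yaml (fragment); values are illustrative assumptions
nchfConvergedCharging:
  nchfConvergedCharging:
    hpa:
      scaleUp:
        selectPolicy: Max              # use the policy with the largest change
        stabilizationWindowSeconds: 0  # add replicas as soon as the target is exceeded
        periodSeconds: 15              # metrics collection period before scaling
```

Pairing a short scale-up window with a longer scale-down window is a common way to respond quickly to traffic spikes while limiting frequent scaling events.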
To lower the heap memory used by the pods, set the appropriate JVM parameters in the jvmOpts key.

Memory-based scale-down occurs only if the amount of memory used by the pod decreases. You can decrease pod memory usage by tuning JVM garbage collection (GC) so that unused heap memory is released.
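The following is a hedged sketch only. It assumes that the jvmOpts key sits directly under each pod's section and accepts a single string of JVM flags. Neither assumption is confirmed here, so check the key's location and format in your chart's values.yaml file. The flags are standard HotSpot options that cap the heap and encourage the JVM to shrink it after garbage collection. The specific values are illustrative.

```yaml
# override-values.yaml (fragment); key location and all values are assumptions
nchfConvergedCharging:
  nchfConvergedCharging:
    # -Xmx caps the heap below the pod's memoryLimit.
    # MinHeapFreeRatio and MaxHeapFreeRatio encourage the JVM to shrink the heap after GC.
    jvmOpts: "-Xmx2g -XX:+UseG1GC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20"
```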
Save and close your override-values.yaml file.
Do one of the following:
- Deploy the ECE composable services (if you have not already deployed them). See "Deploying the ECE Composable Services".
- If you have already deployed the ECE composable services, update your Helm release:
```
helm upgrade EceCompServicesReleaseName oc-ccs-version --values override-values.yaml -n EceCompServicesNameSpace
```
  where:
  - EceCompServicesReleaseName is the release name for the ECE composable services deployment.
  - version is the version number of the ECE composable services Helm chart, such as 3.0.1+e59503ab.
  - EceCompServicesNameSpace is the namespace in which to create Kubernetes objects for the Helm chart.