7 Enable or Disable RAM Storage in Kafka Cluster

This chapter outlines the procedures to enable or disable RAM storage in a Kafka cluster, depending on performance and reliability requirements.

7.1 Enable RAM Storage in Kafka Cluster

RAM storage supports higher throughput in the Kafka cluster. Throughput is normally limited by the IOPS of the underlying disks: when disk bandwidth is insufficient, the Kafka cluster delivers lower throughput and higher latency. Storing messages in memory (RAM) removes this bottleneck. However, because the messages are held in memory, long message retention cannot be supported. In addition, messages are not persisted to disk, so messages may be lost if a broker goes down.

Use this procedure to enable RAM storage in the Kafka cluster. Execute it on every applicable group (relayagent or mediation) where RAM storage is needed. Perform the following steps:

Step 1: Enable the RAM storage in the custom values file of the corresponding group relayagent or mediation (ocnadd-relayagent-custom-values.yaml or ocnadd-mediation-custom-values.yaml).

Note:

Take a backup of the custom values file of the relayagent or mediation group before making changes, so that the original parameter values can be retrieved from the backup if RAM-based storage is later disabled.

Update the following parameters in the ocnaddkafka section of the custom values file of the corresponding group (ocnadd-relayagent-custom-values.yaml or ocnadd-mediation-custom-values.yaml).

For relay agent:

ocnaddrelayagent.ocnaddkafka.ocnadd.kafkaBroker.kafkaProperties.ramDriveStorage: false    # ==================> set it to true

# If the Kafka broker requires 48Gi of memory and topic data retention requires 200Gi, the total memory required is 248Gi.
# The memory for a single topic can be calculated as "MPS * Retention Period * RF * Average Message Size".
# Account for every planned topic when sizing the memory, for example, SCP, NRF, SEPP, MAIN.

ocnaddrelayagent.ocnaddkafka.ocnadd.kafkaBroker.resource.requests.memory: 48Gi       # ===================> set it to appropriate value
ocnaddrelayagent.ocnaddkafka.ocnadd.kafkaBroker.resource.limit.memory: 48Gi          # ===================> set it to appropriate value  

ocnaddrelayagent.ocnaddkafka.ocnadd.kafkaBroker.kafkaProperties.offsetsTopicReplicationFactor: 3    # ==================> set it to 2
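
The sizing rule in the comments above can be worked through numerically. The following is a minimal sketch, not a sizing tool. It assumes that MPS means messages per second, that the retention period is expressed in seconds, and that the average message size is in bytes; the source does not state these units. The function names and the sample traffic figures are hypothetical.

```python
GIB = 1024 ** 3  # bytes in one gibibyte (Gi)


def topic_retention_bytes(mps, retention_seconds, replication_factor, avg_message_bytes):
    """Memory for one topic: MPS * Retention Period * RF * Average Message Size."""
    return mps * retention_seconds * replication_factor * avg_message_bytes


def total_memory_gib(broker_base_gib, topic_bytes):
    """Broker base memory plus the retention memory of all planned topics, in Gi."""
    return broker_base_gib + sum(topic_bytes) / GIB


# Hypothetical traffic profile for a single topic (illustrative values only).
main_topic = topic_retention_bytes(
    mps=10_000, retention_seconds=300, replication_factor=2, avg_message_bytes=2048
)
print(f"MAIN topic retention: {main_topic / GIB:.1f} Gi")

# The example from the comments: 48Gi for the broker plus 200Gi of retention data.
print(f"Total: {total_memory_gib(48, [200 * GIB]):.0f} Gi")
```

The sketch reports only the aggregate figure. How that figure is distributed across brokers is not specified in this procedure, so treat the result as a starting point for choosing the requests and limit memory values.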

For mediation group:

ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.kafkaProperties.ramDriveStorage: false    # ==================> set it to true

# If the Kafka broker requires 48Gi of memory and topic data retention requires 200Gi, the total memory required is 248Gi.
# The memory for a single topic can be calculated as "MPS * Retention Period * RF * Average Message Size".
# Account for every planned topic when sizing the memory, for example, MAIN.

ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.resource.requests.memory: 48Gi       # ===================> set it to appropriate value
ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.resource.limit.memory: 48Gi          # ===================> set it to appropriate value  

ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.kafkaProperties.offsetsTopicReplicationFactor: 3    # ==================> set it to 2
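
The parameters above are written as dotted paths. The following sketch shows how such paths could map to nested keys in a custom values file. It assumes the usual Helm convention that each dot introduces one level of nesting; the actual layout of ocnadd-mediation-custom-values.yaml should be confirmed in the file itself. The helper names `expand_dotted` and `render` are hypothetical.

```python
def expand_dotted(settings):
    """Turn {"a.b.c": value} style entries into nested dictionaries."""
    root = {}
    for path, value in settings.items():
        *parents, leaf = path.split(".")
        node = root
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return root


def render(node, indent=0):
    """Render nested dictionaries as indented 'key: value' lines."""
    lines = []
    pad = "  " * indent
    for key, value in node.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(render(value, indent + 1))
        else:
            # Booleans are written in lowercase, as in YAML.
            text = str(value).lower() if isinstance(value, bool) else value
            lines.append(f"{pad}{key}: {text}")
    return lines


# Target values from Step 1 for the mediation group.
prefix = "ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.kafkaProperties"
changes = {
    f"{prefix}.ramDriveStorage": True,
    f"{prefix}.offsetsTopicReplicationFactor": 2,
}
print("\n".join(render(expand_dotted(changes))))
```

The same expansion applies to the relay agent paths, with ocnaddrelayagent as the top-level key.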

Update the following parameter under the consumeradapter section in ocnaddadmin:

consumeradapter.env.ADAPTER_KAFKA_PROCESSING_GUARANTEE: exactly_once_v2    # ===============> change it to "at_least_once"

Note that with "at_least_once", duplicate messages may occur during consumer rebalancing or broker restarts.
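
Because duplicates are possible with "at_least_once", a downstream application may need to tolerate them. The following is a minimal sketch of one common approach, a bounded filter of recently seen message identifiers. It assumes that every message carries a unique identifier, which this procedure does not confirm. The class name, the capacity, and the sample identifiers are hypothetical, and the sketch is not part of the OCNADD consumer adapter.

```python
from collections import OrderedDict


class RecentIdFilter:
    """Remembers the most recently seen message IDs, up to a fixed capacity."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self._seen = OrderedDict()

    def is_new(self, message_id):
        """Return True the first time an ID is seen, False for a duplicate."""
        if message_id in self._seen:
            self._seen.move_to_end(message_id)  # keep recently repeated IDs longer
            return False
        self._seen[message_id] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest ID
        return True


# Hypothetical delivery sequence containing redelivered messages.
delivered = ["m1", "m2", "m2", "m3", "m1"]
recent = RecentIdFilter(capacity=100)
unique = [m for m in delivered if recent.is_new(m)]
print(unique)
```

The filter only detects duplicates that arrive while the original identifier is still within the capacity window, so the capacity would need to cover the longest expected redelivery gap.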

Note:

Skip the remaining steps if this procedure is performed as part of a fresh installation of the OCNADD cluster.

The following steps are required only when the Kafka cluster is being migrated from CEPH-based storage to RAM storage.

Step 2: Uninstall the corresponding group (relayagent or mediation) where RAM-based storage is being enabled.

Uninstall Group

Uninstall the corresponding group (relayagent or mediation) by following the relevant procedure in Oracle Communications Network Analytics Data Director Installation, Upgrade, and Fault Recovery Guide.

For relay agent:

Follow the steps defined in section "Uninstalling Relay Agent" from Oracle Communications Network Analytics Data Director Installation, Upgrade, and Fault Recovery Guide.

For mediation group:

Follow the steps defined in section "Uninstalling Mediation Group" from Oracle Communications Network Analytics Data Director Installation, Upgrade, and Fault Recovery Guide.

Note:

During the uninstallation of the corresponding group, ensure that the backup PVC is not deleted along with the Kafka broker/KRaft-controller PVCs. Modify the command that deletes all PVCs so that it deletes only the required Kafka broker/KRaft-controller PVCs.
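
The selection described in this note can be sketched as a simple name filter. This is an illustration only: the PVC names and the marker strings below are hypothetical, and the real names depend on the deployment. The sketch produces a list to review and does not delete anything.

```python
def select_pvcs_to_delete(pvc_names, delete_markers, keep_markers):
    """Return PVC names that match a delete marker and do not match any keep marker."""
    selected = []
    for name in pvc_names:
        if any(marker in name for marker in keep_markers):
            continue  # never select a PVC that must be preserved, such as the backup PVC
        if any(marker in name for marker in delete_markers):
            selected.append(name)
    return selected


# Hypothetical PVC names for one group.
pvcs = [
    "kafka-broker-data-0",
    "kafka-broker-data-1",
    "kraft-controller-data-0",
    "backup-data",
]
to_delete = select_pvcs_to_delete(
    pvcs,
    delete_markers=["kafka-broker", "kraft-controller"],
    keep_markers=["backup"],
)
print(to_delete)
```

Checking the keep markers first means that a PVC whose name matches both lists is preserved, which is the safer default for backup data.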

Step 3: Reinstall the same worker group (relayagent or mediation) where RAM-based storage is being enabled. Use the custom values file that contains the changes for enabling RAM storage (refer to Step 1).

Perform the steps mentioned in section "Installing OCNADD RelayAgent Group" or "Installing OCNADD Mediation Group" from Oracle Communications Network Analytics Data Director Installation, Upgrade, and Fault Recovery Guide.

7.2 Disable RAM Storage in Kafka Cluster

Use this procedure to disable RAM storage and return to persistent storage in the Kafka cluster. Execute it on every applicable worker group (relayagent or mediation) where persistent storage is needed. Perform the following steps:

Step 1: Disable the RAM storage in the custom values file of the corresponding group relayagent or mediation (ocnadd-relayagent-custom-values.yaml or ocnadd-mediation-custom-values.yaml).

Use the corresponding parameter values from the backup of the worker group custom values file taken during the Enable RAM Storage procedure.

Disable RAM Storage Parameters

Update the following parameters in the ocnaddkafka section of the custom values file of the corresponding group (ocnadd-relayagent-custom-values.yaml or ocnadd-mediation-custom-values.yaml).

For relay agent:

ocnaddrelayagent.ocnaddkafka.ocnadd.kafkaBroker.kafkaProperties.ramDriveStorage: true   # ==================> set it to false

ocnaddrelayagent.ocnaddkafka.ocnadd.kafkaBroker.resource.requests.memory: <248Gi>       # ===================> set it to the previously configured value, for example, 48Gi
ocnaddrelayagent.ocnaddkafka.ocnadd.kafkaBroker.resource.limit.memory: <248Gi>          # ===================> set it to the previously configured value, for example, 48Gi
ocnaddrelayagent.ocnaddkafka.ocnadd.kafkaBroker.pvcClaimSize: <previous value>         # ===================> set it to the previously configured value

# Use the following values when higher throughput with lower latency is needed. Message reliability is lower if a Kafka broker goes down.
offsetsTopicReplicationFactor: 1
transactionStateLogReplicationFactor: 1

# or

# Use the following values when higher message reliability is required (RF>1). Throughput may be lower and latency higher if the Kafka cluster disk IOPS or cluster network bandwidth is insufficient.
offsetsTopicReplicationFactor: 2
transactionStateLogReplicationFactor: 2

For mediation group:

ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.kafkaProperties.ramDriveStorage: true   # ==================> set it to false

ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.resource.requests.memory: <248Gi>       # ===================> set it to the previously configured value, for example, 48Gi
ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.resource.limit.memory: <248Gi>          # ===================> set it to the previously configured value, for example, 48Gi
ocnaddmediation.ocnaddkafka.ocnadd.kafkaBroker.pvcClaimSize: <previous value>         # ===================> set it to the previously configured value

# Use the following values when higher throughput with lower latency is needed. Message reliability is lower if a Kafka broker goes down.
offsetsTopicReplicationFactor: 1
transactionStateLogReplicationFactor: 1

# or

# Use the following values when higher message reliability is required (RF>1). Throughput may be lower and latency higher if the Kafka cluster disk IOPS or cluster network bandwidth is insufficient.
offsetsTopicReplicationFactor: 2
transactionStateLogReplicationFactor: 2

Update the following parameter under the consumeradapter section in ocnaddadmin:

consumeradapter.env.ADAPTER_KAFKA_PROCESSING_GUARANTEE: at_least_once   # ===============> change it to "exactly_once_v2"
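
The two procedures in this chapter pair each storage mode with a processing guarantee, and the enable procedure also sets the offsets topic replication factor to 2. The following sketch checks a planned combination against those pairings before the values files are edited. The function name and its arguments are hypothetical; it only encodes the values stated in this chapter and is not a validation tool shipped with the product.

```python
def check_storage_settings(ram_drive_storage, offsets_topic_rf, processing_guarantee):
    """Return a list of mismatches against the values given in sections 7.1 and 7.2."""
    problems = []
    if ram_drive_storage:
        # Section 7.1: RAM storage uses RF 2 and at_least_once.
        if offsets_topic_rf != 2:
            problems.append("offsetsTopicReplicationFactor should be 2 with RAM storage")
        if processing_guarantee != "at_least_once":
            problems.append("ADAPTER_KAFKA_PROCESSING_GUARANTEE should be at_least_once")
    else:
        # Section 7.2: persistent storage switches back to exactly_once_v2.
        if processing_guarantee != "exactly_once_v2":
            problems.append("ADAPTER_KAFKA_PROCESSING_GUARANTEE should be exactly_once_v2")
    return problems


# RAM storage left on exactly_once_v2 is flagged.
print(check_storage_settings(True, 2, "exactly_once_v2"))

# Persistent storage with the restored guarantee passes.
print(check_storage_settings(False, 2, "exactly_once_v2"))
```

The replication factor is not checked in the persistent case because this section allows either 1 or 2 depending on the throughput and reliability trade-off.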

Note:

Skip the remaining steps if this procedure is performed as part of a fresh installation of the OCNADD cluster.

The following steps are required only when the Kafka cluster is being migrated from RAM-based storage to CEPH-based persistent storage.

Step 2: Uninstall the corresponding group (relayagent or mediation) where RAM-based storage was previously enabled.

Uninstall Group

Uninstall the corresponding group (relayagent or mediation) by following the relevant procedure in Oracle Communications Network Analytics Data Director Installation, Upgrade, and Fault Recovery Guide.

For relay agent:

Follow the steps defined in section "Uninstalling Relay Agent" from Oracle Communications Network Analytics Data Director Installation, Upgrade, and Fault Recovery Guide.

For mediation group:

Follow the steps defined in section "Uninstalling Mediation Group" from Oracle Communications Network Analytics Data Director Installation, Upgrade, and Fault Recovery Guide.

Note:

During the uninstallation of the corresponding group, ensure that the backup PVC is not deleted along with the Kafka broker/KRaft-controller PVCs. Modify the command that deletes all PVCs so that it deletes only the required Kafka broker/KRaft-controller PVCs.

Step 3: Reinstall the same worker group (relayagent or mediation) where RAM-based storage is being disabled. Use the custom values file that contains the changes for disabling RAM storage (refer to Step 1).

Perform the steps mentioned in section "Installing OCNADD RelayAgent Group" or "Installing OCNADD Mediation Group" from Oracle Communications Network Analytics Data Director Installation, Upgrade, and Fault Recovery Guide.