3 Resource Requirements

This chapter provides information about the resource requirements to install and run Oracle Communications Network Analytics Data Director (OCNADD) with the desired Message Per Second (MPS) profiles.

Cluster Details

The following tables provide information about the types of servers and the number of servers used in the test environment:

Table 3-1 Test Bed 1 - CNE on Bare Metal

Type of Server    X9 Server and NVMe
Master nodes      3
Worker nodes      19
Storage Class     Standard
LB Type           LBVM/CNLB

Table 3-2 Test Bed 2 - vCNE on OpenStack

Type of Server    X9 Server and NVMe
Master nodes      3
Worker nodes      44
Storage Class     Standard
LB Type           LBVM/CNLB

Resource Requirements for OCI Environment

  • OCI block volumes are attached to the PVCs with auto-tune based performance (Balanced to High Performance). To change a block volume to auto-tune based performance (Balanced to High Performance), see Changing the Performance of a Volume.
  • All tests are performed with the default round-robin based ordering.
  • Resource requirements may vary when key-based or custom ordering is enabled and traffic is run with actual NFs.

Table 3-3 Test Bed 3 - OKE on OCI

Type of Server           OCI Hardware
Worker nodes             6
Instance Shape           VM.Standard.E4.Flex
OCPUs per worker node    50 (vCPUs: 100)
Memory per worker node   194 GB

3.1 Profile Resource Requirements

This section provides information about the profile resource requirements to install and run Oracle Communications Network Analytics Data Director (OCNADD) with the desired Message Per Second (MPS) profiles.

Note:

It is recommended to apply the following configurations on a CNE Bare Metal/vCNE setup to achieve the required throughput:
  • Jumbo frames should be enabled.
  • Ring buffer sizes should be increased to avoid packet drops at the interfaces (not applicable for vCNE). See the sketch after this note.
  • FluentD pods should not be in the "CrashLoopBackOff" state due to Out of Memory errors. For more information, see the "High Latency in adapter feeds due to high disk latency" section in the Oracle Communications Network Analytics Data Director Troubleshooting Guide.
  • The benchmark tests were performed with a round-trip latency of up to 5 ms from third-party consumer applications. If the latency exceeds 5 ms, the resource profile footprint and the E2E latency will be higher.
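
The exact tuning steps depend on the CNE release and NIC hardware; the following is a minimal sketch, assuming a Linux worker node whose traffic interface is named eth0 (the interface name, MTU, and ring sizes are illustrative values only), of how jumbo frames and ring buffer sizes are typically verified and adjusted:

    # Enable jumbo frames on the traffic interface (interface name and MTU are examples)
    ip link set dev eth0 mtu 9000

    # Display the current and hardware-maximum RX/TX ring buffer sizes
    ethtool -g eth0

    # Increase the ring buffer sizes toward the maximum reported above (not applicable for vCNE)
    ethtool -G eth0 rx 4096 tx 4096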

3.1.1 Resource Profile for Database

This section provides information about the database profile resource requirements to install and run Oracle Communications Network Analytics Data Director (OCNADD) with the desired Message Per Second (MPS) profiles.

Table 3-4 Resource Requirement

cnDBTier Pods (Kubernetes Resource Type)                        Min vCPU   Max vCPU   Min Memory   Max Memory   Total Replica
SQL (ndbmysqld) - StatefulSet                                   1          1          1Gi          1Gi          2
SQL (ndbappmysqld) - StatefulSet                                1          1          1Gi          1Gi          2
MGMT (ndbmgmd) - StatefulSet                                    1          1          1Gi          1Gi          2
Database (ndbmtd) - StatefulSet                                 1          1          4Gi          4Gi          2
Backup Manager Service (db-backup-manager-svc) - Deployment     0.1        0.1        128Mi        128Mi        1
Monitor Service (db-monitor-svc) - Deployment                   0.2        0.2        500Mi        500Mi        1

When EXTENDED STORAGE is ENABLED in a CORRELATION Feed (per Correlation Feed):
  • Rate supported in the current release: 1K MPS rate with 24 hours retention
  • Update "global.ndb.datamemory=96G" in the custom-value.yaml of cnDBTier
  • PVC of ndbmtd = 150GB

Database (ndbmtd) - StatefulSet                                 8          8          128Gi        128Gi        4

Note:

Configure "datamemory: 1G" under "ndbmtd" section while deploying the CnDbTier for OCNADD. For more details on cnDBTier resource profile, see "cnDBTier Small Profile" section in cnDBTier Resource Models Guide.

3.1.2 Resource Profile for 500K MPS

3.1.2.1 Resource Profile for OCNADD OAM Services

The following profile is used for management group services in all the performance scenarios.

Table 3-5 Resource Requirement

Service Name             vCPU   Memory Required (Gi)   Total Replica   Description
ocnaddconfiguration      1      1                      1               -
ocnaddalarm              1      1                      1               -
ocnaddhealthmonitoring   1      1                      1               -
ocnaddgui                1      1                      1               -
ocnadduirouter           1      1                      1               -
ocnaddexport             0.5    1                      1               Resource requirement will increase when export is configured.
ocnaddredundancyagent    1      1                      1               Required only when Georedundancy is enabled for OCNADD.

3.1.2.2 Resource Profile for OCNADD Worker Group Services

The following profile shall be used for the worker group services. The resource profile for the worker group services varies based on the scenario being executed.

Note:

To support the increased throughput, first increase the number of Kafka instances, and then change the number of topic partitions based on the recommended MPS profile. For more details, see "Adding Partitions to an Existing Topic" in the Oracle Communications Network Analytics Data Director User Guide.
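
For reference, with the standard Apache Kafka CLI the partition count of an existing topic can be increased as shown in the following sketch; the broker address is a placeholder, the topic name and partition count are taken from the 500K MPS profile below, and the OCNADD-specific procedure in the User Guide should be followed:

    # Increase the MAIN topic to 168 partitions (partitions can only be increased, never reduced)
    kafka-topics.sh --bootstrap-server <kafka-broker>:9092 \
        --alter --topic MAIN --partitions 168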

Note:

For more details about the various supported MPS resource profiles, refer to the Oracle Communications Network Analytics Data Director Benchmarking Guide of the previous or respective releases.

3.1.2.2.1 Egress 500K MPS Synthetic (TCP) Feed

Note:

The test was conducted using separate clusters: one dedicated to the OCNADD services, and the other shared by the SCP NF services and the third-party consumer.
  • Replication Factor: 1
  • Message Size: 3500 bytes
  • Feed Type: 2 TCP feeds
  • Filter: Egress filter ON for the second feed with 1% data
  • Message Sequencing/Metadata: OFF
  • Test bed: LBVM-CNE Bare Metal Cluster Environment
  • Storage Type: Ceph

500K MPS SCP Profile (Ingress to DD)

Table 3-6 500K MPS SCP Profile (Ingress to DD)

Service                                                                                   vCPU   Memory Required (Gi)   Total Replica   Topic Partitions
kraft-controller                                                                          1      2                      3               -
ocnaddkafka (kafkaBroker)                                                                 5      96                     17              -
ocnaddscpaggregation                                                                      2      4                      20              SCP=120 (each instance 6 partitions)
ocnaddadapter-1 (TCP) (consumeradapter, without any filter)                               3      6                      28              MAIN=168 (each instance 6 partitions)
ocnaddadapter-2 (TCP) (consumeradapter, 500K MPS ingress and 5K MPS egress with filter)   3      10                     9               No partition change, as the MAIN topic already has 168 partitions.
ocnaddadminservice                                                                        1      1                      1               Admin Service moved to the Worker Group from release 25.2.100

Note:

  • Additional memory and/or replicas are required for the aggregation service if the Metadata Cache feature is enabled and the values of the properties METADATA_MAP_CACHE_EXPIRY_TIME_MS and METADATA_MAP_CACHE_SCHEDULER_TIME_MS are increased to higher values.
  • The end-to-end latency may increase based on:
    • Higher values of METADATA_MAP_CACHE_EXPIRY_TIME_MS and METADATA_MAP_CACHE_SCHEDULER_TIME_MS.
    • Timer Expiry Value + Processing Time + RF2/RF1 Processing Time + Third-party Response Time (for HTTP2 feed).
  • Resource requirements may vary for the consumeradapter service based on the percentage of data allowed after filtering and the number of filter conditions along with their values.
  • If ramDriveStorage is enabled, then the KafkaBroker pod memory requirement will be: Memory Required for KafkaBroker + Memory Required for Data Retention

    Example: If the KafkaBroker requires 48Gi of memory and data retention for the topic requires 200Gi, the total memory required will be 248Gi.

    It is recommended to configure a replication factor of 2 for the Kafka internal topics (offsetTopicReplicationFactor and transactionStateReplicationFactor) to improve cluster durability. This configuration may enhance data availability and resilience (see the sketch after this note).

  • Depending on cluster performance, more instances of the KafkaBroker may be required when running with RF=2, and end-to-end latency may also increase if disk I/O is slow.
  • For DISK I/O, see Disk Throughput Requirements.
  • For Kafka PVC-Storage, see Kafka PVC Storage Requirements.
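
As a sketch only, the internal-topic recommendation above corresponds to the standard Kafka broker properties offsets.topic.replication.factor and transaction.state.log.replication.factor; in the ocnadd-custom-values.yaml file they are exposed through the parameters named in the note, whose exact location in the file depends on the release:

    # Replication factor for the Kafka internal topics (placement in the file is illustrative)
    offsetTopicReplicationFactor: 2
    transactionStateReplicationFactor: 2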

3.2 Pod Affinity (or Anti-affinity) Rules

The Data Director supports node affinity rules. The rules are currently defined for the services listed below and are disabled by default; the user can enable them for the supported services. The rules control the deployment of certain traffic-processing services on a particular set of identified nodes.

Service names:
  • Consumer Adapter
  • Kafka
  • ocnaddnrfaggregation
  • ocnaddscpaggregation
  • ocnaddseppaggregation
  • ocnaddnonoracleaggregation
  • ocnaddbsfaggregation
  • ocnaddpcfaggregation

Node Affinity Rules

  1. Update the "affinity" section in the ocnadd-custom-values.yaml file:
    affinity: {}
                    # Node Affinity Configuration:
                    #
                    # To enable node affinity, remove the empty curly braces ({}) above and un-comment the nodeAffinity section below.
                    # This allows you to specify rules for scheduling pods on specific nodes.
                    #
                    # Example Configuration:
                    ###################################   
                    # nodeAffinity:
                    #   requiredDuringSchedulingIgnoredDuringExecution:
                    #     nodeSelectorTerms:
                    #     - matchExpressions:
                    #       - key: kubernetes.io/hostname
                    #         operator: NotIn
                    #         values:
                    #         - k8s-node-26
                    #         - k8s-node-24
                    #       - key: kubernetes.io/hostname
                    #         operator: In
                    #         values:
                    #         - k8s-node-2
                    #         - k8s-node-3   
                    ###########################################
                    # Explanation:
                    #
                    # - The 'NotIn' expression prevents pods from being scheduled on nodes k8s-node-26 and k8s-node-24.
                    # - The 'In' expression ensures pods are scheduled on nodes k8s-node-2 and k8s-node-3.
                    #
                    # To customize, modify the 'key', 'operator', and 'values' fields according to your needs.
                    # You can add or remove 'matchExpressions' to create more complex scheduling rules.
                    #
                    # Remember to remove the empty 'affinity: {}' and un-comment the desired nodeAffinity configuration to enable it.
  2. Perform a Helm upgrade of the corresponding worker group or the default group:
    helm upgrade <source-release-name> -f ocnadd-custom-values-<worker-group>.yaml --namespace <source-release-namespace> <target-release-helm-chart>

    Where,

    • <source-release-name> is the release name of the source release deployment.
    • ocnadd-custom-values-<worker-group>.yaml is the custom values file created for the default worker group or for a worker group deployed in a separate namespace.
    • <source-release-namespace> is the OCNADD namespace of the source release.
    • <target-release-helm-chart> is the location of the Helm chart of the target release.
    For example:
    helm upgrade ocnadd -f ocnadd-custom-values-wg.yaml --namespace ocnadd-deploy ocnadd_wg
  3. Verify that the pods of the modified services have been deployed according to the configured affinity rules, for example, as shown below.
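
For example, the node placement of the pods can be checked with kubectl; the namespace below is a placeholder:

    # List the pods together with the worker nodes they were scheduled on
    kubectl get pods -n <ocnadd-namespace> -o wide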

3.3 Ephemeral Storage Requirements

The following table describes the Ephemeral Storage requirements for OCNADD:

Table 3-7 Ephemeral Storage Requirements

Service Name                           Ephemeral Storage (Request) in Mi   Ephemeral Storage (Limit) in Mi   Description
OAM Services
ocnaddalarm                            100                                 500                               -
ocnaddhealthmonitoring                 100                                 500                               -
ocnaddconfiguration                    100                                 500                               -
ocnadduirouter                         500                                 500                               -
ocnaddexport                           1000                                2000                              -
ocnaddredundancyagent                  100                                 500                               Required only when Geo Redundancy is enabled for OCNADD
Worker Group Services
<app-name>-adapter (consumeradapter)   1000                                1000                              -
ocnaddscpaggregation                   100                                 500                               -
ocnaddseppaggregation                  100                                 500                               -
ocnaddnrfaggregation                   100                                 500                               -
ocnaddbsfaggregation                   100                                 500                               -
ocnaddpcfaggregation                   100                                 500                               -
ocnaddnonoracleaggregation             100                                 500                               Required only when Data processing is enabled from Non-oracle NFs
ocnaddcorrelation                      400                                 800                               -
ocnaddstorageadapter                   400                                 800                               -
ocnaddingressadapter                   400                                 800                               -
ocnaddfilter                           100                                 800                               Required only when "Filtered" or "Correlated Filtered" feed is created
ocnaddadminservice                     200                                 200                               Admin Service moved to the Worker Group from release 25.2.100

3.4 Disk Throughput Requirements

The following table describes the disk throughput requirements in OCNADD:

Table 3-8 Disk Throughput Requirements

Avg Size   Rate    RF   Topics      Consumer   Total Write   Total Read   No. of    Per-Broker     Per-Broker    Per-Broker Total     Cluster Total
(Bytes)                 (NF+MAIN)   Feed       (MB/s)        (MB/s)       Brokers   Write (MB/s)   Read (MB/s)   (MB/s, 10% buffer)   (MB/s, 10% buffer)
1941       39000   1    2           1          145           145          3         54             54            108                  324
1941       39000   2    2           1          289           289          3         106            106           212                  636
3769       39000   1    2           1          281           281          3         104            104           208                  624
3769       39000   2    2           1          561           561          3         206            206           412                  1236

Note:

  • The average size of the OCNADD ingress message captured in the table includes the size of the metadata list + the header list of the original 5G HTTP2 header frame + the 5G-SBI-Message.
  • Currently, it is recommended to set the Replication Factor (RF) value to 1 with the assumption that the underlying storage provides data redundancy. RF value of "2" will be supported in a future release.
The disk throughput calculations are as follows:

Writes: W * RF * T
Reads: ((RF * T) + C - 1) * W
Disk Throughput (Write + Read): (W * RF * T) + (L * W)

Where,

W  -> MB/s of data that will be written
RF -> Replication factor
T  -> Number of topics to which the data is copied. Currently, each message is copied into two topics.
C  -> Number of consumer groups, that is, the number of readers for each write
L  -> (RF * T) + C - 1
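
For example, applying these formulas to the first row of Table 3-8 (average message size 1941 bytes, rate 39000 MPS, RF = 1, T = 2, C = 1, and treating 1 MB as 1024 * 1024 bytes, which reproduces the table values):

W = (39000 * 1941) / (1024 * 1024) = ~72 MB/s
Writes = W * RF * T = 72 * 1 * 2 = ~145 MB/s
Reads  = ((RF * T) + C - 1) * W = ((1 * 2) + 1 - 1) * 72 = ~145 MB/s

With 3 brokers and a 10% buffer, this corresponds to roughly 54 MB/s of write and 54 MB/s of read throughput per broker, that is, 108 MB/s per broker and 324 MB/s for the cluster, matching the first row of the table.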

Average Message Size in the Table:

Average Message Size = (a1*b1 + a2*b2 + ... + a(n)*b(n)) / (a1 + a2 + ... + a(n))

a1   -> SCP MPS
b1   -> SCP message size
a2   -> NRF MPS
b2   -> NRF message size
a(n) -> NF(n) MPS
b(n) -> NF(n) message size

Example:

Average message size for row 1 = ((1624*30000)+(3000*9000))/(30000+9000) = 1941 Bytes (approx)

Average message size for row 4 = ((4000*30000)+(3000*9000))/(30000+9000) = 3769 Bytes (approx)

The following table describes the disk throughput for SCP and NRF:

Table 3-9 SCP, NRF, and SEPP Disk Throughput

SCP Message          NRF Message          SEPP Message          RF   Topics      Consumer   Total Write   Total Read   No. of    Per-Broker     Per-Broker    Per-Broker Total     Cluster Total
Avg Size    Rate     Avg Size    Rate     Avg Size    Rate           (NF+MAIN)   Feed       (MB/s)        (MB/s)       Brokers   Write (MB/s)   Read (MB/s)   (MB/s, 10% buffer)   (MB/s, 10% buffer)
(Bytes)              (Bytes)              (Bytes)
1624        30000    3000        9000     3000        15000     1    2           1          145           145          3         54             54            108                  324
1624        30000    3000        9000     3000        15000     2    2           1          289           289          3         106            106           212                  636
4000        30000    3000        9000     3000        15000     1    2           1          281           281          3         104            104           208                  624
4000        30000    3000        9000     3000        15000     2    2           1          561           561          3         206            206           412                  1236

Note:

  • The average size of the OCNADD ingress message captured in the table includes the size of the metadata list + the header list of the original 5G HTTP2 header frame + the 5G-SBI-Message.
  • Currently, it is recommended to set the Replication Factor (RF) value to 1 with the assumption that the underlying storage provides data redundancy.

3.5 Kafka PVC Storage Requirements

The following table describes the retention period per topic for different NFs:

Table 3-10 Retention Period Per Topic

Topic Name Retention Period
SCP 5 Minutes
NRF 5 Minutes
SEPP 5 Minutes
BSF 5 Minutes
PCF 5 Minutes
MAIN 6 Hours (Max)

The following calculation is used to determine the storage requirement for a topic:

Important:

For the 6 hrs storage in the MAIN topic, the storage requirement must be calculated using the following information:
Storage Requirement for a topic = MPS * Retention Period * RF * Average Message Size

Where,

MPS is "Message Per Second"

RF is "Replication Factor"

Examples:
  1. Average Message Size = 1941 Bytes

    The following example uses the values from the first row of Table 3-9. For more information about the table, see Disk Throughput Requirements:

    Storage Requirement for the SCP and NRF Topics = MPS * Retention Period * RF * Message Size
                                                   = 39000 * 5 Minutes * 3 * 1941
                                                   = 39000 * 5 * 60 * 3 * 1941
                                                   = ~63.45 GB

    Storage Requirement for MAIN = MPS * Retention Period * RF * Message Size
                                 = 39000 * 6 Hours * 3 * 1941
                                 = 39000 * 6 * 60 * 60 * 3 * 1941
                                 = ~4.46 TB

    Total Storage Requirement for the Broker Cluster = Storage for SCP + Storage for NRF + Storage for MAIN
                                                     = 63.45 GB + 4.46 TB
                                                     = ~4.53 TB

    Total Storage for each broker = (4.53 / Number of Brokers) TB = (4.53 / 3) TB = ~1.51 TB (assuming a 3-broker cluster)
  2. Average Message Size = 3769 Bytes

    The following example uses the values from the fourth row of Table 3-9. For more information about the table, see Disk Throughput Requirements:

    Storage Requirement for the SCP and NRF Topics = MPS * Retention Period * RF * Message Size
                                                   = 39000 * 5 Minutes * 3 * 3769
                                                   = 39000 * 5 * 60 * 3 * 3769
                                                   = ~123.20 GB

    Storage Requirement for MAIN = MPS * Retention Period * RF * Message Size
                                 = 39000 * 6 Hours * 3 * 3769
                                 = 39000 * 6 * 60 * 60 * 3 * 3769
                                 = ~8.66 TB

    Total Storage Requirement for the Broker Cluster = Storage for SCP + Storage for NRF + Storage for MAIN
                                                     = 123.20 GB + 8.66 TB
                                                     = ~8.79 TB

    Total Storage for each broker = (8.79 / Number of Brokers) TB = (8.79 / 3) TB = ~2.93 TB (assuming a 3-broker cluster)