3 Benchmarking Cloud Deployable DSR

This chapter is divided into the following sections:
  • Infrastructure Environment

    This section provides details of the infrastructures used for the benchmark testing, including the hardware and software. It also describes key settings and attributes, and some recommendations on configuration.

  • Benchmark section for each DSR server type

    Each DSR server type is treated independently for benchmarking. Each section describes the traffic setup, and the observed results. It also provides metrics and guidelines for assessing performance on any infrastructure.

Data Usage

This data is intended to provide guidance. Recommendations may need to be adapted to the conditions in a given operator’s network. Each of the following sections includes metrics that provide feedback on the running performance of the application.

When planning to deploy a DSR into any cloud environment, a few steps are recommended:
  • Understand the initial deployment scenario for the DSR.
    • Which features are planned?
    • How much of what type of traffic?

      This may change once deployed, and the DSR can be grown or shrunk to meet the changing needs.

  • Use the DSR Cloud Dimensioning tool to get an estimate of the types of DSR virtual servers needed and an initial estimate of the quantity of the virtual machines and resources. An Oracle Sales Consultant can run this tool based on DSR requirements:
    • The tool allows for a very detailed model to be built of your DSR requirements, including:
      • Required MPS by Diameter Application ID (S6a, Sd, Gx, Rx, and so on).
      • Required DSR applications such as Full Address Based Resolution (FABR), Policy DRA (PDRA), and any required sizing information such as the number of subscribers supported for each application.
      • Any required DSR features such as Topology Hiding, Message Copy, IPSEC, or Mediation that can affect performance.
      • Network-level redundancy requirements, such as mated pair DSR deployments, where one DSR needs to support full traffic, when one of the DSRs is unavailable.
      • Infrastructure information, such as OpenStack or KVM, and Server parameters.
    • The tool then generates a recommended number of VMs for each of the required VM types.

    Note:

    These recommendations are only guidelines, since the actual performance of the DSR can vary significantly based on the details of the infrastructure.
  • Based on the initial deployment scenario, determine if additional benchmarking is warranted:
    • For labs and trials, there is no need to benchmark performance and capacity if the goal of the lab is to test DSR functionality.
    • If the server hardware is different from the hardware used in this document then the performance differences can likely be estimated using industry standard metrics. This is done by comparing single-threaded processor performance of the CPUs used in this document with respect to the CPUs used in the customer’s infrastructure. This approach is most accurate for small differences in hardware (for instance, different clock speeds for the same generation of Intel processors) and least accurate across processor generations where other architectural differences such as networking interfaces could also affect the comparison.
    • It is the operator’s decision to determine if additional benchmarking in the operator’s infrastructure is desired. Here are a few things to consider when deciding:
      • Benchmark infrastructure is similar to the operator’s infrastructure, and the operator is satisfied with the benchmark data provided by Oracle.
      • Initial turn-up of the DSR is handling a relatively small amount of traffic and the operator prefers to measure and adjust once deployed.
      • Operator is satisfied with the high-availability and geo-diversity of the DSR, and is willing to risk initial overload conditions and to adjust once the DSR is in production.
  • If required, perform benchmark testing on the target cloud infrastructure. Perform benchmark only on those types of DSR servers required for the deployment.

    For example, if full address resolution is not planned, do not waste time benchmarking the SDS, SDS SOAM, or DPs.

    • When the benchmark testing is complete, observe the data for each server type, and compare it with the baseline used for the estimate from the DSR Cloud Dimensioning tool.
      • If the performance estimate for a given DSR function is X and the observed performance is Y, then adjust the performance for that DSR function to Y.
      • Re-calculate the resources needed for deployment based on the updated values.
  • Deploy the DSR.
  • Monitor the DSR performance and capacity as described later in the document. As the network changes additional resources may be required. If needed, increase the DSR resources as described later in this document.
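
The adjustment step described above (replace the estimated performance X with the observed performance Y, then re-calculate resources) can be sketched as follows. This is a minimal illustration, not the logic of the DSR Cloud Dimensioning tool: the function name and all traffic figures are hypothetical, and redundancy requirements are ignored.

```python
import math

def vms_required(required_mps: float, per_vm_mps: float) -> int:
    """Number of VMs needed to carry required_mps at per_vm_mps each."""
    if per_vm_mps <= 0:
        raise ValueError("per_vm_mps must be positive")
    return math.ceil(required_mps / per_vm_mps)

# Hypothetical numbers, for illustration only.
required_mps = 100_000      # traffic the deployment must carry
estimated_per_vm = 30_000   # X: per-VM rate assumed by the initial estimate
observed_per_vm = 24_000    # Y: per-VM rate measured on the target infrastructure

print(vms_required(required_mps, estimated_per_vm))  # 4
print(vms_required(required_mps, observed_per_vm))   # 5
```

In this made-up case the lower observed rate adds one VM to the deployment plan.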

3.1 Infrastructure Environment

This section describes the infrastructure that was used for benchmarking. In general, the defaults or recommendations for hypervisor settings are available from the infrastructure vendors.

Whenever possible the DSR recommendations align with vendor defaults and recommendations. Benchmarking was performed with the settings described in this section. Operators may choose different values, in which case better or worse performance compared to the benchmarks might be observed. When recommendations other than vendor defaults or recommendations are made, additional explanations are included in the applicable section.

There is a sub-section included for each infrastructure environment used in benchmarking.

3.1.1 General Rules for All Infrastructures

3.1.1.1 Hyper-Threading and CPU Over-Subscription
All of the tests were conducted with Hyper-Threading enabled, and with a 1:1 subscription ratio for vCPUs in the hypervisor. The hardware used for the testing was dual-processor servers with 32 physical cores per processor (Oracle X9-2). Thus, each server had:
(2 CPUs) x (32 cores per CPU) x (2 threads per core) = 128 vCPUs

It is not recommended to use over-subscribed vCPUs (for instance 4:1) in the hypervisor. Not only is the performance lower, but it makes the performance more dependent on the other loads running on each physical server.

Turning off Hyper-Threading is also not recommended. There is a small increase in performance of a given VM without Hyper-Threading for a given number of vCPUs. But since the number of vCPUs for each processor drops in half without Hyper-Threading, the overall throughput for each server also drops almost by half.

The vCPU sizing for each VM is provided in the DSR VM Configurations section.

Note:

The recommended configuration is: Hyper-Threading is enabled with 1:1 CPU subscription ratio.
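
The vCPU arithmetic in this section can be checked with a short sketch. The function name is hypothetical; the inputs are the Oracle X9-2 values given above.

```python
def host_vcpus(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Total vCPUs a host exposes at a 1:1 subscription ratio."""
    return sockets * cores_per_socket * threads_per_core

print(host_vcpus(2, 32, 2))  # 128 with Hyper-Threading enabled
print(host_vcpus(2, 32, 1))  # 64 with Hyper-Threading disabled
```

The second call shows why disabling Hyper-Threading halves the number of vCPUs available for VMs on each server.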

CPU Technology

The servers used for the benchmarking were Oracle X9-2 servers with Intel Xeon Platinum 8358 CPUs. Servers with different processors give different results. In general there are the following issues when mapping the results of the benchmarking data in this document to other CPUs:
  • The per-thread performance of a CPU is the main attribute that determines VM performance. The number of threads is fixed in the VM sizing as shown in DSR VM Configurations section. A good metric for comparing the per-thread performance of different CPUs is the integer performance measured by the SPECint2006 (CINT2006) defined by SPEC.ORG.

    The mapping of SPECint2006 ratios to DSR VM performance ratios isn’t exact, but it’s a good measure to determine whether a different CPU is likely to run the VMs faster or slower than the benchmark results in this document.

    Conversely, CPU clock speeds are a relatively poor indicator of relative CPU performance. Within a given Intel CPU generation (v2, v3, v4, and so on) there are other factors that affect per-thread performance, such as potential turbo speeds of the CPU in comparison with the cooling solution in a given server.

    Comparing between Intel CPU generations, there is a generation over generation improvement of CPU throughput in comparison with the clock speed. This means that even a newer generation chip with a slower clock speed may run a DSR VM faster.

  • The processors must have enough cores that a given VM can fit entirely into a NUMA node. Splitting a VM across NUMA nodes greatly reduces the performance of that VM. The largest VM size (refer to the DSR VM Configurations section) is 18 vCPUs. Thus, the smallest processor that should be used is a 9-core processor. Using processors with more cores typically makes it easier to pack VMs more efficiently into NUMA nodes but should not affect individual VM CPU-related performance otherwise.
  • One caveat about CPUs with very high core counts is that the user must be aware of potential bottlenecks caused by many VMs contending for shared resources such as network interfaces and ephemeral storage on the server. These tests were run on relatively large CPUs (32 physical cores for each chip), and no such bottlenecks were encountered while running strictly DSR VMs. In clouds with VMs from other applications potentially running on the same physical server as DSR VMs, or in future processor generations with much higher core counts, this potential contention for shared server resources has to be watched closely.
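
The per-thread comparison described above can be sketched as a simple linear scaling. This is only a rough first-order estimate under the assumption that VM throughput scales with the ratio of per-thread integer scores, which the text notes is not exact. The function name and the scores used are hypothetical and are not published SPEC results.

```python
def scaled_mps_estimate(benchmark_mps: float,
                        baseline_score: float,
                        target_score: float) -> float:
    """Scale a benchmarked MPS figure by the ratio of per-thread CPU scores."""
    if baseline_score <= 0 or target_score <= 0:
        raise ValueError("scores must be positive")
    return benchmark_mps * target_score / baseline_score

# Hypothetical scores: the target CPU is 10% slower per thread than the baseline.
print(scaled_mps_estimate(30_000, baseline_score=50.0, target_score=45.0))
```

An estimate like this indicates whether a different CPU is likely to run the VMs faster or slower; it does not replace benchmarking on the target infrastructure.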

Note:

The selected VM sizes should fit within a single NUMA node, for instance 9 physical cores for the VMs that required 18 vCPUs. Check the performance of the target CPU type against the benchmarked CPU using per-thread integer performance metrics.
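
The NUMA fit rule in this note can be expressed as a small check. This sketch assumes one NUMA node per processor and two threads per core (Hyper-Threading enabled); the function names are hypothetical.

```python
import math

def min_cores_per_numa_node(vm_vcpus: int, threads_per_core: int = 2) -> int:
    """Smallest physical core count per NUMA node that can hold the VM."""
    return math.ceil(vm_vcpus / threads_per_core)

def fits_in_numa_node(vm_vcpus: int, cores_per_node: int,
                      threads_per_core: int = 2) -> bool:
    """True if the VM's vCPUs fit within the threads of one NUMA node."""
    return vm_vcpus <= cores_per_node * threads_per_core

print(min_cores_per_numa_node(18))  # 9
print(fits_in_numa_node(18, 32))    # True
print(fits_in_numa_node(18, 8))     # False
```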
3.1.1.2 VM Packing Rate

The DSR doesn’t require or use CPU pinning. Thus, the packing of the DSR VMs onto the physical servers is under the control of OpenStack using the affinity or anti-affinity rules given in DSR VM Configurations. Typically, the VMs do not fit exactly into the number of vCPUs available in each NUMA node, leaving some unallocated vCPUs. The ratio of the allocated vCPUs to the total available vCPUs is the VM Packing Ratio. For instance, if 102 out of 128 vCPUs on a given server were allocated by OpenStack, that server would have a packing ratio of ~80%. The achieved packing in a deployment depends on many factors, including the mix of large VMs (DA-MPs, SBRs) with the smaller VMs, and whether the DSR is sharing the servers with other applications that have a lot of large or small VMs.

When planning the number of physical servers required for a DSR, a target packing ratio of 80% is a good planning number. A packing ratio of 100% is hard to achieve and may affect the performance numbers shown in the benchmarks. Some amount of server capacity is necessary to run the Host OS for the VMs, which performs functions such as interrupt handling. A packing ratio of 95% or lower is desirable.

Note:

When planning for physical server capacity a packing ratio of 80% is a good guideline. Packing ratios of greater than 95% might affect the benchmark numbers since there aren’t sufficient server resources to handle the overhead of Host OSs.
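
The packing ratio and the 80% planning guideline can be sketched as follows. This is a simplified estimate: it ignores NUMA boundaries and affinity or anti-affinity rules, the function names are hypothetical, and the 400-vCPU deployment is a made-up figure.

```python
import math

def packing_ratio(allocated_vcpus: int, total_vcpus: int) -> float:
    """Fraction of a server's vCPUs that are allocated to VMs."""
    return allocated_vcpus / total_vcpus

def servers_needed(total_vm_vcpus: int, vcpus_per_server: int,
                   target_ratio: float = 0.80) -> int:
    """Servers required if each one is filled only to target_ratio."""
    usable_vcpus = vcpus_per_server * target_ratio
    return math.ceil(total_vm_vcpus / usable_vcpus)

print(f"{packing_ratio(102, 128):.1%}")  # 79.7%
print(servers_needed(400, 128))          # 4
```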
3.1.1.3 Infrastructure Tuning
The following parameters should be set in the infrastructure to improve DSR VM performance. The instructions for setting them for a given infrastructure are included in the DSR Cloud Installation Guide.
  • Txqueuelen: The default of 500 is too small. The recommendation is to set this parameter to 120000.
    • Tuned on the compute hosts.
    • Increasing this value increases the network throughput of a VM.
  • Ring buffer increase on the physical Ethernet interfaces: The default is too small. The recommendation is to set both receive and transmit values to 4096.
  • Multiqueue: Multiqueue should be enabled on any IPFE VMs to improve performance.

Note:

Refer to instructions in the DSR Cloud Installation Guide.
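
As an illustration of the values above, the following sketch builds the standard Linux `ip` and `ethtool` command lines for the txqueuelen and ring buffer settings. It only constructs the command strings and does not run them. The interface names are hypothetical, and the authoritative procedure (including which interfaces to tune and how to make the settings persistent) is the one in the DSR Cloud Installation Guide.

```python
def txqueuelen_command(interface: str, length: int = 120000) -> str:
    """Command line that sets the transmit queue length of an interface."""
    return f"ip link set dev {interface} txqueuelen {length}"

def ring_buffer_command(interface: str, size: int = 4096) -> str:
    """Command line that sets the receive and transmit ring buffer sizes."""
    return f"ethtool -G {interface} rx {size} tx {size}"

# Hypothetical interface names on a compute host.
print(txqueuelen_command("vnet0"))
print(ring_buffer_command("eth0"))
```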

3.1.2 KVM (QEMU)/Oracle X9-2 – Infrastructure Environment

There are a number of settings that affect the performance of the hosted virtual machines. A number of tests were performed to maximize the performance of the underlying virtual machines for the DSR application.

Host Hardware
  • Oracle Server X9-2
    • CPU Model: Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz
    • 2 CPUs
    • 32 physical cores per CPU
    • RAM: 768 GB
    • HDD: 3.8 TB of NVMe storage (with Software RAID-1 configured)
    • NIC:
      • Oracle Quad Port 10G Base-T Adapter
Hypervisor
  • QEMU-KVM Version: QEMU 1.5.3, libvirt 4.5.0, API QEMU 4.5.0
3.1.2.1 Device Drivers

VirtIO is a virtualization standard for network and disk device drivers in which only the guest’s device driver knows it is running in a virtual environment and cooperates with the hypervisor. This enables guests to get high performance network and disk operations and gives most of the performance benefits of para-virtualization.

Vhost-net provides improved network performance over Virtio-net by bypassing QEMU entirely as a fast path for interrupts. Vhost-net runs as a kernel thread and handles interrupts with less overhead, providing near-native performance. The advantages of using the vhost-net approach are reduced copy operations, lower latency, and lower CPU usage.

Note:

The VirtIO driver was used for Test Bed setting.
3.1.2.2 BIOS Power Settings
Typical BIOS power settings (hardware vendor dependent, see relevant infrastructure hardware vendor documentation for details) provide three options for power settings:
  • Power Supply Maximum: The maximum power the available PSUs can draw.
  • Allocated Power: The power is allocated for installed and hot pluggable components.
  • Peak Permitted: The maximum power the system is permitted to consume.

Note:

Set to Allocated Power or equivalent for your Hardware vendor.

Disk Image Formats

The preferred disk image file format when deploying a KVM virtual machine is:
  • QCOW2: Disk format supported by the QEMU emulator that can expand dynamically and supports Copy-on-write.
QCOW2 provides a number of benefits, such as:
  • Smaller file size, even on file systems which don’t support holes (that is, sparse files)
  • Copy-on-write support, where the image only represents changes made to an underlying disk image
  • Snapshot support, where the image can contain multiple snapshots of the image’s history

Test Bed Setting: QCOW2

3.1.2.3 Guest Caching Modes

The operating system maintains a page cache to improve the storage I/O performance. With the page cache, write operations to the storage system are considered completed after the data has been copied to the page cache. Read operations can be satisfied from the page cache if the data requested is in the cache. The page cache is copied to permanent storage using fsync. Direct I/O requests bypass the page cache. In the KVM environment, both the host and guest operating systems can maintain their own page caches, resulting in two copies of data in memory.

The following caching modes are supported for KVM guests:
  • Writethrough: I/O from the guest is cached on the host but written through to the physical medium. This mode is slower and prone to scaling problems. Best used for a small number of guests with lower I/O requirements. Suggested for guests that do not support a writeback cache (such as, Red Hat Enterprise Linux 5.5 and earlier), where migration is not needed.
  • Writeback (Selected): With caching set to writeback mode, both the host page cache and the disk write cache are enabled for the guest. Due to this, the I/O performance for applications running in the guest is good, but the data is not protected in a power failure. As a result, this caching mode is recommended only for temporary data where potential data loss is not a concern.
  • None: With caching mode set to none, the host page cache is disabled, but the disk write cache is enabled for the guest. In this mode, the write performance in the guest is optimal because write operations bypass the host page cache and go directly to the disk write cache. If the disk write cache is battery-backed, or if the applications or storage stack in the guest transfer data properly (either through fsync operations or file system barriers), then data integrity can be ensured. However, because the host page cache is disabled, the read performance in the guest would not be as good as in the modes where the host page cache is enabled, such as writethrough mode.
  • Unsafe: The host may cache all disk I/O, and sync requests from guest are ignored.
Caching mode None is recommended for remote NFS storage, because direct I/O operations (O_DIRECT) perform better than synchronous I/O operations (with O_SYNC). Caching mode None effectively turns all guest I/O operations into direct I/O operations on the host, which is the NFS client in this environment. Moreover, it is the only option to support migration.

Note:

For Test Bed Setting, set Caching Mode to Writeback.
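
In libvirt, the disk image format and caching mode are attributes of the disk's `driver` element in the domain XML. The following sketch builds that element for the test bed values (QCOW2, writeback). The function name is hypothetical, and the validation list is limited to the four modes described in this section.

```python
import xml.etree.ElementTree as ET

# Caching modes described in this section.
CACHE_MODES = {"writethrough", "writeback", "none", "unsafe"}

def disk_driver_element(image_format: str = "qcow2",
                        cache_mode: str = "writeback") -> ET.Element:
    """Build the <driver> element for a disk in a libvirt domain definition."""
    if cache_mode not in CACHE_MODES:
        raise ValueError(f"unsupported cache mode: {cache_mode}")
    return ET.Element(
        "driver",
        {"name": "qemu", "type": image_format, "cache": cache_mode},
    )

element = disk_driver_element()
print(ET.tostring(element, encoding="unicode"))
```

The element would sit inside the `disk` element of the guest definition, alongside the source and target of the disk.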
3.1.2.4 Memory Tuning Parameters

Swappiness

The swappiness parameter controls the tendency of the kernel to move processes out of physical memory and onto the swap disk. Since disks are much slower than RAM, this can lead to slower response times for system and applications if processes are too aggressively moved out of memory.
  • vm.swappiness = 0: The kernel swaps only to avoid an out of memory condition.
  • vm.swappiness = 1: For kernel version 3.5 and later, as well as kernel version 2.6.32-303 and later, the minimum amount of swapping without disabling it entirely.
  • vm.swappiness = 10: This value is recommended to improve performance when sufficient memory exists in a system.
  • vm.swappiness = 60: Default
  • vm.swappiness = 100: The kernel swaps aggressively

Note:

For Test Bed Setting, set vm.swappiness to 10.

Kernel Same Page Merging

Kernel Same-page Merging (KSM), used by the KVM hypervisor, allows KVM guests to share identical memory pages. These shared pages are usually common libraries or other identical, high-use data. KSM allows for greater guest density of identical or similar guest operating systems by avoiding memory duplication. KSM enables the kernel to examine two or more already running programs and compare their memory. If any memory regions or pages are identical, KSM reduces multiple identical memory pages to a single page. This page is then marked copy-on-write. If the contents of the page is modified by a guest virtual machine, a new page is created for that guest.

This is useful for virtualization with KVM. When a guest virtual machine is started, it only inherits the memory from the host qemu-kvm process. Once the guest is running, the contents of the guest operating system image can be shared when guests are running the same operating system or applications. KSM allows KVM to request that these identical guest memory regions be shared.

KSM provides enhanced memory speed and utilization. With KSM, common process data is stored in cache or in main memory. This reduces cache misses for the KVM guests, which can improve performance for some applications and operating systems. Secondly, sharing memory reduces the overall memory usage of guests, which allows for higher densities and greater utilization of resources.

The following two services control KSM:
  • KSM Service: When the KSM service is started, KSM shares up to half of the host system's main memory. Start the KSM service to enable KSM to share more memory.
  • KSM Tuning Service: The ksmtuned service loops and adjusts KSM. The ksmtuned service is notified by libvirt when a guest virtual machine is created or destroyed.

Note:

For Test Bed Setting, set the KSM service to active and ensure the ksmtuned service is running on KVM hosts.

Zone Reclaim Mode

When an operating system allocates memory to a NUMA node, but the NUMA node is full, the operating system reclaims memory for the local NUMA node rather than immediately allocating the memory to a remote NUMA node. The performance benefit of allocating memory to the local node outweighs the performance drawback of reclaiming the memory. However, in some situations reclaiming memory decreases performance to the extent that the opposite is true. In other words, in these situations, allocating memory to a remote NUMA node generates better performance than reclaiming memory for the local node.

A guest operating system causes zone reclaim in the following situations:
  • When you configure the guest operating system to use huge pages.
  • When you use KSM to share memory pages between guest operating systems.

Configuring huge pages and running KSM are both best practices for KVM environments. Therefore, to optimize performance in KVM environments, it is recommended to disable zone reclaim.

Note:

For Test Bed Setting, disable zone reclaim.

Transparent Huge Pages

Transparent huge pages (THP) automatically optimize system settings for performance. By allowing all free memory to be used as cache, performance is increased.

Note:

For Test Bed Setting, enable THP.
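
The test bed memory settings in this section can be verified on a Linux KVM host by reading the corresponding kernel files. The sketch below is one possible check: it treats THP as enabled whenever the active mode is not `never`, which is an assumption, and it does not cover the KSM and ksmtuned services, which are checked through the host's service manager. The function and variable names are hypothetical.

```python
from typing import Callable, Dict

# Expected test bed values for plain sysctl-style files.
EXPECTED = {
    "/proc/sys/vm/swappiness": "10",
    "/proc/sys/vm/zone_reclaim_mode": "0",
}
THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"

def read_file(path: str) -> str:
    """Read a kernel setting file and strip the trailing newline."""
    with open(path) as handle:
        return handle.read().strip()

def check_host(read: Callable[[str], str] = read_file) -> Dict[str, bool]:
    """Return, for each setting, whether it matches the test bed value."""
    results = {path: read(path) == want for path, want in EXPECTED.items()}
    # The THP file marks the active mode in brackets, e.g. "[always] madvise never".
    results[THP_PATH] = "[never]" not in read(THP_PATH)
    return results

# Example with canned values instead of a live host.
sample = {
    "/proc/sys/vm/swappiness": "10",
    "/proc/sys/vm/zone_reclaim_mode": "0",
    THP_PATH: "[always] madvise never",
}
print(check_host(sample.__getitem__))
```

On a real host, calling `check_host()` with no argument reads the live values.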

3.2 Benchmark Testing

The way the testing was performed and the benchmark test set-up is the same for each benchmark infrastructure. Each section describes the common set-up and procedures used to benchmark, and then the specific results for the benchmarks are provided for each benchmark infrastructure.

3.2.1 DA-MP Relay Benchmark

This benchmarking case illustrates conditions for an overload of a DSR DA-MP.

3.2.1.1 Topology
The following figure illustrates the logical topology used for this testing. Diameter traffic is generated by an MME simulator and sent to an HSS simulator.

Figure 3-1 DA-MP Relay Testing Topology



The dsr.cpu utilization can be further increased to higher levels by means of configuration changes, with DOC/CL1/CL2 discards set to 0 and multi-queuing enabled on all hosts. Note that with this configuration all discards occur in a single step at CL3 for all incoming and outgoing messages.

3.2.1.2 Message Flow

The following figure illustrates the Message sequence for this benchmark case.

Figure 3-2 DA-MP Relay Message Sequence



Table 3-1 Relay Performance Benchmarking

Scenario | Call Flow Model | DSR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
Relay | 100% Relay | 288K | 12 vCPU (Regular) | 30K_MPS | 2.0 K | 23% | 31%
Relay (with multiqueue enabled on all hosts and DOC/CL1/CL2 discards set to 0) | 100% Relay | 576K | 12 vCPU (Regular) | 40K_MPS_FABR | 2.0 K | 36% | 32%
Relay | 100% Relay | 560K | 18 vCPU (Large) | 35K_MPS | 2.0 K | 32% | 28%
3.2.1.3 Indicative Alarms or Events

During benchmark testing, the following alarms or events were observed when the DA-MP reached congestion.

Table 3-2 DA-MP Relay Alarms or Events

Number | Severity | Server | Name | Description
22008 | Info | DA-MP | Orphan Answer Response Received | An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22201 | Minor | DA-MP | MpRxAllRate | DA-MP ingress message rate threshold crossed.
22225 | Minor | DA-MP | MpRxDiamAllLen | DA-MP diameter average ingress message length threshold crossed.

3.2.2 RBAR Benchmark

Range Based Address Resolution (RBAR) is a DSR-enhanced routing application that allows the routing of Diameter end-to-end transactions based on Diameter Application ID, Command Code, Routing Entity Type, and Routing Entity Addresses (range and individual) as a Diameter Proxy Agent.

A Routing Entity can be:
  • A User Identity:
    • International Mobile Subscriber Identity (IMSI)
    • Mobile Subscriber Integrated Services Digital Network (Number) (MSISDN)
    • IP Multimedia Private Identity (IMPI)
    • IP Multimedia Public Identity (IMPU)
  • An IP Address associated with the User Equipment:
    • IPv4 (based upon the full 32-bit value in the range of 0x00000000 to 0xFFFFFFFF)
    • IPv6-prefix (1 to 128 bits)
  • A general-purpose data type: UNSIGNED16 (16-bit unsigned value)
Routing resolves to a Destination that can be configured with any combination of a Realm and Fully Qualified Domain Name (FQDN); Realm-only, FQDN-only, or Realm and FQDN.

When a message successfully resolves to a destination, RBAR replaces the destination information (Destination-Host and/or Destination-Realm) in the ingress message with the corresponding values assigned to the resolved destination and forwards the message to the (integrated) Diameter Relay Agent for egress routing into the network.
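
The resolution step described above can be illustrated with a small range lookup. This is a conceptual sketch only and not the DSR implementation: the IMSI ranges, realms, host names, and function names are hypothetical, and the message is modeled as a plain dictionary of AVP names.

```python
import bisect
from typing import NamedTuple, Optional

class Destination(NamedTuple):
    realm: Optional[str]
    fqdn: Optional[str]

# Hypothetical, non-overlapping IMSI ranges sorted by start value.
RANGES = [
    (310150000000000, 310150499999999,
     Destination("east.example.com", "hss1.east.example.com")),
    (310150500000000, 310150999999999,
     Destination("west.example.com", None)),  # Realm-only destination
]
STARTS = [start for start, _, _ in RANGES]

def resolve(imsi: int) -> Optional[Destination]:
    """Find the range containing the IMSI, if any."""
    index = bisect.bisect_right(STARTS, imsi) - 1
    if index < 0:
        return None
    _, end, destination = RANGES[index]
    return destination if imsi <= end else None

def apply_destination(message: dict, destination: Destination) -> dict:
    """Replace destination AVPs with the values of the resolved destination."""
    updated = dict(message)
    if destination.realm is not None:
        updated["Destination-Realm"] = destination.realm
    if destination.fqdn is not None:
        updated["Destination-Host"] = destination.fqdn
    return updated

request = {"Destination-Realm": "old.example.com", "User-Name": "310150000000123"}
found = resolve(int(request["User-Name"]))
if found is not None:
    print(apply_destination(request, found))
```

A sorted list with binary search keeps each lookup logarithmic in the number of configured ranges; individual addresses could be handled as single-value ranges in the same structure.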

3.2.2.1 Topology
The following figure illustrates the logical topology used for this testing. Diameter traffic is generated by an MME simulator and sent to an HSS simulator.

Figure 3-3 RBAR Testing Topology



3.2.2.2 Message Flow
The following figure illustrates the Message sequence for this benchmark case.

Figure 3-4 DA-MP RBAR Message Sequence



Table 3-3 RBAR Performance Benchmarking

Scenario | Call Flow Model | DSR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
RBAR | 100% RBAR | 256K | 12 vCPU (Regular) | 30K_MPS | 2.0 K | 30% | 30%
3.2.2.3 Indicative Alarms or Events
During benchmark testing, the following alarms or events were observed when the DA-MP reached congestion.

Table 3-4 DA-MP RBAR Alarms or Events

Number | Severity | Server | Name | Description
22008 | Info | DA-MP | Orphan Answer Response Received | An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22225 | Minor | DA-MP | MpRxDiamAllLen | DA-MP diameter average ingress message length threshold crossed.

3.2.3 Full Address Based Resolution (FABR - SDS) Capacity

The FABR application adds a Database Processor (DP) server to perform database lookups with a user defined key (IMSI, MSISDN, or Account ID and MSISDN or IMSI). If the key is contained in the database, the DP returns the realm and FQDN associated with that key. The returned realm and FQDN can be used by the DSR Routing layer to route the connection to the desired endpoint. Since there is additional work done on the DA-MP to query the DP, running the FABR application has an impact on the DA-MP performance. This section contains the performance of the DA-MP while running FABR as well as benchmark measurements on the DP itself.
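
The DP lookup described above is an exact-match query on a user key. The following sketch models it with an in-memory dictionary; it is a conceptual illustration rather than the SDS implementation, and the keys, realms, host names, and function name are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical routing entities keyed by (key type, key value).
ROUTING_DB: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("IMSI", "310150123456789"): ("example.com", "hss1.example.com"),
    ("MSISDN", "15551230000"): ("example.com", "hss2.example.com"),
}

def dp_lookup(key_type: str, key: str) -> Optional[Tuple[str, str]]:
    """Return (realm, FQDN) for the key, or None if it is not provisioned."""
    return ROUTING_DB.get((key_type, key))

print(dp_lookup("IMSI", "310150123456789"))
print(dp_lookup("IMSI", "999990000000000"))  # not provisioned
```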

3.2.3.1 Topology

Figure 3-5 SDS DP Testing Topology



SDS DB Details

The SDS database was first populated with subscribers. This population simulates real-world scenarios likely encountered in a production environment and ensures the database is of substantial size to be queried against.
  • SDS DB Size: 780 Million Routing Entities (260 Million Subscribers having 2 IMSI, 1 MSISDN)
  • AVP Decoded: User-Name for IMSI
3.2.3.2 Message Flow

Figure 3-6 SDS DP Message Sequence



Table 3-5 SDS DP Performance Benchmarking

Scenario | Call Flow Model | DP MPS Achieved | DA-MP MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
FABR | 100% FABR | 80K | 160K | 12 vCPU (Regular) | 30K_MPS | 2.0 K | 26% | 44%
3.2.3.3 Indicative Alarms or Events

Table 3-6 SDS DP Alarms or Events

Number | Severity | Server | Name | Description
19814 | Info | DA-MP | Communication Agent Peer has not responded to heartbeat | Communication Agent Peer has not responded to heartbeat.
19825 | Major, Critical | DA-MP | Communication Agent Transaction Failure Rate | The number of failed transactions during the sampling period has exceeded configured thresholds.
19832 | Info | DA-MP | Communication Agent Reliable Transaction Failed | Communication Agent Reliable Transaction Failed.
22008 | Info | DA-MP | Orphan Answer Response Received | An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22225 | Minor | DA-MP | MpRxDiamAllLen | DA-MP diameter average ingress message length threshold crossed.
22606 | Info | DA-MP | Database or DB connection error | FABR application received service notification indicating Database (DP) or DB connection (COM Agent) Errors (DP timeout, errors or COM Agent internal errors) for the sent database query.
31000 | Critical | DA-MP | S/W Fault | Program impaired by s/w fault.

3.2.4 Full Address Based Resolution (FABR-UDR) Capacity

FABR is a DSR application that provides an enhanced DSR routing capability to enable network operators to resolve the designated Diameter server (IMS HSS, LTE HSS, PCRF, OCS, OFCS, and AAA) addresses based on individual user identity addresses in the incoming Diameter request messages. It offers enhanced functionalities with User Data Repository (UDR), which is used to store subscriber data. FABR routes the message as a Diameter Proxy Agent based on request message parameter content.

FABR uses the services of the Diameter Plug-In for sending Diameter messages to and receiving them from the network. It uses Communication Agent to interact with the off-board data repository (UDR) for address resolution. This section contains the performance of the DA-MP while running FABR.

3.2.4.1 Topology

Figure 3-7 FABR with UDR Testing Topology



UDR DB Details

The UDR database was first populated with subscribers. This population simulates real-world scenarios likely encountered in a production environment and ensures the database is of substantial size to be queried against.
  • UDR DB Size: Tested with 40 Million records
  • AVP Decoded: User-Name for IMSI
The following UDR profile was used for benchmarking.

Table 3-7 UDR Profile

vCPU | RAM (GB) | HDD (GB)
18 | 70 | 400
3.2.4.2 Message Flow

Figure 3-8 FABR with UDR Message Sequence



Table 3-8 FABR with UDR Performance Benchmarking

Scenario | Call Flow Model | DSR MPS Achieved | DA-MP Flavor | DA-MP Profile | Avg Msg Size | CPU Peak | RAM Utilization Peak
FABR + Relay | 70% FABR + 30% Relay | 288K (18K/MP) | 12 vCPU (Regular) | 30K_MPS | 2.0 K | 34% | 46%
3.2.4.3 Indicative Alarms or Events

Table 3-9 FABR with UDR Alarms or Events

Number | Severity | Server | Name | Description
22008 | Info | DA-MP | Orphan Answer Response Received | An Answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22225 | Minor | DA-MP | MpRxDiamAllLen | DA-MP diameter average ingress message length threshold crossed.

3.2.5 vSTP MP

The vSTP-MP server type is a virtualized STP that supports M2PA, M3UA, and TDM. It can be deployed either with other DSR functionality as a combined DSR or vSTP, or as a standalone virtualized STP without any DSR functionality.

3.2.5.1 vSTP MP Benchmarking

The following table describes the feature-wise vSTP MP benchmarking.

Table 3-10 Feature-wise vSTP MP Benchmarking

Scenario | Call Flow Model | vSTP MPS Achieved | SS7-MP Flavor | CPU Peak (%)
SFAPP + MNP + GTT | 2K (MNP + SFAPP) + 6K GTT | 18K/MP | 8 vCPU | 22
SFAPP + MNP + GFLEX + GTT | 2K (MNP + SFAPP) + 1K GFLEX + 4K GTT | 18K/MP | 8 vCPU | 19
TIF + GTT | 5K MNP + 10K GTT | 20K/MP | 8 vCPU | 32
vMNP + GTT | 5K MNP + 10K GTT | 20K/MP | 8 vCPU | 29
GFLEX + GTT | 5K MNP + 10K GTT | 20K/MP | 8 vCPU | 46
INPQ + GTT | 5K MNP + 10K GTT | 20K/MP | 8 vCPU | 9
GTT + MTP Routing with MTP screening (M2PA & M3UA) | 16K MPS | 16K/MP | 8 vCPU | 45
GTT + MTP Routing | 20K MPS | 20K/MP | 8 vCPU | 51
vEIR | 5K | 5K/MP | 8 vCPU | 21
Elynx (E1/T1 Card) – GTT Relay | 10K TDM + 10K GTT | 20K/MP | 8 vCPU | 29
ENUM | 5K | 5K/MP | 8 vCPU | 11
DNS | 10K | 10K/MP | 8 vCPU | 2
vSTP – Home SMS | MO-FSM AllowList + BlockList Traffic (10K + 10K) | 10K/SS7 (2 MPs), 20K/Proxy MP | 8 vCPU | 29

Note:

  • For ENUM, a new vENUM-MP is introduced. vENUM sends messages to UDR over the ComAgent interface.
  • Default timer values are supported when vSTP is configured to operate at 10K MPS for each MP.
  • When vSTP is configured to operate at 20K MPS, the t1Timer to t5Timer values have to be updated. For information about the updated timer values, see MMI API Specification.

3.2.6 Policy DRA (PDRA) Benchmarking

The Policy DRA (PDRA) application adds two additional database components, the SBR (session) (SBR-s) and the SBR (binding) (SBR-b). The DA-MP performance was also measured since the PDRA application puts a different load on the DA-MP than either Relay or FABR traffic. There are two sizing metrics when determining how many SBR-s or SBR-g server groups (that is, horizontal scaling units) are required. The first is the MPS traffic rate seen at the DA-MPs. This is the metric that is benchmarked in this document. The second is the number of bindings (SBR-b) or sessions (SBR-s) that can be supported. This session or binding capacity is set primarily by the memory sizing of the VM and is fixed at a maximum of 16 million per SBR from the DSR 8.3 release. The number of bindings and sessions required for a given network is customer dependent, but a good starting place for engineering is to assume:
  • The number of bindings is equal to the number of subscribers supported by the PCRFs.
  • The number of sessions is equal to number of subscribers times the number of IPCAN sessions required on average for each subscriber. For instance, a subscriber might have one IPCAN session for LTE, and one for VoLTE.

    Note:

    The number of sessions is equal to or greater than the number of bindings.
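
The capacity side of this sizing can be sketched as a short calculation. The following Python example is only an illustration of the starting assumptions listed above: the function name and its parameters are hypothetical, and it assumes that the 16 million limit applies to each SBR independently and that capacity adds up linearly across SBRs. It does not account for the MPS metric, redundancy, or growth headroom.

```python
import math

# Per-SBR session or binding limit stated in this section (DSR 8.3 onward).
MAX_RECORDS_PER_SBR = 16_000_000


def estimate_sbr_capacity(subscribers, avg_ipcan_sessions_per_subscriber):
    """Hypothetical helper: starting-point binding and session estimate.

    bindings = subscribers supported by the PCRFs
    sessions = subscribers * average IPCAN sessions per subscriber
    """
    bindings = subscribers
    sessions = math.ceil(subscribers * avg_ipcan_sessions_per_subscriber)
    return {
        "bindings": bindings,
        "sessions": sessions,
        # Assumption: capacity scales linearly with the number of SBRs.
        "min_sbr_b": math.ceil(bindings / MAX_RECORDS_PER_SBR),
        "min_sbr_s": math.ceil(sessions / MAX_RECORDS_PER_SBR),
    }


if __name__ == "__main__":
    # Example: 20 million subscribers, each with one LTE and one VoLTE session.
    print(estimate_sbr_capacity(20_000_000, 2.0))
```

The result is a lower bound from capacity alone. The MPS traffic rate may require more SBRs than this estimate suggests.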
3.2.6.1 Topology

Figure 3-9 SBR Testing Topology



3.2.6.2 Message Flow

Figure 3-10 PDRA Message Sequence



The following table shows the call model used for the testing. The message distribution is Oracle’s baseline for benchmarking and may differ significantly from customer distributions, depending on factors such as the penetration of LTE support in comparison with VoLTE support. The Traffic Details columns show the configured PDRA options. For more details on these options, see Oracle Communications Diameter Signaling Router Policy and Charging Application User Guide.

Table 3-11 PDRA Test Call Model

Messages Traffic Details
Message Count Distribution Message Distribution
CCR-I, CCA-I 1 7.14% Gx with MSISDN Alternative Key, Gx Topology Hiding 100%
CCR-U, CCA-U 3 21.42% Gx Topology Hiding 100%
CCR-T, CCA-T 1 7.14% Gx Topology Hiding 100%
Gx RAR, RAA 3 21.42% Gx Topology Hiding 100%
AAR, AAA Initial 2 14.29% Rx Topology Hiding 100%
STR, STA 2 14.29% Rx Topology Hiding 100%
Rx RAR, RAA 2 14.29% Rx Topology Hiding 100%
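
The Distribution column of the call model follows directly from the message counts (14 message pairs in total). As a minimal sketch, the following Python example recomputes the percentages from the counts in Table 3-11; the names used are illustrative only. The recomputed share for a count of 3 rounds to 21.43%, so the 21.42% shown in the table appears to be truncated rather than rounded.

```python
# Message pairs and counts taken from Table 3-11.
CALL_MODEL = [
    ("CCR-I, CCA-I", 1),
    ("CCR-U, CCA-U", 3),
    ("CCR-T, CCA-T", 1),
    ("Gx RAR, RAA", 3),
    ("AAR, AAA Initial", 2),
    ("STR, STA", 2),
    ("Rx RAR, RAA", 2),
]


def message_distribution(call_model):
    """Return each message pair's share of the call model, in percent."""
    total = sum(count for _, count in call_model)
    return {name: round(100.0 * count / total, 2) for name, count in call_model}


if __name__ == "__main__":
    for name, share in message_distribution(CALL_MODEL).items():
        print(f"{name}: {share}%")
```

The same function can be applied to a customer-specific call model to see how far it departs from this baseline.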

Table 3-12 PDRA Performance Benchmarking

Scenario Call Flow Model SBR MPS Achieved DA-MP Flavor DA-MP Profile Avg Msg Size DA-MP CPU Peak RAM Utilization Peak
Single Server group (1 SBR(s), 1 SBR(b)) 100% PDRA 50K 12 vCPU (Regular) 30K_MPS 600 52% 29%
Single Server group (4 SBR(s), 4 SBR(b)) 100% PDRA 200K 12 vCPU (Regular) 30K_MPS 648 56% 34%
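
The two rows of Table 3-12 suggest that the achieved MPS grew in proportion to the number of SBR(s) and SBR(b) servers (50K with one of each, 200K with four of each). The following Python sketch turns that observation into a rough estimate. It assumes that this linear scaling continues to hold, which the benchmark does not guarantee, and the helper names are hypothetical.

```python
import math

# (SBR(s) count, SBR(b) count, MPS achieved) taken from Table 3-12.
BENCHMARK_ROWS = [
    (1, 1, 50_000),
    (4, 4, 200_000),
]


def mps_per_sbr_pair(rows):
    """Return the lowest MPS per SBR(s)/SBR(b) pair seen across the rows."""
    return min(mps / min(sbr_s, sbr_b) for sbr_s, sbr_b, mps in rows)


def sbr_pairs_for_mps(target_mps, rows=BENCHMARK_ROWS):
    """Hypothetical helper: SBR(s)/SBR(b) pairs needed for target_mps.

    Assumption: throughput keeps scaling linearly with the number of pairs.
    """
    return math.ceil(target_mps / mps_per_sbr_pair(rows))


if __name__ == "__main__":
    print(mps_per_sbr_pair(BENCHMARK_ROWS))
    print(sbr_pairs_for_mps(120_000))
```

An estimate of this kind covers the traffic rate only. The session and binding capacity of each SBR must be checked separately.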
3.2.6.3 Indicative Alarms or Events

Table 3-13 PDRA Alarms or Events

Number Severity Server Name Description
19814 Info DA-MP Communication Agent Peer has not responded to heartbeat Communication Agent Peer has not responded to heartbeat.
19825 Major, Critical DA-MP Communication Agent Transaction Failure Rate The number of failed transactions during the sampling period has exceeded configured thresholds.
19832 Info DA-MP Communication Agent Reliable Transaction Failed Communication Agent Reliable Transaction Failed
22008 Info DA-MP Orphan Answer Response Received An answer response was received for which no pending request transaction existed resulting in the Answer message being discarded.
22328 Minor DA-MP IcRate Connection ingress message rate threshold crossed
22704 Info DA-MP Communication Agent Error Policy and Charging server to SBR server communication failure
22705 Info DA-MP SBR Error Response Received Policy and Charging server received response from SBR server indicating SBR errors
22714 Info SBR SBR RAR Initiation Error SBR encountered an error while processing PCA initiated RAR requests
22716 Info SBR SBR Audit Statistics Report SBR Audit Statistics Report
22718 Info DA-MP Binding Not Found for Binding Dependent Session Initiate Request Binding record is not found for the configured binding keys in the binding dependent session-initiation request message
22741 Info DA-MP Failed to route PCA generated RAR RAA with Unable To Deliver (3002) error is received at PCA for the locally generated RAR
31232 Critical, Minor DA-MP HA Late Heartbeat Warning High availability server has not received a message on specified path within the configured interval
31236 Major DA-MP HA Link Down High availability TCP link is down

3.2.7 Diameter Security Application (DSA) Benchmarking

The Diameter Security Application (DSA) applies counter measures to ingress messages received from external foreign networks and to egress messages sent to external foreign networks. Different counter measure profiles can be created for different IPX providers or roaming partners by enabling or disabling counter measures individually for the Diameter peers of each IPX provider or roaming partner. The DSA application is enabled on the DA-MP and uses vUDR to store context information.

3.2.7.1 Topology

Figure 3-11 DSA Testing Topology



The following table shows the stateful and stateless counter measure application configuration and the modes of operation used in the benchmarking tests.

Table 3-14 Stateful and Stateless Counter Measures

Application Configuration Data
Table Name Count of Configured Entries
AppCmdCst_Config 2
AppIdWL_Config 1
AVPInstChk_Config 48
Foreign_WL_Peers_Cfg_Sets 14
MCC_MNC_List 11
MsgRateMon_Config 1
Realm_List 6
Security_Countermeasure_Config 19
SpecAVPScr_Config 1
System_Config_Options 1
TimeDistChk_Config 2000
TTL_Config 5
VplmnORCst_Config 1
TimeDistChk_Country_Config 2
TimeDistChk_Exception_List 0
TimeDistChk_Continent_Config 15
VplmnORCst_Config 1
RealmIMSICst_Config 210
Exception_Rule_Config 0
All Exception Types Table (IMSI_Exception_Config, MCC_MNC_Exception_Config, Origin_Host_Exception_Config, Realm_Exception_Config, VPLMN_ID_Exception_Config) 0

General Options Settings
Options Values
Opcodes Accounting Disabled
Max. UDR Queries per Message 5
Max. Size of Application State 4800
Logging of Vulnerable Messages Enabled

Application Threads
Request 6
Answer 4
SbrEvent 4
AsyncEvent 2

Note:

The following error is received during a performance run if the call rate exceeds 1.7K for each DSA MP:
UDR Internal Error: Create record failed. Error Code = SendError
This occurs because the ComAgent connection times out when the time to live (TTL) expires:
Communication Agent Reliable Transaction Failed}  .. GN_INFO/INF Failure reason = Time to live limit exceeded
To avoid this, run the following commands from the active DSR NOAM before running performance traffic:
iset -fvalue=400 ComAgtConfigParams where "name='IntraNe Maximum Timeout Value'"
iset -fvalue=3 ComAgtConfigParams where "name='Maximum Number Of Retries'"

Table 3-15 DSA Performance Benchmarking

Counter Measure (CM) Call Flow Model DSR MPS Achieved (Per DA-MP) DA-MP Flavor UDR Flavor DA-MP Profile Avg Msg Size CPU Peak RAM Utilization Peak
Previous_Location_Check 15% Vulnerable and 85% Non Vulnerable traffic 9.2K 12 vCPU (Regular) 18 vCPU 30K_MPS 2.0K 29% 63%
Time_Distance_Check 15% Vulnerable and 85% Non Vulnerable traffic 9.8K 12 vCPU (Regular) 18 vCPU 30K_MPS 2.0K 34% 63%
Source_Host_Validation_Hss 15% Vulnerable and 85% Non Vulnerable traffic 9.6K 12 vCPU (Regular) 18 vCPU 30K_MPS 2.0K 34% 62%
Source_Host_Validation_Mme 15% Vulnerable and 85% Non Vulnerable traffic 10K 12 vCPU (Regular) 18 vCPU 30K_MPS 2.0K 33% 63%
Message_Monitoring_Rate 15% Vulnerable and 85% Non Vulnerable traffic 10K 12 vCPU (Regular) 18 vCPU 30K_MPS 2.0K 30% 63%
Session_Integrity_Validation_Chk 15% Vulnerable and 85% Non Vulnerable traffic 10K 12 vCPU (Regular) 18 vCPU 30K_MPS 2.0K 39% 63%
All Stateful 15% Vulnerable and 85% Non Vulnerable traffic 7.02K 12 vCPU (Regular) 18 vCPU 30K_MPS 2.0K 27% 51%
All Stateful + All stateless 15% Vulnerable and 85% Non Vulnerable traffic 5.25K 12 vCPU (Regular) 18 vCPU 30K_MPS 2.0K 23% 54%
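
The per DA-MP rates in Table 3-15 can be used for a first estimate of how many DA-MPs a DSA deployment needs. The following Python sketch is a simple illustration: the function name is hypothetical, and it assumes that the benchmarked rate for each counter measure applies unchanged to the target network (same call flow model, flavors, and message size). It does not add redundancy or headroom.

```python
import math

# DSR MPS achieved per DA-MP, taken from Table 3-15.
DSA_MPS_PER_DA_MP = {
    "Previous_Location_Check": 9_200,
    "Time_Distance_Check": 9_800,
    "Source_Host_Validation_Hss": 9_600,
    "Source_Host_Validation_Mme": 10_000,
    "Message_Monitoring_Rate": 10_000,
    "Session_Integrity_Validation_Chk": 10_000,
    "All Stateful": 7_020,
    "All Stateful + All stateless": 5_250,
}


def da_mps_needed(target_mps, counter_measure):
    """Hypothetical helper: DA-MPs needed to carry target_mps of DSA traffic.

    Assumption: each DA-MP sustains the benchmarked rate for the scenario.
    """
    per_mp = DSA_MPS_PER_DA_MP[counter_measure]
    return math.ceil(target_mps / per_mp)


if __name__ == "__main__":
    for scenario in ("Previous_Location_Check", "All Stateful + All stateless"):
        print(scenario, da_mps_needed(50_000, scenario))
```

Enabling all stateful and stateless counter measures roughly halves the per DA-MP rate compared with a single counter measure, so the enabled counter measure set is the main input to this estimate.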