Oracle® Communications Converged Application Server Technical Product Description
Part Number E17655-03
This chapter describes the Oracle Communications Converged Application Server cluster architecture.
Converged Application Server provides a multi-tier cluster architecture in which a stateless “Engine Tier” processes all traffic and distributes all transaction and session state to a “SIP Data Tier.” The SIP data tier consists of one or more partitions, labeled as “Data Nodes” in Figure 4-1 below. Each partition may contain one or more replicas of all state assigned to it and may be distributed across multiple physical servers or server blades. A standard load balancing appliance is used to distribute traffic across the Engines in the cluster. The load balancer does not need to be SIP-aware; there is no requirement that it support affinity between Engines and SIP dialogs or transactions. However, SIP-aware load balancers can provide higher performance by maintaining a client's affinity to a particular engine tier server.
Figure 4-1 shows an example Converged Application Server cluster.
In some cases, it is advantageous to run SIP data tier instances and engine tier instances on the same physical host. This is particularly true when the physical servers or server blades in the cluster are based on Symmetric Multiprocessing (SMP) architectures, as is now common for platforms such as Advanced Telecom Computing Architecture (ATCA). Co-location is not required, however; SIP data tier and engine instances can each be placed on a different physical server or server blade.
There is no arbitrary limit to the number of engines, partitions or physical servers within a cluster, and there is no fixed ratio of engines to partitions. When dimensioning the cluster, however, a number of factors should be considered, such as the typical amount of memory required to store the state for a given session and the increasing overhead of having more than two replicas within a partition.
Converged Application Server has demonstrated linear scalability from 2 to 16 hosts (up to 32 CPUs) in both laboratory and field tests and in commercial deployments. This characteristic is likely to be evident in larger clusters as well, up to the ability of the cluster interconnect (or alternatively the load balancer) to support the total traffic volume. Gigabit Ethernet is recommended as a minimum for the cluster interconnect.
The Converged Application Server SIP data tier is an in-memory, peer-replicated store. The store also functions as a lock manager, whereby call state access follows a simple “library book” model (a call state can only be checked out by one SIP engine at a time).
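The “library book” model can be illustrated with a minimal sketch. The class and method names below (CallStateStore, lock_and_get, put_and_unlock) are hypothetical and are not part of the Converged Application Server API; the sketch assumes a single in-memory store and ignores replication and failure handling.

```python
class CallStateStore:
    """Toy in-memory store: each call state can be checked out by one engine at a time."""

    def __init__(self):
        self._states = {}   # call_id -> stored call state
        self._owners = {}   # call_id -> engine that currently holds the lock

    def lock_and_get(self, call_id, engine_id):
        """Return (True, state) and take the lock, or (False, None) if another engine holds it."""
        owner = self._owners.get(call_id)
        if owner is not None and owner != engine_id:
            return False, None
        self._owners[call_id] = engine_id
        return True, self._states.get(call_id)

    def put_and_unlock(self, call_id, engine_id, state):
        """Store the new state and release the lock; only the lock owner may do this."""
        if self._owners.get(call_id) != engine_id:
            raise PermissionError(f"{engine_id} does not hold the lock for {call_id}")
        self._states[call_id] = state
        del self._owners[call_id]
```

In this sketch, a second engine that requests a checked-out call state is refused until the first engine performs its put and unlock.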
The nodes in the SIP data tier are called replicas. To increase the capacity of the SIP data tier, the data is split evenly across a set of partitions. Each partition has a set of 1-8 replicas which maintain a consistent state. (Oracle recommends using no more than 3 replicas per partition.) The number of replicas in the partition is the replication factor.
Replicas can join and leave the partition. Any given replica serves in exactly one partition at a time. The total available call state storage capacity of the cluster is determined by the capacity of each partition.
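The following sketch shows one way call states could be spread evenly across partitions. The hash function (SHA-256 over the call ID) and the function names are assumptions made for illustration; this document does not specify the hashing scheme that Converged Application Server uses. The capacity helper assumes that replicas hold copies of the same data, so that adding replicas to a partition does not add capacity.

```python
import hashlib


def partition_for(call_id: str, partition_count: int) -> int:
    """Map a call ID to a partition index in the range [0, partition_count)."""
    if partition_count < 1:
        raise ValueError("a cluster needs at least one partition")
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % partition_count


def cluster_capacity(partition_capacities):
    """Assumed model: total capacity is the sum over partitions, not over replicas."""
    return sum(partition_capacities)
```

Because the mapping depends only on the call ID and the partition count, every engine that uses the same function selects the same partition for a given call.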
The call state store is peer-replicated. This means that clients perform all operations (reads and writes) against all replicas in a partition. Peer replication stands in contrast to the more common primary-secondary replication architecture, in which one node acts as the primary and all other nodes act as secondaries. With primary-secondary replication, clients talk directly only to the current primary node. Although peer replication is roughly equivalent to the synchronous primary-secondary architecture with respect to failover characteristics, it has lower average latency during normal operations. Lower latency is achieved because the system does not have to wait for the synchronous second hop incurred with primary-secondary replication.
Peer replication also provides better failover characteristics than asynchronous primary-secondary systems because there is no change propagation delay.
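The latency difference can be approximated with a simple model. The sketch below is an illustration under stated assumptions, not a measurement: it assumes that a peer-replicated client contacts all replicas in parallel and waits for the slowest one, and that a synchronous primary waits for its slowest secondary before answering the client. The function names and the round-trip figures in the usage note are hypothetical.

```python
def peer_latency_us(client_to_replica_us):
    """Parallel requests to every replica: the client waits for the slowest reply."""
    return max(client_to_replica_us)


def sync_primary_secondary_latency_us(client_to_primary_us, primary_to_secondary_us):
    """First hop to the primary, plus a synchronous second hop to the slowest secondary."""
    return client_to_primary_us + max(primary_to_secondary_us, default=0)
```

With hypothetical round trips of 400, 500, and 600 microseconds, the peer model gives 600 microseconds, while a primary reached in 500 microseconds that waits on secondaries at 400 and 600 microseconds gives 1100 microseconds.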
The operations supported by all replicas for normal operations are: “lock and get call state,” “put and unlock call state,” and “lock and get call states with expired timers.” The typical message processing flow is simple:
Lock and get the call state.
Process the message.
Put and unlock the call state.
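The three steps above can be sketched as follows. The Replica class and the handle_message function are hypothetical names used only for illustration. The sketch assumes that an engine sends each operation to every replica in the partition and omits timers, view changes, and failure handling.

```python
class Replica:
    """Toy replica holding call states and their lock owners."""

    def __init__(self):
        self.states = {}  # call_id -> call state
        self.locks = {}   # call_id -> engine holding the lock

    def lock_and_get(self, call_id, engine_id):
        # An unlocked call state defaults to the requesting engine as owner.
        if self.locks.get(call_id, engine_id) != engine_id:
            raise RuntimeError("call state is checked out by another engine")
        self.locks[call_id] = engine_id
        return self.states.get(call_id, {})

    def put_and_unlock(self, call_id, engine_id, state):
        self.states[call_id] = dict(state)  # each replica keeps its own copy
        self.locks.pop(call_id, None)


def handle_message(replicas, engine_id, call_id, message):
    # 1. Lock and get the call state from every replica in the partition.
    copies = [replica.lock_and_get(call_id, engine_id) for replica in replicas]
    state = dict(copies[0])
    # 2. Process the message (here: count it and remember the latest one).
    state["messages"] = state.get("messages", 0) + 1
    state["last"] = message
    # 3. Put and unlock the call state on every replica.
    for replica in replicas:
        replica.put_and_unlock(call_id, engine_id, state)
    return state
```

After two messages for the same call are handled, possibly by different engines, every replica in the sketch holds the same call state and no locks remain.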
Additional management functions deal with bootstrapping, registration, and failure cases.
The current set of replicas in a partition is referred to as the partition view. The view contains an increasing ID number. A view change signals that either a new replica has joined the partition or a replica has left the partition. View changes are communicated to engines when they perform an operation against the SIP data tier.
When faced with a view change, engine nodes performing a lock/get operation must immediately retry their operations with the new view. Each SIP engine schedules a 10ms interval for retrying the lock/get operation against the new view. In the case of a view change on a put request, the new view is inspected for added replicas (in the case that the view change derives from a replica join operation instead of replica failure or shutdown). If there is an added replica, that replica also gets the put request to ensure consistency.
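The engine-side handling of a view change can be sketched as shown below. All names (PartitionView, ViewChanged, lock_get_with_retry, replicas_added) and the retry limit are hypothetical; only the 10 ms retry interval is taken from the description above. The sketch assumes that a replica signals a view change by rejecting the operation and returning the new view.

```python
import time


class PartitionView:
    def __init__(self, view_id, replicas):
        self.view_id = view_id              # increasing ID number
        self.replicas = frozenset(replicas)


class ViewChanged(Exception):
    """Raised (in this sketch) when an operation was issued against an outdated view."""

    def __init__(self, new_view):
        super().__init__(f"partition view changed to {new_view.view_id}")
        self.new_view = new_view


def lock_get_with_retry(lock_get, view, retry_delay_s=0.010, max_attempts=5):
    """Retry a lock/get operation against each new view, 10 ms apart."""
    for _ in range(max_attempts):
        try:
            return lock_get(view), view
        except ViewChanged as change:
            view = change.new_view
            time.sleep(retry_delay_s)
    raise RuntimeError("lock/get did not succeed within the retry budget")


def replicas_added(old_view, new_view):
    """Replicas that joined; on a put, these must also receive the put request."""
    return new_view.replicas - old_view.replicas
```

For a put request, an empty result from replicas_added indicates that the view change was caused by a replica leaving, so no additional put is needed.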
An additional function of the SIP data tier is timer processing. The replicas set timers for the call states when call states perform put operations. Engines then poll for and “check out” timers for processing. Should an engine fail at this point, this failure is detected by the replica and the set of checked-out timers is forcefully checked back in and rescheduled so that another engine may check them out and process them.
As an optimization, if a given call state contains only timers required for cleaning up the call state, the SIP data tier itself expires the timers. In this special case, the call state is not returned to an engine tier for further processing, because the operation can be completed wholly within the SIP data tier.
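The timer check-out behavior described above can be sketched as follows. The TimerQueue class and its methods are hypothetical, and the sketch assumes that timers are simple (fire time, call ID) pairs held by a single replica. It does not model the cleanup-only optimization.

```python
class TimerQueue:
    """Toy replica-side timer store with check-out and forced check-in."""

    def __init__(self):
        self.pending = []       # list of (fire_time, call_id) not yet checked out
        self.checked_out = {}   # engine_id -> timers that engine is processing

    def schedule(self, fire_time, call_id):
        self.pending.append((fire_time, call_id))

    def check_out_expired(self, engine_id, now):
        """Hand all expired timers to the polling engine."""
        expired = sorted(t for t in self.pending if t[0] <= now)
        self.pending = [t for t in self.pending if t[0] > now]
        self.checked_out.setdefault(engine_id, []).extend(expired)
        return expired

    def complete(self, engine_id, timer):
        """The engine finished processing a timer."""
        self.checked_out[engine_id].remove(timer)

    def engine_failed(self, engine_id):
        """Forcefully check a failed engine's timers back in so another engine can take them."""
        self.pending.extend(self.checked_out.pop(engine_id, []))
```

In this sketch, a timer checked out by a failed engine becomes visible again to the next engine that polls.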
The SIP engine node clients perform failure detection for replicas, or for failed network connections to replicas.
During the course of message processing, an engine communicates with each replica in the current partition view. Normally all operations succeed, but occasionally a failure (a dropped socket or an invocation timeout) is detected. When a failure is detected the engine sends a “replica died” message to any of the remaining live replicas in the partition. (If there is no remaining live replica, the partition is declared “dead” and the engines cannot process calls hashing to that partition until the partition is restored). The replica that receives the failed replica notification proposes a new partition view that excludes the reportedly dead replica. All clients will then receive notification of the view change (see "Partition Views" for more information).
To handle partitioned network scenarios where one client cannot talk to the supposedly failed replica but another replica can, the “good” replica takes the reportedly failed replica offline, ensuring safe operation in the face of a network partition.
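The view proposal that follows a “replica died” report can be sketched as shown below. The function and exception names are hypothetical, and the sketch assumes that a view is simply an ID number plus a list of replica names. It does not model how the surviving replica takes the reported replica offline.

```python
class PartitionDead(Exception):
    """No live replica remains; calls hashing to this partition cannot be processed."""


def propose_view(view_id, replicas, failed_replica):
    """Return (new_view_id, surviving_replicas), excluding the reportedly dead replica."""
    survivors = [r for r in replicas if r != failed_replica]
    if not survivors:
        raise PartitionDead("partition has no remaining live replica")
    # The view ID increases with every view change.
    return view_id + 1, survivors
```

For example, reporting r2 as failed in view 7 of a partition containing r1, r2, and r3 yields view 8 containing r1 and r3.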
The major concerns with engine failure are:
They are in the middle of a lock/get or put/unlock operation during failure
They fail to unlock call states for messages they are currently processing
They abandon the set of timers that they are currently processing
Replicas are responsible for detecting engine failure. In the case of failures during lock/get and put/unlock operations, there is risk of lock state and data inconsistency between the replicas (data inconsistency in the case of put/unlock only). To handle this, the replicas break locks for call states if they are requested by another engine and the current lock owner is deemed dead. This allows progress with that call state.
Additionally, to deal with possible data inconsistency in scenarios where locks had to be broken, the call state is marked as “possibly stale”. When an engine evaluates the response of a lock/get operation, it wants to choose the best data. If any one replica reports that it has non-stale data, that data is used. Otherwise, the “possibly stale” data is used (it is only actually stale in the case that the single replica that had the non-stale version died in the intervening period).
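The lock-breaking and data selection rules can be sketched together. The names below (ReplicaEntry, lock_and_get, choose_best) are hypothetical. The sketch assumes that each replica already knows which engines are deemed dead, and it does not model how the “possibly stale” mark is cleared by a later put.

```python
class ReplicaEntry:
    """One replica's copy of a call state, with its lock owner."""

    def __init__(self, state, owner=None):
        self.state = state
        self.owner = owner
        self.possibly_stale = False


def lock_and_get(entry, engine_id, dead_engines):
    """Return (state, possibly_stale), or None if a live engine holds the lock."""
    if entry.owner is not None and entry.owner != engine_id:
        if entry.owner not in dead_engines:
            return None
        # Break the dead engine's lock and flag the data as possibly stale.
        entry.possibly_stale = True
    entry.owner = engine_id
    return entry.state, entry.possibly_stale


def choose_best(responses):
    """Prefer any non-stale response; otherwise fall back to possibly stale data."""
    for state, possibly_stale in responses:
        if not possibly_stale:
            return state
    return responses[0][0]
```

Consider an engine that died after its put/unlock reached only one of two replicas. The replica that received the put reports fresh data, the other reports possibly stale data after breaking the lock, and choose_best returns the fresh copy.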
Because of the automatic failure recovery of the replicated store design, failures don't affect call flow unless the failure is of a certain duration or magnitude.
In some cases, failure recovery causes “blips” in the system where the engine's handling of view changes causes message processing to back up temporarily. This is usually not dangerous, but may cause UAC or UAS retransmissions if the resulting backlog is substantial.
Catastrophic failure of a partition (whereby no replica remains) leaves a fraction of the cluster unable to process messages. If there are four partitions and one is lost, 25% of messages will be rejected. This situation resolves once any replica is returned to service in that partition.
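The 25% figure follows from simple arithmetic, assuming that calls hash uniformly across partitions. The helper below is a hypothetical illustration of that calculation.

```python
from fractions import Fraction


def rejected_fraction(total_partitions, dead_partitions):
    """Fraction of messages rejected, assuming calls are spread evenly over partitions."""
    if total_partitions < 1 or not 0 <= dead_partitions <= total_partitions:
        raise ValueError("invalid partition counts")
    return Fraction(dead_partitions, total_partitions)
```

With four partitions and one lost, the function returns 1/4, which matches the 25% stated above.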
A Converged Application Server domain may optionally deploy support for the Diameter base protocol and IMS Sh interface provider on engine tier servers, which then act as Diameter Sh client nodes. SIP Servlets deployed on the engines can use the profile service API to initiate requests for user profile data, or to subscribe to and receive notification of profile data changes. The Sh interface is also used to communicate between multiple IMS Application Servers.
One or more server instances may also be configured as Diameter relay agents, which route Diameter messages from the client nodes to a configured Home Subscriber Server (HSS) in the network, but do not modify the messages. Oracle recommends configuring one or more servers to act as relay agents in a domain. The relays simplify the configuration of Diameter client nodes, and reduce the number of network connections to the HSS. Using at least two relays ensures that a route can be established to an HSS even if one relay agent fails.
The relay agents included in Converged Application Server perform only stateless proxying of Diameter messages; messages are not cached or otherwise processed before delivery to the HSS.
Note: In order to support multiple HSSs, the 3GPP defines the Dh interface to look up the correct HSS. Converged Application Server does not provide a Dh interface application, and can be configured only with a single HSS.
Note that relay agent servers do not function as either engine or SIP data tier instances—they should not host applications, store call state data, maintain SIP timers, or even use SIP protocol network resources (sip or sips network channels).
Converged Application Server also provides a simple HSS simulator that you can use for testing Sh client applications. You can configure a Converged Application Server instance to function as an HSS simulator by deploying the appropriate application.
Figure 4-5 shows Diameter protocol handling within a Converged Application Server Diameter Domain.
Converged Application Server may be deployed in non-clustered configurations where session retention is not a relevant capability. The SIP signaling throughput of individual Converged Application Server instances will be substantially higher due to the elimination of the computing overhead of the clustering mechanism. Non-clustered configurations are appropriate for development environments or for cases where all deployed services are stateless and/or session retention is not considered important to the user experience (where users are not disturbed by failure of established sessions).
With Converged Application Server, you can upgrade a deployed SIP application to a newer version without losing existing calls being processed by the application. This type of application upgrade is accomplished by deploying the newer application version alongside the older version. Converged Application Server automatically manages the SIP Servlet mapping so that new requests are directed to the new version. Subsequent messages for older, established dialogs are directed to the older application version until the calls complete. After all of the older dialogs have completed and the earlier version of the application is no longer processing calls, you can safely un-deploy it.
Converged Application Server's upgrade feature ensures that no calls are dropped during the upgrade of a production application. The upgrade process also enables you to revert, or roll back, an application upgrade. If, for example, you determine that there is a problem with the newer version of the deployed application, you can simply un-deploy the newer version. Converged Application Server then automatically directs all new requests to the older application version.
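The routing behavior during an upgrade can be sketched as follows. The VersionRouter class is a hypothetical illustration and is not the container's implementation; it assumes that each dialog is identified by an ID and stays with the version that first handled it.

```python
class VersionRouter:
    """Toy router: new dialogs go to the newest version, established dialogs stay put."""

    def __init__(self, version):
        self.deployed = [version]  # oldest first; the last entry takes new requests
        self.dialogs = {}          # dialog_id -> version that owns the dialog

    def deploy(self, version):
        if len(self.deployed) >= 2:
            raise RuntimeError("at most two versions may be deployed at one time")
        self.deployed.append(version)

    def route(self, dialog_id):
        # An established dialog keeps its version; a new one is pinned to the newest.
        return self.dialogs.setdefault(dialog_id, self.deployed[-1])

    def end_dialog(self, dialog_id):
        self.dialogs.pop(dialog_id, None)

    def is_idle(self, version):
        """True when no established dialog still belongs to this version."""
        return version not in self.dialogs.values()

    def undeploy(self, version):
        self.deployed.remove(version)
```

In this sketch, un-deploying the older version once it is idle completes the upgrade, while un-deploying the newer version instead sends new requests back to the older one.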
To use the application upgrade functionality of Converged Application Server:
You must assign version information to your updated application in order to distinguish it from the older application version. Note that only the newer version of a deployed application requires version information; if the currently-deployed application contains no version designation, Converged Application Server automatically treats this application as the “older” version.
Both the deployed application and the updated application must provide only SIP protocol functionality. You cannot upgrade converged HTTP/SIP applications using these procedures.
A maximum of two different versions of the same application can be deployed at one time.
If your application hard-codes the use of an application name (for example, in composed applications where multiple SIP Servlets process a given call), you must replace the application name with calls to a helper method that obtains the base application name. Converged Application Server provides SipApplicationRuntimeMBean methods for obtaining the base application name and version identifier, as well as determining whether the current application version is active or retiring.
When applications take part in a composed application (using application composition techniques), Converged Application Server always uses the latest version of an application when only the base name is supplied.
Converged Application Server also enables Administrators to upgrade the SIP Servlet container, JVM, or application on a cluster-wide basis without affecting existing SIP traffic. This is accomplished by creating multiple clusters and having Converged Application Server automatically forward requests during the upgrade process. See the discussion on upgrading the Oracle Communications Converged Application Server software in the Converged Application Server Administration Guide for more information on upgrading the SIP Servlet container, JVM, or applications.