Chapter 5 Designing for Service Availability

Once you have decided on your logical architecture, the next step is deciding what level of service availability is right for your site. The level of service availability you can expect depends on the hardware you choose as well as on the software infrastructure and maintenance practices you use. This chapter discusses several choices, their value, and their costs.

High Availability Solutions Overview

The Communications Services offering supports two high availability solutions: Sun Cluster and Veritas Cluster Server (VCS). Messaging Server and Calendar Server provide agents for each of these solutions.

Messaging Server and Calendar Server support both asymmetric and symmetric HA. In asymmetric HA, you use a hot standby server that stays idle until a failure occurs. In symmetric HA, every server is active, and after a failure a single server runs multiple instances of the software: its own and those of its failed partner. There are advantages to both solutions. However, in most cases, asymmetric HA is the better choice. The following sections discuss the advantages and disadvantages of each solution.

See the Sun Java System Messaging Server Deployment Planning Guide for more information on HA deployments.

Symmetric HA

In symmetric HA, every server has a partner server that will take its entire processing load if the currently active server fails. Normally you deploy symmetric HA in pairs of servers, although this is not a requirement. However, when you deploy more than a pair of servers, the deployment becomes more complex to manage.

When deploying symmetric HA solutions, you need to size servers such that they are capable of running their own load as well as their partner’s load. This does not necessarily mean that servers must be sized to handle 200 percent of typical peak load with fast response times. It does mean that the server must not fall over under 200 percent of typical peak load. If you deploy to just survive under 200 percent load, and there is a failure during peak hour, then you must understand that user response times will increase (possibly dramatically). Also, there is a distinct possibility that mail delivery into the store will backlog. To many enterprises (and ISPs), this is an acceptable risk.
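
The sizing trade-off described above comes down to simple arithmetic. The following Python sketch is illustrative only: the function name and the load figures are assumptions made for this example, not values from this guide. It estimates how heavily a surviving server is used once it also carries its partner's load.

```python
def surviving_utilization(own_peak_load, partner_peak_load, capacity):
    """Return the fraction of capacity used by a server carrying both loads.

    All three arguments use the same arbitrary unit (for example,
    messages per second). A result above 1.0 means the server is
    overloaded and response times or delivery backlogs will grow.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return (own_peak_load + partner_peak_load) / capacity


# Hypothetical pair: each server peaks at 400 units and can sustain 1000.
normal = surviving_utilization(400, 0, 1000)
failover = surviving_utilization(400, 400, 1000)
print(f"normal: {normal:.0%}, after failover: {failover:.0%}")
# prints "normal: 40%, after failover: 80%"
```

A result close to or above 100 percent after failover corresponds to the "just survive" sizing discussed above, where users see slower responses until the failed partner returns.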

Asymmetric HA

Asymmetric HA can be deployed as 1/1, meaning each live server has a corresponding hot standby. Again, it is not necessary that the failover server is as powerful as the main server, but using a smaller box means that if a failure occurs during peak hours you will experience increased service times and possibly backlogs of mail delivery. Asymmetric HA can also be deployed in N/1, N+1, or N+M modes. Sun Cluster is essentially limited to a total of four nodes in a standard HA arrangement, so N+M is impractical, but 3+1 is a reasonable and supportable solution.

Veritas supports a much larger number of nodes, so the limitations are driven by customer comfort and not by the software hard limits or supported limits.
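
As a rough planning aid, the two questions raised in this section (does the layout fit the cluster, and is the standby large enough) can be sketched in Python. The function names and load figures below are hypothetical. The default limit of four nodes reflects the Sun Cluster figure mentioned above; pass a different value for other cluster software.

```python
def fits_cluster(active_nodes, standby_nodes, max_nodes=4):
    """Return True if an N+M layout fits within the cluster's node limit."""
    return active_nodes + standby_nodes <= max_nodes


def standby_headroom(active_peak_loads, standby_capacity):
    """Return spare capacity on one standby after the busiest node fails.

    A negative result means the standby is smaller than the load it may
    inherit, so expect slower service and possible delivery backlogs.
    """
    return standby_capacity - max(active_peak_loads)


print(fits_cluster(3, 1))                       # 3+1 layout: True
print(fits_cluster(3, 2))                       # 3+2 layout: False
print(standby_headroom([400, 550, 300], 500))   # -50
```

The headroom check considers only a single failure at a time, which matches an N+1 layout with one shared standby.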

Automatic System Reconfiguration (ASR)

In addition to evaluating an HA solution, you should consider deploying hardware that is capable of ASR.

ASR is a process that minimizes downtime caused by hardware failures. If a server is capable of ASR, the failure of an individual hardware component might result in only minimal downtime.

As a general rule, the more ASR capabilities a server has, the more it costs. In the absence of high availability software, choose machines with a significant amount of hardware redundancy and ASR capability for your data stores, assuming that it is not cost prohibitive.

Various Sun SPARC servers support various levels of ASR. Refer to each product’s data sheet to understand its ASR capabilities.

Using Enabling Techniques and Technologies

In addition to the high availability solutions discussed in the previous section, you can use enabling techniques and technologies to improve both availability and performance. These techniques and technologies include load balancers, Sun Java System Directory Proxy Server, and replica role promotion.

Using Load Balancers

You can use load balancers to ensure the functional availability of each tier in your architecture, providing high availability of the entire end-to-end system. A load balancer can be either a dedicated hardware appliance or a strictly software solution.

Load balancing is the best way to avoid a single instance, server, or network as a single point of failure while at the same time improving the performance of the service. One of the primary goals of load balancing is to increase horizontal capacity of a service. For example, with a directory service, load balancers increase the aggregate number of simultaneous LDAP connections and LDAP operations per second that the directory service can handle.
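
To make the idea concrete, the following Python sketch shows one common distribution policy, round robin with health tracking. It is a minimal illustration, not the behavior of any particular load balancer product; the class name and the host names are hypothetical.

```python
class RoundRobinBalancer:
    """Hand out backends in turn, skipping any that are marked down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._index = 0

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def next_backend(self):
        # Try each backend at most once per call, starting where we left off.
        for _ in range(len(self._backends)):
            candidate = self._backends[self._index]
            self._index = (self._index + 1) % len(self._backends)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


balancer = RoundRobinBalancer(["ldap1", "ldap2", "ldap3"])
balancer.mark_down("ldap2")          # simulate a failed directory instance
print([balancer.next_backend() for _ in range(4)])
# prints ['ldap1', 'ldap3', 'ldap1', 'ldap3']
```

Because a failed instance is simply skipped, clients keep getting service from the remaining instances, and adding a backend to the list increases the aggregate capacity of the tier.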

Using Directory Proxy Server

Sun Java System Directory Proxy Server (formerly Sun ONE Directory Proxy Server) provides many proxy-type features. One of these features is LDAP load balancing. Though Directory Proxy Server might not perform as well as dedicated load balancers, consider using it for its failover, referral following, security, and mapping features.

Using Replica Role Promotion

Directory Server includes a way of promoting and demoting the replica role of a directory instance. This feature enables you to promote a replica hub to a multi-master supplier or vice versa. You can also promote a consumer to the role of replica hub and vice versa. However, you cannot promote a consumer directly to a multi-master supplier or vice versa. In this case, the consumer must first become a replica hub and then it can be promoted from a hub to a multi-master replica. The same is true in the reverse direction.
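
The promotion rules above amount to a three-step ladder in which each change moves one step at a time. The following Python sketch models only that rule; it does not call any Directory Server command or API, and the role labels and function name are chosen for this example.

```python
# Roles in promotion order; "master" stands for a multi-master supplier.
ROLES = ["consumer", "hub", "master"]


def role_change_path(current, target):
    """Return every role a replica passes through, including both ends.

    A replica can move only one step at a time, so a consumer that is to
    become a master must first become a hub, and the reverse also holds.
    """
    start, end = ROLES.index(current), ROLES.index(target)
    step = 1 if end >= start else -1
    return [ROLES[i] for i in range(start, end + step, step)]


print(role_change_path("consumer", "master"))  # ['consumer', 'hub', 'master']
print(role_change_path("master", "consumer"))  # ['master', 'hub', 'consumer']
print(role_change_path("hub", "master"))       # ['hub', 'master']
```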

Replica role promotion is useful in distributed deployments. Consider the case when you have six geographically dispersed sites. You would like to have a multi-master supplier at each site but are limited to only one per site for up to four sites. If you put at least one hub at each of the other two sites, you could promote them if one of the other multi-master suppliers is taken offline or decommissioned for some reason.

See the Sun Java System Directory Server Administration Guide for more information.

High Availability Solutions for Communications Services

Making the Directory Highly Available

From the Communications Services standpoint, the most important factor in planning your directory service is availability. As an infrastructure service, the directory must provide as near-continuous service as possible to the higher-level applications for authorization, access, email routing, and so forth.

A key feature of Directory Server that provides for high availability is replication. Replication is the mechanism that automatically copies directory data from one Directory Server to another. Replication enables you to provide a highly available directory service, and to geographically distribute your data. Table 5-1 describes the replication and clustering methods you can use to design the directory for high availability.

Table 5-1 Designing Directory Server for High Availability
Method	Description
Single-master replication	A server acting as a supplier copies a master replica directly to one or more consumer servers. In this configuration, all directory modifications are made to the master replica stored on the supplier, and the consumers contain read-only copies of the data.
Two-way, multi-master replication	In a multi-master environment between two suppliers that share responsibility for the same data, you create two replication agreements. Supplier A and Supplier B each hold a master replica of the same data and there are two replication agreements governing the replication flow of this multi-master configuration.
Four-way multi-master	Provides a pair of Directory Server masters, usually in two separate data centers. This configuration uses four-way Multi-Master Replication (MMR) for replication. Thanks to its four-way master failover configuration, this fully-connected topology provides a highly-available solution that guarantees data integrity. When used with hubs in the replication topology, load distribution is facilitated, and the four consumers in each data center allow this topology to scale for read (lookup) operations.
Sun Cluster Agent for Directory Server	Using Sun Cluster software provides the highest level of availability for your directory implementation. In the case of failure of an active Directory Server node, Sun Cluster provides for transparent failover of services to a backup node. However, the administrative (and hardware) costs of installing, configuring, and maintaining this kind of environment are typically higher than the Directory Server replication methods.
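
To see how the number of replication agreements grows with the number of masters, consider the following Python sketch. It assumes that each agreement is one-directional between a pair of suppliers, as in the two-supplier case described in Table 5-1, and that the topology is fully connected. Hubs and consumers are left out, and the supplier names are hypothetical.

```python
from itertools import permutations


def full_mesh_agreements(masters):
    """Return (supplier, consumer) pairs for a fully connected topology.

    Each master needs one agreement toward every other master, so n
    masters require n * (n - 1) agreements.
    """
    return list(permutations(masters, 2))


two_way = full_mesh_agreements(["A", "B"])
four_way = full_mesh_agreements(["A", "B", "C", "D"])
print(two_way)        # [('A', 'B'), ('B', 'A')]
print(len(four_way))  # 12
```

The jump from 2 agreements to 12 is one reason larger multi-master topologies carry higher administrative cost.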

See the Sun Java System Directory Service Deployment Planning Guide for more information.

Making Messaging Server and Calendar Server Highly Available

Both Messaging Server and Calendar Server can be configured to be highly available by using Sun Cluster and Veritas technology. Messaging Server and Calendar Server support asymmetric and symmetric configurations. In the asymmetric (“hot standby”) configuration, services run only on the primary node, and a standby secondary node remains idle. Detection of a fault in any of the resources (such as storage, host system, or process itself) causes the services to be stopped on the primary node and started on the secondary node. In the symmetric configuration, multiple nodes are concurrently running active services, and the nodes serve as backup for each other. Upon failover, the services on the failing node are shut down and restarted on a designated backup node. In this configuration, the backup node is also already running other active services.
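
The difference between the two configurations is easiest to see in where services end up after a failover. The following Python sketch is a toy model only; it does not represent Sun Cluster or Veritas behavior in detail, and the node and service names are hypothetical.

```python
def fail_over(placement, failed_node, backup_node):
    """Return a new placement with the failed node's services moved.

    `placement` maps each node name to the list of services it runs.
    The input mapping is not modified.
    """
    new_placement = {node: list(services) for node, services in placement.items()}
    moved = new_placement.pop(failed_node)
    new_placement[backup_node] = new_placement.get(backup_node, []) + moved
    return new_placement


# Asymmetric: the standby is idle until the primary fails.
asymmetric = {"node1": ["store-a"], "node2": []}
# Symmetric: both nodes are active and back each other up.
symmetric = {"node1": ["store-a"], "node2": ["store-b"]}

print(fail_over(asymmetric, "node1", "node2"))  # {'node2': ['store-a']}
print(fail_over(symmetric, "node1", "node2"))   # {'node2': ['store-b', 'store-a']}
```

In the symmetric case the backup node ends up running two sets of services, which is why it must be sized for more than its own load.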

In a tiered Communications Services architecture, where front-end and back-end components are distributed onto separate machines, you would want to make the back-end components highly available through cluster technology. This is because the back ends are the “stores” that maintain persistent data. You would want to make the Messaging Server MTA front end highly available by protecting its disk subsystems. It does not make sense to use cluster technology on the Calendar Server front end. Typically, you would make the Calendar Server front end more available through redundancy, that is, by deploying multiple front-end hosts.

For more information, see the Sun Java System Messaging Server Deployment Planning Guide.

Sun Java System Communications Services 6 2004Q2 Enterprise Deployment Planning Guide