Sun Java System Portal Server 7.1 Deployment Planning Guide

Building Modules and High Availability Scenarios

Portal Server provides three scenarios for high availability: best effort, no single point of failure (NSPOF), and transparent failover.

This section explains how to implement these architectures, leveraging the building module concept from a high availability standpoint.

Table 4–1 summarizes these high availability scenarios along with their supporting techniques.

Table 4–1 Portal Server High Availability Scenarios

Each column indicates whether the component is necessary for that type of deployment.

Component Requirements                               Best Effort  NSPOF  Transparent Failover
Hardware Redundancy                                  Yes          Yes    Yes
Portal Server Building Modules                       No           Yes    Yes
Multi-master Configuration                           No           Yes    Yes
Load Balancing                                       Yes          Yes    Yes
Stateless Applications and Checkpointing Mechanisms  No           No     Yes
Session Failover                                     No           No     Yes
Directory Server Clustering                          No           No     Yes


Note –

Load balancing is not provided out of the box with the Sun Java System Web Server product.


Best Effort

In this scenario, you install Portal Server and Directory Server on a single node that has a secured hardware configuration for continuous availability, such as Sun Fire UltraSPARC™ III machines. (Securing a Solaris™ Operating Environment system requires that changes be made to its default configuration.)

This type of server features full hardware redundancy, including redundant power supplies, fans, and system controllers; dynamic reconfiguration; CPU hot-plug; online upgrades; and disk racks that can be configured in RAID 0+1 (striping plus mirroring) or RAID 5 using a volume management system, which prevents data loss if a disk crashes. Figure 4–5 shows a small, best effort deployment using the building module architecture.
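For illustration, the RAID 0+1 layout can be assembled with the operating system's volume manager. The following is a minimal sketch using Solaris Volume Manager commands; the disk slice names (c1t0d0s0 and so on) and metadevice names are hypothetical and must match your actual disk layout.

   # Create two striped submirrors (RAID 0), each across two disks
   metainit d11 1 2 c1t0d0s0 c1t1d0s0
   metainit d12 1 2 c2t0d0s0 c2t1d0s0
   # Create the mirror (RAID 1) from the first stripe, then attach the second
   metainit d10 -m d11
   metattach d10 d12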

Figure 4–5 Best Effort Scenario

This figure shows a best effort scenario consisting of 4 CPUs.

In this scenario, one building module with four CPUs and 8 GB of RAM (4x8) is sufficient for memory allocation. The Portal Server console is outside of the building module so that it can be shared with other resources. (Your actual sizing calculations might result in a different allocation.)

This scenario might suffice for task-critical requirements. Its major weakness is that any maintenance action requiring a system shutdown results in a service interruption.

When Secure Remote Access is used and a software crash occurs, a watchdog process automatically restarts the Gateway, Netlet Proxy, and Rewriter Proxy.
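Conceptually, a watchdog of this kind behaves like the following shell loop. This is an illustrative sketch only, not the actual Secure Remote Access watchdog; the process pattern and restart command are hypothetical placeholders.

   #!/bin/sh
   # Illustrative watchdog loop: if the monitored process is gone,
   # run its restart command, then check again after a short interval.
   PATTERN="gateway"                          # hypothetical process name pattern
   RESTART="/opt/SUNWps/bin/gateway start"    # hypothetical restart command
   while true; do
       pgrep -f "$PATTERN" > /dev/null || $RESTART
       sleep 30
   done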

No Single Point of Failure

Portal Server natively supports the no single point of failure (NSPOF) scenario. NSPOF builds on top of the best effort scenario and, in addition, introduces replication and load balancing.

Figure 4–6 shows a building module consisting of a Portal Server instance, a Directory Server replica for profile reads, and a search engine database. At least two building modules are necessary to achieve NSPOF, thereby providing a backup if one of the building modules fails. Each building module consists of four CPUs and 8 GB of RAM.

Figure 4–6 No Single Point of Failure Example

This figure shows two building modules consisting of a Portal Server instance, a Directory Server replica, and a search engine.

When the load balancer detects a Portal Server failure, it redirects users’ requests to a backup building module. Accuracy of failure detection varies among load balancing products. Some products can check the availability of a system by probing a service that involves several functional areas of the server, such as the servlet engine and the JVM. In particular, most vendor solutions from Resonate, Cisco, Alteon, and others enable you to create arbitrary scripts for checking server availability. Because the load balancer is not part of the Portal Server software, you must acquire it separately from a third-party vendor.
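Such an availability script typically fetches a portal page and verifies the response, which exercises the web container, servlet engine, and JVM in a single probe. The following is a minimal sketch; the host name, port, and path are hypothetical and depend on your deployment.

   #!/bin/sh
   # Probe the portal desktop; exit 0 if healthy, 1 otherwise,
   # so the load balancer can mark the instance up or down.
   URL="http://portal1.example.com:8080/portal/dt"   # hypothetical instance URL
   if curl -s -f -m 10 -o /dev/null "$URL"; then
       exit 0
   else
       exit 1
   fi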


Note –

Access Manager requires that you set up load balancing to enforce sticky sessions. This means that after a session is created on a particular instance, the load balancer must always return to the same instance for that session. The load balancer achieves this by binding the session cookie with the instance name identification. In principle, that binding is reestablished when a failed instance is decommissioned. Sticky sessions are also recommended for performance reasons.


Multi-master replication (MMR) takes place between the building modules. The changes that occur on each directory are replicated to the other, which means that each directory plays the role of both supplier and consumer. For more information on MMR, refer to the Sun Java System Directory Server Deployment Guide.
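For illustration, a replication agreement from one building module's directory to the other can be sketched in LDIF as shown below. The host name, suffix, and bind credentials are hypothetical, and the full procedure (enabling the changelog, creating the replica entries, and so on) is described in the Directory Server documentation.

   # Sketch of a replication agreement entry, added on the supplier
   # in Building Module 1 and pointing at the directory in Building Module 2
   dn: cn=to-building-module-2,cn=replica,cn="dc=example,dc=com",cn=mapping tree,cn=config
   objectClass: top
   objectClass: nsDS5ReplicationAgreement
   cn: to-building-module-2
   nsDS5ReplicaHost: directory2.example.com
   nsDS5ReplicaPort: 389
   nsDS5ReplicaBindDN: cn=Replication Manager,cn=config
   nsDS5ReplicaCredentials: secret
   nsDS5ReplicaBindMethod: SIMPLE
   nsDS5ReplicaRoot: dc=example,dc=com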


Note –

In general, the Directory Server instance in each building module is configured as a replica of a master directory, which runs elsewhere. However, nothing prevents you from using a master directory as part of the building module. Using masters on dedicated nodes does not improve the availability of the solution; use dedicated masters for performance reasons.


Redundancy is equally important for the directory master, so that profile changes through the administration console or the Portal Desktop, along with consumer replication across building modules, can always be maintained. Portal Server and Access Manager support MMR. The NSPOF scenario uses a multi-master configuration, in which two suppliers can accept updates, synchronize with each other, and update all consumers. The consumers can refer update requests to both masters.

Secure Remote Access follows the same replication and load balancing pattern as Portal Server to achieve NSPOF. As such, two Secure Remote Access Gateways and a pair of proxies are necessary in this scenario. The Secure Remote Access Gateway detects a Portal Server instance failure when the instance does not respond to a request within a certain time-out value. When this occurs, the HTTPS request is routed to a backup server. The Secure Remote Access Gateway periodically checks for availability until the first Portal Server instance is up again.

The NSPOF high availability scenario is suitable for business-critical deployments. However, some high availability limitations in this scenario might not fulfill the requirements of a mission-critical deployment.

Transparent Failover

Transparent failover uses the same replication model as the NSPOF scenario but provides additional high availability features, which make the failover to a backup server transparent to end users.

Figure 4–7 shows a transparent failover scenario with two building modules, each consisting of four CPUs and 8 GB of RAM. Load balancing is responsible for detecting Portal Server failures and redirecting users’ requests to a backup Portal Server in the building module. Building Module 1 stores sessions in the session repository. If a crash occurs, the application server retrieves the sessions created by Building Module 1 from the session repository.

Figure 4–7 Transparent Failover Example Scenario

This figure shows a transparent failover scenario. A load balancer is in front of two building modules.

The session repository is provided by the application server software in which Portal Server runs. Portal Server supports transparent failover on application servers that support HttpSession failover. See Appendix A, Understanding Portal Server and Application Servers, for more information.

With session failover, users do not need to reauthenticate after a crash. In addition, portal applications can rely on session persistence to store context data used for checkpointing. You configure session failover in the AMConfig.properties file by setting the com.iplanet.am.session.failover.enabled property to true.
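The relevant line in AMConfig.properties looks like the following. Only the property named in the text above is shown; any additional session repository properties required by your application server setup depend on your deployment.

   # Enable Access Manager session failover
   com.iplanet.am.session.failover.enabled=true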

The Netlet Proxy cannot support the transparent failover scenario because of a limitation of the TCP protocol: the Netlet Proxy tunnels TCP connections, and an open TCP connection cannot be migrated to another server. A Netlet Proxy crash drops all outstanding connections, which must be reestablished.