Deployment Guide


Understanding AquaLogic Service Bus High Availability

A clustered AquaLogic Service Bus domain provides high availability. A highly available deployment has recovery provisions in the event of hardware or network failures, and provides for the transfer of control to a backup component when a failure occurs.

The following sections describe clustering and high availability for an AquaLogic Service Bus deployment:

 


About AquaLogic Service Bus High Availability

For a cluster to provide high availability, it must be able to recover from service failures. WebLogic Server supports failover for clustered objects and services pinned to servers in a clustered environment. For information about how WebLogic Server handles such failover scenarios, see Communications in a Cluster in Using WebLogic Server Clusters.

Recommended Hardware and Software

The basic components of a highly available AquaLogic Service Bus environment include the following:

A full discussion of how to plan the network topology of your clustered system is beyond the scope of this section. For information about how to fully utilize inbound load balancing and failover features for your AquaLogic Service Bus configuration by organizing one or more WebLogic Server clusters in relation to load balancers, firewalls, and Web servers, see Cluster Architectures in Using WebLogic Server Clusters. For information on configuring outbound load balancing, see “To Add a Business Service - Transport Configuration” in “Adding a Business Service” in Business Services in Using the AquaLogic Service Bus Console.

The following figure shows a simplified view of a cluster, including the HTTP load balancer, the highly available database, and the multi-ported file system.

Figure 5-1 Simplified View of a Cluster


Regarding JMS File Stores

The default AquaLogic Service Bus domain configuration uses a file store for JMS persistence. The file store holds the metrics collected for monitoring and alerts. The configuration shown in the figure relies on a highly available multi-ported disk that is shared between managed servers. A file store on such a disk typically performs better than a JDBC store.

For information about configuring JMS file stores, see Using the WebLogic Persistent Store in Configuring WebLogic Server Environments.

What Happens When a Server Fails

A server can fail due to either software or hardware problems. The following sections describe the processes that occur automatically in each case and the manual steps that must be taken in these situations.

Software Faults

If a software fault occurs, Node Manager (if configured to do so) restarts the WebLogic Server instance. For more information about Node Manager, see Using Node Manager to Control Servers in Managing Server Startup and Shutdown. For information about the steps to take to prepare for recovering a secure installation, see “Directory and File Back Ups for Failure Recovery” in Avoiding and Recovering from Server Failure in Managing Server Startup and Shutdown.

Hardware Faults

If a hardware fault occurs, the physical machine may need to be repaired and could be out of operation for an extended period. In this case, the following events occur:

Server Migration

AquaLogic Service Bus leverages WebLogic Server’s whole server migration functionality to enable transparent failover of managed servers from one system to another. For detailed information regarding WebLogic Server whole server migration, see the following topics in the WebLogic Server documentation set:

Message Reporting Purger

The Message Reporting Purger for the JMS Message Reporting Provider is deployed in a single managed server in a cluster (see AquaLogic Service Bus Deployment Resources).

If the managed server that hosts the Message Reporting Purger application fails, you must select a different managed server for the Message Reporting Purger and its associated queue (wli.reporting.purge.queue) to resume normal operation.

Pending purging requests on the failed managed server are not migrated automatically. Either migrate them manually, or target the Message Reporting Purger application and its queue to a different managed server and send the purging request again.

 


AquaLogic Service Bus Failure and Recovery

In addition to the high availability features of WebLogic Server, AquaLogic Service Bus has failure and recovery characteristics that are based on the implementation and configuration of your AquaLogic Service Bus solution. The following sections discuss specific AquaLogic Service Bus failure and recovery topics:

Transparent Server Reconnection

AquaLogic Service Bus provides transparent reconnection to external servers and services when they fail and restart. If AquaLogic Service Bus sends a message to a destination while the connection is unavailable, you may see one or more runtime error messages in the server console.
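The following Python sketch illustrates the general reconnect-and-retry pattern that this behavior resembles. It is not the AquaLogic Service Bus implementation. The names send_with_reconnect and FlakyDestination, the retry count, and the delays are hypothetical and chosen only for illustration. Each failed attempt records an error, much as the server console would show a runtime error, and delivery succeeds once the destination is available again.

```python
import time


def send_with_reconnect(send, message, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Deliver a message, retrying with exponential backoff while the
    destination is unavailable.

    Returns (result, errors), where errors lists the messages of the
    failed attempts that preceded the successful delivery.
    """
    errors = []
    for attempt in range(attempts):
        try:
            return send(message), errors
        except ConnectionError as exc:
            # Record the failure; a real server would log a runtime error.
            errors.append(str(exc))
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise ConnectionError(f"gave up after {attempts} attempts: {errors[-1]}")


class FlakyDestination:
    """Hypothetical destination that is down for the first N sends."""

    def __init__(self, failures):
        self.failures = failures

    def __call__(self, message):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("destination unavailable")
        return f"delivered: {message}"
```

The sleep function is passed as a parameter so that the backoff schedule can be observed without actually waiting.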

Transparent reconnection is provided for the following types of servers and services:

AquaLogic Service Bus Console also provides monitoring features that enable you to view the status of services and to establish a system of SLAs and alerts to respond to service failures. For more information, see Monitoring in Using the AquaLogic Service Bus Console.

EIS Instance Failover

Most business services in production environments are configured to point to several EIS instances for load balancing and high availability. If you expect an EIS instance failure to last for an extended period, or if a business service points to a single EIS instance that has failed, you can reconfigure the business service to point to an alternate, operational EIS instance. You can make this change dynamically, while the service is running.

For information about using the AquaLogic Service Bus Console to change an endpoint URI for a business service, see “Viewing and Changing Business Services” in Business Services in Using the AquaLogic Service Bus Console.
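The following Python sketch models the behavior described in this section. It assumes a business service that holds several endpoint URIs, distributes requests across them in round-robin order, skips instances that are down, and lets an administrator replace a failed URI at run time. The class BusinessServiceEndpoints, the transport callable, and the example URIs are hypothetical. The sketch does not use or represent any AquaLogic Service Bus API.

```python
class BusinessServiceEndpoints:
    """Hypothetical model of a business service with several EIS endpoints."""

    def __init__(self, uris):
        self.uris = list(uris)
        self._next = 0  # index of the endpoint to try first on the next send

    def send(self, message, transport):
        """Try endpoints in round-robin order, failing over on ConnectionError.

        Returns (uri, response) for the first endpoint that accepts the message.
        """
        last_error = None
        for offset in range(len(self.uris)):
            index = (self._next + offset) % len(self.uris)
            uri = self.uris[index]
            try:
                response = transport(uri, message)
            except ConnectionError as exc:
                last_error = exc  # this instance is down; try the next one
                continue
            self._next = (index + 1) % len(self.uris)
            return uri, response
        raise ConnectionError("no operational EIS instance") from last_error

    def replace_endpoint(self, old_uri, new_uri):
        """Point the service at an alternate instance without restarting it."""
        self.uris[self.uris.index(old_uri)] = new_uri
```

With two endpoints configured, a failure of one instance is absorbed by failover to the other. When only a single, failed instance is configured, send raises an error until replace_endpoint supplies an operational URI, which corresponds to the manual reconfiguration described above.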

