Sun Java System Message Queue 4.1 Administration Guide

Types of Cluster

Two types of cluster can be created: conventional and high availability (HA). The distinction between the two depends on the value of the imq.cluster.ha property of the brokers belonging to the cluster. All of the brokers in a given cluster must have the same value for this property: if the value is false, the cluster is a conventional one; if true, it is a high-availability cluster.
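The rule above can be checked mechanically before brokers are started. The following is a minimal sketch, not part of Message Queue: the class ClusterTypeCheck, its classify method, and the broker names are hypothetical, and treating a missing imq.cluster.ha property as false is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical helper: derives the cluster type from each broker's properties. */
final class ClusterTypeCheck {

    enum ClusterType { CONVENTIONAL, HIGH_AVAILABILITY }

    /**
     * Returns the cluster type implied by imq.cluster.ha and throws if the
     * brokers disagree. A missing property is read as "false" (assumption).
     */
    static ClusterType classify(Map<String, Properties> brokers) {
        if (brokers.isEmpty()) {
            throw new IllegalArgumentException("no brokers given");
        }
        Boolean expected = null;
        for (Map.Entry<String, Properties> e : brokers.entrySet()) {
            boolean ha = Boolean.parseBoolean(
                    e.getValue().getProperty("imq.cluster.ha", "false"));
            if (expected == null) {
                expected = ha;                       // first broker sets the expectation
            } else if (expected.booleanValue() != ha) {
                throw new IllegalStateException(
                        "Broker " + e.getKey() + " has imq.cluster.ha=" + ha
                        + " but earlier brokers have " + expected);
            }
        }
        return expected ? ClusterType.HIGH_AVAILABILITY : ClusterType.CONVENTIONAL;
    }

    public static void main(String[] args) {
        Map<String, Properties> brokers = new LinkedHashMap<>();
        for (String name : new String[] {"brokerA", "brokerB"}) { // hypothetical names
            Properties p = new Properties();
            p.setProperty("imq.cluster.ha", "true");
            brokers.put(name, p);
        }
        System.out.println(classify(brokers));
    }
}
```

In practice the Properties objects would be loaded from each broker's configuration; the sketch builds them in memory to stay self-contained.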

Conventional Clusters

In a conventional cluster, each of the constituent brokers maintains its own separate persistent data store (see Persistence Services). Brokers within the cluster share information about one another’s persistent destinations, message consumers, and durable subscriptions. However, if one of the brokers should fail, none of the other brokers in the cluster can take over its operations, since none of them have access to the failed broker’s persistent messages, open transactions, and other aspects of its internal state.

Changes to a cluster’s destinations, consumers, or durable subscriptions are automatically propagated to all of the other brokers in the cluster. However, a broker that is offline at the time of the change (through failure, for instance) will not immediately receive this information. To keep such state information synchronized throughout the cluster, one of its brokers can optionally be designated as the master broker to track changes in the cluster’s persistent state. The master broker maintains a configuration change record containing information about changes in the persistent entities associated with the cluster, such as durable subscriptions and administrator-created physical destinations. All brokers in the cluster consult the master broker during startup to update their information about these persistent entities; thus a broker returning to operation after having been temporarily offline can update its information about changes that may have occurred during its absence.
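The catch-up behavior described above can be pictured with a small in-memory model. This is only an illustrative sketch: the classes Master, Broker, and Change, the entity names, and the idea of tracking a numeric position in the record are all assumptions made for the example and do not reflect the broker's actual record format or protocol.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of a master broker's configuration change record. */
final class ChangeRecordSketch {

    /** One change to a persistent entity (durable subscription, admin-created destination). */
    static final class Change {
        final boolean added;   // true = created, false = destroyed
        final String entity;   // hypothetical entity name
        Change(boolean added, String entity) { this.added = added; this.entity = entity; }
    }

    /** Master broker: keeps an append-only list of changes. */
    static final class Master {
        private final List<Change> record = new ArrayList<>();
        void log(boolean added, String entity) { record.add(new Change(added, entity)); }
        List<Change> changesSince(int position) {
            return new ArrayList<>(record.subList(position, record.size()));
        }
    }

    /** Ordinary broker: replays the changes it missed each time it starts up. */
    static final class Broker {
        final Set<String> entities = new TreeSet<>();
        private int position = 0;  // number of changes already applied
        void syncOnStartup(Master master) {
            for (Change c : master.changesSince(position)) {
                if (c.added) entities.add(c.entity); else entities.remove(c.entity);
                position++;
            }
        }
    }

    public static void main(String[] args) {
        Master master = new Master();
        Broker broker = new Broker();
        master.log(true, "queue:orders");
        broker.syncOnStartup(master);        // broker starts and learns of the queue
        master.log(true, "durable:audit");   // these two changes occur while the
        master.log(false, "queue:orders");   // broker is offline
        broker.syncOnStartup(master);        // broker restarts and catches up
        System.out.println(broker.entities); // prints [durable:audit]
    }
}
```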

Note –

While it is possible to mix brokers with different versions in the same cluster, all brokers must have a version at least as great as that of the master broker. If there is no master broker, all brokers in the cluster must have the same version.

Because all brokers in a conventional cluster need the master broker in order to perform persistent operations, the following imqcmd subcommands for any broker in the cluster will return an error when a master broker has been configured but is unavailable:

create dst

destroy dst

update dst

destroy dur

Auto-created physical destinations and temporary destinations are unaffected.

Similarly, while a configured master broker is unavailable, any client application attempting to create a durable subscriber or unsubscribe from a durable subscription will get an error. However, a client can still specify and interact with an existing durable subscription.

High-Availability Clusters

In a high-availability cluster, all of the brokers share a common JDBC-based persistent data store holding dynamic state information (destinations, persistent messages, durable subscriptions, open transactions, and so forth) for each broker. In the event of broker failure, this enables another broker to assume ownership of the failed broker’s persistent state and provide uninterrupted service to its clients. Because they share a common JDBC-based data store, all brokers belonging to an HA cluster must have their imq.persist.store property (see Table 14–4) set to jdbc.

Brokers within an HA cluster inform each other at regular intervals that they are still in operation by exchanging heartbeat packets (using a special internal connection service, the cluster connection service) and by updating their state information in the cluster’s shared persistent store. When no heartbeat packet is detected from a broker for a specified number of heartbeat intervals, the broker is suspected of failure. The other brokers in the cluster then begin to monitor the suspect broker’s state information in the persistent store to confirm whether the broker has indeed failed. If the suspect broker fails to update its state information within a certain threshold interval, it is considered to have failed. (The duration of these heartbeat and failure-detection intervals can be adjusted by means of broker configuration properties to balance the tradeoff between speed and accuracy of failure detection: shorter intervals result in quicker reaction to broker failure, but increase the likelihood of false suspicions and erroneous failure detection.)
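The two-stage detection described above (missed heartbeats raise suspicion, a stale entry in the shared store confirms failure) can be sketched as follows. The class FailureDetectorSketch, its field names, and the millisecond values in main are hypothetical and chosen only for illustration; the real intervals and thresholds come from broker configuration properties.

```java
/** Hypothetical model of heartbeat-based suspicion followed by store-based confirmation. */
final class FailureDetectorSketch {

    enum Status { OPERATING, SUSPECT, FAILED }

    private final long heartbeatIntervalMs;  // time between heartbeat packets
    private final int heartbeatThreshold;    // missed intervals before suspicion
    private final long storeThresholdMs;     // maximum age of a state update in the store

    FailureDetectorSketch(long heartbeatIntervalMs, int heartbeatThreshold,
                          long storeThresholdMs) {
        this.heartbeatIntervalMs = heartbeatIntervalMs;
        this.heartbeatThreshold = heartbeatThreshold;
        this.storeThresholdMs = storeThresholdMs;
    }

    /** Classifies a peer broker from its last heartbeat and last store update times. */
    Status evaluate(long nowMs, long lastHeartbeatMs, long lastStoreUpdateMs) {
        boolean heartbeatMissing =
                nowMs - lastHeartbeatMs > heartbeatIntervalMs * heartbeatThreshold;
        if (!heartbeatMissing) {
            return Status.OPERATING;
        }
        // Heartbeats stopped: consult the shared store before declaring failure.
        boolean storeStale = nowMs - lastStoreUpdateMs > storeThresholdMs;
        return storeStale ? Status.FAILED : Status.SUSPECT;
    }

    public static void main(String[] args) {
        // Illustrative values only, not product defaults.
        FailureDetectorSketch d = new FailureDetectorSketch(2_000, 3, 90_000);
        System.out.println(d.evaluate(10_000, 9_000, 9_000));
        System.out.println(d.evaluate(10_000, 3_000, 9_000));
        System.out.println(d.evaluate(200_000, 3_000, 9_000));
    }
}
```

Lowering either threshold in this model makes evaluate report SUSPECT or FAILED sooner, which mirrors the tradeoff noted above: faster reaction at the cost of more false suspicions.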

When a broker in an HA cluster detects that another broker has failed, it will attempt to take over the failed broker’s persistent state (pending messages, destination definitions, durable subscriptions, pending acknowledgments, and open transactions), in order to provide uninterrupted service to the failed broker’s clients. If two or more brokers attempt such a takeover, only the first will succeed; that broker acquires a lock on the failed broker’s data in the persistent store, preventing subsequent takeover attempts by other brokers from succeeding. After an initial waiting period, the takeover broker will then clean up any transient resources (such as transactions and temporary destinations) belonging to the failed broker; these resources will be unavailable if the client later reconnects.
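The "first takeover wins" rule can be illustrated with an atomic compare-and-set. This is a sketch under stated assumptions: in the product the lock is held on the failed broker's data in the shared JDBC store, whereas the hypothetical TakeoverLockSketch class below uses an in-memory AtomicReference as a stand-in, and the broker identifiers are invented.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the takeover lock on a failed broker's persistent data. */
final class TakeoverLockSketch {

    // null means no broker has taken over yet.
    private final AtomicReference<String> takeoverOwner = new AtomicReference<>(null);

    /** Returns true only for the first broker that attempts the takeover. */
    boolean tryTakeover(String brokerId) {
        return takeoverOwner.compareAndSet(null, brokerId);
    }

    /** Identifier of the broker that holds the lock, or null if none does. */
    String owner() {
        return takeoverOwner.get();
    }

    public static void main(String[] args) {
        TakeoverLockSketch failedBrokerData = new TakeoverLockSketch();
        System.out.println(failedBrokerData.tryTakeover("brokerB")); // first attempt
        System.out.println(failedBrokerData.tryTakeover("brokerC")); // later attempt
        System.out.println(failedBrokerData.owner());
    }
}
```

A database-backed implementation would typically achieve the same effect with a conditional update inside a transaction; that detail is not covered by this guide and is left out of the sketch.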