CHAPTER 6
This chapter describes how the CMM API indicates changes in the state of the cluster by sending notifications to system services and applications.
Notifications are information messages sent by the nhcmmd daemon on a node to services or applications registered to receive them. Notifications are sent when there is a change in the membership of the cluster.
In a cluster, the master node is aware of all changes in the state of peer nodes. The cluster state information held by the nhcmmd daemon on the master node is propagated to all peer nodes.
Cluster notifications enable a service or application to maintain an accurate view of the state of the cluster and of the state of any peer node. An application or service can use notifications to coordinate changes in system services when a peer node joins or leaves the cluster.
A single change in the cluster state can cause an application or service to receive several associated cluster change notifications, because a change in the membership of one node can cause changes in the membership of several other nodes.
A cluster change notification does not contain any information about the previous role of a node. Therefore, for example, when a callback is invoked with the CMM_MEMBER_LEFT notification, the indicated node could have been in the cluster with no role, or could have had the CMM_MASTER or CMM_VICE_MASTER role.
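Because a notification carries no previous role, an application that needs one can keep its own record. The following is a minimal sketch of that idea, not part of the CMM API. The type declarations are stand-ins for the real CMM header (their numeric values and the underlying type of cmm_nodeid_t are assumptions), and local_role_t, role_view_t, MAX_NODES, and view_apply are hypothetical names. The sketch only updates the node named in the notification.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in declarations: a real program gets these from the CMM header.
 * Numeric values and the type behind cmm_nodeid_t are assumptions. */
typedef unsigned int cmm_nodeid_t;
typedef enum {
    CMM_MASTER_ELECTED,
    CMM_MASTER_DEMOTED,
    CMM_VICEMASTER_ELECTED,
    CMM_VICEMASTER_DEMOTED,
    CMM_MEMBER_JOINED,
    CMM_MEMBER_LEFT
} cmm_cmchanges_t;
typedef struct {
    cmm_cmchanges_t cmchange;
    cmm_nodeid_t    nodeid;
} cmm_cmc_notification_t;

/* Hypothetical application-side bookkeeping. */
typedef enum { ROLE_OUT, ROLE_IN, ROLE_VICE_MASTER, ROLE_MASTER } local_role_t;
#define MAX_NODES 64
typedef struct { local_role_t role[MAX_NODES]; } role_view_t;

/* Applies one notification to the local record and returns the role
 * that the node had before the notification arrived. */
local_role_t view_apply(role_view_t *v, const cmm_cmc_notification_t *n)
{
    assert(v != NULL && n != NULL && n->nodeid < MAX_NODES);
    local_role_t previous = v->role[n->nodeid];
    switch (n->cmchange) {
    case CMM_MASTER_ELECTED:     v->role[n->nodeid] = ROLE_MASTER;      break;
    case CMM_VICEMASTER_ELECTED: v->role[n->nodeid] = ROLE_VICE_MASTER; break;
    case CMM_MEMBER_JOINED:
    case CMM_MASTER_DEMOTED:
    case CMM_VICEMASTER_DEMOTED: v->role[n->nodeid] = ROLE_IN;          break;
    case CMM_MEMBER_LEFT:        v->role[n->nodeid] = ROLE_OUT;         break;
    }
    return previous;
}
```

With such a record, a callback that receives CMM_MEMBER_LEFT can tell whether the departing node was the master, the vice-master, or a peer with no role.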
Several scenarios in which there are changes in the state of the cluster, and the associated notifications sent during these changes, are described in Notifications During Changes in the Cluster State.
For an example of how to retrieve notifications about changes in the cluster state, see EXAMPLE 7-2.
To verify that the nhcmmd daemon is running on your peer nodes, see the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide. For information about the nhcmmd daemon, see the nhcmmd(1M) man page, or for the Linux OS, refer to the nhcmmd(8) man page.
Applications that you write can register a callback function to handle notification messages. The cmm_notify_t type describes the cluster membership change (cmc) callback function. The cmm_cmc_register function takes this callback function and the service or application data. You must provide relevant data when registering. The code in EXAMPLE 6-1 details the related structure, called the cmm_cmc_notification_t structure.
    typedef struct {
        cmm_cmchanges_t cmchange;
        cmm_nodeid_t    nodeid;
    } cmm_cmc_notification_t;
The fields in this structure detail the cluster change and specify the node concerned. These fields are described in TABLE 6-1.
This structure is used by the cmm_notify_t callback function. The cmm_notify_t callback function contains the parameters described in TABLE 6-2.
If change notification data is required for longer than the duration of the callback, it must be handled by the client application or service.
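One way for a client to keep notification data beyond the callback is to copy it by value into storage that the client owns. The sketch below assumes that the callback receives a pointer to the notification plus the client data given at registration. The exact cmm_notify_t signature should be checked against the CMM man pages. The history_t type, HISTORY_LEN, and on_cluster_change are hypothetical names, and the CMM types are stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the CMM header; values and widths are assumptions. */
typedef unsigned int cmm_nodeid_t;
typedef enum { CMM_MEMBER_JOINED, CMM_MEMBER_LEFT } cmm_cmchanges_t; /* subset */
typedef struct {
    cmm_cmchanges_t cmchange;
    cmm_nodeid_t    nodeid;
} cmm_cmc_notification_t;

/* Assumed shape of the callback type. */
typedef void (*cmm_notify_t)(const cmm_cmc_notification_t *change,
                             void *client_data);

/* Hypothetical client-owned ring buffer of recent notifications. */
#define HISTORY_LEN 8
typedef struct {
    cmm_cmc_notification_t items[HISTORY_LEN];
    size_t count; /* total number of notifications seen so far */
} history_t;

/* Callback: copy the structure by value, because the pointed-to data
 * may only be valid for the duration of the call. */
void on_cluster_change(const cmm_cmc_notification_t *change, void *client_data)
{
    history_t *h = client_data;
    assert(change != NULL && h != NULL);
    h->items[h->count % HISTORY_LEN] = *change;
    h->count++;
}
```

In a real application, on_cluster_change and a pointer to a history_t instance would be the callback and client data handed to cmm_cmc_register.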
Change notification messages contain the nodeid of the affected node and a cmm_cmchanges_t value. The cmm_cmchanges_t data type describes the change notification. TABLE 6-3 lists the values of the cmm_cmchanges_t data type:
Value | Description |
---|---|
CMM_INVALID_CLUSTER | A critical problem occurred. For example, there are two master nodes. One node must be rebooted as soon as possible. The nodeid field is not useful in this case. |
CMM_MASTER_DEMOTED | The nodeid represents a previous master node that has been demoted. For more information, see Administrative Attributes. |
CMM_MASTER_ELECTED | The nodeid is that of the newly elected master node. A cluster election has selected a new master and the previous master (if any) quits its role. The new node might have just joined the cluster and there might not have been a previous master. |
CMM_MEMBER_JOINED | A peer node has joined the cluster. The nodeid is that of the new peer node. |
CMM_MEMBER_LEFT | A peer node has left the cluster and now has the CMM_OUT_OF_CLUSTER role. The nodeid is that of the node that left. |
CMM_STALE_CLUSTER | The master node sends a membership frame every 4 seconds to inform other nodes of the current state of the cluster. If a node receives no frames for more than 10 seconds, the CMM on this node notifies the local applications. The CMM_STALE_CLUSTER notification means that, even if the CMM API is available, the information returned on this node might not reflect the current state of the cluster. Operations involving the master, such as a new node joining the cluster, might fail because the master is unreachable. This situation is abnormal and recovery actions must be taken. The nodeid field is not useful in this case. Calls that returned the CMM_OK value before this notification return CMM_EAGAIN after it, for as long as the cluster remains stale. |
CMM_VICEMASTER_DEMOTED | The nodeid represents a previous vice-master node that has been demoted. This is only sent if the vice-master node is disqualified. |
CMM_VICEMASTER_ELECTED | A new vice-master is elected. The node no longer has its previous role. The previous vice-master (if any) is demoted. The nodeid is that of the newly elected vice-master node. |
CMM_VALID_STATE | The state of the cluster is now valid and running correctly. The nodeid field is not useful for this notification. |
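TABLE 6-3 implies that a handler should ignore the nodeid field for three of the values. The sketch below encodes that rule and uses it when formatting a log line. The enumeration is a stand-in whose order and numeric values are assumptions, and change_name, change_has_nodeid, and describe_change are hypothetical helpers, not CMM functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the CMM header; order and values are assumptions. */
typedef unsigned int cmm_nodeid_t;
typedef enum {
    CMM_INVALID_CLUSTER, CMM_MASTER_DEMOTED, CMM_MASTER_ELECTED,
    CMM_MEMBER_JOINED, CMM_MEMBER_LEFT, CMM_STALE_CLUSTER,
    CMM_VICEMASTER_DEMOTED, CMM_VICEMASTER_ELECTED, CMM_VALID_STATE
} cmm_cmchanges_t;

/* Returns the symbolic name of a change value. */
const char *change_name(cmm_cmchanges_t c)
{
    switch (c) {
    case CMM_INVALID_CLUSTER:    return "CMM_INVALID_CLUSTER";
    case CMM_MASTER_DEMOTED:     return "CMM_MASTER_DEMOTED";
    case CMM_MASTER_ELECTED:     return "CMM_MASTER_ELECTED";
    case CMM_MEMBER_JOINED:      return "CMM_MEMBER_JOINED";
    case CMM_MEMBER_LEFT:        return "CMM_MEMBER_LEFT";
    case CMM_STALE_CLUSTER:      return "CMM_STALE_CLUSTER";
    case CMM_VICEMASTER_DEMOTED: return "CMM_VICEMASTER_DEMOTED";
    case CMM_VICEMASTER_ELECTED: return "CMM_VICEMASTER_ELECTED";
    case CMM_VALID_STATE:        return "CMM_VALID_STATE";
    }
    return "UNKNOWN";
}

/* Returns 1 if the nodeid field is meaningful for this change, else 0. */
int change_has_nodeid(cmm_cmchanges_t c)
{
    switch (c) {
    case CMM_INVALID_CLUSTER:
    case CMM_STALE_CLUSTER:
    case CMM_VALID_STATE:
        return 0;
    default:
        return 1;
    }
}

/* Formats a change for logging, omitting the nodeid when it carries
 * no information. Returns the snprintf result. */
int describe_change(char *buf, size_t size, cmm_cmchanges_t c, cmm_nodeid_t id)
{
    assert(buf != NULL && size > 0);
    if (change_has_nodeid(c))
        return snprintf(buf, size, "%s node=%u", change_name(c), id);
    return snprintf(buf, size, "%s", change_name(c));
}
```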
There are many scenarios in which the state of a cluster changes and registered applications and services receive notifications of changes in the cluster state.
A change in the state of a single node can cause the states of other nodes to change. For example, if a new master node is elected, the roles of both the new master node and the former master node change. When a scenario involves a change in the state of more than one node, several notifications can be sent, in the order in which the changes occur. The nhcmmd daemon does not send a notification for each change of state of each node. Instead, it bundles the information into the minimum number of notifications that describe the new cluster situation.
If peer nodes are communicating correctly, the same notification is sent to all nodes, regardless of their membership role.
In each of the scenarios described in this section, there are two example peer nodes: node A and node B. The roles of these nodes are shown in TABLE 6-4. The transition from one role to another is represented as (Role_A) —> (Role_B).
For a summary of the membership roles that a node can have, see Membership Roles.
Scenarios of cluster state change and related notifications are described in the following sections:
When neither node A nor node B is currently running the Foundation Services, nodes A and B are out. When node A becomes the master node, a CMM_MASTER_ELECTED notification is sent to the registered applications and services. At cluster startup, this is the first step in the creation of a cluster. The notification sent for this scenario is shown in TABLE 6-5.
Transition (node A, node B) | Notifications Sent |
---|---|
(out, out) —> (master, out) | CMM_MASTER_ELECTED(A) |
The following scenario describes the election of a qualified node to the vice-master role at cluster initialization. This takes place in one step, when the CMM_VICEMASTER_ELECTED notification is sent. The notification sent for this scenario is shown in TABLE 6-6.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, out) —> (master, vice-master) | CMM_VICEMASTER_ELECTED (B) |
The following scenario describes when a new node joins the cluster. This node does not take the master or vice-master role and could be a diskless node or a dataless node. The notification sent for this scenario is shown in TABLE 6-7. This scenario can occur at cluster initialization or when a new node is added to a running cluster.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, out) —> (master, in) | CMM_MEMBER_JOINED (B) |
The following scenario describes the situation where a node that is in becomes vice-master. This scenario can occur if a node is in the cluster but does not immediately declare itself as master-eligible. When its eligibility to be a master node or a vice-master node is known, the node is elected vice-master. This scenario can also occur if a master-eligible node is disqualified. When the node is requalified, the node becomes the vice-master. The notification sent for this scenario is shown in TABLE 6-8.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, in) —> (master, vice-master) | CMM_VICEMASTER_ELECTED (B) |
Provided that there is a running vice-master node, if the master node stops being master because its role has been removed, there is a failover, as explained in Failover Notifications.
If the vice-master node stops being vice-master due to a failure, or because its role has been removed, there is no backup for the master node and the cluster loses its 2N redundancy.
If the vice-master role is removed because of a failure or by using the cmm_membership_remove function, the notification is shown in TABLE 6-9.
The vice-master node can be disqualified if you use the cmm_member_setqualif function. The notification sent for this scenario is shown in TABLE 6-10.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, vice-master) —> (master, in) | CMM_VICEMASTER_DEMOTED (B) |
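An application that cares about 2N redundancy can derive it from the election and demotion notifications. The following sketch tracks which node currently holds each role and reports whether both roles are filled. It is an illustration under assumptions: the CMM types are stand-ins, and NO_NODE, leaders_t, leaders_apply, and has_2n_redundancy are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the CMM header; values and widths are assumptions. */
typedef unsigned int cmm_nodeid_t;
typedef enum {
    CMM_MASTER_ELECTED, CMM_MASTER_DEMOTED,
    CMM_VICEMASTER_ELECTED, CMM_VICEMASTER_DEMOTED,
    CMM_MEMBER_JOINED, CMM_MEMBER_LEFT
} cmm_cmchanges_t;
typedef struct {
    cmm_cmchanges_t cmchange;
    cmm_nodeid_t    nodeid;
} cmm_cmc_notification_t;

#define NO_NODE ((cmm_nodeid_t)-1) /* hypothetical "role is vacant" marker */

typedef struct {
    cmm_nodeid_t master;
    cmm_nodeid_t vice_master;
} leaders_t;

/* Updates the record of who holds the master and vice-master roles. */
void leaders_apply(leaders_t *l, const cmm_cmc_notification_t *n)
{
    assert(l != NULL && n != NULL);
    switch (n->cmchange) {
    case CMM_MASTER_ELECTED:
        if (l->vice_master == n->nodeid)
            l->vice_master = NO_NODE; /* the vice-master was promoted */
        l->master = n->nodeid;
        break;
    case CMM_VICEMASTER_ELECTED:
        l->vice_master = n->nodeid;
        break;
    case CMM_MASTER_DEMOTED:
        if (l->master == n->nodeid) l->master = NO_NODE;
        break;
    case CMM_VICEMASTER_DEMOTED:
        if (l->vice_master == n->nodeid) l->vice_master = NO_NODE;
        break;
    case CMM_MEMBER_LEFT:
        if (l->master == n->nodeid) l->master = NO_NODE;
        if (l->vice_master == n->nodeid) l->vice_master = NO_NODE;
        break;
    case CMM_MEMBER_JOINED:
        break; /* joining alone does not fill either role */
    }
}

/* Returns 1 when both a master and a vice-master are known. */
int has_2n_redundancy(const leaders_t *l)
{
    assert(l != NULL);
    return l->master != NO_NODE && l->vice_master != NO_NODE;
}
```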
For more information about disqualifying a node by using the cmm_member_setqualif function, see Setting the Qualification of a Node. Take care when you use the cmm_member_setqualif function so that you do not trigger a failover unintentionally. For more information, see Triggering a Failover by Using the cmm_member_setqualif Function. See also the cmm_member_setqualif(3CMM) man page.
If a peer node other than the master or vice-master loses its role in the cluster, it becomes temporarily out of the cluster. This occurs if you use the cmm_membership_remove function on the peer node. The notification sent in this scenario is shown in TABLE 6-11.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, in) —> (master, out) | CMM_MEMBER_LEFT(B) |
If the master node fails, the node can be excluded from the cluster as described in Removing or Excluding a Node. The notifications sent in this scenario are shown in TABLE 6-12.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, vice-master) —> (out, master) | CMM_MEMBER_LEFT(A) CMM_MASTER_ELECTED(B) |
If a node other than the master fails, it can be excluded from the rest of the cluster as described in Removing or Excluding a Node. The notification sent is shown in TABLE 6-13.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, in) —> (master, out) | CMM_MEMBER_LEFT(B) |
The notification sent for this scenario can also be sent for a diskless node.
A switchover is the scheduled transfer of the CMM_MASTER role from the master node to the vice-master node. A switchover is not a failure and does not change the qualification level of the master node. A switchover is not a persistent change. A switchover is usually triggered by the cluster administrator for the maintenance of a node. For more information about the maintenance of nodes, see the Netra High Availability Suite 3.0 1/08 Foundation Services Cluster Administration Guide.
The notifications sent in the case of a switchover from the master to the vice-master node, triggered by calling the cmm_mastership_release function, are shown in TABLE 6-14.
For further information and an example that uses the cmm_mastership_release function to trigger a switchover, see Triggering a Switchover.
A failover is the unscheduled transfer of the CMM_MASTER role from the master node to the vice-master node. A failover is a response to the removal or failure of the master node or disqualification of the master node. This section describes two failover scenarios:
If the master node is removed from the cluster by using the cmm_membership_remove function, the node takes the CMM_OUT_OF_CLUSTER role. This role indicates that the node is out of the cluster, but is configured to be in the cluster, and has access to cluster information. This is described in Membership Roles.
The notification sequence is the same, whether a master failover occurs because the master node fails or because the master role is removed. The master node is excluded from the cluster and the vice-master becomes the master. The notifications sent for this scenario are shown in TABLE 6-15.
The nhcmmd daemon issues notifications of this failover, described in Introduction to Change Notifications.
For an example of how to trigger a failover using the cmm_membership_remove function, see EXAMPLE 7-5.
In this scenario, the failover of the master node is due to the use of the cmm_member_setqualif function. The master node is no longer able to be either master or vice-master, but is in the cluster as a peer node. The vice-master becomes the master node. Because there is no other master-eligible node to take the vice-master role, the cluster loses its 2N redundancy. The former master node must be requalified to restore 2N redundancy. The notifications sent for this scenario are shown in TABLE 6-16.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, vice-master) —> (in, master) | CMM_MASTER_DEMOTED (A) CMM_MASTER_ELECTED (B) |
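The two failover scenarios differ only in the first notification of the pair: CMM_MEMBER_LEFT when the master fails or is removed, CMM_MASTER_DEMOTED when it is disqualified. The sketch below classifies two consecutive notifications on that basis. It is a simplification under assumptions: the CMM types are stand-ins, failover_kind_t and classify_failover are hypothetical names, and because notifications do not carry the previous role, a real application would also confirm from its own records that the first node was the master.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the CMM header; values and widths are assumptions. */
typedef unsigned int cmm_nodeid_t;
typedef enum {
    CMM_MASTER_ELECTED, CMM_MASTER_DEMOTED,
    CMM_MEMBER_JOINED, CMM_MEMBER_LEFT
} cmm_cmchanges_t; /* subset */
typedef struct {
    cmm_cmchanges_t cmchange;
    cmm_nodeid_t    nodeid;
} cmm_cmc_notification_t;

typedef enum {
    FAILOVER_NONE,
    FAILOVER_MASTER_LEFT,         /* master failed or was removed */
    FAILOVER_MASTER_DISQUALIFIED  /* master was disqualified */
} failover_kind_t;

/* Classifies two consecutive notifications, first then second. */
failover_kind_t classify_failover(const cmm_cmc_notification_t *first,
                                  const cmm_cmc_notification_t *second)
{
    assert(first != NULL && second != NULL);
    /* A failover ends with a different node being elected master. */
    if (second->cmchange != CMM_MASTER_ELECTED ||
        first->nodeid == second->nodeid)
        return FAILOVER_NONE;
    if (first->cmchange == CMM_MEMBER_LEFT)
        return FAILOVER_MASTER_LEFT;
    if (first->cmchange == CMM_MASTER_DEMOTED)
        return FAILOVER_MASTER_DISQUALIFIED;
    return FAILOVER_NONE;
}
```

In the disqualification case the former master stays in the cluster as a peer node, so the application might respond by scheduling a requalification rather than waiting for the node to rejoin.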
The nhcmmd daemon issues notifications of this failover, described in Introduction to Change Notifications.
The cmm_member_setqualif function is used during the process of peer node reboot and is called from the node that is being rebooted by the service coordinating the node reboot.
For an example of how to trigger a failover using the cmm_member_setqualif function, see Triggering a Failover by Using the cmm_member_setqualif Function.
When information received by a peer node from the master node is more than 10 seconds old, the information is considered to be stale. A stale cluster does not guarantee that there is no change in the cluster. A stale cluster means that information held by the master node is not reaching a peer node. This can happen if the master node is not functioning correctly and does not send information to the peer node. A stale cluster can also occur if the master node does send information but the information does not reach the peer node, for example, because of a problem in the network. The notification sent for this scenario is shown in TABLE 6-17.
Transition (node A, node B) | Notifications Sent |
---|---|
(master, any) —> (stale cluster) | CMM_STALE_CLUSTER(0) |
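The timing rule can be written out as simple arithmetic. The sketch below models the check described in the text, a 4-second frame period and a 10-second staleness threshold. It is not how the nhcmmd daemon is implemented, and applications do not need to perform this check themselves because the CMM sends the notification. The names FRAME_PERIOD_SEC, STALE_AFTER_SEC, view_is_stale, and frames_missed are hypothetical.

```c
#include <assert.h>
#include <time.h>

#define FRAME_PERIOD_SEC 4   /* master sends a membership frame every 4 s */
#define STALE_AFTER_SEC  10  /* view is stale after more than 10 s of silence */

/* Returns 1 if the last frame is more than STALE_AFTER_SEC old, else 0. */
int view_is_stale(time_t last_frame, time_t now)
{
    assert(now >= last_frame);
    return difftime(now, last_frame) > STALE_AFTER_SEC;
}

/* Returns how many whole frame periods have elapsed since the last frame. */
int frames_missed(time_t last_frame, time_t now)
{
    assert(now >= last_frame);
    return (int)(difftime(now, last_frame) / FRAME_PERIOD_SEC);
}
```

With these values, a node declares the cluster stale only after at least two consecutive frames have failed to arrive, so a single lost frame does not trigger the notification.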
Amnesia is an error condition in which a cluster restarts with stale cluster configuration data. This can happen when a cluster is restarted from a node that was not previously part of the most recent cluster membership list.
Split brain is an error condition in which there are two master nodes. This can be caused by interconnect failure between peer nodes.
During split brain, each master node assumes that it is the only master node in the cluster. A split brain can begin with any combination of roles for nodes A and B. The notification sent for this scenario is shown in TABLE 6-18.
Transition (node A, node B) | Notifications Sent |
---|---|
(any, any) —> (master, master) | CMM_INVALID_CLUSTER |
For information about how to recover from a split brain error condition, see the Netra High Availability Suite 3.0 1/08 Foundation Services Troubleshooting Guide.
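An application can fold the three cluster-wide notifications into a single health value and consult it before trusting its view of the cluster. The sketch below assumes that CMM_VALID_STATE clears both the stale and the invalid condition, and that an invalid cluster stays invalid if a stale notification follows. Both are assumptions of this example, not documented behavior. The CMM enumeration is a stand-in, and cluster_health_t, health_apply, and may_trust_view are hypothetical names.

```c
#include <assert.h>

/* Stand-in for the CMM header; values are assumptions. */
typedef enum {
    CMM_MEMBER_JOINED, CMM_STALE_CLUSTER,
    CMM_INVALID_CLUSTER, CMM_VALID_STATE
} cmm_cmchanges_t; /* subset */

typedef enum { CLUSTER_VALID, CLUSTER_STALE, CLUSTER_INVALID } cluster_health_t;

/* Returns the health value that results from one notification. */
cluster_health_t health_apply(cluster_health_t current, cmm_cmchanges_t c)
{
    switch (c) {
    case CMM_INVALID_CLUSTER:
        return CLUSTER_INVALID;
    case CMM_STALE_CLUSTER:
        /* Keep the more severe condition if it is already recorded. */
        return current == CLUSTER_INVALID ? CLUSTER_INVALID : CLUSTER_STALE;
    case CMM_VALID_STATE:
        return CLUSTER_VALID;
    default:
        return current; /* membership changes do not affect health */
    }
}

/* Returns 1 if the locally held cluster view can be relied on. */
int may_trust_view(cluster_health_t h)
{
    assert(h == CLUSTER_VALID || h == CLUSTER_STALE || h == CLUSTER_INVALID);
    return h == CLUSTER_VALID;
}
```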
Copyright © 2008, Sun Microsystems, Inc. All rights reserved.