CHAPTER 11
This chapter describes how the Node Management Agent (NMA) can be used to monitor and manipulate a cluster.
The NMA is compliant with the Java Management Extensions (JMX) and based on the Java Dynamic Management Kit. The NMA provides access to cluster statistics through the Simple Network Management Protocol (SNMP) or through JMX clients using HTTP. The NMA supports the Internet Engineering Task Force standard RFC 2573.
The NMA retrieves statistics about the cluster membership, the reliable transport mechanism, the network file system, and the monitoring of process daemons. The NMA can also be used to initiate a switchover, change the maximum number of times that the Daemon Monitor attempts to restart a daemon, reset the current retry count for a daemon, and listen for certain cluster notifications.
The following figure shows a remote client accessing nodes in a cluster.
The NMA collects two types of statistics: node-level statistics and cluster-level statistics.
There is an NMA on each peer node that collects node-level statistics, that is, statistics for that node. Each NMA collects statistics about CGTP, CMM, and the Daemon Monitor. The NMA on each server node collects node-level statistics about Reliable NFS.
The NMA on the master node collects cluster-level statistics, that is, statistics about the cluster.
The following table describes the statistics that are collected by the NMA on each peer node and on the master node.
The NMA can collect the following statistics:
Some cluster membership statistics are collected on each node. Other cluster membership statistics can be collected by the NMA on the master node only. Statistics collected from the master node include the following:
CGTP statistics include a set of general statistics and a set of dedicated packet filtering statistics. Packet filtering statistics count the number of packets successfully received through each of the redundant CGTP links. In this way, packet filtering statistics measure the quality of the communication on each link.
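The way per-link receive counters can indicate link quality might be sketched as follows. This is an illustrative sketch only: the class name, method name, and link names are hypothetical and are not part of the NMA API, and the heuristic (comparing each link's counter to the highest counter among the links) is an assumption rather than the method the NMA uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the NMA API.
final class CgtpLinkQuality {

    /**
     * Returns, for each link, its received-packet count divided by the
     * highest count seen on any link. A value well below 1.0 suggests
     * that the link is dropping packets relative to its peer.
     */
    static Map<String, Double> relativeQuality(Map<String, Long> receivedPerLink) {
        long best = 0;
        for (long count : receivedPerLink.values()) {
            best = Math.max(best, count);
        }
        Map<String, Double> quality = new LinkedHashMap<>();
        for (Map.Entry<String, Long> link : receivedPerLink.entrySet()) {
            // With no traffic at all, no link can be judged worse than another.
            double ratio = (best == 0) ? 1.0 : link.getValue().doubleValue() / best;
            quality.put(link.getKey(), ratio);
        }
        return quality;
    }

    public static void main(String[] args) {
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("link0", 1000L); // assumed sample values
        counters.put("link1", 900L);
        System.out.println(relativeQuality(counters));
    }
}
```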
Reliable NFS statistics give the file status and disk replication status. These statistics can be collected for server nodes only.
External Address Manager statistics
These statistics give the status of the floating addresses and IPMP groups configured in the nhfs.conf file. The PID of the nheamd daemon is also reported.
A further set of statistics gives the status of each replicated disk, its configuration, information about the diskset used by Netra HA Suite, the hosts that are connected to the shared devices, the owner of the diskset, and fencing information.
The NMA running on the master node cascades the statistics from the NMA on each of the peer nodes into its own namespace. Through this cascade, the NMA on the master node can see the statistics of all of the peer nodes and therefore has a view of the entire cluster. The following figure illustrates the cascade of statistics from the peer nodes to the master node.
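The idea of cascading can be sketched as merging each node's statistics into a single namespace in which every entry is qualified by the name of the node that produced it. The sketch below is conceptual: the class name, the `node/statistic` key format, and the sample statistic names are assumptions made for illustration, and the sketch does not reproduce the cascading mechanism that the NMA actually uses.

```java
import java.util.Map;
import java.util.TreeMap;

// Conceptual sketch only; not the NMA cascading implementation.
final class CascadeSketch {

    /**
     * Merges per-node statistics into one map whose keys have the
     * assumed form "nodeName/statisticName".
     */
    static Map<String, Long> cascade(Map<String, Map<String, Long>> statsByNode) {
        Map<String, Long> masterView = new TreeMap<>();
        for (Map.Entry<String, Map<String, Long>> node : statsByNode.entrySet()) {
            for (Map.Entry<String, Long> stat : node.getValue().entrySet()) {
                // Qualifying each key with the node name keeps entries from
                // different nodes distinct inside the shared namespace.
                masterView.put(node.getKey() + "/" + stat.getKey(), stat.getValue());
            }
        }
        return masterView;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> statsByNode = new TreeMap<>();
        statsByNode.put("node10", Map.of("daemonRestarts", 0L)); // assumed names
        statsByNode.put("node20", Map.of("daemonRestarts", 2L));
        System.out.println(cascade(statsByNode));
    }
}
```

A client that reads only the merged map sees every node's statistics without contacting the peer nodes individually, which is the property that the cascade gives the master node.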
A set of NMA APIs can be used to develop applications that monitor and react to the statistics produced by a cluster. For information about developing applications with the NMA APIs, see the Netra High Availability Suite 3.0 1/08 Foundation Services NMA Programming Guide.
The statistics that a cluster generates depend on the type, size, and arrangement of the cluster. Each cluster has an individual set of statistics. Knowing the statistics that your cluster generates when it runs correctly can help you to interpret the statistics when the cluster is failing. Use the NMA and its APIs to establish a set of statistics or a benchmark for your cluster when it is working correctly.
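Comparing current statistics with a benchmark recorded while the cluster was healthy can be sketched as follows. The class, method, and statistic names are hypothetical, and the use of a relative tolerance is an assumption chosen for illustration. Obtaining the statistic values through the NMA APIs is outside the scope of this sketch.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the NMA API.
final class BaselineCheck {

    /**
     * Returns the statistics whose relative deviation from the baseline
     * exceeds the tolerance, mapped to that deviation. For example, a
     * tolerance of 0.25 flags values more than 25 percent off the baseline.
     */
    static Map<String, Double> outOfRange(Map<String, Double> baseline,
                                          Map<String, Double> current,
                                          double tolerance) {
        Map<String, Double> flagged = new TreeMap<>();
        for (Map.Entry<String, Double> entry : baseline.entrySet()) {
            Double observed = current.get(entry.getKey());
            double expected = entry.getValue();
            if (observed == null || expected == 0.0) {
                continue; // no sample, or no meaningful relative deviation
            }
            double deviation = Math.abs(observed - expected) / Math.abs(expected);
            if (deviation > tolerance) {
                flagged.put(entry.getKey(), deviation);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        Map<String, Double> baseline = Map.of("packetsPerSecond", 1000.0); // assumed values
        Map<String, Double> current = Map.of("packetsPerSecond", 400.0);
        System.out.println(outOfRange(baseline, current, 0.25));
    }
}
```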
The NMA can be configured to initiate a switchover or to change the following Daemon Monitor parameters:
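The maximum number of times that the Daemon Monitor attempts to restart a daemon or group of daemons
The current number of times that the Daemon Monitor has attempted to restart a daemon or group of daemons, which can be reset to zero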
For more information, see the Netra High Availability Suite 3.0 1/08 Foundation Services NMA Programming Guide.
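Because the NMA is JMX compliant, changing a parameter of this kind can be pictured as setting an attribute on a managed bean. The following sketch uses only the standard `javax.management` API with an in-process MBean server. The `DaemonEntry` MBean, its attributes, its `resetRetryCount` operation, and the object name are all hypothetical stand-ins; the real NMA MBeans and their names are described in the NMA Programming Guide.

```java
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

// Illustrative sketch; the MBean below is hypothetical, not an NMA MBean.
final class RetryLimitSketch {

    public interface DaemonEntryMBean {
        int getMaxRetries();
        void setMaxRetries(int maxRetries);
        int getRetryCount();
        void resetRetryCount();
    }

    public static class DaemonEntry implements DaemonEntryMBean {
        private volatile int maxRetries;
        private volatile int retryCount;

        public DaemonEntry(int maxRetries, int retryCount) {
            this.maxRetries = maxRetries;
            this.retryCount = retryCount;
        }
        public int getMaxRetries() { return maxRetries; }
        public void setMaxRetries(int maxRetries) { this.maxRetries = maxRetries; }
        public int getRetryCount() { return retryCount; }
        public void resetRetryCount() { retryCount = 0; }
    }

    /** Returns {maxRetries, retryCount} after the two management actions. */
    static int[] demo() throws Exception {
        MBeanServer server = MBeanServerFactory.newMBeanServer();
        // Assumed object name, chosen for this example only.
        ObjectName name = new ObjectName("example.nma:type=DaemonEntry,daemon=exampled");
        server.registerMBean(new DaemonEntry(3, 2), name);

        // Change the maximum number of restart attempts.
        server.setAttribute(name, new Attribute("MaxRetries", 5));
        // Reset the current retry count to zero.
        server.invoke(name, "resetRetryCount", null, null);

        int max = (Integer) server.getAttribute(name, "MaxRetries");
        int count = (Integer) server.getAttribute(name, "RetryCount");
        return new int[] {max, count};
    }

    public static void main(String[] args) throws Exception {
        int[] result = demo();
        System.out.println("maxRetries=" + result[0] + ", retryCount=" + result[1]);
    }
}
```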
The NMA can be used to listen for notifications of the following cluster events:
The maximum number of times that the Daemon Monitor attempts to restart a daemon or group of daemons is reached
The maximum number of times that the Daemon Monitor attempts to restart a daemon or group of daemons is changed by the NMA
The current number of times that the Daemon Monitor has attempted to restart a daemon or group of daemons is reset to zero by the NMA
For more information, see the Netra High Availability Suite 3.0 1/08 Foundation Services NMA Programming Guide.
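Listening for a notification such as "maximum restart attempts reached" can be sketched with the standard JMX notification model. In this sketch, the `MonitoredDaemon` MBean, the notification type string, and the object name are assumptions invented for the example. The notification types that the NMA actually emits are documented in the NMA Programming Guide.

```java
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;
import javax.management.NotificationListener;
import javax.management.ObjectName;

// Illustrative sketch; the MBean and notification type are hypothetical.
final class RetryNotificationSketch {

    static final String MAX_RETRIES_REACHED = "example.nma.daemon.maxRetriesReached";

    public interface MonitoredDaemonMBean {
        int getRetryCount();
        void recordRestartAttempt();
    }

    public static class MonitoredDaemon extends NotificationBroadcasterSupport
            implements MonitoredDaemonMBean {
        private final int maxRetries;
        private int retryCount;
        private long sequence;

        public MonitoredDaemon(int maxRetries) { this.maxRetries = maxRetries; }

        public synchronized int getRetryCount() { return retryCount; }

        public synchronized void recordRestartAttempt() {
            retryCount++;
            if (retryCount == maxRetries) {
                // Emit one notification when the limit is reached.
                sendNotification(new Notification(MAX_RETRIES_REACHED, this,
                        ++sequence, "Maximum restart attempts reached"));
            }
        }
    }

    /** Registers a listener, simulates two restart attempts, returns received types. */
    static List<String> demo() throws Exception {
        MBeanServer server = MBeanServerFactory.newMBeanServer();
        ObjectName name = new ObjectName("example.nma:type=MonitoredDaemon,daemon=exampled");
        server.registerMBean(new MonitoredDaemon(2), name);

        List<String> received = new ArrayList<>();
        NotificationListener listener =
                (notification, handback) -> received.add(notification.getType());
        server.addNotificationListener(name, listener, null, null);

        server.invoke(name, "recordRestartAttempt", null, null);
        server.invoke(name, "recordRestartAttempt", null, null);
        return received;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

A remote client would register its listener through a connector rather than directly on an in-process MBean server, but the listener interface is the same.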
Copyright © 2008, Sun Microsystems, Inc. All rights reserved.