Sun Cluster 2.2 System Administration Guide

Appendix D Using Sun Cluster SNMP Management Solutions

This appendix describes how to use SNMP to monitor the behavior of a Sun Cluster configuration.

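This appendix includes the following procedures:

  "D.5.1 How to Change the snmpd.conf File"

  "D.6.1 How to Configure the Cluster SNMP Agent Port"

  "D.7.1 How to Use the SNMP Agent With SunNet Manager to Monitor Clusters"

  "D.7.2 How to Reconfigure smond to Monitor a Different Cluster"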

You can use the following SNMP Management solutions to monitor Sun Cluster configurations:

D.1 Cluster SNMP Agent and Cluster Management Information Base

Sun Cluster includes a Simple Network Management Protocol (SNMP) agent, along with a Management Information Base (MIB), for the cluster. The name of the agent file is snmpd (SNMP daemon) and the name of the MIB is sun.mib.

The cluster SNMP agent is a proxy agent that can monitor several clusters (a maximum of 32) at the same time. You can manage a typical Sun Cluster from the administration workstation or System Service Processor (SSP). Installing the cluster SNMP agent on the administration workstation or SSP regulates network traffic and avoids spending the CPU power of the nodes on transmitting SNMP packets.

The snmpd daemon:

The Super Monitor daemon, smond, collects hardware configuration information and critical cluster events by connecting to the in.mond daemon from each of the member nodes of the cluster(s). The smond daemon then reports the same information to the SNMP daemon (snmpd).


Note -

You need to configure only one smond daemon to collect cluster information for several clusters.


The SUNWcsnmp package contains the following:

For additional information on the snmpd and smond daemons, see Appendix B, Sun Cluster Man Page Quick Reference.

D.2 Cluster Management Information Base

The Management Information Base (MIB) is a collection of objects that can be accessed through a network management protocol. The objects should be defined in a generic and consistent manner so that various management platforms can read and parse the definitions.

Run the snmpd daemon on the management server, which is the cluster administration workstation, or on any client. This agent provides information (gathered from smond) for all the SNMP attributes defined in the cluster MIB. This MIB file is typically compiled into an "SNMP-aware" network manager such as the SunNet Manager Console. See "D.5 Changing the snmpd.conf File".

The sun.mib file provides information about clusters in the following tables:

  clustersTable

  clusterNodesTable

  switchesTable

  portsTable

  lhostTable

  dsTable

  dsinstTable


Note -

In these tables, time refers to the local time on the SNMP server (on which the table is maintained). Thus, the time indicates when an attribute change was reported on the server.


D.2.1 The clustersTable Attributes

The clusters table consists of entries for all of the monitored clusters. Each entry in the table contains specific attributes that provide cluster information. See Table D-1 for the clustersTable attributes.

Table D-1 clustersTable Attributes

Attribute Names     Description
clusterName         The name of the cluster
clusterDescr        A description of the cluster
clusterVersion      The release version of the cluster
numNodes            The number of nodes in the cluster
nodeNames           The names of all the nodes in the cluster, separated by commas
quorumDevices       The names of all the quorum devices in the cluster, separated by commas
clusterLastUpdate   The last time any of the attributes of this entry were modified
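
Because nodeNames is a comma-separated list, a management script can cross-check it against numNodes. The following is a minimal sketch in POSIX shell. The function name count_names and the sample values are illustrative assumptions and are not part of Sun Cluster.

```shell
#!/bin/sh
# count_names LIST
# Print the number of comma-separated names in LIST (0 for an empty list).
# Illustrative helper only; it is not a Sun Cluster command.
count_names() {
    if [ -z "$1" ]; then
        echo 0
        return
    fi
    # Split on commas and print the number of resulting fields.
    echo "$1" | awk -F',' '{ print NF }'
}

# Hypothetical values, as they might be read from one clustersTable entry.
numNodes=2
nodeNames="phys-hahost1,phys-hahost2"

if [ "$(count_names "$nodeNames")" -eq "$numNodes" ]; then
    echo "numNodes matches nodeNames"
else
    echo "numNodes does not match nodeNames"
fi
```

The same helper could be applied to the quorumDevices attribute, which uses the same comma-separated form.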

D.2.2 The clusterNodesTable Attributes

The cluster nodes table consists of the known nodes of all of the monitored clusters. Each entry contains specific information about the node. See Table D-2 for the clusterNodesTable attributes.


Note -

When using a cross-reference, note that the belongsToCluster attribute acts as the key reference between this table and the clustersTable.


Table D-2 clusterNodesTable Attributes

Attribute Names    Description
nodeName           The host name of the node.
belongsToCluster   The name of the cluster (to which this node belongs).
scState            State of the Sun Cluster software component on this node (Stopped, Aborted, In Transition, Included, Excluded, or Unknown). An enterprise-specific trap signals a change in state.
vmState            State of the volume manager software component on this node. An enterprise-specific trap signals a change in state.
dbState            State of the database software component on this node (Down, Up, or Unknown). An enterprise-specific trap signals a change in state.
vmType             The type of volume manager currently being used on this node.
vmOnNode           Mode of the SSVM software component on this node (Master, Slave, or Unknown). An enterprise-specific trap signals a change in state. This attribute is not valid for clusters with other volume managers.
nodeLastUpdate     The last time any of the attributes of this entry were modified.
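
The belongsToCluster key can be used to select the nodes of one cluster from the clusterNodesTable. The following sketch assumes the table rows have already been retrieved and written as plain text, one row per line, in the form "nodeName belongsToCluster scState". That text layout, the function names, and the sample rows are assumptions made for illustration.

```shell
#!/bin/sh
# nodes_in_cluster CLUSTER
# Read clusterNodesTable rows from standard input, one per line, in the
# assumed form "nodeName belongsToCluster scState", and print the nodeName
# of every row whose belongsToCluster column equals CLUSTER.
nodes_in_cluster() {
    while read -r node cluster rest; do
        if [ "$cluster" = "$1" ]; then
            echo "$node"
        fi
    done
}

# Hypothetical rows; real values would come from an SNMP query.
sample_rows() {
    cat <<'EOF'
phys-hahost1 sc-cluster Included
phys-hahost2 sc-cluster Stopped
phys-dbhost1 db-cluster Included
EOF
}

sample_rows | nodes_in_cluster sc-cluster
```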

D.2.3 The switchesTable Attributes

The switches table consists of entries for all of the switches. Each entry in the table contains specific information about a switch in a cluster. See Table D-3 for the switchesTable attributes.

Table D-3 switchesTable Attributes

Attribute Names    Description
switchName         The name of the switch
numPorts           The number of ports on the switch
connectedNodes     The names of all the nodes that are presently connected to the ports of the switch
switchLastUpdate   The last time any of the switch attributes of this entry were modified

D.2.4 The portsTable Attributes

The ports table consists of entries for all of the switch ports. Each entry in the table contains specific information about a port within a switch. See Table D-4 for the portsTable attributes.


Note -

When using a cross-reference, note that the belongsToSwitch attribute acts as the key reference between this table and the switchesTable.


Table D-4 portsTable Attributes

Attribute Names   Description
portId            The port ID or number
belongsToSwitch   The name of the switch (to which this port belongs)
connectedNode     The name of the node (to which this port is presently connected)
nodeAdapterId     The adapter ID (of the SCI card) on the node to which this port is connected
portStatus        The status of the port (Active, Inactive, and so forth)
portLastUpdate    The last time any of the port attributes of this entry were modified

D.2.5 The lhostTable Attributes

The logical hosts table consists of entries for each logical host configured in the cluster. See Table D-5 for the lhostTable attributes.

Table D-5 lhostTable Attributes

Attribute Names   Description
lhostName         The name of the logical host
lhostMasters      The list of node names that constitute the logical host
lhostCurrMaster   The name of the node that is currently the master for the logical host
lhostDS           The list of data services configured to run on this logical host
lhostDG           The disk groups configured on this logical host
lhostLogicalIP    The logical IP address associated with this logical host
lhostStatus       The current status of the logical host (UP or DOWN)
lhostLastUpdate   The last time any of the attributes of this entry were modified

D.2.6 The dsTable Attributes

The data services table consists of entries for all data services that are configured for all logical hosts in the monitored clusters. Each entry in the table consists of specific information about a data service configured on a logical host. See Table D-6 for the dsTable attributes.


Note -

When using a cross-reference, note that the dsOnLhost attribute acts as a key reference between this table and the lhostTable.


Table D-6 dsTable Attributes

Attribute Names   Description
dsName            The name of the data service.
dsOnLhost         The name of the logical host on which the data service is configured.
dsReg             The value is 1 or 0 depending on whether the data service is registered and configured to run (1) or not run (0).
dsStatus          The current status of the data service (ON/OFF/INST DOWN).
dsDep             The list of other data services on which this data service depends.
dsPkg             The package name for the data service.
dsLastUpdate      The last time any of the attributes of this entry were modified.

D.2.7 The dsinstTable Attributes

The data service instance table consists of entries for all data service instances. See Table D-7 for the dsinstTable attributes.


Note -

When using a cross-reference, note that the dsinstOfDS attribute can be used as a key reference between this table and the dsTable. Similarly, the dsinstOnLhost attribute can be used as a key reference between this table and the lhostTable.


Table D-7 dsinstTable Attributes

Attribute Names    Description
dsinstName         The name of the data service instance
dsinstOfDS         The name of the data service of which this is a data service instance
dsinstOnLhost      The name of the logical host on which this data service instance is running
dsinstStatus       The status of the data service instance
dsinstLastUpdate   The last time any of the attributes of this entry were modified

D.3 Cluster SNMP Daemon and Super Monitor Daemon Operation

The SNMP daemon operates under the following provisions:

D.4 SNMP Traps

SNMP traps are asynchronous notifications generated by the SNMP agent that indicate an unintended change in the state of monitored objects.

The software generates Sun Cluster-specific traps for critical cluster events. These events are listed in the following tables.

Table D-8 lists the Sun Cluster traps reflecting the state of the cluster software on a node.

Table D-8 Sun Cluster Traps Reflecting the Software on a Node

Trap Number   Trap Name
              sc:stopped
              sc:aborted
              sc:in_transition
              sc:included
              sc:excluded
              sc:unknown

Table D-9 lists the Sun Cluster traps reflecting the state of the volume manager on a node.

Table D-9 Sun Cluster Traps Reflecting the Volume Manager on a Node

Trap Number   Trap Name
10            vm:down
11            vm:up
12            vm:unknown

Table D-10 lists the Sun Cluster traps reflecting the state of the database on a node.

Table D-10 Sun Cluster Traps Reflecting the Database on a Node

Trap Number   Trap Name
20            db:down
21            db:up
22            db:unknown

Table D-11 lists the Sun Cluster traps reflecting the nature of the Cluster Volume Manager (master or slave) on a node.

Table D-11 Sun Cluster Traps Reflecting Cluster Volume Manager on a Node

Trap Number   Trap Name
30            vm_on_node:master
31            vm_on_node:slave
32            vm_on_node:unknown

Table D-12 lists the Sun Cluster traps reflecting the states of a logical host.

Table D-12 Sun Cluster Traps Reflecting the States of a Logical Host

Trap Number   Trap Name
40            lhost:givingup
41            lhost:given
42            lhost:takingover
43            lhost:taken
46            lhost:unknown

Table D-13 lists the Sun Cluster traps reflecting the states of a data service instance.

Table D-13 Sun Cluster Traps Reflecting the States of a Data Service Instance

Trap Number   Trap Name
50            ds:started
51            ds:stopped
52            ds:in-transition
53            ds:failed-locally
54            ds:failed-remotely
57            ds:unknown

Table D-14 lists the Sun Cluster traps reflecting the states of the HA-NFS data service.

Table D-14 Sun Cluster Traps Reflecting the States of the HA-NFS Data Service Instance

Trap Number   Trap Name
60            hanfs:start
61            hanfs:stop
70            hanfs:unknown

Table D-15 lists the Sun Cluster traps reflecting SNMP errors.

Table D-15 Sun Cluster Traps Reflecting SNMP Errors

Trap Number   Trap Name
100           SOCKET_ERROR:node_out_of_system_resources
101           CONNECT_ERROR:node_out_of_system_resources
102           BADMOND_ERROR:node_running_bad/old_mond_version
103           NOMOND_ERROR:mond_not_installed_on_node
104           NOMONDYET_ERROR:mond_on_node_not_responding:node_may_be_rebooting
105           TIMEOUT_ERROR:timed_out_upon_trying_to_connect_to_nodes_mond
106           UNREACHABLE_ERROR:node's_mond_unreachable:network_problems??
107           READFAILED_ERROR:node_out_of_system_resources
108           NORESPONSE_ERROR:node_out_of_system_resources
109           BADRESPONSE_ERROR:unexpected_welcome_message_from_node's_mond
110           SHUTDOWN_ERROR:node's_mond_shutdown
200           Fatal:super_monitor_daemon(smond)_exited!

For trap numbers 100-110, check the faulty node and fix the problem. For trap number 200, see "D.8 SNMP Troubleshooting".
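
A trap handler on the management station could use the trap number to decide which subsystem a trap concerns. The following sketch maps the trap numbers listed in Tables D-9 through D-15 to a short category string. The function name trap_category and the category strings are illustrative assumptions, and the traps of Table D-8 are not covered.

```shell
#!/bin/sh
# trap_category NUMBER
# Print the subsystem that a Sun Cluster trap number belongs to, using the
# trap numbers listed in Tables D-9 through D-15. Any other number prints
# "unlisted". The category strings are illustrative only.
trap_category() {
    case "$1" in
        10|11|12)           echo "volume manager" ;;
        20|21|22)           echo "database" ;;
        30|31|32)           echo "cluster volume manager mode" ;;
        40|41|42|43|46)     echo "logical host" ;;
        50|51|52|53|54|57)  echo "data service instance" ;;
        60|61|70)           echo "HA-NFS" ;;
        10[0-9]|110)        echo "SNMP error: check the faulty node" ;;
        200)                echo "smond exited: see SNMP troubleshooting" ;;
        *)                  echo "unlisted" ;;
    esac
}

trap_category 105
```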

D.5 Changing the snmpd.conf File

The snmpd.conf file holds configuration information for the snmpd daemon. Each entry in the file consists of a keyword followed by a parameter string. In most cases, the default values in the file do not need to be changed.

D.5.1 How to Change the snmpd.conf File

  1. Edit the snmpd.conf file.

    For descriptions of the keywords, refer to the snmpd(7) man page.

  2. After making any changes to the snmpd.conf file, stop the smond and snmpd daemons, then restart them by entering:

    # /opt/SUNWcluster/bin/smond_ctl stop
    # /opt/SUNWcluster/bin/init.snmpd stop
    # /opt/SUNWcluster/bin/init.snmpd start
    # /opt/SUNWcluster/bin/smond_ctl start
    

    An example snmpd.conf file follows.

    sysdescr        Sun SNMP Agent, SPARCstation 10, Company Property Number 123456
    syscontact      Coby Phelps
    sysLocation     Room 123
    #
    system-group-read-community     public
    system-group-write-community    private
    #
    read-community  all_public
    write-community all_private
    #
    trap            localhost
    trap-community  SNMP-trap
    #
    #kernel-file    /vmunix
    #
    managers        lvs golden
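
    Because each entry is a keyword followed by a parameter string, the current value of a keyword can be read with a short filter. The following is a sketch only: the function name conf_value is hypothetical, and it assumes a POSIX awk (nawk on older Solaris releases).

```shell
#!/bin/sh
# conf_value KEYWORD [FILE]
# Print the parameter string of the first entry whose keyword is KEYWORD.
# Reads FILE, or standard input if FILE is omitted. Commented-out entries
# such as "#kernel-file" do not match, because their first field differs.
conf_value() {
    keyword=$1
    shift
    awk -v k="$keyword" '
        $1 == k {
            # Strip the keyword and the white space that follows it.
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")
            print
            exit
        }
    ' "$@"
}

# Demonstration on an excerpt of the example file above.
conf_value trap <<'EOF'
read-community  all_public
trap            localhost
trap-community  SNMP-trap
EOF
```

    The match is on the whole keyword, so asking for trap does not return the trap-community entry.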

D.6 Configuring the Cluster SNMP Agent Port

By default, the cluster SNMP agent listens on User Datagram Protocol (UDP) Port 161 for requests from the SNMP manager, for example, SunNet Manager Console. You can change this port by using the -p option to the snmpd and smond daemons.

Both the snmpd and smond daemons must be configured on the same port in order to function properly.


Caution -

If you are installing the cluster SNMP agent on an SSP or an Administrative workstation running the Solaris 2.6 Operating Environment or compatible versions, always configure the snmpd and the smond programs on a port other than the default UDP port 161.


For example, with the SSP, the cluster SNMP agent interferes with the SSP SNMP agent, which also uses UDP port 161. This interference could result in the loss of the reliability, availability, and serviceability (RAS) features of the Sun Enterprise 10000 server.

D.6.1 How to Configure the Cluster SNMP Agent Port

To configure the cluster SNMP agent on a port other than the default UDP Port 161, perform the following steps.

  1. Edit the /opt/SUNWcluster/bin/init.snmpd file and change the value of the CSNMP_PORT variable from 161 to the desired value.

  2. Edit the /opt/SUNWcluster/bin/smond_ctl file and change the value of the CSNMP_PORT variable from 161 to the same value you chose in Step 1.

  3. Stop and then restart both the snmpd and smond daemons for the changes to take effect.

    # /opt/SUNWcluster/bin/smond_ctl stop
    # /opt/SUNWcluster/bin/init.snmpd stop
    # /opt/SUNWcluster/bin/smond_ctl start
    # /opt/SUNWcluster/bin/init.snmpd start
    

    Note -

    Configuration files specific to the SNMP manager might need to be edited for the SNMP manager to become aware of the new port number. Refer to your SNMP manager documentation for more information. Alternatively, you can configure the master SNMP agent on the Administrative workstation to start the cluster SNMP proxy agent as a subagent on a port other than 161. See the Solstice Enterprise Agents User's Guide or the snmpdx(1M) man page for information on how to configure the master SNMP agent.
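
    Because both daemons must use the same port, it can be useful to confirm that the two scripts agree after editing them. The following sketch assumes that each script sets the variable with a line of the form CSNMP_PORT=<number>. The exact form of the assignment in the shipped scripts is an assumption here, and the function names are hypothetical.

```shell
#!/bin/sh
# csnmp_port FILE
# Print the first numeric value assigned to CSNMP_PORT in FILE, assuming an
# unquoted assignment of the form CSNMP_PORT=<number>.
csnmp_port() {
    sed -n 's/^[[:space:]]*CSNMP_PORT=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# same_port FILE1 FILE2
# Succeed only if both files define the same, non-empty CSNMP_PORT value.
same_port() {
    p1=$(csnmp_port "$1")
    p2=$(csnmp_port "$2")
    [ -n "$p1" ] && [ "$p1" = "$p2" ]
}
```

    For example, same_port /opt/SUNWcluster/bin/init.snmpd /opt/SUNWcluster/bin/smond_ctl returns a zero exit status only when both files carry the same port value.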


D.7 Using the SNMP Agent With SunNet Manager

The cluster SNMP agent has been qualified with the SunNet Manager. Perform the following procedures prior to using SunNet Manager to monitor clusters.


Note -

These procedures assume that you are using UDP port 161 for SNMP. If you changed the port number as described in "D.6 Configuring the Cluster SNMP Agent Port", you need to run the SunNet Manager SNMP proxy agent, na.snmp, to use the alternate port.


D.7.1 How to Use the SNMP Agent With SunNet Manager to Monitor Clusters

  1. Copy the cluster MIB /opt/SUNWcluster/etc/sun.mib to /opt/SUNWconn/snm/agents/cluster.mib on the SunNet Manager console.

  2. On the SunNet Manager console, run mib2schema on the copied cluster.mib file:

    # /opt/SUNWconn/snm/bin/mib2schema cluster.mib
    
  3. On the Sun Cluster Administrative workstation, edit the snmpd.conf file and set the parameter string in the trap keyword to the name of the SunNet Manager console.

    For more information on editing the snmpd.conf file, refer to "D.5 Changing the snmpd.conf File".

  4. Run the smond_conf command on the Sun Cluster Administrative workstation for each cluster you want to monitor. For example:

    # /opt/SUNWcluster/bin/smond_conf -h [clustername ...]

  5. Set the proxy for cluster-snmp to be the name of the SunNet Manager console.


    Note -

    In order to monitor clusters, you must also monitor the Administrative workstation using SunNet Manager.
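
    The edit to the trap keyword in Step 3 can also be expressed as a filter. The following sketch copies snmpd.conf text from standard input to standard output and replaces only the parameter string of the trap entry. The function name set_trap_host and the host name snmconsole are hypothetical, and a POSIX awk (nawk on older Solaris releases) is assumed.

```shell
#!/bin/sh
# set_trap_host HOST
# Filter: copy snmpd.conf text from standard input to standard output,
# replacing the parameter string of the trap keyword with HOST. All other
# entries, including trap-community, pass through unchanged.
set_trap_host() {
    awk -v host="$1" '
        $1 == "trap" { print "trap            " host; next }
        { print }
    '
}

# Demonstration with a hypothetical console name.
set_trap_host snmconsole <<'EOF'
read-community  all_public
trap            localhost
trap-community  SNMP-trap
EOF
```

    Writing the result to a separate file and reviewing it before replacing snmpd.conf keeps the original intact. The daemons still need to be restarted as described in "D.5.1 How to Change the snmpd.conf File".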


D.7.2 How to Reconfigure smond to Monitor a Different Cluster

You can reconfigure the smond daemon to monitor a different cluster.

  1. Stop the snmpd daemon by using:

    # /opt/SUNWcluster/bin/init.snmpd stop
    
  2. Reconfigure the smond daemon by using:

    # /opt/SUNWcluster/bin/smond_conf -h [clustername ...]

  3. Start the snmpd daemon by using:

    # /opt/SUNWcluster/bin/init.snmpd start
    
  4. Start the smond daemon by using:

    # /opt/SUNWcluster/bin/smond_ctl start
    

D.8 SNMP Troubleshooting

If the cluster MIB tables are not populated in your application, or if you receive trap number 200, verify that the snmpd and smond daemons are running by entering:

# ps -ef | grep snmpd
# ps -ef | grep smond

If a daemon is not running, the output contains no line for it other than, possibly, the grep command itself.

If the daemons are not running, enter:

# /opt/SUNWcluster/bin/init.snmpd start
# /opt/SUNWcluster/bin/smond_ctl start
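
The check can be scripted so that the grep command itself is never mistaken for a running daemon. The following is a minimal sketch: the function name daemon_listed is hypothetical, and the sample ps lines are invented for the demonstration.

```shell
#!/bin/sh
# daemon_listed NAME
# Read "ps -ef" output from standard input and succeed if some process
# other than a grep command mentions NAME.
daemon_listed() {
    grep "$1" | grep -v grep >/dev/null
}

# Demonstration on invented ps output: snmpd is listed, smond is not.
sample_ps() {
    cat <<'EOF'
root   412     1  0 10:02:11 ?        0:00 snmpd
root   977   950  0 10:05:40 pts/1    0:00 grep snmpd
EOF
}

for daemon in snmpd smond; do
    if sample_ps | daemon_listed "$daemon"; then
        echo "$daemon is running"
    else
        echo "$daemon is not running"
    fi
done
```

On a live system, the input would come from ps -ef instead of sample_ps.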