Sun Cluster 2.2 System Administration Guide

6.5 Administering the Switch Management Agent

The Switch Management Agent (SMA) is a cluster module that maintains communication channels over the cluster private interconnect. It monitors the private interconnect and invokes a failover to a backup network if it detects a failure.

Note the following limitations before beginning the procedure:

6.5.1 How to Add Switches and SCI Cards

Use this procedure to add switches and SCI cards to cluster nodes. See the sm_config(1M) man page for details.

  1. Edit the sm_config template file to include the configuration changes.

    Normally, the template file is located in /opt/SUNWsma/bin/Examples.

  2. Configure the SCI SBus cards by running the sm_config(1M) command from one of the nodes.

    Run the command a second time to verify that the SCI node IDs and IP addresses are assigned correctly to the cluster nodes. Incorrect assignments can cause miscommunication between the nodes.

  3. Reboot the new nodes.

6.5.2 SCI Software Troubleshooting

If a problem occurs with the SCI software, verify that the following are true:

Also note the following problems and solutions:

6.5.3 How to Verify Connectivity Between Nodes

There are two ways to verify the connectivity between nodes: by running get_ci_status(1M) or by running ping(1).

    Run the get_ci_status(1M) command on all cluster nodes.

Example output for get_ci_status(1M) is shown below.

# /opt/SUNWsma/bin/get_ci_status
sma: sci #0: sbus_slot# 1; adapter_id 8 (0x08); ip_address 1; switch_id# 0; 
port_id# 0; Adapter Status - UP; Link Status - UP
sma: sci #1: sbus_slot# 2; adapter_id 12 (0x0c); ip_address 17; switch_id# 1; 
port_id# 0; Adapter Status - UP; Link Status - UP
sma: Switch_id# 0
sma: port_id# 1: host_name = interconn2; adapter_id = 72; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 136; active | operational
sma: port_id# 3: host_name = interconn4; adapter_id = 200; active | operational
sma: Switch_id# 1
sma: port_id# 1: host_name = interconn2; adapter_id = 76; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 140; active | operational
sma: port_id# 3: host_name = interconn4; adapter_id = 204; active | operational
# 

The first four lines indicate the status of the local node (in this case, interconn1). It is communicating with both switch_id# 0 and switch_id# 1 (Link Status - UP).

sma: sci #0: sbus_slot# 1; adapter_id 8 (0x08); ip_address 1; switch_id# 0; 
port_id# 0; Adapter Status - UP; Link Status - UP
sma: sci #1: sbus_slot# 2; adapter_id 12 (0x0c); ip_address 17; switch_id# 1; 
port_id# 0; Adapter Status - UP; Link Status - UP

The rest of the output indicates the global status of the other nodes in the cluster. All the ports on the two switches are communicating with their nodes. If there is a problem with the hardware, inactive is displayed (instead of active). If there is a problem with the software, inoperational is displayed (instead of operational).

sma: Switch_id# 0
sma: port_id# 1: host_name = interconn2; adapter_id = 72; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 136; active | operational
sma: port_id# 3: host_name = interconn4; adapter_id = 200; active | operational
sma: Switch_id# 1
sma: port_id# 1: host_name = interconn2; adapter_id = 76; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 140; active | operational
sma: port_id# 3: host_name = interconn4; adapter_id = 204; active | operational
#
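When checking many nodes, it can help to scan the get_ci_status(1M) output for the trouble keywords described above (inactive, inoperational, or a DOWN status) rather than reading every line. The following is a sketch only; the sample file stands in for real output, and on a live node you would pipe /opt/SUNWsma/bin/get_ci_status into the same grep.

```shell
# Sample output standing in for a live get_ci_status run; one port shows
# a hardware problem (inactive) for illustration.
cat > /tmp/ci_status.out <<'EOF'
sma: port_id# 1: host_name = interconn2; adapter_id = 72; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 136; inactive | operational
EOF

# Print only the lines that indicate a problem. On a live node, use:
#   /opt/SUNWsma/bin/get_ci_status | grep -E 'inactive|inoperational|DOWN'
grep -E 'inactive|inoperational|DOWN' /tmp/ci_status.out
```

If the grep prints nothing, no problem keywords were found in the output.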

    Run the ping(1) command on all the IP addresses of remote nodes.

The ping(1) command uses the following syntax.

# ping IP-address

The IP addresses are found in the /etc/sma.ip file. Be sure to run the ping(1) command for each node in the cluster.

The ping(1) command returns an "alive" message when the two endpoints are communicating without a problem. Otherwise, an error message is displayed.

For example,

# ping 204.152.65.2
204.152.65.2 is alive
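Rather than pinging each address by hand, the addresses in /etc/sma.ip can be looped over. This is a sketch only: the one-address-per-line layout of the sample file is an assumption (check your actual /etc/sma.ip format), and echo is used so the sketch runs anywhere; on a live node, drop the echo so ping actually executes.

```shell
# Sample file standing in for /etc/sma.ip; the format is assumed.
cat > /tmp/sma.ip <<'EOF'
204.152.65.2
204.152.65.18
EOF

# Ping each listed address. Remove 'echo' on a live node.
while read addr; do
  echo ping "$addr"
done < /tmp/sma.ip
```

Each "alive" response confirms that the local node can reach that remote SCI interface.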

6.5.4 How to Verify the SCI Interface Configuration

    Run the ifconfig -a command to verify that all SCI interfaces are up and that the cluster nodes have the correct IP addresses.

The last 8 bits of the IP address should match the IP field value in the /etc/sma.config file.

# ifconfig -a
lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
 	inet 127.0.0.1 netmask ff000000
hme0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
 	inet 129.146.238.55 netmask ffffff00 broadcast 129.146.238.255
 	ether 8:0:20:7b:fa:0
scid0: flags=80c1<UP,RUNNING,NOARP,PRIVATE> mtu 16321
 	inet 204.152.65.1 netmask fffffff0
scid1: flags=80c1<UP,RUNNING,NOARP,PRIVATE> mtu 16321
 	inet 204.152.65.17 netmask fffffff0
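The last-octet comparison can be pulled out of the ifconfig output mechanically. The following sketch parses saved ifconfig-style output (the sample mirrors the transcript above) and prints the last octet of each scid interface address, which is the value to compare against the IP field; on a live node you would feed it `ifconfig -a` instead of the sample file.

```shell
# Sample standing in for 'ifconfig -a' output on a live node.
cat > /tmp/ifconfig.out <<'EOF'
scid0: flags=80c1<UP,RUNNING,NOARP,PRIVATE> mtu 16321
        inet 204.152.65.1 netmask fffffff0
scid1: flags=80c1<UP,RUNNING,NOARP,PRIVATE> mtu 16321
        inet 204.152.65.17 netmask fffffff0
EOF

# For each scid interface, print its address and last octet (the low
# 8 bits of the IP address). On a live node: ifconfig -a | awk '...'
awk '/^scid/ {iface=$1}
     /inet/ && iface != "" {
       split($2, o, ".")
       print iface, $2, "last octet:", o[4]
       iface=""
     }' /tmp/ifconfig.out
```

A mismatch between a printed last octet and the corresponding IP field value indicates a misconfigured SCI interface.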