C H A P T E R  6

Reconfiguration Coordination Manager

This chapter describes how you can use Reconfiguration Coordination Manager scripts to automate certain dynamic reconfiguration processes when a Netra CP2000/CP2100 series board is used as a system controller.



Note - The Reconfiguration Coordination Manager scripts are supported on the Netra CP2140 and Netra CP2160 boards when they are used with the CP2000 Supplemental CD 4.0 for Solaris 8 only.



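This chapter contains the following sections:

  • Reconfiguration Coordination Manager (RCM) Overview
  • Using RCM with the Netra CP2100 Series CompactPCI Board
  • Using RCM to Work With the Intel 21554 Bridge Chip
  • RCM Script Example
  • Testing the RCM Script Example
  • Avoiding Error Messages When Extracting Devices in Basic Hot-Swap Mode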

Note that the RCM framework can only be used in systems with the Netra CP2100 series CompactPCI boards as system controllers with full hot-swap support.


Reconfiguration Coordination Manager (RCM) Overview

Beginning with the Solaris 8 2/02 release, the Solaris operating environment provides a Reconfiguration Coordination Manager (RCM) scripting interface that enables you to create scripts that can shut down applications and release system resources from hardware devices during dynamic reconfiguration (DR) operations. For example, you can create an RCM script to release network interfaces of a network interface card prior to unconfiguring the card from the system.

Prior to the RCM software, operators had to release system resources manually before the operating system could dynamically remove hardware devices (for example, network interface cards and hard drives) from the system. Manually releasing these devices often left system applications and devices in unknown states, which would force the operators to shut down the system before they could remove the devices.

You can now write and install RCM scripts that can better control this dynamic reconfiguration process. When responding to reconfiguration requests, the Solaris RCM daemon will launch the RCM scripts at the appropriate time to allow for the orderly removal of system resources. With the system resources released, a hardware device can be successfully unconfigured during dynamic reconfiguration operations.

For instructions on how to write and install RCM scripts, and for a description of all of the RCM commands, refer to the "Reconfiguration Coordination Manager (RCM) Scripts" section of the Solaris 8 System Administration Supplement (806-7502-xx), which is part of the Solaris 8 2/02 Update Collection. You can view this document online on the http://docs.sun.com website. Refer to the rcmscript(4) man page for additional information about creating and installing RCM scripts.


Using RCM with the Netra CP2100 Series CompactPCI Board

The RCM framework is fully integrated into the CP2000 Supplemental CD 4.0 for Solaris 8 software and can be used in systems with the Netra CP2100 series CompactPCI boards as system controllers (with full hot-swap support).

You can write RCM scripts to shut down applications running on peripheral CompactPCI cards installed into the system backplane. The RCM daemon will execute these scripts when an operator attempts to unconfigure a card using the cfgadm command or by opening the CompactPCI board's ejection levers. For more information about the cfgadm command, refer to the cfgadm(1M) man page and the Solaris documentation.

The Solaris RCM software includes RCM modules that interact with dynamic reconfigurable devices. These RCM modules are shared object files, shipped with the Solaris software, that use the RCM application programming interface (API) to interact with the RCM framework. Both RCM modules and user-created RCM scripts can act as RCM clients during dynamic reconfiguration processes.

RCM scripts implement RCM commands that the Solaris RCM daemon invokes during dynamic reconfiguration operations. These RCM commands include register, which specifies the resources the script will manage, and preremove, which releases those resources before a device is removed. For a full list of RCM commands, refer to the Solaris RCM documentation.
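As a rough illustration of this command structure, the following sketch shows a command dispatcher written as a Bourne shell function. It is not taken from the Solaris documentation: the function name rcm_dispatch and the description strings are assumptions made for this example, and the resource path is borrowed from CODE EXAMPLE 6-1. The rcm_ output names and the exit status 2 for unsupported commands follow rcmscript(4).

```shell
#!/bin/sh
# Sketch of an RCM script dispatcher. rcm_dispatch and the description
# strings are hypothetical; the device path comes from CODE EXAMPLE 6-1.
RESOURCE='/devices/pci@1f,0/pci@1/pci@1/pci@12/SUNW,qfe@0,1'

rcm_dispatch() {
    case "$1" in
    scriptinfo)
        # Report the script version and a one-line description.
        echo "rcm_script_version=1"
        echo "rcm_script_func_info=skeleton DR coordinator"
        return 0 ;;
    register)
        # Name every resource this script wants to manage.
        echo "rcm_resource_name=$RESOURCE"
        return 0 ;;
    resourceinfo)
        # Describe how the resource is being used.
        echo "rcm_resource_usage_info=device managed by skeleton script"
        return 0 ;;
    preremove)
        # Release the resource here (for example, stop the application).
        return 0 ;;
    *)
        # Any other command is reported as unsupported.
        return 2 ;;
    esac
}
```

An installed script would end with a line such as rcm_dispatch "$@"; exit $? so that the exit status is returned to the RCM daemon.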

When an operator starts to unconfigure a card, the Solaris DR framework will call the RCM script's preremove function to do the necessary quiescing prior to proceeding with the actual unconfiguration. You can also write an RCM script that denies extraction requests. When such a script refuses a request, the Solaris DR framework will not attempt to unconfigure the device. In this case, the device will remain in a configured state and the card's blue extraction LED will stay unlit.

In the event that a device does not have associated RCM clients registered to receive device extraction notifications, the Solaris DR framework will automatically proceed with the unconfigure operation. The unconfigure operation will fail if the device is still in use. If the device is not in use, the unconfigure operation will succeed and the blue extraction LED will turn on.

If you are using a Netra CP2100 series system controller in a system that is set to basic hot-swap mode, use the cfgadm command with the -f option to prevent any possible interference from other RCM clients that have been registered for removal notifications of the same device. See Avoiding Error Messages When Extracting Devices in Basic Hot-Swap Mode for more information.



Note - The RCM functionality that enables you to write scripts to shut down applications operating on dumb I/O cards (like network interface cards) is not available on satellite CPU boards. You will need to shut down applications, including the Solaris operating environment, running on satellite CPU boards manually before dynamically removing these boards.



Using RCM to Work With the Intel 21554 Bridge Chip

CompactPCI cards containing the Intel 21554 PCI-PCI bridge chip falsely turn on the blue extraction LED when they are dynamically removed from a chassis using the Netra CP2100 series board as a system controller. This blue extraction LED, located on a CompactPCI card's front panel, incorrectly turns on when the ejection levers are opened for extraction while the card's devices (for example, the card's network interfaces) are still in use. The Intel 21554 bridge chip logic that clears the extraction bit during a dynamic removal operation will also turn on the card's blue extraction LED in error.

The Solaris DR framework must clear the extraction bit at the end of a dynamic removal operation, whether or not it is successful, to indicate that the ENUM interrupt has been handled. The Solaris DR framework will issue the command to turn on the blue extraction LED only when the card's unconfiguration operation is successful. Because the Intel 21554 bridge chip turns on the blue LED even when the unconfiguration fails, an operator may falsely assume that the card can be safely removed even though the card is actually still in use by the system.

With the addition of RCM script support, however, you can create applications that use the RCM framework to either approve or deny dynamic removal operations before the Solaris dynamic reconfiguration software proceeds with the card's unconfiguration. Using an RCM script, your application can deny the removal of the card if it currently cannot be freed, possibly because it is engaged in a critical task at the time. Since the RCM script denies the operation prior to starting the unconfiguration process, the blue extraction LED will not turn on.

If, on the other hand, your application approves the removal request and uses the RCM script to quiesce the card, the card's unconfiguration will succeed and the Solaris DR framework will turn on the blue extraction LED correctly.



Note - If your application approves the dynamic removal operation, the RCM script should also shut down all applications using the card's devices. Otherwise, the blue extraction LED may come on in error if the unconfiguration fails because the device is still in use.



By providing a mechanism for either denying a dynamic reconfiguration request or quiescing the devices, the RCM framework provides a workaround to the incorrect lighting of the blue extraction LED.
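The approve-or-deny decision can be sketched as a queryremove handler. In this sketch the application is assumed to create a marker file while it is engaged in a critical task. The path /var/run/myapp.busy, the BUSY_FLAG variable, and the function name do_queryremove are all hypothetical. The rcm_failure_reason output and the exit status 3 (refuse) follow the convention used in CODE EXAMPLE 6-1.

```shell
#!/bin/sh
# Hypothetical marker file that the application creates while it is busy.
BUSY_FLAG=${BUSY_FLAG:-/var/run/myapp.busy}

# Answer a queryremove request: refuse while the marker file exists,
# approve otherwise.
do_queryremove() {
    if [ -f "$BUSY_FLAG" ]; then
        echo "rcm_failure_reason=device in use by critical task"
        return 3    # refuse the removal request
    fi
    return 0        # approve the removal request
}
```

Because the refusal happens before the unconfiguration starts, a script built along these lines would keep the Intel 21554 logic from lighting the LED in the denied case.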


RCM Script Example

CODE EXAMPLE 6-1 shows an example RCM Perl script designed to shut down applications running on a network interface of a Sun Dual FastEthernet/SCSI 6U CompactPCI Adapter with PMC. This CompactPCI card contains two network (qfe) interfaces and two SCSI interfaces.

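The example script uses the following RCM commands:

  • scriptinfo - Reports the script version and a description of the script's function
  • register - Registers the resources that the script manages
  • resourceinfo - Describes how a registered resource is being used
  • queryremove - Asks whether the resource can be released
  • preremove - Releases the resource before the device is removed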

For more information about these RCM commands, refer to the rcmscript(4) man page and the Solaris RCM documentation.

See Testing the RCM Script Example for a test run of this script in a system controlled by a typical Netra CP2100 series CompactPCI board.

 

CODE EXAMPLE 6-1 RCM Script Example (SUNW,cp2000_io.pl)

#! /usr/bin/perl -w
#
# Copyright 2002 Sun Microsystems, Inc. All rights reserved.
# Use is subject to license terms.
#
#ident  "@(#)SUNW,cp2000_io.pl 1.1     02/03/28 SMI"
#
# A Sample site customization RCM script.
#
# When RCM_ENV_FORCE is FALSE, the script indicates to RCM that it can not
# release the network interface when the interface is plumbed and up
# When RCM_ENV_FORCE is TRUE, this script allows DR to remove the QFE device
# by bringing down the network interface that has been plumbed and up.
#
# For more information on RCM scripts see the man page rcmscript(4).
#
 
use strict;
 
my ($cmd, $rsrc, %dispatch);
 
# dispatch table for RCM commands
%dispatch = (
        "scriptinfo"    =>      \&do_scriptinfo,
        "register"      =>      \&do_register,
        "resourceinfo"  =>      \&do_resourceinfo,
        "queryremove"   =>      \&do_preremove,
        "preremove"     =>      \&do_preremove
);
 
sub do_scriptinfo
{
        print "rcm_script_version=1\n";
        print "rcm_script_func_info=ifconfig coordinator for DR\n";
 
        #
        # optionally specify command timeout value in seconds to override
        # the default timeout value.
        # Eg:
        #   print "rcm_cmd_timeout=10\n";
        #
 
        exit (0);
}
 
sub do_register
{
        #
        # register all resource names of interest using
        # print "rcm_resource_name=resourcename\n";
        # Eg: to register /dev/rmt/0 and /dev/dsk/c1t1d0s0
        #   print "rcm_resource_name=/dev/rmt/0\n";
        #   print "rcm_resource_name=/dev/dsk/c1t1d0s0\n";
        #
 
        # register all resource names of interest
 
        my ($devname);
        $devname='/devices/pci@1f,0/pci@1/pci@1/pci@12/SUNW,qfe@0,1';
        print "rcm_resource_name=$devname\n";
        exit(0);
}
 
sub do_resourceinfo
{
        #
        # specify the resource usage information of the given resource $rsrc
        #
 
        print "rcm_resource_usage_info=ifconfig managed QFE device\n";
        exit (0);
}
 
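sub do_preremove
{
        # Shared handler for the queryremove and preremove commands.
        if ($ENV{RCM_ENV_FORCE} eq 'TRUE') {
                # Forced removal: only preremove actually releases the device.
                if ($cmd eq 'preremove') {
                        `/usr/sbin/ifconfig qfe0 down`;
                        if ($? == 0) {
                                `/usr/sbin/ifconfig qfe0 unplumb`;
                        }
                }
                exit(0);
        } else {
                # Not forced: refuse the request and report the reason.
                print "rcm_failure_reason=device in use by ifconfig\n";
                exit (3);
        }
}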
 
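# NOTE: the next line forces RCM_ENV_FORCE to TRUE, so this script always
# permits removal. Delete it to honor the value passed in the environment,
# as described in the header comments.
$ENV{'RCM_ENV_FORCE'} = "TRUE";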
$cmd = $ARGV[0];
if (defined($ARGV[1])) {
        # resource name
        $rsrc = $ARGV[1];
}
 
if (defined($dispatch{$cmd})) {
        &{$dispatch{$cmd}};
} else {
        # unsupported command
        exit (2);
}
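
Before installing a script such as CODE EXAMPLE 6-1, it can be useful to run its informational commands by hand and look at what it prints and which exit status it returns. The helper below is a small sketch for doing that. The name rcm_try is hypothetical and is not a Solaris utility.

```shell
#!/bin/sh
# Run one RCM command against a script, show its output, and then
# print the exit status it returned. rcm_try is a hypothetical helper.
rcm_try() {
    script=$1
    shift
    status=0
    "$script" "$@" || status=$?
    echo "exit=$status"
}
```

For example, rcm_try ./SUNW,cp2000_io.pl scriptinfo and rcm_try ./SUNW,cp2000_io.pl register exercise the script without touching the device. Avoid trying preremove this way on a live system, because the example script would bring down and unplumb qfe0.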
 


Testing the RCM Script Example

This section demonstrates how the SUNW,cp2000_io.pl RCM script (CODE EXAMPLE 6-1) unconfigures a network interface of a Sun Dual FastEthernet/SCSI 6U CompactPCI adapter.

In this example, the test system is operating in full hot-swap mode and has the following configuration:

  • 8-slot High-Availability (HA) CompactPCI chassis
  • Netra CP2100 series board used as the system board controller (SBC)
  • Sun Dual FastEthernet/SCSI 6U CompactPCI adapter in slot 4
  • Netra CP2100 series board used as a satellite CPU board in slot 5

The cfgadm command output below shows that the Sun Dual FastEthernet/SCSI 6U CompactPCI adapter is inserted and configured in slot 4 (pci1:cpci0_slot4) of the chassis.


# cfgadm 
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c1                             scsi-bus     connected    unconfigured unknown
pci1:cpci0_slot2               unknown      connected    unconfigured unknown
pci1:cpci0_slot3               unknown      connected    unconfigured unknown
pci1:cpci0_slot4               stpcipci/fhs connected    configured   ok
pci1:cpci0_slot5               mcd/fhs      connected    configured   ok
pci1:cpci0_slot6               unknown      connected    unconfigured unknown
pci1:cpci0_slot7               unknown      connected    unconfigured unknown
pci1:cpci0_slot8               unknown      connected    unconfigured unknown

Using the ifconfig command, the first FastEthernet device (qfe0) is plumbed (the streams needed for TCP/IP are set up) and brought up. The ifconfig -a command output then shows that the adapter's qfe0 device has been connected to an Ethernet network, where test_ip is the hostname that corresponds to the 192.168.210.225 IP address. Refer to the ifconfig(1M) man page for more information about using ifconfig to configure network devices.


# ifconfig qfe0 plumb
# ifconfig qfe0 test_ip up
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
eri0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.210.213 netmask ffffff00 broadcast 192.168.210.255
        ether 0:3:ba:3:f4:58 
qfe0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 192.168.210.225 netmask ffffff00 broadcast 192.168.210.255
        ether 0:3:ba:3:f4:58 
 

At this point, the operator opens the ejection levers of the Sun Dual FastEthernet/SCSI 6U CompactPCI adapter to begin the extraction process. When the levers are opened, the Solaris DR framework (in conjunction with the Solaris RCM framework) calls the SUNW,cp2000_io.pl RCM script, which brings down and then unplumbs the FastEthernet (qfe0) interface. The Solaris DR framework then unconfigures the adapter and turns on the adapter's blue extraction LED, signalling that it can be safely removed.

The cfgadm and ifconfig output below shows that the card has been successfully unconfigured from the system: slot 4 is now reported as unconfigured, and qfe0 no longer appears in the interface list.


# cfgadm
Ap_Id                          Type         Receptacle   Occupant     Condition
c0                             scsi-bus     connected    configured   unknown
c1                             scsi-bus     connected    unconfigured unknown
pci1:cpci0_slot2               unknown      connected    unconfigured unknown
pci1:cpci0_slot3               unknown      connected    unconfigured unknown
pci1:cpci0_slot4               unknown      connected    unconfigured unknown
pci1:cpci0_slot5               mcd/fhs      connected    configured   ok
pci1:cpci0_slot6               unknown      connected    unconfigured unknown
pci1:cpci0_slot7               unknown      connected    unconfigured unknown
pci1:cpci0_slot8               unknown      connected    unconfigured unknown
 
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000 
eri0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.210.213 netmask ffffff00 broadcast 192.168.210.255
        ether 0:3:ba:3:f4:58 
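
The check that the operator performs by eye in the listings above can also be scripted. The sketch below picks the Occupant column for one attachment point out of cfgadm output supplied on standard input. The name occupant_state is a hypothetical helper, and the column position assumes the default cfgadm listing format shown in this section.

```shell
#!/bin/sh
# Print the Occupant column for the attachment point named in $1,
# reading a cfgadm listing from standard input.
occupant_state() {
    while read ap_id type receptacle occupant condition; do
        if [ "$ap_id" = "$1" ]; then
            echo "$occupant"
        fi
    done
}
```

With this helper, cfgadm | occupant_state pci1:cpci0_slot4 would print the occupant state of slot 4, so a wrapper script could wait for that value to change before telling the operator to pull the card.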


Avoiding Error Messages When Extracting Devices in Basic Hot-Swap Mode

If you attempt to extract a hardware device using the cfgadm command when the system is operating in basic hot-swap mode, and you have installed an RCM script for the device, you may see error messages displayed in the system console. To avoid seeing these error messages, use the -f option with the cfgadm command when unconfiguring hardware devices. Using the -f option will help avoid any possible interference with other registered Solaris RCM modules or user-created RCM scripts.

For example, if you attempt to extract the Sun Dual FastEthernet/SCSI 6U CompactPCI adapter from a system operating in basic hot-swap mode, and you have installed the RCM script shown in CODE EXAMPLE 6-1, you may see error messages produced by other Solaris RCM modules on the system console. For instance, if the adapter has a plumbed IP address, the Solaris SUNW,ip_rcm.so RCM module will fail the dynamic removal operation and display error messages unless you use the cfgadm -f command.
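
For reference, a forced unconfigure of this kind takes the general form shown in the sketch below. The wrapper is illustrative only: force_unconfigure and the CFGADM override variable are hypothetical names introduced here so that the command line can be previewed without touching hardware, and pci1:cpci0_slot4 is the attachment point of the adapter in this chapter's test system.

```shell
#!/bin/sh
# CFGADM is a hypothetical override; it defaults to the real cfgadm command.
CFGADM=${CFGADM:-cfgadm}

# Unconfigure the attachment point named in $1 with the -f (force) option.
force_unconfigure() {
    "$CFGADM" -f -c unconfigure "$1"
}
```

Calling force_unconfigure pci1:cpci0_slot4 runs cfgadm -f -c unconfigure pci1:cpci0_slot4. Setting CFGADM=echo beforehand prints the arguments instead of running the command, which can be handy when checking a procedure on a system that is in service.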

The purpose of the SUNW,ip_rcm.so module is to protect anonymous consumers from inadvertent denial of service. Therefore, if an IP address is plumbed on a network interface card, the SUNW,ip_rcm.so module will fail the removal operation unless you unconfigure the card with the -f option.

However, if your Netra CP2100 series board controlled system is operating in full hot-swap mode, the Solaris hot-swap framework will automatically apply the -f option. Consequently, network cards with plumbed IP addresses will be unconfigured successfully even if there is no user-level RCM script installed. The -f option will force the SUNW,ip_rcm.so module to release the board and properly shut down all applications using the network card.

For more information about the cfgadm command, refer to the cfgadm(1M) man page and the Solaris system administration documentation. You can view this documentation on the http://docs.sun.com/ website.