Sun Cluster 2.2 System Administration Guide

Part I Sun Cluster Administration

Chapter 1 Preparing for Sun Cluster Administration

This chapter describes procedures used to prepare for administration of the Sun Cluster configuration. Some of the procedures documented in this chapter are dependent on your volume management software (Solstice DiskSuite, Sun StorEdge Volume Manager, or Cluster Volume Manager). Those that are dependent on the volume manager include the volume manager name in the procedure title.

1.1 Saving Disk Partition Information (Solstice DiskSuite)

Maintain disk partitioning information on all nodes and multihost disks in the Sun Cluster configuration. Keep this information up-to-date as new disks are added to the disksets and when any of the disks are repartitioned. You need this information to perform disk replacement.

The disk partitioning information for the local disks is not as critical because the local disks on all Sun Cluster nodes should have been partitioned identically. Most likely, you can obtain the local disk partition information from another Sun Cluster node if a local disk fails.

When a multihost disk is replaced, the replacement disk must have the same partitioning as the disk it is replacing. Depending on how a disk has failed, this information might not be available when replacement is performed. Therefore, it is especially important to retain a record of the disk partitioning information if you have several different partitioning schemes in your disksets.


Note -

Though SSVM and CVM do not impose this restriction, it is still a good idea to save this information.


A simple way to save disk partitioning information is shown in the following sample script. This type of script should be run after the Sun Cluster software has been configured. In this example, the files containing the volume table of contents (VTOC) information are written to the local /etc/opt/SUNWcluster/vtoc directory by the prtvtoc(1M) command.


Example 1-1 Sample Script for Saving VTOC Information

#! /bin/sh
 DIR=/etc/opt/SUNWcluster/vtoc
 mkdir -p $DIR
 cd /dev/rdsk
 for i in *s7
 do prtvtoc $i >$DIR/$i || rm $DIR/$i
 done

Each of the disks in a Solstice DiskSuite diskset is required to have a Slice 7. This slice contains the metadevice state database replicas.

If a local disk also has a valid Slice 7, the VTOC information also will be saved by the sample script in Example 1-1. However, this should not occur for the boot disk, because typically a boot disk does not have a valid Slice 7.


Note -

Make certain that the script is run while none of the disks is owned by another Sun Cluster node. The script will work if the logical hosts are in maintenance mode, if the logical hosts are owned by the local host, or if Sun Cluster is not running.


1.2 Saving and Restoring VTOC Information (Solstice DiskSuite)

When you save the VTOC information for all multihost disks, this information can be used when a disk is replaced. The sample script shown in the following example uses the VTOC information saved by the script shown in Example 1-1 to give the replacement disk the same partitioning as the failed disk. Use the actual names of the disk or disks to be added in place of c1t0d0s7 and c1t0d1s7 in the example. Specify multiple disks as a space-delimited list.


Example 1-2 Sample Script for Restoring VTOC Information

#! /bin/sh
 DIR=/etc/opt/SUNWcluster/vtoc
 cd /dev/rdsk
 for i in c1t0d0s7 c1t0d1s7
 do fmthard -s $DIR/$i $i
 done


Note -

The replacement drive must be of the same size and geometry (generally the same model from the same manufacturer) as the failed drive. Otherwise the original VTOC might not be appropriate for the replacement drive.


If you did not record this VTOC information, but you have mirrored slices on a disk-by-disk basis (that is, the VTOCs of both sides of the mirror are the same), it is possible to copy the VTOC information from the other submirror disk to the replacement disk. For this procedure to succeed, the logical host containing the disks must be in maintenance mode or be owned by the host on which you run the script, or Sun Cluster must be stopped. This procedure is shown in the following example.


Example 1-3 Sample Script to Copy VTOC Information From a Mirror

#! /bin/sh
 cd /dev/rdsk
 OTHER_MIRROR_DISK=c2t0d0s7
 REPLACEMENT_DISK=c1t0d0s7
 prtvtoc $OTHER_MIRROR_DISK | fmthard -s - $REPLACEMENT_DISK

If you did not save the VTOC information and did not mirror on a disk-by-disk basis, you can examine the component sizes reported by the metaset(1M) command and reverse engineer the VTOC information. Because the computations used in this procedure are complex, the procedure should be performed only by a trained service representative.

1.3 Saving Device Configuration Information

Record the /etc/path_to_inst and the /etc/name_to_major information on removable media (floppy disk or backup tape).

The path_to_inst(4) file contains the minor unit numbers for disks in each multihost disk expansion unit. This information will be necessary if the boot disk on any Sun Cluster node fails and has to be replaced.
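
A minimal sketch of one way to capture these files is shown below; the removable-media mount point /floppy/floppy0 is only an example, so substitute the path appropriate for your own floppy disk or backup tape procedure.

#! /bin/sh
 # Copy the device configuration files to mounted removable media.
 # /floppy/floppy0 is an example mount point; adjust it for your media.
 MEDIA=/floppy/floppy0
 cp /etc/path_to_inst $MEDIA/path_to_inst.`uname -n`
 cp /etc/name_to_major $MEDIA/name_to_major.`uname -n`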

1.4 Instance Names and Numbering

Instance names are occasionally reported in driver error messages. An instance name refers to a system device, such as ssd20 or hme5.

You can determine the binding of an instance name to a physical name by looking at /var/adm/messages or dmesg(1M) output:

ssd20 at SUNW,pln0:
 ssd20 is /io-unit@f,e0200000/sbi@0,0/SUNW,soc@3,0/SUNW,pln@a0000800,20183777 \
 /ssd@4,0

 le5 at lebuffer5: SBus3 slot 0 0x60000 SBus level 4 sparc ipl 7
 le5 is /io-unit@f,e3200000/sbi@0,0/lebuffer@0,40000/le@0,60000

Once an instance name has been assigned to a device, it remains bound to that device.

Instance numbers are encoded in a device's minor number. To keep instance numbers persistent across reboots, the system records them in the /etc/path_to_inst file. This file is read only at boot time and is currently updated by the add_drv(1M) and drvconfig(1M) commands. For additional information refer to the path_to_inst(4) man page.
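
For illustration, each entry in /etc/path_to_inst maps a physical device path to an instance number and driver name; the path shown here is only an example and depends on your hardware:

"/sbus@1f,0/SUNW,fas@e,8800000/sd@0,0" 0 "sd"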

When you install the Solaris operating environment on a node, instance numbers can change if hardware was added or removed since the last Solaris installation. For this reason, use caution whenever you add or remove devices such as SBus or FC/OM cards on Sun Cluster nodes. It is important to maintain the same configuration of existing devices, so that the system is not confused in the event of a reinstall or reconfiguration reboot.

Instance number problems can arise in a configuration. For example, consider a Sun Cluster configuration that consists of three SPARCstorage Arrays with Fibre Channel/SBus (FC/S) cards installed in SBus slots 1, 2, and 4 on each of the nodes. The controller numbers are c1, c2, and c3. If the system administrator adds another SPARCstorage Array to the configuration using an FC/S card in SBus slot 3, the corresponding controller number will be c4. If Solaris is reinstalled on one of the nodes, the controller numbers c3 and c4 will refer to different SPARCstorage Arrays. The other Sun Cluster node will still refer to the SPARCstorage Arrays with the original instance numbers. Solstice DiskSuite will not communicate with the disks connected to the c3 and c4 controllers.

Other problems can arise with instance numbering associated with the Ethernet connections. For example, each of the Sun Cluster nodes has three Ethernet SBus cards installed in slots 1, 2, and 3, and the instance names are hme1, hme2, and hme3. If the middle card (hme2) is removed and Solaris is reinstalled, the third SBus card will be renamed from hme3 to hme2.

1.4.1 Performing Reconfiguration Reboots

During some of the administrative procedures documented in this book, you are instructed to perform a reconfiguration reboot by using the OpenBoot PROM boot -r command, or by creating the /reconfigure file on the node and then rebooting.
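
For reference, a reconfiguration reboot can be performed in either of the following ways; the shutdown options shown assume that an immediate reboot is acceptable. From the OpenBoot PROM prompt:

ok boot -r

Or, from a running node:

# touch /reconfigure
# shutdown -g0 -y -i6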


Note -

It is not necessary to perform a reconfiguration reboot to add disks to an existing multihost disk expansion unit.


Avoid performing Solaris reconfiguration reboots when any hardware (especially a multihost disk expansion unit or disk) is powered off or otherwise defective. In such situations, the reconfiguration reboot removes the inodes in /devices and symbolic links in /dev/dsk and /dev/rdsk associated with the disk devices. These disks become inaccessible to Solaris until a later reconfiguration reboot. A subsequent reconfiguration reboot might not restore the original controller minor unit numbering, and therefore might cause the volume manager software to reject the disks. When the original numbering is restored, the volume manager software can access the associated objects.

If all hardware is operational, you can perform a reconfiguration reboot safely to add a disk controller to a node. You must add such controllers symmetrically to both nodes (though a temporary imbalance is allowed while the nodes are upgraded). Similarly, if all hardware is operational, it is safe to perform a reconfiguration reboot to remove hardware.


Note -

For the Sun StorEdge A3000, in the case of a single controller failure, you should replace the failed controller as soon as possible. Other administration tasks that would normally require a boot -r (such as adding a new SCSI device) should be deferred until the failed controller has been replaced and brought back online, and all logical unit numbers (LUNs) have been balanced back to the state they were in before the failover occurred. Refer to the Sun StorEdge A3000 documentation for more information.


1.5 Logging Into the Servers as root

If you want to log in to Sun Cluster nodes as root through a terminal other than the console, you must edit the /etc/default/login file and comment out the following line:

CONSOLE=/dev/console

This enables root logins using rlogin(1), telnet(1), and other programs.
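
After the edit, the line should appear commented out, similar to the following:

#CONSOLE=/dev/console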

Chapter 2 Sun Cluster Administration Tools

This chapter describes the Sun Cluster administration tools and how to use them.

Administering the Sun Cluster software is facilitated by three Graphical User Interfaces (GUIs):

Cluster Control Panel - Launches the Cluster Console and other system administration tools.

Cluster Console - Executes commands on multiple nodes in the cluster simultaneously to simplify cluster administration.

Sun Cluster Manager - Monitors the current status of all nodes in the cluster via a HotJava browser.

Refer to the online help for complete documentation on these GUIs. You can also use utilities to monitor Sun Cluster software.

2.1 Monitoring Utilities

You can use the Sun Cluster hastat(1M) utility, in addition to the /var/adm/messages files, to monitor a Sun Cluster configuration. You can also use the Sun Cluster Manager graphical user interface, which shows the status of major cluster components and subcomponents. For more information about the Sun Cluster Manager, refer to "2.5 Monitoring Sun Cluster Servers With Sun Cluster Manager". Sun Cluster also provides an SNMP agent that can be used to monitor a maximum of 32 clusters at the same time. See Appendix D, Using Sun Cluster SNMP Management Solutions.

If you are running Solstice DiskSuite, you can also use the metastat(1M), metadb(1M), metatool(1M), medstat(1M), and mdlogd(1M) utilities to monitor the status of your disksets. The SNMP-based Solstice DiskSuite log daemon, mdlogd(1M), generates a generic SNMP trap when Solstice DiskSuite logs a message to the syslog file. You can configure mdlogd(1M) to send a trap only when certain messages are logged by specifying a regular expression in the mdlogd.cf(4) configuration file. The trap is sent to the administrative host specified in the configuration file. The administrative host must be running a network management application such as Solstice SunNet Manager. You can use mdlogd(1M) if you don't want to run metastat(1M) periodically or scan the syslog output looking for Solstice DiskSuite errors or warnings. See the mdlogd(1M) man page for more information.

If you are running SSVM or CVM, you can use the vxprint, vxstat, vxtrace, vxnotify, and vxva utilities. Refer to your volume management software documentation for information on these utilities.


Note -

For information about troubleshooting and repairing defective components, refer to the appropriate hardware documentation.


2.1.1 Monitoring the Configuration With hastat(1M)

The hastat(1M) program displays the current state of the configuration. The program displays status information about the hosts, logical hosts, private networks, public networks, data services, local disks, and disksets, along with the most recent error messages. The hastat(1M) program extracts Sun Cluster-related error messages from the /var/adm/messages file and outputs the last few messages from each host if -m is specified. Because the recent error messages list is a filtered extract of the log messages, the context of some messages might be lost. Check the /var/adm/messages file for a complete list of the messages. The following pages show an example of output from hastat(1M):

# hastat -m 10

 HIGH AVAILABILITY CONFIGURATION AND STATUS
 -------------------------------------------

 LIST OF NODES CONFIGURED IN <ha-host1> CLUSTER
       phys-host1 phys-host2

 CURRENT MEMBERS OF THE CLUSTER

      phys-host1 is a cluster member

      phys-host2 is a cluster member

 CONFIGURATION STATE OF THE CLUSTER

      Configuration State on phys-host1: Stable

      Configuration State on phys-host2: Stable

 UPTIME OF NODES IN THE CLUSTER

      uptime of phys-host1:         12:47pm  up 12 day(s), 21:11,  1 user,
 load average: 0.21, 0.15, 0.14

      uptime of phys-host2:         12:46pm  up 12 day(s),  3:15,  3 users,
 load average: 0.40, 0.20, 0.16

 LOGICAL HOSTS MASTERED BY THE CLUSTER MEMBERS

 Logical Hosts Mastered on phys-host1:
         ha-host1
 Loghost Hosts for which phys-host1 is Backup Node:
         ha-host2

 Logical Hosts Mastered on phys-host2:
         ha-host2
 Loghost Hosts for which phys-host2 is Backup Node:
         ha-host1

 LOGICAL HOSTS IN MAINTENANCE STATE

      None

 STATUS OF PRIVATE NETS IN THE CLUSTER

      Status of Interconnects on phys-host1:
         interconnect0: selected
         interconnect1: up
      Status of private nets on phys-host1:
         To phys-host1 - UP
         To phys-host2 - UP

      Status of Interconnects on phys-host2:
         interconnect0: selected
         interconnect1: up
      Status of private nets on phys-host2:
         To phys-host1 - UP
         To phys-host2 - UP

 STATUS OF PUBLIC NETS IN THE CLUSTER

 Status of Public Network On phys-host1:

 bkggrp  r_adp   status  fo_time live_adp
 nafo0   le0     OK      NEVER   le0

 Status of Public Network On phys-host2:

 bkggrp  r_adp   status  fo_time live_adp
 nafo0   le0     OK      NEVER   le0

 STATUS OF SERVICES RUNNING ON LOGICAL HOSTS IN THE CLUSTER

        Status Of Registered Data Services
        q:                       Off
        p:                       Off
        nfs:                     On
        oracle:                  On
        dns:                     On
        nshttp:                  Off
        nsldap:                  On

       Status Of Data Services Running On phys-host1
       Data Service HA-NFS:
       On Logical Host ha-host1:      Ok
     
       Status Of Data Services Running On phys-host2
       Data Service HA-NFS:
       On Logical Host ha-host2:      Ok
       
        Data Service "oracle":
        Database Status on phys-host2:
        SC22FILE - running;

        No Status Method for Data Service "dns"

        RECENT  ERROR MESSAGES FROM THE CLUSTER

        Recent Error Messages on phys-host1
        ...
        Recent Error Messages on phys-host2
        ...

2.1.2 Checking Message Files

The Sun Cluster software writes messages to the /var/adm/messages file, in addition to reporting messages to the console. The following is an example of the messages reported when a disk error occurs.

...
 Jun 1 16:15:26 host1 unix: WARNING: /io-unit@f,e1200000/sbi@0.0/SUNW,pln@a0000000,741022/ssd@3,4(ssd49):  
 Jun 1 16:15:26 host1 unix: Error for command `write(I))' Err
 Jun 1 16:15:27 host1 unix: or Level: Fatal
 Jun 1 16:15:27 host1 unix: Requested Block 144004, Error Block: 715559
 Jun 1 16:15:27 host1 unix: Sense Key: Media Error
 Jun 1 16:15:27 host1 unix: Vendor `CONNER':
 Jun 1 16:15:27 host1 unix: ASC=0x10(ID CRC or ECC error),ASCQ=0x0,FRU=0x15
 ...
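
To scan the file for recent warnings without reading it in full, you can use standard tools such as grep(1); this is only a sketch, and the pattern you search for depends on the messages of interest.

# grep -i warning /var/adm/messages | tail -20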

Note -

Because Solaris and Sun Cluster error messages are written to the /var/adm/messages file, the /var directory might become full. Refer to "4.7 Maintaining the /var File System" for the procedure to correct this problem.


2.1.3 Highly Available Data Service Utilities

In addition, Sun Cluster provides utilities for configuring and administering the highly available data services. These utilities are described in Appendix B, Sun Cluster Man Page Quick Reference.

2.2 Online Help System

Each Sun Cluster administration tool includes detailed online help. To access the online help, launch one of the administration tools from the administrative workstation and select Help from the menu bar.

Alternatively, double-click on the Help icon in the Cluster Control Panel.

Help topics cover the administration tools in detail, as well as some administration tasks. Also see Chapter 4, General Sun Cluster Administration, for additional detailed instructions on performing specific tasks.

Figure 2-1 shows a sample Help window for the Cluster Control Panel. The text covers a specific topic. When you first launch the Help window from a tool, it displays the top-level home topic. Afterwards, the Help window displays the last topic you viewed. Hypertext links to other topics are displayed as underlined, colored text.

Clicking once on a hypertext link displays the text for that topic. The online help system also includes an automatic history list feature that remembers the information you previously accessed. Display this list by choosing Topic History from the View menu.

The Help window has a scrollable text area, a menu bar, and several buttons. Each of these items is described in the following sections.

Figure 2-1 Sample Help Window Home Page for the Cluster Control Panel


With Help, you can access each pull-down menu by clicking on it in the menu bar, or by using keyboard mnemonics and accelerators.

You can customize the mnemonics and accelerators; refer to the online help for more information.

The tables in this section list each menu item, define the menu functions, and list the accelerators (keyboard combinations).

2.2.1 Help Window Menu Bar Items

The Help window menu bar contains the File, View, and Help menus. You display each menu by selecting its name in the menu bar.

2.2.1.1 File Menu

The File menu contains these items:

Table 2-1 File Menu Items

Item          Function                                                     Accelerator

Print Topic   Prints the topic currently displayed in the Help window      Alt + R
              scrolled text area.

Dismiss       Dismisses the Help window.                                   Alt + D

2.2.1.2 View Menu

The View menu contains these items:

Table 2-2 View Menu Items

Item              Function                                                  Accelerator

Previous Topic    Displays the previous help topic (if any).                Alt + P

Next Topic        Displays the next help topic (if any).                    Alt + N

Home Topic        Displays the home (top level) topic.                      Alt + O

Topic History...  Displays the Help Topic History dialog box, which         Alt + I
                  allows you to navigate easily through the help topics
                  you have already viewed. The uppermost topic in the
                  scrolled list is the first topic you displayed; the
                  bottommost topic is the last topic you viewed in the
                  current path. The highlighted topic is the current topic.

To display the dialog box, choose Topic History... from the View menu (Figure 2-2).

Figure 2-2 Help Topic History of the Help Window


2.2.1.3 Help Menu

The Help menu contains these items:

Table 2-3 Help Menu Items

Item             Function                                                   Accelerator

Help on Help...  Describes the Help window and explains how to use it.      Alt + E

About...         Displays the About Box, which contains information on      Alt + B
                 the application, such as the version number.

2.2.2 Help Window Buttons

The following table lists the Help window buttons and describes their functions.

Table 2-4 Help Window Buttons

Button       Function

Home         Displays the home page for the application.

Dismiss      Dismisses the Help window.

Print Topic  Prints the current topic on your default printer.

Previous     Steps back through the displayed Help topics to the previous topic.
             Clicking on the left arrow repeatedly steps the display back through
             the Help windows until the first topic you viewed is redisplayed.
             Topics are "remembered" by the automatic Help history list.

Next         Steps forward through the displayed Help topics, one at a time, to
             the last topic in the list.

2.3 About the Cluster Control Panel

The Cluster Control Panel (CCP) is a GUI that enables you to launch the Cluster Console and other system administration tools. The CCP contains icons that represent these tools.

2.3.1 How to Launch the CCP

After you have installed the Sun Cluster client software on the administrative workstation, use this procedure to run an application from the CCP.

  1. As superuser, add the Sun Cluster tools directory /opt/SUNWcluster/bin to the path on the administrative workstation.
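
    For example, in the Bourne shell, assuming the default installation location:

    # PATH=$PATH:/opt/SUNWcluster/bin; export PATH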


    Note -

    For E10000 Platforms, you must first log into the System Service Processor (SSP) and connect by using the netcon command. Once connected, enter Shift~@ to unlock the console and gain write access. Then proceed to Step 2.


  2. In a shell window on your workstation, bring up the CCP.

    Specify the name of the cluster to be monitored:

    # ccp clustername
    

    Note -

    If the Sun Cluster tools are not installed in the default location of /opt/SUNWcluster, the environment variable $CLUSTER_HOME must be set to the alternate location.


2.3.2 CCP Items

The CCP (shown in the following figure) has a menu bar and an icon pane that displays all of the tools currently in the control panel. From the menu bar you can add, delete, or modify the tools.

Figure 2-3 Sample Cluster Control Panel


From the File or Properties menu you can:

For detailed information about the CCP, refer to the online help.

For information about the programs represented by these tools and their usage, see "2.4 About the Cluster Console". For information about using the HotJava browser to monitor cluster configurations, see "2.5 Monitoring Sun Cluster Servers With Sun Cluster Manager".

2.3.3 CCP Configuration File Locations

The CCP stores properties and other information in configuration files within a configuration directory. By default, the configuration directory is located at /opt/SUNWcluster/etc/ccp.


Note -

You must be root (superuser) to write to this default location. Only root can add, delete, or change CCP items in this configuration directory.


You can, however, create your own configuration directory and define its location using the environment variable $CCP_CONFIG_DIR. The $CCP_CONFIG_DIR variable specifies the configuration directory in which the configuration files containing item properties are stored. If the path name is not set, it defaults to the standard location, /opt/SUNWcluster/etc/ccp. To create your configuration directory, create a new directory and set the environment variable $CCP_CONFIG_DIR to the full path name of the new directory.
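
For example, the following Bourne shell sketch creates a private configuration directory (the directory name is arbitrary) and points the CCP at it:

% mkdir -p $HOME/ccp_config
% CCP_CONFIG_DIR=$HOME/ccp_config; export CCP_CONFIG_DIR
% ccp clustername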

These files do not need to be edited manually, because they are created, modified, or deleted by ccp whenever you create, modify, or delete an item.

2.4 About the Cluster Console

The Cluster Console (CC) GUI enables you to run commands on multiple nodes simultaneously, simplifying cluster administration. The Cluster Console displays one terminal window for each cluster node, plus a small Common window that you can use to control all windows simultaneously.

Different types of remote sessions enable you to connect to the console of the host, or remotely log in by using rlogin or telnet. Hosts can be specified on the command line and added or deleted from the Select Hosts dialog box after the program is running. The session type can be specified only on the command line. Once started, the session type cannot be changed.

You can issue commands to multiple hosts from the Common window, and you can issue commands to a single host from a terminal window. Terminal windows use VT100 terminal emulation.

Alternatively, you can turn off all hosts in the Hosts menu except the one you want to access, then issue commands from the Common window text field.

2.4.1 How to Launch the Cluster Console

You can launch the Cluster Console from the CCP (see "2.3 About the Cluster Control Panel") or from the command line in a shell window. A terminal window is created for each host in the specified cluster, or for each host named on the command line.

    Type cconsole to initiate remote console access:

% cconsole [clustername | hostname...]

    Type ctelnet to establish telnet(1) connections to the cluster hosts:

% ctelnet [clustername | hostname...]

    Launch crlogin with your user name to establish rlogin(1) connections to the cluster hosts:

% crlogin -l username [clustername | hostname...]

All three of the preceding commands also take the standard X/Motif command-line arguments. Once the Cluster Console has started, the Console window is displayed.

For detailed information on the Cluster Console, refer to the online help.

2.4.2 The Common Window Menu Bar

The Common window (shown in the following figure) is the primary window used to send input to all nodes. The Common window is always displayed when you launch the Cluster Console.

Figure 2-4 Common Window Menu Bar of the Cluster Console


This window has a menu bar with three menus and a text field for command entry. From the Hosts menu you can use the Select Hosts dialog box to add hosts to, or remove hosts from, the console session.

From the Options menu, you can group or ungroup the Common Window and the terminal windows.

2.4.3 Configuration Files Used by the Cluster Console

Two configuration files are used by the Cluster Console: clusters and serialports. These can be /etc files or NIS/NIS+ databases. The advantage to using a NIS+ environment is that you can run the Cluster Console on multiple Administrative Workstations. Refer to your NIS/NIS+ System Administration Guide for complete information on NIS/NIS+.

2.4.4 About the clusters File

The clusters file maps a cluster name to the list of host names that comprise the cluster. Each line in the file specifies a cluster, as in this example:

planets      mercury venus earth mars
 wine         zinfandel merlot chardonnay riesling

The clusters file is used by all three session types of the Cluster Console (cconsole, ctelnet, and crlogin) to map cluster names to host names on the command line and in the Select Hosts dialog box. For additional information, see "3.10 Modifying the clusters File".

2.4.5 About the serialports File

The serialports file maps a host name to the Terminal Concentrator and Terminal Concentrator serial port to which the host is connected. Each line in this database specifies a serial port of the host.

Sample serialports file database entries for the Sun Enterprise 10000 are:

mercury    systemserviceprocessorname    23
 venus      systemserviceprocessorname    23
 earth      systemserviceprocessorname    23
 mars       systemserviceprocessorname    23

Sample serialports file database entries for all other nodes are:

mercury        planets-tc   5002
 venus          planets-tc   5003
 earth          planets-tc   5004
 mars           planets-tc   5005

The serialports file is used only by the cconsole variation of this program to determine which Terminal Concentrator and port to connect to for hosts or clusters that are specified in the command line or the Select Hosts dialog box.

In the preceding example, node mercury is connected to planets-tc Port 2, while node venus is connected to planets-tc Port 3. Port 1 is reserved for the administration of the Terminal Concentrator.

For additional information, see "3.11 Modifying the serialports File".

2.5 Monitoring Sun Cluster Servers With Sun Cluster Manager

Sun Cluster Manager (SCM) provides a single interface to many of Sun Cluster's command line monitoring features. SCM consists of two parts: SCM server software, and the SCM Graphical User Interface (GUI). The SCM server runs on each node in the cluster. The SCM GUI runs in a Java Development Kit (JDK) compliant browser such as HotJava. The HotJava browser can be running on any machine, including the cluster nodes. The SCM GUI reports status information on the major cluster components and their subcomponents.

2.6 Running the SCM GUI in a HotJava Browser

The following set of procedures outlines what you need to do to run SCM in a HotJava browser.

You may need to determine whether you have the correct versions of the Java Development Kit (JDK) and the HotJava browser.


Note -

If you choose to use the HotJava browser shipped with the Solaris 2.6 or 2.7 operating environment, you may encounter a problem when using the menus; for example, after you make a menu selection, the menu can remain visible in the browser.


Refer to the Sun Cluster 2.2 Release Notes for complete information on software requirements.

You will need to determine whether you want to run the SCM applet in a HotJava browser directly from a cluster node, or to set up a web server on the cluster nodes to run with SCM. Depending on what you decide, refer to the appropriate procedure below.

2.6.1 How to Determine the JDK Version

    Type the following from the console prompt on the server in your cluster:

java -version

Refer to the Sun Cluster 2.2 Release Notes for complete information on software requirements.

2.6.2 How to Determine the HotJava Version

    From the machine running the HotJava browser, select About HotJava from the Help menu.

Refer to the Sun Cluster 2.2 Release Notes for complete information on software requirements.

2.6.3 How to Run the SCM Applet in a HotJava Browser From a Cluster Node

  1. Run your HotJava browser on a node in the cluster.

  2. Remotely display it on an X windows workstation.

  3. Set the applet security preferences in the HotJava browser:

    1. Choose Applet Security from Preferences on the Edit menu.

    2. Click Medium Security as the Default setting for Unsigned applets.

  4. When you are ready to begin monitoring the cluster with SCM, type the appropriate URL.

    file:/opt/SUNWcluster/scmgr/index.html
    
  5. Click OK in the dialog boxes that ask for permission to access certain files, ports, and so forth from the remote display workstation to the cluster node where the browser is started.


    Note -

    It may take HotJava some time to download and run the applet. No status information will appear during this time.


    Refer to the online help for complete information on menu navigation, tasks, and reference.

2.6.4 How to Set Up a Web Server to Run With SCM

If you choose, you can install a web server on the cluster nodes to run with SCM.

  1. Install a web server that supports HTML 2.0 or later on all nodes in the cluster.

    One option is Sun Web Server, which is available to download at http://www.sun.com/solaris/webserver.


    Note -

    If you are running the HA-HTTP data service and an HTTP server for SCM, you need to configure the HTTP servers to listen on different ports. Otherwise there will be a port conflict between the two.


  2. Follow the web server's configuration procedure to make sure that SCM's index.html file is accessible to the clients.

    The client applet for SCM is in the index.html file in the /opt/SUNWcluster/scmgr directory.

    For example, go to your HTTP server's document root and create a link to the /opt/SUNWcluster/scmgr directory.
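
    If your web server's document root were /var/http/docs (an assumption; substitute your server's actual document root), the link could be created as follows:

    # ln -s /opt/SUNWcluster/scmgr /var/http/docs/scmgr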

  3. Run your HotJava browser from your workstation.

  4. Set the applet security preferences in the HotJava browser:

    1. Choose Applet Security from Preferences on the Edit menu.

    2. Click Medium Security as the Default setting for Unsigned applets

  5. When you are ready to begin monitoring the cluster with SCM, type the appropriate URL.

    http://cluster_node/scmgr/index.html
    
  6. Click OK in the dialog boxes that ask for permission to access certain files, ports, and so forth on the cluster node where the browser is started.


    Note -

    It may take HotJava some time to download and run the applet. No status information will appear during this time.


    Refer to the online help for complete information on menu navigation, tasks, and reference.

2.7 SCM Online Help System

2.7.1 How to Display SCM Online Help

    To display the Help window from SCM, choose Help Contents from the Help menu.

Alternatively, click on the Help Icon (question mark) in the tool bar on top of the folder.

If necessary, you can run online help in a separate browser by typing the following URL:

file:/opt/SUNWcluster/scmgr/help/locale/en/main.howtotopics.html

Chapter 3 Modifying the Sun Cluster Configuration

This chapter provides instructions for modifying the Sun Cluster configuration.

3.1 Adding and Removing Cluster Nodes

When you add or remove cluster nodes, you must reconfigure the Sun Cluster software. When you installed the cluster originally, you specified the number of "active" nodes and the number of "potential" nodes in the cluster, using the scinstall(1M) command. Use the procedures in this section to add "potential" nodes and to remove "active" nodes.

To add nodes that were not already specified as potential nodes, you must halt and reconfigure the entire cluster.

3.1.1 How to Add a Cluster Node

Use this procedure only for nodes that were already specified as "potential" during initial installation.

  1. Use the scinstall(1M) command to install Sun Cluster 2.2 on the node you are adding.

    Use the installation procedures described in the Sun Cluster 2.2 Software Installation Guide, but note the following when responding to scinstall(1M) prompts:

    • When asked for the number of active nodes, include the node you are adding now in the total.

    • You will not be prompted for shared Cluster Configuration Database (CCD) information, because the new cluster will have more than two nodes.

    • (SSVM with direct attached devices only) When prompted for the node lock port, provide the designated node lock device and port.

    • (SSVM only) Do not select a quorum device when prompted. Instead, select complex mode and then -N. Later, you will run the scconf -q command to configure the quorum device.

    • (SSVM only) Select ask when prompted to choose a cluster partitioning behavior.

  2. (Scalable Coherent Interface [SCI] only) Update the sm_config template file to verify information on the new node.

    This step is not necessary for Ethernet configurations.

    Nodes that were specified as "potential" during initial installation should have been included in the sm_config file with their host names commented out by the characters _%. Uncomment the name of the node you will be activating now. Make sure the configuration information in the file matches the physical layout of the node.

  3. (SCI only) Run sm_config.

  4. (SSVM only) Set up the root disk group.

    For details, see the appendix on configuring SSVM and CVM in the Sun Cluster 2.2 Software Installation Guide.

  5. (SDS only) Set up Solstice DiskSuite disksets.

    For details, see the appendix on configuring Solstice DiskSuite in the Sun Cluster 2.2 Software Installation Guide.

  6. If you have a direct attached device connected to all nodes, set up the direct attached disk flag on the new node.

    To set the direct attached flag correctly in the cdb files of all nodes, run the following command on all nodes in the cluster. In this example, the cluster name is sc-cluster:

    # scconf sc-cluster +D
    
  7. (SSVM only) Select a common quorum device.

    If your volume manager is SSVM or CVM and if you have a direct attached device connected to all nodes, run the following command on all nodes and select a common quorum device.

    # scconf sc-cluster -q -D
    

    If you do not have a direct attached disk connected to all nodes, run the following command for each pair of nodes that shares a quorum device with the new node.

    # scconf -q
    
  8. (SSVM only) Set the node lock port on the new node.

    If you just installed a direct attached disk, set the node lock port on all nodes.

    If the cluster contained a direct attached disk already, run the following command on only the new node. In this example, the cluster name is sc-cluster and the Terminal Concentrator is cluster-tc.

    # scconf sc-cluster -t cluster-tc -l port_number
    
  9. Stop the cluster.

  10. Run the scconf -A command on all nodes to update the number of active nodes.

    See the scconf(1M) man page for details. In this example, the cluster name is sc-cluster and the new total number of active nodes is three.

    # scconf sc-cluster -A 3
    
  11. (SSVM only) Remove the shared CCD if it exists, since it is necessary only with two-node clusters.

    Run the following command on all nodes.

    # scconf sc-cluster -S none
    
  12. Use ftp (in binary mode) to copy the cdb file from an existing node to the new node.

    The cdb files normally reside in /etc/opt/SUNWcluster/conf/clustername.cdb.

  13. Reboot the new node.

  14. Start the cluster.

    Run the following command from any single node.

    # scadmin startcluster phys-hahost sc-cluster
    

    Then run the following command on all other nodes.

    # scadmin startnode
    

3.1.2 How to Remove a Cluster Node

The scconf(1M) command enables you to remove nodes by decrementing the number you specified as active nodes when you installed the cluster software with the scinstall(1M) command. For this procedure, you must run the scconf(1M) command on all nodes in the cluster.

  1. For an HA configuration, switch over all logical hosts currently mastered by the node to be removed.

    For parallel database configurations, skip this step.

    # haswitch phys-hahost3 hahost1
    
  2. Run the scconf -A command to exclude the node.

    Run the scconf(1M) command on all cluster nodes. See the scconf(1M) man page for details. In this example, the cluster name is sc-cluster and the new total number of active nodes is two.

    # scconf sc-cluster -A 2
    

3.2 Changing the Name of a Cluster Node

The names of the cluster nodes can be changed using the scconf(1M) command. Refer to the scconf(1M) man page for additional information.

3.2.1 How to Change the Name of a Cluster Node

  1. Find the names of the current cluster nodes.

    You can run the scconf -p command on any node that is an active cluster member.

    # scconf clustername -p 
     Current Configuration for Cluster clustername:
            Hosts in cluster: phys-hahost1 phys-hahost2 phys-hahost3
            Private Network Interfaces for
                phys-hahost1: be0 be1
                phys-hahost2: be0 be1
                phys-hahost3: hme0 hme1
  2. Run the scconf -h command on all nodes in the cluster.

    Run the scconf(1M) command on all nodes. See the scconf(1M) man page for details.

    # scconf clustername -h hostname0 [...hostname3] 
    

    The new node names should be specified in the order shown by the scconf -p command. For example, to change the name of phys-hahost3 to phys-opshost1, you would run the following command on all cluster nodes.

    # scconf sccluster -h phys-hahost1 phys-hahost2 phys-opshost1 
    

3.3 Changing the Private Network Interfaces

The private network interfaces of the nodes in the cluster can be changed using the scconf(1M) command. Refer to the scconf(1M) man page for additional information.

3.3.1 How to Change the Private Network Interfaces

Use the scconf(1M) command for all nodes in the cluster.

For example:

# scconf planets -i mercury scid0 scid1
# scconf planets -i venus   scid0 scid1
# scconf planets -i pluto   scid0 scid1
# scconf planets -i jupiter scid0 scid1

After running these commands, all four nodes mercury, venus, pluto, and jupiter use interfaces scid0 and scid1.


Caution -

While the cluster is up, do not use the ifconfig(1M) command; doing so causes unpredictable behavior on a running system.


3.4 Printing the Cluster Configuration

The cluster configuration information can be printed using scconf(1M). Refer to the scconf(1M) man page for more information.

3.4.1 How to Print the Cluster Configuration

Use the scconf(1M) command on any node that is an active cluster member.

For example:

# scconf planets -p

A response similar to the following is displayed. (Depending on your type of private interconnect, your messages may state hme instead of scid.)

Current Configuration for Cluster planets:
   Hosts in cluster: mercury venus pluto jupiter

   Private Network Interfaces for
       mercury: scid0 scid1
       venus: scid0 scid1
       pluto: scid2 scid3
       jupiter: scid2 scid3

3.5 Adding and Removing Logical Hosts

Logical hosts are the objects that fail over if a node fails. Each logical host is composed of a disk group or groups, a relocatable IP address, and a logical host name. Logical hosts are only used in configurations with HA data services. There are no logical hosts in a parallel database configuration.

You add or remove logical hosts by updating the information on your logical host and reconfiguring the cluster. When you configure the cluster originally, you provide scinstall(1M) with information on your logical host configuration. Once the cluster is up, there are two ways to change this information: by running the scinstall(1M) command and selecting the Change option from the main menu, or by running the scconf(1M) command with the -L option.

3.5.1 How to Add a Logical Host to the Cluster

As part of the process of adding a logical host, you will be asked to provide the following information: the primary public network controller for each node, whether the cluster serves any secondary public subnets, whether to re-initialize NAFO, the name of the new logical host, the name of its default master, whether to enable automatic failback, and the name of the disk group for the logical host.

Gather the answers to these questions before starting the procedure. Note that you must have already set up your disk group to be used by the new logical host. Refer to the appropriate appendix in the Sun Cluster 2.2 Software Installation Guide for your volume manager for details.

Use the following procedure to add a logical host to a cluster.

  1. Run the scinstall(1M) command and select the Change option from the main menu.

    # scinstall
    Assuming a default cluster name of planets
     Note: Cluster planets is currently running.
           Install/Uninstall actions are disabled during
           cluster operation.
    
             <<Press return to continue>>
    
             Checking on installed package state
     ........................
    
     ============ Main Menu =================
    
     1) Change  - Modify cluster or data service configuration.
     2) Verify  - Verify installed package sets.
     3) List    - List installed package sets.
    
     4) Quit    - Quit this program.
     5) Help    - The help screen for this menu.
    
     Please choose one of the menu items: [5]:  1
    
  2. From the Change menu, choose the Logical Hosts option.

    =========== Changes Menu ================
    
     Choices from this menu:
    
     1) Logical Hosts       - Change the logical hosts configuration.
     2) NAFO                - Re-initialize the NAFO configuration.
    
     3) Close  - Close this menu to return to the Main Menu.
     4) Quit   - Exit this program.
     5) Help   - Show the Help screen.
    
     Please choose a displayed option: [5] 1
    

    This will display the Logical Hosts Configuration menu.

  3. From the Logical Hosts Configuration menu, select the Add option.

    ====== Logical Hosts Configuration ======
    
     1) Add     - Add a logical host to the cluster.
     2) Remove  - Remove a logical host from the cluster.
     3) List    - List the logical hosts in the cluster.
    
     4) Close   - Return to the previous menu.
     5) Quit    - Exit.
    
    
     Please choose an option:  1
    

    You will be asked a series of questions regarding the new logical host.

    What is the primary public network controller for "phys-hahost1"?
     What is the primary public network controller for "phys-hahost2"?
     Does the cluster serve any secondary public subnets (yes/no) [no]?
     Re-initialize NAFO on "phys-hahost1" with one ctlr per group
     (yes/no)?
     What is the name of the new logical host?  hahost1
    What is the name of the default master for "hahost1"? phys-hahost1
    Enable automatic failback for "hahost1" (yes/no) [no]?
     Disk group name for logical host "hahost1" [hahost1]?
     Is it okay to add logical host "hahost1" now (yes/no) [yes]?
     /etc/opt/SUNWcluster/conf/ha.cdb
     Checking node status...
  4. Respond to the prompts with the required information.

    Once the scinstall(1M) portion of this procedure is complete, you will be returned to the Logical Hosts Configuration menu.

  5. Create a new HA administrative file system and update the /etc/opt/SUNWcluster/conf/hanfs/vfstab.logicalhost file.

    When you add a new logical host, you must set up a file system on a disk group within the logical host to store administrative information. The steps for setting up the HA administrative file system differ depending on your volume manager. These steps are described in the appendices of the Sun Cluster 2.2 Software Installation Guide.


    Note -

    Do not use host name aliases for the logical hosts. NFS clients mounting Sun Cluster file systems using logical host name aliases might experience statd lock recovery problems.


3.5.2 How to Remove a Logical Host From the Cluster

To remove a logical host from the cluster configuration, the cluster must be up and there must be no data services registered for the logical host.

  1. Stop all data service applications running on the logical host to be removed.

    # hareg -n dataservice
    
  2. Unregister the data service.

    # hareg -u dataservice
    
  3. Remove the logical host from the cluster.

    Run the scinstall(1M) command as described in the Sun Cluster 2.2 Software Installation Guide and select the Change option from the main menu.

    # scinstall
    Assuming a default cluster name of planets
     Note: Cluster planets is currently running.
           Install/Uninstall actions are disabled during
           cluster operation.
    
             <<Press return to continue>>
    
             Checking on installed package state
     ........................
    
     ============ Main Menu =================
    
     1) Change  - Modify cluster or data service configuration.
     2) Verify  - Verify installed package sets.
     3) List    - List installed package sets.
    
     4) Quit    - Quit this program.
     5) Help    - The help screen for this menu.
    
     Please choose one of the menu items: [5]:  1
    
  4. From the Change menu, select the Logical Hosts option.

    =========== Changes Menu ================
    
     Choices from this menu:
    
     1) Logical Hosts       - Change the logical hosts configuration.
     2) NAFO                - Re-initialize the NAFO configuration.
    
     3) Close  - Close this menu to return to the Main Menu.
     4) Quit   - Exit this program.
     5) Help   - Show the Help screen.
    
     Please choose a displayed option: [5] 1
    

    This will display the Logical Hosts Configuration menu.

  5. From the Logical Hosts Configuration menu, select the Remove option.

    ====== Logical Hosts Configuration ======
    
     1) Add     - Add a logical host to the cluster.
     2) Remove  - Remove a logical host from the cluster.
     3) List    - List the logical hosts in the cluster.
    
     4) Close   - Return to the previous menu.
     5) Quit    - Exit.
    
     Please choose an option:  2
    

    This displays a list of configured logical hosts.

  6. Enter the name of the logical host to be removed from the list of configured logical hosts.

    The list of logical hosts is:
    
             hahost1
             hahost2
    
     Which one do you want to remove?  hahost1
    

    The procedure is now complete and you are returned to the Logical Hosts Configuration menu.

  7. As root, delete the /etc/opt/SUNWcluster/conf/hanfs/vfstab.logicalhost file that was created when the logical host was added to the cluster configuration.

3.6 Changing the Logical Host IP Address

You can change a logical host IP address by removing and adding the logical host with the new IP address by using the procedures in "3.5 Adding and Removing Logical Hosts", or by using the procedure in this section.

Refer to the scconf(1M) man page for more information.

3.6.1 How to Change the Logical Host IP Address

The following steps must be done on a single node that is a cluster member.

  1. Remove the existing logical host entry from the configuration files by running the following command on all nodes.

    # scconf clustername -L logicalhost -r
    
  2. Create a new logical host entry, which uses the same logical host name but with the new IP address, by running the following command on all cluster nodes.

    # scconf clustername -L logicalhost -n nodelist -g diskgroup -i interfaces_and_IP
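
    The following hedged example illustrates both steps for a logical host named hahost1 in a cluster named sc-cluster; the host names, disk group, interfaces, and IP address are placeholders, and the exact format of the -n, -g, and -i arguments is described in the scconf(1M) man page.

    # scconf sc-cluster -L hahost1 -r
    # scconf sc-cluster -L hahost1 -n phys-hahost1,phys-hahost2 -g hahost1 -i hme0,hme0,192.9.200.10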
    

3.7 Forcing a Cluster Reconfiguration

You can force a cluster reconfiguration by using the haswitch(1M) command, or by changing the cluster membership by using the scconf(1M) command.

3.7.1 How to Force a Cluster Reconfiguration

Force a cluster reconfiguration by running the haswitch(1M) command on any node that is a cluster member. For example:

# haswitch -r

See the haswitch(1M) man page for details.

3.8 Configuring Sun Cluster Data Services

This section provides procedures for configuring Sun Cluster data services. These data services are normally configured with the logical hosts as part of the cluster installation. However, you can also configure logical hosts and data services after the installation. For more detailed information on a particular data service, see the individual chapter covering the data service in the Sun Cluster 2.2 Software Installation Guide.


Note -

All commands used in this section can be run from any node that is a cluster member, even a node that cannot master the specified logical hosts or run the specified data services. You can run the commands even if there is only one node in the cluster membership.



Caution -

The commands used in this section update the CCD even without a quorum. Consequently, updates to the CCD can be lost if nodes are shut down and brought up in the wrong sequence. Therefore, the node that was the last to leave the cluster must be the first node brought back into the cluster by using the scadmin startcluster command. For more information on the CCD, see the Sun Cluster 2.2 Software Installation Guide.


3.8.1 How to Configure a Sun Cluster Data Service

  1. Verify that the following tasks have been completed.

    • The logical hosts that will run the data services are configured. Refer to "3.5 Adding and Removing Logical Hosts", for details on configuring a logical host.

    • The necessary disk groups, logical volumes, and file systems are set up. Refer to the Sun Cluster 2.2 Software Installation Guide for details.

    • The HA administrative file system and vfstab.logicalhost file have been set up. This procedure varies depending on your volume manager. Refer to the appendix describing how to configure your volume manager in the Sun Cluster 2.2 Software Installation Guide.

  2. Register the data service.

    Register the Sun Cluster data service(s) associated with the logical host(s).

    # hareg -s -r dataservice [-h logicalhost]

    This assumes that the data service has already been installed and its methods are available.

    When the -h option is used in the hareg -r command, the data service is configured only on the logical hosts specified by the logicalhost argument. If the -h option is not specified, then the data service is configured on all currently existing logical hosts. See the hareg(1M) man page for details.


    Note -

    If the data service is to be associated with any logical hosts that are created after the registration of the data service, run scconf -s on all cluster nodes to extend the set of logical hosts associated with the data service.


  3. Start the data service.

    # hareg -y dataservice
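
    For example, to register and start Sun Cluster HA for NFS on logical host hahost1 (shown only as an illustration; the logical host name is a placeholder), you might run:

    # hareg -s -r nfs -h hahost1
    # hareg -y nfs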
    

3.9 Unconfiguring Sun Cluster Data Services

Use this procedure to unconfigure Sun Cluster data services. For more detailed information on a data service, see the individual chapter covering the data service in the Sun Cluster 2.2 Software Installation Guide.

3.9.1 How to Unconfigure Sun Cluster Data Services

  1. Stop all data service applications to be unconfigured.

    Perform your normal shutdown procedures for the data service application.

  2. If the data service is a database management system (DBMS), stop all fault monitors.

  3. Stop the data service on all logical hosts.

    # hareg -n dataservice
    
  4. Unregister the data service.

    # hareg -u dataservice
    

    Note -

    If the hareg -u command fails, it can leave the Cluster Configuration Database (CCD) in an inconsistent state. If this occurs, run scconf clustername -R dataservice on all cluster nodes to forcibly remove the data service from the CCD.


  5. (Optional) Remove the logical hosts from the cluster configuration.

    You can remove a logical host from the cluster configuration only if all data services are disassociated from it.

    Use one of the following methods to remove a logical host.

    • Run this scconf(1M) command on one node that is a cluster member:

    # scconf clustername -L logicalhost -r
    
  6. Perform a cluster reconfiguration by using the haswitch(1M)command.

    # haswitch -r
    

    At your discretion, you may choose to remove or rename the vfstab.logicalhost and dfstab.logicalhost files associated with the logical host you removed, and reclaim the space occupied by its volumes and file systems. These files are untouched by the scconf(1M) remove operation.

3.10 Modifying the clusters File

The /etc/clusters file contains information on the known clusters in the local naming domain. This file, which maps a cluster name to the list of host names in the cluster, can be created as an NIS or NIS+ map, or locally in the /etc directory.

The /etc/clusters file requires updating only if you add a node to or remove a node from the cluster, or if you change the name of the cluster or of a node.

For more information on NIS and NIS+ maps, refer to the NIS/NIS+ System Administration Guide. Refer to the Sun Cluster 2.2 Software Installation Guide for information on creating the /etc/clusters file. NIS/NIS+ files must be changed on the NIS/NIS+ server.

3.10.1 How to Modify the clusters File

Edit the /etc/clusters file to add the cluster name and physical host names of all nodes.

For example, to create a cluster named hacluster that consists of node 0 phys-hahost1, node 1 phys-hahost2, node 2 phys-hahost3, and node 3 phys-hahost4, add the following entry to the file:

# Sun Enterprise Cluster nodes
  hacluster phys-hahost1 phys-hahost2 phys-hahost3 phys-hahost4

Make the same modifications in the /etc/clusters file on each cluster node.

3.10.2 How to Create the clusters Table

In an NIS+ environment, you must create a clusters table. The entries in this table are the same as the entries in the /etc/clusters file.

For example, to create a clusters table in a domain named mydomain in an NIS+ environment, use the following command:

# nistbladm -c key-value key=SI value= clusters.mydomain.

Note -

The trailing period (.) at the end of the nistbladm command is required.


3.11 Modifying the serialports File

The serialports file maps a host name to the Terminal Concentrator and Terminal Concentrator serial port to which the console of the host is connected. This file can be created as an NIS or NIS+ map or locally in the /etc directory.

The serialports file requires updating only if you add a node to the cluster, replace the Terminal Concentrator, or change the Terminal Concentrator ports to which the node consoles are connected.

Refer to the Sun Cluster 2.2 Software Installation Guide for information on creating the /etc/serialports file. For more information on NIS and NIS+ maps, refer to the NIS/NIS+ System Administration Guide.

3.11.1 How to Modify the serialports File

  1. As root, create a serialports file in the /etc directory.

  2. For a Sun Enterprise 10000 system, enter hostname sspname 23 in the serialports file. For all other hardware systems, enter hostname terminal_concentrator serial_port in the serialports file.

    For Sun Enterprise 10000:

    # Sun Enterprise Cluster nodes
       phys-hahost1 sspname 23
       phys-hahost2 sspname 23
       phys-hahost3 sspname 23
       phys-hahost4 sspname 23

    For all other hardware systems:

    # Sun Enterprise Cluster nodes
       phys-hahost1 hacluster-tc    5002
       phys-hahost2 hacluster-tc    5003
       phys-hahost3 hacluster-tc    5004
       phys-hahost4 hacluster-tc    5005

3.11.2 How to Create the serialports Table

In an NIS+ environment, you need to create a serialports table. The entries in this table are the same as the entries in the /etc/serialports file.

To create a serialports table in a domain named mydomain in an NIS+ environment, use the following command:

# nistbladm -c key-value key=SI value= serialports.mydomain.

Note -

The trailing period (.) at the end of the nistbladm command is required.


3.12 Changing TC/SSP Information

When installing Sun Cluster software, information about the Terminal Concentrator (TC) or a System Service Processor (SSP) is required. This information is stored in the cluster configuration database (CCD).

This information is used to:

Both these mechanisms serve to protect data integrity in the case of four-node clusters with directly attached storage devices.

Use the scconf(1M) command to change the TC or SSP information associated with a particular node, as described in the following procedures.

3.12.1 How to Change TC/SSP Information

To change TC or SSP information, run the scconf(1M) command on all cluster nodes. For each node, supply the appropriate new information. The following examples show scconf(1M) command syntax for each type of information change.

Node architecture type and IP address - Supply the cluster name, the host name, the new architecture type, and the new IP address.

# scconf clustername -H hostname -d E10000 -t new_ip_address

Note -

Multiple hosts may be connected to the same TC; the -H option affects only the information associated with the host you specify in the command line.


Password for a TC or SSP - Supply the cluster name, the IP address, and the new password.

# scconf clustername -t ip_address -P
 ip_address (129.34.123.51) Password:

Port number for an SSP console - Supply the cluster name, host name, and new port number.

# scconf clustername -H hostname -p new_port_number

TC name or IP address - Supply the cluster name, host name, and new TC name or IP address.

# scconf clustername -H hostname -t new_tc_name|new_ip_address
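
For example, to record a replacement Terminal Concentrator named hacluster-tc2 (a hypothetical name) for host phys-hahost1 in the cluster haclust, you might run:

# scconf haclust -H phys-hahost1 -t hacluster-tc2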

For additional information on changing TC or SSP information, see the scconf(1M) man page and Chapter 8, Administering the Terminal Concentrator.

3.13 Changing the Quorum Device

A quorum device is used only in SSVM and CVM configurations. It is not used in Solstice DiskSuite configurations.

The scconf -q command can be used to change the quorum device to either a disk or a controller. This option is useful if the quorum device needs servicing. Refer to the scconf(1M) man page for details.


Note -

If the quorum device is a disk, the scconf -q command must be used whenever the disk address (in the form cxtydzs2) changes, even if the serial number of the disk is preserved. This change in disk address can happen if the SBus slot of the drive controller changes.



Caution -

Do not use the scconf -q option to modify the quorum device topology while the cluster is running. You cannot add or remove a quorum device between any two cluster nodes. Specifically, between a pair of nodes in the cluster:


3.13.1 How to Change the Quorum Device

  1. Before servicing the device, you can change the quorum device to a different device by running the scconf -q command on all cluster nodes.

    For example, to change the quorum device in the cluster haclust for nodes phys-hahost1 and phys-hahost2, run the scconf(1M) command as shown below.

    # scconf haclust -q phys-hahost1 phys-hahost2
    Select quorum device for nodes 0 (phys-hahost1) and 1 (phys-hahost2).
     Type the number corresponding to the desired selection.
     For example: 1<CR>
    
      1) DISK:c2t2d0s2:01943825
      2) DISK:c2t3d0s2:09064321
      3) DISK:c2t4d0s2:02171369
      4) DISK:c2t5d0s2:02149886
      5) DISK:c2t8d0s2:09062992
      6) DISK:c2t9d0s2:02166472
      7) DISK:c3t2d0s2:02183692
      8) DISK:c3t3d0s2:02183488
      9) DISK:c3t4d0s2:02160277
     10) DISK:c3t5d0s2:02166396
     11) DISK:c3t8d0s2:02164352
     12) DISK:c3t9d0s2:02164312
     Quorum device: 12
    

    The -q option probes the list of devices attached to each node and lists the devices that the two nodes share. The quorum device can then be selected from this list.

    To enable probing of devices attached to remote hosts, the local /.rhosts file is modified to enable rsh(1) permissions. The permissions are removed after the command completes.


    Note -

    This behavior occurs only if this command is run from all the nodes at the same time. If you do not want remote root access capability, use the -m option.


  2. You may choose either an SSA controller or a disk from this list as the quorum device.

    If you choose an SSA controller, the list of disks in that controller is displayed.

  3. If you chose an SSA controller in Step 2, you are given the option to select a disk from this SSA as the quorum device.

    If no disk is chosen in this step, the SSA controller chosen in the previous step is retained as the quorum device.

    The -q option also checks for the case where a node might have a reservation on the quorum device, due to some other node not being part of the membership. In this case, the -q option releases the reservation on the old quorum device and reserves the new quorum device.


    Note -

    All the specified nodes must be booted for the scconf -q command to run successfully. If any of the nodes is not booted, the command probes and presents the list of all devices on the local node. Be sure to select a shared device as the quorum device.


    If you already know the name of the device to use as the quorum device, use the -m option to specify the new device.

    # scconf clustername -q -m quorum-device hostname1 hostname2 
    

    The quorum device can be an SSA controller's World Wide Name (WWN), a disk identifier of the form WWN.disk-serial-id for disks in SSAs, or a disk identifier of the form disk-address:disk-serial-id for non-SSA disks. The disk-address is of the form cxtydzs2. You can use the finddevices(1M) command to obtain the serial numbers of SSA or non-SSA disks.
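
    For example, using the illustrative cluster and device names shown in the listing in Step 1, you could specify a non-SSA disk directly as the quorum device:

    # scconf haclust -q -m c2t3d0s2:09064321 phys-hahost1 phys-hahost2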

    If you have a cluster with more than two nodes where all nodes share a common quorum device, you can use the -q -D options to specify a new common quorum device.

    # scconf clustername -q -D
    

    Since all the hosts in the cluster share a common device, specifying a list of hosts is unnecessary.

    This is an interactive option that probes the list of devices attached to each host and then presents the list of shared devices. Select the quorum device from this list.


    Note -

    All the active hosts defined in the cluster must be booted for the scconf -q -D command to be successful. If any of the hosts are not booted, the command probes and presents the list of all devices on the local host. Be sure to select a shared device as the quorum device.


    The -q -D option also checks for the case where a node of the cluster may have a reservation on the quorum device, due to some other node not being part of the cluster. In this case, the reservation on the old quorum device is released and the new quorum device is reserved.

    If this command is run from all the nodes at the same time via the cconsole and crlogin GUI interfaces, then the local /.rhosts file is modified to enable rsh(1) permissions. This enables the probing of devices attached to remote hosts. The permissions are removed after the command completes.

    You can add the -m option if remote root access capability is not desired. The -m option specifies the quorum device directly; give the device name as the last argument to the command.

    # scconf clustername -q -D -m quorum-device 
    

    The quorum device is a disk identifier of the form cxtydzs2:disk-serial-ID. Use the finddevices(1M) command to obtain the serial numbers of disks.
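
    For example, using the illustrative disk identifier from the earlier listing, you could set a common quorum device for all nodes as follows:

    # scconf haclust -q -D -m c2t3d0s2:09064321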

3.14 Configuring Timeouts for Cluster Transition Steps

Sun Cluster has configurable timeouts for the cluster transition steps where logical hosts of the HA framework are taken over and given up as cluster membership changes. Adapt these timeouts as needed to effectively handle configurations consisting of large numbers of data services on each node. It is impractical to have constant timeout values for a wide variety of configurations, unless the timeouts are set to a very large default value.

There are essentially two considerations when tuning timeouts:

It is difficult to estimate what the correct value should be for a particular installation. These values should be arrived at by trial and error. You can use as guidelines the cluster console messages related to the beginning and end of each cluster transition step. They should give you a fairly good idea of how long a step takes to execute.

The timeouts need to account for the worst case scenario. When you configure cluster timeouts, ask yourself, "What is the maximum number of logical hosts that a cluster node can potentially master at any time?"

For example, in an N+1 configuration, the standby node can potentially master all the logical hosts of the other cluster nodes. In this case, the reconfiguration timeouts must be large enough to accommodate the time needed to master all of the logical hosts configured in the cluster.

3.14.1 How to Adjust Cluster Timeouts

  1. Adjust the cluster reconfiguration timeouts by using the scconf -T command.

    For example, to change the configurable transition step timeout values to 500 seconds, you would run the following command on all cluster nodes.

    # scconf clustername -T 500

    The default values for these steps are 720 seconds. Use the scconf -p command to see the current timeout values.
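
    For example, you can display the current cluster configuration, including the timeout values, with the following command (clustername is the placeholder used throughout this guide):

    # scconf clustername -p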

    If there is insufficient time to master all the logical hosts in these steps, error messages are printed on the cluster console.

    Within the reconfiguration steps, the time taken to master a single logical host can vary depending on how many data services are configured on each logical host. If there is insufficient time to master a logical host (that is, if the loghost_timeout parameter is too small), messages similar to the following appear on the console:

    ID[SUNWcluster.ccd.ccdd.5001]: error freeze cmd =
     command /opt/SUNWcluster/bin/loghost_sync timed out.

    In this example, the cluster framework makes a "best effort" to bring the system to a consistent state by attempting to give up the logical host. If this is not successful, the node may abort out of the cluster to prevent inconsistencies.

  2. Use the scconf -l option to adjust the loghost_timeout parameter.

    The default is 180 seconds.
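
    For example, to set the loghost_timeout parameter to 360 seconds, you might run a command like the following on all cluster nodes; the exact argument form is an assumption here, so verify it against the scconf(1M) man page.

    # scconf clustername -l 360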


    Note -

    The reconfiguration step timeouts can never be less than the loghost_timeout value. Otherwise, an error results and the cluster configuration file is not modified. This requirement is verified by the scconf -T or scconf -l options. A warning is printed if either of these timeouts is set to 100 seconds or less.


Chapter 4 General Sun Cluster Administration

This chapter provides instructions for the following topics.

This chapter includes the following procedures:

4.1 Starting the Cluster and Cluster Nodes

The scadmin startcluster command is used to make a node the first member of the cluster. This node becomes node 0 of the cluster. The other Sun Cluster nodes are started by a single command, scadmin startnode. This command starts the programs required for multinode synchronization, and coordinates integration of the other nodes with the first node (if the Sun Cluster software is already running on the first node). You can remove nodes from the cluster by using the scadmin command with the stopnode option on the node that you are removing from the cluster.

Make the local node the first member node in the cluster. This node must be a configured node of the cluster in order to run the scadmin startcluster command successfully. This command must complete successfully before any other nodes can join the cluster. If the local node aborts for any reason while the subsequent nodes are joining the cluster, the result might be a corrupted CCD. If this scenario occurs, restore the CCD using the procedure "4.11.3 How to Restore the CCD".

To make the local node a configured node of the cluster, see "3.1 Adding and Removing Cluster Nodes".

4.1.1 How to Start the Cluster

It is important that no other nodes are running the cluster software at this time. If this node detects that another cluster node is active, the local node aborts.

  1. Start the first node of the cluster by using the scadmin(1M) command.

    # scadmin startcluster localnode clustername
    

    The startcluster option does not run if localnode does not match the name of the node on which the command runs. See the scadmin(1M) man page for details.

    For example:

    phys-hahost1# scadmin startcluster phys-hahost1 haclust
     Node specified is phys-hahost1
     Cluster specified is haclust
    
     =========================== WARNING============================
     =                     Creating a new cluster 	 	 	 	 	 	 	 	 	=
     ===============================================================
    
     You are attempting to start up the cluster node 'phys-hahost1' 
     as the only node in a new cluster.  It is important that no 
     other cluster nodes be active at this time.  If this node hears 
     from other cluster nodes, this node will abort. Other nodes may 
     only join after this command has completed successfully.  Data 
     corruption may occur if more than one cluster is active.
    
    
     Do you want to continue [y,n,?] y
    

    If you receive a reconfig.4013 error message, then either there is already a node in a cluster, or another node is still in the process of going down. Run the get_node_status(1M) command on the node that might be up to determine that node's status.

  2. Add all other nodes to the cluster.

    Run the following command on all other nodes. This command can be run on multiple nodes at the same time.

    # scadmin startnode
    

    If you receive the following reconfig.4015 error message, there might be no existing cluster. Restart the cluster by using the scadmin startcluster localnode clustername command.

    SUNWcluster.clustd.reconf.4015
     "Aborting--no existing or intact cluster to join."

    Alternatively, there might be a partition or node failure. (For example, a third node might attempt to join a two-node cluster just as one of the two nodes fails.) If this happens, wait until the failure processing completes, fix any problems, and then attempt to rejoin the cluster.

    If any required software packages are missing, the command fails and the console displays a message similar to the following:

    Assuming a default cluster name of haclust
     Error: Required SC package `SUNWccm' not installed!
     Aborting cluster startup.

    For information on installing the Sun Cluster software packages, refer to the Sun Cluster 2.2 Software Installation Guide.

4.2 Stopping the Cluster and Cluster Nodes

Putting a node in any mode other than multiuser, or halting or rebooting the node, requires stopping the Sun Cluster membership monitor. Then your site's preferred method can be used for further node maintenance.

Stopping the cluster requires stopping the membership monitor on all cluster nodes by running the scadmin stopnode command on all nodes simultaneously.

phys-hahost1# haswitch ...
 phys-hahost1# scadmin stopnode

If a logical host is owned by the node when the scadmin stopnode command is run, ownership will be transferred to another node that can master the logical host before the membership monitor is stopped. If the other possible master of the logical host is down, the scadmin stopnode command will shut down the data services in addition to stopping the membership monitor.

After the scadmin stopnode command runs, Sun Cluster will remain stopped, even across system reboots, until the scadmin startnode command is run.

The scadmin stopnode command removes the node from the cluster. In the absence of other simultaneous failures, you may shut down as many nodes as you choose without losing quorum among the remaining nodes. (If quorum is lost, the entire cluster shuts down.)

If you shut down a node for disk maintenance, you also must prepare the boot disk or data disk using the procedures described in Chapter 10, Administering Sun Cluster Local Disks for boot disks, or those described in your volume manager documentation for data disks.

You might have to shut down one or more Sun Cluster nodes to perform hardware maintenance procedures such as adding or removing SBus cards. The following sections describe the procedure for shutting down a single node or the entire cluster.

4.2.1 How to Stop Sun Cluster on a Cluster Node

  1. If it is not necessary to have the data remain available, place the logical hosts (disk groups) into maintenance mode.

    phys-hahost2# haswitch -m logicalhost
    

    Refer to the haswitch(1M) man page for details.


    Note -

    It is possible to halt a Sun Cluster node by using the halt(1M) command, allowing a failover to restore the logical host services on the backup node. However, the halt(1M) operation might cause the node to panic. The haswitch(1M) command offers a more reliable method of switching ownership of the logical hosts.


  2. Stop Sun Cluster on one node without stopping services running on the other nodes in the cluster.

    phys-hahost1# scadmin stopnode
    
  3. Halt the node.

    phys-hahost1# halt
    

    The node is now ready for maintenance work.

4.2.2 How to Stop Sun Cluster on All Nodes

You might want to shut down all nodes in a Sun Cluster configuration if a hazardous environmental condition exists, such as a cooling failure or a severe lightning storm.

  1. Stop the membership monitor on all nodes simultaneously by using the scadmin(1M) command.

    You can do this in one step using the Cluster Console.

    phys-hahost1# scadmin stopnode
    ...
  2. Halt all nodes using halt(1M).

    phys-hahost1# halt
    ...

4.2.3 How to Halt a Sun Cluster Node

Shut down any Sun Cluster node by using the halt(1M) command or the uadmin(1M) command.

If the membership monitor is running when a node is shut down, the node will most likely take a "Failfast timeout" and display the following message:

panic[cpu9]/thread=0x50f939e0: Failfast timeout - unit 

You can avoid this by stopping the membership monitor before shutting down the node. Refer to the procedure, "4.2.2 How to Stop Sun Cluster on All Nodes", for additional information.

4.2.4 Stopping the Membership Monitor While Running RDBMS Instances

Database server instances can run on a node only after you have invoked the startnode option and the node has successfully joined the cluster. All database instances should be shut down before the stopnode option is invoked.


Note -

If you are running Oracle7 Parallel Server, Oracle8 Parallel Server, or Informix XPS, refer to your product documentation for shutdown procedures.


If the stopnode command is executed while the Oracle7 or Oracle8 instance is still running on the node, stopnode will hang and the following message is displayed on the console:

ID[vxclust]: stop: waiting for applications to end

The Oracle7 or Oracle8 instance must be shut down for the stopnode command to terminate successfully.

If the stopnode command is executed while the Informix-Online XPS instance is still running on the node, the database hangs and becomes unusable.

4.3 Switching Over Logical Hosts

The haswitch(1M) command is used to switch over the specified logical hosts (and their associated disk groups, data services, and logical IP addresses) to the specified destination node. For example, the following command switches over logical hosts hahost1 and hahost2 so that both are mastered by phys-hahost1.

# haswitch phys-hahost1 hahost1 hahost2

If the logical host has more than one data service configured on it, you cannot selectively switch over just one data service, or a subset of the data services. Your only option is to switch over all the data services on the logical host.


Note -

Both the destination host and the current master of the logical host must be in the cluster membership; otherwise, the command fails.


4.4 Disabling Automatic Switchover

In clusters providing HA data services, when a node fails, the logical hosts it mastered are switched over to another node. When the failed node later rejoins the cluster, its logical hosts are automatically remastered by their default master, unless you have configured them to remain mastered by the host to which they were switched.

If you do not want a logical host to be automatically switched back to its original master, use the -m option of the scconf(1M) command. Refer to the scconf(1M) man page for details.


Note -

To disable automatic switchover for a logical host, you need only run the scconf(1M) command on a single node that is an active member of the cluster.


# scconf clustername -L logicalhost -n node1,node2 -g dg1 -i qe0,qe0,logaddr1 -m

4.5 Putting Logical Hosts in Maintenance Mode

Maintenance mode is useful for some administration tasks on file systems and disk groups. To put the disk groups of a logical host into maintenance mode, use the -m option to the haswitch(1M) command.


Note -

Unlike other types of ownership of a logical host, maintenance mode persists across node reboots.


For example, this command puts logical host hahost1 in maintenance mode.

phys-hahost2# haswitch -m hahost1

This command stops the data services associated with hahost1 on the Sun Cluster node that currently owns the disk group, and also halts the fault monitoring programs associated with hahost1 on all Sun Cluster nodes. The command also executes a umount(1M) of any Sun Cluster file systems on the logical host. The associated disk group ownership is released.

This command runs on any host, regardless of current ownership of the logical host and disk group.

You can remove a logical host from maintenance mode by performing a switchover specifying the physical host that is to own the disk group. For example, you could use the following command to remove hahost1 from maintenance mode:

phys-hahost1# haswitch phys-hahost1 hahost1 

4.6 Recovering From Cluster Partitions

Multiple failures (including network partitions) might result in subsets of cluster members attempting to remain in the cluster. Usually, these subsets have lost partial or total communication with each other. In such cases, the software attempts to ensure that there is only one resultant valid cluster. To achieve this, the software might cause some or all nodes to abort. The following discussion explains the criterion used to make these decisions.

The quorum criterion is defined as a subset containing at least half the members of the original set of cluster nodes (not the full set of configured nodes). If a subset does not meet the quorum criterion, the nodes in the subset abort themselves and a reconfig.4014 error message is displayed. Failure to meet the quorum criterion could be due to a network partition or to a simultaneous failure of more than half of the nodes.


Note -

Valid clusters only contain nodes that can communicate with each other over private networks.


Consider a four-node cluster that partitions itself into two subsets: one subset consists of one node, while the other subset consists of three nodes. Each subset attempts to meet the quorum criterion. The first subset has only one node (out of the original four) and does not meet the quorum criterion. Hence, the node in the first subset shuts down. The second subset has three nodes (out of the original four), meets the quorum criterion, and therefore stays up.

Alternatively, consider a two-node cluster with a quorum device. If there is a partition in such a configuration, then one node and the quorum device meet the quorum criterion and the cluster stays up.

4.6.1 Split-Brain Partitions (SSVM or CVM Only)

A split-brain partition occurs if a subset has exactly half the cluster members. (The split-brain partition does not include the scenario of a two-node cluster with a quorum device.) During initial installation of Sun Cluster, you were prompted to choose your preferred type of recovery from a split-brain scenario. Your choices were ask and select. If you chose ask, then if a split-brain partition occurs, the system asks you for a decision about which nodes should stay up. If you chose select, the system automatically selects for you which cluster members should stay up.

If you chose an automatic selection policy to deal with split-brain situations, your options were Lowest Nodeid or Highest Nodeid. If you chose Lowest Nodeid, then the subset containing the node with the lowest ID value becomes the new cluster. If you chose Highest Nodeid, then the subset containing the node with the highest ID value becomes the new cluster. For more details, see the section on installation procedures in the Sun Cluster 2.2 Software Installation Guide.

In either case, you must manually abort the nodes in all other subsets.

If you did not choose an automatic selection policy or if the system prompts you for input at the time of the partition, then the system displays the following error message.

SUNWcluster.clustd.reconf.3010
 "*** ISSUE ABORTPARTITION OR CONTINUEPARTITION *** 
Proposed cluster: xxx  
Unreachable nodes: yyy"

Additionally, a message similar to the following is displayed on the console every ten seconds:

*** ISSUE ABORTPARTITION OR CONTINUEPARTITION ***
 If the unreachable nodes have formed a cluster, issue ABORTPARTITION.
 (scadmin abortpartition <localnode> <clustername>)
 You may allow the proposed cluster to form by issuing CONTINUEPARTITION.
 (scadmin continuepartition <localnode> <clustername>)
 Proposed cluster partition:  0  Unreachable nodes: 1

If you did not choose an automatic select process, use the procedure "4.6.2 How to Choose a New Cluster" to choose the new cluster.


Note -

To restart the cluster after a split-brain failure, you must wait for the stopped node to come up entirely (it might undergo automatic reconfiguration or reboot) before you bring it back into the cluster using the scadmin startnode command.


4.6.2 How to Choose a New Cluster

  1. Determine which subset should form the new cluster. Run the following command on one node in the subset that should abort.

    # scadmin abortpartition
    

    When the abortpartition command is issued on one node, the Cluster Membership Monitor (CMM) propagates that command to all the nodes in that partition. Therefore, if all nodes in that partition receive the command, they all abort. However, if some of the nodes in the partition cannot be contacted by the CMM, then they have to be manually aborted. Run the scadmin abortpartition command on any remaining nodes that do not abort.

  2. Run the following command on one node in the subset that should stay up.

    # scadmin continuepartition
    

    Note -

    A further reconfiguration occurs if there has been another failure within the new cluster. At all times, only one cluster is active.


4.7 Maintaining the /var File System

Because Solaris and Sun Cluster software error messages are written to the /var/adm/messages file, the /var file system can become full. If the /var file system becomes full while the node is running, the node continues to run, but you probably cannot log in to it. If the node goes down, Sun Cluster does not start and login is not possible. If this happens, you must reboot in single-user mode (boot -s).

If the node reports a full /var file system and continues to run Sun Cluster services, follow the steps outlined in the following procedure.

4.7.1 How to Repair a Full /var File System

In this example, phys-hahost1 has a full /var file system.

  1. Perform a switchover.

    Move all logical hosts off the node experiencing the problem.

    phys-hahost2# haswitch phys-hahost2 hahost1 hahost2
    
  2. Remove the node from the cluster membership.

    If you have an active login to phys-hahost1, enter the following:

    phys-hahost1# scadmin stopnode
    

    If you do not have an active login to phys-hahost1, halt the node.

  3. Reboot the node in single-user mode.

    (0) ok boot -s
    INIT: SINGLE USER MODE
    
     Type Ctrl-d to proceed with normal startup,
     (or give root password for system maintenance): root_password
     Entering System Maintenance Mode
    
     Sun Microsystems Inc. SunOS 5.6 Generic August 1997
  4. Perform the steps you would normally take to clear the full file system.

  5. After the file system is cleared, enter multiuser mode.

    # exit
    
  6. Use the scadmin startnode command to cause the node to rejoin the configuration.

    # scadmin startnode
    

4.8 Administering the Time in Sun Cluster Configurations

We recommend that you use the Network Time Protocol (NTP) supplied with the Solaris operating environment to maintain time synchronization between cluster nodes.


Caution -

Do not adjust the time of the nodes in a Sun Cluster configuration. Never attempt to perform a time change using the date(1), rdate(1M), or xntpdate(1M) commands.


In the Sun Cluster environment, the cluster nodes can run as NTP clients. You must have an NTP server set up and configured outside the cluster to use NTP; the cluster nodes cannot be configured to be NTP servers. Refer to the xntpd(1M) man page for information about NTP clients and servers.

If you are running cluster nodes as NTP clients, make sure that there are no crontab(1) entries that call ntpdate(1M). It is safer to run xntpd(1M) on the clients because that keeps the clocks in sync without making large jumps forward or backward.
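
For example, you can verify on each cluster node that the root crontab contains no entries that call ntpdate(1M); this check is illustrative only.

# crontab -l | grep ntpdate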

4.9 Replacing a Failed Node

Complete the following steps when one node has a hardware failure and needs to be replaced with a new node.


Note -

This procedure assumes the root disk of the failed node is still operational and can be used. If your failed root disk is not mirrored, contact your local Sun Enterprise Service representative or your local authorized service provider for assistance.


4.9.1 How to Replace a Failed Node

If the failed node is not operational, start at Step 5.

  1. If you have a parallel database configuration, stop the database.


    Note -

    Refer to the appropriate documentation for your data services. All HA applications are automatically shut down with the scadmin stopnode command.


  2. Use the Cluster Console to open a terminal window.

  3. As root, enter the following command in the terminal window.

    This command removes the node from the cluster, stops the Sun Cluster software, and disables the volume manager on that node.

    # scadmin stopnode
    
  4. Halt the operating system on the node.

    Refer to the Solaris System Administrator's Guide.

  5. Power off the node.

    Refer to your hardware service manual for more information.


    Caution -

    Do not disconnect any cables from the failed node at this time.


  6. Remove the boot disk from the failed node.

    Refer to your hardware service manual for more information.

  7. Place the boot disk in the identical slot in the new node.

    The root disk should be accessible at the same address as before. Refer to your hardware service manual for more information.


    Note -

    Be sure that the new node has the same IP address as the failed system. You may need to modify the boot servers or arp servers to remap the IP address to the new Ethernet address. For more information, refer to the NIS+ and DNS Setup and Configuration Guide.


  8. Power on the new node.

    Refer to your hardware service manual for more information.

  9. If the node automatically boots, shut down the operating system and take the system to the OpenBoot PROM monitor.

    For more information, refer to the shutdown(1M) man page.

  10. Refer to your hardware planning and installation guide to ensure that every scsi-initiator-id is set correctly.

  11. Power off the new node.

    Refer to your hardware service manual for more information.

  12. On the surviving node that shares the multihost disks with the failed node, detach all of the disks in one disk expansion unit attached to the failed node.

    Refer to your hardware service manual for more information.

  13. Power off the disk expansion unit.

    Refer to your hardware service manual for more information.


    Note -

    As you replace the failed node, messages similar to the following might appear on the system console. Disregard these messages, because they might not indicate a problem.


    Nov  3 17:44:00 updb10a unix: WARNING: /sbus@1f,0/SUNW,fas@0,8800000/sd@2,0 (sd17):
     Nov  3 17:44:00 updb10a unix: SCSI transport failed: reason 'incomplete': retrying command
     Nov  3 17:44:03 updb10a unix: WARNING: /sbus@1f,0/SUNW,fas@0,8800000/sd@2,0 (sd17):
     Nov  3 17:44:03 updb10a unix:   disk not responding to selection
  14. Detach the SCSI cable from the failed node and attach it to the corresponding slot on the new node.

    Refer to your hardware service manual for more information.

  15. Power on the disk expansion unit.

    Refer to your hardware service manual for more information.

  16. Reattach all of the disks you detached in Step 12.

    Refer to your hardware service manual for more information.

  17. Wait for volume recovery to complete on all the volumes in the disk expansion unit before detaching the corresponding mirror disk expansion unit.

    Use your volume manager software to determine when volume recovery has occurred.

  18. Repeat Step 12 through Step 17 for all of the remaining disk expansion units.

  19. Power on the replaced (new) node.

    Refer to your hardware service manual for more information.

  20. Reboot the node and wait for the system to come up.

    <#0> boot
    
  21. Determine the Ethernet address on the replaced (new) node.

    # /usr/sbin/arp nodename
    
  22. Determine the node ID of the replaced node.

    By the process of elimination, you can determine which node is not in the cluster. The node IDs should be numbered consecutively starting with node 0.

    # get_node_status
    sc: included in running cluster
     node id: 0       
     membership: 0
     interconnect0: unknown
     interconnect1: unknown
     vm_type: cvm
     vm_on_node: master
     vm: up
     db: down
  23. Inform the cluster system of the new Ethernet address (of the replaced node) by entering the following command on all the cluster nodes.

    # scconf clustername -N node-id ethernet-address-of-host 
    

    Continuing with the example in Step 22, the node ID is 1:

    # scconf clustername -N 1 ethernet-address-of-host
    
  24. Start up the replaced node.

    # scadmin startnode
    
  25. In a parallel database configuration, restart the database.


    Note -

    Refer to the appropriate documentation for your data services. All HA applications are automatically started with the scadmin startcluster and scadmin startnode commands.


4.10 Replacing a Failed Terminal Concentrator

The Terminal Concentrator need not be operational for the cluster to stay up. If the Terminal Concentrator fails, the cluster itself does not fail.

You can replace a failed Terminal Concentrator without affecting the cluster. If the new Terminal Concentrator has retained the same name, IP address, and password as the original, then no cluster commands are required. Simply plug in the new Terminal Concentrator and it will work as expected.

If the replacement Terminal Concentrator has a new name, IP address, or password, use the scconf(1M) command as described in "3.12 Changing TC/SSP Information", to change this information in the cluster database. This can be done with the cluster running without affecting cluster operations.

4.11 Administering the Cluster Configuration Database

The ccdadm(1M) command is used to perform administrative procedures on the Cluster Configuration Database (CCD). Refer to the ccdadm(1M) man page for additional information.


Note -

As root, you can run the ccdadm(1M) command from any active node. This command updates all the nodes in your cluster.


It is good practice to checkpoint the CCD by using the -c (checkpoint) option of ccdadm(1M) each time the cluster configuration is updated. The Sun Cluster framework uses the CCD extensively to store configuration data related to logical hosts and HA data services. The CCD is also used to store the network adapter configuration data used by PNM. We strongly recommend that after any change to the HA or PNM configuration of the cluster, you capture the current valid snapshot of the CCD by using the -c option, as insurance against problems that can occur under future fault scenarios. This is no different from requiring database or system administrators to back up their data frequently to guard against unforeseen catastrophes.

4.11.1 How to Verify CCD Global Consistency

Run the -v option whenever there may be a problem with the Dynamic CCD.

This option compares the consistency record of each CCD copy on all the cluster nodes, enabling you to verify that the database is consistent across all the nodes. CCD queries are disabled while the verification is in progress.

# ccdadm clustername -v

4.11.2 How to Back Up the CCD

Run the -c option once a week or whenever the cluster configuration is updated.

This option makes a backup copy of the Dynamic CCD. The backup copy subsequently can be used to restore the Dynamic CCD by using the -r option. See "4.11.3 How to Restore the CCD" for more information.


Note -

When backing up the CCD, put all logical hosts in maintenance mode before running the ccdadm -c command. Because the logical hosts must also be in maintenance mode when you restore the CCD database, taking the backup in the same state prevents unnecessary errors or problems.


# ccdadm clustername -c checkpoint-filename

In this command, checkpoint-filename is the name of your backup copy.

4.11.3 How to Restore the CCD

Run ccdadm(1M) with the -r option whenever the CCD has been corrupted. This option discards the current copy of the Dynamic CCD and replaces it with the contents of the restore file you supply. Use this command to initialize or restore the Dynamic CCD when the ccdd(1M) reconfiguration algorithm fails to elect a valid CCD copy upon cluster restart. The CCD is then marked valid.

  1. If necessary, disable the quorum.

    See "4.11.4 How to Enable or Disable the CCD Quorum" for more information.

    # ccdadm clustername -q off
    
  2. Put the logical hosts in maintenance mode.

    # haswitch -m logicalhosts
    
  3. Restore the CCD.

    In this command, restore-filename is the name of the file you are restoring.

    # ccdadm clustername -r restore-filename
    
  4. If necessary, turn the CCD quorum back on.

    # ccdadm clustername -q on
    
  5. Bring the logical hosts back online.

    For example:

    # haswitch phys-host1 logicalhost1
     # haswitch phys-host2 logicalhost2

4.11.4 How to Enable or Disable the CCD Quorum

Typically, the cluster software requires a quorum before updating the CCD. The -q option enables you to disable this restriction and to update the CCD with any number of nodes.

Run this option to enable or disable a quorum when updating or restoring the Dynamic CCD. The quorum_flag is a toggle: on (to enable) or off (to disable) a quorum. By default, the quorum is enabled.

For example, suppose you have three physical nodes, so you need at least two nodes to perform updates. If a hardware failure leaves only one node up, the cluster software does not allow you to update the CCD. However, by running the ccdadm -q command, you can toggle off this restriction and update the CCD.

# ccdadm clustername -q on|off

4.11.5 How to Purify the CCD

The -p option enables you to purify (verify the contents and check the syntax of) the CCD database file. Run this option whenever there is a syntax error in the CCD database file.

# ccdadm -p CCD-filename

The -p option reports any format errors in the candidate file and writes a corrected version of the file into the file filename.pure. You can then restore this "pure" file as the new CCD database. See "4.11.3 How to Restore the CCD" for more information.

4.11.6 How to Disable the Shared CCD

In some situations, you might want to disable a shared CCD. Such situations might include troubleshooting scenarios, or conversion of a two-node cluster to a three-node cluster such that you no longer need a shared CCD.

  1. Stop one node in the cluster.

    The following command stops phys-hahost2:

    phys-hahost2# scadmin stopnode
    
  2. Back up the shared CCD to a safe location.

  3. Turn off the shared CCD by using the scconf(1M) command.

    Run this command on all nodes.

    phys-hahost1# scconf -S none
    
  4. Copy the CCD from the shared diskset to both nodes.

  5. Unmount the shared CCD volume.

    phys-hahost1# umount /etc/opt/SUNWcluster/conf/ccdssa
    
  6. Deport the disk group on which the shared CCD resided.

    phys-hahost1# vxdg deport sc_dg
    
  7. Restart the stopped node.

    phys-hahost2# scadmin startnode
    

    The private CCDs on each node should now be identical. To reinstate the shared CCD, follow the steps described in the appendix on configuring SSVM in the Sun Cluster 2.2 Software Installation Guide.

4.11.7 Troubleshooting the CCD

The system logs errors in the CCD to the /var/opt/SUNWcluster/ccd/ccd.log file. Critical error messages are also passed to the Cluster Console. Additionally, in the rare case of a crash, the software creates a core file under /var/opt/SUNWcluster/ccd.

The following is an example of the ccd.log file.

lpc204# cat ccd.log
Apr 16 14:54:05 lpc204 ID[SUNWcluster.ccd.ccdd.1005]: (info) starting `START' transition 
with time-out 10000
Apr 16 14:54:05 lpc204 ID[SUNWcluster.ccd.ccdd.1005]: (info) completed `START' transition 
with status 0
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1005]: (info) starting `STEP1' transition 
with time-out 20000
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1000]: (info) Nodeid = 0 Up = 0 Gennum = 0 
Date = Feb 14 10h30m00 1997 Restore = 4
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1002]: (info) start reconfiguration elected 
CCD from Nodeid = 0
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1004]: (info) the init CCD database is 
consistent
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1001]: (info) Node is up as a one-node 
cluster after scadmin startcluster; skipping ccd quorum test
Apr 16 14:54:06 lpc204 ID[SUNWcluster.ccd.ccdd.1005]: (info) completed `STEP1' transition 
with status 0

The following table lists the most common error messages with suggestions for resolving the problem. Refer to the Sun Cluster 2.2 Error Messages Manual for the complete list of error messages.

Table 4-1 Common Error Messages for the Cluster Configuration Database

Number Range  Explanation                        What You Should Do
4200          Cannot open file                   Restore the CCD by running the ccdadm -r command.
4302          File not found                     Restore the CCD by running the ccdadm -r command.
4307          Inconsistent Init CCD              Remove, then reinstall, the Sun Cluster software.
4402          Error registering RPC server       Check your public network (networking problem).
4403          RPC client create failed           Check your public network (networking problem).
5000          System execution error             The synchronization script has an error. Check the permissions on the script.
5300          Invalid CCD, needs to be restored  Restore the CCD by running the ccdadm -r command.
5304          Error running freeze command       There are incorrect arguments in the executed synchronization script. Check that the format of the script is correct.
5306          Cluster pointer is null            This message indicates that the cluster does not exist (ccdadm cluster). Check that you typed the cluster name correctly.

4.12 Reserving Shared Disks (SSVM and CVM)

The list of disks maintained by the volume manager is used as the set of devices for failure fencing. If there are no disk groups present in a system, there are no devices for failure fencing (there is effectively no data to protect). However, when new shared disk groups are imported while one or more nodes are not in the cluster, the cluster must be informed that an extra set of devices needs failure fencing.

4.12.1 How to Reserve Shared Devices (SSVM and CVM)

When new shared disk groups are imported while one or more nodes are not in the cluster, the cluster must be informed that an extra set of devices needs failure fencing. Do this by running the scadmin resdisks command from a node that can access the new disk group or groups.

# scadmin resdisks

This command reserves all the devices connected to a node if no other node that has connectivity to the same set of devices is in the cluster membership. That is, the devices are reserved only if exactly one node, out of all the nodes with direct physical connectivity to the devices, is in the cluster membership. If this condition is not met, the scadmin resdisks command has no effect. The command also fails if a cluster reconfiguration is in progress. Reservations on shared devices are released automatically when this one node is shut down, or when other nodes with direct connectivity to the shared devices join the cluster membership.


Note -

It is unnecessary to run the scadmin resdisks command if shared disk groups are imported while all nodes are in the cluster. Reservations and failure fencing are not relevant if full cluster membership is present.


However, if a shared disk group is deported, the reservations on the shared devices in that disk group are not released. They are released only when the node holding the reservations is shut down, or when another node that shares those devices joins the cluster.

To enable the set of disks belonging to the deported disk group to be used immediately, enter the following two commands in succession on all cluster nodes, after deporting the shared disk group:

# scadmin reldisks
# scadmin resdisks

The first command releases reservations on all shared devices. The second command effectively redoes the reservations based on the currently imported set of disk groups, and automatically excludes the set of disks associated with deported disk groups.

Chapter 5 Recovering From Power Loss

This chapter describes different power loss scenarios and the steps you take to return the system to normal operation. The topics in this chapter are listed below.

Maintaining Sun Cluster configurations includes handling failures such as power loss. A power loss can shut down an entire Sun Cluster configuration, or one or more components within a configuration. Sun Cluster nodes behave differently depending on which components lose power. The following sections describe typical scenarios and expected behavior.

5.1 Recovering From Total Power Loss

In Sun Cluster configurations with a single power source, a power failure takes down all Sun Cluster nodes along with their multihost disk expansion units. When all nodes lose power, the entire configuration fails.

In a total-failure scenario, there are two ways in which the cluster hardware might come back up.

5.2 Recovering From Partial Power Loss

If the Sun Cluster nodes and the multihost disk expansion units have separate power sources, a failure can take down one or more components. Several scenarios can occur. The most likely cases are:

5.2.1 Failure of One Node

If separate power sources are used on the nodes and the multihost disk expansion units, and you lose power to only one of the nodes, the other node detects the failure and initiates a takeover.

When power is restored to the failed node, it reboots. You must rejoin it to the cluster by running the scadmin startnode command on that node, and then perform a manual switchover with the haswitch(1M) command to restore the default logical host ownership.
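
For example, after power returns to phys-hahost1 and it reboots, you might run the following commands (the host and logical host names match those used elsewhere in this guide):

phys-hahost1# scadmin startnode
phys-hahost1# haswitch phys-hahost1 hahost1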

5.2.2 Failure of a Multihost Disk Expansion Unit

If you lose power to one of the multihost disk expansion units, your volume management software detects errors on the affected disks and takes action to put them into an error state. Disk mirroring masks this failure from the Sun Cluster fault monitoring. No switchover or takeover occurs.

When power is returned to the multihost disk expansion unit, perform the procedure documented in Chapter 11, Administering SPARCstorage Arrays, or Chapter 12, Administering Sun StorEdge MultiPacks and Sun StorEdge D1000s.

5.2.3 Failure of One Server and One Multihost Disk Expansion Unit

If power is lost to one of the Sun Cluster nodes and one multihost disk expansion unit, a secondary node immediately initiates a takeover.

After the power is restored, you must reboot the node, rejoin the node to the configuration by using the scadmin startnode command, and then begin monitoring activity. If manual switchover is configured, use the haswitch(1M) command to manually return ownership of the diskset to the node that had lost power. Refer to "4.3 Switching Over Logical Hosts", for more information.

After the diskset ownership has been returned to the default master, any multihost disks that reported errors must be returned to service. Use the instructions provided in the chapters on your disk expansion unit to return the multihost disks to service.


Note -

The node might reboot before the multihost disk expansion unit. Therefore, the associated disks will not be accessible. Reboot the node after the multihost disk expansion unit comes up.


5.3 Powering On the System

The procedure for applying power to system cabinets, nodes, and boot disks varies depending on the type of cabinet used and the manner in which the nodes receive AC power.

For disk arrays that do not receive AC power from an independent power source, AC power is applied when the system cabinet is powered on.

For specific power-on procedures for Sun StorEdge MultiPacks, refer to the Sun StorEdge MultiPack Service Manual.

The Terminal Concentrator that receives AC power from the system cabinet is turned on when power is applied to the cabinet. Otherwise, the Terminal Concentrator must be powered on independently.

Chapter 6 Administering Network Interfaces

This chapter provides a description of the Public Network Management (PNM) feature of Sun Cluster, and instructions for adding or replacing network interface components. The topics in this chapter are listed below.

This chapter includes the following procedures:

6.1 Public Network Management Overview

The PNM feature of Sun Cluster uses fault monitoring and failover to prevent loss of node availability due to single network adapter or cable failure. PNM fault monitoring runs in local-node or cluster-wide mode to check the status of nodes, network adapters, cables, and network traffic. PNM failover uses sets of network adapters called backup groups to provide redundant connections between a cluster node and the public network. The fault monitoring and failover capabilities work together to ensure availability of services.

If your configuration includes HA data services, you must enable PNM; HA data services are dependent on PNM fault monitoring. When an HA data service experiences availability problems, it queries PNM through the cluster framework to see whether the problem is related to the public network connections. If it is, the data services wait until PNM has resolved the problem. If the problem is not with the public network, the data services invoke their own failover mechanism.

The PNM package, SUNWpnm, is installed during initial Sun Cluster software installation. The commands associated with PNM include:

See the associated man pages for details.

6.1.1 PNM Fault Monitoring and Failover

PNM monitors the state of the public network and the network adapters associated with each node in the cluster, and reports questionable or faulty states. When PNM detects a lack of response from the primary adapter (the adapter currently carrying network traffic to and from the node), it fails over the network service to another working adapter in that node's backup group. PNM then performs additional checks to determine whether the fault is with the adapter or with the network.

If the adapter is faulty, PNM sends error messages to syslog(3), which are in turn detected by the Cluster Manager and displayed to the user through a GUI. After a failed adapter is fixed, it is automatically tested and reinstated in the backup group at the next cluster reconfiguration. If the entire adapter backup group is down, then the Sun Cluster framework invokes a failover of the node to retain availability. If an error occurs outside of PNM's control, such as the failure of a whole subnet, then a normal failover and cluster reconfiguration will occur.

PNM monitoring runs in two modes, cluster-aware and cluster-unaware. PNM runs in cluster-aware mode when the cluster is operational. It uses the Cluster Configuration Database (CCD) to monitor status of the network. For more information on the CCD, see the overview chapter in the Sun Cluster 2.2 Software Installation Guide. PNM uses the CCD to distinguish between public network failure and local adapter failure. See "C.3 Sun Cluster Fault Probes" for more information on logical host failover initiated by public network failure.

PNM runs in cluster-unaware mode when the cluster is not operational. In this mode, PNM is unable to use the CCD and therefore cannot distinguish between adapter and network failure. In cluster-unaware mode, PNM simply detects a problem with the local network connection.

You can check the status of the public network and adapters with the PNM monitoring command, pnmstat(1M). See the man page for details.
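
For example, the following invocation lists the status of the backup groups on the local node; the -l option is an assumption here, so verify it against the pnmstat(1M) man page.

# pnmstat -l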

6.1.2 Backup Groups

Backup groups are sets of network adapters that provide redundant connections between a single cluster node and the public network. You configure backup groups during initial installation by using the scinstall(1M) command, or after initial installation by using the pnmset(1M) command. PNM allows you to configure as many redundant adapters as you want on a single host.

To configure backup groups initially, you run pnmset(1M) as root before the cluster is started. The command runs as an interactive script to configure and verify backup groups. It also selects one adapter to be used as the primary, or active, adapter. The pnmset(1M) command names backup groups nafon, where n is an integer you assign. The command stores backup group information in the /etc/pnmconfig file.

To change an existing PNM configuration on a cluster node, you must remove the node from the cluster and then run the pnmset(1M) command. PNM monitors and incorporates changes in backup group membership dynamically.


Note -

The /etc/pnmconfig file is not removed even if the SUNWpnm package is removed, for example, during a software upgrade. That is, the backup group membership information is preserved during software upgrades and you are not required to run the pnmset(1M) utility again, unless you want to modify backup group membership.


6.1.3 Updates to nsswitch.conf

When configuring PNM with a backup network adapter, the /etc/nsswitch.conf file should contain one of the following netmasks entries.

Table 6-1 Name Service Entry Choices for the /etc/nsswitch.conf File

Name Service Used   netmasks Entry
None                netmasks: files
nis                 netmasks: files [NOTFOUND=return] nis
nisplus             netmasks: files [NOTFOUND=return] nisplus

These settings ensure that the netmasks setting is not looked up in an NIS/NIS+ table. This matters because, if the failed adapter is the primary public network connection, the name service would not be available to provide the requested information. If the netmasks entry is not set in the prescribed manner, failover to the backup adapter will not succeed.


Caution -

The preceding changes have the effect of using the local files (/etc/netmasks and /etc/groups) as the lookup tables. The NIS/NIS+ services are used only when the local files are unavailable. Therefore, these files must be kept up-to-date with their NIS/NIS+ versions; if you do not keep them current, the cluster nodes cannot find the expected values.
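
As a hypothetical illustration, a local /etc/netmasks entry for a class C public subnet might look like the following; substitute the actual network number and netmask used at your site.

# /etc/netmasks entry (example values only)
129.146.75.0    255.255.255.0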


6.2 Setting Up and Administering Public Network Management

This section provides procedures for setting up PNM and configuring backup groups.

6.2.1 How to Set Up PNM

These are the detailed steps to set up PNM.

  1. Set up the node hardware so that you have multiple network adapters on a single node using the same subnet.

    Refer to your Sun Cluster node hardware documentation to set up your network adapters.

  2. If the Sun Cluster node software packages have not been installed already, install them by using the scinstall(1M) command.

    The scinstall(1M) command runs interactively to install the package set you select. The PNM package, SUNWpnm, is part of the node package set. See the Sun Cluster 2.2 Software Installation Guide for the detailed cluster installation procedure.

  3. Register the default network interface on each node, if you did not do so already.

    You must register one default network interface per node in the interface database associated with each node, and verify that the interface is plumbed and functioning correctly.

    1. Create an interface database on each node and register the primary public network interfaces.

      Create a file in the /etc directory on each node to use as the interface database. Name the file hostname.interface, where interface is the name of the node's primary public network adapter. Then add one line containing the interface host name for that node. For example, on node phys-hahost1 with a default interface qfe1, create a file /etc/hostname.qfe1 containing the following line.

      phys-hahost1-qfe1
    2. In the /etc/hosts file on each node, associate the primary public network interface name with an IP address.

      In this example, the primary physical host name is phys-hahost1:

      129.146.75.200 phys-hahost1-qfe1

      If your system uses a naming mechanism other than /etc/hosts, refer to the appropriate section in the TCP/IP and Data Communications Administration Guide to perform the equivalent function.
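
    The following commands sketch these two substeps on phys-hahost1 (a minimal example only; adjust the adapter name and IP address for your own configuration, and do not overwrite existing entries in /etc/hosts):

    phys-hahost1# echo "phys-hahost1-qfe1" > /etc/hostname.qfe1
    phys-hahost1# echo "129.146.75.200 phys-hahost1-qfe1" >> /etc/hosts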

  4. Establish PNM backup groups by using the pnmset(1M) command.

    Run the pnmset(1M) interactive script to set up backup groups.


    Caution -

    If you have configured logical hosts and data services already, you must stop the HA data services before changing the backup group membership with pnmset(1M). If you do not stop the data services before running the pnmset(1M) command, serious problems and data service failures can result.


    1. Run the pnmset(1M) command.

      phys-hahost1# /opt/SUNWpnm/bin/pnmset
      
    2. Enter the total number of backup groups you want to configure.

      Normally this number corresponds with the number of public subnets.

      In the following dialog, you will be prompted to configure public 
      network management.
      
       do you want to continue ... [y/n]: y
      
       How many NAFO backup groups on the host [1]: 2
      
    3. Assign backup group numbers.

      At the prompt, supply an integer between 0 and 255 (the maximum). The pnmset(1M) command appends this number to the string nafo to form the backup group name.

      Enter backup group number [0]: 0
      
    4. Assign adapters to backup groups.

      Please enter all network adapters under nafo0:
       qe0 qe1
      ...

      Continue by assigning backup group numbers and adapters for all other backup groups in the configuration.

    5. Allow the pnmset(1M) command to test your adapter configuration.

      The pnmset(1M) command tests the correctness of your adapter configuration. In this example, backup group nafo1 contains one active adapter (qe3) and two redundant adapters (qe2 and qe4).

      The following test will evaluate the correctness of the customer 
      NAFO configuration...
       name duplication test passed
      
      
       Check nafo0... < 20 seconds
       qe0 is active
       remote address = 192.168.142.1
       nafo0 test passed
      
      
       Check nafo1... < 20 seconds
       qe3 is active
       remote address = 192.168.143.1
       test qe4 wait...
       test qe2 wait...
       nafo1 test passed
       phys-hahost1#

      Once the configuration is verified, the PNM daemon pnmd(1M) automatically notes the configuration changes and starts monitoring the interfaces.


      Note -

      Only one adapter within a backup group should be plumbed and have an entry in the /etc/hostname.adapter file. Do not assign IP addresses to the backup adapters; they should not be plumbed.



      Note -

      PNM uses broadcast ping(1M) to monitor networks, which in turn uses broadcast ICMP (Internet Control Message Protocol) packets to communicate with other remote hosts. Some routers do not forward broadcast ICMP packets; consequently, PNM's fault detection behavior is affected. See the Sun Cluster 2.2 Release Notes for a workaround to this problem.
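
      You can check whether broadcast ICMP reaches the hosts on a public subnet by pinging that subnet's broadcast address (a rough check only; 192.168.142.255 is a placeholder for your own broadcast address, and some hosts may be configured not to answer broadcast pings). Press Control-C to stop the test.

      # ping -s 192.168.142.255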


  5. Start the cluster by using the scadmin(1M) command.

    Run the following command on one node:

    # scadmin startcluster physical-hostname sc-cluster
    

    Then add all other nodes to the cluster by running the following command from all other nodes:

    # scadmin startnode
    
  6. Verify the PNM configuration by using the pnmstat(1M) command.

    phys-hahost1# /opt/SUNWpnm/bin/pnmstat -l
    bkggrp  r_adp   status  fo_time live_adp
     nafo0   hme0    OK      NEVER   hme0
     phys-hahost1# 

    You have now completed the initial setup of PNM.

6.2.2 How to Reconfigure PNM

Use this procedure to reconfigure an existing PNM configuration by adding or removing network adapters. Follow these steps to administer one node at a time, so that Sun Cluster services remain available during the procedure.

  1. Stop the Sun Cluster software on the node to be reconfigured.

    phys-hahost1# scadmin stopnode
    
  2. Add or remove the network adapters.

    Use the procedures described in "6.4 Adding and Removing Network Interfaces".

  3. Run the pnmset(1M) command to reconfigure backup groups.

    Use the pnmset(1M) command to reconfigure backup groups as described in Step 4 of the procedure "6.2.1 How to Set Up PNM".

    phys-hahost1# pnmset
    
  4. Restart the Sun Cluster software on the node.

    Restart the node by running the following command from the administrative workstation:

    phys-hahost1# scadmin startnode
    
  5. Repeat Step 1 through Step 4 for each node you want to reconfigure.

6.2.3 How to Check the Status of Backup Groups

You can use the pnmptor(1M) and pnmrtop(1M) commands to check the status of local backup groups only, and the pnmstat(1M) command to check the status of local or remote backup groups.

    Run the pnmptor(1M) command to find the active adapter associated with a given backup group.

The pnmptor(1M) command maps a pseudo adapter name that you supply to a real adapter name. In this example, the system output shows that pseudo adapter name nafo0 is associated with the active adapter hme2:

phys-hahost1# pnmptor nafo0
hme2

    Run the pnmrtop(1M) command to find the backup group to which an adapter belongs.

In this example, the system output shows that adapter hme1 belongs to backup group nafo0:

phys-hahost1# pnmrtop hme1
nafo0

    Run the pnmstat(1M) command to determine the status of a backup group.

Use the -c option to determine the status of a backup group on the local host:

phys-hahost1# pnmstat -c nafo0
OK
 NEVER
 hme2

Use the following syntax to determine the status of a backup group on a remote host:

phys-hahost1# pnmstat -sh remotehost -c nafo1
 OK
 NEVER
 qe1

Note -

It is important to use the -s and -h options together. The -s option forces pnmstat(1M) to communicate over the private interconnect. If the -s option is omitted, pnmstat(1M) queries over the public network. Both remotehost and the host on which you run pnmstat(1M) must be cluster members.


Whether checking the local or remote host, the pnmstat(1M) command reports the status, history, and current active adapter. See the man page for more details.
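
As a small convenience sketch, you can check one backup group on the local node and on each remote cluster member from a single host (the node and group names below are the examples used above; add your other cluster members to the list as needed):

phys-hahost1# pnmstat -c nafo0
phys-hahost1# for host in phys-hahost2; do echo $host; pnmstat -sh $host -c nafo0; done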

6.2.4 PNM Configurable Parameters

The following table describes the PNM parameters that are user-configurable. Configure these parameters after you have installed PNM, but before you bring up the cluster, by manually editing the configuration file /opt/SUNWcluster/conf/TEMPLATE.cdb on all nodes in the cluster. You can edit the file on one node and copy the file to all other nodes, or use the Cluster Console to modify the file on all nodes simultaneously. You can display the current PNM configuration with pnmset -l.

Table 6-2 PNM Configurable Parameters

pnmd.inactive_time

The time, in seconds, between fault probes. The default interval is 5 seconds. 

pnmd.ping_timeout

The time, in seconds, after which a fault probe will time out. The default timeout value is 4 seconds. 

pnmd.repeat_test

The number of times that PNM will retry a failed probe before deciding there is a problem. The default repeat quantity is 3.  

pnmd.slow_network

The latency, in seconds, between the listening phase and actively probing phase of a fault probe. The default latency period is 2 seconds. If your network is slow, causing PNM to initiate spurious takeovers, consider increasing this latency period. 
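
As a sketch only, entries for these parameters might look like the following in the configuration file; the "key : value" form is an assumption based on the format of other .cdb entries shown in this guide, so check the existing contents of the file before editing, and keep the file identical on all nodes:

pnmd.inactive_time : 5
pnmd.ping_timeout : 4
pnmd.repeat_test : 3
pnmd.slow_network : 2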

6.3 Troubleshooting PNM Errors

The following errors are those most commonly returned by PNM.

PNM rpc svc failed

This error indicates that the PNM daemon has not been started. Restart the PNM daemon with the following command. The node-id is the value returned by the /opt/SUNWcluster/bin/get_node_status command.

# /opt/SUNWpnm/bin/pnmd -s -c cluster-name -l node-id
PNM not started

This message indicates that no backup groups have been configured. Use the pnmset(1M) command to create backup groups.

No nafoXX

This message indicates that you have specified an illegal backup group name. Use the pnmrtop(1M) command to determine the backup group names associated with a given adapter. Rerun the command and supply it with a valid backup group name.

PNM configure error

This message indicates that either the PNM daemon was unable to configure an adapter, or that there is a formatting error in the configuration file, /etc/pnmconfig. Check the syslog messages and take the actions specified by Sun Cluster Manager. For more information on Sun Cluster Manager, see Chapter 2, Sun Cluster Administration Tools.

Program error

This message indicates that the PNM daemon was unable to execute a system call. Check the syslog messages and take the actions specified by Sun Cluster Manager. For more information on Sun Cluster Manager, see Chapter 2, Sun Cluster Administration Tools.

6.4 Adding and Removing Network Interfaces

The procedures in this section can be used to add or remove public network interface cards within a cluster configuration.

To add or remove a network interface to or from the control of a logical host, you must modify each logical host configured to use that interface. You change a logical host's configuration by completely removing the logical host from the cluster, then adding it again with the required changes. You can reconfigure a logical host with either the scconf(1M) or scinstall(1M) command. The examples in this section use the scconf(1M) command. Refer to "3.5 Adding and Removing Logical Hosts", for the logical host configuration steps using the scinstall(1M) command.

6.4.1 Adding a Network Interface

Adding a network interface requires unconfiguring and reconfiguring all logical hosts associated with the interface. Note that all data services will be inaccessible for a short period of time during the procedure.

6.4.2 How to Add a Network Interface

On each node that will receive a new network interface card, perform the following steps.

  1. Stop the cluster software.

    phys-hahost# scadmin stopnode
    
  2. Add the new interface card, using the instructions included with the card.

  3. Configure the new network interface on each node.

    This step is necessary only if the new interface will be part of a logical host. Skip this step if your configuration does not include logical hosts.

    phys-hahost# pnmset
    

    For Ethernet, create a new /etc/hostname.if file for each new interface on each node, and run the ifconfig(1M) command as you normally would in a non-cluster environment.


    Note -

    When you configure a set of network interfaces to be used by different logical hosts within a cluster, you must connect all interfaces in the set to the same subnet.


  4. Start the cluster software.

    If all nodes have been stopped, run the scadmin startcluster command on node 0 and then the scadmin startnode command on all other nodes. If at least one node has not had the cluster software stopped, run the scadmin startnode command on the remaining nodes.

    phys-hahost# scadmin startnode
    

    If the new interfaces are being added to already existing backup groups, the procedure is complete.

    If you modified the backup group configuration, you must bring the cluster back into normal operation and reconfigure each logical host that will be using the new set of network controllers. You will unconfigure and reconfigure each logical host, so run the scconf -p command to print out the current configuration before starting these steps. You can run the scconf -p command on any node that is an active cluster member; it does not need to be run on all cluster nodes.

    To unconfigure and reconfigure the logical host, you can use either the scconf(1M) command as shown in these examples, or the scinstall(1M) command as described in "3.5 Adding and Removing Logical Hosts".
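
    For example, you can capture the current configuration listing to a file for reference while you reconfigure (a sketch; the clustername argument is placed as in the other scconf examples in this chapter):

    phys-hahost# scconf clustername -p > /var/tmp/scconf-p.before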

  5. Notify users that data services on the affected logical hosts will be unavailable for a short period.

  6. Save copies of the /etc/opt/SUNWcluster/conf/ccd.database files on each node, in case you need to restore the original configuration.
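
    For example, on each node (a minimal sketch; use whatever backup location your site prefers):

    phys-hahost# cp /etc/opt/SUNWcluster/conf/ccd.database \
     /etc/opt/SUNWcluster/conf/ccd.database.backup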

  7. Turn off the data services.

    phys-hahost# hareg -n dataservice
    
  8. Unregister the data services.

    phys-hahost# hareg -u dataservice
    
  9. Remove the logical host from the cluster.

    Run this command on any node that is an active cluster member. You do not need to run this command on all cluster nodes.

    phys-hahost# scconf clustername -L logicalhost -r
    
  10. Reconfigure the logical host to include the new interface.

    Run this command on any node that is an active cluster member. You do not need to run this command on all cluster nodes.

    phys-hahost# scconf clustername -L logicalhost -n nodelist -g dglist -i logaddrinfo
    

    The logaddrinfo field is where you define the new interface name. Refer to the listing taken from the scconf -p command output to reconstruct each logical host.

  11. Register the data services.

    phys-hahost# hareg [-s] -r dataservice
    
  12. Turn on the data services.

    phys-hahost# hareg -y dataservice
    
  13. Check access to the data services.

  14. Notify users that the data services are once again available.

    This completes the process of adding a network interface.

6.4.3 Removing a Network Interface

Use the following procedure to remove a public network interface from a cluster.

6.4.4 How to Remove a Network Interface

While all nodes are participating in the cluster, perform the following steps on one node only.

  1. Identify which logical hosts must be reconfigured to exclude the network interface.

    All of these logical hosts will need to be unconfigured then reconfigured. Run the scconf -p command to print out a list of logical hosts in the current configuration; save this list for later use. You do not need to run the scconf -p command on all cluster nodes. You can run it on any node that is an active cluster member.

  2. Run the pnmset(1M) command to display the current PNM configuration.

  3. Remove the controller from a backup group, if necessary.

    If the controller to be removed is part of a backup group, remove the controller from all logical hosts, then run the pnmset(1M) command to remove the controller from the backup group.

  4. Notify users that any data services on the affected logical hosts will be unavailable for a short period.

  5. Turn off the data services.

    phys-hahost# hareg -n dataservice
    
  6. Unregister the data services.

    phys-hahost# hareg -u dataservice
    
  7. Remove the logical host from the cluster.


    Note -

    To unconfigure and reconfigure the logical host (Step 7 and Step 8), you can either run the scconf(1M) command as shown, or run the scinstall(1M) command as described in "3.5 Adding and Removing Logical Hosts".


    You can run this command on any node that is an active cluster member. You do not need to run it on all cluster nodes.

    phys-hahost# scconf clustername -L logicalhost -r
    
  8. Reconfigure the logical host to reflect the new interface configuration, excluding the interface being removed.

    You can run this command on any node that is an active cluster member. You do not need to run it on all cluster nodes.

    phys-hahost# scconf clustername -L logicalhost -n nodelist -g dglist -i logaddrinfo
    

    The logaddrinfo field is where you define the new interface name. Refer to the listing taken from the scconf -p command output to reconstruct each logical host.

  9. If the controller being removed was part of a backup group, rerun the pnmset(1M) command.

    Rerun the pnmset(1M) command and exclude the controller being removed.

  10. (Optional) If you are removing the network adapter from the nodes, perform the following steps on each affected node:

    1. Stop the cluster software.

      phys-hahost# scadmin stopnode
      
    2. Halt the node and remove the interface card.

    3. Boot the node.

    4. Perform the Solaris system administration tasks you would normally perform to remove a network interface (remove the /etc/hostname.if file, update /etc/hosts, and so on).

    5. Restart the cluster software. If all nodes were brought down, start the first node using the scadmin startcluster command. If at least one node is still running the cluster software, restart the other nodes.

      phys-hahost# scadmin startnode
      
  11. Register the data services.

    phys-hahost# hareg -r dataservice
    
  12. Turn on the data services.

    phys-hahost# hareg -y dataservice
    
  13. Check access to the data services.

  14. Notify users that the data services are once again available.

6.5 Administering the Switch Management Agent

The Switch Management Agent (SMA) is a cluster module that maintains communication channels over the cluster private interconnect. It monitors the private interconnect and invokes a failover to a backup network if it detects a failure.

Note the following limitations before beginning the procedure:

6.5.1 How to Add Switches and SCI Cards

Use this procedure to add switches and SCI cards to cluster nodes. See the sm_config(1M) man page for details.

  1. Edit the sm_config template file to include the configuration changes.

    Normally, the template file is located in /opt/SUNWsma/bin/Examples.

  2. Configure the SCI SBus cards by running the sm_config(1M) command from one of the nodes.

    Rerun the command a second time to ensure that SCI node IDs and IP addresses are assigned correctly to the cluster nodes. Incorrect assignments can cause miscommunication between the nodes.

  3. Reboot the new nodes.

6.5.2 SCI Software Troubleshooting

If a problem occurs with the SCI software, verify that the following are true:

Also note the following problems and solutions:

6.5.3 How to Verify Connectivity Between Nodes

There are two ways to verify the connectivity between nodes: by running get_ci_status(1M) or by running ping(1).

    Run the get_ci_status(1M) command on all cluster nodes.

Example output for get_ci_status(1M) is shown below.

# /opt/SUNWsma/bin/get_ci_status
sma: sci #0: sbus_slot# 1; adapter_id 8 (0x08); ip_address 1; switch_id# 0; 
port_id# 0; Adapter Status - UP; Link Status - UP
sma: sci #1: sbus_slot# 2; adapter_id 12 (0x0c); ip_address 17; switch_id# 1; 
port_id# 0; Adapter Status - UP; Link Status - UP
sma: Switch_id# 0
sma: port_id# 1: host_name = interconn2; adapter_id = 72; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 136; active | operational
sma: port_id# 3: host_name = interconn4; adapter_id = 200; active | operational
sma: Switch_id# 1
sma: port_id# 1: host_name = interconn2; adapter_id = 76; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 140; active | operational
sma: port_id# 3: host_name = interconn4; adapter_id = 204; active | operational
# 

The first four lines indicate the status of the local node (in this case, interconn1). It is communicating with both switch_id# 0 and switch_id# 1 (Link Status - UP).

sma: sci #0: sbus_slot# 1; adapter_id 8 (0x08); ip_address 1; switch_id# 0; 
port_id# 0; Adapter Status - UP; Link Status - UP
sma: sci #1: sbus_slot# 2; adapter_id 12 (0x0c); ip_address 17; switch_id# 1; 
port_id# 0; Adapter Status - UP; Link Status - UP

The rest of the output indicates the global status of the other nodes in the cluster. All the ports on the two switches are communicating with their nodes. If there is a problem with the hardware, inactive is displayed (instead of active). If there is a problem with the software, inoperational is displayed (instead of operational).

sma: Switch_id# 0
sma: port_id# 1: host_name = interconn2; adapter_id = 72; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 136; active | operational
sma: port_id# 3: host_name = interconn4; adapter_id = 200; active | operational
sma: Switch_id# 1
sma: port_id# 1: host_name = interconn2; adapter_id = 76; active | operational
sma: port_id# 2: host_name = interconn3; adapter_id = 140; active | operational
sma: port_id# 3: host_name = interconn4; adapter_id = 204; active | operational
#

    Run the ping(1) command on all the IP addresses of remote nodes.

Example output for ping(1) is shown below.

# ping IP-address

The IP addresses are found in the /etc/sma.ip file. Be sure to run the ping(1) command for each node in the cluster.

The ping(1) command returns an "alive" message indicating that the two ends are communicating without a problem. Otherwise, an error message is displayed.

For example,

# ping 204.152.65.2
204.152.65.2 is alive
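
If the /etc/sma.ip file lists one address per line with the address in the first field (an assumption; check the file's actual layout on your system first), a quick way to ping every entry is:

# while read addr rest; do ping $addr; done < /etc/sma.ip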

6.5.4 How to Verify the SCI Interface Configuration

    Run the ifconfig -a command to verify that all SCI interfaces are up and that the cluster nodes have the correct IP addresses.

The last 8 bits of the IP address should match the IP field value in the /etc/sma.config file.

# ifconfig -a
lo0: flags=849<UP,LOOPBACK,RUNNING,MULTICAST> mtu 8232
 	inet 127.0.0.1 netmask ff000000
hme0: flags=863<UP,BROADCAST,NOTRAILERS,RUNNING,MULTICAST> mtu 1500
 	inet 129.146.238.55 netmask ffffff00 broadcast 129.146.238.255
 	ether 8:0:20:7b:fa:0
scid0: flags=80c1<UP,RUNNING,NOARP,PRIVATE> mtu 16321
 	inet 204.152.65.1 netmask fffffff0
scid1: flags=80c1<UP,RUNNING,NOARP,PRIVATE> mtu 16321
 	inet 204.152.65.17 netmask fffffff0

Chapter 7 Administering Server Components

This chapter describes the software procedures for adding or removing Sun Cluster node components.

7.1 Replacing System Boards

The Solstice DiskSuite component of Sun Cluster is sensitive to device numbering and can become confused if system boards are moved around. Refer to Chapter 1, Preparing for Sun Cluster Administration, for more information on instance names and numbering.

When the node is booted initially, the multihost disk expansion unit entries in the /dev directory are tied to the connection slot.

For example, when the node is booted, system board 0 and SBus slot 1 will be part of the identity of the multihost disk expansion unit. If the board or SBus card is moved to a new location, Solstice DiskSuite will be confused, because Solaris assigns new controller numbers to SBus controllers that appear in a new location.


Note -

The SBus cards can be moved as long as the type of SBus card in a slot remains the same.


Shuffling the fiber cables that lead to the multihost disk expansion units also can create problems. When SBus cards are switched, you must also reconnect the multihost disk expansion units back to the same SBus slot they were connected to before the changes.
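
Before moving any boards or cards, it can help to record the current mapping between logical controller names and physical device paths, for example (a sketch; any location for the saved output will do):

# ls -l /dev/rdsk/c*s2 > /var/tmp/devlinks.before
# cp /etc/path_to_inst /var/tmp/path_to_inst.before

The /dev entries are symbolic links into the /devices tree, so the saved listing records which system board and SBus slot each controller number was bound to at the time.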

7.2 Adding Board-Level Modules

Adding or replacing board-level modules such as SIMMs and CPUs involves both software and hardware procedures.

7.2.1 How to Add Board-Level Modules

  1. Stop Sun Cluster on the node that is to receive the board-level module.

    In this example, phys-hahost2 will be the first to receive the board-level module.

    phys-hahost2# scadmin stopnode
    
  2. Halt the node.

    phys-hahost2# halt
    
  3. Power off the node.

  4. Install the board-level module using the instructions in the appropriate hardware manual.

  5. Power on the node.

  6. Perform a reconfiguration reboot.

    ok boot -r
    
  7. Start the cluster software on the node.

    # scadmin startnode
    
  8. Repeat Step 1 through Step 7 on the other Sun Cluster nodes that need a similar hardware upgrade.

  9. Switch each logical host back to its default master, if necessary.

    If manual mode is not set, an automatic switchback will occur.

    phys-hahost2# haswitch phys-hahost1 hahost1
    

7.3 Replacing SBus Cards

You can replace an SBus card in a Sun Cluster node by switching the data services over to the functioning node and then performing the hardware procedure to replace the board. After the replacement, the logical hosts can be switched back to their default masters.

7.3.1 How to Replace SBus Cards

  1. Switch ownership of the logical hosts from the Sun Cluster node that needs an SBus card replaced.

    For instance, if the board is being replaced on the physical host phys-hahost2, enter the following:

    phys-hahost1# haswitch phys-hahost1 hahost1 hahost2
    
  2. Stop Sun Cluster on the affected node.

    Run the scadmin(1M) command with the stopnode option on the host that has the failed SBus card.

    phys-hahost2# scadmin stopnode 
    
  3. Halt and power off the affected node.

  4. Perform the hardware replacement procedure.

    Refer to the instructions in the appropriate hardware service manual to replace the SBus card.

  5. Power on the node and start the cluster software on the node.

    # scadmin startnode
    

    The node automatically will rejoin the Sun Cluster configuration.

  6. Switch the logical hosts back to the default masters, if necessary.

    If manual mode is not set, an automatic switchback will occur.

    phys-hahost1# haswitch phys-hahost2 hahost2
    

Chapter 8 Administering the Terminal Concentrator

This chapter provides instructions for using the Terminal Concentrator when performing administration of Sun Cluster configurations.

8.1 Connecting to the Sun Cluster Console

You can perform administrative tasks from a window connected to any Sun Cluster node. The procedures for initial setup of a Terminal Concentrator and how to set up security are in the hardware planning and installation manual for your Sun Cluster node and the Terminal Concentrator documentation.

The following procedure describes how to create connections from the administrative workstation in a Sun Cluster configuration.

Because a shelltool(1) window can vary in size and the connection is made through a serial-port console interface, the console port cannot determine the window size of the shelltool(1) from which the connection was made. You must set the window size manually on the nodes for any applications that require information about row and column quantities.

8.1.1 How to Connect to the Sun Cluster Server Console

  1. Open a shelltool(1) window on the desktop of a workstation.

  2. Run the tput(1) command and note the size of the shelltool(1) window.

    These numbers will be used in Step 6.

    # tput lines
    35
    # tput cols
    80
  3. Enter the following command to open a telnet(1) connection to one of the Sun Cluster nodes, through the Terminal Concentrator.

    # telnet terminal-concentrator-name 5002
     Trying 192.9.200.1 ...
     Connected to 192.9.200.1.
     Escape character is '^]'.

    Note -

    Port numbers are configuration dependent. Typically, ports 2 and 3 (5002 and 5003 in the examples) are used for the first Solaris cluster at a site.


  4. Open another shelltool(1) window and enter the following command to open a telnet(1) connection to the other node.

    # telnet terminal-concentrator-name 5003
     Trying 192.9.200.1 ...
     Connected to 192.9.200.1.
     Escape character is '^]'.

    Note -

    If you set up security as described in the hardware planning and installation guide for your Sun Cluster node, you will be prompted for the port password. After establishing the connection, you will be prompted for the login name and password.


  5. Log in to the node.

    Console login: root
     Password: root-password
    
  6. Use the stty(1) command to reset the terminal rows and cols values to those found in Step 2.

    # stty rows 35
    # stty cols 80
    
  7. Set the TERM environment variable to the appropriate value based on the type of window used in Step 1.

    For example, if you are using an xterm window, type:

    # TERM=xterm; export TERM (sh or ksh)
     or
     # setenv TERM xterm (csh)

8.2 Resetting a Terminal Concentrator Connection

This section provides instructions for resetting a Terminal Concentrator connection.

If another user has a connection to the Sun Cluster node console port on the Terminal Concentrator, you can reset the port to disconnect that user. This procedure will be useful if you need to immediately perform an administrative task.

If you cannot connect to the Terminal Concentrator, the following message appears:

# telnet terminal-concentrator-name 5002
 Trying 192.9.200.1 ...
 telnet: Unable to connect to remote host: Connection refused
 #

If you use the port selector, you might see a port busy message.

8.2.1 How to Reset a Terminal Concentrator Connection

  1. Press an extra return after making the connection and select the command line interface (cli) to connect to the Terminal Concentrator.

    The annex: prompt appears.

    # telnet terminal-concentrator-name
     ...
     Enter Annex port name or number: cli
     ...
     annex:
  2. Enter the su command and password.

    By default, the password is the IP address of the Terminal Concentrator.

    annex: su
     Password:
  3. Determine which port you want to reset.

    The port in this example is Port 2. Use the Terminal Concentrator's built-in who command to show connections.

    annex# who
     Port  What  User  Location  When  Idle  Address
     2     PSVR  ---   ---       ---   1:27  192.9.75.12
     v1    CLI   ---   ---       ---         192.9.76.10
  4. Reset the port.

    Use the Terminal Concentrator's built-in reset command to reset the port. This example breaks the connection on Port 2.

    annex# admin reset 2
    
  5. Disconnect from the Terminal Concentrator.

    annex# hangup
    
  6. Reconnect to the port.

    # telnet terminal-concentrator-name 5002
    

8.3 Entering the OpenBoot PROM on a Sun Cluster Server

This section contains information for entering the OpenBoot PROM from the Terminal Concentrator.

8.3.1 How to Enter the OpenBoot PROM

  1. Connect to the port.

    # telnet terminal-concentrator-name 5002
     Trying 192.9.200.1 ...
     Connected to 192.9.200.1.
     Escape character is '^]'.
  2. Stop the cluster software, if necessary, by using the scadmin stopnode command, and then halt the system.

    Halt the system gracefully by using the halt(1M) command.

    # halt
    

    If halting the system with the halt(1M) command is not possible, then enter the telnet(1) command mode. The default telnet(1) escape character is Control-].

  3. Send a break to the node.

    telnet> send brk
    
  4. Execute the OpenBoot PROM commands.

8.4 Troubleshooting the Terminal Concentrator

This section describes troubleshooting techniques associated with the Terminal Concentrator.

8.4.1 Port Configuration Access Errors

A connect: Connection refused message while trying to access a particular Terminal Concentrator port using telnet(1) can have two possible causes:

8.4.2 How to Correct a Port Configuration Access Error

  1. Telnet to the Terminal Concentrator without specifying the port, and then interactively specify the port.

    # telnet terminal-concentrator-name
     Trying ip_address ..
     Connected to 192.9.200.1
     Escape character is '^]'.
     [you may have to enter a RETURN to see the following prompts]
    
     Rotaries Defined:
           cli                              -
    
     Enter Annex port name or number: 2
    

    If you see the following message, the port is in use.

    Port(s) busy, do you wish to wait? (y/n) [y]:

    If you see the following message, the port is misconfigured.

    Port 2
     Error: Permission denied.

    If the port is in use, reset the Terminal Concentrator connections using the instructions in "8.2 Resetting a Terminal Concentrator Connection".

    If the port is misconfigured, do the following:

    1. Select the command-line interpreter (cli) and become the Terminal Concentrator superuser.

      Enter Annex port name or number: cli
      
       Annex Command Line Interpreter   *   Copyright 1991 Xylogics, Inc.
       
       annex: su
      Password:
    2. As the Terminal Concentrator superuser, reset the port mode.

      annex# admin
       Annex administration MICRO-XL-UX R7.0.1, 8 ports
       admin: port 2
      admin: set port mode slave
      	You may need to reset the appropriate port, Annex subsystem or
       	reboot the Annex for changes to take effect.
       admin: reset 2
       admin:

      The port is now configured correctly.

      For more information about the Terminal Concentrator administrative commands, see the Sun Terminal Concentrator General Reference Guide.

8.4.3 Random Interruptions to Terminal Concentrator Connections

Terminal concentrator connections made through a router can experience intermittent interruptions. These connections might come alive for random periods, then go dead again. When the connection is dead, any new Terminal Concentrator connection attempts will time out. The Terminal Concentrator will show no signs of rebooting. Subsequently, a needed route might be re-established, only to disappear again. The problem is due to Terminal Concentrator routing table overflow and loss of network connection.

This is not a problem for connections made from a host that resides on the same network as the Terminal Concentrator.

The solution is to establish a default route within the Terminal Concentrator and disable the routed feature. You must disable the routed feature to prevent the default route from being lost. The following procedure shows how to do this. See the Terminal Concentrator documentation for additional information.

The config.annex file is created in the Terminal Concentrator's EEPROM file system and defines the default route to be used. You also can use the config.annex file to define rotaries that allow a symbolic name to be used instead of a port number. Disable the routed feature using the Terminal Concentrator's set command.

8.4.4 How to Establish a Default Route

  1. Open a shelltool(1) connection to the Terminal Concentrator.

    # telnet terminal-concentrator-name
    Trying 192.9.200.2 ...
     Connected to xx-tc.
     Escape character is '^]'.
    
    
     Rotaries Defined:
         cli                              -
    
     Enter Annex port name or number: cli
    
    
     Annex Command Line Interpreter   *   Copyright 1991 Xylogics, Inc.
  2. Enter the su command and administrative password.

    By default, the password is the IP address of the Terminal Concentrator.

    annex: su
     Password: administrative-password
    
  3. Edit the config.annex file.

    annex# edit config.annex
    
  4. Enter the information shown in the following example, substituting the IP address of your default router:

    Ctrl-W:save and exit Ctrl-X: exit Ctrl-F:page down Ctrl-B:page up
     %gateway
     net default gateway 192.9.200.2 metric 1 active ^W
    
  5. Disable the local routed.

    annex# admin set annex routed n
       You may need to reset the appropriate port, Annex subsystem or
        reboot the Annex for changes to take effect.
     annex#
  6. Reboot the Terminal Concentrator.

    annex# boot
    

    It takes a few minutes for the Terminal Concentrator to boot. During this time, the Sun Cluster node consoles are inaccessible.

8.5 Changing TC/SSP Information

In Sun Cluster 2.2, information about the Terminal Concentrator (TC) or the System Service Processor (SSP; Sun Enterprise 10000 only) is required during installation. The information is stored in the cluster configuration file.

This information is used to perform failure fencing (forcibly bringing down a faulty node by sending a break to its console through the TC or SSP) and to provide the cluster-wide locking mechanism on a reserved TC port.

Both these mechanisms serve to protect data integrity in the case of four-node clusters with directly attached storage devices.


Note -

If you are using Solstice DiskSuite, tcmon and quorum will be disabled, and TC information is not required.


The scconf(1M) command enables you to change this information in the cluster configuration file if, for example, modifications are made to this part of the cluster hardware configuration.

For additional information on changing TC or SSP information, see Table 8-1 and the scconf(1M) man page.


Note -

These commands must be run on all cluster nodes.


Table 8-1 Task Map

To Accomplish This...                                    Run This Command
Replace the IP address/name of a TC                      scconf -t -i new-ip-address old-IP-address|TC-name
Supply a new password                                    scconf -t -P old-IP-address|TC-name
Change the port number used for the cluster-wide         scconf -t -l new-port old-IP-address|TC-name
locking mechanism (TC only)

8.5.1 How to Change Host Information

Run the scconf -H command to change the information associated with a particular host. For example, to change a given host's architecture type and to specify a new IP address for its SSP (or TC), run the following command on all cluster nodes. The -d option specifies the new architecture (Sun Enterprise 10000) associated with the host, and the -t option specifies a new IP address or host name (foo-ssp) for the SSP (or TC) connected to the host:

# scconf clustername -H foo -d E10000 -t foo-ssp

8.5.2 How to Specify a Port Number for an SSP or TC

Run the scconf -p command on all cluster nodes to specify a port number for this host's console for this SSP (or TC).

# scconf clustername -H hostname -p port-number

For example:

# scconf clustername -H foo -p 10

Note -

Multiple hosts may be connected to the same TC, and the -H option only affects the information associated with a particular host.


8.5.3 How to Change the Configuration of a TC

Run the scconf -t command on all cluster nodes to change the configuration of a particular TC in the system. For example, to change a TC IP address, use the following command. The -i option specifies a new IP address (129.34.123.52) for the specified Terminal Concentrator (or SSP), and the -l option specifies a new port (8) used for locking purposes in failure fencing:

# scconf clustername -t foo-tc -i 129.34.123.52 -l 8

If a Terminal Concentrator is being used, an unused TC port from 2 to n is specified, where n is the number of ports in the TC. If an SSP is being used, a value of -1 must be specified.
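
For example, for a Sun Enterprise 10000 SSP, the equivalent command would use -1 as the port value (a sketch using the same hypothetical names as above):

# scconf clustername -t foo-ssp -i 129.34.123.52 -l -1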

8.5.4 How to Specify a New Password for an SSP or TC

Run the scconf -P command on all cluster nodes to specify a new password for this SSP (or TC).

# scconf clustername -t foo-ssp -P
 foo-ssp(129.34.123.51) Password:*****

Note -

If you change the user password on the SSP or TC, you also need to notify Sun Cluster software of the change by running this procedure from each cluster node. Otherwise, failure fencing might not work properly when a faulty node needs to be brought down forcibly by a "send break" from the SSP or TC.


Chapter 9 Using Dual-String Mediators

This chapter describes the Solstice DiskSuite feature that allows Sun Cluster to run highly available data services using only two disk strings. Refer to the Solstice DiskSuite documentation for more information on the Solstice DiskSuite features and concepts.

9.1 Overview of Mediators

Sun Cluster requires that a dual-string configuration, that is, a configuration with only two disk strings, survive the failure of a single node or of a single string of drives without user intervention.

In a dual-string configuration, metadevice state database replicas are always placed such that exactly half of the replicas are on one string and half are on a second string. A quorum (half + 1 or more) of the replicas is required to guarantee that the most current data is being presented. In the dual-string configuration, if one string becomes unavailable, a quorum of the replicas will not be available.

A mediator is a host (node) that stores mediator data. Mediator data provides information on the location of other mediators and contains a commit count that is identical to the commit count stored in the database replicas. This commit count is used to confirm that the mediator data is in sync with the data in the database replicas. Mediator data is individually verified before use.

Solstice DiskSuite requires a replica quorum (half + 1) to determine when "safe" operating conditions exist. This guarantees data correctness. With a dual-string configuration, it is possible that only one string is accessible. In this situation it is impossible to get a replica quorum. If mediators are used and a mediator quorum is present, the mediator data can help you determine whether the data on the accessible string is up-to-date and safe to use.

The introduction of mediators enables the Sun Cluster software to ensure that the most current data is presented in the case of a single string failure in a dual-string configuration.

9.1.1 Golden Mediators

To avoid unnecessary user intervention in some dual-string failure scenarios, the concept of a golden mediator has been implemented. If exactly half of the database replicas are accessible and an event occurs that causes the mediator hosts to be updated, two mediator updates are attempted. The first update attempts to change the commit count and to set the mediator to not golden. The second update occurs if and only if, during the first phase, all mediator hosts were successfully contacted and the number of replicas that were accessible (and that had their commit count advanced) was exactly half of the total number of replicas. If all of these conditions are met, the second update sets the mediator status to golden. The golden status enables a takeover to proceed, without user intervention, to the host with the golden status. If the status is not golden, the data is set to read-only, and user intervention is required for a takeover or failover to succeed. For the user to initiate a takeover or failover, exactly half of the replicas must be accessible.

The golden state is stored in volatile memory (RAM) only. Once a takeover occurs, the mediator data is once again updated. If any mediator hosts cannot be updated, the golden state is revoked. Since the state is in RAM only, a reboot of a mediator host causes the golden state to be revoked. The default state for mediators is not golden.

9.2 Configuring Mediators

Figure 9-1 shows a Sun Cluster system configured with two strings and mediators on two Sun Cluster nodes.

Regardless of the number of nodes, there are still only two mediator hosts in the cluster. The mediator hosts are the same for all disksets using mediators in a given cluster, even when a mediator host is not a member of the server set capable of mastering the diskset.

To simplify the presentation, the configurations shown here use only one diskset and a symmetric configuration. The number of disksets is not significant in these sample scenarios. In the stable state, the diskset is mastered by phys-hahost1.

Figure 9-1 Sun Cluster System in Steady State With Mediators

Graphic

Normally, if half + 1 of the database replicas are accessible, then mediators are not used. When exactly half of the replicas are accessible, the mediator's commit count can be used to determine whether the accessible half is the most up to date. To guarantee that the correct mediator commit count is being used, both of the mediators must be accessible, or the mediator must be golden. Half + 1 of the mediators constitutes a mediator quorum. The mediator quorum is independent of the replica quorum.

9.3 Failures Addressed by Mediators

With mediators, it is possible to recover from single failures, as well as some double failures. Since Sun Cluster only guarantees automatic recovery from single failures, only the single-failure recovery situation is covered here in detail. The double failure scenarios are included, but only general recovery processes are described.

Figure 9-1 shows a dual-string configuration in the stable state. Note that mediators are established on both Sun Cluster nodes, so both nodes must be up for a mediator quorum to exist and for mediators to be used. If one Sun Cluster node fails, a replica quorum will exist. If a takeover of the diskset is necessary, the takeover will occur without the use of mediators.

The following sections show various failure scenarios and describe how mediators help recover from these failures.

9.3.1 Single Server Failure

Figure 9-2 shows the situation where one Sun Cluster node fails. In this case, the mediator software is not used since there is a replica quorum available. Sun Cluster node phys-hahost2 will take over the diskset previously mastered by phys-hahost1.

The process for recovery in this scenario is identical to the process followed when one Sun Cluster node fails and there are more than two disk strings. No administrator action is required except perhaps to switch over the diskset after phys-hahost1 rejoins the cluster. See the haswitch(1M) man page for more information on the switchover procedure.

Figure 9-2 Single Sun Cluster Server Failure With Mediators

Graphic

9.3.2 Single String Failure

Figure 9-3 illustrates the case where, starting from the steady state shown in Figure 9-1, a single string fails. When String 1 fails, the mediator hosts on both phys-hahost1 and phys-hahost2 will be updated to reflect the event, and the system will continue to run as follows:

The commit count is incremented and the mediators remain golden.

Figure 9-3 Single String Failure With Mediators

Graphic

The administration required in this scenario is the same that is required when a single string fails in the three or more string configuration. Refer to the relevant chapter on administration of your disk expansion unit for details on these procedures.

9.3.3 Host and String Failure

Figure 9-4 shows a double failure where both String 1 and phys-hahost2 fail. If the failure sequence is such that the string fails first, and later the host fails, the mediator on phys-hahost1 could be golden. In this case, we have the following conditions:

Figure 9-4 Multiple Failure - One Server and One String

Graphic

This type of failure is recovered automatically by Sun Cluster. If phys-hahost2 mastered the diskset, phys-hahost1 will take over mastery of the diskset. Otherwise, mastery of the diskset will be retained by phys-hahost1. After String 1 is fixed, the data on String 1 must be resynchronized with the data on String 2. For more information on the resynchronization process, refer to the Solstice DiskSuite User's Guide and the metareplace(1M) man page.


Caution -

Although you can recover from this scenario, you must be sure to restore the failed components immediately since a third failure will cause the cluster to be unavailable.


If the mediator on phys-hahost1 is not golden, this case is not automatically recovered by Sun Cluster and requires administrative intervention. In this case, Sun Cluster generates an error message and the logical host is put into maintenance mode (read-only). If this or any other multiple failure occurs, contact your service provider to assist you.

9.4 Administering Mediators

Administer mediator hosts with the medstat(1M) and metaset(1M) commands. Use these commands to add or delete mediator hosts, and to check and fix mediator data. See the medstat(1M), metaset(1M), and mediator(7) man pages for details.

9.4.1 How to Add Mediator Hosts

Use this procedure after you have installed and configured Solstice DiskSuite.

  1. Start the cluster software on all nodes.

    On the first node:

    # scadmin startcluster
    

    On all remaining nodes:

    # scadmin startnode
    
  2. Determine the name of the private link for each node.

    Use grep(1) to identify the private link included in the clustername.cdb file.

    host1# grep "^cluster.node.0.hostname" \
     /etc/opt/SUNWcluster/conf/clustername.cdb
    cluster.node.0.hostname : host0
     host1# grep "cluster.node.0.hahost" \
     /etc/opt/SUNWcluster/conf/clustername.cdb | grep 204
     204.152.65.33
    
     host1# grep "^cluster.node.1.hostname" \
     /etc/opt/SUNWcluster/conf/clustername.cdb
     cluster.node.1.hostname : host1
     host1# grep "cluster.node.1.hahost" \
     /etc/opt/SUNWcluster/conf/clustername.cdb | grep 204
    204.152.65.34

    In this example, 204.152.65.33 is the private link for host0 and 204.152.65.34 is the private link for host1.

  3. Configure mediators using the metaset(1M) command.

    Add each host with connectivity to the diskset as a mediator for that diskset. Run each command on the host currently mastering the diskset. You can use the hastat(1M) command to determine the current master of the diskset. The information returned by hastat(1M) for the logical host identifies the diskset master.

    host1# metaset -s disksetA -a -m host0,204.152.65.33
    host1# metaset -s disksetA -a -m host1,204.152.65.34
    host1# metaset -s disksetB -a -m host0,204.152.65.33
    host1# metaset -s disksetB -a -m host1,204.152.65.34
    host1# metaset -s disksetC -a -m host0,204.152.65.33
    host1# metaset -s disksetC -a -m host1,204.152.65.34
    

    The metaset(1M) command treats the private link as an alias.

9.4.2 How to Check the Status of Mediator Data

Run the medstat(1M) command.

phys-hahost1# medstat -s diskset

See the medstat(1M) man page to interpret the output. If the output indicates that the mediator data for any one of the mediator hosts for a given diskset is bad, refer to the following procedure to fix the problem.

9.4.3 How to Fix Bad Mediator Data


Note -

The medstat(1M) command checks the status of mediators. Use this procedure if medstat(1M) reports that a mediator host is bad.


  1. Remove the bad mediator host(s) from all affected diskset(s).

    Log into the Sun Cluster node that owns the affected diskset and enter:

    phys-hahost1# metaset -s diskset -d -m bad_mediator_host
    
  2. Restore the mediator host and its aliases:

    phys-hahost1# metaset -s diskset -a -m bad_mediator_host,physical_host_alias,...
    

    Note -

    The private links must be assigned as mediator host aliases. Specify the physical host IP address first, and then the HA private link on the metaset(1M) command line. See the mediator(7) man page for details on this use of the metaset(1M) command.


9.4.4 Handling Failures Without Automatic Recovery

Certain double-failure scenarios exist that do not allow for automatic recovery by Sun Cluster. They include the following:

It is very important to monitor the state of the disksets, replicas, and mediators regularly. The medstat(1M) command is useful for this purpose. Bad mediator data, replicas, and disks should always be repaired immediately to avoid the risk of potentially damaging multiple failure scenarios.
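
As a simple sketch, you can check every diskset in one pass; disksetA, disksetB, and disksetC below are the example diskset names used earlier in this chapter, so substitute your own (the same loop can also be run from cron, using the full path to medstat on your system):

phys-hahost1# for ds in disksetA disksetB disksetC; do echo $ds; medstat -s $ds; done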

When a failure of this type does occur, one of the following sets of error messages will be logged:

ERROR: metaset -s <diskset> -f -t exited with code 66
ERROR: Stale database for diskset <diskset>
NOTICE: Diskset <diskset> released

ERROR: metaset -s <diskset> -f -t exited with code 2
ERROR: Tagged data encountered for diskset <diskset>
NOTICE: Diskset <diskset> released

ERROR: metaset -s <diskset> -f -t exited with code 3
ERROR: Only 50% replicas and 50% mediator hosts available for 
diskset <diskset>
NOTICE: Diskset <diskset> released

Eventually, the following set of messages also will be issued:

ERROR: Could not take ownership of logical host(s) <lhost>, so 
switching into maintenance mode
ERROR: Once in maintenance mode, a logical host stays in 
maintenance mode until the admin intervenes manually
ERROR: The admin must investigate/repair the problem and if 
appropriate use haswitch command to move the logical host(s) out of 
maintenance mode

Note that for a dual failure of this nature, high availability goals are sacrificed in favor of attempting to preserve data integrity. Your data might be unavailable for some time. In addition, it is not possible to guarantee complete data recovery or integrity.

Your service provider should be contacted immediately. Only an authorized service representative should attempt manual recovery from this type of dual failure. A carefully planned and well coordinated effort is essential to data recovery. Do nothing until your service representative arrives at the site.

Your service provider will inspect the log messages, evaluate the problem, and, possibly, repair any damaged hardware. Your service provider might then be able to regain access to the data by using some of the special metaset(1M) options described on the mediator(7) man page. However, such options should be used with extreme care to avoid recovery of the wrong data.


Caution -

Attempts to alternate access between the two strings should be avoided at all costs; such attempts will make the situation worse.


Before restoring client access to the data, exercise any available validation procedures on the entire dataset or on any data affected by recent transactions against the dataset.

Before you run the haswitch(1M) command to return any logical host from maintenance mode, make sure that you release ownership of the associated diskset.

9.4.5 Error Log Messages Associated With Mediators

The following syslog or console messages indicate that there is a problem with mediators or mediator data. Use the procedure "9.4.3 How to Fix Bad Mediator Data" to address the problem.

Attention required - medstat shows bad mediator data on host %s 
for diskset %s

Attention required - medstat finds a fatal error in probing 
mediator data on host %s for diskset %s!

Attention required - medstat failed for diskset %s