Sun Cluster 2.2 System Administration Guide

Chapter 1 Preparing for Sun Cluster Administration

This chapter describes procedures used to prepare for administration of the Sun Cluster configuration. Some of these procedures depend on your volume management software (Solstice DiskSuite or VERITAS Volume Manager); those procedures include the volume manager name in the procedure title. This chapter contains the following sections:

Saving Disk Partition Information (Solstice DiskSuite)
Saving and Restoring VTOC Information (Solstice DiskSuite)
Saving Device Configuration Information
Instance Names and Numbering
Performing Reconfiguration Reboots
Logging Into the Servers as root

Saving Disk Partition Information (Solstice DiskSuite)

Maintain disk partitioning information for all nodes and multihost disks in the Sun Cluster configuration. Keep this information up-to-date as new disks are added to the disksets and when any of the disks are repartitioned. You need this information to perform disk replacement.

The disk partitioning information for the local disks is not as critical because the local disks on all Sun Cluster nodes should have been partitioned identically. Most likely, you can obtain the local disk partition information from another Sun Cluster node if a local disk fails.

When a multihost disk is replaced, the replacement disk must have the same partitioning as the disk it is replacing. Depending on how a disk has failed, this information might not be available when replacement is performed. Therefore, it is especially important to retain a record of the disk partitioning information if you have several different partitioning schemes in your disksets.


Note -

Though VxVM does not impose this restriction, it is still a good idea to save this information.


A simple way to save disk partitioning information is shown in the following sample script. This type of script should be run after the Sun Cluster software has been configured. In this example, the files containing the volume table of contents (VTOC) information are written to the local /etc/opt/SUNWcluster/vtoc directory by the prtvtoc(1M) command.


#! /bin/sh
# Save the VTOC of every slice 7 found under /dev/rdsk to a file of the
# same name. If prtvtoc fails (for example, the slice does not exist or
# the disk cannot be read), remove the empty output file.
DIR=/etc/opt/SUNWcluster/vtoc
mkdir -p $DIR
cd /dev/rdsk
for i in *s7
do prtvtoc $i >$DIR/$i || rm $DIR/$i
done

Each of the disks in a Solstice DiskSuite diskset is required to have a Slice 7. This slice contains the metadevice state database replicas.

If a local disk also has a valid Slice 7, the VTOC information also will be saved by the sample script. However, this should not occur for the boot disk, because typically a boot disk does not have a valid Slice 7.


Note -

Make certain that the script is run while none of the disks is owned by another Sun Cluster node. The script will work if the logical hosts are in maintenance mode, if the logical hosts are owned by the local host, or if Sun Cluster is not running.


Saving and Restoring VTOC Information (Solstice DiskSuite)

If you save the VTOC information for all multihost disks, you can use it when a disk is replaced. The sample script shown in the following example uses the VTOC information saved by the script in "Saving Disk Partition Information (Solstice DiskSuite)" to give the replacement disk the same partitioning as the failed disk. In the example, substitute the actual names of the disk or disks to be added for c1t0d0s7 and c1t0d1s7. Specify multiple disks as a space-delimited list.


#! /bin/sh
DIR=/etc/opt/SUNWcluster/vtoc
cd /dev/rdsk
for i in c1t0d0s7 c1t0d1s7
do fmthard -s $DIR/$i $i
done

Note -

The replacement drive must be of the same size and geometry (generally the same model from the same manufacturer) as the failed drive. Otherwise the original VTOC might not be appropriate for the replacement drive.
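
One way to check the geometry is to compare the header lines that prtvtoc(1M) prints in the saved VTOC of the failed disk with those it prints for the proposed replacement. The following is only a sketch; the VTOC directory matches the earlier example, and the device name is illustrative.


#! /bin/sh
# Sketch: compare the geometry header lines (sectors, tracks, cylinders)
# of the saved VTOC with those of the disk now installed in the same slot.
# The device name c1t0d0s7 is an example.
DIR=/etc/opt/SUNWcluster/vtoc
DISK=c1t0d0s7
egrep 'sector|track|cylinder' $DIR/$DISK
prtvtoc /dev/rdsk/$DISK | egrep 'sector|track|cylinder'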


If you did not record this VTOC information, but you have mirrored slices on a disk-by-disk basis (that is, the VTOCs of both sides of the mirror are the same), you can copy the VTOC information from the other submirror disk to the replacement disk. For this procedure to succeed, the logical host must be in maintenance mode or be owned by the host on which you run the procedure, or Sun Cluster must be stopped. This procedure is shown in the following example.


#! /bin/sh
cd /dev/rdsk
OTHER_MIRROR_DISK=c2t0d0s7
REPLACEMENT_DISK=c1t0d0s7 
prtvtoc $OTHER_MIRROR_DISK | fmthard -s - $REPLACEMENT_DISK

If you did not save the VTOC information and did not mirror on a disk-by-disk basis, you can examine the component sizes reported by the metaset(1M) command and reverse engineer the VTOC information. Because the computations used in this procedure are complex, the procedure should be performed only by a trained service representative.
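
As a reference point, you can list the drives in a diskset with metaset(1M) and display the metadevice component sizes with metastat(1M). The following is only a sketch; the diskset name hahost1 is hypothetical.


# Sketch: display the drives in a diskset and the sizes of its metadevice
# components (hahost1 is a hypothetical diskset name).
metaset -s hahost1
metastat -s hahost1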

Saving Device Configuration Information

Record the /etc/path_to_inst and the /etc/name_to_major information on removable media (floppy disk or backup tape).

The path_to_inst(4) file contains the minor unit numbers for disks in each multihost disk expansion unit. This information will be necessary if the boot disk on any Sun Cluster node fails and has to be replaced.
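
For example, the files can be copied to tape with a command such as the following sketch, which assumes a tape drive at /dev/rmt/0.


# Sketch: archive the device configuration files to tape
# (/dev/rmt/0 is an assumed tape device).
tar cvf /dev/rmt/0 /etc/path_to_inst /etc/name_to_major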

Instance Names and Numbering

Instance names are occasionally reported in driver error messages. An instance name identifies a particular system device, such as ssd20 or hme5.

You can determine the binding of an instance name to a physical name by looking at /var/adm/messages or dmesg(1M) output:


ssd20 at SUNW,pln0:
ssd20 is /io-unit@f,e0200000/sbi@0,0/SUNW,soc@3,0/SUNW,pln@a0000800,20183777/ssd@4,0

le5 at lebuffer5: SBus3 slot 0 0x60000 SBus level 4 sparc ipl 7
le5 is /io-unit@f,e3200000/sbi@0,0/lebuffer@0,40000/le@0,60000

Once an instance name has been assigned to a device, it remains bound to that device.

Instance numbers are encoded in a device's minor number. To keep instance numbers persistent across reboots, the system records them in the /etc/path_to_inst file. This file is read only at boot time and is currently updated by the add_drv(1M) and drvconfig(1M) commands. For additional information refer to the path_to_inst(4) man page.
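
For example, you can display the entries recorded for a particular driver. The following sketch uses the ssd driver; each line of the file contains the physical device path, the instance number, and the driver binding name.


# Sketch: show the instance numbers recorded for the ssd driver.
grep '"ssd"' /etc/path_to_inst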

When you install the Solaris operating environment on a node, instance numbers can change if hardware was added or removed since the last Solaris installation. For this reason, use caution whenever you add or remove devices such as SBus or FC/OM cards on Sun Cluster nodes. Keep existing devices in the same locations so that instance numbers, and the device names derived from them, remain consistent after a reinstallation or a reconfiguration reboot.

Instance number problems can arise in a configuration. For example, consider a Sun Cluster configuration that consists of three SPARCstorage(TM) Arrays with Fibre Channel/SBus (FC/S) cards installed in SBus slots 1, 2, and 4 on each of the nodes. The controller numbers are c1, c2, and c3. If the system administrator adds another SPARCstorage Array to the configuration using an FC/S card in SBus slot 3, the corresponding controller number will be c4. If Solaris is reinstalled on one of the nodes, the controller numbers c3 and c4 will refer to different SPARCstorage Arrays on that node. The other Sun Cluster node will still refer to the SPARCstorage Arrays by the original controller numbers, and Solstice DiskSuite will be unable to communicate with the disks connected to the c3 and c4 controllers.
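
One way to verify which physical device a controller number refers to on a given node is to examine the /dev/dsk symbolic links, which point to the physical paths under /devices. The following sketch uses an illustrative device name.


# Sketch: show the physical device path behind controller c3
# (c3t0d0s2 is an example device name).
ls -l /dev/dsk/c3t0d0s2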

Similar problems can arise with the instance numbering of Ethernet interfaces. For example, suppose each Sun Cluster node has three Ethernet SBus cards installed in slots 1, 2, and 3, with instance names hme1, hme2, and hme3. If the middle card (hme2) is removed and Solaris is reinstalled, the third SBus card is renamed from hme3 to hme2.

Performing Reconfiguration Reboots

During some of the administrative procedures documented in this book, you are instructed to perform a reconfiguration reboot by using the OpenBoot(TM) PROM boot -r command, or by creating the /reconfigure file on the node and then rebooting.
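
The second method might look like the following sketch, run as root on the node being rebooted; init 6 is shown here as one way to reboot.


# Sketch: request a reconfiguration reboot from a running node
# (equivalent in effect to boot -r at the OpenBoot PROM).
touch /reconfigure
init 6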


Note -

It is not necessary to perform a reconfiguration reboot to add disks to an existing multihost disk expansion unit.


Avoid performing Solaris reconfiguration reboots when any hardware (especially a multihost disk expansion unit or disk) is powered off or otherwise defective. In such situations, the reconfiguration reboot removes the inodes in /devices and the symbolic links in /dev/dsk and /dev/rdsk associated with the disk devices. These disks become inaccessible to Solaris until a later reconfiguration reboot. A subsequent reconfiguration reboot might not restore the original controller minor unit numbering, and therefore might cause the volume manager software to reject the disks. When the original numbering is restored, the volume manager software can access the associated objects.

If all hardware is operational, you can safely perform a reconfiguration reboot to add a disk controller to a node. You must add such controllers symmetrically to both nodes (though a temporary imbalance is allowed while the nodes are upgraded). Similarly, if all hardware is operational, it is safe to perform a reconfiguration reboot to remove hardware.


Note -

For the Sun StorEdge A3000, in the case of a single controller failure, replace the failed controller as soon as possible. Other administration tasks that would normally require a boot -r (such as adding a new SCSI device) should be deferred until the failed controller has been replaced and brought back online, and all logical unit numbers (LUNs) have been balanced back to the state they were in before the failover occurred. Refer to the Sun StorEdge A3000 documentation for more information.


Logging Into the Servers as root

If you want to log in to Sun Cluster nodes as root through a terminal other than the console, you must edit the /etc/default/login file and comment out the following line:


CONSOLE=/dev/console

This enables root logins using rlogin(1), telnet(1), and other programs.
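
If you prefer to make this change from a script rather than with an editor, a sketch like the following could be used; it keeps a backup copy of the original file.


#! /bin/sh
# Sketch: comment out the CONSOLE entry in /etc/default/login,
# keeping a backup of the original file.
cp /etc/default/login /etc/default/login.orig
sed 's/^CONSOLE=/#CONSOLE=/' /etc/default/login.orig > /etc/default/login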