Sun Cluster 2.2 7/00 Release Notes

Known Problems

The following known problems affect the operation of the Sun Cluster 2.2 7/00 Release.

Framework Bugs

4132195 - Clusters that include dual-Ultra-2, dual-fas-SunSwift, and dual-Sun StorEdge MultiPack MI-SCSI devices can experience a bug in the fas chip that causes the fas SCSI bus to hang when it is selected by more than one host. This can occur in a number of situations--for example, when a SCSI target device driver is attached or after a dormant detached device is re-attached.

If an active cluster node with SCSI ID target 6 is running while another node using SCSI ID target 7 is rebooted, timeouts or resets might result. Note that Solaris reboots can cause a probe to all possible devices.

To prevent timeouts or resets caused by this bug, remove the entries for target 6 (and for target 7 if present, and for any other SCSI ID that coincides with a SCSI ID used by a SunSwift/fas initiator in an MI-SCSI configuration) from both the st.conf and sd.conf files.
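
The entries to remove or comment out look similar to the following sketch; the exact lines in your sd.conf and st.conf files might carry additional properties, so match on the target value rather than on the complete line:


name="sd" class="scsi" target=6 lun=0;
name="st" class="scsi" target=6 lun=0;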

4217658 - The public network monitor (PNM) daemon, pnmd, does not perform failover of multicast groups after a network adapter failover occurs. Sun Cluster supports failover of only the default multicast route (224.0.0.0). When a network adapter failover or switchover occurs, the default multicast route is switched to the appropriate adapter, but any client application that had established a multicast group will no longer work. You must restart any client applications in this condition.

4233956 - Cluster fails to come up and displays error messages if IP addresses are not assigned to logical hosts. The error messages might indicate that ifconfig failed. To prevent the problem, make sure all logical hosts have entries in the /etc/hosts files or name service maps indicating their associated IP addresses, before you attempt to bring up the cluster.
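
For example, /etc/hosts entries of the following form (the logical host names and addresses shown are placeholders) associate each logical host with its IP address:


192.168.100.10   hahost1
192.168.100.11   hahost2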

4270573 - The confccdssa(1M) command displays error messages and hangs if the disk name you specify contains the default suffix of a subdisk name. Work around the problem by creating disk groups (sc_dg) manually or by renaming any disks that contain a numeric suffix of the format -XX, so that they do not contain the suffix.
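
If the affected disk is already under VERITAS Volume Manager control, one way to rename it is with the vxedit(1M) rename operation; this is only a sketch, and the disk group and disk media names shown are hypothetical:


# vxedit -g rootdg rename disk-01 disk01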

4286442 - In a cluster environment with shared single-ended or differential SCSI devices, the SCSI chain can be broken when a node is powered off incorrectly or when the SCSI cable is disconnected before the bus is quiesced. This can cause data access errors on the node that is still active. Prevent this problem by following the instructions exactly as documented in the Sun Enterprise Cluster System Site Preparation, Planning, and Installation Guide and the Sun Enterprise Cluster Hardware Service Manual when powering off a node or disconnecting SCSI cables.

4291427 - In Sun Cluster 2.2 running on Solaris 7, using the scinstall(1M) command to remove the client packages can fail with the following error message:


Patch 108400 is required to be installed by patch 108446. It cannot be backed out until patch 108446 is backed out.

This occurs because of dependencies between patches 108446 and 108400. Work around the problem by removing patches 108446 and 108400 manually and then restarting the package removal process with scinstall.
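
For example, the manual removal might look like the following sketch; remove patch 108446 first, because patch 108400 is required by it, and replace the placeholder revisions shown here with the revisions that showrev -p reports as installed:


# showrev -p | egrep "108446|108400"
# patchrm 108446-01
# patchrm 108400-01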

4296706 - If a connection is lost from a differential SCSI device (A1000/D1000, A3x00), or if termination is lost because one cluster node is powered off, the storage device can become inaccessible to the surviving host and the surviving host can panic. Prevent this problem by following the instructions exactly as documented in the Sun Enterprise Cluster System Site Preparation, Planning, and Installation Guide and the Sun Enterprise Cluster Hardware Service Manual when powering off a node or disconnecting SCSI cables.

4299187 - The cluster console does not accept non-ascii characters, for example, Japanese characters or French (accented) characters. Work around the problem by inputting such characters through the individual terminal windows on each cluster node, instead of through the cluster console.

4319412 - Killing clustd on the master node panics both the master node and backup node. Prevent or work around the problem by applying a Solstice DiskSuite patch, available from your service provider.

4321549 - Cannot switch over logical host while database instance is running on single-CPU nodes, on clusters using Oracle 8.1.6 and Solstice DiskSuite. Work around the problem by applying patch 108508 (Solaris 2.6) or 108509 (Solaris 7), available from your service provider or from the Sun patch web site, http://sunsolve.sun.com.

4326020 - Layered volumes feature of VERITAS Volume Manager 3.0.x: Problems can occur when you create or switch over a logical host associated with a disk group containing layered volumes. Prevent the problem by installing a Sun Cluster patch. See "Patches" for patch details.

4326276 - Node failover or removal is prevented on clusters using Instant Image 2.0. Because the volume manager is overlaid with the Instant Image sv driver, the Sun Cluster software cannot unmount disk group volumes during failover. Prevent the problem by applying the relevant Sun Cluster and Instant Image patches, available from your service provider or from the Sun patch web site, http://sunsolve.sun.com.

Hardware Qualification Bugs

4367622 - Upgrading Sun StorEdge T3 disk tray firmware from 1.16 to 1.16a can result in a hung telnet window. The controller firmware upgrade command (boot -i) can hang when run through a telnet session, due to lack of memory. Work around the problem by upgrading the controller firmware through a serial port connection to the Sun StorEdge T3 disk tray, or by resetting the disk tray and attempting the upgrade again.

4374280 - If you are running a RAID-0 volume on a Sun StorEdge T3 disk tray and a disk drive in that tray fails, the disk tray continues to make the volume available to the host, resulting in VERITAS Volume Manager delays and overall system performance problems. Work around this problem by using RAID-0 volumes only in host-based mirroring configurations.

4399132 - During volume reconstruction using the Sun StorEdge T3 disk tray, if recon_rate is set to high, nodes cannot join the cluster. Work around this problem by using the factory default (medium) for recon_rate.

4393512 - SCSI-reservation failures have been observed when clustering StorEdge MultiPack enclosures that contain a particular model of Quantum disk drive: SUN4.2G VK4550J. Do not use this model of Quantum disk drive for clustering with StorEdge MultiPack enclosures. If you do use this model of disk drive, you must set the scsi-initiator-id of the "first node" to 6. If you are using a six-slot StorEdge MultiPack enclosure, you must also set the enclosure to use the 9-through-14 SCSI target address range (for more information, see the Sun StorEdge MultiPack Storage Guide).
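
One way to change this setting is from the OpenBoot PROM (the eeprom(1M) command from a running Solaris system is an alternative). Note that this changes the initiator ID for every SCSI bus on the node, so consult the Sun StorEdge MultiPack Storage Guide before applying it; buses that must remain at ID 7 might need to be reset through an nvramrc script. A minimal sketch:


ok setenv scsi-initiator-id 6
ok reset-all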

Installation Bugs

4336171 - During initial cluster software installation, the scinstall(1M) command displays volume manager choices as follows:


1) Cluster Volume Manager (CVM)
2) Sun StorEdge Volume Manager (SSVM)
3) Solstice DiskSuite (SDS)

Sun Cluster 2.2 7/00 Release supports VERITAS Volume Manager 3.0.4, which includes the functionality formerly called Cluster Volume Manager. However, scinstall has not been updated to reflect the new product names. If you want to install VERITAS Volume Manager 3.0.4, select option 2. If you need the cluster functionality formerly known as Cluster Volume Manager (if your cluster includes Oracle Parallel Server, for example), select option 1.

4359807 - When installing Netscape Messaging Server, the name you type for the server instance name should not start with the prefix msg-. The installation software automatically adds that prefix to the base name you specify. You should also specify that same base name as the data service instance name when you run the hadsconfig(1M) utility. The hadsconfig utility automatically adds the prefix SUNWscnsm_ to the base name you specify. For example, if you specify the base name my_mail, the resulting server instance name would be msg-my_mail, and the resulting data service instance name would be SUNWscnsm_my_mail.

Upgrade Bugs

4218613 - During upgrade to Sun Cluster 2.2 from HA 1.3, instance configuration information for the HA-DBMS data services is not propagated to the new cluster. This prevents the database instances from starting when the new cluster is started. This bug affects the Sun Cluster HA for Oracle, Sun Cluster HA for Sybase, and Sun Cluster HA for Informix data services.

Work around the problem by manually recreating the database instance after completing the upgrade. Use the appropriate hadbms insert command (haoracle insert, hasybase insert, or hainformix insert) as described in the associated man pages, and in the appropriate data service chapters in the Sun Cluster 2.2 Software Installation Guide.

After you recreate the database instances, start the instances by using the appropriate hadbms start command.
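
For example, an Oracle instance might be started as follows; the instance name shown is a placeholder, and the exact argument forms are described in the haoracle(1M), hasybase(1M), and hainformix(1M) man pages:


# haoracle start oracle_sid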

4218823 - During upgrade from HA 1.3 to Sun Cluster 2.2, only two of three required IP addresses are added to the /.rhosts file on each node. The address lost is the highly available IP address for the private interconnects. Utilities such as hadsconfig(1M) will not work without this entry. The user must manually add the required entries to the /.rhosts file. The procedure is documented in Chapter 3 of the Sun Cluster 2.2 Software Installation Guide.
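
Each /.rhosts entry is a host name or IP address on its own line; the entries below are placeholders only, and the actual highly available private-interconnect addresses for your cluster are listed in the Chapter 3 procedure:


phys-hahost1
phys-hahost2
204.152.65.33
204.152.65.34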

4327771 - When upgrading from Sun Cluster 2.2 on Solaris 2.6 to Sun Cluster 2.2 7/00 on Solaris 8, the SUNWdidx package is not installed. This occurs only when Solaris 8 is booted in 64-bit mode. This causes initialization of disk IDs to fail, leaving the upgrade incomplete. Work around the problem by installing the SUNWdidx package manually, after installing the upgraded Solaris and Sun Cluster packages. Then re-initialize disk IDs, using the scdidadm(1M) command, as documented in Chapter 4 of the Sun Cluster 2.2 Software Installation Guide.
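
For example, assuming the Sun Cluster 2.2 7/00 media is mounted (the path shown is illustrative and depends on your media layout), the manual package installation might look like this:


# cd /cdrom/cdrom0/Sun_Cluster_2_2/Sol_2.8/Product
# pkgadd -d . SUNWdidx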

Administrative Command Bugs

4204883 - The confccdssa(1M) command will fail when you select the controller that contains the boot disk, and will display the misleading error message: "First RE may not be NULL. WARNING: All disks on this SSA (ctlr: nn) are either already in disk groups, have already been selected as one of the devices for the shared CCD or are otherwise unavailable." To prevent this problem, do not select the controller that contains the boot disk.

4235744 - The scconf clustername -F logicalhost command creates the primary and mirror of the HA administrative volume dg-stat on two different disks in the same storage device. If that storage device fails, or connection to that storage device is lost, automatic volume recovery is not possible. You must manually fix the volume and restart the volume.

To diagnose and correct this problem, perform the following steps.

  1. Check whether your existing administrative file system is created with the mirrors on the same controller. If not, then no further action is needed.

    If the administrative file system volumes are mirrored on disks on same controller, proceed with the following steps to rebuild the administrative file system so that the volumes are correctly mirrored across controllers.

  2. Back up any data that is in the administrative file system (/logicalhost) directory.

  3. Put the logical host in maintenance mode.

  4. Using VERITAS Volume Manager commands, manually import the disk group where the administrative file system resides, remove the dg-stat volume, and then re-create the volume using the same name dg-stat, specifying a mirror layout across controllers. (A command sketch follows this procedure.)

  5. Recreate the administrative file system.


    # scconf clustername -F logicalhost
    

    The command will find that an administrative file system volume (dg-stat) already exists, and will use that volume to create the administrative file system.

  6. Unmount the newly-created file system.

  7. Deport the disk group.

  8. Bring up the logical host by using the haswitch(1M) command.

  9. Restore any data to the /logicalhost directory.
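
The VERITAS Volume Manager commands in Step 4 might look like the following sketch; the disk group name and the volume size are assumptions, so substitute the values used in your configuration:


# vxdg import diskgroup
# vxvol -g diskgroup stop dg-stat
# vxedit -g diskgroup -rf rm dg-stat
# vxassist -g diskgroup make dg-stat 2m layout=mirror mirror=ctlr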

4240225 - A umount operation will fail during a switchover if the df command is run before the partition is unmounted. This causes the cluster to attempt to re-master the logical host on the original node, which fails, leaving the logical host in a partially mastered state. The error message produced in this situation is cryptic: "ID[SUNWcluster.scnfs.4010]: unmount /mail/spool failed." To work around the problem, switch the logical host into maintenance mode by using haswitch or scconf(1M), and then re-master the logical host correctly, using the scconf command. See the scconf(1M) man page for details.
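
For example, with a placeholder logical host name, haswitch(1M) places the logical host in maintenance mode:


# haswitch -m hahost1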

Data Service Bugs

4262913 - The Sun Cluster HA for Oracle ksh script in /opt/SUNWcluster/bin has problems if it encounters a non-Sun-supplied data service with the string "oracle" in its name. Therefore, do not include the string "oracle" in any names of data services you create using the Sun Cluster Application Programming Interface (API).

4336343 - "Child-level monitoring" features have been added to pmfd for Sun Cluster 2.2. See "Features" for more information.

4338556 - The Sun Cluster HA for NetBackup activity monitor does not display the correct status after switchover. This means that you cannot detect the status of backups after a switchover. No workaround exists for this problem currently. See your service provider for the latest status.

4345031 - If a switchover or failover from a NetBackup client cluster occurs during a restore operation, the restore process continues to write to the root disk and might fill up that disk, thus stalling the switchover of the cluster. Simultaneously, the NetBackup Progress Report utility does not report the correct status of the restore operation.

To correct this situation, include the following parameter in the bp.conf file for the NetBackup client cluster:


REQUIRED_INTERFACE=logicalhostname

For example:


REQUIRED_INTERFACE=lh-schost-1

This parameter ensures that the restore process (a tar operation) stops writing to the root disk through the shared disk mount point after a switchover or failover. However, the tar process might persist for a while; you can kill it and then re-execute the restore.

4387527 - When installing the ha-oracle agent in Sun Cluster 2.2, root does not need to be listed as a member of the database administrator group in the /etc/group file as previously documented. The entry can now be


dba:*:520:oracle

For more information on installing and configuring Sun Cluster HA for Oracle, see the Sun Cluster 2.2 Software Installation Guide.

4405556 - Missing information on installing Sun Cluster HA for Oracle on multihost disks. The following note should be included in Chapter 5 of the Sun Cluster 2.2 Software Installation Guide.


Note -

If you install the Oracle binaries on a multihost disk, you must install the SQL*PLUS option from Oracle on the local nodes as well. The Sun Cluster HA for Oracle fault monitor only works correctly when you install the SQL*PLUS option on the local nodes.


Sun Cluster Manager Bugs

Running SCM with the HotJava browser - If you choose to use the HotJava browser shipped with your Solaris 2.6 or Solaris 7 operating environment to run SCM, you might encounter problems.

4221612 - Sun Cluster Manager sometimes incorrectly reports that the Sun Cluster HA for Netscape HTTP data service is down when it is actually up. Work around the problem by checking the status of the data service with hareg(1M) or hastat(1M) instead. See the hareg(1M) and hastat(1M) man pages for details.

4312093 - On Solaris 8, you cannot run Sun Cluster Manager as an applet with the Netscape browser and Java Development Toolkit (JDK) version 1.2. Instead, you can either run Sun Cluster Manager as a standalone application, or change the default JDK to version 1.1 (if your cluster is not running any applications that depend upon JDK 1.2).

To run Sun Cluster Manager as a standalone application, follow the detailed instructions in Chapter 2 of the Sun Cluster 2.2 System Administration Guide.

If your cluster is not hosting any applications that depend upon JDK 1.2, you can choose to change the JDK default to version 1.1. Do this by modifying the JAVA_HOME fields in the Sun Cluster Manager start-up script to specify version 1.1:


# cd /opt/SUNWcluster/scmgr/lib
# vi scm_server_start
... 
JAVA_HOME=/usr/java1.1
PATH=${JAVA_HOME}/bin:/bin:/etc:/sbin:/usr/sbin
...

4316289 - Sun Cluster Manager, when invoked from the command line, does not display data services associated with a logical host. Work around the problem by obtaining data service information from the Properties > Registered HA Services menu of the Sun Cluster Manager GUI.

4332639 - Sun Cluster Manager calls the HotJava browser by default, but HotJava is not supported on Solaris 8. Work around the problem on Solaris 8 configurations by specifying the Netscape browser. Use the following command:


# /opt/SUNWcluster/bin/scmgr -b /usr/dt/bin/netscape

4333246 - Sun Cluster Manager displays qfe private network connections as Unknown. Work around the problem by refreshing the cluster view in the Sun Cluster Manager GUI, using the menus Help > Cluster > Refresh Current View.

Documentation Errata

4233113 - Sun Cluster documentation omits information regarding logical host timeout values and how they are used. When you configure the cluster, you set a timeout value for the logical host. This timeout value is used by the CCD when you bring a data service up or down using the hareg(1M) command. The CCD operation occurs in two steps; half of the timeout value is used for each step. Therefore, when configuring START and STOP methods for data services, make sure each method uses no more than half of the timeout value set for the logical host. For example, if the logical host timeout is set to 600 seconds, each CCD step is allotted 300 seconds, so each START or STOP method must complete within 300 seconds.

4330501 - The Sun Cluster 2.2 System Administration Guide, section 4.4, "Disabling Automatic Switchover" indicates that you can disable automatic switchover of logical hosts by using the scconf -m command. This is misleading. You can use scconf -m to disable automatic switchover of logical hosts only if you issue the command when you create the logical hosts initially.

If the logical host already exists, you must remove the logical host and then re-create it using scconf -m, in order to disable automatic switchover.

4336091 - Sun Cluster documentation omits information regarding how to set logical unit numbers (LUNs) for A1000 and A3x00 storage devices.

When you add A1000 and A3x00s to a Sun Cluster configuration, you must set the LUNs so that they survive switchover or failover of a cluster node, without loss of pseudo-device information. Use the following procedure to ensure that LUNs are set correctly and permanently for these disk types.

  1. On both nodes, install or verify the existence of the RAID manager packages, SUNWosafw, SUNWosamn, SUNWosar, and SUNWosau.

  2. (Solaris 8 only) Install or verify the existence of the RAID manager patch 108553.

    Obtain the patch from your service provider or from the patch web site http://sunsolve.sun.com.

  3. Use the RM6 tool to set up the LUNs on the first node.

    Using the tool's GUI, click "Configuration," then "Module Name," and then the "Create LUN" icon.

  4. Compare the /etc/osa/rdac_address files on both nodes.

    In Step 3, LUNs were assigned to either controller A or B, and the rdac_address file records this assignment. If necessary, modify the rdac_address file on the second node so that the controller assignments match those on the first node.

    Run the following RAID manager command on both nodes.


    # /usr/lib/osa/bin/hot_add
    

4341222 - Chapter 1, sections 1.3.2 and 1.5.6 of the Sun Cluster 2.2 Software Installation Guide do not accurately describe the behavior of CCD quorum during cluster configuration. The documentation should reflect that it is possible to modify the cluster configuration database even when CCD quorum conditions are not met (that is, when greater than half of all cluster nodes do not have a valid CCD).

Typically, the cluster software requires a quorum before updating the CCD. This requirement is highly restrictive in configurations using logical hosts. Therefore, to overcome this limitation in Sun Cluster 2.2, all administrative and configuration commands related to logical hosts and data services that update the CCD database can be executed without CCD quorum. Such commands include hareg(1M) and scconf(1M) operations, for example.

To prevent loss of any CCD updates, you should always make sure that the last node to leave the cluster during cluster shutdown is the first node to rejoin the cluster upon start up.

4342066 - The procedure "How to Change the Name of a Cluster Node" in section 3.2 of the Sun Cluster 2.2 System Administration Guide is incorrect and should not be used. The correct procedure involves changing various framework files which should not be altered manually by anyone other than your service representative. If you need to change the name of a cluster node, contact your service provider for assistance.

4342236 - The abort_net method is described incorrectly in the Sun Cluster 2.2 API Developer's Guide. The documentation states that the abort_net method can be used to execute "last wishes" cleanup code before a cluster is stopped. This is incorrect.

Instead, the abort_net method is called by the clustd daemon when a node is about to abort from the cluster, typically in a split-brain situation when the node in question is the loser in the race for the quorum device (see Chapter 1 in the Sun Cluster 2.2 Software Installation Guide for more information about quorum devices). In such a case, first abort_net is called, then the network is taken down, and finally abort is called. However, these methods are executed on the node that is aborting only if the node owns the data service associated with the methods. (That is, if the aborting node does not own any logical host, then it will not execute any of the abort methods associated with a data service.) The aborting node will stop the cluster software, but the node itself will remain up.

Note that stop and stop_net methods are called each time the cluster reconfigures itself (due to nodes joining or leaving the cluster), as part of normal cluster operation.

4343021 - In Chapter 14 in the Sun Cluster 2.2 Software Installation Guide, the documented installation procedure for Sun Cluster HA for NetBackup is incorrect. The correct procedure is as follows.

  1. Install Sun Cluster 2.2 7/00 Release using the procedures documented in Chapter 3 of the Sun Cluster 2.2 Software Installation Guide.

  2. Stop the cluster by running the following command on all nodes, sequentially.


    # scadmin stopnode
    

  3. On all nodes, install VERITAS NetBackup, using the procedures documented in Chapter 14 of the Sun Cluster 2.2 Software Installation Guide.

  4. On all nodes, install Sun Cluster patch 109214, which enhances the scinstall(1M) command to recognize Sun Cluster HA for NetBackup. The patch is available from your service provider or from the Sun patch web site http://sunsolve.sun.com.

  5. On all nodes, re-run the scinstall command and install the Sun Cluster HA for NetBackup data service.


    # scinstall
    

  6. On all nodes, install Sun Cluster patches 108450 and 108423, which enhance the hadsconfig(1M) command to recognize Sun Cluster HA for NetBackup.

  7. Start the cluster. On the first node, run the following command:


    # scadmin startcluster
    

    Sequentially, on all other nodes, run the following command:


    # scadmin startnode
    

  8. Register the data service by running the following command on one node only.


    # hareg -s -r netbackup
    
  9. Configure the Sun Cluster HA for NetBackup data service by running the hadsconfig command on one node only. See Chapter 14 in the Sun Cluster 2.2 Software Installation Guide for configuration parameters to supply to hadsconfig.


    # hadsconfig
    
  10. Activate the data service by running the following command on one node only.


    # hareg -y netbackup
    

4343093 - Chapter 2, section 2.6.5, in the Sun Cluster 2.2 Software Installation Guide states that Sun Cluster 2.2 must run in C locale. This is incorrect. Sun Cluster 2.2 7/00 Release can run in C, fr (French), ko (Korean), and ja (Japanese) locales.

4344711 - Appendix C in the Sun Cluster 2.2 Software Installation Guide contains incorrect or incomplete information about configuring VERITAS Volume Manager. These errors are described in more detail below.

The document mentions only VxFS file systems, and omits information about UFS file systems. In Sun Cluster configurations, UFS file systems can be created in a similar fashion to VxFS file systems. See your system administration documentation for more information about creating and administering UFS file systems.

When using the mkfs(1M) command to create VxFS file systems, use the fully qualified path to the command, such as /usr/lib/fs/vxfs/mkfs. The documentation omits this information wherever the mkfs command is described.

In section C.3, "Configuring VxFS File Systems on the Multihost Disks," the procedure contains erroneous steps. The correct procedure follows. Use this procedure after creating logical hosts as described in the scconf(1M) man page or in Chapter 3, "Installing and Configuring Sun Cluster Software."

  1. Take ownership of the disk group containing the volume by using the vxdg(1M) command to import the disk group to the active node.


    phys-hahost1# vxdg import diskgroup
    

  2. Run the following scconf(1M) command on each cluster node.

    This scconf command will create a volume for the administrative file system, create a file system within that volume, create mount points for that volume in the root file system ("/"), create dfstab.logicalhost and vfstab.logicalhost files in /etc/opt/SUNWcluster/conf/hanfs, and create an appropriate entry in the vfstab.logicalhost file for the administrative file system.


    phys-hahost1# scconf clustername -F logicalhost
    

  3. Create file systems for all volumes. These volumes will be mounted by the logical hosts.


    phys-hahost1# mkfs -F vxfs /dev/vx/rdsk/diskgroup/volume
    

  4. Update the vfstab.logicalhost file to include entries for the file systems created in Step 3. (An example entry follows this procedure.)

  5. Create mount points for the file systems created in Step 3.


    phys-hahost1# mkdir /logicalhost/volume
    

  6. Import the disk groups to their default masters.

    It is most convenient to create and populate disk groups from the active node that is the default master of the particular disk group.

    Import each disk group onto the default master node using the -t option. The -t option is important, as it prevents the import from persisting across the next boot.


    phys-hahost1# vxdg -t import diskgroup
    

  7. (Optional) To make file systems NFS-sharable, refer to Chapter 11, "Installing and Configuring Sun Cluster HA for NFS."
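
The vfstab.logicalhost entries referred to in Step 4 use the standard vfstab field layout; a sketch of an entry, with placeholder disk group, volume, and mount point names, follows. Match the field values against the administrative file system entry that the scconf -F command created in the same file:


/dev/vx/dsk/diskgroup/volume /dev/vx/rdsk/diskgroup/volume /logicalhost/volume vxfs - yes -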

4345750 - In Chapter 14 of the Sun Cluster 2.2 System Administration Guide, in the procedure "How to Replace a Sun StorEdge A5000 Disk (VxVM)," Steps 3 and 4 are not valid for a cluster that runs on the Solaris 7 11/99 operating environment or later. In Step 3, run the luxadm remove_device command on only one of the nodes connected to the array; running the command on additional nodes is unnecessary and generates error messages. In Step 4, after you physically replace the disk, do not run the luxadm insert_device command; it is not necessary.

If your cluster runs on a Solaris operating environment earlier than the Solaris 7 11/99 release, Steps 3 and 4 are still valid as documented.

4356674 - In Chapter 14 of the Sun Cluster 2.2 System Administration Guide, the procedure "How to Replace a Sun StorEdge A5000 Disk (Solstice DiskSuite)" contains errors in Steps 2, 3, 11, 12, and 13. In these steps, the directory /tmp should be replaced with /var/tmp, and physical device names should be replaced with did device names. The corrected procedure, in its entirety, is as follows.

  1. Identify all metadevices or applications that use the failing disk.

    If the metadevices are mirrored or RAID 5, the disk can be replaced without stopping the metadevices. Otherwise all I/O to the disk must be stopped using the appropriate commands. For example, use the umount(1M) command to unmount a file system on a stripe or concatenation.

  2. Preserve the disk label, if necessary. For example:


    # prtvtoc /dev/rdsk/c1t3d0s2 > /var/tmp/c1t3d0.vtoc
    

  3. (Optional) Use metareplace to replace the disk slices if the disk has not been hot-spared. For example:


    # metareplace d1 /dev/did/dsk/d23 /dev/did/dsk/d88
    d1: device d23 is replaced with d88
    

  4. Use luxadm -F to remove the disk.

    The -F option is required because Solstice DiskSuite does not offline disks. Repeat the command for all hosts, if the disk is multihosted. For example:


    # luxadm remove -F /dev/rdsk/c1t3d0s2
    WARNING!!! Please ensure that no filesystems are mounted on these
    device(s). All data on these devices should have been backed 
    up. The list of devices which will be removed is: 
    1: Box Name "macs1" rear slot 1
    Please enter `q' to Quit or <Return> to Continue: stopping: Drive
    in "macs1" rear slot 1....Done
    offlining: Drive in "macs1" rear  slot 1....Done
    Hit <Return> after removing the device(s).


    Note -

    The FPM icon for the disk drive to be removed should be blinking. The amber LED under the disk drive should also be blinking.


  5. Remove the disk drive and enter Return. The output should look similar to the following:


    Hit <Return> after removing the device(s). 
    Drive in Box Name "macs1" rear slot 1 
    Removing Logical Nodes: 
    Removing c1t3d0s0 Removing c1t3d0s1 Removing c1t3d0s2 Removing
    c1t3d0s3 Removing c1t3d0s4 Removing c1t3d0s5 Removing c1t3d0s6
    Removing c1t3d0s7 Removing c2t3d0s0 Removing c2t3d0s1 Removing
    c2t3d0s2 Removing c2t3d0s3 Removing c2t3d0s4 Removing c2t3d0s5
    Removing c2t3d0s6 Removing c2t3d0s7
    # 

  6. Repeat Step 4 for all nodes, if the disk array is in a multi-host configuration.

  7. Use the luxadm insert command to insert the new disk. Repeat for all nodes. The output should be similar to the following:


    # luxadm insert macs1,r1
    The list of devices which will be inserted is: 
    1: Box Name "macs1" rear slot 1
    Please enter `q' to Quit or <Return> to Continue: Hit <Return>
    after inserting the device(s).

  8. Insert the disk drive and enter Return. The output should be similar to the following:


    Hit <Return> after inserting the device(s). Drive in Box Name
    "macs1" rear slot 1  Logical Nodes under /dev/dsk and /dev/rdsk:
    c1t3d0s0 c1t3d0s1 c1t3d0s2 c1t3d0s3 c1t3d0s4 c1t3d0s5 c1t3d0s6
    c1t3d0s7 c2t3d0s0 c2t3d0s1 c2t3d0s2 c2t3d0s3 c2t3d0s4 c2t3d0s5
    c2t3d0s6 c2t3d0s7
    # 


    Note -

    The FPM icon for the disk drive you replaced should be lit. In addition, the green LED under the disk drive should be blinking.


  9. On all nodes connected to the disk, use scdidadm(1M) to update the DID pseudo device information.

    In this command, DID_instance is the instance number of the disk that was replaced. Refer to the scdidadm(1M) man page for more information.


    # scdidadm -R DID_instance
    

  10. Reboot all nodes connected to the new disk.

    To avoid down time, use the haswitch(1M) command to switch ownership of all logical hosts that can be mastered by the node to be rebooted. For example,


    # haswitch phys-hahost2 hahost1 hahost2
    

  11. Label the disk, if necessary. For example:


    # cat /var/tmp/c1t3d0.vtoc | fmthard -s - /dev/rdsk/c1t3d0s2
    fmthard:  New volume table of contents now in place.

  12. Replace the metadb, if necessary. For example:


    # metadb -s setname -d /dev/did/rdsk/d23s7; 
    metadb -s setname -a -c 3 /dev/did/rdsk/d23s7
    

  13. Enable the new disk slices with metareplace -e. For example:


    # metareplace -e d1 /dev/did/rdsk/d23s0
    d1: device d23s0 is enabled

    This completes the disk replacement procedure.

4448815 - In the cports(1M) man page, there is a typo in a file name. The man page currently says: "If an entry for "serialports" has been made in the /etc/nisswitch.conf file, then the order of lookups is ..." The correct file name is /etc/nsswitch.conf.

4448860 - In the chosts(1) man page, there is a typo in a file name. The man page currently says: "If an entry for "clusters" has been made in the /etc/nisswitch.conf file, then the order of lookups is ..." The correct file name is /etc/nsswitch.conf.

Other Known Issues

Oracle Parallel Server 8.1.6 UDLM Requirements

To run Oracle Parallel Server 8.1.6 with Sun Cluster 2.2 7/00 Release on Solaris 8, you must download an Oracle patch that provides fixes to the UNIX Dynamic Lock Manager (UDLM), version 3.3.4.4, allowing it to recognize and install on Solaris 8. The patch is available from your Oracle or Sun service provider.

Failover/Switchover When Logical Host File System Is Busy

If a failover or switchover occurs while a logical host's file system is busy, the logical host fails over only partially; part of the disk group remains on the original physical host. Do not attempt a switchover while a logical host's file system is busy.
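
To check whether a logical host's file system is busy before you attempt a switchover, you can use the fuser(1M) command; the mount point shown is a placeholder:


# fuser -c /hahost1/data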

Displaying LOG_DB_WARNING Messages for the SAP Probe

The Sun Cluster HA for SAP parameter LOG_DB_WARNING determines whether warning messages should be displayed if the Sun Cluster HA for SAP probe cannot connect to the database. When LOG_DB_WARNING is set to -y and the probe cannot connect to the database, a message is logged at the warning level in the local0 facility. By default, the syslogd(1M) daemon does not display these messages to /dev/console or to /var/adm/messages. To see these warnings, you must modify the /etc/syslog.conf file to display messages of local0.warning priority. For example:


...
*.err;kern.notice;auth.notice;local0.warning /dev/console
*.err;kern.debug;daemon.notice;mail.crit;local0.warning /var/adm/messages
...

After modifying the file, you must restart syslogd. See the syslog.conf(1M) and syslogd(1M) man pages for more information.
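
One way to restart syslogd on these Solaris releases is through its run control script, as sketched below; sending the daemon a HUP signal so that it re-reads syslog.conf is an alternative:


# /etc/init.d/syslog stop
# /etc/init.d/syslog start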