Administering Multihost Metadevices (Sun Cluster 2.2 System Administration Guide)

The following sections contain information about the differences between administering metadevices in the multihost Sun Cluster environment and in a single-host environment.

Unless noted in the following sections, you can use the instructions in the Solstice DiskSuite documentation.

Note -

The instructions in the Solstice DiskSuite books are relevant only for single-host configurations.

The following sections describe the Solstice DiskSuite command-line programs to use when performing a task. Optionally, you can use the metatool(1M) graphical user interface for all the tasks unless directed otherwise. Use the -s option when running metatool(1M), because it allows you to specify the diskset name.

Managing Metadevices

For ongoing management of metadevices, you must constantly monitor the metadevices for errors in operation, as discussed in "Monitoring Utilities".

When hastat(1M) reports a problem with a diskset, use the metastat(1M) command to locate the errored metadevice.

You must use the -s option when running either metastat(1M) or metatool(1M), so that you can specify the diskset name.

Note -

Save the metadevice configuration information whenever you change the configuration. Use metastat -p to create output similar to the contents of the md.tab file, and then save the output. Refer to "Saving Disk Partition Information (Solstice DiskSuite)" for details on saving partitioning data.
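As a minimal sketch of the note above, the following helper only prints the command that would save a diskset's configuration; it does not run it. The function name save_md_config_cmd, the diskset name hahost1, and the output path are assumptions chosen for illustration.

```shell
# Hypothetical helper: print (do not run) the command that saves a
# diskset's metadevice configuration in md.tab-like format.
save_md_config_cmd() {
    set_name=$1     # diskset name (example value, not from a real cluster)
    out_file=$2     # file that will hold the saved configuration
    printf 'metastat -s %s -p > %s\n' "$set_name" "$out_file"
}

save_md_config_cmd hahost1 /var/tmp/hahost1.md.tab
```

Review the printed command before running it as root on the node that currently owns the diskset.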

Adding a Mirror to a Diskset

Mirrored metadevices can be used as part of a logging UFS file system for Sun Cluster highly available applications.

Idle slices on disks within a diskset can be configured into metadevices by using the metainit(1M) command.
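The sequence below is a sketch of how two idle slices might be turned into a two-way mirror. It prints the commands rather than running them. The helper name mirror_cmds, the metadevice names (d10, d11, d12), and the slice names are hypothetical; the metattach(1M) step is the usual Solstice DiskSuite way to attach a second submirror and is not named in this section.

```shell
# Hypothetical helper: print the commands that build a two-way mirror
# from two idle slices in a diskset. Nothing is executed.
mirror_cmds() {
    set_name=$1; mirror=$2
    sub1=$3; slice1=$4      # first submirror and its slice
    sub2=$5; slice2=$6      # second submirror and its slice
    echo "metainit -s $set_name $sub1 1 1 $slice1"   # one-slice concat
    echo "metainit -s $set_name $sub2 1 1 $slice2"   # one-slice concat
    echo "metainit -s $set_name $mirror -m $sub1"    # one-way mirror
    echo "metattach -s $set_name $mirror $sub2"      # attach second half
}

mirror_cmds hahost1 d10 d11 c1t0d0s0 d12 c2t0d0s0
```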

Removing a Mirror From a Diskset

Sun Cluster highly available database applications can use raw mirrored metadevices for database storage. While these are not mentioned in the dfstab.logicalhost file or in the vfstab file for each logical host, they appear in the related Sun Cluster database configuration files. The mirror must be removed from these files, and the Sun Cluster database system must stop using the mirror. Then the mirror can be deleted by using the metaclear(1M) command.

Taking Submirrors Offline

If you are using SPARCstorage Arrays, note that before replacing or adding a disk drive in a SPARCstorage Array tray, all metadevices on that tray must be taken offline.

In symmetric configurations, taking the submirrors offline for maintenance is complex because disks from each of the two disksets might be in the same tray in the SPARCstorage Array. You must take the metadevices from each diskset offline before removing the tray.

Use the metaoffline(1M) command to take offline all submirrors on every disk in the tray.
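A sketch of the offline step for a tray shared by two disksets follows. The helper prints one metaoffline(1M) command per mirror and submirror pair. The helper name offline_cmds, the diskset names, and the mirror:submirror pairs are assumptions; use metastat(1M) output to determine which submirrors actually reside on the tray.

```shell
# Hypothetical helper: print a metaoffline command for each
# mirror:submirror pair in one diskset. Nothing is executed.
offline_cmds() {
    set_name=$1
    shift
    for pair in "$@"; do
        mirror=${pair%%:*}       # text before the colon
        submirror=${pair#*:}     # text after the colon
        echo "metaoffline -s $set_name $mirror $submirror"
    done
}

# In a symmetric configuration, cover both disksets that use the tray.
offline_cmds hahost1 d10:d11 d20:d21
offline_cmds hahost2 d10:d12
```

After the tray is back in service, the same pairs can be brought online again with metaonline(1M).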

Creating New Metadevices

After a disk is added to a diskset, create new metadevices using metainit(1M) or metatool(1M). If the new devices will be hot spares, use the metahs(1M) command to place the hot spares in a hot spare pool.

Replacing Errored Components

When replacing an errored metadevice component, use the metareplace(1M) command.

A replacement slice (or disk) must be available. This could be an existing device that is not in use, or a new device that you have added to the diskset.

You can also use the metareplace -e command to return to service drives that have sustained transient errors (for example, as a result of a chassis power failure).
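The two forms of metareplace(1M) described above can be sketched as follows. Both helpers only print the command. The helper names, the diskset hahost1, the mirror d10, and the slice names are hypothetical.

```shell
# Hypothetical helpers: print (do not run) the two metareplace forms.

# Replace an errored component with a different slice.
replace_cmd() {
    set_name=$1; mirror=$2; old_slice=$3; new_slice=$4
    echo "metareplace -s $set_name $mirror $old_slice $new_slice"
}

# Re-enable the same component after a transient error.
enable_cmd() {
    set_name=$1; mirror=$2; slice=$3
    echo "metareplace -s $set_name -e $mirror $slice"
}

replace_cmd hahost1 d10 c1t0d0s0 c1t3d0s0
enable_cmd  hahost1 d10 c1t0d0s0
```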

Deleting Metadevices

Before deleting a metadevice, verify that none of the components in the metadevice is in use by Sun Cluster HA for NFS. Then use the metaclear(1M) command to delete the metadevice.

Growing Metadevices

To grow a metadevice, you must have at least two slices (disks) available in different multihost disk expansion units. Add each of the two new slices to a different submirror with the metainit(1M) command. Then use the growfs(1M) command to grow the file system.

Caution -

When the growfs(1M) command is running, clients might experience interruptions of service.

If a takeover occurs while the file system is growing, the file system will not be grown. You must reissue the growfs(1M) command after the takeover completes.

Note -

The file system that contains /logicalhost/statmon cannot be grown. Because the statd(1M) program modifies this directory, it would be blocked for extended periods while the file system is growing. This would have unpredictable effects on the network file locking protocol. This is a problem only for configurations using Sun Cluster HA for NFS.
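The growfs(1M) step in this section might look like the sketch below, which prints the command for a mounted file system on a metadevice. The helper name growfs_cmd, the diskset hahost1, the metadevice d1, and the mount point are assumptions for illustration.

```shell
# Hypothetical helper: print (do not run) the growfs command for a
# mounted UFS file system that lives on a diskset metadevice.
# Do not use this on the file system that contains /logicalhost/statmon.
growfs_cmd() {
    set_name=$1; metadevice=$2; mount_point=$3
    echo "growfs -M $mount_point /dev/md/$set_name/rdsk/$metadevice"
}

growfs_cmd hahost1 d1 /hahost1/1
```

If a takeover interrupts the operation, the same command would be issued again after the takeover completes.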

Managing Hot Spare Pools

You can add or delete hot spare devices to or from hot spare pools at any time, as long as they are not in use. In addition, you can create new hot spare pools and associate them with submirrors using the metahs(1M) command.
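As a sketch of adding and deleting hot spares, the helpers below print metahs(1M) commands that use its -a (add) and -d (delete) options. The helper names, the pool name hsp001, the diskset name, and the slice name are hypothetical.

```shell
# Hypothetical helpers: print (do not run) metahs commands for a pool.

# Add a slice to a hot spare pool.
hs_add_cmd() {
    set_name=$1; pool=$2; slice=$3
    echo "metahs -s $set_name -a $pool $slice"
}

# Delete a slice that is not in use from a hot spare pool.
hs_delete_cmd() {
    set_name=$1; pool=$2; slice=$3
    echo "metahs -s $set_name -d $pool $slice"
}

hs_add_cmd    hahost1 hsp001 c2t4d0s0
hs_delete_cmd hahost1 hsp001 c2t4d0s0
```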

Managing UFS Logs

All UFS logs on multihost disks are mirrored. When a submirror fails, it is reported as an errored component. Repair the failure using either metareplace(1M) or metatool(1M).

If the entire mirror that contains the UFS log fails, you must unmount the file system, back up any accessible data, repair the error, repair the file system (using fsck(1M)), and remount the file system.

Adding UFS Logging to a Logical Host

All UFS file systems within a logical host must be logging UFS file systems to ensure that the failover or haswitch(1M) timeout criteria can be met. This facilitates fast switchovers and takeovers.

The logging UFS file system is set up by creating a trans device with a mirrored logging device and a mirrored UFS master file system. Both the logging device and UFS master device must be mirrored.

Typically, Slice 6 of each drive in a diskset can be used as a UFS log. The slices can be used for UFS log submirrors. If the slices are smaller than the log size you want, several can be concatenated. Typically, one Mbyte per 100 Mbytes is adequate for UFS logs, up to a maximum of 64 Mbytes. Ideally, log slices should be drive-disjoint from the UFS master device.
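The sizing guideline above can be expressed as a small calculation. This is a sketch only: the function name ufs_log_mb is hypothetical, and rounding up to the next whole Mbyte is an assumption, not something the guideline states.

```shell
# Hypothetical helper: suggest a UFS log size in Mbytes using the
# guideline of 1 Mbyte per 100 Mbytes of file system, capped at 64.
ufs_log_mb() {
    fs_mb=$1
    log_mb=$(( (fs_mb + 99) / 100 ))   # assumption: round up
    if [ "$log_mb" -gt 64 ]; then
        log_mb=64                      # guideline maximum
    fi
    echo "$log_mb"
}

ufs_log_mb 2000     # 2-Gbyte file system
ufs_log_mb 10000    # large file system, hits the 64-Mbyte cap
```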

Note -

If you must repartition the disk to gain space for UFS logs, then preserve the existing Slice 7, which starts on Cylinder 0 and contains at least two Mbytes. This space is required and reserved for metadevice state database replicas. The Tag and Flag fields (as reported by the format(1M) command) must be preserved for Slice 7. The metaset(1M) command sets the Tag and Flag fields correctly when the initial configuration is built.

After the trans device has been configured, create the UFS file system using newfs(1M) on the trans device.
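The trans device and newfs(1M) steps can be sketched as the two commands printed below. The sketch assumes that the mirrored master device (d10) and the mirrored logging device (d20) already exist in the diskset; those names, the trans device name d1, and the helper name trans_cmds are hypothetical.

```shell
# Hypothetical helper: print (do not run) the commands that create a
# trans device from an existing mirrored master and mirrored log, and
# then create a UFS file system on it.
trans_cmds() {
    set_name=$1; trans=$2; master=$3; log=$4
    echo "metainit -s $set_name $trans -t $master $log"
    echo "newfs /dev/md/$set_name/rdsk/$trans"
}

trans_cmds hahost1 d1 d10 d20
```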

After the newfs process completes, add the UFS file system to the vfstab file for the logical host by editing the /etc/opt/SUNWcluster/conf/hanfs/vfstab.logicalhost file to update the administrative and multihost UFS file system information.

Make sure that the vfstab.logicalhost files of all cluster nodes contain the same information. Use the cconsole(1) facility to make simultaneous edits to vfstab.logicalhost files on all nodes in the cluster.

The following sample vfstab.logicalhost file shows the administrative file system and four other UFS file systems:

#device                  device                    mount       FS    fsck  mount  mount
#to mount                to fsck                   point       type  pass  all    options
/dev/md/hahost1/dsk/d11  /dev/md/hahost1/rdsk/d11  /hahost1    ufs   1     no     -
/dev/md/hahost1/dsk/d1   /dev/md/hahost1/rdsk/d1   /hahost1/1  ufs   1     no     -
/dev/md/hahost1/dsk/d2   /dev/md/hahost1/rdsk/d2   /hahost1/2  ufs   1     no     -
/dev/md/hahost1/dsk/d3   /dev/md/hahost1/rdsk/d3   /hahost1/3  ufs   1     no     -
/dev/md/hahost1/dsk/d4   /dev/md/hahost1/rdsk/d4   /hahost1/4  ufs   1     no     -
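Because a mistyped diskset name in one column is easy to miss, a quick consistency check can help before the file is copied to the other nodes. The following is a sketch, not a Sun Cluster utility: the function name check_vfstab is hypothetical, and it assumes every entry uses /dev/md/<diskset>/dsk/<dN> and /dev/md/<diskset>/rdsk/<dN> paths as in the sample.

```shell
# Hypothetical check: read vfstab-style lines on standard input and
# report entries whose block and raw device paths do not name the same
# diskset and metadevice. Exits nonzero if any mismatch is found.
check_vfstab() {
    awk '
        BEGIN { bad = 0 }
        /^#/ || NF == 0 { next }          # skip comments and blank lines
        {
            nb = split($1, b, "/")        # "", dev, md, set, dsk,  dN
            nr = split($2, r, "/")        # "", dev, md, set, rdsk, dN
            if (nb != 6 || nr != 6 || b[5] != "dsk" || r[5] != "rdsk" ||
                b[4] != r[4] || b[6] != r[6]) {
                printf "line %d: %s does not match %s\n", NR, $1, $2
                bad = 1
            }
        }
        END { exit bad }
    '
}
```

For example, running check_vfstab < /etc/opt/SUNWcluster/conf/hanfs/vfstab.logicalhost prints nothing when every entry is consistent.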

If the file system will be shared by Sun Cluster HA for NFS, follow the procedure for sharing NFS file systems as described in Chapter 11 in the Sun Cluster 2.2 Software Installation Guide.

The new file system will be mounted automatically at the next membership monitor reconfiguration. To force membership reconfiguration, use the following command:

# haswitch -r