Configure your local and multihost disks for Solstice DiskSuite by using the guidelines in this appendix along with the information in Chapter 2, Planning the Configuration. Then create your md.tab file using the guidelines and examples in this appendix. Refer to your Solstice DiskSuite documentation for more details about creating an md.tab file. It is easiest to create the md.tab file as you plan and design your metadevice configuration, and then to copy it to each of the Sun Cluster nodes after installing the Sun Cluster and Solstice DiskSuite software.
This appendix includes the following sections:
Table B-1 shows the high-level steps to configure Solstice DiskSuite to work with Sun Cluster. The tasks should be performed in the order shown.
Table B-1 High-Level Steps to Configure Solstice DiskSuite
| Task | Go To ... |
|---|---|
| Planning your Solstice DiskSuite configuration | |
| Calculating the quantity of metadevice names needed | |
| Preparing the disk IDs by running the scdidadm(1M) command | |
| Creating metadevice state database replicas on the local (private) disks by running the metadb(1M) command | |
| (Optional) Mirroring the root (/) file system | |
| Creating the disksets by running the metaset(1M) command | |
| Adding drives to the diskset by running the metaset(1M) command | |
| (Optional) Repartitioning drives in a diskset | |
| Setting up the md.tab file to create metadevices on disksets | |
| "Running" the md.tab file by using the metainit(1M) command | |
| Configuring file systems for each logical host | |
The following sections describe all the procedures necessary to configure Solstice DiskSuite with Sun Cluster.
If your cluster has only two disk storage units (two drive strings), you must configure Solstice DiskSuite Mediators. Refer to the Sun Cluster 2.2 System Administration Guide for details on configuring and administering mediators.
Use the procedures in this section to configure the following:
(Optional) Number of metadevice names
Disk IDs
Local metadevice state database replicas
(Optional) Mirrored root (/) file system
Disksets
Drives in a diskset
(Optional) Drive partitions
md.tab file
File systems
For convenience, modify your PATH variable to include /usr/opt/SUNWmd/sbin (for Solstice DiskSuite 4.2) or /usr/sbin (for Solstice DiskSuite 4.2.1).
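For example, in a Bourne or Korn shell you might add the Solstice DiskSuite 4.2 command directory to root's PATH as follows (a sketch; with Solstice DiskSuite 4.2.1, /usr/sbin is normally already in the default PATH):

phys-hahost1# PATH=$PATH:/usr/opt/SUNWmd/sbin
phys-hahost1# export PATH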
You must calculate the number of Solstice DiskSuite metadevice names needed for your configuration before you set up the configuration. The default number of metadevice names is 128. Many configurations will need more than the default. Increasing this number before implementing a configuration will save administration time later on.
Calculate the quantity of metadevice names needed by determining the largest metadevice name value to be used in each diskset.
This requirement is based on the metadevice name value rather than on the actual quantity. For example, if your metadevice names range from d950 to d1000, Solstice DiskSuite will require one thousand names, not fifty.
If the calculated quantity exceeds 128, you must edit the /kernel/drv/md.conf file.
Set the nmd field in /kernel/drv/md.conf to the largest metadevice name value used in a diskset.
Changes to the /kernel/drv/md.conf file do not take effect until a reconfiguration reboot is performed. The md.conf file must be identical on all cluster nodes.
Refer to "Configuration Worksheets", for worksheets to help you plan your metadevice configuration.
The Solstice DiskSuite documentation states that the only modifiable field in the /kernel/drv/md.conf file is the nmd field. However, you can modify the md_nsets field as well if you want to configure additional disksets.
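As a sketch, if the largest metadevice name used in any diskset is d1000 (as in the earlier example), you would set nmd=1000. The relevant line in /kernel/drv/md.conf typically resembles the following; the md_nsets value shown here is only a placeholder for your own diskset count, and the rest of the file is left unchanged:

name="md" parent="pseudo" nmd=1000 md_nsets=4;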
All new installations running Solstice DiskSuite require the Disk ID (DID) pseudo driver to make use of disk IDs. Disk IDs enable metadevices to locate data independent of the device name of the underlying disk. Configuration changes and hardware updates no longer cause problems, because the data is located by disk ID rather than by device name.
To create a mapping between a disk ID and a disk path, you run the scdidadm(1M) command from node 0. The scdidadm(1M) command sets up three components:
DID instance number - This is a "shorthand" number assigned to the physical disk, such as "1."
Full path (physical path) - This is the full path to the raw disk device, such as phys-hahost3:/dev/rdsk/c0t0d0.
Full name (pseudo path) - This is the full path of the DID pseudo device, such as /dev/did/rdsk/d1.
The Solstice HA 1.3 release supported two-node clusters only. In this two-node configuration, both nodes were required to be configured identically on identical platforms, so the major/minor device numbers used by the Solstice DiskSuite device driver were the same on both systems. In configurations with more than two nodes, it is difficult to ensure that the minor numbers of the disks are identical on all nodes within a cluster; the same disk might have different major/minor numbers on different nodes. The DID driver uses a generated DID device name to access a disk that might have different major/minor numbers on different nodes.
Although use of the DID driver is required for clusters using Solstice DiskSuite with more than two nodes, the requirement has been generalized to all new Solstice DiskSuite installations. This enables future conversion of two-node Solstice DiskSuite configurations to greater than two-node configurations.
If you are upgrading from HA 1.3 to Sun Cluster 2.2, you do not need to run the scdidadm(1M) command.
To set up a Solstice DiskSuite configuration using the DID driver, complete this procedure.
If you have a previously generated md.tab file to convert to use disk IDs, you can use the script included in "DID Conversion Script", to help with the conversion.
Run the scdidadm(1M) command to create a mapping between a disk ID instance number and the local and remote paths to the disk.
Perform this step after running the scinstall(1M) command, with the cluster up. To maintain one authoritative copy of the DID configuration file, you can run the command only on node 0 while all nodes are up; otherwise it fails. The get_node_status(1M) command includes the node ID number as part of its output. Refer to the scdidadm(1M) man page for details.
phys-hahost1# scdidadm -r
You must run the scdidadm(1M) command from cluster node 0.
If the scdidadm(1M) command is unable to discover the private links of the other cluster nodes, run this version of the command from node 0.
phys-hahost1# scdidadm -r -H hostname1,hostname2,...
Make sure the appropriate host name for node 0 is in the /.rhosts files of the other cluster nodes when using this option. Do not include the host name of the cluster node from which you run the command in the hostname list.
Use the DID mappings to update your md.tab file.
Refer to "Troubleshooting DID Driver Problems", if you receive the following error message:
The did entries in name_to_major must be the same on all nodes.
Correct the problem, then rerun the scdidadm(1M) command.
Once the mapping between DID instance numbers and disk IDs has been created, use the full DID names when adding drives to a diskset and in the md.tab file in place of the lower level device names (cXtXdX). The -l option to the scdidadm(1M) command shows a list of the mappings to help generate your md.tab file. In the following example, the first column of output is the DID instance number, the second column is the full path (physical path), and the third column is the full name (pseudo path):
phys-hahost1# scdidadm -l
60       phys-hahost3:/dev/rdsk/c4t5d2      /dev/did/rdsk/d60
59       phys-hahost3:/dev/rdsk/c4t5d1      /dev/did/rdsk/d59
58       phys-hahost3:/dev/rdsk/c4t5d0      /dev/did/rdsk/d58
57       phys-hahost3:/dev/rdsk/c4t4d2      /dev/did/rdsk/d57
56       phys-hahost3:/dev/rdsk/c4t4d1      /dev/did/rdsk/d56
55       phys-hahost3:/dev/rdsk/c4t4d0      /dev/did/rdsk/d55
...
6        phys-hahost3:/dev/rdsk/c0t1d2      /dev/did/rdsk/d6
5        phys-hahost3:/dev/rdsk/c0t1d1      /dev/did/rdsk/d5
4        phys-hahost3:/dev/rdsk/c0t1d0      /dev/did/rdsk/d4
3        phys-hahost3:/dev/rdsk/c0t0d2      /dev/did/rdsk/d3
2        phys-hahost3:/dev/rdsk/c0t0d1      /dev/did/rdsk/d2
1        phys-hahost3:/dev/rdsk/c0t0d0      /dev/did/rdsk/d1
Proceed to "Creating Local Metadevice State Database Replicas" to create local replicas.
If you have problems with the DID driver, refer to "Troubleshooting DID Driver Problems".
In previous releases, Solstice DiskSuite depended on the major number and instance number of the low-level disk device being the same on the two nodes connected to the disk. With this release of Sun Cluster, Solstice DiskSuite requires that the DID major number be the same on all nodes and that the instance number of the DID device be the same on all nodes. The scdidadm(1M) command checks the DID major number on all nodes. The value recorded in the /etc/name_to_major file must be the same on all nodes.
If the scdidadm(1M) command finds that the major number is different, it will report this and ask you to fix the problem and re-run the scdidadm(1M) command. The DID driver uses major number 149; if there is a numbering conflict, you must choose another number for the DID driver. The following procedure enables you to make the necessary changes.
Choose a number that does not conflict with any other entry in the /etc/name_to_major file.
Edit the /etc/name_to_major file on each node and change the DID entry to the number you chose.
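For example, if you chose 300 (a hypothetical value used only for illustration) as the new major number, the did line in /etc/name_to_major on each node would be changed to read:

did 300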
On each node where the /etc/name_to_major file was updated, execute the following commands.
phys-hahost1# rm -rf /devices/pseudo/did* /dev/did
phys-hahost1# reboot -- -r
...
On the node used to run the scdidadm(1M) command, execute the following commands.
phys-hahost3# rm -f /etc/did.conf
phys-hahost3# scdidadm -r
This procedure resolves mapping conflicts and reconfigures the cluster with the new mappings.
If you have a previously generated md.tab file to convert to use disk IDs, you can use the following script to help with the conversion. The script checks the md.tab file for physical device names, such as /dev/dsk/c0t0d0 or c0t0d0, and converts these names to the full DID name, such as /dev/did/rdsk/d60.
more phys_to_did
#! /bin/sh
#
# ident "@(#)phys_to_did 1.1 98/05/07 SMI"
#
# Copyright (c) 1997-1998 by Sun Microsystems, Inc.
# All rights reserved.
#
# Usage: phys_to_did <md.tab filename>
# Converts $1 to did-style md.tab file.
# Writes new style file to stdout.

MDTAB=$1
TMPMD1=/tmp/md.tab.1.$$
TMPMD2=/tmp/md.tab.2.$$
TMPDID=/tmp/didout.$$

# Determine whether we have a "physical device" md.tab or a "did" md.tab.
# If "physical device", convert to "did".
grep "\/dev\/did" $MDTAB > /dev/null 2>&1
if [ $? -eq 0 ]; then
        # no conversion needed
        lmsg=`gettext "no conversion needed"`
        printf "${lmsg}\n"
        exit 0
fi

scdidadm -l > $TMPDID
if [ $? -ne 0 ]; then
        lmsg=`gettext "scdidadm -l failed"`
        printf "${lmsg}\n"
        exit 1
fi

cp $MDTAB $TMPMD1
...
...
# Devices can be specified in md.tab as /dev/rdsk/c?t?d? or simply c?t?d?
# There can be multiple device names on a line.
# We know all the possible c.t.d. names from the scdidadm -l output.
# First strip all /dev/*dsk/ prefixes.
sed -e 's:/dev/rdsk/::g' -e 's:/dev/dsk/::g' $TMPMD1 > $TMPMD2

# Next replace the resulting physical disk names "c.t.d." with
# /dev/did/rdsk/<instance>
exec < $TMPDID
while read instance fullpath fullname
do
        old=`basename $fullpath`
        new=`basename $fullname`
        sed -e 's:'$old':/dev/did/rdsk/'$new':g' $TMPMD2 > $TMPMD1
        mv $TMPMD1 $TMPMD2
done

cat $TMPMD2
rm -f $TMPDID $TMPMD1 $TMPMD2
exit 0
Before you can perform any Solstice DiskSuite configuration tasks, such as creating disksets on the multihost disks or mirroring the root (/) file system, you must create the metadevice state database replicas on the local (private) disks on each cluster node. The local disks are separate from the multihost disks. The state databases located on the local disks are necessary for the operation of Solstice DiskSuite.
Perform this procedure on each node in the cluster.
As root, use the metadb(1M) command to create local replicas on each cluster node's system disk.
For example, this command creates three metadevice state database replicas on Slice 7 of the local disk.
# metadb -afc 3 c0t0d0s7
The -c option creates the replicas on the same slice. This example uses Slice 7, but you can use any free slice.
Use the metadb(1M) command to verify the replicas.
# metadb
        flags           first blk       block count
     a       u          16              1034            /dev/dsk/c0t0d0s7
     a       u          1050            1034            /dev/dsk/c0t0d0s7
     a       u          2084            1034            /dev/dsk/c0t0d0s7
You can mirror the root (/) file system to prevent the cluster node itself from going down due to a system disk failure. Refer to Chapter 2, Planning the Configuration, for more information.
The high-level steps to mirror the root (/) file system are as follows (an example command sequence appears after the list):
Using the metainit(1M) -f command to put the root slice in a single slice (one-way) concatenation (submirror1)
Creating a second concatenation (submirror2)
Using the metainit(1M) command to create a one-way mirror with submirror1
Running the metaroot(1M) command
Running the lockfs(1M) command
Rebooting
Using the metattach(1M) command to attach submirror2
Recording the alternate boot path
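The following sequence is a sketch of these steps, assuming the root (/) file system is on c0t0d0s0, the second submirror is on c0t1d0s0, and metadevices d10, d11, and d12 are unused; substitute the slices and metadevice names that apply to your configuration.

phys-hahost1# metainit -f d11 1 1 c0t0d0s0
phys-hahost1# metainit d12 1 1 c0t1d0s0
phys-hahost1# metainit d10 -m d11
phys-hahost1# metaroot d10
phys-hahost1# lockfs -fa
phys-hahost1# reboot
phys-hahost1# metattach d10 d12
phys-hahost1# ls -l /dev/rdsk/c0t1d0s0

The final ls -l command displays the physical device path of the second submirror, which you record as the alternate boot path.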
For more information, refer to the metainit(1M), metaroot(1M), and metattach(1M) man pages and to the Solstice DiskSuite documentation.
A diskset is a set of multihost disk drives containing Solstice DiskSuite objects that can be accessed exclusively (but not concurrently) by multiple hosts. To create a diskset, root must be a member of Group 14.
When creating your disksets, use the following rules to ensure correct operation of the cluster in the event of disk enclosure failure:
If exactly two "strings" are being used, the diskset should have the same number of physical disks on the two strings.
For the two-string configuration, mediators are required. Refer to the Sun Cluster 2.2 System Administration Guide for details on setting up mediators.
If more than two strings are being used, for example three strings, then you must ensure that for any two strings S1 and S2, the sum of the number of disks on those strings exceeds the number of disks on the third string S3. This is expressed as the following formula: count(S1) + count(S2) > count(S3).
Perform this procedure for each diskset in the cluster. All nodes in the cluster must be up. Creating a diskset involves assigning hosts and disk drives to the diskset.
Make sure the local metadevice state database replicas exist.
If necessary, refer to the procedure "How to Create Local Metadevice State Database Replicas".
As root, create the disksets by running the metaset(1M) command from one of the cluster nodes.
For example, this command creates two disksets, hahost1 and hahost2, consisting of nodes phys-hahost1 and phys-hahost2.
phys-hahost1# metaset -s hahost1 -a -h phys-hahost1 phys-hahost2
phys-hahost1# metaset -s hahost2 -a -h phys-hahost1 phys-hahost2
Check the status of the new disksets by running the metaset(1M) command.
phys-hahost1# metaset
You are now ready to add drives to the diskset, as explained in the procedure "How to Add Drives to a Diskset".
When a drive is added to a diskset, Solstice DiskSuite repartitions it as follows so that the metadevice state database for the diskset can be placed on the drive.
A small portion of each drive is reserved in Slice 7 for use by Solstice DiskSuite. The remainder of the space on each drive is placed into Slice 0.
Drives are repartitioned when they are added to the diskset only if Slice 7 is not set up correctly.
Any existing data on the disks is lost by the repartitioning.
If Slice 7 starts at Cylinder 0, and the disk is large enough to contain a state database replica, the disk is not repartitioned.
After adding a drive to a diskset, you may repartition it as necessary, with the exception that Slice 7 is not altered in any way. Refer to "How to Repartition Drives in a Diskset", and to Chapter 2, Planning the Configuration, for recommendations on how to set up your multihost disk partitions.
If you repartition a disk manually, create a Partition 7 starting at Cylinder 0 that is large enough to hold a state database replica (approximately 2 Mbytes). The Flag field in Slice 7 must have V_UNMT (unmountable) set and must not be set to read-only. Slice 7 must not overlap with any other slice on the disk. Do this to prevent the metaset(1M) command from repartitioning the disk.
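As an illustration only (the sector counts depend on your disk geometry, and c1t0d0 is a hypothetical device name), a prtvtoc(1M) listing for a disk partitioned this way might show Slice 7 starting at sector 0, flagged unmountable (01), and sized at roughly 2 Mbytes, with Slice 0 occupying the remainder of the disk:

phys-hahost1# prtvtoc /dev/rdsk/c1t0d0s2
* Partition  Tag  Flags    First Sector  Sector Count  Last Sector
       0      4    00             4200      17674020     17678219
       7      4    01                0          4200         4199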
Use this procedure to add drives to a diskset.
Make sure you have prepared the configuration to use the DID driver, and that the disksets have been created.
If necessary, refer to "How to Prepare the Configuration to Use the DID Driver" and "How to Create a Diskset".
As root, use the metaset(1M) command to add the drives to the diskset.
Use the DID driver name for the disk drives rather than the character device name. For example:
phys-hahost1# metaset -s hahost1 -a /dev/did/dsk/d1 /dev/did/dsk/d2
phys-hahost1# metaset -s hahost2 -a /dev/did/dsk/d3 /dev/did/dsk/d4
Use the metaset(1M) command to verify the status of the disksets and drives.
phys-hahost1# metaset -s hahost1
phys-hahost1# metaset -s hahost2
(Optional) Refer to "Planning and Layout of Disks", to optimize multihost disk slices.
The metaset(1M) command repartitions drives in a diskset so that a small portion of each drive is reserved in Slice 7 for use by Solstice DiskSuite. The remainder of the space on each drive is placed into Slice 0. To make more effective use of the disk, use the procedure in this section to modify the disk layout.
Use the format(1M) command to change the disk partitioning for the majority of drives as shown in Table B-2.
Table B-2 Multihost Disk Partitioning for Most Drives
| Slice | Description |
|---|---|
| 7 | 2 Mbytes, reserved for Solstice DiskSuite |
| 6 | UFS logs |
| 0 | remainder of the disk |
| 2 | overlaps Slices 6 and 0 |
In general, if UFS logs are created, the default size for Slice 6 should be 1 percent of the size of the largest multihost disk found on the system.
The overlap of Slices 6 and 0 by Slice 2 is used for raw devices where there are no UFS logs.
Partition a drive on each of the first two controllers in each of the disksets as shown in Table B-3.
Table B-3 shows the partitioning for the first drive on each of the first two controllers. You are not required to use the first drives or the first two controllers if you have more than two.
Table B-3 Multihost Disk Partitioning-First Drive, First Two Controllers
| Slice | Description |
|---|---|
| 7 | 2 Mbytes, reserved for Solstice DiskSuite |
| 5 | 2 Mbytes, UFS log for HA administrative file systems |
| 4 | 9 Mbytes, UFS master for HA administrative file systems |
| 6 | UFS logs |
| 0 | remainder of the disk |
| 2 | overlaps Slices 6 and 0 |
Partition 7 should be reserved for use by Solstice DiskSuite as the first 2 Mbytes on each multihost disk.
This section describes how to use the md.tab file to configure metadevices and hot spare pools. Note that with Solstice DiskSuite 4.2, the md.tab file is located in /etc/opt/SUNWmd. With Solstice DiskSuite 4.2.1, the md.tab file is located in /etc/lvm.
If you have an existing md.tab file and want to convert it to use disk IDs, you can use the script in "DID Conversion Script", to help with the conversion.
The md.tab file can be used by the metainit(1M) command to configure metadevices and hot spare pools in a batch-like mode. Solstice DiskSuite does not store configuration information in the md.tab file. The only way information appears in the md.tab is through editing it by hand.
When using the md.tab file, each metadevice or hot spare pool in the file must have a unique entry. Entries can include simple metadevices (stripes, concatenations, and concatenations of stripes); mirrors, trans metadevices, and RAID5 metadevices; and hot spare pools.
Because md.tab only contains entries that are manually included in it, you should not rely on the file for the current configuration of metadevices, hot spare pools, and replicas on the system at any given time.
Tabs, spaces, comments (preceded by a pound sign, #), and continuation of lines (preceded by a backslash-newline) are allowed.
Follow these guidelines when setting up your disk configuration and the associated md.tab file.
It is advisable to maintain identical md.tab files on each node in the cluster to ease administration.
A multihost disk and all the partitions found on that disk can be included in no more than one diskset.
All metadevices used by data services must be fully mirrored. Two-way mirrors are recommended, but three-way mirrors are acceptable.
No components of a submirror for a given mirror should be found on the same controller as any other component in any other submirror used to define that mirror.
If more than two disk strings are used, each diskset must include disks from at least three separate controllers. If only two disk strings are used, each diskset must include disks from the two controllers and mediators will be configured. See the Sun Cluster 2.2 System Administration Guide for more information about using dual-string mediators.
Hot spares are recommended, but not required. If hot spares are used, configure them so that the activation of any hot spare drive will not result in components of a submirror for a given metamirror sharing the same controller with any other component in any other submirror used to define that given metamirror.
If you are using Solaris UFS logging, you need to set up only mirrored metadevices in the md.tab files; trans metadevices are not necessary.
If you are using Solstice DiskSuite logging, create multihost file systems on trans metadevices only. Both the logging and master device components of each trans metadevice must be mirrored.
If you are using Solstice DiskSuite logging, do not share spindles between logging and master device components of the same trans metadevice, unless the devices are striped across multiple drives. Otherwise, system performance will be degraded.
Each diskset has a small administrative file system associated with it. This file system is not NFS shared. It is used for data service-specific state or configuration information. The administrative file system should be named the same as the logical host that is the default master for the diskset. This strategy is necessary to enable start up of DBMS fault monitors.
The ordering of lines in the md.tab file is not important, but construct your file in the "top down" fashion described below. The following sample md.tab file defines the metadevices for the diskset named green. The # character can be used to annotate the file. In this example, the logical host name is also green.
# administrative file system for logical host mounted under /green
green/d0 -t green/d1 green/d4
green/d1 -m green/d2 green/d3
green/d2 1 1 /dev/did/rdsk/d1s4
green/d3 1 1 /dev/did/rdsk/d2s4
green/d4 -m green/d5 green/d6
green/d5 1 1 /dev/did/rdsk/d3s5
green/d6 1 1 /dev/did/rdsk/d4s5
# /green/web
green/d10 -t green/d11 green/d14
green/d11 -m green/d12 green/d13
green/d12 1 1 /dev/did/rdsk/d1s0
green/d13 1 1 /dev/did/rdsk/d2s0
green/d14 -m green/d15 green/d16
green/d15 1 1 /dev/did/rdsk/d3s6
green/d16 1 1 /dev/did/rdsk/d4s6
# /green/home to be NFS-shared
green/d20 -t green/d21 green/d24
green/d21 -m green/d22 green/d23
green/d22 1 1 /dev/did/rdsk/d3s0
green/d23 1 1 /dev/did/rdsk/d4s0
green/d24 -m green/d25 green/d26
green/d25 1 1 /dev/did/rdsk/d1s6
green/d26 1 1 /dev/did/rdsk/d2s6
The first line defines the administrative file system as the trans metadevice d0 to consist of a master (UFS) metadevice d1 and a log device d4. The -t signifies this is a trans metadevice; the master and log devices are implied by their position after the -t flag.
The second line defines the master device, d1, as a mirror of two submirrors. The -m in this definition signifies a mirror device.
green/d1 -m green/d2 green/d3
The fifth line similarly defines the log device, d4, as a mirror of metadevices.
green/d4 -m green/d5 green/d6
The third line defines the first submirror of the master device as a one-way stripe.
green/d2 1 1 /dev/did/rdsk/d1s4
The next line defines the other master submirror.
green/d3 1 1 /dev/did/rdsk/d2s4
Finally, the log device submirrors are defined. In this example, simple metadevices for each submirror are created.
green/d5 1 1 /dev/did/rdsk/d3s5
green/d6 1 1 /dev/did/rdsk/d4s5
Similarly, mirrors are created for two other applications: d10 will contain a Web server and files, and d20 will contain an NFS-shared file system.
If you have existing data on the disks that will be used for the submirrors, you must back up the data before metadevice setup and restore it onto the mirror.
This procedure assumes that you have ownership of the diskset on the node on which you will execute the metainit(1M) command. It is also assumed that you have configured identical md.tab files on each node in the cluster. With Solstice DiskSuite 4.2, these files must be located in the /etc/opt/SUNWmd directory. With Solstice DiskSuite 4.2.1, the files must be in /etc/lvm.
As root, initialize the md.tab file by running the metainit(1M) command.
Take control of the diskset:
phys-hahost1# metaset -s hahost1 -t
Initialize the md.tab file. The -a option activates all metadevices defined in the md.tab file. For example, this command initializes the md.tab file for diskset hahost1.
phys-hahost1# metainit -s hahost1 -a
Repeat this for each diskset in the cluster.
If necessary, run the metainit(1M) command from another node that has connectivity to the disks. This is required for clustered pair and ring topologies, where the disks are not accessible by all nodes.
Use the metastat(1M) command to check the status of the metadevices.
phys-hahost1# metastat -s hahost1
You can create logging UFS multihost file systems in the Sun Cluster/Solstice DiskSuite environment by using either of these methods:
Creating a metatrans device, consisting of a master device and a logging device
Using the logging feature in the Solaris 7 or Solaris 8 operating environment (see the example following this list)
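For the second method, Solaris UFS logging is enabled with the logging mount option rather than with a trans metadevice. As a sketch (the metadevice and mount point names are illustrative), the option can be supplied on the mount command line or in the mount options field of the vfstab.logicalhost entry:

phys-hahost1# mount -o logging /dev/md/hahost1/dsk/d1 /hahost1/1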
This procedure explains how to create multihost UFS file systems, including the administrative file system that is a requirement for each diskset.
For each diskset, identify or create the metadevices to contain the file systems.
It is recommended that you create a trans metadevice for the administrative file system consisting of these components:
Master device: mirror using two 2-Mbyte slices on Slice 4 on Drive 1 on the first two controllers
Logging device: mirror using two 2-Mbyte slices on Slice 6 on Drive 1 on the first two controllers
Make sure you have ownership of the diskset.
If you are creating multihost file systems as part of your initial setup, you should already have diskset ownership. If necessary, refer to the metaset(1M) man page for information on taking diskset ownership.
Create the HA administrative file system.
Run the newfs(1M) command.
This example creates the file system on the trans metadevice d11.
phys-hahost1# newfs /dev/md/hahost1/rdsk/d11
The process of creating the file system destroys any data on the disks.
Create the directory mount point for the HA administrative file system.
Always use the logical host name as the mount point. This strategy is necessary to enable start up of DBMS fault monitors.
phys-hahost1# mkdir /hahost1
Mount the HA administrative file system.
phys-hahost1# mount /dev/md/hahost1/dsk/d11 /hahost1
Create the multihost UFS file systems.
Run the newfs(1M) command.
This example creates file systems on trans metadevices d1, d2, d3, and d4.
phys-hahost1# newfs /dev/md/hahost1/rdsk/d1
phys-hahost1# newfs /dev/md/hahost1/rdsk/d2
phys-hahost1# newfs /dev/md/hahost1/rdsk/d3
phys-hahost1# newfs /dev/md/hahost1/rdsk/d4
The process of creating the file system destroys any data on the disks.
Create the directory mount points for the multihost UFS file systems.
phys-hahost1# mkdir /hahost1/1
phys-hahost1# mkdir /hahost1/2
phys-hahost1# mkdir /hahost1/3
phys-hahost1# mkdir /hahost1/4
Create the /etc/opt/SUNWcluster/conf/hanfs directory.
Edit the /etc/opt/SUNWcluster/conf/hanfs/vfstab.logicalhost file to update the administrative and multihost UFS file system information.
Make sure that all cluster nodes' vfstab.logicalhost files contain the same information. Use the cconsole(1) facility to make simultaneous edits to vfstab.logicalhost files on all nodes in the cluster.
The following is a sample vfstab.logicalhost file showing the administrative file system and four other UFS file systems:
#device                   device                     mount       FS    fsck  mount  mount
#to mount                 to fsck                    point       type  pass  all    options
#
/dev/md/hahost1/dsk/d11   /dev/md/hahost1/rdsk/d11   /hahost1    ufs   1     no     -
/dev/md/hahost1/dsk/d1    /dev/md/hahost1/rdsk/d1    /hahost1/1  ufs   1     no     -
/dev/md/hahost1/dsk/d2    /dev/md/hahost1/rdsk/d2    /hahost1/2  ufs   1     no     -
/dev/md/hahost1/dsk/d3    /dev/md/hahost1/rdsk/d3    /hahost1/3  ufs   1     no     -
/dev/md/hahost1/dsk/d4    /dev/md/hahost1/rdsk/d4    /hahost1/4  ufs   1     no     -
Release ownership of the diskset.
Unmount file systems first, if necessary.
Because the node performing the work on the diskset takes implicit ownership of the diskset, it needs to release this ownership when done.
phys-hahost1# metaset -s hahost1 -r
(Optional) To make file systems NFS-sharable, refer to Chapter 11, Installing and Configuring Sun Cluster HA for NFS.
The following example helps to explain the process for determining the number of disks to place in each diskset when using Solstice DiskSuite. It assumes that you are using three SPARCstorage Arrays as your disk expansion units. In this example, existing applications are running over NFS (two file systems of five Gbytes each) and two Oracle databases (one 5 Gbytes and one 10 Gbytes).
Table B-4 shows the calculations used to determine the number of drives needed in the sample configuration. If you have three SPARCstorage Arrays, you would need 28 drives that would be divided as evenly as possible among each of the three arrays. Note that the five Gbyte file systems were given an additional Gbyte of disk space because the number of disks needed was rounded up.
Table B-4 Determining Drives Needed for a Configuration
| Use | Data | Disk Storage Needed | Drives Needed |
|---|---|---|---|
| nfs1 | 5 Gbytes | 3 x 2.1-Gbyte disks * 2 (mirror) | 6 |
| nfs2 | 5 Gbytes | 3 x 2.1-Gbyte disks * 2 (mirror) | 6 |
| oracle1 | 5 Gbytes | 3 x 2.1-Gbyte disks * 2 (mirror) | 6 |
| oracle2 | 10 Gbytes | 5 x 2.1-Gbyte disks * 2 (mirror) | 10 |
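The arithmetic behind the table works out as follows: a 5-Gbyte file system on 2.1-Gbyte drives requires 5 / 2.1 = 2.4, rounded up to 3 disks, which two-way mirroring doubles to 6 drives; the 10-Gbyte database requires 10 / 2.1 = 4.8, rounded up to 5 disks, or 10 drives mirrored. The total is 6 + 6 + 6 + 10 = 28 drives.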
Table B-5 shows the allocation of drives among the two logical hosts and four data services.
Table B-5 Division of Disksets
| Logical host (diskset) | Data Services | Disks | SPARCstorage Array 1 | SPARCstorage Array 2 | SPARCstorage Array 3 |
|---|---|---|---|---|---|
| hahost1 | nfs1/oracle1 | 12 | 4 | 4 | 4 |
| hahost2 | nfs2/oracle2 | 16 | 5 | 6 | 5 |
Initially, four disks on each SPARCstorage Array (12 disks in total) are assigned to hahost1, and five or six disks on each (16 in total) are assigned to hahost2. Figure B-1 illustrates the disk allocation; the disks are labeled with the name of the diskset (1 for hahost1 and 2 for hahost2).
No hot spares have been assigned to either diskset. A minimum of one hot spare per SPARCstorage Array per diskset enables one drive to be hot spared (restoring full two-way mirroring).