NAME | SYNOPSIS | DESCRIPTION | OPTIONS | EXAMPLES | FILES | EXIT STATUS | ATTRIBUTES | SEE ALSO | NOTES
In a traditional disk set configuration, multiple hosts are physically connected to the same set of disks. When one host fails, the other host has exclusive access to the disks. The metaset command administers sets of disks shared for exclusive (but not concurrent) access among such hosts. While disk sets enable a high-availability configuration, Solaris Volume Manager itself does not actually provide a high-availability environment.
A single-node disk set configuration manages storage on a SAN or fabric-attached storage, or provides namespace control and state database replica management for a specified set of disks.
A multi-node disk set configuration, created with the -M option to metaset, provides for a disk set with shared ownership among multiple hosts. All owners can simultaneously access disks in the set. This option exists to support high-availability applications and does not attempt to protect against overlapping writes. Protection against overlapping writes is the responsibility of the application that issues the writes. Multi-node disk sets do not support RAID 5 volumes or transactional volumes (trans metadevices).
Shared metadevices and hot spare pools can be created only from drives which are in the disk set created by metaset. To create a set, one or more hosts must be added to the set. To create metadevices within the set, one or more devices must be added to the set. The drivename specified must be in the form cxtxdx with no slice specified.
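The drive name rule above can be checked before a name is handed to metaset. The following is a minimal sketch, not part of Solaris Volume Manager: the function name is_whole_drive_name is hypothetical, and it assumes that each x in cxtxdx stands for one or more decimal digits.

```shell
# Hypothetical helper (not an SVM command): succeed only if the argument
# looks like cNtNdN with no trailing slice such as s0 or s7.
# The leading X keeps expr from misreading unusual arguments as operators.
is_whole_drive_name() {
    expr "X$1" : 'Xc[0-9][0-9]*t[0-9][0-9]*d[0-9][0-9]*$' >/dev/null
}

for name in c2t0d0 c2t1d0s7; do
    if is_whole_drive_name "$name"; then
        echo "$name: ok"
    else
        echo "$name: rejected (slice present or wrong form)"
    fi
done
```

The check is purely syntactic. It does not confirm that the drive exists or that it is free to be added to a set.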
When you add a new disk to any disk set, Solaris Volume Manager checks the disk format. If necessary, it repartitions the disk to ensure that the disk has an appropriately configured reserved slice (slice 7 on a VTOC labelled device or slice 6 on an EFI labelled device), with adequate space for a state database replica. The precise size of slice 7 (or slice 6 on an EFI labelled device) depends on the disk geometry. For traditional disk sets, the slice is no less than 4 Mbytes, and probably closer to 6 Mbytes, depending on where the cylinder boundaries lie. For multi-node disk sets, the slice is a minimum of 256 Mbytes.
The minimal size for the reserved slice might change in the future. This change is based on a variety of factors, including the size of the state database replica and information to be stored in the state database replica.
For use in disk sets, disks must have a dedicated slice (six or seven) that meets specific criteria:
Slice must start at sector 0
Slice must include enough space for disk label
Slice (which holds the state database replicas) cannot be mounted and must not overlap with any other slice, including slice 2
If the existing partition table does not meet these criteria, Solaris Volume Manager repartitions the disk. A portion of each drive is reserved in slice 7 (or slice 6 on an EFI labelled device), for use by Solaris Volume Manager. The remainder of the space on each drive is placed into slice 0. Any existing data on the disks is lost by repartitioning.
After you add a drive to a disk set, you can repartition it as necessary, provided that slice 7 (or slice 6 on an EFI labelled device) is not altered in any way.
After a disk set is created and metadevices are set up within the set, the metadevice name is in the following form:
/dev/md/setname/{dsk,rdsk}/dnumber
where setname is the name of the disk set, and number is the number of the metadevice (0-127).
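As an illustration of the naming form above, the sketch below builds a metadevice path from its parts. The function name metadevice_path is hypothetical, and the 0-127 limit is taken directly from the text rather than queried from the system.

```shell
# Hypothetical helper (not an SVM command): print the path of metadevice
# d<number> in disk set <setname>. kind selects the dsk or rdsk directory.
metadevice_path() {
    setname=$1 kind=$2 number=$3
    case $kind in
        dsk|rdsk) ;;
        *) echo "kind must be dsk or rdsk" >&2; return 1 ;;
    esac
    case $number in
        ''|*[!0-9]*) echo "number must be a decimal integer" >&2; return 1 ;;
    esac
    if [ "$number" -gt 127 ]; then
        echo "number must be in the range 0-127" >&2
        return 1
    fi
    printf '/dev/md/%s/%s/d%s\n' "$setname" "$kind" "$number"
}

metadevice_path relo-red dsk 10     # prints /dev/md/relo-red/dsk/d10
```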
Hot spare pools within local disk sets use standard Solaris Volume Manager naming conventions. Hot spare pools with shared disk sets use the following convention:
setname/hspnumber
where setname is the name of the disk set, and number is the number of the hot spare pool (0-999).
SVM provides support for a low-end HA solution consisting of two hosts that share only two strings of drives. The hosts in this type of configuration, referred to as mediators or mediator hosts, run a special daemon, rpc.metamedd(1M). The mediator hosts take on additional responsibilities to ensure that data is available in the case of host or drive failures.
A mediator configuration can survive the failure of a single host or a single string of drives, without administrative intervention. If both a host and a string of drives fail (multiple failures), the integrity of the data cannot be guaranteed. At this point, administrative intervention is required to make the data accessible. See mediator(7D) for further details.
Use the -m option to add or delete a mediator host.
The following options are supported:
Adds drives or hosts to the named set. For a drive to be accepted into a set, the drive must not be in use within another metadevice or disk set, mounted on, or swapped on. When the drive is accepted into the set, it is repartitioned and the metadevice state database replica (for the set) can be placed on it. However, if slice 7 (or slice 6 on an EFI labelled device) starts at cylinder 0 and is large enough to hold a state database replica, then the disk is not repartitioned. Also, a drive is not accepted if it cannot be found on all hosts specified as part of the set. This means that if a host within the specified set is unreachable due to network problems, or is administratively down, the add fails.
Adds (-a) mediator hosts to, or deletes (-d) mediator hosts from, the specified disk set. A mediator_host_list is the nodename(4) of the mediator host to be added and (for adding) up to two other aliases for the mediator host. The nodename and aliases for each mediator host are separated only by commas. Up to two mediator hosts can be specified for the named disk set. For deleting a mediator host, specify only the nodename of that host as the option to -m.
In a single metaset command you can add or delete two mediator hosts. See EXAMPLES.
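The comma-separated form of one mediator host entry can be assembled as follows. This is a sketch only: the function name mediator_entry is hypothetical, and it enforces just the "nodename plus up to two aliases" rule stated above.

```shell
# Hypothetical helper (not an SVM command): join a nodename and up to two
# aliases with commas, the form described for one mediator host.
mediator_entry() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 3 ]; then
        echo "usage: mediator_entry nodename [alias1 [alias2]]" >&2
        return 1
    fi
    entry=$1
    shift
    for alias_name in "$@"; do
        entry="$entry,$alias_name"
    done
    printf '%s\n' "$entry"
}

mediator_entry myhost1 alias1    # prints myhost1,alias1
```

Each resulting string is one argument after -m, so two mediator hosts are passed as two separate comma-joined words.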
Specify auto-take status for a disk set. If auto-take is enabled for a set, the disk set is automatically taken at boot, and file systems on volumes within the disk set can be mounted through /etc/vfstab entries. Only a single host can be associated with an auto-take set, so attempts to add a second host to an auto-take set, or to configure a disk set with multiple hosts as auto-take, fail with an error message. Disabling auto-take status for a specific disk set causes the disk set to revert to normal behavior. That is, the disk set is potentially shared (non-concurrently) among hosts, and unavailable for mounting through /etc/vfstab.
Ensures that the replicas are distributed according to the replica layout algorithm. This can be invoked at any time, and does nothing if the replicas are correctly distributed. In cases where the user has used the metadb command to manually remove or add replicas, this command can be used to ensure that the distribution of replicas matches the replica layout algorithm.
Do not interact with the Cluster Framework when used in a Sun Cluster 3 environment. In effect, this means do not modify the Cluster Configuration Repository. These options should only be used to fix a broken disk set configuration.
This option is not for use with a multi-node disk set.
Take ownership of the disk set but do not inform the Cluster Framework that the disk set is available
Release ownership of the disk set without informing the Cluster Framework. This option should only be used if the disk set ownership was taken with the corresponding -C take option.
Remove the disk set without informing the Cluster Framework that the disk set has been purged
Deletes drives or hosts from the named disk set. For a drive to be deleted, it must not be in use within the set. The last host cannot be deleted unless all of the drives within the set are deleted. Deleting the last host in a disk set destroys the disk set. This option fails on a multi-node disk set if attempting to withdraw the master node while other nodes are in the set.
Forces one of three actions to occur: takes ownership of a disk set when used with -t; deletes the last disk drive from the disk set; or deletes the last host from the disk set. (Deleting the last drive or host from a disk set requires the -d option.) When used to forcibly take ownership of the disk set, this causes the disk set to be grabbed whether or not another host owns the set. All of the disks within the set are taken over (reserved) and fail fast is enabled, causing the other host to panic if it had disk set ownership. The metadevice state database is read in by the host performing the take, and the shared metadevices contained in the set are accessible. The -f option is also used to delete the last drive in the disk set, because this drive would implicitly contain the last state database replica. The -f option is also used for deleting hosts from a set. When specified with a partial list of hosts, it can be used for one-host administration. One-host administration could be useful when a host is known to be non-functional, thus avoiding timeouts and failed commands. When specified with a complete list of hosts, the set is completely deleted. It is generally specified with a complete list of hosts to clean up after one-host administration has been performed.
Specifies one or more host names to be added to or deleted from a disk set. Adding the first host creates the set. The last host cannot be deleted unless all of the drives within the set have been deleted. The host name is not accepted unless all of the drives within the set can be found on the specified host. The host name is the same name found in /etc/nodename.
Joins a host to the owner list for a multi-node disk set. The concepts of take and release, used with traditional disk sets, do not apply to multi-node sets, because multiple owners are allowed. As a host boots and is brought online, it must go through three configuration levels to be able to use a multi-node set. First, it must be included in the cluster nodelist, which happens automatically in a cluster or single-node situation. Second, it must be added to the multi-node disk set with the -a -h options documented elsewhere in this man page. Finally, it must join the set. When the host is first added to the set, it is automatically joined. On manual restarts, the administrator must manually issue metaset -s multinodesetname -j to join the host to the owner list. After the cluster reconfiguration, when the host reenters the cluster, the node is automatically joined to the set. The metaset -j command joins the host to all multi-node sets that the host has been added to. In a single-node situation, joining the node to the disk set starts any necessary resynchronizations.
Sets the size (in blocks) for the metadevice state database replica. The length can only be set when adding a new drive; it cannot be changed on an existing drive. The default (and maximum) size is 8192 blocks, which should be appropriate for most configurations. The minimum size of the length is 64 blocks.
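The limits stated above (minimum 64 blocks, default and maximum 8192 blocks) can be checked before a length is passed to metaset. This is a sketch only, and the function name replica_length_ok is hypothetical.

```shell
# Hypothetical helper (not an SVM command): succeed only if the argument is
# a decimal block count within the documented 64..8192 range.
replica_length_ok() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain decimal number
    esac
    [ "$1" -ge 64 ] && [ "$1" -le 8192 ]
}

for len in 32 64 8192 9000; do
    if replica_length_ok "$len"; then
        echo "$len blocks: acceptable"
    else
        echo "$len blocks: out of range"
    fi
done
```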
Specifies that the disk set to be created or modified is a multi-node disk set that supports multiple concurrent owners. The -M option is required when creating a multi-node disk set, but optional on all other operations on a multi-node disk set. Existing disk sets cannot be converted to multi-node sets.
Returns an exit status of 0 if the local host or the host specified with the -h option is the owner of the disk set.
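Because ownership is reported only through the exit status, a script can branch on it directly. The sketch below assumes this option is spelled -o, as in the metaset synopsis (the option letter is not reproduced in this section). The wrapper name owns_disk_set is hypothetical.

```shell
# Hypothetical wrapper: report whether the local host owns the named disk
# set, using only the exit status of the ownership query.
# Assumption: the ownership query is "metaset -s setname -o".
owns_disk_set() {
    if metaset -s "$1" -o >/dev/null 2>&1; then
        echo "owner"
    else
        # A non-zero status can also mean that the query itself failed.
        echo "not owner"
    fi
}
```

A call such as owns_disk_set relo-red prints one of the two words. A script that must tell "not owner" apart from other failures needs further checks.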
Purge the named disk set from the node on which the metaset command is run. The disk set must not be owned by the node that runs this command. If the node does own the disk set the command fails.
This option is not for use with a multi-node disk set.
Releases ownership of a disk set. All of the disks within the set are released. The metadevices set up within the set are no longer accessible.
This option is not for use with a multi-node disk set.
Specifies the name of a disk set on which metaset works. If no setname is specified, all disk sets are returned.
Takes ownership of a disk set safely. If metaset finds that another host owns the set, this host is not allowed to take ownership of the set. If the set is not owned by any other host, all the disks within the set are owned by the host on which metaset was executed. The metadevice state database is read in, and the shared metadevices contained in the set become accessible. The -t option takes a disk set that has stale databases. When the databases are stale, metaset exits with code 66, and prints a message. At that point, the only operations permitted are the addition and deletion of replicas. Once the addition or deletion of the replicas has been completed, the disk set should be released and retaken to gain full access to the data.
This option is not for use with a multi-node disk set.
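The stale-database case described for -t can be distinguished by its exit code. The following sketch wraps the take and reports the three outcomes. The function name take_disk_set is hypothetical, and only exit codes 0 and 66 are interpreted, because the text defines no others.

```shell
# Hypothetical wrapper: take a disk set and explain the result.
# Exit code 66 means the set was taken with stale databases, so only
# replica addition and deletion are permitted until a release and retake.
take_disk_set() {
    metaset -s "$1" -t
    status=$?
    case $status in
        0)  echo "taken: $1" ;;
        66) echo "taken with stale databases: $1 (fix replicas, then release and retake)" ;;
        *)  echo "take failed for $1 (exit $status)" ;;
    esac
    return "$status"
}
```

The wrapper returns the original exit status, so callers can still test for 66 themselves.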
Withdraws a host from the owner list for a multi-node disk set. The concepts of take and release, used with traditional disk sets, do not apply to multi-node sets, because multiple owners are allowed. Instead of releasing a set, a host may issue metaset -s multinodesetname -w to withdraw from the owner list. A host automatically withdraws on a reboot, but can be manually withdrawn if it should not be able to use the set, but should be able to rejoin at a later time. A host that withdrew due to a reboot may still appear joined from other hosts in the set until a reconfiguration cycle occurs. The command metaset -w withdraws from ownership of all multi-node sets that the host is a member of. This option fails if attempting to withdraw the master node while other nodes are in the disk set owner list. This option cancels all resyncs running on the node. A cluster reconfiguration process that is removing a node from the cluster membership list effectively withdraws the host from the ownership list.
This example defines a disk set.
# metaset -s relo-red -a -h red blue
The name of the disk set is relo-red. The names of the first and second hosts added to the set are red and blue, respectively. (The hostname is found in /etc/nodename.) Adding the first host creates the disk set. A disk set can be created with just one host, with the second added later. The last host cannot be deleted until all of the drives within the set have been deleted.
This example adds drives to a disk set.
# metaset -s relo-red -a c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0
The name of the previously created disk set is relo-red. The names of the drives are c2t0d0, c2t1d0, c2t2d0, c2t3d0, c2t4d0, and c2t5d0. Note that there is no slice identifier ("sx") at the end of the drive names.
The following command adds two mediator hosts to the specified disk set.
# metaset -s mydiskset -a -m myhost1,alias1 myhost2,alias2
The following command purges the disk set relo-red from the node:
# metaset -s relo-red -P
This example defines a multi-node disk set.
# metaset -s blue -M -a -h hahost1 hahost2
The name of the disk set is blue. The names of the first and second hosts added to the set are hahost1 and hahost2, respectively. The hostname is found in /etc/nodename. Adding the first host creates the multi-node disk set. A disk set can be created with just one host, with additional hosts added later. The last host cannot be deleted until all of the drives within the set have been deleted.
See attributes(5) for descriptions of the following attributes:
ATTRIBUTE TYPE | ATTRIBUTE VALUE
---|---
Availability | SUNWmdu
metaclear(1M), metadb(1M), metadetach(1M), metahs(1M), metainit(1M), metaoffline(1M), metaonline(1M), metaparam(1M), metareplace(1M), metaroot(1M), metastat(1M), metasync(1M), metattach(1M), md.cf(4), md.tab(4), mddb.cf(4), attributes(5)
Disk set administration, including the addition and deletion of hosts and drives, requires all hosts in the set to be accessible from the network.