Solstice DiskSuite 4.2.1 Reference Guide

Chapter 1 Introduction to DiskSuite

This chapter explains the overall structure of DiskSuite. Use the following table to proceed directly to the section that provides the information you need.

What Does DiskSuite Do?

DiskSuite is a software product that enables you to manage large numbers of disks and the data on those disks. Although there are many ways to use DiskSuite, most tasks include:

In some instances, DiskSuite can also improve I/O performance.

How Does DiskSuite Manage Disks?

DiskSuite uses virtual disks to manage physical disks and their associated data. In DiskSuite, a virtual disk is called a metadevice.

From an application's point of view, a metadevice is functionally identical to a physical disk. DiskSuite converts I/O requests directed at a metadevice into I/O requests to the underlying member disks.

DiskSuite's metadevices are built from slices (disk partitions). An easy way to build metadevices is to use the graphical user interface, DiskSuite Tool, that comes with DiskSuite. DiskSuite Tool presents you with a view of all the slices available to you. By dragging slices onto metadevice objects, you can quickly assign slices to metadevices. You can also build and modify metadevices using DiskSuite's command line utilities.

If, for example, you want to create more storage capacity, you could use DiskSuite to make the system treat a collection of many small slices as one larger slice or device. After you have created a large metadevice from these slices, you can immediately begin using it just as you would any "real" slice or device.

For a more detailed discussion of metadevices, see "Metadevices".

DiskSuite can increase the reliability and availability of data by using mirrors (copied data) and RAID5 metadevices. DiskSuite's hot spares can provide another level of data availability for mirrors and RAID5 metadevices.

Once you have set up your configuration, you can use DiskSuite Tool to report on its operation. You can also use DiskSuite's SNMP trap generating daemon to work with a network monitoring console to automatically receive DiskSuite error messages.

DiskSuite Tool

DiskSuite Tool is a graphical user interface for setting up and administering a DiskSuite configuration. The command to start DiskSuite Tool is:


# metatool &

DiskSuite Tool provides a graphical view of DiskSuite objects--metadevices, hot spare pools, and the MetaDB object for the metadevice state database. DiskSuite Tool uses drag and drop manipulation of DiskSuite objects, enabling you to quickly configure your disks or change an existing configuration.

DiskSuite Tool provides graphical views of both physical devices and metadevices, helping simplify storage administration. You can also perform tasks specific to administering SPARCstorage Arrays using DiskSuite Tool.

However, DiskSuite Tool cannot perform all DiskSuite administration tasks. You must use the command line interface for some operations (for example, creating and administering disksets).

To learn more about using DiskSuite Tool, refer to Chapter 4, DiskSuite Tool.

Command Line Interface

Listed here are all the commands you can use to administer DiskSuite. For more detailed information, see the man pages.

Table 1-1 Command Line Interface Commands

DiskSuite Command   Description

growfs(1M)          Expands a UFS file system in a non-destructive fashion.
mdlogd(1M)          The mdlogd daemon and mdlogd.cf configuration file enable DiskSuite to send generic SNMP trap messages.
metaclear(1M)       Deletes active metadevices and hot spare pools.
metadb(1M)          Creates and deletes state database replicas.
metadetach(1M)      Detaches a metadevice from a mirror, or a logging device from a trans metadevice.
metahs(1M)          Manages hot spares and hot spare pools.
metainit(1M)        Configures metadevices.
metaoffline(1M)     Places submirrors offline.
metaonline(1M)      Places submirrors online.
metaparam(1M)       Modifies metadevice parameters.
metarename(1M)      Renames and switches metadevice names.
metareplace(1M)     Replaces slices of submirrors and RAID5 metadevices.
metaroot(1M)        Sets up system files for mirroring root (/).
metaset(1M)         Administers disksets.
metastat(1M)        Displays status for metadevices or hot spare pools.
metasync(1M)        Resyncs metadevices during reboot.
metatool(1M)        Runs the DiskSuite Tool graphical user interface.
metattach(1M)       Attaches a metadevice to a mirror, or a logging device to a trans metadevice.

Overview of DiskSuite Objects

The three basic types of objects that you create with DiskSuite are metadevices, state database replicas, and hot spare pools. Table 1-2 gives an overview of these DiskSuite objects.

Table 1-2 Summary of DiskSuite Objects

DiskSuite Object: Metadevice (simple, mirror, RAID5, trans)
    What Is It? A group of physical slices that appear to the system as a single, logical device.
    Why Use It? To increase storage capacity and increase data availability.
    For More Information, Go To: "Metadevices"

DiskSuite Object: Metadevice state database (state database replicas)
    What Is It? A database that stores information on disk about the state of your DiskSuite configuration.
    Why Use It? DiskSuite cannot operate until you have created the metadevice state database replicas.
    For More Information, Go To: "Metadevice State Database and State Database Replicas"

DiskSuite Object: Hot spare pool
    What Is It? A collection of slices (hot spares) reserved to be automatically substituted in case of slice failure in either a submirror or RAID5 metadevice.
    Why Use It? To increase data availability for mirrors and RAID5 metadevices.
    For More Information, Go To: "Hot Spare Pools"


Note -

DiskSuite Tool, DiskSuite's graphical user interface, also refers to the graphical representation of metadevices, the metadevice state database, and hot spare pools as "objects."


Metadevices

A metadevice is a name for a group of physical slices that appear to the system as a single, logical device. Metadevices are actually pseudo, or virtual, devices in standard UNIX terms.

You create a metadevice by using concatenation, striping, mirroring, RAID level 5, or UFS logging. Thus, the types of metadevices you can create are concatenations, stripes, concatenated stripes, mirrors, RAID5 metadevices, and trans metadevices.

DiskSuite uses a special driver, called the metadisk driver, to coordinate I/O to and from physical devices and metadevices, enabling applications to treat a metadevice like a physical device. This type of driver is also called a logical, or pseudo, driver.

You can use either the DiskSuite Tool graphical user interface or the command line utilities to create and administer metadevices.

Table 1-3 summarizes the types of metadevices:

Table 1-3 Types of Metadevices

Metadevice: Simple
    Description: Can be used directly, or as the basic building blocks for mirrors and trans devices. There are three types of simple metadevices: stripes, concatenations, and concatenated stripes. Simple metadevices consist only of physical slices. By themselves, simple metadevices do not provide data redundancy.

Metadevice: Mirror
    Description: Replicates data by maintaining multiple copies. A mirror is composed of one or more simple metadevices called submirrors.

Metadevice: RAID5
    Description: Replicates data by using parity information. In the case of missing data, the missing data can be regenerated using available data and the parity information. A RAID5 metadevice is composed of slices. One slice's worth of space is allocated to parity information, but it is distributed across all slices in the RAID5 metadevice.

Metadevice: Trans
    Description: Used to log a UFS file system. A trans metadevice is composed of a master device and a logging device. Both of these devices can be a slice, simple metadevice, mirror, or RAID5 metadevice. The master device contains the UFS file system.
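
The parity rule for RAID5 metadevices lends itself to a rough capacity estimate. The following is a minimal sketch, assuming that all slices are the same size and that no other overhead applies. The function name raid5_usable and the sample sizes are hypothetical and are not part of DiskSuite.

```shell
# Estimate usable RAID5 capacity: one slice's worth of space goes to parity.
# Assumption for this sketch: every slice has the same size.
raid5_usable() {
    slices=$1       # number of slices in the RAID5 metadevice
    slice_mb=$2     # size of each slice, in megabytes
    echo $(( (slices - 1) * slice_mb ))
}

raid5_usable 4 1000    # prints 3000
```

With four hypothetical 1000-Mbyte slices, roughly three slices' worth of space holds data and the remaining slice's worth holds the distributed parity.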

How Are Metadevices Used?

You use metadevices to increase storage capacity and data availability. In some instances, metadevices can also increase I/O performance. Functionally, metadevices behave the same way as slices. Because metadevices look like slices, they are transparent to end users, applications, and file systems. Like physical devices, metadevices are accessed through block or raw device names. The metadevice name changes, depending on whether the block or raw device is used. See "Metadevice Conventions" for details about metadevice names.

You can use most file system commands (mount(1M), umount(1M), ufsdump(1M), ufsrestore(1M), and so forth) on metadevices. You cannot use the format(1M) command, however. You can read, write, and copy files to and from a metadevice, as long as you have a file system mounted on the metadevice.

SPARC and x86 systems can create metadevices on the following disk drives:

Metadevice Conventions

Table 1-4 Example Metadevice Names

/dev/md/dsk/d0       Block metadevice d0
/dev/md/dsk/d1       Block metadevice d1
/dev/md/rdsk/d126    Raw metadevice d126
/dev/md/rdsk/d127    Raw metadevice d127
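
The naming pattern in Table 1-4 can be illustrated with a short sketch that derives the block and raw device paths from a metadevice name. The helper names md_block_path and md_raw_path are hypothetical. Only the /dev/md/dsk and /dev/md/rdsk prefixes come from the table.

```shell
# Hypothetical helpers: build the device paths for a metadevice name such as d0.
md_block_path() { echo "/dev/md/dsk/$1"; }    # block device path
md_raw_path()   { echo "/dev/md/rdsk/$1"; }   # raw device path

md_block_path d0    # prints /dev/md/dsk/d0
md_raw_path d0      # prints /dev/md/rdsk/d0
```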

Example -- Metadevice Consisting of Two Slices

Figure 1-1 shows a metadevice "containing" two slices, one each from Disk A and Disk B. An application or UFS will treat the metadevice as if it were one physical disk. Adding more slices to the metadevice will increase its capacity.

Figure 1-1 Relationship Among a Metadevice, Physical Disks, and Slices

Metadevice State Database and State Database Replicas

A metadevice state database (often simply called the state database) is a database that stores information on disk about the state of your DiskSuite configuration. The metadevice state database records and tracks changes made to your configuration. DiskSuite automatically updates the metadevice state database when a configuration or state change occurs. Creating a new metadevice is an example of a configuration change. A submirror failure is an example of a state change.

The metadevice state database is actually a collection of multiple, replicated database copies. Each copy, referred to as a state database replica, ensures that the data in the database is always valid. Having copies of the metadevice state database protects against data loss from single points-of-failure. The metadevice state database tracks the location and status of all known state database replicas.

DiskSuite cannot operate until you have created the metadevice state database and its state database replicas. A DiskSuite configuration must have an operating metadevice state database.

When you set up your configuration, you have two choices for the location of state database replicas. You can place them on dedicated slices, or on slices that will later become part of metadevices. DiskSuite recognizes when a slice contains a state database replica, and automatically skips over the portion of the slice reserved for the replica if the slice is used in a metadevice. The part of a slice reserved for the state database replica should not be used for any other purpose.

You can keep more than one copy of a metadevice state database on one slice, though you may make the system more vulnerable to a single point-of-failure by doing so.

How Does DiskSuite Use State Database Replicas?

The state database replicas ensure that the data in the metadevice state database is always valid. When the metadevice state database is updated, each state database replica is also updated. The updates take place one at a time (to protect against corrupting all updates if the system crashes).

If your system loses a state database replica, DiskSuite must figure out which state database replicas still contain non-corrupted data. DiskSuite determines this information by a majority consensus algorithm. This algorithm requires that a majority (half + 1) of the state database replicas be available before any of them are considered non-corrupt. It is because of this majority consensus algorithm that you must create at least three state database replicas when you set up your disk configuration. A consensus can be reached as long as at least two of the three state database replicas are available.

To protect data, DiskSuite will not function if a majority (half + 1) of all state database replicas is not available. The algorithm, therefore, protects against corrupt data.

The majority consensus algorithm guarantees the following:


Note -

When the number of state database replicas is odd, DiskSuite computes the majority by dividing the number in half, rounding down to the nearest integer, then adding 1 (one). For example, on a system with seven replicas, the majority would be four (seven divided by two is three and one-half, rounded down is three, plus one is four).
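
The majority arithmetic described above can be sketched in a few lines of shell. This is only an illustration of the stated rule (half, rounded down, plus one), not DiskSuite's actual implementation. The function names replica_majority and has_majority are hypothetical.

```shell
# Majority as described above: half the replicas, rounded down, plus one.
replica_majority() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds (exit status 0) when enough replicas are available for a consensus.
has_majority() {
    total=$1        # number of state database replicas that were created
    available=$2    # number of replicas that are currently available
    [ "$available" -ge "$(replica_majority "$total")" ]
}

replica_majority 7     # prints 4
replica_majority 3     # prints 2
has_majority 3 2 && echo "consensus possible"
has_majority 3 1 || echo "no consensus"
```

The last two lines mirror the three-replica case in the text: two available replicas out of three are enough for a consensus, and one is not.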


During booting, DiskSuite ignores corrupted state database replicas. In some cases DiskSuite tries to rewrite state database replicas that are bad. Otherwise they are ignored until you repair them. If a state database replica becomes bad because its underlying slice encountered an error, you will need to repair or replace the slice and then enable the replica.

If all state database replicas are lost, you could, in theory, lose all data that is stored on your disks. For this reason, it is good practice to create enough state database replicas on separate drives and across controllers to prevent catastrophic failure. It is also wise to save your initial DiskSuite configuration information, as well as your disk partition information.

Refer to Solstice DiskSuite 4.2.1 User's Guide for information on adding state database replicas to the system, and on recovering when state database replicas are lost.

Metadevice State Database Conventions

Hot Spare Pools

A hot spare pool is a collection of slices (hot spares) reserved by DiskSuite to be automatically substituted in case of a slice failure in either a submirror or RAID5 metadevice. Hot spares provide increased data availability for mirrors and RAID5 metadevices. You can create a hot spare pool with either DiskSuite Tool or the command line interface.

How Do Hot Spare Pools Work?

When errors occur, DiskSuite checks the hot spare pool for the first available hot spare whose size is equal to or greater than the size of the slice being replaced. If one is found, DiskSuite substitutes it for the failed slice and automatically resyncs the data. If a slice of adequate size is not found in the list of hot spares, the submirror or RAID5 metadevice that failed is considered errored. For more information, see Chapter 3, Hot Spare Pools.
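
The selection rule (first available hot spare that is at least as large as the failed slice) can be modeled with a short sketch. This is an illustration only, not DiskSuite code. The function name pick_hot_spare, the "name:size_mb:state" entry format, and the slice names are all invented for the example.

```shell
# Pick the first available hot spare that is at least as large as the failed slice.
# Each pool entry is "name:size_mb:state"; this format is invented for the sketch.
pick_hot_spare() {
    needed_mb=$1
    shift
    for entry in "$@"; do
        name=${entry%%:*}       # text before the first colon
        rest=${entry#*:}        # text after the first colon
        size_mb=${rest%%:*}
        state=${rest#*:}
        if [ "$state" = "available" ] && [ "$size_mb" -ge "$needed_mb" ]; then
            echo "$name"
            return 0
        fi
    done
    return 1    # nothing suitable: the component would be considered errored
}

pick_hot_spare 1000 "c1t0d0s2:500:available" "c1t1d0s2:2000:in-use" "c1t2d0s2:1000:available"
```

In this run the first spare is too small and the second is already in use, so the function prints c1t2d0s2.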

Metadevice and Disk Space Expansion

DiskSuite enables you to expand a metadevice by adding slices.

Mounted or unmounted UFS file systems contained within a metadevice can be expanded without having to halt or back up your system. (Nevertheless, backing up your data is always a good idea.) After the metadevice is expanded, you grow the file system with the growfs(1M) command.

After a file system is expanded, it cannot be decreased. The inability to decrease the size of a file system is a UFS limitation.

Applications and databases using the raw metadevice must have their own method to "grow" the added space so that the application or database can recognize it. DiskSuite does not provide this capability.

You can expand the disk space in metadevices in the following ways:

  1. Adding a slice to a stripe or concatenation.

  2. Adding multiple slices to a stripe or concatenation.

  3. Adding a slice or multiple slices to all submirrors of a mirror.

  4. Adding one or more slices to a RAID5 device.

You can use either DiskSuite Tool or the command line interface to add a slice to an existing metadevice.


Note -

When using DiskSuite Tool to expand a metadevice that contains a UFS file system, the growfs(1M) command is run automatically. If you use the command line to expand the metadevice, you must manually run the growfs(1M) command.


The growfs(1M) Command

The growfs(1M) command expands a UFS file system without loss of service or data. However, write-access to the metadevice is suspended while the growfs(1M) command is running. You can expand the file system to the size of the slice or the metadevice that contains the file system.

The file system can be expanded to use only part of the additional disk space by using the -s size option to the growfs(1M) command.


Note -

When expanding a mirror, space is added to the mirror's underlying submirrors. Likewise, when expanding a trans metadevice, space is added to the master device. The growfs(1M) command is then run on the mirror or the trans metadevice, respectively. The general rule is that space is added to the underlying device(s), and the growfs(1M) command is run on the top-level device.
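
The command line sequence for expanding a metadevice and then growing its file system can be sketched as a dry run. The function below only prints the two commands; it does not execute them. The metadevice d8, the slice c0t2d0s5, and the mount point /files are hypothetical. The use of metattach(1M) to add a slice and the -M mount-point option of growfs(1M) are assumptions here, so check both against the man pages before running anything.

```shell
# Dry run: print the expansion commands rather than executing them.
# All device names and the mount point are hypothetical.
print_expand_plan() {
    md=$1             # metadevice to expand, for example d8
    slice=$2          # slice to add to the metadevice
    mount_point=$3    # mount point of the UFS file system on the metadevice
    echo "metattach $md $slice"
    echo "growfs -M $mount_point /dev/md/rdsk/$md"
}

print_expand_plan d8 c0t2d0s5 /files
```

The order matters: the space is added to the metadevice first, and growfs(1M) is then pointed at the raw device name of the top-level metadevice.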


System and Startup Files

This section explains the files necessary for DiskSuite to operate correctly. For the most part, you do not have to worry about these files because DiskSuite accesses (updates) them automatically (with the exception of md.tab).


Note -

The configuration information in the /etc/lvm/md.tab file may differ from the current metadevices, hot spares, and state database replicas in use. It is only used at metadevice creation time, not to recapture the DiskSuite configuration at boot.



Caution -

You should not directly edit either the mddb.cf or md.cf files.


For more information on DiskSuite system files, refer to the man pages.

Disksets

A shared diskset, or simply diskset, is a set of shared disk drives containing metadevices and hot spares that can be shared exclusively, but not at the same time, by two hosts. Currently, disksets are only supported on SPARCstorage Array disks.

A diskset provides for data redundancy and availability. If one host fails, the other host can take over the failed host's diskset. (This type of configuration is known as a failover configuration.)

For more information, see Chapter 5, Disksets.