Solaris Volume Manager Administration Guide

Chapter 14 RAID 5 Volumes (Overview)

This chapter provides conceptual information about Solaris Volume Manager RAID 5 volumes. For information about performing related tasks, see Chapter 15, RAID 5 Volumes (Tasks).

This chapter contains the following:

    Overview of RAID 5 Volumes
    Background Information for Creating RAID 5 Volumes
    Overview of Checking Status of RAID 5 Volumes
    Overview of Replacing and Enabling Slices in RAID 5 Volumes
    Scenario—RAID 5 Volumes

Overview of RAID 5 Volumes

RAID level 5 is similar to striping, but with parity data distributed across all components (disks or logical volumes). If a component fails, the data on the failed component can be rebuilt from the distributed data and parity information on the other components. In Solaris Volume Manager, a RAID 5 volume is a volume that supports RAID level 5.

A RAID 5 volume uses storage capacity equivalent to one component in the volume to store redundant information (parity) about user data stored on the remainder of the RAID 5 volume's components. That is, whether the volume has three components or five, the equivalent of exactly one component is used for parity information. The parity is distributed across all components in the volume. Like a mirror, a RAID 5 volume increases data availability, but with a minimum of cost in terms of hardware and only a moderate penalty for write operations. However, you cannot use a RAID 5 volume for the root (/), /usr, or swap file systems, or for existing file systems.
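You create a RAID 5 volume with the metainit command and its -r option, naming at least three components. The following is a minimal sketch; the volume name d45, the slice names, and the 32-Kbyte interlace are illustrative values only:

# metainit d45 -r c2t3d0s2 c3t0d0s2 c4t0d0s2 -i 32k

After the command returns, the volume initializes (all disk blocks are zeroed) before it reaches the “Okay” state and can be opened.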

Solaris Volume Manager automatically resynchronizes a RAID 5 volume when you replace an existing component. Solaris Volume Manager also resynchronizes RAID 5 volumes during reboot after a system failure or panic.

Example—RAID 5 Volume

Figure 14–1 shows a RAID 5 volume, d40.

The first three data chunks are written to Disks A through C. The next chunk that is written is a parity chunk, written to Disk D, which consists of an exclusive OR of the first three chunks of data. This pattern of writing data and parity chunks results in both data and parity being spread across all disks in the RAID 5 volume. Each disk can be read independently. The parity protects against a single disk failure. If each disk in this example were 2 Gbytes, the total capacity of d40 would be 6 Gbytes. (One disk's worth of space is allocated to parity.)
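The exclusive OR relationship is what makes reconstruction possible: the parity chunk is the XOR of the data chunks, and any single missing chunk is the XOR of everything that survives. The following shell fragment illustrates the arithmetic with made-up byte values; it demonstrates the principle only and is not a Solaris Volume Manager command:

# Hypothetical byte values for one chunk on Disks A, B, and C
A=0xA5; B=0x3C; C=0x7E

# The parity chunk on Disk D is the exclusive OR of the data chunks
P=$(( A ^ B ^ C ))                  # 0xE7

# If Disk B fails, its chunk is rebuilt from the survivors plus parity
printf '0x%X\n' $(( A ^ C ^ P ))    # prints 0x3C, the lost chunk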

Figure 14–1 RAID 5 Volume Example

Diagram shows how several components are combined, and parity introduced, to present a RAID 5 volume for use.

Example—Concatenated (Expanded) RAID 5 Volume

The following figure shows an example of a RAID 5 volume that initially consisted of four disks (components). A fifth disk has been dynamically concatenated to the volume to expand the RAID 5 volume.

Figure 14–2 Expanded RAID 5 Volume Example

Diagram shows how additional components can be concatenated onto a RAID 5 volume to provide a larger volume with redundancy.

The parity areas are allocated when the initial RAID 5 volume is created. One component's worth of space is allocated to parity, although the actual parity blocks are distributed across all of the original components to distribute I/O. When you concatenate additional components to the RAID 5 volume, the additional space is devoted entirely to data. No new parity blocks are allocated. The data on the concatenated components is, however, included in the parity calculations, so it is protected against single device failures.
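You expand an existing RAID 5 volume with the metattach command. A sketch, assuming the d40 volume from the earlier example and an illustrative slice name:

# metattach d40 c3t1d0s2

The new component is zeroed before use, as described in the note below, and the added space then appears as pure data space in the volume.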

Concatenated RAID 5 volumes are not suited for long-term use. Use a concatenated RAID 5 volume until it is possible to reconfigure a larger RAID 5 volume and copy the data to the larger volume.


Note –

When you add a new component to a RAID 5 volume, Solaris Volume Manager “zeros” all the blocks in that component. This process ensures that the parity will protect the new data. As data is written to the additional space, Solaris Volume Manager includes that data in the parity calculations.


Background Information for Creating RAID 5 Volumes

When you work with RAID 5 volumes, consider the Requirements for RAID 5 Volumes and Guidelines for RAID 5 Volumes. Many striping guidelines also apply to RAID 5 volume configurations. See RAID 0 Volume Requirements.

Requirements for RAID 5 Volumes

Guidelines for RAID 5 Volumes

Overview of Checking Status of RAID 5 Volumes

The slice state is perhaps the most important information when you are troubleshooting RAID 5 volume errors. The RAID 5 volume state only provides general status information, such as “Okay” or “Needs Maintenance.” If the RAID 5 volume reports a “Needs Maintenance” state, refer to the slice state. You take a different recovery action depending on whether the slice is in the “Maintenance” state or the “Last Erred” state. If you only have a slice in the “Maintenance” state, it can be repaired without loss of data. If you have a slice in the “Maintenance” state and a slice in the “Last Erred” state, data has probably been corrupted. You must fix the slice in the “Maintenance” state first, and then the slice in the “Last Erred” state.
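You check both the volume state and the individual slice states with the metastat command. The volume name here is illustrative:

# metastat d40

Look first at the volume's reported state; if it is “Needs Maintenance,” examine the state shown for each slice to decide between the “Maintenance” and “Last Erred” recovery actions described below.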

The following table explains RAID 5 volume states.

Table 14–1 RAID 5 States

State: Initializing
Meaning: Slices are in the process of having all disk blocks zeroed. This process is necessary due to the nature of RAID 5 volumes with respect to data and parity interlace striping. Once the state changes to “Okay,” the initialization process is complete and you are able to open the device. Up to this point, applications receive error messages.

State: Okay
Meaning: The device is ready for use and is currently free from errors.

State: Maintenance
Meaning: A slice has been marked as failed due to I/O or open errors that were encountered during a read or write operation.

The following table explains the slice states for a RAID 5 volume and possible actions to take.

Table 14–2 RAID 5 Slice States

State: Initializing
Meaning: Slices are in the process of having all disk blocks zeroed. This process is necessary due to the nature of RAID 5 volumes with respect to data and parity interlace striping.
Action: Normally none. If an I/O error occurs during this process, the device goes into the “Maintenance” state. If the initialization fails, the volume is in the “Initialization Failed” state, and the slice is in the “Maintenance” state. If this happens, clear the volume and recreate it.

State: Okay
Meaning: The device is ready for use and is currently free from errors.
Action: None. Slices can be added or replaced, if necessary.

State: Resyncing
Meaning: The slice is actively being resynchronized. An error has occurred and been corrected, a slice has been enabled, or a slice has been added.
Action: If desired, monitor the RAID 5 volume status until the resynchronization is done.

State: Maintenance
Meaning: A single slice has been marked as failed due to I/O or open errors that were encountered during a read or write operation.
Action: Enable or replace the failed slice. See How to Enable a Component in a RAID 5 Volume, or How to Replace a Component in a RAID 5 Volume. The metastat command will show an invoke recovery message with the appropriate action to take with the metareplace command.

State: Maintenance / Last Erred
Meaning: Multiple slices have encountered errors. The state of the failed slices is either “Maintenance” or “Last Erred.” In this state, no I/O is attempted on the slice that is in the “Maintenance” state, but I/O is attempted on the slice marked “Last Erred,” with the outcome being the overall status of the I/O request.
Action: Enable or replace the failed slices. See How to Enable a Component in a RAID 5 Volume, or How to Replace a Component in a RAID 5 Volume. The metastat command will show an invoke recovery message with the appropriate action to take with the metareplace command, which must be run with the -f flag. This state indicates that data might be fabricated due to multiple failed slices.
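The corresponding metareplace invocations follow the same pattern in each case. A sketch, with illustrative volume and slice names:

To enable a failed slice in place, for example after a transient error has been resolved:

# metareplace -e d40 c1t2d0s2

To replace a failed slice with a different slice:

# metareplace d40 c1t2d0s2 c2t2d0s2

For a slice in the “Last Erred” state, the metastat invoke message directs you to run the same replacement with the -f flag, after any “Maintenance” slices have been repaired first.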

Overview of Replacing and Enabling Slices in RAID 5 Volumes

Solaris Volume Manager has the capability to replace and enable components within mirrors and RAID 5 volumes. The issues and requirements for doing so are the same for mirrors and RAID 5 volumes. For more information, see Overview of Replacing and Enabling Components in RAID 1 and RAID 5 Volumes.

Scenario—RAID 5 Volumes

RAID 5 volumes allow you to have redundant storage without the overhead of RAID 1 volumes, which require two times the total storage space to provide data redundancy. By setting up a RAID 5 volume, you can provide redundant storage of greater capacity than you could achieve with RAID 1 on the same set of disk components. With the help of hot spares (see Chapter 16, Hot Spare Pools (Overview) and specifically How Hot Spares Work), a RAID 5 volume provides nearly the same level of safety. The drawbacks are increased write time and markedly impaired performance in the event of a component failure, but those trade-offs might be insignificant for many situations. The following example, drawing on the sample system explained in Chapter 5, Configuring and Using Solaris Volume Manager (Scenario), describes how RAID 5 volumes can provide extra storage capacity.

Other scenarios for RAID 0 and RAID 1 volumes used six slices (c1t1d0, c1t2d0, c1t3d0, c2t1d0, c2t2d0, c2t3d0) on six disks, spread over two controllers, to provide 27 Gbytes of redundant storage. By using the same slices in a RAID 5 configuration, 45 Gbytes of storage is available (at 9 Gbytes per slice, five slices' worth of data, with the equivalent of the sixth slice devoted to parity), and the configuration can withstand a single component failure without data loss or access interruption. By adding hot spares to the configuration, the RAID 5 volume can withstand additional component failures. The most significant drawback to this approach is that a controller failure would result in data loss to this RAID 5 volume, while it would not with the RAID 1 volume described in Scenario—RAID 1 Volumes (Mirrors).
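A sketch of how such a RAID 5 volume could be created from those six slices; the volume name d45 is illustrative, and each disk is assumed to contribute slice 0:

# metainit d45 -r c1t1d0s0 c1t2d0s0 c1t3d0s0 c2t1d0s0 c2t2d0s0 c2t3d0s0

A hot spare pool, if one already exists, can be associated at creation time with the -h option (for example, -h hsp050).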