This chapter provides conceptual information about Solaris Volume Manager's RAID-5 volumes. For information about performing related tasks, see Chapter 15, RAID-5 Volumes (Tasks).
RAID level 5 is similar to striping, but with parity data distributed across all components (disk or logical volume). If a component fails, the data on the failed component can be rebuilt from the distributed data and parity information on the other components. In Solaris Volume Manager, a RAID-5 volume is a volume that supports RAID level 5.
A RAID-5 volume uses storage capacity equivalent to one component in the volume to store redundant information (parity). This parity information describes the user data stored on the remainder of the RAID-5 volume's components. That is, whether the volume has three components or five, the equivalent of exactly one component is used for parity information. The parity information is distributed across all components in the volume. Like a mirror, a RAID-5 volume increases data availability, but at a lower hardware cost and with only a moderate penalty for write operations. However, you cannot use a RAID-5 volume for the root (/), /usr, and swap file systems, or for other existing file systems.
Solaris Volume Manager automatically resynchronizes a RAID-5 volume when you replace an existing component. Solaris Volume Manager also resynchronizes RAID-5 volumes during rebooting if a system failure or panic took place.
Figure 14–1 illustrates a RAID-5 volume that consists of four disks (components).
The first three data segments are written to Component A (interlace 1), Component B (interlace 2), and Component C (interlace 3). The next data segment that is written is a parity segment. This parity segment is written to Component D (P 1–3). This segment consists of an exclusive OR of the first three segments of data. The next three data segments are written to Component A (interlace 4), Component B (interlace 5), and Component D (interlace 6). Then, another parity segment is written to Component C (P 4–6).
This pattern of writing data and parity segments results in both data and parity being spread across all disks in the RAID-5 volume. Each drive can be read independently. The parity protects against a single disk failure. If each disk in this example were 2 Gbytes, the total capacity of the RAID-5 volume would be 6 Gbytes. One drive's worth of space is allocated to parity.
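The exclusive-OR relationship between data and parity segments can be illustrated with a short shell sketch. The byte values below are arbitrary, made-up data, not taken from any real volume:

```shell
# Three data "segments" (arbitrary example byte values).
d1=23; d2=142; d3=77

# The parity segment (P 1-3) is the exclusive OR of the data segments.
p=$(( d1 ^ d2 ^ d3 ))

# If the component holding d2 fails, XORing the surviving data
# segments with the parity segment rebuilds the lost data.
rebuilt=$(( d1 ^ d3 ^ p ))
echo "$rebuilt"   # prints 142, the original value of d2
```

Because XOR is associative and commutative, the same reconstruction works regardless of which single component is lost, which is why the volume tolerates exactly one failure.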
The following figure shows an example of a RAID-5 volume that initially consisted of four disks (components). A fifth disk has been dynamically concatenated to the volume to expand it.
The parity areas are allocated when the initial RAID-5 volume is created. One component's worth of space is allocated to parity, although the actual parity blocks are distributed across all of the original components to distribute I/O. When additional components are concatenated to the RAID-5 volume, the additional space is devoted entirely to data. No new parity blocks are allocated. The data on the concatenated component is, however, included in the parity calculations, so the data is protected against single device failures.
Concatenated RAID-5 volumes are not suited for long-term use. Use a concatenated RAID-5 volume until it is possible to reconfigure a larger RAID-5 volume. Then, copy the data to the larger volume.
When you add a new component to a RAID-5 volume, Solaris Volume Manager “zeros” all the blocks in that component. This process ensures that the parity protects the new data. As data is written to the additional space, Solaris Volume Manager includes the data in the parity calculations.
When you work with RAID-5 volumes, consider the Requirements for RAID-5 Volumes and Guidelines for RAID-5 Volumes. Many striping guidelines also apply to RAID-5 volume configurations. See RAID-0 Volume Requirements.
A RAID-5 volume must consist of at least three components. The more components a RAID-5 volume contains, however, the longer read and write operations take when a component fails.
RAID-5 volumes cannot be striped, concatenated, or mirrored.
Do not create a RAID-5 volume from a component that contains an existing file system. Doing so will erase the data during the RAID-5 initialization process.
When you create a RAID-5 volume, you can define the interlace value. If not specified, the interlace value defaults to 16 Kbytes. This value is reasonable for most applications.
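For example, a RAID-5 volume with an explicit interlace value might be created with the metainit command as follows. The volume name d45 and the slice names are placeholders for your own configuration:

```shell
# metainit d45 -r c1t1d0s2 c1t2d0s2 c1t3d0s2 -i 32k
d45: RAID is setup
```

The -r option identifies the volume as RAID-5, and -i sets the interlace value. See the metainit(1M) man page for the full syntax.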
A RAID-5 volume (with no hot spares) can only handle a single component failure.
When you create RAID-5 volumes, use components across separate controllers. Controllers and associated cables tend to fail more often than disks.
Use components of the same size. Creating a RAID-5 volume with components of different sizes results in unused disk space.
Because of the complexity of parity calculations, volumes with greater than about 20 percent writes should probably not be RAID-5 volumes. If data redundancy on a write-heavy volume is needed, consider mirroring.
If the different components in a RAID-5 volume reside on different controllers and the accesses to the volume are primarily large sequential accesses, then setting the interlace value to 32 Kbytes might improve performance.
You can expand a RAID-5 volume by concatenating additional components to the volume. Concatenating a new component to an existing RAID-5 volume decreases the overall performance of the volume because the data on concatenations is sequential. Data is not striped across all components. The original components of the volume have data and parity striped across all components. This striping is lost for the concatenated component. However, the data is still recoverable from errors because the parity is used during the component I/O. The resulting RAID-5 volume continues to handle a single component failure.
Concatenated components also differ in the sense that they do not have parity striped on any of the regions. Thus, the entire contents of the component are available for data.
Any performance enhancements for large or sequential writes are lost when components are concatenated.
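Concatenating an additional component to an existing RAID-5 volume is done with the metattach command. In this sketch, d45 is an existing RAID-5 volume and c3t1d0s2 is a hypothetical new slice:

```shell
# metattach d45 c3t1d0s2
```

The new space holds data only; as described above, no new parity regions are allocated on the attached component. See the metattach(1M) man page for details.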
You can create a RAID-5 volume without having to “zero out” the data blocks. To do so, do one of the following:
Use the metainit command with the -k option. The -k option recreates the RAID-5 volume without initializing it, and sets the disk blocks to the “Okay” state. This option is potentially dangerous, as any errors that exist on disk blocks within the volume will cause unpredictable behavior from Solaris Volume Manager, including the possibility of fabricated data.
Initialize the device and restore data from tape. See the metainit(1M) man page for more information.
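The first approach might look like the following. The volume and slice names are illustrative, and the exact option syntax should be confirmed against the metainit(1M) man page. Only use this form when you are certain the components' contents are consistent, or when you plan to restore the data from backup afterward:

```shell
# metainit -k d45 -r c1t1d0s2 c1t2d0s2 c1t3d0s2
```

The -k option sets the disk blocks to the "Okay" state without zeroing them, with the risks described above.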
You can check the status of RAID-5 volumes by looking at the volume states and the slice states for the volume. The slice state provides the most specific information when you are troubleshooting RAID-5 volume errors. The RAID-5 volume state only provides general status information, such as “Okay” or “Maintenance.”
If the RAID-5 volume state reports a “Maintenance” state, refer to the slice state. The slice state specifically reports whether the slice is in the “Maintenance” state or the “Last Erred” state. You take a different recovery action depending on which state the slice is in. If you only have a slice in the “Maintenance” state, it can be repaired without loss of data. If you have a slice in the “Maintenance” state and a slice in the “Last Erred” state, data has probably been corrupted. You must fix the slice in the “Maintenance” state first, then fix the “Last Erred” slice.
The following table explains RAID-5 volume states.
Table 14–1 RAID-5 Volume States
| State | Meaning |
| --- | --- |
| Initializing | Slices are in the process of having all disk blocks zeroed. This process is necessary due to the nature of RAID-5 volumes with respect to data and parity interlace striping. Once the state changes to “Okay,” the initialization process is complete and you are able to open the device. Until then, applications receive error messages. |
| Okay | The device is ready for use and is currently free from errors. |
| Maintenance | A slice has been marked as failed due to I/O or open errors. These errors were encountered during a read or write operation. |
The following table explains the slice states for a RAID-5 volume and possible actions to take.
Table 14–2 RAID-5 Slice States
| State | Meaning | Action |
| --- | --- | --- |
| Initializing | Slices are in the process of having all disk blocks zeroed. This process is necessary due to the nature of RAID-5 volumes with respect to data and parity interlace striping. | Normally, none. If an I/O error occurs during this process, the device goes into the “Maintenance” state. If the initialization fails, the volume is in the “Initialization Failed” state, and the slice is in the “Maintenance” state. If this happens, clear the volume and recreate it. |
| Okay | The device is ready for use and is currently free from errors. | None. Slices can be added or replaced, if necessary. |
| Resyncing | The slice is actively being resynchronized. An error has occurred and been corrected, a slice has been enabled, or a slice has been added. | If desired, monitor the RAID-5 volume status until the resynchronization is done. |
| Maintenance | A single slice has been marked as failed due to I/O or open errors. These errors were encountered during a read or write operation. | Enable or replace the failed slice. See How to Enable a Component in a RAID-5 Volume, or How to Replace a Component in a RAID-5 Volume. The metastat command will show an invoke recovery message with the appropriate action to take with the metareplace command. |
| Maintenance/Last Erred | Multiple slices have encountered errors. The state of the failed slices is either “Maintenance” or “Last Erred.” In this state, no I/O is attempted on the slice that is in the “Maintenance” state. However, I/O is attempted on the slice marked “Last Erred” with the outcome being the overall status of the I/O request. | Enable or replace the failed slices. See How to Enable a Component in a RAID-5 Volume, or How to Replace a Component in a RAID-5 Volume. The metastat command will show an invoke recovery message with the appropriate action to take with the metareplace command. This command must be run with the -f flag. This state indicates that data might be fabricated due to multiple failed slices. |
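A typical recovery sequence might look like the following sketch, using an illustrative volume d45 and failed slice c2t3d0s2. The metastat output itself is omitted here; the command names and flags are real, but confirm the exact invocation against the “Invoke:” line that metastat prints for your volume:

```shell
# Inspect the volume; a failed slice produces an "Invoke:" recovery
# message naming the metareplace command to run.
# metastat d45

# Enable the slice in place once the underlying disk is repaired:
# metareplace -e d45 c2t3d0s2

# For a slice in the "Last Erred" state, the -f flag is required:
# metareplace -f -e d45 c2t3d0s2
```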
Solaris Volume Manager has the capability to replace and enable components within mirrors and RAID-5 volumes. The issues and requirements for doing so are the same for mirrors and RAID-5 volumes. For more information, see Overview of Replacing and Enabling Components in RAID-1 and RAID-5 Volumes.
RAID-5 volumes allow you to have redundant storage without the overhead of RAID-1 volumes, which require two times the total storage space to provide data redundancy. By setting up a RAID-5 volume, you can provide redundant storage of greater capacity than you could achieve with a RAID-1 volume on the same set of disk components. In addition, with the help of hot spares (see Chapter 16, Hot Spare Pools (Overview) and specifically How Hot Spares Work), you can achieve nearly the same level of safety. The drawbacks are increased write time and markedly impaired performance in the event of a component failure. However, those tradeoffs might be insignificant for many situations. The following example, drawing on the sample scenario explained in Chapter 5, Configuring and Using Solaris Volume Manager (Scenario), describes how RAID-5 volumes can provide extra storage capacity.
Other scenarios for RAID-0 and RAID-1 volumes used 6 slices (c1t1d0, c1t2d0, c1t3d0, c2t1d0, c2t2d0, c2t3d0) on 6 disks, spread over 2 controllers, to provide 27 Gbytes of redundant storage. By using the same slices in a RAID-5 configuration, 45 Gbytes of storage is available. Also, the configuration can withstand a single component failure without data loss or access interruption. By adding hot spares to the configuration, the RAID-5 volume can withstand additional component failures. The most significant drawback to this approach is that a controller failure would result in data loss to this RAID-5 volume, while it would not with the RAID-1 volume described in Scenario—RAID-1 Volumes (Mirrors).
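The capacity arithmetic behind these figures can be checked directly. A slice size of 9 Gbytes is implied by the numbers in the scenario (6 slices yielding 27 Gbytes mirrored and 45 Gbytes as RAID-5):

```shell
slices=6
gb_per_slice=9

# A RAID-1 (mirror) configuration stores every block twice,
# so usable capacity is half of the raw total.
raid1_gb=$(( slices * gb_per_slice / 2 ))

# A RAID-5 volume devotes one slice's worth of space to parity.
raid5_gb=$(( (slices - 1) * gb_per_slice ))

echo "RAID-1: ${raid1_gb} Gbytes, RAID-5: ${raid5_gb} Gbytes"
# prints: RAID-1: 27 Gbytes, RAID-5: 45 Gbytes
```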