This chapter describes some hints and tips for using DiskSuite.
Creating a trans metadevice (UFS logging) is an easy way to increase availability of UFS. Here's a tip that makes efficient use of slices when using trans metadevices:
On new systems, create two to three small slices (8-10 Mbytes each) on each disk. Use these slices to hold not only state database replicas but also logging devices. As a general rule of thumb, as you add new disks to a system, use this method for configuring state database replicas and trans metadevices. That way, you can take care of two DiskSuite objects with one slice.
For more information on adding a state database replica or creating a trans metadevice, refer to Chapter 2, Creating DiskSuite Objects.
Prestoserve™ is a hardware/software product that speeds response time in disk write-bound applications. The product accelerates performance by selectively caching disk block device write operations in non-volatile memory, reducing disk I/O bottlenecks.
Prestoserve improves the performance of NFS™ servers, many disk I/O-bound applications, and many file systems.
DiskSuite is fully compatible with Prestoserve, with the following restrictions.
Prestoserve can be used with:
Stripes/concatenations
Top-level metadevices (use with mirrors is discouraged; see "Why is Using Prestoserve With Mirrors Discouraged?")
Prestoserve cannot be used with:
Underlying components (for example, submirrors)
State database replicas
Trans metadevices (including their underlying master devices and logging devices)
The simple reason is that using Prestoserve together with mirrors introduces a single point of failure into the I/O subsystem, which is exactly what mirrors are designed to avoid. The use of Prestoserve lowers the MTBF of a mirror to approximately that of a single disk.
Prestoserve cannot be used on trans metadevices. Using Prestoserve on a logging UFS can cause system hangs or panics. Prestoserve operates by redirecting I/O from a device to NVRAM. This redirection interferes with the communication protocol between a logging UFS and a metadevice.
The following steps describe how to load and enable Prestoserve for use with DiskSuite. Basically, you edit the /etc/system file to load Prestoserve after the DiskSuite driver.
Add the following line to the /etc/system file.
    exclude: drv/pr
Edit the /etc/init.d/lvm.init file and add the following lines to the end of the "start" clause.
    'start')
        rm -f /tmp/.mdlock
        if [ -x "$METAINIT" -a -c "$METADEV" ]; then
            #echo "$METAINIT -r"
            $METAINIT -r
            error=$?
            #echo "$error"
            case "$error" in
            0|1) ;;
            66) echo "Insufficient metadevice database replicas located."
                echo ""
                echo "Use metadb to delete databases which are broken."
                echo "Ignore any \"Read-only file system\" error messages."
                echo "Reboot the system when finished to reload the metadevice database."
                echo "After reboot, repair any broken database replicas which were deleted."
                /sbin/sulogin < /dev/console
                echo "Resuming system initialization. Metadevice database will remain stale."
                ;;
            *)  echo "Unknown $METAINIT -r failure $error."
                ;;
            esac
            modload /kernel/drv/pr
            presto -p >/dev/null
        fi
        ;;
Edit the /etc/init.d/prestoserve file.
Replace the following line:
    presto -u
with the line:
    presto -u /filesystem...
In this command, filesystem... is a list of every file system to be accelerated with Prestoserve. Do not include any file system that falls under the restrictions listed above (for example, a file system mounted on a trans metadevice).
A poorly designed DiskSuite configuration can degrade performance. This section offers tips for getting good performance from DiskSuite.
Disk and controllers - Place drives in a metadevice on separate drive paths. For SCSI drives, this means separate host adaptors. For IPI drives, this means separate controllers. Spreading the I/O load over several controllers improves metadevice performance and availability.
For SPARCstorage Arrays, you should use drives in a mirror on different chassis, if possible. This type of configuration ensures that mirror data survives a SPARCstorage Array chassis failure. If you cannot spread drives across different chassis, use drives in different trays. This enables you to take submirrors offline so that a tray can be spun down or removed for maintenance while the mirror stays online.
For example, consider a two-way mirror with each submirror composed of a concatenation of three SPARCstorage Array disks. One submirror would consist of three disks in tray 1, and the other submirror would consist of drives in tray 2. Using the command line interface to initialize this configuration would look like this:
    # metainit d1 3 1 c0t0d0s2 1 c0t0d1s2 1 c0t0d2s2
    d1: Concat/Stripe is setup
    # metainit d2 3 1 c0t2d0s2 1 c0t2d1s2 1 c0t2d2s2
    d2: Concat/Stripe is setup
    # metainit d0 -m d1
    d0: Mirror is setup
    # metattach d0 d2
    d0: Component d2 is attached
Strings t0 and t1 are contained in tray 1, t2 and t3 in tray 2, and t4 and t5 are in tray 3. Hence, in the above commands, to create submirrors in different trays, we use string t0 for one submirror, and t2 for the second submirror.
System Files - Never edit or remove the /etc/lvm/mddb.cf or /etc/lvm/md.cf files.
Make sure these files are backed up on a regular basis.
After a slice is defined as a metadevice and activated, do not use it for any other purpose.
Have a hardcopy of output from the prtvtoc(1M) command in case you need to reformat a bad disk.
When a state database replica is placed on a slice that becomes part of a metadevice, the capacity of the metadevice is reduced by the space occupied by the replica(s). The space used by a replica is rounded up to the next cylinder boundary, and this space is skipped by the metadevice. However, the default size of a state database replica is only 1034 blocks, and combining replicas and metadevices in this way is actually a very efficient use of DiskSuite.
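As a sketch of that arithmetic (the 1520 blocks-per-cylinder figure below is a hypothetical geometry; real cylinder sizes vary by disk):

```shell
# Space lost to a replica on a slice used in a metadevice: the default
# replica size (1034 blocks) is rounded up to the next cylinder boundary,
# and the metadevice skips that whole region.
replica=1034
cyl=1520            # hypothetical blocks per cylinder
skipped=$(( (replica + cyl - 1) / cyl * cyl ))
echo "blocks skipped by the metadevice: $skipped"
```

With this example geometry, a 1034-block replica costs one full cylinder (1520 blocks) of metadevice capacity.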
Stripes are created only from slices, not other metadevices.
Do not use slices from disks with different disk geometry.
Use slices on the same controller but on different disks. Using stripes that are each on different controllers can increase the number of simultaneous reads and writes that can be performed.
Set up a stripe's interlace value to better match the I/O requests made by the system or applications.
Do not stripe partitions that are on a single disk. This practice eliminates simultaneous access and causes performance problems.
Use the same size disk components. Striping different size disk components results in unused disk space.
If you use slices of different sizes when striping, then disk capacity is limited to a multiple of the smallest.
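The capacity limit above can be sketched with hypothetical slice sizes:

```shell
# Hypothetical three-slice stripe over slices of 400, 525, and 610 Mbytes.
# Usable capacity is limited to (number of slices) x (smallest slice).
n=3
smallest=400
total=$((400 + 525 + 610))
usable=$((n * smallest))
echo "usable: $usable of $total Mbytes ($((total - usable)) Mbytes unused)"
```

In this example the stripe wastes 335 Mbytes, which is why same-size components are recommended.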
Concatenations are created only from slices, not other metadevices.
Avoid using slices with different disk geometries.
When possible, distribute the components of a concatenated metadevice across different controllers and buses.
Concatenation uses fewer CPU cycles than striping. It performs well for small random I/O and for even I/O distribution.
Read the guidelines for stripes and concatenations above.
Disks and controllers - Keep the slices of different submirrors on different disks and controllers. Data protection is diminished considerably if slices of two or more submirrors of the same mirror are on the same disk. Likewise, organize submirrors across separate controllers, because controllers and associated cables tend to fail more often than disks. This practice also improves mirror performance.
Same disk - Do not place the submirrors of a mirror on the same disk. Writes to the same drive contend for the same resources, and the failure of that one drive would mean the loss of all data.
Read/write performance - Mirroring may improve read performance, but write performance is always degraded. Mirroring improves read performance only in threaded or asynchronous I/O situations. No performance gain results if there is only a single thread reading from the metadevice.
Same-size submirrors - Use the same size submirrors. Different size submirrors result in unused disk space.
Same-type disks and controllers - Use the same type of disks and controllers in a single mirror. Particularly in old SCSI or SMD storage devices, different models or brands of disk or controller can have widely varying performance. Mixing the different performance levels in a single mirror can cause performance to degrade significantly.
Setting read and write policies for submirrors - Experimenting with the mirror read policies can improve performance. For example, the default read mode is to alternate reads in a round-robin fashion among the disks. This is the default because it tends to work best for UFS multi-user, multi-process activity.
In some cases, the geometric read option improves performance by minimizing head motion and access time. This option is most effective when there is only one slice per disk, when only one process at a time is using the slice/file system, and when I/O patterns are highly sequential or when all accesses are read.
To change mirror options, refer to "How to Change a Mirror's Options (DiskSuite Tool)".
Mounting mirrors - Mount only the mirror device itself. Do not attempt to mount a submirror directly, unless it is offline and mounted read-only. Do not mount a slice that is part of a submirror; doing so could destroy data and crash the system.
Mirroring swap - Use swap -l to check for all swap devices. Slices specified as swap must be mirrored separately.
Follow the 20-percent write rule - Because of the complexity of parity calculations, metadevices with greater than about 20 percent writes should probably not be RAID5 metadevices. If data redundancy is needed, consider mirroring.
Drawbacks to a "slice-heavy" RAID5 metadevice - The more slices a RAID5 metadevice contains, the longer read and write operations will take when a component fails.
RAID5 metadevices cannot be mirrored.
RAID5 metadevices and striping guidelines - Striping guidelines also apply to RAID5 metadevice configurations. Refer to "Striping Guidelines".
Use different controllers - When creating RAID5 metadevices, use slices across separate controllers, because controllers and associated cables tend to fail more often than disks. This practice also improves performance.
Use same-size slices - Use the same size disk slices. Creating a RAID5 metadevice of different size slices results in unused disk space.
Interlace value - The interlace value is configurable when the metadevice is created; thereafter, it cannot be modified. The default interlace value is 16 Kbytes, which is reasonable for most applications. If the slices in the RAID5 metadevice reside on different controllers and accesses to the metadevice are primarily large sequential accesses, an interlace value of 32 Kbytes might perform better.
Concatenating to a RAID5 metadevice - Concatenating a new slice to an existing RAID5 will have an impact on the overall performance of the metadevice because the data on concatenations is sequential; data is not striped across all components. The original slices of the metadevice have data and parity striped across all slices. This striping is lost for the concatenated slice, although the data is still recoverable from errors because the parity is used during the component I/O.
Concatenated slices also differ in the sense that they do not have parity striped on any of the regions. Thus, the entire contents of the slice are available for data.
Any performance enhancements for large or sequential writes are lost when slices are concatenated.
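A rough sketch of the space and interlace arithmetic for a hypothetical RAID5 metadevice (the slice count, slice size, and request size below are illustrative):

```shell
# Hypothetical RAID5 metadevice: 4 slices of 500 Mbytes each.
# The equivalent of one slice's capacity is consumed by distributed parity.
n=4
slice=500
echo "usable capacity: $(( (n - 1) * slice )) Mbytes"

# With a 32-Kbyte interlace, a 128-Kbyte sequential request touches
# io/interlace interlace-sized chunks, spread across the slices.
io=128
interlace=32
echo "chunks per request: $((io / interlace))"
```

This is why a larger interlace can suit large sequential accesses: fewer, bigger chunks per request.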
For logging and master devices - Place the logging device and master device that belong to the same trans metadevice on separate drives and controllers.
For trans metadevices and shared logging devices - Trans metadevices can share metadevice logging devices. But file systems with the heaviest loads should have separate logs.
For small file systems - Small file systems with mostly read operations probably do not need to be logged.
For mirroring logging devices - Mirror all logging devices whenever possible. Losing the data in a log because of device errors can leave a file system in an inconsistent state which fsck(1M) may not be able to repair without user intervention.
Larger logging devices result in greater concurrency.
For hot spares as temporary fixes - Hot spares are not designed to remain a permanent part of your configuration. They need to be replaced with repaired or new slices.
For hot spares and state database replicas - Hot spares cannot contain state database replicas.
For cross-controller assigning - Ideally, slices added to the hot spare pool should be attached to different controllers. This helps keep data available in the event of a controller error or failure.
For wrong-size hot spares - Do not associate hot spares of the wrong size with submirrors or RAID5 metadevices.
For hot spares marked In Use - Make sure that no hot spare within a hot spare pool is marked In Use.
For one-way mirrors and hot spares - Do not assign a hot spare pool to a submirror in a one-way mirror.
Hot spares are used on a first-fit basis - When adding hot spares of different sizes to a hot spare pool, add the smaller slices first.
Do not mount file systems on a metadevice's underlying slice. If a slice is used for a metadevice of any kind, you must not mount that slice as a file system. If possible, unmount any physical device you intend to use as a metadevice before you activate it. For example, if you create a trans metadevice for a UFS, in the /etc/vfstab file, you would specify the trans metadevice name as the device to mount and fsck.
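For instance, an /etc/vfstab entry for a hypothetical trans metadevice d1 mounted on /fs2 might look like this (the d1 and /fs2 names are illustrative):

```
#device to mount  device to fsck   mount point  FS type  fsck pass  mount at boot  mount options
/dev/md/dsk/d1    /dev/md/rdsk/d1  /fs2         ufs      2          yes            -
```

Note that the metadevice name, not the underlying slice, appears in both the mount and fsck device columns.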
All physical devices must have a disk label, normally created by programs such as install, format, or fmthard. The label can appear on more than one of the logical partitions defined in the label. Physical partitions that contain a label should not allow a user to write to the block that contains the label; normally, this is block 0. Be aware, however, that UNIX device drivers do allow a user to overwrite this label.
DiskSuite does not provide an audit trail for any reconfiguration of metadevices that may be performed on the system. This means that DiskSuite does not support C2 security.
Systems running Solstice DiskSuite 4.2.1 must be running Solaris 7 or Solaris 8.
UFS logging and disksets require you to run Solaris 7 or Solaris 8.
DiskSuite is compatible with the Solstice Backup™ 5.5.1 product.
If you need to repartition a disk drive, for example, after a disk replacement, you can create a script using the fmthard(1M) command to quickly recreate the VTOC (Volume Table of Contents) information on the disk.
Use the prtvtoc(1M) command to get a listing of partitioning information for a disk.
    # prtvtoc /dev/rdsk/c2t0d0s0 > /tmp/vtoc
In this example, the information for disk c2t0d0 is redirected to a file on disk.
Create and run a script similar to the following, making use of the fmthard(1M) command.
    for i in 1 2 3 5
    do
        fmthard -s /tmp/vtoc /dev/rdsk/c2t${i}d0s2
    done
You can set up quotas to limit the amount of disk space and number of inodes (roughly equivalent to the number of files) available to users. (This is a feature of Solaris, not of DiskSuite.) These quotas are activated automatically each time a file system is mounted.
File systems set up for quotas are faster to check by using quotacheck if they are also set up for logging. Such a setup can decrease the amount of time quotacheck needs to run.
To create a trans metadevice, refer to Chapter 2, Creating DiskSuite Objects. For more information on quotas, refer to System Administration Guide, Volume II.
This section describes some advanced uses (and limitations) of DiskSuite Tool.
Color Mappings - DiskSuite Tool cannot save Disk View window color mappings when you exit the application. A color mapping is in effect only during that particular DiskSuite Tool session.
Logical Names for Metadevices - DiskSuite Tool currently does not have a mechanism to assign logical names, such as table1, log1, and so on, to metadevices.
Slice Browser "Use" Column - The "Use" column in the Slice Browser does not change from "unassigned" when a slice is used as a raw device. All raw devices, not just metadevices used as raw devices, share this problem: there is no way to register devices for uses other than file systems or swap, and DiskSuite Tool has no mechanism of its own for this purpose.
Here are three tips to help manage screen real estate on the Metadevice Editor's canvas:
Selecting Collapse from an object's pop-up menu enables you to fit more objects onto the canvas.
Selecting Clean Up Canvas from the Edit menu is useful when you have lots of objects on the canvas, and you are putting some of them away, or you are repositioning objects with the mouse. The Clean Up Canvas option rearranges objects on the canvas in a grid, making viewing easier.
Use the sash to resize the canvas. You can widen the canvas area by clicking the sash at the bottom of the Metadevice Editor window and dragging to the right.
Setting filters within DiskSuite Tool on the Slice View and Disk View windows can help you quickly locate suitable slices for the task at hand.
If you have a system with many disks (and slices), searching for available slices of a certain size can be a chore. Using the Slice Filter window can save you time in this activity.
This task describes how to create a filter in the Slice View window for available slices larger than 200 Mbytes, then drag and drop these slices to the Disk View window to see where they are located.
Click Slices to display the Slice View window.
The Slice View window appears.
Select Set Filters from the Filters menu in the Slice View window.
The Slice Filters window appears.
To search for available slices, make sure the "Available for use as" radio button is checked, and that "Anything" is selected in the pull-down.
To filter for slices greater than 200 Mbytes, check the "Size" radio button, select "greater than" in the first pull-down, type 200 in the text box, and select Mbytes in the second pull-down.
Click Apply and view the results in the Slice Browser window.
If necessary, change values in the Slice Filters window and click Apply to change the filtering scheme.
After adjusting the filtering scheme to your satisfaction, close the Slice Filters window by clicking OK.
Click Disk View to display the Disk View window.
The Disk View window appears.
In the Slice Browser window, click Select All. Then drag the selected slices to a color drop site in the Disk View window.
View the results in the Disk View window.
DiskSuite Tool uses the selected drop site's color for all slices dragged to the Disk View window. You can now make your slice selection (for example to create a submirror) following the considerations outlined in "General Guidelines".
This task shows how to use DiskSuite Tool to find a suitably sized replacement slice for an errored slice in a submirror.
This approach is not limited to mirrors. You can use this task to find replacement slices for any type of metadevice.
Click Disk View to display the Disk View window.
The Disk View window appears.
Drag the errored Mirror object from the Objects list to the canvas.
Select one submirror (a Concat/Stripe object) within the mirror and drag it to the Disk View window. Then do the same for the second submirror (and third submirror if this is a three-way mirror).
The Disk View window colors the slices of each submirror in the Mirror object with a different color. This helps you see where the slices are located, for example, across controllers.
Click Slices to display the Slice View window.
The Slice View window appears.
Click Set Filters in the Slice View window.
The Slice Filters window appears.
To search for available slices, make sure the "Available for use as" radio button is checked, and that "Metadevice Component" is selected in the pull-down.
Filter for slices to replace the errored slice.
One way to do this is to set up a filter that finds slices greater than a size that is slightly smaller than the errored slice. This will display a larger range of slices than if you set up a filter that searches for slices equal to the errored slice size.
Check the "Size" radio button, select "greater than" in the first pull-down, type the size of the slice (in Mbytes, and slightly smaller than the errored slice's size) in the text box, and select Mbytes in the second pull-down.
Click Apply and view the results in the Slice Browser window.
If necessary, change values in the Slice Filters window and click Apply to change the filtering scheme.
In the Slice Browser window, click Select All. Then drag the selected slices to a color drop site in the Disk View window.
View the results in the Disk View window.
DiskSuite Tool uses the selected drop site's color for all slices dragged to the Disk View window.
Select a replacement slice.
You can now make your slice selection for a DiskSuite object following the guidelines outlined in "General Guidelines". Pick a replacement that is large enough and that follows the mirror guidelines (on a different controller, or at least a different disk).
Drag the replacement slice from the Disk View window to the rectangle of the Concat/Stripe object with the errored slice.
Commit the mirror.
Click inside the top of the Mirror object then click Commit. A mirror resync begins.
By default, DiskSuite Tool uses colors and fonts that are compatible with the OpenWindows™ desktop applications. This section describes how to change these colors and fonts.
DiskSuite Tool uses a variety of colors:
Standard foreground color - Provides the color used to display information (such as text) presented in windows, buttons, and other controls.
Standard background color - Provides the default background color for windows, buttons, and other controls.
Canvas background color - Provides the background color for data areas. For example, the display areas for the Editor, Disk View window, and scrolling lists all use the canvas background color.
Mapping colors - Display the mappings from logical devices to their slices in the Disk View window. There are nine mapping colors, one for each of the Disk View mappings.
Status colors - Highlight status information for objects needing attention. There are three status conditions requiring unique colors: Attention, Urgent and Critical.
The X Window System RGB (Red, Green, Blue) color specification mechanism enables you to specify a nearly infinite variety of colors. Of course, many of these colors will appear similar, varying only slightly in shade or intensity.
To aid in selecting and specifying colors, the X Window System provides a standard default set of colors that you can specify by name instead of RGB values. This "database" of color names can be examined using the standard X utility showrgb. It shows the RGB values and a corresponding descriptive alias. For example:
    # showrgb
    199 21 133      medium violet red
    176 196 222     light steel blue
    102 139 139     paleturquoise4
    159 121 238     mediumpurple2
    141 182 205     lightskyblue3
    0 238 118       springgreen2
    255 160 122     light salmon
    154 205 50      yellowgreen
    178 58 238      darkorchid2
    69 139 116      aquamarine4
    ...
    107 107 107     gray42
    71 71 71        gray28
    61 61 61        gray24
    255 255 255     white
    0 205 205       cyan3
    0 0 0           black
You can also examine the default color name database by looking at the /usr/openwin/lib/X11/rgb.txt file.
Unfortunately, there are no standard applications for browsing colors. If you don't have access to a public domain color browser, experiment by trial and error.
DiskSuite Tool's default colors are shown in Table 8-1.
Table 8-1 DiskSuite Tool's Default Colors
| Color Type | Color |
|---|---|
| Standard Foreground | black |
| Standard Background | gray |
| Canvas Background | gray66 |
| Mapping Colors: | |
| mappingColor1 | blue |
| mappingColor2 | green |
| mappingColor3 | magenta |
| mappingColor4 | cyan |
| mappingColor5 | purple |
| mappingColor6 | mediumseagreen |
| mappingColor7 | firebrick |
| mappingColor8 | tan |
| mappingColor9 | white |
| Status Colors: | |
| Critical | red |
| Urgent | orange |
| Attention | yellow |
DiskSuite Tool uses four different fonts:
Standard font - Displays almost all text in the tool, for example, in button labels, menus, and dialog boxes.
Mono-spaced (fixed-width) font - Enables consistent columnar alignment, for example, in the various browsers and scrolling lists. This font is specified in several resources (see Table 8-3).
Bold font - Distinguishes attribute names and labels from the actual attribute values. The names/labels in Information windows appear in the standard font and the corresponding values appear in the bold font. This font is used sparingly.
Small font - Shows the physical devices at the 50 percent scaling level in the Disk View window.
The available fonts depend on which X Window System server you use to display the application. The standard X utility, xlsfonts(1), displays the available fonts on a server. For example:
    # xlsfonts
    --courier-bold-o-normal--0-0-0-0-m-0-iso8859-1
    --courier-bold-r-normal--0-0-0-0-m-0-iso8859-1
    --courier-medium-o-normal--0-0-0-0-m-0-iso8859-1
    --courier-medium-r-normal--0-0-0-0-m-0-iso8859-1
    --symbol-medium-r-normal--0-0-0-0-p-0--symbol
    -symbol-medium-r-normal--0-0-0-0-p-0-sun-fontspecific
    -adobe-courier-bold-i-normal--0-0-0-0-m-0-iso8859-1
    ...
    utopia-bolditalic
    utopia-italic
    utopia-regular
    variable
    vshd
    vtbold
    vtsingle
    zapfchancery-mediumitalic
    zapfdingbats
Another helpful utility for displaying available fonts is xfontsel(1). Refer to the man pages for these utilities for more information.
DiskSuite Tool's default fonts all come from the Lucida font family:
Table 8-2 DiskSuite Tool's Default Fonts
| Font Type | Font |
|---|---|
| Standard Font | lucidasans12 |
| Mono-spaced Font | lucidasans-typewriter12 |
| Bold Font | lucidasans-bold12 |
| Small Font | lucidasans8 |
DiskSuite Tool uses the X Window System's resource database mechanism to determine which fonts to use. The default resource specifications are:
Table 8-3 DiskSuite Tool's Default Font Resource Specifications
| Resource | Font |
|---|---|
| Metatool*fontList: | lucidasans12 |
| Metatool*smallFontList: | lucidasans8 |
| Metatool*boldFontList: | lucidasans-bold12 |
| Metatool*fixedFontList: | lucidasans-typewriter12 |
| Metatool*XmList.fontList: | lucidasans-typewriter12 |
| Metatool*Help*helpsubjs.fontlist: | lucidasans-typewriter12 |
| Metatool*Help*helptext.fontlist: | lucidasans-typewriter12 |
You can change DiskSuite Tool's default colors and fonts by using one of the following four methods.
For one invocation of DiskSuite Tool, use the -xrm option to specify the alternate font or color resources.

    # metatool -xrm 'resource'
For all of your invocations of DiskSuite Tool, edit your own .Xdefaults file and specify the alternate color or font resources. The .Xdefaults file is typically loaded when you start your desktop session. After editing this file, the next time you start your desktop session, the new or changed resources will be used.
For the current session, without having to restart, use the xrdb utility.
    # xrdb -merge path_to_.Xdefaults
For all users of DiskSuite Tool, edit the /usr/sadm/lib/lvm/X11/app-defaults/Metatool file. Changes made to this file are recognized the next time DiskSuite Tool is started.
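For example, to use slightly larger fonts in every session, your .Xdefaults file (or the Metatool app-defaults file) might contain lines such as these; the sizes shown are illustrative, and the resource names come from Table 8-3:

```
Metatool*fontList:        lucidasans14
Metatool*boldFontList:    lucidasans-bold14
Metatool*smallFontList:   lucidasans10
```

Any resource not overridden keeps its default value from the app-defaults file.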
This example changes the standard font to lucidasans16 for a single invocation of DiskSuite Tool.
    # metatool -xrm 'Metatool*fontList: lucidasans16'
Using a naming convention for your metadevices can help with your DiskSuite administration and enables you to identify the metadevice type at a glance. Here are a few suggestions:
Use ranges for each particular type of metadevice. For example, assign numbers 0-20 for mirrors, 21-40 for stripes and concatenations, and so on.
Use a naming relationship for mirrors. For example, name mirrors with a number ending in zero (0), and submirrors ending in one (1) and two (2). For example: mirror-d10, submirrors d11 and d12; mirror-d20, submirrors d21 and d22, and so on.
Use a naming method that maps the slice number and disk number to metadevice numbers.
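As a sketch of the mirror-name convention above, a small shell helper (the function name is hypothetical) could derive submirror names from a mirror name that ends in zero:

```shell
# Given a mirror named dN0, the convention names its submirrors dN1 and
# dN2 (for example, mirror d10 has submirrors d11 and d12).
submirrors() {
    base=${1#d}                       # strip the leading "d"
    echo "d$((base + 1)) d$((base + 2))"
}

submirrors d10    # prints "d11 d12"
submirrors d20    # prints "d21 d22"
```

Scripting around a convention like this keeps names consistent as the configuration grows.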
The metarename command enables you to reorganize your metadevice names. Refer to the metarename(1M) man page for more information.
In addition to renaming metadevices, DiskSuite's metarename command also provides the ability to switch "layered" metadevices. When used with the -x option, metarename switches (exchanges) the names of an existing layered metadevice and one of its subdevices. This includes a mirror and one of its submirrors, or a trans metadevice and its master device.
You must use the command line to exchange metadevices. This functionality is currently unavailable in DiskSuite Tool, although you can rename a metadevice with either the command line or DiskSuite Tool.
When to use metadevice name switching - The metarename -x command can make it easier to mirror or unmirror an existing stripe or concatenation, and to create or remove a trans metadevice of an existing metadevice.
Advantages of using metadevice name switching - Switching metadevice names is an administrative convenience for management of metadevice names. For example, you could arrange all file system mount points in a desired numeric range.
Combinations of metadevices that can be switched - The metarename -x command can be used to switch:
Mirror and submirror (concatenation or stripe)
Trans metadevice and master device, where the master device is a concatenation, stripe, mirror, or RAID5 metadevice
The metadevice name switch can take place in both directions.
Trans metadevice that has a mirrored master device - If the master device is a mirror, you cannot directly switch one of the mirror's submirrors for the trans metadevice. You can switch the mirror and the trans metadevice names, or the mirror and one of its submirrors. The relationship for the switch is always parent-child. In essence, you could achieve the same outcome of a submirror/trans metadevice exchange by using a two-step process: first, switch the submirror for the mirror; then switch the mirror for the trans metadevice.
"Rename busy" message when trying to switch components of a trans metadevice - This message could mean one or more of the following: you did not first detach the logging device; you did not unmount the file system using the trans metadevice; you did not use the -f (force) flag option of the metarename command.
You cannot switch (or rename) a metadevice that is currently in use. This includes metadevices used as mounted file systems, as swap, or as active storage for applications or databases. Thus, before using the metarename command, stop all access to the metadevice being renamed. For example, unmount a mounted file system using a metadevice. An application or database must have its own internal method for stopping access.
You cannot switch metadevices in an errored state, or metadevices using a hot spare replacement.
A switch can only take place between metadevices with a direct parent/child relationship. For example, you cannot directly exchange a stripe inside a mirror that serves as a master device with the trans metadevice itself.
You must use the -f (force) flag when switching members of a trans device.
You cannot switch (or rename) a logging device. The workaround is to either detach the logging device, rename it, then reattach it to the trans device; or detach the logging device and attach another logging device of the desired name.
Only metadevices can be switched. You cannot switch slices or hot spares.
If you have an existing stripe, you can use the metarename -x command as part of building a compound metadevice on top of it, such as a mirror built from the concat/stripe, or a trans device with the metadevice as its master device.
This example begins with a concatenation, d1, with a mounted file system, and ends up with the file system mounted on a two-way mirror named d1.
    # metastat d1
    d1: Concat/Stripe
        Size: 5600 blocks
        Stripe 0:
            Device      Start Block     Dbase
            c0t0d0s1    0               No
    # metainit d2 1 1 c1t3d0s1
    d2: Concat/Stripe is setup
    # metainit -f d20 -m d1
    d20: Mirror is setup
    # umount /fs2
    # metarename -x d20 d1
    d20 and d1 have exchanged identities
    # metastat d1
    d1: Mirror
        Submirror 0: d20
          State: Okay
    ...
    d20: Submirror of d1
        State: Okay
    ...
    # metattach d1 d2
    d1: submirror d2 is attached
    # metastat d1
    d1: Mirror
        Submirror 0: d20
          State: Okay
        Submirror 1: d2
          State: Okay
    ...
    # mount /fs2
The metastat command confirms that the concatenation d1 is in the "Okay" state. You use the metainit command to create a second concatenation (d2), and then to force (-f) the creation of mirror d20 from d1. You must unmount the file system before using metarename -x to switch d20 for d1; d1 becomes the top-level device (the mirror), which metastat confirms. You attach d2 as the second submirror, verify the state of the mirror with metastat, then remount the file system. Note that because the mount device for /fs2 did not change, you do not have to edit the /etc/vfstab file.
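If you perform this conversion often, the sequence above can be wrapped in a small script. The following is only a sketch: the command names are shell variables so the steps can be previewed with echo before running them for real, and the metadevice names in the example call are the ones used above.

```shell
#!/bin/sh
# Sketch: grow a mounted concat/stripe into a two-way mirror in place,
# keeping the original metadevice name on the new mirror.
# Command names are variables so the steps can be previewed or stubbed.
METAINIT=${METAINIT:-metainit}
METARENAME=${METARENAME:-metarename}
METATTACH=${METATTACH:-metattach}
UMOUNT=${UMOUNT:-umount}
MOUNT=${MOUNT:-mount}

stripe_to_mirror() {   # usage: stripe_to_mirror stripe newstripe mirror slice mntpt
    $METAINIT "$2" 1 1 "$4"       || return 1   # concat for the second submirror
    $METAINIT -f "$3" -m "$1"     || return 1   # one-way mirror atop the stripe
    $UMOUNT "$5"                  || return 1   # must be unmounted for the switch
    $METARENAME -x "$3" "$1"      || return 1   # mirror takes over the old name
    $METATTACH "$1" "$2"                        # attach second submirror; resync starts
    $MOUNT "$5"
}
# Example: stripe_to_mirror d1 d2 d20 c1t3d0s1 /fs2
```

Because the mirror ends up with the stripe's original name, the /etc/vfstab entry for the file system needs no editing, just as in the example above.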
This example begins with a mirror, d1, with a mounted file system, and ends up with the file system mounted on a trans device named d1.
# metastat d1
d1: Mirror
    Submirror 0: d20
      State: Okay
    Submirror 1: d2
      State: Okay
    ...
# umount /fs2
# metainit d21 -t d1
d21: Trans is setup
# metarename -f -x d21 d1
d21 and d1 have exchanged identities
# metastat d1
d1: Trans
    State: Detached
    Size: 5600 blocks
    Master Device: d21
    ...
# metattach d1 d0
d1: logging device d0 is attached
# mount /fs2
The metastat command confirms that the mirror d1 is in the "Okay" state. You must unmount the file system before using the metainit command to create the trans device d21, with d1 as the master. The metarename -f -x command forces the switch of d21 and d1; d1 is now the top-level trans metadevice, as confirmed by the metastat command. A logging device d0 is attached with the metattach command. You then remount /fs2. Note that because the mount device for /fs2 has not changed (it is still d1), you do not have to edit the /etc/vfstab file.
If you have an existing mirror or trans metadevice, you can use the metarename -x command to remove the mirror or trans metadevice and keep data on an underlying metadevice. For a trans metadevice, as long as the master device is a metadevice (stripe/concat, mirror, or RAID5 metadevice), you keep data on that metadevice.
When you use metarename -x as part of this process, the mount point of the file system remains the same.
This example begins with a mirror, d1, containing a mounted file system, and ends up with the file system mounted on a stripe named d1.
# metastat d1
d1: Mirror
    Submirror 0: d20
      State: Okay
    Submirror 1: d2
      State: Okay
    Pass: 1
    ...
# umount /fs2
# metarename -x d1 d20
d1 and d20 have exchanged identities
# metastat d20
d20: Mirror
    Submirror 0: d1
      State: Okay
    Submirror 1: d2
      State: Okay
    ...
# metadetach d20 d1
d20: submirror d1 is detached
# metaclear -r d20
d20: Mirror is cleared
d2: Concat/Stripe is cleared
# mount /fs2
The metastat command confirms that mirror d1 is in the "Okay" state. The file system is unmounted before exchanging the mirror d1 and its submirror d20. This makes d20 the mirror, as confirmed by metastat. Next, submirror d1 is detached from d20, then mirror d20 and the other submirror, d2, are deleted. Finally, /fs2 is remounted. Note that because the mount device for /fs2 did not change, the /etc/vfstab file does not require editing.
This example begins with a trans metadevice, d1, containing a mounted file system, and ends up with the file system mounted on the trans metadevice's underlying master device, which will be d1.
# metastat d1
d1: Trans
    State: Okay
    Size: 5600 blocks
    Master Device: d21
    Logging Device: d0
d21: Mirror
    Submirror 0: d20
      State: Okay
    Submirror 1: d2
      State: Okay
    ...
d0: Logging device for d1
    State: Okay
    Size: 5350 blocks
# umount /fs2
# metadetach d1
d1: logging device d0 is detached
# metarename -f -x d1 d21
d1 and d21 have exchanged identities
# metastat d21
d21: Trans
    State: Detached
    Size: 5600 blocks
    Master Device: d1
d1: Mirror
    Submirror 0: d20
      State: Okay
    Submirror 1: d2
      State: Okay
# metaclear d21
# fsck /dev/md/dsk/d1
** /dev/md/dsk/d1
** Last Mounted on /fs2
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FILE SYSTEM STATE IN SUPERBLOCK IS WRONG; FIX? y
3 files, 10 used, 2493 free (13 frags, 310 blocks, 0.5% fragmentation)
# mount /fs2
The metastat command confirms that the trans metadevice, d1, is in the "Okay" state. The file system is unmounted before detaching the trans metadevice's logging device. The trans metadevice and its mirrored master device are exchanged using the -f (force) flag. Running metastat again confirms that the exchange occurred. The trans metadevice and (if desired) the logging device are cleared, in this case d21 and d0, respectively. Next, the fsck command is run on the mirror, d1, and the prompt is answered with a y. After the fsck command completes, the file system is remounted. Note that because the mount device for /fs2 did not change, the /etc/vfstab file does not require editing.
This section describes a technique for regaining access to a metadevice that is defined on a failing controller, causing sporadic system panics. If there is another available controller on the system, the metadevice can in effect be "moved" to the new controller by moving the disks to the controller and redefining the metadevice. This technique does away with the need to back up and restore data to the metadevice.
This example consists of a disk that has two slices that are each part of two separate striped metadevices, d100 and d101, containing file systems /user6 and /maplib1, respectively. The affected controller was c5; the disks will be moved to a free controller (c4). This example also uses the md.tab file.
Stop access to the affected stripes.
For example, unmount any file systems associated with the striped metadevice.
# umount /user6
# umount /maplib1
Use metaclear to clear the striped metadevice.
# metaclear d100
d100: Concat/Stripe is cleared
# metaclear d101
d101: Concat/Stripe is cleared
Shut down the server and move the disks to the new controller.
Edit the md.tab file to indicate the new controller in the metadevice names. This example uses "c4" not "c5" for the disk, because the disk was moved to controller 4.
Lines from the md.tab file before the change:

# Stripe /user6
/dev/md/dsk/d100 1 2 /dev/dsk/c5t0d0s3 /dev/dsk/c2t2d0s3
# Stripe /maplib1
/dev/md/dsk/d101 1 2 /dev/dsk/c5t0d0s0 /dev/dsk/c2t2d0s0

After the change:

# Stripe /user6
/dev/md/dsk/d100 1 2 /dev/dsk/c4t0d0s3 /dev/dsk/c2t2d0s3
# Stripe /maplib1
/dev/md/dsk/d101 1 2 /dev/dsk/c4t0d0s0 /dev/dsk/c2t2d0s0
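An edit like this is easy to script. The sketch below is a hypothetical helper for this example only: it rewrites the moved disk's device paths from c5t0d0 to c4t0d0 in an md.tab file, leaving the paths on other controllers (such as c2) untouched. Adjust the disk name, and the md.tab path, for your configuration.

```shell
#!/bin/sh
# Sketch: rewrite the moved disk's controller number in md.tab.
# "c5t0d0" -> "c4t0d0" matches this example; adjust for your disk.
retarget_mdtab() {
    mdtab=$1
    cp "$mdtab" "$mdtab.bak"    # keep a copy so the change can be reviewed or undone
    sed 's|/dev/dsk/c5t0d0|/dev/dsk/c4t0d0|g' "$mdtab.bak" > "$mdtab"
}
# Example: retarget_mdtab /etc/opt/SUNWmd/md.tab
```

Because the pattern anchors on the full disk name rather than just "c5", slices of other disks that happen to contain the string are not disturbed.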
Use metainit to initialize the striped metadevices, then mount them without reinitializing the file systems with newfs.
# metainit d100
d100: Concat/Stripe is setup
# metainit d101
d101: Concat/Stripe is setup
Run metastat to verify that the metadevices are online.
Don't run the newfs command on the metadevices or their associated file systems. Doing so results in massive data loss and the need to restore from tape.
This section describes some tips regarding mirrors and their operation.
The following two tasks show how to change the interlace value of submirrors without destroying a mirror, and how to use a mirror for an online backup.
Use this task to change the interlace value of a mirror's underlying submirrors which are composed of striped metadevices. Using this method does away with the need to recreate the mirror and submirrors and restore data.
To use the command line to perform this task, refer to the metadetach(1M), metainit(1M), and metattach(1M) man pages.
The high-level overview of the steps in this task is:
Detaching submirror1
Clearing submirror1
Creating a new stripe, with the new interlace value, to be used as submirror1
Attaching submirror1 to the mirror
Waiting for the mirror resync to finish
Repeating the above steps for submirror2
Make sure DiskSuite Tool is started.
Double-click the Mirror object in the Objects list.
The object appears on the canvas.
Click inside the submirror to be detached.
Drag the submirror out of the Mirror object to the canvas.
If this is a two-way mirror, the mirror's status changes to "Urgent."
Click the top rectangle of the Mirror object then click Commit.
Create a new submirror with the desired interlace value.
Refer to "How to Create a Striped Metadevice (DiskSuite Tool)".
Drag the new Submirror object to the Mirror object. Then click Commit to commit the mirror.
A mirror resync begins.
The Configuration Log shows that the mirror was committed.
Repeat Step 3 through Step 7 for the second (and possibly third) submirror in the mirror.
Although DiskSuite is not meant to be a "backup product," it does provide a means for backing up mirrored data without unmounting the mirror or taking the entire mirror offline, and without halting the system or denying users access to data. This happens as follows: one of the submirrors is taken offline--temporarily losing the mirroring--and backed up; that submirror is then placed online and resynced as soon as the backup is complete.
You can use this procedure on any file system except root (/). Be aware that this type of backup creates a "snapshot" of an active file system. Depending on how the file system is being used when it is write-locked, some files and file content on the backup may not correspond to the actual files on disk.
If you use this procedure on a two-way mirror, be aware that data redundancy is lost while one submirror is offline for backup. A three-way mirror does not have this problem.
There is some overhead on the system when the offlined submirror is brought back online after the backup is complete.
If you use these procedures regularly, put them into a script for ease of use.
The high-level steps in this procedure are:
Write locking the file system (UFS only). Do not lock root (/).
Using the metaoffline(1M) command to take one submirror offline from the mirror
Unlocking the file system
Backing up the data on the offlined submirror
Using the metaonline(1M) command to place the offlined submirror back online
Before beginning, run the metastat(1M) command to make sure the mirror is in the "Okay" state.
A mirror that is in the "Maintenance" state should be repaired first.
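Checking the state before proceeding can be automated. The sketch below assumes metastat output formatted like the examples in this chapter (indented "State:" lines per submirror); it succeeds only when every state reported for the given mirror is "Okay".

```shell
#!/bin/sh
# Sketch: succeed only if every "State:" line that metastat prints
# for the given mirror reads "Okay". The output format is assumed to
# match the metastat examples in this chapter.
mirror_okay() {
    metastat "$1" | awk '
        /State:/ && $2 != "Okay" { bad = 1 }
        END { exit bad }'
}
# Example: mirror_okay d1 || echo "repair d1 before taking a submirror offline"
```

A backup script could call this check first and refuse to run metaoffline when the mirror needs maintenance.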
For all file systems except root (/), lock the file system from writes.
# /usr/sbin/lockfs -w mount_point
Only a UFS needs to be write-locked. If the metadevice is set up as a raw device for database management software or some other specific application, running lockfs(1M) is not necessary. (You may, however, want to run the appropriate vendor-supplied utility to flush any buffers and lock access.)
Write-locking root (/) causes the system to hang, so it should never be performed.
Take one submirror offline from the mirror.
# metaoffline mirror submirror
In this command,

mirror      Is the metadevice name of the mirror.
submirror   Is the metadevice name of the submirror (metadevice) being taken offline.
Reads will continue to be made from the other submirror. The mirror will be out of sync as soon as the first write is made. This inconsistency is corrected when the offlined submirror is brought back online in Step 6.
There is no need to run fsck(1M) on the offlined file system.
Unlock the file system and allow writes to continue.
# /usr/sbin/lockfs -u mount_point
You may need to perform necessary unlocking procedures based on vendor-dependent utilities used in Step 2 above.
Perform a backup of the offlined submirror. Use ufsdump(1M) or your usual backup utility.
To ensure a proper backup, use the raw metadevice, for example, /dev/md/rdsk/d4. Using "rdsk" allows greater than 2 Gbyte access.
Place the mirror back online.
# metaonline mirror submirror
DiskSuite automatically begins resyncing the submirror with the mirror.
This example uses a mirror named d1, consisting of submirrors d2 and d3. d3 is taken offline and backed up while d2 stays online. The file system on the mirror is /home1.
# /usr/sbin/lockfs -w /home1
# metaoffline d1 d3
d1: submirror d3 is offlined
# /usr/sbin/lockfs -u /home1
(Perform backup using /dev/md/rdsk/d3)
# metaonline d1 d3
d1: submirror d3 is onlined
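As suggested earlier, these steps are good candidates for a script. The sketch below takes the mirror, submirror, and mount point as arguments; the ufsdump invocation and the tape device /dev/rmt/0 are assumptions to adapt locally, and the command names are variables so the sequence can be previewed before use.

```shell
#!/bin/sh
# Sketch of an online mirror backup: lock, offline, unlock, dump, online.
# Command names are variables so the steps can be previewed or stubbed.
# Never use this on root (/); write-locking root hangs the system.
LOCKFS=${LOCKFS:-/usr/sbin/lockfs}
METAOFFLINE=${METAOFFLINE:-metaoffline}
METAONLINE=${METAONLINE:-metaonline}
UFSDUMP=${UFSDUMP:-ufsdump}

mirror_backup() {   # usage: mirror_backup mirror submirror mount_point
    $LOCKFS -w "$3"        || return 1                  # write-lock the UFS
    $METAOFFLINE "$1" "$2" || { $LOCKFS -u "$3"; return 1; }
    $LOCKFS -u "$3"                                     # writes may resume
    $UFSDUMP 0f /dev/rmt/0 "/dev/md/rdsk/$2"            # assumed tape device
    $METAONLINE "$1" "$2"                               # resync begins
}
# Example: mirror_backup d1 d3 /home1
```

Note that the file system is unlocked as soon as the submirror is offline, so users are denied writes only for the brief offline step, not for the duration of the dump.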
If a system with mirrors for root (/), /usr, and swap--the so-called "boot" file systems--is booted into single-user mode (boot -s), these mirrors and possibly all mirrors on the system will appear in the "Needing Maintenance" state when viewed with the metastat command. Furthermore, if writes occur to these slices, metastat shows an increase in dirty regions on the mirrors.
Though this appears potentially dangerous, there is no need for concern. The metasync -r command, which normally occurs during boot to resync mirrors, is interrupted when the system is booted into single-user mode. Once the system is rebooted, metasync -r will run and resync all mirrors.
If this is a concern, run metasync -r manually.
A hot spare pool can contain zero to n hot spares. A hot spare pool can be associated with multiple submirrors and RAID5 metadevices. You can define one hot spare pool with a variety of different size slices and associate it with all the submirrors or RAID5 metadevices; DiskSuite knows how to use a correctly sized hot spare when necessary.
Place the hot spares in a hot spare pool across controllers, so that a single controller failure cannot disable both a failed slice and the hot spare meant to replace it. In this respect, follow the same guidelines as you would for creating submirrors.
This section provides tips for configuring disksets.
Currently, DiskSuite only supports disksets on SPARCstorage Array disks.
Configuring the hardware for use in a diskset configuration can be problematic. The disk drives must be symmetric; that is, each shared drive must appear on both hosts with the same device name and number (controller/target/drive). This task explains how to configure this setup.
On a set of new machines, where the hardware was pre-configured, the desired symmetry occurs by default. You do not need to perform this task.
You must configure device names before creating any metadevices in the diskset. Any other drives on non-built-in controllers will also be affected.
Make sure that the disk controllers are located in slots that will be found in the same order.
This is best achieved by putting the controllers for a given SPARCstorage Array in the same slots on identical processor models. If this is not possible, you must make sure that the slots will probe out in the same order on both processors. Because the SBus is probed in an orderly fashion, this can be achieved, but not easily. It is also recommended that slots be used in order from the lowest to the highest numbered slot, leaving all unused slots at the high end.
The configuration system numbers controllers of the same type in sequence. In this case, "disk drive" is the type, so all controllers for disk drives will affect the order that devices are found. To this end, all the devices that are to be shared should probably be placed before any other disk controllers in the system to make sure that they will be found and accounted for in the correct order.
Once this has been done, you can do one of two things: a complete install on both host machines, or continue with this task. The latter is considerably faster.
One at a time, become root on each host and perform the following:
# rm /etc/path_to_inst*
# reboot -- '-rav'
reboot: rebooted by root
syncing file systems... [1] done
rebooting...
Resetting ...
Rebooting with command: -rav
Boot device: /iommu/sbus/espdma@f,400000/esp@f,800000/sd@3,0  File and args: -rav
Enter filename [kernel/unix]:
Size: 253976+126566+39566 Bytes
Enter default directory for modules [/kernel /usr/kernel]:
SunOS Release 5.4 Generic [UNIX(R) System V Release 4.0]
Copyright (c) 1983-1995, Sun Microsystems, Inc.
Name of system file [etc/system]:
The /etc/path_to_inst on your system does not exist or is empty.
Do you want to rebuild this file [n]? y
Using default device instance data
root filesystem type [ufs]:
Enter physical name of root device
[/iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@3,0:a]:
...
The system is ready.
console login: root
Password: <root-password>
# /usr/bin/rm -r /dev/*dsk/*
# /usr/sbin/disks
# ^D
Given that the hardware is set up correctly, this procedure ensures that the software reflects that setup. The /etc/path_to_inst file is used to keep device instance numbers from sliding, which generally occurs when controllers are moved around; it is removed here so that controllers slide to the correct location. The '-rav' option to reboot makes sure that the kernel interacts with the user during boot and performs a reconfiguration reboot. The removal of /dev/*dsk/* ensures that the symbolic links are created correctly when the /usr/sbin/disks program is run.
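After both hosts have been rebuilt, one quick way to verify the resulting symmetry is to compare the disk device names each host sees. This is only a sketch, and the file names are hypothetical: it assumes you have captured sorted `ls /dev/rdsk` output from each host into a file.

```shell
#!/bin/sh
# Sketch: report device names present on one host but not the other.
# hostA.disks and hostB.disks are assumed to hold sorted "ls /dev/rdsk"
# output captured on each host; comm requires sorted input.
check_symmetry() {
    comm -3 "$1" "$2"    # prints nothing when the two lists match
}
# Example: check_symmetry hostA.disks hostB.disks
```

Any line printed names a drive that does not appear under the same device name on both hosts, which would break the symmetry the diskset requires.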
Because the SPARCstorage Array controller contains a unique World Wide Name, which identifies it to Solaris, special procedures apply for SPARCstorage Array controller replacement. Contact your service provider for assistance.
If you want to change the size of your state database replicas in a diskset, the basic steps are adding two disks to the diskset, deleting one of the new disk's state database replicas, then deleting the other disk from the diskset. You then add the deleted disk back to the diskset, along with any other disks you want added to the diskset. The state database replicas will automatically resize themselves to the new size.
# metadb -s rimtic -d c1t0d0s7
# metadb -s rimtic -a -l 2068 c1t0d0s7
# metaset -s rimtic -d c1t1d0
# metaset -s rimtic -a c1t1d0
# metadb -s rimtic
This example assumes you have already added two disks to the diskset, rimtic, and that there is no data on the rest of the disk to which the replica will be added. The new size of the state database replica is 2068 blocks, as specified by the -l 2068 option. The metadb command confirms the new size of the state database replicas.
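The sequence can be parameterized in a small script. This sketch is a hypothetical helper built from the example above: the diskset, slice, disk, and replica size are arguments, and the command names are variables so the steps can be previewed with echo before being run against a live diskset.

```shell
#!/bin/sh
# Sketch: re-create a diskset replica at a new size, then cycle the other
# disk out of and back into the set so its replicas pick up the new size.
# Command names are variables so the steps can be previewed or stubbed.
METADB=${METADB:-metadb}
METASET=${METASET:-metaset}

resize_replicas() {   # usage: resize_replicas set slice disk nblocks
    $METADB -s "$1" -d "$2"         || return 1   # delete the old replica
    $METADB -s "$1" -a -l "$4" "$2" || return 1   # add it back at the new size
    $METASET -s "$1" -d "$3"        || return 1   # drop the other disk
    $METASET -s "$1" -a "$3"        || return 1   # re-add it; replicas resize
    $METADB -s "$1"                               # confirm the new sizes
}
# Example: resize_replicas rimtic c1t0d0s7 c1t1d0 2068
```

As in the example, this assumes the slice holding the resized replica carries no other data.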