CHAPTER 1

File System Overview

This chapter covers only the Sun StorageTek Storage Archive Manager (SAM) file system features. For information on the Sun StorageTek QFS file system, see the Sun StorageTek QFS File System Configuration and Administration Guide.

This chapter contains the following sections:


File System Features

The Sun StorageTek SAM file system is a configurable file system that presents a standard UNIX file system (UFS) interface to users. TABLE 1-1 shows the entire family of Sun StorageTek QFS and Sun StorageTek SAM software.


TABLE 1-1 Product Overview

| Product | Components |
|---|---|
| Sun StorageTek QFS file system | A stand-alone file system. |
| Sun StorageTek QFS shared file system | A distributed file system that can be mounted on multiple host systems. |
| SAM file system | The file system that is included with the Sun StorageTek SAM software. This file system does not include some of the features found in the Sun StorageTek QFS file system. |
| SAM-QFS | When the Sun StorageTek QFS and the Sun StorageTek SAM software are used together, you can take advantage of the advanced file system features in the Sun StorageTek QFS product as well as the storage management features of the Sun StorageTek SAM product. This combination is called SAM-QFS. |


The Sun StorageTek SAM file system does not require changes to user programs or to the UNIX kernel. Some of the features of the Sun StorageTek SAM file system are described in the following sections.

Volume Management

Sun StorageTek SAM file systems support both striped and round-robin disk access. The master configuration file (mcf) and the mount parameters specify the volume management features and enable the file system to recognize the relationships between the devices it controls. This is in contrast to most UNIX file systems, which can address only one device or one portion of a device. Sun StorageTek SAM file systems do not require additional volume management applications. However, if you want to use mirroring for devices in a Sun StorageTek SAM environment, you must obtain an additional package, such as a logical volume manager.

The Sun StorageTek SAM integrated volume management features use the standard Solaris OS device driver interface to pass I/O requests to and from the underlying devices. The Sun StorageTek SAM software groups storage devices into family sets upon which each file system resides.

Support for Paged and Direct I/O

The Sun StorageTek SAM file system supports two different types of I/O: paged (also called cached or buffered I/O) and direct. These I/O types perform as follows:

High Capacity

The Sun StorageTek SAM software supports files of up to 2^63 bytes in length. Such very large files can be striped across many disks or RAID devices, even within a single file system. This is possible because Sun StorageTek SAM file systems use true 64-bit addressing, in contrast to standard UNIX file systems, which are not true 64-bit file systems.

The number of file systems that you can configure is virtually unlimited. The volume manager enables each file system to include up to 252 device partitions, typically disk. Each partition can include up to 16 terabytes of data. This configuration offers virtually unlimited storage capacity.

There is no predefined limit on the number of files in a Sun StorageTek SAM file system. Because the inode space (which holds information about the files) is dynamically allocated, the maximum number of files is limited only by the amount of disk storage available. The inodes are cataloged in the .inodes file under the mount point. The .inodes file requires 512 bytes of storage per file.
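The limits quoted above lend themselves to a quick back-of-the-envelope calculation. The following Python sketch is illustrative only: the function names are hypothetical, and it simply multiplies the figures stated in this section (252 partitions, 16 terabytes per partition, 512 bytes of .inodes storage per file).

```python
# Limits quoted in the text above.
MAX_PARTITIONS = 252   # device partitions per file system
PARTITION_TB = 16      # terabytes per partition
INODE_BYTES = 512      # bytes of .inodes storage per file


def max_capacity_tb(partitions=MAX_PARTITIONS):
    """Upper bound on the capacity of one file system, in terabytes."""
    if not 1 <= partitions <= MAX_PARTITIONS:
        raise ValueError("partitions must be between 1 and 252")
    return partitions * PARTITION_TB


def inodes_file_bytes(file_count):
    """Approximate size of the .inodes file for a given number of files."""
    if file_count < 0:
        raise ValueError("file_count must not be negative")
    return file_count * INODE_BYTES


print(max_capacity_tb())             # 4032 (terabytes per file system)
print(inodes_file_bytes(1_000_000))  # 512000000 (bytes for one million files)
```

In other words, a file system holding one million files needs roughly 512 megabytes for its .inodes file, in addition to the file data itself.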

Fast File System Recovery

A key function of a file system is its ability to recover quickly after an unscheduled outage. Standard UNIX file systems require a lengthy file system check (fsck(1M)) to repair inconsistencies after a system failure.

A Sun StorageTek SAM file system often does not require a file system check after a disruption that prevents the file system from being written to disk (using sync(1M)). In addition, Sun StorageTek SAM file systems recover from system failures without using journaling. They accomplish this dynamically by using identification records, serial writes, and error checking for all critical I/O operations. After a system failure, even multiterabyte-sized Sun StorageTek SAM file systems can be remounted immediately.

vnode Interface

The Sun StorageTek SAM file system is implemented through the standard Solaris OS virtual file system (vfs/vnode) interface.

By using the vfs/vnode interface, the file system works with the standard Solaris OS kernel and requires no modifications to the kernel for file management support. Thus, the file system is protected from operating system changes and typically does not require extensive regression testing when the operating system is updated.

The kernel intercepts all requests for files, including those that reside in Sun StorageTek SAM file systems. If a file is identified as a Sun StorageTek SAM file, the kernel passes the request to the appropriate file system for handling. Sun StorageTek SAM file systems are identified as type samfs in the /etc/vfstab file and through the mount(1M) command.

Sun StorageTek SAM Archive Management

The Sun StorageTek SAM software combines file system features with a storage and archive management utility. Users can read and write files directly from magnetic disk, or they can access archive copies of files as though they were all on primary disk storage.

When possible, Sun StorageTek SAM software uses the standard Solaris OS disk and tape device drivers. For devices not directly supported under the Solaris OS, such as certain automated library and optical disk devices, Sun Microsystems provides special device drivers in the Sun StorageTek SAM software package.

See the Sun StorageTek Storage Archive Manager Archive Configuration and Administration Guide for more information about the storage and archive management features of Sun StorageTek SAM.

Additional File System Features

The following additional features are also supported by the Sun StorageTek SAM file system:


Design Basics

Sun StorageTek SAM file systems are multithreaded, advanced storage management systems. To take maximum advantage of the software's capabilities, create multiple file systems whenever possible.

Sun StorageTek SAM file systems use a linear search method for directory lookups, searching from the beginning of the directory to the end. As the number of files in a directory increases, the search time through the directory also increases. Search times can become excessive when you have directories with thousands of files. These long search times are also evident when you restore a file system. To increase performance and speed up file system dumps and restores, keep the number of files in a directory under 10,000.

The directory name lookup cache (DNLC) feature improves file system performance. This cache stores the directory lookup information for files whose paths are short (30 characters or fewer), removing the need for directory lookups to be performed on the fly. The DNLC feature is available in the Solaris 9 OS and all later releases.

The following sections cover some additional features that affect file system design:

Inode Files and File Characteristics

The types of files to be stored in a file system affect file system design. An inode is a 512-byte block of information that describes the characteristics of a file or directory. This information is allocated dynamically within the file system.

Inodes are stored in the .inodes file located under the file system mount point.

Like a standard Solaris OS inode, a Sun StorageTek SAM file system inode contains the file's POSIX standard inode times: file access, file modification, and inode changed times. A Sun StorageTek SAM file system inode includes other times as well, as shown in TABLE 1-2.


TABLE 1-2 Content of .inodes Files

| Time | Incident |
|---|---|
| access | Time the file was last accessed. POSIX standard. |
| modification | Time the file was last modified. POSIX standard. |
| changed | Time the inode information was last changed. POSIX standard. |
| attributes | Time the attributes specific to the Sun StorageTek SAM file system were last changed. Sun Microsystems extension. |
| creation | Time the file was created. Sun Microsystems extension. |
| residence | Time the file changed from offline to online or vice versa. Sun Microsystems extension. |
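
The three POSIX times in the table are visible through the ordinary stat interface on any UNIX file system. The following Python sketch, which uses only the standard library, prints them for a temporary file. The helper name posix_times is hypothetical, and the sketch does not show the attributes, creation, or residence times, because those Sun Microsystems extensions are not part of the POSIX stat structure.

```python
import os
import tempfile
from datetime import datetime, timezone


def posix_times(path):
    """Return the three POSIX inode times for path as UTC datetimes."""
    st = os.stat(path)
    return {
        "access": datetime.fromtimestamp(st.st_atime, tz=timezone.utc),
        "modification": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc),
        # On UNIX systems, st_ctime is the inode change time.
        "changed": datetime.fromtimestamp(st.st_ctime, tz=timezone.utc),
    }


# Demonstrate on a throwaway file so that the example runs anywhere.
with tempfile.NamedTemporaryFile() as tmp:
    for name, stamp in posix_times(tmp.name).items():
        print(f"{name:>12}: {stamp.isoformat()}")
```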




Note - If the WORM-FS (write once read many) package is installed, the inode also includes a retention-end date. See Configuring WORM-FS File Systems for more information.



For more information on viewing inode file information, see Viewing Files and File Attributes.

Specifying Disk Allocation Units

Disk space is allocated in basic units of online disk storage called disk allocation units (DAUs). Whereas sectors, tracks, and cylinders describe the physical disk geometry, the DAU describes the file system geometry. Choosing the appropriate DAU size and stripe size can improve performance and optimize magnetic disk usage. The DAU setting is the minimum amount of contiguous space that is used when a file is allocated.

The following subsections describe how to configure DAU settings and stripe widths.

DAU Settings and File System Geometry

Sun StorageTek SAM file systems use an adjustable DAU. You can use this configurable DAU to tune the file system to the physical disk storage device. This feature minimizes the system overhead caused by read-modify-write operations and is therefore particularly useful for applications that manipulate very large files. For information about how to control the read-modify-write operation, see Increasing File Transfer Performance for Large Files.

Each file system can have its own unique DAU setting, even if it is one of several mounted file systems active on a server. The DAU setting is determined through the sammkfs(1M) command when the file system is created. It cannot be changed dynamically.

The following sections introduce the master configuration (mcf) file. You create this ASCII file at system configuration time. It defines the devices and file systems used in your Sun StorageTek SAM environment. For details about the mcf file, see Configuring the File System.

Sun StorageTek SAM File Systems

A Sun StorageTek SAM file system is defined in your mcf file by an Equipment Type value of ms. In an ms file system, the only device type allowed is md, and both metadata and file data are written to the md devices. By default, the DAU on an md device is 64 kilobytes.

Dual Allocation Scheme

The md devices use a dual allocation scheme, as follows:



Note - When using an ms file system, the stripe width should be set to greater than zero to stripe metadata information across the disk. However, you should read and understand Stripe Widths on Data Disks before setting the stripe width and DAU size.



Depending on the type of file data stored in the file system, a larger DAU size can improve file system performance significantly. For information about tuning file system performance, see Chapter 5 Advanced Topics.

Data Alignment

Data alignment refers to matching the allocation unit of the RAID controller with the allocation unit of the file system. The optimal Sun StorageTek SAM file system alignment formula is as follows:

allocation-unit = RAID-stripe-width x number-of-data-disks

For example, suppose a RAID-5 unit has nine disks, with one of the nine being the parity disk, making the number of data disks eight. If the RAID stripe width is 64 kilobytes, then the optimal allocation unit is 64 multiplied by 8, which is 512 kilobytes.
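
The alignment formula can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the function and parameter names are hypothetical, and the single parity disk is an assumption that matches the RAID-5 example above.

```python
def optimal_allocation_unit_kb(raid_stripe_width_kb, total_disks, parity_disks=1):
    """Apply allocation-unit = RAID-stripe-width x number-of-data-disks."""
    data_disks = total_disks - parity_disks
    if data_disks < 1:
        raise ValueError("at least one data disk is required")
    return raid_stripe_width_kb * data_disks


# Nine-disk RAID-5 unit with one parity disk and a 64-kilobyte stripe width.
print(optimal_allocation_unit_kb(64, total_disks=9))  # 512
```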

Data files are allocated as striped or round-robin through each data disk within the same file system.

A mismatched alignment hurts performance because it can cause a read-modify-write operation.

Stripe Widths on Data Disks

The stripe width is specified by the -o stripe=n option in the mount(1M) command. If the stripe width is set to 0, round-robin allocation is used.

On ms file systems, the stripe width is set at mount time. TABLE 1-3 shows default stripe widths.


TABLE 1-3 ms File System Default Stripe Widths

| DAU | Default Stripe Width | Amount of Data Written to Disk |
|---|---|---|
| 16 kilobytes | 8 DAUs | 128 kilobytes |
| 32 kilobytes | 4 DAUs | 128 kilobytes |
| 64 kilobytes (default) | 2 DAUs | 128 kilobytes |


For example, if sammkfs(1M) is run with default settings, the default large DAU is 64 kilobytes. If no stripe width is specified when the mount(1M) command is issued, the default is used, and the stripe width set at mount time is 2.



Note - It is important that the stripe width be set to greater than zero in an ms file system so that metadata information is striped across the disk.



Note that if you multiply the number in the first column of TABLE 1-3 by the number in the second column, the resulting number is 128 kilobytes. Sun StorageTek SAM file systems operate most efficiently if the amount of data being written to disk is at least 128 kilobytes.
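
The relationship among the columns of TABLE 1-3 can be expressed as a short calculation. The following Python sketch reproduces the table under the assumption that the default stripe width is whatever makes DAU multiplied by stripe width equal 128 kilobytes. That rule is inferred from the table values and is not a statement of how the software computes the default; the names are hypothetical.

```python
TARGET_KB = 128                 # amount of data written to each disk, per TABLE 1-3
MS_DAU_SIZES_KB = (16, 32, 64)  # DAU sizes listed in TABLE 1-3


def default_stripe_width(dau_kb):
    """Return the default stripe width (in DAUs) listed for an ms file system."""
    if dau_kb not in MS_DAU_SIZES_KB:
        raise ValueError("DAU must be 16, 32, or 64 kilobytes")
    return TARGET_KB // dau_kb


for dau in MS_DAU_SIZES_KB:
    width = default_stripe_width(dau)
    print(f"{dau} KB DAU -> stripe width {width} -> {dau * width} KB per disk")
```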


File Allocation Methods

The Sun StorageTek SAM software enables you to specify both round-robin and striped allocation methods.

The rest of this section describes allocation in more detail.

Metadata Allocation

For ms file systems, metadata is allocated across the md devices.

Inodes are 512 bytes in length. Directories are initially 4 kilobytes in length. TABLE 1-4 shows how the system allocates metadata.


TABLE 1-4 Metadata Allocation

| Metadata Type | Allocation Increments for ms File Systems |
|---|---|
| Inodes (.inodes file) | 16-, 32-, or 64-kilobyte DAU |
| Indirect blocks | 16-, 32-, or 64-kilobyte DAU |
| Directories | 4 kilobytes, up to a 32-kilobyte total, then DAU size |


Round-Robin Allocation

The round-robin allocation method writes one data file at a time to each successive device in the family set. Round-robin allocation is useful for multiple data streams, because in this type of environment aggregate performance can exceed striping performance.

Round-robin disk allocation enables a single file to be written to a logical disk. The next file is written to the next logical disk, and so on. When the number of files written equals the number of devices defined in the family set, the file system starts over again with the first device selected. If a file exceeds the size of the physical device, the first portion of the file is written to the first device, and the remainder of the file is written to the next device with available storage. The size of the file being written determines the I/O size.

You can specify round-robin allocation explicitly in the /etc/vfstab file by entering stripe=0.

In the following figure, file 1 is written to disk 1, file 2 is written to disk 2, file 3 is written to disk 3, and so on. When file 6 is created, it is written to disk 1, restarting the round-robin allocation scheme.

FIGURE 1-1 depicts round-robin allocation on five devices in an ms file system.


FIGURE 1-1 Round-Robin Allocation in an ms File System Using Five Devices

Figure showing files coming into a Sun StorageTek SAM file system using round-robin allocation. Files 1-5 are written to each of five disks. File 6 is written to disk 1. File 7 is written to disk 2, and so on.
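
The placement shown in FIGURE 1-1 can be modeled with modular arithmetic. The following Python sketch is a simplified illustration, not the file system's allocator: it assumes that every file fits on its device and that all devices have free space, so it ignores the spill-over case described above. The function name is hypothetical.

```python
def round_robin(file_count, disk_count):
    """Map each file number to a disk number (both 1-based)."""
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return {f: (f - 1) % disk_count + 1 for f in range(1, file_count + 1)}


# Seven files on five devices: file 6 wraps around to disk 1.
placement = round_robin(7, 5)
for file_number, disk_number in placement.items():
    print(f"file {file_number} -> disk {disk_number}")
```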


Striped Allocation

By default, Sun StorageTek SAM file systems use a striped allocation method to spread data over all the devices in the file system family set. Striping is a method of concurrently writing files in an interlaced fashion across multiple devices.

Striping is used when performance for one file requires the additive performance of all the devices. A file system that is using striped devices addresses blocks in an interlaced fashion rather than sequentially. Striping generally increases performance because it enables multiple I/O streams to simultaneously write a file across multiple disks. The DAU and the stripe width determine the size of the I/O transmission.

In a file system using striping, file 1 is written to disk 1, disk 2, disk 3, disk 4, and disk 5. File 2 is written to disks 1 through 5 as well. The DAU multiplied by the stripe width determines the amount of data written to each disk in a block.

When a Sun StorageTek SAM file system writes a file to an md device, it starts by trying to fit the file into a small DAU, which is 4 kilobytes. If the file does not fit into the first eight small DAUs (32 kilobytes) allocated, the file system writes the remainder of the file into one or more large DAUs.
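
The small and large DAU behavior described above can be sketched as a calculation of how many DAUs of each kind a file occupies. This Python example is an illustration under stated assumptions: the first 32 kilobytes of a file go into 4-kilobyte small DAUs, the remainder goes into large DAUs, and the large DAU defaults to 64 kilobytes. The function name is hypothetical, and the sketch does not model indirect blocks or other metadata.

```python
SMALL_DAU_BYTES = 4 * 1024  # small DAU size on an md device
SMALL_DAU_LIMIT = 8         # a file uses at most eight small DAUs (32 kilobytes)


def md_dau_counts(file_bytes, large_dau_bytes=64 * 1024):
    """Return (small_daus, large_daus) occupied by a file of file_bytes."""
    if file_bytes < 0:
        raise ValueError("file_bytes must not be negative")
    small_capacity = SMALL_DAU_BYTES * SMALL_DAU_LIMIT
    # Ceiling division without floating point: -(-a // b).
    small = min(SMALL_DAU_LIMIT, -(-file_bytes // SMALL_DAU_BYTES))
    remainder = max(0, file_bytes - small_capacity)
    large = -(-remainder // large_dau_bytes)
    return small, large


print(md_dau_counts(10 * 1024))   # (3, 0): fits entirely in small DAUs
print(md_dau_counts(100 * 1024))  # (8, 2): 32 KB in small DAUs, 68 KB in large DAUs
```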

Multiple active files cause significantly more disk head movement with striped than with round-robin allocation. If I/O is to occur to multiple files simultaneously, use round-robin allocation.

In the following figure, DAU x stripe-width bytes of the file are written to disk 1. DAU x stripe-width bytes of the file are written to disk 2 and so on. The order of the stripe is first-in-first-out for the files. Striping spreads the I/O load over all the disks.


FIGURE 1-2 Striping in an ms File System Using Five Devices

Figure showing files coming into a Sun StorageTek SAM file system using striped allocation. All files are striped across 5 disks.
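
The striping pattern in FIGURE 1-2 can be modeled as follows. This Python sketch is a simplified illustration, not the file system's allocator: it assumes that each file starts on disk 1, that only large DAUs are used, and that DAU multiplied by stripe width kilobytes are written to one disk before the stripe moves to the next. The function and parameter names are hypothetical.

```python
def stripe_layout(file_kb, dau_kb=64, stripe_width=2, disk_count=5):
    """Return a list of (disk, kilobytes) writes for one striped file."""
    if min(dau_kb, stripe_width, disk_count) < 1:
        raise ValueError("dau_kb, stripe_width, and disk_count must be positive")
    chunk_kb = dau_kb * stripe_width  # data written to a disk per stripe
    layout = []
    offset = 0
    disk = 1
    while offset < file_kb:
        written = min(chunk_kb, file_kb - offset)
        layout.append((disk, written))
        offset += written
        disk = disk % disk_count + 1  # advance to the next disk, wrapping to 1
    return layout


# A 700-kilobyte file with the default 64-kilobyte DAU and stripe width of 2.
for disk_number, kilobytes in stripe_layout(700):
    print(f"disk {disk_number}: {kilobytes} KB")
```

With these defaults, each of the five disks receives 128 kilobytes, and the final 60 kilobytes wrap around to disk 1.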



Per-Logical Unit Number (LUN) Allocation Control

If necessary, you can disable allocation to a specific Sun StorageTek SAM data partition by using the nalloc command, which prohibits any future allocation to that device. This feature is currently usable only on data partitions, not on metadata partitions.

Allocation to a partition can be restarted by either an alloc or on command.

The allocation state of a partition (allocflag) is persistent across boots.

The nalloc and alloc commands are available in the samu interface, and the samu on command also sets allocation to on. The samu screens display the nalloc state for partitions that have been disabled. The samtrace and samfsinfo output also include the allocation state.

For more information about the samu interface, see Using the samu(1M) Operator Utility.