The cluster file system has the following features:
File access locations are transparent. A process can open a file that is located anywhere in the system. Processes on all Solaris hosts can use the same path name to locate a file.
When the cluster file system reads files, it does not update the access time on those files.
Coherency protocols are used to preserve the UNIX file access semantics even if the file is accessed concurrently from multiple nodes.
Extensive caching is used along with zero-copy bulk I/O movement to move file data efficiently.
The cluster file system provides highly available, advisory file-locking functionality through the fcntl(2) interfaces. Applications that run on multiple cluster nodes can synchronize access to data by using advisory file locking on a cluster file system. File locks are recovered immediately from nodes that leave the cluster, and from applications that fail while holding locks.
Continuous access to data is ensured, even when failures occur. Applications are not affected by failures if a path to disks is still operational. This guarantee is maintained for raw disk access and all file system operations.
Cluster file systems are independent from the underlying file system and volume management software. Cluster file systems make any supported on-disk file system global.
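The advisory locking feature listed above can be illustrated with a minimal Python sketch. It assumes a POSIX host where Python's standard fcntl module is available; the helper names advisory_lock and append_record are hypothetical and are not part of Sun Cluster. The sketch shows only the fcntl-style locking pattern that an application would use. It does not demonstrate cluster-wide lock recovery, which depends on the cluster file system itself.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def advisory_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block.

    Hypothetical helper: advisory locks only coordinate processes that
    also request the lock; they do not block uncooperative writers.
    """
    # Open (creating if needed) the file that serves as the lock target.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # lockf() wraps fcntl() record locking. LOCK_EX blocks until no
        # other process holds a conflicting lock on the file.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # Release the lock explicitly, then close the descriptor.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)


def append_record(path, record):
    """Append one line to `path` while holding the advisory lock."""
    with advisory_lock(path) as fd:
        os.lseek(fd, 0, os.SEEK_END)
        os.write(fd, (record + "\n").encode("utf-8"))
```

An application instance on each node might call append_record with the same path under /global (for example, a hypothetical /global/app/shared.log), relying on the lock to serialize the appends.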
You can mount a file system on a global device globally with mount -g or locally with mount.
Programs can access a file in a cluster file system from any node in the cluster through the same file name (for example, /global/foo).
A cluster file system is mounted on all cluster members. You cannot mount a cluster file system on a subset of cluster members.
A cluster file system is not a distinct file system type. Clients see the underlying file system (for example, UFS).
In the Sun Cluster software, all multihost disks are placed into device groups, which can be Solaris Volume Manager disk sets, VxVM disk groups, or individual disks that are not under control of a software-based volume manager.
For a cluster file system to be highly available, the underlying disk storage must be connected to more than one Solaris host. Therefore, a local file system (a file system that is stored on a host's local disk) that is made into a cluster file system is not highly available.
You can mount cluster file systems as you would mount other file systems:
Manually. Use the mount command and the -g or -o global mount options to mount the cluster file system from the command line, for example:
SPARC: # mount -g /dev/global/dsk/d0s0 /global/oracle/data
Automatically. Create an entry in the /etc/vfstab file with a global mount option to mount the cluster file system at boot. You then create a mount point under the /global directory on all hosts. The directory /global is a recommended location, not a requirement. Here's a sample line for a cluster file system from an /etc/vfstab file:
SPARC: /dev/md/oracle/dsk/d1 /dev/md/oracle/rdsk/d1 /global/oracle/data ufs 2 yes global,logging
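To make the field layout of such an entry concrete, the following Python sketch splits a vfstab line into its seven fields (device to mount, device to fsck, mount point, file system type, fsck pass, mount at boot, mount options) and checks for the global option. The names VfstabEntry, parse_vfstab_line, and is_global_mount are hypothetical helpers written for this illustration, not part of any Solaris or Sun Cluster tool, and the parser assumes a single non-comment line.

```python
from typing import NamedTuple, Tuple


class VfstabEntry(NamedTuple):
    """One /etc/vfstab entry; field names are illustrative."""
    device_to_mount: str
    device_to_fsck: str
    mount_point: str
    fs_type: str
    fsck_pass: str
    mount_at_boot: str
    mount_options: Tuple[str, ...]


def parse_vfstab_line(line):
    """Split one non-comment vfstab line into its seven fields."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(
            "expected 7 whitespace-separated fields, got %d" % len(fields)
        )
    *head, options = fields
    # Mount options are a comma-separated list in the last field.
    return VfstabEntry(*head, mount_options=tuple(options.split(",")))


def is_global_mount(entry):
    """Return True if the entry carries the `global` mount option."""
    return "global" in entry.mount_options


sample = (
    "/dev/md/oracle/dsk/d1 /dev/md/oracle/rdsk/d1 "
    "/global/oracle/data ufs 2 yes global,logging"
)
entry = parse_vfstab_line(sample)
print(entry.mount_point, is_global_mount(entry))  # /global/oracle/data True
```

A check like this could be used to confirm that every entry intended as a cluster file system actually includes the global option before the hosts are rebooted.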
While Sun Cluster software does not impose a naming policy for cluster file systems, you can ease administration by creating a mount point for all cluster file systems under the same directory, such as /global/disk-group. See Sun Cluster 3.1 9/04 Software Collection for Solaris OS (SPARC Platform Edition) and Sun Cluster System Administration Guide for Solaris OS for more information.
The HAStoragePlus resource type is designed to make local and global file system configurations highly available. You can use the HAStoragePlus resource type to integrate your local or global file system into the Sun Cluster environment and make the file system highly available.
You can use the HAStoragePlus resource type to make a file system available to a global-cluster non-voting node. To enable the HAStoragePlus resource type to do this, you must create a mount point on the global-cluster voting node and in the global-cluster non-voting node. The HAStoragePlus resource type makes the file system available to the global-cluster non-voting node by mounting the file system in the global-cluster voting node. The resource type then performs a loopback mount in the global-cluster non-voting node.
Sun Cluster systems support the following cluster file systems:
Solaris ZFS
UNIX file system (UFS)
Sun StorEdge QFS file system and Sun QFS Shared file system
Sun Cluster Proxy file system (PxFS)
Veritas file system (VxFS)
The HAStoragePlus resource type provides additional file system capabilities such as checks, mounts, and forced unmounts. These capabilities enable Sun Cluster to fail over local file systems. In order to fail over, the local file system must reside on global disk groups with affinity switchovers enabled.
See Enabling Highly Available Local File Systems in Sun Cluster Data Services Planning and Administration Guide for Solaris OS for information about how to use the HAStoragePlus resource type.
You can also use the HAStoragePlus resource type to synchronize the startup of resources and device groups on which the resources depend. For more information, see Resources, Resource Groups, and Resource Types.
You can use the syncdir mount option for cluster file systems that use UFS as the underlying file system. However, performance improves significantly if you do not specify syncdir. If you specify syncdir, writes are guaranteed to be POSIX compliant. If you do not specify syncdir, you see the same behavior as with NFS file systems. For example, without syncdir, you might not discover an out-of-space condition until you close a file. With syncdir (and POSIX behavior), the out-of-space condition would have been discovered during the write operation. The cases in which you might have problems if you do not specify syncdir are rare.
If you are using a SPARC-based cluster, VxFS does not have a mount option that is equivalent to the syncdir mount option for UFS. VxFS behavior is the same as for UFS when the syncdir mount option is not specified.
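Because an out-of-space condition might surface only at close time when syncdir is not in effect, an application can defend itself by checking for errors at every stage rather than only on the write. The following Python sketch shows one way to do that. The function name write_fully is hypothetical, and the sketch assumes a POSIX host; it illustrates the error-checking pattern only and does not change how the file system reports errors.

```python
import os


def write_fully(path, data):
    """Write `data` (bytes) to `path`.

    Returns None on success or an errno value on failure. Errors are
    checked on open, write, fsync, and close, because a deferred
    out-of-space error might be reported only by a later call.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    except OSError as exc:
        return exc.errno
    error = None
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # Ask for the data to reach stable storage so that a deferred
        # error is more likely to surface here than at close.
        os.fsync(fd)
    except OSError as exc:
        error = exc.errno
    finally:
        try:
            os.close(fd)
        except OSError as exc:
            # Keep the first error, but never ignore a failed close.
            if error is None:
                error = exc.errno
    return error
```

A caller would compare the return value against errno.ENOSPC and, for example, remove the partial file or retry elsewhere. The explicit fsync trades some of the performance gained by omitting syncdir for earlier error detection, so it is best reserved for data that must not be lost silently.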
See File Systems FAQs for frequently asked questions about global devices and cluster file systems.