7.1 About OCFS2

Oracle Cluster File System version 2 (OCFS2) is a general-purpose, high-performance, high-availability, shared-disk file system intended for use in clusters. It is also possible to mount an OCFS2 volume on a standalone, non-clustered system.

Although it might seem that there is no benefit in mounting ocfs2 locally as compared to alternative file systems such as ext4 or btrfs, you can use the reflink command with OCFS2 to create copy-on-write clones of individual files in a similar way to using the cp --reflink command with the btrfs file system. Typically, such clones allow you to save disk space when storing multiple copies of very similar files, such as VM images or Linux Containers. In addition, mounting a local OCFS2 file system allows you to subsequently migrate it to a cluster file system without requiring any conversion.

Almost all applications can use OCFS2 as it provides local file-system semantics. Applications that are cluster-aware can use cache-coherent parallel I/O from multiple cluster nodes to balance activity across the cluster, or they can use of the available file-system functionality to fail over and run on another node in the event that a node fails. The following examples typify some use cases for OCFS2:

  • Oracle VM to host shared access to virtual machine images.

  • Oracle VM and VirtualBox to allow Linux guest machines to share a file system.

  • Oracle Real Application Cluster (RAC) in database clusters.

  • Oracle E-Business Suite in middleware clusters.

OCFS2 has a large number of features that make it suitable for deployment in an enterprise-level computing environment:

  • Support for ordered and write-back data journaling that provides file system consistency in the event of power failure or system crash.

  • Block sizes ranging from 512 bytes to 4 KB, and file-system cluster sizes ranging from 4 KB to 1 MB (both in increments of powers of 2). The maximum supported volume size is 16 TB, which corresponds to a cluster size of 4 KB. A volume size as large as 4 PB is theoretically possible for a cluster size of 1 MB, although this limit has not been tested.

  • Extent-based allocations for efficient storage of very large files.

  • Optimized allocation support for sparse files, inline-data, unwritten extents, hole punching, reflinks, and allocation reservation for high performance and efficient storage.

  • Indexing of directories to allow efficient access to a directory even if it contains millions of objects.

  • Metadata checksums for the detection of corrupted inodes and directories.

  • Extended attributes to allow an unlimited number of name:value pairs to be attached to file system objects such as regular files, directories, and symbolic links.

  • Advanced security support for POSIX ACLs and SELinux in addition to the traditional file-access permission model.

  • Support for user and group quotas.

  • Support for heterogeneous clusters of nodes with a mixture of 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures.

  • An easy-to-configure, in-kernel cluster-stack (O2CB) with a distributed lock manager (DLM), which manages concurrent access from the cluster nodes.

  • Support for buffered, direct, asynchronous, splice and memory-mapped I/O.

  • A tool set that uses similar parameters to the ext3 file system.