Managing ZFS File Systems in Oracle® Solaris 11.2


Updated: December 2014
 
 

ZFS Storage Pool Creation Practices

The following sections provide general and more specific pool creation practices.

General Storage Pool Practices

  • Use whole disks to enable disk write cache and provide easier maintenance. Creating pools on slices adds complexity to disk management and recovery.

  • Use ZFS redundancy so that ZFS can repair data inconsistencies.

    • The following message is displayed when a non-redundant pool is created:

      # zpool create tank c4t1d0 c4t3d0
      'tank' successfully created, but with no redundancy; failure
      of one device will cause loss of the pool
    • For mirrored pools, use mirrored disk pairs

    • For RAID-Z pools, group 3-9 disks per VDEV

    • Do not mix RAID-Z and mirrored components within the same pool. These pools are harder to manage and performance might suffer.

  • Use hot spares to reduce downtime due to hardware failures (see the example after this list)

  • Use similar size disks so that I/O is balanced across devices

    • Smaller LUNs can be expanded to larger LUNs

    • Do not expand a LUN across extremely varied sizes, such as from 128 MB to 2 TB, so that metaslab sizes remain optimal

  • Consider creating a small root pool and larger data pools to support faster system recovery

  • Recommended minimum pool size is 8 GB. Although the minimum pool size is 64 MB, anything less than 8 GB makes allocating and reclaiming free pool space more difficult.

  • Recommended maximum pool size should comfortably fit your workload or data size. Do not try to store more data than you can routinely back up. Otherwise, your data is at risk due to some unforeseen event.
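
For example, the following hypothetical command combines several of these practices by creating a pool of similar-size whole disks, mirrored in pairs, with a hot spare. The device names are placeholders for your own devices.

  # zpool create tank mirror c1t0d0 c2t0d0 mirror c1t1d0 c2t1d0 spare c3t0d0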

See also Pool Creation Practices on Local or Network Attached Storage Arrays.

Root Pool Creation Practices

  • SPARC (SMI (VTOC)): Create root pools with slices by using the s* identifier. Do not use the p* identifier. In general, a system's ZFS root pool is created when the system is installed. If you are creating a second root pool or re-creating a root pool, use syntax similar to the following on a SPARC system:

    # zpool create rpool c0t1d0s0

    Or, create a mirrored root pool. For example:

    # zpool create rpool mirror c0t1d0s0 c0t2d0s0
  • Solaris 11.1 x86 (EFI (GPT)): Create root pools with whole disks by using the d* identifier. Do not use the p* identifier. In general, a system's ZFS root pool is created when the system is installed. If you are creating a second root pool or re-creating a root pool, use syntax similar to the following:

    # zpool create rpool c0t1d0

    Or, create a mirrored root pool. For example:

    # zpool create rpool mirror c0t1d0 c0t2d0
  • The root pool must be created as a mirrored configuration or as a single-disk configuration. Neither a RAID-Z nor a striped configuration is supported. You cannot add additional disks to create multiple mirrored top-level virtual devices by using the zpool add command, but you can expand a mirrored virtual device by using the zpool attach command.
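
    For example, assuming an existing single-disk root pool on c0t1d0s0 (a placeholder device name), you might convert it to a mirror by attaching a second disk:

    # zpool attach rpool c0t1d0s0 c0t2d0s0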

  • The root pool cannot have a separate log device.

  • Pool properties can be set during an AI installation, but the gzip compression algorithm is not supported on root pools.

  • Do not rename the root pool after it is created by an initial installation. Renaming the root pool might render the system unbootable.

  • Do not create a root pool on a USB stick on a production system because root pool disks are critical for continuous operation, particularly in an enterprise environment. Consider using a system's internal disks for the root pool, or at least, use the same quality disks that you would use for your non-root data. In addition, a USB stick might not be large enough to support a dump volume size that is equivalent to at least 1/2 the size of physical memory.
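
    For example, you might compare the system's physical memory with the current dump volume size by using commands such as the following (assuming the default rpool/dump volume):

    # prtconf | grep Memory
    # zfs get volsize rpool/dump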

  • Rather than adding a hot spare to a root pool, consider creating a two- or a three-way mirror root pool. In addition, do not share a hot spare between a root pool and a data pool.
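
    For example, the following hypothetical command creates a three-way mirror root pool (the device names are placeholders):

    # zpool create rpool mirror c0t1d0s0 c0t2d0s0 c0t3d0s0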

  • Do not use a VMware thinly-provisioned device for a root pool device.

Non-Root Pool Creation Practices

  • Create non-root pools with whole disks by using the d* identifier. Do not use the p* identifier.

    • ZFS works best without any additional volume management software.

    • For better performance, use individual disks or at least LUNs made up of just a few disks. When ZFS has more visibility into how the LUNs are set up, it can make better I/O scheduling decisions.

  • Create redundant pool configurations across multiple controllers to reduce down time due to a controller failure.

    • Mirrored storage pools – Consume more disk space but generally perform better with small random reads.

      # zpool create tank mirror c1d0 c2d0 mirror c3d0 c4d0
    • RAID-Z storage pools – Can be created with 3 parity strategies, where parity equals 1 (raidz), 2 (raidz2), or 3 (raidz3). A RAID-Z configuration maximizes disk space and generally performs well when data is written and read in large chunks (128K or more).

      • Consider a single-parity RAID-Z (raidz) configuration with 2 VDEVs of 3 disks (2+1) each.

        # zpool create rzpool raidz1 c1t0d0 c2t0d0 c3t0d0 raidz1 c1t1d0 c2t1d0 c3t1d0
      • A RAIDZ-2 configuration offers better data availability and performs similarly to RAID-Z. RAIDZ-2 has significantly better mean time to data loss (MTTDL) than either RAID-Z or 2-way mirrors. Create a double-parity RAID-Z (raidz2) configuration with 6 disks (4+2).

        # zpool create rzpool raidz2 c0t1d0 c1t1d0 c4t1d0 c5t1d0 c6t1d0 c7t1d0
        raidz2 c0t2d0 c1t2d0 c4t2d0 c5t2d0 c6t2d0 c7t2d0
      • A RAIDZ-3 configuration maximizes disk space and offers excellent availability because it can withstand 3 disk failures. Create a triple-parity RAID-Z (raidz3) configuration with 9 disks (6+3).

        # zpool create rzpool raidz3 c0t0d0 c1t0d0 c2t0d0 c3t0d0 c4t0d0
        c5t0d0 c6t0d0 c7t0d0 c8t0d0

Pool Creation Practices on Local or Network Attached Storage Arrays

Consider the following storage pool practices when creating a ZFS storage pool on a storage array that is connected locally or remotely.

  • If you create a pool on SAN devices and the network connection is slow, the pool's devices might be UNAVAIL for a period of time. Assess whether the network connection is appropriate for providing your data in a continuous fashion. Also, consider that if you are using SAN devices for your root pool, they might not be available as soon as the system is booted, and the root pool's devices might also be UNAVAIL.

  • Confirm with your array vendor that the disk array is not flushing its cache after a flush write cache request is issued by ZFS.

  • Use whole disks, not disk slices, as storage pool devices so that Oracle Solaris ZFS activates the local small disk caches, which get flushed at appropriate times.

  • For best performance, create one LUN for each physical disk in the array. Using only one large LUN can cause ZFS to queue up too few read I/O operations to actually drive the storage to optimal performance. Conversely, using many small LUNs could have the effect of swamping the storage with a large number of pending read I/O operations.

  • A storage array that uses dynamic (or thin) provisioning software to implement virtual space allocation is not recommended for Oracle Solaris ZFS. When Oracle Solaris ZFS writes the modified data to free space, it writes to the entire LUN. The Oracle Solaris ZFS write process allocates all the virtual space from the storage array's point of view, which negates the benefit of dynamic provisioning.

    Consider that dynamic provisioning software might be unnecessary when using ZFS:

    • You can expand a LUN in an existing ZFS storage pool and it will use the new space.

    • The same behavior applies when a smaller LUN is replaced with a larger LUN.

    • If you assess the storage needs for your pool and create it with smaller LUNs that match those needs, you can always expand the LUNs to a larger size if you need more space, as shown in the following example.
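
    For example, assuming a pool named tank whose LUN c1t0d0 has been grown on the array (the pool and device names are placeholders), the new space might be made available like this:

      # zpool set autoexpand=on tank
      # zpool online -e tank c1t0d0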

  • If the array can present individual devices (JBOD-mode), then consider creating redundant ZFS storage pools (mirror or RAID-Z) on this type of array so that ZFS can report and correct data inconsistencies.

Pool Creation Practices for an Oracle Database

Consider the following storage pool practices when creating an Oracle database.

  • Use a mirrored pool or hardware RAID for pools

  • RAID-Z pools are generally not recommended for random read workloads

  • Create a small separate pool with a separate log device for database redo logs (see the sketch after this list)

  • Create a small separate pool for the archive log
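
As a sketch of the last two practices, you might create the redo log and archive log pools as follows (the pool and device names are hypothetical):

  # zpool create redopool mirror c1t0d0 c2t0d0 log mirror c1t1d0 c2t1d0
  # zpool create archpool mirror c1t2d0 c2t2d0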

For more information about tuning ZFS for an Oracle database, see Tuning ZFS for an Oracle Database in Oracle Solaris 11.2 Tunable Parameters Reference Manual.

Using ZFS Storage Pools in VirtualBox

  • By default, VirtualBox is configured to ignore cache flush commands from the underlying storage. This means that in the event of a system crash or a hardware failure, data could be lost.

  • Enable cache flushing on VirtualBox by issuing the following command (see the example after the parameter descriptions):

    VBoxManage setextradata vm-name "VBoxInternal/Devices/type/0/LUN#n/Config/IgnoreFlush" 0
    • vm-name – the name of the virtual machine

    • type – the controller type, either piix3ide (if you are using the usual IDE virtual controller) or ahci (if you are using a SATA controller)

    • n – the disk number
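
    For example, for a hypothetical virtual machine named solaris11-vm that uses a SATA (AHCI) controller and disk number 0, the command might look like this:

      VBoxManage setextradata solaris11-vm "VBoxInternal/Devices/ahci/0/LUN#0/Config/IgnoreFlush" 0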