Oracle Solaris Administration: ZFS File Systems (Oracle Solaris 11 Information Library)
The following sections provide recommended practices for creating and monitoring ZFS storage pools. For information about troubleshooting storage pool problems, see Chapter 11, Oracle Solaris ZFS Troubleshooting and Pool Recovery.
Keep the system up to date with the latest Oracle Solaris releases and patches
Size memory to the actual system workload
With a known application memory footprint, such as for a database application, you might cap the ARC size so that the application will not need to reclaim its necessary memory from the ZFS cache.
Consider deduplication memory requirements
Identify ZFS memory usage with the following command:
# mdb -k
> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     388117              1516   19%
ZFS File Data               81321               317    4%
Anon                        29928               116    1%
Exec and libs                1359                 5    0%
Page cache                   4890                19    0%
Free (cachelist)             6030                23    0%
Free (freelist)           1581183              6176   76%

Total                     2092828              8175
Physical                  2092827              8175
> $q
Consider using ECC memory to protect against memory corruption. Silent memory corruption can potentially damage your data.
Perform regular backups – Although a pool that is created with ZFS redundancy can help reduce down time due to hardware failures, it is not immune to hardware failures, power failures, or disconnected cables. Make sure that you back up your data on a regular basis. If your data is important, it should be backed up. Different ways to provide copies of your data are:
Regular or daily ZFS snapshots
Weekly backups of ZFS pool data. You can use the zpool split command to create an exact duplicate of a ZFS mirrored storage pool.
Monthly backups by using an enterprise-level backup product
Consider using JBOD mode for storage arrays rather than hardware RAID so that ZFS can manage both the storage and the redundancy.
Use hardware RAID or ZFS redundancy or both
Using ZFS redundancy has many benefits – For production environments, configure ZFS so that it can repair data inconsistencies. Use ZFS redundancy, such as RAID-Z, RAIDZ-2, RAIDZ-3, or mirror, regardless of the RAID level implemented on the underlying storage device. With such redundancy, faults in the underlying storage device or its connections to the host can be discovered and repaired by ZFS.
Crash dumps consume more disk space, generally in the range of 1/2 to 3/4 the size of physical memory.
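As a rough illustration of the crash dump guideline, the following sketch applies the 1/2 to 3/4 rule to a physical memory size given in MB. The function name crash_dump_range_mb is hypothetical, and the 8175 MB figure is the Physical total from the ::memstat example above. Actual dump sizes depend on the system and its dump configuration, so treat the result as a planning estimate only.

```shell
# Hypothetical helper: estimate the crash dump space range, in MB,
# from the physical memory size in MB (1/2 to 3/4 of memory).
crash_dump_range_mb() {
    phys_mb=$1
    low=$((phys_mb / 2))          # lower bound: 1/2 of physical memory
    high=$((phys_mb * 3 / 4))     # upper bound: 3/4 of physical memory
    echo "$low $high"
}

# Example with the 8175 MB system from the ::memstat output
crash_dump_range_mb 8175
```

Integer arithmetic rounds both bounds down, which is adequate for a sizing estimate.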
The following sections provide general and more specific pool practices.
Use whole disks to enable disk write cache and provide easier maintenance. Creating pools on slices adds complexity to disk management and recovery.
Use ZFS redundancy so that ZFS can repair data inconsistencies.
The following message is displayed when a non-redundant pool is created:
# zpool create tank c4t1d0 c4t3d0
'tank' successfully created, but with no redundancy; failure of one device will cause loss of the pool
For mirrored pools, use mirrored disk pairs
For RAIDZ pools, group 3-9 disks per VDEV
Use hot spares to reduce down time due to hardware failures
Use similar-size disks so that I/O is balanced across devices
Smaller LUNs can be expanded to larger LUNs
To keep metaslab sizes optimal, do not expand LUNs between extremely different sizes, such as from 128 MB to 2 TB
Consider creating a small root pool and larger data pools to support faster system recovery
Create root pools with slices by using the s* identifier. Do not use the p* identifier. In general, a system's ZFS root pool is created when the system is installed. If you are creating a second root pool or re-creating a root pool, use syntax similar to the following:
# zpool create rpool c0t1d0s0
Or, create a mirrored root pool. For example:
# zpool create rpool mirror c0t1d0s0 c0t2d0s0
The root pool must be created as a mirrored configuration or as a single-disk configuration. A RAID-Z or a striped configuration is not supported. You cannot add additional disks to create multiple mirrored top-level virtual devices by using the zpool add command, but you can expand a mirrored virtual device by using the zpool attach command.
The root pool cannot have a separate log device.
Pool properties can be set during an AI installation, but the gzip compression algorithm is not supported on root pools.
Do not rename the root pool after it is created by an initial installation. Renaming the root pool might cause an unbootable system.
Create non-root pools with whole disks by using the d* identifier. Do not use the p* identifier.
ZFS works best without any additional volume management software.
For better performance, use individual disks or at least LUNs made up of just a few disks. When ZFS has more visibility into the LUN setup, it can make better I/O scheduling decisions.
Create redundant pool configurations across multiple controllers to reduce down time due to a controller failure.
Mirrored storage pools – Consume more disk space but generally perform better with small random reads.
# zpool create tank mirror c1d0 c2d0 mirror c3d0 c4d0
RAID-Z storage pools – Can be created with 3 parity strategies, where parity equals 1 (raidz), 2 (raidz2), or 3 (raidz3). A RAID-Z configuration maximizes disk space and generally performs well when data is written and read in large chunks (128K or more).
Consider a single-parity RAID-Z (raidz) configuration with 2 VDEVs of 3 disks (2+1) each.
# zpool create rzpool raidz1 c1t0d0 c2t0d0 c3t0d0 raidz1 c1t1d0 c2t1d0 c3t1d0
A RAIDZ-2 configuration offers better data availability, and performs similarly to RAID-Z. RAIDZ-2 has significantly better mean time to data loss (MTTDL) than either RAID-Z or 2-way mirrors. Create a double-parity RAID-Z (raidz2) configuration at 6 disks (4+2).
# zpool create rzpool raidz2 c0t1d0 c1t1d0 c4t1d0 c5t1d0 c6t1d0 c7t1d0 raidz2 c0t2d0 c1t2d0 c4t2d0 c5t2d0 c6t2d0 c7t2d0
A RAIDZ-3 configuration maximizes disk space and offers excellent availability because it can withstand 3 disk failures. Create a triple-parity RAID-Z (raidz3) configuration at 9 disks (6+3).
# zpool create rzpool raidz3 c0t0d0 c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0 c7t0d0 c8t0d0
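The (data+parity) notation in the RAID-Z examples above can be turned into a quick count of data disks. The sketch below is only an approximation: the helper name raidz_data_disks is hypothetical, it assumes that every VDEV in the pool has the same width and parity level, and it ignores metadata and padding overhead.

```shell
# Hypothetical helper: count the data (non-parity) disks in a RAID-Z pool
# built from identical VDEVs. Ignores metadata and padding overhead.
raidz_data_disks() {
    vdevs=$1            # number of RAID-Z VDEVs in the pool
    disks_per_vdev=$2   # disks in each VDEV
    parity=$3           # 1 for raidz, 2 for raidz2, 3 for raidz3
    echo $((vdevs * (disks_per_vdev - parity)))
}

raidz_data_disks 2 3 1   # two raidz (2+1) VDEVs
raidz_data_disks 2 6 2   # two raidz2 (4+2) VDEVs
raidz_data_disks 1 9 3   # one raidz3 (6+3) VDEV
```

In all three layouts, two thirds of the disks hold data. By comparison, a pool of 2-way mirrors uses half of its disks for data.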
Consider the following storage pool practices when creating an Oracle database.
Use a mirrored pool or hardware RAID for pools
RAID-Z pools are generally not recommended for random read workloads
Create a small separate pool with a separate log device for database redo logs
Create a small separate pool for the archive log
For more information, see the following white paper:
Keep pool capacity below 80% for best performance
Mirrored pools are recommended over RAID-Z pools for random read/write workloads
Separate log devices
Recommended to improve synchronous write performance
Under a high synchronous write load, a separate log device prevents the fragmentation caused by writing many log blocks in the main pool
Separate cache devices are recommended to improve read performance
Scrub/resilver - A very large RAID-Z pool with lots of devices will have longer scrub and resilver times
Pool performance is slow – Use the zpool status command to rule out any hardware problems that are causing pool performance problems. If no problems show up in the zpool status output, use the fmdump command to display hardware faults or use the fmdump -eV command to review any hardware errors that have not yet resulted in a reported fault.
Make sure that pool capacity is below 80% for best performance.
Pool performance can degrade when a pool is very full and file systems are updated frequently, such as on a busy mail server. Full pools might cause a performance penalty, but no other issues. If the primary workload is immutable files, then the pool can be kept in the 95-96% utilization range. Even with mostly static content in the 95-96% range, write, read, and resilvering performance might suffer.
Monitor pool and file system space to make sure that they are not full.
Consider using ZFS quotas and reservations to make sure file system space does not exceed 80% pool capacity.
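One way to watch the 80% guideline is to filter the capacity column of zpool list. The following is a minimal sketch, not a supported tool. The function name pools_over_capacity is hypothetical, and it assumes a POSIX awk (on Oracle Solaris, nawk or /usr/xpg4/bin/awk rather than the legacy awk) and input in the tab-separated form that zpool list -H -o name,capacity prints.

```shell
# Hypothetical helper: read "name<TAB>capacity%" lines, as printed by
# `zpool list -H -o name,capacity`, and print pools above a threshold.
pools_over_capacity() {
    limit=${1:-80}                   # default threshold: 80%
    awk -v limit="$limit" '{
        cap = $2
        sub(/%/, "", cap)            # strip the trailing percent sign
        if (cap + 0 > limit + 0) print $1, cap "%"
    }'
}

# On a live system (assumed usage):
#   zpool list -H -o name,capacity | pools_over_capacity 80
# Self-contained demonstration with sample data:
printf 'rpool\t45%%\ntank\t87%%\n' | pools_over_capacity 80
```

A pool at exactly the threshold is not reported, because the comparison is strictly greater than the limit.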
Monitor pool health
Redundant pools: monitor the pool with zpool status and fmdump on a weekly basis
Non-redundant pools: monitor the pool with zpool status and fmdump on a biweekly basis
Run zpool scrub on a regular basis to identify data integrity problems.
If you have consumer-quality drives, consider a weekly scrubbing schedule.
If you have datacenter-quality drives, consider a monthly scrubbing schedule.
You should also run a scrub prior to replacing devices or temporarily reducing a pool's redundancy to ensure that all devices are currently operational.
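The scrub schedules suggested above could be automated with cron. The sketch below only prints candidate crontab entries; it does not install them. The helper name scrub_cron_entry, the drive class labels, and the 02:00 start time are all illustrative assumptions, and /usr/sbin/zpool is the usual command location on Oracle Solaris.

```shell
# Hypothetical helper: print a crontab entry that scrubs a pool weekly
# (consumer-quality drives) or monthly (datacenter-quality drives).
scrub_cron_entry() {
    pool=$1
    drive_class=$2
    case "$drive_class" in
        consumer)   echo "0 2 * * 0 /usr/sbin/zpool scrub $pool" ;;  # Sundays, 02:00
        datacenter) echo "0 2 1 * * /usr/sbin/zpool scrub $pool" ;;  # 1st of month, 02:00
        *)          echo "unknown drive class: $drive_class" >&2; return 1 ;;
    esac
}

scrub_cron_entry tank consumer
scrub_cron_entry tank datacenter
```

The printed line can be reviewed and then added to the root crontab by hand.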
Monitoring pool or device failures - Use zpool status as described below. Also use fmdump or fmdump -eV to see if any device faults or errors have occurred.
Redundant pools: monitor pool health with zpool status and fmdump on a weekly basis
Non-redundant pools: monitor pool health with zpool status and fmdump on a biweekly basis
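A periodic health check can be scripted around zpool status -x, which prints the single line "all pools are healthy" when no pool needs attention. The following is a minimal sketch under that assumption. The function name pools_healthy is hypothetical, and any other output is simply treated as a reason to look at zpool status and fmdump by hand.

```shell
# Hypothetical helper: read `zpool status -x` output on standard input and
# succeed only if it is the single line reported for healthy pools.
pools_healthy() {
    [ "$(cat)" = "all pools are healthy" ]
}

# On a live system (assumed usage):
#   zpool status -x | pools_healthy || echo "check zpool status and fmdump"
# Self-contained demonstration with sample text:
if echo "all pools are healthy" | pools_healthy; then
    echo "OK"
else
    echo "ATTENTION"
fi
```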
Pool device is UNAVAIL or OFFLINE – If a pool device is not available, then check to see if the device is listed in the format command output. If the device is not listed in the format output, then it will not be visible to ZFS.
If a pool device is UNAVAIL or OFFLINE, this generally means that the device has failed, a cable has been disconnected, or some other hardware problem, such as a bad cable or bad controller, has made the device inaccessible.
Consider configuring the smtp-notify service to notify you when a hardware component is diagnosed as faulty. For more information, see the Notification Parameters section of smf(5) and smtp-notify(1M).
By default, some notifications are set up automatically to be sent to the root user. If you add an alias for your user account as root in the /etc/aliases file, you will receive electronic mail notifications, similar to the following:
-------- Original Message --------
Subject: Fault Management Event: tardis:SMF-8000-YX
Date: Wed, 21 Sep 2011 11:11:27 GMT
From: No Access User <email@example.com.COM>
Reply-To: firstname.lastname@example.org.COM
To: email@example.com.COM

SUNW-MSG-ID: ZFS-8000-D3, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Wed Sep 21 11:11:27 GMT 2011
PLATFORM: Sun-Fire-X4140, CSN: 0904QAD02C, HOSTNAME: tardis
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: d9e3469f-8d84-4a03-b8a3-d0beb178c017
DESC: A ZFS device failed. Refer to http://sun.com/msg/ZFS-8000-D3 for more information.
AUTO-RESPONSE: No automated response will occur.
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.
Monitor your storage pool space – Use the zpool list command and the zfs list command to identify how much disk space is consumed by file system data. ZFS snapshots also consume disk space, and if the zfs list command does not list them, that consumption can go unnoticed. Use the zfs list -t snapshot command to identify disk space that is consumed by snapshots.
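To see which file systems hold the most snapshot space, one alternative to reading zfs list -t snapshot output is to rank datasets by their usedbysnapshots property. This sketch is an assumption-laden example, not the documented procedure: the function name snapshot_usage_mb is hypothetical, it assumes that your release provides the usedbysnapshots property and the -H and -p options of zfs get, and it assumes a POSIX awk.

```shell
# Hypothetical helper: read "dataset<TAB>bytes" lines, as printed by
# `zfs get -Hp -o name,value usedbysnapshots`, and print the datasets whose
# snapshots hold space, largest first, with sizes in whole MB.
snapshot_usage_mb() {
    awk '$2 + 0 > 0 { printf "%d %s\n", $2 / 1048576, $1 }' | sort -rn
}

# On a live system (assumed usage):
#   zfs get -r -Hp -o name,value usedbysnapshots tank | snapshot_usage_mb
# Self-contained demonstration with sample data:
printf 'tank\t0\ntank/home\t524288000\ntank/db\t2147483648\n' | snapshot_usage_mb
```

Datasets whose snapshots hold no space, or whose value is not numeric, are left out of the report.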