Since I wrote this article years ago many things have changed, so here’s a quick update. Today 90% of our deployments are ZFS based and we only use XFS within our Ceph deployments for OSDs. Even that use of XFS will be phased out in 2018 with the adoption of BlueStore in Ceph Luminous. The other big thing that’s changed is that for the last few years we’ve done all our ZFS deployments using HBAs rather than RAID cards. In some environments the customer’s only option is hardware RAID, so we’ve had to work with that; overall it’s worked well and we’ve only seen an occasional case of bit rot. Today in QuantaStor v4.x we have powerful ZFS based HA features and a management system in place that’s fast and highly parallelized (we redesigned systems like our I/O fencing logic in Go), and all of that requires the use of SAS HDDs and/or SAS SSDs. We’ve also made HA a standard feature in QuantaStor, which we include at no extra cost, and HA deployments represent about 99% of all our deployments.
We just released QuantaStor v3.6 this month, which completed our integration of ZoL (ZFS on Linux) with QuantaStor. This article outlines some of the new features ZFS brings vs. our other storage pool types, which are based on XFS and BTRFS. ZFS was something of a big lift as our storage pool stack was initially designed for traditional filesystems like XFS or BTRFS over Linux’s MD software RAID. ZFS is different in that it combines the filesystem and the underlying volume management layer into one, and this brings many advantages. Several of these are must-have features for virtualization deployments, like being able to grow the storage pool online by adding devices on the fly and hot-adding SSD drives as write cache (ZIL) or read cache (L2ARC) to boost performance.
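To make the hot-add idea concrete, here is a minimal sketch of the plain `zpool add` command lines involved, built in Python. It only assembles the arguments and does not run anything. The pool name `tank` and the device paths are placeholders, the helper name `zpool_add_cmd` is made up for illustration, and this is not how QuantaStor drives ZFS internally.

```python
def zpool_add_cmd(pool, role, devices, mirror=False):
    """Build the argv for `zpool add` (nothing is executed here).

    role is one of:
      'data'  - grow the pool with another data vdev
      'log'   - add an SSD as a separate ZIL (write log) device
      'cache' - add an SSD as an L2ARC (read cache) device
    """
    if role not in ("data", "log", "cache"):
        raise ValueError("role must be 'data', 'log' or 'cache'")
    if not devices:
        raise ValueError("at least one device is required")
    if mirror and role == "cache":
        # ZFS does not support mirrored cache (L2ARC) devices.
        raise ValueError("cache devices cannot be mirrored")
    if mirror and len(devices) < 2:
        raise ValueError("a mirror needs at least two devices")

    argv = ["zpool", "add", pool]
    if role != "data":
        argv.append(role)
    if mirror:
        argv.append("mirror")
    argv.extend(devices)
    return argv


if __name__ == "__main__":
    # Placeholder pool and device names, for illustration only.
    examples = [
        zpool_add_cmd("tank", "data", ["/dev/sdc", "/dev/sdd"], mirror=True),
        zpool_add_cmd("tank", "log", ["/dev/sde", "/dev/sdf"], mirror=True),
        zpool_add_cmd("tank", "cache", ["/dev/sdg"]),
    ]
    for argv in examples:
        print(" ".join(argv))
```

Each of these commands takes effect while the pool stays online, which is the behavior the paragraph above relies on.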
When to choose ZFS vs XFS?
We’re still learning a lot about ZFS performance tuning, but so far we’ve been able to get ZFS to perform at roughly 80% of XFS’s performance for many workloads, and much better for some workloads when using SSD write caching (ZIL). Going forward we will continue to support both XFS and ZFS, even though XFS doesn’t have the same depth of features. This is because XFS is an ideal filesystem for the sequential I/O workloads we see in Media & Entertainment and for some archive applications where the data is often already compressed. For virtualization and many other workloads that benefit from ZFS’s advanced feature set, we now recommend using ZFS.
The table below gives a breakdown of the differences in capabilities between the storage pool types based on the underlying filesystem. Note that ‘Volumes’ / storage volumes are iSCSI/FC LUNs which reside in a storage pool, and ‘Network Shares’ are CIFS/NFS shares which also reside within a given Storage Pool. Both volumes and shares can reside within and be provisioned from any given storage pool.
* XFS and BTRFS storage pools can be expanded if they use the ‘Linear’ software RAID type but there’s a 15 second period where the pool is taken offline to expand it. BTRFS has software RAID features which would eliminate this issue but at the time of this writing those features have not matured and are not integrated into QuantaStor.
** Storage Pool deduplication can be turned on using the zpool command line utility. BTRFS has some user mode utilities for deduplication but no in-line deduplication yet.
*** Replication of volumes from one QuantaStor appliance to another is done from a consistent point in time snapshot when done with btrfs or ZFS. XFS does not have this benefit as it lacks snapshot capabilities.
**** SSD Caching is possible using hardware RAID controllers like LSI Nytro MegaRAID or LSI CacheCade software add-on to MegaRAID.
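The point-in-time replication described in the *** note above can be illustrated with the standard ZFS commands: take a snapshot, then stream it (or the delta since the previous snapshot) to the remote side. The sketch below only builds the command lines. The dataset, snapshot, and host names are placeholders, the helper `replication_cmds` is hypothetical, and piping the stream over `ssh` is an assumption for the example rather than a description of QuantaStor’s replication transport.

```python
def replication_cmds(dataset, new_snap, target_host, target_dataset, prev_snap=None):
    """Return (snapshot, send, receive) argv lists for one replication pass.

    If prev_snap is given, the send is incremental (`zfs send -i`), so only
    the blocks changed since prev_snap are transferred.
    """
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]

    send = ["zfs", "send"]
    if prev_snap is not None:
        send += ["-i", f"{dataset}@{prev_snap}"]
    send.append(f"{dataset}@{new_snap}")

    # Assumption: the stream is piped to the remote system over ssh.
    receive = ["ssh", target_host, "zfs", "receive", target_dataset]
    return snapshot, send, receive


if __name__ == "__main__":
    snap, send, recv = replication_cmds(
        dataset="tank/vol1",          # placeholder source volume
        new_snap="rep-0002",          # placeholder snapshot names
        prev_snap="rep-0001",
        target_host="backup-host",    # placeholder remote appliance
        target_dataset="tank/vol1",
    )
    print(" ".join(snap))
    print(" ".join(send) + " | " + " ".join(recv))
```

Because the stream is generated from a snapshot, the remote copy always reflects a single consistent point in time, even if the source volume keeps changing during the transfer.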
Combining Hardware and Software RAID a ‘win-win’
Though QuantaStor allows you to use ZFS without a hardware RAID controller, we recommend that you still use one. Hardware RAID controllers are much better at managing drive replacement, hot-spares, drive identification and more. It’s this combination of hardware RAID and software RAID that gives the best of both worlds: you can expand ZFS online with no downtime, and you can easily manage the underlying fault-tolerance using an enterprise grade hardware RAID controller. Hardware RAID controllers like the LSI MegaRAID series also typically have 1GB or more of non-volatile DDR3 write cache, which also helps boost ZFS performance, so the combination of software and hardware is a win-win. Last, QuantaStor has integrated hardware RAID management, so you can provision hardware RAID unit(s) for your storage pools within the web management interface in just a few clicks. This makes the overhead of provisioning the hardware RAID units negligible and greatly reduces the cost of maintaining the storage appliance.
Thoughts on BTRFS
We’re still tracking BTRFS and look forward to seeing it mature in the months ahead. For now we’ve tagged it as an ‘Experimental’ pool type until the BTRFS project team takes the ‘experimental’ badges off and sorts out some remaining issues. Overall we see the filesystem working well, but we’ve had to add logic to QuantaStor to regularly rebalance the filesystem metadata or else we see -28 (out of space) errors, so there are still some gaps in the BTRFS test matrix. Once those are sorted we’ll restructure the software RAID management for BTRFS along the lines of what we’ve done with ZFS so that we can hot-add storage to BTRFS based storage pools with no downtime. From there the gaps between ZFS and BTRFS will be a fairly small list. Specifically, a mechanism like ZFS send/receive is needed to support efficient remote replication, and support for SSD caching like you see with ZFS’s ZIL and L2ARC is similarly needed.
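For readers wondering what the -28 error and the rebalance look like in practice: error 28 is `ENOSPC` on Linux, and a metadata rebalance can be requested with `btrfs balance start` using a metadata usage filter. The sketch below is only an illustration under those assumptions. The helper names, the mount point, and the 50% usage threshold are made up for the example and are not the values or logic QuantaStor uses.

```python
import errno


def is_out_of_space(exc):
    """True if exc is an OSError carrying ENOSPC (error 28 on Linux)."""
    return isinstance(exc, OSError) and exc.errno == errno.ENOSPC


def btrfs_metadata_balance_cmd(mount_point, usage_pct=50):
    """Build the argv for a metadata-only balance (nothing is executed here).

    `-musage=N` limits the balance to metadata block groups that are at
    most N percent full, which keeps the operation relatively cheap.
    """
    if not 0 <= usage_pct <= 100:
        raise ValueError("usage_pct must be between 0 and 100")
    return ["btrfs", "balance", "start", f"-musage={usage_pct}", mount_point]


if __name__ == "__main__":
    # Simulate the failure instead of filling a real filesystem.
    try:
        raise OSError(errno.ENOSPC, "No space left on device")
    except OSError as exc:
        if is_out_of_space(exc):
            # Placeholder mount point, for illustration only.
            print(" ".join(btrfs_metadata_balance_cmd("/mnt/pool1")))
```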
Hope you enjoyed the article, and I look forward to sharing more details on our ZFS tuning in future blog posts.
ps. To try out QuantaStor v3 with ZFS please see our ISO download and license request page here.