Optimizing SDS for Server Virtualization Workloads

With the broad range of hardware options available to Software Defined Storage (SDS) solutions like QuantaStor, it can be a challenge to select a combination of hardware that makes the workload perform well without overspending on higher-cost components. If money is no object and the configuration is moderate in size, the process is fairly simple: just use all SSDs in a RAID10 configuration with lots of RAM and it will perform very well for just about anything you throw at it.

Fortunately, in the near future when SSDs drop in price, designing high-performance storage solutions will get easier, as hard disk drives will largely be relegated to archive solutions. Until then, most companies need to strike the right balance of RAM, SSDs, and HDDs for SDS solutions that run workloads like VMs, databases, email, and other common applications.

Optimal Storage Pool Layout for Virtualization

Server virtualization workloads are fairly sequential when you have just a few virtual machines, but as you add more VMs the I/O patterns seen by the storage appliance become more and more random in nature. Therefore, in most cases you must design a configuration tuned to maximize IOPS.

As such, we always recommend that you configure your storage pool to use a RAID10 layout. In some rare cases you can get away with RAID50 (4+1 or 3+1 set size), but in general it’s best to stick with RAID10 unless you’re able to verify that the RAID50 configuration is going to work for you at scale.

Importance of Disk/Spindle Count

When a flood of write I/O is sent to a storage array, the RAID10 layout works best because every mirror pair in the array can service a write request at the same time. This concurrency is what gives you great random write performance and is key to achieving high performance for virtualization deployments. More mirror pairs means more IOPS, as the graph below shows (Figure 1): in this test, a 20-disk RAID10 configuration delivers higher IOPS than a 10-disk RAID10 configuration. This rule also applies to SSDs, so be sure to use RAID10 with SSDs as well. Because SSDs are so fast, you can get good performance for some workloads using RAID50 (3+1 set size) when it is combined with a hardware RAID controller with the NVRAM write-back cache enabled. You’ll want to do your own workload-specific testing of a configuration before going into production; if there isn’t time for that, keep it simple and use RAID10, which will always give you maximum IOPS.

Figure 1: IOPS delivered by a 20-disk versus a 10-disk RAID10 configuration (image: qs_spindle_count)
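The relationship between mirror pairs and write IOPS can be approximated with a common rule of thumb: divide the raw IOPS of all disks by a per-layout write penalty. The sketch below is only an illustration of that arithmetic. The function name, the penalty table, and the 150 IOPS per-disk figure are assumptions for the example, not QuantaStor values or measured results, and controller cache and real workload mix will move the numbers considerably.

```python
# Rule-of-thumb estimate only; it ignores caching, queue depth, and workload mix.

# Commonly cited write penalties: back-end disk I/Os per front-end random write.
# RAID50 stripes across RAID5 sets, so it carries the RAID5 penalty.
WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid50": 4}


def estimate_random_write_iops(disk_count: int, per_disk_iops: int, layout: str) -> float:
    """Estimate aggregate random write IOPS for a pool of identical disks."""
    if disk_count <= 0 or per_disk_iops <= 0:
        raise ValueError("disk_count and per_disk_iops must be positive")
    if layout not in WRITE_PENALTY:
        raise ValueError(f"unknown layout: {layout}")
    return disk_count * per_disk_iops / WRITE_PENALTY[layout]


if __name__ == "__main__":
    PER_DISK_IOPS = 150  # assumed figure for a single HDD, not a measurement
    for disks in (10, 20):
        iops = estimate_random_write_iops(disks, PER_DISK_IOPS, "raid10")
        print(f"{disks} disks, RAID10: ~{iops:.0f} random write IOPS")
```

Under this model, doubling the disk count doubles the estimate, which matches the direction of the result in Figure 1. Use it to compare candidate layouts on paper, then confirm with workload-specific testing.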

Hardware vs. Software RAID

For High Availability (HA) configurations and configurations that need bit-rot protection, you’ll want to use an LSI HBA with enterprise SAS disks combined with the ZFS-based Storage Pool type in a RAID10 configuration. For everything else, and especially if you’re using SATA disks, we recommend using a hardware RAID controller such as an LSI MegaRAID or Adaptec RAID controller. With hardware RAID we also highly recommend that you add the NVRAM cache protection option so that you can use the controller cache as an NVRAM write-back cache. The impact of this cannot be overstated. As you can see in the graph below (Figure 2), the random read/write IOPS stay very high until the cache size is exceeded. Therefore, always make sure you have the LSI CacheVault or Adaptec ZMCP module installed on the controller card(s) and make sure write-back cache is turned ON. This is especially important when using parity-based RAID types like RAID5/50/6/60.

Figure 2: Random read/write IOPS with the hardware RAID NVRAM write-back cache (image: qs_hardware_raid_nvram_performance)

Hybrid Software + Hardware RAID

When you use hardware RAID with QuantaStor, it’s really a hybrid of software and hardware RAID. To configure it, create one large RAID10 hardware unit/array in the web UI and then use the resulting storage to create a storage pool using the ZFS type with software RAID0. This gives you a RAID10 storage pool that leverages the hardware RAID controller for hot-spare management and leverages the capabilities of ZFS for easy expansion of the pool.

Increasing RAM to Boost Performance

Another important design factor is selecting the right amount of RAM for your appliance. QuantaStor needs a couple of GB of RAM for the OS and its internal services. The rest of the RAM is used by the storage workloads, and the majority of that goes to the read cache (called ARC with ZFS-based pools) to boost performance. A good starting point is 24GB for the base system plus 1GB to 2GB for each additional VM. Adding more RAM is always good, but if the workload is light it will have diminishing returns. In general, for most virtual machine workloads you’ll see significant performance advantages from increasing the amount of RAM to 64GB, 128GB, or 256GB for larger appliances.
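The starting-point guidance above (24GB base plus 1GB to 2GB per VM) is simple arithmetic, and a short sketch can make the resulting range explicit for a given VM count. The function name and its parameters are hypothetical, chosen for this example only; the constants come from the guidance in this section.

```python
def ram_range_gb(vm_count: int, base_gb: int = 24,
                 per_vm_low_gb: int = 1, per_vm_high_gb: int = 2) -> tuple[int, int]:
    """Return the (low, high) starting-point RAM estimate in GB for an appliance."""
    if vm_count < 0:
        raise ValueError("vm_count must not be negative")
    low = base_gb + vm_count * per_vm_low_gb
    high = base_gb + vm_count * per_vm_high_gb
    return low, high


if __name__ == "__main__":
    for vms in (10, 40, 100):
        low, high = ram_range_gb(vms)
        print(f"{vms} VMs: start with roughly {low}GB to {high}GB of RAM")
```

Treat the result as a floor for planning rather than a target, and round up to the next practical memory configuration for the chassis.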

Network Card Selection

Avoid 1GbE if you can, unless the system is small with just a handful of disks (fewer than 8), and even then you should try to use 10GbE, as you’ll see huge performance benefits on cache hits. In general, use 10GbE or 8Gb FC for systems with many VMs or with VMs running high-load applications like databases and Microsoft Exchange.
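To see why the network link matters most on cache hits, it helps to compute the ceiling a link places on throughput regardless of how fast the appliance can serve data from RAM. The sketch below uses raw line rate only and ignores Ethernet, TCP, and iSCSI overhead, so real ceilings are lower. The function name and the 4KiB block size are assumptions for illustration.

```python
def link_ceiling_iops(link_gbps: float, block_size_kib: int) -> float:
    """Upper bound on IOPS a link can carry at a given block size (no protocol overhead)."""
    if link_gbps <= 0 or block_size_kib <= 0:
        raise ValueError("link_gbps and block_size_kib must be positive")
    bytes_per_second = link_gbps * 1e9 / 8  # line rate in bytes per second
    return bytes_per_second / (block_size_kib * 1024)


if __name__ == "__main__":
    BLOCK_KIB = 4  # assumed I/O size for the example
    for gbps in (1, 10):
        ceiling = link_ceiling_iops(gbps, BLOCK_KIB)
        print(f"{gbps}GbE at {BLOCK_KIB}KiB blocks: at most ~{ceiling:,.0f} IOPS")
```

A single 1GbE link tops out at roughly 125 MB/s of raw bandwidth, so a RAM-backed read cache can saturate it easily, while 10GbE raises that ceiling tenfold.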

Summary

Best-practice recommendations for optimizing server virtualization workloads on QuantaStor include the following:

  • Use a RAID10 layout with a high disk/spindle count
  • Use the default Storage Pool type (ZFS)
  • Add additional appliance RAM for read cache (64GB – 256GB+)
  • Use 10GbE NICs
  • Use iSCSI with multipathing for hypervisor connectivity rather than LACP bonding
  • When using HDDs for the pool, use SSDs for read cache if the VM count is large and you’re experiencing latency issues
  • Add 2x SSDs for write cache if the RAID10 spindle count is not high enough to keep up with the write load
  • With hardware RAID be sure you have the write-back cache enabled

If you need assistance selecting the right hardware and configuration strategy for your QuantaStor storage grid, email OSNEXUS at solutiondesignreview@osnexus.com.



Categories: High Availability, IOPS, Performance, virtualization

3 replies

  1. It looks like there is conflicting information regarding write back cache on hardware RAID controllers. In the last sentence in section Hardware vs. Software RAID “Therefore, always make sure you have the LSI CacheVault or Adaptec ZMCP module on the card(s) and make sure write-back cache is turned off.”

    vs.

    In the Summary section:
    With hardware RAID be sure you have the write-back cache enabled

  2. “…always make sure you have the LSI CacheVault or Adaptec ZMCP module on the card(s) and make sure write-back cache is turned off.”
    “With hardware RAID be sure you have the write-back cache enabled.”

    I’m no expert, but I think those sentences are contradictory.
