Replication

Remote Replication and Disaster Recovery with Software Defined Storage

If you haven't taken the time to review and identify which parts of your IT infrastructure house your critical data, now is the time. You never know when you're going to need it, and verifying that you have a plan in place to get things back up and running should be a top priority for every business.

Traditional storage systems required that you purchase “custom iron” with extra replication features that would allow your IT team to replicate VMs and databases between sites on a continuous basis throughout the day. With Software Defined Storage like QuantaStor, setting up remote replication policies (Disaster Recovery policies) has never been easier.

The screen shot below (Figure 1) shows how easy it is to set up a remote replication policy with just a few clicks once your appliances are deployed.

Create Replication Schedule

Figure 1

What is Remote Replication?

Remote Replication allows the copying of volumes and/or network shares from any QuantaStor storage appliance to any other appliance. It’s a great tool for migrating data between systems and is an ideal component of any Disaster Recovery (DR) strategy.

Remote Replication is done asynchronously, meaning that changes to volumes and network shares on one appliance are replicated as often as every hour with calendar-based scheduling or as often as every 15 minutes with timer-based scheduling. Once a given set of volumes and/or network shares has been replicated from one appliance to another, subsequent replication operations send only the changes (deltas) between appliances. The replication logic is also efficient in that it only replicates actual data written to your volumes/shares and not unused regions of your disk storage, which could be vast.

To ensure proper security, all data sent over the network is encrypted. Because only the deltas/changes are sent over the network, replication also works well over limited bandwidth networks. Enabling storage pool compression (available with the default ZFS based Storage Pools) further reduces network load by keeping the data compressed as it is sent over the wire.
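To get a feel for the bandwidth savings, here is a minimal back-of-the-envelope sketch in Python; the change rate and compression ratio are hypothetical numbers used for illustration, not QuantaStor measurements:

```python
# Rough estimate of per-cycle replication traffic when only deltas are sent.
# The change rate and compression ratio below are illustrative assumptions.

volume_size_gb = 2000          # total provisioned volume size
daily_change_rate = 0.02       # assume ~2% of the data changes per day
cycles_per_day = 24            # hourly calendar-based replication
compression_ratio = 0.5        # assume pool compression halves the payload

changed_gb_per_cycle = volume_size_gb * daily_change_rate / cycles_per_day
wire_gb_per_cycle = changed_gb_per_cycle * compression_ratio

print(f"Changed data per cycle:   {changed_gb_per_cycle:.2f} GB")
print(f"Sent over the wire:       {wire_gb_per_cycle:.2f} GB (compressed deltas)")
print(f"Full copy for comparison: {volume_size_gb} GB")
```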

Creating a Storage System Link

The first step in setting up Remote Replication is to form a grid of at least two QuantaStor storage appliances.  Grids provide a mechanism for managing all your appliances as a single unit across geographies. The QuantaStor grid communication technology connects appliances (nodes) together so that they can share information, coordinate activities such as Remote Replication, and simplify management operations.

After you create the grid you'll need to set up a Storage System Link between the two or more nodes between which you want to replicate data (volumes and/or shares). The Storage System Link represents a security key exchange between the two nodes so that they can send data to each other using low-level replication mechanisms that work at the storage pool level.

Creation of the Storage System Link is done through the QuantaStor Manager web interface by selecting the ‘Remote Replication’ tab, and then clicking the ‘Create Storage System Link’ button in the tool bar to bring up the dialog box. (Figure 2)

Storage System Link

Figure 2

Select the IP address on each system to be used for Remote Replication network traffic. If both systems are on the same network, you can simply select one of the IP addresses from one of the local ports; if the remote system is in the cloud or at a remote location, you will most likely need to specify the external IP address of your QuantaStor system. Note that the two systems communicate over ports 22 and 5151, so you will need to open these ports in your firewall for the QuantaStor systems to communicate properly.
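Before creating the link it can be handy to confirm those ports are reachable from each side. Here's a small Python sketch that simply tests TCP connectivity to ports 22 and 5151; the hostname is a placeholder for your own remote appliance:

```python
import socket

# Placeholder address for the remote QuantaStor system; replace with your own.
REMOTE_HOST = "quantastor-dr.example.com"
REPLICATION_PORTS = [22, 5151]   # ports used for QuantaStor remote replication

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port in REPLICATION_PORTS:
    state = "open" if port_open(REMOTE_HOST, port) else "blocked/unreachable"
    print(f"{REMOTE_HOST}:{port} -> {state}")
```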

Creating a Remote Replica

Once you have a Storage System Link created between at least two systems, you can now replicate volumes and network shares in either direction. Simply log in to the system that you want to replicate volumes from, right click on the volume to be replicated, then choose “Create Remote Replica.”

Creating a remote replica is much like creating a local clone, only the data is copied over to a storage pool on a remote storage system. As such, when you create a Remote Replica you must specify which storage system you want to replicate to (only systems with established and online storage system links are displayed) and which storage pool within that system should hold the Remote Replica. If you have already replicated the specified volume to the remote storage system, you can re-sync the remote volume by selecting the Remote Replica association in the web interface and choosing 'Resync'. This can also be done via the 'Create Remote Replica' dialog by choosing the option to replicate to an existing target, if available.

Creating a Remote Replication Schedule (Replication Policy)

Remote replication schedules provide a mechanism for replicating the changes to your volumes to a matching checkpoint volume on a remote appliance automatically on a timer or a fixed schedule. This is also commonly referred to as a DR or Disaster Recovery replication policy, but you can use replication schedules for a whole host of use cases.

To create a schedule navigate to the Remote Replication Schedules section after selecting the Remote Replication tab at the top of the screen. Right-click on the section header and choose “Create Replication Schedule.” (Figure 3)

Remote Replication Schedule

Figure 3

Besides selecting the volumes and/or shares to be replicated, you must select the number of snapshot checkpoints to be maintained on the local and remote systems. You can use these snapshots for off-host backup and other data recovery purposes as well, so there is no need for a separate Snapshot Schedule, which would be redundant with the snapshots created by your replication schedule.

If you choose five replicas then up to five snapshot checkpoints will be retained. If, for example, you were replicating nightly at 1 am each day from Monday to Friday, you would have a week's worth of snapshots as data recovery points. If you are replicating four times each day and need a week of snapshots, you would need 5×4, or a maximum replicas setting of 20.
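The retention math generalizes easily; a quick sketch using the numbers above:

```python
# How many snapshot checkpoints (max replicas) to keep for a desired
# recovery window, given how often the replication schedule fires.

def max_replicas(replications_per_day: int, retention_days: int) -> int:
    return replications_per_day * retention_days

print(max_replicas(1, 5))   # nightly, Mon-Fri             -> 5
print(max_replicas(4, 5))   # four times a day for 5 days  -> 20
```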

Summary

With the wide range of server hardware you can deploy QuantaStor SDS onto and the ease with which you can set up a DR / remote replication schedule, you now have the tools to put a business continuity plan (BCP) in place in a reliable and economical way. If you're behind on your BCP, set micro goals to start chipping away at it and you'll have the peace of mind you and your company need in no time.

Disaster Recovery Software Defined Storage

Managing Scale-out NAS File Storage with GlusterFS Volumes

QuantaStor provides scale-out NAS capabilities using the traditional CIFS/SMB and NFS protocols as well as the GlusterFS client protocol for scale-out deployments. For those not familiar with GlusterFS, it's a scale-out filesystem that ties multiple underlying filesystems together across appliances to present them in aggregate as a single filesystem, or "single namespace" as it's often called.

In QuantaStor appliances, Gluster is layered on top of the QuantaStor Storage Pool architecture enabling the use of QuantaStor appliances for file, block, and scale-out file storage needs all at the same time.

Scale-out NAS using GlusterFS technology is great for unstructured data, archive, and many media use cases. However, due to the current architecture of GlusterFS, it’s not as good for high IOPS workloads such as databases and virtual machines. For those applications, you’ll want to provision block devices (Storage Volumes) or file shares (Network Shares) in the QuantaStor appliances which can deliver the necessary write IOPS performance needed for transactional workloads and VMs.

GlusterFS read/write performance via CIFS/NFS is moderate and can be improved with SSDs or hardware RAID controllers with one or more gigabytes of NVRAM read/write cache. For deployments accessing scale-out NAS storage from Linux, it is ideal to use the native GlusterFS client as it allows performance and bandwidth to increase as you scale out your QuantaStor grid. For Windows, OS X, and other operating systems you'll need to use the traditional CIFS/NFS protocols.

QuantaStor Grid Setup

To provision scale-out NAS shares on QuantaStor appliances, the first step is to create a management grid by right-clicking on the Storage System icon in the tree stack view in the Web Management User Interface (WUI) and choosing "Create Grid." (Figure 1)

Software Defined Storage Grid

Figure 1

After you create the grid you’ll need to add appliances to the grid by right-clicking on the Grid icon and choosing ‘Add Grid Node.’ (Figure 2) Input the node IP address and password for the additional appliances.

Storage Management Grid Node

Figure 2

After the nodes are added you’ll be able to manage them from the QuantaStor tree menu (Figure 3). User accounts across the appliances will be merged with the elected primary/master node in the grid taking precedence.

SDS Storage Grid

Figure 3

Network Setup Procedure

If you plan to use the native GlusterFS client with Linux servers connected directly to QuantaStor nodes, you should set up network bonding to bind multiple network ports on each appliance for additional bandwidth and automatic fail-over.

If you plan to use CIFS/NFS as the primary protocols, you can either use bonding or split the ports into a front-end network for client access and a back-end network for inter-node communication.

Peer Setup

QuantaStor appliance grids allow intercommunication but do not automatically set up GlusterFS peer relationships between appliances. For that you'll want to select 'Peer Setup' and select the IP address on each node to be used for GlusterFS intercommunication. (Figure 4)

GlusterFS Peer Setup

 Figure 4

Peer Setup creates a "hosts" file (/etc/hosts) on each appliance so that each node can refer to the other grid nodes by name; name resolution can also be done via DNS. The hosts file approach ensures that the configuration is kept in sync across nodes and allows the nodes to resolve names even if DNS server access is down.

Gluster volumes span appliances, and on each appliance Gluster places a brick. These Gluster bricks are referenced with a brick path that looks much like a URL. By setting up the IP-to-hostname mappings, QuantaStor is able to create brick paths using hostnames rather than IP addresses, making it easier to change the IP address of a node later.

Finally, in the Peer Setup dialog, there's a check box to set up the Gluster peer relationships. The 'gluster peer probe' command links the nodes together so that Gluster volumes can be created across the appliances. Once the peers are attached, you'll see them appear in the Gluster Peers section of the WUI and you can then begin to provision Gluster Volumes. Alternatively, you can add the peers one at a time using the Peer Attach dialog. (Figure 5)
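For the curious, the steps Peer Setup automates look roughly like the sketch below: write hostname mappings into /etc/hosts on every node and then run gluster peer probe against each peer. This is only an illustration of the underlying GlusterFS mechanics with made-up node names and IPs; in practice you would let the QuantaStor dialog do this for you.

```python
import subprocess

# Hypothetical grid nodes and the IPs chosen for GlusterFS intercommunication.
NODES = {
    "qs-node1": "10.0.0.11",
    "qs-node2": "10.0.0.12",
    "qs-node3": "10.0.0.13",
}

def hosts_entries(nodes: dict) -> str:
    """Render /etc/hosts lines so nodes can resolve each other without DNS."""
    return "\n".join(f"{ip}\t{name}" for name, ip in nodes.items())

def probe_peers(local_name: str, nodes: dict) -> None:
    """Run 'gluster peer probe' against every other node (requires root)."""
    for name in nodes:
        if name != local_name:
            subprocess.run(["gluster", "peer", "probe", name], check=True)

print(hosts_entries(NODES))
# probe_peers("qs-node1", NODES)   # uncomment when run on the appliance itself
```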

Gluster Peer Attach

Figure 5

Provisioning Gluster Volumes

Gluster Volumes are provisioned from the ‘Gluster Management’ tab in the web user interface. To make a new Gluster Volume simply right-click on the Gluster Volumes section or choose Create Gluster Volume from the tool bar. (Figure 6)

To make a Gluster Volume highly available with two copies of each file, choose a replica count of two (2). If you only need fault tolerance in case of a disk failure, that is supplied by the storage pools and you can use a replica count of one (1). With a replica count of two (2) you have full read/write access to your scale-out Network Share even if one of the appliances is turned off. With a replica count of one (1) you will lose read access to some of your data if one of the appliances is turned off. When the appliance is turned back on it will automatically synchronize with the other nodes to bring itself up to the current state via auto-healing.

Gluster Volume

Figure 6

Storage Pool Design Recommendations

When designing large QuantaStor grids using GlusterFS, there are several configuration choices we recommend to ensure maximum reliability and maintainability.

First and foremost, it’s better to create multiple storage pools per appliance to allow for faster rebuilds and filesystem checks than it is to make one large storage pool.  

In the following diagram (Figure 7) we show four appliances with one large 128TB storage pool each. With large amounts of data, this seems like a slightly simpler configuration from a setup perspective. However, there's a very high cost in the event a GlusterFS brick needs to do a full heal given the large brick size (128TB). To put this in perspective, at 100MB/sec (1GbE) it takes at minimum 2.8 hours to transfer one TB of data (and that's assuming theoretical maximum throughput), or roughly 15 days to repair a 128TB brick.
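The arithmetic behind those numbers is simple enough to sanity-check yourself:

```python
# Best-case time to re-transfer a brick at a given throughput
# (assumes the link runs at its theoretical maximum the whole time).

def transfer_hours(size_tb: float, throughput_mb_s: float) -> float:
    size_mb = size_tb * 1_000_000          # 1 TB = 1,000,000 MB (decimal units)
    return size_mb / throughput_mb_s / 3600

print(f"1 TB   at 100 MB/s: {transfer_hours(1, 100):.1f} hours")        # ~2.8 hours
print(f"128 TB at 100 MB/s: {transfer_hours(128, 100) / 24:.1f} days")  # ~15 days
print(f"32 TB  at 100 MB/s: {transfer_hours(32, 100) / 24:.1f} days")   # ~3.7 days
```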

128TB Gluster Pools

Figure 7

The QuantaStor SDS appliance grid below (Figure 8) shows four storage pools per appliance, each segmented into 32TB. Each of the individual 32TB storage pools can be any RAID layout type, but we recommend hardware RAID5 or RAID50 as that provides low-cost disk fault tolerance within the storage pool and write caching, assuming the controller has CacheVault/MaxCache or another NVRAM-type technology. When accounting for a GlusterFS volume configured with two replicas (mirroring), the effective RAID type is RAID5+1, or RAID51.

SDS Appliance Grid

Figure 8

In contrast, we don’t recommend using RAID0 (which has no fault tolerance) as there is a high cost associated with having GlusterFS repair large bricks. In this scenario a full 32TB would need to be copied from the associated mirror brick and that can take a long time as well as utilize considerable disk and network bandwidth. Be sure to use RAID50 or RAID5 for all storage pools so that the disk rebuild process is localized to the hardware controller and doesn’t impact the entire network and other layers of the system.

Smart Brick Layout

When you provision a scale-out network share (GlusterFS volume) it is also important that the bricks are selected so that no brick mirror-pair has both bricks on the same appliance. This ensures that there's no single point of failure in HA configurations. QuantaStor automatically ensures correct Gluster brick placement when you select pools and provision a new Gluster Volume by processing the pool list with a round-robin technique so that brick mirror-pairs never reside on the same appliance. (Figure 9)
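A simplified sketch of this kind of round-robin placement (not QuantaStor's actual implementation, and with made-up appliance and pool names) shows why mirror pairs never land on the same appliance when the pools are interleaved across nodes:

```python
# Interleave pools from different appliances so that, with replica count 2,
# consecutive bricks (which form a mirror pair) come from different appliances.
# Appliance and pool names here are purely illustrative.

pools_by_appliance = {
    "appliance-1": ["pool-a", "pool-b"],
    "appliance-2": ["pool-a", "pool-b"],
    "appliance-3": ["pool-a", "pool-b"],
    "appliance-4": ["pool-a", "pool-b"],
}

def round_robin_bricks(pools_by_appliance: dict) -> list:
    """Return a brick order that alternates appliances (round-robin)."""
    ordered = []
    columns = zip(*[[(a, p) for p in pools] for a, pools in pools_by_appliance.items()])
    for column in columns:                     # one pool from each appliance per pass
        ordered.extend(f"{a}:/{p}" for a, p in column)
    return ordered

bricks = round_robin_bricks(pools_by_appliance)
pairs = list(zip(bricks[0::2], bricks[1::2]))  # replica-2 mirror pairs
for left, right in pairs:
    print(left, "<->", right)                  # each pair spans two appliances
```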

Brick Replica Pair

Figure 9

Once provisioned, the scale-out storage is managed as a standard Network Share in the QuantaStor grid except that it can be accessed from any appliance. Finally, note the use of the highly-available virtual interface for NFS/CIFS access in the diagram below (Figure 10). This ensures that the IP address is always available even if the appliance actively serving that virtual interface goes offline. If you're using the GlusterFS native client then you don't need to set up an HA virtual network interface, as the GlusterFS native client communication mechanism is inherently highly available.

HA Interface

Figure 10

Filesystems GlusterFS High Availability NAS
Cloud Containers

Deploying Software Defined Storage with Cloud Containers

Over at The Virtualization Guy blog, Gaston Pantana wrote a great post on how to integrate SoftLayer’s Object Storage with QuantaStor using Cloud Containers. Cloud Containers let you back up storage volumes to Amazon S3, Google Cloud Storage and IBM/SoftLayer Object Storage. The data is compressed, encrypted, and deduplicated and is ideal for unstructured information such as documents.


The following Vine shows a cloud container being created for SoftLayer Object Storage.

Software Defined Storage

Protecting Against Data Loss Using RAID for Post-production Media Storage

 

If you just watched the above YouTube video about how Toy Story 2 was almost erased out of existence by a mistyped Linux command, then you’re probably paying close attention to why using RAID for media storage is a great idea. In addition to a disaster recovery plan, it’s always wise to protect critical data against disk failures and RAID provides a great solution.

Disk drives do have life spans. After cloud backup provider Backblaze analyzed 25,000 of their deployed production drives, they found that 22% of drives failed within their first four years, at varying annual failure rates (see Figure 1), leaving 78% of Backblaze drives still alive after four years.

Drive Failure Rates

Figure 1

Using RAID to Protect Against Disk Failures

RAID, also known as a "redundant array of independent disks," combines multiple drives into one logical unit for fault tolerance and improved performance. Different RAID architectures provide a balance between storage system goals including reliability, availability, performance, and usable capacity. One of the primary uses of RAID technology is to provide fault tolerance so that in the event of a disk failure there is no downtime and no loss of data. Some RAID types, such as RAID6, support multiple simultaneous disk failures; at the other end of the spectrum, RAID0 combines disks into a unit for improved performance but does not provide disk fault tolerance.

Parity, from the Latin term "paritas," means equal or equivalent and refers to RAID types 5 and 6, where error-correction codes (XOR parity, plus Reed-Solomon for RAID6's second parity) are used to produce additional "parity" data which the system can use to recover all the data in the event a drive fails. RAID types 2, 3, and 4 are not commonly used as they require special hardware or have design aspects that make them less efficient than RAID5 or RAID6, so we're going to skip over RAID2/3/4.
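A tiny Python example makes the XOR-parity idea concrete: the parity block is the XOR of the data blocks, so any single missing block can be rebuilt from the survivors. (RAID6's second parity uses Reed-Solomon coding and isn't shown here.)

```python
from functools import reduce

# Three "data blocks" (bytes) and one parity block, as in a 3+1 RAID5 stripe.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*blocks))

parity = xor_blocks(d0, d1, d2)

# Simulate losing d1: it can be recovered from the remaining blocks + parity.
recovered_d1 = xor_blocks(d0, d2, parity)
assert recovered_d1 == d1
print("Recovered block:", recovered_d1)
```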

Typical RAID configurations include:

  • RAID 0 consists of striping, without mirroring or parity and is generally NOT recommended as loss of a single disk drive results in complete loss of all data.
  • RAID 1 consists of mirroring and is recommended for small configurations, usable capacity is 50% of total capacity.
  • RAID 5 consists of block-level striping with distributed parity and is recommended for archive configurations, but use no more than 7 drives in a group. It can sustain the loss of one drive but must then be repaired using a spare disk before another disk fails. Large RAID5 groups are risky as the odds of a second disk failure are higher when many disks are in the RAID5 unit. For example, a RAID5 unit with 3 data disks and one parity disk (3+1) will have usable capacity of 75% of the total and be low risk. In contrast, a 12+1 RAID5 group is higher risk due to the increased probability of a second failure while the unit rebuilds from the first disk failure.
  • RAID 6 consists of block-level striping with double distributed parity and can be slower with some RAID implementations but generally performs close to the performance of RAID5. Again, it’s good for archive, not so good for virtual machines and other high transaction workloads.
  • RAID 7 consists of block-level striping with triple distributed parity and can sustain three simultaneous disk failures, which makes it ideal for large long-term archive configurations. With the ZFS storage pool type this layout is referred to as RAID-Z3, indicating the three drives used for Reed-Solomon parity information.
  • RAID 10 consists of multiple RAID1 groups that are combined into one large unit using RAID0. It’s also the most recommended RAID layout as it combines fault tolerance with a large boost in IOPS or transactional performance.
  • RAID 50 consists of multiple RAID5 groups which are combined into one large unit using RAID0. Using small RAID5 groups of 3 disks + 1 parity or 4 disks + 1 parity disk you can use RAID50 for light load virtualization deployments while yielding a higher amount of usable disk space (75% and 80% respectively).
  • RAID 60 consists of multiple RAID6 groups which are combined into one large unit using RAID0 and is good for large archive configurations.

ZFS, QuantaStor's native file system, supports RAID 0, RAID 1, RAID 5/50 (RAID-Z), RAID 6/60 (RAID-Z2), and RAID 7/70, a triple-parity version called RAID-Z3.
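As a rough guide to the usable-capacity figures quoted above, here is a small sketch; it ignores filesystem overhead, hot spares, and ZFS metadata:

```python
# Approximate usable capacity as a fraction of raw capacity per RAID group.
# data_drives = drives holding data, parity_drives = drives' worth of parity/mirror.

def usable_fraction(data_drives: int, parity_drives: int) -> float:
    return data_drives / (data_drives + parity_drives)

print(f"RAID1   (1+1 mirror): {usable_fraction(1, 1):.0%}")   # 50%
print(f"RAID5   (3+1):        {usable_fraction(3, 1):.0%}")   # 75%
print(f"RAID5   (4+1):        {usable_fraction(4, 1):.0%}")   # 80%
print(f"RAID6   (6+2):        {usable_fraction(6, 2):.0%}")   # 75%
print(f"RAID-Z3 (8+3):        {usable_fraction(8, 3):.0%}")   # ~73%
```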

Parity-based RAID for Media and Archive

Because RAID6 employs double parity (called P and Q) and can sustain two simultaneous disk failures with no data loss, it's a good solution for protecting critical data against drive failures. RAID6 is highly fault tolerant but it does have some drawbacks. To keep parity information consistent, parity-based RAID layouts like RAID5 and RAID6 must update the parity information any time data is written. Updating parity requires reading and/or writing from all the disks in the stripe regardless of the data block size being written. This means that it can take roughly the same amount of time to write 4KB as it does to write 1MB.

If your workload is mostly reads with only one or two writers that do mostly sequential writes, as is the case with large files, then you’ve got a good candidate for RAID6.

RAID controllers that have a battery-backed or super-capacitor-protected NVRAM cache can hold writes for a period of time and can often combine many 4K writes into larger, more efficient 1MB full-stripe writes. This IO coalescing works great when the IO patterns are sequential, as with many media and archive applications, but it doesn't work well when the data is being written to disparate areas of the drive, as you see with databases and virtual machines.
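The write-coalescing effect is easy to picture with a toy model: sequential 4K writes can be merged into full 1MB stripes before being flushed, while scattered writes mostly cannot. This is only an illustration of the concept, not how any particular controller firmware actually works:

```python
# Toy model: count how many back-end stripe updates result from a stream of
# 4K writes, depending on whether the offsets are sequential or scattered.

import random

STRIPE = 1024 * 1024      # 1 MB full stripe
IO_SIZE = 4 * 1024        # 4 KB application writes
NUM_WRITES = 4096         # 16 MB of data in total

def stripes_touched(offsets):
    """Each distinct stripe touched costs at least one parity update."""
    return len({off // STRIPE for off in offsets})

sequential = [i * IO_SIZE for i in range(NUM_WRITES)]
scattered = [random.randrange(0, 1 << 40) // IO_SIZE * IO_SIZE for _ in range(NUM_WRITES)]

print("Sequential 4K writes -> stripe updates:", stripes_touched(sequential))  # ~16
print("Scattered  4K writes -> stripe updates:", stripes_touched(scattered))   # ~4096
```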

For more information on configuring RAID with QuantaStor see the Getting Started Guide, and be sure to check out the Vine on Creating a RAID6 Storage Pool with QuantaStor.

Disaster Recovery High Availability

Use Vines to Learn QuantaStor in Seconds

At OSNEXUS we’re always trying to make it easy for our customers and partners to learn the many features that the QuantaStor SDS platform has to offer. We also realize that everyone is very busy and time is a precious commodity. To that end, we’re happy to announce that in addition to the OSNEXUS support site, we are introducing our new OSNEXUS Vine page enabling anyone to learn QuantaStor, even if they only have six seconds! You can view a few of our new Vines below. Also, be sure to follow our Twitter feed for new vines that will be added weekly.

  • Create a Gluster Volume with QuantaStor
  • Create a QuantaStor Storage Pool
  • Create a QuantaStor Network Share
  • Add a QuantaStor License Key

GlusterFS QuantaStor Software Defined Storage Storage Appliance Hardware
ZFS Hot Spares

How to: Universal Hot-spare Management for ZFS-based Storage Pools

A standard best practice for preventing data loss due to disk failure is to designate one or more disk spares so that fault-tolerant arrays can auto-heal using a spare in the event of an HDD or SSD drive failure. Universal hot-spare management takes that a step further and lets you repair any array in the system with a given hot spare rather than having to designate specific spares for specific RAID groups (arrays).

Policy Driven Hot Spare Management

QuantaStor's universal hot-spare management was designed to automatically or manually reconstruct/heal fault-tolerant arrays (RAID 1/10/5/50/6/60/7/70) when one or more disks fail. Depending on the specific needs of a given configuration or storage pool, spare disk devices can be assigned to a specific pool or added to the "universal" hot spare list, meaning they can be used by any storage pool.

QuantaStor's hot spare management system is also distributed and "grid aware" so that spares can be shared between multiple appliances if they are connected to one or more shared disk enclosures (JBODs). The grid-aware spare management system makes decisions about how and when to repair disk pools based on hot-spare management policies set by the IT administrator.

Some of the challenges that QuantaStor's policy-driven auto-heal system must tackle include: repairing pools only with like devices (don't use a slow HDD to repair an SSD pool!); detecting when an enclosure was turned off by differentiating between a power loss and a disk failure; allowing users to set policies and assign disks to pools as needed; and allowing hot spares to be shared and reserved within and across appliances for HA (High Availability) configurations.

With this policy driven hot-spare management, OSNEXUS has developed an advanced system that makes managing HBA (SAS Host Bus Adapter) connected spares for our ZFS-based Storage Pools as easy as managing spares in a hardware RAID controller (also manageable via QuantaStor).

Configuring Storage Pool Hot Spare Policies

The Hot Spare Policy Manager can be found in the Modify Storage Pool dialog box under Storage Pools. (Figure 1)

QuantaStor Hot Spare Policy Manager

Figure 1

The default hot spare policy is "Autoselect best match from assigned or universal spares." Additional options include:

  • Auto-select best match from pool assigned spares only
  • Auto-select exact match from assigned or universal spares
  • Auto-select exact match from pool assigned spares only
  • Manual hot spare management only
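To make the policy semantics concrete, here is a hypothetical sketch of how "best match" versus "exact match" spare selection might be decided for a failed pool device. The device names, sizes, and selection rules shown are illustrative, not QuantaStor's internal logic:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Disk:
    name: str
    size_tb: float
    media: str          # "HDD" or "SSD"

def pick_spare(failed: Disk, spares: List[Disk], exact: bool) -> Optional[Disk]:
    """Pick a spare of the same media type; 'exact' requires the same size,
    otherwise the smallest spare at least as large wins (best match)."""
    candidates = [s for s in spares if s.media == failed.media]
    if exact:
        candidates = [s for s in candidates if s.size_tb == failed.size_tb]
    else:
        candidates = [s for s in candidates if s.size_tb >= failed.size_tb]
    return min(candidates, key=lambda s: s.size_tb, default=None)

spares = [Disk("sdx", 8.0, "HDD"), Disk("sdy", 4.0, "HDD"), Disk("sdz", 4.0, "SSD")]
failed = Disk("sdc", 4.0, "HDD")

print(pick_spare(failed, spares, exact=True))    # exact-match policy -> sdy
print(pick_spare(failed, spares, exact=False))   # best-match policy  -> sdy
```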

Marking Hot-Spares

Physical disks can be marked or unmarked as hot spares under "Physical Disks" by clicking on the disk and then selecting "Mark as Hot Spare" or "Unmark as Hot Spare" from the dialog box. This applies to the "Universal" hot spare list. (Figure 2)

Figure 2

Pinning Spares to Specific Storage Pools

To add assigned disk spares to specific storage pools rather than the universal spare list click on the storage pool and select “Recover Storage Pool/Add Spares.” (Figure 3)

Adding Assigned Disk Spares to Specific Storage Pools
Figure 3

Select disks to be added to the storage pool using the “Add Hot-spares / Recover Storage Pool” dialog box. (Figure 4)

Add Hot-spares / Recover Storage Pool
Figure 4 

For more information about Hot Spare Management and Storage Pool Sizing see the OSNEXUS Solution Design Guide.

Disaster Recovery