Configuring High Availability SDS Site Clusters

With QuantaStor’s expanded High Availability (HA) capabilities in version 3.15, you can now create and monitor Site Clusters and group node pairs together, all from within the web user interface. Site Clusters enable up to 32 nodes to pool resources and distribute work across the cluster while providing continuous service to end users in the event that an individual node goes offline.

How High Availability Site Clusters Work

Managing up to 32 storage appliances across global sites is no small task, but QuantaStor Grids make it easy to manage all appliances via the web interface, CLI, or REST API as a single unit.
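For scripted management, the REST API can be driven from any language. The sketch below uses Python to enumerate the appliances in a grid; the port, endpoint path, and response fields shown here are assumptions for illustration only, so check them against the API reference for your QuantaStor release.

    # Hypothetical sketch: list the appliances in a QuantaStor grid over REST.
    # The base URL, port, endpoint name, and JSON fields are assumptions;
    # consult the QuantaStor API documentation for the exact calls.
    import requests

    QS_NODE = "https://10.0.12.10:8153"   # any appliance in the grid (assumed port)
    AUTH = ("admin", "password")          # management credentials

    def list_grid_nodes():
        resp = requests.get(f"{QS_NODE}/qstorapi/storageSystemEnumerate",  # assumed endpoint
                            auth=AUTH, verify=False, timeout=10)
        resp.raise_for_status()
        return [node.get("name") for node in resp.json()]

    if __name__ == "__main__":
        print(list_grid_nodes())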

But what about the management of highly available storage pools that are specific to nodes at a particular site? For example, say nodes N1, N2, and N3 are all being used in a data center to provide access to a couple of highly available storage pools. To maintain a High Availability state, the storage pools rely on an underlying “heartbeat” mechanism that needs to be localized to the site where they’re deployed, and this is where the Site Cluster comes in.
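The heartbeat mechanism itself is conceptually simple. The following minimal sketch (illustrative Python, not QuantaStor code) shows the basic idea: each node records when it last heard from its peers and declares a peer offline after several missed heartbeats.

    # Conceptual sketch of a heartbeat-based failure detector (not QuantaStor code).
    import time

    HEARTBEAT_INTERVAL = 1.0    # seconds between expected heartbeats
    MISSED_LIMIT = 3            # missed beats before a peer is declared offline

    last_seen = {"N1": time.time(), "N2": time.time(), "N3": time.time()}

    def record_heartbeat(node):
        """Called whenever a heartbeat packet arrives from a peer."""
        last_seen[node] = time.time()

    def offline_peers():
        """Return peers whose heartbeats have been silent for too long."""
        now = time.time()
        return [n for n, t in last_seen.items()
                if now - t > MISSED_LIMIT * HEARTBEAT_INTERVAL]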

A Site Cluster uses a heartbeat cluster configuration that spans a subset of nodes within a QuantaStor grid. Once a Site Cluster is created, you can then create HA Storage Pools and HA virtual interfaces (VIFs) for Gluster volumes, Storage Pools, and even Grid Management (Figure 1).

Figure 1: High Availability Grid

Site Cluster Creation

To create a Site Cluster, you must first create a QuantaStor grid with at least two nodes, since highly available resources (virtual IP addresses) require more than one node. Note that you’ll want to have set up two network ports on each appliance, on separate networks, to facilitate the Site Cluster heartbeat mechanism. On each network a “Heartbeat Ring” or Cluster Ring is formed. By having two rings, the Site Cluster configuration is made resilient and flexible to network configuration changes. We always recommend setting up two rings for HA rather than a single-ring configuration (Figure 2).

Figure 2: Site Cluster
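To make the two-ring layout concrete, the snippet below models a three-node Site Cluster with one heartbeat ring per network. The structure and field names are purely illustrative and are not QuantaStor’s internal schema.

    # Illustrative model of a Site Cluster with two heartbeat rings (one per network).
    # Field names are for illustration only, not QuantaStor's internal schema.
    from dataclasses import dataclass

    @dataclass
    class ClusterRing:
        network: str        # e.g. "10.0.12.0/24"
        member_ips: list    # one heartbeat IP per node on this network

    @dataclass
    class SiteCluster:
        name: str
        nodes: list         # subset of grid nodes at this site
        rings: list         # two rings recommended for resiliency

    site_cluster = SiteCluster(
        name="dc-east",
        nodes=["N1", "N2", "N3"],
        rings=[
            ClusterRing("10.0.12.0/24", ["10.0.12.11", "10.0.12.12", "10.0.12.13"]),
            ClusterRing("10.0.13.0/24", ["10.0.13.11", "10.0.13.12", "10.0.13.13"]),
        ],
    )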

If a failure is detected on a node that has ownership of a Shared Storage Pool, an automatic HA failover event will be triggered by the Site Cluster. For example, if N1 has an active HA storage pool and the appliance is turned off, the heartbeat to that appliance will stop and the Site Cluster will move the Storage Pool’s HA virtual interface(s) to another node, such as N3. QuantaStor’s HA storage pool failover system automatically takes care of restoring NFS/CIFS client access to the Network Shares and iSCSI access to Storage Volumes.
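A simplified view of that failover decision (again just a conceptual sketch, building on the failure detector above) is that the Site Cluster reassigns the pool’s virtual interface(s) from the failed owner to a surviving node:

    # Conceptual sketch of HA pool failover: move a pool's virtual interface(s)
    # from a failed owner to a surviving node in the same Site Cluster.
    def fail_over_pool(pool_owner, pool_vifs, cluster_nodes, offline):
        if pool_owner not in offline:
            return pool_owner                   # owner is healthy, nothing to do
        survivors = [n for n in cluster_nodes if n not in offline]
        if not survivors:
            raise RuntimeError("no surviving node to take over the pool")
        new_owner = survivors[0]                # e.g. N3 takes over from N1
        for vif in pool_vifs:
            print(f"moving VIF {vif} to {new_owner}")   # placeholder for re-plumbing the IP
        return new_owner

    # Example: N1 owned the pool and has gone offline.
    fail_over_pool("N1", ["10.0.12.50"], ["N1", "N2", "N3"], offline=["N1"])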

Site Clusters are also used to maintain Gluster High Availability virtual interfaces (VIFs) that are used for CIFS and NFS access to Gluster Volumes. For example, if node N4 currently provides access to a Gluster HA VIF (10.0.12.23) and node N4 is turned off, then the VIF is automatically moved to another node in its Site Cluster; in this example, node N5. Note that each appliance can be a member of only a single Site Cluster, but you can have multiple Site Clusters in a grid.

Site Cluster Primary Use Cases

There are several primary use cases for Site Cluster management. The first is “Grid Management automatic master election,” which enables the automatic election of a new management master node if the current master node is disabled. To configure a new master node, go to the High-Availability tab and create a Grid Management virtual IP (VIF). The node that holds the Grid IP becomes the current master node. If you don’t set up a Grid Management virtual IP and you need to assign a different node as master, then you must manually select a new master node by right-clicking the grid object and choosing “Set Master Node.” Note that QuantaStor uses the terms Primary Node and Master Node interchangeably.
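One way to picture the election (the real mechanism is internal to QuantaStor; this is only a conceptual sketch) is that the master role follows the Grid Management VIF: if the node holding it goes offline, a surviving node takes over the VIF and becomes master.

    # Conceptual sketch: whichever node holds the Grid Management VIF acts as
    # master; if it goes offline, a surviving node takes over the VIF and role.
    def elect_master(grid_vif_owner, grid_nodes, offline):
        if grid_vif_owner not in offline:
            return grid_vif_owner
        survivors = sorted(n for n in grid_nodes if n not in offline)
        return survivors[0] if survivors else None

    print(elect_master("N1", ["N1", "N2", "N3"], offline=["N1"]))   # -> "N2"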

The second use case is HA Gluster access via NFS and CIFS. To facilitate highly available access to a Gluster Volume via traditional NAS protocols like NFS and CIFS, you’ll need a virtual IP. Enable HA Gluster access by going to the High-Availability tab and creating a Gluster Volume virtual IP. Whichever node has the Gluster Volume IP is the node your NFS and CIFS clients will connect to for storage resources. You can also set up multiple Gluster VIFs on different appliances so that clients connect via round-robin DNS to distribute load across the volume.
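From the client side, round-robin access is just ordinary DNS resolution: a single name returns several VIF addresses and the client connects to whichever one it receives. The hostname below is a placeholder.

    # Resolve a round-robin DNS name that maps to several Gluster VIFs and pick
    # one address to connect against.
    import socket

    def resolve_vifs(hostname):
        # getaddrinfo returns every A record behind the name; de-duplicate the IPs.
        infos = socket.getaddrinfo(hostname, 2049, socket.AF_INET, socket.SOCK_STREAM)
        return sorted({info[4][0] for info in infos})

    if __name__ == "__main__":
        vifs = resolve_vifs("gluster.example.com")   # placeholder round-robin DNS name
        print("available VIFs:", vifs)
        print("connecting to:", vifs[0])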

The third use case is HA Storage Pools. ZFS-based Storage Pools that use shared storage (SAS, iSCSI, FC) with devices that support SCSI-3 persistent reservations can also be configured to be highly available.
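Before putting shared devices under an HA pool, it can be worth confirming that they answer SCSI-3 Persistent Reservation commands. One common way to check is with sg_persist (from the sg3_utils package); the small wrapper below is a sketch, and /dev/sdb is only an example device path.

    # Check that a shared block device answers SCSI-3 Persistent Reservation IN
    # commands using sg_persist from sg3_utils. /dev/sdb is an example path.
    import subprocess

    def supports_scsi3_pr(device="/dev/sdb"):
        try:
            result = subprocess.run(["sg_persist", "--in", "--read-keys", device],
                                    capture_output=True, text=True, timeout=10)
        except FileNotFoundError:
            raise RuntimeError("sg_persist not installed (sg3_utils package)")
        return result.returncode == 0

    print("SCSI-3 PR supported:", supports_scsi3_pr())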



Categories: GlusterFS, High Availability, QuantaStor 3.15
