Stretch clusters and cluster-of-clusters are two approaches for making shared resources available across geographically distributed sites so that a second site can be called into action after one site fails. To use these approaches, you must first understand how the applications you use and the storage subsystems in your network deployment can determine whether a stretch cluster or cluster of clusters solution is possible for your environment.
Traditional LAN applications require a LAN infrastructure that must be replicated at each site, and might require relocation of employees to allow the business to continue. Internet-based applications allow employees to work from any place that offers an Internet connection, including homes and hotels. Moving applications and services to the Internet frees corporations from the restrictions of traditional LAN-based applications.
By using Novell exteNd Director portal services, Novell Access Manager, and ZENworks, all services, applications, and data can be rendered through the Internet, allowing for loss of service at one site but still providing full access to the services and data by virtue of the ubiquity of the Internet. Data and services continue to be available from the other mirrored sites.
For clustering implementations that are deployed in data centers in different geographic locations, the data must be replicated between the storage subsystems at each data center. Data-block replication can be done by host-based mirroring for synchronous replication over short distances up to 10 km. Typically, replication of data blocks between storage systems in the data centers is performed by SAN hardware that allows synchronous mirrors over a greater distance.
For stretch clusters, host-based mirroring is required to provide synchronous mirroring of the SBD (split-brain detector) partition between sites. This means that stretch-cluster solutions are limited to distances of 10 km.
Table 1-1 compares the benefits and limitations of host-based and storage-based mirroring.
Table 1-1 Comparison of Host-Based and Storage-Based Data Mirroring
Capability |
Host-Based Mirroring |
Storage-Based Mirroring |
---|---|---|
Geographic distance between sites |
Up to 10 km |
Can be up to and over 300 km. The actual distance is limited only by the SAN hardware and media interconnects for your deployment. |
Mirroring the SBD partition |
An SBD can be mirrored between two sites. |
Yes, if mirroring is supported by the SAN hardware and media interconnects for your deployment. |
Synchronous data-block replication of data between sites |
Yes |
Yes, requires a Fibre Channel SAN or iSCSI SAN. |
Failover support |
No additional configuration of the hardware is required. |
Requires additional configuration of the SAN hardware. |
Failure of the site interconnect |
LUNs can become primary at both locations (split brain problem). |
Clusters continue to function independently. Minimizes the chance of LUNs at both locations becoming primary (split brain problem). |
SMI-S compliance |
If the storage subsystems are not SMI-S compliant, the storage subsystems must be controllable by scripts running on the nodes of the cluster. |
If the storage subsystems are not SMI-S compliant, the storage subsystems must be controllable by scripts running on the nodes of the cluster. |
A stretch cluster and a cluster of clusters are two clustering implementations that you can use with Novell Cluster Services to achieve your desired level of disaster recovery. This section describes each deployment type, then compares the capabilities of each.
Novell Business Continuity Clustering automates some of the configuration and processes used in a cluster of clusters. For information, see Section 1.3, Business Continuity Clustering.
A stretch cluster consists of a single cluster where the nodes are located in two geographically separate data centers. All nodes in the cluster must be in the same Novell eDirectory tree, which requires the eDirectory replica ring to span data centers. The IP addresses for nodes and cluster resources in the cluster must share a common IP subnet.
At least one storage system must reside in each data center. The data is replicated between locations by using host-based mirroring or storage-based mirroring. For information about using mirroring solutions for data replication, see Section 1.2.2, Host-Based versus Storage-Based Data Mirroring. Link latency can occur between nodes at different sites, so the heartbeat tolerance between nodes of the cluster must be increased to allow for the delay.
The split-brain detector (SBD) is mirrored between the sites. Failure of the site interconnect can result in LUNs becoming primary at both locations (split brain problem) if host-based mirroring is used.
In the stretch-cluster architecture shown in Figure 1-1, the data is mirrored between two data centers that are geographically separated. The server nodes in both data centers are part of one cluster, so that if a disaster occurs in one data center, the nodes in the other data center automatically take over.
Figure 1-1 Stretch Cluster
A cluster of clusters consists of multiple clusters in which each cluster is located in a geographically separate data center. Each cluster can be in different Organizational Unit (OU) containers in the same eDirectory tree, or in different eDirectory trees. Each cluster can be in a different IP subnet.
A cluster of clusters provides the ability to fail over selected cluster resources or all cluster resources from one cluster to another cluster. For example, the cluster resources in one cluster can fail over to separate clusters by using a multiple-site fan-out failover approach. A given service can be provided by multiple clusters. Resource configurations are replicated to each peer cluster and synchronized manually. Failover between clusters requires manual management of the storage systems and the cluster.
Nodes in each cluster access only the storage systems co-located in the same data center. Typically, data is replicated by using storage-based mirroring. Each cluster has its own SBD partition. The SBD partition is not mirrored across the sites, which minimizes the chance for a split-brain problem occurring when using host-based mirroring. For information about using mirroring solutions for data replication, see Section 1.2.2, Host-Based versus Storage-Based Data Mirroring.
In the cluster-of-clusters architecture shown in Figure 1-2, the data is synchronized by the SAN hardware between two data centers that are geographically separated. If a disaster occurs in one data center, the cluster in the other data center takes over.
Figure 1-2 Cluster of Clusters
Table 1-2 compares the capabilities of a stretch cluster and a cluster of clusters.
Table 1-2 Comparison of Stretch Cluster and Cluster of Clusters
Capability |
Stretch Cluster |
Cluster of Clusters |
---|---|---|
Number of clusters |
One |
Two or more |
Number of geographically separated data centers |
Two |
Two or more |
eDirectory trees |
Single tree only; requires the replica ring to span data centers. |
One or multiple trees |
eDirectory Organizational Units (OUs) |
Single OU container for all nodes. As a best practice, place the cluster container in an OU separate from the rest of the tree. |
Each cluster can be in a different OU. Each cluster is in a single OU container. As a best practice, place each cluster container in an OU separate from the rest of the tree. |
IP subnet |
IP addresses for nodes and cluster resources must be in a single IP subnet. Because the subnet spans multiple locations, you must ensure that your switches handle gratuitous ARP (Address Resolution Protocol). |
IP addresses in a given cluster are in a single IP subnet. Each cluster can use the same or different IP subnet. If you use the same subnet for all clusters in the cluster of clusters, you must ensure that your switches handle gratuitous ARP. |
SBD partition |
A single SBD is mirrored between two sites by using host-based mirroring, which limits the distance between data centers to 10 km. |
Each cluster has its own SBD. Each cluster can have an on-site mirror of its SBD for high availability. If the cluster of clusters uses host-based mirroring, the SBD is not mirrored between sites, which minimizes the chance of LUNs at both locations becoming primary. |
Failure of the site interconnect if using host-based mirroring |
LUNs might become primary at both locations (split brain problem). |
Clusters continue to function independently. |
Storage subsystem |
Each cluster accesses only the storage subsystem on its own site. |
Each cluster accesses only the storage subsystem on its own site. |
Data-block replication between sites For information about data replication solutions, see Section 1.2.2, Host-Based versus Storage-Based Data Mirroring. |
Yes; typically uses storage-based mirroring, but host-based mirroring is possible for distances up to 10 km. |
Yes; typically uses storage-based mirroring, but host-based mirroring is possible for distances up to 10 km. |
Clustered services |
A single service instance runs in the cluster. |
Each cluster can run an instance of the service. |
Cluster resource failover |
Automatic failover to preferred nodes at the other site. |
Manual failover to preferred nodes on one or multiple clusters (multiple-site fan-out failover). Failover requires additional configuration. |
Cluster resource configurations |
Configured for a single cluster |
Configured for the primary cluster that hosts the resource, then the configuration is manually replicated to the peer clusters. |
Cluster resource configuration synchronization |
Controlled by the master node |
Manual process that can be tedious and error-prone. |
Failover of cluster resources between clusters |
Not applicable |
Manual management of the storage systems and the cluster. |
Link latency between sites |
Can cause false failovers. The cluster heartbeat tolerance between master and slave must be increased to as high as 30 seconds. Monitor cluster heartbeat statistics, then tune down as needed. |
Each cluster functions independently in its own geographical site. |
Table 1-3 illustrates why a cluster of cluster solution is less problematic to deploy than a stretch cluster solution. Manual configuration is not a problem when using Novell Business Continuity Clustering for your cluster of clusters.
Table 1-3 Advantages and Disadvantages of Stretch Clusters versus Cluster of Clusters
Stretch Cluster |
Cluster of Clusters |
|
---|---|---|
Advantages |
|
|
Disadvantages |
|
|
Other Considerations |
|
|