Failover clusters provide high availability for server roles and applications by transferring resources from a failed node to another node. The cluster service takes resources offline on the failed node in a defined order and brings them online on the new node. Quorum determines whether the cluster has enough active votes, cast by nodes and by witnesses such as disks or file shares, to continue running; this prevents split-brain scenarios in which multiple parts of the cluster each believe they are active. The quorum modes differ in whether votes come from nodes only, from nodes and a disk, from nodes and a file share, or from a disk only.
2. Agenda What are failover clusters and transfer? Failover cluster components. A failover attempt... Quorum. Types of quorum modes.
3. Failover clusters and transfer Failover clusters provide a high-availability solution for server roles and applications. By implementing failover clusters, you can maintain application or service availability if one or more computers in the failover cluster fail. Failover transfers the responsibility for providing access to resources within a cluster from one node to another. Failover can be planned, when an administrator intentionally moves resources to another node for maintenance or other reasons, or unplanned, when one node goes down because of hardware failure or other reasons.
4. A failover attempt... The cluster service takes all the resources in the instance offline in an order that is determined by the instance's dependency hierarchy. After all the resources are offline, the cluster service attempts to transfer the instance to the node that is listed next on the instance's list of preferred owners. If the cluster service successfully moves the instance to another node, it attempts to bring all the resources online. This time it starts at the bottom of the dependency hierarchy. Failover is complete when all of the resources are online on the new node. Once the offline node becomes active again, the cluster service can fail back the instances that it originally hosted.
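The sequence on this slide can be sketched as a small simulation. This is a minimal illustration, not the cluster service's actual logic: the function names, the resource names, and the data layout are hypothetical. It also assumes that resources go offline in the reverse of the online order and that the search for the next preferred owner wraps around the list, neither of which the slide states.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def next_owner(preferred_owners, failed_node, active_nodes):
    """Return the first active node listed after failed_node, or None.

    Assumption: the search wraps around to the start of the list.
    """
    i = preferred_owners.index(failed_node)
    rotated = preferred_owners[i + 1:] + preferred_owners[:i]
    for node in rotated:
        if node in active_nodes:
            return node
    return None


def plan_failover(dependencies, preferred_owners, failed_node, active_nodes):
    """Plan one failover attempt for a single instance.

    dependencies maps each resource to the set of resources it depends on.
    """
    # static_order() yields every resource after the resources it depends on,
    # i.e. it starts at the bottom of the dependency hierarchy.
    online_order = list(TopologicalSorter(dependencies).static_order())
    # Assumption: going offline runs the other way, dependents first.
    offline_order = online_order[::-1]
    target = next_owner(preferred_owners, failed_node, active_nodes)
    return {
        "target": target,
        "offline_order": offline_order,
        "online_order": online_order,
    }


# Hypothetical instance: an application that needs a network name and a disk;
# the network name in turn needs an IP address.
deps = {"ip": set(), "disk": set(), "name": {"ip"}, "app": {"name", "disk"}}
plan = plan_failover(deps, ["node1", "node2", "node3"], "node1", {"node2", "node3"})
print(plan["target"])         # node the instance moves to
print(plan["offline_order"])  # order in which resources are taken offline
print(plan["online_order"])   # order in which resources are brought online
```

If no preferred owner is active, `next_owner` returns None and the sketch leaves the instance without a target; what a real cluster does in that case is outside the scope of this slide.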
5. Quorum The quorum for a cluster is the number of elements that must be online for that cluster to continue running. In effect, each element (i.e., a node, a disk witness, or a file share witness) can cast one vote to determine whether the cluster continues running. Quorum prevents two sets of nodes from operating simultaneously as the failover cluster. Without a quorum mechanism, each set of nodes could continue to operate as a failover cluster, resulting in a partition within the cluster. To prevent problems caused by a split in the cluster, failover clusters use a voting algorithm to determine whether the cluster has enough votes to maintain quorum.
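The voting rule can be illustrated with a few lines of arithmetic. This is a simplified sketch that assumes one vote per element and a strict majority; the function name `has_quorum` is hypothetical.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """True when strictly more than half of all votes are online."""
    return 2 * votes_online > total_votes


# A five-node cluster splits into partitions of three and two nodes.
# Only the larger partition keeps quorum, so only one side keeps running.
print(has_quorum(3, 5))
print(has_quorum(2, 5))

# A four-node cluster splits evenly: neither half has a majority.
print(has_quorum(2, 4))
```

Because two disjoint partitions can never both hold more than half of the votes, at most one of them continues to operate as the cluster.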
6. Types of quorum modes
In Node Majority mode, only the nodes in the cluster have a vote. Quorum is maintained when more than half of the nodes are online.
In Node and Disk Majority mode, the nodes in the cluster and a disk witness have a vote. Quorum is maintained when more than half of the votes are online.
In Node and File Share Majority mode, the nodes in the cluster and a file share witness have a vote. Quorum is maintained when more than half of the votes are online.
In No Majority: Disk Only mode, only the quorum shared disk has a vote. Quorum is maintained when the shared disk is online.
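The four modes differ only in who votes, which the following sketch makes explicit. It is an illustration under simplifying assumptions: the mode strings and the function name are hypothetical, every voter has exactly one vote, and the witness is treated as simply online or offline.

```python
def cluster_has_quorum(mode, nodes_online, total_nodes, witness_online=False):
    """Evaluate quorum for one of the four modes described on this slide."""
    if mode == "node_majority":
        # Only nodes vote.
        return 2 * nodes_online > total_nodes
    if mode in ("node_and_disk_majority", "node_and_file_share_majority"):
        # Nodes plus one witness vote.
        votes_online = nodes_online + (1 if witness_online else 0)
        return 2 * votes_online > total_nodes + 1
    if mode == "no_majority_disk_only":
        # Only the quorum shared disk votes.
        return witness_online
    raise ValueError(f"unknown quorum mode: {mode}")


# Four nodes, two of them online.
print(cluster_has_quorum("node_majority", 2, 4))
print(cluster_has_quorum("node_and_disk_majority", 2, 4, witness_online=True))
print(cluster_has_quorum("node_and_file_share_majority", 2, 4, witness_online=False))
print(cluster_has_quorum("no_majority_disk_only", 1, 4, witness_online=True))
```

With an even number of nodes, the witness acts as a tie-breaker: two of four nodes alone do not reach a majority, but two nodes plus the witness hold three of five votes.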
7. Reference web sites For the Failover Cluster Step-by-Step Guide, see http://go.microsoft.com/fwlink/LinkID=178026&clcid=0x409.