
  • Red Hat High Availability Clustering: A Technical Guide to Fault Tolerance & Data Consistency


    When critical workloads can’t afford downtime, Red Hat High Availability Clusters step in to keep services running, ensure data stays consistent, and eliminate single points of failure. Built on the solid foundation of the High Availability Add-On, these clusters use a mix of resource orchestration, fault detection, and fencing mechanisms to deliver enterprise-grade uptime.

    Whether you’re a Linux engineer, system architect, or platform owner evaluating RHEL clustering, this deep dive walks you through its architecture, components, and strategies for maintaining availability and integrity.


    🔧 What Makes a Cluster “Highly Available”?

    At the heart of RHEL HA is the High Availability Add-On, which transforms a group of RHEL systems (called nodes) into a cohesive cluster. This cluster continuously monitors each member, takes over services when failures occur, and ensures clients never know something went wrong.

    Clusters built with the High Availability Add-On:

    • Avoid single points of failure
    • Automatically failover services
    • Maintain data integrity during transitions

    Key tools in the stack include:

    • Pacemaker: The brain of the cluster that manages resources
    • Corosync: Handles messaging, quorum, and membership
    • STONITH (Fencing): Ensures failed nodes are completely cut off
    • GFS2 and lvmlockd: Enable active-active shared storage access

    🧠 Core Components of RHEL High Availability

    1. Pacemaker: Resource Management Engine

    Pacemaker is the cluster’s resource orchestrator, comprising several daemons:

    • CIB (Cluster Information Base): Holds configuration and status in XML, synced across all nodes
    • CRMd (Cluster Resource Management daemon): Coordinates cluster-wide actions like start/stop/move for resources
    • LRMd (Local Resource Management daemon): Interfaces with local resource agents to execute actions and report state
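    On a running cluster, you can watch these daemons at work with pcs and crm_mon. A minimal sketch (assumes the pcs package is installed and the cluster is up):

    ```shell
    # Dump the live CIB as XML (both the configuration and status sections)
    pcs cluster cib > cib.xml

    # Show the configuration in pcs's human-readable form
    pcs config

    # One-shot snapshot of what CRMd/LRMd are doing right now
    crm_mon --one-shot
    ```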

    2. Corosync: Messaging Backbone

    Corosync ensures all nodes talk to each other reliably. It manages:

    • Membership and quorum determination
    • Messaging and state sync via kronosnet
    • Redundant links and failover networking
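    Redundant kronosnet links are configured by giving each node more than one address at cluster creation time. A sketch using RHEL 8+ pcs syntax with hypothetical node names and subnets (10.0.1.x for link 0, 10.0.2.x for link 1):

    ```shell
    # Authenticate the nodes to each other, then build a two-node
    # cluster with two knet links per node for redundant messaging.
    pcs host auth node1 node2
    pcs cluster setup mycluster \
        node1 addr=10.0.1.1 addr=10.0.2.1 \
        node2 addr=10.0.1.2 addr=10.0.2.2
    pcs cluster start --all
    ```

    If one link's network fails, Corosync keeps the cluster membership alive over the surviving link.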

    3. Fencing (STONITH): Last Line of Defense

    If a node stops responding, how do you guarantee it won’t corrupt data? Enter fencing.

    • STONITH (“Shoot The Other Node In The Head”) cuts power or access to failed nodes
    • Prevents dual writes and split-brain scenarios
    • Required (stonith-enabled=true) for production clusters

    Examples:

    • Redundant power fencing powers off both supplies of a dual-PSU node, so it cannot keep running on a surviving feed
    • Use fencing delays (pcmk_delay_base, priority-fencing-delay) to avoid race conditions
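    The pieces above can be wired together with pcs. A minimal sketch, assuming a node with an IPMI management interface at a hypothetical address and credentials:

    ```shell
    # IPMI-based fence device for node1; pcmk_delay_base staggers this
    # device's fencing so two nodes don't shoot each other simultaneously.
    pcs stonith create fence-node1 fence_ipmilan \
        ip=10.0.0.101 username=admin password=secret lanplus=1 \
        pcmk_host_list=node1 pcmk_delay_base=5s

    # Fencing must stay enabled in production
    pcs property set stonith-enabled=true
    ```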

    🧩 Ensuring Quorum and Preventing Split-Brain

    A cluster needs quorum (majority vote) to make decisions. Without it, Pacemaker halts all resources to protect data.

    • The votequorum service tracks voting nodes
    • no-quorum-policy controls what happens when quorum is lost:
      • stop (default): Stops all resources in the partition
      • freeze: Keeps existing resources running but starts nothing new; the right choice for GFS2, which needs quorum to unmount cleanly
    • Quorum devices (a network-based qdevice/qnetd arbiter) help even-node clusters survive more failures
    • Tie-breaking algorithms: ffsplit, lms
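    For a two-node cluster, the quorum pieces might be configured like this (qnetd.example.com is a hypothetical arbiter host outside the cluster):

    ```shell
    # Point the cluster at a qnetd arbiter so a 2-node cluster can break
    # ties; ffsplit favors the partition best placed to run services.
    pcs quorum device add model net host=qnetd.example.com algorithm=ffsplit

    # For GFS2 workloads, freeze rather than stop when quorum is lost
    pcs property set no-quorum-policy=freeze

    # Verify votes, expected votes, and qdevice state
    pcs quorum status
    ```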

    💾 Storage Strategies for Data Consistency

    1. Shared Storage

    Failover only works if the new node can access the same data. Supported storage options include:

    • iSCSI
    • Fibre Channel
    • Shared block devices

    2. LVM in Clusters

    • HA-LVM: Active/passive, single-node access at a time
    • lvmlockd: Enables active/active access, works with GFS2

    3. GFS2: The Cluster File System

    • Allows simultaneous read/write access to shared block storage from multiple nodes
    • Requires Pacemaker, Corosync, DLM, and lvmlockd
    • Supports encrypted file systems (RHEL 8.4+)
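    Bringing up GFS2 means cloning DLM and lvmlockd across all nodes, ordering them, and then formatting the shared LV. A sketch, reusing the hypothetical mycluster/vg_cluster names from above:

    ```shell
    # DLM and lvmlockd must run on every node, so both are cloned
    pcs resource create dlm ocf:pacemaker:controld \
        op monitor interval=30s on-fail=fence clone interleave=true ordered=true
    pcs resource create lvmlockd ocf:heartbeat:lvmlockd \
        op monitor interval=30s on-fail=fence clone interleave=true ordered=true

    # lvmlockd needs DLM up first, and on the same node
    pcs constraint order start dlm-clone then lvmlockd-clone
    pcs constraint colocation add lvmlockd-clone with dlm-clone

    # Format: -t <clustername>:<fsname>, one journal (-j) per mounting node
    mkfs.gfs2 -p lock_dlm -t mycluster:data -j 2 /dev/vg_cluster/lv_data
    ```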

    ⚙️ Resource Management Tactics

    Resources in Pacemaker are abstracted via agents. They can be grouped, ordered, colocated, and monitored with high precision.

    Key controls:

    • Groups: Start in order, stop in reverse
    • Constraints:
      • Location (where)
      • Ordering (when)
      • Colocation (with whom)
    • Health checks: Automatic monitoring with customizable failure policies
    • migration-threshold: Move resource after N failures
    • start-failure-is-fatal: A failed start immediately bans the resource from that node
    • multiple-active: What to do if resource runs on >1 node
    • shutdown-lock: Keeps resources assigned to a cleanly shut-down node so planned maintenance doesn't trigger failover
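    These controls compose naturally. A sketch of a simple web service group, with hypothetical addresses and node names:

    ```shell
    # A group: vip starts before web, and they stop in reverse order
    pcs resource create vip ocf:heartbeat:IPaddr2 \
        ip=192.168.1.100 cidr_netmask=24 op monitor interval=10s
    pcs resource create web ocf:heartbeat:apache \
        op monitor interval=20s
    pcs resource group add webgroup vip web

    # Location constraint: prefer node1 when it is available
    pcs constraint location webgroup prefers node1=100

    # Move the group elsewhere after 3 failures on a node
    pcs resource meta webgroup migration-threshold=3
    ```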

    🌐 Multi-Site Clustering & Remote Nodes

    1. Booth Ticket Manager

    Arbitrates resource ownership across geo-distributed clusters to prevent split-brain between sites: a site may run ticket-protected resources only while it holds the corresponding Booth ticket.
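    A sketch of the Booth workflow, assuming two hypothetical site addresses and a third-site arbitrator:

    ```shell
    # Define the topology: two sites plus a tie-breaking arbitrator
    pcs booth setup sites 203.0.113.1 198.51.100.1 arbitrators 192.0.2.1

    # Create a ticket and the booth daemon resources (with a floating IP)
    pcs booth ticket add ticket-web
    pcs booth create ip 203.0.113.1

    # Grant the ticket to this site; it now owns the protected resources
    pcs booth ticket grant ticket-web
    ```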

    2. pacemaker_remote

    Lets you add nodes that don’t run Corosync (e.g., VMs) into your cluster:

    • Extend cluster size beyond 32 nodes
    • Useful for managing cloud VMs or containers
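    Attaching a remote node is a two-step pcs operation. A sketch with a hypothetical VM hostname (the VM must be running the pacemaker_remote service):

    ```shell
    # Authenticate and attach a node that runs pacemaker_remote, not Corosync
    pcs host auth vm1.example.com
    pcs cluster node add-remote vm1.example.com

    # The remote node can now host resources like any full member
    pcs resource create db ocf:heartbeat:pgsql op monitor interval=30s
    pcs constraint location db prefers vm1.example.com
    ```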

    🛠️ Configuration Tools

    Red Hat provides two main tools to manage the cluster:

    • pcs (CLI)
    • pcsd (Web UI)

    Tasks made simple:

    • Cluster creation
    • Adding/removing nodes
    • Config changes (live)
    • Viewing status and logs
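    The whole lifecycle fits in a handful of pcs commands. A sketch with hypothetical node names, using RHEL 8+ syntax:

    ```shell
    # Authenticate nodes as the hacluster user, then create the cluster
    pcs host auth node1 node2 -u hacluster
    pcs cluster setup mycluster node1 node2

    # Start now, and start automatically on boot
    pcs cluster start --all
    pcs cluster enable --all

    # Overall health: nodes, resources, fencing devices
    pcs status
    ```

    The same changes can be made live from the pcsd Web UI, which listens on port 2224 by default.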

    ✅ Summary: Why RHEL HA Matters

    If your workloads can’t go down—and your data can’t risk corruption—RHEL HA offers:

    • Mature, enterprise-tested components
    • Consistent handling of failovers and fencing
    • Flexibility for active/active and geo-distributed clusters
    • Integrated tooling for automation and visibility

    Start with two nodes. Plan your fencing. Decide quorum policies. Add shared storage. Then scale.

    When uptime matters, the RHEL High Availability Add-On delivers.


    Have questions or want a deeper walkthrough? Contact us at OmOps or explore more Linux and infrastructure insights on our blog.