---
publish: true
description: "High Availability (HA) and Consensus: Understanding Failover, Quorum, and the terror of the Split-Brain."
image: "assets/arch_ha_governor.png"
---
# 7.5 The Governor (High Availability)

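Horizontal scaling solves for **Throughput** (Read Replicas) and **Management** (Partitioning). But it does not solve for **Durability of Service**: if the single Primary dies, writes stop until something promotes a replacement. That something is the Governor, the HA layer that detects a dead Primary and runs the failover.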
## The Election (Automatic Failover)
In a high-availability cluster, the Primary node continuously sends a **Heartbeat** to its replicas and to a distributed consensus store (such as `etcd` or `Consul`).
If the primary stops beating:
1. The replicas and the Governor perceive the silence.
2. An **Election** is held among the remaining healthy replicas.
3. The replica with the most recent data (the highest **LSN**) is promoted to be the new Primary.
4. The network traffic—managed by a **VIP (Virtual IP)** or a **Load Balancer**—is rerouted to the new leader.
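
Step 3 above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular HA tool implements its election: the `Replica` class, the node names, and the `elect_primary` function are all hypothetical. The one Postgres-specific fact it relies on is the LSN text format, two hexadecimal numbers separated by a slash, which together form a 64-bit WAL position.

```python
from dataclasses import dataclass
from typing import Optional


def parse_lsn(text: str) -> int:
    """Convert a Postgres LSN such as '16/B374D848' into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass(frozen=True)
class Replica:
    name: str          # hypothetical node name
    healthy: bool      # did the node answer its last health check?
    replayed_lsn: str  # last WAL position the node has applied


def elect_primary(replicas: list[Replica]) -> Optional[Replica]:
    """Pick the healthy replica with the highest LSN, or None if none is healthy."""
    candidates = [r for r in replicas if r.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda r: parse_lsn(r.replayed_lsn))


replicas = [
    Replica("pg-b", healthy=True, replayed_lsn="0/3000060"),
    Replica("pg-c", healthy=True, replayed_lsn="0/3000148"),
    Replica("pg-d", healthy=False, replayed_lsn="0/4000000"),  # ahead, but unreachable
]
winner = elect_primary(replicas)
print(winner.name)  # pg-c
```

Note that LSNs are compared as numbers, not as strings: `"0/A"` sorts after `"0/10"` alphabetically even though it is the earlier position.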
## The Terror of the Split-Brain
The most dangerous scenario in a distributed database is a **Split-Brain**. This happens when a network "partition" splits the cluster into two halves, and each half believes it contains the legitimate leader.
If both nodes start accepting writes, your data diverges. You now have two different versions of the truth, and they cannot be merged without manual, excruciating intervention.
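
To see why diverged data resists merging, consider a toy model in which each side of the partition keeps a simple list of the writes it accepted. The statements and the `split_histories` helper are invented for illustration; real Postgres nodes diverge in their WAL, not in a Python list.

```python
def split_histories(
    log_a: list[str], log_b: list[str]
) -> tuple[list[str], list[str], list[str]]:
    """Return (shared prefix, writes seen only on A, writes seen only on B)."""
    shared = 0
    for a, b in zip(log_a, log_b):
        if a != b:
            break
        shared += 1
    return log_a[:shared], log_a[shared:], log_b[shared:]


# Both sides agree on everything written before the partition.
before_partition = ["INSERT order 1", "INSERT order 2"]

# During the partition, each side accepts different writes.
side_a = before_partition + ["UPDATE order 2 SET status = 'shipped'"]
side_b = before_partition + ["DELETE order 2", "INSERT order 3"]

common, only_a, only_b = split_histories(side_a, side_b)
print(common)  # the history both sides still agree on
print(only_a)  # side A shipped order 2 ...
print(only_b)  # ... while side B deleted it
```

Neither side's history is a continuation of the other's. One side shipped an order that the other side deleted, and no automatic rule can say which of the two is correct.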
> [!CAUTION]
> **Fencing the Old King**: To prevent split-brain, modern HA tools (like `Patroni`) use **Fencing**. Before a new Primary is promoted, the cluster must ensure the old Primary is truly "dead" or isolated from the network. This is sometimes achieved via **STONITH** (Shoot The Other Node In The Head), an aggressive but necessary protocol to ensure only one leader exists.
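
A gentler relative of STONITH is a time-limited leader lease: the Primary must keep renewing a key in the consensus store, and nobody else may be promoted until that key expires. The sketch below models this with an in-memory class. `LeaderLock`, its methods, and the 30-second TTL are assumptions made for illustration, not the API of `etcd`, `Consul`, or `Patroni`, and the clock is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str
    expires_at: float


class LeaderLock:
    """In-memory stand-in for a leader key in a consensus store (hypothetical)."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.lease: Optional[Lease] = None

    def acquire_or_renew(self, node: str, now: float) -> bool:
        """Grant the lease if it is free, expired, or already held by `node`."""
        current = self.lease
        if current is None or current.expires_at <= now or current.holder == node:
            self.lease = Lease(holder=node, expires_at=now + self.ttl)
            return True
        return False

    def may_accept_writes(self, node: str, now: float) -> bool:
        """A node may act as Primary only while it holds an unexpired lease."""
        current = self.lease
        return (
            current is not None
            and current.holder == node
            and current.expires_at > now
        )


lock = LeaderLock(ttl=30.0)
print(lock.acquire_or_renew("pg-a", now=0.0))    # True: pg-a becomes Primary
print(lock.acquire_or_renew("pg-b", now=10.0))   # False: lease still valid
# pg-a is cut off from the store and can no longer renew its lease.
print(lock.may_accept_writes("pg-a", now=31.0))  # False: the old king must stop
print(lock.acquire_or_renew("pg-b", now=31.0))   # True: only now is promotion safe
```

In a real partition the old Primary cannot ask the store whether its lease is still valid, so it has to enforce the same deadline itself, for example with a local timer or a hardware watchdog. The lease only works if the old leader stops before the new one starts.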
## Quorum: The Rule of Three
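To hold an election that cannot end in a tie, use an odd number of voting nodes (typically 3). A **Quorum** is a strict majority of the cluster: 2 of 3, or 3 of 5. Because the two sides of a partition cannot both hold a majority, at most one side can elect a leader. If a node cannot see a majority of the cluster, it must assume it is the one isolated and voluntarily step down from leadership.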
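
The arithmetic behind the rule is short enough to write out. The function names below are made up for this sketch, and `reachable` counts the node itself plus every peer it can currently see.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest group that is a strict majority of the cluster."""
    return cluster_size // 2 + 1


def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if this node (counting itself) can see a majority of the cluster."""
    return reachable >= quorum_size(cluster_size)


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can be lost while a majority still exists."""
    return cluster_size - quorum_size(cluster_size)


for n in (2, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2

# A 3-node cluster splits 2 / 1: only the larger side may keep a leader.
print(has_quorum(reachable=2, cluster_size=3))  # True
print(has_quorum(reachable=1, cluster_size=3))  # False: step down
```

The table shows why even cluster sizes are poor value: a fourth node raises the quorum to 3 but still tolerates only one failure, and a 2 / 2 split leaves neither side with a majority.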
## Summary: The High Availability Stack
A professional HA setup requires several layers of coordination:
| Component | Technical Example | Role |
| :--- | :--- | :--- |
| **The Watcher** | `Patroni` | Monitors the local Postgres process and handles failover logic. |
| **The Consensus** | `etcd` / `Consul` | The source of truth for "Who is currently the Primary?" |
| **The Routing** | `HAProxy` / `PgBouncer` | Reroutes application traffic to the current leader. |
By implementing a Governor, you move from a brittle system dependent on a single physical server to a resilient, self-healing organism that can survive even the total loss of a data center, provided its nodes are spread across more than one.
---
| ← Previous | ↑ TOC | Next → |
| :--- | :---: | ---: |
| [[Chapter 7/7.4 - The Crowded Hallway (Connection Pooling)\|7.4 Connection Pooling]] | [[Learn You a Postgres for Great Good\|Home]] | [[Chapter 8/8.0 - The Bouncers and the VIP List (Access Control)\|Chapter 8 - Access Control]] |