# Learn You a Postgres for Great Good!
<img src="assets/lyapfgg.png" width="250" style="float: left; margin: 0 20px 20px 0;" />
What actually happens when you run `SELECT`? Where does the row live on disk? Why does `UPDATE` not update in place? What is Postgres doing when your query is "slow" and `top` shows nothing?
You already know how to write SQL. You can `JOIN` three tables, add an index when something is slow, and copy a connection string from a README. That is enough to build things — but it is not enough to understand why they break. The answers live in the architecture of Postgres itself — and once you see them, every tuning decision, every `EXPLAIN` plan, and every mysterious production incident suddenly makes sense.
**Learn You a Postgres for Great Good!** is a narrative-driven tour of PostgreSQL internals, inspired by *Learn You a Haskell for Great Good!* by Miran Lipovača. It uses a single, unified data model — the **Elephant Cafe** — to ground every concept in working SQL you can run yourself.
## Who This Book Is For
You do not need to know C, operating systems theory, or database internals to start. We will build up from the physical layer (bytes on disk) to the logical layer (query plans, transactions, isolation) and show you exactly where each concept lives in the PostgreSQL database engine.
## The Lazy Elephant
To understand the architecture, you need to accept a single organizing principle: **Postgres is profoundly, deliberately lazy.**
Not lazy in the pejorative sense. Lazy in the way a physicist is lazy — every design decision in Postgres is an optimization to **avoid unnecessary work**. Once you internalize what the system considers "expensive," the entire architecture clicks into place:
- **Deleting a row?** Postgres doesn't actually remove it from disk. It crosses out the old version ([[Manuscript/02 - Physical Storage & MVCC/2.5 - MVCC (The Sharpie Ledger)|MVCC]]) and leaves the cleanup to a background process. Removing data in place would require locking readers out — too expensive.
- **Finding a specific row?** Postgres typically refuses to read the whole table if it can avoid it. It maintains pre-sorted shortcut structures ([[Manuscript/03 - Access Paths & Indexing/3.0 - Indexes (The Mighty Indexes)|Indexes]]) so it can jump directly to the answer. Scanning everything is the strategy of last resort.
- **Ensuring crash safety?** Rather than writing data to its final location immediately, Postgres first records a sequential log of what it *intends* to do. This allows the engine to be fast while remaining indestructible—sorting the data into its proper home later, on its own schedule.
- **Answering a complex query?** Postgres often doesn't just execute your SQL in the order you wrote it. It evaluates dozens of possible execution strategies and selects the cheapest one.
The pattern is remarkably consistent: defer work, batch work, or avoid work entirely. By understanding what Postgres considers "expensive" (random disk reads, lock contention, memory pressure), these architectural quirks become rational, predictable consequences.
## What You Will Learn
Every concept is illustrated with SQL you can run against the book's sample database (`docker compose up -d` and you're connected), and every `EXPLAIN` output in the book was generated against real data.
The book follows the data from the bottom of the stack to the top:
1. **How data is physically stored** — bytes, tuples, 8KB pages, and why Postgres has an opinion about all of them (Chapters 1–2)
2. **How data is found efficiently** — B-Trees, GIN, BRIN, vector indexes, and the write-amplification trade-off (Chapter 3)
3. **How queries are planned and executed** — the optimizer, scan types, join strategies, and how to read `EXPLAIN` output (Chapter 4)
4. **How data survives crashes** — the Write-Ahead Log, transactions, isolation levels, and the mechanics of XID wraparound (Chapter 5)
5. **How Postgres manages resources** — shared buffers, work memory, autovacuum, and why bloat happens (Chapter 6)
6. **Why queries get slow under concurrency** — wait events, locks, latches, and the "sweat vs. wait" diagnostic framework (Chapter 7)
7. **How Postgres scales beyond one server** — replication, partitioning, cloud storage, and connection pooling (Chapter 8)
8. **How access control actually works** — roles, privileges, row-level security, and debugging "Permission Denied" (Chapter 9)
## Table of Contents
### [[Manuscript/01 - Foundations & Data Modeling/1.0 - Relations & Normalization (The Cafe Layout)|Chapter 1 - Foundations & Data Modeling]]
> **The Purpose**: To establish the "Elephant Cafe" world, define the sample data model, and introduce SQL as the primary language of the engine.
> **The Takeaway**: Data is a physical asset that must be guarded. Normalization is not about academic purity; it is about ensuring that the engine never has to look in two places for the same truth.
- [[Manuscript/01 - Foundations & Data Modeling/1.0 - Relations & Normalization (The Cafe Layout)|1.0 Relations & Normalization (The Cafe Layout)]]
- [[Manuscript/01 - Foundations & Data Modeling/1.1 - SQL (The Waiter's Pad)|1.1 SQL (The Waiter's Pad)]]
- [[Manuscript/01 - Foundations & Data Modeling/1.2 - Relational Model (An academic detour)|1.2 Relational Model (An academic detour)]]
### [[Manuscript/02 - Physical Storage & MVCC/2.0 - Storage Foundations (The Building Blocks of Storage)|Chapter 2 - Physical Storage & MVCC]]
> **The Purpose**: To ground the reader in the physical reality of storage, moving from bit-level data types up to the 8KB containers that hold our world.
> **The Takeaway**: A "Row" is a logical abstraction; a "Tuple" is the physical suitcase. Postgres lives in 8KB blocks and uses MVCC to ensure that readers and writers never have to wait for each other.
- [[Manuscript/02 - Physical Storage & MVCC/2.1 - Data Types (Knicks, knacks, bits, and bobs)|2.1 Data Types (Knicks, knacks, bits, and bobs)]]
- [[Manuscript/02 - Physical Storage & MVCC/2.2 - Tuple (The Suitcase)|2.2 Tuple (The Suitcase)]]
- [[Manuscript/02 - Physical Storage & MVCC/2.2.1 - Visibility & System Columns|2.2.1 Visibility & System Columns]]
- [[Manuscript/02 - Physical Storage & MVCC/2.2.2 - Storage Optimizations|2.2.2 Storage Optimizations]]
- [[Manuscript/02 - Physical Storage & MVCC/2.3 - The Page (The Shipping Container)|2.3 The Page (The Shipping Container)]]
- [[Manuscript/02 - Physical Storage & MVCC/2.4 - Relation (The Table)|2.4 Relation (The Table)]]
- [[Manuscript/02 - Physical Storage & MVCC/2.5 - MVCC (The Sharpie Ledger)|2.5 MVCC (The Sharpie Ledger)]]
- [[Manuscript/02 - Physical Storage & MVCC/2.6 - TOAST (The Separate Trailer)|2.6 TOAST (The Separate Trailer)]]
- [[Structures/_Structures|Technical Reference: Structures]]
- [[Architecture/MVCC|Technical Reference: Architecture]]
### [[Manuscript/03 - Access Paths & Indexing/3.0 - Indexes (The Mighty Indexes)|Chapter 3 - Access Paths & Indexing]]
> **The Purpose**: To move from "Scanning Everything" to "Following Shortcuts," providing a comprehensive guide to indexing strategies and their hidden costs.
> **The Takeaway**: Performance is the art of avoiding the disk. An index is a specialized cheat sheet that trades expensive write-time labor for lightning-fast read-time results.
- [[Manuscript/03 - Access Paths & Indexing/3.1 - B-Tree (The Balanced Bookshelf)|3.1 B-Tree (The Balanced Bookshelf)]]
- [[Manuscript/03 - Access Paths & Indexing/3.2 - GIN & GiST (The Word Scavenger)|3.2 GIN & GiST (The Word Scavenger)]]
- [[Manuscript/03 - Access Paths & Indexing/3.3 - BRIN (The Industrial Label)|3.3 BRIN (The Industrial Label)]]
- [[Manuscript/03 - Access Paths & Indexing/3.4 - HNSW & IVFFlat (The Similarity Map)|3.4 HNSW & IVFFlat (The Similarity Map) — Vector Search]]
- [[Manuscript/03 - Access Paths & Indexing/3.5 - Constraints & Triggers (The Integrity Layer and the Chain Reaction)|3.5 Constraints & Triggers (The Integrity Layer and the Chain Reaction)]]
- [[Manuscript/03 - Access Paths & Indexing/3.6 - Index Maintenance (The Cost of Fame)|3.6 Index Maintenance (The Cost of Fame)]]
- [[Structures/Index|Technical Reference: Index Architecture]]
### [[Manuscript/04 - Query Planning & Execution/4.0 - Query Planning & Operations (The Strategy of Execution)|Chapter 4 - Query Planning & Execution]]
> **The Purpose**: To explain how a declarative riddle is transformed into an execution plan, highlighting the calculus of the Query Planner.
> **The Takeaway**: The Planner is a cost-based optimizer. It selects the most efficient execution strategy based on a rigorous model of resource costs, often rearranging operations to avoid unnecessary work.
- [[Manuscript/04 - Query Planning & Execution/4.1 - Query Planner (The Blueprint of Execution)|4.1 Query Planner]]
- [[Manuscript/04 - Query Planning & Execution/4.2 - Query Algebra (The Execution Tree)|4.2 Query Algebra]]
- [[Manuscript/04 - Query Planning & Execution/4.3 - Scans (The Full Table Walk)|4.3 Scans]]
- [[Manuscript/04 - Query Planning & Execution/4.4 - Joins (The Pairing Dance)|4.4 Joins]]
- [[Manuscript/04 - Query Planning & Execution/4.5 - Aggregations (The Running Receipt)|4.5 Aggregations]]
- [[Manuscript/04 - Query Planning & Execution/4.6 - Memory Operations (Sort Hash and Spill)|4.6 Memory Operations]]
- [[Manuscript/04 - Query Planning & Execution/4.7 - Mutation Path (The Write Pipeline)|4.7 Mutation Path]]
- [[Manuscript/04 - Query Planning & Execution/4.8 - Parallel & Distributed (The Worker Pool)|4.8 Parallel & Distributed]]
- [[Manuscript/04 - Query Planning & Execution/4.9 - Common Table Expressions (The Temporary Station)|4.9 Common Table Expressions]]
- [[Manuscript/04 - Query Planning & Execution/4.10 - Sargability (The Art of Not Opening Every Box)|4.10 Sargability]]
### [[Manuscript/05 - Durability & Transactions/5.0 - Write-Ahead Log (Safety Without Sweating)|Chapter 5 - Durability & Transactions]]
> **The Purpose**: To explain how Postgres survives sudden death — power failures and kernel panics — without losing a byte of confirmed work.
> **The Takeaway**: Safety in Postgres is dual-layered: the Write-Ahead Log gives you physical durability, and Transactions give you logical all-or-nothing semantics. Together they let you trust the engine with the kind of data you can't afford to apologize for.
- [[Manuscript/05 - Durability & Transactions/5.1 - WAL & fsync (The Durable Ledger)|5.1 WAL & fsync (The Durable Ledger)]]
- [[Manuscript/05 - Durability & Transactions/5.2 - Crash Recovery (The Recovery Parade)|5.2 Crash Recovery (The Recovery Parade)]]
- [[Manuscript/05 - Durability & Transactions/5.3 - Logical Replication (The Relay Chain)|5.3 Logical Replication (The Relay Chain)]]
- [[Manuscript/05 - Durability & Transactions/5.4 - Transactions (The Atomic Seal)|5.4 Transactions (The Atomic Seal)]]
- [[Manuscript/05 - Durability & Transactions/5.5 - Isolation (The Looking Glass Windows)|5.5 Isolation (The Looking Glass Windows)]]
- [[Manuscript/05 - Durability & Transactions/5.6 - Transaction ID Wraparound (The Infinite Calendar)|5.6 Transaction ID Wraparound (The Infinite Calendar)]]
- [[Architecture/WAL|Technical Reference: WAL Anatomy & LSNs]]
- [[Architecture/Transactions|Technical Reference: Transactions]]
### [[Manuscript/06 - Resource Management & Processes/6.0 - Memory & Disk (The Hierarchy of Inertia)|Chapter 6 - Resource Management & Processes]]
> **The Purpose**: To explain how Postgres consumes physical resources (RAM, CPU, Disk) and how to decide which process gets fed first when contention is inevitable.
> **The Takeaway**: Postgres can't make hardware faster — it can only decide what gets to live in which layer of memory. Tuning is the art of choosing what's worth keeping warm and what can afford to wait on the disk.
- [[Manuscript/06 - Resource Management & Processes/6.1 - Process Family (The Process Family)|6.1 Process Family (The Process Family)]]
- [[Manuscript/06 - Resource Management & Processes/6.1.1 - Connection Mechanics (Long running connections and sessions)|6.1.1 Connection Mechanics]]
- [[Manuscript/06 - Resource Management & Processes/6.2 - Shared Buffers (The Page Cache)|6.2 Shared Buffers (The Page Cache)]]
- [[Manuscript/06 - Resource Management & Processes/6.3 - Work Mem (Private Working Memory)|6.3 Work Mem (Private Working Memory)]]
- [[Manuscript/06 - Resource Management & Processes/6.4 - Vacuum & Freezing (The Housekeepers)|6.4 Vacuum & Freezing (The Housekeepers)]]
- [[Manuscript/06 - Resource Management & Processes/6.5 - Tuple Bloat (Garbage Collection)|6.5 Tuple Bloat (Garbage Collection)]]
- [[Manuscript/06 - Resource Management & Processes/6.6 - RAM, CPU & Disk (The Physical Machine)|6.6 RAM, CPU & Disk (The Physical Machine)]]
- [[Resources/_Resources|Technical Reference: Resources Overview]]
### [[Manuscript/07 - Wait Events & Concurrency/7.0 - Why Slow Queries Lie (The Waiting Game)|Chapter 7 - Why Slow Queries Lie (The Waiting Game)]]
- **Part 1: The Diagnostic Model**
- [[Manuscript/07 - Wait Events & Concurrency/7.0 - Why Slow Queries Lie (The Waiting Game)|7.0 Why Slow Queries Lie (The Waiting Game)]]
- [[Manuscript/07 - Wait Events & Concurrency/7.1 - The Diagnostic Views (Sweat vs. Sigh)|7.1 The Diagnostic Views (Sweat vs. Sigh)]]
- [[Manuscript/07 - Wait Events & Concurrency/7.2 - The Triage Protocol (The Concurrency Matrix)|7.2 The Triage Protocol (The Concurrency Matrix)]]
- **Part 2: The Three Failure Classes**
- [[Manuscript/07 - Wait Events & Concurrency/7.3 - CPU Saturation (The Sweat)|7.3 CPU Saturation (The Sweat)]]
- [[Manuscript/07 - Wait Events & Concurrency/7.4 - Storage Latency (Physical IO Stalls)|7.4 Storage Latency (Physical IO Stalls)]]
- [[Manuscript/07 - Wait Events & Concurrency/7.5 - WAL Pressure (Durability Bottlenecks)|7.5 WAL Pressure (Durability Bottlenecks)]]
- [[Manuscript/07 - Wait Events & Concurrency/7.6 - Lock Contention (Concurrency & Blocking)|7.6 Lock Contention (Concurrency & Blocking)]]
- **Part 3: Human-Caused Stalls**
- [[Manuscript/07 - Wait Events & Concurrency/7.7 - Human-Caused Stalls (The Starvation)|7.7 Human-Caused Stalls (The Starvation)]]
- **Part 4: Advanced Coordination**
- [[Manuscript/07 - Wait Events & Concurrency/7.8 - Advanced Coordination - Background & Replication|7.8 Background & Replication]]
- [[Manuscript/07 - Wait Events & Concurrency/7.9 - Advanced Coordination - Parallelism|7.9 Parallelism]]
- [[Manuscript/07 - Wait Events & Concurrency/7.10 - Advanced Coordination - Infrastructure (The Master Keys)|7.10 Infrastructure & Control (The Master Keys)]]
- [[Workloads/_Workloads|Technical Reference: Workloads Overview]]
### [[Manuscript/08 - Distributed Scaling & Clouds/8.0 - Distributed Storage (The Elephant in the Clouds)|Chapter 8 - Distributed Scaling & Clouds]]
> **The Purpose**: To explain how Postgres scales beyond a single server, moving from a single instance to a distributed cluster in the clouds.
> **The Takeaway**: Scaling is the art of moving the bottleneck. Whether through Read Replicas or Partitioning, the goal is to manage the coordination cost of distributed state.
- [[Manuscript/08 - Distributed Scaling & Clouds/8.1 - Read Replicas (The Shouting Gallery)|8.1 Read Replicas (The Shouting Gallery)]]
- [[Manuscript/08 - Distributed Scaling & Clouds/8.1.1 - synchronous_commit (The Durability Gradient)|8.1.1 synchronous_commit (The Durability Gradient)]]
- [[Manuscript/08 - Distributed Scaling & Clouds/8.1.2 - xmin Horizon (Hot Standby Feedback)|8.1.2 xmin Horizon (Hot Standby Feedback)]]
- [[Manuscript/08 - Distributed Scaling & Clouds/8.2 - Cloud Storage (Compute and Storage Separation)|8.2 Cloud Storage (Compute and Storage Separation)]]
- [[Manuscript/08 - Distributed Scaling & Clouds/8.3 - Table Partitioning (Splitting the Table)|8.3 Table Partitioning (Splitting the Table)]]
- [[Manuscript/08 - Distributed Scaling & Clouds/8.4 - Connection Pooling (The Waiting Room)|8.4 Connection Pooling (The Waiting Room)]]
- [[Manuscript/08 - Distributed Scaling & Clouds/8.5 - High Availability (Failover and Consensus)|8.5 High Availability (Failover and Consensus)]]
- [[Architecture/Cluster|Technical Reference: Cluster Architecture]]
### [[Manuscript/09 - Identity & Access Control/9.0 - Access Control (The Bouncers and the VIP List)|Chapter 9 - Access Control]]
> **The Purpose**: To demystify the PostgreSQL security model (Roles, Privileges, ACLs, Row-Level Security) and map every "Permission Denied" message back to a concrete row in a system catalog.
> **The Takeaway**: Security in Postgres is not a separate product bolted on top — it is just rows in `pg_authid`, `pg_class.relacl`, and `pg_policy`. Once you know which catalog to read, the engine stops being mysterious and starts being auditable.
- [[Manuscript/09 - Identity & Access Control/9.1 - Roles & Privileges (The Name Tags)|9.1 Roles & Privileges (The Name Tags)]]
- [[Manuscript/09 - Identity & Access Control/9.2 - Default Privileges (The Manager's Orders)|9.2 Default Privileges (The Manager's Orders)]]
- [[Manuscript/09 - Identity & Access Control/9.3 - Row-Level Security (The VIP List)|9.3 Row-Level Security (The VIP List)]]
- [[Manuscript/09 - Identity & Access Control/9.4 - Security Definers (The Manager's Override)|9.4 Security Definers (The Manager's Override)]]
### [[Manuscript/10 - Summary & Epilogue/Epilogue|Chapter 10 - Summary & Epilogue]]
---
[[Manuscript/01 - Foundations & Data Modeling/1.0 - Relations & Normalization (The Cafe Layout)|Start Reading: Chapter 1 — Relations & Normalization (The Cafe Layout) →]]