# 5.4 The Great Overflow (Tuple bloat)

In **[[Chapter 5/5.3 - The Housekeepers (Vacuum & Freezing)|Chapter 5.3]]**, we met the quiet staff who clean the depot while the elephant naps. But what happens when the workload becomes so violent that the housekeepers simply can't keep up?
This is the story of the **Great Overflow**, otherwise known as **Table Bloat**.
## The Infinite Paper Shredder
Imagine the Cafe is having a record-breaking busy day. Orders are being updated every millisecond. As we learned in **[[Chapter 1/1.5 - The Sharpie Ledger (MVCC)|MVCC]]**, the elephant never erases anything; he just marks the old suitcase as "dead" and creates a new one.
This is like a clerk who, every time a name changes on a ledger, shreds the old page and throws the scraps on the floor.
The **Housekeeper (Vacuum)** is supposed to walk around, pick up the scraps, and put them in a bin so they can be reused. But if the clerk is shredding faster than the vacuum can sweep, the scraps start to pile up. Eventually, the scraps cover the floor, the desks, and even the breakfast buffet!
## The Symptom: The Growing Depot
When the vacuum fails, the **[[Chapter 1/1.4 - The Depot (The Table)|Filing Cabinet]]** (the database file on disk) grows larger and larger. Even though you only have 1,000 "live" customers, the cabinet might take up 100GB of space. Most of that space is just "dead weight"—shredded paper that hasn't been recycled yet.
This is **Bloat**. It's the #1 reason for performance degradation in high-write systems. Why? Because the Runners have to walk over mountains of shredded paper just to find one "live" plate.
### The Invisible Waste: Index Bloat
A reader might "VACUUM" their table but wonder why their "B-Tree" is still gigabytes larger than it should be. This is **Index Bloat**.
When a row is shredded, the "Post-it Note" on the **[[Chapter 2/2.1 - The Balanced Bookshelf (The B-Tree)|Librarian's Map (the Index)]]** doesn't always vanish immediately. If the Librarian (Index) is full of notes pointing to empty air, the map itself becomes "bloated."
While `VACUUM` *does* try to clean the map, it’s much harder than cleaning the floor. To truly shrink the map, you often need a **`REINDEX`**—which is like the Librarian throwing away his old book and drawing a fresh one from scratch!
## The Diagnostics: Checking the Bins
To find out if your housekeepers are falling behind, you must look at the **Waste Audit**:
```sql
-- How much shredded paper is on the floor?
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'supply_deliveries';
```
If **`n_dead_tup`** (dead tuples) is significantly higher than **`n_live_tup`**, the elephant is drowning in his own trash!
## The Solution: Tuning the Cleaners
The elephant is very conservative. By default, he only starts cleaning when 20% of a table is dead. For a table with a billion rows, that’s 200 million dead rows before the vacuum even turns on!
To maintain a healthy depot, you must adjust the **Thresholds of Cleanliness**:
```sql
-- Telling the Housekeeper to be more aggressive on this table
ALTER TABLE supply_deliveries SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_vacuum_cost_limit = 1000
);
```
By lowering the `scale_factor` to 1% (0.01), we ensure the housekeeper starts sweeping much earlier. By increasing the `cost_limit`, we give them a bigger broom!
## The Nuclear Option (VACUUM FULL)
If the bloat is so bad that you absolutely *must* reclaim the disk space immediately, you can order a **`VACUUM FULL`**.
Be warned: this is the **Nuclear Option**. The elephant will lock the entire depot, move every single suitcase into a brand-new cabinet, and throw the old cabinet in the river. No one can order tea, no one can pay their bill, and the Cafe is effectively closed for the duration of the move. Great engineers use `VACUUM FULL` only in the direst of emergencies!
It’s more expensive and requires more "staging area" (disk space), but it ensures the Cafe stays open while the trash is removed!
## Summary: The Elephant's Menu (The Tuning Matrix)
Tuning the elephant's appetite is about balancing speed and safety. Use this matrix to guide your next configuration:
| Setting | Role | The "Lazy" Wisdom |
| :------------------------- | :---------------------- | :-------------------------------------------------------- |
| `shared_buffers` | The Warming Rack | Set to ~25% of RAM. Leave the rest for the Building Manager (OS). |
| `work_mem` | The Private Desk | Set low (4-16MB) by default. Set high for specific messy sessions. |
| `maintenance_work_mem` | The Housekeeping Budget | Set high (256MB-1GB) so the Vacuum Crew has a bigger broom. |
| `max_wal_size` | The Pocket Size | Larger keeps the checkpointer from panicking; smaller saves disk. |
| `autovacuum_scale_factor` | The Mess Threshold | Lower it for busy tables so the cleaners start earlier. |
| `vacuum_cost_limit` | The Cleaning Speed | Increase it to let the cleaners work faster at the cost of I/O. |
By mastering the Warming Rack and the Housekeeper's Budget, you ensure the Elephant Cafe is lean, fast, and always ready for the next lunch rush.
---
Now that the elephant has a full stomach and a clean depot, he needs to worry about the **Crowds**. In the next chapter, we'll explore **The Waiting Game (Workloads & Locking)**, where we see what happens when too many elephants try to use the same suitcase at once!
---
| ← Previous | ↑ Table of Contents | Next → |
| :--- | :---: | ---: |
| [[Chapter 5/5.3 - The Housekeepers (Vacuum & Freezing)\|5.3 The Housekeepers (Background Workers)]] | [[Learn You a Postgres for Great Good\|Home]] | [[Chapter 6/6.0 - The Waiting Game (Workloads & Locking)\|Chapter 6 - The Waiting Game (Workloads & Locking)]] |