# 7.3 Splitting the Depot (Partitioning) ![Splitting the Depot](assets/arch_partitioning_diverse.png) As your depot grows, you will eventually encounter the **Cramped Aisle Problem**. Imagine you have one single, massive table for "Orders." Every elephant in the building is trying to write to that one table. Even with the **[[Chapter 4/4.1 - The Pocket Diary (WAL & fsync)|Pocket Diary]]**, they are bumping into each other, fighting over the same **[[Chapter 1/1.2 - The Shipping Container (The Page)|Pages]]**, and waiting at the same narrow bridges (Locks). "No room! No room!" This is **Write Contention**. The solution is to split that one massive table into many smaller, manageable pieces. We call this **Partitioning**. The fundamental logic is undeniable: if the monolith is too heavy to move, you must shatter it into fragments. ## The Mini-Depot Strategy Instead of one "Orders" table, the elephant creates a "Parent" table that acts as a signpost, and dozens of "Child" tables (Partitions). - Orders from 2023 go to the **2023 Partition**. - Orders from 2024 go to the **2024 Partition**. Now, elephants working on 2023 data never even see the elephants working on 2024 data. The contention disappears, and the depot feels empty again. It’s like having a hundred tea parties instead of one big, messy one! ## Why It’s Difficult to Do Well If partitioning was easy, everyone would do it for every table. But it is a double-edged sword (and a very sharp one at that). ### 1. The Partition Key Dilemma You must choose a **Partition Key**—the rule for how data is split. - If you split by **Date**, but your users always search by **Customer ID**, the Maitre D' (Query Planner) has to send a staff member to _every single partition_ to find the data. This is called **Fan-out**, and it's devastating for performance. It’s like looking for a mouse in a hundred different holes! - If you split by **Hash** (randomly spreading rows), you solve the write contention perfectly, but you can no longer perform efficient "Range Queries" (e.g., "Show me everything from June"). ### 2. The Maintenance Burden Partitions are not magic; they are real tables. You have to create them, index them, and manage them. If you forget to create the "2025" partition before January 1st, the elephant will suddenly have nowhere to put the data and will start yelling errors. It’s a very rude way to start the new year! ### 3. The Cross-Partition Join Joining two large partitioned tables is like trying to coordinate two different armies across a hundred different kitchens. Unless the partition keys match perfectly (Co-location), the performance will plummet into a combinatorial abyss. Partitioning is a powerful tool for scaling **Writes** and managing **Data Lifecycle** (e.g., deleting old data by simply dropping a partition). But like any powerful tool, it requires a master's touch to avoid turning your organized depot into a fragmented maze. It’s a game of chess, played with elephants! ### Shattering the Monolith Let's perform the migration for our `supply_deliveries` table. We are moving from a single massive table to a partitioned one: ```sql -- 1. Look before you leap: Create the new Parent CREATE TABLE supply_deliveries_partitioned ( id BIGINT GENERATED ALWAYS AS IDENTITY, supplier_id INT NOT NULL, ingredient_id INT NOT NULL, delivery_time TIMESTAMPTZ NOT NULL, quantity_kg NUMERIC(8,2) NOT NULL, PRIMARY KEY (id, delivery_time) ) PARTITION BY RANGE (delivery_time); -- 2. Create the rooms for the different years CREATE TABLE supply_deliveries_2024 PARTITION OF supply_deliveries_partitioned FOR VALUES FROM ('2024-01-01') TO ('2025-01-01'); -- 3. The Swap (In a Transaction!) BEGIN; ALTER TABLE supply_deliveries RENAME TO supply_deliveries_old; ALTER TABLE supply_deliveries_partitioned RENAME TO supply_deliveries; COMMIT; ``` Now, the application keeps sending data to `supply_deliveries`, but the elephant secretly routes it to the correct "Year Room." The aisle is no longer cramped! --- [[Chapter 7/7.2 - The Infinite Archive (Cloud Storage)|← 7.2 - Cloud Storage]] | [[Chapter 7/7.0 - The Elephant in the Clouds (Distributed Storage)|↑ 7.0 - The Cloud Scales]] | [[Chapter 7/7.4 - The Crowded Hallway (Connection Pooling)|7.4 - The Crowded Hallway →]]