# 3.2.2 The Matchmakers (Joins) ![The Sous Chefs](assets/arch_sous_chefs_joins.png) Once the raw ingredients have been fetched from the pantry, they must be **Joined**. In the Query Algebra, this is the high-energy work of the **Sous Chefs**. Their job is to take two distinct bowls of ingredients (result sets) and find the precise way to **join** them into a single, complex flavor. Depending on the size and "prepped" state of the ingredients, the Head Chef chooses a different **Join** technique. > [!IMPORTANT] > **The Join Pull**: Remember the **[[Chapter 3/3.2 - The Assembly Line (Query Algebra)|Pull Model]]**! The Sous Chef doesn't just start mashing ingredients. He waits for the station above him to shout, "Give me a partner!" Only then does he reach into his bowls. ## Nested Loop Join: The Manual Hand-Off The **[[Operations/NestedLoop|Nested Loop Join]]** is the simplest way to join ingredients. Imagine a Sous Chef holding one ingredient from Bowl A (the Outer relation). For _every single item_ they hold, they must search through the entire Bowl B (the Inner relation) to find its partner. If Bowl A has 100 items and Bowl B has 100 items, the Sous Chef performs the **join** 10,000 times! It’s like trying to pair every single sock in the laundry by hand—effective, but slow. The Head Chef only orders this when one bowl is tiny. ## Hash Join: The Prep Wall Join When joining large amounts of data, the Head Chef calls for a **[[Operations/HashJoin|Hash Join]]**. This is a two-step assembly process: 1. **The Prep**: A Sous Chef takes Bowl A and builds a **Mise en Place Wall** (the Hash Table), organizing ingredients into numbered buckets based on their matching ID. 2. **The Match**: Once the wall is ready, the Sous Chef simply walks through Bowl B and instantly finds the match. **The Causal Account**: Building the Wall is like spending all morning chopping onions. It’s a huge effort at first—your eyes sting and your hands are tired! But once the onions are prepped, every single dish that needs them is finished in a split second. **The Shape of the Idea**: The Head Chef chooses a Hash Join when he has two large groups to match and no specific map (index) to guide the hand-off. He’d rather do the hard work of chopping onions once than search through the whole pantry ten thousand times! ## Merge Join: The Parallel Prep Join If both bowls are already perfectly sorted (prepped) by their IDs, the Head Chef deploys the elegant **[[Operations/MergeJoin|Merge Join]]**. Two Sous Chefs stand side-by-side with their sorted bowls and walk down their lists in sync. Since the ingredients are already in order, they don't have to search. They just "high-five" whenever their IDs match and **join** the ingredients into a new bowl. It requires almost no counter space, but it demands the high upfront cost of sorting the bowls first! ### The Duel of the Chefs Let's see how the Head Chef chooses his **Join** technique based on the scale of the Cafe's data. #### 1. The Small Batch Join (Nested Loop Join) When joining a single dish to its ingredients, the items are few. The engine just does a quick manual hand-off. ```sql EXPLAIN SELECT dishes.name, ingredients.name FROM dishes JOIN dish_ingredients ON dishes.id = dish_ingredients.dish_id JOIN ingredients ON dish_ingredients.ingredient_id = ingredients.id WHERE dishes.id = 5; -- Result (The Manual Pair): -- Nested Loop Join (cost=0.15..12.31 rows=1 width=64) -- -> Index Scan using dishes_pkey on dishes d (cost=0.14..8.15 rows=1 width=36) > [!TIP] > **Elephant's Footnote: The Small Pairing** > - **`cost=0.15..12.31`**: This is cheap because the "Outer" bowl (Dishes) only has one item! ``` #### 2. The Strategy Shift (Hash Join vs. Nested Loop Join) When we join all `ingredients` to their `supply_deliveries` to calculate total stock, the Head Chef has a choice to make depending on our prep work. **State 1: Joining with a Wall (Hash Join)** Without a specific index on the `ingredient_id` in a massive `supply_deliveries` table, the engine must build a Hash Wall to make the join feasible. ```sql EXPLAIN SELECT ingredients.name, supply_deliveries.delivery_time FROM ingredients JOIN supply_deliveries ON ingredients.id = supply_deliveries.ingredient_id WHERE ingredients.id < 10; -- Results (The Mise en Place Wall): -- Hash Join (cost=4.00..96.70 rows=740 width=15) -- Hash Cond: (supply_deliveries.ingredient_id = ingredients.id) -- -> Seq Scan on supply_deliveries (cost=0.00..88.80 rows=1480 width=12) -- -> Hash (cost=3.00..3.00 rows=80 width=11) -- -> Seq Scan on ingredients (cost=0.00..3.00 rows=80 width=11) -- Filter: (id < 10) > [!TIP] > **Elephant's Footnote: The Wall Construction** > - **`cost=4.00..96.70`**: Note the "Start" cost of 4.00. This is the effort required to build the Mise en Place wall before a single match can be made! ``` **State 2: The Fast Hand-off Join (Nested Loop Join)** Once we add an index (`CREATE INDEX idx_supply_ingredient ON supply_deliveries(ingredient_id);`), the plan shifts. The engine can now use a Nested Loop Join because for every ingredient, it can instantly "fetch" the matching deliveries! ```sql -- Results: -- Nested Loop Join (cost=0.43..85.00 rows=100 width=64) -- -> Seq Scan on ingredients (cost=0.00..12.45 rows=10) -- -> Index Scan using idx_supply_ingredient on supply_deliveries (cost=0.43..6.50 rows=10) ``` #### 3. The Sorted Join (Merge Join) If both lists are already prepped (sorted) by ID, the Chefs perform a Merge Join. ```sql SET enable_hashjoin = off; EXPLAIN SELECT ingredients.id, supply_deliveries.id FROM ingredients JOIN supply_deliveries ON ingredients.id = supply_deliveries.ingredient_id ORDER BY ingredients.id; -- Result: -- Merge Join (cost=0.30..150.00 rows=1000 width=8) -- Merge Cond: (ingredients.id = supply_deliveries.ingredient_id) -- -> Index Scan using ingredients_pkey on ingredients (...) -- -> Index Scan using idx_supply_ingredient on supply_deliveries (...) ``` --- | ← Previous | ↑ Table of Contents | Next → | | :--- | :---: | ---: | | [[Chapter 3/3.2.1 - The Food Runners (Scans)\|3.2.1 The Food Runners (Scans)]] | [[Learn You a Postgres for Great Good\|Home]] | [[Chapter 3/3.2.3 - The Prep Station (Aggregations)\|3.2.3 The Prep Station (Aggregations)]] |