### Hash Join ![[assets/ex_hashjoin.png|256]] ### The Explain Trace ```sql -- Joining two tables using a hash table EXPLAIN (ANALYZE, COSTS, BUFFERS, VERBOSE) SELECT * FROM animals a JOIN species s ON a.species_id = s.id; ``` ```text Hash Join (cost=2.11..446.11 rows=20000 width=42) (actual time=1.350..4.468 rows=20000 loops=1) Output: a.id, a.name, a.species_id, a.created_at, s.id, s.name, s.diet_type Inner Unique: true Hash Cond: (a.species_id = s.id) Buffers: shared hit=148 read=1 -> Seq Scan on public.animals a (...) -> Hash (...) ``` --- - **Description**: Joins tables using a hash table. - **Performance**: High performance for large datasets; prefers cases where the inner relation fits in `work_mem`. - **Factors**: Size of join tables, hash table size, and hash collision rates. - **Cost**: `build cost + probe cost + cpu_operator_cost * rows joined` ![[assets/ex_hash_semi_join.svg|256]] - **Operates on**: [[Structures/Result Set]] - **Workloads**: - [[Workloads/IO/BufFile/BufFileRead|IO: BufFileRead]] - [[Workloads/IO/BufFile/BufFileWrite|IO: BufFileWrite]] - [[Workloads/LWLock/Parallel/ParallelHashJoin|LWLock: ParallelHashJoin]] - [[Workloads/LWLock/Buffers/BufferContent|LWLock: BufferContent]] - [[Workloads/IPC/Hash/HashBuildElect|IPC: HashBuildElect]] - [[Workloads/IPC/Hash/HashBuildHashInner|IPC: HashBuildHashInner]] - [[Workloads/IPC/Hash/HashBuildHashOuter|IPC: HashBuildHashOuter]] - [[Workloads/IPC/Hash/HashBatchLoad|IPC: HashBatchLoad]]