### Distributed Gather

![[assets/ex_gather.png|256]]

### The Explain Trace

```sql
-- Forcing a parallel aggregation and gather
SET max_parallel_workers_per_gather = 2;
SET min_parallel_table_scan_size = 0;
SET parallel_setup_cost = 0;

EXPLAIN (ANALYZE, COSTS, BUFFERS, VERBOSE)
SELECT count(*) FROM animals;
```

```text
-> Gather (cost=252.17..252.18 rows=2 width=8) (actual time=2.347..3.628 rows=3 loops=1)
     Output: (PARTIAL count(*))
     Workers Planned: 2
     Workers Launched: 2
     Buffers: shared hit=148
     -> Partial Aggregate (...)
```

---

- **Description**: Collects rows from multiple nodes or parallel worker processes and consolidates them on a single node. This is often the last step of the parallel part of a query.
- **Performance**: Can become a serialization bottleneck when a large volume of data must be shipped through the coordinator, because every row passes through one process.
- **Factors**: Volume of data, number of workers, and inter-process communication (IPC) overhead.
- **Cost**: `setup_cost + communication_cost * data_size`

![[assets/ex_gather_motion.svg|256]]

- **Operates on**: [[Structures/Result Set]]
- **Workloads**:
  - [[Workloads/IPC/Parallel/ExecuteGather|IPC: ExecuteGather]]
  - [[Workloads/IPC/Parallel/ParallelFinish|IPC: ParallelFinish]]
  - [[Workloads/LWLock/Parallel/ParallelQueryDSA|LWLock: ParallelQueryDSA]]
  - [[Workloads/LWLock/Buffers/SharedTupleStore|LWLock: SharedTupleStore]]
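
The **Cost** expression above can be sketched as a small worked example. This is a toy model of the note's formula, not the planner's actual code. The class name `GatherCostModel` and the raw row count are hypothetical. The parameter values `1000.0` and `0.1` are chosen to mirror the default values of PostgreSQL's `parallel_setup_cost` and `parallel_tuple_cost` settings, and mapping them onto `setup_cost` and `communication_cost` is an assumption made for illustration. The sketch compares gathering three partial-aggregate rows (the `rows=3` in the trace) with gathering every raw row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatherCostModel:
    """Toy model of the note's formula: setup_cost + communication_cost * data_size."""

    setup_cost: float          # one-time cost of starting the parallel workers
    communication_cost: float  # per-row cost of shipping a row to the gathering process

    def cost(self, data_size: int) -> float:
        """Return the modelled cost of gathering `data_size` rows."""
        if data_size < 0:
            raise ValueError("data_size must be non-negative")
        return self.setup_cost + self.communication_cost * data_size


# Illustrative parameters (assumption: chosen to mirror PostgreSQL's defaults).
model = GatherCostModel(setup_cost=1000.0, communication_cost=0.1)

partial_rows = 3        # one partial count(*) per participating process, as in the trace
raw_rows = 1_000_000    # hypothetical table size if no partial aggregation ran first

cost_with_partial_agg = model.cost(partial_rows)
cost_without_partial_agg = model.cost(raw_rows)

print(f"gather after partial aggregate: {cost_with_partial_agg:.1f}")
print(f"gather of raw rows:             {cost_without_partial_agg:.1f}")
```

With so few rows crossing the process boundary, the setup term dominates. Once the row count grows, the per-row communication term takes over, which is the serialization bottleneck described under **Performance**.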