# 6.2 The Deep Pantry (Disk Wait)

This is likely the most common wait event you will encounter in any production system. It is **The Sigh of I/O**—the literal pause required to bridge the latency gap between memory and storage.
## The Block Latency Gap
As we saw in the **[[Chapter 5/5.6 - The Physical Machine (RAM, CPU & Disk)|Hardware Era Reference]]**, data not found in the **[[Chapter 5/5.2 - The Warming Rack (Shared Buffers)|Shared Buffer Pool]]** must be fetched from a block device.
When a process initiates a read, it yields the CPU and is placed in a sleep queue until the drive controller signals that the data is ready. In terms of modern processor cycles, even a fast 100 µs NVMe read is an eternity — thousands of potential instructions lost to the speed of physics.
> [!IMPORTANT]
> **Measuring the Gap: `track_io_timing`**
> By default, Postgres does not time every single I/O operation because the overhead of calling the system clock can be expensive on some hardware. To see I/O time in your execution plans, you must enable this parameter:
> ```sql
> SET track_io_timing = on;
> ```
> Without this, `EXPLAIN ANALYZE` will show you *that* I/O happened, but it will not show you how many milliseconds were lost to it.
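Once enabled, the timing shows up not only in individual plans but also in the cumulative statistics views. A quick sanity check against the standard `pg_stat_database` view (its `blk_read_time` and `blk_write_time` columns stay at zero unless `track_io_timing` is on):

```sql
-- Cumulative milliseconds spent reading and writing data file blocks,
-- populated only while track_io_timing is enabled
SELECT datname,
       blk_read_time  AS read_ms,
       blk_write_time AS write_ms
FROM pg_stat_database
WHERE datname = current_database();
```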
## Random vs. Sequential I/O Patterns
Postgres experiences disk access in two primary patterns:
1. **Sequential (Adjacent Containers)**: The process reads adjacent blocks in a single predictable stream. The OS kernel often detects this and pre-fetches the next blocks into the **Linux Page Cache** before Postgres even asks for them. Sequential I/O is the "happy path" of database storage.
2. **Random (Scattered Containers)**: The process jumps between disparate blocks (often during an Index Scan). This is far more taxing on the hardware and the kernel, as pre-fetching is often impossible. This is the #1 consumer of wait time in aging or unoptimized systems.
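The planner encodes this asymmetry directly in its cost model: `random_page_cost` defaults to 4.0 against a `seq_page_cost` of 1.0, a ratio tuned for spinning disks. A sketch of inspecting and adjusting it (the 1.1 value is illustrative only; benchmark on your own hardware before adopting any number):

```sql
-- Inspect the planner's view of the sequential/random gap
SHOW seq_page_cost;     -- default: 1.0
SHOW random_page_cost;  -- default: 4.0

-- On SSD/NVMe storage the real gap is far smaller than 4x,
-- so operators commonly narrow it (illustrative value):
SET random_page_cost = 1.1;
```

Lowering the ratio tells the planner that scattered container trips are no longer four times the price of adjacent ones, which tilts it toward index scans.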
## How to Spot It
In `pg_stat_activity`, you will see these events labeled:
- **[[Workloads/IO/DataFileRead|IO:DataFileRead]]**: The backend process is waiting for a storage device to fulfill a block read request.
- **[[Workloads/IO/DataFileWrite|IO:DataFileWrite]]**: The backend process is waiting to flush a modified page to storage.
- **[[Workloads/IO/WALWrite|IO:WALWrite]]**: The backend process is waiting for a write to a WAL file to complete (the subsequent `fsync` durability wait is reported separately as `WALSync`).
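To see which backends are currently stuck in the Deep Pantry, filter `pg_stat_activity` on the `IO` wait class:

```sql
-- Backends currently sleeping on an I/O wait event
SELECT pid, state, wait_event_type, wait_event,
       left(query, 60) AS query_head
FROM pg_stat_activity
WHERE wait_event_type = 'IO';
```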
## Reading the Evidence: EXPLAIN (ANALYZE, BUFFERS)
The most honest line in any query plan is the **Buffers** report. It tells you exactly how many trips to the Deep Pantry were made.
```sql
-- Run EXPLAIN with BUFFERS to see the I/O story
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE animal_id = 42;
```
A typical plan for the query above looks like this:
```text
Index Scan using idx_orders_animal_id on orders
    (cost=0.14..12.50 rows=5 width=24)
    (actual time=0.182..0.195 rows=5 loops=1)
  Index Cond: (animal_id = 42)
  Buffers: shared hit=4 read=2
Planning:
  Buffers: shared hit=88
Planning Time: 0.200 ms
Execution Time: 0.300 ms
```
- **`shared hit=4`**: Four pages were found in the **[[Chapter 5/5.2 - The Warming Rack (Shared Buffers)|Shared Buffer Pool]]** (RAM). No physical I/O was required.
- **`shared read=2`**: Two pages were not in memory and required physical block reads from storage.
- **`shared dirtied=1`** (not shown above): The process modified a page in memory. It is now "dirty" and must eventually be flushed by the **[[Chapter 5/5.4 - The Housekeepers (Vacuum & Freezing)|Background Writer]]**.
- **`shared written=1`** (not shown above): The process was forced to physically write a page to storage itself (often because of a buffer eviction). This is slow.
A query with all `shared hit` and zero `shared read` is a query living entirely in memory—the ultimate performance goal. A query with a high `shared read` count is one bottlenecked by the latency of your storage hardware.
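The same hit/read distinction is tracked system-wide, which makes a rough cache hit ratio easy to compute. A common diagnostic sketch (interpret with care on a freshly restarted server, where every first touch is necessarily a miss):

```sql
-- Share of block requests served from shared buffers, per database
SELECT datname,
       blks_hit,
       blks_read,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2)
         AS hit_ratio_pct
FROM pg_stat_database
WHERE datname = current_database();
```

Note that a "read" here may still be served by the Linux Page Cache rather than the physical drive, so the ratio understates how warm the system really is.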
## Resolution Paths
If your database is I/O-bound, there are two levers:
1. **Faster storage hardware**: NVMe SSDs have roughly 100× lower latency than HDDs. Reducing raw read latency directly reduces `DataFileRead` wait times.
2. **Increase `shared_buffers`**: The more of your active working set fits in the buffer pool, the fewer reads reach the storage hardware at all.
```sql
-- How large is the current buffer pool?
SHOW shared_buffers;
-- See which tables are currently warm in the buffer pool
-- (requires pg_buffercache extension)
SELECT c.relname, count(*) AS buffers_in_cache
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = c.relfilenode
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers_in_cache DESC
LIMIT 10;
```
Whichever lever you pull, the goal is the same: keep the active working set on the warming rack, so your queries never have to wait on a trip to the Deep Pantry at all.
---
| ← Previous | ↑ Table of Contents | Next → |
| :--- | :---: | ---: |
| [[Chapter 6/6.1.3 - The Background Rhythm (Activity & Extensions)\|6.1.3 Background Rhythm & Activity]] | [[Learn You a Postgres for Great Good\|Home]] | [[Chapter 6/6.2.1 - The Pocket Diary (WAL IO)\|6.2.1 The Pocket Diary (WAL IO)]] |