# Chapter 9: Identity & Access Control ## 9.0 The Bouncers and the VIP List (Access Control) ![[assets/arch_access_overview.png|450]] In the initial stages of a system's life, security is often handled on the honor system. We assume that any process connecting to the backend is permitted to read any ledger, and any admin possesses total authority over the schema. In a production environment, this is a significant architectural hazard. The database engine must ensure that sensitive internal data remains shielded from arbitrary access, and the system must prevent standard processes from altering the data belonging to other operational owners. > [!NOTE] > **Technical Secret: The Front Gate vs. The Room Keys** > PostgreSQL security is built on two distinct architectural layers: **Authentication** (identity validation) and **Authorization** (resource permissioning). While the two are often conflated, understanding the boundary between them is critical for debugging "Permission Denied" errors and designing secure distributed systems. ### The Axis of Trust 1. **Authentication (AuthN): The Front Gate (`pg_hba.conf`)** The mechanism by which Postgres verifies that you are who you say you are. This happens at the very start of the connection lifecycle, governed by the **Host-Based Authentication** file. Whether you use SCRAM passwords, GSSAPI, or certificates, the goal of AuthN is to assign an **Identity** to your database backend process. 2. **Authorization (AuthZ): The Catalog Check** The ongoing process of checking if your assigned Identity has the required permissions to perform a specific action (e.g., `SELECT` from a table, `EXECUTE` a function). Authorization is enforced by the **Query Planner** and **Executor**, referencing the system catalogs: - [pg_authid](Manuscript/09 - Identity & Access Control/9.1 - Roles & Privileges (The Name Tags).md): The master vault of role attributes. - [pg_class](Manuscript/02 - Physical Storage & MVCC/2.4 - Relation (The Table).md): Where table-level permissions (ACLs) live. - [pg_namespace](Manuscript/09 - Identity & Access Control/9.1 - Roles & Privileges (The Name Tags).md): The registry for schema-level "Door Keys." ### Identity as a Resource In Postgres, everything is a **Role**. A "User" is simply a Role with the `LOGIN` attribute. A "Group" is simply a Role that possesses other Roles as members. By separating identity from permissions, Postgres allows for an incredibly flexible, inheritance-based security model. You can define a single `readonly_viewer` role, grant it specific permissions on your tables, and then grant that role to dozens of individual human users. When you update the permissions on the group, every member inherits the change instantly. ### The Principle of Least Privilege The goal of this chapter is to guide you through the process of hardening your "Security Posture." We will move from the coarse-grained permissions of the **Superuser** toward a more surgical, policy-driven model: - **Roles & Membership**: Building an inheritance tree of responsibility. - **Default Privileges**: Ensuring that new resources are born secure. - **Row-Level Security**: Injecting visibility rules directly into the query execution plan. - **Security Definers**: Safely delegating authority without elevating permanent status. By the end of this chapter, you will understand that security in Postgres is not a "Lock on the Door," but a pervasive, catalog-driven framework that governs every single byte retrieved by Postgres. --- ## 9.1 The Name Tags (Roles & Privileges) ![[assets/arch_access_nametags.png|450]] In many systems, there is a strict difference between "Users" (people who log in) and "Groups" (collections of people). The Elephant doesn't care about this distinction. To Postgres, everything is simply a **Role**. Think of a Role as a **Name Tag**. **The Causal Account**: When you `GRANT` one role to another, it’s like printing a small logo inside the plastic of the primary tag. When a Manager walks into a room, the Bouncer looks at their tag and says, "Aha! You are a Manager, but I can see the **'Waiter'** logo printed inside the plastic." The Manager automatically gets the Waiter's keys without having to ask! **The Flow of Power**: - **`INHERIT` (The Default)**: Power flows **up**. The Manager automatically gets everything the Waiter has. - **`NOINHERIT`**: The Manager has the *right* to the keys, but they aren't active yet. They have to explicitly "Swap" to the other identity (`SET ROLE cafe_waiter`) to use those specific keys. ```sql -- Create a Manager tag and a Waiter tag CREATE ROLE cafe_manager WITH LOGIN PASSWORD 'peanuts' CREATEROLE; CREATE ROLE cafe_waiter; ``` ```sql -- The Manager is now also wearing the Waiter Name Tag GRANT cafe_waiter TO cafe_manager; ``` > [!TIP] > **Delegated Authority (`WITH ADMIN OPTION`)**: > If you want the Manager to be able to hand out the "Waiter" tag to other staff members, you give them the **`ADMIN OPTION`**. It's like giving them a spare box of name tags to distribute at their discretion. ### The Vault vs. The View: `pg_authid` and `pg_roles` Where does Postgres keep track of these tags? Identity information is split across two primary catalogs: - **`pg_authid` (The Vault)**: The literal table on disk that stores everything, including hashed passwords and sensitive attributes. Only the Superuser can look here. - **`pg_roles` (The Public View)**: A sanitized view of the vault that hides passwords but shows everyone's stickers (attributes). This is what you see when you run `\du` in psql. **The OID Secret**: Remember [Chapter 2.1](Manuscript/02 - Physical Storage & MVCC/2.1 - Data Types (Knicks, knacks, bits, and bobs).md)? Under all these "Name Tags" and "Room Keys," everything in Postgres is still just an **OID (Object Identifier)**. A Role is just a number. A Table is just a number. Security is just a list in the vault saying "Number 123 has Key A for Number 456." Let's look at the literal contents of `pg_roles` for our new staff: ```sql SELECT rolname, rolsuper, rolinherit, rolcreaterole, rolcanlogin FROM pg_roles WHERE rolname LIKE 'cafe_%'; -- Literal Output: -- -[ RECORD 1 ]-+------------- -- rolname | cafe_manager -- rolsuper | f -- rolinherit | t -- rolcreaterole | t -- rolcanlogin | t -- -[ RECORD 2 ]-+------------- -- rolname | cafe_waiter -- rolsuper | f -- rolinherit | t -- rolcreaterole | f -- rolcanlogin | f ``` Every Name Tag has **Attributes** (the stickers on the tag). The `cafe_manager` tag has the `rolcanlogin` sticker (they can walk through the front door) and the `rolcreaterole` sticker (they can print new name tags for others). The `cafe_waiter` role has no login sticker—it's just a group badge. > [!NOTE] > **Technical Secret: The Superuser Wall-hack** > The **Superuser** sticker (`rolsuper`) is the Master Key. A role wearing this tag doesn't even see the Bouncers; it can bypass all permission checks, access any relation, and even modify the manager roles themselves. Always keep this sticker in a locked vault—if an unprivileged process or malicious actor acquires Superuser status, the entire cluster is compromised. ### The Room Keys (Schema & Object Privileges) Just because a process is inside the database doesn't mean it can open the ledgers. Every table has its own access control. To read a table, a role must be explicitly handed a key using the `GRANT` command. ```sql -- Handing a key to the Waiters GRANT SELECT ON orders TO cafe_waiter; ``` > [!NOTE] > **Cipher Breakout: Reading `{postgres=arwdDxt/postgres}`** > > Think of this string as a **Permission Stamp** with three parts: > 1. **Grantee**: `postgres` (The staff member receiving the key). > 2. **Privileges**: `arwdDxt` (The specific keys they hold). > 3. **Grantor**: `/postgres` (The manager who handed over the key). > > If the string was `{cafe_waiter=r/postgres}`, it would mean: *"The waiter can read (r), as permitted by the master postgres staff."* Every time a query runs, Postgres physically checks `pg_class.relacl` to ensure the applicant has the correct `r`, `w`, or `a` key! ### Anatomy of an ACL Item Let's zoom in on that cryptic string. A single ACL entry (an `aclitem`) is a structured "Permission Stamp" with three parts: ```text ┌──────────────────────────────────────────────────────────┐ │ ACL Item (The Permission Stamp) │ ├─────────────────┬──────────────────┬─────────────────────┤ │ GRANTEE │ PRIVILEGES │ GRANTOR │ │ (Who gets it) │ (What labels) │ (Who gave it) │ ├─────────────────┼──────────────────┼─────────────────────┤ │ cafe_waiter │ = r │ / postgres │ │ (Puffin) │ │ │ └─────────────────┴──────────────────┴─────────────────────┘ ``` - **Grantee**: The Name Tag being given the access. - **Privileges**: A string of single-letter codes representing what can be done. - **Grantor**: The person who stood at the counter and handed over the key. > [!WARNING] > **The Key in the Door (PUBLIC)** > If the Grantee field is empty (e.g., `=r/postgres`), it means the privilege was given to **PUBLIC**. This is like leaving the key in the lock for **everyone**. This is a security hazard—if you see this on a ledger of secret recipes, you've essentially invited any unprivileged process to have a look. ### Key to the Symbols The elephant is a fan of shorthand. Here is how Postgres's cryptic stamps translate to the language of the kitchen: | Code | SQL Privilege | Cafe Metaphor | | :--- | :--- | :--- | | **`r`** | `SELECT` | **Read**: Peek into a suitcase to see what's inside. | | **`a`** | `INSERT` | **Append**: Place a new suitcase in the shipping container. | | **`w`** | `UPDATE` | **Write**: Change the contents of an existing suitcase. | | **`d`** | `DELETE` | **Discard**: Mark a suitcase for later disposal (MVCC). | | **`D`** | `TRUNCATE` | **Dump**: Empty the entire shipping container instantly. | | **`x`** | `REFERENCES`| **Link**: Use this suitcase as a label for another. | | **`t`** | `TRIGGER` | **Tripwire**: Set a beaver to watch this area. | | **`X`** | `EXECUTE` | **Execute**: Run a specialized machine or recipe. | | **`U`** | `USAGE` | **Utility**: Enter a specific pantry area (Schema). | When you see `{postgres=arwdDxt/postgres}`, you are seeing the "Master Key"—the ability to append, read, write, discard, dump, link, and set tripwires, all granted by the owner. > **The Schema Trap**: Granting `SELECT` on a table is useless if the user does not also have `USAGE` on the **Schema** containing that table. The schema is the "door" to the room; the table is the "cabinet" inside. You cannot open the cabinet if you cannot enter the room. ### The Inheritance Ledger: `pg_auth_members` To see the current state of your role hierarchy, you must look into the **Inheritance Ledger**. This catalog tracks the relationships between members and their groups. ```sql -- See the membership graph SELECT m.rolname AS member, g.rolname AS group_role, admin_option FROM pg_auth_members am JOIN pg_roles m ON am.member = m.oid JOIN pg_roles g ON am.roleid = g.oid; ``` By mastering roles and membership, you move away from managing thousands of individual passwords and toward a scalable, group-based identity model. --- ## 9.2 The Manager's Orders (Default Privileges) ![[assets/arch_access_default_privs.png|450]] The biggest headache in Postgres security is the **New Object Problem**. In **[[Manuscript/02 - Physical Storage & MVCC/2.0 - Storage Foundations (The Building Blocks)|Chapter 2]]**, we saw that data is stored in relations (tables). When a process creates a new table, **it becomes the owner, and it is the ONLY entity with access by default**. Even if you previously granted a group `SELECT ON ALL TABLES IN SCHEMA public`, that grant only applied to the objects that existed *at that specific moment*. If a migration creates a new recipe book tomorrow, other roles will receive a "Permission Denied" error if they try to access it. **The Intuition**: Think of Default Privileges as a **Standing Order** for a personal assistant. The assistant only knows what to do when *their specific boss* (the Creator) produces a new document. They have no instructions for what to do when a different office (a different Role) brings in a table! **The Shape of the Idea**: Default Privileges are **Personal Automation**, not **Global Gravity**. They do not pull permissions into the schema automatically; they are triggered only when a specific role performs a specific DDL action. This is managed through the [pg_default_acl](Manuscript/09 - Identity & Access Control/9.2 - Default Privileges (The Manager's Orders).md) system catalog. > [!TIP] > **The Owner's Trap** > While a standing order automates the granting of keys, the **Owner** (the creator) always holds the master lock. They can explicitly `REVOKE` access later or drop the entire relation, regardless of what the default privileges specified. The creator remains the ultimate authority over their own objects. ### The Standing Order To fix this, you don't grant privileges on the ledger. You give a **Standing Order** to Postgres, known as `ALTER DEFAULT PRIVILEGES`. ```sql ALTER DEFAULT PRIVILEGES FOR ROLE cafe_manager IN SCHEMA public GRANT SELECT ON TABLES TO cafe_waiter; ``` This tells Postgres: *"Listen closely. Every time the `cafe_manager` creates a new ledger in the public room, I want you to immediately cut a 'read' key and hand it to `cafe_waiter`."* > [!NOTE] > **Technical Secret: The Implicit Creator Gotcha** > If you don't specify `FOR ROLE`, the database engine assumes the standing order is for **you**. If you are the one running the migrations (`cafe_manager`), that's fine. But if you are logged in as a `service_role` and someone else runs the migrations, the order will never trigger! Always be explicit about who the "Creator" is. One of the most common security failures in a growing database is the "New Object Gap." When you create a new table, it is owned by the role that created it. By default, no other roles (except superusers) have access to it. If you have a team of developers creating tables, you will constantly be running manual `GRANT` commands to keep your permissions in sync. To solve this, PostgreSQL provides **Default Privileges**. ### The Mechanism: `pg_default_acl` Default privileges allow you to define a set of access rules that will be applied automatically to any **future** objects created within a specific schema. These rules are stored in the `pg_default_acl` system catalog. Unlike standard `GRANT` commands which modify the ACL of an existing object, `ALTER DEFAULT PRIVILEGES` modifies the *blueprint* for new objects. ```sql -- Enforce that any new tables created by 'postgres' in the 'public' schema -- are automatically readable by the 'cafe_admin' group. ALTER DEFAULT PRIVILEGES FOR ROLE postgres IN SCHEMA public GRANT SELECT ON TABLES TO cafe_admin; ``` ### The Critical "Gotcha": The Creator Constraint Default privileges in Postgres possess a significant architectural limitation: **They are tied to the creator, not just the schema.** If you run the command above for the `postgres` role, it will works perfectly—*as long as the `postgres` role is the one creating the tables*. If another user, `head_chef`, creates a table in the same schema, the default privileges you set for `postgres` **will not apply**. The `head_chef` is a different creator, and they have their own (likely empty) set of default ACLs. > [!WARNING] > **The Inheritance Fallacy**: Even if `head_chef` is a member of the `postgres` role, the default privileges of the group do not automatically apply to objects created by the member. Default privileges must be explicitly defined for every role that is expected to create objects, or you must ensure all creators use `SET ROLE` to become the authorized creator before running DDL. ### Auditing Defaults You can inspect your current default blueprints using the `\ddp` command in `psql` or by querying the catalog directly: ```sql -- View all default ACLs in the cluster SELECT defaclrole::regrole AS creator, defaclnamespace::regnamespace AS schema, defaclobjtype AS type, defaclacl AS acl FROM pg_default_acl; ``` - **`defaclrole`**: The role whose creations are being modified. - **`defaclnamespace`**: The schema filter (Zero means "all schemas"). - **`defaclobjtype`**: `r` for tables, `f` for functions, `S` for sequences. ### Strategy: The "Secure by Default" Pattern For a robust production environment, follow this sequence: 1. **Define a NOLOGIN Admin Role** (`cafe_mgr`) that will own all schemas. 2. **Define a NOLOGIN Read Role** (`cafe_reader`) for reporting. 3. **Establish Defaults** for the Admin Role so it always grants access to the Read Role. 4. **Enforce DDL execution** through the Admin Role only. By mastering default privileges, you ensure that security is an inherent property of your architecture, rather than a manual checklist that can be forgotten during a busy deployment. --- ## 9.3 The VIP List (Row-Level Security) ![[assets/arch_bouncer_rhino.png|450]] So far, we've controlled access to the *rooms* (Schemas) and the *entire ledgers* (Tables). But what if customers share the same ledger? Imagine the Cafe has a `secrets` table. The Manager has a secret code, and the Waiter has a terrible secret about the toast. ```sql CREATE TABLE secrets (id int, owner name, secret text); INSERT INTO secrets VALUES (1, 'cafe_manager', 'The safe code is 555-1234'); INSERT INTO secrets VALUES (2, 'cafe_waiter', 'The Planner burned the toast'); -- Give everyone a key to the ledger GRANT SELECT ON secrets TO PUBLIC; ``` Standard table-level permissions (GRANTs) are binary: you either have access to the entire table or you have access to none of it. **Row-Level Security (RLS)** provides a more granular mechanism by allowing you to define visibility and modification rules on a per-row basis. ### The Mechanism: Query Rewriting Architecturally, RLS is not a separate screening process that happens after data is retrieved. Instead, it is a **Query Rewriting** mechanism integrated into the [Query Planner](Manuscript/04 - Query Planning & Execution/4.1 - Query Planner (The Blueprint of Execution).md). When RLS is enabled on a table, the Planner automatically injects the relevant policy expressions from the [pg_policy](Manuscript/09 - Identity & Access Control/9.3 - Row-Level Security (The VIP List).md) catalog into your SQL as additional `WHERE` clauses (for reads) or validation checks (for writes). ```sql -- Enabling RLS on a table ALTER TABLE secrets ENABLE ROW LEVEL SECURITY; -- Creating a policy: Users can only see their own records CREATE POLICY read_own ON secrets FOR SELECT USING (owner = current_user); ``` When a user runs `SELECT * FROM secrets`, the Planner internally rewrites the query to: `SELECT * FROM secrets WHERE (owner = current_user)`. ### Visibility vs. Validation (USING vs. WITH CHECK) Postgres distinguishes between existing data and new data being introduced to the system: - **`USING`**: Defines which **existing** rows are visible to the user. If a row does not satisfy this expression, it is silently excluded from the result set as if it does not exist. - **`WITH CHECK`**: Defines which **newly created or updated** rows are allowed to be written. If a user attempts to `INSERT` or `UPDATE` a row that violates this expression, Postgres returns an error. > [!IMPORTANT] > **Silencing the Errors**: RLS is fundamentally a **Visibility** tool. When a user tries to read a row they don't own, Postgres does not throw an error; it simply returns an empty result set. This prevents "Identity Probing"—a side-channel attack where a hacker could guess the existence of IDs by watching for error messages. ### The Performance Tax Because RLS is a **Query Rewrite** mechanism, it is not free. Every policy you define adds a hidden computational burden to every query: 1. **Plan Expansion**: Postgres doesn't just run your query; it has to append every valid policy as a new `Filter` node in the execution plan. If you have several complex policies, the [Query Planner](Manuscript/04 - Query Planning & Execution/4.1 - Query Planner (The Blueprint of Execution).md) suddenly has a much more complex set of constraints to coordinate. 2. **Index Suppression**: If an RLS policy uses a function or a subquery that the Planner cannot fully analyze, it may default to a **Sequential Scan**, ignoring high-speed indexes entirely. 3. **The Subquery Hazard**: If your policy requires a lookup in another table (e.g., `USING (user_id IN (SELECT id FROM staff))`), Postgres may be forced to perform that dependent lookup for every scan, significantly increasing query latency. > [!TIP] > **Keep Policies Local**: To avoid the performance tax, keep your RLS policies as simple as possible. Use simple column-to-user comparisons (`owner = current_user`) whenever possible to ensure the Planner can still utilize indexes effectively. ### High-Performance Security: The Security Barrier Because RLS expressions are injected into the query, they must be evaluated frequently. To optimize this, the Planner often treats RLS-enabled tables as **`SECURITY_BARRIER`** views. This ensures that any user-provided functions or filters (which might contain malicious logging or side-effects) are only executed **after** the RLS security filters have been applied. Postgres creates a logical wall between the "Trusted" security filters and the "Untrusted" user filters. ### Auditing Policies Policy definitions are stored in the `pg_policy` system catalog. You can inspect the visceral, internal representation of these rules: ```sql -- View policies and their internal SQL expressions SELECT polname, polcmd, polroles, polqual FROM pg_policy; ``` - **`polcmd`**: The command type (`r` for SELECT, `a` for INSERT, `w` for UPDATE, `d` for DELETE). - **`polqual`**: The `USING` expression. - **`polwithcheck`**: The `WITH CHECK` expression. By leveraging RLS, you move identity logic out of the application layer and into the database engine itself, ensuring that security is a deterministic property of the data regardless of which application is accessing it. --- ## 9.4 The Manager's Override (Security Definers) ![[assets/arch_access_security_definer.png|450]] In a highly secure environment, you often face the "Access Paradox": a low-privileged role needs to perform a task that requires high-level permissions, but you do not want to grant that role permanent administrative rights. To solve this, PostgreSQL provides **`SECURITY DEFINER`** functions. ### The Mechanism: Identity Context Switching By default, functions in Postgres are defined as **`SECURITY INVOKER`**. This means the function executes with the privileges of the user who calls it. A **`SECURITY DEFINER`** function, however, executes with the privileges of the role that **created** the function. When a lower-privileged role calls a `SECURITY DEFINER` function: 1. **Identity Context Switch**: The database engine performs a literal switch of the `SessionUser` and `CurrentUser` variables. 2. **Elevation**: The effective user identity is temporarily elevated to that of the function's creator (the Definer). 3. **Privileged Execution**: The function body executes with elevated permissions (e.g., bypassing [RLS](Manuscript/09 - Identity & Access Control/9.3 - Row-Level Security (The VIP List).md) or reading private relations). 4. **Restoration**: Once the function returns, the identity is restored to the original invoker. > [!CAUTION] > **Privilege Escalation Risk**: `SECURITY DEFINER` is a deliberate mechanism for **Privilege Escalation**. You are allowing a role to assume the authority of another user for a specific code path. This is a powerful tool for building secure APIs, but it represents a significant security hazard if the function body is not rigorously vetted. ### The Trojan Horse (The `search_path` Vulnerability) The most dangerous vulnerability in `SECURITY DEFINER` functions is a **Search Path Attack**. If a function executes a query like `SELECT * FROM ingredients`, Postgres searches for the `ingredients` table using the current **`search_path`**. If an intruder creates a malicious table also named `ingredients` in a schema they control (like `public` or a private schema), and then tricks a high-privileged function into looking at their schema first, the function will execute against the malicious data using the elevated permissions. #### Hardening the Vault Door To prevent these "Trojan Horse" attacks, a truly secure Cafe follows two strict rules: 1. **Lock the Path**: Always use `SET search_path = public, pg_temp` on the function definition. This forces Postgres to only look in trusted locations. 2. **Revoke PUBLIC Access**: By default, the `public` schema is open to everyone. To harden the "Filing Cabinet," you must strip these default permissions: ```sql -- Close the "Global Kitchen" door to strangers REVOKE ALL ON SCHEMA public FROM PUBLIC; ``` By adding these constraints, you ensure that even when Postgres has the "Manager's Override," it only follows recipes from a trusted source. ### Auditing Your Posture Before concluding our exploration of Access Control, perform a final audit of the cluster's security state using these diagnostic axes: 1. **Identity Entropy**: Are you using dedicated Roles for services and humans, or are you over-relying on the `postgres` superuser? 2. **The Least Privilege Check**: Does every role have exactly the `GRANT` keys required for its specific room ([Schema](Manuscript/09 - Identity & Access Control/9.1 - Roles & Privileges (The Name Tags).md)) and cabinet ([Table](Manuscript/02 - Physical Storage & MVCC/2.4 - Relation (The Table).md))? 3. **The "Secure by Default" Check**: Are [Default Privileges](Manuscript/09 - Identity & Access Control/9.2 - Default Privileges (The Manager's Orders).md) configured so that new data is born within the correct security boundaries? 4. **The Visibility Audit**: Is [Row-Level Security](Manuscript/09 - Identity & Access Control/9.3 - Row-Level Security (The VIP List).md) active on multi-tenant tables, and have you verified that policies correctly exclude rows using `USING` clauses? 5. **The Escalation Audit**: Have you vetted every `SECURITY DEFINER` function for `search_path` safety and unintended side-effects? By mastering the catalogs and the identity graph, you transition from a passive consumer of database security to an architect who builds systems that are **Secure by Construction**.