What is Indexing & Performance?

Speed up queries with the right indexes and EXPLAIN

Indexing & Performance - PostgreSQL Tutorial | TechLead

How Indexes Work

Without an index, Postgres performs a Sequential Scan: it reads every row in the table to find matches. For a table with millions of rows, this is extremely slow. An index is a separate data structure (usually a B-tree) that keeps a sorted copy of one or more column values along with pointers to the corresponding heap pages, allowing the planner to jump directly to matching rows in O(log n) time.

The trade-off: indexes consume disk space and slow down writes (INSERT/UPDATE/DELETE must maintain every index on the table). Only index columns you actually query on.

Index Types

B-tree (default) — equality, range, sorting. Works with =, <, >, BETWEEN, LIKE ‘foo%’.
Hash — equality only (=). Slightly smaller than B-tree for simple equality lookups.
GIN — inverted index for JSONB, arrays, and full-text search (tsvector).
GiST — generalized search tree for geometry, ranges, and nearest-neighbor queries.
BRIN — block range index for very large, naturally ordered tables (timestamps on an append-only log).

Creating Indexes

-- Basic B-tree (equality + range)
CREATE INDEX idx_users_email ON users (email);

-- Composite index (order matters: leftmost prefix rule)
CREATE INDEX idx_orders_status_date ON orders (status, created_at DESC);

-- Partial index (only index a subset of rows)
CREATE INDEX idx_users_active ON users (email)
  WHERE deleted_at IS NULL;

-- Expression index (index the result of a function)
CREATE INDEX idx_users_lower_email ON users (lower(email));

-- GIN index for JSONB
CREATE INDEX idx_events_payload ON webhook_events USING GIN (payload);

-- GIN index for full-text search
CREATE INDEX idx_posts_search ON posts
  USING GIN (to_tsvector(‘english’, title || ‘ ‘ || body));

-- Unique index (also enforces the constraint)
CREATE UNIQUE INDEX idx_products_sku ON products (sku)
  WHERE deleted_at IS NULL;  -- partial unique constraint

Reading EXPLAIN ANALYZE

Use EXPLAIN ANALYZE to see how Postgres plans and executes a query. The key things to look for are scan type, actual rows vs. estimated rows, and total execution time.

EXPLAIN ANALYZE
SELECT u.name, p.title
FROM   posts p
JOIN   users u ON u.id = p.user_id
WHERE  p.published = true
  AND  p.created_at > NOW() - INTERVAL ‘7 days’;

Key terms in EXPLAIN output:

Seq Scan — reading the whole table. Fine for small tables; slow for large ones.
Index Scan — using an index to find rows, then fetching heap pages. Good.
Index Only Scan — satisfying the query entirely from the index (no heap fetch). Best.
Hash Join / Merge Join / Nested Loop — different join strategies; the planner chooses based on row estimates.
actual time vs. estimated rows — large discrepancies suggest stale statistics (run ANALYZE).

Indexing Checklist

• Always index foreign key columns (Postgres does not do this automatically).
• Index columns used in WHERE, JOIN ON, and ORDER BY clauses that appear in slow queries.
• Use partial indexes to cover just the rows you commonly query (e.g., active users).
• Use CREATE INDEX CONCURRENTLY in production to avoid locking the table.
• Regularly check for unused indexes with pg_stat_user_indexes and drop them.
• Run ANALYZE table_name after bulk loads so the planner has fresh statistics.

Indexing & Performance

How Indexes Work

Index Types

Creating Indexes

Reading EXPLAIN ANALYZE

Indexing Checklist

Continue Learning

Data Engineering

Node.js

REST & APIs

System Design

Python