Reading EXPLAIN ANALYZE
EXPLAIN ANALYZE is the single most important tool for understanding query performance. It executes the query and shows the actual plan with per-node timing. Because ANALYZE really runs the statement, wrap data-modifying queries in BEGIN ... ROLLBACK to avoid side effects.
-- Always use ANALYZE for real timing data (BUFFERS adds I/O info)
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.name
ORDER BY order_count DESC
LIMIT 10;
-- Key things to look for:
-- Seq Scan on large tables -> add an index
-- Nested Loop with many rows -> consider Hash Join
-- Sort with external merge -> increase work_mem
-- Rows estimated vs actual -> run ANALYZE on the table
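As a sketch of the first fix (the index name is illustrative): the query above filters orders on user_id and created_at, so a composite index with the equality column first typically turns the Seq Scan into an index scan:

```sql
-- Composite index for the join + date filter above (name is illustrative)
-- Equality column (user_id) first, range column (created_at) second
CREATE INDEX CONCURRENTLY idx_orders_user_created
    ON orders (user_id, created_at);
```

CONCURRENTLY builds the index without blocking writes, but it cannot run inside a transaction block.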
Query Optimization Patterns
-- PROBLEM: subquery runs for every row
SELECT name, (SELECT COUNT(*) FROM orders WHERE user_id = u.id) AS cnt
FROM users u;
-- BETTER: lateral join or regular join with GROUP BY
SELECT u.name, COUNT(o.id) AS cnt
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.name;
-- PROBLEM: OR across different columns often blocks a single index scan
SELECT * FROM orders WHERE customer_email = 'a@b.com' OR phone = '555-1234';
-- BETTER: UNION ALL with separate index scans
SELECT * FROM orders WHERE customer_email = 'a@b.com'
UNION ALL
SELECT * FROM orders WHERE phone = '555-1234'
AND customer_email IS DISTINCT FROM 'a@b.com';  -- NULL-safe: != would drop rows with a NULL email
-- PROBLEM: counting all rows is slow
SELECT COUNT(*) FROM large_table;
-- WORKAROUND: approximate count from stats
SELECT reltuples::bigint AS estimate
FROM pg_class WHERE relname = 'large_table';
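The reltuples figure is only as accurate as the table's statistics, so if the estimate looks stale, refresh it first:

```sql
-- reltuples is updated by VACUUM and ANALYZE, not by ordinary writes
ANALYZE large_table;
```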
Essential Configuration Tuning
Default PostgreSQL settings are conservative. Tune these based on your available RAM and workload.
# Example values for a server with 16 GB RAM (postgresql.conf; comments use '#')
# Shared buffer cache (~25% of RAM; changing it requires a restart)
shared_buffers = '4GB'
# Memory per sort/hash operation, per plan node, per connection; size carefully
work_mem = '256MB'
# Planner's estimate of total cache (OS + shared_buffers), 50-75% of RAM; allocates nothing
effective_cache_size = '12GB'
# WAL / checkpoint settings
wal_buffers = '64MB'
checkpoint_completion_target = 0.9
min_wal_size = '1GB'
max_wal_size = '4GB'
# Parallel query workers
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
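These can also be set without hand-editing postgresql.conf via ALTER SYSTEM, which writes to postgresql.auto.conf; note that of the three settings shown here, only shared_buffers needs a full restart, while the other two apply on reload:

```sql
ALTER SYSTEM SET work_mem = '256MB';
ALTER SYSTEM SET effective_cache_size = '12GB';
ALTER SYSTEM SET shared_buffers = '4GB';  -- requires a server restart
SELECT pg_reload_conf();                  -- applies the reloadable settings
```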
Connection Pooling with PgBouncer
-- PostgreSQL forks a process per connection (~10 MB each)
-- 500 connections = 5 GB overhead + context switching
-- Solution: connection pooler in front of PostgreSQL
; PgBouncer config (pgbouncer.ini); comments in the ini use ';'
[databases]
app_db = host=localhost port=5432 dbname=app_db

[pgbouncer]
listen_port = 6432
; release the server connection back to the pool after each transaction
pool_mode = transaction
; up to 1000 clients connect to PgBouncer, which opens only 20 real connections
max_client_conn = 1000
default_pool_size = 20
reserve_pool_size = 5
-- Application connects to PgBouncer on port 6432 instead of 5432
-- postgresql://app_user:secret@localhost:6432/app_db
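To confirm the pool is behaving, PgBouncer's admin console (a virtual database named pgbouncer on the listen port) accepts SHOW commands, assuming the connecting user is listed in admin_users:

```sql
-- e.g. psql -p 6432 -U admin_user pgbouncer
SHOW POOLS;   -- waiting clients vs. server connections in use, per pool
SHOW STATS;   -- per-database query counts and traffic totals
```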
VACUUM, ANALYZE & Monitoring
-- Find tables needing VACUUM (high dead tuple ratio)
SELECT
schemaname, relname,
n_live_tup, n_dead_tup,
ROUND(n_dead_tup::numeric / NULLIF(n_live_tup, 0) * 100, 1) AS dead_pct,
last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
-- Tune autovacuum for a high-churn table
ALTER TABLE events SET (
autovacuum_vacuum_scale_factor = 0.01, -- trigger at 1% dead rows
autovacuum_analyze_scale_factor = 0.005,
autovacuum_vacuum_cost_delay = 2        -- milliseconds between cost-limited pauses; lower = more aggressive
);
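After a large delete or bulk update you do not have to wait for autovacuum; a manual pass reclaims dead tuples and refreshes statistics in one step:

```sql
-- Reclaim dead tuples and update planner statistics together
VACUUM (ANALYZE, VERBOSE) events;
```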
-- Find slow queries with pg_stat_statements
-- (the module must be in shared_preload_libraries; column names below are
--  PostgreSQL 13+, older versions use mean_time / total_time)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SELECT
calls, mean_exec_time::numeric(10,2) AS avg_ms,
total_exec_time::numeric(10,2) AS total_ms,
query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
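Counters in pg_stat_statements accumulate from the last reset, so a long-running instance can bury a fresh regression; resetting before a benchmark gives a clean measurement window:

```sql
SELECT pg_stat_statements_reset();  -- zero out all collected statistics
```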
Performance Checklist
- Run EXPLAIN (ANALYZE, BUFFERS) on slow queries before adding indexes
- Run ANALYZE after bulk data loads so the planner has fresh statistics
- Enable pg_stat_statements in production; it is the best way to find slow queries
- Use a connection pooler (PgBouncer or pgcat) for apps with many concurrent connections
- Set log_min_duration_statement = 1000 to log queries slower than 1 second
- Monitor pg_stat_user_tables for sequential scans on large tables
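The last checklist item can be checked directly from the statistics views; this sketch surfaces the tables whose rows are read mostly by sequential scans:

```sql
-- Tables ranked by rows read via sequential scans
SELECT relname,
       seq_scan,       -- number of sequential scans started
       seq_tup_read,   -- rows read by those scans
       idx_scan,       -- index scans, for comparison
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 10;
```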