
PostgreSQL Performance Tuning

Optimize PostgreSQL with EXPLAIN ANALYZE, configuration tuning, connection pooling, and VACUUM strategies.

Reading EXPLAIN ANALYZE

EXPLAIN ANALYZE is the single most important tool for understanding query performance. It runs the query and shows the actual execution plan with timing information.

-- Always use ANALYZE for real timing data (BUFFERS adds I/O info)
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.name
ORDER BY order_count DESC
LIMIT 10;

-- Key things to look for:
-- Seq Scan on large tables     -> add an index
-- Nested Loop with many rows   -> consider Hash Join
-- Sort with external merge     -> increase work_mem
-- Rows estimated vs actual     -> run ANALYZE on the table
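For example, if the plan above shows a Seq Scan on orders, a composite index covering the join key and the date filter is the usual fix (the index name here is illustrative):

-- Supports the join on user_id plus the created_at range filter;
-- CONCURRENTLY avoids blocking writes but cannot run inside a transaction
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_user_created
  ON orders (user_id, created_at);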

Query Optimization Patterns

-- PROBLEM: subquery runs for every row
SELECT name, (SELECT COUNT(*) FROM orders WHERE user_id = u.id) AS cnt
FROM users u;

-- BETTER: lateral join or regular join with GROUP BY
SELECT u.name, COUNT(o.id) AS cnt
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.name;
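The LATERAL form mentioned above computes the same counts and becomes the better choice when each user needs an ORDER BY or LIMIT of its own (a minimal sketch):

-- The subquery runs per user row, but the planner handles it well,
-- and the pattern generalizes to per-user top-N queries
SELECT u.name, o.cnt
FROM users u
LEFT JOIN LATERAL (
  SELECT COUNT(*) AS cnt FROM orders WHERE user_id = u.id
) o ON true;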

-- PROBLEM: OR conditions prevent index usage
SELECT * FROM orders WHERE customer_email = 'a@b.com' OR phone = '555-1234';

-- BETTER: UNION ALL with separate index scans
SELECT * FROM orders WHERE customer_email = 'a@b.com'
UNION ALL
SELECT * FROM orders WHERE phone = '555-1234'
  AND customer_email IS DISTINCT FROM 'a@b.com';  -- != would silently drop NULL emails
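The rewrite only pays off if each branch has an index to scan (index names are illustrative):

-- One index per branch so each SELECT becomes an index scan
CREATE INDEX IF NOT EXISTS idx_orders_email ON orders (customer_email);
CREATE INDEX IF NOT EXISTS idx_orders_phone ON orders (phone);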

-- PROBLEM: counting all rows is slow
SELECT COUNT(*) FROM large_table;

-- WORKAROUND: approximate count from stats
SELECT reltuples::bigint AS estimate
FROM pg_class WHERE relname = 'large_table';
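Because the estimate comes from planner statistics, it is only as current as the last VACUUM or ANALYZE; the same statistics also give cheap estimates for filtered counts (the status column is a hypothetical example):

-- Refresh statistics so reltuples tracks reality
ANALYZE large_table;

-- For a filtered count, read "Plan Rows" from the plan instead of scanning
EXPLAIN (FORMAT JSON) SELECT * FROM large_table WHERE status = 'active';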

Essential Configuration Tuning

Default PostgreSQL settings are conservative. Tune these based on your available RAM and workload.

# For a server with 16 GB RAM (postgresql.conf syntax; '#' starts a comment)

# Memory for the shared buffer cache (~25% of RAM)
shared_buffers = '4GB'

# Memory per sort/hash operation; allocated per operation, per connection,
# so a value this high assumes few concurrent queries
work_mem = '256MB'

# Planner's estimate of OS cache size (50-75% of RAM)
effective_cache_size = '12GB'

# WAL / checkpoint settings
wal_buffers = '64MB'
checkpoint_completion_target = 0.9
min_wal_size = '1GB'
max_wal_size = '4GB'

# Parallel query workers
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
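These values can also be applied from a SQL session with ALTER SYSTEM, which writes them to postgresql.auto.conf; note that shared_buffers and wal_buffers only take effect after a full restart.

-- Apply a reloadable setting without editing postgresql.conf
ALTER SYSTEM SET work_mem = '256MB';
SELECT pg_reload_conf();   -- reloadable settings take effect immediately
SHOW work_mem;             -- verify the running value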

Connection Pooling with PgBouncer

PostgreSQL forks a backend process per connection (roughly 10 MB each), so 500 direct connections mean about 5 GB of memory overhead plus context switching. The solution is a connection pooler in front of PostgreSQL.

; PgBouncer config (pgbouncer.ini); comments use ';' on their own line
[databases]
app_db = host=localhost port=5432 dbname=app_db

[pgbouncer]
listen_port = 6432
; release the server connection after each transaction
; (SET, prepared statements, and other session state will not persist)
pool_mode = transaction
; clients connect to PgBouncer, not PostgreSQL
max_client_conn = 1000
; PgBouncer opens only this many real server connections
default_pool_size = 20
reserve_pool_size = 5

The application connects to PgBouncer on port 6432 instead of 5432:
postgresql://app_user:secret@localhost:6432/app_db
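To confirm the pool is behaving, PgBouncer exposes an admin console as a virtual database named pgbouncer (this assumes an admin account is listed in admin_users):

-- Connect with: psql -p 6432 -U admin_user pgbouncer
SHOW POOLS;   -- active and waiting clients, server connections per pool
SHOW STATS;   -- per-database query counts, bytes, and wait times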

VACUUM, ANALYZE & Monitoring

-- Find tables needing VACUUM (high dead tuple ratio)
SELECT
  schemaname, relname,
  n_live_tup, n_dead_tup,
  ROUND(n_dead_tup::numeric / NULLIF(n_live_tup, 0) * 100, 1) AS dead_pct,
  last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
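For a table this query flags, a one-off manual pass reclaims dead tuples immediately and refreshes statistics at the same time (events is the same high-churn table tuned below):

-- Reclaim dead tuples and update planner stats in one pass
VACUUM (ANALYZE, VERBOSE) events;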

-- Tune autovacuum for a high-churn table
ALTER TABLE events SET (
  autovacuum_vacuum_scale_factor = 0.01,   -- trigger at 1% dead rows
  autovacuum_analyze_scale_factor = 0.005,
  autovacuum_vacuum_cost_delay = 2          -- ms between throttle sleeps; lower = more aggressive
);

-- Find slow queries with pg_stat_statements
-- (requires shared_preload_libraries = 'pg_stat_statements' and a restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT
  calls, mean_exec_time::numeric(10,2) AS avg_ms,
  total_exec_time::numeric(10,2) AS total_ms,
  query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
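Ordering by mean_exec_time surfaces individually slow queries; ordering by total_exec_time instead surfaces the cumulative heavy hitters. After a tuning pass, reset the counters so before-and-after comparisons stay clean:

-- Start a fresh measurement window
SELECT pg_stat_statements_reset();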

Performance Checklist

  • Run EXPLAIN (ANALYZE, BUFFERS) on slow queries before adding indexes
  • Run ANALYZE after bulk data loads so the planner has fresh statistics
  • Enable pg_stat_statements in production; it is the best way to find slow queries
  • Use a connection pooler (PgBouncer or pgcat) for apps with many concurrent connections
  • Set log_min_duration_statement = 1000 to log queries slower than 1 second
  • Monitor pg_stat_user_tables for sequential scans on large tables (see the query below)
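A starting point for that last check (the 100 MB threshold is an arbitrary example):

-- Large tables read mostly by sequential scan
SELECT relname, seq_scan, idx_scan,
       pg_size_pretty(pg_relation_size(relid)) AS table_size
FROM pg_stat_user_tables
WHERE seq_scan > COALESCE(idx_scan, 0)
  AND pg_relation_size(relid) > 100 * 1024 * 1024
ORDER BY seq_scan DESC;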
