Choosing the Right Data Type
Picking the correct column type is one of the most impactful database decisions you can make. The right type enforces correctness at the storage level, enables efficient indexing, and reduces the amount of validation logic your application code needs to duplicate.
-- Text and strings
TEXT -- variable-length, no arbitrary limit (preferred)
VARCHAR(n) -- variable-length with limit (use TEXT + CHECK instead)
CHAR(n) -- fixed-length, padded with spaces (rarely needed)
-- Numbers
SMALLINT -- 2 bytes, -32768 to 32767
INTEGER -- 4 bytes, ~-2 billion to 2 billion (most common)
BIGINT -- 8 bytes, very large integers
SERIAL / BIGSERIAL -- auto-incrementing integers (legacy; prefer GENERATED)
NUMERIC(p, s) -- exact decimal, e.g. NUMERIC(10,2) for money
REAL / DOUBLE PRECISION -- floating point (avoid for money due to rounding)
-- Booleans
BOOLEAN -- true / false / null
-- Date & time (store in UTC, convert on display)
DATE -- calendar date only (e.g. 2024-01-15)
TIME -- time of day without timezone
TIMESTAMP -- date + time without timezone
TIMESTAMPTZ -- date + time WITH timezone (strongly preferred)
INTERVAL -- duration, e.g. INTERVAL '3 days'
-- Identifiers
UUID -- universally unique identifier (128-bit)
SERIAL -- auto-increment integer (see GENERATED ALWAYS AS IDENTITY)
-- Structured / semi-structured
JSONB -- binary JSON (indexed, preferred over JSON)
ARRAY -- e.g. TEXT[], INTEGER[]
Modern Primary Keys
Postgres 10+ offers GENERATED ALWAYS AS IDENTITY which is cleaner than the older SERIAL shorthand. For distributed systems, UUIDs avoid conflicts when merging data from multiple sources.
-- Modern auto-increment (preferred over SERIAL)
CREATE TABLE posts (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
title TEXT NOT NULL
);
-- UUID primary key (great for distributed systems)
CREATE EXTENSION IF NOT EXISTS "pgcrypto";
CREATE TABLE events (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
payload JSONB NOT NULL
);
JSON vs JSONB
Postgres has two JSON types. JSON stores the exact text input and re-parses it on every access. JSONB converts JSON to an optimized binary format on write, deduplicates keys, and supports GIN indexes. Use JSONB unless you need to preserve exact key ordering or whitespace.
CREATE TABLE webhook_events (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
source TEXT NOT NULL,
payload JSONB NOT NULL,
received_at TIMESTAMPTZ DEFAULT NOW()
);
-- JSONB operators
-- -> returns a JSON value, ->> returns text
SELECT
payload->>'type' AS event_type,
payload->'user'->>'email' AS user_email,
payload#>>'{order,total}' AS order_total -- path navigation
FROM webhook_events
WHERE payload->>'type' = 'checkout.completed'
AND (payload->'amount')::numeric > 100;
Indexing JSONB
A GIN index on a JSONB column dramatically speeds up containment queries (@>) and key existence checks (?).
-- GIN index for containment and key checks
CREATE INDEX idx_events_payload ON webhook_events USING GIN (payload);
-- Now this query uses the index:
SELECT * FROM webhook_events
WHERE payload @> '{"type": "payment.failed"}';
-- Targeted index on a specific JSONB path
CREATE INDEX idx_events_type
ON webhook_events ((payload->>'type'));
-- Use this index with:
WHERE payload->>'type' = 'checkout.completed'
JSONB Best Practices
- • Prefer JSONB over JSON for any column you will query or index.
- • Extract frequently-queried keys into dedicated columns for better performance and constraints.
- • Use JSONB for truly variable or schemaless data (webhook payloads, audit logs, feature flags).
- • Add a GIN index immediately if you query by containment (
@>) or key existence (?).