The Goal of Data Modeling
A well-designed schema does more than organize data — it encodes your domain's business rules directly into the database. This means bugs from incorrect data (null where a value is required, duplicate emails, orphaned rows) are caught by the database before they reach your application layer. Good modeling also makes queries simpler and faster because the data is structured in a way that mirrors how you need to retrieve it.
Normalization: Eliminate Redundancy
Normalization is the process of organizing columns and tables to reduce duplicated data. The most common target is Third Normal Form (3NF): each non-key column depends on the primary key, and nothing else.
- 1NF: Each column holds a single atomic value (no comma-separated lists in a cell).
- 2NF: All columns depend on the whole primary key (no partial dependencies in composite keys).
- 3NF: No column depends on another non-key column (no transitive dependencies).
When in doubt: if you find yourself copying the same value into multiple rows, it probably belongs in its own table.
Constraints: Enforcing Business Rules
CREATE TABLE users (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
username TEXT NOT NULL UNIQUE CHECK (length(username) >= 3),
role TEXT NOT NULL DEFAULT 'member'
CHECK (role IN ('member', 'admin', 'moderator')),
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
deleted_at TIMESTAMPTZ -- NULL = active
);
CREATE TABLE posts (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE CASCADE,
slug TEXT NOT NULL UNIQUE,
title TEXT NOT NULL CHECK (length(title) > 0),
body TEXT,
published BOOLEAN NOT NULL DEFAULT false,
published_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
Foreign Keys and Referential Integrity
Foreign keys ensure that a referenced row actually exists. The ON DELETE clause controls what happens when the referenced row is deleted:
-- Cascade: delete comments when the post is deleted
post_id BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE
-- Restrict (default): prevent deletion if child rows exist
user_id BIGINT NOT NULL REFERENCES users(id) ON DELETE RESTRICT
-- Set NULL: keep the comment but clear the author reference
author_id BIGINT REFERENCES users(id) ON DELETE SET NULL
-- Always index foreign key columns — Postgres does NOT do this automatically
CREATE INDEX idx_posts_user_id ON posts (user_id);
CREATE INDEX idx_comments_post_id ON comments (post_id);
Many-to-Many Relationships
-- Junction table for post tags
CREATE TABLE tags (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name TEXT NOT NULL UNIQUE
);
CREATE TABLE post_tags (
post_id BIGINT NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
tag_id BIGINT NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
PRIMARY KEY (post_id, tag_id) -- composite PK prevents duplicates
);
-- Query all tags for a post
SELECT t.name
FROM post_tags pt
JOIN tags t ON t.id = pt.tag_id
WHERE pt.post_id = 42;
Schema Design Best Practices
- • Use
NOT NULLon every column that must always have a value. Nullable columns should be the exception. - • Add
UNIQUEconstraints for natural keys like email addresses, slugs, and SKUs. - • Use
TIMESTAMPTZ(with timezone) rather than plainTIMESTAMPfor all date/time columns. - • Soft-delete with a
deleted_atcolumn instead of physically removing rows when you need an audit trail. - • Add indexes on every foreign key column — Postgres never creates them automatically.
- • Use
CHECKconstraints for simple domain validation (minimum length, allowed values).