System Design Interview Framework - System Design Tutorial | TechLead

System design interviews evaluate your ability to design large-scale systems under ambiguity. Unlike coding interviews with clear right answers, system design interviews test your ability to make reasonable trade-offs, communicate clearly, and demonstrate breadth of knowledge. This guide provides a structured framework that works for any system design question.

What Interviewers Are Looking For

Communication: Can you articulate your thinking clearly?
Problem-solving approach: Do you work methodically or jump to solutions?
Trade-off analysis: Can you evaluate different options and justify your choices?
Technical depth: Do you understand the technologies you propose using?
Scalability awareness: Can you identify and address bottlenecks?
Practical experience: Does your design reflect real-world considerations?

Step 1: Clarify Requirements (5 minutes)

This is the most important step. Many candidates fail because they design the wrong system. Ask questions to understand scope, constraints, and priorities.

Functional Requirements

What does the system need to do? Be specific about core features vs. nice-to-haves.

// Example: "Design a URL shortener"
// Questions to ask:
// - Should users be able to customize short URLs?
// - Do short URLs expire?
// - Do we need analytics (click counts, geographic data)?
// - Is there authentication or can anyone create URLs?
// - What is the expected format of short URLs?

interface FunctionalRequirements {
  core: string[];     // Must have
  secondary: string[]; // Nice to have, time permitting
  outOfScope: string[]; // Explicitly excluded
}

const urlShortenerRequirements: FunctionalRequirements = {
  core: [
    "Create short URL from long URL",
    "Redirect short URL to original URL",
    "Short URLs should be unique and non-guessable",
  ],
  secondary: [
    "Custom aliases",
    "URL expiration",
    "Click analytics",
  ],
  outOfScope: [
    "User authentication",
    "Rate limiting UI",
    "Admin dashboard",
  ],
};

Non-Functional Requirements

These drive your architecture decisions more than the functional requirements do.

interface NonFunctionalRequirements {
  scale: {
    dailyActiveUsers: number;
    readWriteRatio: number;
    peakMultiplier: number;
  };
  performance: {
    readLatency: string;   // e.g., "< 50ms for 99th percentile"
    writeLatency: string;  // e.g., "< 200ms"
  };
  availability: string;    // e.g., "99.99%"
  consistency: string;     // "strong" or "eventual"
  durability: string;      // "zero data loss"
}

// Key questions:
// - How many users? DAU/MAU?
// - What is the read/write ratio?
// - What are the latency requirements?
// - What consistency level is needed?
// - What are the availability requirements?
// - Is there geographic distribution?
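
An availability target is easier to reason about once it is converted into a downtime budget. The following is a minimal sketch, assuming a 365-day year; the helper name `downtimeMinutesPerYear` is illustrative and not part of the framework itself.

```typescript
// Convert an availability percentage (e.g., 99.99) into allowed downtime.
// Assumes a 365-day year; the function name is illustrative.
function downtimeMinutesPerYear(availabilityPercent: number): number {
  const minutesPerYear = 365 * 24 * 60; // 525,600
  return minutesPerYear * (1 - availabilityPercent / 100);
}

for (const target of [99, 99.9, 99.99]) {
  const minutes = downtimeMinutesPerYear(target).toFixed(1);
  console.log(`${target}% availability -> ${minutes} minutes of downtime per year`);
}
```

At 99.99% the budget is roughly 52.6 minutes per year, which usually rules out designs with a single point of failure.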

Step 2: Back-of-Envelope Estimation (3-5 minutes)

Do quick calculations to validate your design direction. Focus on the numbers that influence architecture decisions.

// Example: URL shortener estimation
// Given: 100M URLs created per month

// Write QPS
const urlsPerMonth = 100_000_000;
const writeQPS = urlsPerMonth / (30 * 24 * 3600); // ~40 writes/sec

// Read QPS (assuming 100:1 read:write ratio)
const readQPS = writeQPS * 100; // ~4,000 reads/sec
const peakReadQPS = readQPS * 3; // ~12,000 reads/sec at peak

// Storage (5-year horizon)
const urlsIn5Years = urlsPerMonth * 12 * 5; // 6 billion URLs
const avgRecordSize = 500; // bytes (short URL + long URL + metadata)
const totalStorage = urlsIn5Years * avgRecordSize; // 3 TB

// Bandwidth
const readBandwidth = readQPS * avgRecordSize; // ~2 MB/s (trivial)

// Conclusions:
// - QPS is manageable with a single database + caching
// - 3 TB fits on a single machine but sharding is reasonable
// - Caching hot URLs in Redis will handle most reads
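
The caching conclusion can be checked with a rough memory estimate. The sketch below assumes that caching 20% of one day's read volume is enough to cover the hot URLs; that 20% figure is a rule-of-thumb assumption, not a number from the requirements, and it overestimates because many reads repeat the same URL.

```typescript
// Rough cache sizing for the URL shortener (all inputs are estimates).
const writesPerSecond = 100_000_000 / (30 * 24 * 3600); // ~38.6
const readsPerSecond = writesPerSecond * 100;           // ~3,858 at 100:1
const readsPerDay = readsPerSecond * 24 * 3600;         // ~333 million

const hotFraction = 0.2; // assumption: share of daily reads worth caching
const recordBytes = 500; // same record size as the storage estimate

const cacheBytes = readsPerDay * hotFraction * recordBytes;
const cacheGB = cacheBytes / 1e9; // ~33 GB

console.log(`Estimated cache size: ~${Math.round(cacheGB)} GB`);
```

About 33 GB fits in the memory of a single large cache node, which supports starting with one Redis instance and adding replicas only for availability.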

Step 3: High-Level Design (10 minutes)

Draw the major components and their interactions. Start simple and add complexity as needed.

// High-level components for URL shortener
// 
// Client -> Load Balancer -> API Servers -> Database
//                                       -> Cache (Redis)
//
// Write flow:
// 1. Client sends POST /shorten { url: "https://long-url.com" }
// 2. API server generates short code
// 3. Store mapping in database
// 4. Return short URL to client
//
// Read flow:
// 1. Client sends GET /{shortCode}
// 2. Check Redis cache first
// 3. If cache miss, query database
// 4. Cache the result in Redis
// 5. Return 301/302 redirect to original URL

// Component decisions to state and justify:
const architectureDecisions = {
  loadBalancer: "AWS ALB - distributes traffic, SSL termination",
  apiServers: "Node.js - handles HTTP, stateless, horizontally scalable",
  database: "PostgreSQL - reliable, handles 40 writes/sec easily",
  cache: "Redis - in-memory, sub-ms reads for hot URLs",
  idGeneration: "Base62 encoding of auto-increment ID or pre-generated IDs",
};
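
The write and read flows above can be sketched in a few lines. This is a minimal illustration, not a production implementation: `CountingStore` and `UrlShortener` are hypothetical names, and synchronous in-memory maps stand in for Redis and PostgreSQL, whose real clients would be asynchronous.

```typescript
// Synchronous in-memory store standing in for Redis or PostgreSQL.
// Real clients would be asynchronous; all names here are illustrative.
class CountingStore {
  private data = new Map<string, string>();
  reads = 0; // number of get() calls, used to show cache hits

  get(key: string): string | undefined {
    this.reads += 1;
    return this.data.get(key);
  }

  set(key: string, value: string): void {
    this.data.set(key, value);
  }
}

class UrlShortener {
  private cache: CountingStore;
  private database: CountingStore;
  private nextCode: () => string;

  constructor(cache: CountingStore, database: CountingStore, nextCode: () => string) {
    this.cache = cache;
    this.database = database;
    this.nextCode = nextCode;
  }

  // Write flow: generate a short code, store the mapping, return the code.
  shorten(longUrl: string): string {
    const code = this.nextCode();
    this.database.set(code, longUrl);
    return code;
  }

  // Read flow (cache-aside): cache first, database on a miss, then fill the cache.
  resolve(code: string): string | undefined {
    const cached = this.cache.get(code);
    if (cached !== undefined) return cached;
    const longUrl = this.database.get(code);
    if (longUrl !== undefined) this.cache.set(code, longUrl);
    return longUrl; // undefined means the caller should respond with 404
  }
}

const demoCache = new CountingStore();
const demoDatabase = new CountingStore();
let demoCounter = 0;
const shortener = new UrlShortener(demoCache, demoDatabase, () => `code${demoCounter++}`);

const demoCode = shortener.shorten("https://long-url.com");
shortener.resolve(demoCode); // cache miss: reads the database
shortener.resolve(demoCode); // cache hit: the database is not touched
console.log(demoDatabase.reads); // 1
```

The second lookup never reaches the database, which is the behavior that lets a small cache absorb most of the read traffic.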

Step 4: Detailed Design (15 minutes)

Dive deep into the 2-3 components that are most interesting or challenging. The interviewer may guide you to specific areas.

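// Deep dive: Short URL generation strategy
import { createHash } from "node:crypto";

const BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

// Encode a non-negative integer as a base62 string.
function base62Encode(value: bigint): string {
  let out = "";
  do {
    out = BASE62[Number(value % 62n)] + out;
    value /= 62n;
  } while (value > 0n);
  return out;
}

// Option 1: Hash-based (MD5/SHA-256 + truncate)
function hashBased(longUrl: string): string {
  const hex = createHash("md5").update(longUrl).digest("hex");
  // Encode the 128-bit digest and keep the low-order 7 characters.
  // Problem: truncation makes collisions possible, so the caller must
  // check the database and retry (e.g., with a salt appended to the URL)
  return base62Encode(BigInt("0x" + hex)).slice(-7).padStart(7, "0");
}

// Option 2: Counter-based (auto-increment + base62)
function counterBased(counter: bigint): string {
  // Problem: predictable, sequential
  // A distributed ID generator (Snowflake-style) removes the single-counter
  // bottleneck, but its IDs are still time-ordered, so they need
  // obfuscation if codes must be non-guessable
  return base62Encode(counter);
}

// Option 3: Pre-generated key service
class KeyGenerationService {
  // Pre-generate millions of unique keys
  // Store in a database table
  // API servers fetch keys in batches (e.g., 1000 at a time)
  // Mark keys as used

  private localKeys: string[] = [];
  private fetchKeyBatch: (size: number) => Promise<string[]>;

  // fetchKeyBatch must atomically claim `size` unused keys from the key database
  constructor(fetchKeyBatch: (size: number) => Promise<string[]>) {
    this.fetchKeyBatch = fetchKeyBatch;
  }

  async getKey(): Promise<string> {
    if (this.localKeys.length === 0) {
      // Fetch a batch from the key database; append rather than assign so
      // concurrent refills do not discard each other's keys
      this.localKeys.push(...(await this.fetchKeyBatch(1000)));
    }
    const key = this.localKeys.pop();
    if (key === undefined) throw new Error("Key pool exhausted");
    return key;
  }
}

// Decision: Option 3 (pre-generated keys) because:
// - No collision handling needed
// - Keys are not sequential (harder to enumerate)
// - Scales well (batch fetching reduces DB calls)
// - Trade-off: need to manage the key pool, and keys held by a
//   crashed server are lost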
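
Whichever generation strategy is chosen, the code length has to cover the expected key count. The sketch below is a small worked calculation, with the hypothetical helpers `minBase62Length` and `keySpaceOccupancy`, using the 6 billion URL figure from the estimation step.

```typescript
// Smallest code length whose base62 key space covers `totalKeys`.
function minBase62Length(totalKeys: number): number {
  let length = 1;
  let capacity = 62;
  while (capacity < totalKeys) {
    capacity *= 62;
    length += 1;
  }
  return length;
}

// Fraction of the key space that is in use at a given code length.
function keySpaceOccupancy(totalKeys: number, length: number): number {
  return totalKeys / Math.pow(62, length);
}

const fiveYearUrls = 6_000_000_000; // from the Step 2 estimate

console.log(minBase62Length(fiveYearUrls));                    // 6
console.log(keySpaceOccupancy(fiveYearUrls, 6).toFixed(3));    // 0.106
console.log(keySpaceOccupancy(fiveYearUrls, 7).toFixed(4));    // 0.0017
```

Six characters are enough to hold 6 billion codes, but about one in ten random guesses would hit a valid URL. Seven characters (about 3.5 trillion codes) drop that to under 0.2%, which fits the non-guessable requirement better.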

Step 5: Bottlenecks and Trade-offs (5 minutes)

Proactively identify weaknesses in your design and propose mitigations. This demonstrates mature engineering thinking.

Bottleneck	Impact	Mitigation
Single database	Write throughput limit, single point of failure	Add a standby with automatic failover and read replicas for availability, shard by hash of short code for write throughput
Cache eviction	Cold cache causes database overload	Warm cache on startup, use LRU eviction, cache frequently accessed URLs longer
Hot keys	Viral URLs overload a single cache node	Replicate hot keys across multiple cache nodes
Abuse / spam	Malicious users create millions of URLs	Rate limiting per IP/user, CAPTCHA for unauthenticated users
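
The rate-limiting mitigation in the last row is often implemented as a token bucket. The following is a minimal in-process sketch under stated assumptions: `TokenBucketLimiter` is an illustrative name, the clock is injectable for testing, and a multi-server deployment would need to keep bucket state in a shared store such as Redis rather than in local memory.

```typescript
// Token bucket: each client holds up to `capacity` tokens that refill at
// `refillPerSecond`. A request is allowed only if a whole token is available.
class TokenBucketLimiter {
  private buckets = new Map<string, { tokens: number; updatedAt: number }>();
  private capacity: number;
  private refillPerSecond: number;
  private now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // returns milliseconds; injectable so tests can control time
  }

  allow(clientId: string): boolean {
    const currentTime = this.now();
    const bucket = this.buckets.get(clientId) ?? { tokens: this.capacity, updatedAt: currentTime };

    // Refill based on elapsed time, never exceeding the bucket capacity.
    const elapsedSeconds = (currentTime - bucket.updatedAt) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.updatedAt = currentTime;

    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(clientId, bucket);
    return allowed;
  }
}

// Usage: allow bursts of 5 URL creations per IP, refilling 1 per second.
const createLimiter = new TokenBucketLimiter(5, 1);
console.log(createLimiter.allow("203.0.113.7")); // true for the first request
```

A token bucket tolerates short bursts while capping the sustained rate, which suits a create-URL endpoint better than a hard per-second cutoff.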

Common Mistakes to Avoid

Jumping to the solution. Spending 30 seconds on requirements and 35 minutes on design is backwards. Requirements drive everything.
Over-engineering. Using Kafka, Kubernetes, and microservices for a system that serves 100 users. Start simple, scale when needed.
Not doing estimation. Without numbers, you cannot justify whether a single server or 100 servers is needed.
Monologuing. The interview is a conversation, not a presentation. Check in with the interviewer regularly.
Ignoring trade-offs. Every design decision has pros and cons. State them explicitly.
Not knowing your tools. If you propose using Cassandra, you should know when it is better or worse than PostgreSQL.
Focusing on the wrong details. Spending 10 minutes on the database schema when the interviewer wants to discuss caching strategy.
Forgetting about failures. What happens when a server crashes? When the database is down? When the network partitions?
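
The point about estimation can be made concrete with a server-count calculation. This is a rough sketch: `serversNeeded` is a hypothetical helper, and the per-server throughput of 1,000 QPS and the 70% target utilization are assumptions that would come from load testing in practice.

```typescript
// Estimate how many stateless API servers the peak load requires.
// perServerQPS and targetUtilization are assumptions; measure them with a load test.
function serversNeeded(peakQPS: number, perServerQPS: number, targetUtilization: number): number {
  return Math.ceil(peakQPS / (perServerQPS * targetUtilization));
}

console.log(serversNeeded(12_000, 1_000, 0.7)); // 18 servers at URL-shortener peak reads
console.log(serversNeeded(100, 1_000, 0.7));    // 1 server for a small internal tool
```

The same formula shows when one server is enough and when a fleet is justified, which is exactly the argument an interviewer wants to hear.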

Tips from Interviewers

Drive the conversation. The best candidates lead the discussion while staying open to feedback.
Think out loud. Interviewers cannot evaluate what they cannot hear. Share your reasoning, even when uncertain.
Be honest about what you do not know. Saying "I am not sure exactly how Kafka handles this, but I believe..." is better than making something up.
Use real numbers and real technologies. "We will use PostgreSQL, which needs to handle an estimated 10,000 writes/sec" is better than "We will use a database".
Address the interviewer's hints. If they ask "what about failure scenarios?", spend time on that topic.

Practice Roadmap

Build your system design skills progressively:

Week	Focus Area	Practice Problems
1-2	Fundamentals	URL shortener, paste bin, key-value store
3-4	Data-intensive systems	Twitter feed, news feed, notification system
5-6	Real-time systems	Chat application, live streaming, collaborative editing
7-8	Complex systems	Ride sharing, payment system, search engine
9-10	Mock interviews	Practice with peers, timed sessions, feedback loops
