Back-of-the-Envelope Estimation
Back-of-the-envelope estimation is a critical skill for system design interviews and real-world architecture decisions. It allows engineers to quickly determine whether a design is feasible, identify bottlenecks, and make informed technology choices without building prototypes. The goal is not exact precision but rather correct order-of-magnitude reasoning.
Why Estimation Matters
- Determines if a single server or distributed system is needed
- Identifies which components need horizontal scaling
- Helps choose between different storage technologies
- Validates or invalidates architectural decisions before implementation
- Demonstrates structured thinking in interviews
Common Numbers Every Engineer Should Know
These are approximate values that serve as building blocks for estimation. Memorize the orders of magnitude, not the exact numbers.
Latency Numbers
| Operation | Latency | Notes |
|---|---|---|
| L1 cache reference | ~1 ns | On-CPU cache |
| L2 cache reference | ~4 ns | On-CPU cache |
| Main memory (RAM) reference | ~100 ns | DRAM access |
| SSD random read | ~150 µs | ~1,000x slower than RAM |
| HDD random read | ~10 ms | Mechanical seek time |
| Same datacenter round trip | ~0.5 ms | Network within DC |
| Cross-continent round trip | ~150 ms | Speed of light limit |
| Redis GET | ~0.5 ms | In-memory + network |
| PostgreSQL simple query | ~2-5 ms | Indexed query |
| External API call | ~50-200 ms | Third-party service |
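One way to internalize these gaps is to rescale them: if an L1 cache hit took one second, an HDD seek would take roughly four months. A minimal sketch of that rescaling, using the approximate values from the table above:

// Rescale latencies so that an L1 cache reference takes "1 second".
// Values are the approximate figures from the table above, in nanoseconds.
const latencyNs: Record<string, number> = {
  "L1 cache reference": 1,
  "Main memory reference": 100,
  "SSD random read": 150_000,
  "HDD random read": 10_000_000,
  "Cross-continent round trip": 150_000_000,
};

for (const [operation, ns] of Object.entries(latencyNs)) {
  // 1 ns of real time maps to 1 "second" of scaled time.
  console.log(`${operation}: ${ns.toLocaleString()} scaled seconds`);
}
// HDD random read: 10,000,000 scaled seconds (~4 months) -- this is why
// random disk I/O dominates any estimate it appears in.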
Throughput Numbers
| System | Throughput | Notes |
|---|---|---|
| Single web server (Node.js) | ~10,000 RPS | Simple API endpoints |
| PostgreSQL | ~10,000 queries/s | Simple indexed queries |
| Redis | ~100,000 ops/s | Single instance |
| Kafka (single partition) | ~10,000 msg/s | Per partition throughput |
| Kafka (cluster) | ~1,000,000 msg/s | With many partitions |
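These per-instance figures plug directly into capacity math. A minimal sketch, assuming a hypothetical read-heavy service at 50,000 RPS; the target load and the 90% cache hit rate are illustrative assumptions, not from the table:

// Rough instance counts for a hypothetical 50,000 RPS read-heavy service.
const targetRps = 50_000; // assumed target load, for illustration
const rpsPerWebServer = 10_000; // from the table above
const qpsPerPostgres = 10_000; // from the table above
const cacheHitRate = 0.9; // assumed: Redis absorbs 90% of reads

const webServers = Math.ceil(targetRps / rpsPerWebServer); // 5 servers
const dbQps = targetRps * (1 - cacheHitRate); // ~5,000 QPS reach the database
const dbInstances = Math.ceil(dbQps / qpsPerPostgres); // 1 instance
console.log({ webServers, dbQps, dbInstances });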
Storage and Data Size Numbers
| Data | Size |
|---|---|
| ASCII character | 1 byte |
| UTF-8 character (avg) | 2 bytes |
| UUID | 16 bytes |
| Typical tweet / short message | ~250 bytes |
| Typical JSON API response | ~2 KB |
| Compressed photo | ~200 KB |
| 1 minute of MP3 audio | ~1 MB |
| 1 minute of HD video | ~50 MB |
Power of Two Quick Reference
| Power | Exact Value | Approximate |
|---|---|---|
| 2^10 | 1,024 | ~1 thousand (1 KB) |
| 2^20 | 1,048,576 | ~1 million (1 MB) |
| 2^30 | 1,073,741,824 | ~1 billion (1 GB) |
| 2^40 | 1,099,511,627,776 | ~1 trillion (1 TB) |
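A small helper makes the conversion from raw byte counts mechanical. A minimal sketch using the 1,024 multiplier from the table above:

// Convert a raw byte count to a human-readable string (powers of two).
const UNITS = ["B", "KB", "MB", "GB", "TB", "PB"];

function humanBytes(bytes: number): string {
  let value = bytes;
  let unitIndex = 0;
  while (value >= 1024 && unitIndex < UNITS.length - 1) {
    value /= 1024;
    unitIndex++;
  }
  return `${value.toFixed(1)} ${UNITS[unitIndex]}`;
}

console.log(humanBytes(2 ** 30)); // "1.0 GB"
console.log(humanBytes(5_000_000_000)); // "4.7 GB"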
Estimation Framework
Follow this systematic approach for any estimation problem:
// Estimation framework as code
type EstimationResult = Record<string, string>;

interface EstimationProblem {
  // Step 1: Define the question clearly
  question: string;
  // Step 2: Identify the key variables
  assumptions: Record<string, number>;
  // Step 3: Calculate step by step
  calculate(): EstimationResult;
}
// Example: Estimate Twitter's storage needs for tweets
class TwitterStorageEstimation implements EstimationProblem {
  question = "How much storage does Twitter need for tweets per day?";
  assumptions = {
    dailyActiveUsers: 300_000_000, // 300M DAU
    tweetsPerUserPerDay: 0.5, // Not everyone tweets daily
    avgTweetSizeBytes: 250, // Text content
    metadataPerTweetBytes: 200, // User ID, timestamp, indexes
    mediaAttachmentRate: 0.2, // 20% of tweets have media
    avgMediaSizeBytes: 200_000, // 200KB per image
  };
  calculate() {
    const a = this.assumptions;
    const totalTweetsPerDay = a.dailyActiveUsers * a.tweetsPerUserPerDay;
    // = 300M * 0.5 = 150M tweets/day
    const textStoragePerDay = totalTweetsPerDay * (a.avgTweetSizeBytes + a.metadataPerTweetBytes);
    // = 150M * 450 bytes = 67.5 GB/day
    const mediaStoragePerDay = totalTweetsPerDay * a.mediaAttachmentRate * a.avgMediaSizeBytes;
    // = 150M * 0.2 * 200KB = 6 TB/day
    const totalPerDay = textStoragePerDay + mediaStoragePerDay;
    // = ~6 TB/day (media dominates)
    const totalPerYear = totalPerDay * 365;
    // = ~2.2 PB/year
    return {
      tweetsPerDay: "150 million",
      textStoragePerDay: "~67.5 GB",
      mediaStoragePerDay: "~6 TB",
      totalPerDay: "~6 TB",
      totalPerYear: "~2.2 PB",
      conclusion: "Media storage dominates. Need distributed object storage like S3.",
    };
  }
}
Practice Examples with Solutions
Example 1: QPS for a URL Shortener
Question: Estimate the QPS for a URL shortener with 100 million URLs created per month.
- 100M URLs/month / (30 * 24 * 3600 seconds/month) = ~40 URLs/second (writes)
- Read:write ratio for URL shorteners is typically 100:1
- Read QPS = 40 * 100 = ~4,000 reads/second
- Peak = 2-3x average = ~10,000 reads/second
- Conclusion: A single PostgreSQL instance can handle this. Add Redis cache for hot URLs.
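The same arithmetic as a quick script, using the assumptions above:

// URL shortener QPS estimate.
const urlsPerMonth = 100_000_000;
const secondsPerMonth = 30 * 24 * 3600; // ~2.6M seconds
const writeQps = urlsPerMonth / secondsPerMonth; // ~40 writes/s
const readQps = writeQps * 100; // 100:1 read:write ratio -> ~4,000 reads/s
const peakReadQps = readQps * 2.5; // 2-3x peak factor -> ~10,000 reads/s
console.log({
  writeQps: Math.round(writeQps),
  readQps: Math.round(readQps),
  peakReadQps: Math.round(peakReadQps),
});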
Example 2: Bandwidth for a Video Streaming Service
Question: Estimate the bandwidth needed for a service streaming to 10 million concurrent users.
- Average video bitrate: 5 Mbps (1080p)
- 10M concurrent users * 5 Mbps = 50 Tbps
- This is massive; a single data center cannot serve this
- Conclusion: Must use a global CDN with edge caching. Most traffic is served from CDN PoPs, not origin servers.
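A minimal sketch of the same numbers; the 95% CDN offload rate is an illustrative assumption, not part of the estimate above:

// Streaming bandwidth estimate.
const concurrentViewers = 10_000_000;
const bitrateMbps = 5; // average 1080p stream
const totalTbps = (concurrentViewers * bitrateMbps) / 1_000_000; // 50 Tbps
// Assumed for illustration: CDN edges absorb 95% of the traffic.
const cdnOffloadRate = 0.95;
const originTbps = totalTbps * (1 - cdnOffloadRate); // ~2.5 Tbps at origin
console.log({ totalTbps, originTbps });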
Example 3: Database Size for a Social Media Platform
Question: Estimate the database size for storing user profiles for 1 billion users.
- User profile data: name (50B), email (50B), bio (200B), settings (100B), metadata (100B) = ~500 bytes
- 1B users * 500 bytes = 500 GB
- With indexes (2x overhead): ~1 TB
- Conclusion: Fits on a single large database server for storage, but query load will require sharding or read replicas.
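As a script, using the field sizes above:

// User profile storage estimate.
const users = 1_000_000_000;
const profileBytes = 50 + 50 + 200 + 100 + 100; // name, email, bio, settings, metadata = 500 B
const rawGB = (users * profileBytes) / 1e9; // 500 GB
const withIndexesTB = (rawGB * 2) / 1000; // 2x index overhead -> ~1 TB
console.log({ rawGB, withIndexesTB });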
Useful Time Conversions
| Time Period | Seconds | Quick Approximation |
|---|---|---|
| 1 day | 86,400 | ~100,000 (10^5) |
| 1 month | 2,592,000 | ~2.5 million (2.5 * 10^6) |
| 1 year | 31,536,000 | ~30 million (3 * 10^7) |
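A minimal sketch showing how close the approximations land; the 100M events/day volume is an assumed example:

// Daily volume -> per-second rate: exact vs. the ~10^5 approximation.
const eventsPerDay = 100_000_000; // assumed: 100M events/day
const exactPerSecond = eventsPerDay / 86_400; // ~1,157/s
const approxPerSecond = eventsPerDay / 100_000; // 1,000/s -- same order of magnitude
console.log({ exact: Math.round(exactPerSecond), approx: approxPerSecond });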
Tips for System Design Interviews
- State your assumptions clearly. Interviewers care more about your reasoning process than exact numbers
- Round aggressively. Use powers of 10 and simple multipliers. 86,400 seconds/day becomes 100,000
- Show your work. Write down each step so the interviewer can follow your logic
- Sanity-check your answer. If you calculate that a single laptop can handle all of Google's traffic, something is wrong
- Focus on bottlenecks. Use estimates to identify which component needs the most attention