Trade-offs in System Design
Every system design decision involves trade-offs. There is no perfect architecture, only architectures that are optimized for specific constraints. Senior engineers distinguish themselves not by knowing more technologies, but by their ability to evaluate trade-offs systematically and make defensible decisions. This topic covers the most important trade-offs you will encounter and frameworks for evaluating them.
Consistency vs. Availability
The CAP theorem states that in the presence of a network partition, a distributed system must choose between consistency and availability. In practice, partitions are inevitable, so the real question is: when a partition occurs, do you return stale data (availability) or an error (consistency)?
| Choose Consistency (CP) | Choose Availability (AP) |
|---|---|
| Financial transactions | Social media feeds |
| Inventory management | Product catalog browsing |
| User authentication | Content recommendations |
| Leader election | Analytics dashboards |
| Distributed locks | DNS resolution |
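The partition-time choice in the table above can be made concrete with a small sketch of a read path. The `Replica`, `Mode`, and `read` names below are hypothetical, not a real API:

```typescript
type Mode = "CP" | "AP";

interface Replica {
  value: string;    // local copy, possibly stale
  isStale: boolean; // true while cut off from the leader by a partition
}

function read(replica: Replica, mode: Mode): string {
  if (replica.isStale && mode === "CP") {
    // CP: refuse to answer rather than risk serving stale data
    throw new Error("unavailable: cannot guarantee the latest value");
  }
  // AP (or no partition): answer from the local replica
  return replica.value;
}
```

During a partition, the same stale replica yields an error under CP and a (possibly stale) value under AP; when there is no partition, both modes behave identically.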
The Real-World Nuance
Most real systems are not purely CP or AP. They use different consistency levels for different operations. An e-commerce platform might use strong consistency for payment processing (CP) but eventual consistency for product reviews (AP). The key insight is that consistency is not a system-wide property; it is a per-operation decision.
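The per-operation view can be sketched as a policy map. The operation names and the `replicaFor` router below are illustrative assumptions for the e-commerce example, not a real framework:

```typescript
type Consistency = "strong" | "eventual";

// Hypothetical per-operation consistency policy: the choice is made
// per operation, not once for the whole system.
const consistencyPolicy: Record<string, Consistency> = {
  chargePayment: "strong",     // CP: never double-charge
  reserveInventory: "strong",  // CP: never oversell
  listReviews: "eventual",     // AP: stale reviews are harmless
  browseCatalog: "eventual",   // AP: availability matters most here
};

// A read router could use the policy to pick where to read from.
function replicaFor(operation: string): "primary" | "nearest-replica" {
  return consistencyPolicy[operation] === "strong"
    ? "primary"
    : "nearest-replica";
}
```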
Strong vs. Eventual Consistency
This is a spectrum, not a binary choice. Understanding the options between the two extremes is crucial.
```typescript
// Consistency spectrum from strongest to weakest
enum ConsistencyLevel {
  // Linearizability: reads always see the most recent write
  // Slowest, requires consensus (Raft/Paxos)
  LINEARIZABLE = "LINEARIZABLE",

  // Sequential consistency: operations appear in some total order
  // consistent with each client's program order
  SEQUENTIAL = "SEQUENTIAL",

  // Causal consistency: causally related operations are seen in order
  // Concurrent operations may be seen in any order
  CAUSAL = "CAUSAL",

  // Read-your-writes: a client always sees its own writes
  // Other clients may see stale data
  READ_YOUR_WRITES = "READ_YOUR_WRITES",

  // Eventual consistency: all replicas converge eventually
  // No ordering guarantees in the meantime
  EVENTUAL = "EVENTUAL",
}
```
```typescript
// Practical example: user profile updates
interface UserProfileService {
  updateProfile(userId: string, data: ProfileData): Promise<void>;

  // Strong consistency: a user always sees their own updates immediately
  // Implementation: read from the primary database
  getOwnProfile(userId: string): Promise<ProfileData>;

  // Eventual consistency: other users may see a stale profile
  // Implementation: read from the nearest replica
  getPublicProfile(userId: string): Promise<ProfileData>;
}
```

Latency vs. Throughput
Optimizing for latency (how fast a single request completes) often conflicts with optimizing for throughput (how many requests per second the system handles).
| Optimize for Latency | Optimize for Throughput |
|---|---|
| Process each request immediately | Batch requests for processing |
| Keep data in memory | Sequential disk writes (append-only) |
| Use caching aggressively | Use queues to smooth traffic spikes |
| Fewer network hops | Pipeline operations across services |
| Real-time processing | Batch / stream processing |
```typescript
// Example: database write strategies

// Optimize for latency: write-through
async function writeThrough(key: string, value: string): Promise<void> {
  // Write to the DB synchronously, return when confirmed
  await database.write(key, value); // ~5ms
  await cache.set(key, value);      // ~1ms
  // Total: ~6ms per write, but only ~160 writes/sec per connection
}
```
```typescript
// Optimize for throughput: write-behind (buffering)
class WriteBuffer {
  private buffer: Map<string, string> = new Map();
  private flushInterval = 100; // ms

  constructor() {
    // Periodically flush buffered writes to the DB in one batch
    setInterval(() => this.flush(), this.flushInterval);
  }

  async write(key: string, value: string): Promise<void> {
    this.buffer.set(key, value);
    await cache.set(key, value); // Immediate cache update
    // Return immediately: data is flushed to the DB asynchronously
    // Total: ~1ms per write, can handle thousands of writes/sec
    // Trade-off: data loss risk if the process crashes before a flush
  }

  private async flush(): Promise<void> {
    if (this.buffer.size === 0) return;
    const batch = Array.from(this.buffer.entries());
    this.buffer.clear();
    await database.batchWrite(batch); // One DB call for many writes
  }
}
```

Monolith vs. Microservices
This is perhaps the most debated trade-off in software architecture. The right answer depends heavily on team size, system complexity, and organizational structure.
| Dimension | Monolith | Microservices |
|---|---|---|
| Deployment | Single deployable unit | Independent deployment per service |
| Development speed (small team) | Faster - less overhead | Slower - infrastructure complexity |
| Development speed (large team) | Slower - merge conflicts, coordination | Faster - teams work independently |
| Debugging | Easier - single process, local stack traces | Harder - distributed tracing required |
| Scaling | Scale the entire application | Scale individual services independently |
| Data consistency | Easy - single database, ACID transactions | Hard - distributed transactions, eventual consistency |
| Technology flexibility | One tech stack | Best tool for each service |
| Best for team size | 1-20 engineers | 50+ engineers with clear domain boundaries |
The Pragmatic Middle Ground
Start with a modular monolith: a single deployable unit with well-defined internal module boundaries. When a specific module needs independent scaling or a separate team, extract it into a service. This approach gives you the simplicity of a monolith with a path to microservices when the need arises. Premature decomposition into microservices is one of the most common and expensive architectural mistakes.
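A sketch of what a modular-monolith boundary can look like in code. The `BillingModule` interface and its implementations below are hypothetical; the point is that callers depend only on the interface, so extracting the module into a service later swaps the implementation, not the callers:

```typescript
// A narrow boundary: other modules depend on this interface,
// never on billing internals.
interface BillingModule {
  charge(customerId: string, cents: number): string; // returns an invoice id
}

// Today: a direct in-process call inside the monolith.
class InProcessBilling implements BillingModule {
  charge(customerId: string, cents: number): string {
    return `inv_${customerId}_${cents}`;
  }
}

// Later: same interface, different transport (sketch only; a real
// version would call the extracted billing service over HTTP/gRPC).
class RemoteBilling implements BillingModule {
  charge(customerId: string, cents: number): string {
    return `inv_remote_${customerId}_${cents}`;
  }
}

class OrderModule {
  constructor(private billing: BillingModule) {} // depends on the boundary
  placeOrder(customerId: string): string {
    return this.billing.charge(customerId, 4999);
  }
}
```

Because `OrderModule` only knows the interface, moving billing out of process is a constructor-argument change, not a rewrite.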
SQL vs. NoSQL
| Factor | SQL (PostgreSQL, MySQL) | NoSQL (MongoDB, Cassandra, DynamoDB) |
|---|---|---|
| Data model | Rigid schema, relational | Flexible schema, document/key-value/column |
| Query flexibility | Rich queries with JOINs, aggregations | Limited queries, optimized for specific access patterns |
| Transactions | Full ACID across multiple tables | Limited (single document or partition) |
| Horizontal scaling | Possible but complex | Built-in, often automatic |
| Best for | Complex relationships, ad-hoc queries, data integrity | High write throughput, flexible schemas, known access patterns |
```typescript
// Decision guide
function chooseDatabaseType(requirements: {
  needsJoins: boolean;
  needsACID: boolean;
  schemaIsStable: boolean;
  writeVolume: "low" | "medium" | "high" | "extreme";
  queryPatterns: "varied" | "known";
  dataRelationships: "simple" | "complex";
}): string {
  // Strong signals for SQL
  if (requirements.needsACID && requirements.needsJoins) {
    return "SQL (PostgreSQL recommended)";
  }
  // Strong signals for NoSQL
  if (
    requirements.writeVolume === "extreme" &&
    requirements.queryPatterns === "known" &&
    !requirements.needsJoins
  ) {
    return "NoSQL (Cassandra for wide-column, DynamoDB for key-value)";
  }
  // Default recommendation
  if (requirements.dataRelationships === "complex") {
    return "SQL - complex relationships are hard to model in NoSQL";
  }
  return "Either works - choose based on team experience";
}
```

Push vs. Pull Architectures
| Aspect | Push (Fan-out on write) | Pull (Fan-out on read) |
|---|---|---|
| When work happens | When data is written | When data is read |
| Read latency | Very fast (pre-computed) | Slower (computed on demand) |
| Write cost | High (fan-out to all followers) | Low (just store the item) |
| Storage | Higher (duplicated data in each feed) | Lower (single copy) |
| Best for | Users with few followers, read-heavy | Celebrity users with millions of followers |
Hybrid Approach (What Twitter Uses)
Use push for regular users (fan-out their tweets to followers' feeds on write) and pull for celebrities (fetch their tweets at read time and merge). This avoids the "celebrity problem" where a single tweet from a user with 50 million followers would require 50 million write operations.
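The hybrid read path amounts to merging a precomputed (push) feed with tweets pulled for followed celebrities at read time. All names below are illustrative, not Twitter's actual API:

```typescript
interface Tweet {
  authorId: string;
  ts: number; // timestamp, higher = newer
  text: string;
}

// Hypothetical hybrid read: regular authors were fanned out at write time
// into `precomputedFeed`; celebrity tweets are fetched now and merged.
function readFeed(
  precomputedFeed: Tweet[],            // push side, built on write
  followedCelebrities: string[],       // pull side, resolved on read
  tweetsByAuthor: Map<string, Tweet[]>,
): Tweet[] {
  const pulled = followedCelebrities.flatMap(
    (id) => tweetsByAuthor.get(id) ?? [],
  );
  return [...precomputedFeed, ...pulled].sort((a, b) => b.ts - a.ts);
}
```

The celebrity's tweet is written once and merged into each follower's timeline only when that follower actually reads, so the 50-million-write fan-out never happens.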
How to Evaluate Trade-offs Systematically
When facing an architectural decision, use this structured approach to avoid gut-feel decisions and bias.
```typescript
interface ArchitecturalDecision {
  title: string;
  context: string;        // What is the situation?
  options: Option[];
  decision: string;       // What did we choose?
  rationale: string;      // Why?
  consequences: string[]; // What are the implications?
}

interface Option {
  name: string;
  pros: string[];
  cons: string[];
  score: {
    complexity: number;     // 1 (simple) to 5 (complex)
    scalability: number;    // 1 (poor) to 5 (excellent)
    reliability: number;    // 1 (poor) to 5 (excellent)
    cost: number;           // 1 (cheap) to 5 (expensive)
    teamExperience: number; // 1 (none) to 5 (expert)
  };
}
```
```typescript
// Example ADR (Architecture Decision Record)
const example: ArchitecturalDecision = {
  title: "Database for user sessions",
  context:
    "We need to store user sessions with ~1M concurrent users, " +
    "sub-10ms read latency, and automatic expiration.",
  options: [
    {
      name: "Redis",
      pros: [
        "Sub-millisecond reads",
        "Built-in TTL for expiration",
        "Team has experience",
      ],
      cons: [
        "Data loss on restart (unless persistence is enabled)",
        "Memory cost for 1M sessions",
      ],
      score: { complexity: 1, scalability: 4, reliability: 3, cost: 3, teamExperience: 5 },
    },
    {
      name: "PostgreSQL",
      pros: [
        "Durable storage",
        "Rich querying for analytics",
        "Already in our stack",
      ],
      cons: [
        "Higher latency (~5ms)",
        "Need to implement expiration (cron job or pg_cron)",
      ],
      score: { complexity: 2, scalability: 3, reliability: 5, cost: 2, teamExperience: 5 },
    },
  ],
  decision: "Redis",
  rationale:
    "Session data is ephemeral - durability is not required. " +
    "The sub-ms latency is critical for user experience. " +
    "Memory cost for 1M sessions (~500MB) is acceptable.",
  consequences: [
    "Must handle Redis failover (use Redis Sentinel or Cluster)",
    "Cannot perform complex queries on session data",
    "Need monitoring for memory usage",
  ],
};
```

Decision Framework for Architects
The STAR Framework for Technical Decisions
- S - Situation: What are the constraints? Scale, team size, timeline, budget, existing systems
- T - Trade-offs: What are you gaining and what are you giving up with each option?
- A - Action: What is the recommendation and why? Make the decision reversible if possible
- R - Review: Set a date to review the decision. Was it correct? What would you change?
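To compare scored options side by side, one possibility is a weighted sum over the five dimensions used in the ADR example. The weights and the inversion of complexity and cost (where a lower raw score is better) are illustrative assumptions, not part of any standard ADR format:

```typescript
interface Scores {
  complexity: number;     // 1 (simple) to 5 (complex)
  scalability: number;    // 1 (poor) to 5 (excellent)
  reliability: number;    // 1 (poor) to 5 (excellent)
  cost: number;           // 1 (cheap) to 5 (expensive)
  teamExperience: number; // 1 (none) to 5 (expert)
}

// Hypothetical helper: higher result = more attractive option.
// Complexity and cost are inverted (6 - s) so that "simple" and
// "cheap" contribute the most points.
function weightedScore(s: Scores, w: Scores): number {
  return (
    w.complexity * (6 - s.complexity) +
    w.scalability * s.scalability +
    w.reliability * s.reliability +
    w.cost * (6 - s.cost) +
    w.teamExperience * s.teamExperience
  );
}
```

The weights encode what the situation (the S in STAR) says matters most; a latency-critical system might weight scalability and team experience above reliability, and the arithmetic makes that priority explicit rather than implicit.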
Key Principles
- Optimize for the common case. Design for the 95th percentile of your workload, not the edge cases
- Make decisions reversible. Prefer options that can be changed later over options that lock you in
- Boring technology is good technology. Use well-understood tools unless you have a compelling reason not to
- Measure, do not assume. Benchmark before optimizing. Profile before refactoring. Load test before scaling
- Document your decisions. Future engineers (including future you) will need to understand not just what you built, but why you built it that way
- There is no best architecture. There is only the best architecture for your specific constraints, requirements, and team. Context is everything