TechLead
Lesson 18 of 30

System Design: Video Streaming Platform

Design a video streaming platform like YouTube covering upload pipelines, transcoding, adaptive bitrate streaming, CDN delivery, and cost optimization

Problem Statement

Design a video streaming platform like YouTube or Netflix. The system must handle video uploads, processing, storage, and delivery to millions of concurrent viewers. This problem is challenging because video is the most bandwidth-intensive content type on the internet, requiring careful architectural decisions around encoding, storage, and delivery.

Step 1: Requirements

Functional Requirements

  • Users can upload videos (up to 1 hour, multiple formats)
  • Videos are transcoded into multiple resolutions and formats
  • Adaptive bitrate streaming based on viewer's network conditions
  • Video playback with seeking, pause, and resume
  • Thumbnail generation and video preview
  • Basic recommendation system

Non-Functional Requirements

  • Support 1 billion daily video views
  • Fast start time (<2 seconds to first frame)
  • High availability (99.99%)
  • Global delivery with low latency
  • Cost-efficient storage and bandwidth

Scale Estimates

Metric                          Estimate
Videos uploaded per day         500,000
Average video size (original)   500 MB
Daily upload volume             ~250 TB
Daily views                     1 billion
Average view duration           5 minutes
Peak concurrent viewers         ~5 million
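As a sanity check on these estimates, the egress numbers can be derived directly. The average delivered bitrate of ~2.5 Mbps (roughly the 720p rung) is an assumption, not a figure from the table:

```typescript
// Back-of-envelope egress from the scale table (assumed average delivered
// bitrate: 2.5 Mbps, roughly 720p).
const DAILY_VIEWS = 1_000_000_000;
const AVG_VIEW_SECONDS = 5 * 60;
const AVG_BITRATE_BPS = 2_500_000;
const PEAK_CONCURRENT = 5_000_000;

const bytesPerView = (AVG_VIEW_SECONDS * AVG_BITRATE_BPS) / 8;     // ~94 MB per view
const dailyEgressPB = (DAILY_VIEWS * bytesPerView) / 1e15;         // ~94 PB/day
const peakEgressTbps = (PEAK_CONCURRENT * AVG_BITRATE_BPS) / 1e12; // 12.5 Tbps

console.log(dailyEgressPB, peakEgressTbps);
```

Roughly 94 PB of daily egress is why CDN offload and codec efficiency (Step 5 and Step 8) dominate the cost discussion.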

Step 2: Video Upload and Processing Pipeline

The upload pipeline is a multi-stage process that transforms a raw uploaded video into multiple streaming-ready formats. This is the most compute-intensive part of the system.

// Upload and processing pipeline

interface VideoUpload {
  id: string;
  userId: string;
  originalFileName: string;
  originalUrl: string;     // S3 location of uploaded file
  status: "uploading" | "processing" | "ready" | "failed";
  metadata: VideoMetadata;
  createdAt: Date;
}

interface VideoMetadata {
  title: string;
  description: string;
  tags: string[];
  duration: number;        // seconds
  resolution: string;      // "1920x1080"
  codec: string;
  frameRate: number;
  fileSize: number;        // bytes
}

interface ProcessedVideo {
  videoId: string;
  variants: VideoVariant[];
  thumbnails: string[];
  manifest: string;        // HLS/DASH manifest URL
}

interface VideoVariant {
  resolution: string;      // "1080p", "720p", "480p", "360p"
  bitrate: number;         // kbps
  codec: string;
  segmentUrls: string[];   // URLs of video segments
}

// Upload flow
class VideoUploadService {
  async initiateUpload(userId: string, metadata: Partial<VideoMetadata>): Promise<UploadSession> {
    // Generate a pre-signed S3 URL for direct upload
    // Client uploads directly to S3 (bypasses our servers)
    const uploadId = generateId();
    const s3Key = `uploads/${userId}/${uploadId}/original`;

    const presignedUrl = await this.s3.getSignedUrl("putObject", {
      Bucket: "video-uploads",
      Key: s3Key,
      Expires: 3600, // 1 hour
      ContentType: "video/*",
    });

    // Create upload record
    await this.db.videoUploads.insert({
      id: uploadId,
      userId,
      status: "uploading",
      originalUrl: `s3://video-uploads/${s3Key}`,
      metadata,
    });

    return {
      uploadId,
      presignedUrl,
      // For large files, use multipart upload
      multipartUrls: await this.generateMultipartUrls(s3Key),
    };
  }

  // Triggered by S3 event notification when upload completes
  async onUploadComplete(uploadId: string): Promise<void> {
    await this.db.videoUploads.update(uploadId, { status: "processing" });

    // Submit to processing pipeline
    await this.messageQueue.publish("video-processing", {
      uploadId,
      stages: ["validate", "transcode", "thumbnail", "manifest"],
    });
  }
}
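The `onUploadComplete` handler above only enqueues work; a consuming worker might look like the following sketch. The handler map and the `setStatus` interface are assumptions for illustration, not part of the original design:

```typescript
// Hypothetical worker for the "video-processing" queue: runs each stage in
// order and marks the upload failed if any stage throws.
type Stage = "validate" | "transcode" | "thumbnail" | "manifest";

interface ProcessingJob {
  uploadId: string;
  stages: Stage[];
}

class ProcessingWorker {
  constructor(
    private handlers: Record<Stage, (uploadId: string) => Promise<void>>,
    private db: { setStatus(id: string, status: string): Promise<void> },
  ) {}

  async handle(job: ProcessingJob): Promise<void> {
    try {
      for (const stage of job.stages) {
        await this.handlers[stage](job.uploadId); // stages run sequentially
      }
      await this.db.setStatus(job.uploadId, "ready");
    } catch (err) {
      await this.db.setStatus(job.uploadId, "failed");
      throw err; // let the queue retry or dead-letter the message
    }
  }
}
```

Keeping the stage handlers behind a map makes it easy to run transcoding on GPU workers while validation and thumbnailing run on cheaper instances.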

Step 3: Transcoding and Adaptive Bitrate

Transcoding converts the original video into multiple resolutions and bitrates. This allows the player to switch between quality levels based on the viewer's bandwidth -- a technique called Adaptive Bitrate Streaming (ABR).

Standard Transcoding Profiles

Quality    Resolution   Bitrate (video)  Use Case
4K UHD     3840x2160    15-25 Mbps       Smart TVs, high-end
1080p HD   1920x1080    4-8 Mbps         Desktop, good WiFi
720p HD    1280x720     2-4 Mbps         Mobile on WiFi
480p SD    854x480      1-2 Mbps         Mobile on 4G
360p       640x360      0.5-1 Mbps       Slow connections

// Transcoding pipeline (runs on GPU-equipped workers)
class TranscodingService {
  private profiles = [
    { name: "1080p", width: 1920, height: 1080, bitrate: 5000 },
    { name: "720p",  width: 1280, height: 720,  bitrate: 2500 },
    { name: "480p",  width: 854,  height: 480,  bitrate: 1200 },
    { name: "360p",  width: 640,  height: 360,  bitrate: 700 },
  ];

  async transcodeVideo(uploadId: string, sourceUrl: string): Promise<VideoVariant[]> {
    const variants: VideoVariant[] = [];

    // Get source video info
    const sourceInfo = await this.probe(sourceUrl);

    // Only transcode to resolutions <= source resolution
    const applicableProfiles = this.profiles.filter(
      (p) => p.height <= sourceInfo.height
    );

    // Process each profile (can be parallelized across workers)
    for (const profile of applicableProfiles) {
      const segments = await this.transcodeToProfile(sourceUrl, profile);
      variants.push({
        resolution: profile.name,
        bitrate: profile.bitrate,
        codec: "h264", // or h265/AV1 for better compression
        segmentUrls: segments,
      });
    }

    return variants;
  }

  // Each video is split into small segments (2-10 seconds)
  // This enables adaptive bitrate switching at segment boundaries
  private async transcodeToProfile(
    sourceUrl: string,
    profile: TranscodeProfile
  ): Promise<string[]> {
    // Using FFmpeg under the hood:
    // ffmpeg -i input.mp4 \
    //   -vf scale=1280:720 \
    //   -b:v 2500k \
    //   -hls_time 6 \
    //   -hls_segment_filename "segment_%03d.ts" \
    //   output.m3u8

    const outputDir = `processed/${profile.name}/`;
    const segmentDuration = 6; // seconds

    // Simplified - actual implementation uses FFmpeg
    return this.ffmpeg.transcode({
      input: sourceUrl,
      output: outputDir,
      width: profile.width,
      height: profile.height,
      bitrate: profile.bitrate,
      segmentDuration,
      format: "hls", // HTTP Live Streaming
    });
  }
}

Step 4: Streaming Protocols (HLS and DASH)

Modern video streaming uses HTTP-based adaptive streaming protocols. The two dominant protocols are HLS (HTTP Live Streaming, created by Apple) and DASH (Dynamic Adaptive Streaming over HTTP, an open standard).

HLS vs DASH

Feature           HLS                                  DASH
Created By        Apple                                MPEG (open standard)
Manifest Format   .m3u8 (text)                         .mpd (XML)
Segment Format    .ts or .fmp4                         .m4s (fMP4)
Browser Support   Native on Safari, via JS elsewhere   Via JS (dash.js, Shaka)
Codec Support     H.264, H.265, AV1                    Any codec
DRM               FairPlay                             Widevine, PlayReady

// HLS manifest structure (simplified)
// Master playlist (video.m3u8) - points to variant playlists
const masterPlaylist = `
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
1080p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
720p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=854x480
480p/playlist.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=700000,RESOLUTION=640x360
360p/playlist.m3u8
`;

// Variant playlist (720p/playlist.m3u8) - lists segments
const variantPlaylist = `
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:6.0,
segment_001.ts
#EXTINF:6.0,
segment_002.ts
#EXTINF:6.0,
segment_003.ts
#EXTINF:4.5,
segment_004.ts
#EXT-X-ENDLIST
`;

// The player:
// 1. Downloads the master playlist
// 2. Selects a variant based on current bandwidth
// 3. Downloads segments from that variant
// 4. If bandwidth changes, switches to a different variant at the next segment boundary
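Step 4 of the player loop reduces to a simple selection rule: pick the highest-bandwidth variant that fits within measured throughput, with a safety margin against transient dips. The 0.8 factor and the `Variant` shape below are illustrative assumptions:

```typescript
// Minimal sketch of ABR variant selection at a segment boundary.
interface Variant {
  resolution: string;
  bandwidth: number; // bits/sec, as advertised in #EXT-X-STREAM-INF
}

function selectVariant(variants: Variant[], measuredBps: number, safety = 0.8): Variant {
  const sorted = [...variants].sort((a, b) => b.bandwidth - a.bandwidth);
  // Highest variant that fits under the safety margin; if none fit, fall
  // back to the lowest so playback never stalls on selection.
  return sorted.find((v) => v.bandwidth <= measuredBps * safety) ?? sorted[sorted.length - 1];
}

const ladder: Variant[] = [
  { resolution: "1920x1080", bandwidth: 5_000_000 },
  { resolution: "1280x720", bandwidth: 2_500_000 },
  { resolution: "854x480", bandwidth: 1_200_000 },
  { resolution: "640x360", bandwidth: 700_000 },
];

console.log(selectVariant(ladder, 4_000_000).resolution); // "1280x720"
```

Production players (hls.js, Shaka) add buffer-level heuristics on top of pure throughput estimation, but the core decision is this comparison.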

Step 5: CDN for Video Delivery

Video delivery is the most bandwidth-intensive part of the system. A CDN is essential for delivering video segments from edge locations close to viewers, reducing latency and origin load.

  • Popular videos: Cached at edge locations worldwide. Cache hit ratio should be 90%+.
  • Long-tail content: Less popular videos may be served from regional caches or origin.
  • Tiered caching: Edge -> Regional Cache -> Origin. Each tier reduces origin requests.
  • Pre-warming: For anticipated viral content, push segments to edge locations proactively.
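The tiered-caching flow above can be sketched as a fall-through lookup with back-fill. The `SegmentCache` interface and string payloads are simplifying assumptions:

```typescript
// Tiered lookup: each miss falls through to the next tier, and results are
// back-filled on the way out so the next request hits closer to the viewer.
interface SegmentCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

async function fetchSegment(
  key: string,
  edge: SegmentCache,
  regional: SegmentCache,
  origin: (key: string) => Promise<string>, // origin fetch (e.g., S3)
): Promise<string> {
  const fromEdge = await edge.get(key);
  if (fromEdge !== null) return fromEdge; // edge hit: the 90%+ common case

  const fromRegional = await regional.get(key);
  if (fromRegional !== null) {
    await edge.set(key, fromRegional); // back-fill the edge tier
    return fromRegional;
  }

  const segment = await origin(key); // only long-tail content reaches origin
  await regional.set(key, segment);
  await edge.set(key, segment);
  return segment;
}
```

Back-filling both tiers on an origin fetch is what makes a newly viral video cheap to serve after its first few requests per region.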

Step 6: Thumbnail Generation

class ThumbnailService {
  async generateThumbnails(
    videoUrl: string,
    videoId: string,
    duration: number
  ): Promise<string[]> {
    const thumbnails: string[] = [];

    // Generate thumbnails at regular intervals
    const intervals = this.calculateIntervals(duration);
    // e.g., for a 60s video: [0, 10, 20, 30, 40, 50]

    for (const timestamp of intervals) {
      // Extract frame using FFmpeg
      // ffmpeg -i input.mp4 -ss 10 -frames:v 1 -vf scale=320:180 thumb.jpg
      const thumbnailUrl = await this.extractFrame(videoUrl, timestamp, {
        width: 320,
        height: 180,
        format: "webp", // Smaller than JPEG
      });
      thumbnails.push(thumbnailUrl);
    }

    // Generate video preview (animated thumbnail / sprite sheet)
    const spriteSheet = await this.generateSpriteSheet(videoUrl, intervals);

    // Store thumbnail URLs in video metadata
    await this.db.videos.update(videoId, {
      thumbnails,
      defaultThumbnail: thumbnails[Math.floor(thumbnails.length / 3)],
      spriteSheet,
    });

    return thumbnails;
  }

  // Sprite sheets enable scrubbing preview on hover
  // A single image containing all thumbnail frames in a grid
  private async generateSpriteSheet(
    videoUrl: string,
    timestamps: number[]
  ): Promise<string> {
    // Combine all thumbnails into a grid image
    // Used for hover preview when scrubbing the progress bar
    return this.imageProcessor.createGrid(
      timestamps.map(t => this.extractFrame(videoUrl, t, { width: 160, height: 90 }))
    );
  }
}

Step 7: Recommendation Engine Overview

A recommendation system drives engagement by surfacing relevant videos. While a full recommendation engine is its own system design problem, the key concepts are:

  • Collaborative Filtering: "Users who watched X also watched Y"
  • Content-Based Filtering: Match videos by tags, categories, and content features
  • Engagement Signals: Watch time, completion rate, likes, shares influence rankings
  • Real-time Personalization: Adjust recommendations based on current session behavior
  • Candidate Generation + Ranking: Two-stage pipeline. First generate thousands of candidates, then rank with a more sophisticated model.
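The two-stage pipeline can be illustrated with a toy sketch: co-watch counts stand in for candidate generation, and a completion-rate multiplier stands in for the ranking model. All data structures here are hypothetical:

```typescript
// Toy two-stage recommender: cheap candidate generation, then ranking.
interface Candidate { videoId: string; coWatchCount: number }

// Stage 1: "users who watched X also watched Y", capped to bound the cost
// of the (expensive) ranking stage.
function generateCandidates(coWatch: Map<string, number>, limit = 1000): Candidate[] {
  return [...coWatch.entries()]
    .map(([videoId, coWatchCount]) => ({ videoId, coWatchCount }))
    .sort((a, b) => b.coWatchCount - a.coWatchCount)
    .slice(0, limit);
}

// Stage 2: re-score candidates with an engagement signal standing in for a
// learned model; completion rate rewards videos people actually finish.
function rank(
  candidates: Candidate[],
  signals: Map<string, { completionRate: number }>,
): { videoId: string; score: number }[] {
  return candidates
    .map((c) => ({
      videoId: c.videoId,
      score: c.coWatchCount * (signals.get(c.videoId)?.completionRate ?? 0.5),
    }))
    .sort((a, b) => b.score - a.score);
}
```

The split matters because candidate generation must scan millions of videos cheaply, while ranking can afford a heavier model over only the surviving thousand.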

Step 8: Storage and Cost Optimization

Storage Cost Breakdown

  • Original uploads: Stored only temporarily and deleted after processing, saving ~500 MB per video
  • Transcoded variants: Roughly 5x the original size across all resolutions; a 500 MB upload becomes ~2.5 GB of variants
  • Storage tiering: Move old, rarely accessed videos to cheaper storage (S3 Glacier, cold storage)
  • Codec efficiency: AV1 offers 30-50% better compression than H.264 but is slower to encode. Use for popular videos.
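A quick check of these figures, using decimal units and assuming originals are deleted and the ~5x variant multiplier holds:

```typescript
// Steady-state storage growth is driven by transcoded variants only,
// since originals are deleted after processing.
const UPLOADS_PER_DAY = 500_000;
const ORIGINAL_GB = 0.5;      // 500 MB average original
const VARIANT_MULTIPLIER = 5; // all transcoded variants combined

// Daily growth in petabytes (1 PB = 1e6 GB in decimal units)
const dailyGrowthPB = (UPLOADS_PER_DAY * ORIGINAL_GB * VARIANT_MULTIPLIER) / 1_000_000;
console.log(dailyGrowthPB); // 1.25 PB of new transcoded video per day
```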

// Storage tiering strategy
class StorageTierManager {
  // Videos are tiered based on view frequency
  async tierVideo(videoId: string): Promise<void> {
    const stats = await this.getViewStats(videoId);
    const daysSinceUpload = this.daysSince(stats.uploadedAt);
    const recentViews = stats.viewsLast30Days;

    if (recentViews > 1000) {
      // Hot: keep all variants on fast storage + CDN
      await this.moveToTier(videoId, "hot"); // S3 Standard
    } else if (recentViews > 10) {
      // Warm: keep popular variants, move others to cheaper storage
      await this.moveToTier(videoId, "warm"); // S3 Infrequent Access
      // Only keep 720p and 360p on fast storage
      await this.archiveVariants(videoId, ["1080p", "480p"]);
    } else if (daysSinceUpload > 365) {
      // Cold: archive to cheapest storage, transcode on demand
      await this.moveToTier(videoId, "cold"); // S3 Glacier
      // Keep only 360p, re-transcode higher res on demand
    }
  }
}

// Bandwidth cost optimization:
// 1. Use efficient codecs (AV1 > H.265 > H.264) for popular content
// 2. Serve WebP/AVIF thumbnails instead of JPEG (50% smaller)
// 3. Client-side bandwidth estimation to avoid over-serving quality
// 4. Limit max quality based on screen size (don't serve 4K to phones)
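Optimization 4 can be sketched as a ladder filter; the 1.5x slack for high-DPI oversampling is an illustrative assumption:

```typescript
// Drop ladder rows the device cannot usefully display: serving more pixels
// than the screen resolves wastes egress with no visible quality gain.
interface LadderRow { height: number; bitrate: number }

function capLadderForScreen(ladder: LadderRow[], screenHeightPx: number): LadderRow[] {
  const capped = ladder.filter((v) => v.height <= screenHeightPx * 1.5);
  // Never return an empty ladder: keep the lowest row as a floor.
  return capped.length > 0
    ? capped
    : [ladder.reduce((lo, v) => (v.height < lo.height ? v : lo))];
}

const fullLadder: LadderRow[] = [
  { height: 1080, bitrate: 5000 },
  { height: 720, bitrate: 2500 },
  { height: 480, bitrate: 1200 },
  { height: 360, bitrate: 700 },
];

// A 640px-tall phone screen never receives the 1080p row.
console.log(capLadderForScreen(fullLadder, 640).map((v) => v.height)); // [720, 480, 360]
```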

Architecture Summary

  • Upload: Direct-to-S3 with pre-signed URLs, bypassing application servers
  • Processing: Async pipeline via message queue, GPU workers for transcoding
  • Storage: S3 with tiered storage, CDN for delivery, tiering based on popularity
  • Streaming: HLS/DASH with adaptive bitrate, 6-second segments
  • Delivery: Multi-tier CDN (edge -> regional -> origin)
  • Metadata: PostgreSQL for video metadata, Elasticsearch for search
