System Design: Notification System

Design a scalable notification system covering push, SMS, email, and in-app notifications with priority queues, rate limiting, and delivery tracking

Problem Statement

Design a notification system that can send notifications across multiple channels: push notifications, SMS, email, and in-app notifications. The system must handle high throughput, respect user preferences, and provide delivery tracking. Notification systems are a critical component of virtually every modern application.

Step 1: Requirements

Functional Requirements

  • Support multiple notification types: push (iOS/Android), SMS, email, in-app
  • Template-based notification content with personalization
  • User preference management (opt-in/opt-out per channel and category)
  • Priority levels: critical, high, medium, low
  • Delivery tracking and analytics
  • Rate limiting to prevent notification fatigue
  • Scheduled notifications

Non-Functional Requirements

  • High throughput: 10 million notifications per day
  • Low latency for critical notifications (<1 second)
  • At-least-once delivery guarantee
  • Soft real-time (most notifications delivered within 5 seconds)
  • Graceful degradation if a channel provider is down
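
To make the throughput target concrete, the daily volume can be converted into a per-second rate. The sketch below is a back-of-envelope estimate only; `PEAK_FACTOR` is a hypothetical burst multiplier chosen for illustration and does not come from the requirements.

```typescript
// Back-of-envelope sizing for the 10 million notifications/day target.
const NOTIFICATIONS_PER_DAY = 10_000_000;
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

// Assumption: traffic is bursty, so peak load is some multiple of the average.
// The value 10 is illustrative, not a measured figure.
const PEAK_FACTOR = 10;

const averagePerSecond = NOTIFICATIONS_PER_DAY / SECONDS_PER_DAY;
const peakPerSecond = averagePerSecond * PEAK_FACTOR;

console.log(`average: ${averagePerSecond.toFixed(1)} notifications/s`);
console.log(`assumed peak: ${Math.round(peakPerSecond)} notifications/s`);
```

The average works out to roughly 116 notifications per second. Queue and worker capacity should be sized against the peak you actually expect, not the average.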

Step 2: Notification Types Deep Dive

Channel Comparison

Channel        | Latency | Cost | Reach         | Provider
Push (iOS)     | ~1s     | Free | App installed | APNs
Push (Android) | ~1s     | Free | App installed | FCM
Email          | 1-30s   | Low  | Has email     | SendGrid, SES
SMS            | 1-5s    | High | Has phone     | Twilio, SNS
In-App         | Instant | Free | In the app    | WebSocket/SSE

Step 3: System Architecture

// Core notification data model
interface Notification {
  id: string;
  userId: string;
  type: "push" | "sms" | "email" | "in_app";
  category: string;           // "marketing", "transactional", "social", "security"
  priority: "critical" | "high" | "medium" | "low";
  templateId: string;
  templateData: Record<string, any>;  // Variables for personalization
  scheduledAt?: Date;
  status: "pending" | "queued" | "sent" | "delivered" | "failed" | "read";
  createdAt: Date;
  sentAt?: Date;
  deliveredAt?: Date;
  readAt?: Date;
  retryCount: number;
  metadata: Record<string, any>;
}

// User notification preferences
interface UserNotificationPreferences {
  userId: string;
  channels: {
    push: boolean;
    email: boolean;
    sms: boolean;
    inApp: boolean;
  };
  categories: {
    marketing: boolean;
    social: boolean;
    transactional: boolean;  // Usually can't be disabled
    security: boolean;       // Usually can't be disabled
  };
  quietHours?: {
    start: string;  // "22:00"
    end: string;    // "08:00"
    timezone: string;
  };
}

Architecture Components

  • Notification Service (API): Accepts notification requests from other services, validates, and enqueues them
  • Preference Service: Checks user preferences and filters out unwanted notifications
  • Priority Queue (Kafka/RabbitMQ): Separate queues for each priority level
  • Rate Limiter: Prevents sending too many notifications to a single user
  • Template Engine: Renders notification content from templates and user data
  • Channel Workers: Separate worker pools for each delivery channel
  • Delivery Tracker: Records delivery status and generates analytics

class NotificationService {
  private preferenceService: PreferenceService;
  private rateLimiter: RateLimiter;
  private templateEngine: TemplateEngine;
  private queue: MessageQueue;

  async send(request: NotificationRequest): Promise<string> {
    // Step 1: Validate the request
    this.validateRequest(request);

    // Step 2: Check user preferences
    const prefs = await this.preferenceService.getPreferences(request.userId);
    if (!this.isAllowed(request, prefs)) {
      return "FILTERED_BY_PREFERENCE";
    }

    // Step 3: Check rate limits
    const allowed = await this.rateLimiter.checkLimit(
      request.userId,
      request.type,
      request.category
    );
    if (!allowed) {
      return "RATE_LIMITED";
    }

    // Step 4: Check quiet hours
    if (request.priority !== "critical" && this.isQuietHours(prefs)) {
      // Schedule for after quiet hours
      request.scheduledAt = this.getEndOfQuietHours(prefs);
    }

    // Step 5: Render template
    const content = await this.templateEngine.render(
      request.templateId,
      request.templateData
    );

    // Step 6: Create notification record
    const notification: Notification = {
      id: generateId(),
      userId: request.userId,
      type: request.type,
      category: request.category,
      priority: request.priority,
      templateId: request.templateId,
      templateData: request.templateData,
      scheduledAt: request.scheduledAt,  // Preserve the quiet-hours deferral from Step 4
      status: "queued",
      createdAt: new Date(),
      retryCount: 0,
      metadata: { renderedContent: content },
    };

    // Step 7: Enqueue to priority queue
    const queueName = `notifications.${request.priority}`;
    await this.queue.publish(queueName, notification);

    return notification.id;
  }
}
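
The `send` method relies on two helpers, `isAllowed` and `isQuietHours`, that are not shown. A minimal sketch of both follows. The standalone function signatures, the `Prefs` shape, and the choice to let mandatory categories bypass only the category opt-out (not the channel opt-out) are assumptions made for illustration. The main subtlety is that a quiet-hours window such as 22:00 to 08:00 wraps past midnight.

```typescript
type Channel = "push" | "sms" | "email" | "in_app";

interface Prefs {
  channels: { push: boolean; email: boolean; sms: boolean; inApp: boolean };
  categories: Record<string, boolean>;
  quietHours?: { start: string; end: string; timezone: string };
}

// Assumption: these categories cannot be disabled by the user.
const MANDATORY_CATEGORIES = new Set(["transactional", "security"]);

function isAllowed(type: Channel, category: string, prefs: Prefs): boolean {
  // The notification type uses "in_app" while the preferences use "inApp".
  const channelKey = type === "in_app" ? "inApp" : type;
  if (!prefs.channels[channelKey]) return false;
  if (MANDATORY_CATEGORIES.has(category)) return true;
  // Unknown categories default to allowed in this sketch.
  return prefs.categories[category] ?? true;
}

// Converts "HH:mm" into minutes since midnight.
function toMinutes(hhmm: string): number {
  const [hours, minutes] = hhmm.split(":").map(Number);
  return hours * 60 + minutes;
}

function isQuietHours(prefs: Prefs, now: Date = new Date()): boolean {
  if (!prefs.quietHours) return false;
  const { start, end, timezone } = prefs.quietHours;

  // Wall-clock time in the user's timezone, formatted as "HH:mm".
  const local = now.toLocaleTimeString("en-GB", {
    timeZone: timezone,
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  });
  const current = toMinutes(local) % (24 * 60);
  const startMin = toMinutes(start);
  const endMin = toMinutes(end);

  if (startMin === endMin) return false; // Empty window
  return startMin < endMin
    ? current >= startMin && current < endMin   // Same-day window, e.g. 13:00 to 15:00
    : current >= startMin || current < endMin;  // Window wraps midnight, e.g. 22:00 to 08:00
}
```

With quiet hours of 22:00 to 08:00 in UTC, a check at 23:30 UTC falls inside the window and a check at 12:00 UTC does not.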

Step 4: Priority Queues and Rate Limiting

Not all notifications are equally urgent. A security alert should be delivered immediately, while a marketing email can wait. Use separate queues for each priority level, with different consumer concurrency settings.

// Priority queue configuration
const queueConfig = {
  critical: { concurrency: 100, maxRetries: 5, retryDelay: 1000 },    // 2FA codes, security alerts
  high:     { concurrency: 50,  maxRetries: 3, retryDelay: 5000 },    // Order confirmations
  medium:   { concurrency: 20,  maxRetries: 3, retryDelay: 30000 },   // Social notifications
  low:      { concurrency: 5,   maxRetries: 2, retryDelay: 60000 },   // Marketing, digests
};

// Rate limiter implementation
class NotificationRateLimiter {
  private redis: RedisClient;

  // Rate limit rules
  private rules: Record<Notification["type"], { perHour: number; perDay: number }> = {
    push:   { perHour: 10, perDay: 50 },
    email:  { perHour: 5,  perDay: 20 },
    sms:    { perHour: 3,  perDay: 10 },
    in_app: { perHour: 30, perDay: 100 },  // Key must match Notification["type"]
  };

  async checkLimit(
    userId: string,
    channel: Notification["type"],
    category: string
  ): Promise<boolean> {
    // Transactional and security notifications bypass rate limits
    if (category === "transactional" || category === "security") {
      return true;
    }

    const rule = this.rules[channel];
    const hourKey = `ratelimit:${userId}:${channel}:hour:${currentHour()}`;
    const dayKey = `ratelimit:${userId}:${channel}:day:${currentDay()}`;

    const [hourCount, dayCount] = await Promise.all([
      this.redis.incr(hourKey),
      this.redis.incr(dayKey),
    ]);

    // Set expiry on first increment
    if (hourCount === 1) await this.redis.expire(hourKey, 3600);
    if (dayCount === 1) await this.redis.expire(dayKey, 86400);

    return hourCount <= rule.perHour && dayCount <= rule.perDay;
  }
}
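
The fixed-window counting used by the rate limiter can be exercised without Redis. The sketch below is a hypothetical in-memory variant (`InMemoryRateLimiter` and `tryAcquire` are illustrative names) with an injectable clock for testing. Unlike the Redis version, it checks the counters before incrementing, so rejected attempts do not consume quota. In Redis the same check-then-increment would need to be made atomic, for example with a Lua script.

```typescript
type Limit = { perHour: number; perDay: number };

const HOUR_MS = 3_600_000;
const DAY_MS = 86_400_000;

class InMemoryRateLimiter {
  private counts = new Map<string, number>();
  private rules: Record<string, Limit>;
  private now: () => number;

  constructor(rules: Record<string, Limit>, now: () => number = () => Date.now()) {
    this.rules = rules;
    this.now = now;
  }

  // Returns true and records the send if both windows still have room.
  tryAcquire(userId: string, channel: string): boolean {
    const rule = this.rules[channel];
    if (!rule) return false; // Unknown channel: fail closed

    const t = this.now();
    const hourKey = `${userId}:${channel}:hour:${Math.floor(t / HOUR_MS)}`;
    const dayKey = `${userId}:${channel}:day:${Math.floor(t / DAY_MS)}`;
    const hourCount = this.counts.get(hourKey) ?? 0;
    const dayCount = this.counts.get(dayKey) ?? 0;

    if (hourCount >= rule.perHour || dayCount >= rule.perDay) return false;

    // Note: old window keys are never evicted in this sketch.
    this.counts.set(hourKey, hourCount + 1);
    this.counts.set(dayKey, dayCount + 1);
    return true;
  }
}
```

With an SMS rule of 3 per hour, the fourth call within the same hour is rejected, and the next call succeeds once the clock moves into a new hour window.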

Step 5: Template Management

Notifications should not contain hardcoded text. A template system allows non-engineers to modify notification content without code changes, and supports localization.

interface NotificationTemplate {
  id: string;
  name: string;
  channels: {
    push?: { title: string; body: string; };
    email?: { subject: string; htmlBody: string; textBody: string; };
    sms?: { body: string; };
    inApp?: { title: string; body: string; actionUrl: string; };
  };
  variables: string[];  // Required template variables
  locale: string;       // "en", "es", "fr"
}

// Example template
const orderShippedTemplate: NotificationTemplate = {
  id: "order_shipped_v2",
  name: "Order Shipped",
  channels: {
    push: {
      title: "Your order is on its way!",
      body: "Order #{{orderId}} has shipped. Track: {{trackingUrl}}",
    },
    email: {
      subject: "Your order #{{orderId}} has shipped",
      htmlBody: "<h1>Great news, {{userName}}!</h1><p>Your order has shipped...</p>",
      textBody: "Great news, {{userName}}! Your order has shipped...",
    },
    sms: {
      body: "Your order #{{orderId}} shipped! Track at {{trackingUrl}}",
    },
  },
  variables: ["orderId", "userName", "trackingUrl"],
  locale: "en",
};
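
Rendering a template means substituting each `{{variable}}` placeholder with a value from `templateData`. The sketch below shows one possible approach; `renderTemplate` is a hypothetical helper, not the API of any particular template library, and it does no HTML escaping, which a real `htmlBody` renderer would need.

```typescript
function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function renderTemplate(
  text: string,
  requiredVariables: string[],
  data: Record<string, string | number>
): string {
  // Fail fast if a required variable is missing, rather than sending "{{orderId}}" to a user.
  const missing = requiredVariables.filter((name) => !hasOwn(data, name));
  if (missing.length > 0) {
    throw new Error(`Missing template variables: ${missing.join(", ")}`);
  }

  // Replace each {{name}} placeholder; placeholders with no matching value are left as-is.
  return text.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) =>
    hasOwn(data, name) ? String(data[name]) : match
  );
}

// Usage with the SMS body from the example template.
// The tracking URL is a made-up placeholder value.
const smsBody = renderTemplate(
  "Your order #{{orderId}} shipped! Track at {{trackingUrl}}",
  ["orderId", "trackingUrl"],
  { orderId: 1234, trackingUrl: "https://example.com/t/1234" }
);
console.log(smsBody);
```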

Step 6: Delivery Tracking and Analytics

class DeliveryTracker {
  // Track delivery status changes
  async updateStatus(
    notificationId: string,
    status: Notification["status"],
    metadata?: Record<string, any>
  ): Promise<void> {
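    // Only these statuses have a matching timestamp field in the data model
    const hasTimestamp = ["sent", "delivered", "read"].includes(status);

    await this.db.notifications.update(notificationId, {
      status,
      ...(hasTimestamp ? { [`${status}At`]: new Date() } : {}),
      // Skip when absent so stored metadata is not overwritten with {}
      ...(metadata ? { metadata } : {}),
    });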

    // Emit event for analytics pipeline
    await this.eventBus.emit("notification.status_changed", {
      notificationId,
      status,
      timestamp: Date.now(),
    });
  }

  // Analytics queries
  async getDeliveryStats(
    timeRange: { start: Date; end: Date },
    groupBy: "channel" | "category" | "priority"
  ): Promise<DeliveryStats[]> {
    // Returns: sent count, delivered count, read count, failed count
    // Grouped by the specified dimension
    return this.analyticsDB.query({
      metrics: ["sent", "delivered", "read", "failed"],
      dimensions: [groupBy],
      timeRange,
    });
  }
}
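
With at-least-once delivery, status updates can arrive duplicated or out of order, for example a late "delivered" receipt after the notification was already marked "read". One way to keep the record consistent is to validate each transition before applying it. The transition table below is an assumption made for illustration; the original design does not specify one.

```typescript
type Status = "pending" | "queued" | "sent" | "delivered" | "failed" | "read";

// Assumed lifecycle: pending -> queued -> sent -> delivered -> read,
// with "failed" reachable before delivery and treated as terminal.
const ALLOWED_TRANSITIONS: Record<Status, readonly Status[]> = {
  pending: ["queued", "failed"],
  queued: ["sent", "failed"],
  sent: ["delivered", "read", "failed"], // "read" may arrive without a delivery receipt
  delivered: ["read"],
  read: [],
  failed: [],
};

function canTransition(from: Status, to: Status): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Returns the new status, or keeps the current one if the update is stale or duplicated.
function applyStatus(current: Status, incoming: Status): Status {
  return canTransition(current, incoming) ? incoming : current;
}
```

A tracker could call `applyStatus` before writing to the database, so that a stale update never moves a notification backwards in its lifecycle.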

// Delivery rate metrics to track:
// - Send rate: notifications sent per second
// - Delivery rate: % of sent notifications confirmed delivered
// - Open rate: % of delivered notifications opened/read (email, push)
// - Click-through rate: % that clicked on a CTA
// - Bounce rate: % that failed delivery (email bounces, invalid tokens)
// - Unsubscribe rate: users opting out after receiving
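
Several of these metrics are simple ratios over the counts returned by `getDeliveryStats`. The sketch below shows one way to derive them. The `Counts` shape and the choice of denominators are assumptions; in particular, the failure rate here is computed over all attempts (`sent + failed`), since a notification that exhausts its retries is never marked "sent".

```typescript
interface Counts {
  sent: number;
  delivered: number;
  read: number;
  failed: number;
}

// Guards against division by zero when a group has no traffic.
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

function summarize(counts: Counts) {
  return {
    deliveryRate: ratio(counts.delivered, counts.sent),
    openRate: ratio(counts.read, counts.delivered),
    failureRate: ratio(counts.failed, counts.sent + counts.failed),
  };
}

// Made-up counts for illustration only
const summary = summarize({ sent: 1000, delivered: 950, read: 380, failed: 50 });
console.log(summary);
```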

Step 7: Retry Mechanisms

Notification delivery can fail for various reasons: provider outages, invalid device tokens, rate limits from external providers, or network issues. A robust retry strategy is essential.

class NotificationWorker {
  async processNotification(notification: Notification): Promise<void> {
    try {
      const result = await this.deliverByChannel(notification);

      if (result.success) {
        await this.tracker.updateStatus(notification.id, "sent");
      } else {
        throw new Error(result.error);
      }
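    } catch (error) {
      // A caught value is not guaranteed to be an Error instance
      const err = error instanceof Error ? error : new Error(String(error));
      await this.handleFailure(notification, err);
    }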
  }

  private async handleFailure(
    notification: Notification,
    error: Error
  ): Promise<void> {
    const config = queueConfig[notification.priority];

    if (notification.retryCount >= config.maxRetries) {
      // Max retries exceeded - mark as failed
      await this.tracker.updateStatus(notification.id, "failed", {
        error: error.message,
        finalRetryAt: new Date(),
      });

      // Move to dead letter queue for investigation
      await this.queue.publish("notifications.dead_letter", notification);
      return;
    }

    // Exponential backoff
    const delay = config.retryDelay * Math.pow(2, notification.retryCount);
    notification.retryCount++;

    await this.queue.publishWithDelay(
      `notifications.${notification.priority}`,
      notification,
      delay
    );
  }

  private async deliverByChannel(notification: Notification): Promise<DeliveryResult> {
    switch (notification.type) {
      case "push":
        return this.pushProvider.send(notification);
      case "email":
        return this.emailProvider.send(notification);
      case "sms":
        return this.smsProvider.send(notification);
      case "in_app":
        return this.inAppDelivery.send(notification);
    }
  }
}
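
With the configuration from Step 4, a critical notification that keeps failing is retried after 1, 2, 4, 8, and 16 seconds before it is marked as failed. When many notifications fail at once, such as during a provider outage, a purely deterministic schedule makes them all retry at the same moments. A common refinement is to cap the delay and add random jitter. The sketch below uses an "equal jitter" variant, which is a substitute for the plain exponential formula used in the worker; `backoffDelay` and its default cap are hypothetical.

```typescript
function backoffDelay(
  baseMs: number,
  retryCount: number,
  maxMs: number = 300_000,               // Assumed cap of 5 minutes
  random: () => number = Math.random     // Injectable for deterministic tests
): number {
  // Plain exponential growth, capped so late retries do not wait indefinitely
  const exponential = Math.min(maxMs, baseMs * Math.pow(2, retryCount));

  // Equal jitter: keep half of the delay fixed and randomize the other half,
  // giving a result in [exponential / 2, exponential)
  const half = exponential / 2;
  return Math.floor(half + random() * half);
}

// Example: possible delays for the "critical" queue (base delay 1000 ms)
for (let retry = 0; retry < 5; retry++) {
  console.log(`retry ${retry}: ${backoffDelay(1000, retry)} ms`);
}
```

The worker could pass the result to `publishWithDelay` in place of the unjittered delay.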

Key Design Principles

  • Decouple sending from delivery: Use message queues between the notification API and channel workers
  • Respect user preferences: Always check opt-in/opt-out before sending
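  • Idempotency: At-least-once delivery means a message can be processed more than once, so use the notification ID as a deduplication key to avoid sending the same notification twice on retry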
  • Provider abstraction: Use an adapter pattern so you can swap providers (e.g., switch from Twilio to Vonage) without changing core logic
  • Graceful degradation: If push notifications fail, fall back to email or in-app notifications
  • Observability: Log every state transition and track delivery metrics per channel
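
Three of these principles (provider abstraction, graceful degradation, and idempotency) can be combined in a small sketch. Everything named here is hypothetical: the `ChannelProvider` interface, the `FALLBACKS` order, and the in-memory `alreadySent` set, which a multi-worker deployment would replace with a shared store.

```typescript
type Channel = "push" | "sms" | "email" | "in_app";

// Adapter interface: each provider (APNs, Twilio, SES, ...) would be wrapped behind this.
interface ChannelProvider {
  send(notificationId: string, userId: string, content: string): Promise<boolean>;
}

// Assumed fallback order per primary channel
const FALLBACKS: Record<Channel, Channel[]> = {
  push: ["in_app", "email"],
  sms: ["push", "email"],
  email: ["in_app"],
  in_app: [],
};

// Ordered list of channels to try, limited to those the user has enabled.
function planChannels(primary: Channel, enabled: ReadonlySet<Channel>): Channel[] {
  return [primary, ...FALLBACKS[primary]].filter((channel) => enabled.has(channel));
}

// Returns the channel that accepted the notification, or null if none did
// or if the notification ID was already sent (duplicate from a retry).
async function deliverWithFallback(
  notificationId: string,
  userId: string,
  content: string,
  primary: Channel,
  enabled: ReadonlySet<Channel>,
  providers: Map<Channel, ChannelProvider>,
  alreadySent: Set<string>
): Promise<Channel | null> {
  if (alreadySent.has(notificationId)) return null;

  for (const channel of planChannels(primary, enabled)) {
    const provider = providers.get(channel);
    if (!provider) continue;
    try {
      if (await provider.send(notificationId, userId, content)) {
        alreadySent.add(notificationId);
        return channel;
      }
    } catch {
      // Provider error: fall through to the next channel
    }
  }
  return null;
}
```

Because the core logic depends only on `ChannelProvider`, swapping one SMS vendor for another would mean writing a new adapter rather than changing `deliverWithFallback`.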
