What is a Content Delivery Network?
A Content Delivery Network (CDN) is a geographically distributed network of proxy servers and data centers designed to provide high availability and performance by distributing content closer to end users. Instead of every request traveling to a single origin server, CDNs cache content at edge locations around the world, dramatically reducing latency and load on the origin.
CDNs are essential in modern system design. They handle a significant portion of global internet traffic and are used by virtually every large-scale web application, from streaming platforms to e-commerce sites.
Why Use a CDN?
- Reduced Latency: Content is served from a nearby edge location, which can cut round-trip time from hundreds of milliseconds to tens of milliseconds or less
- Scalability: CDNs absorb traffic spikes and distribute load across hundreds of edge servers
- Availability: If one edge node fails, traffic is routed to the next closest healthy node
- Security: CDNs provide DDoS protection, WAF, and TLS termination at the edge
- Cost Reduction: Fewer requests hit the origin server, reducing bandwidth and compute costs
How CDNs Work
When a user requests a resource, the request is steered to a nearby CDN edge server, typically through geo-aware DNS resolution or anycast routing. The edge server checks its cache. If the content is cached and still fresh (a cache hit), it is returned immediately. If not (a cache miss), the edge server fetches it from the origin, caches it if the response is cacheable, and then serves it to the user.
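```typescript
// Simplified CDN request flow
interface CDNRequest {
  url: string;
  userLocation: { lat: number; lng: number };
  headers: Record<string, string>;
}

interface CDNResponse {
  body: Buffer;
  headers: Record<string, string>;
}

interface CachedResponse extends CDNResponse {
  cachedAt: number; // epoch milliseconds
  ttl: number; // seconds
}

interface EdgeServer {
  id: string;
  location: { lat: number; lng: number };
  cache: Map<string, CachedResponse>;
}

// The origin fetch is passed in so the flow does not depend on a specific HTTP client
type OriginFetcher = (url: string) => Promise<CDNResponse>;

// Simplification: key on the URL only
function generateCacheKey(request: CDNRequest): string {
  return request.url;
}

// Simplification: read s-maxage (or max-age) from a lowercase cache-control header
function parseTTL(headers: Record<string, string>): number {
  const cacheControl = headers["cache-control"] ?? "";
  if (/\b(private|no-store|no-cache)\b/.test(cacheControl)) return 0;
  const match = /s-maxage=(\d+)/.exec(cacheControl) ?? /max-age=(\d+)/.exec(cacheControl);
  return match ? Number(match[1]) : 0;
}

async function handleCDNRequest(
  request: CDNRequest,
  edge: EdgeServer,
  fetchFromOrigin: OriginFetcher
): Promise<CDNResponse> {
  const cacheKey = generateCacheKey(request);
  const cached = edge.cache.get(cacheKey);

  if (cached && !isExpired(cached)) {
    // Cache HIT - serve directly from edge
    return {
      body: cached.body,
      headers: { ...cached.headers, "X-Cache": "HIT" },
    };
  }

  // Cache MISS - fetch from origin
  const originResponse = await fetchFromOrigin(request.url);
  const ttl = parseTTL(originResponse.headers);

  // Only store responses that a shared cache is allowed to keep
  if (ttl > 0) {
    edge.cache.set(cacheKey, {
      body: originResponse.body,
      headers: originResponse.headers,
      cachedAt: Date.now(),
      ttl,
    });
  }

  return {
    body: originResponse.body,
    headers: { ...originResponse.headers, "X-Cache": "MISS" },
  };
}

function isExpired(cached: CachedResponse): boolean {
  return Date.now() - cached.cachedAt > cached.ttl * 1000;
}
```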
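The routing step described at the start of this section (steering a user to a nearby edge) can be sketched as a nearest-neighbor lookup. This is a simplification under stated assumptions: real geo-DNS and anycast decisions depend on network topology, health, and load rather than pure geographic distance, and the names `GeoPoint`, `EdgeLocation`, `haversineKm`, and `pickNearestEdge` are hypothetical.

```typescript
// Hypothetical sketch: choose the geographically closest edge location.
interface GeoPoint {
  lat: number; // degrees
  lng: number; // degrees
}

interface EdgeLocation {
  id: string;
  location: GeoPoint;
}

// Great-circle distance between two points using the haversine formula.
function haversineKm(a: GeoPoint, b: GeoPoint): number {
  const earthRadiusKm = 6371;
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * earthRadiusKm * Math.asin(Math.sqrt(h));
}

// Returns the edge closest to the user, or undefined if the list is empty.
function pickNearestEdge(user: GeoPoint, edges: EdgeLocation[]): EdgeLocation | undefined {
  let best: EdgeLocation | undefined;
  let bestDistance = Infinity;
  for (const edge of edges) {
    const distance = haversineKm(user, edge.location);
    if (distance < bestDistance) {
      best = edge;
      bestDistance = distance;
    }
  }
  return best;
}

// Illustrative edge locations (approximate coordinates).
const sampleEdges: EdgeLocation[] = [
  { id: "fra", location: { lat: 50.11, lng: 8.68 } },
  { id: "nyc", location: { lat: 40.71, lng: -74.01 } },
  { id: "sin", location: { lat: 1.35, lng: 103.82 } },
];

// A user in Paris is closest to the Frankfurt edge.
console.log(pickNearestEdge({ lat: 48.86, lng: 2.35 }, sampleEdges)?.id); // "fra"
```

Distance is only a proxy for latency, so a production router would more likely rank edges by measured round-trip time and skip unhealthy nodes.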
Push CDN vs Pull CDN
CDNs can be classified by how content gets populated on edge servers. Understanding the difference is critical for choosing the right strategy for your workload.
Push vs Pull CDN Comparison
| Aspect | Push CDN | Pull CDN |
|---|---|---|
| Content Population | You upload content to the CDN proactively | CDN fetches from origin on first request |
| Best For | Static content that changes infrequently | Dynamic or frequently updated content |
| Storage Cost | Higher (content stored proactively) | Lower (only caches requested content) |
| First Request Latency | Low (content already at edge) | Higher (must fetch from origin) |
| Origin Load | Minimal after initial push | Varies based on cache hit ratio |
| Complexity | Requires upload pipeline | Simpler setup, just configure origin |
Push CDN
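With a push CDN, you are responsible for uploading content directly to the CDN. This gives you full control over what is cached and when. It works best for large static assets that are known in advance, such as software downloads or video files. Uploading assets to AWS S3 and serving them through CloudFront (with bucket access restricted via origin access control, the successor to origin access identity) is often used as a push-style workflow, although CloudFront itself still fetches from S3 on a cache miss.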
Pull CDN
A pull CDN is more common for web applications. You configure the CDN with your origin server URL, and the CDN lazily fetches and caches content on the first request. Subsequent requests are served from the cache until the TTL expires. Cloudflare and most general-purpose CDNs operate primarily as pull CDNs.
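The difference in how the edge gets populated can be illustrated with two tiny in-memory models. This is a sketch rather than how any particular provider works: the classes `PullEdge` and `PushEdge` and the `OriginFetch` callback are hypothetical, and TTLs and eviction are left out for brevity.

```typescript
// Hypothetical in-memory models contrasting pull and push population.
type OriginFetch = (path: string) => string;

class PullEdge {
  private cache = new Map<string, string>();
  private originFetch: OriginFetch;
  originRequests = 0;

  constructor(originFetch: OriginFetch) {
    this.originFetch = originFetch;
  }

  // Lazily fetch from the origin on the first request, then serve from cache.
  get(path: string): string {
    const cached = this.cache.get(path);
    if (cached !== undefined) return cached;
    this.originRequests++;
    const body = this.originFetch(path);
    this.cache.set(path, body);
    return body;
  }
}

class PushEdge {
  private store = new Map<string, string>();

  // The publisher uploads content ahead of time.
  upload(path: string, body: string): void {
    this.store.set(path, body);
  }

  // Nothing is fetched on demand: content that was never pushed is missing.
  get(path: string): string | undefined {
    return this.store.get(path);
  }
}

const pull = new PullEdge((path) => `origin:${path}`);
pull.get("/app.js"); // first request goes to the origin
pull.get("/app.js"); // second request is served from the edge cache
console.log(pull.originRequests); // 1

const push = new PushEdge();
push.upload("/video.mp4", "video-bytes");
console.log(push.get("/video.mp4")); // "video-bytes"
console.log(push.get("/never-uploaded")); // undefined
```

The pull model pays the origin round trip once per object, while the push model never contacts the origin but depends on the upload pipeline being complete.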
Caching Headers and TTL Strategies
Cache behavior is primarily controlled through HTTP headers. Properly configuring these headers is one of the most impactful things you can do for performance.
```typescript
// Common caching header configurations

// Immutable static assets (hashed filenames like main.a1b2c3.js)
const immutableHeaders = {
  "Cache-Control": "public, max-age=31536000, immutable",
  // Cached for 1 year, browser won't even revalidate
};

// Dynamic HTML pages
const htmlHeaders = {
  "Cache-Control": "public, max-age=0, s-maxage=60, stale-while-revalidate=300",
  // s-maxage: CDN caches for 60 seconds
  // stale-while-revalidate: serve stale for 5 min while fetching fresh
};

// API responses
const apiHeaders = {
  "Cache-Control": "private, no-cache",
  // private: CDN must not cache (user-specific data)
  // no-cache: always revalidate with origin
};

// Images and media
const mediaHeaders = {
  "Cache-Control": "public, max-age=86400, stale-while-revalidate=86400",
  // Cache for 1 day, serve stale for 1 more day while refreshing
};
```
Key Cache-Control Directives
- max-age: How long a response may be treated as fresh, in seconds. Applies to browsers and, unless s-maxage is set, to shared caches as well
- s-maxage: How long shared caches (CDN) should cache the resource. Overrides max-age for CDNs
- stale-while-revalidate: Serve stale content while fetching a fresh version in the background
- immutable: Tells the browser the resource will never change (use with content-hashed URLs)
- no-store: Never cache this response anywhere (sensitive data)
- private: Only the browser may cache this, not CDNs (user-specific content)
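To make the interaction between these directives concrete, the following sketch parses a Cache-Control value and classifies a cached entry from the point of view of a shared cache such as a CDN. It is a simplified reading of the directives listed above, not a complete HTTP caching implementation; the function names `parseCacheControl`, `sharedCacheTtl`, and `classify` are hypothetical, and treating a response without any max-age as uncacheable is an assumption made here for simplicity.

```typescript
// Hypothetical helpers: parse Cache-Control and classify an entry for a shared cache.
type Directives = Map<string, string | true>;

function parseCacheControl(header: string): Directives {
  const directives: Directives = new Map();
  for (const part of header.split(",")) {
    const [rawName, rawValue] = part.trim().split("=");
    if (!rawName) continue;
    directives.set(rawName.toLowerCase(), rawValue === undefined ? true : rawValue.replace(/"/g, ""));
  }
  return directives;
}

// Reads a directive's numeric value in seconds, if present and valid.
function seconds(directives: Directives, name: string): number | undefined {
  const value = directives.get(name);
  if (typeof value !== "string") return undefined;
  const parsed = Number.parseInt(value, 10);
  return Number.isNaN(parsed) ? undefined : parsed;
}

// Freshness lifetime for a shared cache; undefined means "do not serve from cache".
function sharedCacheTtl(directives: Directives): number | undefined {
  if (directives.has("no-store") || directives.has("private") || directives.has("no-cache")) {
    return undefined;
  }
  // s-maxage takes precedence over max-age for shared caches.
  return seconds(directives, "s-maxage") ?? seconds(directives, "max-age");
}

type EntryState = "fresh" | "stale-while-revalidate" | "expired";

function classify(header: string, ageSeconds: number): EntryState {
  const directives = parseCacheControl(header);
  const ttl = sharedCacheTtl(directives);
  if (ttl === undefined) return "expired"; // treated as not servable from the CDN
  if (ageSeconds < ttl) return "fresh";
  const swr = seconds(directives, "stale-while-revalidate") ?? 0;
  return ageSeconds < ttl + swr ? "stale-while-revalidate" : "expired";
}

const html = "public, max-age=0, s-maxage=60, stale-while-revalidate=300";
console.log(classify(html, 30)); // "fresh"
console.log(classify(html, 120)); // "stale-while-revalidate"
console.log(classify(html, 400)); // "expired"
console.log(classify("private, no-cache", 0)); // "expired"
```

In the stale-while-revalidate state a CDN would serve the cached copy immediately and refresh it from the origin in the background.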
CDN Cache Invalidation
Cache invalidation is one of the hardest problems in computer science, and CDNs are no exception. When your content changes, you need a strategy to ensure users see the updated version.
Invalidation Strategies
- TTL-based Expiration: Content expires automatically after the TTL. Simple but introduces a delay between updates and visibility.
- Purge/Invalidation API: Explicitly tell the CDN to remove specific cached content. Most CDNs provide an API for this.
- Cache Busting via URL Versioning: Append a version or hash to the URL (e.g., style.v2.css or style.a1b2c3.css). Since the URL changes, the CDN treats it as new content.
- Surrogate Keys / Cache Tags: Tag cached objects with metadata, then purge all objects with a given tag. Fastly and Cloudflare support this.
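The URL versioning strategy in the list above can be automated at build time by deriving the version from the file's content, so the URL changes exactly when the content does. A minimal sketch, assuming a Node.js build step; `hashedFileName` is a hypothetical helper and not part of any CDN SDK.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: derive a content-hashed file name such as style.<hash>.css.
function hashedFileName(fileName: string, content: string, hashLength = 8): string {
  const hash = createHash("sha256").update(content).digest("hex").slice(0, hashLength);
  const dot = fileName.lastIndexOf(".");
  // Files without an extension get the hash appended at the end.
  return dot <= 0
    ? `${fileName}.${hash}`
    : `${fileName.slice(0, dot)}.${hash}${fileName.slice(dot)}`;
}

const v1 = hashedFileName("style.css", "body { color: black; }");
const v2 = hashedFileName("style.css", "body { color: navy; }");
console.log(v1 !== v2); // true: changed content yields a new URL
```

Because each version lives at its own URL, such assets can be served with a long max-age and the immutable directive, and old versions simply age out of the cache.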
```typescript
// Cache invalidation example using Cloudflare API
async function purgeCache(zoneId: string, urls: string[]): Promise<void> {
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.CF_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ files: urls }),
    }
  );

  if (!response.ok) {
    throw new Error(`Purge failed: ${response.statusText}`);
  }
}

// Purge everything (nuclear option)
async function purgeAll(zoneId: string): Promise<void> {
  const response = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.CF_API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ purge_everything: true }),
    }
  );

  // Surface failures here too, so a failed purge is not silently ignored
  if (!response.ok) {
    throw new Error(`Purge failed: ${response.statusText}`);
  }
}
```
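The surrogate key (cache tag) strategy can be modeled as a reverse index from tags to cache keys. The sketch below is an in-memory illustration only; `TaggedCache` is a hypothetical class, whereas real CDNs attach tags through response headers (Fastly uses `Surrogate-Key`, Cloudflare uses `Cache-Tag`) and purge across many edge nodes.

```typescript
// Hypothetical in-memory cache that supports purge-by-tag.
class TaggedCache {
  private entries = new Map<string, string>(); // cache key -> body
  private keysByTag = new Map<string, Set<string>>(); // tag -> cache keys

  set(key: string, body: string, tags: string[]): void {
    this.entries.set(key, body);
    for (const tag of tags) {
      let keys = this.keysByTag.get(tag);
      if (!keys) {
        keys = new Set<string>();
        this.keysByTag.set(tag, keys);
      }
      keys.add(key);
    }
  }

  get(key: string): string | undefined {
    return this.entries.get(key);
  }

  // Remove every entry carrying the tag; returns how many entries were purged.
  purgeTag(tag: string): number {
    const keys = this.keysByTag.get(tag);
    if (!keys) return 0;
    let purged = 0;
    keys.forEach((key) => {
      if (this.entries.delete(key)) purged++;
    });
    this.keysByTag.delete(tag);
    return purged;
  }
}

const tagged = new TaggedCache();
tagged.set("/products/1", "<html>product 1</html>", ["product-1", "catalog"]);
tagged.set("/products/2", "<html>product 2</html>", ["product-2", "catalog"]);
tagged.set("/about", "<html>about</html>", ["static"]);

console.log(tagged.purgeTag("catalog")); // 2
console.log(tagged.get("/products/1")); // undefined
console.log(tagged.get("/about")); // "<html>about</html>"
```

Tagging lets one purge call invalidate every page that depends on a changed record without enumerating the URLs.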
Multi-CDN Strategies
Large-scale applications often use multiple CDN providers simultaneously for improved reliability, performance, and cost optimization. A multi-CDN strategy routes traffic across providers like Cloudflare, CloudFront, and Akamai based on real-time performance data.
Benefits of Multi-CDN
- Redundancy: If one CDN has an outage, traffic is routed to another provider
- Performance Optimization: Route users to whichever CDN is fastest for their location
- Cost Optimization: Use different CDNs for different regions based on pricing
- Vendor Independence: Avoid lock-in to a single provider
Implementation Approaches
- DNS-based Routing: Use a DNS provider like NS1 or Route 53 that supports performance-based routing to direct users to the fastest CDN
- Application-layer Routing: Use a global load balancer or service like Cedexis/Citrix ITM to make real-time routing decisions
- Failover-only: Use one primary CDN with automatic failover to a secondary CDN on health check failures
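The routing approaches above share one core decision: among the providers that are currently healthy, pick the best one for the user, and fail over when the preferred one is down. A minimal sketch of that decision follows; the `CdnStatus` shape, the `p95LatencyMs` metric, the sample numbers, and `chooseCdn` are illustrative assumptions, not the API of any DNS or traffic-management product.

```typescript
// Hypothetical routing decision: fastest healthy CDN, with failover.
interface CdnStatus {
  name: string;
  healthy: boolean; // result of the latest health check
  p95LatencyMs: number; // measured for the user's region
}

function chooseCdn(candidates: CdnStatus[]): CdnStatus | undefined {
  const healthy = candidates.filter((cdn) => cdn.healthy);
  if (healthy.length === 0) return undefined; // caller may fall back to the origin
  return healthy.reduce((best, cdn) => (cdn.p95LatencyMs < best.p95LatencyMs ? cdn : best));
}

// Made-up measurements for one region.
const statuses: CdnStatus[] = [
  { name: "cdn-a", healthy: false, p95LatencyMs: 20 },
  { name: "cdn-b", healthy: true, p95LatencyMs: 35 },
  { name: "cdn-c", healthy: true, p95LatencyMs: 50 },
];

// cdn-a is fastest but unhealthy, so traffic goes to cdn-b.
console.log(chooseCdn(statuses)?.name); // "cdn-b"
```

A failover-only setup is the special case where the primary is always chosen while healthy, regardless of latency.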
Real-World CDN Providers
| Provider | Strengths | Best For |
|---|---|---|
| Cloudflare | Free tier, Workers at edge, DDoS protection, easy setup | Web apps, APIs, general purpose |
| CloudFront | Deep AWS integration, Lambda@Edge, S3 origin | AWS-based architectures |
| Akamai | One of the largest edge networks, enterprise features, advanced security | Enterprise, media streaming |
| Fastly | Instant purge, VCL/Compute@Edge, real-time logging | Dynamic content, API acceleration |
| Vercel Edge Network | Automatic with Next.js, ISR support, zero config | Next.js and Jamstack apps |
CDN Design Interview Tips
- Always mention CDN when discussing read-heavy systems or serving static assets
- Discuss cache hit ratio as a key metric (aim for 90%+ for static assets)
- Consider cache invalidation strategy as part of your design
- Remember edge compute for personalization without losing cache benefits (Cloudflare Workers, Lambda@Edge)
- Mention geographic distribution for global user bases
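When discussing cache hit ratio, it helps to show the arithmetic: the ratio is hits divided by total requests, and whatever is not a hit becomes origin load. The helper names and the request counts below are made up for illustration.

```typescript
// Hypothetical helpers for back-of-the-envelope CDN estimates.

// Cache hit ratio = hits / (hits + misses).
function cacheHitRatio(hits: number, misses: number): number {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Requests that still reach the origin at a given hit ratio.
function originRequests(totalRequests: number, hitRatio: number): number {
  return Math.round(totalRequests * (1 - hitRatio));
}

console.log(cacheHitRatio(9500, 500)); // 0.95
console.log(originRequests(1000000, 0.95)); // 50000
console.log(originRequests(1000000, 0.9)); // 100000
```

Moving from a 90% to a 95% hit ratio halves the origin traffic in this example, which is why small improvements in hit ratio matter at scale.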