CDNs and Edge Caching
A CDN is a global fleet of caching reverse proxies. Each one, at a point of presence (PoP) near a population of users, holds copies of your content and serves cache hits without touching your origin server. The win is three-fold: latency drops because the content is physically closer (a cache hit in the same city beats a round-trip across an ocean), origin load drops because cache hits never reach you, and traffic spikes and DDoS floods get absorbed at the edge instead of melting your servers.
The caching part is easy. The hard part, and the source of most CDN incidents, is invalidation: knowing what's stale and getting every edge node on the planet to agree to drop it. A CDN that serves the wrong content fast is worse than an origin that serves the right content slow. The whole discipline of running a CDN is managing the gap between "I changed this" and "every edge has stopped serving the old copy," and the cleanest answer turns out to be to never need invalidation at all.
Points of Presence and Anycast Routing
A CDN runs hundreds of PoPs, and the routing trick that sends each user to the nearest one is anycast: the same IP address is announced from every PoP simultaneously, and BGP's shortest-path routing delivers each user to the topologically closest announcement. A user in Frankfurt and a user in São Paulo type the same hostname, resolve the same IP, and land on different PoPs — the network itself does the steering, with no DNS trickery required.
Anycast also gives DDoS resilience for free. A flood aimed at the anycast IP spreads across every PoP rather than converging on one location, so the attack is diluted across the CDN's global capacity instead of saturating a single link. The same property that puts users on a nearby edge spreads an attacker across all of them.
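The steering and dilution described above can be illustrated with a toy model. This is only a sketch: the network names, PoP codes, and AS-path lengths are invented, and real BGP applies local policy before it compares path length.

```python
# Hypothetical AS-path lengths from each user network to each PoP's
# announcement of the same anycast IP. All values are illustrative.
ROUTES = {
    "frankfurt-isp": {"fra": 1, "gru": 4, "sin": 3},
    "sao-paulo-isp": {"fra": 4, "gru": 1, "sin": 5},
}


def pick_pop(user_network: str) -> str:
    """Return the PoP whose announcement has the shortest AS path."""
    routes = ROUTES[user_network]
    return min(routes, key=routes.get)


def flood_per_pop(sources: dict) -> dict:
    """Sum attack traffic (Gbps) by the PoP each source network lands on."""
    load = {}
    for network, gbps in sources.items():
        pop = pick_pop(network)
        load[pop] = load.get(pop, 0.0) + gbps
    return load
```

In this model both users target the same address, yet `pick_pop` sends them to different PoPs, and a flood sourced from both networks is split between those PoPs instead of landing on one.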
Cache Keys and TTLs
An edge decides whether it already has a response by computing a cache key: typically the request's host plus path, often with the query string and sometimes specific headers, depending on the CDN's defaults and your configuration. Two requests that compute the same key share a cached copy, which is exactly why the cache key is where the worst bugs live: include too little and you serve one user's personalized page to another; include too much and your hit rate collapses because every minor variation is a separate entry.
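A minimal sketch of how such a key might be computed. The function name, the `vary_on` parameter, and the list of ignored tracking parameters are assumptions for illustration, not any particular CDN's behavior.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical tracking parameters that never change the response body.
IGNORED_PARAMS = frozenset({"utm_source", "utm_campaign"})


def cache_key(url: str, headers: dict, vary_on: tuple = ()) -> str:
    """Build a key from host, path, normalized query, and chosen headers."""
    parts = urlsplit(url)
    # Sort parameters so ?a=1&b=2 and ?b=2&a=1 share one entry, and drop
    # parameters that would only fragment the cache.
    params = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    key = parts.netloc.lower() + parts.path
    if params:
        key += "?" + urlencode(params)
    # Only headers named in vary_on split the entry; everything else is shared.
    lowered = {k.lower(): v for k, v in headers.items()}
    for name in vary_on:
        key += "|" + name.lower() + "=" + lowered.get(name.lower(), "")
    return key
```

With `vary_on=("Accept-Language",)` a German and an English request get separate entries; with the default empty tuple they would share one, which is the "too little" failure mode if the response actually differs by language.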
How long the edge keeps a copy is governed by TTL, set primarily through the Cache-Control response header from the origin. max-age sets the freshness window; s-maxage overrides it for shared caches like the CDN; no-store forbids caching entirely. There is usually a hierarchy (a browser cache, then a regional edge, then a shielding tier in front of the origin), and each layer applies the directive aimed at it: browsers honor max-age, while the shared edge and shield tiers honor s-maxage when it is present and fall back to max-age otherwise.
# origin response headers that drive edge caching
Cache-Control: public, max-age=300, s-maxage=86400
#   public   -> a shared CDN cache may store it
#   max-age  -> browsers cache 5 minutes
#   s-maxage -> the CDN caches 24 hours

Cache-Control: private, no-store   # never cache (per-user data)
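To make the browser-versus-CDN split concrete, here is a simplified sketch of how a cache might turn a Cache-Control value into a TTL. It is an assumption-laden reduction: it ignores no-cache, Expires, and heuristic freshness, and it treats a missing directive as "do not cache", which real caches may not do.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def ttl_seconds(value: str, shared: bool) -> int:
    """Freshness lifetime in seconds; 0 means this cache should not reuse it."""
    d = parse_cache_control(value)
    if "no-store" in d:
        return 0
    if shared and "private" in d:
        return 0  # private responses are for the browser cache only
    if shared and d.get("s-maxage") is not None:
        return int(d["s-maxage"])  # shared caches prefer s-maxage
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0
```

Called with the first header from the example, `ttl_seconds(..., shared=True)` yields the 24-hour CDN lifetime and `shared=False` yields the 5-minute browser lifetime.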
Invalidation, Purging, and Versioned URLs
There are three ways to make stale content go away. The passive one is TTL expiry: set a short max-age and the edge re-fetches on its own — simple, but you serve stale content for up to the TTL and pay origin traffic on every expiry. The active one is an explicit purge: call the CDN's API to evict a path from every PoP. Purging is immediate but expensive and dangerous at scale — purge everything on every deploy and you get a thundering herd of cache misses all stampeding the origin at once.
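The difference between passive expiry and an explicit purge can be seen in a toy single-PoP cache. This is a sketch only: the `EdgeCache` class and its injected clock are hypothetical, and a real CDN has to repeat the purge across every PoP.

```python
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy single-PoP cache: entries expire by TTL or are evicted by purge."""

    def __init__(self, fetch_origin: Callable[[str], bytes], ttl: float,
                 clock: Callable[[], float]) -> None:
        self._fetch = fetch_origin
        self._ttl = ttl
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}
        self.origin_fetches = 0  # counts how often the origin was hit

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now < entry[1]:
            return entry[0]  # fresh hit: origin untouched, possibly stale content
        body = self._fetch(path)  # miss or expired: go back to origin
        self.origin_fetches += 1
        self._store[path] = (body, now + self._ttl)
        return body

    def purge(self, path: str) -> None:
        """Evict one path so the next request re-fetches it."""
        self._store.pop(path, None)

    def purge_all(self) -> None:
        """Evict everything; every subsequent request becomes an origin fetch."""
        self._store.clear()
```

Watching `origin_fetches` after `purge_all()` shows the stampede risk in miniature: every path that was a hit a moment ago now costs one origin request.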
The third way sidesteps the problem entirely. With versioned URLs (cache-busting), you never overwrite content — you publish it under a new name. app.a1b2c3.js replaces app.9f8e7d.js, and because the URL changed, the old cache entry is simply never requested again. You give the versioned asset a one-year immutable TTL and never purge it, because a new deploy means a new filename, not a new copy of the old one. This is why build pipelines fingerprint their assets — it converts invalidation from an operation into a non-event.
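A minimal sketch of the fingerprinting step a build pipeline might perform. The `fingerprint` helper, the SHA-256 choice, and the 8-character digest length are assumptions for illustration; real bundlers have their own naming schemes.

```python
import hashlib
from pathlib import PurePosixPath

# Headers a versioned asset could be served with: one year, never revalidated.
IMMUTABLE_HEADERS = {"Cache-Control": "public, max-age=31536000, immutable"}


def fingerprint(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the file extension: app.js -> app.<hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(p.stem + "." + digest + p.suffix))
```

Because the name is derived from the bytes, an unchanged file keeps its URL (and its cache entries) across deploys, while any change produces a URL no cache has seen before.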
Dynamic Acceleration
Not everything is cacheable — a logged-in dashboard, a checkout, a personalized feed all vary per user. CDNs still accelerate these through means other than caching the response. Connection reuse keeps warm, pre-established TLS connections from the PoP back to the origin, so a user's request skips the handshake latency to a distant server and rides an already-open pipe. The user pays the short hop to the nearby PoP; the long hop to origin is already warm.
Edge compute pushes the logic itself outward — running small functions at the PoP to assemble responses, do auth checks, or personalize a cached shell near the user. The pattern is to cache the parts that are common (the page template, the static assets) and compute only the per-user fragment, rather than treating a page as all-or-nothing cacheable. Even uncacheable traffic benefits from terminating TLS at a nearby edge and crossing the slow long-haul leg over an optimized backbone.
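The shell-plus-fragment pattern might look like the following sketch. It is plain Python standing in for an edge function; the shell markup, the `user` fields, and `handle_request` are hypothetical, and actual edge runtimes expose their own request and cache APIs.

```python
import html
from string import Template

# The shell is identical for every user, so it can live in the edge cache.
CACHED_SHELL = Template("<html><body><nav>Shop</nav>$fragment</body></html>")


def render_fragment(user: dict) -> str:
    """Per-user part, computed at the edge on every request."""
    name = html.escape(user["name"])
    return "<p>Hello, " + name + ". Cart: " + str(int(user["cart_items"])) + "</p>"


def handle_request(user: dict):
    """Combine the shared shell with the per-user fragment."""
    body = CACHED_SHELL.substitute(fragment=render_fragment(user))
    # The assembled page is personalized, so it must not be stored itself.
    return body, {"Cache-Control": "private, no-store"}
```

Only `render_fragment` runs per user; the shell is reused as-is, and the assembled page is marked uncacheable so the personalized result never lands in a shared cache.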
Cache TTL lets content expire passively after max-age — zero operational effort, but you serve stale content for up to the TTL and re-fetch on expiry. Use it for content where being a few minutes stale is harmless.
Explicit purge actively evicts a path from every PoP via API — immediate, but costly and prone to origin stampedes if overused. Use it sparingly, for urgent corrections to a specific URL.
Versioned URLs publish each change under a new filename so the old entry is never requested again — no invalidation needed at all. Use it for build assets, with a one-year immutable TTL, to make staleness structurally impossible.
Common Pitfalls
- Caching a response that carries a session cookie or auth header without varying the cache key on it. One user's personalized page gets stored and served to the next user — a data leak the CDN happily replicates worldwide.
- Deploying new JavaScript or CSS under the same filename with no cache-busting. Edges and browsers keep serving the old version until the TTL expires, so users run a half-updated app for hours after the deploy.
- Purging the entire cache on every deploy. Every edge cold-misses at once and stampedes the origin in a thundering herd, turning a routine release into an origin-overload incident.
- Setting a long TTL on content that actually changes. A price or inventory count cached for 24 hours shows stale numbers all day, because the edge has no reason to re-fetch until the TTL runs out.
- Assuming a cache hit means correct content. An over-broad cache key ignores a header that should vary the response, so the fast hit returns the wrong variant — wrong language, wrong currency, wrong user.
Best Practices
- Fingerprint build assets with a content hash in the filename and serve them immutable, max-age=31536000, so a deploy publishes new URLs and never needs a purge.
- Set Cache-Control: private, no-store on any response carrying auth or per-user data, so a shared CDN cache never stores content that belongs to one user.
- Vary the cache key only on the headers that genuinely change the response, so personalization is correct without exploding the entry count and tanking the hit rate.
- Prefer versioned URLs over purging for routine changes, and reserve explicit purge for urgent one-off corrections to a specific path.
- Use s-maxage to give the CDN a long TTL while keeping browser max-age short, so the edge absorbs origin load without locking stale content into every visitor's browser.
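The per-user data rule above can be turned into a rough pre-deploy check. This is a deliberately conservative heuristic, not a statement of how any CDN decides what to store: the function name and messages are invented, and it will flag static assets that merely receive a cookie, so treat its output as prompts for review.

```python
def shared_cache_risks(request_headers: dict, response_headers: dict) -> list:
    """List reasons a response might leak per-user data through a shared cache."""
    req = {k.lower() for k in request_headers}
    res = {k.lower(): v for k, v in response_headers.items()}
    directives = {
        part.strip().split("=")[0].lower()
        for part in res.get("cache-control", "").split(",")
        if part.strip()
    }
    if directives & {"private", "no-store"}:
        return []  # a shared cache is already told not to store this
    risks = []
    if "authorization" in req or "cookie" in req:
        risks.append("request carries credentials but the response is not marked private")
    if "set-cookie" in res:
        risks.append("response sets a cookie but is not marked private")
    return risks
```

An empty list means the response is either explicitly kept out of shared caches or shows no sign of being personalized.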
Knowledge Check
After every deploy your CDN serves old JavaScript for hours. What is the cleanest fix?
- Fingerprint each asset's filename so a new build is a brand-new URL
- Purge the entire CDN cache on every deploy so that all edges re-fetch at once
- Set a very short TTL so the asset re-fetches from origin every few seconds
- Disable edge caching entirely for all of your JavaScript and CSS files
A logged-in user reports seeing another user's account page. The page is served by the CDN. What is the likely cause?
- The cache key ignored the auth cookie, so one user's page matched another
- The TTL was set too short, so the edge ended up refetching the wrong user's page
- Anycast routed both of these users to the same PoP and mixed their sessions together
- The origin forgot to send the CDN a purge request right after the user logged in
How does a CDN get a user in Frankfurt and one in São Paulo to the same hostname but different nearby PoPs?
- Anycast announces one IP from every PoP, and BGP picks the nearest
- A single fast origin server in one city that both users connect to directly
- Per-user cache TTLs that decide which PoP each individual user reaches
- A cache key that encodes each user's region and selects the right PoP