HTTP/2

HTTP/2 keeps every semantic from HTTP/1.1 — the same methods, status codes, and headers — and throws away the text wire format that carried them. In its place is a binary, multiplexed framing layer: the connection is divided into independent streams, each carrying one request/response, and frames from many streams interleave on a single TCP connection. The thing browsers used to do — open six parallel connections to get concurrency — becomes one connection carrying dozens of concurrent exchanges.

It also compresses headers with HPACK, because in HTTP/1.1 the same fat header block (cookies, user-agent, accept lists) was re-sent verbatim on every request. HTTP/2 kills HTTP/1.1's head-of-line blocking at the HTTP layer — but it rides on one TCP connection, and a single lost TCP segment still stalls every stream underneath it. That residual blocking, at the transport layer this time, is exactly the problem HTTP/3 was built to remove.

HTTP/1.1: serial requests
One response at a time, in order. Browsers open ~6 parallel TCP connections to fake concurrency; a slow response blocks everything queued behind it on its connection (HTTP head-of-line).

HTTP/2: multiplexed streams
Dozens of streams interleave on one connection. HEADERS/DATA frames are tagged with a stream ID, HPACK compresses repeated headers, and a slow stream no longer blocks a fast one.

Streams, Frames, and Multiplexing

An HTTP/2 connection carries many streams, each identified by a numeric ID and representing one request/response pair. Client-initiated streams use odd IDs (1, 3, 5, ...), and stream 0 is reserved for frames that apply to the connection as a whole. A stream is a sequence of frames: a HEADERS frame carries the request or response headers, DATA frames carry the body, and control frames like SETTINGS, WINDOW_UPDATE, and RST_STREAM manage the connection and its individual streams. Each frame is tagged with its stream ID, so the receiver can demultiplex interleaved frames back into the right streams.
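
To make the demultiplexing step concrete, here is a minimal Python sketch of the 9-byte frame header (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream ID) and of sorting DATA payloads back into per-stream bodies. The helper names pack_frame, iter_frames, and demultiplex are made up for this illustration, and the sketch ignores padding, HEADERS decoding, and error handling.

```python
import struct
from collections import defaultdict

DATA = 0x0          # frame type code for DATA
END_STREAM = 0x1    # flag: last frame the sender will send on this stream


def pack_frame(frame_type, flags, stream_id, payload=b""):
    """Build one frame: 9-byte header followed by the payload."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16, length & 0xFFFF,   # 24-bit payload length
        frame_type, flags,
        stream_id & 0x7FFFFFFF,          # top bit is reserved
    )
    return header + payload


def iter_frames(buf):
    """Yield (type, flags, stream_id, payload) for each complete frame in buf."""
    offset = 0
    while offset + 9 <= len(buf):
        hi, lo, ftype, flags, sid = struct.unpack_from(">BHBBI", buf, offset)
        length = (hi << 16) | lo
        payload = buf[offset + 9: offset + 9 + length]
        yield ftype, flags, sid & 0x7FFFFFFF, payload
        offset += 9 + length


def demultiplex(buf):
    """Reassemble DATA payloads into one body per stream ID."""
    bodies = defaultdict(bytearray)
    for ftype, _flags, sid, payload in iter_frames(buf):
        if ftype == DATA:
            bodies[sid] += payload
    return {sid: bytes(body) for sid, body in bodies.items()}


# Frames from streams 1 and 3 interleaved on one "wire".
wire = (
    pack_frame(DATA, 0, 1, b"<html>")
    + pack_frame(DATA, 0, 3, b"body{")
    + pack_frame(DATA, END_STREAM, 1, b"</html>")
    + pack_frame(DATA, END_STREAM, 3, b"}")
)
bodies = demultiplex(wire)
```

The receiver never needs the frames of one stream to be contiguous; the stream ID in each header is enough to route the payload.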

This is what multiplexing means: stream 1's response and stream 3's response travel as interleaved frames on the same wire, neither waiting on the other. A slow response on one stream no longer blocks a fast one on another, which was impossible in HTTP/1.1 where responses came back strictly in order. Flow control is per-stream and per-connection via WINDOW_UPDATE, so one greedy download cannot starve the others. The whole point is concurrency without the cost of multiple connections.
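
The flow-control rule can be sketched as simple window accounting: a sender may emit DATA only up to the smaller of the stream's window and the connection's window, and WINDOW_UPDATE frames refill them. The class below is an illustrative model, not a real library API; the 100,000-byte connection window is an assumed value, as if the receiver had already enlarged it with a WINDOW_UPDATE on stream 0.

```python
DEFAULT_STREAM_WINDOW = 65_535  # initial window size defined by the HTTP/2 spec


class FlowControl:
    """Toy sender-side view of HTTP/2 flow-control windows (DATA frames only)."""

    def __init__(self, connection_window, stream_window=DEFAULT_STREAM_WINDOW):
        self.connection = connection_window
        self.stream_window = stream_window
        self.streams = {}

    def open_stream(self, stream_id):
        self.streams[stream_id] = self.stream_window

    def send_data(self, stream_id, want):
        """Send as many of `want` bytes as both windows allow; return the count."""
        allowed = max(0, min(want, self.streams[stream_id], self.connection))
        self.streams[stream_id] -= allowed
        self.connection -= allowed
        return allowed

    def window_update(self, stream_id, increment):
        """Apply a WINDOW_UPDATE; stream 0 means the connection window."""
        if stream_id == 0:
            self.connection += increment
        else:
            self.streams[stream_id] += increment


fc = FlowControl(connection_window=100_000)  # assumed value for the demo
fc.open_stream(1)
fc.open_stream(3)

sent_big = fc.send_data(1, 1_000_000)  # greedy download, capped by its stream window
sent_small = fc.send_data(3, 10_000)   # interactive request still fits
stalled = fc.send_data(1, 1)           # stream 1 must wait for WINDOW_UPDATE
fc.window_update(1, 1_000)
resumed = fc.send_data(1, 5_000)
```

In this model the large transfer on stream 1 stops at its own window, so stream 3 still finds room in the connection window.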

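# nghttp -v prints the frame trace; note the interleaved stream IDs.
# Abridged, illustrative output: real traces add timestamps, lengths, and flags, and IDs may differ.
# Two URLs on the same host are requested over one connection.
nghttp -nv https://example.com/ https://example.com/style.css
 recv SETTINGS frame <stream_id=0>
 send HEADERS frame <stream_id=1>   GET /
 send HEADERS frame <stream_id=3>   GET /style.css
 recv HEADERS frame <stream_id=3>   200   <- stream 3 replies first
 recv DATA    frame <stream_id=3>
 recv HEADERS frame <stream_id=1>   200   <- out of request order, fine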

HPACK Header Compression

Headers are repetitive and large. The same Cookie, User-Agent, and Accept lines go out on every request, easily hundreds of bytes in total, and HTTP/1.1 sent them in full every time. HPACK fixes this with a dynamic table that the encoder and decoder keep in sync for each connection: once a header has been sent, both ends remember it, and later requests reference it by a small index instead of re-spelling the whole field. A predefined static table of 61 common entries covers fields like :method: GET out of the box, so they cost one byte.
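
The indexing idea can be shown with a deliberately small encoder sketch. It uses HPACK's prefix-integer encoding, the indexed representation (high bit set), and the "literal with incremental indexing" representation (0x40 prefix), but it is a simplification under stated assumptions: only three static entries are included, strings are never Huffman-coded, header names are never indexed on their own, and table size limits and eviction are omitted. The Encoder class and the header values are hypothetical.

```python
# Small subset of the HPACK static table (index numbers are from the spec).
STATIC_TABLE = {
    (":method", "GET"): 2,
    (":path", "/"): 4,
    (":scheme", "https"): 7,
}
STATIC_SIZE = 61  # dynamic-table indices start right after the static table


def encode_int(value, prefix_bits, flags=0):
    """HPACK prefix-integer encoding; `flags` fills the bits above the prefix."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([flags | value])
    out = [flags | limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def encode_str(text):
    """Length-prefixed string with the Huffman bit left at 0."""
    raw = text.encode("ascii")
    return encode_int(len(raw), 7) + raw


class Encoder:
    def __init__(self):
        self.dynamic = []  # newest entry first, as in HPACK

    def encode_field(self, name, value):
        key = (name, value)
        if key in STATIC_TABLE:
            return encode_int(STATIC_TABLE[key], 7, 0x80)
        if key in self.dynamic:
            index = STATIC_SIZE + 1 + self.dynamic.index(key)
            return encode_int(index, 7, 0x80)
        # First sighting: send it literally and remember it for next time.
        self.dynamic.insert(0, key)
        return b"\x40" + encode_str(name) + encode_str(value)

    def encode(self, headers):
        return b"".join(self.encode_field(n, v) for n, v in headers)


headers = [
    (":method", "GET"), (":scheme", "https"), (":path", "/"),
    ("user-agent", "demo-client/1.0"), ("cookie", "session=abc123"),
]
encoder = Encoder()
first = encoder.encode(headers)    # literals for user-agent and cookie
second = encoder.encode(headers)   # every field is now a one-byte index
```

On the second request each of the five fields collapses to a single index byte, which is the saving the paragraph above describes.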

HPACK is deliberately not a general compressor like gzip, and that is a security decision. The CRIME and BREACH attacks showed that compressing attacker-influenced data alongside secrets leaks the secrets through compressed size. HPACK's index-based scheme matches whole header fields against a bounded table, so an attacker has to guess an entire value rather than recover it one character at a time, which blunts that class of attack while still collapsing repeated headers. The result is that the header overhead that made many small HTTP/1.1 requests expensive largely disappears, and header-heavy API traffic benefits the most.

Server Push and Its Retirement

Server push let a server send resources the client had not yet requested — push the CSS and JS alongside the HTML, before the browser parsed the page and asked for them. On paper it saved a round trip. In practice it was a mess: the server pushed assets the browser already had cached, wasting bandwidth, and getting the priority and timing right was harder than the saving was worth.

Chrome disabled server push by default in 2022 and the feature is effectively dead. The problem push tried to solve, telling the browser to fetch a critical resource early, is now handled by the 103 Early Hints status and <link rel=preload> hints, which let the browser decide whether it actually needs the resource rather than the server guessing. Building anything new on server push today is building on a retired feature.
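
As a rough illustration of the replacement, the sketch below builds the header fields a server might send in a 103 Early Hints response before the final 200. The function name and the resource paths are hypothetical; how the fields are actually written to the connection depends on the server framework in use.

```python
def early_hints_fields(resources):
    """Return header fields for a 103 response.

    `resources` is a list of (path, as_type) pairs, e.g. ("/style.css", "style").
    """
    link_value = ", ".join(
        f"<{path}>; rel=preload; as={as_type}" for path, as_type in resources
    )
    return [(":status", "103"), ("link", link_value)]


hints = early_hints_fields([("/style.css", "style"), ("/app.js", "script")])
```

Unlike a push, these are only hints: a browser that already has /style.css cached can skip the fetch.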

The Residual TCP Head-of-Line Blocking

Multiplexing solved head-of-line blocking at the HTTP layer, but HTTP/2 still runs over a single TCP connection — and TCP delivers bytes strictly in order. When one TCP segment is lost, TCP holds back every byte that arrived after it until the retransmission fills the gap, because it cannot hand the application data out of order. To TCP, the multiplexed streams are just one undifferentiated byte stream.

So a single dropped packet stalls all the HTTP/2 streams at once, even the ones whose data already arrived intact, because they are queued behind the missing segment in TCP's receive buffer. On a clean network this rarely shows; on a lossy mobile link it can make HTTP/2 perform worse than six independent HTTP/1.1 connections, where a loss on one connection only stalls that one. This is the wall HTTP/2 could not climb, and the reason HTTP/3 moved off TCP onto QUIC, which runs over UDP and orders data per stream, so a lost packet delays only the streams whose data it carried.
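
A toy model makes the difference visible. Assume each packet carries data for exactly one stream and that packet 1 is lost; the two functions below (hypothetical names) compare what a single ordered byte stream can hand to the application with what per-stream ordering can. Retransmission timing and congestion control are left out.

```python
def tcp_deliverable(packets, lost):
    """In-order transport: deliver only the contiguous prefix before the first gap."""
    delivered = []
    for seq, stream_id in enumerate(packets):
        if seq in lost:
            break  # everything after the gap waits in the receive buffer
        delivered.append((seq, stream_id))
    return delivered


def per_stream_deliverable(packets, lost):
    """Per-stream ordering: a loss blocks only later data on the same stream."""
    blocked_streams = set()
    delivered = []
    for seq, stream_id in enumerate(packets):
        if seq in lost:
            blocked_streams.add(stream_id)
        elif stream_id not in blocked_streams:
            delivered.append((seq, stream_id))
    return delivered


# Stream ID carried by each packet, in send order; packet 1 (stream 3) is lost.
packets = [1, 3, 5, 1, 3, 5]
lost = {1}

over_tcp = tcp_deliverable(packets, lost)
per_stream = per_stream_deliverable(packets, lost)
```

With one ordered byte stream only packet 0 reaches the application; with per-stream ordering, streams 1 and 5 keep flowing and only stream 3 waits for the retransmission.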

HTTP/1.1 vs HTTP/2

HTTP/1.1 is text, one request at a time per connection, with responses in strict order — so browsers open about six connections per host to get concurrency, and head-of-line blocking lives at the application layer. Each connection re-sends full headers every request. It remains the right baseline and the fallback when HTTP/2 negotiation fails.

HTTP/2 is binary, multiplexes many streams over one connection, and compresses headers with HPACK — so one connection now beats six and header overhead largely vanishes. Its blocking moves down to the transport layer: a single lost TCP segment stalls every stream. Use it as the default for browser and same-origin API traffic, knowing the residual TCP HoL is real on lossy links.

Common Mistakes
  • Opening many connections out of HTTP/1.1 habit (domain sharding, connection pools) under HTTP/2. It defeats multiplexing and HPACK — each new connection has a cold header table and its own congestion window — so you get worse performance than the single connection HTTP/2 wanted.
  • Expecting HTTP/2 to fix TCP-level head-of-line blocking. It cannot — one lost segment still stalls every stream because they share one ordered TCP byte stream — so on a lossy link your "fix" delivers no improvement and you blame the wrong layer.
  • Building on HTTP/2 server push. It is retired in Chrome and broadly unsupported; assets you push are often already cached, wasting bandwidth, and the feature will simply be ignored — use 103 Early Hints or preload instead.
  • Shrinking HPACK's dynamic table to zero, or sending incompressible per-request headers (unique tokens in every header). You forfeit the compression win and pay full header cost on every request, exactly the HTTP/1.1 overhead HTTP/2 was meant to remove.
  • Assuming HTTP/2 is always negotiated. It is agreed via ALPN during the TLS handshake; a misconfigured proxy or missing ALPN silently drops you to HTTP/1.1, and you wonder why multiplexing never kicks in.
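
One way to check the ALPN result mentioned in the last item is a short probe with Python's standard ssl module. This is a minimal sketch: the function names are made up for the example, the host passed in is up to the caller, and the probe only reports what the first TLS hop negotiates, not what a proxy speaks to its backends.

```python
import socket
import ssl


def alpn_context():
    """TLS client context that offers h2 first, then HTTP/1.1."""
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


def negotiated_protocol(host, port=443, timeout=5.0):
    """Return the ALPN protocol the server selected, or None if it chose none."""
    context = alpn_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.selected_alpn_protocol()
```

Calling negotiated_protocol with a hostname returns "h2" when HTTP/2 was agreed; "http/1.1" or None suggests a silent downgrade at that hop.
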
Best Practices
  • Consolidate onto a single HTTP/2 connection per origin and drop HTTP/1.1-era domain sharding, so multiplexing and the shared HPACK table actually deliver their concurrency and header savings.
  • Confirm ALPN negotiates h2 end to end through every proxy and load balancer, since a single hop that only speaks HTTP/1.1 silently downgrades the whole path.
  • Use 103 Early Hints with rel=preload rather than server push to fetch critical resources early, letting the browser skip anything it already has cached.
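  • Signal request priorities and rely on per-stream flow control so a large download cannot starve interactive requests sharing the connection. The original HTTP/2 priority tree is deprecated in RFC 9113 in favor of the simpler scheme in RFC 9218, so check what your server actually honors.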
  • Prefer HTTP/3 for clients on lossy mobile networks where the residual TCP head-of-line blocking bites, and keep HTTP/2 for low-loss same-datacenter and wired paths.
Comparable concepts: HTTP/3 (TCP-HoL fix), SPDY (the predecessor)

Knowledge Check

On a lossy mobile link, why can HTTP/2 underperform six independent HTTP/1.1 connections?

  • All its streams share one TCP connection, so a single lost segment stalls every stream at once
  • HPACK header compression adds CPU overhead that slows every request on the link
  • It opens a new TCP connection for each request, so loss triggers many slow restarts
  • Its stream multiplexing reintroduces the old application-layer head-of-line blocking between all the concurrent streams

What problem does HPACK solve that plain HTTP/1.1 connections suffered from?

  • Re-sending large repeated headers in full on every request, by indexing them in a shared table
  • Compressing large response bodies that HTTP/1.1 always sent uncompressed
  • Eliminating the strict in-order delivery that TCP imposes on a connection
  • Encrypting all the header fields end to end so that intermediaries can no longer read sensitive cookies or auth tokens

A team enables domain sharding across four hostnames after moving to HTTP/2. What happens?

  • It defeats multiplexing — each connection has a cold header table and separate congestion window
  • It speeds things up by letting HTTP/2 multiplex across more connections in parallel
  • It eliminates the residual TCP head-of-line blocking by spreading loss across hosts
  • It automatically forces every single sharded connection to silently fall all the way back to plain HTTP/1.1
