HTTP/2
HTTP/2 keeps every semantic from HTTP/1.1 — the same methods, status codes, and headers — and throws away the text wire format that carried them. In its place is a binary, multiplexed framing layer: the connection is divided into independent streams, each carrying one request/response, and frames from many streams interleave on a single TCP connection. The thing browsers used to do — open six parallel connections to get concurrency — becomes one connection carrying dozens of concurrent exchanges.
It also compresses headers with HPACK, because in HTTP/1.1 the same fat header block (cookies, user-agent, accept lists) was re-sent verbatim on every request. HTTP/2 kills HTTP/1.1's head-of-line blocking at the HTTP layer — but it rides on one TCP connection, and a single lost TCP segment still stalls every stream underneath it. That residual blocking, at the transport layer this time, is exactly the problem HTTP/3 was built to remove.
Streams, Frames, and Multiplexing
An HTTP/2 connection carries many streams, each identified by a numeric ID and representing one request/response pair. A stream is a sequence of frames — a HEADERS frame carries the request or response headers, DATA frames carry the body, and control frames like SETTINGS, WINDOW_UPDATE, and RST_STREAM manage the connection. Each frame is tagged with its stream ID, so the receiver can demultiplex interleaved frames back into the right streams.
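The demultiplexing step described above can be sketched in a few lines of Python. This is a minimal illustration, not a working HTTP/2 stack. It assumes the 9-byte frame header layout from the HTTP/2 specification (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream ID), ignores padding and HEADERS decoding, and the helper names `build_frame`, `parse_frames`, and `demultiplex` are made up for this example.

```python
import struct
from collections import defaultdict

# A subset of the frame type codes defined by the HTTP/2 specification.
FRAME_TYPES = {
    0x0: "DATA",
    0x1: "HEADERS",
    0x3: "RST_STREAM",
    0x4: "SETTINGS",
    0x8: "WINDOW_UPDATE",
}


def build_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Serialize one frame: 9-byte header followed by the payload."""
    length = len(payload)
    header = struct.pack(
        ">BHBBI",
        length >> 16,            # high byte of the 24-bit length
        length & 0xFFFF,         # low two bytes of the 24-bit length
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # top bit is reserved and sent as 0
    )
    return header + payload


def parse_frames(buf: bytes):
    """Yield (type_name, flags, stream_id, payload) for each complete frame."""
    offset = 0
    while offset + 9 <= len(buf):
        hi, lo, ftype, flags, sid = struct.unpack_from(">BHBBI", buf, offset)
        length = (hi << 16) | lo
        sid &= 0x7FFFFFFF
        payload = buf[offset + 9 : offset + 9 + length]
        if len(payload) < length:
            break  # incomplete frame; a real parser would wait for more bytes
        yield FRAME_TYPES.get(ftype, f"UNKNOWN({ftype:#x})"), flags, sid, payload
        offset += 9 + length


def demultiplex(buf: bytes) -> dict:
    """Reassemble each stream's body from interleaved DATA frames."""
    bodies = defaultdict(bytes)
    for name, _flags, sid, payload in parse_frames(buf):
        if name == "DATA":
            bodies[sid] += payload
    return dict(bodies)


# Frames for streams 1 and 3 interleaved on one "wire".
wire = (
    build_frame(0x0, 0x0, 1, b"<html>")
    + build_frame(0x0, 0x0, 3, b"body{")
    + build_frame(0x0, 0x1, 1, b"</html>")  # 0x1 = END_STREAM flag
    + build_frame(0x0, 0x1, 3, b"}")
)
print(demultiplex(wire))
```

Because every frame carries its stream ID, the receiver never needs the frames of one stream to be contiguous; it only needs them to arrive in order within that stream.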
This is what multiplexing means: stream 1's response and stream 3's response travel as interleaved frames on the same wire, neither waiting on the other. A slow response on one stream no longer blocks a fast one on another, which was impossible in HTTP/1.1 where responses came back strictly in order. Flow control is per-stream and per-connection via WINDOW_UPDATE, so one greedy download cannot starve the others. The whole point is concurrency without the cost of multiple connections.
    # nghttp -v shows the frame trace; note the interleaved stream IDs
    nghttp -nv https://example.com/
    recv SETTINGS frame <stream_id=0>
    send HEADERS frame <stream_id=1> GET /
    send HEADERS frame <stream_id=3> GET /style.css
    recv HEADERS frame <stream_id=3> 200   <- stream 3 replies first
    recv DATA frame <stream_id=3>
    recv HEADERS frame <stream_id=1> 200   <- out of request order, fine
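The two-level flow control mentioned above (per-stream and per-connection windows, both replenished by WINDOW_UPDATE) can be modeled roughly as follows. This is a simplified sender-side sketch under stated assumptions: the class `FlowController` and its method names are hypothetical, the 65,535-byte initial window is the specification default, and real implementations also handle SETTINGS changes and error cases that are omitted here.

```python
DEFAULT_WINDOW = 65_535  # initial flow-control window defined by the HTTP/2 spec


class FlowController:
    """Sender-side view of HTTP/2 flow-control windows (simplified sketch)."""

    def __init__(self, initial_stream_window: int = DEFAULT_WINDOW):
        self.initial_stream_window = initial_stream_window
        self.connection_window = DEFAULT_WINDOW  # shared by all streams
        self.stream_windows = {}

    def open_stream(self, stream_id: int) -> None:
        self.stream_windows[stream_id] = self.initial_stream_window

    def send_data(self, stream_id: int, wanted: int) -> int:
        """Send as many bytes as both windows allow; return the amount sent."""
        allowed = min(wanted, self.stream_windows[stream_id], self.connection_window)
        allowed = max(0, allowed)
        self.stream_windows[stream_id] -= allowed
        self.connection_window -= allowed
        return allowed

    def window_update(self, stream_id: int, increment: int) -> None:
        """Apply a WINDOW_UPDATE; stream ID 0 refers to the whole connection."""
        if stream_id == 0:
            self.connection_window += increment
        else:
            self.stream_windows[stream_id] += increment


fc = FlowController()
fc.open_stream(1)
fc.open_stream(3)
fc.window_update(0, 65_535)           # receiver grants extra connection credit
big = fc.send_data(1, 100_000)        # large download, capped by stream 1's window
small = fc.send_data(3, 10_000)       # interactive request still gets through
blocked = fc.send_data(1, 1)          # stream 1 must wait for more credit
fc.window_update(1, 20_000)
resumed = fc.send_data(1, 100_000)
print(big, small, blocked, resumed)
```

The design point is that the receiver decides how much credit each stream gets, so it can keep a bulk transfer from consuming the whole connection window.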
HPACK Header Compression
Headers are repetitive and large. The same Cookie, User-Agent, and Accept lines go out on every request, easily hundreds of bytes each, and HTTP/1.1 sent them in full every time. HPACK fixes this with a shared dynamic table: once a header has been sent, both ends remember it, and later requests reference it by a small index instead of re-spelling the whole field. A static table covers common headers like :method: GET out of the box, so they cost one byte.
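To make the indexing idea concrete, here is a toy encoder that mimics HPACK's table logic without producing real HPACK bytes. Treat it as a sketch under assumptions: `ToyHpackEncoder` is a hypothetical name, only three static-table entries are included, Huffman coding is skipped, and the byte costs are rough estimates (one byte for an indexed field, which holds for indices below 127, and a small fixed overhead for a literal).

```python
from collections import deque

# Three real entries from the HPACK static table; the full table has 61.
STATIC_TABLE = {
    (":method", "GET"): 2,
    (":path", "/"): 4,
    (":scheme", "https"): 7,
}
FIRST_DYNAMIC_INDEX = 62  # dynamic entries are numbered after the static table


class ToyHpackEncoder:
    """Models HPACK's static/dynamic table lookups, not its wire format."""

    def __init__(self, max_table_size: int = 4096):
        self.max_table_size = max_table_size
        self.dynamic = deque()  # newest entry on the left, as in HPACK

    @staticmethod
    def _entry_size(name: str, value: str) -> int:
        # HPACK counts each entry as name length + value length + 32.
        return len(name) + len(value) + 32

    def _insert(self, name: str, value: str) -> None:
        self.dynamic.appendleft((name, value))
        # Evict the oldest entries until the table fits its size bound.
        while self.dynamic and sum(
            self._entry_size(n, v) for n, v in self.dynamic
        ) > self.max_table_size:
            self.dynamic.pop()

    def encode(self, headers):
        """Return (operations, estimated_bytes) for one header block."""
        ops, cost = [], 0
        for name, value in headers:
            field = (name, value)
            if field in STATIC_TABLE:
                ops.append(("indexed", STATIC_TABLE[field]))
                cost += 1
            elif field in self.dynamic:
                index = FIRST_DYNAMIC_INDEX + self.dynamic.index(field)
                ops.append(("indexed", index))
                cost += 1
            else:
                ops.append(("literal", name, value))
                cost += 3 + len(name) + len(value)  # rough, uncompressed estimate
                self._insert(name, value)
        return ops, cost


request = [
    (":method", "GET"),
    (":path", "/"),
    ("user-agent", "demo-client/1.0"),
    ("cookie", "session=abc123"),
]
encoder = ToyHpackEncoder()
first_ops, first_cost = encoder.encode(request)
second_ops, second_cost = encoder.encode(request)
print(first_cost, second_cost)
print(second_ops)
```

On the second request every field resolves to an index, which is the effect the paragraph above describes: repeated headers collapse to roughly one byte each.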
HPACK is deliberately not a general compressor like gzip, and that is a security decision. The CRIME and BREACH attacks showed that compressing attacker-influenced data alongside secrets leaks the secrets through compressed size; HPACK's index-based scheme with a bounded table avoids that class of attack while still collapsing repeated headers. The result is that the header overhead that made many small HTTP/1.1 requests expensive largely disappears — header-heavy API traffic benefits the most.
Server Push and Its Retirement
Server push let a server send resources the client had not yet requested — push the CSS and JS alongside the HTML, before the browser parsed the page and asked for them. On paper it saved a round trip. In practice it was a mess: the server pushed assets the browser already had cached, wasting bandwidth, and getting the priority and timing right was harder than the saving was worth.
Chrome removed support in 2022 and the feature is effectively dead. The problem push tried to solve — telling the browser to fetch a critical resource early — is now handled by the 103 Early Hints status and <link rel=preload> hints, which let the browser decide whether it actually needs the resource rather than the server guessing. Building anything new on server push today is building on a retired feature.
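As a rough illustration of the replacement, the sketch below builds the two header blocks a server might send on one stream: an interim 103 response carrying preload hints, then the final 200. The function names and the asset list are assumptions made for this example, and nothing here sends anything on the network.

```python
def preload_link(path: str, kind: str) -> str:
    """Format one Link header value, e.g. '</style.css>; rel=preload; as=style'."""
    return f"<{path}>; rel=preload; as={kind}"


def build_responses(assets, content_type: str = "text/html"):
    """Return (interim_headers, final_headers) as lists of (name, value) pairs."""
    # Interim response: tells the browser what it may want to fetch early.
    interim = [(":status", "103")]
    interim += [("link", preload_link(path, kind)) for path, kind in assets]
    # Final response: the real status and headers, sent once the page is ready.
    final = [(":status", "200"), ("content-type", content_type)]
    return interim, final


interim, final = build_responses([("/style.css", "style"), ("/app.js", "script")])
for name, value in interim + final:
    print(f"{name}: {value}")
```

Unlike push, the hints are only suggestions: a browser that already has `/style.css` cached can simply ignore the corresponding Link value.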
The Residual TCP Head-of-Line Blocking
Multiplexing solved head-of-line blocking at the HTTP layer, but HTTP/2 still runs over a single TCP connection — and TCP delivers bytes strictly in order. When one TCP segment is lost, TCP holds back every byte that arrived after it until the retransmission fills the gap, because it cannot hand the application data out of order. To TCP, the multiplexed streams are just one undifferentiated byte stream.
So a single dropped packet stalls all the HTTP/2 streams at once, even the ones whose data already arrived intact, because they are queued behind the missing segment in TCP's receive buffer. On a clean network this rarely shows; on a lossy mobile link it can make HTTP/2 perform worse than six independent HTTP/1.1 connections, where a loss on one connection only stalls that one. This is the wall HTTP/2 could not climb, and the reason HTTP/3 moved off TCP onto QUIC, where each stream is ordered and delivered independently, so a lost packet stalls only the streams whose data it carried.
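A small simulation makes the difference visible. It is a deliberately crude model, with assumed arrival times in milliseconds and a hypothetical `completion_times` helper: each packet carries data for one stream, one packet is lost and arrives late after retransmission, and the function reports when each stream's last byte can be handed to the application.

```python
def completion_times(packets, arrivals, per_stream_ordering: bool) -> dict:
    """Return {stream_id: time its last byte reaches the application}.

    packets  -- stream ID carried by each packet, in send order
    arrivals -- arrival time of each packet (a retransmission arrives late)
    """
    done = {}
    if per_stream_ordering:
        # QUIC-like: a stream only waits for its own packets.
        for sid, t in zip(packets, arrivals):
            done[sid] = max(done.get(sid, 0), t)
    else:
        # TCP-like: a packet is delivered only once every earlier packet is in.
        gate = 0
        for sid, t in zip(packets, arrivals):
            gate = max(gate, t)
            done[sid] = gate  # gate never decreases, so the last write wins
    return done


packets = [1, 3, 5, 1, 3, 5]            # three streams interleaved
arrivals = [110, 11, 12, 13, 14, 15]    # first packet lost, retransmitted at t=110

print("single ordered byte stream:", completion_times(packets, arrivals, False))
print("per-stream ordering:       ", completion_times(packets, arrivals, True))
```

In the first case all three streams finish at the retransmission time; in the second only stream 1, which actually lost data, has to wait.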
HTTP/1.1 is text, one request at a time per connection, with responses in strict order — so browsers open about six connections per host to get concurrency, and head-of-line blocking lives at the application layer. Each connection re-sends full headers every request. It remains the right baseline and the fallback when HTTP/2 negotiation fails.
HTTP/2 is binary, multiplexes many streams over one connection, and compresses headers with HPACK — so one connection now beats six and header overhead largely vanishes. Its blocking moves down to the transport layer: a single lost TCP segment stalls every stream. Use it as the default for browser and same-origin API traffic, knowing the residual TCP HoL is real on lossy links.
- Opening many connections out of HTTP/1.1 habit (domain sharding, connection pools) under HTTP/2. It defeats multiplexing and HPACK — each new connection has a cold header table and its own congestion window — so you get worse performance than the single connection HTTP/2 wanted.
- Expecting HTTP/2 to fix TCP-level head-of-line blocking. It cannot — one lost segment still stalls every stream because they share one ordered TCP byte stream — so on a lossy link your "fix" delivers no improvement and you blame the wrong layer.
- Building on HTTP/2 server push. It is retired in Chrome and broadly unsupported; assets you push are often already cached, wasting bandwidth, and the feature will simply be ignored. Use 103 Early Hints or preload instead.
- Disabling HPACK or sending uncompressible per-request headers (unique tokens in every header). You forfeit the compression win and pay full header cost on every request, exactly the HTTP/1.1 overhead HTTP/2 was meant to remove.
- Assuming HTTP/2 is always negotiated. It is agreed via ALPN during the TLS handshake; a misconfigured proxy or missing ALPN silently drops you to HTTP/1.1, and you wonder why multiplexing never kicks in.
- Consolidate onto a single HTTP/2 connection per origin and drop HTTP/1.1-era domain sharding, so multiplexing and the shared HPACK table actually deliver their concurrency and header savings.
- Confirm ALPN negotiates h2 end to end through every proxy and load balancer, since a single hop that only speaks HTTP/1.1 silently downgrades the whole path.
- Use 103 Early Hints with rel=preload rather than server push to fetch critical resources early, letting the browser skip anything it already has cached.
- Set per-stream priorities and rely on per-stream flow control so a large download cannot starve interactive requests sharing the connection.
- Prefer HTTP/3 for clients on lossy mobile networks where the residual TCP head-of-line blocking bites, and keep HTTP/2 for low-loss same-datacenter and wired paths.
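One way to check the ALPN item in the list above is to offer both protocols from a client and see which one the server selects. The sketch below uses Python's standard `ssl` module; the function names are made up for this example, and it only reveals what the first TLS-terminating hop negotiates, so hops behind that endpoint still need to be checked separately.

```python
import socket
import ssl


def make_alpn_context() -> ssl.SSLContext:
    """Create a TLS context that offers h2 first, then HTTP/1.1."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiated_protocol(host: str, port: int = 443, timeout: float = 5.0):
    """Return the ALPN protocol the server picked, or None if it picked none."""
    ctx = make_alpn_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


# Example call (requires network access):
#   negotiated_protocol("example.com")
# A result of "h2" means HTTP/2 was agreed; "http/1.1" or None means the
# connection fell back and multiplexing will not be used.
```

Running this against each public hostname after a proxy or load balancer change is a cheap way to catch a silent downgrade.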
Knowledge Check
On a lossy mobile link, why can HTTP/2 underperform six independent HTTP/1.1 connections?
- All its streams share one TCP connection, so a single lost segment stalls every stream at once
- HPACK header compression adds CPU overhead that slows every request on the link
- It opens a new TCP connection for each request, so loss triggers many slow restarts
- Its stream multiplexing reintroduces the old application-layer head-of-line blocking between all the concurrent streams
What problem does HPACK solve that plain HTTP/1.1 connections suffered from?
- Re-sending large repeated headers in full on every request, by indexing them in a shared table
- Compressing large response bodies that HTTP/1.1 always sent uncompressed
- Eliminating the strict in-order delivery that TCP imposes on a connection
- Encrypting all the header fields end to end so that intermediaries can no longer read sensitive cookies or auth tokens
A team enables domain sharding across four hostnames after moving to HTTP/2. What happens?
- It defeats multiplexing — each connection has a cold header table and separate congestion window
- It speeds things up by letting HTTP/2 multiplex across more connections in parallel
- It eliminates the residual TCP head-of-line blocking by spreading loss across hosts
- It automatically forces every single sharded connection to silently fall all the way back to plain HTTP/1.1