The Client-Server Model
The application layer is the top of the stack — the only layer that carries data a human cares about directly. Everything underneath moves opaque bytes; here those bytes mean a web page, an email, a database query, a remote function call. An application protocol defines the grammar of that conversation: who speaks first, what a message looks like, what a reply means, and what happens when one side goes quiet. It is built on top of a transport that already handles reliability and ordering, which is why most application protocols never mention retransmission or sequence numbers — TCP or QUIC already did that work.
The dominant shape is client-server request/response: one side sends a request, the other sends exactly one reply, and the interaction is over. HTTP, DNS-over-TCP, SMTP, and most RPC frameworks all wear this shape. It is not the only one — streaming and publish/subscribe exist for good reasons — but it is the default, and understanding why it scales (statelessness) and why it won (port 443 and tooling) is the frame for the rest of this chapter.
Request/Response, Streaming, and Pub/Sub
Three interaction shapes cover almost everything. Request/response is one message in, one message out: the client blocks until the reply arrives, and each exchange is independent. It maps cleanly onto a function call, which is why it dominates — a GET /users/42 is just a remote read. Streaming keeps the channel open and lets one or both sides send a sequence of messages over time: a video feed, a log tail, a gRPC server stream. The connection is long-lived, and the cost moves from per-request overhead to holding state for an idle socket.
Publish/subscribe breaks the direct coupling entirely. Publishers send to a topic, subscribers register interest, and a broker fans messages out — no publisher knows who is listening. This is how event systems decouple producers from consumers, but it trades the simple "I sent it, I got a reply" model for at-least-once or at-most-once delivery semantics you have to reason about explicitly. Most of this chapter lives in the request/response world; WebSockets and gRPC streaming are where the other two shapes show up.
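The three shapes can be compared side by side in a few lines. The sketch below is an in-memory illustration only, with no network involved; the names `handle_request`, `tail_log`, and `Broker`, and the sample data, are hypothetical and chosen just for this example.

```python
from collections import defaultdict
from typing import Callable, Iterator


# Request/response: one message in, one message out, like a function call.
def handle_request(path: str) -> str:
    users = {"/users/42": "alice"}  # hypothetical data for the sketch
    return users.get(path, "404 not found")


# Streaming: the channel stays open and delivers a sequence over time.
def tail_log(lines: list[str]) -> Iterator[str]:
    for line in lines:
        yield line


# Publish/subscribe: a broker fans each message out to every subscriber
# of a topic; the publisher never learns who is listening.
class Broker:
    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver to all current subscribers and return how many there were."""
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])
```

Note that `publish` returns a count, not an answer: the publisher learns that the message was fanned out, but nothing about what any subscriber did with it. That is the missing reply path the text describes.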
Text Protocols versus Binary Protocols
Application protocols split into text-based and binary. A text protocol like HTTP/1.1 puts human-readable lines on the wire: GET /index.html HTTP/1.1, then headers as Name: value pairs. You can debug it with telnet or read a capture by eye, which is a real operational advantage when something breaks at 3 a.m. The cost is bytes and parsing — every header name is spelled out in full on every request, and the parser has to handle whitespace, line endings, and ambiguity.
A binary protocol like HTTP/2's framing layer encodes the same information in fixed-layout frames with numeric type codes and length-prefixed fields. It is compact and unambiguous to parse, but you need a tool that understands the format to read it: Wireshark, not telnet. The industry trend is clear: HTTP/1.1 was text, HTTP/2 and HTTP/3 are binary, and the readability the binary versions gave up is recovered with tooling. gRPC goes all the way: Protocol Buffers on the wire carry field numbers rather than names, so a capture cannot be meaningfully read without the schema.
    # HTTP/1.1 is text you can type by hand: open a TLS connection
    # and speak the protocol directly with no client library
    printf 'GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n' \
      | openssl s_client -quiet -connect example.com:443
    # HTTP/1.1 200 OK
    # Content-Type: text/html; charset=UTF-8   <- headers are plain ASCII
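For contrast with the text request, here is a minimal sketch of a fixed-layout binary header, assuming the 9-byte HTTP/2 frame header layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream identifier). It packs and unpacks only the header; header compression and everything else a real HTTP/2 implementation does are out of scope. The function names are hypothetical.

```python
import struct

FRAME_TYPE_DATA = 0x0  # numeric type code for a DATA frame
FLAG_END_STREAM = 0x1  # flag bit marking the last frame of a stream


def encode_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Pack a 9-byte frame header followed by the payload."""
    length = len(payload)
    if length >= 1 << 24:
        raise ValueError("payload too large for a 24-bit length field")
    # struct has no 24-bit type, so the length is split into 1 + 2 bytes.
    header = struct.pack(
        ">BHBBI",
        length >> 16,
        length & 0xFFFF,
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # the top bit is reserved and sent as 0
    )
    return header + payload


def decode_frame(data: bytes) -> tuple[int, int, int, bytes]:
    """Return (frame_type, flags, stream_id, payload) from one encoded frame."""
    high, low, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    length = (high << 16) | low
    return frame_type, flags, stream_id & 0x7FFFFFFF, data[9 : 9 + length]


frame = encode_frame(FRAME_TYPE_DATA, FLAG_END_STREAM, 1, b"hello")
```

The parser never scans for line endings or trims whitespace: it reads nine bytes, learns exactly how many payload bytes follow, and slices. The trade-off is that `frame` is unreadable by eye, which is why binary protocols lean on tools such as Wireshark.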
Statelessness and Why It Scales
HTTP is stateless by design: each request carries everything the server needs to handle it, and the server keeps no memory of the previous request from the same client. This sounds like a limitation — you have to re-send your credentials, your context, your everything, on every call — but it is the property that makes horizontal scaling trivial. Any server can handle any request, because no request depends on which server handled the last one. Put ten identical servers behind a load balancer and traffic distributes freely.
A stateful protocol pins a client to the specific server that holds its session, and that pinning is where scaling and failover get hard. If that server dies, the session dies with it; if it gets hot, you cannot shed its load to a peer that lacks the state. Statelessness pushes the state somewhere it can be shared — a cookie the client carries, a token, or an external store — and that is exactly the subject of the cookies-and-sessions topic later in this chapter. The web scaled because the protocol underneath it forgot you between requests.
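One way to push state onto the client is a signed token that any server can verify on its own. The sketch below is a simplified illustration under stated assumptions: `SHARED_KEY`, `issue_token`, and `handle_request` are hypothetical names, the token format is made up for this example, and a production system would use an established format with expiry and key rotation.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key for the sketch; every server in the pool holds a copy.
SHARED_KEY = b"demo-key-not-for-production"


def issue_token(claims: dict) -> str:
    """Serialize the claims and append an HMAC so they cannot be altered."""
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + signature


def handle_request(server_name: str, token: str) -> str:
    """Any server can run this: it needs the token, not a local session."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(SHARED_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return f"{server_name}: 401 unauthorized"
    claims = json.loads(base64.urlsafe_b64decode(body))
    return f"{server_name}: 200 hello {claims['user']}"


token = issue_token({"user": "alice"})
reply_a = handle_request("server-a", token)
reply_b = handle_request("server-b", token)
```

Both `reply_a` and `reply_b` succeed even though neither server has seen the client before. The request carried everything needed, so the load balancer is free to pick either one.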
Why HTTP Became the Universal Substrate
HTTP did not win on technical elegance — it won because it goes through firewalls. Corporate and ISP firewalls block almost everything but reflexively allow outbound TCP 443, because blocking the web breaks the internet for users. So if you want a protocol to work everywhere — across hotel Wi-Fi, mobile carriers, locked-down enterprise networks — you run it over HTTPS on 443, where the firewall waves it through as ordinary web traffic. Port 443 became the new everything-port.
The second reason is tooling. Decades of investment built load balancers, caches, proxies, CDNs, auth gateways, and observability that all speak HTTP fluently. Build on HTTP and you inherit that ecosystem for free; build a custom protocol on a custom port and you reimplement all of it and fight middleboxes the whole way. The cost is real: tunneling a streaming RPC over request/response HTTP hides it from anything that inspects traffic by protocol, and you lose the protocol-level visibility a purpose-built port would have given you.
Stateless versus Stateful Protocols
Stateless protocols treat each request as self-contained: the server holds no per-client memory between calls. Any server can handle any request, so you scale by adding identical servers behind a load balancer and failover is a non-event. The cost is that the client must re-send its context every time. Choose it as the default for anything web-facing.
Stateful protocols keep per-session state on a specific server, so a client must keep talking to the same one. This buys low per-message overhead for long conversations but makes scaling and failover hard — a dead server takes its sessions with it. Choose it only when the conversation genuinely needs continuity, like a long-lived WebSocket, and accept the affinity it forces.
Common Pitfalls
- Building stateful assumptions into a stateless protocol: caching a user's data in one server's local memory and assuming their next HTTP request lands on the same box. Behind a load balancer there is no such guarantee, and the second request sees stale or missing state.
- Tunneling a custom protocol over HTTP and then losing all protocol-level visibility. Once everything is opaque POSTs to /api, your load balancer and firewall can no longer route, rate-limit, or inspect by message type — you traded observability for firewall traversal.
- Assuming reliable delivery when the transport is UDP. A protocol over plain UDP (or over QUIC's unreliable datagram extension, whose datagrams are never retransmitted) inherits no retransmission; if your app does not handle loss itself, messages silently vanish and you blame the network.
- Confusing connection reuse with statefulness. An HTTP keep-alive connection carries many independent stateless requests; reusing the socket does not mean the server remembers you between them, and treating it that way creates bugs that only appear when a new connection opens.
- Picking pub/sub for a request that needs an answer. Fire-and-forget topics give you no reply path and only at-least-once or at-most-once delivery; using them where the caller actually needs the result back forces you to bolt a correlation-and-timeout layer on top, badly reinventing request/response.
Best Practices
- Default to stateless request/response over HTTP for anything client-facing, and externalize session state into a cookie, token, or shared store so any server can handle any request.
- Reach for streaming (WebSockets, SSE, gRPC streams) only when the server must push or the data is genuinely a sequence over time — not as a premature optimization for request/response traffic.
- Run protocols that must traverse hostile networks over HTTPS on 443, where firewalls and proxies let them through, rather than a custom port that gets silently dropped on locked-down Wi-Fi.
- Keep request bodies and headers self-contained so a load balancer can route on them without server affinity — put the routing key in the path or a header, not in server-held session state.
- When you do tunnel a non-web protocol over HTTP, expose enough structure (paths, methods, status codes) that your existing HTTP tooling can still observe and control it, instead of one opaque endpoint.
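The local-memory pitfall and the shared-store remedy from the lists above can be reproduced in a few lines. This is a toy simulation, not a real deployment: the `Server` class is hypothetical, a plain dict stands in for an external store, and `itertools.cycle` stands in for a round-robin load balancer.

```python
from itertools import cycle


class Server:
    def __init__(self, name: str, shared_store: dict) -> None:
        self.name = name
        self.local_cache: dict[str, str] = {}  # lives and dies with this box
        self.shared_store = shared_store       # stand-in for an external store

    def login_local(self, user: str) -> None:
        self.local_cache[user] = "logged-in"

    def whoami_local(self, user: str) -> str:
        return self.local_cache.get(user, "unknown")

    def login_shared(self, user: str) -> None:
        self.shared_store[user] = "logged-in"

    def whoami_shared(self, user: str) -> str:
        return self.shared_store.get(user, "unknown")


store: dict[str, str] = {}
servers = [Server("a", store), Server("b", store)]
balancer = cycle(servers)  # round-robin: a, b, a, b, ...

# Pitfall: state kept in one server's memory.
next(balancer).login_local("alice")                  # handled by server a
local_result = next(balancer).whoami_local("alice")  # handled by server b

# Remedy: state kept where every server can reach it.
next(balancer).login_shared("bob")                    # handled by server a
shared_result = next(ballancer := balancer).whoami_shared("bob")  # handled by server b
```

Here `local_result` is `"unknown"` because server b never saw the login, while `shared_result` is `"logged-in"` because the state lives outside any one server. In a real system the shared dict would be an external store, or the state would travel with the client in a cookie or token.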
Knowledge Check
Why does statelessness make an HTTP service easy to scale horizontally?
- Any server can handle any request, so you add identical instances behind a load balancer
- Each request is smaller, so the server processes more of them per second
- Each client is pinned to one specific server that caches its session locally for fast reuse later
- It removes the need for any TCP connection between client and server
A team tunnels a custom binary protocol inside HTTP POSTs to a single /api endpoint. What do they primarily give up?
- Protocol-level visibility, since proxies and firewalls now see only opaque identical requests
- The ability to traverse restrictive firewalls, which dedicated custom ports actually do far more reliably
- Reliable ordered delivery, because HTTP cannot carry binary payloads
- End-to-end encryption, which only works for native HTTP traffic
Which interaction shape best fits a server that must push a continuous live log to a client?
- Streaming over a long-lived connection
- Plain request/response, one reply per call
- Publish/subscribe through a message broker
- A stateless per-request handler with no open channel