Topic 05

Bandwidth, Latency, and Throughput

Performance

Three numbers describe how a link performs, and people collapse them into the single word "fast" at the cost of every performance argument that follows. Bandwidth is capacity: the bits per second the pipe can carry. Latency is delay: the time for one bit to travel from end to end. Throughput is the rate you actually achieve; it is bounded by both and is often set by the two in combination rather than by either alone.

They are independent, and the independence has a sharp consequence: a fat transcontinental link can be slow to finish a small transfer because latency, not bandwidth, dominates it. Adding capacity to a latency-bound workload does nothing. Most "the network is slow" complaints are really a confusion between these three, and sorting them out is the difference between fixing a problem and spending money on the wrong one.

Bandwidth, Latency, and Throughput Are Three Things

Bandwidth is the width of the pipe: a 10 Gbps link can put ten billion bits on the wire each second. It says nothing about how long any one bit takes to arrive. Latency is the length of the pipe in time: over fiber, where signal covers about 1000 km in 5 milliseconds, a packet to a server 3000 km away cannot arrive in under about 15 milliseconds one way, no matter how wide the link.

Throughput is the rate you actually get, and it is capped by whichever limit binds first — bandwidth on a large bulk transfer, latency on a small request-response, and loss or a too-small window on a long-distance flow. The analogy that holds: bandwidth is how many lanes the highway has, latency is how long the highway is, and throughput is how fast your particular trip actually completes.
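The point that throughput is capped by whichever limit binds first can be sketched with a deliberately simple model: completion time is one round trip plus the time to push the payload through the link. The function names, the 70 ms RTT, and the payload sizes are illustrative assumptions, and the model ignores connection setup, slow start, and loss.

```python
def transfer_time_s(size_bytes: float, bandwidth_bps: float, rtt_s: float) -> float:
    """Approximate completion time: one round trip plus time to serialize the payload."""
    return rtt_s + (size_bytes * 8) / bandwidth_bps


def achieved_throughput_bps(size_bytes: float, bandwidth_bps: float, rtt_s: float) -> float:
    """Throughput the transfer actually sees, in bits per second."""
    return (size_bytes * 8) / transfer_time_s(size_bytes, bandwidth_bps, rtt_s)


if __name__ == "__main__":
    rtt = 0.070  # assumed 70 ms round trip
    for label, size in [("10 kB request", 10_000), ("1 GB bulk copy", 1_000_000_000)]:
        for gbps in (1, 10):
            t = transfer_time_s(size, gbps * 1e9, rtt)
            print(f"{label} at {gbps} Gbps: {t * 1000:.2f} ms")
```

Under these assumptions the small request takes about 70 ms on either link, while the bulk copy drops from roughly 8.07 s to 0.87 s: the tenfold upgrade helps only the transfer that was bandwidth-bound to begin with.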

The Components of Latency

End-to-end latency is the sum of four delays, and knowing which one dominates tells you whether anything can be done about it. Propagation delay is distance divided by the speed of signal in the medium — fixed by geography. Transmission delay is the time to clock the bits onto the wire, set by bandwidth and packet size. Queuing delay is time spent waiting in buffers behind other packets, which rises with congestion. Processing delay is the small time each node spends making a forwarding decision.

On a long-distance path, propagation dominates and you cannot reduce it without moving the endpoints closer. On a congested link, queuing dominates and you fix it by relieving congestion, not by adding bandwidth that fills right back up. Diagnosing latency starts with deciding which of the four components is the bottleneck, because each has a different cure: shorter distance for propagation, a faster link or smaller packets for transmission, less congestion for queuing, and faster forwarding hardware for processing.
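The four-way breakdown above can be sketched as a small calculator that sums the components and names the largest one. The class and function names are hypothetical, the 200,000 km/s signal speed is the usual approximation for fiber, and queuing and processing delays are taken as inputs because they cannot be derived from link parameters alone.

```python
from dataclasses import dataclass

SIGNAL_SPEED_KM_PER_S = 200_000  # roughly two-thirds of c, a common figure for fiber


@dataclass
class OneWayDelay:
    propagation_s: float
    transmission_s: float
    queuing_s: float
    processing_s: float

    @property
    def total_s(self) -> float:
        """End-to-end delay is the sum of the four components."""
        return (self.propagation_s + self.transmission_s
                + self.queuing_s + self.processing_s)

    def dominant(self) -> str:
        """Name of the largest component, e.g. 'propagation'."""
        parts = vars(self)
        return max(parts, key=parts.get).removesuffix("_s")


def one_way_delay(distance_km: float, packet_bytes: int, bandwidth_bps: float,
                  queuing_s: float = 0.0, processing_s: float = 0.0) -> OneWayDelay:
    return OneWayDelay(
        propagation_s=distance_km / SIGNAL_SPEED_KM_PER_S,   # distance / signal speed
        transmission_s=packet_bytes * 8 / bandwidth_bps,     # bits / link rate
        queuing_s=queuing_s,
        processing_s=processing_s,
    )


if __name__ == "__main__":
    long_haul = one_way_delay(5500, 1500, 1e9, queuing_s=0.002)
    congested_lan = one_way_delay(10, 1500, 1e9, queuing_s=0.020)
    print(long_haul.dominant(), congested_lan.dominant())
```

With these assumed inputs the long-haul path is propagation-dominated and the short congested path is queuing-dominated, which is the distinction that decides the cure.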

Figure: the four components of latency, NY↔London round trip on uncongested fiber (scale 0 to 60 ms)

propagation: 55 ms
queuing: 4 ms
transmission: 1 ms
processing: <1 ms

Propagation dominates: distance and the speed of light set the floor, and no extra bandwidth, faster router, or protocol tweak moves it.

The Speed of Light Floor

Signal in fiber travels at roughly two-thirds the speed of light in vacuum, about 200,000 km/s, which works out to near 5 milliseconds of one-way delay per 1000 km. New York to London is about 5500 km, so the round trip cannot beat roughly 55 milliseconds even on a perfectly straight, empty fiber — and real paths are neither straight nor empty.

This is a hard floor, not an engineering target. No amount of bandwidth, no faster router, no protocol tweak moves it, because it is physics. It is why content delivery networks push data to the edge near users, why a chatty protocol that makes ten sequential round trips across an ocean feels sluggish regardless of the link speed, and why the only way to cut propagation latency is to shorten the distance.

# ping measures round-trip latency; the floor is geography, not bandwidth
ping -c 3 lon.example.com
# time=72.4 ms   <- ~55 ms is light over fiber; the rest is path + queuing
# a 10x faster link would not lower this number at all
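To separate the physical floor from everything else in a measured ping, the floor can be computed from distance alone. This is a minimal sketch: the function names are hypothetical, 200,000 km/s is the fiber approximation used above, and the distance is the straight-line figure, so real cable routes will give a higher floor.

```python
FIBER_KM_PER_S = 200_000  # about two-thirds of the speed of light in vacuum


def rtt_floor_ms(distance_km: float) -> float:
    """Minimum round-trip time over fiber for a given one-way distance."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


def excess_ms(measured_rtt_ms: float, distance_km: float) -> float:
    """Portion of a measured RTT not explained by propagation (routing, queuing, processing)."""
    return measured_rtt_ms - rtt_floor_ms(distance_km)


if __name__ == "__main__":
    print(f"floor:  {rtt_floor_ms(5500):.1f} ms")
    print(f"excess: {excess_ms(72.4, 5500):.1f} ms")
```

For the 72.4 ms sample above and a 5500 km path, the floor comes out at about 55 ms and the remaining 17 ms or so is the only part that engineering can influence.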

The Bandwidth-Delay Product

To keep a pipe full, a sender must have enough data in flight to cover the entire round trip before the first acknowledgment comes back. That amount is the bandwidth-delay product: bandwidth multiplied by round-trip time. A 1 Gbps link with an 80 ms RTT has a BDP of about 10 megabytes — meaning 10 MB must be unacknowledged and in flight at once to use the link fully.

If the protocol's window is smaller than the BDP, the sender stalls waiting for acknowledgments and throughput falls far below the available bandwidth, even with zero loss. This is why a single TCP connection often gets only a fraction of a fast long-distance link until its window is tuned: a window of W bytes can deliver at most W bytes per round trip, whatever the line rate. The transport chapter returns to this; here it is enough to know that bandwidth and latency together set the window you need.
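The arithmetic behind the 10 MB figure, and the throughput ceiling a smaller window imposes, can be sketched as follows. The function names are hypothetical, and the 65,535-byte window is used only as an illustrative small window (the largest a TCP header can advertise without window scaling), not as a claim about any particular system's default.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: float, bandwidth_bps: float, rtt_s: float) -> float:
    """A sender can deliver at most one window per round trip, capped by the link rate."""
    return min(bandwidth_bps, window_bytes * 8 / rtt_s)


if __name__ == "__main__":
    link, rtt = 1e9, 0.080  # 1 Gbps, 80 ms RTT
    print(f"BDP: {bdp_bytes(link, rtt) / 1e6:.1f} MB")
    small = window_limited_bps(65_535, link, rtt)
    print(f"65,535-byte window: {small / 1e6:.2f} Mbps")
    full = window_limited_bps(bdp_bytes(link, rtt), link, rtt)
    print(f"BDP-sized window:   {full / 1e6:.0f} Mbps")
```

In this model the small window yields about 6.55 Mbps on a 1 Gbps link with no loss at all, and a window equal to the BDP is the smallest one that reaches the line rate.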

Bandwidth vs Latency

Bandwidth is capacity. Adding it speeds up large bulk transfers that can fill the pipe — a backup, a video stream, a big file. It does nothing for a small request that finishes long before capacity is the limit.

Latency is delay. Lowering it speeds up chatty request-response traffic — an API doing many sequential round trips, a page loading dozens of small resources. Only moving endpoints closer or cutting round trips reduces it; bandwidth cannot.

Common Mistakes

Buying bandwidth to fix a latency-bound workload. A chatty API across an ocean is limited by round trips and the speed of light; a wider link leaves it exactly as slow.
Ignoring round-trip time in API and protocol design. A sequence of ten dependent calls at 70 ms each is 700 ms of pure waiting that no server optimization touches; batch or pipeline them instead.
Forgetting the bandwidth-delay product when tuning TCP. On a long, fat link (high bandwidth and high RTT) a default window can cap a single flow far below the line rate, and the fix is window size, not more bandwidth.
Reporting "the network is slow" without saying which limit is binding. Bandwidth, latency, and loss have different causes and different fixes; the word "slow" hides which one you are actually hitting.
Measuring latency only at idle. Queuing delay appears under load, so a path that pings at 5 ms empty can balloon to 100 ms saturated — a problem invisible to a quiet test.
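The round-trip arithmetic in the API design mistake above can be made concrete with a small sketch. The function names are hypothetical, and the model counts only network waiting: it ignores server time and payload size, and it assumes the calls can be restructured so that a batch no longer needs intermediate results to cross the link.

```python
from math import ceil


def sequential_wait_ms(calls: int, rtt_ms: float) -> float:
    """Each dependent call waits a full round trip before the next can start."""
    return calls * rtt_ms


def batched_wait_ms(calls: int, rtt_ms: float, batch_size: int) -> float:
    """Calls grouped into batches cost one round trip per batch."""
    return ceil(calls / batch_size) * rtt_ms


if __name__ == "__main__":
    print(sequential_wait_ms(10, 70))    # ten calls, one at a time
    print(batched_wait_ms(10, 70, 10))   # all ten in a single request
    print(batched_wait_ms(10, 70, 4))    # batches of four: three round trips
```

Ten sequential calls at 70 ms cost 700 ms of waiting, a single batch costs 70 ms, and batches of four cost 210 ms; none of these numbers depends on link bandwidth.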

Best Practices

Decide whether a workload is bandwidth-bound or latency-bound before optimizing. Bulk transfers want capacity; request-response traffic wants fewer, shorter round trips.
Cut round trips for chatty protocols — batch requests, pipeline, or move logic to one side — because each saved round trip removes a full RTT of unavoidable waiting.
Compute the bandwidth-delay product when a single flow underperforms on a long-distance link, and size the window to it rather than assuming the link is at fault.
Place data near users with caching or a CDN when propagation latency dominates, since shortening the distance is the only lever that lowers propagation delay.
Measure latency under realistic load, not at idle, so queuing delay shows up; a path's behavior when busy is the one that matters to users.

Comparable concepts: Bandwidth-delay product (window sizing); RTT (ping baseline); iperf3 (throughput measurement)

Knowledge Check

An API makes ten sequential calls across the Atlantic and feels slow. Upgrading the link from 1 Gbps to 10 Gbps changes nothing. Why?

The workload is latency-bound — ten round trips at a fixed RTT — and bandwidth does not reduce round-trip time
The new link is faster but the requests are too large to benefit from it
The upgrade increased packet loss, cancelling out the speed gain
The server is the real bottleneck and simply cannot answer fast enough, so the network speed is irrelevant either way

A 1 Gbps link has an 80 ms round-trip time, yet a single TCP transfer only reaches a fraction of 1 Gbps with no packet loss. What is the most likely cause?

The send window is smaller than the bandwidth-delay product, so the sender stalls waiting for acknowledgments
Heavy packet loss is forcing constant retransmission
The physical link is actually slower than its advertised 1 Gbps rating
High latency directly limits the throughput regardless of how much data is kept in flight at any given moment

On a long, uncongested intercontinental fiber path, which component of latency dominates?

Propagation delay, set by distance and the speed of signal in fiber
Queuing delay, from packets waiting in router buffers
Transmission delay, the time to clock bits onto a fast link
Processing delay, from each router along the path making a forwarding decision
