Addressing Across the Layers
Three separate address spaces cooperate to get data to the right place, and conflating them is one of the most common sources of beginner confusion. A MAC address answers "which device on this wire," an IP address answers "which host on the internet," and a port answers "which process on that host." Each lives at a different layer, has a different scope, and is assigned by a different authority.
A packet needs all three to be useful. The MAC gets it across one link, the IP gets it across the internet to the right machine, and the port gets it to the one process out of the hundreds running there. Hold the three apart and the rest of networking — NAT, firewalls, sockets, load balancers — stops being mysterious, because each of those is just a device that reads or rewrites one specific address.
MAC Addresses — the Local Wire
A MAC address is a 48-bit identifier, usually written as six hex bytes like aa:bb:cc:11:22:33, assigned to a network interface. It is flat, meaning there is no structure a router could use to aggregate or route by it, and its scope is a single link. The first 24 bits are the OUI, which identifies the manufacturer; the remaining 24 bits identify the individual interface. MAC addresses are used only to deliver a frame across one hop, and they are rewritten at every router, as the previous topic showed.
Because the MAC is flat and link-local, it cannot route across the internet: there is no hierarchy to follow and no way to find a distant device by its MAC. That is precisely the job the layer above exists to do. The MAC answers a small, local question, and answers it fast in switch hardware.
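The 48-bit layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the helper name split_mac is made up for this example, and it only accepts the colon-separated form used in this section.

```python
import string

def split_mac(mac: str) -> tuple[str, str]:
    """Split a colon-separated MAC into its OUI half and its device half."""
    octets = mac.lower().split(":")
    if len(octets) != 6:
        raise ValueError(f"expected six octets, got {len(octets)}: {mac!r}")
    for octet in octets:
        # Each octet must be exactly two hex digits (8 bits).
        if len(octet) != 2 or any(c not in string.hexdigits for c in octet):
            raise ValueError(f"bad octet {octet!r} in {mac!r}")
    # First three octets (24 bits) are the OUI; the last three name the device.
    return ":".join(octets[:3]), ":".join(octets[3:])

oui, device = split_mac("aa:bb:cc:11:22:33")
print(oui, device)  # aa:bb:cc 11:22:33
```

Note that nothing in either half says where the device is. That is the sense in which the address is flat.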
IP Addresses — the Whole Internet
An IP address identifies a host across the entire internet, and unlike the MAC it is hierarchical: the leading bits name a network and the trailing bits name a host within it, so routers can aggregate millions of addresses into a handful of routes. This hierarchy is what makes global routing tractable — a router does not need an entry per host, only per network prefix.
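As a rough illustration of routing by prefix rather than by host, the sketch below uses Python's standard ipaddress module. The routing table and next-hop names are invented for the example, and the lookup uses longest-prefix match (the most specific matching prefix wins), which is the usual rule routers apply.

```python
import ipaddress

# Hypothetical routing table: three prefixes cover every possible destination.
ROUTES = {
    ipaddress.ip_network("10.0.0.0/8"): "core-router",
    ipaddress.ip_network("10.0.1.0/24"): "lab-switch",
    ipaddress.ip_network("0.0.0.0/0"): "default-gateway",
}

def next_hop(dst: str) -> str:
    """Return the next hop for dst using longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in ROUTES if addr in net]
    # The default route 0.0.0.0/0 matches everything, so matches is never empty.
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]

print(next_hop("10.0.1.7"))       # lab-switch
print(next_hop("10.9.9.9"))       # core-router
print(next_hop("93.184.216.34"))  # default-gateway
```

The single 10.0.0.0/8 entry stands in for roughly 16 million host addresses, which is the aggregation a flat MAC address cannot offer.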
An IP address is assigned by static configuration, DHCP, or IPv6's SLAAC, not burned into hardware, and it belongs to a host's position in the network rather than to the hardware itself. Move a laptop to a new network and its IP changes while its MAC does not. The IP is the end-to-end address: the one a server actually sees as the source of a request, preserved across every hop unless a NAT rewrites it.
Ports — the Right Process
An IP address gets a packet to a host, but a host runs many programs at once — a web server, an SSH daemon, a database. The port is the 16-bit number in the transport header that says which of them the packet is for. A web server listens on port 443, SSH on 22, and the kernel uses the destination port to hand each arriving segment to the right socket.
Ports turn one host into thousands of independently addressable endpoints. They are the layer-4 demultiplexer: the transport layer's contribution to addressing is not getting data to a machine — IP does that — but getting it to the correct process on that machine, and distinguishing one of its connections from another.
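The demultiplexing role of the port can be seen on a single machine. The sketch below is a simplified demonstration, assuming the loopback interface is available: two UDP sockets in one script stand in for two separate processes, and the kernel delivers each datagram to the socket bound to its destination port.

```python
import socket

def make_listener() -> socket.socket:
    """Open a UDP socket on loopback; port 0 lets the kernel pick a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(2.0)  # fail rather than hang if nothing arrives
    return s

# Two endpoints on the same IP address, distinguished only by port.
svc_a = make_listener()
svc_b = make_listener()
port_a = svc_a.getsockname()[1]
port_b = svc_b.getsockname()[1]

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"for A", ("127.0.0.1", port_a))
sender.sendto(b"for B", ("127.0.0.1", port_b))

# Each socket receives only the datagram addressed to its own port.
msg_a, _ = svc_a.recvfrom(1024)
msg_b, _ = svc_b.recvfrom(1024)
print(port_a != port_b, msg_a, msg_b)  # True b'for A' b'for B'

for s in (svc_a, svc_b, sender):
    s.close()
```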
How the Three Combine — the Five-Tuple
A single connection is identified not by any one address but by the combination the kernel keys on: source IP, source port, destination IP, destination port, and the protocol. This five-tuple is what makes a connection unique. Two browser tabs to the same server share a destination IP and port but differ in source port, so the kernel keeps their data straight.
The five-tuple is the thing that matters operationally, because it is what the interesting middleboxes act on. A stateful firewall tracks flows by five-tuple; a NAT rewrites the source IP and port and remembers the mapping by five-tuple; a layer-4 load balancer hashes the five-tuple to pick a backend. When a later chapter says a device "tracks connections," the five-tuple is what it tracks.
    # ss shows the five-tuple of every connection: local and peer
    # IP:port pairs, with the protocol on the left
    ss -tn
    # Netid  Local Address:Port   Peer Address:Port
    # tcp    10.0.1.7:52344       93.184.216.34:443   <- one flow
    # tcp    10.0.1.7:52345       93.184.216.34:443   <- different src port
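To make the five-tuple concrete as a lookup key, here is a minimal Python sketch of a flow table combined with a hash-based backend choice. The backend names, the use of SHA-256, and the helper names are assumptions made for illustration; real firewalls and load balancers use their own data structures and hash functions.

```python
import hashlib
from typing import NamedTuple

class FiveTuple(NamedTuple):
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

BACKENDS = ["backend-a", "backend-b", "backend-c"]  # hypothetical pool

def pick_backend(flow: FiveTuple) -> str:
    """Hash the whole five-tuple so a given flow always maps to one backend."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]

# The flow table is keyed by the full five-tuple, not by IP alone.
flow_table: dict[FiveTuple, str] = {}

def track(flow: FiveTuple) -> str:
    """Record a flow on first sight and return its (stable) backend."""
    if flow not in flow_table:
        flow_table[flow] = pick_backend(flow)
    return flow_table[flow]

# Two browser tabs: same endpoints, different source port.
tab1 = FiveTuple("tcp", "10.0.1.7", 52344, "93.184.216.34", 443)
tab2 = FiveTuple("tcp", "10.0.1.7", 52345, "93.184.216.34", 443)

track(tab1)
track(tab2)
track(tab1)  # a repeat packet on the first flow adds no new entry
print(len(flow_table))  # 2
```

Keying on the destination IP and port alone would collapse both tabs into one entry, which is exactly the mistake the five-tuple avoids.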
MAC — layer 2, flat, link-local scope, assigned by the hardware vendor. Delivers a frame across one wire and is rewritten every hop. You reason about it on the local segment only.
IP — layer 3, hierarchical, internet-wide scope, assigned by config or DHCP. Delivers a packet end to end across networks and is preserved across hops unless a NAT rewrites it.
Port — layer 4, host-local scope, chosen by the application or the kernel. Delivers a segment to the right process, and together with the IPs and protocol forms the five-tuple that identifies a connection.
- Thinking a MAC address can route across the internet. It is flat and link-local; there is no hierarchy to route on, which is the entire reason IP addresses exist above it.
- Assuming a public IP identifies a single process or user. It identifies a host interface; the port distinguishes processes, and behind a NAT one public IP can front thousands of separate hosts.
- Forgetting that the five-tuple, not the IP alone, identifies a connection. Two flows can share every field but the source port, and treating them as one breaks firewall, NAT, and load-balancer reasoning.
- Confusing "the server's address" with a port. Reaching a service needs both an IP and a port; a firewall rule or connection string that names one without the other is incomplete.
- Expecting the MAC to change when a host moves networks, or the IP to stay. It is the reverse — the MAC is fixed to the hardware while the IP belongs to the host's place in the network.
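The point that one public IP can front many hosts is easier to see with a toy NAT table. The sketch below is deliberately simplified: the class name ToyNat, the public address (taken from a documentation range), and the starting port are all assumptions, and real NAT implementations differ in how they allocate ports and expire mappings.

```python
from itertools import count
from typing import NamedTuple, Optional

class Endpoint(NamedTuple):
    ip: str
    port: int

class ToyNat:
    """Rewrites the source IP and port of outbound flows and remembers the mapping."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._ports = count(first_port)
        self._out = {}   # (proto, private src, remote dst) -> public src
        self._back = {}  # (proto, public src, remote dst) -> private src

    def translate_out(self, proto: str, src: Endpoint, dst: Endpoint) -> Endpoint:
        key = (proto, src, dst)
        if key not in self._out:
            public = Endpoint(self.public_ip, next(self._ports))
            self._out[key] = public
            self._back[(proto, public, dst)] = src
        return self._out[key]

    def translate_in(self, proto: str, src: Endpoint, dst: Endpoint) -> Optional[Endpoint]:
        """Map a reply (remote src -> public dst) back to the private host, if known."""
        return self._back.get((proto, dst, src))

nat = ToyNat("203.0.113.5")
server = Endpoint("93.184.216.34", 443)
host_a = Endpoint("10.0.1.7", 52344)
host_b = Endpoint("10.0.1.8", 52344)

pub_a = nat.translate_out("tcp", host_a, server)
pub_b = nat.translate_out("tcp", host_b, server)
print(pub_a.ip == pub_b.ip, pub_a.port, pub_b.port)  # True 40000 40001
print(nat.translate_in("tcp", server, pub_b))        # Endpoint(ip='10.0.1.8', port=52344)
```

From the server's side both hosts appear as 203.0.113.5, and only the source port tells them apart.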
- Name the layer when you name an address: say "MAC" for the local wire, "IP" for the host, "port" for the process. The precision prevents most addressing confusion before it starts.
- Reason about connections as five-tuples when working with NAT, firewalls, or load balancers, because that is exactly the key those devices use to track and rewrite flows.
- When a service is unreachable, check both halves: is the IP routable, and is the port open and listening? A failure in either one produces the same "can't connect" symptom.
- Use the source port to tell concurrent connections apart when reading ss output or a capture; identical IPs and destination ports with different source ports are separate flows, not duplicates.
- Remember that one public IP can front many hosts behind NAT, so do not treat a source IP in a log as a unique client without accounting for shared addresses.
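The advice above to check both the IP and the port when a service is unreachable can be sketched with Python's standard socket module. The function name check_tcp and the result labels are invented for this example, and the interpretation of each result is a rule of thumb, not a guarantee.

```python
import socket

def check_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Attempt a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all: often a routing problem or a firewall dropping packets.
        return "timeout"
    except OSError as exc:
        # Name resolution failures, unreachable networks, and similar errors.
        return f"error: {exc}"

# Demonstration against a listener opened locally on a kernel-chosen port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
demo_port = listener.getsockname()[1]

print(check_tcp("127.0.0.1", demo_port))  # open
listener.close()
```

A "refused" result usually means the IP half is fine and the port half is the problem, while a "timeout" tends to point at routing or filtering somewhere along the path.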
Knowledge Check
Why can a MAC address not be used to route a packet across the internet?
- It is flat and link-local, with no hierarchy for routers to aggregate or follow to a distant network
- It is too short to encode enough networks for global routing across the many providers of the internet
- It changes too often for routers to keep a stable table
- It is encrypted, so routers in the middle cannot read it
A host has two simultaneous TCP connections to the same web server on port 443. What keeps the kernel from confusing them?
- Each connection has a different source port, so the full five-tuples are distinct
- Each connection uses a different destination port on the server
- Each connection is assigned a different source IP address by the kernel for the duration of the session
- Each connection is tagged with a different source MAC address
Which fields make up the five-tuple that a stateful firewall or NAT uses to track a flow?
- Source IP, source port, destination IP, destination port, and protocol
- Source MAC, destination MAC, IP, port, and TTL
- Only the destination IP and destination port
- Source IP, destination IP, and the TCP sequence number of the current segment