Topic 11

Loops, Spanning Tree, and Link Aggregation

Topology

Redundant layer-2 links are good for survivability and catastrophic for stability at the same time. The moment two switches have more than one path between them, you have a loop — and a broadcast frame in a layer-2 loop circulates forever, multiplying at every switch, until it saturates every link in seconds. This is a broadcast storm, and it takes a network down faster than almost any other failure because nothing at layer 2 stops it on its own.

Spanning Tree Protocol exists to make redundant topologies safe: it computes a loop-free tree and blocks the redundant links, keeping them as hot standbys that unblock only when the active path fails. Link aggregation solves a related problem from the opposite direction — it bonds several physical links into one logical link for bandwidth and failover, with no loop because the switches treat the bundle as a single port. Both deal in redundancy; they are mirror images of each other.

Why Layer-2 Loops Are Catastrophic

The IP header has a TTL that decrements at every router and kills a looping packet after at most 255 hops. The Ethernet header has no such field — there is nothing in a frame to count hops or expire it. A broadcast frame that enters a loop is flooded out every port, returns to where it started, and is flooded again, with no mechanism anywhere in the frame to stop it.

Worse, each switch in the loop duplicates the frame, so traffic grows exponentially rather than just circulating. Within seconds the links are at 100% utilization carrying nothing but copies of looping broadcasts, CPUs are pinned processing them, and the network is effectively dead for real traffic. A single mistakenly cabled redundant link, with no loop protection, is enough to cause this — which is exactly why STP is on by default on managed switches.
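
The growth described above can be sketched with a toy simulation. This is a minimal model under stated assumptions, not a packet-level simulator: the four-switch full mesh, the switch names, and the `flood` and `storm` helpers are all hypothetical, and every hop is treated as taking the same time. In a plain two-path ring the copies would circulate without multiplying; the doubling here comes from each switch having two other loop-facing ports to flood out of.

```python
# Hypothetical topology: four switches in a full mesh, no STP, no loop protection.
LINKS = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["A", "B", "C"],
}


def flood(frames):
    """Advance one hop: each switch floods a copy out every
    inter-switch port except the one the frame arrived on."""
    next_frames = []
    for switch, came_from in frames:
        for neighbor in LINKS[switch]:
            if neighbor != came_from:
                next_frames.append((neighbor, switch))
    return next_frames


def storm(hops):
    """Return the number of copies in flight after each hop of ONE broadcast."""
    # A host on switch A sends a single broadcast; came_from=None is its access port.
    frames = [("A", None)]
    counts = []
    for _ in range(hops):
        frames = flood(frames)
        counts.append(len(frames))
    return counts


print(storm(6))  # [3, 6, 12, 24, 48, 96]
```

Nothing in the loop ever removes a copy, which is the point: there is no hop counter to decrement, so the only limit is link capacity.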

Spanning Tree Protocol

STP builds a loop-free tree in three steps. First, the switches elect a root bridge — the switch with the lowest bridge ID (configurable priority plus MAC). Second, every other switch finds its lowest-cost path to the root and marks that port as its root port. Third, for each segment one switch is chosen to forward, and every remaining redundant port is put into a blocking state — it stays up and listens for topology changes but forwards no data, so no loop can form.
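
The three steps can be sketched on a small example. This is a simplified illustration, not the 802.1D state machine: the three-switch triangle, the switch names, the link cost of 4, and the `spanning_tree` function are assumptions made for the example, bridge IDs are compared as (priority, MAC string) tuples, and ties are broken by neighbor bridge ID only (real STP also compares port IDs).

```python
import heapq

# Hypothetical triangle of switches: name -> (priority, MAC).
BRIDGES = {
    "sw1": (32768, "00:00:00:00:00:01"),
    "sw2": (32768, "00:00:00:00:00:02"),
    "sw3": (32768, "00:00:00:00:00:03"),
}
# (switch, switch, link cost); the cost value is illustrative.
LINKS = [("sw1", "sw2", 4), ("sw1", "sw3", 4), ("sw2", "sw3", 4)]


def spanning_tree(bridges, links):
    # Step 1: the lowest bridge ID becomes the root bridge.
    root = min(bridges, key=bridges.get)

    neighbors = {name: [] for name in bridges}
    for a, b, cost in links:
        neighbors[a].append((b, cost))
        neighbors[b].append((a, cost))

    # Step 2a: lowest path cost from every bridge to the root (Dijkstra).
    cost_to_root = {root: 0}
    queue = [(0, root)]
    while queue:
        cost, bridge = heapq.heappop(queue)
        if cost > cost_to_root.get(bridge, float("inf")):
            continue
        for peer, link_cost in neighbors[bridge]:
            new_cost = cost + link_cost
            if new_cost < cost_to_root.get(peer, float("inf")):
                cost_to_root[peer] = new_cost
                heapq.heappush(queue, (new_cost, peer))

    # Step 2b: each non-root bridge marks its cheapest port toward the root.
    roles = {}  # (bridge, peer) -> "root" | "designated" | "blocked"
    for bridge in bridges:
        if bridge == root:
            continue
        best_peer, _ = min(
            neighbors[bridge],
            key=lambda n: (cost_to_root[n[0]] + n[1], bridges[n[0]]),
        )
        roles[(bridge, best_peer)] = "root"

    # Step 3: per link, the end closer to the root forwards (designated);
    # the far end is blocked unless it is already that bridge's root port.
    for a, b, _ in links:
        designated = min((a, b), key=lambda s: (cost_to_root[s], bridges[s]))
        other = b if designated == a else a
        roles[(designated, other)] = "designated"
        roles.setdefault((other, designated), "blocked")

    return root, roles


root, roles = spanning_tree(BRIDGES, LINKS)
print("root bridge:", root)
for (bridge, peer), role in sorted(roles.items()):
    print(f"{bridge} port toward {peer}: {role}")
```

In this triangle exactly one port ends up blocked (sw3's port toward sw2), which is enough to break the only loop while leaving every switch reachable.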

When the active path fails, a blocked port can transition back to forwarding and restore connectivity over the redundant link. That is the entire value proposition: physical redundancy that is logically loop-free, with automatic failover. The cost is that half your links may sit idle in blocking state, carrying nothing until something breaks — which is part of why people reach for link aggregation or layer-3 designs instead.

# spanning tree port state on a Linux bridge (needs mstpd; output abridged and illustrative)
mstpctl showport br0
# eth1  forwarding  (root port toward the root bridge)
# eth2  blocking    (redundant link: still up, but forwards no data, which breaks the loop)

Figure: The loop-free tree STP builds over a redundant topology

Root bridge: lowest bridge ID wins the election
Root ports: each switch's lowest-cost path to the root
Designated ports: one forwarding port per segment
Blocked port: redundant link held down, loop prevented

RSTP and Faster Convergence

Classic STP (802.1D) is slow. A port moving from blocking to forwarding walks through listening and learning states on timers, taking 30 to 50 seconds to converge after a topology change — an eternity during which traffic over the recovering path is black-holed. On a network where links flap, that pause is repeatedly mistaken for an outage when it is actually STP reconverging.
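
The 30 and 50 second figures fall out of the default 802.1D timers. The arithmetic below is a sketch that assumes those defaults (max age 20 s, forward delay 15 s); the function name is hypothetical, and real convergence also depends on where in the tree the failure happens.

```python
# Default 802.1D timers, in seconds.
MAX_AGE = 20        # how long a port keeps the last BPDU it heard before discarding it
FORWARD_DELAY = 15  # spent once in listening and once more in learning


def convergence_seconds(failure_seen_locally):
    """Approximate time before a blocked port forwards again under classic STP."""
    # A failure on the switch's own link is noticed at once; a failure
    # further away is noticed only when the stored BPDU ages out.
    wait_for_stale_bpdu = 0 if failure_seen_locally else MAX_AGE
    return wait_for_stale_bpdu + 2 * FORWARD_DELAY


print(convergence_seconds(True))   # 30
print(convergence_seconds(False))  # 50
```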

Rapid Spanning Tree (802.1w, RSTP) replaces the timers with an explicit handshake between neighbors and converges in under a second on point-to-point links. RSTP is the modern default and replaces classic STP wherever the hardware supports it. If you are still seeing 30-second reconvergence pauses, you are probably running legacy STP and should move to RSTP or a layer-3 design that avoids the problem entirely.

Link Aggregation

Link aggregation bundles multiple physical links into one logical link. The bundle is called a LAG, or a "bond" on Linux, and LACP (originally IEEE 802.3ad, now 802.1AX) is the protocol the two ends use to negotiate it. Two switches joined by four 10 Gbps links present a single 40 Gbps pipe, and the loss of any one member degrades capacity rather than dropping the connection. Because the switches treat the bundle as one logical port, STP sees no loop, so the redundancy is used instead of blocked.

The catch is how traffic spreads across members: the sending device hashes each flow (on source/destination MAC, IP, or port, depending on the configured hash policy) to one member and pins it there, so a single TCP connection rides one physical link and cannot exceed that link's speed. LACP itself only negotiates the bundle; it does not stripe frames across members. Four bonded 10 Gbps links give 40 Gbps of aggregate across many flows, but any one flow tops out at 10 Gbps. Expecting a single large transfer to use the whole bundle is the most common link aggregation misunderstanding.
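
Per-flow pinning is easy to see in a sketch. The hash below is a stand-in: real devices use vendor- or driver-specific hash functions, and the member names, addresses, and the `pick_member` helper are hypothetical. What the example does show faithfully is the property that matters, which is that the same flow tuple always maps to the same member.

```python
import zlib

MEMBERS = ["eth0", "eth1", "eth2", "eth3"]  # hypothetical member links
MEMBER_GBPS = 10


def pick_member(src_ip, dst_ip, src_port, dst_port, members=MEMBERS):
    """Map a flow to one member link; CRC32 stands in for the real hash."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return members[zlib.crc32(key) % len(members)]


# One TCP connection: every frame carries the same tuple, so it never changes link.
flow = ("10.0.0.5", "10.0.0.9", 40000, 443)
links_used_by_one_flow = {pick_member(*flow) for _ in range(1000)}
single_flow_ceiling = len(links_used_by_one_flow) * MEMBER_GBPS

# Many connections: different source ports hash to different members.
flows = [("10.0.0.5", "10.0.0.9", port, 443) for port in range(40000, 40200)]
links_used_by_many_flows = {pick_member(*f) for f in flows}

print("one flow uses:", sorted(links_used_by_one_flow))
print("one flow ceiling (Gbps):", single_flow_ceiling)
print("200 flows use:", sorted(links_used_by_many_flows))
```

If a member fails, calling `pick_member` with the shorter member list re-maps the affected flows onto the survivors, which is the "degrades capacity rather than dropping the connection" behavior in miniature.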

Spanning Tree vs Link Aggregation

Spanning Tree blocks redundant links to kill loops, keeping them idle until the active path fails. It is loop prevention with automatic failover, and the redundant capacity is wasted under normal operation — you trade bandwidth for a guarantee that no loop can form.

Link aggregation uses redundant links as one logical pipe, hashing flows across all members for bandwidth and surviving the loss of any one. Both target redundancy, but by opposite mechanisms: STP holds spare links down, LAG runs them all at once. Use LAG between two devices that need the extra bandwidth; rely on STP to make a meshed topology safe.

Common Mistakes

Disabling STP "to speed things up" or stop its convergence pauses. Without loop protection, the first redundant or mistakenly doubled link creates a broadcast storm that saturates every link in seconds — far worse than any pause STP ever caused.
Creating a loop with one careless cable. Patching two switch ports together, or daisy-chaining through a desk switch, forms a loop that storms the segment instantly unless STP is there to block one of the paths.
Expecting LACP to spread a single flow across all members. A bond hashes per flow and pins each connection to one link, so one big transfer is capped at a single member's speed — the aggregate bandwidth only appears across many simultaneous flows.
Mistaking an RSTP/STP convergence pause for a hardware outage. A link flap triggers reconvergence — sub-second on RSTP, 30 to 50 seconds on legacy STP — and chasing it as a dead device wastes time when the fix is moving off classic STP.
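Mismatching LACP configuration on the two ends of a bond. One side static and the other negotiating, both sides passive, or members that differ in speed or duplex, leaves the bundle down, half-up, or dropping frames in a way that looks like intermittent loss rather than a bonding misconfiguration. The hash policy, by contrast, is chosen independently by each sender and does not need to match.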
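
The mode mismatch in the last item can be written down as a small compatibility check. This is a sketch of the negotiation rule only: the function name is hypothetical, "static" stands for a bundle configured without LACP, and it ignores other requirements such as matching member speed.

```python
def bundle_forms(mode_a, mode_b):
    """Return True when the two ends agree on how to bring the bundle up.

    Modes: "active"  (sends LACPDUs unprompted),
           "passive" (answers LACPDUs but never starts),
           "static"  (no LACP; members are bundled unconditionally).
    """
    modes = {mode_a, mode_b}
    if modes == {"static"}:
        return True   # both ends bundle blindly; it works, but nothing verifies the wiring
    if "static" in modes:
        return False  # one end negotiates, the other never answers
    return "active" in modes  # passive + passive: neither end starts the negotiation


for a in ("active", "passive", "static"):
    for b in ("active", "passive", "static"):
        print(f"{a:8} + {b:8} -> {'ok' if bundle_forms(a, b) else 'MISMATCH'}")
```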

Best Practices

Leave STP enabled on every access switch and add BPDU guard on edge ports, so a user plugging in a rogue switch or looping a cable is shut down instantly instead of storming the segment.
Run RSTP rather than legacy 802.1D wherever the hardware supports it, to cut reconvergence from 30–50 seconds to under a second and stop convergence pauses from looking like outages.
Set the root bridge deliberately by lowering its priority on a core switch, so STP roots the tree where you want it rather than electing whichever switch happens to have the lowest MAC.
Use LACP (not static bonding) for link aggregation so both ends actively verify the bundle's health and a miswired or failed member is detected, rather than silently black-holing the flows hashed to it.
Size aggregation for many flows, not one — add bonded members to raise aggregate throughput across connections, and reach for a faster single link when one flow itself must go faster than a member.
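
Why lowering the priority pins the root can be shown with the bridge ID comparison itself. The sketch below assumes the common default priority of 32768 and packs the ID as priority in the high 16 bits and MAC in the low 48 bits; the switch names, MAC addresses, and helper functions are hypothetical.

```python
def bridge_id(priority, mac):
    """Pack priority (high 16 bits) and MAC (low 48 bits) into one comparable integer."""
    return (priority << 48) | int(mac.replace(":", ""), 16)


def elected_root(switches):
    """The switch with the numerically lowest bridge ID wins the election."""
    return min(switches, key=lambda name: bridge_id(*switches[name]))


# Everything left at the default priority: the lowest MAC wins,
# which here is an old closet switch rather than the core.
defaults = {
    "core": (32768, "00:1b:21:ff:ff:ff"),
    "closet": (32768, "00:00:0c:00:00:01"),
}
print(elected_root(defaults))  # closet

# Priority lowered on the core: priority is compared before MAC, so the core wins.
tuned = dict(defaults, core=(4096, "00:1b:21:ff:ff:ff"))
print(elected_root(tuned))     # core
```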

Comparable concepts

L3 ECMP (equal-cost multipath)
Cloud managed switching (no STP exposure)

Knowledge Check

Why is a broadcast frame in a layer-2 loop so much more dangerous than a packet in a layer-3 loop?

The Ethernet header has no TTL, so the frame never expires and multiplies at each switch
Broadcast frames are far larger, so they fill the links more quickly
Switches forward more slowly than routers, so the loop builds up backlog
The FCS integrity check fails over and over on the looping frame, which forces every switch in the path to resend it

What does Spanning Tree do with the redundant links it does not need?

It puts them in a blocking state where they forward nothing until the active path fails
It load-balances traffic evenly across all of them for extra bandwidth
It shuts the redundant ports down completely, so an operator has to re-enable each one by hand after a failure
It merges them with the active link into a single logical pipe

Four 10 Gbps links are bonded with LACP. How fast can a single TCP transfer go across them?

About 10 Gbps, because one flow is hashed to a single member link
The full 40 Gbps, since the bond spreads one flow across all four links
About 20 Gbps, because LACP stripes each flow across two links at a time
About 10 Gbps, because the other three links stay blocked as standby
