Topic 30

Disruption Budgets and QoS

Availability, Priority

Two mechanisms decide how a workload fares when the cluster has to disturb it. A PodDisruptionBudget (PDB) protects availability during voluntary disruptions like node drains. Quality of Service classes and priority decide who gets evicted first when a node runs out of resources.

These are the controls that separate a service that survives routine cluster operations from one that has an outage every time a node is upgraded. They cover the two kinds of disruption: the planned and the forced.

Voluntary vs Involuntary Disruption

A voluntary disruption is one the cluster initiates deliberately — draining a node for an upgrade, the Cluster Autoscaler removing a node, a rolling node replacement. An involuntary one is unplanned — a node crash, hardware failure, an out-of-memory kill. PDBs protect against the voluntary kind; nothing can fully protect against the involuntary kind except running enough replicas across enough failure domains.
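As a rough sketch of what "enough replicas across enough failure domains" can look like, the Deployment fragment below asks the scheduler to spread three replicas across zones. The workload name, the app: web label, the placeholder image, and the choice of zone as the failure domain are all illustrative assumptions, not taken from any particular cluster.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                                  # hypothetical workload name
spec:
  replicas: 3                                # more than one replica, so one node loss is survivable
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                         # zones may differ by at most one matching Pod
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway  # prefer spreading, but do not block scheduling
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: registry.example.com/web:1.0   # placeholder image
```

With ScheduleAnyway the spread is a preference rather than a guarantee; DoNotSchedule makes it strict, at the risk of Pods staying Pending when a zone is full.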

PodDisruptionBudgets

A PDB declares how much disruption a workload can tolerate at once: minAvailable (keep at least this many running) or maxUnavailable (take at most this many down). When an operator or controller drains a node, the eviction API respects the PDB — it will not evict a Pod if doing so would breach the budget, and the drain waits. This is what lets a node upgrade roll through the cluster without ever dropping a service below its safe replica count.

A PDB keeping at least two replicas up

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
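The same budget can be written from the other direction. The sketch below assumes the same app: web label and is meant as an alternative to the minAvailable version, not an addition: a PDB takes either minAvailable or maxUnavailable, and two PDBs selecting the same Pods will cause evictions to fail.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web              # replaces the minAvailable version above; do not apply both
spec:
  maxUnavailable: 1      # at most one matching Pod may be voluntarily evicted at a time
  selector:
    matchLabels:
      app: web
```

One reason to prefer maxUnavailable is that it keeps allowing progress when the replica count changes, whereas a fixed minAvailable can silently become equal to the replica count after a scale-down.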

QoS and Eviction Order

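When a node is under resource pressure, the kubelet evicts Pods to recover, and the order broadly follows QoS class (derived from requests and limits, Topic 25): BestEffort Pods and Burstable Pods using more than they requested go first, while Guaranteed Pods and Burstable Pods staying within their requests go last. Strictly speaking, the kubelet ranks candidates by whether their usage exceeds their requests, then by Pod priority, then by how far usage exceeds requests; the QoS ordering is the usual outcome of that ranking rather than a separate rule. So the way to make a Pod survive node pressure longest is to give every container requests equal to its limits, for both CPU and memory. Node-pressure eviction does not consult PDBs. This is the involuntary-disruption side: you cannot prevent the pressure, but you can influence who pays for it.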
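A minimal sketch of a Pod that lands in the Guaranteed class, assuming a hypothetical name, a placeholder image, and arbitrary resource figures. What matters is that every container sets both CPU and memory, with requests equal to limits.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-guaranteed                      # hypothetical name
  labels:
    app: web
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0   # placeholder image
      resources:
        requests:
          cpu: 500m                         # illustrative values; size these from real usage
          memory: 256Mi
        limits:
          cpu: 500m                         # equal to the request
          memory: 256Mi                     # equal to the request
```

The assigned class appears in the Pod's status.qosClass field, which can be read with kubectl get pod web-guaranteed -o jsonpath='{.status.qosClass}'. If any container omits a value or sets a limit above its request, the Pod becomes Burstable instead.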

Priority and Preemption

PriorityClasses add another dimension: a higher-priority Pending Pod can preempt (evict) lower-priority Pods to make room. This ensures critical workloads schedule even on a full cluster, at the cost of disrupting less-important ones. Used together, the picture is: PDBs bound voluntary disruption, QoS orders eviction under pressure, and priority decides who wins the competition for scarce capacity. A production-ready workload sets all three deliberately.
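Priority is set in two steps: define a PriorityClass, then reference it from the Pod spec. The sketch below assumes a hypothetical class name business-critical, an arbitrary value of 100000, and a placeholder image.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: business-critical                   # hypothetical class name
value: 100000                               # higher number means higher priority
globalDefault: false                        # only Pods that name this class get it
preemptionPolicy: PreemptLowerPriority      # the default: may evict lower-priority Pods
description: "Workloads that must schedule even when the cluster is full."
---
apiVersion: v1
kind: Pod
metadata:
  name: web-critical                        # hypothetical name
spec:
  priorityClassName: business-critical      # must match an existing PriorityClass
  containers:
    - name: web
      image: registry.example.com/web:1.0   # placeholder image
```

Setting preemptionPolicy to Never would still place these Pods ahead of others in the scheduling queue, but without evicting anything to make room.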

PDB vs QoS eviction

PDB (voluntary) — limits how many Pods may be taken down at once during drains and scale-down. Protects planned-disruption availability.

QoS eviction (involuntary) — orders which Pods the kubelet kills under node pressure. Guaranteed survives longest.

Common Mistakes

No PDB, so nothing stops a drain (or several concurrent drains) from evicting all of a service's replicas at once.
A PDB so strict (e.g. minAvailable equal to replica count) that drains and scale-down hang forever.
Running important Pods as BestEffort, so they are the first evicted under pressure.
Assuming a PDB protects against node crashes — it only governs voluntary disruptions.
Ignoring PriorityClasses, so critical workloads can't preempt their way onto a full cluster.

Best Practices

Set a PDB on every multi-replica service so cluster operations never drop it below a safe count.
Size the PDB to allow progress — leave room for at least one Pod to be evicted at a time.
Give critical and stateful Pods Guaranteed QoS so they are evicted last under pressure.
Use PriorityClasses to ensure essential workloads can schedule on a contended cluster.
Combine all three — PDB, QoS, priority — for any workload that must stay up through cluster churn.
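Put together, the three controls might look like the sketch below. It assumes a PriorityClass named business-critical already exists in the cluster; the workload name, label, image, and resource figures are likewise illustrative.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                                     # hypothetical workload name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      priorityClassName: business-critical      # priority: assumed to exist already
      containers:
        - name: web
          image: registry.example.com/web:1.0   # placeholder image
          resources:                            # QoS: requests == limits gives Guaranteed
            requests:
              cpu: 500m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 256Mi
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
spec:
  maxUnavailable: 1                             # PDB: drains may take one replica at a time
  selector:
    matchLabels:
      app: web
```

With three replicas and maxUnavailable: 1, a drain can always evict one Pod and then waits for its replacement to become ready before evicting the next, so the service stays at two or more ready replicas throughout.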

Related

Requests, limits & QoS: where QoS class comes from (Topic 25)
Upgrades and node maintenance: drains that respect PDBs (Topic 51)
Cluster Autoscaler: scale-down also respects PDBs (Topic 28)

Knowledge Check

What kind of disruption does a PodDisruptionBudget protect against?

Voluntary disruptions like node drains and scale-down — not node crashes
Involuntary node crashes, kernel panics, and sudden hardware failures
Application bugs that crash the process inside a container
Network partitions that split a Pod's node off from the API server entirely

How do you make a Pod survive node memory pressure the longest?

Give it Guaranteed QoS by setting requests equal to limits
Set it to BestEffort with neither requests nor limits
Attach a PodDisruptionBudget with a very high minAvailable
Increase its readiness probe interval and timeout

A node drain is hanging and never completes. What PDB misconfiguration causes this?

A PDB so strict it never permits an eviction (e.g. minAvailable equals the replica count)
A PDB with maxUnavailable set far too high relative to the workload's total replica count
No QoS class assigned to any of the Pods currently sitting on the node being drained
A missing readiness probe on the workload's containers
