Topic 51

Upgrades and Node Maintenance

Kubernetes ships about three releases a year, and clusters must be upgraded to stay supported. Done right (respecting version skew, upgrading the control plane first, and draining nodes in a way that honors disruption budgets), an upgrade rolls through without an outage. Done carelessly, it is how clusters break.

Node maintenance shares the same machinery: cordoning and draining a node safely is the basis of both upgrades and any planned node work.

Version Skew Rules

Kubernetes permits only limited version skew between components. The order is fixed: control plane first, then nodes. The kubelet may trail the API server (by up to three minor versions in current releases, two in older ones) but must never be newer than it. The control plane itself moves one minor version at a time: you cannot jump from 1.29 to 1.32 directly, and must pass through 1.30 and 1.31 on the way. Violating skew is a reliable way to get subtle failures.
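The skew rules above can be turned into a small pre-flight check. This is a minimal sketch: the function names (minor_of, upgrade_step_ok, kubelet_skew_ok) are hypothetical helpers, not part of kubectl or kubeadm, and it assumes both versions share the same major version. The maximum kubelet skew is a parameter, defaulting here to three minor versions, which is the documented limit for recent releases; pass 2 for older ones.

```shell
# Hypothetical helpers for checking version skew before an upgrade.
# Assumes versions look like "1.30.4" or "v1.30.4" and share a major version.

# minor_of VERSION: print the minor component, e.g. "v1.30.4" -> 30
minor_of() {
  local v="${1#v}"   # strip an optional leading "v"
  v="${v#*.}"        # drop the major component
  echo "${v%%.*}"    # keep only the minor component
}

# upgrade_step_ok CURRENT TARGET: succeed only if TARGET is the same minor
# version as CURRENT or exactly one minor version ahead.
upgrade_step_ok() {
  local cur tgt
  cur="$(minor_of "$1")"
  tgt="$(minor_of "$2")"
  [ $((tgt - cur)) -ge 0 ] && [ $((tgt - cur)) -le 1 ]
}

# kubelet_skew_ok APISERVER KUBELET [MAX_SKEW]: succeed only if the kubelet
# is not newer than the API server and trails it by at most MAX_SKEW minors.
kubelet_skew_ok() {
  local api kubelet max
  api="$(minor_of "$1")"
  kubelet="$(minor_of "$2")"
  max="${3:-3}"
  [ "$kubelet" -le "$api" ] && [ $((api - kubelet)) -le "$max" ]
}
```

For example, upgrade_step_ok 1.29.6 1.30.2 succeeds while upgrade_step_ok 1.29.6 1.32.0 fails, which matches the rule that minor versions cannot be skipped.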

Draining Nodes Safely

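To work on a node you first cordon it (mark it unschedulable, so no new Pods land) and then drain it (evict the running Pods so they reschedule elsewhere). The drain uses the eviction API, which respects PodDisruptionBudgets (Topic 30): an eviction that would breach a budget is refused and kubectl drain retries until it is allowed, so a well-set PDB keeps a service above its safe replica count throughout. Some Pods need explicit flags: DaemonSet Pods are left in place and require --ignore-daemonsets, Pods with emptyDir volumes require --delete-emptydir-data (their local data is lost), and Pods not managed by a controller require --force. After maintenance, uncordon the node to let Pods schedule again.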

Cordon, drain, work, uncordon

kubectl cordon node-3                  # no new Pods
# evict, respecting PDBs (a comment after the trailing backslash would break the continuation)
kubectl drain node-3 \
  --ignore-daemonsets --delete-emptydir-data
# ... patch / upgrade the node ...
kubectl uncordon node-3                # schedulable again
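Because a drain waits on any PodDisruptionBudget that currently allows no disruptions, it can help to look for such budgets before starting. The following is a sketch under that assumption; blocked_pdbs is a hypothetical helper name, and it relies on the status.disruptionsAllowed field that PDBs report.

```shell
# blocked_pdbs: print "namespace/name" for every PodDisruptionBudget that
# currently allows zero disruptions. A drain that touches Pods covered by
# one of these budgets will wait until the budget allows an eviction.
blocked_pdbs() {
  kubectl get pdb --all-namespaces --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed' \
  | awk '$3 == 0 { print $1 "/" $2 }'
}
```

An empty result does not guarantee a smooth drain, but a non-empty one is a strong hint that some workload is already at its minimum and should be scaled up or repaired first.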

The Upgrade Path

A rolling upgrade combines these. Snapshot etcd first. Upgrade the control plane components (on a managed cluster, one click or API call; on kubeadm, kubeadm upgrade). Then upgrade nodes one or a few at a time: drain, upgrade the kubelet and runtime (or, more commonly, replace the node with a new one on the target version — surge/blue-green node pools), uncordon, move on. Throughout, PDBs keep workloads available. Managed clusters offer release channels that automate much of this.
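The node phase of this path can be sketched as a loop that handles one node at a time. This is an illustrative outline, not a complete upgrade tool: roll_nodes and upgrade_node are hypothetical names, upgrade_node is a placeholder for whatever actually upgrades or replaces the node in your environment, and the 10-minute drain timeout is an arbitrary choice. The etcd snapshot and control plane upgrade are assumed to have been completed beforehand.

```shell
# upgrade_node NODE: hypothetical hook. Replace the body with the real
# per-node step (package upgrade, image swap, node replacement, ...).
upgrade_node() {
  echo "upgrade $1 (placeholder)"
}

# roll_nodes NODE...: cordon, drain, upgrade, and uncordon each node in
# turn. Stops at the first failure and leaves that node cordoned so it
# can be inspected before any further nodes are touched.
roll_nodes() {
  local node
  for node in "$@"; do
    kubectl cordon "$node" || return 1
    kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data \
      --timeout=10m || return 1
    upgrade_node "$node" || return 1
    kubectl uncordon "$node" || return 1
  done
}
```

A call such as roll_nodes node-1 node-2 node-3 would then walk the nodes in order. Stopping on the first failed drain is a deliberate choice here: a drain that times out usually means a PDB is blocking, and continuing to other nodes would only reduce capacity further.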

Deprecated APIs: The Pre-Upgrade Trap

The most common upgrade breakage is not the mechanics: it is removed API versions. As APIs graduate, their older versions are deprecated and eventually removed, and a manifest that still uses a removed version will fail to apply after the upgrade. Before upgrading, scan your manifests and Helm charts for deprecated APIs (tools like kubent / pluto help) and migrate them to the current version. Checking this before the upgrade turns a potential outage into a non-event.
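As a rough illustration of what such a scan looks for, the sketch below greps a directory of manifests for a short, hand-picked list of API versions that have already been removed from Kubernetes. find_removed_apis is a hypothetical helper, the list is deliberately incomplete, and it will miss templated or quoted apiVersion values, so it is no substitute for a dedicated tool.

```shell
# find_removed_apis [DIR]: list YAML files under DIR (default: current
# directory) whose apiVersion is one of a few already-removed versions.
# Returns non-zero when nothing matches, like grep itself.
#   extensions/v1beta1, networking.k8s.io/v1beta1  (gone as of 1.22)
#   policy/v1beta1, batch/v1beta1, autoscaling/v2beta1  (gone as of 1.25)
find_removed_apis() {
  local dir="${1:-.}"
  grep -rlE --include='*.yaml' --include='*.yml' \
    '^[[:space:]]*apiVersion:[[:space:]]*(extensions/v1beta1|networking\.k8s\.io/v1beta1|policy/v1beta1|batch/v1beta1|autoscaling/v2beta1)[[:space:]]*$' \
    "$dir"
}
```

Running find_removed_apis ./manifests prints one path per offending file, which gives a quick first pass before running a fuller scanner against both the files and the live cluster.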

In-place node upgrade vs node replacement

In-place — drain, upgrade kubelet/runtime on the same node, uncordon. Fewer machines, more state to get right.

Replacement (surge/blue-green pools) — add new nodes on the target version, drain old ones, remove them. Cleaner and the common managed-cluster approach.

Common Mistakes

Skipping minor versions (e.g. 1.29 → 1.32) instead of upgrading one at a time.
Upgrading nodes before the control plane, violating version skew.
Draining without PDBs, taking all of a service's replicas down at once.
Not scanning for removed API versions first, so manifests fail to apply post-upgrade.
Forgetting an etcd snapshot before a control-plane upgrade.

Best Practices

Upgrade one minor version at a time, control plane before nodes.
Set PDBs so drains and upgrades never drop a service below a safe replica count.
Scan for and migrate deprecated/removed APIs before upgrading (kubent/pluto).
Snapshot etcd before any control-plane upgrade.
Prefer node replacement (surge/blue-green pools) over in-place upgrades where possible.

Related

PodDisruptionBudgets — what makes drains safe (Topic 30)
etcd and backup — snapshot before upgrading (Topic 50)
Managed release channels — automated upgrade paths (Part 11)

Knowledge Check

What is the correct upgrade order and step size?

Control plane before nodes, one minor version at a time
Nodes before control plane, any number of versions at once
All components at once, in a single simultaneous step
Nodes only; the control plane upgrades itself automatically

What does draining a node do, and what does it respect?

It evicts the node's Pods to reschedule elsewhere, respecting PodDisruptionBudgets
It deletes the Node object from etcd immediately and unregisters it from the cluster for good
It restarts every Pod across all nodes in the cluster
It ignores PodDisruptionBudgets to drain the node faster

What is the most common cause of an upgrade breaking workloads?

Manifests using API versions that were removed in the new release
The control plane running too many API server replicas behind the load balancer
Using PodDisruptionBudgets to guard availability
Draining the nodes one at a time during the rollout
