The Container Storage Interface
Topic 19

The Container Storage Interface (CSI) is the standard that lets any storage vendor plug into Kubernetes without its code living in Kubernetes itself. The cloud disks, SANs, and distributed file systems you consume through a StorageClass are, in current Kubernetes, almost always implemented as CSI drivers.

CSI is mostly invisible — you interact with PVCs and StorageClasses, and a driver does the work underneath. But knowing it is there explains how provisioning, snapshots, and expansion actually happen, and what breaks when a driver is misconfigured.

Why CSI Exists

Originally, storage drivers were compiled into Kubernetes core ("in-tree"), so every storage change meant a Kubernetes release and every vendor's bugs shipped in the core binary. CSI moved drivers out of tree: a vendor writes a driver to the CSI spec and ships it independently, and the cluster operator installs it into the cluster. Most in-tree plugins have since been migrated to CSI and their old code paths removed. This is the same decoupling pattern as the CRI for runtimes: a stable interface that keeps the core small.

How a CSI Driver Is Structured

A CSI driver runs in two parts. A controller component (typically a Deployment) handles cluster-wide operations: creating and deleting volumes, attaching them to nodes, and taking snapshots. A node component (a DaemonSet) runs on every node and handles the node-local work: mounting volumes into Pods and unmounting them. When a stateful Pod can't mount its volume, the node component is usually where to look; when provisioning fails, look at the controller.
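
The split above can be written down as a small lookup. This is only an illustrative sketch: the operation names and the `component_for` helper are hypothetical, chosen to mirror the responsibilities listed in this section, and are not part of any CSI tooling.

```python
# Which operations each CSI component handles, as described in this section.
# The operation names are illustrative labels, not CSI RPC names.
CONTROLLER_OPS = {"create", "delete", "attach", "snapshot"}
NODE_OPS = {"mount", "unmount"}


def component_for(operation: str) -> str:
    """Return the CSI component to inspect first for a failing operation."""
    op = operation.strip().lower()
    if op in CONTROLLER_OPS:
        return "controller (Deployment)"
    if op in NODE_OPS:
        return "node (DaemonSet)"
    raise ValueError(f"unknown CSI operation: {operation!r}")


if __name__ == "__main__":
    for op in ("create", "mount"):
        print(f"{op}: check the {component_for(op)}")
```

In practice the same question is answered by looking at the driver's controller Pods for provisioning errors and at the node Pod on the affected node for mount errors.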

Snapshots, Cloning, Expansion

Beyond basic provisioning, CSI standardizes richer operations when the driver supports them. A VolumeSnapshot captures a point-in-time copy of a volume; you can restore from it or clone from it into a new volume. Volume expansion grows a PVC in place, provided the StorageClass sets allowVolumeExpansion. One caution that costs people dearly: a snapshot is not a backup. It typically lives in the same storage system and shares its fate, whereas a real backup is a copy kept somewhere independent (Topic 50 covers backup properly).
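
To make these operations concrete, the sketch below builds the relevant objects as plain Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The object names (`data`, `data-snap`, `data-restored`, `example-snapclass`, `example-fast`) are hypothetical, and the example assumes the driver supports snapshots and expansion and that the snapshot CRDs are installed in the cluster.

```python
import json


def snapshot_manifest(name: str, pvc_name: str, snapshot_class: str) -> dict:
    """VolumeSnapshot that captures the current state of an existing PVC."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def restore_manifest(name: str, snapshot_name: str,
                     storage_class: str, size: str) -> dict:
    """New PVC whose contents are populated from a VolumeSnapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
        },
    }


def expansion_patch(new_size: str) -> dict:
    """Patch body that raises a PVC's requested size (expansion in place)."""
    return {"spec": {"resources": {"requests": {"storage": new_size}}}}


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    snap = snapshot_manifest("data-snap", "data", "example-snapclass")
    restored = restore_manifest("data-restored", "data-snap", "example-fast", "10Gi")
    for obj in (snap, restored, expansion_patch("20Gi")):
        print(json.dumps(obj, indent=2))
```

Note that both the snapshot and the restored volume still live on the same storage backend, which is exactly why neither counts as a backup.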

Topology and Operations

CSI is also topology-aware: a driver advertises which zones a volume can be reached from, which is what makes WaitForFirstConsumer binding work correctly. Operationally, the things that bite are version skew (a driver incompatible with the cluster version) and a missing or crash-looping node component (mounts hang). On managed clusters the provider ships and maintains the relevant CSI drivers, which is one less thing to operate.
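
A toy model can show why the binding mode matters for zonal storage. This is a deliberately simplified sketch, not how the scheduler or any driver is implemented: `choose_zone` is a hypothetical function, and picking the alphabetically first zone merely stands in for "a zone chosen without knowing where the Pod will run".

```python
from typing import Iterable, Optional


def choose_zone(binding_mode: str, reachable_zones: Iterable[str],
                pod_zone: Optional[str] = None) -> Optional[str]:
    """Toy model: in which zone does the volume get created?"""
    zones = sorted(reachable_zones)
    if binding_mode == "Immediate":
        # Provisioned as soon as the PVC exists, before any Pod is scheduled.
        return zones[0]
    if binding_mode == "WaitForFirstConsumer":
        if pod_zone is None:
            return None  # the PVC stays Pending until a Pod uses it
        if pod_zone not in zones:
            raise ValueError(f"driver cannot reach zone {pod_zone!r}")
        return pod_zone  # the volume follows the Pod's zone
    raise ValueError(f"unknown binding mode: {binding_mode!r}")


if __name__ == "__main__":
    zones = {"zone-a", "zone-b"}
    print(choose_zone("Immediate", zones, pod_zone="zone-b"))
    print(choose_zone("WaitForFirstConsumer", zones, pod_zone="zone-b"))
```

In the Immediate case the volume can end up in a zone other than the one the Pod lands in, which is the mismatch that WaitForFirstConsumer avoids.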

CSI driver vs StorageClass

CSI driver — the software that actually creates, attaches, and mounts storage on a backend. Installed once per backend.

StorageClass — a policy object that points at a driver (provisioner) with parameters. You can have several per driver.
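
The one-driver, many-classes relationship can be sketched as follows. The provisioner name `csi.example.com`, the class names, and the `tier` parameter are all hypothetical; real parameter keys are defined by each driver, so check the driver's own documentation before reusing any of them.

```python
import json


def storage_class(name: str, provisioner: str, parameters: dict) -> dict:
    """StorageClass: a policy object that points at one CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,        # the CSI driver that does the work
        "parameters": dict(parameters),    # driver-specific, passed through as-is
        "volumeBindingMode": "WaitForFirstConsumer",
        "allowVolumeExpansion": True,
    }


# One installed driver (hypothetical name) ...
DRIVER = "csi.example.com"

# ... and two policies layered on top of it.
classes = [
    storage_class("example-fast", DRIVER, {"tier": "fast"}),
    storage_class("example-bulk", DRIVER, {"tier": "bulk"}),
]

if __name__ == "__main__":
    for sc in classes:
        print(json.dumps(sc, indent=2))
```

Deleting or adding a StorageClass changes only the policy on offer; the driver installation is untouched.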

Common Mistakes
  • Confusing the CSI driver (the software) with the StorageClass (the policy that uses it).
  • Treating a VolumeSnapshot as a backup — it usually shares the storage system's fate.
  • Ignoring a crash-looping CSI node DaemonSet, then puzzling over Pods stuck mounting volumes.
  • Running a CSI driver version incompatible with the cluster version after an upgrade.
  • Assuming snapshots and expansion are always available — they depend on driver support.
Best Practices
  • Let the managed provider ship and maintain CSI drivers where possible; treat them as cluster infrastructure.
  • Keep real backups independent of the storage system; do not rely on snapshots alone.
  • When stateful Pods can't mount, check the CSI node DaemonSet; when provisioning fails, check the controller.
  • Verify CSI driver compatibility before and after cluster upgrades to avoid mount failures.
  • Use topology-aware drivers with WaitForFirstConsumer so volumes land in the right zone.
Related
  • StorageClasses / PVCs: the user-facing layer over CSI (Topics 17-18)
  • CRI: the same out-of-tree pattern for runtimes (Topic 02)
  • etcd & backup: where real backup strategy lives (Topic 50)

Knowledge Check

What problem did moving storage drivers to CSI (out-of-tree) solve?

  • Vendors can ship and update drivers independently, so storage changes don't require Kubernetes releases
  • It made cloud block disks and SAN capacity free of charge for every cluster that adopts it
  • It removed the need for StorageClasses entirely, letting each driver provision volumes straight from the PVC spec
  • It encrypted all volumes at rest by default with no driver or StorageClass configuration

A stateful Pod is stuck unable to mount its volume. Which CSI component is the first suspect?

  • The CSI node component (DaemonSet) that handles mounting on each node
  • The StorageClass default-class annotation that decides which class an unspecified claim uses
  • The CSI controller's snapshot feature that captures point-in-time copies of the volume
  • The Pod's readiness probe gating traffic to the running container

Why is a VolumeSnapshot not a substitute for a backup?

  • It usually lives in the same storage system and shares its fate; a backup is an independent copy
  • Snapshots cannot be restored into a new volume once they have been taken
  • Snapshots are deleted every night automatically by a built-in retention sweep you cannot disable
  • Snapshots only work on emptyDir scratch volumes, never on persistent disks
