StatefulSets
A StatefulSet runs Pods that need stable identity and their own persistent storage: databases, message queues, and clustered systems where the replicas are not interchangeable. Where a Deployment's Pods are anonymous and disposable, a StatefulSet's Pods are named and ordered, and each keeps its own disk.
It is the right tool for stateful workloads and the wrong tool for almost everything else. Most applications should be stateless Deployments backed by an external or managed data store; a StatefulSet is for when the state genuinely lives in the Pods.
Stable Identity
StatefulSet Pods get stable, predictable names (db-0, db-1, db-2) that persist across rescheduling. Paired with a headless Service, each Pod also gets a stable DNS name of the form db-0.db.<namespace>.svc.cluster.local (with the default cluster domain), so db-0 is always reachable at the same name even after it restarts on a new node with a different IP. Clustered systems rely on this: a database replica needs to know it is replica 0 and find replica 1 at a fixed name.
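A minimal sketch of the headless Service such a StatefulSet pairs with. It assumes the StatefulSet is named db, its Pods carry the label app: db, and the workload is PostgreSQL listening on port 5432; the port name and number are illustrative assumptions, not requirements.

```yaml
# Headless Service for a StatefulSet named "db" (sketch; port is an assumption).
apiVersion: v1
kind: Service
metadata:
  name: db            # must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None     # "None" makes the Service headless: DNS resolves to Pod IPs directly
  selector:
    app: db           # same label the StatefulSet's Pod template sets
  ports:
    - name: postgres  # hypothetical port name
      port: 5432      # assumed: PostgreSQL's default port
```

Because the Service has no cluster IP, there is no load-balanced virtual address; clients and peers address individual Pods by their per-Pod DNS names instead.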
Per-Pod Storage
A StatefulSet uses volumeClaimTemplates to give each Pod its own PersistentVolumeClaim. db-0 gets its own disk, db-1 another, and that disk follows the Pod identity — when db-0 reschedules, it reattaches its own volume, not a fresh one. This per-replica persistence is the core of what a StatefulSet provides and what a Deployment cannot.
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db   # the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: postgres:17
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 20Gi
```
Ordered Operations
StatefulSets deploy, scale, and update in order. By default, Pods are created 0, 1, 2, and each waits for the previous one to be Running and Ready; they are removed in reverse. Updates roll one Pod at a time, from the highest ordinal down, and a partition in the update strategy lets you stage a rollout to only the highest-numbered Pods, which is useful for canarying a database upgrade. This ordering is deliberate: clustered systems often require a stable bootstrap sequence.
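As an illustration of a staged rollout, the fragment below shows how a partition might be set on a three-replica StatefulSet. It is a partial spec to merge into an existing manifest, not a complete object, and the partition value of 2 is an assumption chosen for the example.

```yaml
# Fragment of a StatefulSet spec (sketch; merge into the full manifest).
spec:
  replicas: 3
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2   # only Pods with ordinal >= 2 (here, db-2) receive the new Pod template
```

With this setting, db-0 and db-1 stay on the old revision even if they are deleted and recreated. Lowering the partition to 1 and then 0 extends the rollout to the remaining replicas once the canary looks healthy.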
When Not to Use One
A StatefulSet is not a database — it is scaffolding for running one. It does not handle replication, failover, or backups; your software or an operator must. For most teams the better answer to "we need a database" is a managed service or an operator (Topic 39), not a hand-rolled StatefulSet. And note that scaling down does not delete the PersistentVolumeClaims by default — the data is kept on purpose, which surprises people expecting cleanup.
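If retained volumes are not what you want, recent Kubernetes versions expose a retention policy on the StatefulSet. The fragment below is a sketch of that field; whether it is available and enabled depends on your cluster version, so treat it as an assumption to verify before relying on it.

```yaml
# Fragment of a StatefulSet spec (sketch; availability depends on cluster version).
spec:
  persistentVolumeClaimRetentionPolicy:
    whenScaled: Delete    # delete a Pod's PVC when a scale-down removes that Pod
    whenDeleted: Retain   # keep PVCs when the StatefulSet itself is deleted (the default)
```

Both fields default to Retain, which matches the behavior described above. Choosing Delete trades safety for cleanup, so it is worth pairing with a tested backup.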
StatefulSet — stable identity, ordered operations, per-Pod persistent storage. For stateful, clustered systems.
Deployment — anonymous, interchangeable, disposable Pods. For stateless applications — which should be most of them.
Common Mistakes
- Using a StatefulSet for a stateless app because it "feels more robust" — it just adds ordering overhead.
- Expecting the StatefulSet to handle replication, failover, or backups — that is the application's or an operator's job.
- Forgetting the required headless Service, so Pods get no stable DNS identity.
- Assuming scale-down deletes the PVCs; by default the data volumes are retained.
- Ignoring the ordered rollout during operations and being surprised when Pods update one at a time.
Best Practices
- Reserve StatefulSets for workloads that truly need stable identity and per-Pod storage.
- Prefer a managed database or a mature operator over a hand-built StatefulSet for production data.
- Always pair a StatefulSet with its headless Service for stable per-Pod DNS.
- Plan backups explicitly — the StatefulSet keeps the disks, not your recovery story.
- Use update partitions to canary changes to a clustered system one replica at a time.
Knowledge Check
What does a StatefulSet provide that a Deployment does not?
- Stable per-Pod identity and persistent per-Pod storage that follows each Pod
- Automatic built-in database replication and failover handling between its replicas
- Faster rolling updates that replace all Pods at once
- Guaranteed placement of exactly one Pod per node
What happens to the PersistentVolumeClaims when you scale a StatefulSet down?
- They are retained by default — the data is kept, not deleted
- They are deleted immediately to free up the underlying storage
- They are converted into ConfigMaps holding the same data
- They are relocated into the cluster's default namespace
Why is a hand-rolled StatefulSet usually not the best way to run a production database?
- It provides identity and storage but not replication, failover, or backups — a managed service or operator does
- StatefulSets cannot mount persistent volumes for their Pods
- StatefulSets are hard-capped at a single replica, so a database can never run more than one Pod for high availability
- Databases must always be deployed as a DaemonSet instead