Topic 13

DaemonSets

Per-node Agents

A DaemonSet runs exactly one copy of a Pod on every node (or every node matching a selector). It is the controller for node-level agents — log collectors, monitoring exporters, the CNI network plugin, storage daemons — anything that must be present on each machine.

Where a Deployment thinks in terms of "how many replicas," a DaemonSet thinks in terms of "which nodes." As nodes join and leave the cluster, the DaemonSet adds and removes Pods to keep its one-per-node guarantee.

The Per-Node Guarantee

A DaemonSet's controller watches the set of nodes and ensures its Pod runs on each eligible one. Add a node and a Pod appears on it automatically; remove a node from the cluster and its DaemonSet Pod goes with it. (A plain kubectl drain does not evict DaemonSet Pods, which is why it asks for --ignore-daemonsets.) You never set a replica count; the count is "however many eligible nodes there are." This is exactly the model node-level infrastructure needs.

A DaemonSet for a log shipper

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      tolerations:
        - operator: Exists          # run on tainted nodes too
      containers:
        - name: agent
          image: log-agent:1.0

Selecting Which Nodes

By default a DaemonSet targets all nodes, but you can scope it with a nodeSelector or node affinity, for example to run a GPU monitoring agent only on GPU nodes. Crucially, control-plane and specialized nodes often carry taints that repel ordinary Pods; a DaemonSet that must run there needs matching tolerations. A missing toleration is the usual reason an agent silently skips some nodes.
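As a rough illustration of scoping, the sketch below restricts a monitoring agent to GPU nodes. The DaemonSet name, the accelerator label, the nvidia.com/gpu taint key, and the image are all assumptions made for the example; substitute whatever labels and taints your nodes actually carry.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: gpu-monitor                 # hypothetical name
spec:
  selector:
    matchLabels:
      app: gpu-monitor
  template:
    metadata:
      labels:
        app: gpu-monitor
    spec:
      nodeSelector:
        accelerator: nvidia-gpu     # assumed node label; only matching nodes get a Pod
      tolerations:
        - key: nvidia.com/gpu       # assumed taint key on the GPU nodes
          operator: Exists
          effect: NoSchedule
      containers:
        - name: monitor
          image: gpu-monitor:1.0    # placeholder image
```

An agent that must also run on control-plane nodes would tolerate their taint instead; on kubeadm-style clusters that is typically the key node-role.kubernetes.io/control-plane with effect NoSchedule.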

Update Strategy

DaemonSets support two update strategies. RollingUpdate (the default) replaces Pods node by node, bounded by maxUnavailable (1 unless you change it), so the agent is briefly absent on only a few nodes at a time. OnDelete updates a node's Pod only when you delete it manually, which is useful for agents where you want tight control over when each node is touched.
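A minimal sketch of how the strategy is declared, reusing the log-agent example from earlier. The maxUnavailable value here is illustrative, not a recommendation.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  updateStrategy:
    type: RollingUpdate             # set to OnDelete (and drop rollingUpdate) for manual control
    rollingUpdate:
      maxUnavailable: 1             # an integer or a percentage such as "10%"
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
        - name: agent
          image: log-agent:1.0
```

With OnDelete, changing the template has no effect on a node until its existing Pod is deleted, at which point the controller recreates it from the new template.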

When to Use One

Reach for a DaemonSet whenever the workload is about the node, not the application: log and metrics agents, security and compliance daemons, the CNI plugin, and node-local storage providers. If you find yourself writing a Deployment with Pod anti-affinity to spread one Pod per node, that is a DaemonSet trying to be born. Conversely, do not use a DaemonSet for application replicas; that is a Deployment's job.
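For contrast, here is roughly what the Deployment workaround tends to look like. It is shown as an anti-pattern rather than something to copy; the names are hypothetical and the replica count assumes a three-node cluster.

```yaml
# Anti-pattern: a Deployment imitating one Pod per node.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-agent-faked             # hypothetical name
spec:
  replicas: 3                       # must be edited by hand whenever the node count changes
  selector:
    matchLabels:
      app: log-agent-faked
  template:
    metadata:
      labels:
        app: log-agent-faked
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: log-agent-faked
              topologyKey: kubernetes.io/hostname   # no two of these Pods on one node
      containers:
        - name: agent
          image: log-agent:1.0
```

The anti-affinity rule prevents doubling up on a node, but nothing ties replicas to the number of nodes, so new nodes stay uncovered until someone raises the count. A DaemonSet removes that bookkeeping entirely.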

DaemonSet vs Deployment

DaemonSet — one Pod per node, count tied to the node set; for node-level agents.

Deployment — a fixed or autoscaled number of interchangeable replicas placed anywhere; for applications.

Common Mistakes

Faking a DaemonSet with a Deployment plus Pod anti-affinity instead of using the right controller.
Omitting tolerations, so the agent silently skips tainted control-plane or specialized nodes.
Running a heavy, resource-hungry DaemonSet that taxes every node in the cluster.
Using a DaemonSet for application workloads that should scale by load, not by node count.
Ignoring the rollout strategy, so a DaemonSet update briefly removes a critical agent from many nodes at once.

Best Practices

Use DaemonSets for genuinely node-level concerns: logging, metrics, networking, security, node storage.
Add the tolerations needed to cover control-plane and tainted nodes when the agent must run everywhere.
Keep DaemonSet Pods lean — they multiply by every node, so their footprint matters.
Set requests and limits so a per-node agent cannot starve the workloads sharing its node.
Tune maxUnavailable on rollout so a critical agent is never gone from too many nodes at once.
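Several of these practices can be combined in one manifest. The sketch below extends the earlier log-agent example; the CPU and memory figures are placeholder values chosen for illustration and would need to be measured for a real agent.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1             # keep the agent missing from at most one node at a time
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      tolerations:
        - operator: Exists          # tolerate every taint so the agent runs everywhere
      containers:
        - name: agent
          image: log-agent:1.0
          resources:
            requests:
              cpu: 50m              # illustrative; this cost is paid on every node
              memory: 64Mi
            limits:
              memory: 128Mi         # cap memory so the agent cannot starve its neighbors
```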

Related

Deployment — for application replicas rather than per-node agents
Taints and tolerations — how DaemonSets reach control-plane nodes (Topic 26)
Managed node agents — cloud-provided logging/monitoring/CNI add-ons

Knowledge Check

How many replicas does a DaemonSet run?

One per eligible node — the count tracks the node set, not a fixed number
Exactly whatever number its replicas field happens to be explicitly set to
Always three Pods, spread across zones for high availability
Exactly one Pod for each namespace in the cluster

A DaemonSet agent is missing from the control-plane nodes. What is the likely fix?

Add tolerations so the Pod can run on the tainted control-plane nodes
Increase the DaemonSet replica count so it covers the control-plane nodes
Change the fronting Service type from ClusterIP to NodePort
Move the DaemonSet into the kube-system namespace

Which workload is the right fit for a DaemonSet?

A log-collection agent that must run on every node
A stateless web API that scales with traffic
A nightly batch report that runs once on a fixed schedule
A clustered database needing stable identity and per-replica storage
