Cluster Cost Management
Topic 53

A Kubernetes bill is mostly nodes you are paying for whether or not your Pods use them. Cost management is the practice of keeping spend proportional to value — right-sizing requests, packing workloads densely, using cheaper capacity for the right work, and attributing cost so waste is visible.

The levers are mostly things covered elsewhere, applied with a cost lens. The recurring theme: idle, over-requested, and unattributable capacity is where the money leaks.

Where the Money Goes

The dominant cost is compute (the nodes), and the biggest waste is the gap between what Pods request (and thus reserve) and what they actually use. Because the scheduler reserves capacity based on requests (Topic 25), an over-requested cluster runs half-empty nodes that you pay for in full. Other real costs hide in storage (volumes that are provisioned but unused) and especially in network egress and cross-zone traffic, which are easy to ignore until the bill arrives.
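To make the request-versus-usage gap concrete, the sketch below splits one node's hourly price into three buckets: capacity that is actually used, capacity that is requested but idle, and capacity nobody has requested. The node size, price, and usage figures are hypothetical, and the example looks at CPU only; a real analysis would also consider memory and system-reserved capacity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_cores: float        # allocatable CPU on the node
    hourly_cost: float      # hypothetical price, USD per hour
    requested_cores: float  # sum of Pod CPU requests scheduled on the node
    used_cores: float       # measured average CPU usage


def cost_breakdown(node: Node) -> dict:
    """Split a node's hourly cost into used, requested-but-idle, and unrequested."""
    per_core = node.hourly_cost / node.cpu_cores
    return {
        "used": per_core * node.used_cores,
        "requested_but_idle": per_core * (node.requested_cores - node.used_cores),
        "unrequested": per_core * (node.cpu_cores - node.requested_cores),
    }


# Hypothetical node: nearly full by requests, mostly idle by usage.
node = Node("node-a", cpu_cores=8, hourly_cost=0.40,
            requested_cores=7.0, used_cores=2.0)
breakdown = cost_breakdown(node)
for bucket, cost in breakdown.items():
    print(f"{bucket}: ${cost:.3f}/h")
```

In this made-up case the scheduler sees the node as almost full, yet most of the hourly price falls in the requested-but-idle bucket, which is the waste that right-sizing targets.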

Right-Sizing and Bin-Packing

The first lever is right-sizing: set requests close to measured real usage so that reservations reflect reality and the idle headroom is recovered. The VPA's recommendations (Topic 28) help find the numbers. The second is bin-packing: letting the scheduler and Cluster Autoscaler consolidate Pods onto fewer, fuller nodes and remove the empties. Workloads must be movable (PodDisruptionBudgets that still permit evictions, no node-local storage) for consolidation to actually reclaim nodes.
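The two levers compound, which a small simulation can show. The sketch below derives a CPU request from usage samples (a high percentile plus a margin) and then counts how many nodes a set of Pods needs, using first-fit decreasing as a stand-in for consolidation. Both functions are illustrative assumptions: this is not the VPA's recommender or the real scheduler's algorithm, and the samples, percentile, margin, and node size are hypothetical.

```python
def recommend_request(samples_m, percentile_pct=95, headroom_pct=15):
    """Suggest a CPU request (millicores): a usage percentile plus a margin."""
    ordered = sorted(samples_m)
    # Nearest-rank percentile, computed with integer arithmetic.
    rank = max(1, (percentile_pct * len(ordered) + 99) // 100)
    base = ordered[rank - 1]
    # Round up after adding the safety margin.
    return (base * (100 + headroom_pct) + 99) // 100


def nodes_needed(requests_m, node_capacity_m=4000):
    """First-fit decreasing: place each Pod on the first node with room."""
    free = []  # remaining millicores on each node opened so far
    for req in sorted(requests_m, reverse=True):
        if req > node_capacity_m:
            raise ValueError("request exceeds node capacity")
        for i, room in enumerate(free):
            if req <= room:
                free[i] -= req
                break
        else:
            free.append(node_capacity_m - req)  # open a new node
    return len(free)


# Hypothetical usage samples (millicores) for a Pod that requests 1000m "to be safe".
samples = [180, 200, 220, 250, 210, 190, 230, 240, 205, 260]
right_sized = recommend_request(samples)

before = nodes_needed([1000] * 12)         # 12 replicas at the original request
after = nodes_needed([right_sized] * 12)   # same replicas, right-sized
print(f"request: 1000m -> {right_sized}m, nodes: {before} -> {after}")
```

The node count only drops in practice if the autoscaler is allowed to drain and remove the emptied nodes, which is why the movability condition above matters.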

Cheaper Capacity

Not all work needs on-demand, full-price nodes. Spot/preemptible nodes cost a fraction but can be reclaimed with little notice — ideal for fault-tolerant, restartable workloads (batch, stateless web behind enough replicas), wrong for stateful singletons. Reserved/committed capacity discounts the steady baseline you will run regardless. The pattern is a committed baseline plus on-demand for variability plus spot for the interruptible portion.

Capacity             Cost         Use for
On-demand            Full price   Baseline that must not be interrupted
Reserved/committed   Discounted   Predictable steady-state load
Spot/preemptible     Cheapest     Fault-tolerant, restartable workloads
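The committed-plus-on-demand-plus-spot pattern can be priced with simple arithmetic. The sketch below compares an all-on-demand fleet with a mixed one over a day of hourly demand. Every number is a hypothetical assumption: the relative prices, the demand profile, and the share of above-baseline work that tolerates interruption. Real discounts vary by provider, region, and commitment term.

```python
# Hypothetical relative prices per node-hour (on-demand normalized to 1.00).
PRICES = {"on_demand": 1.00, "committed": 0.60, "spot": 0.30}


def mixed_hourly_cost(demand, committed, interruptible_share, prices=PRICES):
    """Cost of serving `demand` nodes for one hour with a mixed fleet.

    Committed nodes are paid for whether or not they are used. Demand above
    the committed baseline is split between spot (the interruptible share)
    and on-demand (the rest).
    """
    above = max(0, demand - committed)
    spot = above * interruptible_share
    on_demand = above - spot
    return (committed * prices["committed"]
            + on_demand * prices["on_demand"]
            + spot * prices["spot"])


# Hypothetical day: 8 quiet hours, 12 busy hours, 4 moderate hours.
demand_profile = [10] * 8 + [20] * 12 + [14] * 4

all_on_demand = sum(d * PRICES["on_demand"] for d in demand_profile)
mixed = sum(mixed_hourly_cost(d, committed=10, interruptible_share=0.5)
            for d in demand_profile)
print(f"all on-demand: {all_on_demand:.1f}, mixed: {mixed:.1f}")
```

Setting the committed baseline too high reverses the saving, because committed capacity is billed even in hours when demand falls below it; trying a few values of `committed` against a real demand profile is a quick way to see where that break-even sits.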

Attribution and Autoscaling

You cannot manage what you cannot see. Cost attribution — labeling workloads by team/product and using tools like OpenCost/Kubecost to allocate node cost back to namespaces and labels — turns an opaque bill into accountable line items, which is what actually drives teams to clean up waste (showback/chargeback). Combined with autoscaling that scales to demand (and to zero where possible) and the levers above, cost tracks value. The anti-goal worth naming: never cut reliability — replicas, PDBs, headroom — to shave cost, because an outage costs more than the nodes.
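The core idea of attribution can be sketched as a proportional split: each Pod is charged for the share of the node it reserves, grouped by a label, with the unreserved remainder reported as idle. This is a simplified illustration and not how OpenCost or Kubecost compute allocations; the `team` label key, the Pod records, and the price are all hypothetical.

```python
from collections import defaultdict


def allocate_node_cost(node_hourly_cost, node_cpu_m, pods, label_key="team"):
    """Split a node's hourly cost across owners by CPU request (millicores).

    Pods missing the label are grouped under "unlabeled"; capacity nobody
    requested is reported under "idle" so it stays visible.
    """
    per_millicore = node_hourly_cost / node_cpu_m
    costs = defaultdict(float)
    requested = 0
    for pod in pods:
        owner = pod["labels"].get(label_key, "unlabeled")
        costs[owner] += pod["cpu_request_m"] * per_millicore
        requested += pod["cpu_request_m"]
    costs["idle"] = (node_cpu_m - requested) * per_millicore
    return dict(costs)


# Hypothetical Pods on one 8-core node priced at 0.40 per hour.
pods = [
    {"labels": {"team": "checkout"}, "cpu_request_m": 2000},
    {"labels": {"team": "checkout"}, "cpu_request_m": 1000},
    {"labels": {"team": "search"}, "cpu_request_m": 3000},
    {"labels": {}, "cpu_request_m": 500},  # no owner label
]
allocation = allocate_node_cost(0.40, 8000, pods)
for owner, cost in sorted(allocation.items()):
    print(f"{owner}: ${cost:.3f}/h")
```

The "unlabeled" and "idle" lines are the useful part of the report: they are the spend that no team currently owns, and a consistent labeling policy is what shrinks the first of them.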

On-demand vs reserved vs spot

On-demand — full price, no commitment, not interrupted. For baseline that must stay up.

Reserved/committed — discounted for a commitment. For predictable steady load.

Spot/preemptible — cheapest, reclaimable on short notice. For fault-tolerant, restartable work.

Common Mistakes
  • Over-requesting "to be safe," so the cluster reserves far more than it uses and runs half-empty.
  • Running fault-tolerant batch on full-price on-demand nodes instead of spot.
  • Ignoring egress and cross-zone traffic costs until the bill reveals them.
  • No cost attribution, so waste is invisible and no team is accountable.
  • Cutting replicas, PDBs, or headroom to save money and causing an outage that costs more.

Best Practices
  • Right-size requests from measured usage; use VPA recommendations to find the numbers.
  • Enable bin-packing and node consolidation, and keep workloads movable so empties can be removed.
  • Match capacity to work: committed baseline, on-demand for variability, spot for the interruptible.
  • Attribute cost by team/product (OpenCost/Kubecost) so waste is visible and owned.
  • Scale to demand (and to zero where possible), but never trade away reliability for cost.

Related
  • Requests and limits: over-requesting is the main waste (Topic 25)
  • Autoscalers: bin-packing and scale-to-demand (Topics 27-28)
  • Cloud cost tools: Cost Explorer/Billing + OpenCost/Kubecost

Knowledge Check

What is the biggest source of wasted compute spend in a cluster?

  • The gap between what Pods request (and reserve) and what they actually use
  • The CPU and memory the control plane itself consumes on the dedicated master nodes
  • Having too many namespaces and labels in the cluster
  • The storage cost of periodic etcd snapshots

Which capacity type fits fault-tolerant, restartable batch work?

  • Spot or preemptible VM nodes
  • Full-price on-demand nodes
  • Reserved nodes on a one-year commitment
  • Dedicated control-plane nodes

Why does cost attribution matter?

  • Allocating cost to teams/labels makes waste visible and gives someone accountability to fix it
  • It directly negotiates a lower per-node price with the provider
  • It is a hard prerequisite without which the scheduler cannot place any Pod onto a worker node at all
  • It replaces the need to run the cluster autoscaler
