Topic 53

Cluster Cost Management

Cost, Efficiency

A Kubernetes bill is mostly nodes you are paying for whether or not your Pods use them. Cost management is the practice of keeping spend proportional to value — right-sizing requests, packing workloads densely, using cheaper capacity for the right work, and attributing cost so waste is visible.

The levers are mostly things covered elsewhere, applied with a cost lens. The recurring theme: idle, over-requested, and unattributable capacity is where the money leaks.

Where the Money Goes

The dominant cost is compute (the nodes), and the biggest waste is the gap between what Pods request (and thus reserve) and what they actually use. Because the scheduler places Pods based on requests rather than live usage (Topic 25), an over-requested cluster runs half-empty nodes that you pay for in full. Other real costs hide in storage (provisioned but unused volumes) and especially in network egress and cross-zone traffic, which are easy to ignore until the bill arrives.
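The request-versus-usage gap can be put into numbers. The following is a minimal sketch, assuming you already have per-Pod CPU requests and a measured average usage from your metrics system; the `PodSample` type, the sample values, and the price per core-hour are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PodSample:
    name: str
    cpu_request: float  # cores reserved by the scheduler
    cpu_used: float     # cores actually consumed (a measured average)


def idle_reserved_cost(pods, price_per_core_hour):
    """Return (idle_cores, hourly_cost) for capacity that is requested but unused."""
    # A Pod using more than it requests contributes no idle reservation.
    idle = sum(max(p.cpu_request - p.cpu_used, 0.0) for p in pods)
    return idle, idle * price_per_core_hour


# Hypothetical workloads and a hypothetical price of 0.04 per core-hour.
pods = [
    PodSample("api", cpu_request=2.0, cpu_used=0.5),
    PodSample("worker", cpu_request=4.0, cpu_used=3.0),
    PodSample("cron", cpu_request=1.0, cpu_used=0.25),
]
idle_cores, hourly_cost = idle_reserved_cost(pods, price_per_core_hour=0.04)
print(f"idle reserved cores: {idle_cores:.2f}, cost per hour: {hourly_cost:.3f}")
```

The same calculation applies to memory. Which resource dominates the bill depends on the node shape, so it is worth running for both.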

Right-Sizing and Bin-Packing

The first lever is right-sizing: set requests close to measured real usage so reservations reflect reality, recovering the idle headroom. The VPA's recommendations (Topic 28) help find the numbers. The second is bin-packing: letting the scheduler and Cluster Autoscaler consolidate Pods onto fewer, fuller nodes and remove the empties. Consolidation only reclaims a node if every Pod on it can be evicted and rescheduled elsewhere, so workloads must be movable: enough replicas, PDBs that allow at least one disruption, and no node-local storage.
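To see how the two levers interact, here is a small sketch under stated assumptions. `recommend_request` takes a high percentile of measured usage plus headroom, which is a simple heuristic and not the VPA's actual recommender. `nodes_needed` uses first-fit-decreasing packing as a simplified stand-in for what the scheduler and Cluster Autoscaler achieve together. All values are hypothetical and in millicores.

```python
import math


def recommend_request(samples_millicores, percentile=0.95, headroom=1.25):
    """Suggest a CPU request: a high percentile of measured usage plus headroom."""
    ordered = sorted(samples_millicores)
    # Nearest-rank percentile: the smallest sample covering `percentile` of the data.
    rank = max(math.ceil(percentile * len(ordered)), 1)
    return math.ceil(ordered[rank - 1] * headroom)


def nodes_needed(requests_millicores, node_capacity_millicores):
    """Count the nodes a first-fit-decreasing packing needs for these requests."""
    free = []  # remaining capacity on each node opened so far
    for request in sorted(requests_millicores, reverse=True):
        if request > node_capacity_millicores:
            raise ValueError(f"request {request}m does not fit on any node")
        for i, remaining in enumerate(free):
            if request <= remaining:
                free[i] = remaining - request
                break
        else:
            # No existing node has room, so open a new one.
            free.append(node_capacity_millicores - request)
    return len(free)


# Hypothetical: six replicas each requesting 2000m while using at most 800m.
usage = [400, 500, 600, 650, 700, 700, 750, 750, 800, 800]
right_sized = recommend_request(usage)
before = nodes_needed([2000] * 6, node_capacity_millicores=4000)
after = nodes_needed([right_sized] * 6, node_capacity_millicores=4000)
print(f"request {right_sized}m: {before} nodes before, {after} nodes after")
```

In this example, halving the request lets the same six replicas fit on fewer nodes. The saving only materializes once the autoscaler can actually drain and remove the emptied node.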

Cheaper Capacity

Not all work needs on-demand, full-price nodes. Spot/preemptible nodes cost a fraction but can be reclaimed with little notice — ideal for fault-tolerant, restartable workloads (batch, stateless web behind enough replicas), wrong for stateful singletons. Reserved/committed capacity discounts the steady baseline you will run regardless. The pattern is a committed baseline plus on-demand for variability plus spot for the interruptible portion.

Capacity	Cost	Use for
On-demand	Full price	Variable load that must not be interrupted
Reserved/committed	Discounted	Predictable steady-state load
Spot/preemptible	Cheapest	Fault-tolerant, restartable workloads
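The effect of mixing capacity types can be estimated with simple arithmetic. The sketch below assumes hypothetical relative prices (on-demand = 1.00, committed = 0.60, spot = 0.30); real discounts vary by provider, region, and instance type. `blended_hourly_cost` is an illustrative helper, not a provider API.

```python
def blended_hourly_cost(demand_nodes, committed_nodes, spot_fraction, prices):
    """Hourly cost of a committed baseline plus on-demand and spot for the rest.

    spot_fraction is the share of demand above the baseline that runs on spot.
    """
    if not 0.0 <= spot_fraction <= 1.0:
        raise ValueError("spot_fraction must be between 0 and 1")
    # Committed capacity is billed whether or not demand reaches it.
    committed_cost = committed_nodes * prices["committed"]
    overflow = max(demand_nodes - committed_nodes, 0)
    spot_nodes = overflow * spot_fraction
    on_demand_nodes = overflow - spot_nodes
    return (
        committed_cost
        + on_demand_nodes * prices["on_demand"]
        + spot_nodes * prices["spot"]
    )


# Hypothetical relative prices; substitute your provider's real rates.
PRICES = {"on_demand": 1.00, "committed": 0.60, "spot": 0.30}

all_on_demand = blended_hourly_cost(10, 0, 0.0, PRICES)
mixed = blended_hourly_cost(10, 6, 0.5, PRICES)
print(f"all on-demand: {all_on_demand:.2f}, mixed: {mixed:.2f}")
```

Note the first comment in the function: a commitment larger than real demand is itself waste, so the baseline should be sized to the load you will run regardless.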

Attribution and Autoscaling

You cannot manage what you cannot see. Cost attribution — labeling workloads by team/product and using tools like OpenCost/Kubecost to allocate node cost back to namespaces and labels — turns an opaque bill into accountable line items, which is what actually drives teams to clean up waste (showback/chargeback). Combined with autoscaling that scales to demand (and to zero where possible) and the levers above, cost tracks value. The anti-goal worth naming: never cut reliability — replicas, PDBs, headroom — to shave cost, because an outage costs more than the nodes.
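The core idea of attribution can be sketched in a few lines. This is a deliberately simplified model that splits one node's hourly cost across team labels in proportion to CPU requests and reports unrequested capacity as its own line item. It is not the allocation algorithm of OpenCost or Kubecost, and the team names and prices are hypothetical.

```python
from collections import defaultdict


def allocate_node_cost(node_hourly_cost, node_cpu_cores, pods):
    """Split a node's hourly cost across team labels by CPU request.

    pods is a list of (team_label, cpu_request_cores) tuples.
    Capacity nobody requested is reported under the "__idle__" key.
    """
    cost_per_core = node_hourly_cost / node_cpu_cores
    allocation = defaultdict(float)
    for team, cpu_request in pods:
        allocation[team] += cpu_request * cost_per_core
    requested = sum(cpu_request for _, cpu_request in pods)
    allocation["__idle__"] = max(node_cpu_cores - requested, 0.0) * cost_per_core
    return dict(allocation)


# Hypothetical 8-core node at 0.40 per hour, running Pods from two teams.
pods_on_node = [("checkout", 2.0), ("search", 3.0), ("checkout", 1.0)]
for team, cost in allocate_node_cost(0.40, 8.0, pods_on_node).items():
    print(f"{team}: {cost:.3f} per hour")
```

Keeping idle cost as a separate line, rather than spreading it across teams, makes unreserved capacity visible as a platform-level problem instead of hiding it in every team's number.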

On-demand vs reserved vs spot

On-demand: full price, no commitment, not interrupted. For variable load that must stay up.

Reserved/committed: discounted in exchange for a commitment. For the predictable, steady baseline.

Spot/preemptible: cheapest, reclaimable on short notice. For fault-tolerant, restartable work.

Common Mistakes

Over-requesting "to be safe," so the cluster reserves far more than it uses and runs half-empty.
Running fault-tolerant batch on full-price on-demand nodes instead of spot.
Ignoring egress and cross-zone traffic costs until the bill reveals them.
No cost attribution, so waste is invisible and no team is accountable.
Cutting replicas, PDBs, or headroom to save money and causing an outage that costs more.

Best Practices

Right-size requests from measured usage; use VPA recommendations to find the numbers.
Enable bin-packing and node consolidation, and keep workloads movable so empties can be removed.
Match capacity to work: committed baseline, on-demand for variability, spot for the interruptible.
Attribute cost by team/product (OpenCost/Kubecost) so waste is visible and owned.
Scale to demand (and to zero where possible), but never trade away reliability for cost.

Related

Requests and limits: over-requesting is the main waste (Topic 25)
Autoscalers: bin-packing and scale-to-demand (Topics 27-28)
Cloud cost tools: Cost Explorer/Billing + OpenCost/Kubecost

Knowledge Check

What is the biggest source of wasted compute spend in a cluster?

The gap between what Pods request (and reserve) and what they actually use
The CPU and memory the control plane itself consumes on the dedicated master nodes
Having too many namespaces and labels in the cluster
The storage cost of periodic etcd snapshots

Which capacity type fits fault-tolerant, restartable batch work?

Spot or preemptible VM nodes
Full-price on-demand nodes
Reserved nodes on a one-year commitment
Dedicated control-plane nodes

Why does cost attribution matter?

Allocating cost to teams/labels makes waste visible and gives someone accountability to fix it
It directly negotiates a lower per-node price with the provider
It is a hard prerequisite without which the scheduler cannot place any Pod onto a worker node at all
It replaces the need to run the cluster autoscaler
