Azure Kubernetes Service

Azure Kubernetes Service is managed Kubernetes: Azure runs the control plane — the API server, scheduler, and etcd — and you run the worker nodes and the workloads on them. You get the full Kubernetes API and its ecosystem of operators, Helm charts, and tooling, without operating the control plane yourself.

AKS is the right choice when you genuinely need Kubernetes — its API for portability, its operator ecosystem, fine-grained control over scheduling and networking. It is the wrong choice when you reached for it out of habit: the operational weight of running clusters, upgrades, and add-ons is real, and Container Apps delivers most of the value for containerized services without it.

Control Plane and Tiers

Azure manages the control plane, and the pricing tier sets the guarantee on it. The Free tier has no uptime SLA and suits dev and test. The Standard tier adds a financially backed control-plane SLA (99.95% with availability zones, 99.9% without) for a per-cluster hourly fee, and is the production baseline. The Premium tier builds on Standard and adds long-term support, two years per Kubernetes version, for teams that cannot upgrade within the community support window of roughly one year per version.
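
To make the two SLA figures concrete, the sketch below converts an uptime percentage into the downtime it permits. The 30-day month and the function name are assumptions made for illustration; how Azure actually measures SLA periods and calculates credits is defined in its SLA terms, not here.

```python
def allowed_downtime_minutes(sla_percent: float, period_hours: float) -> float:
    """Minutes of downtime an uptime SLA permits over a period before it is breached."""
    return (1 - sla_percent / 100) * period_hours * 60


MONTH_HOURS = 30 * 24  # assumption: a 30-day month

for label, sla in [("with availability zones", 99.95),
                   ("without availability zones", 99.9)]:
    minutes = allowed_downtime_minutes(sla, MONTH_HOURS)
    print(f"{sla}% ({label}): about {minutes:.1f} minutes per month")
```

Under these assumptions the gap between the two figures is roughly 22 minutes versus 43 minutes of control-plane downtime per month.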

Figure: In AKS, Azure Runs the Control Plane and You Run the Nodes
Control plane (Azure-managed): Azure runs and scales the API server, scheduler, and etcd, and backs the SLA on Standard and Premium. You do not see or patch it.
Node pools (yours): You own the worker VMs, a system pool for cluster services and user pools for workloads, along with their sizing, patching, and the cluster autoscaler.

Node Pools

Worker nodes are grouped into node pools, each a scale set of a chosen VM size. A system node pool runs cluster services like CoreDNS; user node pools run your workloads. Separating pools by workload — a memory-optimized pool for caches, a GPU pool for inference, a Spot pool for fault-tolerant batch — lets you size and scale each independently.

The cluster autoscaler adds and removes nodes in a pool as pending pods require, so you are not paying for idle nodes or starving pods of capacity. It works alongside the Horizontal Pod Autoscaler, which scales pod replicas; the two operate at different layers.
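
The node-level decision can be illustrated with a deliberately simplified sketch: given the CPU requests of pending pods, how many new nodes are needed once free capacity on existing nodes is used up? This is a first-fit-decreasing packing on CPU alone, not the real cluster autoscaler, which simulates the scheduler and also accounts for memory, taints, affinity, and pool limits. The function name and the millicore figures are hypothetical.

```python
def nodes_to_add(pending_cpu_millicores, free_cpu_on_existing_nodes, new_node_cpu):
    """Estimate how many new nodes are needed to place all pending pods.

    Simplified model: only CPU requests are considered, and pods are placed
    largest-first onto the first node with enough free capacity.
    """
    free = list(free_cpu_on_existing_nodes)  # copy so the caller's list is untouched
    added = 0
    for request in sorted(pending_cpu_millicores, reverse=True):
        if request > new_node_cpu:
            raise ValueError(f"pod requesting {request}m cannot fit on any node in this pool")
        for i, capacity in enumerate(free):
            if capacity >= request:
                free[i] -= request  # pod fits on an existing or already-added node
                break
        else:
            free.append(new_node_cpu - request)  # no room anywhere: scale the pool out
            added += 1
    return added


# Four pending pods, one existing node with 600m free, new nodes offer 2000m each.
print(nodes_to_add([1500, 1000, 1000, 500], [600], 2000))
```

The Horizontal Pod Autoscaler decides how many pods should exist; a calculation like this one only starts once some of those pods cannot be scheduled.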

Networking

The network plugin shapes IP usage and is best decided early. Azure CNI Overlay — the recommended default for most clusters — gives pods IPs from a private overlay CIDR while only the nodes take VNet addresses, so a cluster scales to large pod counts without exhausting the subnet. Traditional Azure CNI assigns every pod a routable VNet IP, which integrates cleanly with Azure networking but consumes address space fast; the older kubenet model conserves IPs at the cost of routing complexity. Azure CNI Powered by Cilium is the recommended data plane for new clusters. Plan pod and node address space up front — you can update the CNI IPAM mode and data plane, but a badly sized traditional-CNI subnet is painful to undo.
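
The address-space difference is easy to check with arithmetic. The sketch below assumes the commonly cited sizing rule for traditional Azure CNI (one IP per node plus one per potential pod, with one extra node of headroom for upgrade surge) and Azure's five reserved addresses per subnet. The node counts, pod limits, and CIDRs are hypothetical, so verify the rule against current AKS documentation before sizing a real subnet.

```python
import ipaddress

AZURE_RESERVED_IPS = 5  # Azure reserves five addresses in every subnet


def traditional_cni_ips_needed(nodes: int, max_pods_per_node: int, surge_nodes: int = 1) -> int:
    """VNet IPs needed when every node and every potential pod takes a subnet address."""
    total_nodes = nodes + surge_nodes
    return total_nodes + total_nodes * max_pods_per_node


def overlay_ips_needed(nodes: int, surge_nodes: int = 1) -> int:
    """VNet IPs needed with Overlay: only nodes take subnet addresses."""
    return nodes + surge_nodes


def subnet_fits(cidr: str, ips_needed: int) -> bool:
    """True if the subnet has enough usable addresses after Azure's reservations."""
    usable = ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_IPS
    return usable >= ips_needed


needed = traditional_cni_ips_needed(nodes=50, max_pods_per_node=30)
print(needed)                                # 1581
print(subnet_fits("10.240.0.0/22", needed))  # False: 1019 usable addresses
print(subnet_fits("10.240.0.0/21", needed))  # True: 2043 usable addresses
print(subnet_fits("10.240.0.0/26", overlay_ips_needed(nodes=50)))  # True
```

With these numbers a 50-node cluster needs a /21 under traditional Azure CNI, while the same node count fits in a /26 under Overlay because pod addresses come from the separate overlay CIDR.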

Identity and Security

Workload Identity federates a Kubernetes service account with an Entra ID identity, so pods call Azure services with a token and no stored secret. This replaces the old pattern of mounting service-principal credential files, which are long-lived and leak. Azure RBAC for Kubernetes maps Entra identities to cluster permissions, unifying access control with the rest of Azure.
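
On the Kubernetes side, the federation is wired up with one annotation and one label. The sketch below builds the two manifests as plain Python dictionaries to show where they go. The namespace, names, image, and client ID are placeholders, and the matching federated credential on the Entra identity (whose subject is shown by `federated_subject`) still has to be created separately.

```python
import json


def federated_subject(namespace: str, service_account: str) -> str:
    """Subject string the Entra federated credential must match for this service account."""
    return f"system:serviceaccount:{namespace}:{service_account}"


def service_account_manifest(namespace: str, name: str, client_id: str) -> dict:
    """ServiceAccount annotated with the client ID of the Entra identity it federates with."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"azure.workload.identity/client-id": client_id},
        },
    }


def pod_manifest(namespace: str, name: str, service_account: str, image: str) -> dict:
    """Pod that opts in to Workload Identity and runs under the federated service account."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "labels": {"azure.workload.identity/use": "true"},
        },
        "spec": {
            "serviceAccountName": service_account,
            "containers": [{"name": "app", "image": image}],
        },
    }


# Placeholder values for illustration only.
CLIENT_ID = "00000000-0000-0000-0000-000000000000"
sa = service_account_manifest("payments", "orders-api", CLIENT_ID)
pod = pod_manifest("payments", "orders-api-0", "orders-api", "example.azurecr.io/orders-api:1.0")
print(json.dumps([sa, pod], indent=2))
print(federated_subject("payments", "orders-api"))
```

Note that neither manifest contains a secret: the pod receives a projected service-account token at runtime, which is exchanged for an Entra token.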

Scaling

AKS scales at three layers: the Horizontal Pod Autoscaler scales pod replicas on metrics, the cluster autoscaler scales nodes to fit the pods, and KEDA scales workloads on external event sources like queue depth. Node auto-provisioning goes a step further and picks VM sizes for new nodes from the pending pods' resource requests. Matching the scaling mechanism to the signal is the difference between a responsive cluster and an over-provisioned one.
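
The pod-level calculations can be sketched in a few lines. The first function follows the replica formula documented for the Kubernetes Horizontal Pod Autoscaler, including its default 10% tolerance. The second is a simplified model of queue-driven scaling in the style of KEDA, which in practice handles activation from zero itself and delegates scaling beyond one replica to an HPA. Function names, limits, and sample values are assumptions for illustration.

```python
import math


def hpa_desired_replicas(current_replicas: int, current_metric: float, target_metric: float,
                         min_replicas: int = 1, max_replicas: int = 10,
                         tolerance: float = 0.1) -> int:
    """desired = ceil(current_replicas * current_metric / target_metric), clamped to bounds."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas  # close enough to target: avoid flapping
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


def queue_desired_replicas(queue_length: int, target_per_replica: int,
                           max_replicas: int = 10) -> int:
    """Event-driven sizing: one replica per target_per_replica messages, zero when idle."""
    return min(max_replicas, math.ceil(queue_length / target_per_replica))


print(hpa_desired_replicas(4, current_metric=90, target_metric=60))  # 6
print(hpa_desired_replicas(4, current_metric=62, target_metric=60))  # 4 (within tolerance)
print(queue_desired_replicas(23, target_per_replica=5))              # 5
print(queue_desired_replicas(0, target_per_replica=5))               # 0
```

The contrast shows why the signal matters: CPU utilization never justifies zero replicas, while an empty queue does.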

AKS vs Container Apps

AKS — Full Kubernetes with the complete API, operators, and node-level control — and the operational weight that comes with it. Choose it when you need Kubernetes itself.

Container Apps — Serverless containers on managed Kubernetes and KEDA, with no cluster to operate. Choose it for services and jobs that do not need the raw Kubernetes API.

Common Mistakes
  • Choosing AKS when Container Apps would do — paying the operational tax of cluster upgrades, add-ons, and node management for capabilities the workload never uses.
  • Using traditional Azure CNI with an undersized subnet instead of Azure CNI Overlay — pods exhaust the VNet address space and the cluster cannot grow, where Overlay would have avoided it.
  • Mounting service-principal credential files into pods instead of using Workload Identity — long-lived secrets that leak and never rotate.
  • Running the Free tier in production — there is no control-plane SLA, so an API-server outage has no financial backing.
  • Running without the cluster autoscaler — nodes are either over-provisioned and wasteful or too few and starving pods.
  • Treating the Horizontal Pod Autoscaler and cluster autoscaler as interchangeable — one scales pods, the other scales nodes, and you usually need both.

Best Practices
  • Default to Container Apps for containerized services; choose AKS only when you need the Kubernetes API, operators, or node-level control.
  • Use the Standard tier in production for the control-plane SLA; reserve Free for dev and test.
  • Default to Azure CNI Overlay (with Cilium) so pods draw from an overlay CIDR; reserve traditional Azure CNI for when pods need routable VNet IPs, and size that subnet for the cluster's maximum pod count.
  • Enable Workload Identity and stop mounting service-principal files; map access with Azure RBAC for Kubernetes.
  • Separate system and user node pools, and use dedicated pools (GPU, memory, Spot) sized to their workloads.
  • Run the cluster autoscaler with the Horizontal Pod Autoscaler, and add KEDA for event-driven scaling.

Comparable services: AWS EKS, GCP GKE

Knowledge Check

What does the Standard control-plane tier add over Free in AKS?

  • A financially backed uptime SLA on the managed control plane
  • The ability to run worker nodes and host workloads, which Free cannot
  • Automatic selection of the cluster network plugin at creation
  • Free GPU node pools

Why is Workload Identity preferred over mounting service-principal credential files in pods?

  • Pods authenticate to Azure with short-lived federated tokens and no stored, long-lived secret to leak
  • It is the only supported way for pods to reach the public internet over the cluster's outbound egress path
  • It eliminates the need for the cluster autoscaler when nodes fill up
  • It lets pods run without a service account

Why is the Azure CNI subnet size a decision you must get right before creating the cluster?

  • Azure CNI gives every pod a VNet IP; an undersized subnet caps cluster growth and changing it means a rebuild
  • The subnet size sets the control-plane SLA tier you are billed for
  • CNI bills you per unused IP address reserved in the pod subnet
  • The subnet size determines which Kubernetes version gets installed on the cluster control plane and worker nodes

A team reaches for AKS for a handful of stateless HTTP microservices with no special Kubernetes needs. What is the likely better fit?

  • Container Apps — serverless containers with autoscaling and no cluster to operate
  • A single large VM running all the services
  • Azure Batch with a dedicated pool
  • The Functions Consumption plan with a dedicated HTTP trigger wired up for each microservice
