Service 07

Container Apps

Serverless Containers

Azure Container Apps runs containers on managed Kubernetes with KEDA-based autoscaling, built-in ingress, and optional Dapr — without exposing the Kubernetes API or asking you to manage nodes. It scales to zero, scales out on events, and splits traffic across revisions. It is the modern default for containerized services and jobs that do not need raw Kubernetes.

Container Apps sits between Container Instances and AKS. It gives you the orchestration ACI lacks — autoscaling, health-based replacement, blue-green revisions — without the operational weight of running a cluster. When a team's instinct is to stand up AKS for a few microservices, Container Apps is usually the answer that delivers the same outcome with far less to operate.

Apps and Jobs

A Container App is a long-running service — an API, a web frontend, an event consumer — with ingress and autoscaling. A Container Apps job runs to completion and exits: triggered on a schedule, by an event such as a queue message, or manually. Using a job for batch work and an app for a service keeps each on the right scaling and billing model.

Scaling with KEDA

Scaling is driven by KEDA, the Kubernetes event-driven autoscaler. Beyond CPU and memory, a scale rule can react to HTTP concurrency, queue depth in Service Bus or Storage Queues, Event Hub backlog, or a custom metric — and scale the app to zero when there is nothing to do. You set a minimum and maximum replica count to bound the range.

Scale-to-zero is both the headline saving and the headline trap. An app at zero replicas incurs no compute charges, but the next request waits for a cold start while a replica is scheduled and its container starts. For a service with a latency SLO, set a minimum of one replica so warm capacity is always available.
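As a rough illustration of how a queue-depth rule and the replica bounds interact, the sketch below divides the backlog by a per-replica target and clamps the result to the configured range. It is a simplified model, not the platform's actual algorithm: real scaling also involves polling intervals and cool-down periods, and the function and parameter names here are hypothetical.

```python
import math


def desired_replicas(queue_length: int, messages_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Estimate the replica count for a queue-depth scale rule.

    Simplified model (an assumption for illustration): one replica per
    `messages_per_replica` queued messages, clamped to [min, max].
    """
    if messages_per_replica <= 0:
        raise ValueError("messages_per_replica must be positive")
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")
    raw = math.ceil(queue_length / messages_per_replica)
    return max(min_replicas, min(max_replicas, raw))


# Empty queue, minimum 0: the app scales to zero.
idle_cost_optimized = desired_replicas(0, 10, min_replicas=0, max_replicas=5)
# Empty queue, minimum 1: one warm replica stays up for a latency SLO.
idle_latency_safe = desired_replicas(0, 10, min_replicas=1, max_replicas=5)
# Backlog of 25 messages at 10 per replica needs 3 replicas.
busy = desired_replicas(25, 10, min_replicas=0, max_replicas=5)
```

The only difference between the first two calls is the minimum replica count, which is the whole trade-off: zero idle cost versus no cold start.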

Revisions and Traffic Splitting

Each deployment can create a new revision, and with the app in multiple-revision mode you control how much traffic each revision receives by weight. Sending 10% of traffic to a new revision is a canary; flipping 100% from old to new is a blue-green cutover; an even split is an A/B test. This is built in, with no separate load balancer or service mesh to configure.
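To make the weight semantics concrete, here is a minimal sketch of weighted routing between revisions. It models the idea only: the platform's ingress does this for you, and the revision names and helper functions below are hypothetical.

```python
import random
from typing import Mapping


def validate_weights(weights: Mapping[str, int]) -> None:
    """Check that weights are non-negative percentages totalling 100."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")


def pick_revision(weights: Mapping[str, int], rng: random.Random) -> str:
    """Choose a revision for one request in proportion to its weight."""
    validate_weights(weights)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical revision names for illustration.
canary = {"myapp--v1": 90, "myapp--v2": 10}       # 10% canary
blue_green = {"myapp--v1": 0, "myapp--v2": 100}   # full cutover
ab_test = {"myapp--v1": 50, "myapp--v2": 50}      # even split
```

Moving from canary to full cutover is then just a change of weights; no replicas are redeployed in this model.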

Environment and Dapr

Container Apps run inside an environment — a secure boundary, optionally VNet-injected, that apps share for networking and observability. Dapr, the Distributed Application Runtime, is available as a sidecar for service-to-service invocation, pub/sub, and state management, so cross-service plumbing is configuration rather than code.
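With the Dapr sidecar enabled, one service calls another through the sidecar's local HTTP API rather than resolving the target's address itself. The sketch below only builds the invocation URL, assuming the standard Dapr `v1.0/invoke` route and the `DAPR_HTTP_PORT` environment variable (default 3500); the app ID `orders` and the method path are hypothetical.

```python
import os
from typing import Optional
from urllib.parse import quote


def dapr_invoke_url(app_id: str, method: str, port: Optional[int] = None) -> str:
    """Build the local sidecar URL for Dapr service-to-service invocation."""
    if port is None:
        # Dapr injects DAPR_HTTP_PORT into the app container; 3500 is its default.
        port = int(os.environ.get("DAPR_HTTP_PORT", "3500"))
    safe_app = quote(app_id, safe="")
    safe_method = quote(method.lstrip("/"), safe="/")
    return f"http://localhost:{port}/v1.0/invoke/{safe_app}/method/{safe_method}"


# Hypothetical target: the "orders" app's /checkout/42 endpoint.
url = dapr_invoke_url("orders", "/checkout/42", port=3500)
```

The caller addresses the target by app ID only; the sidecar handles discovery and routing, which is what moves the plumbing out of application code.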

Container Apps vs AKS vs App Service vs Functions

Container Apps — Serverless containers with KEDA autoscaling, revisions, and scale-to-zero. The default for containerized services and jobs without raw Kubernetes needs.

AKS — Full Kubernetes API, operators, and node control — and the ops weight to match. Choose only when you need Kubernetes itself.

App Service — A managed host for a conventional web app. Choose it for a standard long-running app where containers and event scaling are not the point.

Functions — Per-event code execution. Choose it for short, event-triggered tasks rather than a containerized service.

Common Mistakes

Standing up AKS for a few stateless microservices when Container Apps would run them with no cluster to operate.
Leaving scale-to-zero on for a latency-SLO service: a cold start on the first request after idle can breach the SLO, so set a minimum of one replica.
Setting no minimum replicas where warm capacity is needed, then blaming cold starts on the platform.
Scaling on CPU alone when the real signal is queue depth or HTTP concurrency, even though a KEDA scaler exists for that signal.
Treating Container Apps as full Kubernetes — there is no node-level access, no DaemonSets, no arbitrary operators.
Using a long-running app for batch work that should be a job, so it never scales down and bills continuously.

Best Practices

Reach for Container Apps before AKS for containerized services; escalate to AKS only when you need the Kubernetes API itself.
Use jobs for run-to-completion work and apps for services, so each gets the right scaling and billing model.
Pick the KEDA scaler that matches the real signal — queue depth or HTTP concurrency, not CPU by default.
Set a minimum of one replica for any service with a latency SLO; allow scale-to-zero only for tolerant workloads.
Use revision traffic splitting for canary and blue-green rollouts instead of bolting on a separate load balancer.
Inject the environment into a VNet for private networking, and use Dapr for service-to-service and pub/sub plumbing.

Comparable services: AWS App Runner / ECS Fargate, GCP Cloud Run

Knowledge Check

What does KEDA enable in Container Apps that plain CPU autoscaling does not?

Scaling on external event sources — queue depth, Event Hub backlog, HTTP concurrency — including down to zero
Direct administrative access to the underlying Kubernetes nodes, the kubelet, and the host operating system on each node
A financially backed control-plane uptime SLA on the platform
Running a DaemonSet pod on every node in the cluster

When should you set a minimum of one replica instead of allowing scale-to-zero?

When the service has a latency SLO that a cold start would violate
When the workload is a run-to-completion batch job that exits when finished
When you want to minimize cost above all else
When the app uses Dapr for pub/sub

How do you run a 10% canary of a new version in Container Apps?

Create a new revision and assign it 10% of traffic by weight
Deploy a second Container Apps environment and split DNS
Configure an external load balancer in front with weighted routing rules
Scale the old revision to zero and the new one to one

A team needs a few stateless microservices with autoscaling but no special Kubernetes features. Which service fits with the least to operate?

Container Apps
AKS with the Premium control-plane tier
Container Instances behind a manual load balancer
A VM Scale Set running Docker
