Service 03

Amazon ECS

Containers · Managed · Orchestration

Amazon ECS (Elastic Container Service) is AWS's own container orchestrator: it runs your Docker containers, restarts them when they fail, and scales them with traffic. Compared with Kubernetes it is deliberately simpler — less to learn, less YAML, and tight integration with the rest of AWS.

Launched in 2014, ECS is the pragmatic default for teams that want production container orchestration without adopting Kubernetes and its operational surface.

Core Concepts

ECS has five building blocks. A container image packages your app, usually stored in Amazon ECR. A task definition is the JSON blueprint — image, CPU, memory, ports, IAM role, logging. A task is a running instance of that definition. A service keeps N tasks running, replaces failures, and connects them to a load balancer. A cluster is the logical group they all live in.

ECS Building Blocks — Cluster to Container

Cluster: logical group (prod / staging / batch)

Service: keeps N tasks running, behind a load balancer

Task: a running instance of a task definition

Container: your image, one or more per task, sharing localhost when the task uses awsvpc networking
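To make the blueprint idea concrete, here is a minimal sketch of a Fargate task definition written as a Python dictionary and printed as JSON. The field names follow the ECS task definition schema, but every value (family name, account ID, region, image URI, role names, log group) is a hypothetical placeholder, not something taken from a real account.

```python
import json

# Hypothetical values throughout: account ID, region, names, and ARNs are placeholders.
task_definition = {
    "family": "payments-api",
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "256",     # task-level CPU units (1024 units = 1 vCPU)
    "memory": "512",  # task-level memory in MiB
    # Role ECS uses to pull the image and write logs on the task's behalf.
    "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    # Role the application code itself assumes (least privilege, per service).
    "taskRoleArn": "arn:aws:iam::123456789012:role/payments-api-task",
    "containerDefinitions": [
        {
            "name": "app",
            # Pinned version tag rather than 'latest'.
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/payments-api:1.4.2",
            "essential": True,
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/payments-api",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "app",
                },
            },
        }
    ],
}

# The JSON form is what you would register with ECS.
print(json.dumps(task_definition, indent=2))
```

A service would then reference this definition by family and revision and keep the requested number of tasks running from it.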

EC2 vs Fargate Launch Types

ECS runs tasks two ways. With the EC2 launch type you provide and manage a fleet of instances; ECS only places containers on them. It can be cheaper for large, steady workloads, where densely packed instances bought with Spot or Savings Plans pricing cost less than paying per task, but you own the patching and scaling of the hosts.

With the Fargate launch type AWS provides the compute — you declare CPU and memory per task and never see a server. It is the right default for most teams; move to EC2 only when cost at scale, GPUs, or special hardware justify the extra operational work.
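Declaring CPU and memory per task is not free-form: Fargate accepts only certain pairings. The sketch below checks a pairing and picks the smallest one that covers a requirement. The table is an assumption based on the commonly documented combinations up to 4 vCPU; larger sizes exist and are omitted, so verify against the current Fargate documentation before relying on it. The function names are illustrative.

```python
# Memory values (MiB) accepted for each task-level CPU value (CPU units, 1024 = 1 vCPU).
# Assumption: commonly documented Fargate combinations up to 4 vCPU only.
FARGATE_MEMORY_BY_CPU = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu, memory_mib):
    """Return True if (cpu, memory_mib) appears in the table above."""
    return memory_mib in FARGATE_MEMORY_BY_CPU.get(cpu, [])


def smallest_fargate_size(cpu_needed, memory_needed_mib):
    """Return the smallest (cpu, memory) pair covering both needs, or None."""
    for cpu in sorted(FARGATE_MEMORY_BY_CPU):
        if cpu < cpu_needed:
            continue
        for memory in FARGATE_MEMORY_BY_CPU[cpu]:
            if memory >= memory_needed_mib:
                return cpu, memory
    return None


print(is_valid_fargate_size(256, 4096))
print(smallest_fargate_size(300, 3000))
```

Rounding a measured requirement up to the nearest valid pair is also a quick way to see how much idle capacity a task size would pay for.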

Deployments and Scaling

Updating a service triggers a rolling deployment: ECS starts a new task, waits for it to pass health checks, then drains an old one, repeating until all are replaced. For production, blue/green deployments via CodeDeploy stand up the new version beside the old and switch traffic only after it is healthy.
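The rolling behavior can be sketched as a small simulation. It assumes the pace is governed by two service settings, a minimum healthy percent (floor on running tasks) and a maximum percent (ceiling on running tasks), that the floor rounds up and the ceiling rounds down, and that every new task becomes healthy immediately. The function name and step format are illustrative; this models the idea rather than reproducing the ECS scheduler.

```python
import math


def rolling_deployment(desired, min_healthy_percent=100, max_percent=200):
    """Simulate replacing `desired` old tasks; return a list of (action, count, old, new)."""
    min_healthy = math.ceil(desired * min_healthy_percent / 100)  # floor on running tasks
    max_total = math.floor(desired * max_percent / 100)           # ceiling on running tasks
    old, new = desired, 0
    steps = []
    while old > 0 or new < desired:
        # Start as many new tasks as the ceiling allows.
        can_start = min(desired - new, max_total - (old + new))
        if can_start > 0:
            new += can_start
            steps.append(("start", can_start, old, new))
        # Drain old tasks while staying at or above the floor.
        can_stop = min(old, (old + new) - min_healthy)
        if can_stop > 0:
            old -= can_stop
            steps.append(("stop", can_stop, old, new))
        if can_start <= 0 and can_stop <= 0:
            raise ValueError("settings leave no room to start or stop a task")
    return steps


# One spare slot (125%) gives the one-at-a-time pattern described above.
for step in rolling_deployment(4, min_healthy_percent=100, max_percent=125):
    print(step)
```

With 100% and 200% the same function finishes in two steps (start four, stop four), which shows how the two settings trade deployment speed against temporary extra capacity.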

Service Auto Scaling adjusts task count from CloudWatch metrics — CPU, memory, or average request count per task — adding tasks under load and removing them when it drops.
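As a rough illustration of how a metric target turns into a task count, the sketch below scales the current count in proportion to how far the metric sits from its target, then clamps the result to a minimum and maximum. This is a simplification: the real service acts through CloudWatch alarms and cooldown periods, and the function and parameter names here are hypothetical.

```python
import math


def desired_task_count(current_tasks, metric_value, target_value, min_tasks, max_tasks):
    """Estimate the task count that would bring the metric back to its target."""
    if current_tasks < 1 or target_value <= 0:
        raise ValueError("need at least one running task and a positive target")
    # If 4 tasks average 90% CPU against a 60% target, about 6 tasks are needed.
    proportional = math.ceil(current_tasks * metric_value / target_value)
    # Never go below the floor or above the ceiling configured for the service.
    return max(min_tasks, min(max_tasks, proportional))


print(desired_task_count(4, 90, 60, min_tasks=2, max_tasks=10))   # scale out
print(desired_task_count(4, 30, 60, min_tasks=2, max_tasks=10))   # scale in
print(desired_task_count(4, 300, 60, min_tasks=2, max_tasks=10))  # capped at max
```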

Networking

The recommended network mode is awsvpc, which gives each task its own elastic network interface, private IP, and security group — and it is the only mode Fargate supports. For service-to-service discovery, ECS integrates with Cloud Map to assign DNS names like payments.local; for HTTP, services sit behind an Application Load Balancer routing by path or host.
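Routing by path or host can be pictured as an ordered rule list where the first match wins and unmatched requests fall through to a default. The sketch below imitates that idea in plain Python; the hostnames, paths, and service names are hypothetical, and `fnmatchcase` is only a stand-in for the load balancer's own wildcard matching.

```python
from fnmatch import fnmatchcase

# Hypothetical listener rules, evaluated in priority order (lowest number first).
RULES = [
    {"priority": 10, "host": "api.example.com", "path": "/payments/*", "target": "payments"},
    {"priority": 20, "host": "api.example.com", "path": "/orders/*", "target": "orders"},
    {"priority": 30, "host": "admin.example.com", "path": "*", "target": "admin"},
]
DEFAULT_TARGET = "web"  # used when no rule matches


def route(host, path, rules=RULES):
    """Return the name of the service that should receive the request."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        # Hostnames are case-insensitive; paths are matched as given.
        if fnmatchcase(host.lower(), rule["host"]) and fnmatchcase(path, rule["path"]):
            return rule["target"]
    return DEFAULT_TARGET


print(route("api.example.com", "/payments/charge"))
print(route("admin.example.com", "/users"))
print(route("www.example.com", "/"))
```

Each target name here would correspond to one ECS service registered behind the load balancer.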

ECS vs EKS

ECS: AWS-native, simpler, less to operate. The right default for most container workloads on AWS unless you specifically need Kubernetes.

EKS: full Kubernetes, with its API, ecosystem, and portability. Worth the added complexity when your team already runs Kubernetes or needs its specific features.

Common Mistakes

Tagging images latest instead of pinning a version — 'latest' silently changes, so deploys and rollbacks become non-deterministic.
Packing many unrelated containers into one task — they then scale and fail together. Use one container per task unless they are tightly coupled sidecars.
Running a fixed task count when load varies through the day — attach Service Auto Scaling instead of overprovisioning.
Giving the whole cluster one broad IAM role instead of per-task roles — a compromised task then has far more access than it needs.
Misjudging task CPU and memory: too little memory gets the container killed under load, too little CPU throttles it, and too much of either pays for idle capacity.
Reading logs by exec-ing into containers instead of shipping them to CloudWatch Logs, where search and alarms work.
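Several of the mistakes listed above can be caught before deployment by inspecting the task definition itself. The following is a minimal sketch of such a check, assuming the task definition is available as a dictionary using the standard ECS field names; the function names and messages are illustrative and this is not an existing AWS tool.

```python
def image_tag(image):
    """Return the image's tag, 'latest' if it has none, or None if pinned by digest."""
    if "@" in image:
        return None  # e.g. repo/app@sha256:... is already immutable
    last_segment = image.rsplit("/", 1)[-1]  # ignore any registry host:port prefix
    if ":" in last_segment:
        return last_segment.rsplit(":", 1)[1]
    return "latest"  # an untagged reference resolves to 'latest'


def lint_task_definition(task_def):
    """Return a list of human-readable findings; an empty list means no issues found."""
    findings = []
    if not task_def.get("taskRoleArn"):
        findings.append("task: no taskRoleArn, so the task has no dedicated IAM role")
    if task_def.get("networkMode") != "awsvpc":
        findings.append("task: networkMode is not awsvpc")
    for container in task_def.get("containerDefinitions", []):
        name = container.get("name", "<unnamed>")
        if image_tag(container.get("image", "")) == "latest":
            findings.append(f"{name}: image is not pinned to a version")
        if not container.get("logConfiguration"):
            findings.append(f"{name}: no logConfiguration, logs are not shipped anywhere")
    return findings


# Hypothetical task definition that trips every check.
risky = {"containerDefinitions": [{"name": "app", "image": "nginx"}]}
for finding in lint_task_definition(risky):
    print(finding)
```

A check like this fits naturally into a CI step that runs before a new task definition revision is registered.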

Best Practices

Default to Fargate; move to the EC2 launch type only for clear cost-at-scale or hardware reasons.
Run one container per task unless a sidecar is genuinely required.
Assign a least-privilege IAM task role to each service.
Use awsvpc mode so every task gets its own IP and security group.
Pin image versions in task definitions; never deploy latest.
Use blue/green deployments via CodeDeploy for production traffic switches.

Comparable services: GCP (Cloud Run, GKE); Azure (Azure Container Apps, AKS)

Knowledge Check

What is the relationship between a task definition and a task in ECS?

A task definition is the JSON blueprint; a task is a running instance of that blueprint
A task is the JSON blueprint you register, and a task definition is the live running container launched from it
They are two interchangeable names for the same object that ECS treats identically
A task definition is for Fargate; a task is for the EC2 launch type

A small team wants to run containers without managing servers. Which launch type fits, and why?

Fargate — AWS provides the compute, so there are no instances to patch, scale, or size
EC2 launch type — it is always cheaper and removes server management
Either one — the launch type has no effect on server management, so the team can pick freely without consequence
Neither — ECS always requires you to provision and patch your own EC2 instances behind the scenes

Why pin image versions instead of using the latest tag?

'latest' points to whatever was pushed most recently, making deploys and rollbacks non-deterministic
The 'latest' tag cannot be pulled from Amazon ECR, so the task fails to start until you give it a version number
Images carrying the 'latest' tag are billed at a higher storage and pull rate than pinned versions
ECS validates task definitions and rejects outright any image whose tag is 'latest'

When does EKS make more sense than ECS?

When the team already runs Kubernetes or needs its specific ecosystem features and portability
Whenever you have more than two containers, because ECS cannot orchestrate that many on its own
Whenever you want the simplest possible orchestration with the least to learn and operate
Whenever you want to avoid the fixed per-cluster control-plane fee that EKS charges every month
