Service 01

Amazon EC2

Compute, IaaS, VMs

Amazon EC2 (Elastic Compute Cloud) is the virtual-machine service AWS launched in 2006, and it still underpins a large share of all AWS workloads. You pick a size, an operating system, a Region, and an Availability Zone; AWS hands you a running server in seconds. You connect over SSH or RDP and use it like any other machine.

What changes in the cloud is everything around the instance. You start ten or ten thousand with one command, stop them when traffic drops, and replace a failed one in a minute — paying by the second for what runs. EC2 is the default when you need full control of the operating system or run software that does not fit a managed service.

Instance Types

The instance type fixes how much CPU, memory, network, and storage an instance gets. A name like m5.large has three parts: a family letter (what it is good at), a generation number (higher is usually newer and cheaper per unit of work), and a size (nano through 48xlarge).

Family	Best for	Examples
T	Small, bursty workloads	`t3.micro`, `t4g.small`
M	General-purpose work	`m6i.large`, `m7g.xlarge`
C	CPU-heavy work	`c7g.large`, `c6i.4xlarge`
R	Memory-heavy work	`r6i.large`, `r7g.2xlarge`
I	Fast local NVMe SSDs	`i4i.large`
G / P	GPU work (graphics, ML)	`g5.xlarge`, `p4d.24xlarge`

The pattern beats memorization: start with an M-family instance, move to C if you are CPU-bound, move to R if you are memory-bound. Stay on the latest generation — newer generations are usually faster and cheaper for the same workload.
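The naming scheme is regular enough to parse mechanically. The sketch below splits a name into its parts and looks up the family in the table above. The function names are illustrative, and the optional `attributes` group (extra letters after the generation, such as the `g` in `t4g`) is an addition beyond the three parts described here. Unusual names outside this pattern are rejected rather than guessed at.

```python
import re
from typing import NamedTuple


class InstanceTypeName(NamedTuple):
    family: str      # e.g. "m"
    generation: int  # e.g. 5
    attributes: str  # letters after the generation, e.g. "g" in t4g; may be empty
    size: str        # e.g. "large"


# Family descriptions copied from the table in this section.
FAMILY_USE = {
    "t": "Small, bursty workloads",
    "m": "General-purpose work",
    "c": "CPU-heavy work",
    "r": "Memory-heavy work",
    "i": "Fast local NVMe SSDs",
    "g": "GPU work (graphics, ML)",
    "p": "GPU work (graphics, ML)",
}

# family letters, generation digits, optional attribute letters, ".", size
_PATTERN = re.compile(r"^([a-z]+?)(\d+)([a-z-]*)\.([a-z0-9]+)$")


def parse_instance_type(name: str) -> InstanceTypeName:
    """Split a name such as 'm5.large' into its parts."""
    match = _PATTERN.match(name.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")
    family, generation, attributes, size = match.groups()
    return InstanceTypeName(family, int(generation), attributes, size)


def describe(name: str) -> str:
    """One-line summary using the family table."""
    parsed = parse_instance_type(name)
    use = FAMILY_USE.get(parsed.family, "not covered by the table")
    return (
        f"{name}: family {parsed.family.upper()} ({use}), "
        f"generation {parsed.generation}, size {parsed.size}"
    )
```

For example, `parse_instance_type("p4d.24xlarge")` yields family `p`, generation `4`, attributes `d`, and size `24xlarge`.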

AMIs and Storage

An AMI (Amazon Machine Image) is the template an instance boots from — operating system, pre-installed software, and configuration. A common pattern bakes your software into a custom AMI so startup is fast and predictable; tools like EC2 Image Builder and Packer build them repeatably.

Almost every instance boots from an EBS volume, network-attached storage that persists independently of the instance: it survives a stop and can be detached and reattached elsewhere. On terminate, the root volume is deleted by default unless you turn off its DeleteOnTermination flag; other attached volumes are kept by default. Instance store is local NVMe that is far faster but is wiped when the instance stops or terminates; use it only for caches and rebuildable scratch data.
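Whether an EBS volume outlives a terminated instance is controlled per volume by its `DeleteOnTermination` flag, which defaults to true for the root volume. As a minimal sketch, the helper below builds one block-device-mapping entry in the dictionary shape the EC2 RunInstances API accepts. The helper name, the `gp3` volume type, and the `/dev/xvda` device name in the usage line are assumptions for illustration; the root device name depends on the AMI. Nothing here calls AWS.

```python
def root_volume_mapping(device_name: str, size_gib: int, keep_on_terminate: bool) -> dict:
    """Build one BlockDeviceMappings entry (plain dict, no AWS call)."""
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    return {
        "DeviceName": device_name,
        "Ebs": {
            "VolumeSize": size_gib,   # GiB
            "VolumeType": "gp3",      # assumed volume type for this sketch
            # False means the volume is kept when the instance is terminated.
            "DeleteOnTermination": not keep_on_terminate,
        },
    }


# Hypothetical usage: a 30 GiB root volume that survives termination.
mapping = root_volume_mapping("/dev/xvda", 30, keep_on_terminate=True)
```

A list of such entries would be passed as the `BlockDeviceMappings` parameter when launching an instance.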

Networking and Security

Every instance runs inside a VPC with a private IP, and optionally a public IP to be reachable from the internet. A security group is a stateful firewall around the instance: inbound is deny-all (you open ports explicitly), outbound is allow-all by default, and return traffic for an allowed connection is permitted automatically.
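To make "stateful" concrete, here is a toy model of the behavior just described. It is only a sketch of the rules as stated (inbound deny-all, outbound allow-all, replies tracked), not how AWS implements security groups; the class and method names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityGroupModel:
    """Toy stateful firewall: explicit inbound ports, open outbound, tracked replies."""
    open_inbound_ports: set = field(default_factory=set)
    tracked: set = field(default_factory=set)  # (remote_ip, remote_port) pairs

    def allow_inbound(self, port: int) -> None:
        """Open one inbound port explicitly."""
        self.open_inbound_ports.add(port)

    def new_inbound(self, remote_ip: str, local_port: int) -> bool:
        """A remote host opens a connection to the instance."""
        return local_port in self.open_inbound_ports

    def new_outbound(self, remote_ip: str, remote_port: int) -> bool:
        """The instance opens a connection; outbound is allow-all by default."""
        self.tracked.add((remote_ip, remote_port))
        return True

    def inbound_reply(self, remote_ip: str, remote_port: int) -> bool:
        """A packet arrives as the answer to an earlier outbound connection."""
        return (remote_ip, remote_port) in self.tracked


sg = SecurityGroupModel()
sg.allow_inbound(443)                              # the only inbound rule
https_ok = sg.new_inbound("203.0.113.7", 443)      # allowed by the rule
ssh_ok = sg.new_inbound("203.0.113.7", 22)         # denied, no rule for 22
sg.new_outbound("198.51.100.9", 443)               # instance calls out
reply_ok = sg.inbound_reply("198.51.100.9", 443)   # allowed with no inbound rule
```

The last line is the point: the reply is admitted because the connection was tracked, so no matching inbound rule is needed.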

A key pair handles the first login — AWS holds the public key, you keep the private key. For access without managing keys at all, use Systems Manager Session Manager instead of opening SSH.

Pricing Models

Choosing the pricing model is one of the largest cost levers on AWS. On-Demand pays by the second with no commitment. Reserved Instances and Savings Plans trade a one- or three-year commitment for up to ~70% off; Savings Plans are usually the easier choice because the discount spans instance types and services. Spot buys spare capacity at up to ~90% off, with the catch that AWS can reclaim it on two minutes' notice.

Model	Discount	Commitment	Best for
On-Demand	none	none	Spiky, short-lived workloads
Reserved	up to ~70%	1 or 3 years	Steady production workloads
Savings Plans	up to ~70%	1 or 3 years	Steady spend, flexible usage
Spot	up to ~90%	none — reclaimable	Fault-tolerant batch work

Most real systems mix models: cover the steady baseline with a Savings Plan, absorb daily spikes On-Demand, and run interruptible batch work on Spot.
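A quick calculation shows why the mix matters. Every number below is hypothetical and chosen for round arithmetic: the hourly rate, the fleet sizes, and the two discounts (set well below the "up to" ceilings quoted above). Treat it as a template for plugging in real rates, not as a price estimate.

```python
# All figures are assumptions for illustration, not AWS prices.
ON_DEMAND_RATE = 0.10         # USD per instance-hour
SAVINGS_PLAN_DISCOUNT = 0.40  # assumed; the text only says "up to ~70%"
SPOT_DISCOUNT = 0.70          # assumed; the text only says "up to ~90%"


def cost(instance_hours: float, discount: float = 0.0) -> float:
    """Cost of some instance-hours at the On-Demand rate minus a discount."""
    return instance_hours * ON_DEMAND_RATE * (1.0 - discount)


# One day of a hypothetical workload.
baseline_hours = 10 * 24  # 10 instances running around the clock
spike_hours = 5 * 6       # 5 extra instances for a 6-hour daily peak
batch_hours = 20          # interruptible batch work

all_on_demand = cost(baseline_hours + spike_hours + batch_hours)
mixed = (
    cost(baseline_hours, SAVINGS_PLAN_DISCOUNT)  # steady baseline on a Savings Plan
    + cost(spike_hours)                          # spikes stay On-Demand
    + cost(batch_hours, SPOT_DISCOUNT)           # batch on Spot
)
saving_fraction = 1.0 - mixed / all_on_demand
```

Under these assumptions the day costs about 29 USD entirely On-Demand and about 18 USD mixed, a saving of roughly 38%, with most of it coming from covering the baseline.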

Scaling with Auto Scaling Groups

An Auto Scaling Group manages a fleet of instances from a launch template, holding a desired capacity between a minimum and maximum and moving it with scaling policies tied to CloudWatch metrics like CPU or ALB request count. Unhealthy instances are replaced automatically; behind a load balancer this is the classic high-availability pattern.
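The relationship between a metric, a target, and the capacity bounds can be sketched as a proportional rule. This is a simplified model for intuition only: AWS does not publish the exact algorithm behind its scaling policies, and the function name and formula here are assumptions.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     minimum: int, maximum: int) -> int:
    """Scale capacity in proportion to metric/target, then clamp to [minimum, maximum]."""
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    proposed = math.ceil(current * metric / target)
    return max(minimum, min(maximum, proposed))
```

With 4 instances at 90% average CPU and a 50% target, the rule proposes 8 instances; a maximum of 6 would cap it at 6, and a quiet period at 10% CPU would fall back to the minimum rather than to 1.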

Figure: Auto Scaling Group + ALB, instance count lags traffic.

Scaling is not instant — each new instance launches, configures, and passes health checks, and each action respects a cooldown. The instance-count curve always lags the traffic curve by a few minutes. Design for the lag with a minimum capacity that absorbs the first spike.
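The effect of the lag can be seen in a small minute-by-minute simulation. It is a deliberately crude sketch: one fixed launch delay stands in for launch, configuration, and health checks, there is no scale-in or cooldown, and every number is hypothetical.

```python
import math


def simulate(demand, start_instances, launch_delay, per_instance_capacity, maximum):
    """Return a list of (in_service, unserved_requests), one entry per minute.

    demand: requests per minute, one entry per minute.
    launch_delay: minutes from requesting an instance until it serves traffic.
    """
    in_service = start_instances
    pending = []  # minute at which each launching instance becomes ready
    timeline = []
    for minute, load in enumerate(demand):
        # Instances whose launch has completed join the fleet.
        in_service += sum(1 for ready_at in pending if ready_at <= minute)
        pending = [ready_at for ready_at in pending if ready_at > minute]

        # Request enough new instances to cover the load, up to the maximum.
        needed = math.ceil(load / per_instance_capacity)
        committed = in_service + len(pending)
        to_launch = max(0, min(needed, maximum) - committed)
        pending.extend([minute + launch_delay] * to_launch)

        unserved = max(0, load - in_service * per_instance_capacity)
        timeline.append((in_service, unserved))
    return timeline


# Hypothetical spike: traffic jumps from 100 to 500 requests/minute at minute 2.
spike = [100, 100, 500, 500, 500, 500, 500]
lean = simulate(spike, start_instances=1, launch_delay=3,
                per_instance_capacity=100, maximum=10)
padded = simulate(spike, start_instances=5, launch_delay=3,
                  per_instance_capacity=100, maximum=10)
```

In the `lean` run the fleet stays at one instance for the three minutes it takes replacements to arrive, leaving 400 requests per minute unserved; the `padded` run starts with enough capacity and drops nothing.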

EC2 vs Lambda vs Fargate

EC2 — full OS control and any instance shape. Reach for it when you need the whole machine, special hardware, or a lift-and-shift migration.

Lambda — short, event-driven code with no servers and no idle cost. Better for spiky, sub-15-minute work.

Fargate — containers without managing instances. Better when your workload is already containerized and you do not need the host.
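The three descriptions above can be condensed into a rough decision helper. The function, its parameters, and the ordering of the checks are a reading of this section rather than official guidance, and real decisions weigh more factors (cost, cold starts, team skills).

```python
def suggest_compute(needs_os_control: bool, needs_special_hardware: bool,
                    lift_and_shift: bool, containerized: bool,
                    event_driven: bool, max_runtime_minutes: float) -> str:
    """Pick EC2, Lambda, or Fargate using the rules of thumb in this section."""
    # Whole machine, special hardware, or lift-and-shift: EC2.
    if needs_os_control or needs_special_hardware or lift_and_shift:
        return "EC2"
    # Short, event-driven code: Lambda (sub-15-minute work).
    if event_driven and max_runtime_minutes < 15:
        return "Lambda"
    # Already containerized and no need for the host: Fargate.
    if containerized:
        return "Fargate"
    # Otherwise fall back to the default.
    return "EC2"
```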

Common Mistakes

Opening port 22 (SSH) to 0.0.0.0/0 — this invites constant brute-force scanning. Restrict SSH to your own IP, or use Session Manager and open nothing.
Storing important data only on the instance's local disk — instance store is wiped on stop, and even an EBS root volume is easy to lose on terminate. Keep durable data on separate EBS volumes, S3, or a database.
Embedding long-lived access keys on the instance instead of attaching an IAM role — leaked keys are the classic breach vector.
Running everything On-Demand — a steady production fleet with no Savings Plan or Reserved coverage overpays by up to 70%.
Never setting a maximum on an Auto Scaling Group — a runaway scaling loop or traffic flood generates a surprise bill.
Leaving idle dev instances running — an unused instance costs the same as a busy one. Stop or schedule them.
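The first mistake in this list is easy to check for mechanically. The sketch below inspects one inbound rule in the dictionary shape that the EC2 DescribeSecurityGroups API returns under `IpPermissions`; the function name is illustrative, the sample rules are made up, and no AWS call is made.

```python
def opens_ssh_to_world(permission: dict) -> bool:
    """True if this inbound rule exposes TCP port 22 to any IPv4 or IPv6 address."""
    protocol = permission.get("IpProtocol")
    if protocol == "-1":
        # "-1" means all protocols and all ports.
        covers_ssh = True
    elif protocol == "tcp":
        covers_ssh = permission.get("FromPort", 0) <= 22 <= permission.get("ToPort", 65535)
    else:
        covers_ssh = False

    world_v4 = any(r.get("CidrIp") == "0.0.0.0/0"
                   for r in permission.get("IpRanges", []))
    world_v6 = any(r.get("CidrIpv6") == "::/0"
                   for r in permission.get("Ipv6Ranges", []))
    return covers_ssh and (world_v4 or world_v6)


# Made-up sample rules.
risky = {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
restricted = {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
              "IpRanges": [{"CidrIp": "203.0.113.10/32"}]}
web_only = {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
```

Here only `risky` is flagged; `restricted` limits SSH to a single address and `web_only` does not cover port 22.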

Best Practices

Treat instances as replaceable cattle — you should be able to terminate any one and relaunch from an AMI without data loss.
Attach an IAM role to every instance; never store AWS access keys on disk.
Patch through Systems Manager Patch Manager rather than by hand.
Tag every instance with at least Environment, Project, and Owner.
Cover steady baseline load with a Savings Plan and run fault-tolerant work on Spot.
Set CloudWatch alarms on CPU, status checks, and disk before the first incident, not during it.
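The tagging practice lends itself to an automated check. As a minimal sketch, the helper below takes tags in the `[{"Key": ..., "Value": ...}]` shape the EC2 API uses and reports which of the three tags named above are missing; treating an empty value as missing is a choice made here, not an AWS rule.

```python
REQUIRED_TAGS = ("Environment", "Project", "Owner")


def missing_tags(tags: list) -> list:
    """Return the required tag keys that are absent or have an empty value."""
    present = {tag.get("Key") for tag in tags if tag.get("Value")}
    return [key for key in REQUIRED_TAGS if key not in present]


# Made-up example: Project is absent and Owner is empty.
example = [
    {"Key": "Environment", "Value": "prod"},
    {"Key": "Owner", "Value": ""},
]
gaps = missing_tags(example)
```

For the example above, `gaps` lists `Project` and `Owner`.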

Comparable services: GCP Compute Engine, Azure Virtual Machines

Knowledge Check

A production fleet runs the same baseline load 24/7 all year. Which pricing approach fits the baseline best?

A Savings Plan or Reserved Instances — a 1–3 year commitment for up to ~70% off steady usage
On-Demand — pay the per-second rate with no commitment, since maximum flexibility is always worth the premium
Spot Instances — at 70–90% off, the steady round-the-clock load makes the risk of interruption irrelevant
Lightsail — its flat monthly bundle of compute and transfer is always the cheapest option at production scale

Why are Spot Instances unsuitable for a stateful database primary?

AWS can reclaim the instance with a two-minute warning, interrupting the workload
Spot Instances cannot attach EBS volumes, so a database has nowhere durable to store its data files
Spot Instances run on an older, slower hardware generation that cannot keep up with a database's I/O demands
Spot Instances are limited to the t-family burstable types, which throttle CPU and starve a busy database

What does it mean that a security group is stateful?

Return traffic for an allowed connection is permitted automatically, so you write rules in one direction
It remembers the instance's previously assigned IP addresses across restarts and keeps allowing traffic to them
It persists its rules to disk so they survive a reboot and reload automatically when the instance comes back up
It blocks all traffic until you explicitly allow both inbound and outbound for every flow

An Auto Scaling Group is configured with a target-tracking policy on CPU. Traffic spikes suddenly. What should you expect?

Instance count lags the spike by a few minutes due to launch, health checks, and cooldown — so set a minimum that absorbs the first burst
New instances serve traffic instantly the moment CPU crosses the threshold, so the spike is absorbed with no lag at all and no minimum to tune
The ASG temporarily scales past its configured maximum to meet demand, then trims back down once the spike subsides
The policy responds by growing each instance's EBS storage rather than adding instances to the group

When is EC2 the right choice over a managed compute service?

When you need full OS control, special hardware, or are migrating an existing workload that does not fit a managed service
Whenever the application is written in Python or Node.js, since those runtimes only run properly on a full EC2 instance
Whenever you want the lowest possible cost for spiky, event-driven code that sits idle most of the day between bursts
Whenever you need to serve a static website of plain HTML, CSS, and images to visitors
