Amazon Aurora
Amazon Aurora is AWS's own relational engine, wire-compatible with MySQL and PostgreSQL so existing drivers and tools work unchanged. The familiar SQL hides a very different architecture: compute is separated from a distributed storage layer that AWS built specifically for the cloud. AWS positions it inside RDS — same console, billing, and API — so the shared management layer behaves exactly as in RDS.
The headline differences from RDS are a shared distributed cluster volume, near-zero-lag readers, seconds-not-minutes failover, and Aurora-only features like Serverless v2, Global Database, and Backtrack. AWS claims up to 5× MySQL and 3× PostgreSQL throughput on the same hardware.
The Distributed Storage Layer
Aurora replaces the per-instance disk with a distributed storage service. The cluster volume keeps six copies of every block across three Availability Zones — two per AZ. A write needs four of six copies to acknowledge; a read needs three. The cluster tolerates losing one whole AZ (two copies) without losing reads or writes.
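The quorum arithmetic above can be checked with a few lines of Python. This is only a sketch of the counting rule described in the paragraph, not Aurora's storage protocol; the AZ labels and the function names `availability` and `lose_az` are hypothetical.

```python
COPIES_PER_AZ = 2
AZS = ("az-a", "az-b", "az-c")  # hypothetical AZ labels
WRITE_QUORUM = 4                # copies that must acknowledge a write
READ_QUORUM = 3                 # copies that must answer a read

# Every copy of a block is identified by (az, index): six copies in total.
ALL_COPIES = [(az, i) for az in AZS for i in range(COPIES_PER_AZ)]


def availability(failed):
    """Return (can_write, can_read) for a collection of failed copies."""
    healthy = len(ALL_COPIES) - len(set(failed))
    return healthy >= WRITE_QUORUM, healthy >= READ_QUORUM


def lose_az(az):
    """All copies that live in the given AZ."""
    return [copy for copy in ALL_COPIES if copy[0] == az]


if __name__ == "__main__":
    # Any read quorum overlaps any write quorum, because 4 + 3 > 6.
    assert WRITE_QUORUM + READ_QUORUM > len(ALL_COPIES)

    print(availability([]))                               # (True, True)
    print(availability(lose_az("az-a")))                  # (True, True)
    print(availability(lose_az("az-a") + [("az-b", 0)]))  # (False, True)
```

With one whole AZ gone, four copies remain, which is exactly the write quorum. Losing one more copy leaves three, so by the same arithmetic reads still succeed while writes do not.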
Storage is separate from compute, so adding or replacing an instance moves no data; the new instance attaches to the shared volume in seconds. The volume auto-grows in 10 GiB steps up to 128 TiB (256 TiB on the newest engine versions) and shrinks as you drop data. There is no gp3/io2 dial, only the cluster volume in one of two billing configurations.
The Cluster Model and Endpoints
A cluster has one writer and zero to fifteen readers, all sharing the volume; a reader is promoted to writer in seconds, and a new reader joins in about a minute with no data movement. Run at least one reader in a different AZ from the writer for cross-AZ resilience.
The cluster (writer) endpoint always points to the current writer; the reader endpoint round-robins across readers via DNS; custom endpoints group a subset of readers (e.g. analytics) for workload isolation. On failover, the cluster endpoint's DNS swings to the new writer.
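One way an application might apply this endpoint split is a small lookup that maps a workload label to the right endpoint. The sketch below is illustrative only: the hostnames are made-up placeholders in the usual Aurora endpoint shape, and `ClusterEndpoints` and `endpoint_for` are hypothetical names, not part of any AWS SDK.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ClusterEndpoints:
    writer: str                                 # cluster endpoint, follows the current writer
    reader: str                                 # reader endpoint, spreads load across readers
    custom: dict = field(default_factory=dict)  # workload label -> custom endpoint


# Placeholder hostnames; substitute the values shown for your own cluster.
ENDPOINTS = ClusterEndpoints(
    writer="example.cluster-abc123.us-east-1.rds.amazonaws.com",
    reader="example.cluster-ro-abc123.us-east-1.rds.amazonaws.com",
    custom={
        "analytics": "analytics.cluster-custom-abc123.us-east-1.rds.amazonaws.com",
    },
)


def endpoint_for(endpoints, workload):
    """Pick the endpoint for a workload label: 'write', 'read', or a custom group."""
    if workload in endpoints.custom:
        return endpoints.custom[workload]
    if workload == "write":
        return endpoints.writer
    if workload == "read":
        return endpoints.reader
    raise ValueError(f"unknown workload: {workload!r}")


if __name__ == "__main__":
    for label in ("write", "read", "analytics"):
        print(label, "->", endpoint_for(ENDPOINTS, label))
```

Because the application only ever holds these three kinds of endpoint, it never pins itself to an individual instance, so failover and reader changes need no configuration change.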
Serverless v2 and Global Database
Aurora Serverless v2 scales compute in place in fine-grained steps of Aurora Capacity Units (ACUs), within seconds and without dropping connections. It suits unpredictable, multi-tenant, or idle-most-of-the-time workloads. A common pattern pairs a provisioned writer with Serverless v2 readers that absorb spikes. At sustained full utilization it costs more per hour than an equivalent provisioned instance.
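The cost trade-off can be made concrete with a rough comparison of a day's ACU usage against a flat provisioned rate. The prices below are invented placeholders, not AWS prices, and the assumption that the provisioned instance is sized for the 8-ACU peak is ours; plug in figures from your own bill before drawing conclusions.

```python
# Placeholder prices for illustration only; these are NOT real AWS prices.
ACU_HOUR_PRICE = 0.12          # hypothetical cost of one ACU for one hour
PROVISIONED_HOUR_PRICE = 0.50  # hypothetical hourly cost of an instance sized for the peak


def serverless_cost(acu_per_hour, acu_hour_price=ACU_HOUR_PRICE):
    """Cost of a Serverless v2 instance given the ACUs consumed in each hour."""
    return sum(acu_per_hour) * acu_hour_price


def provisioned_cost(hours, hour_price=PROVISIONED_HOUR_PRICE):
    """Cost of a provisioned instance that runs for the whole period."""
    return hours * hour_price


def cheaper_option(acu_per_hour):
    """Compare both options over the same hourly usage trace."""
    serverless = serverless_cost(acu_per_hour)
    provisioned = provisioned_cost(len(acu_per_hour))
    return "serverless" if serverless < provisioned else "provisioned"


# Two hypothetical 24-hour traces: mostly idle with a 4-hour spike, and flat out.
MOSTLY_IDLE = [0.5] * 20 + [8.0] * 4
STEADY_FULL = [8.0] * 24

if __name__ == "__main__":
    print("mostly idle:", cheaper_option(MOSTLY_IDLE))
    print("steady full:", cheaper_option(STEADY_FULL))
```

Under these placeholder numbers the idle-heavy trace favors Serverless v2 and the flat-out trace favors the provisioned instance, which is the pattern the paragraph describes.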
Aurora Global Database replicates to up to ten secondary Regions with sub-second lag through the storage layer, serving local reads and giving a roughly one-minute cross-Region failover. It is for DR and distant readers — not active-active multi-Region writes.
Backtrack, Backups, and Pricing
Backups are continuous, with no backup window, and support point-in-time recovery to any second within a retention period of 1 to 35 days. Backtrack (Aurora MySQL only) rewinds the existing cluster in place by up to 72 hours, in seconds. It is the fastest recovery after an UPDATE without a WHERE, and it complements snapshots rather than replacing them.
Pick one of two storage configurations: Aurora Standard (lower base price plus per-I/O charges; best when I/O is under ~25% of the bill) or Aurora I/O-Optimized (no per-I/O charge, higher base rates). Start on Standard, watch the I/O line for a month, and switch if I/O dominates.
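The "measure first, then switch" rule reduces to one ratio. The sketch below assumes you can read the I/O charges and the total Aurora charges for a month off your bill; the function names are hypothetical, and the 25% threshold is the rule of thumb from the paragraph above rather than an exact break-even.

```python
def io_share(io_charges, total_aurora_charges):
    """Fraction of the Aurora bill that comes from per-I/O charges."""
    if total_aurora_charges <= 0:
        raise ValueError("total_aurora_charges must be positive")
    return io_charges / total_aurora_charges


def suggest_storage_config(io_charges, total_aurora_charges, threshold=0.25):
    """Suggest a storage configuration from one month of billing data."""
    share = io_share(io_charges, total_aurora_charges)
    return "I/O-Optimized" if share > threshold else "Standard"


if __name__ == "__main__":
    # Hypothetical monthly figures in the same currency unit.
    print(suggest_storage_config(io_charges=100, total_aurora_charges=1000))
    print(suggest_storage_config(io_charges=400, total_aurora_charges=1000))
```

A 10% I/O share suggests staying on Standard, while a 40% share suggests I/O-Optimized. Treat the result as a prompt to price both options, not as a final answer.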
When to Choose Aurora or RDS
Aurora: production workloads that need fast failover, read scaling, cross-Region DR, or higher throughput, on MySQL or PostgreSQL only.
RDS: smaller or dev workloads, other engines (Oracle, SQL Server, MariaDB, Db2), and the cheapest entry point. A db.t4g.micro RDS instance is far cheaper than a small Aurora cluster.
Common Mistakes
- Running an Aurora cluster with only a writer and no reader. A writer failure then forces a brand-new instance launch (minutes) instead of a reader promotion (seconds).
- Putting writer and readers in the same AZ, losing the cross-AZ resilience the architecture is designed to give.
- Using instance endpoints for read traffic instead of the reader endpoint, which follows cluster membership as instances change.
- Running Aurora Serverless v2 at steady full load where a provisioned Reserved Instance would be cheaper.
- Choosing Aurora for a tiny dev database when plain RDS on a db.t4g.micro is a fraction of the cost.
- Picking the storage configuration blind instead of measuring the I/O share of the bill first.
Best Practices
- Run at least one reader in a different AZ from the writer.
- Use the reader endpoint for read traffic, not instance endpoints.
- Turn on Backtrack for Aurora MySQL clusters as cheap insurance against operator mistakes.
- Use Serverless v2 readers behind a provisioned writer for spiky load.
- Use Aurora Global Database for cross-Region DR rather than snapshot copy.
- Choose the storage configuration by measuring your I/O share over a month; the switch is reversible (once per 30 days).
Knowledge Check
How does Aurora achieve seconds-not-minutes failover compared with classic RDS?
- Compute is separated from a shared distributed cluster volume, so a reader is promoted and the endpoint swings without moving any data
- It keeps a full synchronous copy of the engine and its warm buffer pool running on every instance in the cluster, ready to take over instantly
- It restores the database from the latest automated snapshot onto a freshly launched instance on each failover
- It runs the writer and the warm standby together on the same physical host for fast handover
How many copies of each data block does the Aurora cluster volume keep, and across how many AZs?
- Six copies across three Availability Zones — two per AZ
- Three copies spread across three Availability Zones — one per AZ
- Two mirrored copies kept inside a single Availability Zone
- One primary copy, replicated to S3 on a nightly schedule
When is Aurora Serverless v2 a poor fit?
- Steady, high, predictable load where a provisioned Reserved Instance is cheaper per hour
- Multi-tenant SaaS platforms with unpredictable, bursty per-tenant query load throughout the day
- Dev and test environments that sit idle for most of the working day
- Read replicas that must absorb occasional sharp traffic spikes
What does Aurora Backtrack do?
- Rewinds the existing Aurora MySQL cluster's storage to a recent point in time, in seconds, without creating a new cluster
- Continuously replicates the entire cluster storage volume to a second AWS Region to provide cross-Region disaster recovery
- Permanently deletes old automated snapshots on a schedule to save on storage cost
- Forwards writes issued in a secondary Region back to the primary writer instance
Aurora Global Database is the right tool for which need?
- Cross-Region disaster recovery and low-latency reads for distant users — not active-active multi-Region writes
- Accepting fully active-active read and write traffic concurrently across every attached secondary Region worldwide
- Sharding incoming writes across many independent writer instances within a single Region
- Caching frequent query results in memory to cut read latency