Service 17

Amazon Aurora

Database · Relational · Cloud-Native

Amazon Aurora is AWS's own relational engine, wire-compatible with MySQL and PostgreSQL so existing drivers and tools work unchanged. The familiar SQL hides a very different architecture: compute is separated from a distributed storage layer that AWS built specifically for the cloud. AWS positions it inside RDS — same console, billing, and API — so the shared management layer behaves exactly as in RDS.

The headline differences from RDS are a shared distributed cluster volume, near-zero-lag readers, seconds-not-minutes failover, and Aurora-only features like Serverless v2, Global Database, and Backtrack. AWS claims up to 5× MySQL and 3× PostgreSQL throughput on the same hardware.

The Distributed Storage Layer

Aurora replaces the per-instance disk with a distributed storage service. The cluster volume keeps six copies of every block across three Availability Zones, two per AZ. A write needs four of six copies to acknowledge; a read needs three. Because 4 + 3 > 6, every read quorum overlaps the most recent write quorum. The cluster tolerates losing one whole AZ (two copies) without losing reads or writes, and losing an AZ plus one more copy without losing reads.

Figure: Aurora Cluster Volume, 6 Copies Across 3 AZs. The cluster volume is shared by the writer and up to 15 readers. Write quorum: 4 of 6 copies must ACK · Read quorum: 3 of 6 respond. AZ-a holds copies 1 and 2, AZ-b holds copies 3 and 4, AZ-c holds copies 5 and 6. Lose one whole AZ (2 copies) and 4 remain: both quorums are still met, so reads and writes continue.

Storage is separate from compute, so adding or replacing an instance moves no data: the instance attaches to the shared volume in seconds. The volume auto-grows in 10 GiB steps up to 128 TiB (256 TiB on the newest engine versions) and shrinks as you drop data. There is no gp3/io2 dial to tune, only the cluster volume in one of two billing configurations.

The Cluster Model and Endpoints

A cluster has one writer and zero to fifteen readers, all sharing the volume; a reader is promoted to writer in seconds, and a new reader joins in about a minute with no data movement. Run at least one reader in a different AZ from the writer for cross-AZ resilience.

The cluster (writer) endpoint always points to the current writer; the reader endpoint round-robins across readers via DNS; custom endpoints group a subset of readers (e.g. analytics) for workload isolation. On failover, the cluster endpoint's DNS swings to the new writer.
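The routing rule above can be sketched as a small helper. It assumes the cluster description has the shape of one entry in the DBClusters list returned by boto3's rds.describe_db_clusters() (keys Endpoint, ReaderEndpoint, CustomEndpoints). The function name pick_endpoint and every hostname in SAMPLE are made up for illustration.

```python
def pick_endpoint(cluster, read_only, custom_prefix=None):
    """Choose a hostname for a new connection from a cluster description."""
    if not read_only:
        # Writes always go to the cluster (writer) endpoint.
        return cluster["Endpoint"]
    if custom_prefix is not None:
        # A custom endpoint isolates one workload on a subset of readers.
        for host in cluster.get("CustomEndpoints", []):
            if host.startswith(custom_prefix + "."):
                return host
    # Default for reads: the reader endpoint, which spreads connections
    # across the current readers.
    return cluster["ReaderEndpoint"]


# Hypothetical cluster description; the identifiers are not real.
SAMPLE = {
    "Endpoint": "shop.cluster-abc123example.us-east-1.rds.amazonaws.com",
    "ReaderEndpoint": "shop.cluster-ro-abc123example.us-east-1.rds.amazonaws.com",
    "CustomEndpoints": [
        "analytics.cluster-custom-abc123example.us-east-1.rds.amazonaws.com"
    ],
}

print(pick_endpoint(SAMPLE, read_only=False))
print(pick_endpoint(SAMPLE, read_only=True))
print(pick_endpoint(SAMPLE, read_only=True, custom_prefix="analytics"))
```

Because failover is a DNS change on the cluster endpoint, an application that resolves the hostname once and caches the address for a long time may keep talking to the old writer; keeping client-side DNS caching short is a common precaution.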

Serverless v2 and Global Database

Aurora Serverless v2 scales compute in fine ACU steps in place, in seconds, without dropping connections — good for unpredictable, multi-tenant, or idle-most-of-the-time workloads. A common pattern pairs a provisioned writer with Serverless v2 readers that absorb spikes. At full utilization it costs more per hour than an equivalent provisioned instance.
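The cost trade-off in the last sentence can be turned into a rough break-even estimate. The sketch below assumes the provisioned instance is sized to match the Serverless v2 maximum capacity and that the serverless bill is simply average ACUs times a per-ACU hourly price. The prices are placeholders, not real AWS rates; substitute the figures for your Region and engine.

```python
def break_even_utilization(max_acu, acu_price_per_hour, provisioned_price_per_hour):
    """Average fraction of max ACUs at which both options cost the same."""
    full_load_cost = max_acu * acu_price_per_hour
    return provisioned_price_per_hour / full_load_cost


# Placeholder prices, NOT real AWS rates.
ratio = break_even_utilization(
    max_acu=16,
    acu_price_per_hour=0.12,
    provisioned_price_per_hour=1.00,
)
print(f"Serverless v2 is cheaper below {ratio:.0%} average utilization")
```

With these made-up numbers the line prints 52%: a workload that averages less than about half of its peak capacity would favor Serverless v2, while one that sits near the peak would favor provisioned capacity.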

Aurora Global Database replicates to up to ten secondary Regions with sub-second lag through the storage layer, serving local reads and giving a roughly one-minute cross-Region failover. It is for DR and distant readers — not active-active multi-Region writes.

Backtrack, Backups, and Pricing

Backups are continuous with no backup window and point-in-time recovery to any second in 1–35 days. Backtrack (Aurora MySQL only) rewinds the existing cluster up to 72 hours in seconds — the fastest recovery after an UPDATE without a WHERE, complementing snapshots rather than replacing them.
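A Backtrack request needs a target timestamp just before the bad statement ran. The helper below is a hypothetical sketch that computes that timestamp and rejects targets outside the 72-hour limit mentioned above. A real cluster may be configured with a shorter Backtrack window, which this sketch does not know about; the 30-second safety margin is an arbitrary assumption.

```python
from datetime import datetime, timedelta, timezone

MAX_BACKTRACK = timedelta(hours=72)  # upper limit stated in the text


def backtrack_target(now, bad_statement_at, margin=timedelta(seconds=30)):
    """Return the timestamp to rewind to: shortly before the bad statement."""
    target = bad_statement_at - margin
    if target > now:
        raise ValueError("target is in the future")
    if now - target > MAX_BACKTRACK:
        raise ValueError("outside the 72-hour limit; use point-in-time recovery")
    return target


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
bad = datetime(2024, 1, 10, 11, 45, tzinfo=timezone.utc)
target = backtrack_target(now, bad)
print(target.isoformat())

# With boto3 the rewind itself would be requested roughly like this
# ("my-cluster" is a placeholder identifier):
#   rds.backtrack_db_cluster(DBClusterIdentifier="my-cluster", BacktrackTo=target)
```

Backtrack rewinds the whole cluster, so any writes made after the target time are lost as well. When only one table is damaged, restoring to a new cluster with point-in-time recovery and copying the rows back may be the safer route.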

Pick one of two storage configurations: Aurora Standard (lower base price plus per-I/O charges; best when I/O is under ~25% of the bill) or Aurora I/O-Optimized (no per-I/O charge, higher base rates). Start on Standard, watch the I/O line for a month, and switch if I/O dominates.
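The "measure first" advice reduces to one ratio. This sketch applies the roughly 25% rule of thumb from the paragraph above to a month of billing figures; the function name and the dollar amounts are illustrative, and the threshold is a guideline rather than a hard cutoff.

```python
IO_SHARE_THRESHOLD = 0.25  # rule of thumb from the text


def recommend_storage_config(io_charges, total_aurora_bill):
    """Suggest a storage configuration from one month of billing figures."""
    if total_aurora_bill <= 0:
        raise ValueError("total bill must be positive")
    io_share = io_charges / total_aurora_bill
    choice = "I/O-Optimized" if io_share > IO_SHARE_THRESHOLD else "Standard"
    return choice, io_share


# Made-up monthly figures: I/O charges and total Aurora spend.
light_io = recommend_storage_config(io_charges=100.0, total_aurora_bill=1000.0)
heavy_io = recommend_storage_config(io_charges=400.0, total_aurora_bill=1000.0)
print(light_io, heavy_io)
```

Here total_aurora_bill is meant to cover compute, storage, and I/O for the cluster together, so the share reflects how much of the whole bill the per-I/O charges represent.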

Aurora vs RDS

Aurora — production workloads needing fast failover, read scaling, cross-Region DR, or higher throughput — on MySQL/PostgreSQL only.

RDS — smaller or dev workloads, other engines (Oracle, SQL Server, MariaDB, Db2), and the cheapest entry point. A db.t4g.micro RDS instance is far cheaper than a small Aurora cluster.

Common Mistakes

Running an Aurora cluster with only a writer and no reader — a writer failure then forces a brand-new instance launch (minutes) instead of a reader promotion (seconds).
Putting writer and readers in the same AZ, losing the cross-AZ resilience the architecture is designed to give.
Using instance endpoints for read traffic instead of the reader endpoint; only the reader endpoint follows cluster membership as instances change.
Running Aurora Serverless v2 at steady full load where a provisioned Reserved Instance would be cheaper.
Choosing Aurora for a tiny dev database when plain RDS on a db.t4g.micro is a fraction of the cost.
Picking the storage configuration blind instead of measuring the I/O share of the bill first.
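The first two mistakes in this list are easy to detect mechanically. The sketch below uses a hypothetical Instance record (name, role, AZ) rather than any AWS response format; in practice those fields would have to be gathered from the cluster and instance descriptions.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    role: str  # "writer" or "reader"
    az: str    # Availability Zone name


def audit_cluster(instances):
    """Return a list of topology warnings for one cluster."""
    findings = []
    writers = [i for i in instances if i.role == "writer"]
    readers = [i for i in instances if i.role == "reader"]
    if not readers:
        findings.append("no reader: a writer failure forces a new instance launch")
    elif writers and all(r.az == writers[0].az for r in readers):
        findings.append("all readers share the writer's AZ: no cross-AZ resilience")
    return findings


# Illustrative topologies; instance and AZ names are made up.
writer_only = [Instance("db-1", "writer", "us-east-1a")]
same_az = writer_only + [Instance("db-2", "reader", "us-east-1a")]
healthy = writer_only + [Instance("db-2", "reader", "us-east-1b")]

print(audit_cluster(writer_only))
print(audit_cluster(same_az))
print(audit_cluster(healthy))
```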

Best Practices

Run at least one reader in a different AZ from the writer.
Use the reader endpoint for read traffic, not instance endpoints.
Turn on Backtrack for Aurora MySQL clusters as cheap insurance against operator mistakes.
Use Serverless v2 readers behind a provisioned writer for spiky load.
Use Aurora Global Database for cross-Region DR rather than snapshot copy.
Choose the storage configuration by measuring your I/O share over a month; the choice is reversible, though a switch to I/O-Optimized is allowed only once every 30 days.

Comparable services · GCP: AlloyDB, Cloud Spanner · Azure: Azure Database for PostgreSQL/MySQL (Flexible Server)

Knowledge Check

How does Aurora achieve seconds-not-minutes failover compared with classic RDS?

Compute is separated from a shared distributed cluster volume, so a reader is promoted and the endpoint swings without moving any data
It keeps a full synchronous copy of the engine and its warm buffer pool running on every instance in the cluster, ready to take over instantly
It restores the database from the latest automated snapshot onto a freshly launched instance on each failover
It runs the writer and the warm standby together on the same physical host for fast handover

How many copies of each data block does the Aurora cluster volume keep, and across how many AZs?

Six copies across three Availability Zones — two per AZ
Three copies spread across three Availability Zones — one per AZ
Two mirrored copies kept inside a single Availability Zone
One primary copy, replicated to S3 on a nightly schedule

When is Aurora Serverless v2 a poor fit?

Steady, high, predictable load where a provisioned Reserved Instance is cheaper per hour
Multi-tenant SaaS platforms with unpredictable, bursty per-tenant query load throughout the day
Dev and test environments that sit idle for most of the working day
Read replicas that must absorb occasional sharp traffic spikes

What does Aurora Backtrack do?

Rewinds the existing Aurora MySQL cluster's storage to a recent point in time, in seconds, without creating a new cluster
Continuously replicates the entire cluster storage volume to a second AWS Region to provide cross-Region disaster recovery
Permanently deletes old automated snapshots on a schedule to save on storage cost
Forwards writes issued in a secondary Region back to the primary writer instance

Aurora Global Database is the right tool for which need?

Cross-Region disaster recovery and low-latency reads for distant users — not active-active multi-Region writes
Accepting fully active-active read and write traffic concurrently across every attached secondary Region worldwide
Sharding incoming writes across many independent writer instances within a single Region
Caching frequent query results in memory to cut read latency
