Blob Storage
Service 09

Object

Blob Storage is Azure's object store: unstructured data — backups, media, logs, data-lake content, static website assets — addressed by URL and billed per gigabyte stored plus per operation. It scales to petabytes in a single account, and you never provision capacity; you write objects and pay for what is there.
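As a rough illustration of "addressed by URL", the sketch below assembles a blob's address on the public endpoint from an account, a container, and a blob name. The account, container, and blob names are hypothetical, and the helper `blob_url` is not part of any SDK.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the public-endpoint URL for a blob.

    The blob name is percent-encoded, but '/' is kept so that
    folder-style names stay readable in the URL.
    """
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{quote(container)}/{quote(blob_name, safe='/')}"
    )


# Hypothetical names, for illustration only.
print(blob_url("examplestore", "backups", "2024/db dump.bak"))
```

Knowing the URL does not grant access on its own; the request still has to be authorized.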

Object storage is not a file system and not a disk. There is no in-place edit, no hierarchy in the flat model, and latency is measured in tens of milliseconds, not microseconds. Used for what it is — durable, cheap, massively scalable storage of whole objects — it is the backbone of most Azure architectures.
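In the flat model, "folders" are only a naming convention: a blob called logs/2024/app.log is one object whose name happens to contain slashes. The sketch below is a minimal in-memory emulation of listing by prefix and delimiter, which is how a hierarchy is usually presented over flat names. The function `list_by_prefix` and the sample names are hypothetical, not an SDK call.

```python
def list_by_prefix(blob_names, prefix="", delimiter="/"):
    """Emulate a hierarchical listing over a flat set of blob names.

    Returns (blobs, virtual_dirs): the blobs directly under `prefix`
    and the virtual directories one level below it.
    """
    blobs, dirs = [], set()
    for name in sorted(blob_names):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts as a "folder".
            dirs.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            blobs.append(name)
    return blobs, sorted(dirs)


names = [
    "logs/2024/01/app.log",
    "logs/2024/02/app.log",
    "logs/readme.txt",
    "media/intro.mp4",
]
print(list_by_prefix(names))           # top level: only virtual directories
print(list_by_prefix(names, "logs/"))  # one blob and one virtual directory
```

Because the hierarchy exists only in the names, "renaming a folder" in this model would mean rewriting every object under that prefix.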

Blob Types

There are three blob types, fixed at creation. Block blobs hold files and stream data, and are the default for almost everything. Append blobs are optimized for append-only writes such as logging. Page blobs back the random-access disk images behind unmanaged VM disks. Because the type cannot be changed afterwards, choosing the wrong one means rewriting the data into a new blob, so default to block blobs unless you have a specific reason not to.
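The decision rule above is small enough to write down. The sketch below encodes it as a function; the `choose_blob_type` helper and its two flags are hypothetical, and the enum values are the type names as they appear in the storage REST API.

```python
from enum import Enum


class BlobType(Enum):
    BLOCK = "BlockBlob"
    APPEND = "AppendBlob"
    PAGE = "PageBlob"


def choose_blob_type(append_only: bool = False,
                     random_access_pages: bool = False) -> BlobType:
    """Pick a blob type; block blob is the default."""
    if random_access_pages:
        return BlobType.PAGE    # disk-image style random reads and writes
    if append_only:
        return BlobType.APPEND  # log-style writes that only ever add to the end
    return BlobType.BLOCK       # files, media, backups, almost everything else


print(choose_blob_type().value)
print(choose_blob_type(append_only=True).value)
```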

Access Tiers

Each blob sits in an access tier that trades storage price against access price. Hot is cheapest to read and most expensive to store. Archive is the reverse: pennies to keep, but reading it requires a rehydration that takes hours and is billed per gigabyte. Cool and Cold sit between, for data accessed rarely but needed quickly.

The trap is Archive for data you actually read. Retrieving an archived blob means a rehydration of up to 15 hours (standard priority) before the data is even available, plus a retrieval charge. Archive is for compliance copies and cold backups, not for anything on a request path.

Tier     | Store cost  | Access cost               | Use for
Hot      | Highest     | Lowest                    | Active data, frequent reads
Cool     | Lower       | Higher                    | Infrequent, ~30-day data
Cold     | Lower still | Higher still              | Rarely accessed, ~90-day data
Archive  | Lowest      | Highest + rehydrate delay | Compliance, cold backups

Figure: Access tiers. Hotter costs more to store, colder costs more to read.
  • Hot: instant access · highest storage cost · active data on a request path
  • Cool: instant access · ~30-day minimum · infrequently accessed data
  • Cold: instant access · ~90-day minimum · rarely accessed, long-term data
  • Archive: offline · rehydration up to 15 hours · compliance and cold backups only
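
The storage-versus-access trade-off can be made concrete with a small cost model. The prices below are invented placeholders chosen only to preserve the ordering described above (colder is cheaper to store, dearer to read); they are not Azure's price list, and `monthly_cost` and `cheapest_online_tier` are hypothetical helpers. Archive is left out of the comparison for request-path data because it is offline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPrice:
    store_per_gb_month: float
    read_per_gb: float


# HYPOTHETICAL prices, for illustration only. Not real Azure pricing.
PRICES = {
    "Hot": TierPrice(0.020, 0.000),
    "Cool": TierPrice(0.010, 0.010),
    "Cold": TierPrice(0.005, 0.030),
    "Archive": TierPrice(0.002, 0.050),
}


def monthly_cost(tier: str, stored_gb: float, read_gb: float) -> float:
    """Storage charge plus per-GB read charge for one month."""
    price = PRICES[tier]
    return stored_gb * price.store_per_gb_month + read_gb * price.read_per_gb


def cheapest_online_tier(stored_gb: float, read_gb: float) -> str:
    """Cheapest tier among those readable without rehydration."""
    online = ("Hot", "Cool", "Cold")
    return min(online, key=lambda tier: monthly_cost(tier, stored_gb, read_gb))


# 1000 GB stored, with increasing monthly read volume.
for read_gb in (0, 500, 2000):
    print(read_gb, cheapest_online_tier(1000, read_gb))
```

Under these placeholder numbers the answer moves from Cold to Cool to Hot as read volume grows, which is the pattern the tiers are designed around. A real comparison would also need operation charges and the minimum-retention periods.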

Redundancy

Redundancy sets how many copies exist and where. LRS keeps three copies in one datacenter — cheapest, and lost if that datacenter is lost. ZRS spreads copies across availability zones in a region. GRS and GZRS replicate to a paired region hundreds of miles away for regional-disaster durability, and the read-access variants (RA-GRS, RA-GZRS) let you read the secondary copy directly.

Match redundancy to the data's value, not by reflex. LRS for easily regenerated data is fine; GZRS for the only copy of irreplaceable records is the point. Paying for GRS on transient scratch data is waste, and running LRS under irreplaceable data is a future incident.
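
One way to avoid choosing by reflex is to state the requirements and take the first option that meets them. The sketch below does that over a small table of options; the `pick_redundancy` helper, its flags, and the "roughly cheapest first" ordering are assumptions made for illustration, so check current pricing before relying on the order.

```python
from typing import NamedTuple


class Redundancy(NamedTuple):
    name: str
    zone_redundant: bool      # primary copies spread across availability zones
    geo_replicated: bool      # copies also kept in the paired region
    readable_secondary: bool  # secondary copy can be read directly


# Assumed to be ordered roughly cheapest first.
OPTIONS = [
    Redundancy("LRS", False, False, False),
    Redundancy("ZRS", True, False, False),
    Redundancy("GRS", False, True, False),
    Redundancy("RA-GRS", False, True, True),
    Redundancy("GZRS", True, True, False),
    Redundancy("RA-GZRS", True, True, True),
]


def pick_redundancy(irreplaceable: bool,
                    need_region_durability: bool = False,
                    need_secondary_reads: bool = False) -> str:
    """Return the first (assumed cheapest) option meeting the requirements."""
    for option in OPTIONS:
        if irreplaceable and not option.zone_redundant:
            continue
        if need_region_durability and not option.geo_replicated:
            continue
        if need_secondary_reads and not option.readable_secondary:
            continue
        return option.name
    raise ValueError("no redundancy option satisfies the requirements")


print(pick_redundancy(irreplaceable=False))                              # scratch data
print(pick_redundancy(irreplaceable=True, need_region_durability=True))  # only copy
```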

Lifecycle and Immutability

Lifecycle management rules move or delete blobs automatically by age — tier to Cool after 30 days, to Archive after 90, delete after a year — which is how object-storage bills are kept from growing forever. Immutability policies (WORM) lock blobs against modification or deletion for a retention period, for regulatory and ransomware-resilience requirements.
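
The 30/90/365-day example can be written as a lifecycle policy document. The sketch below builds one as a Python dictionary and prints it as JSON, following the rule layout used by Azure lifecycle management policies as I understand it; the rule name and the prefix are hypothetical, and the output should be validated against the current policy schema before it is applied to an account.

```python
import json


def lifecycle_policy(prefix: str, cool_after: int = 30,
                     archive_after: int = 90, delete_after: int = 365) -> dict:
    """Build a policy that tiers block blobs by age and then deletes them."""
    if not 0 <= cool_after < archive_after < delete_after:
        raise ValueError("expected cool_after < archive_after < delete_after")
    return {
        "rules": [{
            "enabled": True,
            "name": "ageout",  # hypothetical rule name
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    "prefixMatch": [prefix],
                },
                "actions": {
                    "baseBlob": {
                        "tierToCool": {
                            "daysAfterModificationGreaterThan": cool_after},
                        "tierToArchive": {
                            "daysAfterModificationGreaterThan": archive_after},
                        "delete": {
                            "daysAfterModificationGreaterThan": delete_after},
                    }
                },
            },
        }]
    }


# "logs/app" is a hypothetical container/prefix.
print(json.dumps(lifecycle_policy("logs/app"), indent=2))
```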

Security and Access

Access is by Entra ID identity (the preferred model, with RBAC), by account key (powerful and to be avoided), or by a shared access signature — a signed, scoped, time-limited URL. Private endpoints keep blob traffic on the VNet and off the public internet. The default should be Entra plus private endpoints; SAS for the narrow cases that need a temporary public URL.
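
To show why a shared access signature can be handed out without exposing the key, the sketch below implements the general idea: sign the resource, the permissions, and the expiry with HMAC-SHA256, then verify all three on the way in. This is a conceptual model only. The query parameter names (`perm`, `exp`, `sig`) and the string-to-sign layout are invented for illustration and are not Azure's actual SAS format; real tokens should be generated by the SDK or the portal.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(base_url: str, permissions: str, expires_at: int, key: bytes) -> str:
    # Hypothetical string-to-sign; the real SAS layout is different.
    string_to_sign = f"{base_url}\n{permissions}\n{expires_at}"
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def sign_url(base_url: str, key: bytes, permissions: str, expires_at: int) -> str:
    """Return a scoped, time-limited URL; the key itself never appears in it."""
    sig = _signature(base_url, permissions, expires_at, key)
    query = urlencode({"perm": permissions, "exp": expires_at, "sig": sig})
    return f"{base_url}?{query}"


def verify_url(url: str, key: bytes, needed_permission: str, now=None) -> bool:
    """Check signature, expiry, and scope, as the service side would."""
    now = time.time() if now is None else now
    parts = urlparse(url)
    query = parse_qs(parts.query)
    try:
        perm, exp, sig = query["perm"][0], int(query["exp"][0]), query["sig"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    expected = _signature(base_url, perm, exp, key)
    return (hmac.compare_digest(sig, expected)
            and now < exp
            and needed_permission in perm)


# Hypothetical key and blob URL, for illustration only.
key = b"not-a-real-account-key"
url = sign_url("https://examplestore.blob.core.windows.net/backups/a.bak",
               key, permissions="r", expires_at=int(time.time()) + 3600)
print(verify_url(url, key, "r"))  # read is in scope and the URL has not expired
print(verify_url(url, key, "w"))  # write was never granted
```

Changing the expiry or the permissions in the URL invalidates the signature, which is what makes the scope and time limit enforceable.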

Hot vs Cool vs Cold vs Archive

Hot — Frequent access, lowest read cost, highest storage cost. Active data on a request path.

Cool / Cold — Infrequent access at ~30/90-day cadence. Lower storage, higher per-read; available instantly.

Archive — Lowest storage cost, but offline — rehydration takes hours and costs per GB. Compliance and cold backups only.

Common Mistakes
  • Putting data on the Archive tier that gets read on a request path — rehydration can take up to 15 hours and bills a per-GB retrieval charge.
  • Running LRS under the only copy of irreplaceable data — a single datacenter loss destroys it; that is what ZRS or GZRS is for.
  • Paying for GRS on transient or easily regenerated data, roughly doubling the storage bill for durability you do not need.
  • Choosing an append or page blob by accident when a block blob was meant — the type is fixed at creation.
  • Leaving account keys in use and in config instead of Entra ID with RBAC, so a leaked key grants full account access.
  • Never setting a lifecycle policy, so cold data accumulates on the Hot tier and the bill grows without bound.

Best Practices
  • Use block blobs unless you have a specific append-only or page-blob need.
  • Set lifecycle rules to tier aging data to Cool/Cold and Archive, and to delete what expires.
  • Match redundancy to value — ZRS or GZRS for irreplaceable data, LRS for regenerable data.
  • Reserve Archive for compliance and cold backups; never put archived data on a read path.
  • Access blobs via Entra ID with RBAC and private endpoints; avoid account keys, and scope SAS tokens tightly when needed.
  • Enable immutability (WORM) for regulated data and as ransomware resilience.

Comparable services: AWS S3, GCP Cloud Storage

Knowledge Check

Why is the Archive tier the wrong place for data on a request path?

  • Archived blobs are offline — rehydration can take up to 15 hours and adds a per-GB retrieval charge before the data is readable
  • Archive carries the highest per-GB storage cost of every access tier, which is exactly why it is reserved for hot request paths
  • Archived blobs cannot be encrypted at rest by the platform
  • Archive accepts only append blobs, never block blobs

Which redundancy option protects against the loss of an entire region?

  • GRS or GZRS, which replicate to a paired region
  • LRS, which keeps three copies in one datacenter
  • ZRS, which spreads copies across zones in one region
  • None — Blob Storage is always single-region

What is the purpose of a shared access signature (SAS)?

  • A signed, scoped, time-limited URL granting access without sharing the account key
  • A way to permanently make a blob publicly readable to anyone on the internet without expiry
  • A replication policy that copies blobs between two regions
  • A customer-managed encryption key stored in the blob metadata
