Chapter 2: Images
Topic 10

Pulling Images and the Local Store


Before a container can run, its image has to exist in the daemon's local store. docker pull fetches an image from a registry layer by layer, skipping any layer already present, and docker run pulls implicitly when the image is missing. The store is just a local cache of content addressed by digest — and once you see it that way, three things stop being mysterious: why the second pull is instant, why disk fills up quietly, and where "image not found" actually fails.

None of this is exotic. A pull is an HTTP fetch of a manifest followed by the blobs it names that you don't have; the store is a directory on the host (/var/lib/docker by default on Linux); resolution is a default-registry lookup unless you say otherwise. The failures all trace back to one of those three pieces.

Pull, Layer by Layer

A pull starts by fetching the manifest, which lists every layer by digest. The daemon then downloads only the layer blobs it doesn't already have, checking each against its digest as it lands. Because base layers are shared across images, most pulls transfer far less than the image's nominal size — pull nginx:1.27-alpine after you already have another alpine-based image and the alpine base layer is reused rather than re-fetched.

docker pull nginx:1.27-alpine — shared base layers are skipped
$ docker pull nginx:1.27-alpine
1.27-alpine: Pulling from library/nginx
9b1d4a0e2233: Already exists
1c8f0277de44: Pull complete
60a0e9b3a1d2: Pull complete
Digest: sha256:7c3e1f...
Status: Downloaded newer image for nginx:1.27-alpine

The Already exists line is layer sharing in action: that blob was present from an earlier pull, so it costs nothing this time. This is also why the second pull of an image you already have completes in a fraction of a second — every layer is already in the store, and the daemon only confirms the manifest hasn't changed.
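The digest check on each downloaded layer can be illustrated outside Docker. The sketch below is not the daemon's code. It assumes GNU coreutils sha256sum is available, and verify_blob is a hypothetical helper name. It hashes a file and compares the result with a digest written in the "sha256:<hex>" form that manifests use.

```shell
# verify_blob FILE DIGEST   (hypothetical helper, for illustration only)
# DIGEST uses the registry's "sha256:<hex>" form.
# Returns 0 when the file's SHA-256 matches, non-zero otherwise.
verify_blob() {
  blob_file=$1
  expected=${2#sha256:}                                  # strip the algorithm prefix
  actual=$(sha256sum < "$blob_file" | cut -d ' ' -f 1)   # hex hash of the file contents
  [ "$actual" = "$expected" ]
}
```

A content-addressed store gets deduplication for free from this property: two blobs with the same digest are the same bytes, so the second one never needs to be fetched or stored again.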

What a pull actually does, step by step: fetch manifest → check local store → download missing layers → image ready.
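The "check local store" step above amounts to a set difference. Here is a minimal sketch of that decision, assuming both lists arrive as newline-separated layer IDs. missing_layers is a hypothetical helper, not a Docker command, and the IDs are the short ones from the sample pull output.

```shell
# missing_layers MANIFEST_LAYERS LOCAL_LAYERS   (hypothetical helper)
# Prints every layer named by the manifest that the local store lacks.
missing_layers() {
  manifest_layers=$1
  local_layers=$2
  printf '%s\n' "$manifest_layers" | while IFS= read -r digest; do
    [ -n "$digest" ] || continue
    # -x matches whole lines and -F treats the ID as a literal string
    if ! printf '%s\n' "$local_layers" | grep -qxF -- "$digest"; then
      printf '%s\n' "$digest"
    fi
  done
}
```

Called with the three layers of the nginx example as the manifest list and 9b1d4a0e2233 as the only local layer, it prints the two layers that showed Pull complete. An empty result corresponds to the near-instant second pull.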

The Local Store

Pulled images live in the daemon's storage area under /var/lib/docker, indexed by digest. docker images lists what's there; docker run uses a local copy directly without contacting the registry when the image is already present. The store is the daemon's, not your shell's — which is why images you pulled stay available across reboots and across every terminal on the host.

Treat it as a cache rather than a source of truth. The registry holds the canonical image; the store is a reclaimable local copy that can be deleted and re-pulled at any time. That framing is what makes pruning it safe.
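Because the store is a cache, it is often useful to ask whether a given image is already in it before doing anything that might touch the network. A small sketch follows, assuming docker image inspect exits non-zero for an image the daemon does not have. image_is_local is a hypothetical wrapper name.

```shell
# image_is_local IMAGE   (hypothetical wrapper)
# Succeeds only if the daemon already holds IMAGE in its local store.
# docker image inspect reads local metadata and does not contact a registry.
image_is_local() {
  docker image inspect "$1" >/dev/null 2>&1
}
```

For example, image_is_local nginx:1.27-alpine && echo cached reports a cache hit without pulling anything. To see the store's contents with digests, docker images --digests lists them.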

Implicit Pull on Run

A bare docker run postgres:16 pulls the image automatically if it's absent, then starts the container. Convenient — but it means the first run of an uncached image does network work, and a registry outage at that moment breaks what looks like a purely local command. The second run of the same image is fully local because the image is now in the store.

The gotcha is the "it ran fine yesterday" first deploy on a fresh host: the image isn't cached there, the implicit pull hits a flaky registry, and startup fails even though the same command worked on every machine that already had the image. Pre-pulling removes that surprise.
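One way to pre-pull is a warm-up step that retries a few times before giving up, so a brief registry blip doesn't fail the deploy. This is a sketch rather than a recommended policy: prepull, its default of three attempts, and the PREPULL_DELAY variable are all assumptions made for illustration.

```shell
# prepull IMAGE [ATTEMPTS]   (hypothetical helper)
# Retries docker pull up to ATTEMPTS times (default 3), waiting
# PREPULL_DELAY seconds (default 5) between failed attempts.
prepull() {
  image=$1
  attempts=${2:-3}
  delay=${PREPULL_DELAY:-5}
  n=1
  while [ "$n" -le "$attempts" ]; do
    if docker pull "$image"; then
      return 0
    fi
    echo "pull of $image failed (attempt $n of $attempts)" >&2
    # Only sleep when another attempt is still to come
    [ "$n" -lt "$attempts" ] && sleep "$delay"
    n=$((n + 1))
  done
  return 1
}
```

Running prepull postgres:16 during provisioning moves the network work to a point where failure is cheap. On Docker 20.10 and later, docker run --pull=never postgres:16 then makes the run step fail immediately if the image is absent instead of pulling implicitly.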

Registry Resolution and Auth

An unqualified name like postgres:16 resolves to Docker Hub by default, expanding to docker.io/library/postgres:16. A private image needs its registry host spelled out, as in registry.driftwood.io/web:1.4.0, and usually a prior docker login. Omit the host and Docker tries Docker Hub, fails to find the image there, and reports a confusing not-found that says nothing about whether the image exists in your own registry.

unqualified names resolve to Docker Hub; private images need the host and a login
$ docker pull postgres:16
# resolves to docker.io/library/postgres:16

$ docker pull registry.driftwood.io/web:1.4.0
Error: pull access denied ... no basic auth credentials
$ docker login registry.driftwood.io
$ docker pull registry.driftwood.io/web:1.4.0   # now succeeds

Two failure modes hide here and look alike. A missing registry host produces a not-found because Docker looked in the wrong place; a missing docker login produces an auth error that reads, to the unwary, like the image not existing. Spell out the host and authenticate first, and both disappear.
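The resolution rule is mechanical enough to write down. The sketch below is a simplified approximation of how Docker expands a name, not its actual implementation: the first path component counts as a registry host only if it contains a dot or a colon or is exactly localhost. normalize_ref is a hypothetical helper, and it leaves tags and digests untouched.

```shell
# normalize_ref NAME   (hypothetical helper, simplified rules)
# Prints the fully qualified form Docker would look up.
normalize_ref() {
  ref=$1
  case $ref in
    */*)
      first=${ref%%/*}                       # text before the first slash
      case $first in
        *.*|*:*|localhost) printf '%s\n' "$ref" ;;       # already names a registry host
        *) printf 'docker.io/%s\n' "$ref" ;;             # user/repo on Docker Hub
      esac
      ;;
    *) printf 'docker.io/library/%s\n' "$ref" ;;         # official image on Docker Hub
  esac
}
```

With these rules, postgres:16 expands to docker.io/library/postgres:16 and registry.driftwood.io/web:1.4.0 is left alone. A name such as web:1.4.0 with the host omitted lands on Docker Hub, which is exactly the wrong-place lookup described above.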

Disk Growth and Cleanup

Pulled images, dangling layers from rebuilt images, and old tags accumulate in the store silently; nothing cleans them up on its own. On a long-lived host the store grows until the disk fills and the daemon starts failing in ways that look unrelated to disk. docker image prune reclaims dangling images. docker system prune goes further, removing dangling images, stopped containers, unused networks, and build cache, and with -a it removes every image not used by a container as well.

On any host that runs for weeks, schedule one of these rather than waiting for a full disk to force the issue. The full operational treatment is in the operations chapter; the habit to start now is treating the store as something that needs periodic reclaiming.
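A scheduled cleanup can be as small as the sketch below, which prunes only when the filesystem holding the store passes a usage threshold. The helper name prune_store_if_full, the 80 percent default, and the 168-hour age filter are illustrative assumptions. The store path defaults to /var/lib/docker as described earlier.

```shell
# prune_store_if_full [STORE_DIR] [THRESHOLD_PERCENT]   (hypothetical helper)
# Removes dangling images older than 168 hours when the filesystem
# holding STORE_DIR is at or above THRESHOLD_PERCENT full.
prune_store_if_full() {
  store_dir=${1:-/var/lib/docker}
  threshold=${2:-80}
  # df -P prints one data line per filesystem; field 5 is the use percentage
  used=$(df -P "$store_dir" | awk 'NR == 2 { sub("%", "", $5); print $5 }')
  [ -n "$used" ] || return 1
  if [ "$used" -ge "$threshold" ]; then
    docker image prune --force --filter "until=168h"
  fi
}
```

Run from cron or a systemd timer, this keeps routine cleanup conservative. docker system df shows how much space images, containers, and build cache occupy, which helps when choosing the threshold.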

Common Mistakes
  • Assuming docker run is fully local and being broken by a registry outage on the first run of an uncached image — the implicit pull needs the network, so a flaky registry fails startup on a fresh host.
  • Letting the image store grow unbounded on a long-lived host until the disk fills and the daemon starts failing — pulled and dangling images need periodic pruning.
  • Omitting the registry host for a private image and watching Docker try Docker Hub, then fail with a confusing not-found — private images need a fully qualified name.
  • Forgetting docker login for a private registry and misreading the resulting auth failure as the image not existing — it exists; you're just unauthenticated.

Best Practices
  • Pre-pull required images before they're needed — in CI or a warm-up step — so a registry outage at run time doesn't break container startup.
  • Schedule docker image prune or docker system prune on long-lived hosts to keep the store from quietly filling the disk.
  • Always fully qualify private-registry image names and authenticate with docker login so resolution is unambiguous and auth failures don't masquerade as missing images.
  • Treat the local store as a cache, not a source of truth — the registry holds the canonical image, and the store is a reclaimable copy you can prune and re-pull freely.

Comparable Tools
  • Podman · containerd/nerdctl: maintain equivalent local image stores
  • skopeo copy: moves images between stores and registries without running them
  • npm cache · pip wheelhouse: the closest mental model for the local store

Knowledge Check

Why is the second pull of an image you already have nearly instant?

  • Every layer is already in the local store, so the daemon downloads nothing and only re-checks the manifest
  • The daemon quietly recompresses the cached layers in place, which is far faster than downloading them fresh
  • The registry keeps a dedicated, pre-warmed fast cache reserved specifically for repeat clients like you
  • The second pull skips the manifest check entirely as an optimization and simply assumes nothing has changed

When does docker run do network I/O?

  • Only when the image is missing from the local store, in which case it pulls implicitly before starting
  • Every single time, since it always re-validates the cached image against the upstream registry first
  • Only when the container publishes a port, which requires the daemon to coordinate with the registry
  • Never — run always uses only the local store, and pull is the single command that ever touches the network

An unqualified name like postgres:16 fails to pull from your private registry. Why?

  • Unqualified names resolve to Docker Hub by default, so Docker looks there instead of your private registry
  • Docker automatically searches through every configured registry in turn and then stops at the very first error it hits
  • The :16 tag is reserved exclusively by Docker Hub and cannot be reused in a private registry at all
  • A pull only succeeds while a container created from that same image is already up and running locally

Why does the local image store grow over time, and how is it reclaimed?

  • Pulled images, dangling layers, and old tags accumulate with no automatic cleanup, and docker image prune or docker system prune reclaims them
  • Docker deletes old and unused images automatically on a schedule, so the store never actually grows over time
  • It grows only from accumulated container stdout and stderr logs, which are then cleared out automatically the moment you restart the daemon
  • It can only ever be reclaimed by fully reinstalling Docker and then re-pulling every image from scratch
