Shrinking Images
A 1.1 GB image is slow to pull, slow to start, and a large surface to patch; the production Driftwood image is 180 MB doing the same job. That difference comes from four levers pulled together: pick the right base, do install-and-cleanup in one RUN, use a multi-stage build to drop the toolchain, and stop adding layers that carry nothing.
Measure the result rather than guessing at it. Both docker history and dive show exactly which instruction owns the bytes, so you can target the 400 MB layer instead of shaving the 2 MB one that happened to be easiest to see.
The Right Base
Choosing python:3.12-slim (~120 MB) over python:3.12 (~1 GB) sets the floor for everything stacked above it, and distroless or scratch drop it further where the app allows. The base-image spectrum from Chapter 2 topic 12 — slim, alpine, distroless, scratch — is the menu; this topic is about applying it deliberately rather than re-deriving it. The base you pick is the single largest fixed cost in the image before you add a byte of your own.
One RUN for Install-and-Cleanup
Each RUN is a layer, and a file added in one layer and deleted in a later one still occupies the earlier layer (Chapter 2 topic 07). Installing packages and removing the apt cache in a single RUN — or via a cache mount, topic 30 — keeps the cleanup in the same layer as the install, so the bytes actually leave the image instead of stranding in a lower layer that the deletion never touches.
    RUN apt-get update \
     && apt-get install -y --no-install-recommends libpq5 \
     && rm -rf /var/lib/apt/lists/*
Because the rm shares the RUN that created /var/lib/apt/lists, the package index never persists in any layer. Split this across two RUN steps and the index would sit in the first layer forever, regardless of the second one deleting it.
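For contrast, the split version described above might look like the following sketch. The base image and package are carried over from the text; the three-step layout is the anti-pattern, shown only to make the layer boundaries visible.

```dockerfile
# Anti-pattern sketch: each RUN commits its own layer.
FROM python:3.12-slim

# Layer 1: writes the package index into /var/lib/apt/lists.
RUN apt-get update

# Layer 2: installs the runtime library.
RUN apt-get install -y --no-install-recommends libpq5

# Layer 3: only records the deletion. The index files still
# ship inside layer 1, so the image is no smaller.
RUN rm -rf /var/lib/apt/lists/*
```

Running docker history on an image built this way should show the index bytes attributed to the first RUN, with the final rm step adding almost nothing and removing nothing.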
Multi-Stage to Drop the Toolchain
The largest single win for Driftwood is the multi-stage build of topic 28: the builder stage's ~900 MB of compilers, headers, and intermediate files never reach the final image, because only the artifacts cross via COPY --from. This one structural change is most of the 1.1 GB → 180 MB drop on its own — the other three levers trim what remains.
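A minimal sketch of that shape, assuming a hypothetical project layout (a requirements.txt and an app/ package started with python -m app) and a virtual environment at /opt/venv as the artifact that crosses stages. None of these names come from the Driftwood build itself, and libpq5 as the only runtime library is likewise an assumption.

```dockerfile
# Stage 1: builder. The full image carries compilers and headers,
# so dependencies with C extensions can be built here.
FROM python:3.12 AS builder
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: runtime. Starts from the slim base; nothing from the
# builder is present unless it is copied in explicitly.
FROM python:3.12-slim
RUN apt-get update \
 && apt-get install -y --no-install-recommends libpq5 \
 && rm -rf /var/lib/apt/lists/*

# Only the built environment crosses the stage boundary.
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

WORKDIR /app
COPY app/ ./app/
CMD ["python", "-m", "app"]
```

Whether the artifact is a virtual environment or a directory of wheels is a design choice. What matters is that the final stage names exactly what it takes from the builder, and everything else in the builder is discarded.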
Fewer, Tighter Layers
Combine related commands, copy only what is needed instead of the whole context, and skip the recommended-but-unused packages with --no-install-recommends. Each avoided package and each merged RUN is fewer bytes and one fewer thing to patch when a CVE lands. The win compounds: a leaner image is faster to pull, faster to start, and smaller to scan.
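As a sketch of copying only what is needed, assuming a hypothetical layout with a requirements.txt and an app/ directory at the root of the build context:

```dockerfile
FROM python:3.12-slim
WORKDIR /app

# Broad: pulls the whole build context into a layer (tests, docs,
# .git, local virtualenvs) unless a .dockerignore filters it.
# COPY . .

# Narrow: name the files the running service needs.
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY app/ ./app/
```

The narrow form also keeps unrelated file changes from invalidating the dependency layer, since that layer depends only on requirements.txt.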
Measure, Don't Guess
docker history lists each layer's size against the instruction that made it, pointing straight at the bloated step. dive walks the layers interactively and flags wasted space — files added then overwritten in a later layer — so you can see bytes that are technically present but functionally dead. The Driftwood teardown shows a 400 MB toolchain layer in the naive image and its complete absence in the slim one.
Naive image: python:3.12 base (~1 GB) + a 400 MB toolchain layer (gcc, headers, apt cache) + app. docker history points straight at the toolchain step.
Slim image: python:3.12-slim base (~120 MB) + installed wheels + app. The toolchain layer is gone; it stayed in the discarded builder stage.
Profiling first turns shrinking from guesswork into a directed edit: you read where the bytes are, fix the largest source, and re-measure. Trimming by eye shaves the layers you happen to notice while the real bloat sits untouched.
- Installing packages in one RUN and rm-ing the cache in a later RUN, expecting the image to shrink: the cache still lives in the earlier layer, so the cleanup must share the RUN that created the files (Chapter 2 topic 07).
- Shipping the full python:3.12 base for a service that only needs the runtime: you carry ~900 MB of compilers and userland the running app never touches.
- Chasing size by switching to alpine and then fighting musl-vs-glibc breakage on compiled wheels: the bytes saved are dwarfed by the debugging, and a slim glibc base plus multi-stage often wins anyway (Chapter 2 topic 12).
- Omitting --no-install-recommends and pulling in dozens of recommended packages that inflate the image and the patch surface for no functional gain.
- Optimizing by eye instead of by docker history and dive, so you shave a 2 MB layer while a 400 MB toolchain layer sits untouched.
- Start from a pinned -slim base and move down the Chapter 2 spectrum (distroless, scratch) only when the app tests clean on it, so size drops without a debugging surprise.
- Install, use, and clean up in a single RUN (or a cache mount) so removed files never strand bytes in an earlier layer.
- Use a multi-stage build to keep the compiler and build caches out of the final image; it is the single biggest size lever for any compiled-dependency app.
- Profile every image with docker history and dive and target the largest layer first, rather than trimming whatever is easiest to see.
Knowledge Check
Of the four shrinking levers, which gives the biggest single drop for a compiled-dependency app like Driftwood?
- The multi-stage build, which leaves the ~900 MB toolchain in the discarded builder stage
- Adding --no-install-recommends to every single apt-get install step
- Merging every separate RUN instruction down into a single combined layer
- Switching the base image from slim over to alpine, which strips the libc and shell down to the smallest possible footprint
Why doesn't an rm in a later RUN shrink the image?
- The deleted file still occupies the earlier layer where it was created; the cleanup must share that RUN
- The rm command is silently ignored by the builder during image construction
- Docker recompresses the lower layers and restores the deleted file on completion
- The file moves to the container's writable layer instead of being removed, so it leaves the image but reappears the moment the container starts
What is the tradeoff of switching to an alpine base to save size?
- alpine's musl libc can break compiled wheels, and the debugging cost often outweighs the bytes saved
- alpine images are actually larger than slim, so you lose on size anyway
- alpine only runs on arm64, so the amd64 production servers most teams deploy on cannot pull or start the image at all
- alpine images cannot be pushed to a standard remote registry
How do docker history and dive help you shrink an image?
- They show each layer's size against its instruction and flag waste, so you target the largest source
- They automatically rewrite the Dockerfile to remove the largest wasteful layers
- They benchmark how fast the running container responds to live requests and chart the latency per layer so you can drop the slowest ones
- They scan the image for known CVEs in all the installed packages