Chapter 4: Dockerfiles
Topic 20

The Dockerfile Model and Build Context

A Dockerfile is a recipe: a text file of instructions the daemon executes top to bottom, with each filesystem-changing instruction producing a new image layer on top of the last. But docker build does not read that file in isolation. It first packs up a directory, the build context, and ships the whole thing to the daemon, and only then runs the instructions.

The size of that context, and what COPY can reach inside it, decide how fast and how fat the build is. A build that uploads a 4 GB directory before it executes a single line is slow for a reason that has nothing to do with the instructions — and a COPY . . that sweeps in node_modules and a .git history bakes the junk into the image. The model is two halves: the recipe, and the directory the recipe runs against.

The Recipe, Top to Bottom

A Dockerfile starts FROM a base image and applies instructions — RUN, COPY, ENV, CMD — in the order they appear. Each instruction that changes the filesystem commits a new read-only layer, exactly the layer stack from Chapter 2. So the Dockerfile is that stack written out as code: read it top to bottom and you are reading the layers bottom to top.

Driftwood's first Dockerfile — naive, single-stage
FROM python:3.12
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]

Six instructions, and every filesystem-changing one — the COPY and the RUN — is a layer. This is the image the chapter starts from and spends the next topic fixing: it works, it serves Driftwood on port 8000, and its instruction order quietly defeats the build cache. Reasoning about each line as "this commits a layer" is how you predict that before it costs you.
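
One way to make "this commits a layer" concrete is to scan a Dockerfile and pick out the instructions that write to the filesystem. The sketch below is a simplified model, not how the builder is implemented: it assumes only RUN, COPY, and ADD add layers, treats everything else as metadata, and ignores line continuations. The names LAYER_INSTRUCTIONS and layer_lines are made up for this illustration.

```python
# Simplified model of which Dockerfile instructions commit a filesystem layer.
# Assumption: only RUN, COPY and ADD count; all other instructions are metadata.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}

DOCKERFILE = """\
FROM python:3.12
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]
"""


def layer_lines(dockerfile_text):
    """Return the instruction lines that would commit a filesystem layer."""
    layers = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        keyword = stripped.split(None, 1)[0].upper()
        if keyword in LAYER_INSTRUCTIONS:
            layers.append(stripped)
    return layers


# Listed in file order, which is the layer stack from the bottom up.
for number, line in enumerate(layer_lines(DOCKERFILE), start=1):
    print(f"layer {number} on top of the base image: {line}")
```

For Driftwood's Dockerfile the scan reports two layers on top of the base image: the COPY, then the RUN.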

The Build Context

docker build . tars up the . directory, the build context, and uploads it to the daemon before any instruction runs. COPY ./app /app can only reach files inside that context, which is why a path that climbs out of it, like COPY ../secrets /etc, never reaches up the tree: the classic builder rejects it with a "forbidden path" error, and BuildKit resolves the path inside the context instead, where it typically fails as not found. The context is a sealed box; the build sees nothing outside it.

This is also why the argument to docker build matters as much as the Dockerfile. Build from a directory scoped to the application and the context holds only what the image needs; build from a sprawling monorepo root and the context is the whole tree, whether or not any COPY names those files.
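
The sealed-box rule for COPY sources can be modelled as a plain path check: normalize the source relative to the context root and see whether it still climbs upward. This is a sketch of the rule itself, not of any builder's actual code or error handling. It assumes relative, slash-separated source paths, and stays_in_context is a hypothetical name.

```python
import posixpath


def stays_in_context(source):
    """Return True if a relative COPY source resolves inside the build context.

    The context root is modelled as ".", so a normalized path that still
    begins with ".." has climbed above the root.
    """
    normalized = posixpath.normpath(source)
    return normalized != ".." and not normalized.startswith("../")


for source in ["./app", "app/../requirements.txt", "../secrets"]:
    verdict = "inside" if stays_in_context(source) else "outside"
    print(f"{source}: {verdict} the context")
```

A path such as app/../requirements.txt is fine because it collapses to a file inside the context; only a path that still points upward after normalization has left the box.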

Why the Daemon, Not the CLI, Builds

The client sends the context over the daemon socket and the daemon does the work. A 2 GB context is a 2 GB upload even when the daemon sits on the same machine; there is no shortcut for "local." This is why the build output includes a "transferring context" step with a byte count (the classic builder phrased this "Sending build context to Docker daemon"), and why a build against a remote daemon is slow before it has run a single instruction: the context crosses the wire first.
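
The pack-and-send step can be imitated with Python's standard tarfile module to see where that byte count comes from. This is only a model of a single-tarball upload under simple assumptions: it ignores .dockerignore, holds the whole archive in memory, and is meant for small demo directories. pack_context is a hypothetical helper, not part of any Docker tooling.

```python
import io
import tarfile


def pack_context(context_dir):
    """Pack context_dir into an in-memory tar archive and return its bytes."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # arcname="." stores paths relative to the context root,
        # so nothing about the directory's location on the host leaks in.
        archive.add(context_dir, arcname=".")
    return buffer.getvalue()
```

Calling len(pack_context(".")) in a project directory gives a rough idea of how many bytes would cross the socket, and every file under that directory contributes whether or not a COPY names it.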

The Context Footgun

Run docker build . in a directory that holds node_modules, a local virtualenv, a .git history, build output, and a 4 GB dataset, and all of it ships to the daemon every build, costing tens of seconds of upload before the first instruction even runs. Worse, a COPY . . then bakes that junk straight into the image: the dataset, the git history, and the virtualenv become layers you push to a registry and pull onto every host.
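
Before blaming the Dockerfile, it helps to see which top-level entries account for the bytes in the directory you are about to build from. The sketch below sums file sizes per top-level entry. It is a rough diagnostic under stated assumptions: symlinks are skipped, .dockerignore is not consulted, and size_by_top_level_entry and report are hypothetical names.

```python
import os


def size_by_top_level_entry(context_dir):
    """Map each top-level entry in context_dir to its total size in bytes."""
    sizes = {}
    for entry in os.scandir(context_dir):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            # os.walk does not follow directory symlinks by default.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
            sizes[entry.name] = total
        elif entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


def report(context_dir="."):
    """Print the top-level entries of context_dir, largest first."""
    sizes = size_by_top_level_entry(context_dir)
    for name, size in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
        print(f"{size / 1_000_000:10.1f} MB  {name}")
```

Running report(".") from the build directory puts the heaviest entries at the top of the list, which is usually where a stray dataset or dependency folder shows up.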

.dockerignore, Previewed

A .dockerignore file excludes paths from the context the way .gitignore excludes them from a commit. List .git/ and node_modules/ there and they never get uploaded, so the context shrinks and COPY . . can no longer sweep them in. This topic only teases it; Chapter 5 builds Driftwood's real .dockerignore alongside the multi-stage build that makes it matter.
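
The effect of those entries can be sketched as a filter applied to the list of paths before anything is packed. This is a deliberately small approximation and not Docker's real matcher: it uses Python's fnmatch (whose * also crosses directory separators), anchors patterns at the context root, and does not support ! exceptions. is_ignored and filter_context are hypothetical names.

```python
from fnmatch import fnmatchcase


def is_ignored(path, patterns):
    """Return True if path, or any directory above it, matches a pattern."""
    parts = path.split("/")
    # "a/b/c" yields the prefixes "a", "a/b", "a/b/c".
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    for pattern in patterns:
        cleaned = pattern.rstrip("/")  # treat ".git/" the same as ".git"
        if any(fnmatchcase(prefix, cleaned) for prefix in prefixes):
            return True
    return False


def filter_context(paths, patterns):
    """Keep only the paths that no ignore pattern excludes."""
    return [path for path in paths if not is_ignored(path, patterns)]


paths = [".git/HEAD", "node_modules/pkg/index.js", "app.py", "requirements.txt"]
kept = filter_context(paths, [".git/", "node_modules/"])
print(kept)
```

With .git/ and node_modules/ listed, only app.py and requirements.txt survive the filter, so they are all that a COPY . . could pick up.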

What docker build . does, in order

1 · Pack the context
The . directory is tarred up, minus anything in .dockerignore, and uploaded to the daemon. A big context is a slow upload before any instruction runs.

2 · Run the instructions
The daemon executes the Dockerfile top to bottom, each filesystem change a new layer. COPY can only reach files inside the uploaded context.

Common Mistakes

  • Running docker build . from a directory containing .git, node_modules, and large artifacts with no .dockerignore — every build uploads gigabytes to the daemon and COPY . . bakes the lot into the image.
  • Expecting COPY ../config /etc to work: the context is the build directory, and everything above it is unreachable. The classic builder fails with a forbidden-path error, and BuildKit looks for config inside the context rather than climbing the tree.
  • Treating docker build as a purely local operation and being surprised by slow builds against a remote daemon — the entire context crosses the wire before the first instruction runs.
  • Dropping the Dockerfile into a folder it shares with a giant unrelated directory so the context balloons — the context is the whole build directory, not just the files COPY happens to name.

Best Practices

  • Keep a .dockerignore next to every Dockerfile from the first build, excluding .git, node_modules, local virtualenvs, and build output so the context stays small.
  • Build from a directory scoped to the application, not a monorepo root, so the context contains only what the image needs.
  • Reason about each instruction as "this commits a layer," so the Dockerfile reads as the layer stack it produces.
  • Watch the "transferring context" byte count on the first build of any project — a multi-hundred-megabyte number is the signal a .dockerignore is missing.

Comparable tools

  • BuildKit · Kaniko: build from the same Dockerfile and context model without a privileged daemon
  • Buildpacks · ko: skip the hand-written Dockerfile and infer the image from source
  • Podman · Buildah: read the identical Dockerfile and .dockerignore format

Knowledge Check

What is the build context, and when is it sent to the daemon?

  • The directory passed to docker build, tarred up and uploaded to the daemon before any instruction runs
  • Only the specific files that COPY and ADD reference, gathered lazily from disk as each instruction runs
  • The set of read-only layers the daemon produces, assembled into a tarball once the final instruction finishes
  • The Dockerfile itself, parsed by the client and streamed one instruction at a time to the daemon over the socket

Why does COPY ../config /etc fail?

  • COPY can only reach files inside the build context, and ../config climbs out of it
  • COPY cannot copy directories, only individual files, so a path like ../config is rejected
  • The destination /etc is a protected system directory the build is not permitted to write into as root
  • COPY requires absolute source paths, and a relative path like ../config is invalid syntax

Why can a missing .dockerignore slow builds and inflate the image?

  • The entire directory — .git, node_modules, artifacts — uploads every build, and COPY . . bakes it into the image
  • Without it the daemon cannot gzip-compress the context, so the whole directory is sent uncompressed and the upload runs slower
  • Without it the layer cache is disabled entirely, so every instruction rebuilds from scratch on each successive run
  • Without it the daemon re-pulls the FROM base image every build instead of reusing the locally cached copy

How does each Dockerfile instruction relate to the image's layers?

  • Each filesystem-changing instruction commits a new read-only layer, so the Dockerfile is the layer stack as code
  • All instructions are collapsed into a single squashed layer once the build finishes, regardless of how many there were
  • Only the FROM base image carries layers; every instruction after it modifies that one shared layer in place
  • The daemon reorders the instructions into layers by descending size to optimize the final image on disk
