Chapter 4: Dockerfiles
Topic 22

RUN — Shell Form vs Exec Form

RUN executes a command at build time and commits the result as a layer; it is how packages get installed and files get generated while the image is being built. It comes in two forms: shell form (RUN apt-get update) wraps the command in the default shell, which is /bin/sh -c on Linux, and exec form (RUN ["apt-get", "update"]) runs the executable directly with no shell.

The same fork shows up again in CMD and ENTRYPOINT (topic 24), where it decides whether your process is PID 1 and receives signals. Learning the distinction on the low-stakes RUN — where the cost of the wrong form is usually nothing — makes the high-stakes case obvious, so it is worth getting straight here.

Shell Form

RUN apt-get update && apt-get install -y curl runs through /bin/sh -c, so shell features work: &&, pipes, $VARIABLE expansion, and globbing all behave as they would in a terminal. This is the common form for build commands that chain steps, because chaining is exactly what shell form gives you for free.
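The shell features listed above can be sketched in one short Dockerfile. This is a minimal illustration, assuming a Debian-based base image; the image tag, the APP_HOME variable, and the paths are hypothetical and not from the original.

```dockerfile
FROM debian:bookworm-slim

# Hypothetical variable, declared only to demonstrate expansion.
ENV APP_HOME=/opt/app

# && chains two commands; the shell expands $APP_HOME before mkdir runs.
RUN mkdir -p "$APP_HOME" && echo "built" > "$APP_HOME/marker"

# A pipe: the shell connects the output of ls to wc.
RUN ls "$APP_HOME" | wc -l

# Globbing: the shell expands the * pattern before rm sees it.
RUN rm -f "$APP_HOME"/*.tmp
```

Each of these lines depends on /bin/sh being present in the base image, which is the trade-off shell form carries.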

Exec Form

RUN ["executable", "arg1", "arg2"] is a JSON array that runs the binary directly with no shell, so &&, pipes, and variable expansion do not work — the array is passed as literal arguments to the executable. It matters most in CMD/ENTRYPOINT, and for base images that have no shell at all, such as distroless or scratch, where there is no /bin/sh for shell form to invoke.
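A small sketch of what "literal arguments" means in practice, assuming a Debian-based base image chosen only for illustration. The echo lines exist purely to show the lack of shell processing.

```dockerfile
FROM debian:bookworm-slim

# Exec form: each array element is one argument, passed as-is to apt-get.
RUN ["apt-get", "update"]

# No shell, so $HOME is not expanded; echo receives the literal text $HOME.
RUN ["echo", "$HOME"]

# No shell, so && is just another argument to echo, not a command separator.
RUN ["echo", "one", "&&", "echo", "two"]
```

The array must be valid JSON, so the elements need double quotes rather than single quotes.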

Two forms, two execution paths
Shell form
RUN cmd runs via /bin/sh -c, so &&, pipes, and $VAR work. In CMD/ENTRYPOINT the same form makes the shell PID 1, which breaks signal delivery.
Exec form
RUN ["cmd"] runs the binary directly, with no shell in between, so there is no && or expansion. In CMD/ENTRYPOINT the same form makes the process itself PID 1, so it receives signals.

Each RUN Is a Layer

Every RUN commits a new layer, which is why chaining install-and-clean into one RUN ... && ... keeps the intermediate package cache out of a persisted layer. Split them into separate RUNs and the cache is baked in permanently — deleting it in a later layer does not reclaim the space, because the earlier layer still carries it. This is the layer lesson from Chapter 2 applied to build commands.

One RUN — install, then clean, in the same layer
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*

The cleanup runs in the same layer that created the cache, so the layer never carries it. Write those three steps as three separate RUNs and the rm -rf only hides the files behind a later layer — the bytes are still in the image, padding every pull. Shell form earns its keep here: the && chain is what lets one layer do install-then-clean at all.
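For contrast, the split version described above would look roughly like this. It is a fragment meant to sit under the same base image as the example before it, shown only as the pattern to avoid.

```dockerfile
# Anti-pattern: three RUNs, three layers.
# Layer 1 commits the package lists under /var/lib/apt/lists.
RUN apt-get update

# Layer 2 commits curl and its dependencies.
RUN apt-get install -y --no-install-recommends curl

# Layer 3 only marks the lists as deleted; layer 1 still contains them.
RUN rm -rf /var/lib/apt/lists/*
```

Running docker history on an image built each way is one way to compare the per-layer sizes.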

Build-Time vs Run-Time

RUN happens during docker build and its effects are frozen into the image. It is not the container's startup command (that is CMD/ENTRYPOINT in topic 24), and confusing the two is a frequent beginner error. RUN pip install gunicorn installs gunicorn into the image at build time; CMD ["gunicorn", ...] is what actually starts gunicorn when a container launches. The first runs once, during the build; the second runs at every container start.
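The gunicorn example can be written out as a short Dockerfile. This is a sketch under assumptions: the base image tag, the bind address, and the app:app module name are hypothetical placeholders, and a real image would also need to copy the application code in.

```dockerfile
FROM python:3.12-slim

# Build time: runs once during docker build; gunicorn ends up in the image.
RUN pip install --no-cache-dir gunicorn

# Run time: stored as image metadata and executed each time a container starts.
# "app:app" is a hypothetical module:callable, not taken from the original.
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]
```

Building the image executes the RUN line and nothing else; the CMD line does nothing until docker run starts a container from the result.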

Why the Form Choice Echoes Later

The shell-vs-exec distinction is mechanically identical in ENTRYPOINT/CMD, but the stakes are higher: there, shell form inserts an sh -c that becomes PID 1 and does not pass on the signals docker stop sends (the signal lesson from Chapter 3, topic 15). On RUN the wrong form usually just fails to chain; in ENTRYPOINT it produces a container that ignores docker stop and gets SIGKILLed after the default 10-second timeout. Same fork, very different blast radius.
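As a preview of the higher-stakes case, the two forms look like this on ENTRYPOINT. The gunicorn command and the app:app name are hypothetical; the two lines are alternatives, and a Dockerfile honours only the last ENTRYPOINT it contains.

```dockerfile
# Shell form: Docker runs /bin/sh -c "gunicorn app:app".
# With a typical /bin/sh, the shell is PID 1 and gunicorn is its child,
# so the SIGTERM from docker stop reaches the shell, not gunicorn.
ENTRYPOINT gunicorn app:app

# Exec form: gunicorn itself is PID 1 and receives SIGTERM directly.
ENTRYPOINT ["gunicorn", "app:app"]
```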

Common Mistakes
  • Writing RUN ["sh", "-c", "apt-get update && apt-get install -y curl"] thinking exec form is "better" — you have just reintroduced the shell you were avoiding; if you need &&, shell form is the honest choice.
  • Expecting &&, pipes, or $VAR expansion to work in exec form: the JSON array has no shell, so the operators are passed as literal arguments to the first executable, which typically fails or does something other than what was intended.
  • Splitting apt-get update, apt-get install, and the cache cleanup across separate RUN layers — each commits a layer, so the package cache is frozen into the image even after a later layer "removes" it.
  • Treating RUN as the container's start command — it runs at build time only; the process the container starts is set by CMD/ENTRYPOINT.

Best Practices
  • Use shell form for multi-step build commands that need &&, pipes, or variable expansion, and accept that it depends on a shell being present in the base.
  • Chain install, use, and cleanup into a single RUN joined by && so the intermediate files never persist in a committed layer.
  • Reach for exec form when the base has no shell (distroless, scratch) or when you want the exact arguments passed with no shell parsing in between.
  • Keep RUN strictly for build-time work and set the container's start command with ENTRYPOINT/CMD, never conflating the two.

Comparable tools
  • BuildKit: adds RUN --mount for cache and secret mounts that plain RUN cannot do
  • Buildah: its run command is the scriptable equivalent outside a Dockerfile
  • Buildpacks · ko: generate the build steps for you, so there is no hand-written RUN
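A brief sketch of the BuildKit RUN --mount feature mentioned above, assuming a BuildKit-enabled builder. The base image, the pip cache path for a root user, and the api_token secret id are illustrative assumptions.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.12-slim

# Cache mount: pip's download cache persists between builds
# but is not committed into the image layer.
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install gunicorn

# Secret mount: the secret is readable at /run/secrets/<id> during this RUN only.
# "api_token" is a hypothetical id, supplied at build time with
#   docker build --secret id=api_token,src=<file> .
# test -s fails the build if the secret is missing or empty.
RUN --mount=type=secret,id=api_token \
    test -s /run/secrets/api_token
```

Neither mount ends up in the committed layer: the cache stays with the builder and the secret exists only for the duration of that RUN.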

Knowledge Check

What does shell form wrap the command in, and what does exec form skip?

  • Shell form runs through /bin/sh -c; exec form runs the binary directly with no shell
  • Exec form runs through /bin/sh -c; shell form runs the binary directly with no shell
  • Shell form spawns a login shell with a profile; exec form spawns a non-login shell
  • Shell form commits one layer per command; exec form commits one layer per argument

Which form supports &&, pipes, and variable expansion?

  • Shell form, because it runs through a shell that interprets those features
  • Exec form, because the JSON array is parsed by the shell before execution
  • Both forms, since the daemon transparently wraps either one in /bin/sh -c before running it
  • Neither form, since RUN never invokes a shell at all and execs the binary in both cases

Why does chaining install and cleanup into one RUN matter?

  • Each RUN is a layer, so cleaning in the same one keeps the cache out of the image; a separate RUN bakes it in permanently
  • A later RUN that deletes the cache reclaims the space retroactively from the earlier layer that wrote it
  • Chaining lets the daemon run the install commands in parallel within one layer, so the build finishes faster
  • A single combined RUN instruction is entirely exempt from the layer build cache, so it can never trigger any downstream rebuild of the instructions that follow it

How does build-time RUN differ from the container's start command?

  • RUN executes once at build time and freezes into the image; CMD/ENTRYPOINT runs on every container start
  • RUN executes every time the container starts, while CMD runs only once at build time
  • They are fully interchangeable; the daemon simply picks whichever one of the two happens to be present in the Dockerfile and ignores the rest
  • RUN is only used when no CMD is present, acting as a fallback start command at container launch
