COPY vs ADD
Both COPY and ADD move files from the build context into the image, and for plain file copying they are identical. The difference is that ADD does two extra things — it auto-extracts local tar archives, and it can fetch a remote URL — and both of those are footguns that hide what the build is actually doing.
The rule is short: use COPY for everything, and reach for ADD only for the one case it is genuinely better at. The reason is auditability — a Dockerfile reviewer should be able to read a line and know exactly what lands in the image, and ADD turns a line that looks like a copy into one that might extract an archive or hit the network.
COPY — The Default
COPY src dest copies files and directories from the build context into the image, and nothing more. What you see is what you get, which is exactly why it is the instruction to use for application code, configs, and the requirements.txt in the cache-ordering lesson from topic 21. There is no behavior to remember beyond "it copies."
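As a minimal sketch of COPY as the default, the fragment below copies a dependency manifest first and the application code after it. The base image, the src/ directory, and config.json are hypothetical names chosen for illustration; only requirements.txt comes from the text above.

```dockerfile
# Hypothetical base image; any image with Python and pip would do
FROM python:3.12-slim
WORKDIR /app

# Manifest first: the install layer below is reused until requirements.txt changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code and config (assumed paths): copied as-is, never extracted or fetched
COPY src/ ./src/
COPY config.json .
```

Every line here either copies from the build context or runs an explicit command, so the result does not depend on what kind of file the source happens to be.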
ADD's Tar Auto-Extraction
ADD local.tar.gz /opt silently extracts the archive into the destination instead of copying the file. This is convenient when you intended it and a surprise when you wanted the tarball intact — and it is the one place ADD legitimately beats COPY. If your build genuinely needs a local archive unpacked into the image, ADD does it in one line where COPY plus a RUN tar would take two.
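The one-line versus two-line trade-off can be sketched as follows. This is a fragment rather than a full Dockerfile: it assumes a preceding FROM whose image includes tar, and vendor.tar.gz and /opt/vendor are hypothetical names. In practice you would pick one of the two options, not both.

```dockerfile
# Option A: ADD unpacks the local archive into /opt/vendor/ in one instruction
ADD vendor.tar.gz /opt/vendor/

# Option B: the COPY equivalent needs a second instruction to unpack
COPY vendor.tar.gz /tmp/vendor.tar.gz
RUN mkdir -p /opt/vendor \
 && tar -xzf /tmp/vendor.tar.gz -C /opt/vendor \
 && rm /tmp/vendor.tar.gz
```

In option B the rm removes the tarball from the final filesystem, but the earlier COPY layer still carries it, so the image stays larger than with option A.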
ADD's URL Fetch
ADD https://example.com/file /opt downloads a remote file during the build. That pulls an unverified, unpinned resource into the image with no checksum, no caching guarantee, and a hidden network dependency baked into every build. A changed remote silently changes the image, and a flaky remote silently breaks the build — neither of which is visible from reading the line.
Why COPY Wins by Default
COPY does one obvious thing, so a reviewer reading the Dockerfile knows exactly what lands in the image. ADD's magic means a line that looks like a copy might extract an archive or hit the network, depending entirely on its argument — which makes the build harder to audit and harder to reproduce. The default is COPY not because ADD is broken, but because predictability is worth more than the occasional saved line.
The Honest URL Pattern
Instead of ADD <url>, fetch the file with an explicit, checksum-verified RUN so the download is visible and pinned — everything ADD's URL form hides becomes auditable in the Dockerfile.
Replacing ADD <url> with a checksum-verified fetch:
RUN curl -fsSL https://example.com/tool.tar.gz -o tool.tar.gz \
 && echo "abc123…  tool.tar.gz" | sha256sum -c - \
 && tar -xzf tool.tar.gz -C /opt \
 && rm tool.tar.gz
The download is explicit, the checksum fails the build if the remote changed, and the extraction and cleanup happen in the same layer. Compare that to ADD https://example.com/tool.tar.gz /opt, which fetches the file unverified and leaves no record of what it pulled — and, since a remote tar archive is not extracted by default, drops the tarball at the destination still needing an unpack step. The RUN version is longer precisely because it refuses to hide anything.
COPY — copies files from the build context into the image and does nothing else: predictable, auditable, the right choice for application code, configs, and dependency manifests. This is the default for every file move in the Dockerfile.
ADD — does the same plus auto-extracts local tar archives and fetches remote URLs. Use it only for the local-tar-extraction case where that behavior is exactly the intent. For everything else, COPY; if you need a remote file, fetch it with a checksum-verified RUN, not ADD.
- Using ADD config.json /app out of habit when COPY is clearer: it works, but it hides that ADD could have extracted or fetched, making the Dockerfile harder to audit.
- ADD-ing a remote URL to pull a dependency: the resource is unpinned and unverified, the network dependency is baked into every build, and a changed remote silently changes the image.
- Expecting ADD app.tar.gz /opt/app.tar.gz to leave a tarball at the destination: ADD extracts it, so you get the unpacked contents and no archive.
- Relying on ADD's URL fetch for caching: there is no checksum and weak cache semantics, so builds are neither reproducible nor reliably cached.
- Default to COPY for every file-copy in the Dockerfile so each line does one obvious, auditable thing.
- Reserve ADD for the single case of extracting a local tar archive into the image, where its behavior is exactly the intent.
- Fetch remote files with a RUN curl ... && sha256sum -c so the download is explicit, pinned, and verified rather than hidden inside ADD.
- Read any ADD in a code review as a question (is this extracting an archive, or should it be a COPY?), since the instruction's behavior depends on its argument.
- ADD --checksum: adds verification to remote fetches, narrowing the gap
- Buildah: its copy and add commands mirror the same split
- Buildpacks · ko: move source into the image without either instruction
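For reference, a hedged sketch of the ADD --checksum form mentioned above. It assumes a BuildKit-based builder recent enough to support the flag. The base image is an arbitrary choice, the URL is the example one used in this lesson, and the digest is a placeholder that must be replaced with the real SHA-256 of the file before the build can succeed.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.20

# The build fails if the downloaded file does not match the digest.
# <sha256-digest> is a placeholder, not a real value.
ADD --checksum=sha256:<sha256-digest> https://example.com/tool.tar.gz /opt/

# As with plain ADD from a URL, the archive lands at /opt/tool.tar.gz unextracted.
```

This pins the content but still leaves the unpack step to a separate RUN, so the checksum-verified curl pattern remains the more self-contained option when the file must also be extracted.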
Knowledge Check
What do COPY and ADD share, and what does only ADD do?
- Both copy from the context; ADD also auto-extracts local tarballs and fetches remote URLs
- Both fetch remote URLs into the image; only COPY can extract a local tar archive as it lands
- Both copy files from the context; only ADD sets correct file ownership on them automatically
- COPY moves files into the image; ADD moves files out of the image to the host
Why is ADD's URL fetch a reproducibility and security footgun?
- It pulls an unpinned, unverified resource with no checksum, so a changed remote silently changes the image
- It is disabled by default and requires a privileged build flag that weakens the daemon's security posture
- It downloads over plain HTTP only, so the fetched file is always transmitted unencrypted across the wire
- It automatically executes the downloaded file during the build step itself, directly running arbitrary fetched remote code as the root user
When is ADD's tar extraction the right call?
- When you genuinely want a local tar archive unpacked into the image in a single instruction
- When copying ordinary application source files into the image, since ADD is faster at it than COPY
- When fetching a remote tarball over the network, since ADD verifies and extracts it in one step
- When you want to keep a tarball intact at the destination without unpacking any of its contents
What replaces ADD <url> when you genuinely need a remote file?
- A RUN that fetches with curl and verifies the download with sha256sum -c
- A COPY with the URL as its source, since COPY can also reach remote files
- An ADD with a --pin flag that locks the URL to a fixed version
- An ENV that records the URL so the daemon downloads it at container start