GitHub Actions in Depth
Topic 55

Caching and Artifacts

CI/CD

Caching and artifacts both move files across the boundary of an ephemeral VM, but for opposite reasons. actions/cache restores regenerable inputs (node_modules, pip wheels) to make a build faster; a miss is harmless because the build just regenerates them. Artifacts reliably persist outputs you actually need (compiled binaries, test reports); they are downloadable from the run UI and kept for a configurable retention period.

Mixing them up is the most common Actions mistake after forgetting checkout. The rule that prevents it: if correctness would break when the file is missing, it is an artifact, not a cache.

actions/cache

A cache is keyed, usually by hashFiles() of a lockfile. On an exact hit, the cached path is restored before the install step runs; on a miss, the install step runs normally and the action's post-job step saves the path under that key for next time, provided the job succeeded. A cache is best-effort by design, so nothing should depend on the hit happening.

Caching pip downloads by lockfile hash
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: pip-${{ runner.os }}-${{ hashFiles('**/requirements.txt') }}
    restore-keys: |
      pip-${{ runner.os }}-
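
The hit-or-miss flow can be made explicit with the cache step's cache-hit output. The following is a minimal sketch, assuming a Node project with a package-lock.json at the repository root; the step id (deps) and the choice to cache node_modules directly are illustrative assumptions, not something this lesson prescribes.

```yaml
# Sketch: skip the install entirely on an exact cache hit.
- uses: actions/cache@v4
  id: deps                      # hypothetical step id
  with:
    path: node_modules
    key: node-modules-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

# cache-hit is 'true' only when the exact key matched; on a miss
# the install runs and the post-job step saves node_modules.
- if: steps.deps.outputs.cache-hit != 'true'
  run: npm ci
```

Either branch leaves node_modules populated, which is what keeps the cache best-effort: the hit only saves time.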

Cache Keys and restore-keys

The key is an exact match. When it misses, restore-keys supplies ordered prefixes so a recent-but-not-identical cache can still seed the job for partial reuse. Caches are immutable once written under a key, so the lockfile hash in the key is what forces a fresh save when dependencies change.
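
A sketch of ordered fallback prefixes, assuming the key also encodes a Python version; the py3.12 segment and the overall key layout are illustrative choices, not a required format.

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    # Exact match: same OS, same Python version, same requirements.
    key: pip-${{ runner.os }}-py3.12-${{ hashFiles('**/requirements.txt') }}
    # Tried in order when the exact key misses, most specific first.
    restore-keys: |
      pip-${{ runner.os }}-py3.12-
      pip-${{ runner.os }}-
```

When a prefix matches, the most recently created cache with that prefix is restored. Because the exact key still missed, a fresh cache is saved under the new key at the end of the job.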

upload-artifact / download-artifact

actions/upload-artifact@v4 persists files from one job and makes them downloadable from the run UI for a human; actions/download-artifact@v4 pulls them into a later job. This is the supported way to hand a build output from a build job to a deploy job: reliably, not best-effort.

Passing a build output between jobs
- uses: actions/upload-artifact@v4
  with:
    name: dist
    path: dist/
    retention-days: 7
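
The upload step is only half of the hand-off. A minimal sketch of both ends follows, assuming a hypothetical build command (make dist) that writes to dist/; the workflow name, job names, and the placeholder deploy step are also assumptions for illustration.

```yaml
name: build-and-deploy          # hypothetical workflow name
on: push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make dist          # hypothetical build command writing to dist/
      - uses: actions/upload-artifact@v4
        with:
          name: dist
          path: dist/
          retention-days: 7
          if-no-files-found: error   # fail the build if dist/ is empty

  deploy:
    needs: build                # run only after build has succeeded
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: dist            # must match the uploaded name
          path: dist/
      - run: ls -R dist/        # placeholder; replace with the real deploy command
```

If the named artifact does not exist, the download step fails the job, which is the loud failure a cache would not give you.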

Scope and Retention

Caches are scoped per branch: a run can restore caches written on its own branch, on the default branch, and, for pull requests, on the base branch. Artifacts have a configurable retention (90 days by default) and count against your storage billing, so leaving retention-days unset on large outputs quietly accumulates gigabytes.

The v4 Breaking Changes

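v4 artifacts are immutable: you cannot upload two artifacts with the same name in one run, so matrix jobs that all uploaded to one shared name under v3 now fail. A name can only be reused by setting overwrite: true, which deletes the existing artifact and replaces it rather than appending to it. v4 artifacts are also not downloadable by the v3 download action, so a workflow must use v4 on both ends.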
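
One way to live with unique names is to include the matrix value in the artifact name and collect by pattern afterwards. This is a sketch under assumptions: the job names, the report- prefix, and a test command that writes its output to reports/ are all hypothetical.

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node: [18, 20]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm ci && npm test   # assumed to write reports into reports/
      - uses: actions/upload-artifact@v4
        with:
          # One distinct name per matrix leg avoids the v4 name conflict.
          name: report-node-${{ matrix.node }}
          path: reports/

  collect:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: report-node-*  # download every artifact matching the glob
          path: all-reports/      # each artifact lands in its own subdirectory
```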

Cache vs Artifacts

Cache (actions/cache) — for regenerable inputs like node_modules or pip wheels. A miss just rebuilds them, so correctness never depends on the cache being present; it only affects speed.

Artifacts (upload/download-artifact) — for outputs you need: compiled binaries, test reports, coverage. They are durable, downloadable from the run, and billed against storage. Never use a cache to pass a build output between jobs — use an artifact, because a cache miss silently produces nothing.

Common Mistakes
  • Using a cache to hand a compiled binary from a build job to a deploy job — a cache miss silently produces nothing and the deploy ships an empty path.
  • Writing a cache key with no hashFiles() of the lockfile, so the cache never invalidates and the build keeps restoring stale dependencies.
  • Caching a directory that contains secrets or per-run tokens, leaking them into a shared, restorable cache.
  • Never setting retention-days on artifacts and forgetting they count against storage billing, accumulating gigabytes over time.
  • Uploading two artifacts with the same name in one run on v4, which errors because v4 artifact names are immutable.

Best Practices
  • Key caches on hashFiles('**/package-lock.json') so they invalidate the moment dependencies change.
  • Add restore-keys prefixes for partial cache reuse when the exact key misses.
  • Use artifacts, never a cache, to pass build outputs between jobs.
  • Set retention-days on every artifact to bound storage cost.
  • Prefer the setup-* action's built-in cache: option over hand-rolling actions/cache.
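
As a sketch of the built-in option, assuming a Node project with a committed package-lock.json (the Node version shown is an arbitrary choice):

```yaml
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-node@v4
    with:
      node-version: 20
      cache: npm     # caches npm's download cache, keyed on the lockfile hash
  - run: npm ci
```

actions/setup-python offers the same kind of option (cache: pip), so the key and path handling does not have to be written by hand.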

Comparable tools
  • GitLab CI/CD cache: and artifacts: keys
  • CircleCI save_cache/restore_cache and persist_to_workspace
  • Azure Pipelines Cache@2 and PublishPipelineArtifact
  • Jenkins stash/unstash and archive

Knowledge Check

You need to pass a compiled binary from a build job to a deploy job. What do you use?

  • An artifact, because it is durable and a missing artifact fails loudly rather than silently
  • A cache keyed on the commit SHA, because it restores faster than an artifact does
  • Either one; caches and artifacts are fully interchangeable for passing files between separate jobs
  • An environment variable holding the absolute path to the compiled binary

Why does a cache key without a lockfile hash break invalidation?

  • The key never changes when dependencies change, so the build keeps restoring a stale cache
  • A key must always contain a hashFiles expression or the cache action refuses to run at all
  • It causes the cache entry to be purged at the end of every run
  • It exposes the cache so it becomes readable from unrelated repositories

What is the correctness risk of using a cache for build outputs?

  • A cache is best-effort, so a miss produces nothing and downstream steps run against missing files
  • Caches silently corrupt compiled binary files while compressing them for storage, so the bytes restore wrong
  • Caches are deleted the moment the job that wrote them finishes running
  • There is none; a cache is exactly as durable and reliable as an artifact

What changed about artifact names in v4?

  • They are immutable — you cannot upload two with the same name in one run
  • They are now matched case-insensitively when downloaded across separate runs
  • They must include the uploading job's name as a required prefix
  • They are now capped at a single file each rather than a directory
