Topic 59

Self-Hosted Runners

CI/CD

A self-hosted runner is a machine you own that executes jobs instead of GitHub's hosted VMs. The reason to want one is almost always concrete: special hardware like GPUs, access to resources on a private network, or build volume where owned capacity is cheaper than per-minute hosted billing. Outside those cases, hosted runners are the better default.

The catch is security. On a public repository a self-hosted runner is an arbitrary-code-execution vector: a pull request from a fork can run its own code on your machine with whatever access that machine has. The convenience is real, but so is the liability, and the two have to be weighed deliberately.

When to Self-Host

Self-host for GPU, ARM, or other special hardware the hosted images do not offer; for jobs that must reach into a private VPC; or for high build volume where flat hardware cost beats hosted minutes. If none of those apply, the maintenance burden is not worth it.
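
The volume argument can be made concrete with a break-even estimate. This is a back-of-the-envelope sketch, not a pricing reference: F is an assumed flat monthly cost of owned capacity (hardware amortization plus upkeep), r is an assumed hosted rate per minute, and k is the per-OS minute multiplier. All dollar figures are hypothetical.

```latex
m^{*} = \frac{F}{r \cdot k}
\qquad\text{e.g.}\qquad
F = \$400,\; r = \$0.008/\text{min},\; k = 1
\;\Rightarrow\;
m^{*} = \frac{400}{0.008 \times 1} = 50{,}000 \text{ minutes per month}
```

Below m* minutes per month hosted runners cost less; above it owned capacity wins, provided the upkeep folded into F is counted honestly.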

Registration and Labels

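Register a runner at the repository, organization, or enterprise scope, and tag it with labels. A job targets it by listing those labels in runs-on:, and a runner is eligible only if it carries every label the job lists, so GPU work lands on GPU machines. The reverse is not automatic: a job that lists only self-hosted can still be assigned to a GPU machine, so have ordinary jobs request a label the GPU machines do not carry.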

Targeting a labelled self-hosted runner

jobs:
  train:
    runs-on: [self-hosted, linux, gpu]
    steps:
      - uses: actions/checkout@v4
      - run: ./train.sh

Persistence Problem

Self-hosted runners are not ephemeral by default. A long-lived runner carries leftover state — files, caches, environment changes, even secrets — from one job into the next, which both corrupts builds and leaks data. The fix is ephemeral runners or a fresh VM or container per job.
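
One way to approximate the "fresh container per job" fix from inside a workflow is a job container, sketched below. It assumes a Linux self-hosted runner with Docker installed; the image and ./build.sh are placeholders. The runner still mounts its work directory into the container, so the last step clears the workspace even when the build fails. This narrows what carries over but does not replace a truly ephemeral runner.

```yaml
jobs:
  build:
    runs-on: [self-hosted, linux]
    # Each job starts in a new container, so packages and environment
    # changes made inside it are discarded when the job ends.
    container:
      image: ubuntu:24.04   # placeholder image
    steps:
      - uses: actions/checkout@v4
      - run: ./build.sh     # hypothetical build script
      - name: Clear the workspace
        # always() runs this step even if an earlier step failed.
        if: always()
        run: find "$GITHUB_WORKSPACE" -mindepth 1 -delete
```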

Autoscaling

Static always-on machines either sit idle or run out of capacity. actions-runner-controller (ARC) on Kubernetes, or scale-set runners, add and remove runners on demand so capacity tracks the queue instead of being fixed.
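
As a rough sketch of how capacity can track the queue with ARC's runner scale sets, the Helm values below bound a scale set between zero and ten runners. The key names are taken from the gha-runner-scale-set chart and should be checked against the chart version you install; the URL, secret name, and limits are placeholders.

```yaml
# values.yaml for the gha-runner-scale-set Helm chart (sketch)
githubConfigUrl: https://github.com/example-org/example-repo  # placeholder URL
githubConfigSecret: arc-github-secret  # hypothetical Kubernetes secret with GitHub credentials
minRunners: 0    # scale to zero when nothing is queued
maxRunners: 10   # upper bound on concurrent runners
```

Jobs then target the scale set by its installation name in runs-on: rather than by a list of labels.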

The Public-Repo Hazard

A pull request from a fork includes the fork's own workflow and code. Run that on a self-hosted machine and you are executing untrusted code on your hardware. Never attach self-hosted runners to public repositories without strict controls; if you must, require approval for first-time contributors in the Actions settings.
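
A workflow can also decline to schedule its self-hosted job for fork pull requests, as in the sketch below. The job name and ./test.sh are hypothetical. Treat this as a guard against accidents, not a security boundary: because the fork's pull request carries its own copy of the workflow file, a malicious contributor can simply delete the condition. The approval setting and keeping runners off public repositories remain the real controls.

```yaml
on: pull_request

jobs:
  gpu-tests:
    # Skip this job when the pull request's head branch lives in a fork.
    if: github.event.pull_request.head.repo.full_name == github.repository
    runs-on: [self-hosted, linux, gpu]
    steps:
      - uses: actions/checkout@v4
      - run: ./test.sh   # hypothetical test script
```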

GitHub-Hosted vs Self-Hosted Runners

GitHub-hosted: fresh, isolated VMs that GitHub patches and bills per minute, with a multiplier by operating system (Linux 1x, Windows 2x, macOS 10x). Zero maintenance and no persistent state between jobs.

Self-hosted: you get GPUs, private-network reach, and flat hardware cost, but you patch the OS, you manage isolation, and on public repos the runner is an arbitrary-code-execution vector. Default to hosted; self-host only for a concrete hardware, network, or cost requirement.

Common Mistakes

Attaching a self-hosted runner to a public repo, so a forked pull request runs untrusted code on your machine with its full access.
Reusing a long-lived runner across jobs without cleanup, leaking secrets, caches, and files from one job into the next.
Running the runner as root or with broad access to internal systems, so one malicious job pivots into your network.
Not patching the runner OS and tooling, taking on the CVE surface that GitHub used to handle for you.
Leaving the registration token or the _work directory readable by other tenants on a shared host.

Best Practices

Use ephemeral runners — one job per runner, then destroy it — via ARC or scale sets to eliminate cross-job state.
Never connect self-hosted runners to public repositories; if unavoidable, require approval for first-time contributors in Actions settings.
Isolate runners in their own network segment with least-privilege access to internal systems.
Autoscale with actions-runner-controller on Kubernetes instead of static always-on VMs.
Run the runner as an unprivileged user and rotate the OS image regularly.

Comparable Tools

GitLab CI/CD: self-managed GitLab Runner (shared/group/specific)
CircleCI: self-hosted runners
Jenkins: agents/nodes, the native model
Azure Pipelines: self-hosted agents
Buildkite: agents are the default model

Knowledge Check

Why is a self-hosted runner on a public repo dangerous?

A forked pull request can run its own untrusted code on your machine with that machine's access
Public repos cannot attach any custom runner labels to self-hosted machines, so jobs can never target them
GitHub bills public-repo self-hosted minutes at ten times the normal rate
The runner's job logs become publicly searchable on the open internet

Why do ephemeral runners matter?

Running one job per runner then destroying it removes leftover state that would leak between jobs
They are the only kind of runner that can expose attached GPUs and other accelerator hardware to a job
They run faster because they skip the repository checkout step entirely
They are billed per minute at GitHub's standard hosted Linux rate

What is the core tradeoff of self-hosted versus hosted runners?

Self-hosted gives hardware and network control but shifts OS patching and isolation onto you
Self-hosted is always cheaper and lower-maintenance than hosted runners
Hosted runners cannot run Linux jobs, only Windows and macOS ones
Self-hosted runners are automatically patched and kept fully up to date by GitHub on your behalf

When does self-hosting genuinely justify the overhead?

When jobs need specific hardware like GPUs or private-network resources hosted runners cannot reach
Whenever a workflow defines more than one job that must run in parallel across several separate machines
For any repository that reads encrypted secrets during its workflows
Only for Windows builds, since hosted runners cannot compile them
