Service Dependencies and Startup Order
web needs db to be up before it can serve a request, so the instinct is depends_on: [db] and assume the problem is solved. It is not. Plain depends_on waits only for the db container to start — not for Postgres inside it to finish initializing and start accepting connections — so web races ahead, connects to a port that isn't listening yet, and crashes on boot.
This is the single most common Compose footgun, and it trips everyone exactly once. The fix is small once you see it, but the failure mode is confusing because every service reports "running" while the stack is plainly broken.
What depends_on Actually Guarantees
depends_on in its short form orders container start: on up, Compose starts the dependencies first and the dependents after. That is the entire promise. "Started" means the container process is running — the kernel has launched the entrypoint — not that the application inside has finished booting and is ready to serve. The ordering is real and useful; the readiness guarantee people expect is not there.
Started Is Not Ready
postgres:16 spends a few seconds after its container starts initializing the data directory, running startup scripts, and finally binding port 5432. During that window the container is "running" but Postgres is not yet listening. web, having been told only to start after db started, opens its connection in exactly that window, gets connection-refused, and exits. The stack appears broken on first up even though every service is "running" — which is precisely why the cause is hard to spot.
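The gap between "started" and "ready" can be observed from the dependent's side with a plain TCP probe. The sketch below is illustrative only: the function name is_listening is hypothetical, and a successful TCP connect is a weaker signal than pg_isready, which also confirms that Postgres is accepting connections.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts a TCP connection on host:port.

    During Postgres's init window the container is running but nothing
    is bound to the port yet, so the connect is refused and this
    returns False.
    """
    try:
        # create_connection raises OSError (including
        # ConnectionRefusedError and timeouts) when nothing answers.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Called as is_listening("db", 5432) from inside the web container, this would be expected to return False in the window just after db's container starts and True once Postgres has bound the port.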
Healthchecks Plus condition: service_healthy
The correct fix is to gate on readiness, not on start. Give db a healthcheck that probes whether Postgres actually accepts connections (pg_isready is the standard probe) and change web's dependency to the long form with condition: service_healthy. Now Compose holds web back until db's healthcheck passes, which is the moment Postgres is truly listening. This is the same HEALTHCHECK mechanism the image chapters touched on, and Chapter 11 topic 67 covers how the check itself is written and tuned.
Figure: startup order with condition: service_healthy (db starts → db healthcheck passes → web starts → proxy starts)

services:
  web:
    build: ./web
    image: driftwood/web
    depends_on:
      db:
        condition: service_healthy
    networks:
      - driftwood-net
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U driftwood -d driftwood"]
      interval: 5s
      timeout: 3s
      retries: 5
    volumes:
      - driftwood-db-data:/var/lib/postgresql/data
    networks:
      - driftwood-net
  proxy:
    image: nginx:1.27-alpine
    depends_on:
      web:
        condition: service_started
    ports:
      - "80:80"
      - "443:443"
    networks:
      - driftwood-net
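Read the dependency chain off the file: proxy waits for web to start, web waits for db to pass pg_isready. The healthcheck probes every five seconds, and the first passing probe marks db healthy and releases web. Five consecutive failures (roughly 25 seconds with these settings when each probe fails fast) mark db unhealthy, at which point Compose stops waiting and up fails instead of starting web.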
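How interval and retries turn probe results into a gating decision can be modeled in a few lines. This is a simplified sketch, not Docker's implementation: the function gate_on_health is hypothetical, it assumes the first probe runs one interval after container start, and it ignores probe duration, timeout, and start_period. It relies on Docker's rule that retries consecutive failures mark a container unhealthy.

```python
from typing import Iterable, Tuple


def gate_on_health(
    probe_results: Iterable[bool],
    interval_s: float = 5.0,
    retries: int = 5,
) -> Tuple[str, float]:
    """Model the health status a dependent would be gated on.

    probe_results holds one boolean per probe, in order. Returns the
    resulting status and the approximate seconds elapsed.
    """
    failures = 0
    elapsed = 0.0
    for ok in probe_results:
        elapsed += interval_s  # probes are spaced one interval apart
        if ok:
            # First passing probe: the dependent may start.
            return "healthy", elapsed
        failures += 1
        if failures >= retries:
            # Too many consecutive failures: gating gives up.
            return "unhealthy", elapsed
    # Not enough probes yet to decide either way.
    return "starting", elapsed
```

With the settings from the file, a Postgres that passes on its third probe releases web after about 15 seconds, while five straight failures produce an unhealthy status at about 25 seconds.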
App-Side Retry as the Resilient Path
Health gating fixes cold boot, but it does not cover a dependency that restarts mid-life. If db is restarted for an upgrade while web is running, the healthcheck condition was satisfied long ago and does nothing now — web simply loses its connection. The durable answer is for the application itself to retry the database connection with backoff, both at startup and on any dropped connection. The healthcheck handles the cold-boot half; the retry handles everything after.
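A minimal sketch of connection retry with exponential backoff follows. The names here are assumptions for illustration, not Driftwood's actual code: connect_with_backoff is hypothetical, connect stands for whatever call opens the database connection, and retry_on should be set to the connection error type the real database driver raises.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def connect_with_backoff(
    connect: Callable[[], T],
    attempts: int = 6,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, doubling the wait between tries."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the backoff
    raise ValueError("attempts must be at least 1")
```

The same helper can be used at startup and again whenever an established connection drops, which is what lets web ride out a later db restart. The sleep parameter is injectable so the backoff schedule can be checked without real waiting.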
Driftwood Wired Correctly
In the finished file db carries a pg_isready-based healthcheck, web depends on db with condition: service_healthy, and proxy depends on web. The chain means a single docker compose up brings the stack up in an order where each tier finds the one below it actually listening — no race, no connection-refused, no manual sleep between commands. Driftwood's own startup code also retries its connection, so a later db restart no longer takes web down with it.
Short form (depends_on: [db]) — orders container startup only and returns the moment the container process is running. It loses the race against a database that needs seconds to accept connections. Use it for pure ordering, when the dependent does not connect to the dependency at boot.
condition: service_healthy form — waits for the dependency's healthcheck to pass before starting the dependent, which is what "wait until the database is actually ready" requires. Use it whenever the dependent opens a connection to the dependency during startup — which is exactly Driftwood's web-to-db case.
- Writing depends_on: [db] and expecting web to wait for Postgres to accept connections: it waits only for the container to start, and web crashes on connection-refused during Postgres's init window.
- Adding condition: service_healthy but giving db no healthcheck: Compose rejects the condition or never satisfies it, because there is nothing reporting health to gate on.
- Writing a healthcheck that returns healthy too early (checking that the process exists rather than that the port answers), so web still races in; the probe must test readiness (pg_isready, a real query), not mere liveness.
- Relying solely on startup ordering and shipping no connection retry in web: the first cold boot works, but any later db restart takes web down with it, because nothing reconnects.
- Gate dependents with depends_on plus condition: service_healthy against a real readiness healthcheck (pg_isready for db), not the bare short form, whenever the dependent connects at startup.
- Write healthchecks that probe actual readiness (the port accepting a real request) rather than just the presence of a process, so "healthy" means "serves traffic".
- Add connection-retry-with-backoff in the application (web) so a mid-life dependency restart is survived, treating health gating as the cold-boot half of the answer.
- Order the full chain explicitly (proxy depends on web, web depends on db) so up reconciles the stack in an order where each tier finds the one below it listening.
docker run: startup order comes only from a human running the commands in sequence, plus a sleep
Podman: podman-compose honors the same depends_on and healthcheck keys
Kubernetes: splits this into readiness probes and startup probes / init containers
Knowledge Check
What does plain depends_on: [db] actually wait for before starting web?
- Only for the db container process to start, not for Postgres inside it to accept connections
- For Postgres inside the container to finish initializing and bind port 5432 so connections succeed
- For db's defined healthcheck to start passing, which the short form runs and waits on automatically
- For web's own startup and initial data load to fully complete before db is allowed to run
Why does the stack appear broken on first up even though every service reports "running"?
- web connects during the window after db's container starts but before Postgres is listening, and crashes
- The two services were placed on different project networks and so cannot resolve each other's names
- The driftwood-db-data named volume silently failed to mount on boot, leaving Postgres with no writable data directory to initialize
- The db image only partially pulled from the registry, so the container runs without Postgres installed
Why must db have a healthcheck for condition: service_healthy to work?
- The condition gates on the dependency's reported health, so without a check there is no health status to gate on
- Compose synthesizes a default pg_isready probe for Postgres, so the explicit check is only documentation
- The healthcheck is what registers db in the project network's embedded DNS, so without it web cannot resolve the hostname at all
- Without a healthcheck defined Compose simply refuses to build or pull the db image at all
Why is app-side connection retry still needed once health gating is in place?
- Health gating covers cold boot only; a mid-life db restart drops web's connection and only app retry survives it
- The healthcheck re-runs and re-gates web automatically on every later db restart, making retry redundant
- Retry makes the initial cold boot of the stack measurably faster than waiting for the healthcheck
- App retry fully replaces the healthcheck on cold boot as well, so the db service's healthcheck block can simply be deleted