Topic 63

cron and Scheduled Tasks

Scheduling, Automation

cron is the time-based job scheduler that ships with every Debian and Ubuntu server. A daemon, cron, wakes once a minute, reads a set of tables called crontabs, and runs any command whose schedule matches the current minute. You describe a job in one line — five time fields and a command — and from then on it runs unattended, forever, with no supervisor and no acknowledgement that it ran.

That simplicity is exactly why cron is everywhere and exactly why it generates so many "the job didn't run" investigations. cron does not run your login shell, so the rich PATH and environment you tested in interactively are absent. It does not surface output anywhere you will see by default, so a job that errors out fails in silence. And it never makes up a missed run — if the machine was off at the scheduled minute, that occurrence is simply gone. Knowing those three behaviours up front turns cron from a source of mysteries into a predictable tool.

crontab Format

A crontab line is five whitespace-separated time fields followed by the command. The fields, in order, are minute (0–59), hour (0–23), day of month (1–31), month (1–12), and day of week (0–7, where both 0 and 7 are Sunday). A * means "every value", */5 means "every fifth", a comma lists values (1,15,30), and a hyphen gives a range (9-17).

# minute  hour  day-of-month  month  day-of-week  command
# run backup.sh at 02:30 every day
30 2 * * *  /usr/local/bin/backup.sh
# every 15 minutes, Mon-Fri, business hours
*/15 9-17 * * 1-5  /usr/local/bin/health-check.sh
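
The field syntax above can be made concrete with a small matcher. This is a minimal sketch, not how cron itself is implemented: cron_field_matches is a hypothetical helper that tests one numeric value against one time field, and it assumes plain decimal values without leading zeros and no month or weekday names.

```shell
# cron_field_matches FIELD VALUE [MIN]
# Hypothetical helper: succeeds if VALUE satisfies one crontab time field.
# Handles *, */step, a-b, a-b/step, single numbers, and comma lists.
# MIN is the lowest legal value of the field (0 for minute, 1 for day of month).
cron_field_matches() {
    field=$1 value=$2 min=${3:-0}
    matched=1
    old_ifs=$IFS
    set -f                       # stop "*" expanding to file names
    IFS=,
    for part in $field; do       # split the comma list
        step=1
        case $part in
            */*) step=${part#*/}; part=${part%/*} ;;   # peel off "/step"
        esac
        case $part in
            '*') lo=$min; hi=$value ;;                 # every value
            *-*) lo=${part%-*}; hi=${part#*-} ;;       # range a-b
            *)   lo=$part; hi=$part ;;                 # single number
        esac
        if [ "$value" -ge "$lo" ] && [ "$value" -le "$hi" ] &&
           [ $(( (value - lo) % step )) -eq 0 ]; then
            matched=0
        fi
    done
    IFS=$old_ifs
    set +f
    return "$matched"
}

cron_field_matches '*/15' 30 && echo "minute 30 matches */15"
```

Checking each of the five fields this way against the current minute, hour, day, month, and weekday is, in spirit, what the daemon does once a minute.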

There are two kinds of crontab, and they differ by one field. A per-user crontab, which is what you get with crontab -e, has no user column, because every job runs as you. The system crontab at /etc/crontab and any file you drop into /etc/cron.d/ add a sixth field, the user to run as, between the schedule and the command. Mixing the two formats is a frequent error: a six-field line in a user crontab treats the username as the first word of the command, and a five-field line in /etc/cron.d treats the first word of the command as the username, so the job either fails or runs the remainder as the wrong user.
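
One way to catch the mixed-format mistake before installing a file in /etc/cron.d/ is to check that the sixth field names a real account. The sketch below is illustrative only: crond_line_user and crond_line_ok are hypothetical helpers, and they assume a plain five-field schedule (not an @daily shorthand, a comment, or a variable assignment).

```shell
# crond_line_user LINE
# Print the sixth whitespace-separated field, where /etc/cron.d expects the user.
crond_line_user() {
    printf '%s\n' "$1" | awk '{ print $6 }'
}

# crond_line_ok LINE
# Succeed only if that sixth field is the name of an existing account.
crond_line_ok() {
    user=$(crond_line_user "$1")
    [ -n "$user" ] && id -u "$user" >/dev/null 2>&1
}

# System format: the user column is present.
crond_line_ok '30 2 * * * root /usr/local/bin/backup.sh' && echo "ok"

# User-crontab format pasted into /etc/cron.d: the command sits in the user column.
crond_line_ok '30 2 * * * /usr/local/bin/backup.sh' || echo "no valid user field"
```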

For coarse schedules you rarely write the time fields at all. The directories /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly hold executable scripts that run on that cadence. On Debian and Ubuntu, /etc/crontab runs cron.hourly directly with run-parts, while the daily, weekly, and monthly directories are handed to anacron when it is installed. Dropping a script there is the idiomatic way packages schedule maintenance — logrotate and apt housekeeping both arrive this way.
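
One detail worth checking when dropping a script into these directories: by default, run-parts only runs files whose names consist entirely of ASCII letters, digits, underscores, and hyphens, so a file called backup.sh is skipped because of the dot. The sketch below mirrors that naming rule in a hypothetical helper, runparts_name_ok; it does not check the executable bit, which run-parts also requires.

```shell
# runparts_name_ok PATH
# Hypothetical helper: succeed if the file name uses only the characters
# that run-parts accepts by default (letters, digits, "_" and "-").
runparts_name_ok() {
    name=${1##*/}                       # strip the directory part
    case $name in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;
    esac
    return 0
}

runparts_name_ok /etc/cron.daily/logrotate && echo "logrotate: would be considered"
runparts_name_ok /etc/cron.daily/backup.sh || echo "backup.sh: skipped (dot in name)"
```

On a Debian or Ubuntu host, run-parts --test /etc/cron.daily prints the scripts that would actually run without executing them.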

The cron Environment

The single most common cron failure is a script that runs perfectly in your terminal and does nothing under cron. The cause is the environment. cron runs jobs with a deliberately minimal environment: PATH is typically just /usr/bin:/bin, HOME is set, SHELL is /bin/sh (dash on Debian and Ubuntu, not bash), and almost nothing from your .bashrc or .profile is present, because those files are read only by interactive or login shells, and cron starts neither.

The practical consequences are concrete. A bare python3 or docker in your job may not be found because its directory is not on cron's short PATH. Variables you rely on — AWS_PROFILE, PGPASSWORD, a VIRTUAL_ENV — are simply unset. The fix is to depend on nothing implicit: call every binary by absolute path, or set PATH explicitly at the top of the crontab, and source any env file the job needs inside the command itself.

# set PATH once at the top of the crontab
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# source env (the file must export its variables), then run by absolute path
0 3 * * *  . /etc/profile.d/app.env && /usr/bin/python3 /opt/app/job.py
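
A quick way to find out whether a job depends on your interactive environment is to run it by hand with everything stripped away. The sketch below approximates the environment described above (short PATH, /bin/sh, HOME and LOGNAME only); run_like_cron is a hypothetical helper, and the exact variables a given cron sets may differ slightly.

```shell
# run_like_cron COMMAND_STRING
# Hypothetical helper: run a command line the way cron roughly would,
# with an emptied environment, a short PATH, and /bin/sh as the shell.
run_like_cron() {
    env -i \
        HOME="$HOME" \
        LOGNAME="$(id -un 2>/dev/null)" \
        SHELL=/bin/sh \
        PATH=/usr/bin:/bin \
        /bin/sh -c "$1"
}

# Anything exported in your login shell is gone inside the call.
run_like_cron 'echo "PATH=$PATH"'
```

If a job works in your terminal but fails under run_like_cron, the missing binary or variable it complains about is very likely what cron is missing too.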

Output and Failure

cron does not throw away a job's output; it mails it. Anything a job writes to stdout or stderr is captured and sent as local mail to the job's owner via the system mail transfer agent. On a default Ubuntu Server install there is no MTA and no mailbox, so that mail goes nowhere and the output is effectively lost. A job can fail every night for a month and the only trace is a syslog entry noting that the job ran and, on Debian and Ubuntu, that its output was discarded because no MTA is installed.

Treat output as something you must capture deliberately. Redirect both streams to a log file with >>/var/log/job.log 2>&1 so failures leave evidence, and let the job's own exit code drive alerting rather than relying on cron mail. The first pattern below appends stdout and stderr to a log; the second is the mail-based alternative, which discards stdout so that only stderr is mailed.

# capture stdout AND stderr; without 2>&1, errors still vanish
0 4 * * *  /usr/local/bin/backup.sh >>/var/log/backup.log 2>&1
# alternative: discard stdout, keep stderr, so only errors are mailed (requires a working MTA)
0 4 * * *  /usr/local/bin/backup.sh >/dev/null
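
A plain redirect records what the job printed but not when it ran or how it exited. A small wrapper can add both. This is a sketch under assumptions: run_logged is a hypothetical helper, and the log format is made up for illustration.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical wrapper: append a timestamped START line, the command's
# stdout and stderr, and an END line with the exit code to LOGFILE,
# then return the command's own exit status.
run_logged() {
    log=$1
    shift
    {
        printf '%s START %s\n' "$(date '+%Y-%m-%dT%H:%M:%S%z')" "$*"
        "$@"
        status=$?
        printf '%s END exit=%s\n' "$(date '+%Y-%m-%dT%H:%M:%S%z')" "$status"
    } >>"$log" 2>&1
    return "$status"
}

# Saved as an executable script (say /usr/local/bin/run-logged, a made-up
# path) whose last line is:  run_logged "$@"
# the crontab entry could then look like:
#   0 4 * * *  /usr/local/bin/run-logged /var/log/backup.log /usr/local/bin/backup.sh
```

Keeping the date format inside a script rather than in the crontab line also avoids cron's special handling of the % character.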

Special Schedules and the No-Catch-Up Rule

cron understands a handful of shorthand strings that replace the five fields: @hourly, @daily (midnight), @weekly, @monthly, @yearly, and @reboot, which runs once when the cron daemon starts after boot rather than on a clock. @reboot is the common way to launch a long-running helper at startup without writing a full service unit, though a real daemon belongs in systemd.

The behaviour that catches people is that cron never catches up. If a job is scheduled for 02:30 and the server is powered off, suspended, or simply was not running cron at 02:30, that occurrence does not run late: it is skipped entirely and the next eligible time is tomorrow. On a laptop, a VM that sleeps, or any host with irregular uptime, this means scheduled maintenance silently never happens. The answer is anacron, which records on disk when each daily, weekly, and monthly job last ran and, the next time it is started (typically at boot), runs anything that is overdue.
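
The catch-up idea is simple enough to sketch: keep a timestamp on disk and run the job whenever that timestamp is older than the period. The code below illustrates the principle only and is not anacron's implementation; run_if_overdue and its stamp-file format are hypothetical, and it relies on date +%s, which GNU, BSD, and BusyBox date all provide.

```shell
# run_if_overdue STAMP_FILE PERIOD_DAYS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND only if the last recorded run is at
# least PERIOD_DAYS old (or was never recorded), then update the stamp.
run_if_overdue() {
    stamp=$1 period_days=$2
    shift 2
    now=$(date +%s)                      # seconds since the epoch
    last=0
    [ -r "$stamp" ] && last=$(cat "$stamp")
    if [ $(( now - last )) -lt $(( period_days * 86400 )) ]; then
        return 0                         # ran recently enough; nothing to do
    fi
    "$@" || return $?                    # on failure, leave the stamp untouched
    printf '%s\n' "$now" >"$stamp"
}

# Called from a frequent trigger (boot, hourly cron), this runs the job
# at most once per day no matter how often the trigger fires:
#   run_if_overdue /var/tmp/backup.stamp 1 /usr/local/bin/backup.sh
```

Advancing the stamp only after a successful run is a choice made in this sketch so that a failed job is retried at the next trigger.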

cron versus systemd Timers

On modern Debian and Ubuntu, systemd timers are the alternative to cron, and the gap is in observability and dependencies, not in scheduling power. A timer is a .timer unit paired with a .service unit; OnCalendar=*-*-* 02:30:00 expresses the same schedule as the crontab fields 30 2 * * *. What you gain is real: every run's output lands in the journal (journalctl -u backup.service), the job's success or failure becomes a unit state you can query, you can express ordering with After= and Requires=, and Persistent=true gives you anacron-style catch-up for missed runs.
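
As a concrete illustration, the script below writes a minimal timer and service pair into a scratch directory so the two files can be reviewed before installation. It is a sketch under assumptions: the unit names backup.timer and backup.service, the Type=oneshot setting, and the descriptions are illustrative choices, and /usr/local/bin/backup.sh stands in for the real job.

```shell
# Write the two unit files somewhere harmless first; copy them to
# /etc/systemd/system/ once they look right.
unit_dir=${UNIT_DIR:-$(mktemp -d)}

cat >"$unit_dir/backup.service" <<'EOF'
[Unit]
Description=Nightly backup (example)

[Service]
Type=oneshot
ExecStart=/usr/local/bin/backup.sh
EOF

cat >"$unit_dir/backup.timer" <<'EOF'
[Unit]
Description=Run backup.service daily at 02:30

[Timer]
OnCalendar=*-*-* 02:30:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "unit files written to $unit_dir"

# After copying both files into /etc/systemd/system/ (as root):
#   systemctl daemon-reload
#   systemctl enable --now backup.timer
#   systemctl list-timers backup.timer
#   journalctl -u backup.service
```

A timer activates the service with the same base name by default, which is why backup.timer needs no explicit reference to backup.service.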

Stay with cron for a one-line job on a server that is always on and where local mail or a redirect is enough — it is universal and there is nothing to learn. Graduate to a systemd timer the moment you need centralized logging through the journal, dependencies on other units or mounts, resource limits, or guaranteed catch-up after downtime. For a single throwaway run at a specific future time, neither fits: use at, which queues one command once and then forgets it.

cron vs systemd timers vs anacron

cron — fixed wall-clock schedule, no catch-up, output mailed (and usually lost). Choose it for a simple recurring job on an always-on server where a redirect to a log file is enough monitoring.

systemd timers — schedule plus journal logging, unit state, dependencies, and Persistent=true catch-up. Choose it when you need to know whether a job succeeded, order it after other units, or recover missed runs.

anacron — not a clock scheduler; it runs daily/weekly/monthly jobs that were missed while the host was down, tracking the last run on disk. Choose it for laptops, VMs, and any host with irregular uptime where "run roughly daily, and don't skip a day off" is the requirement.

Common Mistakes

Relying on a PATH or environment that cron does not provide — a bare docker, python3, or an unset AWS_PROFILE works in your shell and is missing under cron, so the job silently does nothing. Use absolute paths and set PATH in the crontab.
Leaving output unredirected on a host with no MTA: stdout and stderr have nowhere to be delivered and are discarded, so a job that has been failing nightly for weeks leaves no evidence of why. Append to a log with 2>&1.
Assuming missed runs catch up. A job scheduled for 02:30 on a machine that was asleep is skipped, not deferred — the maintenance simply never happens. Use anacron or a systemd timer with Persistent=true.
Editing the live crontab file under /var/spool/cron/ directly instead of through crontab -e: the daemon may not reload it, and there is no syntax check, so a malformed line can silently disable jobs (depending on the cron implementation, just the bad line or the whole file is ignored).
Forgetting that % in a cron command is special: the first unescaped % becomes a newline and everything after it is fed to the command as standard input, so a date +%Y-%m-%d inside a job breaks unless each % is escaped as \%.
Assuming cron uses UTC. It runs in the system timezone, so a schedule written for one box behaves differently after a timezone change or a DST transition, and, depending on the cron implementation, a job at 02:30 can run twice or not at all on the changeover night.
Putting a six-field /etc/cron.d line (with a username) into a per-user crontab, or a five-field user line into /etc/cron.d — the username is parsed as part of the command and the job fails or runs as the wrong user.
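
The % rule in the list above is easy to get wrong by hand, so it can help to escape a command mechanically before pasting it into a crontab. A minimal sketch, with cron_escape as a hypothetical helper name:

```shell
# cron_escape COMMAND_STRING
# Hypothetical helper: print the command with every % written as \%,
# which is the form a crontab line needs.
cron_escape() {
    printf '%s\n' "$1" | sed 's/%/\\%/g'
}

cron_escape 'date +%Y-%m-%d'
# prints: date +\%Y-\%m-\%d
```

The alternative that avoids the problem entirely is to keep the command in a script and have the crontab line call only the script.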

Best Practices

Call every binary by absolute path in a cron job, and set PATH explicitly at the top of the crontab — never assume cron inherits your interactive environment.
Redirect both streams to a log file with >>/var/log/job.log 2>&1 so failures leave evidence; do not depend on cron mail, which is silently dropped on hosts without an MTA.
Edit crontabs only with crontab -e (and inspect with crontab -l) so the daemon reloads them and the file is validated before it is installed.
Use anacron, /etc/cron.daily, or a systemd timer with Persistent=true on any host with irregular uptime, where plain cron would skip every missed run.
Verify a new schedule before trusting it — spell out the five fields against a known reference and, for anything non-obvious, confirm the next fire times rather than eyeballing the syntax.
Make jobs idempotent and guard them with a lock (flock -n /var/lock/job.lock /path/to/job) so a run that overruns does not stack on top of the next scheduled one.
Graduate to a systemd timer once you need journal logging, dependencies on other units, or guaranteed catch-up — and reach for at when you want a single command to run once at a future time.
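
The locking advice in the list above can be sketched as a thin wrapper around flock from util-linux. run_exclusive is a hypothetical helper name, and the lock path in the usage comment is only an example.

```shell
# run_exclusive LOCKFILE COMMAND [ARGS...]
# Hypothetical wrapper: run COMMAND while holding an exclusive lock on
# LOCKFILE. With -n, flock gives up immediately (non-zero exit) if another
# run still holds the lock, instead of queueing behind it.
run_exclusive() {
    lock=$1
    shift
    flock -n "$lock" "$@"
}

# The equivalent written directly in a crontab, assuming flock lives in
# /usr/bin as it typically does on Debian and Ubuntu:
#   */15 9-17 * * 1-5  /usr/bin/flock -n /var/lock/health-check.lock /usr/local/bin/health-check.sh
```

Because the lock is released when the command exits, including when it crashes, there is no stale lock file to clean up afterwards.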

Comparable tools

systemd timers: the modern in-distribution alternative, with the same scheduling plus journal logging, unit dependencies, and catch-up
Windows Task Scheduler: GUI and schtasks equivalent, with built-in run history and missed-run handling
at: queues a single command to run once at a future time, rather than on a repeating schedule

Knowledge Check

A script runs fine when you launch it by hand but does nothing when cron runs it on schedule. What is the most likely cause?

cron runs jobs with a minimal environment — a short PATH and none of your .bashrc — so a binary or variable the script relies on is missing
cron silently lowers the job to an unprivileged user partway through the run, so the script suddenly can no longer read its own input files
cron mounts the filesystem read-only for the job unless you explicitly mark that crontab line as writable
cron caches the previous version of the script in memory and re-runs that stale copy on the next schedule

A nightly cron job has a redirect of >/dev/null on its command but no 2>&1. The host has no mail transfer agent installed. What happens when the job errors?

stderr is still produced and mailed, but with no MTA that mail is dropped — the failure leaves no visible trace
cron automatically retries the job every minute until it finally succeeds, because the error output was never suppressed
the error text is written to /dev/null together with stdout by the same redirect, so nothing of consequence is lost
cron disables the job after its first failure and records the error to /var/log/syslog

A daily cron job is scheduled for 02:30, but the VM it runs on is suspended overnight and resumes at 09:00. What does cron do?

Nothing — the 02:30 occurrence is skipped entirely; cron does not run missed jobs late, so you need anacron or a timer with Persistent=true
It detects the missed window and runs the job immediately at 09:00 to catch up the skipped occurrence
It queues the missed 02:30 run internally and then executes it at the very next available minute boundary once the suspended VM resumes at 09:00
It doubles up the following night, running the job twice to make up for the one skipped run

When does a systemd timer earn its extra complexity over a plain cron line?

When you need the run's output in the journal, dependencies on other units, or Persistent=true catch-up after downtime
When the job must fire more often than once a minute at sub-minute resolution, which the cron schedule format simply cannot express
When the schedule needs hour ranges like 9-17 or step values, which the crontab field syntax cannot represent
When the job must run as one specific named user, which a plain cron line has no way to specify
