Jobs and CronJobs
Not every workload runs forever. A Job runs one or more Pods until they complete successfully, then stops — for migrations, batch processing, and one-off tasks. A CronJob creates Jobs on a schedule, the Kubernetes answer to cron.
These are the controllers for work that has an end. The subtle parts are how they handle parallelism and retries, and the genuinely sharp edges of running cron in a distributed system.
Jobs: Run to Completion
A Job creates Pods and tracks them until a set number succeed. completions is how many successful runs you need; parallelism is how many Pods may run at once. With both at 1 you get a single task; raise parallelism for a fan-out of independent work items. The Job is complete when the required number of Pods exit successfully.
apiVersion: batch/v1
kind: Job
metadata:
  name: import
spec:
  completions: 10      # need 10 successful runs
  parallelism: 3       # up to 3 at a time
  backoffLimit: 4      # give up after 4 failures
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: import
        image: importer:1.0
Retries and Restart Policy
Jobs must use restartPolicy: OnFailure or Never. The API rejects Always, because a Pod that always restarts could never let the Job complete. backoffLimit caps how many times the Job retries a failing Pod before it is marked failed; it defaults to 6, so check that the value suits the workload rather than leaving it unexamined. A ttlSecondsAfterFinished setting deletes finished Jobs and their Pods automatically after the given number of seconds, so they don't accumulate.
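As a minimal sketch of these settings together, the manifest below shows a one-off Job that replaces failed Pods rather than restarting them in place, gives up after a few retries, and cleans itself up. The name db-migrate, the image migrator:1.0, and the specific numbers are illustrative assumptions, not values from this lesson.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate                 # hypothetical name
spec:
  backoffLimit: 3                  # mark the Job failed after 3 retries
  ttlSecondsAfterFinished: 3600    # delete the Job and its Pods 1 hour after it finishes
  template:
    spec:
      restartPolicy: Never         # a failed Pod is replaced by a new Pod, not restarted in place
      containers:
      - name: migrate
        image: migrator:1.0        # hypothetical image
```

With restartPolicy: Never, each failed attempt leaves its own Pod behind until cleanup, which can make it easier to read the logs of an individual failure.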
CronJobs: Jobs on a Schedule
A CronJob wraps a Job template with a cron schedule. Each firing creates a new Job. The schedule is evaluated in a configurable time zone (the timeZone field; when it is unset, the time zone of the kube-controller-manager applies), and you control what happens when runs overlap with concurrencyPolicy: Allow (the default; runs may overlap), Forbid (skip the new run if the previous one is still going), or Replace (cancel the running Job and start the new one).
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"          # 02:00 every day
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: reporter:1.0
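For contrast, here is a sketch of a CronJob where only the latest run matters, so Replace is a plausible choice, with the time zone pinned explicitly. The name cache-refresh, the image refresher:1.0, and the 15-minute schedule are hypothetical, and the example assumes a cluster recent enough to support the timeZone field.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cache-refresh              # hypothetical name
spec:
  schedule: "*/15 * * * *"         # every 15 minutes
  timeZone: "Etc/UTC"              # evaluate the schedule in UTC, not the controller's local zone
  concurrencyPolicy: Replace       # a new firing cancels a still-running Job and starts fresh
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: refresh
            image: refresher:1.0   # hypothetical image
```

Replace only makes sense when an interrupted run leaves nothing half-done; for work that must finish once started, Forbid is usually the safer policy.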
The Sharp Edges of Cron
CronJobs are best-effort, not guaranteed. If the control plane is down or busy at the scheduled moment, a run can be missed; startingDeadlineSeconds controls how late a missed run may still start. Long-running jobs with concurrencyPolicy: Allow can pile up on top of each other. And history limits matter: successfulJobsHistoryLimit and failedJobsHistoryLimit (3 and 1 by default) decide how many finished Jobs, and their Pods, a CronJob keeps, so raising them carelessly clutters the namespace.
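The sketch below extends the nightly-report manifest with the fields discussed in this section. The specific values (a 10-minute deadline, two failed Jobs kept, two retries) are assumptions chosen for illustration and should be tuned per workload.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"            # 02:00 every day
  startingDeadlineSeconds: 600     # a missed run may still start up to 10 minutes late
  concurrencyPolicy: Forbid        # skip a firing while the previous run is still active
  successfulJobsHistoryLimit: 3    # keep the 3 most recent successful Jobs
  failedJobsHistoryLimit: 2        # keep the 2 most recent failed Jobs for debugging
  jobTemplate:
    spec:
      backoffLimit: 2              # retry a failing run at most twice
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: report
            image: reporter:1.0
```

If a firing is missed by more than the deadline, that run is skipped and the next scheduled time applies, so the report logic should tolerate an occasional gap.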
Job — runs to completion and stops; success is defined by exit code. For batch and one-off tasks.
Deployment — runs forever, restarting Pods to maintain a count. For services that should never "finish."
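- Setting restartPolicy: Always on a Job. The API rejects it, because a Pod that always restarts can never reach completion.
- Leaving backoffLimit at its default of 6 without checking it, so a broken Job retries more times than intended.
- Leaving concurrencyPolicy at Allow for slow jobs, so runs overlap and stack up.
- Never reviewing cleanup settings (ttlSecondsAfterFinished for standalone Jobs, history limits for CronJobs), so finished Jobs and Pods accumulate in the namespace.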
- Assuming CronJob fires are guaranteed and exactly on time, rather than best-effort with possible misses.
- Use restartPolicy: OnFailure or Never for Jobs, and set a sensible backoffLimit.
- Set ttlSecondsAfterFinished and history limits so completed work cleans itself up.
- Choose concurrencyPolicy: Forbid for jobs that must not overlap.
- Make job logic idempotent: a retried or duplicated run should be safe.
- Set the CronJob time zone explicitly and use startingDeadlineSeconds to bound missed runs.
Knowledge Check
What restartPolicy must a Job use, and why?
- OnFailure or Never — Always would restart forever and the Job could never complete
- Always — so the kubelet retries any Pod that fails part-way
- Any value works, because the restartPolicy field has no effect at all on how a Job behaves
- Only Never is permitted; OnFailure is rejected by the API
What does concurrencyPolicy: Forbid do for a CronJob?
- Skips a scheduled run if the previous run is still active
- Runs every overlapping scheduled job side by side in parallel
- Prevents the CronJob from ever firing more than twice total
- Forbids triggering the Job manually with kubectl create
Why might a CronJob miss a scheduled run entirely?
- CronJobs are best-effort; if the control plane is unavailable at the scheduled time a run can be missed
- CronJobs are hard-capped at one run per calendar day, so any extra slots in the cron expression are silently dropped
- Runs are only ever missed when the Job's parallelism field is set above one
- The schedule field is purely advisory and is never actually enforced