Topic 45

Logging

Logs, Collection

Logging in Kubernetes follows one rule: containers write to standard output, and something else collects it. The cluster itself keeps almost nothing — when a Pod is deleted, its logs go with it. Durable logging is a pipeline you assemble, not a feature you switch on.

Getting this wrong is common and painful: teams write logs to files inside containers, lose them on restart, and discover too late that kubectl logs only shows what is still on the node. The right model is simple once you see the pieces.

The stdout Model

The twelve-factor convention Kubernetes adopts is that an application logs to stdout and stderr as an unbuffered stream of events, and treats log routing as someone else's job. The container runtime captures that stream and writes it to a file on the node. kubectl logs asks the kubelet on that node to read the file back, which is why it works, and why it stops working once the Pod and its node files are gone.
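As a minimal sketch of the stdout model from the application's side, the following Python snippet attaches a single stream handler to stdout and no file handlers. The helper name build_stdout_logger and the logger name "checkout" are hypothetical, chosen only for illustration.

```python
import logging
import sys


def build_stdout_logger(name: str) -> logging.Logger:
    """Return a logger whose only destination is stdout (no files)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:
        # StreamHandler flushes after every record, so events reach the
        # container runtime as they happen rather than in batches.
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_stdout_logger("checkout")  # hypothetical service name
    log.info("order accepted")
```

The application never opens a log file or decides where logs go; routing is left to whatever reads the container's output stream.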

Node-Level Collection

Because node log files are rotated and deleted, durable logging needs an agent that ships them off the node. The standard pattern is a DaemonSet (Topic 13) running a collector — Fluent Bit, Vector, Fluentd — on every node, tailing the container log files and forwarding them to a backend. This is node-level collection: one agent per node, gathering every Pod's stdout, decoupled from the applications themselves.
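To make the agent's job concrete, here is a rough sketch of the first step a collector performs on each tailed line: splitting it into timestamp, stream, and message. It assumes the node files use the CRI text format written by runtimes such as containerd and CRI-O (timestamp, stream, a P/F tag, then the message); nodes using a different log driver would need a different parser. The names LogEntry and parse_cri_line are hypothetical, and this is not how Fluent Bit or Vector are implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogEntry:
    timestamp: str
    stream: str    # "stdout" or "stderr"
    partial: bool  # True when the runtime split a long line (tag "P")
    message: str


def parse_cri_line(line: str) -> Optional[LogEntry]:
    """Parse one line in the assumed CRI format; return None if it does not match."""
    parts = line.rstrip("\n").split(" ", 3)
    if len(parts) < 3 or parts[1] not in ("stdout", "stderr"):
        return None
    tag = parts[2].split(":")[0]  # "P" = partial line, "F" = full line
    if tag not in ("P", "F"):
        return None
    message = parts[3] if len(parts) == 4 else ""
    return LogEntry(parts[0], parts[1], tag == "P", message)


if __name__ == "__main__":
    # Sample line made up for illustration.
    sample = "2024-05-01T12:00:00.123456789Z stdout F order accepted"
    print(parse_cri_line(sample))
```

A real agent adds the parts this sketch omits: following rotated files, remembering read offsets across restarts, attaching Pod metadata, and batching sends to the backend.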

Where logs flow

app container  ->  stdout/stderr
runtime        ->  /var/log/.../*.log on the node
node agent     ->  (DaemonSet, e.g. Fluent Bit) tails and ships
backend        ->  Loki / Elasticsearch / cloud logging

Backends and Structure

The agent forwards to a log backend — Loki, the Elasticsearch/OpenSearch stack, or a cloud logging service — where logs are stored, indexed, and searched. What makes them useful there is structured logging: emitting JSON with consistent fields (level, request ID, tenant) rather than free text, so you can filter and correlate. Adding the request or trace ID lets you tie a log line to a distributed trace (Topic 47).
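One way to emit structured logs, sketched here with Python's standard logging module, is a formatter that renders each record as a single JSON object per line. The field names request_id and tenant, the logger name "api", and the sample values are assumptions for illustration; use whichever fields your backend indexes.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    # Hypothetical correlation fields, passed through logging's `extra`.
    EXTRA_FIELDS = ("request_id", "tenant")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.EXTRA_FIELDS:
            value = getattr(record, field, None)
            if value is not None:
                entry[field] = value
        return json.dumps(entry)


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("api")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

# Every line carries the same fields, so the backend can filter on them.
logger.info("payment authorized", extra={"request_id": "req-123", "tenant": "acme"})
```

Because each line is a self-contained JSON object, the backend can filter by level or tenant and join on request_id without parsing free text.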

What You Lose When a Pod Dies

The crucial implication: anything written inside the container's filesystem, or held only in the node's rotated files, is lost when the Pod is deleted or rescheduled. So never write application logs to files in the container; always log to stdout and rely on the collection pipeline. For debugging a crashed container, kubectl logs --previous shows the prior instance's logs if they are still on the node — another reason the shipping pipeline, not the node, is your durable record.

In-cluster vs shipped logs

Node files / kubectl logs — ephemeral — rotated and deleted, gone when the Pod/node is gone. Fine for live debugging only.

Shipped to a backend — durable, searchable, correlatable. The actual logging system you must build.

Common Mistakes

Writing application logs to files inside the container, which vanish on restart or reschedule.
Assuming kubectl logs is durable — it reads node files that are rotated and deleted.
No node log agent, so logs never leave the node and are lost on Pod deletion.
Emitting unstructured free-text logs that can't be filtered or correlated at scale.
Logging secrets or PII to stdout, where the whole pipeline then carries them.

Best Practices

Log to stdout/stderr and let a node-level agent ship logs to a backend.
Run the collector as a DaemonSet so every node's Pods are covered.
Emit structured (JSON) logs with consistent fields, including a request/trace ID.
Treat the backend as the durable record; node files and kubectl logs are for live debugging.
Keep secrets and PII out of logs, since the pipeline propagates whatever is emitted.
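As a last line of defence for the final practice above, a logging filter can mask obviously sensitive values before they reach stdout. This is only a sketch: the key names in the pattern (password, token, api_key) are hypothetical, and pattern matching will miss secrets that do not follow a key=value shape, so it supplements rather than replaces not logging them in the first place.

```python
import logging
import re
import sys

# Hypothetical key names; extend to match what your services actually log.
_SENSITIVE = re.compile(r"(?i)\b(password|token|api_key)=(\S+)")


class RedactingFilter(logging.Filter):
    """Mask values of sensitive key=value pairs in the rendered message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message first so values passed as args are covered too.
        record.msg = _SENSITIVE.sub(r"\1=[REDACTED]", record.getMessage())
        record.args = ()
        return True  # keep the record, now with masked values


if __name__ == "__main__":
    handler = logging.StreamHandler(sys.stdout)
    handler.addFilter(RedactingFilter())
    log = logging.getLogger("auth")  # hypothetical service name
    log.setLevel(logging.INFO)
    log.addHandler(handler)
    log.info("login user=%s password=%s", "bob", "hunter2")
```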

Related

DaemonSets: how the log agent runs on every node (Topic 13)
Tracing: correlate logs with traces via IDs (Topic 47)
Cloud logging: CloudWatch Logs, Cloud Logging, Azure Monitor as backends

Knowledge Check

Where should a containerized application send its logs in Kubernetes?

To stdout/stderr, leaving collection and routing to the platform
To a rotated log file on a writable mount path inside the container
Directly into the etcd key-value store as records
To the API server via the kubectl logs endpoint

Why is kubectl logs not a durable logging solution?

It reads node-local files that are rotated and deleted, and vanish when the Pod/node is gone
It only shows the last single line of container output
It requires full cluster-admin RBAC just to read any logs
It transparently encrypts the stored log records at rest so they can no longer be searched or read back

How are container logs typically shipped off nodes?

A DaemonSet log agent (Fluent Bit/Vector) tails node log files and forwards to a backend
The API server batches and pushes them to object storage
Each Pod uploads its own rotated log files as extra layers to the container image registry
The kubelet emails the rotated log files to the operator
