/proc and Process Introspection
/proc is a virtual filesystem the kernel synthesizes on demand — nothing on disk backs it. Read a file under it and the kernel runs code to produce the bytes at that instant: /proc/loadavg computes the load averages, /proc/meminfo reads the memory accounting, and /proc/<pid>/status dumps the live state of a process. Every per-process tool you already use — ps, top, htop, pgrep — is a parser over /proc, not a separate source of truth.
The operational consequence is that you never need a special agent to inspect a running process. If ps shows something you do not believe, you can read the raw kernel numbers yourself with cat and grep, and they will agree because they come from the same place. When a process is wedged, hung in uninterruptible sleep, or leaking file descriptors, /proc/<pid>/ is where the answer lives — and it is readable with the same tools you use on any text file.
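The "synthesized on demand" point can be seen directly by comparing what stat reports for a /proc file with what a read returns. The following is a minimal sketch; the helper name proc_size_vs_bytes is made up for illustration, and `stat -c` assumes GNU coreutils or busybox.

```shell
# Sketch: a /proc file has no stored size, only bytes generated on read.
# proc_size_vs_bytes is a hypothetical helper name, not a standard tool.
proc_size_vs_bytes() {
    local file=$1 stat_size read_bytes
    # Size as reported by stat, without reading the file.
    stat_size=$(stat -c %s "$file")
    # Bytes the kernel actually produces; cat forces a real read.
    read_bytes=$(cat "$file" | wc -c | tr -d ' ')
    echo "stat=$stat_size read=$read_bytes"
}

proc_size_vs_bytes /proc/meminfo
```

On a typical Linux host the stat figure is 0 while the read figure is not, which is why tools that check the size before reading can wrongly treat these files as empty.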
The Per-Process Directory
Every running process gets a directory at /proc/<pid>/, created when it starts and removed the moment it exits. Inside are the files that describe everything the kernel knows about it. status gives a human-readable summary — state, UIDs, memory (VmRSS, VmSize), thread count, and the Cpus_allowed affinity mask. cmdline holds the argument vector with NUL separators, which is why cat /proc/<pid>/cmdline looks run-together and needs tr '\0' ' ' to read. environ exposes the environment the process was launched with — a place secrets leak if you pass them as variables.
```shell
# What is PID 1432 actually running, and where from?
# (sudo must wrap the read itself; a shell redirect would run unprivileged)
sudo cat /proc/1432/cmdline | tr '\0' ' '; echo
# cwd and exe are symlinks to the working dir and the on-disk binary
sudo ls -l /proc/1432/cwd /proc/1432/exe
```
Three symlinks are worth memorizing. cwd points at the process's current working directory, exe at the executable it was launched from, and root at its root directory (different inside a container or a chroot). If a deleted binary is still running, readlink /proc/<pid>/exe shows the path with a trailing (deleted) — the single fastest way to spot a service running stale code after an upgrade.
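The checks above can be bundled into one small function. This is a sketch rather than a replacement for ps: the name proc_describe is hypothetical, and for another user's process the symlinks are only readable as root.

```shell
# Sketch: summarize what a PID is running, using only /proc.
# proc_describe is a hypothetical helper name, not a standard tool.
proc_describe() {
    local pid=$1 dir="/proc/$1" link target
    [ -d "$dir" ] || { echo "no such process: $pid" >&2; return 1; }

    # cmdline is NUL-separated; turn the separators into spaces.
    printf 'cmdline: %s\n' "$(tr '\0' ' ' < "$dir/cmdline")"

    # cwd, exe and root are symlinks; readlink fails without permission.
    for link in cwd exe root; do
        target=$(readlink "$dir/$link" 2>/dev/null) || target='(unreadable)'
        printf '%s: %s\n' "$link" "$target"
    done

    # A trailing " (deleted)" on exe means the on-disk binary is gone.
    case $(readlink "$dir/exe" 2>/dev/null) in
        *' (deleted)') echo 'warning: running a deleted binary' ;;
    esac
    return 0
}

# Example: describe the current shell.
proc_describe $$
```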
File Descriptors and Open Files
/proc/<pid>/fd/ is a directory of symlinks, one per open file descriptor, each pointing at what that descriptor actually references: a regular file, a socket (socket:[12345]), a pipe, or /dev/null. This is the authoritative answer to "what does this process have open?" and it is how you diagnose descriptor leaks: count the entries over time and watch the number climb toward the process's RLIMIT_NOFILE, visible in /proc/<pid>/limits.
```shell
# How many fds is the process holding, and against what limit?
sudo ls /proc/1432/fd | wc -l
grep 'Max open files' /proc/1432/limits
# lsof and ss read exactly this directory under the hood
sudo lsof -p 1432
```
A deleted file that a process still holds open keeps consuming disk space until that descriptor closes — the inode is unlinked but not freed. The classic symptom is df reporting a full disk while du finds nothing. The fix is in /proc: find the descriptor under fd/ whose target ends in (deleted), and either restart the holder or truncate through the descriptor with : > /proc/<pid>/fd/<n>.
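Finding those descriptors by hand means reading every symlink under fd/. A short loop does the same thing. This is a sketch under the assumption that you run it as root or as the process owner; deleted_fds is a hypothetical helper name.

```shell
# Sketch: list descriptors of one PID whose target has been unlinked.
# deleted_fds is a hypothetical helper name, not a standard tool.
deleted_fds() {
    local pid=$1 fd target
    for fd in /proc/"$pid"/fd/*; do
        # readlink fails if the descriptor closed (or is unreadable).
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            # Print "<fd number><TAB><target>" for unlinked files.
            *' (deleted)') printf '%s\t%s\n' "${fd##*/}" "$target" ;;
        esac
    done
}

# Example: check the current shell.
deleted_fds $$
```

Each output line gives the descriptor number you would truncate through, plus the path the file had before it was unlinked.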
System-Wide Kernel State
Above the per-process directories, /proc exposes the whole machine. /proc/cpuinfo and /proc/meminfo are what every monitoring agent scrapes; /proc/loadavg carries the 1-, 5-, and 15-minute load plus the running/total task counts; /proc/mounts is the live mount table; /proc/net/ backs the legacy netstat and serves as a fallback for ss, which normally queries the kernel over netlink. These are the same numbers the kernel reports to free, uptime, and vmstat, so reading them directly removes any doubt about a tool's formatting or rounding.
| Path | What it reports | Tool that parses it |
|---|---|---|
| /proc/loadavg | Load averages, task counts | uptime, top |
| /proc/meminfo | Memory and swap accounting | free |
| /proc/<pid>/status | Per-process state, RSS, UIDs | ps, top |
| /proc/<pid>/fd/ | Open file descriptors | lsof |
| /proc/net/tcp | TCP sockets and states | ss, netstat |
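As an illustration of how little parsing these files need, here is a sketch that reads the load averages and two memory fields without uptime or free. The helper name meminfo_kb is hypothetical; MemAvailable is assumed to be present, which holds on reasonably recent kernels.

```shell
# Sketch: read the numbers uptime and free report, straight from /proc.
# Layout of /proc/loadavg: 1m 5m 15m running/total last_pid
read -r load1 load5 load15 tasks last_pid < /proc/loadavg
echo "load: $load1 $load5 $load15  tasks (running/total): $tasks"

# /proc/meminfo lines look like "Key:   value kB"; select one by name.
# meminfo_kb is a hypothetical helper name.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' /proc/meminfo
}

echo "MemTotal: $(meminfo_kb MemTotal) kB"
echo "MemAvailable: $(meminfo_kb MemAvailable) kB"
```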
Tunable State: /proc/sys and sysctl
One branch of /proc is writable. /proc/sys/ exposes hundreds of kernel tunables as files, and writing to one changes kernel behavior immediately — echo 1 > /proc/sys/net/ipv4/ip_forward turns on routing this instant. The catch is that writes here are not persistent: a reboot reverts everything to the boot-time defaults. The supported way to read and set these is sysctl, which is just a typed front-end over the same tree.
```shell
# Ephemeral (lost on reboot) vs persistent
sudo sysctl -w net.ipv4.ip_forward=1
# Persist across reboots on Debian/Ubuntu:
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-forward.conf
sudo sysctl --system   # reload all *.conf, including /etc/sysctl.d/
```
On Debian and Ubuntu, drop tunables into /etc/sysctl.d/*.conf rather than editing /etc/sysctl.conf directly, so package upgrades do not clobber your changes; sysctl --system applies the whole directory in order. Red Hat systems use the same sysctl command and the same /etc/sysctl.d/ convention — this part of /proc does not diverge between distributions.
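The relationship between a sysctl key and its file is mechanical: the dots become slashes under /proc/sys. The sketch below makes that mapping explicit. sysctl_path and sysctl_read are hypothetical helper names, and the mapping assumes plain keys with no literal dot inside a single component (some interface names contain one).

```shell
# Sketch: map a sysctl key to its file under /proc/sys and read it.
# sysctl_path and sysctl_read are hypothetical helper names.
sysctl_path() {
    # Dots in the key become slashes in the path.
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_read() {
    # Reading the file is equivalent to `sysctl -n <key>` for plain keys.
    cat "$(sysctl_path "$1")"
}

# Example: prints /proc/sys/net/ipv4/ip_forward
sysctl_path net.ipv4.ip_forward
```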
/proc — process and kernel state. Originally per-process introspection (the /proc/<pid>/ directories), it also carries global counters and the writable /proc/sys tunables. Reach for it for anything about a running process or for sysctl-style settings.
/sys — the sysfs view of the device and driver model: block devices, network interfaces, cgroups, kernel modules. Newer and more structured than /proc; hardware and driver tunables (a NIC's queue length, a disk's scheduler) live here, not under /proc.
/dev — device nodes you do I/O through (/dev/sda, /dev/null), managed by udev. It is for reading and writing devices, not for reading their metadata — that is what /sys is for.
- Writing tunables to /proc/sys/... with echo and expecting them to survive a reboot. They do not. Persist them in /etc/sysctl.d/*.conf and apply with sysctl --system, or the change silently reverts on the next restart.
- Passing secrets as environment variables. Any process that can read /proc/<pid>/environ (the owner, and root) can recover them long after launch. Use a secrets file or systemd LoadCredential= instead.
- Running cat /proc/<pid>/cmdline and concluding the process has "no arguments" because the output ran together. The separators are NUL bytes, not spaces. Pipe through tr '\0' ' ' to read it.
- Chasing a "full disk" with du when df and du disagree. The space is held by a deleted-but-open file. Find it under /proc/<pid>/fd/ (target ends in (deleted)); the inode frees only when the descriptor closes.
- Reading PID-specific files without sudo and trusting an empty result. environ, fd/, and exe for another user's process are restricted, and hidepid= mount options can hide the whole directory, cmdline included.
- Scripting against a PID across a delay without confirming identity. PIDs are reused, so /proc/<pid>/ may now describe a different process. Re-check cmdline or the start time before acting on a stale PID.
- Treating /proc sizes as bytes on disk. It is synthesized, so files report size 0 and the real data appears only when read. Tools that stat before reading will think the files are empty.
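The PID-reuse check mentioned in the list can be sketched with the start time the kernel records in /proc/<pid>/stat (field 22, in clock ticks since boot). proc_starttime is a hypothetical helper name; the parsing strips the command name first because that field can itself contain spaces and parentheses.

```shell
# Sketch: detect PID reuse by comparing the kernel's start time for a PID.
# proc_starttime is a hypothetical helper name, not a standard tool.
proc_starttime() {
    local stat rest
    # Read the whole stat line; this fails if the process is gone.
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
    # comm (field 2) is wrapped in parentheses and may contain spaces,
    # so drop everything up to the last ") ". The rest starts at field 3.
    rest=${stat##*") "}
    # starttime is field 22 overall, so field 20 of the remainder.
    set -- $rest
    echo "${20}"
}

# Example: remember the identity of a PID, then re-check before acting.
pid=$$
seen=$(proc_starttime "$pid")
# ... time passes ...
if [ "$(proc_starttime "$pid")" = "$seen" ]; then
    echo "PID $pid is still the same process"
else
    echo "PID $pid exited or was reused" >&2
fi
```

Two processes that receive the same PID at different times will have different start times, so a matching value is a reasonable identity check within a single boot.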
- Confirm what a service is really running with readlink /proc/<pid>/exe: a trailing (deleted) means it is executing stale code and needs a restart after the upgrade.
- Diagnose descriptor leaks by watching ls /proc/<pid>/fd | wc -l against grep 'Max open files' /proc/<pid>/limits before the process hits its RLIMIT_NOFILE ceiling.
- Set kernel tunables with sysctl -w for a live test, then persist the ones you keep in /etc/sysctl.d/*.conf and run sysctl --system so upgrades cannot clobber them.
- Read /proc/<pid>/status for the fields ps truncates (State, VmRSS, Threads, and Cpus_allowed_list) when you need exact per-process numbers, not a formatted column.
- Reach for /sys, not /proc, for hardware and driver tunables: disk schedulers, NIC settings, and cgroup limits live in sysfs.
- Tighten exposure on multi-tenant hosts by mounting /proc with hidepid=2 (via a systemd drop-in or fstab) so users cannot enumerate other users' processes and command lines.
- Prefer the maintained tools (lsof, ss, htop, sysctl) for routine work, and drop to raw /proc only when you need to verify them or capture exact bytes for a bug report.
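For the status fields named above, a short awk filter is usually enough. This is a sketch; proc_status_fields is a hypothetical helper name, and kernel threads will simply omit VmRSS because they have no user memory.

```shell
# Sketch: print selected fields from /proc/<pid>/status as key=value.
# proc_status_fields is a hypothetical helper name, not a standard tool.
proc_status_fields() {
    local pid=$1
    # Lines look like "Key:<TAB>value"; split on the colon and whitespace.
    awk -F':[ \t]*' '
        $1 == "State" || $1 == "VmRSS" || $1 == "Threads" || $1 == "Cpus_allowed_list" {
            printf "%s=%s\n", $1, $2
        }
    ' "/proc/$pid/status"
}

# Example: inspect the current shell.
proc_status_fields $$
```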
Knowledge Check
You write echo 1 > /proc/sys/net/ipv4/ip_forward and routing starts working, but after a reboot it is off again. Why?
- Writes to /proc/sys change live kernel state only; persistence requires an entry in /etc/sysctl.d/*.conf applied at boot
- The write failed silently because /proc is read-only and needs to be remounted read-write first
- A reboot is required for any /proc/sys write to take effect, so the first change never actually applied
- ip_forward is controlled by ufw, which resets it to zero at startup and overwrites any value written directly into /proc
df reports the root filesystem 100% full, but du -sh / accounts for far less. What does /proc let you find?
- A deleted-but-still-open file: its descriptor under /proc/<pid>/fd/ targets a path ending in (deleted), and the space frees only on close
- A corrupted inode table whose lost blocks only an online fsck reading the mount list in /proc/mounts can reclaim and return to the filesystem's free count
- Cached memory pages that /proc/meminfo counts toward disk usage so that df sees them as full blocks while du cannot
- A reserved-blocks setting under /proc/sys/vm that withholds the missing space from du's total while df still counts it as occupied
Why is passing a credential as an environment variable risky even after the process has been running for hours?
- The launch environment stays readable at /proc/<pid>/environ for the process owner and root for the life of the process
- Environment variables are written to a real /proc file on disk at launch and stay there until the next reboot clears the directory
- Any unprivileged user on the system can read another user's environ file directly, with no elevated privileges required
- The kernel logs every environment variable to /proc/kmsg when the process starts
When should you read /sys rather than /proc?
- For device and driver model state: block-device schedulers, NIC settings, cgroup limits, loaded modules
- For per-process file descriptors and command lines, which moved from /proc to /sys in recent kernels
- For load averages and memory accounting figures, which /proc no longer exposes and which now live only under sysfs
- For writable kernel tunables, since /proc/sys is now read-only and superseded by /sys