The Filesystem Hierarchy
Topic 06


Linux presents one rooted tree starting at / — there are no drive letters. Every disk, partition, network share, and virtual kernel filesystem is grafted onto a directory somewhere inside that single tree by mounting it. A second SSD does not become "D:"; it becomes whatever directory you mount it on, say /var/lib/postgresql, and from then on writing to that path writes to that disk.

The Filesystem Hierarchy Standard (FHS) fixes what belongs in each top-level directory so that software, packagers, and administrators agree on locations. The operational payoff is concrete: you know configuration lives in /etc, changing data lives in /var, and program files live in /usr — which tells you what to back up, what is safe to mount read-only, and what gets wiped on reboot.

The Single Tree and Mount Points

A mount point is just an existing directory that becomes the entry point for another filesystem. Before you mount a device, it is an ordinary directory on the parent filesystem, usually empty; after mount /dev/sdb1 /data, every path under /data is served by that device instead, and anything the directory held beforehand stays hidden until the device is unmounted. The path is the same whether the bytes come from the boot disk, a separate NVMe drive, an NFS export, or RAM, and that uniformity is the whole point.

One namespace can therefore span many backing devices, each chosen for its own reasons: / on the system disk, /home on a large drive you can grow independently, /boot on a small unencrypted partition the bootloader can read. Run findmnt or df -hT to see the whole map at once — which directory is served by which device, with which filesystem type.

# which device backs each path, and the filesystem type
findmnt -t ext4,xfs,vfat
# or the classic view, with type and free space
df -hT
# abridged sample output (some columns omitted):
/dev/sda2   ext4   234G   /
/dev/sda1   vfat   511M   /boot/efi
/dev/sdb1   xfs    1.8T   /home
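To answer the same question for a single path, df can be wrapped in a small helper. This is a minimal sketch: mount_point_of is a hypothetical name, and it assumes mount points without spaces, because it splits the POSIX-format output on whitespace.

```shell
#!/bin/sh
# Print the mount point of the filesystem that serves a given path.
# mount_point_of is an illustrative helper, not a standard command.
mount_point_of() {
    # -P selects the POSIX output format: a header line, then one line
    # for the filesystem, with the mount point in the sixth column.
    df -P -- "$1" | awk 'NR == 2 { print $6 }'
}

mount_point_of /
mount_point_of /tmp
```

Where util-linux is installed, findmnt --target PATH reports the same mapping along with the device and filesystem type.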

Core Directories and Their Purpose

The top level looks crowded, but each entry has a defined job. The split that matters most is the three-way division between read-only program data (/usr), per-host configuration (/etc), and changing runtime state (/var). Get those three straight and the rest falls into place.

| Directory | Holds | Notes |
|---|---|---|
| /etc | System-wide configuration, host-specific, all text | No binaries. The first thing to back up. |
| /var | Variable state: logs, mail, spools, databases, caches | Grows over time; the usual disk-fill culprit (/var/log). |
| /usr | Read-only program data: binaries, libraries, headers, docs | Shareable and mountable ro; the bulk of the OS. |
| /home | Per-user home directories | Often a separate partition so a full /home can't fill /. |
| /opt | Self-contained third-party software | Where a vendor tarball that owns its own subtree belongs. |
| /srv | Data served by this host (web roots, FTP, repos) | Site-defined layout; an intentional home for service data. |
| /tmp | Temporary scratch for any user | World-writable with the sticky bit; cleared on reboot or by a timer. |
| /root | The root user's home directory | Not a place for application data. |

On Debian and Ubuntu, distribution packages install binaries into /usr/bin and /usr/sbin, libraries into /usr/lib, and drop their default config under /etc. Anything you build or vendor by hand should go to /usr/local (for compiled-from-source tools) or /opt (for a self-contained vendor bundle), so a distribution upgrade never overwrites it and dpkg never claims to own it.
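The division of labour described above can be written down as a lookup. The sketch below is only an illustration: fhs_role is a hypothetical helper, and its labels are this lesson's summaries rather than anything defined by the FHS or by dpkg.

```shell
#!/bin/sh
# Map a path to the role its top-level directory plays.
# fhs_role is a hypothetical helper; the labels paraphrase this lesson.
fhs_role() {
    case "$1" in
        # /usr/local must be matched before the broader /usr pattern.
        /usr/local|/usr/local/*) echo "locally built software" ;;
        /usr|/usr/*)             echo "packaged program data" ;;
        /etc|/etc/*)             echo "host configuration" ;;
        /var|/var/*)             echo "variable state" ;;
        /opt|/opt/*)             echo "self-contained vendor software" ;;
        /home|/home/*)           echo "user files" ;;
        /srv|/srv/*)             echo "served data" ;;
        /tmp|/tmp/*)             echo "temporary scratch" ;;
        *)                       echo "unclassified" ;;
    esac
}

fhs_role /usr/local/bin/mytool
fhs_role /usr/bin/ls
fhs_role /etc/ssh/sshd_config
```

The ordering of the patterns carries the point of the paragraph: /usr/local sits inside /usr on disk but is treated as the administrator's territory, not the package manager's.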

Virtual Filesystems: /proc, /sys, /dev, /run

Four top-level directories hold no disk blocks at all. /proc is a window into the kernel and into every running process: /proc/meminfo, /proc/cpuinfo, and /proc/<pid>/ are generated on read, not stored. /sys exposes the device and kernel-object model in a structured tree. /dev holds device nodes like /dev/sda and /dev/null, created at runtime by the kernel's devtmpfs and managed by udev. /run is a tmpfs in RAM for early-boot and runtime state such as PID files and sockets.

Because these are synthesized, their files lie about themselves: most report a size of 0 even when cat returns kilobytes, and copying them rarely does what you expect. They also vanish on reboot, which is exactly why writing a tuning value into /proc/sys by hand is temporary — it survives until the next boot and no longer. Persist kernel parameters in /etc/sysctl.d/*.conf instead, and the value is reapplied every boot.
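A drop-in file is a single line of text per parameter. The following is a minimal sketch, not a definitive tool: persist_sysctl and the file name 99-local.conf are assumptions made for illustration, and the function overwrites that file each time it runs. The optional directory argument exists so the example can be tried without root.

```shell
#!/bin/sh
# Write one kernel parameter to a sysctl drop-in file.
# Usage: persist_sysctl KEY VALUE [DIR]
# DIR defaults to /etc/sysctl.d, which requires root; passing another
# directory lets you try the function without touching the system.
persist_sysctl() {
    key=$1 value=$2 dir=${3:-/etc/sysctl.d}
    # 99-local.conf is an arbitrary, illustrative file name.
    printf '%s = %s\n' "$key" "$value" > "$dir/99-local.conf"
}

# Dry run into a scratch directory:
scratch=$(mktemp -d)
persist_sysctl net.ipv4.ip_forward 1 "$scratch"
cat "$scratch/99-local.conf"
rm -rf "$scratch"
```

Written to the real /etc/sysctl.d, the setting takes effect at the next boot, or immediately after running sysctl --system as root.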

The /usr Merge and Where Binaries Live

Historically /bin, /sbin, and /lib sat at the root, separate from /usr/bin, /usr/sbin, and /usr/lib — the root copies existed so a minimal set of tools worked before /usr was mounted. The initramfs made that split obsolete, and modern Debian and Ubuntu ship a "merged /usr": /bin is now a symbolic link to /usr/bin, /sbin to /usr/sbin, and /lib to /usr/lib. Fedora and RHEL completed the same merge years earlier.

In practice this means /bin/sh and /usr/bin/sh resolve to the identical file, so a script's shebang works regardless of which path it names. The convention behind bin versus sbin still holds: sbin directories carry administrative tools meant for root. What changes is the boot story. On a merged system /bin and /usr/bin can no longer sit on different partitions, and because /bin, /sbin, and /lib now point into /usr, a separate /usr partition cannot be mounted late from /etc/fstab; the initramfs must mount it during early boot.
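Whether a given system is merged can be checked by testing for those symbolic links. This is a sketch under one assumption: it treats the layout as merged only when /bin, /sbin, and /lib are all symlinks. is_merged_usr is a hypothetical helper name, and the optional argument lets it inspect a tree other than the live root.

```shell
#!/bin/sh
# Report whether a root tree uses the merged /usr layout.
# Usage: is_merged_usr [ROOT]   (ROOT defaults to /)
is_merged_usr() {
    root=${1:-/}
    for d in bin sbin lib; do
        # On a merged system each of these is a symlink into usr/.
        [ -L "${root%/}/$d" ] || return 1
    done
    return 0
}

if is_merged_usr; then
    echo "merged /usr: /bin -> $(readlink /bin)"
else
    echo "not a merged /usr layout"
fi
```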

Ephemeral versus Persistent Locations

Two directories are designed to disappear. /run is RAM-backed and recreated empty every boot. /tmp is cleared either at boot or, on systemd systems with systemd-tmpfiles, by a timer that removes files untouched for 10 days under upstream systemd's default (distributions can override the age); on some setups /tmp is itself a tmpfs in RAM. Treat anything in either as gone after a reboot: a download you parked in /tmp overnight may not be there in the morning.
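To see which files an age-based cleanup would be likely to target, find can list entries by age. This is an approximation, not a reimplementation: it looks at modification time only, whereas systemd-tmpfiles weighs more than one timestamp. stale_files is a hypothetical helper name.

```shell
#!/bin/sh
# List regular files under DIR whose modification time is more than
# DAYS full 24-hour periods in the past.
# Usage: stale_files DIR DAYS
stale_files() {
    # -xdev keeps find on the filesystem that DIR itself lives on.
    find "$1" -xdev -type f -mtime +"$2"
}

# Unreadable subdirectories are expected in /tmp, so errors are hidden.
stale_files /tmp 10 2>/dev/null || true
```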

Persistence lives in /etc (configuration), /var (logs, databases, spools), /home (user files), and /srv or /opt (service and third-party data). A correct backup of a server is essentially /etc plus the data directories under /var, /srv, and /home. You do not back up the packaged contents of /usr, because reinstalling the packages reproduces them; the exception is /usr/local, which holds whatever you built by hand.
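That scope translates into a short tar invocation. The sketch below is illustrative only: backup_persistent is a hypothetical helper, and it archives the whole of each persistent directory it finds, where a real server would narrow /var to its data directories.

```shell
#!/bin/sh
# Archive the persistent directories of a root tree, leaving out what a
# reinstall or reboot recreates.
# Usage: backup_persistent ROOT ARCHIVE
backup_persistent() {
    root=$1 archive=$2
    set --   # reuse the positional parameters as the list of directories
    for d in etc var srv home; do
        if [ -d "${root%/}/$d" ]; then set -- "$@" "$d"; fi
    done
    [ "$#" -gt 0 ] || return 1
    # -C makes the stored paths relative to ROOT (etc/..., var/...).
    tar -C "${root%/}/" -czf "$archive" "$@"
}

# Demonstration on a scratch tree rather than the live system:
demo=$(mktemp -d)
mkdir -p "$demo/etc" "$demo/usr/bin" "$demo/var/lib"
echo "127.0.0.1 localhost" > "$demo/etc/hosts"
backup_persistent "$demo" "$demo.tar.gz"
tar -tzf "$demo.tar.gz"
rm -rf "$demo" "$demo.tar.gz"
```

Building the list from directories that actually exist keeps tar from failing on a host that has no /srv, for example.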

/etc vs /var vs /usr

/etc: host-specific configuration, all of it text, no binaries. It changes only when an administrator or a package edits a config. Back it up; version-control it if you can.

/var: state that the running system writes, such as logs, mail spools, print queues, package caches, and many databases. It grows continuously, so it is the directory that fills a disk and the one to put on its own partition when uptime matters.

/usr: read-only program data installed by the package manager, including binaries, shared libraries, headers, and docs. Apart from /usr/local, nothing here is host-specific, so it can be shared or mounted read-only, and the packaged files need no backup because a reinstall recreates them.

Common Mistakes
  • Dropping application data into / or /root — it pollutes the root filesystem, fills the partition that must never fill, and is invisible to a normal /etc+/var backup.
  • Writing runtime state under /usr — on a hardened host /usr is mounted read-only, so the write fails outright; even when it succeeds, a package upgrade can overwrite it.
  • Parking files you care about in /tmp or /run and treating them as durable. /run is emptied on every boot, /tmp may be too, and with upstream systemd defaults systemd-tmpfiles deletes stale /tmp entries after 10 days even without a reboot.
  • Editing a value in /proc/sys for a permanent tuning change — it reverts on the next boot. Only /etc/sysctl.d/*.conf makes it stick.
  • Treating /proc and /sys files as ordinary files — most report size 0, are generated on read, and cannot be copied or archived meaningfully.
  • Installing hand-built software into /usr/bin instead of /usr/local or /opt — the next distribution upgrade or a stray package can clobber it, and dpkg -S shows no owner when you later try to trace the file.
  • Letting /var/log share the root partition with no rotation, so that a runaway log fills / and a full root filesystem blocks logins or even keeps the system from booting cleanly.
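The size-0 behaviour mentioned in the list above is easy to observe directly. The sketch below compares the size a file reports with the number of bytes it yields when read; reported_vs_read is a hypothetical helper name.

```shell
#!/bin/sh
# Compare the size a file reports with the number of bytes it yields.
# Usage: reported_vs_read FILE   -> prints "<reported> <read>"
reported_vs_read() {
    # Field 5 of "ls -ln" is the size recorded in the file's metadata.
    reported=$(ls -ln -- "$1" | awk '{ print $5 }')
    # Reading through a pipe makes wc count the bytes actually delivered.
    bytes_read=$(cat -- "$1" | wc -c | tr -d ' ')
    echo "$reported $bytes_read"
}

# On Linux this typically prints 0 followed by a non-zero byte count.
if [ -r /proc/meminfo ]; then
    reported_vs_read /proc/meminfo
fi
```

For an ordinary file the two numbers match, which is the assumption that copy and archive tools rely on and that /proc and /sys break.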
Best Practices
  • Put configuration in /etc, changing state in /var, and self-contained third-party software in /opt — follow the FHS so any other admin can find things without a map.
  • Install compiled-from-source tools under /usr/local and vendor bundles under /opt, keeping them off the package manager's turf in /usr.
  • Give /var (or at least /var/log) its own partition or LVM volume on any server, so log or spool growth can never fill the root filesystem.
  • Persist kernel tunables in /etc/sysctl.d/*.conf and apply them with sysctl --system rather than echoing into /proc/sys.
  • Scope backups to /etc plus the data directories under /var, /srv, and /home; skip /proc, /sys, and /run, which a reboot recreates, and the packaged contents of /usr, which a reinstall recreates (keep /usr/local if you build software there).
  • Run findmnt or df -hT before adding storage so you know which device backs each path and where free space actually sits.
Comparable Tools
  • Windows: per-volume drive letters (C:, D:) plus fixed roles for Program Files and ProgramData; no single rooted tree
  • macOS: a Unix tree under / with the FHS-like core, plus /Applications and /System on a sealed read-only volume
  • BSD: FreeBSD's hier(7) layout, close to the FHS but with base system and ports split (/usr/local for everything from ports)

Knowledge Check

You set a value by writing to /proc/sys/net/ipv4/ip_forward and it works, but after a reboot it is back to the old value. Why?

  • /proc is a virtual filesystem regenerated each boot; persistent kernel tunables must be declared in /etc/sysctl.d/*.conf
  • The write needed root, and a non-root write is silently discarded at the next boot
  • /proc is mounted from a read-only partition, so the write landed in a memory overlay and was never actually committed back to the underlying disk
  • Kernel parameters reset unless you also run update-grub to bake them into the boot entry

For a server backup, why is it reasonable to exclude /usr while making sure to include /etc and parts of /var?

  • /usr is read-only program data a reinstall recreates exactly, while /etc holds host-specific config and /var holds changing state that nothing else can reproduce
  • /usr is too large to back up, whereas /etc and /var are always small enough to fit
  • /usr is a virtual filesystem with no real files, so there is nothing on disk to copy
  • Package binaries in /usr are signed and encrypted at install time and cannot be restored from a plain file-level backup, so reinstalling the packages is the only supported path

A vendor ships a self-contained application as a tarball that includes its own binaries, libraries, and config. Where does it belong, and why?

  • /opt — it owns its own subtree there, out of the package manager's /usr so a distribution upgrade can't overwrite it
  • /usr/bin — all executables must live there for the shell's PATH to find them
  • /var — third-party software is variable state and belongs with logs and spools
  • /tmp — third-party application bundles are meant to be staged and run from that scratch area, since it grants every user write access without sudo

On a modern Debian or Ubuntu system, what does the "merged /usr" actually mean for /bin?

  • /bin is a symbolic link to /usr/bin, so the two paths resolve to the same files and a separate /usr partition must be mounted by the initramfs during early boot
  • /bin still holds its own separate minimal toolset of statically linked rescue utilities, used during early boot before /usr is mounted from its partition
  • /bin was deleted, and scripts with #!/bin/sh shebangs must be rewritten to /usr/bin/sh
  • /bin now contains only statically linked binaries, while /usr/bin holds the dynamically linked ones
