Links — Hard and Symbolic
Topic 09


A link is a second name for a file, but the two kinds of link work on completely different layers. A hard link is another directory entry pointing at the same inode — the same on-disk object, with the same data blocks, permissions, and ownership. A symbolic link (symlink) is a tiny separate file whose contents are a path string; opening it makes the kernel follow that path to whatever is there now. The file you see in ls is really just a name in a directory mapped to an inode number, and links are how one inode acquires more than one name.

The distinction is not academic — it decides what survives a rename, a move across filesystems, and a delete. Hard links share an inode's reference count, so the data lives until the last name is removed; symlinks just store text and break silently when their target moves. Pick the wrong one and you get a dangling pointer that cat reports as "No such file or directory" even though ls shows the link sitting right there.

Inodes and Reference Counts

Every file is described by an inode: a numbered record holding the metadata (size, permissions, owner, timestamps) and the pointers to the data blocks. The filename is not in the inode; it lives in a directory, which is just a table mapping names to inode numbers. A hard link adds one more name in that table pointing at the same number, and increments the inode's link count. Delete is therefore better named unlink: it removes a name and decrements the count, and the kernel frees the blocks only when the count hits zero and no process still holds the file open.

You can watch the count directly. The number after the permissions in ls -l is the link count, and ls -i prints the inode number, which is the only reliable way to tell whether two names are the same file.

# Create a file and a hard link to it
echo "payload" > data.txt
ln data.txt data-hard.txt

# Same inode number, link count is now 2
ls -li data.txt data-hard.txt
1838221 -rw-r--r-- 2 root root 8 May 30 10:02 data-hard.txt
1838221 -rw-r--r-- 2 root root 8 May 30 10:02 data.txt

# Removing one name leaves the data; count drops to 1
rm data.txt
cat data-hard.txt   # still prints "payload"
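
The "no process still holds the file open" half of that rule can be sketched as well. This is a minimal illustration, assuming a POSIX shell with mktemp available; the workdir variable and the choice of descriptor 3 are arbitrary and not part of the original example.

```shell
# Sketch: data outlives its last name while a descriptor is still open.
workdir=$(mktemp -d)
echo "payload" > "$workdir/data.txt"

exec 3< "$workdir/data.txt"    # hold the file open on descriptor 3
rm "$workdir/data.txt"         # last name removed: link count is now 0

remaining=$(ls -A "$workdir")  # empty, no directory entry is left
content=$(cat <&3)             # the blocks are still readable through fd 3
exec 3<&-                      # closing the last descriptor lets the kernel free them

echo "names left: '$remaining', read via fd 3: '$content'"
rmdir "$workdir"
```

This is why deleting a large log file that a daemon still has open does not return the disk space until the daemon closes it or exits.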

Hard Links and Their Limits

Because a hard link is just another name for the same inode, the two names are indistinguishable — there is no "original" and no "copy". Edit through either name and both see the change, because there is only one set of data blocks. This is what makes hard links cheap: they cost a directory entry, not a duplicate of the data.
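
A short sketch of "one set of data blocks, two names" follows. It assumes GNU coreutils (stat -c is the GNU form); the file names a.txt and b.txt are placeholders.

```shell
# Sketch: a write through one name is visible through the other.
workdir=$(mktemp -d)
echo "first" > "$workdir/a.txt"
ln "$workdir/a.txt" "$workdir/b.txt"

# Append through b.txt, read back through a.txt
echo "second" >> "$workdir/b.txt"
seen_via_a=$(cat "$workdir/a.txt")

# device:inode identifies a file; equal pairs mean the same underlying object
id_a=$(stat -c '%d:%i' "$workdir/a.txt")
id_b=$(stat -c '%d:%i' "$workdir/b.txt")
links=$(stat -c '%h' "$workdir/a.txt")

echo "a.txt contains: $seen_via_a"
echo "ids: $id_a $id_b  link count: $links"
rm -r "$workdir"
```

Note that this holds for writes made in place (> and >>). Tools that save by writing a new file and renaming it over the old name, such as sed -i, give that name a fresh inode and leave the other name on the old data.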

The catch is that inode numbers are unique only within a single filesystem, so a hard link cannot cross a mount point. You cannot hard-link a file on / to a name on a separate /home partition — ln fails with "Invalid cross-device link". Most filesystems also forbid hard links to directories, because a directory with two parents would create cycles that find and fsck cannot reason about. The practical reach of a hard link is one filesystem, files only.
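
Both limits can be handled in a script by trying the hard link first and falling back to a symlink. The helper below, link_or_symlink, is a hypothetical name invented for this sketch, and it assumes GNU ln and realpath. A cross-device failure cannot be reproduced inside a single temporary directory, so the directory case stands in for the failure path here.

```shell
# Sketch: try a hard link, fall back to an absolute symlink if ln refuses.
link_or_symlink() {
    src=$1
    dst=$2
    if ln "$src" "$dst" 2>/dev/null; then
        echo "hard"
    else
        # Absolute target so the link resolves from any directory
        ln -s "$(realpath "$src")" "$dst" && echo "symlink"
    fi
}

workdir=$(mktemp -d)
echo "payload" > "$workdir/data.txt"
mkdir "$workdir/dir"

file_result=$(link_or_symlink "$workdir/data.txt" "$workdir/data-link")  # same filesystem
dir_result=$(link_or_symlink "$workdir/dir" "$workdir/dir-link")         # directories are refused

echo "file: $file_result, directory: $dir_result"
```

The fallback changes semantics (the symlink dangles if the source is removed), so a real script should probably report which kind of link it created rather than hide the difference.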

| Property | Hard link | Symbolic link |
| --- | --- | --- |
| Points at | The inode (same data blocks) | A path string |
| Own inode | No, shares the target's | Yes, a separate small file |
| Cross filesystems | No | Yes |
| Link to a directory | No (normal files only) | Yes |
| Survives target delete | Yes (data lives until last name gone) | No, becomes dangling |
| Survives target rename/move | Yes | Only if the path still resolves |

Symbolic Links and Resolution

A symlink is created with ln -s target linkname and stores target verbatim as its contents. When a process opens the link, the kernel reads that string and resolves it — relative to the link's own directory if the path is relative, or from / if absolute. Nothing checks that the target exists at creation time, which is why you can make a symlink to a path that is not there yet (common in /etc for config files installed later).
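
The "stored verbatim, checked by nobody" behaviour can be sketched in a scratch directory. The paths conf/app.conf and the port=8080 line are invented for the illustration.

```shell
# Sketch: a symlink can be created before its target exists.
workdir=$(mktemp -d)

# Relative target, stored as-is; it will be resolved from the link's directory
ln -s conf/app.conf "$workdir/app.conf"
stored=$(readlink "$workdir/app.conf")

# -e follows the link, so it is false while the target is missing
if [ -e "$workdir/app.conf" ]; then before="resolves"; else before="dangling"; fi

# Install the target later; the same link now works without being touched
mkdir "$workdir/conf"
echo "port=8080" > "$workdir/conf/app.conf"
after=$(cat "$workdir/app.conf")

echo "stored: $stored, before: $before, after: $after"
rm -r "$workdir"
```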

Two failure modes follow from "it is just a string". A symlink with a relative target breaks the moment you move the link to a different directory, because the relative path is now resolved from the wrong place. And a symlink to an absolute path breaks if the target is renamed or deleted, leaving a dangling link that ls -l still lists but cat cannot open. Use ls -l to see the arrow, readlink -f to chase the chain to the final real file, and stat to inspect the link itself versus its target.

# Symlink across a mount boundary — fine for symlinks, impossible for hard links
ln -s /mnt/data/app.conf /etc/app.conf

ls -l /etc/app.conf
lrwxrwxrwx 1 root root 18 May 30 10:02 /etc/app.conf -> /mnt/data/app.conf

# Follow the whole chain to the real target
readlink -f /etc/app.conf
/mnt/data/app.conf

# A dangling link still lists but will not open
rm /mnt/data/app.conf
cat /etc/app.conf   # cat: /etc/app.conf: No such file or directory
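
The other failure mode, a relative link that is moved, needs no root access to reproduce. A minimal sketch with invented directory names app and elsewhere:

```shell
# Sketch: moving a relative symlink changes what its stored path resolves to.
workdir=$(mktemp -d)
mkdir "$workdir/app" "$workdir/elsewhere"
echo "v1" > "$workdir/app/real.txt"

ln -s real.txt "$workdir/app/link.txt"     # relative target
ok=$(cat "$workdir/app/link.txt")          # resolves to app/real.txt

mv "$workdir/app/link.txt" "$workdir/elsewhere/link.txt"
stored=$(readlink "$workdir/elsewhere/link.txt")   # the string is unchanged

# The same string now means elsewhere/real.txt, which does not exist
if [ -e "$workdir/elsewhere/link.txt" ]; then state="resolves"; else state="dangling"; fi

echo "before move: $ok, stored: $stored, after move: $state"
rm -r "$workdir"
```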

Links in Practice on Servers

Symlinks run real production machinery, not just convenience shortcuts. systemd, on Debian, Ubuntu, and other distributions, wires unit activation almost entirely through them: enabling a unit with systemctl enable nginx creates a symlink under /etc/systemd/system/*.wants/ pointing at the unit file, and disabling it removes that symlink, so the "enabled" state is literally a symlink. The classic atomic-deploy pattern uses one too: ship the new release into /srv/app/releases/2026-05-30, then repoint /srv/app/current with ln -sfn, so requests see either the old tree or the new one and never a half-updated mix. Whether ln itself replaces the link in a single step depends on the implementation; the strict guarantee comes from creating the new link under a temporary name and renaming it over current with mv -T, because rename is atomic.
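
A sketch of the deploy swap in a scratch directory follows. It uses the rename-based variant (build the link under a temporary name, then mv -T it into place), which relies on rename being atomic; mv -T is a GNU coreutils option. The 2026-05-31 release, the current.tmp name, and the VERSION files are assumptions added for the illustration.

```shell
# Sketch: repoint "current" from one release to the next via rename.
workdir=$(mktemp -d)
mkdir -p "$workdir/releases/2026-05-30" "$workdir/releases/2026-05-31"
echo "old" > "$workdir/releases/2026-05-30/VERSION"
echo "new" > "$workdir/releases/2026-05-31/VERSION"

ln -s releases/2026-05-30 "$workdir/current"
before=$(cat "$workdir/current/VERSION")

# Create the new link beside the old one, then rename it over "current".
# -T makes mv treat "current" as the destination name, not as a directory to move into.
ln -s releases/2026-05-31 "$workdir/current.tmp"
mv -T "$workdir/current.tmp" "$workdir/current"

after=$(cat "$workdir/current/VERSION")
target=$(readlink "$workdir/current")

echo "before: $before, after: $after, current -> $target"
rm -r "$workdir"
```

Rolling back is the same two commands pointed at the previous release directory, which is why old releases are usually kept on disk for a while.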

The alternatives system (update-alternatives on Debian/Ubuntu, alternatives on Red Hat) is layered symlinks: /usr/bin/editor points into /etc/alternatives/editor, which points at the actual vim or nano binary, so the admin can switch the system default without touching any package. Hard links earn their place elsewhere: snapshot and backup tools such as rsync --link-dest hard-link unchanged files between nightly snapshots, so a hundred dated copies of a mostly-static tree consume the disk of roughly one.
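
The two-hop layout can be imitated in a scratch tree to see how readlink and readlink -f differ. This is only a model of the structure, not what update-alternatives does internally; the nano and vim files are plain stand-ins, and readlink -f assumes GNU coreutils.

```shell
# Sketch: a two-hop symlink chain and switching the middle hop.
workdir=$(mktemp -d)
mkdir -p "$workdir/usr/bin" "$workdir/etc/alternatives"
echo "nano stand-in" > "$workdir/usr/bin/nano"
echo "vim stand-in"  > "$workdir/usr/bin/vim"

ln -s "$workdir/usr/bin/nano" "$workdir/etc/alternatives/editor"
ln -s "$workdir/etc/alternatives/editor" "$workdir/usr/bin/editor"

one_hop=$(readlink "$workdir/usr/bin/editor")       # only the first link's stored string
final=$(readlink -f "$workdir/usr/bin/editor")      # the whole chain, fully resolved

# Switch the default by repointing the middle link; usr/bin/editor is untouched
ln -sfn "$workdir/usr/bin/vim" "$workdir/etc/alternatives/editor"
switched=$(cat "$workdir/usr/bin/editor")

echo "one hop: $one_hop"
echo "final:   $final"
echo "after switch: $switched"
```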

Common Mistakes
  • Trying to hard-link across a mount boundary — ln file /othermount/name fails with "Invalid cross-device link", and the fix is a symlink, not a retry.
  • Creating a relative symlink and then moving the link to another directory, so its target now resolves from the wrong place and dangles silently — ls still shows it, cat reports no such file.
  • Repointing a deploy symlink with plain ln -sf new current when current already points at a directory: without -n, ln follows the old link and creates the new link inside the old target instead of replacing it. Use ln -sfn.
  • Assuming a hard link is a backup. It shares the inode, so editing or truncating through either name changes the one and only copy of the data — there is no second copy to fall back to.
  • rsync-ing a tree without -l and losing its symlinks: rsync skips each one with a "skipping non-regular file" notice that is easy to miss, instead of recreating it. Pass -a (which includes -l) to preserve links as links.
  • Running a recursive chmod or chown through a symlink to a directory and accidentally changing permissions on the target tree; or deleting with a trailing slash (rm -r linkdir/), which can act on the target directory's contents rather than on the link.
  • Believing a symlink's permissions matter — the lrwxrwxrwx bits are ignored; access is governed entirely by the target's permissions, so locking down a symlink protects nothing.
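
The missing -n mistake from the list above is easy to reproduce safely. A minimal sketch, assuming GNU ln and invented release directories v1 and v2:

```shell
# Sketch: ln -sf versus ln -sfn when the destination is a symlink to a directory.
workdir=$(mktemp -d)
mkdir "$workdir/v1" "$workdir/v2"
ln -s v1 "$workdir/current"

# Without -n, ln follows "current" into v1/ and creates the new link there
ln -sf v2 "$workdir/current"
still=$(readlink "$workdir/current")   # unchanged
stray=$(readlink "$workdir/v1/v2")     # stray link left inside the old release

# With -n, "current" itself is treated as the name to replace
ln -sfn v2 "$workdir/current"
fixed=$(readlink "$workdir/current")

echo "after -sf:  current -> $still (stray v1/v2 -> $stray)"
echo "after -sfn: current -> $fixed"
rm -r "$workdir"
```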
Best Practices
  • Reach for a symlink by default — it crosses filesystems, links directories, and is visible in ls -l; keep hard links for the narrow cases where you need a true shared inode.
  • Swap deploy and version symlinks atomically with ln -sfn new current so concurrent readers never see a missing or half-updated path.
  • Use readlink -f (or realpath) in scripts to resolve a link to its final canonical path before acting on it, instead of assuming the name you were handed is the real file.
  • Confirm two names are the same file with ls -li and compare inode numbers — never infer identity from matching filenames or sizes.
  • Prefer absolute targets for system symlinks that must work regardless of the caller's directory, and relative targets only inside self-contained, relocatable trees.
  • Hunt for broken links before they bite with find /etc -xtype l, which lists symlinks whose target no longer exists.
  • Keep rsync archive mode (-a) for backups so links are preserved, and add --link-dest when you want hard-linked, space-efficient snapshots of unchanged files.
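
The broken-link hunt from the list above can be tried in a scratch directory before pointing it at /etc. This sketch assumes GNU find (-xtype is a GNU extension); the link names good and broken are placeholders.

```shell
# Sketch: find -xtype l reports only symlinks whose target is missing.
workdir=$(mktemp -d)
echo "ok" > "$workdir/target.txt"
ln -s target.txt  "$workdir/good"     # resolves
ln -s missing.txt "$workdir/broken"   # dangling

broken_links=$(find "$workdir" -xtype l)
all_links=$(find "$workdir" -type l | sort)

echo "broken: $broken_links"
echo "all symlinks:"
echo "$all_links"
```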
Comparable tools
  • Windows: NTFS supports hard links (mklink /H), symbolic links (mklink), and directory junctions; symlink creation needs admin or Developer Mode
  • macOS: ln/ln -s work the same on APFS; Finder "aliases" are a separate, higher-level construct the shell does not follow
  • BSD: the same POSIX ln semantics; FreeBSD and OpenBSD share the inode and symlink model

Knowledge Check

You need a link from a file on the root filesystem to a name on a separately mounted /data partition. Which works, and why?

  • A symbolic link — it stores a path string and resolves across mount points, whereas a hard link cannot reference an inode on another filesystem
  • A hard link — inode numbers are allocated from one global pool shared across every mounted filesystem, so any inode number is reachable from any other partition
  • Either one; the only difference between them is the syntax of ln
  • Neither — links of any kind cannot span two partitions

A file has two hard links. You rm one of them. What happens to the data?

  • It survives — rm just unlinks one name and decrements the inode's link count; the blocks are freed only when the count reaches zero
  • The data blocks are released immediately, and the second hard link is left as a dangling pointer to an inode that has already been reclaimed
  • The second link keeps working but now reads stale data from a freed inode
  • Both names are removed at once, since hard links always delete in pairs

Why does a relative symlink sometimes break when you move the link to a different directory, even though the target file never moved?

  • A symlink stores its target as a path string resolved relative to the link's own location, so moving the link changes where that relative path points
  • Moving a symlink rewrites its inode number, which invalidates the stored target
  • Relative symlinks cache the target's inode, which is lost on a move
  • The kernel resolves each symlink only once at boot and then caches the result, so a link that is moved while the system is running is not noticed until the next full reboot

Why is repointing a deploy symlink with ln -sfn preferred over copying a new release over the live directory?

  • The symlink swap is a single atomic operation, so no request ever observes a partially-updated tree
  • A symlink uses less disk space than the directory it points to
  • Copying a new release over a live directory is blocked outright by the kernel for as long as any file in that tree is still held open
  • Symlinks bypass the page cache, so the new release loads faster

What does ls -li let you determine that a plain ls -l does not?

  • Whether two names share the same inode — the only reliable test that they are the same underlying file
  • Whether a file is a symlink, by adding an arrow to the target
  • The full canonical path that a multi-hop symlink chain ultimately resolves to, with every intermediate link followed for you
  • Which filesystem a file lives on, by printing its mount point
