Topic 49

DNS Resolution

Networking/Services

DNS resolution turns a name like api.example.com into an IP address an application can connect to. On Linux this is not one lookup against one server. The C library asks the Name Service Switch, which consults sources in a configured order — local files, then DNS, sometimes more — before any packet leaves the host.

The operational consequence: /etc/resolv.conf is rarely the source of truth on a modern Ubuntu box. It is usually a symlink to a file managed by systemd-resolved or NetworkManager. Editing it by hand works until the next network event, then your changes vanish and resolution breaks in ways that look random.

The resolver path through NSS

A program calls getaddrinfo() in glibc. glibc does not go straight to DNS. It reads /etc/nsswitch.conf and walks the hosts: line left to right. On Debian and Ubuntu that line typically reads "files dns" or, when the libnss-resolve module is installed, "files resolve [!UNAVAIL=return] dns".

Order matters. With files first, an entry in /etc/hosts wins over any DNS record, no matter what the authoritative server says. This is why a stale /etc/hosts line silently shadows production DNS, and why dig — which queries DNS directly and ignores NSS — disagrees with what the application actually resolves.

# the hosts line decides source order
grep ^hosts /etc/nsswitch.conf
# hosts: files resolve [!UNAVAIL=return] dns
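
The source order and the /etc/hosts shadowing described above can be checked with two small helpers. This is a minimal sketch, not a standard tool: the function names nss_host_sources and hosts_file_matches are made up for illustration, and both take an optional file argument so they can be pointed at a copy instead of the live file.

```shell
# Print the lookup sources from the hosts: line, in order,
# skipping action items such as [!UNAVAIL=return].
# Hypothetical helper; defaults to /etc/nsswitch.conf.
nss_host_sources() {
    awk '$1 == "hosts:" {
        for (i = 2; i <= NF; i++)
            if ($i !~ /^\[/) print $i
    }' "${1:-/etc/nsswitch.conf}"
}

# Print every line of a hosts file that names the given host,
# ignoring comments. Any output means the "files" source answers
# before DNS is consulted (when files comes first in nsswitch.conf).
# Hypothetical helper; defaults to /etc/hosts.
hosts_file_matches() {
    awk -v name="$1" '{
        sub(/#.*/, "")    # strip comments; awk re-splits the fields
        for (i = 2; i <= NF; i++)
            if (tolower($i) == tolower(name)) { print; break }
    }' "${2:-/etc/hosts}"
}
```

Running nss_host_sources with no argument lists the live order, and hosts_file_matches api.example.com prints any local entry that would win over DNS for that name.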

Why /etc/resolv.conf is managed

The classic resolver reads up to three nameserver lines and a search domain list from /etc/resolv.conf. The file still exists, but on most current systems a daemon owns it. Three managers compete for that role: the older resolvconf package, NetworkManager on desktops, and systemd-resolved on Ubuntu Server since 18.04.

Check what owns the file before touching it. If it points into /run, a daemon regenerates it on every link change, lease renewal, or VPN connect. Hand edits to a generated file are overwritten without warning.

# is resolv.conf real or a symlink to a stub?
ls -l /etc/resolv.conf
# /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf
# which nameserver does the file list?
grep ^nameserver /etc/resolv.conf
# nameserver 127.0.0.53  (systemd-resolved owns DNS)
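
The ownership check can be turned into a small classifier. This is a sketch under assumptions: resolv_conf_owner is a hypothetical name, and the patterns cover only the usual locations under /run for the three managers named above. A regular file can still be rewritten by a manager configured to write it in place, so treat the last two branches as a hint rather than proof.

```shell
# Classify a resolv.conf by the path it resolves to.
# Usage: resolv_conf_owner "$(readlink -f /etc/resolv.conf)"
# Hypothetical helper; patterns are the common default locations.
resolv_conf_owner() {
    case "$1" in
        */systemd/resolve/stub-resolv.conf)
            echo "systemd-resolved (stub, 127.0.0.53)" ;;
        */systemd/resolve/resolv.conf)
            echo "systemd-resolved (upstream servers listed directly)" ;;
        */NetworkManager/*)
            echo "NetworkManager" ;;
        */resolvconf/*)
            echo "resolvconf" ;;
        /etc/resolv.conf)
            echo "regular file, not a symlink" ;;
        *)
            echo "unknown: $1" ;;
    esac
}
```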

systemd-resolved and the stub at 127.0.0.53

systemd-resolved runs a stub resolver listening on 127.0.0.53:53. Applications send queries there; the daemon forwards them to the real upstream servers it learned from DHCP, Netplan, or static config. This indirection enables split DNS: queries for an internal domain route to a corporate server while everything else goes to a public resolver.

DNS is configured per link, not globally. Each interface carries its own nameservers and search domains, and a VPN can register a routing domain so only matching names use its DNS. Set persistent DNS through Netplan rather than poking the daemon: netplan apply renders the backend configuration (systemd-networkd or NetworkManager), and that backend hands the per-link DNS settings to systemd-resolved.

# /etc/netplan/01-netcfg.yaml: DNS per interface
network:
  version: 2
  ethernets:
    eth0:
      nameservers:
        addresses: [10.0.0.2, 1.1.1.1]
        search: [corp.example.com]
# sudo netplan apply  (re-renders backend config; resolved picks up per-link DNS)
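
The split-DNS routing described in this section can be modeled as a longest-suffix match between the queried name and each link's domains. The sketch below is a deliberately simplified model, not systemd-resolved's actual algorithm: route_link and the LINK=DOMAIN argument format are invented for illustration, and it ignores details such as default routes and the difference between search and route-only domains.

```shell
# Pick the link whose domain is the longest suffix of NAME.
# Usage: route_link NAME LINK=DOMAIN...
# Prints "default" when no domain matches. Hypothetical model only.
route_link() {
    name=${1%.}          # drop a trailing dot, if any
    shift
    best_link=default
    best_len=0
    for pair in "$@"; do
        link=${pair%%=*}
        domain=${pair#*=}
        case "$name" in
            "$domain"|*."$domain")
                # match on a label boundary; keep the longest domain
                if [ "${#domain}" -gt "$best_len" ]; then
                    best_link=$link
                    best_len=${#domain}
                fi ;;
        esac
    done
    echo "$best_link"
}
```

With tun0=corp.example.com and eth0=example.com, route_link git.corp.example.com selects tun0 while route_link www.example.com selects eth0. On a live system, resolvectl status shows which domains each link actually carries.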

Caching, TTLs, and flushing

systemd-resolved caches positive and negative answers, honoring each record's time-to-live. A record with a 300-second TTL stays cached for up to five minutes. A negative answer (NXDOMAIN) is cached too, with a TTL taken from the zone's SOA record, so a name that did not exist when you first asked stays "missing" until that negative TTL expires. glibc itself does not cache, so without resolved there is no host-level cache unless you run one, such as nscd, dnsmasq, or unbound.

Flush with resolvectl flush-caches. Restarting the service also clears the cache but drops in-flight queries. Inspect the cache with resolvectl statistics, which shows the current cache size and the hit and miss counts, and check per-link state with resolvectl status, which shows the active upstream servers.

# clear the systemd-resolved cache, no restart
resolvectl flush-caches
# confirm upstream servers and per-link DNS
resolvectl status
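
To see how long an answer can stay cached, read the TTLs out of dig output. The helpers below are a sketch with hypothetical names (answer_ttls, negative_ttl). They assume dig's standard presentation format of NAME TTL CLASS TYPE RDATA, and the negative TTL follows the RFC 2308 rule of taking the smaller of the SOA record's own TTL and its MINIMUM field. A caching resolver may clamp either value further.

```shell
# Read `dig +noall +answer` lines on stdin; print "name type ttl".
# Hypothetical helper.
answer_ttls() {
    awk 'NF >= 5 && $1 !~ /^;/ { print $1, $4, $2 }'
}

# Read an SOA line (for example from `dig +noall +authority` on a
# nonexistent name) and print the negative-caching TTL:
# min(SOA record TTL, SOA MINIMUM field), per RFC 2308.
# Hypothetical helper.
negative_ttl() {
    awk '$4 == "SOA" && NF >= 11 {
        ttl = $2 + 0; minimum = $11 + 0
        print (ttl < minimum) ? ttl : minimum
    }'
}
```

Typical use would be dig api.example.com +noall +answer | answer_ttls for a positive answer, and dig missing.example.com +noall +authority | negative_ttl for a name that returns NXDOMAIN.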

Debugging tools that agree with the application

Pick the tool that matches the question. resolvectl query goes through systemd-resolved and honors split DNS and search domains, so it reflects what an app sees. getent hosts goes through NSS, so it also respects /etc/hosts and the nsswitch.conf order. dig talks DNS directly and bypasses both — useful for testing a specific server, misleading for "why can my app not resolve this".

Avoid nslookup for resolver debugging on Linux. It ignores NSS and /etc/hosts, sends its own DNS queries to the servers listed in /etc/resolv.conf, and can report a name as resolvable when the application would fail. On Red Hat systems the picture is the same and only the packaging differs: getent ships in glibc-common and dig in bind-utils.

# what the app sees (NSS-aware, respects /etc/hosts)
getent hosts api.example.com
# resolver-aware, honors split DNS + search domains
resolvectl query api.example.com
# raw DNS to one server, ignores NSS
dig @1.1.1.1 api.example.com A +short
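
The comparison between the NSS view and the raw DNS view can be scripted. This is a rough sketch with hypothetical function names (explain_views, compare_views). It compares only the first IPv4 address from each side, so it is meaningful for single-address names and can report a false mismatch for round-robin records.

```shell
# Explain a pair of answers.
# $1 = address from NSS (getent), $2 = address from raw DNS (dig).
# Either argument may be empty if that lookup returned nothing.
explain_views() {
    if [ -z "$1" ] && [ -z "$2" ]; then
        echo "neither NSS nor DNS resolves the name"
    elif [ -z "$1" ]; then
        echo "DNS answers but NSS does not: check nsswitch.conf and the configured nameservers"
    elif [ -z "$2" ]; then
        echo "NSS answers but DNS does not: likely /etc/hosts or another non-DNS source"
    elif [ "$1" = "$2" ]; then
        echo "NSS and DNS agree: $1"
    else
        echo "mismatch: app sees $1, DNS says $2 (check /etc/hosts, split DNS, cache)"
    fi
}

# Gather both views for one name (needs getent and dig installed).
compare_views() {
    nss_ip=$(getent ahostsv4 "$1" | awk '{ print $1; exit }')
    # dig +short may print CNAME targets first; keep IPv4 lines only
    dns_ip=$(dig +short "$1" A | grep -E '^[0-9.]+$' | head -n 1)
    explain_views "$nss_ip" "$dns_ip"
}
```

Calling compare_views api.example.com prints one line that says whether the application's view and the DNS view agree, and where to look if they do not.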

Common Mistakes

Editing /etc/resolv.conf directly when it is a symlink into /run. The managing daemon overwrites it on the next link or lease event, and DNS reverts silently.
Trusting nslookup output. It bypasses NSS and /etc/hosts, so it can report success while the application fails on the same name.
Forgetting /etc/hosts precedence. With files first in nsswitch, a stale local entry shadows production DNS no matter what the authoritative server returns.
Ignoring the search domain list. A bare name like db gets each search suffix appended, and the wrong suffix resolves to the wrong host.
Mixing up A and AAAA. A host with a broken IPv6 path but a valid AAAA record times out before falling back, even though the IPv4 A record works.
Assuming a TTL of zero on the record means nothing is cached. systemd-resolved still caches negative answers, so an NXDOMAIN returned before the record existed persists until the negative TTL expires, whatever TTL the new record carries.
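
The search-domain mistake above is easier to spot once the candidate names are written out. The sketch below models the classic resolver's ordering as documented in resolv.conf(5): names with fewer dots than ndots try the search suffixes first, others try the name as-is first, and a trailing dot disables the search list. search_candidates is a hypothetical helper, and systemd-resolved behaves differently in that it applies search domains only to single-label names.

```shell
# Print the candidate names the classic resolver tries, in order.
# Usage: search_candidates NAME NDOTS SUFFIX...
# Hypothetical helper modeling resolv.conf(5) search/ndots rules.
search_candidates() {
    name=$1
    ndots=$2
    shift 2
    case "$name" in
        *.) echo "$name"; return ;;    # trailing dot: absolute name
    esac
    # count the dots in the name; $(( )) strips wc's padding
    dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
    # enough dots: try the name as-is before the search list
    if [ "$dots" -ge "$ndots" ]; then echo "$name."; fi
    for suffix in "$@"; do echo "$name.$suffix."; done
    # too few dots: the bare name is the last resort
    if [ "$dots" -lt "$ndots" ]; then echo "$name."; fi
}
```

For example, search_candidates db 1 corp.example.com example.com lists db.corp.example.com., then db.example.com., then db., which shows how the first suffix that resolves decides which host the application reaches.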

Best Practices

Query with resolvectl query <name> to see how systemd-resolved answers, including split DNS and search-domain expansion.
Check the hosts: line in /etc/nsswitch.conf before debugging. The source order explains most "DNS works but the app disagrees" cases.
Set DNS through Netplan or systemd-resolved, then run netplan apply. Never hand-edit a generated /etc/resolv.conf.
Read the record TTL with dig name +noall +answer, whose answer section prints the live TTL, before assuming a change has propagated.
Use getent hosts for NSS-aware lookups when you need the answer that respects /etc/hosts and source order.
Flush correctly with resolvectl flush-caches instead of restarting the service, which drops in-flight queries.
Confirm per-link state with resolvectl status after a VPN connects, so you know which interface owns which domains.

Comparable tools

Windows (ipconfig /flushdns, nslookup)
macOS (scutil --dns, dscacheutil)
BSD (resolv.conf, no systemd)

Knowledge Check

An application cannot resolve a name, but dig returns the correct address. What is the most likely cause?

NSS source order or /etc/hosts is overriding what the app resolves, while dig queries DNS directly and bypasses both
The authoritative DNS server for the zone is down, so only dig's locally cached copy of the record is still able to answer the query
dig caches answers in a place the application cannot read
The record's TTL is too long for the application's own timeout

Why is hand-editing /etc/resolv.conf on a default Ubuntu Server unreliable?

It is usually a symlink to a generated stub file that a daemon rewrites on the next link change or DHCP lease
The file is read-only and the edit cannot be saved at all
glibc ignores the file entirely on modern kernels
Only the search and domain lines are honored, while any nameserver directive you add is silently dropped by the resolver library

Which tool reflects what an application actually sees, including /etc/hosts and the nsswitch source order?

getent hosts
nslookup
dig @127.0.0.53
host -a

What does systemd-resolved cache that surprises operators expecting a name to resolve immediately after it is created?

Negative answers such as NXDOMAIN, held until their own TTL expires
Only IPv4 A records, while AAAA records are fetched fresh from the upstream server on every single lookup
Records with a TTL of zero, kept indefinitely
Nothing — glibc performs all host-level caching
