Topic 15

Quoting and Escaping

Shell Word Splitting

Before the shell runs a command it rewrites the line you typed: it expands variables, runs anything in backticks or $( ), splits the unquoted results into words on the characters in IFS (space, tab, and newline by default), and expands globs like *.conf against the filesystem. Quoting and escaping are how you switch parts of that machinery off. Single quotes turn everything inside them into a literal string; double quotes keep variable and command substitution but stop word splitting and globbing; a backslash escapes exactly the next character. Almost every quoting question in a script comes down to these three rules.

This is not a style preference: it is the boundary between a script that survives real input and one that breaks on the first filename with a space in it. One of the most common shell bugs in production is an unquoted $variable that held an empty value or a path with a space and silently turned one argument into zero or two. A value with a leading dash is a related hazard, but quoting alone does not fix it: "$file" is still read as a flag, so it also needs -- or a ./ prefix. Knowing precisely what each quoting form disables is what lets you reason about a command instead of pasting it and hoping.
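The argument-count effect described above can be observed directly. The following is a minimal sketch, assuming Bash with the default IFS; count_args is a hypothetical helper introduced only for illustration, and it does nothing except print how many arguments it received.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

empty=''
spaced='my report.txt'

count_args $empty      # unquoted empty value disappears: prints 0
count_args "$empty"    # quoted empty value is still one (empty) argument: prints 1
count_args $spaced     # unquoted value is split on the space: prints 2
count_args "$spaced"   # quoted value stays one argument: prints 1
```

Swapping count_args for rm or cp is what turns those counts into deleted or misplaced files.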

Single Quotes, Double Quotes, and the Backslash

Single quotes are the strongest and the simplest: every byte between them is literal, including $, *, backslashes, and newlines. The one thing you cannot put inside single quotes is another single quote — there is no escape for it, because the quoting is absolute. Reach for single quotes whenever the text must arrive at the program byte-for-byte: an awk or sed program, a regular expression, a password, a JSON snippet.

Double quotes are the workhorse. Inside them the shell still performs parameter expansion ($VAR, ${VAR}), command substitution ($(cmd) or backticks), and arithmetic expansion ($(( ))), but it does not split the result into words and does not expand globs. That combination is what you want almost all of the time: you need the value of a variable, and you need it to stay one argument no matter what characters it contains. Outside quotes, the backslash escapes a single following character, removing whatever special meaning it had: \$ is a literal dollar sign, a backslash followed by a space is a literal space, and a backslash at the very end of a line continues the command onto the next. Inside double quotes a backslash is special only before $, `, ", \, and a newline; before any other character it is kept as a literal backslash.

$ file='my report.txt'
$ rm $file        # word-split: tries to remove 'my' and 'report.txt'
$ rm "$file"      # one argument: removes 'my report.txt'
$ echo '$file'    # literal: prints $file
$ echo "$file"    # expands: prints my report.txt
$ echo \$file     # escaped: prints $file

What Each Form Disables

The useful way to hold this in your head is as a table of which shell stages each quoting form switches off. Word splitting and globbing are the two stages that surprise people, because they happen after a variable is expanded — the shell expands $file to my report.txt and only then chops it on the space. Both single and double quotes stop those two stages; the difference between them is purely whether expansion still runs.

Form	Variable / command expansion	Word splitting	Glob expansion
Unquoted	Yes	Yes	Yes
`"double"`	Yes	No	No
`'single'`	No	No	No
`\c` (backslash)	Disables for one char	n/a	n/a
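The rows of the table can be checked against a throwaway directory. This is a sketch under stated assumptions: Bash with default options, a temporary directory from mktemp, and a hypothetical count_args helper that only prints its argument count. The file names a.conf and b.conf are made up for the example.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

tmpdir=$(mktemp -d)
touch "$tmpdir/a.conf" "$tmpdir/b.conf"
pattern='*.conf'

# Each check runs in a subshell so the cd does not leak into the caller.
unquoted=$(cd "$tmpdir" && count_args $pattern)     # expanded, then globbed
double=$(cd "$tmpdir" && count_args "$pattern")     # expanded, not globbed
single=$(cd "$tmpdir" && count_args '$pattern')     # not expanded at all

printf 'unquoted: %s  double: %s  single: %s\n' "$unquoted" "$double" "$single"
printf '%s\n' "$pattern"    # prints *.conf
printf '%s\n' '$pattern'    # prints $pattern

rm -r "$tmpdir"
```

The unquoted form reports two arguments because the glob matched both files; both quoted forms report one.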

One consequence is worth stating plainly: "$@" and "$*" are not interchangeable. "$@" expands to each positional parameter as a separate, individually quoted word — the only correct way to forward a script's arguments to another command. "$*" joins them into a single word separated by the first character of IFS (a space by default). Forwarding arguments with $@ unquoted, or with "$*", is how wrapper scripts mangle paths that contain spaces.
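The three forwarding styles can be compared side by side. The sketch below assumes Bash with the default IFS; the forward_* functions and count_args are hypothetical names standing in for a wrapper script and the command it calls.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

forward_at()       { count_args "$@"; }   # each parameter stays its own word
forward_star()     { count_args "$*"; }   # all parameters joined into one word
forward_unquoted() { count_args $@; }     # parameters are re-split on IFS

forward_at       'my report.txt' 'notes.txt'   # prints 2
forward_star     'my report.txt' 'notes.txt'   # prints 1
forward_unquoted 'my report.txt' 'notes.txt'   # prints 3
```

Only the first wrapper hands on the same two arguments it was given.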

Heredocs and Embedded Programs

When you need to feed several lines into a command — a config block, a SQL statement, a remote script over ssh — a here-document is cleaner than chained echo calls. The quoting of the delimiter word controls expansion of the whole body, mirroring the single-versus-double rule. An unquoted delimiter (<<EOF) expands variables and command substitutions inside the body; a quoted delimiter (<<'EOF') passes the body through verbatim.

# Expanded: $USER and $(hostname) are substituted now, locally
$ cat <<EOF
running as $USER on $(hostname)
EOF

# Literal: the remote shell sees $USER and runs it there
$ ssh web01 bash <<'EOF'
echo logged in as $USER
EOF
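The same difference can be reproduced without a remote host. In this sketch a child bash process stands in for the remote shell, and env gives it its own value of a variable; the variable name and both values are assumptions chosen for the example.

```shell
name=local-value

# Unquoted delimiter: this shell expands $name before the child ever runs.
expanded=$(env name=child-value bash <<EOF
echo "$name"
EOF
)

# Quoted delimiter: the body arrives verbatim and the child expands $name.
literal=$(env name=child-value bash <<'EOF'
echo "$name"
EOF
)

printf 'unquoted delimiter: %s\n' "$expanded"   # prints local-value
printf 'quoted delimiter:   %s\n' "$literal"    # prints child-value
```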

Using <<-EOF (with the dash) strips leading tab characters from each body line and from the closing delimiter, so you can indent a heredoc inside a function or loop with tabs without those tabs ending up in the output. Spaces are not stripped — only tabs — which is a frequent source of "why won't my closing EOF match" errors when an editor has expanded tabs to spaces.
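Because tabs are easy to lose when copying text, the following sketch builds the heredoc with printf, where \t is guaranteed to become a real tab, and pipes the resulting script to a child bash. It is an illustration under those assumptions rather than a pattern to use in real scripts.

```shell
# Tab-indented body and delimiter: <<- strips the tabs and EOF matches.
with_tabs=$(printf 'cat <<-EOF\n\tindented body\n\tEOF\n' | bash)
printf '[%s]\n' "$with_tabs"      # prints [indented body]

# Space-indented version: nothing is stripped and "    EOF" is not recognized
# as the delimiter, so it becomes part of the body. Bash warns on stderr about
# the unterminated here-document; the warning is discarded here.
with_spaces=$(printf 'cat <<-EOF\n    indented body\n    EOF\n' | bash 2>/dev/null)
printf '[%s]\n' "$with_spaces"
```

In the second case the output still contains the indentation and the stray EOF line, which is what "swallowing the rest of the file" looks like in a longer script.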

Quoting Across the Layers

A command rarely lives in just one shell. The classic trap is a value that passes through two shells: a local shell and a remote one over ssh, or the outer shell and an inner one in bash -c "...". Each layer strips one level of quoting, so a string that needs to survive both must be quoted twice. ssh web01 "rm '$file'" expands $file locally, then hands the remote shell a line that still has single quotes protecting any spaces; getting either layer wrong deletes the wrong thing or nothing. Even this form only holds while the value itself contains no single quote, because a quote inside $file would close the protection early.
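The one-level-per-layer rule can be seen locally by letting bash -c play the part of the second shell. This is a minimal sketch; the inner command simply reports how many words it ended up with.

```shell
file='my report.txt'

# One level of quoting: the outer shell expands $file and removes the double
# quotes, so the inner shell parses:  set -- my report.txt; echo $#
one_level=$(bash -c "set -- $file; echo \$#")

# Two levels: the single quotes survive the outer parse and protect the space
# in the inner one, so the inner shell parses:  set -- 'my report.txt'; echo $#
two_levels=$(bash -c "set -- '$file'; echo \$#")

printf 'one level:  %s argument(s)\n' "$one_level"    # 2
printf 'two levels: %s argument(s)\n' "$two_levels"   # 1
```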

When the quoting gets deep enough to be unreadable, stop hand-escaping and let the shell build the safe form for you. Bash's printf %q emits a string quoted so that feeding it back to the shell reproduces the original exactly, and ${var@Q} in Bash 4.4 and later does the same for a variable. Both are Bash features whose output is meant to be re-read by Bash, so they assume the receiving shell is Bash as well. For passing arguments to a remote host, the reliable pattern is to avoid a second shell parse entirely where you can, and where you cannot, generate the escaped form with printf %q rather than guessing at how many backslashes you need.

$ path="/var/log/my app/app.log"
$ printf '%q\n' "$path"
/var/log/my\ app/app.log
$ ssh web01 "tail -n 50 $(printf '%q' "$path")"
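A round trip through a second parse can be checked without a remote host. In the sketch below bash -c stands in for the remote shell, and the path (including its apostrophe) is a made-up value chosen to be awkward; it assumes Bash on both sides.

```shell
path="/var/log/it's my app/app.log"

# Let the shell produce the escaped form instead of counting backslashes.
quoted=$(printf '%q' "$path")

# The inner shell parses the line a second time, as a remote shell would.
round_trip=$(bash -c "printf '%s' $quoted")

printf 'original:   %s\n' "$path"
printf 'round trip: %s\n' "$round_trip"
```

If the two printed lines match, the value survived both parses intact, spaces and apostrophe included.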

Common Mistakes

Leaving $variable unquoted in a command: an empty value collapses to zero arguments and a value with spaces splits into several. rm $file on an empty $file becomes a bare rm; on a multi-word value it removes the wrong files. A value starting with - is read as a flag whether or not it is quoted, so guard it with -- (rm -- "$file") or a ./ prefix.
Wrapping a whole command in single quotes and then expecting a variable inside to expand — echo '$HOME' prints the literal text $HOME, because single quotes disable expansion entirely.
Trying to put a single quote inside single quotes. There is no escape for it; 'don't' leaves the shell waiting for a closing quote. The fix is the close-escape-reopen idiom 'don'\''t' or switching the outer quotes to double.
Using "$*" or unquoted $@ to forward a script's arguments. Only "$@" preserves each argument as a separate word, so wrappers built on the others corrupt any path containing spaces.
Closing a here-document with an indented delimiter after using spaces instead of tabs with <<-EOF — only tabs are stripped, so a space-indented EOF never matches and the heredoc swallows the rest of the file.
Hand-counting backslashes for a command that crosses an ssh or bash -c boundary. Each layer removes one level of quoting, and guessing the depth is how a remote command runs against the wrong path.
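Two of the fixes mentioned in this list are short enough to show together: the close-escape-reopen idiom for a single quote, and pairing quotes with -- for a name that starts with a dash. This is a sketch under stated assumptions: it works in a throwaway directory from mktemp, and the file name -n is invented for the example.

```shell
# Close-escape-reopen: end the quoted string, add an escaped quote, reopen.
word='don'\''t'
printf '%s\n' "$word"       # prints don't

# Alternative: switch the outer quotes to double quotes.
same="don't"

# Quoting keeps the name in one piece, but only -- stops touch and rm from
# treating it as an option. Subshells keep the cd from leaking out.
tmpdir=$(mktemp -d)
file='-n'
created=$(cd "$tmpdir" && touch -- "$file" && ls)
(cd "$tmpdir" && rm -- "$file")
rmdir "$tmpdir"

printf 'created a file named: %s\n' "$created"
```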

Best Practices

Quote every variable expansion by default — write "$var", "$@", and "${arr[@]}" unless you have a specific reason to allow splitting or globbing, and document that reason in a comment.
Use single quotes for any text that must reach the program unchanged: sed/awk scripts, regular expressions, JSON, and passwords. It removes all doubt about what the shell will touch.
Forward arguments with "$@" and nothing else; reserve "$*" for the rare case where you actually want one joined string.
Run scripts through shellcheck in CI (on Debian and Ubuntu, apt install shellcheck); it flags unquoted expansions and word-splitting hazards before they reach a server.
Quote the heredoc delimiter as <<'EOF' whenever the body should be literal — config templates, scripts sent to a remote shell — so local expansion cannot rewrite the content.
Generate cross-shell quoting with printf '%q' or ${var@Q} instead of escaping by hand when a value must survive an ssh or bash -c round trip.
Indent heredocs with literal tabs when using <<-EOF, and configure your editor to keep real tabs in shell scripts so the closing delimiter still matches.
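The "${arr[@]}" form from the first practice behaves like "$@" for arrays. A minimal sketch, assuming Bash (arrays are not available in plain POSIX sh); the argument values and the count_args helper are hypothetical.

```shell
# count_args is a hypothetical helper: it prints the number of arguments it got.
count_args() { printf '%s\n' "$#"; }

# Build the argument list once, one element per intended argument.
args=(-n 50 '/var/log/my app/app.log')

count_args "${args[@]}"    # prints 3: every element stays one word
count_args ${args[@]}      # prints 4: the path is re-split on its space
```

Keeping arguments in an array and expanding it quoted avoids assembling a command as one long string that then has to be split correctly.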

Comparable Tools

PowerShell: single quotes are literal and double quotes interpolate $var, but variables are passed to commands as whole values rather than as strings that get re-split, so the unquoted-variable trap mostly does not exist
Windows cmd.exe: quoting is far weaker; " groups arguments but there is no single-quote literal form and %VAR% expansion happens regardless
fish: same single/double rules as Bash, but variable expansion never word-splits, removing the most common Bash quoting bug by design

Knowledge Check

$file holds my report.txt. Why does rm $file fail where rm "$file" succeeds?

Without quotes the shell word-splits the expanded value on the space, so rm receives two arguments — my and report.txt — instead of one filename
Unquoted variables are never expanded by the shell, so rm receives the literal five-character string $file rather than the value the variable holds
Quotes are mandatory to expand any variable at all; without them $file evaluates to an empty string and rm is handed no argument whatsoever
rm itself only accepts arguments that arrive wrapped in quote characters and rejects any bare unquoted word handed to it on the command line

You need a string to reach awk exactly as typed, including any $ signs. Which quoting do you choose and why?

Single quotes — they make every character literal, so the shell performs no expansion before awk sees the program
Double quotes — they disable the shell's expansion of every $ sign in the program while still keeping word splitting fully active
No quotes at all — an awk program passed bare on the command line is automatically exempt from the shell's normal processing
A single backslash placed before the whole string — it escapes every character that follows it across the entire program in one stroke

What is the difference between "$@" and "$*" when forwarding a script's arguments?

"$@" expands to each argument as a separate quoted word; "$*" joins them all into one word separated by the first character of IFS
They behave identically inside double quotes, expanding to the very same single joined word, and any difference between the two forms only ever shows up once they are left completely unquoted
"$*" preserves each positional parameter as its own separately quoted argument, while "$@" is the form that joins them all into a single space-separated word
"$@" silently drops any empty-string arguments from the expanded list, while "$*" is careful to keep every one of them, empty or not, intact

A heredoc sent over ssh web01 bash <<'EOF' contains echo $USER. Whose username prints, and why?

The remote user's — the quoted delimiter keeps the body literal locally, so $USER is expanded by the remote shell that runs it
The local user's — heredoc bodies are always expanded by the local shell first, regardless of the delimiter's quoting, before the assembled command is ever handed off to be run
Neither user's — a quoted delimiter blocks the heredoc body from being fed to the command on stdin at all, so nothing is run on either host
The local user's — only an unquoted delimiter such as <<EOF would defer the expansion of $USER across the link to the remote shell

A path with spaces must survive being passed through a second shell over ssh. What is the most reliable way to escape it?

Generate the escaped form with printf '%q' (or ${var@Q}), which produces a string that re-parses to the original exactly
Wrap the path in a single layer of single quotes — one layer reliably survives any number of successive shell parses
Replace every space with \ by hand and carefully count out the doubled and quadrupled backslashes needed for each successive shell layer
Use double quotes around the value and let the remote shell re-expand the variable in place once the line arrives on the far side of the connection
