Loops and Functions
Loops and functions are the two constructs that turn a linear list of commands into a program. A loop repeats a block of work over a list of items or a stream of lines; a function names a block so you can call it many times and from many places. Together they are what separates a copy-pasted sequence of commands from something you can maintain.
In bash, both have edges that cut. A for loop word-splits its list, so a filename with a space becomes two iterations. A while read on the right of a pipe runs in a subshell, so any variable it sets vanishes when the loop ends. And a function has no real return value — return only sets an exit status from 0 to 255. Knowing where these traps sit is the difference between a script that works on your laptop and one that survives a directory of real-world filenames in production.
for, while, and until
A for loop iterates over a list of words. The list is whatever the shell produces after expansion (a glob, an array, the output of a command substitution), and the loop runs its body once per word. For numeric ranges, prefer the C-style form for (( i=0; i<n; i++ )) over brace expansion when the bound is a variable: brace expansion happens before variable expansion, so {1..$n} never becomes a range. Use a glob, not ls, when you want to walk files: the glob produces real filenames as separate words, with no parsing in between.
A while loop runs as long as its condition exits 0, which makes it the right tool for consuming a stream. The canonical pattern, while IFS= read -r line, reads one line at a time from standard input and is the standard, safe way to process file contents line by line in a loop. until is the inverse: it runs until its condition succeeds, and its natural home is a retry loop that keeps trying until a service comes up.
    # glob, not ls: each file is one word even with spaces
    for f in /var/log/*.log; do
      echo "rotating $f"
    done

    # C-style numeric loop with a variable bound
    max=5
    for (( i=0; i<max; i++ )); do
      echo "attempt $i"
    done

    # until: retry with a real backoff and a cap
    n=0
    until curl -fsS http://localhost:8080/healthz; do
      n=$((n+1))
      [ "$n" -ge 10 ] && echo "gave up" && exit 1
      sleep "$((n*2))"
    done
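The difference between brace expansion and the C-style form is easy to see in a small experiment. The sketch below assumes bash and uses a made-up bound n=3; the array names are illustrative only.

```bash
#!/usr/bin/env bash
# Hypothetical bound held in a variable.
n=3

# Brace expansion runs before variable expansion, so {1..$n} is not a range:
# the loop body sees a single literal word.
brace_words=()
for i in {1..$n}; do
  brace_words+=("$i")
done
echo "brace form ran ${#brace_words[@]} time(s): ${brace_words[*]}"

# The C-style form evaluates the bound arithmetically on every pass.
c_words=()
for (( i=1; i<=n; i++ )); do
  c_words+=("$i")
done
echo "C-style form ran ${#c_words[@]} time(s): ${c_words[*]}"
```

The brace form runs once with the literal word {1..3}, while the C-style form runs three times.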
Reading Input Safely
The single most common scripting bug is for line in $(cat file). It does not iterate over lines — it iterates over words, because the unquoted command substitution is split on every character in IFS (space, tab, newline) and then glob-expanded. A line containing two words becomes two iterations, and a line containing * expands to your directory listing. The fix is not a tweak; it is a different construct entirely.
Use while IFS= read -r line. Setting IFS= for that one command disables the leading and trailing whitespace trimming that read does by default, so the line arrives intact. The -r flag stops read from treating backslashes as escapes, so a path like C:\dir survives. For filenames specifically, even newlines are legal characters, so the safe pattern is find -print0 piped into read -d '', which delimits on the NUL byte that cannot appear in a path.
    # wrong: word-splits and glob-expands every line
    for line in $(cat hosts.txt); do ssh "$line" uptime; done

    # right: one line per iteration, intact
    # (ssh -n keeps ssh from swallowing the rest of hosts.txt on stdin)
    while IFS= read -r host; do
      ssh -n "$host" uptime
    done < hosts.txt

    # NUL-delimited: survives spaces AND newlines in filenames
    find . -type f -name '*.bak' -print0 | while IFS= read -r -d '' f; do
      rm -- "$f"
    done
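To see what IFS= and -r each contribute, it can help to read the same line three ways. This is a minimal sketch assuming bash (for the here-string); the sample line and variable names are invented for illustration.

```bash
#!/usr/bin/env bash
# One made-up line with surrounding spaces and backslashes.
sample='  C:\temp\new  '

# Plain read: trims surrounding whitespace and consumes the backslashes.
read line_plain <<< "$sample"

# -r only: backslashes are kept, whitespace is still trimmed.
read -r line_raw <<< "$sample"

# IFS= together with -r: the line arrives unchanged.
IFS= read -r line_exact <<< "$sample"

# Brackets make leading and trailing spaces visible.
printf '[%s]\n' "$line_plain" "$line_raw" "$line_exact"
```

Only the third form preserves both the backslashes and the surrounding spaces.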
Functions and Scope
A function is defined with name() { ...; } and called by name like any command. Its arguments arrive as the positional parameters $1, $2, and the whole set as "$@" — the same mechanism a script uses for its own arguments. Always expand it as "$@" in double quotes: unquoted $@ re-splits each argument on whitespace, while "$@" preserves every argument exactly as passed, even ones containing spaces.
Variables in bash are global by default, including inside functions — assign to x in a function and you have silently overwritten the caller's x. Declare every function-local variable with local to scope it to that call. The deeper point is that bash functions do not return values. return sets an exit status in the range 0 to 255 and nothing else; to hand back data, you print it to stdout and let the caller capture it with command substitution, reserving the exit status for success-or-failure signaling.
    # data comes back on stdout; status signals success/failure
    get_pid() {
      local name="$1"
      local pid
      pid=$(pgrep -x "$name") || return 1
      echo "$pid"
    }

    if pid=$(get_pid nginx); then
      echo "nginx is $pid"
    else
      echo "nginx not running" >&2
    fi
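The quoting and scoping rules described above can be checked with a few throwaway functions. In this sketch every function name (count_args, forward_quoted, forward_unquoted, leaky, safe) is hypothetical and exists only for the demonstration.

```bash
#!/usr/bin/env bash
# count_args reports how many arguments it received.
count_args() { echo "$#"; }

# Forward the same arguments both ways to compare.
forward_quoted()   { count_args "$@"; }
forward_unquoted() { count_args $@; }   # deliberately wrong: re-splits on whitespace

quoted=$(forward_quoted "my file.txt" "notes.md")
unquoted=$(forward_unquoted "my file.txt" "notes.md")
echo "quoted: $quoted, unquoted: $unquoted"

# Scope: without local, an assignment inside a function reaches the caller.
x="caller"
leaky() { x="clobbered"; }
safe()  { local x="private"; }

safe;  after_safe=$x
leaky; after_leaky=$x
echo "after safe: $after_safe, after leaky: $after_leaky"
```

The quoted forwarder passes two arguments and the unquoted one passes three. Only the function without local changes the caller's x.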
Loop Control and Pitfalls
break leaves the loop entirely; continue skips to the next iteration. Both take an optional level — break 2 exits two nested loops at once — which is cleaner than a flag variable when you need to bail out of an inner loop and its parent together.
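As a rough illustration of both keywords, the sketch below searches a small grid for the first pair whose product is 12. The task and the variable names are invented for the example.

```bash
#!/usr/bin/env bash
found=""
visited=0
for a in 1 2 3 4; do
  for b in 1 2 3 4; do
    # continue: skip odd values of b and move to the next inner iteration.
    (( b % 2 )) && continue
    visited=$((visited + 1))
    if (( a * b == 12 )); then
      found="$a x $b"
      break 2   # leave the inner loop and the outer loop together
    fi
  done
done
echo "found: $found after $visited checks"
```

A plain break at that point would only end the inner loop, and the outer loop would carry on with a=4.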
The pitfall that bites everyone is the piped while. When you write cmd | while read line; do ...; done, the right-hand side of the pipe runs in a subshell, so a counter or array you build inside the loop is gone the instant the loop ends — the parent shell never saw the assignments. There are two real fixes: feed the loop with a redirection or process substitution instead of a pipe, so it runs in the current shell, or set shopt -s lastpipe (with job control off) so the last command of a pipeline runs in the parent. Redirection is the portable choice.
    # BROKEN: count is 0 here because the loop ran in a subshell
    # (-h drops the "file:" prefix so each line is a bare number)
    count=0
    grep -ch ERROR *.log | while read -r n; do count=$((count+n)); done
    echo "$count"   # prints 0

    # FIXED: process substitution keeps the loop in the current shell
    count=0
    while read -r n; do count=$((count+n)); done < <(grep -ch ERROR *.log)
    echo "$count"   # prints the real total
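The other fix mentioned above, lastpipe, might look like the following sketch. It assumes bash 4.2 or later and a non-interactive script, where job control is off by default. The printf command stands in for any command that emits one number per line.

```bash
#!/usr/bin/env bash
# With lastpipe set and job control off, the last command of a pipeline
# runs in the current shell instead of a subshell.
shopt -s lastpipe

total=0
printf '%s\n' 3 4 5 | while IFS= read -r n; do
  total=$((total + n))
done
echo "total: $total"
```

Because the option depends on the bash version and on job control being off, redirection or process substitution is the safer default in scripts that have to run in many environments.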
for f in $(ls) — broken. The command substitution is word-split on whitespace and glob-expanded, so any filename with a space or a * derails the loop. Never use it; there is no input type for which this is the right choice.
for f in * — the glob-safe form. The shell expands the pattern into real filenames as separate words, with no parsing step in between, so spaces and special characters survive. Use it whenever you are iterating over files in a directory.
while IFS= read -r line — the line-safe form. Use it when the input is a stream of lines from a file, a pipe, or a command — one intact line per iteration, which a for loop can never give you.
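The contrast between the ls form and the glob form can be reproduced in a throwaway directory. This sketch assumes mktemp is available; the file name is made up.

```bash
#!/usr/bin/env bash
# Create a temporary directory holding one file whose name contains a space.
dir=$(mktemp -d)
touch "$dir/monthly report.txt"
cd "$dir" || exit 1

# Broken form: the command substitution is word-split on the space.
ls_count=0
for f in $(ls); do ls_count=$((ls_count + 1)); done

# Glob form: the filename stays one word.
glob_count=0
for f in *; do glob_count=$((glob_count + 1)); done

echo "ls form: $ls_count iteration(s), glob form: $glob_count iteration(s)"

# Clean up the temporary directory.
cd / && rm -rf -- "$dir"
```

One file produces two iterations in the ls form and one in the glob form.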
- for f in $(ls) or for f in $(cat list): the unquoted substitution word-splits and glob-expands, so a filename with a space becomes two iterations and a name containing * explodes into a directory listing. Use a glob or find -print0 instead.
- Piping into while read and then reading a variable set inside the loop. The loop body ran in a subshell, so the counter, array, or flag is empty afterward, and the bug is silent: the script just produces wrong totals.
- Using return to pass data out of a function. return only sets an exit status from 0 to 255; return 300 wraps to 44, and any string is meaningless. Print the data to stdout and capture it instead.
- Dropping -r from read, so backslashes are interpreted as escapes. A Windows path or a line ending in \ gets mangled or silently joined to the next line.
- Writing $@ instead of "$@" when forwarding arguments. Unquoted, each argument is re-split on whitespace, so a single path with a space is passed along as two separate arguments.
- A while or until retry loop with no sleep and no attempt cap. It hammers the target thousands of times a second and, if the target never recovers, spins forever pinning a CPU.
- Forgetting local on a function variable, so an assignment inside the function clobbers a same-named variable in the caller, a bug that only surfaces when the names happen to collide.

- Read streams with while IFS= read -r line every time; it is the only construct that gives you one intact line per iteration without trimming or backslash mangling.
- Quote every expansion: "$var", "$@", "$f" inside loops. Quoting is what makes the difference between handling a filename with a space and corrupting your data.
- Declare local for every variable a function uses, so the function cannot leak into or stomp on the caller's namespace.
- Return data on stdout and signal success or failure through the exit status. Let the caller use x=$(fn) for the value and if fn; then for the outcome.
- Walk files with a glob or find -print0 | while IFS= read -r -d '', never by parsing ls. Globs and NUL delimiters are the only forms that survive arbitrary filenames.
- Give every retry loop a max-attempts counter and a sleep with backoff, so a down dependency degrades gracefully instead of turning into a busy-wait.
- Feed a counting or accumulating while loop with a redirection or process substitution (done < <(cmd)) rather than a pipe, so it runs in the current shell and its variables survive.
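The return pitfall listed above is easy to reproduce. In this sketch bad_answer and good_answer are hypothetical functions that try to hand back the same number in two different ways.

```bash
#!/usr/bin/env bash
# Misuse: return can only carry an 8-bit exit status, so 300 is truncated.
bad_answer() { return 300; }

bad_answer
status=$?
echo "return 300 arrived as $status"

# The same value handed back on stdout and captured by the caller.
good_answer() { echo 300; }
value=$(good_answer)
echo "stdout delivered $value"
```

The caller sees 44 from the first function (300 modulo 256) and the intact 300 from the second.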
- PowerShell: foreach and function with real typed return values and objects on the pipeline instead of word-split text
- Python: for/def with real return values, exceptions, and lexical scope; the tool to reach for when a bash loop or function grows past readability
- awk: its own for loop and function syntax built for line-and-field stream processing, often replacing a while read loop entirely
Knowledge Check
Why is for line in $(cat file) the wrong way to read a file line by line?
- The unquoted command substitution is word-split on IFS and glob-expanded, so it iterates over words (and expands any *), not over lines
- cat buffers the entire file into memory before the loop body can start, so the construct fails outright on any file that happens to be larger than the available RAM
- for can only iterate over a fixed numeric range, never over the output of a command like cat
- It reads the lines correctly but quietly strips the trailing newline off the final line
After cmd | while read n; do count=$((count+n)); done, why does $count read as 0 in the parent shell?
- The right side of the pipe runs in a subshell, so assignments inside the loop never propagate back to the parent
- read resets count to 0 at the start of every iteration
- Arithmetic with $(( )) cannot accumulate a running total across successive loop iterations
- The pipe discards the loop's standard output when it closes at the end, and the accumulated count variable is carried away along with it
A function needs to hand a hostname back to its caller. What is the correct mechanism in bash?
- Print the hostname to stdout and have the caller capture it with h=$(fn), reserving the exit status for success or failure
- Pass the hostname string to return, e.g. return "$host", and read the value back from $? in the caller
- Assign the hostname directly to $1 inside the function body so that the caller sees the updated positional parameter once the function returns
- Use exit "$host" so the value lands in the caller's exit code
What does setting IFS= and adding -r in while IFS= read -r line accomplish?
- IFS= stops leading/trailing whitespace from being trimmed and -r stops backslashes from being treated as escapes, so the line arrives byte-for-byte intact
- It makes read split the line into an array of fields on whitespace
- It forces read to consume the entire file in one call rather than one line
- It enables NUL-delimited reading so that the loop can safely handle even the pathological filenames that happen to contain embedded newline characters of their own