Topic 04

The Three Trees

Concept

Git tracks three versions of your project at once: the working tree (the files on disk you edit), the index or staging area (what the next commit will contain), and HEAD (the last commit). Almost all the confusion around add, reset, and restore dissolves the moment you can name which of these three an operation moves content between.

The staging area is the part that surprises people coming from systems that commit the whole working copy at once. It is not overhead — it is the feature that lets you turn a messy working tree into a clean, deliberate series of commits.

Content moves between the three trees

Working tree (files on disk) → add → Index (staging area) → commit → HEAD (last commit)
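To make the three versions concrete, here is a minimal sketch that puts a different version of one file in each tree and then reads all three back. The throwaway repository, the file name notes.txt, and the user identity are assumptions made for illustration only.

```shell
set -eu

# Scratch repository in a temporary directory (illustrative setup).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"     # HEAD now holds v1

echo "v2" > notes.txt
git add notes.txt                # the index now holds v2

echo "v3" > notes.txt            # the working tree holds v3, unstaged

head_version=$(git show HEAD:notes.txt)   # content in the last commit
index_version=$(git show :notes.txt)      # content in the index
disk_version=$(cat notes.txt)             # content on disk

echo "HEAD=$head_version index=$index_version working=$disk_version"
# prints: HEAD=v1 index=v2 working=v3
```

The :path form of git show reads the staged copy of a file, which is a convenient way to check what the index holds without committing.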

The Working Tree

The working tree is the ordinary directory of files you open in your editor. It is the only one of the three your compiler, your tests, and your tools actually see. Changes here are just changes on disk; Git does nothing with them until you stage them. This is also the only tree whose uncommitted contents Git cannot recover for you — lose unstaged edits and the reflog cannot bring them back.

The Index (Staging Area)

The index is a binary file at .git/index holding the proposed contents of the next commit. git add copies the current content of a file from the working tree into the index. The crucial subtlety: it captures a snapshot at the moment you run it. Edit the file again afterward and those new edits are not in the index — the staged version is frozen as of the add, and a plain commit will record the frozen version, not your latest disk content.
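The snapshot behavior can be checked with a short sketch like the one below. It assumes a scratch repository and a hypothetical file.txt; the identity settings exist only so the commit succeeds.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > file.txt
git add file.txt                  # the index captures "first draft"

echo "second draft" > file.txt    # edited after staging, not re-added

git commit -q -m "Record file"    # commits what the index holds

committed=$(git show HEAD:file.txt)   # version recorded in the commit
on_disk=$(cat file.txt)               # version still on disk

echo "committed: $committed"
echo "on disk:   $on_disk"
```

Here the commit should contain "first draft" while the disk still holds "second draft"; running git add file.txt again before committing would have staged the newer content.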

Because the index is a distinct, persistent layer, you can build up exactly the change you want to record — even part of a single file, via git add -p — while leaving other edits unstaged. That is the lever that makes focused, reviewable commits possible.

HEAD

HEAD is a pointer to the commit that represents your last committed state, usually reached through the current branch. It is the baseline the index is compared against: git diff --staged compares the index to HEAD, while plain git diff compares the working tree to the index. Knowing which comparison you are looking at is what stops a staged change from looking like it "disappeared."
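The two comparisons can be watched side by side with a sketch along these lines. The repository and the file a.txt are illustrative assumptions; --name-only is used just to keep the output short.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "Add a.txt"

echo "two" >> a.txt              # an unstaged edit

unstaged_before=$(git diff --name-only)            # working tree vs index
staged_before=$(git diff --staged --name-only)     # index vs HEAD

git add a.txt                    # move the edit into the index

unstaged_after=$(git diff --name-only)
staged_after=$(git diff --staged --name-only)

echo "before add: diff=[$unstaged_before] staged=[$staged_before]"
echo "after add:  diff=[$unstaged_after] staged=[$staged_after]"
```

Before the add, the edit appears only in plain git diff; after it, the same edit appears only in git diff --staged.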

How Commands Move Content Between Trees

Read the everyday commands as movements between the three trees. git add copies content from the working tree into the index. git commit writes the index into a new commit and advances HEAD. git restore <file> copies from the index back over the working tree (or from another commit, such as HEAD, when you pass --source). git reset moves HEAD and, depending on its flag, the index and working tree with it: --soft moves HEAD only, --mixed (the default) also resets the index, and --hard also overwrites the working tree.

That last one is the dangerous member of the family: --hard resets all three trees, and any uncommitted working-tree changes it discards are gone for good. The reflog records where HEAD and branch tips have pointed, so content that was never committed is not reachable from it.
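The difference between the three reset modes can be sketched by applying each one to the same starting point and checking which trees changed. The scratch repository and the file f.txt are assumptions for illustration.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "v1" > f.txt; git add f.txt; git commit -q -m "first"
echo "v2" > f.txt; git add f.txt; git commit -q -m "second"
second=$(git rev-parse HEAD)

# --soft: HEAD moves back; index and working tree keep v2.
git reset -q --soft HEAD~1
soft_index=$(git show :f.txt)
soft_disk=$(cat f.txt)
git reset -q --soft "$second"     # return to the starting point

# --mixed: HEAD and index move back; working tree keeps v2.
git reset -q --mixed HEAD~1
mixed_index=$(git show :f.txt)
mixed_disk=$(cat f.txt)
git reset -q --hard "$second"     # return to the starting point

# --hard: HEAD, index, and working tree all move back to v1.
git reset -q --hard HEAD~1
hard_index=$(git show :f.txt)
hard_disk=$(cat f.txt)

echo "soft:  index=$soft_index disk=$soft_disk"
echo "mixed: index=$mixed_index disk=$mixed_disk"
echo "hard:  index=$hard_index disk=$hard_disk"
```

In this sketch nothing is lost because v2 was committed first and its hash was saved; an edit that existed only in the working tree would not survive the --hard step.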

The Staging Area as a Feature

Most version control systems commit the whole modified working copy. Git's separate index lets you choose precisely what goes into each commit, so one chaotic afternoon of edits can become three clean commits: the bug fix, the refactor, and the unrelated typo, each reviewable on its own. The discipline of staging deliberately is what makes git log and code review worth reading later.
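A minimal sketch of that workflow, assuming three hypothetical files (app.txt, util.txt, README.txt) that were all edited in one sitting, might look like this. Each edit is staged and committed separately by path.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "app"    > app.txt
echo "util"   > util.txt
echo "raedme" > README.txt
git add .
git commit -q -m "Initial import"

# One messy afternoon: three unrelated edits, all unstaged.
echo "app, fixed"      > app.txt
echo "util, rewritten" > util.txt
echo "readme"          > README.txt

# Stage and commit each concern on its own.
git add app.txt;    git commit -q -m "Fix bug in app"
git add util.txt;   git commit -q -m "Refactor util"
git add README.txt; git commit -q -m "Fix typo in README"

commit_count=$(git rev-list --count HEAD)
last_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
leftover=$(git status --short)

git log --oneline
```

Each of the three follow-up commits touches exactly one file, so a reviewer can read the bug fix without wading through the refactor.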

Staging Area vs Direct Commit

Staging area (Git): a persistent index lets you select exactly which changes, down to individual hunks with git add -p, go into the next commit. Choose Git's model when commit quality and reviewability matter.

Direct commit (SVN, default Mercurial): the whole modified working copy is committed unless you name specific paths; there is no persistent "staged but not committed" layer. Simpler, but any partial commit has to be chosen at commit time rather than built up beforehand.

Common Mistakes

Editing a file after git add and assuming the new edits are staged — the index holds the content as of the add, so the commit records the older version unless you re-add.
Running git reset --hard to "undo staging" and silently destroying uncommitted working-tree changes, which the reflog cannot recover.
Confusing git restore <file> (discards working-tree changes) with git restore --staged <file> (unstages but keeps changes): pick the wrong one and you either lose edits you meant to keep or leave staged what you meant to unstage.
Committing without running git status first, then discovering only half the intended files were staged.
Reading plain git diff after staging everything, seeing nothing, and concluding the change vanished — it is staged, so git diff --staged is the right view.
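The restore mix-up in particular is easy to see in a scratch repository. The sketch below is an illustration under assumed names (f.txt and its contents), and it needs a Git version that ships git restore.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > f.txt
git add f.txt
git commit -q -m "Add f.txt"

echo "edited" > f.txt
git add f.txt                     # the edit is staged

# Unstage: the index goes back to HEAD's copy, the disk copy is kept.
git restore --staged f.txt
unstaged_index=$(git show :f.txt)
unstaged_disk=$(cat f.txt)

# Discard: the disk copy is overwritten from the index.
git restore f.txt
discarded_disk=$(cat f.txt)

echo "after --staged: index=$unstaged_index disk=$unstaged_disk"
echo "after restore:  disk=$discarded_disk"
```

The first command is safe to run when you only want to rethink what goes into the commit; the second one throws the edit away.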

Best Practices

Run git status before every commit to confirm exactly which paths are staged.
Use git add -p to stage hunks selectively so each commit is one coherent change.
Re-run git add on a file after editing it, since staging captures a point-in-time snapshot.
Use git restore --staged <file> to unstage and git restore <file> to discard, reserving git reset --hard for when losing working-tree changes is the intent.
Use git diff for working-vs-index and git diff --staged for index-vs-HEAD to see each boundary distinctly.
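As a sketch of the pre-commit check, git status --short prints two status columns per path: the first describes the index relative to HEAD, the second the working tree relative to the index. The file names below are hypothetical and chosen to match the state each one ends up in.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "a" > staged.txt
echo "b" > unstaged.txt
git add .
git commit -q -m "Baseline"

echo "a2" > staged.txt
echo "b2" > unstaged.txt
echo "c"  > new.txt
git add staged.txt               # only this edit goes into the index

status=$(git status --short)
printf '%s\n' "$status"
# "M " in the first column: staged, will be in the next commit
# " M" in the second column: modified on disk but not staged
# "??": untracked, Git is not recording it at all
```

Reading this before committing shows that only staged.txt would be recorded, which is the cue to either stage the rest or commit deliberately without it.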

Comparable tools

Mercurial: no persistent staging; interactive selection per commit
Perforce: changelists group files, but are not a content snapshot
Subversion: commits the working copy directly

Knowledge Check

You run git add file.txt, then edit file.txt again, then git commit. What gets committed?

The content as of the add; the later edits are unstaged and not in the commit
The latest content on disk, because commit always reads straight from the working tree
Both versions of the file, recorded in the one commit as two separate sequential changes
Nothing at all — Git refuses to commit any file that was edited again after staging

Which command unstages a file while keeping your changes on disk?

git restore --staged <file>
git restore <file>
git reset --hard <file>
git rm --cached <file>

A staged change shows in git diff --staged but not in plain git diff. Why?

Plain git diff compares working tree to index; once content is staged it matches the index, so the difference is only visible against HEAD
The change was lost during the staging step and only a partial cached copy of it now remains tucked away inside the index, with the rest gone
git diff only ever shows brand-new untracked files that Git has never recorded, and it never shows ordinary edits to already-tracked ones
Staging physically deletes the working-tree copy of the file off disk, so plain diff is left with nothing on either side to compare

What does git reset --hard touch that git reset --soft does not?

The index and the working tree — --soft moves only HEAD, while --hard overwrites all three trees
Only the commit message of HEAD, rewriting its text in place while leaving every one of the three trees fully untouched
The remote tracking branch on the server as well as the local one, quietly syncing both of them in a single step
Nothing different at all between them; the two flags are simply interchangeable aliases for the exact same operation
