The Index in Depth
The index — the staging area — is a single binary file at .git/index. It holds a flat, sorted list of every tracked path along with that path's blob SHA, file mode, and a cache of stat data. It is the proposed contents of your next commit, and it is also the data structure that makes git status fast enough to run on every prompt.
Most people meet the index as "the thing git add writes to." Looking at what it actually stores explains a lot: why status is quick, why a file edited after add is not staged, and why two confusingly similar flags exist for "stop noticing my local changes."
What the Index Actually Stores
Each entry is a path with its mode, the SHA of the blob holding that path's content, and cached stat fields — modification time, size, inode. Critically, the index references a blob SHA; it does not store the file's bytes. The content already lives in the object database, and the index just names which blob represents each path right now.
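The point that an entry names a blob rather than holding bytes can be checked in a throwaway repository. The following is a minimal sketch, assuming a scratch directory from mktemp and a hypothetical README.md; none of these names come from a real project.

```shell
set -eu
repo=$(mktemp -d)            # hypothetical scratch repository
cd "$repo"
git init -q

printf 'hello\n' > README.md
git add README.md

# Second field of `git ls-files --stage` is the blob SHA the entry points at.
staged_sha=$(git ls-files --stage README.md | awk '{print $2}')
# Hash the file on disk the same way `git add` did.
worktree_sha=$(git hash-object README.md)
# The bytes are read back from the object database, not from .git/index.
staged_content=$(git cat-file -p "$staged_sha")

# Edit after staging: the index entry keeps naming the old blob.
printf 'edited after add\n' >> README.md
edited_sha=$(git hash-object README.md)
still_staged=$(git ls-files --stage README.md | awk '{print $2}')

echo "staged:  $staged_sha"
echo "on disk: $edited_sha"
```

The last two lines print different hashes, which is why a file edited after git add is not staged until it is added again.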
The stat cache is the performance trick. To answer "did this file change?", Git compares the file's current stat data against the cached values; only if they differ does it bother re-hashing the content. That is how git status stays fast in a large tree — it skips reading files that look untouched.
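The cached stat fields can be inspected with git ls-files --debug. The sketch below assumes a scratch repository and a hypothetical file name; the debug output is meant for manual inspection and its exact layout may differ between Git versions.

```shell
set -eu
repo=$(mktemp -d)            # hypothetical scratch repository
cd "$repo"
git init -q

printf 'hello\n' > README.md
git add README.md

# Dump the cached stat data (ctime, mtime, dev, ino, uid, gid, size) for the entry.
debug_info=$(git ls-files --debug README.md)
echo "$debug_info"

# Change only the timestamp. The stat data no longer matches, so Git
# re-hashes the content, finds the same blob, and reports nothing.
touch README.md
status_after_touch=$(git status --porcelain)
echo "status after touch: '$status_after_touch'"
```

Here the file is staged but never committed, so status would normally list it as added; filtering that out is left aside, and the comparison of interest is that the touch alone adds no "modified" entry.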
The Three Trees and the Index Between Them
The index sits between HEAD (your last commit) and the working tree (files on disk). The two diff commands target the two boundaries: git diff compares the working tree against the index, showing unstaged changes, while git diff --cached compares the index against HEAD, showing what is staged for the next commit. Knowing which boundary you are looking at is what stops a staged change from appearing to vanish.
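The two boundaries can be seen side by side by giving HEAD, the index, and the working tree three different versions of one file. This is a sketch under assumed names (notes.txt, a placeholder identity) in a scratch repository.

```shell
set -eu
repo=$(mktemp -d)            # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example"                  # placeholder identity
git config user.email "example@example.invalid"

printf 'v1\n' > notes.txt
git add notes.txt
git commit -q -m "initial"                      # HEAD holds v1

printf 'v2\n' > notes.txt
git add notes.txt                               # index holds v2

# Right after staging, plain `git diff` is empty; the change shows up
# only at the index-versus-HEAD boundary.
unstaged_after_add=$(git diff --name-only)
staged_after_add=$(git diff --cached --name-only)

printf 'v3\n' > notes.txt                       # working tree holds v3
unstaged_after_edit=$(git diff --name-only)
index_content=$(git show :notes.txt)            # ":path" reads the index copy

echo "staged: $staged_after_add, unstaged: $unstaged_after_edit, index has: $index_content"
```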
Plumbing Around the Index
Three plumbing commands operate on the index directly. git ls-files --stage dumps every entry with its mode and blob SHA, git update-index manipulates entries, and git write-tree turns the current index into a tree object — exactly the step git commit performs internally before recording the commit.
$ git ls-files --stage
100644 8d0e41234f5a6b7c... 0 README.md
100644 1a2b3c4d5e6f7a8b... 0 src/main.py
$ git write-tree
7a1c9e4f2b8d6c0a3f5e1b9d8c2a4e6f0b3d7c5a
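The claim that git commit performs the write-tree step internally can be checked by comparing the tree written from the index with the tree the resulting commit records. A minimal sketch, assuming a scratch repository with hypothetical files and a placeholder identity:

```shell
set -eu
repo=$(mktemp -d)            # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example"                  # placeholder identity
git config user.email "example@example.invalid"

mkdir src
printf 'readme\n' > README.md
printf 'print("hi")\n' > src/main.py
git add .

# One flat, sorted entry per tracked path.
entry_count=$(git ls-files --stage | wc -l | tr -d ' ')

# Turn the current index into a tree object by hand...
tree_from_index=$(git write-tree)

# ...then let `git commit` do it and read the tree the commit points at.
git commit -q -m "initial"
tree_from_commit=$(git rev-parse 'HEAD^{tree}')

echo "write-tree:  $tree_from_index"
echo "commit tree: $tree_from_commit"
```

Both lines print the same hash, since the index did not change between the two steps.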
assume-unchanged and skip-worktree
Two flags look similar and are constantly confused. git update-index --assume-unchanged <file> is a performance hint: it tells Git to stop checking that file for changes, but Git is free to ignore the promise and will happily clobber the flag on a checkout or merge. git update-index --skip-worktree <file> is the right tool for "keep my local edits to a tracked config file" — it is designed to persist your changes and survives far more operations.
The practical rule: reach for --skip-worktree when you want local edits to a tracked file to stick, and treat --assume-unchanged as nothing more than a speed hint you cannot rely on.
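Which flag is set on an entry can be read back with git ls-files -v, which prints S for skip-worktree and a lowercase letter for assume-unchanged. The sketch below uses a hypothetical config.ini in a scratch repository with a placeholder identity.

```shell
set -eu
repo=$(mktemp -d)            # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example"                  # placeholder identity
git config user.email "example@example.invalid"

printf 'debug=false\n' > config.ini
git add config.ini
git commit -q -m "add config"

# Mark the entry, then make a local edit.
git update-index --skip-worktree config.ini
printf 'debug=true\n' > config.ini

status_with_flag=$(git status --porcelain)            # edit is not reported
skip_letter=$(git ls-files -v config.ini | cut -c1)   # "S" marks skip-worktree

# Clearing the flag makes the edit visible again.
git update-index --no-skip-worktree config.ini
status_without_flag=$(git status --porcelain)

# For comparison, assume-unchanged shows up as a lowercase letter.
git update-index --assume-unchanged config.ini
assume_letter=$(git ls-files -v config.ini | cut -c1)
git update-index --no-assume-unchanged config.ini

echo "skip-worktree: $skip_letter, assume-unchanged: $assume_letter"
```

Running git ls-files -v before a pull is a cheap way to find entries that still carry either flag.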
The Sparse Index
On a monorepo, sparse-checkout cone mode plus a sparse index lets the index store a single collapsed tree entry for whole directories outside your cone, instead of one entry per file. That keeps operations cheap when millions of paths exist outside the slice you actually work in. Enable it with git sparse-checkout set --cone <dirs> and by setting index.sparse to true (git config index.sparse true); without cone mode the index cannot collapse those entries and you get none of the benefit.
assume-unchanged — a performance hint that Git may disregard, and that checkout or merge will reset, overwriting your local edits. Use it only to speed up status on files you genuinely will not touch, never to protect local changes.
skip-worktree — designed to persist local edits to a tracked file and to survive most operations. This is the correct flag for "ignore my local changes to a committed config file." Choose it whenever the goal is keeping your edits.
- Using --assume-unchanged to hide local config edits, then losing them when a pull or merge resets the flag and overwrites the file.
- Believing the index stores file content and reasoning about its size accordingly; it holds blob SHAs and stat data, and the content is in the object store.
- Letting a script write files directly and then committing with a stale index, so the changes silently miss the commit because they were never staged.
- Enabling the sparse index without cone mode and getting none of the index-collapsing benefit, since only cone mode can collapse out-of-cone trees.
- Reading plain git diff after staging and seeing nothing, then assuming the change was lost, when git diff --cached is the view against HEAD.
- Persist local edits to a tracked file with git update-index --skip-worktree <file>, not --assume-unchanged.
- Inspect exactly what is staged with git ls-files --stage.
- Distinguish unstaged from staged changes using git diff versus git diff --cached.
- Enable the sparse index on a large monorepo with git sparse-checkout set --cone <dirs> plus index.sparse=true.
- Materialize the current index into a tree for inspection with git write-tree.
Knowledge Check
What does each index entry reference for a path's content?
- The blob's SHA, along with the mode and cached stat data — not the bytes themselves
- A full inline copy of the file's current content, stored verbatim inside the index entry itself
- A line-level diff of the path against its version in HEAD
- The working-tree path only, with content resolved lazily on read
Why does the index cache stat data like mtime and size?
- So git status can skip re-hashing files whose stat data is unchanged, keeping it fast
- To keep a redundant backup copy of each file's metadata for later recovery after corruption
- To let Git restore each file's original timestamps on the next checkout
- Because the resulting commit object requires the stat fields
You want your local edits to a tracked config file to stick. Which flag?
- git update-index --skip-worktree <file>
- git update-index --assume-unchanged <file>
- git rm --cached <file>
- git add --intent-to-add <file>
What does git diff --cached compare?
- The index against HEAD — what is staged for the next commit
- The working tree against the index — your as-yet unstaged edits to tracked files
- The working tree directly against HEAD, ignoring the index entirely
- Two arbitrary commits passed explicitly as arguments
Why does a sparse index need cone mode to help?
- Only cone mode lets the index collapse whole out-of-cone directories into a single tree entry
- Cone mode encrypts the out-of-cone paths so they take up far less space on disk
- Without cone mode the index is stored as plain text and so it cannot be made sparse
- Cone mode disables the per-file stat cache entirely, and that cache is the thing that was slowing status down