Topic 19

Bisect

History

git bisect binary-searches the commit graph to find the exact commit that introduced a regression. Instead of reading diffs across thousands of commits, you mark one known-good and one known-bad commit, and Git walks you to the midpoint, halving the suspect range with every answer you give.

The math is what makes it worth reaching for: a binary search costs about log2(N) tests, so 8,000 commits collapse to roughly 13 steps. On a long-lived repository, that is the difference between an afternoon of guessing and a few minutes of mechanical answers — especially once you automate it.
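The step count is easy to estimate before you start. The sketch below uses a hypothetical helper, bisect_steps (not a Git command), that counts how many times a range of N commits can be halved before a single commit remains.

```shell
# bisect_steps N: print roughly how many tests a bisect over N commits needs.
# Hypothetical helper for illustration only.
bisect_steps() {
  n=$1
  steps=0
  while [ "$n" -gt 1 ]; do
    n=$(( (n + 1) / 2 ))    # each answer discards about half of the range
    steps=$(( steps + 1 ))
  done
  echo "$steps"
}

bisect_steps 8000    # prints 13
```

On a real repository the commit count could come from git rev-list --count v1.4.0..HEAD, where v1.4.0 stands in for your known-good commit.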

The Binary Search

You give Git two anchors: a good commit where the bug is absent and a bad commit where it is present. Git checks out the commit halfway between them; you test and report. Each report halves the remaining range, and Git converges on the first commit where good flips to bad. The search assumes the property is monotonic: once broken, it stays broken. It also trusts every verdict you give, which is why a mislabeled commit sends the search into the wrong half of the graph.
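The narrowing can be simulated without a repository. This is a sketch under simplifying assumptions: the history is linear, commits are numbered 1 to N, and FIRST_BAD, is_bad, and find_first_bad are made-up names standing in for the real regression and the real build-and-test step.

```shell
# Simulated history: commits are numbered 1..N in order, and FIRST_BAD is the
# commit where the regression landed (a made-up value for this demo).
FIRST_BAD=5371

# Stand-in for "check out commit $1, build it, and run the test".
is_bad() { [ "$1" -ge "$FIRST_BAD" ]; }

# find_first_bad GOOD BAD: narrow the range until GOOD and BAD are adjacent.
find_first_bad() {
  good=$1
  bad=$2
  while [ $(( bad - good )) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))
    if is_bad "$mid"; then
      bad=$mid     # the regression is at mid or earlier
    else
      good=$mid    # the regression is after mid
    fi
  done
  echo "$bad"
}

find_first_bad 1 8000    # prints 5371
```

The loop keeps one invariant: good is always a working commit and bad is always a broken one. A single wrong answer breaks that invariant, and the loop still terminates but on the wrong commit.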

The Manual Workflow

Start with git bisect start, then mark the current broken state with git bisect bad and a known-good commit with git bisect good <sha>. Git checks out a midpoint; you build and test, then run git bisect good or git bisect bad for each step it presents. When Git announces the first bad commit, run git bisect reset to return to the branch you started on. Skipping that final reset leaves you stranded on a detached midpoint commit.
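To try the manual steps safely, the following sketch replays them in a throwaway repository. Everything about the repository is invented for the demo: make_demo_repo and manual_bisect are hypothetical helpers, the history has 8 commits, and a file named status reads "broken" from commit 6 onward. The grep inside the loop stands in for a person building and testing each midpoint.

```shell
# make_demo_repo: create a throwaway repository and print its path.
make_demo_repo() {
  dir=$(mktemp -d) || return 1
  (
    cd "$dir" || exit 1
    git init -q .
    git config user.email demo@example.com
    git config user.name "Demo User"
    git config commit.gpgsign false
    for i in 1 2 3 4 5 6 7 8; do
      if [ "$i" -ge 6 ]; then echo broken > status; else echo ok > status; fi
      echo "$i" > counter
      git add status counter
      git commit -q -m "commit $i"
    done
  ) >/dev/null 2>&1 || return 1
  echo "$dir"
}

# manual_bisect REPO: replay the manual workflow, print the culprit's subject.
manual_bisect() {
  (
    cd "$1" || exit 1
    LC_ALL=C; export LC_ALL                  # English messages for the match below
    root=$(git rev-list --max-parents=0 HEAD)
    git bisect start >/dev/null 2>&1
    git bisect bad >/dev/null 2>&1           # the current checkout shows the bug
    out=$(git bisect good "$root" 2>&1)      # the first commit is known to work
    tries=0
    while [ "$tries" -lt 10 ]; do            # safety cap for the demo
      case $out in *"is the first bad commit"*) break ;; esac
      # The "test": a human would build and try the software here.
      if grep -q broken status; then verdict=bad; else verdict=good; fi
      out=$(git bisect "$verdict" 2>&1)
      tries=$(( tries + 1 ))
    done
    git show -s --format=%s refs/bisect/bad  # subject of the first bad commit
    git bisect reset >/dev/null 2>&1         # back to the starting branch
  )
}

repo=$(make_demo_repo) && manual_bisect "$repo"    # prints: commit 6
```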

Automated Bisect

When the regression has a scripted check, git bisect run <script> drives the entire search by the script's exit code: 0 means good, 1 through 127 (except 125) means bad, and 125 means "this commit cannot be tested, skip it." Reserving 125 for untestable commits is the detail that separates a clean automated bisect from a misleading one. A subtle trap: if your script returns a generic 1 from a setup failure rather than the actual test, Git will mark perfectly good commits as bad and finger the wrong change.

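git bisect start
git bisect bad                 # the current checkout shows the regression
git bisect good v1.4.0         # last version known to work
git bisect run ./test.sh       # Git tests each midpoint using the script's exit code
git bisect reset               # return to the branch you started on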
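A test script along the following lines keeps setup failures apart from real test failures. It is a sketch, not a prescribed layout: run_check is a hypothetical helper, and make and ./run-tests are placeholder build and test commands that the text above does not specify.

```shell
# run_check BUILD_CMD TEST_CMD: translate the outcome into a bisect exit code.
run_check() {
  # Build failed: the commit cannot be tested, so ask Git to skip it.
  sh -c "$1" >/dev/null 2>&1 || return 125
  # Build succeeded and the test passed: good.
  sh -c "$2" >/dev/null 2>&1 && return 0
  # Build succeeded and the test failed: bad. Returning a fixed 1 also keeps
  # stray codes above 127, which would abort the bisect, out of the result.
  return 1
}

# The last lines of test.sh would then be (placeholder commands):
#   run_check "make" "./run-tests"
#   exit $?
```

Separating the build step from the test step is the design choice that matters here: only the test's own verdict can ever mark a commit bad.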

Skipping and First-Parent

Some commits in the range will not build at all. Mark them with git bisect skip (or exit 125 from a run script) so Git routes around them instead of treating an unbuildable commit as a test result. On a history thick with merged feature branches, git bisect start --first-parent restricts the search to the mainline, ignoring the internal commits of merged branches and often pointing at the merge that introduced the regression rather than some commit deep inside a branch.
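Before choosing between a full bisect and a mainline-only one, it can help to see how much the range shrinks. The sketch below is a rough preview only: count_candidates is a hypothetical helper, and it assumes that git rev-list with --first-parent approximates the set of commits a first-parent bisect would consider.

```shell
# count_candidates GOOD BAD: print how many commits lie between GOOD and BAD,
# counting every commit and then only the mainline (first-parent) commits.
count_candidates() {
  all=$(git rev-list --count "$1..$2") || return 1
  mainline=$(git rev-list --count --first-parent "$1..$2") || return 1
  echo "all=$all first-parent=$mainline"
}
```

Running count_candidates v1.4.0 HEAD inside a repository prints both counts, with v1.4.0 standing in for your known-good commit. A large gap between the two numbers suggests that starting with --first-parent will save many steps.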

Common Mistakes

Swapping the good and bad labels, which sends the search into the wrong subgraph and "finds" an innocent commit that never introduced the bug.
Writing a bisect run script that returns exit code 1 on a setup or build failure, marking commits bad for the wrong reason and convicting the wrong change.
Forgetting git bisect reset at the end, leaving the repository checked out on a detached midpoint commit instead of your branch.
Bisecting across commits that do not compile without using git bisect skip or exit code 125, so unbuildable commits pollute the result.
Letting an uncommitted working-tree change ride along through every checkout, contaminating each test with code that is not part of the commit being tested.

Best Practices

Automate with git bisect run ./test.sh whenever the regression can be detected by a scripted check, so the search runs hands-off.
Reserve exit code 125 in your script for untestable commits so Git skips them instead of recording a false result.
Cut through a messy merge history with git bisect start --first-parent to keep the search on the mainline.
Always finish with git bisect reset to return to your original branch and leave the working tree clean.
Commit or stash all local changes before starting so every checkout the bisect performs is clean and reproducible.
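These practices can be folded into one wrapper. The following is a minimal sketch: safe_bisect is a hypothetical function, and its name and argument order are choices made for this example. It refuses to start on a dirty tree, treats HEAD as the bad commit, runs the scripted check, and always resets afterwards.

```shell
# safe_bisect GOOD CMD [ARGS...]: bisect from GOOD to HEAD with a scripted check.
safe_bisect() {
  good=$1
  shift
  # Refuse to start with local changes (untracked files count as well).
  if [ -n "$(git status --porcelain)" ]; then
    echo "commit or stash local changes first" >&2
    return 1
  fi
  git bisect start HEAD "$good" || return 1   # HEAD is bad, GOOD is good
  git bisect run "$@"
  status=$?
  git bisect reset                            # always return to the starting branch
  return "$status"
}
```

A typical call would be safe_bisect v1.4.0 ./test.sh, run from the branch that shows the regression.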

Comparable tools

Mercurial: hg bisect with the same good/bad/run model
Subversion: no built-in bisect
Perforce: no native bisect
Fossil: fossil bisect with comparable semantics

Knowledge Check

Why does bisect need only about log2(N) tests?

Each good/bad answer halves the remaining suspect range, so the search converges logarithmically — roughly 13 steps for 8,000 commits
It checks out and tests every one of the 8,000 commits but caches each verdict to disk, so all but the very first handful return instantly from the stored cache
It first diffs the range and tests only the commits that touched the failing file, skipping the rest of the history entirely
It spins up a worktree per commit and runs the test suite in parallel across all 8,000 of them, then collects the verdicts

What does exit code 125 signal to git bisect run, and why is it not the same as 1?

125 means the commit is untestable and should be skipped; 1 means the test ran and the commit is bad
Both codes mark the commit bad, but 125 additionally aborts the run and prints the first-bad commit found so far
125 means the commit is good and should bound the search, while 1 means the commit is bad
125 retries the same commit up to three times, while 1 moves on to the next midpoint immediately

What happens if you swap the good and bad labels?

The search proceeds into the wrong subgraph and identifies an innocent commit that did not introduce the regression
Git compares the two tips, detects that the labels are reversed, and refuses to start until you correct the order
The search still lands on the true first-bad commit eventually but takes roughly twice as many midpoint tests to get there
Nothing changes; bisect normalizes the two endpoints internally and ignores which label you attached to each

Why do uncommitted working-tree changes break a bisect run?

They ride along through every checkout, so each test runs against code that is not part of the commit being evaluated
Git discards them when you run bisect start and resets the tree to the first midpoint, silently losing your work
They make Git return exit code 125 on every checkout, forcing each commit in the suspect range to be marked as skipped
They stop Git from computing midpoints, so the binary search can no longer halve the suspect range each step
