Bisect
git bisect binary-searches the commit graph to find the exact commit that introduced a regression. Instead of reading diffs across thousands of commits, you mark one known-good and one known-bad commit, and Git walks you to the midpoint, halving the suspect range with every answer you give.
The math is what makes it worth reaching for: a binary search costs about log2(N) tests, so 8,000 commits collapse to roughly 13 steps. On a long-lived repository, that is the difference between an afternoon of guessing and a few minutes of mechanical answers — especially once you automate it.
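The step count can be checked with a few lines of shell arithmetic. This is only a sketch of the halving argument, not how Git computes its own estimate: the variable names are illustrative, and it models a linear range, so a real graph with merges may not match it exactly.

```shell
# Count how many halvings shrink 8,000 suspect commits down to one.
n=8000      # size of the suspect range (illustrative)
steps=0
while [ "$n" -gt 1 ]; do
    n=$(( (n + 1) / 2 ))      # each verdict keeps at most half the range
    steps=$(( steps + 1 ))
done
echo "$steps"   # prints 13
```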
The Binary Search
You give Git two anchors: a good commit where the bug is absent and a bad commit where it is present. Git checks out a commit roughly halfway between them; you test and report. Each report halves the remaining range, and Git converges on the first commit where good flips to bad. The search assumes the property is monotonic: once broken, it stays broken. That is why a single wrong verdict sends the search into the wrong half of the graph.
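The convergence described above can be simulated without a repository. The sketch below assumes a linear history numbered 1 to N and a made-up FIRST_BAD position; is_bad is a hypothetical stand-in for building and testing a commit.

```shell
# Simulated linear history: commits 1..N, bug first present at FIRST_BAD.
# Both values are invented for illustration.
N=100
FIRST_BAD=37

is_bad() { [ "$1" -ge "$FIRST_BAD" ]; }   # stands in for "build and test"

good=0       # highest position known to be good (0 = just before the range)
bad=$N       # lowest position known to be bad
tests=0
while [ $(( bad - good )) -gt 1 ]; do
    mid=$(( (good + bad) / 2 ))
    tests=$(( tests + 1 ))
    if is_bad "$mid"; then bad=$mid; else good=$mid; fi
done
echo "first bad commit: $bad after $tests tests"
```

Because every verdict moves either good or bad to the midpoint, the gap between them shrinks by about half per test until the two are adjacent, and bad then names the first broken commit.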
The Manual Workflow
Start with git bisect start, then mark the current broken state with git bisect bad and a known-good commit with git bisect good <sha>. Git checks out a midpoint; you build and test, then run git bisect good or git bisect bad for each step it presents. When Git announces the first bad commit, run git bisect reset to return to the branch you started on. Skipping that final reset leaves you stranded on a detached midpoint commit.
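The workflow can be rehearsed safely in a throwaway repository. The sketch below is an assumption-laden demo rather than a real project: the file name, the BUG marker, and the commit messages are invented, and grep stands in for the build-and-test step you would perform by hand.

```shell
# Throwaway repository with five commits; the fourth introduces the bug.
# file.txt, the BUG marker, and the commit messages are all made up.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Bisect Demo"
git config user.email "demo@example.com"

for i in 1 2 3 4 5; do
    echo "line $i" >> file.txt
    if [ "$i" -eq 4 ]; then echo "BUG" >> file.txt; fi
    git add file.txt
    git commit -q -m "commit $i"
done
start_branch=$(git rev-parse --abbrev-ref HEAD)
known_good=$(git rev-list --max-parents=0 HEAD)   # the root commit

git bisect start
git bisect bad                          # the current tip is broken
out=$(git bisect good "$known_good")    # Git checks out the first midpoint

# Answer each midpoint until Git names the culprit.
until printf '%s\n' "$out" | grep -q "is the first bad commit"; do
    if grep -q BUG file.txt; then
        out=$(git bisect bad)
    else
        out=$(git bisect good)
    fi
done

culprit=$(git rev-parse refs/bisect/bad)   # read it before reset clears it
git bisect reset                           # back to the starting branch
git log -1 --format=%s "$culprit"          # prints: commit 4
```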
Automated Bisect
When the regression has a scripted check, git bisect run <script> drives the entire search by the script's exit code: 0 means good, 1 through 127 (except 125) means bad, and 125 means "this commit cannot be tested, skip it." Reserving 125 for untestable commits is the detail that separates a clean automated bisect from a misleading one. A subtle trap: if your script returns a generic 1 from a setup failure rather than the actual test, Git will mark perfectly good commits as bad and finger the wrong change.
git bisect start
git bisect bad
git bisect good v1.4.0
git bisect run ./test.sh
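One way to keep setup failures from being recorded as bad commits is to separate the build step from the test step inside the script. The following is a minimal sketch of what ./test.sh might contain: BUILD_CMD, TEST_CMD, and bisect_verdict are hypothetical names, and the make defaults are placeholders for whatever your project actually uses.

```shell
#!/bin/sh
# Sketch of a test.sh for `git bisect run`. BUILD_CMD and TEST_CMD are
# placeholders (assumptions): substitute your project's real commands.
BUILD_CMD=${BUILD_CMD:-"make"}
TEST_CMD=${TEST_CMD:-"make test"}

bisect_verdict() {
    # Setup or build failure: the commit is untestable, so report 125 (skip)
    # instead of letting a generic failure code mark it bad.
    sh -c "$BUILD_CMD" >/dev/null 2>&1 || return 125

    # The test itself ran: 0 marks the commit good, 1 marks it bad.
    if sh -c "$TEST_CMD" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

# As the last line of the real script, hand the verdict to Git:
#   bisect_verdict; exit $?
```

Keeping the verdict in a function makes the three outcomes easy to check in isolation before trusting the script with a long bisect.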
Skipping and First-Parent
Some commits in the range will not build at all. Mark them with git bisect skip (or exit 125 from a run script) so Git routes around them instead of treating an unbuildable commit as a test result. On a history thick with merged feature branches, git bisect start --first-parent restricts the search to the mainline, ignoring the internal commits of merged branches and often pointing at the merge that introduced the regression rather than some commit deep inside a branch.
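The effect of --first-parent is easiest to see on a tiny history with one merged branch. The sketch below builds such a history in a throwaway repository; every name in it (the feature branch, file.txt, the BUG marker, the commit messages) is invented, and it assumes Git 2.29 or newer, where git bisect start accepts --first-parent.

```shell
# Throwaway repository: main 1, a two-commit feature branch that introduces
# the bug, a merge commit, then main 2. All names are made up.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Bisect Demo"
git config user.email "demo@example.com"

commit() { git add -A && git commit -q -m "$1"; }   # small local helper

echo "base" > file.txt;             commit "main 1"
main_branch=$(git rev-parse --abbrev-ref HEAD)
known_good=$(git rev-parse HEAD)

git checkout -q -b feature
echo "feature work" >> file.txt;    commit "feature 1"
echo "BUG" >> file.txt;             commit "feature 2"

git checkout -q "$main_branch"
git merge -q --no-ff -m "merge feature" feature
echo "more" > other.txt;            commit "main 2"

# Search only the mainline; the branch's internal commits are never tested.
git bisect start --first-parent
git bisect bad
git bisect good "$known_good"
git bisect run sh -c '! grep -q BUG file.txt'

culprit=$(git rev-parse refs/bisect/bad)
git bisect reset
git log -1 --format=%s "$culprit"   # prints: merge feature
```

The search stops at the merge rather than at "feature 2", which is the behavior to expect when the mainline is the unit you want to blame; a second bisect inside the branch can narrow it further if needed.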
- Swapping the good and bad labels, which sends the search into the wrong subgraph and "finds" an innocent commit that never introduced the bug.
- Writing a bisect run script that returns exit code 1 on a setup or build failure, marking commits bad for the wrong reason and convicting the wrong change.
- Forgetting git bisect reset at the end, leaving the repository checked out on a detached midpoint commit instead of your branch.
- Bisecting across commits that do not compile without using git bisect skip or exit code 125, so unbuildable commits pollute the result.
- Letting an uncommitted working-tree change ride along through every checkout, contaminating each test with code that is not part of the commit being tested.
- Automate with git bisect run ./test.sh whenever the regression can be detected by a scripted check, so the search runs hands-off.
- Reserve exit code 125 in your script for untestable commits so Git skips them instead of recording a false result.
- Cut through a messy merge history with git bisect start --first-parent to keep the search on the mainline.
- Always finish with git bisect reset to return to your original branch and leave the working tree clean.
- Commit or stash all local changes before starting so every checkout the bisect performs is clean and reproducible.
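The advice about starting from a clean tree can be enforced with a small guard. This is a sketch only: require_clean_tree is a hypothetical helper name, not a Git command, and it treats untracked files as dirty because git status --porcelain lists them too.

```shell
# Refuse to start a bisect while the working tree has uncommitted changes.
# require_clean_tree is a hypothetical helper, not part of Git.
require_clean_tree() {
    # --porcelain prints one line per changed or untracked path,
    # and nothing at all when the tree is clean.
    if [ -n "$(git status --porcelain)" ]; then
        echo "working tree is dirty: commit or stash before bisecting" >&2
        return 1
    fi
}
```

A typical use would be require_clean_tree && git bisect start, so the bisect never begins with stray edits in place.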
- Mercurial: hg bisect with the same good/bad/run model
- Subversion: no built-in bisect
- Perforce: no native bisect
- Fossil: fossil bisect with comparable semantics
Knowledge Check
Why does bisect need only about log2(N) tests?
- Each good/bad answer halves the remaining suspect range, so the search converges logarithmically — roughly 13 steps for 8,000 commits
- It checks out and tests every one of the 8,000 commits but caches each verdict to disk, so all but the very first handful return instantly from the stored cache
- It first diffs the range and tests only the commits that touched the failing file, skipping the rest of the history entirely
- It spins up a worktree per commit and runs the test suite in parallel across all 8,000 of them, then collects the verdicts
What does exit code 125 signal to git bisect run, and why is it not the same as 1?
- 125 means the commit is untestable and should be skipped; 1 means the test ran and the commit is bad
- Both codes mark the commit bad, but 125 additionally aborts the run and prints the first-bad commit found so far
- 125 means the commit is good and should bound the search, while 1 means the commit is bad
- 125 retries the same commit up to three times, while 1 moves on to the next midpoint immediately
What happens if you swap the good and bad labels?
- The search proceeds into the wrong subgraph and identifies an innocent commit that did not introduce the regression
- Git compares the two tips, detects that the labels are reversed, and refuses to start until you correct the order
- The search still lands on the true first-bad commit eventually but takes roughly twice as many midpoint tests to get there
- Nothing changes; bisect normalizes the two endpoints internally and ignores which label you attached to each
Why do uncommitted working-tree changes break a bisect run?
- They ride along through every checkout, so each test runs against code that is not part of the commit being evaluated
- Git discards them when you run bisect start and resets the tree to the first midpoint, silently losing your work
- They make Git return exit code 125 on every checkout, forcing each commit in the suspect range to be marked as skipped
- They stop Git from computing midpoints, so the binary search can no longer halve the suspect range each step