The Debug Loop: How Claude Code Finds the Bug in 6 Steps Instead of 60
Most developers debug with Claude Code by pasting errors and accepting speculative fixes, which leads to a 40-60 message death spiral. This post proposes a six-step loop: establish a reliable repro (a failing test), isolate the search area in plan mode, dispatch read-only subagents to trace the root cause from multiple angles, fix only the root cause (not symptoms), verify with an automatic hook (e.g., a Stop hook that runs the test), and keep the repro as a permanent regression test. The key insight is that Claude Code was always capable; the failure mode is skipping straight to 'fix' before understanding the bug.
You hit a bug at 2pm. You paste the error into Claude Code. It suggests a fix. Still broken. You paste the new error. Another fix. Still broken. It's 4pm now, you've sent 40 messages, patched three things that weren't the problem, and the original bug is exactly where it started.
Everyone has lived this. And it has nothing to do with the model being weak. It happens because pasting an error and asking for a fix is not debugging; it's gambling. You're asking for a guess, getting a guess, and acting surprised when the guess is wrong.
Real debugging is a process: reproduce the bug, isolate where it lives, trace it to the actual root cause, fix that, verify, and guard against its return. Six steps. Claude Code can run all of them; you just have to stop letting it skip to step four. Here's the loop.
A bug you can't reproduce on demand is a bug you can't fix.
You cannot fix what you cannot trigger. If a bug only shows up sometimes, every "fix" is unfalsifiable: you can never tell whether it worked or you just got lucky. So the loop starts by nailing the bug down. Have Claude Code build a reliable repro: the exact steps, the inputs, or a failing test that triggers it every single time.
ESTABLISH THE REPRO
"Before fixing anything: reproduce this bug reliably.
Write a failing test or a minimal script that triggers it
every time. Show me it failing. Don't propose a fix yet."
✓ Now there's a concrete, repeatable signal of "broken" to fix against
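As a rough illustration of what such a repro might look like, here is a minimal TypeScript sketch. The `averagePrice` function, the empty-cart input, and the expected value of 0 are all hypothetical stand-ins, not taken from any real codebase. The only point is that the check fails the same way on every run.

```typescript
// Hypothetical buggy function: nothing guards against an empty cart.
function averagePrice(prices: number[]): number {
  const total = prices.reduce((sum, p) => sum + p, 0);
  return total / prices.length; // 0 / 0 evaluates to NaN when the cart is empty
}

interface ReproResult {
  passed: boolean;
  actual: number;
}

// Deterministic repro: same input, same failure, every run.
function reproEmptyCart(): ReproResult {
  const actual = averagePrice([]);
  // Assumed expectation for this sketch: an empty cart averages to 0.
  return { passed: actual === 0, actual };
}

const result = reproEmptyCart();
console.log(result.passed ? "PASS" : `FAIL: expected 0, got ${result.actual}`);
```

Run before any fix, this prints `FAIL: expected 0, got NaN` every time, which is the repeatable signal the later steps work against.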
Bound the hunt so the agent doesn't spend 20 minutes reading your whole codebase.
Left unbounded, an agent will happily read your entire codebase looking for a bug. Plan mode fixes this: it lets Claude form a hypothesis about where the bug likely lives and lay out an investigation plan before touching anything. You narrow the search to the suspect area instead of the whole repo.
ISOLATE IN PLAN MODE
# enter plan mode (Shift+Tab twice)
"In plan mode: given this failing test, where is the bug
most likely to live? List the 2-3 most suspect files and
your reasoning. Plan how you'd confirm it before changing
anything."
✓ The search is now bounded to a few suspect files, not the whole repo
Send investigators into the suspect area without bloating your session.
This is the heart of the loop. Instead of one agent reading everything into your main context, you dispatch investigation subagents, each with its own context window, to dig into a specific suspect. They report findings back, and a lead agent assembles them into a root-cause conclusion. This is distributed reasoning: the bug gets cornered from several angles at once.
DISPATCH THE INVESTIGATION
"Launch investigation subagents (read-only) to trace this bug:
- one to follow the data flow into the failing function
- one to check recent changes to the suspect files
- one to inspect the edge case the test exposes
Each reports findings. Then tell me the single root cause,
with the evidence, before proposing any fix."
Why subagents and not one big session: Investigation generates a lot of reading. Done in your main context, it bloats the session and Claude loses the thread. Subagents keep each line of inquiry in its own context and return only the conclusion, so the main session stays sharp.
✓ You get one evidence-backed root cause, not a pile of guesses
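The shape of this step can be pictured as a fan-out followed by a fan-in. The TypeScript sketch below is a conceptual model only: it is not Claude Code's API, every name in it is hypothetical, and the simple tally at the end is a crude stand-in for the lead agent's judgment. It shows each investigator keeping its reading in local notes and handing back only a compact finding.

```typescript
// Conceptual model only: NOT Claude Code's API. Every name here is hypothetical.
interface Finding {
  angle: string;    // which line of inquiry produced this finding
  suspect: string;  // the cause this investigator points at
  evidence: string; // the short summary handed back to the lead
}

interface Conclusion {
  suspect: string;
  evidence: string[];
}

// Each investigator keeps its reading in local notes (its "own context")
// and returns only a compact finding.
function traceDataFlow(): Finding {
  const notes = ["read cart module", "read checkout module", "followed prices into the average"];
  return {
    angle: "data flow",
    suspect: "no empty-cart guard",
    evidence: `empty array reaches the division (${notes.length} steps traced)`,
  };
}

function checkRecentChanges(): Finding {
  const notes = ["scanned recent commits to the suspect files"];
  return {
    angle: "recent changes",
    suspect: "inconclusive",
    evidence: `nothing relevant in ${notes.length} pass over history`,
  };
}

function inspectEdgeCase(): Finding {
  const notes = ["ran the repro with an empty cart", "ran it with one item"];
  return {
    angle: "edge case",
    suspect: "no empty-cart guard",
    evidence: `fails only when the cart is empty (${notes.length} inputs tried)`,
  };
}

// Lead step: group evidence by suspect and report the best-supported one.
function assembleRootCause(findings: Finding[]): Conclusion {
  const evidenceBySuspect: Record<string, string[]> = {};
  for (const f of findings) {
    if (!evidenceBySuspect[f.suspect]) evidenceBySuspect[f.suspect] = [];
    evidenceBySuspect[f.suspect].push(`${f.angle}: ${f.evidence}`);
  }
  let best: Conclusion = { suspect: "unknown", evidence: [] };
  for (const suspect of Object.keys(evidenceBySuspect)) {
    const evidence = evidenceBySuspect[suspect];
    if (evidence.length > best.evidence.length) best = { suspect, evidence };
  }
  return best;
}

const conclusion = assembleRootCause([traceDataFlow(), checkRecentChanges(), inspectEdgeCase()]);
console.log(conclusion.suspect, conclusion.evidence);
```

The lead only ever sees three short findings, never the investigators' notes, which is the property that keeps the main session small.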
The whole point of tracing: now you fix the right thing.
With a confirmed root cause, the fix is targeted instead of speculative. This is where the 60-message approach quietly fails: it patches the symptom (say, a NaN showing up in a cart calculation), so the underlying cause (no empty-cart guard) resurfaces in a new form next week. Tell Claude explicitly to fix the cause you identified, and to flag it if the "fix" is actually just another symptom patch.
FIX THE ROOT CAUSE
"Fix the root cause we identified, not the surface symptom.
Keep the change minimal and targeted. If your fix only
addresses the symptom and not the cause, say so instead
of pretending it's solved."
✓ The fix targets the actual cause, so the bug doesn't mutate and return
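To make the symptom-versus-cause distinction concrete, here is a small TypeScript sketch built on the NaN and empty-cart example. All function names are hypothetical, and returning 0 for an empty cart is an assumed product decision, not something the loop prescribes.

```typescript
// Hypothetical original bug: an empty cart produces 0 / 0 = NaN.
function averagePriceBuggy(prices: number[]): number {
  const total = prices.reduce((sum, p) => sum + p, 0);
  return total / prices.length;
}

// Symptom patch: hide the NaN at one display site. Every other caller
// of averagePriceBuggy still receives NaN.
function formatAverageSymptomPatch(prices: number[]): string {
  const avg = averagePriceBuggy(prices);
  return isNaN(avg) ? "0.00" : avg.toFixed(2);
}

// Root-cause fix: guard the empty cart where the value is computed,
// so no caller can receive NaN from an empty cart.
function averagePriceFixed(prices: number[]): number {
  if (prices.length === 0) return 0; // assumed behavior for this sketch
  const total = prices.reduce((sum, p) => sum + p, 0);
  return total / prices.length;
}
```

The symptom patch makes one screen look right while the function keeps returning NaN to everything else that calls it. The guard changes the function itself, which is why it is the smaller and more durable change.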
Make "fixed" mean the repro test passes, automatically.
The trust-then-verify gap is real: the agent says "fixed," you accept, and later you find the tests are red. The remedy is a hook that runs your test command automatically before "done" can complete. Now your Step 1 repro test runs every time the agent tries to finish, and a fix isn't a fix until that test goes green.
.claude/settings.json // VERIFY-BEFORE-DONE HOOK
{
  "hooks": {
    "Stop": [{
      "hooks": [{
        "type": "command",
        "command": "npm run test 1>&2 || exit 2"
      }]
    }]
  }
}
# the repro test now runs whenever the agent tries to finish
# exit code 2 blocks the stop and feeds the test output back to Claude
# "fixed" requires green, not a claim
✓ "Fixed" is now backed by a passing test, every time, no exceptions
A senior engineer leaves a test so the bug can never come back silently.
The repro test from Step 1 becomes a permanent regression test. This is what separates fixing a bug from actually closing it: if anyone reintroduces the same cause later, the test catches it immediately, before it ships to production again. Have Claude keep the test, name it clearly, and note the root cause in a comment.
LOCK IN THE GUARD
"Keep the repro test as a permanent regression test. Name it
clearly and add a one-line comment explaining the root cause
it guards against, so future changes can't reintroduce it
silently."
✓ The bug is now closed, not just fixed. It can't come back unnoticed.
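A kept regression test might look something like the TypeScript sketch below. The function and test names are hypothetical, and the sketch assumes the fix was an empty-cart guard that returns 0. It uses a plain function rather than any particular test framework.

```typescript
// Fixed implementation (hypothetical), with the empty-cart guard in place.
function averagePrice(prices: number[]): number {
  if (prices.length === 0) return 0;
  const total = prices.reduce((sum, p) => sum + p, 0);
  return total / prices.length;
}

// Root cause guarded: a missing empty-cart check let 0 / 0 produce NaN.
function regressionEmptyCartAverageIsZero(): boolean {
  return averagePrice([]) === 0;
}

console.log(regressionEmptyCartAverageIsZero() ? "PASS" : "FAIL");
```

The name says what must stay true and the comment says why, so whoever removes the guard later gets a failing test that explains itself.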
The 60-message death spiral happens for one reason: you skip straight to fixing before you've found what's actually broken. Every "fix" is a guess, and guesses patch symptoms. The 6-step loop refuses to fix until the root cause is found, which is the entire difference.
- Reproducing first means you have a real signal for "fixed," not a vibe
- Isolating in plan mode means the agent hunts a few files, not the whole repo
- Subagents trace from multiple angles without drowning your main session
- Fixing the cause means the bug doesn't reappear in a new disguise
- The hook plus regression test means "fixed" is proven and stays fixed
The honest takeaway: Claude Code was always able to trace and fix bugs. The reason most people get 60 messages of symptom-patching is that they never ask it to find the cause first. Run the loop, and the same model that frustrated you yesterday closes the bug today.