Loop Engineering: The AI skill every builder needs in 2026
This community-authored article introduces 'Loop Engineering,' arguing that the most effective AI builders are shifting from one-shot prompting to designing automated feedback loops for AI agents. Rather than crafting a perfect prompt, engineers should build systems that discover, plan, execute, verify, and iterate until a verified outcome is reached. It covers six building blocks (automations, worktrees, skills, plugins/connectors, subagents, memory), two loop scales (single-agent vs. fleet), and two types (open vs. closed), while frankly addressing the critical hidden cost of tokens. A practical primer for engineering teams turning AI agents from experiments into production workflows.
Most people are still prompting agents manually.
They type one task.
Wait for one answer.
Review it themselves.
Fix the mistakes themselves.
Then prompt again.
That means the human is still the loop.
The next step is different.
You do not just prompt the agent.
You design the loop that prompts the agent, checks the result, decides the next move, and keeps running until the work passes.
That is loop engineering.
"You should not be prompting coding agents anymore. You should be designing loops that prompt your agents."
Then Boris Cherny, head of Claude Code at Anthropic, said the same thing differently:
"I do not prompt Claude anymore. I have loops running that prompt Claude and figure out what to do. My job is to write loops."
Loops sound amazing until you look at the token bill.
A normal agent loop can burn a lot of context fast:
- One medium coding loop can use 50K-200K tokens
- A fleet loop with one orchestrator and several specialist agents can use 500K-2M tokens
- A scheduled daily loop can reach millions of tokens per week
Every retry costs tokens.
Every self-correction costs tokens.
Every verification step costs tokens.
Every subagent costs tokens.
That is the hidden problem nobody talks about enough.
Loop engineering is not hard because the idea is complicated.
It is hard because most people cannot afford to let agents run freely for long.
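To make the arithmetic concrete, here is a rough sketch of how a scheduled loop's weekly usage adds up. The per-run figure is taken from the top of the 50K-200K range quoted earlier. The function name, the retry model, and the example numbers are illustrative assumptions, not measurements.

```python
def weekly_tokens(tokens_per_run: int, runs_per_day: int, retry_rate: float = 0.0) -> int:
    """Estimate weekly token usage for a scheduled loop.

    retry_rate is the assumed fraction of runs that get repeated once in full.
    """
    runs_per_week = runs_per_day * 7
    expected_runs = runs_per_week * (1 + retry_rate)
    return round(tokens_per_run * expected_runs)


# A medium coding loop at 200K tokens per run, scheduled once a day.
print(weekly_tokens(200_000, runs_per_day=1))                  # 1400000
# The same loop if half of the runs need one full retry.
print(weekly_tokens(200_000, runs_per_day=1, retry_rate=0.5))  # 2100000
```

Multiply the result by your provider's per-token price to get a weekly budget before you schedule anything.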
"Easy for you to say, you have unlimited OpenAI access."
That reaction is fair.
This is why cheaper long-context models matter.
If you want loops to run every day, you need:
- Cheap input tokens
- Cheap output tokens
- Large context windows
- Tool calling
- JSON output
- High concurrency
- Enough context to remember what happened earlier in the loop
Without that, loops become expensive experiments.
With that, loops become practical workflows.
For the last two years, most people used agents like this:
You prompt.
The agent answers.
You review.
You find the mistake.
You prompt again.
That works, but it does not scale.
The old way:
- You give a prompt
- The agent gives an output
- You review the output
- You fix the weak parts
- You repeat manually
The new way:
- You define the goal
- The loop discovers what is needed
- The loop plans the work
- The agent executes
- A checker verifies the result
- The loop fixes failures
- The system stops when the goal is reached
Prompting gives an agent an instruction.
Loop engineering gives an agent a job.
Loop engineering is the practice of designing repeatable feedback cycles for AI agents.
The goal is simple:
Get from attempt to verified result without needing a human to manually drive every step.
The basic cycle has five stages:
- Discover
- Plan
- Execute
- Verify
- Iterate
If the output passes, ship it.
If it fails, send it back into the loop.
That is the whole idea.
Not one perfect prompt.
A system that keeps improving the output until it meets the standard.
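The five stages can be sketched as one small function. This is a minimal illustration, not a reference implementation: the `run_loop` name, the callable signatures, and the iteration cap are all assumptions, and the stage functions here are toy stand-ins for real agent calls.

```python
from typing import Callable, Optional


def run_loop(
    goal: str,
    discover: Callable[[str], str],
    plan: Callable[[str, str, Optional[str]], str],
    execute: Callable[[str], str],
    verify: Callable[[str], Optional[str]],
    max_iterations: int = 5,
) -> Optional[str]:
    """Run discover -> plan -> execute -> verify, iterating on failure.

    verify returns None when the output passes, or a failure message that is
    fed back into the next plan. Returns the verified output, or None if the
    iteration cap is reached without a pass.
    """
    context = discover(goal)
    feedback: Optional[str] = None
    for _ in range(max_iterations):
        steps = plan(goal, context, feedback)
        output = execute(steps)
        feedback = verify(output)
        if feedback is None:
            return output  # passed: ship it
    return None  # never passed: hand back to a human


# Toy stages: the checker insists that the work mentions tests.
def toy_plan(goal: str, context: str, feedback: Optional[str]) -> str:
    return goal if feedback is None else f"{goal} ({feedback})"


def toy_verify(output: str) -> Optional[str]:
    return None if "add tests" in output else "add tests"


result = run_loop(
    "write parser",
    discover=lambda goal: "repo notes",
    plan=toy_plan,
    execute=lambda steps: f"done: {steps}",
    verify=toy_verify,
)
print(result)  # done: write parser (add tests)
```

The first attempt fails verification, the failure message goes back into planning, and the second attempt passes. The cap matters as much as the loop: without it, a check that can never pass burns tokens forever.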
There are two basic sizes of loops.
Single-Agent Loop
One agent runs the whole cycle.
It discovers what is needed, plans the work, executes the task, checks the result, and improves it if something fails.
This is like one person rewriting their own draft.
Good for:
- Focused tasks
- Small scopes
- Simple goals
- Content drafts
- Bug fixes
- Research summaries
One brain.
One loop.
Self-improvement.
Fleet Loop
A fleet loop is bigger.
You give one orchestrator agent the main goal.
It breaks the work into pieces.
Then it sends those pieces to specialist agents.
Each specialist can also use smaller subagents for narrow tasks.
Example: "Build a productivity app"
Orchestrator owns the mission
↓
- Research Specialist → Web Researcher
- Engineering Specialist → Code Writer + Debugger
- QA Specialist → Test Writer + Bug Tracker
This is not one agent working alone.
It is closer to a small team running a project end to end.
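A fleet loop's dispatch step might look something like this. It is only a sketch under assumptions: `run_fleet`, the `split` function, and the specialist callables are hypothetical names, and each specialist here is a plain function standing in for an agent call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Specialist = Callable[[str], str]


def run_fleet(
    goal: str,
    split: Callable[[str], Dict[str, str]],
    specialists: Dict[str, Specialist],
) -> Dict[str, str]:
    """Orchestrator: split the goal, dispatch pieces in parallel, collect results."""
    tasks = split(goal)  # maps role -> task description
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {
            role: pool.submit(specialists[role], task)
            for role, task in tasks.items()
        }
        return {role: future.result() for role, future in futures.items()}


def split_app_goal(goal: str) -> Dict[str, str]:
    return {
        "research": f"research for {goal}",
        "engineering": f"build {goal}",
        "qa": f"test {goal}",
    }


# Stand-in specialists; real ones would call a model and could spawn subagents.
specialists = {
    role: (lambda task, role=role: f"[{role}] finished: {task}")
    for role in ("research", "engineering", "qa")
}

results = run_fleet("a productivity app", split_app_goal, specialists)
print(results["qa"])  # [qa] finished: test a productivity app
```

Threads are used here because agent calls are mostly network waits. Every specialist runs its own context, which is why fleet loops sit at the expensive end of the token ranges.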
This is the most important practical distinction.
Not all loops are equal.
Open Loops
Open loops are exploratory.
You give the agent a broad goal and let it search for the path.
This is powerful because the agent can discover things you did not specify.
But it is also expensive and messy.
Open loops can:
- Try too many paths
- Burn too many tokens
- Create low-quality output fast
- Drift away from the real goal
- Become hard to control
Open loops are exciting.
But for most people, they are not the best place to start.
Closed Loops
Closed loops are bounded.
The human designs the path first.
The loop still runs on its own, but inside clear rules.
A closed loop has:
- Clear goal
- Defined steps
- Evaluation after each step
- Stop condition
- Hand-off point if it gets stuck
This is the version that actually pays off today.
It is cheaper.
It is more reliable.
It produces cleaner output.
Start with closed loops.
Open them up later when your checks are strong.
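The five properties of a closed loop map onto very little code. The sketch below is one possible shape, with hypothetical names (`Step`, `run_closed_loop`) and an assumed retry limit. The steps are fixed up front, each one is evaluated, and the loop hands off instead of improvising when it gets stuck.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    run: Callable[[], str]          # the defined step
    passes: Callable[[str], bool]   # evaluation after the step


def run_closed_loop(steps: List[Step], max_retries: int = 2) -> str:
    """Run predefined steps in order; retry failures; hand off if stuck.

    Returns "done" when every step passes, or "handoff:<step name>" when a
    step still fails after max_retries retries.
    """
    for step in steps:
        for _attempt in range(1 + max_retries):
            if step.passes(step.run()):
                break  # step passed, move on to the next one
        else:
            return f"handoff:{step.name}"  # stop condition: stuck, escalate
    return "done"


steps = [
    Step("lint", run=lambda: "clean", passes=lambda out: out == "clean"),
    Step("deploy", run=lambda: "timeout", passes=lambda out: out == "ok"),
]
print(run_closed_loop(steps))  # handoff:deploy
```

The worst-case cost is bounded: at most `len(steps) * (1 + max_retries)` runs, which is what makes a closed loop cheap to reason about.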
Conceptually, every loop has five stages.
But practically, you need six building blocks to make the loop work.
1. Automations
This is the heartbeat.
The automation starts the loop without you manually remembering to run it.
Examples:
- Run every morning
- Run when a PR opens
- Run when a file changes
- Run when a new ticket appears
- Run until all tests pass
If you still need to start everything manually, the loop is not really doing enough work.
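As one example of a trigger, "run when a file changes" can be approximated by comparing content hashes between two points in time. This is a sketch only: the function names and the `*.py` default are assumptions, and in practice a scheduler, CI system, or webhook would usually do this job.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot(root: Path, pattern: str = "*.py") -> Dict[str, str]:
    """Map each matching file under root to a hash of its contents."""
    return {
        str(path): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob(pattern))
        if path.is_file()
    }


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Files that were added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A periodic job would keep the previous snapshot, take a new one, and start the loop only when `changed_files` returns something.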
2. Worktrees
Worktrees matter when multiple agents are editing code.
Without separation, agents collide.
Two agents can edit the same file.
One can overwrite the other.
A worktree gives each agent its own clean workspace and branch.
That lets multiple agents work in parallel without turning the repo into a mess.
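With Git, this is the `git worktree add` command. A small helper could give each agent its own directory and branch. The directory layout, the `agent/<name>` branch naming, and the `main` base branch below are assumptions chosen for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_command(repo: Path, agent: str, base: str = "main") -> List[str]:
    """Build the `git worktree add` command for one agent's workspace."""
    repo = repo.resolve()
    workspace = repo.parent / f"{repo.name}-{agent}"  # sibling directory per agent
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{agent}",  # new branch owned by this agent
        str(workspace),
        base,                    # commit or branch to start from
    ]


def create_worktree(repo: Path, agent: str, base: str = "main") -> None:
    """Create the worktree; raises CalledProcessError if git fails."""
    subprocess.run(worktree_command(repo, agent, base), check=True)


# Show the command without running it.
print(" ".join(worktree_command(Path("my-repo"), "reviewer")))
```

Each agent then edits only inside its own directory, and the branches are merged or discarded once the checks have run.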
3. Skills
Skills are reusable project knowledge.
Instead of explaining your project every time, you write the important context once.
Good skill files include:
- Vision
- Architecture
- Rules
- Build steps
- Testing steps
- Things the agent must never do
Without skills, every loop starts cold.
With skills, every loop starts with accumulated context.
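How skill files reach the agent depends on the tool. As a tool-agnostic sketch, a loop could read a handful of Markdown files and prepend them to every run. The `load_skills` helper and the `skills/` directory are hypothetical, and the file names in the usage comment are only examples.

```python
from pathlib import Path
from typing import Iterable


def load_skills(skill_dir: Path, names: Iterable[str]) -> str:
    """Concatenate the named skill files into one context preamble.

    Missing files are skipped, so a loop can still start with partial context.
    """
    sections = []
    for name in names:
        path = skill_dir / name
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)


# Example call; returns an empty string if none of the files exist.
preamble = load_skills(Path("skills"), ["VISION.md", "ARCHITECTURE.md", "RULES.md"])
```

Keep these files short. They are paid for in input tokens on every run.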
4. Plugins And Connectors
A loop that only sees files is limited.
Connectors let the loop touch your real tools.
Examples:
- GitHub
- Slack
- Linear
- Jira
- Gmail
- Google Drive
- Database
- Staging API
This is the difference between "here is a suggested fix" and "I opened the PR, linked the ticket, watched CI, and posted the update."
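One way to keep a loop independent of any single vendor is to put every tool behind the same small interface. Everything below is hypothetical: the `Connector` protocol, the action names, and the `RecordingConnector` stand-in are invented for illustration and are not the API of GitHub, Linear, Slack, or any agent framework.

```python
from typing import Dict, List, Protocol


class Connector(Protocol):
    def act(self, action: str, payload: Dict[str, str]) -> str: ...


class RecordingConnector:
    """Stand-in that records calls instead of talking to a real service."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.calls: List[str] = []

    def act(self, action: str, payload: Dict[str, str]) -> str:
        self.calls.append(action)
        return f"{self.name}:{action}"


def ship_fix(connectors: Dict[str, Connector], title: str, ticket: str) -> List[str]:
    """Open a PR, link the ticket, and post an update through the connectors."""
    return [
        connectors["github"].act("open_pr", {"title": title}),
        connectors["linear"].act("link_ticket", {"ticket": ticket, "title": title}),
        connectors["slack"].act("post_update", {"text": f"PR opened for {ticket}"}),
    ]


fakes = {name: RecordingConnector(name) for name in ("github", "linear", "slack")}
print(ship_fix(fakes, "Fix login timeout", "ENG-123"))
# ['github:open_pr', 'linear:link_ticket', 'slack:post_update']
```

Recording stand-ins like these also let you test the loop's behavior without touching real systems.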
5. Subagents
The maker and checker should not always be the same model.
The agent that wrote the code will often be too generous when reviewing it.
The agent that wrote the article will miss its own weak sections.
Use separate agents for:
- Exploration
- Implementation
- Review
- Testing
- Fact-checking
- Final summary
Quality improves when the reviewer is not the same agent that made the work.
6. Memory
Memory is what lets a loop continue across runs.
The model forgets.
The repo does not.
The notes do not.
The project log does not.
Memory can live in:
- Markdown files
- Project logs
- Linear tickets
- GitHub issues
- Obsidian vaults
- Databases
- Claude Projects
A long-running loop needs to know what was tried, what passed, what failed, and what still needs to happen.
Without memory, it starts from zero every time.
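The simplest memory is an append-only log file. The sketch below assumes a JSON Lines file and a three-field entry format. Both are choices made for this example rather than any standard.

```python
import json
from pathlib import Path
from typing import Dict, List


def record(log_path: Path, step: str, status: str, note: str = "") -> None:
    """Append one entry to the run log (one JSON object per line)."""
    entry = {"step": step, "status": status, "note": note}
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


def load_history(log_path: Path) -> List[Dict[str, str]]:
    """Read every entry written so far; an absent log means a first run."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]


def still_failing(history: List[Dict[str, str]]) -> List[str]:
    """Steps whose most recent recorded status is not 'passed'."""
    latest = {entry["step"]: entry["status"] for entry in history}
    return [step for step, status in latest.items() if status != "passed"]
```

At the start of each run the loop calls `load_history`, and `still_failing` tells it what still needs to happen.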
Here are the loops that make the idea concrete.
Coding Loop
The loop:
Read VISION.md + ARCHITECTURE.md
↓
Plan next change
↓
Edit code
↓
Run tests
↓
If tests fail → read error → fix → test again
↓
If tests pass → summarize changes
↓
Stop
No human needs to push every step.
The agent writes, tests, fixes, and verifies.
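The test-and-fix part of this loop can be sketched with the standard library. The `edit` callable stands in for the agent and is an assumption, as are the function names and the attempt cap. The test command is whatever your project already uses.

```python
import subprocess
from typing import Callable, List, Tuple


def run_tests(command: List[str]) -> Tuple[bool, str]:
    """Run the test command; return (passed, combined stdout and stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def coding_loop(
    edit: Callable[[str], None],
    test_command: List[str],
    max_attempts: int = 5,
) -> bool:
    """Edit, test, and feed the test output back into the next edit.

    edit receives the previous test output (empty on the first attempt).
    Returns True once the tests pass, False if the attempt cap is reached.
    """
    error = ""
    for _ in range(max_attempts):
        edit(error)  # the agent changes code, seeing the last failure
        passed, output = run_tests(test_command)
        if passed:
            return True
        error = output
    return False
```

A call might look like `coding_loop(agent_edit, ["pytest", "-q"])`, where `agent_edit` is your own wrapper around a coding agent. The exit code of the test command is the check, so the loop is only as trustworthy as the test suite.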
Research Loop
The loop:
Define research question
↓
Search for sources
↓
Summarize findings
↓
Verify claims against sources
↓
Compare conflicting information
↓
Synthesize final answer
↓
Stop when confidence threshold met
This is much better than asking for one quick summary.
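The "confidence threshold" needs a concrete definition before a loop can stop on it. One crude option, assumed here purely for illustration, is the share of claims that cite at least one source the loop actually retrieved. The function names and the 0.8 default are hypothetical.

```python
from typing import Dict, Set


def supported_fraction(claims: Dict[str, Set[str]], sources: Set[str]) -> float:
    """Fraction of claims that cite at least one retrieved source.

    claims maps each claim to the source IDs it cites.
    """
    if not claims:
        return 0.0
    supported = [claim for claim, cited in claims.items() if cited & sources]
    return len(supported) / len(claims)


def should_stop(claims: Dict[str, Set[str]], sources: Set[str], threshold: float = 0.8) -> bool:
    """Stop condition: enough of the claims are backed by retrieved sources."""
    return supported_fraction(claims, sources) >= threshold


claims = {
    "Loops cost more tokens than single prompts": {"s1"},
    "Closed loops are cheaper than open loops": {"s2"},
    "Fleet loops always beat single agents": {"s9"},  # cites a source never retrieved
}
print(should_stop(claims, sources={"s1", "s2"}))  # False
```

Two of the three claims are supported, which is below the threshold, so the loop goes back to searching or drops the unsupported claim. Checking that a citation exists is weaker than checking that the source says what the claim says. That second check is a job for a separate reviewer agent.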
Content Loop
The loop:
Topic + audience + goal defined
↓
Draft created
↓
Critique agent reviews draft
↓
Rewrite based on critique
↓
Score against success criteria
↓
If score passes → publish
↓
If score fails → rewrite again
The loop turns one idea into a content system.
Sales Outreach Loop
The loop:
ICP (Ideal Customer Profile) defined
↓
Find leads matching profile
↓
Enrich with company data
↓
Qualify against criteria
↓
Personalize message
↓
Quality review
↓
Send or escalate to human
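The last two steps are a routing decision. A sketch might look like this, where the `Lead` fields, the thresholds, and the three outcomes are made-up examples rather than recommended values.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    company: str
    employees: int
    review_score: float  # 0.0-1.0 score from the quality-review step


def route(lead: Lead, min_employees: int = 50, min_score: float = 0.7) -> str:
    """Return 'skip', 'send', or 'escalate' for one lead."""
    if lead.employees < min_employees:
        return "skip"  # fails qualification against the ICP
    if lead.review_score >= min_score:
        return "send"  # message passed quality review
    return "escalate"  # qualified lead, weak message: a human decides


print(route(Lead("Acme", employees=200, review_score=0.9)))  # send
print(route(Lead("Initech", employees=200, review_score=0.4)))  # escalate
print(route(Lead("Tiny Co", employees=5, review_score=0.9)))  # skip
```

Escalation is the hand-off point of this closed loop: nothing goes out to a real prospect on a low score.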
Same skeleton:
Goal.
Action.
Check.
Fix.
Repeat until done.
This is the skill gap opening in 2026.
Prompt Engineer
A prompt engineer focuses on better instructions.
They improve the wording.
They get a better single output.
But the human still reviews everything after the run.
The human is still the feedback loop.
Loop Engineer
A loop engineer designs the feedback system.
They decide:
- What starts the loop
- What context the agent needs
- What tools it can use
- What counts as success
- Who checks the work
- When the loop should stop
- Where the result should be saved
A prompt engineer says:
"Write me a function."
A loop engineer says:
"Write it, test it, fix it until it passes, then summarize the change."
Same tools.
Different mindset.
The highest-leverage AI builders are not just writing better English prompts.
They are designing systems that discover, plan, execute, verify, and stop correctly.
Loop engineering is the shift from manual prompting to automated feedback cycles.
The shift:
- Old way: Prompt agents one task at a time
- New way: Design loops that run the full cycle
The 6 things you actually build:
- Automations: The heartbeat that starts the loop
- Worktrees: Parallel agents without file conflicts
- Skills: Project knowledge reused every run
- Plugins and connectors: Access to real tools
- Subagents: Makers and checkers separated
- Memory: The loop remembers across runs
The 2 sizes:
- Single-agent loop: One agent improves its own work
- Fleet loop: Orchestrator plus specialists plus subagents
The 2 types:
- Open loop: Powerful, exploratory, expensive
- Closed loop: Bounded, reliable, affordable
The 5 stages:
- Discover
- Plan
- Execute
- Verify
- Iterate
The real cost problem:
- Loops burn tokens fast
- Cheap long-context models make loops practical
- Without affordable tokens, most people never get past experiments
The mindset shift:
- Prompt engineers ask AI for outputs
- Loop engineers design systems that produce verified outcomes
That is the real unlock.
Stop trying to write one perfect prompt.
Start building the loop that makes imperfect outputs better.
A reliable loop beats a perfect prompt.