Designing a Multi-Step Task Loop That Drives an Agent on Its Own

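Source: x.com · Collected 2026-06-10 06:00 · 18 min read
AI summary

This article presents a loop architecture that lets an AI agent complete multi-step tasks on its own. The core idea is to build an automated prompt-generation system in code rather than hand-writing individual prompts. The article breaks the loop into five parts: defining the acceptance criterion (the done check), building context from state instead of hand-writing an instruction each time, executing the action and capturing all output, feeding failures back into the next prompt to close the loop, and setting hard stop conditions (a maximum number of turns and a cost cap). The author walks through a login-bug fix to show how the loop runs, and points out that the real expense comes from the many turns of model calls rather than from a single code generation, which is why stop conditions are critical. Wrapping recurring operations into reusable skills is described as the key to long-term value, and common beginner mistakes include having no exit condition, intervening in prompts by hand, and discarding failed output. It is aimed at developers who want to move from one-shot prompt engineering to building agent control flow.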

§ 1

In this article, we will learn how to design a loop that prompts your agent - what the loop really is, why a single prompt is not enough, and the five parts we need to build a loop that keeps prompting the agent on its own until the job is done. We will also see where loops came from, what they really cost to run, and the skills that make them powerful.

We will cover the following:

  • The Five Parts of the Loop
  • Step 1: Define What "Done" Looks Like
  • Step 2: Build the Context, Not the Instruction
  • Step 3: Let the Agent Act and Capture Everything
  • Step 4: Close the Loop with Feedback
  • Step 5: Set the Stop Conditions
  • The Full Loop in Code
  • Let's Walk Through One Run
  • Cost of Running the Loop
  • Reusable Skills
  • Common Mistakes

I am Amit Shekhar, Founder @ Outcome School. I have taught and mentored many developers whose efforts landed them high-paying tech jobs, helped many tech companies solve their unique problems, and created many open-source libraries that are used by top companies. I am passionate about sharing knowledge through open-source, blogs, and videos.

I teach AI and Machine Learning at Outcome School.

Let's get started.

§ 2

Before we go into the details, let's understand the big picture.

Most people think working with an AI agent means writing one really good prompt. We type our request, the agent replies, and we are done. This works for a quick question. It does not work for a real job that needs many steps.

A real job looks like this. Write the code. Run the tests. The tests fail. Read the error. Fix the code. Run the tests again. Repeat until everything passes.

We do not want to type each of those prompts by hand. That is slow, and it does not scale. So, here comes the loop to the rescue. Instead of us typing each prompt, we build a small system that writes those prompts for us, turn after turn, until the job is finished.

The new skill is not writing one perfect prompt. The new skill is building the system that keeps prompting the agent for us.

That system is the loop. In this blog, we will learn how to design it.

§ 3

Let's start with the most important idea in this whole blog.

Think of a game of chess. A single prompt is a single move. You look at the board, you make one move, and your turn ends. If you only ever make one move, you can never win a full game.

A loop is different. A loop is a strategy. It is the set of rules that decides every move, checks the board after each move, and keeps playing until the game is won.

A prompt is a single move. A loop is a strategy.

We are no longer playing the game turn by turn, typing each prompt ourselves. We are designing the rules that the agent plays inside. We set up the rules once, and then we let the agent play the full game on its own.

So, how do we design these rules?

Let's learn the five parts of the loop.

§ 4

A loop that prompts the agent has five parts. Do not worry, we will learn about each of them in detail.

  • Define "done" - the check that tells the loop when to stop.
  • Build the context - the fresh information we feed the agent each turn.
  • Act and capture - run the step and grab the result.
  • Close the loop with feedback - turn the result into the next prompt.
  • Set the stop conditions - the guardrails that keep the loop safe.

Here is how these five parts fit together, as below:

        +--------------------------------------------------+
        |                     The Loop                     |
        |                                                  |
        |   Build Context  --->  +-------+                 |
        |                        | Agent |                 |
        |                        +-------+                 |
        |                            |                     |
        |                            | acts                |
        |                            v                     |
        |                    Capture Result                |
        |                            |                     |
        |                            v                     |
        |                    Check "Done"? ----- Yes ---> Stop
        |                            |                     |
        |                            | No (feedback)       |
        |                            |                     |
        |                            +---------------------+
        |                       (next turn)                |
        +--------------------------------------------------+
                    Stop conditions wrap the whole loop

Here, we can see the cycle. We build the context, the agent acts, we capture the result, and we check if we are done. If yes, we stop. If no, we feed the result back as the next prompt and go around again. The stop conditions sit around the whole thing to keep it safe.

Now, let's understand each part one by one.

§ 5

Before anything runs, we must answer one question: how will the loop know it has finished?

This is the first step, and it is the most important one. If we cannot describe "done", the agent has nothing to loop toward. It will keep going forever, or it will stop too early.

So, we write the success check first, before we write anything else. This check must be in code, not in our head. Let's say we are building an agent that fixes a bug. For us, "done" means the tests pass. A few more examples of "done", as below:

  • The tests pass.
  • The output matches a schema.
  • A score clears a threshold.

Let's see this as a simple function, as below:

def is_done(result):
    # done means all tests passed
    return result.tests_passed

Here, we have written a small function that returns True when the work is finished and False when it is not. The loop will call this function after every turn.

This check becomes the heartbeat of our loop. Every turn, the loop checks the heartbeat. If the heart says done, the loop stops. If it says not yet, the loop goes around again.

So we must always start here, with the check. Everything else in the loop is built on top of it.

This is how we define done. Now, let's move to the context.

§ 6

Here is where most people go wrong. They keep hand-feeding the agent. They type a new instruction every time, by hand, and they paste in the files and the errors themselves.

We must stop doing this. Instead of typing the instruction, we build the context.

What does context mean here? It means everything the agent needs to make a good decision this turn, as below:

  • The files it is working on.
  • The tools it can use.
  • The error logs from the last run.
  • The past attempts it has already made.

So, the prompt is no longer typed by us. The prompt is put together from the current state of the system.

Let's see this as code, as below:

def build_prompt(state):
    return f"""
    Goal: {state.goal}
    Files: {state.files}
    Last error: {state.last_error}
    Past attempts: {state.past_attempts}

    Decide the next step and make the change.
    """

Here, we can see that the prompt is built from state. We do not type the error by hand. We read it from the state and drop it into the prompt automatically. When the state changes, the prompt changes with it. That is the whole point.

So, the loop stays the same on every turn. Only the context changes. The same build_prompt function gives a different prompt each time, because the state behind it has moved forward.
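To make "same function, different prompt" concrete, here is a minimal sketch. The `State` dataclass is an assumption: the article only uses the attribute names `goal`, `files`, `last_error`, and `past_attempts`, and the prompt layout is simplified for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class State:
    # Hypothetical state container built around the attribute names used above.
    goal: str
    files: List[str]
    last_error: Optional[str] = None
    past_attempts: List[str] = field(default_factory=list)


def build_prompt(state: State) -> str:
    """Assemble the prompt from the current state; nothing is typed by hand."""
    return (
        f"Goal: {state.goal}\n"
        f"Files: {', '.join(state.files)}\n"
        f"Last error: {state.last_error or 'none yet'}\n"
        f"Past attempts: {len(state.past_attempts)}\n\n"
        "Decide the next step and make the change."
    )


state = State(goal="fix the failing login bug", files=["auth.py"])
first_prompt = build_prompt(state)

# Simulate a failed turn: only the state changes, not build_prompt.
state.last_error = "password check returns true for empty password"
state.past_attempts.append("attempt 1")
second_prompt = build_prompt(state)
```

Here, `first_prompt` has no error in it, while `second_prompt` carries the captured error, even though we called the same function both times.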

This is how we build the context. Now, it's time to let the agent act.

§ 7

Now we run the step. We send the prompt we built to the agent, and the agent does its work. It writes code, it calls a tool, it changes a file.

But the action is only half the job. The other half is to capture everything that came out of it, as below:

  • The diff of what changed.
  • The standard output from running the code.
  • The failure message, if it failed.
  • The new state of the system.

Let's see this as code, as below:

def act_and_capture(prompt, state):
    output = agent.run(prompt)   # the agent does the work
    result = run_checks(output)  # run tests, grab logs, get the diff
    return result

Here, we have run the agent and then captured the result. The result holds the diff, the output, and whether the checks passed.

Now, here is the key idea. This output is not the finish line. It becomes the raw material for the next prompt. The failure we just captured is exactly what we will feed back to the agent so it can fix the problem. So we must capture all of it, and not throw it away.
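The article does not show what `run_checks` does inside. One possible shape, assuming the checks are a command we can run as a subprocess, is sketched below. The `Result` fields and the `command` parameter are assumptions (the original `run_checks` takes the agent output instead), and the inline Python command only stands in for a real test command.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    # Hypothetical record of everything one turn produced.
    tests_passed: bool
    stdout: str
    error: str


def run_checks(command: List[str], timeout: float = 60.0) -> Result:
    """Run a check command and keep its exit status, stdout, and stderr."""
    proc = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        timeout=timeout,      # raises TimeoutExpired if the check hangs
    )
    return Result(
        tests_passed=proc.returncode == 0,
        stdout=proc.stdout,
        error=proc.stderr,
    )


# Stand-in for a real test command; it prints a line and exits with an error.
failing = run_checks(
    [sys.executable, "-c", "import sys; print('ran'); sys.exit('empty password accepted')"]
)
```

Here, nothing is thrown away. The failure text ends up in `failing.error`, ready to be saved into the state for the next turn.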

This is how we act and capture. Now, let's close the loop.

§ 8

We have a result. Now we feed that result back through the "done" check from Step 1. This is what makes it a loop and not just a single move.

There are only two paths, as below:

  • Passed? Stop. The job is done.
  • Failed? Turn the failure into the next prompt, automatically.

This second path is the magic. When the work fails, we do not give up. We take the failure and turn it into the next instruction. Something like: "Tests failed with this error. Fix it."

The agent then re-prompts itself using what just happened. We did not type that new prompt. The loop built it from the failure.

Let's see this as code, as below:

def loop(state):
    while True:
        prompt = build_prompt(state)      # Step 2: build context
        result = act_and_capture(prompt, state)  # Step 3: act and capture

        if is_done(result):               # Step 1: check done
            return result                 # passed, so we stop

        # failed, so turn the failure into the next prompt
        state.last_error = result.error
        state.past_attempts.append(result)

Here, we can see the loop close on itself. If we are done, we return and stop. If we are not done, we save the error into the state, and the next turn's build_prompt will include that error automatically. The agent re-prompts itself using the failure. The loop feeds itself.

This is how we close the loop with feedback. But we have one problem left. Look at the code above. There is no exit if the agent keeps failing forever. Let's solve that next.

§ 9

A loop with no way out is not a system. It is a cost that never stops. If the agent keeps failing and the loop keeps running, it will burn time and money with no end. So we must design the guardrails.

We set the stop conditions once, and then we let the loop run safely. A few important guardrails, as below:

  • Cap the retries. Stop after a fixed number of turns, even if the job is not done.
  • Watch the cost. Stop if we cross a budget for time or money.
  • Add a human checkpoint. For the calls that matter, pause and ask a person before doing something risky.

Let's add these guardrails to our loop, as below:

def loop(state, max_turns=10, max_cost=5.0):
    turns = 0
    cost = 0.0

    while turns < max_turns and cost < max_cost:
        turns += 1
        prompt = build_prompt(state)
        result = act_and_capture(prompt, state)
        cost += result.cost

        if is_done(result):
            return result   # success

        state.last_error = result.error
        state.past_attempts.append(result)

    return "stopped: hit a guardrail"   # safe exit

Here, we have wrapped the loop with two of these guardrails - the retry cap and the cost cap. The loop stops when the job is done, or when it runs out of turns, or when it crosses the budget. There is always an exit. The loop can never run forever.
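The cost guardrail above tracks money. A time budget could be added in a similar way. This is a sketch under assumptions: `run_with_deadline`, `step`, `max_seconds`, and the injectable `clock` are hypothetical names, and `step` stands in for one full turn that returns True when the job is done.

```python
import time
from typing import Callable


def run_with_deadline(
    step: Callable[[], bool],
    max_turns: int = 10,
    max_seconds: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call step() until it returns True, the turn cap is hit, or time runs out."""
    start = clock()
    for _ in range(max_turns):
        # Check the time budget before starting another turn.
        if clock() - start >= max_seconds:
            return "stopped: time budget exceeded"
        if step():
            return "done"
    return "stopped: turn cap reached"
```

Here, the clock is read before each turn, so a turn that is already running is not interrupted. The check only prevents the next one from starting.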

This is how we keep the loop safe.

Note: The human checkpoint matters most for actions that are hard to undo. Deleting a file, sending money, or pushing to production are good examples. For these, we must pause the loop and let a person say yes before the agent acts.
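The loop code above does not include the human checkpoint. One way to sketch it is shown below, assuming each agent action has a name we can inspect before it runs. The names `RISKY_ACTIONS`, `run_action`, `ask_human`, and `ask_on_terminal` are hypothetical.

```python
from typing import Callable

# Hypothetical set of action names that are hard to undo.
RISKY_ACTIONS = {"delete_file", "send_money", "push_to_production"}


def needs_approval(action: str) -> bool:
    """Return True for actions that should wait for a person."""
    return action in RISKY_ACTIONS


def run_action(
    action: str,
    execute: Callable[[str], str],
    ask_human: Callable[[str], bool],
) -> str:
    """Run the action, pausing for a human yes/no when it is risky."""
    if needs_approval(action) and not ask_human(action):
        return f"skipped: {action} was not approved"
    return execute(action)


def ask_on_terminal(action: str) -> bool:
    """One possible ask_human: block until a person answers on the terminal."""
    answer = input(f"Agent wants to run '{action}'. Allow? [y/N] ")
    return answer.strip().lower() == "y"
```

Here, safe actions run straight away, and risky ones run only after `ask_human` returns True. Passing `ask_human` in as a parameter keeps the checkpoint easy to swap, for example for a chat approval instead of the terminal.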

§ 10

Now that we have learned about all five parts, let's put them together in one place, as below:

# Step 1: define done
def is_done(result):
    return result.tests_passed

# Step 2: build the context from state
def build_prompt(state):
    return f"""
    Goal: {state.goal}
    Files: {state.files}
    Last error: {state.last_error}
    Past attempts: {state.past_attempts}

    Decide the next step and make the change.
    """

# Step 3: act and capture
def act_and_capture(prompt, state):
    output = agent.run(prompt)
    return run_checks(output)

# Step 4 and 5: close the loop with feedback, inside the guardrails
def loop(state, max_turns=10, max_cost=5.0):
    turns = 0
    cost = 0.0

    while turns < max_turns and cost < max_cost:
        turns += 1
        prompt = build_prompt(state)
        result = act_and_capture(prompt, state)
        cost += result.cost

        if is_done(result):
            return result

        state.last_error = result.error
        state.past_attempts.append(result)

    return "stopped: hit a guardrail"

Here, we can see the full picture. The five parts work together as one system. We define done, we build the context, the agent acts, we capture the result, we feed it back, and the guardrails keep it all safe.

We wrote this once. Now the agent can finish a multi-step job on its own. It works perfectly.

§ 11

The best way to learn this is by taking an example. Let's say the goal is "fix the failing login bug". Let's walk through the loop turn by turn.

Turn 1: The state has the goal and the files, but no error yet. We build the prompt and send it. The agent changes the code. We capture the result. The tests still fail with "password check returns true for empty password". We are not done, so we save this error into the state.

Turn 2: Now build_prompt includes the new error from Turn 1. The agent reads "password check returns true for empty password" and fixes that exact line. We capture the result. The tests pass.

Turn 3: There is no Turn 3. The is_done check returned True on Turn 2, so the loop stopped on its own.

Here, we can notice the most important thing. We did not type a single prompt during this run. The error from Turn 1 became the prompt for Turn 2, automatically. The loop prompted the agent for us, and it stopped the moment the job was done.
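The two turns above can be replayed with a stub in place of the real agent. This is a sketch only: `fake_agent` is a hypothetical stand-in for the model plus the test run, the prompt is trimmed down, and the 0.02 cost per turn is a made-up number.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Result:
    tests_passed: bool
    error: Optional[str] = None
    cost: float = 0.0


@dataclass
class State:
    goal: str
    last_error: Optional[str] = None
    past_attempts: list = field(default_factory=list)


def build_prompt(state: State) -> str:
    return f"Goal: {state.goal}\nLast error: {state.last_error}"


def fake_agent(prompt: str) -> Result:
    """Stub: it only 'fixes' the bug once the prompt contains the captured error."""
    if "empty password" in prompt:
        return Result(tests_passed=True, cost=0.02)
    return Result(
        tests_passed=False,
        error="password check returns true for empty password",
        cost=0.02,
    )


def loop(state: State, max_turns: int = 10, max_cost: float = 5.0):
    turns, cost, prompts = 0, 0.0, []
    while turns < max_turns and cost < max_cost:
        turns += 1
        prompt = build_prompt(state)
        prompts.append(prompt)          # keep the prompts so we can inspect them
        result = fake_agent(prompt)
        cost += result.cost
        if result.tests_passed:
            return result, turns, prompts
        state.last_error = result.error  # the failure becomes the next prompt
        state.past_attempts.append(result)
    return None, turns, prompts


result, turns, prompts = loop(State(goal="fix the failing login bug"))
print(turns)  # 2
```

Here, the first prompt has no error in it and the second prompt does, and the loop returns after two turns without anyone typing a prompt.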

§ 12

Here is something that surprises people. Writing the code is cheap. Running the loop that writes it, again and again, is not.

Let's understand why. The model writes a piece of code in seconds, for a tiny cost. But the loop runs that model again and again, turn after turn, sometimes for hours. Every turn costs a little. A loop that runs all night can run thousands of turns. So the real cost is not in producing the code once. It is in all the turns the loop takes to get there.

This is why the stop conditions from Step 5 matter so much. A loop that does not stop is not just a bug. It is a charge that keeps growing while we sleep.

So, our most important job has changed. It is no longer about writing one clever prompt. It is about making sure the loop halts. We must cap the turns, watch the cost, and stop the moment the job is done.
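A small piece of arithmetic shows the effect of a cap. All the numbers below are hypothetical: 2 cents per turn, 3000 turns in a night, and a 5 dollar cap are chosen only to illustrate the point, and amounts are kept in whole cents to avoid floating-point rounding.

```python
def uncapped_cost_cents(cents_per_turn: int, turns: int) -> int:
    """Total spend when nothing stops the loop."""
    return cents_per_turn * turns


def turns_before_cap(cents_per_turn: int, cap_cents: int) -> int:
    """Turns that run under the same `cost < cap` rule the loop uses."""
    turns, spent = 0, 0
    while spent < cap_cents:
        turns += 1
        spent += cents_per_turn
    return turns


overnight = uncapped_cost_cents(cents_per_turn=2, turns=3000)   # 6000 cents
capped_turns = turns_before_cap(cents_per_turn=2, cap_cents=500)  # 250 turns
```

Here, the uncapped run costs 60 dollars for the night, while the capped loop stops after 250 turns at 5 dollars. Each turn is cheap, and the total is decided by how many turns we allow.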

§ 13

Now, let's get to the part that matters most in the long run. The loop itself is just the wiring. The real value is in the skills it calls.

What is a skill here? A skill is a small, reusable tool that does one job well. Instead of asking the model to work out the same thing from scratch every turn, we turn that repeated work into a named tool the loop can call directly.

Here is the rule we must follow. When we find ourselves doing the same step again and again, we pull it out and make it a skill. When we crack a hard problem, we save that solution as a skill too. After that, the loop gets it for almost no cost on every later run.

A loop that has no skills inside it asks the model to solve the same problems all over again on every turn. A loop that calls a set of sharp, tested skills gets stronger every time we add one. This is what makes a loop grow more valuable over time instead of just burning money.
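One simple way to picture a skill is a named function kept in a registry that the loop can call directly. This is a sketch, not how any particular agent framework does it: `SKILLS`, the `skill` decorator, `call_skill`, and the `summarize_failure` example are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a skill name to a plain function.
SKILLS: Dict[str, Callable[..., str]] = {}


def skill(name: str):
    """Decorator that registers a function as a named, reusable skill."""
    def register(func: Callable[..., str]) -> Callable[..., str]:
        SKILLS[name] = func
        return func
    return register


@skill("summarize_failure")
def summarize_failure(log: str, max_lines: int = 3) -> str:
    """Keep only the last few non-empty log lines, where the error often is."""
    lines = [line for line in log.splitlines() if line.strip()]
    return "\n".join(lines[-max_lines:])


def call_skill(name: str, *args, **kwargs) -> str:
    """Look a skill up by name so the loop can call it without asking the model."""
    if name not in SKILLS:
        raise KeyError(f"unknown skill: {name}")
    return SKILLS[name](*args, **kwargs)


log = "collecting tests\n\nrunning test_login\nAssertionError: empty password accepted\n1 failed"
short = call_skill("summarize_failure", log, max_lines=2)
```

Here, trimming the log is done by ordinary code, so the loop does not spend a model call on it each turn. Adding another skill is just one more decorated function.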

§ 14

We often make mistakes while designing the loop. Let's learn the common ones so we can avoid them, as below:

  • No "done" check. Without a check in code, the loop never knows when to stop. Always write Step 1 first.
  • Hand-feeding the prompt. If we type each prompt by hand, it is not a loop, it is just us doing the work. Build the prompt from state.
  • Throwing away the output. The failure is the next prompt. If we do not capture it, we have nothing to feed back.
  • No stop conditions. A loop with no way out keeps running and keeps charging us. Always cap the retries and watch the cost.
  • Forcing a loop on a one-off task. A loop only pays off when the work repeats and can be checked. For a one-time job, a plain prompt is better.
  • No skills inside the loop. If the loop works out the same thing from scratch every turn, it wastes time and tokens. Turn the repeated work into reusable skills.

So, we stop playing the game move by move. We design the loop once, give it a way to check itself and a way to stop, and then we let it run.

That's it for now.

Thanks

Amit Shekhar

Founder @Outcome School
