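15 AI Agent Design Patterns: From Single Agent to Event-Driven, a Selection Guide for Production Systems
This article surveys 15 AI agent design patterns: single agent, sequential and parallel multi-agent, loop, review and critique, coordinator, hierarchical decomposition, ReAct, human-in-the-loop, event-driven, and more. The core argument is that the right pattern depends on the shape of the uncertainty in the problem, not on what is popular. Each pattern comes with a use case, a real example, and a common failure point. The article closes with 10 production rules that stress minimizing complexity, strictly capping tool calls, and keeping irreversible actions behind deterministic checks or human approval. It is written for engineers who are building or planning to deploy agent systems.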
Every team building AI agents hits the same wall.
You start with one prompt and a few tools.
It works.
Then requirements grow. More edge cases. More teams. More risk.
Suddenly your "agent" is a 3,000-word system prompt trying to do five jobs at once.
The fix isn't more prompt engineering.
It's picking the right pattern.
Before you pick a pattern
Not every task needs an agent.
A task justifies an agent when:
→ A single model call can't produce a reliable result
→ The model must choose between tools or data sources at runtime
→ The task needs planning, validation, or iterative refinement
→ The workflow has real uncertainty that can't be hardcoded
A task usually does NOT need an agent when the input-to-output path is predictable.
Summarization. Classification. Simple extraction. Templated generation.
These are faster, cheaper, and more reliable as direct model calls.
Wrapping them in an agent just adds latency and failure points for zero benefit.
PATTERN 1 — Single Agent
The simplest and most common starting point.
One model. One system prompt. A bounded set of tools.
The model decides which tool to call, observes the result, and keeps going until it has enough to answer.
Real example: A customer support agent that looks up order status, checks shipping, and creates a ticket if it can't resolve the issue — all with 2-3 tools and one clear job.
Use it when: the task is well-defined, the tool set is small, and one agent can hold the full context without getting confused.
It breaks when: you keep adding tools and the system prompt grows past a page. That's the signal you need a different pattern — not a longer prompt.
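A minimal sketch of the single-agent loop described above, assuming a stand-in `fake_model` function in place of a real model call. The tool names, the order ID, and the step cap are hypothetical and only illustrate the shape of the loop.

```python
from typing import Callable

# Hypothetical tools; real ones would call order and ticketing systems.
def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"

def create_ticket(summary: str) -> str:
    return f"ticket created: {summary}"

TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_order": lookup_order,
    "create_ticket": create_ticket,
}

def fake_model(history: list[str]) -> tuple[str, str]:
    """Stand-in for a model call: returns (tool_name, argument) or ("answer", text)."""
    if not any(entry.startswith("order") for entry in history):
        return ("lookup_order", "A123")
    return ("answer", f"Your order status: {history[-1]}")

def run_single_agent(question: str, max_steps: int = 5) -> str:
    history = [question]
    for _ in range(max_steps):
        action, arg = fake_model(history)
        if action == "answer":
            return arg
        history.append(TOOLS[action](arg))  # observe the tool result
    return "escalate: step budget exhausted"
```

The loop stays readable only while `TOOLS` stays small, which is the same signal the text gives for moving to another pattern.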
PATTERN 2 — Multi-Agent Sequential
Specialized agents run in a fixed order. Each one's output feeds the next one's input.
Real example: A contract review pipeline — one agent extracts obligations, the next identifies risks, a third drafts the summary for procurement. The sequence never changes.
Use it when: the workflow has clear, repeatable stages and each stage produces exactly what the next one needs.
It breaks when: the order actually needs to vary based on what's found mid-process. Sequential pipelines assume the path is fixed — if it isn't, you need something more dynamic.
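One way to picture the fixed sequence is a list of stage functions folded over a shared document. This is a sketch only: the three stages below are plain functions standing in for specialized agents, and the contract content is made up.

```python
from functools import reduce
from typing import Callable

Stage = Callable[[dict], dict]

def extract_obligations(doc: dict) -> dict:
    # Stand-in for the first agent.
    return {**doc, "obligations": ["pay within 30 days"]}

def identify_risks(doc: dict) -> dict:
    # Consumes exactly what the previous stage produced.
    risks = [f"late-payment penalty on: {o}" for o in doc["obligations"]]
    return {**doc, "risks": risks}

def draft_summary(doc: dict) -> dict:
    summary = f"{len(doc['obligations'])} obligation(s), {len(doc['risks'])} risk(s)"
    return {**doc, "summary": summary}

# The order is data, not a runtime decision.
PIPELINE: list[Stage] = [extract_obligations, identify_risks, draft_summary]

def run_pipeline(doc: dict) -> dict:
    return reduce(lambda acc, stage: stage(acc), PIPELINE, doc)
```

If a stage ever needs to skip or reorder its successors based on what it found, that logic has nowhere to live in this structure, which is the failure mode described above.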
PATTERN 3 — Multi-Agent Parallel
Independent subtasks run simultaneously, then get combined into one view.
Real example: A 2am production incident. Three agents investigate logs, metrics, and recent deployments at the same time — not one after another — because every minute matters during an outage.
Use it when: the subtasks are genuinely independent and speed matters.
It breaks when: tasks actually depend on each other's results. Forcing dependent work into parallel execution just creates race conditions and incomplete context.
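A rough sketch of the fan-out and merge using `asyncio.gather`. The three coroutines are stand-ins for investigating agents, and their findings are invented for illustration.

```python
import asyncio

async def check_logs() -> str:
    await asyncio.sleep(0.01)  # simulates I/O-bound investigation
    return "logs: error spike after latest release"

async def check_metrics() -> str:
    await asyncio.sleep(0.01)
    return "metrics: p99 latency tripled"

async def check_deploys() -> str:
    await asyncio.sleep(0.01)
    return "deploys: one release in the last hour"

async def investigate() -> list[str]:
    # return_exceptions=True keeps one failed branch from discarding the others.
    results = await asyncio.gather(
        check_logs(), check_metrics(), check_deploys(), return_exceptions=True
    )
    return [r if isinstance(r, str) else f"failed: {r!r}" for r in results]
```

Calling `asyncio.run(investigate())` returns the three findings in the order the coroutines were passed in. None of the branches reads another's result, which is what makes the parallelism safe here.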
PATTERN 4 — Loop
Repeat a sequence of steps until an exit condition is met.
Real example: A data cleaning agent that profiles messy CSV data, proposes a cleaning plan, checks if it passes quality standards, and retries if it doesn't — up to a capped number of rounds.
Use it when: the task needs multiple attempts and you can define a clear, checkable stopping condition.
It breaks when: there's no reliable exit condition. Without one, you get runaway costs and a system that might never terminate.
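The capped retry loop might look like the sketch below. `clean_once` and `passes_quality` are hypothetical stand-ins for the cleaning agent and the quality check; the point is the explicit exit condition plus a hard round limit.

```python
def clean_once(rows: list[str]) -> list[str]:
    """Hypothetical single pass: fixes one class of problem per call."""
    stripped = [r.strip() for r in rows]
    if stripped != rows:
        return stripped
    return [r for r in rows if r]  # drop empty rows

def passes_quality(rows: list[str]) -> bool:
    # Checkable exit condition: no empty rows, no stray whitespace.
    return all(r and r == r.strip() for r in rows)

def clean_until_ok(rows: list[str], max_rounds: int = 3) -> tuple[list[str], bool]:
    for _ in range(max_rounds):
        if passes_quality(rows):
            return rows, True
        rows = clean_once(rows)
    return rows, passes_quality(rows)  # cap reached: report honestly
```

Returning the pass/fail flag alongside the data lets the caller decide what to do when the cap is hit, rather than silently accepting a half-cleaned result.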
PATTERN 5 — Review and Critique
A judge agent reviews another agent's output, critiques it, and gives specific actionable feedback.
Real example: A generated report gets reviewed by a separate "critic" agent that flags weak claims, missing evidence, or unclear sections before it ever reaches a human.
Use it when: quality matters more than speed and you want a second opinion baked into the system, not bolted on after.
It breaks when: the critic agent shares the generator's blind spots. A reviewer trained on similar assumptions tends to miss the same mistakes the generator made.
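As an illustration of "specific, actionable feedback", the sketch below has a critic return structured critiques tied to a section. The rule-based `critic` is only a stand-in for a judge model, and the `[source` citation marker is an assumed convention.

```python
from dataclasses import dataclass

@dataclass
class Critique:
    section: str
    issue: str

def critic(report: dict[str, str]) -> list[Critique]:
    """Stand-in for a judge model: flags sections with no cited evidence."""
    return [
        Critique(name, "claim has no supporting evidence")
        for name, text in report.items()
        if "[source" not in text
    ]

def review(report: dict[str, str]) -> dict:
    feedback = critic(report)
    # The report only moves on to a human when the critic has nothing to flag.
    return {"approved": not feedback, "feedback": feedback}
```

Because each critique names a section and an issue, the generator (or a person) can act on it directly instead of receiving a vague "needs improvement".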
PATTERN 6 — Iterative Refinement
A feedback loop with a quality score threshold. The generator keeps refining until it crosses the bar.
Real example: A marketing copy generator that scores its own draft against brand guidelines, and keeps rewriting until it hits a minimum quality score — not just one pass-fail check, but graded improvement.
Use it when: output quality is genuinely variable and "good enough" has a measurable threshold.
It breaks when: the scoring function is vague or gameable. If the model can inflate its own score without real improvement, the loop just burns tokens.
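A small sketch of a threshold-driven refinement loop. To sidestep the gameable-score problem, this version assumes the score comes from deterministic code rather than from the generator itself; the banned-word list, `score`, and `rewrite` are all hypothetical.

```python
BANNED = ["synergy", "world-class"]  # hypothetical brand-guideline rule

def score(draft: str) -> float:
    """Deterministic score in [0, 1]: fraction of banned words avoided."""
    hits = sum(word in draft for word in BANNED)
    return 1 - hits / len(BANNED)

def rewrite(draft: str) -> str:
    """Stand-in for a model rewrite: removes one banned word per call."""
    for word in BANNED:
        if word in draft:
            return draft.replace(word, "").replace("  ", " ").strip()
    return draft

def refine(draft: str, threshold: float = 0.9, max_rounds: int = 5) -> tuple[str, float]:
    current = score(draft)
    for _ in range(max_rounds):
        if current >= threshold:
            break
        candidate = rewrite(draft)
        new_score = score(candidate)
        if new_score <= current:  # no measurable improvement: stop burning tokens
            break
        draft, current = candidate, new_score
    return draft, current
```

The early exit on a non-improving score is a design choice: it bounds cost even when the round cap is generous.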
PATTERN 7 — Coordinator
A central routing agent directs requests to specialized agents based on what's actually being asked.
Real example: Support tickets get routed to billing, technical, account, shipping, or fraud specialists — each with narrow context instead of one agent trying to know everything.
Use it when: you have genuinely different request types that need different context, tools, or decision logic.
It breaks when: the routing itself becomes ambiguous. If requests don't cleanly fall into one category, the coordinator becomes a new bottleneck and source of misrouting.
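A sketch of a coordinator with an explicit fallback for ambiguous requests. Keyword matching stands in for a routing model here, and the categories, keywords, and `human_triage` fallback are assumptions for illustration.

```python
ROUTES: dict[str, list[str]] = {
    "billing": ["invoice", "charge"],
    "shipping": ["delivery", "tracking"],
    "technical": ["error", "crash"],
}

def route(ticket: str) -> str:
    text = ticket.lower()
    matches = [
        name for name, keywords in ROUTES.items()
        if any(k in text for k in keywords)
    ]
    # Zero or multiple matches means the request is ambiguous: do not guess.
    return matches[0] if len(matches) == 1 else "human_triage"
```

For example, `route("I was charged twice")` goes to `billing`, while a ticket that mentions both tracking and an error falls through to `human_triage` instead of being misrouted.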
PATTERN 8 — Hierarchical Task Decomposition
A root agent breaks a complex goal into smaller subgoals, delegates them to specialist workers, then synthesizes everything into one answer.
Real example: "Which 3 countries should we expand into next year?" gets broken into competitive analysis, regulatory research, logistics feasibility, and market sizing — each handled by a different specialist, then combined.
Use it when: the problem is too broad for one reasoning pass but breaks cleanly into independent areas of expertise.
It breaks when: the subgoals aren't actually independent. If workstreams need to inform each other in real time, decomposing them upfront loses that interaction.
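The decompose, delegate, and synthesize steps can be sketched as follows. The worker functions stand in for specialist agents, and every score and country code below is made up purely to make the example runnable.

```python
from typing import Callable

# Made-up scores for illustration only; real workers would be specialist agents.
def market_sizing(country: str) -> int:
    return {"DE": 8, "BR": 6, "IN": 9, "JP": 7}[country]

def regulatory(country: str) -> int:
    return {"DE": 7, "BR": 4, "IN": 5, "JP": 8}[country]

def logistics(country: str) -> int:
    return {"DE": 9, "BR": 5, "IN": 6, "JP": 8}[country]

SUBGOALS: dict[str, Callable[[str], int]] = {
    "market_sizing": market_sizing,
    "regulatory": regulatory,
    "logistics": logistics,
}

def root_agent(candidates: list[str], top_n: int = 3) -> list[str]:
    # Delegate: each worker evaluates every candidate independently.
    reports = {
        name: {c: worker(c) for c in candidates}
        for name, worker in SUBGOALS.items()
    }
    # Synthesize: combine the independent reports into one ranking.
    totals = {c: sum(report[c] for report in reports.values()) for c in candidates}
    return sorted(candidates, key=lambda c: totals[c], reverse=True)[:top_n]
```

Note that no worker sees another worker's report. That independence is what the pattern relies on, and it is exactly what is lost when workstreams need to inform each other.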
PATTERN 9 — Swarm
Multiple specialist agents contribute to a shared discussion, challenge each other's assumptions, and a facilitator synthesizes a final recommendation.
Real example: Should the company launch a subscription tier? Research, engineering, finance, and support agents each argue their perspective across multiple rounds before a facilitator weighs the trade-offs.
Use it when: there's no single "correct" answer — you need a well-reasoned decision shaped by genuinely competing viewpoints.
It breaks when: you need a fast, deterministic answer. Swarms are deliberately slow and exploratory — wrong tool if you need speed.
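A toy sketch of a multi-round discussion over a shared transcript. Each agent is a plain function standing in for a model, the arguments are invented, and the facilitator is reduced to a majority tally, which is a simplifying assumption rather than how a real facilitator would weigh trade-offs.

```python
Message = tuple[str, str, str]  # (agent, stance, argument)

def research(transcript: list[Message]) -> tuple[str, str]:
    return ("for", "survey shows demand for a subscription")

def engineering(transcript: list[Message]) -> tuple[str, str]:
    return ("for", "billing can reuse the existing payments service")

def finance(transcript: list[Message]) -> tuple[str, str]:
    # Changes stance once engineering has addressed build cost.
    if any(agent == "engineering" for agent, _, _ in transcript):
        return ("for", "build cost is low, so payback is acceptable")
    return ("against", "build cost is unknown")

def support(transcript: list[Message]) -> tuple[str, str]:
    return ("against", "a new tier adds ticket volume")

AGENTS = {"research": research, "finance": finance,
          "engineering": engineering, "support": support}

def deliberate(rounds: int = 2) -> tuple[str, list[Message]]:
    transcript: list[Message] = []
    for _ in range(rounds):
        for name, agent in AGENTS.items():
            stance, argument = agent(transcript)
            transcript.append((name, stance, argument))
    # Facilitator stand-in: each agent's latest stance counts once.
    final = {name: stance for name, stance, _ in transcript}
    votes_for = sum(stance == "for" for stance in final.values())
    decision = "launch" if votes_for > len(final) / 2 else "hold"
    return decision, transcript
```

With one round the outcome is `hold`; with two rounds finance has seen engineering's argument and the outcome flips to `launch`. The extra rounds are where the value comes from, and also where the latency comes from.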
PATTERN 10 — ReAct (Reason and Act)
The agent alternates between reasoning and action: decide what to investigate, call a tool, observe the result, decide if there's enough evidence yet.
Real example: "The queue processor seems stuck" — the agent searches docs, checks service health, correlates findings, and only then suggests a fix. The investigation path isn't predefined; it depends on what it finds along the way.
Use it when: the path to the answer genuinely can't be planned upfront — it depends on what each step reveals.
It breaks when: investigations run long without converging. Always cap the number of reasoning-action cycles, or you risk infinite exploration.
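A sketch of the reason, act, observe cycle with a hard cap. The `policy` function stands in for the model's reasoning step, and the tool outputs and suggested fix are invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Step:
    thought: str
    action: str
    observation: str

# Hypothetical tools returning canned observations.
TOOLS = {
    "search_docs": lambda: "docs: processor stalls when the dead-letter queue is full",
    "check_health": lambda: "health: dead-letter queue at 100% capacity",
}

def policy(trace: list[Step]) -> tuple[str, str]:
    """Stand-in for the model's reasoning: returns (thought, next_action)."""
    done = {step.action for step in trace}
    if "search_docs" not in done:
        return ("Need known causes first", "search_docs")
    if "check_health" not in done:
        return ("Docs mention the dead-letter queue; verify it", "check_health")
    return ("Evidence matches a documented cause", "finish")

def react(max_cycles: int = 4) -> tuple[str, list[Step]]:
    trace: list[Step] = []
    for _ in range(max_cycles):
        thought, action = policy(trace)
        if action == "finish":
            return "suggest: drain the dead-letter queue", trace
        trace.append(Step(thought, action, TOOLS[action]()))
    return "inconclusive: cycle cap reached", trace
```

The next action depends on the trace so far, so the path is not fixed in advance, but `max_cycles` guarantees the investigation ends either way.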
PATTERN 11 — Human-in-the-Loop
The agent investigates and recommends, but a human makes the final call on anything risky or ambiguous.
Real example: Refund approvals — low-risk, clear-cut cases get automated. High amounts, fraud signals, or policy exceptions pause for human review before anything is finalized.
Use it when: the decision carries real financial, legal, or reputational risk and full automation isn't acceptable yet.
It breaks when: you treat this as just a UI feature instead of an architectural one. You need durable state, reviewer assignment, timeout handling, and escalation paths — not just a "pause" button.
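To make the "architectural, not UI" point concrete, here is a sketch of a review request as explicit state with reviewer assignment, a timeout, and an escalation transition. The auto-approval limit, timeout value, and naive reviewer selection are hypothetical, and a real system would persist this state rather than hold it in memory.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Status(Enum):
    AUTO_APPROVED = "auto_approved"
    PENDING_REVIEW = "pending_review"
    ESCALATED = "escalated"

@dataclass
class RefundRequest:
    amount: float
    fraud_signal: bool
    created_at: float  # seconds since some epoch
    status: Optional[Status] = None
    reviewer: Optional[str] = None

AUTO_LIMIT = 50.0        # hypothetical policy threshold
REVIEW_TIMEOUT_S = 3600  # hypothetical review deadline

def triage(req: RefundRequest, reviewers: list[str]) -> RefundRequest:
    if req.amount <= AUTO_LIMIT and not req.fraud_signal:
        req.status = Status.AUTO_APPROVED
    else:
        req.status = Status.PENDING_REVIEW
        req.reviewer = reviewers[0]  # naive assignment, for illustration
    return req

def check_timeout(req: RefundRequest, now: float) -> RefundRequest:
    # A pending review that nobody picks up must go somewhere.
    if req.status is Status.PENDING_REVIEW and now - req.created_at > REVIEW_TIMEOUT_S:
        req.status = Status.ESCALATED
    return req
```

Every state a request can be in is named, so "paused" never means "forgotten".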
PATTERN 12 — Plan-and-Execute
A planner agent creates a full structured plan upfront — reviewable and modifiable — before any action is taken. An executor then runs through the steps.
Real example: "Resize the worker fleet from 10 to 20 instances, verify the queue drains, update the runbook." The full plan is visible before execution starts, unlike ReAct where the path emerges step by step.
Use it when: you want the plan to be reviewable or approvable before any action happens — important for operations with real consequences.
It breaks when: the environment changes faster than the plan can execute. A stale plan executed blindly is worse than no plan at all.
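A sketch of a plan as reviewable data, with an approval gate and a precondition on each step to catch a stale plan. `make_plan` stands in for a planner agent, and the environment dictionary is a toy model of real infrastructure state.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PlanStep:
    description: str
    precondition: Callable[[dict], bool]
    apply: Callable[[dict], None]

def make_plan() -> list[PlanStep]:
    """Stand-in for a planner agent: returns the full plan before anything runs."""
    return [
        PlanStep("scale fleet from 10 to 20",
                 lambda env: env["instances"] == 10,
                 lambda env: env.update(instances=20)),
        PlanStep("verify queue drains",
                 lambda env: env["instances"] == 20,
                 lambda env: env.update(queue_depth=0)),
        PlanStep("update runbook",
                 lambda env: env["queue_depth"] == 0,
                 lambda env: env.update(runbook="20 instances")),
    ]

def execute(plan: list[PlanStep], env: dict, approved: bool) -> str:
    if not approved:
        return "blocked: plan not approved"
    for step in plan:
        # Re-check the world before each step instead of trusting the plan blindly.
        if not step.precondition(env):
            return f"aborted: plan is stale at '{step.description}'"
        step.apply(env)
    return "done"
```

The plan can be printed and reviewed via the `description` fields before `execute` is ever called, and the preconditions turn a stale plan into an abort instead of a blind run.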
PATTERN 13 — Reflexion
The agent evaluates its own failures, reflects on what went wrong, and carries that memory into the next attempt.
Real example: A code generation agent writes a script that fails at runtime. The agent analyzes the actual error, records what to fix, and retries, so each attempt builds on the last instead of repeating the same mistake.
Use it when: failures are informative and self-correction genuinely improves the next attempt.
It breaks when: the failure modes are random or unrelated to each other. Reflexion only helps when there's a real pattern to learn from.
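A compact sketch of carrying failure memory into the next attempt. `generate` stands in for a code model that consults recorded lessons; the task, the inputs, and the fix are contrived. The sketch uses `exec` for brevity, whereas a real system would run generated code in a sandbox.

```python
from typing import Optional

def generate(task: str, lessons: list[str]) -> str:
    """Stand-in for a code model: picks a fix based on recorded lessons."""
    if any("ZeroDivisionError" in lesson for lesson in lessons):
        return "result = total / count if count else 0"
    return "result = total / count"

def attempt(code: str) -> tuple[Optional[float], Optional[str]]:
    scope = {"total": 10, "count": 0}  # made-up inputs that trigger the failure
    try:
        exec(code, {}, scope)  # illustration only; sandbox this in practice
        return scope["result"], None
    except Exception as exc:
        return None, f"{type(exc).__name__}: {exc}"

def reflexion(task: str, max_attempts: int = 3):
    lessons: list[str] = []
    for _ in range(max_attempts):
        result, error = attempt(generate(task, lessons))
        if error is None:
            return result, lessons
        lessons.append(error)  # the memory carried into the next attempt
    return None, lessons
```

The second attempt succeeds only because the first error was recorded and fed back. If the errors were random, `lessons` would grow without making the next attempt any better.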
PATTERN 14 — Custom Logic
A hybrid: deterministic code handles the rules that must never be wrong, while the model handles judgment, drafting, and exception handling.
Real example: A refund workflow where purchase verification and fraud checks run as hard deterministic rules — never delegated to the model — while drafting the customer response and routing recommendations stay agentic.
Use it when: the workflow has real branching logic with legal or financial consequences, and you need to be precise about what's deterministic versus what's flexible.
It breaks when: teams blur the line and let the model make decisions that should be hardcoded rules. Eligibility, permissions, and money movement should never be the model's call alone.
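A sketch of where the line between deterministic and agentic code might sit. The eligibility rule, the 30-day window, and the field names are hypothetical; `draft_reply` stands in for a model call that only writes text.

```python
REFUND_WINDOW_DAYS = 30  # hypothetical policy value

def is_eligible(order: dict) -> bool:
    # Deterministic gate: plain code, testable, never delegated to a model.
    return bool(
        order["verified_purchase"]
        and not order["fraud_flag"]
        and order["days_since_purchase"] <= REFUND_WINDOW_DAYS
    )

def draft_reply(order: dict, approved: bool) -> str:
    """Stand-in for a model call: it words the decision, it does not make it."""
    outcome = "has been approved" if approved else "could not be approved"
    return f"Hi {order['customer']}, your refund {outcome}."

def handle_refund(order: dict) -> dict:
    approved = is_eligible(order)
    return {"refund_issued": approved, "reply": draft_reply(order, approved)}
```

The model receives the decision as an input. Nothing it generates can change `refund_issued`.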
PATTERN 15 — Event-Driven Agent
The agent doesn't wait to be asked. It subscribes to an event stream and acts the moment a condition is triggered.
Real example: A fraud detection agent that reacts the instant a suspicious transaction event fires — not when a support ticket eventually surfaces it, by which point the damage is done.
Use it when: timing matters more than anything else, and waiting for a human request means missing the window to act.
It breaks when: the triggering conditions are poorly defined. A noisy event stream with vague triggers turns into a system constantly crying wolf — or worse, missing the real signal.
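A minimal in-process sketch of subscribing to an event stream with a narrowly defined trigger. The `EventBus` class, the amount threshold, and the event fields are all assumptions; a production system would sit on a real message broker.

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Tiny in-memory publish/subscribe, for illustration only."""
    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._handlers[event_type]:
            handler(payload)

held: list[str] = []  # transactions the agent put on hold

def fraud_agent(tx: dict) -> None:
    # Narrow, explicit trigger: large amount AND unusual country (hypothetical rule).
    if tx["amount"] > 1000 and tx["country"] != tx["home_country"]:
        held.append(tx["id"])

bus = EventBus()
bus.subscribe("transaction", fraud_agent)
bus.publish("transaction", {"id": "t1", "amount": 40, "country": "US", "home_country": "US"})
bus.publish("transaction", {"id": "t2", "amount": 5000, "country": "BR", "home_country": "US"})
bus.publish("login", {"id": "l1"})  # no subscriber for this type: ignored
```

After these three events, only `t2` is held. Loosening the condition to "any large amount" or "any foreign country" is how a stream starts crying wolf.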
Pattern selection — match the uncertainty, not the hype
The right pattern matches the shape of the uncertainty in your work:
→ Uncertain which tool to use: Single Agent or ReAct
→ Uncertain where to route: Coordinator
→ Uncertain about quality: Review & Critique or Iterative Refinement
→ Uncertain execution path: Plan-and-Execute or ReAct
→ Uncertain how to self-correct: Reflexion or Loop
→ Uncertain about business risk: Human-in-the-Loop or Custom Logic
→ Uncertain problem structure: Hierarchical Decomposition or Swarm
→ Can't wait for a request: Event-Driven Agent
A swarm is not more advanced than a single agent if the task only needs one reliable tool call.
Plan-and-Execute is not an upgrade from ReAct if your plan goes stale by step three.
The most reliable production systems are not the most autonomous ones.
They put autonomy exactly where it creates value — and constrain it everywhere else.
10 rules for production agentic systems
- Start with the smallest pattern that works. A single agent with clean tool contracts beats a multi-agent system with weak ones.
- Write tool descriptions like contracts. The model only knows what the tool does from the description — not from your intent.
- Cap iterations, tool calls, and spend per request. An agent without budget limits is a liability waiting to show up in a bill.
- Log the full action trace. Tool calls, arguments, outputs, final decision. Without this, incident investigation is guesswork.
- Keep irreversible actions behind deterministic checks or human approval. Never let a model be the only gate before a money movement or production change.
- Evaluate with real failure cases, not just happy paths. Happy-path correctness is a prototype. Edge-case correctness is a product.
- Separate prompts by responsibility before the system prompt becomes unreadable. "But don't do X when Y" creeping into your prompt means the agent is doing two jobs.
- Treat multi-agent systems as distributed systems. Partial failure, timeouts, retries, and observability are not optional.
- Model review is not a substitute for testing. Use judges to improve quality. Use tests and permission checks to enforce correctness.
- Prefer the simpler pattern — not because simple is always better, but because the complexity budget you save can be spent on better tools, better prompts, better evaluation.
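Two of the rules above, capping tool calls and spend per request and logging the full action trace, can share one small wrapper. This is a sketch under assumed names: `RequestBudget`, `BudgetExceeded`, the default limits, and the per-call cost figure are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

class BudgetExceeded(Exception):
    pass

@dataclass
class RequestBudget:
    max_tool_calls: int = 5
    max_cost_usd: float = 0.50
    tool_calls: int = 0
    cost_usd: float = 0.0
    trace: list[dict] = field(default_factory=list)

    def call(self, name: str, fn: Callable[..., Any], args: dict, cost_usd: float) -> Any:
        # Check both caps before doing any work.
        if self.tool_calls >= self.max_tool_calls:
            raise BudgetExceeded("tool-call cap reached")
        if self.cost_usd + cost_usd > self.max_cost_usd:
            raise BudgetExceeded("spend cap reached")
        output = fn(**args)
        self.tool_calls += 1
        self.cost_usd += cost_usd
        # Full trace: tool, arguments, and output for later investigation.
        self.trace.append({"tool": name, "args": args, "output": output})
        return output
```

Routing every tool call through `budget.call(...)` means the cap and the trace cannot drift apart: a call that is not logged is a call that did not happen.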
Most teams don't fail because they picked the wrong pattern.
They fail because they never asked which uncertainty they were actually solving for.
Pick the pattern. Match the shape of the problem. Don't add autonomy where it doesn't earn its place.