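Resolvers: The Routing Table for Intelligent Systems, Not Force-Fed Context Stuffing
Drawing on a detailed retrospective of building his own personal agent system, the author argues that what determines whether such a system keeps improving is not the model or the skills themselves but the often-overlooked resolver. A misfiling case shows how hardcoded paths turn a knowledge base into a junk drawer, and the problem of unreachable skills shows how missing routes create the illusion that a capability exists when it cannot actually be invoked. The core claim: a resolver is a routing table for context that can compress 20,000 lines of bloated instructions into a 200-line decision tree, and patterns such as filing rules, trigger evals, and resolvability checks keep the system from drifting. The author then compares the pattern to the management layer of an organization and open-sources GBrain, a personal agent system that ships with the full set of resolver patterns. Recommended for engineers who maintain multi-skill agent systems over long periods and are running into failing knowledge indexes and degraded model attention.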
In "Thin Harness, Fat Skills", I introduced five definitions for building agent systems that actually work. Skills got all the attention. People bookmarked the skill-as-method-call pattern, the diarization concept, the thin harness architecture. Good. Those matter.
But the one that got almost no attention is the one that matters most. Resolvers. And the reason they got ignored is the same reason they're so important: they're invisible when they work, and catastrophic when they don't.
A resolver is a routing table for context. When task type X appears, load document Y first. That's it. One sentence. But that one sentence is the difference between an agent that compounds intelligence and an agent that slowly forgets what it knows.
This is the story of how I learned that the hard way.
My CLAUDE.md was 20,000 lines.
I'm not proud of this. Every quirk, every pattern, every lesson I'd ever encountered with Claude Code, every convention for my codebase, every edge case I'd been burned by. I kept adding. The file kept growing. It felt productive. It felt like I was making the model smarter.
I wasn't. I was drowning it.
The model's attention degraded. Responses got slower and less precise. Claude Code literally told me to cut it back. That's when you know you've gone too far — the AI is telling you to stop talking.
The instinct is natural. You want the model to know everything. So you cram everything into the system prompt, the instructions file, the context window. You're trying to make the model omniscient by proximity. It doesn't work. You can't make someone smarter by shouting louder. You make them smarter by giving them the right book at the right moment.
The fix was about 200 lines. A numbered decision tree. Pointers to documents. When the model needs to file something, it walks the tree:
- Is it a person? → /people/ directory
- A company? → /companies/ directory
- A policy analysis? → /civic/ directory
Twenty thousand lines of knowledge, accessible on demand, without polluting the context window.
That 200-line file is the resolver. It replaced 20,000 lines of instructions. And the system immediately got better — faster responses, more accurate filing, fewer hallucinations. Not because the model got smarter. Because I stopped blinding it with noise.
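To make the decision tree concrete, here is a minimal sketch of a filing resolver in Python. It is not the actual RESOLVER.md, which is a markdown document the model reads; the `Rule` class, the `content_type` labels, and the `resolve_directory` function are hypothetical names introduced only for illustration. The three routes are the ones listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    question: str      # the question asked at this step of the tree
    content_type: str  # hypothetical label for a "yes" answer
    directory: str     # where a "yes" sends the page


# Ordered like the numbered tree: the first matching rule wins.
FILING_RULES = [
    Rule("Is it a person?", "person", "/people/"),
    Rule("A company?", "company", "/companies/"),
    Rule("A policy analysis?", "policy_analysis", "/civic/"),
]


def resolve_directory(content_type: str) -> Optional[str]:
    """Walk the tree top to bottom and return the first matching directory."""
    for rule in FILING_RULES:
        if rule.content_type == content_type:
            return rule.directory
    # No match: return None so the caller has to ask, instead of
    # silently falling back to a default folder.
    return None
```

The point of returning `None` for an unknown type is that there is no catch-all directory to drift into. The caller has to handle the gap explicitly.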
I asked my agent to ingest Will Manidis's essay "No New Deal for OpenAI" — a devastating policy analysis of OpenAI's industrial policy brief. It's the kind of piece that breaks down a company's regulatory strategy, maps the political implications, names the institutional actors. Sharp civic analysis.
The agent filed it in sources/.
Wrong. sources/ is for raw data dumps and bulk imports. CSV files. API exports. Scraped datasets. This was political analysis — it belongs in civic/, where policy pieces, political actors, and institutional dynamics live.
Why did it happen? The idea-ingest skill had hardcoded brain/sources/ as the default directory. It didn't consult the resolver. It had its own half-assed filing logic baked into the skill itself. When no explicit path was given, it fell back to sources/ the way a lazy intern throws everything in the "misc" folder.
One misfiled article. I could have fixed it and moved on. Instead I pulled the thread.
The audit
When I caught the Manidis misfiling, I audited every skill that writes to the brain. I have 13 of them. Skills for ingesting articles, PDFs, meeting transcripts, videos, investor updates, voice notes, tweets. Each one writes pages to the brain repo.
Only 3 out of 13 referenced the resolver.
The other 10 had hardcoded paths. Idea-ingest defaulted to sources/. PDF-ingest defaulted to originals/. Meeting-ingest wrote to meetings/. Each skill had internalized its own filing assumptions. Each one was a potential misfiling waiting to happen.
This is the pattern that kills agent systems. Not a dramatic failure. Not a hallucination that produces nonsense. A slow, silent drift where information goes to the wrong place, connections don't form, and the knowledge base gradually becomes a junk drawer with 14,700 files in it instead of a structured intelligence layer.
The fix wasn't fixing 10 skills individually. That's whack-a-mole. You fix one, another drifts. The fix was a shared filing rules document — _brain-filing-rules.md — and a mandate that every brain-writing skill reads RESOLVER.md before creating any page. One rule. Ten skills fixed.
The filing rules doc also catalogs common misfiling patterns. Sources vs. originals. People vs. companies (when someone IS a company). Civic vs. sources (the Manidis case). Every mistake, documented, so the same mistake can't happen a different way.
Zero misfilings since. Every new skill that writes to the brain now has a two-line mandate at the top: Before creating any new brain page, read brain/RESOLVER.md and skills/_brain-filing-rules.md. File by primary subject, not by source format or skill name.
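An audit like the one described above can be approximated with a few lines of Python. This is a sketch under assumptions: it treats each skill as a plain string of markdown and simply checks whether both mandated file paths appear in it. The function name `skills_missing_mandate` and the sample skill texts are hypothetical; only the two file paths come from the mandate itself.

```python
from typing import Dict, List

# The two documents every brain-writing skill is required to read.
REQUIRED_REFERENCES = ("brain/RESOLVER.md", "skills/_brain-filing-rules.md")


def skills_missing_mandate(skills: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each skill name to the required references its text lacks.

    Skills that mention every required reference are omitted.
    """
    missing: Dict[str, List[str]] = {}
    for name, text in sorted(skills.items()):
        absent = [ref for ref in REQUIRED_REFERENCES if ref not in text]
        if absent:
            missing[name] = absent
    return missing


# Hypothetical skill texts, for illustration only.
example_skills = {
    "idea-ingest": "Default directory: brain/sources/",
    "meeting-ingest": (
        "Before creating any new brain page, read brain/RESOLVER.md "
        "and skills/_brain-filing-rules.md."
    ),
}
audit_result = skills_missing_mandate(example_skills)
```

A string match only proves the mandate text is present, not that the skill obeys it, so this complements routing tests rather than replacing them.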
The example above is about where files go in your memory repo, but the same principle applies to skill files (fat skills) and to the code they call (fat code).
A resolver routes tasks to skills. But what happens when a skill exists and the resolver doesn't know about it?
For my OpenClaw, we built a signature-tracking system inside the executive assistant skill. It worked perfectly. Tracked DocuSign deadlines, surfaced unsigned documents, drafted reminders. Beautiful piece of engineering. Completely invisible.
When someone asked "check my signatures" or "what do I need to sign," the system shrugged. The resolver didn't have a trigger for signatures. The skill existed. The capability existed. The system couldn't reach it. It's like having a surgeon on staff but not listing them in the hospital directory.
This is worse than not having the skill at all. A missing skill is honest — the system says "I can't do that" and you know to build it. A skill that exists but isn't reachable creates the illusion of capability. You think the system handles signatures. It doesn't. And you don't find out until the moment it matters.
After a month of building, we had 40+ skills. Some created in response to specific incidents, others spawned by sub-agents running crons. Nobody was maintaining the resolver table. Skills were being born but not registered. The system had capabilities it didn't know it had.
So I built resolver trigger evals. A test suite of 50 sample inputs with expected outputs:
Input: "check my signatures" Expected: executive-assistant (signature section)
Input: "who is Pedro Franceschi" Expected: brain-ops → gbrain search
Input: "save this article to brain" Expected: idea-ingest + RESOLVER.md
Two failure modes. False negative: skill should fire but doesn't, because the trigger description is wrong or missing. False positive: wrong skill fires, because two triggers overlap. Both fixable by editing markdown. No code changes. The resolver is a document, and documents are cheap to fix.
I told my Claw: "Make sure the resolver is tested and also there are proper eval LLM tests for all the prompts and skills that use the resolver." This isn't optional. If you can't prove the right skill fires for the right input, you don't have a system. You have a collection of skills and a prayer.
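The shape of a trigger eval suite can be sketched as follows. In the real system the router is the model reading the resolver, so `toy_route` and `TOY_TRIGGERS` here are hypothetical stand-ins that match on keywords; the eval cases are the three from above, reduced to the skill name. The harness classifies "nothing fired" as a false negative and "a different skill fired" as a false positive, following the two failure modes just described.

```python
from typing import Callable, List, Optional, Tuple

# (input, expected skill) pairs, taken from the examples above.
EVAL_CASES = [
    ("check my signatures", "executive-assistant"),
    ("who is Pedro Franceschi", "brain-ops"),
    ("save this article to brain", "idea-ingest"),
]


def run_trigger_evals(
    route: Callable[[str], Optional[str]],
    cases: List[Tuple[str, str]],
):
    """Return (false_negatives, false_positives) for a routing function."""
    false_negatives, false_positives = [], []
    for text, expected in cases:
        actual = route(text)
        if actual is None:
            # The skill should have fired, but nothing did.
            false_negatives.append((text, expected))
        elif actual != expected:
            # Some other skill fired instead.
            false_positives.append((text, expected, actual))
    return false_negatives, false_positives


# Hypothetical stand-in router with no trigger for signatures.
TOY_TRIGGERS = {
    "who is": "brain-ops",
    "save this article": "idea-ingest",
}


def toy_route(text: str) -> Optional[str]:
    lowered = text.lower()
    for phrase, skill in TOY_TRIGGERS.items():
        if phrase in lowered:
            return skill
    return None


false_negatives, false_positives = run_trigger_evals(toy_route, EVAL_CASES)
```

With this toy table the signature case shows up as the only false negative, which mirrors the invisible-skill failure. Adding a signature trigger to the table clears it without touching the harness.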
The trigger evals catch routing failures. But there's a deeper problem: skills that exist but have no path from the resolver at all. Not a wrong path — no path.
I was debugging a skill that should have fired and didn't. The usual drill: check the trigger description, check the resolver table, trace the chain. And I realized there was no systematic way to verify that a skill was reachable. You could check one skill at a time. You couldn't check all of them.
So I invented check-resolvable. A meta-skill that walks the entire chain — AGENTS.md → skill file → code — and finds dead links.
I told my agent: "Check if there is a direct line between the agents.md resolver all the way to this running. And then remember this as a 'check-resolvable' skill. The skill should actually check if this skill or codepath is either directly called out in the resolver or callable via something in the resolver. And if it isn't, figure out what resolvable skill should call it."
First run found 6 unreachable skills. Six capabilities the system had built but couldn't access. A flight tracker that nobody could invoke by asking about flights. A content-ideas generator that only ran on cron but couldn't be triggered manually. A citation fixer that existed in the skills directory but wasn't listed in the resolver at all.
Six. Out of 40+. Roughly fifteen percent of the system's capabilities were dark.
Fixed in an hour. Just added triggers to AGENTS.md. Now check-resolvable runs weekly. It's the resolver equivalent of a linter — it tells you what's broken before a user discovers it the hard way.
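At its core, a reachability check of this kind is a graph traversal. The sketch below is not the check-resolvable skill itself, which is a markdown skill run by the agent; it assumes the references between AGENTS.md, skill files, and code have already been extracted into a dictionary. The function name `find_unreachable` and the example edges are hypothetical.

```python
from collections import deque
from typing import Dict, List


def find_unreachable(
    references: Dict[str, List[str]],
    root: str,
    skills: List[str],
) -> List[str]:
    """Breadth-first walk from the resolver; return skills never visited."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for target in references.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return [skill for skill in skills if skill not in seen]


# Hypothetical reference graph: which document points at which.
example_references = {
    "AGENTS.md": ["executive-assistant", "brain-ops"],
    "executive-assistant": ["signature-tracker"],
}
example_skills = [
    "executive-assistant",
    "brain-ops",
    "signature-tracker",
    "flight-tracker",
    "citation-fixer",
]
dark_skills = find_unreachable(example_references, "AGENTS.md", example_skills)
```

Here `signature-tracker` counts as reachable because a resolvable skill references it, which matches the "callable via something in the resolver" condition. The two skills with no inbound reference are the ones reported.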
Here's the thing nobody tells you about resolvers: they decay.
Day 1, the routing table is perfect. Every skill is registered. Every trigger is accurate. Every path resolves. You feel like a genius.
Day 30, three new skills exist that nobody added to the resolver. They were built in response to real needs, by sub-agents running at 3 AM, and nobody updated the table.
Day 60, two trigger descriptions don't match how users actually phrase things. The skill handles "track this flight" but users say "is my flight delayed?" The description says one thing. The user says another. The skill doesn't fire.
Day 90, the resolver is a historical document. An artifact of what the system used to be able to do. Not what it can do now.
I noticed the system was drifting. Skills were being invoked by direct instruction — "read skills/flight-tracker/SKILL.md" — instead of through the resolver, because the resolver didn't have the right triggers. The system worked because I knew which skill to call. That's not a system. That's a person with a filing cabinet.
Yesterday, in office hours with a YC company, a CTO asked me: "Could an RLM be used to solve context rot particularly around resolvers?"
The idea: a reinforcement learning loop where the system observes every task dispatch. Which skill fired. Which didn't. Which tasks had no match. Which tasks matched the wrong skill. And periodically — maybe nightly, maybe weekly — it rewrites the resolver based on observed evidence. Not a human maintaining a table. The table maintaining itself.
Eight hundred task dispatches over a month. The system sees that "is my flight on time" never triggers flight-tracker but "check my flight" does. It rewrites the trigger description. The system sees that pdf-ingest fires for investor update emails, but investor-update-ingest should have caught them first. It adjusts priority.
This is forward-looking. We haven't fully built it. Claude Code's AutoDream system — memory consolidation during idle time — is a primitive version. It reviews accumulated context and compresses it. Apply that principle to the resolver specifically, and you get a routing table that improves with use.
A resolver that learns from its own traffic. That's the endgame for agent governance.
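Since this loop has not been fully built, the following is only a sketch of its simplest ingredient: mining a dispatch log for phrasings that repeatedly failed to reach the intended skill. It uses plain counting, not reinforcement learning. The log format, the `intended` field (which assumes some later signal reveals which skill should have fired, for example a direct instruction from the user), and the function name are all assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def propose_trigger_updates(
    dispatch_log: List[Dict[str, Optional[str]]],
    min_count: int = 2,
) -> Dict[str, List[str]]:
    """Suggest phrases to add to a skill's trigger description.

    A phrase is suggested when it missed its intended skill at least
    min_count times in the log.
    """
    misses = defaultdict(Counter)
    for entry in dispatch_log:
        if entry["fired"] != entry["intended"]:
            misses[entry["intended"]][entry["input"].lower()] += 1

    proposals = {}
    for skill, phrases in misses.items():
        repeated = [p for p, n in phrases.most_common() if n >= min_count]
        if repeated:
            proposals[skill] = repeated
    return proposals


# Hypothetical log entries, for illustration only.
example_log = [
    {"input": "is my flight on time", "fired": None, "intended": "flight-tracker"},
    {"input": "Is my flight on time", "fired": None, "intended": "flight-tracker"},
    {"input": "check my flight", "fired": "flight-tracker", "intended": "flight-tracker"},
    {"input": "latest investor email", "fired": "pdf-ingest", "intended": "investor-update-ingest"},
]
proposals = propose_trigger_updates(example_log)
```

The `min_count` threshold keeps a single odd phrasing from rewriting the table. A human or a review step would still decide whether to accept each proposal.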
One more principle, and it's the one that makes everything click.
Resolvers compose. They exist at every layer of the system, not just the top.
The skill resolver lives in AGENTS.md. It maps task types to skill files. "Who is this person?" → brain-ops. "Ingest this PDF" → pdf-ingest. "Check my calendar" → google-calendar. This is the one everyone thinks of.
The filing resolver lives in RESOLVER.md. It maps content types to directories. Person → people/. Company → companies/. Policy analysis → civic/. This is the one that caught the Manidis misfiling.
The context resolver lives inside each skill. When the executive assistant skill fires, it has its own internal routing: email triage goes one way, scheduling goes another, signature tracking goes a third. Sub-routing within the skill.
Claude Code already has this pattern. Every skill has a description field. The model matches user intent to skill descriptions automatically. You never have to remember that /ship exists. The description is the resolver. It's resolvers all the way down.
The same architecture, at every layer. That's what makes it scale from 5 skills to 50, from 1,000 files to 25,000, from a toy demo to a production system that processes 200 inputs a day.
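Composition can be illustrated with two lookups chained together: a top-level table picks the skill, then the skill's own table picks the section. This is a toy sketch with keyword matching standing in for the model's intent matching; the tables, keywords, and function names are hypothetical, while the skill names and the three executive-assistant sections come from the text above.

```python
from typing import Dict, Optional, Tuple

# Layer 1: task type -> skill (the AGENTS.md role).
SKILL_RESOLVER = {
    "sign": "executive-assistant",
    "email": "executive-assistant",
    "calendar": "google-calendar",
}

# Layer 2: routing inside a skill (the context resolver role).
CONTEXT_RESOLVERS = {
    "executive-assistant": {
        "sign": "signature tracking",
        "email": "email triage",
        "schedul": "scheduling",
    },
}


def first_match(table: Dict[str, str], text: str) -> Optional[str]:
    """Return the target of the first keyword found in the text."""
    lowered = text.lower()
    for keyword, target in table.items():
        if keyword in lowered:
            return target
    return None


def resolve(text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Route to a skill, then to a section inside it if the skill has one."""
    skill = first_match(SKILL_RESOLVER, text)
    if skill is None:
        return None
    section = first_match(CONTEXT_RESOLVERS.get(skill, {}), text)
    return skill, section
```

Both layers use the same `first_match` function on different tables, which is the sense in which the architecture repeats at every level.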
Let me pull this together.
A resolver is 200 lines of markdown that replaced 20,000 lines of crammed context. When it's missing, skills invent their own filing logic and everything slowly degrades. When it's present but untested, capabilities go dark — you have a surgeon the hospital can't find. When it's tested but static, it rots within 90 days. When it's tested and self-healing, the system compounds.
The pattern:
- Load the right context at the right moment. Don't cram.
- Mandate that every skill consults the resolver. Don't trust individual filing logic.
- Test the routing, not just the output. Trigger evals.
- Audit reachability. Check-resolvable. Weekly.
- Make the resolver learn from its own traffic. The endgame.
The resolver is the governance layer of an agent system. The traffic cop, the filing clerk, the org chart, and the institutional memory, all in one document that a model can read in 200 milliseconds.
Almost nobody is building them explicitly. Everyone is cramming 20,000 lines into the system prompt and wondering why the model seems dumber than it should be. The model isn't dumb. It's drowning. Give it a routing table and watch what happens.
Up to this point, I've been describing resolvers as a technical pattern. A way to make agents work better. Route tasks. Load the right context. Avoid drowning the model.
That framing is true. It's also too small.
What I actually built is closer to management.
Think about what's happening in a real system with 40+ skills and 25,000 files. You don't just have code. You have an organization.
Skills are employees. Each one has a capability. Some are specialists. Some are generalists. Some only run on cron. Some are user-facing.
The resolver is the org chart. It defines who handles what, how tasks get routed, and what happens when something doesn't match. It's also escalation logic — when one path fails, where does it go next?
The filing rules are internal process. Where information lives. How decisions get recorded. What counts as a "person" vs a "company" vs a "policy analysis." Without that, you don't have a knowledge base. You have a junk drawer.
check-resolvable is audit and compliance. It doesn't care if the code is beautiful. It asks a simpler question: can the system actually do what it claims? Are there capabilities that exist but can't be reached?
Trigger evals are performance reviews. Given a real input, does the right part of the organization respond? If not, you don't retrain the model. You fix the description. You update the routing. You make the org legible.
Once you see it this way, a lot of the confusion around agents disappears.
The problem isn't that models aren't smart enough. The problem is that we've been building organizations with no management layer. Just a pile of talented employees and a vague hope they'll coordinate.
Resolvers are that missing layer.
And once you treat them that way, the goal changes. You're not just wiring up tools. You're designing an organization that can grow, adapt, and stay coherent over time.
That's a different problem. And a much bigger one.
Everything in this article — the resolver pattern, the trigger evals, check-resolvable, the filing rules, the self-healing loop — runs in production, every day, on my personal agent. It processes 200 inputs daily. It has 25,000 files. It compounds.
I open-sourced the entire system.
My open source project GBrain ships with the resolver pattern built in. gbrain init creates RESOLVER.md, the decision tree, and the disambiguation rules. Your agent starts filing correctly from day one. The check-resolvable skill comes built-in. You don't have to discover these patterns by breaking things — the system embodies them.
GStack is the coding layer. Fat skills in markdown. 72,000+ stars on GitHub. The skills in GStack call the knowledge in GBrain. Together they're the full architecture: intelligence on tap.
OpenClaw or Hermes Agent is the conductor — the thin harness that runs the agent loop, manages sessions, and runs crons. GBrain and GStack are skills that plug into it. Your agent reads GBrain's compiled truth before answering. Your crons run the rollup pipelines while you sleep.
This isn't a SaaS product. It's an architecture. The source code is open. The skills are markdown. The brain is a git repo you own. If any piece disappeared tomorrow, your knowledge survives as plain text files.
This is the new dawn of personal software. This is not packaged software. This is software you build for yourself, with the fat skills, fat code, and thin harness that make up your own personal mini-AGI. The future is already here, and I want you to have it in your pocket.
The architecture fits on an index card. The knowledge fits in a git repo. The only thing missing is you starting.
--
GBrain, to build your personal mini-AGI in OpenClaw or Hermes Agent: github.com/garrytan/gbrain
GStack, to help you build faster in Claude Code: github.com/garrytan/gstack