Switching from Superpowers to mattpocock/skills: Less Token Waste, More Control
The author shares a real-world comparison between Superpowers and mattpocock/skills, explaining why they switched. Superpowers uses hooks to enforce a rigid workflow, which is helpful for novices but often overcomplicates simple tasks and burns excessive tokens. mattpocock/skills takes a 'real engineer' approach, giving control back to the user via explicit commands like /grill-with-docs, /to-prd, /to-issues, and /implement. Key advantages: lower token consumption, built-in debugging (/tdd, /diagnosing-bugs), model handoff (/handoff), and architecture refactoring (/improve-codebase-architecture). The author pairs these skills with Fable 5 and Codex 5.5 models, storing PRDs and issues on GitHub for traceability. A candid take for engineers evaluating agent frameworks and tooling.
In my article "Why Superpowers Can Execute Long Tasks and Ensure Delivery Quality", I explained the principles behind the Superpowers skills, which I have been using for a while.
These skills are designed to enforce one standardized workflow across the entire software engineering iteration. Because they rely on hooks, the user doesn't need to be a senior engineer: the skill required at each stage is invoked automatically, walking them through the whole process from brainstorming to implementation with subagents and worktrees.
These skills are excellent, especially for those less familiar with software engineering workflows. The plans generated can even be delegated to cheaper coding agents.
In Superpowers' own words: "After you've signed off on the design, your agent puts together an implementation plan that's clear enough for an enthusiastic junior engineer with poor taste, no judgement, no project context, and an aversion to testing to follow. It emphasizes true red/green TDD, YAGNI (You Aren't Gonna Need It), and DRY."
But this is also Superpowers' downside: it's forced and token-intensive.
Matt Pocock's Skills
As I mentioned last time, @Clu recommended mattpocock/skills to me. I started using it, and for a while I ran both skill sets side by side.
At first I was thrilled that Superpowers could run independently for over an hour, but later I found it often overcomplicated simple problems. So the first thing I did was ask my coding agent to delete all the hooks from Superpowers, ensuring it wouldn't kick in automatically unless I manually invoked it.
This allowed me to combine mattpocock's /grill-with-docs with Superpowers. I found that mattpocock's Q&A used fewer tokens than Superpowers' brainstorming, with results that were no worse. Admittedly, the latter's visual companion provides demo previews, which are a good way to align visual and interaction design with the agent, but it is too token-heavy. Once Fable 5 was back, I no longer needed the visual companion at all.
So if complex visual interactions need early alignment with the agent, we can use /prototype to quickly build a simple prototype as a rough substitute. This prototype is meant for testing new ideas and can be thrown away if unsuitable.
mattpocock's skills reached v1.0.0 two weeks ago. The release also renamed some skills and added ask-matt: invoke it after the initial installation, and it explains how to use the skills in typical scenarios.
The basic workflow of these skills is similar to Superpowers:
Superpowers:
brainstorming => write-spec => write-plans => execute-plan
mattpocock skills:
grill-with-docs => to-prd => to-issues => implement
The above is the basic flow, but in practice there are some variations.
My personal projects are on GitHub, so I use GitHub Issues as the issue tracker, storing both PRDs and issues there. The advantage is that I can use an expensive, high-intelligence model like Fable 5 to write the PRD, then hand it off to a reliable top-tier model like Codex 5.5 to write the code. And since everything is recorded, I won't forget what I did with the agent or where I left off.
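As a rough illustration of keeping PRDs on GitHub, here is a minimal sketch that builds a `gh issue create` command for a PRD file. It only assembles the argument list and never runs it. The helper name `build_issue_command`, the file name `prd.md`, the title, and the `prd` label are all hypothetical and chosen for illustration; the skills themselves may create issues differently.

```python
import shlex


def build_issue_command(title: str, body_file: str, labels: list[str]) -> list[str]:
    """Build (but do not run) a GitHub CLI command that files an issue.

    Uses only the documented `gh issue create` flags
    --title, --body-file and --label.
    """
    cmd = ["gh", "issue", "create", "--title", title, "--body-file", body_file]
    for label in labels:
        # Each label gets its own --label flag; the label must already
        # exist in the target repository.
        cmd += ["--label", label]
    return cmd


# Hypothetical PRD written by the "expensive" model and saved to prd.md.
prd_cmd = build_issue_command("PRD: offline sync", "prd.md", ["prd"])

# Print a shell-safe version for review before running it by hand.
print(shlex.join(prd_cmd))
```

Because the PRD lives in an issue rather than in one agent's context window, a different model can pick it up later by reading that issue.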
Using mattpocock's skills puts the decision of when to invoke which skill back in my hands. After all, he calls it "Skills For Real Engineers".
Matt wrote in the repo that the skills exist because:
The Agent Didn't Do What I Want => Actually, the agent doesn't know what we want. Through /grill-with-docs, we engage in Q&A to help the agent truly understand our requirements.
The Agent Is Way Too Verbose => We and the agent don't speak the same "language". Through /grill-with-docs, the agent records terms and project-specific context for future reference, enabling efficient communication between us.
The Code Doesn't Work => We use /tdd, paired with the debug skill /diagnosing-bugs.
We Built A Ball Of Mud => I feel coding agents are a bit like humans: they prefer to keep embellishing existing code. Each phase of these skills leans toward better design, and /to-prd confirms which modules should be touched before any work starts. But however careful you are, a project tends to get messy over time. That's when /improve-codebase-architecture shines for refactoring: it produces a detailed report plus an HTML page with visual charts.
These skills strictly differentiate between user-invoked and model-invoked modes; core workflow skills are manually invoked, while bug-hunting and code review skills can be auto-triggered.
Also, /handoff is very handy: when I switch between Claude and Codex, I use it to transfer the context across.
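The split between user-invoked and model-invoked skills can be pictured as a simple gate. The sketch below is a toy model, not how the skills are actually implemented: the `Skill` class, the `may_run` function, and the classification table are assumptions based only on the description above (core workflow skills are manual, bug hunting may trigger automatically).

```python
from dataclasses import dataclass
from enum import Enum


class Invocation(Enum):
    USER = "user-invoked"    # runs only when the user types the command
    MODEL = "model-invoked"  # the agent may trigger it on its own


@dataclass(frozen=True)
class Skill:
    name: str
    invocation: Invocation


# Assumed classification, following the article's description only.
SKILLS = {
    s.name: s
    for s in [
        Skill("grill-with-docs", Invocation.USER),
        Skill("to-prd", Invocation.USER),
        Skill("to-issues", Invocation.USER),
        Skill("implement", Invocation.USER),
        Skill("diagnosing-bugs", Invocation.MODEL),
    ]
}


def may_run(skill_name: str, requested_by_user: bool) -> bool:
    """Return True if the skill is allowed to start in this situation."""
    skill = SKILLS[skill_name]
    # A user request always wins; otherwise only model-invoked skills may start.
    return requested_by_user or skill.invocation is Invocation.MODEL


print(may_run("implement", requested_by_user=False))
print(may_run("diagnosing-bugs", requested_by_user=False))
```

The point of the gate is that the expensive, multi-step workflow never starts unless the user asks for it, while cheap helper skills stay available to the agent.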
So recently I've replaced Superpowers with mattpocock's skill set—it saves tokens and time while delivering good results, and it leaves logs in GitHub Issues to rescue my memory, which is overwhelmed by too many projects.
The AI world loves coining terms: first "harness", now "loop engineer". But whatever the buzzword, it all comes down to the enhancement of base model capabilities and how we constrain model behavior in engineering practice. When constraints and model capabilities were weaker, we emphasized "harness"—guiding the model along desired paths without deviation. As harness engineering advances and models become more capable with better instruction following, we shift focus to stable long-task delivery, moving toward "loop engineering".
Half of this year has passed, and AI has evolved so rapidly it feels like several generations have come and gone. The fully automated assembly-line cash-printing model for short-form dramas is basically operational, with low-cost copycat pipelines emerging everywhere. But in the domain of serious product development, there's still much to improve—"taste" is often cited as a critical part of human-in-the-loop, and product refinement remains time-consuming. With the release of Fable 5, however, I'm beginning to see a glimmer of efficiency gains in the polishing phase. What will the next generation of software look like? What an exciting future ahead.