The Claude Opus 4.8 Setup Guide: How to Get Maximum Quality for Minimum Cost (Exact Config Inside)

Source x.com Glean’d 2026-07-03 09:24 Read 9 min
AI summary

A hands-on configuration guide published the day after Claude Opus 4.8's release. The core value lies not in the benchmark improvement (SWE-bench 87.6% → 88.6%) but in three operational features: Effort Control for per-task reasoning depth, Fast Mode at one-third of its previous price, and Dynamic Workflows supporting up to 1,000 parallel subagents. The author provides a cost-optimization matrix that routes tasks to Haiku, Sonnet, or Opus at different effort levels, and claims roughly 50% monthly savings ($400-600 down to ~$205) for heavy users. Includes copy-paste configs for environment variables and settings.json. Practical for Claude Code users focused on cost control, though the savings claims are unverified estimates.

§ 1

Claude Opus 4.8 dropped yesterday. Most people will just update the model and miss everything else.

Anthropic shipped 3 features alongside it that change how you use Claude Code entirely: effort control, dynamic workflows, and cheaper fast mode.

The people who configure these properly will get better results and spend less.

Here's the full setup for the new Opus 4.8 👇

§ 2

Before we dive in, I share daily notes on AI & vibe coding in my Telegram channel: https://t.me/zodchixquant 🧠

What actually changed (30-second version)

Model:           claude-opus-4-8
Price:           $5 / $25 per million tokens (same as 4.7)
Fast mode:       2.5x speed, $10 / $50 (3x cheaper than before)
Context window:  1,000,000 tokens (unchanged)
Max output:      128,000 tokens (unchanged)
SWE-bench:       88.6% (up from 87.6%)
Code flaws:      4x fewer unflagged bugs than 4.7
Honesty:         0% rate of uncritically reporting flawed results

The benchmarks are a modest improvement. The operational changes are massive.

§ 3

Feature 1: Effort Control

Opus 4.8 defaults to High effort. But now you can control how much thinking Claude puts into each task.

Low     → fast, simple tasks, lowest token usage
Medium  → everyday coding, balanced
High    → default, solid reasoning (what 4.7 always used)
Max     → deepest reasoning, highest token usage

In Claude Code:

/effort low       # quick question, formatting
/effort high      # daily coding
/effort max       # complex architecture decisions
/effort ultracode # max reasoning + automatic workflow orchestration

In claude.ai: there's now a slider in the UI. Low for quick questions, Max for deep analysis.

§ 4

Why this matters for cost: running Low effort on simple tasks uses a fraction of the tokens that High uses.

If 60% of your prompts are simple questions, switching those to Low cuts your daily spend significantly without affecting quality on the work that matters.

# Set default in your terminal config
export CLAUDE_CODE_DEFAULT_EFFORT=high

# Override per task when needed
/effort max  # for the hard stuff
/effort low  # for "what does this function return?"
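The saving can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, that a Low-effort prompt costs 25% of a High-effort one and that prompts are otherwise similar in size. Neither figure comes from Anthropic or from this article, so substitute ratios measured from your own usage.

```python
def blended_cost(mix: dict[str, float], relative_cost: dict[str, float]) -> float:
    """Average per-prompt cost, relative to a High-effort prompt (1.0)."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix shares must sum to 1")
    return sum(share * relative_cost[level] for level, share in mix.items())


# ASSUMPTION: illustrative ratio only, not a published figure.
RELATIVE_COST = {"low": 0.25, "high": 1.0}

all_high = blended_cost({"high": 1.0}, RELATIVE_COST)
routed = blended_cost({"low": 0.6, "high": 0.4}, RELATIVE_COST)

print(f"routed spend is {routed / all_high:.0%} of the all-High baseline")
```

Under these assumed numbers the routed mix costs a little over half of the baseline. The result is sensitive to the Low/High ratio, which is why that value is worth measuring rather than guessing.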

§ 5

Feature 2: Fast Mode (3x cheaper)

Fast mode runs Opus at 2.5x the speed.

Standard Opus 4.8:  $5 / $25 per million tokens
Fast mode Opus 4.8: $10 / $50 per million tokens (2.5x speed)

Previous fast mode:  $30 / $150 per million tokens
Price drop:          3x cheaper
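To see what these rates mean for a single request, here is a minimal sketch that applies the per-million-token prices listed above. The request size (200k input tokens, 20k output tokens) is a made-up example, and the function name is illustrative.

```python
# USD per million tokens as (input, output), copied from the table above.
PRICES = {
    "standard": (5.00, 25.00),
    "fast": (10.00, 50.00),
    "previous_fast": (30.00, 150.00),
}


def request_cost(mode: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the listed per-million-token prices."""
    input_price, output_price = PRICES[mode]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical request: 200k input tokens, 20k output tokens.
for mode in PRICES:
    print(f"{mode:>14}: ${request_cost(mode, 200_000, 20_000):.2f}")
```

Note that fast mode is still twice the standard price, so the trade is 2x cost for 2.5x speed. The "3x cheaper" figure compares the new fast mode with the previous one, not with standard mode.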

In Claude Code:

/fast    # toggle fast mode on

When to use fast mode:

Use fast mode for:

- Large refactoring across many files (speed > depth)
- Code generation from specs (pattern matching, not reasoning)
- Documentation writing
- Test generation for existing code

Use standard mode for:

- Complex debugging
- Architecture decisions
- Security review
- Anything where thinking quality matters more than speed

§ 6

Feature 3: Dynamic Workflows (the big one)

This is the headline feature. Dynamic Workflows lets Claude Code spawn hundreds of parallel subagents in a single session.

Up to 1,000 agents per run.

# Trigger a workflow
/effort ultracode

# Or describe a large task naturally
"Audit every API endpoint under src/routes/ for missing auth checks"

Claude plans dynamically from your prompt. It breaks the task into subtasks. It fans work across subagents running in parallel.

Agents attack the problem from independent angles. Other agents try to refute those findings. The run iterates until answers converge.

Resumable runs: if your laptop dies or you close the terminal, the workflow resumes from where it stopped. No starting over.

§ 7

What dynamic workflows handle:

- Migration touching 200+ files
- Full codebase security audit
- Test suite generation for an entire project
- Large-scale refactoring
- Deep research across multiple codebases

What they don't handle well:

- Simple bug fixes (overkill)
- Single-file edits
- Quick questions

Cost warning: dynamic workflows consume meaningfully more tokens than a typical session. A run with 100 subagents can cost $50-200 depending on complexity.

Always set a budget cap:

claude -p "audit the entire codebase" --max-budget-usd 50.00
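The quoted range implies a per-subagent cost that is worth working out before picking a cap. The sketch below divides the $50-200 figure for 100 subagents and asks how many subagents a given budget would cover. It assumes cost scales linearly with subagent count, which the article does not claim, and the function names are illustrative.

```python
def per_agent_cost(run_cost_usd: float, agents: int) -> float:
    """Average cost per subagent for a run with the given total cost."""
    return run_cost_usd / agents


def agents_within_budget(budget_usd: float, cost_per_agent_usd: float) -> int:
    """Whole number of subagents a budget covers at a flat per-agent cost."""
    return int(budget_usd // cost_per_agent_usd)


# Figure from the text: 100 subagents cost $50-200 depending on complexity.
cheap = per_agent_cost(50.0, 100)
costly = per_agent_cost(200.0, 100)

# ASSUMPTION: cost grows linearly with the number of subagents.
budget = 50.0  # same value as --max-budget-usd above
print(f"${budget:.2f} covers {agents_within_budget(budget, costly)} to "
      f"{agents_within_budget(budget, cheap)} subagents")
```

Under the same linear assumption, a run at the 1,000-agent ceiling would land between $500 and $2,000, which is the case a budget cap is meant to guard against.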
§ 8

Feature 4: Better honesty (actually matters)

Opus 4.8 is 4x less likely to leave flaws in its own code unflagged. It scored 0% on uncritically reporting flawed results.

In practice: when Opus 4.8 isn't sure about something, it tells you instead of confidently giving you a wrong answer. Previous models would generate plausible-looking code that silently broke edge cases.

This compounds over long sessions. A model that flags its own uncertainty on turn 15 saves you 2 hours of debugging on turn 40.

§ 9

The cost optimization matrix

Here's how to route every task to the right model and effort level:

Task                          Model      Effort    Mode
─────────────────────────────────────────────────────────
Quick question                Haiku      Low       Standard
Format this code              Sonnet     Low       Standard
Write a test                  Sonnet     Medium    Standard
Daily coding                  Opus 4.8   High      Standard
Code review                   Opus 4.8   High      Standard
Large refactor (speed)        Opus 4.8   High      Fast
Complex architecture          Opus 4.8   Max       Standard
Full codebase audit           Opus 4.8   Ultracode Dynamic
Migration (200+ files)        Opus 4.8   Ultracode Dynamic
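If you want the matrix in a form a script can consult, a plain lookup table is enough. This is a minimal sketch: the `Route` type, the `route` function, and the fallback to the "Daily coding" row for unknown tasks are all assumptions for illustration, not part of Claude Code.

```python
from typing import NamedTuple


class Route(NamedTuple):
    model: str
    effort: str
    mode: str


# Rows copied from the matrix above; keys are lowercased task labels.
ROUTES = {
    "quick question": Route("Haiku", "Low", "Standard"),
    "format this code": Route("Sonnet", "Low", "Standard"),
    "write a test": Route("Sonnet", "Medium", "Standard"),
    "daily coding": Route("Opus 4.8", "High", "Standard"),
    "code review": Route("Opus 4.8", "High", "Standard"),
    "large refactor (speed)": Route("Opus 4.8", "High", "Fast"),
    "complex architecture": Route("Opus 4.8", "Max", "Standard"),
    "full codebase audit": Route("Opus 4.8", "Ultracode", "Dynamic"),
    "migration (200+ files)": Route("Opus 4.8", "Ultracode", "Dynamic"),
}

# ASSUMPTION: tasks not in the matrix fall back to the "daily coding" row.
DEFAULT_ROUTE = ROUTES["daily coding"]


def route(task: str) -> Route:
    """Look up the suggested model, effort, and mode for a task label."""
    return ROUTES.get(task.strip().lower(), DEFAULT_ROUTE)


print(route("Write a test"))
print(route("something not in the matrix"))
```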

Monthly cost comparison:

Before (everything on Opus High, standard):
~$400-600/mo for heavy usage

After (routed correctly):
Haiku for quick questions:     $5/mo
Sonnet for daily tasks:        $40/mo
Opus High for complex work:    $80/mo
Opus Fast for large refactors: $30/mo
Dynamic for big audits:        $50/mo (occasional)
─────────────────────────────────────────
Total:                         ~$205/mo

Savings:                       ~50%
Same output quality on every task that matters.
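The arithmetic behind the total and the savings figure can be checked directly. The sketch below uses only the numbers quoted above; the dictionary keys are shorthand labels chosen for this example.

```python
# Monthly figures (USD) from the "After" breakdown above.
ROUTED_MONTHLY_USD = {
    "haiku_quick_questions": 5,
    "sonnet_daily_tasks": 40,
    "opus_high_complex_work": 80,
    "opus_fast_refactors": 30,
    "dynamic_audits": 50,
}

# The "Before" range quoted for heavy usage.
BASELINE_RANGE_USD = (400, 600)

total = sum(ROUTED_MONTHLY_USD.values())
print(f"routed total: ${total}/mo")

for baseline in BASELINE_RANGE_USD:
    savings = 1 - total / baseline
    print(f"vs ${baseline}/mo baseline: {savings:.2%} saved")
```

The line items do add up to $205. Against the quoted baseline range the saving works out to about 49% at $400 and about 66% at $600, so "~50%" corresponds to the low end of that range.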

§ 10

The full config (copy-paste ready)

Environment variables

# Add to ~/.zshrc or ~/.bashrc
export CLAUDE_CODE_DEFAULT_EFFORT=high
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1
export CLAUDE_CODE_SUBAGENT_MODEL="claude-sonnet-4-5-20250929"
export ANTHROPIC_MODEL="claude-opus-4-8"

settings.json

{
  "permissions": {
    "allow": [
      "Read", "Glob", "Grep", "LS", "Edit", "MultiEdit",
      "Write(src/**)", "Write(tests/**)", "Write(docs/**)",
      "Bash(npm run *)", "Bash(npm test *)", "Bash(npx tsc *)",
      "Bash(npx prettier *)", "Bash(npx eslint *)",
      "Bash(git status)", "Bash(git diff *)", "Bash(git log *)",
      "Bash(git add *)", "Bash(git commit *)"
    ],
    "deny": [
      "Read(**/.env*)", "Read(**/.ssh/**)", "Read(**/.aws/**)",
      "Bash(rm -rf *)", "Bash(sudo *)", "Bash(git push *)"
    ],
    "defaultMode": "acceptEdits"
  },
  "hooks": {
    "PostToolUse": [
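      {
        "matcher": "Write",
        "hooks": [
          { "type": "command", "command": "jq -r '.tool_input.file_path' | { read -r f; case \"$f\" in *.ts) npx prettier --write \"$f\" ;; esac; }" },
          { "type": "command", "command": "jq -r '.tool_input.file_path' | grep -q '\\.ts$' && npx tsc --noEmit 2>&1 | head -20" }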
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "out=$(npm test 2>&1); status=$?; printf '%s\\n' \"$out\" | tail -10; echo \"Exit: $status\"" }
        ]
      }
    ]
  }
}
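Before pasting a settings file like this, it can help to confirm that it parses and that no rule is listed under both allow and deny. The sketch below runs that check on a trimmed copy of the config. The `check_settings` helper is hypothetical and only inspects the JSON; it does not reproduce how Claude Code itself evaluates permission rules.

```python
import json

# Trimmed copy of the settings above, shortened for illustration.
SETTINGS_TEXT = """
{
  "permissions": {
    "allow": ["Read", "Write(src/**)", "Bash(git status)"],
    "deny": ["Read(**/.env*)", "Bash(sudo *)", "Bash(git push *)"],
    "defaultMode": "acceptEdits"
  },
  "hooks": {"PostToolUse": [], "Stop": []}
}
"""


def check_settings(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    settings = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    permissions = settings.get("permissions", {})
    allow = set(permissions.get("allow", []))
    deny = set(permissions.get("deny", []))

    problems = [
        f"rule listed in both allow and deny: {rule}"
        for rule in sorted(allow & deny)
    ]
    if not isinstance(settings.get("hooks", {}), dict):
        problems.append("hooks must be a JSON object keyed by event name")
    return problems


print(check_settings(SETTINGS_TEXT) or "no problems found")
```

To check a real file, read it with `pathlib.Path(...).read_text()` and pass the result to `check_settings`.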

§ 11

Daily workflow cheat sheet

# Start of day: default effort
/effort high

# Quick questions
/effort low
"what does this function return?"
/effort high

# Large refactor (speed matters)
/fast
"refactor the entire auth module to use the new session handler"

# Full codebase audit (dynamic workflow)
/effort ultracode
"audit every endpoint for missing auth checks"

# Model switching
/model sonnet    # for simple tasks
/model opus      # for complex work
/model haiku     # for throwaway questions

§ 12

The one thing most people will miss

Effort control is the highest-value feature in this release. Not dynamic workflows, not fast mode. Effort control.

Running Low effort on 60% of your prompts and Max on the 10% that actually need deep reasoning is the discipline that cuts your monthly bill in half without touching output quality on what matters.

Most people will leave everything on High and never touch the slider. The ones who learn to route effort per task will get the same results at half the cost.

Thanks for reading!

I share daily notes on AI, finance, and vibe coding in my Telegram channel: https://t.me/zodchixquant
