Glean 拾遗
Recent picks

27picks · chronological

05-31

How to Use Claude at 100% — Most People Never Get Past 10%

This guide reveals 17 hidden features of Claude that most users never use, including Projects, Artifacts, Extended Thinking, Memory, Claude in Chrome, Cowork, Scheduled Tasks, Skills, CLAUDE.md, Claude Code, Claude Design, and Prompt Caching. Each feature comes with setup instructions and ready-to-use prompts. Perfect for anyone wanting to turn Claude from a simple chatbot into a full productivity system.

x.com · 16 min · Agents · AI · LLM
05-31

Best Practices for Computer and Browser Use with Claude

Official best practices guide for integrating Claude's computer and browser use capabilities, covering screenshot scaling, click accuracy, cache breakpoints, context management with rolling buffer and server-side compaction, prompt injection defenses, thinking effort tuning, and experimental features like batch tools and the advisor tool. Based on internal testing with Claude 4.6 and Opus 4.7, includes concrete code and performance data.

claude.com · 59 min · Agents · LLM
05-31

Project Glasswing: What Mythos Showed Us

Cloudflare tested Anthropic's Mythos Preview on 50+ internal repos under Project Glasswing. The model excels at chaining low-severity bugs into working exploits and generating PoCs, making validation actionable. Real-world use revealed inconsistent model refusals and signal-to-noise challenges; a generic coding agent proved ineffective. Cloudflare built an eight-stage harness (Recon, Hunt, Validate, Gapfill, Dedupe, Trace, Feedback, Report) using parallel narrow tasks and adversarial review to improve quality. The post argues that beyond faster patching, defenses must limit exploit reachability from the architecture layer.

blog.cloudflare.com · 18 min · Agents · Infra · LLM
05-30

How to Make Claude Code Fix Its Own Mistakes Automatically (Exact Setup You Can Copy)

This article provides a complete, copy-paste-ready setup for making Claude Code automatically catch, fix, and learn from its own mistakes. It covers a self-growing CLAUDE.md for project rules, PostToolUse hooks for auto-formatting and type-checking, Stop hooks for running tests on completion, PreToolUse hooks for blocking dangerous operations, and cross-session memory. The included settings.json config reduces back-and-forth from 45 minutes to 10 unattended minutes per feature. Audience: engineers using Claude Code or AI coding assistants.

x.com · 10 min · Agents · AI
05-30

How I set up Claude to actually get work done

Most people use Claude as a one-off Q&A, losing context each time. The author shares a systematic setup: personal instructions, projects, reference files, a context file, connected tools like email and calendar, templates, and repeatable workflows. 25 concrete steps transform Claude from a chat window into a reusable work environment. Suitable for technical workers frustrated with inconsistent AI responses.

x.com · 9 min · Agents · LLM
05-30

20 AI Concepts You Must Understand in 2026

A beginner-friendly primer covering 20 core AI concepts split into four parts: foundational mechanisms, how LLMs work, how models improve, and how real systems are built. Uses simple analogies and visuals to explain neural networks, transformers, RAG, agents, and more. No code or deep implementation details — a quick reference for building mental models.

x.com · 17 min · Agents · AI · LLM
05-30

Introducing dynamic workflows in Claude Code

Claude Code now supports dynamic workflows, enabling parallel orchestration of tens to hundreds of subagents within a single session for large-scale engineering tasks. It handles end-to-end jobs like codebase-wide bug hunts, migrations across hundreds of files, and security audits. Workflows dynamically plan, fan out, cross-validate, and converge results. Example: Bun was ported from Zig to Rust in 11 days, producing ~750k lines with 99.8% test pass rate. Workflows show plans before execution, can resume after interruption, but consume significantly more tokens. Available in research preview for Max, Team, Enterprise users.

claude.com · 7 min · Agents · AI
05-29

Claude Code Dynamic Workflows: A New Primitive That Moves Orchestration Into Code

Anthropic introduces Dynamic Workflows, a primitive that turns task orchestration into JavaScript scripts executed by a deterministic runtime. The script manages loops, branching, and intermediate results, so only the final answer enters the main Claude context—solving the bottleneck of context overflow and attention dilution when coordinating hundreds of parallel tasks. A deep dive into architecture, primitives, and execution model is paired with a real-world Bun-to-Rust migration (11 days, 750K lines, 99.8% test pass) and a personal 133-session analysis. Comparisons with n8n/Coze/Dify show Workflow's advantage: Turing-complete code offers more expressiveness than visual DAGs, and the orchestration can be generated on‑the‑fly by a model. It shines for codebase audits, large migrations, and adversarial verification but comes with high token costs and current preview limits. Target audience: engineers tackling massive, automated coding tasks.

x.com · 19 min · Agents · AI · Framework
05-29

Prompt → Context → Harness: The Three Paradigms of AI Engineering

AI engineering has undergone three paradigm shifts: from Prompt Engineering (2023–2024) to Context Engineering (2025), and now to Harness Engineering (early 2026). Harness Engineering combines evaluation feedback loops, architectural constraints, and memory governance. Anthropic’s evaluator agent turned a 20‑minute useless artifact into a 6‑hour complete game; OpenAI built a million‑line system with zero human‑written code in five months, enforcing architectural boundaries via CI/linters. Two academic papers fill the memory layer: (S)AGE uses Byzantine‑fault‑tolerant Proof of Experience consensus to double agent calibration accuracy; a longitudinal study shows that 3 lines of prompt plus memory matches 200 lines of expert prompt in performance, yet only the memory group improves over time. Essential for engineers building multi‑agent systems.

x.com · 3 min · Agents · AI · LLM
05-29

10x Your Claude Skills with Karpathy's Autoresearch Method

This article shows how to automatically improve Claude Skills using Karpathy's autoresearch method. It works by giving an agent a yes/no checklist and letting it iteratively test, tweak, and keep only beneficial changes. The author improved a landing page copy skill from 56% to 92% pass rate in four rounds, with visible changelogs. The method applies to any measurable task—define a checklist and let the agent run. Includes download links and practical examples for engineers building AI workflows.

x.com · 5 min · Agents · AI
05-29

ClickStack Observability: MCP Server, AI Notebooks, and ClickStack Cloud

At Open House, ClickHouse announced three major observability updates: ClickStack Cloud (serverless, managed, private preview), AI Notebooks (beta), and an open-source ClickStack MCP server. AI Notebooks replace linear chat with persistent, branchable investigation workspaces, exposing every query and step. The MCP server provides semantic investigative tools to external agents; internal benchmarks show 25% fewer tool calls, 2.5× consistency improvement, and 20% higher evaluation scores vs. raw SQL MCP. The server also supports bi‑directional orchestration: agents can create dashboards and persist results. The design philosophy is “bring your own agents,” with SQL as an escape hatch when pre‑built tools fall short. The post includes setup instructions and a demo. For infrastructure/SRE engineers evaluating ClickHouse-based observability.

clickhouse.com · 15 min · Agents · AI · Database
05-29

Introducing ClickHouse Agent Skills

ClickHouse has released official Agent Skills: an open-source set of 28 prioritized best-practice rules covering schema design, query optimization, and data ingestion, packaged using Anthropic's Agent Skills specification. Users can add them locally with `npx skills add clickhouse/agent-skills`. AI agents (e.g., Claude Code) automatically invoke these rules when appropriate, helping avoid common pitfalls like wrong ORDER BY, non-scalable JOINs, or missing materialized views. The Apache 2.0-licensed repo welcomes community contributions.

clickhouse.com · 3 min · Agents · AI · Database
05-29

Organizational Structure for AI-First in the Harness Era

A podcast interview with Creao's founders explores Harness Engineering—building self-healing, self-improving systems around LLMs. True AI-First companies restructure around AI as the primary producer: development cycles shrink from weeks to a day, product managers are dismantled, and cross-team alignment is automated. Junior engineers adapt faster than seniors; the future rewards architecture + product + marketing generalists. The 'Agent Economy' means content may be produced for AI consumers. A 25-person team rebuilt their architecture in two weeks. Full transcript available.

x.com · 2 min · Agents · AI · Framework
05-28

Agent Unveiled: Principles, Architecture, and Engineering Practices

This article systematically examines the underlying architecture and engineering practices of agent systems. Starting from a stable agent loop, it contrasts workflows with agents, explains five control patterns, and emphasizes that the harness (evaluation baselines, execution boundaries, feedback, and fallbacks) often matters more than the model itself. It details context engineering via layered management and three compression strategies to prevent context rot, ACI‑oriented tool design, a four‑type memory system with consolidation, long‑task state externalization across sessions, protocol‑based multi‑agent coordination, eval frameworks (Pass@k and Pass^k), and event‑driven observability. Finally, it shows how these principles are implemented in OpenClaw, providing a practical reference for engineers building robust agents.

tw93.fun · 31 min · Agents · Framework · LLM
05-28

Deconstructing Claude Code: Architecture, Governance, and Engineering Practices

Based on six months of intensive use of Claude Code, the author breaks down its functionality into six layers (CLAUDE.md/rules/memory, Tools/MCP, Skills, Hooks, Subagents, Verifiers) and provides design principles, anti-patterns, and configuration examples for each. The article focuses on context engineering (token cost structure, layered loading strategy, compaction pitfalls), tool design, Hooks for mandatory enforcement, Subagents for context isolation, prompt caching, and verification loops. Ideal for engineers wanting to move from ad‑hoc chat to a disciplined agent engineering workflow.

tw93.fun · 20 min · Agents · AI
05-28

Beyond the Coding Assistant — A New Series

This free series examines AI-assisted software engineering at enterprise scale. While individual coding speed has skyrocketed, many teams have not seen delivery improve—some have even slowed down. The author argues that current AI coding assistants optimize a single role, but software is shipped by teams with many non-coding roles. The next frontier is lifecycle orchestration, not better code generation. The series is structured in four parts, publishing three times a week with no paywall. It is aimed at engineering leaders, architects, and developers interested in AI engineering.

articles.zimetic.com · 8 min · Agents · AI
05-28

how to build a production grade ai agent

Over 40% of agentic AI projects fail, not because of models, but due to poor risk controls, architecture, and business value. This article presents ten engineering principles: threat modeling, strictly typed tool contracts, least-privilege execution, compact context engineering, governed retrieval, deterministic orchestration, separated memory, reliability mechanics, full observability, and continuous governance. Each principle provides concrete implementation details and real-world numbers (e.g., prompt injection appears in 73% of deployments), guiding teams to build secure, scalable production-grade agents.

x.com · 20 min · Agents · AI · LLM
05-28

The 8 Levels of Agentic Engineering

Bassim Eledath maps the progression of AI-assisted coding into 8 levels, from tab-complete and AI IDEs to context engineering, compounding engineering, MCPs & skills, harness engineering with automated feedback loops, background agents, and autonomous agent teams. Each level builds on the previous, with practical insights on closing the gap between model capability and practice. He argues that plan mode is fading, multi-model dispatching yields better results, and true autonomous teams are still experimental. The piece serves as a roadmap for engineers looking to leverage AI more effectively.

www.bassimeledath.com · 22 min · Agents · AI · LLM
05-28

How Coding Agents Are Reshaping Engineering, Product and Design

Coding agents are fundamentally reshaping the EPD (Engineering, Product, and Design) collaboration model. With the cost of implementation plummeting, the traditional PRD→mockup→code waterfall is dead, replaced by a review-centric process where prototypes are rapidly generated and then scrutinized. Generalists who wield coding agents gain unprecedented leverage; system thinking and product sense become essential for everyone. The bar for specialization rises, and roles converge into either builders or reviewers. Ultimately, anyone with a deep grasp of both product and technology can thrive, blurring traditional role boundaries.

x.com · 12 min · Agents · AI
05-27

The Future Of Software Engineering with Anthropic

A summary of a roundtable on the future of software engineering, featuring leaders from Stripe, NVIDIA, Microsoft, and others. Key insights: closed-loop development creates compounding gains; test-first is the new default; human code review is fading; comments are written for AI readability; long-horizon tasks remain unsolved; developer tooling is being displaced first; hiring values experimentation over raw skill; human-authored context files help, agent-authored ones can hurt. Candid trade-offs and real-world practices are shared.

www.akashbajwa.co · 12 min · Agents · AI · LLM
05-27

5 Agent Skill Design Patterns Every ADK Developer Should Know

With SKILL.md format standardised across 30+ agent tools, the real challenge is content design. This article distills five recurring patterns from ecosystem-wide practices: Tool Wrapper (on-demand library context), Generator (template fill-in for consistent output), Reviewer (checklist scoring by severity), Inversion (agent-led interview before acting), and Pipeline (strict multi-step with gate conditions). Each pattern includes working ADK code, helping developers build reliable agents.

x.com · 13 min · Agents · AI · Framework
05-27

Your Best Prompt Is a Well-Defined User Story

In the age of agentic development, user story quality directly impacts AI output. The article argues teams should invest more time in breaking down stories and writing clear acceptance criteria rather than just estimating story points. A well-defined story includes three parts: Context, Acceptance Criteria, and Technical Hypothesis. Story point estimation is valuable only when forecasting or surfacing team misalignment is needed; otherwise it can be skipped. A good story acts as a good prompt, accelerating development cycles. Relevant for engineering teams using agile/Scrum.

spin.atomicobject.com · 7 min · Agents · LLM
05-27

Dreaming, Outcomes, and Multiagent Orchestration in Claude Managed Agents

Anthropic launches Dreaming in research preview for Claude Managed Agents: a scheduled process that reviews past sessions and memory to extract patterns, enabling agents to self-improve. Outcomes let developers define rubrics with a separate grader for self-correcting work; internal benchmarks show +10pp task success, +8.4% on docx, +10.1% on pptx. Multiagent orchestration allows a lead agent to decompose tasks to specialist subagents running in parallel with shared filesystem and traceability. Case studies include Harvey (6x completion rate improvement), Netflix (parallel log analysis), Spiral (writing quality via outcomes), and Wisedocs (50% faster document reviews). For engineers building autonomous AI agent systems.

claude.com · 6 min · Agents · LLM
05-27

ByteDance TRAE AI Coding Manuals: Context Engineering as Moat

A distilled summary of ByteDance TRAE team's 20 internal AI coding practice manuals. The core argument is that the bottleneck in AI coding efficiency is not model capability but context engineering. The article details six methodologies: Context Engineering, Skills, Spec Coding, Rules, MCP, and Agentic Coding, backed by experimental data (e.g., 32 real bug fixes: 100% success with Skills vs 59% without). Suitable for frontline developers, tech leads, and engineering managers.

x.com · 14 min · Agents · AI · LLM
05-27

From Scratch: Build Automated Claude Code Workflows with Hooks

A tutorial on using Claude Code Hooks to automate shell commands at lifecycle events, replacing unreliable prompt instructions. Covers 5 key events (PostToolUse, PreToolUse, etc.), 3 hook types (command/prompt/agent), and config file structure. Provides 5 ready-to-use examples: desktop notification, auto-formatting, file protection, context recovery after compaction, and commit message linting. Exit code 2 blocks dangerous actions and feeds stderr back to Claude. For developers seeking reliable Claude Code workflows.

x.com · 10 min · Agents · AI
05-27

Claude Code in Large Codebases: Best Practices and Getting Started

This article covers how Claude Code navigates large codebases using agentic search instead of RAG indexing, avoiding stale index issues but requiring good context configuration. It details the 'harness' ecosystem around the model—CLAUDE.md, Hooks, Skills, Plugins, MCP servers, LSP integration, and subagents—and presents three configuration patterns from successful deployments: making the codebase navigable, maintaining CLAUDE.md as models evolve, and assigning ownership for rollout. A practical guide for teams adopting Claude Code at scale.

claude.com · 19 min · Agents · AI
05-27

Why Your “AI-First” Strategy Is Probably Wrong

The CTO of an agent platform shares their journey of rebuilding the entire engineering workflow around AI: 99% of production code is written by AI, shipping features within a day. The article critiques the superficial “AI-assisted” approach and introduces “harness engineering,” detailing their tech stack, self-healing feedback loop, and the new engineer roles of Architect and Operator. Real-world results include 3–8 deployments per day. Valuable for teams and CTOs seeking genuine AI integration.

x.com · 19 min · Agents · AI