Tag · Verification - Glean

06-27

The New Software Lifecycle: From Writing Code to Judging It

Key insights from a Google whitepaper on how AI transforms the software lifecycle. The core thesis: an agent is 10% model and 90% harness (instructions, tools, sandboxes, orchestration, observability). Context engineering is the primary cost lever, and the critical distinction is between static context (loaded every turn, reliable but expensive) and dynamic context (loaded on demand, cost-efficient but in need of careful design). Verification determines whether you are vibe coding or doing agentic engineering: tests cover the deterministic parts, while evals cover non-deterministic output and trajectory. Two data points: one team moved a coding agent from outside the top 30 to the top 5 on Terminal Bench 2.0 by changing only the harness while keeping the model the same, and LangChain added 13.7 points on the same benchmark by changing the system prompt, tools, and middleware around a fixed model. Implementation collapses from weeks to hours, while specification and verification become the new bottlenecks. Aimed at engineers and tech leads adopting AI agents in production workflows.

addyosmani.com · 15 min · Agent Architecture · AI Engineering · Context Engineering
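The summary's split between tests (for deterministic parts) and evals (for non-deterministic output and trajectory) can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the whitepaper: the `Run` and `Step` types, the tool names, the log format, and the scoring rules are assumptions chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


# Deterministic part: same input, same output, so a plain unit test is enough.
def parse_exit_code(log_line: str) -> int:
    """Read the integer after 'exit=' in a (hypothetical) tool log line."""
    return int(log_line.split("exit=")[1].split()[0])


@dataclass
class Step:
    tool: str  # name of the tool the agent called (hypothetical names)
    ok: bool   # whether that call succeeded


@dataclass
class Run:
    output: str             # the agent's final answer
    trajectory: list[Step]  # the tool calls it made along the way


def output_score(run: Run, required_terms: list[str]) -> float:
    """Output eval: fraction of required terms present in the final answer."""
    text = run.output.lower()
    hits = sum(term.lower() in text for term in required_terms)
    return hits / len(required_terms)


def trajectory_score(run: Run, forbidden: set[str], max_steps: int) -> float:
    """Trajectory eval: 0.0 on a forbidden tool, too many steps, or an empty
    trajectory; otherwise the share of steps that succeeded."""
    if not run.trajectory or len(run.trajectory) > max_steps:
        return 0.0
    if any(step.tool in forbidden for step in run.trajectory):
        return 0.0
    return sum(step.ok for step in run.trajectory) / len(run.trajectory)


def pass_rate(runs: list[Run], required_terms: list[str],
              forbidden: set[str], max_steps: int, threshold: float) -> float:
    """Share of runs whose output and trajectory scores both meet the threshold."""
    passed = sum(
        output_score(r, required_terms) >= threshold
        and trajectory_score(r, forbidden, max_steps) >= threshold
        for r in runs
    )
    return passed / len(runs)


# Two made-up runs of the same task: evals are judged over many samples.
runs = [
    Run("Fixed the off-by-one bug and added a test",
        [Step("read_file", True), Step("run_tests", True)]),
    Run("Done", [Step("shell_rm", True)]),
]
rate = pass_rate(runs, ["bug", "test"], {"shell_rm"}, max_steps=5, threshold=0.8)
```

The design point is that the test asserts one exact value, whereas the eval reports a rate over sampled runs and scores how the agent got there, not only what it returned. A real eval would likely use richer graders than keyword matching.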

Verification

1 pick · chronological
