Glean 拾遗
Recent picks

1pick · chronological

07-03

Continually Improving Our Agent Harness

Cursor shares how it continuously improves its agent harness, covering context window evolution from static to dynamic fetching, a two-layer evaluation system (offline benchmarks and online A/B tests measuring code keep rate and user satisfaction), tool call error classification and repair pipeline (anomaly detection + automated log analysis with Cloud Agents), per-model customization of tool formats and prompts (e.g., patch vs. string replacement), and mid-chat model switching with specialized instructions. The post concludes with a vision of multi-agent architectures where the harness orchestrates specialized sub-agents.

cursor.com · 13 min · Agent Engineering · Ai Tooling · Context Engineering