05-31
Project Glasswing: What Mythos Showed Us
Cloudflare tested Anthropic's Mythos Preview on 50+ internal repos under Project Glasswing. The model excels at chaining low-severity bugs into working exploits and generating PoCs, making validation actionable. Real-world use revealed inconsistent model refusals and signal-to-noise challenges; a generic coding agent proved ineffective. Cloudflare built an eight-stage harness (Recon, Hunt, Validate, Gapfill, Dedupe, Trace, Feedback, Report) using parallel narrow tasks and adversarial review to improve quality. The post argues that beyond faster patching, defenses must limit exploit reachability from the architecture layer.