Tag · AI Agents - Glean

Recent picks

11 picks · chronological

07-23

Turn any codebase into a queryable knowledge graph, built for AI coding assistants

Graphify is an open-source tool that transforms codebases, docs, PDFs, images, and other files into a queryable knowledge graph. It uses tree-sitter for local, deterministic AST parsing to extract code relationships (calls, imports, inheritance) without any LLM calls. Non-code files are semantically extracted via your AI assistant's model. The output includes an interactive HTML visualization, CLI queries (query/path/explain), and an MCP server for team use. Every edge is tagged EXTRACTED or INFERRED, so it is visible which relationships were parsed and which were inferred. Ideal for engineers navigating large monorepos, tracing dependencies, or understanding architecture.

github.com · 59 min · AI Agents · Developer Tools · Knowledge Graph
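
To make the EXTRACTED/INFERRED distinction concrete, here is a minimal sketch of the general idea, not Graphify's implementation. It swaps tree-sitter for Python's standard-library `ast` module so it runs without dependencies. The tuple layout, the `extract_edges` and `find_path` helpers, and the sample `INFERRED` edge are all assumptions made for illustration.

```python
import ast
from collections import deque

SOURCE = '''
import json

def load(path):
    return json.loads(read(path))

def read(path):
    return open(path).read()
'''


def extract_edges(source, module="example"):
    """Return (src, relation, dst, tag) tuples found by deterministic parsing."""
    edges = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                edges.append((module, "imports", alias.name, "EXTRACTED"))
        elif isinstance(node, ast.FunctionDef):
            # Record direct calls to plain names made inside this function.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.append((node.name, "calls", inner.func.id, "EXTRACTED"))
    return edges


def find_path(edges, start, goal):
    """Breadth-first search over the edges; returns a node list or None."""
    graph = {}
    for src, _relation, dst, _tag in edges:
        graph.setdefault(src, []).append(dst)
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


edges = extract_edges(SOURCE)
# Hypothetical edge of the kind a model might propose from a non-code file.
edges.append(("load", "described_in", "README.md", "INFERRED"))
extracted_only = [e for e in edges if e[3] == "EXTRACTED"]
print(find_path(extracted_only, "load", "open"))
```

Filtering on the tag before querying lets a caller restrict a path search to relationships that came from parsing alone.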

07-22

Kimi Code CLI: Terminal AI Coding Agent for Next-Gen Agents

Kimi Code CLI is an AI coding agent that runs in your terminal. It reads and edits code, runs shell commands, searches files, fetches web pages, and autonomously decides next steps based on feedback. It ships as a single binary with millisecond startup and a purpose-built TUI. It supports video input, AI-native MCP configuration, a rich plugin ecosystem, subagents for parallel tasks, lifecycle hooks, and ACP integration with editors such as Zed and JetBrains. Ideal for developers seeking a powerful, extensible AI coding companion.

github.com · 4 min · AI Agents · CLI · Developer Tools

07-16

The Short Leash AI Coding Method For Beating Fable

This post distills over a year of research on using AI agents for security-critical software. The author introduces the “Short Leash” method, which is meant only for expert developers: never enable YOLO mode, manually review every diff in the permissions prompt to keep the AI on track, and commit after each subtask to guard against regressions. It also details AI-assisted code review, pairing human and AI reviewers so that the AI catches surface errors while humans guide direction. PR authors must self-review line by line and disclose the AI models they used. According to the author, this approach beats Fable even with non-frontier models, without sacrificing quality. Targeted at senior engineers who want productivity gains without giving up understanding.

blog.okturtles.org · 7 min · AI Agents · AI Engineering · Code

07-03

Self-Healing Browser Harness That Lets LLMs Drive Any Real Browser

Browser Harness is a thin, self-healing CDP harness that connects an LLM directly to a real Chrome browser via a single WebSocket, with zero intermediate layers. When the agent needs to perform an action it hasn't seen before (e.g., file upload, cross-origin iframe interaction, drag and drop), it writes the missing helper code on the fly and saves it into an agent-workspace for reuse. The core package is roughly 1K lines, enabling complete freedom for browser automation tasks. Aimed at developers who need AI agents to perform real, unconstrained browser interactions.

github.com · 7 min · Agent Engineering · AI Agents · Browser Automation

06-30

Browser Automation CLI for AI Agents

agent-browser is a native Rust CLI designed for AI agents to automate browser interactions. It uses a client-daemon architecture where the Rust daemon directly communicates with Chrome via CDP, eliminating the Node.js dependency. The tool offers a comprehensive command set covering navigation, element interaction (via ref/CSS/XPath/text selectors), snapshots, screenshots, network interception, session management, and authentication state persistence. It includes built-in safety features like domain allowlists, action policies, and encrypted state storage. It is optimized for AI workflows with accessibility tree snapshots, annotated screenshots, and MCP server support, making it ideal for engineers building AI agents, automated testing, web scraping, or enabling LLMs to control browsers reliably.

github.com · 64 min · AI Agents · Browser Automation · CDP

06-18

A Structured Cybersecurity Skills Library Purpose-Built for AI Agents

This is not another collection of security scripts or checklists. It’s an AI-native knowledge base that encodes 754 practitioner-grade cybersecurity workflows into a structured, agent-readable format. Each skill carries YAML frontmatter for sub-second discovery and step-by-step Markdown procedures, essentially giving any LLM-based agent the decision-making playbook of a senior analyst. The library spans 26 domains—from DFIR and threat hunting to cloud security and OT/ICS—and maps every skill to MITRE ATT&CK, NIST CSF 2.0, MITRE ATLAS, D3FEND, and NIST AI RMF, making it uniquely suited for security professionals integrating AI into real operational workflows.

github.com · 28 min · AI Agents · Claude Code · Cybersecurity
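
The frontmatter-plus-procedure layout can be illustrated with a small sketch. The field names (`name`, `domain`, `attack_technique`), the sample skill, and both helper functions are hypothetical and not taken from the library. The parser handles only flat `key: value` pairs so that it needs no YAML dependency.

```python
SKILL_TEXT = """---
name: triage-phishing-email
domain: dfir
attack_technique: T1566
---
1. Collect the original message with full headers.
2. Extract URLs and attachments for analysis.
"""


def split_frontmatter(text):
    """Split a skill file into (metadata dict, Markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = lines.index("---", 1)  # closing delimiter of the frontmatter
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:])


def matches(meta, **wanted):
    """True when every requested metadata field has the requested value."""
    return all(meta.get(key) == value for key, value in wanted.items())


meta, body = split_frontmatter(SKILL_TEXT)
if matches(meta, domain="dfir"):
    print(meta["name"])
    print(body)
```

Keeping discovery on the metadata means an agent can filter many skills cheaply and load the full procedure only for the ones that match.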

06-12

How To Build AI Agents in 2026 (That Actually Work)

This article systematically deconstructs the architecture and engineering practices for building practical AI agents. It clarifies the boundaries between chatbots, AI agents, and agentic AI, emphasizing that a real agent is a system that persistently loops toward a goal rather than delivering a one-shot answer. The author explains the ReAct loop (Reasoning + Acting) and breaks down the five building blocks: the LLM as the brain, tools as hands, short-term and long-term memory, self-correcting loops, and verification. Using a case study of a startup research agent for the fitness niche, the article walks through goal setting, tool integration, loop construction, memory implementation, and the addition of a critic agent, complete with copy-paste system prompts. It highlights six common failure modes and recommends a 2026 tech stack including Claude Code, LangGraph, and MCP. The piece provides a weekend roadmap to build a basic agent from a 50-line Python script and is aimed at developers shifting from prompt engineering to designing agent systems.

x.com · 21 min · Agent Architecture · AI Agents · AI Engineering
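
The loop described in this pick can be sketched in a few lines. This is an illustration under stated assumptions, not the article's code: `scripted_model` is a hard-coded stand-in for an LLM call, and the two tools and their sample data are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (thought, action, observation)


def scripted_model(goal: str, history: List[Step]) -> Tuple[str, str, str]:
    """Stand-in for an LLM: returns (thought, action, action_input)."""
    if not history:
        return ("I need the raw numbers first.", "lookup", "gym_memberships")
    if len(history) == 1:
        return ("Now I can sum them.", "add", history[-1][2])
    return ("I have the total.", "finish", history[-1][2])


# Hypothetical tools: the "hands" the loop can use.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda key: {"gym_memberships": "120,80"}.get(key, ""),
    "add": lambda csv: str(sum(int(x) for x in csv.split(","))),
}


def run_agent(goal, model, tools, max_steps=5):
    """Alternate reasoning and acting until the model finishes or steps run out."""
    history: List[Step] = []  # short-term memory for this run
    for _ in range(max_steps):
        thought, action, arg = model(goal, history)
        if action == "finish":
            return arg, history
        observation = tools[action](arg)  # act, then feed the result back
        history.append((thought, action, observation))
    raise RuntimeError("step budget exhausted before the goal was reached")


answer, trace = run_agent("Total gym memberships", scripted_model, TOOLS)
print(answer)
```

The `max_steps` budget is one simple guard against a loop that never converges; a critic or verification step would sit between the observation and the next model call.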

06-11

The Missing Link Between Agents and Applications

This article introduces Headless Tools, a mechanism that allows agents to act directly on client-side runtimes such as browsers and desktop applications. The author argues that most current agent tools are server-side, limiting them to API calls while blocking access to browser state, device APIs, and in-app actions. Headless Tools wrap client-side capabilities like geolocation, clipboard, IndexedDB, and application-specific commands as standard tools invocable by the model. The model sees only a tool schema, while the server and client coordinate execution behind the scenes. Code examples in TypeScript demonstrate the pattern, alongside real-world use in a Slidev presentation plugin and browser-local agent memory. Privacy is improved because sensitive data can remain on-device. This is valuable for teams embedding AI agents into rich frontend contexts such as design tools, document editors, and desktop utilities.

x.com · 7 min · AI Agents · AI Engineering · Browser
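
A rough sketch of the pattern follows, written in Python for consistency with the other examples on this page rather than the TypeScript the article uses. Every name here (`READ_CLIPBOARD_SCHEMA`, `Client`, `Server`, the message shape) is an assumption for illustration. A real setup would send the message over a network channel instead of a direct method call.

```python
import json
from typing import Any, Callable, Dict, List

# What the model is shown: a schema only, with no hint of where the tool runs.
READ_CLIPBOARD_SCHEMA = {
    "name": "read_clipboard",
    "description": "Return the current clipboard text.",
    "parameters": {"type": "object", "properties": {}},
}


class Client:
    """Stands in for a browser tab or desktop app that owns the capability."""

    def __init__(self, clipboard_text: str) -> None:
        self._handlers: Dict[str, Callable[[Dict[str, Any]], Any]] = {
            "read_clipboard": lambda args: clipboard_text,
        }

    def handle(self, message: str) -> str:
        request = json.loads(message)
        result = self._handlers[request["tool"]](request["args"])
        return json.dumps({"id": request["id"], "result": result})


class Server:
    """Exposes schemas to the model and forwards calls to the client."""

    def __init__(self, client: Client) -> None:
        self._client = client
        self._next_id = 0

    def tools_for_model(self) -> List[Dict[str, Any]]:
        return [READ_CLIPBOARD_SCHEMA]

    def execute(self, tool_name: str, args: Dict[str, Any]) -> Any:
        self._next_id += 1
        message = json.dumps({"id": self._next_id, "tool": tool_name, "args": args})
        reply = json.loads(self._client.handle(message))
        return reply["result"]


server = Server(Client("draft slide title"))
print(server.execute("read_clipboard", {}))
```

The server never reads the clipboard itself; it only relays the request and the result, which is what keeps the capability on the client side.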

06-10

AI Agent Skill: Cross-Platform Social Search and 30-Day Synthesis

/last30days is an AI agent skill that aggregates the latest content from Reddit, X, YouTube, TikTok, Hacker News, and more into a 30-day briefing. It uses entity pre-research to identify key people, communities, and topics, then searches in parallel and scores by real engagement (upvotes, likes, money) rather than SEO. An AI synthesizes a cited, in-depth summary. Open-source (MIT), it supports Claude Code and 50+ agent frameworks. Ideal for engineers, PMs, and researchers needing a quick, grounded update before meetings or decisions.

github.com · 27 min · AI Agents · Open Source · Social Media

06-10

Designing loops with Fable 5: self-correction and memory in agentic workflows

The author shares two practical directions for improving agentic workflows with Anthropic's Claude Fable 5 model: self-correction loops and cross-session memory. In a Parameter Golf challenge—train the best model within a 16MB artifact in under 10 minutes on 8×H100 GPUs—Fable 5 improved the training pipeline roughly 6× more than Opus 4.7 when using Claude Managed Agents with Outcomes judged by an independent verifier sub-agent against nine checkable criteria. Fable 5 bet on larger structural changes and pushed through a quantization regression, while Opus 4.7 stuck to tuning scalar hyperparameters. For memory, the author used a SQL-based task from Continual Learning Bench 1.0 with filesystem-backed memory across agent sessions. Sonnet 4.6 only logged failures and guesses; Opus 4.7 built flagged schema references but verified only 17% of questions; Fable 5 reached 73% verification coverage in the best run and distilled learnings into general rules. Engineers interested in agent architecture and model capability boundaries will find the experiments relevant.

x.com · 5 min · Agent Architecture · AI Agents · AI Engineering

06-09

Loop Engineering: Designing the System That Prompts Your Coding Agents

Addy Osmani argues that interacting with coding agents is shifting from prompt engineering to 'loop engineering'—designing a system that autonomously discovers tasks, delegates work, and verifies results using five building blocks: scheduled automations, parallel worktrees, project skills, connector plugins, and checker sub-agents. He maps how Claude Code and Codex both implement all five, noting that the leverage point has moved from writing good prompts to architecting persistent loops. The post cautions that loops amplify existing problems: verification, comprehension debt, and cognitive surrender become sharper risks. Intended for senior engineers evaluating how to productize AI coding tools beyond one-shot interactions.

x.com · 14 min · Agent Architecture · AI Agents · AI Engineering