Glean 拾遗
Daily · timeline

A few picks a day.

Thu, Jul 2, 2026 3picks
← 07-01
Calendar ▾
2026 · 07
MoTuWeThFrSaSu ··12345678910111213141516171819202122232425262728293031
has picks today
06:00

What I Learned About Agent Skills from Building Popular Ones

做了些爆款 Skills 后,我对 Skills 的看法

The author, having built several popular Skills (PPT, social media cards, logo generator, AI desk card), argues that Agents amplify rather than erase capability gaps. A Skill is defined as a reusable capability unit that bundles expert experience, workflows, taste, and tool calls. Core insights: Skill design is externalizing human taste as constraints (e.g., no pure white/black, text must not cover faces); architecture should be 'short center, thick radius' with SKILL.md holding only high-signal flow; quality must be maintained like code, with gotchas from real failures being the most valuable; the ecosystem should present each Skill as a feature page, not a repository list; distribution relies on GitHub for cross-platform reach and content platforms for community building, creating a flywheel of articles, products, and use cases. A full lifecycle from real need to feedback iteration is proposed. The article is aimed at AI Agent developers, product managers, and content creators, offering concrete cases and actionable design principles.

06:00

Micro-Agent: Beat Frontier Models with Collaboration inside Model API

微代理:在模型API层内协作,超越前沿模型

The vLLM Semantic Router proposes a different take: a router is not just a request dispatcher but an amplifier of model capability. The core idea is to encapsulate multi-model collaboration inside a single model API call. The user sees one model endpoint (vllm-sr/auto), but behind it the router can automatically select a collaboration pattern — from cost-aware escalation (Confidence), parallel aggregation (Ratings), repeated mixture-of-model reasoning (ReMoM), disagreement-as-signal (Fusion), to budgeted micro-agent workflows (Workflows). These patterns are controlled, configurable, observable runtimes, not application glue. Benchmarks on LiveCodeBench, GPQA-Diamond, and Humanity's Last Exam show the closed-model collaboration scheme (VSR Closed) achieving 92.6%, 96.0%, and 50.0% respectively, matching or beating single frontier models like Fugu Ultra and GPT-5.5. This article is valuable because it sinks multi-model collaboration from the product or application layer down to the serving infrastructure layer, while preserving a single model identity. For engineers building inference routing, multi-model strategies, or cost optimization.

vllm.ai · 14 min · AI Engineering · Cost Optimization · LLM · Open Source · Orchestration
06:00

Getting started with loops

Claude Code 循环模式:从手动检查到定时任务的工程化指南

This article is an official engineering guide from Claude Code that systematically lays out four agentic loop patterns and their use cases. Turn-based loops are for short exploratory tasks; users encode manual verification steps into SKILL.md — e.g., asking Claude to start a dev server, take screenshots, and check the browser console. Goal-based loops, triggered by /goal, define deterministic termination criteria such as 'get the Lighthouse score to 90 or above' and force iteration until the target is met. Time-based loops come in two flavors: /loop for local polling on an interval and /schedule for cloud-triggered routines, ideal for recurring work like PR review or CI fixups. Proactive loops combine /schedule, /goal, dynamic workflows, and auto mode into a pipeline for long-running, well-defined streams of work. The article also covers code quality maintenance and token usage management: encoding conventions, using scripts instead of re-reasoning, routing routine work to cheaper models, and monitoring cost with /usage. Suitable for engineers embedding Claude Code into daily dev workflows.