
Brief #43

38 articles analyzed

We're witnessing the architecture moment for AI agents: the field is shifting from 'can AI do this?' to 'how do we structure context so intelligence compounds rather than resets?' Three patterns dominate: (1) dynamic context loading is replacing static injection, (2) persistent state is becoming non-negotiable for real work, and (3) practitioners are discovering that clarity about the problem—not model capability—is the real bottleneck.

Dynamic Context Discovery Replaces Static Injection

The winning pattern is treating large/unbounded data (tool outputs, chat history, tool descriptions) as files/references, then using retrieval tools to dynamically load only necessary context. Static context injection is wasteful; dynamic loading can reduce token usage by 40-50% while maintaining quality.

Audit your agent's context injection. Identify what's being loaded statically (tool schemas, conversation history, large outputs). Convert these to external storage + retrieval tools. Measure token reduction. Prioritize MCP-heavy environments first—that's where you'll see 40%+ gains.
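The storage-plus-retrieval pattern can be sketched in a few lines of Python. `save_output` and `read_range` are hypothetical names (a real agent would expose the retrieval function as a tool schema), but the shape is the same: persist the large output, hand the model a reference plus a short preview, and page in slices on demand.

```python
import os
import tempfile
import uuid

OUTPUT_DIR = tempfile.mkdtemp(prefix="agent_ctx_")
PREVIEW_CHARS = 400  # how much of a large output enters the context directly

def save_output(text: str) -> dict:
    """Persist a large tool output to disk; return a reference, not the payload."""
    ref = f"out-{uuid.uuid4().hex[:8]}"
    path = os.path.join(OUTPUT_DIR, ref + ".txt")
    with open(path, "w") as f:
        f.write(text)
    # Only the reference and a short preview are injected into the context window.
    return {"ref": ref, "chars": len(text), "preview": text[:PREVIEW_CHARS]}

def read_range(ref: str, start: int, end: int) -> str:
    """Retrieval tool the agent calls to load just the slice it needs."""
    path = os.path.join(OUTPUT_DIR, ref + ".txt")
    with open(path) as f:
        return f.read()[start:end]

# A 100k-char tool output costs ~400 context chars until a slice is requested.
handle = save_output("x" * 100_000)
chunk = read_range(handle["ref"], 0, 1_000)
```

The token savings come from the gap between `handle["chars"]` and what actually enters the window: most large outputs are never read back in full.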
Cursor releases its latest context-engineering pattern, "dynamic context discovery"

Cursor's dynamic context discovery cut MCP token usage by 46.9% by converting static context into file-based retrieval: tool descriptions, chat history, and tool outputs are stored externally and fetched on demand.

Learn about how we use the filesystem to improve context efficiency

Cursor uses filesystem structure as a context management layer—organizing context through filesystem hierarchy reduces what must live in the context window while maintaining discoverability.

Cursor releases dynamic context discovery for all models, which reduced total...

Dynamic context selection in multi-MCP scenarios: instead of loading all tool context regardless of relevance, analyze the current query first and include only relevant context. 47% efficiency gain.


Persistent State Separates Production Agents from Demos

Agents that reset context every session are toys. Production-grade agents require persistent state (learned skills, user preferences, task history) that survives session boundaries. The 'slow initial learning → fast execution' curve is the signature of real value.

If your agent resets every session, you're building a demo. Implement persistent state storage for: learned user preferences, domain-specific rules, task history, and behavioral patterns. Start with one domain (email management, content creation) and measure improvement velocity across 5+ sessions.
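A minimal sketch of session-surviving state, assuming a plain JSON file as the store (`agent_state.json` is a hypothetical path; production systems would use a database or a memory layer such as Letta's). The point is the lifecycle: session 1 records what it learns, session N loads it before the first prompt.

```python
import json
from pathlib import Path

STATE_PATH = Path("agent_state.json")  # hypothetical location; survives sessions

def load_state() -> dict:
    if STATE_PATH.exists():
        return json.loads(STATE_PATH.read_text())
    return {"preferences": {}, "rules": [], "task_history": []}

def save_state(state: dict) -> None:
    STATE_PATH.write_text(json.dumps(state, indent=2))

def learn(state: dict, kind: str, item) -> None:
    """Record something the agent learned so later sessions start from it."""
    if kind == "preference":
        state["preferences"].update(item)
    elif kind == "rule":
        state["rules"].append(item)
    save_state(state)

# Session 1: slow — the agent asks, observes, and records.
state = load_state()
learn(state, "preference", {"tone": "concise"})
learn(state, "rule", "newsletters from *.example.com -> archive")

# Session N: fast — learned context is injected before the first prompt.
state = load_state()
system_prompt = f"User preferences: {state['preferences']}\nRules: {state['rules']}"
```

Measuring improvement velocity then reduces to tracking how much of each session's work was already covered by `state` at load time.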
@paulbettner: @Letta_AI + skills is *incredibly* effective

AI executive assistant that learns email patterns, writing style, and spam rules over time. Initial setup cost amortizes as learned context eliminates friction in subsequent interactions—exponential returns from persistent skills.

Progressive Disclosure Beats Comprehensive Documentation

Agents perform better with lightweight primary context (SKILL.md with only non-obvious information) + flat-hierarchy references for domain details. Deep nesting and comprehensive upfront docs create context pollution. The pattern: minimalism + dynamic constraint adjustment matched to task risk.

Audit your agent's primary context. Cut everything Claude already knows (general programming, common APIs). Keep only: non-obvious domain rules, your specific constraints, and validation criteria. Move comprehensive docs to references. Test: can you explain the task in <200 tokens?
Learning best practices for building "Agent Skills" from Anthropic's blog and docs

Keep primary artifact (SKILL.md) lightweight and directive. Use flat-hierarchy references only for domain-specific detail. Match constraint level to task risk: low (scripts), medium (templates), high (natural language). Embed validation loops.
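One way to sketch the layout in Python, with hypothetical file names and contents: the terse `SKILL.md` is always in context, while flat-hierarchy reference files are read only when a task actually needs the domain detail.

```python
from pathlib import Path

# Hypothetical skill layout: one terse SKILL.md plus flat reference files.
SKILL_DIR = Path("email_skill")
SKILL_DIR.mkdir(exist_ok=True)
(SKILL_DIR / "SKILL.md").write_text(
    "# Email triage\n"
    "- Non-obvious rule: invoices go to /finance, never /archive.\n"
    "- Validation: every reply must quote the original subject line.\n"
)
(SKILL_DIR / "vendor_list.md").write_text("acme-corp\nglobex\n")

def primary_context() -> str:
    """Always loaded: only the directive, non-obvious part of the skill."""
    return (SKILL_DIR / "SKILL.md").read_text()

def reference(name: str) -> str:
    """Loaded on demand, so comprehensive detail stays out of the window."""
    return (SKILL_DIR / f"{name}.md").read_text()

# The <200-token test from above, applied to the primary artifact.
assert len(primary_context()) < 200
```

Everything Claude already knows (general programming, common APIs) is simply absent; only the non-obvious rules and the validation criterion pay context rent.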

Agent Failures Are Prompt Clarity Failures

When agents don't behave as expected, the bottleneck is rarely model capability—it's unclear/misaligned prompting. The human hasn't clarified the problem in the AI's native language/format. This is the debugging heuristic: audit prompt clarity before optimizing agent logic.

Next time your agent fails, don't tweak the model or add more tools. Instead: (1) Write explicit success criteria as evals, (2) Test if Claude can pass them with current context, (3) If not, clarify the problem statement—not the solution steps. Measure: can you define success in 3 bullet points?
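The three steps can be sketched as a tiny eval harness. `run_agent` is a stand-in for your actual agent call and the criteria are illustrative, but the discipline is real: success is defined as explicit checks before any prompt or tooling changes.

```python
def run_agent(prompt: str) -> str:
    """Stand-in for the real agent call (hypothetical output)."""
    return "Subject: Q3 invoice\nFiled under /finance."

# Step 1: write explicit success criteria as evals.
SUCCESS_CRITERIA = [
    ("mentions the subject line", lambda out: "Q3 invoice" in out),
    ("files invoices under /finance", lambda out: "/finance" in out),
    ("never files invoices under /archive", lambda out: "/archive" not in out),
]

# Step 2: test whether the agent passes them with current context.
def evaluate(prompt: str) -> list:
    out = run_agent(prompt)
    return [(name, check(out)) for name, check in SUCCESS_CRITERIA]

results = evaluate("File the attached Q3 invoice email.")
failed = [name for name, ok in results if not ok]
# Step 3: if `failed` is non-empty, clarify the problem statement —
# not the solution steps, not the model, not the tool list.
```

If you cannot write `SUCCESS_CRITERIA` in three entries, the problem statement is the bug.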
@steipete: oh your agent doesn't do what you want?

Common complaint: 'agent keeps doing the wrong thing.' Root cause: unclear prompting, not model limitations. Before optimizing agent, audit: Are you specific about success criteria? Are you using language the model naturally understands?

Subagent Context Isolation Prevents Window Pollution

Delegating high-volume, low-semantic-value operations (execution traces, command outputs) to separate context windows preserves main context for reasoning. This is context budget allocation: isolate noisy subprocesses so core intelligence doesn't degrade.

Identify operations that generate verbose outputs (bash commands, API responses, log files). Route these to separate context spaces. Keep your main agent context for high-level reasoning only. Test: if your main context window fills with execution traces, you need subagent isolation.
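A rough sketch of the isolation boundary, with the two context windows modeled as plain lists: the full command output lands in the subagent space, and only a one-line summary reaches the main reasoning thread.

```python
import subprocess
import sys

MAIN_CONTEXT = []      # reasoning thread: stays small
SUBAGENT_CONTEXT = []  # noisy thread: absorbs execution traces

def run_isolated(label: str, cmd: list) -> str:
    """Run a verbose command, keep its full trace in the subagent context,
    and surface only a one-line summary to the main context."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    SUBAGENT_CONTEXT.append(result.stdout)  # full execution trace lives here
    n_lines = len(result.stdout.splitlines())
    summary = f"{label}: exit {result.returncode}, {n_lines} lines of output"
    MAIN_CONTEXT.append(summary)  # only the summary reaches the main thread
    return summary

# A command that would otherwise dump 500 lines into the main context window.
run_isolated("list-items", [sys.executable, "-c", "for i in range(500): print(i)"])
```

The main agent can still drill into `SUBAGENT_CONTEXT` via a retrieval tool when a summary flags a failure, which is the same storage-plus-retrieval move as dynamic context discovery above.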
Claude Code v2.1.1 introduces a new 'Bash' subagent

Bash subagent handles multi-step operations in a separate context window, preventing main conversation thread from accumulating irrelevant execution traces. Main thread stays focused on problem-solving; separate thread handles implementation details.