Brief #43
We're witnessing the architecture moment for AI agents: the field is shifting from 'can AI do this?' to 'how do we structure context so intelligence compounds rather than resets?' Three patterns dominate: (1) dynamic context loading is replacing static injection, (2) persistent state is becoming non-negotiable for real work, and (3) practitioners are discovering that clarity about the problem—not model capability—is the real bottleneck.
Dynamic Context Discovery Replaces Static Injection
The winning pattern is treating large/unbounded data (tool outputs, chat history, tool descriptions) as files/references, then using retrieval tools to dynamically load only necessary context. Static context injection is wasteful; dynamic loading can reduce token usage by 40-50% while maintaining quality.
Cursor's dynamic context discovery cut MCP token usage by 46.9% by converting static context into file-based retrieval: tool descriptions, chat history, and tool outputs are stored externally and fetched on demand.
Cursor uses filesystem structure as a context management layer—organizing context through filesystem hierarchy reduces what must live in the context window while maintaining discoverability.
Dynamic context selection in multi-MCP scenarios: instead of loading all tool context regardless of relevance, analyze the current query first and include only relevant context, yielding a 47% efficiency gain.
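The file-based retrieval pattern above can be sketched in a few lines of Python. `ContextStore`, `put`, and `get` are hypothetical names for illustration, not Cursor's actual API:

```python
import tempfile
from pathlib import Path

class ContextStore:
    """Illustrative sketch: park large tool outputs on disk and keep only a
    short reference in the context window; dereference on demand."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, payload: str) -> str:
        """Store a large payload; return the lightweight reference that
        actually enters the prompt."""
        (self.root / f"{key}.txt").write_text(payload)
        return f"ref://{key} ({len(payload)} chars, load with get)"

    def get(self, key: str) -> str:
        """Retrieval tool the agent calls only when it decides the full
        payload is relevant to the current query."""
        return (self.root / f"{key}.txt").read_text()

store = ContextStore(Path(tempfile.mkdtemp()))
ref = store.put("build-log", "error: undefined symbol _foo\n" * 500)
print(ref)  # the ~45-char reference, not the ~14 KB log, occupies context
```

The savings come from the asymmetry: the reference costs a handful of tokens regardless of payload size, and most payloads are never dereferenced.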
Persistent State Separates Production Agents from Demos
Agents that reset context every session are toys. Production-grade agents require persistent state (learned skills, user preferences, task history) that survives session boundaries. The 'slow initial learning → fast execution' curve is the signature of real value.
AI executive assistant that learns email patterns, writing style, and spam rules over time. Initial setup cost amortizes as learned context eliminates friction in subsequent interactions—compounding returns from persistent skills.
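A minimal sketch of state that survives session boundaries. `AgentMemory` and the JSON file layout are illustrative assumptions, not any product's storage format:

```python
import json
import tempfile
from pathlib import Path

class AgentMemory:
    """Illustrative persistent state: learned rules survive across sessions
    instead of resetting, so later runs skip the slow-learning phase."""

    def __init__(self, path: Path):
        self.path = path
        self.state = json.loads(path.read_text()) if path.exists() else {}

    def learn(self, key: str, value) -> None:
        self.state[key] = value
        self.path.write_text(json.dumps(self.state))  # write-through to disk

    def recall(self, key: str, default=None):
        return self.state.get(key, default)

path = Path(tempfile.mkdtemp()) / "memory.json"

# Session 1: slow initial learning (an explicit user correction).
AgentMemory(path).learn("spam_rule", "archive anything from newsletters@")

# Session 2: a fresh process recalls the rule with zero re-teaching.
print(AgentMemory(path).recall("spam_rule"))
```

The second `AgentMemory(path)` is a new object in what could be a new process; the recall working anyway is the whole point of the pattern.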
Progressive Disclosure Beats Comprehensive Documentation
Agents perform better with lightweight primary context (SKILL.md with only non-obvious information) + flat-hierarchy references for domain details. Deep nesting and comprehensive upfront docs create context pollution. The pattern: minimalism + dynamic constraint adjustment matched to task risk.
Keep the primary artifact (SKILL.md) lightweight and directive. Use flat-hierarchy references only for domain-specific detail. Match constraint level to task risk: high constraint (exact scripts) for fragile operations, medium (templates), low constraint (natural-language guidance) for flexible tasks. Embed validation loops.
Agent Failures Are Prompt Clarity Failures
When agents don't behave as expected, the bottleneck is rarely model capability—it's unclear/misaligned prompting. The human hasn't clarified the problem in the AI's native language/format. This is the debugging heuristic: audit prompt clarity before optimizing agent logic.
Common complaint: 'agent keeps doing the wrong thing.' Root cause: unclear prompting, not model limitations. Before optimizing the agent, audit: Are you specific about success criteria? Are you using language the model naturally understands?
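That audit can be made mechanical. The checks below are a crude keyword heuristic of my own devising, purely to illustrate the idea, not a published linter:

```python
# Crude illustrative heuristic: cue phrases whose absence often signals a
# clarity gap. The cue lists are assumptions, not an established checklist.
CLARITY_CHECKS = {
    "success criteria": ("done when", "success", "should pass"),
    "output format": ("return a", "format", "json", "diff"),
    "scope limits": ("only", "do not", "never"),
}

def audit_prompt(prompt: str) -> list[str]:
    """List clarity gaps to fix before blaming model capability."""
    p = prompt.lower()
    return [gap for gap, cues in CLARITY_CHECKS.items()
            if not any(cue in p for cue in cues)]

print(audit_prompt("Fix the bug in my code."))
# → ['success criteria', 'output format', 'scope limits']
print(audit_prompt("Fix the failing test in auth_test.py only. "
                   "Done when pytest passes. Return a unified diff; "
                   "do not touch other files."))
# → []
```

The second prompt passes not because it is longer, but because it states a success criterion, an output format, and scope limits in the model's native terms.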
Subagent Context Isolation Prevents Window Pollution
Delegating high-volume, low-semantic-value operations (execution traces, command outputs) to separate context windows preserves main context for reasoning. This is context budget allocation: isolate noisy subprocesses so core intelligence doesn't degrade.
Bash subagent handles multi-step operations in a separate context window, preventing main conversation thread from accumulating irrelevant execution traces. Main thread stays focused on problem-solving; separate thread handles implementation details.
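A toy version of that isolation boundary, using Python's `subprocess` to stand in for the bash subagent; the summary string is the only thing that returns to the main thread (the function name and summary format are mine, not a specific product's API):

```python
import subprocess

def bash_subagent(commands: list[str]) -> str:
    """Run a noisy multi-step shell task in isolation; keep the full
    transcript out of the main context and return only a compact summary."""
    transcript, failures = [], 0   # transcript lives only in this scope and
    for cmd in commands:           # never enters the main conversation thread
        proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        transcript.append(f"$ {cmd}\n{proc.stdout}{proc.stderr}")
        if proc.returncode != 0:
            failures += 1
    return f"ran {len(commands)} commands, {failures} failed"

# The main thread's context grows by one short line, not by hundreds of
# lines of build output.
print(bash_subagent(["echo configuring", "echo building", "false"]))
# → "ran 3 commands, 1 failed"
```

This is context budget allocation in miniature: high-volume, low-semantic-value output is spent inside the subagent's scope, and only the decision-relevant signal crosses the boundary.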