
Brief #65

38 articles analyzed

Practitioners are discovering that agent systems fail not from lack of model capability, but from architectural choices about context flow. The emerging pattern: successful multi-agent systems treat context management as infrastructure—using state machines, boundary objects, and explicit handoff protocols—rather than hoping models will 'figure it out.'

Parameterized Context Beats Embedded Prompts for Agent Clarity

Separating the data plane (variables/parameters) from the instruction plane (prompts) prevents context pollution in multi-step reasoning chains. Treating sub-agents as function returns rather than tool calls preserves context budgets across reasoning steps.

Refactor your agent prompts to externalize all dynamic data into runtime variables. Design sub-agents to return memoized results rather than exposing intermediate I/O to the main context window.
@_coenen: The most effective techniques are very often the most simple

Google AI engineer identifies two architectural patterns that prevent context window bloat: parameterize context via variables (not embedded text) and encapsulate sub-agent calls as function returns (not tool calls exposing raw I/O)
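The two patterns can be sketched as follows. This is a minimal illustration, assuming a hypothetical `llm()` call that takes a prompt string and returns a completion; all names here are illustrative, not from the source.

```python
from functools import lru_cache

def llm(prompt: str) -> str:
    """Placeholder for a real model call (hypothetical)."""
    return f"<completion for {len(prompt)}-char prompt>"

# Pattern 1: data plane vs. instruction plane.
# The instruction template stays static (cacheable, auditable);
# dynamic data is injected as named parameters, never pasted
# into the instructions themselves.
SUMMARIZE_TEMPLATE = (
    "You are a summarizer. Summarize the document passed in the "
    "`document` variable in three bullet points.\n\n"
    "document: {document}"
)

def summarize(document: str) -> str:
    return llm(SUMMARIZE_TEMPLATE.format(document=document))

# Pattern 2: sub-agent as a function return, not a tool call.
# The coordinator sees only the memoized final answer, never the
# sub-agent's intermediate prompts and raw I/O.
@lru_cache(maxsize=128)
def research_subagent(question: str) -> str:
    raw = llm(f"Research this question step by step: {question}")
    return llm(f"Condense to one paragraph: {raw}")  # only this escapes

def coordinator(question: str, document: str) -> str:
    finding = research_subagent(question)  # compact result only
    summary = summarize(document)
    return llm(f"Combine these:\n{finding}\n{summary}")
```

The `lru_cache` stands in for memoization: repeated sub-agent calls with the same question return the cached result without re-entering the main context.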

New way to write code: 1. Don't start by typing code...

Sequential context accumulation pattern validates separation of concerns: problem clarity (data flow mapping) happens before implementation constraints (style/structure), preventing mixed context types from degrading quality

Claude Code #4: From The Before Times

A two-part loop pattern (a review step updates context files BEFORE the next job runs) demonstrates that separating learning extraction from execution keeps learnings from being lost to context resets between sessions


Multi-Agent Coordination Requires Org Design Not Better Models

Agent systems fail at scale because they ignore organizational coordination constraints—spans of control, boundary objects, coupling levels—that humans discovered managing complexity. Passing raw text between agents loses meaning; structured boundary objects preserve intent.

Before building agent swarms, design the coordination architecture: define explicit spans of control (how many sub-agents can one coordinator manage), create standardized boundary objects (structured data formats for inter-agent handoffs), and specify coupling levels (which agents can/cannot directly communicate).
I think agentic AI would work much better if people took lessons from...

Organizational theory provides proven patterns (bounded spans of control, standardized boundary objects, intentional coupling design) that directly apply to agent coordination failures. Current systems pass raw text/code between agents, requiring repeated context reconstruction.
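The three coordination elements above can be sketched as code. This is a minimal sketch with illustrative field names, not a definitive implementation: a typed boundary object for handoffs plus an explicit coupling check on which agent pairs may communicate.

```python
from dataclasses import dataclass, field, asdict

@dataclass(frozen=True)
class Handoff:
    """Standardized boundary object passed between agents."""
    task_id: str
    intent: str              # what the downstream agent should achieve
    inputs: dict             # structured data, not pasted prose
    constraints: list = field(default_factory=list)
    produced_by: str = "unknown"

    def validate(self) -> None:
        if not self.intent.strip():
            raise ValueError("handoff must carry explicit intent")
        if not self.task_id:
            raise ValueError("handoff must be traceable to a task")

def dispatch(handoff: Handoff,
             allowed_edges: set[tuple[str, str]],
             receiver: str) -> dict:
    """Enforce coupling design: only declared agent pairs communicate."""
    handoff.validate()
    if (handoff.produced_by, receiver) not in allowed_edges:
        raise PermissionError(
            f"{handoff.produced_by} -> {receiver} is not a permitted edge")
    return asdict(handoff)  # serialized for the receiving agent
```

Because the handoff is structured, the receiving agent reconstructs intent from named fields rather than re-parsing raw text.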

Intent Summarization Prevents Context Fragmentation Across Agent Switches

Managing multiple simultaneous agents creates context fragmentation. High-level intent summaries (compressed 'what am I trying to do') preserve problem clarity across context switches more effectively than detailed state tracking.

Add a 'project intent' haiku or one-sentence goal visible in all agent contexts. Update it when direction changes. Make it the first thing agents see before receiving specific tasks.
Manager mode request for @conductor_build

Practitioner identifies that multi-agent orchestration loses high-level intent. Requests lightweight haiku bot to maintain compressed context about overall goal, surfaced before each interaction—not detailed state but meta-level summary.
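A sketch of the 'project intent' pattern described above, with illustrative names: one compressed goal statement is kept current and prepended to every agent prompt, so intent survives context switches.

```python
class IntentBoard:
    """Holds the current one-sentence project intent."""

    def __init__(self, intent: str):
        self.intent = intent

    def update(self, new_intent: str) -> None:
        self.intent = new_intent  # call whenever direction changes

    def frame(self, task: str) -> str:
        # Intent comes first; the specific task comes second.
        return f"PROJECT INTENT: {self.intent}\n\nTASK: {task}"

board = IntentBoard("Migrate billing service to the new payments API")
prompt = board.frame("Write integration tests for the refunds endpoint")
```

Every agent's prompt is built through `frame()`, so a direction change propagates with a single `update()` call instead of edits to per-agent state.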

Cache Hit Optimization Now First-Class Concern for Fast Mode Economics

Pricing changes make cache hit rates critical to /fast mode viability. Context reuse patterns must justify the cost premium, but practitioners currently lack visibility into cache effectiveness, so they are forced to optimize blind.

Before Feb 16 pricing changes: audit your request patterns to calculate cache hit rates. Structure prompts to maximize reusable prefix content. If cache hits are <40%, reconsider /fast mode economics.
@EricBuess: If you're using /fast mode think hard about cache hit optimization

Practitioner identifies under-discussed optimization: cache hit rates determine whether /fast mode economics work. Notes missing data (cache hit breakdown) is critical context needed for proper optimization. Pricing window deadline creates urgency.
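The audit step can be approximated from request logs. A rough sketch, assuming you log per-request cached vs. uncached input token counts; the field names here are hypothetical placeholders for whatever your provider's usage metadata exposes.

```python
def cache_hit_rate(requests: list[dict]) -> float:
    """Fraction of input tokens served from the prompt cache."""
    cached = sum(r["cached_input_tokens"] for r in requests)
    total = sum(r["cached_input_tokens"] + r["uncached_input_tokens"]
                for r in requests)
    return cached / total if total else 0.0

log = [
    {"cached_input_tokens": 900, "uncached_input_tokens": 100},
    {"cached_input_tokens": 0,   "uncached_input_tokens": 1000},
]
rate = cache_hit_rate(log)  # 0.45 for this sample log
if rate < 0.40:
    print("below threshold: reconsider /fast mode economics")
```

Maximizing the rate mostly means structuring prompts so the stable, reusable content forms a common prefix and per-request data comes last.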

Specialized AI Tools Maintain Better Domain Context Than General Models

Task-specific models (Codex for code) build better understanding of domain artifacts before acting, reducing downstream errors. Specialization acts as implicit context engineering by prioritizing relevant information in learned representations.

When choosing between specialized and general-purpose AI tools, favor specialized models for domains where context window must preserve deep artifact understanding (codebases, legal docs, scientific literature). Use general models for broad reasoning tasks.
[codex] always wants to code - It is called codex

Practitioner comparison: Codex builds better understanding of codebase before making changes, cleaning up Opus failures. Specialized tool maintains domain context (dependencies, patterns, constraints) better than general-purpose model.
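The selection rule above reduces to a small routing policy. A toy sketch; the model names are placeholders, not real endpoints.

```python
# Specialist models for domains where deep artifact understanding
# must survive in the context window (hypothetical names).
SPECIALIZED = {
    "code": "code-model",
    "legal": "legal-model",
    "science": "science-model",
}
GENERAL = "general-model"

def pick_model(task_domain: str, needs_deep_artifacts: bool) -> str:
    """Favor a specialist when the task hinges on artifact context."""
    if needs_deep_artifacts and task_domain in SPECIALIZED:
        return SPECIALIZED[task_domain]
    return GENERAL
```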

Failure Analysis Reveals Context Engineering Leverage Points

Systematically pushing models to breaking points reveals how context, information density, and reasoning actually work in practice. Breakpoints show where clarity degrades, identifying true context engineering constraints.

Build a failure catalog: systematically test where your AI workflows break (context length, task complexity, abstraction level). Document what context was missing or unclear at each breakpoint. Use this to guide prompt/context architecture.
@steipete: I've consistently found the best way to understand what language models can do...

Anthropic engineer shares methodology: systematic failure analysis reveals model capabilities and limitations. Failure modes contain most information about how context and reasoning work. Tests context window limits, information density limits, reasoning complexity limits.
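A minimal failure-catalog sketch for the recommendation above: record each breakpoint together with the context that was missing, then count which dimension breaks most often. The schema is illustrative.

```python
import datetime
import json

CATALOG = "failure_catalog.jsonl"

def record_failure(task: str, dimension: str, breakpoint: str,
                   missing_context: str, path: str = CATALOG) -> None:
    """dimension: 'context_length' | 'task_complexity' | 'abstraction_level'"""
    entry = {
        "time": datetime.datetime.utcnow().isoformat(),
        "task": task,
        "dimension": dimension,
        "breakpoint": breakpoint,          # where clarity degraded
        "missing_context": missing_context,
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

def top_dimensions(path: str = CATALOG) -> dict:
    """Count failures per dimension to find the binding constraint."""
    counts: dict = {}
    with open(path) as f:
        for line in f:
            d = json.loads(line)["dimension"]
            counts[d] = counts.get(d, 0) + 1
    return counts
```

Append-only JSONL keeps the catalog cheap to write during experiments and easy to grep when redesigning prompts or context architecture.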

Interactive Exploration Pattern Enables Real-Time Context Compounding

Multi-turn context enables AI to explore unknown systems while keeping high-level goals in focus. Each turn adds information and refines decision space—context compounds in real-time rather than requiring session restarts.

For exploratory tasks (system audits, migration planning, technical debt analysis), structure workflows as multi-turn conversations where each interaction adds context. Don't try to front-load all information—let context build incrementally.
@alexhillman: got a new MacBook Air and using claude code to ssh

Claude Code maintains task context across tool invocations (ssh into a remote system), validating that multi-turn context preservation enables interactive problem-solving at the system-admin level. Each ssh output builds on prior understanding without re-explanation.
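The incremental-context loop can be sketched as follows, assuming a hypothetical `llm(history)` that reads the running transcript and proposes the next probe; `run_command` stands in for whatever executes it (e.g. an ssh call).

```python
def llm(history: list[dict]) -> str:
    """Placeholder: decide the next exploration step from the transcript."""
    return f"next step after {len(history)} turns"

def explore(goal: str, run_command, max_turns: int = 5) -> list[dict]:
    """Each turn appends real output to the transcript; nothing is reset."""
    history = [{"role": "user", "content": f"GOAL: {goal}"}]
    for _ in range(max_turns):
        step = llm(history)          # decide next probe from all prior turns
        output = run_command(step)   # e.g. run an ssh command remotely
        history.append({"role": "assistant", "content": step})
        history.append({"role": "user", "content": output})
    return history
```

The key property is that `history` only grows: the goal stays at the top while each command's output narrows the decision space for the next turn.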

Role Definition as Iterable Context Variable Activates Training Data Quality

Generic roles (e.g., 'software engineer') draw from averaged training data. Specialized role definitions activate narrower, higher-quality subsets. Iterative refinement of role/expert definition through feedback loops finds optimal framing for specific tasks.

Stop using generic role definitions ('you are a helpful assistant'). Experiment with 5-10 variations of expert role framing for your core tasks. A/B test outputs. Build a library of high-performing role definitions.
here's how i get AI outputs that nobody else gets...i play with role...

Practitioner discovers role/context specificity activates different training data distributions. Pattern: iteratively refine role/expert definition through loop until finding optimal framing. Generic roles return commodity results; specialized roles return differentiated outputs.
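The A/B loop above can be sketched as a small harness. `llm` and `score` are placeholders you would replace with a real model call and a real quality metric; the role texts are illustrative.

```python
def llm(prompt: str) -> str:
    """Placeholder for a real model call."""
    return f"output:{prompt[:20]}"

def score(output: str) -> float:
    """Placeholder quality metric (swap in a real eval)."""
    return len(output) / 100.0

ROLES = [
    "You are a helpful assistant.",
    "You are a staff engineer specializing in distributed-systems "
    "failure analysis with 15 years of incident-review experience.",
]

def best_role(task: str, roles=ROLES) -> str:
    """Run the same task under each role framing; keep the top scorer."""
    scored = [(score(llm(f"{r}\n\n{task}")), r) for r in roles]
    return max(scored)[1]
```

Winning framings go into a reusable library keyed by task type, so the iteration cost is paid once per task family rather than per request.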