Brief #75
Context engineering bottlenecks are shifting from model capability to infrastructure: practitioners are abandoning specialized tools for general-purpose models with persistent execution environments, while standardization efforts (WebMCP, MCP Apps) make context accessibility, not context quality, the new frontier.
Terminal-native workflows preserve context better than web UIs
Practitioners are canceling specialized AI app subscriptions in favor of Claude Code + persistent infrastructure because terminal environments maintain richer semantic context (file system, git, environment state) across sessions than browser-isolated tools that fragment and reset context.
A practitioner abandoned a specialized React builder for Claude Code because terminal access preserves continuous context across the file system, git, and deployment; Lovable resets context between interactions
Moving Claude Code to persistent Railway infrastructure transforms ephemeral sessions into stateful workflows where context accumulates across 24/7 execution rather than resetting
Successful long-running agent tasks require externalizing state (code reviews, CI/CD checks) so agents compound on prior work rather than redoing decisions—terminal workflows enable this better than isolated UIs
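The "externalize state so agents compound" pattern above can be sketched minimally in Python. This is an illustration under assumptions, not any particular tool's implementation: the file name, the `decisions` field, and the function names are all invented for the example.

```python
import json
from pathlib import Path

def load_state(path: Path) -> dict:
    """Read persisted review state; start fresh if nothing was externalized yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"decisions": {}}

def record_decision(path: Path, state: dict, file: str, verdict: str) -> None:
    """Write the decision to disk so the next session compounds on it."""
    state["decisions"][file] = verdict
    path.write_text(json.dumps(state, indent=2))

def needs_review(state: dict, file: str) -> bool:
    """Skip files a prior session already ruled on."""
    return file not in state["decisions"]
```

Because the state lives on disk rather than in the model's context window, a fresh session (or a 24/7 agent after a restart) picks up exactly where the last one stopped instead of re-deriving its decisions.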
WebMCP eliminates screenshot parsing as context strategy
Browser-native tool registration (WebMCP) replaces context-heavy browser automation by making websites expose structured tool schemas directly, eliminating the need for agents to parse HTML/screenshots and turning every website into a deterministic API for AI agents.
Websites register getDresses(query, filters) with natural language descriptions and parameter schemas, giving agents perfect context about callable functions—no ambiguity, no screenshot parsing needed
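The kind of structured schema a site might register can be sketched as a JSON-Schema-style tool definition. This is hypothetical: the WebMCP registration API itself is not shown, and everything below except the `getDresses` name from the example above (fields, enum values, description text) is invented for illustration.

```python
# Hypothetical tool schema a shopping site might expose to agents.
get_dresses_tool = {
    "name": "getDresses",
    "description": "Search the store's dress catalog by query and filters.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Free-text search terms"},
            "filters": {
                "type": "object",
                "properties": {
                    "size": {"type": "string", "enum": ["XS", "S", "M", "L", "XL"]},
                    "max_price": {"type": "number"},
                },
            },
        },
        "required": ["query"],
    },
}

def describe_tool(tool: dict) -> str:
    """Render the schema as the deterministic contract an agent sees."""
    params = ", ".join(tool["inputSchema"]["properties"])
    return f'{tool["name"]}({params}): {tool["description"]}'
```

The point of the schema is that the agent never has to infer parameters from pixels: names, types, and required fields are declared up front.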
Prompt minimalism unlocks capable models suppressed by scaffolding
Practitioners report that removing 'training wheels' prompts optimized for weaker models (anti-laziness instructions, forced tool syntax) dramatically improves results with Claude Sonnet 4.6, revealing that verbose context scaffolding becomes bottleneck clutter as model capability increases.
A practitioner got better results by removing anti-laziness prompts and forced tool syntax that were necessary for weaker models but suppressed Sonnet 4.6's capability
Agent coordination requires context isolation, not centralization
Practitioners successfully scaling to 50+ parallel agents use per-agent sandboxed contexts with MCPs as protocol layer, revealing that context contamination across agents—not centralized state management—is the actual bottleneck in multi-agent systems.
Orchestrating 50+ agents requires isolated sandboxes per agent to prevent context cross-contamination while MCPs provide shared interface for tool access
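The isolation structure above can be sketched as one sandbox object per agent. A minimal sketch under assumptions: the class and field names are invented, and real deployments add process or container isolation on top, with a shared MCP client serving as the common tool interface.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class AgentSandbox:
    """Isolated per-agent context: a private workdir and a private history."""
    agent_id: str
    workdir: Path = field(default_factory=lambda: Path(tempfile.mkdtemp()))
    context: list = field(default_factory=list)

    def observe(self, message: str) -> None:
        # Observations land only in this agent's history, never a shared one,
        # so one agent's noise cannot contaminate another's reasoning.
        self.context.append(message)

def spawn_agents(n: int) -> list[AgentSandbox]:
    """Each agent gets its own sandbox; nothing is centralized."""
    return [AgentSandbox(agent_id=f"agent-{i}") for i in range(n)]
```

The design choice worth noting: the shared layer is the protocol (tool access), while all accumulated context stays per-agent.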
Resource-constrained agents need control theory, not just prompts
Production agents exhausting system resources (disk, memory) reveal that naive "cleanup when low" prompts fail: practitioners are building PID controllers with Bayesian cost-weighting and circuit breakers, because agent reliability under resource pressure requires control-theory feedback loops, not better instructions.
Agent disk exhaustion required multi-layer safety (six veto points), PID controller with EWMA for 30-min prediction, and Bayesian scoring reflecting actual costs of wrong deletions—prompts alone failed
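The control-loop idea above can be sketched as an EWMA-smoothed PID signal plus a cost-weighted veto. This is a toy illustration, not the practitioner's implementation: the gains, the 0.5 threshold, and the single veto shown (of the six mentioned) are all invented for the sketch.

```python
class DiskPressureController:
    """Control-loop cleanup policy: EWMA smooths the free-disk signal,
    a PID term sets cleanup aggressiveness (constants are illustrative)."""

    def __init__(self, target_free_gb: float, kp=0.8, ki=0.1, kd=0.3, alpha=0.2):
        self.target = target_free_gb
        self.kp, self.ki, self.kd, self.alpha = kp, ki, kd, alpha
        self.ewma = None          # smoothed free-space estimate
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, free_gb: float) -> float:
        """Return cleanup aggressiveness in [0, 1] from the smoothed signal."""
        self.ewma = free_gb if self.ewma is None else (
            self.alpha * free_gb + (1 - self.alpha) * self.ewma)
        error = self.target - self.ewma       # positive => below target
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        signal = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(0.0, min(1.0, signal / self.target))

def should_delete(aggressiveness: float, wrong_delete_cost: float,
                  bytes_reclaimed: float) -> bool:
    """Cost-weighted veto (one of several layers): expected benefit
    must beat the expected cost of a wrong deletion."""
    if wrong_delete_cost >= bytes_reclaimed:
        return False
    return aggressiveness > 0.5
```

The contrast with a prompt-only approach is structural: the controller reacts to a measured signal every tick, and the veto layer can refuse an action regardless of how aggressive the control signal is.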
MCP tool granularity defines what agents can reason about
Azure's decision to decompose its platform into 40+ granular MCP tools (vs. monolithic APIs) reveals that tool design, meaning how finely you break capabilities into named, schema-defined functions, directly determines whether agents can map natural language queries to actions without the context explosion of trial-and-error discovery.
Breaking Azure into 40+ tools categorized by problem domains (compute, storage, databases) makes tool capability explicit and reduces context needed for discovery—agents don't explore, they select
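The "select, don't explore" behavior can be sketched with a domain-categorized registry. Everything here is hypothetical: the tool names and keyword lists are invented and are not the actual Azure MCP tool catalog.

```python
# Hypothetical registry illustrating domain-categorized tools.
TOOL_REGISTRY = {
    "compute":   ["vm_list", "vm_start", "vm_resize"],
    "storage":   ["blob_upload", "blob_list"],
    "databases": ["sql_query", "cosmos_get_item"],
}

def select_tools(query: str) -> list[str]:
    """Select, don't explore: map query keywords straight to a domain's tools."""
    keywords = {
        "compute":   ["vm", "virtual machine", "cpu"],
        "storage":   ["blob", "file", "upload"],
        "databases": ["sql", "query", "table", "cosmos"],
    }
    q = query.lower()
    for domain, words in keywords.items():
        if any(w in q for w in words):
            return TOOL_REGISTRY[domain]
    return []
```

Because each tool is narrow and named, the agent's context holds a short candidate list per domain instead of the transcript of a trial-and-error exploration.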
Agents need memory reorganization as a skill, not just accumulation
Static memory architectures hit performance walls in long-running agent sessions—Letta's demonstration of agents restructuring their own memory (flat → hierarchical) reveals that context management itself should be delegated to agents as an executable skill, not pre-engineered by developers.
Agents can request and execute memory reorganization (flat → nested hierarchies) as a skill when their memory becomes unwieldy—mirrors how humans 'organize notes' when context gets messy
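The flat-to-hierarchical restructuring can be sketched as an executable skill the agent invokes on its own memory. A minimal sketch under assumptions: the entry fields (`topic`, `subtopic`, `text`) are invented for illustration and are not Letta's memory format.

```python
from collections import defaultdict

def reorganize_memory(flat: list[dict]) -> dict:
    """Memory-reorganization skill: regroup a flat list of memory entries
    into a topic -> subtopic hierarchy once the flat form gets unwieldy."""
    tree: dict = defaultdict(lambda: defaultdict(list))
    for entry in flat:
        tree[entry["topic"]][entry["subtopic"]].append(entry["text"])
    # Collapse the defaultdicts into plain dicts for storage.
    return {topic: dict(subs) for topic, subs in tree.items()}
```

The key point is that this runs as a tool call the agent chooses to make, rather than a structure the developer pre-engineered, mirroring a human deciding it is time to reorganize their notes.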