Brief #74
MCP's rapid evolution is exposing a fundamental tension: protocol simplicity drives adoption, but practitioners are hitting architectural limits where context management requires visibility, persistence, and cross-environment portability that current tooling doesn't provide. The bottleneck isn't model capability—it's the infrastructure layer for context engineering.
Raw Artifacts Beat Human Summaries for AI
Practitioners are discovering that feeding AI systems raw error logs, email attachments, and unprocessed data dramatically outperforms human-written descriptions of the same problems. The setup work agents perform to process raw context costs less than the information lost to human interpretation.
A practitioner solved decade-old server issues by providing raw email attachments directly to the agent rather than describing the problems; the agent 'suffered through' the reproduction steps efficiently.
An existing codebase acts as a 'highly detailed prompt'—concrete tests + code provide better context than abstract specifications. LLMs excel when source material is dense and specific.
Developer backlash reveals a need for raw visibility into AI actions (file names, operations, line counts). Hiding this context under UI abstractions breaks trust and debugging capability.
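The pattern above can be sketched as a prompt builder that attaches artifacts verbatim instead of summarizing them. This is a minimal illustration, not any particular agent's API; `build_prompt`, its truncation limit, and the artifact format are assumptions:

```python
def build_prompt(task: str, artifacts: dict[str, bytes], max_bytes: int = 50_000) -> str:
    """Attach raw artifacts verbatim rather than summarizing them."""
    sections = [f"Task: {task}"]
    for name, raw in artifacts.items():
        sections.append(f"--- raw artifact: {name} ---")
        # Truncate if huge, but never paraphrase: the agent pays the
        # processing cost instead of losing signal to interpretation.
        sections.append(raw[:max_bytes].decode("utf-8", errors="replace"))
    return "\n\n".join(sections)
```

The design choice is that the only transformation applied is truncation; anything lossier (summaries, rewording) would reintroduce the interpretation gap the practitioners are avoiding.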
Context Persistence Breaks at Environment Boundaries
Practitioners need session portability across local/cloud, tool versions, and agent implementations, but current systems force intelligence to reset at these boundaries. The 'compounding intelligence' promise fails when context cannot cross architectural borders.
A practitioner explicitly asked how to preserve working context when switching Claude implementations. No solution exists—the context must be manually recreated.
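One workaround today is an explicit, portable session snapshot that is exported on one side of the boundary and restored on the other. The `SessionContext` schema below is hypothetical, not a feature of any Claude implementation; the fields are illustrative:

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class SessionContext:
    """Hypothetical portable snapshot: just enough state to resume
    work in another environment or agent implementation."""
    goal: str
    decisions: list[str] = field(default_factory=list)
    open_files: list[str] = field(default_factory=list)

    def export(self) -> str:
        # Plain JSON, so any tool on either side of the boundary can read it.
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def restore(cls, blob: str) -> "SessionContext":
        return cls(**json.loads(blob))
```

The point is the format, not the fields: anything written to a neutral interchange representation can cross local/cloud and tool-version borders that proprietary session state cannot.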
LLM-Generated Context Degrades Without External Validation
Feedback loops that reuse LLM outputs as context for subsequent tasks amplify noise rather than compound intelligence. Self-generated skills, summaries, or examples fail to improve task completion rates—external validation is required to prevent degradation.
Direct citation of a study showing that LLM-generated skills don't improve performance when fed back into the system. Self-generated context without validation introduces compounding errors.
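The remedy this points toward can be sketched as a validation gate: self-generated context enters the store only if independent external checks pass. The `accept_into_context` helper and its example validators are assumptions for illustration, not the study's method:

```python
from typing import Callable

Validator = Callable[[str], bool]

def accept_into_context(candidate: str, validators: list[Validator]) -> bool:
    """Admit a self-generated skill or summary only if every independent
    external check passes; otherwise discard it rather than letting the
    noise compound in later tasks."""
    return all(check(candidate) for check in validators)

# Illustrative external checks: non-empty content, no unfinished stubs.
checks: list[Validator] = [
    lambda text: len(text.strip()) > 0,
    lambda text: "TODO" not in text,
]
```

In practice the validators would be real external signals (a test suite, a linter, a human review), which is exactly the part self-generated context lacks.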
MCP Tool Search Solves Lazy-Load Context Problem
Dynamic tool loading based on task detection solves the context exhaustion problem where unused MCP tools consume tokens. This validates that the bottleneck isn't context window size—it's clarity about which tools matter for each task.
MCP Tool Search enables lazy-loading tools into context only when relevant. The system determines task intent and loads the appropriate tools dynamically, preventing token waste from unused tool descriptions.
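A toy version of the lazy-load idea, assuming a keyword-based intent match (the real MCP Tool Search matching is presumably richer; the registry contents and `tools_for_task` helper are illustrative):

```python
# Illustrative registry: tool name -> (trigger keywords, full description).
# Descriptions enter the model's context only when a task matches.
TOOL_REGISTRY: dict[str, tuple[set[str], str]] = {
    "git_commit": ({"git", "commit"}, "Create a git commit from staged changes."),
    "sql_query": ({"sql", "query", "database"}, "Run a read-only SQL query."),
    "browser_open": ({"web", "url", "browser"}, "Fetch the contents of a URL."),
}

def tools_for_task(task: str,
                   registry: dict[str, tuple[set[str], str]] = TOOL_REGISTRY) -> dict[str, str]:
    """Return only the tool descriptions relevant to this task, keeping
    every other tool's tokens out of the context window."""
    words = set(task.lower().split())
    return {name: desc for name, (keywords, desc) in registry.items()
            if keywords & words}
```

With three tools the savings are trivial; with dozens of MCP servers installed, unloaded descriptions are the dominant fixed cost the brief describes.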
Task Duration Determines Tool Context Stability Requirements
Bounded, single-turn tasks require different context management than extended, multi-turn sessions. Practitioners are discovering that tool selection should be based on 'leash length'—how long the system maintains coherence—not just raw capability.
A practitioner discovered that Claude Code works well for bounded 'scalpel' tasks, while Codex maintains coherence better across longer sessions. Tool stability varies with task duration.
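Routing by 'leash length' can be sketched as a dispatcher keyed on estimated session length. The tool names echo the anecdote above, but the threshold is a hypothetical tuning parameter, not a measured value:

```python
def pick_tool(estimated_turns: int,
              scalpel: str = "claude-code",
              marathon: str = "codex",
              leash: int = 5) -> str:
    """Route bounded tasks to the 'scalpel' tool and extended sessions
    to the tool that holds coherence longer. `leash` is a hypothetical
    threshold that would need empirical tuning."""
    return scalpel if estimated_turns <= leash else marathon
```

Estimating the turn count is the hard part; the design point is only that routing becomes a function of expected duration, not raw capability.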