Brief #136
Production AI is failing not from model weakness but from context infrastructure gaps. Teams building with clear verification harnesses and selective tool exposure are shipping; those defaulting to protocol adoption without understanding context flow are accumulating invisible debt.
Verification Infrastructure Unlocks AI Problem-Solving Better Than Steering
EXTENDS tool-integration-patterns: the existing graph covers integration approaches; this entry specifies that verification infrastructure (not connection) is the bottleneck.
AI debugging requires observable, deterministic test harnesses rather than conversational guidance. Practitioners who build logging proxies and verification infrastructure solve hard problems; those trying to steer incrementally fail.
Practitioner spent 2 hours failing with incremental steering, then succeeded immediately after building logging proxies and deterministic tests. The breakthrough was verification infrastructure, not better prompts.
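The brief doesn't include the practitioner's actual harness, but a minimal sketch of the pattern might look like this: a proxy that wraps each tool call so every request/response pair becomes a logged, replayable record (all names here are illustrative, not from the source).

```python
import json
import time
from typing import Any, Callable

def logging_proxy(tool_fn: Callable[..., Any], log_path: str = "tool_calls.jsonl") -> Callable[..., Any]:
    """Wrap a tool call so every request/response pair is captured as a record."""
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        record = {"tool": tool_fn.__name__, "args": list(args),
                  "kwargs": kwargs, "ts": time.time()}
        try:
            record["result"] = tool_fn(*args, **kwargs)
            return record["result"]
        except Exception as exc:
            record["error"] = repr(exc)
            raise
        finally:
            with open(log_path, "a") as f:
                f.write(json.dumps(record, default=str) + "\n")
    return wrapped

def replays_identically(record: dict, tool_fn: Callable[..., Any]) -> bool:
    """Deterministic test: re-run a logged call and check the output still matches."""
    return tool_fn(*record["args"], **record["kwargs"]) == record.get("result")
```

The logged records double as fixtures for deterministic tests, which is what replaces incremental steering.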
Model selection based on specs failed; measuring actual tool call efficiency (context usage patterns) revealed smarter models win through better context utilization, not speed. Verification of actual behavior beats assumptions.
Linear as context→execution→verification loop. The pattern requires verification as architectural component: 'where verification happens' determines whether context compounds or resets.
MCP Token Overhead Invisible Until Measured
MCP tool schemas are prepended to every turn, consuming tokens multiplicatively. Teams accumulate servers without auditing them, degrading effective context windows without realizing it.
MCP schemas are serialized into context on EVERY TURN, not just at initialization. The overhead is multiplicative across multi-turn conversations. Teams need 2-week usage audits and project-scoped configs instead of global defaults.
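A back-of-envelope audit makes the overhead visible. This sketch assumes a rough four-characters-per-token heuristic and an invented schema; real numbers come from your own server list.

```python
import json

# Hypothetical tool schema as an MCP client would serialize it into the prompt.
tool_schemas = [
    {"name": "search_issues", "description": "Search the issue tracker",
     "inputSchema": {"type": "object",
                     "properties": {"query": {"type": "string"}}}},
    # ...every tool from every connected server is prepended the same way
]

def approx_tokens(obj: object) -> int:
    return len(json.dumps(obj)) // 4  # crude ~4-characters-per-token heuristic

per_turn = sum(approx_tokens(s) for s in tool_schemas)
turns = 30
print(f"~{per_turn} schema tokens per turn, ~{per_turn * turns} tokens over {turns} turns")
```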
Harness Design Determines Agent Performance More Than Model Choice
The 'sacred boundary' of context windows requires explicit harness design: deciding what crosses via truncation, compaction, offloading, or eviction. API contract enforcement forces this clarity; without it, agents fail at token limits.
Harness as external context manager. The boundary is sacred—what gets passed determines agent performance. Without explicit strategy (truncation/compaction/offloading/eviction), system hits API errors instead of degrading gracefully.
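As a minimal sketch of one such strategy (eviction plus offloading), assuming an invented Harness class and a crude token heuristic:

```python
from dataclasses import dataclass, field

def approx_tokens(text: str) -> int:
    return len(text) // 4  # crude heuristic; a real harness would use a tokenizer

@dataclass
class Harness:
    """External context manager: decides explicitly what crosses the boundary."""
    budget: int                                    # token budget for the next call
    history: list = field(default_factory=list)    # full conversation so far
    archive: list = field(default_factory=list)    # offloaded turns, retrievable later

    def build_context(self) -> list:
        msgs = list(self.history)
        # Explicit eviction + offloading instead of an API token-limit error.
        # Truncation or compaction (summarizing old turns) would slot in here too.
        while len(msgs) > 1 and sum(approx_tokens(m) for m in msgs) > self.budget:
            self.archive.append(msgs.pop(0))       # evict oldest, keep it recoverable
        return msgs
```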
CLI Tools Beat MCP for Development Context Access
AI manipulates data more efficiently by writing code than by ingesting structured data through protocols. Practitioners prefer giving Claude executable CLIs over MCP servers: it keeps context pristine and tasks granular.
Practitioner abandoned MCP for CLI approach. Key insight: don't serialize external data into context window; provide tools and let AI write code to fetch/process on-demand. Code generation more efficient than data interpretation.
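A sketch of what that looks like in practice, assuming a hypothetical `issues` CLI (the command and its flags are invented): the model writes a short script, and only the aggregate result enters the conversation.

```python
import json
import subprocess

# Hypothetical `issues` CLI: rather than an MCP server serializing the full
# payload into the context window, the model writes and runs a script like this.
raw = subprocess.run(
    ["issues", "list", "--state", "open", "--json"],
    capture_output=True, text=True, check=True,
).stdout

by_assignee: dict[str, int] = {}
for issue in json.loads(raw):
    who = issue.get("assignee") or "unassigned"
    by_assignee[who] = by_assignee.get(who, 0) + 1

print(json.dumps(by_assignee))  # a few dozen tokens instead of the raw payload
```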
Smarter Models Win Through Context Efficiency Not Speed
Model selection based on advertised cost/latency fails in production. Teams measuring actual tool call patterns find smarter models make fewer, better decisions—lower total cost despite higher per-token pricing.
Practitioner expected cheaper/faster model to win; measurement showed GPT-5.5 used context more intelligently (fewer tool calls per task). Smarter models compound effectiveness through better context utilization.
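The arithmetic behind that result, with illustrative numbers rather than the practitioner's actual measurements:

```python
# Illustrative numbers only; the point is to measure behavior, not read spec sheets.
models = {
    "cheap-fast": {"usd_per_mtok": 1.0, "tool_calls_per_task": 20, "tokens_per_call": 3000},
    "smart-slow": {"usd_per_mtok": 5.0, "tool_calls_per_task": 4,  "tokens_per_call": 2000},
}

for name, m in models.items():
    tokens = m["tool_calls_per_task"] * m["tokens_per_call"]
    cost = tokens / 1_000_000 * m["usd_per_mtok"]
    print(f"{name}: {tokens:,} tokens/task -> ${cost:.3f}/task")
# cheap-fast: 60,000 tokens/task -> $0.060/task
# smart-slow:  8,000 tokens/task -> $0.040/task (cheaper despite 5x token price)
```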
Context Engineering as Discipline Replacing Prompt Engineering
Production AI requires managing state, multi-step workflows, and multimodal context—not optimizing individual prompts. The shift from 'prompt engineering' to 'context engineering' reflects increasing system complexity requiring persistent state.
Explicit articulation that prompt engineering is insufficient framing for stateful, multi-step, multimodal systems. Context engineering addresses state preservation, workflow orchestration, and multi-turn complexity.
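One way to picture the shift, as an illustrative sketch (this structure is an assumption, not a published pattern): prompts become a projection of persistent state rather than the unit of engineering.

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowContext:
    """Persistent state for a multi-step task; each turn's prompt is derived
    from this state rather than hand-tuned in isolation."""
    goal: str
    facts: dict = field(default_factory=dict)       # verified findings so far
    artifacts: list = field(default_factory=list)   # file paths, images, tool outputs
    step: int = 0

    def next_prompt(self) -> str:
        self.step += 1
        return (f"Goal: {self.goal}\n"
                f"Step {self.step}. Known facts: {self.facts}\n"
                f"Artifacts available: {len(self.artifacts)}")
```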