context compression
66 articles · 15 co-occurring · 1 contradiction · 12 briefs
"Context compression refers to techniques that reduce the volume of information in an agent's working memory while preserving the details relevant to completing the task." — Article provides an explicit definition of context compression.
[STRONG] "burn context fast, and can loop when the cache starts clearing" — GLM 4.7 Flash exhibits rapid context exhaustion and degraded performance as its context cache fills, a constraint on its agentic capability.
"Context engineering is about designing the entire information environment around the AI. Not just what you ask, but what the AI already knows when you ask it." — Directly defines context engineering as shaping the full information environment, not just the prompt.
"compressing large amounts of data to improve efficiency" — Article explicitly identifies data compression as a context engineering technique to improve efficiency.
"We also maintain the same rolling compression system from slate V0 that let it run single sessions for as long as 2 days" — Article describes a rolling compression system as key to enabling long-running sessions.
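Slate's actual implementation is not public, but the rolling-compression idea above can be sketched as: keep the most recent turns verbatim and fold older turns into a running summary. All names and thresholds here are illustrative, and `summarize` is a crude stand-in for an LLM summarization call.

```python
def summarize(turns):
    # Stand-in for an LLM summarization call: keep only the
    # first clause of each folded turn.
    return " | ".join(t.split(".")[0] for t in turns)

def roll(history, summary, keep_last=4, max_turns=8):
    """When history exceeds max_turns, fold the oldest turns into
    the rolling summary and keep only the last keep_last verbatim."""
    if len(history) <= max_turns:
        return history, summary
    old, recent = history[:-keep_last], history[-keep_last:]
    folded = summarize(old)
    summary = f"{summary} | {folded}" if summary else folded
    return recent, summary

history = [f"turn {i}. details about step {i}" for i in range(10)]
history, summary = roll(history, "")
print(len(history))  # 4 verbatim turns remain; the rest live in the summary
```

Because the fold happens continuously rather than once at a hard limit, a session can in principle run indefinitely at a bounded context size.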
"take a large set of tokens and turn it into a smaller set of tokens that is most relevant and meaning-rich for the task at hand" — Article provides a concrete definition and implementation of context compression.
Auto-compaction is the explicit mechanism for context compression described in the tweet
Lecture explicitly focuses on context compression as a technique for managing large, complex context windows.
"Master the Context Stack: system prompts, tasks, RAG, tool outputs, and history" — Book directly teaches context stack engineering as a core architecture pattern for autonomous agents.
"AI agents need a large number of tools to complete real-world tasks, but every tool description consumes precious context space, constraining the task input space" — Article identifies context space as a critical constraint for tool-heavy agents and proposes dynamic tool discovery (a search() interface) as a remedy.
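The search() interface idea can be sketched as follows: instead of injecting every tool description into the prompt, the agent is given one search tool and loads matching descriptions on demand. The tool registry and matching logic here are hypothetical.

```python
# Illustrative tool registry; in a real agent these descriptions would
# otherwise all sit in the prompt permanently.
TOOLS = {
    "read_file":  "Read a file from disk and return its contents.",
    "write_file": "Write text to a file on disk.",
    "web_search": "Search the web and return top results.",
    "run_tests":  "Run the project's test suite and report failures.",
}

def search(query: str, limit: int = 2):
    """Return up to `limit` (name, description) pairs matching the query."""
    q = query.lower()
    hits = [(name, desc) for name, desc in TOOLS.items()
            if q in name.lower() or q in desc.lower()]
    return hits[:limit]

print(search("file"))  # only the matching descriptions enter context
```

Only the handful of descriptions returned by a call ever occupy context, so the registry can grow without shrinking the task input space.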
"compress the context. find everything necessary and gather good context." — Article explicitly recommends context compression as a practical strategy to manage complexity and prevent issues.
Best practices for avoiding context distraction involve periodically summarizing or compressing conversation history, pruning outdated details, and prioritizing recent context through scoring or re-ranking mechanisms.
"Prompt compaction = when the context window gets close to full, model generates a shorter summary" — Article describes prompt compaction as a concrete implementation of context compression.
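A minimal sketch of that threshold-triggered compaction, with a word count standing in for a real tokenizer and a placeholder string standing in for the model-generated summary:

```python
def count_tokens(text: str) -> int:
    return len(text.split())  # crude word-count stand-in for a tokenizer

def compact(messages, budget=50, threshold=0.8):
    """Once usage crosses threshold * budget, replace all but the
    latest message with a (placeholder) model-generated summary."""
    used = sum(count_tokens(m) for m in messages)
    if used < budget * threshold:
        return messages
    summary = f"[summary of {len(messages) - 1} earlier messages]"
    return [summary, messages[-1]]

msgs = ["long message " * 10 for _ in range(5)]
print(len(compact(msgs)))  # compacted down to summary + latest message
```

The essential design point is that compaction is lossy by construction, which is why several snippets below pair it with durable memory written outside the window.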
"Supports automatic compression of history to fit the model's context window" — Demonstrates automatic history compression as a mechanism to manage context window constraints.
"4 formats (YAML, Markdown, JSON, Token-Oriented Object Notation [TOON])" — Evaluating multiple file-format representations is a direct exploration of how to compress and encode structured schemas for the model's context.
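A rough illustration of why the encoding choice matters: the same records serialized as JSON versus a compact header-plus-rows layout (in the spirit of TOON, though the exact TOON syntax is not reproduced here). Character count stands in for token count.

```python
import json

rows = [{"id": i, "name": f"item{i}", "qty": i * 2} for i in range(20)]

# JSON repeats every key in every record.
as_json = json.dumps(rows)

# A tabular encoding states the keys once and then emits only values.
header = "id,name,qty"
as_table = "\n".join([header] + [f"{r['id']},{r['name']},{r['qty']}"
                                 for r in rows])

print(len(as_json), len(as_table))
assert len(as_table) < len(as_json)  # keys are not repeated per row
```

The saving grows with the number of rows, since the per-record key overhead in JSON is constant while the tabular header is paid once.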
"Performance gains in 2026 come from dynamic context selection, compression, and memory management." — Article directly lists compression as one of three core performance drivers in modern production systems.
"Summarization, compression, external memory, retrieval, and subagents are the primary techniques" — Article explicitly identifies compression as a primary technique for effective context engineering.
"One idea I had is to compress the web search results after each tool call." — Article demonstrates context compression as a practical solution for managing accumulated results.
"Based on the current task, it selects the most relevant parts from a large pool of 'context files', prioritizes and compresses them, and generates a 'Manifest'." — The context constructor component directly implements context compression through prioritization and compression as part of the pipeline.
"context processing is especially important because it decides how retrieved information is cleaned, organized, and compressed before reaching the model." — Article directly discusses compression as a core stage of context processing.
"Slate has episodic memory that actually makes sense. The system retains only the tool calls that contribute to its success. We also maintain the same rolling compression system from slate V0 that let it run single sessions for as long as 2 days." — Article pairs selective episodic memory with rolling compression to sustain long sessions.
"A 100-page construction change order parses into roughly 200,000 lines of JSON... the genuinely useful contract clauses and unit-price tables make up a tiny fraction; most of it is 'coordinate arrays' and 'metadata fields'" — Article demonstrates the problem context compression solves: stripping non-content metadata reduces ~200k tokens to the actionable content.
Component 4 explicitly names 'context reduction: clip, dedup, compress' as critical. This is an applied context compression strategy.
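The clip/dedup/compress pass named in that snippet can be sketched as a three-stage pipeline; the function bodies here are illustrative stand-ins, not the article's implementation.

```python
def clip(chunks, max_len=80):
    """Truncate each chunk to a fixed character budget."""
    return [c[:max_len] for c in chunks]

def dedup(chunks):
    """Drop exact repeats while preserving first-seen order."""
    seen, out = set(), []
    for c in chunks:
        if c not in seen:
            seen.add(c)
            out.append(c)
    return out

def compress(chunks, keep=3):
    """Stand-in for LLM summarization or relevance ranking."""
    return chunks[:keep]

raw = ["error: timeout"] * 5 + ["stack trace line\n" * 50, "ok", "done"]
reduced = compress(dedup(clip(raw)))
print(reduced)
```

Ordering matters: clipping before dedup lets near-identical long chunks collapse to one entry, and dedup before compression keeps the budget from being spent on repeats.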
"Summarization for Compression: Condense long conversations into shorter summaries." — Article explicitly mentions summarization as a compression strategy for managing context.
"The code acts as a compact plan. The model can explore tool operations, compose multiple calls, and return just the data it needs" — Code Mode exemplifies context compression by converting verbose tool interactions into a compact code plan.
"Context engineering employs four key strategies to manage the context window effectively: writing, selecting, compressing, and isolating context." — Article identifies context compression as one of four core strategies.
"compaction strategies" — Article demonstrates practical application of context compression techniques as a key strategy for scaling coding agents in production.
"large files attached to a chat can be condensed to fit in the context limit" — Demonstrates practical context compression technique in modern AI tools like Cursor.
"MUVERA (Multi-Vector Retrieval via Fixed Dimensional Encodings) is the solution to this... Applies random linear projection to compress each sub-vector (following the Johnson-Lindenstrauss Lemma to preserve distances)" — Article describes projection-based compression of multi-vector representations.
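A toy, pure-Python illustration of the compression step attributed to MUVERA: a random linear projection shrinks a vector while roughly preserving inner products in expectation (the Johnson-Lindenstrauss guarantee). The dimensions here are far smaller than anything used in practice.

```python
import random

random.seed(0)
d_in, d_out = 64, 16

# Gaussian projection matrix scaled by 1/sqrt(d_out), the standard
# JL construction.
P = [[random.gauss(0, 1) / d_out ** 0.5 for _ in range(d_in)]
     for _ in range(d_out)]

def project(v):
    """Map a d_in-dimensional vector to d_out dimensions."""
    return [sum(p_i * v_i for p_i, v_i in zip(row, v)) for row in P]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

u = [random.gauss(0, 1) for _ in range(d_in)]
v = [random.gauss(0, 1) for _ in range(d_in)]
print(dot(u, v), dot(project(u), project(v)))  # similar in expectation
```

With d_out this small the variance of the estimate is large; real systems pick d_out from the JL bound for their target distortion.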
"Let it compact. I don't know how they do it but it's great." — Author demonstrates practical use of automatic context compaction (compression) in Codex and reports zero drift over extended sessions.
"The human visual system actually processes 40 to 50 bits per second after spatial compression. Much, much less if you add temporal compression over a long time horizon." — Article explicitly discusses compression in biological perception as an analogy for context compression.
"Layered retrieval on read: pull the summary first, ask the LLM 'is that enough?', and only drill down to specific facts if not." — Article demonstrates context compression through hierarchical retrieval: summaries first, drilling into facts only when needed.
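That summary-first pattern can be sketched as below; the store contents are invented, and `is_enough` is a crude stand-in for the LLM judgment call the snippet describes.

```python
STORE = {
    "trip-2024": {
        "summary": "Week-long trip to Kyoto in April 2024.",
        "facts": ["Flight JL123 on April 3", "Hotel near Gion",
                  "Visited Fushimi Inari on April 5"],
    },
}

def is_enough(summary: str, question: str) -> bool:
    # Stand-in for asking the LLM whether the summary answers the
    # question; here, only flight-level questions require drilling down.
    return "which flight" not in question.lower()

def retrieve(key: str, question: str):
    entry = STORE[key]
    context = [entry["summary"]]
    if not is_enough(entry["summary"], question):
        context.extend(entry["facts"])  # drill down only when needed
    return context

print(len(retrieve("trip-2024", "Where did you go?")))        # summary only
print(len(retrieve("trip-2024", "Which flight did you take?")))  # drilled down
```

Most queries pay only the summary's token cost; the full fact list is charged only when the summary demonstrably falls short.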
"Large or potentially irrelevant data (such as tool outputs, chat history, terminal sessions) is converted into files or references rather than injected directly into the prompt" — The approach compresses context by turning large data into file references retrieved dynamically instead of inlined.
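A minimal sketch of that reference-passing move, assuming a character threshold for what may be inlined; paths and the handle format are illustrative.

```python
import os
import tempfile

INLINE_LIMIT = 200  # characters allowed directly in the prompt

def to_context(tool_output: str) -> str:
    """Inline small outputs; spill large ones to disk and return a handle."""
    if len(tool_output) <= INLINE_LIMIT:
        return tool_output
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.write(tool_output)
    return f"[output saved to {path}; {len(tool_output)} chars; read on demand]"

big = "line\n" * 1000
ref = to_context(big)
print(ref.startswith("[output saved to"))  # a short handle, not the payload
```

The agent later reads slices of the file through its normal file tools, so the full payload only ever enters context in the pieces that are actually needed.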
"presumably Suhail is thinking about the compaction problem as it occurs in long running agents like claude code" — Author identifies that the call-stack architecture addresses the compaction problem that arises in long-running agents.
"Memory files use a hierarchical directory structure; file names and directory levels are themselves navigation signals. The file tree is always in the system prompt, so the agent always knows what it 'remembers'" — Hierarchical directory structure with progressive disclosure (system/ for resident memory, selective loading elsewhere) reduces token load.
"The agent silently writes durable memories to disk before compaction hits. But after the window resets, the agent can't systematically browse what it flushed." — Article demonstrates a practical failure mode of compaction: flushed memories become hard to rediscover.
A useful memory system needs to handle: write control (what deserves to become a memory vs. passing noise), deduplication (collapsing repeated information into canonical facts), and reconciliation (handling conflicting or updated information).
"Handles inputs 100× beyond model context windows" — The symbolic handles and recursive approach let RLMs compress context representation, enabling processing of inputs far larger than the window.
Knowledge flowing human → agent1 → agent2 → human is a context compression pipeline. Each hop filters and encodes knowledge more efficiently than the original codebase would.
Explicitly discusses 'compressed context' as the mechanism by which orchestrator prepares subagent input; critical under bandwidth/latency constraints.
Explicitly identifies context compression as architectural decision with consequences: 'what information will be retained, what will be discarded' determines memory quality
Claude provides `/compact`, which also runs faster in CC 2.0, but sometimes I prefer to make Claude write down what happened in the current session (with some specific details) before I kill it and start a new one.
Compression reduces cost and maintains continuity but may impact quality" — Article directly discusses compression as a context management strategy within the tradeoff triangle
"In this definitive guide, we won't just explore what context is. We'll break it down like systems engineers: how to structure it, isolate it, store it, retrieve it, compress it, and shape it over time."
"With continuous background rebuilding, there's no cliff. The cache stays clean because you're always pruning slightly, invisibly, as you go." — Article explains continuous cache pruning and summarization as an alternative to cliff-edge compaction.
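The "no cliff" idea contrasts with the threshold-triggered compaction described earlier: prune a small fraction every turn instead of compacting everything at once. A minimal sketch, with an assumed pruning fraction and floor:

```python
def prune_step(history, fraction=0.1, floor=5):
    """Drop the oldest ~fraction of turns each step, never below floor."""
    n_drop = max(0, min(int(len(history) * fraction), len(history) - floor))
    return history[n_drop:]

history = list(range(100))
for _ in range(20):          # simulate 20 turns of gentle pruning
    history = prune_step(history)
print(len(history))          # shrinks gradually, no sudden collapse
```

Because each step removes only a sliver, no single turn pays the latency or quality cost of a full compaction, which is the cliff the snippet is describing.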