# Context Rot

> **TL;DR.** Context rot is the measurable degradation in large-language-model output quality as input length grows, even when the prompt stays well under the advertised context window.

- **Category:** AI / LLM Research / Context Engineering
- **Stage:** established
- **Age:** 324 days
- **Origin date:** 2025-07-14
- **First detected:** 2026-04-15
- **Canonical URL:** https://earlyterms.com/term/context-rot
- **Sources:** 6 primary URLs

## Definition

Context rot is the measurable degradation in large-language-model output quality as input length grows, even when the prompt stays well under the advertised context window. Models don't process the 10,000th token as reliably as the 100th: performance drops when distractors are present, when the question is semantically distant from the relevant text, and even on trivial copy-the-text tasks.
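A copy-the-text probe of this kind can be sketched in a few lines. The snippet below is a minimal illustration, not the Chroma harness: `build_copy_input`, `copy_score`, `probe_lengths`, and `toy_model` are hypothetical names, the scoring rule (position-wise word match) is an assumption chosen for simplicity, and `toy_model` is a stand-in that should be swapped for a real LLM call.

```python
from typing import Callable, Dict, List


def build_copy_input(num_words: int, word: str = "apple") -> str:
    """Build a text of `num_words` repeated words that the model must copy."""
    return " ".join([word] * num_words)


def copy_score(expected: str, actual: str) -> float:
    """Fraction of word positions copied correctly, in [0, 1].

    Missing or extra words count against the score because the
    denominator is the longer of the two word sequences.
    """
    exp_words = expected.split()
    act_words = actual.split()
    longest = max(len(exp_words), len(act_words))
    if longest == 0:
        return 1.0
    matches = sum(1 for e, a in zip(exp_words, act_words) if e == a)
    return matches / longest


def probe_lengths(model: Callable[[str], str], lengths: List[int]) -> Dict[int, float]:
    """Score the model's copy at each input length."""
    scores: Dict[int, float] = {}
    for n in lengths:
        text = build_copy_input(n)
        scores[n] = copy_score(text, model(text))
    return scores


def toy_model(text: str, reliable_words: int = 100) -> str:
    """Stand-in for a real LLM call (assumption): copies only the
    first `reliable_words` words and silently drops the rest."""
    return " ".join(text.split()[:reliable_words])


if __name__ == "__main__":
    for n, score in probe_lengths(toy_model, [50, 100, 200, 400]).items():
        print(n, score)
```

With a real model in place of `toy_model`, a score that falls as the length grows, while every input is still far below the context window, is the pattern the definition describes.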

The term was coined in the July 14, 2025 [Chroma technical report](https://www.trychroma.com/research/context-rot) by Kelly Hong, Anton Troynikov, and Jeff Huber, which evaluated 18 frontier models (Claude Opus 4, GPT-4.1, Gemini 2.5 Pro, Qwen3-235B, and 14 others) and showed that the assumption of uniform processing across the context is a myth. The Hacker News launch thread reached 260 points, and the companion [chroma-core/context-rot](https://github.com/chroma-core/context-rot) replication repo has 247 stars.

## Example

The Chroma benchmark found that even on simple text replication, models like GPT-4.1 grew less reliable as inputs lengthened. Models did worse on haystacks that kept their logical structure than on shuffled versions, and lower semantic similarity between the question and the relevant information sharply accelerated the decay. Practitioners now routinely clear context between logical steps rather than let long chats accumulate.
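The practice of clearing context between steps might look roughly like the sketch below. It is an illustration under stated assumptions, not a method from the report: `run_steps_with_reset` and `toy_model` are hypothetical names, the "note" carried forward is a crude truncation standing in for a real summary, and `toy_model` stands in for an actual LLM call.

```python
from typing import Callable, List, Tuple


def run_steps_with_reset(
    model: Callable[[str], str],
    steps: List[str],
    max_note_chars: int = 200,
) -> Tuple[List[str], List[int]]:
    """Run each step in a fresh prompt, carrying forward only a short note."""
    note = ""
    outputs: List[str] = []
    prompt_sizes: List[int] = []
    for step in steps:
        # Fresh prompt per step: no accumulated transcript, only the note.
        prompt = f"Note from previous step: {note}\nTask: {step}"
        prompt_sizes.append(len(prompt))
        output = model(prompt)
        outputs.append(output)
        # Crude stand-in for summarization (assumption): keep only the
        # last `max_note_chars` characters of the output.
        note = output[-max_note_chars:]
    return outputs, prompt_sizes


def toy_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): a long, fixed-size answer."""
    return "x" * 1000


if __name__ == "__main__":
    _, sizes = run_steps_with_reset(toy_model, ["plan", "draft", "review", "ship"])
    print(sizes)  # prompt length per step stays bounded instead of growing
```

The design choice is that prompt size is capped by `max_note_chars` plus the current task, so later steps never see the full transcript of earlier ones. The trade-off is that anything the note omits is lost to the next step.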

## Analogy

Like working from a grocery list that has grown too long: by item 80 the earlier items blur, and by item 200 you're forgetting eggs at random even though they're written down.

## Why it's emerging now

Chroma's July 14, 2025 report named the phenomenon, tested 18 frontier models, and drew 260 Hacker News points and 247 stars on the replication repo. Nine months later the term is common shorthand in long-context launch debates; for example, retrospectives on Opus 1M context cite it as the reason bigger windows don't help linearly.

## Related terms

- *parent:* context engineering
- *related:* context window
- *related:* lost in the middle
- *related:* NoLiMa
- *related:* hallucination
- *related:* needle in a haystack
- *related:* RAG
- *child:* context compaction
- *related:* claude-opus-4-7
- *related:* agent-loop

## Sources

1. [Chroma Research — Context Rot technical report](https://www.trychroma.com/research/context-rot)
2. [chroma-core/context-rot replication repository](https://github.com/chroma-core/context-rot)
3. [Hacker News launch discussion (260 points)](https://news.ycombinator.com/item?id=44564248)
4. [ZenML LLMOps Database summary](https://www.zenml.io/llmops-database/context-rot-evaluating-llm-performance-degradation-with-increasing-input-tokens)
5. [Nilenso — Fight context rot with context observability](https://blog.nilenso.com/blog/2025/10/29/fight-context-rot-with-context-observability/)
6. [Chroma announcement on X](https://x.com/trychroma/status/1944835468551708905)

---
_Generated by EarlyTerms · https://earlyterms.com/term/context-rot_