Anthropic Finally Fixed The 1M Context Window Problem
Explains that a larger context window can degrade performance earlier than expected and that compaction often worsens the issue unless managed carefully.
A million-token context window isn’t a silver bullet—context rot and poor compaction still bite Claude Code; learn concrete tactics to keep it sharp.
Summary
The AI LABS channel digs into why the promised 1,000,000-token context window for Claude Code isn't a miracle cure, drawing on an article by Tariq, an engineer working on Claude Code. Degradation starts well before the halfway mark, and compaction often makes things worse, not better. The video explains context rot, four key failure modes in long-running tasks, and practical workflow tips for working with a larger memory. It walks through how to manage context with strategies like proactive compaction, choosing between clear and compaction, and combining both with structured JSON to preserve essential state. It also highlights the role of sub agents in isolating work and preventing main-context pollution, plus rewinding as a more reliable corrective approach than simple re-prompting. The discussion ties these ideas back to Claude Code's real behavior, offering actionable steps to keep long-running tasks on track. Verdant is featured as a sponsor, but the core takeaways are about disciplined context management and mindful prompting. If you're building long-running agents, these practices are essential to avoid drift and memory corruption while leveraging a larger context window.
Key Takeaways
- Context rot begins around 300,000–400,000 tokens, even with a 1,000,000 token window, so proactive management is essential.
- Compaction is lossy and can erase important details; prefer controlled compaction with explicit instructions rather than autocompact mid-task.
- Use clear commands to start fresh sessions when tasks are unrelated, and reserve compaction for carrying forward relevant context from a previous flow.
- Sub agents isolate work in separate context windows, preventing tool calls and intermediate reasoning from polluting the main context window.
- Rewinding is superior to re-prompting: it removes bad paths while preserving a clean, correct state for continued work.
- Structured JSON-based save-and-restore workflows let you preserve the exact state you want before clearing, blending both clear and compaction effectively.
- Ask for regular recaps during long-running tasks to keep goals and constraints fresh in the working context, reducing goal drift and decision inaccuracy.
Who Is This For?
Essential viewing for developers using Claude Code and other long-running agents who want to maximize a large context window without drowning in context rot. It’s particularly valuable for AI practitioners building automation that relies on persistent state, sub-agent architectures, and disciplined prompting.
Notable Quotes
"The 1 million context window sounds like a huge upgrade, but in reality, it's way worse than most people realize."
—Sets up the core tension: bigger is not automatically better due to degradation and context management challenges.
"Context rot means the model's performance degrades with more information in its context window."
—Defines the central problem that emerges with bloated context, even with a larger window.
"Claude is actually least reliable during compaction."
—Highlights a critical caveat about the most common maintenance operation.
"A million context window also opens the door to long-running tasks without worrying too much about the context issues we used to face."
—Describes both the promise and risk of longer-running work.
"Rewinding is really important compared to simply correcting because it removes irrelevant or incorrect parts from the context window while keeping only the correct state."
—Promotes a robust corrective strategy over simple re-prompts.
Questions This Video Answers
- How does context rot affect Claude Code with a 1,000,000 token window?
- What are the best practices to manage context in long-running AI tasks with Claude Code?
- Why is compaction lossy, and how can I control it effectively in practice?
- What role do sub agents play in avoiding context pollution and memory corruption?
- What is rewinding in AI agent workflows, and how does it compare to re-prompting or correcting?
Claude Code · context window · context rot · compaction · recency bias · goal drift · memory corruption · decision inaccuracy · sub agents · rewinding (prompting)
Full Transcript
The 1 million context window sounds like a huge upgrade, but in reality, it's way worse than most people realize. And this is exactly why the engineer working on Claude Code, Tariq, wrote the article. If you think Claude Code only starts getting worse at 1 million tokens, or that 1 million is so much you don't have to worry about it, you are actually wrong. The degradation starts way earlier than halfway through the window, and the fix most people reach for, compaction, usually makes it worse. By the end of this video, you'll know exactly how to stop Claude Code from getting dumber, the same way the team at Anthropic does it.
Claude Code can feel degraded even though the models themselves are powerful. You might have noticed that it hallucinates more, has to be reminded again and again of instructions you gave earlier, and forgets those instructions in the long run. We noticed this as well when we were running longer tasks and Claude's performance felt downgraded. But there is a reason behind it. Models after Opus 4.5 all ship with a 1 million token context window instead of the previous 200,000. While this upgrade sounds like it should make most of the old issues disappear, it only sounds good in theory. Yes, you can now fit more at once in the context window than before and ground Claude with more documents and information so that it doesn't stray from the task it needs to do.
A million context window also opens the door to long-running tasks without worrying too much about the context issues we used to face. But the thing is, all of this is not entirely solved. The million-token context window is actually a double-edged sword. While it does let Claude go longer and hold more information at once, it comes at a cost: it opens the door to context rot. Context rot means the model's performance degrades with more information in its context window. With a bloated context window, it has more things to pay attention to and cannot stay focused.
And with a million-token context window, your context gets much more stuffed, which means there is far more information available to interfere with Claude's reasoning than there was with the 200,000-token window. Context rot is not something that occurs only with a highly bloated context, either. According to the creator of Claude Code, context rot actually starts happening around 300,000 to 400,000 tokens, much less than a million, at just 30 to 40% usage. So no matter the context window size, we need to take steps to prevent context rot, and knowing this will change how you work with the 1 million context window.
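As a rough illustration of those numbers, you could track a session against the zone where degradation reportedly begins. This helper is a hypothetical sketch for the arithmetic, not anything built into Claude Code, and the threshold value is just the figure quoted in the video:

```python
# Hypothetical helper: flag when context usage enters the "rot" zone
# described in the video (~300,000-400,000 tokens, i.e. ~30-40% of a 1M window).

CONTEXT_WINDOW = 1_000_000  # 1M-token window
ROT_THRESHOLD = 0.30        # degradation reportedly starts near 300k tokens

def should_compact(tokens_used: int, window: int = CONTEXT_WINDOW,
                   threshold: float = ROT_THRESHOLD) -> bool:
    """Return True once usage crosses the fraction where context rot tends to begin."""
    return tokens_used / window >= threshold

print(should_compact(150_000))  # 15% usage: well below the zone -> False
print(should_compact(350_000))  # 35% usage: inside the 300k-400k zone -> True
```

The point is not the exact cutoff but the habit: act on context size proactively rather than waiting for the window to fill.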
Now, a quick recap. The context window is everything the model sees at once: the conversation so far, the claude.md file, the system prompt, files read into the session, and every tool call output. Each prompt adds more, and once the window fills up, you summarize to continue with a fresher window, which is compaction. If you don't manage context properly, there are four ways in which your agent can fail, and this becomes even more evident and problematic in long-running agents. Context pollution is the first, which we already discussed, along with why it occurs. Goal drift is the second.
This happens when your agent strays from what it needs to do because it has too many things to focus on at the moment or, in simpler terms, it has forgotten the goals it was supposed to work toward. This might have happened often if you work with Claude Code: you want your UI to look a certain way and have already specified it, but the agent doesn't follow that and you have to remind it of the actual goal. Memory corruption is the third, and it occurs when, during execution, the agent's internal state or stored facts become incorrect and it continues acting based on that faulty state.
It is often hard to pinpoint the exact cause when agents run for long periods. It becomes unclear where the mistake originated. For example, memory corruption can look like a file being written one way by the agent itself and then modified by a sub agent that is not in the current context. The agent refers back to its own outdated memory and continues operating as if the file still exists in the same form it originally created. Decision inaccuracy is the last one. It occurs when an agent makes contradictory choices in nearly identical situations such as using one error handling pattern in one place and a different one elsewhere.
All of these issues occur when context is not managed properly, and they impact the long-term performance of agents. These are exactly the factors that most agent harnesses try to optimize for. So once you have asked Claude to do something and it has finished, there are actually five possible options for what happens next, each depending on your next prompt. If you use each one properly, the way you work with Claude can improve a lot. Though the most natural choice is to just continue, the other options actually help you manage your context more effectively.
So you need to decide carefully whether you actually want to continue in the same flow or start a new session. Once the context gets bloated, you have two ways to shed the context. And the first choice is compaction which we already explained as a summarization of the existing content. But you need to be clear about when you actually want to summarize because the summary is lossy and a lot of details that might look important to you but not important to Claude can get dropped. As a result, important context may no longer exist in the context window.
It is better to control compaction yourself instead of letting Claude hit autocompact because when it triggers mid task, the compaction becomes even messier. It tends to keep what it thinks is important and removes everything it does not think will be needed. So Claude is actually least reliable during compaction. At that point, Claude's focus is purely on summarization, and it is stripped of supporting context like the system prompt and other elements that normally make it more capable. It then relies heavily on its own assumptions about what is important, which can often lead to poor compaction decisions.
Bad compaction usually happens when the model cannot clearly determine the direction of your work. For example, if you are in a long debugging session and a warning was encountered earlier, then after autocompaction, if you ask it to fix that specific warning, it won't know which warning you are talking about. This happens because the session was focused on debugging as a whole, so only a general summary of debugging activity was retained and the specific warning was treated as noise and dropped. Recency bias makes it worse: when compaction is triggered, the prompt prioritizes preserving recent details of what was being worked on.
So older but still important information may be ignored or left out. If something was done incorrectly earlier, the model may no longer be aware of it after compaction. It only has access to the transcript-level summary, not the full state of the project, because tool call history is not fully preserved during compaction. You can set flags to control when autocompaction happens, but this is something you should actively manage more often. Trigger compaction around the 300,000 to 400,000 token range mentioned by the creator, because that is typically where context rot begins to appear. And always provide a compaction instruction yourself, because Claude responds more carefully when explicit instructions are included.
Tell it which decisions, constraints, and discovered issues to carry forward so it knows what to prioritize. So you should hit compact when you actually want context from the previous task flow to carry into the new window, not when you want a fresh start. But before we move forward, let's have a word from our sponsor, Verdant, an AI-powered platform that helps builders turn ideas into shipped products. You're mid-build, finally in the zone, and your credits run out. Your AI stops dead, momentum gone. Every AI coding tool does this to you, but Verdant doesn't. When your credits hit zero, just switch to eco mode, a zero-cost mode that keeps your AI running without spending another dollar.
No interruption, no top-up, no lost momentum. You just keep building. And when you do have credits, you're not stuck picking between Claude, GPT, or Gemini. Verdant's multiplan mode runs all three together like a decision committee, giving you better plans without the model anxiety. Want even more flexibility? BYOK lets you plug your own API key directly into Verdant. Use your company's Claude or GPT credits, with no platform charges; you just pay for what you actually use. You get 100 credits and 7 days to test it out. Click the link in the pinned comment and try Verdant for free.
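Picking the compaction thread back up: Claude Code's `/compact` command accepts optional instructions, so a controlled, explicit compaction of the kind described above might look something like this (the file names and details are purely illustrative):

```text
/compact Preserve: the unresolved TypeError in the auth module, the decision
to use JWT instead of server-side sessions, the constraint that the public
API must stay backwards-compatible, and the list of files already refactored.
Drop exploratory dead ends and raw tool output.
```

The key is naming the decisions, constraints, and discovered issues you need carried forward, rather than letting the summarizer guess.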
The second choice is to use the clear command which removes all context and starts a new session with an empty context. Unlike compaction, nothing is carried forward and only what you provide again remains in the context window. Just like compaction, you should not use clear only when you run out of context. If you are switching to an unrelated task, it is straightforward to clear the session and start fresh. So the previous task does not interfere with the new one. For example, if you ask the agent to write test cases for an application you are working on, you may not want it to retain details about how those test cases were generated.
Instead of continuing debugging within the same context, you can start a fresh session. This way, Claude can work on debugging your application more effectively without being influenced by how it previously generated the test cases. Now, there is another approach you can use, which is combining both clear and compaction. This allows you to retain only what you want and discard everything else. The idea is to use a structured JSON format that captures the information you want to preserve. You can create a custom command so that you can reuse it frequently. In that command, you can include a JSON structure that contains the full task, current state, constraints, discovered issues, and any other relevant details you want Claude to retain, and then instruct it to save this to a file.
This approach lets you get the best of both methods. Once you run the command, it will analyze the entire conversation and the current state of the application, something that normal compaction does not reliably preserve, and save everything into the file as specified. A schema is much stricter than prose, so when Claude follows a defined structure, it can represent what is important more consistently and accurately. After the information has been saved to the file, you can safely use the clear command to remove everything from the context window. Then you can start a new session and instruct Claude to refer back to that document to gather context and implement the next task from there.
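A minimal sketch of that save-and-restore state file, assuming a hypothetical schema — the file name and field names below are illustrative choices, not a format Claude Code prescribes:

```python
import json
from pathlib import Path

STATE_FILE = Path("task_state.json")  # hypothetical file your custom command writes

# Illustrative structure for the state you want to survive a /clear.
state = {
    "full_task": "Migrate the REST endpoints to async handlers",
    "current_state": "3 of 7 endpoints migrated; tests passing",
    "constraints": ["keep public API unchanged", "Python 3.11+ only"],
    "discovered_issues": ["rate limiter assumes sync middleware"],
    "next_steps": ["migrate the users endpoint", "update middleware"],
}

# Before /clear: have the agent save this structure to the file.
STATE_FILE.write_text(json.dumps(state, indent=2))

# After /clear, in the fresh session, you would prompt something like:
# "Read task_state.json and continue the task from its next_steps."
restored = json.loads(STATE_FILE.read_text())
print(restored["current_state"])
```

In practice you would wrap the "analyze the conversation and fill this schema" step into a reusable custom command, then clear and point the new session at the file.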
As mentioned earlier, as context grows, the agent's focus can drift because there is simply more information competing for attention, and this is even more noticeable with the million-token context window. Instead of continuously pushing forward in a long-running task, it is useful to pause periodically and ask the agent to recap what it has done so far, along with the constraints and other important factors. This practice helps address both the goal drift and decision inconsistency problems we discussed earlier: it reinforces the original goals and brings key details back into the more recent part of the context window rather than leaving them buried in older sections.
This helps ensure that important information stays fresh in the agent's working context and is less likely to be lost during compaction or diluted over time, so the agent remains more aligned with the task it is supposed to perform and maintains better consistency in its decisions. Also, if you are enjoying our content, consider pressing the hype button, because it helps us create more content like this and reach more people. Sub agents might not look like much, but they are actually a very important way of managing context. Each sub agent is its own independent instance with a dedicated context window, full tool access, and the permissions it needs to complete its task.
They execute the assigned work in that separate context provided by the parent agent and then return only the final output to the main context. So all the tool calls a sub agent made, the files it read, the web searches it performed, and its intermediate reasoning stay within the sub agent's own context and do not pollute the main agent's context window. This is an effective way to reduce context rot. Research tasks are the clearest example: the agent goes through multiple websites, pages, and sources, and you do not want all of that raw information continuously added to the main context window.
In such cases, a sub agent can handle the work independently and return only the final synthesis. The key question to ask yourself before using a sub agent is whether you will need access to the intermediate steps again or whether you only care about the final output. Claude Code also manages sub agent orchestration on its own and can spawn agents to handle tasks automatically, but sometimes you need to explicitly specify in your prompt that you want the work delegated to a sub agent so it is handled in isolation. So if you are working on research, refactoring, summarization, or document generation tasks, you should consider separating them out to sub agents instead of your main agent.
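An explicit delegation prompt for the research example above might look something like this (the topic and wording are illustrative; Claude Code handles how the sub agent is actually spawned):

```text
Use a sub agent to research current best practices for Postgres connection
pooling across several reputable sources, and return only a short synthesis
with links. Keep the raw page contents and search results out of the main
conversation.
```

The last sentence is the important part: you are telling the agent which intermediate material you never want surfacing in the main context window.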
Last but not least, rewinding is really important compared to simply correcting, because it removes irrelevant or incorrect parts from the context window while keeping only the correct state. Whenever Claude runs into a mistake, people often try to re-prompt it to take another approach, but a better option is to rewind and then provide the correct direction in the new prompt. You can use the rewind command or press the escape key twice to do this. After rewinding, you can also summarize from that point, so the conversation up to that stage is preserved as useful context while the parts that led to the issue are removed.
Rewinding has multiple benefits. First, it cleans the context window by removing the part where things went wrong, which results in a cleaner compaction summary that preserves only correct implementations. Even if you pin important information, you avoid carrying forward sections where the agent deviated from the goal, which helps reduce both decision inconsistency and goal drift. If you are using sub agents, rewinding ensures they receive a cleaner and more accurate context when tasks are handed off, so incorrect approaches are not included in their working state. Similarly, if you use a handoff command, it captures the correct state of the application instead of a corrupted or outdated one.
So build the habit of rewinding instead of repeatedly correcting forward; that way the agent consistently works from a clean and accurate state through the whole session. That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so by using the super thanks button below. As always, thank you for watching, and I'll see you in the next one.