This Huge Update Changed The Way I Use Claude Code

AI LABS| 00:10:15|Apr 14, 2026

Chapters (7)

Introducing the idea of using a combination of models instead of a single one to balance reasoning, speed, and token efficiency.

Claude Code’s adviser strategy blends Opus and Sonnet to cut tokens and boost efficiency without sacrificing capability.

Summary

In this deep dive, the AI Labs host explains Anthropic’s adviser strategy for Claude Code, showing how to orchestrate Opus, Sonnet, and Haiku to balance performance and token usage. The presenter notes that Opus is powerful but token-hungry, while Sonnet is fast yet less capable on hard tasks, and introduces an executive/adviser split that lets the smaller model do most of the work. With Sonnet as the executive and Opus as the adviser, Claude Code can tackle complex problems with fewer tokens, lowering costs while maintaining quality. The video walks through practical testing on an existing Sonnet-built app, where the adviser helped fix a real-time sync issue by pinpointing the required changes and guiding the executive’s actions. A hands-on UI rewrite using the Playwright MCP shows the adviser flagging dependency version issues before any code changes, but it also exposes a limit: the run took around 31 minutes, and large-scale changes are better handled by Opus directly. The host also applies the approach to a new feature and shows that the executive sometimes misjudges complexity and skips the adviser, so the user has to nudge it. A sponsor segment promotes Junie by JetBrains, a coding agent that works across environments. The takeaway: for token-constrained projects with mostly straightforward tasks, the adviser strategy offers meaningful gains; for complex apps with many dependencies, running Opus as the main agent remains the safer bet. The video closes with a call to support the channel.

Key Takeaways

Using a two-agent setup (executive on Sonnet, adviser on Opus) reduces token usage because Opus is only invoked when truly needed.
In Anthropic’s experiments, the adviser strategy outperformed Sonnet alone on SWE-bench while costing less than running Opus as the main agent.
For real-time sync debugging in a Sonnet-built app, the adviser provided the exact changes and the restructured logic, so that deletions finally synced across devices.
The UI rewrite with a new library showed why large-scale changes benefit from Opus directly: Opus can run work in parallel, whereas Sonnet handled everything sequentially and took around 31 minutes.
In practice, the adviser strategy shines for token-limited, mostly straightforward tasks; for complex apps with many dependencies, Opus as the main agent is often the better choice.
There are moments when Sonnet will bypass the adviser or misjudge task complexity, requiring user nudges to steer workflow.
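
The token argument behind the first takeaway can be sketched as simple arithmetic. This is only an illustration: the function names, token counts, and per-token prices below are invented placeholders, not figures from the video or from Anthropic. The only idea taken from the source is that the larger model is billed for occasional consultations instead of for every step.

```python
def adviser_setup_cost(executive_tokens, consults, tokens_per_consult,
                       executive_price, adviser_price):
    """Total cost when the large model is billed only for consultations."""
    executive_cost = executive_tokens * executive_price
    adviser_cost = consults * tokens_per_consult * adviser_price
    return executive_cost + adviser_cost


def single_model_cost(tokens, price):
    """Total cost when one model does all of the work."""
    return tokens * price


# Placeholder inputs, invented for illustration only.
SMALL_PRICE = 3  # hypothetical cost units per token for the executive model
LARGE_PRICE = 5  # hypothetical cost units per token for the adviser model

# Hypothetical session: the small model does the work and consults three times.
mixed = adviser_setup_cost(100_000, 3, 4_000, SMALL_PRICE, LARGE_PRICE)

# Hypothetical session: the large model does everything and uses more tokens.
large_only = single_model_cost(150_000, LARGE_PRICE)

print(mixed, large_only)
```

With these placeholder inputs the mixed setup comes out cheaper, but the result depends entirely on how often the adviser is consulted and how many tokens each consultation takes.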

Who Is This For?

Developers and product builders using Claude Code who want to optimize token efficiency and cost while maintaining performance, especially in apps with real-time state and UI changes.

Notable Quotes

“The adviser strategy is you give the role of executive to the Sonnet model and use Opus purely as an adviser that only gets consulted when the executive actually needs it.”

- Definition of the adviser strategy and how the two-model collaboration works.

“The executive model took in that advice and it applied those fixes directly without any additional back and forth.”

- Illustrates how adviser guidance can accelerate real debugging tasks.

“Opus is a very large and powerful model. So, it consumes a lot of tokens even for simple tasks.”

- Motivation for using Opus selectively rather than as the main agent.

“Using Sonnet with Opus as an adviser made the process much more efficient.”

- Key performance/cost benefit from the adviser setup.

“For complex apps with many connected dependencies or multiple failure points, you're better off just using Opus directly as your main agent.”

- Important caveat about when to avoid the adviser strategy.

—Important caveat about when to avoid the adviser strategy.

Questions This Video Answers

How does Anthropic's adviser strategy work with Claude Code and why does it save tokens?
When should you prefer Opus as the main agent versus using Sonnet with an adviser in Claude Code?
What are the practical steps to implement the adviser strategy in an existing Claude Code app?
Can the adviser strategy handle real-time UI changes without extra rounds of prompting?
What are the trade-offs of using smaller models like Sonnet and Haiku in multi-model orchestration?

Claude Code Opus Sonnet Haiku, Anthropic adviser strategy, token efficiency, real-time sync, Playwright MCP, Junie by JetBrains, AI product development

Full Transcript

No single Claude model is enough on its own. Opus has the reasoning but burns through your limits. Sonnet is fast but hits a wall on harder decisions. And the answer isn't picking one over the other. It's using all of them together. Now, Claude Code already does this to some extent. It orchestrates between models on its own. But Anthropic just released something that not only saves tokens but also makes smaller models almost as capable as the larger ones.
Now, when building with Claude, you might have noticed this. Whenever you hand Opus a task and it determines that the task doesn't need that much effort, it delegates it to Sonnet or Haiku in order to manage token usage properly. But there's a problem with this approach. As we mentioned in our previous video, Anthropic has been lowering the rate limits. So during peak hours, your 5-hour window fills up faster. And on top of that, Opus consumes a lot of tokens even on simple tasks, which means using Opus fills up your context limit faster.
Anthropic decided to flip the script on this, and they came out with something called the adviser strategy. The way this strategy works is that you give the role of executive to the Sonnet model and use Opus purely as an adviser that only gets consulted when the executive actually needs it. There are two agents involved. The executive is your main agent running on Sonnet, and it handles all tool calls, code changes, and user-facing output. The adviser runs on Opus, and its only job is to guide the executive when it gets stuck. The adviser never writes code or makes any changes.
When Anthropic experimented with this approach, they found that this combination outperformed Sonnet alone on SWE-bench in terms of both performance and cost. It also costs significantly less than running Opus as the main agent, because Opus only gets invoked when it actually matters, not for every single iteration.
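The executive/adviser split described above can be sketched as a small control loop. This is a minimal sketch under stated assumptions, not how Claude Code implements it: `run_task`, `StepResult`, `Session`, and both stub functions are hypothetical names, and no real model is called. In the video the executive model decides for itself when to consult the adviser; the fixed failure threshold here is only a stand-in for that judgment.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StepResult:
    """Outcome of one executive attempt (hypothetical structure)."""
    done: bool
    note: str


@dataclass
class Session:
    """Shared record of what happened, which the adviser gets to read."""
    transcript: List[str] = field(default_factory=list)
    adviser_calls: int = 0


def run_task(
    task: str,
    executive: Callable[[str, Optional[str]], StepResult],
    adviser: Callable[[List[str]], str],
    max_attempts: int = 5,
    consult_after_failures: int = 2,  # stand-in for the model's own judgment
) -> Session:
    session = Session()
    advice: Optional[str] = None
    failures = 0
    for _ in range(max_attempts):
        # Only the executive acts: it makes the changes and reports back.
        result = executive(task, advice)
        session.transcript.append(result.note)
        if result.done:
            break
        failures += 1
        if failures >= consult_after_failures:
            # The adviser only reads the conversation and returns guidance.
            advice = adviser(session.transcript)
            session.adviser_calls += 1
            session.transcript.append(f"adviser: {advice}")
            failures = 0
    return session


def stub_executive(task: str, advice: Optional[str]) -> StepResult:
    """Stand-in for the smaller model: succeeds only once it has advice."""
    if advice is None:
        return StepResult(False, "executive: attempt failed")
    return StepResult(True, f"executive: applied '{advice}'")


def stub_adviser(transcript: List[str]) -> str:
    """Stand-in for the larger model: reads the transcript, changes nothing."""
    return "restructure the delete handler"


session = run_task("fix deletion sync", stub_executive, stub_adviser)
for line in session.transcript:
    print(line)
```

The point of the structure is that the expensive model is called once in this run, after two failed attempts, while every action still goes through the executive.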
Now, you might think that we already have a lot of frameworks for building apps that are better and ready to use. So why bother with this setup? The reason is that most existing frameworks are not built with cost and token efficiency in mind. Even though they get the job done, they fall short when it comes to making Claude run longer and more efficiently, because they are primarily focused on building the app rather than optimizing for token usage. With this setup, you can build a working app using a weaker model, making the whole process far more token efficient.
And that connects back to the limits problem we mentioned earlier. We already made a video on Claude's limits and told you to switch to a smaller model to make it last longer. Here's how it connects. Sonnet consumes way fewer tokens and requires less effort than Opus to perform the same task. Opus is a very large and powerful model. So, it consumes a lot of tokens even for simple tasks. Sonnet is able to handle many of those tasks more efficiently. So, using Opus only to bridge the performance gap on harder decisions is where the real impact comes in. You're only invoking that power when you actually need it, not for every single task. This makes the overall usage more token efficient and lets you get more done within the same limits.
We share everything we find on building products with AI on this channel. So if you want more videos on that, subscribe and keep an eye out for future videos.
So we wanted to test how this actually plays out on an app that was already built using Sonnet. To use the strategy inside Claude Code, we set the advisor command with Opus 4.6 as the adviser model. Our main agent was the executive, which I had already set to Sonnet since I built the app using it. The app was supposed to have real-time sync, and while moving and resizing elements synced perfectly across sessions, deletion wasn't syncing at all.
We tried debugging this multiple times with Sonnet on its own, but the issue kept persisting no matter how many times it tried to fix it. So, after turning on Opus as the adviser, we gave Claude the prompt describing the problem. And because Sonnet had already failed multiple times, instead of taking another shot on its own, it decided to invoke the adviser this time. The adviser reviewed the conversation so far to assess the situation. It provided the exact changes that needed to be made, pinpointing where the sync logic was breaking and what specifically needed to be restructured. The executive model took in that advice and it applied those fixes directly without any additional back and forth.
We tested it across multiple devices to check the sync and found that the issue was resolved. Both ends were reflecting deletions properly as intended, even if the user had selected the item at one end while it was being deleted at the other, which wasn't the case previously. If we had tried fixing this using Sonnet alone, it would have taken more rounds of back-and-forth prompting, because Sonnet is inherently a weaker model and not capable enough to handle complex logic by itself. On the other hand, using Opus alone would have consumed far more tokens and likely wouldn't have been this fast. Using Sonnet with Opus as an adviser made the process much more efficient. So overall, this strategy helped debug syncing issues much faster than before.
But before we move forward, let's have a word from our sponsor, Junie by JetBrains. If you're a developer, you know the struggle: context switching between your terminal, IDE, and CI pipelines just to get stuff done. Most coding agents lock you into one environment or one specific LLM and call it a day. Junie CLI is different. It's an LLM-agnostic coding agent that works everywhere. Your terminal, your IDE, GitHub, CI/CD pipelines, even your task manager. One agent everywhere. Delegate real work to it.
Writing tests, building backends, refactoring, automating code reviews on every commit. Right now, JetBrains is running a free early access program, including $50 in Gemini credits to test the agent, plus BYOK support, so you can use any model you prefer. Full access to all features, early access to new ones, and direct support from the dev team shaping the product. It's simply better with Junie. Click the link in the pinned comment to join for free.
Now, we wanted to test whether Sonnet actually consults the adviser for major UI changes. We had a previously built application and we wanted to move its UI to a different library. On top of that, we wanted to make multiple UI changes in one go, which isn't normally recommended, but we wanted to see how the smaller model performs in coordination with the larger one on a bigger task. It first accessed the current UI using the Playwright MCP. Once it understood the layout, instead of jumping straight into code changes, it consulted the adviser to determine the best approach, because it was a major, critical change and might break the app if handled wrongly. The adviser reported that the new library we chose and the one already used in the project had version issues. So before any UI work could start, Claude needed to resolve these first. Sonnet handled those first, ran multiple commands to make sure the dependencies were properly applied, then checked the current state of the UI through Playwright to confirm the app was still running correctly with no client-side issues.
Once the dependencies were sorted, it started making the changes as the adviser suggested, working through each component one by one and effectively redesigning the app as a whole. The UI it created was much more interactive and looked significantly more polished than before. It still had some issues, but the overall improvement was clear. But here's where the limitation showed up. The entire process took around 31 minutes.
Opus on its own would have done this much faster because it's better at orchestrating tasks by identifying what can run in parallel and executing them at the same time. Sonnet, being a smaller model, handled everything sequentially without breaking any of the work into parallel subagents. For an app that wasn't even that complex, 31 minutes is longer than it should have been. Sonnet also handles smaller changes on its own without involving the adviser, which is the right behavior for minor tweaks. But for large-scale changes across an entire app like this, you're better off using Opus directly because that will save you significantly more time and effort.
Now, we wanted to test whether it implements a completely new feature on an existing codebase properly. We had an app already built and wanted to add another page with a different feature to it. We gave it a prompt describing what we wanted, and this time we fully expected it to use the adviser because it wasn't a simple task. But it went ahead and implemented the changes entirely on its own without consulting the adviser at all. It treated the whole thing as routine implementation work, which it clearly wasn't given the scope of the feature.
When we tested the application, we found multiple issues. If we modified something and pressed the run button, changes like heading updates or color adjustments were also reflected in components outside the preview pane, which shouldn't happen. On top of that, we wanted it to sync directly instead of requiring us to press run again after every change. So, we prompted it again and told it to use the adviser to fix these issues. Upon our prompt, it first invoked the adviser agent. The adviser looked at the implementation and identified what was actually causing both problems: the wrong component choice. It laid out what needed to change and why the original approach had introduced those issues in the first place. The executive took that guidance and applied it across the app.
When we tested it again, streaming worked correctly. All changes reflected immediately as we edited without needing to press run after every modification. The issue of changes bleeding across components was also resolved, and everything updated properly within the right boundaries.
So there are times when it works exactly as intended, but other times the executive assumes a task is small enough and decides not to consult the adviser. In those cases, you often have to nudge it yourself so it follows the intended workflow. The model doesn't always judge the complexity of a task the same way you do. And when it misjudges, you end up with bugs that the adviser would have caught from the start.
Also, if you are enjoying our content, consider pressing the hype button because it helps us create more content like this and reach more people.
With real-time distributed state involved, this approach still needed multiple rounds of prompting before everything was working correctly. The strategy helped, but it has a ceiling you should understand before committing to it for a project. For simple to medium-scale applications, the adviser strategy can save you several rounds of back and forth that you'd otherwise spend trying to push Sonnet past its limits on its own. If what you're building requires occasional deep reasoning but mostly straightforward implementation, this is a genuinely good structure for it. You can build more within your token limits without having to babysit the model through every decision or fall back to Opus for the whole session.
For complex apps with many connected dependencies or multiple failure points, you're better off just using Opus directly as your main agent. Even when Sonnet follows the adviser's guidance correctly, it can still choose the wrong implementation path because it doesn't have the reasoning depth to evaluate multiple approaches at once and weigh the downstream consequences.
The adviser helps close that gap, but it doesn't fully close it. In those cases, the back and forth can cost you more time than running Opus from the start would have. So this strategy is useful when you're working within tight token limits and the application doesn't require opus level reasoning at every step. If both of those conditions are true for what you're building, it's worth setting up. That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so by using the super thanks button below. As always, thank you for watching and I'll see you in the next one.