BREAKING: Claude Code source leaked
The host notes this is not an April fools joke and introduces the Claude Code leak as a major event in AI tooling history.
Theo uncovers a rare Claude Code source leak, digs into source maps, conspiracies, and what Anthropic should do next to address the fallout.
Summary
Theo’s birthday livestream kicks off with a blunt reality check about a massive Claude Code leak. Theo explains how Anthropic has traditionally kept Claude Code private, and how a leaked source map could expose most of the original code. He dives into the mechanics of source maps, code obfuscation, and why including source content in maps is dangerous for closed-source projects. The breakdown covers how the leak happened, possible causes (including debug logging added to investigate rate-limit issues), and why this isn’t necessarily malicious intent but a packaging mistake that spiraled. Theo weighs conspiracy theories—from intentional leaks to a Bun bug—and systematically debunks them while highlighting what the reveal means for open-source vs. closed-source tooling. He also previews what’s inside the leaked files, including feature ideas like Buddy, dream mode, coordinator mode, and ultra features, and discusses their feasibility and potential impact. The video shifts to practical takeaways: Anthropic should consider open-sourcing Claude Code or at least publish a transparent roadmap, address the leak-driven hype, and engage the community openly. He closes by urging a human, transparent response over legal posturing, and hints at how engineers on Claude Code could finally share their work publicly to restore trust.
Key Takeaways
- Source maps can unintentionally expose the original source code when used to map obfuscated JavaScript back to TypeScript in Claude Code.
- Anthropic has traditionally treated Claude Code as private ‘secret sauce,’ and the leak challenges the rationale for keeping the code closed-source.
- The leak triggered DMCA actions and npm removals, illustrating tensions between open access and corporate IP protections.
- Estimated Claude Code source contains around 390,000 lines of TypeScript code (not counting npm packages), suggesting a substantial codebase has been exposed.
- Feature ideas surfaced in the leak (Buddy, dream mode, coordinator mode, ultra plan/review) illustrate ambitions that may be in flux or scrapped post-leak.
- Open-sourcing Claude Code or providing a clear roadmap would likely reduce misinformation, calm speculation, and improve community trust.
- The conversation emphasizes humane, transparent communication from Anthropic rather than legal heavy-handedness to recover reputation.
Who Is This For?
Essential viewing for AI/ML engineers and product teams who build on top of CLI harnesses like Claude Code. It’s also a must-watch for open-source advocates tracking the ethics of leaking, IP, and community trust in AI tooling.
Notable Quotes
"This is one of the biggest leaks in the AI world's history. It's actually absurd that this happened."
—Theo underscores the magnitude of the leak and sets the stage for the discussion.
"Source maps are the method of doing that linking of the code you are sending back to the original source."
—Explains the core concept behind how leaks happen via source maps.
"If you actually think this was intentional, I have a couple bridges for sale."
—Theo doubts malicious intent and critiques conspiracy theories.
"Just open source it. And to be clear, I don't think they have to do this immediately."
—Offers a practical path for Anthropic toward open sourcing or at least sharing a roadmap.
"Be humans about it. Just come out here and talk about it with the community."
—Advocates for transparent, human-facing communication from Anthropic.
Questions This Video Answers
- How did Claude Code source map leaks occur and what can be done to mitigate them in the future?
- Why is open sourcing Claude Code controversial and what are the pros and cons for Anthropic?
- What impact does a Claude Code leak have on AI harnesses and open-source tooling compatibility?
- Could GrowthBook, as used by Anthropic, influence feature flagging and transparency in AI tools?
- What should Anthropic say or publish in a roadmap if they decide to open source Claude Code?
Tags: Claude Code, Anthropic, Claude Code source map, open source vs closed source, DMCA, npm packaging, GitHub forks, AI harnesses, OpenAI Codex
Full Transcript
Before we start, I want to make sure y'all know this is not an April Fool's Day joke. I know the timing is a bit absurd, but like this isn't. What this is is the craziest birthday gift I've ever received. Yes, it is actually my birthday when I'm filming this. And right before going to bed the night before, I saw the craziest leak happen. One of the biggest labs just had all of their code leak for their agentic harness for coding with AI. Yes, the Codex source, you say: "Wait, oh yeah, Codex by OpenAI was already open source." What we're actually here to talk about is Claude Code, because Claude Code isn't open source.
Anthropic has historically been uh not great about how they've handled the source code for Claude Code. They've been very strict trying to keep it private. Even though there is a GitHub repo for Claude Code, it does not include Claude Code itself. It just includes plugins and skills and a few other random things. They've been protecting Claude Code for a while, referring to it as their secret sauce. But they've also been using Claude Code to build Claude Code, which means dumb mistakes happen a lot. And one of those dumb mistakes is including the source maps. If you're not familiar, we'll break down what a source map is and how it gets included in something like this.
But there's a lot more to dive into here. From the thousands of DMCA requests Anthropic has sent to the conspiracies I already see people spreading to all the cool things we can learn from this source to most importantly how Anthropic can handle this going forward. This is one of the biggest leaks in the AI world's history. It's actually absurd that this happened. I'm going to do my best to cover it and hopefully not get sued in the process. But if I'm going to need lawyers, I'm going to need to pay them. So, we'll take a quick break for today's sponsor.
I don't know about y'all, but I've been wrangling a lot of slop lately. Both the PRs I'm filing with my own agents, but also the amount of stuff that's coming in through all the other people trying to contribute to my projects. Getting through it all is not easy, and I wish I had a bit more confidence when I looked at those PRs. Oh, I could have just been using Greptile. These guys get code review. Not only is their agent incredible at giving feedback in really quick, simple, digestible ways, they've also given us a ton of tools to integrate it in other places and to easily use that feedback in the tools we already like.
There's a ton of tools that have an "open in Cursor" button, but I haven't been using Cursor a whole lot recently. In fact, I kind of hopped between things, and a lot of those don't have a way to pass information to them. Well, they didn't until now. Greptile just released Fix. That doesn't mean that they're going to use Twitter to fix your code. Believe me, that does not work great. It means that you can use whatever tool you want. But how does that work with Claude Code or Codex? It works because you set up their npm package locally as a bridge on your machine, and that allows it to trigger, pass, and do whatever is needed to get the context to your agent.
I just installed the CLI. It took literally under half a second. Let's "fix in Codex." That is kind of nuts. It already submitted. It's already running in literally one click. That was real time. No editing there. I haven't actually seen one of these "fix in whatever" tools that fast ever before. Yep. I'm blown away. This is my real reaction. I did not know it was that useful. That is something I'm actually going to go set up and use after this. Stop reading slop and get back to work at soyb.link/grapile. As I was saying before, there's a lot to dive into on this one.
I want to start with how this happened specifically. We need to talk about source maps. If you already know what a source map is, feel free to skip along a bit, but I think this part will be useful regardless. And I'm going to have a little bit of history as to how this happened here as well. I'm going to start with a statement that sounds obvious, but will help you understand all of this. Your browser can't run TypeScript. Browsers run JavaScript, not TypeScript. Same with things like Node. Same with things like Bun. Same with most things that run JavaScript code.
There are some that are able to parse out the TypeScript parts, but you're supposed to turn the TypeScript into JavaScript. If I go to some random site, in this case my site, and I type `const value1 = 1` in the console, that works. But if I do `const value2: number = 2`, I get an uncaught SyntaxError, because this syntax that we use for TypeScript is not syntax that the browser is compatible with. This is one of the many reasons that we have build steps in our JavaScript pipelines. The code you write is not the same as the code users run the majority of the time.
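You can reproduce that console experiment anywhere there's a JS engine. A minimal sketch, using `new Function()` to parse strings the way a browser console would:

```javascript
// Plain JavaScript parses fine:
new Function("const value1 = 1;");

// A TypeScript type annotation is just a syntax error to a JS engine:
let error = null;
try {
  new Function("const value2: number = 2;"); // the ": number" part is TS-only
} catch (e) {
  error = e;
}
console.log(error instanceof SyntaxError); // true
```

Same outcome in Node, Deno, or a browser: the engine rejects the annotation at parse time, before any code runs.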
Other languages are traditionally compiled, where the code you write becomes a binary that runs native processes. That is not the case in JavaScript. It's an interpreted language. What that means is that there is code being sent to the user, and you could just write vanilla JS and send it with no additional build steps. But for many reasons, we build our JavaScript code. Whether it's turning the TypeScript into JavaScript, whether it's bringing in additional modules, whether it's adding features that aren't in JS yet and compiling them in, whether it's minification or obfuscation, there are a lot of reasons that we transform our JavaScript into smaller, simpler, dumber, more minified code that isn't the original code.
So, just as a really simple, dumb example of obfuscation, minification, uglification, all the different things you can do in the JavaScript world, here's one demo of some very simple example code: value one, "hello", value two, "world", console log. When you obfuscate this, no one can read it. Obviously, not everyone is trying to intentionally obfuscate their code. They're just using uglification and minification to make the code that they ship smaller. But if you're really trying to protect it, you're probably doing things like this. There are some problems with this, though. In particular, when you want to debug. If this is the code you're shipping to your users, hell, if anything other than this is the code you're shipping to your users and they hit a bug, trying to trace back to where that bug came from is really hard.
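Here's a sketch of that before/after. This is hand-minified for illustration; real output from a tool like Terser or esbuild would look even denser, but the idea is the same: identical behavior, unreadable names.

```javascript
// What you write:
const original = () => {
  const value1 = "hello";
  const value2 = "world";
  return value1 + " " + value2;
};

// What a minifier might ship (renamed, collapsed, one line):
const m = () => "hello" + " " + "world";

// Behavior is unchanged; readability is gone.
console.log(original() === m()); // true
```

Multiply this by hundreds of thousands of lines and a stack trace pointing at `m` tells you nothing, which is exactly the debugging problem source maps exist to solve.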
If you turn hundreds of thousands of lines of code into three lines of code, error messages no longer really mean anything. So it's very common to do some type of process in order to get back the original source when an error happens. But in order to do that, you need to be able to map a given thing from the original code to something in the obfuscated code, and more importantly, vice versa. I need to be able to know that these first characters here are directly mapped to this, and that this part is directly mapped to that. You need to have that mapping in order to get usable error messages, better logs, and all the things you need to solve bugs.
Source maps are the method of doing that linking of the code you are sending back to the original source. It is the mapping between that original code and the source that was used to create that code. So if you were to install Claude Code now and go inspect the binaries, it is JavaScript, but that JavaScript is not really readable JavaScript. So if we look at the actual official Claude Code package that you can download on npm today, it includes a lot of things, but the actual core JavaScript here, the cli.js file, you'll notice none of this is particularly readable.
For things that need to have formatting, like new lines in strings, there will be a lot of new lines. But if you were to get rid of those, this would effectively just be one or two lines of code. Very, very, very long lines, but just a couple nonetheless. This cli.js file, which is pure text, is 13 megabytes of just JavaScript. And that's this obfuscated JavaScript that nobody can actually read. One of the common solutions for trying to get good errors out of stuff like this is to host the source map somewhere hidden in the cloud. Products like Sentry that are used for logging and managing your errors have the ability to upload your source map to them so your users can't see the source map, but once an error is reported to Sentry, they can do the linking on their side and still give you the better errors.
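To see why shipping the map is so dangerous, here's what a source map actually looks like. This one is minimal and hand-written (real ones are generated by bundlers), but the shape follows the source map v3 format. The `sourcesContent` field is the problem: it embeds the entire original source verbatim.

```javascript
// A minimal source map object, per the v3 format.
const sourceMap = {
  version: 3,
  file: "cli.js",
  sources: ["src/index.ts"],
  // Optional, but commonly emitted: the FULL original source, verbatim.
  sourcesContent: ["const value: number = 1;\nconsole.log(value);"],
  names: [],
  mappings: "AAAA", // VLQ-encoded position mappings (illustrative value)
};

// Anyone who downloads the .map file can just read the TypeScript back out:
const recovered = sourceMap.sourcesContent[0];
console.log(recovered.includes(": number")); // true: original TS, fully recovered
```

No decompilation, no reverse engineering. If the map ships with `sourcesContent` populated, the source ships with it.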
So, how does this all relate to today's leaks? Well, as you can guess, in order for the source map to be able to link between the code that your users get and the original code, the source map effectively has to include all of the source. So, why would that ever get included in Claude Code? Well, first off, it's worth noting this is not the first time this has happened. Anthropic actually, in one of the earliest releases of Claude Code, accidentally included the source, and this has led to them DMCAing more GitHub repos than, as far as I know, any other company in history.
They've sent hundreds of these requests to take down all of the repos that were mirroring the source code that they leaked through their own source maps, through their own package. Yes, they published this in their own package. So, if you were doing what I just did, but yesterday, by downloading the Claude Code tar file off of npm, it would have included a source map folder in here with pretty much all of the source code. So, how the hell did that happen? Well, I wish I was a little bit more timely, because I filmed a video two days ago about the crunch on the rate limits that currently exist within Claude Code, because they made changes and people are hitting rate limits aggressively.
Multiple employees at Anthropic posted updates about how they're seeing these higher rate limit hits, more so than expected. They are investigating. They'll share more when they have an update. It is my assumption that they wanted to get better logs in the production builds of Claude Code in order to see why people were hitting these limits. In their attempts to improve the logs they were getting, to get these better errors so they could hopefully figure out where things were going wrong, it does not surprise me that they would have accidentally included the source maps. I have no proof that this is the case.
I have no inside info here. I have nothing that you guys don't have. I experienced this myself when I was trying to build Claude Code locally. The reason I had a problem was that the leaked source maps included a link to the Claude Agent SDK 0.2.88, which was also released last night. And when they pulled the latest Claude Code build, by attacking npm until they agreed to take it down, they also took down the latest Agent SDK, which was included in the source. So, I couldn't do the npm install because it was linking to a package that no longer existed.
So, I had to downgrade. And when I tried doing the install, the installer would just hang because the package and the lock were pointing towards things that didn't exist. It's just funny to point out that the reason npm doesn't allow these things is the same thing that caused the error when I was trying to use the stuff that they were trying to keep off of npm. It's all weird, funny circles. I've had a lot of fun digging into this. More importantly, I've had a lot of fun running it. Yes, I have a locally working Claude Code build.
This is not as easy to make as you might guess, because a lot of the packages that it relies on are also unreleased closed-source packages that we effectively had to rebuild ourselves. Thank you to Ben for all of the help with this. Apparently Ben's gone even further and already got GPT models working in it as well. Oh man, he also got Doom working inside of it, too. Yeah. Oh man, Ben, why are you like this? He's got Prime beat at that point. Honestly, I haven't really looked at the code yet, which probably is good for me for liability reasons.
We'll do a little bit of playing in a bit, though. Anyways, as I was saying, I think I've adequately covered the source map portion, but now we need to talk about all of the conspiracy theories. I have seen so many of these going around. The first one that I'm seeing the most by far is that this was somehow intentional. I have a lot of reasons to believe that is not the case. The easiest being that the original post that had an R2 file with the source zipped up, the one I downloaded last night, is no longer available.
Wait, what? Oh, it was taken down. Did it get untaken down? Interesting. I wonder if somebody fought this at Cloudflare because this was taken down before I went to bed last night. Very interesting, actually. That might change things a bit. There's also the history of the DMCA requests from Anthropic, but I'm guessing that this one went big enough and far enough that they know better than to pretend that this can be suppressed. At this point, it absolutely cannot be. But the biggest issue is their own philosophy around Claude Code. I reference this clip a lot, but I feel like it's more important now than ever.
They never wanted to release Claude Code in the first place. And we were having this debate. We're like, is this, you know, secret sauce? Are we sure we want to give it to people? Because this is the same tool everyone at Anthropic uses every day. And yeah, I think it turned out to be kind of the right decision, cuz it makes people more productive and people like it. They were legitimately considering not releasing it because internally it was helpful, and they didn't want to give to the world something that was a potential wedge they had internally.
Particularly silly because, as far as I know, Claude Code was the fifth or sixth CLI agentic coding tool. So yeah, it is what it is. Oh boy, it looks like the DMCAs have started. People are getting DMCA'd just for forking the official Claude Code GitHub repo, which, reminder, does not include the source code of Claude Code, but people forking it are getting DMCAs now. Yeah, lots of people are getting these. Hilarious. So yeah, Anthropic is speedrunning the world record for most erroneous DMCA requests sent now as well. So if you actually thought this was intentional, there you go.
It is not. And the fact that they also nuked the release on npm, clearly they don't want this out. If you actually think this was intentional, I have a couple bridges for sale. We should definitely chat. And the next conspiracy I saw was that this might be a bug from Bun. Bun has a bug where, when you use the Bun serve command to host web code, it sometimes serves source maps in production. Not a great bug. Definitely a thing that needs to be fixed, but this is for web apps that are hosted by Bun, not for apps that are bundled and use the Bun runtime locally.
Since Bun is used to bundle this package and run it on your machine, not to host it on a server, this is absolutely not the source of the bug. And Jarred, who is the creator of Bun and also an employee at Anthropic now, since they were acquired, points out that Claude Code does not use Bun.serve, so this could not be the reason for the leak. It is also worth noting that this is the only comment from an Anthropic employee about the leak so far that I have been able to find. Thankfully, there's a lot of people trying to push the limits of copyright right now by rewriting the leaked source in other languages, which as a derivative work might be legal.
I am not a lawyer. I do not know any of this very well, so don't take my word for this. Very interesting to see that this project already has 57k forks and 54k stars. They're going to try and DMCA the [ __ ] out of this. There's also a lot of people filing PRs on the Claude Code repo trying to add the Claude Code source now that we all kind of have it, which I think is hilarious, but also, don't spam open source repos like this. It's not a good look. It causes as many problems as it solves.
It is also funny seeing my PR here that has not been responded to yet. And we have one last, final conspiracy: competitors will use this to make their harnesses better. I've seen so many people posting like this. In particular, people saying, "Wow, Open Code's about to get a hell of a lot better, isn't it?" Well, about that. Two days ago, I recorded a whole video about AI harnesses. It's not out yet. It'll come out soon, I promise. But one of the key points in this video is how bad some of the harnesses are. The example here is benchmarking Cursor versus other harnesses, specifically the ones that are provided by the actual company making the model.
So, with Gemini, this would be the Gemini CLI, and it went from a 52% score to 57. GPT 5.4 bumped from an 82 to an 88. But the most interesting by far is Opus, which scored a 77% on Matt's benchmark when using Claude Code, and it got bumped all the way to 93% when using the harness from Cursor. So, to be very, very clear here, the Claude Code source code is not particularly useful unless you really, really suck at writing things for agents. And even if that is the case, you're probably better off looking at a real open source project like Open Code or Codex CLI or Gemini CLI or Pi or any of the many other open source options.
In fact, if we look at Terminal-Bench, you'll see that Claude Code is all the way down here at 39th place. Yes, there are 39 harness-model pairs that outscore Claude Code. And if we filter this to just be Opus, Claude Code is still in last place among harnesses using Opus in Terminal-Bench. It is legitimately the worst harness by far. I've been saying this for a while, but I hope you guys understand and will listen to me this time. Not only is Claude Code not a good thing for others to reference, the opposite is actually true.
One of the funniest things I've realized thus far is that when you search for Open Code in the repo for Claude Code, you will find multiple instances of them referencing the Open Code source in order to match Open Code's behaviors for things like scrolling. So if you're curious who's copying from whom, it's not the open source people copying the closed source people. I promise you, generally speaking, the closed source options tend to be worse. And this is absolutely the case, which is why the closed source options are just blatantly copying things from the open source ones.
You know, Dax could probably have a field day with this now that I think about it. Anyways, now that we have debunked the major conspiracies, the ones that have been frustrating me the most, it's time to do the thing that you all are probably here for: looking at the code itself. While the code was originally in a source map format, it has since been converted by many different people into zips and repos you can find all over the web that have pretty much all of the source. There are some catches here, though, specifically sub-packages.
It is very common in large monorepos to have things broken up into sub-packages that are tagged workspace:*. This means that it should grab that package from the internal vendored version that exists in the same repo, or workspace in this case. It's also worth noting that, generally speaking, package.json files aren't particularly protected. They're usually included in the package anyways. So, this is probably safe for me to show. And you know what? I'll fight it in [ __ ] court if they try to go at me for just looking at the package.json and the names of the packages that are included in the bundle.
If they go after me for this, [ __ ] them. We'll fight it. Since these packages don't exist on npm, there's a bit of a danger here. Somebody noticed that these were included, and they went and registered them on npm with a disposable email address. They are definitely being squatted for malicious uses. So if you do clone this source, be very, very careful, because building it without the proper protections to make sure you're not installing these packages is dangerous. Again, a clean install would fail because they're tagged workspace:*, which doesn't exist outside the workspace, but if your tooling blindly goes and installs the latest public version, you're [ __ ]. Be very careful.
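To make the danger concrete, here's a small sketch of how you might scan a package.json for workspace: dependencies before installing, since those are exactly the names a squatter could have registered on the public registry. The package names here are made up, not the real ones from the leak.

```javascript
// Hypothetical package.json shape; "internal-tools" stands in for the
// real (undisclosed) internal package names.
const pkg = {
  name: "example-cli",
  dependencies: {
    "internal-tools": "workspace:*", // internal, vendored in the monorepo
    "react": "^18.0.0",              // normal public dependency
  },
};

// Any dep using the workspace: protocol must resolve locally. If a tool
// falls back to the public registry for these names instead, you may
// pull a squatted, potentially malicious package.
const riskyIfInstalledPublicly = Object.entries(pkg.dependencies)
  .filter(([, range]) => range.startsWith("workspace:"))
  .map(([name]) => name);

console.log(riskyIfInstalledPublicly); // ["internal-tools"]
```

This is the classic dependency-confusion setup: internal names leak, someone registers them publicly, and any build that resolves them from the registry is compromised.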
We had to strategically recreate these dependencies in order to make this work, which, as I mentioned before, Ben has already done. And work it does. As you can see, my beautiful Claude Code instance, now in a bright pink, as it always should have been. But more importantly, with the change to hide my [ __ ] email address when I open it. I still cannot fathom why they ever thought it was a good idea to show your email every time you open Claude Code. It's such an absurd level of incompetence that it makes me lose all faith in Claude Code's, like, user experience direction.
There is no single thing that has caused me to leak my email address more than Claude Code. They added a custom flag to hide it, but I can't pass that environment variable reliably in the dev builds and have accidentally not used it in the past, too. Nah, just nobody needs to see their email every time they [ __ ] open Claude Code, guys. Come on. Anyways, now you see this is actually working on a version that is no longer published, and I can make real changes to Claude Code. Pretty cool, right? Like, how fun is that?
That I can build and play with Claude Code even if I'm not supposed to. And as mentioned before, Ben already got it working with GPT models. And you can tell because of the beautiful UI it's making. Hilarious. So, what else can we learn from this code now that we have it running? Well, I actually asked Claude Code to break down the most interesting things, and it gave me some really good insights. I also asked my lawyer about this before doing it. His name is Claude. He's a little too friendly in my opinion, and he had a lot to say, specifically that we're probably safe covering this due to the nature of the leak going as far as it has and this being published by them.
Likely we are safe. So, let's see all the fun unreleased features that were hidden inside of the source code. The first, and sorry to spoil your April Fool's Day joke, Anthropic, is Buddy, a companion that would hatch inside of Claude Code between April 1st and 7th. So you'd have a little guy running around in your Claude Code instance. Cute. They're probably not going to ship this anymore due to the leak. And then we have dream mode, which is very interesting: auto memory consolidation. The point of dream mode, supposedly, according to this analysis of the source, is to spin up background agents to automatically review past sessions and consolidate memories while you're away, in order to make it so that Claude Code behaves more how you ask it to without having to ask.
There's also coordinator mode. This one is very interesting. We'll talk more about it later. The TLDR for now is that it will spin up multiple workers and coordinate them in parallel. So workers will get their own full tool access, but with specific instructions on what to do. So you can effectively have one Claude Code spin up five Claude agents to go do other things. And then we get some very interesting ones with ultra plan and ultra review. This is similar to ultrathink, when you tell the model to think longer than it normally would. But ultra plan is primarily for running Claude Code remotely, so that you can spin up an agent to do a long, complex plan and then, I'm guessing, pull down the plan and run it locally or run it in the cloud.
And then ultra review, which is very similar, but it's an automated code review using remote agents with billing controls. If you don't remember, they announced Claude Code code review a while back, saying it would average around $25 per PR, which is an insane price. This is probably meant to be part of that. Teleport's already out. It lets you send your session across the network to another device. So, if you want to keep working on your existing thing that you were doing in Claude Code, but move it to your phone, it goes via the Claude Code for web back end.
Yes, Claude Code, the CLI, is just called Claude Code. Claude Code, the thing you can use on the web, is called Claude Code for web. And the Claude Code on your phone is also Claude Code for web. Yeah, maybe they need better names. Voice mode's already out, too. And then auto mode is very interesting. The point of auto mode is to run when you are not using the computer or doing prompting yourself. There's a lot to dig into in that. It could honestly be its own dedicated video. This is a fun one for employees. Apparently, Anthropic engineers regularly contribute to other projects externally and file PRs, but they don't want it to be known that they used Claude Code for it.
So they have an undercover flag that has very strict instructions in it, including, specifically, "do not blow your cover." I can't help but wonder how many open source projects have unintentionally merged Claude Code generated code because an Anthropic employee was using this undercover flag. Very interesting. One other fun piece here is that these are feature flagged via GrowthBook. GrowthBook was actually in my Y Combinator batch. They're old friends and sponsors of the channel. I've worked with them a ton. I love GrowthBook. They're a fully open-source feature flag platform that lets you add feature flags to things.
You might not have heard of them before because they're not super popular. They only have like 7k stars on GitHub. They are phenomenal. I recommend them highly. They've been very nice to work with. This is not sponsored. I haven't talked to them in a bit. I just like GrowthBook a lot. Previously, Anthropic was using a more popular solution known as Statsig. But they were not the only ones using it. So was OpenAI. In fact, OpenAI used Statsig so hard that they acquired them. And what's really funny about GrowthBook being included here is that it very clearly was a shift that was done because Anthropic is so paranoid about any of their competitors having any access to anything of Anthropic's.
So Anthropic moved off of Statsig in favor of GrowthBook, and I'm guessing they did this very aggressively. Sadly, the git history was not included in the source maps, because why would it be? So we don't actually know when these changes were made. We just know that they are using GrowthBook fully as their feature flag solution. There are a lot of other funny, goofy things in the Claude Code source. For example, Wes Bos found all of the different verbs it uses when it's thinking. You've probably seen the little thing it says at the bottom while it's thinking.
Here are all of them. They also have randomly generated IDs, and they intentionally try to avoid certain swear words and otherwise undesired terms in those IDs. So, they have an avoid list of things that should not be included, which is hilarious. They also have a regex that detects when you swear at Claude Code so that it can use that in analytics to try and figure out when you're mad and why you might be mad. Sahil actually found some interesting anti-distillation systems that are built in as well. If you're not familiar, distillation is when people take a lot of histories from a given model and tool and use that to reinforce another model to behave more like that model.
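As a sketch of what that kind of regex detection might look like. The word list and function name here are made up for illustration, not taken from the leaked source:

```javascript
// Hypothetical frustration detector; the real word list is Anthropic's,
// not this illustrative one.
const FRUSTRATION = /\b(wtf|ffs|dammit|stupid)\b/i;

function looksFrustrated(message) {
  // Word-boundary matching avoids false positives inside longer words.
  return FRUSTRATION.test(message);
}

console.log(looksFrustrated("wtf, why did you delete my tests")); // true
console.log(looksFrustrated("please add unit tests"));            // false
```

Flagged messages would then feed an analytics event rather than change the model's behavior, which matches how Theo describes it being used.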
Anthropic's been aggressively claiming that different Chinese labs are using data from runs they do with Opus to try and make their own models better. And apparently they're trying to infect that data by sending fake tool calls into the histories. So when you try to use those histories to train, you end up with a bunch of fake data in them that makes it less likely the model behaves like the original. You can see all the weird things they are trying to do to prevent distillation in their distillation-resistant mode. There's a bit more to dig into here. Thank you to Mal for writing up all of this.
I really can use the plausible deniability right now. One of the most interesting pieces here is how the CLAUDE.md is loaded. I've always described the CLAUDE.md file as going before everything else: there's the system prompt, the CLAUDE.md, then the user message. It seems like the CLAUDE.md actually gets reinserted all the time, in particular on turn changes. A turn change is when it swaps from you to the agent. The agent will do multiple different messages while it's running, because once a tool call is completed, it will send the result from your computer back up to do another generation.
That's not a turn change. But when the model is done and I send another message, the CLAUDE.md gets included again, not at the top of the history, but right where I'm sending it. It gets reinserted over and over again in order to try and keep the model behaving. And then we get into some of the interesting details of how the subsystems work, in particular sub-agents, which share the prompt cache. I've talked a lot about context management and prompting, and prompt caching in particular, in the past, so I won't go too in depth here. What's interesting is when you have a cached history, which allows you to get way cheaper inference because you don't have to recalculate where you are in the history in order to generate the next token.
You can just skip all of that. The problem is if you change anything higher up in the history, it will break your whole cache. So if you have a long chat history, but at the top of it is your system prompt that includes the current date and the date changes, that change at the top destroys all of the caching all the way down. So you have to recalculate all of that and pay for all the input tokens again. Avoiding that's essential to make these tools even vaguely reasonably priced. I would go as far as to guess that a portion of why we haven't really seen this coordinator mode yet is because the expense of running this is absurd.
because you're effectively spinning up five-plus full Claude Code instances per request. And this is one of their attempts to reduce that cost. Instead of having each sub-agent spin up with its own context, which means each of them is paying their own input tokens, they share the same prompt cache by sharing all of the context up until the message telling it what to do. This is interesting. And it's not the same cost as one agent, because they all have to do their own work and generate their own output tokens, and then they branch off of the shared cache.
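To see why sharing the prefix matters, here's a back-of-envelope cost comparison in the same spirit. The token counts and per-token prices are made-up illustrative numbers, not Anthropic's actual pricing, and the sketch ignores the surcharge that cache writes usually carry:

```typescript
// Back-of-envelope: 5 sub-agents with independent contexts vs.
// 5 sub-agents branching off one shared cached prefix.
// All numbers below are invented for illustration.
const SHARED_CONTEXT_TOKENS = 100_000; // system prompt + history each agent needs
const TASK_TOKENS = 500;               // the per-agent instruction suffix
const INPUT_PRICE = 3 / 1_000_000;     // $ per fresh input token (illustrative)
const CACHED_PRICE = 0.3 / 1_000_000;  // cached reads ~10x cheaper (illustrative)
const AGENTS = 5;

// Without sharing: every agent pays full price for the whole context.
const naive = AGENTS * (SHARED_CONTEXT_TOKENS + TASK_TOKENS) * INPUT_PRICE;

// With a shared prefix: the prefix is paid for once, then read from
// cache by each branch; only the per-agent task suffix is fresh input.
const shared =
  SHARED_CONTEXT_TOKENS * INPUT_PRICE +
  AGENTS * (SHARED_CONTEXT_TOKENS * CACHED_PRICE + TASK_TOKENS * INPUT_PRICE);

console.log(`naive: $${naive.toFixed(2)}, shared: $${shared.toFixed(2)}`);
```

With these made-up numbers the shared-prefix approach comes out several times cheaper; the exact multiple depends on real cache pricing, but the shape of the saving is the point.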
That isn't the best wording for this, but it does significantly reduce the cost of spinning up sub-agents. The permission system has five levels of settings cascade: there's policy, then flags, then local, then project, then user. In the .claude/settings.json, you can set patterns for what's always allowed, and those will not be overridden by things that are more local. So, here's a fun insight: there are five different compaction strategies. Anthropic's always worried about how much context is being used and wasted and how they can reduce that without losing coherence to the task that's going on.
We've all experienced this if you've used tools like Claude Code heavily: it uses up the context window, it compacts, and then it just kind of fully loses track of what it was doing. They are clearly trying to fix this. And this is interesting because I don't know what OpenAI changed, but as of like Codex 5.3 and onwards, I've found the compaction in OpenAI models to be really, really good. And I still get really scared whenever I see Anthropic models compact. There's then a rant about the hook system that people don't know about. I know a lot about the hook system.
It's interesting. There's also some very annoying things missing in it, like user response submitted hooks. I've been bugging them about that for a while because I wanted it for some of the projects I was working on. There is a lot of potential in hooks, but I honestly don't think it's been realized yet because there's so many hooks still missing. I think that's the majority of what I found helpful here. There's obviously a lot more to dig into, and people keep finding it as time passes. But there's one last piece that I want to talk about when we look at the code.
How is the code, actually? Is it good? Is it bad? Is it ugly? How much vibecoded slop is in there? Well, if this is vibecoded slop, we need a vibecoded slop expert to tell us. So, of course, I asked: "Scan the code and give me a rough 1 to 10 score of how well produced and maintained the codebase is." It gave it a 7 out of 10. I wonder if it's a little biased. Type safety is really solid overall. There are only 38 instances of any across over 500 files. Naming consistency is pretty good. Error handling is solid.
Dead code management is good; there are very few code blocks that have been commented out. Async patterns are decent: only 258 .then chains, zero callback hell, modern patterns throughout. Linting is good overall, with things like no top-level side effects, and there are only 248 instances of ignoring Biome. Cool to see they're using Biome. There are no test files, but to be fair, test files are unlikely to be included in a source map, so I understand why they're not here. This is probably skewing a lot of the scores that people are getting when they rank it this way.
There are also a few too many god files: gigantic files with over 5,000 lines of code each. That's a lot of really big files. I am not anti-giant-file when it makes sense, but this is a bit absurd. They also have their feature flags scattered all over the codebase. There are over a thousand of them referenced across 250 different files. There are also tons of GrowthBook checks that are just inline in existing business logic, which makes it really hard to figure out what's going on. As someone who's spent a lot of time debugging weird environment variable stuff, this is going to be a disaster.
And this is almost certainly why things leak as often as they do. There's a shitload of environment variable sprawl throughout. That is hilarious. Hey, maybe y'all should check out T3 Env (t3-env). It's a package that Julius made to make this type of thing less likely and less problematic. And clearly you guys don't mind borrowing from open-source projects, so maybe take a look. We're very generously licensed, so if you want to just steal it, feel free, Anthropic, I don't mind. Linux falls back to plain-text credential storage, token prefixes are logged for debugging in the JWT utils, and there's no centralized secret sanitization before logging.
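For what it's worth, the pattern t3-env encourages, and the thing this codebase apparently lacks, is a single validated entry point for configuration. Here's a hand-rolled sketch of that idea; the variable names are just examples, and this is not t3-env's actual API:

```typescript
// Hand-rolled sketch of the idea behind t3-env: declare every env var
// once, validate at startup, and never touch process.env anywhere else.
// The variable names are examples, not Claude Code's actual env vars.
type EnvSpec = Record<string, { required: boolean; redact?: boolean }>;

const SPEC: EnvSpec = {
  ANTHROPIC_API_KEY: { required: true, redact: true },
  CLAUDE_CONFIG_DIR: { required: false },
};

function loadEnv(raw: Record<string, string | undefined>): Record<string, string> {
  const out: Record<string, string> = {};
  const missing: string[] = [];
  for (const [name, rule] of Object.entries(SPEC)) {
    const value = raw[name];
    if (value === undefined || value === "") {
      if (rule.required) missing.push(name);
      continue;
    }
    out[name] = value;
  }
  if (missing.length > 0) {
    // Fail fast at startup instead of mysteriously mid-session.
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return out;
}

// One centralized redaction pass before anything reaches the logs.
function redactForLogs(env: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(env)) {
    out[name] = SPEC[name]?.redact ? "[REDACTED]" : value;
  }
  return out;
}
```

The design point is that secrets only ever flow through one chokepoint, so "no centralized secret sanitization before logging" stops being a possible finding.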
I have seen people leak environment variables many a time in Claude Code. Not surprised there. And then a bunch of tech debt: lots of to-dos in the source code. They're all specific and actionable, but many look old. Claude claims the seven would be bumped if there were tests present. And again, we don't have those because of the nature of source maps. On that note, though, we should look at how much code we have here. I've seen other people saying as many as 500,000 lines of code. I don't know how they got their number. The exact leak that I'm working with here comes out to around 390,000 lines of code.
To be clear, this does not include things like npm packages, or even the package-lock or the package.json or any of the other pieces you would need. This is just the TypeScript source, and the TypeScript source alone is 390,000 lines, which means Garry Tan could recreate this in about 12 or so days. That was a really, really good joke for those who are in the know. Happy some of you guys appreciated my [ __ ] And we have our first official comment from Anthropic. "No sensitive customer data or credentials were involved or exposed," an Anthropic spokesperson said in a statement.
"This was a release packaging issue caused by a human error, not a security breach. We're rolling out measures to prevent this from happening again." Do we think this is really human error, or do we think this might have been an agent error? Either way, the emphasis on human error, and not blaming their own agents and their own models, is very funny to me. And since Codex is open source, I figured I would take a look at it for reference. It is a little bit bigger, with over 515,000 lines of code in the Rust package.
It lines up with the number I see others sharing about Claude Code, but from my own brief research it is slightly larger. It is also Rust, which is much more verbose but also, importantly, much safer. You know, now that I think about it, Anthropic probably wouldn't have had this problem if they had just used Codex when they were doing these changes, because I certainly haven't had an OpenAI model try to do something like this for a long time. One last hidden feature I want to dig into a bit here: Chyros. Apparently Chyros is that AFK mode to some extent.
It's the always-on proactive Claude that does things without you asking. It runs in the background every few seconds. Well, it's probably not going to be a few seconds; there's going to be a heartbeat that's less frequent than that. It gives a prompt that says, "Anything worth doing right now?" It looks around at what's going on, makes a call to do something or nothing, and then it can just make changes, push notifications, edit files, make pull requests, subscribe to pull requests, and update them when things change. For example, if you have a pull request and somebody leaves feedback, it can notice that and then auto-edit and push the changes to said PR.
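Nobody outside Anthropic knows how this feature is actually wired up, but a heartbeat-style proactive loop like the one described could be sketched roughly like this, with every name invented:

```typescript
// Illustrative heartbeat loop for a proactive background agent.
// All names here are invented; the real feature's internals are unknown.
type Action = { kind: "none" } | { kind: "act"; description: string };

interface Agent {
  // Ask the model: "Anything worth doing right now?"
  decide(observations: string[]): Promise<Action>;
  // Carry out whatever it decided: push a fix, update a PR, notify, etc.
  perform(action: Action): Promise<void>;
}

async function heartbeat(
  agent: Agent,
  observe: () => Promise<string[]>, // open PRs, new review comments, etc.
  intervalMs: number,
  ticks: number, // bounded here so the sketch terminates
): Promise<number> {
  let acted = 0;
  for (let i = 0; i < ticks; i++) {
    const observations = await observe();
    const action = await agent.decide(observations);
    if (action.kind === "act") {
      await agent.perform(action);
      acted++;
    }
    // Sleep until the next heartbeat.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return acted;
}
```

The hard parts in a real version would be everything this sketch elides: deciding "nothing" reliably, rate-limiting actions, and not spending a fortune on inference for heartbeats that go nowhere.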
There's a lot of potential in something like this. I'm very excited to see if they actually get this out and released. I'm assuming they're probably already using it internally, which is likely one of the reasons they don't want to open source it, because we'd see all of the weird ways they're actually building it. On that note, though, it's time to talk about what I actually think is the most important piece here: how they should respond. Anthropic's in a strange spot right now. They have pushed back hard on the idea that Claude Code should be open source.
And I never really thought that was fair. I'm sure y'all have seen me shouting many a time now about how important it is that they open source it. They're the only major CLI harness that isn't open source. As far as I know, the Cursor CLI is closed source, but who's taking the Cursor CLI seriously as a major harness? I think it's fun, but it's not what we're here to talk about. We're here to talk about the major harnesses and the CLIs and the expectation that we build on top of them without having the source itself.
I have always been really skeptical of tools and platforms that encourage you to build onto them and into them without giving you the source, so you can't make sure your stuff stays compatible with them. And it's just obnoxious. Whenever I'm working in Claude Code or trying to build something with Claude Code compatibility, I'm hopping between docs that don't include half the info I need and testing it locally to see if it even works at all. My experience trying to integrate stuff in Claude Code has been [ __ ] hell. And part of that, if not most of that, is because it's closed source.
There are a lot of reasons they would choose to keep it closed source, though. Whether it's the secret sauce they're scared of leaking, or the way they want to manage the repo, or being scared of people seeing all of the things they're filing PRs for, people overanalyzing changes, trying to leak all of the new features and stuff they're cooking, as we've seen here, or even just the cost of maintaining an open-source repo with a lot of eyes on it. It's not trivial to deal with the hundreds upon hundreds of pull requests you can get per day.
Believe me, we're speaking from experience here. So, the reasons against this, while somewhat sensical, are kind of dumb, to put it lightly. And for most of them, in particular the biggest ones around the secret sauce and hiding the future features, the door's been blasted open now. Those are no longer valid reasons to not open source. As I'm sure you guys could have guessed, recommendation one: just open source it. And to be clear, I don't think they have to do this immediately. I don't think they need to say, "Fuck it," and put up the source code right now.
That would be nice, but I don't think they will. What I think would be totally reasonable, and more likely for them to actually digest, is to give us a road map and timeline for when they will open source it. If they want to spend some time to clean up the codebase, purge the commit history, hide things they don't want to show, and rethink how they would put this in the main repo, I totally understand that. Totally fair. I'm okay with them taking their time, even like a month or two, but just come out now and say, "We get it.
We're planning this. We will do this. Here's a rough timeline of how this will look." The next piece of advice I have: beat the leakers. Right now, the people writing up these articles are basing them on their interpretation of the source code, and honestly, more often than not, on Claude Code's interpretation of the source code. Those aren't great sources. People will always prefer better sources with better information to [ __ ] sources with fake information. I am sure half or more of the stuff that's being shared, and even a significant portion of what's been in this video, is incorrect or a bad interpretation of the things we're reading from the source.
Beat them out. Come out and talk about these things. Go through all of these features that leaked and talk about them. Maybe add Buddy mode right now and let people start using it. Maybe talk about dream mode and how they're already using it or not and what experiences you've had with it so far. Talk about coordinator mode and why it is or isn't shipping and what your plans are for it. Make it a point to go through all of these: maybe once a day, every weekday for a bit, talk about one of the features in the leak and break down what it was, why you tried it, why it's not out yet, and why it is or isn't coming out in the future.
This is meant to be part of another theme, which is honestly one of the most important things here. Be [ __ ] humans about it. And I know this is not necessarily the most pointed advice, but it's so important now, more so than ever. Yet another corpo post about this is not going to do any good for [ __ ] anybody. Just be human. Just come out here and talk about it with the community. Just be decent. And I've seen this happen before with Anthropic. I've seen Thoric and Lydia come out and be part of the conversation before. Usually that's conversations that the lawyers don't care about.
And this is what I'm sure the lawyers are very, very interested in. God damn, just be humans. I actually queued up an example of this that I think is very funny. I did a video last week called "OpenAI is lying." That video was all about how horrible OpenAI models are at front end and the terrible article they published pretending that their models could be good at front end if you just prompt them right. I thought that was [ __ ] [ __ ] because, to be frank, it is [ __ ] [ __ ]. And I'm not the only one who thinks that.
It seems like there's a handful of people there that also get it. And instead of being mad at me, or cutting me off from early access, or cursing me out privately, or sending lawyers after me, or all the other wonderful things other labs tend to do, they went a very different route. I got dunked on a few days ago by Kevin from the TanStack team. When he was trying T3 code, he noticed that the implement button could leak out of the input box. This is awful. We absolutely need to fix this. It's the kind of thing that slips when UIs have all these different states people can use them in.
This pained me. We're on it. Why am I talking about this, though? Because Jason, who works at OpenAI, made one of the funniest posts I've ever seen: "Thanks for trying GPT-5.4, Theo. We'll keep improving our models." This is hilarious. We had a UI overflow bug in our app, and Jason from OpenAI made a good joke about how their models make front-end mistakes. This is exactly what I'm looking for. Frankly speaking, if he had made this tweet before I did my video crashing out at the front-end abilities of the model, I probably would have never even put out the [ __ ] video.
Like, it's silly how much these things matter, but just being real and human about it, not sending some corpo or lawyer to tell me how I should be thinking about the thing, just being realistic and human and kind of funny about it, is great. It's funny that everybody thinks OpenAI are the robots and Anthropic are the humans, when it's very rare I hear from an Anthropic employee about something, and when I do, it's usually a [ __ ] lawyer, and I'm far from the only one. OpenAI is just chill, and you need this right now, Anthropic. You built this reputation of the cool guy.
You have to be the cool guy right now. If you're going to keep pretending you're the humans of the bunch, you've got to be the [ __ ] humans right now. So, what does that mean? It means stop [ __ ] sending DMCAs, especially to people who aren't using this code and aren't breaking any laws or committing copyright infringement. Stop it. There's just no good look for that. The code is out there. You can't hide it. DMCAing a bunch of [ __ ] on GitHub just makes you look worse. There is no benefit to doing that. Literally none. All it does is make it look like your lawyers are running your company, not your developers, not your product people, not the ones who care about the community.
Let the people who talk to the community help with these decisions. They understand it more. And next, don't do some big press release about this that pretends it's not a big deal or that it didn't happen. That is cringe as [ __ ] Let the team come out and talk about it. Let the team share the things that are interesting. It's silly, but just as an example of the kind of thing that would be a good post from somebody on the team right now: let's just pretend they have an employee named Barak, and imagine Barak is posting on Twitter and says, "Wow, kind of cool that people can now see all the work I did on some feature.
I'm really proud of it. Here's some fun insights from when I built it." This is the type of [ __ ] they need to do. It's not that [ __ ] hard. Every engineer working on Claude Code has something cool they built that they want to talk about and haven't talked about, because they can't. This is closed [ __ ] source. Well, it's not closed source anymore. It's source available with a bunch of asterisks. Fix that by making it open source. And instead of trying to press-release this [ __ ] instead of trying to suppress it with lawyers, just lean into it.
Let your engineers be excited, because excitement will always, always, always, always beat hype. Real excited energy is what you need right now. And you can't get that from your [ __ ] lawyers. And if your lawyers have to compete with the energy on Twitter right now, you will lose. This will be the biggest hit Anthropic's ever taken to their sentiment if y'all start going after everybody just for being excited about this [ __ ] So, if you do things like DMCA me for this video, or go after Wes Bos for sharing all of the strings in the codebase, those types of things, which I know you're considering doing right now.
I'm surprised I haven't gotten an email just from filming this live. Get over your [ __ ] That's not going to work. You're going to become the worst and least respected lab in the [ __ ] industry if you don't let your devs do the right thing here. Manufacture more excitement. Let your team be hyped that they can finally talk about these things. Lean into that, and you can turn this PR fail into one of the biggest PR wins in history. What a wild week. I hope this was helpful, and I hope it was worth going live on my birthday to film all of this.
I'm going to go eat some cake and enjoy my day. Hopefully y'all enjoyed this as well. And until next time, peace.