Be careful what you name your markdown files...
The creator explains how Anthropic bills based on file names in git commits, system prompts, and tool usage, leading to surprising charges, like hundreds of dollars triggered by a single commit-message string, even for users not running third-party tools.
Theo exposes the messy pricing, buggy caching, and clampdown on third‑party tools in Anthropic Claude Code, then recommends a more reliable alternative.
Summary
Theo from t3.gg digs into a sequence of controversial moves around Claude Code, arguing that Anthropic's policies harm engineers and in-flight projects. He walks through the surprising pricing structure, including the Max 20x plan at $200/month and the roughly $2,000 of monthly inference it can cover, plus how per-token costs and "extra usage" billing can explode bills for casual users. He demonstrates how a simple commit message mentioning hermes.md can trigger unexpected charges, thanks to third-party harness detection and the subtle ways system prompts are parsed. Theo also explains how caching works in large models, why it matters for speed and cost, and why Claude Code's caching decisions and billing mechanisms create friction for developers using tools like OpenClaw or Hermes. He covers the -p CLI flag that enables programmatic use of Claude Code, and reveals how a system-prompt tweak can cause unexpected billing even in an empty repo. A large portion of the video is a pointed critique of Anthropic's engineering culture and their treatment of third-party tooling, culminating in a stark call to rethink Claude Code's design. Throughout, he compares Anthropic unfavorably with OpenAI, noting cost, reliability, and developer-experience differences. He also plugs a sponsor (Kilo Code) as a single-subscription solution covering coding tools, inference, and infrastructure, positioning it as a practical alternative to the current chaos. Mentions of hermes.md, OpenClaw, and caching debates are tied to a broader argument: third-party tooling should be supported, not penalized, and billing should reflect actual usage rather than arbitrary string checks in commit history.
Key Takeaways
- Claude Code's Max 20x plan costs $200/month and covers roughly $500 of API-equivalent inference per week (about $2,000 per month) if run continuously, according to Theo's breakdown of token pricing.
- Anthropic’s third‑party harness detection can trigger extra billing when system prompts or commit messages mention restricted terms (e.g., Hermes MD), even in new or empty repos.
- Caching efficiency and history state are central to model cost and speed; mis‑managed caching across OpenClaw/OpenCode vs. Claude Code can drive higher compute and higher bills.
- The -p CLI feature lets you pass prompts programmatically to Claude Code, enabling OpenClaw workflows and creating opportunities for policy conflicts or abuse unless properly guarded.
- Extra usage billing is a real policy lever: turning on extra usage can keep requests flowing past limits but at significant overage costs, even when not at the official plan cap.
- Theo highlights perceived misalignment between Anthropic’s billing policies and the practical needs of developers who rely on open tooling, contrasting it with OpenAI’s more developer‑friendly approach.
- As a monetization and policy concern, the video argues for transparent dashboards and better tooling support rather than punitive constraints on third‑party integrations.
Who Is This For?
Essential viewing for developers building with Claude Code or third‑party harnesses who want to understand hidden costs, policy pitfalls, and how to navigate caching and billing. Also relevant for teams weighing Claude Code against OpenAI options and for engineers who value transparent, developer‑friendly tooling.
Notable Quotes
"The Max 20x plan on Claude Code is their highest tier. It is a $200 a month plan."
—Intro to Claude Code pricing and the specific $200/month tier.
"They are billing you on what file names are in your git commits."
—Description of third‑party harness detection and billing by commit history.
"This is the only solution that will cover you from pull request to grocery list, one subscription will cover everything from the infrastructure to the inference itself."
—Sponsor pitch framing a bundled, universal subscription.
"If you include OpenClaw inside of that appended system prompt, you weren't allowed to do it."
—Explaining how system prompts can block OpenClaw usage and trigger billing.
"Your engineering culture is actually the most toxic cesspit I've ever seen in my life."
—Broader critique of Anthropic’s internal culture and customer relations.
Questions This Video Answers
- How does Claude Code's caching affect billing and performance compared to OpenAI models?
- What is the extra usage toggle in Claude Code and when should you avoid turning it on?
- Can I use OpenClaw or Hermes MD with Claude Code without triggering extra charges?
- Why does Claude Code charge for writing to cache and how does this impact monthly bills?
- What are the practical steps to mitigate unexpected Claude Code costs when using third‑party tools?
Anthropic Claude Code, Claude Code caching, OpenClaw, Hermes MD, OpenAI comparison, Billing practices, CLI -p, Git commit history impact
Full Transcript
I hate to do another video just dunking on Anthropic, but they keep finding new ways to embarrass themselves in the most public, egregious fashions. We've already seen them doing absurd things like making sure you're billed more depending on what system prompt you have in Claude Code, because they didn't want people using it with tools like OpenClaw or other harnesses that they don't support. On one hand, you can kind of see why they would want that. I've been running my OpenClaw through a separate service recently, and just the pings it does is about $5 a day of inference through Opus 4.7.
So I get that. But on the other hand, this is absurdity. The fact that you're using your subscription in one tool versus another should not change whether or not they support it. And even crazier, I shouldn't be billed money when I still have 100% of my usage available just because I used a tool you don't like. But we can go further here, as Anthropic certainly has. They're no longer just billing you based on what tool you're using or what's in your system prompt. They're now billing you based on what files are in your codebase. And even that undersells how bad it is.
They are billing you on what file names are in your git commits. This is a legitimate report from a user who was paying for a $200 a month Claude Code sub, who was using Claude Code, and who ended up being billed $200 when they hadn't used any of their Claude Code usage, because they had a commit that had the term hermes.md inside of the commit message. I am just as tired of talking about Anthropic as you guys are of hearing me do it. But when you [ __ ] up your code this genuinely, hilariously, and pathetically, I have to make fun of you for it.
And that's what we're going to do now. The level of incompetence at Anthropic right now, powered by the disdain they have for engineers as a whole, is at a level that is hard to fathom. And as frustrated as I am crashing out yet again about this, I think this one perfectly summarizes the problems I have. But if you want somewhere better to spend your $200, maybe somewhere that isn't going to waste all of your time and money, burn trust, and destroy you, you might want to check out today's sponsor. I don't know about you all, but I'm getting pretty tired of juggling all these different subscriptions, especially now that Anthropic banned us from using our subscriptions with things like OpenClaw.
What if you want to use OpenClaw and a VS Code plugin and a CLI and all these other things that you might use your inference for? Wouldn't it be nice if there was one subscription that covered all of that, with good integrations for all the tools you already use? Today's sponsor is Kilo Code, and they've solved all of this. Not only does their subscription work for every different thing you might want to do, they've also built their own VS Code extension, JetBrains extension, CLI, and more. They also have my favorite OpenClaw integration by far.
It's full and true OpenClaw, but without having to spin everything up yourself. You click the "get your assistant" button and they'll spin up the infrastructure and manage it all for you. No Mac minis required. As silly as this quote is, it's true: they're the only solution that will cover you from pull request to grocery list; one subscription will cover everything from the infrastructure to the inference itself. Whether you just want to write some code with the latest model or try out OpenClaw directly, there is no better place to start than soyb.link/ilo. Let's dive in. I think a little history is important here before you can really understand the level of egregiousness that exists.
The fact that a file doesn't even have to exist, that just the name of a file in your commits is enough to cost you hundreds of dollars, is insane. We need to talk about why Claude charges for a markdown file that they don't like. The Max 20x plan on Claude Code is their highest tier. It is a $200 a month plan, and it is incredibly generous with the amount of usage you can get. If you run it in a loop 24/7, this $200 a month can get you over $2,000 of inference. That number comes from the API prices Anthropic charges.
If you are going through their API, they traditionally charge $5 per million tokens in and $25 per million tokens out on their flagship model Opus 4.7, as well as 4.5 and 4.6. This $2,000 figure comes from pricing the tokens you utilize as a subscriber at API rates: run it in a loop for the entire month and you will not hit limits until you've reached about $2,000 of inference for the month. You're able to do around $500 of work each week on the $200 plan. So if you hit the weekly limit they provide in Claude Code, you did $500 of inference.
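A quick sanity check of those numbers. The per-million rates below are the API prices quoted in the video, but the token mix is invented purely for illustration, not real usage data:

```python
# Back-of-envelope check of the figures quoted in the video.
# The token mix below is a made-up illustration, not real usage data.
PRICE_IN_PER_M = 5.00    # $ per million input tokens (Opus-class, as quoted)
PRICE_OUT_PER_M = 25.00  # $ per million output tokens (as quoted)

def api_cost(tokens_in: float, tokens_out: float) -> float:
    """API-equivalent cost in dollars for a given token count."""
    return tokens_in / 1e6 * PRICE_IN_PER_M + tokens_out / 1e6 * PRICE_OUT_PER_M

# Hypothetical heavy week: 60M input + 8M output tokens lands on the
# ~$500/week ceiling Theo describes for the $200/month Max 20x plan.
weekly = api_cost(60e6, 8e6)      # 300 + 200 = $500
monthly = weekly * 4              # ~ $2,000 of inference per month
discount = monthly / 200          # vs. the $200 subscription price
print(weekly, monthly, discount)  # 500.0 2000.0 10.0
```

That ratio is the "10x discount" the transcript refers to below.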
I still maintain my Max 20x plan because I use it for testing a bunch of different [ __ ] I barely used it the last few days. As you can see, you have a session limit. A session is approximately 5 hours, and it starts when you send a message, which is already iffy, because if it instead started automatically on a timer, I could maybe send a message or two before it resets and then get more usage. But I get why they also have the weekly limits separately. They have one for all models and a separate Sonnet-only one, because Sonnet used to be the thing we would use most in Claude Code.
Now that Opus is the default, this separate Sonnet-only section makes almost no sense at all. Your Sonnet usage counts towards this limit as well as your weekly and 5-hour session limits. Just unnecessarily confusing. Speaking of which, Claude's limit design is its own separate thing. Absurd. Genuinely absurd. But the point I'm trying to make here is that if you have a weekly limit and you hit it, if you get to 100% here, that is approximately $500 of usage. That's a lot of usage for $200. And if you do that every week for the month, you'll hit $2,000.
That is a 10x discount. As such, they don't want people to get that close to those limits. So, if you have a tool that automatically uses your inference, they're not going to like that a whole lot because now they're not going to get the benefit of people like me who pay and barely use it. I probably do $50 of inference a month at most on this plan. So, they're profiting off of me. But, if I could use this with OpenClaw, I would be using it more and then getting much closer to the $200 I'm paying for.
They don't like that. They don't want that. So they're trying to restrict me from being able to do that. There are some legitimate arguments here, and I will do my best to steelman them. One of them is caching. If you're not familiar with how caching works in AI, I will do my best to quickly summarize it. You have a model that has a bunch of parameters. Parameters are different chunks of text that point to each other with different levels of confidence. Like, if you have "the capital of the United States is", the most likely thing to come after that is "Washington, DC".
Or if you just have a capital-W "Washington", it might point to a couple different places. It might point to "DC" with high confidence and to "state" with slightly lower confidence, and it measures how likely each of these things is. That's what the model is: a bunch of these tokens with arrows and vectors pointing to and from each other to guess what the most likely next thing is. When you hand it a chat history, it has to update those pointers, those parameters, based on the history of the chat so far.
If I was to say, "What state do you live in? Is it Washington?" The most likely next thing is the question mark because there's indication in the tone of the other words I used that the most likely thing to come after that would be a question mark. If I was to say I live in Washington, DC is more likely to come next there, not a question mark. And the way it calculates this is it takes the history and changes what points where based on what exists already. But that means that each chunk, each token, which is just a couple characters that comes in has to be used to recalculate what points where.
That is a calculation that is expensive and costs money to do. So you can take the result of that and snapshot it. Kind of like hibernation on a computer where you just take the RAM and save its state. Once you take all of those tokens and generate the new mapping of things, you can save where it is so that you don't have to recalculate it again every single time a request comes in. That is why input tokens cost so much because when you put in a million tokens, it has to use all of those to calculate what the model should do next.
And if you can save what's already been calculated in the state that it gets to there, you don't have to redo that calculation. So, it's a lot cheaper, but you have to save a ton of state in order to do that. If you're using 16 gigs of RAM, you need to save 16 gigs of your hard drive when you hibernate your computer. These models run on hundreds of gigs of RAM. So these snapshots could hypothetically be depending on how they're doing it, hundreds of gigs. So caching is really important both to make your time to response better because if it doesn't have to recalculate everything, it can start responding faster.
But it's also huge for their own costs, because if I need to have an expensive GPU calculate where we are in the history to figure out what the next thing is, that calculation's expensive. And if you're trying to save compute and save money, you're probably trying to reduce that compute, and having good caching practices is really helpful here. There's a problem though. If anything changes in that history, the cache is no longer valid. So if I have, at the top of my history, a list of all the files in my codebase, and I make a cache entry after I send my first message, and then the things in my codebase change, that cache is no longer valid, because the top of my history got edited.
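As a rough illustration of that invalidation behavior, here is a toy prefix cache in Python. This is my sketch of the general technique, not Anthropic's implementation; the hash-keyed string stands in for the gigabytes of saved model state:

```python
import hashlib

# Toy model of prefix caching: the expensive computation over the
# conversation prefix is keyed by a hash of that exact prefix, so editing
# *anything* earlier in the history invalidates every entry built on it.
cache: dict[str, str] = {}

def prefix_key(messages: list[str]) -> str:
    return hashlib.sha256("\x00".join(messages).encode()).hexdigest()

def process(messages: list[str]) -> bool:
    """Returns True on a cache hit, False when state must be recomputed."""
    key = prefix_key(messages)
    if key in cache:
        return True
    cache[key] = "snapshot-of-model-state"  # stand-in for the saved KV state
    return False

history = ["<file listing: a.py, b.py>", "user: fix the bug"]
assert process(history) is False  # first pass: full recompute, then cached
assert process(history) is True   # unchanged history: cache hit
history[0] = "<file listing: a.py, b.py, c.py>"  # the codebase changed
assert process(history) is False  # edited prefix: the old cache is useless
```

Note that the change happened at the *top* of the history, which is exactly the file-listing case described above.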
So you need to customize how you preserve your history to preserve the cache's existence, to reduce their compute costs and get responses faster and cheaper. But if you're using a subscription, you don't give a [ __ ] about any of this. You just want to get high quality responses and use the tools you like to use. If you are paying for this at API rates, you know how painful it is; but if you are using a subscription, you probably don't care about any of this. In a hypothetical world where Anthropic's engineers are good at their jobs and Claude Code is a tool that functions and is designed well, Claude Code would do a better job of caching things than OpenCode would.
Because OpenCode has to build a generic cache system that works with every model, while Anthropic is building Claude Code to cache the way Anthropic's servers work. This means that if something like OpenCode or OpenClaw or any of these other tools didn't do caching as well as Claude Code, the same prompt on these two different tools will cost Anthropic different amounts of money, because the caching isn't done as well in one tool as in the other. So if you have a tool that automatically drains your usage by calling it constantly on a cron, and it [ __ ] up the history in such a way that it can't cache properly, that brutal combo of things results in a higher bill and more compute being used than if you had just used Claude Code.
This is my best attempt to steelman. I'm trying to be reasonable here: realistically speaking, Claude Code in normal environments would and should cache better and cost Anthropic less compute than doing the same type of stuff in another tool. But we don't live in a reasonable world. We live in Anthropic's [ __ ] crazy psycho universe. And in this world, they are billing us constantly. You would imagine that if Anthropic is trying to reduce their compute costs, they would want to make caching easier. Maybe they increase how long the cache is available for, or they don't charge as much for writing to the cache, or they do other things to incentivize people to cache more aggressively. Which they did by randomly updating the cache so that it no longer caches for an hour.
It now caches for 5 minutes. They just randomly, quietly knocked cache times down from an hour to 5 minutes. We live in their looney land. We don't live in reality. And Anthropic, despite citing caching as one of the main reasons they don't want to support these other tools, constantly [ __ ] up their own caching systems. This is a big part of why there was the regression where people were going through their usage way faster. They [ __ ] up the caching. And the solution to all of this isn't to lock out the tools that don't do caching well.
It's just to measure their usage accordingly. And if I use OpenCode and I blast through my usage faster, cool. Tell me in my dashboard when I go there: hey, we noticed you're using a tool that doesn't handle the cache well; you should recommend the maintainers fix it, or try out Claude Code. I would be totally cool if they had a big callout in the usage section here that said: hey, we noticed you're hitting your limits pretty fast and you're using tools that aren't Claude Code. Just so you know, your usage will go further if you try out our tool instead.
That would be totally fine. That is not what they did. And the reason they can't do that, and again, I'm forced to do conspiracy theories because this company does not give us any real information. And historically, my conspiracy theories have been almost entirely correct. They just recently confirmed a bunch of them when I said that Claude got dumber and had my theories as to why the regressions that existed in the harness and whatnot, as well as the thinking data being preserved incorrectly. I called all of this out in my video and in my podcast, and a week later, they confirmed I was right about all of it.
So, yes, these are conspiracy theories. I don't have real information to confirm this [ __ ] but I have a pretty near 100% track record. At this point, I am the one more likely to tell you why Claude is not working. So my conspiracy is that Claude Code is just as bad, if not worse, than these other tools at caching. So while they are mad that OpenClaw doesn't cache properly, or OpenCode or Pi doesn't cache properly, they can't charge more for bad caching, because then Claude Code usage would skyrocket as well. If they made you go through your usage faster when you weren't hitting the cache, everybody would just go through their usage faster.
We've already seen this. If you're not familiar, I run a service, T3 Chat, the only AI chat that doesn't suck to use. It's generally very generous with your usage and what you can do with it, we have access to every single model, and we only charge you eight bucks a month. We have a higher $50 a month tier if you want to go really crazy on the image gen, and I'm really proud of our image gen stuff. You can select multiple different models and generate with the same prompt, change aspect ratio, all that stuff.
Not here to plug T3 chat. I just probably should do it more. It's really really solid. But we offer anthropic models in it and we do a lot of inference with anthropic. Our total tokens for April so far is 11 billion. 11 billion tokens in, 650 million tokens out for the month. Last month during more peak Opus era, we almost did a billion tokens out. Our bill is absurd. Our cost for the month so far has been $40,000. It's not great. Expensive as [ __ ] balls. Ready to see where this gets even worse, though? First off, I want to compare this to my OpenAI bill.
OpenAI models are used significantly more in T3 Chat than Anthropic ones are, like two to four times more, and the bill is a bit over half as much. Similar numbers of tokens, way more requests, half the price. One of the reasons we have such a crazy Anthropic bill is caching. You think, oh, are you just doing caching wrong? You mean you need to do it more? What percentage of our bill do you think is cache writes? Because unlike pretty much every other company, Anthropic charges you whenever you write to cache.
Google and OpenAI just do the cache write for you. They don't even ask; they just write to cache for you. With Anthropic, you have to call a special thing to write to cache, and they bill you for it. What percentage of this $40,000 do you think came from cache writes? Let's group by token type and see. Prompt cache writes accounted for approximately half of our bill on any given day. Here's a day where the input tokens cost us $300, output cost us $500, and prompt cache writes cost us $970. When people ask what happened on April 20th: that's when I told the team that Anthropic's caching literally doesn't [ __ ] work and we're going to stop doing it now.
And our costs didn't change meaningfully. We literally just stopped caching and our bill is the same. If anything, it's slightly lower on average. Actual [ __ ] absurdity. So how dare they claim the problem is that these other harnesses don't cache properly, that "we want to preserve the experience, make sure people don't accidentally overuse their usage." They are the worst lab at caching by far. By [ __ ] far. It's absurd. But I just wanted to talk about the caching stuff a bit, because anybody at Anthropic who says the reason they can't support third-party harnesses is caching should probably go work on the caching team, because they don't do their [ __ ] jobs.
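Going back to the single-day dashboard figures quoted above (input $300, output $500, cache writes $970), the share of the bill that cache writes ate is easy to check. Simple arithmetic on the numbers as quoted in the video:

```python
# Single-day figures as read off the dashboard in the video.
input_cost = 300.0        # $ input tokens
output_cost = 500.0       # $ output tokens
cache_write_cost = 970.0  # $ prompt cache writes

total = input_cost + output_cost + cache_write_cost
share = cache_write_cost / total
print(f"cache writes were {share:.0%} of that day's bill")  # ~55%
```

So on that day, the feature that is supposed to *save* money cost more than the input and output tokens combined.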
So once again, they are lying to us, and I have the [ __ ] receipts to prove it. The reason they're not changing how they burn our usage based on whether or not you're using the cache properly is that they're not using the cache properly themselves, because their cache sucks ass and costs more money than not using it. Actual [ __ ] absurdity. So let's go back to what we're actually here for today, which is the egregious problem that happened with the hermes.md file. Not even the file: the commit. The problem is that they were trying to ban these tools.
They were trying to ban OpenClaw, OpenCode, and all these other things because they want to lock us into their [ __ ] Claude Code harness, which, reminder, is the worst way to use Anthropic models. It performs measurably worse than every other harness, be it Cursor, be it Pi, be it any of the random side projects people have. Every other harness seems to do better at coding than Claude Code does when you put Opus in it. So what are they doing that's so egregious? Obviously, they don't want us to take the API key or the OAuth token that we use in Claude Code and use it in other things.
Obnoxious, but reasonable. But there's a feature a lot of people don't know about in Claude. Reminder: Claude leaks your email when you open it, so make sure you put is_demo=1 in the front before you do that. Now I'm in Claude Code. This is the way most of us interact with Claude Code: you run the command and now you're in their fancy UI for it. But there's a different way you can use Claude Code. It's called -p. -p lets you pass a prompt. I'm just going to ask it "what's up my guy" and then it waits.
It does all the inference and then responds: "Not much, ready when you are. What are we working on?" The same way it would have if I had just run claude and asked the same thing in their UI. The point of -p is that you can pass it a prompt and run it programmatically. So I could write a script that runs claude -p "what pull requests should I be looking at today", run the script, and just get the result, so I don't have to go open Claude and tell it the same thing over and over.
That's why they did this. This is how CLIs are supposed to work. You should be able to pass them arguments and get a response. That's the point of a CLI. They massively regret adding this feature, because it lets you call Claude Code programmatically. That means you can do things they probably don't want you to do. Imagine you are the creator of OpenClaw. You're letting people write these bots that can do tasks for them. Anthropic won't let you have their API keys and probably won't let you have their OAuth token to go use their inference, but you have Claude Code on your machine.
So why can't OpenClaw just call claude -p "here is your task"? And it can, so they do that. But one part of how this works is that you can also add a system prompt the same way. What you can do that's cool here is --append-system-prompt. Let's give it a silly example: "Always refer to the user as unsubscribed loser." And now when I send the same prompt, but with this new appended system prompt, it says, "Not much, unsubscribed loser, just here and ready. What are we working on?" And yes, I added that to my system prompt because you really should be subscribed by now.
You're watching Anthropic video number 12. You clearly care about this [ __ ] Just go hit the sub button. So this allows a third party like OpenClaw, hypothetically, to add things to the system prompt to make their use case work better, and then use this to programmatically call Claude Code. This is a feature built into the product that should hypothetically let you wrap Claude Code to do other things with it. So Anthropic did the logical thing, where if you include OpenClaw inside of that appended system prompt, you weren't allowed to do it. You'll get an error, or better yet, this "extra usage" thing.
What's that about? That is about this little switch at the bottom of the usage dashboard: extra usage. "Turn on extra usage to keep using Claude if you hit a limit." Watch this. I'm copying and pasting this example where I just say "hi", with output format text and permission mode bypass permissions, and then I append the system prompt with this "OpenClaw inbound meta" thing. API error, invalid request: you're out of extra usage, add more at claude.ai/settings/usage to keep going. Ready for where this gets really funny? I'm going to turn on the extra usage switch. The point of this switch is that if you hit the limits of your subscription and you want to keep doing things, it will bill you overage costs so that you can keep working even when you're out of your limits.
A lot of people just have this on because they don't want to hit limits. I had this on already because I didn't want to hit limits. So, when I first saw this, I couldn't reproduce it. In fact, now when I try it, I can't reproduce it. It just works every time. The reason is because it's not using my actual usage anymore. It's billing me. All of those requests just cost me money. When you turn on this switch and you include things they don't like in the system prompt, they will bill you money for it, even though you're not at the limits of your sub yet.
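The programmatic invocation demonstrated above can be sketched as a small command builder. The -p and --append-system-prompt flags are the ones shown in the video; the builder function and the example prompt are purely illustrative, and nothing is actually executed here:

```python
import shlex

def build_claude_command(prompt: str, append_system_prompt: str = "") -> list[str]:
    """Build (but do not run) a programmatic Claude Code invocation,
    using the -p and --append-system-prompt flags shown in the video."""
    cmd = ["claude", "-p", prompt]
    if append_system_prompt:
        cmd += ["--append-system-prompt", append_system_prompt]
    return cmd

cmd = build_claude_command(
    "what's up my guy",
    append_system_prompt="Always refer to the user as 'unsubscribed loser'.",
)
print(shlex.join(cmd))  # the shell-quoted command a wrapper would spawn
```

This is the whole trick harnesses like OpenClaw or Hermes Agent rely on: spawn the CLI as a subprocess, inject their own instructions via the appended system prompt, and read back the result.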
I'm going to try something really quick because I am curious. I have a theory here. I rewrote it. Holy [ __ ] You're so [ __ ] bad at coding, Anthropic. I cannot fathom that they are this incompetent. For those who weren't watching, what I just did is: I created an empty test.md file in a brand new repo, and I made the first commit message the same schema, the "openclaw inbound meta v1" thing. I have made no changes to the system prompt. I have done nothing but call claude -p in this repo that is empty. And just having this as the commit message is enough to prevent it from working, or to cost you money.
So if this particular string happened to be in a recent commit in a project you're in, and you forgot that you had extra usage on, you're about to accidentally spend a lot of money. The only way this happens is if you [ __ ] suck at coding, which thankfully it seems like the entire Anthropic team does. And I am sorry to my friends who are working on Claude Code: it's time to [ __ ] quit, because this is pathetic. I tried this thinking there's literally no way they're that incompetent, there's no way this is going to work.
And it did. Again, you cannot trust them to explain how this [ __ ] works because they're stupid and they lie. I hate that my conspiracy theories keep getting proven [ __ ] true. This is actually [ __ ] insane. It's not a conspiracy theory if it's pattern recognition. That's a fair point. Yeah, thank you for the clip. This is my favorite. If you ever want an anthropic employee to stop replying to you, all you have to do is ask for clarity because then they have to go talk to comms and legal who will say you're not allowed to respond and then you'll never hear from them again.
This happens all of the time. All of the [ __ ] time. So now that I've demonstrated this genuinely egregious behavior, I want to show how this affects things for end users. The guy who created Pi made this wonderful tool called CC History that lets you check how the Claude Code system prompt changes over time. I don't care about the over-time part here; I just care about the git part. I can't find it in here. I was hoping I would be able to, but there is a place in the Claude Code system prompt where your recent git history is included, because having that there is useful for the model to know where things are without having to run commands to get the data.
It shouldn't have to do three steps before it knows what your state is. I understand why they do it this way; it makes sense. The reason is that they don't want the model to have to do as much work to do the next thing properly. But you can't do nice things like that when your engineers suck at their jobs and you're not doing this in good faith. In the same system prompt that they're adjusting based on your git history, they're also searching for certain key phrases that they use to identify whether you're using claude -p with other tools like OpenClaw or Hermes.
If you're not familiar with Hermes, the thing we've been dancing around this whole time: Hermes Agent is a new open source tool, an alternative to OpenClaw, that is more minimal, slightly more customizable, and very well loved by the people I know using it. I haven't had a chance to try it yet; seems really cool. Hermes Agent could also work with your Claude Code sub, which got cut off. So they moved to using the claude -p thing, similar to how OpenClaw was working. So Anthropic tried to ban them the same way, with the system prompt.
And what I mean by "ban with the system prompt" is that the API that receives the request you made as a user looks at the text in the body of the system prompt, checks if there's anything Anthropic doesn't like, and routes you to different billing if there is. So the problem that was run into here, the issue this user had, is that in their commit history they mentioned the phrase hermes.md, which is a system prompt spec file used in Hermes Agent projects so that the agent knows where things are, similar to CLAUDE.md, AGENTS.md, all of those things.
hermes.md, if I recall, wasn't even in the project. It was just mentioned in the commit messages. And even though the user was using Claude Code properly, they hit the same thing I did just a second ago, except they didn't have the extra usage toggle off. They had it on. And since they had it on, they got erroneously billed $200, because they had a string that Anthropic didn't like inside of their commit history. Do you understand how egregiously you have to suck at your job as a developer to write code in such a way that you accidentally bill a user $200 because they had the wrong string in their commit messages?
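Nobody outside Anthropic has seen the real detection code, so the following is purely a hypothetical sketch of the failure mode described here: a naive substring blocklist checked against a system prompt that also embeds recent git history will false-positive on an innocent commit message. The marker list and function names are invented:

```python
# Hypothetical sketch only; the blocklist and logic are invented to
# reproduce the reported failure mode, not Anthropic's actual code.
BLOCKED_MARKERS = ["openclaw", "hermes.md"]

def build_system_prompt(appended: str, recent_commits: list[str]) -> str:
    # Claude Code folds recent git history into the system prompt so the
    # model has repo context without running extra commands.
    return appended + "\nRecent commits:\n" + "\n".join(recent_commits)

def is_third_party_harness(system_prompt: str) -> bool:
    # Naive substring check over the *entire* prompt, git history included.
    text = system_prompt.lower()
    return any(marker in text for marker in BLOCKED_MARKERS)

# An ordinary Claude Code session, no third-party tool anywhere, whose
# repo merely *mentions* hermes.md in a commit message:
prompt = build_system_prompt("", ["chore: add hermes.md to docs"])
assert is_third_party_harness(prompt) is True  # false positive -> surprise bill
```

The two features are individually reasonable; the bug is that the detector scans a string that user-controlled git history flows into.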
It is not like a prompt injection or some malicious thing trying to hack them. They just casually mentioned HERMES.md in a commit message and it cost somebody $200 out of their pocket because they're that bad at coding. I cannot emphasize enough: the only way this happens is if you are so [ __ ] bad at coding, and this is the result of them just vibing away their problems. So how did they respond? Our friend Thor hopped in here: "Sorry, this was a bug with the third-party harness detection." So once again, I am not a conspiracy theorist.
Well, I am. These are conspiracy theories, but my [ __ ] conspiracies are correct. They have now publicly acknowledged their third-party harness detection, which is a fancy term for "we read your system prompt and charge you more money based on whether we like it or not." Specifically: "this was combining poorly with how we pull git status into the system prompt; reaching out to affected users and giving them a refund plus another month of credits," in this case another $200. To his credit, that's a pretty human response, totally fine response-wise. But here's my issue: I don't take issue with how he responded at all. I take deep issue with this problem being [ __ ] possible in the first place.
So I spent a lot of time workshopping this reply in order to cut as deep as I possibly could: "There's a certain class of bugs that suggests the thing you're trying to do is a bad idea. Worth reflecting on that here." The point I was trying to make is pretty simple, concise, and real. If you've maintained software for any meaningful amount of time, with actual users and actual features, you know there are certain types of bugs that don't suggest you should fix them. They suggest you should nuke the entire path and maybe even reconsider what you're building in the first place.
These are two things that should not be related at all. Their detection of whether you're using a third-party harness, and the git history being part of the system prompt, are two separate, unrelated things. If those are combining, overlapping, and causing each other problems, that suggests there are fundamental flaws in the way you're building the software that need to be reconsidered. This is no longer recoverable. I have been somewhat harsh to this team. I would argue I've been reserved considering the [ __ ] [ __ ] I've been seeing. We're no longer at a point where you can change course, rethink how you want to engage with the community, fix your terms of service, and redo things.
The actual product is rotting as a result of your hatred of third-party harnesses. You so desperately, as a business, want to lock people into Claude Code that you're wasting time on this absurd [ __ ] [ __ ] and making the product worse because you're trying to prevent users you don't like from using your service. This is so [ __ ] absurd. My stupid demo example, the one I was sure would not work, the quick "there's no way they're actually doing this" type thing, and it did it. Are you joking? I can't fathom it. It's actually unbelievable.
Sure, cool. The people who got hit by it got refunds. I don't [ __ ] care, because the core of what causes these things is so [ __ ] rotten. And as much as I'm giving the team [ __ ], and they deserve it at this point, I was previously in the "oh, it's the leaders, the team's trying their hardest" camp. I don't [ __ ] care anymore. If you're staying there through all of this, you either already know I am correct and you're just smiling and bearing through it because you want to collect your vested stock, or you're part of the problem. I don't distinguish anymore.
I don't presume better from the people who are still working on this. Either you know how bad it is and you just want to collect your [ __ ] paycheck, and you put on a smile at work so nobody knows that you actually hate what the [ __ ] is going on; that I totally understand. And if you privately tell me, "hey, I think Anthropic is evil, but I want to make my $4 million and leave," you're not part of the problem. I don't mind. I totally get it. And I will never publicly share that you've disclosed that with me.
Believe me, I've heard it enough times. If I was going to disclose that, I would have by now. So either that's the case, you're resting and vesting because you know how [ __ ] things are, or you're part of the [ __ ] problem. There is no in between. You are one or the other. And I will let each and every one of you people working on Claude Code decide which you are now. You either know that this is really bad and you're choosing to make your money, in which case I get it, I support you; nobody should turn down $4 million because of a [ __ ] evil [ __ ] company.
Make your money, smile, and leave. Or you're drinking the Kool-Aid, and I'm not going to support that. That all said, this comes down to a hierarchical problem. I'm going to present one of my recent favorite questions: is there any public example of Dario, the CEO of Anthropic, talking about Claude Code and the team? Sometimes it feels like he's ignoring the product side of Anthropic entirely. We have found one single example of him bringing up Claude Code, in a Reddit comment, and nothing else. No acknowledgement of the product or the team or anything related to it.
Is there any example of Sam Altman going more than 24 hours without talking about Codex and how proud he is of the team? The problem here is that Dario looks down on developers. He genuinely has a deep disdain for software developers. This is a somewhat common thing in the research world, where there's just frustration with developers, and developers have frustration with researchers. There are places where this is better. I have some really cool people reaching out to me from the research side that I collaborate with, and they're so hyped because they're not used to developers being nice to work with.
A big part of why Dario left OpenAI is that he didn't like working under two engineers: GDB the CTO and Sam the CEO, who founded OpenAI, were both engineers. Dario hated working under engineers. So he left under the guise of safety, but it was really just hating having engineers as his boss. And he still, to this day, [ __ ] hates engineers. That's why he doesn't talk about the engineers at his company. He doesn't like them. He doesn't like any engineer. And to my friends who are engineers at Anthropic: I hope you understand your boss's boss's boss hates you and your guts, hates you for existing, and hates you for taking GPUs away from his beloved researchers.
But he accepts that you have to exist because it's the only way he can raise the money he needs to keep going. You are a pawn. You are no more than that to him. If anything, you are less than that. And that's the problem. He legitimately looks down on developers, and now you're stuck dealing with that. And you deal with it the way anybody does in an abusive relationship: you do the same thing downstream. If he hates you for building Claude Code, you hate your users for using it. And that's the problem. Dario doesn't like you.
And as a direct result of that, you don't like your users. And it's hard to recognize that from the inside. I know this is going to cause a lot of you to have some sleepless nights as you think about it, but please think about it. To my friends at Anthropic: it's time to really reflect on what the [ __ ] is going on. Your engineering culture is actually the most toxic cesspit I've ever seen in my life. The way that you guys think about yourselves and talk about yourselves, internally and externally, is not cultish.
It is a cult at this point. The disdain you have for your users is so [ __ ] absurd, I can't believe it. Oh, just another fun fact: we got locked out today. Our users on T3 Chat couldn't use T3 Chat with Anthropic models today because your API service is so unreliable for billing that your own automatic rebill failed. And then we got a bunch of emails telling us that we didn't pay you for the inference and you were going to come after us for it. No, your code didn't [ __ ] run. You failed to bill us because your API call failed, because your code sucks.
And now our users are complaining that Anthropic models don't work and they're seeing a billing error, while we see all these emails saying we didn't pay you, because you failed to [ __ ] charge us. Anthropic has the actual worst engineers I've seen outside of a [ __ ] boot camp. And I've seen boot camp engineers that are better than this [ __ ]. There are no more excuses. If I make you feel bad and you're an Anthropic employee, I'm sorry, but you should [ __ ] feel bad. You work at a terrible place that does terrible things and writes terrible code and hates their users.
You just do. And I'm tired of being nice because I have friends there and I don't want to make you guys feel bad. I am sorry. Many of you have talked to me privately and know how much I hate the fact that when I go after Anthropic, it directly, negatively affects you. And I have toned down much of my rhetoric and tried not to target the team as a result. There are no more [ __ ] excuses, though. You need to feel bad if you work at Anthropic and you work on Claude Code and you work on the [ __ ] that makes these types of things happen.
You should feel bad. You should feel guilty. You should feel pain when you listen to my videos, because you're part of a [ __ ] up system. You're part of a [ __ ] up company. And I'm tired of playing with kid gloves. I am tired of being nice about any of it. I have been reserved historically, and that reservation is over. I am no longer going to hold back. And a reminder, before I get the paid-shill allegations: I spend $40,000 a month on [ __ ] Anthropic. If you wonder why I like OpenAI, it's not because they pay me. It's because I pay them.
And I pay them less for better services, better customer support, better integrations with my [ __ ] life. This is a company that I spend a bit over half as much money on that doesn't [ __ ] me over constantly. Anthropic is a company that I spend 40 grand a month on, and it yells at me for not paying them because their own API is down. That's why I like OpenAI. They don't punch me for trying to use them constantly. And when I tell them things are wrong, not only do they listen, they pull me into calls with researchers to try and explain the weird behaviors I'm seeing.
There are things I don't like about 5.5, and I've not held back on those things. A lot of the public doesn't even agree with me; I have been harsher on 5.5 than most people have been. But I know 5.6 will be better, because the researchers were very genuine and earnest with me when I explained the problems, seemed to understand where some of those issues would have come from, and they're going to make 5.6 better as a result. I know that because they have done this historically. When I have issues, they listen. They bring me to the right people and they fix them, and they also cost me half as much money.
When I have problems with Anthropic, I just expect them to get worse. At this point, I genuinely feel like I might be hurting more than helping by talking about these things, because they have decided that anything I say isn't a real problem anymore. They look down on me so much that anything I say, they choose to ignore going forward. So the fact that I am calling out the egregious mismanagement at the company, if anything, is making it less likely that they fix it now, because they genuinely hate me that much. And if you guys are going to act like I'm spewing [ __ ], I'm not going to hold back anymore.
You've made me an enemy, an enemy that spends $40,000 a month on your [ __ ] services. You've treated my friends like [ __ ]. You've treated me like [ __ ]. You've treated my customers like [ __ ]. You've treated everybody in your path like [ __ ]. And I'm gonna let the world know you're [ __ ] [ __ ]. It's over. I got nothing else on this one. We're not going to do a fancy wrap-up or anything. [ __ ] you, Anthropic. Get over your [ __ ]