Mythos is here, it’s time to start tokenmaxxing

Theo - t3․gg| 00:30:40|Jun 12, 2026
An overview of Fable Mythos availability, the pricing subsidies, and why the creator is rushing to explore its value before June 22nd.

Mythos/Fable in Claude Code opens massive token budgets for a short window. Theo shows practical tricks to max them out efficiently and responsibly.

Summary

Theo (Theo - t3.gg) dives into the Mythos (aka Fable) release on Claude Code, explaining why he's pushing it to the limit before June 22 and what the cost implications look like. He shares real-world numbers from his two $200 Claude Code plans and how he's burned thousands of dollars' worth of inference in just days, all to demonstrate what large-scale agent workflows can do. The video isn't about token-hoarding; it's about strategies to maximize meaningful work within generous subsidies while the limits exist. Theo reveals concrete setups, using Hermes, OpenClaw, and Claude Code workflows, to automate account switching, manage rate limits, and orchestrate multi-agent reviews and PR workflows. He shows off practical use cases like surfacing merge-ready PRs across the Lakebed and T3 Code repos, generating HTML plans for agent reviews, and running complex workflows with sub-agents and judges. Throughout, he cautions about sleep disruption and cost, and about the need to stay excited and creative rather than fear-driven about tooling. The sponsor segment highlights Render for enterprise-ready, infrastructure-first hosting with YAML blueprints, private networks, and infrastructure-based pricing instead of per-seat pricing. Theo ends with a rallying call to push the boundaries of what developers can accomplish with AI agents, while acknowledging the real-world constraints of token limits and costs.

Key Takeaways

  • Two Claude Code plans at $200 each can yield up to $8,000 per month in inference subsidies, a great but finite window to experiment.
  • Theo logged $4,358 of inference in 10 days on one laptop and an additional $1,12 on a Mac Mini, illustrating how quickly tokens can rack up in real-world use.
  • He demonstrates account-switching and multi-account orchestration to maximize rate limits, including 5-hour and weekly model quotas that reset on a schedule.
  • Using Hermes, OpenClaw, and Claude Code, he automates tasks across multiple repos (Lakebed, T3 Chat, T3 Code) to surface merge-worthy PRs and run workflows with many sub-agents.
  • HTML plans and agent-to-agent communication enable transparent, easily digestible reviews (e.g., code reviews generated by Codex and Mythos).
  • The video shows practical remote-work setups (Mac Mini, Tailscale, cmux, screen sharing) to keep long-running agent tasks alive away from the keyboard.
  • The overarching message is to be creative and ambitious with token usage, rather than chasing cost-per-token—build new tools, raise the bar on what you automate, and iterate quickly.

Who Is This For?

Essential viewing for AI/ML developers and software engineers who want to maximize agent-powered coding workflows, especially those experimenting with Mythos/Fable, Claude Code, and large-scale PR automation.

Notable Quotes

"It's such a powerful model and I've been pushing it to its absolute limits to the point where it's affecting my sleep, my day-to-day, and of course the content I'm putting out."
Theo explains the intensity of pushing Mythos/Fable to the limit.
"Those subsidies are crazy. We're talking like $8,000 a month of inference for just $200."
Concrete pricing illustrates the scale of subsidies being used.
"This isn't about maximizing how much you get per token or per dollar. It's how you can take advantage of these generous subsidized plans in the 10 days we have left with Fable on them."
The video clarifies the goal is practical value, not mere token-maxing.
"I'm using Ultra Code. I don't recommend using Ultra Code and the workflow feature in Claude Code if you're trying to conserve tokens at all."
Candid tip on choosing modes based on token objectives.
"If you go into these things excited to push the limits of what you could do and build, what you'll get back is incredible."
Motivational closing thought on mindset and outcomes.

Questions This Video Answers

  • How can I maximize Claude Code's rate limits with multi-account setups?
  • What are HTML plans and how do they help with agent reviews in Mythos/Fable workflows?
  • What are the practical steps to surface merge-ready PRs using AI agents?
  • What are the best remote-work setups for long-running AI workflows on macOS?
  • Can I use Render to host AI-powered automation pipelines and what are the trade-offs?
Mythos, Fable, Claude Code, OpenClaw, Hermes agent, Cloud Code, Sen/Render, HTML plans, PR automation, Lakebed project
Full Transcript
You might have noticed I haven't been posting as many videos the last few days. And there's a good reason for it. It is indeed the Fable release. It's such a powerful model, and I've been pushing it to its absolute limits, to the point where it's affecting my sleep, my day-to-day, and of course the content I'm putting out. I don't want to do a normal video where I just show some of the cool things I've been doing with it, because this release is different. As you all probably know by now, Fable, aka Mythos with some safeguards, is available on the Pro and Max tier subscriptions on Claude Code, but only until June 22nd, because as of June 23rd it will be removed from those plans. The model seems to be just too expensive and too compute heavy for them to run in a subscription tier that is as subsidized as it is. And those subsidies are crazy. We're talking like $8,000 a month. These plans can get you up to $8,000 a month of inference for just $200. And I'm not talking out of my ass here. I'm speaking from experience. I've spent the last few days going hard on my subscriptions for Claude Code, as well as Codex to an extent, and I've managed to do $4,358 of inference in the last 10 days alone. There's a catch, though. That's just on this laptop. I also have a Mac Mini I've been doing a lot of work on, too, and this one's up to $1,12 of additional inference on top of all of this. That's a shitload of tokens, and I want to show how I've been using them to get real work done.
And I want to be clear, not all of this is super useful. This is not a video about maximizing how much you get per token or per dollar. It's kind of the opposite. It's how you can take advantage of these really generous subsidized subscription plans in the 10 days we have left with Fable on them, which is why I'm rushing this video out, of course, so that you can get the most possible value and see and taste what the future can kind of look like with models this powerful, assuming cost is no issue.
Obviously, this is not realistic long term. Nobody should be spending 10 grand a month personally on inference for day-to-day work. But spending $200 to $400 and getting that much inference, that is kind of compelling. I've never burned so many tokens in my life. In fact, I'm pretty sure I did more in the last 10 days than I did in the rest of my life prior to that. And I've learned a ton since. The good, the bad, the ugly, and more. From loops to OpenClaw and Hermes agent to crazy workflows that will automate value being poured out of your code bases and more. There's a lot to dig into here and a lot of cool opportunities to maximize your usage of these tools and models. But before I can dive into all that, I highly recommend maximizing on today's sponsor.
AI's gotten great at coding, but deploying, not so much. As great as things like AWS are, good luck trying to configure things properly using an agent, especially once you're trying to do multiple things, have preview environments, spin your team up as well. It's just not a great story. And that's why Render is so, so cool. These guys built an enterprise-ready cloud that is also agent ready. While every other cloud is struggling to keep up with what's going on, they're making nice changes like dropping the per-seat pricing and instead just charging you for your infrastructure. So, it doesn't matter how many people or agents you have on your team. All that matters is how much you're shipping. You can use Render to host anything. Web services, databases, cron jobs, workflows, static sites, CDN content, KVs, everything is supported by these guys. And they have integrations for everything you'd ever want. And as I mentioned before, these guys are actually enterprise ready. They have private networks built in, so your systems don't have to communicate over the internet. This is huge for building real systems. And you can configure this all with blueprints: simple YAML files that describe your entire infrastructure.
Imagine if Terraform went way deeper and further. That's what you get with their blueprints. That said, if you want Terraform, they support that, too. Don't worry. And if you're building heavily distributed work with lots of pieces with high failure rates, Render Workflows are going to save your butt. It's a simple system for queuing and scheduling your work that has a great SDK for TypeScript and for Python, making it easy to make resilient, durable workloads on the same cloud that hosts your database, your CDN, and more. If you're not convinced yet, would $50 convince you? Because they'll give you 50 bucks of credit if you use code render Theo. Try it now at zoidv.link/render.
Before we talk about burning all of your tokens, I want to talk about getting as many as possible, as in maximizing your use of the limits that you have. As I mentioned in my last video, I'm actually dual wielding right now. I have two Claude Code $200 plans that I rotate between whenever I max one out. And I found a lot of ways to gamify that, which we'll talk about in a second. But first, I want to talk a little bit about how these rate limits work. This is my second account, which I haven't had to use today, which means that my 5-hour session hasn't started. This is the first important thing to start hacking around. You want this timer resetting as often as possible. Realistically speaking, I can hit the 5-hour limits now in about an hour. So if I don't have this timer ticking and I start a work session that will kill my limit in an hour, I then have to wait 4 hours for it to reset, because the timer doesn't start until you send a message. Thankfully, this is easy to work around. This is all it takes. I am just in the claude.ai site. I'm going to say hi. I'm going to send. I'm going to stop before it does any real work. And now the timer is ticking. It will now reset in 4 hours and 50 minutes even though I haven't dinged any of my usage.
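The timer arithmetic above can be sketched in a few lines of Python. This is a simplified model, not Anthropic's actual accounting: it assumes the window opens on the first message, usage resets fully when the window ends, and an expired window is replaced by a new one at the next message. The function name `idle_wait` and the example times are made up for illustration.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)  # length of one session window, per the video


def idle_wait(first_message: datetime, work_start: datetime,
              time_to_limit: timedelta) -> timedelta:
    """Time spent locked out after heavy work exhausts the session limit.

    Assumes first_message <= work_start. Simplified model (an assumption):
    an expired window is replaced by a new one starting at work_start.
    """
    window_start = first_message
    if work_start >= window_start + WINDOW:
        window_start = work_start  # old window expired; work opens a new one
    reset_at = window_start + WINDOW
    limit_hit_at = work_start + time_to_limit
    # If the reset arrives before the limit is hit, there is no lockout.
    return max(reset_at - limit_hit_at, timedelta(0))


if __name__ == "__main__":
    nine = datetime(2026, 6, 12, 9, 0)
    one_hour = timedelta(hours=1)
    # Cold start: the first message is the heavy work itself.
    print(idle_wait(nine, nine, one_hour))
    # Pre-warmed: a throwaway "hi" was sent 4.5 hours before the real work.
    print(idle_wait(nine, nine + timedelta(hours=4, minutes=30), one_hour))
```

In the cold-start case the function returns the four-hour wait Theo describes; in the pre-warmed case the reset lands before the limit would be hit, so the wait is zero.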
Okay, a tiny, tiny bit was probably used, but basically nothing. Doing this gets you reset as soon as possible. So now if in 4 hours, let's say 4 and a half hours, I start working, I don't have to worry about hitting my limit because it's going to get reset in 30 minutes. That's only one of the limits you have to worry about, though, because there are two. You have the current session limit and separately you have the all-models weekly limit, which resets once a week. Similar to the 5-hour window, this one starts counting down after you send your first message. Again, incentivizing you to make sure you go hop somewhere and trigger something that gets these limits burning.
I want to automate this, though, which you'll notice is a theme for a lot of the things I want to show off here. I could keep going to the website. I could even have a browser-use agent go there for me. But at this point in time, claude -p still counts towards your rate limits. That will change in the near future, which means it probably won't work for that long, but at the very least, the strategy will work now. So, I'm going to set something up. This is not meant to be like, oh, look at how smart I am. It's meant to try and get you into the mindset of taking advantage of the limits that we have today. I'm currently using Hermes agent inside of Discord as my main, like, default AI agent that does random work on my Mac Mini, and I'm going to take advantage of that here. It already has Claude Code set up. So, watch and learn. "I want to make sure my Claude Code account has a recent message at all times. Set up a cron that triggers every 5 hours. It should run in an empty directory. The command should be claude -p." And now my Hermes agent is going to go set that up. So going forward, I will always have something triggered on that account to keep my rate limits moving at all times.
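What the agent ends up configuring isn't shown, so here is only a rough sketch of what such a keep-alive could look like. It assumes the `claude` CLI is on the PATH and that its `-p` (print) flag takes a prompt; the prompt text `"hi"`, the script path, and the `--run` switch are all made up for illustration.

```python
import subprocess
import sys
import tempfile

# Assumption: "hi" mirrors the throwaway message sent on the website earlier.
PING_PROMPT = "hi"

# Five-field cron expression: minute 0 of every fifth hour (00:00, 05:00, ... 20:00).
CRON_SCHEDULE = "0 */5 * * *"


def build_ping_command(prompt: str = PING_PROMPT) -> list[str]:
    """Return the argv for one non-interactive Claude Code call."""
    return ["claude", "-p", prompt]


def crontab_line(script_path: str) -> str:
    """Return a crontab entry that runs this script on the schedule above."""
    return f"{CRON_SCHEDULE} {sys.executable} {script_path} --run"


def ping(runner=subprocess.run) -> int:
    """Send the ping from a fresh, empty directory and return the exit code."""
    with tempfile.TemporaryDirectory() as empty_dir:
        result = runner(build_ping_command(), cwd=empty_dir,
                        capture_output=True, text=True, timeout=120)
    return result.returncode


if __name__ == "__main__":
    if "--run" in sys.argv[1:]:
        raise SystemExit(ping())
    # Dry run by default: only show the crontab entry (hypothetical path).
    print(crontab_line("/path/to/keepalive.py"))
```

One detail worth knowing: `*/5` in the hour field restarts at midnight, so the gap between the 20:00 and 00:00 runs is four hours rather than five. For this purpose that only means one ping lands inside a window that is already open.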
Also, just a pro tip I got from Ben: I much prefer using OpenClaw and Hermes in something like Discord, because every message gets its own thread that I can keep following up in to maintain context. It's really, really nice. And I'm using this for a ton of different stuff. Obviously, you can't use your Claude Code auth directly inside of a tool like this, because Anthropic really wants you to use Claude Code with your sub, not tools like this. But you can use your Codex sub with this if you want to min-max that too, which is what I do. It works really good. Fantastic. Now that that's done, I'll never again have to worry about my limits not counting down. Again, this isn't trying to get more out of your plan. Like, I'm not trying to push past my weekly limit or anything here. I'm just trying to make sure that you can get to 100% and get your reset as quickly as possible.
But as I mentioned before, I'm juggling two accounts. This might seem really, really annoying to do, especially if you're using the Claude desktop app, because when you sign out and into a different account, you lose all of your history, all of your sessions, all of your everything. I put out a warning about this in the Claude Code desktop app for us account-switching folks. And Anthony from Anthropic actually confirmed that they're going to fix this behavior soon. I don't think that will happen within the range we have Fable for in the subscriptions, so I wouldn't count on this. And honestly, the CLI still is a much better tool than the desktop app at this point in time. Desktop's made meaningful improvements. I did my video roasting it, but I tried it a bunch the last few days. I wouldn't recommend it. So, what does account switching look like when you're actually doing real work with these tools? I could tell you how to take advantage of the multiple accounts, but I'd rather just show you. And when I do this, I'm also going to show one of the real use cases I've been doing with this stuff.
I've talked a bunch about my recent project, Lakebed, in various videos. I'm trying to make a better cloud for building apps that agents can easily operate without needing to access dashboards, deal with API keys, and all those types of things. And I have a lot of work in flight on this project right now. In fact, some of it overlaps. PRs 35, 37, and 39 are all attempts to add file storage to Lakebed that have their own benefits and negatives, respectively. 35 and 37 differed a lot in their implementations. I had Codex and Claude Code both building their own solutions and then comparing each other's, and they couldn't have disagreed more. But when I synthesized the best parts of both and made a new PR, 39, it came out a lot better. I'm still not 100% confident, though, and rather than doing the normal thing, which is sitting there and reading all of this yourself, I'm going to be lazy and take advantage of my limits. I have Claude Code open in a random work tree of this repo and I'm going to tell it to get some work done here. Notice that I'm using Ultra Code. I don't recommend using Ultra Code and the workflow feature in Claude Code if you're trying to conserve tokens at all. But if you're trying to hit those limits, you're trying to take advantage of your remaining inference before a reset comes up, doing things like this is actually really useful and finds more information than I would have expected.
I'm going to Wispr Flow this one cuz I don't feel like typing and talking at the same time. So forgive me. "I currently have three pull requests open on this project, 35, 37, and 39, all of which are implementing roughly the same feature, which is user-facing object storage that developers can implement in their Lakebed apps. I want to decide which one is the best choice to merge.
Make a workflow where you break these PRs up and have judges review each of them independently to conclude and figure out which one is best and what pieces of each we can bring into the best solution. Audit all of them independently and help me come to a conclusion as to which of these PRs should be the one that we continue iterating on and merge eventually." I put the word workflow in here specifically to trigger it to start a workflow, because workflows are a great way to do this type of giant bulk work. I've actually found it very nice for this type of thing. It's now creating the workflow and orchestrating all of these sub-agents to be the judges. And I'll show what that looks like in Claude Code because it is really cool.
What I want to show you guys first is the account maxing that I've been doing. So here I have my personal Claude Code account and here I have my secondary one I was mentioning before. The personal is the one I currently have signed in. So, if I refresh, you're going to see that usage start to tick up pretty fast. Okay, not that fast, because it's still figuring out what the workflow is. So, I only have one thread going. And honestly, running Mythos, or Fable in this case, 24/7 in just one thread, so you only have one going at a time, you're probably not going to hit the limits too aggressively. But when you start getting workflows going where you have eight or more running at the same time, you'll burn through those limits fast, like easily under an hour. Okay, it's still setting up the environment, but we do see the percent starting to go up. We're now at one. It'll go to two momentarily. Well, it would, but I want to demo the account swapping. It's a really, really complex process. I'll show you just how complex. You run Claude Code. You run the /login command. You grab the URL that it puts out and you put it in whatever browser profile has the Claude Code account that you want.
I'm using Helium, which has browser profiles just like Chrome does, so I can swap between the two. The purple is for my secondary account. I authorize here. I go back to my terminal. And now I'm logged in on a different account. And ready for the crazy part? We had this workflow going, generating tokens. The next turn it takes, as in the next time it does a tool call or kicks off a workflow or starts a next step of any form, that one's going to use the new auth token that it just got, and it's going to start routing to the other account. It doesn't care when you swap accounts mid-session. So if I have five workflows going across different projects, or different work trees on the same project, I can make sure that I'm not going to have them all stop immediately because I hit a limit, which is super annoying, by the way, because it doesn't recover these workflows well. So, if you have a workflow with like 100-plus sub-agents, you get to the 94th, and then your limit hits, you often have to rerun the entire workflow. So, it's worth it to keep an eye on how close you're getting and swap your auth out before you hit those limits. Or you can do what I do, which is burn a shitload of money by turning on overage credits. And once you see those getting burned, then you switch over. But now you see this account is no longer going up, because this is not the account that the traffic is being routed through.
Okay, now the workflow is going and you can see just how many tokens get burned with these things. We have the audit stage, which will be 13 separate agents, and depending on what their results are, they'll pass it on to the judge section, which will have even more, then verify, harvest, and synthesize. This run looks like it might be 100-plus of these sub-agents, which is going to be crazy. Burns absurd tokens. We're already at 368,000 tokens because it's running eight of them in parallel right now.
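The internals of the workflow feature aren't shown in the video, so the following is only a sketch of the general shape being described: an audit stage fanned out across sub-agents, a judge stage that consumes the audit reports, and a cap of eight running at once. `run_agent` is a hypothetical stub that stands in for a real model call, and the audit aspects are invented for the example.

```python
import asyncio
from collections import defaultdict

MAX_PARALLEL = 8  # the run in the video had eight sub-agents going at once


async def run_agent(role: str, prompt: str) -> str:
    """Hypothetical stub for a sub-agent; a real one would call a model."""
    await asyncio.sleep(0.01)
    return f"[{role}] {prompt}"


async def run_stage(role: str, prompts: list[str],
                    limit: asyncio.Semaphore) -> list[str]:
    """Run one stage, never exceeding the shared concurrency limit."""
    async def guarded(prompt: str) -> str:
        async with limit:
            return await run_agent(role, prompt)

    # gather() returns results in the same order as the prompts.
    return await asyncio.gather(*(guarded(p) for p in prompts))


async def review_prs(prs: list[int], aspects: list[str]) -> dict[int, str]:
    limit = asyncio.Semaphore(MAX_PARALLEL)
    jobs = [(pr, aspect) for pr in prs for aspect in aspects]
    audits = await run_stage(
        "audit", [f"Audit PR {pr} for {aspect}" for pr, aspect in jobs], limit)

    # Group audit reports by PR so each judge only sees its own PR's audits.
    by_pr = defaultdict(list)
    for (pr, _), report in zip(jobs, audits):
        by_pr[pr].append(report)

    verdicts = await run_stage(
        "judge",
        [f"Judge PR {pr} given {len(by_pr[pr])} audit reports" for pr in prs],
        limit)
    return dict(zip(prs, verdicts))


if __name__ == "__main__":
    aspects = ["correctness", "security", "API design", "tests"]  # made up
    for pr, verdict in asyncio.run(review_prs([35, 37, 39], aspects)).items():
        print(pr, verdict)
```

The semaphore is the piece that matters for rate limits: raising `MAX_PARALLEL` finishes the run sooner but burns through the session window proportionally faster.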
So now if we go here, you'll see these rate limits for my second account are going up super fast. We're already at 5% when it took the last 20 minutes to even hit 1% here, because now we're running it 8x harder. But now this account isn't going up anymore cuz it's not the one being used for this sub-agent run. Super helpful for hopping between accounts. And the fact that you can just do /login in any Claude Code terminal on a given machine and it updates all of the things on that machine is wonderful. Jesus Christ, we're almost at a million tokens down and this has been like under a minute. Yeah, this percentage is going up fast. Like, this is real time. We're already at seven. This is why you should be careful with your workflows.
But you should also be careful with your weekly limits. My two accounts reset on Wednesday and Thursday, respectively, right now. And I'm already at pretty high percentages on both. I'm expecting to max out my weekly on at least one of these accounts, and I'm kind of counting on a reset for some reason. Hopeful that they'll have some reason to reset things, but even if they don't, I'll probably just grab another $200 account and push it to its limits as well, specifically during this short testing window of 10 days where we can use Fable. I will not be going anywhere near this hard on Anthropic models when Fable is no longer in the subscription tiers. But when it is, I'm pushing my limits. So, you might be wondering now, how big are those weekly limits? Well, from my experience, just from basic testing and this secondary account in particular, it seems like you can get roughly 25% of your weekly when you hit 100% of your 5-hour. Put simply, you can max out a 5-hour window four times in a given week before you're out of usage. So, should you always be trying to hit the 5-hour limits? Maybe, especially during the end, but you should absolutely be aiming to hit the weekly limit to get the most out of your usage.
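The budgeting here is simple enough to write down. A small sketch, taking the 25% figure as Theo's rough observation rather than a documented rule; the function names are made up.

```python
# Theo's rough estimate: one maxed 5-hour window uses about 25% of the week.
WEEKLY_SHARE_PER_FULL_SESSION = 0.25


def full_sessions_per_week(
        share_per_session: float = WEEKLY_SHARE_PER_FULL_SESSION) -> float:
    """How many maxed-out 5-hour windows fit into one weekly limit."""
    return 1.0 / share_per_session


def sessions_remaining(
        weekly_used: float,
        share_per_session: float = WEEKLY_SHARE_PER_FULL_SESSION) -> float:
    """Maxed-out windows left, given the fraction of the week already used."""
    return max(0.0, 1.0 - weekly_used) / share_per_session


def subsidy_multiple(inference_usd: float, plan_usd: float) -> float:
    """Dollars of inference received per dollar of subscription."""
    return inference_usd / plan_usd


if __name__ == "__main__":
    print(full_sessions_per_week())
    print(sessions_remaining(0.5))
    print(subsidy_multiple(8000, 200))
```

With the numbers from the video, that is four full sessions per week per account, and a subsidy of about 40 dollars of inference per dollar spent at the quoted ceiling.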
So, I've already burned like $400 of inference just with these basic tests as I've been filming. And you might be thinking, "Wow, that's a lot of inference for things that haven't actually panned out much yet. Like, you're reviewing three slop PRs and trying to decide which one is best. Like, how valuable is that?" First off, that is actually quite valuable when you're trying to figure out how to deal with a giant pile of PRs. But second off, you can modify this slightly to be way more useful. Here's a much more realistic example that I find is actually super useful in my day-to-day work. I maintain a handful of different repos, from Lakebed, which still isn't public yet, to T3 Chat, which also isn't a public repo, but at the very least has real people contributing every day, to T3 Code, which is a big open source project that has a lot of people throwing stuff at it. Even with our best efforts, keeping it under 400 issues and under 300 PRs feels nearly impossible. So rather than try to do that, I've been putting more effort into highlighting and surfacing the best work that is worth pursuing. Reading all those PRs and coming up with all those changes is way too much work for any reasonable human to even consider trying, which is why I don't. I let my agents do it, because they don't get bored. They can just sit there and do the thing indefinitely.
So one of the things I have Mythos doing for me, well, Fable, same difference, is every morning going through all of my PRs on all of my repos and helping me surface the ones that are the easiest to merge, the most justifiable to get done, etc. This example is using my Hermes agent on top of my Codex sub, but this is the one that Mythos made, and it's really good. I ran this one on just T3 Code and it built this ranked queue section where it goes through every PR that's currently open, gives it a status, and ranks them based on how easy is this to just go merge and how much of my attention does it deserve.
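In the video the ranking is a judgment call made by the agent after reading each PR. To make the shape of that ranked queue concrete, here is a toy version with a hand-written heuristic instead of a model. Every field, weight, and sample PR below is invented for illustration and is not how Theo's agent scores anything.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    number: int
    title: str
    lines_changed: int
    ci_passing: bool
    fixes_bug: bool
    has_conflicts: bool


def merge_readiness(pr: PullRequest) -> float:
    """Toy score: higher means easier to merge. The weights are arbitrary."""
    score = 0.0
    if pr.ci_passing:
        score += 3.0
    if pr.fixes_bug:
        score += 2.0
    if pr.has_conflicts:
        score -= 4.0
    # Bigger diffs need more attention; cap the penalty at 3 points.
    score -= min(pr.lines_changed / 200, 3.0)
    return score


def ranked_queue(prs: list[PullRequest]) -> list[PullRequest]:
    """Return PRs ordered from most to least merge-ready."""
    return sorted(prs, key=merge_readiness, reverse=True)


if __name__ == "__main__":
    sample = [  # made-up PRs, not real ones from any repo
        PullRequest(2, "Large refactor", 1800, True, False, False),
        PullRequest(3, "Stale bug fix", 90, False, True, True),
        PullRequest(1, "Small bug fix", 12, True, True, False),
    ]
    for pr in ranked_queue(sample):
        print(f"#{pr.number} {pr.title}: {merge_readiness(pr):.2f}")
```

A small, passing bug fix floats to the top, which matches the kind of PR the agent surfaced first in the video.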
So number one here, we have disabling external git diffs, which is a PR that Magnus opened to fix a bug with the external diff viewers some people use with git when they're using T3 Code. Simple PR, fixes a real bug. People will be very happy if we merge this. It was surfaced by the agent, even though this PR was originally filed over a month ago. And as soon as I sent this link with all of these ranked PRs to Julius, it got merged within 5 minutes. Getting this type of overview of all of your work is so much better than trying to dig through the GitHub PR tab trying to get useful information out of it.
If you're wondering how I got my agent to spit out these HTML plans, that's another service that I threw together. This one's just for me and my team, but I recommend building something like it. And maybe I'll even open source it in the future if enough people want it. I built a simple service for hosting plan files, which are just HTML plan descriptions, on a real web service so my agent can spit out a URL that I can click on and then go see what it was thinking or what it's planning. I find HTML plans to be way more readable than markdown. I have a whole video about this. And I've been abusing these for all sorts of different things, including reviewing agent work and having one agent review another agent's work and then just pass the HTML over, but also having the ability to read it in a good format. That's what's cool about HTML: I can look at it in my browser, or on my phone in my browser, and get a good idea of what's going on, and then just paste the link to an agent and say, "Hey, go deal with this." And it figures it out. By the way, that agent run I started earlier with the workflow is now at 1.6 million tokens and 21% of this fresh usage window used. And that is in under 30 minutes. Kind of crazy. You can burn through usage fast, but if you steer it in the right way, you can get actually useful stuff.
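Theo's plan-hosting service isn't public, so this is only a guess at the simplest possible version of the idea: turn a structured plan into a standalone HTML page that any static host can serve. The `render_plan` function and the sample steps are made up for illustration.

```python
from html import escape


def render_plan(title: str, steps: list[tuple[str, str]]) -> str:
    """Render a plan as a minimal standalone HTML page.

    All text is escaped, since agent output may contain markup characters.
    """
    items = "\n".join(
        f"    <li><strong>{escape(heading)}</strong>: {escape(detail)}</li>"
        for heading, detail in steps
    )
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ol>\n"
        f"{items}\n"
        "  </ol>\n"
        "</body></html>\n"
    )


if __name__ == "__main__":
    page = render_plan(
        "Review of PR 39",  # sample content, invented for the example
        [("Audit", "Check the storage API for <script> injection"),
         ("Judge", "Compare against PRs 35 & 37")],
    )
    print(page)
```

Writing the returned string to a file and serving it from any web host gives the agent a URL to hand back, and a second agent can be pointed at that same URL.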
I actually did have PR 39 get reviewed previously by Mythos, and I have its review here, which is super convenient because Codex was actually the agent that wrote this PR. So having Codex write it, having Mythos review it, and then I can take this URL and hand it to Codex to say, "Hey, go make the changes that this suggested and give me feedback on the ones you don't like as much." It's such a useful way to just pass context around in a way I can read. So yeah, again, points in favor of HTML plans. Highly recommend it.
If you're already pushing the limits of workflows and reviews and you're still not getting close to those limits, there are plenty more things you can do, don't worry. One that I would highly recommend is looking into the skills others have made to audit your projects and your code and find ways to improve it. For example, shadcn's shadcn improve skill is really nice. Vercel built this nice plugin for adding skills. If you want to actually add the skills, there's cool commands like the Vercel skills package, but you can also just go copy the content of the skill directly because, as you hopefully know by now, skills are for the most part just markdown. So here we have all the markdown for this skill. I can go to raw. I can grab this all. I can open up Claude. I could put it in workflow mode or Ultra Code mode, but I am already burning enough inference and I want to save some of this for later. So, I'm just going to paste, enter, and let it do its thing. Again, very useful if you notice yourself at the end of a weekly limit or an hourly limit and you want to get a little bit more inference out. Just keep a set of these things in your head that you would like to run and write them down. Maybe even queue them. I've also recently been loving Raycast's notes feature. It's super easy to just, like, write something down and then hide and reopen it with a hotkey.
And you can easily iterate through the notes that you have saved here to find random prompts or things you want to do and go grab them in the future if you don't want to send them just yet. Looks like it's going to kick off a workflow for the improving anyways, which means I'm about to burn a lot of money. Great. Thankfully, it's not my money. And since starting this video recording, we've already done another like $400 of inference. Jesus Christ. Take advantage of these generous limits while we have them, folks. We got to escape that permanent underclass.
One thing I touched on earlier that I haven't really dug into yet is the effect this has had on my sleep. I'm sure we've all been there. Glued to our laptop, just waiting to get the result so you can run the next prompt and then go to bed, and then the one-more-prompt effect keeps you going and not leaving your desk again and again until eventually you're, like, falling asleep at your keyboard. The AI vampires of Silicon Valley is a trope that I'm seeing more and more. Even some of the most skilled maintainers that are fathers of loving families are finding themselves staying up till 4 in the morning prompting because it's so addicting. I don't have a solution to that. I'm not gonna sit here and pretend I do. What I do have is a Mac Mini on the same network that has Codex, Claude Code, and Hermes Agent all set up and ready to go on it, which has been wonderful for being able to get [ __ ] done from my phone and from other computers. And most importantly, being able to close the lid on my laptop and not worry about the work stopping. I have three ways I interface with that Mac Mini. The one that I admittedly use the most is to SSH directly into it. When I'm on the same network, it's very easy to SSH into. When I'm not, I rely on Tailscale to do it, which has been much better than I was expecting. As an old WireGuard fan, seeing Tailscale in its current state is awesome. Highly recommend.
I highly recommend using a terminal like cmux, which has both the sidebar and tabs, so you can pin your Mac Mini or whatever other computer you have over SSH as its own section and easily manage and navigate it. It's been super nice. Since I'm on a Mac and the remote computer is a Mac, I can use the built-in screen share utility, which honestly I didn't even know about until recently, which makes it super easy to access the computer, have native hotkeys, and all the other things you would expect when you're using this computer remotely, which has been, again, a lifesaver for doing this type of thing. This also works over Tailscale, which let me configure some things that were blocking when I was away from my computer.
But none of those are my favorite way to access that remote machine. And this is where I will admit we're about to get into a bit of a self-plug, but I hope you guys appreciate it because I think this is the coolest thing ever. Julius has put a ton of work into the T3 Code remote system, where you can control a T3 Code instance from another computer. I usually do it by just connecting directly to the machine that I have T3 Code set up on, but there's lots of other options, too. If you have T3 Code installed on one machine, it's really easy to give access to other ones. You can create a link or expose it over something like Tailscale and then access it through that link or by going to the app.t3.code site and creating the connection there. Julius is also deep in the building of T3 Connect, which is a new method to connect to a remote T3 Code instance that you have on one of your machines from another machine or even your phone with the upcoming T3 Code app. Very excited about everything we're cooking there. But I'll be real, you can build all this yourself. Codex has some of it built in already. Claude Code pretends to, but it never works when I try. There's lots of ways to control your agents remotely.
And I highly recommend finding methods like this, because you'll start pushing the length of your jobs a ton when it's no longer locking you to your laptop. Before I had this remote Mac Mini set up, I found myself running shorter jobs so I wouldn't have to worry about closing my laptop, cuz I didn't want to be one of those people walking around with a half-open laptop. I really didn't. Once I had the Mac Mini set up and I could run things from there, I found myself using agents entirely differently, letting them go off on long exploratory journeys, not expecting to be able to use the results, but expecting to have interesting enough findings to talk about to my agents or even to my other co-workers when we build these types of things. I don't care what method you end up using to control these remote machines. You can even build your own, which is a fun way to both burn tokens and really refine your workflow to fit your specific needs and expectations.
If you take anything from this video, I really want you to take home the creativity you can apply to how you burn your tokens. Now, you can have agents reviewing each other's work, and you can even automate it. Something like Claude Code is capable of doing babysitting with built-in loops. So, you can tell one agent to make a PR and watch whenever a new comment comes in, and then tell another agent to watch a PR and, whenever a new push happens, give it a bunch of feedback and leave a review. Now you have two agents that aren't even aware of each other giving each other what they need to keep pushing work forward before a human's even involved. You can give agents stuff like browser use. This is much stronger on the Codex side, but Claude Code is starting to get there as well. This allows an agent to spin up the changes it made, record the screen showing that the changes work, and then send you that or post it in the PR when they're done, not bothering you until they have video evidence that the work actually worked.
As always, Pete is far ahead of the game and he has awesome examples. Codex has the ability to spin up threads by itself. So it's not just subagents. It can trigger new threads and make new worktrees and make real changes through that. So you set up a really simple loop here. Tell Codex to maintain your repos. Wake up every 5 minutes and direct work to threads. Makes it easy to parallelize and steer work as needed. Since each thing that it thinks should be going on gets its own thread, it's very easy to hop into a given thread and tell it, "Hey, that's wrong. Go do it the other way," or just stop it if it doesn't make sense. If you combine this with something like your Sentry bug tracking or your Datadog analytics, you can give all of the resources and data available to you as the engineer to your agents, which lets them find information that's useful, self-improve, and just push things forward in a way that's really powerful and surprisingly valuable. Jesus Christ, that workflow is now at 1.8 million tokens. We haven't even started the judges yet cuz we got one more of these audit agents going. I know it's like I'm lighting money on fire here, but I've already paid the money and it's actually really fun to watch. I didn't think I'd ever be this type. I've always been the like single-threaded, just-get-the-thing-done guy. I'm having a lot of fun right now. I've been going a little further than I normally do, and I really like how Sawyer framed this. You need to be more ambitious than you have been before. Ask the model to rewrite your entire production app from scratch. Ask it to make an entire product or internal tool for you, but don't stop there. Ask it to deploy it and add accounts, multiplayer, etc. All the things you wouldn't normally have in a throwaway personal app. Raise your bar. Only by pushing more do you learn. Orchestrate. When coding and having an agent write a plan, have another agent, or multiple, validate each claim in that plan. 
Claude workflows are [ __ ] amazing for this. Subagents in Codex are getting there now, too. They're not quite as good, but they're better than I thought they would be. Fable's really good at orchestration. One tip with that though, if you don't want to burn too much, explicitly tell Fable during orchestration to use Opus or Sonnet for its subagents because for whatever reason, Fable more than any other model really likes using Fable for its subagents. It seems to not trust other models. So, be explicit. Tell it to do other things. Also, see there, it just finished the audit stage and then it reviewed what it got back and decided it needed seven judge agents to do more work there, as well as these harvest agents that spun up in the background. But again, Fable's incredible at orchestration. Fable is not needed for the individual subagents. If you're not trying to actually burn as much money as possible, maybe tell Fable to use something else. Back to what Sawyer said here, have it use Codex-P to delegate parts of it to Codex. Have it use Opus subagents. Have it use workflows. Don't be afraid to give those models big chunks of work. Give your agent a browser and computer use. Let them click around to take screenshots. Use the JS debugger and take performance profiles. LLMs aren't just coding machines. They can do the debugging and reproducing work as well. For token maxing, get some way to kick off coding tasks from your phone. Could be the Codex app, ChatGPT, Claw, etc. As I mentioned before, I threw a bunch of this in my Discord with my Hermes agent. Whatever you choose to do, doesn't matter. Make it easy to get the idea out of your head into an agent running. I don't care how you set this up. There's lots of good options, but your goal should be to make it as easy as possible to go from, "Oh, that's a nice idea," to having something running, testing out that theory for you as quickly and smoothly as possible. 
The more of these opportunities you take, the more of them you'll find, and the more you find, the more you can do, and the more you'll find the limits of the tools that we're relying on now every day. My last thoughts about where this is all going, and more importantly, how we should feel about it. The amount of work somebody can do as a developer has gone up exponentially as a result of all of this. Any one dev can do 10 times to 100 times more work than they could before. They can't necessarily validate that that work is good or worth shipping, but they can absolutely get the code out more than they ever could. Your focus should not just be writing more code. Your focus should be solving more problems. And you shouldn't be doing this out of fear either. If you come into this scared that you're going to lose your job if you don't token max, you're going to burn a bunch of tokens and a bunch of your own sanity and come out way more stressed and not happy and probably still unemployed. If you go into this with excitement, the excitement that we all went into software with, with the ability to customize our computers and the things on them to do what we want and need specifically for ourselves. If you go into your agents with that mindset, if you go into the token maxing with that goal, trying to get more out of the things that you do and build every day, if you approach it with excitement, what you'll get back is incredible. When I went into this stuff skeptical, what I got out was some okay code and a bunch of bugs. When I went into these things excited to push the limits of what I could do and build, I came out with my own custom cloud. I came out with a pile of npm packages I use every single day for a ton of different stuff. I started forking the software I rely on to tailor it to my specific needs. I started building my own control planes for managing all of this. I started having way more fun and building way more, too. 
So, don't approach this with the attitude of keeping your job and making sure you'll still be employed in a few years. Approach this with the excitement that we can build all the things we ever imagined. Lower your bar for what's worth building and raise your bar for how far you bother going. And what you'll find is a lot of fun, but also a lot of rate limits being hit. We're at 42% now and we still have four hours left. God. Yeah, I bought two subs and I think you should too. Mythos is an incredible model capable of things I never would have expected agents to be able to do even a year ago. It builds awesome things. It writes awesome plans. It describes the stuff it's doing in ways that are actually digestible. And you can use this to push the limits of what your code is doing, to get better information, and to spend less of your time doing the things you don't like and more of your time shipping things you do. If you find yourself stopping to ask, can an agent do this? Reset your mind. Just ask the agent to do it and see if it fails. And if it fails, try to figure out why. Ask it what it struggled with. Look through the history and the bad tool calls it made and try to steer it in the right direction. Maybe make a skill that points it in the right way. Maybe build your own verification in your codebase to push it in the right direction. If you're not already writing your own custom lint rules for weird [ __ ] that your agents are doing, you're not being creative enough yet. And that's fine. This is a different way of thinking and it's a really exciting one. So go out and build. Take advantage of these subscriptions. Push the limits of the best models and tools available today and don't get too locked into them because new things will be available tomorrow. Just try to be excited when you go in and you'll be amazed what you can pull out. Now that I got these agents queued, I'm going to go catch up on some sleep. 
I recommend you do the same, but come back fresh and build some cool stuff. Let me know what y'all are building.
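The custom lint rules for agent output mentioned near the end can start very small. Here is a minimal sketch of one, assuming your agents keep leaving `@ts-ignore` comments and `as any` casts in TypeScript files. Both rules and their names are hypothetical examples, not rules from the video.

```python
import re
from typing import List, NamedTuple


class Finding(NamedTuple):
    line: int   # 1-based line number in the scanned source
    rule: str   # name of the rule that matched


# Hypothetical rules; swap in whatever your agents keep doing wrong.
RULES = {
    "no-ts-ignore": re.compile(r"@ts-ignore"),
    "no-as-any": re.compile(r"\bas any\b"),
}


def lint(source: str) -> List[Finding]:
    """Scan source text line by line and report every rule that matches."""
    findings = []
    for number, text in enumerate(source.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(text):
                findings.append(Finding(number, rule))
    return findings
```

Running a check like this in CI, or as a step the agent has to pass before opening a PR, gives it a concrete failure to fix and saves you from repeating the same instruction in every prompt.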
