Code Mode: Giving AI Agents an Entire API in 1,000 Tokens (With Demos)
This chapter introduces the key concepts of agents, MCP (Model Context Protocol), tokens and code mode, explaining how code mode compresses tool access to enable cheaper, faster model interactions and previewing a follow-up episode about AI-based engineering feats.
Code Mode compresses an entire API into two tools and ~1,000 tokens, letting AI agents safely explore and act on Cloudflare’s API with live demos.
Summary
Cloudflare’s Matt Kerry explains how Code Mode lets AI agents call an entire API—such as Cloudflare’s 2,500+ endpoints—by transforming tool use into writing and executing code. Instead of loading a massive OpenAPI spec into the model’s context, Code Mode compresses the API into a TypeScript SDK and uses two tools: search and execute. The result is massive token savings (from ~2 million tokens for the full spec down to just over 1,000 tokens in context) and faster, cheaper interactions. Kerry walks through a live demo where Claude Code writes code to search the API and executes calls against it, with a Cloudflare MCP server handling the heavy lifting. He also demonstrates deploying a Worker via the MCP server, storing data with KV, and securing access with Cloudflare Access. Throughout, the discussion ties into MCP (Model Context Protocol), sandboxes, and dynamic worker loaders that keep untrusted code isolated and safe. The episode also teases future improvements, like applying Code Mode to MCP portals and expanding the pattern to other APIs, with Cloudflare positioning agents as a rising force in AI-enabled development. Finally, Kerry hints at an upcoming Cloudflare announcement and reiterates the potential of agents to turn structured API data into actionable, real-time outcomes.
Key Takeaways
- Code Mode compresses an entire API into two tools (search and execute), reducing context to roughly 1,000 tokens versus millions if loaded naively.
- Two tools generate code that interacts with the API, enabling autonomous agent behavior while keeping the model’s context lightweight and cost-efficient.
- Cloudflare’s MCP (Model Context Protocol) and dynamic worker loaders enable safe, sandboxed execution of the agent’s code on the edge, avoiding client-side risk.
- The MCP server approach allows an API to be exposed as a single, scalable endpoint, letting the agent search over and act on API methods without loading full specs into the model.
- Real-time demos show deploying a Cloudflare Worker, using KV for a visitor book, and enforcing access controls via Cloudflare Access, all driven by a coding agent at the edge.
- The approach supports multi-API environments and can be adapted to other providers by sharing or wrapping their APIs with a similar MCP-based code mode flow.
- Future directions include bringing code mode to MCP portals and deeper integration across Cloudflare’s agent ecosystem to accelerate edge-enabled AI apps.
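The two-tool surface described in the takeaways can be sketched roughly as follows. This is a minimal illustration under assumed shapes: the tiny endpoint index, the function signatures, and the `execute` stub are invented for the sketch and are not the actual Cloudflare server implementation.

```typescript
// Hypothetical sketch of the two-tool surface; the endpoint index and
// function shapes are illustrative, not Cloudflare's actual server.
interface Endpoint { method: string; path: string; summary: string }

// A tiny stand-in for the indexed OpenAPI spec (~2,500 endpoints in reality).
const index: Endpoint[] = [
  { method: "GET", path: "/accounts/{account_id}/workers/scripts", summary: "List Workers" },
  { method: "PUT", path: "/accounts/{account_id}/workers/scripts/{name}", summary: "Upload a Worker script" },
  { method: "GET", path: "/zones/{zone_id}/analytics", summary: "Zone analytics" },
];

// Tool 1: search. Only the short matches enter the model's context,
// never the full spec.
function search(query: string): Endpoint[] {
  const q = query.toLowerCase();
  return index.filter(
    (e) => e.path.toLowerCase().includes(q) || e.summary.toLowerCase().includes(q)
  );
}

// Tool 2: execute. In the real system the generated code runs in an
// isolated worker (dynamic worker loader); here it is just a stub.
async function execute(code: string): Promise<unknown> {
  return new Function(`return (async () => { ${code} })()`)();
}
```

The point of the split is that the model only ever sees search results and execution results, never the spec itself.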
Who Is This For?
Developers and product teams building AI-powered automation with large APIs, especially those evaluating or already using Cloudflare’s MCP, Workers, and SDKs. This is essential viewing for teams wanting cheaper, safer agent-driven API access and edge-native deployments.
Notable Quotes
“we’re down to just over a thousand tokens.”
—Highlighting the dramatic token compression achieved by Code Mode compared to loading the full API spec.
“we compressed the whole Cloudflare API into two tools, search and execute.”
—Core claim of the Code Mode approach and its two-tool execution model.
“the API token doesn't have right permissions.”
—Demonstrating the practicalities of reauth and scoped permissions during a live demo.
“this is public for anyone to have a play with.”
—Showcasing that the MCP server and demo setup are openly accessible for experimentation.
“the whole program is running on the edge... you can run it on your phone.”
—Emphasizing the edge-native nature of the setup and its portability.
Questions This Video Answers
- How does Code Mode reduce API context in AI agents with Cloudflare MCP?
- Can I deploy my own MCP server to wrap a different API using Code Mode?
- What are the security benefits of dynamic worker loaders for running untrusted code?
- How will Code Mode integrate with MCP portals and other Cloudflare products in 2026?
Cloudflare, Code Mode, MCP, Agents SDK, Dynamic Worker Loaders, OpenAPI, KV, Durable Objects, Cloudflare Access, Workers.dev
Full Transcript
So the raw OpenAPI spec, if we wanted to traditionally give the model access to the whole API, we would dump the whole API spec in, or we dump a whole tool spec in, which is even worse potentially. And that would be about like 2 million tokens. If you just get down to required parameters, so you actually lose loads of options, but just required parameters for every API, you're at like sort of 240,000 tokens. And with code mode, just these search and execute executing code, we're down to just over a thousand tokens. So I have this uh loaded in every uh session that I run of Claude Code, of OpenCode, of any coding agent.
Uh it doesn't add much context and it works pretty damn well. Hello everyone and welcome to This Week in NET. This is the February 27, 2026 edition, and this week we're going to talk about agents, code mode and developers, and of course those interested in building with AI with cost savings. For that we start with Matt Kerry, senior systems engineer. I'm your host, João Tomé, based in Lisbon, Portugal, and this week we actually have a double episode with two guests about two different blog posts that are about building, but those have uh something in common. They became viral and with a lot of interest.
AI is moving fast, building with AI is getting real, and there was just too much to fit in one show. So, we have two episodes about these topics. First up, agents and code mode, which Matt Kerry talks us through: giving agents access to the entire Cloudflare API, more than 2,500 endpoints, using just two tools and around 1,000 tokens of context. So, not as expensive. Quick key terms. For example, agents are AI systems that don't just answer questions, they take action. MCP is the Model Context Protocol that we've discussed before on the show. The standard for letting models call external tools safely.
Context windows are the amount of information a model can see at once. Tokens are the units that measure model input and costs. And of course code mode is: instead of loading thousands of API tools, the model writes JavaScript to search and execute against the API from its context. So the result is massive compression, lower cost, better performance. In our second episode, which will be published in a few hours, we talk about V-Next: how one engineer rebuilt Next.js with AI in one week. We also answer a few questions asked on social media about how this came to be with AI.
It's an experimental project. So stay tuned for that. Without further ado, here's my conversation with Matt Kerry. Hello Matt. How are you? Hey, how you doing? I'm good. For those who don't know, where are you? Oh, I'm actually just outside Lisbon now. A beautiful beach town called Caparica. Uh yeah, I just moved here. It's very exciting. You're just a few kilometers away actually. You're on the other side of the river, so I can have a view almost to where you are. Uh, where are you based usually? Uh, so I actually just moved to Portugal, but um I was previously in London for a while.
So it's a recent move. Very recent. Are you enjoying uh Portugal? Yeah, super nice. Super nice. We got some really good weather at the moment. There's surfers out my window. Yeah, it's really cool. After a few weeks of storms, now the better weather is coming. It's not the same thing actually in the New York and Boston area with the snowstorm around. No, they had like what, five storms back to back my first two weeks here, or four or five. It was crazy. Crazy here. Yeah. All the flooding and all of that sort of stuff.
We didn't get much of it in Lisbon, but I saw it on the news. Was wild. It was. It was. Oh. Well, you wrote a very interesting blog post about code mode. That is something that's been going on for a while now. Give agents an entire API in 1,000 tokens. Uh, for those who don't know, really, what is code mode and what is this about? How can we explain what it is and why they should care? Yeah. So I I guess like it's really good to have a little bit of a backstory around the problem that we're trying to solve.
So I work on agents and MCP at Cloudflare, uh part of the Agents SDK team. So I think you've had Sunil on here before potentially. Um, Sunil's on my team and we work predominantly on open source libraries that help people be successful with Cloudflare, predominantly building agents. So we support the Agents SDK, sandboxes, um and also all of our like MCP infrastructure. MCP, Model Context Protocol, it's uh um released by Anthropic in November 2024, so just over a year ago now. And the idea is like, how do you give agents um hands?
How do you how do you let AI reach outside of potentially the the process it's running on, the computer it's running on, the place it's running on, to like access external tools. So if you think about like making agents useful in the workplace, that's a big thing that you're going to have to work out at some point. Like how do you let it access your your your data? How do you let it access the the places that you work in? And I've been really lucky to have some demos from some of the creators of MCP, and it's really fun to hear them speak about like, yeah, we just wanted to like get Claude outside the box, you know, just like let it let it let it reach outside of the glass box that it was it was acting in.
So that's a little bit of background about MCP, like Model Context Protocol, and it has things called tools, which are the things with which you interact with maybe some external function, and it also has some other stuff, but we're going to mostly focus on tools. Traditionally, what people would do is they'd fetch all the tools from a particular MCP server and then they'd load them into the context of their agent. And that means that you're kind of limited by how many tools you can have. There are these things called context windows, and maybe they'll go up to 200,000 tokens, maybe all the way up to a million.
But there is a limit, a hard limit. And the closer you reach that limit of context, firstly the more expensive the inference becomes because you're loading many more tokens, but also the less performant the model becomes. And when you go above that, the API just completely won't work. So you want to try and reduce that context as much as possible. So how have people been doing this? They've been like turning on tools when they need them, turning them off when they don't need them. But like that really stops the autonomy of agents.
So we've just been trying to work out, like, how can we load many many more tools into a context window, or how can we compress them in a way that uh is useful, to get many more in the context window. And so the first version of code mode came out last summer, and that was Sunil and Kenton. They worked out that if you could generate a TypeScript SDK from the tool specification, then you could just give the model one tool, and that tool would be to write code, and it could write code over the TypeScript SDK, and then that would call the underlying tools. And that was really really... they had like massive compression over the amount of tools they could call, and we kind of expected that would be how people would do stuff.
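That v1 idea, generating a TypeScript surface from a tool spec so the model writes code instead of making raw tool calls, can be sketched like this. The types and names here are assumptions for illustration, not the actual Cloudflare generator:

```typescript
// Hypothetical sketch of code mode v1's codegen step: turn tool specs
// into TypeScript declarations the model can write code against.
// Names and shapes are illustrative, not the actual generator.
interface ToolSpec {
  name: string;
  description: string;
  params: Record<string, string>; // param name -> TypeScript type
}

function toSignature(tool: ToolSpec): string {
  const args = Object.entries(tool.params)
    .map(([name, type]) => `${name}: ${type}`)
    .join(", ");
  return `/** ${tool.description} */\ndeclare function ${tool.name}(${args}): Promise<unknown>;`;
}

// The generated surface replaces N tool schemas with one "write code" tool.
const sdk = [
  { name: "listWorkers", description: "List Workers on an account", params: { accountId: "string" } },
  { name: "getZoneAnalytics", description: "Fetch analytics for a zone", params: { zoneId: "string", since: "string" } },
].map(toSignature).join("\n\n");
```

A few lines of type declarations cost far fewer tokens than the JSON schemas they summarize, which is where the compression comes from.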
But like, as all of these things go, that's not exactly how it worked out, and other people came up with other ideas of how to compress their tools. Like Anthropic came out with uh programmatic tool calling just afterwards, which is very very similar, and then they also released like tool search inside uh Claude Code, where you can search over different tools. But no one had really solved that problem with MCP. Providers were still exposing MCP servers. Like Cloudflare was still exposing 13 or 14 MCP servers, and all of these MCP servers had a very small amount of tools each, and each product team that built one would be like, we're going to cherry-pick the seven best things that you can do on our API, the seven most important things, and we're going to put those in our MCP server, and we're going to make sure these work really well. And when you do that, you lose like the granularity, you lose like the edge cases which make the API really useful, right? You lose all of the endpoints and you distill it down into like six, seven, ten unique operations.

And I was like, well, why don't we combine code mode with MCP servers? So instead of executing code on the MCP client, we actually move all of the execution to the server and try and expose the whole Cloudflare API in one MCP server. And yeah, well, we managed it, and there's some technical things we can get into about how we managed it, based on a primitive called dynamic worker loaders, which I think is really really cool, we can talk about a bit more. But that was it, that was the five-minute rundown of what we did.

Makes sense, and it uses a bunch of different Cloudflare products actually to make this work, right? Which is also interesting, in a way, the ecosystem that has been built over 15 years now. It's actually great for this type of purpose, in terms of making these types of things work. One of the things I want, for those that are not developers, to understand is how helpful, how relevant this is. Of course efficiency, fewer tokens, less cost, always important, anyone can understand that, but also in terms of the output, besides efficiency, what are the main gains there? Why don't we do a tiny little demo, and maybe it might be helpful to talk through it? Let's go for it. So um I'll just share a screen quickly. So this is Claude Code, um we've probably seen this screen before, uh and we can do... we're connected to an MCP server, and I actually just uh just authenticated it, but we're connected to an MCP server, the Cloudflare API MCP server.
Forget about the staging bit, it just means it's got some goodies. And this is this is public for anyone to have a play with. So anyone can try, you can try and replicate some of this stuff. But um, say we want to use some part of the Cloudflare API. So uh, like, what um Workers do I have deployed on my account? Uh, and and so like I could go into the dashboard, I could use Wrangler, our CLI, or I could uh ask this MCP server, and this MCP server is going to write some code. You see I have a bunch of different um uh accounts.
Um the MCP server is going to uh write some code that searches over the API spec, finds the API that it wants to call, calls the API. Here it had to narrow it down to the particular account and hopefully we've got a good answer. So these are all my like demo workers basically. Uh I have 24 plus workers deployed on the account. These are the key ones. We can be we can even go a little bit crazier like this. So maybe like um uh which of my zones so this is entering a part of the Cloudflare um ecosystem that I don't touch very much.
So which which zones uh have the most traffic? Maybe we could use GraphQL API. And I just know that the the API that we should use here is the GraphQL API. So I'm going to give it a little bit of feedback. It probably doesn't need it, but it's just just for the purposes of the demo. Just make this nice and easy. Um you can ask for a graph and possibly it goes there directly, right? Yeah. So, it's good to give it as much um feedback as possible. And I've like spent a bit of time with this API so I know like which bits are which bits exist.
And if I can uh express that to the model, then it'll just have to do less search, and so it'll just speed up this whole process for us. Um, but it looks like it's returned some data. So, we're just waiting for Claude now to write that out. But these are all like read-only things. Um, oh, something is failing, which happens in all the demos. Yeah. Yeah. Unknown field zone name now. Too many zones requested. Okay, maybe let's just try mattzcarey. Oh, no, it got around it. So, I'm just asking it to try my my my my uh one specific my one specific one.
And let's let's see if it could do that. Okay, it could do that quite successfully. Maybe I asked it something that was actually impossible potentially or maybe had to write some more stuff. Okay, so it's got some data about my zone. But I'm just going to start writing a prompt so we get that. Um, but we get it with Okay, let's maybe let's try and deploy a worker next because I think it's more fun. Let's deploy uh hello world worker saying uh hi to all the listeners. So I've pressed enter on that while we're waiting.
So, my uh personal website had um 20,000 requests over the last week, and we can split out by day, by 24 hours, different visitors, all of that cool stuff. See my cache here. Like, we're just using the API in a way that like I wouldn't know how to use it otherwise, which is quite cool. Also, in a way that it's not always exposed on the dashboard, which is also kind of interesting. Um, ah, this is fun. Okay, so I said deploy hello world worker saying hi to all listeners. It immediately said the API token doesn't have the right permissions.
Interesting. So, let's see how we get to that. MCP enter. When I first authenticated, I just gave it read only permissions. So, let's share my whole screen because I don't think you'll be able to see what happens when we do this. So, you're probably going to see yourself in the corner there. I'm so sorry. Uh, but if I reauthenticate, browser window opens and I get this um I get this window here, right? And this window allows me to just get rid of you for a second. Uh allows me to uh pick exactly the scopes that I want access to.
So for the purposes of this demo, I'm going to give myself Workers full access so I can read and write Workers. I'm also going to give myself Access read and write access. And we'll see why I do that in a second, because that is a fun part of this demo. Uh, but let's press continue. And this is like all part of MCP, that I can basically give my agents specific uh OAuth scopes. And we've never done an API or an MCP server where you can do this. This is kind of new.
It's kind of fresh, but it's really necessary, because if you're doing um a Cloudflare, like a full uh MCP server for the whole API, like you kind of want to be able to narrow it down. Okay, cool. So we've reauthenticated. Okay, now I say, uh, you should be able to do this now, I gave you permissions, something like that. So it's it's giving me some fun insight into like how I would do it just with uh with index.js and making a Wrangler file and deploying with Wrangler. But we're just going to sidestep that. We're going to see see what happens if we just carry on and try and um... Okay, so this is cool.
So it's written some code. It's called the API endpoint to upload the worker script. And now it's enabled the workers.dev subdomain. So we have a way to access the worker, which I think is very very cool. And then it also had to do a GET request to get the subdomain. Apparently that's a thing. Okay. Hello listeners. Oh well, I think this is an emoji it's trying to do, but that is pretty cool. We have a deployed endpoint. Hello listeners. But let's make it better. Uh, I want this to be more of a visitor book. This is my favorite demo.
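The two calls described here (upload the script, then enable the workers.dev subdomain) can be sketched as request builders. The endpoint paths follow Cloudflare's public v4 API, but the simple JavaScript body shown is for classic service-worker scripts; module workers use a multipart form with metadata, so treat this as a shape sketch and check the API docs before relying on it.

```typescript
// Sketch of the requests the agent's generated code made in the demo.
// Paths follow Cloudflare's v4 API; exact body formats are assumptions.
const API_BASE = "https://api.cloudflare.com/client/v4";

// PUT the worker source to create or update the script.
function buildWorkerUpload(accountId: string, scriptName: string, source: string) {
  return {
    method: "PUT",
    url: `${API_BASE}/accounts/${accountId}/workers/scripts/${scriptName}`,
    headers: { "Content-Type": "application/javascript" },
    body: source,
  };
}

// Enable the workers.dev subdomain so the script gets a public URL.
function buildEnableSubdomain(accountId: string, scriptName: string) {
  return {
    method: "POST",
    url: `${API_BASE}/accounts/${accountId}/workers/scripts/${scriptName}/subdomain`,
    body: JSON.stringify({ enabled: true }),
  };
}
```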
I want to see more of a visitor book. I want to be able to write on the wall when I visit. Let's use KV to store everything also. Uh yeah, we we'll we'll do that. Even if you don't say to use KV to store everything, possibly it will go there. It will take only a bit more time, right? Yeah, it's it's just it just becomes a little bit more non-deterministic. When I prompt LLMs, if I know how it should do something, I normally tell it, because you just get better performance. And since we don't have much time, it's it's nice to try and uh rattle through this.
But exactly, exactly the case. So what's it done? And I think it's very smart actually, like really very smart. Um, what have we done? We've uh looked up the endpoints to do with KV, KV namespaces, and then we've created a new KV namespace for the visitor book, which is cool. Uh, it's returned an ID, and now it's written a whole load of JavaScript which it's going to upload as the visitor book worker. And now my visitor book is live. How cool is that? Ah, doesn't look that bad. Let's go. Looks great. So, let's write on the wall.
Hey friends. So, hello listeners.workers.dev. It's public. Okay, we're going to make it less public in a second, but anyone can sign it, right? You see, I had to reload the page there, and it got me a bit confused. Let's make it not have to reload the page. Why don't we um uh, can we make this multiplayer? Uh, like a chat room. Um, I want this all local-first and sync-y. Um, brackets, and I know, brackets, Durable Objects, because I love Durable Objects, if anyone hasn't played with them. Uh, it allows you to make these like really cool multiplayer things, because they're like a little piece of compute that lives somewhere, and they have WebSockets that you can connect to, and someone can send a message and that message can be sent to every client that's connected.
Just the beauty of WebSockets, and Durable Objects make this really easy. So hopefully they make it easy enough for an LLM to make a full sync engine and uh a full multiplayer chat room with one prompt. Well, we have iterated a little bit, but let's uh let's see if it's possible. And it is also worth bearing in mind here that we haven't saved any of this code on our computers. It's kind of fun, this. So we could run this on anything. Like I could run this... there's no development environment. So I could run this on my phone.
I could run this on my Raspberry Pi. I could run this on anything. All right. And cool. Right. Let's go. I'm joining the wall. Nice. Right. If you would like to, could you connect to this for me and just like see if this works? Sure. And I'm not going to reload the page. And I'm also going to You're there. Wait, I am. I'm I'm going to go crazy with this. Let's Let's see how fast it is. So, this is not local. This is running over the public internet. How quick is that? Really quick. It's quite amazing to see how real time things can be achieved in so little time.
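The Durable Object chat pattern described above, one stateful object holding every connection and fanning messages out, reduces to a few lines. This sketch strips away the Workers runtime (no DurableObject base class, no WebSocketPair) and keeps only the broadcast logic; the `Socket` interface is a stand-in for a real WebSocket.

```typescript
// The Durable Object chat pattern, stripped to its core: one stateful
// object holds every connection and relays each message to all of them.
interface Socket {
  send(msg: string): void;
}

class Room {
  private sockets = new Set<Socket>();

  join(ws: Socket) {
    this.sockets.add(ws);
  }

  leave(ws: Socket) {
    this.sockets.delete(ws);
  }

  // Relay one client's message to every connected client.
  broadcast(msg: string) {
    for (const ws of this.sockets) ws.send(msg);
  }
}
```

Because a Durable Object is a single instance with in-memory state, every client's WebSocket really does land in the same `Set`, which is what makes the real-time fan-out so simple.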
Yeah. And real time used to be one of those like super hard problems. Um, and with Durable Objects, they're made pretty easy. With the Cloudflare MCP server that's writing code over our API, it makes deployment very easy of these like little mini apps. But say we have like a real application, um, like a real application. Say my personal website is mattscary.com. Very basic website. Uh, but it is running and I kind of like to keep it running. It says it's not secure. That's that's interesting. Um, might fix that after this. Uh, but it's it it's, I'd like to keep it running.
But how about, just for one moment, I want it to not be accessible to the general public. I just want it to be private, just just for me. Maybe while like I'm uploading some stuff, changing some stuff. Um, maybe I want to create like a staging version of this website potentially, um, potentially helpful. So what could we do? What could we do with that? So there we have this thing called Access, uh, which I will admit I'm not very good at using, um, and it allows like all of this enterprise-y stuff. You can just basically block off connections to certain routes, um, certain URLs, certain zones, and just like completely block off um access, uh, depending on like some person's login credentials or email address or like a bunch of other things.
They can sign in with different IDPs. Um but I don't really like know how to use that in the dashboard. I'm not going to lie. I've played with it. Um it's quite hard. This is I guarantee you is the easiest way to set it up. So, how how does it work? So, um I want to make um my website only accessible to me. I'm going to just clarify which zone that is only accessible to me. Um, can you put it behind access uh and make a policy of only me? And my email address is um [email protected].
And we'll just like see what happens here. Um, I might end up doxing all of my emails in a second when I go into my emails, but we'll work out how we get there. Um, but I think like the demo is just like, how easy is it to play with the Cloudflare API? We have two and a half thousand endpoints, and we're able to search over them and execute code over them kind of autonomously. Like, yes, sure, this is a relatively decent prompt, but it's not that decent. Let's be honest, when we were going back and making the visitor book, um, like my prompt was make it local-first and... um, it's like not, it's not hectic, right?
Okay. So what would have taken me a while in the dashboard reading some docs is now: your website, Matt, is hosted behind Cloudflare Access. Shall we shall we check? Oh baby, let's go. Right. So, damn [clears throat], protected. I am actually going to stop sharing for a bit while I wander into my emails, because I've done this way too many times, and it's not it's not healthy for anyone's relationship when your emails get published. That's true. Security first. Yeah. Cool. Cool. Right. I did get an email. Let's share again my whole screen. Cool. We're back.
And we're in. So, no one else can access my website. Obviously, I don't really want that. So, maybe let's update this access. Um, we actually don't need access anymore. So, can we just um uh actually I don't want that. uh I just like make it accessible to the public internet again and it will just like remove the policy, remove the access application. Like there's loads of bits that you have to learn how to use and this makes it all quite straightforward. It's it's like having your personal LLM inside the dashboard inside the all the capabilities that Cloudflare has.
You even have a DDoS attack example on the blog as well. Uh yeah, to protect an origin from DDoS attacks, right? Yeah. Yeah, that that one, that one's actually really really cool. Um, when I played through that scenario, it was like, this actually works. It's amazing. Um, yeah, really awesome. I guess like there was something else I was going to... that I was going to play with. Anyway, so this is this is uh protect... this is non-protected now again. Um, this this the wall though, if we go back to the list visitors thing, um, if I leave this up, it is it's an unprotected URL.
People could play with it. We could put workers AI in it and we could make it like really we could make our own like social media or whatever, but it might start costing us money at some point um if someone pinged me a lot or if people started using it. So, I don't want to keep it running, but like like how how would I get rid of it? Um, it's alive in the world. I could click around the dashboard or I can just be like can we delete the uh visitor book and the KV and anything else we made in this session.
And then the idea being now that like we can clean up and have a completely clean slate. And we've just done a huge amount of experimentation with Cloudflare. Uh, we've played with Durable Objects, we've played with KV, we've played with Workers, we've played with Access, we've played with DNS analytics. Um, and like it didn't take us that long. I think it was pretty good. Uh, and it was great. Entirely scoped to the permissions that we needed at the beginning, which is very important. Of course, and in in a way it was also related to the fact that you interacted with an LLM, in this case Claude Code, uh, and it was really writing simple language without writing code specifically, right?
Yeah, that's that's pretty much it. So the blog post goes over this in in a lot more words, but basically it is we've compressed the whole of the um Cloudflare API into two tools, search and execute. And search can write some code to search over the API and execute writes some code to act on the API. But importantly, this massive open API spec never gets loaded into the LLM's context window. So when we when we look at search um here it has it does some search over methods and paths in this spec but it never reads the spec.
It just reads the types of the spec. And so we've enabled that massive compression, um, from all of these endpoints to just a few TypeScript types. So it's much more efficient in terms of cost. Yeah. Uh, cost, and just not destroying a context window. So the raw OpenAPI spec, if we wanted to traditionally give the model access to the whole API, we would dump the whole API spec in, or we dump a whole tool spec in, which is even worse potentially. And that would be about like 2 million tokens. Um, if you create uh tools from all of the schemas, you get like 1.1 million tokens, but you lose a little bit of granularity compared to the full spec.
Uh, if you just get down to required parameters, so you actually lose loads of options, but just required parameters for every API, you're at like sort of 240,000 tokens. And with code mode, just these search and execute executing code, we're down to just over a thousand tokens. So I have this uh loaded in every uh session that I run of Claude Code, of OpenCode, of any coding agent. Uh, it doesn't add much context and it works pretty damn well. It does. And the demo is really cool. Just to see what you can do in a few seconds in a very uh natural language type of way is kind of amazing to see.
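Using the token counts from the conversation, the compression translates directly into per-request dollars, since the context is paid on every request. The per-million-token price below is an assumed illustrative figure, not any provider's actual rate:

```typescript
// Back-of-envelope economics using the episode's token counts.
// The price is an assumption for illustration only.
const PRICE_PER_MILLION_TOKENS_USD = 3;

function contextCost(tokens: number): number {
  return (tokens / 1_000_000) * PRICE_PER_MILLION_TOKENS_USD;
}

const fullSpec = contextCost(2_000_000); // raw OpenAPI spec
const allTools = contextCost(1_100_000); // tools from every schema
const requiredOnly = contextCost(240_000); // required params only
const codeMode = contextCost(1_000); // search + execute

// A ~2,000x reduction in context tokens is a ~2,000x reduction in
// per-request input spend, before any speed or accuracy gains.
```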
Specifically, one of the things I'm curious about uh is is also the feedback. You got a lot of feedback online. Uh, what what surprised you the most in terms of feedback, in terms of how people are actually using it already? Yeah, so first of all, a huge amount of people jumped on it and used it. Um, I think there was sort of a million-odd views on the on the Twitter post that I did about this blog. Uh, which is crazy for me. Like I don't normally get that much exposure. Like to get a million views was wild.
And yeah, people people loved it. It just like opens the door to using Cloudflare a bit more naturally, and like people spend a lot of their day, or a lot of programmers spend a lot of their day, or software engineers or product managers, a lot of their day inside these coding agents, and being able to um natively talk to your infrastructure in that way is is kind of illuminating. But what's also very very cool is people are wanting to know how they can do this with their own APIs. So other providers, people with very large APIs, are wanting to work out how they can do this.
And more importantly, um potentially the customers of those large providers are also wanting to know how they can get access to something that works not for Cloudflare but for some other large platform that has 2,000-odd endpoints that they want to use uh from their coding agent. So, we actually released this code mode SDK um at the same at the same time, version two of the code mode SDK. Uh, and this allows anyone to build their own MCP server that does this does this stuff. Um, you can wrap another MCP server, you can wrap like a thousand AI tools, you can wrap an OpenAPI spec, you can wrap whatever you want.
The idea of executing code rather than calling tools directly is the idea of code mode. Uh, and that's what we want to try and enable with this SDK. Um, also, from the blog, the MCP server here, the Cloudflare MCP server, is also open source. And so a fun way of um understanding how it works, but also creating some of the functionality for yourself, is if you get your coding agent to clone the repository, point it towards an API that you like uh or that you use a lot. For instance, like a fun one is the GitHub MCP server.
You could point it at that and make your own Code Mode MCP server on top of the GitHub MCP server that does the same things we're doing. You don't have to own the underlying MCP server; you can just make your own, because the APIs are public. We built this the way we usually do, Cloudflare on Cloudflare, but it's also a proof of concept that, as you say, others can use for their own APIs and their own MCP servers.

That opens the door for MCP to have, let's say, a second life, because MCP was getting less traction at some point, right? Yeah, there was a lot of chat about whether CLIs kill MCP servers, around things like OpenClaw, or Moltbot, or Moltworker, all these things. OpenClaw itself has had a huge amount of popularity recently. Did you have the Moltworker guys on? Yeah, we also talked about Moltworker, and about markdown for agents rather than MCP. Okay. So all of these people are working on different stuff in parallel, and there's a lot of discourse online about how OpenClaw specifically uses mcporter, an MCP-to-CLI tool, and how CLIs supposedly kill MCP because they enable things like progressive disclosure of tools: you don't dump everything into the context window, they're very natural for LLMs to use, and all of that.
And I would argue that structuring your MCP server like this enables progressive disclosure too. We can search, we can execute, the same as calling a CLI where you can call help on all of the commands. And it's also very LLM-native, because we're just writing a small amount of code, and it's code the LLMs have seen so many times. How many times has Claude generated a fetch request? It must be up there with the most common things it's ever done. So I think this is hugely native.
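The shape of those two tools can be sketched in a few lines of JavaScript. This is a hypothetical illustration, assuming a mock endpoint catalog and a stubbed `api` object; it is not the actual Code Mode SDK:

```javascript
// Hypothetical sketch of the search/execute pattern — the tool names and
// mock catalog below are illustrative, not Cloudflare's actual Code Mode SDK.

// A mock API surface. In the real server this would be generated from an
// OpenAPI spec or a TypeScript SDK covering thousands of endpoints.
const catalog = [
  { name: 'zones.list', doc: 'List zones on the account' },
  { name: 'kv.put', doc: 'Write a key/value pair to a KV namespace' },
  { name: 'workers.deploy', doc: 'Deploy a Worker script' },
];

// Tool 1: search. The agent pulls in only the endpoints it needs, so the
// full catalog never has to sit in the model's context window.
function search(query) {
  const q = query.toLowerCase();
  return catalog.filter((e) => e.name.includes(q) || e.doc.toLowerCase().includes(q));
}

// Tool 2: execute. The agent submits a snippet of code that calls the API;
// here we evaluate it against a stubbed `api` object (the real thing runs
// the code in an isolated sandbox, not in-process like this).
function execute(code, api) {
  return new Function('api', `return (${code});`)(api);
}

const api = { zones: { list: () => ['example.com'] } };
console.log(search('zone'));                   // progressive disclosure
console.log(execute('api.zones.list()', api)); // → ['example.com']
```

The point is the asymmetry: `search` keeps the context window to roughly a thousand tokens, while `execute` lets the agent write exactly the kind of fetch-style code it has seen countless times in training.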
One of the reasons we get such good performance here is that we're writing stuff that's in-distribution for the model's training set. You always want to keep the model in distribution, and code, specifically JavaScript, is very much in distribution. The web is built on JavaScript; the models see a huge amount of it during pre-training and post-training. We're just letting the model do what it wants to do, which is write JavaScript. And a lot of this design was influenced by simply asking: what do the models do better?
I changed something, tested it, changed something else, tested it. Which does the model prefer? If you can lean into that kind of design decision, it helps a lot. And this whole argument of "CLIs kill MCP, is MCP dead?" No, MCP is not dead. MCP is the way models are going to interact with external tools. MCP has won, and it's getting better and better. The fact that some people dump a huge number of tools across the MCP wire, that is user error, I'm afraid. There are better ways of doing it, and this is one of those better ways.
OpenClaw itself runs on MCP; it just wraps servers in a CLI for progressive disclosure. There are many ways of doing that, and there are some situations where you can't wrap things in a CLI. I was talking about running this on my phone, for instance. I could run this from a very simple app on my phone, and I would not need shell access; that would work. Maybe I want to build an agent that doesn't have shell access at all; this will still work. We run all of this code in a dynamic worker isolate, a dynamic worker loader, which is a super-sandbox.
It's like a super eval. You can run code as a Worker in a very sandboxed environment. We can restrict what fetch requests can be made. There's no access to environment variables, no access to the file system, no access to anything on the underlying host. The code itself does not get executed on my machine; it gets executed in a sandbox primitive that was designed for exactly this, in the cloud, on the Cloudflare edge. And I don't have to worry about the LLM being prompt-injected into writing something very insecure, dumping my env variables into an email and sending them off to god knows who, because none of this happens on my machine.
And I think that's a big security consideration people don't really think about when they argue that CLIs are better. And even if they do think about it, there are just some situations where you can't use a CLI: if you don't want to give shell access, it's a non-starter. So, different things for different situations. Of course, it's part of the backbone being built here: sandboxes keeping things protected, and the Agents SDK. These are exactly the tools that can take these new possibilities into production, securely and efficiently.
Yeah, 100%. There are blog posts that Son and I keep talking about writing, I think he's going to get there first, about the different types of sandboxes, and Kenton has an amazing talk, maybe from Connect last year, about the different types of sandboxing. He is the sandbox GOAT. But running untrusted user code like this is a very new thing for everyone. People built DSLs, markup languages, query languages, just so they could stay in control, so they could give their users the semblance of code without actually giving them code.
Right? Basically every SaaS tool you know probably has some way of configuring things in a code-like manner. But it's not code. And it's not code because when the engineers sat down to decide how to build it, they said: we can't give our users the ability to execute untrusted code on our machines, that's unsafe. Well, now you can. And I think this is going to change how people do a lot of stuff, and it's going to go well beyond me playing around with the Cloudflare API.
Makes sense. Before we go, I need to ask about the use cases you've seen agents perform. We spoke about a few demos and possibilities, but in the agent space generally, what excites you the most? What use cases, other than the Cloudflare API, could really make a big difference? Yeah, I don't have any particular ones for you. LLMs are really good at taking structured data and making it unstructured, and taking unstructured data and making it structured. It's this fuzzy layer where logic that previously would have been very brittle to write can now just be generated on demand.
And we can see this here: the Cloudflare REST API is a structured thing, my funny prompts are not structured, and the LLM goes from one to the other. Most businesses have something that is structured and something that isn't, and they spend a lot of time converting one into the other. Well, LLMs are going to be pretty useful for all of that, translating the human mind into actual execution, and the opposite as well, which is interesting. Exactly. One of the things I need to ask you: we've been teasing a new announcement all week.
We're recording on Tuesday and this will be published on Friday. What can we say? What excites you about that specific announcement? Ah, the one that's coming this afternoon. Yep. Okay, so, where do I start? What's coming this afternoon, and you'll know all about it when it comes out, it still feels weird to talk about it, is going to really demonstrate the power of the Cloudflare platform. Cloudflare can take anyone's code and run it at the edge, and as long as it's JavaScript in some form or another, it will work.
And the aim is to make that as seamless as possible. We're getting so close; the dev experience is getting so much better, everything is getting better, and we have some incredibly smart people coming up with the wackiest ideas to make it all work even better. That sounds super woolly, but once you've seen the announcement, you'll know exactly what I mean. So yeah, I hope all the listeners are as excited as I am. And if you haven't seen the announcement, please go to the Cloudflare blog and find it.
It will be there. Last but not least, I saw something this week about getting a custom email address for free using Cloudflare plus Gmail. Gmail is actually promoting that people can use their own domain for email, and someone posted the steps for using Cloudflare Email Routing to get a custom email address for free; it was trending on Twitter. It's for a fairly specific use case, not a business sending many emails per day, but it's a very cool thing that's available for free with Email Routing. Yeah.
Yeah, and we're going to have email sending soon. I'm pretty sure I can talk about that; it's going to happen. All of these primitives that you need to build very high-performance web applications, or just generally things that act on the internet, Cloudflare will have. And I think we already have some of the best ones, like Workers. I joined Cloudflare because I was super excited about the direction of Workers and Durable Objects. I saw the beta for dynamic worker loaders and thought, "Oh my god, we can execute other people's code. That's wild." Especially in an era of LLMs.
That's why I joined Cloudflare. And if anyone listening is excited about that sort of stuff too, and about the intersection of agents with all of it, we are hiring on our team, so come and pop me an email or something. Last but not least, is there anything we can say about what's coming this year for agents, if we can say anything at all? Yeah, in the Code Mode blog post we teased that search and execute, this Code Mode stuff, is coming to MCP portals. So I think there's no harm in saying we will definitely have that in the very near future.
You will be able to put your MCP server behind an MCP portal, even if it's just one server, and compress all your tools into about a thousand tokens. That will come very, very soon. In terms of more agent stuff, for me there's a big focus on how we make building agents on Cloudflare amazing. We have the raw ingredients: Durable Objects, sandboxes. Durable Objects sit as this little execution environment that you can give one of to each user.
You can give one to each session, and it persists; the data stays there forever. It has a SQLite database, it has WebSockets, you can do all that funky streaming. That sort of stuff is made very easy with Durable Objects. Sandboxes: you want to run some heavy process with a file system, you want to give your agent access to a CLI sandbox. And how do we wire all of these things up with something like Workflows, where you can do deterministic workflow execution? An amazing product as well. How do we wire all of this together?
There are other products too, like AI Search, our search-index tool, a super good product as well. And browser rendering: you can run browsers in the cloud. That's going to be sick. Imagine your agent going out doing something autonomously, trying to use an API it's struggling with: fine, we'll just pop open a browser window in the cloud and do it in a browser. We need to log in? We can log in, no stress. These primitives are still early, but they're going to be game-changing for agents.
And yeah, I'm super excited for when we can bring the whole thing together into a cohesive piece and have all of them working with the Agents SDK. So it makes sense to say 2026 is the year of the agents. Yeah, or at least, 2025 was already a bit of the year of the agent, but now in a more "make it count" kind of way. Yeah, and getting agents to start doing real work. There was definitely a shift with Opus 4.5 and the equivalent Codex model, like Codex 5.2, where these frontier models are much better at writing code than me, 100%.
They're much better at inferring facts over a huge amount of context than me. They're much better at all of these things individually. So the question is: how can we meet people where they are and make these agents useful? How can we give developers the tools to build that killer agent app that just wasn't possible before because the models weren't good enough? Makes sense. You're not scared for your job? Yeah, not just yet. There will definitely be a point where there's a toss-up... yeah, I don't even want to talk about this that much, but there's probably going to be a point where we become less useful than the agents, and then we have to decide a few things.
But I don't know when. Human-in-the-loop is quite important these days, for sure, right? Yeah. And I just think: go into this with your eyes open. Sure, you can run an agent in dangerously-skip-permissions mode on your Mac mini. Sure. But if you give it access to your Gmail, as someone on Twitter found out, it might delete all your emails, and it might delete them at the speed of light, because that's how it can do things: one API call, or one API call in a loop, and then you're just limited by Gmail's throttling.
All of this stuff is going to happen; people are going to have some very weird experiences. Our aim is to make sure that agents can only do what you let them do, but that they can do everything you want them to do. It's super important that you design these applications, I think "intentionally" is the right word. That makes perfect sense. Let's see how things play out. Yeah, super exciting, isn't it? It is. Thank you for this, Matt, it was great. Yeah, lovely to chat.
And that's a wrap. It's done.