Chatting Agents SDKs with Cloudflare, Anthropic, and OpenAI (Panel)

Cloudflare Developers | 00:50:58 | Mar 26, 2026
Chapters: 8
Overview of how the agent landscape has cleared, with use cases identified and API stability improving.

Cloudflare’s panel showcases how agents SDKs from Cloudflare, Anthropic, and OpenAI interoperate, with code mode, durable objects, and cross-runtime tool use leading to real-world customer and developer wins.

Summary

Cloudflare’s panel brings together Sunil (Principal Engineer) and Kate (Engineering Manager) to reveal how the past year has been the ‘year of the agents’ and how the team has grown to ship a stable, pragmatic product. Sunil borrows an Age of Empires metaphor to describe clearing the fog of war around agent use cases like coding agents, research, and customer support. The discussion highlights efforts to make the agents SDK play well with others, allowing frameworks like the AI SDK, LangChain, and Mastra to run inside Cloudflare’s execution host based on durable objects. Guests from Anthropic and OpenAI join to compare the origin stories, guardrails, and goals of their respective agent SDKs across multiple runtimes and platforms.

A key theme is portability and runtime-agnosticism: code mode, sandboxed environments, and the shift from durable objects to serverless function variants to ease adoption. The panel teases an ambitious road map—expanding code mode, making primitives work with other runtimes, and enabling robust agent-to-agent coordination at scale. Throughout, developers are encouraged to try the new tools, contribute PRs, and participate in shaping a more accessible agent ecosystem. The group emphasizes a developer experience that minimizes config churn and maximizes interoperability across runtimes, with ongoing bets on guardrails, memory models, and lifecycle controls for long-running agents. The discussion closes on community energy and a nod to the broader ecosystem’s potential to redefine how agents operate in production environments.

Key Takeaways

  • Code mode is a top roadmap item: generate code in a sandboxed environment and run it across runtimes like Vercel or Fly.io, with Cloudflare’s dynamic isolates enabling safe, scalable execution. [00:12:40]
  • Code mode will be runtime-agnostic and not Cloudflare-specific, enabling agents to run on Vercel, Fly.io, or other JS runtimes via PRs and SDKs. [00:13:55]
  • Primitives for cross-runtime coordination are a priority: Cloudflare plans to expose agent primitives that work with others' toolkits and models, not lock users into a single stack. [00:20:10]
  • Guardrails and human-in-the-loop patterns are essential: robust for long-running tasks, resumable state, and safe admission controls in multi-agent workflows. [00:37:25]
  • MCP (Model Context Protocol) patterns are being evolved with practical guidance: tool definitions, context handling, and code generation approaches to manage many tools at scale. [00:49:05]
  • Memory concepts are evolving toward portable memory: a pluggable memory layer that can attach to any agent, preserving context across agents and sessions.

Who Is This For?

Developers and platform engineers building AI agents who want practical guidance on cross-runtime deployment, durable-object hosting, and code-mode experimentation. Essential viewing for teams evaluating how to combine Cloudflare, Anthropic, and OpenAI tooling into a scalable agent strategy.

Notable Quotes

"I like all my references to be boomer references. So anyone here played Age of Empires?"
Opening joke that sets the tone for discussing the 'year of the agents' and fog-of-war metaphor.
"Code mode will be runtime agnostic. You'll be able to run it on Vercel, Fly.io, wherever—your favorite purveyor of JavaScript runtimes."
Sunil outlines a major architectural goal for code mode and cross-runtime compatibility.
"We play well with others. And I’d love to bring up our guests and talk about how you define your agent SDK."
Emphasizes interoperability across different SDKs and runtimes with Anthropic and OpenAI.
"Durable objects were originally called agents in the codebase—and then we decided to pick the worst name and call them durable objects instead. We host your agents—the hardware for AI agents."
Describes the hosting primitive and its evolving branding.
"Guardrails around long-running tasks are crucial—the pattern scales with resumable state and human-in-the-loop approval."
Highlights the importance of governance in agent workflows.

Questions This Video Answers

  • How does Cloudflare's code mode work across runtimes like Vercel and Fly.io?
  • What is MCP and how do you manage hundreds of tools in an agent framework?
  • How can I make an agent run in both text and voice modes without rewriting code?
  • What are durable objects and how do they host AI agents at scale?
  • How do Anthropic and OpenAI's agent SDKs differ in approach and philosophy?
Tags: Cloudflare Workers · Agents SDK · Durable Objects · MCP (Model Context Protocol) · Code Mode · Dynamic Isolates · Serverless Functions · Anthropic · OpenAI · Real-time / Voice Agents
Full Transcript
[music] That heat. All right, let's hear it for Sunil, uh, principal engineer on the—what, no—Lord of the Squirrels. The official title is Lord of the Squirrels. And Kate is the engineering manager on the team. The team has been growing a lot. Tell me about what's been happening this year, the year of the agents. What has happened this year? So, um, I like all my references to be boomer references. So, anyone here played Age of Empires? Yeah. Okay. So you know that when you start a game in Age of Empires, you take that little priest, the wololo guy, and he walks the landscape. He removes the fog of war, and it feels like that's what the last year has been in the agent landscape. We now know the use cases people are using it for. It looks like it's finding PMF with things like coding agents, research, and customer support. And we know who the customers are. And I'm quite happy the API is starting to stabilize, but it feels like the landscape has cleared. The second thing we did—the smartest thing I did—is we've hired people smarter than me on the team. It's now Kate's headache. The rest of the team is in the UK. They're incredible. Nares and Matt and Steve—he's Spanish, so he goes by Steve James, but his real name is pronounced "Estee Yames." I get a kick out of that every time. Um, and a number of things have happened. We know what we want to invest in. So Matt, for example, spends time not just on making MCP inside agents better, but he works directly with the MCP team at Anthropic too, because there are a bunch of Python programmers there. We're trying to show them how to write JavaScript. We're trying to fix the TypeScript SDK one PR at a time. Um, I mentioned this to you: one of the nice things about our agents SDK is that we want to play well with others.
While there are many great agent frameworks out there, like the AI SDK and LangChain and Mastra, etc., we want all of those things to run inside the execution host that our agents SDK provides. They're durable objects that spin up across the planet, and we plan on doubling down on that, doing way more. By the way, shout out to Dom, who's here, who accepted all my PRs to make the OpenAI Agents SDK work on Cloudflare. It was a lot of fun. It's also how we built the phone call demo—the server for that is 40 lines of code, to be able to have an agent that you can make phone calls to. So fun. So that's kind of what the year has been like. We now have a real team. Kate makes sure that we actually build and deliver stuff, and that we build and deliver stuff that people want, instead of me saying, "Oh, I think this might be a good idea." What about you? What do you think about the last year? Well, it's been quite a journey, I think, picking up the library. He was the one who was working on it for like two months. It was a crazy idea. They were like, "Okay, Sunil, go run with it." And he's like, "Okay, I've got it." And so at some point this became a team, which is a huge achievement, but also a great indicator that there is space right now for this, and people actually build agents—because before, you know, what is an agent? Is it a workflow, or is it a cron job, or how can I use it? Now, as he rightfully said, we see customer and employee agents. We see personal assistants. There's a bunch of use cases. Pretty much in any software company right now, there is a team, or at least one person, who is building their agent in whatever shape or form. Um, and I think that's a massive achievement, not for us as a team but for us as an industry, to finally find that product-market fit, because I think it was a very daunting journey to get there. But yeah, there are lots of cool things we're working on, and I invite you guys to try them.
Can we leak a little bit? Can we leak? We can leak a little bit of the road map. Can we do that? Yeah, we can talk about the road map. All right. What's going on? What's happening? What's coming up? Um, I think we are happy with the base agent class. [laughter] We're not happy with the base agent class yet. Yeah. We have interesting experiments that we are going to invest more in. So, for example, code mode. I don't know if you folks know what code mode is. It's a thing we're doing where, instead of having this back-and-forth with tool calls with the LLM, we ask the LLM first to generate a whole bunch of code based on the task, and we run it in little sandboxed environments. We were speaking to Anthropic and they kind of gave us the little idea, and we ran with it, and Cloudflare has this tech called dynamic isolates that lets you run it in a very safe, sandboxed way, and you can spin up millions of them, etc. So I think we're going to invest more in code mode and make it actually really good. And, fun fact, hopefully by next week I'll have shipped it—I have the PR open—code mode will be runtime-agnostic. You'll be able to run it on Vercel, Fly.io, wherever—your favorite purveyor of JavaScript runtimes—and it's not going to be Cloudflare-specific tech. Speaking of which, the other thing we're doing is taking pieces of the agents SDK and making them not Cloudflare-specific either. So, for example, our MCP server is still based on a durable object, but we have a new version which is just a serverless function that you can run, again, on any JavaScript runtime. People really like that, because durable objects kind of break people's brains, and just making it a function is super nice.
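The code-mode idea described here—have the model emit one program up front instead of looping through many tool-call round trips—can be sketched as below. Everything is illustrative: the "LLM" is mocked, and `new Function` merely stands in for a real isolate/sandbox, which it is not.

```typescript
// Sketch of "code mode": the model emits one program, and we run it once
// in a constrained environment, instead of N tool-call round trips.
// All names are hypothetical; not the actual Cloudflare API.

type Tools = Record<string, (...args: any[]) => unknown>;

// Stand-in for the LLM: given a task, return a program as source text.
function mockGenerateCode(task: string): string {
  // A real model would synthesize this; here it is hard-coded.
  return `
    const nums = tools.fetchNumbers();
    return tools.sum(nums);
  `;
}

// "Sandbox": the generated code sees only the tools we hand it.
// (new Function is NOT a security boundary -- production systems would
// use V8 isolates or a separate process.)
function runInSandbox(source: string, tools: Tools): unknown {
  const fn = new Function("tools", source) as (t: Tools) => unknown;
  return fn(tools);
}

const tools: Tools = {
  fetchNumbers: () => [1, 2, 3, 4],
  sum: (xs: number[]) => xs.reduce((a, b) => a + b, 0),
};

const program = mockGenerateCode("add up the numbers");
const result = runInSandbox(program, tools); // one execution, zero round trips
```

The key property is that the model pays the context cost once, and the generated program composes tools locally instead of shipping each intermediate result back through the model.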
The other biggest thing I think we are going to do—again, we are not in the business of making an agents framework, but we do want to start giving primitives for coordinating like millions of these agents, especially because they're durable objects that can live across the planet, not in a single process, so to speak. So I guess we're going to make an agent framework for that, but it'll still run other people's code inside it. The other big thing we're doing: right now agents is a library that works in the Cloudflare Workers ecosystem. Everybody knows Workers—you make a wrangler.json file, you spit out some code. I think we're going to try to flip that around, because we have the nice package name. We want to know what npx agents create, npx agents dev, and npx agents deploy might look like. And how do you approach it from an agent-first experience? Like, should npx agents dev spin up Claude Code instead of you doing it yourself? Can we ship the development environment directly in the thing? Can we have Cloudflare agents that write their own code and restart themselves? So we have a whole bunch of ideas. I haven't even gotten this past the product people yet—I think they might be hearing it for the first time now, while I'm saying these things on stage. Did I cover it? Do we have other things we're working on? Yeah, I mean, what else? We're trying to figure out what agent-to-agent interactions really look like.
Um, I think we have a very good understanding at this point of what it's supposed to be like, and it's not just—let's say, you know, agent-to-agent was first introduced in this concept of deep research, where you give it a task, it goes away, it spins up sub-agents, it does some work, and then it comes back to you. So I think we are toying right now with this concept on a much deeper level, and we're trying to work out how and where this can be useful, but also how do we give this coordination infrastructure to the user so they can run a bunch of their own agents and sub-agents without acquiring a PhD in agent frameworks. That's a big thing. I think the other thing is developer experience. There are lots of different things out there, and the thing that differentiates is how quickly you can spin it up and how much you need to learn to make it work. And we're obsessed with shaving configs, making things just easier to understand. That's a big thing. That's pretty much all I would say, but you'll find out a lot more on Twitter. [laughter] Yeah, I would love to. One of the big themes that we keep saying is: we play well with others. And I'd love to bring up our guests—Dom, if you could come up here. Because we're going to break the world record, as far as I know: we're going to have the most agent SDKs on stage at one time, because you both have an agent SDK, and we have an agent SDK. Let's just talk about how you define your agent SDK. You first. Yeah, I wish we all got in a room before we decided— That's okay. Um, yeah, I think the word is still overloaded. Like, what is an agent? Is it one-to-one? Can you have billions of them? If you fork a context, is that a different agent? Or, you know, if you have two Claude Code sessions, are those two different agents, or is it just your Claude Code running?
Um, I think that our definition at Anthropic is really an agent like Claude Code. So with Claude Code, you talk to it and it does things, and there are guardrails, right? We have permission prompts and things like that, but it mostly just decides to do it. It's not, um, being told "you have to lint," for example, but it's prompted and it has the ability to use the bash tool to lint. So I think that's how we think about agents a lot: tools—oftentimes things like bash—and constraints, and running for long periods of time. And I think that is what our Agent SDK is, right? The software that enables that to happen. There is obviously a lot of different infrastructure needed when you're running a process that runs for, like, half an hour. I sort of feel like if you can run your agent in a serverless environment or something, it's not really an agent, you know? Like, if it needs to respond in 30 seconds, it's more of a workflow. So that's how I think about the Agent SDK. Um, I like to use the comparison to JavaScript frameworks, because that's how I came up. I remember when we were deciding to build web apps—Facebook had a web app, and you're like, "Oh, how do I make something like Facebook?" And so when I think of the Cloudflare Agents SDK, I think of something like Next.js: a great way to host these rich React or JavaScript frameworks. I think of us as like React. We're very opinionated, you know? We're like, "You need a bundler now. What is JSX? I know you're using it." And in the same way, it has to run in a sandbox, things like that—it's a little bit annoying, but it's very powerful. Um, so that's how I think about us.
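The loop described here—a model that repeatedly decides its own next tool call, constrained only by permission-style guardrails—can be sketched as follows. The model is mocked with a fixed policy; the tool names and allowlist are hypothetical, not Anthropic's API.

```typescript
// Minimal sketch of a tool-loop agent with a permission guardrail.
// A real agent would call an LLM where mockModel is; names are illustrative.

type Action = { tool: string; args: string[] } | { done: true; answer: string };

// Mocked policy: list files once, then finish.
function mockModel(history: string[]): Action {
  if (!history.some((h) => h.startsWith("ls:"))) {
    return { tool: "ls", args: ["."] };
  }
  return { done: true, answer: "listed files" };
}

const tools: Record<string, (...args: string[]) => string> = {
  ls: (dir) => `ls:${dir} -> a.txt b.txt`,
};

// Guardrail: only allowlisted tools run without human approval.
const allowlist = new Set(["ls"]);

function runAgent(maxSteps = 5): string {
  const history: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const action = mockModel(history);
    if ("done" in action) return action.answer;
    if (!allowlist.has(action.tool)) {
      throw new Error(`tool ${action.tool} needs human approval`);
    }
    history.push(tools[action.tool](...action.args)); // result feeds next step
  }
  return "step budget exhausted";
}

const answer = runAgent();
```

Note the agent decides *when* to stop; the guardrail and step budget are the only external constraints, which matches the "prompted, not told" framing above.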
And I think of a lot of other agent frameworks as a little bit more lightweight—hey, it's like jQuery or something, you know? You have a lot of control, there's minimal abstraction, but if you want to do something intense, like build Facebook, it's a little bit harder. And I think the big thing about why I'm really excited about our Agent SDK is that our agents are built on top of it, right? So you know that we're dogfooding it, the same way that Facebook dogfooded React. So, um, yeah, that's how I think about it. Thanks. And Dom, how do you talk about your Agents SDK? I feel like, first of all, we need to do that Spider-Man meme later. Um, I think the interesting thing is more sort of where the three different agent SDKs came from, because I think they all come from very different spots, right? So you clearly have your origin in Claude Code. For OpenAI—actually, how many of you have heard of Swarm? Yeah? All right. So Swarm was basically from our go-to-market team, our SEs that were working with companies, wanting to provide an educational way of, "Here are the patterns that work really well when building agents." And so we published that on GitHub, and it really took off, even though it wasn't meant to be an SDK. Um, and so the Agents SDK was a natural evolution of that—we took those same patterns, but this time actually shipped it not as a sample repo that you download, but as an SDK that you could build on top of.
And so a lot of the same principles still exist, where the Agents SDK is supposed to give you the patterns of how we saw agents really work well. And it goes back to our agent definition, which I don't think is too different: realistically, we think of an agent as a system that performs an action on behalf of a user, and it typically has a model, a set of instructions, tools to perform the task, and a runtime to then actually execute that. And so if you look at our Agents SDK, it's intentionally very lightweight. We're not trying to build a framework. You used the React analogy, so I guess we're jQuery. Um, but we're intentionally trying to keep it flexible. It works with any model—we put a lot of effort into making sure that it doesn't have to work with only OpenAI. In Python you can use any model via LiteLLM; for the TypeScript one, you can use anything via the AI SDK. And it's fully open source, so if you want to help out like Sunil did, we always welcome PRs. And so it's really more meant as a way for us to share the building patterns and make it really easy for you to get started and build agents that still scale to production. So we work with companies like Temporal to make sure that you can actually run your agents for hours or days in a robust way. We worked with folks like the Cloudflare folks to make sure that it's really easy to deploy an agent—like a voice agent—in a few lines of code. Um, and so that's really how we're approaching it. Why don't you talk about how you define yours? Yeah, I like how we're talking about the origins of this, because it turns out—and it's even in the codebase—durable objects were originally called "agents," and this is way pre-LLM stuff. And we then decided to pick the worst name and call them durable objects instead.
But this idea of having these little autonomous compute-and-storage things comes from that lineage of durable objects. It's a host—which is to say, if you look at a computer, is the hardware the computer, or is it the OS that runs on the computer? And I think the durable object tries to be the hardware, the place to host your agents. I think that's actually our tagline on GitHub: a nice place to host your AI agents. Which is why we work very hard at not being opinionated about how you build your agents—like I said, bring your framework. But we kind of want to—because we hear from people, because we are a hosting company, I guess—they're like, "We have all these programs that could potentially run for minutes, hours, days, weeks. How do we do it efficiently? How do we do it without breaking the bank?" And it turns out we magically have this little host container that does it. So that's where our agents SDK comes from. Do you have an operating system, kind of? Yeah—because we have the addressable hooks, and the ability to hibernate, to wake up and go to sleep when no I/O is happening. Yeah, it's actually closer to an operating system for agents. We should steal that idea and put it on the website. That's my input. Yeah, I think a good definition of what we are trying to do is: we are the where and how, and, for example, Anthropic and OpenAI are the what and why. You use their toolkits or SDKs, and we, again, host your agents, in whatever capacity, with whatever access you want. Um, yeah. I know that we're seeing coding agents a lot, and we've built some together. I would love to know: what are people building with your agents SDK that makes you go, "This is so cool"? You want people here to know and be inspired by that, if we could talk about that. Yeah, I mean, I think there are valuable use cases and fun use cases.
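The hosting model described here—an agent that persists state, hibernates when no I/O is happening, and rehydrates on the next message—can be sketched with a toy host. `AgentHost` and `MemoryStore` are invented names for illustration, not the real durable-objects API.

```typescript
// Toy sketch of a durable-object-style host: agents sleep when idle,
// and their state survives hibernation. All names are hypothetical.

interface Store {
  load(id: string): Record<string, unknown> | undefined;
  save(id: string, state: Record<string, unknown>): void;
}

class MemoryStore implements Store {
  private data = new Map<string, Record<string, unknown>>();
  load(id: string) { return this.data.get(id); }
  save(id: string, state: Record<string, unknown>) { this.data.set(id, state); }
}

class AgentHost {
  private live = new Map<string, Record<string, unknown>>(); // awake agents
  constructor(private store: Store) {}

  // Deliver a message, waking (rehydrating) the agent if needed.
  onMessage(id: string, msg: string): number {
    const prev = this.live.get(id) ?? this.store.load(id) ?? { count: 0 };
    const state = { count: (prev.count as number) + 1, lastMsg: msg };
    this.live.set(id, state);
    return state.count;
  }

  // Hibernate: persist state and evict from memory until the next I/O.
  hibernate(id: string): void {
    const state = this.live.get(id);
    if (state) { this.store.save(id, state); this.live.delete(id); }
  }

  isAwake(id: string): boolean { return this.live.has(id); }
}

const host = new AgentHost(new MemoryStore());
host.onMessage("agent-1", "hello"); // wakes fresh, count = 1
host.hibernate("agent-1");          // persisted, no memory held
const count = host.onMessage("agent-1", "again"); // rehydrated, count = 2
```

This is the "hardware, not framework" point: the host only manages lifecycle and state, and any framework's agent code could run inside `onMessage`.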
I'll talk about both of them. Obviously customer support is a big one. The thing with customer support is there's a long tail of complexity and edge cases, and what you hate is when you're connected to an AI and it asks you some questions and then it's like, "Okay, I'm going to connect you to a human," right? Being able to handle the long tail of edge cases is very, very important, and you need to handle a lot of ambiguity. I think code generation is actually a great way to do that. So that's something that we're seeing. And I think data is the other big case, you know—databases, data observability, business insights, things like that. These are all very coding-adjacent spaces, right? And I think there's a lot of opportunity there to really transform how these people work. Like, our finance team only uses Claude Code, you know? I mean, I don't even know what other tools they would use. So, um, and then on the fun side, I think there's some interesting stuff with games. I kind of want to make a demo which is like "Claude Code plays Pokémon," where I give it access to the memory stack, you know, and just let it figure out what's happening—be like, "Oh, this is my party. Write a script to extract your party," right? To navigate, right? Um, and yeah, I think customer service is a thing that everyone is going to bring up. Yeah, I do think one of the things that is cool—and where I put a lot of intentionality in when I started working on the Agents SDK—is being able to build the same agent regardless of whether you're building a voice agent or a text-based agent, and being able to jump between those.
So one of the things that I put a lot of energy into, with the TypeScript Agents SDK especially—the Python one works similarly, there's a bit more wiring up you have to do—is that you can use the same agent code to run in the browser and on the server, whatever your server looks like. So one of the cool things is you can create the same agent definitions—maybe a slightly different system prompt if you want to steer the realtime speech-to-speech model—but then you can use the same tools, you can use the same guardrails, you can reuse a lot of that code, and have essentially the same agent, text-based or wherever you want it: on the phone, or even in the browser via WebRTC. And you're using the same code, the same TypeScript types, and it really feels familiar. Um, so that has worked really well with a couple of customers, where we've seen them start off with a text one and really quickly switch to also shipping a realtime-based one that they could talk to over the phone, for example. Because—I don't think you've worked with that yet—we now have support within the Realtime API directly, so at that point you don't deal with any audio processing anymore. That is all handled directly between Twilio, or whatever voice connectivity you have, and OpenAI. And then you just host a WebSocket server that does the tool calling, essentially, or whatever events you want to listen to. But again, it's all the same SDK, so it's really easy to switch between all of these without having to change things. And so that has been really cool to see. Um, yeah—because WebSockets are so much easier with agents—customer support, I guess. Everyone wants to build a chatbot, and we provide a full-fledged backend for the AI SDK's useChat hook, with persistence etc., all out of the box, and people like that, and they just start adding tools onto it. So customer support is a big one. A couple of fun ones:
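The "define once, run in text or voice" pattern described above can be sketched like this: the agent definition (instructions plus a tool table) is a plain value, and each transport just dispatches into it. The transport functions and tool names are invented for illustration; in the real SDK, audio flows through the Realtime API while the server only handles tool-call events.

```typescript
// Sketch: one agent definition, two transports. Names are hypothetical.

interface AgentDef {
  instructions: string;
  tools: Record<string, (arg: string) => string>;
}

const bookingAgent: AgentDef = {
  instructions: "Help the user book a car.",
  tools: {
    bookCar: (time) => `car booked for ${time}`,
  },
};

// Text transport: tool calls happen in-process alongside the chat loop.
function runText(agent: AgentDef, toolName: string, arg: string): string {
  return agent.tools[toolName](arg);
}

// "Voice" transport: audio is handled elsewhere (WebRTC / telephony);
// the server only receives tool-call events -- but dispatches into the
// exact same tool table as the text transport.
function onVoiceToolCallEvent(
  agent: AgentDef,
  event: { tool: string; arg: string },
): string {
  return agent.tools[event.tool](event.arg);
}

const textResult = runText(bookingAgent, "bookCar", "7pm");
const voiceResult = onVoiceToolCallEvent(bookingAgent, { tool: "bookCar", arg: "7pm" });
// Same agent, same tools, same types -- two transports.
```

The design point is that tools and guardrails live on the definition, not the transport, so switching modality never forks the business logic.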
I implemented the final scene from WarGames with an agent, where it plays tic-tac-toe against itself again and again, and it's like, "Oh, the only winning move is not to play." Then there's a guy who built a drone that looks for colored walls and goes towards them, and he whacks at it with a bat in the name of AI safety. That's been pretty fun. What else have we seen? Um, pretty sick personal AI agents. Funny enough, Steve on our team has implemented an incredibly comprehensive personal agent. It literally orders food for his dog. It replies to job offers—not in a nice way, thank God. Good for me. Um, but it's insane. I think the second trick for us is that our agents SDK also has the MCP part, which gives it a whole different angle, and we've seen that part being utilized in a very cool way. We've seen some travel companies use it to completely redefine the way you book something. You can give it context like, "I need to book a car to get from A to B, but I need my car at like 7:00 p.m. because I need to pick up my kid and then go somewhere." You kind of send it the messages you would send to your partner, to your mother-in-law, whatever, and then it just gives you, "Well, okay, this is your option. Should I book it for you?" And that is great. Um, I think we saw some good onboarding agents. No one likes to fill in boring forms, and I recently saw a very cool agent that really asked me like three questions, and then my profile was fully configured for everything I need. I think this is the really cool part. There are a lot of companies who struggle with onboarding, who struggle to get people to start using the product, and I think that's where agents come in and help you.
Um, similarly, we've seen search being redefined a lot with MCP in general—whenever people ask a question in a dashboard, or just in the search bar, the search doesn't just return the article, it returns the answer. That's kind of what we've seen the most, I'd say. Yeah. MCP is huge, by the way—so much MCP usage. It's also something everyone loves to hate, you know? They'll use it, it's very useful, and they're like, "I hate this. This is weird." Yeah. I'm going to do a little round of thank-yous. So, thank you—we built a thing. We have a sandbox, and we got to play with a Claude agent in our sandbox right as it came out. It was super awesome. And I've been enjoying your hot takes on "all you need is bash." Can you dive a little bit into what you mean by that? Because I felt it when I built there. Yeah, definitely. So, for context, I think probably the reason we built the Claude Agent SDK on top of Claude Code is because we built Claude Code and we saw all these people across Anthropic use it for many different non-coding tasks, right? And the secret to it is that it's just using bash. Bash is, you know, a way to control computers, right? It's a way to do anything on the computer. And so I think what we've done really well at Anthropic is that we really believe in AI getting better. I think you can look at our track record: "Hey, Sonnet 3.5 is going to be great at coding, and we think coding is going to be very important," right? Like MCP, right? We were very early, of course. Yeah, now people are like, "I have a hundred MCPs and it's not very context-efficient." I'm like, okay, well, you know, when we first made this spec, it wasn't so popular, right? So—MCP, skills, browser use, and computer use—we were all sort of very early to those emerging capabilities.
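The "all you need is bash" point above—one general command tool plus guardrails, instead of many bespoke tools—can be sketched as a tiny wrapper. The allowlist and command names here are made up; a real deployment would also sandbox the process itself rather than trust string checks.

```shell
#!/bin/sh
# Sketch of a single bash tool with a guardrail allowlist.
# The allowlist is the (illustrative) guardrail: anything else would need
# human approval in a real agent loop.

run_bash_tool() {
  cmd="$1"
  program=$(printf '%s' "$cmd" | awk '{print $1}')
  case "$program" in
    echo|ls|cat|grep)
      sh -c "$cmd"          # allowlisted: run it
      ;;
    *)
      echo "guardrail: '$program' requires human approval" >&2
      return 1
      ;;
  esac
}

run_bash_tool "echo agent-ran-this"        # allowed -> prints agent-ran-this
run_bash_tool "rm -rf /tmp/scratch" || true  # blocked before anything runs
```

As the speaker notes, the guardrail layer is what shrinks over time: the tool surface (bash) stays fixed while models get better at using it.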
And I think that's something we've done a really good job at. Bash is just obviously the next thing to us, where we think agents are going to get more and more general and simpler. And so bash is this way of exposing a lot of interesting primitives right away. So, regarding MCP, I've seen someone turn MCPs into little CLI objects, which is kind of interesting, because when you call a CLI, it can list the tools for you, right? And so you get, for free, this progressive disclosure of tools. And so, if you get creative, you can turn all of these tasks into bash. And, you know, a lot of times the questions I get are like, "How do I make sure my AI agent gets better with the models? I hate rewriting all the code." Well, I'm like, it's just going to keep getting better at bash, right? And if you build it on bash—yeah, you'll need to add a few guardrails, right? Like with Claude Code, you still need to have a human sort of interact with it. Um, but that's the foundation that I think about: hey, build it on bash, then build your guardrails and your human interaction around it, and as the models get better, you'll need less of that over time. So, yeah. And this is a thank-you to you, Dom. You built some awesome patterns. I feel like you took the agentic patterns—and actually, thank you to you as well, for originally publishing those and then extending them. So, if you're looking for a place to do those patterns, there are always really great reference examples. Is there a reference example—this is off the top of my head—that you wish was out there? Where do they find the reference examples?
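The "MCPs as little CLI objects" idea mentioned above—progressive disclosure, where the model first sees only a one-line index and expands a tool's subcommands on demand—can be sketched as below. The interface and tool names are illustrative, not part of any real MCP server.

```typescript
// Sketch of progressive disclosure: tool groups behave like CLIs that can
// list and describe their own subcommands. All names are hypothetical.

interface CliTool {
  name: string;
  describe(): string;       // cheap: one line for the tool index
  subcommands(): string[];  // fetched only when the tool is chosen
  run(sub: string, arg: string): string;
}

const calendarCli: CliTool = {
  name: "calendar",
  describe: () => "calendar: manage events (run `calendar --help` for subcommands)",
  subcommands: () => ["list", "add", "remove"],
  run: (sub, arg) => `calendar ${sub} ${arg}: ok`,
};

const registry = new Map<string, CliTool>([[calendarCli.name, calendarCli]]);

// Step 1: the model only ever sees the one-line index -- tiny context cost,
// even with hundreds of tools registered.
const index = [...registry.values()].map((t) => t.describe());

// Step 2: once a tool is chosen, expand just that tool's subcommands.
const chosen = registry.get("calendar")!;
const help = chosen.subcommands();

// Step 3: execute the chosen subcommand.
const result = chosen.run("add", "dentist-3pm");
```

This is what makes "a hundred MCPs" tractable: context cost scales with tools *used*, not tools *installed*.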
Um, I mean, most of it is directly within the GitHub repos for both the Python and the TypeScript Agents SDK. Again, they're fully open source, so you can reverse-engineer whatever you want to. Um, I think one of the patterns that maybe is crucial, but that we talk less about because it's not the fun part, is the guardrail side—really thinking through not just input guardrails, like scanning whatever was put in and then running it through an LLM as a judge, but also guardrails around human approval and human-in-the-loop for things that might take a while, right? One of the very validating things when I worked on the TypeScript Agents SDK was that I wanted to build human-in-the-loop approval in a way that scaled, because it was resumable. It didn't matter how long it would take to respond, right? The server could go down, as long as you have somewhere to keep it stored. Um, it's not the best pattern for a quick start, because now I'm going to talk to you about where you're storing the data, and when you're going to deserialize it and bring it back, and how you version your worker, and other things like that. Um, but I think it's so crucial, because we are still in a state where people having agents want the immediate gratification of the response coming back. But what if you're running an agent that takes hours and you walk away, right? You want the agent to be able to pause when there's a critical thing, where it violated a guardrail—for example, because it tried to hand off data to another agent that it's not supposed to. You want it to stop in that moment, so you can adjust and fix it, but it might take an hour or two until you get back to it.
So I think that's a pattern we have in there that it would be fun for more people to explore, and I know Sunil was very excited when we brought it up.

I don't even think it's only about critical points. There are times when I just don't want to make a decision: yeah, I'm going for lunch, I'm not going to deal with this for a while. So having frameworks with built-in primitives for pausing and resuming, and even for generating a little bit of UI when you need more complex input from the user, has been pretty good. And thanks for putting it in the SDK.

Let's try something. I want to do this instead of clapping. We've been building stuff with these folks up here, so can we just say it out loud, maybe on a one, two, three? Can we just say thank you? One, two, three. There we go. That's awesome. Do we have any questions? Any questions out there? Actually, can you project? You look like you can project. You got this.

[Audience question, partly inaudible, about the agent roadmap and data primitives: state sync, file system.]

Okay, so that's like three different questions. The first one: I don't think we're happy with our implementation of the sync thing right now. It's way too primitive, and people try shoving things into it; right now that state object can be a maximum of 2 MB in size, and people go over that very, very quickly. So I think we want to do something a little smarter. Also, because they're multiplayer by default, you kind of need conflict resolution inside it. We do want a file system, but I think we're going to attack it in a slightly different way, because we now have sandboxes that are full-fledged computers,
which means there's the potential for building an agent from the agents SDK that actually runs on Bun inside there and communicates with the durable object version of the agent. So that's a new capability that's been unlocked for us. Those sandboxes don't do persistence right now, but, as a secret: yeah, we're going to ship persistence for our containers very, very soon. Oh, I'm so excited about that. Huh? Who knows? If this is what I get fired for, I'd be a legend on Twitter. It's fine. So no, I don't think we're there yet. What we did was just a first take, and it seems to have carried us all the way to now. There's a bunch more to do ahead.

Related to that: is there anything else you'd want to provide in the runtime? Obviously, when people are working with agents, you have memory.

Yeah, I think memory is a good one, and we have something we did a proof of concept for; well, I guess it's open source, so I can share it. The way we think about memory is: what if memory weren't just part of your one specific agent? There are so many agents. What if, like USB-C, you could take your memory and plug it into any agent, an Anthropic agent, an OpenAI agent, any agent, and you don't lose any context; it just keeps carrying over. So we're working right now on this concept of portable memory: what if you could just make it pluggable into whatever. That's one of the things we're actively working on. Yeah, question. Go ahead.

I have a pretty nerdy question.
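As a rough illustration of the portable-memory idea (the `PortableMemory` interface and everything around it is hypothetical, not Cloudflare's actual proof of concept): memory becomes a small serializable interface that any agent, on any SDK, could accept.

```typescript
// Memory as a pluggable, serializable interface rather than a feature of one
// specific agent: export it from one agent, import it into another.
interface PortableMemory {
  remember(fact: string): void;
  recall(query: string): string[];
  serialize(): string; // carry the memory to another agent
}

class InMemoryStore implements PortableMemory {
  private facts: string[] = [];
  remember(fact: string) { this.facts.push(fact); }
  recall(query: string) { return this.facts.filter((f) => f.includes(query)); }
  serialize() { return JSON.stringify(this.facts); }
}

// The "plug it into any agent" half: rebuild the memory from its exported form.
function importMemory(serialized: string): PortableMemory {
  const m = new InMemoryStore();
  for (const f of JSON.parse(serialized) as string[]) m.remember(f);
  return m;
}
```

A real implementation would back this with durable storage and semantic retrieval, but the interface boundary is what makes the memory portable.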
So as you start to scale tools and MCP servers, specifically, say you're up to 100 MCP servers or 2,000 tools: how have you begun to (a) keep setting up all these servers from eating up context and tokens, and (b) have you built anything like a simple search function where you could look up tools with similar features, say a search-contacts feature, so maybe it's maropost_search_contacts or gohighlevel_search_contacts? Have you played with any of those types of things? Because I'm about to engineer a solution, and I'd like to save myself some time if you've already figured it out.

Yes. I think the programmatic tool use blog post for MCP is great. I also challenge whether people should be using MCP at all. Originally MCP was developed for things like Claude Desktop and Claude.ai, where we want to create an ecosystem where anyone can connect to these agents and we don't know what they're going to connect; that's why there's a bit of a protocol there. But if you're building your agent end to end, you control the entire control flow. So if you're on the order of hundreds of tool calls, I would think: hey, make it a TypeScript definition and generate code against that. Then just execute the code, lint the code, things like that. You probably have your own API keys, so it's a little bit independent. But if you genuinely need the hundreds of MCP servers because your users are bringing them, then yeah, the programmatic tool use post you guys put out is a good way to do it. Thank you.

It's called code mode. Code mode, that's right. There might be some available on the back table.
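The questioner's tool-search idea can be sketched in a few lines; the catalog entries and the `searchTools` helper are illustrative only. The trick is that the full catalog stays out of the model's context, and only the few matching tool definitions get injected:

```typescript
// A catalog of tools across many MCP servers, kept outside the context window.
type CatalogEntry = { server: string; tool: string; description: string };

const catalog: CatalogEntry[] = [
  { server: "maropost", tool: "search_contacts", description: "Search Maropost contacts" },
  { server: "gohighlevel", tool: "search_contacts", description: "Search GoHighLevel contacts" },
  { server: "calendar", tool: "create_event", description: "Create a calendar event" },
];

// The agent gets one cheap `search_tools` tool; only the matching handful of
// real tool definitions is then surfaced for the current task.
function searchTools(query: string, limit = 5): CatalogEntry[] {
  const q = query.toLowerCase();
  return catalog
    .filter((e) => `${e.server} ${e.tool} ${e.description}`.toLowerCase().includes(q))
    .slice(0, limit);
}
```

With 2,000 tools, the difference between "all schemas in the system prompt" and "five matching schemas on demand" is most of your token budget.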
So I guess I'll stand here. I've used most of y'all's agents SDKs, and when I think about the split between the SDKs, I think of Cloudflare's agents SDK as more of a runtime environment for the agents, versus Anthropic's and OpenAI's as more abstractions over their completions API, or chat API for Anthropic, I guess. Where do you all see the border of that? I feel like there's some ambiguity in certain cases: Cloudflare's agents SDK has some MCP capabilities and some memory capabilities, and you all have some state management capabilities. Where do you see those overlap, and where do you see the ownership of those capabilities ending up?

All right, that's a fun question. I mean, it's a good question, but the way we think about it is that we want to stay very neutral in terms of what you can and cannot do. We'll provide all the tooling on top of the primitives we have, so when it comes to, say, hosting, you'll be able to do that. But we want to make sure that whatever choice you make in terms of your toolkit, in terms of your framework, no matter what that is, it still works with Cloudflare. I think that's where we draw the line. If at some point we become way too opinionated about how something works, I would say we've failed at executing that strategy. For now, I'd say: make it work with everything, whatever the choice, because I think we'll see more of this. Everyone is coming out with more agent frameworks; every time I open Twitter there's a new one. The goal is to make sure there's common ground, so people don't have to rewrite half their codebase whenever there's a new framework. Anything to add?

No, that's actually perfectly right.
We just want to be a nice home for your AI agents.

I think for us there's definitely an interesting balance, because especially if you've used the Responses API, it's at this point a very agentic API: you can give it remote MCP, web search, a bunch of other capabilities, and basically have an agent in a single curl request that can run async (not for hours, probably, but it can run async) and do all the things. So for us, the agents SDK is almost a way to really streamline bringing that into your codebase in a way where it also works really well with others. As I mentioned earlier, we intentionally make sure we support all of the other models, because we know you're probably not going to use exclusively OpenAI models. Even with OpenAI models, if you want to self-host gpt-oss, you should be able to do that with whatever host you want, and then combine that with, say, GPT-5.1. We want to make sure we have all the abstractions to combine that. But there are also moments where I think we can improve the experience. For example, with 5.1, which launched today in the API, two new tools we introduced were apply_patch and shell, tying into the whole bash thing. The way we're doing it is that these are special tool calls for which we return special events, and outside of that, you bring your own runtime, your own sandbox. We want it to be flexible enough for you to use whatever environment you want. If you want your own virtual file system, if you want your own sandboxed bash, you can do that, but we'll handle all of the processing of the events and then call the class abstraction we created for you.
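A rough sketch of the bring-your-own-runtime shape described here; the event shapes and the `Sandbox` interface are assumptions for illustration, not the real agents SDK types. The SDK-side code turns special tool calls into events, and you supply the environment that executes them:

```typescript
// Special tool-call events surfaced by the SDK for shell / apply_patch.
type ShellEvent = { type: "shell"; command: string };
type ApplyPatchEvent = { type: "apply_patch"; path: string; patch: string };
type ToolEvent = ShellEvent | ApplyPatchEvent;

// The part you bring: any environment that can run commands and apply patches,
// whether that's a container, a VM, or an in-memory virtual file system.
interface Sandbox {
  exec(command: string): string;
  applyPatch(path: string, patch: string): string;
}

// The SDK handles event processing; your sandbox handles execution.
function handleEvent(e: ToolEvent, sandbox: Sandbox): string {
  switch (e.type) {
    case "shell": return sandbox.exec(e.command);
    case "apply_patch": return sandbox.applyPatch(e.path, e.patch);
  }
}

// A toy in-memory sandbox, purely for illustration.
const fakeSandbox: Sandbox = {
  exec: (cmd) => `ran: ${cmd}`,
  applyPatch: (path) => `patched: ${path}`,
};
```

Swapping `fakeSandbox` for a real container-backed implementation changes nothing in the event-handling code, which is the flexibility being described.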
So it's really meant to be extremely lightweight (I think we're close to zero dependencies) and flexible enough to make it really easy to adopt. It's one step further: our regular SDKs are more like typed HTTP libraries, and the agents SDK takes that further by actually implementing patterns.

Yeah, I think for us, Sunil's analogy at the start about Age of Empires and clearing the fog of war is very apt. We're still figuring out where the resources go, and you all get a vote in that, right? Who do you give money to? What would I trust Anthropic for more than anything? I trust Cloudflare, for example, to host my database and spin up billions of processes and handle my websocket connections, all that infrastructure. What I would trust us for is our opinions on how to make LLMs perform their goals. Programmatic tool use is a great example: say you've spent all this time building up your agent framework, you've figured out the exact set of tools, and now someone says, oh hey, what if all of your tools were just code? And you're like, ah, I have to rewrite that now. But you can imagine that we can handle that for you if that's the best practice, and we will decide when that's the best practice, because we dogfood our stuff. So, maybe differently from others, we don't think of developer experience as the most important thing. It's important, but the most important thing is capabilities and making sure the performance is really, really good. Once we figure that out, we'll make the developer experience feel good, but it's not the number one priority.
And so yeah, I think that's how I think about it: we're very opinionated about how we think agents should run.

Awesome. Oh, we've got one more question. All right, one more.

In these agents SDKs, do you have any primitives, already there or coming, to control the context explicitly? The context of an agent is key to getting good responses, and right now it feels like a black box. I don't know what's inside, and maybe I want more control: say, hey, process this but don't extend the context, or, this is very important, put this into your context. Is that something that's coming, or that you already have?

Yeah, I think context management is a really good example of where, if I were you, I would lean on our opinions as strongly as possible, because there's a lot of nuance here. A lot of times I see people wanting to change tools midstream, or edit earlier context, and there's a trade-off there with prompt caching. You really need to find that fine line: when do I inject a new user message versus when do I rewrite context? All of this is an abstraction we can help hide for you. We still want to have primitives, but context will change over time. Obviously, this year the big way context has changed is agentic search.
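The prompt-caching trade-off mentioned here (inject a new message versus rewrite earlier context) can be made concrete with a tiny helper that measures how much of a previous request a prefix cache could reuse; all names in this sketch are illustrative.

```typescript
type Turn = { role: "system" | "user" | "assistant" | "tool"; content: string };

// Prompt caches typically reuse the longest shared prefix between requests.
function reusablePrefix(prev: Turn[], next: Turn[]): number {
  let i = 0;
  while (
    i < prev.length && i < next.length &&
    prev[i].role === next[i].role && prev[i].content === next[i].content
  ) i++;
  return i;
}

const history: Turn[] = [
  { role: "system", content: "You are a support agent." },
  { role: "user", content: "Find my order." },
];

// Appending a new message keeps the entire existing prefix reusable...
const appended = [...history, { role: "user" as const, content: "Actually, cancel it." }];

// ...while rewriting an earlier turn invalidates everything after it.
const rewritten = [{ role: "system" as const, content: "New persona." }, history[1]];
```

Appending preserves the whole cached prefix, while editing turn zero drops the reusable prefix to nothing, which is why append-only context discipline tends to be cache-friendly.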
It used to be that you'd try to fit as much as you could in one pass with RAG; now it's: try to put as little as you can in the initial context, and think a lot about the definition of your agentic search API. How do you make that as intuitive as possible to the model, and keep it very simple? So with context, I try to push people to be simpler. Building simple things is really hard and really difficult to do, and I think it's a bit of an anti-pattern when people have very complex context management strategies that mess with the cache or things like that. Try to stay in sync with the model; our agents SDK tries to enable that as much as possible.

Yeah, I think we're on the other end of that; we've kept it intentional. Voice agents are a slightly different situation for us, because with our Realtime API we manage the context for you, basically, as you're talking, and there are APIs you can use to manipulate that if you want to. But real time is a slightly different story. For traditional text-based agents, we give you all of the hooks to connect into whatever part you want and control that. For example, we have this agent handoff feature: by default, if you're handing off to another agent, it takes over the entire context at that point. But then you have what are called input filters, where you can take things out if you want to, or control which part of the context you want to hand over.
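The handoff-plus-input-filter idea can be sketched like this (type and function names are made up for illustration, not the actual OpenAI agents SDK API): by default the next agent receives the entire history, and a filter can trim what crosses the boundary.

```typescript
type Msg = { role: "user" | "assistant" | "tool"; content: string };

// A filter decides what part of the conversation crosses the handoff boundary.
type InputFilter = (history: Msg[]) => Msg[];

// Example filter: drop raw tool output so the next agent only sees the dialogue.
const dropToolOutput: InputFilter = (history) => history.filter((m) => m.role !== "tool");

// Default behavior: hand over the entire context unchanged.
function handoff(history: Msg[], filter: InputFilter = (h) => h): Msg[] {
  return filter(history);
}
```

The same hook shape covers the other cases mentioned: redacting sensitive data before a handoff, or keeping only the last few turns for a narrowly scoped specialist agent.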
And then we have the sessions concept, which again is purely a pattern we give you, so that, for example, you could have your own backend storage where you decide what actually goes in; you get a bunch of hooks and can completely build and customize that to your own liking.

I see a lot of people using the analogy that if you try to write your program as one function that takes a thousand arguments, or 20,000 lines of code, you're going to have a problem, and the way you break that down is to split it into a bunch of different functions. So I see a lot of people using subagents, because it turns out context pollution ends up being just a place to keep all your inputs and outputs, and you can absolutely use a subagent for that and then get just a response back that you stuff into your context. Having a simple multi-agent framework or library, or just helpers, helps immensely when the weight of context starts bothering you. That's the big one I see happen a lot.

Can we actually do one more question?

First, just let me say thanks for the summit. Durable Objects are one of the coolest things I've ever seen, and Cloudflare containers, I see the vision, loving it, loving where it's going. My question is really a request, for internal advocacy: because we're doing video, could we just get more network throughput? That's my question. If we could just get, say, 40 times faster from where it's at now, forever, we'll spend a million a year with you guys. I'd love to do it.

Okay. Yes. The answer to your question is yes.

Okay, cool. We'll do that. It's beyond my scope of understanding how these things actually get faster. What's coming down the pipe?
No, I'm saying if there's a hundred-million-dollar customer, we'll make it happen. That's what I'm saying.

All right, let's give everybody a hand. They're going to be here for a little bit longer, not the rest of the night, though I almost made you stay all night to sit and hang out and chat, because I know there are more questions that maybe you didn't want to ask here. I want to say thank you all for being here. That was awesome. We're all working together; let's remember that we're all building together, and let's keep that going. Thanks, everybody, for coming tonight. We are always hiring: cloudflare.com/careers. And you all are hiring too, so talk to them about jobs that might be available and where to find those. Thank you, everybody, for coming. I appreciate you being here.
