Edge Containers under the lid and why cold start matters | Immerse Stockholm 2026
The chapter introduces Cloudflare's edge containers, explaining how they extend Workers with a full Linux environment to run more resource-intensive apps, and highlights edge deployment and the ongoing product evolution. It contrasts Workers with containers and notes current limitations and capabilities.
Cloudflare’s Oscar pitches containers as a full Linux, serverless option that boots in 1–3 seconds and auto-prefetches images to cut cold starts, with transparent trade-offs and roadmap in progress.
Summary
Oscar from Cloudflare introduces a new container service that sits alongside Cloudflare Workers. He emphasizes that containers on Cloudflare provide a full Linux environment with no Windows support, enabling heavier, CPU- and memory-intensive workloads beyond the 128 MB limit of Workers. The talk explains how the system orchestrates containers without Kubernetes, using a Durable Object as the orchestrator and a Worker as the entry point. Images are prefetched across Cloudflare's network to reduce cold starts to about one to three seconds, and deployment is designed to be simple: push your image, and Cloudflare handles distribution and startup near the user. Oscar also shares current limitations, notably the lack of autoscaling at the moment, and highlights pricing components like CPU, memory, disk, and network egress. He argues that fast cold starts matter because slow ones cause cascading timeouts, and notes upcoming improvements such as bigger instance sizes, native disk and memory snapshotting, and better autoscaling and load balancing. The talk closes with practical use cases, from ETL jobs and video transcoding to Terraform runs, and invites feedback from developers who want this service to evolve. Cloudflare's aim is to make container deployments effortless at global scale, with attention to latency, cost, and developer experience.
Key Takeaways
- Containers on Cloudflare run a full Linux environment with Dockerfile-based images, expanding beyond Workers' 128 MB limit.
- Cold starts can be as fast as 1–3 seconds when images are pre-fetched and spun up near the user, improving perceived latency.
- Deployment uses a durable object as the orchestrator and a Worker as the entry point, avoiding Kubernetes complexity.
- You pay only while containers are running, for CPU, memory, disk, and network egress, plus the Workers and Durable Objects in front of them; sleeping containers cost nothing, and pre-warmed instances aren't charged until they start.
- Images are prefetched across the network, a container starts near the first request, and later requests for the same ID are routed to that running instance; in the talk's example that is Paris first and, after the instance has slept, somewhere in Scandinavia.
- No autoscaling is currently available; scaling is managed by the durable object orchestrator, with ongoing roadmap items like bigger instance sizes and memory/disk enhancements.
- Storage is ephemeral by default, but data can be committed to R2 or a database before a container sleeps, and a recently released feature lets a container attach an R2 bucket as a file system through FUSE.
Who Is This For?
Essential viewing for developers and platform engineers who want to run heavier workloads on Cloudflare with near-instant startup times, while understanding current limitations and roadmap. It’s particularly relevant for teams migrating from EC2/Fargate-style patterns who crave simpler deployment and global performance.
Notable Quotes
"Containers are a way for you now to actually run more resource intensive applications that requires more CPU memory than workers."
—Oscar explains why Cloudflare introduced container support beyond the limitations of Workers.
"We prefetch all like we will prefetch the images across the network so you don't have to deal with pre-fetching meaning that your cold starts will be very fast."
—Key mechanism for achieving fast cold starts.
"There is currently no autoscaling available which is actually a big deal and this is getting worked at but you should be aware of it."
—Important current limitation and roadmap notice.
"You pay for CPU speed, memory, disk, network egress and the workers and durable objects."
—Clarifies the pricing model.
"Storage all storage is ephemeral... but you can commit data to like R2 or database before they go to sleep."
—Data persistence options with ephemeral containers.
Questions This Video Answers
- How do Cloudflare Containers differ from traditional Kubernetes deployments?
- What are the current limitations of Cloudflare’s container service and when is autoscaling expected to land?
- Can I run Terraform or video transcoding inside Cloudflare Containers and how would I handle persistent data?
- What does prefetching mean for cold starts and how does it impact latency across regions?
Cloudflare Containers, Cloudflare Workers, Durable Objects, Serverless Containers, Pre-fetching, Cold Start Latency, R2 Storage, FUSE for R2, Container Orchestration without Kubernetes, Latency and Global Routing
Full Transcript
Hi everyone, [music] my name is Oscar. I'm a solution engineer here at Cloudflare. My first slide disappeared, but today I will be talking about containers on Cloudflare. We do things a bit differently with containers. This is actually a new service for us. It came out about five weeks ago, I think. It had been in private and then public beta for about a year, but it was released five weeks ago. So what I say now might not be true next week, because this is a product that we're really developing right now.
But before I start talking about containers, I think it's worth a quick refresher on Workers, even though there's been so much talk about Workers already. Workers are serverless compute: you deploy them instantly to 300 cities worldwide with almost zero cold starts. You build them and ship them: no servers, no infrastructure, pay-as-you-go pricing. And then there are nice integrations with R2, D1, and some AI services. My coworker actually said last week, and I laughed a bit too much at this, that Cloudflare Workers are the best invention since white bread and pasta.
And yeah, okay. And then listening to, was it Daniel, and then also the Lovable people, I was like, okay, maybe it's actually true. It might be the best thing. But even though they are great in many ways, like Daniel said as well, they only have 128 megabytes of memory. That was a lot 30 years ago, but it's not enough anymore. You can only run JavaScript and TypeScript. Okay, sure, you can run other languages like Rust, but then it's through WebAssembly.
I'm no Rust developer, but I feel like that kind of defeats the purpose of running Rust if you have to do it through WebAssembly. You don't have a file system, no native Linux environment, no CLI tools. Yeah, we just heard this round. [laughter] So the solution, drum roll: containers. It's a way for you to run more resource-intensive applications that require more CPU and memory than Workers. You can keep the containers running basically as long as you want, but these are also serverless containers that we have here.
So just keep that in mind. You get a full Linux environment. Unfortunately there is no Windows support here, but I don't know if anyone... yeah, no, no one cares. [laughter] So yeah, full Linux support. That's what we have, and I prefer that as well. And then of course, because it's Cloudflare, you deploy this once and by default you deploy to region Earth. I think we all know why containers are popular: you get everything you need, your code, your libraries, your dependencies, all in a single package. You don't have to hear the "it works on my machine" argument anymore. But I think we're all aware of that. As for what is provided, I would say it's the same as other providers: we provide the hardware and the operating system, and then you do the rest. I come from an AWS background.
So I've been using containers on virtual machines, which they call EC2, and on Fargate. And I think that even though containers are great in their own way, they also come with a lot of other problems, like: which region should I run my container in? It's been very difficult. I just want to launch a container, that's all I want to do. But then, at least, it's been like: oh, I need IAM permissions, I need load balancers, I need, you name it. There's probably a lot of things I'm already missing here.
So I think that with what I'm going to talk about now, how Cloudflare has done it, we solve a lot of these issues. Let's continue with a more technical 101 comparison. You get a full Linux environment, and the startup latency is milliseconds when it's pre-warmed. Even when it's a cold start, it's one to three seconds, which I think is incredibly fast. Of course, this will depend on the image size, but usually it's one to three seconds. And we can actually talk about the instance sizes.
Workers come in a single size, one size fits all. But these are currently the instance sizes that we have for our containers. We have been getting a lot of feedback on our instance sizes, and we are currently working on getting bigger sizes as well, so that's on the roadmap. But I just want to be transparent: this is what we have now. Maybe it's not the biggest, but I think it will suffice for a lot of workloads. So how do they actually work? Now we're really getting into some of the details.
So if you have a user, let's say this one in Asia here, who needs a container, they make a request to the Worker, and that Worker is an entry point, or an entry-point router, that then sends the request forward to the Durable Object and the container. The Durable Object in this case is actually the orchestrator, so there is no Kubernetes. If you look at how to deploy this, I made this as small as possible.
So this is the Wrangler JSON. Here you just have the Durable Object, which will be the orchestrator keeping track of the sleep cycles, the start cycles, whatever it would be, and then the container, where you specify your Dockerfile, and then the container will do its thing. Then for the Worker that will be the entry point, or the router, you just have to specify the class, which port, and the sleep-after, meaning when it's supposed to die, and then it's basically a fetch request and that's it. So I think what we've done is make it really easy for you to just get going and launch your containers.
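The roles described here (a Worker as entry point, a Durable Object as orchestrator, and a container with a port and a sleep-after setting) can be pictured with a small model. This is only a sketch: `ContainerConfig`, `Orchestrator`, and `EntryPoint` are hypothetical names made up for illustration, not Cloudflare's API, and the idle check is evaluated lazily on the next request rather than by a real timer.

```typescript
// Hypothetical model of the request path described in the talk.
// None of these names are Cloudflare APIs; they only illustrate the roles.

interface ContainerConfig {
  image: string;        // where the image is built from, e.g. a Dockerfile path
  port: number;         // port the container listens on
  sleepAfterMs: number; // idle time after which the instance is put to sleep
}

// Plays the role of the Durable Object: one per container ID, tracks lifecycle.
class Orchestrator {
  private running = false;
  private lastRequestAt = 0;
  private readonly config: ContainerConfig;
  coldStarts = 0;

  constructor(config: ContainerConfig) {
    this.config = config;
  }

  handle(path: string, nowMs: number): string {
    // Put the instance to sleep if it has been idle for too long.
    if (this.running && nowMs - this.lastRequestAt >= this.config.sleepAfterMs) {
      this.running = false;
    }
    // Start the container on demand; this is the cold-start path.
    if (!this.running) {
      this.running = true;
      this.coldStarts += 1;
    }
    this.lastRequestAt = nowMs;
    return `container:${this.config.port}${path}`;
  }
}

// Plays the role of the Worker: the entry point that routes by container ID.
class EntryPoint {
  private readonly orchestrators = new Map<string, Orchestrator>();
  private readonly config: ContainerConfig;

  constructor(config: ContainerConfig) {
    this.config = config;
  }

  fetch(containerId: string, path: string, nowMs: number): string {
    let orchestrator = this.orchestrators.get(containerId);
    if (orchestrator === undefined) {
      orchestrator = new Orchestrator(this.config);
      this.orchestrators.set(containerId, orchestrator);
    }
    return orchestrator.handle(path, nowMs);
  }

  coldStartsFor(containerId: string): number {
    return this.orchestrators.get(containerId)?.coldStarts ?? 0;
  }
}
```

In this model, two requests for the same ID inside the sleep-after window share one cold start, while a request arriving after the window triggers a second one.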
So if we go through it again, hopefully it makes a little bit more sense. The user in Asia makes a request to the Worker in Tokyo, and from there the Durable Object handles the life cycle of the container. Now we're getting into some more, I would say, different things, and how we have solved this. If you're a developer and you have pushed your image to Cloudflare, we will start distributing it across our network. We will prefetch the images across the network so you don't have to deal with prefetching, meaning that your cold starts will be very fast, those one to three seconds. For example, if we have a person in Lyon, which is in France, and he needs to access a container, he will go through the Worker. From there the Worker will be like, hey, this is a container, and then you know the whole Durable Object thing as well, and then the container will start up. Of course it will have this access ID, in this case 123. It will launch in Paris, because that's probably the closest one to Lyon. But if I, in Stockholm, also need the same container, or a similar container also named 123, I will also get routed there. This is how we can achieve very low latency on our containers.
For me to get a container here will take milliseconds. So even though I'm in Stockholm, I will get routed through our network to Paris. But let's say this container instance here dies: it's been a few hours, no one is using it anymore, so it's put to sleep. If I then start a new container instance a few hours later, I will go through the Worker again, but this time, because I'm the first one accessing this container, it will probably be somewhere in Scandinavia. I'm not keeping track of all of our sites and where it will get launched, but it will get launched somewhere in Scandinavia. And then if someone comes from Copenhagen, same thing: it goes through the Worker and to the one that's already running.
So probably in Scandinavia as well. What we will also do, once we notice that you're getting a lot of traffic to your containers, is start pre-warming more instances. Here we have two different container instances, doing different things: one in Paris again, and one in Stockholm, or in Scandinavia. So if someone from Madrid needs the 123 container, it's already pre-warmed, and instead of having that cold start again, the latency to boot up the new container will also be very fast, in the millisecond range. Same thing if someone from Amsterdam needs the 456 container: that one will also get spun up very fast.
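The Lyon, Stockholm, and Copenhagen walkthrough boils down to one rule: the first request for an ID decides where the instance starts, and later requests for that ID go to the running instance. A minimal sketch of that rule follows. `PlacementModel` and the nearest-site table are hypothetical inputs invented for illustration, not real Cloudflare routing data, and pre-warming of extra instances is deliberately left out.

```typescript
// Hypothetical placement model; site names and the nearest-site table are
// illustrative inputs, not real Cloudflare routing data.

interface RouteResult {
  site: string;       // where the container instance runs
  coldStart: boolean; // true when this request had to start a new instance
}

class PlacementModel {
  private readonly nearestSite = new Map<string, string>();
  private readonly runningAt = new Map<string, string>(); // container ID -> site

  constructor(nearestSite: Record<string, string>) {
    for (const city of Object.keys(nearestSite)) {
      this.nearestSite.set(city, nearestSite[city]);
    }
  }

  route(containerId: string, userCity: string): RouteResult {
    // A running instance wins, wherever the caller is.
    const existing = this.runningAt.get(containerId);
    if (existing !== undefined) {
      return { site: existing, coldStart: false };
    }
    // Otherwise start near the first requester: location follows demand.
    const site = this.nearestSite.get(userCity);
    if (site === undefined) {
      throw new Error(`no site known for ${userCity}`);
    }
    this.runningAt.set(containerId, site);
    return { site, coldStart: true };
  }

  // Called when an idle instance is put to sleep.
  sleep(containerId: string): void {
    this.runningAt.delete(containerId);
  }
}
```

With a table mapping Lyon to Paris and both Stockholm and Copenhagen to a Scandinavian site, the model reproduces the sequence from the talk: Stockholm is served from Paris while instance 123 is running there, and from Scandinavia once that instance has slept and Stockholm is the first caller.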
I'm starting to feel like a weather forecast man. You can see the containers here: cloudy, but with a chance of containers. Pricing: I think you should be aware of how the pricing works. It's always boring to talk about pricing, but it's important. You pay for CPU speed, memory, disk, network egress, and the Workers and Durable Objects. It sounds like a lot, but comparing this at face value with other vendors, I actually think it's very price competitive. I think it's also interesting to talk about what you don't pay for. Sleeping containers: that may make sense.
You only pay for containers that are running; if they go to sleep, you pay nothing. Pre-warmed images sitting idle, like the ones for Madrid and Amsterdam: you don't pay for them while they're pre-warmed. You pay for them as soon as they actually get going. And then, of course, the whole thing of us prefetching the images for you, you don't pay for that either. So, a few key takeaways, or rules of thumb, that I think you should take with you on how the containers work: you're not pinned down to a location forever.
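The pricing rule stated here (pay for CPU, memory, disk, and egress only while a container runs; nothing while it sleeps or sits pre-warmed) can be written as a small calculation. This is a sketch under assumptions: the rate fields, their units, and the idea that cost scales with the instance's provisioned size are all hypothetical, no real prices are used, and the separate Worker and Durable Object charges are left out.

```typescript
// Hypothetical cost model. Rate names and units are assumptions made for
// illustration; plug in real prices from the vendor's pricing page.

interface InstanceSize {
  vcpu: number;
  memoryGib: number;
  diskGb: number;
}

interface Rates {
  perVcpuSecond: number;
  perGibSecond: number;
  perDiskGbSecond: number;
  perEgressGb: number;
}

type Phase = "running" | "sleeping" | "prewarmed";

interface UsageInterval {
  phase: Phase;
  seconds: number;
  egressGb: number;
}

function containerCost(size: InstanceSize, rates: Rates, usage: UsageInterval[]): number {
  // Cost of keeping one instance of this size running for one second.
  const perSecond =
    size.vcpu * rates.perVcpuSecond +
    size.memoryGib * rates.perGibSecond +
    size.diskGb * rates.perDiskGbSecond;

  let total = 0;
  for (const interval of usage) {
    // Sleeping and pre-warmed time is free in this model.
    if (interval.phase !== "running") continue;
    total += perSecond * interval.seconds + interval.egressGb * rates.perEgressGb;
  }
  return total;
}
```

Passing a usage log with long sleeping or pre-warmed intervals leaves the total unchanged, which is the point the talk makes about only paying for uptime.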
It will start near the first request. Location follows demand. I'm not catchy with these things, but yeah. And you only pay for container uptime. With this you get fast warm starts and cold starts. So why do cold starts matter? For us developers, I think this all makes sense. With my AWS background, a lot of what I saw was timeouts, unnecessary timeouts. Depending on the container size, it could take 20 to 30 seconds, and it was not necessarily the container that timed out; it was load balancers in the background timing out because the container never spun up in time. I think this just has cascading effects, being annoying, and in today's society, when people's attention span is like zero, people start scrolling TikTok and answering emails, and then the thing that you were supposed to do is gone, because some container took way too long to spin up.
I've also seen solutions where they were using serverless on another vendor, Lambda functions in that case, AWS serverless compute: they were pinging the containers with the Lambdas to keep the container instances warm, just to get these fast starts. And I'm just like, this doesn't make sense. Why should you as a developer have to spend time on a quick fix that was just thrown together over the weekend, and pay for all of those requests, just for the system to work as it should work?
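The two problems in this story are simple arithmetic: a cold start that outlasts an upstream timeout fails the first request, and a keep-warm workaround costs a fixed number of extra requests per day. A rough sketch, where both function names and the 10-second timeout used in the usage note are hypothetical:

```typescript
// Hypothetical helpers; the names and any timeout values are illustrative.

type Outcome = "served" | "timed out";

// The first request succeeds only if the container is up before the
// component in front of it (for example a load balancer) gives up.
function firstRequestOutcome(coldStartMs: number, upstreamTimeoutMs: number): Outcome {
  return coldStartMs < upstreamTimeoutMs ? "served" : "timed out";
}

// Number of keep-warm pings a workaround sends per day at a given interval.
function keepWarmPingsPerDay(intervalSeconds: number): number {
  if (intervalSeconds <= 0) {
    throw new RangeError("intervalSeconds must be positive");
  }
  const secondsPerDay = 24 * 60 * 60;
  return Math.ceil(secondsPerDay / intervalSeconds);
}
```

Assuming a 10-second upstream timeout, a 25-second cold start times out while a 3-second one is served; pinging every five minutes to stay warm adds 288 requests a day per instance.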
So from my perspective, what we have done is a breath of fresh air in how we've made it easy for you to just launch containers. At deploy, your images are fetched globally; you don't have to care about that. And what else? We pre-warm them for you. Once they go to sleep, they sleep. We will wake more instances if demand requires it. But then I also think there are a few things you should be aware of, because this is a new service and I like to be transparent about that. First, scaling.
There is currently no autoscaling available, which is actually a big deal, and this is being worked on, but you should be aware of it. So scaling will be handled with the Durable Object orchestrator. Storage: all storage is... ephemeral. Do you say it like that? Ephemeral. Okay, thank you. So that means once a container is gone, all the data there is gone as well, but you can commit data to R2 or a database before it goes to sleep. Last week, and I haven't gotten around to testing this out, there was a new feature coming out where you can attach your containers with FUSE to an R2 bucket.
So the R2 bucket will work as a file system. This is a product that we believe in, and there's a lot of work in the background already: better autoscaling and load balancing are coming, more instance sizes, native disk and memory snapshotting, dashboard updates, a lot of stuff. And if you feel like, hey Oscar, containers sound great, but we would like this feature, please reach out to me or anyone else at Cloudflare. We would be happy to pass the feedback on to the service teams.
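The storage point from the talk (everything on the container is ephemeral, so commit data to R2 or a database before the container sleeps) can be pictured as a flush-before-sleep pattern. In this sketch, `ObjectStore`, `InMemoryStore`, and `EphemeralContainer` are hypothetical stand-ins written for illustration; the synchronous `put` and `get` are a simplification and not the R2 binding's interface.

```typescript
// Hypothetical stand-ins: ObjectStore models a durable bucket such as R2,
// and EphemeralContainer models a container whose local disk is lost on sleep.

interface ObjectStore {
  put(key: string, value: string): void;
  get(key: string): string | undefined;
}

class InMemoryStore implements ObjectStore {
  private readonly objects = new Map<string, string>();

  put(key: string, value: string): void {
    this.objects.set(key, value);
  }

  get(key: string): string | undefined {
    return this.objects.get(key);
  }
}

class EphemeralContainer {
  private readonly disk = new Map<string, string>(); // local, ephemeral
  private readonly store: ObjectStore;               // durable, outlives sleep

  constructor(store: ObjectStore) {
    this.store = store;
  }

  write(key: string, value: string): void {
    this.disk.set(key, value);
  }

  // Prefer local data, fall back to whatever was committed earlier.
  read(key: string): string | undefined {
    return this.disk.get(key) ?? this.store.get(key);
  }

  // Going to sleep wipes the local disk; commit first to keep the data.
  sleep(commit: boolean): void {
    if (commit) {
      this.disk.forEach((value, key) => this.store.put(key, value));
    }
    this.disk.clear();
  }
}
```

Data written and then committed on sleep is still readable afterwards, while data written before an uncommitted sleep is gone, which is the behavior to plan around until snapshotting or a mounted bucket covers the use case.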
Before closing, let's talk about some use cases. I think the world is your oyster, containers being containers: anything a Worker can't do, maybe try it on a container. That can be anything from ETL jobs and video transcoding to Terraform. You can run Terraform, Pulumi, you can even run Docker in this thing. I also saw on our Cloudflare developer YouTube channel that someone was running a graphical user interface with Kubuntu. So it's like, okay. I probably wouldn't do that, but you can if you want to.
And then of course, if you want to run any language you want instead of going through WebAssembly, just run it in containers. And also let your AI agents spin up containers, or if you need to execute your code, you don't have to look at another vendor anymore. You can do it right on Cloudflare. And yeah, it's 2026; of course I'm going to mention AI at least once in my presentation. That's actually all I had for now about containers. I'm super excited to see where this is going in the future.
So yeah, thank you very much.