Build an AI Agent Fake Photo Booth for your Real Friends

Cloudflare Developers | 00:17:02 | Mar 26, 2026
Uses the Replicate Explore page to discover new models and ideas, highlighting ByteDance's Seedream model with its multi-image prompts and richer world knowledge, then moves to the playground to test ideas.

Cloudflare Developers shows how to build a self-contained AI-powered fake photo booth using Replicate models, the Agents SDK, and durable Workflows to merge selfies into a shared scene.

Summary

Cloudflare Developers' video follows a hands-on journey from inspiration to a polished demo. The host cites Replicate’s Explore page and ByteDance’s Seedream model as the spark: the model accepts multiple input images in one prompt and brings richer world knowledge. They then describe building Fauxto Booth, an app that lets groups upload selfies and generate a shared scene where the participants appear together without physically being there. The demonstration centers on an orchestrated flow using the Agents SDK: a single hub agent that creates and lists booths, keeping booth slugs and display names in its built-in SQLite database, and one booth agent per booth that handles image uploads, face cropping, and state management. The talk highlights client-side RPC through callable methods, live state updates via useAgent, and a serverless architecture in which the hub and booth agents act as tiny, self-contained servers. Durability is addressed with Workflows: a photographer workflow runs image generation in steps that are retried automatically with exponential back-off, then stores the result in R2, because Replicate keeps a prediction for only an hour. The presenter adds practical tips: develop locally, use a Cloudflare Tunnel when Replicate needs to fetch URLs from the local server, and keep most logic in agent methods so it can be tested directly. Throughout, the host shows how an idea can evolve into a functioning, shareable demo, and invites viewers to try it themselves and think about new use cases like remote holiday cards. The video closes with encouragement to explore Replicate models and share builds, and mentions a random slide generator as a possible next project.
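The hub's "latest booths" update described in the talk (put the new booth first, keep the top 10, and let every connected client see the new state) can be sketched in plain TypeScript. This is a minimal model under stated assumptions, not the Agents SDK API: `HubModel`, `Booth`, `subscribe`, and `addBooth` are hypothetical names, and the subscriber list stands in for the WebSocket connections that the SDK manages for you.

```typescript
// Plain-TypeScript model of the hub's "latest booths" state.
// All names here are hypothetical; this does not use the Agents SDK.
type Booth = { slug: string; displayName: string };
type HubState = { latestBooths: Booth[] };
type Listener = (state: HubState) => void;

const MAX_LATEST = 10; // the talk keeps the top 10 booths

class HubModel {
  private state: HubState = { latestBooths: [] };
  private listeners: Listener[] = [];

  // Stand-in for a connected client that receives state updates.
  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  // Put the new booth first, trim to MAX_LATEST, then notify every subscriber.
  addBooth(booth: Booth): void {
    const latestBooths = [booth, ...this.state.latestBooths].slice(0, MAX_LATEST);
    this.state = { latestBooths };
    for (const listener of this.listeners) {
      listener(this.state);
    }
  }

  getState(): HubState {
    return this.state;
  }
}
```

Replacing the state object on every update, rather than mutating the array in place, keeps each notification a consistent snapshot, which is also what a React setter expects on the client side.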

Key Takeaways

  • ByteDance’s Seedream model, hosted on Replicate, accepts multiple input images and has richer world knowledge, which makes for more convincing composite scenes.
  • The Photo Booth app uses a hub agent and per-booth agents with SQLite storage to manage booths, including unique slugs and display names.
  • Client-side RPC (callable methods) allows front-end interactions without REST APIs, enabling smooth, serverless UI actions.
  • Uploaded selfies are cropped with face gravity and stored in R2; generated images are stored in R2 as well, because Replicate keeps a prediction for only an hour.
  • Durable workflows with automatic retries and exponential back-off handle failures when generating many photos at once.
  • State updates propagate to connected clients in real time via the hub agent, using a websocket-like mechanism without polling.
  • Local development can use a Cloudflare Tunnel to expose the locally running app, so that Replicate can fetch image URLs that would otherwise point at localhost.
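The retry behavior in the takeaways above can be illustrated with a generic exponential back-off helper. This is a sketch of the general technique, not Cloudflare Workflows' API: Workflows applies its own retry policy to each step, and `backoffDelayMs`, `retryWithBackoff`, and the injected `sleep` function are hypothetical names chosen for the example.

```typescript
// Generic retry with exponential back-off. Illustrative only; Cloudflare
// Workflows retries steps for you, and none of these names come from it.
type Sleep = (ms: number) => Promise<void>;

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs: number, capMs: number): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Run `task`; on failure wait with growing delays and try again, up to
// `maxRetries` retries. Rethrows the last error if every attempt fails.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  maxRetries: number,
  baseMs: number,
  capMs: number,
  sleep: Sleep
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // Only wait if another attempt remains.
      if (attempt < maxRetries) {
        await sleep(backoffDelayMs(attempt, baseMs, capMs));
      }
    }
  }
  throw lastError;
}
```

`sleep` is passed in so the helper can be exercised without real waiting; in an application it could be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`. Doubling the delay gives an overloaded image model time to recover when many photos are requested at once.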

Who Is This For?

Developers building AI-powered client apps who want an end-to-end pattern for model-driven image composition, agent-based architecture, and durable workflows. Ideal for Cloudflare users exploring serverless, edge-friendly AI demonstrations.

Notable Quotes

“I’ve been building this thing. It’s called Fauxto Booth… Faux, like French for fake.”
The presenter introduces the project with a playful name and a clear intent: a fake photo booth experience.
“These agents are tiny servers with their own database. In fact, you can even upload to them.”
Explains the agent architecture and the data model that underpin the app.
“Any method that’s marked callable can be called from the client side.”
Highlights the serverless, client-driven interaction model of the Agents SDK.
“You only keep your prediction for an hour. So I wanted them to last longer. So I put them in permanent object storage.”
Justifies storing generated images in R2, since Replicate predictions expire.
“We could do a re-shoot… re-shoots will automatically go past this little blockage.”
Shows that re-shoots skip the wait that holds a booth until enough friends have joined.

Questions This Video Answers

  • How does the Cloudflare Agents SDK enable multi-user AI apps without a REST API?
  • What is face gravity in AI image editing and how is it implemented in an app like Photo Booth?
  • How can I use Cloudflare R2 to persist AI-generated images used in a model-powered workflow?
  • What are best practices for local development and tunneling when building AI-powered demos with Replicate models?
  • How do durable workflows handle retries and back-off when generating many AI images at once?
Cloudflare Workers, Cloudflare Agents, Replicate, Seedream model, R2 storage, SQLite, Client-side RPC, WebSocket-like state updates, Face gravity/cropping, Workflows (durable execution)
Full Transcript
Where do you go these days to test new ideas? For me, if I'm looking for inspiration, I head straight to the Replicate Explore page and I see what new models are out, what capabilities are there. It's a great way for me to find inspiration. For instance, I was doing this a few weeks ago when I spotted this new model, this Seedream model here from ByteDance. It's awesome, and there have been a lot of cool new app ideas around this type of model where you can do image editing. What really drew my attention was the fact that I could add multiple input images and prompt about them together. And what's really cool about this model is that it's more world aware. There's richer world knowledge, so the prompt will do a little bit more for me.
So what I did was I took that knowledge and I headed to the playground and I tested some ideas out. Specifically, what I was wondering was: if I had a handful of selfies from people, could I put them into the scene? Could I use this image input? And it turns out Seedream does a pretty amazing job of stitching together a scene. So I started thinking, what if I could build an app that let you make a fake photo booth? Take a group photo anywhere you want, but you aren't there. You and everybody in the photo literally aren't there. So I built it, and it's called Fauxto Booth. Get it? Faux, like French for fake. Fauxto. I want to show you a demo, so I'm going to gather some of my teammates together in a video chat.
Ot, you got a second to help me debug something I've been working on? Always. So I've been building this thing. It's called Fauxto Booth. F-A-U-X-T-O, like false. You get it? Yeah. All right, so I'm going to share my screen. So here it is: Fauxto Booth. What you do is we create a booth. Where should we hang out? Where would you like to hang out? We've only hung out a couple of times. Where should we hang out?
How about Piccadilly Circus in London? Oh, I have never actually been, so you'll have to tell me if this is good. Piccadilly Circus is where we're going to be. Anything specific that you would like about that? No, it's just a big square in the middle of the city with tons of tourists. Oh, fun. Awesome. So we've never been there, and we're going to take a photo there just by using our selfies. I'm going to click Launch this booth. Now this is going to kick off and it's going to start generating. I'm using Seedream 4.5, which came out a couple of weeks ago. It's really awesome, so this should look great. And what's going to happen is it's going to put us there. But if you look at the top, there's the Share booth. You see that on my screen? Can you take a photo of that? And I would love for you to take a selfie. You're going to end up in my photo there. How are we feeling? Does that look like Piccadilly Circus? Oh, I know where we're at.
Oh, Yellis. Hey, buddy. He's arrived at Piccadilly Circus. Have you ever been to Piccadilly Circus? Yeah. Better late than never. Absolutely. Have you been to Piccadilly Circus before, in London? Yeah, I have. You have? So do you want to take a photo with us there real quick? Love to go there with you guys. Okay, sweet. So there's a QR code there at the top. Go ahead and take a picture of that. And I'm going to take a picture of myself here. Very cool. I'm going to click Capture. And you'll see that when I did that, my upload went, and I guess that's you, A. Now all three are there. I have said that there should only be two people in this photo, and now there are actually two photos in progress here.
So what's going to happen is it's going to take our photos and it's going to put us into this, and the model itself has a world view. I told it to behave how we might behave if we were hanging out there. I think we have a lot of fun. This looks like... I have been here. I do know what you're talking about now. I didn't know it was called Piccadilly Circus. That happens as you walk through places, right? London's great that way, where you walk and you just end up walking through, like, "Oh, this is very famous. I feel like I'm in some place." "Oh, this tunnel is beautiful." Yeah, that sort of stuff. Why did you choose this, A? It's one of those places that's a Schelling point if you get lost. Let's all meet at Piccadilly Circus. It's like Times Square. Gotcha. Gotcha.
Do you think "Shaft Touri Circus" up there at the top, is that an AI thing? It's misspelled. It's Shaftesbury Circus. Oh. And it's not really. It's the Shaftesbury Theatre. Ah, okay. And then we've got a big Piccadilly sign over here on the side. That happens. It's AI. Well, the structure of Eros is accurate. Okay, that's great. Although, if any Londoners are watching, I know it's not actually technically called Eros. [laughter]
I think what's going to happen here is this is running in the background. It's doing a workflow. It has taken our photos and saved them to R2, and then they're going to come here and pop back, I hope. [laughter] This is one of those where it takes a little bit of time. It looks like a great photo, except that if you tried this in real life, you can't run over. There's a very busy intersection. Yellis, where would you have chosen us to go? What did you do last time? Was it organ? A roller skating. Oh, you did a roller skating. That was great. Yeah. So next time, though. Me and this guy, this is me with Yellis's hair.
So this happens sometimes in these photos. That's great though. Yeah, it's great. And we did get hit by the car, and you notice it changed the angle a little bit. So it knew that in order to get them in here... I was wondering what it was going to do with that. This guy pops up a lot. Oh, there we are. Look at us. It's brought me in this shirt I'm not actually wearing. Yeah. And so if one of these shots comes by... oh, that is good. Sometimes AI does some weird stuff, and I want to make sure that you can always do a re-shoot if you want to. So if you come in here, you build one of these booths and you do a re-shoot, it should go and it should kick off.
But gentlemen, this is it. This is the idea. I'm going to go jump and show everybody else how I wrote the code, but I would love for you to hang out. I'm going to share this, right? I want to make sure that when we come in here we're able to go and share this with people, because what good is a picture if we can't be in it together? I feel bad that we're actually not in it. We are in it together. We are together. We are one. We have become one. [laughter]
So there is a lot of fun stuff like this. My niece suggested that we put this in Jabba the Hutt's palace, and she came in as her mixed with Baby Yoda, which I know is Grogu, but it was incredible. So it's fun. Come play. I love the way it combined shirt and hair with you. Oh, there we go. We did it again. It took your glasses off, I think. And I don't know what you're wearing. I don't know what's going on here weather-wise for you. Are you the Hawaiian? Yeah. That doesn't feel like you, does it? But it feels like an interpretation of you.
I mean, I don't think I've ever worn something that colorful in my life, but you look pretty good in that. [laughter] All right, y'all. Thank you so much. Thanks. Thank you. It's been fantastic. It's awesome. Yeah. Bye.
So, I knew that I needed orchestration, so I immediately reached for the Agents SDK. I wanted to model things this way so that it could scale outwards, right? I wanted each of those booths to be self-contained. You as a user will only ever show up in a booth that you submit your selfie to, so that data is all controlled right there.
So first I needed a place to create the booth, right? I needed something to handle this page. This is called the hub page. I made an agent called Hub. There's currently only one of these, and it controls this page. Let's take a look at what that looks like. This is the hub agent, and it is here under worker, agents, hub. What I want you to remember is that we have access to a SQLite database inside of the agent. It's really powerful. I'm creating a table called booths. I've got a unique slug for each one of these, and there's a display name here. And then I create one of these booth agents by name. Let me show you what that looks like.
So let's look down here for create booth. Here we are. I come in here, I get a unique booth slug, I do this get agent by name, and this is the binding for the booth agent, and I pass in that booth slug. So now I have a booth with that name. I've created a new one. And then I need to do some setup on it, right? But real quick, let's review what's happening. Notice that this create booth is marked with callable here. So in fact I've wired up the form to just call this method directly and pass these values up. Right?
There's no REST API. I'm just using client-side RPC. Pretty cool, right? Any method that's marked callable can be called from the client side, and any method at all inside of this instance can be called from the server side. So that's how I'm calling this method, setup. Again, I've created this new booth. I've got this new instance of an object that has the same slug, and then I wanted to call setup on it, and I can do this because I have access to this instance, right? This booth is an instance and I'm calling setup on it, and I can just do that.
So let's go look at what that setup does. I'm going to open up my booth agent here, and let's find setup. Setup takes these values here and it does some state management. I want to set up the fact that this is a new display name. And then we call this refresh background, because that's how we set up that background photo that we have. So that's the setup that's needed after we create a new agent.
So let's pop back to the hub. I also maintain state here in the hub. That's what this latest booths is. I'm going to pull this latest booths from the state. I'm going to unshift it, which means put it first, and I'm going to set the latest booths to the top 10 booths. What happens when you update the state is that all connected clients receive that update. It's a super powerful way to make that happen without polling, and it really just works out of the box.
And here's how it works. I'm going to go in here under source. This is the front end. I'm going to go to pages, and I'm going to go to the homepage here. This is using useAgent here. You can see that I'm using the hub agent, the same one from the server, and I'm connecting to the hub agent. It doesn't have a name because there's only one of them; by default it's "default". And it's calling this onStateUpdate.
So whenever the state update happens, it will automatically call this function, and I'm just doing normal React stuff here, right? Set latest booths from the state's latest booths, and that's going to update over here automatically for anybody who is looking at this page, because it's connected over an abstracted WebSocket that you don't even need to worry about, just because I used that useAgent. It's pretty powerful, right?
These agents are tiny servers with their own database. In fact, you can even upload to them. The agent API allows you to handle incoming requests. Agents, booth, and then in the onRequest method here of the booth, you'll see that I get the photo upload, right? I get this file from the photo upload like I normally would do. I make sure that only one exists because we don't want to have duplicates. And I do a little thing that we do here called face gravity. We could do a face crop. We find the face there. And then I store that and I update the SQL appropriately, and I store that in... this is R2 here. This is our zero-egress-fee object store.
So now that I have the upload and I have gone and updated the state so that we know that there are so many uploads, I'm going to go ahead and snap the photo. It has the logic in there that will wait to make sure that the first photo is created. Let's take a look really quick at that snap photo. It's going to wait until you get the ideal member size, because we wanted to encourage you to get more friends to come, but eventually it will snap if you don't have one. So this is using a built-in scheduler, right? There's this schedule, and I said in 5 minutes a second do the snap photo, and do a re-shoot, so re-shoots will automatically go past this little blockage here. Using this built-in scheduler is super handy for things like this that you want to run on a timer.
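The snap-photo gating described above (wait for the ideal group size, snap anyway once the scheduled fallback fires, and let re-shoots skip the wait) can be sketched as a small pure function. This is a reading of the spoken description, not the app's code: `SnapInput`, `decideSnap`, and every field name are hypothetical, and the actual timer comes from the agent's built-in scheduler rather than from anything shown here.

```typescript
// Sketch of the "should we snap yet?" decision. All names are hypothetical
// and only model the behavior described in the talk.
type SnapInput = {
  uploads: number;         // selfies uploaded to the booth so far
  idealMemberSize: number; // group size the booth hopes to reach
  isReshoot: boolean;      // re-shoots go past the wait
  timedOut: boolean;       // true when the scheduled fallback fires
};

type SnapDecision = "snap" | "wait";

function decideSnap(input: SnapInput): SnapDecision {
  // Nothing to compose until at least one selfie exists.
  if (input.uploads === 0) return "wait";
  // A re-shoot was asked for explicitly, so do not hold it back.
  if (input.isReshoot) return "snap";
  // Enough friends have joined.
  if (input.uploads >= input.idealMemberSize) return "snap";
  // Otherwise keep waiting until the scheduled fallback fires.
  return input.timedOut ? "snap" : "wait";
}
```

Keeping the decision separate from the timer makes it easy to check each case without waiting for a real schedule to elapse.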
Now, one thing I wanted to be careful with was this: what if this thing takes off? In testing, it made some really fun photos, and I can imagine people wanting to share those out. So I want to make sure that I handle all of those requests. This is a great time to lean into Workflows. Workflows allow for durable execution, right? So if we come in here, we go under worker and then under workflows, I have this photographer workflow. Let's get some space here. You can see that it does some steps. If I run into any problem in a step, it will automatically retry it for a certain number of times that I can set myself, and it will automatically do exponential back-off. So if I'm doing the thing where I'm actually generating the photo, that's exactly what I need. In case that runs into problems because I did so many all at once, I want it to take its time and do exponential back-off.
One pattern that I've really been enjoying lately is that I write most of the logic in the agent, and then I just pass the agent name in here and I do the same trick where I get the agent by name and then I call the methods on that agent directly. That allows me to test things on the agent as I'm building things.
Oh, and I guess that's a good time to remind you that you can build all of these things locally without deploying to a server. Run it locally. And I found out for this one, in the case where you want to pass URLs, you can actually spin up a Cloudflare Tunnel and point to your local server. Check the notes in the README for more information on tunneling. Super cool and handy for when you need to go across applications, right? I need to pass the R2 photo, right?
Because I'm going to store this photo and it has a URL, and I want to be able to pass that URL. If I'm on localhost, I can't pass that to my Replicate instance, right? Because it's localhost. So what happens is this workflow generates the image and then we store it in R2. The reason why I'm doing that is that with these Replicate image models, when images get generated, you only keep your prediction for an hour. I wanted them to last longer, so I put them in permanent object storage.
And of course I used AI to help me with this. I like to keep a little folder called truth window. The whole front end is made by OpenAI's Codex and Tailwind, and I tracked what I asked for and what I did in this whole folder. So if you jump in here, you can see what I did and how I use AI to build apps, because everybody does that a little bit differently. And I like to share that I didn't come up with all this. Check out that truth window folder if you want to see how I use AI to build apps.
So that's my Fauxto Booth app. If you've reached this part of the video, can you come jump in a photo with me? Everyone who did it recently should be in this photo, so you can come and we could take a look at ourselves in this photo. It would be fun. Just scan this QR code, take your photo, and you'll get to see how it works from a user's point of view. Fun, right?
So I love it that I went from "wouldn't it be cool if I could do this thing" to a working app. It's such a bonkers time to be a builder. Let me know what models you find on Replicate and share with me what you build. Oh, my coworker Gift had this idea that you could use something like this to build a remote holiday card. Maybe it's not too late to send out that holiday card. Just build a booth, snap some photos of the people in your family, send it out, see what happens. Great idea, Gift.
I think I'm going to make a random slide generator next. That new Gemini Nano Banana model is awesome at diagrams. Subscribe to the channel so you can follow along. Thanks for hanging out and we'll see you real
