Cloudflare Browser Run: How to Solve CAPTCHAs & Logins for AI Agents
Cloudflare’s Browser Run lets you watch and interact with an agent’s browser session in real time, solving CAPTCHAs and logins together with the user through live view URLs.
Summary
Cloudflare Developers’ demo shows how an agent can operate a browser session end-to-end while a human intervenes through a live view URL when needed. The presenter walks through ordering a protein-rich vegetarian bowl on Wolt: the agent browses, picks the meal, and fills the cart, and the human manually enters sensitive details such as the delivery address, login, and credit card number. The key takeaway is that the Agent SDK now ships with browser tools out of the box, and a keep-alive parameter keeps the browser session active for the duration you set. The code walkthrough centers on two methods: one that creates the browser session and passes the keep-alive parameter, and one that launches the browser with Playwright’s launch method and the Browser Run binding. With this setup, when the agent hits a CAPTCHA or login barrier, it shares the live URL, the human opens it and clears the blocker, and the agent continues. The presenter also reports configuring Open Claw and Hermes agents the same way. If you’re building AI agents that need real browser access with a human-assisted fallback, this pattern is worth testing in your own flows.
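The handoff described in this summary can be sketched as a small decision step: look at the page, decide whether a human is needed, and if so build a message containing the live view URL. This is a minimal sketch, not the Agent SDK's implementation. Every name in it (`detectBlocker`, `buildHandoff`, `nextStep`, the `Blocker` type) is hypothetical, the keyword heuristic is an illustrative assumption, and the live view URL is treated as an opaque string because the video does not show its format.

```typescript
// Sketch of a human-in-the-loop handoff. All names are hypothetical;
// none of them come from the Agent SDK or Browser Run.

type Blocker = "captcha" | "login" | "sensitive-input";

interface HandoffRequest {
  reason: Blocker;
  liveViewUrl: string; // URL the human opens to take over the browser
  message: string; // what the agent tells the human
}

// Naive keyword heuristic for deciding when to pause (an assumption for
// illustration; a real agent would more likely let the model decide).
function detectBlocker(pageText: string): Blocker | null {
  const text = pageText.toLowerCase();
  if (text.includes("captcha") || text.includes("verify you are human")) {
    return "captcha";
  }
  if (text.includes("card number") || text.includes("delivery address")) {
    return "sensitive-input";
  }
  if (text.includes("log in") || text.includes("sign in")) {
    return "login";
  }
  return null;
}

// Builds the message the agent sends when it needs the human to step in.
function buildHandoff(reason: Blocker, liveViewUrl: string): HandoffRequest {
  const task: Record<Blocker, string> = {
    captcha: "solve the CAPTCHA",
    login: "log in to your account",
    "sensitive-input": "enter the details yourself",
  };
  return {
    reason,
    liveViewUrl,
    message: `I need your help to ${task[reason]}. Open ${liveViewUrl}, then tell me to continue.`,
  };
}

// One step of the loop: keep automating, or pause and hand off.
function nextStep(pageText: string, liveViewUrl: string): HandoffRequest | "continue" {
  const blocker = detectBlocker(pageText);
  return blocker ? buildHandoff(blocker, liveViewUrl) : "continue";
}
```

Keeping the sensitive-input case separate from login mirrors the demo, where the presenter types the address and card number directly into the browser so the agent never receives them.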
Key Takeaways
- Agent SDK users can enable browser automation out of the box by updating to the latest package, which includes browser tools ready to use.
- The createBrowserSession method passes a keepAlive parameter to the underlying create-browser method, which keeps the browser session active for the duration you set.
- Playwright’s launch method is used in conjunction with Cloudflare’s browser run bindings to initialize the browser instance.
- Live view URLs let a human observer interact with the agent’s browser in real time to solve CAPTCHAs or complete logins, without exposing sensitive data to the agent.
- Humans can manually enter sensitive details (like addresses and credit cards) while the agent remains in control of other steps, providing a secure collaboration model.
- The presenter has also configured Open Claw and Hermes agents with live view, which suggests the pattern carries over to other agents.
- Users can share live browser sessions via a URL, making it easy to intervene and then resume automation without losing context.
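The two-method structure from the takeaways (a session method that passes keep-alive to a method that launches the browser) can be sketched as follows. This is an illustration under stated assumptions, not the code shown in the video: the `Launch` type stands in for Playwright's launch function so the sketch stays self-contained, the `keep_alive` option name and its millisecond unit are assumptions, and the ten-minute default is an arbitrary illustrative value.

```typescript
// Hypothetical shapes for illustration; the real SDK types may differ.
interface LaunchOptions {
  keep_alive?: number; // assumed option name, in milliseconds
}

interface BrowserHandle {
  close(): Promise<void>;
}

// Stand-in for Playwright's launch when used with a Browser Run binding:
// it takes the binding plus launch options and resolves to a browser.
type Launch = (binding: unknown, options?: LaunchOptions) => Promise<BrowserHandle>;

// Validates the keep-alive duration and turns it into launch options.
function toLaunchOptions(keepAliveMs: number): LaunchOptions {
  if (!Number.isFinite(keepAliveMs) || keepAliveMs <= 0) {
    throw new RangeError("keepAliveMs must be a positive number of milliseconds");
  }
  return { keep_alive: keepAliveMs };
}

// Creates (launches) the actual browser instance.
async function createBrowser(
  launch: Launch,
  binding: unknown,
  options: LaunchOptions = {},
): Promise<BrowserHandle> {
  return launch(binding, options);
}

// Wraps createBrowser and passes keep-alive so the session stays open
// long enough for a human to step in through the live view.
async function createBrowserSession(
  launch: Launch,
  binding: unknown,
  keepAliveMs: number = 600_000, // illustrative default, not from the video
): Promise<BrowserHandle> {
  return createBrowser(launch, binding, toLaunchOptions(keepAliveMs));
}
```

In a Worker, the `launch` argument would presumably be the launch function from Cloudflare's Playwright package and `binding` the browser binding from the Worker's environment. Injecting both keeps the sketch testable without a Cloudflare account.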
Who Is This For?
Developers building AI agents who need reliable, human-assisted browser automation. Ideal for teams integrating Open Claw, Hermes, or Cloudflare’s Agent SDK to handle CAPTCHAs, logins, and form submissions without exposing sensitive data to agents.
Notable Quotes
"The good news, you don't have to manually create the tools for your agent. The Agent SDK now ships with the browser tools that you can use out of the box."
—Practical for developers: out-of-the-box browser tooling in the Agent SDK.
"This feature is super amazing. I have even configured my Open Claw and Hermes agents with this."
—Shows cross-agent applicability and enthusiasm for the live browser run capability.
"Whenever your agent tries to access the browser, it will create a live session that you can view in your application or on the Cloudflare dashboard."
—Explains the core user workflow of live view sessions.
"We simply share the live URL link with me. I open it up, solve the CAPTCHA or maybe log into my account and let the agent take it from there."
—Demonstrates human-assisted interaction via the live view flow.
Questions This Video Answers
- How does Cloudflare Browser Run’s live view help with CAPTCHA challenges in AI agents?
- What changes are in the latest Cloudflare Agent SDK for browser automation?
- Can I use Browser Run with Open Claw or Hermes agents for secure human-in-the-loop automation?
- How do keepAlive browser sessions work in Cloudflare's browser run feature?
- What are best practices for handling sensitive inputs (addresses, credit cards) when using AI agents with live browser view?
Cloudflare Browser Run, Agent SDK, Playwright, Browser automation, Live view URL, CAPTCHA handling, Login automation, Open Claw, Hermes
Full Transcript
[music] Hey there. I almost didn't see you. I was just enjoying the lunch that the agent ordered for me. Yes, the agent actually ordered the meal for me. Well, we worked together to order my meal. Let me show you the actual recording for this order. So, over here I have my agent and I asked my agent to order a healthy vegetarian bowl or a salad from Wolt. I would prefer something which is rich in protein. So, my agent goes to wolt.com and I can see the live view over here. And now Wolt is asking me to enter my address.
Because I don't want the agent to have my address, I enter the address manually and I let the agent know that, "Hey, I have taken the action. Let's continue." The agent then continues the work of finding the perfect meal for me. And as you can see, whatever the agent does is visible live on a browser. It's not just that only the agent can interact with the browser, but as a human, I can interact with the browser as well. The agent found the perfect meal for me and I told my agent to go ahead and order it for me.
Because I mentioned I need a protein-rich meal, the agent went ahead and added some extra protein supplements for me. And once they were added, the agent added the meal to my cart and the next step was to go and check out. I thought, "Why not ask the agent to handle the checkout as well?" So, I asked the agent to do the same and when the agent did that, we ran into a problem. This time, I had to log in manually to do this. So, I went ahead, entered my email address. Now, the next step was for me to enter my credit card information and again, I did not want my agent to have access to my credit card.
So, I simply entered it myself in the browser and ordered the meal. And this is what I am enjoying right now. Now, if you find it really interesting and want to add this capability to your agent, let me show you how you can do that. Now, over here is the code for my agent. The agent is built using Cloudflare's Agent SDK and I am giving the agent some tools. The good news, you don't have to manually create the tools for your agent. The Agent SDK now ships with the browser tools that you can use out of the box.
So, if you are using Agent SDK, this is the time to update the package. Coming back to our code, the only important thing that you should be looking at is the create browser session. Now, this method uses another method which is the create browser method, but it passes a parameter which is keep alive. The keep alive parameter is going to keep our browser session active till whatever time we set over here. And if I show you the create browser method, this is the method where the actual browser instance gets created or launched. We are using the launch method from Playwright and passing in our browser run bindings with the launch option.
And that's it. Now, whenever your agent tries to access the browser, it will create a live session that you can view in your application or on the Cloudflare dashboard. This feature is super amazing. I have even configured my Open Claw and Hermes agents with this. Every time these agents access the browser and hit any kind of issues like a CAPTCHA or they need me to log in or click something that they cannot really do, they simply share the live URL link with me. I open it up, solve the CAPTCHA or maybe log into my account and let the agent take it from there.
So, how are you going to use live view from browser run? Let me know in the comments below. I'll see you in the next one.