Build Dynamic AI Routes with Cloudflare AI Gateway

Cloudflare Developers | 00:06:59 | Mar 26, 2026
Introduces the common SaaS pricing problem and outlines the heavy lift required to implement per-plan rules, limits, caching, and pricing math manually.

Cloudflare AI Gateway lets you build a dynamic, multi-tier AI routing flow in minutes using a drag-and-drop builder, handling plans, models, rate limits, budgets, and retries out of the box.

Summary

This Cloudflare Developers walkthrough shows how to replace a week or more of boilerplate work with a single dynamic route in AI Gateway. By dragging blocks in the flow builder, you can route free vs. paid users, apply per-user rate limits, enforce budgets, and switch models by plan automatically. The demo uses a per-user identifier (email) and separate constraints for free users (a lower-tier model, tighter quotas) versus paid users (higher limits and a premium model). The flow combines a condition (plan equals free), traffic splitting, rate limit pools, and budget blocks, then assigns a model per branch: GPT-4o mini for free users and GPT-5.1 for paid users. In the demo, free users get 10 requests and a $10 budget per day, while paid users get 1,000 requests and a $100 budget per day. The presenter contrasts this with the traditional approach, which would need a routing service, caching, token and pricing math, retries, fallbacks, and health checks. A live simulator then sends the same prompt as a free user and as a paid user and compares the results, showing the effect of the routing decisions. The takeaway: you can deploy complex AI access controls and model selection with a single, maintainable visual workflow.
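To make the branching concrete, here is a minimal sketch of the decision the demo's route encodes, written as a plain TypeScript function. It is not Cloudflare's API or configuration format: the type names, the `resolveRoute` function, and the model ID strings are hypothetical and only illustrate the logic. The limits and budgets are the ones quoted in the transcript.

```typescript
// Hypothetical model of the demo's route logic; not an AI Gateway API.
interface RequestMetadata {
  plan: string;  // "free" takes the free branch; anything else is treated as paid
  email: string; // per-user identifier used for rate limits and budgets
}

interface RouteDecision {
  model: string;           // illustrative model ID string (assumption)
  requestsPerDay: number;  // rate limit quoted in the demo
  budgetUsdPerDay: number; // budget quoted in the demo
  identifier: string;      // value the limits are keyed on
}

function resolveRoute(meta: RequestMetadata): RouteDecision {
  if (meta.plan === "free") {
    // Free branch: 10 requests/day, $10/day, cheaper model.
    return {
      model: "gpt-4o-mini",
      requestsPerDay: 10,
      budgetUsdPerDay: 10,
      identifier: meta.email,
    };
  }
  // Else branch: 1,000 requests/day, $100/day, premium model.
  return {
    model: "gpt-5.1-chat-latest",
    requestsPerDay: 1000,
    budgetUsdPerDay: 100,
    identifier: meta.email,
  };
}
```

In the video this logic lives in the visual flow builder rather than in application code, which is the point of the demo.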

Key Takeaways

  • Dynamic routing in Cloudflare AI Gateway can implement per-user plan logic, model selection, rate limiting, and budgeting within minutes, not days.
  • The flow builder enables conditions like 'if plan equals free' and traffic splitting, using operators such as greater than, less than, and equal to for flexible routing.
  • Rate limits are defined with a rate limit pool and an identifier (e.g., email), so each user gets their own quota, such as 10 requests per day for free users.
  • Budgets are enforced per identifier (e.g., $10 per day for free users) to prevent overuse.
  • Paid users automatically receive higher limits (1,000 requests and a $100 budget per day in the demo) and a premium model, GPT-5.1, via the dynamic route configuration.
  • A practical simulator demonstrates the real-time difference between free and paid experiences using the same prompt.
  • The approach claims to replace weeks of boilerplate with a single, maintainable visual workflow and centralized metadata.
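For comparison, the hand-rolled version of the per-identifier rate limit and budget checks might look roughly like the sketch below. It is an in-memory illustration only, assuming a UTC calendar-day window; the `DailyLimiter` class and its method names are hypothetical, and a real service would also need shared storage, pricing data, and the other pieces the presenter lists.

```typescript
// Hypothetical in-memory limiter; a sketch of the "traditional way", not AI Gateway code.
interface DailyUsage {
  day: string;      // UTC calendar day, e.g. "2026-03-26"
  requests: number; // requests counted so far that day
  spentUsd: number; // spend counted so far that day
}

class DailyLimiter {
  private readonly maxRequests: number;
  private readonly maxBudgetUsd: number;
  private readonly usage = new Map<string, DailyUsage>();

  constructor(maxRequests: number, maxBudgetUsd: number) {
    this.maxRequests = maxRequests;
    this.maxBudgetUsd = maxBudgetUsd;
  }

  // Records the request and returns true only if the identifier stays
  // within both the request limit and the budget for the current day.
  tryConsume(identifier: string, costUsd: number, now: Date = new Date()): boolean {
    const day = now.toISOString().slice(0, 10);
    let entry = this.usage.get(identifier);
    if (entry === undefined || entry.day !== day) {
      // First request of a new day: start a fresh window.
      entry = { day, requests: 0, spentUsd: 0 };
      this.usage.set(identifier, entry);
    }
    if (entry.requests + 1 > this.maxRequests) return false;
    if (entry.spentUsd + costUsd > this.maxBudgetUsd) return false;
    entry.requests += 1;
    entry.spentUsd += costUsd;
    return true;
  }
}
```

With the demo's free-plan numbers this would be `new DailyLimiter(10, 10)`, keyed on the user's email. The per-request cost passed in is itself the output of the token counting and pricing math the video mentions, which this sketch leaves out.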

Who Is This For?

Essential viewing for developers building AI-as-a-service apps who need to manage per-user plans, model access, and usage budgets without custom routing boilerplate.

Notable Quotes

"dynamic route which does all of this in just 5 minutes."
Highlighting the speed of implementing the routing flow with AI Gateway.
"you literally drag and drop blocks to build this whole flow."
Emphasizing the visual, no-code style of the setup.
"Let's do if plan equals free."
Showcases the conditional logic used to differentiate free users.
"I want to do 10 requests per day."
Example of a rate-limit policy tied to an identifier (email).
"And we'll also use the email as the identifier."
Demonstrates the per-user targeting mechanism for limits and budgets.

Questions This Video Answers

  • How does Cloudflare AI Gateway dynamic routing manage per-user models and quotas?
  • Can you implement per-user budgets and rate limits with a visual flow builder?
  • What are the steps to create a dynamic route in Cloudflare AI Gateway and connect it to an AI provider?
  • What model options are available for free vs paid plans in Cloudflare AI Gateway?
Cloudflare AI Gateway, Dynamic routing, Per-user rate limits, Budget enforcement, Model selection by plan, Flow builder, Provider tokens, GPT-4o mini, GPT-5.1, Traffic split
Full Transcript
All right, here's a super common SaaS use case. You've got a free plan and a pro plan. Free users get a tiny budget, lower limits, and a cheaper model, whereas pro users get more budget, higher limits, and a much more premium model. Sounds simple, right? But this is honestly a week's worth of effort. Imagine building this the traditional way. You need a routing service that knows every plan, every model, every rule. Maybe you need a setup for per-user rate limits. And you also need to implement caching for repeated requests. There's token counting and pricing math for each provider. And if you're doing this, you also need to run background jobs to keep the prices and metadata up to date. And there are a lot more things involved, like retries, fallbacks, and health checks too, and basically a lot of things for just one AI response. What we could obviously do is spend a lot of time writing boilerplate, or we could just use Cloudflare's AI Gateway. We have something called a dynamic route, which does all of this in just 5 minutes. Let me show you how.
This is AI Gateway's dynamic routing. You literally drag and drop blocks to build this whole flow. Just head over to the Cloudflare dashboard, click on AI Gateway, and once you create a new one, you can just head over to the dynamic routing tab and then add a new route. Let's give this a name: my trillion dollar SaaS. Once you hit create, you'll be presented with this flow builder, where you can add an if condition. Click on this X. Then you can click on this little circle button. Then I want to add an if condition and split traffic. Let's do if/else. Once you click that, click on this block, and then you can actually add conditions. Let's say I want to do if plan equals free. You can actually choose from a lot of operators too, if you want something like greater than, less than, less than or equal to, etc. I want to stick with the free plan. Then I want to split traffic accordingly.
Then, if this is true, I want to add some rate limits, so head over to limit. Then I can do a rate limit pool. Click on the block again. Then what I do is basically add an identifier, which will help me rate limit per user. Let's say I want to do email, and I want to do 10 requests per day. Then let's do another limit. Let's add some budgeting. I don't want my free users to use more than $10 a day. Let's do that again on the email identifier. Let's do $10, which is a lot actually. And I also want my free users to get a lower-quality model. So let's just head over here again, click on the little circle button, then click on add model, click on the block, and you can see a lot of providers which Cloudflare's AI Gateway gives you. Let's do GPT-4o mini. Once you type it, click on the plus button, and there we have the free plan's conditions and limits and the model we want them to get.
Let's also do if the plan is something other than free. Okay, let's do this. Okay, let's add some higher limits for them. Let's say I want them to have 1,000 requests per day. And we'll also use the email as the identifier. Let's also add some budgets. Again, let's click on the circle, then limit, then budget. Let's give them a $100 setting. Again, a lot. Then, if all of these conditions pass, let's give them a much more premium model. Click on add model. Let's do provider OpenAI. And let's say I want to give them GPT-5.1 chat latest. And yeah, there we have it. We have now created a dynamic route which would have taken at least a week, or at least a couple of days, to build. So there you have it. Just hit this endpoint and all of these conditions will just work flawlessly.
To showcase the dynamic route we just built in real-time use, I actually built this little simulator, which is a vibe-coding website builder that shows you the difference between what paid users and free users get with just the URL we just built.
So let's say we want to run this prompt, "create a website for software developer," and it's the same prompt, and I'm actually sending the metadata which I just added in the dynamic route. So there's the email, which is the identifier, and you can see the requests it has for free users and the requests for the paid ones. So let's just hit this. And while this runs, let me show you the... And you can see that it's just one route. None of the primitives or the utilities I was talking about are in here. It's just walking through this metadata that I have passed. And then I just hit the completion endpoint from OpenAI. And let's see. Okay, there we have our free user's website. And it looks like the paid one is still running. And there we have it. It took nearly 3 minutes, but yeah, it's finally ready. And you can see the difference between what the paid user got and what the free user got, which is because we used a much more premium model. So yeah, this is exactly the use case, or the problem, we were trying to solve, and you can see how Cloudflare AI Gateway can help you build this. Thank you.
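The request the simulator sends is not shown in full in the video, but a client call that passes the plan and email as metadata might be assembled roughly as in the sketch below. Everything specific here is an assumption to verify against the current AI Gateway documentation: the endpoint path, the `dynamic/<route>` model naming, the `cf-aig-metadata` header, and the bearer-token authorization. The `buildGatewayRequest` helper and its option names are hypothetical.

```typescript
// Hypothetical request builder; endpoint, header, and model naming are assumptions.
interface GatewayCallOptions {
  accountId: string; // Cloudflare account ID (placeholder)
  gatewayId: string; // AI Gateway name (placeholder)
  routeName: string; // name of the dynamic route created in the dashboard
  apiToken: string;  // credential for the call (placeholder)
  prompt: string;
  metadata: { plan: string; email: string }; // values the route's conditions and limits read
}

interface BuiltRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildGatewayRequest(opts: GatewayCallOptions): BuiltRequest {
  // Assumed OpenAI-compatible chat completions path on the gateway.
  const url =
    `https://gateway.ai.cloudflare.com/v1/${opts.accountId}/${opts.gatewayId}` +
    `/compat/chat/completions`;
  return {
    url,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${opts.apiToken}`,
      // Assumed header for per-request metadata, sent as a JSON string.
      "cf-aig-metadata": JSON.stringify(opts.metadata),
    },
    body: JSON.stringify({
      // Assumed convention for addressing a dynamic route instead of a fixed model.
      model: `dynamic/${opts.routeName}`,
      messages: [{ role: "user", content: opts.prompt }],
    }),
  };
}
```

The result can be passed to `fetch(req.url, { method: "POST", headers: req.headers, body: req.body })`. Sending the same prompt twice with only `metadata.plan` changed mirrors the free-versus-paid comparison in the simulator.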
