Scaling MCP: A simpler, safer enterprise architecture
Host introduces the MCP topic and references a blog post from Agents Week, inviting viewers to read it for deeper context.
Cloudflare’s Sharon Goldberg walks through a scalable, secure MCP architecture: remote MCP servers, code mode, portals, and enterprise-grade safeguards with AI Gateway and Cloudflare Gateway.
Summary
Sharon Goldberg joins the chat to unpack Cloudflare’s blog on deploying MCP (Model Context Protocol) in the enterprise. She emphasizes moving away from locally run MCP servers toward centrally hosted, auditable deployments on Cloudflare’s developer platform and Workers, with Cloudflare Access (Zero Trust) for authentication. The discussion covers AI Gateway as a flexible proxy for switching LLMs and enforcing cost controls, and introduces MCP server portals to simplify discovery and governance across many servers. A highlight is Code Mode, which limits exposed tooling to just two tools, search and execute, reducing context size and cost. Goldberg also explains how Cloudflare Gateway helps detect shadow MCP traffic using traffic inspection and data-loss prevention rules, tying MCP security to broader enterprise safeguards. Finally, the pair stress that organizations offering a customer-facing service should consider publishing an official MCP server to avoid supply-chain risks from uncontrolled, external MCP servers. The session roots these ideas in a practical reference architecture that Cloudflare published during Agents Week, inviting feedback from teams and customers.
Key Takeaways
- Run MCP servers remotely on a centralized platform (Cloudflare Workers) instead of exposing local machines, enabling IT to patch, audit, and control deployments.
Who Is This For?
Essential viewing for enterprise security teams, platform engineers, and product teams planning to expose MCP capabilities to customers. It clarifies governance, deployment, and security controls needed to scale MCP safely across large organizations.
Notable Quotes
"'MCP is dead'... not really. The architecture Cloudflare built keeps it alive with centralized hosting, auditability, and security controls."
—Sharon references the initial skepticism about MCP and contrasts it with Cloudflare’s hosted approach.
"We use Cloudflare's developer platform to host those MCP servers so they’re globally distributed across our network."
—Key point about deployment model and latency/availability benefits.
"Code mode reduces the context by exposing only two tools: search and execute."
—Highlights a major optimization to cut costs and complexity.
"If you’re providing a service to customers, it’s better to have an official MCP server for your product than leave it up to random servers on the internet."
—Addresses supply-chain risk and vendor governance.
"Shadow MCP can be discovered and controlled using Cloudflare Gateway with DLP rules and traffic inspection."
—Shows practical security measures for unauthorized MCP usage.
Questions This Video Answers
- How does Cloudflare's AI gateway help manage MCP cost and model switching?
- What is Code Mode in MCP and how does it save costs?
- Why should enterprises host MCP servers with Cloudflare Workers instead of local deployments?
- How can Cloudflare Gateway detect shadow MCP traffic and why is it important?
- What are the governance benefits of an MCP Server Portal for large organizations?
MCP, Model Context Protocol, Code Mode, AI Gateway, Cloudflare Workers, Zero Trust Access, Cloudflare Gateway, MCP Server Portal, Shadow MCP, Supply chain risk
Full Transcript
Hello everybody and welcome. I am so excited to be here today with one of the authors of a blog post that came out during Agents Week. I hope everybody has been devouring the blog. If you didn't read this one yet, it's time to go read it. I am so lucky to have Sharon Goldberg here to talk a little bit about what is up in this blog: how to deploy MCP in the enterprise. And Sharon, actually before we get started, can I have you introduce yourself? Yes. Hi. I'm Sharon Goldberg. I'm a product director here at Cloudflare.
I work on a bunch of different really exciting things. One of them is our AI security suite. I also work on post-quantum cryptography. I also work on data sovereignty. It's been a really fun couple of years here working on these really advanced technical projects. So yeah, this blog was really fun because we pulled together a whole bunch of different things that different groups in the company have been building and also deploying internally and actually using on our own workforce. So we thought that it would be fun to pull it all together in one place, and as we were working on it, it grew into this giant reference architecture that we're really happy to release today.
It is so cool, and I think that we might have a little bit of prerequisite that we need to do. We're going to go a little deep here, and first I want to talk about what MCP even is. Is that okay? And Sharon, will you tell me how I do with it? Okay, I'll do my best. Craig, you're the star at this. So, I'll sit back and let you go. So, no. At defining MCP. So, MCP, if you haven't used it yet: Model Context Protocol, right?
So, the LLMs need to do things. They need to have more context, and they can be given tools. A lot has changed in this space, and a lot of people have different feelings about this. There have been some security issues raised, and some people have said MCP is dead. I am very excited to see what we have built here. And I think one of the things that very first started happening, which is interesting, is people were running things locally, right?
Right. So you took this MCP server and installed it on your local machine. My way of communicating, I guess my MCP client, was local on my machine, and that could go off and call things, and the LLM would come in and do things locally on my machine and then go out. And I think that there were some problems with that, if I'm not mistaken. I mean, I just think in general with AI right now, we're still a little bit in the first few innings.
There's a lot of things that happen with AI that feel to me like we're still in the 90s, right? Right. And like one of those things is that people run things locally with AI. They're doing like these agents will be running locally and like updating your local file system and stuff like that. It feels like very sort of like the way we used to do things. And so I would say with MCP that's a similar situation where this all started was that you would let's say I want to interact with um GitHub or a GitLab repository, right?
You would put an MCP server in front of that to make it easier for an LLM to to speak to that GitHub repository. But where is that MCP server hosted? Are you hosting it on your machine like this machine that I'm speaking in front of right now? That's what you were doing at the beginning and that's what a lot of organizations are still doing. Now the problem with me running an MCP server locally is who wrote that MCP server? Who wrote that MCP server? Is it patched? What version is it using? Have we done security scans against it?
Does my IT team really think I should be deploying this MCP server against our code repo and running commands through it? Maybe there's a bunch of tool injection attacks in that server. Who knows? If I run it locally, it's harder for my IT team to actually administer and control that. And even though that's still done in a lot of organizations, we actually don't do that here at Cloudflare anymore. And so that's the first part of the reference architecture. Well, yes, speaking of the reference architecture, I saw a diagram. You had a diagram a while back that I saw in a slide.
I grabbed it. Can I share it? All right. Awesome. Okay. So, here's the diagram. I yanked this right out of a presentation that I saw you give. So, up here on the left we have these LLMs, right? Yeah. And we run through this AI Gateway. Let's talk a little bit about AI Gateway. Yeah. So actually, maybe let's start from the user, right? So, I've got my user at the bottom, and my user wants to do something with MCP. So for example, I may want to make a command that says, "Hey, can you go look through all my GitHub or my GitLab repositories to find all certificate authority implementations that we have at Cloudflare?" And so it'll just go do that for me, which is really useful.
So I don't have to read all this code, and it knows what a certificate authority is. It goes and finds it. How does it do that? Right? So that's the MCP client that's sitting there. The MCP client calls out to an LLM in order to make these kinds of queries. And so that's that leg that you see there. You can have any LLM that you want. At Cloudflare, we don't connect the LLMs directly to the MCP clients. In our reference architecture, we actually go through an AI Gateway. An AI Gateway you can think of as a proxy to LLMs.
So, the most important thing that it does, the most basic thing, is it allows you to switch LLMs. So, maybe a task that you're doing needs a really cheap LLM, or maybe needs a local LLM, or maybe you want a really expensive high-powered LLM, and you should be able to easily switch models from different providers. That's something you can do with AI Gateway. But the other thing you can do with AI Gateway is cost controls. So in AI Gateway you can have something that says engineers can use this many tokens a day and salespeople can use this many tokens a day. All of that is controlled through AI Gateway, and it gives you those kinds of controls over how many queries can come from each individual person in the organization, and lets you track what they're doing.
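To make the two gateway roles described above concrete (model switching and per-role token budgets), here is a minimal sketch. It is not Cloudflare's AI Gateway API: the `ToyGateway` class, the model names, the tier names, and the budget numbers are all hypothetical and exist only to illustrate the idea of a proxy that picks a model and refuses requests that would exceed a daily allowance.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGateway:
    """Toy stand-in for an LLM proxy: picks a model and enforces daily token budgets."""

    # Hypothetical mapping from task tier to model name.
    models: dict[str, str]
    # Hypothetical mapping from role to tokens allowed per day.
    daily_budget: dict[str, int]
    # Running usage per (user, day); doubles as a simple usage record.
    used: dict[tuple[str, str], int] = field(default_factory=dict)

    def route(self, user: str, role: str, tier: str, tokens: int, day: str) -> str:
        """Return the model to call, or raise if the request exceeds the budget."""
        key = (user, day)
        spent = self.used.get(key, 0)
        if spent + tokens > self.daily_budget[role]:
            raise PermissionError(f"{user} would exceed the daily budget for {role!r}")
        self.used[key] = spent + tokens
        return self.models[tier]


gateway = ToyGateway(
    models={"cheap": "small-model", "powerful": "large-model"},
    daily_budget={"engineer": 1000, "sales": 200},
)
print(gateway.route("alice", "engineer", "cheap", 400, "2025-01-01"))
```

Because clients only name a tier, swapping the model behind "cheap" is a one-line change at the gateway rather than a change in every client.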
So that's how we sort of set up the MCP clients and earlier I said you know people used to like in in some places people do run MCP servers locally right and so that MCP client would talk to an MCP server that's sitting on their actual laptop and then goes and calls out to whatever in this picture you can see GitLab in the bottom corner for example or GitHub you can see that in the bottom right so that is a way to run MCP but again as I said before running the MCP server locally is not the best because it really takes control out of your IT team and security team which should be making sure that you have at least trusted up-to-date implementations of these servers and not random stuff you downloaded off the internet.
Yeah, that's a lot of responsibility for somebody to say, oh no, this is a thing that I'm going to run here locally on my computer, with my company data too. The more you talk about it, that responsibility is scary. Yeah. So now that we take this idea that we're not going to run the MCP server locally, how are we going to run it? So the way that we do it, and the way we talk about it in the reference architecture: we have in our monorepo at Cloudflare a whole workflow that allows developers to build MCP servers and go through security checks and other checks before those MCP servers then get deployed to the whole company.
Um and how do they get deployed to the whole company? We use Cloudflare's developer platform to host those MCP servers so that they're globally distributed across our network. And so any employee wherever they are in the world will when they're actually accessing the remote MCP server, they'll go to their closest colo which will be running that remote MCP server and they can access it through that. So that's like the basics of how a remote MCP server would work. And really the reason to do it if I'm focused on security is that it gives a security team a way to audit and control and patch and upgrade these servers rather than just kind of like not knowing who's doing what locally on your machines.
So that's that part. Now when we talk about MCP, if you look down, you see ZTNA Access on that diagram. There we go. Yeah. So that is around authentication. So you can have an MCP server that may sit in front of something like Cloudflare Radar, which is our data repository that scans the internet. There's no authentication to get into Cloudflare Radar because it's all public information, but obviously if you're going to be accessing the company's production code repositories, you need to authenticate and make sure you're a Cloudflare employee. So, how do you do that?
You can do that with a tool for authentication, a Zero Trust Network Access tool, that's ZTNA: Cloudflare Access. And so what we do with our internal MCP servers is that we build them and host them on Cloudflare Workers. And then we put Cloudflare Access in front of them, which means that it does single sign-on, it does MFA, it can do device context, it can say only allow people in these certain countries: all of the different features that you would expect from really deep, fine-grained authentication. You can do all of that through Access and put it in front of your MCP server.
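The checks listed above (single sign-on, MFA, device context, country restrictions) can be pictured as a policy where every condition must pass. The sketch below is only an illustration of that idea, not how Cloudflare Access is configured or implemented; the `Request` and `Policy` types and their field names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    email: str
    sso_ok: bool          # user completed single sign-on
    mfa_ok: bool          # user completed a second factor
    device_managed: bool  # device posture signal
    country: str          # two-letter country code of the request


@dataclass(frozen=True)
class Policy:
    email_domain: str
    allowed_countries: frozenset[str]
    require_managed_device: bool = True


def is_allowed(req: Request, policy: Policy) -> bool:
    """Every condition must hold; any single failure denies access."""
    return (
        req.sso_ok
        and req.mfa_ok
        and req.email.endswith("@" + policy.email_domain)
        and req.country in policy.allowed_countries
        and (req.device_managed or not policy.require_managed_device)
    )


policy = Policy(email_domain="example.com", allowed_countries=frozenset({"US", "GB"}))
employee = Request("ana@example.com", True, True, True, "US")
print(is_allowed(employee, policy))
```

A public server such as the Radar example would simply have no such policy in front of it, while a server fronting production code would require all of these signals.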
And so that protects it from unauthorized access, which is obviously really important if you're putting it in front of your internal ClickHouse clusters or something like that. Right. Right. And then the last piece there, you can see right in the middle, is the MCP server portal. So the MCP server portal serves a bunch of purposes, and the very simplest one is the following. If I'm an employee of the company and I'm just starting to use MCP, I don't know which MCP servers are out there.
It's really hard and kind of scary to figure out where they are, how I use them, how I connect to all of these things, and how I set them up. And that is actually maybe the biggest barrier to adoption: how do I even connect to these things? So, with portals, what you do is you just connect your MCP client to the portal, and that portal will expose to you all the servers that you have access to as an employee of this company. Those servers could be some that we built and hosted internally, and they could also be third-party servers, because Slack and PayPal have their own MCP servers that they stand up for their customers, and you can connect to those through the portal. I don't have to find out how to connect to them and do all that complicated stuff.
I just go through the portal. So as an employee it makes my life easier, but also as an IT administrator this is another point of control where I can write policies about who can access what MCP server. I can log what they do. I can run DLP, data loss prevention, policies through the MCP server portal. So all of that is happening in the MCP server portal, along with something else really cool that we launched now. Right. Right. Because, I mean, I know that we have, I think, 11 or 12 MCP servers that might actually have additional MCP servers.
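The portal's two jobs described above, showing each employee only the servers they may use and giving administrators a place to enforce policy and log activity, can be sketched as follows. This is a toy model under stated assumptions: the `Portal` class, the group-based policy shape, and the server names are hypothetical and are not the MCP server portal's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Portal:
    # Hypothetical policy shape: server name -> groups allowed to use it.
    servers: dict[str, set[str]]
    # Each entry records (user, server, decision) for later auditing.
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def list_servers(self, groups: set[str]) -> list[str]:
        """Discovery: return only the servers this user's groups may access."""
        return sorted(name for name, allowed in self.servers.items() if allowed & groups)

    def authorize(self, user: str, groups: set[str], server: str) -> bool:
        """Policy check for a single call; every decision is logged."""
        allowed = server in self.servers and bool(self.servers[server] & groups)
        self.audit_log.append((user, server, "allow" if allowed else "deny"))
        return allowed


portal = Portal(servers={
    "gitlab": {"engineering"},
    "jira": {"engineering", "marketing"},
    "finance-reports": {"finance"},
})
print(portal.list_servers({"marketing"}))
```

The employee connects to one endpoint and discovers what is available, while the administrator edits one policy table instead of configuring each server separately.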
So that's a lot of information, right? Right. Yes. A lot of tools to connect to. And I might be jumping the gun here, but are you about ready to talk about code mode? Yeah, I am. Let's talk about code mode. So this thing here, I believe, really only has two tools. It itself is an MCP server, right? These clients, when we look at it, are connecting to this MCP server. And instead of showing all of these tools, we use a thing called code mode where it shows two, right?
Yeah. So, it's a way of reducing the amount of information that has to be processed when you interact with MCP servers. And it's a really cool story, because we had a team internally that found a way to reduce the number of tools that are exposed from a single MCP server to just two tools, which are search and execute. Search looks at what tools are available, and execute will then write a little piece of code that will call those existing tools and do something with them, right? And so by doing it that way, you reduce the context so much that you can have huge cost savings.
That's just a very simple explanation. I didn't build this. I'm speaking for my colleagues who built this super cool thing. But then they thought, okay, so we did this for an MCP server that we built, but how do we like make this useful for the world? That was the clever idea. They actually added it into the MCP server portal so that anything that the MCP server portal sits in front of gets this code mode optimization and now you can interact with all these servers which don't have code mode. You get it if you put them behind the portal and now you get code mode.
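The two-tool idea described above can be sketched in a few lines: instead of advertising every underlying tool to the model, a wrapper advertises only `search` and `execute`. The sketch below is an illustration under loud assumptions, not Cloudflare's implementation. The `CodeModeServer` class and the example tools are hypothetical, the snippet passed to `execute` is assumed to come from the caller, and the bare `exec` with a reduced set of builtins is NOT a real sandbox; a production system would need proper isolation.

```python
from typing import Any, Callable


class CodeModeServer:
    """Wraps many tools but exposes only two entry points: search and execute."""

    def __init__(self, tools: dict[str, Callable[..., Any]]):
        self._tools = tools

    def search(self, query: str) -> list[str]:
        """Return 'name: description' for tools whose name or docstring matches."""
        q = query.lower()
        return sorted(
            f"{name}: {fn.__doc__ or ''}".strip()
            for name, fn in self._tools.items()
            if q in name.lower() or q in (fn.__doc__ or "").lower()
        )

    def execute(self, code: str) -> Any:
        """Run a snippet that may call the wrapped tools; it must assign `result`."""
        # Toy isolation only: a tiny allowlist of builtins plus the tools.
        scope: dict[str, Any] = {"__builtins__": {"len": len, "sum": sum, "sorted": sorted}}
        scope.update(self._tools)
        exec(code, scope)
        return scope.get("result")


def list_repos() -> list[str]:
    """List repository names."""
    return ["ca-service", "web-app", "internal-ca"]


def count_files(repo: str) -> int:
    """Count files in a repository."""
    return {"ca-service": 12, "web-app": 40, "internal-ca": 7}[repo]


server = CodeModeServer({"list_repos": list_repos, "count_files": count_files})
print(server.search("repo"))
print(server.execute('result = sum(count_files(r) for r in list_repos() if "ca" in r)'))
```

The context saving comes from the client seeing two tool definitions no matter how many tools sit behind the wrapper; one executed snippet can also chain several tool calls without a model round trip for each.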
And so that was something we released now which is super cool and that's like straight up full feature of MCP server portals that went live this week. It's so cool and you you save so much money and so much like you can imagine in that bit where we're making these remote MCP servers if we have a lot of remote MCP servers and we're hearing that people are using these internally right so it's kind of a way to do internal enablement too and there's a bunch of them and how do you find them and that search is really really important so putting this thing in front of it is really nice and you can choose uh there's a uh you know I love running through this because you can kind of choose which ones you want to use in there too which one as a user you want to use these are offered to you which ones do you want to actually enable right now.
It's really nice. Really, really nice. Right. And if you don't know, you're like, "Look, there's an MCP server for Jira. I didn't know that. Maybe I should try using it to do Jira things." Which is so good for discovery, because a workforce doesn't necessarily know what servers are available, and they're being added all the time. And so it's just a great way to roll things out really quickly across the organization. I think that's one of the reasons why we've had such broad adoption here, which is interesting, because I don't know how true this is across the industry, but MCP usage is not restricted to our R&D team.
It is used by our go-to-market teams, by our marketing team, by our finance teams, because now they have access to all these internal resources that they can interact with using an agent. And I think that's just really powerful and cool, to not have it be restricted just to developers. It is such an unlock. And again, the way that you would do this in the past, we were talking about it, is to run this locally on your machine. Imagine handing that out to everybody, and then that ends up in a dangerous place, right? I feel like you're throwing back to the 90s, like you were talking about, with shadow IT. Is it a shadow MCP moment? Is that what we're having? We might be having a shadow MCP moment. So there are two parts to shadow MCP.
One is locally run MCPs that people are downloading and running locally, which actually we don't have a solution for in the product yet. That's more like an EDR feature that I could see coming out. But if people are using remote MCPs in an unauthorized way, then you can actually see that if you use Cloudflare One as your enterprise networking stack. That's our SASE platform. So in our SASE platform, we have a secure web gateway that can inspect HTTPS traffic, TLS traffic. And guess what MCP traffic is? It's HTTPS traffic. And it has certain markers: there are certain headers, there are certain formats.
It's JSON-RPC. And so in this blog, we also recently worked with some customers to figure out how they could use Cloudflare Gateway to discover shadow MCP, meaning unauthorized remote MCP servers that are being used by their employees. And we can pick that up with a bunch of DLP regular expressions on the HTTP bodies. Anyway, I'm getting way too in the weeds, but the point is there are some tricks you can use and you can pick out this traffic. And so the other thing that we wrote about was how to actually use a secure web gateway like Cloudflare Gateway to pick out MCP traffic from the remote network.
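As a rough picture of what regular expressions on HTTP bodies can look like for this purpose, the sketch below flags request bodies that carry a JSON-RPC 2.0 envelope together with a method name used by MCP (such as `initialize`, `tools/list`, or `tools/call`). These patterns are illustrative assumptions only. They are not the rules published in the blog post, and a real deployment would likely combine body matching with other signals to limit false positives.

```python
import re

# JSON-RPC 2.0 envelope marker, tolerant of whitespace around the colon.
JSONRPC_RE = re.compile(r'"jsonrpc"\s*:\s*"2\.0"')

# A few method names defined by the MCP specification.
MCP_METHOD_RE = re.compile(
    r'"method"\s*:\s*"(initialize|tools/list|tools/call|'
    r'resources/list|resources/read|prompts/list|prompts/get)"'
)


def looks_like_mcp(body: str) -> bool:
    """Heuristic: the body has a JSON-RPC 2.0 marker and an MCP method name."""
    return bool(JSONRPC_RE.search(body)) and bool(MCP_METHOD_RE.search(body))


sample = '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "search"}}'
print(looks_like_mcp(sample))
```

Requiring both markers keeps ordinary JSON-RPC traffic that is unrelated to MCP from being flagged by this heuristic.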
And that's something that we've been asked about by customers constantly, and I'm really happy that we were able to put those rules out there. I really hope, if you're operating a secure web gateway, that you take a look and use the rules. I think they're really useful. And is this AI Security for Apps, or is this something else? Because there's another part of the blog post. No, there's a whole other part. Let's talk about that. Okay, the last thing I want to talk about is the following.
So the last thing I want to talk about is this: we talked a lot about how Stripe and PayPal and GitHub have put out MCPs that their customers can use. And we're at the point where we feel like every organization should be doing that. We're moving into a world where these things are going to be operated and administered using MCP. And so if you are providing a service to customers that those customers are going to want to interact with using MCP, it's much better if you write the MCP server for your product than if you just let them find random MCPs off the internet.
That is potentially a supply chain risk, a software supply chain risk. It's much better to have an official MCP server than to wait for someone to write one for you and post it on GitHub as open source, and who knows who that person is. Yep. So for that use case, which is that I have a product that I'm building and I want to offer an MCP because I believe it's important for the security and the integrity of the product that I'm building, there are two things that we pulled out and wanted to highlight. One is that you can of course build an MCP server on Cloudflare and host it on Cloudflare Workers, and then it's globally distributed across our entire network, always.
So it's not like you have to deploy to every region and manage it in every region. It's all globally deployed. So that's really good for latency and performance, first. And second, we have a feature called AI Security for Apps, which is in our web application firewall, our WAF. So you can put our WAF in front of those MCP servers with this feature turned on. And what will it do? It will look for prompt injection and other AI attacks that are trying to connect with your MCP server. And so you can have the hosting on Cloudflare and then the security on Cloudflare as well.
And we just really wanted to highlight that a lot of organizations need to be thinking about how they're going to provide MCP to their customers and to the people who are administering their products. So that was the last part of this giant reference architecture. That is so awesome and so thoughtful, right? We say it is moving so fast. People are able to build MCP servers so fast. They probably will build yours. So I love that we're saying here's how to do it safely and securely, giving you a nice way to go, and then also how to set this up internally.
Thank you so much for this blog post. I think there's a lot to digest here, and I hope that everybody goes and reads it and thinks through the kind of problems that you have, and let us know, literally let us know, what more you'd like to see, because we're all driving this. Right, Sharon? This is all coming from customers and how they're seeing MCPs being used. It's coming from customers. It's coming across the business. Yes. And we're talking to customers about it all the time. And I guess the last thing I would say is that I kind of started this whole effort, and part of the reason is I did I don't know how many customer presentations on this material, and I just decided this has to be written down so that we can scale it out across all of our teams and all of our customers.
So I hope it's helpful, and then maybe I can do a few fewer briefings on this specific topic. Well, thank you, Sharon, thank you for being here, and thank you, everybody watching this. Please make sure that you go to the blog. There's so much stuff happening in Agents Week. We talked about a lot of products in here. AI Gateway is getting a bunch of new stuff this week. You should check it out. Lots of fun stuff. Lots of creative, innovative things that you're going to want to get your hands on.
So, please, we'll hang out in the blogs. There's a big hub page of everything that's launched this week. Sharon, thank you so much for being here and I can't wait to to be on another one of these with you. Thank you. Thanks so much.