Why Spotify is still Excited by CDN - Immerse Stockholm 2026
Spotify discusses scale challenges across three dimensions: the massive user base (750M+ MAUs) and concurrent requests, the enormous and varied catalog with multiple formats, and global markets, all underscoring the need for low-latency, personalized streaming at scale.
Spotify relies on public CDNs such as Cloudflare to keep playback latency low for its roughly 750 million monthly active users, which the speaker treats as central to the product experience.
Summary
Matias from Spotify shares how the company scales to around 750 million monthly active users and why CDN is essential beyond just delivering music. He frames scale in three dimensions: the user base with its concurrent requests, the vast catalog of tracks and formats, and the reach across more than 180 markets with multiple client surfaces. He explains how the initial request flow (the back end returns a URL and a decryption key, then the client fetches the track from the CDN and decrypts it) must feel instant, emphasizing latency as a core metric tied to engagement and retention. The talk contrasts audio with video, noting Spotify’s long-tail catalog and the need for instant playback even when many titles are rarely accessed. Platform engineering at Spotify is presented as an evolution from traditional DevOps and infrastructure teams, focusing on autonomy while aligning on patterns, templates, and a single developer portal called Backstage. The audience learns about the “golden path” and the MCP gateway that steer teams toward safe, supported configurations without micromanagement. Finally, Matias looks at the future: pushing for even lower latency, exploring edge caching to reduce back-end round trips, and expanding features like lossless playback that substantially increase CDN usage. Cloudflare is framed as a partner in staying within a 250 ms latency budget and delivering more efficiently. The session also covers alarm-clock traffic peaks, DoS protection, and ongoing collaboration with vendors to keep Spotify’s CDN footprint both fast and cost-effective.
Key Takeaways
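- Spotify’s user scale means well over 10 million sustained requests per second to the back end and tens of millions of sustained streams, served from a multi-format catalog of over 100 million tracks, a couple of million podcasts, and a couple of hundred thousand audiobooks.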
- Latency is a top metric for engagement and retention; the goal is latency so low that listeners forget they are streaming from the cloud.
- Backstage acts as Spotify’s internal developer portal, enabling almost a thousand engineering teams to provision CDN and related infrastructure through templates and defaults.
- The platform group balances autonomy and alignment by providing a ‘golden path’ that makes the right tools the easy choice for engineers.
- Lossless playback launched recently, driving a substantial increase in CDN traffic and demonstrating CDN dependence for advanced audio features.
- Spotify sees edge caching and reduced back-end trips as future levers to shave latency and cut costs.
- Audience takeaway: effective platform engineering can scale infrastructure for a consumer-grade service across 180+ markets while keeping developer experience high.
Who Is This For?
Essential viewing for cloud engineers and platform teams at large media companies who need to scale CDN and backend architectures for a global audio service without compromising developer autonomy.
Notable Quotes
"750 million. That's our number of monthly active users."
—Matias introduces the scale Spotify must serve and sets the growth context.
"Latency for us is one of the most important key metrics."
—Defines the core performance metric driving CDN reliance.
"the idea is we want the latency to be so low that people forget that we're streaming from the cloud or that you're streaming from the cloud."
—Describes the UX goal of near-seamless streaming.
"single pane of glass, the internal developer portal"
—Mentions Backstage as the central platform for developers.
"golden path. This is like a signal that we send that if you use this choice, you're on the protected or the supported track."
—Illustrates the governance approach for platform tooling.
Questions This Video Answers
- How does Spotify manage CDN and backend architecture for 750 million users?
- What is Backstage and how does Spotify use it to standardize CDN deployments?
- What impact did Lossless Playback have on Spotify's CDN traffic?
- How can edge caching reduce latency for streaming music?
- What is the alarm clock challenge in Spotify’s traffic patterns and how is it mitigated?
Spotify, CDN, Latency, Lossless Playback, Platform Engineering, Backstage, Cloudflare, Edge Caching, Global Deployment, Alarms and DoS Protection
Full Transcript
Hello everyone. When Cloudflare asked me if I wanted to give a talk, my question was: can I talk about not AI? And the answer was yes. So I'm not going to talk about AI, for a change. I'm excited about AI, don't get me wrong, but it's also good to talk about everything else we do. So what I want to talk about is how we at Spotify use CDN. CDN is something I really care about. It's something that we all know and love, and we kind of forget about because it just works. To start with, let's look at that number in the title.
750 million. That's our number of monthly active users. The last publicly available number is a little bit higher, but the thing I want to say here is: it's growing, and it keeps on growing. Just last year I did a similar talk and we were below 700 million. It's a lot of people. And what I'm trying to say here is that when we work with infrastructure at Spotify, it's really often a challenge of scale. Building a streaming app, especially now in an age of agentic AI, is not that hard. Doing it for almost a billion people, that's where it gets interesting.
And when we talk about scale, it's really three dimensions. One of them is the number of users. 750 million monthly active users means a lot of concurrent requests every second to our back end and to our CDN, and a lot of streams: double-digit millions, sustained. But we also need to store all that data somewhere. All the user profiles are stored, and that's a different type of scale just from the number of users. And finally, we also want to deliver a personalized experience for each and every one of our users.
So that's the first dimension of scale. The second one is our catalog, right? We have a very large catalog that people can listen to. It's of course music, over 100 million tracks, but we also have a couple of million podcasts, not thousands, and a couple of hundred thousand audiobooks that people can stream. So it's a large catalog, and that means every piece of content needs to be stored somewhere. It needs to be ready to be served. And usually every track or every artifact is stored in different formats or different encodings.
Maybe some markets have a different version of it. So there's a lot of data to be served. We're talking about petabytes, or thousands of petabytes, of data. And then the third dimension: markets. We don't exist in all countries, but we are available in over 180 markets around the globe. So that means we don't operate just in the Nordics or just in the US. We have people listening and using the platform around the clock and around the globe. There's also a lot of surfaces where Spotify can be used, right?
You can have your mobile phone, your browser, your TV or your speaker, your watch, your car; they all have clients. So there's that ubiquity of Spotify as well. So what we serve is really hundreds of millions of users and hundreds of millions of tracks, in hundreds of markets, around the globe and around the clock. So what happens if you open your app and you click play? The first thing that happens is your client, your mobile phone for instance, sends a request to a back end to request a certain track, right?
And the back end responds: well, here's a URL and here's a decryption key. Next, the client goes to the CDN, fetches the track, and decrypts it with the key provided. Now, I know there's at least one colleague from my team in the audience. They're going to roll their eyes because, of course, it's not that simple. This is dramatically oversimplified, and there is a lot of nuance in the real world. For one, there are many more requests between the client and the back end. The client also first fetches a head file, just to speed up the first couple of seconds that people want to listen to. And maybe the user wants to skip forward and backwards.
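The simplified three-step flow described here (ask the back end, fetch from the CDN, decrypt) can be sketched in a few lines. This is only an illustration under stated assumptions: the in-memory dictionaries stand in for the back end and the CDN, the XOR cipher is a toy placeholder for whatever encryption is really used, and every name and URL below is hypothetical rather than taken from Spotify's systems.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-ins; none of these names come from Spotify.
CDN_OBJECTS = {}   # url -> encrypted bytes (plays the role of the CDN)
TRACK_INDEX = {}   # track_id -> (url, key) (plays the role of the back end)


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher so the sketch runs; not real content protection."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


@dataclass(frozen=True)
class PlaybackGrant:
    """What the back end hands to the client: where to fetch and how to decrypt."""
    url: str
    key: bytes


def publish(track_id: str, audio: bytes, key: bytes) -> None:
    """Store the encrypted track on the 'CDN' and register it with the 'back end'."""
    url = f"https://cdn.example.invalid/audio/{track_id}"
    CDN_OBJECTS[url] = xor_cipher(audio, key)
    TRACK_INDEX[track_id] = (url, key)


def backend_resolve(track_id: str) -> PlaybackGrant:
    """Step 1: the client asks the back end for a track."""
    url, key = TRACK_INDEX[track_id]
    return PlaybackGrant(url=url, key=key)


def cdn_fetch(url: str) -> bytes:
    """Step 2: the client fetches the encrypted bytes from the CDN."""
    return CDN_OBJECTS[url]


def play(track_id: str) -> bytes:
    grant = backend_resolve(track_id)
    encrypted = cdn_fetch(grant.url)
    # Step 3: decrypt locally with the key the back end provided.
    return xor_cipher(encrypted, grant.key)


publish("track-123", b"pcm-audio-bytes", key=b"secret")
decoded = play("track-123")
```

The point of the split is that the CDN only ever holds encrypted bytes, while the small, per-user decision (which URL, which key) stays with the back end.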
Maybe they want to enable lyrics, or maybe they want to browse some metadata about the song while they listen, and so on. A user might want to join a jam from a friend, or maybe they want to, god forbid, connect to their connected speaker at the same time as they're listening. So there's a lot of things happening all the time, and that's how it's supposed to be. But everything is supposed to feel instant. Latency for us is one of the most important key metrics. We are a streaming platform, but more importantly we are primarily an audio streaming platform, and that's a little bit different from other streaming platforms, for instance video streaming, in two key ways. One of them I already mentioned: the catalog. It's a very long tail of a catalog. We have a lot of tracks, millions of tracks, that probably almost nobody listens to, but they need to be there and they need to be available for instant playback. And at the same time, latency is the one thing we care so much about.
So the idea is we want the latency to be so low that people forget that we're streaming from the cloud, or that you're streaming from the cloud. That was something that was already big 20 years ago when Spotify was founded, and it was something that also made the company successful from the start, and that kind of sets us apart. We know that latency is directly correlated with engagement and retention. So latency is extremely important. And that's where the importance of CDN comes in, because of course we serve things from CDN PoPs.
We wouldn't be able to achieve that level of latency without CDNs, public CDNs like Cloudflare. Our CDN delivers a couple of petabytes per day to all the customers at very low latency and around the globe. We wouldn't want to build out that network ourselves. So in the end that's really why I'm excited about CDN. That's why we are excited about CDN. That's why the importance of CDN for Spotify cannot be overstated. The other part of the picture is the back end. This was this one box on the slide before, but of course it's not a single entity.
It's much more complicated than that. If you zoom in, this is what the back end looks like. This is actually a real rendition of our call graph. Every dot or node is an endpoint, a microservice, and every edge is information flow, request flow. This is the most accurate architecture picture I could find of the Spotify back end. And yeah, in aggregate it almost looks like a living organism. It's a large back end, also in the scale of what we serve with it, right? We manage 24 million events per second, sustained, and quite a bit above 10 million requests per second, sustained, coming in from the internet. There's thousands of microservices, we run over one and a half million CPUs at baseline, and a lot of data pipelines, of course. The thing is, this is not static.
It changes over time. This is an interesting graph. It shows you the autoscaling of one of our microservices. I don't even know which one it is. It doesn't say. But this is a very typical pattern that you see in almost all services in our back end. Every graph and every color is a different region: Asia, Europe, US, for instance. And you see the peaks and the troughs. And this is really just people waking up and listening to music, and then people going to work or commuting and listening to music, and in the evening they turn off the music again.
So we see the same pattern across the globe, and you see them shifted because of course Asia wakes up first. Now, there's a few peaks in there as well. They're not that many, and it's fairly predictable, but there's a few patterns that are fairly difficult to solve. One of them we call the alarm clock challenge. Every morning people wake up with an alarm clock, usually at 6:00 a.m. or 6:30 or 7 or 7:30, but it's very predictable times, and then suddenly they want to listen to music.
Usually their alarm clock plays the music, right? If you have, let's say, 1% of our users listening to a specific song or specific playlist in the morning at a specific time, that's suddenly millions of users turning on Spotify and sending requests all at once. How does that look different from a DoS attack? It's really the same, right? We're talking dozens of millions of requests suddenly surging in one region. So that's an interesting problem to solve, engineering-wise. And that's where we get into the protection side of things and how we protect ourselves from DoS, because we don't want our infrastructure to fall over when we have an alarm clock event.
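A quick back-of-the-envelope calculation shows why an alarm-clock event resembles a DoS burst. The user count and the 1% share are the figures quoted in the talk; the number of requests per client start and the width of the arrival window are purely illustrative assumptions, and in practice the 1% would be spread over time zones rather than arriving at once.

```python
MONTHLY_ACTIVE_USERS = 750_000_000  # figure quoted in the talk
ALARM_SHARE_PERCENT = 1             # "let's say 1%" from the talk

# Assumptions for illustration only; the talk gives neither number.
REQUESTS_PER_CLIENT_START = 5       # hypothetical back-end calls per app start
ARRIVAL_WINDOW_SECONDS = 10         # hypothetical spread around the alarm time


def burst_rps(users: int, requests_per_user: int, window_seconds: float) -> float:
    """Average request rate if all users arrive within the given window."""
    return users * requests_per_user / window_seconds


# Integer arithmetic avoids floating-point rounding on the user count.
alarm_users = MONTHLY_ACTIVE_USERS * ALARM_SHARE_PERCENT // 100
total_requests = alarm_users * REQUESTS_PER_CLIENT_START
extra_rps = burst_rps(alarm_users, REQUESTS_PER_CLIENT_START, ARRIVAL_WINDOW_SECONDS)

print(f"{alarm_users:,} users, {total_requests:,} requests, {extra_rps:,.0f} extra req/s")
```

Under these assumptions the burst is tens of millions of requests in a few seconds, which is the order of magnitude mentioned in the talk and explains why legitimate wake-up traffic and attack traffic can look alike at the edge.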
The CDN doesn't have an issue with that. It can just absorb that excess traffic, but the back end might not be able to scale. At the same time, we do want to prevent real DoS attacks, so that's where we work closely with our vendors, like Cloudflare, to understand what's bad traffic, what's healthy traffic, and how we protect ourselves. Of course, apart from just serving media on our CDN, we also use CDN the more classical way. We have web apps and web pages, like the signup page or campaign pages, that use CDN just to cache the web artifacts as well.
And that is a little bit different, because there's a lot of teams building those smaller apps and they need CDN configured for their web apps. And that poses a different type of challenge of scale. It's more of an organizational scale: how do we centrally manage the many dozens or hundreds of teams that want to stand up a CDN? And that brings me to the other piece here, platform engineering. In the end, we pride ourselves that we want to empower teams to work autonomously. In the past, that worked fairly well.
And it still works fairly well. The problem is, though, if you do that and optimize for speed, you end up with dozens of implementations of a certain tech stack. Everybody has their own CDN setup, and maybe teams even have their own vendor engagements. So that doesn't really work at scale anymore. So fast forward a couple of years to what we do today: we still believe in autonomy, but we also need to figure out how we get alignment in place. So how do we do that? It's really finding the right balance.
We don't want to mandate what people use. We want to encourage them to use the right thing. So it's really about making the right choice the easy choice. It's more the carrot than the stick. And this is where the platform group comes in. My team is part of the platform group at Spotify. This is the purple box on top there. Basically we sit between vendors and our engineers, and we try to codify best patterns and defaults for everyone. Ultimately, finding the right balance between the self-service freedom that we can provide and keeping the cognitive load for everyone as low as possible requires that we in platform understand what our developers need and what their use cases are.
If you think about it, it's really product leadership, and to me the product approach to infrastructure is what platform engineering is. I would say it's an evolution from organizational types like DevOps or infrastructure teams, with a product layer on top. So that's where we operate. As I mentioned, we sit between vendors and the Spotify engineers. We use Google Cloud as our main cloud provider, except for CDNs. That's where we use other vendors like Cloudflare. We support almost a thousand different engineering teams, and we have between two and a half and three thousand engineers at Spotify who then build on top of the platform that we provide.
And we've been doing this for a while. Things like: how can we migrate technology and stacks across the fleet for everyone, centrally? How can we optimize cloud spend for everyone? Those are the levers that a platform team can create for an organization. And of course I do need to point out Backstage. That's our single pane of glass, the internal developer portal that we expose everything through. It was open sourced a few years ago by Spotify, and we also have a managed solution called Portal. So we use Backstage to funnel everybody through: whenever they want to engage with the platform, they do that through Backstage. In the past I had this slide where, let's say, a user wants to create a CDN for their web page: they go to Backstage and create it through a form. Nowadays that's probably not really true anymore, because they use Claude to create that. But it's still true that we have an MCP gateway connected to Backstage, and the skills that we provide use the right tools to create a CDN. Under the hood, they don't need to care if it's Cloudflare or somebody else. That's for us in platform to decide. So our engineers don't necessarily go in and use Cloudflare-specific configuration or some other vendor-specific configuration.
We set those templates and defaults, and they just tell us where they have their origin bucket, what location they want it to be deployed in, and what their endpoint and DNS records should look like. Then we take care of the rest and make sure that it works. You see it mentions golden path. This is a signal that we send: if you use this choice, you're on the protected or the supported track. If we ever change anything under the hood, we take care of that migration. People are still free to do something else, but hopefully this is the easiest way for people to use the platform.
Yeah. So, going back to CDN, what does the future hold? Of course we intend to grow more, and every incremental add of a couple of million every quarter, or 50 million or 100 million, does make things more interesting, right? Things suddenly start to break in different ways. That's also only really possible with CDN, but it will have an impact on our CDN footprint. Of course, latency still remains key and we will keep on focusing on latency. We will keep pushing Cloudflare to be even more efficient, to help us bring down the latency even more.
Really, to stay in that 250 millisecond latency budget. But there's also other things that we're looking into now. A big piece of the latency is not about how quickly the CDN can stream. It's really the round trip to the back end and back. So what if we didn't have to do that? What if we cached some of those things at the edge? We do a lot of caching in the back end, but even better, what if we didn't have any call into the back end?
This is fairly theoretical at the moment, and we're not really sure what that would look like and how we would manage to get there. But those are some ideas that we have, because then we could suddenly cut that whole branch of latency and make the experience even better, while at the same time also saving money. So there's some of those optimization opportunities that we're looking into together with our vendors and teams. But that is under the hood and will not be visible to customers, apart from it just being a lower-latency experience.
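The reasoning about the latency budget can be made concrete with a small model. Only the 250 ms budget comes from the talk; the individual component timings are invented for illustration, and the idea that an edge lookup could replace the back-end round trip is exactly the theoretical scenario the speaker describes, not an existing feature.

```python
LATENCY_BUDGET_MS = 250  # budget mentioned in the talk

# Hypothetical component timings, chosen only to illustrate the reasoning.
BACKEND_ROUND_TRIP_MS = 120  # client -> back end -> client (URL and key)
EDGE_LOOKUP_MS = 15          # same answer served from an edge cache instead
CDN_FIRST_BYTES_MS = 60      # fetching the first audio bytes from the CDN
DECRYPT_AND_DECODE_MS = 20   # client-side work before sound comes out


def time_to_first_audio(resolve_ms: int) -> int:
    """Sum the sequential steps between pressing play and hearing audio."""
    return resolve_ms + CDN_FIRST_BYTES_MS + DECRYPT_AND_DECODE_MS


via_backend = time_to_first_audio(BACKEND_ROUND_TRIP_MS)
via_edge = time_to_first_audio(EDGE_LOOKUP_MS)

for label, total in (("via back end", via_backend), ("via edge cache", via_edge)):
    headroom = LATENCY_BUDGET_MS - total
    print(f"{label}: {total} ms, headroom {headroom} ms")
```

Because the steps are sequential, the resolve step is paid in full before any audio byte moves, so shortening it helps regardless of how fast the CDN itself already is.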
Yeah. So really, how can we get rid of that ball of yarn, or make it simpler? What will be visible to customers, of course, is new features. We still develop new features. These are just a few examples from the last year. And some of those drive CDN usage more than others. Most notably, of course, lossless. We launched lossless playback late last year, I believe it was. And that of course massively increases the dependency on CDN, right? We need to push many more bytes to our users.
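To put "many more bytes" in perspective, the size of one track can be estimated from its bitrate. The bitrates below are illustrative assumptions rather than Spotify's actual encoding settings: 160 kbps stands in for a lossy stream, and 1,411 kbps is uncompressed CD-quality PCM (44.1 kHz, 16 bit, stereo), which a lossless codec would typically compress below.

```python
TRACK_SECONDS = 180  # a three-minute track, chosen for illustration

# Illustrative bitrates in kilobits per second; not Spotify's real settings.
LOSSY_KBPS = 160
CD_QUALITY_PCM_KBPS = 1411  # upper bound; lossless codecs usually land lower


def track_bytes(bitrate_kbps: int, seconds: int) -> int:
    """Bytes needed to stream `seconds` of audio at a constant bitrate."""
    return bitrate_kbps * 1000 * seconds // 8


lossy_bytes = track_bytes(LOSSY_KBPS, TRACK_SECONDS)
lossless_upper_bytes = track_bytes(CD_QUALITY_PCM_KBPS, TRACK_SECONDS)
ratio = lossless_upper_bytes / lossy_bytes

print(f"lossy: {lossy_bytes:,} B, lossless upper bound: {lossless_upper_bytes:,} B")
print(f"ratio: {ratio:.1f}x")
```

The ratio depends heavily on which lossy bitrate and which lossless codec are compared, so the result should be read as a rough scale, not a measurement.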
And I think the fact that we can launch something as technically complex as lossless without any hiccups, that is why we remain excited about CDN. Thank you.
All right. Thanks a lot, Matias. So, you know, I'm always tasking my team to go out and get to know your customers really well. And one of the things I'm asking for is that I want to understand the technical architecture setup of the customer. And I guess if Yuan comes back to me with that picture, I will be a little bit confused. So we have a couple of questions from the audience that are really good.
So let me just pick one. One of the questions was around lossless. When you started using lossless, how much did the traffic increase, and how many are using lossless in terms of percentage?
I don't think I can speak to how many are using it; that's not something that we publicly disclose. But when we did launch lossless, there were two things that had to be taken into account. First of all, when we launch it, we don't know how many people will actually use it.
Like, how big of a cohort of our premium users actually care? That's very hard to determine up front. So what we did instead was we kind of shadow launched it and tried to understand in which markets and how many people would actually start using it, because you have to go into the app and actually turn it on. And then, yes, how much bigger are the formats, or how many more bytes do we send? It depends on the codec, but it's almost an order of magnitude more bytes.
Cool. I have one last question.
I thought this was pretty funny, actually. So just for fun: imagine AI agents with emotional intelligence would like to listen to Spotify in the future. Are you ready for it?
I want to believe we are. We do now also have a public MCP integration with both ChatGPT and Claude. So we're ready for those agents. And then we have our own agents in the app as well. But I mean, we'll see. Times are a little bit uncertain, so it could blow up in any way.
We'll see. Awesome. Just hold on one second. I have a little bit of a gift for you. You can't leave the stage without some nice swag, right? So, huge thank you for joining us today and a great speech. Thanks a lot. Thank you. Yes.