Daniel sent us this one, and it's basically three questions wrapped in a kind of quiet astonishment at the sheer physics of it. YouTube has something like five hundred hours of video uploaded every minute. It all has to be available for streaming instantly, anywhere in the world, which means it's stored hot. People almost never delete anything. So how does Google actually store this stuff and scale it? Has anyone even estimated the total? And the third question is maybe the wildest one: how can Google offer what amounts to unlimited uploads for free, when that's virtually unheard of anywhere else in cloud storage? There is a lot to unpack here.
There really is, and I want to start with the storage infrastructure itself because the way most people imagine it is completely wrong. When you upload a video to YouTube, it does not land on a hard drive in a rack somewhere with your name on it. It gets shredded.
Shredded, distributed, and stored across a global fleet of machines using something Google calls Colossus, which is the successor to the Google File System. Colossus is the foundational storage layer underneath virtually everything Google runs. And the key insight here is that it is not a file system in the way you and I think about files and folders. It is an object storage system that presents a file system interface when needed. Under the hood, your video is broken into chunks, those chunks are replicated across multiple machines, multiple racks, often multiple data centers, and the system tracks where every chunk lives through a metadata layer that is itself distributed and redundant.
There's no one drive anywhere that contains, say, the complete video of that guy who built a backyard roller coaster.
Not even close. And this is where it gets elegant. Colossus is designed to treat hardware failure as the normal state of affairs. Disks die constantly. At Google's scale, they lose thousands of drives a year, probably thousands a month. The system doesn't care. When a disk fails, Colossus notices that a chunk is under-replicated and immediately starts making new copies onto healthy hardware, all in the background, with no human intervention and no service interruption. The video streams just fine because there are always at least two other copies available.
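As a rough illustration of the self-healing idea described above, here is a minimal sketch, not Colossus's actual mechanism. It assumes a target of three copies per chunk (the default mentioned later in the conversation), and every name in it (`place_chunks`, `heal_after_failure`, the `disk-N` labels) is hypothetical.

```python
import random

REPLICATION_TARGET = 3  # assumed default; the real value is tunable per data class


def place_chunks(chunk_ids, disks, rng):
    """Assign each chunk to REPLICATION_TARGET distinct disks."""
    return {c: set(rng.sample(disks, REPLICATION_TARGET)) for c in chunk_ids}


def heal_after_failure(placement, failed_disk, healthy_disks, rng):
    """Remove the failed disk from every chunk, then copy under-replicated
    chunks onto healthy disks until each chunk is back at the target."""
    new_copies = 0
    for holders in placement.values():
        holders.discard(failed_disk)
        while len(holders) < REPLICATION_TARGET:
            candidates = [d for d in healthy_disks if d not in holders]
            holders.add(rng.choice(candidates))
            new_copies += 1
    return new_copies


rng = random.Random(0)  # seeded so the sketch is repeatable
disks = [f"disk-{i}" for i in range(10)]
placement = place_chunks([f"chunk-{i}" for i in range(100)], disks, rng)

failed = "disk-3"
affected = sum(1 for holders in placement.values() if failed in holders)
healthy = [d for d in disks if d != failed]
new_copies = heal_after_failure(placement, failed, healthy, rng)
print(f"{affected} chunks lost a replica, {new_copies} new copies made")
```

While the repair runs, every affected chunk still has two live copies, which is why a single disk failure never interrupts a stream in this model.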
It's not just redundancy. It's automated self-healing at a scale where human intervention would be physically impossible.
And this architecture also solves the scaling problem. You don't provision storage for individual videos. You just keep adding more machines to the Colossus cluster. The software handles distribution, replication, and load balancing. When YouTube needs more capacity, Google adds more nodes to the underlying storage fleet. The growth is linear in hardware but invisible to the application layer.
There's a term I've seen in some of the technical papers on this, "disaggregated storage," where compute and storage are separate resources that scale independently. Is that what's happening here?
Yes, and this is one of the most important architectural decisions Google made. In a traditional setup, each server has its own attached storage. You want more storage, you add more servers, which also adds more compute you might not need. In a disaggregated model, the storage nodes are just storage. The compute nodes that handle video transcoding and serving are separate. You can scale storage without buying CPUs you don't need, and vice versa. For YouTube, where storage grows relentlessly but compute demand fluctuates with viewership patterns, this is essential economics.
Which brings us to the second question. How much data are we actually talking about? Has anyone put a number on it?
This is where things get murky because Google stopped publishing YouTube storage figures a long time ago. The last reasonably grounded estimate I could track down came from a combination of analyst reports and some back of the envelope math by storage researchers. The consensus among people who study this puts YouTube's total stored video somewhere in the range of one to two exabytes as of a few years ago. Given the upload rate has only accelerated, I would not be surprised if we are now looking at three to five exabytes.
One exabyte is a billion gigabytes, for anyone who just felt their brain briefly leave their body.
Yes, and to put five exabytes in perspective, that's roughly the total digital storage capacity of the entire world in the late nineteen nineties. YouTube alone may now hold more data than the entire planet did when I was in medical school.
That figure you're giving, that's just the original uploads? Not the transcoded versions?
Ah, excellent catch. Treat that figure as the original uploads only, because what YouTube stores is not just the original uploads. This is one of the hidden multipliers that makes YouTube's storage problem so much bigger than it appears. When you upload a video, YouTube does not store just your file and serve it. It transcodes that video into a whole family of resolutions and codecs. A single upload might generate versions at one forty four p, two forty p, three sixty p, four eighty p, seven twenty p, ten eighty p, fourteen forty p, twenty one sixty p, which is four K, and sometimes eight K. And it's not just resolution. It's different video codecs, VP nine, AV one, H dot two sixty four, and different bitrates within each resolution, because someone on a shaky mobile connection in rural India needs a different version than someone on gigabit fiber in Seoul.
One upload becomes what, twenty or thirty distinct files?
For a high resolution upload, it could be more. And all of those versions need to be stored and available for streaming. So the storage multiplier effect from transcoding is enormous. Some estimates suggest the total stored data including all transcoded variants could be three to five times the size of the original uploads alone.
Which means if the original uploads are five exabytes, the actual storage footprint might be fifteen to twenty five exabytes.
That's before we talk about replication for durability. Remember, Colossus stores multiple copies of every chunk. Typically three copies by default, though Google can tune this per data classification. So now you are looking at a raw storage footprint that could be forty five to seventy five exabytes just for YouTube. And that's a working estimate, not a confirmed number. Google does not confirm anything about their storage capacity publicly.
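The chain of multipliers in this estimate is easy to lose track of, so here is the arithmetic written out. All inputs are the speakers' working guesses from the conversation (five exabytes of originals, a three to five times transcoding multiplier, three replicas), not confirmed figures.

```python
originals_eb = 5                # guessed size of original uploads, in exabytes
transcode_multiplier = (3, 5)   # low and high estimate for all transcoded variants
replication_factor = 3          # copies kept of every chunk

# Logical footprint: originals plus every transcoded variant.
logical_eb = tuple(originals_eb * m for m in transcode_multiplier)

# Raw footprint: logical data times the number of replicas.
raw_eb = tuple(x * replication_factor for x in logical_eb)

print(f"logical: {logical_eb[0]} to {logical_eb[1]} EB")
print(f"raw:     {raw_eb[0]} to {raw_eb[1]} EB")
```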
Of course they don't. Infrastructure scale is treated as a trade secret.
It's a competitive advantage. If you know exactly how much storage Google has deployed, you can reverse engineer their cost structure. So they keep it vague. What we do know, from occasional comments by Google engineers at conferences and from research papers, is that the overall Google storage fleet across all products, Search, Gmail, Photos, Drive, YouTube, everything, runs into the hundreds of exabytes. Some estimates put it north of a zettabyte now.
That is a trillion gigabytes. I'm just gonna sit with that for a second.
It's the kind of number where the prefixes stop feeling like real words. But here's the thing. The storage is only half the challenge. The real magic is serving it. All of that data has to be available for streaming with low latency to billions of devices around the world. And that's where YouTube's content delivery infrastructure comes in, which is a whole separate layer from the storage layer.
Right, because nobody is streaming directly from a Colossus node in a data center in Iowa if they're watching in Jakarta.
YouTube uses Google's global CDN, the content delivery network, which is arguably the largest private network on the planet. Google has peering agreements with thousands of ISPs worldwide and has deployed caching servers, they call them Google Global Cache nodes, deep inside ISP networks. So when a video goes viral in, say, Brazil, the most popular chunks of that video get cached on servers physically located inside Brazilian ISPs. The first person to watch it might pull data from a Google data center. The next ten thousand people are pulling it from a server that might be a few miles away.
The hot storage concept the prompt mentions, that's layered. The canonical copy lives in Colossus, but the actual streaming traffic is served from a distributed cache that puts the hot data as close to the users as physically possible.
The cache is predictive. YouTube's algorithms know which videos are likely to be popular in which regions before the demand spikes. They pre-warm caches based on viewership patterns, trending signals, even the uploader's subscriber distribution. If a major creator with a huge Brazilian audience uploads a new video at noon Eastern time, YouTube starts pushing that video to Brazilian cache nodes immediately, before a single Brazilian viewer has clicked play.
It's the storage equivalent of Amazon pre-shipping products to warehouses near where they think you'll order them.
And this is where the economic model gets interesting, which ties directly into the third part of the prompt. How can Google offer what is effectively unlimited free uploads?
Because they're not selling storage. They're selling attention.
And not just attention, targeted attention. YouTube's business model is advertising. The more video there is on the platform, the more watch time they can serve, the more ads they can show. Storage is a cost of goods sold, not the product. And the economics per gigabyte are so favorable that the math works even for videos that get almost no views.
Let's put some numbers behind that, because I think this is where the intuition breaks for most people. If I go to AWS and try to store a terabyte of data and serve it to millions of people, I will get a bill that makes me question my life choices. How is Google's cost per gigabyte so much lower?
Several compounding factors. First, Google builds its own hardware. They design their own storage servers, their own networking equipment, their own custom ASICs for video transcoding. They are not paying a margin to Dell or Cisco or anyone else. Second, they buy in volumes that are incomprehensible. When you are one of the largest purchasers of hard drives and flash storage on the planet, your per unit cost is a fraction of what anyone else pays. Third, they operate their own fiber backbone, which means they are not paying transit costs to move data between data centers the way a normal cloud customer would.
The marginal cost of storing one more hour of video, for Google, is approaching zero.
It is not zero, but it is astonishingly low. Some analysts have estimated that Google's all-in cost for storage, including hardware, power, cooling, and operational overhead, is something in the range of one to two cents per gigabyte per year. Compare that to consumer cloud storage where you might pay two to five dollars per gigabyte per year. That is a difference of roughly two orders of magnitude.
For a ten minute video at reasonable quality, we're talking what, a few hundred megabytes? So a few tenths of a cent per year to store it?
And if that video generates even a few dozen views over its lifetime, the ad revenue covers the storage cost many times over. The long tail is profitable because the storage cost is so close to zero. This is also why YouTube can afford to keep videos that have literally never been watched. The cost of deletion might actually be higher than the cost of just keeping them.
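To make the per-video figure concrete, here is the back of the envelope calculation. It assumes a ten minute video of about 300 megabytes (one reading of "a few hundred megabytes") and the one to two cents per gigabyte per year estimate quoted above; both are rough inputs, not measured values.

```python
video_gb = 0.3                     # assumed size of a ten minute video
cost_cents_per_gb_year = (1, 2)    # analyst estimate quoted in the conversation

# Yearly cost of one stored copy, in cents (low and high estimate).
yearly_cents = tuple(video_gb * c for c in cost_cents_per_gb_year)

# Cost of keeping that copy for a decade, still in cents.
decade_cents = tuple(10 * c for c in yearly_cents)

print(f"per year:   {yearly_cents[0]:.1f} to {yearly_cents[1]:.1f} cents")
print(f"per decade: {decade_cents[0]:.1f} to {decade_cents[1]:.1f} cents")
```

Even held for ten years, a single copy of the video costs a few cents to store under these assumptions.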
That is a genuinely wild sentence. The cost of deciding what to delete, and actually doing it, might exceed the cost of indefinite storage.
It's a real phenomenon in large scale systems. Garbage collection is expensive. You have to identify what to delete, verify that no one still needs it, handle edge cases like videos embedded in external sites, manage legal holds, and then actually reclaim the storage, which in a distributed system with replication and caching is non trivial. Unless storage pressure forces the issue, the economically rational thing is often to just let it sit there.
There's also a strategic angle here. Every video kept is training data.
YouTube's recommendation algorithms, its automatic captioning systems, its content moderation models, all of them are trained on the corpus of uploaded video. The more data they have, the better those models get. And in the AI landscape we're in now, with video generation models and multimodal systems becoming central, having the world's largest repository of labeled, captioned, engagement scored video data is an asset whose value is almost impossible to quantify.
The "free unlimited uploads" offer is not charity. It's a data acquisition strategy that pays for itself through ads, and then pays again through model training, and then pays a third time through ecosystem lock-in. Once you upload your life's work to YouTube, you're not going anywhere.
The terms of service reflect this. When you upload to YouTube, you grant them a worldwide, non exclusive, royalty free license to use, reproduce, distribute, and create derivative works from your content. You retain ownership, but they retain essentially unlimited operational rights. This is vastly different from something like Google Drive, where the terms are much more restrictive on what Google can do with your files. YouTube's terms are designed for a platform where the content is the product.
That distinction between YouTube and Google Drive is actually a perfect illustration of the business model difference. Google Drive gives you fifteen gigabytes free and then charges you for more. YouTube gives you effectively unlimited storage. Same company, same underlying infrastructure, completely different economics because the content on Drive is private and can't be monetized, while the content on YouTube is public and can.
Even within Drive, Google is using the same Colossus infrastructure. They're just metering it differently because the business model is different. The infrastructure doesn't know or care whether a chunk of data belongs to a YouTube video or a Google Doc. It's all just bytes in Colossus. The business logic sits in a layer above.
Which brings me back to something you mentioned earlier about transcoding. You said Google builds custom ASICs for video transcoding. What does that hardware actually do, and why is it important for the storage equation?
This is one of those places where the hardware and the storage architecture intersect in a way that most people never see. When a video is uploaded, it has to be transcoded into all those formats and resolutions I mentioned. That is an enormously compute intensive process. If you did it on general purpose CPUs, the cost per upload would be significant. So Google developed custom chips, they call them Argos video coding units, that are purpose built for video transcoding. These chips can transcode video at many times the speed of a general purpose processor while using a fraction of the power.
So the upload pipeline is: video comes in, hits a transcoding cluster full of custom silicon, gets turned into twenty or thirty variants, and then those variants get chunked and distributed into Colossus.
The variants themselves are chosen dynamically based on what makes engineering and economic sense. YouTube has moved toward using more efficient codecs like AV one for popular videos because the storage savings from better compression justify the additional compute cost of transcoding. For a video that will be watched millions of times, spending extra compute to shrink the file by thirty percent saves enormous amounts of storage and bandwidth over the video's lifetime. For a video that will be watched three times, they might use a faster, less efficient codec because the compute cost of AV one isn't worth it.
There's a cost optimization function running against every upload that decides what formats to generate based on predicted popularity.
And it's continuously re-evaluating. If a video suddenly goes viral six months after upload, YouTube might go back and generate higher quality transcodes or additional codec variants because the math has changed. The system is not static. It's constantly rebalancing storage against compute against predicted viewership.
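The trade-off described here, spending more compute up front to save storage and bandwidth later, can be sketched as a small cost function. This is an illustration only: the `Codec` type, the two example codecs, and every cost figure are hypothetical placeholders in arbitrary units, chosen so the efficient codec is thirty percent smaller, matching the example in the conversation.

```python
from dataclasses import dataclass


@dataclass
class Codec:
    name: str
    encode_cost: float  # one-time compute cost per video (arbitrary units)
    size_gb: float      # size of the encoded file


def lifetime_cost(codec, predicted_views, storage_cost_per_gb, bandwidth_cost_per_gb):
    """One-time encode cost, plus storage, plus bandwidth for every view."""
    return (
        codec.encode_cost
        + codec.size_gb * storage_cost_per_gb
        + codec.size_gb * bandwidth_cost_per_gb * predicted_views
    )


def pick_codec(codecs, predicted_views, storage_cost_per_gb=0.02, bandwidth_cost_per_gb=0.001):
    """Choose the codec with the lowest lifetime cost for the predicted views."""
    return min(
        codecs,
        key=lambda c: lifetime_cost(c, predicted_views, storage_cost_per_gb, bandwidth_cost_per_gb),
    )


FAST = Codec("fast", encode_cost=0.01, size_gb=0.30)
EFFICIENT = Codec("efficient", encode_cost=0.50, size_gb=0.21)  # 30% smaller file

print(pick_codec([FAST, EFFICIENT], predicted_views=3).name)
print(pick_codec([FAST, EFFICIENT], predicted_views=1_000_000).name)
```

Re-running `pick_codec` whenever the view prediction changes is one simple way to model the continuous re-evaluation: a video that goes viral months later crosses the break-even point and the efficient encode becomes worth paying for.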
This is the part that I think separates Google's infrastructure from what most people imagine when they hear "video hosting." It's not a giant hard drive in the sky. It's a living, breathing, self-optimizing organism that makes millions of micro-decisions per second about where to put bits and how to process them.
It runs on what is effectively a planetary scale operating system. Google's internal infrastructure, the Borg cluster manager which inspired Kubernetes, Colossus for storage, the global CDN, the custom networking stack, all of it works as one integrated system. YouTube is just an application running on top of that system. The same infrastructure serves Search and Gmail and Maps. YouTube happens to be the most storage hungry tenant.
Let's talk about the physical layer for a moment, because I think there's a misconception that this is all in the cloud, meaning it's somehow ethereal. There are actual buildings full of actual machines. Where are they?
Google operates data centers on every inhabited continent except Africa, though they are building there too. The major YouTube serving locations are clustered near population centers. There are massive data centers in the American Midwest and Southeast, in Ireland, the Netherlands, Finland, Singapore, Taiwan, Chile. And those are just the big ones. The Google Global Cache nodes I mentioned, those are deployed in thousands of locations inside ISP networks and internet exchange points. There might be a Google cache server in a building down the street from you right now.
All of those locations have to deal with power, cooling, physical security, connectivity. The logistics alone are staggering.
Google has gotten extraordinarily good at data center design. Their newer facilities use machine learning to optimize cooling in real time. DeepMind, their AI division, famously reduced the energy used for data center cooling by up to forty percent by training a model on sensor data and letting it control the cooling systems. When you are operating at this scale, a one percent efficiency improvement is worth tens of millions of dollars a year.
Which loops back to the economic question. Every efficiency gain widens the gap between what it costs Google to store a video and what it would cost anyone else. That gap is a moat.
It's a compounding moat. As Google gets better at infrastructure efficiency, the cost per gigabyte keeps falling, which means the ad revenue needed to cover storage keeps falling, which means more of the long tail becomes profitable, which means more content stays on the platform, which attracts more viewers, which attracts more advertisers. It's a flywheel.
The flywheel is hard for competitors to replicate because you can't just buy your way into it. You have to build it over decades.
This is why YouTube has no serious competitor in user generated video. Vimeo exists but targets a different market. TikTok is a different format and a different consumption model. Dailymotion is still around, barely. Nobody else offers unlimited free uploads with global distribution because nobody else has the integrated infrastructure stack to make the economics work. You either have the Google infrastructure behind you or you don't.
There's something else I want to dig into on the storage architecture side. You mentioned that Colossus is an object store, not a traditional file system. What does that actually mean for how data is organized and accessed?
In a traditional file system, you have a hierarchical directory structure. Slash home slash user slash videos slash cat underscore fails dot mp four. The file system maintains a tree structure that maps that path to the physical blocks on disk where the data lives. This works fine at small scale, but at YouTube scale, the directory tree becomes a bottleneck. You can't have a single namespace with billions of files and expect path lookups to be fast.
What replaces it?
A flat namespace with unique identifiers. Each object, each chunk of a video, gets a unique key, essentially a very long random number. There's no directory structure to traverse. You present the key, the system returns the data. It's like a valet parking system. You hand over a ticket, they fetch your car. You don't need to know which floor it's on or which spot. The mapping from key to physical location is handled by a metadata service that is itself distributed across hundreds or thousands of machines.
This is why you can scale it horizontally. If you need more storage, you add more nodes to the cluster and the metadata service just has a larger pool of locations to map keys to. No reorganization, no rebalancing of directory trees.
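The valet ticket idea can be shown in a few lines. This is a toy sketch under stated assumptions, not Google's metadata design: `MetadataService`, its hash-based sharding, and the node names are all hypothetical, and a real system would need a sharding scheme that tolerates changes in shard count.

```python
import hashlib
import uuid


class MetadataService:
    """Maps opaque keys to chunk locations, with the map split across shards."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def _shard_for(self, key):
        # Hash the key to pick a shard, so no shard holds the whole namespace.
        digest = hashlib.sha256(key.encode()).digest()
        return self.shards[int.from_bytes(digest[:8], "big") % len(self.shards)]

    def put(self, locations):
        """Register a chunk and return its ticket: a random 128-bit key."""
        key = uuid.uuid4().hex
        self._shard_for(key)[key] = list(locations)
        return key

    def lookup(self, key):
        """Present the ticket, get back where the chunk lives."""
        return self._shard_for(key)[key]


meta = MetadataService(shard_count=16)
ticket = meta.put(["node-a", "node-b", "node-c"])
print(ticket, "->", meta.lookup(ticket))
```

Note that the key carries no path and no hint about location. Everything the caller knows is the ticket; the placement is entirely the metadata layer's business.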
Yes, and the metadata layer is the real secret sauce. Google's metadata system for Colossus has to handle trillions of objects and serve lookups in microseconds. It uses a combination of in memory caching, predictive prefetching, and careful sharding to distribute the load. When a viewer requests a video, the system has to locate every chunk of that video, across potentially dozens of physical machines, assemble them in order, and start streaming, all in less time than it takes for the video player to buffer.
It's doing this millions of times per second.
YouTube serves over a billion hours of video per day. That's over a hundred thousand years of video watched every single day. The number of individual chunk retrievals happening per second is almost certainly in the billions.
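The conversion from hours to years is quick to check, taking the billion hours per day figure as given:

```python
hours_watched_per_day = 1_000_000_000
hours_per_year = 24 * 365

years_watched_per_day = hours_watched_per_day / hours_per_year
print(f"{years_watched_per_day:,.0f} years of video watched per day")
```

That comes out a little above one hundred fourteen thousand years, consistent with "over a hundred thousand."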
That number is so large it stops meaning anything.
It's the kind of scale where you stop thinking in terms of individual operations and start thinking in terms of statistical distributions. The system is not designed to never fail. It is designed to fail at a rate that is statistically invisible to any individual user. If one chunk retrieval out of every million fails and gets retried transparently, the user never notices. The engineering is all about managing failure probabilities, not eliminating failures.
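A small calculation shows why transparent retries make failures statistically invisible. It assumes each attempt fails independently with the same probability, which is a simplification, and it uses the one in a million rate from the example above; `failure_after_retries` is a hypothetical helper.

```python
def failure_after_retries(p_single, retries):
    """Probability that the first attempt and every retry all fail,
    assuming attempts are independent."""
    return p_single ** (retries + 1)


p = 1e-6  # one failed chunk retrieval per million, as in the example
for retries in range(3):
    print(f"{retries} retries: user-visible failure rate {failure_after_retries(p, retries):.0e}")
```

With a single retry the visible failure rate drops from one in a million to one in a trillion, so even at billions of retrievals per second the failures a viewer could notice become rare events.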
That's a profound shift in how you think about reliability. Not "this will never break," but "this will break constantly in ways you will never perceive."
It's the same philosophy that led Google to design their early servers without cases, just bare motherboards sitting on shelves. When a component fails, you don't troubleshoot it. You pull the whole machine and replace it. The software handles the rest.
Let's circle back to the unlimited uploads question, because I think there's a specific angle the prompt is getting at that we haven't fully addressed. The terms of service say you can upload as much as you want. There's no published cap. But there is a practical cap, right? You can't upload a petabyte of video.
There are rate limits and abuse detection systems, but they are not published and they are not framed as storage quotas. YouTube will let you upload an enormous amount of video as long as it passes content moderation and doesn't trigger their automated abuse heuristics. There are channels with tens of thousands of videos. Some automated content farms have uploaded hundreds of thousands of videos before getting caught.
The limit is behavioral, not quantitative. They don't care how much you upload as long as you're not a bot or a bad actor.
As long as the content is video. This is the key distinction the prompt is noticing. You can't upload unlimited arbitrary files to Google Drive for free. You can upload unlimited video files to YouTube. The format constraint is what makes the economics work. Video can be compressed efficiently with modern codecs. Video can be monetized with ads. Video attracts viewers who watch more video. Arbitrary files in Drive have none of those properties.
It's the difference between giving away free samples of your product and giving away free warehouse space.
YouTube is giving away storage for the thing it wants more of, which is video content that drives engagement. Drive is selling storage for the thing you want to keep private. Completely different value propositions even though underneath it's the same Colossus bytes.
I want to ask about one more technical dimension that I think is underappreciated. You mentioned that Colossus replicates data for durability. But YouTube also has to deal with data that becomes less popular over time. Is there a tiered storage strategy where old, unwatched videos get moved to colder, cheaper storage?
This is a natural question and the answer is a bit counterintuitive. YouTube does not appear to use traditional cold storage in the way you might expect, where data gets moved to tape or low power drives after a certain age. The reason is that "cold" on YouTube doesn't mean "never accessed." Even a ten year old video with three views a month still needs to be available for streaming on demand. The latency requirements of video streaming mean you can't archive it to tape and spin it up when someone clicks.
Everything is effectively warm at minimum.
Everything is warm. What Google does instead is optimize the placement and replication strategy. Videos that are accessed frequently get more replicas in more locations. Videos that are accessed rarely might drop down to the minimum replication level, which is still enough to guarantee durability and availability, but doesn't consume as much cache space on edge nodes. The tiering is about cache warmth and replica count, not about moving data to fundamentally different storage media.
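One way to picture tiering by replica count rather than by storage medium is a policy that maps access rate to a target number of copies. The thresholds, the cap, and the function name below are invented for illustration; the conversation only establishes that rarely watched videos sit at a minimum replication level and popular ones get more copies.

```python
MIN_REPLICAS = 3    # assumed floor, enough for durability and availability
MAX_REPLICAS = 12   # hypothetical cap


def target_replicas(views_per_day):
    """Add one replica for each tenfold step in daily views from 1,000 up."""
    replicas = MIN_REPLICAS
    threshold = 1_000
    while views_per_day >= threshold and replicas < MAX_REPLICAS:
        replicas += 1
        threshold *= 10
    return replicas


for views in (3, 50_000, 1_000_000):
    print(f"{views:>9} views/day -> {target_replicas(views)} replicas")
```

A video that jumps from three views a day to a million never changes storage medium in this model. Its target simply rises and more copies get made.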
Which means the cost differential between popular and unpopular videos is mostly about bandwidth and cache occupancy, not about the underlying storage medium.
The hard drives storing the canonical copies are all roughly equivalent. The cost optimization happens at the caching and distribution layer. This is a different architecture than something like Amazon S three Glacier, where data literally moves to different hardware optimized for cold storage.
That makes sense when you think about YouTube's access patterns. A video can go from zero views a day to millions overnight if it gets picked up by an algorithm or referenced somewhere. The system needs to be able to serve that spike without waiting for data to be retrieved from cold storage.
The unpredictability of viral content means you can't afford the retrieval latency of true cold storage. The entire system is designed around the assumption that any video could become hot at any moment. The cost of keeping everything warm is lower than the cost of missing a viral moment.
To synthesize what we've covered: YouTube's storage architecture is an object based, globally distributed, self healing system built on Colossus. The total stored data is probably in the tens of exabytes when you account for transcoding and replication. The economics work because Google builds its own hardware, operates its own network, and monetizes the content through advertising and training data. And the "unlimited" uploads are made possible by the fact that video, specifically, is a format that generates revenue, not just cost.
That is a clean summary. I would add one more layer, which is that the entire system is continuously optimizing itself. Transcoding choices, replica placement, cache warming, all of it is driven by machine learning models that are making predictions about what viewers will want to watch and where they will be when they want to watch it. The storage architecture is not just about storing bits. It is about storing the right bits in the right places at the right times to maximize the probability that when someone clicks play, the video starts instantly.
The storage system is also a prediction system.
At this scale, they are the same thing. You cannot separate the storage decision from the prediction decision. Every chunk placement is a bet on future demand.
The bets pay off often enough that the whole thing is wildly profitable.
YouTube generated something like forty billion dollars in ad revenue last year. That's more than the entire GDP of many countries. The storage costs, while enormous in absolute terms, are a fraction of that revenue. The business is not storage constrained. It never has been.
Which is why the question "how can they offer unlimited uploads" has a simple answer and a complex answer. The simple answer is ads. The complex answer is everything we just talked about.
The complex answer is, I think, more interesting.
It always is.
Here's a thought I keep coming back to. YouTube is arguably the largest intentional collection of human culture ever assembled. Every upload, from a professionally produced documentary to someone's toddler taking their first steps, is part of a corpus that future historians will study. And the reason it exists at this scale is not because someone decided to build a library. It's because someone figured out how to make the storage economics work.
The Library of Alexandria, funded by targeted advertising.
I mean, unironically, yes. The advertising model is what pays for the preservation. And the preservation is almost a side effect. Google did not set out to archive human culture. They set out to sell ads, and archiving human culture turned out to be a necessary input to that business.
There's something almost accidentally noble about it. The profit motive produced a public good that no government or institution could have funded at this scale.
It continues to grow. Five hundred hours of video every minute. Most of it will never be watched by more than a handful of people. But it's all there, preserved in a distributed, self healing, exabyte scale storage system that will outlast the hard drives it's stored on, because the system doesn't care about individual hardware failures. The data survives the machines.
That's a good place to leave it, I think. Before we go, I believe we have a fun fact.
And now: Hilbert's daily fun fact.
Hilbert: In nineteen twenty three, a Kamchatka beekeeper nearly triggered a brief trade war between Japan and the Soviet Union when his fireweed honey, mistakenly labeled as a shipment of experimental botanical extracts, was intercepted by Japanese customs officials who suspected it was a chemical weapon precursor.
The spice trade but stickier.
This has been My Weird Prompts. Our producer is Hilbert Flumingtop. You can find every episode at myweirdprompts dot com. If you enjoyed this, leave us a review wherever you listen, it helps.
See you next time.