
Just How Big is a Petabyte? The Myth of Cheap Storage


We all store too much data—but at enterprise scale, how much is too much? And what’s the real cost?


In 1965, Gordon Moore made an observation that would go on to define modern computing. The number of transistors on a chip, he noted, seemed to double roughly every two years. This prediction—now known as Moore’s Law—held true for decades, fuelling an explosion of processing power and, inadvertently, a global addiction to data storage. 

Back then, a megabyte was considered a reasonable chunk of data. Today, we measure storage in petabytes—a term so large it sounds like something you’d order at a dodgy kebab shop at 3 am (we’ve all been there). 

But what does a petabyte actually mean? And how did we get here? 

Moore’s Law and the Storage Paradox 

Moore’s Law has been stretched to its limits. We’re now at 5nm transistor manufacturing, meaning today’s chips are working at scales where individual atoms become a design problem. But when it comes to storage, things haven’t quite followed the same trajectory. 

Unlike processors, hard drives don’t double in capacity every two years. Instead, storage growth has been slower—yet still relentless. A 1TB hard drive in 2005 was a technological marvel. Now, you can pick up a 16TB drive on Amazon for the price of a decent meal out. 

This has led to a dangerous assumption: storage is cheap. And so, data hoarding has become the corporate equivalent of that drawer full of old phone chargers and mystery cables—no one wants to throw anything away, just in case it turns out to be useful. 

Just How Big is a Petabyte? 

A petabyte (PB) is 1,024 terabytes (TB), or about 1 million gigabytes (GB). 

Some rough comparisons: 

  • 500 billion emails (assuming 2KB per email). 
  • 13.3 years of continuous HD video playback. 
  • All the books in the British Library—1,000 times over. 
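The comparisons above can be sanity-checked with a few lines of arithmetic. A rough sketch, using the decimal convention (1 PB = 10^15 bytes) and our own assumed figures: ~2 KB per email and a ~19 Mbit/s HD video stream.

```python
PB = 10 ** 15                        # a decimal petabyte, in bytes

# 500 billion emails at ~2 KB each (assumed average size)
emails = PB // 2_000

# Continuous HD playback at ~19 Mbit/s (an assumed bitrate)
bytes_per_second = 19e6 / 8
hd_years = PB / bytes_per_second / (3600 * 24 * 365)

print(f"{emails:,} emails")               # 500,000,000,000
print(f"{hd_years:.1f} years of HD video")  # about 13 years
```

Change the assumed email size or bitrate and the headline numbers shift accordingly—these are illustrations, not measurements.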

And yet, many companies now deal in petabytes of unstructured data without really knowing what’s in it. 

Ask IT how big their data estate is, and the answer is usually: 

"Hmmm… we don’t really know." 

We spoke to one financial institution that estimated it had approximately 70PB of data spread across different silos. The key word here is estimated. They weren’t quite sure. It could have been more, or less. It was all very vague. 

70 petabytes. 70!  

That’s roughly 35 trillion pages of documents. If you stacked that as printed A4 paper, it would reach the moon and back, multiple times. And yet, when it comes to GDPR, regulatory requests, or legal investigations, we expect companies to be able to search through all of that in a matter of days. 
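That stacked-paper claim holds up on the back of an envelope. A quick sketch, assuming ~2 KB of text per printed A4 page and ~0.1 mm per sheet of standard office paper:

```python
PB = 10 ** 15                        # decimal petabyte, in bytes
estate_bytes = 70 * PB               # the institution's estimated estate

pages = estate_bytes // 2_000        # ~2 KB of text per printed page
stack_km = pages * 0.1 / 1e6         # 0.1 mm per sheet, converted to km

moon_round_trip_km = 2 * 384_400     # average Earth-Moon distance, there and back
trips = stack_km / moon_round_trip_km

print(f"{pages:,} pages")            # 35,000,000,000,000
print(f"{trips:.1f} Moon round trips")
```

That works out to a stack of roughly 3.5 million kilometres—four or five round trips to the Moon, under these assumptions.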

Good luck. 

The Cost of a Petabyte (Because ‘Cheap Storage’ is a Lie) 

Despite what cloud vendors might tell you, storage is not cheap at scale. 

Capital Expenditure (CapEx) – Buying 1PB of Storage 

Storing 1PB of data isn’t just about buying some hard drives. You need: 

  • Enterprise-grade storage hardware (£250,000+ for high-performance setups). 
  • Networking & infrastructure (high-speed data pipes don’t come cheap). 
  • Redundancy & backups (because losing 1PB of data is career-ending). 

A reasonable estimate? Expect to spend £500,000 – £1 million to own and maintain 1PB of high-availability storage over five years. 

Operational Expenditure (OpEx) – Keeping 1PB Running 

Once you’ve bought the storage, you need to keep it alive: 

  • Power & Cooling – Data centres are expensive to run, and energy prices aren’t exactly dropping. 
  • Security & Compliance – GDPR fines are a thing. 
  • Personnel Costs – IT teams don’t work for free, and someone has to manage this mess. 

Cloud storage might seem cheaper at first, but over time, costs creep up—especially if you need instant access to all that data. Cold storage is cheap. Instant retrieval? That’s where they get you. 

The point is: at petabyte scale, there is no such thing as cheap storage. 

The Human Cost: How Long Would It Take to Read a Petabyte? 

If you really want to appreciate the absurdity of 1PB of data, consider this: 

  • The average person skim reads at 200 words per minute. 
  • A standard document is about 500 words long. 
  • So, one document takes ~2.5 minutes to read. 

Now, let’s say a petabyte contains around 2 trillion documents (assuming heavily compressed text storage). If you started reading today, non-stop, without sleep, it would take you approximately 9.5 million years to get through everything. 

For context: 

  • Modern humans have existed for roughly 300,000 years. 
  • The last common ancestor of humans and chimpanzees lived around 7 million years ago. 
  • You would finish reading long after our species, as we know it, had come and gone. 

Even if you hired a thousand people to work on it full-time, they’d still be at it for roughly 9,500 years—longer than all of recorded human history. This is what we’re dealing with when regulators say: "We need you to produce all relevant documents from your data estate." 

Sure. Right after we crack time travel, work out what’s really inside a black hole, and cure ageing. 

You get the picture.  
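Running the numbers from the assumptions above (2 trillion documents, 500 words each, read at 200 words per minute) as a quick sketch:

```python
docs = 2 * 10 ** 12                  # ~2 trillion documents per petabyte (assumed)
minutes_per_doc = 500 / 200          # 500 words at 200 words per minute

total_minutes = docs * minutes_per_doc
years_alone = total_minutes / (60 * 24 * 365)
years_with_1000_readers = years_alone / 1_000

print(f"One reader: {years_alone:,.0f} years")                     # ~9.5 million
print(f"A thousand readers: {years_with_1000_readers:,.0f} years")  # ~9,500
```

Tweak the document size or reading speed and the total moves by an order of magnitude either way; the conclusion—nobody is reading this by hand—does not.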

Conclusion: The Madness of Petabyte-Scale Data Hoarding 

Storage vendors tell us data is cheap, but at petabyte scale, it just isn't true. The cost isn’t just financial—it’s practical, regulatory, and existential. At some point, every company will have to answer the question: do we actually know what’s in our data estate? If the answer is no, then it’s only a matter of time before someone forces you to find out—whether it’s a regulator, a lawsuit, or a Freedom of Information request that suddenly requires you to search through petabytes of unknown, unstructured data. 

If you’d rather get ahead of that problem before it happens, have a look at Lightning IQ—because manually reading a petabyte’s worth of documents isn’t an option. 

Nick Pollard leads EMEA consulting for One Discovery. He is a seasoned leader with more than 20 years of experience working in real-time investigation, legal and compliance workflows across highly regulated environments like banking, energy and healthcare, as well as national security organisations. You can contact him at nick.pollard AT onediscovery.com
