In 1965, Gordon Moore made an observation that would go on to define modern computing. The number of transistors on a chip, he noted, was doubling every year, a pace he revised in 1975 to roughly every two years. This prediction, now known as Moore’s Law, held true for decades, fuelling an explosion of processing power and, inadvertently, a global addiction to data storage.
Back then, a megabyte was considered a reasonable chunk of data. Today, we measure storage in petabytes—a term so large it sounds like something you’d order at a dodgy kebab shop at 3 am (we’ve all been there).
But what does a petabyte actually mean? And how did we get here?
Moore’s Law has been stretched to its limits. We’re now at 5nm transistor manufacturing, meaning today’s chips are working at scales where individual atoms become a design problem. But when it comes to storage, things haven’t quite followed the same trajectory.
Unlike processors, hard drives don’t double in capacity every two years. Instead, storage growth has been slower, yet still relentless. A 1TB hard drive was a technological marvel when the first ones shipped in 2007. Now, you can pick up a 16TB drive on Amazon for the price of a decent meal out.
This has led to a dangerous assumption: storage is cheap. And so, data hoarding has become the corporate equivalent of that drawer full of old phone chargers and mystery cables—no one wants to throw anything away, just in case it turns out to be useful.
A petabyte (PB) is 1,024 terabytes (TB), or about 1 million gigabytes (GB).
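For anyone who wants to check the units, here is a minimal sketch of the conversion. It assumes the binary convention used above, where each step up is a factor of 1,024; the function name is purely illustrative.

```python
# Binary convention, as in the text: each unit is 1,024 of the one below it.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]

def convert(value, from_unit, to_unit):
    """Convert a storage size between units using factors of 1,024."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1024 ** steps

pb_in_tb = convert(1, "PB", "TB")   # terabytes in one petabyte
pb_in_gb = convert(1, "PB", "GB")   # gigabytes in one petabyte

print(f"1 PB = {pb_in_tb:,} TB = {pb_in_gb:,} GB")
```

The "about 1 million gigabytes" in the text is the rounded form of 1,048,576 GB.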
Some rough comparisons:
And yet, many companies now deal in petabytes of unstructured data without really knowing what’s in it.
Ask IT how big their data estate is, and the answer is usually:
"Hmmm… we don’t really know."
We spoke to one financial institution that estimated it had approximately 70PB of data spread across different silos. The key word here is estimated. They weren’t quite sure. It could have been more, it could have been less. It was all very vague.
70 petabytes. 70!
That’s roughly 35 trillion pages of documents. If you stacked that as printed A4 paper, it would reach the moon and back, multiple times. And yet, when it comes to GDPR, regulatory requests, or legal investigations, we expect companies to be able to search through all of that in a matter of days.
Good luck.
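To sanity-check the moon comparison, here is a rough sketch. It takes the 70PB and 35 trillion pages from above, and assumes a sheet of office paper is about 0.1 mm thick and the moon is about 384,400 km away on average. Those last two numbers are assumptions brought in for illustration, not figures from the institution.

```python
TOTAL_BYTES = 70 * 1024**5     # 70 PB, binary convention
PAGES = 35e12                  # ~35 trillion pages, from the estimate above
SHEET_THICKNESS_MM = 0.1       # assumed thickness of one sheet of office paper
MOON_DISTANCE_KM = 384_400     # assumed average Earth-moon distance

# Page size implied by the two figures in the text.
implied_bytes_per_page = TOTAL_BYTES / PAGES

# Height of the stack, converted from millimetres to kilometres.
stack_km = PAGES * SHEET_THICKNESS_MM / 1e6

# How many there-and-back trips to the moon that stack would cover.
round_trips = stack_km / (2 * MOON_DISTANCE_KM)

print(f"Implied page size: {implied_bytes_per_page:,.0f} bytes")
print(f"Stack height: {stack_km:,.0f} km")
print(f"Moon round trips: {round_trips:.1f}")
```

Under those assumptions the stack comes out at around 3.5 million km, or between four and five round trips, which fits "multiple times".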
Despite what cloud vendors might tell you, storage is not cheap at scale.
Storing 1PB of data isn’t just about buying some hard drives. You need:
A reasonable estimate? Expect to spend £500,000 – £1 million to own and maintain 1PB of high-availability storage over five years.
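To compare that estimate with a monthly cloud bill, it helps to turn it into a per-terabyte monthly figure. A minimal sketch, assuming the cost is spread evenly over the five years and that 1PB is 1,024TB as above (the function name is illustrative):

```python
TB_PER_PB = 1024
MONTHS = 5 * 12   # five-year ownership period from the estimate above

def cost_per_tb_month(total_gbp):
    """Spread a five-year total cost for 1 PB evenly per TB per month."""
    return total_gbp / TB_PER_PB / MONTHS

low = cost_per_tb_month(500_000)      # lower end of the estimate
high = cost_per_tb_month(1_000_000)   # upper end of the estimate

print(f"GBP {low:.2f} to GBP {high:.2f} per TB per month")
```

That works out to roughly £8 to £16 per terabyte per month, before any of the running costs are counted.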
Once you’ve bought the storage, you need to keep it alive:
Cloud storage might seem cheaper at first, but over time, costs creep up—especially if you need instant access to all that data. Cold storage is cheap. Instant retrieval? That’s where they get you.
The point is: at petabyte scale, there is no such thing as cheap storage.
If you really want to appreciate the absurdity of 1PB of data, consider this:
Now, let’s say a petabyte contains around 2 trillion documents (compressing for text storage efficiency). If you started reading today, non-stop, without sleep, it would take you approximately 9.5 billion years to get through everything.
For context:
Even if you hired a thousand people to work on it full-time, they’d still be working on it for over 9 million years. This is what we’re dealing with when regulators say: "We need you to produce all relevant documents from your data estate."
Sure. Right after we invent time travel, work out what’s really in a black hole and cure ageing.
You get the picture.
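For the curious, the arithmetic behind figures like these fits in a few lines. The document count comes from the text. The reading time per document is not stated anywhere above, so the value below is an assumption, chosen only so that the single-reader total lands near the 9.5 billion years quoted. Change it and the totals scale in proportion.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60

def reading_years(documents, minutes_per_document, readers=1):
    """Years needed to read every document non-stop, split evenly across readers."""
    total_minutes = documents * minutes_per_document
    return total_minutes / readers / MINUTES_PER_YEAR

DOCUMENTS = 2e12              # ~2 trillion documents in 1 PB, from the text
MINUTES_PER_DOCUMENT = 2500   # ASSUMPTION: picked to match the quoted total

solo = reading_years(DOCUMENTS, MINUTES_PER_DOCUMENT)
team = reading_years(DOCUMENTS, MINUTES_PER_DOCUMENT, readers=1000)

print(f"One reader: {solo:,.0f} years")
print(f"1,000 readers: {team:,.0f} years")
```

However the reading time is set, a thousand readers only ever divide the total by a thousand, which is why adding people barely dents the problem.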
Storage vendors tell us storage is cheap, but at petabyte scale, it just isn’t true. The cost isn’t just financial; it’s practical, regulatory, and existential. At some point, every company will have to answer the question: do we actually know what’s in our data estate? If the answer is no, then it’s only a matter of time before someone forces you to find out, whether it’s a regulator, a lawsuit, or a Freedom of Information request that suddenly requires you to search through petabytes of unknown, unstructured data.
If you’d rather get ahead of that problem before it happens, have a look at Lightning IQ—because manually reading a petabyte’s worth of documents isn’t an option.
Nick Pollard leads EMEA consulting for One Discovery. He is a seasoned leader with more than 20 years of experience working in real-time investigation, legal and compliance workflows across highly regulated environments like banking, energy and healthcare, as well as national security organizations. You can contact him at nick.pollard AT onediscovery.com