Preservation or Deletion: Archiving and Accessing the Dataverse - New Report by John Monroe of Furthur Market Research
John Monroe, a long-time storage industry expert and Gartner analyst who is now an independent consultant with his own company (Furthur Market Research), recently published a new report entitled “Preservation or Deletion: Archiving and Accessing the Dataverse”. The report is a follow-up to John’s initial report, “The Escalating Challenge of Preserving Enterprise Data”, and is co-sponsored by Fujifilm, IBM and Twist Bioscience. It looks at the likely growth rates of new enterprise capacity shipments required to store the ever-expanding “dataverse” and to manage the swelling installed base of enterprise-grade SSD, HDD and tape media from 2023 to 2030. The findings and conclusions in John’s report clearly suggest that the status quo in storage strategies is not sustainable. Below are summaries and excerpts from the report, along with a link to view or download the full report.
Relentless Growth of the Dataverse
John provides a forecast for SSD, HDD and tape capacity shipments and the growing installed base from 2023 to 2030. With a CAGR of 30.7%, new shipments of enterprise storage capacity will reach 1.74 ZB in 2023 (up from 0.95 ZB in 2020) and exceed 11.0 ZB in 2030. Meanwhile, the active installed base of enterprise storage will grow from 6.4 ZB in 2023 to 35.7 ZB in 2030. In a worst-case 25% CAGR scenario, new shipments of enterprise storage capacity will grow to 8.0 ZB while the active installed base expands to 26 ZB in 2030.
However, those forecasts could change dramatically if a plausible growth rate of 35% or even 45% materializes. At a 35% CAGR, we would see new capacity shipments of 14.7 ZB with an active installed base of 45 ZB in 2030. (Note: it takes 50 million 20 TB HDDs, or 22 million LTO-9 tapes at 45 TB compressed capacity, to store a single zettabyte.)
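The arithmetic behind these figures can be checked with a short sketch. It assumes decimal (SI) units and simple annual compounding from the 2023 shipment figure of 1.74 ZB; the function names are illustrative and the report's own model is not described here. Under that assumption the 30.7% case comes out above 11 ZB, while the 25% and 35% cases land near, but not exactly on, the 8.0 ZB and 14.7 ZB quoted above, so the report presumably uses a somewhat different basis for those scenarios.

```python
# Decimal (SI) units, as used for drive and tape capacities:
# 1 ZB = 10^21 bytes and 1 TB = 10^12 bytes, so 1 ZB = 10^9 TB.
TB_PER_ZB = 10**9


def units_per_zettabyte(capacity_tb: float) -> float:
    """Number of media units of the given capacity needed to hold 1 ZB."""
    return TB_PER_ZB / capacity_tb


def compound(start: float, cagr: float, years: int) -> float:
    """Value of `start` after growing at `cagr` per year for `years` years.

    Assumption: plain annual compounding, not the report's actual model.
    """
    return start * (1 + cagr) ** years


if __name__ == "__main__":
    print(f"20 TB HDDs per ZB: {units_per_zettabyte(20) / 1e6:.1f} million")
    print(f"45 TB LTO-9 tapes per ZB: {units_per_zettabyte(45) / 1e6:.1f} million")
    for cagr in (0.25, 0.307, 0.35):
        shipped_zb = compound(1.74, cagr, 2030 - 2023)
        print(f"{cagr:.1%} CAGR: about {shipped_zb:.1f} ZB shipped in 2030")
```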
Evolving Data Temperatures
John also provides a breakdown of data temperatures, depicted as a classic pyramid with Hot data at the top, followed by Warm, Cool, Cold and finally Frozen layers. By 2030, the Cold and Frozen layers will be the largest segment at 61% of stored data. This follows largely from the answer to the question implied by the report’s title: will we preserve or delete our data? In John’s surveys of end users across different vertical markets, almost all of the IT managers he spoke with specified “indefinite” retention periods for the vast majority of their data, even if frequency of access declined to seldom, if ever. We will be storing and maintaining an ever-increasing amount of enterprise data that has aged for more than five years.
Massive Revenue Opportunity Ahead
With a majority of data being stored long term in Cold and Frozen layers that require a lower cost per GB and more energy-efficient technologies, John conservatively estimates that revenue for enterprise storage devices in these tiers will range from $8.8 B to $15.7 B in 2030, up from $5.1 B in 2023. This bodes well for new generations of tape and for emerging technologies like DNA storage, which could change the current trend of storing so much of this type of data on expensive and energy-intensive SSDs and HDDs.
The report goes on to show that maintaining the installed base of SSDs and HDDs between 2020 and 2025 would consume over 15,000 megawatts of power, while the tape installed base for the same period would consume just 18 megawatts, an 838x difference. In John’s own words:
“It is obvious that HDDs and perhaps a significant number of SSDs are handling far too much of the Cold/Frozen workloads at far too great a cost/GB while consuming an inordinate share of available energy”.
Limited HDD and SSD Production Capabilities
Because HDD makers have fiscal concerns about investing unprofitably in future CAPEX in the face of uncertain demand and growing SSD incursions, John fears the HDD industry will not invest enough to deliver ~5 ZB, much less ~8 ZB, of enterprise-grade media per year from 2028 to 2030. Given the recent precipitous price erosion (the price of raw NAND dropped by more than 70% during 2H22), along with the inevitability of future supply/demand imbalances and the attendant price fluctuations, John also has growing doubts that the NAND industry will spend the necessary hundreds of billions of dollars to deliver ~1 ZB, much less ~2-3 ZB, of enterprise-grade SSD storage capacity per year from 2028 to 2030. Yet even new shipments of ~6-10 ZB of expensive, enterprise-grade SSD and HDD media may be insufficient to meet global demand in 2030.
Resurgence in Tape Shipments
The report goes on to say that with limited SSD and HDD production capabilities looming and the increasing need for cost-effective and sustainable storage, the demand trend for new generations of tape, DNA data storage and even optical technologies may be altered drastically. Regarding tape specifically and considering recent hyperscale market adoption, the report suggests:
“There will be a resurgence in tape shipments for a variety of reasons based on expanding demand on multiple fronts, relative data temperature and time-to-data needs based on access frequency, and lower costs of data retention and power consumption, as well as limited HDD and SSD production capabilities. Tape could well grow to at least two zettabytes delivered by 2030”.
The data centers of the future will need everything the SSD, HDD and tape industries can manufacture and deliver, as well as requiring new DNA and perhaps other enterprise storage technologies. Availability and sustainability challenges, combined with the costs of managing the dataverse over increasingly lengthy time periods, will create new use cases for existing storage technologies and demand the creation of new, more cost-effective, and power-efficient storage technologies.
To read the full report: