The origin of the phrase “any port in a storm” is not very clear or well documented, but its meaning is. When you are facing some difficulty, you take whatever workable solution is available rather than wait for a perfect one.
Such may have been the case at the outset of the great COVID pandemic of 2019 (and 2020, 2021, 2022, and 2023 too). When it came to dealing with the challenges of storing ever-increasing volumes of valuable data and the new phenomenon of work-from-home, the cloud looked like a pretty fast and safe port in the COVID storm, and in some cases, it was and still is.
Prior to COVID-19, the term “cloud repatriation” appeared often in the press as it turned out that cloud was not a panacea for everything. But COVID understandably created short-term tactical storage strategies often resulting in a flight to the cloud.
However, the notion of cloud repatriation seems to be making a comeback, perhaps as a result of the prevailing conditions of economic uncertainty, geopolitical tension, inflation, tightening budgets, cyber threats, and other factors. But it’s no longer a question of cloud vs. on-premises. Savvy IT managers have the option of a more strategic hybrid cloud approach where the best of public cloud plus on-premises infrastructure provides maximum flexibility and value.
It’s hard to deny the well-known benefits and popularity of the $500 B cloud services market. The cloud is easy and fast to set up with pay-as-you-go pricing, offering maximum flexibility, scalability, and ease of access. Capex and Opex tied to owning on-premises infrastructure, and to the IT support teams that manage it, can be avoided. Cloud storage services span a broad spectrum including backup and disaster recovery, active archive, and deep, cold archival storage.
One significant reason for cloud repatriation in favor of a hybrid cloud strategy is cost. The value of data is increasing as organizations seek to be more “data-driven” in their decisions and strategy. AI analytics can be applied to derive value even from older data sets, and therefore users want to keep more data for longer periods of time. Often that means indefinitely, or forever. But how does that get done on flat or tightening IT budgets? When considering where to store all this data, an unpleasant reality set in on the way to all-cloud deployments: it ended up costing more than anticipated, in many cases due to a lack of oversight and internal cloud management controls. In fact, a Gartner study shows that 70% of respondents without a cost optimization strategy said they overspent their cloud budgets in 2022, while a recent 451 Research study suggests that 54% of respondents have moved some applications and data back to their own data centers or co-locations. As bottom-line pressure builds and other crises subside, it’s only natural that business leaders are re-evaluating strategies and continuing to fine-tune and optimize data storage options, figuring out what should be in the cloud and what is best stored on-premises.
When it comes to cloud costs, retrieval of stored data typically generates egress fees that can easily add up over time and as the frequency of retrieval increases. For data-intensive customers that tap into older data sets with some regularity, such as federal government labs, universities, research institutions, and other high performance computing (HPC) entities, the volume and reuse of data simply mean that on-premises infrastructure is more economical and potentially lower in latency than internet-connected cloud. Often this on-premises infrastructure takes the form of an active archive environment where data is managed according to user-defined policies and frequency of access. Data is moved between hot and cold storage tiers to optimize user accessibility and total cost of ownership (TCO).
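To see how egress fees accumulate with retrieval frequency, here is a minimal Python sketch. It assumes a data set of constant size and a flat per-GB fee; the function name, its parameters, and the fee value in the usage lines are hypothetical placeholders, not prices quoted by any provider or by this article.

```python
def cumulative_egress_cost(stored_gb, monthly_retrieval_fraction, fee_per_gb, months):
    """Total egress fees over `months` for a data set of constant size.

    Assumptions (not from the article): a flat per-GB fee and the same
    fraction of the data set retrieved every month.
    """
    if not 0 <= monthly_retrieval_fraction <= 1:
        raise ValueError("monthly_retrieval_fraction must be between 0 and 1")
    if months < 0 or stored_gb < 0 or fee_per_gb < 0:
        raise ValueError("stored_gb, fee_per_gb and months must be non-negative")
    retrieved_gb_per_month = stored_gb * monthly_retrieval_fraction
    return retrieved_gb_per_month * fee_per_gb * months


# Hypothetical placeholder fee; substitute the rate from your own contract.
HYPOTHETICAL_FEE_PER_GB = 0.05

one_year = cumulative_egress_cost(
    stored_gb=1_000_000,            # 1 PB expressed in decimal GB
    monthly_retrieval_fraction=0.01,
    fee_per_gb=HYPOTHETICAL_FEE_PER_GB,
    months=12,
)
print(f"Egress fees over 12 months: {one_year:,.2f}")
```

Because the cost is linear in both the retrieval fraction and the number of months, doubling how often data is pulled back doubles the bill, which is why frequent reuse of older data sets weighs so heavily in the comparison.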
Working with industry TCO expert Brad Johns, Fujifilm has tracked TCO for many years now, comparing on-premises automated tape systems to economy disk and deep archival storage in the cloud. The chart below shows all three options for 2017, 2020 and 2023. It’s interesting to note that while cloud actually got cheaper than on-premises HDD over time, tape has maintained and even improved its cost effectiveness, costing 74% less than cloud and 81% less than HDD in 2023. The scenario below assumes 20 PB initially and 30% annual growth over five years, with 12% of the data retrieved from the cloud each year (1% per month).
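The scenario's volumes can be worked through with a short Python sketch. It uses only the figures stated above (20 PB initial, 30% annual growth, five years, 1% retrieved per month) and assumes, for illustration, that growth is applied once at the end of each year and that retrieval is measured against the capacity held during that year. The function name is hypothetical, and the sketch models volumes only, not the costs behind the chart.

```python
def project_capacity(initial_pb=20.0, annual_growth=0.30, years=5,
                     monthly_retrieval=0.01):
    """Return a list of (year, stored_pb, retrieved_pb) tuples.

    Assumption: growth compounds once per year, after that year's
    retrievals have been counted.
    """
    rows = []
    capacity = initial_pb
    for year in range(1, years + 1):
        retrieved = capacity * monthly_retrieval * 12  # 1% per month = 12% per year
        rows.append((year, capacity, retrieved))
        capacity *= 1 + annual_growth
    return rows


for year, stored, retrieved in project_capacity():
    print(f"Year {year}: {stored:7.3f} PB stored, {retrieved:6.3f} PB retrieved")
```

Under these assumptions the archive grows from 20 PB in year one to roughly 57 PB in year five, so the volume retrieved each year nearly triples even though the retrieval rate stays fixed.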
Another reason for cloud repatriation in favor of a hybrid cloud strategy is cyber security. No doubt the cloud service providers have invested heavily in cybersecurity measures and have hired the best talent to make their cloud environments as cyber secure as possible.
But ultimately the responsibility for data that has been compromised in the cloud, and for any ransomware demands, falls on the data owner and not the cloud service provider. Companies with mission-critical or highly sensitive data may find on-premises data protection and cyber security measures more effective. With automated on-premises tape systems, data tapes can be easily copied and moved offsite as part of a 3-2-1 defense strategy with a physical air gap to prevent unauthorized access to that data. Anecdotally, certain financial organizations will simply not use cloud storage for fear of being hacked!
Sustainability has recently shot up the list of concerns for C-suite executives. The cost of energy is increasing and stakeholders are demanding ESG initiatives including a reduction of carbon footprint in light of the devastating consequences of global warming that we have all witnessed. Once again, as with cyber security, there’s no doubt the major CSPs have done a great job building out the most energy-efficient data centers on the planet.
But when it comes to energy consumption and CO2 profiles for the storage systems commonly used for data protection and long-term retention, like HDDs or tape, studies show that tape consumes 87% less energy and produces 97% fewer CO2 equivalents than the same capacity of HDD storage. So if data owners are storing cold and inactive data in the cloud, they should ask if it is being protected and retained by the CSP on eco-friendly storage systems like modern LTO tape.
Some CSPs clearly leverage the low cost, low energy consuming and low CO2 profile of tape systems, while others won’t say or clearly don’t. Regarding CO2 associated with a given storage tier within the cloud, is that the CSP’s concern or the data owner’s? Whose CO2 is it? Most stakeholders will probably agree: it belongs to the data owner. So it’s the data owner’s responsibility to make sure that their data protection and retention strategy, whether done on-premises or in the cloud, takes advantage of the most eco-friendly type of storage possible given the required service level needs.
If data is cold and infrequently accessed, as most data is after just 90 days in its lifecycle, why keep it on high-performance, energy-intensive storage infrastructure? Leveraging automated tape offerings in the cloud or deploying automated tape systems on-premises maximizes cost and CO2 reduction and can improve data security.
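A policy of this kind can be sketched in a few lines of Python. The 90-day threshold comes from the observation above; the function name, the "hot" and "cold" labels as return values, and the choice to key the decision on last access date are illustrative assumptions, not a description of any particular archive product.

```python
from datetime import date, timedelta

# Threshold taken from the article's 90-day observation; tune per workload.
COLD_AFTER_DAYS = 90


def choose_tier(last_accessed, today, cold_after_days=COLD_AFTER_DAYS):
    """Return "cold" if the data has gone unaccessed for more than
    `cold_after_days` days, otherwise "hot". (Hypothetical policy rule.)
    """
    idle_days = (today - last_accessed).days
    return "cold" if idle_days > cold_after_days else "hot"


today = date(2024, 1, 1)
print(choose_tier(today - timedelta(days=30), today))   # recently used data
print(choose_tier(today - timedelta(days=120), today))  # long-idle data
```

A real active archive would apply a rule like this per file or object and then move the data between tiers; the sketch only shows the classification step.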
Data sovereignty has to do with the extent to which data is subject to the laws of a country, regardless of where it is stored. Then there is data privacy and making sure that sensitive personal data is carefully protected. The cloud itself raises data sovereignty issues due to the dispersed nature of its data centers. If cloud users aren’t careful, their cloud deployments could extend into different regions with different data sovereignty and data privacy laws. Cloud providers are responsible for providing data storage services and infrastructure, but as with cyber security, users are ultimately responsible for their data’s protection and ensuring that it complies with data sovereignty and data privacy laws. This may be most effectively accomplished in tightly controlled on-premises solutions.
Perhaps the best indicator of the need for a balanced approach between hosting workloads and applications on-premises and in the cloud is what the CSPs themselves are offering. For a number of years now, Microsoft Azure, AWS and Google have expanded their service offerings to integrate on-premises resources with public cloud services via Azure Stack, AWS Outposts and Google Anthos. These hybrid cloud platforms provide a common solution for both on-premises and cloud-based environments. They also provide centralized monitoring and management tools to coordinate hybrid cloud workloads, regardless of whether they run on-premises or in the public cloud. This approach might be particularly appealing for organizations that want the best of both worlds.
While the value of the cloud is undeniable with its flexibility and scalability, so is the value of on-premises infrastructure with its cost, sustainability and security advantages. As it turns out, the best port in a storm might just be home port!