By Chris Kehoe, Head of Infrastructure Engineering, FUJIFILM Recording Media U.S.A., Inc.
Object storage has many benefits. Near infinite capacity combined with good metadata capabilities and low cost have propelled it beyond its initial use cases of archiving and backup. More recently, it is being deployed as an aid to compute processing at the edge, in analytics, machine learning, disaster recovery, and regulatory compliance. However, one recent paper perhaps got a little over-enthusiastic in claiming that disk-based object storage provided an adequate safeguard against the threat of ransomware.
The basic idea proposed is that ransomware protection is achieved by keeping multiple copies of object data. If the object store suffers a ransomware incursion, the backup is there for recovery purposes. The flaw in this logic, however, is that no technology that is online can be considered immune to ransomware. Unless it is the work of an insider, any attempt at hacking must enter via online resources. Any digital file or asset that is online – whether it is stored in a NAS filer, a SAN array, or on object storage – is open to attack.
Keeping multiple copies of object storage is certainly a wise strategy and does offer a certain level of protection. But if those objects are online on disk, a persistent connection exists that can be compromised. Even in cases where spin-down disk is deployed, there still remains an automated electronic connection. As soon as a data request is made, therefore, the data is online and potentially exposed to the nefarious actions of cybercriminals.
Ransomware statistics can be frightening! Research studies suggest that over two million ransomware incidents occurred in 2019 with 60% of organizations surveyed experiencing a ransomware attack in the past year. To make matters worse, the cybercriminals have moved up the food chain. Two thirds of those attacked said the incident cost them $100,000 to $500,000. Another 20% said the price tag exceeded half a million. Overall, the losses are measured in billions of dollars per year. And it’s getting worse. Enterprise Strategy Group (ESG) reports that about half of all organizations have seen a rise in cyber attacks since the recent upsurge in people working from home.
Understandably, this is a big concern to the FBI. It has issued alerts about the dangers of ransomware. One of its primary recommendations to CEOs is the importance of backup with the following key questions:
“Do you backup all critical information? Are backups stored offline? Have you tested your ability to revert to backups during an incident?”
The key word in that line of questioning is “offline.” Hackers have gotten good at staging their attacks slowly over time. They infiltrate a system, quietly ensuring that backups are infected as well as operational systems. When ready, they encrypt the files and announce to the company that they are locked out of their files until the ransom is paid. Any attempt to recover data from disk or the cloud fails as the backup files are infected, too.
The answer is to make tape part of the 3-2-1 system: Three separate copies of data, stored on at least two different storage media with one copy off-site. This might mean, for example, one copy retained on onsite disk, another in the cloud, and one on tape; or one on onsite disk, one on onsite tape as well as tape copies stored offsite.
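The 3-2-1 rule above, together with the FBI's question about offline backups, can be written as a simple check. The following is only an illustrative sketch: the `BackupCopy` class, its field names, and the example plan are assumptions made for this example and do not come from any backup product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical model for illustration)."""
    name: str
    media: str     # e.g. "disk", "cloud", "tape"
    offsite: bool  # stored away from the primary site
    offline: bool  # no persistent network connection (air gap)


def check_3_2_1(copies):
    """Return each rule and whether the set of copies satisfies it."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media": len({c.media for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
        # Not part of the classic 3-2-1 rule; added for the offline question.
        "one_offline": any(c.offline for c in copies),
    }


# Example plan: onsite disk, cloud, and a vaulted tape copy.
plan = [
    BackupCopy("primary-array", "disk", offsite=False, offline=False),
    BackupCopy("cloud-bucket", "cloud", offsite=True, offline=False),
    BackupCopy("vaulted-tape", "tape", offsite=True, offline=True),
]
result = check_3_2_1(plan)
print(result)
```

Dropping the tape copy from this plan leaves every remaining copy online, so the `one_offline` check fails, which is the gap the article describes.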
Modern tape libraries are part of an overall data management lifecycle strategy and offer many benefits, including lower cost, energy savings, increased security and long-term shelf life.
We’re excited to partner with Spectra Logic and Iron Mountain on this new Storage Switzerland eBook: Reintroducing Tape to the Modern Data Center. The first chapter debunks some of the common myths of tape storage around reliability, access and operations. Read more about it here.
Stay tuned over the next few weeks as we reveal the next four chapters covering topics such as disaster recovery and backup, performance, cost, and offsite storage.
Previously, Storage Switzerland blogged about the merits of employing a tape storage hierarchy to cut backup storage costs. Tape media can also add value as a tier in the broader disaster recovery strategy.
As Lead Analyst George Crump explained in a recent video, applications are not all created equal when it comes to recovery time objectives (RTOs, the amount of time that it takes to get an application back up and running following an outage).
Check out George’s blog for more details and to view the video:
I often hear from customers that are sitting on scores of legacy tapes with unknown contents beyond a generic “business data” level, and 99+ percent of them are not known at a granular level. As we all know too well, disaster recovery backups morphed into unintentional data archiving these past 10 – 15 years thanks to litigation and government regulatory investigations, along with general business obligations to retain certain records. The duty to preserve has forced businesses to preserve backup tapes if at least one file on the tape might be under some form of preservation obligation. The IT staff almost never has the equipment or human resources to perform targeted restores of data under preservation and stack it together with other similar data, so they take the easy way out: buy more tape and retain existing tapes vs. overwriting their contents. Companies change backup software providers and migrate to newer backup platforms and get stuck paying maintenance and support for software and hardware they no longer use, but might one day.
An additional problem lies in the fact that companies are waking up and realizing that while tape as a storage mechanism is a great value, the real estate and costs associated with parking and retaining them in mass quantities can add up. In response, companies like Seagate and TapeArk offer to move large volumes of data into the cloud, but does this provide value to the customer? Why pay to migrate thousands of tapes to the cloud on the chance that you might one day need to access them?
I came across a neat solution to this problem from a service provider/software developer named SullivanStrickler out of Atlanta. They recognized the gap between the status quo and the cloud and created TRACS/TDF and TRACS/TSF. TRACS stands for Tape Restoration and Cataloging System, TDF for Tape Duplicate File and TSF for Tape Session File. TDF and TSF files are both file containers that hold data from legacy backup tapes, regardless of the source tape type and backup software format. They give customers a catalog of the contents of the tape and the ability to immediately restore the contents of what was once a backup tape and is now a TDF or TSF file, and/or to stack and store the TDF/TSF files onto newer, higher capacity media using LTFS or some other backup software.
The economics of tape stacking have been explored for years, but the exercise provided little ROI until 6.0 TB LTO-7 tapes arrived. Three factors now combine to justify exploring a stacking/migration effort: the reduced storage cost of replacing 60 LTO-1 (100 GB) tapes with one LTO-7 tape, the value of discovering the contents of long-forgotten backups, and never having to pay licensing and support fees for technologies you no longer use.
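The 60-to-1 figure follows directly from the native capacities quoted above. As a rough sketch of the arithmetic, the helper below estimates how many target tapes a stacking effort would need. The function name and the `fill_ratio` parameter (the average fraction of each legacy tape that actually holds data) are assumptions introduced for this example. It ignores compression and container overhead.

```python
import math

LTO1_GB = 100   # native capacity of an LTO-1 tape, as quoted in the text
LTO7_GB = 6000  # native capacity of an LTO-7 tape (6.0 TB), as quoted in the text


def tapes_after_stacking(source_tapes, fill_ratio=1.0,
                         source_gb=LTO1_GB, target_gb=LTO7_GB):
    """Estimate the number of target tapes needed to hold the source data.

    fill_ratio is an assumed average fraction of each source tape in use.
    """
    if not 0 < fill_ratio <= 1:
        raise ValueError("fill_ratio must be in the range (0, 1]")
    data_gb = source_tapes * source_gb * fill_ratio
    return math.ceil(data_gb / target_gb)


print(tapes_after_stacking(60))         # 60 full LTO-1 tapes fit on one LTO-7
print(tapes_after_stacking(6000, 0.5))  # half-full legacy tapes stack further
```

Because legacy tapes are rarely full, the real consolidation ratio is often better than the nominal 60 to 1, which is what the `fill_ratio` assumption is meant to show.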
Some customers ask, “But if I am going to undertake this effort, why do I need to migrate everything instead of only what I need to keep?” This is a very valid question, and is a good segue into the differences between TRACS/TDF and TRACS/TSF files.
TDF, or Tape Duplicate File, is a byte-for-byte copy of the source tape, with a catalog of the tape contents appended to the file. Anything from a single file to every file can be restored from a TDF file, and as a bonus the conversion process is reversible. This means that customers who convert from tape to TDF format can later write the data back out to tape so that it can once again be used by the backup software that originally created the tape, should there ever be a need.
TSF, or Tape Session File, differs slightly from a TDF file. Whereas a TDF file is a duplicate copy of an entire tape in one logical volume container, a TSF file is an individual logical session container from a tape. A TSF file can be created for one backup session, up to all of the backup sessions on the tape. TSF files are exciting because of the business value they provide. TDF files provide great value due to the stacking and cataloging elements, but TSF files allow users to pick and choose which backup sessions to retain and which can be deleted. If a company’s preservation requirements are such that they need to retain all backups of their email system and their file servers, but not their domain controllers, print servers, departmental databases, etc., then TSF files allow them to do this by breaking up the “if I need one file I need to keep the entire tape” limitation. This process results in an even larger business value than TDF through the reduction in risk associated with retaining data which need not be retained, and since not all sessions will be retained by customers, the reduction in data volume is multiplied.
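The session-level retention decision described above can be pictured as a filter over a tape catalog. The sketch below is purely illustrative and is not the TRACS software or its file format: the `BackupSession` class, its fields, and the sample catalog are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupSession:
    """One backup session listed in a tape catalog (hypothetical model)."""
    tape_id: str
    source_system: str  # e.g. "email", "file-server", "domain-controller"
    size_gb: float


def split_sessions(sessions, systems_to_keep):
    """Split sessions into those under a retention duty and the rest."""
    keep = [s for s in sessions if s.source_system in systems_to_keep]
    drop = [s for s in sessions if s.source_system not in systems_to_keep]
    return keep, drop


# Made-up catalog covering two legacy tapes.
catalog = [
    BackupSession("T0001", "email", 40.0),
    BackupSession("T0001", "domain-controller", 15.0),
    BackupSession("T0001", "print-server", 5.0),
    BackupSession("T0002", "file-server", 60.0),
    BackupSession("T0002", "departmental-db", 30.0),
]

keep, drop = split_sessions(catalog, {"email", "file-server"})
retained_gb = sum(s.size_gb for s in keep)
released_gb = sum(s.size_gb for s in drop)
print(f"retain {retained_gb} GB, release {released_gb} GB")
```

With whole-tape retention, both tapes in this example would have to be kept in full. Filtering at the session level keeps only the email and file server sessions.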
Additionally, with one eye on the growing number of state, national and international regulations concerning data privacy and information governance, such as the EU’s General Data Protection Regulation (GDPR) or the California Consumer Privacy Act (CCPA), TSF allows for the defensible deletion of files stored within backups without impacting the remaining backed-up files. This type of targeted deletion of data originating from tape is unusual, and it is all performed without restoring the data from a single tape.
Of course there are other solutions but I like the simplicity and logic of TRACS/TDF and TRACS/TSF. Certainly it’s more practical and affordable than what Seagate and TapeArk propose!
According to the Information Storage Industry Consortium, the total data rate for tape is improving by 22.5% per year. One concept that is driving this throughput increase in the tape industry is RAIT (Redundant Arrays of Independent Tape). RAIT is ideal for large files that need massive amounts of throughput, such as in a disaster recovery scenario where you need the ability to move your whole data center electronically to another location.
In this video, Fred Moore of Horison Information Strategies explains how RAIT works.