Zhang Yang, Lu Xin-hao
Anqing First People’s Hospital Information Center, Anqing 246004, Anhui Province, China
Abstract: “Data backup” is an effective way to guard against risk in information systems. It consolidates and copies all or part of the data from the hard disks or arrays of an application host to other storage media. This paper analyzes the advantages and disadvantages of common backup media, including hard disk-based storage, tape libraries, Blu-ray optical disc libraries, and standalone discs, mobile storage and offline disk backup devices, using SWOT analysis (strength, weakness, opportunity and threat). In addition, the ideas of offline backup and of backup “deliverables” are proposed.
Key words: data backup; media; storage; tape; blu-ray; deliverables
If the world had a medicine for regret, “data backup” would undoubtedly be the IT industry’s best version of it. Baidu Encyclopedia defines “data backup” as “the basis of disaster recovery. It refers to the process of integrating and copying all or part of the data from the application host’s hard disk or arrays to other storage media, so as to prevent data loss caused by system errors or system failures. Traditional data backup mainly adopts internal or external tape drives for cold backup. However, this method can only prevent man-made failures such as operational errors, and the recovery time is relatively long”.
Read literally, this definition carries five implications: first, backup and disaster recovery are two distinct concepts; second, even where disaster recovery exists, a backup is still required; third, traditional backup copies the data to another physical location; fourth, backup requires a storage medium; fifth, backup carries a time cost.
Hardware computing and storage architectures are growing steadily more complex as application software develops. Disk array-based FC SAN and IP SAN storage has evolved from single-node to dual-node to active-active configurations, and from snapshots and volume copies to continuous data protection (CDP). New technologies keep emerging to deliver disaster tolerance under high-availability requirements, yet “data backup” has remained the persistent guardian behind even the most powerful hardware architecture. A comparison of disaster recovery and backup shows that disaster recovery focuses on service continuity, while backup focuses on data security; disaster recovery preserves data integrity, while backup can only restore the data as it stood when the backup was taken; disaster recovery is an online process, while backup is an offline process; and recovery time is short for disaster recovery but long for backup[1]. Disaster recovery aims to keep the information system operating normally when a disaster occurs, while backup aims to remedy the data loss that the disaster causes. The ultimate goal of both is to cope with “soft” disasters such as human error, software faults and virus intrusion, as well as “hard” disasters such as hardware failures and natural disasters.
Since May 2017, ransomware on an unprecedented scale has swept through 150 countries and regions, affecting a wide range of industries including public transportation, postal services, communications, automotive manufacturing and medical services. The virus reaches storage terminals through multiple propagation paths and systematically encrypts the files on affected storage devices using various algorithms, and victims must pay a steep ransom to obtain the decryption key. In the face of such risks, one characteristic of “data backup” becomes particularly prominent: “traditional backup copies data to other physical locations”. The storage media that “data backup” depends on are the topic of this paper. We examine these media using SWOT analysis (Strength, Weakness, Opportunity, Threat) in order to evaluate comprehensively the application risks of the various backup media and devices[2]. With respect to the characteristics of “data backup”, this paper considers only “static backup” taken at a specific point in time. Although such a backup lags behind the live data and covers only the data that existed before it was taken, “static backup” may be the lifeline when a major disaster or risk materializes.
Regarding the implementation of “static backup”, the authors propose the following dimensions:
1. Local backup: back up files to a designated area of a local hard disk, a storage network or network storage, including network-based offsite backup;
2. Offline backup: back up files to a storage medium that is separated from the hardware device;
3. Active backup: back up files to a rewritable storage medium so that they can be updated and modified;
4. Dead backup: back up files to a non-rewritable storage medium to prevent erroneous deletion and tampering.
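One way to read the four dimensions above is as two independent axes: where the copy lives (local or offline) and whether it can still be changed (active or dead). The following minimal sketch, under that assumption, tags each backup copy on both axes and flags the copies that are both network-reachable and overwritable. The class names, method names and sample labels are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Location(Enum):
    LOCAL = "local"      # local disk, storage network, or network-based offsite copy
    OFFLINE = "offline"  # medium physically separated from the hardware device


class Mutability(Enum):
    ACTIVE = "active"  # rewritable medium, the copy can be updated
    DEAD = "dead"      # non-rewritable medium, the copy is frozen


@dataclass(frozen=True)
class BackupCopy:
    label: str
    location: Location
    mutability: Mutability

    def reachable_over_network(self) -> bool:
        # Assumption: every "local" copy is reachable from the production network.
        return self.location is Location.LOCAL

    def can_be_overwritten(self) -> bool:
        return self.mutability is Mutability.ACTIVE


# Hypothetical inventory of backup copies.
copies = [
    BackupCopy("nightly dump on network storage", Location.LOCAL, Mutability.ACTIVE),
    BackupCopy("weekly removable hard disk", Location.OFFLINE, Mutability.ACTIVE),
    BackupCopy("daily write-once disc", Location.OFFLINE, Mutability.DEAD),
]

# Copies that malware on the network could both reach and modify.
exposed = [c.label for c in copies if c.reachable_over_network() and c.can_be_overwritten()]
print(exposed)
```

Treating the dimensions as two axes makes it easy to see which combination is missing from a backup plan, for example a plan with no offline, non-rewritable copy at all.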
The storage media and devices related to “static backup” are as follows:
1. Hard disk media-based storage;
2. Tape media-based tape library;
3. Blu-ray disc-based optical disc library;
4. Standalone discs, mobile storage and offline disk backup devices.
The advantages and disadvantages of each medium are as follows:
Hard disk performance continues to improve rapidly as the technology advances. Hard disk storage offers fast reads and writes and large capacity. However, the devices are costly, data must be migrated frequently, and a strictly controlled machine-room environment is required. It suits hot data that is accessed frequently.
1. Advantages of hard disk storage: Fast reads and writes, so that storing, querying and retrieving data are all quick; large storage capacity, particularly per drive, thanks to the continuous development of hard disk technology.
2. Disadvantages of hard disk storage: Short life span, generally 4-6 years, so that drives must be replaced and data migrated repeatedly over a long retention period. Relatively high cost of device operation and maintenance and of power consumption. Demanding machine-room requirements, such as an uninterrupted power supply while the disks are running and a cooling system that itself consumes a great deal of electricity. Low security: because hard disk records can be changed, the original data is exposed to viruses, hackers and human modification.
Tape storage is a magnetic storage method, and tape is a traditional data storage medium. Tape libraries are generally deployed at the offline end of the storage chain to meet the need for large-capacity recording. Tape storage has evolved over decades; its advantages and disadvantages as a backup and archive medium for mass storage are as follows:
1. Advantages of tape storage:
● Low initial cost: tapes are inexpensive and the media can be reused;
● Low power consumption;
● Small size, high portability, removable media.
2. Disadvantages of tape storage:
● Demanding storage environment. Temperature, humidity, magnetic fields and dust can cause tape deformation, degradation, sticking, mold growth, magnetization and wear of the magnetic layer. Tapes need to be rewound every 2-3 years. Dust wears the magnetic layer on which the information is recorded, which degrades read quality. Strong impacts and strong external magnetic fields can change the arrangement of the magnetic particles in the tape.
● Low read/write speed. A tape library locates data by linear addressing, so the tape must be wound to the position where the target data was recorded. Addressing is therefore slow, which makes tape poorly suited to fast, non-sequential access.
● Poor compatibility. Tape formats are not mutually compatible, and an LTO drive is compatible with only the two preceding generations, so data on old tapes becomes difficult to read once the old tape drives are retired.
Blu-ray disc is a technological breakthrough compared with CD/DVD. It records data by phase change in an inorganic material, which is more stable than the organic dye technology of CD/DVD, and it offers longer storage life and larger capacity. Blu-ray disc storage has the following features:
1. Advantages of Blu-ray disc storage:
● Long media life. The theoretical life span of high-quality optical disc media is at least 50 years;
● High capacity and low unit cost. Compared with media such as hard disk and tape, optical disc has a significant advantage in cost per unit of storage capacity. At present, the capacity of a single Blu-ray disc can reach 300 GB. Optical storage can draw on multiple technologies such as multi-layer, multi-level, multi-dimensional and nano-scale super-resolution recording, and storage density is expected to reach the terabyte range in the future. Current Blu-ray storage devices even support RAID technology.
● Undemanding storage environment. Discs consume almost no energy in storage, drawing power only during reading and writing, and require no heat dissipation.
● Write-once and read-only thereafter, hence highly secure: data cannot be deleted by people or altered by a virus.
2. Disadvantages of Blu-ray storage:
● Low access speed. Reading a Blu-ray disc requires the disc to be loaded into an optical drive, which inevitably delays access compared with hard disk.
● Non-reusable media. Blu-ray storage can be written only once and cannot be erased, which increases media cost to some extent.
Beginning in 2002, the authors made daily static backups of a small HIS database, recording them to CD-R and CD-RW (weekly cycle). Since 2007, daily backups of a medium-sized HIS database have been recorded to DVD-R. Because of the capacity limits of CD-R and DVD-R, the backup files are first compressed and split into volumes. After Blu-ray discs became available, 25 GB and 50 GB write-once Blu-ray discs were used for daily backup and recording, but this approach is not practical for PACS data. Some hospitals also copy backups offline to multiple mobile hard disks. Strictly speaking, standalone discs and mobile storage depend heavily on manual operation, which is a challenge for the daily management of an information center. The authors have also examined an all-in-one backup appliance that allows several of its hard disks to be taken offline: after the data is backed up, the disks can be hot-swapped out. It is a good idea, but its range of application is narrow, and its practical effectiveness remains to be verified.
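The pre-processing step mentioned above, compressing a backup file and dividing it into volumes small enough for one disc, can be sketched as follows. This is an illustrative sketch and not the authors’ actual tooling: the function names, the gzip format and the volume naming scheme are assumptions, and the whole file is read into memory for brevity, whereas a real database dump would need to be streamed.

```python
import gzip
from pathlib import Path


def compress_and_split(source: Path, out_dir: Path, volume_bytes: int) -> list[Path]:
    """Gzip-compress `source` and write the result as numbered volumes."""
    if volume_bytes <= 0:
        raise ValueError("volume_bytes must be positive")
    out_dir.mkdir(parents=True, exist_ok=True)
    compressed = gzip.compress(source.read_bytes())
    volumes = []
    # Cut the compressed stream into consecutive slices of at most volume_bytes.
    for index, start in enumerate(range(0, len(compressed), volume_bytes), start=1):
        part = out_dir / f"{source.name}.gz.{index:03d}"
        part.write_bytes(compressed[start:start + volume_bytes])
        volumes.append(part)
    return volumes


def join_and_decompress(volumes: list[Path]) -> bytes:
    """Concatenate the volumes in name order and return the original bytes."""
    # The zero-padded suffix keeps name order equal to volume order (up to 999 volumes).
    data = b"".join(part.read_bytes() for part in sorted(volumes))
    return gzip.decompress(data)
```

In such a scheme `volume_bytes` would be set slightly below the usable capacity of the target disc, and each volume would be burned to its own disc. Losing any one volume makes the whole set unrecoverable, which is one more reason to test restoration regularly.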
The sections above list the advantages and disadvantages of four kinds of backup storage media and apply a simple SWOT analysis to the first three. In practice, especially when planning information system operation and maintenance and the hardware architecture, media and devices should be selected individually or in combination according to actual conditions, and applied together in a targeted and effective way. The authors briefly propose the following:
1. The most significant characteristic of “data backup” is “to copy data to other physical locations (media)”. It is widely acknowledged that one should not put all the eggs in one basket. If backup files remain online, they are exposed to ransomware along with the live data. Any data reachable over the network is at risk of being wiped out in a single sweep, so backup files should be kept offline as far as possible;
2. “Active backup” and “dead backup” should be distinguished. A backup that is still being updated may use a rewritable storage medium, while a backup retained as a permanent record should use a non-rewritable storage medium to prevent erroneous deletion and tampering;
3. The snapshot function of professional storage devices can be used to reduce risk, especially for PACS data;
4. The recoverability of backup files on the storage medium must be checked regularly;
5. As for “deliverables”, the backup of static data serves to restore business operations in the short term and to provide traceability, audit trails and process control in the long term. Pricing and audit departments already make good use of big data analysis, and a hospital’s business likewise needs auditing and tracing. Historical static backups should therefore be regarded as a potential resource. A time series of non-rewritable backup media can serve as a “deliverable” that reflects the value of backup.
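The regular recoverability check proposed in item 4 could begin with something as simple as a checksum manifest: record a SHA-256 digest for every backup file when the medium is written, then recompute and compare the digests at each inspection. The sketch below assumes this approach; the function names are hypothetical. A matching checksum shows only that the files are still readable and unchanged, so a periodic test restore into a scratch database remains the stronger check.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large backup files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to backup_dir) to its SHA-256 digest."""
    return {
        p.relative_to(backup_dir).as_posix(): sha256_of(p)
        for p in sorted(backup_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(backup_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose content has changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = backup_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

The manifest itself should be stored separately from the backup medium, since a manifest kept beside the files it describes can be altered together with them.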
Hardware has a price and is replaced with each generation, but data is priceless. People often hold different views on backup mode, backup strategy and the choice of backup devices and media. This paper aims to provide some guidance for daily backup work through a SWOT analysis of backup storage media.