In the IT industry, it’s not often that a technology developed decades ago is still important enough to be widely used by administrators and other users. Even modern servers and storage systems run RAID technology - mostly in enterprises, but it has become more prevalent in consumer NAS systems as well. RAID has survived for more than 30 years, and it still plays a major role in data storage to this day. Why is that? Glad you asked.
Let’s start at the beginning.
In 1987, David Patterson, Garth A. Gibson, and Randy Katz coined the term RAID while at the University of California, Berkeley. The following year, they presented their paper “A Case for Redundant Arrays of Inexpensive Disks (RAID)” at the SIGMOD conference in June of 1988. At the time, hard disks were still quite expensive and trying to keep data storage “lean” was not only common, but a necessity. Additionally, companies were using huge mainframe computers, as desktop computers had not been widely introduced in the workplace. This began to change, however, as personal computers gained acceptance and popularity.
Hard drives for these first non-mainframe computers were much cheaper than those used in mainframe systems, and this price gap is the reason Patterson, Gibson, and Katz developed the concept of RAID. They argued that several connected, less expensive hard disks would beat a single, top mainframe hard disk in terms of performance. And even though using many hard disks meant the failure rate would rise, it was possible to configure them for redundancy so that the reliability of such an array could far exceed that of any large single mainframe drive.
RAID Storage Explained
RAID is based on the concept of spreading, or replicating, data across multiple inexpensive (or independent) drives. Drives within the system are configured so that data can be divided or replicated over two or more drives for load distribution or to help recover data if a drive fails. There are two technical ways to achieve that: either a hardware solution (a dedicated RAID controller) or a software solution, which is typically included in modern operating systems. Hardware-based systems manage the RAID independently from the host computer using a RAID controller, so the operating system is unaware of the technical workings of the RAID and sees the whole storage system as if it were a single volume connected to the host computer.
Besides these technical implementations, the RAID concept is based on these three fundamental principles:
- Parity - a way of distributing information across a RAID system which allows data to be restored in the case of a drive failure.
- Redundancy - the duplication of critical components in the system architecture to increase reliability and act as a fail-safe. In essence, it allows for multiple component failures to happen before the whole system fails and in the case of RAID systems, the components are the drives.
- Mirroring - when the same data is duplicated from one disk to another. Striping is a related method in which data is split into chunks and written across multiple disks. Different RAID setups use one or more of these techniques, depending on system requirements.
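To make the parity idea more concrete, here is a minimal sketch in Python. It assumes simple XOR parity over three equal-sized blocks, one per data drive; the block contents and the helper name `xor_blocks` are made up for illustration, and real RAID implementations work on much larger stripes and handle many details that are left out here.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per data drive (all the same length).
data_blocks = [b"RAID", b"disk", b"data"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data_blocks)

# Simulate losing the second drive, then rebuild its block from the
# surviving data blocks plus the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields exactly the missing block. With this single-parity scheme, only one lost block per stripe can be recovered.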
Based on these principles, the following standard RAID levels have been developed:
- RAID 0 uses striping and is the most basic RAID level. It offers no redundancy, but it does increase performance. Data is striped across at least two disks and with every disk added, read/write performance and storage capacity are increased over a single drive. If one drive fails, there’s no way for the RAID controller to rebuild it.
- RAID 1 uses mirroring, which, as the name suggests, writes the same data to two disks. It is the most basic form of RAID redundancy. RAID 1 can double read performance over a single drive, but it gives no increase in write speed. This level allows for one drive to fail.
- RAID 5 is a common configuration and it gives a decent compromise between reliability and performance. It requires at least three disks and provides a gain in read speeds but no increase in write performance. RAID 5 introduces parity, which takes up the space of one disk in total. This level can handle one disk failure. If you configure a hot spare, for example a fifth drive alongside a four-disk array, it sits idle in the system with no data saved to it. If one disk fails, its data can be rebuilt onto the hot spare using the remaining data and the parity on the other drives. Once the rebuild has finished, you can remove the failed drive and replace it with a new one, which becomes the new hot spare.
- RAID 6 takes the concept of RAID 5 and adds further redundancy with dual-parity. This allows for data to be recreated even if two disks fail within the array. The dual-parity is spread across all the disks and takes the space of two drives.
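The capacity trade-offs of these four levels can be sketched with a little arithmetic. The Python snippet below is a rough illustration only: the function name `usable_capacity_tb` and the drive sizes are assumptions made for this example, all drives are assumed to be the same size, and RAID 1 is treated as the two-disk mirror described above.

```python
# Drives' worth of space consumed by parity, per level (as described above).
PARITY_DRIVES = {0: 0, 5: 1, 6: 2}
# Commonly cited minimum drive counts for each level.
MIN_DRIVES = {0: 2, 1: 2, 5: 3, 6: 4}

def usable_capacity_tb(level, drives, drive_tb):
    """Return usable capacity in TB for an array of equal-sized drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == 1:
        # Assumption: a plain two-disk mirror, so usable space is one drive.
        if drives != 2:
            raise ValueError("this sketch only models a two-disk RAID 1 mirror")
        return drive_tb
    return (drives - PARITY_DRIVES[level]) * drive_tb

# Example: 2 TB drives in different layouts.
for level, drives in [(0, 4), (1, 2), (5, 4), (6, 4)]:
    print(f"RAID {level} with {drives} drives:", usable_capacity_tb(level, drives, 2.0), "TB")
```

With four 2 TB drives, for instance, RAID 0 exposes all 8 TB, RAID 5 gives up one drive's worth for parity (6 TB), and RAID 6 gives up two (4 TB).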
Over the years, many more RAID levels have been developed mainly by RAID system manufacturers. Today, we have RAID levels ranging from RAID 0 all the way to RAID 61 and beyond, with larger companies creating custom RAID levels to support different applications and infrastructure requirements.
Drive Failures and the Dangers of RAID
If a disk failure occurs in a RAID 1 or RAID 5 configuration, the user shouldn’t replace the failed drive until all data on the remaining disks has been backed up. In many cases, especially when the array uses disks that came from the same production batch, the possibility that another disk will also fail soon is quite high. And this is where the danger of this concept lies:
Even with all the benefits RAID offers, including better performance and data security, users tend to forget that RAID is not a backup. RAID can be used in combination with backups, thus making the whole storage system much more secure, but a RAID is never to be used instead of a backup. In fact, when a RAID system fails, for example due to a malfunctioning hardware RAID controller, it’s much more complicated to get the storage up and running again and recover lost data than it is with a single drive.
NAS systems have become more affordable to home users. They use the built-in RAID configurations in combination with other advanced storage technologies, like deduplication, to get as much space as possible out of their system. However, this comes at a price; in many cases these systems are set up incorrectly and when a failure arises, the entire system breaks down.
Whether you’re a home user or an enterprise IT administrator, it is important to carefully consider what RAID level suits your needs, or if RAID is even necessary at all. Remember, negligence in the beginning can result in serious problems, high costs, and possible data loss in the end.
New ways to store data continue to be explored, invented, and refined over time, but given its track record, it’s likely that RAID won’t vanish anytime soon.
If you’re in need of RAID data recovery, don’t hesitate to contact the experts at Ontrack.