Redundant Array of Independent (originally Inexpensive) Disks (RAID) is a term used for computer data storage systems that spread and/or replicate data across multiple drives. RAID technology has revolutionized enterprise data storage and was designed with two key goals: increase data reliability and increase I/O (input/output) performance.
Unfortunately though, RAID storage isn't a perfect technology and as a result data loss can still occur when using RAID systems. In this post we’ll explore how RAID levels work and how data can be stored (and lost!) with this type of storage.
How does RAID work?
A RAID combines physical disks into a single logical unit using either dedicated hardware or software. Hardware RAID solutions come in a variety of styles, from controllers built onto the motherboard or add-in cards up to large enterprise NAS or SAN servers. With these setups the operating system (OS) is unaware of the technical workings of the RAID. Software solutions are typically implemented within the OS.
RAID is traditionally used on servers, but can also be used on workstations. The latter is especially true of storage-intensive computers such as those used for video and audio editing, where high storage capacities and data transfer speeds are required.
Commonly used RAID vocabulary
Before we go into any further detail, let’s take a look at some of the technical terms that are commonly used to describe aspects of RAID storage:
RAID: A technology that combines 2 or more hard drives in various configurations to achieve greater performance, reliability and larger volume sizes by consolidating disk resources and using parity calculations.
Parity: A mathematical calculation which allows drives within a RAID array to fail without the loss of data. The simplest way to show this is the equation: A + B = C. You can remove any one of the letters and work out its value from the 2 remaining. For example, if B was removed so the equation looked like A + ? = C, then B's value can be worked out by moving the A, so B = C - A. This is obviously a simplistic way of describing it. To fully understand parity in a RAID sense, knowledge of binary and the logical XOR operation is required.
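To make the XOR idea a little more concrete, here is a minimal Python sketch. It is only an illustration of the principle, not how any particular controller is implemented; the function name xor_parity and the sample byte values are made up for this example.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    # zip(*blocks) groups the bytes that sit at the same position in each block.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Two data blocks, as they might sit on two different drives (sample values).
a = b"\x0f\xf0\xaa"
b = b"\x33\x55\x0f"

# The parity block plays the role of "C" in A + B = C.
parity = xor_parity([a, b])

# If the drive holding B is lost, XOR-ing what is left gives B back.
recovered_b = xor_parity([a, parity])
```

XOR is its own inverse, so the same function both calculates the parity and recovers a missing block. That is the binary equivalent of rearranging A + B = C.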
Mirroring: The data from 1 or more hard drives is duplicated onto one or more other physical disks.
Striping: The method by which data and parity are written across multiple disks. In a simple striped layout, the data is written across the drives in sequential order until the last drive; it then jumps back to the first drive and starts a 2nd stripe.
Block: A block is the logical space on each disk where the data is written. The amount of space is set by the RAID controller and is most commonly 16KB to 256KB in size. Data fills the block until its limit is reached and then moves on to the next drive; after the last drive, it jumps to the start of the next stripe.
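The way blocks and stripes fit together can be sketched as a small address calculation. The Python below assumes plain striping with no parity and a fixed block size; the function name locate_byte and the 64KB block size are assumptions chosen for illustration only.

```python
def locate_byte(offset, block_size, num_disks):
    """Map a byte offset in the logical volume to (disk, stripe, offset in block).

    Assumes plain striping without parity: blocks are handed out to the
    drives in order, wrapping back to the first drive for each new stripe.
    """
    logical_block, within_block = divmod(offset, block_size)
    stripe, disk = divmod(logical_block, num_disks)
    return disk, stripe, within_block


BLOCK_SIZE = 64 * 1024  # 64KB, an example value within the usual 16KB to 256KB range

first = locate_byte(0, BLOCK_SIZE, 3)                     # very first byte
third_block = locate_byte(2 * BLOCK_SIZE, BLOCK_SIZE, 3)  # start of the 3rd block
wrapped = locate_byte(3 * BLOCK_SIZE + 10, BLOCK_SIZE, 3) # 10 bytes into the 4th block
```

With 3 drives, the 4th block wraps back to the first drive and begins the 2nd stripe, which is the jump described above.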
Left / Right Symmetry: The symmetry in a RAID controls how the data and parity are distributed across the drives. There are four main styles of symmetry, and which one is used depends on the RAID vendor. Some companies also create proprietary styles depending on their business needs.
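As a rough illustration, the four styles are commonly named left-asymmetric, left-symmetric, right-asymmetric and right-symmetric. The Python sketch below covers only the left/right part, which decides which drive holds the parity block in each stripe. It follows one common convention (the one used by Linux software RAID for RAID 5), so treat it as an assumption rather than a description of every vendor's layout. The function name parity_disk is made up for this example.

```python
def parity_disk(stripe, num_disks, direction="left"):
    """Return the index of the drive holding parity for a given stripe.

    Assumed convention: "left" starts parity on the last drive and moves
    toward the first; "right" starts on the first and moves toward the last.
    """
    if direction == "left":
        return (num_disks - 1) - (stripe % num_disks)
    if direction == "right":
        return stripe % num_disks
    raise ValueError("direction must be 'left' or 'right'")


# Parity position for the first 5 stripes of a 4-drive array.
left_layout = [parity_disk(s, 4, "left") for s in range(5)]
right_layout = [parity_disk(s, 4, "right") for s in range(5)]
```

The symmetric and asymmetric variants then differ in the order in which the data blocks fill the remaining drives of each stripe, which this sketch does not model.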
Hot Spare: There are a few different methods for dealing with drive failures within a RAID; one is the use of a hot spare. This is a spare disk, already installed in the system, which can be used in place of the failed one.
Degraded mode: This happens when a drive in the RAID becomes unreadable. The drive is then considered bad and is withdrawn from the RAID. New data and parity are written to the remaining drives, and if any data is requested from the failed drive, it is worked out from the data and parity on the others. This degrades the performance of the RAID, hence the name degraded mode.
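A read in degraded mode can be sketched in a few lines of Python. This assumes a single-parity array where XOR-ing every block in a stripe (data plus parity) gives zero, so the missing block is simply the XOR of the survivors. The names xor_blocks and rebuild_missing and the sample data are hypothetical.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def rebuild_missing(stripe):
    """Fill in the one missing block (marked None) of a single-parity stripe."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one missing block")
    survivors = [block for block in stripe if block is not None]
    rebuilt = xor_blocks(survivors)
    return stripe[:missing[0]] + [rebuilt] + stripe[missing[0] + 1:]


# A healthy stripe: three data blocks plus their parity (sample data).
d0, d1, d2 = b"RA", b"ID", b"!!"
parity = xor_blocks([d0, d1, d2])

# The drive holding d1 has failed, so its block is unreadable.
degraded = [d0, None, d2, parity]
repaired = rebuild_missing(degraded)
```

Every read that touches the failed drive has to read and XOR the blocks on all of the other drives, which is why performance drops in this mode.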
Still with me? Now that we’ve defined the key terms, in our next article we will take a look at the 3 key concepts in RAID: mirroring, striping and error correction. We’ll also look at different RAID levels, how modern arrays work and what challenges lie ahead if data is lost. See you next time!