RAID (Redundant Array of Inexpensive Disks)
A common question: What is RAID, and what the heck do all those RAID# numbers mean?
At its core, RAID is a mechanism to provide some level of redundancy to a disk storage system, so that the loss of a single disk doesn’t result in the total loss of all data being stored.
For example, if the hard drive in your workstation dies, you would lose all of your data, including the running operating system, installed applications, and all of your personal files. Using RAID pools instead of single disks is a means of preventing this type of data loss. In a RAID setup, if a single disk were to fail, your data and applications would remain accessible, although at a slight performance penalty. You can then replace the failed drive and the missing data can be reconstructed from the remaining disks.
How Does It Work?
A RAID pool consists of multiple physical disks, grouped together using either software or dedicated hardware controllers, and presented to the computer system as a single logical disk.
RAID works by distributing data across multiple disks, along with additional parity data. In the event that a disk within the RAID set is lost, this parity data can be used to calculate the missing data from the lost disk. It’s not magic, just math.
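To make the "just math" part concrete, here is a minimal sketch of parity using bitwise XOR, the operation commonly used for single-parity RAID. The block contents and the helper name `xor_blocks` are made up for illustration; real RAID implementations work on much larger blocks, usually in hardware or in the kernel.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One block per data disk (illustrative contents).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

Because XOR is its own inverse, the same function both creates the parity block and rebuilds a missing block.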
There are several different types or ‘levels’ of RAID. Each has its own formulas and algorithms for performing this process, and each has different characteristics, with its own pros and cons.
What are RAID levels and how are they different?
There are four types of RAID levels commonly in use: RAID0, RAID1, RAID5, and RAID6.
(RAID2-4 technically exist, but their characteristics make them unsuitable for most use cases.)
Let's discuss each individually.
RAID0 (RAID-Zero) is the simplest form of RAID, and since it does not provide any redundancy, you could argue that it is not really RAID at all. RAID0 simply spans all of the physical disks together into a single large logical drive. The size of the resulting logical drive is equal to the sum of all drives within the storage pool. Note: Drives within a RAID0 pool can be of any size, and do not have to match the size of other disks within the pool.
There is no parity, and the loss of any single physical disk within the pool will result in the destruction of all the data in the pool.
So why use RAID0? Performance! Since data is evenly distributed across the pool, all the physical disks are written to simultaneously. Since the bus is much faster than any single physical storage device, adding disks to a RAID0 pool can narrow this performance gap and provide much higher total throughput than a single disk. (More about this later.)
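As a rough illustration of how data ends up evenly distributed, the sketch below maps logical block numbers onto disks in round-robin order. It assumes equal-sized disks and a one-block stripe unit, and the function name `raid0_locate` is hypothetical; actual controllers and software RAID layers use their own chunk sizes and layouts.

```python
def raid0_locate(logical_block, num_disks):
    """Return (disk_index, block_on_disk) for a logical block in a simple stripe."""
    return logical_block % num_disks, logical_block // num_disks

# Six consecutive logical blocks spread over three disks.
layout = [raid0_locate(block, 3) for block in range(6)]
for block, (disk, offset) in enumerate(layout):
    print(f"logical block {block} -> disk {disk}, block {offset}")
```

Consecutive blocks land on different disks, which is why a large read or write can keep every disk in the pool busy at once.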
- RAID1 (RAID-One) is the lowest level of RAID to provide redundancy.
- Sometimes referred to as ‘mirroring’, RAID1 works by pairing two (or more, but this is uncommon) disks of equal size together and writing all data changes to every drive at the same time. Both drives contain exactly the same data, so losing one disk means that you have only lost one copy, and your data can safely be accessed using the remaining disk.
- With RAID1, the total size of your storage pool is equal to the size of a single disk within the pool, and all disks must be of the same size. (Technically, you can have different sized disks, but the resultant pool size will be the size of the smallest disk in the pool.)
- RAID1 writes the same data to all disks simultaneously, so there is no performance benefit for write operations. However, since the data on all disks is identical, multiple read operations can be distributed across the physical storage devices, which can as much as double read performance in a two-disk mirror.
- RAID5 (RAID-Five) is the first complex level of RAID, sometimes referred to as Distributed Data Guarding.
- RAID5 requires at least three physical disks. While a RAID5 pool can contain as many disks as you like, it is recommended not to exceed 10 disks within a single pool.
- RAID5 works by "striping" data across all drives. Each stripe places one block on every physical drive in the pool: all but one of those blocks hold data, and the remaining block holds parity calculated from the data blocks (a bitwise XOR). Each block within the stripe is written to a successive physical disk.
The parity block's position shifts from one stripe to the next, so parity is distributed in a rolling fashion across all drives rather than living on one dedicated disk. In the event that a disk is missing, the lost block can be recalculated by XORing the parity block with the remaining data blocks in the stripe. In this fashion, RAID5 provides redundancy for the loss of any single disk within the storage pool.
Because data is written to multiple physical disks simultaneously, and the parity calculation is lightweight, RAID5 provides nearly as much of a performance improvement as RAID0 while still providing data redundancy for the entire pool.
Like RAID1, disks within a RAID5 pool all have to be the same size. Read and write operations both require accessing all of the disks within the pool, so their performance is equally affected by adding additional disks to the RAID set.
The total size of the logical disk created with a RAID5 pool is equal to the sum of all drives in the pool minus 1 drive. In other words, there is one drive's worth of parity data within the pool evenly distributed across all drives.
For example, the total size of the logical disk created with three 2TB disks in a RAID5 pool would be 2 x 2TB = 4TB.
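The rolling placement of parity can be sketched as follows. Each stripe holds one block per disk: one parity block ("P") and, on the other disks, data blocks ("D0", "D1", and so on). The rotation rule used here is an assumption chosen for illustration; real implementations offer several layout variants, and the function name `raid5_stripe_layout` is hypothetical.

```python
def raid5_stripe_layout(stripe, num_disks):
    """Return one label per disk for the given stripe number."""
    # Assumed rotation: parity starts on the last disk and moves left each stripe.
    parity_disk = (num_disks - 1 - stripe) % num_disks
    data_index = stripe * (num_disks - 1)  # each stripe carries num_disks - 1 data blocks
    layout = []
    for disk in range(num_disks):
        if disk == parity_disk:
            layout.append("P")
        else:
            layout.append(f"D{data_index}")
            data_index += 1
    return layout

# Three stripes on a three-disk pool.
for stripe in range(3):
    print(stripe, raid5_stripe_layout(stripe, 3))
```

Over any run of three stripes, each disk holds exactly one parity block, which is where the "one drive's worth" of parity overhead comes from.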
- RAID6 (RAID-Six) is the most complex level of RAID commonly used, and it is sometimes referred to as Advanced Data Guarding.
- RAID6 requires at least four physical disks. Similar to RAID5, you can add as many disks to a RAID6 pool as you like, but it is recommended not to exceed 12 disks.
- RAID6 stripes data across all disks similarly to RAID5, but uses a more complex algorithm to calculate and distribute two sets of parity information. With RAID6 you can sustain the loss of any two disks within the pool without losing data.
- RAID6 provides a higher level of redundancy than RAID5 at the expense of storage density and performance. Like RAID0 and RAID5, adding more disks to the pool increases performance, but since the parity calculation is heavier and there is twice as much parity data, the total performance is substantially lower than that of a comparable RAID5 pool.
- The total size of the logical disk created using RAID6 is equal to the total size of all disks, minus two drives' worth of parity data. For example, four 2TB disks in RAID6 would provide 2 x 2TB = 4TB.
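The capacity rules for the four levels can be collected into one small calculator. This is a sketch that assumes every disk in the pool is the same size; the function name `usable_capacity_tb` is made up for the example, and the two-disk minimum for RAID0 is an assumption.

```python
def usable_capacity_tb(level, num_disks, disk_tb):
    """Usable capacity in TB for a pool of equal-sized disks."""
    min_disks = {"RAID0": 2, "RAID1": 2, "RAID5": 3, "RAID6": 4}
    parity_disks = {"RAID0": 0, "RAID5": 1, "RAID6": 2}
    if level not in min_disks:
        raise ValueError(f"unknown RAID level: {level}")
    if num_disks < min_disks[level]:
        raise ValueError(f"{level} needs at least {min_disks[level]} disks")
    if level == "RAID1":
        # Every disk holds a full copy, so capacity is that of a single disk.
        return disk_tb
    return (num_disks - parity_disks[level]) * disk_tb

print(usable_capacity_tb("RAID5", 3, 2))  # three 2TB disks
print(usable_capacity_tb("RAID6", 4, 2))  # four 2TB disks
```

Both calls print 4, matching the RAID5 and RAID6 examples given above.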
Wait, what? You said there were only four types of RAID!
That’s correct, but multiple RAID pools can be combined to increase performance. These combinations are known as RAID10, RAID50, and RAID60.
Remember when we discussed RAID0? Why would anybody want to use RAID0 when you multiply the chances of a disk calamity with each drive you add to the pool?
Well, what if each one of those ‘disks’ you add to the RAID0 pool is self-redundant?
As the name implies, RAID10/50/60 is a combination of RAID1/5/6 with RAID0. For example, if you create two RAID5 pools, each with four 2TB drives, then create a RAID0 spanning the two resulting logical drives, you end up with a storage pool that is both double the size and double the performance of the individual RAID5 logical disks, and still self-redundant!
In this configuration, the resulting logical disk is 12TB (4-disk RAID5 = 3 x 2TB = 6TB, and then the resulting 2-disk RAID0 = 2 x 6TB = 12TB) and can sustain the loss of up to two disks without losing data, provided the failures are in different RAID5 sets (one disk from each). Losing two disks from the same RAID5 set would still destroy the whole pool.
In this configuration you also gain the full benefits of the RAID0 performance boost by spanning together two RAID5 logical disks.
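The arithmetic for this RAID50 example can be sketched as below. The function name `raid50_summary` and its dictionary keys are hypothetical, and the sketch assumes identical RAID5 groups built from equal-sized disks.

```python
def raid50_summary(groups, disks_per_group, disk_tb):
    """Capacity and fault tolerance for a RAID0 stripe over identical RAID5 groups."""
    group_tb = (disks_per_group - 1) * disk_tb  # each RAID5 group gives up one disk to parity
    return {
        "group_tb": group_tb,
        "total_tb": groups * group_tb,          # RAID0 adds the groups together
        "guaranteed_failures": 1,               # any single disk can always be lost
        "best_case_failures": groups,           # one disk per group, never two in one group
    }

summary = raid50_summary(groups=2, disks_per_group=4, disk_tb=2)
print(summary)
```

For two groups of four 2TB drives this gives 6TB per group and 12TB in total, with the second disk failure survivable only if it lands in the other group.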
ZFS RAID vs typical RAID
What, even more confusion? Yes, sorry, last one I promise!
Replibit is built using the ZFS filesystem to manage backup data and snapshots.
ZFS has its own unique built-in RAID system that is similar to (but somewhat better than) typical RAID. ZFS RAID has its own labels for each of the various levels of redundancy, but since each is functionally similar to its typical RAID counterpart, we’ll refer to each of those levels using the common terminology (such as RAID0/1/5/6 and so on).
There are many highly technical differences between ZFS and typical RAID.
For example, ZFS RAID stripes don’t have to align with the number of disks within the pool, saving storage space when disk writes don’t come out to an even number of blocks.
The primary functional difference, however, is that ZFS performs internal data integrity grooming. The pool is periodically scanned, and if bad or corrupted data blocks are discovered, they are self-corrected by recovering the bad block from parity and then written to a new physical location on disk. Bad physical sectors are also marked so as not to be reused in the future.
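The general idea of detecting and repairing a bad block can be sketched in a few lines. This is only a conceptual illustration, not how ZFS is implemented: it assumes a stored checksum for each block, uses SHA-256 from Python's standard library, and repairs from a mirrored copy rather than from parity to keep the example short. The function name `scrub_block` is hypothetical.

```python
import hashlib

def scrub_block(copies, expected_digest):
    """Repair mirrored copies of one block in place; return the indices repaired."""
    # Find a copy whose checksum still matches the stored digest.
    good = next(
        (c for c in copies if hashlib.sha256(c).digest() == expected_digest),
        None,
    )
    if good is None:
        raise ValueError("no intact copy available to repair from")
    repaired = []
    for index, copy in enumerate(copies):
        if copy != good:
            copies[index] = good  # overwrite the damaged copy with the intact one
            repaired.append(index)
    return repaired

block = b"backup data"
digest = hashlib.sha256(block).digest()      # checksum recorded when the block was written
copies = [block, b"backup dat\x00"]          # second copy has silently gone bad
repaired = scrub_block(copies, digest)
print(repaired)
```

The key point is that the checksum identifies which copy is wrong, something a conventional mirror cannot determine on its own when two copies disagree.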
For this reason, Replibit recommends using the native software ZFS RAID present within the system when creating storage pools, rather than utilizing hardware RAID controllers.
Saving the cost of an expensive hardware RAID controller actually provides BETTER data protection for your customers' data!