One morning, I started wondering how RAID 5 parity works to rebuild a disk array. It seemed “magical” to me, that you can get redundancy and still use most of your disk capacity. So I searched for it… and turned up not very much info, and one other person’s unanswered question. A few articles explained it, but in a little more detail about performance, and less detail about the actual parity function. So I wrote this. The good articles were at:
What’s the magic?
The short answer is “XOR”. XOR is a binary operator that takes two inputs and produces one output. The rules are:
1 xor 1 = 0 1 xor 0 = 1 0 xor 1 = 1 0 xor 0 = 0
In English xor means “if there’s a difference, the result is 1”.
Parity in RAID 5 involves reserving some space for parity information. Parity data is an additional digit of information that helps you recover lost data.
Another way to describe this parity is “even parity”. That means we try to keep the number of “1” bits even. If there are 2 “1”s, the parity is “0”. If there is only 1 “1”, the parity is “1”. In short:
1 1 parity 0 1 0 parity 1 0 1 parity 1 0 0 parity 0
(I had a little difficulty “getting it” because I was used to parity calculations on a network. If you’re familiar with that, don’t apply that knowledge too much. RAID parity is just a little bit different.)
How parity data is arranged on the disk
In practial terms, the data on the disk is stored in cylinders. A cylinder is all the data that passes under the disk head in one revolution. Cylinders are grouped into stripes, which are the same-numbered cylinder across all the drives. So, stripe 1 is the group of all the cylinder 1s across all the disks. (Note that this is the idealized RAID. Actual RAID is less tidy, but the stripe is the basic set of data that is protected.)
RAID calculates parity across cylinders, within a stripe. So if you have a three-disk RAID 5 array, and your data is on stripe 0, two of the cylinders hold data, and the third cylinder holds the parity.
Parity is calculated across the cylinders. (It’s not calculated within a single byte, the way it is on networks.) So if cylinder 1 on disk 1 looks like this:
And cylinder 1 on disk 2 looks like this:
Then the parity looks like this:
Here are the three cylinders again, but closer together so it makes sense:
10010011101110110001101... 10000001000000010000000... 00010010101110100001101...
Notice how there are an even number of 1s in each column of bits. The parity is even.
That is how RAID can recover from a lost hard disk. If you replace a disk, you can rebuild it because you know that there should always be an even number of bits in each column.
If you lost the second disk, your data suddenly looks like this:
10010011101110110001101... _______________________... 00010010101110100001101...
You can rebuild the second disk by setting the 1 and 0 bits based on our parity rule, that there should be an even number of 1s. Here’s an example of regenerating the first four bits:
10010011101110110001101... 1000___________________... 00010010101110100001101...
All we do is repeat this calculation several billion times, and our data is rebuilt.
(A nice feature about RAID 5 is that, as you add more disks, you still retain only one parity cylinder per stripe, so, the fraction of space used for parity decreases. A 4-disk RAID array allows you to use 3/4 of the array for data, and 1/4 for parity.)
The main downside of RAID 5 is performance, and that issue is described in the linked articles, above.