
How does parity work on a RAID-5 array?


I'm looking to build a nice little RAID array for dedicated backups. I'd like to have about 2-4TB of space available, as I have this nasty little habit of digitizing everything. Thus, I need a lot of storage and a lot of redundancy in case of drive failure. I'll also essentially be backing up 2-3 computers' /home folders using one of the "Time Machine" clones for Linux. This array will be accessible over my local network via SSH.


I'm having difficulty understanding how RAID-5 achieves parity and how many drives are actually required. One would assume that it needs 5 drives, but I could be wrong. Most of the diagrams I've seen have only confused me further. It seems that this is how RAID-5 works; please correct me, as I'm sure I'm not grasping it properly:


/---STORAGE---\    /---PARITY----\
|   DRIVE_1   |    |   DRIVE_4   |
|   DRIVE_2   |----|     ...     |
|   DRIVE_3   |    |             |
\-------------/    \-------------/

It seems that drives 1-3 appear and work as a single, massive drive (capacity * number_of_drives) and the parity drive(s) back up those drives. What seems strange to me is that I usually see 3+ storage drives in a diagram to only 1 or 2 parity drives. Say we're running 4 1TB drives in a RAID-5 array, 3 running storage and 1 running parity, we have 3TB of actual storage, but only have 1TB of parity!?


I know I'm missing something here. Can someone help me out? Also, for my use case, which would be better: RAID-5 or RAID-6? Fault tolerance is my highest priority at this point. Since the array will run over a network for home use only, speed isn't hugely critical.



Answer



RAID-5 just XORs the corresponding bits from each data drive and stores the result as parity. If you lose any one drive, you can rebuild the missing data from the drives that remain.


For background:


A B (A XOR B)
0 0 0
1 1 0
0 1 1
1 0 1

Assume that D is the XOR of the other columns. Then, as long as you lose only one drive, you can work out what was lost by XORing the surviving columns together.


A B C D
1 0 0 1
0 1 0 1
1 1 0 0
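To make the table concrete, here is a minimal Python sketch that computes D from A, B and C and then rebuilds a "lost" column. The function name `xor_parity` and the one-value-per-row layout are only illustrative assumptions; a real array works on whole blocks, not single bits.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Columns A, B and C from the table above, one value per row.
a = bytes([1, 0, 1])
b = bytes([0, 1, 1])
c = bytes([0, 0, 0])

# D is the parity column: the XOR of A, B and C.
d = xor_parity([a, b, c])

# Pretend drive B failed: XOR the survivors (A, C and the parity D) to rebuild it.
rebuilt_b = xor_parity([a, c, d])

print(list(d))          # parity column
print(rebuilt_b == b)   # True if the rebuild recovered B
```

The same call rebuilds any single missing column, including D itself, because XORing a value in twice cancels it out.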

In RAID-5 the parity blocks are actually distributed across all the drives rather than kept on one dedicated parity drive, but the concept is the same.
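One way to picture the distribution is a parity position that moves by one drive on each stripe. The sketch below assumes a simple rotation starting at the last drive; real implementations use several different rotation orders, so treat `parity_drive` and `layout` as hypothetical illustrations rather than the layout of any particular controller.

```python
def parity_drive(stripe, n_drives):
    """Index of the drive holding parity for a stripe (assumed rotation order)."""
    return (n_drives - 1 - stripe) % n_drives

def layout(n_stripes, n_drives):
    """Return one row per stripe: 'P' marks the parity block, 'D' a data block."""
    rows = []
    for stripe in range(n_stripes):
        p = parity_drive(stripe, n_drives)
        rows.append(["P" if drive == p else "D" for drive in range(n_drives)])
    return rows

# Four drives, four stripes: each drive holds parity for exactly one stripe.
for row in layout(4, 4):
    print(" ".join(row))
```

Each row still contains exactly one parity block, so the capacity cost is the same as with a single dedicated parity drive.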


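So for RAID-5, no matter how many drives you use, you only need one drive's worth of parity, and it must be at least as large as the smallest drive in the array. The minimum is three drives, not five. In your example of four 1TB drives, 1TB of parity is enough to protect all 3TB of data against a single drive failure.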
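The capacity arithmetic can be sketched as follows. The function `raid_usable_tb` is a hypothetical helper, and it assumes every drive contributes only as much space as the smallest drive in the array; `parity_drives=1` stands for RAID-5 and `parity_drives=2` for RAID-6.

```python
def raid_usable_tb(drive_sizes_tb, parity_drives=1):
    """Usable capacity in TB, assuming each drive is truncated to the smallest one."""
    # RAID-5 needs at least 3 drives and RAID-6 at least 4.
    if len(drive_sizes_tb) < parity_drives + 2:
        raise ValueError("not enough drives for this RAID level")
    return (len(drive_sizes_tb) - parity_drives) * min(drive_sizes_tb)

four_drives = [1, 1, 1, 1]  # four 1TB drives, as in the question

print(raid_usable_tb(four_drives))                   # RAID-5
print(raid_usable_tb(four_drives, parity_drives=2))  # RAID-6
```

With four 1TB drives this gives 3TB usable under RAID-5 and 2TB under RAID-6, which is the trade-off between space and surviving a second failure.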


RAID-5 is probably best for personal use, as its computational complexity is much lower than that of RAID-6.


RAID-6 is more complicated: it uses Galois field arithmetic to compute its second parity block, which makes parity computations more taxing. In exchange, it can survive two simultaneous drive failures. But if you replace the failed drive and rebuild your array as soon as you get a single failure, you should be fine sticking with RAID-5.

