The disks used for data storage are mechanical devices that are susceptible to failure. When a disk fails, all the data stored on it is lost, which is a serious problem when the data is critical. RAID stands for redundant array of independent disks. Redundancy is the key idea behind RAID: data is stored on multiple disks to prevent data loss. RAID is also used to increase the speed of data storage and retrieval, because reading data sequentially from a single disk is slower than reading the same data from several disks in parallel. A host adapter provides centralized control over the disks in the array.
The RAID levels discussed here, RAID 0 to RAID 5, define how data is stored and retrieved. Each level aims to minimize data loss, to increase performance, or both. In practice one of these goals dominates in each level, because the two trade off against each other: redundancy costs capacity and extra writes, while pure speed leaves no room for redundant copies.
In RAID, multiple disks are used for storing data. When a write operation is initiated, the host adapter splits the data into blocks and writes them to the connected drives. With two disks, for example, odd-numbered blocks are written to one disk and even-numbered blocks to the other. When a read operation is initiated, the host adapter fetches the blocks from the disks and reassembles them in a buffer. The RAID levels differ in how they use this arrangement to increase reliability and performance.
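The block distribution described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the "disks" are in-memory lists, and the names stripe, unstripe and block_size are hypothetical, not part of any real RAID controller interface.

```python
def stripe(data: bytes, n_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them out round-robin."""
    disks: list[list[bytes]] = [[] for _ in range(n_disks)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % n_disks].append(data[offset:offset + block_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Fetch the blocks in their original order and join them in a buffer."""
    buffer = bytearray()
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):  # the last row may be only partly filled
                buffer += disk[row]
    return bytes(buffer)


# Two disks: blocks 1, 3, 5 land on the first disk, blocks 2, 4 on the second.
disks = stripe(b"ABCDEFGHIJ", n_disks=2, block_size=2)
restored = unstripe(disks)
```

With two disks the first list receives the odd-numbered blocks and the second the even-numbered ones, matching the description in the text. A real adapter would issue the per-disk reads and writes in parallel, which is where the speed gain comes from.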
RAID 0 stripes the data, writing the fragments of each stripe to the same position on multiple disks. There is no redundancy, so if even one of the disks fails, the stored data is lost completely. RAID 0 is used where speed is the major concern. RAID 1 uses mirroring to minimize data loss. The same data is written to multiple disks so that it remains available even if one of the disks fails. The probability of all the disks failing at the same time is very low, so the data can be expected to remain available. The cost is capacity: because every disk holds a full copy, you must purchase more disks to increase the storage capacity with RAID 1.
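The trade-off between RAID 0 and RAID 1 can be made concrete with a rough calculation. The sketch below assumes that every disk fails independently with the same probability p_fail over some period, which is a simplification, and the function names and figures are hypothetical.

```python
def raid0(n_disks: int, disk_tb: float, p_fail: float) -> tuple[float, float]:
    """Return (usable capacity in TB, probability of data loss) for striping."""
    capacity = n_disks * disk_tb
    # Striping has no redundancy: the data is lost if ANY disk fails.
    p_loss = 1 - (1 - p_fail) ** n_disks
    return capacity, p_loss


def raid1(n_disks: int, disk_tb: float, p_fail: float) -> tuple[float, float]:
    """Return (usable capacity in TB, probability of data loss) for mirroring."""
    capacity = disk_tb  # every disk holds the same full copy
    # Mirroring loses the data only if ALL copies fail.
    p_loss = p_fail ** n_disks
    return capacity, p_loss


# Two 4 TB disks, each assumed to fail with probability 0.05 (illustrative).
capacity0, loss0 = raid0(2, 4.0, 0.05)  # 8 TB usable, loss probability about 0.0975
capacity1, loss1 = raid1(2, 4.0, 0.05)  # 4 TB usable, loss probability about 0.0025
```

Under these assumptions, striping doubles the capacity but nearly doubles the risk of losing the data, while mirroring keeps the capacity of a single disk and makes loss far less likely.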
RAID 2 uses an error correction code (ECC) to minimize data loss. In addition to the storage disks, this level uses ECC disks that are dedicated to storing the error correction codes. The data is split across the storage disks and the corresponding correction codes are written to the ECC disks. RAID 3 uses byte-level striping with parity on a dedicated disk instead of error correction codes. RAID 4 is very similar to RAID 3, but stripes the data at block level; its redundancy is likewise based on parity.
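Parity in this context is usually a bytewise XOR of the data blocks in a stripe. The following sketch shows how a dedicated parity disk, as in RAID 3 or RAID 4, could be filled; the helper name xor_blocks and the sample bytes are illustrative assumptions.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data disks, each holding a two-byte block.
data_disks = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]

# The dedicated parity disk stores the XOR of the data blocks.
parity_disk = xor_blocks(data_disks)  # b"\x69\x53"

# XOR over the data blocks AND the parity block is always all zeros,
# which is what makes a missing block recoverable.
check = xor_blocks(data_disks + [parity_disk])  # b"\x00\x00"
```

Because every write to any data disk changes the parity, the single parity disk is touched by every write operation.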
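With levels 2, 3 and 4, the ECC or parity information is kept on dedicated disks. If such a disk fails, the array is left without protection against a further failure, and because every write must update it, the dedicated disk also becomes a bottleneck. RAID 5 eliminates this problem. It also uses block-level parity, but the parity blocks are distributed across all the disks instead of being kept on one. The parity can be used for data recovery: when a single disk fails, its contents can be rebuilt completely from the remaining disks.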
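A minimal sketch of distributed parity and single-disk recovery follows. It assumes XOR parity and one particular rotation of the parity position; real implementations differ in layout, and the names write_stripes and rebuild are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripes(stripes: list[list[bytes]], n_disks: int) -> list[list[bytes]]:
    """Place each stripe's data blocks plus one parity block on n_disks disks.

    Each stripe carries n_disks - 1 data blocks. The parity position moves
    from stripe to stripe (assumed rotation), so no single disk holds all parity.
    """
    disks: list[list[bytes]] = [[] for _ in range(n_disks)]
    for s, data_blocks in enumerate(stripes):
        parity_at = (n_disks - 1 - s) % n_disks
        row = list(data_blocks)
        row.insert(parity_at, xor_blocks(data_blocks))
        for disk, block in zip(disks, row):
            disk.append(block)
    return disks


def rebuild(disks: list[list[bytes]], failed: int) -> list[bytes]:
    """Recompute every block of the failed disk from the surviving disks."""
    survivors = [disk for i, disk in enumerate(disks) if i != failed]
    return [xor_blocks(row) for row in zip(*survivors)]


stripes = [[b"AA", b"BB"], [b"CC", b"DD"], [b"EE", b"FF"]]
disks = write_stripes(stripes, n_disks=3)

# Simulate losing disk 1 and rebuilding it from disks 0 and 2.
recovered = rebuild(disks, failed=1)
```

In each row, the XOR of the surviving blocks equals the missing block, whether that block held data or parity, so the same rebuild routine works for any single failed disk. It cannot recover from two simultaneous failures.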