RAID | The Linux Tutorial

RAID

RAID is an acronym for Redundant Array of Inexpensive Disks. Originally, the idea was that you would get better performance and reliability from several less expensive drives linked together than you would from a single, more expensive drive. The key change in the entire concept is that hard disk prices have dropped so dramatically that RAID is no longer concerned with inexpensive drives. So much so that the I in RAID is often interpreted as meaning “Intelligent” or “Independent”, rather than “Inexpensive.”

In the original paper that defined RAID, there were five levels. Since that paper was written, the concept has been expanded and revised. In some cases, characteristics of the original levels are combined to form new levels.

Two concepts are key to understanding RAID. These are redundancy and parity. The concept of parity is no different from that used in serial communication, except that the parity in a RAID system can be used not only to detect errors, but also to correct them. This is because more than just a single bit is used per byte of data. The parity information is stored on a drive separate from the data. When an error is detected, the information from the good drives, plus the parity information, is used to correct the error. It is also possible to have an entire drive fail completely and still be able to continue working. Usually the drive can be replaced and the information on it rebuilt even while the system is running. Redundancy is the idea that all information is duplicated. If you have a system where one disk is an exact copy of another, one disk is redundant for the other.
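To make the idea of correcting (and not just detecting) errors concrete, here is a minimal sketch. It assumes the parity is a byte-wise XOR across the data blocks, which is the usual scheme for parity RAID but is not spelled out above. The function name xor_blocks and the sample blocks are made up for illustration.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (assumed parity scheme)."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three data drives, each holding one 4-byte block (illustrative values).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block lives on a separate drive.
parity = xor_blocks(data)

# Pretend the second drive failed: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

Because XOR is its own inverse, any single missing block can be recovered from the remaining blocks plus the parity block. Two missing blocks cannot be recovered this way.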

A striped array is also referred to as RAID 0 or RAID Level 0. Here, portions of the data are written to and read from multiple disks in parallel. This greatly increases the speed at which data can be accessed. With two disks, for example, half of the data is read or written by each hard disk, which cuts the access time almost in half. The amount of data that is written to a single disk is referred to as the stripe width. For example, if single blocks are written to each disk, then the stripe width would be a block.
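The way logical blocks are spread over the disks can be sketched in a few lines. This is only an illustration of simple round-robin striping. The function name locate_block and its parameters are hypothetical, and a real driver or controller may lay data out differently.

```python
def locate_block(logical_block, num_disks, stripe_width=1):
    """Map a logical block to (disk index, block offset on that disk).

    stripe_width is the number of consecutive blocks written to one
    disk before moving on to the next (1 = single blocks, as in the text).
    """
    unit = logical_block // stripe_width    # which stripe unit overall
    within = logical_block % stripe_width   # position inside that unit
    disk = unit % num_disks                 # units rotate across the disks
    row = unit // num_disks                 # completed passes over all disks
    return disk, row * stripe_width + within


# With two disks and a stripe width of one block, even blocks land on
# disk 0 and odd blocks on disk 1, so each disk handles half the data.
for block in range(4):
    print(block, locate_block(block, num_disks=2))
```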

This type of virtual disk provides increased performance since data is being read from multiple disks simultaneously. Since there is no parity to update when data is written, this is faster than a system using parity. However, the drawback is that there is no redundancy. If one disk goes out, the data is probably lost. Such a system is more suited for organizations where speed is more important than reliability.

Keep in mind that data is written to all the physical drives each time data is written to the logical disk. Therefore, the pieces must all be the same size. For example, you could not have one piece that was 500 MB and a second piece that was only 400 MB. (Where would the other 100 MB be written?) Here again, the total amount of space available is the sum of all the pieces.

Disk mirroring (also referred to as RAID 1) is where data from the first drive is duplicated on the second drive. When data is written to the primary drive, it is automatically written to the secondary drive as well. Although this slows things down a bit when data is written, when data is read it can be read from either disk, thus increasing performance. Mirrored systems are best employed where there is a large database application and availability of the data (transaction speed and reliability) is more important than storage efficiency.

Another consideration is the speed of the system. Since it takes longer than normal to write data, mirrored systems are better suited to database applications where queries are more common than updates.

The term used for RAID 4 is a block interleaved undistributed parity array. Like RAID 0, RAID 4 is also based on striping, but redundancy is built in with parity information written to a separate drive. The term “undistributed” is used since a single drive is used to store the parity information. If one drive fails (or even a portion of the drive), the missing data can be recreated using the information on the parity disk. It is possible to continue working even with one drive inoperable since the parity drive is used on the fly to recreate the data. Even data written to the array in this state is still valid since the parity information is updated as well. This is not intended as a means of running your system indefinitely with a drive missing, but rather it gives you the chance to stop your system gracefully.
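The following sketch shows why the parity drive is touched on every write. It assumes XOR parity, and the helper names xor and update_parity are invented for this example. The point is that the new parity can be computed from the old parity, the old data and the new data, without reading the other data drives.

```python
def xor(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_parity, old_data, new_data):
    """Remove the old block's contribution, then add the new block's."""
    return xor(xor(old_parity, old_data), new_data)


# One stripe across three data drives, plus its parity block.
data = [b"\x0f\x0f", b"\xf0\x01", b"\x33\x44"]
parity = xor(xor(data[0], data[1]), data[2])

# Overwrite the block on the second data drive.
new_block = b"\xaa\x55"
parity = update_parity(parity, data[1], new_block)
data[1] = new_block

# The shortcut gives the same result as recomputing parity from scratch.
assert parity == xor(xor(data[0], data[1]), data[2])
```

Every write therefore involves the data drive being changed and the parity drive, no matter which data drive is written to.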

RAID 5 takes this one step further and distributes the parity information to all drives. For example, the parity drive for block 1 might be drive 5 but the parity drive for block 2 is drive 4. With RAID 4, the single parity drive was accessed on every single data write, which decreased overall performance. Since data and parity are interspersed on a RAID 5 system, no single drive is overburdened. In both cases, the parity information is generated during the write and should a drive go out, the missing data can be recreated. Here again, you can recreate the data while the system is running, if a hot spare is used.

Figure: RAID 5
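One possible way of rotating the parity is sketched here. It reproduces the example above (block 1 on drive 5, block 2 on drive 4) for a five-drive array, but the rotation order is an assumption. Actual controllers and drivers use several different layouts, and the function names are hypothetical.

```python
def parity_drive(stripe, num_drives):
    """Drive (numbered from 1) that holds the parity for a given stripe.

    Assumed rotation: start at the last drive and move one drive
    to the left with each stripe, wrapping around.
    """
    return num_drives - (stripe - 1) % num_drives


def stripe_layout(stripe, num_drives):
    """Return 'P' for the parity drive and 'D' for data drives."""
    p = parity_drive(stripe, num_drives)
    return ["P" if drive == p else "D" for drive in range(1, num_drives + 1)]


# Print the layout of the first five stripes on a five-drive array.
for stripe in range(1, 6):
    print(stripe, " ".join(stripe_layout(stripe, 5)))
```

Over any five consecutive stripes each drive holds the parity exactly once, which is why no single drive becomes the bottleneck.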

As I mentioned before, some of the characteristics can be combined. For example, it is not uncommon to have striped arrays mirrored as well. This provides the speed of a striped array with the redundancy of a mirrored array, without the expense necessary to implement RAID 5. Such a system would probably be referred to as RAID 10 (RAID 1 plus RAID 0).
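To compare what the levels described so far cost in disk space, here is a small sketch. It assumes equally sized drives, a two-way mirror for RAID 1 and RAID 10, and one drive's worth of parity for RAID 4 and RAID 5. The function name usable_capacity is made up for this example.

```python
def usable_capacity(level, num_drives, drive_size):
    """Usable space for a few RAID levels, given equally sized drives."""
    if level == 0:
        # Striping only: the sum of all the pieces.
        return num_drives * drive_size
    if level == 1:
        # Mirroring: every drive holds the same data.
        return drive_size
    if level in (4, 5):
        # One drive's worth of space goes to parity.
        if num_drives < 3:
            raise ValueError("parity RAID needs at least three drives")
        return (num_drives - 1) * drive_size
    if level == 10:
        # Striped pairs of mirrored drives: half the raw space.
        if num_drives < 4 or num_drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least four")
        return (num_drives // 2) * drive_size
    raise ValueError(f"unsupported RAID level: {level}")


# Four 500 MB drives under different levels.
for level in (0, 5, 10):
    print(level, usable_capacity(level, 4, 500), "MB")
```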

Regardless of how long your drives are supposed to last, they will eventually fail. The question is when. On a server, a crashed hard disk means that many if not all of your employees are unable to work until the drive is replaced. However, there are a couple of ways of limiting the effects of the crash. First, you can keep the system from going down unexpectedly. Second, you can protect the data already on the drive.

The key issue with RAID is the mechanism the system uses to present the multiple drives as a single one. The two solutions are quite simply hardware and software. With hardware RAID, the SCSI host adapter does all of the work. Basically, the operating system does not even see that there are multiple drives. Therefore, you can use hardware RAID with operating systems that do not have any RAID support of their own.

On the other hand, software RAID is less expensive. Linux includes software RAID support, so there is no additional cost. However, to me this is no real advantage as the initial hardware costs are a small fraction of the total cost of running the system. Maintenance and support play a much larger role, so these ought to be considered before the cost of the actual hardware. In its Annual Disaster Impact Research, Microsoft reports that on average a downed server costs at least $10,000 per hour. Think about how many RAID controllers you can buy with that money.

Another advantage of software RAID is the ability to use different types of drives. Although the partitions need to be the same size, the physical hardware can be different, such as IDE and SATA.

In addition, the total cost of ownership also includes user productivity. Should a drive fail, performance degrades faster with a software solution than with a hardware solution.

Let’s take an Adaptec AA-133SA RAID controller as an example. At the time of this writing it is one of the top-end models and provides three Ultra SCSI channels, which means you could theoretically connect 45 devices to this single host adapter. Since each of the channels is Ultra SCSI, you have a maximum throughput of 120 MB/s. At the other end of the spectrum is the Adaptec AAA-131CA, which is designed more for high-end workstations, as it only supports mirroring and striping.

One thing to note is that the Adaptec RAID host adapters do not just provide the interface, which makes multiple drives appear as one. Instead, they all include a coprocessor, which increases the performance of the drives considerably.

However, providing data faster and with redundancy is not all of it. Adaptec RAID controllers also have the ability to detect errors and in some cases correct errors on the hard disk. Many SCSI systems can already detect single-bit errors. However, using the parity information from the drives, the Adaptec RAID controllers can correct these single-bit errors. In addition, the Adaptec RAID controllers can also detect 4-bit errors.

You need to also keep in mind the fact that maintenance and administration are more costly than the initial hardware. Even though you have a RAID 5 array, you still need to replace the drive should it fail. This brings up two important aspects.

First, how well can your system detect the fact that a drive has failed? Whatever mechanism you choose must be in a position to immediately notify the administrators should a drive fail.
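For Linux software RAID, one simple way to detect a failed drive is to look at /proc/mdstat, where each array has a status field such as [UU] and an underscore marks a missing member. The sketch below is only an illustration: the function name degraded_arrays and the sample text are invented, the notification step is left out, and hardware RAID controllers do not appear in this file at all.

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like [UU] when healthy and [U_] when a member is missing.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
    return degraded


# Illustrative sample, not output captured from a real system.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      8281408 blocks [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a running system you would pass in the contents of /proc/mdstat instead of the sample and send a mail or alert if the returned list is not empty.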

The second aspect returns to the fact that maintenance and administration costs are much higher than the cost of the initial hardware. If the hardware makes replacing the drive difficult, you increase your downtime and therefore the maintenance costs increase. Adaptec has addressed this issue by allowing you to “hot swap” your drives. This means you can replace the defective drive on a running system, without having to shut down the operating system.

Note that this also requires that the case containing the RAID drive be accessible. If your drives are in the same case as the CPU (such as traditional tower cases), you often have difficulty getting to the drives. Removing one while the system is running is not practical. The solution is an external case, which is specifically designed for RAID.

Often you can configure the SCSI ID of the drive with dials on the case itself and sometimes the position in the case determines the SCSI ID. Typically, the drives are mounted onto rails, which slide into the case. Should one fail, you simply slide it out and replace it with the new drive.

Protecting your data and being able to replace the drive is just a start. The next level up is what is referred to as “hot spares.” Here, you have additional drives already installed that are simply waiting for another to break down. As soon as a failure is detected, the RAID card replaces the failed drive with a spare drive, reconfigures the array to reflect the new drive and reports the failure to the administrator. Keep in mind that this must be completely supported in the hardware.

If you have an I/O-bound application, a failed drive decreases the performance. Instead of just delivering the data, your RAID array must calculate the missing data using the parity information, which means it has a slower response time in delivering the data. The degraded performance continues until you replace the drive. With a hot spare, the RAID array is rebuilding itself as it is delivering data. Although performance is obviously degraded, it is to a lesser extent than having to swap the drives manually.

If you have a CPU-bound application, you obtain substantial increases in performance with hardware RAID over software solutions. With software RAID, if a drive fails, the operating system needs to perform the parity calculations in order to reconstruct the data. This keeps the CPU from doing other tasks and performance is degraded. Because the Adaptec RAID controller does all of the work of reconstructing the data, the CPU doesn’t even notice it. In fact, even while the system is running normally, the RAID controller is doing the appropriate calculations, so there is no performance lost here either.

In addition, the Adaptec RAID controllers can be configured to set the priority of performance versus availability. If performance is given a high priority, it will take longer to restore the data. If availability is given the higher priority, performance suffers. Either is valid, depending on your situation. It is also possible to give each the same priority.

Because the new drive contains no data, it must take the time to re-create the data using the parity information and the data from the other drives. During this time performance will suffer as the system is working to restore the data on the failed drive.
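What the rebuild has to do can be sketched as follows, again assuming XOR parity. The names parity and rebuild_drive are hypothetical, and the tiny three-drive array is only there to have something to rebuild. Note that every stripe on every surviving drive must be read, which is why performance suffers during the rebuild.

```python
from functools import reduce
from operator import xor


def parity(*blocks):
    """Byte-wise XOR of equally sized blocks (assumed parity scheme)."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_drive(drives, failed):
    """Recreate every block of the failed drive from the surviving drives.

    drives is a list with one list of blocks per drive. The contents of
    drives[failed] are never read.
    """
    survivors = [blocks for i, blocks in enumerate(drives) if i != failed]
    # For each stripe, XOR one block from every surviving drive. This works
    # whether the lost block held data or parity.
    return [parity(*stripe) for stripe in zip(*survivors)]


# Three drives, two stripes, with the parity block on a different drive
# in each stripe.
drives = [
    [b"ab", b"cd"],
    [b"ef", parity(b"cd", b"gh")],
    [parity(b"ab", b"ef"), b"gh"],
]

replacement = rebuild_drive(drives, failed=1)
assert replacement == drives[1]
```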

Redundancy like this (and therefore the safety of your data) can be increased further by having redundant RAID 5 arrays. For example, you could mirror the entire RAID set. This is often referred to as RAID 51, as it is a combination of RAID 5 and RAID 1, although RAID 51 was not defined in the original RAID paper. Basically, this is a RAID array which is mirrored. Should a drive fail, not only can the data be recovered from the parity information, but it can also be copied from its mirror. You might also create a RAID 15 array. This is a RAID 5 array which is made up of mirror sets.