Hard Disks | The Linux Tutorial

Hard Disks

You’ve got to have a hard disk. You could actually run Linux from a floppy (even a high-density 5.25″ floppy), but life is so much easier when you run from the hard disk. Not only do you have space to save your files, you have access to all the wonderful tools that Linux provides. Disk drives provide a more permanent method for storing data, keeping it on spinning disk platters. To write data, a tiny head magnetizes minute particles on the platter’s surface. The data is read by a head, which can detect whether a particular minute particle is magnetized. (Note that running Linux off of a floppy, for example as a router or firewall, is not a bad idea. You still have the basic functionality you need and there is very little to hack.)

A hard disk is composed of several physical disks, called platters, which are made of aluminum, glass or ceramic-composites and coated with either an “oxide” media (the stuff on the disks) or “thin film” media. Because “thin film” is thinner than oxide, the denser (that is, larger) hard disks are more likely to have thin film. The more platters you have, the more data you can store.

Platters are usually the same size as floppies. Older platters were 5.25″ round, and the newer ones are 3.5″ round. (If someone knows the reason for this, I would love to hear it.) In the center of each platter is a hole through which the spindle sticks. In other words, the platters rotate around the spindle. The functionality is the same as with a phonograph record. (Remember those?) The platters spin at a constant speed that can vary between 3000 and 15,000 RPM depending on the model. Compare this to a floppy disk, which only spins at 360 RPM. The disk’s read/write heads are responsible for reading and writing data and there is a pair for each platter, one head for each surface. The read/write heads do not physically touch the surface of the platters; instead they float on a very thin (10 millionths of an inch) cushion of air. The read/write heads are moved across the surface of the platters by an actuator. Because all of the read/write heads are attached together, they all move across the surfaces of the platters together.

The media that coats the platters is very thin, about 30 millionths of an inch. The media has magnetic properties that can change its alignment when it is exposed to a magnetic field. That magnetic field comes in the form of the hard disk’s read/write heads. It is the change in alignment of this magnetic media that enables data to be stored on the hard disk.

As I said earlier, a read/write head does just that: it reads and writes. There is usually one head per surface of the platters (top and bottom). That means that there are usually twice as many heads as platters. However, this is not always the case. Sometimes the top- and bottom-most surfaces do not have heads.

The head moves across the platters that are spinning several thousand times a minute (at least 60 times a second!). The gap between the head and platter is smaller than a human hair, smaller than a particle of smoke. For this reason, hard disks are manufactured and repaired in rooms where the number of particles in the air is fewer than 100 particles per cubic foot.

Because of this very small gap and the high speeds at which the platters are rotating, if the head comes into contact with the surface of a platter, the result is (aptly named) a head crash. More than likely this will cause some physical damage to your hard disk. (Imagine burying your face into an asphalt street going even only 20 MPH!)

The heads move in and out across the platters by the older stepping motor or the new, more efficient voice-coil motor. Stepping motors rotate and monitor their movement based on notches or indentations. Voice-coil motors operate on the same principle as a stereo speaker. A magnet inside the speaker causes the speaker cone to move in time to the music (or with the voice). Because there are no notches to determine movement, one surface of the platters is marked with special signals. Because the head above this surface has no write capabilities, this surface cannot be used for any other purpose.

The voice-coil motor enables finer control and is not subject to the problems of heat expanding the disk because the marks are expanded as well. Another fringe benefit is that because the voice-coil operates on electricity, once power is removed, the head moves back to its starting position because it no longer is resisting a “retaining” spring. This is called “automatic head parking.”

Each surface of the platter is divided into narrow, concentric circles called tracks. Track 0 is the outermost track and the highest numbered track is the track closest to the central spindle. A cylinder is the set of all tracks with the same number. So all of the 5th tracks from each side of every platter in the disk are known as cylinder 5. As the number of cylinders is the same as the number of tracks on a surface, you often see disk geometries described in terms of cylinders. Each track is divided into sectors. A sector is the smallest unit of data that can be written to or read from a hard disk and it is also the disk’s block size. A common sector size is 512 bytes. The sector size is set when the disk is formatted, usually when the disk is manufactured.

Since data is stored physically on the disk in concentric rings, the head does not spiral in like a phonograph record but rather moves in and out across the rings (the tracks). Because the heads move in unison across the surface of their respective platters, data are usually stored not in consecutive tracks but rather in the tracks that are positioned directly above or below them. Therefore, hard disks read from successive tracks on the same cylinder and not the same surface.

Think of it this way. As the disk is spinning under the head, it is busy reading data. If it needs to read more data than what fits on a single track, it has to get it from a different track. Assume data were read from consecutive tracks. When the disk finished reading from one track, it would have to move in (or out) to the next track before it could continue. Because tracks are rings and the end is the beginning, the delay in moving out (or in) one track causes the beginning of the next track to spin past the position of the head before the disk can start reading it. Therefore, the disk must wait until the beginning comes around again. Granted, you could stagger the start of each track, but this makes seeking a particular spot much more difficult.

Let’s now look at when data are read from successive tracks of the same cylinder (that is, one complete cylinder is read before the heads move on). Once the disk has read the entire contents of a track and has reached the end, the beginning of the track just below it is just now spinning under its head. Therefore, by switching the head it is reading from, the disk can begin to read (or write) as though nothing was different. No movement must take place and the reads occur much faster.

Each track is broken down into smaller chunks called sectors. The number of sectors into which each track is divided is called sectors per track, or sectors/track. Although any value is possible, common values for sectors/track are 17, 24, 32, and 64. (These are shown graphically in Figure 0-12.)

Each sector contains 512 bytes of data. However, each sector can contain up to 571 bytes of information. Each sector contains information that indicates the start and end of the sector, which is only ever changed by a low-level format. In addition, space is reserved for a checksum of the data portion of the sector. If the calculated checksum does not match the checksum in this field, the disk will report an error.

Image: Logical Components of a Hard Disk

This difference between the total number of bytes per sector and the actual amount of data has been cause for a fair amount of grief. For example, in trying to sell you a hard disk, the salesperson might praise the tremendous amount of space that the hard disk has. You might be amazed at the low cost of a 1G drive.

There are two things to watch out for. Computers count in twos, humans count in tens. Despite what the salesperson wants you to believe (or believes himself), a hard disk with 1 billion bytes is not a 1 gigabyte drive; it is only 10⁹ bytes. One gigabyte means 2³⁰ bytes. A hard disk with 10⁹ (1 billion) bytes is only about 954 megabytes. This is about seven percent smaller!
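The gap between the two ways of counting is easy to check for yourself. The following is a minimal Python sketch; the function names are my own and purely illustrative.

```python
# Compare a "marketing" gigabyte (10**9 bytes) with a binary gigabyte (2**30 bytes).
DECIMAL_GB = 10**9
BINARY_GB = 2**30
BINARY_MB = 2**20


def binary_megabytes(num_bytes):
    """Express a byte count in megabytes of 1,048,576 bytes each."""
    return num_bytes / BINARY_MB


def shortfall_percent(num_bytes, expected_bytes):
    """How much smaller num_bytes is than expected_bytes, in percent."""
    return 100 * (1 - num_bytes / expected_bytes)


print(f"{binary_megabytes(DECIMAL_GB):.1f} MB")                  # 953.7 MB
print(f"{shortfall_percent(DECIMAL_GB, BINARY_GB):.1f}% short")  # 6.9% short
```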

The next thing is that the seller will often state the unformatted storage capacity of a drive. This is the number that you would get if you multiplied all the sectors on the disk by 571 (see the preceding discussion). Therefore, the unformatted size is irrelevant to almost all users. Typical formatted MFM drives give the user 85 percent of the unformatted size, and RLL drives give the user about 89 percent. (MFM and RLL are formatting standards, the specifics of which are beyond the scope of this book.)

This brings up an interesting question. If the manufacturer is telling us the unformatted size and the formatted size is about 85 percent for MFM and 89 percent for SCSI/IDE (using RLL), how can I figure out how much usable space there really is? Elementary, my dear Watson: It’s a matter of multiplication.

Let’s start at the beginning. Normally when you get a hard disk, it comes with reference material that indicates how many cylinders, heads, and sectors per track there are (among other things). The set of all tracks at the same distance from the spindle is a cylinder. The number of cylinders is therefore simply the number of tracks on a surface, because a track is on one surface and a cylinder is all tracks at the same distance. Because you can only use those surfaces that have a head associated with them, you can calculate the number of total tracks by multiplying the number of cylinders by the number of heads. In other words, take the number of tracks on a surface and multiply it by the number of surfaces. This gives you the total number of tracks.

From our discussion of tracks, you know that each track is divided into a specific number of sectors. To find the total number of sectors, simply multiply the number of total tracks that we calculated above by the sectors per track. Once you have the total number of sectors, multiply this by 512 (the number of bytes of data in a sector). This gives us the total number of bytes on the hard disk. To figure out how many megabytes this is, simply divide this number by 1,048,576 (1024 x 1024 = 1MB).
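The multiplication can be written out as a short Python sketch. The function name is made up for this example, and the geometry used (615 cylinders, 4 heads, 17 sectors per track) is just an illustrative set of numbers, not taken from any reference material discussed here.

```python
BYTES_PER_SECTOR = 512          # data bytes per sector
BYTES_PER_MB = 1024 * 1024      # 1,048,576


def formatted_capacity(cylinders, heads, sectors_per_track):
    """Return (total bytes, megabytes) for a given disk geometry."""
    total_tracks = cylinders * heads                  # tracks per surface x surfaces
    total_sectors = total_tracks * sectors_per_track
    total_bytes = total_sectors * BYTES_PER_SECTOR
    return total_bytes, total_bytes / BYTES_PER_MB


# Illustrative geometry only.
total_bytes, megabytes = formatted_capacity(cylinders=615, heads=4, sectors_per_track=17)
print(total_bytes, round(megabytes, 1))  # 21411840 20.4
```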

Hard disks can be further subdivided into partitions. A partition is a large group of sectors allocated for a particular purpose. Partitioning a disk allows the disk to be used by several operating systems or for several purposes. A lot of Linux systems have a single disk with three partitions: one containing a DOS filesystem, another an EXT2 filesystem and a third for the swap partition. The partitions of a hard disk are described by a partition table, each entry describing where the partition starts and ends in terms of heads, sectors and cylinder numbers. For DOS formatted disks, those formatted by fdisk, there are four primary disk partitions. Not all four entries in the partition table have to be used. There are three types of partition supported by fdisk: primary, extended and logical. Extended partitions are not real partitions at all; they contain any number of logical partitions. Extended and logical partitions were invented as a way around the limit of four primary partitions. The following is the output from fdisk for a disk containing two primary partitions:

Disk /dev/sda: 64 heads, 32 sectors, 510 cylinders
Units = cylinders of 2048 * 512 bytes
   Device Boot   Begin    Start      End   Blocks   Id  System
/dev/sda1            1        1      478   489456   83  Linux native
/dev/sda2          479      479      510    32768   82  Linux swap
Expert command (m for help): p
Disk /dev/sda: 64 heads, 32 sectors, 510 cylinders
Nr AF  Hd Sec  Cyl  Hd Sec  Cyl   Start    Size ID
 1 00   1   1    0  63  32  477      32  978912 83
 2 00   0   1  478  63  32  509  978944   65536 82
 3 00   0   0    0   0   0    0       0       0 00
 4 00   0   0    0   0   0    0       0       0 00

This shows that the first partition starts at cylinder or track 0, head 1 and sector 1 and extends to include cylinder 477, sector 32 and head 63. As there are 32 sectors in a track and 64 read/write heads, this partition is a whole number of cylinders in size. fdisk aligns partitions on cylinder boundaries by default. It starts at the outermost cylinder (0) and extends inwards, towards the spindle, for 478 cylinders. The second partition, the swap partition, starts at the next cylinder (478) and extends to the innermost cylinder of the disk.
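To see how the cylinder, head and sector values in the expert listing relate to its Start and Size columns, here is a small Python sketch. It assumes the usual convention that cylinders and heads are numbered from 0 and sectors from 1; the function names are my own.

```python
HEADS = 64               # from the fdisk header above
SECTORS_PER_TRACK = 32


def chs_to_sector(cylinder, head, sector):
    """Absolute sector number for a cylinder/head/sector address."""
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)


def partition_extent(start_chs, end_chs):
    """Return (start sector, size in sectors) for a partition table entry."""
    start = chs_to_sector(*start_chs)
    end = chs_to_sector(*end_chs)
    return start, end - start + 1


# Addresses are given as (cylinder, head, sector).
print(partition_extent((0, 1, 1), (477, 63, 32)))    # (32, 978912)
print(partition_extent((478, 0, 1), (509, 63, 32)))  # (978944, 65536)
```

Halving the sizes (two 512-byte sectors per 1K block) gives the 489456 and 32768 shown in the Blocks column of the first listing.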

All PC-based operating systems need to break down the hard disk into partitions. (In fact, all operating systems I can think of that use the same kind of hard disks need to create partitions.) A partition can be any size, from just a couple of megabytes to the entire disk. Each partition is defined in a partition table that appears at the very beginning of the disk. This partition table contains information about the kind of partition it is, where it starts, and where it ends. This table is the same whether you have a DOS-based PC, UNIX, or both.

Because the table is the same for DOS and UNIX, there can be only four partitions total because there are four entries in the table. DOS gets around this by creating logical partitions within one physical partition. This is a characteristic of DOS, not the partition table. Both DOS and UNIX must first partition the drive before installing the operating system and provide the mechanism during the installation process in the form of the fdisk program. Although their appearances are very different, the DOS and Linux fdisk commands perform the same function.

When you run the Linux fdisk utility, the values you see and input are all in tracks. To figure out how big each fdisk partition is, simply multiply that value by 512 and by the number of sectors per track. (Remember that each sector holds 512 bytes of data.)

A disk is usually described by its geometry, the number of cylinders, heads and sectors. For example, at boot time Linux describes one of my IDE disks as:

hdb: Conner Peripherals 540MB - CFS540A, 516MB w/64kB Cache, CHS=1050/16/63

This means that it has 1050 cylinders (tracks), 16 heads (8 platters) and 63 sectors per track. With a sector, or block, size of 512 bytes this gives the disk a storage capacity of 1050 x 16 x 63 x 512 = 541,900,800 bytes, or 529,200 kilobytes. Divided by 1024 once more, that is about 516 Mbytes, which is the capacity Linux reports; the 540MB in the model name is the same figure counted in millions of bytes. Some disks automatically find bad sectors and re-index the disk to work around them.

To physically connect itself to the rest of the computer, the hard disk has several choices: ST506/412, ESDI, SCSI, Integrated Drive Electronics (IDE)/Advanced Technology Attachment (ATA), Enhanced IDE (EIDE), Fast IDE, and the newer Serial ATA (SATA). However, the interfaces the operating system sees for ST506/412 and IDE are identical, and there is no special option for an IDE drive. At the hardware level, though, there are some differences that I need to cover for completeness.

The older IDE/ATA devices could be considered Parallel ATA (PATA), because data is transferred in parallel. The Serial ATA (SATA) bus is not simply a modification of PATA; it is safer to think of it as a completely new interface type. IDE, Fast IDE and EIDE are basically the same interface. The only significant difference is the interface cable, which can allow the system to access more physical disk space as well as increase the speed of the data transfer. For more details see the section on SATA.

To be quite honest, only ESDI and ST506/412 are really disk interfaces. SCSI and IDE are referred to as “system-level interfaces” and they incorporate ESDI into the circuitry physically located on the drive.

The ST506/412 was developed by Seagate Technologies (hence the ST) for its ST506 hard disk, which had a whopping 5MB formatted capacity. (This was in 1980 when 360K was a big floppy.) Seagate later used the same interface in their ST412, which doubled the drive capacity (which is still less hard disk space than most people have RAM. Oh well). Other drive manufacturers decided to incorporate this technology, and over the years, it has become a standard. One of its major drawbacks is that it is 15-year-old technology, which can no longer compete with the demands of today’s hard disk users.

In 1983, the Maxtor Corporation established the Enhanced Small Device Interface (ESDI) standard. ESDI offered higher reliability (because Maxtor had built the encoder/decoder directly into the drive, which reduced noise), higher transfer rates, and the ability to get drive parameters directly from the disk. This meant that users no longer had to run the computer setup routines to tell the CMOS what kind of hard disk it had.

One drawback that I have found with ESDI drives is the physical connection between the controller and the drive itself. Two cables were needed: a 34-pin control cable and a 20-pin data cable. Although the cables are different sizes and can’t (easily) be confused, the separation of control and data is something of which I was never a big fan. The connectors on the drive itself were usually split into two unequal halves. In the connector on the cable, a small piece of plastic called a key prevented the connector from being inserted improperly. Even if the key was missing, you could still tell which end was which because the pins on the hard disk were labeled and the first line on the cable had a colored stripe down its side. (This may not always be the case, but I have never seen a cable that wasn’t colored like this.)

Another drawback that I have found is that the physical location on the cable determines which drive is which. The primary drive is located at the end of the cable, and the secondary drive is in the middle. The other issue is the number of cables: a system with two ESDI drives requires three separate cables. Each drive has its own data cable and the drives share a common control cable.

Although originally introduced as the interface for hard cards (hard disks directly attached to expansion cards), the IDE (integrated drive electronics) interface has grown in popularity to the point where it is perhaps the most commonly used hard-disk interface today (rapidly being replaced by SCSI). As its name implies, the controller electronics are integrated onto the hard disk itself. The connection to the motherboard is made through a relatively small adapter, commonly referred to as a “paddle board.” From here, a single cable attaches two hard disks in a daisy chain, which is similar to the way floppy drives are connected, and often, IDE controllers have connectors and control electronics for floppy drives as well.

IDE drives often play tricks on systems by presenting a different face to the outside world than is actually on the disk. For example, because IDE drives are already pre-formatted when they reach you, they can have more physical sectors in the outer tracks, thereby increasing the overall amount of space on the disk that can be used for storage. When a request is made to read a particular block of data on the drive, the IDE electronics translate this to the actual physical location.

Because IDE drives come pre-formatted, you should never low-level format an IDE drive unless you are specifically permitted to do so by the manufacturer. You could potentially wipe out the entire drive to the point at which it must be returned to the factory for “repair.” Certain drive manufacturers, such as Maxtor, provide low-level format routines that accurately and safely low-level format your drive. Most vendors that I am aware of today simply “zero” out the data blocks when doing a low-level format. However, don’t take my word for it! Check the vendor.

Only two IDE drives can be connected with a single cable, but there is nothing special about the position of the drive on the cable. Therefore, the system needs some other way of determining which drive is which. This is done with the drives themselves. Most commonly, you will find a jumper or set of jumpers used to determine if the drive is the master, slave or master-only/single-drive. On (much) older systems, you could only have a single IDE controller, which meant only two drives. Today, it is common to find two IDE controllers built onto the motherboard, and some systems allow you to add extra IDE controllers, thereby increasing the number of IDE drives even further.

The next great advance in hard disk technology was SCSI. SCSI is not a disk interface, but rather a semi-independent bus. More than just hard disks can be attached to a SCSI-Bus. Because of its complex nature and the fact that it can support such a wide range of devices, I talked in more detail about SCSI earlier in this chapter. However, a few specific SCSI issues relate to hard disks in general and the interaction between SCSI and other types of drives.

The thing to note is that the BIOS inside the PC knows nothing about SCSI. Whether this is an oversight or intentional, I don’t know. The SCSI spec is more than 10 years old, so there has been plenty of time to include it. Because the BIOS is fairly standard from machine to machine, including SCSI support might create problems for backward compatibility.

On the other hand, the BIOS is for DOS. DOS makes BIOS calls. To be able to access all the possible SCSI devices through the BIOS, it must be several times larger. Therefore, every PC-based operating system needs to have extra drivers to be able to access SCSI devices.

Because the BIOS does not understand SCSI, you have to trick the PC’s BIOS a little to boot from a SCSI device. By telling the PC’s BIOS that no drives are installed as either C: or D:, you force it to quit before it looks for any of the other types. Once it quits, the BIOS on the SCSI host adapter has a chance to run.

The SCSI host adapter obviously knows how to boot from a SCSI hard disk and does so wonderfully. This is assuming that you enabled the BIOS on the host adapter. If not, you’re hosed.

There is also the flip side of the coin. The official doctrine says that if you have a non-SCSI boot drive, you have to disable the SCSI BIOS because this can cause problems. However, I know people who have IDE boot drives and still leave the SCSI BIOS enabled. Linux normally reacts as though the SCSI BIOS were not enabled, so what to do? I suggest that you see what works. I can only add that if you have more than one host adapter, only one should have the BIOS enabled.

Another thing is that once the kernel boots from a SCSI device, you lose access to other kinds of drives. Just because it doesn’t boot from the IDE (or whatever), does this mean you cannot access it at all? Unfortunately, yes. This is simply the way the kernel is designed. Once the kernel has determined that it has booted off a SCSI hard disk, it can no longer access a non-SCSI hard disk.

The newest member of the hard disk family is Enhanced IDE, or EIDE. The most important aspect of this new hard disk interface is its ability to access more than 504 megabytes. This limitation is because the IDE interface can access only 1024 cylinders, 16 heads, and 63 sectors per track. If you multiply this out using the formula I gave you earlier, you get 504Mb.

EIDE also has other advantages such as higher transfer rates, the ability to connect more than just two hard disks, and the ability to attach more than just hard disks. One drawback that EIDE had at the beginning was part of its very nature. To overcome the hard disk size limit that DOS had, EIDE drives employ a method called logical block addressing (LBA).

Before we talk about LBA, we need to backtrack a little. Old operating systems that accessed the BIOS directly would use interrupt 13 (INT13h) to access the disk. This limited the size of the hard disk. INT13h itself can address up to 1024 cylinders, 256 heads and 63 sectors per track, but the IDE interface allows only 16 heads, so in combination you could only access up to 1024 cylinders, 16 heads and 63 sectors per track, for a total of 528 MB of disk space (504 MB if you count a megabyte as 1,048,576 bytes). (Cylinders * Heads * Sectors * 512 bytes = Disk Capacity) Note that the combination of cylinders, heads and sectors per track is referred to as the disk “geometry” and is commonly abbreviated as CHS (Cylinders-Heads-Sectors).
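Both 504 and 528 are quoted for this limit, and a little arithmetic shows that they are the same number of bytes counted two different ways. The Python sketch below takes the smaller value of each field from the two interfaces. The INT13h figures are the ones given in the text; the IDE figures of 65536 cylinders, 16 heads and 255 sectors are the commonly cited register limits and should be treated as an assumption of this example.

```python
BYTES_PER_SECTOR = 512

INT13H_LIMIT = {"cylinders": 1024, "heads": 256, "sectors": 63}
IDE_LIMIT = {"cylinders": 65536, "heads": 16, "sectors": 255}  # assumed figures


def capacity_bytes(geometry):
    """Cylinders * Heads * Sectors * 512 bytes = Disk Capacity."""
    return (geometry["cylinders"] * geometry["heads"]
            * geometry["sectors"] * BYTES_PER_SECTOR)


# A disk must satisfy both limits, so take the smaller value of each field.
combined = {key: min(INT13H_LIMIT[key], IDE_LIMIT[key]) for key in INT13H_LIMIT}

size = capacity_bytes(combined)
print(combined)                  # {'cylinders': 1024, 'heads': 16, 'sectors': 63}
print(size)                      # 528482304
print(size / 2**20)              # 504.0  (megabytes of 1,048,576 bytes)
print(round(size / 10**6, 1))    # 528.5  (millions of bytes)
```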

To overcome this limitation, the BIOS was modified to support something called “large mode”, or more accurately Extended Cylinders Heads Sectors (ECHS). This introduced a so-called “translation layer” between the INT13h and the BIOS, allowing for disks up to 8GB (very large for the time). What ECHS did was to calculate a “best fit” of the disk geometry to fit into the 1024 cylinder limitation. By changing the other values (i.e. heads and sectors per track) the BIOS could come much closer to the actual size of the disk.
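To give a feel for what such a translation does, here is a deliberately simplified Python sketch. It is not the algorithm of any particular BIOS: it simply assumes that the cylinder count is halved and the head count doubled until the cylinders fit under 1024. The function name and the sample geometry are invented for the example.

```python
def translate_geometry(cylinders, heads, sectors,
                       max_cylinders=1024, max_heads=256):
    """Trade cylinders for heads until the cylinder count fits the limit."""
    factor = 1
    while cylinders // factor > max_cylinders:
        factor *= 2
        if heads * factor > max_heads:
            raise ValueError("disk is too large for this kind of translation")
    # Integer division may drop a few cylinders at the end of the disk.
    return cylinders // factor, heads * factor, sectors


# Hypothetical physical geometry of a drive of roughly 2GB.
print(translate_geometry(4092, 16, 63))  # (1023, 64, 63)
```

In this example the translated geometry describes exactly the same number of sectors as the physical one (4092 x 16 = 1023 x 64), but it now fits within the 1024-cylinder limit.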

The next step was logical block addressing (LBA). Like ECHS, the idea behind LBA is that the system’s BIOS would “rearrange” the drive geometry so that drives larger than 528MB could still boot. Because Linux does not use the BIOS to access the hard disk, the fact that the BIOS could handle the EIDE drive meant nothing. New drivers needed to be added to account for this.

More and more you find the (E)IDE controllers built directly onto the motherboard. On the one hand this is a good thing, since you do not need to use an expansion card slot for the controller. However, you need to be careful where it is located. I have had a few motherboards where the IDE controller was stuck between the PCI and ISA slots. This made it extremely difficult to access the pins without removing the cards in the PCI or ISA slot (sometimes both).

This might not seem like a big deal, but it may become one the next time you do anything with that machine. It is not uncommon when adding a hard disk or something else to the system that you accidentally pull on the hard disk cable. All you need to do is pull it no more than a quarter of an inch before some plugs are no longer connected to the pins. Because this connection is almost impossible to see, you don’t notice that the cable has come loose. When you reboot the machine nothing works, as the system is getting signals through only some of the lines. If the pins for the IDE controller are out in the open, you may still pull on the cable, but it is easier to see and far easier to fix.

There is much more to choosing the right hard disk than its size. Although size determines how much data you can store, it tells you nothing about the speed at which you can access that data. How quickly you can access your data is the true measure of performance.

Unfortunately, there is no one absolute measure of hard disk performance. The reason is simply that data access occurs in so many different ways that it is often difficult for even the experienced administrator to judge which drive is better. However, there are several different characteristics of hard disks which, when viewed together, give you a good idea of the overall performance of a drive.

One characteristic that is often quoted is the seek time. This refers to the time needed to move the read/write head between tracks. If the data is not on the same track, it could mean moving the head a couple thousand tracks in either direction. Movement from one track to the next adjacent one might take only 2 ms, whereas moving across the entire diameter of the drive might take 20 ms.

So which one do you use? Typically, neither. When access times are specified, you normally see the average seek time. This is measured as the average time between randomly located tracks on the disk. Typical ranges at the time of this writing are between 8 ms and 14 ms. The problem is that disk access is often (usually?) not random. Depending on how you work, you read a large number of blocks at once, such as to load a WordPerfect file. Therefore, average seek time does not reflect access of large pieces of data.

Once the head has moved to the track you need, you are not necessarily ready to work. You need to wait until the right block is under the read/write head. The time the drive takes to reach that point is called rotational latency. The faster the drive spins, the more quickly the block is underneath the head. Therefore, rotational latency is directly related to the rotational speed (rpm) of your drive.

By increasing the rotational speed of a drive you obviously decrease the time the drive has to wait. The fastest drives as I am writing this spin at least 7200 times per minute, which means that they have an average rotational latency of about 4.2 ms.
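The figure for 7200 rpm follows from the fact that, on average, the block you want is half a revolution away. A minimal Python sketch of that calculation (the function names are my own):

```python
def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000 / rpm


def average_rotational_latency_ms(rpm):
    """On average the wanted block is half a revolution away."""
    return rotation_time_ms(rpm) / 2


for rpm in (3600, 5400, 7200, 15000):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))
# 3600 8.33
# 5400 5.56
# 7200 4.17
# 15000 2.0
```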

You can also decrease the rotational latency by staggering the start of each track. This is especially effective when doing sequential reads across tracks. If the start of all tracks were at the same place, the head would move to the new track and the start of the track would have already spun out from underneath. If the tracks are staggered, the head has to wait less time (less rotational latency) until the start of the track is underneath.

Think back to our discussion of hard disks and the concept of a cylinder. This is all of the tracks at the same distance from the spindle. To physically move heads from one track to another takes more time than simply switching which head you are using. However, because switching heads does not occur instantaneously, there is a certain amount of rotational latency. Therefore, the start of each track is staggered as one moves up and down the cylinder, as well as across the cylinders.

By decreasing the rotational latency, we increase the speed at which the head reaches the right position. Once we are there, we can begin reading. The time it takes to get to that point is the average access time. This, too, is measured in milliseconds.

Still, this is not the complete measure of the performance of our drive. Although it is nice that the drive can quickly begin to read, this does not necessarily mean that it will read the data fast. The faster the hard disk can read the data, the faster your WordPerfect file is loaded. This depends on the transfer rate, which is normally measured in megabytes per second.

However, the actual transfer rate is not necessarily what the hard disk manufacturer says it is. They may have given the transfer rate in terms of the maximum or average sustained transfer rate. This is important to understand. If you have one huge 200MB file that you are reading on a new drive, the entire file might be contiguous. Therefore, there is very little movement of the heads as the file is read. This would obviously increase the average transfer rate. However, if you have two hundred 1MB files spread out all over the disk, you will definitely notice a lower transfer rate.
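A rough model makes the difference visible. In the Python sketch below, every separate piece of data costs one average seek plus one average rotational latency before the transfer can start. The seek time, latency and sustained rate are hypothetical values chosen for illustration, not measurements of any real drive, and the model ignores caching and fragmentation within a file.

```python
def read_time_ms(total_mb, pieces, seek_ms, latency_ms, transfer_mb_per_s):
    """Time to read total_mb of data stored in a number of separate pieces."""
    positioning = pieces * (seek_ms + latency_ms)       # get the head in place
    transfer = total_mb / transfer_mb_per_s * 1000      # actually read the data
    return positioning + transfer


def effective_rate_mb_per_s(total_mb, time_ms):
    """The transfer rate you actually observe."""
    return total_mb / (time_ms / 1000)


# Hypothetical drive: 10 ms average seek, 4.2 ms latency, 10 MB/s sustained.
one_file = read_time_ms(200, 1, 10, 4.2, 10)
many_files = read_time_ms(200, 200, 10, 4.2, 10)

print(round(effective_rate_mb_per_s(200, one_file), 2))
print(round(effective_rate_mb_per_s(200, many_files), 2))
```

The second figure comes out lower than the first, because 200 separate positioning delays are added to the same amount of transfer time.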

In addition, this is another case of the chain being only as strong as its weakest link. The actual transfer rate is dependent on other factors as well. A slow hard disk controller or slow system bus can make a fast hard disk display bad performance.

Another aspect is how much of the data is being re-read. For example, if you read the same 1MB file two hundred times, the head won’t move much. This is not a bad thing, as data is often read repeatedly. Hard disk manufacturers are aware of this and therefore will add caches to the hard disk to improve performance. Data that is read from the hard disk can be stored in the cache so if it is needed again, it can be accessed more quickly than if it must first be read from the drive. Data that is written may also be needed again, so it too can be re-read from the cache.

This is also called a cache buffer, because it also serves to buffer the data. Sometimes the hard disk cannot keep up with the CPU. It may be that the disk is writing something as new data comes in. Rather than making the CPU wait, the data is written to the cache, which the hard disk can read when it can. Other times, the CPU is doing something else when the data from the hard disk is ready. The hard disk can write it to the buffer and the CPU can take it when it can.

Finally, there is data throughput. This is a measure of the total amount of data the CPU can access in a given amount of time. Since the data is going through the harddisk controller and through the system bus, this may not be a good measure of performance of the drive itself. However, if the other components can process the data as quickly as the drive can provide it, it is a good measure of the complete system.

This is why you will see references to Seagate drives on the Adaptec web site. Adaptec understands the relationship between the components in your system. Therefore, they suggest drives that can keep up with the other components such as the appropriate ones from Seagate.
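The weakest-link idea amounts to taking the minimum over the components the data passes through. The sketch below uses made-up figures for the drive, controller and system bus purely to illustrate that point.

```python
def system_throughput(rates_mb_s):
    """Return the slowest component and its rate; that link caps the chain."""
    name = min(rates_mb_s, key=rates_mb_s.get)
    return name, rates_mb_s[name]

# Hypothetical figures, chosen only to illustrate the point.
chain = {"drive": 38.0, "controller": 20.0, "system bus": 133.0}
bottleneck, rate = system_throughput(chain)
print(f"{bottleneck} limits throughput to {rate} MB/s")
```

Here the fast drive gains nothing, because the controller cannot pass the data along any faster than 20 MB/s.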

Another aspect of the administration costs that a lot of people do not think about is the drive designation. Calling a harddisk “WhirlWind” or “Falcon” might please the marketing people or an IT manager who has no clue about the technical details. However, the administrator is not interested in what name a drive has but rather in its characteristics. If it takes a long time to figure out the characteristics, the total cost of ownership has increased.

How often have you had to wade through pages and pages on a company’s Internet site to figure out how big a particular model was? Although many (most?) companies have a 1:1 relationship between the model designation and the characteristics, you first have to figure out the scheme, as it is often not posted anywhere on the site.

This is one reason why I keep coming back to Seagate. Without thinking I can come up with the model number or something very close. The general format is:

ST<F><MB><INT>

Where:
<F> = Form factor, such as 3.5″, 3.5″ half-high, 5.25″, etc.
<MB> = Approximate size in megabytes.
<INT> = Interface.

So, looking at my drive, which is an ST39140A, I can quickly tell that it is form factor 3 (a 3.5″ drive, 1″ high), that it has approximately 9140 MB and that it has an ATA interface. Granted, some of the abbreviations used for the interface take a little getting used to. However, the naming scheme is consistent and very easy to figure out.
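Because the scheme is regular, it can also be decoded mechanically. The sketch below assumes the form factor is a single digit and the interface code is made up of letters, which fits the ST39140A example. It decodes only the two codes mentioned in the text (form factor 3 and interface A) and reports any other code as-is rather than guessing at its meaning. The function and table names are made up for the example.

```python
import re

# Only the codes mentioned in the text are decoded; others are reported raw.
FORM_FACTORS = {"3": "form factor 3 (1 inch high)"}
INTERFACES = {"A": "ATA"}

# ST, one form-factor digit, the size in MB, then the interface letters.
MODEL_RE = re.compile(r"^ST(\d)(\d+)([A-Z]+)$")

def parse_model(model):
    """Split a model number of the form ST<F><MB><INT> into its parts."""
    match = MODEL_RE.match(model.strip().upper())
    if match is None:
        raise ValueError(f"not in ST<F><MB><INT> form: {model!r}")
    form, megabytes, interface = match.groups()
    return {
        "form_factor": FORM_FACTORS.get(form, f"code {form}"),
        "approx_mb": int(megabytes),
        "interface": INTERFACES.get(interface, f"code {interface}"),
    }

info = parse_model("ST39140A")
print(info)
```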

As with other hardware, your choice of harddisk is also guided by the reputation of the company. This applies not only to what you have heard, but also to your own personal experience. Often it is more than just having heard or read that a particular manufacturer is bad, but rather an issue of being “sure.” This is why I will never buy an IBM harddisk again. All three I have bought were defective. Although other people claim not to have had problems with them, I do not want to risk my data on them. Three times is too much of a coincidence, and I would not feel safe if I installed an IBM harddisk in any of my machines, nor would I have a clear conscience if I installed one in a customer’s machine.

On the other hand, I have owned a proportionally large number of Seagate drives since I first started working with computers, none of which has ever given me problems. So far, all of my Seagate drives have been replaced with larger drives, not because they failed, but because they had grown too small. There are only so many bays in a computer case and filling them up with small drives is not worth it. Instead, I got larger drives.

In addition to the size and speed of your drive, one important consideration is the interface to the harddisk. Typically, SCSI harddisks are more expensive than ATA drives, even if you ignore the extra cost of the SCSI host adapter. Even if you want to ignore the extra cost of acquiring the drive, you need to consider the cost of installing and managing the host adapter. On top of that, the performance increase you get with SCSI is negligible for workstations. Generally, you do not need the extra throughput that SCSI can provide.

In most cases, space will be an issue. Although you need just a few hundred megabytes for the operating system, applications keep getting larger, with dozens of components that quickly fill up the space on your harddisk. Buying and installing a new ATA harddisk is generally simpler than adding a SCSI harddisk, particularly if your first harddisk is ATA. In addition, on newer systems you can have up to four ATA devices, including CD-ROM drives, which is generally sufficient for workstations as well as for mobile users.

On the other hand, if you are in an environment where you need more than four devices or need devices that do not support ATA, then you will have to go with SCSI. In addition, SCSI is basically a must when talking about your server. Size isn’t an issue, as what is available is generally the same for ATA and SCSI. The key difference is performance. This is particularly important in a multi-user environment.

Let’s take the Seagate Cheetah as an example. As of this writing, it is the fastest drive available on the market at 10,000 RPM. It has a maximum internal transfer rate of 306 Mbit/s, which means it is even faster than the 80 Mbit/s of the Ultra SCSI interface. To compensate for that difference, the Cheetah series has a default buffer size of 1 MB. With an average seek time of 6 milliseconds and an average latency of 2.99 milliseconds, the average access time is under 9 milliseconds. In addition, the throughput is too high to use anything other than SCSI or Fibre Channel, so it is not available with an ATA interface.
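The latency and access time figures follow from simple arithmetic: on average the platter has to make half a revolution before the wanted sector arrives under the head, and the access time is the seek time plus that latency. A short sketch, using only the numbers quoted above (the function names are made up for the example):

```python
def avg_rotational_latency_ms(rpm):
    """Average latency is the time the platter needs for half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0

def avg_access_ms(seek_ms, latency_ms):
    """Average access time is average seek time plus average latency."""
    return seek_ms + latency_ms

latency = avg_rotational_latency_ms(10_000)   # half a turn at 10,000 RPM
access = avg_access_ms(6.0, 2.99)             # figures quoted for the Cheetah
internal_mb_s = 306 / 8                       # 8 bits per byte

print(f"latency at 10,000 RPM: {latency:.2f} ms")
print(f"average access time:   {access:.2f} ms")
print(f"306 Mbit/s is {internal_mb_s:.2f} MB/s")
```

The computed half-revolution time of 3 ms agrees with the quoted 2.99 ms to within rounding.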

There are also a few other reasons why something like the Cheetah is the perfect solution for a server. First, it supports up to 15 devices on a single wide SCSI bus. Using the Fibre Channel versions, you can get up to 126 devices, which are also hot swappable.

Another thing to consider is the maintenance and administration. Low-end Medalist drives have an expected mean time between failures (MTBF) of 400,000 hours, which is about 45 years. The MTBF for the Cheetah is approximately 1,000,000 hours, or over 100 years. No wonder I haven’t ever had a harddisk crash.
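Converting these MTBF figures from hours to years is a one-line calculation, assuming the drive runs around the clock for 365 days a year:

```python
HOURS_PER_YEAR = 24 * 365   # assumes continuous, round-the-clock operation

def mtbf_years(hours):
    """Convert an MTBF given in hours to years of continuous operation."""
    return hours / HOURS_PER_YEAR

for name, hours in [("Medalist", 400_000), ("Cheetah", 1_000_000)]:
    print(f"{name}: {mtbf_years(hours):.1f} years")
```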

The Seagate drives also do something else to reduce maintenance and administration costs. Seagate calls it SeaShield, and it is something other harddisk manufacturers should adopt. This is simply a protective cover around the electronics that are exposed on other harddisks. It protects the electronics from static electrical discharge, as well as from damage caused by bumping the drive against something. In addition, this cover provides the perfect space for installation instructions, like the jumper settings. There is no need to go hunting around for the data sheet, which often isn’t supplied with the drive. Talk about saving administration costs!

Some of you might be saying that names like Cheetah go against my desire to have understandable model names. My answer is that the opposite is true. As of this writing, Seagate has four primary series: Medalist, Medalist Pro, Barracuda and Cheetah. The series name simply tells you the rotation rate, which is 5400, 7200, 7200 and 10,000 RPM respectively. The Medalist is Ultra ATA. The Medalist Pro is either ATA or SCSI. The Barracuda and Cheetah are either SCSI or Fibre Channel. Okay, this requires you to use your brain a little, but it is far easier than with many other vendors.