You’ve got to have a hard disk. You could actually run Linux from a floppy (even a high-density 5.25″ floppy), but life is so much easier when you run from the hard disk. Not only do you have space to save your files, you have access to all the wonderful tools that Linux provides. Disk drives provide a more permanent method for storing data, keeping it on spinning disk platters. To write data, a tiny head magnetizes minute particles on the platter’s surface. The data is read by a head, which can detect whether a particular minute particle is magnetized. (Note that running Linux off of a floppy, for something such as a router or firewall, is not a bad idea. You still have the basic functionality you need and there is very little to hack.)
A hard disk is composed of several physical disks, called platters, which are made of aluminum, glass or ceramic-composites and coated with either an “oxide” media (the stuff on the disks) or “thin film” media. Because “thin film” is thinner than oxide, the denser (that is, larger) hard disks are more likely to have thin film. The more platters you have, the more data you can store.
Platters are usually the same size as floppies. Older platters were 5.25″ round, and the newer ones are 3.5″ round. (If someone knows the reason for this, I would love to hear it.) In the center of each platter is a hole through which the spindle sticks. In other words, the platters rotate around the spindle. The functionality is the same as with a phonograph record. (Remember those?) The platters spin at a constant speed that can vary between 3000 and 15,000 RPM depending on the model. Compare this to a floppy disk, which only spins at 360 RPM. The disk’s read/write heads are responsible for reading and writing data and there is a pair for each platter, one head for each surface. The read/write heads do not physically touch the surface of the platters. Instead, they float on a very thin (10 millionths of an inch) cushion of air. The read/write heads are moved across the surface of the platters by an actuator. Because all of the read/write heads are attached together, they all move across the surfaces of the platters together.
The media that coats the platters is very thin, about 30 millionths of an inch. The media has magnetic properties that can change its alignment when it is exposed to a magnetic field. That magnetic field comes from the hard disk’s read/write heads. It is the change in alignment of this magnetic media that enables data to be stored on the hard disk.
As I said earlier, a read/write head does just that: it reads and writes. There is usually one head per surface of the platters (top and bottom). That means that there are usually twice as many heads as platters. However, this is not always the case. Sometimes the top- and bottom-most surfaces do not have heads.
The head moves across the platters that are spinning several thousand times a minute (at least 60 times a second!). The gap between the head and platter is smaller than a human hair, smaller than a particle of smoke. For this reason, hard disks are manufactured and repaired in rooms where the number of particles in the air is fewer than 100 particles per cubic meter.
Because of this very small gap and the high speeds at which the platters are rotating, if the head comes into contact with the surface of a platter, the result is (aptly named) a head crash. More than likely this will cause some physical damage to your hard disk. (Imagine burying your face into an asphalt street going even only 20 MPH!)
The heads move in and out across the platters by the older stepping motor or the new, more efficient voice-coil motor. Stepping motors rotate and monitor their movement based on notches or indentations. Voice-coil motors operate on the same principle as a stereo speaker. A magnet inside the speaker causes the speaker cone to move in time to the music (or with the voice). Because there are no notches to determine movement, one surface of the platters is marked with special signals. Because the head above this surface has no write capabilities, this surface cannot be used for any other purpose.
The voice-coil motor enables finer control and is not subject to the problems of heat expanding the disk because the marks are expanded as well. Another fringe benefit is that because the voice-coil operates on electricity, once power is removed, the head moves back to its starting position because it no longer is resisting a “retaining” spring. This is called “automatic head parking.”
Each surface of the platter is divided into narrow, concentric circles called tracks. Track 0 is the outermost track and the highest numbered track is the track closest to the central spindle. A cylinder is the set of all tracks with the same number. So all of the 5th tracks from each side of every platter in the disk are known as cylinder 5. As the number of cylinders is the same as the number of tracks on a surface, you often see disk geometries described in terms of cylinders. Each track is divided into sectors. A sector is the smallest unit of data that can be written to or read from a hard disk and it is also the disk’s block size. A common sector size is 512 bytes. The sector size is set when the disk is formatted, usually when the disk is manufactured.
Since data is stored physically on the disk in concentric rings, the head does
not spiral in like a phonograph record but rather moves in and out across the
rings (the tracks). Because the heads move in unison across the
surface of their respective platters, data are usually stored not in consecutive
tracks but rather in the tracks that are positioned directly above or below
them. Therefore, hard disks read from successive tracks on
the same
Think of it this way. As the disk is spinning under the head, it is busy reading data. If it needs to read more data than what fits on a single track, it has to get it from a different track. Assume data were read from consecutive tracks. When the disk finished reading from one track, it would have to move in (or out) to the next track before it could continue. Because tracks are rings and the end is the beginning, the delay in moving out (or in) one track causes the beginning of the next track to spin past the position of the head before the disk can start reading it. Therefore, the disk must wait until the beginning comes around again. Granted, you could stagger the start of each track, but this makes seeking a particular spot much more difficult.
Let’s now look at when data are read from
consecutive tracks (that is, one complete
Each track is broken down into smaller chunks called sectors. The number of sectors into which each track is divided is called sectors per track, or sectors/track. Although any value is possible, common values for sectors/track are 17, 24, 32, and 64. (These are shown graphically in Figure 0-12.)
Each
This difference between the total number of bytes per sector and the actual amount of data has been cause for a fair amount of grief. For example, in trying to sell you a hard disk, the salesperson might praise the tremendous amount of space that the hard disk has. You might be amazed at the low cost of a 1G drive.
There are two things to watch out for. Computers
count in twos, humans count in tens. Despite what the salesperson wants you to
believe (or believes himself), a hard disk with 1 billion bytes is not
a 1
The next thing is that the seller will often state the unformatted storage capacity of a drive. This is the number that you would get if you multiplied all the sectors on the disk by 571 (see the preceding discussion). Therefore, the unformatted size is irrelevant to almost all users. Typical formatted MFM drives give the user 85 percent of the unformatted size, and RLL drives give the user about 89 percent. (MFM and RLL are formatting standards, the specifics of which are beyond the scope of this book.)
This brings up an interesting question. If the
manufacturer is telling us the unformatted size and the formatted size is
about 85 percent for MFM and 89 percent for
Let’s start at the beginning. Normally when
you get a hard disk, it comes with reference material that indicates how many
cylinders, heads, and sectors per track there are (among other things). The set
of all tracks at the same distance from the spindle is a
From our discussion of tracks, you know that each track is divided into a specific number of sectors. To find the total number of sectors, simply multiply the total number of tracks that we calculated above by the sectors per track. Once you have the total number of sectors, multiply this by 512 (the number of bytes of data in a sector). This gives us the total number of bytes on the hard disk. To figure out how many megabytes this is, simply divide this number by 1,048,576 (1024 x 1024 = 1MB).
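The calculation just described can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this example, 512 data bytes per sector is assumed as in the text, and the geometry used (1050 cylinders, 16 heads, 63 sectors per track) is the one quoted for the Conner drive later in this section.

```python
SECTOR_BYTES = 512      # data bytes per sector, as assumed in the text
MB = 1024 * 1024        # 1 MB = 1,048,576 bytes

def disk_capacity(cylinders, heads, sectors_per_track):
    """Return (total_sectors, total_bytes, megabytes) for a CHS geometry."""
    total_tracks = cylinders * heads                  # one track per head in each cylinder
    total_sectors = total_tracks * sectors_per_track
    total_bytes = total_sectors * SECTOR_BYTES
    return total_sectors, total_bytes, total_bytes / MB

# Example geometry taken from the boot message quoted later (CHS=1050/16/63).
sectors, size_bytes, size_mb = disk_capacity(1050, 16, 63)
print(sectors, size_bytes, round(size_mb, 1))
```

Dividing by 1,000,000 instead of 1,048,576 gives the larger "marketing" megabyte figure, which is why the same drive can be described with two different sizes.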
Hard disks can be further subdivided into partitions. A partition is a large group of sectors allocated for a particular purpose. Partitioning a disk allows the disk to be used by several operating systems or for several purposes. A lot of Linux systems have a single disk with three partitions: one containing a DOS filesystem, another an EXT2 filesystem and a third for the swap partition. The partitions of a hard disk are described by a partition table, with each entry describing where the partition starts and ends in terms of heads, sectors and cylinder numbers. For DOS formatted disks, those formatted by fdisk, there are four primary disk partitions. Not all four entries in the partition table have to be used. There are three types of partition supported by fdisk: primary, extended and logical. Extended partitions are not real partitions at all; they contain any number of logical partitions. Extended and logical partitions were invented as a way around the limit of four primary partitions. The following is the output from fdisk for a disk containing two primary partitions:
Disk /dev/sda: 64 heads, 32 sectors, 510 cylinders
Units = cylinders of 2048 * 512 bytes

   Device Boot   Begin    Start      End   Blocks   Id  System
/dev/sda1            1        1      478   489456   83  Linux native
/dev/sda2          479      479      510    32768   82  Linux swap

Expert command (m for help): p

Disk /dev/sda: 64 heads, 32 sectors, 510 cylinders

Nr AF  Hd Sec  Cyl  Hd Sec  Cyl    Start     Size ID
 1 00   1   1    0  63  32  477       32   978912 83
 2 00   0   1  478  63  32  509   978944    65536 82
 3 00   0   0    0   0   0    0        0        0 00
 4 00   0   0    0   0   0    0        0        0 00
This shows that the first partition starts at cylinder or track 0, head 1 and sector 1 and extends to include cylinder 477, sector 32 and head 63. As there are 32 sectors in a track and 64 read/write heads, this partition is a whole number of cylinders in size. fdisk aligns partitions on cylinder boundaries by default. It starts at the outermost cylinder (0) and extends inwards, towards the spindle, for 478 cylinders. The second partition, the swap partition, starts at the next cylinder (478) and extends to the innermost cylinder of the disk.
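The Start and Size columns of the expert listing can be reproduced from the head, sector and cylinder values. The following is a minimal sketch, assuming the usual convention that cylinders and heads count from 0 while sectors within a track count from 1. The function names are invented for this illustration.

```python
HEADS = 64               # from the fdisk output above
SECTORS_PER_TRACK = 32   # from the fdisk output above

def chs_to_sector(cylinder, head, sector):
    """Convert a cylinder/head/sector address to an absolute sector number.

    Cylinders and heads count from 0; sectors within a track count from 1.
    """
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)

def partition_extent(start_chs, end_chs):
    """Return (first_sector, sector_count) for one partition table entry."""
    first = chs_to_sector(*start_chs)
    last = chs_to_sector(*end_chs)
    return first, last - first + 1

# Entries are given as (cylinder, head, sector), read off the expert listing.
part1 = partition_extent((0, 1, 1), (477, 63, 32))
part2 = partition_extent((478, 0, 1), (509, 63, 32))
print(part1)
print(part2)
```

The results agree with the Start and Size columns in the listing. Halving each sector count gives the values in the Blocks column, which is consistent with fdisk counting in blocks of 1024 bytes.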
All PC-based operating systems need to break down the hard disk into partitions.
(In fact, all operating systems I can think of that use the same kind of hard disks
need to create partitions.)
A
Because the table is the same for
When you run the Linux fdisk utility, the
values you see and input are all in tracks. To figure out how big each fdisk
partition is, simply multiply that value by 512 and by the number of sectors per
track. (Remember that each
A disk is usually described by its geometry, the number of cylinders, heads and sectors. For example, at boot time Linux describes one of my IDE disks as:
hdb: Conner Peripherals 540MB - CFS540A, 516MB w/64kB Cache, CHS=1050/16/63
This means that it has 1050 cylinders (tracks), 16 heads (8 platters) and 63 sectors per track. With a sector, or block, size of 512 bytes this gives the disk 1,058,400 sectors, or a storage capacity of 541,900,800 bytes (529,200 Kbytes). Dividing by 1,048,576 gives roughly 516 Mbytes, which is the figure Linux reports; the 540MB in the model name counts a megabyte as one million bytes. Some of the sectors are used for disk partitioning information. Some disks automatically find bad sectors and re-index the disk to work around them.
To physically connect itself to the rest of the computer, the hard disk has several choices:
ST506/412, ,
The older IDE/ATA devices could be considered Parallel ATA (PATA), because data is transferred in parallel. The Serial ATA (SATA) bus is not simply a modification of PATA; it is safer to think of it as a completely new interface type. IDE, Fast IDE and EIDE are basically the same interface; the only significant difference is the interface cable, which can allow the system to access more physical disk space as well as increase the speed of the data transfer. For more details see the
To be quite honest, only
The ST506/412 was
developed by Seagate Technologies (hence the ST) for its ST506 hard disk, which
had a whopping 5MB formatted capacity. (This was in 1980 when 360K was a
big floppy.) Seagate later used the same interface in their ST412, which
doubled the drive capacity (which is still less hard disk space than
most people
In 1983, the Maxtor Corporation established the Enhanced
Small Device Interface (ESDI) standard. The enhancements
One drawback that I have found with
Another drawback that I have found is that the physical location on the
cable determines which drive is which. The primary drive is located at the end
of the cable, and the secondary drive is in the middle. The other issue is the
number of cables:
Although
originally introduced as the interface for hard cards (hard disks directly
attached to expansion cards), the
IDE drives often play tricks on
systems by presenting a different face to the outside world than is actually on
the disk. For example, because
Because IDE
drives come pre-formatted, you should never low-level format an
Only two IDE drives can be connected with a single cable, but there is nothing special about
the position of the drive on the cable. Therefore, the system needs some other way of
determining which drive is which. This is done with the drives themselves. Most commonly, you
will find a
The next great advance in hard disk technology was
The thing to note is that the
On the other hand, the
Because the
The
There is also the flip side of the coin. The official doctrine says that if you have a
non-SCSI
Another thing is that once the
The newest member of the hard disk family is Enhanced
IDE, or
EIDE also has other advantages such as higher transfer rates, the
ability to connect more than just two hard disks, and the ability to attach more than just hard
disks. One drawback the
Before we talk about LBA, we need to backtrack a little. Old operating systems that accessed the BIOS directly would use
To overcome this limitation, the BIOS was modified to support something called “large mode”, or more accurately Extended Cylinders Heads Sectors (ECHS). This introduced a so-called “translation layer” between INT13h and the BIOS, allowing for disks up to 8GB (very large for the time). What ECHS did was to calculate a “best fit” of the disk geometry to fit into the 1024 cylinder limitation. By changing the other values (i.e. heads and sectors per track) the BIOS could come much closer to the actual size of the disk.
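To make the numbers concrete, here is a rough sketch of where the limits come from and how a "best fit" might be found. It assumes the commonly cited field limits: 1024 cylinders, 256 heads and 63 sectors per track for INT13h, and 16 heads for the IDE interface. The translation function is purely illustrative (halve the cylinders, double the heads); real BIOSes use their own tables and rounding. The example geometry is hypothetical.

```python
SECTOR_BYTES = 512
MAX_CYLINDERS = 1024   # cylinder limit mentioned in the text
MAX_HEADS = 256        # assumed INT13h head limit
MAX_SECTORS = 63       # assumed INT13h sectors-per-track limit
IDE_HEADS = 16         # assumed head limit of the IDE interface

def capacity_bytes(cylinders, heads, sectors):
    """Capacity in bytes of a cylinder/head/sector geometry."""
    return cylinders * heads * sectors * SECTOR_BYTES

def translate(cylinders, heads, sectors):
    """Fold a geometry into one with at most 1024 cylinders.

    Illustrative only: halve the cylinders and double the heads until the
    cylinder count fits or the head limit is reached.
    """
    while cylinders > MAX_CYLINDERS and heads * 2 <= MAX_HEADS:
        cylinders //= 2     # an odd cylinder count loses its last cylinder
        heads *= 2
    return cylinders, heads, sectors

# Limit without translation, and the limit once heads may go up to 256.
print(capacity_bytes(MAX_CYLINDERS, IDE_HEADS, MAX_SECTORS))
print(capacity_bytes(MAX_CYLINDERS, MAX_HEADS, MAX_SECTORS))

# Hypothetical drive reporting 2100 cylinders, 16 heads, 63 sectors.
print(translate(2100, 16, 63))
```

The second figure is a little under 8.5 billion bytes, which is where the "up to 8GB" limit of the translated geometry comes from.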
The next step was logical block addressing (LBA). Like ECHS, the idea behind
More and more you find the (E)IDE controllers built directly onto the motherboard. On the one
hand this is a good thing, since you do not need to use a
This might not seem like a big deal, but it may become one the next time you do
anything with that machine. It is not uncommon when adding a harddisk or something else to the
system that you accidentally pull on the harddisk cable. All you need to do is pull it no more than
a quarter of an inch before some plugs are no longer connected to the pins. Because this connection is
almost impossible to see, you don’t notice that the cable has come loose. When you reboot the
machine nothing works as the system is getting signals through only some of the lines. If the pins
for the
There is much more to choosing the right harddisk than its size. Although size determines how much data you can store, it tells you nothing about the speed at which you can access that data. How quickly you can access your data is the true measure of performance.
Unfortunately, there is no one absolute measure of harddisk performance. The reason
is simply that data access occurs in so many different ways that it is often difficult for even the
experienced
One characteristic that is often quoted is the seek time. This refers to the time needed to move the read/write head between tracks. If the data is not on the same track, it could mean moving the head a couple thousand tracks in either direction. Movement from one track to the next adjacent one might take only 2 ms, whereas moving across the entire surface of the disk might take 20 ms.
So which one do you use? Typically, neither. When access times are specified, you normally see the average seek time. This is measured as the average time to move between randomly located tracks on the disk. Typical ranges at the time of this writing are between 8 ms and 14 ms. The problem is that disk access is often (usually?) not random. Depending on how you work, you read a large number of blocks at once, such as to load a WordPerfect file. Therefore, average seek time does not reflect access of large pieces of data.
Once the head has moved to the track you need, you are not necessarily ready to work. You need to wait until the right block is under the read/write head. The time the drive takes to reach that point is called rotational latency. The faster the drive spins, the more quickly the block is underneath the head. Therefore, rotational latency is directly related to the rotational speed (rpm) of your drive.
By increasing the rotational speed of a drive you obviously decrease the time the drive has to wait. The fastest drives as I am writing this spin at least 7200 times per minute, which means that they have an average rotational latency of about 4.2 ms.
You can also decrease the rotational latency by staggering the start of each track. This is especially effective when doing sequential reads across tracks. If the start of all tracks were at the same place, the head would move to the new track and the start of the track would have already spun out from underneath. If the tracks are staggered, the head has to wait less time (less rotational latency) until the start of the track is underneath.
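The 4.2 ms figure follows from the usual assumption that, on average, the block you want is half a revolution away from the head. A small sketch (the function name is chosen just for this example):

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds.

    Assumes the average wait is half of one full revolution.
    """
    ms_per_revolution = 60_000 / rpm    # 60,000 ms in one minute
    return ms_per_revolution / 2

for rpm in (3600, 5400, 7200, 10_000):
    print(rpm, round(average_rotational_latency_ms(rpm), 2))
```

At 7200 RPM one revolution takes about 8.3 ms, so the average wait is about 4.2 ms. A 10,000 RPM drive brings this down to 3 ms.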
Think back to our
discussion of harddisks and the concept of a
By decreasing the rotational latency, we increase the speed at which the head reaches the right position. The time it takes to get there, so that we can begin reading, is the average access time. This, too, is measured in milliseconds.
Still, this is not the complete measure of the performance of our drive. Although it is nice that the drive can quickly begin to read, this does not necessarily mean that it will read the data fast. The faster the harddisk can read the data, the faster your WordPerfect file is loaded. How fast the drive reads is its transfer rate, which is normally measured in megabytes per second.
However, the actual transfer is not necessarily what the harddisk manufacturer says
it is. They may have given the transfer rate in terms of the maximum or average sustained transfer
rate. This is important to understand. If you have one huge 200MB file that you are reading on a new
drive, the entire file might be contiguous. Therefore, there is very little movement of the heads
as the file is read. This would obviously increase the average transfer rate. However, if you have
two hundred 1
In addition, this is another case of the chain being only as strong as
its weakest link. The actual transfer rate depends on other factors as well. A slow harddisk
controller or slow system
Another aspect is how much of the data is being re-read. For example, if you read the same one
This is also called a
Finally,
there is data
This is why you will see references to Seagate drives on the Adaptec web site. Adaptec understands the relationship between the components in your system. Therefore, they suggest drives that can keep up with the other components such as the appropriate ones from Seagate.
Another aspect of the administration
costs that a lot of people do not think about is the drive designation. Calling a harddisk
“WhirlWind” or “Falcon” might be pleasing to the marketing people or the IT manager who has no clue
about the technical details. However, the
How often have you had to wade through pages and pages on a company’s Internet site to figure out how big a particular model was? Although many (most?) companies have a 1:1 relationship between the model designation and the characteristics, you have to first figure out the scheme, as often it is not posted anywhere on the site.
This is one reason why I keep coming back to Seagate. Without thinking I can come up with the model number or something very close. The general format is:
ST<F><MB><INT>
Where:
<F> = Form factor, such as 3″, 3″ half-high, 5″, etc.
<MB> = Approximate size in megabytes.
<INT> = Interface.
So, looking at my drive, which is a ST39140A, I can quickly tell that it is a form factor 3 (3″ drive and 1″ high), it has approximately 9140 MB and an ATA interface. Granted, some of the abbreviations used for the interface take a little getting used to. However, the naming scheme is consistent and very easy to figure out.
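As an illustration of how regular the scheme is, the following sketch splits a model number into its three parts. It assumes, based only on the single example above, that the form factor is one digit, the size is the run of digits that follows, and the interface is the trailing letters. The function and field names are made up for this example and are not anything Seagate publishes.

```python
import re

# ST<F><MB><INT>: one digit of form factor, the remaining digits for the
# approximate size in megabytes, trailing letters for the interface.
# This split is an assumption drawn from the one example in the text.
MODEL_RE = re.compile(r"^ST(\d)(\d+)([A-Z]+)$")

def parse_model(model):
    """Split a model number such as 'ST39140A' into its components."""
    match = MODEL_RE.match(model.upper())
    if match is None:
        raise ValueError(f"not in ST<F><MB><INT> form: {model!r}")
    form_factor, size_mb, interface = match.groups()
    return {
        "form_factor": int(form_factor),
        "size_mb": int(size_mb),
        "interface": interface,
    }

print(parse_model("ST39140A"))
```

Model numbers that do not follow this pattern raise an error rather than being guessed at.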
As with other hardware, your choice of harddisk is also guided by the reputation of the company. This applies not only to what you have heard, but also your own personal experiences. Often it is more than just having heard or reading that a particular manufacturer is bad, but rather an issue of being “sure.” This is why I will never buy an IBM harddisk again. All three I have bought were defective. Although other people have claimed not to have problems with them, I do not want to risk my data on them. Three times is too much of a coincidence and I would not feel safe if I installed an IBM harddisk on any of my machines, nor would I have a clear conscience if I installed it in a customer’s machine.
On the other hand, I have had a proportionally large number of Seagate drives since I first started working with computers, none of which have ever given me problems. So far, all of my Seagate drives have been replaced with larger drives, not because they have failed, but because they have grown too small. There are only so many bays in a computer case and filling them up with small drives is not worth it. Instead, I got larger drives.
In addition to the size and speed of your
drive, one important consideration is the interface to the harddisk. Typically,
In
most cases, space will be an issue. Although you need just a few hundred megabytes for the
On the other hand, if you are in an
Let’s take the Seagate Cheetah as an example. As of this writing it is the fastest available on
the market with 10,000
There are also a few other reasons why something like the Cheetah is the perfect solution for a
server. First, it supports up to 15 devices on a single wide
Another thing to consider is the maintenance and administration. Low-end
Medalist drives have an expected mean time between failures (MTBF) of 400,000 hours, which is about
45 years. The MTBF for the Cheetah is approximately 1,000,000 hours or over 100 years. No wonder I
haven’t ever had a harddisk
The Seagate drives also do something else to reduce maintenance and administration costs. There is something Seagate calls SeaShield, which other harddisk manufacturers should adopt. This is simply a protective cover around the electronics that are exposed on other harddisks. This protects the electronics from static electrical discharge, as well as damage caused by bumping the drive against something. In addition, this cover provides the perfect space for installation instructions, like the jumper settings. There is no need to go hunting around for the data sheet, which often isn’t supplied with the drive. Talk about saving administration costs!
Some of you might be saying that names like Cheetah go against my desire to have understandable
model names. My answer is that the opposite is true. As of this writing Seagate has four primary
series: Medalist, Medalist Pro, Barracuda and Cheetah. This simply tells the rotation rate, which is
5400, 7200, 7200 and 10,000 RPM respectively. The Medalist is Ultra ATA. The
Medalist Pro is either ATA or