Hard Disks
You've got to have a hard disk. You could actually run Linux from a floppy (even
a high-density 5.25" floppy), but life is so much easier when you run from the
hard disk. Not only do you have space to save your files, you have access to all
the wonderful tools that Linux provides.
Disk drives provide a more permanent method for storing data, keeping it on
spinning disk platters.
To write data, a tiny head magnetizes minute particles on the platter's surface.
The data is read by a head, which can detect whether a particular minute
particle is magnetized.
(Note that running Linux off of a floppy, such as for a router or firewall, is not a
bad idea. You still have the basic functionality you need and there is very little to
hack.)
A hard disk is composed of several physical disks, called platters,
which are made of aluminum, glass or ceramic-composites and coated with either
an "oxide" media (the stuff on the disks) or "thin film" media. Because "thin
film" is thinner than oxide, the denser (that is, larger) hard disks are
more likely to have thin film. The more platters you have, the more data you can store.
Platters are usually the
same size as floppies. Older platters were 5.25" round, and the newer ones are
3.5" round. (If someone knows the reason for this, I would love to hear it.) In
the center of each platter is a hole through which the spindle sticks. In other
words, the platters rotate around the spindle. The functionality is the same as
with a phonograph record. (Remember those?)
The platters spin at a constant speed that
can vary between 3000 and 15,000 RPM depending on the model.
Compare this to a floppy disk which only spins at 360 RPM.
The disk's read/write heads are responsible for reading and writing data and
there is a pair for each platter, one head for each surface.
The read/write heads do not physically touch the surface of the platters;
instead they float on a very thin (10 millionths of an inch) cushion of air.
The read/write heads are moved across the surface of the platters by an
actuator. Because all of the read/write heads are attached together, they all move across the
surfaces of the platters together.
The media that coats the
platters is very thin, about 30 millionths of an inch. The media has magnetic
properties that can change its alignment when it is exposed to a magnetic field.
That magnetic field comes in the form of the hard disk's read/write heads. It is
the change in alignment of this magnetic media that enables data to be stored on
the hard disk.
As I said earlier, a read/write head does just that: it
reads and writes. There is usually one head per surface of the platters (top and
bottom). That means that there are usually twice as many heads as platters.
However, this is not always the case. Sometimes the top- and bottom-most
surfaces do not have heads.
The head moves across the platters that are
spinning several thousand times a minute (at least 60 times a
second!). The gap between the head and platter is smaller than a human hair,
smaller than a particle of smoke. For this reason, hard disks are manufactured
and repaired in rooms where the number of particles in the air is fewer than 100
particles per cubic foot.
Because of this very small gap
and the high speeds at which the platters are rotating, if the head comes into
contact with the surface of a platter, the result is (aptly named) a head crash.
More than likely this will cause some physical damage to your hard disk.
(Imagine burying your face into an asphalt street going even only 20 MPH!)
The heads are moved in and out across the platters by either the older stepping motor
or the newer, more efficient voice-coil motor. Stepping motors rotate and monitor
their movement based on notches or indentations. Voice-coil motors operate on
the same principle as a stereo speaker. A magnet inside the speaker causes the
speaker cone to move in time to the music (or with the voice). Because there are
no notches to determine movement, one surface of the platters is marked with
special signals. Because the head above this surface has no write capabilities,
this surface cannot be used for any other purpose.
The voice-coil motor
enables finer control and is not subject to the problems of heat expanding the
disk because the marks are expanded as well. Another fringe benefit is that
because the voice-coil operates on electricity, once power is removed, the head
moves back to its starting position because it is no longer resisting a
"retaining" spring. This is called "automatic head parking."
Each surface of the platter is divided into narrow, concentric circles called
tracks. Track 0 is the outermost track and the highest numbered track is the track
closest to the central spindle. A cylinder is the set of all tracks with the
same number. So all of the 5th tracks from each side of every platter in the disk are known
as cylinder 5. As the number of cylinders is the same as the number of tracks, you
often see disk geometries described in terms of cylinders. Each track is divided
into sectors.
A sector is the smallest unit of data that can be written to or read from
a hard disk and it is also the disk's block size.
A common sector size is 512 bytes; the sector size is set when the disk is
formatted, usually when the disk is manufactured.
Since data is stored physically on the disk in concentric rings, the head does
not spiral in like a phonograph record but rather moves in and out across the
rings (the tracks). Because the heads move in unison across the
surface of their respective platters, data are usually stored not in consecutive
tracks but rather in the tracks that are positioned directly above or below
them. Therefore, hard disks read from successive tracks on
the same cylinder and not the same surface.
Think of it this way. As the
disk is spinning under the head, it is busy reading data. If it needs to read
more data than what fits on a single track, it has to get it from a different
track. Assume data were read from consecutive tracks. When the disk finished
reading from one track, it would have to move in (or out) to the next track
before it could continue. Because tracks are rings and the end is the beginning,
the delay in moving out (or in) one track causes the beginning of the next track
to spin past the position of the head before the disk can start reading it.
Therefore, the disk must wait until the beginning comes around again. Granted,
you could stagger the start of each track, but this makes seeking a particular
spot much more difficult.
Let's now look at what happens when data are read from
successive tracks in the same cylinder (that is, one complete cylinder
is read before the heads move on).
Once the disk has read the entire contents of a track and has reached the end,
the beginning of the track just below it is just now spinning under the head.
Therefore, by switching the head it is reading from, the disk can begin to read
(or write) as though nothing was different. No movement must take place and the
reads occur much faster.
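The cylinder-first ordering described above can be sketched in a few lines of Python. This is only an illustration: the function name and the tiny two-cylinder, four-head geometry are made up for the example and do not come from any real drive.

```python
def cylinder_order(cylinders, heads):
    """Yield (cylinder, head) pairs in cylinder-first order.

    Every head in a cylinder is used before the arm moves on,
    so the arm only moves when the cylinder number changes.
    """
    for cylinder in range(cylinders):
        for head in range(heads):
            yield (cylinder, head)


# Hypothetical tiny disk: 2 cylinders, 4 heads.
order = list(cylinder_order(2, 4))
arm_moves = sum(1 for a, b in zip(order, order[1:]) if a[0] != b[0])

print(order)
print("arm movements:", arm_moves)
```

With this ordering, eight tracks are read with a single movement of the arm; everything else is just a switch from one head to the next.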
Each track is broken down into smaller chunks
called sectors. The number of sectors into which each track is divided is called
sectors per track, or sectors/track. Although any value is possible, common
values for sectors/track are 17, 24, 32, and 64. (These are shown graphically in
Figure 0-12.)
Each sector contains 512 bytes of data. However, each sector
can contain up to 571 bytes of information. The extra bytes include information
that indicates the start and end of the sector,
which is only ever changed by a low-level format. In addition, space is reserved for a checksum
of the data portion of the sector.
If the calculated checksum does not match the
checksum in this field, the disk will report an error.
Figure: Logical Components of a Hard Disk
This difference between the total number of bytes per
sector and the actual amount of data has been cause for a fair amount of grief.
For example, in trying to sell you a hard disk, the salesperson might praise the
tremendous amount of space that the hard disk has. You might be amazed at the
low cost of a 1G drive.
There are two things to watch out for. Computers
count in twos, humans count in tens. Despite what the salesperson wants you to
believe (or believes himself), a hard disk with 1 billion bytes is not
a 1 gigabyte drive; it is only 10^9 bytes. One gigabyte means 2^30
bytes (1,073,741,824). A hard disk with 10^9 (1 billion) bytes is only about 954
megabytes. This is almost seven percent smaller!
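As a quick check of the arithmetic, here is a small Python sketch. It assumes nothing beyond the two definitions in the text: a "sales" gigabyte of one billion bytes and a binary gigabyte of 2 to the 30th power bytes. The variable names are mine.

```python
DECIMAL_GB = 10**9   # what the salesperson calls a gigabyte
BINARY_GB = 2**30    # what the computer calls a gigabyte
BINARY_MB = 2**20    # one megabyte, counted in twos

# How many binary megabytes are in a billion bytes?
megabytes = DECIMAL_GB / BINARY_MB

# How much smaller is a billion bytes than a binary gigabyte?
shortfall = 1 - DECIMAL_GB / BINARY_GB

print(f"{megabytes:.1f} MB")
print(f"{shortfall:.1%} smaller")
```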
The next thing is that the seller
will often state the unformatted storage capacity of a drive. This is
the number that you would get if you multiplied all the sectors on the disk by
571 (see the preceding discussion). Therefore, the unformatted size is
irrelevant to almost all users. Typical formatted MFM drives give the user 85
percent of the unformatted size, and RLL drives give the user about 89 percent.
(MFM and RLL are formatting standards, the specifics of which are beyond the
scope of this book.)
This brings up an interesting question. If the
manufacturer is telling us the unformatted size and the formatted size is
about 85 percent for MFM and 89 percent for SCSI/IDE (using RLL), how can
I figure out how much usable space there really is? Elementary, my dear Watson:
It's a matter of multiplication.
Let's start at the beginning. Normally when
you get a hard disk, it comes with reference material that indicates how many
cylinders, heads, and sectors per track there are (among other things). The set
of all tracks at the same distance from the spindle is a cylinder.
The number of
cylinders is therefore simply the number of tracks per surface, because a track is on one
surface and a cylinder is all tracks at the same distance. Because you can only
use those surfaces that have a head associated with them, you can calculate the
number of total tracks by multiplying the number of cylinders by the number of
heads. In other words, take the number of tracks on a surface and multiply it by
the number of surfaces. This gives you the total number of tracks.
From our discussion of tracks, you know that each track is divided into a specific
number of sectors. To find the total number of sectors, simply multiply the
number of total tracks that we calculated above by the sectors per track. Once
you have the total number of sectors, multiply this by 512 (the number of bytes
of data in a sector). This gives us the total number of bytes on
the hard disk. To figure out how many megabytes this is, simply divide this
number by 1,048,576 (1024 x 1024 = 1MB).
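The multiplication above can be written out as a short Python sketch. The function name is my own, and the geometry (615 cylinders, 4 heads, 17 sectors per track) is just an illustrative set of values of the kind you would find in a drive's reference material; substitute the numbers for your own disk.

```python
BYTES_PER_SECTOR = 512          # data bytes in each sector
BYTES_PER_MB = 1024 * 1024      # 1,048,576


def disk_capacity(cylinders, heads, sectors_per_track):
    """Return (total tracks, total sectors, total bytes) for a geometry."""
    total_tracks = cylinders * heads                 # tracks per surface x surfaces
    total_sectors = total_tracks * sectors_per_track
    total_bytes = total_sectors * BYTES_PER_SECTOR
    return total_tracks, total_sectors, total_bytes


# Illustrative geometry, not taken from the text.
tracks, sectors, size_bytes = disk_capacity(615, 4, 17)

print(f"{tracks} tracks, {sectors} sectors, {size_bytes} bytes")
print(f"{size_bytes / BYTES_PER_MB:.2f} MB")
```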
Hard disks can be further subdivided into partitions. A partition is a large group of
sectors allocated for a particular purpose. Partitioning a disk allows the disk to be used by
several operating systems or for several purposes. A lot of Linux systems have a single disk with
three partitions: one containing a DOS filesystem,
another an EXT2 filesystem and a third for the swap partition.
The partitions of a hard disk are described by a partition table, each entry
describing where the partition starts and ends in terms of heads, sectors and cylinder
numbers. For DOS formatted disks, those formatted by fdisk,
there are four primary disk partitions. Not all four entries in the partition table have
to be used. There are three types of partition supported by fdisk: primary, extended and
logical. Extended partitions are not real partitions at all; they contain any number of
logical partitions. Extended and logical partitions were invented as a way around the
limit of four primary partitions. The following is the output from fdisk for a disk
containing two primary partitions:
Disk /dev/sda: 64 heads, 32 sectors, 510 cylinders
Units = cylinders of 2048 * 512 bytes

   Device Boot   Begin    Start      End   Blocks   Id  System
/dev/sda1            1        1      478   489456   83  Linux native
/dev/sda2          479      479      510    32768   82  Linux swap

Expert command (m for help): p

Disk /dev/sda: 64 heads, 32 sectors, 510 cylinders

Nr AF  Hd Sec  Cyl  Hd Sec  Cyl    Start     Size ID
 1 00   1   1    0  63  32  477       32   978912 83
 2 00   0   1  478  63  32  509   978944    65536 82
 3 00   0   0    0   0   0    0        0        0 00
 4 00   0   0    0   0   0    0        0        0 00
This shows that the first partition starts at cylinder or track 0, head 1 and
sector 1 and extends to include cylinder 477, sector 32 and head 63.
As there are 32 sectors in a track and 64 read/write heads, this partition is a
whole number of cylinders in size. fdisk aligns partitions on cylinder boundaries by
default. It starts at the outermost cylinder (0) and extends inwards, towards the
spindle, for 478 cylinders. The second partition, the swap partition, starts at the
next cylinder (478) and extends to the innermost cylinder of the disk.
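The Start and Size columns in the expert listing can be recomputed from the cylinder, head and sector values. The sketch below uses the usual conversion from a cylinder/head/sector address to a sector number counted from zero; the function name is mine, and the geometry constants are the ones fdisk reported for this disk.

```python
HEADS = 64                # from the fdisk header for /dev/sda
SECTORS_PER_TRACK = 32


def chs_to_sector(cylinder, head, sector,
                  heads=HEADS, sectors_per_track=SECTORS_PER_TRACK):
    """Convert a cylinder/head/sector address to a zero-based sector number.

    Cylinders and heads count from 0, sectors count from 1.
    """
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


# First partition: starts at cyl 0, head 1, sec 1; ends at cyl 477, head 63, sec 32.
first_start = chs_to_sector(0, 1, 1)
first_size = chs_to_sector(477, 63, 32) - first_start + 1

# Second partition: starts at cyl 478, head 0, sec 1; ends at cyl 509, head 63, sec 32.
second_start = chs_to_sector(478, 0, 1)
second_size = chs_to_sector(509, 63, 32) - second_start + 1

print(first_start, first_size)
print(second_start, second_size)
```

The results agree with the Start and Size columns that fdisk printed, and halving each size gives the Blocks column of the first listing, because a block there is 1024 bytes, or two sectors.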
All PC-based operating systems need to break down the hard disk into partitions.
(In fact, all operating systems I can think of that use the same kind of hard disks
need to create partitions.)
A partition can be any size, from just a couple of megabytes to the
entire disk. Each partition is defined in a partition table that appears
at the very beginning of the disk. This partition table contains information
about the kind of partition it is, where it starts, and where it ends. This
table is the same whether you have a DOS-based PC, UNIX, or both.
Because the table is the same for DOS and UNIX, there can be only four partitions total
because there are four entries in the table. DOS
gets around this by creating logical partitions within one physical
partition. This is a characteristic of DOS, not the partition table.
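To make the four-entry table more concrete, here is a hedged Python sketch that decodes one. The text does not give the byte layout, so the details are assumptions based on the conventional PC layout: the table starts at byte 446 of the first sector, each entry is 16 bytes, and the sector ends with the bytes 0x55 0xAA. Rather than read a real disk, the sketch builds a sample sector from the fdisk listing shown in this chapter and then parses it; all function names are mine.

```python
import struct

TABLE_OFFSET = 446            # assumed conventional offset of the table
ENTRY_SIZE = 16               # four entries of 16 bytes each
ENTRY_FORMAT = "<B3sB3sII"    # flag, start CHS, type, end CHS, start sector, size


def pack_chs(cylinder, head, sector):
    """Pack a cylinder/head/sector address into the 3-byte on-disk form."""
    return bytes([head, (sector & 0x3F) | ((cylinder >> 2) & 0xC0), cylinder & 0xFF])


def unpack_chs(raw):
    """Reverse pack_chs: return (cylinder, head, sector)."""
    return ((raw[1] & 0xC0) << 2) | raw[2], raw[0], raw[1] & 0x3F


def parse_partition_table(sector0):
    """Return the used entries of the partition table in a 512-byte sector."""
    if sector0[510:512] != b"\x55\xaa":
        raise ValueError("missing signature at the end of the sector")
    partitions = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        flag, first, ptype, last, start, size = struct.unpack(
            ENTRY_FORMAT, sector0[offset:offset + ENTRY_SIZE])
        if ptype == 0:            # type 00 marks an unused entry
            continue
        partitions.append({
            "number": index + 1,
            "type": ptype,
            "start_chs": unpack_chs(first),
            "end_chs": unpack_chs(last),
            "start_sector": start,
            "size": size,
        })
    return partitions


def build_sample_sector():
    """Build a first sector holding the two partitions from the fdisk listing."""
    sector = bytearray(512)
    rows = [
        (0x83, (0, 1, 1), (477, 63, 32), 32, 978912),
        (0x82, (478, 0, 1), (509, 63, 32), 978944, 65536),
    ]
    for index, (ptype, first, last, start, size) in enumerate(rows):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        sector[offset:offset + ENTRY_SIZE] = struct.pack(
            ENTRY_FORMAT, 0, pack_chs(*first), ptype, pack_chs(*last), start, size)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


partitions = parse_partition_table(build_sample_sector())
for partition in partitions:
    print(partition)
```

On a real system you would pass in the first 512 bytes of the disk instead of the sample sector, which normally requires root privileges.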
Both DOS and UNIX must first partition
the drive before installing the operating system,
and both provide the mechanism during the installation process in the form of the fdisk program.
Although their appearances are very different, the DOS
and Linux fdisk commands perform the same function.
When you run the Linux fdisk utility, the
values you see and input are all in cylinders. To figure out how big each fdisk
partition is, simply multiply that value by the number of heads, by the number of sectors per
track, and by 512. (Remember that each sector holds 512 bytes of data.)
A disk is usually described by its geometry: the number of cylinders, heads and
sectors. For example, at boot time Linux describes one of my IDE disks as:
hdb: Conner Peripherals 540MB - CFS540A, 516MB w/64kB Cache, CHS=1050/16/63
This means that it has 1050 cylinders (tracks), 16 heads (8 platters) and 63
sectors per track. With a sector, or block, size of 512 bytes this gives the disk a storage
capacity of 529,200 Kbytes (541,900,800 bytes). That is about 516 Mbytes when a megabyte is
counted as 1,048,576 bytes, and about 540 Mbytes when it is counted as one million bytes, which
is why both figures appear in the boot message. Some disks automatically find
bad sectors and re-index the disk to work around them.
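The same calculation can be done directly on the boot message. This Python sketch pulls the CHS values out of the line with a regular expression and multiplies them out; the function name is mine, and the pattern assumes the geometry always appears in the form CHS=c/h/s as it does in the line above.

```python
import re

BOOT_LINE = ("hdb: Conner Peripherals 540MB - CFS540A, "
             "516MB w/64kB Cache, CHS=1050/16/63")


def parse_chs(line):
    """Return (cylinders, heads, sectors) from a 'CHS=c/h/s' boot message."""
    match = re.search(r"CHS=(\d+)/(\d+)/(\d+)", line)
    if match is None:
        raise ValueError("no CHS geometry found in line")
    return tuple(int(value) for value in match.groups())


cylinders, heads, sectors = parse_chs(BOOT_LINE)
total_bytes = cylinders * heads * sectors * 512

kbytes = total_bytes // 1024
binary_mb = total_bytes // 2**20     # megabytes of 1,048,576 bytes
decimal_mb = total_bytes // 10**6    # megabytes of 1,000,000 bytes

print(kbytes, binary_mb, decimal_mb)
```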
To physically
connect itself to the rest of the computer, the hard disk has five choices:
ST506/412, ESDI, SCSI, IDE,
and the newer Enhanced IDE (EIDE). However, the
interface the operating system sees for ST506/412 and IDE is identical, and
there is no special option for an IDE drive. At the hardware level, though,
there are some differences that I need to cover for completeness.
To be quite honest, only ESDI
and ST506/412 are really disk interfaces. SCSI
and IDE are referred to as "system-level interfaces" and they incorporate ESDI
into the circuitry physically located on the drive.
The ST506/412 was
developed by Seagate Technologies (hence the ST) for its ST506 hard disk, which
had a whopping 5MB formatted capacity. (This was in 1980 when 360K was a
big floppy.) Seagate later used the same interface in their ST412, which
doubled the drive capacity (which is still less hard disk space than
most people have RAM.
Oh well). Other drive manufacturers decided to incorporate this technology, and
over the years, it has become a standard. One of its major drawbacks is that it is
15-year-old technology, which can no longer compete with the demands of today's
hard disk users.
In 1983, the Maxtor Corporation established the Enhanced
Small Device Interface (ESDI) standard. The enhancements ESDI provided included
higher reliability (because Maxtor had built the encoder/decoder directly into
the drive and therefore reduced the noise), higher transfer rates, and the ability
to get drive parameters directly from the disk. This meant that users no longer
had to run the computer setup routines to tell the CMOS what kind of hard disk
it had.
One drawback that I have found with ESDI drives is the physical
connection between the controller and the drive itself. Two cables were needed:
a 34-pin control cable and a 20-pin data cable. Although the cables are
different sizes and can't (easily) be confused, the separation of control and
data is something of which I was never a big fan. The connectors on the drive
itself were usually split into two unequal halves. In the connector on the
cable, a small piece of plastic called a key prevented the connector from being
inserted improperly. Even if the key was missing, you could still tell which end
was which because the pins on the hard disk were labeled and the first line on
the cable had a colored stripe down its side. (This may not always be the case,
but I have never seen a cable that wasn't colored like this.)
Another drawback that I have found is that the physical location on the
cable determines which drive is which. The primary drive is located at the end
of the cable, and the secondary drive is in the middle. The other issue is the
number of cables: with two drives, ESDI requires three separate cables. Each drive has its
own data cable and the drives share a common control cable.
Although
originally introduced as the interface for hard cards (hard disks directly
attached to expansion cards), the IDE
(integrated drive electronics) interface
has grown in popularity to the point where it is perhaps the most commonly used
hard-disk interface today (rapidly being replaced by SCSI). As its name implies,
the controller electronics are integrated onto the hard disk itself. The
connection to the motherboard is made through a relatively small adapter,
commonly referred to as a "paddle board." From here, a single cable
attaches two hard disks in a daisy chain, which is similar to the way floppy
drives are connected, and often, IDE
controllers have connectors and control
electronics for floppy drives as well.
IDE drives often play tricks on
systems by presenting a different face to the outside world than is actually on
the disk. For example, because IDE
drives are already pre-formatted when they
reach you, they can have more physical sectors in the outer tracks, thereby
increasing the overall amount of space on the disk that can be used for storage.
When a request is made to read a particular block of data on the drive, the IDE
electronics translate this to the actual physical location.
Because IDE
drives come pre-formatted, you should never low-level format an IDE
drive unless
you are specifically permitted to do so by the manufacturer. You could
potentially wipe out the entire drive to the point at which it must be returned
to the factory for "repair." Certain drive manufacturers, such as
Maxtor, provide low-level format routines that accurately and safely
low-level format your drive. Most vendors that I am aware of today simply
"zero" out the data blocks when doing a low-level format. However,
don't take my word for it! Check the vendor.
Only two IDE drives can be connected with a single cable, but there is nothing special about
the position of the drive on the cable. Therefore, the system needs some other way of
determining which drive is which. This is done with the drives themselves. Most commonly, you
will find a jumper or set of jumpers used to determine if the drive is the
master, slave or master-only/single-drive. On (much) older systems, you could only have a single
IDE controller, which meant only two drives. Today, it is common to find two IDE controllers
built onto the motherboard and some systems allow you to add extra IDE controllers, thereby
increasing the number of IDE drives even further.
The next great
advance in hard disk technology was SCSI.
SCSI is not a disk interface, but
rather a semi-independent bus.
More than just hard disks can be attached to a
SCSI-Bus. Because of its complex nature and the fact that it can support such a
wide range of devices, I talked in more detail about SCSI
earlier in this
chapter. However, a few specific SCSI
issues relate to hard disks in general and
the interaction between SCSI
and other types of drives.
The thing to note
is that the BIOS
inside the PC knows nothing about SCSI.
Whether this is an
oversight or intentional, I don't know. The SCSI spec is more than 10 years old,
so there has been plenty of time to include it. Because the BIOS
is fairly standard from machine to machine, including SCSI support might create problems
for backward compatibility.
On the other hand, the BIOS is for DOS.
DOS makes BIOS calls. To be able to access all the possible SCSI devices through the
BIOS, it would have to be several times larger. Therefore, every PC-based operating
system needs to have extra drivers to be able to access SCSI devices.
Because the BIOS does not understand SCSI,
you have to trick the PC's BIOS a little to boot from a SCSI
device. By telling the PC's BIOS that no drives are installed as either C: or D:,
you force it to quit before it
looks for any of the other types. Once it quits, the BIOS on the SCSI
host adapter has a chance to run.
The SCSI host adapter obviously knows how to boot from a SCSI
hard disk and does so wonderfully. This is assuming that you
enabled the BIOS on the host adapter. If not, you're hosed.
There is also the flip side of the coin. The official doctrine says that if you have a
non-SCSI boot drive, you have to disable the SCSI BIOS
because this can cause problems. However, I know people who have IDE boot
drives and still leave the SCSI BIOS
enabled. Linux normally reacts as though the SCSI BIOS were not
enabled, so what to do? I suggest that you see what works. I can only add that
if you have more than one host adapter, only one should have the BIOS
enabled.
Another thing is that once the kernel
boots from a SCSI
device, you lose access to other kinds of drives. Just because it doesn't boot
from the IDE (or whatever), does this mean you cannot access it at all? Unfortunately,
yes. This is simply the way the kernel is designed. Once the kernel has
determined that it has booted off a SCSI hard disk, it can no longer access a
non-SCSI hard disk.
The newest member of the hard disk family is Enhanced
IDE, or EIDE.
The most important aspect of this new hard disk interface is its
ability to access more than 504 megabytes. This limitation is because the IDE
interface can access only 1,024 cylinders, 16 heads, and 63 sectors per track.
If you multiply this out using the formula I gave you earlier, you get
504MB.
EIDE also has other advantages such as higher transfer rates, the
ability to connect more than just two hard disks, and the ability to attach more than just hard
disks. One drawback that EIDE had at the beginning was part of its very nature.
To overcome the hard disk size limit that DOS had, EIDE drives employ a method
called logical block addressing (LBA).
The idea behind LBA is that the system's
BIOS would "rearrange" the drive geometry so that drives larger
than 528MB could still boot. Because Linux does not use the
BIOS to access the hard disk, the fact that the BIOS could handle
the EIDE drive meant nothing. New drivers needed to be added
to account for this.
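The 504 and 528 figures quoted above are the same limit counted two different ways, which a few lines of Python make clear. The sketch simply multiplies out the maximum geometry given in the text (1,024 cylinders, 16 heads, 63 sectors per track, 512 bytes per sector); the variable names are mine.

```python
MAX_CYLINDERS = 1024
MAX_HEADS = 16
MAX_SECTORS_PER_TRACK = 63
BYTES_PER_SECTOR = 512

limit_bytes = (MAX_CYLINDERS * MAX_HEADS *
               MAX_SECTORS_PER_TRACK * BYTES_PER_SECTOR)

binary_mb = limit_bytes // 2**20     # megabytes of 1,048,576 bytes
decimal_mb = limit_bytes // 10**6    # megabytes of 1,000,000 bytes

print(limit_bytes, binary_mb, decimal_mb)
```

Counted in megabytes of 1,048,576 bytes the limit is exactly 504; counted in millions of bytes it is 528.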
More and more you find the (E)IDE controllers built directly onto the motherboard. On the one
hand this is a good thing, since you do not need to use an expansion card slot
for the controller. However, you need to be careful where it is located. I have had a few
motherboards where the IDE controller was stuck between the
PCI and ISA slots. This made it extremely difficult to
access the pins without removing the card in the PCI slot, the ISA slot, or (sometimes)
both.
Although this might not seem like a big deal, it may become one the next time you do
anything with that machine. It is not uncommon when adding a harddisk or something else to the
system that you accidentally pull on the harddisk cable. All you need to do is pull it no more than
a quarter of an inch before some plugs are no longer connected to the pins. Because this connection is
almost impossible to see, you don't notice that the cable has come loose. When you reboot the
machine nothing works as the system is getting signals through only some of the lines. If the pins
for the IDE controller are out in the open, you may still pull on the cable,
but it is easier to see and far easier to fix.
There is much more to choosing the right harddisk
than its size. Although size determines how much data you can store, it tells you nothing about the
speed at which you can access that data. How quickly you can access your data is the true measure of
performance.
Unfortunately, there is no one absolute measure of harddisk performance. The reason
is simply that data access occurs in so many different ways that it is often difficult for even the
experienced administrator to judge which drive is better. However, there are
several different characteristics of harddisks, which, when viewed together give you a good idea of
the overall performance of a drive.
One characteristic that is often quoted is the seek time. This
refers to the time needed to move the read/write head between tracks. If the data is not on the same
track, it could mean moving the head a couple thousand tracks in either direction. Movement from one
track to the next adjacent one might take only 2 ms, whereas moving across the entire width of the platter
might take 20ms.
So which one do you use? Typically, neither. When access times are specified,
you normally see the average seek time. This is measured as the average time between randomly
located tracks on the disk. Typical ranges at the time of this writing are between 8ms and 14ms. The
problem is that disk access is often (usually?) not random. Depending on how you work, you read
a large number of blocks at once, such as to load a WordPerfect file. Therefore, average seek time
does not reflect access of large pieces of data.
Once the head has moved to the track you need,
you are not necessarily ready to work. You need to wait until the right block is under the read/write
head. The time the drive takes to reach that point is called rotational latency. The faster the
drive spins, the more quickly the block is underneath the head. Therefore, rotational latency is
directly related to the rotational speed (rpm) of your drive.
By increasing the rotational speed
of a drive you obviously decrease the time the drive has to wait. The fastest drives as I am writing
this spin at least 7200 times per minute, which means that their average rotational latency is
about 4.2 ms.
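The 4.2 ms figure follows from the rotational speed: on average the block you want is half a revolution away. Here is a minimal Python sketch of that reasoning; the function name is mine and the list of speeds is just a sample.

```python
def average_rotational_latency_ms(rpm):
    """Average wait for a block: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000 / rpm    # 60,000 ms in a minute
    return ms_per_revolution / 2


for rpm in (3600, 5400, 7200, 15000):
    print(f"{rpm:>6} rpm: {average_rotational_latency_ms(rpm):.2f} ms")
```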
You can also decrease the rotational latency by staggering the start
of each track. This is especially effective when doing sequential reads across tracks. If the start
of all tracks were at the same place, the head would move to the new track and the start of the
track would have already spun out from underneath. If the tracks are staggered, the head has to wait
less time (less rotational latency) until the start of the track is underneath.
Think back to our
discussion of harddisks and the concept of a cylinder. This is all of the
tracks at the same distance from the spindle. To physically move heads from one track to another
takes more time than simply switching which head you are using. However, because switching heads does not
occur instantaneously, there is a certain amount of rotational latency. Therefore, the start of each
track is staggered as one moves up and down the cylinder, as well as across the cylinders.
By
decreasing the rotational latency, we increase the speed at which the head reaches the right
position. The time it takes to get there, so that we can begin reading, is the average access time. This, too, is
measured in milliseconds.
Still, this is not the complete measure of the performance of our
drive. Although it is nice that the drive can quickly begin to read, this does not necessarily mean
that it will read the data fast. The faster the harddisk can read the data, the faster your
WordPerfect file is loaded. This depends on the transfer rate, which is normally measured in megabytes
per second.
However, the actual transfer rate is not necessarily what the harddisk manufacturer says
it is. They may have given the transfer rate in terms of the maximum or average sustained transfer
rate. This is important to understand. If you have one huge 200MB file that you are reading on a new
drive, the entire file might be contiguous. Therefore, there is very little movement of the heads
as the file is read. This would obviously increase the average transfer rate. However, if you have
two hundred 1MB files spread out all over the disk, you will definitely
notice a lower transfer rate.
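A rough model shows why the scattered files come out slower. The Python sketch below is a deliberate simplification: it charges one average seek plus one average rotational latency for every separate piece, then adds the pure transfer time. The seek time, latency and transfer rate are assumed round numbers in the ranges mentioned in this section, not measurements of any particular drive, and the function name is mine.

```python
SEEK_MS = 10.0            # assumed average seek time
LATENCY_MS = 4.2          # assumed average rotational latency
RATE_MB_PER_S = 10.0      # assumed sustained transfer rate


def read_time_ms(total_mb, pieces):
    """Estimate the time to read total_mb split into separate pieces."""
    positioning = pieces * (SEEK_MS + LATENCY_MS)      # one seek + wait per piece
    transfer = total_mb / RATE_MB_PER_S * 1000         # time actually moving data
    return positioning + transfer


one_big = read_time_ms(200, pieces=1)       # one contiguous 200 MB file
many_small = read_time_ms(200, pieces=200)  # two hundred scattered 1 MB files

for label, ms in (("one 200 MB file", one_big), ("200 x 1 MB files", many_small)):
    print(f"{label}: {ms / 1000:.1f} s, {200 / (ms / 1000):.2f} MB/s effective")
```

Even in this simple model the same drive delivers a noticeably lower effective rate for the scattered files, and real fragmentation within files would make the gap larger.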
In addition, this is another case of the chain being only as strong as
its weakest link. The actual transfer rate depends on other factors as well. A slow harddisk
controller or slow system bus can make a fast harddisk display bad performance.
Another aspect is how much of the data is being re-read. For example, if you read the same one
MB file two hundred times, the head won't move much. This is not a bad thing,
as data is often read repeatedly. Harddisk manufacturers are aware of this and therefore will add
caches to the harddisk to improve performance. Data that is read from the harddisk can be stored in
the cache so if it is needed again, it can be accessed more quickly than if it
must be first read from the drive. Data that is written, may also be needed again, so it too can be
re-read from the cache.
This is also called a cache
buffer, because it also serves to buffer the data. Sometimes the harddisk
cannot keep up with the CPU. It may be that the disk is still writing something as new data
comes in. Rather than making the CPU wait, the data is written to the cache, which the harddisk can
read when it can. Other times, the CPU is doing something else as the data from the harddisk is
ready. The harddisk can write it to the buffer and the CPU can take it when it can.
Finally,
there is data throughput. This is a measure of the total amount of data the
CPU can access in a given amount of time. Since the data is going through the
harddisk controller and through the system bus, this may not be a good measure
of performance of the drive itself. However, if the other components can process the data as quickly
as the drive can provide it, it is a good measure of the complete system.
This is why you will
see references to Seagate drives on the Adaptec web site. Adaptec understands the relationship
between the components in your system. Therefore, they suggest drives that can keep up with the
other components such as the appropriate ones from Seagate.
Another aspect of the administration
costs that a lot of people do not think about is the drive designation. Calling a harddisk
"WhirlWind" or "Falcon" might be pleasing to the marketing people or the IT manager who has no clue
about the technical details. However, the administrator is not interested in
what name it has but rather what characteristics it has. If it takes a long time to figure out the
characteristics, the total cost of ownership has increased.
How often have you had to wade
through pages and pages on a company's Internet site to figure out how big a particular model was?
Although many (most?) companies have a 1:1 relationship between the model designation and the
characteristics, you have to first figure out the scheme, as often it is not posted anywhere on the
site.
This is one reason why I keep coming back to Seagate. Without thinking I can come up with
the model number or something very close. The general format is:
ST<F><MB><INT>
Where:
<F> = Form factor, such as 3", 3" half-high, 5", etc.
<MB> = Approximate size in megabytes.
<INT> = Interface.
So, looking at my drive, which is a
ST39140A, I can quickly tell that it is form factor 3 (a 3" drive, 1" high), has approximately
9140 MB and has an ATA interface. Granted, some of the abbreviations used for the interface take a
little getting used to. However, the naming scheme is consistent and very easy to figure out.
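The scheme is regular enough that a few lines of code can take a model number apart. What follows is only a sketch in Python, not an official decoder. It assumes the form factor is always a single digit and the interface is a run of trailing letters. The only interface code it knows is "A" for ATA, because that is the only one confirmed above. The names parse_model and SeagateModel are made up for this example.

```python
import re
from typing import NamedTuple

# ST<F><MB><INT>: one digit for the form factor (an assumption), digits for
# the approximate size in megabytes, trailing letters for the interface.
MODEL_PATTERN = re.compile(r"^ST(\d)(\d+)([A-Z]+)$")

# Only the code confirmed in the text; other codes are reported as unknown.
INTERFACE_NAMES = {"A": "ATA"}


class SeagateModel(NamedTuple):
    form_factor: int
    size_mb: int
    interface: str


def parse_model(model: str) -> SeagateModel:
    """Split a model number such as ST39140A into its three parts."""
    match = MODEL_PATTERN.match(model.strip().upper())
    if match is None:
        raise ValueError(f"not in ST<F><MB><INT> form: {model!r}")
    form_factor, size_mb, code = match.groups()
    interface = INTERFACE_NAMES.get(code, f"unknown ({code})")
    return SeagateModel(int(form_factor), int(size_mb), interface)


print(parse_model("ST39140A"))
```

For the ST39140A this yields form factor 3, 9140 MB and ATA, the same reading as above.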
As with
other hardware, your choice of harddisk is also guided by the reputation of the company. This
applies not only to what you have heard, but also to your own personal experiences. Often it is more
than just having heard or read that a particular manufacturer is bad; it is an issue of
being "sure." This is why I will never buy an IBM harddisk again. All three I have bought were
defective. Although other people have claimed not to have problems with them, I do not want to risk
my data on them. Three times is too much of a coincidence, and I would not feel safe if I installed
an IBM harddisk in any of my machines, nor would I have a clear conscience if I installed one in a
customer's machine.
On the other hand, I have had a proportionally large number of Seagate drives
since I first started working with computers, none of which has ever given me problems. So far, all
of my Seagate drives have been replaced with larger drives, not because they failed, but because
they had become too small. There are only so many bays in a computer case, and filling them up with
small drives is not worth it. Instead, I got larger drives.
In addition to the size and speed of your
drive, one important consideration is the interface to the harddisk. Typically,
SCSI harddisks are more expensive than ATA drives, even if you ignore the extra
costs for the SCSI host adapter. Even if you are willing to ignore the extra cost of
acquiring the drive, you still need to consider the costs of installing and managing the
host adapter. On top of that, the performance increase you get with SCSI is negligible for
workstations. Generally, you do not need the extra throughput that SCSI can provide.
In
most cases, space will be an issue. Although you need just a few hundred megabytes for the
operating system, applications are getting larger and larger, with dozens
of components that quickly fill up space on your harddisk. Buying and installing a new ATA harddisk
is generally simpler than adding a SCSI harddisk, particularly if your first
harddisk is ATA. In addition, on newer systems you can have up to four ATA devices, including
CD-ROM drives, which is generally sufficient for workstations as well as
mobile users.
On the other hand, if you are in an environment where you need
more than four devices or need devices that do not support ATA, then you will have to go with
SCSI. In addition, SCSI is basically a must when talking about your server.
Size isn't the issue, as the capacities available are generally the same for ATA and SCSI. The key
difference is performance. This is particularly important in a multi-user environment.
Let's take the Seagate Cheetah as an example. As of this writing it is the fastest drive available
on the market, at 10,000 RPM. It has a maximum internal transfer rate of
306 Mbits/s, which means it is even faster than the 80 Mbits/s of the Ultra SCSI
interface. It has an average seek time of 6 milliseconds and an average latency of 2.99
milliseconds, which means the average access time is under 9 milliseconds. To compensate for the
slower interface, the Cheetah series has a default buffer size of 1 MB. In addition, the throughput
is too high to use anything other than SCSI or Fibre Channel, so it is not available with an ATA
interface.
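The access time figure is easy to check yourself. The sketch below is in Python and uses the usual rule of thumb that average rotational latency is the time for half a revolution. That the quoted 2.99 milliseconds was derived this way is my assumption, and the function names are made up for this example.

```python
def average_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def average_access_ms(seek_ms: float, rpm: float) -> float:
    """Average access time: average seek time plus average latency."""
    return seek_ms + average_latency_ms(rpm)


# A 10,000 RPM drive with a 6 ms average seek time.
print(round(average_latency_ms(10_000), 2))
print(round(average_access_ms(6.0, 10_000), 2))

# The same seek time on a 5400 RPM drive, for comparison.
print(round(average_access_ms(6.0, 5_400), 2))
```

The rule of thumb gives almost exactly 3 milliseconds of latency at 10,000 RPM, which agrees with the quoted 2.99 milliseconds to within rounding. The comparison line shows how much of the access time on a slower drive is simply spent waiting for the platter to come around.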
There are also a few other reasons why something like the Cheetah is the perfect solution for a
server. First, it supports up to 15 devices on a single wide SCSI
bus. Using the Fibre Channel versions, you can get up to 126 devices, which
are also hot swappable.
Another thing to consider is maintenance and administration. Low-end
Medalist drives have an expected mean time between failures (MTBF) of 400,000 hours, which is about
45 years. The MTBF for the Cheetah is approximately 1,000,000 hours, or over 100 years. No wonder I
haven't ever had a harddisk crash.
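The conversion from MTBF hours to years is simple arithmetic, sketched here in Python. It assumes the drive runs around the clock and that a year has 365 days. The function name is made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # continuous operation, leap years ignored


def mtbf_years(mtbf_hours: float) -> float:
    """Convert an MTBF given in hours into years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


for name, hours in (("Medalist", 400_000), ("Cheetah", 1_000_000)):
    print(f"{name}: {mtbf_years(hours):.1f} years")
```

This gives a little over 45 years for the Medalist and roughly 114 years for the Cheetah, in line with the figures above.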
The Seagate drives also do something else to reduce maintenance and administration costs. First,
there is something Seagate calls SeaShield, which other harddisk manufacturers should
adopt. This is simply a protective cover around the electronics that are exposed on other harddisks.
It protects the electronics from static electrical discharge, as well as from damage caused by
bumping the drive against something. In addition, this cover provides the perfect space for
installation instructions, like the jumper settings. There is no need to go hunting around for the
data sheet, which often isn't supplied with the drive. Talk about saving administration costs!
Some of you might be saying that names like Cheetah go against my desire to have understandable
model names. My answer is that the opposite is true. As of this writing Seagate has four primary
series: Medalist, Medalist Pro, Barracuda and Cheetah. The series name simply tells you the rotation
rate, which is 5400, 7200, 7200 and 10,000 RPM respectively. The Medalist is Ultra ATA. The
Medalist Pro is either ATA or SCSI. The Barracuda and Cheetah are either SCSI
or Fibre Channel. Okay, this requires you to use your brain a little, but it is far easier than
with many other vendors.
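Because the series names map so directly to characteristics, the whole scheme fits in a small lookup table. The Python sketch below holds only the rotation rates and interfaces listed above. The table layout and the function name are made up for this example.

```python
# Series name -> (rotation rate in RPM, available interfaces), as listed above.
SERIES = {
    "Medalist": (5400, ("Ultra ATA",)),
    "Medalist Pro": (7200, ("ATA", "SCSI")),
    "Barracuda": (7200, ("SCSI", "Fibre Channel")),
    "Cheetah": (10000, ("SCSI", "Fibre Channel")),
}


def series_with_interface(interface: str) -> list[str]:
    """Names of the series offered with the given interface.

    Matching is by substring, so "ATA" also finds "Ultra ATA".
    """
    return [
        name
        for name, (_rpm, interfaces) in SERIES.items()
        if any(interface in offered for offered in interfaces)
    ]


print(series_with_interface("ATA"))
print(series_with_interface("Fibre Channel"))
print(SERIES["Cheetah"][0], "RPM")
```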
|