Interrupts, Exceptions, and Traps
Normally, processes are asleep, waiting on some event.
When that event happens, these processes
are called into action. Remember, it is the responsibility of the sched process
to free memory when a process runs short of it. So, it is not until memory is
needed that sched starts up.
How does sched know that memory is needed? When a process makes reference to
a place in its virtual memory space that does not yet exist in
physical memory, a page fault occurs.
Faults belong to a group of system events called
exceptions. An exception is simply something that occurs outside of what
is normally expected. Faults (exceptions) can occur either before or during the
execution of an instruction.
For example, if an instruction that is not yet in memory needs to be read,
the exception (page fault) occurs before the instruction starts being
executed. On the other hand, if the instruction is supposed to read data from a
virtual memory location that isn’t in physical memory, the exception occurs
during the execution of the instruction. In cases like these, once the
missing memory location is loaded into physical memory, the CPU can restart
the instruction.
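The fault-then-restart cycle described above can be sketched as a small simulation. This is only an illustration, not kernel code: the names `PageFault`, `backing_store`, `physical_memory`, and `run_instruction`, as well as the 4096-byte page size, are assumptions made for the example.

```python
class PageFault(Exception):
    """Raised when a virtual page has no copy in physical memory yet."""
    def __init__(self, page):
        super().__init__(f"page fault on page {page}")
        self.page = page

PAGE_SIZE = 4096                         # assumed page size for this sketch
backing_store = {0: "code", 1: "data"}   # hypothetical pages kept on disk
physical_memory = {}                     # page number -> resident contents

def read(address):
    """Read a virtual address; fault if its page is not resident."""
    page = address // PAGE_SIZE
    if page not in physical_memory:
        raise PageFault(page)
    return physical_memory[page]

def run_instruction(address):
    """Run a 'load' instruction, restarting it after each serviced fault."""
    faults = 0
    while True:
        try:
            return read(address), faults
        except PageFault as fault:
            # Fault handler: load the missing page, then retry the instruction.
            physical_memory[fault.page] = backing_store[fault.page]
            faults += 1

value, faults = run_instruction(PAGE_SIZE + 16)
print(value, faults)   # prints "data 1"
```

Running the same instruction a second time causes no fault, because the page is already resident.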
Traps are exceptions that occur after an instruction has been executed.
For example, attempting to divide by zero generates an exception. However,
in this case it doesn’t make sense to restart the instruction because every time we
try to run that instruction, it still comes up with a Divide-by-Zero exception.
That is, all memory references are read before we start to execute the command.
It is also possible for processes to generate exceptions intentionally. These programmed
exceptions are called software interrupts.
When any one of these exceptions occurs, the system must react to the exception. To react,
the system will usually switch to another process to deal with the exception, which means
a context switch. In our discussion of process scheduling, I mentioned
that at every clock tick the priority of every process
is recalculated. To make those calculations, something other than those
processes has to run.
In Linux, the system timer (or clock) is programmed to generate a hardware
interrupt 100 times a second (as defined by the HZ system parameter).
The interrupt is accomplished by sending a signal to a special chip
on the motherboard called an interrupt controller. (We go into more
detail about these in the section on hardware.)
The interrupt controller then sends an interrupt to the
CPU.
When the CPU receives this signal,
it knows that the clock tick
has occurred and it jumps to a special part of the kernel
that handles the clock interrupt. Scheduling priorities are also
recalculated within this same section of code.
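As a rough sketch of what happens on each tick, the handler below counts the tick and recalculates priorities in the same routine. The process table, the `cpu_ticks` field, and the priority rule (one priority step per ten ticks used) are invented for the illustration; the real scheduler's calculation is more involved.

```python
HZ = 100          # clock interrupts per second, as given in the text
jiffies = 0       # ticks counted so far (name borrowed from Linux usage)

# Hypothetical process table: CPU ticks consumed and resulting priority.
processes = {
    "editor": {"cpu_ticks": 0, "priority": 0},
    "compiler": {"cpu_ticks": 0, "priority": 0},
}

def clock_interrupt(running):
    """Code the CPU jumps to on each clock tick."""
    global jiffies
    jiffies += 1
    processes[running]["cpu_ticks"] += 1
    # Invented rule: the more CPU a process has used, the larger (worse)
    # its priority value becomes.
    for proc in processes.values():
        proc["priority"] = proc["cpu_ticks"] // 10

def pick_next():
    """Choose the process with the best (lowest) priority value."""
    return min(processes, key=lambda name: processes[name]["priority"])

# Simulate one second in which only the compiler runs.
for _ in range(HZ):
    clock_interrupt("compiler")

print(jiffies, processes["compiler"]["priority"], pick_next())
# prints "100 10 editor"
```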
Because the system might be doing something more important when the clock
generates an interrupt, interrupts can be turned off, or “masked out.”
Interrupts that can be masked out are called maskable interrupts. An example of
something more important than the clock would be accepting input from the
keyboard. This is why clock ticks are lost on systems with a lot of users
inputting a lot of data. As a result, the system clock appears to slow down
over time.
Sometimes events occur on the system that you want to know about no matter
what. Imagine what would happen if memory were bad. If the system were in the
middle of writing to the hard disk when it encountered the bad memory, the
results could be disastrous. If the system recognizes the bad memory, the
hardware generates an interrupt to alert the CPU. If the CPU had been told to
ignore all hardware interrupts, it would ignore this one as well. Instead, the
hardware has the ability to generate an interrupt that cannot be ignored, or
“masked out”, called a non-maskable interrupt. Non-maskable interrupts are
generically referred to as NMIs.
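The difference between maskable and non-maskable interrupts can be illustrated with a toy CPU model. The `CPU` class and its fields are hypothetical, and vector 32 for the clock is an assumption (it is simply the first of the external vectors); vector 2 for the NMI comes from the table of interrupt vectors. Real interrupt controllers can keep a request pending while it is masked; the sketch drops it to mirror the lost clock ticks described above.

```python
NMI_VECTOR = 2      # non-maskable interrupt, per the vector table
CLOCK_VECTOR = 32   # assumed: first of the external (32-255) vectors

class CPU:
    def __init__(self):
        self.interrupts_enabled = True
        self.handled = []   # vectors that reached a handler
        self.lost = 0       # maskable interrupts dropped while masked

    def signal(self, vector):
        """Deliver an interrupt; return True if it was handled."""
        if vector != NMI_VECTOR and not self.interrupts_enabled:
            self.lost += 1          # masked out: this tick is simply missed
            return False
        self.handled.append(vector)
        return True

cpu = CPU()
cpu.interrupts_enabled = False      # busy with something more important
cpu.signal(CLOCK_VECTOR)            # dropped
cpu.signal(NMI_VECTOR)              # still delivered
cpu.interrupts_enabled = True
cpu.signal(CLOCK_VECTOR)            # delivered again
print(cpu.handled, cpu.lost)        # prints "[2, 32] 1"
```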
When an interrupt or an exception occurs, it must be dealt with to ensure the
integrity of the system. How the system reacts depends on whether it was an
exception or an interrupt. In addition, what happens when the hard disk
generates an interrupt is going to be different from what happens when the
clock generates one.
Within the kernel is the Interrupt Descriptor Table (IDT), which is a list of
descriptors (pointers) that point to the functions that handle the particular
interrupt or exception. These functions are called the interrupt or exception
handlers. When an interrupt or exception occurs, it has a particular value,
called an identifier or vector. Table 0-2 contains a list of the defined
interrupt vectors.
Table 0-2 Interrupt Vectors
| Identifier | Description                |
|------------|----------------------------|
| 0          | Divide error               |
| 1          | Debug exception            |
| 2          | Non-maskable interrupt     |
| 3          | Breakpoint                 |
| 4          | Overflow                   |
| 5          | Bounds check               |
| 6          | Invalid opcode             |
| 7          | Coprocessor not available  |
| 8          | Double fault               |
| 9          | (reserved)                 |
| 10         | Invalid TSS                |
| 11         | Segment not present        |
| 12         | Stack exception            |
| 13         | General protection fault   |
| 14         | Page fault                 |
| 15         | (reserved)                 |
| 16         | Coprocessor error          |
| 17         | Alignment error (80486)    |
| 18-31      | (reserved)                 |
| 32-255     | External (HW) interrupts   |
These numbers are actually indices into the IDT. When an interrupt,
exception, or trap occurs, the system knows which number corresponds to that
event. It then uses that number as an index into the IDT, which in turn points
to the appropriate area of memory for handling the event.
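The idea of using the vector as an index can be sketched with an ordinary list of functions. This is a simplified model: a real IDT holds gate descriptors rather than plain function pointers, and the handler names and the `unhandled` default here are made up for the example. Only vectors 0 and 14 are filled in, using the numbers from the table.

```python
def divide_error():
    return "divide error handler"

def page_fault():
    return "page fault handler"

def unhandled():
    return "unexpected interrupt"

# One slot per vector, 0 through 255; unused slots get a default handler.
IDT = [unhandled] * 256
IDT[0] = divide_error
IDT[14] = page_fault

def dispatch(vector):
    """Use the vector as an index into the table and call what it points to."""
    return IDT[vector]()

print(dispatch(14))   # prints "page fault handler"
```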
It is possible for devices to share interrupts; that is, multiple devices on
the system can be (and often are) configured to use the same interrupt. In
fact, certain kinds of computers are designed to allow devices to share
interrupts (I’ll talk about them in the hardware section). If the interrupt
number is an offset into a table of pointers to interrupt routines, how
does the kernel know which one to call?
As it turns out, there are two IDTs: one for shared interrupts and one for non-shared
interrupts. During a kernel rebuild (more on that later),
the kernel determines whether the interrupt
is shared. If it is, it places the pointer to that interrupt
routine into the shared IDT.
When an interrupt is generated, the interrupt routine for each of these
devices is called. It is up to the interrupt routine to check whether the
associated device really generated an interrupt. The order in which they are
called is the order in which they are linked.
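A minimal sketch of that calling convention follows. The `Device` class, the `link` helper, the device names, and interrupt number 11 are all hypothetical; the point is only that every linked routine is called in link order and each one checks its own device.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.interrupt_pending = False   # set when the device raises the line
        self.serviced = 0

    def handler(self):
        """Return True only if this device really generated the interrupt."""
        if not self.interrupt_pending:
            return False
        self.interrupt_pending = False
        self.serviced += 1
        return True

shared_idt = {}   # interrupt number -> routines, in the order they were linked

def link(irq, device):
    shared_idt.setdefault(irq, []).append(device.handler)

def shared_interrupt(irq):
    """Call every routine linked to this interrupt, in link order."""
    return [routine() for routine in shared_idt[irq]]

net = Device("network")
scsi = Device("scsi")
link(11, net)
link(11, scsi)

scsi.interrupt_pending = True
results = shared_interrupt(11)
print(results)   # prints "[False, True]"
```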
When an exception happens in user mode, the process passes through a trap
gate. At this point, the CPU no longer uses the process’ user stack, but rather
the system stack within that process’ task structure. (Each task structure has
a portion set aside for the system stack.) The process is now operating in
system (kernel) mode; that is, at the highest privilege level, 0.
The kernel treats interrupts very similarly to the way it treats exceptions:
all the general purpose registers are pushed onto the system stack and a
common interrupt handler is called. The current interrupt priority is saved
and the new priority is loaded. This prevents interrupts at lower priority
levels from interrupting the kernel while it handles this interrupt. Then the
real interrupt handler is called.
If the exception is not fatal, the process returns to the point from which it
came. It is possible that a context switch occurs immediately on return from
kernel mode. This might be the result of an exception with a lower priority.
Because it could not interrupt the process in kernel mode, it had to wait until
the process returned to user mode. Because the exception has a higher priority
than the process when it is in user mode, a context switch occurs immediately
after the process returns to user mode.
It is abnormal for another exception to occur while the process is in kernel
mode. Even a page fault can be considered a software event. Because the entire
kernel is in memory all the time, a page fault should not happen. If a page
fault does happen in kernel mode, the kernel panics. Special routines have been
built into the kernel to deal with the panic and to help the system shut down
as gracefully as possible. Should something else happen to cause another
exception while the system is trying to panic, a double panic occurs.
This may sound confusing because I just said that a context switch could occur
as the result of another exception. What this means is that the exception
occurred in user mode, so there must be a jump to kernel mode. This does not
mean that the process continues in kernel mode until it is finished. It may
(depending on what it is doing) be context-switched out. If another process has
run before the first one gets its turn on the CPU again, that process may
generate the exception.
Unlike exceptions, another interrupt could possibly occur while the kernel
is handling the first one (and therefore is in kernel mode). If the second
interrupt has a higher priority than the first, a context switch will occur and
the new interrupt will be handled. If the second interrupt has the same or
lower priority, then the kernel will “put it on hold.” These are not ignored,
but rather saved (queued) to be dealt with later.
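The priority rule can be illustrated with a toy model in which a higher-priority interrupt is handled at once and anything else is held in a queue. The `Kernel` class, the `during` parameter (interrupts that arrive while a handler runs), and the priority numbers are assumptions made for this sketch; it models the nesting with plain function calls rather than a real context switch.

```python
import heapq

class Kernel:
    def __init__(self):
        self.current_priority = 0   # 0: no interrupt is being handled
        self.held = []              # queued interrupts, highest priority first
        self.log = []
        self._seq = 0               # keeps arrival order among equal priorities

    def interrupt(self, name, priority, during=()):
        if priority <= self.current_priority:
            # Same or lower priority: put it on hold rather than ignore it.
            heapq.heappush(self.held, (-priority, self._seq, name, during))
            self._seq += 1
            return
        saved = self.current_priority        # save the old priority
        self.current_priority = priority     # load the new one
        self.log.append(f"start {name}")
        for nested in during:                # interrupts arriving meanwhile
            self.interrupt(*nested)
        self.log.append(f"end {name}")
        self.current_priority = saved
        # Deal with held interrupts that now outrank the restored level.
        while self.held and -self.held[0][0] > self.current_priority:
            neg, _, held_name, held_during = heapq.heappop(self.held)
            self.interrupt(held_name, -neg, held_during)

kernel = Kernel()
kernel.interrupt("disk", 5, during=[("clock", 3), ("serial", 7)])
print(kernel.log)
```

In this run the serial interrupt (priority 7) is handled inside the disk handler (priority 5), while the clock interrupt (priority 3) waits until the disk handler has finished.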