11. How do computer languages work?
We've already discussed how programs
are run. Every program ultimately has to execute as a stream of
bytes that are instructions in your computer's machine
language. But human beings don't deal with machine
language very well; doing so has become a rare, black art even among
programmers.
Almost all Unix code except a small amount of direct
hardware-interface support in the kernel itself is nowadays written in a
high-level language. (The ‘high-level’ in this term
is a historical relic meant to distinguish these from ‘low-level’ assembler
languages, which are basically thin wrappers around machine code.)
There are several different kinds of high-level languages. In order
to talk about these, you'll find it useful to bear in mind that the
source code of a program (the human-created, editable
version) has to go through some kind of translation into machine code that
the machine can actually run.
11.1. Compiled languages
The most conventional kind of language is a compiled
language. Compiled languages get translated into
runnable files of binary machine code by a special program called
(logically enough) a compiler.
Once the binary has been generated, you can run it directly without looking
at the source code again. (Most software is delivered as compiled binaries
made from code you don't see.)
Compiled languages tend to give excellent performance and have the most
complete access to the OS, but they also tend to be difficult to program in.
C, the language in which Unix itself is written, is by far the most
important of these (with its variant C++). FORTRAN, another compiled
language still used among engineers and scientists, is years older and much
more primitive. In the Unix world no other compiled languages are in
mainstream use. Outside it, COBOL is very widely used for financial and
business software.
There used to be many other compiled languages, but most of them have
either gone extinct or are strictly research tools. If you are a new
Unix developer using a compiled language, it is overwhelmingly likely
to be C or C++.
11.2. Interpreted languages
An interpreted language depends on an interpreter program that reads
the source code and translates it on the fly into computations and system
calls. The source has to be re-interpreted (and the interpreter present)
each time the code is executed.
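
To make the idea concrete, here is a minimal sketch of an interpreter, written in Python. The three-statement toy language (set, add, print) and the function name interpret are invented purely for illustration and do not correspond to any real Unix tool. The point to notice is that the source text is read and acted on statement by statement, and that the whole job is redone from scratch on every run.

```python
def interpret(source):
    """Read source text line by line and carry out each statement at once."""
    variables = {}
    output = []
    for line in source.splitlines():
        words = line.split()
        if not words:
            continue  # skip blank lines
        if words[0] == "set":        # set NAME NUMBER
            variables[words[1]] = int(words[2])
        elif words[0] == "add":      # add NAME NUMBER
            variables[words[1]] += int(words[2])
        elif words[0] == "print":    # print NAME
            output.append(variables[words[1]])
        else:
            raise SyntaxError(f"unknown statement: {line!r}")
    return output


program = """
set x 40
add x 2
print x
"""

# Each run starts again from the source text; nothing is saved in between.
print(interpret(program))
print(interpret(program))
```

Nothing resembling a runnable binary is ever produced here, which is why the interpreter itself has to be present whenever the program is run.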
Interpreted languages tend to be slower than compiled languages, and
often have limited access to the underlying operating system and hardware.
On the other hand, they tend to be easier to program and more forgiving of
coding errors than compiled languages.
Many Unix utilities, including the shell, bc(1), sed(1), and awk(1),
are effectively small interpreted languages. BASICs are usually
interpreted. So is Tcl. Historically, the most important interpreted
language has been LISP (a major improvement over most of its successors).
Today, Unix shells and the Lisp that lives inside the Emacs editor are
probably the most important pure interpreted languages.
11.3. P-code languages
Since 1990 a kind of hybrid language that uses both compilation and
interpretation has become increasingly important. P-code languages are
like compiled languages in that the source is translated to a compact
binary form which is what you actually execute, but that form is not
machine code. Instead it's pseudocode (or p-code),
which is usually a lot simpler but more powerful than a real machine
language. When you run the program, you interpret the p-code.
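
The two-step scheme described above can be sketched in a few lines of Python. Everything here is hypothetical: the instruction names (PUSH, ADD, SUB) and the functions compile_to_pcode and run_pcode are made up for the example, and the "source language" is just whitespace-separated numbers joined by + and -. Real p-code formats are far richer, but the shape is the same: translate once into simple instructions for an imaginary machine, then interpret those instructions.

```python
def compile_to_pcode(source):
    """Translate text like '10 + 5 - 3' into a list of p-code instructions."""
    tokens = source.split()
    pcode = [("PUSH", int(tokens[0]))]
    # The remaining tokens come in (operator, number) pairs.
    for op, number in zip(tokens[1::2], tokens[2::2]):
        if op not in ("+", "-"):
            raise SyntaxError(f"unknown operator: {op!r}")
        pcode.append(("PUSH", int(number)))
        pcode.append(("ADD",) if op == "+" else ("SUB",))
    return pcode


def run_pcode(pcode):
    """Interpret p-code on a tiny stack machine and return the result."""
    stack = []
    for instruction in pcode:
        if instruction[0] == "PUSH":
            stack.append(instruction[1])
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if instruction[0] == "ADD" else left - right)
    return stack.pop()


pcode = compile_to_pcode("10 + 5 - 3")  # translation happens once
print(pcode)
print(run_pcode(pcode))                 # the p-code can be run many times
```

Note that run_pcode never sees the original text: all the parsing was done ahead of time, so the loop that executes the p-code has very little work to do per instruction.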
P-code can run nearly as fast as a compiled binary (p-code interpreters
can be made quite simple, small and speedy). But p-code languages can keep
the flexibility and power of a good interpreter.
Important p-code languages include Python, Perl, and Java.
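
Since Python is on that list, you can watch both steps happen using its built-in compile() and exec() functions. This is only a sketch: the variable names and the "<example>" filename label are arbitrary, and the co_code attribute shown here is a detail of CPython (the standard Python implementation) whose byte format changes between versions.

```python
source = "result = 6 * 7"

# Step 1: translate the source text into a code object holding p-code
# (Python calls it bytecode).
code = compile(source, "<example>", "exec")
print(type(code.co_code))   # the raw p-code is a bytes string
print(len(code.co_code))    # its length differs between Python versions

# Step 2: hand the p-code to the interpreter to run.
namespace = {}
exec(code, namespace)
print(namespace["result"])
```

When you run a Python program normally, both steps happen automatically, so you rarely see the intermediate form.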