The Domain Name System

DNS – Finding Other Machines

If you have TCP/IP installed, your machine is set up by default to use the /etc/hosts file. This is a list of IP addresses and the matching machine names. When you try to connect to another machine, you can do it either with the IP address or the name. If you use the name, the system will look in the /etc/hosts file and make the translation from name to IP address. The only real drawback with this scheme is that every time a machine is added to or removed from the network, you have to change the /etc/hosts file on all the affected machines.
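To make the name-to-address translation concrete, here is a minimal sketch of a lookup against text in /etc/hosts format. The sample entries and their addresses are made up for illustration, and the function names parse_hosts and lookup are hypothetical. The real system resolver does more than this.

```python
# Hypothetical sample in /etc/hosts format: the address comes first,
# followed by one or more names for that machine.
SAMPLE_HOSTS = """\
# address      name(s)
127.0.0.1      localhost
192.168.42.1   boomer.yourdomain.com    boomer
192.168.42.2   darkstar.yourdomain.com  darkstar
"""


def parse_hosts(text):
    """Return a dict mapping each host name (lower-cased) to its IP address."""
    table = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # The first entry for a name wins, as in a top-to-bottom scan.
            table.setdefault(name.lower(), address)
    return table


def lookup(table, name):
    """Translate a name to an address, or return None if it is not listed."""
    return table.get(name.lower())


hosts = parse_hosts(SAMPLE_HOSTS)
print(lookup(hosts, "darkstar"))   # listed in the sample
print(lookup(hosts, "iguana"))     # not listed, so None
```

The second lookup shows the drawback described above: a machine that was never added to this particular copy of the file simply cannot be found by name.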

Those of you who have had to administer large networks know that updating every /etc/hosts file like this can be a real pain. There is always at least one that you forget, or you mistype the name or address and have to go back and change it on every machine. Fortunately, there is hope.

Provided with Linux is a hostname/IP address database called the Berkeley Internet Name Domain (BIND) service. Instead of updating every machine in the network, there is a Domain Name System (DNS) server that maintains the database and provides the client machines with information about both addresses and names. If machines are added or removed, there is only one machine that needs to be changed. This is the name server. (Note: Some documentation translates DNS as Domain Name Server. Other references (most importantly the RFCs) call it the Domain Name System. I have seen some references call it Domain Name Service. Since we know what it is, I’ll just call it DNS.)

So, when do you use DNS over the /etc/hosts file? Well, it’s up to you. The first question I would ask is “Are you connecting to the Internet?” If the answer is “yes”, “maybe” or “someday” then definitely set up DNS.

DNS functions somewhat like directory assistance from the phone company. If your local directory assistance doesn’t have the number, you can contact the one for the area where you are looking. Likewise, if your name server doesn’t have the answer, it will query other name servers for that information (assuming you told it to do so). Considering how many machines are on the Internet, it is unrealistic to configure everything in the /etc/hosts file.

If you are never going to connect to the Internet, then the answer is up to you. If you only have two machines in your network, the trouble of setting up DNS is not worth it. On the other hand, if you have a dozen or more machines, then setting it up makes life easier in the long run.

There are several key concepts that need to be discussed before we dive into DNS. The first is that DNS, like so many other aspects of TCP/IP, is client-server oriented. We have the name server, which contains the IP addresses and names and serves this information to the clients. Next, we need to think about DNS operating in an environment similar to a directory tree. All machines that fall under DNS can be thought of as files in this directory tree structure. These machines are often referred to as nodes. As with directories and file names, there is a hierarchy of names within the tree. This is often referred to as the domain name space.

A branch of the DNS tree is referred to as a domain. A domain is simply a collection of computers that are managed by a single organization. This organization can be a company, university or even a government agency. The organization has a name that it is known by to the outside world. In conjunction with the domains of the individual organizations, there are things called top-level domains. These are broken down by the function of the domains under them. The original top-level domains are:

COM – Commercial
EDU – Educational
GOV – Government
NET – Network
MIL – Military
ORG – Non-profit organizations

Each domain will fall within one of these top-level domains. For example, there is the domain google, which falls under the commercial top-level domain. It is thus designated as google.com. The domain assigned to the White House is whitehouse.gov. The domain assigned to the University of California at Santa Cruz is ucsc.edu. (Note that the dot is used to separate the individual components in the machine’s domain and name.)

Keep in mind that these domains are used primarily within the US. While a foreign subsidiary might belong to one of these top-level domains, for the most part, the top-level domain within most non-US countries is the country code. For example, the geographical domain for Germany is indicated by the abbreviation de (for Deutschland). This is a general rule, however, not an absolute one: I do know some German companies within the com domain. There are also geographic domains within the US, such as ca.us for California, as compared to just .ca for Canada. These are often used for very small domains or non-organizations, such as individuals.

In many places, they will use a combination of the upper-level domains that are used in the US and their own country code. For example, the domain name of an Internet provider in Singapore is singnet.com.sg. (Where sg is the country code for Singapore.)

Image – Internet domains

Within each domain, there may be sub-domains. However, there don’t have to be any. You usually find sub-domains in larger domains in an effort to break down the administration into smaller units. For example, if your company had a sub-domain for sales it might be sales.yourdomain.com.

Keep in mind that these are just the domain names, not the machine, or node, names. Within a domain there can be (in principle) any number of machines. A machine sitting on the desk in the Oval Office might be called boss1. Its full name, including domain, would be boss1.pres.whitehouse.gov. A machine in your sales department called darkstar would then be darkstar.sales.yourdomain.com.

Up to now, the longest machine name I have seen had five components: the machine name, two sub-domains, the company domain and then the top-level domain. On the other hand, if there was no sales sub-domain, and everything was under the yourdomain.com domain, the machine’s name would be: darkstar.yourdomain.com.
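As a rough illustration of how such a name breaks down, the following sketch splits a machine's full name into its components. The function name describe is hypothetical, and the sketch assumes the organization's domain sits directly below the top-level domain, which does not hold for names like singnet.com.sg.

```python
def describe(full_name):
    """Split a full machine name into machine, sub-domains, domain and top-level domain.

    Assumption: the organization's domain is the label directly below
    the top-level domain (true for yourdomain.com, not for com.sg names).
    """
    # Ignore a trailing dot, if any, before splitting on the separators.
    labels = full_name.rstrip(".").split(".")
    if len(labels) < 3:
        raise ValueError("expected at least machine.domain.top-level")
    return {
        "machine": labels[0],
        "sub_domains": labels[1:-2],   # empty when there are no sub-domains
        "domain": labels[-2],
        "top_level": labels[-1],
    }


print(describe("darkstar.sales.yourdomain.com"))
print(describe("darkstar.yourdomain.com"))
```

The first call reports one sub-domain (sales); the second reports none, matching the two forms of the name discussed above.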

You may often see the fully-qualified domain name (FQDN) of a machine listed like this:

darkstar.yourdomain.com.

Note the trailing dot (.). That dot indicates the root domain, which has no name other than root domain or . (read “dot”), very much as the root directory has no name other than root or /. In some cases this dot is optional. However, there are cases where it is required, and we’ll get to those in the section on configuring DNS.
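The role of the trailing dot can be sketched in a few lines. This is only an illustration under the assumption that a name without the trailing dot is treated as relative and completed with some default domain. The helper names is_absolute and qualify are hypothetical.

```python
def is_absolute(name):
    """True if the name ends with the dot that stands for the root domain."""
    return name.endswith(".")


def qualify(name, default_domain):
    """Return an absolute name, appending default_domain to relative names.

    Assumption: names without a trailing dot are relative to default_domain.
    """
    if is_absolute(name):
        return name
    return f"{name}.{default_domain.rstrip('.')}."


print(qualify("darkstar", "yourdomain.com"))
print(qualify("darkstar.yourdomain.com.", "yourdomain.com"))
```

Both calls yield darkstar.yourdomain.com. with the trailing dot. Note what happens if the dot is forgotten on a name that was meant to be complete: qualify("darkstar.yourdomain.com", "yourdomain.com") would append the domain a second time, which is exactly why the dot matters where it is required.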

As with files, it is possible for two machines to have the same name. The only criterion for files is that their full path be unique. The same applies to machines. For example, there might be a machine darkstar at the White House. (Maybe George is a closet Dead Head.) Its FQDN would be darkstar.whitehouse.gov. This is obviously not the same machine as darkstar.yourdomain.com any more than 1033 Main Street in Santa Cruz is the same as 1033 Main Street in Annsville. Even something like darkstar.support.yourdomain.com is different from darkstar.sales.yourdomain.com.

A zone is a grouping of machines that may, or may not, be the same as a domain. This is the set of machines over which a particular name server has authority and maintains the data. In our example above, there might be a zone for support, even if there was no sub-domain. On the other hand, there might be a team.support.yourdomain.com domain, but the zone is still yourdomain.com. In other words, zones can be subordinate or superior to domains. Basically, zones are used to make the job of managing the name server easier, so what constitutes a zone depends on your specific circumstances.
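One way to picture the relationship between names and zones is a longest-suffix match: a machine belongs to the most specific zone whose name forms the tail of the machine's name. The sketch below assumes a hypothetical setup with two zones, yourdomain.com and support.yourdomain.com. The function name zone_for is made up for this example.

```python
def labels(name):
    """Split a name into lower-cased labels, ignoring any trailing dot."""
    return [part for part in name.lower().rstrip(".").split(".") if part]


def zone_for(name, zones):
    """Return the most specific zone in `zones` that contains `name`, or None."""
    name_labels = labels(name)
    best = None
    best_length = -1
    for zone in zones:
        zone_labels = labels(zone)
        n = len(zone_labels)
        # A name is inside a zone when the zone's labels form the tail of the name.
        if n <= len(name_labels) and name_labels[len(name_labels) - n:] == zone_labels:
            if n > best_length:
                best, best_length = zone, n
    return best


# Hypothetical zone list: there is no separate zone for sales or for team.support.
ZONES = ["yourdomain.com", "support.yourdomain.com"]

print(zone_for("darkstar.sales.yourdomain.com", ZONES))
print(zone_for("darkstar.team.support.yourdomain.com", ZONES))
print(zone_for("darkstar.whitehouse.gov", ZONES))
```

Here the sales sub-domain has no zone of its own, so its machines fall into the yourdomain.com zone, while everything under support, including the team sub-domain, falls into the support zone.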

In DNS, there are a couple of different types of servers. A primary server is the master server for one or more DNS zones. It maintains the database files for each of its zones and is considered the authority for them. It may also periodically transfer data to a secondary server, if one exists for that zone.

DNS functions are carried out by the Internet domain name server: named. When it starts, named reads its configuration file to determine what zones it is responsible for and in which files the data is stored. By default, the configuration file is /etc/named.conf. However, named can be started with the -b option (-c in current BIND 9 releases) to specify an alternate configuration file. Normally, named is started from a script in /etc/rc.d.

For example, the primary server for the yourdomain.com domain needs to know about the machines within the support.yourdomain.com domain. It could serve as a secondary server for the support.yourdomain.com domain, whereby it would maintain all the records for the machines within that sub-domain. If, on the other hand, it serves as a stub server, the primary for yourdomain.com need only know how to get to the primary for the support.yourdomain.com sub-domain. Note here that it is possible for a server to be primary in one zone and secondary in another.

By moving responsibility to the sub-zone, the administrator of the parent zone does not need to concern him or herself with changing the configuration files when a machine is added or removed within the sub-zone. As long as the address of the sub-zone’s primary server matches the stub server entry, all is well.

A secondary server takes over for the primary, should the primary go down or be otherwise inaccessible. A secondary server maintains copies of the database files, and “refreshes” them at predetermined intervals. If it cannot reach the primary to refresh its files, it will keep trying at (again) predetermined intervals. If, after another predetermined time, the secondary still cannot reach the primary, the secondary considers its data invalid and flushes it.
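The three predetermined times can be modeled with a small sketch. The parameter names refresh, retry and expire are chosen here for illustration (in practice the values come from the zone's SOA record), and the functions are a simplification of what named actually does, assuming every attempt to reach the primary fails.

```python
def secondary_status(seconds_since_contact, refresh, expire):
    """Classify the secondary's data by the time since it last reached the primary."""
    if seconds_since_contact < refresh:
        return "current"      # no refresh is due yet
    if seconds_since_contact < expire:
        return "retrying"     # refresh overdue: keep trying, keep serving the data
    return "expired"          # data is considered invalid and is flushed


def attempt_times(refresh, retry, expire):
    """Times (seconds after the last successful contact) at which the secondary
    tries the primary, assuming every attempt fails until the data expires."""
    return list(range(refresh, expire, retry))


# Hypothetical values: refresh hourly, retry every 10 minutes, expire after 2 hours.
print(secondary_status(1800, refresh=3600, expire=7200))
print(secondary_status(5000, refresh=3600, expire=7200))
print(secondary_status(7200, refresh=3600, expire=7200))
print(attempt_times(refresh=3600, retry=600, expire=7200))
```

With these made-up values the secondary makes its first attempt after an hour, repeats it every ten minutes, and gives up on its copy of the data two hours after the last successful contact.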

Caching-only servers save data in a cache only until that data expires. The expiration time is based on a field within the data that is received from another server. This is called the time-to-live. Time-to-live is a regularly occurring concept within DNS.
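A time-to-live cache is simple to sketch. The class below is a toy model, not how named stores its cache: the class name TtlCache is hypothetical, the address is made up, and the current time is passed in explicitly so the behavior is easy to follow.

```python
class TtlCache:
    """Remember answers only until their time-to-live runs out."""

    def __init__(self):
        self._entries = {}  # name -> (address, time at which the entry expires)

    def store(self, name, address, ttl, now):
        """Cache an answer received at time `now` with the given time-to-live."""
        self._entries[name] = (address, now + ttl)

    def lookup(self, name, now):
        """Return the cached address, or None if it is unknown or has expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            # The data has outlived its time-to-live, so discard it.
            del self._entries[name]
            return None
        return address


cache = TtlCache()
cache.store("darkstar.yourdomain.com.", "192.168.42.2", ttl=300, now=0)
print(cache.lookup("darkstar.yourdomain.com.", now=299))  # still valid
print(cache.lookup("darkstar.yourdomain.com.", now=300))  # expired, so None
```

Once an entry has expired, the server has to ask another server again, which is how changes made on the authoritative server eventually reach everyone.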

A slave server can be a primary, secondary, or caching-only server. If it cannot satisfy the query locally, it will pass, or forward, the request to a fixed list of forwarders (forwarding servers), rather than interacting directly with the primary name servers of other zones. These requests are recursive, which means that the forwarder must answer either with the requested information or by saying it doesn’t know. The requesting machine then asks the next server, then the next and then the next until it finally runs out of servers to check or gets an answer. Slave servers never attempt to contact servers other than the forwarders.
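The forwarding behavior just described can be sketched as a loop over a fixed list. In this toy model each forwarder is represented by a function that returns an address or None for “I don’t know”. The function name resolve_via_forwarders and all addresses are hypothetical.

```python
def resolve_via_forwarders(name, local_data, forwarders):
    """Answer from local data if possible, otherwise ask each forwarder in turn.

    Each forwarder is a callable that returns an address, or None when it
    does not know the answer. No other servers are ever contacted.
    """
    if name in local_data:
        return local_data[name]
    for ask_forwarder in forwarders:
        answer = ask_forwarder(name)
        if answer is not None:
            return answer
    return None  # ran out of forwarders without getting an answer


# Hypothetical data: what the server knows itself and what two forwarders know.
local = {"darkstar.yourdomain.com.": "192.168.42.2"}
first_forwarder = {"www.example.com.": "192.0.2.10"}
second_forwarder = {"ftp.example.com.": "192.0.2.20"}
forwarders = [first_forwarder.get, second_forwarder.get]

print(resolve_via_forwarders("darkstar.yourdomain.com.", local, forwarders))
print(resolve_via_forwarders("ftp.example.com.", local, forwarders))
print(resolve_via_forwarders("unknown.example.com.", local, forwarders))
```

The second query is answered only by the second forwarder, and the third falls off the end of the list and comes back empty.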

The concept of recursive requests is in contrast to iterative requests. In an iterative request, the queried server either gives an answer or tells the requesting machine where it should look next. For example, darkstar asks iguana, the primary server for support.yourdomain.com, for some information. In a recursive query, iguana asks boomer, the primary server for yourdomain.com, and passes the information back to darkstar. In an iterative query, iguana tells darkstar about boomer, and darkstar then goes and asks boomer. This process of asking name servers for information, whether recursive or iterative, is called resolution.
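The difference between the two query styles can be shown with a toy model of the iguana and boomer example. The class NameServer, its methods and the addresses are all invented for this sketch. Real servers exchange DNS messages rather than Python objects.

```python
class NameServer:
    """A toy name server with a few records and, optionally, another server to ask."""

    def __init__(self, name, records, parent=None):
        self.name = name
        self.records = records
        self.parent = parent

    def recursive_query(self, host):
        """Find the answer on the client's behalf, asking the parent if needed."""
        if host in self.records:
            return self.records[host]
        if self.parent is not None:
            return self.parent.recursive_query(host)
        return None

    def iterative_query(self, host):
        """Give an answer, or tell the client which server to ask next."""
        if host in self.records:
            return ("answer", self.records[host])
        if self.parent is not None:
            return ("referral", self.parent)
        return ("unknown", None)


def resolve_iteratively(first_server, host):
    """Client side of iterative resolution: follow referrals until answered."""
    asked = []
    server = first_server
    while server is not None:
        asked.append(server.name)
        kind, value = server.iterative_query(host)
        if kind == "answer":
            return value, asked
        server = value  # the referred server, or None when nobody knows
    return None, asked


boomer = NameServer("boomer", {"www.yourdomain.com.": "192.168.42.10"})
iguana = NameServer("iguana", {"darkstar.support.yourdomain.com.": "192.168.43.2"},
                    parent=boomer)

# Recursive: darkstar asks iguana once and iguana does the rest.
print(iguana.recursive_query("www.yourdomain.com."))
# Iterative: darkstar is referred to boomer and asks it directly.
print(resolve_iteratively(iguana, "www.yourdomain.com."))
```

Both styles produce the same address. What differs is who does the legwork: in the iterative case the list of servers asked shows that the client itself contacted iguana and then boomer.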

Keep in mind that there is client software running on the server as well. When an application needs information, the DNS client software asks the server for the information, despite the fact that the server is running on the same machine. Applications don’t access the DNS server directly.

There is also the concept of a root server. Root servers are positioned at the top, or root, of the DNS hierarchy, and maintain data about each of the top-level zones.
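Because the root sits at the top of the tree, a lookup that starts at a root server works its way down the name from right to left. The following sketch lists the domains passed on the way. The function name delegation_path is hypothetical, and not every step is necessarily a separate zone with its own server.

```python
def delegation_path(fqdn):
    """List the domains from the root down to the given name, most general first."""
    labels = [part for part in fqdn.rstrip(".").split(".") if part]
    path = ["."]  # the root domain
    # Add one label at a time, starting with the top-level domain.
    for start in range(len(labels) - 1, -1, -1):
        path.append(".".join(labels[start:]) + ".")
    return path


for domain in delegation_path("darkstar.sales.yourdomain.com."):
    print(domain)
```

For darkstar.sales.yourdomain.com. this prints the root, then com., yourdomain.com., sales.yourdomain.com. and finally the machine's own name.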