Managing Scripts

A very common reason to write shell scripts is to automate work. If you have to run the script by hand each time, that often defeats the purpose of the automation. Therefore, it is also very common to start scripts from cron.

As Murphy’s Law would have it, sometimes something will prevent the script from ending. However, cron starts a new process each time the job comes due, so you can end up with dozens, if not hundreds, of processes. Depending on the script, this could have a dramatic effect on the performance of your system. The solution is to make sure that only one instance runs at a time: either the script refuses to start if it is already running, or it stops any previous instances first.

So, the first question is how to figure out what processes are running, which is something we go into detail about in another section. In short, you can use the ps command to see what processes are running:

ps aux | grep your_process_name | grep -v grep

Note that the grep command you run here also appears in the process table. Since your process name is an argument to the grep command, grep ends up finding itself. The grep -v grep says to skip entries that contain the word “grep”, which means you do not find the command you just issued. Assuming that the script is only started from cron, the only entries found will be those started by cron. If the return code of the command is 0, you know the process is running (or at least grep found a match). A return code of 1 means nothing matched.
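The return-code behavior described above can be wrapped in a small helper. This is only a sketch: the function name listing_has is made up for illustration, and it reads the process listing from standard input so that it can be tried with canned text as well as with real ps output.

```shell
#!/bin/sh
# listing_has NAME (hypothetical helper)
# Reads a process listing on stdin. Returns 0 if some line mentions NAME
# and is not itself a grep command, and 1 if nothing matched.
listing_has() {
    # The function's return code is that of the last command in the
    # pipeline, i.e. grep -v grep: 0 if any line survived, 1 otherwise.
    grep -- "$1" | grep -v grep > /dev/null
}

# Intended use (your_process_name is a placeholder):
#   if ps aux | listing_has your_process_name; then
#       echo "already running"
#   fi
```

Because the pipeline ends in grep, the result follows the usual shell convention that 0 means success (a match) and non-zero means failure (no match).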

In your script, you check the return code and if it is 0, the script exits; otherwise it does the intended work. Keep in mind that a script checking for its own name will also find itself in the process table, so it has to ignore its own PID. Alternatively, you can make the assumption that if it is still running, there is a problem and you want to kill the process. You could use ps, grep, and awk to get the PID of that process (or even multiple processes). However, it is a lot easier to use the pidof command (with the -x option if the process is a shell script). You end up with something like this:

kill `pidof your_process_name`

The problem with that is the danger of killing a process that you hadn’t intended. Therefore, you need to be sure that you kill the correct process. This is done by storing the PID of the process in a file and then checking for the existence of that file each time your script starts. If the file does not exist, it is assumed the process is not running, so the very next thing the script does is create the PID file. This can be done using the special variable $$, which is the process ID of the current process, something like this:

echo $$ > PID_FILE
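Putting the existence check and the write together might look like the following sketch. The function name claim_pid_file and the path in the usage comment are placeholders chosen for illustration, not something the system provides.

```shell
#!/bin/sh
# claim_pid_file PID_FILE (hypothetical helper)
# Fails if PID_FILE already exists; otherwise records this shell's PID in it.
claim_pid_file() {
    if [ -e "$1" ]; then
        echo "PID file $1 exists; is another instance running?" >&2
        return 1
    fi
    echo $$ > "$1"
}

# Intended use near the top of a script (the path is a placeholder):
#   claim_pid_file /var/run/your_script.pid || exit 1
#   trap 'rm -f /var/run/your_script.pid' EXIT   # remove the file on exit
```

Note that the check and the write are two separate steps, so two instances starting at the same instant could both pass the check. For a job started from cron every few minutes, this is usually an acceptable simplification.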

This is already done by many system processes, and typically these files are stored in /var/run and have the ending .pid. For example, the file containing the PID of your HTTP server is typically /var/run/httpd.pid. You can then be sure you get the right process with a command like this:

kill `cat PID_FILE`

Where “PID_FILE” is the path to the file containing the PID.

Note that in your script, you should first check for the existence of the PID file before you try to kill the process. If the PID file exists but the process does not, the process probably died without removing the file. Depending on how long ago the process died, it is possible that the PID has been reused and now belongs to a completely different process. So as an added safety measure, you could verify that the PID belongs to the correct process.
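One way to do that verification is sketched below. It assumes a Linux system where /proc/PID/cmdline is readable, and the function names pid_is_ours and stop_previous are invented for this example. Matching on a name in the command line is only a heuristic, not a guarantee.

```shell
#!/bin/sh
# pid_is_ours PID_FILE NAME (hypothetical helper)
# Succeeds only if PID_FILE holds the PID of a live process whose command
# line mentions NAME. Reads /proc, so this sketch is Linux-specific.
pid_is_ours() {
    [ -r "$1" ] || return 1                 # no PID file
    pid=$(cat "$1")
    case $pid in
        ''|*[!0-9]*) return 1 ;;            # empty or not a number
    esac
    # kill -0 sends no signal; it only tests whether we could signal the PID.
    kill -0 "$pid" 2>/dev/null || return 1  # process gone (or not ours)
    # Arguments in cmdline are separated by NUL bytes; turn them into spaces.
    tr '\000' ' ' 2>/dev/null < "/proc/$pid/cmdline" | grep -q -- "$2"
}

# stop_previous PID_FILE NAME (hypothetical helper)
# Kills the recorded process only if it passes the check above, then
# removes the PID file either way so a stale file does not block a new run.
stop_previous() {
    if pid_is_ours "$1" "$2"; then
        kill "$(cat "$1")"
    fi
    rm -f "$1"
}

# Intended use (path and name are placeholders):
#   stop_previous /var/run/your_script.pid your_script
```

If the PID has been reused by an unrelated process, the name check fails, the process is left alone, and only the stale PID file is removed.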

To get some ideas on how existing scripts manage processes, take a look at the init scripts in /etc/rc.d.

Details on if-then constructs in scripts can be found here.
Details on using back-quotes can be found here.
Details on file redirection can be found here.