Semaphores are used to protect critical regions of code or data structures. Remember that each access of a critical piece of data such as a VFS inode describing a directory is made by kernel code running on behalf of a process. It would be very dangerous to allow one process to alter a critical data structure that is being used by another process. One way to achieve this would be to use a buzz lock around the critical piece of data that is being accessed, but this is a simplistic approach that would degrade system performance.

Instead Linux uses semaphores to allow just one process at a time to access critical regions of code and data; all other processes wishing to access this resource will be made to wait until it becomes free. The waiting processes are suspended, other processes in the system can continue to run as normal.

A Linux semaphore data structure contains the following information:

This field keeps track of the count of processes wishing to use this resource. A positive value means that the resource is available. A negative or zero value means that processes are waiting for it. An initial value of 1 means that one and only one process at a time can use this resource. When processes want this resource they decrement the count and when they have finished with this resource they increment the count,
This is the count of processes waiting for this resource which is also the number of process waiting to be awakened when this resource becomes free,
wait queue
When processes are waiting for this resource they are put onto this wait queue,
A buzz lock used when accessing the waking field.

Suppose the initial count for a semaphore is 1, the first process to come along will see that the count is positive and decrement it by 1, making it 0. The process now “owns” the critical piece of code or resource that is being protected by the semaphore. When the process leaves the critical region it increments the semphore’s count. The most optimal case is where there are no other processes contending for ownership of the critical region. Linux has implemented semaphores to work efficiently for this, the most common case.

If another process wishes to enter the critical region whilst it is owned by a process it too will decrement the count. As the count is now negative (-1) the process cannot enter the critical region. Instead it must wait until the owning process exits it. Linux makes the waiting process sleep until the owning process wakes it on exiting the critical region. The waiting process adds itself to the semaphore’s wait queue and sits in a loop checking the value of the waking field and calling the scheduler until waking is non-zero.

The owner of the critical region increments the semaphore’s count and if it is less than or equal to zero then there are processes sleeping, waiting for this resource. In the optimal case the semaphore’s count would have been returned to its initial value of 1 and no further work would be neccessary. The owning process increments the waking counter and wakes up the process sleeping on the semaphore’s wait queue. When the waiting process wakes up, the waking counter is now 1 and it knows that it may now enter the critical region. It decrements the waking counter, returning it to a value of zero, and continues. All access to the waking field of semaphore are protected by a buzz lock using the semaphore’s lock.