The Linux Page Cache

Figure: The Linux Page Cache
The role of the Linux page cache is to speed up access to files on disk.
Memory mapped files are read a page at a time and these pages are stored in the page cache.
The figure above shows that
the page cache consists of the page_hash_table, a vector
of pointers to mem_map_t data structures.
Each file in Linux is identified by a VFS inode data structure (described
previously). Each VFS inode is unique and fully describes
one and only one file.
The index into the page_hash_table is derived from the file's VFS inode and the offset into
the file.
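As a rough illustration of how such an index might be derived, here is a minimal sketch in C.
The struct inode_sketch type, the page_hash_index() function and the table and page sizes are all hypothetical names and values chosen for this example; they are not taken from the kernel source, and the mixing step is just one plausible choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT      12                       /* assumption: 4 KiB pages */
#define PAGE_HASH_BITS  10                       /* assumption: 1024 table slots */
#define PAGE_HASH_SIZE  (1u << PAGE_HASH_BITS)

/* Hypothetical stand-in for a VFS inode; only its identity matters here. */
struct inode_sketch {
    unsigned long ino;
};

/* Map (inode, byte offset into the file) to a slot in the hash table. */
static unsigned int page_hash_index(const struct inode_sketch *inode,
                                    unsigned long offset)
{
    uintptr_t i, sum;
    unsigned long o;

    assert(inode != NULL);
    i = (uintptr_t)inode / sizeof(struct inode_sketch); /* identity of the file */
    o = offset >> PAGE_SHIFT;                           /* page number within it */
    sum = i + o;
    /* Fold the high bits down, then mask to the table size. */
    return (unsigned int)((sum ^ (sum >> PAGE_HASH_BITS)) & (PAGE_HASH_SIZE - 1));
}
```

Every byte offset inside the same page of the same file maps to the same slot, which is what allows a page fault anywhere in that page to find the cached copy.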
Whenever a page is read from a memory mapped file, for example when it needs to be brought
back into memory during demand paging, the page is read through the page cache.
If the page is present in the cache, a pointer to the mem_map_t data structure
representing it is returned to the page fault handling code.
Otherwise the page must be brought into memory from the file system that holds the image.
Linux allocates a physical page and reads the page from the file on disk.
If it is possible, Linux will initiate a read of the next page in the file.
This single-page read-ahead means that if the process is accessing the pages in the file
serially, the next page will already be waiting in memory for the process.
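The lookup and read-ahead behaviour described above can be sketched as a small user-space simulation.
This is only an illustration under stated assumptions: the cache is a tiny direct-mapped array rather than a real hash table, disk reads are simulated by a counter, and every name (cached_page, get_file_page(), disk_reads and so on) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 64               /* assumption: small, power-of-two cache */

struct cached_page {
    int           in_use;
    unsigned long ino;               /* hypothetical inode number */
    unsigned long index;             /* page index within the file */
};

static struct cached_page cache[CACHE_SLOTS];
static unsigned long disk_reads;     /* counts simulated reads from disk */

static struct cached_page *slot_for(unsigned long ino, unsigned long index)
{
    return &cache[(ino + index) & (CACHE_SLOTS - 1)];
}

/* Return the cached page, or NULL if it is not in the cache. */
static struct cached_page *find_page(unsigned long ino, unsigned long index)
{
    struct cached_page *p = slot_for(ino, index);
    return (p->in_use && p->ino == ino && p->index == index) ? p : NULL;
}

/* Simulate allocating a page and filling it from the file on disk. */
static struct cached_page *read_from_disk(unsigned long ino, unsigned long index)
{
    struct cached_page *p = slot_for(ino, index);   /* replaces any collision */
    p->in_use = 1;
    p->ino = ino;
    p->index = index;
    disk_reads++;
    return p;
}

/* Read a page through the cache; on a miss, also read the next page. */
static struct cached_page *get_file_page(unsigned long ino, unsigned long index,
                                         unsigned long file_pages)
{
    struct cached_page *p;

    assert(index < file_pages);
    p = find_page(ino, index);
    if (p != NULL)
        return p;                                   /* cache hit */
    p = read_from_disk(ino, index);                 /* cache miss */
    if (index + 1 < file_pages && find_page(ino, index + 1) == NULL)
        read_from_disk(ino, index + 1);             /* single-page read-ahead */
    return p;
}
```

In this sketch, reading page 0 of a four-page file costs two simulated disk reads (the page itself plus the read-ahead), and the following request for page 1 is then served from the cache without touching the disk.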
Over time the page cache grows as images are read and executed.
Pages will be removed from the cache when they are no longer needed, for example when an image
is no longer being used by any process.
As Linux uses memory it can start to run low on physical pages.
In this case Linux will reduce the size of the page cache.
Reducing the Size of the Page and Buffer Caches
The pages held in the page and buffer caches are good candidates for being freed into the free_area
vector.
The Page Cache, which contains pages of memory mapped files, may contain unnecessary pages that are filling
up the system's memory.
Likewise the Buffer Cache, which contains buffers read from or being written to physical devices, may also contain
unneeded buffers.
When the physical pages in the system start to run out, discarding pages from these caches is relatively easy
as it requires no writing to physical devices (unlike swapping pages out of memory).
Discarding these pages does not have too many harmful side effects other than making access to physical devices and
memory mapped files slower.
However, if the discarding of pages from these caches is done fairly, all processes will suffer equally.
Every time the kernel swap daemon tries to shrink these caches
it examines a block of pages in the mem_map page vector to see if any
can be discarded from physical memory.
The block of pages examined is larger if the kernel swap daemon is intensively swapping; that is, if
the number of free pages in the system has fallen dangerously low.
The blocks of pages are examined in a cyclical manner; a different block of pages
is examined each time an attempt is made to shrink the memory map.
This is known as the clock algorithm as, rather like the minute hand of a
clock, the whole mem_map page vector is examined
a few pages at a time.
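The cyclical scan can be pictured with a short sketch in C.
The names clock_hand, scan_block() and block_size(), the 16-entry map and the particular block sizes are all assumptions made for this example; the only properties taken from the text are that each scan resumes where the previous one stopped and that the block is larger when swapping is intensive.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PAGES 16                  /* assumption: tiny mem_map for illustration */

static size_t clock_hand;            /* index where the previous scan stopped */

/* Assumed policy: examine a larger block when swapping is intensive. */
static size_t block_size(int intensive)
{
    return intensive ? NR_PAGES / 2 : NR_PAGES / 4;
}

/*
 * Visit `count` pages starting at the clock hand, wrapping at the end of
 * the map. The visited indices are stored in `visited`; the number of
 * pages visited is returned.
 */
static size_t scan_block(size_t count, size_t *visited)
{
    size_t n;

    assert(visited != NULL);
    if (count > NR_PAGES)
        count = NR_PAGES;            /* never go round more than once */
    for (n = 0; n < count; n++) {
        visited[n] = clock_hand;
        clock_hand = (clock_hand + 1) % NR_PAGES;
    }
    return n;
}
```

Starting from index 0, a normal scan in this sketch covers pages 0 to 3, an intensive scan then covers pages 4 to 11, and the next intensive scan wraps around from 12 through 15 and back to 0 through 3.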
Each page being examined is checked to see if it is cached in either
the page cache or the buffer cache.
You should note that shared pages are not considered for discarding
at this time and that a page cannot be in both caches at the
same time.
If the page is not in either cache then the next page in the
mem_map page vector is examined.
Pages are cached in the buffer cache (or rather the buffers within the pages
are cached) to make buffer allocation and deallocation more efficient.
The memory map shrinking code tries to free the buffers that are
contained within the page being examined.
If all the buffers are freed, then the page that contains them is
also freed.
If the examined page is in the Linux page cache, it is removed from
the page cache and freed.
When enough pages have been freed on this attempt, the kernel
swap daemon will wait until the next time it is periodically awakened.
As none of the freed pages were part of any process's virtual memory (they
were cached pages), no page tables need updating.
If not enough cached pages were discarded, then the swap daemon
will try to swap out some shared pages.
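The per-page decisions described in this section can be summarised in a simplified sketch.
The page_sketch structure, its fields and the two functions are hypothetical stand-ins for illustration, not the kernel's own data structures; in particular, buffers that cannot be freed are modelled by a simple busy_buffers counter.

```c
#include <assert.h>
#include <stddef.h>

enum cache_kind { NOT_CACHED, IN_PAGE_CACHE, IN_BUFFER_CACHE };

/* Hypothetical, simplified stand-in for a mem_map_t entry. */
struct page_sketch {
    enum cache_kind kind;
    int count;                       /* users of the page; more than 1 means shared */
    int busy_buffers;                /* buffers that cannot be freed right now */
    int freed;                       /* set once the page has been released */
};

/* Try to discard one page; returns 1 if the page was freed, 0 otherwise. */
static int try_to_discard(struct page_sketch *p)
{
    if (p->freed || p->count > 1)    /* shared pages are not considered here */
        return 0;
    switch (p->kind) {
    case IN_BUFFER_CACHE:
        if (p->busy_buffers > 0)     /* not all buffers could be freed */
            return 0;
        break;
    case IN_PAGE_CACHE:
        break;                       /* removed from the page cache below */
    default:
        return 0;                    /* in neither cache: examine the next page */
    }
    p->kind = NOT_CACHED;
    p->freed = 1;
    return 1;
}

/* Examine up to `n` pages, stopping once `wanted` pages have been freed. */
static size_t shrink_caches(struct page_sketch *pages, size_t n, size_t wanted)
{
    size_t freed = 0, i;

    assert(pages != NULL || n == 0);
    for (i = 0; i < n && freed < wanted; i++)
        freed += (size_t)try_to_discard(&pages[i]);
    return freed;
}
```

A caller would compare the return value with the number of pages it wanted; a shortfall corresponds to the case where the swap daemon goes on to try swapping out shared pages.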