Библиотека сайта rus-linux.net
The book is available and called simply "Understanding The Linux Virtual Memory Manager". There is a lot of additional material in the book that is not available here, including details on later 2.4 kernels, introductions to 2.6, a whole new chapter on the shared memory filesystem, coverage of TLB management, a lot more code commentary, countless other additions and clarifications and a CD with lots of cool stuff on it. This material (although now dated and lacking in comparison to the book) will remain available although I obviously encourge you to buy the book from your favourite book store :-) . As the book is under the Bruce Perens Open Book Series, it will be available 90 days after appearing on the book shelves which means it is not available right now. When it is available, it will be downloadable from http://www.phptr.com/perens so check there for more information.
To be fully clear, this webpage is not the actual book.
Next: 12. Swap Management Up: 11. Page Frame Reclamation Previous: 11.5 Swapping Out Process   Contents   Index
11.6 Page Replacement Policy
During discussions the page replacement policy is frequently said to
be a Least Recently Used (LRU)-based algorithm but this is
not strictly speaking true as the lists are not strictly maintained in
LRU order. The objective is for the active_list
to contain
the working set [#!denning70!#] of all processes and the
inactive_list
. As all reclaimable pages are contained in just
two lists and all pages may be selected to reclaimed rather than just the
faulting process, the replacement policy is a global one.
The lists resemble a simplified LRU 2Q [#!johnson94low!#] where two lists
called Am
and A1
are maintained. With LRU 2Q, pages
when first allocated are placed on a FIFO queue called A1
. If
they are referenced while on that queue, they are placed in a normal
LRU managed list called Am
. This is roughly analogous to
using lru_cache_add()
to place pages on a queue called
inactive_list
(A1) and using mark_page_accessed()
to get moved to the active_list
(Am). The algorithm describes
how the size of the two lists have to be tuned but Linux takes a simpler
approach by using refill_inactive()
to move pages from the
bottom of active_list
to inactive_list
to keep
active_list
about two thirds the size of the total page cache.
Figure 11.6 illustrates how the two lists are
structured.
The lists described for 2Q presumes Am is an LRU list but the list in Linux
closer resembles a Clock algorithm [#!carr84!#] where the hand-spread is
the size of the active list. When pages reach the bottom of the list, the
referenced flag is checked, if it is set, it is moved back to the top of
the list and the next page checked. If it is cleared, it is moved to the
inactive_list
.
The Move-To-Front heuristic means that the lists behave in an LRU-like manner but there are too many differences between the Linux replacement policy and LRU to consider it a stack algorithm [#!maekawa87!#]. Even if we ignore the problem of analysing multi-programmed systems [#!coffman80!#] and the fact the memory size for each process is not fixed , the policy does not satisfy the inclusion property as the location of pages in the lists depend heavily upon the size of the lists as opposed to the time of last reference. Neither is the list priority ordered as that would require list updates with every reference. As a final nail in the stack algorithm coffin, the lists are almost ignored when paging out from processes as pageout decisions are related to their location in the virtual address space of the process rather than the location within the page lists.
In summary, the algorithm does exhibit LRU-like behavior and it has been shown by benchmarks to perform well in practice. There are only two cases where the algorithm is likely to behave really badly. The first is if the candidates for reclamation are principally anonymous pages. In this case, Linux will keep examining a large number of pages before linearly scanning process page tables searching for pages to reclaim but this situation is fortunately rare.
The second situation is where there is a single process with many file
backed resident pages in the inactive_list
that are being
written to frequently. Processes and kswapd may go into a loop
of constantly ``laundering'' these pages and placing them at the top of
the inactive_list
without freeing anything. In this case, few
pages are moved from the active_list
to inactive_list
as the ratio between the two lists sizes remains not change significantly.
Next: 12. Swap Management Up: 11. Page Frame Reclamation Previous: 11.5 Swapping Out Process   Contents   Index Mel 2004-02-15