Библиотека сайта rus-linux.net
The book is available and called simply "Understanding The Linux Virtual Memory Manager". There is a lot of additional material in the book that is not available here, including details on later 2.4 kernels, introductions to 2.6, a whole new chapter on the shared memory filesystem, coverage of TLB management, a lot more code commentary, countless other additions and clarifications and a CD with lots of cool stuff on it. This material (although now dated and lacking in comparison to the book) will remain available although I obviously encourge you to buy the book from your favourite book store :-) . As the book is under the Bruce Perens Open Book Series, it will be available 90 days after appearing on the book shelves which means it is not available right now. When it is available, it will be downloadable from http://www.phptr.com/perens so check there for more information.
To be fully clear, this webpage is not the actual book.
Next: 7.3 Free Pages Up: 7. Physical Page Allocation Previous: 7.1 Managing Free Blocks   Contents   Index
Linux provides a quite sizable API for the allocation of page frames.
All of them take a
gfp_mask as a parameter which is a set of
flags that determine how the allocator will behave. The flags are discussed in
The allocation API functions all use the core function
__alloc_pages() but the APIs exist so that the correct node and
zone will be chosen. Different users will require different zones such as
ZONE_ DMA for certain device drivers or
ZONE_ NORMAL for disk
buffers and callers should not have to be aware of what node is being
used. A full list of page allocation APIs are listed in Table 7.1.
Allocations are always for a specified order, 0 in the case where a single page is required. If a free block cannot be found of the requested order, a higher order block is split into two buddies. One is allocated and the other is placed on the free list for the lower order. Figure 7.2 shows where a block is split and how the buddies are added to the free lists until a block for the process is available.
When the block is later freed, the buddy will be checked. If both are free, they are merged to form a higher order block and placed on the higher free list where its buddy is checked and so on. If the buddy is not free, the freed block is added to the free list at the current order. During these list manipulations, interrupts have to be disabled to prevent an interrupt handler manipulating the lists while a process has them in an inconsistent state. This is achieved by using an interrupt safe spinlock.
The second decision to make is which memory node or
to use. Linux uses a node-local allocation policy which aims to
use the memory bank associated with the CPU running the page allocating
process. Here, the function
_alloc_pages() is what is important
as this function is different depending on whether the kernel is built
for a UMA (function in
mm/page_alloc.c) or NUMA (function in
Regardless of which API is used,
mm/page_alloc.c is the heart of the allocator. This function,
which is never called directly, examines the selected zone and checks if
it is suitable to allocate from based on the number of available pages. If
the zone is not suitable, the allocator may fall back to other zones. The
order of zones to fall back on are decided at boot time by the function
build_zonelists() but generally
ZONE_ HIGHMEM will fall back
ZONE_ NORMAL and that in turn will fall back to
ZONE_ DMA. If
number of free pages reaches the
pages_low watermark, it will
wake kswapd to begin freeing up pages from zones and if memory is
extremely tight, the caller will do the work of kswapd itself.
Once the zone has finally been decided on, the function
is called to allocate the block of pages or split higher level blocks if
one of the appropriate size is not available.
Next: 7.3 Free Pages Up: 7. Physical Page Allocation Previous: 7.1 Managing Free Blocks   Contents   Index Mel 2004-02-15