Библиотека сайта rus-linux.net
The book is available and called simply "Understanding The Linux Virtual Memory Manager". There is a lot of additional material in the book that is not available here, including details on later 2.4 kernels, introductions to 2.6, a whole new chapter on the shared memory filesystem, coverage of TLB management, a lot more code commentary, countless other additions and clarifications and a CD with lots of cool stuff on it. This material (although now dated and lacking in comparison to the book) will remain available although I obviously encourge you to buy the book from your favourite book store :-) . As the book is under the Bruce Perens Open Book Series, it will be available 90 days after appearing on the book shelves which means it is not available right now. When it is available, it will be downloadable from http://www.phptr.com/perens so check there for more information.
To be fully clear, this webpage is not the actual book.
Next: 9.1 Caches Up: understand-html Previous: 8.3 Freeing A Non-Contiguous   Contents   Index
In this chapter, the general purpose allocator is described. It is a slab allocator which is very similar in many respects to the general kernel allocator used in Solaris [#!mauro01!#]. It is heavily based on the first slab allocator paper by Bonwick [#!bonwick94!#] with many improvements that bear a close resemblance to those described in his later paper [#!bonwick01!#]. We will begin with a quick overview of the allocator followed by a description of the different structures used before giving an in-depth tour of each task the allocator is responsible for.
The basic idea behind the slab allocator is to have caches of commonly used objects kept in an initialised state available for use by the kernel. Without an object based allocator, the kernel will spend much of its time allocating, initialising and freeing the same object. The slab allocator aims to to cache the freed object so that the basic structure is preserved between uses [#!bonwick94!#].
The slab allocator consists of a variable number of caches that are linked
together on a doubly linked circular list called a cache
chain. A cache, in the context of the slab allocator, is a manager for a
number of objects of a particular type like the
fs_cache cache and is managed by a
kmem_cache_s discussed in detail later. The caches are linked via the
next field in the cache struct.
Each cache maintains blocks of contiguous pages in memory called slabs which are carved up into small chunks for the data structures and objects the cache manages. The relationship between these different structures is illustrated in Figure 9.1.
The slab allocator has three principle aims:
- The allocation of small blocks of memory to help eliminate internal
fragmentation that would be otherwise caused by the buddy system;
- The caching of commonly used objects so that the system does not waste
time allocating, initialising and destroying objects. Benchmarks on Solaris
showed excellent speed improvements for allocations with the slab allocator in
- The better utilisation of hardware cache by aligning objects to the L1 or
To help eliminate internal fragmentation normally caused by a binary buddy
allocator, two sets of caches of small memory buffers ranging from 2 (32)
bytes to 2 (131072) bytes are maintained. One cache set is suitable
for use with DMA devices. These caches are called size-N and size-N(DMA) where
is the size of the allocation, and a function
(see Section 9.4.1) is provided for allocating them. With this,
the single greatest problem with the low level page allocator is addressed.
The sizes caches are discussed in further detail in Section 9.4.
The second task of the slab allocator is to maintain caches of commonly used objects. For many structures used in the kernel, the time needed to initialise an object is comparable to, or exceeds, the cost of allocating space for it. When a new slab is created, a number of objects are packed into it and initialised using a constructor if available. When an object is freed, it is left in its initialised state so that object allocation will be quick.
The final task of the slab allocator is hardware cache utilization. If there is space left over after objects are packed into a slab, the remaining space is used to color the slab. By giving objects in different slabs different offsets, they will be assigned to different CPU cache lines helping ensure that objects from the same cache will be unlikely to flush each other. With this, space that would otherwise be wasted fulfills a new function. Linux does not attempt to color pages [#!kessler91!#], or order where objects are placed such as those described for data [#!gonzalez95!#] or code segments [#!hashemi97!#] but the scheme used does help improve cache line usage. Cache colouring is further discussed in Section 9.1.5. On an SMP system, a further step is taken to help cache utilization where each cache has a small array of objects reserved for each CPU. This is discussed further in Section 9.5.
The slab allocator provides the additional option of slab debugging if the
option is set at compile time with
CONFIG_SLAB_DEBUG. Two debugging
features are providing called red zoning and object
poisoning. With red zoning, a marker is placed at either end of the
object. If this mark is disturbed, the allocator knows the object where a
buffer overflow occured and reports it. Poisoning an object will fill it with
a predefined bit pattern(defined
at slab creation and after a free. At allocation, this pattern is examined
and if it is changed, the allocator knows that the object was used before
it was allocated and flags it.
The small, but powerful, API which the allocator exports is listed in Table 9.5.
- 9.1 Caches
- 9.1.1 Cache Descriptor
- 9.1.2 Cache Static Flags
- 9.1.3 Cache Dynamic Flags
- 9.1.4 Cache Allocation Flags
- 9.1.5 Cache Colouring
- 9.1.6 Cache Creation
- 9.1.7 Cache Reaping
- 9.1.8 Cache Shrinking
- 9.1.9 Cache Destroying
- 9.2 Slabs
- 9.2.1 Storing the Slab Descriptor
- 9.2.2 Slab Creation
- 9.2.3 Tracking Free Objects
- 9.2.4 Initialising the
- 9.2.5 Finding the Next Free Object
- 9.2.6 Updating kmem_bufctl_t
- 9.2.7 Calculating the Number of Objects on a Slab
- 9.2.8 Slab Destroying
- 9.3 Objects
- 9.4 Sizes Cache
- 9.5 Per-CPU Object Cache
- 9.5.1 Describing the Per-CPU Object Cache
- 9.5.2 Adding/Removing Objects from the Per-CPU Cache
- 9.5.3 Enabling Per-CPU Caches
- 9.5.4 Updating Per-CPU Information
- 9.5.5 Draining a Per-CPU Cache
- 9.6 Slab Allocator Initialisation
- 9.7 Interfacing with the Buddy Allocator
Next: 9.1 Caches Up: understand-html Previous: 8.3 Freeing A Non-Contiguous   Contents   Index Mel 2004-02-15