The book is available and called simply "Understanding The Linux Virtual Memory Manager". There is a lot of additional material in the book that is not available here, including details on later 2.4 kernels, introductions to 2.6, a whole new chapter on the shared memory filesystem, coverage of TLB management, a lot more code commentary, countless other additions and clarifications and a CD with lots of cool stuff on it. This material (although now dated and lacking in comparison to the book) will remain available, although I obviously encourage you to buy the book from your favourite book store :-). As the book is under the Bruce Perens Open Book Series, it will be available 90 days after appearing on the book shelves, which means it is not available right now. When it is available, it will be downloadable from http://www.phptr.com/perens so check there for more information.
To be fully clear, this webpage is not the actual book.
4.2 Describing a Page Table Entry
As mentioned, each entry is described by the structures pte_t, pmd_t and
pgd_t for PTEs, PMDs and PGDs respectively. Even though these are often
just unsigned integers, they are defined as structures for two reasons.
The first is type protection, so that they will not be used inappropriately.
The second is to support features like PAE on the x86, where an additional
4 bits are used for addressing more than 4GiB of memory. To store the
protection bits, pgprot_t is defined; it holds the relevant flags, which
are usually stored in the lower bits of a page table entry.
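The two reasons can be illustrated with a small userspace sketch. It is only loosely modelled on the x86 definitions and is not the kernel source: the names pte32_t, pte_pae_t, pte32_val() and pte_pae_val() are hypothetical, chosen so both layouts can sit side by side, whereas the kernel picks a single definition of pte_t at compile time.

```c
#include <stdint.h>

/* Non-PAE layout: a single 32-bit word wrapped in a struct. The wrapper
 * means the compiler rejects code that passes a plain integer where a
 * page table entry is expected. (Hypothetical type name.) */
typedef struct { uint32_t pte_low; } pte32_t;

/* PAE-style layout: two 32-bit words, leaving room for the extra
 * physical address bits. (Hypothetical type name.) */
typedef struct { uint32_t pte_low, pte_high; } pte_pae_t;

/* Return the raw value of a non-PAE entry. */
static inline uint32_t pte32_val(pte32_t pte)
{
    return pte.pte_low;
}

/* Return the raw value of a PAE-style entry as one 64-bit quantity. */
static inline uint64_t pte_pae_val(pte_pae_t pte)
{
    return (uint64_t)pte.pte_low | ((uint64_t)pte.pte_high << 32);
}
```

Because callers only go through the accessor, the width of the entry can change without touching the code that uses it.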
For type casting, 4 macros are provided in asm/page.h, which take the above
types and return the relevant part of the structures. They are pte_val(),
pmd_val(), pgd_val() and pgprot_val(). To reverse the type casting, 4 more
macros are provided: __pte(), __pmd(), __pgd() and __pgprot().
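A minimal sketch of what these eight macros might look like for the non-PAE x86 case follows. It is a simplified userspace model rather than a copy of asm/page.h: the field names and the use of uint32_t are assumptions made so that the example compiles outside the kernel.

```c
#include <stdint.h>

/* Each type is a one-member struct around a 32-bit value.
 * Field names are assumptions for this sketch. */
typedef struct { uint32_t pte_low; } pte_t;
typedef struct { uint32_t pmd; }     pmd_t;
typedef struct { uint32_t pgd; }     pgd_t;
typedef struct { uint32_t pgprot; }  pgprot_t;

/* Struct -> raw integer. */
#define pte_val(x)     ((x).pte_low)
#define pmd_val(x)     ((x).pmd)
#define pgd_val(x)     ((x).pgd)
#define pgprot_val(x)  ((x).pgprot)

/* Raw integer -> struct, using a C99 compound literal.
 * Double-underscore names are reserved in ordinary C programs; they
 * are used here only to mirror the macro names in the text. */
#define __pte(x)     ((pte_t)    { (x) })
#define __pmd(x)     ((pmd_t)    { (x) })
#define __pgd(x)     ((pgd_t)    { (x) })
#define __pgprot(x)  ((pgprot_t) { (x) })
```

With these definitions, pte_val(__pte(v)) evaluates to v again, which is all the "type casting" amounts to in the single-word case.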
Where exactly the protection bits are stored is architecture dependent.
For illustration purposes, we will examine the case of an x86 architecture
without PAE enabled, but the same principles apply across architectures. On
an x86 with no PAE, the pte_t is simply a 32 bit integer within a struct.
Each pte_t points to the address of a page frame and all the addresses
pointed to are guaranteed to be page aligned. Therefore, there are
PAGE_SHIFT (12) bits in that 32 bit value that are free for status bits of
the page table entry. A number of the protection and status bits are listed
in Table 4.1 but what bits exist and what they mean varies between
architectures.
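The arithmetic behind the 12 free bits can be sketched as follows. With PAGE_SHIFT equal to 12, a page is 4096 bytes, so the low 12 bits of any page-aligned address are zero and can carry status bits instead. The helper names pte_frame_addr() and pte_status_bits() are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint32_t)1 << PAGE_SHIFT)   /* 4096 bytes   */
#define PAGE_MASK  (~(PAGE_SIZE - 1))            /* 0xFFFFF000   */

/* Upper 20 bits: the page-aligned address of the page frame.
 * (Hypothetical helper.) */
static inline uint32_t pte_frame_addr(uint32_t pte)
{
    return pte & PAGE_MASK;
}

/* Lower 12 bits: the protection and status bits.
 * (Hypothetical helper.) */
static inline uint32_t pte_status_bits(uint32_t pte)
{
    return pte & ~PAGE_MASK;
}
```

For example, a raw entry of 0x12345067 splits into the frame address 0x12345000 and the status bits 0x067.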
These bits are self-explanatory except for _PAGE_PROTNONE, which we will
discuss further. On the x86 with Pentium III and higher, the bit it occupies
is called the Page Attribute Table (PAT) bit and is used to indicate the size
of the page the PTE is referencing. In a PGD entry, this same bit is the PSE
bit, so obviously these bits are meant to be used in conjunction.
As Linux does not use the PSE bit, the PAT bit is free in the PTE for other
purposes. There is a requirement for having a page resident in memory but
inaccessible to the userspace process, such as when a region is protected
with mprotect() with the PROT_NONE flag. When the region is to be protected,
the _PAGE_PRESENT bit is cleared and the _PAGE_PROTNONE bit is set. The
macro pte_present() checks if either of these bits is set, so the kernel
itself knows the PTE is present, just inaccessible to userspace, which is a
subtle but important point. As the hardware bit _PAGE_PRESENT is clear, a
page fault will occur if the page is accessed, so Linux can enforce the
protection while still knowing the page is resident if it needs to swap it
out or the process exits.
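The interplay between the two bits can be sketched in userspace as follows. The bit values 0x001 and 0x080 are assumptions for illustration (check the pgtable.h of the kernel you are reading), and make_prot_none() and hw_present() are hypothetical helpers rather than kernel functions.

```c
#include <stdint.h>

/* Assumed bit positions for this sketch. */
#define _PAGE_PRESENT   0x001u  /* hardware: entry is valid            */
#define _PAGE_PROTNONE  0x080u  /* software: resident but inaccessible */

typedef struct { uint32_t pte_low; } pte_t;

#define pte_val(x)      ((x).pte_low)

/* The kernel's view: present if EITHER bit is set. */
#define pte_present(x)  (pte_val(x) & (_PAGE_PRESENT | _PAGE_PROTNONE))

/* The hardware's view: only _PAGE_PRESENT counts. (Hypothetical helper.) */
static inline int hw_present(pte_t pte)
{
    return (pte_val(pte) & _PAGE_PRESENT) != 0;
}

/* Model protecting a page with PROT_NONE: clear the hardware present
 * bit and set the software marker. (Hypothetical helper.) */
static inline pte_t make_prot_none(pte_t pte)
{
    pte.pte_low &= ~_PAGE_PRESENT;
    pte.pte_low |= _PAGE_PROTNONE;
    return pte;
}
```

After make_prot_none(), hw_present() is false, so an access would fault, while pte_present() is still non-zero, so the kernel can tell the page is resident.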
Mel 2004-02-15