However, to make matters even worse, memory management is typically one of
the areas with absolutely no hardware standards, and different CPUs
use very different means of mapping virtual addresses onto physical memory pages.
As such, memory management is one area where traditionally most of the code
has been very architecture-dependent, and only very little high-level code has been
shared across architectures, even though we would like to share a lot more.
Figure 8: Two-level page table tree (the virtual address is split into an L1 index, an L2 index, and an offset in the page; the L1 index selects a page table from the page directory, and the L2 index selects the physical page within that page table)
While the generic virtual to physical memory mapping can be seen as any function that maps a virtual address into a physical address, extreme performance requirements mean that the function has to be reasonably simple. In fact, on a low level all current CPUs use an on-chip virtual memory mapping cache usually called a TLB (Translation Lookaside Buffer), and in the end all virtual memory mapping schemes translate into filling this cache with the appropriate translation information.
While any mapping strategy is possible, current CPUs tend to handle the virtual
memory translations in three different ways: with page table trees, with hash tables,
or with a pure software-fill TLB. But even when the basic approach is similar in
two architectures, the low-level details are often very different.
Depending on the memory management unit, the page tables may contain extra
information aside from the necessary protection and translation information. Some
architectures support special dirty and accessed bits, to be used by the VM routines
to determine whether a user has written to a page or not, or whether the page has
been recently accessed. Other architectures expect the operating system to keep
track of this information by hand.
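On architectures without hardware dirty bits, the operating system can emulate them by initially write-protecting each page in hardware and recording the first write when it faults. The following is a minimal sketch of that idea; the bit names and the helper function are hypothetical, not the actual Linux definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page table entry bits -- not the real Linux ones. */
#define SW_WRITABLE 0x1u  /* the VM layer allows writes to the page  */
#define SW_DIRTY    0x2u  /* software-maintained dirty bit           */
#define HW_WRITE    0x4u  /* the hardware MMU allows writes          */

/* Called from the write-protection fault handler: if the VM layer
 * allows the write, record the page as dirty and open it up for
 * writing in hardware so that further writes do not fault again.
 * Returns 0 when the fault was only bookkeeping, -1 on a genuine
 * protection violation. */
static int handle_write_fault(uint32_t *pte)
{
    if (!(*pte & SW_WRITABLE))
        return -1;
    *pte |= SW_DIRTY | HW_WRITE;
    return 0;
}
```

The accessed bit can be emulated the same way, by mapping the page invalid in hardware and marking it referenced on the first fault.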
For example, the original Linux platform, the Intel 80x86, has a two-level page
table tree (see Figure 8), and implements both dirty and accessed bits in hardware.
In contrast, while the Digital Alpha from a system software viewpoint also has a
normal page table tree, on the Alpha the depth of the tree is three due to the larger
virtual address space, and the Alpha lacks hardware support for dirty and accessed
bits.
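The two-level walk of Figure 8 can be sketched in C. The 10/10/12-bit split below matches the i386 page size and table sizes, but the type names and the demo mapping are purely illustrative, not the actual kernel definitions:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PAGE_SIZE    (1u << PAGE_SHIFT)   /* 4 kB pages          */
#define PTRS_PER_TBL 1024u                /* 10 index bits/level */
#define PRESENT      0x1u                 /* translation valid   */

typedef uint32_t pte_t;   /* page table entry                      */
typedef pte_t  *pde_t;    /* directory entry: points to a table    */

/* Walk the two-level tree; return the physical address,
 * or -1 if the virtual address is not mapped. */
static int64_t translate(pde_t *dir, uint32_t vaddr)
{
    uint32_t l1  = vaddr >> 22;                  /* top 10 bits  */
    uint32_t l2  = (vaddr >> PAGE_SHIFT) & 1023; /* next 10 bits */
    uint32_t off = vaddr & (PAGE_SIZE - 1);      /* low 12 bits  */

    if (!dir[l1])
        return -1;                               /* no page table */
    pte_t pte = dir[l1][l2];
    if (!(pte & PRESENT))
        return -1;                               /* page fault    */
    return (int64_t)((pte & ~(PAGE_SIZE - 1)) | off);
}

/* Demo setup: map virtual page 0x00401000 to physical page 0x9000. */
static pte_t demo_table[PTRS_PER_TBL] = { [1] = 0x9000u | PRESENT };
static pde_t demo_dir[PTRS_PER_TBL]   = { [1] = demo_table };
```

On the real hardware the same walk is of course done by the MMU, which also sets the dirty and accessed bits in the entry as a side effect.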
Figure 9: Hashed in-memory TLB extension (a hash function applied to the page index of the virtual address selects an entry in the hash table, which yields the physical page)
On the other hand, the PowerPC and some of the Sparc CPUs do not have a real page table at all: they have a hash table that is used to look up the physical page that corresponds to a virtual address (Figure 9). If the page cannot be found in the hash table, they trap to software. [Int93b]
Because the hashed memory mappings generally cannot fully describe the virtual memory setup, they are more appropriately called in-memory extensions to the on-chip TLB, so as not to confuse them with full page tables.
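The lookup can be sketched as follows. The hash function, table size, and entry layout here are made up for illustration and bear no resemblance to the actual PowerPC or Sparc formats:

```c
#include <assert.h>
#include <stdint.h>

#define HASH_BUCKETS 256u

/* Hypothetical hashed page table entry -- the real hardware
 * formats are quite different. */
struct hpte {
    uint32_t vpage;   /* virtual page number  */
    uint32_t ppage;   /* physical page number */
    int      valid;
};

static struct hpte hash_table[HASH_BUCKETS];

static uint32_t hash(uint32_t vpage)
{
    return (vpage ^ (vpage >> 8)) % HASH_BUCKETS;
}

/* Look up a virtual page number; return the physical page number,
 * or -1 to indicate a miss, in which case the CPU traps and software
 * either refills the entry or starts page fault handling. */
static int64_t hash_lookup(uint32_t vpage)
{
    struct hpte *e = &hash_table[hash(vpage)];
    if (e->valid && e->vpage == vpage)
        return e->ppage;
    return -1;
}

/* Software refill: install a translation into the hash table. */
static void hash_insert(uint32_t vpage, uint32_t ppage)
{
    struct hpte *e = &hash_table[hash(vpage)];
    e->vpage = vpage;
    e->ppage = ppage;
    e->valid = 1;
}
```

Note that an entry can be evicted by a colliding insert at any time, which is exactly why such a hash table cannot fully describe the virtual memory setup.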
Finally, MIPS CPUs and the newest UltraSparcs from Sun do not have any architecture-specified page tables at all: they only have the on-chip TLB, and any miss in the TLB results in a software trap to refill the TLB⁸ (or to handle a page fault if no physical page is available for the offending virtual address). [Int93a]
4.2.1 Memory management through virtual page tables
As seen above, no standard way of mapping virtual memory exists. Indeed, the Sparc line of CPUs has used all three of the mapping strategies described above in different versions of the architecture. And yet, despite these fundamental differences in physical hardware, we would like to use as much common code as possible.
The way this is accomplished is by having a common virtual mapping scheme in the Linux kernel virtual machine (see chapter 1), and mapping that common memory management scheme onto the physical hardware. This allows us to share all memory management code across all supported architectures, and any improvements to the memory management are automatically supported on all platforms. The only thing that the architecture-specific code needs to know about is the mapping from the virtual machine onto the physical hardware.

⁸ This is also the case with the Digital Alpha architecture, but the Alpha architecture also specifies a low-level software layer called PAL-code that makes it appear as if the hardware had a three-level page table [Dig92, pp. 3–2].
Figure 10: Three-level page table tree (the virtual address is split into an L1 index into the page directory, an L2 index into the mid page table, an L3 index into the page table, and an offset in the physical page)
While the principle of the virtual machine approach is simple to grasp, the details are not as obvious. What mapping scheme should be used in order to make the translation to the hardware as efficient as possible, yet be generic enough that the scheme is useful as a superset of any realistic real hardware? If the virtual machine is too limited, it cannot take advantage of large address spaces or special hardware features.
There are also secondary concerns: the virtual machine memory mappings must be memory-efficient, so that the mapping information does not take up a lot of physical memory that could be used to better advantage for file system caching or for running user programs. Remember that not only does the kernel have to keep the page tables of the virtual machine in memory, the page tables of the physical machine also take up space.
With all the requirements placed on the page tables of the virtual machine, the
choice in the end is not difficult. It turns out that a multi-level page table tree is
an approach that can easily be expanded to match large virtual memory spaces by
just adding levels. It is flexible, simple, and reasonably efficient.
Not only is a multi-level page table a good generic answer to the page table
problems, it can also often easily be made to map closely to the actual hardware,
so that the mapping of the page tables from the virtual machine to the physical
machine is easy to do. In fact, by choosing the right virtual machine page table
setup, the same page tables can be used by both the kernel virtual machine and
the physical memory management unit. In those cases the mapping overhead is
obviously zero.
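Such a three-level walk can be sketched in C, in the spirit of (but not identical to) the kernel's page directory, mid-level, and page table accessors. The 10/10/13-bit split below is chosen to resemble the Alpha's 8 kB pages; all names and the demo mapping are illustrative assumptions. On two-level hardware like the i386, the mid level can be folded down to a single entry so that the very same tables can be handed to the MMU:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 13u                      /* 8 kB pages, Alpha-like */
#define PT_BITS    10u                      /* 1024 entries per level */
#define PMD_SHIFT  (PAGE_SHIFT + PT_BITS)   /* 23 */
#define PGD_SHIFT  (PMD_SHIFT + PT_BITS)    /* 33 */
#define IDX_MASK   ((1u << PT_BITS) - 1)
#define PRESENT    0x1u

typedef uint64_t pte_t;   /* bottom-level page table entry */
typedef pte_t  *ptab_t;   /* pointer to a page table       */

/* Walk directory -> mid-level table -> page table; return the
 * physical address, or -1 if the address is not mapped. */
static int64_t translate3(ptab_t **pgd, uint64_t vaddr)
{
    ptab_t *pmd = pgd[(vaddr >> PGD_SHIFT) & IDX_MASK];
    if (!pmd)
        return -1;
    ptab_t pt = pmd[(vaddr >> PMD_SHIFT) & IDX_MASK];
    if (!pt)
        return -1;
    pte_t e = pt[(vaddr >> PAGE_SHIFT) & IDX_MASK];
    if (!(e & PRESENT))
        return -1;
    uint64_t off = vaddr & ((1u << PAGE_SHIFT) - 1);
    return (int64_t)((e & ~(uint64_t)((1u << PAGE_SHIFT) - 1)) | off);
}

/* Demo setup: one mapping at pgd index 2, pmd index 3, pte index 4,
 * pointing at physical page 0x4000. */
static pte_t  demo_pt[1 << PT_BITS]   = { [4] = 0x4000u | PRESENT };
static ptab_t demo_pmd[1 << PT_BITS]  = { [3] = demo_pt };
static ptab_t *demo_pgd[1 << PT_BITS] = { [2] = demo_pmd };
```

When the mid level is folded to a single entry, the extra indirection can be optimized away at compile time, so the generic three-level code costs nothing on two-level hardware.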