Virtual is a word we hear a lot these days: there is virtual reality, virtual currency and virtual machines. But before all of those there was virtual memory. It is a technology that you find in desktops, laptops, tablets and smartphones. You find it in Windows, OS X, Linux, iOS and Android. But what is it and why is it important? Let me explain.
A computer executes a program by following the machine instructions held in RAM. It will execute the instruction in a location (known as an address) and then move to the next location (by adding one to the address). It can also jump from one address to another, which is how computers perform loops (among other things). In the old days of 8-bit computing, or even today on microcontrollers, the whole of physical RAM is used directly and there is no preemptive multitasking, the feature that allows multiple programs to run at once. Each address is unique and references one physical place in RAM.
A new approach to memory, called virtual memory, was developed in the 1950s and 1960s. Virtual memory allows each process to have its own address space, and addresses can be reused in every process, so they are no longer unique. Address 4095 means one thing to one process but something completely different to another. However, these addresses need to be somewhere in physical RAM, so virtual memory maps virtual addresses to physical addresses, but more about that in a moment.
By the 1970s virtual memory was being used in IBM mainframes, in DEC's VAX minicomputers (running VMS, which stood for Virtual Memory System) and in a whole bunch of different UNIX implementations. Eventually, with the arrival of the Intel 80386, it made its way into Windows and, more importantly, the 80386 was the chip on which Linus Torvalds started Linux.
The simplest form of virtual memory uses a one-to-one mapping. Let’s imagine we have two programs and each one uses 5MB of RAM. Let’s also say that program one is held in physical RAM at address 5242880 and goes on for 5MB. Program two starts in physical RAM at address 10485760 and also goes on for 5MB. I will assume that one instruction takes one byte, just because it is easiest for this illustration. In a virtual memory system both processes have addresses starting at 0, the second address is 1, the next 2 and so on. When process one wants to access address 0 the computer maps address 0 over to 5242880, address 1 to 5242881 and so on. When process two wants to access address 0 the computer maps address 0 to 10485760, address 1 to 10485761 and so on.
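The mapping for the two 5MB programs can be sketched in a few lines of Python. This is only an illustration: the names `PROCESS_BASE` and `translate_simple` are made up for this example, and a real MMU does this work in hardware.

```python
MB = 1024 * 1024

# Hypothetical physical start address of each process (values from the example).
PROCESS_BASE = {"one": 5 * MB, "two": 10 * MB}
PROCESS_SIZE = 5 * MB


def translate_simple(process, virtual_address):
    """Map a virtual address to a physical one by adding the process base."""
    if not 0 <= virtual_address < PROCESS_SIZE:
        raise ValueError("address outside the process's address space")
    return PROCESS_BASE[process] + virtual_address


print(translate_simple("one", 0))  # 5242880
print(translate_simple("two", 1))  # 10485761
```

Both processes use virtual address 0, yet each lands in its own part of physical RAM.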
This mapping is done in hardware (with a lot of help from software) in a special component called the Memory Management Unit (MMU). The kernel, in Android’s case that means Linux, tells the MMU what mappings to use. Then when the CPU tries to access a virtual address the MMU automatically maps it to a real physical address.
The advantages of virtual memory are that:
- An app doesn’t care where it is in physical RAM.
- An app only has access to its own address space and can’t interfere with other apps.
- An app doesn’t need to be stored in contiguous blocks of memory.
This means that the OS can place the app anywhere it wants in memory and the app doesn’t care. In fact, the app can be split into several chunks in memory and the app won’t notice. Back to our 5MB example apps, but this time let’s put the apps in memory like this:
- App one is split into two. The first part is held in physical RAM at address 5242880 and goes on for 2.5MB
- App two is held in physical RAM at address 7864320 and goes on for 5MB
- The second part of app one is in physical RAM at address 13107200 and goes on for 2.5MB
Now app one is split across two different parts of memory. But that is OK, because the OS programs the MMU so that when app one accesses address 0 it maps to 5242880 (as before). When it accesses address 2621440 (i.e. the first address in the second half) it doesn’t map to 7864320 (as it would have done before, i.e. 5242880 + 2621440), as that is the space now occupied by app two. Instead, 2621440 maps to 13107200. In fact the MMU can be programmed to map a virtual address to a physical address in any way that the OS chooses.
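Here is a rough sketch of that split layout for app one, using the addresses from the example. The region list and the `translate_split` function are hypothetical names used only for illustration.

```python
MB = 1024 * 1024
HALF = 5 * MB // 2  # 2.5MB = 2621440 bytes

# Hypothetical mapping for app one: (virtual start, physical start, length).
APP_ONE_REGIONS = [
    (0, 5242880, HALF),      # first half
    (HALF, 13107200, HALF),  # second half, placed after app two
]


def translate_split(regions, virtual_address):
    """Find the region containing the address and apply that region's offset."""
    for virtual_start, physical_start, length in regions:
        if virtual_start <= virtual_address < virtual_start + length:
            return physical_start + (virtual_address - virtual_start)
    raise ValueError("address is not mapped")


print(translate_split(APP_ONE_REGIONS, 0))        # 5242880
print(translate_split(APP_ONE_REGIONS, 2621440))  # 13107200
```

As far as app one is concerned, its addresses run from 0 to 5MB without a gap, even though the physical memory behind them is in two pieces.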
To perform the mapping the MMU needs a table. Each entry in the table holds a virtual address (VA) and, next to it, the corresponding physical address (PA). To translate from the virtual address to the physical address the MMU looks up the VA and then uses the corresponding PA to actually access the RAM.
The problem with this one-to-one type of table is its size. If a program is 300MB in size (and you are using 32-bit addressing, with one table entry for each 4-byte word) then you need a table with almost 79 million entries to hold all the lookup data for the MMU!
The way to fix this is to split the memory into pages, blocks of memory of a fixed size which are allocated to each app. This offers less granularity than the one-to-one mapping, but the benefit in terms of table size is significant. The typical page size is 4K, i.e. 4096 bytes. This means that a 300MB program now needs just 76,800 entries. If each entry contains 4 bytes then that is a 300K table, which is much more manageable.
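The arithmetic behind those numbers is easy to check. The sketch below assumes that the "almost 79 million" figure counts one table entry per 32-bit (4-byte) word; the variable names are just for illustration.

```python
MB = 1024 * 1024
PROGRAM_SIZE = 300 * MB
WORD_SIZE = 4     # bytes in a 32-bit word (assumption about the one-to-one table)
PAGE_SIZE = 4096  # 4K page
PTE_SIZE = 4      # bytes per page table entry, as assumed in the text

entries_per_word = PROGRAM_SIZE // WORD_SIZE
entries_per_page = PROGRAM_SIZE // PAGE_SIZE
table_bytes = entries_per_page * PTE_SIZE

print(entries_per_word)     # 78643200
print(entries_per_page)     # 76800
print(table_bytes // 1024)  # 300
```

Moving from one entry per word to one entry per 4K page shrinks the table by a factor of 1024.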
The resulting mapping system is called a Page Table (PT) and each of the items is a Page Table Entry (PTE). The Linux kernel is responsible for maintaining the PT and the PTEs, and it is the job of the MMU to look up each virtual address and produce a physical address. The PT is stored in RAM and there is one PT per process. If the virtual address needed is in the middle of a physical page then the way the MMU calculates it is like this:
- The page size is 4K, so 12 bits are needed to address every byte within a page (2^12 = 4096).
- The bottom 12 bits of the virtual address remain untouched and are used to form the lower part of the PA. This part is known as the offset.
- The remaining 20 bits (on a 32-bit system) are used as a lookup in the PT to get a physical page address (called a frame).
- The 20 bits from the PTE are combined with the 12-bit offset to get the physical address.
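The steps above can be sketched with a couple of bit operations. This is a simplified, single-level model with a made-up page table. Real MMUs typically use multi-level tables, and the frame numbers here are assumptions chosen purely for illustration.

```python
PAGE_SHIFT = 12                      # 4K pages: 2**12 = 4096
OFFSET_MASK = (1 << PAGE_SHIFT) - 1  # selects the bottom 12 bits

# Hypothetical page table: virtual page number -> physical frame number.
page_table = {0x00000: 0x00500, 0x00001: 0x00C80}


def translate_paged(virtual_address):
    """Split the address, look up the frame, and rebuild the physical address."""
    page_number = virtual_address >> PAGE_SHIFT  # top 20 bits
    offset = virtual_address & OFFSET_MASK       # bottom 12 bits, unchanged
    frame = page_table[page_number]              # a KeyError stands in for a page fault
    return (frame << PAGE_SHIFT) | offset


print(hex(translate_paged(0x00001ABC)))  # 0xc80abc
```

Note that the offset `0xabc` passes straight through, and only the page number is swapped for a frame number.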
Translation Lookaside Buffer
Even though the MMU is a piece of hardware and it does the table lookups at lightning speed, the fact that it needs to read the PT and get the PTE means that it is spending time accessing main RAM, which in CPU terms is slow! To get around this, CPUs have a cache called the Translation Lookaside Buffer (TLB). It holds the recently accessed page translations in the MMU. For each memory access performed by the processor, the MMU checks whether the translation is cached in the TLB. If it is, then the address translation is immediately available. If there is a TLB cache miss, then the MMU will use the PT in RAM to find the address.
The TLB is normally held in the CPU and runs at the full speed of the CPU. However, it is quite small, anything from 20 to 128 entries. This might sound bad, but remember that Linux is using 4K pages, which means that once a PTE is cached it will be quite a while (in CPU terms) before an address on another page is needed.
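To get a feel for why such a small cache works, here is a toy TLB in Python. It is a sketch under stated assumptions: the `TinyTLB` class, its four-entry capacity and its least-recently-used eviction are invented for this example, and real hardware may use a different replacement policy.

```python
from collections import OrderedDict

PAGE_SHIFT = 12


class TinyTLB:
    """A toy TLB: a small least-recently-used cache of page -> frame."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, virtual_address):
        page = virtual_address >> PAGE_SHIFT
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)  # mark as most recently used
        else:
            self.misses += 1
            self.entries[page] = self.page_table[page]  # the slow walk of the PT
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used
        frame = self.entries[page]
        return (frame << PAGE_SHIFT) | (virtual_address & 0xFFF)


# Hypothetical page table where page n maps to frame n + 100.
tlb = TinyTLB({n: n + 100 for n in range(16)})
for address in range(0, 8192):  # touch every byte of two pages
    tlb.lookup(address)
print(tlb.hits, tlb.misses)  # 8190 2
```

Walking through two whole pages byte by byte costs only two trips to the page table. Every other access is served from the cache.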
With all these table lookups going on, what happens when the MMU can’t find an entry in the PT? You might think that not finding a PTE for a virtual address would be bad, but not necessarily. When the MMU fails to find a page in physical memory for a virtual address it raises a page fault, an error which is sent back to Linux to tell the OS that something isn’t right. Page faults can occur for three main reasons:
- A program has a bug and has tried to access an invalid address. In this case Android kills the app, i.e. it crashes.
- The VA is valid but the page isn’t in RAM because it has been swapped out (either to the hard disk on a desktop/laptop or to compressed swap on a smartphone).
- The VA is valid, but the kernel has yet to actually allocate any physical RAM for that address. This is known as lazy allocation. The kernel will allocate a page of RAM and try again. The Resident Set Size of the process will increase by 4K.
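The three cases can be modelled with a small Python class. This is a loose sketch and not how the Linux kernel is written: `ToyProcess`, its fields and the returned strings are all hypothetical, chosen only to show the decision the kernel has to make on a page fault.

```python
PAGE_SIZE = 4096


class ToyProcess:
    """Tracks which virtual pages are valid, resident in RAM, or swapped out."""

    def __init__(self, valid_pages):
        self.valid_pages = set(valid_pages)  # pages the process is allowed to use
        self.resident = {}                   # page -> frame currently in RAM
        self.swapped = set()                 # pages that were pushed out to swap
        self.next_frame = 0

    @property
    def rss(self):
        """Resident Set Size in bytes."""
        return len(self.resident) * PAGE_SIZE

    def access(self, virtual_address):
        page = virtual_address // PAGE_SIZE
        if page in self.resident:
            return "ok"
        # Page fault: work out which of the three cases applies.
        if page not in self.valid_pages:
            return "invalid address: kill the process"
        reason = "swapped in" if page in self.swapped else "lazily allocated"
        self.swapped.discard(page)
        self.resident[page] = self.next_frame  # give the page a frame and retry
        self.next_frame += 1
        return reason


proc = ToyProcess(valid_pages=range(4))
print(proc.access(0), proc.rss)    # lazily allocated 4096
print(proc.access(100), proc.rss)  # ok 4096
print(proc.access(5 * PAGE_SIZE))  # invalid address: kill the process
```

The first touch of a page grows the Resident Set Size by 4K, and later accesses to the same page cause no fault at all.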
Virtual memory is one of those technologies that we use every day on our desktops, laptops, and smartphones and yet it is really an unsung hero. It allows us to have multitasking operating systems, plus it forms the backbone of security systems like sandboxing. Without it technology would be very different from what it is today. So the next time you start an app, just give a thought to all that is going on in the background so that you can make that little sprite jump, run and fly across the screen!