
operating systems - Swapping, Paging, Segmentation, and Virtual memory on the x86 PM architecture

Well, this may seem like a common or already-asked question, but after searching various books, online tutorials, and even here on SU, I am still puzzled about how these four beasts work together on an x86 protected-mode system.


What is the correct terminology to use when discussing these things?


As far as I understand, all four concepts are completely different, but they become related when we talk about protecting memory. This is where it got messed up for me!


I'll begin with swapping first.


Swapping:


A process must be in physical memory for execution. A process can be swapped temporarily out of physical memory to a backing store and then brought back into memory for continued execution.


This applies specifically to multitasking environments where multiple processes are to be executed at the same time, and hence a CPU scheduler is implemented to decide which process to swap out to the backing store.


Paging: aka simple paging:


Suppose a process uses/accesses addresses only in the range 0 to 16 MB, say. We can call this the logical address space of the process, as these are the addresses generated by the process.


Note that by this definition, the logical address space of one process can differ from that of another, as processes may be larger or smaller.


Now we divide this logical address space of a process into blocks of the same size, called pages. We also divide physical memory into fixed-size blocks called frames.


By definition: logical address = page number : offset within that page
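The page number : offset split above is just a shift and a mask when the page size is a power of two, as on x86. A minimal sketch, assuming 4 KiB pages (the constants and function name are illustrative, not part of any real API):

```python
# Split a logical address into (page number, offset), assuming 4 KiB pages.
# Because the page size is a power of two, the split is a shift plus a mask.
PAGE_SIZE = 4096          # 2**12 bytes per page
OFFSET_BITS = 12

def split_logical_address(addr):
    page_number = addr >> OFFSET_BITS    # high bits select the page
    offset = addr & (PAGE_SIZE - 1)      # low 12 bits index into the page
    return page_number, offset

# Example: address 0x1A2B3 lies in page 0x1A at offset 0x2B3.
print(split_logical_address(0x1A2B3))   # (26, 691)
```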


When a process is chosen for execution by the CPU scheduler, its pages are loaded from the backing store into any available memory frames.


Note that all of the pages that belong to this process are loaded into memory before control is transferred to it by the scheduler. When this process is to be swapped out to the backing store, all of its pages are stored on the backing store.


The backing store is divided into fixed-size blocks of the same size as the physical memory frames.


This eases swapping, as we swap whole pages rather than arbitrary runs of bytes. It also decreases fragmentation on the backing store: instead of finding space for some number of bytes, we only need to check whether space is available for a page.


The paging technique likewise decreases fragmentation of physical memory, as memory is allocated in whole pages.


Main memory must have space for all of the pages that belong to a process in order to load that process for execution. If there is space for only a few of its pages, then some other process (i.e., all pages belonging to that process) must first be swapped out to the backing store; only then can all pages of the process to be executed be loaded into memory.


Thus paging technique gives better performance than simple swapping.


Thus swapping allows us to run multiple processes without purchasing too much memory; instead we can work with a small amount of memory (this amount must be enough to hold all of the pages of the largest program/process that is to be run on the PC - i.e., you must know how much memory your program requires before running it), plus an additional backing store, usually a disk, which costs far less per unit of capacity than main memory.


So swapping + paging allows efficient management of memory so that multiple processes can be run on a system.


Demand-paging:


But the physical memory installed in a system need not match the requirements of a process, and multiple processes need to be run.


The solution is to load only some pages of a process into memory; when the process accesses an address in a page that is not in memory, a page fault is generated and the OS loads that page on demand so that the process can continue executing. This saves the time needed to load all pages of the process before transferring control to it, as was the case with paging + swapping.


This technique of keeping only parts of a process in memory, and rest on backing store, such as disk, is called demand paging.


Thus demand-paging = paging + swapping + keeping only some pages (not all) of a process in memory.
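The load-on-fault behavior described above can be sketched in a few lines. This is a toy model of the idea, not the x86 mechanism itself; the class and names are invented for illustration:

```python
# Toy demand-paging sketch: only pages that are actually touched get loaded;
# accessing a non-resident page triggers a "page fault" that loads it on
# demand from the backing store.

class DemandPager:
    def __init__(self, backing_store):
        self.backing_store = backing_store   # page_number -> page contents
        self.resident = {}                   # pages currently "in memory"
        self.faults = 0

    def access(self, page_number):
        if page_number not in self.resident:
            self.faults += 1                 # page fault: load on demand
            self.resident[page_number] = self.backing_store[page_number]
        return self.resident[page_number]

store = {0: "code", 1: "data", 2: "stack"}
pager = DemandPager(store)
pager.access(0); pager.access(2); pager.access(0)
print(pager.faults)   # 2 faults: pages 0 and 2 were each loaded once; page 1 never was
```

Note that page 1 is never loaded at all, which is exactly the saving over plain paging + swapping.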




This is all I know about paging and swapping. Please feel free to correct me if I am wrong somewhere above.


Now my questions are:




  1. How exactly are the terms virtual memory and virtual address space (aka linear address space) related to demand-paging in the context of x86 protected mode?




  2. Is "virtual memory of a process" a correct term, or is virtual memory defined for all processes currently running in a multitasking system?




  3. Am I right that: virtual memory available to a process == highest address in the virtual address space (aka linear address space) of a process + 1?




  4. This is about segmentation: in x86 protected mode, we are told that each process can have a 4 GB virtual address space (VAS); since segmentation is present on the x86 architecture, we can divide this VAS into two or more segments. In the x86 flat model, we create segments in the VAS of a process such that they all overlap exactly, so segmentation is effectively disabled - there are no distinct segments. But then, if some CPU instructions are present at some virtual address in the VAS of a process, it is possible that we overwrite these instructions while allocating memory (in this VAS) or when we create variables or arrays. How do we ensure that this does not occur? The protection bits in the descriptor do not distinguish between the regions, as in flat mode all segments overlap. These bits can only prevent reading code or executing data, and that only because the segments are accessed via selectors.




  5. Or is it something like each segment being treated as its own VAS? In that case, the total virtual memory (or total VAS) available to a process in flat mode would be: "number of segments belonging to a process x virtual memory for a single segment". For x86 protected mode, this would translate to 6 x 4 GB = 24 GB of VAS, assuming the 6 segments pointed to by the CS, DS, ES, GS, FS, and SS registers. Is this correct?




  6. How would an environment that supports simple paging (not demand-paging) but not virtual memory ensure protection between various segments in the flat memory model? We have two cases here: a single-tasking system and a multi-tasking system.
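For reference, the arithmetic that questions 3 and 5 reason about can be checked directly. This only reproduces the numbers in the questions (it does not settle whether the 6 x 4 GB reading is correct): on 32-bit x86 the highest virtual address is 0xFFFFFFFF, so "highest address + 1" gives the size of one address space.

```python
GiB = 2**30

# Question 3's formula: highest address in the VAS, plus one.
vas_size = 0xFFFFFFFF + 1
print(vas_size // GiB)       # 4 (GiB per 32-bit address space)

# Question 5's hypothesis: 6 segment registers, each with its own 4 GiB VAS.
print(6 * vas_size // GiB)   # 24 (GiB, if each segment really were a separate VAS)
```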




UPDATE: on 2012-07-29


So if I understand it correctly:


Virtual memory is a concept, and it is implemented on the x86 architecture using the demand-paging technique plus some protection bits (the U bit and W bit specifically).


In other words, the VAS of a process is divided into pages, which are then used in demand-paging.


The virtual memory mechanism has basically two uses in a multi-tasking environment:




  1. The size of a program may exceed the amount of physical memory available for it. The operating system keeps the parts of the program currently in use in main memory, and the rest on disk. This is implemented by demand-paging, with each page having an associated 'present bit' and 'accessed bit' in its page table entry.




  2. To provide memory protection by giving each process its own virtual address space, so one process can't access another process's VAS. This is implemented by having protection bits associated with each page. Specifically, the 'User/Supervisor bit (U bit)' and 'read/write bit (W bit)' in the page table entry are used for page access protection.
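The U and W bits mentioned above sit in the low bits of a 32-bit page table entry: bit 0 is Present, bit 1 is R/W, bit 2 is User/Supervisor. The bit positions below are the real x86 ones, but the check logic is a deliberately simplified sketch (it ignores, e.g., the CR0.WP nuance that lets supervisor code write read-only pages by default):

```python
# Simplified sketch of x86 page-table-entry protection checks.
# Real bit positions; simplified semantics (CR0.WP etc. are ignored).
P_BIT, W_BIT, U_BIT = 1 << 0, 1 << 1, 1 << 2

def page_access_allowed(pte, is_write, is_user):
    if not (pte & P_BIT):
        return False      # not present -> page fault
    if is_write and not (pte & W_BIT):
        return False      # write to a read-only page
    if is_user and not (pte & U_BIT):
        return False      # user-mode access to a supervisor page
    return True

pte = P_BIT | W_BIT       # present, writable, supervisor-only
print(page_access_allowed(pte, is_write=True,  is_user=False))   # True
print(page_access_allowed(pte, is_write=False, is_user=True))    # False
```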




Virtual memory is useful in both single-tasking system and multi-tasking system. For single-tasking systems, only Use#1 is relevant.


Page access protection has two aspects: privilege-level protection and write protection. These are implemented by the U bit (for privilege) and the W bit (for writes), respectively. These bits are present in the page table entry for that page.


Memory protection has two aspects: protecting programs from accessing each other, and protecting a program from overwriting itself in case segments overlap in the VAS of that process/program.


Now, the former problem is solved by the VAS or virtual memory concept, but what about the latter?


The page access protection scheme doesn't prevent the latter, as far as I know. In other words, the virtual memory technique doesn't prevent a program from overwriting itself in case segments overlap in the VAS of a process.


But it seems to me that even segment-level protection can't prevent the latter issue (a program overwriting itself).


The x86 CPU always evaluates segment-level protection before performing the page-level protection check - no matter whether it is the flat or multi-segment model - as there is no way to disable segmentation on an x86 CPU.


Consider a flat model scenario:


Consider a virtual address referred to by CS:off. Now DS:off will refer to the same virtual address as CS:off if the 'off' value is exactly the same in both cases. This is true for SS:off as well.
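This aliasing follows directly from the flat model's descriptors: every segment has base 0, so the segment:offset to linear-address mapping is just base + offset. A minimal sketch (the dictionary and function are illustrative only):

```python
# Flat-model sketch: all segment descriptors have base 0 (and a 4 GiB limit),
# so segment:offset -> linear address is simply base + offset, and the same
# offset through CS, DS, or SS names the same linear address.
SEGMENT_BASES = {"CS": 0, "DS": 0, "SS": 0}   # flat model: every base is 0

def linear_address(segment, offset):
    return SEGMENT_BASES[segment] + offset

off = 0x00401000   # arbitrary example offset
print(linear_address("CS", off) ==
      linear_address("DS", off) ==
      linear_address("SS", off))   # True
```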


This also means that the page in which this virtual/linear address lies is viewed by the paging unit simply as a page, since the paging unit knows nothing about segmentation.


Assume all segments of a program in flat mode belong to the same privilege level, say ring 0.


Now, what will happen if we try to write or execute data at CS:off = DS:off = SS:off?


Assume that this address does not belong to the OS code mapped into the VAS of the process - please set the OS aside for simplicity; I'm talking about hardware-level protection!


First, segment-level protection will pass, and then the privilege-level checks will pass while accessing this page (the page containing CS:off = DS:off = SS:off), since all segments belong to the same privilege level here. But what about the W bit for this page? It must be set to 1 to allow writes; otherwise, say, a data segment would not be able to write to this page. So this page is writable too.


This means that we can read/write/execute the data at this virtual (linear) address CS:off = DS:off = SS:off?
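The chain of checks in the scenario above can be sketched as follows. This is only a model of the order of checks described in the text (flat model, everything ring 0, page present and writable), not the full x86 protection rules, and all names are invented for illustration:

```python
# Sketch of the check sequence: segment-level first, then page-level.
def try_write(seg_dpl, cpl, pte_present, pte_writable):
    # 1. Segment-level check: with all segments at the same privilege
    #    level in flat mode, this passes.
    if cpl > seg_dpl:
        return "segment fault"
    # 2. Page-level checks: present bit, then write permission (W bit).
    if not pte_present:
        return "page fault"
    if not pte_writable:
        return "write protection fault"
    return "write succeeds"

print(try_write(seg_dpl=0, cpl=0, pte_present=True, pte_writable=True))
# -> write succeeds: nothing in either layer distinguishes code from data here
```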


I don't understand how the x86 hardware can provide protection against this issue when segments overlap.
