Virtualmemoryfinal 161019175858

1
Virtual Memory (GalvinNotes, 9th Ed.)
Chapter 9: Virtual Memory
Chapter Objectives
 To describe the benefits of a virtual memory system.
 To explain the concepts of demand paging, page-replacement algorithms, and allocation of page frames.
 To discuss the principles of the working-set model.
 To examine the relationship between shared memory and memory-mapped files.
 To explore how kernel memory is managed.
Outline
 Background (About preceding sections, concept of a process not having all of its pages in memory, virtual memory
concept, virtual address space, shared memory using virtual memory)
 Demand Paging:
o Basic concepts
o Performance of Demand Paging
 Copy-on-Write
 Page Replacement:
o Basic Page Replacement
o FIFO Page Replacement
o Optimal Page Replacement
o LRU Page Replacement (Algorithms: Additional-Reference-Bits, Second-Chance, Enhanced Second-Chance,
Counting-based, Page-Buffering, Applications and Page Replacement)
 Allocation of Frames:
o Minimum number of frames
o Allocation Algorithms
o Global vs Local Allocation
o Non-Uniform Memory Access
 Thrashing:
o Cause of Thrashing
o Locality Model
o Working-Set Model
o Page Fault Frequency
 Memory-Mapped Files:
o Basic Mechanism
o Shared Memory in the Win32 API
o Memory-Mapped I/O
 Allocating Kernel Memory:
o Buddy system
o Slab Allocation
 Other Considerations: Prepaging, Page size, TLB Reach, Inverted Page Tables, Program Structure, I/O Interlock and Page
Locking
 OS examples (Optional): Windows, Solaris
Content
BACKGROUND
 Precedingsections talkedabout howto avoid memoryfragmentation bybreaking process memoryrequirements down into smaller b ites
(pages), and storing the pages non-contiguously in memory.
 Most real processesdo not needalltheir pages, or at least not allat once, for several reasons:Error handling code is not neededunless that
specific error occurs, some ofwhichare quite rare. Arrays are oftenover-sizedfor worst-casescenarios, and onlya smallfractionof the arrays
are actuallyusedinpractice. Certainfeatures ofcertainprograms are rarelyused suchas the routine to balance the federal budget. (Me
thinks this holds the key to the larger-than-physical virtual memory concept)
 The abilityto load onlythe portions ofprocesses that were actuallyneeded(andonlywhen theywere needed) has several benefits:Programs
could be writtenfor a much larger address space (virtual memoryspace) than physicallyexists onthe computer. Because eachprocessis only

2
using a fractionof their total address space, there is more memoryleft for other programs, improvingCPU utilization and system throughput.
Less I/O is needed for swapping processes in and out of RAM, speeding things up. (Fig 9.1 show layout of VM)
 Figure 9.2 shows virtual address space, whichis the programmer’s logical view of process memory storage. The actual physical layout is
controlledbythe process's page table. Note that the address space shown inFigure 9.2 is sparse - A great hole inthe middle of the address
space is never used, unless the stack and/or the heap grow to fill the hole .

 Virtual memoryalsoallows the sharing
of files and memorybymultiple processes, withseveral benefits:#Systemlibrariescanbe sharedby mapping them into the vi rtual
address space of more thanone process. #Processes can alsoshare virtual memorybymappingthe same block of memory to more
than one process. #Process pages can be sharedduring a fork() system call, eliminating
the need to copy all of the pages of the original (parent) process.
DEMAND PAGING
 The basic idea behind demand paging is that when a process is swappedin, its pages are not
swappedin all at once. Rather theyare swapped in only whenthe process needs them. (on
demand. ) This is termed a lazy swapper.
 The basic idea behind paging is that when a process is swappedin, the pager only loads into
memory those
pages that it
expects the process
to need (right
away.) Pages that
are not loadedinto
memoryare marked as invalid inthe page table, using the invalid
bit. (The rest of the page table entrymayeither be blankor contain informationabout where to find the swapped-out page on the hard
drive.) If the process onlyever accesses pagesthat are loaded in memory(memoryresident pages), thenthe process runs exactlyas ifall the
pages were loaded in to memory.
 On the other hand, ifa page is neededthat wasnot originallyloadedup, thena page fault trapis generated, which must be handled in a
seriesof steps:The memoryaddress requestedis first checked, to make sure it was a validmemoryrequest. If the reference was invalid, the
process is terminated. Otherwise, the page must be paged in. A free frame is located, possibly from a free -frame list. A disk operation is
scheduled to bring inthe necessarypage fromdisk. (This will usuallyblock the process ona nI/Owait, allowing some other process to usethe
CPU in the meantime.)Whenthe I/O operationis complete, the process's page table is updatedwiththe newframe number, and the invalid

3
bit is changedto indicate that thisis nowa valid page reference. The instructionthat causedthe page fault must now be restarted from the
beginning, (as soon as this process gets another turn on the CPU.)
 In an extreme case, NO pagesare swapped infor a process until theyare requestedby page faults. This is known as pure demand paging.
 In theoryeachinstruction couldgenerate multiple page faults. Inpractice this is veryrare, due to locality of reference, covered in section
9.6.1.
 The hardware necessaryto support virtual memoryis the same as for pagingand swapping: A page table and secondary memory. (Swap
space, whose allocation is discussed in chapter 12.)
 A crucial part of the processis that the instruction must be restartedfromscratchonce the desired page hasbeen made available in memory.
For most simple instructions this is not a major difficulty. However there are some architectures that allowa single instruction to modify a
fairlylarge blockof data, (which mayspana page boundary), andifsome ofthe data gets modified before the page fault occurs, this could
cause problems. One solutionis to access both ends ofthe blockbefore executingthe instruction, guaranteeing that the necessarypages get
paged in before the instruction begins.
 Performance of Demand Paging: There are manysteps that occur whenservicing a page fault (see bookfor full details), and some of the
steps are optional or variable. But just for the sake of discussion, suppose that a normal memoryaccess requires 200 nanoseconds, and that
servicing a page fault takes 8 milliseconds. (8,000,000 nanoseconds, or 40,000 times a normal memoryaccess.) With a page fault rate of p,
(on a scale from 0 to 1), the effective access time is now: (1 - p) * (200) + p * 8000000 = 200 + 7,999,800 * p
which clearlydepends heavilyon p! Even if onlyone access in 1000 causes a page fault, the effective access time drops from 200
nanoseconds to 8.2 microseconds, a slowdownof a factor of 40 times. In order to keep the slowdownless than10%, the page fault rate must
be less than 0.0000025, or one in 399,990 accesses.
 A subtletyis that swapspace is faster to access thanthe regular file system, because it does not have to go through the wh ole directory
structure. For this reasonsome systems will transfer anentire process fromthe file systemto swapspace before startingup the process, so
that future paging all occurs from the (relatively) faster swap space.
 Some systems use demandpaging directlyfrom the file system for binarycode (which never changes andhence does not have to be stored
on a page operation), andto reserve the swapspace for data segments that must be stored. Thisapproachis used by both Sola ris and BSD
Unix.
COPY-ON-WRITE
 The idea behinda copy-on-write forkis that the pages for a parent process do not have to be actually copied for the child until one or the
other of the processeschanges the page. Theycanbe simplysharedbetweenthe two processes inthe meantime, with a bit set that the page
needs to be copied if it ever gets writtento. This is a reasonable approach, since the child process usually issues an exec( ) system call
immediatelyafter the fork (Last line grey). Obviouslyonlypages that can be modified even need to be labeled as copy-on-write. Code
segments can simplybe shared. Pages usedto satisfycopy-on-write duplications are typicallyallocated using zero-fill-on-demand, meaning
that their previous contents are zeroed out before the copy proceeds.

 Some systems provide analternative to the fork()systemcall calleda virtual memory fork, vfork(). In this case the parent is suspended, and
the childuses the parent's memorypages. Thisis veryfast for process creation, but requires that the child not mod ify any of the shared
memorypagesbefore performing the exec()systemcall. (Inessence this addressesthe question ofwhichprocessexecutes first after a call to
fork, the parent or the child. With vfork, the parent is suspended, allowing the childto e xecute first until it callsexec(), sharing pages withthe
parent in the meantime.)
PAGE REPLACEMENT
 In order to make the most use ofvirtualmemory, we load several
processes intomemoryat the same time. Since we only load the
pages that are actually neededbyeachprocess at any given time,
there is room to load manymore processes thanifwe had to load
in the entire process.
 Memoryis alsoneeded for other purposes (suchas I/O buffering),
and ifsome process suddenly decides it needs more pages and
there aren't any free frames available, then there are several
possible solutions to consider:
o Adjust the memory used by I/O buffering, etc., to free up
some frames for user processes. The decision of how to
allocate memoryfor I/Oversus user processes is a complex
one, yieldingdifferent policies on different systems. (Some
allocate a fixedamount for I/O, andothers let the I/O system
contend for memory along with everything else.)

4
o Put the process requestingmore pagesintoa wait queue until some free frames become available.
o Swap some processout of memorycompletely, freeing upits page frames.
o Find some page inmemorythat isn't being used right now, and swap that page only out to disk, freeing up a frame that can be
allocated to the process requesting it. This is knownas page replacement, and is the most commonsolution. There are many di fferent
algorithms for page replacement, which is the subject of the remainder of this section.
Basic Page Replacement:
 The previouslydiscussed page-fault processing assumedthat there would
be free frames available on the free-frame list. Now the page-fault
handling must be modifiedto free up a frame if necessary, as follows:
1. Find the locationof the desired page onthe disk, either in
swapspace or inthe file system.
2. Find a free frame:
a) If there is a free frame, use it.
b)If there is nofree frame, use a page-replacement
algorithmto select anexistingframe to be replaced,
known as the victim frame.
c) Write the victim frame to disk. Change all related
page tables to indicate that this page is nolonger in
memory.
3. Read inthe desiredpage and store it inthe frame. Adjust all
relatedpage andframe tables to indicate the change.
4. Restart the process that was waitingfor this page
 Note that step 2c adds anextra disk write to the page-fault handling, effectivelydoublingthe time requiredto processa page fault. This can
be alleviatedsomewhat byassigninga modify bit, or dirty bit to each page, indicatingwhether or not it has beenchanged since it was last
loadedinfrom disk. If the dirtybit has not beenset, thenthe page is unchanged, anddoes not needto be written out to disk. Otherwise the
page write is required. It shouldcome as nosurprise that manypage replacement strategies specificallylookfor pagesthat do not have their
dirtybit set, andpreferentiallyselect clean pages as victimpages. It shouldalso be obvious that unmodifiable code pages never get their dirty
bits set.
 There are two major requirements to implement a successful demandpaging system. We must develop a frame-allocation algorithm and a
page-replacement algorithm. The former centers around how manyframes are allocatedto each process (and to other needs), andthe latter
dealswithhow to select a page for replacement whenthere are no free frames available. The overall goal in selecting and tuning these
algorithms is to generate the fewest number of overallpage faults. (Because diskaccessis so slow relative to memory access, even slight
improvements to these algorithms can yield large improvements in overall system performance.)
 Algorithms are evaluatedusinga given string of memoryaccesses known as a reference string, which canbe generated inone of ( at least )
three commonways:
o Randomlygenerated, either evenlydistributed or withsome distributioncurve based onobserved system behavior. This is the
fastest and easiest approach, but may not reflect real performance well, as it ignores locality of reference.
o Specificallydesigned sequences. These are useful for illustrating the properties of comparative algorithms in published pape rs
and textbooks, ( and also for homework and exam problems. :-) )
o Recorded memoryreferences from a live system. Thismaybe the best approach, but the amount ofdata collectedcan be
enormous, onthe order ofa millionaddressesper second. The volume ofcollecteddata canbe reduced bymakingtwo
important observations:
 Onlythe page number that was accessed is relevant. The offset within that page does not affect pagingoperations.
 Successive accesses within the same page can be treated as a single page request, because allrequests after the first
are guaranteedto be page hits. ( Since there are no intervening requests for other pagesthat could remove thispage
from the page table. )
**So for example, if pages were ofsize 100 bytes, then the sequence of addressrequests ( 0100, 0432, 0101, 0612,
0634, 0688, 0132, 0038, 0420 ) wouldreduce to page requests ( 1, 4, 1, 6, 1, 0, 4 )
FIFO Page Replacement
 As new pagesare brought in, theyare addedto the tail of a queue, andthe page at the headof the queue is the next victim. Inthe
following example, 20 page requests result in15 page faults:
 Although FIFO is simple andeasy, it is not always optimal, or even efficient. An interesting
effect that canoccur withFIFO is Belady's anomaly, inwhichincreasing the number of frames
available canactuallyincrease the number of page faults that occur! Consider, for example,
the following chart basedonthe page sequence (1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5) and a varying
number of available frames. Obviouslythe maximumnumber of faults is 12 (every request
generates a fault), and the minimum number is 5 (each page loaded only once)...

5
 In FIFO algorithm, whichever page hasbeen inthe frames the longest is the one that is cleared. Until Bélády's anomaly was demonstrated, it
was believedthat an increase inthe number of page frames wouldalways result i nthe same number or fewer page faults. Bélády, Nelson and
Shedler constructedreference strings for which FIFOpage replacement algorithm producednearlytwice more page faults in a l arger memory
than in a smaller one (wiki).
Optimal Page Replacement
 The discoveryof Belady's anomalylead to the searchfor anoptimalpage-replacement algorithm, whichis simplythat whichyields the lowest
of all possible page-faults, and which does not suffer from Belady's anomaly.
 Such an algorithm does exist, and is calledOPT or MIN. This algorithmis simply"Replace the page that will not be used for the longest time in
the future." (www.youtube.com/watch?v=XmdgDHhx0fg clearlyexplains: Look ahead into the sequence to see which number won’t be
required for the longest period, page out that number). FIFO can take 2-3 times more time than OPT/MIN.
 OPT cannot be implemented in practice, because it
requiresforetelling the future, but it makes a nice
benchmark for the comparisonandevaluationof real
proposed new algorithms.
 In practice most page-replacement algorithms try to
approximate OPT bypredicting (estimating) in one
fashionor another what page will not be usedfor the
longest period of time. The basis of FIFO is the
prediction that the page that was brought in the
longest time agois the one that will not be needed
againfor the longest future time, but as we shall see, there are manyother prediction methods, all strivingto match the pe rformance of OPT.
LRU Page Replacement
 The predictionbehind LRU, the Least RecentlyUsed, algorithm is
that the page that has not beenused in the longest time is the
one that will not be used again in the near future. (Note the
distinctionbetweenFIFO andLRU:The former looks at the oldest
loadtime, andthe latter looks at the oldest use time.) Some view
LRU as analogous to OPT, except looking backwards in time
insteadof forwards. (OPT has the interestingpropertythat for any
reference string S andits reverse R, OPT will generate the same
number of page faults for S andfor R. It turns out that LRU has this same property.) Figure 9.15 illustrates LRU for our sample string,
yielding 12 page faults, (as compared to 15 for FIFO and 9 for OPT.)
 LRU is considereda goodreplacement policy, andis oftenused. The problemis howexactlyto implement it. There are two simple
approaches commonlyused:
o Counters: Everymemoryaccess increments a counter, and
the current value of thiscounter is stored inthe page table
entryfor that page. Then finding the LRU page involves
simple searchingthe table for the page with the smallest
counter value. Note that overflowing of the counter must
be considered.
o Stack:Another approach is to usea stack, and whenever a
page is accessed, pull that page from the middle of the stack
and place it on the top. The LRU page will always be at the
bottom of the stack. Because this requires removing objects
from the middle of the stack, a doubly linked list is the
recommended data structure (last line grey).
 Both implementations of LRU require hardware support, either for
incrementing the counter or for managing the stack, as these
operations must be performed for every memory access.
 Neither LRU or OPT exhibit Belady's anomaly. Bothbelongto a class ofpage-replacement algorithms calledstackalgorithms, which can
never exhibit Belady's anomaly. A stack algorithm is one inwhich the pages kept inmemoryfor a frame set of size N will always be a
subset of the pages kept for a frame size of N + 1. In the case of LRU, (and particularlythe stack implementation thereof),the topN pages
of the stack will be the same for all frame set sizes of N or anything larger.
 LRU-Approximation Page Replacement: Full implementation of LRU requires hardware support, and few systems provide the full
hardware support necessary. However manysystems offer some degree ofHW support, enough to approximate LRU fairly well. (In the
absence of ANY hardware support, FIFO might be the best available choice.) Inparticular, manysystems provide a reference bit for every
entryin a page table, which is set anytime that page is accessed. Initiallyall bits are set to zero, andtheycan also all be cleared at any
time. One bit of precisionis enough to distinguishpages that have beenaccessed since the last clear from those that have n ot, but does
not provide any finer grain of detail.

6
 Additional-Reference-Bits Algorithm: Finer grainis possible bystoring the most recent 8 reference bits for each page inan 8-bit byte in
the page table entry, which is interpretedas anunsignedint. At periodic intervals (clock interrupts), the OS takes over, and right-shifts
each of the reference bytes byone bit. The high-order (leftmost) bit is thenfilledinwiththe current value of the reference bit, and the
reference bits are cleared. At anygiventime, the page with the smallest value for the reference byte is the LRU page. Obviously the
specific number ofbits usedandthe frequencywithwhichthe reference byte is updatedare adjustable, and are tunedto give the fastest
performance on a given hardware platform.
 Second-Chance Algorithm: Imagine a pointer that moves continuously
from the topmost frame to the bottom and then back again. If the
pointer is at position Xat a point of time, andthat frame gets filled witha
page fromthe page sequence provided, then the pointer moves/points
to the next frame. The reference bits are set to 0 the first time a new
page is paged in. Anymore reference to that page sets its reference bit
to 1. If the pointer is at a frame whose reference bit is 1, and the next
reference is againto the same page as present in the current frame, then
the bit doesn't become 2! A frame's content is cleanedand replaced only
if the pointer is pointingto it andit's reference bit is 0. If its reference bit
is 1, then the next frame who reference bit is 0 is replaced, but at the
same time, the current frame's reference bit (which is currently 1), is
changed/set to zero before the pointer moves ahead to the next frame
(http://www.mathcs.emory.edu/~cheung/Courses/355/Syllabus/9-
virtual-mem/SC-replace.html). The book's figure is not clear at all for
understanding, but nevertheless providing it below.
 The second chance algorithm (or ClockAlgorithm)is essentially a FIFO,
except the reference bit is used to give pagesa secondchance at staying
in the page table. When a page must be replaced, the page table is
scannedina FIFO (circular queue) manner. If a page is found with its
reference bit not set, then that page is selected as the next victim. If,
however, the next page inthe FIFO does have its reference bit set, then it
is givena secondchance: The reference bit is cleared, andthe FIFO search
continues. Ifsome other page is foundthat didnot have its reference bit
set, thenthat page will be selectedas the victim, and this page (the one
being giventhe secondchance) will be allowedto stayinthe page table. If,
however, there are noother pages that do not have their reference bit set
(to put it simply, all have their bits set), thenthis page will be selected as
the victim when the FIFO search circlesback around to this page on the
second pass. Ifallreference bits in the table are set, then second chance
degrades to FIFO, but alsorequires a complete search of the table for
every page-replacement. As long as there are some pages whose
reference bits are not set, then anypage referenced frequently enough
gets to stay in the page table indefinitely.
 Enhanced Second-Chance Algorithm: The enhanced second chance
algorithmlooks at the reference bit and the modify bit (dirty bit) as an
orderedpage, and classifiespagesintoone of four classes:(0, 0) - Neither
recentlyused nor modified. (0, 1) - Not recentlyused, but modified. (1, 0) -
Recently used, but clean. (1, 1) - Recently used and modified. This
algorithmsearches the page table in a circular fashion(in as many as four
passes), looking for the first page it can find in the lowest numbered
category. I.e. it first makes a pass looking for a (0, 0), andthen if it can't findone, it makesanother pass lookingfor a (0, 1), etc. The main
difference between this algorithm and the previous one is the preference for replacing clean pages if possible.
 Counting-Based Page Replacement: There are several algorithms based oncountingthe number ofreferences that have beenmade to a
given page, such as: (A) Least Frequently Used, LFU – Replace the page withthe lowest reference count. A problem can occur if a page is
usedfrequentlyinitiallyandthennot used anymore, as the reference count remains high. A solutionto this problem is to right-shift the
counters periodically, yielding a time-decaying average reference count. (B) Most Frequently Used, MFU – Replace the page with the
highest reference count. The logic behindthis ideais that pagesthat have alreadybeen referenceda lot have been inthe system a long
time, andwe are probablydone withthem, whereaspagesreferenced onlya few timeshave onlyrecentlybeen loaded, andwe still need
them.
In general counting-based algorithms are not commonlyused, as their implementationis expensive andtheydo not approximate OPT
well.
 Page-Buffering Algorithms: There are a number of page-buffering algorithms that canbe usedinconjunctionwith the afore-mentioned
algorithms, to improve overallperformance andsometimes make up for inherent weaknesses in the hardware and/or the underlyin g
page-replacement algorithms —
o Maintaina certainminimum number of free frames at alltimes. Whena page-fault occurs, goahead and allocate one of the
free framesfrom the free list first, to get the requesting process upandrunning again as quicklyas possible, andthen sel ect a
victim page to write to disk and free up a frame as a second step.
o Keep a list of modifiedpages, andwhenthe I/O system is otherwise idle, have it write these pagesout to disk, and then cle ar
the modify bits, thereby increasing the chance of finding a "cl ean" page for the next potential victim.
o Keep a pool of free frames, but remember what page was init before it was made free. Since the data in the page is not
actuallycleared out whenthe page is freed, it can be made an active page again without having to load in any new data from
disk. This is useful when an algorithm mistakenly replaces a page that in fact is needed again soon.

7
 Some applications like database programs undertake their own memory management overriding the general -purpose OS for data
accessing andcaching needs. Theyare often givena rawdiskpartitionto work with, containing raw data blocks, and no file system
structure.
ALLOCATION OF FRAMES
We saidearlier that there were twoimportant tasks invirtualmemorymanagement:a page-replacement strategyanda frame-allocation strategy.
This section covers the second part of that pair.
 Minimum Number of Frames: The absolute minimum number of frames that a process must be allocated is dependent on system
architecture, and corresponds to the worst-case scenarioof the number of pagesthat could be touched bya single (machine)instruction. If
an instruction (andits operands) spans a page boundary, thenmultiple pages could be needed just for the instruction fetch. Memory
references in an instructiontouch more pages, and if those memorylocations canspanpage boundaries, then multiple pages could be
neededfor operand access also. The worst case involves indirect addressing, particularlywhere multiple levelsof indirect a ddressing are
allowed. Left unchecked, a pointer to a pointer to a pointer to a pointer to a . . . could theoreticallytoucheverypage inthe virtual address
space in a single machine instruction, requiring everyvirtual page be loaded into physical memory simultaneously. F or this reason
architectures place a limit (say16) on the number of levelsof indirection allowed in an instruction, which is enforced with a counter
initializedto the limit and decrementedwitheverylevel ofindirectionin aninstruction - If the counter reaches zero, then an "excessive
indirection" trap occurs. This example would still require a minimum frame allocation of 17 per process.
 Allocation Algorithms:
o Equal Allocation - If there are m framesavailable andn processes to share them, eachprocess gets m/nframes, andthe leftovers
are kept in a free-frame buffer pool.
o Proportional Allocation - Allocate the framesproportionallyto the size of the process, relative to the total size ofall processes.
So if the size of process i is S_i, andS is the sum of all S_i, thenthe allocation for process P_i is a_i = m * S_i / S. Variations on
proportional allocationcould consider priorityof processrather than just their size. Obviouslyall allocations fluctuate o ver time
as the number of availablefree frames, m, fluctuates, andall are alsosubject to the constraints of minimum allocation. (If the
minimum allocations cannot be met, thenprocesses must either be swappedout or not allowedto start until more free frames
become available.)
 Global versus Local Allocation: One bigquestion is whether frame allocation(page replacement) occurs on a local or global level. With
local replacement, the number ofpages allocatedto a processis fixed, andpage replacement occurs onlyamongst the pages al located to
this process. Withglobal replacement, anypage maybe a potential victim, whether it currentlybelongs to the processseeking a free frame
or not. Local page replacement allows processes to better control their own page fault rates, and leads to more consistent performance of
a given process over different system load levels. Global page replacement is overall more efficient, and is the more commonl y used
approach.
 Non-Uniform Memory Access (Consolidates understanding): The above arguments all assume that all memoryis equivalent, or at least
has equivalent access times. This maynot be the case inmultiple-processor systems, especiallywhere eachCPU is physically located on a
separate circuit boardwhichalsoholds some portion ofthe overall systemmemory. In these latter systems, CPUs can access memory that
is physicallylocatedon the same boardmuchfaster thanthe memoryon the other boards. The basic solution is akin to processor affinity -
At the same time that we tryto schedule processes onthe same CPU to minimize cache misses, we alsotryto allocate memory for those
processes onthe same boards, to minimize access times. The presence of threads complicates the picture, especiallywhen the threads get
loadedontodifferent processors. Solaris uses an lgroupas a solution, in a hierarchical fashionbasedonrelative latency. For example, all
processors and RAMona single boardwouldprobablybe inthe same lgroup. Memoryassignments are made within the same lgroup if
possible, or to the next nearest lgroup otherwise. (Where "nearest" is defined as having the lowest access time.)
THRASHING
If a process cannot maintainits minimum required number of frames, then it must be swappedout, freeing upframesfor other processes. This is an
intermediate level of CPU scheduling. But what about a process that cankeep its minimum, but cannot keepall of the frames that it is currentlyusing
on a regular basis?Inthis case, it is forced to page out pages that it will needagain in the verynear future, leading to large numbers of page faults. A
process that is spending more time paging than executing is said to be thrashing.
 Cause of Thrashing: Earlyprocess scheduling schemes wouldcontrol the level of
multiprogramming allowedbased onCPU utilization, adding inmore processes
when CPU utilizationwas low. The problem is that whenmemoryfilled up and
processes started spending lots of time waitingfor their pagesto page in, then
CPU utilization would lower, causing the schedule to add in even more
processes and exacerbating the problem! Eventually the system would
essentiallygrindto a halt. Local page replacement policies can prevent one
thrashing process fromtaking pagesawayfrom other processes, but it still tends
to clog up the I/O queue, therebyslowing down anyother process that needs to
do even a little bit of paging (or any other I/O for that matter.)
To prevent thrashing we must provide processes withas many frames as they
reallyneed"right now", but how dowe know what that is? The locality model
notes that processes typicallyaccess memoryreferencesina givenlocality, makinglots ofreferences to the same generalarea of memory
before movingperiodicallyto a newlocality, as showninFigure 9.19 below. If we couldjust keepas many frames as a re involved in the
current locality, thenpage faultingwouldoccur primarilyon switches from one locality to another. (E.g. when one function exits and
another is called.)

8
 Working-Set Model: The workingset modelis basedon the concept of locality, and defines a working set window, of length delta.
Whatever pages are included in the most recent delta page
references are saidto be in the processesworkingset window, and
comprise its current working set, as illustrated in Figure 9.20:
The selection ofdelta is critical to the success of the working set
model - If it is too small thenit does not encompass all ofthe pages
of the current locality, andif it is too large, then it encompasses
pages that are no longer being frequently accessed. The total
demand, D, is the sum of the sizes of the working sets for all
processes. IfD exceeds the total number of available frames, thenat least one process is thrashing, because there are not e nough frames
available to satisfyits minimum working set. If D is significantlylessthanthe currentlyavailable frames, thenadditional processes can be
launched. The hard part of the working-set model is keeping trackof what pagesare in the current working set, since everyreference adds
one to the set andremoves one older page. An approximationcanbe made using reference bits and a timer that goes off after a set
interval of memoryreferences:For example, suppose that we set the timer to gooff after every5000 references (byanyprocess), and we
can store two additionalhistorical reference bits in addition to the current reference bit. Every time the timer goes off, t he current
reference bit is copied to one of the two historical bits, andthen cleared. If anyof the three bits is set, the n that page was referenced
within the last 15,000 references, and is consideredto be inthat processes reference set. Finer resolution can be achieved with more
historical bits and a more frequent timer, at the expense of greater overhead.
 Page-Fault Frequency: A more direct approach is to recognize that what we reallywant to control is the page-fault rate, and to allocate
frames based onthisdirectlymeasurable value. If the page-fault rate exceeds a certainupper bound thenthat process needs more frames,
and ifit is below a givenlower bound, thenit can affordto give upsome of its frames to other processes. (Illinois profes sor supposes a
page-replacement strategycouldbe devisedthat wouldselect victim framesbased on the process with the lowes t current page-fault
frequency.). Note that there is a direct relationshipbetweenthe page-fault rate andthe working-set, as a processmoves from one locality
to another (unnumbered side-bard-9th Ed).
MEMORY-MAPPED FILES
Rather thanaccessing data files directlyvia the file system witheveryfileaccess, data files canbe paged into memory the same as process files,
resulting in much faster accesses (except of course when page -faults occur.) This is known as memory-mapping a file.
 Basic Mechanism: Basicallya file is mapped to an address range within a
process's virtual address space, and then paged in as needed using the
ordinary demand paging system. Note that file writes are made to the
memorypage frames, andare not immediatelywrittenout to disk. (This is the
purpose of the "flush()" systemcall, whichmayalsobe needed for stdout in
some cases.) Thisis alsowhyit is important to "close()" a file when one is
done writing to it - So that the data can be safelyflushed out to disk and so
that the memoryframes canbe freedup for other purposes. Some systems
provide specialsystemcalls to memorymapfiles and use direct disk access
otherwise. Other systems mapthe file to process address space ifthe special
systemcalls are used andmapthe file to kernel address space otherwise, but
do memorymapping in either case. File sharing is made possible by mapping
the same file to the address space of more than one process, as shown in
Figure 9.23 below. Copy-on-write is supported, and mutual exclusion
techniques (chapter 6) maybe neededto avoid synchronization problems.
 Memory-Mapped I/O: All access to devices is done bywriting into(or reading
from) the device's registers. Normallythis is done via special I/O instructions.
For certaindevices it makessense to simplymapthe device's registers to addressesinthe process's virtualaddress space, making device
I/O as fast and simpleas anyother memoryaccess. Videocontroller cards are a classic example of this. Serial andparallel devices can also
use memorymapped I/O, mappingthe device registers to specific memoryaddresses knownas I/O Ports, e.g. 0xF8. Transferring a series of
bytes must be done one at a time, movingonlyas fast as the I/O device is prepared to process the data, througho ne of two mechanisms:
o Programmed I/O (PIO), also known as polling – The CPU periodicallychecks the control bit on the device, to see if it is ready to
handle another byte of data.

9
o Interrupt Driven – The device generates aninterrupt when it either hasa nother byte of data to deliver or is ready to receive
another byte.
ALLOCATING KERNEL MEMORY
Previous discussions have centered on process memory, which can be conveniently broken up into page -sized chunks, and the only
fragmentationthat occurs is the average half-page lost to internal fragmentationfor each process (segment). There is also additional memory
allocated to the kernel, however, whichcannot be so easilypaged. Some ofit is usedfor I/O buffering and direct access by devices, for example, and
must therefore be contiguous and not affectedbypaging. Other memoryis used for internalkernel data structures ofvarious sizes, andsince kernel
memoryis oftenlocked(restricted from being ever swapped out), management of this resource must b e done carefully to avoid internal
fragmentationor other waste. (i.e. you would like the kernel to consume as little memory as possible, leaving as much as pos sible for user
processes.) Accordingly there are several classic algorithms in place for allocating kernel memory structures.
 Buddy System: The BuddySystemallocates memoryusing
a power of twoallocator. Under this scheme, memory is
always allocated as a power of 2 (4K, 8K, 16K, etc),
rounding up to the next nearest power of two if
necessary. If a block of the correct size is not currently
available, then one is formed bysplitting the next larger
block in two, forming twomatchedbuddies. (And if that
larger size is not available, thenthe next largest available
size is split, and so on.) One nice feature of the buddy
systemis that if the address of a block is exclusively ORed
with the size ofthe block, the resulting address is the
address of the buddyof the same size, which allows for
fast andeasycoalescing of free blocks back into larger
blocks. Free lists are maintained for everysize block. If the
necessaryblock size is not available uponrequest, a free
block fromthe next largest size is split into twobuddies of
the desiredsize. (Recursivelysplitting larger size blocks if
necessary.) When a block is freed, its buddy's address is
calculated, andthe free list for that size block is checked
to see if the buddyis also free. If it is, then the twobuddies are coalesced
into one larger free block, andthe process is repeated with successively
larger free lists. See the (annotated) Figure 9.27 below for an example.
 Slab Allocation: Slab Allocationallocatesmemoryto the kernelin chunks
called slabs, consisting of one or more contiguous pages. The kernel then
creates separate caches for eachtype of data structure it might needfrom
one or more slabs. Initiallythe cachesare markedempty, andare marked
full as theyare used. New requests for space inthe cache is first granted
from empty or partially empty slabs, and if all slabs are full, then
additional slabs are allocated. Thisessentiallyamounts to allocating space
for arrays of structures, in large chunks suitable to the size of the
structure beingstored. For example if a particular structure were 512
bytes long, space for themwould be allocated in groups of 8 using 4K
pages. If the structure were 3K, then space for 4 of them could be
allocated at one time ina slabof 12Kusing three 4K pages. Benefits of
slab allocationinclude lack of internalfragmentation and fast allocationof
space for individual structures Solaris uses slaballocationfor the kernel andalsofor certainuser-mode memoryallocations. Linux usedthe
buddy system prior to 2.2 and switched to slab a llocation since then.
OTHER CONSIDERATIONS
 Prepaging: The basic idea behindprepaging is to predict the pages that will be needed inthe near future, and page them inbefore they are
actuallyrequested. If a process was swapped out andwe know what its workingset wasat the time, thenwhenwe swapit backin we can
go ahead andpage back inthe entire workingset, before the page faults actually occur. With small (data) files we can go ah ead and
prepage all of the pages at one time. Prepagingcanbe of benefit if the prediction is goodandthe pagesare neededeventually, but slows
the system down if the prediction is wrong.
 Page Size: There are quite a few trade-offs ofsmall versus large page sizes:Small pages waste less memorydue to internal fragmentation.
Large pages require smaller page tables. For disk access, the latencyandseek times greatlyoutweigh the actual data transfe r times. This
makes it muchfaster to transfer one large page of data than twoor more smaller pages containing the same amoun t of data. Smaller
pages match localitybetter, because we are not bringingindata that is not reallyneeded. Small pages generate more page fa ults, with
attending overhead. The physical hardware mayalsoplaya part indetermining page size. It is hardto determine an"optimal" page size for
any given system. Current norms range from 4K to 4M, and tend towards larger page sizes as time passes.
 TLB Reach: TLB Reachis defined as the amount of memorythat canbe reachedbythe pages listed in the TLB. Ideal ly the working set
wouldfit withinthe reachof the TLB. Increasingthe size of the TLB is an obvious way of increasing TLB reach, but TLB memo ry is very
expensive and alsodraws lots ofpower. Increasing page sizes increases TLB reach, but also leads to increased fragmentation loss. Some

10
systems provide multiple size pagesto increase TLB reachwhile keeping fragmentationlow. Multiple page sizes requires that the TLB be
managed by software, not hardware.
 Program Structure: Consider a pair of nestedloops to access everyelement ina 1024 x 1024 two-dimensional arrayof 32-bit ints. Arrays in
C are storedinrow-major order, which means that eachrow of the arraywouldoccupya page of memory. If the loops are nested so that
the outer loopincrements the row and the inner loopincrements the column, then anentire row canbe processed before the ne xt page
fault, yielding 1024 page faults total. On the other hand, ifthe loops are nested the other way, so that the program worke d down the
columns instead ofacross the rows, then everyaccesswouldbe to a different page, yieldinga newpage fault for each access , or over a
millionpage faults all together. Be aware that different languages store their arrays differently. FORTRAN for example stores arrays in
column-major format instead ofrow-major. This means that blindtranslation of code from one language to another may turn a fast
program into a very slow one, strictly because of the extra page faults.
 I/O Interlock and Page Locking: There are severaloccasions whenit maybe desirable to lock pages in memory, and not let them get paged
out — Certainkernel operations cannot tolerate havingtheir pagesswappedout. If an I/O controller is doing direct-memory access, it
wouldbe wrong to change pages in the middle of the I/O operation. Ina prioritybasedscheduling system, low priority jobs may need to
wait quite a while before getting their turnon the CPU, andthere is a danger of their pagesbeing pagedout before they get a chance to
use them evenonce after paging themin. Inthis situationpages maybe locked when theyare pagedin, until the process that requested
them gets at least one turn in the CPU.
OPERATING-SYSTEM EXAMPLES (OPTIONAL)
This section is onlyto consolidate your understanding andhelprevise the concepts inyour mindin a real-life case study. Just read throughit, no
need to pushyourself to memorize anything. Just mapmentallywhat you learnt intothese real OS examples.
Windows:
 Windows uses demandpaging with clustering, meaning theypage in multiple pages whenever a page fault occurs.
 The working set minimumandmaximum are normallyset at 50 and345 pages respectively. (Maximums can be exceededinrare
circumstances.)
 Free pages are maintained ona free list, witha minimumthresholdindicatingwhenthere are enough free frames available.
 If a page fault occurs andthe processis below their maximum, thenadditional pages are allocated. Otherwise some pagesfrom this
process must be replaced, using a local page replacement algorithm.
 If the amount offree frames falls below the allowable threshold, thenworking set trimming occurs, taking frames awayfroma nyprocesses
which are above their minimum, until all are at their minimums. Thenadditional framescanbe allocated to processesthat need them.
 The algorithmfor selecting victimframes depends onthe type of processor:
o On single processor 80x86 systems, a variationof the clock( secondchance ) algorithm is used.
o On Alpha andmultiprocessor systems, clearing the reference bits mayrequire invalidating entriesinthe TLB on other processors,
which is anexpensive operation. Inthis case Windows uses a variationof FIFO.
Solaris:
 Solaris maintains a list of free pages, and allocates one to a faulting
thread whenever a fault occurs. It is therefore imperative that a
minimum amount of free memory be kept on hand at all times.
 Solaris hasa parameter, lotsfree, usually set at 1/64 of total physical
memory. Solaris checks 4 times per second to s ee if the free memory
falls belowthis threshhold, and if it does, then the pageout process is
started.
 Pageout uses a variationof the clock(secondchance) algorithm, with
two hands rotating around through the frame table. The first hand
clears the reference bits, andthe secondhandcomesbyafterwards and
checks them. Anyframe whose reference bit hasnot beenreset before
the second hand gets there gets paged out.
 The Pageout methodis adjustable by the distance between the two
hands, (the handspan), andthe speedat whichthe hands move. For example, if the hands each check 100 frames per second, and the
handspan is 1000 frames, thenthere wouldbe a 10 second interval between the time whenthe leading handclears the reference bits and
the time when the trailing hand checks them.
 The speedof the hands is usuallyadjusted according to the amount of free memory, as shownbelow. Slowscan is usuallyset at 100 pages
per second, andfastscan is usually set at the smaller of 1/2 of the total physical pages per second and 8192 pages per second.
 Solaris alsomaintains a cache of pages that have beenreclaimedbut whichhave not yet been overwritten, as opposed to the f ree list
which onlyholds pages whose current contents are invalid. Ifone ofthe pages from the cache is needed before it gets moved to the free
list, then it can be quickly recovered.
 Normallypageout runs 4 timesper secondto check if memoryhas fallenbelow lotsfree. However if it falls belowdesfree, thenpageout will
run at 100 times per second inanattempt to keepat least desfree pages free. If it is unable to dothis for a 30-secondaverage, thenSolaris
begins swapping processes, starting preferably with processes that have been idle for a long time.
 If free memory falls below minfree, then pageout runs with every page fault.
 Recent releases ofSolaris have enhancedthe virtual memorymanagement system, includingrecognizing pages fromsharedlibraries, and
protecting them from being paged out.

11
SUMMARY
 It is desirable to be able to execute a process whose logicaladdress space is larger than the available physical address spa ce. Virtual
memoryis a technique that enables us to mapa large logical address space ontoa smaller physical memory. Virtual memory allows us to
run extremelylarge processesandto raise the degree of multiprogramming, increasing CPU utilization. Further, it frees appl ication
programmers from worrying about memoryavailability. Inaddition, withvirtual memory, several processes can share systemlibraries and
memory. With virtual memory, we canalso use anefficient type ofprocesscreationknownas copy-on-write, wherein parent and child
processes share actual pages of memory.
 Virtual memoryis commonlyimplementedbydemandpaging. Pure demandpaging never brings ina page until that page is referen ced.
The first reference causesa page fault to the operating system. The operating-system kernel consults an internaltable to determine where
the page is located onthe backingstore. It thenfinds a free frame and reads the page infrom the backing store. The page table is updated
to reflect thischange, andthe instruction that causedthe page fault is restarted. This approachallows a process to run even though its
entire memoryimage is not in mainmemoryat once. As long as the page -fault rate is reasonably low, performance is acceptable.
 We can use demandpagingto reduce the number of frames allocated to a process. This arrangement can increase the degree of
multiprogramming (allowing more processesto be available for executionat one time)and—in theory, at least —the CPU utilizationof the
system. It alsoallows processes to be runeven thoughtheir memoryrequirements exceed the total available physical memory. Such
processes run in virtual memory.
 If total memoryrequirements exceedthe capacityof physicalmemory, then it maybe necessaryto replace pages from memory to free
frames for newpages. Various page-replacement algorithms are used. FIFO page replacement is easyto program but suffers fromBelady’s
anomaly. Optimal page replacement requires future knowledge. LRU replacement is an approximationof optimal page replacement, but
even it maybe difficult to implement. Most page-replacement algorithms, such as the second-chance algorithm, are approximations of LRU
replacement.
 In additionto a page-replacement algorithm, a frame-allocation policy is needed. Allocation can be fixed, suggesting local page
replacement, or dynamic, suggesting global replacement. The working-set modelassumes that processes execute inlocalities. The working
set is the set of pages inthe current locality. Accordingly, each process shouldbe allocatedenoughframes for its current working set. If a
process doesnot have enoughmemoryfor its workingset, it will thrash. Providingenoughframes to eachprocessto avoid thrashing may
require process swapping and scheduling.
 Most operatingsystems provide features for memorymapping files, thus allowing file I/O to be treated as routine memory access. The
Win32 API implements shared memory through memory mapping of files.
 Kernel processes typicallyrequire memoryto be allocatedusingpagesthat are physicallycontiguous. The buddysystem allocates memory
to kernel processesinunits sizedaccordingto a power of 2, whichoftenresults in fragmentation. Slab allocators assign ke rnel data
structures to caches associatedwith slabs, whichare made up ofone or more physically contiguous pages. With slab allocation, no
memory is wasted due to fragmentation, and memory requests can be satisfied quickly.
 In additionto requiring us to solve the major problems ofpage replacement andframe allocation, the proper design of a paging system
requiresthat we consider prepaging, page size, TLB reach, inverted page tables, program structure, I/O interlock and page lo cking, and
other issues.
To be cleared
 Inverted Page Tables: Invertedpage tables store one entryfor eachframe instead ofone entry for each virtual page. This reduces the
memoryrequirement for the page table, but loses the informationneededto implement virtual memorypaging. A solution is to keep a
separate page table for eachprocess, for virtual memorymanagement purposes. These are kept ondisk, andonlypaged in when a page
fault occurs. (i.e. theyare not referencedwitheverymemoryaccess the waya traditionalpage table would be.)—(Grey and inadequate as
of now…in the website notes)
Further Reading
 Skipped: SharedMemoryinthe Win32 API (Memory-mapped filessection. There’s a figure there that says “Figure 9.26 Consumer reading
from shared memoryusing the Win32 API”)

Virtualmemoryfinal 161019175858

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (17)

Ähnlich wie Virtualmemoryfinal 161019175858

Ähnlich wie Virtualmemoryfinal 161019175858 (20)

Mehr von marangburu42

Mehr von marangburu42 (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Virtualmemoryfinal 161019175858