CSC 203 1.5
Computer System Architecture
Budditha Hettige
Department of Statistics and Computer Science
University of Sri Jayewardenepura
Cache memory
Cache Memory
• A high-speed, small memory
• Holds the most frequently used memory words
• When the CPU needs a word, it first checks the
cache; if the word is not found there, it checks main memory
Cache and Main Memory
Cache Memory vs Main Memory
Cache Hit and Miss
• Cache Hit: a request to
read from memory that
can be satisfied from
the cache without using
the main memory.
• Cache Miss: a request
to read from memory
that cannot be
satisfied from the cache,
for which the main
memory has to be
consulted.
Locality Principle
• PRINCIPLE OF LOCALITY: the tendency to
reference data items that are near other
recently referenced data items, or that were
recently referenced themselves.
• TEMPORAL LOCALITY: a memory location that
is referenced once is likely to be referenced
multiple times in the near future.
• SPATIAL LOCALITY: if a memory location is
referenced once, then the program is likely to
reference a nearby memory location in the
near future.
Locality Principle
Let
c – cache access time
m – main memory access time
h – hit ratio (fraction of all references that can be
satisfied out of cache)
miss ratio = 1 − h
Average memory access time = c + (1 − h)m
h = 1: every reference is satisfied from the cache (no main memory references)
h = 0: every reference goes to main memory
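The formula above can be evaluated directly; a minimal sketch, with an illustrative function name and sample timings:

```python
def avg_access_time(c, m, h):
    """Average memory access time: cache access time plus the
    miss-ratio-weighted main memory access time (c + (1 - h) * m)."""
    return c + (1 - h) * m

# e.g. cache access 1 ns, main memory 10 ns, 50% hit ratio:
print(avg_access_time(1, 10, 0.5))  # 6.0
```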
Example:
Suppose that a word is read k times in a short
interval.
First reference: main memory; the other k − 1
references: cache
h = (k − 1) / k
Average memory access time = c + (1 − h)m = c + m / k
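The k-reads example can be checked numerically; function name and timings below are illustrative:

```python
def k_read_avg_time(k, c, m):
    """Average time per access when a word is read k times:
    one main memory miss followed by k - 1 cache hits."""
    h = (k - 1) / k          # hit ratio for this access pattern
    return c + (1 - h) * m   # simplifies to c + m / k

# 4 reads, cache access 1 ns, main memory 20 ns: 1 + 20/4
print(k_read_avg_time(4, 1, 20))  # 6.0
```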
Cache Memory
• Main memories and caches are divided into fixed-size
blocks
• Cache lines – the blocks inside the cache
• On a cache miss, an entire cache line is loaded into the cache
from memory
• Example:
– A 64 KB cache can be divided into 1K lines of 64 bytes, 2K lines of
32 bytes, etc.
• Unified cache
– instruction and data use the same cache
• Split cache
– Instructions in one cache and data in another
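The line-count example above is a one-line calculation; the helper name is hypothetical:

```python
def cache_lines(cache_bytes, line_bytes):
    """Number of cache lines for a given total cache size and line size."""
    assert cache_bytes % line_bytes == 0
    return cache_bytes // line_bytes

print(cache_lines(64 * 1024, 64))  # 1024 lines (1K), as on the slide
print(cache_lines(64 * 1024, 32))  # 2048 lines (2K)
```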
A system with three levels of cache
Pentium 4 Block Diagram
Replacement Algorithm
• Optimal Replacement: replace the block
that will not be needed again. If
all blocks currently in Cache Memory will
be used again, replace the one that will
not be used again for the longest
time.
• Random selection: replace a randomly
selected block among all blocks currently
in Cache Memory.
Replacement Algorithm
• FIFO (first-in first-out): replace the block
that has been in Cache Memory for the
longest time.
• LRU (Least recently used): replace the
block in Cache Memory that has not
been used for the longest time.
• LFU (Least frequently used): replace the
block in Cache Memory that has been
used the fewest times
Cache Memory Placement Policy
• Three commonly used methods to
translate main memory addresses to
cache memory addresses.
– Associative Mapped Cache
– Direct-Mapped Cache
– Set-Associative Mapped Cache
• The choice of cache mapping scheme
affects cost and performance, and there
is no single best method that is
appropriate for all situations
Associative Mapping
Associative Mapping
• A block in Main Memory can
be mapped to any available
(not already occupied) block in
Cache Memory
• Advantage: flexibility. A Main
Memory block can be mapped
anywhere in Cache Memory.
• Disadvantage: slow or
expensive. A search through all
the Cache Memory blocks is
needed to check whether the
address can be matched to any
of the tags.
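As a software sketch of this behaviour, a fully associative cache can be modelled as a tag-to-block dictionary; the class name and the arbitrary-eviction policy are assumptions for illustration (real hardware compares all tags in parallel, and any replacement algorithm may be used):

```python
class AssociativeCache:
    """Sketch of a fully associative cache: any MM block can
    occupy any cache block, looked up by tag."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.blocks = {}  # tag -> block data

    def lookup(self, tag):
        # Models the "search through all tags"; returns None on a miss
        return self.blocks.get(tag)

    def fill(self, tag, data):
        if len(self.blocks) >= self.num_blocks:
            # Cache full: evict an arbitrary block (placeholder policy)
            self.blocks.pop(next(iter(self.blocks)))
        self.blocks[tag] = data
```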
Direct Mapping
Direct Mapping
• To avoid the search through all
CM blocks needed by
associative mapping, this
method allows only
(# blocks in main memory) /
(# blocks in cache memory)
MM blocks to be mapped to each
Cache Memory block.
• Each entry (row) in the cache can
hold exactly one cache line
from main memory
• 32-byte cache line size → a
2K-line cache can hold 64 KB
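The 64 KB / 32-byte-line example can be sketched as a tag/index/offset address split, assuming 2048 lines (64 KB / 32 B) and a hypothetical helper name:

```python
def split_address(addr, line_bytes=32, num_lines=2048):
    """Split a byte address for a direct-mapped cache with
    32-byte lines and 2048 lines (a 64 KB cache, as on the slide)."""
    offset = addr % line_bytes       # byte within the cache line
    block = addr // line_bytes       # MM block number
    index = block % num_lines        # which cache line the block maps to
    tag = block // num_lines         # identifies which MM block is resident
    return tag, index, offset

print(split_address(0x12345))  # (1, 282, 5)
```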
Direct Mapping
• Advantage: direct mapping is faster than
associative mapping, as it avoids
searching through all the CM tags for a
match.
• Disadvantage: it lacks mapping
flexibility. For example, if two MM blocks
mapped to the same CM block are needed
repeatedly (e.g., in a loop), they will keep
replacing each other, even though all other
CM blocks may be available.
Set-Associative Mapping
Set-Associative Mapping
• This is a trade-off between
associative and direct mapping,
where each address is mapped
to a certain set of cache
locations.
• The cache is broken into sets,
where each set contains N
cache lines, say 4. Each
memory address is assigned a
set, and can be cached in any
one of those 4 locations within
the set it is assigned to. In other
words, within each set the cache
is associative; hence the name.
Set-Associative Cache
• An LRU (Least Recently Used) algorithm is
typically used
– keep an ordering of the lines in each set
that a given memory location could occupy
– whenever one of the present lines is accessed,
update the list, making that entry the most
recently accessed
– when an entry has to be replaced, the one at
the end of the list is discarded
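The list mechanism described above can be sketched for a single set; the class name and list representation are illustrative:

```python
class LRUSet:
    """One set of an N-way set-associative cache with LRU replacement.
    Front of the list = most recently used; back = least recently used."""

    def __init__(self, ways):
        self.ways = ways
        self.lines = []  # tags, most recently used first

    def access(self, tag):
        """Returns True on a hit, False on a miss (filling the line)."""
        if tag in self.lines:
            self.lines.remove(tag)      # hit: move to the front
            self.lines.insert(0, tag)
            return True
        if len(self.lines) == self.ways:
            self.lines.pop()            # miss on a full set: evict the back
        self.lines.insert(0, tag)
        return False
```

For example, in a 2-way set, accessing A, B, A, C evicts B: after the second access to A the order is [A, B], so C replaces B.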
Load-Through and Store-Through
• Load-Through : When the CPU
needs to read a word from the
memory, the block containing the
word is brought from MM to CM,
while at the same time the word is
forwarded to the CPU.
• Store-Through : If store-through is
used, a word to be stored from
CPU to memory is written to both
CM (if the word is in there) and
MM. By doing so, a CM block to be
replaced can be overwritten by an
incoming block without being
saved to MM.
Cache Write Methods
• So far, words in a cache have been viewed simply as
copies of words from main memory that are
read from the cache to provide faster access.
This viewpoint changes once the CPU writes to memory.
• There are 3 possible write actions:
– Write the result into the main memory
– Write the result into the cache
– Write the result into both main memory and cache
memory
Cache Write Methods
• Write Through: A cache architecture in which
data is written to main memory at the same time
as it is cached.
• Write Back / Copy Back: on a cache hit, the CPU
writes only to the cache; the modified block is
written back to main memory when it is eventually
replaced. On a cache miss, the CPU writes to main
memory.
• On a write miss:
– Write Allocate: load the memory block into the cache
and update the cache block
– No-Write Allocate: bypass the cache and
write the word directly into main memory
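The two write policies can be contrasted in a small sketch; dicts stand in for the cache and main memory, the function names are illustrative, and the write-back version uses no-write-allocate as on the slide:

```python
def write_through(cache, memory, addr, value):
    """Write through: the write always reaches main memory,
    and the cache copy (if present) is updated too."""
    if addr in cache:
        cache[addr] = value
    memory[addr] = value

def write_back(cache, dirty, memory, addr, value):
    """Write back with no-write-allocate: a write hit updates only the
    cache and marks the block dirty (to be written to memory on
    eviction, which this sketch omits); a write miss goes to memory."""
    if addr in cache:
        cache[addr] = value
        dirty.add(addr)
    else:
        memory[addr] = value
```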
Cache Evaluation
(Problem → Solution → Processor on which the feature first appeared)
• External memory slower than the system bus
→ Add external cache using faster memory technology (386)
• Increased processor speed results in the external bus becoming a
bottleneck for cache access
→ Move the external cache on-chip, operating at the same speed
as the processor (486)
• Internal cache is rather small, due to limited space on chip
→ Add external L2 cache using faster technology than main
memory (486)
Cache Evaluation
(Problem → Solution → Processor on which the feature first appeared)
• Increased processor speed results in the external bus becoming a
bottleneck for L2 cache access
→ Move the L2 cache onto the processor chip (Pentium II)
→ Create a separate back-side bus (BSB) that runs at a higher
speed than the main (front-side) external bus and is dedicated
to the L2 cache (Pentium Pro)
• Some applications deal with massive databases and must have
rapid access to large amounts of data; the on-chip caches are
too small
→ Add external L3 cache (Pentium III)
→ Move the L3 cache on-chip (Pentium 4)
Comparison of Cache Sizes
Processor – Type – Year of Introduction – L1 cache – L2 cache – L3 cache
IBM 360/85 Mainframe 1968 16 to 32 KB — —
PDP-11/70 Minicomputer 1975 1 KB — —
VAX 11/780 Minicomputer 1978 16 KB — —
IBM 3033 Mainframe 1978 64 KB — —
IBM 3090 Mainframe 1985 128 to 256 KB — —
Intel 80486 PC 1989 8 KB — —
Pentium PC 1993 8 KB/8 KB 256 to 512 KB —
PowerPC 601 PC 1993 32 KB — —
PowerPC 620 PC 1996 32 KB/32 KB — —
PowerPC G4 PC/server 1999 32 KB/32 KB 256 KB to 1 MB 2 MB
IBM S/390 G4 Mainframe 1997 32 KB 256 KB 2 MB
IBM S/390 G6 Mainframe 1999 256 KB 8 MB —
Pentium 4 PC/server 2000 8 KB/8 KB 256 KB —
IBM SP High-end server 2000 64 KB/32 KB 8 MB —
CRAY MTAb Supercomputer 2000 8 KB 2 MB —
Itanium PC/server 2001 16 KB/16 KB 96 KB 4 MB
SGI Origin 2001 High-end server 2001 32 KB/32 KB 4 MB —
Itanium 2 PC/server 2002 32 KB 256 KB 6 MB
IBM POWER5 High-end server 2003 64 KB 1.9 MB 36 MB
CRAY XD-1 Supercomputer 2004 64 KB/64 KB 1 MB —