SlideShare ist ein Scribd-Unternehmen logo
1 von 8
Downloaden Sie, um offline zu lesen
Cache Mapping
COMP375 1
Cache MappingCache Mapping
COMP375 Computer Architecture
d O i tiand Organization
Goals
• Understand how the cache system finds a
d t it i th hdata item in the cache.
• Be able to break an address into the fields
used by the different cache mapping
schemes.
Cache Challenges
• The challenge in cache design is to ensure
th t th d i d d t d i t tithat the desired data and instructions are
in the cache. The cache should achieve a
high hit ratio.
• The cache system must be quickly
searchable as it is checked every memorysearchable as it is checked every memory
reference.
Cache to Ram Ratio
• A processor might have 512 KB of cache
d 512 MB f RAMand 512 MB of RAM.
• There may be 1000 times more RAM than
cache.
• The cache algorithms have to carefully
select the 0 1% of the memory that is likelyselect the 0.1% of the memory that is likely
to be most accessed.
Cache Mapping
COMP375 2
Tag Fields
• A cache line contains two fields
– Data from RAM
– The address of the block currently in the
cache.
• The part of the cache line that stores the
address of the block is called the tag field.
• Many different addresses can be in any• Many different addresses can be in any
given cache line. The tag field specifies
the address currently in the cache line.
• Only the upper address bits are needed.
Cache Lines
• The cache memory is divided into blocks
or lines. Currently lines can range from 16y g
to 64 bytes.
• Data is copied to and from the cache one
line at a time.
• The lower log2(line size) bits of an address
if i l b i lispecify a particular byte in a line.
Line address Offset
Line Example
0110010100
0110010101
0110010110
With a line size of 4,
0110010110
0110010111
0110011000
0110011001
0110011010
0110011011
These boxes
represent RAM
addresses
the offset is the
log2(4) = 2 bits.
The lower 2 bits
0110011011
0110011100
0110011101
0110011110
0110011111
specify which byte
in the line
Computer Science Search
• If you ask COMP285 students how to
h f d t it th h ld b blsearch for a data item, they should be able
to tell you
• Linear Search – O(n)
Binary Search O(log n)• Binary Search – O(log2n)
• Hashing – O(1)
• Parallel Search – O(n/p)
Cache Mapping
COMP375 3
Mapping
• The memory system has to quickly
determine if a given address is in the cache.
• There are three popular methods of
mapping addresses to cache locations.
–Fully Associative – Search the entire
cache for an address.
Direct Each address has a specific–Direct – Each address has a specific
place in the cache.
–Set Associative – Each address can be
in any of a small set of cache locations.
Direct Mapping
• Each location in RAM has one specific
place in cache where the data will be held.
C id th h t b lik• Consider the cache to be like an array.
Part of the address is used as index into
the cache to identify where the data will be
held.
• Since a data block from RAM can only bey
in one specific line in the cache, it must
always replace the one block that was
already there. There is no need for a
replacement algorithm.
Direct Cache Addressing
• The lower log2(line size) bits define which
b t i th bl kbyte in the block
• The next log2(number of lines) bits defines
which line of the cache
• The remaining upper bits are the tag field.
Tag Line Offset
Cache Constants
• cache size / line size = number of lines
• log2(line size) = bits for offset
• log2(number of lines) = bits for cache index
• remaining upper bits = tag address bits
Cache Mapping
COMP375 4
Example direct address
Assume you have
• 32 bit addresses (can address 4 GB)
• 64 byte lines (offset is 6 bits)
• 32 KB of cache
• Number of lines = 32 KB / 64 = 512
• Bits to specify which line = log2(512) = 9
Tag Line Offset
6 bits9 bits17 bits
Example Address
• Using the previous direct mapping scheme
ith 17 bit t 9 bit i d d 6 bit ff twith 17 bit tag, 9 bit index and 6 bit offset
01111101011101110001101100111000
01111101011101110 001101100 111000
Tag Index offset
• Compare the tag field of line 001101100
(10810) for the value 01111101011101110. If it
matches, return byte 111000 (5610) of the line
How many bits are in the tag, line
and offset fields?
25% 25%25%25%Direct Mapping
24 bit addresses
64K bytes of cache
16 byte cache lines
1. tag=4, line=16, offset=4
tag=4,line=16,offset=4
tag=4,line=14,offset=6
tag=8,line=12,offset=4
tag=6,line=12,offset=6
2. tag=4, line=14, offset=6
3. tag=8, line=12, offset=4
4. tag=6, line=12, offset=6
Associative Mapping
• In associative cache mapping, the data
from any location in RAM can be stored in
any location in cacheany location in cache.
• When the processor wants an address, all
tag fields in the cache as checked to
determine if the data is already in the
cache.
• Each tag line requires circuitry to compare
the desired address with the tag field.
• All tag fields are checked in parallel.
Cache Mapping
COMP375 5
Associative Cache Mapping
• The lower log2(line size) bits define which
b t i th bl kbyte in the block
• The remaining upper bits are the tag field.
• For a 4 GB address space with 128 KB
cache and 32 byte blocks:
Tag Offset
5 bits27 bits
Example Address
• Using the previous associate mapping
h ith 27 bit t d 5 bit ff tscheme with 27 bit tag and 5 bit offset
01111101011101110001101100111000
011111010111011100011011001 11000
Tag offset
• Compare all tag fields for the value
011111010111011100011011001. If a match
is found, return byte 11000 (2410
) of the line
How many bits are in the tag and
offset fields?
25% 25%25%25%Associative Mapping
24 bit addresses
128K bytes of cache
64 byte cache lines
1. tag= 20, offset=4
tag=
20,offset=4tag=19,offset=5tag=18,offset=6tag=16,offset=8
2. tag=19, offset=5
3. tag=18, offset=6
4. tag=16, offset=8
Set Associative Mapping
• Set associative mapping is a mixture of
direct and associative mapping.pp g
• The cache lines are grouped into sets.
• The number of lines in a set can vary from
2 to 16.
• A portion of the address is used to specify
which set will hold an address.
• The data can be stored in any of the lines
in the set.
Cache Mapping
COMP375 6
Set Associative Mapping
• When the processor wants an address, it
indexes to the set and then searches theindexes to the set and then searches the
tag fields of all lines in the set for the
desired address.
• n = cache size / line size = number of lines
• b = log2(line size) = bit for offset
• w = number of lines / set
• s = n / w = number of sets
Example Set Associative
Assume you have
• 32 bit addresses
• 32 KB of cache 64 byte lines
• Number of lines = 32 KB / 64 = 512
• 4 way set associative
• Number of sets = 512 / 4 = 128
• Set bits = log2(128) = 7
Tag Set Offset
6 bits7 bits19 bits
Example Address
• Using the previous set-associate mapping
ith 19 bit t 7 bit i d d 6 bit ff twith 19 bit tag, 7 bit index and 6 bit offset
01111101011101110001101100111000
0111110101110111000 1101100 111000
Tag Index offset
• Compare the tag fields of lines 110110000 to
110110011 for the value
0111110101110111000. If a match is found,
return byte 111000 (56) of that line
How many bits are in the tag, set
and offset fields?
25% 25%25%25%2-way Set Associative
24 bit addresses
128K bytes of cache
16 byte cache lines
1. tag=8, set = 12, offset=4
tag=8,set=
12,offset=4
tag=16,set=
4,offset=4
tag=12,set=
8,offset=4
tag=10,set=
10,offset=4
g
2. tag=16, set = 4, offset=4
3. tag=12, set = 8, offset=4
4. tag=10, set = 10, offset=4
Cache Mapping
COMP375 7
Replacement policy
• When a cache miss occurs, data is copied
into some location in cache.into some location in cache.
• With Set Associative or Fully Associative
mapping, the system must decide where
to put the data and what values will be
replaced.
• Cache performance is greatly affected by
properly choosing data that is unlikely to
be referenced again.
Replacement Options
• First In First Out (FIFO)
• Least Recently Used (LRU)
• Pseudo LRU
• Random
LRU replacement
• LRU is easy to implement for 2 way set
i tiassociative.
• You only need one bit per set to indicate
which line in the set was most recently used
• LRU is difficult to implement for larger ways.
For an N way mapping there are N!• For an N way mapping, there are N!
different permutations of use orders.
• It would require log2(N!) bits to keep track.
Pseudo LRU
• Pseudo LRU is frequently used in set
associative mapping.pp g
• In pseudo LRU there is a bit for each half
of a group indicating which have was most
recently used.
• For 4 way set associative, one bit
i di h h lindicates that the upper two or lower two
was most recently used. For each half
another bit specifies which of the two lines
was most recently used. 3 bits total.
Cache Mapping
COMP375 8
Comparison of Mapping
Fully Associative
A i i i k h b b i• Associative mapping works the best, but is
complex to implement. Each tag line
requires circuitry to compare the desired
address with the tag field.
• Some special purpose cache such as theSome special purpose cache, such as the
virtual memory Translation Lookaside
Buffer (TLB) is an associative cache.
Comparison of Mapping
Direct
• Direct has the lowest performance, but is
easiest to implement.
• Direct is often used for instruction cache.
• Sequential addresses fill a cache line and
then go to the next cache line.
• Intel Pentium level 1 instruction cache
uses direct mapping.
Comparison of Mapping
Set Associative
• Set associative is a compromise between
the other two. The bigger the “way” the
better the performance, but the more
complex and expensive.
• Intel Pentium uses 4 way set associative
caching for level 1 data and 8 way set
associative for level 2.

Weitere ähnliche Inhalte

Was ist angesagt?

Cache memory
Cache memoryCache memory
Cache memory
Anuj Modi
 
02 Computer Evolution And Performance
02  Computer  Evolution And  Performance02  Computer  Evolution And  Performance
02 Computer Evolution And Performance
Jeanie Delos Arcos
 

Was ist angesagt? (20)

instruction format and addressing modes
instruction format and addressing modesinstruction format and addressing modes
instruction format and addressing modes
 
Cache Memory
Cache MemoryCache Memory
Cache Memory
 
Memory mapping
Memory mappingMemory mapping
Memory mapping
 
8085 DATA TRANSFER INSTRUCTIONS
8085 DATA TRANSFER INSTRUCTIONS8085 DATA TRANSFER INSTRUCTIONS
8085 DATA TRANSFER INSTRUCTIONS
 
Microprocessor 80286
Microprocessor 80286Microprocessor 80286
Microprocessor 80286
 
Cache memory
Cache memoryCache memory
Cache memory
 
Interleaved memory
Interleaved memoryInterleaved memory
Interleaved memory
 
Memory system
Memory systemMemory system
Memory system
 
Instruction Set Architecture
Instruction Set ArchitectureInstruction Set Architecture
Instruction Set Architecture
 
Floating point arithmetic operations (1)
Floating point arithmetic operations (1)Floating point arithmetic operations (1)
Floating point arithmetic operations (1)
 
Memory interleaving
Memory interleavingMemory interleaving
Memory interleaving
 
Cache memory
Cache memoryCache memory
Cache memory
 
Computer architecture pipelining
Computer architecture pipeliningComputer architecture pipelining
Computer architecture pipelining
 
Cache memory ppt
Cache memory ppt  Cache memory ppt
Cache memory ppt
 
80286 microprocessor
80286 microprocessor80286 microprocessor
80286 microprocessor
 
Computer organization memory hierarchy
Computer organization memory hierarchyComputer organization memory hierarchy
Computer organization memory hierarchy
 
Architecture of 80286 microprocessor
Architecture of 80286 microprocessorArchitecture of 80286 microprocessor
Architecture of 80286 microprocessor
 
02 Computer Evolution And Performance
02  Computer  Evolution And  Performance02  Computer  Evolution And  Performance
02 Computer Evolution And Performance
 
Chapter 2 - Computer Evolution and Performance
Chapter 2 - Computer Evolution and PerformanceChapter 2 - Computer Evolution and Performance
Chapter 2 - Computer Evolution and Performance
 
Register transfer language
Register transfer languageRegister transfer language
Register transfer language
 

Andere mochten auch

Virtual memory
Virtual memoryVirtual memory
Virtual memory
Anuj Modi
 

Andere mochten auch (17)

Memory Mapping Cache
Memory Mapping CacheMemory Mapping Cache
Memory Mapping Cache
 
cache memory
cache memorycache memory
cache memory
 
Cache memory presentation
Cache memory presentationCache memory presentation
Cache memory presentation
 
Address mapping
Address mappingAddress mapping
Address mapping
 
Cache presentation on Mapping and its types
Cache presentation on Mapping and its typesCache presentation on Mapping and its types
Cache presentation on Mapping and its types
 
Memory mapping
Memory mappingMemory mapping
Memory mapping
 
Cache memory
Cache memoryCache memory
Cache memory
 
Memory mapping techniques and low power memory design
Memory mapping techniques and low power memory designMemory mapping techniques and low power memory design
Memory mapping techniques and low power memory design
 
Cache memory
Cache memoryCache memory
Cache memory
 
Paging
PagingPaging
Paging
 
04 Cache Memory
04  Cache  Memory04  Cache  Memory
04 Cache Memory
 
Memory Organization
Memory OrganizationMemory Organization
Memory Organization
 
Chapter 9 - Virtual Memory
Chapter 9 - Virtual MemoryChapter 9 - Virtual Memory
Chapter 9 - Virtual Memory
 
Virtual memory
Virtual memoryVirtual memory
Virtual memory
 
Virtual memory ppt
Virtual memory pptVirtual memory ppt
Virtual memory ppt
 
Os Swapping, Paging, Segmentation and Virtual Memory
Os Swapping, Paging, Segmentation and Virtual MemoryOs Swapping, Paging, Segmentation and Virtual Memory
Os Swapping, Paging, Segmentation and Virtual Memory
 
Paging and Segmentation in Operating System
Paging and Segmentation in Operating SystemPaging and Segmentation in Operating System
Paging and Segmentation in Operating System
 

Ähnlich wie Cache mapping

total cache memory is here.please read this for better knowledge
total cache memory is here.please read this for better knowledgetotal cache memory is here.please read this for better knowledge
total cache memory is here.please read this for better knowledge
JoysreeNandy
 

Ähnlich wie Cache mapping (20)

Cache recap
Cache recapCache recap
Cache recap
 
Cache recap
Cache recapCache recap
Cache recap
 
Cache recap
Cache recapCache recap
Cache recap
 
Cache recap
Cache recapCache recap
Cache recap
 
Cache recap
Cache recapCache recap
Cache recap
 
Cache recap
Cache recapCache recap
Cache recap
 
Cache recap
Cache recapCache recap
Cache recap
 
Memory caching
Memory cachingMemory caching
Memory caching
 
Memory caching
Memory cachingMemory caching
Memory caching
 
Memory caching
Memory cachingMemory caching
Memory caching
 
Memory caching
Memory cachingMemory caching
Memory caching
 
Memory caching
Memory cachingMemory caching
Memory caching
 
Memory caching
Memory cachingMemory caching
Memory caching
 
Advance computer architecture
Advance computer architectureAdvance computer architecture
Advance computer architecture
 
total cache memory is here.please read this for better knowledge
total cache memory is here.please read this for better knowledgetotal cache memory is here.please read this for better knowledge
total cache memory is here.please read this for better knowledge
 
Chache memory ( chapter number 4 ) by William stalling
Chache memory ( chapter number 4 ) by William stallingChache memory ( chapter number 4 ) by William stalling
Chache memory ( chapter number 4 ) by William stalling
 
Cmp.pptx
Cmp.pptxCmp.pptx
Cmp.pptx
 
Cache Memory for Computer Architecture.ppt
Cache Memory for Computer Architecture.pptCache Memory for Computer Architecture.ppt
Cache Memory for Computer Architecture.ppt
 
cache memory introduction, level, function
cache memory introduction, level, functioncache memory introduction, level, function
cache memory introduction, level, function
 
cache memory
cache memorycache memory
cache memory
 

Kürzlich hochgeladen

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
Wonjun Hwang
 

Kürzlich hochgeladen (20)

Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development Companies
 
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Design Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptxDesign Guidelines for Passkeys 2024.pptx
Design Guidelines for Passkeys 2024.pptx
 
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
Event-Driven Architecture Masterclass: Integrating Distributed Data Stores Ac...
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!State of the Smart Building Startup Landscape 2024!
State of the Smart Building Startup Landscape 2024!
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 

Cache mapping

  • 1. Cache Mapping COMP375 1 Cache MappingCache Mapping COMP375 Computer Architecture d O i tiand Organization Goals • Understand how the cache system finds a d t it i th hdata item in the cache. • Be able to break an address into the fields used by the different cache mapping schemes. Cache Challenges • The challenge in cache design is to ensure th t th d i d d t d i t tithat the desired data and instructions are in the cache. The cache should achieve a high hit ratio. • The cache system must be quickly searchable as it is checked every memorysearchable as it is checked every memory reference. Cache to Ram Ratio • A processor might have 512 KB of cache d 512 MB f RAMand 512 MB of RAM. • There may be 1000 times more RAM than cache. • The cache algorithms have to carefully select the 0 1% of the memory that is likelyselect the 0.1% of the memory that is likely to be most accessed.
  • 2. Cache Mapping COMP375 2 Tag Fields • A cache line contains two fields – Data from RAM – The address of the block currently in the cache. • The part of the cache line that stores the address of the block is called the tag field. • Many different addresses can be in any• Many different addresses can be in any given cache line. The tag field specifies the address currently in the cache line. • Only the upper address bits are needed. Cache Lines • The cache memory is divided into blocks or lines. Currently lines can range from 16y g to 64 bytes. • Data is copied to and from the cache one line at a time. • The lower log2(line size) bits of an address if i l b i lispecify a particular byte in a line. Line address Offset Line Example 0110010100 0110010101 0110010110 With a line size of 4, 0110010110 0110010111 0110011000 0110011001 0110011010 0110011011 These boxes represent RAM addresses the offset is the log2(4) = 2 bits. The lower 2 bits 0110011011 0110011100 0110011101 0110011110 0110011111 specify which byte in the line Computer Science Search • If you ask COMP285 students how to h f d t it th h ld b blsearch for a data item, they should be able to tell you • Linear Search – O(n) Binary Search O(log n)• Binary Search – O(log2n) • Hashing – O(1) • Parallel Search – O(n/p)
  • 3. Cache Mapping COMP375 3 Mapping • The memory system has to quickly determine if a given address is in the cache. • There are three popular methods of mapping addresses to cache locations. –Fully Associative – Search the entire cache for an address. Direct Each address has a specific–Direct – Each address has a specific place in the cache. –Set Associative – Each address can be in any of a small set of cache locations. Direct Mapping • Each location in RAM has one specific place in cache where the data will be held. C id th h t b lik• Consider the cache to be like an array. Part of the address is used as index into the cache to identify where the data will be held. • Since a data block from RAM can only bey in one specific line in the cache, it must always replace the one block that was already there. There is no need for a replacement algorithm. Direct Cache Addressing • The lower log2(line size) bits define which b t i th bl kbyte in the block • The next log2(number of lines) bits defines which line of the cache • The remaining upper bits are the tag field. Tag Line Offset Cache Constants • cache size / line size = number of lines • log2(line size) = bits for offset • log2(number of lines) = bits for cache index • remaining upper bits = tag address bits
  • 4. Cache Mapping COMP375 4 Example direct address Assume you have • 32 bit addresses (can address 4 GB) • 64 byte lines (offset is 6 bits) • 32 KB of cache • Number of lines = 32 KB / 64 = 512 • Bits to specify which line = log2(512) = 9 Tag Line Offset 6 bits9 bits17 bits Example Address • Using the previous direct mapping scheme ith 17 bit t 9 bit i d d 6 bit ff twith 17 bit tag, 9 bit index and 6 bit offset 01111101011101110001101100111000 01111101011101110 001101100 111000 Tag Index offset • Compare the tag field of line 001101100 (10810) for the value 01111101011101110. If it matches, return byte 111000 (5610) of the line How many bits are in the tag, line and offset fields? 25% 25%25%25%Direct Mapping 24 bit addresses 64K bytes of cache 16 byte cache lines 1. tag=4, line=16, offset=4 tag=4,line=16,offset=4 tag=4,line=14,offset=6 tag=8,line=12,offset=4 tag=6,line=12,offset=6 2. tag=4, line=14, offset=6 3. tag=8, line=12, offset=4 4. tag=6, line=12, offset=6 Associative Mapping • In associative cache mapping, the data from any location in RAM can be stored in any location in cacheany location in cache. • When the processor wants an address, all tag fields in the cache as checked to determine if the data is already in the cache. • Each tag line requires circuitry to compare the desired address with the tag field. • All tag fields are checked in parallel.
  • 5. Cache Mapping COMP375 5 Associative Cache Mapping • The lower log2(line size) bits define which b t i th bl kbyte in the block • The remaining upper bits are the tag field. • For a 4 GB address space with 128 KB cache and 32 byte blocks: Tag Offset 5 bits27 bits Example Address • Using the previous associate mapping h ith 27 bit t d 5 bit ff tscheme with 27 bit tag and 5 bit offset 01111101011101110001101100111000 011111010111011100011011001 11000 Tag offset • Compare all tag fields for the value 011111010111011100011011001. If a match is found, return byte 11000 (2410 ) of the line How many bits are in the tag and offset fields? 25% 25%25%25%Associative Mapping 24 bit addresses 128K bytes of cache 64 byte cache lines 1. tag= 20, offset=4 tag= 20,offset=4tag=19,offset=5tag=18,offset=6tag=16,offset=8 2. tag=19, offset=5 3. tag=18, offset=6 4. tag=16, offset=8 Set Associative Mapping • Set associative mapping is a mixture of direct and associative mapping.pp g • The cache lines are grouped into sets. • The number of lines in a set can vary from 2 to 16. • A portion of the address is used to specify which set will hold an address. • The data can be stored in any of the lines in the set.
  • 6. Cache Mapping COMP375 6 Set Associative Mapping • When the processor wants an address, it indexes to the set and then searches theindexes to the set and then searches the tag fields of all lines in the set for the desired address. • n = cache size / line size = number of lines • b = log2(line size) = bit for offset • w = number of lines / set • s = n / w = number of sets Example Set Associative Assume you have • 32 bit addresses • 32 KB of cache 64 byte lines • Number of lines = 32 KB / 64 = 512 • 4 way set associative • Number of sets = 512 / 4 = 128 • Set bits = log2(128) = 7 Tag Set Offset 6 bits7 bits19 bits Example Address • Using the previous set-associate mapping ith 19 bit t 7 bit i d d 6 bit ff twith 19 bit tag, 7 bit index and 6 bit offset 01111101011101110001101100111000 0111110101110111000 1101100 111000 Tag Index offset • Compare the tag fields of lines 110110000 to 110110011 for the value 0111110101110111000. If a match is found, return byte 111000 (56) of that line How many bits are in the tag, set and offset fields? 25% 25%25%25%2-way Set Associative 24 bit addresses 128K bytes of cache 16 byte cache lines 1. tag=8, set = 12, offset=4 tag=8,set= 12,offset=4 tag=16,set= 4,offset=4 tag=12,set= 8,offset=4 tag=10,set= 10,offset=4 g 2. tag=16, set = 4, offset=4 3. tag=12, set = 8, offset=4 4. tag=10, set = 10, offset=4
  • 7. Cache Mapping COMP375 7 Replacement policy • When a cache miss occurs, data is copied into some location in cache.into some location in cache. • With Set Associative or Fully Associative mapping, the system must decide where to put the data and what values will be replaced. • Cache performance is greatly affected by properly choosing data that is unlikely to be referenced again. Replacement Options • First In First Out (FIFO) • Least Recently Used (LRU) • Pseudo LRU • Random LRU replacement • LRU is easy to implement for 2 way set i tiassociative. • You only need one bit per set to indicate which line in the set was most recently used • LRU is difficult to implement for larger ways. For an N way mapping there are N!• For an N way mapping, there are N! different permutations of use orders. • It would require log2(N!) bits to keep track. Pseudo LRU • Pseudo LRU is frequently used in set associative mapping.pp g • In pseudo LRU there is a bit for each half of a group indicating which have was most recently used. • For 4 way set associative, one bit i di h h lindicates that the upper two or lower two was most recently used. For each half another bit specifies which of the two lines was most recently used. 3 bits total.
  • 8. Cache Mapping COMP375 8 Comparison of Mapping Fully Associative A i i i k h b b i• Associative mapping works the best, but is complex to implement. Each tag line requires circuitry to compare the desired address with the tag field. • Some special purpose cache such as theSome special purpose cache, such as the virtual memory Translation Lookaside Buffer (TLB) is an associative cache. Comparison of Mapping Direct • Direct has the lowest performance, but is easiest to implement. • Direct is often used for instruction cache. • Sequential addresses fill a cache line and then go to the next cache line. • Intel Pentium level 1 instruction cache uses direct mapping. Comparison of Mapping Set Associative • Set associative is a compromise between the other two. The bigger the “way” the better the performance, but the more complex and expensive. • Intel Pentium uses 4 way set associative caching for level 1 data and 8 way set associative for level 2.