SlideShare ist ein Scribd-Unternehmen logo
1 von 12
Downloaden Sie, um offline zu lesen
Computer Architecture

Babak Kia
Adjunct Professor
Boston University
College of Engineering
Email: bkia -at- bu.edu

ENG SC757 - Advanced Microprocessor Design

Computer Architecture
Computer Architecture is the theory
behind the operational design of a
computer system
This is a term which is applied to a vast
array of computer disciplines ranging
from low level instruction set and logic
design, to higher level aspects of a
computer’s design such as the memory
subsystem and bus structure
In this lecture we will focus on the latter
definition of the term

Topics Discussed
Memory Hierarchy
Memory Performance
• Amdahl’s Law and the Locality of Reference

Cache Organization and Operation
Random Access Memory (DRAM, SRAM)
Non Volatile Memory (Flash)
Bus Interfaces
• How to connect the processor to memory & I/O

1
Memory Hierarchy
A simple axiom of hardware design:
smaller is faster

Memory Hierarchy
Register
Cache

Access Times
1 cy.
1-2 cy.

Main Memory

10-15 cy.

Disk

1000+ cy.

Memory Performance
Performance is measured either in terms
of throughput or response time
The goal of memory design is to increase
memory bandwidth and decrease access
time latencies
We take advantage of three principles of
computing in order to achieve this goal:
• Make the common case faster
• Principle of Locality
• Smaller is Faster

Make the Common Case Faster
Always want to improve the frequent
event as opposed to the infrequent event
Amdahl’s Law quantifies this process:
• The performance improvement to be gained
from using some faster mode of execution is
limited by the fraction of the time the faster
mode can be used

Amdahl’s Law essentially guides you to
spend your resources on an area where
the time is most spent

2
The Principle of Locality
Locality of Reference is one of the most
important properties of a program:
• It is a widely held rule of thumb that 90% of
execution time is spent in only 10% of the code

Temporal Locality
• If a location is referenced, there is a high
likelihood that it will be referenced again in the
near future (time)

Spatial Locality
• If you reference instruction or data at a certain
location, there is a high likelihood that nearby
addresses will also be referenced

Cache
Cache is a small, fast
memory which holds
copies of recently
accessed instruction
and data
Key to the performance
of modern
Microprocessors
There could be up to two
levels of cache, the
picture to the right is
Intel Pentium M, the
large block to the left
showing 2 MB of L2
cache

Cache
Taking advantage of the Principle of
Locality, cache loads itself with contents
of data which was recently accessed
But this only addresses temporal locality!
Therefore to take advantage of spatial
locality as well, cache subsystems are
designed to read anywhere between 16128 bytes of data at a time.

3
Cache Definitions
Cache is simply defined as a temporary storage
area for data which is frequently accessed
A cache hit occurs when the CPU is looking for
data which is already contained within the cache
A cache miss is the alternate scenario, when a
cache line needs to be read from main memory
The percentages of accesses which result in
cache hits are known as the cache hit rate or hit
ratio of the cache
If the cache is full, it needs to evict a cache line
before it can bring in a new cache line. This is
done through a heuristic process and is known as
the cache replacement policy

Cache Organization
Cache is organized not in bytes, but as
blocks of cache lines, with each line
containing some number of bytes (16-64)
Unlike normal memory, cache lines do
not have fixed addresses, which enables
the cache system to populate each cache
line with a unique (non-contiguous)
address
There are three methods for filling a
cache line
• Fully Associative – The most flexible
• Direct Mapped – The most basic
• Set Associative – A combination of the two

Fully Associative
In a fully associative cache subsystem,
the cache controller can place a block of
bytes in any of the available cache lines
Though this makes the system greatly
flexible, the added circuitry to perform
this function increases the cost, and
worst, decreases the performance of the
cache!
Most of today’s cache systems are not
fully associative for this reason

4
Direct Mapped
In contrast to the fully associative cache
is the direct mapped cache system, also
called the one-way set associative
In this system, a block of main memory is
always loaded into the same cache line,
evicting the previous cache entry
This is not an ideal solution either
because in spite of its simplicity, it
doesn’t make an efficient use of the
cache
For this reason, not many systems are
built as direct mapped caches either

Set Associative
Set associative cache is a compromise
between fully associative and direct
mapped caching techniques
The idea is to break apart the cache into
n-sets of cache lines. This way the cache
subsystem uses a direct mapped scheme
to select a set, but then uses a fully
associative scheme to places the line
entry in any of the n cache lines within
the set
For n = 2, the cache subsystem is called a
two-way set associative cache

Cache Line Addressing
But how are cache lines addressed?
Caches include an address tag on each
line which gives it the frame address
• First, the tag of every cache line is checked in
parallel to see if it matches the address
provided by the CPU
• Then there must be a way to identify which
cache line is invalid, which is done through
adding a valid bit to the tag line
• Finally, a random or a Least Recently Used
(LRU) algorithm can be used to evict an invalid
cache line

5
Cache example

Tag

Index

Offset

Cache Operation
Most of the time the cache is busy filling cache
lines (reading from memory)
But the processor doesn’t write a cache line
which can be up to 128 bytes - it only writes
between 1 and 8 bytes
Therefore it must perform a read-modify-write
sequence on the cache line
Also, the cache uses one of two write operations:
• Write-through, where data is updated both on the
cache and in the main memory
• Write-back, where data is written to the cache, and
updated in the main memory only when the cache
line is replaced

Cache coherency – a very important subject! TBD
in a separate lecture…

Memory Technologies
There are basically two different memory
technologies used to store data in RAM:
Static RAM and Dynamic RAM

6
Static RAM
Uses 4-6 transistors to store a single bit of data
Provides a fast access time at the expense of
lower bit densities
For this reason registers and cache subsystems
are fabricated using SRAM technology
Static RAM is considerably more expensive than
Dynamic RAM
However, since it doesn’t need to be refreshed, its
power consumption is much lower than DRAM
Also, the absence of the refresh circuitry makes it
easier to interface to
The simplicity of the memory circuitry
compensates for the more costly technology

Dynamic RAM
The bulk of a processor’s main memory is
comprised of dynamic RAM
Manufacturers have focused on memory sizes
rather than speed
In contrast to SRAM, DRAM uses a single
transistor and capacitor to store a bit
DRAM requires that the address applied to the
device be asserted in a row address (RAS) and a
column address (CAS)
The requirement of RAS and CAS of course kills
the access time, but since it reduces package
pinout, it allows for higher memory densities

Dynamic RAM
RAS and CAS use the same pins, with each being
asserted during either the RAS or the CAS phase
of the address
There are two metrics used to describe DRAM’s
performance:
• Access time is defined as the time between
assertion of RAS to the availability of data
• Cycle time is defined as the minimum time before a
next access can be granted

Manufacturers like to quote access times, but
cycle times are more relevant because they
establish throughput of the system

7
Dynamic RAM
Of course the charge leaks slowly from the
storage capacitor in DRAM and it needs to be
refreshed continually
During the refresh phase, all accesses are heldoff, which happens once every 1 – 100 ms and
slightly impacts the throughput
DRAM bandwidth can be increased by operating it
in paged mode, when several CASs are applied for
each RAS
A notation such as 256x16 means 256 thousand
columns of cells standing 16 rows deep

Wait states
Memory access time is the amount of time from
the point an address is placed on the bus, to the
time when data is read
Normally, data is requested on C1 and read on C2:
C1

C2

However, for slower memories, wait states are
necessary:
C1

ws

ws

ws

Adding wait states to memory is not ideal. Adding
a single wait state doubles access time and
halves the speed of memory access!

Memory Timing
RAS - Row Address Strobe or Row Address Select
CAS - Column Address Strobe or Column Address
Select
tRAS - Active to precharge delay; this is the delay
between the precharge and activation of a row
tRCD - RAS to CAS Delay; the time required
between RAS and CAS access
tCL - (or CL) CAS Latency
tRP - RAS Precharge; the time required to switch
from one row to the next row, for example, switch
internal memory banks
tCLK – The length of a clock cycle

8
Memory Timing
Command Rate - the delay between Chip Select
(CS), or when an IC is selected and the time
commands can be issued to the IC
Latency - The time from when a request is made to
when it is answered; the total time required before
data can be written to or read from the memory.
Memory timing can be displayed as a sequence of
numbers such as 2-3-2-6-T1
• While all parameters are important to system
reliability, some parameters are more significant:
• This refers to the CL-tRCD-tRP-tRAS-Command Rate
sequence and is measured in clock cycles

Memory Timing
CAS Latency
• One of the most important parameters
• Delay between the CAS signal and the availability of
the data on the DQ pin
• Specially important since data is often accessed
sequentially (within the same row), so CAS timing
plays a key role in the performance of a system

tRCD
• The delay between the time a row is activated, to
when a column is activated
• For most situations (sequential), it is not a problem,
however becomes an issue for non-sequential
addressing

Basic DRAM Timing

9
ColdFire SRAM Timing

ColdFire SDRAM Timing

NvRAM
Both SRAM and DRAM are known as volatile memories
– turn the power off and they lose their contents
Computers also rely on a wide array of Non-volatile
Random Access Memories or NvRAMs, which retain
their data even if the power is turned off. They come in
many forms:
•
•
•
•

ROM - Read Only Memory
EPROM - Erasable Programmable Read Only Memory
EEPROM - Electrically Erasable Programmable Read Only
Memory
Flash – A form of EEPROM which
allows multiple locations to be
erased or written in one programming
operation. Normal EEPROM only
allows for one location to be erased
or programmed

10
Flash
Flash memories combine the best features of Non
volatile memories.
• Very high densities (256Mbits or more!)
• Low cost (mass produced, used in everything from
cell phones to USB flash drives)
• Very fast (to read, but not to write)
• Reprogrammable
• And of course non volatile

One disadvantage is that flash
can’t be erased a byte at a time,
it must be erased one sector at
a time
Flash parts have a guaranteed
lifetime of ~100,000 write cycles

The System Bus
The system bus connects the microprocessors to
the memory and I/O subsystems
It is comprised of three major busses: The
address bus, the data bus, and the control bus
Both address and data busses come in variety of
sizes.
Address busses generally range from 20 to 36 bits
(1 MB address space – 64 GBytes)
Data busses are either 8, 16, 32, or 64 bits wide
The control bus on the other hand varies amongst
processors. It is a collection of control signals
with which the processor communicates with the
rest of the system

The Control Bus
The following are generally part of the
Control Bus:
• Read/Write signal, which specify the direction
of data flow
• Byte Enable signals, which allow 16, 32, and 64
bit busses deal with smaller chunks of data
• Some processors have signaling which
identifies between memory and I/O addresses
• Other signals include interrupt lines, parity
lines, bus clock, and status signals

11
The ColdFire System Bus
OE# controls external
bus transceivers
TA# indicates successful
completion of requested
data transfer
TEA# upon assertion
terminates the bus cycle

Portions of this power point presentation may have been taken from relevant users and technical manuals. Original content Copyright © 2005 – Babak Kia

12

Weitere ähnliche Inhalte

Was ist angesagt?

Cache memory and cache
Cache memory and cacheCache memory and cache
Cache memory and cacheVISHAL DONGA
 
Memory technology and optimization in Advance Computer Architechture
Memory technology and optimization in Advance Computer ArchitechtureMemory technology and optimization in Advance Computer Architechture
Memory technology and optimization in Advance Computer ArchitechtureShweta Ghate
 
Cache optimization
Cache optimizationCache optimization
Cache optimizationKavi Kathir
 
Hierarchical Memory System
Hierarchical Memory SystemHierarchical Memory System
Hierarchical Memory SystemJenny Galino
 
Computer Memory Hierarchy Computer Architecture
Computer Memory Hierarchy Computer ArchitectureComputer Memory Hierarchy Computer Architecture
Computer Memory Hierarchy Computer ArchitectureHaris456
 
Csc1401 lecture05 - cache memory
Csc1401   lecture05 - cache memoryCsc1401   lecture05 - cache memory
Csc1401 lecture05 - cache memoryIIUM
 
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...Ishan Thakkar
 
Cache memory ppt
Cache memory ppt  Cache memory ppt
Cache memory ppt Arpita Naik
 
Cache memory in Eng:Ahmed Ali Ahmed
Cache memory in Eng:Ahmed Ali AhmedCache memory in Eng:Ahmed Ali Ahmed
Cache memory in Eng:Ahmed Ali AhmedAhmed Ali Ahmed
 
Project Presentation Final
Project Presentation FinalProject Presentation Final
Project Presentation FinalDhritiman Halder
 
Cache memory and virtual memory
Cache memory and virtual memoryCache memory and virtual memory
Cache memory and virtual memoryPrakharBansal29
 

Was ist angesagt? (20)

Cache memory and cache
Cache memory and cacheCache memory and cache
Cache memory and cache
 
Cache memory
Cache memoryCache memory
Cache memory
 
Memory technology and optimization in Advance Computer Architechture
Memory technology and optimization in Advance Computer ArchitechtureMemory technology and optimization in Advance Computer Architechture
Memory technology and optimization in Advance Computer Architechture
 
Cache optimization
Cache optimizationCache optimization
Cache optimization
 
Cache memory
Cache memoryCache memory
Cache memory
 
Hierarchical Memory System
Hierarchical Memory SystemHierarchical Memory System
Hierarchical Memory System
 
Cache Memory
Cache MemoryCache Memory
Cache Memory
 
Computer Memory Hierarchy Computer Architecture
Computer Memory Hierarchy Computer ArchitectureComputer Memory Hierarchy Computer Architecture
Computer Memory Hierarchy Computer Architecture
 
Csc1401 lecture05 - cache memory
Csc1401   lecture05 - cache memoryCsc1401   lecture05 - cache memory
Csc1401 lecture05 - cache memory
 
Cache memory
Cache memoryCache memory
Cache memory
 
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
Massed Refresh: An Energy-Efficient Technique to Reduce Refresh Overhead in H...
 
Cache memory ppt
Cache memory ppt  Cache memory ppt
Cache memory ppt
 
Cache memory in Eng:Ahmed Ali Ahmed
Cache memory in Eng:Ahmed Ali AhmedCache memory in Eng:Ahmed Ali Ahmed
Cache memory in Eng:Ahmed Ali Ahmed
 
Oversimplified CA
Oversimplified CAOversimplified CA
Oversimplified CA
 
Computer architecture
Computer architectureComputer architecture
Computer architecture
 
Project Presentation Final
Project Presentation FinalProject Presentation Final
Project Presentation Final
 
Lecture3
Lecture3Lecture3
Lecture3
 
Cache memory
Cache memoryCache memory
Cache memory
 
Cache memory and virtual memory
Cache memory and virtual memoryCache memory and virtual memory
Cache memory and virtual memory
 
Caches microP
Caches microPCaches microP
Caches microP
 

Ähnlich wie Computer architecture for HNDIT

Chapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworldChapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworldPraveen Kumar
 
cache memory and types of cache memory,
cache memory  and types of cache memory,cache memory  and types of cache memory,
cache memory and types of cache memory,ashima967262
 
Memory organization
Memory organizationMemory organization
Memory organizationishapadhy
 
Memory Hierarchy PPT of Computer Organization
Memory Hierarchy PPT of Computer OrganizationMemory Hierarchy PPT of Computer Organization
Memory Hierarchy PPT of Computer Organization2022002857mbit
 
CS304PC:Computer Organization and Architecture Session 29 Memory organization...
CS304PC:Computer Organization and Architecture Session 29 Memory organization...CS304PC:Computer Organization and Architecture Session 29 Memory organization...
CS304PC:Computer Organization and Architecture Session 29 Memory organization...Asst.prof M.Gokilavani
 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and ArchitectureSubhasis Dash
 
memeoryorganization PPT for organization of memories
memeoryorganization PPT for organization of memoriesmemeoryorganization PPT for organization of memories
memeoryorganization PPT for organization of memoriesGauravDaware2
 
cachememory-210517060741 (1).pdf
cachememory-210517060741 (1).pdfcachememory-210517060741 (1).pdf
cachememory-210517060741 (1).pdfOmGadekar2
 
Limitations of memory system performance
Limitations of memory system performanceLimitations of memory system performance
Limitations of memory system performanceSyed Zaid Irshad
 
Computer Architecture | Computer Fundamental and Organization
Computer Architecture | Computer Fundamental and OrganizationComputer Architecture | Computer Fundamental and Organization
Computer Architecture | Computer Fundamental and OrganizationSmit Luvani
 
computer architecture and organisation
computer architecture and organisationcomputer architecture and organisation
computer architecture and organisationRamjiChaurasiya
 
unit4 and unit5.pptx
unit4 and unit5.pptxunit4 and unit5.pptx
unit4 and unit5.pptxbobbyk11
 

Ähnlich wie Computer architecture for HNDIT (20)

Cache simulator
Cache simulatorCache simulator
Cache simulator
 
Chapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworldChapter5 the memory-system-jntuworld
Chapter5 the memory-system-jntuworld
 
COA (Unit_4.pptx)
COA (Unit_4.pptx)COA (Unit_4.pptx)
COA (Unit_4.pptx)
 
cache memory and types of cache memory,
cache memory  and types of cache memory,cache memory  and types of cache memory,
cache memory and types of cache memory,
 
Memory organization
Memory organizationMemory organization
Memory organization
 
Memory Hierarchy PPT of Computer Organization
Memory Hierarchy PPT of Computer OrganizationMemory Hierarchy PPT of Computer Organization
Memory Hierarchy PPT of Computer Organization
 
Lecture2
Lecture2Lecture2
Lecture2
 
CS304PC:Computer Organization and Architecture Session 29 Memory organization...
CS304PC:Computer Organization and Architecture Session 29 Memory organization...CS304PC:Computer Organization and Architecture Session 29 Memory organization...
CS304PC:Computer Organization and Architecture Session 29 Memory organization...
 
Computer Organisation and Architecture
Computer Organisation and ArchitectureComputer Organisation and Architecture
Computer Organisation and Architecture
 
Cache memory
Cache memoryCache memory
Cache memory
 
memeoryorganization PPT for organization of memories
memeoryorganization PPT for organization of memoriesmemeoryorganization PPT for organization of memories
memeoryorganization PPT for organization of memories
 
Sram pdf
Sram pdfSram pdf
Sram pdf
 
MEMORY & I/O SYSTEMS
MEMORY & I/O SYSTEMS                          MEMORY & I/O SYSTEMS
MEMORY & I/O SYSTEMS
 
cachememory-210517060741 (1).pdf
cachememory-210517060741 (1).pdfcachememory-210517060741 (1).pdf
cachememory-210517060741 (1).pdf
 
Limitations of memory system performance
Limitations of memory system performanceLimitations of memory system performance
Limitations of memory system performance
 
COA notes
COA notesCOA notes
COA notes
 
Computer Architecture | Computer Fundamental and Organization
Computer Architecture | Computer Fundamental and OrganizationComputer Architecture | Computer Fundamental and Organization
Computer Architecture | Computer Fundamental and Organization
 
Cache Memory.pptx
Cache Memory.pptxCache Memory.pptx
Cache Memory.pptx
 
computer architecture and organisation
computer architecture and organisationcomputer architecture and organisation
computer architecture and organisation
 
unit4 and unit5.pptx
unit4 and unit5.pptxunit4 and unit5.pptx
unit4 and unit5.pptx
 

Mehr von tjunicornfx

C++ Question & Answer
C++ Question & AnswerC++ Question & Answer
C++ Question & Answertjunicornfx
 
Mechanical element of a CNC Machine
Mechanical element of a CNC MachineMechanical element of a CNC Machine
Mechanical element of a CNC Machinetjunicornfx
 
AC in Vehicle -Sinhala note (Sri lanka) 1
AC in Vehicle -Sinhala note (Sri lanka) 1AC in Vehicle -Sinhala note (Sri lanka) 1
AC in Vehicle -Sinhala note (Sri lanka) 1tjunicornfx
 
AC in Vehicle -Sinhala note (Sri lanka) -3
AC in Vehicle -Sinhala note (Sri lanka) -3AC in Vehicle -Sinhala note (Sri lanka) -3
AC in Vehicle -Sinhala note (Sri lanka) -3tjunicornfx
 
AC in Vehicle -Sinhala note (Sri lanka) -2
AC in Vehicle -Sinhala note (Sri lanka) -2AC in Vehicle -Sinhala note (Sri lanka) -2
AC in Vehicle -Sinhala note (Sri lanka) -2tjunicornfx
 
Artificail Intelligent lec-1
Artificail Intelligent lec-1Artificail Intelligent lec-1
Artificail Intelligent lec-1tjunicornfx
 
Procedures functions structures in VB.Net
Procedures  functions  structures in VB.NetProcedures  functions  structures in VB.Net
Procedures functions structures in VB.Nettjunicornfx
 
Security architecture
Security architectureSecurity architecture
Security architecturetjunicornfx
 
04 introduction to computer networking
04 introduction to computer networking04 introduction to computer networking
04 introduction to computer networkingtjunicornfx
 

Mehr von tjunicornfx (10)

C++ Question & Answer
C++ Question & AnswerC++ Question & Answer
C++ Question & Answer
 
ASP
ASPASP
ASP
 
Mechanical element of a CNC Machine
Mechanical element of a CNC MachineMechanical element of a CNC Machine
Mechanical element of a CNC Machine
 
AC in Vehicle -Sinhala note (Sri lanka) 1
AC in Vehicle -Sinhala note (Sri lanka) 1AC in Vehicle -Sinhala note (Sri lanka) 1
AC in Vehicle -Sinhala note (Sri lanka) 1
 
AC in Vehicle -Sinhala note (Sri lanka) -3
AC in Vehicle -Sinhala note (Sri lanka) -3AC in Vehicle -Sinhala note (Sri lanka) -3
AC in Vehicle -Sinhala note (Sri lanka) -3
 
AC in Vehicle -Sinhala note (Sri lanka) -2
AC in Vehicle -Sinhala note (Sri lanka) -2AC in Vehicle -Sinhala note (Sri lanka) -2
AC in Vehicle -Sinhala note (Sri lanka) -2
 
Artificail Intelligent lec-1
Artificail Intelligent lec-1Artificail Intelligent lec-1
Artificail Intelligent lec-1
 
Procedures functions structures in VB.Net
Procedures  functions  structures in VB.NetProcedures  functions  structures in VB.Net
Procedures functions structures in VB.Net
 
Security architecture
Security architectureSecurity architecture
Security architecture
 
04 introduction to computer networking
04 introduction to computer networking04 introduction to computer networking
04 introduction to computer networking
 

Kürzlich hochgeladen

Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxleah joy valeriano
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsManeerUddin
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationRosabel UA
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 

Kürzlich hochgeladen (20)

Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture hons
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
Activity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translationActivity 2-unit 2-update 2024. English translation
Activity 2-unit 2-update 2024. English translation
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 

Computer architecture for HNDIT

  • 1. Computer Architecture Babak Kia Adjunct Professor Boston University College of Engineering Email: bkia -at- bu.edu ENG SC757 - Advanced Microprocessor Design Computer Architecture Computer Architecture is the theory behind the operational design of a computer system This is a term which is applied to a vast array of computer disciplines ranging from low level instruction set and logic design, to higher level aspects of a computer’s design such as the memory subsystem and bus structure In this lecture we will focus on the latter definition of the term Topics Discussed Memory Hierarchy Memory Performance • Amdahl’s Law and the Locality of Reference Cache Organization and Operation Random Access Memory (DRAM, SRAM) Non Volatile Memory (Flash) Bus Interfaces • How to connect the processor to memory & I/O 1
  • 2. Memory Hierarchy A simple axiom of hardware design: smaller is faster Memory Hierarchy Register Cache Access Times 1 cy. 1-2 cy. Main Memory 10-15 cy. Disk 1000+ cy. Memory Performance Performance is measured either in terms of throughput or response time The goal of memory design is to increase memory bandwidth and decrease access time latencies We take advantage of three principles of computing in order to achieve this goal: • Make the common case faster • Principle of Locality • Smaller is Faster Make the Common Case Faster Always want to improve the frequent event as opposed to the infrequent event Amdahl’s Law quantifies this process: • The performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used Amdahl’s Law essentially guides you to spend your resources on an area where the time is most spent 2
  • 3. The Principle of Locality Locality of Reference is one of the most important properties of a program: • It is a widely held rule of thumb that 90% of execution time is spent in only 10% of the code Temporal Locality • If a location is referenced, there is a high likelihood that it will be referenced again in the near future (time) Spatial Locality • If you reference instruction or data at a certain location, there is a high likelihood that nearby addresses will also be referenced Cache Cache is a small, fast memory which holds copies of recently accessed instruction and data Key to the performance of modern Microprocessors There could be up to two levels of cache, the picture to the right is Intel Pentium M, the large block to the left showing 2 MB of L2 cache Cache Taking advantage of the Principle of Locality, cache loads itself with contents of data which was recently accessed But this only addresses temporal locality! Therefore to take advantage of spatial locality as well, cache subsystems are designed to read anywhere between 16128 bytes of data at a time. 3
  • 4. Cache Definitions Cache is simply defined as a temporary storage area for data which is frequently accessed A cache hit occurs when the CPU is looking for data which is already contained within the cache A cache miss is the alternate scenario, when a cache line needs to be read from main memory The percentages of accesses which result in cache hits are known as the cache hit rate or hit ratio of the cache If the cache is full, it needs to evict a cache line before it can bring in a new cache line. This is done through a heuristic process and is known as the cache replacement policy Cache Organization Cache is organized not in bytes, but as blocks of cache lines, with each line containing some number of bytes (16-64) Unlike normal memory, cache lines do not have fixed addresses, which enables the cache system to populate each cache line with a unique (non-contiguous) address There are three methods for filling a cache line • Fully Associative – The most flexible • Direct Mapped – The most basic • Set Associative – A combination of the two Fully Associative In a fully associative cache subsystem, the cache controller can place a block of bytes in any of the available cache lines Though this makes the system greatly flexible, the added circuitry to perform this function increases the cost, and worst, decreases the performance of the cache! Most of today’s cache systems are not fully associative for this reason 4
  • 5. Direct Mapped In contrast to the fully associative cache is the direct mapped cache system, also called the one-way set associative In this system, a block of main memory is always loaded into the same cache line, evicting the previous cache entry This is not an ideal solution either because in spite of its simplicity, it doesn’t make an efficient use of the cache For this reason, not many systems are built as direct mapped caches either Set Associative Set associative cache is a compromise between fully associative and direct mapped caching techniques The idea is to break apart the cache into n-sets of cache lines. This way the cache subsystem uses a direct mapped scheme to select a set, but then uses a fully associative scheme to places the line entry in any of the n cache lines within the set For n = 2, the cache subsystem is called a two-way set associative cache Cache Line Addressing But how are cache lines addressed? Caches include an address tag on each line which gives it the frame address • First, the tag of every cache line is checked in parallel to see if it matches the address provided by the CPU • Then there must be a way to identify which cache line is invalid, which is done through adding a valid bit to the tag line • Finally, a random or a Least Recently Used (LRU) algorithm can be used to evict an invalid cache line 5
  • 6. Cache example Tag Index Offset Cache Operation Most of the time the cache is busy filling cache lines (reading from memory) But the processor doesn’t write a cache line which can be up to 128 bytes - it only writes between 1 and 8 bytes Therefore it must perform a read-modify-write sequence on the cache line Also, the cache uses one of two write operations: • Write-through, where data is updated both on the cache and in the main memory • Write-back, where data is written to the cache, and updated in the main memory only when the cache line is replaced Cache coherency – a very important subject! TBD in a separate lecture… Memory Technologies There are basically two different memory technologies used to store data in RAM: Static RAM and Dynamic RAM 6
  • 7. Static RAM Uses 4-6 transistors to store a single bit of data Provides a fast access time at the expense of lower bit densities For this reason registers and cache subsystems are fabricated using SRAM technology Static RAM is considerably more expensive than Dynamic RAM However, since it doesn’t need to be refreshed, its power consumption is much lower than DRAM Also, the absence of the refresh circuitry makes it easier to interface to The simplicity of the memory circuitry compensates for the more costly technology Dynamic RAM The bulk of a processor’s main memory is comprised of dynamic RAM Manufacturers have focused on memory sizes rather than speed In contrast to SRAM, DRAM uses a single transistor and capacitor to store a bit DRAM requires that the address applied to the device be asserted in a row address (RAS) and a column address (CAS) The requirement of RAS and CAS of course kills the access time, but since it reduces package pinout, it allows for higher memory densities Dynamic RAM RAS and CAS use the same pins, with each being asserted during either the RAS or the CAS phase of the address There are two metrics used to describe DRAM’s performance: • Access time is defined as the time between assertion of RAS to the availability of data • Cycle time is defined as the minimum time before a next access can be granted Manufacturers like to quote access times, but cycle times are more relevant because they establish throughput of the system 7
  • 8. Dynamic RAM Of course the charge leaks slowly from the storage capacitor in DRAM and it needs to be refreshed continually During the refresh phase, all accesses are heldoff, which happens once every 1 – 100 ms and slightly impacts the throughput DRAM bandwidth can be increased by operating it in paged mode, when several CASs are applied for each RAS A notation such as 256x16 means 256 thousand columns of cells standing 16 rows deep Wait states Memory access time is the amount of time from the point an address is placed on the bus, to the time when data is read Normally, data is requested on C1 and read on C2: C1 C2 However, for slower memories, wait states are necessary: C1 ws ws ws Adding wait states to memory is not ideal. Adding a single wait state doubles access time and halves the speed of memory access! Memory Timing RAS - Row Address Strobe or Row Address Select CAS - Column Address Strobe or Column Address Select tRAS - Active to precharge delay; this is the delay between the precharge and activation of a row tRCD - RAS to CAS Delay; the time required between RAS and CAS access tCL - (or CL) CAS Latency tRP - RAS Precharge; the time required to switch from one row to the next row, for example, switch internal memory banks tCLK – The length of a clock cycle 8
  • 9. Memory Timing Command Rate - the delay between Chip Select (CS), or when an IC is selected and the time commands can be issued to the IC Latency - The time from when a request is made to when it is answered; the total time required before data can be written to or read from the memory. Memory timing can be displayed as a sequence of numbers such as 2-3-2-6-T1 • While all parameters are important to system reliability, some parameters are more significant: • This refers to the CL-tRCD-tRP-tRAS-Command Rate sequence and is measured in clock cycles Memory Timing CAS Latency • One of the most important parameters • Delay between the CAS signal and the availability of the data on the DQ pin • Specially important since data is often accessed sequentially (within the same row), so CAS timing plays a key role in the performance of a system tRCD • The delay between the time a row is activated, to when a column is activated • For most situations (sequential), it is not a problem, however becomes an issue for non-sequential addressing Basic DRAM Timing 9
  • 10. ColdFire SRAM Timing ColdFire SDRAM Timing NvRAM Both SRAM and DRAM are known as volatile memories – turn the power off and they lose their contents Computers also rely on a wide array of Non-volatile Random Access Memories or NvRAMs, which retain their data even if the power is turned off. They come in many forms: • • • • ROM - Read Only Memory EPROM - Erasable Programmable Read Only Memory EEPROM - Electrically Erasable Programmable Read Only Memory Flash – A form of EEPROM which allows multiple locations to be erased or written in one programming operation. Normal EEPROM only allows for one location to be erased or programmed 10
  • 11. Flash Flash memories combine the best features of Non volatile memories. • Very high densities (256Mbits or more!) • Low cost (mass produced, used in everything from cell phones to USB flash drives) • Very fast (to read, but not to write) • Reprogrammable • And of course non volatile One disadvantage is that flash can’t be erased a byte at a time, it must be erased one sector at a time Flash parts have a guaranteed lifetime of ~100,000 write cycles The System Bus The system bus connects the microprocessors to the memory and I/O subsystems It is comprised of three major busses: The address bus, the data bus, and the control bus Both address and data busses come in variety of sizes. Address busses generally range from 20 to 36 bits (1 MB address space – 64 GBytes) Data busses are either 8, 16, 32, or 64 bits wide The control bus on the other hand varies amongst processors. It is a collection of control signals with which the processor communicates with the rest of the system The Control Bus The following are generally part of the Control Bus: • Read/Write signal, which specify the direction of data flow • Byte Enable signals, which allow 16, 32, and 64 bit busses deal with smaller chunks of data • Some processors have signaling which identifies between memory and I/O addresses • Other signals include interrupt lines, parity lines, bus clock, and status signals 11
  • 12. The ColdFire System Bus OE# controls external bus transceivers TA# indicates successful completion of requested data transfer TEA# upon assertion terminates the bus cycle Portions of this power point presentation may have been taken from relevant users and technical manuals. Original content Copyright © 2005 – Babak Kia 12