SlideShare ist ein Scribd-Unternehmen logo
1 von 20
ADVANCED COMPUTER ARCHITECTURE
AND PARALLEL PROCESSING
Shared memory architecture
1- Shared memory systems form a major category of multiprocessors.
2- In this category all processors share a global memory.
3- Communication between tasks running on different processors is performed
through writing to and reading from the global memory.
4- All interprocessor coordination and synchronization is also accomplished via
the global memory.
5- A shared memory computer system consists of :
- a set of independent processors.
- a set of memory modules, and an interconnection network as shown in
following Figure 4.1
6- Two main problems need to be addressed when designing a shared memory
system:
• performance degradation due to contention, and
• coherence problems
2
Shared memory architecture cont.
3
Shared memory architecture cont
• Performance degradation might happen when multiple processors are trying to
access the shared memory simultaneously. ?
• Caches : having multiple copies of data, spread throughout the caches might lead
to a coherence problem
- The copies in the caches are coherent if they are all equal to the same value.
- If one of the processors writes over the value of one of the copies, then the copy
becomes inconsistent because it no longer equals the value of the other copies.
4
4.1 CLASSIFICATION OF SHARED
MEMORY SYSTEMS
• The simplest shared memory system consists of one memory module (M)
that can be accessed from two processors (P1 and P2).
• An arbitration unit within the memory module passes requests through to a
memory controller. If the memory module is not busy and a single request
arrives, then the arbitration unit passes that request to the memory
controller and the request is satisfied. The module is placed in the busy
state while a request is being serviced. If a new request arrives while the
memory is busy servicing a previous request, the memory module sends a
wait signal, through the memory controller, to the processor making the new
request. 5
4.1 CLASSIFICATION OF SHARED
MEMORY SYSTEMS cont.
• In response, the requesting processor may hold its request on the line until the
memory becomes free or it may repeat its request some time later. If the
arbitration unit receives two requests, it selects one of them and passes it to the
memory controller. Again, the denied request can be either held to be served
next or it may be repeated some time later.
• Categories of shared memory systems Based on the interconnection network:
 Uniform Memory Access (UMA) .
 Non-uniform Memory Access (NUMA)
 Cache Only Memory Architecture (COMA)
6
Uniform Memory Access (UMA)
 In the UMA system a shared memory is accessible by all processors through an
interconnection network in the same way a single processor accesses its
memory.
 All processors have equal access time to any memory location.
 The interconnection network used: a single bus, multiple buses, or a crossbar
switch(diagonal,straight).
 Called it SMP (symmetric multiprocessor) systems?
 access to shared memory is balanced (Each processor has equal opportunity to
read/write to memory, including equal access speed).
7
Nonuniform Memory Access
(NUMA)
• each processor has part of the shared memory attached. The memory has a
single address space. Therefore, any processor could access any memory
location directly using its real address. However, the access time to modules
depends on the distance to the processor.
• the tree and the hierarchical bus networks are architectures used to interconnect
processors.
8
Cache-Only Memory Architecture
(COMA)
• Similar to the NUMA, each processor has part of the shared memory in the
COMA. However, in this case the shared memory consists of cache memory.
• requires that data be migrated to the processor requesting it.
• There is no memory hierarchy and the address space is made of all the caches.
There is a cache directory (D) that helps in remote cache access.
9
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS
 Shared memory systems can be designed using :
• bus-based (simplest network for shared memory systems).
• switch-based interconnection networks (message passing).
 The bus/cache architecture alleviates the need for expensive multi-ported
memories and interface circuitry as well as the need to adopt a message-
passing paradigm when developing application software.
 the bus may get saturated if multiple processors are trying to access the shared
memory (via the bus) simultaneously. (contention problem) solve uses Caches
 High speed caches connected to each processor on one side and the bus on the
other side mean that local copies of instructions and data can be supplied at the
highest possible rate.
 If the local processor finds all of its instructions and data in the local cache hit
rate is 100%, otherwise miss rate of a cache, so must be copied from the
global memory, across the bus, into the cache, and then passed on to the local
processor. 10
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS cont.
 One of the goals of the cache is to maintain a high hit rate, or low miss rate
under high processor loads. A high hit rate means the processors are not using
the bus as much. Hit rates are determined by a number of factors, ranging from
the application programs being run to the manner in which cache hardware is
implemented.
11
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS cont.
• We define the variables for hit rate, number of processors, processor speed, bus
speed, and processor duty cycle rates as follows:
 N = number of processors;
 h = hit rate of each cache, assumed to be the same for all caches;
 (1 - h) = miss rate of all caches;
 B = bandwidth of the bus, measured in cycles/second;
 I = processor duty cycle, assumed to be identical for all processors, in
fetches/cycle; and
 V = peak processor speed, in fetches/second
12
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS cont.
• The effective bandwidth of the bus is BI fetches/second.
If each processor is running at a speed of V, then misses are being generated
at a rate of V(1 - h). For an N-processor system, misses are simultaneously
being generated at a rate of N(1 - h)V. This leads to saturation of the bus when
N processors simultaneously try to access the bus. That is, 𝑵 𝟏 − 𝒉 𝑽 ≤ 𝑩𝑰.
The maximum number of processors with cache memories that the bus can
support is given by the relation,
𝑁 ≤
𝐵𝐼
1−ℎ 𝑉
13
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS cont.
• Example 1
Suppose a shared memory system is constructed from processors that can
execute V = 107 instructions/s and the processor duty cycle I = 1. The caches
are designed to support a hit rate of 97%, and the bus supports a peak
bandwidth of B = 106 cycles/s. Then, (1 - h) = 0.03, and the maximum number
of processors N Is 𝑁 ≤
106∗1
(0.03∗107)
=3.33.
Thus, the system we have in mind can support only three processors!
We might ask what hit rate is needed to support a 30-processor system. In this
case, h = 1- BI/NV = 1 - (106(1))/((30)(107)) = 1 - 1/300, so for the system we
have in mind, h = 0.9967. Increasing h by 2.8%. results in supporting a factor of
ten more processors.
14
4.3 BASIC CACHE COHERENCY
METHODS
• Multiple copies of data, spread throughout the caches, lead to a coherence
problem among the caches. The copies in the caches are coherent if they all
equal the same value. However, if one of the processors writes over the value of
one of the copies, then the copy becomes inconsistent because it no longer
equals the value of the other copies. If data are allowed to become inconsistent
(incoherent), incorrect results will be propagated through the system, leading to
incorrect final results. Cache coherence algorithms are needed to maintain a
level of consistency throughout the parallel system.
Copies data in the
caches
coherence inconsistent
15
Cache–Memory Coherence
 In a single cache system, coherence between memory and the cache is
maintained using one of two policies:
• write-through:
The memory is updated every time the cache is updated.
• write-back:
The memory is updated only when the block in the cache is being
replaced.
16
Cache–Memory Coherence
17
Cache–Cache Coherence
• In multiprocessing system, when a task running on processor P requests the data in
global memory location X, for example, the contents of X are copied to processor P’s
local cache, where it is passed on to P. Now, suppose processor Q also accesses X.
What happens if Q wants to write a new value over the old value of X?
 There are two fundamental cache coherence policies:
 write-invalidate
Maintains consistency by reading from local caches until a write occurs.
 write-update
Maintains consistency by immediately updating all copies in all caches.
18
Cache–Cache Coherence
19
Shared Memory System
Coherence
• The four combinations to maintain coherence among all caches and
global memory are:
 Write-update and write-through
 Write-update and write-back
 Write-invalidate and write-through
 Write-invalidate and write-back
• If we permit a write-update and write-through directly on global memory
location X, the bus would start to get busy and ultimately all processors
would be idle while waiting for writes to complete. In write-update and write-
back, only copies in all caches are updated. On the contrary, if the write is
limited to the copy of X in cache Q, the caches become inconsistent on X.
Setting the dirty bit prevents the spread of inconsistent values of X, but at
some point, the inconsistent copies must be updated.
20

Weitere ähnliche Inhalte

Was ist angesagt?

Flynns classification
Flynns classificationFlynns classification
Flynns classificationYasir Khan
 
Cache coherence problem and its solutions
Cache coherence problem and its solutionsCache coherence problem and its solutions
Cache coherence problem and its solutionsMajid Saleem
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.MITS Gwalior
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systemsvampugani
 
Micro programmed control
Micro programmed  controlMicro programmed  control
Micro programmed controlShashank Singh
 
Multiprocessor
MultiprocessorMultiprocessor
MultiprocessorNeel Patel
 
Parallel computing and its applications
Parallel computing and its applicationsParallel computing and its applications
Parallel computing and its applicationsBurhan Ahmed
 
Shared Memory Multi Processor
Shared Memory Multi ProcessorShared Memory Multi Processor
Shared Memory Multi Processorbabuece
 
Introduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingIntroduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingSayed Chhattan Shah
 
Introduction to parallel processing
Introduction to parallel processingIntroduction to parallel processing
Introduction to parallel processingPage Maker
 
Centralized shared memory architectures
Centralized shared memory architecturesCentralized shared memory architectures
Centralized shared memory architecturesGokuldhev mony
 
Multiprocessor architecture
Multiprocessor architectureMultiprocessor architecture
Multiprocessor architectureArpan Baishya
 
Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing ModelAdlin Jeena
 

Was ist angesagt? (20)

Flynns classification
Flynns classificationFlynns classification
Flynns classification
 
Cache coherence problem and its solutions
Cache coherence problem and its solutionsCache coherence problem and its solutions
Cache coherence problem and its solutions
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.
 
Multiprocessor Systems
Multiprocessor SystemsMultiprocessor Systems
Multiprocessor Systems
 
Micro programmed control
Micro programmed  controlMicro programmed  control
Micro programmed control
 
Pram model
Pram modelPram model
Pram model
 
Multiprocessor
MultiprocessorMultiprocessor
Multiprocessor
 
Parallel Processing Concepts
Parallel Processing Concepts Parallel Processing Concepts
Parallel Processing Concepts
 
Parallel Algorithms
Parallel AlgorithmsParallel Algorithms
Parallel Algorithms
 
Pipeline
PipelinePipeline
Pipeline
 
Parallel computing and its applications
Parallel computing and its applicationsParallel computing and its applications
Parallel computing and its applications
 
Distributed Operating System_1
Distributed Operating System_1Distributed Operating System_1
Distributed Operating System_1
 
Shared Memory Multi Processor
Shared Memory Multi ProcessorShared Memory Multi Processor
Shared Memory Multi Processor
 
Introduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed ComputingIntroduction to Parallel and Distributed Computing
Introduction to Parallel and Distributed Computing
 
Introduction to parallel processing
Introduction to parallel processingIntroduction to parallel processing
Introduction to parallel processing
 
Centralized shared memory architectures
Centralized shared memory architecturesCentralized shared memory architectures
Centralized shared memory architectures
 
Multiprocessor system
Multiprocessor system Multiprocessor system
Multiprocessor system
 
Multiprocessor architecture
Multiprocessor architectureMultiprocessor architecture
Multiprocessor architecture
 
Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing Model
 
Multiprocessing operating systems
Multiprocessing operating systemsMultiprocessing operating systems
Multiprocessing operating systems
 

Ähnlich wie ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING

247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
assignment_presentaion_jhvvnvhjhbhjhvjh.pptx
assignment_presentaion_jhvvnvhjhbhjhvjh.pptxassignment_presentaion_jhvvnvhjhbhjhvjh.pptx
assignment_presentaion_jhvvnvhjhbhjhvjh.pptx23mu36
 
Multiprocessor structures
Multiprocessor structuresMultiprocessor structures
Multiprocessor structuresShareb Ismaeel
 
Lecture 6
Lecture  6Lecture  6
Lecture 6Mr SMAK
 
Lecture 6
Lecture  6Lecture  6
Lecture 6Mr SMAK
 
Lecture 6
Lecture  6Lecture  6
Lecture 6Mr SMAK
 
Distributed system lectures
Distributed system lecturesDistributed system lectures
Distributed system lecturesmarwaeng
 
Memory and Cache Coherence in Multiprocessor System.pdf
Memory and Cache Coherence in Multiprocessor System.pdfMemory and Cache Coherence in Multiprocessor System.pdf
Memory and Cache Coherence in Multiprocessor System.pdfrajaratna4
 
Unit 5 lect-3-multiprocessor
Unit 5 lect-3-multiprocessorUnit 5 lect-3-multiprocessor
Unit 5 lect-3-multiprocessorvishal choudhary
 
Symmetric multiprocessing and Microkernel
Symmetric multiprocessing and MicrokernelSymmetric multiprocessing and Microkernel
Symmetric multiprocessing and MicrokernelManoraj Pannerselum
 
Module2 MultiThreads.ppt
Module2 MultiThreads.pptModule2 MultiThreads.ppt
Module2 MultiThreads.pptshreesha16
 

Ähnlich wie ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING (20)

Week5
Week5Week5
Week5
 
22CS201 COA
22CS201 COA22CS201 COA
22CS201 COA
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
NUMA
NUMANUMA
NUMA
 
assignment_presentaion_jhvvnvhjhbhjhvjh.pptx
assignment_presentaion_jhvvnvhjhbhjhvjh.pptxassignment_presentaion_jhvvnvhjhbhjhvjh.pptx
assignment_presentaion_jhvvnvhjhbhjhvjh.pptx
 
OS_MD_4.pdf
OS_MD_4.pdfOS_MD_4.pdf
OS_MD_4.pdf
 
Multiprocessor structures
Multiprocessor structuresMultiprocessor structures
Multiprocessor structures
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
Lecture 6
Lecture  6Lecture  6
Lecture 6
 
Lecture 6
Lecture  6Lecture  6
Lecture 6
 
Lecture 6
Lecture  6Lecture  6
Lecture 6
 
Distributed system lectures
Distributed system lecturesDistributed system lectures
Distributed system lectures
 
Cache Coherence.pptx
Cache Coherence.pptxCache Coherence.pptx
Cache Coherence.pptx
 
Intro_ppt.pptx
Intro_ppt.pptxIntro_ppt.pptx
Intro_ppt.pptx
 
Memory and Cache Coherence in Multiprocessor System.pdf
Memory and Cache Coherence in Multiprocessor System.pdfMemory and Cache Coherence in Multiprocessor System.pdf
Memory and Cache Coherence in Multiprocessor System.pdf
 
Unit 5 lect-3-multiprocessor
Unit 5 lect-3-multiprocessorUnit 5 lect-3-multiprocessor
Unit 5 lect-3-multiprocessor
 
Coa presentation5
Coa presentation5Coa presentation5
Coa presentation5
 
Symmetric multiprocessing and Microkernel
Symmetric multiprocessing and MicrokernelSymmetric multiprocessing and Microkernel
Symmetric multiprocessing and Microkernel
 
Mesi
MesiMesi
Mesi
 
Module2 MultiThreads.ppt
Module2 MultiThreads.pptModule2 MultiThreads.ppt
Module2 MultiThreads.ppt
 

Kürzlich hochgeladen

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 

Kürzlich hochgeladen (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 

ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING

  • 2. Shared memory architecture 1- Shared memory systems form a major category of multiprocessors. 2- In this category all processors share a global memory. 3- Communication between tasks running on different processors is performed through writing to and reading from the global memory. 4- All interprocessor coordination and synchronization is also accomplished via the global memory. 5- A shared memory computer system consists of : - a set of independent processors. - a set of memory modules, and an interconnection network as shown in following Figure 4.1 6- Two main problems need to be addressed when designing a shared memory system: • performance degradation due to contention, and • coherence problems 2
  • 4. Shared memory architecture cont • Performance degradation might happen when multiple processors are trying to access the shared memory simultaneously. ? • Caches : having multiple copies of data, spread throughout the caches might lead to a coherence problem - The copies in the caches are coherent if they are all equal to the same value. - If one of the processors writes over the value of one of the copies, then the copy becomes inconsistent because it no longer equals the value of the other copies. 4
  • 5. 4.1 CLASSIFICATION OF SHARED MEMORY SYSTEMS • The simplest shared memory system consists of one memory module (M) that can be accessed from two processors (P1 and P2). • An arbitration unit within the memory module passes requests through to a memory controller. If the memory module is not busy and a single request arrives, then the arbitration unit passes that request to the memory controller and the request is satisfied. The module is placed in the busy state while a request is being serviced. If a new request arrives while the memory is busy servicing a previous request, the memory module sends a wait signal, through the memory controller, to the processor making the new request. 5
  • 6. 4.1 CLASSIFICATION OF SHARED MEMORY SYSTEMS cont. • In response, the requesting processor may hold its request on the line until the memory becomes free or it may repeat its request some time later. If the arbitration unit receives two requests, it selects one of them and passes it to the memory controller. Again, the denied request can be either held to be served next or it may be repeated some time later. • Categories of shared memory systems Based on the interconnection network:  Uniform Memory Access (UMA) .  Non-uniform Memory Access (NUMA)  Cache Only Memory Architecture (COMA) 6
  • 7. Uniform Memory Access (UMA)  In the UMA system a shared memory is accessible by all processors through an interconnection network in the same way a single processor accesses its memory.  All processors have equal access time to any memory location.  The interconnection network used: a single bus, multiple buses, or a crossbar switch(diagonal,straight).  Called it SMP (symmetric multiprocessor) systems?  access to shared memory is balanced (Each processor has equal opportunity to read/write to memory, including equal access speed). 7
  • 8. Nonuniform Memory Access (NUMA) • each processor has part of the shared memory attached. The memory has a single address space. Therefore, any processor could access any memory location directly using its real address. However, the access time to modules depends on the distance to the processor. • the tree and the hierarchical bus networks are architectures used to interconnect processors. 8
  • 9. Cache-Only Memory Architecture (COMA) • Similar to the NUMA, each processor has part of the shared memory in the COMA. However, in this case the shared memory consists of cache memory. • requires that data be migrated to the processor requesting it. • There is no memory hierarchy and the address space is made of all the caches. There is a cache directory (D) that helps in remote cache access. 9
  • 10. 4.2 BUS-BASED SYMMETRIC MULTIPROCESSORS  Shared memory systems can be designed using : • bus-based (simplest network for shared memory systems). • switch-based interconnection networks (message passing).  The bus/cache architecture alleviates the need for expensive multi-ported memories and interface circuitry as well as the need to adopt a message- passing paradigm when developing application software.  the bus may get saturated if multiple processors are trying to access the shared memory (via the bus) simultaneously. (contention problem) solve uses Caches  High speed caches connected to each processor on one side and the bus on the other side mean that local copies of instructions and data can be supplied at the highest possible rate.  If the local processor finds all of its instructions and data in the local cache hit rate is 100%, otherwise miss rate of a cache, so must be copied from the global memory, across the bus, into the cache, and then passed on to the local processor. 10
  • 11. 4.2 BUS-BASED SYMMETRIC MULTIPROCESSORS cont.  One of the goals of the cache is to maintain a high hit rate, or low miss rate under high processor loads. A high hit rate means the processors are not using the bus as much. Hit rates are determined by a number of factors, ranging from the application programs being run to the manner in which cache hardware is implemented. 11
  • 12. 4.2 BUS-BASED SYMMETRIC MULTIPROCESSORS cont. • We define the variables for hit rate, number of processors, processor speed, bus speed, and processor duty cycle rates as follows:  N = number of processors;  h = hit rate of each cache, assumed to be the same for all caches;  (1 - h) = miss rate of all caches;  B = bandwidth of the bus, measured in cycles/second;  I = processor duty cycle, assumed to be identical for all processors, in fetches/cycle; and  V = peak processor speed, in fetches/second 12
  • 13. 4.2 BUS-BASED SYMMETRIC MULTIPROCESSORS cont. • The effective bandwidth of the bus is BI fetches/second. If each processor is running at a speed of V, then misses are being generated at a rate of V(1 - h). For an N-processor system, misses are simultaneously being generated at a rate of N(1 - h)V. This leads to saturation of the bus when N processors simultaneously try to access the bus. That is, 𝑵 𝟏 − 𝒉 𝑽 ≤ 𝑩𝑰. The maximum number of processors with cache memories that the bus can support is given by the relation, 𝑁 ≤ 𝐵𝐼 1−ℎ 𝑉 13
  • 14. 4.2 BUS-BASED SYMMETRIC MULTIPROCESSORS cont. • Example 1 Suppose a shared memory system is constructed from processors that can execute V = 107 instructions/s and the processor duty cycle I = 1. The caches are designed to support a hit rate of 97%, and the bus supports a peak bandwidth of B = 106 cycles/s. Then, (1 - h) = 0.03, and the maximum number of processors N Is 𝑁 ≤ 106∗1 (0.03∗107) =3.33. Thus, the system we have in mind can support only three processors! We might ask what hit rate is needed to support a 30-processor system. In this case, h = 1- BI/NV = 1 - (106(1))/((30)(107)) = 1 - 1/300, so for the system we have in mind, h = 0.9967. Increasing h by 2.8%. results in supporting a factor of ten more processors. 14
  • 15. 4.3 BASIC CACHE COHERENCY METHODS • Multiple copies of data, spread throughout the caches, lead to a coherence problem among the caches. The copies in the caches are coherent if they all equal the same value. However, if one of the processors writes over the value of one of the copies, then the copy becomes inconsistent because it no longer equals the value of the other copies. If data are allowed to become inconsistent (incoherent), incorrect results will be propagated through the system, leading to incorrect final results. Cache coherence algorithms are needed to maintain a level of consistency throughout the parallel system. Copies data in the caches coherence inconsistent 15
  • 16. Cache–Memory Coherence  In a single cache system, coherence between memory and the cache is maintained using one of two policies: • write-through: The memory is updated every time the cache is updated. • write-back: The memory is updated only when the block in the cache is being replaced. 16
  • 18. Cache–Cache Coherence • In multiprocessing system, when a task running on processor P requests the data in global memory location X, for example, the contents of X are copied to processor P’s local cache, where it is passed on to P. Now, suppose processor Q also accesses X. What happens if Q wants to write a new value over the old value of X?  There are two fundamental cache coherence policies:  write-invalidate Maintains consistency by reading from local caches until a write occurs.  write-update Maintains consistency by immediately updating all copies in all caches. 18
  • 20. Shared Memory System Coherence • The four combinations to maintain coherence among all caches and global memory are:  Write-update and write-through  Write-update and write-back  Write-invalidate and write-through  Write-invalidate and write-back • If we permit a write-update and write-through directly on global memory location X, the bus would start to get busy and ultimately all processors would be idle while waiting for writes to complete. In write-update and write- back, only copies in all caches are updated. On the contrary, if the write is limited to the copy of X in cache Q, the caches become inconsistent on X. Setting the dirty bit prevents the spread of inconsistent values of X, but at some point, the inconsistent copies must be updated. 20