Multithreading
Mr. A. B. Shinde
Assistant Professor,
Electronics Engineering,
P.V.P.I.T., Budhgaon
Contents…
 Using ILP support to exploit thread-level parallelism
 Performance and efficiency in advanced multiple-issue processors
2
Threads
 A thread is a basic unit of CPU utilization.
 A thread is a separate process with its own instructions and data.
 A thread may represent a process that is part of a parallel program
consisting of multiple processes, or it may represent an
independent program.
3
Threads
 It comprises a thread ID, a program counter, a register set, and a stack.
 It shares its code section, data section, and other operating-system
resources, such as open files and signals, with other threads
belonging to the same process.
 A traditional process has a single thread of control. If a process has
multiple threads of control, it can perform more than one task at a time.
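A minimal C sketch (assuming a POSIX system with pthreads; not part of the slides) of these ideas: the two threads share the global counter in the process's data section, while each keeps its own stack-local variable.

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

static int shared_counter = 0;                  /* data section: shared by all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    int local = 0;                              /* lives on this thread's private stack */
    for (int i = 0; i < 100000; i++) {
        local++;                                /* private to this thread */
        pthread_mutex_lock(&lock);
        shared_counter++;                       /* shared, so access must be protected */
        pthread_mutex_unlock(&lock);
    }
    printf("thread %ld: local = %d\n", (long)(intptr_t)arg, local);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)(intptr_t)1);
    pthread_create(&t2, NULL, worker, (void *)(intptr_t)2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared_counter = %d\n", shared_counter);   /* 200000: both threads updated it */
    return 0;
}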
4
Threads
 Many software packages that run
on modern desktop PCs are
multithreaded.
 For example:
A word processor may have:
a thread for displaying graphics,
another thread for responding to
keystrokes from the user, and
a third thread for performing spelling
and grammar checking in the
background.
5
Threads
 Threads also play a vital role in remote procedure call (RPC)
systems.
 RPC allows interprocess communication by providing a
communication mechanism similar to ordinary function or procedure
calls.
 Many operating system kernels are multithreaded; several threads
operate in the kernel, and each thread performs a specific task, such as
managing devices or interrupt handling.
6
Multithreading
 Benefits:
1. Responsiveness: Multithreading an interactive application may allow a
program to continue running even if part of it is blocked, thereby
increasing responsiveness to the user.
For example: A multithreaded web browser could still allow user
interaction in one thread while an image was being loaded in another
thread.
2. Resource sharing: By default, threads share the memory and the
resources of the process to which they belong. The benefit of sharing
code and data is that it allows an application to have several different
threads of activity within the same address space.
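A hedged C sketch of the browser example above (POSIX threads assumed; load_image and the delays are invented for illustration): a background thread simulates a slow image load while the main thread keeps handling simulated user input.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical long-running task: stands in for downloading an image. */
static void *load_image(void *arg)
{
    (void)arg;
    sleep(2);                          /* pretend the transfer takes 2 seconds */
    printf("image loaded\n");
    return NULL;
}

int main(void)
{
    pthread_t loader;
    pthread_create(&loader, NULL, load_image, NULL);

    /* The main thread stays responsive while the image loads in the background. */
    for (int i = 0; i < 10; i++) {
        printf("handling user input...\n");    /* stands in for keystroke handling */
        usleep(300000);                        /* 0.3 s between simulated events */
    }
    pthread_join(loader, NULL);
    return 0;
}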
7
Multithreading
 Benefits:
3. Economy: Allocating memory and resources for process creation is
costly. Since threads share resources of the process to which they
belong, they provide a cost-effective solution.
4. Utilization of multiprocessor architectures: In multiprocessor
architecture, threads may be running in parallel on different processors.
A single-threaded process can run on only one CPU, no matter how
many are available.
Multithreading on a multi-CPU machine increases concurrency.
8
Multithreading Models
 Support for threads may be provided either at the user level or at
the kernel level.
 User threads are supported above the kernel and are managed
without kernel support, whereas kernel threads are supported and
managed directly by the operating system.
9
Multithreading Models
 Many-to-One Model:
 The many-to-one model maps many user-
level threads to one kernel thread.
 Thread management is done by the
thread library in user space, so it is
efficient.
 Only one thread can access the kernel at
a time, hence multiple threads are unable to
run in parallel on multiprocessors.
10
Multithreading Models
 One-to-One Model:
 The one-to-one model maps each user
thread to a kernel thread.
 It provides more concurrency than the many-
to-one model. It allows multiple threads to run in
parallel on multiprocessors.
 The only drawback to this model is that
creating a user thread requires creating the
corresponding kernel thread.
 The overhead of creating kernel threads can
burden the performance of an application.
11
Multithreading Models
 Many-to-Many Model :
 The many-to-many model multiplexes many
user-level threads to a smaller or equal
number of kernel threads.
 The number of kernel threads may be specific
to either a particular application or a particular
machine.
 Developers can create as many user threads
as necessary, and the corresponding kernel
threads can run in parallel on a
multiprocessor.
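The many-to-many idea can be sketched at the application level (a rough analogy only, not how a threading library implements the model): here 12 independent tasks, standing in for user-level threads, are multiplexed onto 3 kernel-backed worker threads.

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_TASKS   12     /* "user-level threads" (units of work) */
#define NUM_WORKERS  3     /* kernel threads they are multiplexed onto */

static int next_task = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void run_task(int id) { printf("task %d done\n", id); }

static void *worker(void *arg)
{
    long wid = (long)(intptr_t)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        int t = (next_task < NUM_TASKS) ? next_task++ : -1;   /* claim the next task */
        pthread_mutex_unlock(&lock);
        if (t < 0) break;                                     /* nothing left to do */
        printf("worker %ld -> ", wid);
        run_task(t);
    }
    return NULL;
}

int main(void)
{
    pthread_t w[NUM_WORKERS];
    for (intptr_t i = 0; i < NUM_WORKERS; i++)
        pthread_create(&w[i], NULL, worker, (void *)i);
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(w[i], NULL);
    return 0;
}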
12
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
13
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 Although ILP increases the performance of a system, ILP can be quite
limited or hard to exploit in some applications.
Furthermore, there may be parallelism occurring naturally at a higher
level in the application.
For example:
An online transaction-processing system has parallelism among the
multiple queries and updates. These queries and updates can be
processed mostly in parallel, since they are largely independent of one
another.
14
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 This higher-level parallelism is called thread-level parallelism (TLP)
because it is logically structured as separate threads of execution.
 ILP exploits parallel operations within a loop or straight-line code segment.
 TLP is exploited through the use of multiple threads of execution that
are inherently parallel.
15
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 Thread-level parallelism is an important alternative to instruction-
level parallelism.
 In many applications thread-level parallelism occurs naturally (many
server applications).
 If software is written from scratch, then expressing the parallelism
is much easier.
 But for established applications written without parallelism in mind,
there can be significant challenges, and it can be extremely costly
to rewrite them to exploit thread-level parallelism.
16
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 TLP and ILP exploit two different kinds of parallel structure.
 The crucial question is:
Can we exploit TLP on a processor designed for ILP?
 The answer is: Yes.
A datapath designed to exploit ILP often has functional units sitting idle
because of either stalls or dependences in the code.
Threads can supply independent instructions that keep the processor busy,
thereby exploiting TLP on ILP hardware.
17
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 Multithreading allows multiple threads to share the functional units
of a single processor in an overlapping fashion.
 To permit this sharing, the processor must duplicate the
independent state of each thread.
 For example:
A separate copy of the register file, a separate PC, and a separate page
table are required for each thread.
 In addition, the hardware must support the ability to change to a
different thread relatively quickly.
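A hedged C sketch of this replicated per-thread state (register-file copy, PC, page-table base); the sizes and field names are illustrative, not taken from any real design.

#include <stdint.h>
#include <stdio.h>

#define NUM_ARCH_REGS   32    /* architectural registers per thread (illustrative) */
#define NUM_HW_THREADS   4    /* hardware thread contexts held by the core */

/* Independent state that must be replicated for each hardware thread. */
struct thread_context {
    uint64_t regs[NUM_ARCH_REGS];   /* separate copy of the register file */
    uint64_t pc;                    /* separate program counter */
    uint64_t page_table_base;       /* separate page-table pointer */
};

/* Caches and functional units are shared, not replicated; only the contexts
   are duplicated, so switching threads is just selecting another index. */
static struct thread_context contexts[NUM_HW_THREADS];
static int active_thread = 0;

static void switch_thread(int next)
{
    /* Nothing is saved or restored: every context is already resident,
       which is what lets the hardware change threads quickly. */
    active_thread = next;
}

int main(void)
{
    contexts[1].pc = 0x1000;        /* pretend thread 1 is ready to run at 0x1000 */
    switch_thread(1);
    printf("running thread %d at PC 0x%llx\n",
           active_thread, (unsigned long long)contexts[active_thread].pc);
    return 0;
}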
18
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 There are two main approaches to multithreading.
 Fine-grained multithreading &
 Coarse-grained multithreading
19
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 Fine-grained multithreading:
 It switches between threads on each instruction, causing the
execution of multiple threads to be interleaved.
 This interleaving is often done in a round-robin fashion.
 To make fine-grained multithreading practical, the CPU must be
able to switch threads on every clock cycle.
 Advantage: It can hide the throughput losses that arise from both short
and long stalls.
 Disadvantage: It slows down the execution of the individual threads.
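A toy C simulation (purely illustrative; the stall pattern is made up) of the fine-grained policy: issue switches to the next ready thread every cycle, so one thread's stall cycles are absorbed by the others.

#include <stdio.h>

#define NUM_THREADS 4
#define CYCLES      12

int main(void)
{
    /* stall_until[t] > cycle means thread t cannot issue in that cycle. */
    int stall_until[NUM_THREADS] = {0, 5, 0, 0};   /* thread 1 stalled until cycle 5 */
    int next = 0;                                  /* round-robin starting point */

    for (int cycle = 0; cycle < CYCLES; cycle++) {
        int issued = -1;
        /* Try each thread once per cycle, starting after the one used last. */
        for (int i = 0; i < NUM_THREADS; i++) {
            int t = (next + i) % NUM_THREADS;
            if (stall_until[t] <= cycle) { issued = t; next = t + 1; break; }
        }
        if (issued >= 0)
            printf("cycle %2d: issue from thread %d\n", cycle, issued);
        else
            printf("cycle %2d: idle (all threads stalled)\n", cycle);
    }
    return 0;
}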
20
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 Coarse-grained multithreading:
 It was invented as an alternative to fine-grained multithreading.
 Coarse-grained multithreading switches threads only on costly
(larger) stalls.
 Advantage: It relieves the need for very fast (essentially free) thread switching.
 Disadvantage: It is likely to slow the processor down, since
instructions from other threads are issued only when the running thread
encounters a costly (long) stall.
21
Multithreading: ILP Support to Exploit
Thread-Level Parallelism
 A CPU with coarse-grained multithreading issues instructions from a
single thread.
 When a stall occurs, the pipeline must be emptied or frozen.
 The new thread that executes after the stall must first fill the pipeline.
 Because of this start-up overhead, coarse-grained multithreading is
much more useful for reducing the penalty of high-cost stalls,
where the pipeline refill time is negligible compared to the stall time.
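For contrast, a toy C sketch of the coarse-grained policy (all numbers invented): the core keeps issuing from one thread and switches only on a long stall, paying a fixed pipeline-refill penalty before the new thread can issue.

#include <stdio.h>

#define CYCLES         20
#define REFILL_CYCLES   3   /* illustrative pipeline start-up cost after a switch */

int main(void)
{
    /* Cycle at which each thread hits a long stall (e.g. a cache miss). */
    int long_stall_at[2] = {6, 14};
    int thread = 0, refill = 0;

    for (int cycle = 0; cycle < CYCLES; cycle++) {
        if (refill > 0) {                 /* pipeline still refilling: nothing issues */
            printf("cycle %2d: refill (idle)\n", cycle);
            refill--;
            continue;
        }
        if (cycle == long_stall_at[thread]) {
            printf("cycle %2d: thread %d long stall -> switch\n", cycle, thread);
            thread = 1 - thread;          /* switch only on a costly stall */
            refill = REFILL_CYCLES;       /* the new thread must refill the pipeline */
            continue;
        }
        printf("cycle %2d: issue from thread %d\n", cycle, thread);
    }
    return 0;
}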
22
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
 Simultaneous multithreading (SMT) is a variation on multithreading
that uses the resources of a multiple-issue, dynamically scheduled
processor to exploit TLP.
 Multiple-issue processors often have more functional-unit parallelism
available than a single thread can effectively use, which motivates the use of SMT.
 With register renaming and dynamic scheduling, multiple instructions
from independent threads can be issued without considering the
dependences among them.
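A toy C sketch of that idea (the per-thread instruction counts are invented): each cycle, up to four issue slots are filled with independent instructions drawn from whichever threads have them, rather than from a single thread.

#include <stdio.h>

#define ISSUE_WIDTH 4
#define NUM_THREADS 4
#define CYCLES      3

int main(void)
{
    /* ready[c][t]: independent instructions thread t can offer in cycle c
       (a stand-in for its available ILP; values are made up). */
    int ready[CYCLES][NUM_THREADS] = {
        {2, 1, 3, 0},     /* cycle 0: threads 0-2 have work, thread 3 is stalled */
        {1, 0, 2, 1},
        {0, 0, 1, 1},     /* cycle 2: little ILP available, so slots go unused  */
    };

    for (int cycle = 0; cycle < CYCLES; cycle++) {
        int slots = ISSUE_WIDTH;
        printf("cycle %d:", cycle);
        for (int t = 0; t < NUM_THREADS && slots > 0; t++) {
            int n = ready[cycle][t] < slots ? ready[cycle][t] : slots;  /* take what fits */
            if (n > 0) printf("  T%d issues %d", t, n);
            slots -= n;
        }
        if (slots > 0) printf("  (%d slots unused)", slots);
        printf("\n");
    }
    return 0;
}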
23
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
Figure illustrates the differences in a processor’s ability to exploit the
resources of a superscalar for the following configurations:
 A superscalar with no multithreading support
 A superscalar with coarse-grained multithreading
 A superscalar with fine-grained multithreading
 A superscalar with simultaneous multithreading
24
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
 In the superscalar without multithreading support,
the use of issue slots is limited by a lack of ILP.
 In addition, a major stall, such as an instruction
cache miss, can leave the entire processor idle.
25
An empty (white) box indicates that the
corresponding issue slot is unused in that clock
cycle.
Black is used to indicate the occupied issue slots.
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
 In the coarse-grained multithreaded superscalar,
the long stalls are partially hidden by switching
to another thread that uses the resources of the
processor.
 This reduces the number of completely idle clock cycles; however,
within each clock cycle, the ILP limitations still lead to idle issue slots.
 Because thread switching in a coarse-grained multithreaded processor
occurs only when there is a stall, and the pipeline must then be refilled,
some fully idle cycles still remain.
26
The shades of grey and black correspond to
different threads in the multithreading processors.
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
 In fine-grained multithreading, the
interleaving of threads eliminates fully empty
slots.
 Because only one thread issues instructions in
a given clock cycle, ILP limitations still lead to a
significant number of idle slots within individual
clock cycles.
27
An empty (white) box indicates that the
corresponding issue slot is unused in that clock
cycle.
The shades of grey and black correspond to four
different threads in the multithreading processors.
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
 In SMT, TLP and ILP are exploited
simultaneously.
 Ideally, the issue slot usage is limited by
imbalances in the resource needs and resource
availability over multiple threads.
 In practice, other factors —
- how many active threads are considered,
- finite limitations on buffers,
- the ability to fetch enough instructions from
multiple threads, and
- practical limitations of what instruction
combinations can issue from one thread and
from multiple threads—can also restrict how
many slots are used.
28
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
 Design Challenges in SMT
 Because a dynamically scheduled superscalar processor is likely to have a
deep pipeline, coarse-grained multithreading is unlikely to gain much in performance.
 Since SMT makes sense only in a fine-grained implementation, we
should think about the impact of fine-grained scheduling on single-
thread performance.
 This effect can be minimized by having a preferred thread, which
still permits multithreading to preserve some of its performance
advantage with a smaller compromise in single-thread performance.
29
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
 Design Challenges in SMT
 Other design challenges for an SMT processor:
 Dealing with a larger register file needed to hold multiple contexts.
 Not affecting the clock cycle, particularly in instruction issue, where
more candidate instructions need to be considered, and in instruction
completion, where choosing which instructions to commit may be challenging.
 Ensuring that the cache and TLB conflicts generated by the
simultaneous execution of multiple threads do not cause significant
performance degradation is also challenging.
30
Converting Thread-Level
Parallelism into Instruction-Level Parallelism
 Design Challenges in SMT
 In many cases, the potential performance overhead due to
multithreading is small.
 The efficiency of current superscalars is low enough that there is
scope for significant improvement, even at the cost of some overhead.
31
Performance and Efficiency in Advanced
Multiple-Issue Processors
32
Performance and Efficiency in Advanced
Multiple-Issue Processors
 The question of efficiency in terms of silicon area and power is
equally critical.
 Power is the major constraint on modern processors.
 The Itanium 2 is the most inefficient processor for both floating-point
and integer code.
 The Athlon and Pentium 4 both make good use of transistors and
area in terms of efficiency.
 The IBM Power5 is the most effective user of energy.
 In fact, none of the processors offers a great advantage in efficiency.
33
Performance and Efficiency in Advanced
Multiple-Issue Processors
 What Limits Multiple-Issue Processors?
 Power is a function of both static power (proportional to the transistor
count, whether or not the transistors are switching), and dynamic
power (proportional to the product of the number of switching
transistors and the switching rate).
 Static power is certainly a design concern, but dynamic power is
usually the dominant energy consumer.
 A microprocessor trying to achieve both a lower CPI and a higher clock
rate (CR) must switch more transistors and switch them faster.
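Stated compactly (a standard decomposition consistent with the slide, written out here for reference):

P_total = P_static + P_dynamic
P_static ∝ transistor count (consumed whether or not the transistors switch)
P_dynamic ∝ (number of switching transistors) × (switching rate)

So lowering CPI (more transistors switching per cycle) and raising the clock rate (switching them faster) both increase the dynamic term.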
34
Performance and Efficiency in Advanced
Multiple-Issue Processors
 What Limits Multiple-Issue Processors?
 Most techniques used for increasing performance (multiple cores and
multithreading, for example) also increase power consumption.
 The key question is whether a technique is energy efficient:
does it increase power consumption faster than it increases performance?
35
Performance and Efficiency in Advanced
Multiple-Issue Processors
 What Limits Multiple-Issue Processors?
 This inefficiency arises from two primary characteristics:
 First, issuing multiple instructions incurs some overhead in logic
that grows faster than the issue rate grows.
 This logic is responsible for instruction issue analysis, including
dependence checking, register renaming, and similar functions.
 The combined result is that lower CPIs are likely to lead to lower
ratios of performance per watt, simply due to this overhead.
36
Performance and Efficiency in Advanced
Multiple-Issue Processors
 What Limits Multiple-Issue Processors?
 Second, the growing gap between peak issue rates and sustained
performance.
 The number of transistors switching will be proportional to the
peak issue rate, and the performance is proportional to the
sustained rate.
 For example: If we want to sustain four instructions per clock, we must
fetch more, issue more, and initiate execution on more than four
instructions.
 The power will be proportional to the peak rate, but performance
will be at the sustained rate.
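As a rough proportionality (an illustrative restatement of the point above, not a precise model):

power ∝ peak issue rate (the hardware is provisioned, and switches, for the peak)
performance ∝ sustained issue rate
performance per watt ∝ sustained rate / peak rate

So the wider the gap between the peak and sustained rates, the worse the energy efficiency.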
37
Performance and Efficiency in Advanced
Multiple-Issue Processors
 What Limits Multiple-Issue Processors?
 An important technique for increasing the exploitation of ILP, namely
speculation, is inherently inefficient because it can never be perfect.
 If speculation were perfect, it could save power, since it would
reduce the execution time and save static power.
 When speculation is not perfect, it rapidly becomes energy
inefficient, since it requires additional dynamic power.
38
Performance and Efficiency in Advanced
Multiple-Issue Processors
 What Limits Multiple-Issue Processors?
 Focusing on improving clock rate:
 Increasing the clock rate will increase transistor switching
frequency and directly increase power consumption.
 To achieve a faster clock rate, we would need to increase pipeline
depth.
 Deeper pipelines incur additional overhead penalties as well as
higher switching rates.
39
40
This presentation is published only for educational purposes.
shindesir.pvp@gmail.com