SlideShare ist ein Scribd-Unternehmen logo
1 von 49
Downloaden Sie, um offline zu lesen
CSE539: Advanced Computer Architecture

Chapter 6

Pipelining and Superscalar
Techniques
Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani

Sumit Mittu
Assistant Professor, CSE/IT
Lovely Professional University
sumit.12735@lpu.co.in
In this chapter…
•
•
•
•
•

Linear Pipeline Processors
Non-linear Pipeline Processors
Instruction Pipeline Design
Arithmetic Pipeline Design
Superscalar Pipeline Design

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

2
LINEAR PIPELINE PROCESSORS
• Linear Pipeline Processor
o (Definition)

• Models of Linear Pipeline
o Synchronous Model
o Asynchronous Model
o (Corresponding reservation tables)

• Clocking and Timing Control
o
o
o
o
o

Clock Cycle
Pipeline Frequency
Clock skewing
Flow-through delay
Speedup, Efficiency and Throughput

• Optimal number of Stages and Performance-Cost Ratio (PCR)
Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

3
LINEAR PIPELINE PROCESSORS

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

4
LINEAR PIPELINE PROCESSORS

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

5
NON-LINEAR PIPELINE PROCESSORS
• Dynamic Pipeline
o Static v/s Dynamic Pipeline
o Streamline connection, feed-forward connection and feedback connection

• Reservation and Latency Analysis
o Reservation tables
o Evaluation time

• Latency Analysis
o
o
o
o

Latency
Collision
Forbidden latencies
Latency Sequence, Latency Cycle and Average Latency

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

6
NON-LINEAR PIPELINE PROCESSORS

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

7
NON-LINEAR PIPELINE PROCESSORS

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

8
NON-LINEAR PIPELINE PROCESSORS

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

9
INSTRUCTION PIPELINE DESIGN
• Instruction Execution Phases
o E.g. Fetch, Decode, Issue, Execute, Write-back
o In-order Instruction issuing and Reordered Instruction issuing
• E.g. X = Y + Z , A = B x C

• Mechanisms/Design Issues for Instruction Pipelining
o
o
o
o

Pre-fetch Buffers
Multiple Functional Units
Internal Data Forwarding
Hazard Avoidance

• Dynamic Scheduling
• Branch Handling Techniques
Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

10
INSTRUCTION PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

11
INSTRUCTION PIPELINE DESIGN

•
•
•
•
•

Fetch: fetches instructions from memory; ideally one per cycle
Decode: reveals instruction operations to be performed and identifies the resources needed
Issue: reserves the resources and reads the operands from registers
Execute: actual processing of operations as indicated by instruction
Write Back: writing results into the registers
Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

12
INSTRUCTION PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

13
INSTRUCTION PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

14
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline
• Pre-fetch Buffers
o Sequential Buffers
o Target Buffers
o Loop Buffers

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

15
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline
• Multiple Functional Units
o Reservation Station and Tags
o Slow-station as Bottleneck stage
• Subdivision of Pipeline Bottleneck stage
• Replication of Pipeline Bottleneck stage
• (Example to be discussed)

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

16
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

17
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline
• Internal Forwarding and Register Tagging
o Internal Forwarding:
• A “short-circuit” technique to replace unnecessary memory accesses by register-register
transfers in a sequence of fetch-arithmetic-store operations
o Register Tagging:
• Use of tagged registers , buffers and reservation stations, for exploiting concurrent activities
among multiple arithmetic units
o Store-Fetch Forwarding
• (M  R1, R2  M) replaced by (M  R1, R2  R1)
o Fetch-Fetch Forwarding
• (R1  M, R2  M) replaced by (R1  M, R2  R1)
o Store-Store Overwriting
• (M  R1, M  R2) replaced by (M  R2)

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

18
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline
• Internal Forwarding and Register Tagging

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

19
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline
• Internal Forwarding and Register Tagging

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

20
INSTRUCTION PIPELINE DESIGN
Mechanisms/Design Issues of Instruction Pipeline
• Hazard Detection and Avoidance
o
o
o
o

Domain or Input Set of an instruction
Range or Output Set of an instruction
Data Hazards: RAW, WAR and WAW
Resolution using Register Renaming approach

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

21
INSTRUCTION PIPELINE DESIGN
Dynamic Instruction Scheduling
• Idea of Static Scheduling
o Compiler based scheduling strategy to resolve Interlocking among instructions

• Dynamic Scheduling
o Tomasulo’s Algorithm (Register-Tagging Scheme)
• Hardware based dependence-resolution
o Scoreboarding Technique
• Scoreboard: the centralized control unit
• A kind of data-driven mechanism

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

22
INSTRUCTION PIPELINE DESIGN
Dynamic Instruction Scheduling

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

23
INSTRUCTION PIPELINE DESIGN
Dynamic Instruction Scheduling

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

24
INSTRUCTION PIPELINE DESIGN
Branch Handling Techniques
• Branch Taken, Branch Target, Delay Slot
• Effect of Branching
o Parameters:
• k:
No. of stages in the pipeline
• n:
Total no. of instructions or tasks
• p:
Percentage of Brach instructions over n
• q:
Percentage of successful branch instructions (branch taken) over p.
• b:
Delay Slot
• τ:
Pipeline Cycle Time
o Branch Penalty = q of (p of n) * bτ = pqnbτ
o Effective Execution Time:
• Teff = [k + (n-1)] τ + pqnbτ = [k + (n-1) + pqnb]τ

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

25
INSTRUCTION PIPELINE DESIGN
Branch Handling Techniques
• Effect of Branching
o Effective Throughput:
• Heff = n/Teff
• Heff = n / {[k + (n-1) + pqnb]τ} = nf / [k + (n-1) + pqnb]
• As nInfinity and b = k-1
o H*eff = f / [pq(k-1)+1]
• If p=0 and q=0 (no branching occurs)
o H**eff = f = 1/τ
o Performance Degradation Factor
• D = 1 – H*eff / f = pq(k-1) / [pq(k-1)+1]

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

26
INSTRUCTION PIPELINE DESIGN
Branch Handling Techniques

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

27
INSTRUCTION PIPELINE DESIGN
Branch Handling Techniques
• Branch Prediction
o Static Branch Prediction: based on branch code types
o Dynamic Branch prediction: based on recent branch history
• Strategy 1: Predict the branch direction based on information found at decode stage.
• Strategy 2: Use a cache to store target addresses at effective address calculation stage.
• Strategy 3: Use a cache to store target instructions at fetch stage
o Brach Target Buffer Organization

• Delayed Branches
o A delayed branch of d cycles allows at most d-1 useful instructions to be executed following the
branch taken.
o Execution of these instructions should be independent of branch instruction to achieve a zero
branch penalty
Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

28
INSTRUCTION PIPELINE DESIGN
Branch Handling Techniques

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

29
INSTRUCTION PIPELINE DESIGN
Branch Handling Techniques

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

30
ARITHMETIC PIPELINE DESIGN
Computer Arithmetic Operations
• Finite-precision arithmetic
• Overflow and Underflow
• Fixed-Point operations
o Notations:
• Signed-magnitude, one’s complement and two-complement notation
o Operations:
• Addition:
(n bit, n bit)  (n bit) Sum, 1 bit output carry
• Subtraction:
(n bit, n bit)  (n bit) difference
• Multiplication:
(n bit, n bit)  (2n bit) product
• Division:
(2n bit, n bit)  (n bit) quotient, (n bit) remainder

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

31
ARITHMETIC PIPELINE DESIGN
Computer Arithmetic Operations
• Floating-Point Numbers
o X = (m, e) representation
• m: mantissa or fraction
• e: exponent with an implied base or radix r.
• Actual Value X = m * r e
o Operations on numbers X = (mx, ex) and Y = (my, ey)
• Addition:
(mx * rex-ey + my, ey)
• Subtraction:
(mx * rex-ey – my, ey)
• Multiplication:
(mx * my, ex+ey)
• Division:
(mx / my, ex – ey)

• Elementary Functions
o Transcendental functions like: Trigonometric, Exponential, Logarithmic, etc.

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

32
ARITHMETIC PIPELINE DESIGN
Static Arithmetic Pipelines
•
•
•
•

Separate units for fixed point operations and floating point operations
Scalar and Vector Arithmetic Pipelines
Uni-functional or Static Pipelines
Arithmetic Pipeline Stages
o Majorly involve hardware to perform: Add and Shift micro-operations
o Addition using: Carry Propagation Adder (CPA) and Carry Save Adder (CSA)
o Shift using: Shift Registers

• Multiplication Pipeline Design
o E.g. To multiply two 8-bit numbers that yield a 16-bit product using CSA and CPA Wallace Tree.

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

33
ARITHMETIC PIPELINE DESIGN
Static Arithmetic Pipelines

0

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

34
ARITHMETIC PIPELINE DESIGN
Static Arithmetic Pipelines

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

35
ARITHMETIC PIPELINE DESIGN
Multifunctional Arithmetic Pipelines
• Multifunctional Pipeline:
o Static multifunctional pipeline
o Dynamic multifunctional pipeline

• Case Study: T1/ASC static multifunctional pipeline architecture

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

36
ARITHMETIC PIPELINE DESIGN
Multifunctional Arithmetic Pipelines

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

37
ARITHMETIC PIPELINE DESIGN
Multifunctional Arithmetic Pipelines

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

38
SUPERSCALAR PIPELINE DESIGN
• Pipeline Design Parameters
o Pipeline cycle, Base cycle, Instruction issue rate, Instruction issue Latency, Simple Operation Latency
o ILP to fully utilize the pipeline

•
•
•
•

Superscalar Pipeline Structure
Data and Resource Dependencies
Pipeline Stalling
Superscalar Pipeline Scheduling
o In-order Issue and in-order completion
o In-order Issue and out-of-order completion
o Out-of-order Issue and out-of-order completion

• Superscalar Performance
Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

39
SUPERSCALAR PIPELINE DESIGN
Parameter

Base Scalar Processor

Super Scalar Processor
(degree = K)

Pipeline Cycle

1 (base cycle)

K

Instruction Issue Rate

1

K

Instruction Issue Latency

1

1

Simple Operation Latency

1

1

ILP to fully utilize pipeline

1

K

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

40
SUPERSCALAR PIPELINE DESIGN

4,

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

41
SUPERSCALAR PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

42
SUPERSCALAR PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

43
SUPERSCALAR PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

44
SUPERSCALAR PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

45
SUPERSCALAR PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

46
SUPERSCALAR PIPELINE DESIGN

4,

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

47
SUPERSCALAR PIPELINE DESIGN

Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

48
SUPERSCALAR PIPELINE DESIGN
• Time required by base scalar machine:
o T(1,1) = K + N – 1

• The ideal execution time required by m-issue superscalar machine:
o T(m,1) = K + (N – m)/m
o Where,
• K is the time required to execute first m instructions through m pipelines of k-stages
simultaneously
• Second term corresponds to time required to execute remaining N-m instructions , m per
cycle through m pipelines

• The ideal speedup of superscalar machine
o S(m,1) = T(1,1)/T(m,1) = m(N + k – 1)/[N+ m(k – 1)]

• As n  infinity, S(m,1) m
Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University

49

Weitere ähnliche Inhalte

Was ist angesagt?

Window to viewport transformation
Window to viewport transformationWindow to viewport transformation
Window to viewport transformation
Ankit Garg
 
01 knapsack using backtracking
01 knapsack using backtracking01 knapsack using backtracking
01 knapsack using backtracking
mandlapure
 

Was ist angesagt? (20)

Polygons - Computer Graphics - Notes
Polygons - Computer Graphics - NotesPolygons - Computer Graphics - Notes
Polygons - Computer Graphics - Notes
 
Predicate logic_2(Artificial Intelligence)
Predicate logic_2(Artificial Intelligence)Predicate logic_2(Artificial Intelligence)
Predicate logic_2(Artificial Intelligence)
 
Lecture 5 6_7 - divide and conquer and method of solving recurrences
Lecture 5 6_7 - divide and conquer and method of solving recurrencesLecture 5 6_7 - divide and conquer and method of solving recurrences
Lecture 5 6_7 - divide and conquer and method of solving recurrences
 
Asymptotic Notation
Asymptotic NotationAsymptotic Notation
Asymptotic Notation
 
DeadLock in Operating-Systems
DeadLock in Operating-SystemsDeadLock in Operating-Systems
DeadLock in Operating-Systems
 
Code optimization in compiler design
Code optimization in compiler designCode optimization in compiler design
Code optimization in compiler design
 
Arithmatic pipline
Arithmatic piplineArithmatic pipline
Arithmatic pipline
 
Pipeline
PipelinePipeline
Pipeline
 
Window to viewport transformation
Window to viewport transformationWindow to viewport transformation
Window to viewport transformation
 
Raster scan systems with video controller and display processor
Raster scan systems with video controller and display processorRaster scan systems with video controller and display processor
Raster scan systems with video controller and display processor
 
Amortized Analysis of Algorithms
Amortized Analysis of Algorithms Amortized Analysis of Algorithms
Amortized Analysis of Algorithms
 
L03 ai - knowledge representation using logic
L03 ai - knowledge representation using logicL03 ai - knowledge representation using logic
L03 ai - knowledge representation using logic
 
Dag representation of basic blocks
Dag representation of basic blocksDag representation of basic blocks
Dag representation of basic blocks
 
Raster animation
Raster animationRaster animation
Raster animation
 
Computer graphics basic transformation
Computer graphics basic transformationComputer graphics basic transformation
Computer graphics basic transformation
 
Floating point arithmetic operations (1)
Floating point arithmetic operations (1)Floating point arithmetic operations (1)
Floating point arithmetic operations (1)
 
Instruction format
Instruction formatInstruction format
Instruction format
 
01 knapsack using backtracking
01 knapsack using backtracking01 knapsack using backtracking
01 knapsack using backtracking
 
Code generation in Compiler Design
Code generation in Compiler DesignCode generation in Compiler Design
Code generation in Compiler Design
 
ANIMATION SEQUENCE
ANIMATION SEQUENCEANIMATION SEQUENCE
ANIMATION SEQUENCE
 

Andere mochten auch

Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
Md. Mahedi Mahfuj
 
Superscalar & superpipeline processor
Superscalar & superpipeline processorSuperscalar & superpipeline processor
Superscalar & superpipeline processor
Muhammad Ishaq
 
Instruction pipelining
Instruction pipeliningInstruction pipelining
Instruction pipelining
Tech_MX
 
A fast graphic api for non-linear machine learning
A fast graphic api for non-linear machine learningA fast graphic api for non-linear machine learning
A fast graphic api for non-linear machine learning
Peng Cheng
 
Lec Jan29 2009
Lec Jan29 2009Lec Jan29 2009
Lec Jan29 2009
Ravi Soni
 
High Throughput, High Content Screening - Automating the Pipeline
High Throughput, High Content Screening - Automating the PipelineHigh Throughput, High Content Screening - Automating the Pipeline
High Throughput, High Content Screening - Automating the Pipeline
Rajarshi Guha
 
Pipelining Coputing
Pipelining CoputingPipelining Coputing
Pipelining Coputing
m.assayony
 
Chapter 04 the processor
Chapter 04   the processorChapter 04   the processor
Chapter 04 the processor
Bảo Hoang
 

Andere mochten auch (20)

Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
 
Superscalar & superpipeline processor
Superscalar & superpipeline processorSuperscalar & superpipeline processor
Superscalar & superpipeline processor
 
pipelining
pipeliningpipelining
pipelining
 
Instruction pipeline: Computer Architecture
Instruction pipeline: Computer ArchitectureInstruction pipeline: Computer Architecture
Instruction pipeline: Computer Architecture
 
Instruction pipelining
Instruction pipeliningInstruction pipelining
Instruction pipelining
 
A fast graphic api for non-linear machine learning
A fast graphic api for non-linear machine learningA fast graphic api for non-linear machine learning
A fast graphic api for non-linear machine learning
 
Arithmatic sequence Slide
Arithmatic sequence SlideArithmatic sequence Slide
Arithmatic sequence Slide
 
Improving Test Team Throughput via Architecture by Dustin Williams
Improving Test Team Throughput via Architecture by Dustin WilliamsImproving Test Team Throughput via Architecture by Dustin Williams
Improving Test Team Throughput via Architecture by Dustin Williams
 
Lec Jan29 2009
Lec Jan29 2009Lec Jan29 2009
Lec Jan29 2009
 
Reyes and Shader Pipeline
Reyes and Shader PipelineReyes and Shader Pipeline
Reyes and Shader Pipeline
 
L4 speeding-up-execution
L4 speeding-up-executionL4 speeding-up-execution
L4 speeding-up-execution
 
High Throughput, High Content Screening - Automating the Pipeline
High Throughput, High Content Screening - Automating the PipelineHigh Throughput, High Content Screening - Automating the Pipeline
High Throughput, High Content Screening - Automating the Pipeline
 
Aca2 08 new
Aca2 08 newAca2 08 new
Aca2 08 new
 
Pipelining Coputing
Pipelining CoputingPipelining Coputing
Pipelining Coputing
 
Tomasulo Algorithm
Tomasulo AlgorithmTomasulo Algorithm
Tomasulo Algorithm
 
3 Pipelining
3 Pipelining3 Pipelining
3 Pipelining
 
Instruction pipelining (ii)
Instruction pipelining (ii)Instruction pipelining (ii)
Instruction pipelining (ii)
 
Chapter 04 the processor
Chapter 04   the processorChapter 04   the processor
Chapter 04 the processor
 
Instruction pipelining
Instruction pipeliningInstruction pipelining
Instruction pipelining
 
Pipelinig hazardous
Pipelinig hazardousPipelinig hazardous
Pipelinig hazardous
 

Ähnlich wie Aca2 06 new

Project Management Fundamental
Project Management FundamentalProject Management Fundamental
Project Management Fundamental
Andy Pham, PMP
 
solve 6 , quiz 2link of book httpwww.irccyn.ec-nantes.fr~mart.pdf
solve 6 , quiz 2link of book httpwww.irccyn.ec-nantes.fr~mart.pdfsolve 6 , quiz 2link of book httpwww.irccyn.ec-nantes.fr~mart.pdf
solve 6 , quiz 2link of book httpwww.irccyn.ec-nantes.fr~mart.pdf
arihantcomputersddn
 

Ähnlich wie Aca2 06 new (20)

Aca2 01 new
Aca2 01 newAca2 01 new
Aca2 01 new
 
6 data envelopment_analysis
6 data envelopment_analysis6 data envelopment_analysis
6 data envelopment_analysis
 
.pptx
.pptx.pptx
.pptx
 
PPCE 1324.pptx process planning and cost estimation
PPCE 1324.pptx process planning and cost estimationPPCE 1324.pptx process planning and cost estimation
PPCE 1324.pptx process planning and cost estimation
 
Project Management Fundamental
Project Management FundamentalProject Management Fundamental
Project Management Fundamental
 
Cpmprt
CpmprtCpmprt
Cpmprt
 
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- IntroLec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
Lec1 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Intro
 
01-Introduction_to_Optimization-v2021.2-Sept23-2021.pptx
01-Introduction_to_Optimization-v2021.2-Sept23-2021.pptx01-Introduction_to_Optimization-v2021.2-Sept23-2021.pptx
01-Introduction_to_Optimization-v2021.2-Sept23-2021.pptx
 
NEW-Workshop Discovery Thammasat 2022.pptx
NEW-Workshop Discovery Thammasat 2022.pptxNEW-Workshop Discovery Thammasat 2022.pptx
NEW-Workshop Discovery Thammasat 2022.pptx
 
Week 8: Programming for Data Analysis
Week 8: Programming for Data AnalysisWeek 8: Programming for Data Analysis
Week 8: Programming for Data Analysis
 
Reducing computational complexity of Mathematical functions using FPGA
Reducing computational complexity of Mathematical functions using FPGAReducing computational complexity of Mathematical functions using FPGA
Reducing computational complexity of Mathematical functions using FPGA
 
Computer aided process design and simulation (Cheg.pptx
Computer aided process design and simulation (Cheg.pptxComputer aided process design and simulation (Cheg.pptx
Computer aided process design and simulation (Cheg.pptx
 
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- PerformanceLec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
Lec3 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Performance
 
Relational algebra
Relational algebraRelational algebra
Relational algebra
 
CIM UNIT 1.ppt
CIM UNIT 1.pptCIM UNIT 1.ppt
CIM UNIT 1.ppt
 
lec01.pdf
lec01.pdflec01.pdf
lec01.pdf
 
Chapter 10_Project Scheduling Using Critical Path Method (CPM).pdf
Chapter 10_Project Scheduling Using Critical Path Method (CPM).pdfChapter 10_Project Scheduling Using Critical Path Method (CPM).pdf
Chapter 10_Project Scheduling Using Critical Path Method (CPM).pdf
 
Automated University Timetabling
Automated University TimetablingAutomated University Timetabling
Automated University Timetabling
 
solve 6 , quiz 2link of book httpwww.irccyn.ec-nantes.fr~mart.pdf
solve 6 , quiz 2link of book httpwww.irccyn.ec-nantes.fr~mart.pdfsolve 6 , quiz 2link of book httpwww.irccyn.ec-nantes.fr~mart.pdf
solve 6 , quiz 2link of book httpwww.irccyn.ec-nantes.fr~mart.pdf
 
ECESLU Microprocessors Lecture 3
ECESLU Microprocessors Lecture 3ECESLU Microprocessors Lecture 3
ECESLU Microprocessors Lecture 3
 

Mehr von Sumit Mittu (9)

Int306 03
Int306 03Int306 03
Int306 03
 
Int306 02
Int306 02Int306 02
Int306 02
 
Int306 01
Int306 01Int306 01
Int306 01
 
Int306 00
Int306 00Int306 00
Int306 00
 
Int306 04
Int306 04Int306 04
Int306 04
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11
 
Aca2 09 new
Aca2 09 newAca2 09 new
Aca2 09 new
 
Aca2 07 new
Aca2 07 newAca2 07 new
Aca2 07 new
 
Aca11 bk2 ch9
Aca11 bk2 ch9Aca11 bk2 ch9
Aca11 bk2 ch9
 

Kürzlich hochgeladen

1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
 

Kürzlich hochgeladen (20)

This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesEnergy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 

Aca2 06 new

  • 1. CSE539: Advanced Computer Architecture Chapter 6 Pipelining and Superscalar Techniques Book: “Advanced Computer Architecture – Parallelism, Scalability, Programmability”, Hwang & Jotwani Sumit Mittu Assistant Professor, CSE/IT Lovely Professional University sumit.12735@lpu.co.in
  • 2. In this chapter… • • • • • Linear Pipeline Processors Non-linear Pipeline Processors Instruction Pipeline Design Arithmetic Pipeline Design Superscalar Pipeline Design Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 2
  • 3. LINEAR PIPELINE PROCESSORS • Linear Pipeline Processor o (Definition) • Models of Linear Pipeline o Synchronous Model o Asynchronous Model o (Corresponding reservation tables) • Clocking and Timing Control o o o o o Clock Cycle Pipeline Frequency Clock skewing Flow-through delay Speedup, Efficiency and Throughput • Optimal number of Stages and Performance-Cost Ratio (PCR) Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 3
  • 4. LINEAR PIPELINE PROCESSORS Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 4
  • 5. LINEAR PIPELINE PROCESSORS Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 5
  • 6. NON-LINEAR PIPELINE PROCESSORS • Dynamic Pipeline o Static v/s Dynamic Pipeline o Streamline connection, feed-forward connection and feedback connection • Reservation and Latency Analysis o Reservation tables o Evaluation time • Latency Analysis o o o o Latency Collision Forbidden latencies Latency Sequence, Latency Cycle and Average Latency Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 6
  • 7. NON-LINEAR PIPELINE PROCESSORS Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 7
  • 8. NON-LINEAR PIPELINE PROCESSORS Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 8
  • 9. NON-LINEAR PIPELINE PROCESSORS Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 9
  • 10. INSTRUCTION PIPELINE DESIGN • Instruction Execution Phases o E.g. Fetch, Decode, Issue, Execute, Write-back o In-order Instruction issuing and Reordered Instruction issuing • E.g. X = Y + Z , A = B x C • Mechanisms/Design Issues for Instruction Pipelining o o o o Pre-fetch Buffers Multiple Functional Units Internal Data Forwarding Hazard Avoidance • Dynamic Scheduling • Branch Handling Techniques Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 10
  • 11. INSTRUCTION PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 11
  • 12. INSTRUCTION PIPELINE DESIGN • • • • • Fetch: fetches instructions from memory; ideally one per cycle Decode: reveals instruction operations to be performed and identifies the resources needed Issue: reserves the resources and reads the operands from registers Execute: actual processing of operations as indicated by instruction Write Back: writing results into the registers Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 12
  • 13. INSTRUCTION PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 13
  • 14. INSTRUCTION PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 14
  • 15. INSTRUCTION PIPELINE DESIGN Mechanisms/Design Issues of Instruction Pipeline • Pre-fetch Buffers o Sequential Buffers o Target Buffers o Loop Buffers Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 15
  • 16. INSTRUCTION PIPELINE DESIGN Mechanisms/Design Issues of Instruction Pipeline • Multiple Functional Units o Reservation Station and Tags o Slow-station as Bottleneck stage • Subdivision of Pipeline Bottleneck stage • Replication of Pipeline Bottleneck stage • (Example to be discussed) Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 16
  • 17. INSTRUCTION PIPELINE DESIGN Mechanisms/Design Issues of Instruction Pipeline Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 17
  • 18. INSTRUCTION PIPELINE DESIGN Mechanisms/Design Issues of Instruction Pipeline • Internal Forwarding and Register Tagging o Internal Forwarding: • A “short-circuit” technique to replace unnecessary memory accesses by register-register transfers in a sequence of fetch-arithmetic-store operations o Register Tagging: • Use of tagged registers , buffers and reservation stations, for exploiting concurrent activities among multiple arithmetic units o Store-Fetch Forwarding • (M  R1, R2  M) replaced by (M  R1, R2  R1) o Fetch-Fetch Forwarding • (R1  M, R2  M) replaced by (R1  M, R2  R1) o Store-Store Overwriting • (M  R1, M  R2) replaced by (M  R2) Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 18
  • 19. INSTRUCTION PIPELINE DESIGN Mechanisms/Design Issues of Instruction Pipeline • Internal Forwarding and Register Tagging Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 19
  • 20. INSTRUCTION PIPELINE DESIGN Mechanisms/Design Issues of Instruction Pipeline • Internal Forwarding and Register Tagging Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 20
  • 21. INSTRUCTION PIPELINE DESIGN Mechanisms/Design Issues of Instruction Pipeline • Hazard Detection and Avoidance o o o o Domain or Input Set of an instruction Range or Output Set of an instruction Data Hazards: RAW, WAR and WAW Resolution using Register Renaming approach Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 21
  • 22. INSTRUCTION PIPELINE DESIGN Dynamic Instruction Scheduling • Idea of Static Scheduling o Compiler based scheduling strategy to resolve Interlocking among instructions • Dynamic Scheduling o Tomasulo’s Algorithm (Register-Tagging Scheme) • Hardware based dependence-resolution o Scoreboarding Technique • Scoreboard: the centralized control unit • A kind of data-driven mechanism Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 22
  • 23. INSTRUCTION PIPELINE DESIGN Dynamic Instruction Scheduling Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 23
  • 24. INSTRUCTION PIPELINE DESIGN Dynamic Instruction Scheduling Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 24
  • 25. INSTRUCTION PIPELINE DESIGN Branch Handling Techniques • Branch Taken, Branch Target, Delay Slot • Effect of Branching o Parameters: • k: No. of stages in the pipeline • n: Total no. of instructions or tasks • p: Percentage of Brach instructions over n • q: Percentage of successful branch instructions (branch taken) over p. • b: Delay Slot • τ: Pipeline Cycle Time o Branch Penalty = q of (p of n) * bτ = pqnbτ o Effective Execution Time: • Teff = [k + (n-1)] τ + pqnbτ = [k + (n-1) + pqnb]τ Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 25
  • 26. INSTRUCTION PIPELINE DESIGN Branch Handling Techniques • Effect of Branching o Effective Throughput: • Heff = n/Teff • Heff = n / {[k + (n-1) + pqnb]τ} = nf / [k + (n-1) + pqnb] • As nInfinity and b = k-1 o H*eff = f / [pq(k-1)+1] • If p=0 and q=0 (no branching occurs) o H**eff = f = 1/τ o Performance Degradation Factor • D = 1 – H*eff / f = pq(k-1) / [pq(k-1)+1] Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 26
  • 27. INSTRUCTION PIPELINE DESIGN Branch Handling Techniques Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 27
  • 28. INSTRUCTION PIPELINE DESIGN Branch Handling Techniques • Branch Prediction o Static Branch Prediction: based on branch code types o Dynamic Branch prediction: based on recent branch history • Strategy 1: Predict the branch direction based on information found at decode stage. • Strategy 2: Use a cache to store target addresses at effective address calculation stage. • Strategy 3: Use a cache to store target instructions at fetch stage o Brach Target Buffer Organization • Delayed Branches o A delayed branch of d cycles allows at most d-1 useful instructions to be executed following the branch taken. o Execution of these instructions should be independent of branch instruction to achieve a zero branch penalty Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 28
  • 29. INSTRUCTION PIPELINE DESIGN Branch Handling Techniques Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 29
  • 30. INSTRUCTION PIPELINE DESIGN Branch Handling Techniques Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 30
  • 31. ARITHMETIC PIPELINE DESIGN Computer Arithmetic Operations • Finite-precision arithmetic • Overflow and Underflow • Fixed-Point operations o Notations: • Signed-magnitude, one’s complement and two-complement notation o Operations: • Addition: (n bit, n bit)  (n bit) Sum, 1 bit output carry • Subtraction: (n bit, n bit)  (n bit) difference • Multiplication: (n bit, n bit)  (2n bit) product • Division: (2n bit, n bit)  (n bit) quotient, (n bit) remainder Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 31
  • 32. ARITHMETIC PIPELINE DESIGN Computer Arithmetic Operations • Floating-Point Numbers o X = (m, e) representation • m: mantissa or fraction • e: exponent with an implied base or radix r. • Actual Value X = m * r e o Operations on numbers X = (mx, ex) and Y = (my, ey) • Addition: (mx * rex-ey + my, ey) • Subtraction: (mx * rex-ey – my, ey) • Multiplication: (mx * my, ex+ey) • Division: (mx / my, ex – ey) • Elementary Functions o Transcendental functions like: Trigonometric, Exponential, Logarithmic, etc. Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 32
  • 33. ARITHMETIC PIPELINE DESIGN Static Arithmetic Pipelines • • • • Separate units for fixed point operations and floating point operations Scalar and Vector Arithmetic Pipelines Uni-functional or Static Pipelines Arithmetic Pipeline Stages o Majorly involve hardware to perform: Add and Shift micro-operations o Addition using: Carry Propagation Adder (CPA) and Carry Save Adder (CSA) o Shift using: Shift Registers • Multiplication Pipeline Design o E.g. To multiply two 8-bit numbers that yield a 16-bit product using CSA and CPA Wallace Tree. Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 33
  • 34. ARITHMETIC PIPELINE DESIGN Static Arithmetic Pipelines 0 Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 34
  • 35. ARITHMETIC PIPELINE DESIGN Static Arithmetic Pipelines Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 35
  • 36. ARITHMETIC PIPELINE DESIGN Multifunctional Arithmetic Pipelines • Multifunctional Pipeline: o Static multifunctional pipeline o Dynamic multifunctional pipeline • Case Study: T1/ASC static multifunctional pipeline architecture Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 36
  • 37. ARITHMETIC PIPELINE DESIGN Multifunctional Arithmetic Pipelines Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 37
  • 38. ARITHMETIC PIPELINE DESIGN Multifunctional Arithmetic Pipelines Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 38
  • 39. SUPERSCALAR PIPELINE DESIGN • Pipeline Design Parameters o Pipeline cycle, Base cycle, Instruction issue rate, Instruction issue Latency, Simple Operation Latency o ILP to fully utilize the pipeline • • • • Superscalar Pipeline Structure Data and Resource Dependencies Pipeline Stalling Superscalar Pipeline Scheduling o In-order Issue and in-order completion o In-order Issue and out-of-order completion o Out-of-order Issue and out-of-order completion • Superscalar Performance Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 39
  • 40. SUPERSCALAR PIPELINE DESIGN Parameter Base Scalar Processor Super Scalar Processor (degree = K) Pipeline Cycle 1 (base cycle) K Instruction Issue Rate 1 K Instruction Issue Latency 1 1 Simple Operation Latency 1 1 ILP to fully utilize pipeline 1 K Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 40
  • 41. SUPERSCALAR PIPELINE DESIGN 4, Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 41
  • 42. SUPERSCALAR PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 42
  • 43. SUPERSCALAR PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 43
  • 44. SUPERSCALAR PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 44
  • 45. SUPERSCALAR PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 45
  • 46. SUPERSCALAR PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 46
  • 47. SUPERSCALAR PIPELINE DESIGN 4, Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 47
  • 48. SUPERSCALAR PIPELINE DESIGN Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 48
  • 49. SUPERSCALAR PIPELINE DESIGN • Time required by base scalar machine: o T(1,1) = K + N – 1 • The ideal execution time required by m-issue superscalar machine: o T(m,1) = K + (N – m)/m o Where, • K is the time required to execute first m instructions through m pipelines of k-stages simultaneously • Second term corresponds to time required to execute remaining N-m instructions , m per cycle through m pipelines • The ideal speedup of superscalar machine o S(m,1) = T(1,1)/T(m,1) = m(N + k – 1)/[N+ m(k – 1)] • As n  infinity, S(m,1) m Sumit Mittu, Assistant Professor, CSE/IT, Lovely Professional University 49