Datorarkitektur                                            Fö 3 - 1

INSTRUCTION PIPELINING (I)

    1. The Instruction Cycle
    2. Instruction Pipelining
    3. Pipeline Hazards
    4. Structural Hazards
    5. Data Hazards
    6. Control Hazards

Petru Eles, IDA, LiTH


Datorarkitektur                                            Fö 3 - 2

The Instruction Cycle

    Fetch instruction      FI
    Decode                 DI
    Fetch operand          - Calculate operand address (CO)
                           - Fetch operand (FO)
    Execute instruction    - Execute instruction (EI)
                           - Write back operand (WO)




Datorarkitektur                                            Fö 3 - 3

Instruction Pipelining

    •  Instruction execution is extremely complex and involves several
       operations which are executed successively (see slide 2). This
       implies a large amount of hardware, but only one part of this
       hardware works at a given moment.

    •  Pipelining is an implementation technique whereby multiple
       instructions are overlapped in execution. This is achieved without
       additional hardware, only by letting different parts of the
       hardware work for different instructions at the same time.

    •  The pipeline organization of a CPU is similar to an assembly line:
       the work to be done in an instruction is broken into smaller steps
       (pieces), each of which takes a fraction of the time needed to
       complete the entire instruction. Each of these steps is a pipe
       stage (or a pipe segment).

    •  Pipe stages are connected to form a pipe:

           Stage 1 → Stage 2 → ... → Stage n

    •  The time required for moving an instruction from one stage to the
       next is a machine cycle (often this is one clock cycle). The
       execution of one instruction takes several machine cycles as it
       passes through the pipeline.


Datorarkitektur                                            Fö 3 - 4

Acceleration by Pipelining

Two stage pipeline:
    FI: fetch instruction
    EI: execute instruction

Clock cycle →   1   2   3   4   5   6   7   8
Instr. i        FI  EI
Instr. i+1          FI  EI
Instr. i+2              FI  EI
Instr. i+3                  FI  EI
Instr. i+4                      FI  EI
Instr. i+5                          FI  EI
Instr. i+6                              FI  EI

We consider that each instruction takes execution time Tex.

Execution time for the 7 instructions, with pipelining:
    (Tex/2)*8 = 4*Tex
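The arithmetic on slide 4 can be checked with a small sketch. It assumes an ideal pipeline: no hazards, and N stages that each take Tex/N, so k instructions need N + k - 1 cycles. The function names `pipelined_time` and `sequential_time` are made up for this illustration and do not come from the slides.

```python
from fractions import Fraction


def pipelined_time(n_instructions, n_stages, t_ex=Fraction(1)):
    """Total time of an ideal (hazard-free) pipeline, in units of t_ex.

    Assumption: every stage takes t_ex / n_stages, and a new instruction
    enters the pipeline in every cycle.
    """
    cycle_time = Fraction(t_ex) / n_stages
    # The first instruction needs n_stages cycles; each further
    # instruction completes one cycle after its predecessor.
    cycles = n_stages + n_instructions - 1
    return cycles * cycle_time


def sequential_time(n_instructions, t_ex=Fraction(1)):
    """Total time when instructions are executed one after another."""
    return n_instructions * Fraction(t_ex)


# Two-stage pipeline, 7 instructions: (Tex/2) * 8 = 4 * Tex
print(pipelined_time(7, 2))
# Speedup over purely sequential execution (7 * Tex)
print(sequential_time(7) / pipelined_time(7, 2))
```

With Tex as the unit, the first call gives 4, matching the slide, and the ratio shows that the two-stage pipeline runs these 7 instructions 7/4 times faster than sequential execution.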
Datorarkitektur                                            Fö 3 - 5

Acceleration by Pipelining (cont'd)

Six stage pipeline (see also slide 2):
    FI: fetch instruction            FO: fetch operand
    DI: decode instruction           EI: execute instruction
    CO: calculate operand address    WO: write operand

Clock cycle →   1   2   3   4   5   6   7   8   9   10  11  12
Instr. i        FI  DI  CO  FO  EI  WO
Instr. i+1          FI  DI  CO  FO  EI  WO
Instr. i+2              FI  DI  CO  FO  EI  WO
Instr. i+3                  FI  DI  CO  FO  EI  WO
Instr. i+4                      FI  DI  CO  FO  EI  WO
Instr. i+5                          FI  DI  CO  FO  EI  WO
Instr. i+6                              FI  DI  CO  FO  EI  WO

Execution time for the 7 instructions, with pipelining:
    (Tex/6)*12 = 2*Tex

    •  After a certain time (N-1 cycles) all the N stages of the pipeline
       are working: the pipeline is filled. Now, theoretically, the
       pipeline works providing maximal parallelism (N instructions are
       active simultaneously).


Datorarkitektur                                            Fö 3 - 6

Acceleration by Pipelining (cont'd)

    •  Apparently a greater number of stages always provides better
       performance. However:
         - a greater number of stages increases the overhead in moving
           information between stages and in synchronization between
           stages.
         - the complexity of the CPU grows with the number of stages.
         - it is difficult to keep a large pipeline at maximum rate
           because of pipeline hazards.

80486 and Pentium:    five-stage pipeline for integer instr.
                      eight-stage pipeline for FP instr.
PowerPC:              four-stage pipeline for integer instr.
                      six-stage pipeline for FP instr.
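The six-stage diagram and the "pipeline is filled after N-1 cycles" remark can be reproduced with a short sketch. It assumes an ideal, hazard-free pipeline; the helper names `ideal_schedule`, `active_per_cycle` and `render` are hypothetical and chosen only for this example.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def ideal_schedule(n_instructions, stages=STAGES):
    """For each instruction, map clock cycle -> stage (no hazards assumed)."""
    return [
        {k + s + 1: stage for s, stage in enumerate(stages)}
        for k in range(n_instructions)
    ]


def active_per_cycle(schedule):
    """Count the instructions that occupy some stage in each clock cycle."""
    last_cycle = max(max(row) for row in schedule)
    return [
        sum(1 for row in schedule if cycle in row)
        for cycle in range(1, last_cycle + 1)
    ]


def render(schedule, width=4):
    """Render the schedule as a plain-text timing diagram."""
    last_cycle = max(max(row) for row in schedule)
    cycles = range(1, last_cycle + 1)
    lines = ["".join(str(c).ljust(width) for c in cycles).rstrip()]
    for row in schedule:
        lines.append("".join(row.get(c, "").ljust(width) for c in cycles).rstrip())
    return "\n".join(lines)


schedule = ideal_schedule(7)
print(render(schedule))
# Number of active instructions in cycles 1..12
print(active_per_cycle(schedule))
```

For 7 instructions the activity count rises by one per cycle until cycle 6, when all six stages are busy, and falls again once no new instructions enter. This is the filling behaviour described on slide 5.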




Datorarkitektur                                            Fö 3 - 7

Pipeline Hazards

    •  Pipeline hazards are situations that prevent the next instruction
       in the instruction stream from executing during its designated
       clock cycle. The instruction is said to be stalled. When an
       instruction is stalled, all instructions later in the pipeline
       than the stalled instruction are also stalled. Instructions
       earlier than the stalled one can continue. No new instructions
       are fetched during the stall.

    •  Types of hazards:
          1. Structural hazards
          2. Data hazards
          3. Control hazards


Datorarkitektur                                            Fö 3 - 8

Structural Hazards

    •  Structural hazards occur when a certain resource (memory,
       functional unit) is requested by more than one instruction at the
       same time.

Instruction ADD R4,X fetches operand X from memory in its FO stage. The
memory does not accept another access during that cycle.

Clock cycle →   1     2     3     4     5     6     7     8     9     10    11    12
ADD R4,X        FI    DI    CO    FO    EI    WO
Instr. i+1            FI    DI    CO    FO    EI    WO
Instr. i+2                  FI    DI    CO    FO    EI    WO
Instr. i+3                        stall FI    DI    CO    FO    EI    WO
Instr. i+4                                    FI    DI    CO    FO    EI    WO

Penalty: 1 cycle

    •  Certain resources are duplicated in order to avoid structural
       hazards. Functional units (ALU, FP unit) can be pipelined
       themselves in order to support several instructions at a time. A
       classical way to avoid hazards at memory access is by providing
       separate data and instruction caches.
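The structural hazard on slide 8 can be mimicked with a deliberately simplified model. The assumptions are mine, not the slides': there is a single memory port, an operand fetch in FO always wins over an instruction fetch, and an instruction that has been fetched proceeds without further stalls. The function name `fetch_cycles_single_port` is hypothetical.

```python
FO_OFFSET = 3  # in the six-stage pipeline, FO comes three cycles after FI


def fetch_cycles_single_port(uses_memory_in_fo):
    """Return the clock cycle in which each instruction performs its FI.

    uses_memory_in_fo[i] is True when instruction i reads an operand from
    memory in its FO stage (like ADD R4,X). Simplified model: an
    instruction fetch cannot share a cycle with such an operand fetch.
    """
    busy = set()          # cycles in which the memory port is taken by an FO
    fetch_cycles = []
    cycle = 1
    for needs_memory in uses_memory_in_fo:
        while cycle in busy:
            cycle += 1    # stall: the memory accepts no second access
        fetch_cycles.append(cycle)
        if needs_memory:
            busy.add(cycle + FO_OFFSET)
        cycle += 1
    return fetch_cycles


# ADD R4,X followed by four instructions without memory operands
fetches = fetch_cycles_single_port([True, False, False, False, False])
print(fetches)
# Penalty: delay of the last fetch relative to the hazard-free case
print(fetches[-1] - len(fetches))
```

In this model the operand fetch of ADD R4,X in cycle 4 pushes the FI of Instr. i+3 to cycle 5, which is the one-cycle penalty shown in the diagram.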
Datorarkitektur                                            Fö 3 - 9

Data Hazards

    •  We have two instructions, I1 and I2. In a pipeline the execution
       of I2 can start before I1 has terminated. If in a certain stage
       of the pipeline, I2 needs the result produced by I1, but this
       result has not yet been generated, we have a data hazard.

I1:    MUL  R2,R3        R2 ← R2 * R3
I2:    ADD  R1,R2        R1 ← R1 + R2

Clock cycle →   1     2     3     4     5     6     7     8     9     10    11    12
MUL R2,R3       FI    DI    CO    FO    EI    WO
ADD R1,R2             FI    DI    CO    stall stall FO    EI    WO
Instr. i+2                  FI    DI                CO    FO    EI    WO

Before executing its FO stage, the ADD instruction is stalled until the
MUL instruction has written the result into R2.

Penalty: 2 cycles


Datorarkitektur                                            Fö 3 - 10

Data Hazards (cont'd)

    •  Some of the penalty produced by data hazards can be avoided using
       a technique called forwarding (bypassing).

[Figure: each of the two ALU inputs is driven by a MUX that selects
between the value from register or memory and a bypass path from the ALU
output; the ALU result goes to register or memory.]

    •  The ALU result is always fed back to the ALU input. If the
       hardware detects that the value needed for the current operation
       is the one produced by the previous operation (but which has not
       yet been written back), it selects the forwarded result as the
       ALU input, instead of the value read from register or memory.

Clock cycle →   1     2     3     4     5     6     7     8     9     10    11    12
MUL R2,R3       FI    DI    CO    FO    EI    WO
ADD R1,R2             FI    DI    CO    stall FO    EI    WO

After the EI stage of the MUL instruction the result is available by
forwarding. The penalty is reduced to one cycle.
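The two penalties on slides 9 and 10 follow from one rule, sketched below under the timing model of the diagrams: the consumer's FO stage may start in the cycle after the result becomes available, which is after the producer's WO without forwarding and after its EI with forwarding. The function `data_hazard_penalty` and its `distance` parameter (how many instructions separate producer and consumer) are assumptions added for illustration.

```python
# Cycle offset of each stage relative to the instruction's FI cycle
OFFSET = {"FI": 0, "DI": 1, "CO": 2, "FO": 3, "EI": 4, "WO": 5}


def data_hazard_penalty(distance=1, forwarding=False):
    """Stall cycles for a consumer fetched `distance` cycles after its producer.

    Assumed model: without forwarding the operand can be fetched only
    after the producer's WO stage; with forwarding it is available right
    after the producer's EI stage.
    """
    producer_fi = 1
    consumer_fi = producer_fi + distance
    ready_stage = "EI" if forwarding else "WO"
    result_ready_after = producer_fi + OFFSET[ready_stage]
    earliest_fo = result_ready_after + 1
    nominal_fo = consumer_fi + OFFSET["FO"]
    return max(0, earliest_fo - nominal_fo)


# MUL R2,R3 immediately followed by ADD R1,R2
print(data_hazard_penalty(distance=1, forwarding=False))
print(data_hazard_penalty(distance=1, forwarding=True))
```

For the MUL/ADD pair this gives 2 stall cycles without forwarding and 1 with forwarding, as in the diagrams. In the same model, a consumer that is three instructions behind its producer would not stall at all, even without forwarding.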




Datorarkitektur                                            Fö 3 - 11

Control Hazards

    •  Control hazards are produced by branch instructions.

Unconditional branch

                --------------
                BR   TARGET
                --------------
    TARGET      --------------

After the FO stage of the branch instruction the address of the target
is known and it can be fetched.

Clock cycle →   1     2     3     4     5     6     7     8     9     10    11    12
BR TARGET       FI    DI    CO    FO    EI    WO
target                FI    stall stall FI    DI    CO    FO    EI    WO
target+1                                      FI    DI    CO    FO    EI    WO

The instruction following the branch is fetched (the FI in cycle 2);
before the DI is finished it is not known that a branch is executed.
Later the fetched instruction is discarded.

Penalty: 3 cycles


Datorarkitektur                                            Fö 3 - 12

Control Hazards (cont'd)

Conditional branch

                ADD  R1,R2        R1 ← R1 + R2
                BEZ  TARGET       branch if zero
                instruction i+1
                -------------
    TARGET      -------------

Branch is taken

Clock cycle →   1     2     3     4     5     6     7     8     9     10    11    12
ADD R1,R2       FI    DI    CO    FO    EI    WO
BEZ TARGET            FI    DI    CO    FO    EI    WO
target                      FI    stall stall FI    DI    CO    FO    EI    WO

At the end of the stall, both the condition (set by ADD) and the target
address are known.

Penalty: 3 cycles

Branch not taken

Clock cycle →   1     2     3     4     5     6     7     8     9     10    11    12
ADD R1,R2       FI    DI    CO    FO    EI    WO
BEZ TARGET            FI    DI    CO    FO    EI    WO
instr i+1                   FI    stall stall DI    CO    FO    EI    WO

At the end of the stall the condition is known and instr i+1 can go on.

Penalty: 2 cycles
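The branch penalties on slides 11 and 12 can be derived from the stage offsets with a small sketch. It assumes, as in the diagrams, that the branch outcome and target are known once the branch has completed its FO stage, and that the sequential successor is always fetched in the cycle after the branch. The function `branch_penalty` is a hypothetical name for this illustration.

```python
# Cycle offset of each stage relative to the instruction's FI cycle
OFFSET = {"FI": 0, "DI": 1, "CO": 2, "FO": 3, "EI": 4, "WO": 5}


def branch_penalty(taken, branch_fi=1):
    """Lost cycles after a branch whose FI is in cycle `branch_fi`.

    Assumed model: the branch is resolved at the end of its FO stage.
    """
    resolved_after = branch_fi + OFFSET["FO"]
    successor_fi = branch_fi + 1          # fetched before the branch is decoded
    if taken:
        # The fetched successor is discarded and the target is fetched
        # in the first cycle after the branch is resolved.
        target_fi = resolved_after + 1
        return target_fi - successor_fi
    # Not taken: the successor keeps its FI but must wait for the
    # condition before it can continue with DI.
    nominal_di = successor_fi + OFFSET["DI"]
    actual_di = resolved_after + 1
    return actual_di - nominal_di


print(branch_penalty(taken=True))    # unconditional or taken conditional branch
print(branch_penalty(taken=False))   # conditional branch, not taken
```

The model yields 3 cycles for a taken (or unconditional) branch and 2 cycles for a branch that is not taken, matching the three diagrams. The result does not depend on `branch_fi`, since only relative cycle distances matter.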
Datorarkitektur                                            Fö 3 - 13

Control Hazards (cont'd)

    •  With a conditional branch we have a penalty even if the branch is
       not taken. This is because we have to wait until the branch
       condition is available.

    •  Branch instructions represent a major problem in assuring an
       optimal flow through the pipeline. Several approaches have been
       taken for reducing branch penalties (see slides of the following
       lecture).


Datorarkitektur                                            Fö 3 - 14

Summary

    •  Instructions are executed by the CPU as a sequence of steps.
       Instruction execution can be substantially accelerated by
       instruction pipelining.

    •  A pipeline is organized as a succession of N stages. At a certain
       moment N instructions can be active inside the pipeline.

    •  Keeping a pipeline at its maximal rate is prevented by pipeline
       hazards. Structural hazards are due to resource conflicts. Data
       hazards are produced by data dependencies between instructions.
       Control hazards are produced as a consequence of branch
       instructions.

Instruction pipelining (i)

  • 1. Datorarkitektur Fö 3 - 1 Datorarkitektur Fö 3 - 2 The Instruction Cycle INSTRUCTION PIPELINING (I) Fetch instruction FI 1. The Instruction Cycle Decode DI 2. Instruction Pipelining 3. Pipeline Hazards Fetch - Calculate operand address (CO) operand - Fetch operand (FO) 4. Structural Hazards 5. Data Hazards Execute - Execute instruction (EI) instruction - Write back operand (WO) 6. Control Hazards Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH Datorarkitektur Fö 3 - 3 Datorarkitektur Fö 3 - 4 Instruction Pipelining Acceleration by Pipelining • Instruction execution is extremely complex and Two stage pipeline: FI: fetch instruction involves several operations which are executed successively (see slide 2). This implies a large EI: execute instruction amount of hardware, but only one part of this Clock cycle → 1 2 3 4 5 6 7 8 hardware works at a given moment. Instr. i FI EI Instr. i+1 FI EI • Pipelining is an implementation technique whereby Instr. i+2 FI EI multiple instructions are overlapped in execution. This is solved without additional hardware but only Instr. i+3 FI EI by letting different parts of the hardware work for Instr. i+4 FI EI different instructions at the same time. Instr. i+5 FI EI Instr. i+6 FI EI • The pipeline organization of a CPU is similar to an assembly line: the work to be done in an instruction is broken into smaller steps (pieces), each of which takes a fraction of the time needed to complete the entire instruction. Each of these steps is a pipe We consider that each instruction takes execution time Tex. stage (or a pipe segment). • Pipe stages are connected to form a pipe: Execution time for the 7 instructions, with pipelining: (Tex/2)*8= 4*Tex Stage 1 Stage 2 Stage n • The time required for moving an instruction from one stage to the next: a machine cycle (often this is one clock cycle). The execution of one instruction takes several machine cycles as it passes through the pipeline. Petru Eles, IDA, LiTH Petru Eles, IDA, LiTH
  • 2. Slide 5: Acceleration by Pipelining (cont’d)
      Six-stage pipeline (see also slide 2):
        FI: fetch instruction
        DI: decode instruction
        CO: calculate operand address
        FO: fetch operand
        EI: execute instruction
        WO: write operand

      Clock cycle →  1  2  3  4  5  6  7  8  9  10 11 12
      Instr. i       FI DI CO FO EI WO
      Instr. i+1        FI DI CO FO EI WO
      Instr. i+2           FI DI CO FO EI WO
      Instr. i+3              FI DI CO FO EI WO
      Instr. i+4                 FI DI CO FO EI WO
      Instr. i+5                    FI DI CO FO EI WO
      Instr. i+6                       FI DI CO FO EI WO

      Execution time for the 7 instructions, with pipelining: (Tex/6)*12 = 2*Tex
      • After a certain time (N-1 cycles) all the N stages of the pipeline are working: the pipeline is filled. Now, theoretically, the pipeline works providing maximal parallelism (N instructions are active simultaneously).

    Slide 6: Acceleration by Pipelining (cont’d)
      • Apparently a greater number of stages always provides better performance. However:
        - a greater number of stages increases the overhead in moving information between stages and synchronization between stages.
        - with the number of stages the complexity of the CPU grows.
        - it is difficult to keep a large pipeline at maximum rate because of pipeline hazards.

      80486 and Pentium: five-stage pipeline for integer instr.; eight-stage pipeline for FP instr.
      PowerPC: four-stage pipeline for integer instr.; six-stage pipeline for FP instr.

    Slide 7: Pipeline Hazards
      • Pipeline hazards are situations that prevent the next instruction in the instruction stream from executing during its designated clock cycle. The instruction is said to be stalled. When an instruction is stalled, all instructions later in the pipeline than the stalled instruction are also stalled. Instructions earlier than the stalled one can continue. No new instructions are fetched during the stall.
      • Types of hazards:
        1. Structural hazards
        2. Data hazards
        3. Control hazards

    Slide 8: Structural Hazards
      • Structural hazards occur when a certain resource (memory, functional unit) is requested by more than one instruction at the same time.

      Instruction ADD R4,X fetches in the FO stage operand X from memory. The memory doesn’t accept another access during that cycle.

      Clock cycle →  1  2  3  4  5  6  7  8  9  10 11 12
      ADD R4,X       FI DI CO FO EI WO
      Instr. i+1        FI DI CO FO EI WO
      Instr. i+2           FI DI CO FO EI WO
      Instr. i+3              st FI DI CO FO EI WO
      Instr. i+4                    FI DI CO FO EI WO
      (st = stall)

      Penalty: 1 cycle
      • Certain resources are duplicated in order to avoid structural hazards. Functional units (ALU, FP unit) can be pipelined themselves in order to support several instructions at a time. A classical way to avoid hazards at memory access is by providing separate data and instruction caches.
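    The structural hazard of slide 8 can be reproduced with a small scheduling sketch. It assumes, as an illustration only, a single memory port shared by instruction fetch (FI) and operand fetch (FO), with the operand fetch taking priority; the function names and the boolean input encoding are hypothetical.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]
FO_OFFSET = STAGES.index("FO")  # FO happens 3 cycles after FI


def fetch_cycles(uses_memory_operand):
    """Return the clock cycle in which each instruction is fetched.

    uses_memory_operand[i] is True when instruction i reads an operand
    from memory in its FO stage. In that cycle the (assumed single)
    memory port is busy, so no instruction can be fetched.
    """
    busy = set()   # cycles in which memory is taken by an operand fetch
    fetch = []
    cycle = 1
    for uses_mem in uses_memory_operand:
        while cycle in busy:      # stall: memory port not available
            cycle += 1
        fetch.append(cycle)
        if uses_mem:
            busy.add(cycle + FO_OFFSET)
        cycle += 1
    return fetch


def stall_penalty(uses_memory_operand):
    """Cycles lost compared with fetching one instruction per cycle."""
    return fetch_cycles(uses_memory_operand)[-1] - len(uses_memory_operand)


if __name__ == "__main__":
    # ADD R4,X followed by four instructions without memory operands.
    program = [True, False, False, False, False]
    print(fetch_cycles(program), "penalty:", stall_penalty(program))
```

    With the ADD R4,X example the fetch cycles come out as 1, 2, 3, 5, 6: the fourth instruction loses cycle 4 to the operand fetch, which is the one-cycle penalty shown on the slide. With separate instruction and data caches the `busy` set would stay empty and no fetch would be delayed.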
  • 3. Slide 9: Data Hazards
      • We have two instructions, I1 and I2. In a pipeline the execution of I2 can start before I1 has terminated. If in a certain stage of the pipeline, I2 needs the result produced by I1, but this result has not yet been generated, we have a data hazard.

      I1: MUL R2,R3    R2 ← R2 * R3
      I2: ADD R1,R2    R1 ← R1 + R2

      Clock cycle →  1  2  3  4  5  6  7  8  9  10 11 12
      MUL R2,R3      FI DI CO FO EI WO
      ADD R1,R2         FI DI CO st st FO EI WO
      Instr. i+2     FI DI CO FO EI WO (delayed behind the stalled ADD)
      (st = stall)

      Before executing its FO stage, the ADD instruction is stalled until the MUL instruction has written the result into R2.
      Penalty: 2 cycles

    Slide 10: Data Hazards (cont’d)
      • Some of the penalty produced by data hazards can be avoided using a technique called forwarding (bypassing).

      [Figure: ALU with a multiplexer (MUX) on each input. Each MUX selects between the value from register or memory and the bypass path fed back from the ALU output. The ALU result goes to register or memory.]

      • The ALU result is always fed back to the ALU input. If the hardware detects that the value needed for the current operation is the one produced by the previous operation (but which has not yet been written back), it selects the forwarded result as the ALU input, instead of the value read from register or memory.

      Clock cycle →  1  2  3  4  5  6  7  8  9  10 11 12
      MUL R2,R3      FI DI CO FO EI WO
      ADD R1,R2         FI DI CO st FO EI WO
      (st = stall)

      After the EI stage of the MUL instruction the result is available by forwarding. The penalty is reduced to one cycle.

    Slide 11: Control Hazards
      • Control hazards are produced by branch instructions.

      Unconditional branch:
                  --------------
                  BR TARGET
                  --------------
          TARGET  --------------

      Clock cycle →  1  2  3  4  5  6  7  8  9  10 11 12
      BR TARGET      FI DI CO FO EI WO
      target            FI st st FI DI CO FO EI WO
      target+1                      FI DI CO FO EI WO
      (st = stall)

      The instruction following the branch is fetched (cycle 2); before the DI is finished it is not known that a branch is executed. Later the fetched instruction is discarded.
      After the FO stage of the branch instruction the address of the target is known and it can be fetched.
      Penalty: 3 cycles

    Slide 12: Control Hazards (cont’d)
      Conditional branch:
                  ADD R1,R2      R1 ← R1 + R2
                  BEZ TARGET     branch if zero
                  instruction i+1
                  -------------
          TARGET  -------------

      Branch is taken:

      Clock cycle →  1  2  3  4  5  6  7  8  9  10 11 12
      ADD R1,R2      FI DI CO FO EI WO
      BEZ TARGET        FI DI CO FO EI WO
      target               FI st st FI DI CO FO EI WO
      (st = stall)

      After cycle 5, both the condition (set by ADD) and the target address are known.
      Penalty: 3 cycles

      Branch not taken:

      Clock cycle →  1  2  3  4  5  6  7  8  9  10 11 12
      ADD R1,R2      FI DI CO FO EI WO
      BEZ TARGET        FI DI CO FO EI WO
      instr i+1            FI st st DI CO FO EI WO
      (st = stall)

      After cycle 5 the condition is known and instr i+1 can go on.
      Penalty: 2 cycles
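    The penalties quoted on slides 9 to 12 can be combined into a rough cycle count. The sketch below is a first-order estimate only: it assumes the six-stage pipeline of the lecture and that stalls never overlap, so penalties simply add up. The table keys and the function name are illustrative assumptions.

```python
# Stall penalties in cycles, as stated on the slides for the six-stage pipeline.
PENALTY = {
    "data": 2,                  # data hazard, no forwarding
    "data_forwarded": 1,        # data hazard, with forwarding (bypassing)
    "branch_unconditional": 3,
    "branch_taken": 3,          # conditional branch, taken
    "branch_not_taken": 2,      # conditional branch, not taken
}


def total_cycles(n_instructions, hazards, n_stages=6):
    """Estimate the cycle in which the last instruction leaves the pipeline.

    hazards is a list of PENALTY keys, one per hazard encountered.
    Assumes stalls do not overlap, so each hazard adds its full penalty
    to the ideal count of n_stages + n_instructions - 1 cycles.
    """
    ideal = n_stages + n_instructions - 1
    return ideal + sum(PENALTY[h] for h in hazards)


if __name__ == "__main__":
    print("MUL, ADD without forwarding:", total_cycles(2, ["data"]))
    print("MUL, ADD with forwarding:   ", total_cycles(2, ["data_forwarded"]))
    print("BR, target:                 ", total_cycles(2, ["branch_unconditional"]))
    print("ADD, BEZ taken, target:     ", total_cycles(3, ["branch_taken"]))
    print("ADD, BEZ not taken, i+1:    ", total_cycles(3, ["branch_not_taken"]))
```

    The results agree with the timing diagrams: the ADD after MUL writes back in cycle 9 without forwarding and in cycle 8 with it, and the instruction after a taken BEZ writes back in cycle 11. For longer instruction sequences, where several stalls may interact, the additive model should be treated as an approximation.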
  • 4. Slide 13: Control Hazards (cont’d)
      • With conditional branch we have a penalty even if the branch has not been taken. This is because we have to wait until the branch condition is available.
      • Branch instructions represent a major problem in assuring an optimal flow through the pipeline. Several approaches have been taken for reducing branch penalties (see slides of the following lecture).

    Slide 14: Summary
      • Instructions are executed by the CPU as a sequence of steps. Instruction execution can be substantially accelerated by instruction pipelining.
      • A pipeline is organized as a succession of N stages. At a certain moment N instructions can be active inside the pipeline.
      • Keeping a pipeline at its maximal rate is prevented by pipeline hazards. Structural hazards are due to resource conflicts. Data hazards are produced by data dependencies between instructions. Control hazards are produced as a consequence of branch instructions.