Parallel processing involves performing multiple tasks simultaneously to increase computational speed. It can be achieved through pipelining, where instructions are overlapped in execution, or through vector/array processors, where the same operation is performed on multiple data elements at once. The main types are SIMD (single instruction, multiple data) and MIMD (multiple instruction, multiple data). Pipelining provides higher throughput by keeping the pipeline full, but it requires handling dependencies between instructions to avoid hazards that slow things down.
Q.1 What is parallel processing? What is the difference between
pipeline and non-pipeline processing?
Ans. Parallel processing can be described as a class of techniques
which enables the system to achieve simultaneous
data-processing tasks to increase the computational speed
of a computer system.
A parallel processing system can carry out simultaneous
data-processing to achieve faster execution time. For instance,
while an instruction is being processed in the ALU component of
the CPU, the next instruction can be read from memory.
The primary purpose of parallel processing is to enhance the
computer processing capability and increase its throughput.
PIPELINING SYSTEM:
1) Pipelining is a technique where multiple instructions are overlapped
in execution.
2) It has a high throughput.
3) Many instructions are executed at the same time, and execution
is completed in fewer cycles.
4) The pipeline is filled by the CPU scheduler from a pool of waiting
work. Each execution unit has a pipeline associated with it, so that
work is pre-planned.
5) The efficiency of a pipelining system depends on the effectiveness of the CPU
scheduler.
NON-PIPELINING SYSTEM:
1) All the actions are grouped into a single step.
2) It has a low throughput.
3) Only one instruction is executed per unit time, and the execution process requires
more cycles.
4) The CPU scheduler in a non-pipelining system merely chooses from
the pool of waiting work when an execution unit signals that it is free.
5) Efficiency is not dependent on the CPU scheduler.
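The throughput difference between the two systems can be sketched numerically. This is a minimal illustration assuming an ideal k-stage pipeline with one-cycle stages and no hazards; the function names are illustrative, not from any standard library:

```python
def non_pipelined_cycles(n, k):
    """Non-pipelined: each of the n instructions passes through all k steps
    before the next one begins."""
    return n * k

def pipelined_cycles(n, k):
    """Pipelined: the first instruction takes k cycles to fill the pipeline;
    each later instruction completes one cycle after the previous one."""
    return k + n - 1

# 8 instructions on a 4-step machine
print(non_pipelined_cycles(8, 4))  # 32
print(pipelined_cycles(8, 4))      # 11
```

The gap widens as n grows, which is why pipelining raises throughput even though no single instruction finishes any faster.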
Q.2 What is an Array processor? Describe the types of
Array processors.
Ans. The SIMD form of parallel processing is called Array
processing. A two-dimensional grid of processing
elements receives an instruction stream from a central
control processor. As each instruction is transmitted,
all elements execute it simultaneously. Each processing
element is connected to its four nearest neighbors for
the purposes of data exchange. Array processors are
highly specialized machines. They are well suited to
numerical problems that can be expressed in matrix
or vector format. However, they are not very useful in speeding
up general computations.
1. Attached Array Processor:
To improve the performance of the host computer in numerical
computational tasks, an auxiliary processor is attached to it.
An attached array processor has two interfaces:
an input/output interface to a common processor, and
an interface with a local memory.
The local memory is connected to the main memory. The host computer
is a general-purpose computer, and the attached processor is a back-end
machine driven by the host computer.
The array processor is connected through an I/O controller to
the computer, and the computer treats it as an external interface.
2. SIMD Array Processor:
SIMD is a computer with multiple processing units operating in
parallel.
The processing units are synchronized to perform the same
operation under the control of a common control unit, thus
providing a single instruction stream, multiple data stream (SIMD)
organization. SIMD contains a set of identical
processing elements (PEs), each having a local memory M.
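The organization described above can be sketched in a few lines. This is only an illustrative model (the names `local_memories` and `broadcast` are invented for the sketch): one instruction is broadcast, and every PE applies it to its own local data.

```python
# Each processing element (PE) holds one operand pair in its local memory M.
local_memories = [(1, 10), (2, 20), (3, 30), (4, 40)]

def broadcast(instruction, memories):
    """The control unit broadcasts a single instruction; every PE executes
    it simultaneously on its own local data -- the essence of SIMD."""
    return [instruction(a, b) for (a, b) in memories]

result = broadcast(lambda a, b: a + b, local_memories)
print(result)  # [11, 22, 33, 44]
```

One instruction stream, four data streams: the control flow is shared, only the data differs per PE.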
Q.3 Explain any one method for handling branch instructions
in a pipeline.
Ans. One method is to stall the pipeline until the branch decision
is taken (stalling until resolution) and then fetch the correct
instruction flow.
E.g., in a 4-stage pipeline, the stall lasts:
Without forwarding: three clock cycles.
With forwarding: two clock cycles.
If the branch is not taken, the three-cycle penalty is not
justified ⇒ throughput reduction.
Alternatively, we can assume the branch is not taken, and flush the next
3 instructions in the pipeline only if the branch turns out to be
taken.
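The predict-not-taken alternative can be sketched as a simple cost model. This is an illustration under the assumptions above (3-cycle flush penalty, zero cost on a correct not-taken guess); `cycles_with_predict_not_taken` is a hypothetical helper name:

```python
def cycles_with_predict_not_taken(branch_outcomes, flush_penalty=3):
    """Extra cycles paid when we always fetch the fall-through path and
    flush `flush_penalty` instructions only on branches that are taken."""
    return sum(flush_penalty for taken in branch_outcomes if taken)

# 10 branches, 3 of them taken: only the taken ones pay the 3-cycle flush,
# versus 10 * 3 = 30 stall cycles if we stalled on every branch.
outcomes = [True, False, False, True, False, False, False, True, False, False]
print(cycles_with_predict_not_taken(outcomes))  # 9
```

The saving grows with the fraction of branches that are not taken, which is why assuming not-taken beats unconditional stalling.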
Q.4 Discuss Arithmetic and Instruction pipelines. Also draw
the space-time diagram for a four-segment pipeline
showing the time it takes to process eight tasks.
Ans. Arithmetic Pipeline:
Arithmetic pipelines are found in most computers.
They are used for floating-point operations,
multiplication of fixed-point numbers, etc.
For example:
The inputs to the floating-point adder pipeline are:
X = A × 2^a
Y = B × 2^b
Here A and B are mantissas, while a and b are exponents.
Floating-point addition and subtraction is done in 4 parts:
1) Compare the exponents.
2) Align the mantissas.
3) Add or subtract the mantissas.
4) Normalize and produce the result.
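The four stages can be sketched in Python. This is a minimal illustration on unpacked mantissa/exponent pairs, not real IEEE hardware; `fp_add` and its single-direction normalization loop are simplifications introduced for the sketch:

```python
def fp_add(mant_a, exp_a, mant_b, exp_b):
    """Four-stage floating-point addition of X = A*2^a and Y = B*2^b."""
    # Stage 1: compare the exponents (keep the larger one on the left).
    if exp_a < exp_b:
        mant_a, exp_a, mant_b, exp_b = mant_b, exp_b, mant_a, exp_a
    # Stage 2: align the mantissas by shifting the smaller operand right.
    mant_b = mant_b / (2 ** (exp_a - exp_b))
    # Stage 3: add the mantissas.
    mant = mant_a + mant_b
    # Stage 4: normalize the result back into [0.5, 1) if the sum overflowed.
    exp = exp_a
    while mant >= 1.0:
        mant /= 2.0
        exp += 1
    return mant, exp

# (0.75 * 2^3) + (0.5 * 2^1)  ->  6.0 + 1.0  ->  0.875 * 2^3
print(fp_add(0.75, 3, 0.5, 1))   # (0.875, 3)
# (0.75 * 2^2) + (0.75 * 2^2)  ->  3.0 + 3.0  ->  0.75 * 2^3 (normalized)
print(fp_add(0.75, 2, 0.75, 2))  # (0.75, 3)
```

In a real arithmetic pipeline each stage is a separate hardware segment, so four different additions can be in flight at once, one per stage.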
Instruction Pipeline:
In an instruction pipeline, a stream of instructions is executed by
overlapping the fetch, decode, and execute phases of the instruction
cycle. This technique is used to increase the throughput of
the computer system.
An instruction pipeline reads instructions from memory while
previous instructions are being executed in other segments of the
pipeline. Thus we can execute multiple instructions
simultaneously. The pipeline is more efficient if the
instruction cycle is divided into segments of equal duration.
• Space-time diagram for a four-segment pipeline showing
the time it takes to process eight tasks -

Segment   C1  C2  C3  C4  C5  C6  C7  C8  C9  C10 C11
S1        T1  T2  T3  T4  T5  T6  T7  T8
S2            T1  T2  T3  T4  T5  T6  T7  T8
S3                T1  T2  T3  T4  T5  T6  T7  T8
S4                    T1  T2  T3  T4  T5  T6  T7  T8

Hence, it takes 11 clock cycles (k + n - 1 = 4 + 8 - 1 = 11).
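The diagram can also be generated programmatically, which makes the k + n - 1 cycle count easy to verify for any pipeline. This is an illustrative sketch; `space_time` is an invented helper name:

```python
def space_time(n_tasks, n_stages):
    """Build a space-time table: one row per segment, one column per clock
    cycle; each entry is the task number occupying that segment, or None."""
    cycles = n_stages + n_tasks - 1
    table = [[None] * cycles for _ in range(n_stages)]
    for task in range(n_tasks):
        for stage in range(n_stages):
            # Task t enters stage s in cycle t + s (0-indexed).
            table[stage][task + stage] = task + 1
    return table

tbl = space_time(8, 4)
print(len(tbl[0]))  # 11 -- total clock cycles for 8 tasks on 4 segments
print(tbl[0][:3])   # [1, 2, 3] -- segment S1 processes T1, T2, T3 first
```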
Q.5 Discuss all factors which affect the performance of
pipelined processor-based systems.
Ans. The factors are as follows:
1) Pipeline latency.
The fact that the execution time of each instruction does not
decrease puts limitations on pipeline depth;
2) Imbalance among pipeline stages.
Imbalance among the pipe stages reduces performance since
the clock can run no faster than the time needed for the
slowest pipeline stage;
3) Pipeline overhead.
Pipeline overhead arises from the combination of pipeline
register delay (setup time plus propagation delay) and clock
skew.
Q.6 What are pipeline hazards? Distinguish between
structural and control hazards.
Ans. There are situations, called hazards, that prevent the next
instruction in the instruction stream from executing
during its designated clock cycle. Hazards reduce the performance
from the ideal speedup gained by pipelining.
There are three classes of hazards:
Structural Hazards. They arise from resource conflicts when the
hardware cannot support all possible combinations of instructions in
simultaneous overlapped execution.
Data Hazards. They arise when an instruction depends on the result of
a previous instruction in a way that is exposed by the overlapping of
instructions in the pipeline.
Control Hazards. They arise from the pipelining of branches and other
instructions that change the PC.
Q.7 Define the following:
(i) Speedup:
The speedup (S) of a pipelined processor over a non-pipelined
processor, when 'n' tasks are executed on the same processor,
is:
S = Performance of pipelined processor / Performance of non-
pipelined processor
As the performance of a processor is inversely proportional to
the execution time,
S = ETnon-pipeline / ETpipeline
  = [n * k * Tp] / [(k + n - 1) * Tp]
  = [n * k] / [k + n - 1]
When the number of tasks 'n' is significantly larger than k,
that is, n >> k,
S ≈ n * k / n, i.e., S ≈ k
where 'k' is the number of stages in the pipeline.
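The formula can be checked numerically; the sketch below assumes equal stage times Tp, as in the derivation above:

```python
def speedup(n, k):
    """Speedup of a k-stage pipeline over a non-pipelined processor for
    n tasks: S = (n * k) / (k + n - 1)."""
    return (n * k) / (k + n - 1)

print(speedup(8, 4))                 # 32/11 ~= 2.909 for 8 tasks, 4 stages
print(round(speedup(10**6, 4), 3))   # approaches k = 4 as n >> k
```

Note that the speedup never reaches k exactly for finite n; the k + n - 1 term in the denominator is the cost of filling the pipeline.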
(ii) Branch Prediction:
Branch prediction is an approach in computer architecture that
attempts to mitigate the cost of branching. It speeds up the
processing of branch instructions in pipelined CPUs by guessing
the outcome of a conditional branch and fetching instructions
along the predicted path before the branch is resolved. Branch
prediction is typically implemented in hardware using a branch
predictor. It should not be confused with branch predication, a
separate technique in which instructions are executed only if
certain predicates are true.
(iii) RISC Pipeline:
A RISC processor has a 5-stage instruction pipeline to execute all the
instructions in the RISC instruction set. Following are the 5 stages
of the RISC pipeline with their respective operations:
Stage 1 (Instruction Fetch)
In this stage the CPU reads instructions from the address in the
memory whose value is present in the program counter.
Stage 2 (Instruction Decode)
In this stage, instruction is decoded and the register file is accessed
to get the values from the registers used in the instruction.
Stage 3 (Instruction Execute)
In this stage, ALU operations are performed.
Stage 4 (Memory Access)
In this stage, memory operands are read from or written to the
memory address specified in the instruction.
Stage 5 (Write Back)
In this stage, the computed/fetched value is written back to the
destination register specified in the instruction.
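An ideal, hazard-free flow of instructions through these five stages can be sketched as a toy cycle-by-cycle trace (`trace` is an invented helper; stage abbreviations are the standard ones):

```python
# The five classic RISC pipeline stages, one per clock cycle.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def trace(instructions):
    """Map each clock cycle to the instruction:stage pairs active in it,
    assuming an ideal pipeline with no stalls or hazards."""
    schedule = {}
    for i, instr in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i reaches stage s in cycle i + s + 1 (1-indexed).
            schedule.setdefault(i + s + 1, []).append(f"{instr}:{stage}")
    return schedule

sched = trace(["ADD", "LOAD"])
print(sched[1])  # ['ADD:IF']
print(sched[2])  # ['ADD:ID', 'LOAD:IF'] -- two instructions overlap
```

Cycle 2 already shows the overlap that gives the pipeline its throughput: ADD is decoding while LOAD is being fetched.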
(iv) Delayed Branch:
A delayed branch is a conditional branch instruction found in
some RISC architectures that include pipelining. The effect
is to execute one or more instructions following the
conditional branch before the branch is taken. This avoids
stalling the pipeline while the branch condition is evaluated,
thus keeping the pipeline full and minimizing the effect of
conditional branches on processor performance.
Q.8 Explain the following:
i) Vector Processor:
A vector processor is a central processing unit that can
operate on an entire vector in one instruction. The
operand supplied to the processor is one complete
vector instead of a single element. Vector processors
are used because they reduce the instruction fetch and
decode bandwidth, owing to the fact that fewer
instructions must be fetched.
A vector processor is also known as an 'array
processor'.
ii) Memory Interleaving:
Memory interleaving is a technique for increasing memory
speed. It makes the system more efficient, fast, and
reliable, and it compensates for the relatively slow speed
of DRAM (Dynamic RAM). In this technique, the main memory
is divided into memory banks which can be accessed
individually, without any dependency on the others. An
interleaved memory with n banks is said to be n-way
interleaved. In a two-way interleaved memory system, there
are still two banks of DRAM, but logically the system
appears as one bank of memory that is twice as large.
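Low-order interleaving can be sketched as a simple address-to-bank mapping. This illustrates one common scheme, not any specific memory controller; `bank_and_offset` is an invented helper name:

```python
def bank_and_offset(address, n_banks=4):
    """Low-order interleaving: consecutive addresses land in consecutive
    banks, so sequential accesses rotate through all banks and can overlap."""
    return address % n_banks, address // n_banks

# Sequential addresses 0..7 rotate through the 4 banks in turn.
print([bank_and_offset(a)[0] for a in range(8)])  # [0, 1, 2, 3, 0, 1, 2, 3]
```

Because a sequential access stream touches a different bank each cycle, each DRAM bank gets n cycles to recover before it is addressed again.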
iii) Super Computers:
A supercomputer is any of a class of extremely powerful
computers. The term is commonly applied to the fastest
high-performance systems available at any given time.
Such computers have been used primarily for scientific
and engineering work requiring exceedingly high-speed
computations. Common applications for supercomputers
include testing mathematical models for complex physical
phenomena or designs, such as climate and weather,
evolution of the cosmos, nuclear weapons and reactors,
new chemical compounds (especially for pharmaceutical
purposes), and cryptology. As the cost of supercomputing
declined in the 1990s, more businesses began to use
supercomputers for market research and other business-
related models.
Q.9 Write down Flynn's classification of computers.
Ans. According to Flynn's classification, based on the number of
instruction and data streams that can be processed simultaneously,
computing systems are classified into four major categories:
1) SISD (Single Instruction Single Data Stream):
A SISD computing system is a uniprocessor machine that is
capable of executing a single instruction operating on a single
data stream. Most conventional computers have the SISD
architecture, where all the instructions and data to be
processed have to be stored in primary memory.
2) SIMD (Single Instruction Multiple Data Stream):
A SIMD system is a multiprocessor machine, capable of
executing the same instruction on all the CPUs but operating
on different data streams.
The ILLIAC IV is a classic real-life example of a SIMD machine.
3) MISD (Multiple Instruction Single Data Stream):
An MISD computing system is a multiprocessor machine capable of
executing different instructions on processing elements but all
of them operating on the same data set.
4) MIMD (Multiple Instruction Multiple Data Stream):
A MIMD system is a multiprocessor machine that is capable of
executing multiple instructions over multiple data streams. Each
processing element has a separate instruction stream and data
stream.
Q.10 Explain Data Hazards. Describe the types of Data
Hazards.
Ans. Data hazards occur when instructions that exhibit data
dependence modify data in different stages of a pipeline.
Hazards cause delays in the pipeline.
There are mainly three types of data hazards:
1) RAW (Read after Write) [Flow/True data dependency]
2) WAR (Write after Read) [Anti-Data dependency]
3) WAW (Write after Write) [Output data dependency]
RAW hazard occurs when instruction J tries to read data
before instruction I writes it.
Example:
I: R2 <- R1 + R3
J: R4 <- R2 + R3
WAR hazard occurs when instruction J tries to write data
before instruction I reads it.
Example:
I: R2 <- R1 + R3
J: R3 <- R4 + R5
WAW hazard occurs when instruction J tries to write
before instruction I writes it.
Example:
I: R2 <- R1 + R3
J: R2 <- R4 + R5
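The three cases above can be illustrated with a small dependence checker. This is a sketch: each instruction is modeled as a destination register plus a set of source registers, and `classify_hazard` is an invented helper name:

```python
def classify_hazard(instr_i, instr_j):
    """Classify the dependences between an earlier instruction I and a
    later instruction J. Each instruction is (dest_register, source_set)."""
    dest_i, srcs_i = instr_i
    dest_j, srcs_j = instr_j
    hazards = []
    if dest_i in srcs_j:
        hazards.append("RAW")  # J reads what I writes (true dependency)
    if dest_j in srcs_i:
        hazards.append("WAR")  # J writes what I reads (anti-dependency)
    if dest_i == dest_j:
        hazards.append("WAW")  # both write the same register (output dep.)
    return hazards

# I: R2 <- R1 + R3 ;  J: R4 <- R2 + R3  ->  RAW on R2
print(classify_hazard(("R2", {"R1", "R3"}), ("R4", {"R2", "R3"})))  # ['RAW']
# I: R2 <- R1 + R3 ;  J: R3 <- R4 + R5  ->  WAR on R3
print(classify_hazard(("R2", {"R1", "R3"}), ("R3", {"R4", "R5"})))  # ['WAR']
```

Only RAW is a true data dependency; WAR and WAW are name dependencies that register renaming can remove.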