Algorithms
1. Introduction
An algorithm is defined as a sequence of computational steps required to
accomplish a specific task. An algorithm works on a given input and
terminates in a well-defined state. The basic properties of an algorithm are: input,
output, definiteness, effectiveness and finiteness. The purpose of developing an
algorithm is to solve a general, well-specified problem.
A further concern while designing an algorithm is the kind of computer
on which the algorithm will be executed. The two forms of computer
architecture are the sequential computer and the parallel computer. Therefore, depending
upon the architecture of the computer, we have sequential as well as parallel
algorithms.
Algorithms executed on sequential computers simply perform a
sequence of steps for solving a given problem. Such
algorithms are known as sequential algorithms.
However, a problem can also be solved by dividing it into sub-problems that are
executed in parallel; the results of these sub-problems are then combined to
obtain the final solution. In such situations, more than one processor is required,
and the processors communicate with each other to produce the final output. This
environment is the parallel computer, and the special kind of algorithms
designed for these computers are called parallel algorithms. Parallel algorithms
depend on the kind of parallel computer they are designed for. Hence, for a given
problem, different kinds of parallel algorithms may need to be designed
depending upon the kind of parallel architecture.
A parallel computer is a set of processors that are able to work cooperatively to
solve a computational problem. This definition is broad enough to include parallel
supercomputers that have hundreds or thousands of processors, networks of
workstations, multi-processor workstations, and embedded systems. The parallel
computers can be represented with the help of various kinds of models such as
random access machine (RAM), parallel random access machine (PRAM),
Interconnection Networks etc. While designing a parallel algorithm, the
computational power of the various models can be analyzed and compared, and
parallelism can be introduced for a given problem on a specific model after
understanding the characteristics of that model. The analysis of a parallel
algorithm on different models assists in determining the best model for a
problem, based on the resulting time and space complexity.
2. Analysis of Parallel Algorithms
A generic algorithm is mainly analyzed on the basis of the following
parameters: the time complexity (execution time) and the space complexity
(amount of space required). Usually we give much more importance to time
complexity than to space complexity. The subsequent sections
highlight the criteria for analyzing the complexity of parallel algorithms. The
fundamental parameters required for the analysis of parallel algorithms are as
follows:
• Time Complexity
• The Total Number of Processors Required
• The Cost Involved.
2.1 Time Complexity
As it happens, most people who implement algorithms want to know how much
of a particular resource (such as time or storage) is required for a given algorithm.
The parallel architectures have been designed for improving the computation
power of various algorithms. Thus, the major concern in evaluating an
algorithm is determining the amount of time required to execute it. Usually,
the time complexity is calculated on the basis of the total number of steps executed
to accomplish the desired output.
The resource consumption of parallel algorithms includes both the processor cycles on
each processor and the communication overhead between the processors.
First, in the computation step, each local processor performs arithmetic
and logic operations. Thereafter, the various processors communicate with each
other, exchanging messages and/or data. Hence, the time complexity can be
calculated on the basis of the computational cost and the communication cost involved.
The time complexity of an algorithm varies depending upon the instance of the
input for a given problem. For example, an already sorted list (10, 17, 19, 21, 22,
33) will consume a smaller amount of time than the same list in reverse order (33,
22, 21, 19, 17, 10). The time complexity of an algorithm is categorized into
three forms:
i) Best Case Complexity;
ii) Average Case Complexity; and
iii) Worst Case Complexity.
The best case complexity is the least amount of time required by the algorithm
for a given input. The average case complexity is the average running time
required by the algorithm for a given input. Similarly, the worst case complexity
can be defined as the maximum amount of time required by the algorithm for a
given input.
The main factors involved in analyzing the time complexity are the:
• Algorithm
• Parallel computer model
• Specific set of inputs.
In most cases, the time complexity of the algorithm is a function of the size of the input.
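As an illustration of best versus worst case (a hypothetical counting experiment, not part of the text), an insertion sort performs far fewer comparisons on the already sorted list from the example above than on its reverse:

```python
def insertion_sort_steps(seq):
    """Sort a copy of seq with insertion sort and count the comparisons."""
    a, steps = list(seq), 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0:
            steps += 1              # count one comparison
            if a[j] <= key:
                break
            a[j + 1] = a[j]         # shift the larger element one slot right
            j -= 1
        a[j + 1] = key
    return a, steps

# Already sorted input: 5 comparisons (best case).
print(insertion_sort_steps([10, 17, 19, 21, 22, 33]))
# Reverse-ordered input: 15 comparisons (worst case).
print(insertion_sort_steps([33, 22, 21, 19, 17, 10]))
```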
2.2 Number of Processors
One of the other factors that assist in the analysis of parallel algorithms is the total
number of processors required to deliver a solution to a given problem. Thus, for a
given input of size n, the number of processors required by the parallel
algorithm is a function of n, usually denoted by P(n).
2.3 Overall Cost
Finally, the total cost of the algorithm is a product of time complexity of the
parallel algorithm and the total number of processors required for computation.
Cost = Time Complexity * Total Number of Processors
The other form of defining the cost is that it specifies the total number of steps
executed collectively by the n processors, i.e., the summation of steps.
Another term related to the analysis of parallel algorithms is the efficiency of the
algorithm. It is defined as the ratio of the worst case running time of the best
sequential algorithm to the cost of the parallel algorithm. The efficiency is
mostly less than or equal to 1; an efficiency less than 1 means that the parallel
algorithm performs more total work than the sequential algorithm, i.e., the
sequential algorithm is more cost-effective.
Efficiency = Worst case running time of sequential algorithm / Cost of parallel algorithm
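The cost and efficiency definitions can be sketched numerically; the figures used below (sequential time n·log2 n, 4 processors, a fixed communication overhead) are illustrative assumptions, not values from the text:

```python
import math

def cost(parallel_time, num_processors):
    # Cost = time complexity of the parallel algorithm * number of processors.
    return parallel_time * num_processors

def efficiency(sequential_time, parallel_time, num_processors):
    # Efficiency = worst-case sequential running time / cost of the parallel algorithm.
    return sequential_time / cost(parallel_time, num_processors)

# Illustrative numbers: sorting n = 1024 items sequentially in n*log2(n) steps,
# or in parallel on 4 processors in (n*log2(n))/4 steps plus 200 steps of overhead.
n = 1024
t_seq = n * math.log2(n)      # 10240 steps
t_par = t_seq / 4 + 200       # 2760 steps
print(efficiency(t_seq, t_par, 4))   # below 1: the overhead costs some efficiency
```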
3. Merge Sort Algorithm
First, divide the given sequence of n numbers into two parts, each consisting of
n/2 numbers. Thereafter, recursively split each part into two until every
number acts as an independent sequence. These independent sequences
are then recursively merged until a single sorted sequence of n numbers is
achieved.
In order to perform the above-mentioned task, two kinds of circuits
are used: one for sorting and another for merging the sorted lists of numbers.
Odd-Even Merging Circuit
Let us firstly illustrate the concept of merging two sorted sequences using an
odd-even merging circuit. The working of a merging circuit is as follows:
1) Let there be two sorted sequences A = (a1, a2, a3, ..., am) and B = (b1, b2, b3,
..., bm) which are required to be merged.
2) With the help of an (m/2, m/2) merging circuit, merge the odd indexed numbers of
the two sequences, i.e., (a1, a3, a5, ..., am-1) and (b1, b3, b5, ..., bm-1),
resulting in the sorted sequence (c1, c2, c3, ..., cm).
3) Thereafter, with the help of an (m/2, m/2) merging circuit, merge the even
indexed numbers of the two sequences, i.e., (a2, a4, a6, ..., am) and (b2, b4, b6, ...,
bm), resulting in the sorted sequence (d1, d2, d3, ..., dm).
4) The final output sequence O = (o1, o2, o3, ..., o2m) is achieved in the following
manner:
o1 = c1 and o2m = dm. In general, the formula is as given below: o2i = min(ci+1, di) and
o2i+1 = max(ci+1, di) for i = 1, 2, 3, ..., m-1.
Now, let us take an example for merging the two sorted sequences of length 4,
i.e., A=(a1, a2, a3, a4) and B=(b1, b2, b3, b4). Suppose the numbers of the sequence are
A=(4,6,9,10) and B=(2,7,8,12). The circuit of merging the two given sequences is
illustrated in Figure 7.
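The four steps above can be sketched as a recursive function (a plain-Python model of the circuit, assuming the two sequences have equal power-of-two length):

```python
def odd_even_merge(a, b):
    """Merge two sorted sequences of equal power-of-two length m
    by the odd-even rule (a software model of the merging circuit)."""
    m = len(a)
    if m == 1:
        return [min(a[0], b[0]), max(a[0], b[0])]   # a single comparator
    # Step 2: merge the odd-indexed elements (a1, a3, ...) and (b1, b3, ...).
    c = odd_even_merge(a[0::2], b[0::2])
    # Step 3: merge the even-indexed elements (a2, a4, ...) and (b2, b4, ...).
    d = odd_even_merge(a[1::2], b[1::2])
    # Step 4: o1 = c1, o2m = dm, o2i = min(c[i+1], d[i]), o2i+1 = max(c[i+1], d[i]).
    out = [c[0]]
    for i in range(m - 1):
        out.append(min(c[i + 1], d[i]))
        out.append(max(c[i + 1], d[i]))
    out.append(d[m - 1])
    return out

# The example from the text:
print(odd_even_merge([4, 6, 9, 10], [2, 7, 8, 12]))
```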
Sorting Circuit along with Odd-Even Merging Circuit
As we already know, the merge sort algorithm requires two circuits, i.e. one for
merging and another for sorting the sequences. Therefore, the sorting circuit has
been derived from the above-discussed merging circuit. The basic steps followed
by the circuit are highlighted below:
i) The given input sequence of length n is divided into two sub-sequences of length
n/2 each.
ii) The two sub-sequences are recursively sorted.
iii) The two sorted sub-sequences are merged using an (n/2, n/2) merging circuit in order
to finally get the sorted sequence of length n.
Now, let us take an example of sorting n numbers, say 4, 2, 10, 12, 8, 7, 6, 9. The
circuit for sorting + merging the given sequence is illustrated in Figure 8.
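Steps i)-iii) can be modelled in the same way; the merge function is repeated here so the sketch is self-contained (again assuming the length is a power of two):

```python
def odd_even_merge(a, b):
    # Odd-even merge of two equal-length sorted sequences (as sketched above).
    m = len(a)
    if m == 1:
        return [min(a[0], b[0]), max(a[0], b[0])]
    c = odd_even_merge(a[0::2], b[0::2])   # merged odd-indexed elements
    d = odd_even_merge(a[1::2], b[1::2])   # merged even-indexed elements
    out = [c[0]]
    for i in range(m - 1):
        out += [min(c[i + 1], d[i]), max(c[i + 1], d[i])]
    return out + [d[m - 1]]

def merge_sort(seq):
    """Steps i)-iii): split into halves, sort recursively, merge the halves."""
    n = len(seq)
    if n == 1:
        return list(seq)
    return odd_even_merge(merge_sort(seq[:n // 2]), merge_sort(seq[n // 2:]))

# The example from the text:
print(merge_sort([4, 2, 10, 12, 8, 7, 6, 9]))
```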
Analysis of Merge Sort
i) The width of the sorting + merging circuit, i.e., the maximum number of
devices required in a stage, is O(n/2). In the above figure the maximum number
of devices in any stage is 4, which is 8/2 = n/2.
ii) The circuit contains two sorting circuits for sorting sequences of length n/2 and
thereafter one merging circuit for merging the two sorted sub-sequences (see
stage 4 in the above figure). Let the functions Ts and Tm denote the time
complexity of sorting and merging in terms of circuit depth.
Ts can be calculated as follows:
Ts(n) = Ts(n/2) + Tm(n/2)
Ts(n) = Ts(n/2) + log(n/2)
Expanding the recurrence gives Ts(n) = log(n/2) + log(n/4) + ... + 1, a sum of
O(log n) terms, each at most log n. Therefore, Ts(n) is O(log² n).
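The recurrence can be checked with a small numerical model of the circuit depths. It assumes an (m, m) merging circuit has depth log2(m) + 1 (one stage more than the simplified recurrence above; the asymptotic result is unchanged):

```python
def merge_depth(m):
    # Depth (comparator stages) of an (m, m) odd-even merging circuit:
    # one final comparison stage on top of merging the halves, i.e. log2(m) + 1.
    if m == 1:
        return 1
    return merge_depth(m // 2) + 1

def sort_depth(n):
    # Ts(n) = Ts(n/2) + Tm(n/2), with Ts(1) = 0.
    if n == 1:
        return 0
    return sort_depth(n // 2) + merge_depth(n // 2)

# sort_depth(n) works out to log2(n) * (log2(n) + 1) / 2, i.e. O(log^2 n):
print(sort_depth(8))    # depth of the 8-input circuit in the example
print(sort_depth(16))
```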
4. Image Processing
An image is a two-dimensional distribution of small image points called
pixels. It can be considered as a function of two real variables, for example f(x, y),
with f as the amplitude (e.g. brightness) of the image at position (x, y). Image
processing is the process of enhancing an image and extracting meaningful
information from it.
Image processing with parallel computing is an alternative way to solve image
processing problems that require long processing times or handle large
amounts of information in "acceptable time" (according to each criterion).
The main idea of parallel image processing is to divide the problem into simple
tasks and solve them concurrently, such that the total time can be divided
between the tasks (in the best case). Parallel image processing cannot be
applied to all problems; in other words, not all problems can be
coded in a parallel form. A parallel program must have certain features for correct
and efficient operation; otherwise, its runtime or behavior may not reach
the expected performance. These features include the following:
• Granularity: defined as the size of the basic units of work; it is classified as:
  • Coarse-grained: few tasks of more intense computing.
  • Fine-grained: a large number of small parts with less intense computing.
• Synchronization: this prevents the overlap of two or more processes.
• Latency: this is the transit time of information from request to receipt.
• Scalability: defined as the ability of an algorithm to maintain its efficiency when
the number of processors and the size of the problem increase in the same
proportion.
5. Proposed Algorithms for Parallel Image Processing Tasks
We will go through the following sequential image processing algorithms and
develop their parallel versions.
Load Distribution between Processors:
Suppose we take the following image (1) as an example. The main step of
these algorithms is to determine the number of tiles to be generated. The number of
tiles corresponds to the number of threads. If only one thread exists, the computation
is simply sequential. Otherwise, if there are two or more threads, the
image is divided into distinct areas, as shown in Figures 2, 3 and 4. Each thread is
responsible for processing the pixels included in its tile and for executing different
tasks, while maintaining synchronization between all the processors;
otherwise a deadlock situation may arise between the processors.
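The tiling scheme above can be sketched with threads over horizontal strips of a toy image (nested lists of gray values; the per-pixel function `fn` is a hypothetical example, here brightness inversion). Because the tiles are disjoint, no extra locking is needed beyond joining the threads:

```python
from concurrent.futures import ThreadPoolExecutor

def process_tile(image, row0, row1, fn):
    # Each thread writes only the rows [row0, row1) of its own tile,
    # so no pixel is ever written by two threads.
    for r in range(row0, row1):
        image[r] = [fn(p) for p in image[r]]

def parallel_apply(image, fn, n_tiles):
    """Split the image into n_tiles horizontal strips and process them
    concurrently; with a single tile the computation is plain sequential."""
    h = len(image)
    bounds = [(i * h // n_tiles, (i + 1) * h // n_tiles) for i in range(n_tiles)]
    if n_tiles == 1:
        process_tile(image, 0, h, fn)
        return image
    with ThreadPoolExecutor(max_workers=n_tiles) as pool:
        futures = [pool.submit(process_tile, image, r0, r1, fn)
                   for r0, r1 in bounds]
        for f in futures:
            f.result()   # re-raise any worker exception; exiting the pool joins
    return image

img = [[0, 1], [2, 3], [4, 5], [6, 7]]
print(parallel_apply(img, lambda p: 255 - p, 2))   # invert brightness, 2 tiles
```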
Parallel Segmentation by Region Growing Technique and
Calculation of different Features of Segmented Regions:
Region growing is one of the best-known techniques of segmentation. Its main
drawback is that it is time consuming. This parallel algorithm is
designed to reduce its running time. These are the steps involved:
1. Take the input image on the client processor.
2. Select the seed pixels (based on some user criterion like pixels in a certain
gray level range, pixels evenly spaced on a grid, etc.) for the different regions on
the client processor.
3. Count the number of seed points and activate the same number of kernels.
4. Send a copy of the image and the respective seed pixels to each kernel processor.
5. Regions are then grown from these seed pixels to adjacent pixels depending on
pixel intensity, gray level texture, or color, etc., on each kernel processor; for
this, the image information is also important because region growing is based
on the membership criteria.
6. Calculate different features of the different ROIs (regions of interest) on each
individual kernel.
7. Send the segmented regions and the information of the ROIs to the client
processor and deactivate the worker kernels.
8. Reconstruct the image on the client processor and display all the information.
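Steps 3-5 can be sketched with one worker per seed. The membership criterion used here (gray level within a tolerance of the seed pixel) and the thread-based workers are illustrative assumptions, a simplified stand-in for the kernel processors:

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def grow_region(image, seed, tol):
    """BFS region growing: collect 4-connected pixels whose gray level
    differs from the seed pixel by at most tol (hypothetical criterion)."""
    h, w = len(image), len(image[0])
    base = image[seed[0]][seed[1]]
    region, frontier = {seed}, deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in region \
                    and abs(image[nr][nc] - base) <= tol:
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region

def parallel_region_growing(image, seeds, tol):
    # One worker per seed, mirroring steps 3-5 (one kernel per seed pixel);
    # each worker reads a shared copy of the image and grows its own region.
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        return list(pool.map(lambda s: grow_region(image, s, tol), seeds))

img = [[10, 11, 50],
       [12, 13, 52],
       [90, 91, 51]]
print(parallel_region_growing(img, [(0, 0), (0, 2)], 5))
```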
6. Fault Tolerance
To achieve the needed reliability and availability, we need fault-tolerant
computers. They have the ability to tolerate faults by detecting failures and
isolating defective modules so that the rest of the system can operate correctly.
Reliability techniques have also become of increasing interest to general-
purpose computer systems.
Parallel computing can also be applied to the design of fault-tolerant
computer systems, particularly via lockstep systems performing the same
operation in parallel. This provides redundancy in case one component should
fail, and also allows automatic error detection and error correction if the results
differ. These methods can be used to help prevent single event upsets caused by
transient errors. Although additional measures may be required in embedded or
specialized systems, this method can provide a cost effective approach to
achieve n-modular redundancy in commercial off-the-shelf systems.
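The error detection and correction described above can be sketched as a majority vote over the outputs of redundant modules (a toy model of n-modular redundancy, not a real lockstep implementation):

```python
from collections import Counter

def vote(results):
    """Majority vote over the outputs of n redundant lockstep modules.
    A strict majority masks a minority of faulty results; if no value has
    a majority, the error is detected but cannot be corrected."""
    value, count = Counter(results).most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("no majority: error detected but not correctable")
    return value

# Triple modular redundancy: one faulty module (result 9) is outvoted.
print(vote([7, 7, 9]))   # prints 7
```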
7. Process of System Recovery after a Fault
Faults in different parts of a parallel system have different importance. Consider
a fault in a processor, a line or a switch. The most important is a fault in a
processor. In this case the processes allocated on this processor have to be
moved to another processor, recovered and initialized once more. Usually we
assume that the processor's memory content is lost, or inaccessible, after the
fault appears. It is also necessary to remove and redirect all communication
lines going through this processor.
From the moment a fault appears until the end of recovery, every process of
the parallel system gets a new attribute (fig. 1). When a processor
element PE fails, then:
• Every process allocated on the processor element PE is called a locked
process, both the main process and its copy,
• Every other process communicating with a locked process is called a
fault-influenced process,
• Every other process not communicating with a locked process is called a
free process.
The process of system recovery is known, but the question remains how and
by whom the recovery of a processor's kernel is controlled. Control can be either
centralized or decentralized. In the case of decentralized control, it must be
ensured that all kernels hold the same data, according to which they determine
the final processors. Every kernel determines the final processors for those
locked processes whose copies are allocated on its own processor. If a copy
of a process is located on more than one processor, the corresponding
processor transmits a message about system recovery to the other processors
where the other copies are located. The content of the message is the final
processor for the exact copy of the process and the time mark of the beginning
of recovery. After receiving all messages about system recovery, the kernel
compares this time mark with its own time of recovery; a kernel with a later
time mark does not perform any reallocation of the relevant processes. If the
time marks are equal, another criterion can decide, for example the
identification number of a processor. There is also the question of how many
copies of processes are enough for sufficient resistance against faults. In the
case of active and passive processes it depends on the requested security. One
passive copy of the process is sufficient if we assume that a fault does not
appear on two processors occupied by the same process at the same time, or
during recovery of the system.
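The decision rule described above (the earliest time mark wins; equal marks are broken by another criterion such as the processor identification number) can be sketched as follows, where the message layout is an illustrative assumption:

```python
def choose_recovery(messages):
    """Pick the winning recovery message among the copies of a process.
    Each message is a (time_mark, proc_id, final_processor) tuple; the
    earliest time mark wins, ties are broken by the smaller processor id."""
    return min(messages, key=lambda m: (m[0], m[1]))

# Copy on processor 9 started recovery earlier, so its choice wins:
print(choose_recovery([(5, 2, "P7"), (3, 9, "P4")]))
# Equal time marks: the smaller processor id (1) decides:
print(choose_recovery([(3, 2, "P7"), (3, 1, "P4")]))
```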
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 

Parallel Algorithms: Sort & Merge, Image Processing, Fault Tolerance

Algorithms

1. Introduction

An algorithm is defined as a sequence of computational steps required to accomplish a specific task. An algorithm works on a given input and terminates in a well-defined state. The basic properties of an algorithm are: input, output, definiteness, effectiveness and finiteness. The purpose of developing an algorithm is to solve a general, well-specified problem.

A concern while designing an algorithm also pertains to the kind of computer on which the algorithm is to be executed. The two forms of computer architecture are: sequential computers and parallel computers. Therefore, depending upon the architecture of the computer, we have sequential as well as parallel algorithms.

Algorithms executed on sequential computers simply perform a sequence of steps to solve the given problem. Such algorithms are known as sequential algorithms.

However, a problem can also be solved by dividing it into sub-problems which are executed in parallel. The results of these sub-problems are then combined to obtain the final solution. In such situations more than one processor is required, and the processors communicate with each other to produce the final output. This environment is a parallel computer, and the special kind of algorithms designed for these computers are called parallel algorithms. Parallel algorithms depend on the kind of parallel computer they are designed for. Hence, for a given problem, different kinds of parallel algorithms may need to be designed for different parallel architectures.

A parallel computer is a set of processors that are able to work cooperatively to solve a computational problem.
This definition is broad enough to include parallel supercomputers that have hundreds or thousands of processors, networks of workstations, multi-processor workstations, and embedded systems. Parallel computers can be represented with the help of various kinds of models such as the random access machine (RAM), the parallel random access machine (PRAM), interconnection networks, etc. While designing a parallel algorithm, the
computational power of the various models can be analyzed and compared, and parallelism can be exploited for a given problem on a specific model once the characteristics of that model are understood. Analyzing a parallel algorithm on different models helps in determining the best model for a problem, based on the resulting time and space complexity.

2. Analysis of Parallel Algorithms

A generic algorithm is mainly analyzed on the basis of the following parameters: the time complexity (execution time) and the space complexity (amount of space required). Usually we give much more importance to time complexity than to space complexity. The subsequent sections highlight the criteria for analyzing the complexity of parallel algorithms. The fundamental parameters required for the analysis of parallel algorithms are as follows:

• Time complexity
• The total number of processors required
• The cost involved

2.1 Time Complexity

Most people who implement algorithms want to know how much of a particular resource (such as time or storage) a given algorithm requires. Parallel architectures have been designed to improve the computational power available to algorithms; thus, the major concern in evaluating an algorithm is determining the amount of time it takes to execute. Usually, the time complexity is calculated on the basis of the total number of steps executed to reach the desired output. The resources consumed by a parallel algorithm are both processor cycles on each processor and the communication overhead between the processors. In a computation step, a processor performs a local arithmetic or logic operation; thereafter, the various processors communicate with each other to exchange messages and/or data. Hence, the time complexity can be calculated on the basis of the computation cost and the communication cost involved.
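As a toy illustration of this computation-plus-communication accounting, the sketch below estimates the parallel time for summing n numbers on p processors whose partial sums are combined along a binary tree. The parameters t_op (time per arithmetic operation) and t_msg (time per message) are assumed, purely illustrative values, not measurements from any real machine:

```python
import math

# Hypothetical cost model: total parallel time = computation + communication.
# Workload: summing n numbers on p processors, with the p partial sums
# combined along a binary tree (log2(p) communication rounds).
def parallel_time(n, p, t_op=1.0, t_msg=10.0):
    computation = (n / p) * t_op              # local additions per processor
    communication = math.log2(p) * t_msg      # one message per combining round
    return computation + communication
```

With n = 1024 and p = 8 the model gives 128 units of computation plus 30 of communication. Increasing p keeps shrinking the first term but grows the second, which is exactly the trade-off described above.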
The time complexity of an algorithm varies depending upon the instance of the input for a given problem. For example, the already sorted list (10, 17, 19, 21, 22,
and 33) will take less time to sort than the same list in reverse order (33, 22, 21, 19, 17, and 10). The time complexity of an algorithm is categorized into three forms:

i) Best case complexity
ii) Average case complexity
iii) Worst case complexity

The best case complexity is the least amount of time required by the algorithm for a given input. The average case complexity is the average running time required by the algorithm for a given input. Similarly, the worst case complexity is the maximum amount of time required by the algorithm for a given input. The main factors involved in analyzing the time complexity are:

• the algorithm
• the parallel computer model
• the specific set of inputs.

Mostly, the time complexity of an algorithm is a function of the size of the input.

2.2 Number of Processors

Another factor that assists in the analysis of parallel algorithms is the total number of processors required to deliver a solution to a given problem. Thus, for a given input of size n, the number of processors required by the parallel algorithm is a function of n, usually denoted by P(n).

2.3 Overall Cost

Finally, the total cost of the algorithm is the product of the time complexity of the parallel algorithm and the total number of processors required for computation:

Cost = Time complexity × Total number of processors

An equivalent way of defining the cost is that it is the total number of steps executed collectively by the n processors, i.e., the summation of steps. Another term related to the analysis of parallel algorithms is the efficiency of the algorithm. It is defined as the ratio of the worst-case running time of the best
sequential algorithm to the cost of the parallel algorithm:

Efficiency = (Worst-case running time of the best sequential algorithm) / (Cost of the parallel algorithm)

The efficiency is normally at most 1. An efficiency of 1 means the parallel algorithm is cost-optimal, while an efficiency less than 1 means the processors of the parallel algorithm collectively perform more work than the best sequential algorithm, i.e., some processor time is wasted.

3. Merge Sort Algorithm

First, divide the given sequence of n numbers into two parts, each consisting of n/2 numbers. Thereafter, recursively split each part until every number forms an independent sequence of length one. These one-element sequences are then recursively merged until a single sorted sequence of n numbers is achieved. To perform this task, two kinds of circuits are used: one for merging and another for sorting the lists of numbers.

Odd-Even Merging Circuit

Let us first illustrate the concept of merging two sorted sequences using an odd-even merging circuit. The merging circuit works as follows:

1) Let A = (a1, a2, a3, ..., am) and B = (b1, b2, b3, ..., bm) be the two sorted sequences to be merged.

2) With the help of an (m/2, m/2) merging circuit, merge the odd-indexed elements of the two sequences, i.e., (a1, a3, a5, ..., am-1) and (b1, b3, b5, ..., bm-1), resulting in the sorted sequence (c1, c2, c3, ..., cm).

3) Thereafter, with the help of an (m/2, m/2) merging circuit, merge the even-indexed elements of the two sequences, i.e., (a2, a4, a6, ..., am) and (b2, b4, b6, ..., bm), resulting in the sorted sequence (d1, d2, d3, ..., dm).

4) The final output sequence O = (o1, o2, o3, ..., o2m) is achieved in the following manner:
o1 = c1 and o2m = dm. In general, the formula is:

o2i = min(ci+1, di) and o2i+1 = max(ci+1, di) for i = 1, 2, 3, ..., m-1.

Now, let us take an example of merging two sorted sequences of length 4, i.e., A = (a1, a2, a3, a4) and B = (b1, b2, b3, b4). Suppose the sequences are A = (4, 6, 9, 10) and B = (2, 7, 8, 12). The circuit for merging the two given sequences is illustrated in Figure 7.

Sorting Circuit along with Odd-Even Merging Circuit

As we already know, the merge sort algorithm requires two circuits, i.e. one for merging and another for sorting the sequences. The sorting circuit is derived from the merging circuit discussed above. The basic steps followed by the circuit are highlighted below:

i) The given input sequence of length n is divided into two sub-sequences of length n/2 each.
ii) The two sub-sequences are recursively sorted.
iii) The two sorted sub-sequences are merged using an (n/2, n/2) merging circuit to finally obtain the sorted sequence of length n.
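The merging rule and the recursive sorting steps above can be sketched as a software simulation of the circuits, with each min/max pair standing in for one compare-exchange device; sequence lengths are assumed to be powers of two:

```python
def odd_even_merge(a, b):
    """Odd-even merge of two sorted sequences of equal length m (a power of 2)."""
    m = len(a)
    if m == 1:
        # Base case: a single compare-exchange device.
        return [min(a[0], b[0]), max(a[0], b[0])]
    # Steps 2 and 3: recursively merge the odd-indexed and the even-indexed
    # elements (1-based indexing in the text, 0-based slices here).
    c = odd_even_merge(a[0::2], b[0::2])   # (a1, a3, ...) with (b1, b3, ...)
    d = odd_even_merge(a[1::2], b[1::2])   # (a2, a4, ...) with (b2, b4, ...)
    # Step 4: o1 = c1, o2m = dm, and compare-exchange c(i+1) with d(i).
    out = [c[0]]
    for i in range(1, m):
        out.append(min(c[i], d[i - 1]))
        out.append(max(c[i], d[i - 1]))
    out.append(d[m - 1])
    return out

def odd_even_merge_sort(x):
    """Sorting circuit: recursively sort both halves, then odd-even merge them."""
    n = len(x)
    if n == 1:
        return list(x)
    return odd_even_merge(odd_even_merge_sort(x[:n // 2]),
                          odd_even_merge_sort(x[n // 2:]))

odd_even_merge([4, 6, 9, 10], [2, 7, 8, 12])   # -> [2, 4, 6, 7, 8, 9, 10, 12]
```

Running the merge on the example sequences A = (4, 6, 9, 10) and B = (2, 7, 8, 12) reproduces the merged output of Figure 7, and `odd_even_merge_sort` applied to the later example 4, 2, 10, 12, 8, 7, 6, 9 reproduces the output of the sorting + merging circuit.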
Now, let us take an example of sorting n numbers, say 4, 2, 10, 12, 8, 7, 6, 9. The sorting + merging circuit for the given sequence is illustrated in Figure 8.

Analysis of Merge Sort

i) The width of the sorting + merging circuit is equal to the maximum number of devices required in a stage, which is O(n/2). In the figure above, the maximum number of devices in any stage is 4, which is 8/2, i.e., n/2.

ii) The circuit contains two sorting circuits for sorting sequences of length n/2, followed by one merging circuit for merging the two sorted sub-sequences (see
the fourth stage in the figure above). Let the functions Ts and Tm denote the time complexity (depth) of sorting and merging respectively. Ts can be calculated as follows:

Ts(n) = Ts(n/2) + Tm(n/2)
Ts(n) = Ts(n/2) + log(n/2)

Therefore, Ts(n) is O(log² n).

4. Image Processing

An image is a two-dimensional distribution of small image points called pixels. It can be considered as a function of two real variables, for example f(x, y), with f as the amplitude (e.g. brightness) of the image at position (x, y). Image processing is the process of enhancing an image and extracting meaningful information from it.

Image processing with parallel computing is an alternative way to solve image processing problems that require long processing times or the handling of large amounts of information in "acceptable time" (according to each criterion). The main idea of parallel image processing is to divide the problem into simple tasks and solve them concurrently, in such a way that the total time can be divided between the total tasks (in the best case).

Parallel image processing cannot be applied to all problems; in other words, not all problems can be coded in a parallel form. A parallel program must have certain features for correct and efficient operation; otherwise its runtime or operation may not show the expected performance. These features include the following:

• Granularity: defined as the number of basic units, and classified as:
  - Coarse-grained: few tasks of more intensive computing.
  - Fine-grained: a large number of small parts with less intensive computing.
• Synchronization: prevents the overlap of two or more processes.
• Latency: the transit time of information from request to receipt.
• Scalability: the ability of an algorithm to maintain its efficiency when the number of processors and the size of the problem are increased in the same proportion.

5. Proposed Algorithms for Parallel Image Processing Tasks

We will go through the following sequential image processing algorithms and develop their parallel versions.

Load Distribution between Processors: Suppose we take image (1) as an example. The main step of these algorithms is to determine the number of tiles to be generated. The number of tiles corresponds to the number of threads. If only one thread exists, the computation is simply sequential. Otherwise, if there are two or more threads, the image is divided into distinct areas, as shown in Figures 2, 3 and 4. Each thread is responsible for processing the pixels included in its tile and for executing its tasks while maintaining synchronization with all the other processors; otherwise a deadlock between processors may occur.

Parallel Segmentation by Region Growing Technique and Calculation of Different Features of Segmented Regions: Region growing is one of the best-known segmentation techniques. Its main drawback is that it is time consuming; this algorithm is designed to reduce its running time. The steps involved are:

1. Take the input image on the client processor.
2. Select the seed pixels for the different regions on the client processor (based on some user criterion, such as pixels in a certain gray-level range or pixels evenly spaced on a grid).
3. Count the number of seed points and activate the same number of kernels.
4. Send a copy of the image and the respective seed pixels to each kernel processor.
5. Regions are then grown from these seed pixels to adjacent pixels, depending on pixel intensity, gray-level texture, or color, etc., on each kernel processor; for this
image information is also important, because region growing is based on the membership criteria.
6. Calculate the different features of each ROI (region of interest) on its individual kernel.
7. Send the segmented regions and the ROI information to the client processor and deactivate the worker kernels.
8. Reconstruct the image on the client processor and display all the information.

6. Fault Tolerance

To achieve the needed reliability and availability, we need fault-tolerant computers. They have the ability to tolerate faults by detecting failures and isolating defective modules, so that the rest of the system can operate correctly.
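A classic way to mask a detected failure is majority voting over redundant modules. The sketch below is an illustrative example (not taken from the text): it returns the majority result of three identical modules running the same computation, so a single faulty module is tolerated:

```python
def tmr_vote(a, b, c):
    """Triple modular redundancy: return the majority of three results
    computed by identical modules, masking a single faulty module."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    # Two or more modules disagree with everyone: the fault cannot be masked.
    raise RuntimeError("no majority: more than one module failed")
```

For example, if one of three modules returns a corrupted value, the two agreeing results outvote it and the system continues operating correctly.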
Reliability techniques have also become of increasing interest for general-purpose computer systems. Parallel computing can also be applied to the design of fault-tolerant computer systems, particularly via lockstep systems performing the same operation in parallel. This provides redundancy in case one component fails, and also allows automatic error detection and error correction if the results differ. These methods can be used to help prevent single-event upsets caused by transient errors. Although additional measures may be required in embedded or specialized systems, this method can provide a cost-effective approach to achieve n-modular redundancy in commercial off-the-shelf systems.

7. Process of System Recovery from a Fault

Faults in different parts of a parallel system have different importance. Consider a fault in a processor, a line or a switch. The most important is a fault in a processor. In this case the processes allocated on this processor have to be moved to another processor, recovered and initialized once more. Usually we can assume that the processor's memory content is lost, or inaccessible, after the fault appears. It is necessary to remove and redirect all communication lines going through this processor. Every process of the parallel system, from the moment the fault appears until the end of the recovery, gets a new attribute (Fig. 1). When a processor element PE fails, then:

• every process allocated on the processor element PE is called a locked process (both main and copy),
• every process other than a locked process that communicates with a locked process is called a fault-influenced process,
• every process other than a locked process that does not communicate with any locked process is called a free process.
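The classification above can be sketched directly. The data structures here (a process-to-processor map and a process-to-neighbours map) are assumed purely for illustration:

```python
def classify_processes(allocation, comm, failed_pe):
    """Classify every process after processor element `failed_pe` fails.

    allocation: dict mapping each process to the processor element it runs on
    comm:       dict mapping each process to the set of processes it talks to
    """
    # Locked: allocated on the failed processor element.
    locked = {p for p, pe in allocation.items() if pe == failed_pe}
    # Fault-influenced: not locked, but communicating with a locked process.
    fault_influenced = {p for p in allocation
                        if p not in locked and comm.get(p, set()) & locked}
    # Free: everything else.
    free = set(allocation) - locked - fault_influenced
    return locked, fault_influenced, free
```

For instance, with processes p1 and p2 on the failed element, p3 communicating with p1, and p4 isolated, p1 and p2 come out locked, p3 fault-influenced and p4 free.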
The process of system recovery is known, but there remains the question of how, and by whom, the recovery of a processor's kernel is controlled. Control can be either centralized or decentralized. In the case of decentralized control, it is necessary to build on the fact that all kernels hold the same data, according to which they determine the final processors. Every kernel determines the final processors for those locked processes whose copies are allocated on its processor. If a copy of a process is located on more than one processor, then the corresponding processor transmits a message about system recovery to the other processors where the other copies are located. The content of the message is the final processor for the exact copy of the process and the time mark of the beginning of recovery. After receiving all messages about system recovery, the kernel of the system compares this time mark with its own time of recovery; the kernel with the later time mark does not perform any reallocation of the relevant processes. In case of equal time marks, another criterion can be decisive, for example the identification number of a processor.

There is also the question of how many copies of processes are enough for sufficient resistance against faults. In the case of active and passive processes it depends on the requested security. One passive copy of a process is sufficient if we assume that a fault does not appear on two processors occupied by the same process at the same time, or during recovery of the system.
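The decentralized tie-break described above can be written compactly. Each candidate kernel is represented as a (time mark, processor id) pair; the earliest time mark wins, and the assumption here (made for illustration, since the text leaves the criterion open) is that equal time marks are decided in favour of the smaller processor identification number:

```python
def choose_recovery_controller(candidates):
    """Pick the controlling kernel from (time_mark, processor_id) pairs:
    the earliest recovery time mark wins; equal time marks are decided
    by the smaller processor identification number.  Tuple comparison in
    Python is lexicographic, so min() implements exactly this rule."""
    return min(candidates)
```

For example, among candidates (12, 3), (12, 1) and (15, 0), the two kernels with time mark 12 tie and processor 1 wins.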