Courses 1 & 2 for the Algorithm Design and Complexity course at the Faculty of Engineering in Foreign Languages - Politehnica University of Bucharest, Romania
2. Introduction
Algorithms and problems
We need to be able to provide solutions to problems
Any domain has problems that require an algorithmic solution
Find the best solution from a wide range of choices
Learn methods to develop solutions
Problem => Idea => Solution => Algorithm =>
Pseudocode => Code => Compiled Program
We need to design algorithms
3. Introduction (2)
Any non-trivial problem accepts a wide range of solutions
Need to compare these solutions in order to find the best
one => Complexity
Need to show that the devised solution solves the
problem => Correctness
Not all problems have an algorithmic solution!
4. Introduction (3)
Some problems are similar (or slight variations of each other)
Accept similar solutions
Need to learn to recognize when two problems are similar
Some methods used for designing algorithms provide
solutions for different problems
Need to understand these methods in order to apply
them to as many problems as possible
The problems that can be solved using one method have some
common properties!
5. Course Info
Lectures: Traian Rebedea
Ph.D. @ University Politehnica of Bucharest, CS Dept.
traian.rebedea@cs.pub.ro / trebedea@gmail.com
Lecturer @ Computer Science Department
Interests: NLP, IR/IE, ML, TEL/CSCL, AI in general
Published over 25 papers at important conferences
Published 4 book chapters (2 in important international books)
Worked at 3 companies, founded 1 company
http://www.informatik.uni-trier.de/~ley/db/indices/atree/r/Rebedea:Traian.html
http://ro.linkedin.com/in/trebedea
Course Website: http://adcfils.wordpress.com/
7. Grading
Exam: 40p
Lab & Assignments: 60p
Assignments: 45p (3 x 15p)
Lab activity: 15p
The lab assistant decides how to grade the lab activity
Rules:
Minimum 30p for lab & assignments
Minimum 15p for the exam
Minimum 50p for total
You are not allowed to copy solutions from colleagues or
WWW (Measure of Software Similarity - MOSS)
8. Textbooks & More Info
Cormen T.H, Leiserson C.E, Rivest R.L and Stein C, Introduction to
Algorithms, Second Edition. MIT Press, 2001
Baase S and A. Van Gelder. Computer Algorithms. Introduction to
Design & Analysis, Addison-Wesley, 2000
References for each chapter
Introduction to Algorithms @ MIT
Coursera: Algorithms: Design and Analysis (part 1 and part 2)
Also has video lectures
https://www.coursera.org/course/algo,
https://www.coursera.org/course/algo2
Websites for programming exercises: TopCoder, Infoarena, Talent
Buddy, HackerRank, Project Euler
9. Problems and Algorithms
Problems 1 – n Algorithms
1+ algorithms to solve each problem
An algorithm usually solves only 1 problem
Problem: Sorting
Given an array A[1..n] with n numbers, arrange the elements in
the array such that any two consecutive elements are sorted
(A[i] <= A[i + 1] for i = 1..n-1)
Algorithms:
Quick Sort, Merge Sort, Heap Sort, Bubble Sort, Insertion Sort,
Selection Sort, Radix Sort, Bucket Sort, …
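The sorting condition can be checked mechanically. A minimal Python sketch, assuming 0-based lists instead of the 1-based A[1..n] used on the slides; the name is_sorted is chosen here for illustration only:

```python
def is_sorted(a):
    """Return True if a[i] <= a[i + 1] holds for every pair of consecutive elements."""
    return all(a[i] <= a[i + 1] for i in range(len(a) - 1))


print(is_sorted([1, 2, 2, 5]))  # True
print(is_sorted([3, 1, 2]))     # False
```

Any of the algorithms listed above must produce an output for which this check succeeds.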
10. Problems & Computability
There are a lot of problems
We would like to find solutions for all of them
Not all the problems can be solved!
The problems that can be solved are called computable
or decidable problems
The problems that cannot be solved are:
Very difficult
Not clear enough (need for subjective reasoning)
11. Problems & Computability
We would all like to know:
Which stocks will rise tomorrow?
Which football team will win a game?
Who shall I marry?
Is there a God?
Are there any aliens in the universe?
Who is the most beautiful girl in the world?
Need for subjective thinking: what does “the most beautiful” mean?
12. Example
This example should be understood as a metaphor
The physics and astronomy parts of it may be wrong
Problem: Is there any alien life in the universe?
Assumption: the universe is infinite and there are an infinite
number of celestial bodies
Solution:
Explore all the planets, suns and other celestial bodies
Use any exploring method, it can be as good as you want
Explore the celestial body with a perfect scanner
If you find life on it => ANSWER:YES! Alien life exists
Else: continue to the next celestial body
13. Example (2)
The previous solution has a flaw
It never answers NO!
If the answer to our problem were NO, then we
would have to wait an infinite amount of time
We cannot stop the solution at any moment in time and
conclude for certain that the answer is NO, because
maybe we still have a celestial body with alien life on it
that has not been explored yet!
This kind of problem is called undecidable!
14. Undecidable Problems
Problems that cannot be solved with an algorithmic solution
We can devise an algorithm, but that algorithm shall never finish in
some situations… therefore we cannot know the answer to our
problem!
Quick info:
A decision problem = a problem for which the answer is {yes, no}
Is n a prime number ?
An optimum problem = a problem for which we need to find the
optimum solution out of a set of possible solutions
Which is the shortest path between two vertices in a graph ?
Optimum = minimum or maximum
15. Undecidable Problems (2)
Any decision problem can be:
Decidable – can always solve the problem with an algorithm
that always finishes!
Semi-decidable – can devise a pseudo-algorithm for solving the
problem that only finishes if the answer is YES, but it never
finishes otherwise!
Therefore, while the algorithm is still running we cannot know whether
the answer is YES or NO, but if the algorithm stops then the answer is for sure YES
Undecidable – cannot know if the answer is YES…
The previous example is a semi-decidable problem
However, most of the time all the problems that are not
decidable (semi- or un-decidable) are called undecidable
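The semi-decidable case can be sketched in Python as an unbounded search over the natural numbers. The names semi_decide, has_property and max_steps are hypothetical and do not come from the slides; max_steps exists only so that the demonstration terminates, since a true semi-decision procedure has no such bound:

```python
from itertools import count


def semi_decide(has_property, max_steps=None):
    """Search 0, 1, 2, ... for a witness n with has_property(n).

    Returns True as soon as a witness is found (answer YES).
    With max_steps=None the loop never ends when no witness exists,
    so it can never answer NO. With a bound, it returns None ("unknown").
    """
    for n in count():
        if max_steps is not None and n >= max_steps:
            return None  # gave up: this is NOT a proof that the answer is NO
        if has_property(n):
            return True


# A witness exists (n = 12), so the search stops with YES.
print(semi_decide(lambda n: n * n == 144))                # True
# No natural number satisfies n * n == 2; only the bound stops the search.
print(semi_decide(lambda n: n * n == 2, max_steps=1000))  # None
```

The None result corresponds to stopping the exploration early: it says nothing about the unexplored candidates.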
16. Undecidable Problems (3)
Why is the example a metaphor?
Because any problem that is not decidable must have an
infinite problem space!
All the problems that have a finite number of states that
form the space of exploration for that problem are
decidable
E.g. there are a finite number of ways to arrange the numbers in an
array
E.g. there are a finite number of ways to arrange a set of
queens on a chess board
17. Undecidable Problems (4)
Very difficult problems have an infinite space that should
be explored in order to find the solution!
Quick info:
There is an uncountably infinite number of problems in this
world
There is only a countably infinite number of programs in
this world
Church's thesis (Alonzo Church; also known as the Church-Turing thesis)
states that all the problems that can be
computed (are decidable) are the ones that can have a
program associated to them, that is used for solving them
Only a countably infinite set of problems are decidable
The rest are not decidable
18. Classic Problems that are NOT Decidable
Halting Problem:
Given a program P and an input x, does P(x) halt (finish)?
YES if P(x) finishes in a finite amount of time
NO if P(x) never finishes (maybe because it loops forever)
Barber Problem/Paradox
Post’s Correspondence Problem
19. Barber Paradox
http://en.wikipedia.org/wiki/Barber_paradox
Suppose there is a town with just one male barber; and
that every man in the town keeps himself clean-shaven:
some by shaving themselves, some by attending the
barber. It seems reasonable to imagine that the barber
obeys the following rule: He shaves all and only those men
in town who do not shave themselves.
Does the barber shave himself?
20. Barber Paradox (2)
The situation presented is in fact impossible:
1. If the barber does not shave himself, he must abide by
the rule and shave himself.
2. If he does shave himself, according to the rule he will
not shave himself
22. Complexity of Algorithms
Need to find the best algorithm for solving a problem
Is algorithm A better than algorithm B ?
A measure for the performance of an algorithm
Simple practical solution:
Implement the algorithms
Measure their running times on a given machine
But we want to measure the performance of an algorithm:
Independent of the machine and language it is implemented in
Without wasting time for implementing it
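The simple practical solution can be sketched in Python as follows. The helper measure_seconds and the choice of bubble sort against the built-in sorted are assumptions made for this illustration; the printed times depend on the machine and the interpreter, which is exactly the drawback noted above:

```python
import random
import time


def measure_seconds(algorithm, data):
    """Run algorithm on a copy of data and return the elapsed wall-clock time."""
    work = list(data)              # copy, so every algorithm sees the same input
    start = time.perf_counter()
    algorithm(work)
    return time.perf_counter() - start


def bubble_sort(a):
    """Plain bubble sort, used here only as a deliberately slow candidate."""
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


data = [random.random() for _ in range(1000)]
print("bubble sort:    ", measure_seconds(bubble_sort, data))
print("built-in sorted:", measure_seconds(sorted, data))
```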
23. Complexity of Algorithms (2)
We need a theoretical framework for measuring the
performance of an algorithm
Performance
Time = How quickly does the algorithm compute the results ?
Space = How much memory does it need ?
Focus on time performance
Moore’s Law: processing power evolves less quickly than
storage capacity
Space constraints are rarely an issue: related to RAM size
24. Running Time
Measure of the time complexity of an algorithm
It is a theoretical measure that is dependent on the input
data and the processing performed by an algorithm
We define the running time as a function that only
depends on the size of the input data
The size of the input data is measured by positive integers
For arrays: A[n]
For graphs: G(V, E), |V| = n, |E| = m
For multiplying two matrices: lines1, columns1, columns2
25. Running Time (2)
In this chapter we shall only discuss running times that
depend on a single parameter: T(n)
The discussion can be easily extended to more parameters
T(n) is the running time for an algorithm that has an input
data of size n
T(n) is a function
T(n): N → R+
26. Example – Insertion Sort
http://en.wikipedia.org/wiki/Insertion_sort
Problem: Sorting an array A[n]
Solution: Insertion sort
Every repetition of the main loop of insertion sort
removes an element from the input data, inserting it into
the correct position in the already-sorted list, until no
input elements remain.
The already-sorted list is the sub-array on the left side
Usually, the removed element is the next one
27. Insertion Sort - Pseudocode
InsertionSort( A[1..n] )
1. FOR (j = 2 .. n)
2.     x = A[j]                       // element to be inserted
3.     i = j - 1                      // position on the right side of
                                      // the sorted sub-array
4.     WHILE (i > 0 AND x < A[i])     // while not in position
5.         A[i + 1] = A[i]            // move to right
6.         i--                        // continue
7.     A[i + 1] = x
8. RETURN A
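A runnable Python sketch of the same procedure, assuming 0-based list indices (so the outer loop starts at index 1 and the loop test becomes i >= 0 instead of i > 0):

```python
def insertion_sort(a):
    """Sort the list a in place (0-based indices) and return it."""
    for j in range(1, len(a)):
        x = a[j]                      # element to be inserted
        i = j - 1                     # rightmost position of the sorted sub-array
        while i >= 0 and x < a[i]:    # while x is not in position
            a[i + 1] = a[i]           # move to the right
            i -= 1
        a[i + 1] = x
    return a


print(insertion_sort([8, 2, 4, 9, 3, 6]))  # [2, 3, 4, 6, 8, 9]
```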
28. Example
From Erik D. Demaine and Charles E. Leiserson –
Introduction to Algorithms @ MIT
29. Analysis of Complexity
What is the complexity of Insertion Sort ?
General solution for the running time:
Each simple instruction takes a constant amount of time
This is clearly a simplification as the execution of an instruction
depends on the operands
Simple instructions: assignments, logical and arithmetic operations on
numbers, print/scan of a number, return, …
Complex instructions: calls to other functions
T(n) = sum over the running time of each instruction
The running time of an instruction = Running time to execute
it once * Number of times it is executed
30. Analysis of Complexity – General
Instruction nbr   Running time - execute once   Number of times it is executed
1                 C1 (a constant)               n
2                 C2                            n - 1
3                 C3                            n - 1
4                 C4                            T1
5                 C5                            T2
6                 C6                            T2
7                 C7                            n - 1
8                 C8                            1
T(n) = C1*n + (C2+C3+C7)*(n-1) + C4*T1 + (C5+C6)*T2 + C8
T1 = Σ(j=2..n) tj
T2 = Σ(j=2..n) (tj - 1)
tj = number of times the WHILE test (instruction 4) is evaluated for a given j
Difficult to say more about the general form of the running time
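To see how T1 and T2 vary with the input, the sort can be instrumented. The following Python sketch is an illustration only (the name count_while_steps is hypothetical); it counts every evaluation of the WHILE test and every execution of the loop body:

```python
def count_while_steps(a):
    """Run insertion sort on a copy of a and return (T1, T2).

    T1 = total number of times the WHILE test (instruction 4) is evaluated.
    T2 = total number of times the loop body (instructions 5 and 6) runs.
    """
    a = list(a)
    t1 = t2 = 0
    for j in range(1, len(a)):
        x = a[j]
        i = j - 1
        while True:
            t1 += 1                        # one evaluation of the WHILE test
            if not (i >= 0 and x < a[i]):
                break
            a[i + 1] = a[i]
            i -= 1
            t2 += 1                        # one execution of the loop body
        a[i + 1] = x
    return t1, t2


n = 6
print(count_while_steps(range(n, 0, -1)))     # descending input: (20, 15)
print(count_while_steps(range(1, n + 1)))     # ascending input:  (5, 0)
print(count_while_steps([8, 2, 4, 9, 3, 6]))  # (12, 7)
```

The same algorithm on inputs of the same size gives very different counts, which is why T(n) cannot be written as a single closed formula.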
31. Analysis of Complexity – General (2)
General form of the running time cannot be expanded
because it depends on the structure of the input data
T1, T2 = ??
However, there are interesting special cases that can be
easily computed:
Worst case
Best case
Average case ?
32. Worst Case Complexity
Happens when the array is sorted descending
In this case, all the elements x = A[j] are lower than all
the previous elements
Therefore, they must be moved to the beginning of the
array
Thus:
tj = j (from j - 1 to 0)
T1 = Σ(j=2..n) j = n*(n+1) / 2 - 1
T2 = Σ(j=2..n) (j - 1) = n*(n-1) / 2
Tworst(n) = a*n^2 + b*n + c
quadratic time
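Substituting these sums into the general form of T(n), and also counting the single execution of instruction 8 (cost C8) from the table, gives the coefficients explicitly. A worked sketch:

```latex
\begin{aligned}
T_{worst}(n) &= C_1 n + (C_2 + C_3 + C_7)(n-1)
              + C_4\left(\frac{n(n+1)}{2} - 1\right)
              + (C_5 + C_6)\frac{n(n-1)}{2} + C_8 \\
             &= \underbrace{\frac{C_4 + C_5 + C_6}{2}}_{a}\, n^2
              + \underbrace{\left(C_1 + C_2 + C_3 + C_7 + \frac{C_4 - C_5 - C_6}{2}\right)}_{b}\, n
              + \underbrace{\left(C_8 - C_2 - C_3 - C_4 - C_7\right)}_{c}
\end{aligned}
```

Since a > 0 whenever the constants are positive, the n^2 term dominates for large n.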
33. Best Case Complexity
Happens when the array is sorted ascending
In this case, all the elements x = A[j] are higher than all
the previous elements
Therefore, they are not moved
Thus:
tj = 1
T1 = Σ(j=2..n) 1 = n - 1
T2 = Σ(j=2..n) (1 - 1) = 0
Tbest(n) = b1*n + c1
linear time
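The same substitution for the best case, again counting the single execution of instruction 8 (cost C8) from the table, can be sketched as:

```latex
\begin{aligned}
T_{best}(n) &= C_1 n + (C_2 + C_3 + C_7)(n-1) + C_4 (n-1) + (C_5 + C_6)\cdot 0 + C_8 \\
            &= \underbrace{(C_1 + C_2 + C_3 + C_4 + C_7)}_{b_1}\, n
             + \underbrace{\left(C_8 - C_2 - C_3 - C_4 - C_7\right)}_{c_1}
\end{aligned}
```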
34. Average Case Complexity
It is interesting to compute it precisely
It is very difficult to compute it precisely!
Should take into consideration the distribution of the input
data and sum up over all possible instances of the input data
averaged by their distribution
See example of formula on the blackboard
Not feasible
Simpler solution: on average, an element x = A[j] is inserted in
the middle of the already-sorted list
Recompute T1 and T2 for this case => still a quadratic solution
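For small n the average can be computed exactly by brute force. The Python sketch below assumes that all n! orderings of n distinct values are equally likely (an assumption about the input distribution that the slides do not fix); the function names are hypothetical:

```python
from itertools import permutations


def count_shifts(a):
    """Return T2: how many times insertion sort moves an element to the right."""
    a = list(a)
    shifts = 0
    for j in range(1, len(a)):
        x = a[j]
        i = j - 1
        while i >= 0 and x < a[i]:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = x
    return shifts


def average_shifts(n):
    """Average T2 over all n! orderings of n distinct values."""
    perms = list(permutations(range(n)))
    return sum(count_shifts(p) for p in perms) / len(perms)


for n in range(2, 8):
    # second column: measured average, third column: n*(n-1)/4
    print(n, average_shifts(n), n * (n - 1) / 4)
```

Under this assumption the average T2 is n*(n-1)/4, half of the worst-case value n*(n-1)/2, so the average running time is still quadratic.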
35. Conclusions
General formula for the complexity of an algorithm is
usually incomplete due to the influence of the structure
of the input data
Average case is interesting, but difficult to compute
Solution: only compute worst case!
Makes sense for a lot of applications: ABS braking
algorithm should be good on worst case, the same for
computing reports for your boss, …
On most occasions, average case has the same order of
growth as the worst case complexity
37. Asymptotic Notations
Current simplifications for computing complexity of
algorithms:
Constant amount of time for simple instructions
Interested in worst case most of the time
Only interested in the asymptotic behavior of T(n)
Asymptotic: T(n) | n → ∞
These notations are not used only for running times, but
for any function of the form f(n) : N → R+
|sin(n)|, 1/n, 2sin(n)+1 are functions that cannot be running times
41. Remarks
It is important to compute the order of growth for
algorithms
The asymptotic notations define sets of functions
See picture on blackboard
Sometimes, the Big-O notation is used as a substitute for
the Θ notation
Θ notation – equivalence relation for functions of the
form f(n) : N → R+
Big-O, Big-Ω notations – partial order relations
42. Equivalence Relation
Three important properties:
1. Reflexivity
2. Symmetry
3. Transitivity
It partitions the functions into equivalence classes:
Each class has a representative function
Obtained by removing all lower degree terms and removing
any constants from the highest degree term
Θ(1), Θ(log log n), Θ(log n), Θ(n log n), Θ(n^2), Θ(n^3), … ,
Θ(2^n), Θ(n!), Θ(n^n)
43. Partial Order Relations
Three important properties:
1. Reflexivity
2. Anti-symmetry
3. Transitivity
f(n) ∈ O(g(n)) … f(n) “<=” g(n)
f(n) ∈ Ω(g(n)) … f(n) “>=” g(n)
f(n) ∈ Θ(g(n)) … f(n) “~=” g(n)
Partial order because some functions cannot be compared:
e.g. n and n^(sin(n)+1)
45. Asymptotic Notations Used in Equations
For any function on the left side that is part of the set
defined by that asymptotic notation
there is a function on the right side that is part of the set
defined by that asymptotic notation
46. Exercises – Set 1
What is the complexity of the following algorithms ?
Matrix_add_1 (A[n][n],B[n][n]) {
for (i = 1,n) {
for (j = 1,n) {
C[i][j] = 0
}
}
for (i = 1,n) {
for (j = 1,n) {
C[i][j] = A[i][j] + B[i][j]
}
}
return C
}
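One way to approach the exercise is to count how many times the innermost statements run. A Python sketch under that idea, assuming 0-based lists of lists; the assignments counter is added here for illustration and is not part of the original algorithm:

```python
def matrix_add_1(a, b):
    """Add two n x n matrices (lists of lists) and count assignments to C."""
    n = len(a)
    c = [[None] * n for _ in range(n)]
    assignments = 0
    for i in range(n):            # first pair of loops: zero-initialise C
        for j in range(n):
            c[i][j] = 0
            assignments += 1
    for i in range(n):            # second pair of loops: the actual addition
        for j in range(n):
            c[i][j] = a[i][j] + b[i][j]
            assignments += 1
    return c, assignments


c, steps = matrix_add_1([[1, 2], [3, 4]], [[10, 20], [30, 40]])
print(c)      # [[11, 22], [33, 44]]
print(steps)  # 8, i.e. 2 * n * n for n = 2
```

The count 2*n*n suggests that the running time grows quadratically in n.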
48. Maximum Subsequence Sum Problem
Given (possibly negative) integers a1, a2, ..., an, find the
maximum value of Σ(k=i..j) ak. The maximum subsequence
sum is defined to be 0 if all the integers are negative
49. Let A[1] … A[N] be an array of integers that contains a
sequence of length N.
Let sum and maxSum be integers initialized to 0.
For integer i = 1 to N do
    Let sum = 0
    For integer j = i to N do
        Let sum = sum + A[ j ]
        If( sum > maxSum ) then
            Let maxSum = sum
Return maxSum
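The pseudocode above translates directly into Python. A minimal sketch, assuming a 0-based list; the function name is chosen here for illustration:

```python
def max_subsequence_sum_quadratic(a):
    """Maximum sum of a contiguous run of a; 0 if every element is negative."""
    max_sum = 0
    for i in range(len(a)):
        current = 0
        for j in range(i, len(a)):
            current += a[j]          # sum of a[i..j], extended by one element
            if current > max_sum:
                max_sum = current
    return max_sum


print(max_subsequence_sum_quadratic([-2, 11, -4, 13, -5, -2]))  # 20
print(max_subsequence_sum_quadratic([-3, -1, -7]))              # 0
```

The two nested loops run about n*(n+1)/2 times in total, so this version takes quadratic time.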
50.
There also exist solutions in:
Θ(n)
Θ(n log n)
Θ(n^3)
http://www.wou.edu/~broegb/Cs345/MaxSubsequenceSum.pdf
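A linear-time solution can be sketched as a single pass that keeps the best sum of a run ending at the current element. This is the technique commonly known as Kadane's algorithm; the slides do not name it, so treat the choice and the function name as assumptions of this sketch:

```python
def max_subsequence_sum_linear(a):
    """Single pass: keep the best sum of a run ending at the current element."""
    max_sum = 0
    current = 0
    for value in a:
        current = max(0, current + value)   # a negative running sum is dropped
        max_sum = max(max_sum, current)
    return max_sum


print(max_subsequence_sum_linear([-2, 11, -4, 13, -5, -2]))  # 20
print(max_subsequence_sum_linear([-3, -1, -7]))              # 0
```

Each element is visited once and handled in constant time, which gives the linear bound.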
52. Keep in Mind
Are there things more important than the performance
of an algorithm ?
Maybe:
Correctness
Modularity
Maintainability
Robustness
User-friendliness
Programmer time
Extensibility
Reliability