2. What is Dynamic Programming?
Solves problems by combining solutions to
subproblems
Similar to divide-and-conquer
Applicable when subproblems are not independent
Subproblems share sub-subproblems
Common sub-subproblems are computed once, then
answers are reused as needed
Typically used to solve optimization problems
3. What Are Optimization Problems?
Problems for which there may be many
solutions
Each solution has a value
We want the solution that has an “optimal” value
Often the minimum or maximum value
There may be more than one “optimal” solution
4. Development of DP Algorithms
Four steps:
Characterize the structure of an optimal solution
Recursively define the value of an optimal solution
Compute the value of an optimal solution in a bottom-up
fashion
Construct an optimal solution from computed information
5. Matrix-Chain Multiplication
Given a sequence of matrices A1..An, we wish to compute their
product
This can be computed by repeated multiplication of matrix pairs
using the standard algorithm (next slide)
Ambiguities must be resolved in advance by parenthesizing the
matrices
Matrix multiplication is associative, so all parenthesizations yield
the same product
While order of parenthesizing isn’t mathematically important, it
can make a dramatic difference to the cost of evaluation
6. Matrix Multiplication Algorithm
Matrix MatrixMultiply(const Matrix &A, const Matrix &B)
{
    if ( A.columns() != B.rows() )
        throw XIncompatibleDimensions;
    Matrix C(A.rows(), B.columns());
    for ( int i = 0 ; i < A.rows() ; ++i )
    {
        for ( int j = 0 ; j < B.columns() ; ++j )
        {
            C(i,j) = 0;
            for ( int k = 0 ; k < A.columns() ; ++k )
                C(i,j) += A(i,k)*B(k,j);
        }
    }
    return C;
}
7. Matrix Multiplication Algorithm
To multiply two matrices A (p x q) and B (q x
r), the number of columns in A must equal the
number of rows in B.
The resulting matrix is p x r
Running time of algorithm is O(pqr)
8. Why Parenthesization Order Matters
Consider three matrices: 10x100, 100x5, and
5x50
If we multiply (A1A2)A3, we perform 10x100x5 +
10x5x50 = 7,500 multiplications
If we multiply A1(A2A3), we perform 100x5x50 +
10x100x50 = 75,000 multiplications
9. The Matrix-Chain Multiplication Problem
Given a chain <A1, A2, …, An> of n
matrices, where for i = 1, 2, …, n, the matrix
Ai has dimension p[i-1] x p[i], fully parenthesize
the product A1A2…An in a way that minimizes
the number of scalar multiplications
10. Structure of an Optimal Parenthesization
To compute the product of all matrices (denoted as
A1..n), we must first:
Compute the products A1..k and Ak+1..n for some split point k (1 <= k < n)
Multiply them together
The cost of computing A1..n is then just the cost of
computing A1..k, plus the cost of computing
Ak+1..n, plus the cost of multiplying them together
11. Structure of an Optimal Parenthesization
The parenthesization of the “prefix” subchain A1,…,Ak and “suffix”
subchain Ak+1,…,An within the optimal parenthesization of
A1,…,An must be optimal
If it weren’t, then there would be some other parenthesization
order with lower cost, which would result in non-optimal
parenthesization of A1,…,An
Optimal solution to the whole problem therefore contains optimal
subproblem solutions
This is a hallmark of dynamic programming
12. A Recursive Solution
Our subproblem is the problem of
determining the minimum cost of a
parenthesization of Ai,…,Aj, where 1 <= i <= j
<= n
Let m[i,j] be the minimum number of
multiplications needed to compute Ai..j
Lowest cost to compute A1..n is m[1,n]
13. A Recursive Solution
How do we recursively define m[i,j]?
If i = j, the chain consists of the single matrix Ai, so
no multiplications are necessary
Thus, m[i,i] = 0 for all i
If i < j, we compute as follows:
Assume that there is some optimal parenthesization that splits
the range between k and k+1
m[i,j] = m[i,k] + m[k+1,j] + p[i-1]*p[k]*p[j], per our previous discussion
To do this, we must find k
There are j-i possibilities, which can be checked directly
14. A Recursive Solution
Our recursive solution is now:
m[i,j] gives the costs of optimal solutions to
subproblems
Define a second table s[i,j] to help keep track of
how to construct an optimal solution
Each entry contains the value k at which to split the
product
m[i,j] = 0                                                          if i = j
m[i,j] = min over i <= k < j of { m[i,k] + m[k+1,j] + p[i-1]*p[k]*p[j] }  if i < j
15. Computing the Optimal Costs
An algorithm to compute the minimum cost
for a matrix-chain multiplication should now
be simple
It will take exponential time, which is no better
than our brute force approach
So how can we improve the running time?
16. Computing the Optimal Costs
There are only O(n²) total subproblems
One subproblem for each choice of i and j
satisfying 1 <= i <= j <= n
A recursive algorithm may encounter each
subproblem many times
Recomputing known values is costly
This is where dynamic programming techniques
are superior
17. Computing the Optimal Costs
Instead of using recursion, we will use the
dynamic programming bottom-up approach to
compute the cost
The following algorithm assumes that the
matrix Ai has dimension p[i-1] x p[i]
The input sequence is an array <p0, p1, …, pn>
with length n+1
The output is the matrices m and s
18. void MatrixChainOrder(const double p[], int size, Matrix &m, Matrix &s)
{
    int n = size-1;
    for ( int i = 1 ; i <= n ; ++i )
        m(i,i) = 0;
    for ( int L = 2 ; L <= n ; ++L ) {          // L is the chain length
        for ( int i = 1 ; i <= n-L+1 ; ++i ) {
            int j = i+L-1;
            m(i,j) = MAXDOUBLE;
            for ( int k = i ; k <= j-1 ; ++k ) {
                double q = m(i,k)+m(k+1,j)+p[i-1]*p[k]*p[j];
                if ( q < m(i,j) ) {
                    m(i,j) = q;
                    s(i,j) = k;
                }
            } // for ( k )
        } // for ( i )
    } // for ( L )
}
20. Computing the Optimal Costs
The matrix m contains the minimum
multiplication costs, and s records the value of
k used to achieve each optimal cost
What is the running time?
Three nested loops = O(n³)
This is better than the exponential running time
the recurrence would give us
21. Constructing an Optimal Solution
So far, we only know the optimal number of scalar
multiplications, not the order in which to multiply the matrices
This information is encoded in the table s
Each entry s[i,j] records the value k such that the optimal
parenthesization of Ai…Aj splits the product between Ak and Ak+1
To compute the product A1..n, we parenthesize at s[1,n]
Previous matrix multiplications can be computed recursively
E.g., s[1,s[1,n]] contains the optimal split for the left half of the
multiplication
22. Constructing an Optimal Solution
Matrix MatrixChainMultiply(const Matrix A[], const Matrix &s, int i, int j)
{
    if ( j > i )
    {
        int k = (int)s(i,j);
        Matrix X = MatrixChainMultiply(A, s, i, k);
        Matrix Y = MatrixChainMultiply(A, s, k+1, j);
        return MatrixMultiply(X,Y);
    }
    else
        return A[i];
}
23. Elements of Dynamic Programming
There are two key ingredients that an
optimization problem must have for dynamic
programming to be applicable:
Optimal substructure
Overlapping subproblems
24. Optimal Substructure
A problem exhibits optimal substructure if an
optimal solution to the problem contains within it
optimal solutions to subproblems
E.g., matrix-chain multiplication exhibited this property
An optimal parenthesization of a matrix chain requires that
each sub-chain also be optimally parenthesized
This property is typically shown by assuming that a better
solution exists, then showing how this contradicts the
optimality of the solution to the original problem
25. Overlapping Subproblems
The “space” for subproblems must be relatively
small
i.e., a recursive algorithm for the solution would end up
solving the same subproblems over and over
This is called overlapping subproblems
Dynamic programming algorithms typically compute
overlapping subproblems once and store the
solution in a table for later (constant-time) lookup
26. Overlapping Subproblems
From the matrix-chain algorithm, we see earlier
computations being reused to perform later
computations:
What if we replaced this with a recursive
algorithm?
Figure 16.2 on page 311 shows the added
computations
double q = m(i,k)+m(k+1,j)+p[i-1]*p[k]*p[j];
if ( q < m(i,j) ) {
    m(i,j) = q;
    s(i,j) = k;
}
27. Exercise
A common recursive algorithm for computing the
Fibonacci sequence is:
Fib(x) = 1                       if x = 1
Fib(x) = 1                       if x = 2
Fib(x) = Fib(x-1) + Fib(x-2)     if x > 2
Is dynamic programming applicable? Why?
Write a dynamic programming algorithm for
solving this problem
28. Longest Common Subsequence
A subsequence of a sequence is obtained by
deleting zero or more elements; the remaining
elements keep their relative order but need not
be consecutive
Formally:
If X = <x1,…,xm> and Z = <z1,…,zk>, Z is a subsequence
of X if there exists a strictly increasing sequence of
indices <i1,…,ik> such that x(ij) = zj for 1 <= j <= k
29. Longest Common Subsequence
Given two sequences X and Y, Z is a
common subsequence of X and Y if Z is a
subsequence of both X and Y
The longest common subsequence problem
is thus the problem of finding the longest
common subsequence of two given
sequences
30. Brute-Force Approach
Enumerate all subsequences in X to see if
they are also subsequences of Y, and keep
track of the longest one found
If X is of size m, that’s 2^m possibilities!
31. Characterizing an LCS
Does the LCS problem exhibit an optimal
substructure property?
Yes, corresponding to pairs of “prefixes”
A prefix of a sequence is simply the
beginning portion of a sequence for some
specified length
e.g., if X = <A,B,C,B,D,A,B>, the fourth prefix of X
(X4) is <A,B,C,B>
32. Characterizing an LCS
From Theorem 16.1: Let
X=<x1,…,xm>, Y=<y1,…,yn>, and Z=<z1,…,zk> be
any LCS of X and Y
If xm=yn, then zk=xm=yn and Zk-1 is an LCS of Xm-1 and Yn-1
If xm!=yn, then zk!=xm implies that Z is an LCS of Xm-1 and Y
If xm!=yn, then zk!=yn implies that Z is an LCS of X and Yn-1
33. Characterizing an LCS
What does this mean?
An LCS of two sequences contains within it an
LCS of prefixes of the two sequences
This is just the optimal substructure property
34. A Recursive Solution To Subproblems
Theorem 16.1 gives us two conditions to check for:
If xm = yn, we need to find an LCS of Xm-1 & Yn-1, to which we
append xm = yn
If xm != yn, then we must solve two subproblems: finding an LCS
of Xm & Yn-1, and an LCS of Xm-1 & Yn
Whichever is longer is the LCS of X & Y
Overlapping subproblems are evident
To find an LCS of X & Y, we must find an LCS of Xm-1 and/or
Yn-1, which have still smaller overlapping subproblems
35. A Recursive Solution To Subproblems
What is the cost of an optimal solution?
Let c[i,j] be the length of an LCS of Xi & Yj
If i or j = 0, one prefix is empty, so the LCS has length 0
Otherwise, the cost follows directly from Theorem 16.1:
c[i,j] = 0                            if i = 0 or j = 0
c[i,j] = c[i-1,j-1] + 1               if i, j > 0 and xi = yj
c[i,j] = max(c[i,j-1], c[i-1,j])      if i, j > 0 and xi != yj
36. Computing the Length of an LCS
A recursive algorithm for computing the length of an
LCS of two sequences can be written directly from
the recurrence formula for the cost of an optimal
solution
This recursive algorithm will lead to an exponential-time
solution
Dynamic programming techniques can be used to compute
the solution bottom-up and reduce the expected running
time
37. Computing the Length of an LCS
The following algorithm fills in the cost table c
based on the input sequences X and Y
It also maintains a table b that helps construct an
optimal solution
Entry b[i,j] points to the table entry corresponding to the
optimal subproblem solution chosen when computing
c[i,j]
38. Computing the Length of an LCS
void LCSLength(const sequence &X, const sequence &Y, matrix &b, matrix &c)
{
    int m = X.length, n = Y.length;
    // Initialize tables
    for ( int i = 0 ; i <= m ; ++i )
        c(i,0) = 0;
    for ( int j = 0 ; j <= n ; ++j )
        c(0,j) = 0;
39. Computing the Length of an LCS
    // Fill in tables
    for ( int i = 1 ; i <= m ; ++i )
        for ( int j = 1 ; j <= n ; ++j ) {
            if ( X[i] == Y[j] ) {
                c(i,j) = c(i-1,j-1)+1;
                b(i,j) = 1; // Subproblem type 1 “↖”
            }
40. Computing the Length of an LCS
            else if ( c(i-1,j) >= c(i,j-1) ) {
                c(i,j) = c(i-1,j);
                b(i,j) = 2; // Subproblem type 2 “↑”
            }
            else {
                c(i,j) = c(i,j-1);
                b(i,j) = 3; // Subproblem type 3 “←”
            }
        }
}
41. Constructing an LCS
Table b can now be used to construct the
LCS of two sequences
Begin in the bottom right corner of b, and follow
the “arrows”
This will build the LCS in reverse order
42. Constructing an LCS
void LCSPrint(const matrix &b, const sequence &X, int i, int j)
{
    if ( i == 0 || j == 0 )
        return;
    switch ( b(i,j) ) {
    case 1: LCSPrint(b, X, i-1, j-1); cout << X[i]; break; // “↖”: X[i] is in the LCS
    case 2: LCSPrint(b, X, i-1, j); break;                 // “↑”
    case 3: LCSPrint(b, X, i, j-1); break;                 // “←”
    }
}
43. What is the Running Time to Find an
LCS?
Total running time is now the cost to build the
tables + the cost of printing it out
Table building = O(mn)
Printing = O(m+n)
So, total cost is O(mn)