2. What is Dynamic Programming?
Solves problems by combining solutions to
subproblems
Similar to divide-and-conquer
Applicable when subproblems are not independent
Subproblems share sub-subproblems
Common sub-subproblems are computed once, then
answers are reused as needed
Typically used to solve optimization problems
3. What Are Optimization Problems?
Problems for which there may be many
solutions
Each solution has a value
We want the solution that has an “optimal” value
Often the minimum or maximum value
There may be more than one “optimal” solution
4. Development of DP Algorithms
Four steps:
Characterize the structure of an optimal solution
Recursively define the value of an optimal solution
Compute the value of an optimal solution in a bottom-up
fashion
Construct an optimal solution from computed information
5. Matrix-Chain Multiplication
Given a sequence of matrices A1..An, we wish to compute their
product
This can be computed by repeated multiplication of matrix pairs
using the standard algorithm (next slide)
Ambiguities must be resolved in advance by parenthesizing the
matrices
Matrix multiplication is associative, so all parenthesizations yield
the same product
While order of parenthesizing isn’t mathematically important, it
can make a dramatic difference to the cost of evaluation
6. Matrix Multiplication Algorithm
Matrix MatrixMultiply(const Matrix &A, const Matrix &B)
{
    if ( A.columns() != B.rows() )
        throw XIncompatibleDimensions;
    Matrix C(A.rows(), B.columns());
    for ( int i = 0 ; i < A.rows() ; ++i )
    {
        for ( int j = 0 ; j < B.columns() ; ++j )
        {
            C(i,j) = 0;
            for ( int k = 0 ; k < A.columns() ; ++k )
                C(i,j) += A(i,k)*B(k,j);
        }
    }
    return C;
}
7. Matrix Multiplication Algorithm
To multiply two matrices A (p x q) and B (q x
r), the number of columns in A must equal the
number of rows in B.
The resulting matrix is p x r
Running time of algorithm is O(pqr)
8. Why Parenthesization Order Matters
Consider three matrices: 10x100, 100x5, and
5x50
If we multiply (A1A2)A3, we perform 10x100x5 +
10x5x50 = 7,500 multiplications
If we multiply A1(A2A3), we perform 100x5x50 +
10x100x50 = 75,000 multiplications
9. The Matrix-Chain Multiplication Problem
Given a chain <A1, A2, …, An> of n
matrices, where for i = 1, 2, …, n, the matrix
Ai has dimension p[i-1] x p[i], fully parenthesize
the product A1A2…An in a way that minimizes
the number of scalar multiplications
10. Structure of an Optimal Parenthesization
To compute the product of all matrices (denoted as
A1..n), we must first:
Compute the products A1..k and Ak+1..n for some split point k (1 <= k < n)
Multiply them together
The cost of computing A1..n is then just the cost of
computing A1..k, plus the cost of computing
Ak+1..n, plus the cost of multiplying them together
11. Structure of an Optimal Parenthesization
The parenthesization of the “prefix” subchain A1,…,Ak and “suffix”
subchain Ak+1,…,An within the optimal parenthesization of
A1,…,An must be optimal
If it weren’t, then there would be some other parenthesization
order with lower cost, which would result in non-optimal
parenthesization of A1,…,An
Optimal solution to the whole problem therefore contains optimal
subproblem solutions
This is a hallmark of dynamic programming
12. A Recursive Solution
Our subproblem is the problem of
determining the minimum cost of a
parenthesization of Ai,…,Aj, where 1 <= i <= j
<= n
Let m[i,j] be the minimum number of
multiplications needed to compute Ai..j
Lowest cost to compute A1..n is m[1,n]
13. A Recursive Solution
How do we recursively define m[i,j]?
If i = j, the chain consists of the single matrix Ai, so
no multiplications are necessary
Thus, m[i,i] = 0 for all i
If i < j, we compute as follows:
Assume that there is some optimal parenthesization that splits
the range between k and k+1
m[i,j] = m[i,k] + m[k+1,j] + p[i-1]*p[k]*p[j], per our previous discussion
To do this, we must find k
There are j-i possibilities, which can be checked directly
14. A Recursive Solution
Our recursive solution is now:
m[i,j] gives the costs of optimal solutions to
subproblems
Define a second table s[i,j] to help keep track of
how to construct an optimal solution
Each entry contains the value k at which to split the
product
m[i,j] = 0                                                          if i = j
m[i,j] = min over i <= k < j of { m[i,k] + m[k+1,j] + p[i-1]*p[k]*p[j] }  if i < j
15. Computing the Optimal Costs
An algorithm to compute the minimum cost
for a matrix-chain multiplication should now
be simple
It will take exponential time, which is no better
than our brute force approach
So how can we improve the running time?
16. Computing the Optimal Costs
There are only O(n²) total subproblems
One subproblem for each choice of i and j
satisfying 1 <= i <= j <= n
A recursive algorithm may encounter each
subproblem many times
Recomputing known values is costly
This is where dynamic programming techniques
are superior
17. Computing the Optimal Costs
Instead of using recursion, we will use the
dynamic programming bottom-up approach to
compute the cost
The following algorithm assumes that the
matrix Ai has dimension p[i-1] x p[i]
The input sequence is an array <p0, p1, …, pn>
with length n+1
The output is the matrices m and s
18. void MatrixChainOrder(const double p[], int size, Matrix &m, Matrix &s)
{
    int n = size-1;
    for ( int i = 1 ; i <= n ; ++i )
        m(i,i) = 0;
    for ( int L = 2 ; L <= n ; ++L ) {          // L is the chain length
        for ( int i = 1 ; i <= n-L+1 ; ++i ) {
            int j = i+L-1;
            m(i,j) = MAXDOUBLE;
            for ( int k = i ; k <= j-1 ; ++k ) {
                double q = m(i,k)+m(k+1,j)+p[i-1]*p[k]*p[j];
                if ( q < m(i,j) ) {
                    m(i,j) = q;
                    s(i,j) = k;
                }
            } // for ( k )
        } // for ( i )
    } // for ( L )
}
20. Computing the Optimal Costs
The matrix m contains the minimum
multiplication costs, and s records the value of
k used to achieve each optimal cost
What is the running time?
Three nested loops = O(n³)
This is better than the exponential running time
the recurrence would give us
21. Constructing an Optimal Solution
So far, we only know the optimal number of scalar
multiplications, not the order in which to multiply the matrices
This information is encoded in the table s
Each entry s[i,j] records the value k such that the optimal
parenthesization of Ai…Aj splits the product between Ak and Ak+1
To compute the product A1..n, we parenthesize at s[1,n]
Previous matrix multiplications can be computed recursively
E.g., s[1,s[1,n]] contains the optimal split for the left half of the
multiplication
22. Constructing an Optimal Solution
Matrix MatrixChainMultiply(const Matrix A[], const Matrix &s, int i, int j)
{
    if ( j > i )
    {
        int k = (int)s(i,j);
        Matrix X = MatrixChainMultiply(A, s, i, k);
        Matrix Y = MatrixChainMultiply(A, s, k+1, j);
        return MatrixMultiply(X,Y);
    }
    else
        return A[i];
}
23. Elements of Dynamic Programming
There are two key ingredients that an
optimization problem must have for dynamic
programming to be applicable:
Optimal substructure
Overlapping subproblems
24. Optimal Substructure
A problem exhibits optimal substructure if an
optimal solution to the problem contains within it
optimal solutions to subproblems
E.g., matrix-chain multiplication exhibited this property
An optimal parenthesization of a matrix chain requires that
each sub-chain also be optimally parenthesized
This property is typically shown by assuming that a better
solution exists, then showing how this contradicts the
optimality of the solution to the original problem
25. Overlapping Subproblems
The “space” for subproblems must be relatively
small
i.e., a recursive algorithm for the solution would end up
solving the same subproblems over and over
This is called overlapping subproblems
Dynamic programming algorithms typically compute
overlapping subproblems once and store the
solution in a table for later (constant-time) lookup
26. Overlapping Subproblems
From the matrix-chain algorithm, we see earlier
computations being reused to perform later
computations:
What if we replaced this with a recursive
algorithm?
Figure 16.2 on page 311 shows the added
computations
double q = m(i,k)+m(k+1,j)+p[i-1]*p[k]*p[j];
if ( q < m(i,j) ) {
    m(i,j) = q;
    s(i,j) = k;
}
27. Exercise
A common recursive algorithm for computing the
Fibonacci sequence is:
Fib(x) = 1                       if x = 1
Fib(x) = 1                       if x = 2
Fib(x) = Fib(x-1) + Fib(x-2)     if x > 2
Is dynamic programming applicable? Why?
Write a dynamic programming algorithm for
solving this problem
28. Longest Common Subsequence
A subsequence of a sequence is obtained by
deleting zero or more elements; the remaining
elements keep their relative order but need not
be consecutive
Formally:
If X = <x1,…,xm> and Z = <z1,…,zk>, Z is a subsequence
of X if there exists a strictly increasing sequence of
indices <i1,…,ik> such that x(ij) = zj for 1 <= j <= k
29. Longest Common Subsequence
Given two sequences X and Y, Z is a
common subsequence of X and Y if Z is a
subsequence of both X and Y
The longest common subsequence problem
is thus the problem of finding the longest
common subsequence of two given
sequences
30. Brute-Force Approach
Enumerate all subsequences in X to see if
they are also subsequences of Y, and keep
track of the longest one found
If X is of size m, that’s 2^m possibilities!
31. Characterizing an LCS
Does the LCS problem exhibit an optimal
substructure property?
Yes, corresponding to pairs of “prefixes”
A prefix of a sequence is simply the
beginning portion of a sequence for some
specified length
e.g., if X = <A,B,C,B,D,A,B>, the fourth prefix of X
(X4) is <A,B,C,B>
32. Characterizing an LCS
From Theorem 16.1: Let
X=<x1,…,xm>, Y=<y1,…,yn>, and Z=<z1,…,zk> be
any LCS of X and Y
If xm=yn, then zk=xm=yn and Zk-1 is an LCS of Xm-1 and Yn-1
If xm!=yn, then zk!=xm implies that Z is an LCS of Xm-1 and Y
If xm!=yn, then zk!=yn implies that Z is an LCS of X and Yn-1
33. Characterizing an LCS
What does this mean?
An LCS of two sequences contains within it an
LCS of prefixes of the two sequences
This is just the optimal substructure property
34. A Recursive Solution To Subproblems
Theorem 16.1 gives us two conditions to check for:
If xm = yn, we need to find an LCS of Xm-1 & Yn-1, to which we
append xm = yn
If xm != yn, then we must solve two subproblems: finding an LCS
of Xm & Yn-1, and an LCS of Xm-1 & Yn
Whichever is longer is the LCS of X & Y
Overlapping subproblems are evident
To find an LCS of X & Y, we must find an LCS of Xm-1 and/or
Yn-1, which have still smaller overlapping subproblems
35. A Recursive Solution To Subproblems
What is the cost of an optimal solution?
Let c[i,j] be the length of an LCS of Xi & Yj
If i or j = 0, one prefix is empty, so the LCS has length 0
Otherwise, the cost follows directly from Theorem 16.1:
c[i,j] = 0                            if i = 0 or j = 0
c[i,j] = c[i-1,j-1] + 1               if i, j > 0 and xi = yj
c[i,j] = max(c[i,j-1], c[i-1,j])      if i, j > 0 and xi != yj
36. Computing the Length of an LCS
A recursive algorithm for computing the length of an
LCS of two sequences can be written directly from
the recurrence formula for the cost of an optimal
solution
This recursive algorithm will lead to an exponential-time
solution
Dynamic programming techniques can be used to compute
the solution bottom-up and reduce the expected running
time
37. Computing the Length of an LCS
The following algorithm fills in the cost table c
based on the input sequences X and Y
It also maintains a table b that helps construct an
optimal solution
Entry b[i,j] points to the table entry corresponding to the
optimal subproblem solution chosen when computing
c[i,j]
38. Computing the Length of an LCS
void LCSLength(const sequence &X, const sequence &Y, matrix &b, matrix &c)
{
    int m = X.length, n = Y.length;
    // Initialize tables
    for ( int i = 0 ; i <= m ; ++i )
        c(i,0) = 0;
    for ( int j = 0 ; j <= n ; ++j )
        c(0,j) = 0;
39. Computing the Length of an LCS
    // Fill in tables
    for ( int i = 1 ; i <= m ; ++i )
        for ( int j = 1 ; j <= n ; ++j ) {
            if ( X[i] == Y[j] ) {
                c(i,j) = c(i-1,j-1)+1;
                b(i,j) = 1; // Subproblem type 1 “↖”
            }
40. Computing the Length of an LCS
            else if ( c(i-1,j) >= c(i,j-1) ) {
                c(i,j) = c(i-1,j);
                b(i,j) = 2; // Subproblem type 2 “↑”
            }
            else {
                c(i,j) = c(i,j-1);
                b(i,j) = 3; // Subproblem type 3 “←”
            }
        }
}
41. Constructing an LCS
Table b can now be used to construct the
LCS of two sequences
Begin in the bottom right corner of b, and follow
the “arrows”
This will build the LCS in reverse order
42. Constructing an LCS
void LCSPrint(const matrix &b, const sequence &X, int i, int j)
{
    if ( i == 0 || j == 0 )
        return;
    switch ( b(i,j) ) {
    case 1: LCSPrint(b, X, i-1, j-1); cout << X[i]; break; // “↖”: X[i] is in the LCS
    case 2: LCSPrint(b, X, i-1, j); break;                 // “↑”
    case 3: LCSPrint(b, X, i, j-1); break;                 // “←”
    }
}
43. What is the Running Time to Find an
LCS?
Total running time is now the cost to build the
tables + the cost of printing it out
Table building = O(mn)
Printing = O(m+n)
So, total cost is O(mn)