Dynamic Programming:
basics and case studies
Houston Machine Learning Meetup
11/16/2019
Dynamic Programming: name and story
• Richard Bellman coined the term “Dynamic Programming”
Bellman autobiography
“The face of Wilson (the Secretary of Defense) would turn red, and he would get
violent if people used the term RESEARCH in his presence. You can imagine how he
felt, then, about the term MATHEMATICAL …. I had to do something to shield Wilson
and the Air Force from the fact that I was really doing MATHEMATICS inside the
RAND Corporation…. I decided therefore to use the word “PROGRAMMING”. I
wanted to get across the idea that this was DYNAMIC, this was multistage, this was
time-varying…. I thought dynamic programming was a good name. It was something
not even a Congressman could object to...”
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by recursion
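public int fib(int n) {
    // Base cases: F(0) = 0 and F(1) = 1
    if (n == 0 || n == 1) { return n; }
    // Each call spawns two more calls, so the work grows exponentially
    return fib(n - 1) + fib(n - 2);
}
Time complexity: O(2^N)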
Recursion tree of Fibonacci sequence
Fibonacci sequence
• Definition:
• F(0) = 0
• F(1) = 1
• F(n) = F(n – 1) + F(n – 2)
• Solved by DP
Time complexity: O(N)
Index  0  1  2  3  4  5  ...
F(N)   0  1  1  2  3  5  ...
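The table fill above can be sketched in Java roughly as follows. This is a minimal illustration rather than the speaker's code: the class name `FibTable` and the use of `long` are assumptions made here.

```java
// Bottom-up Fibonacci: fill the table from index 0 upward.
// FibTable is a hypothetical name chosen for this sketch.
class FibTable {
    // Returns F(n) for n >= 0; a long holds values up to F(92).
    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n < 2) {
            return n;                      // F(0) = 0, F(1) = 1
        }
        long[] table = new long[n + 1];
        table[0] = 0;
        table[1] = 1;
        for (int i = 2; i <= n; i++) {
            // Both operands are already stored, so each entry costs O(1).
            table[i] = table[i - 1] + table[i - 2];
        }
        return table[n];
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 5; i++) {
            System.out.print(fib(i) + " ");   // 0 1 1 2 3 5
        }
        System.out.println();
    }
}
```

Only the last two entries are ever read, so the array could be replaced by two variables if O(1) memory matters.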
Fibonacci sequence
• Recursion:
• F(n) = F(n – 1) + F(n – 2)
• Starts from n
• When computing F(n), F(n-1) and F(n-2) are not known yet
• DP:
• F(n) = F(n – 1) + F(n – 2)
• Starts from 0 and 1
• When computing F(n), F(n-1) and F(n-2) have already been stored in the array
• Dynamic programming: partial results are stored to save time
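Storing partial results also works when the computation still starts from n. The sketch below keeps the recursive shape but caches each F(k) the first time it is computed (often called memoization). The class name `FibMemo` and the `HashMap` cache are assumptions of this sketch, not part of the slides.

```java
import java.util.HashMap;
import java.util.Map;

// Top-down Fibonacci with a cache of partial results.
// FibMemo is a hypothetical name chosen for this sketch.
class FibMemo {
    private static final Map<Integer, Long> memo = new HashMap<>();

    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n < 2) {
            return n;                       // base cases F(0), F(1)
        }
        Long cached = memo.get(n);
        if (cached != null) {
            return cached;                  // reuse the stored partial result
        }
        long value = fib(n - 1) + fib(n - 2);
        memo.put(n, value);                 // store it for later calls
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fib(50));        // finishes quickly thanks to the cache
    }
}
```

Each F(k) is computed once, so the number of additions is linear in n, at the cost of recursion depth proportional to n.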
Longest common subsequence
• To find the longest subsequence common to two or more sequences
• String1: “AGCAT”
• String2: “GAC”
• Common subsequences: “A”, “C”, “G”, “AC”, “GA”, “GC”
• LCS: “AC”, “GA”, or “GC”
• To use a table to find LCS:
• First column: string1(“AGCAT”)
• First row: string2(“GAC”)
• Table[i, j]: LCS of string1.substring(0, i) and string2.substring(0, j)
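A minimal Java sketch of this table, assuming each cell stores the length of the LCS of the two prefixes (the slides may store the subsequences themselves). The class and method names are hypothetical.

```java
// LCS by dynamic programming over prefixes.
// table[i][j] = LCS length of s1.substring(0, i) and s2.substring(0, j).
class Lcs {
    static int[][] buildTable(String s1, String s2) {
        int[][] table = new int[s1.length() + 1][s2.length() + 1];
        for (int i = 1; i <= s1.length(); i++) {
            for (int j = 1; j <= s2.length(); j++) {
                if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                    // Matching characters extend the LCS of the shorter prefixes.
                    table[i][j] = table[i - 1][j - 1] + 1;
                } else {
                    // Otherwise drop one character from either string.
                    table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]);
                }
            }
        }
        return table;
    }

    static int length(String s1, String s2) {
        return buildTable(s1, s2)[s1.length()][s2.length()];
    }

    // Walks back from the bottom-right cell to recover one LCS.
    static String one(String s1, String s2) {
        int[][] table = buildTable(s1, s2);
        StringBuilder sb = new StringBuilder();
        int i = s1.length();
        int j = s2.length();
        while (i > 0 && j > 0) {
            if (s1.charAt(i - 1) == s2.charAt(j - 1)) {
                sb.append(s1.charAt(i - 1));
                i--;
                j--;
            } else if (table[i - 1][j] >= table[i][j - 1]) {
                i--;
            } else {
                j--;
            }
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(length("AGCAT", "GAC"));   // 2
        System.out.println(one("AGCAT", "GAC"));      // one LCS of length 2
    }
}
```

When several subsequences tie for the longest, which one `one` returns depends on the tie-breaking in the walk back.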
Wildcard matching
• Linux command line:
user@bash: ls b*
barry.txt blan.txt bob.txt
• Complicated example:
string = "adcab"
pattern = "*a*b"
• DP solution:
• Definition: table[i][j] is true if the first i characters of the string match the first j characters of the pattern
• Base case:
table[0][0] = true
first row: table[0][j + 1] = table[0][j] if pattern[j] equals *
• Induction rule:
(1) if string[i] equals pattern[j] or pattern[j] equals ?
table[i + 1][j + 1] = table[i][j]
(2) if pattern[j] equals *
table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1]
(3) otherwise
table[i + 1][j + 1] = false
• Completed table (rows: string, columns: pattern, "-" is the empty prefix):
    -  *  a  *  b
-   T  T  F  F  F
a   F  T  T  T  F
d   F  T  F  T  F
c   F  T  F  T  F
a   F  T  T  T  F
b   F  T  F  T  T
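The base case and induction rule can be sketched in Java as below. The class name `Wildcard` and method name `isMatch` are assumptions of this sketch. `?` matches exactly one character and `*` matches any sequence, including the empty one.

```java
// Wildcard matching with a boolean DP table.
// table[i][j] is true when the first i characters of s match
// the first j characters of p.
class Wildcard {
    static boolean isMatch(String s, String p) {
        int m = s.length();
        int n = p.length();
        boolean[][] table = new boolean[m + 1][n + 1];

        table[0][0] = true;                    // empty string vs. empty pattern
        for (int j = 0; j < n; j++) {
            // First row: only a run of leading '*' can match the empty string.
            if (p.charAt(j) == '*') {
                table[0][j + 1] = table[0][j];
            }
        }

        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                char pc = p.charAt(j);
                if (pc == '*') {
                    // '*' matches nothing (left cell) or one more character (cell above).
                    table[i + 1][j + 1] = table[i + 1][j] || table[i][j + 1];
                } else if (pc == '?' || pc == s.charAt(i)) {
                    table[i + 1][j + 1] = table[i][j];
                }
                // Otherwise the cell keeps its default value, false.
            }
        }
        return table[m][n];
    }

    public static void main(String[] args) {
        System.out.println(isMatch("adcab", "*a*b"));   // true
    }
}
```

The bottom-right cell holds the answer, which is `true` for the slide's example.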
Longest common subsequence and wildcard matching
• DP proceeds from the initial condition to the end of the strings:
• From left to right within each row
• From top to bottom within each column
• State transition from table[i - 1][j - 1], table[i][j - 1], and table[i - 1][j] to table[i][j]
• Each time: move forward by one step
• The state in each cell is the optimal answer for that subproblem (pair of prefixes)
• A table (or diagram) is the best tool for simulating the process
Matrix chain multiplication
• Multiply two matrices: A (10 x 100) and B (100 x 5)
• OUT[p][r] += A[p][q] * B[q][r]
• Computation = 10 x 100 x 5 = 5000 scalar multiplications
• Multiply three matrices: A1 (10 x 100), A2 (100 x 5), and A3 (5 x 50)
• ((A1 A2) A3): 10 x 100 x 5 (A1 A2) + 10 x 5 x 50 = 7500
• (A1 (A2 A3)): 100 x 5 x 50 (A2 A3) + 10 x 100 x 50 = 75000
• ((A1 A2) A3) needs 10 times fewer scalar multiplications than (A1 (A2 A3))
Matrix chain multiplication
• How do we choose the multiplication order for a chain of matrices (A1, A2, A3, ..., An) so that the cost is minimal?
• DP induction rule:
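A standard way to write this induction rule, assuming matrix A_i has dimensions p_{i-1} x p_i and M[i, j] denotes the minimum number of scalar multiplications for the product A_i ... A_j (the symbol p is introduced here for illustration):

```latex
M[i, j] =
\begin{cases}
0 & \text{if } i = j, \\[4pt]
\min\limits_{i \le k < j} \bigl( M[i, k] + M[k+1, j] + p_{i-1}\, p_k\, p_j \bigr) & \text{if } i < j.
\end{cases}
```

For the three-matrix example with p = (10, 100, 5, 50), splitting at k = 1 gives 0 + 25000 + 50000 = 75000 and splitting at k = 2 gives 5000 + 0 + 2500 = 7500, so M[1, 3] = 7500.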
Matrix chain multiplication: DP solution
• Multiplying six matrices:
• State:
• M[i, j]: the minimum number of scalar multiplications needed to multiply matrices i through j
• S[i, j]: the outermost split point (break-point) that achieves M[i, j]
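The two tables can be filled by increasing chain length, as in the Java sketch below. It assumes the dimensions are given as an array `dims` where matrix i (1-based) is dims[i-1] x dims[i]. The class name and the `parenthesize` helper are hypothetical, and the example uses the three matrices from the earlier slide rather than the six-matrix case.

```java
// Matrix chain multiplication by dynamic programming.
// m[i][j]: minimum scalar multiplications for A_i ... A_j
// s[i][j]: split point k of the outermost product for that range
class MatrixChain {
    final long[][] m;
    final int[][] s;

    MatrixChain(int[] dims) {
        if (dims.length < 2) {
            throw new IllegalArgumentException("need at least one matrix");
        }
        int n = dims.length - 1;               // number of matrices
        m = new long[n + 1][n + 1];
        s = new int[n + 1][n + 1];
        // Chains of length 1 cost 0; build longer chains from shorter ones.
        for (int len = 2; len <= n; len++) {
            for (int i = 1; i + len - 1 <= n; i++) {
                int j = i + len - 1;
                m[i][j] = Long.MAX_VALUE;
                for (int k = i; k < j; k++) {
                    long cost = m[i][k] + m[k + 1][j]
                            + (long) dims[i - 1] * dims[k] * dims[j];
                    if (cost < m[i][j]) {
                        m[i][j] = cost;
                        s[i][j] = k;
                    }
                }
            }
        }
    }

    // Rebuilds the multiplication order for A_i ... A_j from the split table.
    String parenthesize(int i, int j) {
        if (i == j) {
            return "A" + i;
        }
        return "(" + parenthesize(i, s[i][j]) + " "
                + parenthesize(s[i][j] + 1, j) + ")";
    }

    public static void main(String[] args) {
        MatrixChain mc = new MatrixChain(new int[] {10, 100, 5, 50});
        System.out.println(mc.m[1][3]);             // 7500
        System.out.println(mc.parenthesize(1, 3));  // ((A1 A2) A3)
    }
}
```

The three nested loops give O(n^3) time and the two tables O(n^2) memory.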
Matrix chain multiplication: DP solution
• Multiplying six matrices, resulting order:
(A1 (A2 A3)) ((A4 A5) A6)
Matrix chain multiplication: DP solution
• The state is hard to define:
• M[i, j]
• S[i, j]
• The state transition is complicated:
• Filling the table row by row or column by column does not work
• The table is filled by chain length: shorter chains (previous states) first, then longer chains (induction rule)
Framework of dynamic programming
• Three key components of dynamic programming algorithm:
• Definition of state
• Initial condition (base)
• Induction rule (state transition)
• Induction rule: difficult to find
• 1D/2D table for the thinking process
What is part of speech tagging?
• Identify parts of speech (syntactic categories):
This  is  a    simple  sentence
DET   VB  DET  ADJ     NOUN
• POS tagging is a first step towards syntactic analysis (semantic analysis)
• Faster than full parsing
• Text classification and word disambiguation
• How to decide the correct label:
• The word to be labeled: chair is probably a noun
• Labels of surrounding words: if the preceding word is a modal verb (e.g., will), then this word is more likely to be a verb
• Hidden Markov models can be used to work on this problem
Why is POS tagging hard?
• Ambiguity
glass of water/NOUN vs. water/VERB the plants
lie/VERB down vs. tell a lie/NOUN
wind/VERB down vs. a mighty wind/NOUN (homographs)
How about time flies like an arrow?
• Sparse data:
• Words we haven’t seen before
• Word-Tag pairs we haven’t seen before
Example transition probabilities
• Probabilities estimated from tagged WSJ corpus:
• Proper nouns (NNP) often begin sentences: P(NNP|<s>) = 0.28
• Modal verbs (MD) nearly always followed by bare verbs (VB).
• Adjectives (JJ) are often followed by nouns (NN).
Example output probabilities
• Probabilities estimated from tagged WSJ corpus:
• 0.0032% of proper nouns are Janet: P(Janet|NNP) = 0.000032
• About half of determiners (DT) are the.
• the can also be a proper noun.
Hidden Markov Chain
• A set of states (tags)
• An output alphabet (words)
• Initial state (beginning of sentence)
• State transition probabilities ( P(ti|ti-1) )
• Symbol emission probabilities ( P(wi|ti) )
Hidden Markov Chain
• Model the tagging process:
• Sentence: W = (w1, w2, … wn)
• Tags T = (t1, t2, …, tn)
• Joint probability (with t_0 = <s>):
$P(W, T) = \left[ \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \, P(w_i \mid t_i) \right] P(\text{</s>} \mid t_n)$
• Example:
• This/DET is/VB a/DET simple/JJ sentence/NN
• Add begin (<s>) and end-of-sentence (</s>):
P(W, T) = P(DET|<s>) P(VB|DET) P(DET|VB) P(JJ|DET) P(NN|JJ) P(</s>|NN)
x P(This|DET) P(is|VB) P(a|DET) P(simple|JJ) P(sentence|NN)
Computational cost of POS tagging
• Suppose we have C possible tags for each of the n words in the sentence
• There are C^n possible tag sequences: the number grows exponentially with the length n
• Viterbi algorithm: use dynamic programming to avoid enumerating them
Viterbi algorithm:
• Target: argmax_T P(T|W)
• Intuition: the best path of length i ending in state t must extend a best path of length i-1 ending in some previous state t'
• Use a table to store the partial results:
• T x N table (T tags, N words): v(t, i) is the probability of the best state sequence for w1 … wi ending in state t
• Fill in columns from left to right; the max is over each possible previous state t'
$v(t, i) = \max_{t'} \; v(t', i-1) \, P(t \mid t') \, P(w_i \mid t)$
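The column-by-column fill can be sketched in Java as follows. Tags and words are encoded as integer indices. The class name, the array layout, and all probabilities in `main` are made-up illustrations, not values from the WSJ corpus or the slides. The end-of-sentence factor P(</s>|t_n) is left out to match the recurrence above.

```java
// Viterbi decoding for a small HMM tagger.
// start[t]    = P(t | <s>)
// trans[p][t] = P(t | p)   (p is the previous tag)
// emit[t][w]  = P(w | t)
class ViterbiSketch {
    static int[] viterbi(double[] start, double[][] trans,
                         double[][] emit, int[] words) {
        int numTags = start.length;
        int n = words.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] v = new double[numTags][n];   // best path probabilities
        int[][] back = new int[numTags][n];      // best previous tag

        for (int t = 0; t < numTags; t++) {
            v[t][0] = start[t] * emit[t][words[0]];
        }
        for (int i = 1; i < n; i++) {
            for (int t = 0; t < numTags; t++) {
                for (int p = 0; p < numTags; p++) {
                    double score = v[p][i - 1] * trans[p][t] * emit[t][words[i]];
                    if (score > v[t][i]) {
                        v[t][i] = score;
                        back[t][i] = p;
                    }
                }
            }
        }
        // Pick the best final tag, then follow the back-pointers.
        int best = 0;
        for (int t = 1; t < numTags; t++) {
            if (v[t][n - 1] > v[best][n - 1]) {
                best = t;
            }
        }
        int[] tags = new int[n];
        tags[n - 1] = best;
        for (int i = n - 1; i > 0; i--) {
            tags[i - 1] = back[tags[i]][i];
        }
        return tags;
    }

    public static void main(String[] args) {
        // Hypothetical toy model. Tags: 0 = DET, 1 = NOUN, 2 = VERB.
        // Words: 0 = "the", 1 = "doctor", 2 = "is".
        double[] start = {0.5, 0.3, 0.2};
        double[][] trans = {{0.05, 0.9, 0.05}, {0.1, 0.5, 0.4}, {0.5, 0.4, 0.1}};
        double[][] emit = {{0.5, 0.0, 0.0}, {0.0, 0.01, 0.0}, {0.0, 0.001, 0.1}};
        int[] tags = viterbi(start, trans, emit, new int[] {0, 1, 2});
        System.out.println(java.util.Arrays.toString(tags));   // [0, 1, 2]
    }
}
```

The table has T x N cells and each cell looks at T predecessors, so the cost is O(T^2 N) instead of C^n. For long sentences the products underflow, so practical implementations usually add log probabilities instead.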
Viterbi algorithm: case study
• W = the doctor is in.
Viterbi algorithm: all tagged
Dynamic programming: take-home message
• Why it is fast: memory is used to store partial results
• Components of a DP algorithm: state definition, initial condition, and induction rule
• Solve DP problems with a table
Top ten DP problems
• Longest common subsequence
• Shortest common supersequence
• Longest increasing subsequence
• Edit distance
• Matrix chain multiplication
• 0-1 knapsack problem
• Partition problem
• Rod cutting
• Coin change problem
• Word break problem
Reference
• http://people.cs.georgetown.edu/nschneid/cosc572/f16/12_viterbi_slides.pdf
• https://en.wikipedia.org/wiki/Dynamic_programming
• https://medium.com/@codingfreak/top-10-dynamic-programming-problems-5da486eeb360
• https://leetcode.com/problems/wildcard-matching/description/
• https://en.wikipedia.org/wiki/Longest_common_subsequence_problem

 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Basics of Dynamic programming

  • 1. Dynamic Programming: basics and case studies Houston Machine Learning Meetup 11/16/2019
  • 2. Dynamic Programming: name and story • Richard Bellman coined the term “Dynamic Programming” • From Bellman's autobiography: “The face of Wilson (the Secretary of Defense) would turn red, and he would get violent if people used the term RESEARCH in his presence. You can imagine how he felt, then, about the term MATHEMATICAL … I had to do something to shield Wilson and the Air Force from the fact that I was really doing MATHEMATICS inside the RAND Corporation … I decided therefore to use the word “PROGRAMMING”. I wanted to get across the idea that this was DYNAMIC, this was multistage, this was time-varying … I thought dynamic programming was a good name. It was something not even a Congressman could object to ...”
  • 3. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2)
  • 4. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by recursion: public int fib(int n) { if (n == 0 || n == 1) { return n; } return fib(n - 1) + fib(n - 2); } • Time complexity: O(2^N) • (Figure: recursion tree of the Fibonacci sequence)
  • 5. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … • F(N): 0 1
  • 6. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … • F(N): 0 1 1
  • 7. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … • F(N): 0 1 1 2
  • 8. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … • F(N): 0 1 1 2 3
  • 9. Fibonacci sequence • Definition: • F(0) = 0 • F(1) = 1 • F(n) = F(n – 1) + F(n – 2) • Solved by DP • Time complexity: O(N) • Index: 0 1 2 3 4 5 … • F(N): 0 1 1 2 3 5
  • 10. Fibonacci sequence • Recursion: • F(n) = F(n – 1) + F(n – 2) • Starts from n • When computing F(n), F(n – 1) and F(n – 2) are not known yet • DP: • F(n) = F(n – 1) + F(n – 2) • Starts from 0 and 1 • When computing F(n), F(n – 1) and F(n – 2) have already been stored in an array • Dynamic programming: partial results are stored to save time
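The bottom-up approach described on this slide can be sketched as follows. This is a minimal illustration rather than code from the deck; the class name `FibDp` and the `long` return type are assumptions made here.

```java
class FibDp {
    // Bottom-up DP: table[i] holds F(i), and each entry is computed exactly once.
    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n < 2) {
            return n; // base cases F(0) = 0 and F(1) = 1
        }
        long[] table = new long[n + 1];
        table[0] = 0;
        table[1] = 1;
        for (int i = 2; i <= n; i++) {
            // Both operands are already stored, so no recursion is needed.
            table[i] = table[i - 1] + table[i - 2];
        }
        return table[n];
    }
}
```

For example, `FibDp.fib(5)` returns 5, matching the table on the previous slides. Since only the last two entries are ever read, the array could be replaced by two variables; the array is kept here to mirror the table.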
  • 11. Longest common subsequence • Goal: find the longest subsequence common to two or more sequences • String1: “AGCAT” • String2: “GAC” • Common subsequences: “A”, “C”, “G”, “AC”, “GA”, “GC” • LCS: “AC”, “GA”, or “GC” • To use a table to find the LCS: • First column: string1 (“AGCAT”) • First row: string2 (“GAC”) • Table[i, j]: LCS of string1.substring(0, i) and string2.substring(0, j)
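A possible table-based implementation is sketched below. Unlike the table on the slide, this version stores only the length of the LCS in each cell and recovers one LCS by walking back through the table; the class name `Lcs` is an assumption made for illustration.

```java
class Lcs {
    // table[i][j] = length of the LCS of a.substring(0, i) and b.substring(0, j).
    static String lcs(String a, String b) {
        int[][] table = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    // Matching characters extend the LCS of both shorter prefixes.
                    table[i][j] = table[i - 1][j - 1] + 1;
                } else {
                    // Otherwise drop one character from either string.
                    table[i][j] = Math.max(table[i - 1][j], table[i][j - 1]);
                }
            }
        }
        // Walk back from the bottom-right corner to recover one LCS.
        StringBuilder sb = new StringBuilder();
        int i = a.length();
        int j = b.length();
        while (i > 0 && j > 0) {
            if (a.charAt(i - 1) == b.charAt(j - 1)) {
                sb.append(a.charAt(i - 1));
                i--;
                j--;
            } else if (table[i - 1][j] >= table[i][j - 1]) {
                i--;
            } else {
                j--;
            }
        }
        return sb.reverse().toString();
    }
}
```

Calling `Lcs.lcs("AGCAT", "GAC")` returns one subsequence of length 2. Which of the equally long answers comes back depends on how ties are broken during the walk back.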
  • 16. Wildcard matching • Linux command line: user@bash: ls b* lists barry.txt, blan.txt, bob.txt • Complicated example: string = "adcab", pattern = "*a*b" • DP solution: • Definition: table[i][j] is true if the first i characters of string match the first j characters of pattern • Base case: table[0][0] = true; first row: table[0][i + 1] = table[0][i] if pattern[i] equals *, otherwise false • Induction rule: (1) if string[i] equals pattern[j] or pattern[j] equals ?, then table[i + 1][j + 1] = table[i][j]; (2) if pattern[j] equals *, then table[i + 1][j + 1] = table[i + 1][j] or table[i][j + 1] • Table after the base case (columns: pattern, rows: string, "-" is the empty prefix):
         -  *  a  *  b
      -  T  T  F  F  F
      a
      d
      c
      a
      b
  • 17. Wildcard matching (same example and rules as slide 16) • Table filled partway through the fifth row:
         -  *  a  *  b
      -  T  T  F  F  F
      a  F  T  T  T  F
      d  F  T  F  T  F
      c  F  T  F  T  F
      a  F  T  T
      b
  • 18. Wildcard matching (same example and rules as slide 16) • Table filled partway through the last row:
         -  *  a  *  b
      -  T  T  F  F  F
      a  F  T  T  T  F
      d  F  T  F  T  F
      c  F  T  F  T  F
      a  F  T  T  T  F
      b  F  T  F  T
  • 19. Wildcard matching (same example and rules as slide 16) • Completed table; the bottom-right cell is T, so "adcab" matches "*a*b":
         -  *  a  *  b
      -  T  T  F  F  F
      a  F  T  T  T  F
      d  F  T  F  T  F
      c  F  T  F  T  F
      a  F  T  T  T  F
      b  F  T  F  T  T
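The base case and the two induction rules above translate fairly directly into code. The following is a sketch under the assumption that `*` matches any sequence of characters (including none) and `?` matches exactly one character; the class and method names are illustrative.

```java
class WildcardMatch {
    // table[i][j] is true when the first i characters of s
    // match the first j characters of p.
    static boolean isMatch(String s, String p) {
        boolean[][] table = new boolean[s.length() + 1][p.length() + 1];
        table[0][0] = true; // empty pattern matches empty string
        // First row: only a leading run of '*' can match the empty string.
        for (int j = 0; j < p.length(); j++) {
            if (p.charAt(j) == '*') {
                table[0][j + 1] = table[0][j];
            }
        }
        for (int i = 0; i < s.length(); i++) {
            for (int j = 0; j < p.length(); j++) {
                char c = p.charAt(j);
                if (c == '*') {
                    // Rule (2): '*' matches nothing (left cell)
                    // or absorbs one more character (cell above).
                    table[i + 1][j + 1] = table[i + 1][j] || table[i][j + 1];
                } else if (c == '?' || c == s.charAt(i)) {
                    // Rule (1): one character is consumed on both sides.
                    table[i + 1][j + 1] = table[i][j];
                }
                // Any other case leaves the cell false.
            }
        }
        return table[s.length()][p.length()];
    }
}
```

`WildcardMatch.isMatch("adcab", "*a*b")` returns true, which corresponds to the bottom-right cell of the completed table.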
  • 20. Longest common subsequence and wildcard matching • DP proceeds from the initial condition to the end of the string: • From left to right in each row • From top to bottom in each column • State transition from table[i - 1][j - 1], table[i][j - 1], and table[i - 1][j] to table[i][j] • Each time: move forward by one step • The state at each step is the global optimum for that step • A table (or diagram) is the best tool for simulating the process
  • 21. Matrix chain multiplication • Multiply two matrices: A (10 x 100) and B (100 x 5) • OUT[p][r] += A[p][q] * B[q][r] • Computation = 10 x 100 x 5 = 5000 scalar multiplications • Multiply three matrices: A1 (10 x 100), A2 (100 x 5), and A3 (5 x 50) • ((A1 A2) A3): 10 x 100 x 5 (A1 A2) + 10 x 5 x 50 = 7500 • (A1 (A2 A3)): 100 x 5 x 50 (A2 A3) + 10 x 100 x 50 = 75000 • ((A1 A2) A3) needs 10 times fewer scalar multiplications than (A1 (A2 A3))
  • 22. Matrix chain multiplication • How to optimize the chain multiplication of matrices (A1, A2, A3, …, An) • DP induction rule:
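The recurrence usually given for this problem is written out below as a reference. It assumes that matrix A_k has dimensions p_{k-1} x p_k (the symbol p is introduced here for illustration) and uses M for the minimum cost and S for the break point, as on the following slides.

```latex
M[i,i] = 0, \qquad
M[i,j] = \min_{i \le k < j} \bigl( M[i,k] + M[k+1,j] + p_{i-1}\, p_k\, p_j \bigr) \quad (i < j), \qquad
S[i,j] = \operatorname*{arg\,min}_{i \le k < j} \bigl( M[i,k] + M[k+1,j] + p_{i-1}\, p_k\, p_j \bigr)
```

For the three-matrix example on the previous slide, p = (10, 100, 5, 50), and the minimum over k = 1, 2 is min(75000, 7500) = 7500, reached at k = 2.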
  • 23. Matrix chain multiplication: DP solution • Example: multiplying six matrices • State: • M[i, j]: the minimum number of scalar multiplications needed to multiply matrices i through j • S[i, j]: the last-layer break point for M[i, j]
  • 30. Matrix chain multiplication: DP solution • Multiplying six matrices, resulting parenthesization: (A1 (A2 A3)) ((A4 A5) A6)
  • 31. Matrix chain multiplication: DP solution • The state is hard to define: • M[i, j] • S[i, j] • The state transition is complicated: • Filling the table by row and column does not work • Move from previous states to the current state by increasing chain length (induction rule)
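A sketch of how the M and S tables might be filled by increasing chain length is shown below. It assumes the usual convention that matrix k (1-based) has dimensions dims[k - 1] x dims[k]; the array `dims` and all class and method names are assumptions made here, not part of the slides.

```java
class MatrixChain {
    // Fills m (minimum scalar multiplications) and s (last break point)
    // for every sub-chain, shortest chains first.
    private static void fill(int[] dims, long[][] m, int[][] s) {
        int n = dims.length - 1; // number of matrices
        for (int len = 2; len <= n; len++) {
            for (int i = 1; i + len - 1 <= n; i++) {
                int j = i + len - 1;
                m[i][j] = Long.MAX_VALUE;
                for (int k = i; k < j; k++) {
                    // Cost of (Ai..Ak)(Ak+1..Aj) plus the final product.
                    long cost = m[i][k] + m[k + 1][j]
                            + (long) dims[i - 1] * dims[k] * dims[j];
                    if (cost < m[i][j]) {
                        m[i][j] = cost;
                        s[i][j] = k;
                    }
                }
            }
        }
    }

    // Minimum number of scalar multiplications; expects at least one matrix.
    static long minCost(int[] dims) {
        int n = dims.length - 1;
        long[][] m = new long[n + 1][n + 1];
        fill(dims, m, new int[n + 1][n + 1]);
        return m[1][n];
    }

    // One optimal parenthesization, rebuilt from the break points.
    static String order(int[] dims) {
        int n = dims.length - 1;
        int[][] s = new int[n + 1][n + 1];
        fill(dims, new long[n + 1][n + 1], s);
        return parens(s, 1, n);
    }

    private static String parens(int[][] s, int i, int j) {
        if (i == j) {
            return "A" + i;
        }
        int k = s[i][j];
        return "(" + parens(s, i, k) + " " + parens(s, k + 1, j) + ")";
    }
}
```

With the three-matrix example from slide 21, `MatrixChain.minCost(new int[] {10, 100, 5, 50})` returns 7500 and `MatrixChain.order(...)` on the same input returns `((A1 A2) A3)`.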
  • 32. Framework of dynamic programming • Three key components of dynamic programming algorithm: • Definition of state • Initial condition (base) • Induction rule (state transition) • Induction rule: difficult to find • 1D/2D table for the thinking process
  • 33. What is part-of-speech tagging? • Identify the parts of speech (syntactic categories): This/DET is/VB a/DET simple/ADJ sentence/NOUN • POS tagging is a first step towards syntactic analysis (and, in turn, semantic analysis) • Faster than full parsing • Useful for text classification and word sense disambiguation • How to decide the correct label: • The word to be labeled: "chair" is probably a noun • Labels of the surrounding words: if the preceding word is a modal verb (e.g., "will"), then this word is more likely to be a verb • Hidden Markov models can be used to work on this problem
  • 34. Why is POS tagging hard? • Ambiguity: • glass of water/NOUN vs. water/VERB the plants • lie/VERB down vs. tell a lie/NOUN • wind/VERB down vs. a mighty wind/NOUN (homographs) • How about "time flies like an arrow"? • Sparse data: • Words we haven’t seen before • Word-tag pairs we haven’t seen before
  • 35. Example transition probabilities • Probabilities estimated from the tagged WSJ corpus: • Proper nouns (NNP) often begin sentences: P(NNP|<s>) = 0.28 • Modal verbs (MD) are nearly always followed by bare verbs (VB) • Adjectives (JJ) are often followed by nouns (NN)
  • 36. Example output probabilities • Probabilities estimated from the tagged WSJ corpus: • 0.0032% of proper nouns are "Janet": P(Janet|NNP) = 0.000032 • About half of determiners (DT) are "the" • "the" can also be a proper noun
  • 37. Hidden Markov Chain • A set of states (tags) • An output alphabet (words) • Initial state (beginning of sentence) • State transition probabilities $P(t_i \mid t_{i-1})$ • Symbol emission probabilities $P(w_i \mid t_i)$
  • 38. Hidden Markov Chain • Model the tagging process: • Sentence: W = (w1, w2, …, wn) • Tags: T = (t1, t2, …, tn) • Joint probability, with $t_0$ = <s>: $P(W, T) = \prod_{i=1}^{n} P(t_i \mid t_{i-1})\, P(w_i \mid t_i) \cdot P(\text{</s>} \mid t_n)$ • Example: • This/DET is/VB a/DET simple/JJ sentence/NN • Add begin (<s>) and end-of-sentence (</s>) markers: $P(W, T) = \prod_{i=1}^{n} P(t_i \mid t_{i-1})\, P(w_i \mid t_i) \cdot P(\text{</s>} \mid t_n)$ = P(DET|<s>) P(VB|DET) P(DET|VB) P(JJ|DET) P(NN|JJ) P(</s>|NN) x P(This|DET) P(is|VB) P(a|DET) P(simple|JJ) P(sentence|NN)
  • 39. Computational cost of POS tagging • Suppose we have C possible tags for each of the n words in the sentence • There are C^n possible tag sequences: the number grows exponentially with the length n • Viterbi algorithm: use dynamic programming to solve it
  • 40. Viterbi algorithm • Target: argmax_T P(T|W) • Intuition: the best path of length i ending in state t must include the best path of length i – 1 to the previous state • Use a table to store the partial results: • T x N table; v(t, i) is the probability of the best state sequence for w1 … wi that ends in state t • Fill in the columns from left to right; the max is over each possible previous state t’: v(t, i) = max over t’ of { v(t’, i – 1) P(t|t’) P(wi|t) }
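The table-filling step above can be sketched in code as follows. Tags and words are represented by integer indices, and the parameter names (`start`, `trans`, `emit`, `end`) are assumptions introduced here to hold P(t|<s>), P(t|t’), P(w|t), and P(</s>|t); the back-pointer table is an addition needed to recover the tag sequence and is not mentioned on the slide.

```java
class Viterbi {
    // start[t]    = P(t | <s>)
    // trans[p][t] = P(t | p)
    // emit[t][w]  = P(w | t)
    // end[t]      = P(</s> | t)
    // obs holds the word indices w1..wn (at least one word).
    // Returns the most probable sequence of tag indices.
    static int[] decode(int[] obs, double[] start, double[][] trans,
                        double[][] emit, double[] end) {
        int n = obs.length;
        int numTags = start.length;
        double[][] v = new double[numTags][n]; // v[t][i]: best score ending in t
        int[][] back = new int[numTags][n];    // back[t][i]: best previous tag
        for (int t = 0; t < numTags; t++) {
            v[t][0] = start[t] * emit[t][obs[0]];
        }
        // Fill the columns from left to right.
        for (int i = 1; i < n; i++) {
            for (int t = 0; t < numTags; t++) {
                for (int p = 0; p < numTags; p++) {
                    double score = v[p][i - 1] * trans[p][t] * emit[t][obs[i]];
                    if (score > v[t][i]) {
                        v[t][i] = score;
                        back[t][i] = p;
                    }
                }
            }
        }
        // Pick the best final tag, including the end-of-sentence transition.
        int best = 0;
        double bestScore = -1.0;
        for (int t = 0; t < numTags; t++) {
            double score = v[t][n - 1] * end[t];
            if (score > bestScore) {
                bestScore = score;
                best = t;
            }
        }
        // Follow the back pointers from right to left.
        int[] tags = new int[n];
        tags[n - 1] = best;
        for (int i = n - 1; i > 0; i--) {
            tags[i - 1] = back[tags[i]][i];
        }
        return tags;
    }
}
```

The running time is O(C^2 n) for C tags and n words, compared with C^n for enumerating every tag sequence. For long sentences the products can underflow, so a practical version would probably add log probabilities instead of multiplying.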
  • 42. Viterbi algorithm: case study • W = the doctor is in.
  • 50. Dynamic programming: take-home message • Why it is fast: memory is used to store partial results • DP algorithm components: state definition, initial condition, and induction rule • Solve DP problems with a table
  • 51. Top ten DP problems • Longest common subsequence • Shortest common supersequence • Longest increasing subsequence • Edit distance • Matrix chain multiplication • 0-1 knapsack problem • Partition problem • Rod cutting • Coin change problem • Word break problem
  • 52. References • http://people.cs.georgetown.edu/nschneid/cosc572/f16/12_viterbi_slides.pdf • https://en.wikipedia.org/wiki/Dynamic_programming • https://medium.com/@codingfreak/top-10-dynamic-programming-problems-5da486eeb360 • https://leetcode.com/problems/wildcard-matching/description/ • https://en.wikipedia.org/wiki/Longest_common_subsequence_problem