Graphical Models
  for dummies
  Max Khesin, Data Strategist,
        Liquidnet Inc.
Graphical Models For Dummies
Grand Theme
• “Probabilistic graphical models are an elegant
  framework which combines uncertainty
  (probabilities) and logical structure
  (independence constraints) to compactly
  represent complex, real-world phenomena.”
  (Koller 2007)
Trying to guess whether the family is home
• When my wife leaves the house, she usually turns the
  outdoor light on (but she sometimes also leaves it on
  for a guest)
• When my wife leaves the house, she usually puts the
  dog out
• When the dog has a bowel problem, it goes to the
  backyard
• If the dog is in the backyard, I will probably hear it
  (but it might be the neighbor's dog)
These causal connections are not
              absolute
Three causes of uncertainty (Russell, Norvig
  2009):
- Laziness
- Theoretical Ignorance
- Practical Ignorance
Problem with the full joint distribution
• Too many parameters
• For n binary random variables, the full joint
  needs 2^n - 1 independent parameters
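To see how fast this blows up, here is a minimal sketch of the parameter count (the function name is just for illustration):

```python
# Independent parameters in a full joint distribution over n binary
# variables: every assignment gets its own probability, minus one
# because all probabilities must sum to 1.
def full_joint_params(n: int) -> int:
    return 2 ** n - 1

for n in (5, 10, 20, 30):
    print(n, full_joint_params(n))
```

Already at 30 variables the table has over a billion entries, which is why we need the compact factorizations that follow.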
Bayesian Network - definition
• A Bayesian network is a directed graph in which each
   node is annotated with quantitative probability
   information. The full specification is as follows:
1. Each node corresponds to a random variable, which
   may be discrete or continuous.
2. A set of directed links or arrows connects pairs of
   nodes. If there is an arrow from node X to node Y, X is
   said to be a parent of Y. The graph has no directed
   cycles (and hence is a directed acyclic graph, or DAG).
3. Each node Xi has a conditional probability distribution
   P(Xi | Parents(Xi)) that quantifies the effect of the
   parents on the node. (Russell, Norvig 2009)
Compressed distribution (factorization)
• In our Bayesian network, we need only 10
  parameters
• The compression is due to independence
• Independence is how causality manifests itself
  in the distribution
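Where the 10 comes from: each binary node needs one parameter per configuration of its parents. A quick sketch, with the parent sets taken from my reading of the family-home example (treat the structure as an assumption):

```python
# Parent sets for the family-home network (assumed from the example).
parents = {
    "family-out": [],
    "bowel-problem": [],
    "light-on": ["family-out"],
    "dog-out": ["family-out", "bowel-problem"],
    "hear-bark": ["dog-out"],
}

# A binary node needs 2**|parents| parameters (one P(node=True | config)
# per parent configuration): 1 + 1 + 2 + 4 + 2 = 10.
n_params = sum(2 ** len(p) for p in parents.values())
print(n_params)               # 10 parameters for the network...
print(2 ** len(parents) - 1)  # ...versus 31 for the full joint
```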
D-connection
Definition: conditional probability
P(A | B) = P(A, B) / P(B), for P(B) > 0
Decomposing a joint distribution
• By the chain rule of probability, any joint
  distribution can be decomposed as
  P(X1, ..., Xn) = P(Xn | X1, ..., Xn-1) ·
  P(Xn-1 | X1, ..., Xn-2) · ... · P(X1)
Topological sort
• “A topological ordering of a directed acyclic
  graph (DAG) is a linear ordering of its nodes in
  which each node comes before all nodes to
  which it has outbound edges.” - Wikipedia
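The standard library can produce such an ordering directly; a sketch using the family-home parent sets (the structure is assumed from the example):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# TopologicalSorter takes a node -> predecessors mapping, so a
# Bayesian network's parents dict fits it directly.
parents = {
    "family-out": [],
    "bowel-problem": [],
    "light-on": ["family-out"],
    "dog-out": ["family-out", "bowel-problem"],
    "hear-bark": ["dog-out"],
}
order = list(TopologicalSorter(parents).static_order())
print(order)  # every parent appears before its children
```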
Sorting the family-home example
• Family-out(1), Bowel-problem(2), Lights-
  On(3), Dog-out(4), Hear-bark(5)
• Bowel-problem(1), Family-out(2), Lights-
  On(3), Dog-out(4), Hear-bark(5)
• Right away, we get all of a variable's
  non-descendants to its left
Only the parents matter
Chain rule for Bayesian Networks
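The chain rule for Bayesian networks says P(X1, ..., Xn) = ∏ P(Xi | Parents(Xi)). A sketch for the family-home network; the CPT numbers below are made up for illustration, not taken from the slides:

```python
import itertools

# Illustrative CPTs: each entry maps a tuple of parent values to
# P(node = True | parents). The probabilities are invented.
parents = {"fo": [], "bp": [], "lo": ["fo"], "do": ["fo", "bp"], "hb": ["do"]}
cpt = {
    "fo": {(): 0.15},
    "bp": {(): 0.01},
    "lo": {(True,): 0.6, (False,): 0.05},
    "do": {(True, True): 0.99, (True, False): 0.9,
           (False, True): 0.97, (False, False): 0.3},
    "hb": {(True,): 0.7, (False,): 0.01},
}

def joint(assignment):
    """Chain rule for Bayesian networks: prod_i P(xi | parents(xi))."""
    p = 1.0
    for var, ps in parents.items():
        parent_vals = tuple(assignment[q] for q in ps)
        p_true = cpt[var][parent_vals]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p

# Sanity check: the factorized joint sums to 1 over all 2**5 assignments.
total = sum(joint(dict(zip(parents, vals)))
            for vals in itertools.product([True, False], repeat=5))
print(round(total, 10))  # 1.0
```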
Markov Networks
• These models are useful in modeling a variety
  of phenomena where one cannot naturally
  ascribe a directionality to the interaction
  between variables
Markov Networks
Markov Networks
• Let D be a set of random variables
• A factor is a function from Val(D) to R+
  (the nonnegative reals)
Markov Networks
• Let H be a Markov network structure
• Take a set of subsets D1, . . . , Dm, where each
  Di is a complete subgraph (clique) of H
• Associate a factor φi(Di) with each subset
Markov Network - factorization

P(X1, ..., Xn) = (1/Z) · P~(X1, ..., Xn)

where the unnormalized measure is

P~(X1, ..., Xn) = φ1(D1) · φ2(D2) · ... · φm(Dm)

and the normalization factor (partition function) is

Z = Σ over all (x1, ..., xn) of P~(x1, ..., xn)
Factor Product (pointwise multiplication)
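A minimal sketch of the factor product: entries that agree on the shared variables are multiplied pointwise. The representation (scope list plus a table keyed by value tuples) and the numbers are my own illustration:

```python
import itertools

def factor_product(scope1, t1, scope2, t2):
    """Pointwise multiplication of two factors over binary variables.

    A factor is (scope, table): the table maps a tuple of values,
    one per scope variable, to a nonnegative real.
    """
    scope = list(scope1) + [v for v in scope2 if v not in scope1]
    table = {}
    for vals in itertools.product([True, False], repeat=len(scope)):
        a = dict(zip(scope, vals))
        v1 = t1[tuple(a[x] for x in scope1)]
        v2 = t2[tuple(a[x] for x in scope2)]
        table[vals] = v1 * v2
    return scope, table

# phi1(A, B) * phi2(B, C) -> psi(A, B, C); values are illustrative.
phi1 = {(True, True): 0.5, (True, False): 0.8,
        (False, True): 0.1, (False, False): 0.0}
phi2 = {(True, True): 0.2, (True, False): 0.7,
        (False, True): 0.6, (False, False): 0.9}
scope, psi = factor_product(["A", "B"], phi1, ["B", "C"], phi2)
print(psi[(True, True, True)])  # 0.5 * 0.2 = 0.1
```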
Decision Networks
- Combine Bayesian Networks with Utility
   Theory
Utility-based agent
Decision network
Evaluating decision network
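Evaluating a decision network reduces to computing the expected utility of each action given the evidence and picking the argmax. A toy sketch; the actions, outcomes, and numbers are invented for illustration, not from the slides:

```python
# Hypothetical single-decision example: P(outcome | action) and a
# utility table. All numbers are made up.
P_outcome_given_action = {
    "take-umbrella": {"dry": 1.0},
    "leave-umbrella": {"dry": 0.7, "wet": 0.3},
}
utility = {"dry": 20, "wet": -100}

def expected_utility(action):
    """EU(a) = sum over outcomes of P(outcome | a) * U(outcome)."""
    return sum(p * utility[o]
               for o, p in P_outcome_given_action[action].items())

best = max(P_outcome_given_action, key=expected_utility)
print(best, expected_utility(best))
```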
Applications – Bayes Nets
• Expert systems.
• “…A later evaluation showed that the diagnostic accuracy of
  Pathfinder IV was at least as good as that of the expert used
  to design the system. When used with less expert pathologists,
  the system significantly improved the diagnostic accuracy of
  the physicians alone. Moreover, the system showed greater
  ability to identify important findings and to integrate these
  findings into a correct diagnosis. Unfortunately, multiple
  reasons prevent the widespread adoption of Bayesian
  networks as an aid for medical diagnosis, including legal
  liability issues for misdiagnoses and incompatibility with the
  physicians' workflow” (Koller 2009)
Applications – Markov Networks
• Computer vision – segmentation
• Regions are contiguous; a glove is next to the
  arm
• The model is defined over “superpixels”
Application – Markov Nets (combining
         logic and probability)
1.5  ∀x Smokes(x) ⇒ Cancer(x)
1.1  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))

Two constants: Anna (A) and Bob (B)

Ground network nodes: Friends(A,A), Friends(A,B),
Friends(B,A), Friends(B,B), Smokes(A), Smokes(B),
Cancer(A), Cancer(B)
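In a Markov logic network, a world's unnormalized weight is exp(Σi wi · ni(world)), where ni counts the true groundings of formula i. A sketch using the slide's weights; the grounding counts are hypothetical:

```python
import math

def world_weight(weights, counts):
    """Unnormalized MLN weight of a world: exp(sum_i w_i * n_i)."""
    return math.exp(sum(w * n for w, n in zip(weights, counts)))

# Formula weights from the slide: 1.5 for Smokes(x) => Cancer(x),
# 1.1 for Friends(x,y) => (Smokes(x) <=> Smokes(y)).
weights = [1.5, 1.1]
# Suppose some world makes 2 groundings of the first formula true
# and 3 of the second (hypothetical counts).
print(world_weight(weights, [2, 3]))  # exp(1.5*2 + 1.1*3) = exp(6.3)
```

Worlds that violate more formula groundings get exponentially smaller weight, rather than being ruled out as in pure logic.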
Road to AGI…
Tools
• Netica - http://www.norsys.com/netica.html
• For a comprehensive list, see
  http://www.cs.ubc.ca/~murphyk/Bayes/bnsoft.html
References
•   Russell, Norvig 2009: Artificial Intelligence: A Modern Approach (AIMA)
    http://aima.cs.berkeley.edu/ (amazon)
•   Getoor, Taskar 2007: Introduction to Statistical Relational Learning
    http://www.cs.umd.edu/srl-book/ (amazon)
•   Koller, Friedman 2009: Probabilistic Graphical Models: Principles and Techniques
    http://pgm.stanford.edu/ (amazon)
•   Charniak 1991: Bayesian Networks without Tears.
    www.cs.ubc.ca/~murphyk/Bayes/Charniak_91.pdf
•   CS228: http://www.stanford.edu/class/cs228/ (course available via SCPD)
•   Domingos, practical statistical learning in AI
    http://www.cs.cmu.edu/~tom/10601_sp08/slides/mlns-april-28.ppt, see also
    http://www.youtube.com/watch?v=bW5DzNZgGxY
•   Koller 2007: “Graphical Models in a Nutshell”, a chapter of Getoor, Taskar 2007,
    available online http://www.seas.upenn.edu/~taskar/pubs/gms-srl07.pdf
