Introduction to Bayesian Networks

Practical and Technical Perspectives



Stefan Conrady, stefan.conrady@conradyscience.com

Dr. Lionel Jouffe, jouffe@bayesia.com

February 15, 2011




Conrady Applied Science, LLC - Bayesia’s North American Partner for Sales and Consulting
Table of Contents

Introduction

Bayesian Networks from a Practitioner’s Perspective
    Knowledge Unification
    Knowledge Representation & Communication
    Reasoning
    Summary

Technical Introduction
    Introduction
    Probabilistic Semantics
    Evidential Reasoning
    Learning Bayesian Networks
    Uncertainty Over Time
    Causal Networks
    Causal Discovery

References

Contact Information
    Conrady Applied Science, LLC
    Bayesia SAS




Introduction

A simplistic analogy may help to jump-start our introduction to Bayesian networks: in the same way that one can use a phone book without having to memorize all the names and numbers, one can deliberately (and correctly) reason with the domain knowledge contained in a Bayesian network without having to become a domain expert.

Over the last 25 years, Bayesian networks have emerged as a practically feasible form of knowledge representation, primarily through the seminal works of UCLA Professor Judea Pearl. With ever-increasing computing power, Bayesian networks have become a powerful tool for gaining a deep understanding of very complex, high-dimensional problem domains. Their computational efficiency and inherently visual structure make Bayesian networks attractive for exploring and explaining complex problems.

However, Bayesian networks are somewhat of a disruptive technology, as they challenge a number of common practices in the world of business and science. So, beyond the world of academia, promoting Bayesian networks as a new tool for practical knowledge management and reasoning still requires significant persuasion efforts. With this short paper, we attempt to provide a concise justification, both from a practitioner’s and a technical perspective1, why Bayesian networks are so important.




1   Author’s note: portions of the technical chapter of this paper are adapted, with permission, from Pearl and Russell (2000).


Bayesian Networks from a Practitioner’s Perspective

In our quest to “evangelize” about Bayesian networks (and the BayesiaLab software package2), we are often limited to presenting our case in just a few PowerPoint slides, using only a few catchy bullet points. In this context (and this list is obviously not comprehensive), we selected the following headings to highlight the key benefits of Bayesian networks to research practitioners and business executives:

1. Knowledge Unification

2. Knowledge Representation & Communication

3. Reasoning

Under these headlines, the following paragraphs are meant to provide a glimpse of the powerful properties and wide-ranging practical advantages of Bayesian networks.

Knowledge Unification
Many fields are characterized by the proverbial conflict between “art” and “science.” This manifests itself in debates, such as the one about evidence-based medicine versus the prevailing practice of physicians with years of experience. Even more common is the discrepancy between scientifically derived market research insights and expertise-based marketing decisions of business executives. Traditional frameworks typically don't facilitate leveraging the knowledge available on both sides.

Bayesian networks can capture both qualitative knowledge (through their network structure) and quantitative knowledge (through their parameters). While expert knowledge from practitioners is mostly qualitative, it can be used directly for building the structure of a Bayesian network. In addition, data mining algorithms can extract both qualitative and quantitative knowledge from data and encode both forms simultaneously in a Bayesian network. As a result, Bayesian networks can bridge the gap between different types of knowledge and serve to unify all available knowledge into a single form of representation.




2   Developed by Bayesia SAS, BayesiaLab is a comprehensive software package designed for learning, editing and analyzing Bayesian networks. It is available in North America from Conrady Applied Science, LLC.


[Diagram: the domain splits into “Art” (expert knowledge, qualitative) and “Science” (mathematical representation, quantitative); both feed into a Bayesian network as a unified knowledge representation.]

Figure 1: Knowledge unification with Bayesian networks

Knowledge Representation & Communication
Relaying knowledge typically involves an array of factual and causal statements. In natural language communication, such statements often contain generalizations, approximations, and implicit assumptions regarding their probability. Such simplifications are widely accepted in casual conversation or in media headlines.

However, the more precise communication required in science or business makes it necessary to spell out exceptions, uncertainty and conditions regarding statements about knowledge. With natural language expressions, this can become very cumbersome, especially when it concerns a complex domain (hence the substantial girth of many textbooks).

Also, the need for precision in describing complex domains is often at odds with modern business culture, which, as already mentioned in the introduction, dictates communication via PowerPoint in a few concise bullet points. Needless to say, the complex dynamics of a domain often cannot be relayed correctly to policy makers and other stakeholders this way.

Bayesian networks are very well suited for capturing probabilistic and incomplete causal knowledge regarding a domain. They can easily accommodate exceptions to a rule, e.g. “all swans are white, except for a certain species,” as well as partial causal information, for instance “alcohol caused the accident,” even though more factors may actually be involved, such as poor road conditions.

Through its structure and its parameters, a Bayesian network comprehensively describes what is known about a particular domain and especially the interactions of all the variables contained within that domain. As such, a Bayesian network is a “Portable Knowledge Format” that can succinctly communicate the state of the domain as well as its dynamics.




Reasoning
By representing the interactions, a (correctly formulated) Bayesian network can yield a deep understanding of a domain. Deep understanding means knowing, not merely how things behaved yesterday, but also how things will behave under new hypothetical circumstances tomorrow. More specifically, a Bayesian network allows explicit reasoning, and deliberate reasoning allows us to anticipate the consequences of actions we have not yet taken. Bayesian networks thus become an instrument for formal reasoning that is entirely transparent to stakeholders, as opposed to a more opaque, internalized process in the decision maker’s mind (or gut).




[Diagram: data from the domain under study are used to learn a Bayesian network; manipulating the network allows reasoning about a hypothetical domain.]

Figure 2: Using Bayesian networks for formal reasoning about consequences of hypothetical actions

Summary
In summary, Bayesian networks are a highly universal knowledge framework, and they provide a common reasoning language between stakeholders from different backgrounds, such as business executives and market research scientists. With all available knowledge unified, properly communicated and quite literally put into a “reasonable” format, Bayesian networks are a powerful tool for making decisions and shaping policies.




Technical Introduction

For the technical portion of this introduction, we defer to the words of Judea Pearl, who originally coined the term
“Bayesian network”. We are grateful to him for allowing us to use and adapt large sections from one of his technical
reports for our purposes (Pearl and Russell, 2000).

Introduction
Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such models are known as directed graphical models; within cognitive science and artificial intelligence, such models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702–1761), whose rule for updating probabilities in the light of new evidence is the foundation of the approach.

Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated case of continuous probability distributions. In the discrete case, Bayes’ theorem relates the conditional and marginal probabilities of events A and B, provided that the probability of B does not equal zero:

P(A|B) = P(B|A) P(A) / P(B)

In Bayes’ theorem, each probability has a conventional name:

• P(A) is the prior probability (or “unconditional” or “marginal” probability) of A. It is “prior” in the sense that it does
  not take into account any information about B; however, the event B need not occur after event A. In the nineteenth
  century, the unconditional probability P(A) in Bayes’s rule was called the “antecedent” probability; in deductive logic,
  the antecedent set of propositions and the inference rule imply consequences. The unconditional probability P(A) was
  called “a priori” by Ronald A. Fisher.

• P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from or depends upon the specified value of B.

• P(B|A) is the conditional probability of B given A. It is also called the likelihood.

• P(B) is the prior or marginal probability of B, and acts as a normalizing constant.

Bayes’ theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.
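To make the rule concrete, here is a minimal numeric sketch in Python; the disease-test numbers are invented purely for illustration:

```python
# A worked example of Bayes' theorem with invented numbers:
# A = "patient has the disease", B = "test is positive".
p_A = 0.01              # prior P(A)
p_B_given_A = 0.95      # likelihood P(B|A), the test's sensitivity
p_B_given_not_A = 0.05  # false-positive rate P(B|~A)

# Normalizing constant P(B) by the law of total probability
p_B = p_B_given_A * p_A + p_B_given_not_A * (1 - p_A)

# Posterior: P(A|B) = P(B|A) P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B
print(f"P(A|B) = {p_A_given_B:.3f}")  # ≈ 0.161
```

Note how the low prior keeps the posterior modest despite the test's high sensitivity; this is exactly the updating behavior that Bayesian networks automate across many variables.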

The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier, ad hoc rule-based schemes.




The nodes in a Bayesian network represent propositional variables of interest (e.g. the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event) and the links represent statistical (informational)3 or causal dependencies among the variables. The dependencies are quantified by conditional probabilities for each node given its parents in the network. The network supports the computation of the posterior probabilities of any subset of variables given evidence about any other subset.

Figure 1 shows a very simple Bayesian network consisting of only two nodes and one link, representing the joint probability distribution of the variables Eye Color and Hair Color in a given population. In this case, the conditional probabilities of Hair Color given the values of its parent, Eye Color, are provided in a table. It is important to point out that this Bayesian network does not contain any causal assumptions, i.e. we have no knowledge of the causal order between the variables, so the interpretation here should be merely statistical (informational).




Figure 1: A Bayesian network representing the statistical relationship between two variables

Figure 2 illustrates another simple yet typical Bayesian network. In contrast to the statistical relationships in Figure 1, the diagram in Figure 2 describes the causal relationships among the season of the year (X1), whether it’s raining (X2), whether the sprinkler is on (X3), whether the pavement is wet (X4), and whether the pavement is slippery (X5). Here the absence of a direct link between X1 and X5, for example, captures our understanding that there is no direct influence of season on slipperiness — the influence is mediated by the wetness of the pavement (if freezing is a possibility then a direct link could be added).




3   “informational” and “statistical” are treated here as equivalent concepts and can be used interchangeably.


Figure 2: A Bayesian network representing causal influences among five variables

Perhaps the most important aspect of Bayesian networks is that they are direct representations of the world, not of reasoning processes. The arrows in the diagram represent real causal connections and not the flow of information during reasoning (as in rule-based systems and neural networks). Reasoning processes can operate on Bayesian networks by propagating information in any direction. For example, if the sprinkler is on, then the pavement is probably wet (prediction, simulation); if someone slips on the pavement, that also provides evidence that it is wet (abduction, reasoning to a probable cause or diagnosis). On the other hand, if we see that the pavement is wet, that makes it more likely that the sprinkler is on or that it is raining (abduction); but if we then observe that the sprinkler is on, that reduces the likelihood that it is raining (explaining away). It is this last form of reasoning, explaining away, that is especially difficult to model in rule-based systems and neural networks in any natural way, because it seems to require the propagation of information in two directions.

Probabilistic Semantics
Any complete probabilistic model of a domain must, either explicitly or implicitly, represent the joint probability distribution — the probability of every possible event as defined by the combination of the values of all the variables. There are exponentially many such events, yet Bayesian networks achieve compactness by factoring the joint distribution into local, conditional distributions for each variable given its parents. If xi denotes some value of the variable Xi and pai denotes some set of values for the parents of Xi, then P(xi|pai) denotes this conditional distribution. For example, P(x4|x2,x3) is the probability of wetness given the values of sprinkler and rain. The global semantics of Bayesian networks specifies that the full joint distribution is given by the product

P(x1, …, xn) = ∏i P(xi|pai)                                                                                              (1)



In our example network, we have

P(x1, x2, x3, x4, x5) = P(x1) P(x2|x1) P(x3|x1) P(x4|x2, x3) P(x5|x4).                                                   (2)

It becomes clear that the number of parameters grows only linearly with the size of the network, i.e. the number of variables, whereas the size of each node’s conditional probability table grows exponentially with the number of its parents. Further savings can be achieved using compact parametric representations — such as noisy-OR models, decision trees, or neural networks — for the conditional distributions.
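To make Equations 1 and 2 concrete, the following minimal sketch encodes the sprinkler network of Figure 2 as plain Python dictionaries and evaluates the probability of one full configuration via the factored product. All probability values are invented for illustration:

```python
# Hypothetical CPTs for the sprinkler network of Figure 2 (numbers invented).
# x2..x5 are Booleans; each table stores P(variable=True | parents).
P_x1 = {"dry": 0.5, "rainy": 0.5}                 # P(season)
P_x2 = {"dry": 0.1, "rainy": 0.7}                 # P(rain=True | season)
P_x3 = {"dry": 0.6, "rainy": 0.1}                 # P(sprinkler=True | season)
P_x4 = {(False, False): 0.0, (False, True): 0.9,  # P(wet=True | rain, sprinkler)
        (True, False): 0.9, (True, True): 0.99}
P_x5 = {False: 0.0, True: 0.8}                    # P(slippery=True | wet)

def bern(p, value):
    """Probability that a Boolean variable with P(True)=p takes `value`."""
    return p if value else 1.0 - p

def joint(x1, x2, x3, x4, x5):
    """Equation 2: P(x1) P(x2|x1) P(x3|x1) P(x4|x2,x3) P(x5|x4)."""
    return (P_x1[x1]
            * bern(P_x2[x1], x2)
            * bern(P_x3[x1], x3)
            * bern(P_x4[(x2, x3)], x4)
            * bern(P_x5[x4], x5))

# Probability of: dry season, no rain, sprinkler on, pavement wet and slippery
print(joint("dry", False, True, True, True))  # 0.5*0.9*0.6*0.9*0.8 = 0.1944
```

Only 1 + 2 + 2 + 4 + 2 = 11 conditional probabilities are stored here, instead of the 2 × 2 × 2 × 2 × 2 - 1 = 31 free parameters an explicit joint table over these five variables would require.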

There is also an entirely equivalent local semantics, which asserts that each variable is independent of its nondescendants in the network given its parents. For example, the parents of X4 in Figure 2 are X2 and X3, and they render X4 independent of the remaining nondescendant, X1. That is,

P(x4|x1, x2, x3) = P(x4|x2, x3).


[Diagram: node X4 with parents X2 and X3, nondescendant X1, and descendant X5.]

Figure 3: Variable X4 is independent of its nondescendants, in this case X1, given its parents, X3 and X2

The collection of independence assertions formed in this way suffices to derive the global assertion in Equation 1, and vice versa. The local semantics is most useful in constructing Bayesian networks, because selecting as parents all the direct causes (or direct relationships) of a given variable invariably satisfies the local conditional independence conditions. The global semantics leads directly to a variety of algorithms for reasoning.

Evidential Reasoning
From the product specification in Equation 1, one can express the probability of any desired proposition in terms of the conditional probabilities specified in the network. For example, the probability that the sprinkler is on given that the pavement is slippery is

P(X3 = on | X5 = true) = P(X3 = on, X5 = true) / P(X5 = true)

= Σ_{x1,x2,x4} P(x1, x2, X3 = on, x4, X5 = true) / Σ_{x1,x2,x3,x4} P(x1, x2, x3, x4, X5 = true)

= Σ_{x1,x2,x4} P(x1) P(x2|x1) P(X3 = on|x1) P(x4|x2, X3 = on) P(X5 = true|x4) / Σ_{x1,x2,x3,x4} P(x1) P(x2|x1) P(x3|x1) P(x4|x2, x3) P(X5 = true|x4)


These expressions can often be simplified in ways that reflect the structure of the network itself. The first algorithms proposed for probabilistic calculations in Bayesian networks used a local distributed message-passing architecture, typical of many cognitive activities. Initially this approach was limited to tree-structured networks, but was later extended to general networks in Lauritzen and Spiegelhalter’s (1988) method of junction tree propagation. A number of other exact methods have been developed and can be found in recent textbooks.
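To make the enumeration above concrete, the following self-contained sketch recomputes P(X3 = on | X5 = true) by brute force, reusing the invented conditional probability tables from the earlier sketch:

```python
import itertools

# Enumeration-based inference on the sprinkler network; same hypothetical
# CPTs as in the earlier sketch (all numbers invented for illustration).
P_x1 = {"dry": 0.5, "rainy": 0.5}
P_x2 = {"dry": 0.1, "rainy": 0.7}
P_x3 = {"dry": 0.6, "rainy": 0.1}
P_x4 = {(False, False): 0.0, (False, True): 0.9,
        (True, False): 0.9, (True, True): 0.99}
P_x5 = {False: 0.0, True: 0.8}

def bern(p, v):
    return p if v else 1.0 - p

def joint(x1, x2, x3, x4, x5):
    return (P_x1[x1] * bern(P_x2[x1], x2) * bern(P_x3[x1], x3)
            * bern(P_x4[(x2, x3)], x4) * bern(P_x5[x4], x5))

# Numerator: sum over x1, x2, x4 with X3 = on (True) and X5 = true
num = sum(joint(x1, x2, True, x4, True)
          for x1 in P_x1
          for x2 in (False, True)
          for x4 in (False, True))

# Denominator: sum over x1, x2, x3, x4 with X5 = true
den = sum(joint(x1, x2, x3, x4, True)
          for x1 in P_x1
          for x2, x3, x4 in itertools.product((False, True), repeat=3))

print(f"P(X3=on | X5=true) = {num / den:.3f}")
```

For a network of this size exhaustive enumeration is trivial; the junction tree and sampling methods discussed here exist precisely because this brute-force sum grows exponentially with the number of variables.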

It is easy to show that reasoning in Bayesian networks subsumes the satisfiability problem in propositional logic and hence is NP-hard. Monte Carlo simulation methods can be used for approximate inference (Pearl, 1988), giving gradually improving estimates as sampling proceeds. These methods use local message propagation on the original network structure, unlike junction tree methods. Alternatively, variational methods provide bounds on the true probability.

Learning Bayesian Networks
The conditional probabilities P(xi|pai) of a given structure can be estimated from data by using the maximum likelihood approach (observed frequencies). They can also be updated continuously from observational data using gradient-based or EM methods that use just local information derived from inference — in much the same way as weights are adjusted in neural networks.
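As a minimal illustration of the maximum likelihood approach, the following sketch estimates P(x4|x2, x3) as observed frequencies from a small, invented set of (rain, sprinkler, wet) records:

```python
from collections import Counter

# Maximum-likelihood estimation of P(wet | rain, sprinkler) from records
# of (rain, sprinkler, wet); the data below are invented for illustration.
records = [
    (True, False, True), (True, False, True), (True, False, False),
    (False, True, True), (False, True, True), (False, False, False),
    (True, True, True), (False, True, False), (False, False, False),
]

counts = Counter((r, s) for r, s, _ in records)          # N(rain, sprinkler)
wet_counts = Counter((r, s) for r, s, w in records if w)  # N(rain, sprinkler, wet)

# Observed frequency: P(wet=True | rain=r, sprinkler=s) = N(r,s,wet) / N(r,s)
for parents in counts:
    p = wet_counts[parents] / counts[parents]
    print(f"P(wet=True | rain={parents[0]}, sprinkler={parents[1]}) = {p:.2f}")
```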

It is also possible to machine-learn the structure of a Bayesian network, and two families of methods are available for that purpose. The first family, the constraint-based algorithms, is based on the probabilistic semantics of Bayesian networks. Links are added or deleted according to the results of statistical tests, which identify marginal and conditional independencies. The second approach, the score-based algorithms, is based on a metric measuring the quality of candidate networks with respect to the observed data. This metric trades off network complexity against degree of fit to the data, typically expressed as the likelihood of the data given the network.
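To give a flavor of the score-based approach, here is a toy sketch (not any particular published algorithm) that compares an edgeless network against the structure A → B on invented binary data, using the BIC score, i.e. maximized log-likelihood minus a complexity penalty:

```python
import math
from collections import Counter

# Invented binary data over (A, B) with a strong association.
data = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10 + [(1, 1)] * 40
n = len(data)

def bic_independent(data):
    """Score of the edgeless structure: A and B modeled independently."""
    ll = 0.0
    for i in (0, 1):  # marginal MLEs for A and B separately
        c = Counter(row[i] for row in data)
        ll += sum(k * math.log(k / n) for k in c.values())
    return ll - 2 * 0.5 * math.log(n)  # two free parameters

def bic_dependent(data):
    """Score of A -> B: P(A) plus P(B|A), i.e. a full joint over (A, B)."""
    c = Counter(data)
    ll = sum(k * math.log(k / n) for k in c.values())
    return ll - 3 * 0.5 * math.log(n)  # three free parameters

print(bic_independent(data), bic_dependent(data))
# The dependent structure wins because A and B are strongly correlated;
# the penalty protects against overfitting when they are not.
```

In a real structure-learning algorithm such a score would drive a search over many candidate graphs; here it merely ranks two candidates.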

As a substrate for learning, Bayesian networks have the advantage that it is relatively easy to encode prior knowledge in network form, either by fixing portions of the structure or by using prior distributions over the network parameters. Such prior knowledge can allow a system to learn accurate models from much less data than are required for tabula rasa approaches.

Uncertainty Over Time

Entities that live in a changing environment must keep track of variables whose values change over time. Dynamic Bayesian networks capture this process by representing multiple copies of the state variables, one for each time step. A set of variables Xt denotes the world state at time t and a set of sensor variables Et denotes the observations available at time t. The sensor model P(Et|Xt) is encoded in the conditional probability distributions for the observable variables, given the state variables. The transition model P(Xt+1|Xt) relates the state at time t to the state at time t+1. Keeping track of the world means computing the current probability distribution over world states given all past observations, i.e., P(Xt|E1,…,Et). Dynamic Bayesian networks are strictly more expressive than other temporal probability models such as hidden Markov models and Kalman filters.
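The following sketch illustrates this bookkeeping for a two-state dynamic Bayesian network (which reduces to a hidden Markov model here); the transition and sensor probabilities, as well as the observation sequence, are invented for illustration:

```python
# A minimal filtering sketch for a two-state dynamic Bayesian network.
transition = {True: {True: 0.7, False: 0.3},   # P(X_{t+1} | X_t)
              False: {True: 0.3, False: 0.7}}
sensor = {True: {True: 0.9, False: 0.1},       # P(E_t | X_t)
          False: {True: 0.2, False: 0.8}}

def filter_step(belief, evidence):
    """One step of P(X_t | e_1..t): predict with the transition model,
    then weight by the sensor model and renormalize."""
    predicted = {x: sum(belief[x_prev] * transition[x_prev][x]
                        for x_prev in belief)
                 for x in (True, False)}
    updated = {x: sensor[x][evidence] * predicted[x] for x in predicted}
    z = sum(updated.values())
    return {x: p / z for x, p in updated.items()}

belief = {True: 0.5, False: 0.5}   # uniform prior over the initial state
for e in (True, True, False):      # a short, invented observation sequence
    belief = filter_step(belief, e)
    print(belief)
```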

Causal Networks
Most probabilistic models, including general Bayesian networks, describe a distribution over possible observed events — as in Equation 1 — but say nothing about what will happen if a certain intervention occurs. For example, what if I turn the sprinkler on? What effect does that have on the season, or on the connection between wetness and slipperiness? A causal network, intuitively speaking, is a Bayesian network with the added property that the parents of each node are its direct causes — as in Figure 2. In such a network, the result of an intervention is obvious: the sprinkler node is set to X3 = on and the causal link between the season X1 and the sprinkler X3 is removed (see Figure 4). All other causal links and conditional probabilities remain intact, so the new model is


P(x1, x2, x4, x5) = P(x1) P(x2|x1) P(x4|x2, X3 = on) P(x5|x4).

Notice that this differs from observing that X3=on, which would result in a new model that included the term
P(X3=on|x1). This mirrors the difference between seeing and doing: after observing that the sprinkler is on, we wish to
infer that the season is dry, that it probably did not rain, and so on; an arbitrary decision to turn the sprinkler on should
not result in any such beliefs.




Figure 4: A causal network reflecting the intervention, X3 = on
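The following sketch contrasts the two computations on the sprinkler network, again with invented probabilities: conditioning on the observation X3 = on changes our belief about the season, whereas the intervention do(X3 = on) leaves it untouched:

```python
# Seeing vs. doing on the sprinkler network, with the same hypothetical
# CPTs as in the earlier sketches (numbers invented for illustration).
P_x1 = {"dry": 0.5, "rainy": 0.5}   # P(season)
P_x3 = {"dry": 0.6, "rainy": 0.1}   # P(sprinkler=on | season)

# Observation: condition on X3 = on via Bayes' rule; the season is informative.
posterior = {s: P_x1[s] * P_x3[s] for s in P_x1}
z = sum(posterior.values())
posterior = {s: p / z for s, p in posterior.items()}
print("P(season | X3 = on observed):", posterior)   # rainy ≈ 0.14

# Intervention: do(X3 = on) deletes the season -> sprinkler link, so the
# marginal over season is simply the prior; seeing is not doing.
print("P(season | do(X3 = on)):", P_x1)             # rainy = 0.50
```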

Causal networks are more properly defined, then, as Bayesian networks in which the correct probability model after intervening to fix any node’s value is given simply by deleting links from the node’s parents. For example, Fire → Smoke is a causal network whereas Smoke → Fire is not, even though both networks are equally capable of representing any joint distribution on the two variables. Causal networks model the environment as a collection of stable component mechanisms. These mechanisms may be reconfigured locally by interventions, with correspondingly local changes in the model. This, in turn, allows causal networks to be used very naturally for prediction by an agent that is considering various courses of action.


Causal Discovery
One of the most exciting prospects in recent years has been the possibility of using Bayesian networks to discover causal structures in raw statistical data — a task previously considered impossible without controlled experiments. Consider, for example, the following intransitive pattern of dependencies among three events: A and B are dependent, B and C are dependent, yet A and C are independent. If you ask a person to supply an example of three such events, the example would invariably portray A and C as two independent causes and B as their common effect, namely, A → B ← C. (For instance, A and C could be the outcomes of two fair coins, and B represents a bell that rings whenever either coin comes up heads.)




Figure 5: Causal model for variables A, C and B, representing two fair coins and a bell respectively
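A quick simulation (a hypothetical sketch of the coins-and-bell example) confirms the pattern: A and C are independent, yet each is dependent on B:

```python
import random

# A and C are fair coins; B rings if either comes up heads. This produces
# the intransitive dependence pattern described above.
random.seed(0)
trials = [(random.random() < 0.5, random.random() < 0.5) for _ in range(100_000)]
samples = [(a, a or c, c) for a, c in trials]   # (A, B, C)

def p(pred):
    """Empirical probability of the event `pred` over the samples."""
    return sum(pred(s) for s in samples) / len(samples)

print(p(lambda s: s[0]))   # P(A) ≈ 0.5
# A and C independent: P(A,C) - P(A)P(C) ≈ 0
print(p(lambda s: s[0] and s[2]) - p(lambda s: s[0]) * p(lambda s: s[2]))
# A and B dependent: P(A,B) - P(A)P(B) ≈ 0.125 > 0
print(p(lambda s: s[0] and s[1]) - p(lambda s: s[0]) * p(lambda s: s[1]))
```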

Fitting this dependence pattern with a scenario in which B is the cause and A and C are the effects is mathematically feasible but very unnatural, because it must entail fine-tuning of the probabilities involved; the desired dependence pattern will be destroyed as soon as the probabilities undergo a slight change.

Such thought experiments tell us that certain patterns of dependency, which are totally void of temporal information, are conceptually characteristic of certain causal directionalities and not others. When put together systematically, such patterns can be used to infer causal structures from raw data and to guarantee that any alternative structure compatible with the data must be less stable than the one(s) inferred; namely, slight fluctuations in parameters will render that structure incompatible with the data.




References
Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2011. http://www.cs.ucl.ac.uk/staff/d.barber/brml.
Barnard, G. A., and T. Bayes. “Studies in the History of Probability and Statistics: IX. Thomas Bayes's Essay Towards Solving a Problem in the Doctrine of Chances.” Biometrika 45, no. 3 (1958): 293–315.
Darwiche, Adnan. “Bayesian Networks.” Communications of the ACM 53, no. 12 (December 2010): 80–90.
Hilbert, M., and P. Lopez. “The World's Technological Capacity to Store, Communicate, and Compute Information.” Science (February 2011). http://www.sciencemag.org/cgi/doi/10.1126/science.1200970.
Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. The MIT Press, 2009.
Lauritzen, S. L., and D. J. Spiegelhalter. “Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems.” Journal of the Royal Statistical Society, Series B 50, no. 2 (1988): 157–224.
Neapolitan, Richard E., and Xia Jiang. Probabilistic Methods for Financial and Marketing Informatics. 1st ed. Morgan Kaufmann, 2007.
Pearl, Judea. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
Pearl, Judea. Causality: Models, Reasoning and Inference. Cambridge University Press, 2000.
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.
Pearl, Judea, and Stuart Russell. Bayesian Networks. UCLA Cognitive Systems Laboratory, November 2000. http://bayes.cs.ucla.edu/csl_papers.html.
Spirtes, Peter, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. 2nd ed. The MIT Press, 2001.




Contact Information

Conrady Applied Science, LLC
312 Hamlet’s End Way
Franklin, TN 37067
USA
+1 888-386-8383
info@conradyscience.com
www.conradyscience.com

Bayesia SAS
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
+33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com





Weitere ähnliche Inhalte

Andere mochten auch

An Introduction to Causal Discovery, a Bayesian Network Approach
An Introduction to Causal Discovery, a Bayesian Network ApproachAn Introduction to Causal Discovery, a Bayesian Network Approach
An Introduction to Causal Discovery, a Bayesian Network ApproachCOST action BM1006
 
Lecture 5: Bayesian Classification
Lecture 5: Bayesian ClassificationLecture 5: Bayesian Classification
Lecture 5: Bayesian ClassificationMarina Santini
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications Ahmed_hashmi
 
Bayesian Classification
Bayesian ClassificationBayesian Classification
Bayesian ClassificationGang Tao
 
PageRank for anomaly detection - Hadoop Summit
PageRank for anomaly detection - Hadoop SummitPageRank for anomaly detection - Hadoop Summit
PageRank for anomaly detection - Hadoop SummitOfer Mendelevitch
 
Soliton Stability of the 2D Nonlinear Schrödinger Equation
Soliton Stability of the 2D Nonlinear Schrödinger EquationSoliton Stability of the 2D Nonlinear Schrödinger Equation
Soliton Stability of the 2D Nonlinear Schrödinger Equationsheilsn
 
An Algorithm for Bayesian Network Construction from Data
An Algorithm for Bayesian Network Construction from DataAn Algorithm for Bayesian Network Construction from Data
An Algorithm for Bayesian Network Construction from Databutest
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learningbutest
 
Seminário redes bayesianas
Seminário redes bayesianasSeminário redes bayesianas
Seminário redes bayesianasiaudesc
 
Hadoop at ayasdi
Hadoop at ayasdiHadoop at ayasdi
Hadoop at ayasdiMohit Jaggi
 
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010Yahoo Developer Network
 
Cordex India - SAS Forum India: Loss Data Consortium
Cordex India - SAS Forum India: Loss Data ConsortiumCordex India - SAS Forum India: Loss Data Consortium
Cordex India - SAS Forum India: Loss Data ConsortiumSAS Institute India Pvt. Ltd
 

Andere mochten auch (17)

An Introduction to Causal Discovery, a Bayesian Network Approach
An Introduction to Causal Discovery, a Bayesian Network ApproachAn Introduction to Causal Discovery, a Bayesian Network Approach
An Introduction to Causal Discovery, a Bayesian Network Approach
 
Lecture 5: Bayesian Classification
Lecture 5: Bayesian ClassificationLecture 5: Bayesian Classification
Lecture 5: Bayesian Classification
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
Neural network & its applications
Neural network & its applications Neural network & its applications
Neural network & its applications
 
Bayesian Classification
Bayesian ClassificationBayesian Classification
Bayesian Classification
 
PageRank for anomaly detection - Hadoop Summit
PageRank for anomaly detection - Hadoop SummitPageRank for anomaly detection - Hadoop Summit
PageRank for anomaly detection - Hadoop Summit
 
Análise bayesiana de decisões aspectos práticos
Análise bayesiana de decisões   aspectos práticosAnálise bayesiana de decisões   aspectos práticos
Análise bayesiana de decisões aspectos práticos
 
Soliton Stability of the 2D Nonlinear Schrödinger Equation
Soliton Stability of the 2D Nonlinear Schrödinger EquationSoliton Stability of the 2D Nonlinear Schrödinger Equation
Soliton Stability of the 2D Nonlinear Schrödinger Equation
 
Lecture 2
Lecture 2Lecture 2
Lecture 2
 
Redes Bayesianas
Redes BayesianasRedes Bayesianas
Redes Bayesianas
 
An Algorithm for Bayesian Network Construction from Data
An Algorithm for Bayesian Network Construction from DataAn Algorithm for Bayesian Network Construction from Data
An Algorithm for Bayesian Network Construction from Data
 
Introduction to machine learning
Introduction to machine learningIntroduction to machine learning
Introduction to machine learning
 
Seminário redes bayesianas
Seminário redes bayesianasSeminário redes bayesianas
Seminário redes bayesianas
 
Hadoop at ayasdi
Hadoop at ayasdiHadoop at ayasdi
Hadoop at ayasdi
 
Bayes Belief Network
Bayes Belief NetworkBayes Belief Network
Bayes Belief Network
 
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
Exact Inference in Bayesian Networks using MapReduce__HadoopSummit2010
 
Cordex India - SAS Forum India: Loss Data Consortium
Cordex India - SAS Forum India: Loss Data ConsortiumCordex India - SAS Forum India: Loss Data Consortium
Cordex India - SAS Forum India: Loss Data Consortium
 

Mehr von Bayesia USA

BayesiaLab_Book_V18 (1)
BayesiaLab_Book_V18 (1)BayesiaLab_Book_V18 (1)
BayesiaLab_Book_V18 (1)Bayesia USA
 
Loyalty_Driver_Analysis_V13b
Loyalty_Driver_Analysis_V13bLoyalty_Driver_Analysis_V13b
Loyalty_Driver_Analysis_V13bBayesia USA
 
vehicle_safety_v20b
vehicle_safety_v20bvehicle_safety_v20b
vehicle_safety_v20bBayesia USA
 
Impact Analysis V12
Impact Analysis V12Impact Analysis V12
Impact Analysis V12Bayesia USA
 
Causality for Policy Assessment and 
Impact Analysis
Causality for Policy Assessment and 
Impact AnalysisCausality for Policy Assessment and 
Impact Analysis
Causality for Policy Assessment and 
Impact AnalysisBayesia USA
 
Vehicle Size, Weight, and Injury Risk: High-Dimensional Modeling and
 Causal ...
Vehicle Size, Weight, and Injury Risk: High-Dimensional Modeling and
 Causal ...Vehicle Size, Weight, and Injury Risk: High-Dimensional Modeling and
 Causal ...
Vehicle Size, Weight, and Injury Risk: High-Dimensional Modeling and
 Causal ...Bayesia USA
 
The Bayesia Portfolio of Research Software
The Bayesia Portfolio of Research SoftwareThe Bayesia Portfolio of Research Software
The Bayesia Portfolio of Research SoftwareBayesia USA
 
Bayesian Networks & BayesiaLab
Bayesian Networks & BayesiaLabBayesian Networks & BayesiaLab
Bayesian Networks & BayesiaLabBayesia USA
 
Causal Inference and Direct Effects
Causal Inference and Direct EffectsCausal Inference and Direct Effects
Causal Inference and Direct EffectsBayesia USA
 
Knowledge Discovery in the Stock Market
Knowledge Discovery in the Stock MarketKnowledge Discovery in the Stock Market
Knowledge Discovery in the Stock MarketBayesia USA
 
Paradoxes and Fallacies - Resolving some well-known puzzles with Bayesian net...
Paradoxes and Fallacies - Resolving some well-known puzzles with Bayesian net...Paradoxes and Fallacies - Resolving some well-known puzzles with Bayesian net...
Paradoxes and Fallacies - Resolving some well-known puzzles with Bayesian net...Bayesia USA
 
Probabilistic Latent Factor Induction and
 Statistical Factor Analysis
Probabilistic Latent Factor Induction and
 Statistical Factor AnalysisProbabilistic Latent Factor Induction and
 Statistical Factor Analysis
Probabilistic Latent Factor Induction and
 Statistical Factor AnalysisBayesia USA
 
Microarray Analysis with BayesiaLab
Microarray Analysis with BayesiaLabMicroarray Analysis with BayesiaLab
Microarray Analysis with BayesiaLabBayesia USA
 
Breast Cancer Diagnostics with Bayesian Networks
Breast Cancer Diagnostics with Bayesian NetworksBreast Cancer Diagnostics with Bayesian Networks
Breast Cancer Diagnostics with Bayesian NetworksBayesia USA
 
Modeling Vehicle Choice and Simulating Market Share with Bayesian Networks
Modeling Vehicle Choice and Simulating Market Share with Bayesian NetworksModeling Vehicle Choice and Simulating Market Share with Bayesian Networks
Modeling Vehicle Choice and Simulating Market Share with Bayesian NetworksBayesia USA
 
Driver Analysis and Product Optimization with Bayesian Networks
Driver Analysis and Product Optimization with Bayesian NetworksDriver Analysis and Product Optimization with Bayesian Networks
Driver Analysis and Product Optimization with Bayesian NetworksBayesia USA
 
BayesiaLab 5.0 Introduction
BayesiaLab 5.0 IntroductionBayesiaLab 5.0 Introduction
BayesiaLab 5.0 IntroductionBayesia USA
 
Car And Driver Hk Interview
Car And Driver Hk InterviewCar And Driver Hk Interview
Car And Driver Hk InterviewBayesia USA
 

Mehr von Bayesia USA (18)

BayesiaLab_Book_V18 (1)
BayesiaLab_Book_V18 (1)BayesiaLab_Book_V18 (1)
BayesiaLab_Book_V18 (1)
 
Loyalty_Driver_Analysis_V13b
Loyalty_Driver_Analysis_V13bLoyalty_Driver_Analysis_V13b
Loyalty_Driver_Analysis_V13b
 
vehicle_safety_v20b
vehicle_safety_v20bvehicle_safety_v20b
vehicle_safety_v20b
 
Impact Analysis V12
Impact Analysis V12Impact Analysis V12
Impact Analysis V12
 
Causality for Policy Assessment and 
Impact Analysis
Causality for Policy Assessment and 
Impact AnalysisCausality for Policy Assessment and 
Impact Analysis
Causality for Policy Assessment and 
Impact Analysis
 
Vehicle Size, Weight, and Injury Risk: High-Dimensional Modeling and
 Causal ...
Vehicle Size, Weight, and Injury Risk: High-Dimensional Modeling and
 Causal ...Vehicle Size, Weight, and Injury Risk: High-Dimensional Modeling and
 Causal ...
Vehicle Size, Weight, and Injury Risk: High-Dimensional Modeling and
 Causal ...
 
The Bayesia Portfolio of Research Software
The Bayesia Portfolio of Research SoftwareThe Bayesia Portfolio of Research Software
The Bayesia Portfolio of Research Software
 
Bayesian Networks & BayesiaLab
Bayesian Networks & BayesiaLabBayesian Networks & BayesiaLab
Bayesian Networks & BayesiaLab
 
Causal Inference and Direct Effects
Causal Inference and Direct EffectsCausal Inference and Direct Effects
Causal Inference and Direct Effects
 
Knowledge Discovery in the Stock Market
Knowledge Discovery in the Stock MarketKnowledge Discovery in the Stock Market
Knowledge Discovery in the Stock Market
 
Paradoxes and Fallacies - Resolving some well-known puzzles with Bayesian net...
Paradoxes and Fallacies - Resolving some well-known puzzles with Bayesian net...Paradoxes and Fallacies - Resolving some well-known puzzles with Bayesian net...
Paradoxes and Fallacies - Resolving some well-known puzzles with Bayesian net...
 
Probabilistic Latent Factor Induction and
 Statistical Factor Analysis
Probabilistic Latent Factor Induction and
 Statistical Factor AnalysisProbabilistic Latent Factor Induction and
 Statistical Factor Analysis
Probabilistic Latent Factor Induction and
 Statistical Factor Analysis
 
Microarray Analysis with BayesiaLab
Microarray Analysis with BayesiaLabMicroarray Analysis with BayesiaLab
Microarray Analysis with BayesiaLab
 
Breast Cancer Diagnostics with Bayesian Networks
Breast Cancer Diagnostics with Bayesian NetworksBreast Cancer Diagnostics with Bayesian Networks
Breast Cancer Diagnostics with Bayesian Networks
 
Modeling Vehicle Choice and Simulating Market Share with Bayesian Networks
Modeling Vehicle Choice and Simulating Market Share with Bayesian NetworksModeling Vehicle Choice and Simulating Market Share with Bayesian Networks
Modeling Vehicle Choice and Simulating Market Share with Bayesian Networks
 
Driver Analysis and Product Optimization with Bayesian Networks
Driver Analysis and Product Optimization with Bayesian NetworksDriver Analysis and Product Optimization with Bayesian Networks
Driver Analysis and Product Optimization with Bayesian Networks
 
BayesiaLab 5.0 Introduction
BayesiaLab 5.0 IntroductionBayesiaLab 5.0 Introduction
BayesiaLab 5.0 Introduction
 
Car And Driver Hk Interview
Car And Driver Hk InterviewCar And Driver Hk Interview
Car And Driver Hk Interview
 

Kürzlich hochgeladen

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 

Kürzlich hochgeladen (20)

DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 

Introduction to Bayesian Networks - Practical and Technical Perspectives

  • 1. Introduction to Bayesian Networks Practical and Technical Perspectives Stefan Conrady, stefan.conrady@conradyscience.com Dr. Lionel Jouffe, jouffe@bayesia.com February 15, 2011 Conrady Applied Science, LLC - Bayesia’s North American Partner for Sales and Consulting
  • 2. Introduction to Bayesian Networks Table of Contents Introduction Bayesian Networks from a Practitioner’s Perspective Knowledge Uni cation 2 Knowledge Representation & Communication 3 Reasoning 4 Summary 4 Technical Introduction Introduction 5 Probabilistic Semantics 7 Evidential Reasoning 8 Learning Bayesian Networks 9 Causal Networks 10 Causal Discovery 11 References 12 Contact Information 13 Conrady Applied Science, LLC 13 Bayesia SAS 13 www.conradyscience.com | www.bayesia.com i
  • 3. Introduction to Bayesian Networks Introduction A simplistic analogy may help to jump-start our introduction to Bayesian networks: In the same way one can use a phone book — without having to memorize all the names and numbers, one can deliberately (and correctly) reason with the domain knowledge contained in a Bayesian network — without having to become a domain expert. Over the last 25 years, Bayesian networks have emerged as a practically feasible form of knowledge representation, pri- marily through the seminal works of UCLA Professor Judea Pearl. With the ever-increasing computing power, Bayesian networks are now a powerful tool for deep understanding of very complex, high-dimensional problem domains. Their computational ef ciency and inherently visual structure make Bayesian networks attractive for exploring and explaining complex problems. However, Bayesian networks are somewhat of a disruptive technology, as they challenge a number common practices in the world of business and science. So, beyond the world of academia, promoting Bayesian networks as a new tool for practical knowledge management and reasoning still requires signi cant persuasion efforts. With this short paper, we attempt to provide a concise justi cation, both from a practitioner’s and a technical perspective1 , why Bayesian net- works are so important. 1 Author notes: portions of the technical chapter of this paper are adapted, with permission, from Pearl and Russell (2000). www.conradyscience.com | www.bayesia.com 1
  • 4. Introduction to Bayesian Networks - Practitioner's Perspective Bayesian Networks from a Practitioner’s Perspective In our quest to “evangelize” about Bayesian networks (and the BayesiaLab software package2 ), we are often limited to presenting our case in just a few PowerPoint slides and only using a few catchy bullet points. In this context, and this is obviously not comprehensive, we selected the following headings to highlight the key bene ts of Bayesian networks to research practitioners and business executives: 1. Knowledge Uni cation 2. Knowledge Representation & Communication 3. Reasoning Under these headlines, the following paragraphs are meant to provide a glimpse of the powerful properties and wide- ranging practical advantages of Bayesian networks. Knowledge Uni cation Many elds are characterized by the proverbial con ict between “art” and “science.” This manifests itself in debates, such as the one about evidence-based medicine versus the prevailing practice of physicians with years of experience. Even more common is the discrepancy between scienti cally derived market research insights and expertise-based mar- keting decisions of business executives. Traditional frameworks typically don't facilitate leveraging the knowledge avail- able on both sides. Bayesian networks have the ability of capturing both qualitative knowledge (through their network structure), and quantitative knowledge (through their parameters). While expert knowledge from practitioners is mostly qualitative, it can be used directly for building the structure of a Bayesian network. In addition, data mining algorithms can encode both qualitative and quantitative knowledge and encode both forms simultaneously in a Bayesian network. As a result, Bayesian networks can bridge the gap between different types of knowledge and serve to unify all available knowledge into a single form of representation. 2 Developed by Bayesia SAS, BayesiaLab is a comprehensive software package designed for learning, editing and analyz- ing Bayesian networks. It is available in North America from Conrady Applied Science, LLC. www.conradyscience.com | www.bayesia.com 2
  • 5. Introduction to Bayesian Networks - Practitioner's Perspective Domain “Art” “Science” Expert Mathematical Knowledge Representation Qualitative Quantitative Bayesian Network Uni ed Knowledge Representation Figure 1: Knowledge uni cation with Bayesian networks Knowledge Representation & Communication Relaying knowledge typically includes an array of factual and causal statements. In natural language communication, such statements will often contain generalizations, approximations, and implicit assumptions regarding their probability. Such simpli cations are widely accepted in casual conversation or in media headlines. However, for more precise communication, which is required in science or business, spelling out exceptions, uncertainty and conditions regarding statements about knowledge is necessary. With natural language expressions, however, this can become very cumbersome, especially when it concerns a complex domain (hence the substantial girth of many text- books). Also, the need for precision in describing complex domains is often at odds with the modern business culture, which, as already mentioned in the introduction, dictates communication via PowerPoint in few, concise bullet points. Needless to say, the complex dynamics of a domain can thus often not be relayed correctly to policy makers and other stakeholders. Bayesian networks are very well suited for capturing probabilistic and incomplete causal knowledge regarding a do- main. They can easily accommodate exceptions to a rule, e.g. “all swans are white, except for a certain species,” as well as partial causal information, for instance “alcohol caused the accident,” even though more factors may actually be in- volved, such as poor road conditions. Through its structure and its parameters, a Bayesian networks comprehensively describes what is known about a par- ticular domain and especially the interactions of all the variables contained within that domain. As such, a Bayesian network is a “Portable Knowledge Format,” that can succinctly and compactly communicate the state of the domain as well as its dynamics. www.conradyscience.com | www.bayesia.com 3
Reasoning

By representing these interactions, a (correctly formulated) Bayesian network can yield a deep understanding of a domain. Deep understanding means knowing not merely how things behaved yesterday, but also how things will behave under new, hypothetical circumstances tomorrow. More specifically, a Bayesian network supports explicit reasoning, and deliberate reasoning allows us to anticipate the consequences of actions we have not yet taken. Bayesian networks thus become an instrument for formal reasoning that is entirely transparent to stakeholders, as opposed to a more opaque, internalized process in the decision maker’s mind (or gut).

Figure 2: Using Bayesian networks for formal reasoning about the consequences of hypothetical actions (data from the domain under study yields a Bayesian network; manipulating the network stands in for a hypothetical manipulation of the domain)

Summary

In summary, Bayesian networks are a highly universal knowledge framework, and they provide a common reasoning language between stakeholders from different backgrounds, such as business executives and market research scientists. With all available knowledge unified, properly communicated, and quite literally put into a “reasonable” format, Bayesian networks are a powerful tool for making decisions and shaping policies.
Technical Introduction

For the technical portion of this introduction, we defer to the words of Judea Pearl, who originally coined the term “Bayesian network.” We are grateful to him for allowing us to use and adapt large sections from one of his technical reports for our purposes (Pearl and Russell, 2000).

Introduction

Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such models are known as directed graphical models; within cognitive science and artificial intelligence, they are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702–1761), whose rule for updating probabilities in the light of new evidence is the foundation of the approach.

Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated case of continuous probability distributions. In the discrete case, Bayes’ theorem relates the conditional and marginal probabilities of events A and B, provided that the probability of B does not equal zero:

P(A|B) = P(B|A) P(A) / P(B)

In Bayes’ theorem, each probability has a conventional name:

• P(A) is the prior probability (or “unconditional” or “marginal” probability) of A. It is “prior” in the sense that it does not take into account any information about B; note, however, that the event B need not occur after event A. In the nineteenth century, the unconditional probability P(A) in Bayes’ rule was called the “antecedent” probability; in deductive logic, the antecedent set of propositions and the inference rule imply consequences. The unconditional probability P(A) was called “a priori” by Ronald A. Fisher.

• P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from, or depends upon, the specified value of B.

• P(B|A) is the conditional probability of B given A. It is also called the likelihood.

• P(B) is the prior or marginal probability of B, and acts as a normalizing constant.

Bayes’ theorem in this form shows how the conditional probability of event A given B is related to the converse conditional probability of B given A.
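To make the rule concrete, here is a minimal numeric sketch in Python. The diagnostic-test scenario and all numbers are hypothetical illustrations, not taken from the text:

```python
# A minimal numeric sketch of Bayes' rule, using hypothetical numbers
# for a diagnostic-test scenario (not taken from the text above).

p_a = 0.01              # P(A): prior probability of a condition
p_b_given_a = 0.95      # P(B|A): likelihood of a positive test given the condition
p_b_given_not_a = 0.05  # P(B|~A): false-positive rate

# P(B) by the law of total probability (the normalizing constant)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1.0 - p_a)

# Bayes' rule: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(f"P(A|B) = {p_a_given_b:.4f}")  # ~0.1610: the posterior remains modest
```

Even with a fairly accurate test, the small prior keeps the posterior well below certainty, which is exactly the kind of calculation Bayesian networks automate at scale.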
The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier, ad hoc rule-based schemes.

The nodes in a Bayesian network represent propositional variables of interest (e.g. the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event), and the links represent statistical (informational)3 or causal dependencies among the variables. The dependencies are quantified by conditional probabilities for each node given its parents in the network. The network supports the computation of the posterior probabilities of any subset of variables given evidence about any other subset.

Figure 1 shows a very simple Bayesian network consisting of only two nodes and one link, representing the joint probability distribution of the variables Eye Color and Hair Color in a given population. In this case, the conditional probabilities of Hair Color given the values of its parent, Eye Color, are provided in a table. It is important to point out that this Bayesian network does not contain any causal assumptions, i.e. we have no knowledge of the causal order between the variables, so the interpretation here should be merely statistical (informational).

Figure 1: A Bayesian network representing the statistical relationship between two variables

Figure 2 illustrates another simple yet typical Bayesian network. In contrast to the statistical relationships in Figure 1, the diagram in Figure 2 describes the causal relationships among the season of the year (X1), whether it is raining (X2), whether the sprinkler is on (X3), whether the pavement is wet (X4), and whether the pavement is slippery (X5). Here the absence of a direct link between X1 and X5, for example, captures our understanding that there is no direct influence of season on slipperiness — the influence is mediated by the wetness of the pavement (if freezing were a possibility, a direct link could be added).

3 “Informational” and “statistical” are treated here as equivalent concepts and can be used interchangeably.
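As a rough illustration of this structure, the sprinkler network of Figure 2 can be sketched in Python as a dictionary mapping each node to its parents and a conditional probability table (CPT). The variables are simplified to binary and the probabilities are invented for illustration; this is not BayesiaLab’s internal representation:

```python
# A sketch of the sprinkler network of Figure 2: each node lists its
# parents and a CPT mapping parent-value tuples to P(node = True).
# All variables are simplified to binary, and the probabilities are
# hypothetical illustrations, not values from the original paper.

network = {
    # node: (parents, CPT)
    "season_dry": ((), {(): 0.5}),                                   # X1, binarized
    "rain":       (("season_dry",), {(True,): 0.1, (False,): 0.6}),  # X2
    "sprinkler":  (("season_dry",), {(True,): 0.5, (False,): 0.1}),  # X3
    "wet":        (("rain", "sprinkler"), {                          # X4
        (True, True): 0.99, (True, False): 0.9,
        (False, True): 0.9, (False, False): 0.01}),
    "slippery":   (("wet",), {(True,): 0.7, (False,): 0.01}),        # X5
}

def p_node(network, node, value, assignment):
    """P(node = value | parents), read off the node's CPT."""
    parents, cpt = network[node]
    p_true = cpt[tuple(assignment[p] for p in parents)]
    return p_true if value else 1.0 - p_true
```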
Figure 2: A Bayesian network representing causal influences among five variables

Perhaps the most important aspect of Bayesian networks is that they are direct representations of the world, not of reasoning processes. The arrows in the diagram represent real causal connections and not the flow of information during reasoning (as in rule-based systems and neural networks). Reasoning processes can operate on Bayesian networks by propagating information in any direction. For example, if the sprinkler is on, then the pavement is probably wet (prediction, simulation); if someone slips on the pavement, that also provides evidence that it is wet (abduction, reasoning to a probable cause, or diagnosis). On the other hand, if we see that the pavement is wet, that makes it more likely that the sprinkler is on or that it is raining (abduction); but if we then observe that the sprinkler is on, that reduces the likelihood that it is raining (explaining away). It is this last form of reasoning, explaining away, that is especially difficult to model in rule-based systems and neural networks in any natural way, because it seems to require the propagation of information in two directions.

Probabilistic Semantics

Any complete probabilistic model of a domain must, either explicitly or implicitly, represent the joint probability distribution — the probability of every possible event as defined by the combination of the values of all the variables. There are exponentially many such events, yet Bayesian networks achieve compactness by factoring the joint distribution into local, conditional distributions for each variable given its parents. If xi denotes some value of the variable Xi and pai denotes some set of values for the parents of Xi, then P(xi|pai) denotes this conditional distribution. For example, P(x4|x2, x3) is the probability of wetness given the values of sprinkler and rain. The global semantics of Bayesian networks specifies that the full joint distribution is given by the product

P(x1, ..., xn) = ∏i P(xi|pai)    (1)

In our example network, we have

P(x1, x2, x3, x4, x5) = P(x1) P(x2|x1) P(x3|x1) P(x4|x2, x3) P(x5|x4)    (2)

It becomes clear that the number of parameters grows only linearly with the size of the network, i.e. the number of variables, whereas the size of each conditional probability distribution grows exponentially with the number of parents of the corresponding node.
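Equation 2 translates directly into code. The following sketch, which builds on the hypothetical `network` dictionary and `p_node` helper from the previous sketch, computes the probability of one complete assignment as the product of local conditional probabilities:

```python
# Global semantics (Equation 1): the joint probability of a complete
# assignment is the product of each node's CPT entry given its parents.
# Builds on the `network` dict and `p_node` from the previous sketch.

def joint(network, assignment):
    p = 1.0
    for node in network:
        p *= p_node(network, node, assignment[node], assignment)
    return p

# Equation 2 for one concrete world: dry season, no rain, sprinkler on,
# pavement wet and slippery.
world = {"season_dry": True, "rain": False, "sprinkler": True,
         "wet": True, "slippery": True}
print(joint(network, world))  # 0.5 * 0.9 * 0.5 * 0.9 * 0.7 = 0.14175
```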
Further savings can be achieved by using compact parametric representations — such as noisy-OR models, decision trees, or neural networks — for the conditional distributions.

There is also an entirely equivalent local semantics, which asserts that each variable is independent of its nondescendants in the network given its parents. For example, the parents of X4 in Figure 2 are X2 and X3, and they render X4 independent of the remaining nondescendant, X1. That is,

P(x4|x1, x2, x3) = P(x4|x2, x3)

Figure 3: Variable X4 is independent of its nondescendants, in this case X1, given its parents, X2 and X3

The collection of independence assertions formed in this way suffices to derive the global assertion in Equation 1, and vice versa. The local semantics is most useful in constructing Bayesian networks, because selecting as parents all the direct causes (or direct relationships) of a given variable invariably satisfies the local conditional independence conditions. The global semantics leads directly to a variety of algorithms for reasoning.
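The local independence assertion can be verified numerically by brute force. The sketch below, again building on the earlier hypothetical definitions, marginalizes the full joint and confirms that P(x4|x1, x2, x3) does not change with x1:

```python
# Check the local semantics numerically: P(wet | season, rain, sprinkler)
# should not depend on season. Builds on `network`, `p_node`, and `joint`
# from the previous sketches.
from itertools import product

def conditional(network, target, given):
    """P(target = True | given), by summing the joint over all other nodes."""
    free = [n for n in network if n != target and n not in given]
    num = den = 0.0
    for values in product([False, True], repeat=len(free)):
        a = dict(given, **dict(zip(free, values)))
        den += joint(network, dict(a, **{target: True})) + \
               joint(network, dict(a, **{target: False}))
        num += joint(network, dict(a, **{target: True}))
    return num / den

for season in (True, False):
    print(conditional(network, "wet",
          {"season_dry": season, "rain": True, "sprinkler": False}))
# Both lines print 0.9 (up to rounding): X4 is independent of X1
# given its parents X2 and X3.
```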
Evidential Reasoning

From the product specification in Equation 1, one can express the probability of any desired proposition in terms of the conditional probabilities specified in the network. For example, the probability that the sprinkler is on given that the pavement is slippery is

P(X3 = on | X5 = true) = P(X3 = on, X5 = true) / P(X5 = true)

= [ Σx1,x2,x4 P(x1, x2, X3 = on, x4, X5 = true) ] / [ Σx1,x2,x3,x4 P(x1, x2, x3, x4, X5 = true) ]

= [ Σx1,x2,x4 P(x1) P(x2|x1) P(X3 = on|x1) P(x4|x2, X3 = on) P(X5 = true|x4) ] / [ Σx1,x2,x3,x4 P(x1) P(x2|x1) P(x3|x1) P(x4|x2, x3) P(X5 = true|x4) ]

These expressions can often be simplified in ways that reflect the structure of the network itself. The first algorithms proposed for probabilistic calculations in Bayesian networks used a local, distributed message-passing architecture, typical of many cognitive activities. Initially this approach was limited to tree-structured networks, but it was later extended to general networks in Lauritzen and Spiegelhalter’s (1988) method of junction tree propagation. A number of other exact methods have been developed and can be found in recent textbooks.

It is easy to show that reasoning in Bayesian networks subsumes the satisfiability problem in propositional logic and hence is NP-hard. Monte Carlo simulation methods can be used for approximate inference (Pearl, 1988), giving gradually improving estimates as sampling proceeds. Unlike junction tree methods, these methods use local message propagation on the original network structure. Alternatively, variational methods provide bounds on the true probability.
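The `conditional` helper defined earlier performs exactly this kind of inference by enumeration, so the example query can be evaluated directly (with the hypothetical CPTs from the earlier sketches):

```python
# P(sprinkler = on | pavement slippery), by summing out X1, X2, and X4,
# exactly as in the expression above. Uses the hypothetical CPTs and the
# `conditional` helper from the earlier sketches.
posterior = conditional(network, "sprinkler", {"slippery": True})
prior = conditional(network, "sprinkler", {})
print(f"prior P(sprinkler=on) = {prior:.3f}, "
      f"posterior given slippery = {posterior:.3f}")
# With these CPTs, observing a slippery pavement raises the probability
# that the sprinkler is on from 0.300 to about 0.500.
```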
Learning Bayesian Networks

The conditional probabilities P(xi|pai) of a given structure can be estimated from data by using the maximum likelihood approach (observed frequencies). They can also be updated continuously from observational data using gradient-based or EM methods that use just local information derived from inference — in much the same way as weights are adjusted in neural networks.

It is also possible to machine-learn the structure of a Bayesian network, and two families of methods are available for that purpose. The first, the constraint-based algorithms, is based on the probabilistic semantics of Bayesian networks: links are added or deleted according to the results of statistical tests that identify marginal and conditional independencies. The second, the score-based algorithms, is based on a metric that measures the quality of candidate networks with respect to the observed data. This metric trades off network complexity against the degree of fit to the data, typically expressed as the likelihood of the data given the network.

As a substrate for learning, Bayesian networks have the advantage that it is relatively easy to encode prior knowledge in network form, either by fixing portions of the structure or by using prior distributions over the network parameters. Such prior knowledge can allow a system to learn accurate models from much less data than is required for tabula rasa approaches.
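For a fixed structure and complete data, maximum-likelihood estimation reduces to counting observed frequencies within each parent configuration. A minimal sketch follows; the data set and the add-one smoothing constant are hypothetical choices for illustration, not part of the original text:

```python
# Maximum-likelihood estimation of one CPT entry from complete data:
# P(wet = True | rain, sprinkler) is the observed frequency within each
# parent configuration. The data and the +1 smoothing are hypothetical.
from collections import Counter

data = [  # (rain, sprinkler, wet) observations, invented for illustration
    (True, False, True), (True, False, True), (False, True, True),
    (False, False, False), (False, True, True), (False, False, False),
    (True, True, True), (False, False, True),
]

counts = Counter()
for rain, sprinkler, wet in data:
    counts[(rain, sprinkler, wet)] += 1

def mle_wet(rain, sprinkler, smooth=1):
    t = counts[(rain, sprinkler, True)] + smooth
    f = counts[(rain, sprinkler, False)] + smooth
    return t / (t + f)

print(mle_wet(False, False))  # (1+1)/(1+1+2+1) = 0.4 with add-one smoothing
```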
Uncertainty Over Time

Entities that live in a changing environment must keep track of variables whose values change over time. Dynamic Bayesian networks capture this process by representing multiple copies of the state variables, one for each time step. A set of variables Xt denotes the world state at time t, and a set of sensor variables Et denotes the observations available at time t. The sensor model P(Et|Xt) is encoded in the conditional probability distributions for the observable variables, given the state variables. The transition model P(Xt+1|Xt) relates the state at time t to the state at time t+1. Keeping track of the world then means computing the current probability distribution over world states given all past observations, i.e. P(Xt|E1, ..., Et). Dynamic Bayesian networks are strictly more expressive than other temporal probability models, such as hidden Markov models and Kalman filters.

Causal Networks

Most probabilistic models, including general Bayesian networks, describe a distribution over possible observed events — as in Equation 1 — but say nothing about what will happen if a certain intervention occurs. For example, what if I turn the sprinkler on? What effect does that have on the season, or on the connection between wetness and slipperiness? A causal network, intuitively speaking, is a Bayesian network with the added property that the parents of each node are its direct causes — as in Figure 2. In such a network, the result of an intervention is obvious: the sprinkler node is set to X3 = on, and the causal link between the season X1 and the sprinkler X3 is removed (see Figure 4). All other causal links and conditional probabilities remain intact, so the new model is

P(x1, x2, x4, x5) = P(x1) P(x2|x1) P(x4|x2, X3 = on) P(x5|x4)

Notice that this differs from observing that X3 = on, which would result in a new model that included the term P(X3 = on|x1). This mirrors the difference between seeing and doing: after observing that the sprinkler is on, we wish to infer that the season is dry, that it probably did not rain, and so on; an arbitrary decision to turn the sprinkler on should not result in any such beliefs.

Figure 4: A causal network reflecting the intervention X3 = on

Causal networks are more properly defined, then, as Bayesian networks in which the correct probability model after intervening to fix any node’s value is given simply by deleting the links from the node’s parents. For example, Fire → Smoke is a causal network, whereas Smoke → Fire is not, even though both networks are equally capable of representing any joint distribution on the two variables. Causal networks model the environment as a collection of stable component mechanisms. These mechanisms may be reconfigured locally by interventions, with correspondingly local changes in the model. This, in turn, allows causal networks to be used very naturally for prediction by an agent that is considering various courses of action.
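This graph surgery is easy to express on the network dictionary from the earlier sketches: intervening on a node deletes its incoming links and clamps its distribution. The following is a sketch of the idea, with the hypothetical CPTs from before, not any particular software package’s API:

```python
# "Doing" versus "seeing": an intervention do(X3 = on) severs the link
# from season to sprinkler and clamps the sprinkler's distribution.
# Builds on `network` and `conditional` from the earlier sketches.
import copy

def intervene(network, node, value):
    mutilated = copy.deepcopy(network)
    mutilated[node] = ((), {(): 1.0 if value else 0.0})  # no parents, clamped
    return mutilated

# Seeing: observing the sprinkler on makes a dry season more likely (~0.833).
print(conditional(network, "season_dry", {"sprinkler": True}))

# Doing: forcing the sprinkler on tells us nothing about the season.
do_network = intervene(network, "sprinkler", True)
print(conditional(do_network, "season_dry", {}))  # back to the 0.5 prior
```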
Causal Discovery

One of the most exciting prospects in recent years has been the possibility of using Bayesian networks to discover causal structures in raw statistical data — a task previously considered impossible without controlled experiments. Consider, for example, the following intransitive pattern of dependencies among three events: A and B are dependent, B and C are dependent, yet A and C are independent. If you ask a person to supply an example of three such events, the example will invariably portray A and C as two independent causes and B as their common effect, namely A → B ← C. (For instance, A and C could be the outcomes of two fair coins, and B a bell that rings whenever either coin comes up heads.)

Figure 5: Causal model for variables A, C, and B, representing two fair coins and a bell, respectively

Fitting this dependence pattern with a scenario in which B is the cause and A and C are the effects is mathematically feasible but very unnatural, because it must entail fine tuning of the probabilities involved; the desired dependence pattern will be destroyed as soon as the probabilities undergo a slight change.

Such thought experiments tell us that certain patterns of dependency, which are totally void of temporal information, are conceptually characteristic of certain causal directionalities and not others. When put together systematically, such patterns can be used to infer causal structures from raw data and to guarantee that any alternative structure compatible with the data must be less stable than the one(s) inferred; namely, slight fluctuations in parameters will render that structure incompatible with the data.
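The coins-and-bell example can be simulated to exhibit exactly this pattern: A and C are marginally independent, yet become dependent once B is observed. A quick Monte Carlo sketch:

```python
# Simulate the collider A -> B <- C: two fair coins and a bell that
# rings whenever either coin comes up heads. A and C are marginally
# independent, but become dependent once B is observed.
import random

random.seed(0)
samples = [(random.random() < 0.5, random.random() < 0.5) for _ in range(100_000)]
trials = [(a, c, a or c) for a, c in samples]  # third entry: bell rings

def p(cond, population):
    hits = [t for t in population if cond(t)]
    return len(hits) / len(population)

# Marginally: P(A = heads | C = heads) ~ P(A = heads) ~ 0.5
print(p(lambda t: t[0], [t for t in trials if t[1]]))   # ~0.5
print(p(lambda t: t[0], trials))                        # ~0.5

# Given the bell rang, learning C = tails makes A = heads certain:
rang = [t for t in trials if t[2]]
print(p(lambda t: t[0], [t for t in rang if not t[1]]))  # 1.0
```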
References

Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2011. http://www.cs.ucl.ac.uk/staff/d.barber/brml.

Barnard, G. A., and T. Bayes. “Studies in the History of Probability and Statistics: IX. Thomas Bayes’s Essay Towards Solving a Problem in the Doctrine of Chances.” Biometrika 45, no. 3 (1958): 293–315.

Darwiche, Adnan. “Bayesian Networks.” Communications of the ACM 53, no. 12 (December 2010): 80.

Hilbert, M., and P. Lopez. “The World’s Technological Capacity to Store, Communicate, and Compute Information.” Science (February 2011). http://www.sciencemag.org/cgi/doi/10.1126/science.1200970.

Koller, Daphne, and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. The MIT Press, 2009.

Lauritzen, S. L., and D. J. Spiegelhalter. “Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems.” Journal of the Royal Statistical Society, Series B 50, no. 2 (1988): 157–224.

Neapolitan, Richard E., and Xia Jiang. Probabilistic Methods for Financial and Marketing Informatics. 1st ed. Morgan Kaufmann, 2007.

Pearl, Judea. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.

Pearl, Judea. Causality: Models, Reasoning and Inference. Cambridge University Press, 2000.

Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.

Pearl, Judea, and Stuart Russell. Bayesian Networks. UCLA Cognitive Systems Laboratory, November 2000. http://bayes.cs.ucla.edu/csl_papers.html.

Spirtes, Peter, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. 2nd ed. The MIT Press, 2001.
Contact Information

Conrady Applied Science, LLC
312 Hamlet’s End Way
Franklin, TN 37067
USA
+1 888-386-8383
info@conradyscience.com
www.conradyscience.com

Bayesia SAS
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
+33 (0)2 43 49 75 69
info@bayesia.com
www.bayesia.com