Breast Cancer Diagnostics with Bayesian Networks
Interpreting the Wisconsin Breast Cancer Database with BayesiaLab
Stefan Conrady, stefan.conrady@conradyscience.com
Dr. Lionel Jouffe, jouffe@bayesia.com
May 20, 2013
Table of Contents

Case Study & Tutorial
Introduction
Background
Wisconsin Breast Cancer Database
Notation
Model Development
Data Import
Unsupervised Learning
Model 1: Markov Blanket
Model 1: Performance
K-Folds Cross-Validation
Model 2: Augmented Markov Blanket
Model 2a: Performance
Structural Coefficient
Model 2b: Augmented Markov Blanket (SC=0.3)
Model 2b: Performance
Conclusion
Model Inference
Interactive Inference
Adaptive Questionnaire
Target Interpretation Tree
Summary

Appendix
Framework: The Bayesian Network Paradigm
Acyclic Graphs & Bayes’s Rule
Compact Representation of the Joint Probability Distribution
References
Contact Information
Bayesia USA
Bayesia Singapore Pte. Ltd.
Bayesia S.A.S.
Copyright
Case Study & Tutorial
Introduction
Data classification is one of the most common tasks in the field of statistical analysis and countless methods
have been developed for this purpose over time. A common approach is to develop a model based on
known historical data, i.e. where the class membership of a record is known, and to use this generalization
to predict the class membership for a new set of observations.
Applications of data classification permeate virtually all fields of study, including the social sciences, engineering, biology, etc. In the medical field, classification problems often appear in the context of disease identification, i.e. making a diagnosis about a patient’s condition. The medical sciences have a long history of developing a large body of knowledge that links observable symptoms with known types of illnesses. It is the physician’s task to use the available medical knowledge to make inferences based on the patient’s symptoms, i.e. to classify the medical condition in order to enable appropriate treatment.
Over the last two decades, so-called medical expert systems have emerged, which are meant to support phy-
sicians in their diagnostic work. Given the sheer amount of medical knowledge in existence today, it should
not be surprising that significant benefits are expected from such machine-based support in terms of medical
reasoning and inference.
In this context, several papers by Wolberg, Street, Heisey and Mangasarian became much-cited examples. They proposed an automated method for the classification of Fine Needle Aspirates1 through image processing and machine learning with the objective of achieving a greater accuracy in distinguishing between
malignant and benign cells for the diagnosis of breast cancer. At the time of their study, the practice of vis-
ual inspection of FNA yielded inconsistent diagnostic accuracy. The proposed new approach would increase
this accuracy reliably to over 95%. This research was quickly translated into clinical practice and has since
been applied with continued success.
As part of their studies in the late 1980s and 1990s, the research team generated what became known as the
Wisconsin Breast Cancer Database, which contains measurements of hundreds of FNA samples and the as-
sociated diagnoses. This database has been extensively studied, even outside the medical field. Statisticians
and computer scientists have proposed a wide range of techniques for this classification problem and have
continuously raised the benchmark for predictive performance.
Our objective with this paper is to present Bayesian networks as a highly practical framework for working
with this kind of classification problem. We intend to demonstrate how the BayesiaLab software can extremely quickly, and relatively simply, create Bayesian network models that achieve the performance of the best custom-developed models, while only requiring a fraction of the development time.
1 Fine needle aspiration (FNA) is a percutaneous (“through the skin”) procedure that uses a fine gauge needle (22 or 25 gauge) and a syringe to sample fluid from a breast cyst or remove clusters of cells from a solid mass. With FNA, the cellular material taken from the breast is usually sent to the pathology laboratory for analysis.
Furthermore, we wish to illustrate how Bayesian networks can help researchers and practitioners generate a
deeper understanding of the underlying problem domain. Beyond merely producing predictions, we can use
Bayesian networks to precisely quantify the importance of individual variables and employ BayesiaLab to
help identify the most efficient path towards a diagnosis.
BayesiaLab’s speed of model building, its excellent classification performance, plus the ease of interpretation
provide researchers with a powerful new tool. Bayesian networks and BayesiaLab have thus become a driver
in accelerating research.
Background
To provide context for this study, we quote Mangasarian, Street and Wolberg (1994), who conducted the original research related to breast cancer diagnosis with digital image processing and machine learning:
Most breast cancers are detected by the patient as a lump in the breast. The majority of breast
lumps are benign, so it is the physician’s responsibility to diagnose breast cancer, that is, to distin-
guish benign lumps from malignant ones. There are three available methods for diagnosing breast
cancer: mammography, FNA with visual interpretation and surgical biopsy. The reported sensitivity, i.e. ability to correctly diagnose cancer when the disease is present, of mammography varies
from 68% to 79%, of FNA with visual interpretation from 65% to 98%, and of surgical biopsy
close to 100%.
Therefore mammography lacks sensitivity, FNA sensitivity varies widely, and surgical biopsy, al-
though accurate, is invasive, time consuming and costly. The goal of the diagnostic aspect of our
research is to develop a relatively objective system that diagnoses FNAs with an accuracy that ap-
proaches the best achieved visually.
Wisconsin Breast Cancer Database
This breast cancer database was created through the clinical work of Dr. William H. Wolberg at the Univer-
sity of Wisconsin Hospitals in Madison. As of 1992, Dr. Wolberg had collected 699 instances of patient
diagnoses in this database, consisting of two classes: 458 benign cases (65.5%) and 241 malignant cases
(34.5%).
The following eleven attributes2 are included in the database:
1. Sample code number
2. Clump Thickness (1 - 10)
3. Uniformity of Cell Size (1 - 10)
4. Uniformity of Cell Shape (1 - 10)
5. Marginal Adhesion (1 - 10)
6. Single Epithelial Cell Size (1 - 10)
7. Bare Nuclei (1 - 10)
8. Bland Chromatin (1 - 10)
9. Normal Nucleoli (1 - 10)
10. Mitoses (1 - 10)
11. Class (benign/malignant)
2 “Attribute” and “variable” are used interchangeably throughout the paper.
Attributes #2 through #10 were computed from digital images of fine needle aspirates (FNA) of breast
masses. These features describe the characteristics of the cell nuclei in the image. The attribute #11, Class,
was established via subsequent biopsies or via long-term monitoring of the tumor.
We will not go into detail here regarding the definition of the attributes and their measurement. Rather, we
refer the reader to papers referenced in the bibliography.
The Wisconsin Breast Cancer Database is available to any interested researcher from the UC Irvine Machine
Learning Repository.3 We use this database in its original format without any further transformation, so
our results can be directly compared to dozens of methods that have been developed since the original
study.
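For readers who wish to reproduce the workflow outside of BayesiaLab, the raw file can be loaded with a few lines of Python. The sketch below is illustrative only; the UCI file name, the column order, the “?” coding of missing values, and the 2/4 coding of the Class column reflect the repository’s documentation and should be verified against the current download.

```python
import pandas as pd

# Column names follow the attribute list above (order as documented by the UCI repository).
columns = [
    "Sample code number", "Clump Thickness", "Uniformity of Cell Size",
    "Uniformity of Cell Shape", "Marginal Adhesion", "Single Epithelial Cell Size",
    "Bare Nuclei", "Bland Chromatin", "Normal Nucleoli", "Mitoses", "Class",
]

# Missing values (all in Bare Nuclei) are encoded as "?" in the raw file.
url = ("https://archive.ics.uci.edu/ml/machine-learning-databases/"
       "breast-cancer-wisconsin/breast-cancer-wisconsin.data")
df = pd.read_csv(url, header=None, names=columns, na_values="?")

# In the raw file, Class is coded 2 = benign and 4 = malignant; map it to readable labels.
df["Class"] = df["Class"].map({2: "Benign", 4: "Malignant"})

print(df.shape)                       # expected: (699, 11)
print(df["Class"].value_counts())     # expected: 458 Benign, 241 Malignant
```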
Notation
To clearly distinguish between natural language, software-specific functions and study-specific variable
names, the following notation is used:
• BayesiaLab-specific functions, keywords, commands, etc., are capitalized and printed in bold type. You
can look up such terms in the BayesiaLab Library (library.bayesia.com) for more details.
• The names of variables, attributes, nodes, and node states are capitalized and italicized.
3 UC Irvine Machine Learning Repository website: http://archive.ics.uci.edu/ml/
Model Development
Data Import
Our modeling process begins with importing the database,4 which is formatted as a text file with comma-
separated values. Therefore, we start with Data | Open Data Source | Text File.
The Data Import Wizard then guides us through the required steps. In the first dialogue box of the Data
Import Wizard, we click on Define Typing and specify that we wish to set aside a Test Set from the data-
base.
4 If we exclude the variable Sample code number, this database can also be used with the publicly-available evaluation
version of BayesiaLab, which is limited to a maximum of ten nodes. Deleting this variable does not affect the workflow
or the results of the analysis.
Following common practice, we will randomly select 20% of the 699 records as the Test Set, and, consequently, the remaining 80% will serve as our Learning Set.5
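Outside of BayesiaLab, the equivalent split might look like the following sketch (scikit-learn assumed, continuing from the loading example above); the fixed random_state plays the role of BayesiaLab’s Fixed Seed.

```python
from sklearn.model_selection import train_test_split

# Randomly hold out 20% of the 699 records as the Test Set; a fixed seed makes the split reproducible.
learning_set, test_set = train_test_split(df, test_size=0.20, random_state=42)
print(len(learning_set), len(test_set))  # roughly 559 and 140 records
```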
In the next step, the Data Import Wizard will suggest the data format for each variable. Attributes 2 through 10 are identified as continuous variables and Class is read as a discrete variable. Only for the first variable, Sample code number, do we have to specify Row Identifier, so it is not mistaken for a continuous predictor variable.
In the next step, the Information Panel reports that we have a total of 16 missing values in the entire data-
set. We can also see that the column Bare Nuclei is labeled with a small question mark, indicating the pres-
ence of missing values in this particular column.
5 “Learning/Test Set” and “Learning/Test Sample” are used interchangeably in this paper.
We now need to specify the type of Missing Values Imputation. Given the small size of the dataset, and the
small number of missing values, we will choose the Structural EM method.6
A critical element of the data import process is the discretization of all continuous variables. On the next
screen we click Select All Continuous to apply the same discretization algorithm across all continuous vari-
ables. Alternatively, we could choose the type of discretization individually by variable. However, we will
not discuss this option any further in this paper.
As the objective of this exercise is classification, we choose the Decision Tree algorithm from the drop-down
menu in the Multiple Discretization panel. This discretizes each variable for a maximum information gain
with respect to the Target Class.
6 For more details on missing values imputation with Bayesian networks, see Conrady and Jouffe (2012).
Bayesian networks are entirely non-parametric, probabilistic models, and for their estimation they require a
certain minimum number of observations. To help us with the selection of the number of discretization lev-
els (or Intervals), we use the heuristic of five observations per parameter and probability cell. Given that we
have a relatively small database with only 560 observations,7 three discretization intervals for each variable
appear to be an appropriate choice. If we used a higher number of Intervals, we would need more observa-
tions for a reliable estimation of the parameters.
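A rough analogue of this target-driven discretization can be sketched outside of BayesiaLab by letting a shallow decision tree choose the cut points of each variable against Class. This is only an approximation of BayesiaLab’s Decision Tree discretization: scikit-learn is assumed, max_leaf_nodes=3 mirrors the three intervals chosen above, and missing values are simply dropped here rather than imputed via Structural EM.

```python
from sklearn.tree import DecisionTreeClassifier

# df as defined in the earlier loading sketch.
def tree_cut_points(values, target, max_bins=3):
    """Fit a one-feature decision tree against Class and return its split thresholds."""
    mask = values.notna()
    tree = DecisionTreeClassifier(max_leaf_nodes=max_bins, random_state=0)
    tree.fit(values[mask].to_numpy().reshape(-1, 1), target[mask])
    # Leaf nodes are marked by children_left == -1; the remaining nodes carry the cut points.
    thresholds = tree.tree_.threshold[tree.tree_.children_left != -1]
    return sorted(float(t) for t in thresholds)

predictors = [c for c in df.columns if c not in ("Sample code number", "Class")]
for col in predictors:
    print(col, tree_cut_points(df[col], df["Class"]))
```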
Upon clicking Finish, we will immediately see a representation of the newly imported database in the form
of a fully unconnected Bayesian network in the Graph Panel. Each variable is now represented as a blue
node in the graph panel of BayesiaLab.
7 560 cases are in the training set (80%) and 139 are in the test set (20%).
The question mark symbol, which is associated with the Bare Nuclei node, indicates that there are missing
values for this variable. Hovering over the question mark with the mouse pointer while pressing the “i” key
will show the number of missing values.
Optionally, BayesiaLab can display an import report summarizing the obtained discretizations for all vari-
ables.
Unsupervised Learning
When exploring a new domain, we generally recommend performing Unsupervised Learning on the newly imported database. This is also the case here, even though our principal objective is predictive modeling, for which Supervised Learning will later be the main tool.
Learning | Unsupervised Structural Learning | EQ initiates the EQ Algorithm, which is suitable for the initial
review of the database. For larger databases with significantly more variables, the Maximum Weight Span-
ning Tree is a very fast algorithm and can be used instead.
Upon learning, the initial Bayesian network looks like this:
In its “raw” form, the crossing arcs make this network somewhat tricky to read. BayesiaLab has a number
of layout algorithms that can quickly “disentangle” such a network and produce a much more user-friendly
format.
We can select View | Automatic Layout or alternatively use the shortcut “P”.
Now we can visually review the learned network structure and compare it to our own domain knowledge.
This allows for a “sanity check” of the database and the variables, and it may highlight any inconsistencies.
Beyond visually inspecting the network structure, BayesiaLab allows us to visualize the quantitative part of
this network. To do this, we first need to switch into the Validation Mode by clicking on the highlighted
button in the lower-lefthand corner of the Graph Panel, or by alternatively using the “F5” key as a shortcut.
We can now display the Pearson Correlation between the nodes that are directly linked in the graph by se-
lecting Analysis | Visual | Pearson’s Correlation from the menu.
Each arc’s thickness is now proportional to the Pearson Correlation between the connected nodes. Also, the
blue and red colors indicate positive and negative correlations respectively. Any unexpected sign of correla-
tions would thus become apparent very quickly. In our example, we only have positive correlations and thus
all arcs are blue.
Additionally, callouts indicate that further information can be displayed. We can opt to display this
numerical information via View | Display Arc Comments.
This function is also available via a button in the menu:
Model 1: Markov Blanket
Now that we have performed an initial review of the dataset with the Unsupervised Learning step, we can
return to the Modeling Mode by clicking on the corresponding button in the lower lefthand corner of the
screen or using the shortcut “F4”.8
This allows us to proceed to the modeling stage. Given our objective of predicting the state of the variable
Class, i.e. benign versus malignant, we will define Class as the Target Variable by right-clicking on the node
and selecting Set as Target Variable from the contextual menu. Alternatively, we can double-click on Class
while holding the shortcut “T” pressed. We need to specify this explicitly, so the subsequent Supervised
Learning algorithm can use Class as the dependent variable.
This setting is confirmed by the “bullseye” appearance of the new Target Node.
8 We will mostly omit further references to switching between Modeling Mode (F4) and Validation Mode (F5). The
required modes can generally be inferred from the context.
Upon this selection, all Supervised Learning algorithms become available under Learning | Supervised Learn-
ing.
In many cases, the Markov Blanket algorithm is a good starting point for a predictive model. This algorithm
is extremely fast and can even be applied to databases with thousands of variables and millions of records,
even though database size is not a concern in this particular study.
Upon learning the Markov Blanket for Class, and once again applying the Automatic Layout, the resulting
Bayesian network looks as follows:
Markov Blanket Definition
The Markov Blanket for a node A is the set of nodes composed of A’s parents, its children, and its
children’s other parents (=spouses).
The Markov Blanket of the node A contains all the variables, which,
if we know their states, will shield the node A from the rest of the
network. This means that the Markov Blanket of a node is the only
knowledge needed to predict the behavior of that node A. Learning a
Markov Blanket selects relevant predictor variables, which is particu-
larly helpful when there is a large number of variables in the database.
In fact, this can also serve as a highly-efficient variable selection
method in preparation for other types of modeling, e.g. neural net-
works.
This network suggests that Class has a direct probabilistic relationship with all variables except Marginal Adhesion and Single Epithelial Cell Size, which are both disconnected. The absence of a connection with the Target indicates that these nodes are conditionally independent of Class given the nodes in the Markov Blanket.
Beyond distinguishing between predictors (connected nodes) and non-predictors (disconnected nodes), we
can further examine the relationship versus the Target Node Class by highlighting the Mutual Information
of the arcs connecting the nodes. This function is accessible within the Validation Mode via Analysis | Vis-
ual | Arcs’ Mutual Information.
Note
We can see on the graph learned earlier with the EQ algorithm that Uniformity of Cell Shape is the
node that makes these two nodes conditionally independent of Class.
We will also go ahead and immediately select View | Display Arc Comments.
The thickness of the arcs is now proportional to the Mu-
tual Information, i.e. the strength of the relationship be-
tween the nodes. Intuitively, Mutual Information measures
the information that X and Y share: it measures how much
knowing one of these variables reduces our uncertainty
about the other. For example, if X and Y are independent,
then knowing X does not provide any information about Y
and vice versa, so their Mutual Information is zero. At the other extreme, if X and Y are identical then all
information conveyed by X is shared with Y: knowing X determines the value of Y and vice versa.
Formal Definition of Mutual Information

I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y) \log\left(\frac{p(x,y)}{p(x)\,p(y)}\right)
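To make this definition concrete, the mutual information between a discretized predictor and Class can be estimated from the data’s empirical joint distribution. The sketch below assumes scikit-learn’s mutual_info_score (which returns nats, hence the conversion to bits) and uses crude equal-width bins purely for illustration, so the values will not match BayesiaLab’s exactly.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mutual_info_score

# df and predictors as defined in the earlier sketches.
def mutual_information_bits(x, y):
    """Empirical mutual information I(X;Y) in bits for two discrete series (rows with NaN dropped)."""
    mask = x.notna() & y.notna()
    return mutual_info_score(x[mask], y[mask]) / np.log(2)

for col in predictors:
    binned = pd.cut(df[col], bins=3, labels=False)  # equal-width bins, for illustration only
    print(col, round(mutual_information_bits(binned, df["Class"]), 3))
```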
In the top part of the comment box attached to each arc, the Mutual Information of the arc is
shown. Expressed as a percentage and highlighted in blue, we see the relative Mutual Informa-
tion in the direction of the arc (parent node ➔ child node). And, at the bottom, we have the
relative Mutual Information in the opposite direction of the arc (child node ➔ parent node).
Model 1: Performance
As we are not equipped with specific domain knowledge about the variables, we will not further interpret
these relationships but rather run an initial test regarding the Network Performance. We want to know how
well this Markov Blanket model can predict the states of the Class variable, i.e. Benign versus Malignant.
This test is available via Analysis | Network Performance | Target.
Using our previously defined Test Set for validating our model, we obtain the following, rather encouraging
results:
Of the 88 Benign cases of the test set, 3 were incorrectly identified, which corresponds to a false positive
rate of 3.41%. More importantly though, of the 51 Malignant cases, all were identified correctly (true posi-
tives) with no false negatives. The overall performance can be expressed as the Total Precision, which is computed as the total number of correct predictions (true positives + true negatives) divided by the total number of cases in the Test Set, i.e. (85 + 51) ÷ 139 = 97.84%.
As the selection of the Learning Set and the Test Set during the data import process is random, BayesiaLab
may learn slightly different networks based on different Learning Sets after each data import. Hence, your
own network performance evaluation could deviate from what is shown above, unless you chose the same
Fixed Seed for the random number generator when you defined Data Typing during the data import proc-
ess.
K-Folds Cross-Validation
To mitigate the sampling artifacts that may occur in a one-off test, we can systematically learn networks on
a sequence of different subsets and then aggregate the test results. Analogous to the original papers on this
topic, we will perform K-Folds Cross Validation, which will iteratively select K different Learning Sets and
Test Sets and then, based on those, learn the networks and test their performance.
The Cross Validation can then be started via Tools | Cross Validation | Targeted Evaluation | K-Folds.
We use the same learning algorithm as before, i.e. the Markov Blanket, and we choose 10 as the number of
sub-samples to be analyzed. Of the total dataset of 699 cases, each of the ten iterations will create a Test Set
of 69 randomly drawn samples, and use the remaining 630 as the Learning Set. This means that BayesiaLab
learns one network per Learning Set and then tests the performance on the respective Test Set.
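BayesiaLab relearns a Markov Blanket network in each fold. For readers without the software, the same evaluation protocol can be sketched with scikit-learn; note that the naive Bayes classifier and the median imputation below are stand-ins chosen for illustration, not the authors’ model or BayesiaLab’s Structural EM.

```python
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

# df and predictors as defined in the earlier sketches.
X, y = df[predictors], df["Class"]

model = make_pipeline(SimpleImputer(strategy="median"), GaussianNB())
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
predictions = cross_val_predict(model, X, y, cv=cv)

print(confusion_matrix(y, predictions, labels=["Benign", "Malignant"]))
print("Total Precision:", round(accuracy_score(y, predictions), 4))
```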
The summary, including the synthesized results, is shown below.
These results confirm the good performance of this model. The Total Precision is 97%, with a false negative
rate of 2%. This means 2% of the cases were predicted as Benign, while they were actually Malignant.
Clicking Comprehensive Report produces a summary, which can also be saved in HTML format. This is
convenient for subsequent editing, as the generated HTML file can be opened and edited as a spreadsheet.
Sampling Method: K-Folds
Learning Algorithm: Markov Blanket
Target: Class
Total Precision: 96.85%
Relative Gini Index Mean: 98.53%
Relative Lift Index Mean: 99.37%
R: 0.93817485358
R2: 0.88017205588

Value                  Benign    Malignant
Gini Index             33.95%    64.59%
Relative Gini Index    98.50%    98.55%
Mean Lift              1.42      2.04
Relative Lift Index    99.74%    99%

Occurrences                  Actual Benign (458)   Actual Malignant (241)
Predicted Benign (446)       441                   5
Predicted Malignant (253)    17                    236

Reliability                  Actual Benign (458)   Actual Malignant (241)
Predicted Benign (446)       98.88%                1.12%
Predicted Malignant (253)    6.72%                 93.28%

Precision                    Actual Benign (458)   Actual Malignant (241)
Predicted Benign (446)       96.29%                2.07%
Predicted Malignant (253)    3.71%                 97.93%
As our Markov Blanket modeling is already performing at a level comparable to the models that have been
published in the literature, we might be tempted to conclude our analysis at this point. However, we will
attempt to see whether further performance improvements are possible.
Model 2: Augmented Markov Blanket
BayesiaLab offers an extension to the Markov Blanket algorithm, namely the Augmented Markov Blanket,
which performs an Unsupervised Learning algorithm on the nodes in the Markov Blanket. This allows us to identify influence paths between the predictor variables and can potentially help improve the prediction
performance.
This algorithm can be started via Learning | Supervised Learning | Augmented Markov Blanket.
As expected, the resulting network is somewhat more complex than the standard Markov Blanket.
If we save the original Markov Blanket and the new Augmented Markov Blanket under different file names,
we can use Tools | Compare | Structure to highlight the differences between both. Given that the addition of
three arcs is immediately visible, this function may appear as overkill for our particular example. However,
in more complex situations, Structure Comparison can be rather helpful, and so we will spell out the details.
We choose the original network and the newly learned network as the Reference Network and the Com-
parison Network respectively.
Upon selection, a table provides a list of common arcs and those arcs that have been added in the Compari-
son Network, which was learned with the Augmented Markov Blanket algorithm:
Clicking Charts provides a visual representation of these differences. The additional arcs, compared to the
original Markov Blanket network, are now highlighted in blue. Conversely, had any arcs been deleted, those
would be shown in red.
Model 2a: Performance
We now proceed to performance evaluation with this new Augmented Markov Blanket network, analogous
to the Markov Blanket model: Analysis | Network Performance | Target
Given that we had originally split the dataset into a Learning Set and a Test Set, the Network Performance
evaluation is once again carried out separately on both subsets.
Interestingly, the performance on the Test Set is better than on the Learning Set. This indicates that overfit-
ting is not a problem here.
A summary for either subset can be saved by clicking Comprehensive Report. The out-of-sample Test Set
report is generally the more important one. It is shown below.
Target: Class
Total Precision: 98.56%
Relative Gini Index Mean: 99.53%
Relative Lift Index Mean: 99.85%
R: 0.97499525394
R2: 0.95061574521

Value                  Benign    Malignant
Gini Index             36.52%    63.01%
Relative Gini Index    99.53%    99.53%
Mean Lift              1.45      1.99
Relative Lift Index    99.92%    99.79%

Occurrences                 Actual Benign (88)   Actual Malignant (51)
Predicted Benign (86)       86                   0
Predicted Malignant (53)    2                    51

Reliability                 Actual Benign (88)   Actual Malignant (51)
Predicted Benign (86)       100%                 0%
Predicted Malignant (53)    3.77%                96.23%

Precision                   Actual Benign (88)   Actual Malignant (51)
Predicted Benign (86)       97.73%               0%
Predicted Malignant (53)    2.27%                100%
As with the earlier model, we repeat K-Folds Cross Validation for the Augmented Markov Blanket. The
results are shown below, first as a screenshot and then as a spreadsheet generated via Comprehensive Re-
port.
Sampling Method: K-Folds
Learning Algorithm: Augmented Markov Blanket
Target: Class
Total Precision: 96.85%
Relative Gini Index Mean: 98.52%
Relative Lift Index Mean: 99.37%
R: 0.93877413371
R2: 0.88129687412

Value                  Benign    Malignant
Gini Index             33.95%    64.58%
Relative Gini Index    98.50%    98.55%
Mean Lift              1.42      2.04
Relative Lift Index    99.75%    98.99%

Occurrences                  Actual Benign (458)   Actual Malignant (241)
Predicted Benign (448)       442                   6
Predicted Malignant (251)    16                    235

Reliability                  Actual Benign (458)   Actual Malignant (241)
Predicted Benign (448)       98.66%                1.34%
Predicted Malignant (251)    6.37%                 93.63%

Precision                    Actual Benign (458)   Actual Malignant (241)
Predicted Benign (448)       96.51%                2.49%
Predicted Malignant (251)    3.49%                 97.51%
Despite the greater complexity of this new network, we do not see an improvement in any of the perform-
ance measures.
Structural Coefficient
Up to this point, the difference in network complexity was only a function of the choice of learning algorithm. We will now address the Structural Coefficient (SC), which is the only parameter adjustable across all
the learning algorithms in BayesiaLab. In essence, this parameter determines a kind of significance thresh-
old, and thus it influences the degree of complexity of the induced networks.
By default, this Structural Coefficient is set to 1, which reliably prevents the learning algorithms from over-
fitting the model to the data. In studies with relatively few observations, the analyst’s judgment is needed for
determining a potential downward adjustment of this parameter. On the other hand, when data sets are
very large, increasing the parameter to values higher than 1 will help manage the network complexity.
Given the fairly simple network structure of the Markov Blanket model, complexity was of no concern. The Augmented Markov Blanket is more complex, but still very manageable. The question is, could a more
complex network provide greater precision without overfitting? To answer this question, we will perform a
Structural Coefficient Analysis, which generates several metrics that help in making the trade-off between
complexity and precision: Tools | Cross Validation | Structural Coefficient Analysis
BayesiaLab prompts us to specify the range of the Structural Coefficient to be examined and the number of
iterations to be performed. It is worth noting that the Minimum Structural Coefficient should not be set to
0, or even close to 0. A value of 0 would imply a fully connected network, which can take a very long time
to learn depending on the number of variables, or even exceed the memory capacity of the computer run-
ning BayesiaLab.
Number of Iterations determines the interval steps to be taken within the specified range of the Structural
Coefficient. Given the relatively light computational load, we choose 25 iterations. With more complex
models, we might be more conservative, as each iteration re-learns and re-evaluates the network. Further-
more, we select to compute all metrics.
The resulting report shows how the network changes as a function of the Structural Coefficient. This can be
interpreted as the degree of confidence the analyst should have in any particular arc in the structure.
Clicking Graphs will show a synthesized network, consisting of all structures generated during the iterative
learning process.
The reference structure is represented by black arcs, which show the original network learned immediately
prior to the start of the Structural Coefficient Analysis. The blue-colored arcs are not contained in the refer-
ence structure, but they appear in networks that have been learned as a function of the different Structural
Coefficients (SC). The thickness of the arcs is proportional to the frequency of individual arcs existing in the
learned networks.
More important for us, however, is determining the correct level of network complexity for reliable and accurate prediction performance while avoiding overfitting the data. We can plot several different metrics in this context by clicking Curve.
Typically, the “elbow” of the L-shaped curve above identifies a suitable value for the Structural Coefficient (SC). More formally, we would look for the point on the curve where the second derivative is maximized. Upon visual inspection, an SC value of around 0.3 appears to be a good candidate for that point. The portion of the curve where SC values approach 0 shows the characteristic pattern of overfitting, which is to be avoided.
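Once the curve’s values have been exported, the maximum-second-derivative heuristic can be checked numerically. The arrays in this sketch are hypothetical placeholders for values read off BayesiaLab’s curve, not results from this study.

```python
import numpy as np

# Hypothetical SC values and structure scores, for illustration only.
sc_values = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 1.0])
structure_scores = np.array([44.0, 30.0, 22.0, 20.5, 19.5, 19.0, 18.8])

# Finite-difference estimate of the second derivative; its maximum marks the candidate elbow.
second_derivative = np.gradient(np.gradient(structure_scores, sc_values), sc_values)
print("Candidate SC:", sc_values[int(np.argmax(second_derivative))])  # 0.3 for these numbers
```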
We will also plot the Target’s Precision alone as a function of the SC. On the surface, the curve for the
Learning Set resembles an L-shape too, but the curve moves only within roughly 2 percentage points, i.e.
between 97% and 99%. For practical purposes, this means that the curve is virtually flat.
As a result, the Structure/Target’s Precision Ratio, i.e.

\frac{\text{Structure}}{\text{Target's Precision}},

is primarily a function of the numerator, i.e. the Structure, as the denominator, Target’s Precision, is nearly constant across a wide range of SC values, as per the graph above.
If both Learning and Test Sets are available, a Validation Measure γ can be computed to help choose the most appropriate Structural Coefficient.
This measure is based on the Test Set’s mean negative log-likelihood (returned by the network learned from
the Learning Set) and on the variances of the negative log-likelihood of the Test Set and Learning Set (re-
turned by the network learned from Learning Set).
\gamma = \mu_{LL,\text{Test}} \times \max\!\left(1, \frac{\sigma^2_{LL,\text{Test}}}{\sigma^2_{LL,\text{Learning}}}\right)
The range between roughly 0.3 and 0.6, i.e. the section around the minimum of the curve, suggests suitable
values for the Structural Coefficient.
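Given per-case negative log-likelihoods exported for both sets, the measure is a short computation. The arrays below are hypothetical placeholders, shown only to make the formula concrete.

```python
import numpy as np

def validation_gamma(nll_test, nll_learning):
    """Mean test negative log-likelihood, inflated when the test variance exceeds the learning variance."""
    nll_test, nll_learning = np.asarray(nll_test), np.asarray(nll_learning)
    return nll_test.mean() * max(1.0, nll_test.var() / nll_learning.var())

# Hypothetical per-case negative log-likelihoods returned by a learned network.
print(validation_gamma(nll_test=[2.1, 1.8, 2.5, 3.0], nll_learning=[1.9, 2.0, 2.2, 2.1]))
```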
Model 2b: Augmented Markov Blanket (SC=0.3)
Given the results from the Structural Coefficient Analysis, we now wish to relearn the network with an SC
value of 0.3. The SC value can be set by right-clicking on the background of the Graph Panel and then se-
lecting Edit Structural Coefficient from the Contextual Menu, or alternatively via the menu, i.e. Edit | Edit
Structural Coefficient.
Once we relearn the network, using the same Augmented Markov Blanket algorithm as before, we obtain a
more complex network. The key question is, will this increase in complexity improve the performance or
perhaps be counterproductive?
Model 2b: Performance
We repeat the Network Performance Analysis and generate the Comprehensive Report for the Test Set.
Target: Class
Total Precision: 98.56%
Relative Gini Index Mean: 99.75%
Relative Lift Index Mean: 99.93%
R: 0.97908818201
R2: 0.95861366815

Value                  Benign    Malignant
Gini Index             36.60%    63.15%
Relative Gini Index    99.75%    99.75%
Mean Lift              1.45      1.99
Relative Lift Index    99.96%    99.90%

Occurrences                 Actual Benign (88)   Actual Malignant (51)
Predicted Benign (86)       86                   0
Predicted Malignant (53)    2                    51

Reliability                 Actual Benign (88)   Actual Malignant (51)
Predicted Benign (86)       100%                 0%
Predicted Malignant (53)    3.77%                96.23%

Precision                   Actual Benign (88)   Actual Malignant (51)
Predicted Benign (86)       97.73%               0%
Predicted Malignant (53)    2.27%                100%
Secondly, we perform K-Folds Cross Validation:
Sampling Method: K-Folds
Learning Algorithm: Augmented Markov Blanket
Target: Class
Total Precision: 96.71%
Relative Gini Index Mean: 98.28%
Relative Lift Index Mean: 99.37%
R: 0.94052337963
R2: 0.88458422762

Value                  Benign    Malignant
Gini Index             33.86%    64.42%
Relative Gini Index    98.28%    98.28%
Mean Lift              1.42      2.04
Relative Lift Index    99.69%    99.05%

Occurrences                  Actual Benign (458)   Actual Malignant (241)
Predicted Benign (447)       441                   6
Predicted Malignant (252)    17                    235

Reliability                  Actual Benign (458)   Actual Malignant (241)
Predicted Benign (447)       98.66%                1.34%
Predicted Malignant (252)    6.75%                 93.25%

Precision                    Actual Benign (458)   Actual Malignant (241)
Predicted Benign (447)       96.29%                2.49%
Predicted Malignant (252)    3.71%                 97.51%
Conclusion
All models reviewed, Model 1 (Markov Blanket), Model 2a (Augmented Markov Blanket, SC=1), Model 2b
(Augmented Markov Blanket, SC=0.3), have performed at very similar levels in terms of classification per-
formance. Total Precision and false positives/negatives are shown as the key metrics in the summary table
below.
                                      Test Set (n=139)                              10-Fold Cross-Validation (n=699)
                                      Total Precision   False Pos.   False Neg.     Total Precision   False Pos.   False Neg.
Markov Blanket (SC=1)                 97.84%            3            0              96.85%            17           5
Augmented Markov Blanket (SC=1)       98.56%            2            0              96.85%            16           6
Augmented Markov Blanket (SC=0.3)     98.56%            2            0              96.71%            17           6
Reestimating these models with more observations could potentially change the results and might more clearly differentiate the classification performance. For now, we select the Augmented Markov Blanket (SC=1), and it will serve as the basis for the next section of this paper, Model Inference.
Model Inference
Without further discussion of the merits of each model specification, we will now show how the learned Augmented Markov Blanket model can be applied in practice and used for inference. First, we need to go to
Validation Mode (F5). We can now bring up all the Monitors in the Monitor Panel by selecting all the
nodes (Ctrl+A) and double-clicking on any one of them. More conveniently, the Monitors can be displayed
by right-clicking inside the Monitor Panel and selecting Sort | Target Correlation from the Contextual
Menu.
Alternatively, we can do the same via Monitor | Sort | Target Correlation.
Monitors are then automatically created for all the nodes correlated with the Target Node. The Monitor of the Target Node is placed first in the Monitor Panel, followed by the other Monitors in order of their correlation with the Target Node, from highest to lowest.
Interactive Inference
For instance, we can now use BayesiaLab to review the individual predictions made based on the model.
This feature is called Interactive Inference, which can be accessed from the menu via Inference | Interactive
Inference.
Also, we have a choice of using either the Learning Set or the Test Set for inference. For our purposes, we
choose the Test Set.
The Navigation Bar allows scrolling through each record of the test set. Record #0 can be seen below with
all the associated observations highlighted in green. Given the observations shown, the model predicts a
99.97% probability that Class is Benign (the Monitor of the Target Node is highlighted in red).
Most cases are rather clear-cut, as above, with probabilities for either diagnosis around 99% or higher.
However, there are a number of exceptions, such as case #11. Here, the probability of malignancy is ap-
proximately 75%.
Adaptive Questionnaire
In situations when only individual cases are under review, rather than a batch of cases from a database,
BayesiaLab can provide case-by-case diagnosis support with the Adaptive Questionnaire.
For a Target Node with more than two states, the Adaptive Questionnaire requires that we define a Tar-
get State. Setting the Target State allows BayesiaLab to compute Binary Mutual Information and then focus
on the defined Target State. Technically, setting the Target State is not necessary in our particular example
as the Target Node is binary.
The Adaptive Questionnaire can be started from the menu via Inference | Adaptive Questionnaire.
We can set Based on a Target State to Malignant, as we want to highlight this particular state.
Furthermore, we can set the cost of collecting observations via the Cost Editor, which can be started via the
Edit Costs button. This is helpful when certain observations are more costly to obtain than others.9
Unfortunately, our example is not ideally suited to illustrate this feature, as the FNA attributes are all col-
lected at the same time, rather than consecutively. However, one can imagine that in other contexts a physi-
cian will start the diagnosis process by collecting easy-to-obtain data, such as blood pressure, before pro-
ceeding to more elaborate (and more expensive) diagnostic techniques, such as performing an angiogram.
9 Beyond monetary measures, “cost” could reflect, for instance, the degree of pain associated with a surgical procedure.
Once the Adaptive Questionnaire is started, BayesiaLab presents the Monitor of the Target Node (red) and
its marginal probability, with the Target State highlighted. Again, as shown below, the Monitors are auto-
matically ordered in the sequence of their importance, from high to low, with regard to diagnosing the Tar-
get State of the Target Node.
This means that the ideal first piece of evidence is Uniformity of Cell Size. Let us suppose this metric is equal
to 3 (<=4.5) for the case under investigation. Upon setting this first observation, BayesiaLab will compute
the new probability distribution of the Target Node, given the evidence. We see that the probability of
Class=Malignant has increased to 58.53%. Given the evidence, BayesiaLab also recomputes the ideal new
order of questions and now presents Bare Nuclei as the next most relevant question.
Let us now assume that Bare Nuclei is not available for observation. We instead set the node Clump Thick-
ness to Clump Thickness<=4.5.
Given this latest piece of evidence, the probability distribution of Class is once again updated, as is the array
of questions. The small gray arrows inside the Monitors indicate how the probabilities have changed com-
pared to the prior iteration.
It is important to point out that not only the Target Node is updated as we set evidence. Rather, all nodes
are being updated upon setting evidence, reflecting the omnidirectional nature of inference within a Bayesian
network.
We can continue this process of updating until we have exhausted all available evidence, or until we have
reached an acceptable level of certainty regarding the diagnosis.
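A rough, frequency-based counterpart of this updating can be computed directly from the database by restricting it to the cases that match the evidence, as in the sketch below. The interval used for the middle state of Uniformity of Cell Size is a hypothetical stand-in for BayesiaLab’s actual bin boundaries, and raw subset frequencies will not exactly reproduce the network’s inference, which also relies on the learned structure and parameters.

```python
# df as defined in the earlier loading sketch.
# First piece of evidence: an illustrative stand-in for the "Uniformity of Cell Size <= 4.5 (2/3)" state.
in_middle_bin = (df["Uniformity of Cell Size"] > 2.5) & (df["Uniformity of Cell Size"] <= 4.5)
print(df.loc[in_middle_bin, "Class"].value_counts(normalize=True))

# A second observation narrows the subset further and shifts the Class distribution again.
also_low_clump = in_middle_bin & (df["Clump Thickness"] <= 4.5)
print(df.loc[also_low_clump, "Class"].value_counts(normalize=True))
```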
Target Interpretation Tree
Although its tree structure is not displayed, the Adaptive Questionnaire is a dynamic tree for seeking evi-
dence. More specifically, it is a tree that applies to one specific case given its observed evidence. The Target
Interpretation Tree is a static tree that is induced from all cases. As such it provides a more general ap-
proach in terms of searching for the optimum sequence of gathering evidence.
The Target Interpretation Tree can be started from the menu via Analysis | Target Interpretation Tree.
Upon starting this function, we need to set several options. We define the Search Stop Criteria, and set the
Maximum Size of Evidence to 3 and the Minimum Joint Probability to 1 (percent). Furthermore, we check
the Center on State box and select Malignant from the drop-down menu. This way, Malignant will be high-
lighted in each node of the to-be-generated tree.
By default, the tree is presented in a top-down format.
Often, it may be more convenient to change the layout to a left-to-right format via the Switch Position but-
ton in the upper lefthand corner of the window that contains the tree.
The following tree is presented in the left-to-right layout.
This tree prescribes in which sequence evidence should be sought for gaining the maximum amount of in-
formation towards a diagnosis. Going from left to right, we see how the probability distribution for Class
changes given the evidence set thus far.
The leftmost node in the tree, without any evidence set, shows the marginal probability distribution of
Class. The bottom panel of this node shows Uniformity of Cell Size as the most important evidence to seek.
The three branches that emerge from the node represent the possible states of Uniformity of Cell Size, i.e. the hard evidence we can observe. If we set evidence analogously to what we did in the Adaptive Questionnaire, we will choose the middle branch with the value Uniformity of Cell Size<=4.5 (2/3).
This evidence updates the probabilities of the Target State, now predicting a 58.53% probability of Class=Malignant. At the same time, we can see the next best piece of evidence to seek. Here, it is Bare Nuclei, which will provide the greatest information gain towards the diagnosis of Class. The information gain
is quantified with the Score displayed at the bottom of the node.
The Score is the Conditional Mutual Information of the node Bare Nuclei with regard to the Target Node,
divided by the cost of observing the evidence if the option Utilize Evidence Cost was checked. In our case, as
we did not check this option, the Score is equal to the Conditional Mutual Information.
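The same quantity can be approximated empirically by restricting the data to the cases matching the current evidence and recomputing the mutual information on that subset. This sketch reuses the helper and evidence filter defined in the earlier sketches; the Bare Nuclei bin boundaries are assumptions (only the 5.5 cut point appears later in the text), so the result will only be in the neighborhood of BayesiaLab’s 7.1%.

```python
import pandas as pd

# in_middle_bin and mutual_information_bits as defined in the earlier sketches.
subset = df.loc[in_middle_bin]
bare_nuclei_bins = pd.cut(subset["Bare Nuclei"], bins=[0, 2.5, 5.5, 10], labels=False)
print(round(mutual_information_bits(bare_nuclei_bins, subset["Class"]), 4))
```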
We can quickly verify the Score of 7.1% by running the Mapping function. First, we set the evidence on
Uniformity of Cell Size (<=4.5) and then run Analysis | Visual | Mapping.
The Mapping window features drop-down menus for Node Analysis and Arc Analysis. However, we are
only interested in Node Analysis, and we select Mutual Information with the Target Node as the metric to
be displayed.
The size of the nodes, beyond a fixed minimum size,10 is now proportional to the Mutual Information with
the Target Node. To see the specific values, we right-click on the background of the window and select Dis-
play Scores on Nodes from the Contextual Menu.
10 The minimum and maximum sizes can be changed via Edit Sizes from the Contextual Menu in the Mapping Window.
This shows us that, given Uniformity of Cell Size<=4.5, the Mutual Information of Bare Nuclei with the Target
Node is 0.0711, or 7.1%. Note that the node on which evidence has already been set, i.e. Uniformity of Cell
Size, shows a Conditional Mutual Information of 0.
So, learning Bare Nuclei will bring the highest information gain among the remaining variables. For in-
stance, if we now observed Bare Nuclei>5.5 (3/3), the probability of Class=Malignant would reach 98.33%.
Finally, BayesiaLab also reports the joint probability of each tree node, i.e. the probability that all pieces of
evidence in a branch, up to and including that tree node, would occur.
This says that the joint probability of Uniformity of Cell Size<=4.5 and Bare Nuclei>5.5 is 5.32%.
As opposed to this somewhat artificial illustration of a Target Interpretation Tree in the context of FNA-
based diagnosis, Target Interpretation Trees are often prepared for emergency situations, such as triage
classification, in which rapid diagnosis with constrained resources is essential. We believe that our example
still conveys the idea of “optimum escalation” in obtaining evidence towards a diagnosis.
Summary
By using Bayesian networks as the framework and BayesiaLab as the tool, we have shown a practical new
modeling and analysis approach based on the widely studied Wisconsin Breast Cancer Database.
BayesiaLab can rapidly machine-learn reliable models, even without prior domain knowledge and without a hypothesis. The classification performance of the BayesiaLab-generated Bayesian network models is on par with all studies on this topic that have been published to date. Beyond the predictive performance, BayesiaLab en-
ables a range of analysis and interpretation functions, which can help the researcher gain deeper domain
knowledge and perform inference more efficiently.
Appendix
Framework: The Bayesian Network Paradigm11
Acyclic Graphs & Bayes’s Rule
Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the
work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such
models are known as directed graphical models; within cognitive science and artificial intelligence, such
models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702-1761), whose
rule for updating probabilities in the light of new evidence is the foundation of the approach.
Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated
case of continuous probability distributions. In the discrete case, Bayes’ theorem relates the conditional and
marginal probabilities of events A and B, provided that the probability of B does not equal zero:
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
In Bayes’ theorem, each probability has a conventional name:
P(A) is the prior probability (or “unconditional” or “marginal” probability) of A. It is “prior” in the sense
that it does not take into account any information about  B; however, the event  B need not occur after
event A. In the nineteenth century, the unconditional probability P(A) in Bayes’s rule was called the “ante-
cedent” probability; in deductive logic, the antecedent set of propositions and the inference rule imply con-
sequences. The unconditional probability P(A) was called “a priori” by Ronald A. Fisher.
P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is de-
rived from or depends upon the specified value of B.
P(B|A) is the conditional probability of B given A. It is also called the likelihood.
P(B) is the prior or marginal probability of B, and acts as a normalizing constant.
Bayes’ theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.
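As a small numerical illustration of the rule, suppose a hypothetical test flags 97% of malignant cases and 5% of benign ones, and take the 34.5% malignant share of the Wisconsin database as the prior; the test characteristics are invented for this sketch and are not figures from the study.

```python
# Hypothetical likelihoods, for illustration only; the prior is the malignant share in the database.
p_malignant = 0.345                  # P(A), the prior
p_positive_given_malignant = 0.97    # P(B|A), the likelihood
p_positive_given_benign = 0.05

# P(B) as the normalizing constant, then Bayes' rule for the posterior P(A|B).
p_positive = (p_positive_given_malignant * p_malignant
              + p_positive_given_benign * (1 - p_malignant))
p_malignant_given_positive = p_positive_given_malignant * p_malignant / p_positive
print(round(p_malignant_given_positive, 3))  # about 0.911
```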
The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-
down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirec-
tional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian
networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier ad hoc rule-based schemes.
11 Adapted from Pearl (2000), used with permission.
The nodes in a Bayesian network represent variables
of interest (e.g. the temperature of a device, the gen-
der of a patient, a feature of an object, the occur-
rence of an event) and the links represent statistical
(informational) or causal dependencies among the
variables. The dependencies are quantified by condi-
tional probabilities for each node given its parents in
the network. The network supports the computation
of the posterior probabilities of any subset of vari-
ables given evidence about any other subset.
Compact Representation of the Joint Probability Distribution
“The central paradigm of probabilistic reasoning is
to identify all relevant variables x1, . . . , xN in the
environment [i.e. the domain under study], and
make a probabilistic model p(x1, . . . , xN) of their interaction [i.e. represent the variables’ joint probability
distribution].”
Bayesian networks are very attractive for this purpose as they can, by means of factorization, compactly
represent the joint probability distribution of all variables.
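In standard notation, this factorization is the chain rule over the graph’s parent sets:

P(x_1, \ldots, x_N) = \prod_{i=1}^{N} P\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)

where pa(x_i) denotes the set of parents of node x_i in the directed acyclic graph.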
“Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and
subsequently computing probabilities of interest, conditioned on this evidence. The rules of probability,
combined with Bayes’ rule make for a complete reasoning system, one which includes traditional deductive
logic as a special case.” (Barber, 2012)
References
Abdrabou, E. A.M.L, and A. E.B.M Salem. “A Breast Cancer Classifier Based on a Combination of Case-
Based Reasoning and Ontology Approach” (n.d.).
Conrady, Stefan, and Lionel Jouffe. “Missing Values Imputation -  A New Approach to Missing Values
Processing with Bayesian Networks,” January 4, 2012. http://bayesia.us/index.php/missingvalues.
El-Sebakhy, E. A, K. A Faisal, T. Helmy, F. Azzedin, and A. Al-Suhaim. “Evaluation of Breast Cancer Tu-
mor Classification with Unconstrained Functional Networks Classifier.” In The 4th ACS/IEEE Interna-
tional Conf. on Computer Systems and Applications, 281–287, 2006.
Hung, M. S, M. Shanker, and M. Y Hu. “Estimating Breast Cancer Risks Using Neural Networks.” Journal
of the Operational Research Society 53, no. 2 (2002): 222–231.
Karabatak, M., and M. C Ince. “An Expert System for Detection of Breast Cancer Based on Association
Rules and Neural Network.” Expert Systems with Applications 36, no. 2 (2009): 3465–3469.
Mangasarian, Olvi L, W. Nick Street, and William H Wolberg. “Breast Cancer Diagnosis and Prognosis via Linear Programming.” Operations Research 43 (1995): 570–577.
Mu, T., and A. K Nandi. “Breast Cancer Diagnosis from Fine-Needle Aspiration Using Supervised Compact Hyperspheres and Establishment of Confidence of Malignancy” (n.d.).
Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.
Pearl, Judea, and Stuart Russell. Bayesian Networks. UCLA Cognitive Systems Laboratory, November 2000. http://bayes.cs.ucla.edu/csl_papers.html.
Wolberg, W. H, W. N Street, D. M Heisey, and O. L Mangasarian. “Computer-Derived Nuclear Features Distinguish Malignant from Benign Breast Cytology.” Human Pathology 26, no. 7 (1995): 792–796.
Wolberg, William H, W. Nick Street, and O. L Mangasarian. “Machine Learning Techniques to Diagnose Breast Cancer from Image-Processed Nuclear Features of Fine Needle Aspirates” (n.d.). http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.127.2109.
Wolberg, William H, W. Nick Street, and Olvi L Mangasarian. “Breast Cytology Diagnosis Via Digital Im-
age Analysis” (1993). http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9894.
Contact Information
Bayesia USA
312 Hamlet’s End Way
Franklin, TN 37067
USA
Phone: +1 888-386-8383
info@bayesia.us
www.bayesia.us
Bayesia Singapore Pte. Ltd.
20 Cecil Street
#14-01, Equity Plaza
Singapore 049705
Phone: +65 3158 2690
info@bayesia.sg
www.bayesia.sg
Bayesia S.A.S.
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
Phone: +33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com
Copyright
© 2013 Bayesia S.A.S., Bayesia USA and Bayesia Singapore. All rights reserved.
Breast Cancer Diagnostics with Bayesian Networks

  • 1. Breast Cancer Diagnostics with Bayesian Networks Interpreting the Wisconsin Breast Cancer Database with BayesiaLab Stefan Conrady, stefan.conrady@conradyscience.com Dr. Lionel Jouffe, jouffe@bayesia.com May 20, 2013
  • 2. Table of Contents Case Study & Tutorial Introduction 4 Background 6 Wisconsin Breast Cancer Database 6 Notation 7 Model Development 8 Data Import 8 Unsupervised Learning 13 Model 1: Markov Blanket 16 Model 1: Performance 21 K-Folds Cross-Validation 23 Model 2: Augmented Markov Blanket 25 Model 2a: Performance 28 Structural Coefficient 32 Model 2b: Augmented Markov Blanket (SC=0.3) 38 Model 2b: Performance 39 Conclusion 40 Model Inference 41 Interactive Inference 42 Adaptive Questionnaire 43 Target Interpretation Tree 46 Summary 52 Appendix Framework: The Bayesian Network Paradigm 53 Acyclic Graphs & Bayes’s Rule 53 Compact Representation of the Joint Probability Distribution 54 References 55 Contact Information 56 Bayesia USA 56 Breast Cancer Diagnostics with Bayesian Networks ii www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 3. Bayesia Singapore Pte. Ltd. 56 Bayesia S.A.S. 56 Copyright 56 Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com iii
  • 4. Case Study & Tutorial Introduction Data classification is one of the most common tasks in the field of statistical analysis and countless methods have been developed for this purpose over time. A common approach is to develop a model based on known historical data, i.e. where the class membership of a record is known, and to use this generalization to predict the class membership for a new set of observations. Applications of data classifications permeate virtually all fields of study, including social sciences, engineer- ing, biology, etc. In the medical field, classification problems often appear in the context of disease identifi- cation, i.e. making a diagnosis about a patient’s condition. The medical sciences have a long history of de- veloping large body of knowledge, which links observable symptoms with known types of illnesses. It is the physician’s task to use the available medical knowledge to make inference based on the patient’s symptoms, i.e. to classify the medical condition in order to enable appropriate treatment. Over the last two decades, so-called medical expert systems have emerged, which are meant to support phy- sicians in their diagnostic work. Given the sheer amount of medical knowledge in existence today, it should not be surprising that significant benefits are expected from such machine-based support in terms of medical reasoning and inference. In this context, several papers by Wolberg, Street, Heisey and Managasarian became much-cited examples. They proposed an automated method for the classification of Fine Needle Aspirates1 through imaging proc- essing and machine learning with the objective of achieving a greater accuracy in distinguishing between malignant and benign cells for the diagnosis of breast cancer. At the time of their study, the practice of vis- ual inspection of FNA yielded inconsistent diagnostic accuracy. The proposed new approach would increase this accuracy reliably to over 95%. This research was quickly translated into clinical practice and has since been applied with continued success. As part of their studies in the late 1980s and 1990s, the research team generated what became known as the Wisconsin Breast Cancer Database, which contains measurements of hundreds of FNA samples and the as- sociated diagnoses. This database has been extensively studied, even outside the medical field. Statisticians and computer scientists have proposed a wide range of techniques for this classification problem and have continuously raised the benchmark for predictive performance. Our objective with this paper is to present Bayesian networks as a highly practical framework for working with this kind of classification problem. We intend to demonstrate how the BayesiaLab software can ex- Breast Cancer Diagnostics with Bayesian Networks 4 www.bayesia.us | www.bayesia.sg | www.bayesia.com 1 Fine needle aspiration (FNA) is a percutaneous (“through the skin”) procedure that uses a fine gauge needle (22 or 25 gauge) and a syringe to sample fluid from a breast cyst or remove clusters of cells from a solid mass. With FNA, the cellular material taken from the breast is usually sent to the pathology laboratory for analysis.
  • 5. tremely quickly, and relatively simply, create Bayesian network models that achieve the performance of the best custom-developed models, while only requiring a fraction of the development time. Furthermore, we wish to illustrate how Bayesian networks can help researchers and practitioners generate a deeper understanding of the underlying problem domain. Beyond merely producing predictions, we can use Bayesian networks to precisely quantify the importance of individual variables and employ BayesiaLab to help identify the most efficient path towards a diagnosis. BayesiaLab’s speed of model building, its excellent classification performance, plus the ease of interpretation provide researchers with a powerful new tool. Bayesian networks and BayesiaLab have thus become a driver in accelerating research. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 5
  • 6. Background To provide context for this study, we quote Mangasarian, Street and Wolberg (1994), who conducted the original research related breast cancer diagnosis with digital image processing and machine learning: Most breast cancers are detected by the patient as a lump in the breast. The majority of breast lumps are benign, so it is the physician’s responsibility to diagnose breast cancer, that is, to distin- guish benign lumps from malignant ones. There are three available methods for diagnosing breast cancer: mammography, FNA with visual interpretation and surgical biopsy. The reported sensitiv- ity, i.e. ability to correctly diagnose cancer when the disease is present of mammography varies from 68% to 79%, of FNA with visual interpretation from 65% to 98%, and of surgical biopsy close to 100%. Therefore mammography lacks sensitivity, FNA sensitivity varies widely, and surgical biopsy, al- though accurate, is invasive, time consuming and costly. The goal of the diagnostic aspect of our research is to develop a relatively objective system that diagnoses FNAs with an accuracy that ap- proaches the best achieved visually. Wisconsin Breast Cancer Database This breast cancer database was created through the clinical work of Dr. William H. Wolberg at the Univer- sity of Wisconsin Hospitals in Madison. As of 1992, Dr. Wolberg had collected 699 instances of patient diagnoses in this database, consisting of two classes: 458 benign cases (65.5%) and 241 malignant cases (34.5%). The following eleven attributes2 are included in the database: 1. Sample code number 2. Clump Thickness (1 - 10) 3. Uniformity of Cell Size (1 - 10) 4. Uniformity of Cell Shape (1 - 10) 5. Marginal Adhesion (1 - 10) 6. Single Epithelial Cell Size (1 - 10) 7. Bare Nuclei (1 - 10) 8. Bland Chromatin (1 - 10) 9. Normal Nucleoli (1 - 10) 10. Mitoses (1 - 10) 11. Class (benign/malignant) Breast Cancer Diagnostics with Bayesian Networks 6 www.bayesia.us | www.bayesia.sg | www.bayesia.com 2 “Attribute” and “variable” are used interchangeably throughout the paper.
  • 7. Attributes #2 through #10 were computed from digital images of fine needle aspirates (FNA) of breast masses. These features describe the characteristics of the cell nuclei in the image. The attribute #11, Class, was established via subsequent biopsies or via long-term monitoring of the tumor. We will not go into detail here regarding the definition of the attributes and their measurement. Rather, we refer the reader to papers referenced in the bibliography. The Wisconsin Breast Cancer Database is available to any interested researcher from the UC Irvine Machine Learning Repository.3 We use this database in its original format without any further transformation, so our results can be directly compared to dozens of methods that have been developed since the original study. Notation To clearly distinguish between natural language, software-specific functions and study-specific variable names, the following notation is used: • BayesiaLab-specific functions, keywords, commands, etc., are capitalized and printed in bold type. You can look up such terms in the BayesiaLab Library (library.bayesia.com) for more details. • The names of variables, attributes, nodes, and node states are capitalized and italicized. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 7 3 UC Irvine Machine Learning Repository website: http://archive.ics.uci.edu/ml/
  • 8. Model Development Data Import Our modeling process begins with importing the database,4 which is formatted as a text file with comma- separated values. Therefore, we start with Data | Open Data Source | Text File. The Data Import Wizard then guides us through the required steps. In the first dialogue box of the Data Import Wizard, we click on Define Typing and specify that we wish to set aside a Test Set from the data- base. Breast Cancer Diagnostics with Bayesian Networks 8 www.bayesia.us | www.bayesia.sg | www.bayesia.com 4 If we exclude the variable Sample code number, this database can also be used with the publicly-available evaluation version of BayesiaLab, which is limited to a maximum of ten nodes. Deleting this variable does not affect the workflow or the results of the analysis.
  • 9. Following common practice, we will randomly select 20% of the 699 records as Test Set, and, conse- quently, the remaining 80% will serve as our Learning Set set.5 In the next step, the Data Import Wizard will suggest the data format for each variable. Attributes 2 through 10 are identified as continuous variables and Class is read as a discrete variable. Only for the first variable, Sample code number, we have to specify Row Identifier, so it is not mistaken for a continuous pre- dictor variable. In the next step, the Information Panel reports that we have a total of 16 missing values in the entire data- set. We can also see that the column Bare Nuclei is labeled with a small question mark, indicating the pres- ence of missing values in this particular column. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 9 5 “Learning/Test Set” and “Learning/Test Sample” are used interchangeably in this paper.
  • 10. We now need to specify the type of Missing Values Imputation. Given the small size of the dataset, and the small number of missing values, we will choose the Structural EM method.6 A critical element of the data import process is the discretization of all continuous variables. On the next screen we click Select All Continuous to apply the same discretization algorithm across all continuous vari- ables. Alternatively, we could choose the type of discretization individually by variable. However, we will not discuss this option any further in this paper. As the objective of this exercise is classification, we choose the Decision Tree algorithm from the drop-down menu in the Multiple Discretization panel. This discretizes each variable for a maximum information gain with respect to the Target Class. Breast Cancer Diagnostics with Bayesian Networks 10 www.bayesia.us | www.bayesia.sg | www.bayesia.com 6 For more details on missing values imputation with Bayesian network, see Conrady and Jouffe (2012).
  • 11. Bayesian networks are entirely non-parametric, probabilistic models, and for their estimation they require a certain minimum number of observations. To help us with the selection of the number of discretization lev- els (or Intervals), we use the heuristic of five observations per parameter and probability cell. Given that we have a relatively small database with only 560 observations,7 three discretization intervals for each variable appear to be an appropriate choice. If we used a higher number of Intervals, we would need more observa- tions for a reliable estimation of the parameters. Upon clicking Finish, we will immediately see a representation of the newly imported database in the form of a fully unconnected Bayesian network in the Graph Panel. Each variable is now represented as a blue node in the graph panel of BayesiaLab. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 11 7 560 cases are in the training set (80%) and 139 are in the test set (20%).
  • 12. The question mark symbol, which is associated with the Bare Nuclei node, indicates that there are missing values for this variable. Hovering over the question mark with the mouse pointer while pressing the “i” key will show the number of missing values. Optionally, BayesiaLab can display an import report summarizing the obtained discretizations for all vari- ables. Breast Cancer Diagnostics with Bayesian Networks 12 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 13. Unsupervised Learning When exploring a new domain, we generally recommended performing Unsupervised Learning on the newly imported database. This is also the case here, even though our principal objective is predictive modeling, for which Supervised Learning will later be the main tool. Learning | Unsupervised Structural Learning | EQ initiates the EQ Algorithm, which is suitable for the initial review of the database. For larger databases with significantly more variables, the Maximum Weight Span- ning Tree is a very fast algorithm and can be used instead. Upon learning, the initial Bayesian network looks like this: In its “raw” form, the crossing arcs make this network somewhat tricky to read. BayesiaLab has a number of layout algorithms that can quickly “disentangle” such a network and produce a much more user-friendly format. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 13
  • 14. We can select View | Automatic Layout or alternative use the shortcut “P”. Now we can visually review the learned network structure and compare it to our own domain knowledge. This allows for a “sanity check” of the database and the variables, and it may highlight any inconsistencies. Beyond visually inspecting the network structure, BayesiaLab allows us to visualize the quantitative part of this network. To do this, we first need to switch into the Validation Mode by clicking on the highlighted button in the lower-lefthand corner of the Graph Panel, or by alternatively using the “F5” key as a shortcut. We can now display the Pearson Correlation between the nodes that are directly linked in the graph by se- lecting Analysis | Visual | Pearson’s Correlation from the menu. Breast Cancer Diagnostics with Bayesian Networks 14 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 15. Each arc’s thickness is now proportional to the Pearson Correlation between the connected nodes. Also, the blue and red colors indicate positive and negative correlations respectively. Any unexpected sign of correla- tions would thus become apparent very quickly. In our example, we only have positive correlations and thus all arcs are blue. Additionally, callouts indicate that further information can be displayed. We can opt to display this numerical information via View | Display Arc Comments. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 15
  • 16. This function is also available via a button in the menu: Model 1: Markov Blanket Now that we have performed an initial review of the dataset with the Unsupervised Learning step, we can return to the Modeling Mode by clicking on the corresponding button in the lower lefthand corner of the Breast Cancer Diagnostics with Bayesian Networks 16 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 17. screen or using the shortcut “F4”.8 This allows us to proceed to the modeling stage. Given our objective of predicting the state of the variable Class, i.e. benign versus malignant, we will define Class as the Target Variable by right-clicking on the node and selecting Set as Target Variable from the contextual menu. Alternatively, we can double-click on Class while holding the shortcut “T” pressed. We need to specify this explicitly, so the subsequent Supervised Learning algorithm can use Class as the dependent variable. This setting is confirmed by the “bullseye”appearance of the new Target Node. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 17 8 We will mostly omit further references to switching between Modeling Mode (F4) and Validation Mode (F5). The required modes can generally be inferred from the context.
  • 18. Upon this selection, all Supervised Learning algorithms become available under Learning | Supervised Learn- ing. In many cases, the Markov Blanket algorithm is a good starting point for a predictive model. This algorithm is extremely fast and can even be applied to databases with thousands of variables and millions of records, even though database size is not a concern in this particular study. Upon learning the Markov Blanket for Class, and once again applying the Automatic Layout, the resulting Bayesian network looks as follows: Markov Blanket Definition The Markov Blanket for a node A is the set of nodes composed of A’s parents, its children, and its children’s other parents (=spouses). The Markov Blanket of the node A contains all the variables, which, if we know their states, will shield the node A from the rest of the network. This means that the Markov Blanket of a node is the only knowledge needed to predict the behavior of that node A. Learning a Markov Blanket selects relevant predictor variables, which is particu- larly helpful when there is a large number of variables in the database. In fact, this can also serve as a highly-efficient variable selection method in preparation for other types of modeling, e.g. neural net- works. Breast Cancer Diagnostics with Bayesian Networks 18 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 19. This network suggests that Class has a direct probabilistic relationship with all variables except Marginal Adhesion and Single Epithelial Cell Size, which are both disconnected. The lack of their connection with the Target indicates that these nodes are independent given the nodes in the Markov Blanket. Beyond distinguishing between predictors (connected nodes) and non-predictors (disconnected nodes), we can further examine the relationship versus the Target Node Class by highlighting the Mutual Information of the arcs connecting the nodes. This function is accessible within the Validation Mode via Analysis | Vis- ual | Arcs’ Mutual Information. Note We can see on the graph learned earlier with the EQ algorithm that Uniformity of Cell Shape is the node that makes these two nodes conditionally independent of Class. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 19
  • 20. We will also go ahead and immediately select View | Display Arc Comments. The thickness of the arcs is now proportional to the Mu- tual Information, i.e. the strength of the relationship be- tween the nodes. Intuitively, Mutual Information measures the information that X and Y share: it measures how much knowing one of these variables reduces our uncertainty about the other. For example, if X and Y are independent, then knowing X does not provide any information about Y and vice versa, so their Mutual Information is zero. At the other extreme, if X and Y are identical then all information conveyed by X is shared with Y: knowing X determines the value of Y and vice versa. Formal Definition of Mutual Information I(X;Y ) = p(x,y)log p(x,y) p(x)p(y) ⎛ ⎝⎜ ⎞ ⎠⎟ x∈X ∑ y∈Y ∑ Breast Cancer Diagnostics with Bayesian Networks 20 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 21. In the top part of the comment box attached to each arc, the Mutual Information of the arc is shown. Expressed as a percentage and highlighted in blue, we see the relative Mutual Informa- tion in the direction of the arc (parent node ➔ child node). And, at the bottom, we have the relative Mutual Information in the opposite direction of the arc (child node ➔ parent node). Model 1: Performance As we are not equipped with specific domain knowledge about the variables, we will not further interpret these relationships but rather run an initial test regarding the Network Performance. We want to know how well this Markov Blanket model can predict the states of the Class variable, i.e. Benign versus Malignant. This test is available via Analysis | Network Performance | Target. Using our previously defined Test Set for validating our model, we obtain the following, rather encouraging results: Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 21
  • 22. Of the 88 Benign cases of the test set, 3 were incorrectly identified, which corresponds to a false positive rate of 3.41%. More importantly though, of the 51 Malignant cases, all were identified correctly (true posi- tives) with no false negatives. The overall performance can be expressed as the Total Precision, which is computed as total number of correct predictions (true positives + true negatives) divided by the total num- ber of cases in the Test Set , i.e. (85 +51) ÷ 139 = 97.84%. As the selection of the Learning Set and the Test Set during the data import process is random, BayesiaLab may learn slightly different networks based on different Learning Sets after each data import. Hence, your own network performance evaluation could deviate from what is shown above, unless you chose the same Fixed Seed for the random number generator when you defined Data Typing during the data import proc- ess. Breast Cancer Diagnostics with Bayesian Networks 22 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 23. K-Folds Cross-Validation To mitigate the sampling artifacts that may occur in a one-off test, we can systematically learn networks on a sequence of different subsets and then aggregate the test results. Analogous to the original papers on this topic, we will perform K-Folds Cross Validation, which will iteratively select K different Learning Sets and Test Sets and then, based on those, learn the networks and test their performance. The Cross Validation can then be started via Tools | Cross Validation | Targeted Evaluation | K-Folds. We use the same learning algorithm as before, i.e. the Markov Blanket, and we choose 10 as the number of sub-samples to be analyzed. Of the total dataset of 699 cases, each of the ten iterations will create a Test Set of 69 randomly drawn samples, and use the remaining 630 as the Learning Set. This means that BayesiaLab learns one network per Learning Set and then tests the performance on the respective Test Set. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 23
  • 24. The summary, including the synthesized results, is shown below. These results confirm the good performance of this model. The Total Precision is 97%, with a false negative rate of 2%. This means 2% of the cases were predicted as Benign, while the were actually Malignant. Breast Cancer Diagnostics with Bayesian Networks 24 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 25. Clicking Comprehensive Report produces a summary, which can also be saved in HTML format. This is convenient for subsequent editing, as the generated HTML file can be opened and edited as a spreadsheet. Value Benign Malignant Gini Index 33.95% 64.59% Relative Gini Index 98.50% 98.55% Mean Lift 1.42 2.04 Relative Lift Index 99.74% 99% Value Benign (458) Malignant (241) Benign (446) 441 5 Malignant (253) 17 236 Value Benign (458) Malignant (241) Benign (446) 98.88% 1.12% Malignant (253) 6.72% 93.28% Value Benign (458) Malignant (241) Benign (446) 96.29% 2.07% Malignant (253) 3.71% 97.93% R: 0.93817485358 R2: 0.88017205588 Occurrences Reliability Precision Sampling Method: K-Folds Learning Algorithm: Markov Blanket Target: Class Relative Gini Index Mean: 98.53% Relative Lift Index Mean: 99.37% Total Precision: 96.85% As our Markov Blanket modeling is already performing at a level comparable to the models that have been published in the literature, we might be tempted to conclude our analysis at this point. However, we will attempt to see whether further performance improvements are possible. Model 2: Augmented Markov Blanket BayesiaLab offers an extension to the Markov Blanket algorithm, namely the Augmented Markov Blanket, which performs an Unsupervised Learning Algorithm on the nodes in the Markov Blanket. This allows identifying influence paths between the predictor variables and can potentially help improve the prediction performance. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 25
  • 26. This algorithm can be started via Learning | Supervised Learning | Augmented Markov Blanket. As expected, the resulting network is somewhat more complex than the standard Markov Blanket. If we save the original Markov Blanket and the new Augmented Markov Blanket under different file names, we can use Tools | Compare | Structure to highlight the differences between both. Given that the addition of three arcs is immediately visible, this function may appear as overkill for our particular example. However, Breast Cancer Diagnostics with Bayesian Networks 26 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 27. in more complex situation, Structure Comparison can be rather helpful, and so we will spell out the details. We choose the original network and the newly learned network as the Reference Network and the Com- parison Network respectively. Upon selection, a table provides a list of common arcs and those arcs that have been added in the Compari- son Network, which was learned with the Augmented Markov Blanket algorithm: Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 27
  • 28. Clicking Charts provides a visual representation of these differences. The additional arcs, compared to the original Markov Blanket network, are now highlighted in blue. Conversely, had any arcs been deleted, those would be shown in red. Model 2a: Performance We now proceed to performance evaluation with this new Augmented Markov Blanket network, analogous to the Markov Blanket model: Analysis | Network Performance | Target Given that we had originally split the dataset into a Learning Set and a Test Set, the Network Performance evaluation is once again carried out separately on both subsets. Breast Cancer Diagnostics with Bayesian Networks 28 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 29. Interestingly, the performance on the Test Set is better than on the Learning Set. This indicates that overfit- ting is not a problem here. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 29
  • 30. A summary for either subset can be saved by clicking Comprehensive Report. The out-of-sample Test Set report is generally the more important one. It is shown below. Value Benign Malignant Gini Index 36.52% 63.01% Relative Gini Index 99.53% 99.53% Mean Lift 1.45 1.99 Relative Lift Index 99.92% 99.79% Value Benign (88) Malignant (51) Benign (86) 86 0 Malignant (53) 2 51 Value Benign (88) Malignant (51) Benign (86) 100% 0% Malignant (53) 3.77% 96.23% Value Benign (88) Malignant (51) Benign (86) 97.73% 0% Malignant (53) 2.27% 100% Occurrences Reliability Precision Target: Class Relative Gini Index Mean: 99.53% Relative Lift Index Mean: 99.85% Total Precision: 98.56% R: 0.97499525394 R2: 0.95061574521 As with the earlier model, we repeat K-Folds Cross Validation for the Augmented Markov Blanket. The results are shown below, first as a screenshot and then as a spreadsheet generated via Comprehensive Re- port. Breast Cancer Diagnostics with Bayesian Networks 30 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 31. Value Benign Malignant Gini Index 33.95% 64.58% Relative Gini Index 98.50% 98.55% Mean Lift 1.42 2.04 Relative Lift Index 99.75% 98.99% Value Benign (458) Malignant (241) Benign (448) 442 6 Malignant (251) 16 235 Value Benign (458) Malignant (241) Benign (448) 98.66% 1.34% Malignant (251) 6.37% 93.63% Value Benign (458) Malignant (241) Benign (448) 96.51% 2.49% Malignant (251) 3.49% 97.51% R: 0.93877413371 R2: 0.88129687412 Occurrences Reliability Precision Sampling Method: K-Folds Learning Algorithm: Augmented Markov Blanket Target: Class Relative Gini Index Mean: 98.52% Relative Lift Index Mean: 99.37% Total Precision: 96.85% Despite the greater complexity of this new network, we do not see an improvement in any of the perform- ance measures. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 31
  • 32. Structural Coefficient Up to this point, the difference in network complexity was a only function of the choice of learning algo- rithm. We will now address the Structural Coefficient (SC), which is the only parameter adjustable across all the learning algorithms in BayesiaLab. In essence, this parameter determines a kind of significance thresh- old, and thus it influences the degree of complexity of the induced networks. By default, this Structural Coefficient is set to 1, which reliably prevents the learning algorithms from over- fitting the model to the data. In studies with relatively few observations, the analyst’s judgment is needed for determining a potential downward adjustment of this parameter. On the other hand, when data sets are very large, increasing the parameter to values higher than 1 will help manage the network complexity. Given the fairly simple network structure of the Markov Blanket model, complexity was of no concern. Augmented Markov Blanket is more complex, but still very manageable. The question is, could a more complex network provide greater precision without overfitting? To answer this question, we will perform a Structural Coefficient Analysis, which generates several metrics that help in making the trade-off between complexity and precision: Tools | Cross Validation | Structural Coefficient Analysis BayesiaLab prompts us to specify the range of the Structural Coefficient to be examined and the number of iterations to be performed. It is worth noting that the Minimum Structural Coefficient should not be set to 0, or even close to 0. A value of 0 would imply a fully connected network, which can take a very long time to learn depending on the number of variables, or even exceed the memory capacity of the computer run- ning BayesiaLab. Number of Iterations determines the interval steps to be taken within the specified range of the Structural Coefficient. Given the relatively light computational load, we choose 25 iterations. With more complex models, we might be more conservative, as each iteration re-learns and re-evaluates the network. Further- more, we select to compute all metrics. Breast Cancer Diagnostics with Bayesian Networks 32 www.bayesia.us | www.bayesia.sg | www.bayesia.com
  • 33. The resulting report shows how the network changes as a function of the Structural Coefficient. This can be interpreted as the degree of confidence the analyst should have in any particular arc in the structure. Breast Cancer Diagnostics with Bayesian Networks www.bayesia.us | www.bayesia.sg | www.bayesia.com 33
  • 34. Clicking Graphs, will show a synthesized network, consisting of all structures generated during the iterative learning process. The reference structure is represented by black arcs, which show the original network learned immediately prior to the start of the Structural Coefficient Analysis. The blue-colored arcs are not contained in the refer- ence structure, but they appear in networks that have been learned as a function of the different Structural Coefficients (SC). The thickness of the arcs is proportional to the frequency of individual arcs existing in the learned networks. More importantly for us, however, is determining the correct level of network complexity for a reliable and accurate prediction performance while avoiding overfitting the data. We can plot several different metrics in this context by clicking Curve. Breast Cancer Diagnostics with Bayesian Networks 34 www.bayesia.us | www.bayesia.sg | www.bayesia.com
Typically, the "elbow" of the L-shaped curve above identifies a suitable value for the Structural Coefficient (SC). More formally, we would look for the point on the curve where the second derivative is maximized. By visual inspection, an SC value of around 0.3 appears to be a good candidate for that point. The portion of the curve where SC values approach 0 shows the characteristic pattern of overfitting, which is to be avoided.

We will also plot the Target's Precision alone as a function of the SC. On the surface, the curve for the Learning Set resembles an L-shape too, but the curve moves only within roughly 2 percentage points, i.e. between 97% and 99%. For practical purposes, this means that the curve is virtually flat.
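To make the "maximum second derivative" criterion concrete, here is a minimal numerical sketch, independent of BayesiaLab; the curve values are invented solely to reproduce the qualitative L-shape described above.

```python
# Minimal sketch of the elbow heuristic: pick the SC value where the discrete
# second derivative of a complexity metric is largest. Values are illustrative.
import numpy as np

sc_values = np.linspace(0.1, 1.0, 10)          # examined SC range
structure_metric = np.array([95, 60, 38, 27, 22, 19, 17, 16, 15, 15], float)

# Discrete second derivative; the elbow is where curvature is greatest.
second_derivative = np.gradient(np.gradient(structure_metric, sc_values), sc_values)
elbow_sc = sc_values[np.argmax(second_derivative)]
print(f"Candidate Structural Coefficient near the elbow: {elbow_sc:.1f}")   # 0.3 here
```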
[Figure: Target's Precision as a function of the Structural Coefficient]
As a result, the Structure/Target's Precision Ratio, i.e. Structure ÷ Target's Precision, is primarily a function of the numerator, i.e. the Structure, as the denominator, Target's Precision, is nearly constant across a wide range of SC values, as per the graph above.

If both Learning and Test Sets are available, a Validation Measure γ can be computed to help choose the most appropriate Structural Coefficient. This measure is based on the Test Set's mean negative log-likelihood (returned by the network learned from the Learning Set) and on the variances of the negative log-likelihood of the Test Set and the Learning Set (both returned by the network learned from the Learning Set):

γ = μ_LL,Test × max(1, σ²_LL,Test / σ²_LL,Learning)

The range between roughly 0.3 and 0.6, i.e. the section around the minimum of the curve, suggests suitable values for the Structural Coefficient.
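As a minimal sketch of this computation, outside BayesiaLab and with invented per-record log-likelihood values, the measure can be evaluated as follows for one candidate SC value:

```python
# Minimal sketch of the validation measure defined above, computed from
# per-record negative log-likelihoods of the Test and Learning Sets under
# the network learned from the Learning Set. The arrays are hypothetical.
import numpy as np

nll_test = np.array([3.1, 2.8, 3.4, 2.9, 3.6])      # negative log-likelihoods, Test Set
nll_learning = np.array([2.7, 2.9, 3.0, 2.8, 3.1])  # negative log-likelihoods, Learning Set

gamma = nll_test.mean() * max(1.0, nll_test.var() / nll_learning.var())
print(f"Validation measure for this Structural Coefficient: {gamma:.3f}")
# The SC value minimizing this measure across the examined range is preferred.
```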
Model 2b: Augmented Markov Blanket (SC=0.3)

Given the results from the Structural Coefficient Analysis, we now wish to relearn the network with an SC value of 0.3. The SC value can be set by right-clicking on the background of the Graph Panel and then selecting Edit Structural Coefficient from the Contextual Menu, or alternatively via the menu, i.e. Edit | Edit Structural Coefficient.

Once we relearn the network, using the same Augmented Markov Blanket algorithm as before, we obtain a more complex network. The key question is, will this increase in complexity improve the performance or perhaps be counterproductive?
Model 2b: Performance

We repeat the Network Performance Analysis and generate the Comprehensive Report for the Test Set.

Target: Class
Relative Gini Index Mean: 99.75%
Relative Lift Index Mean: 99.93%
Total Precision: 98.56%
R: 0.97908818201
R²: 0.95861366815

                      Benign      Malignant
Gini Index            36.60%      63.15%
Relative Gini Index   99.75%      99.75%
Mean Lift             1.45        1.99
Relative Lift Index   99.96%      99.90%

Occurrences           Benign (88)   Malignant (51)
Benign (86)           86            0
Malignant (53)        2             51

Reliability           Benign (88)   Malignant (51)
Benign (86)           100%          0%
Malignant (53)        3.77%         96.23%

Precision             Benign (88)   Malignant (51)
Benign (86)           97.73%        0%
Malignant (53)        2.27%         100%
Secondly, we perform K-Folds Cross-Validation:

Sampling Method: K-Folds
Learning Algorithm: Augmented Markov Blanket
Target: Class
Relative Gini Index Mean: 98.28%
Relative Lift Index Mean: 99.37%
Total Precision: 96.71%
R: 0.94052337963
R²: 0.88458422762

                      Benign      Malignant
Gini Index            33.86%      64.42%
Relative Gini Index   98.28%      98.28%
Mean Lift             1.42        2.04
Relative Lift Index   99.69%      99.05%

Occurrences           Benign (458)   Malignant (241)
Benign (447)          441            6
Malignant (252)       17             235

Reliability           Benign (458)   Malignant (241)
Benign (447)          98.66%         1.34%
Malignant (252)       6.75%          93.25%

Precision             Benign (458)   Malignant (241)
Benign (447)          96.29%         2.49%
Malignant (252)       3.71%          97.51%

Conclusion

All models reviewed, Model 1 (Markov Blanket), Model 2a (Augmented Markov Blanket, SC=1), and Model 2b (Augmented Markov Blanket, SC=0.3), have performed at very similar levels in terms of classification performance. Total Precision and false positives/negatives are shown as the key metrics in the summary table below.

                                      Test Set (n=139)                                 10-Fold Cross-Validation (n=699)
                                      Total Precision  False Positives  False Negatives  Total Precision  False Positives  False Negatives
Markov Blanket (SC=1)                 97.84%           3                0                 96.85%           17               5
Augmented Markov Blanket (SC=1)       98.56%           2                0                 96.85%           16               6
Augmented Markov Blanket (SC=0.3)     98.56%           2                0                 96.71%           17               6

Reestimating these models with more observations could potentially change the results and might more clearly differentiate the classification performance. For now, we select the Augmented Markov Blanket (SC=1), and it will serve as the basis for the next section of this paper, Model Inference.
Model Inference

Without further discussion of the merits of each model specification, we will now show how the learned Augmented Markov Blanket model can be applied in practice and used for inference. First, we need to switch to Validation Mode (F5).

We can now bring up all the Monitors in the Monitor Panel by selecting all the nodes (Ctrl+A) and double-clicking on any one of them. More conveniently, the Monitors can be displayed by right-clicking inside the Monitor Panel and selecting Sort | Target Correlation from the Contextual Menu. Alternatively, we can do the same via Monitor | Sort | Target Correlation.

Monitors are then automatically created for all the nodes correlated with the Target Node. The Monitor of the Target Node is placed first in the Monitor Panel, followed by the other Monitors in order of their correlation with the Target Node, from highest to lowest.
Interactive Inference

For instance, we can now use BayesiaLab to review the individual predictions made with the model. This feature is called Interactive Inference and can be accessed from the menu via Inference | Interactive Inference. We also have a choice of using either the Learning Set or the Test Set for inference. For our purposes, we choose the Test Set.

The Navigation Bar allows scrolling through each record of the Test Set. Record #0 can be seen below, with all the associated observations highlighted in green. Given the observations shown, the model predicts a 99.97% probability that Class is Benign (the Monitor of the Target Node is highlighted in red).
Most cases are rather clear-cut, as above, with probabilities for either diagnosis around 99% or higher. However, there are a number of exceptions, such as case #11. Here, the probability of malignancy is approximately 75%.

Adaptive Questionnaire

In situations when only individual cases are under review, rather than a batch of cases from a database, BayesiaLab can provide case-by-case diagnosis support with the Adaptive Questionnaire.

For a Target Node with more than two states, the Adaptive Questionnaire requires that we define a Target State. Setting the Target State allows BayesiaLab to compute Binary Mutual Information and then focus on the defined Target State.
Technically, setting the Target State is not necessary in our particular example, as the Target Node is binary. The Adaptive Questionnaire can be started from the menu via Inference | Adaptive Questionnaire. We set Based on a Target State to Malignant, as we want to highlight this particular state.

Furthermore, we can set the cost of collecting observations via the Cost Editor, which can be started via the Edit Costs button. This is helpful when certain observations are more costly to obtain than others.9 Unfortunately, our example is not ideally suited to illustrate this feature, as the FNA attributes are all collected at the same time, rather than consecutively. However, one can imagine that in other contexts a physician will start the diagnosis process by collecting easy-to-obtain data, such as blood pressure, before proceeding to more elaborate (and more expensive) diagnostic techniques, such as performing an angiogram.

9 Beyond monetary measures, "cost" could reflect, for instance, the degree of pain associated with a surgical procedure.
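Conceptually, the questionnaire ranks candidate observations by the information they contribute about the Target Node, divided by their observation cost. The following minimal sketch illustrates that ranking on a made-up joint distribution with two hypothetical findings (symptom_a, symptom_b) and arbitrary costs; it is an illustration of the idea, not BayesiaLab's implementation.

```python
# Conceptual sketch: rank candidate observations by mutual information with
# the target, divided by observation cost. Distribution and costs are invented.
from itertools import product
import math

# P(symptom_a, symptom_b, target) for three binary variables, as a flat table.
joint = {(a, b, t): p for (a, b, t), p in zip(
    product([0, 1], repeat=3),
    [0.30, 0.05, 0.10, 0.05, 0.05, 0.10, 0.05, 0.30],
)}
costs = {"symptom_a": 1.0, "symptom_b": 3.0}   # e.g. a cheap vs. an expensive test

def mutual_information_with_target(var_index):
    """I(X; Target) for the variable at position var_index (0 or 1)."""
    mi = 0.0
    for x, t in product([0, 1], repeat=2):
        p_xt = sum(p for k, p in joint.items() if k[var_index] == x and k[2] == t)
        p_x = sum(p for k, p in joint.items() if k[var_index] == x)
        p_t = sum(p for k, p in joint.items() if k[2] == t)
        if p_xt > 0:
            mi += p_xt * math.log2(p_xt / (p_x * p_t))
    return mi

# Higher score = more information per unit of cost = asked earlier.
scores = {name: mutual_information_with_target(i) / costs[name]
          for i, name in enumerate(["symptom_a", "symptom_b"])}
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```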
Once the Adaptive Questionnaire is started, BayesiaLab presents the Monitor of the Target Node (red) and its marginal probability, with the Target State highlighted. Again, as shown below, the Monitors are automatically ordered in the sequence of their importance, from high to low, with regard to diagnosing the Target State of the Target Node. This means that the ideal first piece of evidence is Uniformity of Cell Size.

Let us suppose this metric is equal to 3 (<=4.5) for the case under investigation. Upon setting this first observation, BayesiaLab will compute the new probability distribution of the Target Node, given the evidence. We see that the probability of Class=Malignant has increased to 58.53%. Given the evidence, BayesiaLab also recomputes the ideal new order of questions and now presents Bare Nuclei as the next most relevant question.

Let us now assume that Bare Nuclei is not available for observation. We instead set the node Clump Thickness to Clump Thickness<=4.5.
Given this latest piece of evidence, the probability distribution of Class is once again updated, as is the array of questions. The small gray arrows inside the Monitors indicate how the probabilities have changed compared to the prior iteration.

It is important to point out that not only the Target Node is updated as we set evidence. Rather, all nodes are updated upon setting evidence, reflecting the omnidirectional nature of inference within a Bayesian network. We can continue this process of updating until we have exhausted all available evidence, or until we have reached an acceptable level of certainty regarding the diagnosis.

Target Interpretation Tree

Although its tree structure is not displayed, the Adaptive Questionnaire is a dynamic tree for seeking evidence. More specifically, it is a tree that applies to one specific case, given its observed evidence. The Target Interpretation Tree, by contrast, is a static tree that is induced from all cases. As such, it provides a more general approach to searching for the optimal sequence of gathering evidence.
The Target Interpretation Tree can be started from the menu via Analysis | Target Interpretation Tree. Upon starting this function, we need to set several options. We define the Search Stop Criteria and set the Maximum Size of Evidence to 3 and the Minimum Joint Probability to 1 (percent). Furthermore, we check the Center on State box and select Malignant from the drop-down menu. This way, Malignant will be highlighted in each node of the to-be-generated tree.

By default, the tree is presented in a top-down format. Often, it may be more convenient to change the layout to a left-to-right format via the Switch Position button in the upper left-hand corner of the window that contains the tree.
The following tree is presented in the left-to-right layout. This tree prescribes the sequence in which evidence should be sought to gain the maximum amount of information towards a diagnosis. Going from left to right, we see how the probability distribution for Class changes given the evidence set thus far. The leftmost node in the tree, without any evidence set, shows the marginal probability distribution of Class. The bottom panel of this node shows Uniformity of Cell Size as the most important evidence to seek.
The three branches that emerge from the node represent the possible states of Uniformity of Cell Size, i.e. the hard evidence we can observe. If we set evidence analogously to what we did in the Adaptive Questionnaire, we will choose the middle branch with the value Uniformity of Cell Size<=4.5 (2/3).

This evidence updates the probabilities of the Target State, now predicting a 58.53% probability of Class=Malignant. At the same time, we can see the next best piece of evidence to seek. Here, it is Bare Nuclei, which will provide the greatest information gain towards the diagnosis of Class.

The information gain is quantified with the Score displayed at the bottom of the node. The Score is the Conditional Mutual Information of the node Bare Nuclei with regard to the Target Node, divided by the cost of observing the evidence if the option Utilize Evidence Cost was checked. In our case, as we did not check this option, the Score is equal to the Conditional Mutual Information.

We can quickly verify the Score of 7.1% by running the Mapping function. First, we set the evidence on Uniformity of Cell Size (<=4.5) and then run Analysis | Visual | Mapping.
The Mapping window features drop-down menus for Node Analysis and Arc Analysis. However, we are only interested in Node Analysis, and we select Mutual Information with the Target Node as the metric to be displayed.

The size of the nodes, beyond a fixed minimum size,10 is now proportional to the Mutual Information with the Target Node. To see the specific values, we right-click on the background of the window and select Display Scores on Nodes from the Contextual Menu.

10 The minimum and maximum sizes can be changed via Edit Sizes from the Contextual Menu in the Mapping Window.
This shows us that, given Uniformity of Cell Size<=4.5, the Mutual Information of Bare Nuclei with the Target Node is 0.0711, or 7.1%. Note that the node on which evidence has already been set, i.e. Uniformity of Cell Size, shows a Conditional Mutual Information of 0.

So, learning Bare Nuclei will bring the highest information gain among the remaining variables. For instance, if we now observed Bare Nuclei>5.5 (3/3), the probability of Class=Malignant would reach 98.33%.
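For readers who want to see the Conditional Mutual Information computation spelled out, here is a minimal sketch on a made-up three-variable joint distribution (not derived from the Wisconsin data); it also reproduces the property noted above that a node already fixed by the evidence contributes no further information.

```python
# Minimal sketch of Conditional Mutual Information, the quantity reported as
# the Score above: I(X; Class | evidence). Probabilities are illustrative only.
from itertools import product
import math

# P(evidence_node, candidate_node, class) for three binary variables.
joint = {k: p for k, p in zip(
    product([0, 1], repeat=3),
    [0.20, 0.02, 0.08, 0.02, 0.03, 0.15, 0.05, 0.45],
)}

def conditional_mutual_information(candidate_index, evidence_index, evidence_value):
    """I(candidate; class | evidence_node = evidence_value)."""
    p_e = sum(p for k, p in joint.items() if k[evidence_index] == evidence_value)
    cond = {k: p / p_e for k, p in joint.items() if k[evidence_index] == evidence_value}
    cmi = 0.0
    for x, c in product([0, 1], repeat=2):
        p_xc = sum(p for k, p in cond.items() if k[candidate_index] == x and k[2] == c)
        p_x = sum(p for k, p in cond.items() if k[candidate_index] == x)
        p_c = sum(p for k, p in cond.items() if k[2] == c)
        if p_xc > 0:
            cmi += p_xc * math.log2(p_xc / (p_x * p_c))
    return cmi

# The candidate node still carries information about the class ...
print(conditional_mutual_information(candidate_index=1, evidence_index=0, evidence_value=0))
# ... while the node already fixed by the evidence has a score of 0.
print(conditional_mutual_information(candidate_index=0, evidence_index=0, evidence_value=0))
```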
Finally, BayesiaLab also reports the joint probability of each tree node, i.e. the probability that all pieces of evidence in a branch, up to and including that tree node, would occur. Here, it says that the joint probability of Uniformity of Cell Size<=4.5 and Bare Nuclei>5.5 is 5.32%.

As opposed to this somewhat artificial illustration of a Target Interpretation Tree in the context of FNA-based diagnosis, Target Interpretation Trees are often prepared for emergency situations, such as triage classification, in which rapid diagnosis with constrained resources is essential. We believe that our example still conveys the idea of "optimum escalation" in obtaining evidence towards a diagnosis.

Summary

By using Bayesian networks as the framework and BayesiaLab as the tool, we have shown a practical new modeling and analysis approach based on the widely studied Wisconsin Breast Cancer Database. BayesiaLab can rapidly machine-learn reliable models, even without prior domain knowledge and without hypotheses. The classification performance of the BayesiaLab-generated Bayesian network models is on par with all studies on this topic published to date. Beyond the predictive performance, BayesiaLab enables a range of analysis and interpretation functions, which can help the researcher gain deeper domain knowledge and perform inference more efficiently.
Appendix

Framework: The Bayesian Network Paradigm11

Acyclic Graphs & Bayes's Rule

Probabilistic models based on directed acyclic graphs have a long and rich tradition, beginning with the work of geneticist Sewall Wright in the 1920s. Variants have appeared in many fields. Within statistics, such models are known as directed graphical models; within cognitive science and artificial intelligence, such models are known as Bayesian networks. The name honors the Rev. Thomas Bayes (1702-1761), whose rule for updating probabilities in the light of new evidence is the foundation of the approach.

Rev. Bayes addressed both the case of discrete probability distributions of data and the more complicated case of continuous probability distributions. In the discrete case, Bayes' theorem relates the conditional and marginal probabilities of events A and B, provided that the probability of B does not equal zero:

P(A|B) = P(B|A) P(A) / P(B)

In Bayes' theorem, each probability has a conventional name:

P(A) is the prior probability (or "unconditional" or "marginal" probability) of A. It is "prior" in the sense that it does not take into account any information about B; however, the event B need not occur after event A. In the nineteenth century, the unconditional probability P(A) in Bayes's rule was called the "antecedent" probability; in deductive logic, the antecedent set of propositions and the inference rule imply consequences. The unconditional probability P(A) was called "a priori" by Ronald A. Fisher.

P(A|B) is the conditional probability of A, given B. It is also called the posterior probability because it is derived from, or depends upon, the specified value of B.

P(B|A) is the conditional probability of B given A. It is also called the likelihood.

P(B) is the prior or marginal probability of B, and acts as a normalizing constant.

Bayes' theorem in this form gives a mathematical representation of how the conditional probability of event A given B is related to the converse conditional probability of B given A.

The initial development of Bayesian networks in the late 1970s was motivated by the need to model the top-down (semantic) and bottom-up (perceptual) combination of evidence in reading. The capability for bidirectional inferences, combined with a rigorous probabilistic foundation, led to the rapid emergence of Bayesian networks as the method of choice for uncertain reasoning in AI and expert systems, replacing earlier ad hoc rule-based schemes.

11 Adapted from Pearl (2000), used with permission.
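As a small numeric illustration of the rule (the values below are invented and unrelated to the Wisconsin data): suppose a hypothetical marker is seen in 90% of malignant and 10% of benign samples, and 35% of samples are malignant.

```python
# Bayes' rule on made-up numbers: P(Malignant | marker) from P(marker | Malignant).
p_malignant = 0.35
p_marker_given_malignant = 0.90
p_marker_given_benign = 0.10

# P(B) via the law of total probability, then Bayes' rule for P(A | B).
p_marker = (p_marker_given_malignant * p_malignant
            + p_marker_given_benign * (1 - p_malignant))
p_malignant_given_marker = p_marker_given_malignant * p_malignant / p_marker
print(f"P(Malignant | marker observed) = {p_malignant_given_marker:.2%}")   # ~82.9%
```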
The nodes in a Bayesian network represent variables of interest (e.g. the temperature of a device, the gender of a patient, a feature of an object, the occurrence of an event), and the links represent statistical (informational) or causal dependencies among the variables. The dependencies are quantified by conditional probabilities for each node given its parents in the network. The network supports the computation of the posterior probabilities of any subset of variables given evidence about any other subset.

Compact Representation of the Joint Probability Distribution

"The central paradigm of probabilistic reasoning is to identify all relevant variables x1, . . . , xN in the environment [i.e. the domain under study], and make a probabilistic model p(x1, . . . , xN) of their interaction [i.e. represent the variables' joint probability distribution]." Bayesian networks are very attractive for this purpose as they can, by means of factorization, compactly represent the joint probability distribution of all variables. "Reasoning (inference) is then performed by introducing evidence that sets variables in known states, and subsequently computing probabilities of interest, conditioned on this evidence. The rules of probability, combined with Bayes' rule, make for a complete reasoning system, one which includes traditional deductive logic as a special case." (Barber, 2012)
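To make the factorization idea concrete, here is a minimal sketch of a three-node network A -> B, A -> C with invented probability tables. It shows how the compact factorized form P(A, B, C) = P(A) P(B|A) P(C|A) supports inference by conditioning on evidence and summing out the remaining variable; the structure and numbers are purely illustrative.

```python
# Minimal sketch of a factorized joint distribution and inference by enumeration.
from itertools import product

p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # p_b_given_a[a][b]
p_c_given_a = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.4, 1: 0.6}}   # p_c_given_a[a][c]

def joint(a, b, c):
    # The compact, factorized representation of the full joint distribution.
    return p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]

# Posterior P(A = 1 | B = 1): condition on the evidence and renormalize.
evidence_b = 1
numerator = sum(joint(1, evidence_b, c) for c in (0, 1))
normalizer = sum(joint(a, evidence_b, c) for a, c in product((0, 1), repeat=2))
print(f"P(A=1 | B=1) = {numerator / normalizer:.2%}")
```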
References

Abdrabou, E. A. M. L., and A. E. B. M. Salem. "A Breast Cancer Classifier Based on a Combination of Case-Based Reasoning and Ontology Approach" (n.d.).

Barber, David. Bayesian Reasoning and Machine Learning. Cambridge University Press, 2012.

Conrady, Stefan, and Lionel Jouffe. "Missing Values Imputation - A New Approach to Missing Values Processing with Bayesian Networks," January 4, 2012. http://bayesia.us/index.php/missingvalues.

El-Sebakhy, E. A., K. A. Faisal, T. Helmy, F. Azzedin, and A. Al-Suhaim. "Evaluation of Breast Cancer Tumor Classification with Unconstrained Functional Networks Classifier." In The 4th ACS/IEEE International Conf. on Computer Systems and Applications, 281–287, 2006.

Hung, M. S., M. Shanker, and M. Y. Hu. "Estimating Breast Cancer Risks Using Neural Networks." Journal of the Operational Research Society 53, no. 2 (2002): 222–231.

Karabatak, M., and M. C. Ince. "An Expert System for Detection of Breast Cancer Based on Association Rules and Neural Network." Expert Systems with Applications 36, no. 2 (2009): 3465–3469.

Mangasarian, Olvi L., W. Nick Street, and William H. Wolberg. "Breast Cancer Diagnosis and Prognosis via Linear Programming." Operations Research 43 (1995): 570–577.

Mu, T., and A. K. Nandi. "Breast Cancer Diagnosis from Fine-Needle Aspiration Using Supervised Compact Hyperspheres and Establishment of Confidence of Malignancy" (n.d.).

Pearl, Judea. Causality: Models, Reasoning and Inference. 2nd ed. Cambridge University Press, 2009.

Pearl, Judea, and Stuart Russell. Bayesian Networks. UCLA Cognitive Systems Laboratory, November 2000. http://bayes.cs.ucla.edu/csl_papers.html.

Wolberg, W. H., W. N. Street, D. M. Heisey, and O. L. Mangasarian. "Computer-derived Nuclear Features Distinguish Malignant from Benign Breast Cytology." Human Pathology 26, no. 7 (1995): 792–796.

Wolberg, William H., W. Nick Street, and O. L. Mangasarian. "Machine Learning Techniques to Diagnose Breast Cancer from Image-Processed Nuclear Features of Fine Needle Aspirates" (n.d.). http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.127.2109.

Wolberg, William H., W. Nick Street, and Olvi L. Mangasarian. "Breast Cytology Diagnosis Via Digital Image Analysis" (1993). http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9894.
Contact Information

Bayesia USA
312 Hamlet's End Way
Franklin, TN 37067
USA
Phone: +1 888-386-8383
info@bayesia.us
www.bayesia.us

Bayesia Singapore Pte. Ltd.
20 Cecil Street
#14-01, Equity Plaza
Singapore 049705
Phone: +65 3158 2690
info@bayesia.sg
www.bayesia.sg

Bayesia S.A.S.
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
Phone: +33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com

Copyright

© 2013 Bayesia S.A.S., Bayesia USA and Bayesia Singapore. All rights reserved.