6. Singular goal of workshop
[Figure: accuracy distributions for models 1–6]
• understand
  • machine learning
  • support vector machine
  • classification accuracy
  • cross-validation
10. What is Machine Learning?
• “giving computers the ability to learn without being explicitly programmed.” (Arthur Samuel)
• i.e. building algorithms to learn patterns in data
• automatically
24. Clinical Application of ML
• ML has many clinical applications, including:
  • computer-aided diagnosis
  • clinical decision support
  • personalized medicine
  • treatment and monitoring
  • better care and service delivery systems (reducing length of hospitalization, optimizing resource redistribution, etc.)
• I will focus on biomarkers today!
  • more on how to assess their utility
  • less on how to identify, build and tune them.
25. Types of Machine Learning
• Supervised: data is labelled
• Unsupervised: data is not labelled
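To make the distinction concrete, here is a minimal sketch using scikit-learn (a library assumed here, not named in the slides); the data and names are illustrative only.

```python
# Supervised vs. unsupervised learning: an illustrative sketch
# (assumes scikit-learn; the data below is synthetic).
import numpy as np
from sklearn.svm import SVC
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))              # features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # labels, used only in the supervised case

# Supervised: data + labels -> learn a mapping from X to y
clf = SVC(kernel="linear").fit(X, y)

# Unsupervised: data only -> discover structure (here, two clusters)
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(clf.predict(X[:5]), km.labels_[:5])
```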
54. Support Vector Machine (SVM)
• A popular classification technique
• At its core, it is
  • binary (separates two classes)
  • linear (boundary: a line in 2-D, a hyperplane in n-D)
• Its power lies in finding the boundary between classes that are difficult to separate
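To make the "binary, linear" core concrete, a hedged sketch (scikit-learn assumed, synthetic data) that fits a linear SVM and reads off the separating hyperplane w·x + b = 0:

```python
# A minimal linear, binary SVM on synthetic 2-D data (scikit-learn assumed).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (50, 2)),   # class 0
               rng.normal(+2, 1, (50, 2))])  # class 1
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear").fit(X, y)
# The learned boundary is the hyperplane w . x + b = 0
print("w =", clf.coef_[0], "b =", clf.intercept_[0])
```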
73. Harder problem (classes are not linearly separable)
[Figure: two candidate boundaries L1 and L2 in the (x1, x2) plane]
• L1 → fewer errors, smaller margin
• L2 → more errors, larger margin
• Tradeoff between error and margin!
• parameter C: penalty for misclassification (see the sketch below)
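A small sketch of that tradeoff (scikit-learn assumed; the overlapping data is synthetic): sweeping C shows that a small C tolerates more training errors in exchange for a wider margin, and a large C does the opposite.

```python
# Sweeping C (misclassification penalty) on overlapping classes.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-1, 1.5, (100, 2)),
               rng.normal(+1, 1.5, (100, 2))])  # overlapping classes
y = np.array([0] * 100 + [1] * 100)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 2.0 / np.linalg.norm(clf.coef_[0])   # width of the margin
    err = 1.0 - clf.score(X, y)                   # training error
    print(f"C={C:6}: margin={margin:.2f}, training error={err:.2%}")
```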
89. Recap: SVM
• Linear classifier at its core
• Boundary with max. margin
• Input data can be transformed
to higher dimensions to
achieve better separation
90. Classifier Performance
• How do you evaluate how well the classifier works?
  • input unseen data with known labels (ground truth)
  • make predictions with the previously trained classifier
  • compute the % of predictions that match the ground truth → classification accuracy
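A sketch of this evaluation recipe (scikit-learn assumed, synthetic data): hold out part of the data, predict on it, and compute the fraction of matches.

```python
# Accuracy = fraction of predictions matching the ground truth on unseen data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = SVC().fit(X_train, y_train)        # train on one part...
pred = clf.predict(X_test)               # ...predict on the unseen part
accuracy = np.mean(pred == y_test)       # % of matches with ground truth
print(f"accuracy: {accuracy:.2%}")
```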
99. What is generalizability?
• available data (sample*) → desired: accuracy on unseen data (population*)
• out-of-sample predictions
• avoid overfitting
*has a statistical definition
112. Cross-validation
• What is cross-validation?
• How to perform it?
• What are the effects of different CV choices?
[Figure: training/test split; CV accuracy estimates may carry negative bias, be unbiased, or carry positive bias]
124. Why cross-validate?
• bigger training set → better learning
• bigger test set → better testing
• Key: train & test sets must be disjoint.
• And the dataset (sample size) is fixed: the two sets grow at the expense of each other!
• Cross-validate to maximize both.
130. Use cases
• “When setting aside data for parameter estimation and validation of results can not be afforded, cross-validation (CV) is typically used”
• Use cases:
  • to estimate generalizability (test accuracy)
  • to pick optimal parameters (model selection)
  • to compare performance (model comparison)
[Figure: accuracy distribution from repetitions of CV (%)]
138. Key Aspects of CV
1. How you split the dataset into train/test
  • maximal independence between training and test sets is desired
  • this split could be
    • over samples, i.e. rows (e.g. individual diagnosis: healthy vs. disease)
    • over time, i.e. columns (for task prediction in fMRI)
2. How often you repeat the randomized splits
  • to expose the classifier to the full variability of the data
  • as many times as you can, e.g. 100 (see the sketch below)
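A hedged sketch of repeated, randomized, stratified splitting (scikit-learn assumed; data and class sizes are illustrative):

```python
# Repeated, randomized, stratified train/test splits (e.g. 100 repetitions).
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = rng.normal(size=(120, 10))
y = np.array([0] * 60 + [1] * 60)   # e.g. healthy vs. disease

cv = StratifiedShuffleSplit(n_splits=100, test_size=0.3, random_state=0)
scores = []
for train_idx, test_idx in cv.split(X, y):   # train and test indices are disjoint
    clf = SVC().fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))
print(f"mean accuracy over {len(scores)} splits: {np.mean(scores):.2%}")
```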
150. Validation set
• Training set: measures goodness of fit of the model; biased* towards the training set
• Test set: used to optimize parameters; biased towards the test set
• Validation set: used to evaluate generalization; independent of the training and test sets
[Figure: the whole dataset is split into training, test and validation sets; the train/test splits form the inner loop, validation the outer loop]
*biased towards X → overfit to X
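One common realization of this inner-loop/outer-loop structure is nested cross-validation; a sketch (scikit-learn assumed, synthetic data, and C chosen here purely as an example hyperparameter):

```python
# Nested CV: the inner loop tunes C; the outer loop reports generalization.
import numpy as np
from sklearn.model_selection import GridSearchCV, cross_val_score, KFold
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 8))
y = (X[:, 0] + rng.normal(0, 0.5, 100) > 0).astype(int)

inner = KFold(n_splits=5, shuffle=True, random_state=0)   # hyperparameter tuning
outer = KFold(n_splits=5, shuffle=True, random_state=1)   # performance estimate

tuned = GridSearchCV(SVC(kernel="linear"), {"C": [0.1, 1, 10]}, cv=inner)
scores = cross_val_score(tuned, X, y, cv=outer)   # held-out folds never touch the tuning
print(f"nested-CV accuracy: {scores.mean():.2%} ± {scores.std():.2%}")
```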
155. Terminology
• Training set
  • Purpose: train the model to learn its core parameters
  • Don't: report the training error as the test error!
  • Alternative name: training set (no confusion)
• Testing set
  • Purpose: optimize hyperparameters
  • Don't: do feature selection or anything supervised on this set to learn or optimize!
  • Alternative names: validation set (or tweaking, tuning, optimization set)
• Validation set
  • Purpose: evaluate the fully-optimized classifier to report performance
  • Don't: use it in any way to train the classifier or optimize parameters
  • Alternative names: test set (more accurately, reporting set)
164. K-fold CV
• Test sets in different trials are indeed mutually disjoint
[Figure: k trials; in each trial one fold (e.g. the 4th) is the test set, the rest is the training set]
• Note: different folds won't be contiguous.
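A short sketch (scikit-learn assumed) that verifies both properties: shuffled folds are not contiguous, yet the test folds partition the data.

```python
# K-fold CV: test folds across trials are mutually disjoint.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)
kf = KFold(n_splits=5, shuffle=True, random_state=0)  # shuffled: folds need not be contiguous
test_folds = [test for _, test in kf.split(X)]
for i, fold in enumerate(test_folds, 1):
    print(f"trial {i}: test indices {fold}")
# every sample appears in exactly one test fold
all_idx = np.concatenate(test_folds)
assert len(all_idx) == len(np.unique(all_idx))
```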
170. Repeated Holdout CV
• Set aside an independent subsample of the whole dataset (e.g. 30%) for testing
[Figure: n trials, each with a different randomized train/test split of the whole dataset]
• Note: there could be overlap among the test sets from different trials! Hence a large n is recommended.
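This contrast with k-fold can be seen directly; a sketch (scikit-learn assumed) counting how much the test sets of different trials overlap:

```python
# Repeated holdout via ShuffleSplit: test sets of different trials may overlap.
import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.arange(20).reshape(10, 2)
ss = ShuffleSplit(n_splits=4, test_size=0.3, random_state=0)
tests = [set(test) for _, test in ss.split(X)]
# nonzero counts show samples shared between test sets of different trials
print("pairwise overlaps:", [len(a & b) for a in tests for b in tests if a is not b])
```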
171. Typical workflow
• Whole dataset → randomized split into training and test sets
• Training set (with labels): feature extraction, feature selection and parameter optimization (on training data only) → trained classifier
• Test set, i.e. the rest (no labels): same feature extraction, select the same features, evaluate the trained classifier on the test set
• Pool predictions over repetitions (next CV repetition i of n) → accuracy distribution
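A sketch of this workflow (scikit-learn assumed; feature selector and classifier are illustrative choices): wrapping selection and classification in a pipeline guarantees the "on training data only" rule, since the selector is refit inside each fold.

```python
# Feature selection fit on training data only, applied unchanged to the test fold,
# with accuracies pooled over repetitions into a distribution.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(6)
X = rng.normal(size=(100, 50))
y = np.array([0] * 50 + [1] * 50)

pipe = make_pipeline(SelectKBest(f_classif, k=10), SVC())  # selection happens inside each fold
cv = StratifiedShuffleSplit(n_splits=50, test_size=0.3, random_state=0)
acc = cross_val_score(pipe, X, y, cv=cv)   # one accuracy per repetition
print(f"accuracy distribution: median={np.median(acc):.2%}")
```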
172. Software
• There is a free machine learning toolbox in every
major language!
• Check these for the latest techniques/toolboxes:
  • http://www.jmlr.org/mloss/
  • http://mloss.org/software/
173. neuropredict: easy and comprehensive predictive analysis
[Figure panels: accuracy distributions, confusion matrices, feature importance, intuitive comparison of misclassification rates]
• input features
  • each feature set could be any set of numbers estimated from a sample by itself (intrinsic, not group-wise)
  • designed to seamlessly compare many feature sets (n > 1), if they are all from the same set of samples belonging to the same classes
  • supports many input formats
  • plugs directly into outputs from popular software like Freesurfer
• neuropredict
  • performs cross-validation in a way that increases the power of later statistical comparisons
  • tracks misclassification rates, class- and subject-wise, for each feature set
  • measures feature importance
  • statistical comparison of predictive performance
  • intuitive visualizations
  • streamlined comparison for a large number of feature sets!
• docs: http://neuropredict.readthedocs.io
• code: github.com/raamana/neuropredict
• twitter: @raamana_
180. neuropredict features
• Auto-reading of neuroimaging features
• Auto-evaluation of predictive accuracy
• Auto-comparison of performance …
• Notice the word I am repeating? Auto, Auto, Auto.
• Being automatic is important; without it, the analysis becomes hard and error-prone!
187. Model selection
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning (2nd ed.). Springer Series in Statistics. Springer.