SlideShare ist ein Scribd-Unternehmen logo
1 von 59
Downloaden Sie, um offline zu lesen
Clinical prediction models in the age
of artificial intelligence and big data
Ewout Steyerberg
Professor of Clinical Biostatistics and Medical
Decision Making
<E.Steyerberg@ErasmusMC.nl /
E.W.Steyerberg@LUMC.nl >
Basel, Nov 1 2019
Thanks to co-workers; no COI
• LUMC: Maarten van Smeden
• Leuven: Ben van Calster
Both provided many of the slides shown
Main question
Where does Big Data / machine learning (ML) /
artificial intelligence (AI) assist us in prediction
research?
• Strengths and weaknesses of Big Data
initiatives
• Consider links between classical statistical
approaches, ML, AI for prediction
Prediction models; what for?
• Understanding nature:
relative risks of different predictors
• Predicting outcomes:
absolute risk by combinations of predictors
Traditional regression modeling
5
Can well be used for explanation and prediction
Steyerberg. Clinical prediction models (2nd ed). New York: Springer, 2019.
Riley et al. Prognosis Research in healthcare. Oxford: OUP, 2019.
Prediction models
• Diagnosis
– Imaging findings, e.g. abnormal CT scan in trauma
– Clinical condition, e.g. serious infection
– …
• Prognosis
– Mortality, e.g. < 30 days, over time, …
– …
Prognostic / predictive models
Prognostic modeling
y ~ X Prognostic factors
y ~ Tx Treatment effect
y ~ X + Tx Covariate adjusted tx effect
Predictive modeling
y ~ X * Tx Predictive factors for differential tx effect
Opportunities in medical prediction
• More data
– larger N
– more variables
• More detail
– biomarkers / omics / imaging / eHealth
• Novel methods
– ML / AI / ..
– Statistical methods
• Dynamic prediction
• Testing procedures for high dimensional data
• …
Hype
Examples
• Biomarkers
• Imaging
• Omics
Positive example 1
• Biomarkers in diagnosing head trauma
– Mild: AUC 0.89 [0.87-0.90] vs clinical 0.84 [0.83-0.86]
Positive example 2
• MRI Imaging in diagnosing prostate cancer
• MRI-PCa-RCs AUC 0.83 to 0.85 vs
PCa-RCs AUC 0.69 to 0.74
Positive example 3
Positive example 3
• Omics in diagnosing … / predicting … ??
• Because omics à
clinical characteristics à
outcome?
Examples
• Biomarkers
• Imaging
• Omics
• ML / AI
Success of ML / AI
Non-exhaustive list
17
Gaming
Natural Language Processing (Siri etc)
Fraud detection
Shoplifting
Object recognition (e.g. for driverless cars)
Facial recognition
Traffic predictions (e.g. Waze app)
Electrical load forecasting
(Social) media and advertising (people you may know, movie suggestions, )
Spam filtering
Search engines (e.g. Google PageRank)
Handwriting recognition
Popularity skyrocketing
18
Search on https://www.ncbi.nlm.nih.gov/pubmed/ on (performed Oct 18, 2019)
IBM Watson winning Jeopardy! (2011)
IBM Watson for oncology
https://bit.ly/2LxiWGj
Evidence
• Cochrane: ”We searched for RCTs and found
20 among ... papers”
• Dr Watson: “We searched 4 Million webpages
in 1 second”
Five myths
1. Big Data will resolve the problems of small data
2. ML/AI is very different from classical modeling
3. Deep learning is relevant for all medical
prediction problems
4. ML / AI is better than classical modeling for
medical prediction problems
5. ML / AI leads to better generalizability
Myth 1: Big Data will resolve the
problems of small data
Abstract
The use of artificial intelligence, and deep-learning in
particular, has been enabled by the use of big data,
along with markedly enhanced computing power and
cloud storage, across all sectors.
In medicine, this is beginning to have an impact ...
Do you have a clear research question?
Do you have data that help you answer the question?
What is the quality of the data?
Big Data, Big Errors
• Harrell tweet
Myth 2: ML/AI is very different
from classical modeling
“Everything is ML”
https://bit.ly/2lEVn33
Two cultures
Breiman, Stat Sci, 2001, DOI:
10.1214/ss/1009213726
Traditional Statistics vs Machine Learning
30
Breiman. Stat Sci 2001;16:199-231.
Traditional Statistics vs Machine Learning
31Galit Shmueli. Keynote talk at 2019 ISBIS conference, Kuala Lumpur; taken from slideshare.net
Bzdok. Nature Methods 2018;15:233-4.
??
Example of exaggerating contrasts
Predicting mortality – the results
Elastic net, 586 (‘600’) variables: c=0.801
Traditional Cox, 27 (‘30’) expert-selected variables: c=0.793
PlosOne, 2018, DOI: 10.1371/journal.pone.0202344
Predicting mortality – the media
PlosOne, 2018, DOI: 10.1371/journal.pone.0202344;
https://bit.ly/2Q6H41R; https://bit.ly/2m3RLrn
ML refers to a culture,
not to methods
• Substantial overlap methods used by both cultures
• Substantial overlap analysis goals
• Attempts to separate the two frequently result in
disagreement
Pragmatic approach:
“ML” refers to models roughly outside of the traditional
regression types of analysis:
trees, SVMs, neural networks, boosting etc.
Machine learning:
simple overview
37
Intellspot.com
Myth 3: Deep learning is relevant
for all medical prediction
Example: retinal disease
Gulshan et al, JAMA, 2016, 10.1001/jama.2016.17216;
Picture retinopathy: https://bit.ly/2kB3X2w AS
Diabetic retinopathy
Deep learning (= Neural network)
• 128,000 images
• Transfer learning (preinitialization)
• Sensitivity and specificity > .90
• Estimated from training data
Example: lymph node metastases
Bejnordi et al, JAMA, 2018, doi: 10.1001/jama.2017.14585.
See letter to the editor for a critical discussion: https://bit.ly/2kcYS0e
Deep learning competition
But:
• 390 teams signed up, 23 submitted
• “Only” 270 images for training
• Test AUC range: 0.56 to 0.99
3. Deep learning is relevant for all medical
prediction problems
NO: Deep learning excels in visual tasks
Myth 4: ML / AI is better than classical
modeling for medical prediction
Reviewer #2,
van Smeden submission 2019
Poor methods and unclear reporting
46
What was done about missing data? 45% fully unclear, 100% poor or unclear
How were continuous predictors modeled? 20% unclear, 25% categorized
How were hyperparameters tuned? 66% unclear, 19% tuned with information
How was performance validated? 68% unclear or biased approach
Was accuracy of risk estimates checked? 79% not at all
Further observations:
- Prognosis: time horizon often ignored
- Patients matched on variables used a predictors
- 99% of patients excluded from modeling to obtain a balanced dataset
- First and last percentile of continuous predictors replaced with mean
Differences in discrimination
Christodoulou et al. Journal of Clinical Epidemiology, 2019,
doi: 10.1016/j.jclinepi.2019.02.004
Where is ML useful?
N cases
N predictors
Large
Large
Small
Small
BA
C D
Rajkomar et al. NEJM 2019;380:1347-58.
Myth 5: ML / AI leads to better generalizability
“ … developed 7 parallel models for hospital-acquired acute kidney injury
using common regression and machine learning methods, validating each
over 9 subsequent years.”:
“Discrimination was maintained for all models. Calibration declined as all
models increasingly overpredicted risk. However, the random forest and
neural network models maintained calibration … ”
Efron talk Leiden
http://statweb.stanford.edu/~ckirby/brad/talks/
Empirical findings in TBI
– 16 cohorts: 5 observational, 11 RCTs
– Develop in 15, validate in 1
– 7 methods:
LR; SVM; RF; nnet; gbm; LASSO; ridge
5 observational 11 RCTs
Variability between cohorts >> variability between methods
Prediction challenges
• There is no such thing as a validated prediction
algorithm
• Algorithms are high maintenance
– Developed models need validation and updating to
remain useful over time and place
• Regulation and quality control of algorithms
– What about proprietary algorithms?
Five myths
1. Big Data will resolve the problems of small data
NO: Big Data, Big Errors
2. ML/AI is very different from classical modeling
NO: a continuum, cultural differences
3. Deep learning is relevant for all medical prediction
NO: Deep learning excels in visual tasks
4. ML / AI is better than classical modeling for prediction
NO: some methods do harm (e.g. tree modeling)
5. ML / AI leads to better generalizability
NO: any prediction model may suffer from poor
generalizability
Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019

Weitere ähnliche Inhalte

Was ist angesagt?

ML and AI: a blessing and curse for statisticians and medical doctors
ML and AI: a blessing and curse forstatisticians and medical doctorsML and AI: a blessing and curse forstatisticians and medical doctors
ML and AI: a blessing and curse for statisticians and medical doctorsMaarten van Smeden
 
Big Data applications in Health Care
Big Data applications in Health CareBig Data applications in Health Care
Big Data applications in Health CareLeo Barella
 
Introduction to prediction modelling - Berlin 2018 - Part II
Introduction to prediction modelling - Berlin 2018 - Part IIIntroduction to prediction modelling - Berlin 2018 - Part II
Introduction to prediction modelling - Berlin 2018 - Part IIMaarten van Smeden
 
Introduction to prediction modelling - Berlin 2018 - Part I
Introduction to prediction modelling - Berlin 2018 - Part IIntroduction to prediction modelling - Berlin 2018 - Part I
Introduction to prediction modelling - Berlin 2018 - Part IMaarten van Smeden
 
Improving epidemiological research: avoiding the statistical paradoxes and fa...
Improving epidemiological research: avoiding the statistical paradoxes and fa...Improving epidemiological research: avoiding the statistical paradoxes and fa...
Improving epidemiological research: avoiding the statistical paradoxes and fa...Maarten van Smeden
 
Big Data in healthcare - opportunities and issues
Big Data in healthcare - opportunities and issuesBig Data in healthcare - opportunities and issues
Big Data in healthcare - opportunities and issuesJaco van Duivenboden
 
Digital twins in cancer state of-the-art and open research
Digital twins in cancer state of-the-art and open researchDigital twins in cancer state of-the-art and open research
Digital twins in cancer state of-the-art and open researchKamran Gholizadeh HamlAbadi
 
Regression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsRegression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsMaarten van Smeden
 
What is modeling.pptx
What is modeling.pptxWhat is modeling.pptx
What is modeling.pptxBerhe Tekle
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSivagowry Shathesh
 
How to establish and evaluate clinical prediction models - Statswork
How to establish and evaluate clinical prediction models - StatsworkHow to establish and evaluate clinical prediction models - Statswork
How to establish and evaluate clinical prediction models - StatsworkStats Statswork
 
Measurement error in medical research
Measurement error in medical researchMeasurement error in medical research
Measurement error in medical researchMaarten van Smeden
 
Predictive modeling healthcare
Predictive modeling healthcarePredictive modeling healthcare
Predictive modeling healthcareTaposh Roy
 
An introduction to machine learning in biomedical research: Key concepts, pr...
An introduction to machine learning in biomedical research:  Key concepts, pr...An introduction to machine learning in biomedical research:  Key concepts, pr...
An introduction to machine learning in biomedical research: Key concepts, pr...FranciscoJAzuajeG
 
Correcting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confoundingCorrecting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confoundingMaarten van Smeden
 
演講-Meta analysis in medical research-張偉豪
演講-Meta analysis in medical research-張偉豪演講-Meta analysis in medical research-張偉豪
演講-Meta analysis in medical research-張偉豪Beckett Hsieh
 

Was ist angesagt? (20)

ML and AI: a blessing and curse for statisticians and medical doctors
ML and AI: a blessing and curse forstatisticians and medical doctorsML and AI: a blessing and curse forstatisticians and medical doctors
ML and AI: a blessing and curse for statisticians and medical doctors
 
Big Data applications in Health Care
Big Data applications in Health CareBig Data applications in Health Care
Big Data applications in Health Care
 
Predictimands
PredictimandsPredictimands
Predictimands
 
Introduction to prediction modelling - Berlin 2018 - Part II
Introduction to prediction modelling - Berlin 2018 - Part IIIntroduction to prediction modelling - Berlin 2018 - Part II
Introduction to prediction modelling - Berlin 2018 - Part II
 
Introduction to prediction modelling - Berlin 2018 - Part I
Introduction to prediction modelling - Berlin 2018 - Part IIntroduction to prediction modelling - Berlin 2018 - Part I
Introduction to prediction modelling - Berlin 2018 - Part I
 
Improving epidemiological research: avoiding the statistical paradoxes and fa...
Improving epidemiological research: avoiding the statistical paradoxes and fa...Improving epidemiological research: avoiding the statistical paradoxes and fa...
Improving epidemiological research: avoiding the statistical paradoxes and fa...
 
Big Data in healthcare - opportunities and issues
Big Data in healthcare - opportunities and issuesBig Data in healthcare - opportunities and issues
Big Data in healthcare - opportunities and issues
 
Digital twins in cancer state of-the-art and open research
Digital twins in cancer state of-the-art and open researchDigital twins in cancer state of-the-art and open research
Digital twins in cancer state of-the-art and open research
 
Regression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsRegression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questions
 
What is modeling.pptx
What is modeling.pptxWhat is modeling.pptx
What is modeling.pptx
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
 
How to establish and evaluate clinical prediction models - Statswork
How to establish and evaluate clinical prediction models - StatsworkHow to establish and evaluate clinical prediction models - Statswork
How to establish and evaluate clinical prediction models - Statswork
 
Big data in healthcare
Big data in healthcareBig data in healthcare
Big data in healthcare
 
Measurement error in medical research
Measurement error in medical researchMeasurement error in medical research
Measurement error in medical research
 
Predictive modeling healthcare
Predictive modeling healthcarePredictive modeling healthcare
Predictive modeling healthcare
 
An introduction to machine learning in biomedical research: Key concepts, pr...
An introduction to machine learning in biomedical research:  Key concepts, pr...An introduction to machine learning in biomedical research:  Key concepts, pr...
An introduction to machine learning in biomedical research: Key concepts, pr...
 
Analysis Of Medical Data
Analysis Of Medical DataAnalysis Of Medical Data
Analysis Of Medical Data
 
Correcting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confoundingCorrecting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confounding
 
20 sajan gimire journal club presentation
20 sajan gimire journal club presentation20 sajan gimire journal club presentation
20 sajan gimire journal club presentation
 
演講-Meta analysis in medical research-張偉豪
演講-Meta analysis in medical research-張偉豪演講-Meta analysis in medical research-張偉豪
演講-Meta analysis in medical research-張偉豪
 

Ähnlich wie Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019

A Survey on Various Disease Prediction Techniques
A Survey on Various Disease Prediction TechniquesA Survey on Various Disease Prediction Techniques
A Survey on Various Disease Prediction Techniquesijtsrd
 
ai-in-healthcare-202011-201117103639.pptx
ai-in-healthcare-202011-201117103639.pptxai-in-healthcare-202011-201117103639.pptx
ai-in-healthcare-202011-201117103639.pptxssuser6b571f
 
ICBO 2014, October 8, 2014
ICBO 2014, October 8, 2014ICBO 2014, October 8, 2014
ICBO 2014, October 8, 2014Warren Kibbe
 
Bias in covid 19 models
Bias in covid 19 modelsBias in covid 19 models
Bias in covid 19 modelsLaure Wynants
 
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...Health IT Conference – iHT2
 
Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management inscit2006
 
Deep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining IDeep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining IDeakin University
 
Genetic prediction using Machine Learning Techniques .pptx
Genetic prediction using Machine Learning Techniques .pptxGenetic prediction using Machine Learning Techniques .pptx
Genetic prediction using Machine Learning Techniques .pptxHabtamuAyenew4
 
Data Mining Techniques In Computer Aided Cancer Diagnosis
Data Mining Techniques In Computer Aided Cancer DiagnosisData Mining Techniques In Computer Aided Cancer Diagnosis
Data Mining Techniques In Computer Aided Cancer DiagnosisDataminingTools Inc
 
Data Mining Techniques In Computer Aided Cancer Diagnosis
Data Mining Techniques In Computer Aided Cancer DiagnosisData Mining Techniques In Computer Aided Cancer Diagnosis
Data Mining Techniques In Computer Aided Cancer DiagnosisDatamining Tools
 
Prof. Mark Coles (Oxford University) - Data-driven systems medicine
Prof. Mark Coles (Oxford University) - Data-driven systems medicineProf. Mark Coles (Oxford University) - Data-driven systems medicine
Prof. Mark Coles (Oxford University) - Data-driven systems medicinemntbs1
 
How does machine learning help in cancer detection
How does machine learning help in cancer detection How does machine learning help in cancer detection
How does machine learning help in cancer detection GlobalTechCouncil
 
ML, biomedical data & trust
ML, biomedical data & trustML, biomedical data & trust
ML, biomedical data & trustPaul Agapow
 
Systems Genetics of Cancer - big data and all that
Systems Genetics of Cancer - big data and all thatSystems Genetics of Cancer - big data and all that
Systems Genetics of Cancer - big data and all thatFlorian Markowetz
 
Digital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision MedicineDigital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision MedicineJoel Saltz
 
Str-AI-ght to heaven? Pitfalls for clinical decision support based on AI
Str-AI-ght to heaven? Pitfalls for clinical decision support based on AIStr-AI-ght to heaven? Pitfalls for clinical decision support based on AI
Str-AI-ght to heaven? Pitfalls for clinical decision support based on AIBenVanCalster
 
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEUSING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEIJCSEIT Journal
 

Ähnlich wie Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019 (20)

A Survey on Various Disease Prediction Techniques
A Survey on Various Disease Prediction TechniquesA Survey on Various Disease Prediction Techniques
A Survey on Various Disease Prediction Techniques
 
ai-in-healthcare-202011-201117103639.pptx
ai-in-healthcare-202011-201117103639.pptxai-in-healthcare-202011-201117103639.pptx
ai-in-healthcare-202011-201117103639.pptx
 
ICBO 2014, October 8, 2014
ICBO 2014, October 8, 2014ICBO 2014, October 8, 2014
ICBO 2014, October 8, 2014
 
Bias in covid 19 models
Bias in covid 19 modelsBias in covid 19 models
Bias in covid 19 models
 
AI in the Covid-19 pandemic
AI in the Covid-19 pandemicAI in the Covid-19 pandemic
AI in the Covid-19 pandemic
 
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
 
Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management
 
Deep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining IDeep learning for biomedical discovery and data mining I
Deep learning for biomedical discovery and data mining I
 
Genetic prediction using Machine Learning Techniques .pptx
Genetic prediction using Machine Learning Techniques .pptxGenetic prediction using Machine Learning Techniques .pptx
Genetic prediction using Machine Learning Techniques .pptx
 
Data Mining Techniques In Computer Aided Cancer Diagnosis
Data Mining Techniques In Computer Aided Cancer DiagnosisData Mining Techniques In Computer Aided Cancer Diagnosis
Data Mining Techniques In Computer Aided Cancer Diagnosis
 
Data Mining Techniques In Computer Aided Cancer Diagnosis
Data Mining Techniques In Computer Aided Cancer DiagnosisData Mining Techniques In Computer Aided Cancer Diagnosis
Data Mining Techniques In Computer Aided Cancer Diagnosis
 
Prof. Mark Coles (Oxford University) - Data-driven systems medicine
Prof. Mark Coles (Oxford University) - Data-driven systems medicineProf. Mark Coles (Oxford University) - Data-driven systems medicine
Prof. Mark Coles (Oxford University) - Data-driven systems medicine
 
Classification ANN
Classification ANNClassification ANN
Classification ANN
 
How does machine learning help in cancer detection
How does machine learning help in cancer detection How does machine learning help in cancer detection
How does machine learning help in cancer detection
 
ML, biomedical data & trust
ML, biomedical data & trustML, biomedical data & trust
ML, biomedical data & trust
 
Systems Genetics of Cancer - big data and all that
Systems Genetics of Cancer - big data and all thatSystems Genetics of Cancer - big data and all that
Systems Genetics of Cancer - big data and all that
 
Digital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision MedicineDigital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision Medicine
 
Str-AI-ght to heaven? Pitfalls for clinical decision support based on AI
Str-AI-ght to heaven? Pitfalls for clinical decision support based on AIStr-AI-ght to heaven? Pitfalls for clinical decision support based on AI
Str-AI-ght to heaven? Pitfalls for clinical decision support based on AI
 
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEUSING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
 
Big data
Big dataBig data
Big data
 

Mehr von Ewout Steyerberg

Statistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptxStatistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptxEwout Steyerberg
 
Statistics and ML Paris 20sept22
Statistics and ML Paris 20sept22Statistics and ML Paris 20sept22
Statistics and ML Paris 20sept22Ewout Steyerberg
 
Reproducibility Leiden 12jul22.pptx
Reproducibility Leiden 12jul22.pptxReproducibility Leiden 12jul22.pptx
Reproducibility Leiden 12jul22.pptxEwout Steyerberg
 
Prediction research Twente 22June22 sel.pptx
Prediction research Twente 22June22 sel.pptxPrediction research Twente 22June22 sel.pptx
Prediction research Twente 22June22 sel.pptxEwout Steyerberg
 
Open Science Better Science? Steyerberg 2June2022.pptx
Open Science Better Science? Steyerberg 2June2022.pptxOpen Science Better Science? Steyerberg 2June2022.pptx
Open Science Better Science? Steyerberg 2June2022.pptxEwout Steyerberg
 
Prediction research: perspectives on performance Stanford 19May22.pptx
Prediction research: perspectives on performance Stanford 19May22.pptxPrediction research: perspectives on performance Stanford 19May22.pptx
Prediction research: perspectives on performance Stanford 19May22.pptxEwout Steyerberg
 
Evaluation of the clinical value of biomarkers for risk prediction
Evaluation of the clinical value of biomarkers for risk predictionEvaluation of the clinical value of biomarkers for risk prediction
Evaluation of the clinical value of biomarkers for risk predictionEwout Steyerberg
 
Open science LMU session contribution E Steyerberg 2jul20
Open science LMU session contribution E Steyerberg 2jul20Open science LMU session contribution E Steyerberg 2jul20
Open science LMU session contribution E Steyerberg 2jul20Ewout Steyerberg
 

Mehr von Ewout Steyerberg (8)

Statistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptxStatistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptx
 
Statistics and ML Paris 20sept22
Statistics and ML Paris 20sept22Statistics and ML Paris 20sept22
Statistics and ML Paris 20sept22
 
Reproducibility Leiden 12jul22.pptx
Reproducibility Leiden 12jul22.pptxReproducibility Leiden 12jul22.pptx
Reproducibility Leiden 12jul22.pptx
 
Prediction research Twente 22June22 sel.pptx
Prediction research Twente 22June22 sel.pptxPrediction research Twente 22June22 sel.pptx
Prediction research Twente 22June22 sel.pptx
 
Open Science Better Science? Steyerberg 2June2022.pptx
Open Science Better Science? Steyerberg 2June2022.pptxOpen Science Better Science? Steyerberg 2June2022.pptx
Open Science Better Science? Steyerberg 2June2022.pptx
 
Prediction research: perspectives on performance Stanford 19May22.pptx
Prediction research: perspectives on performance Stanford 19May22.pptxPrediction research: perspectives on performance Stanford 19May22.pptx
Prediction research: perspectives on performance Stanford 19May22.pptx
 
Evaluation of the clinical value of biomarkers for risk prediction
Evaluation of the clinical value of biomarkers for risk predictionEvaluation of the clinical value of biomarkers for risk prediction
Evaluation of the clinical value of biomarkers for risk prediction
 
Open science LMU session contribution E Steyerberg 2jul20
Open science LMU session contribution E Steyerberg 2jul20Open science LMU session contribution E Steyerberg 2jul20
Open science LMU session contribution E Steyerberg 2jul20
 

Kürzlich hochgeladen

6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etclalithasri22
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are successPratikSingh115843
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfnikeshsingh56
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 

Kürzlich hochgeladen (17)

6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
DATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etcDATA ANALYSIS using various data sets like shoping data set etc
DATA ANALYSIS using various data sets like shoping data set etc
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use2023 Survey Shows Dip in High School E-Cigarette Use
2023 Survey Shows Dip in High School E-Cigarette Use
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Presentation of project of business person who are success
Presentation of project of business person who are successPresentation of project of business person who are success
Presentation of project of business person who are success
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Statistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdfStatistics For Management by Richard I. Levin 8ed.pdf
Statistics For Management by Richard I. Levin 8ed.pdf
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 

Prediction, Big Data, and AI: Steyerberg, Basel Nov 1, 2019

  • 1. Clinical prediction models in the age of artificial intelligence and big data Ewout Steyerberg Professor of Clinical Biostatistics and Medical Decision Making <E.Steyerberg@ErasmusMC.nl / E.W.Steyerberg@LUMC.nl > Basel, Nov 1 2019
  • 2. Thanks to co-workers; no COI • LUMC: Maarten van Smeden • Leuven: Ben van Calster Both provided many of the slides shown
  • 3. Main question Where does Big Data / machine learning (ML) / artificial intelligence (AI) assist us in prediction research? • Strengths and weaknesses of Big Data initiatives • Consider links between classical statistical approaches, ML, AI for prediction
  • 4. Prediction models; what for? • Understanding nature: relative risks of different predictors • Predicting outcomes: absolute risk by combinations of predictors
  • 5. Traditional regression modeling 5 Can well be used for explanation and prediction Steyerberg. Clinical prediction models (2nd ed). New York: Springer, 2019. Riley et al. Prognosis Research in healthcare. Oxford: OUP, 2019.
  • 6. Prediction models • Diagnosis – Imaging findings, e.g. abnormal CT scan in trauma – Clinical condition, e.g. serious infection – … • Prognosis – Mortality, e.g. < 30 days, over time, … – …
  • 7. Prognostic / predictive models Prognostic modeling y ~ X Prognostic factors y ~ Tx Treatment effect y ~ X + Tx Covariate adjusted tx effect Predictive modeling y ~ X * Tx Predictive factors for differential tx effect
  • 8. Opportunities in medical prediction • More data – larger N – more variables • More detail – biomarkers / omics / imaging / eHealth • Novel methods – ML / AI / .. – Statistical methods • Dynamic prediction • Testing procedures for high dimensional data • …
  • 11. Positive example 1 • Biomarkers in diagnosing head trauma – Mild: AUC 0.89 [0.87-0.90] vs clinical 0.84 [0.83-0.86]
  • 12. Positive example 2 • MRI Imaging in diagnosing prostate cancer • MRI-PCa-RCs AUC 0.83 to 0.85 vs PCa-RCs AUC 0.69 to 0.74
  • 14. Positive example 3 • Omics in diagnosing … / predicting … ?? • Because omics à clinical characteristics à outcome?
  • 17. Non-exhaustive list 17 Gaming Natural Language Processing (Siri etc) Fraud detection Shoplifting Object recognition (e.g. for driverless cars) Facial recognition Traffic predictions (e.g. Waze app) Electrical load forecasting (Social) media and advertising (people you may know, movie suggestions, ) Spam filtering Search engines (e.g. Google PageRank) Handwriting recognition
  • 18. Popularity skyrocketing 18 Search on https://www.ncbi.nlm.nih.gov/pubmed/ on (performed Oct 18, 2019)
  • 19. IBM Watson winning Jeopardy! (2011)
  • 20. IBM Watson for oncology https://bit.ly/2LxiWGj
  • 21. Evidence • Cochrane: ”We searched for RCTs and found 20 among ... papers” • Dr Watson: “We searched 4 Million webpages in 1 second”
  • 22. Five myths 1. Big Data will resolve the problems of small data 2. ML/AI is very different from classical modeling 3. Deep learning is relevant for all medical prediction problems 4. ML / AI is better than classical modeling for medical prediction problems 5. ML / AI leads to better generalizability
  • 23. Myth 1: Big Data will resolve the problems of small data
  • 24. Abstract The use of artificial intelligence, and deep-learning in particular, has been enabled by the use of big data, along with markedly enhanced computing power and cloud storage, across all sectors. In medicine, this is beginning to have an impact ...
  • 25. Do you have a clear research question? Do you have data that help you answer the question? What is the quality of the data?
  • 26. Big Data, Big Errors • Harrell tweet
  • 27. Myth 2: ML/AI is very different from classical modeling
  • 29. Two cultures Breiman, Stat Sci, 2001, DOI: 10.1214/ss/1009213726
  • 30. Traditional Statistics vs Machine Learning 30 Breiman. Stat Sci 2001;16:199-231.
  • 31. Traditional Statistics vs Machine Learning 31Galit Shmueli. Keynote talk at 2019 ISBIS conference, Kuala Lumpur; taken from slideshare.net Bzdok. Nature Methods 2018;15:233-4. ??
  • 33.
  • 34. Predicting mortality – the results Elastic net, 586 (‘600’) variables: c=0.801 Traditional Cox, 27 (‘30’) expert-selected variables: c=0.793 PlosOne, 2018, DOI: 10.1371/journal.pone.0202344
  • 35. Predicting mortality – the media PlosOne, 2018, DOI: 10.1371/journal.pone.0202344; https://bit.ly/2Q6H41R; https://bit.ly/2m3RLrn
  • 36. ML refers to a culture, not to methods • Substantial overlap methods used by both cultures • Substantial overlap analysis goals • Attempts to separate the two frequently result in disagreement Pragmatic approach: “ML” refers to models roughly outside of the traditional regression types of analysis: trees, SVMs, neural networks, boosting etc.
  • 38. Myth 3: Deep learning is relevant for all medical prediction
  • 39.
  • 40. Example: retinal disease Gulshan et al, JAMA, 2016, 10.1001/jama.2016.17216; Picture retinopathy: https://bit.ly/2kB3X2w AS Diabetic retinopathy Deep learning (= Neural network) • 128,000 images • Transfer learning (preinitialization) • Sensitivity and specificity > .90 • Estimated from training data
  • 41. Example: lymph node metastases Bejnordi et al, JAMA, 2018, doi: 10.1001/jama.2017.14585. See letter to the editor for a critical discussion: https://bit.ly/2kcYS0e Deep learning competition But: • 390 teams signed up, 23 submitted • “Only” 270 images for training • Test AUC range: 0.56 to 0.99
  • 42. 3. Deep learning is relevant for all medical prediction problems NO: Deep learning excels in visual tasks
  • 43. Myth 4: ML / AI is better than classical modeling for medical prediction
  • 44. Reviewer #2, van Smeden submission 2019
  • 45.
  • 46. Poor methods and unclear reporting 46 What was done about missing data? 45% fully unclear, 100% poor or unclear How were continuous predictors modeled? 20% unclear, 25% categorized How were hyperparameters tuned? 66% unclear, 19% tuned with information How was performance validated? 68% unclear or biased approach Was accuracy of risk estimates checked? 79% not at all Further observations: - Prognosis: time horizon often ignored - Patients matched on variables used a predictors - 99% of patients excluded from modeling to obtain a balanced dataset - First and last percentile of continuous predictors replaced with mean
  • 47. Differences in discrimination Christodoulou et al. Journal of Clinical Epidemiology, 2019, doi: 10.1016/j.jclinepi.2019.02.004
  • 48.
  • 49. Where is ML useful? N cases N predictors Large Large Small Small BA C D
  • 50.
  • 51. Rajkomar et al. NEJM 2019;380:1347-58.
  • 52. Myth 5: ML / AI leads to better generalizability “ … developed 7 parallel models for hospital-acquired acute kidney injury using common regression and machine learning methods, validating each over 9 subsequent years.”: “Discrimination was maintained for all models. Calibration declined as all models increasingly overpredicted risk. However, the random forest and neural network models maintained calibration … ”
  • 54.
  • 55. Empirical findings in TBI – 16 cohorts: 5 observational, 11 RCTs – Develop in 15, validate in 1 – 7 methods: LR; SVM; RF; nnet; gbm; LASSO; ridge
  • 56. 5 observational 11 RCTs Variability between cohorts >> variability between methods
  • 57. Prediction challenges • There is no such thing as a validated prediction algorithm • Algorithms are high maintenance – Developed models need validation and updating to remain useful over time and place • Regulation and quality control of algorithms – What about proprietary algorithms?
  • 58. Five myths 1. Big Data will resolve the problems of small data NO: Big Data, Big Errors 2. ML/AI is very different from classical modeling NO: a continuum, cultural differences 3. Deep learning is relevant for all medical prediction NO: Deep learning excels in visual tasks 4. ML / AI is better than classical modeling for prediction NO: some methods do harm (e.g. tree modeling) 5. ML / AI leads to better generalizability NO: any prediction model may suffer from poor generalizability