The document discusses the Modiag project which uses artificial intelligence and machine learning techniques to help with the early and accurate diagnosis of neurodegenerative diseases like Alzheimer's and Parkinson's. A multidisciplinary team is developing a web platform and integrated database to collect and analyze various data types from multiple sources using ML approaches. Preliminary results show ML can successfully classify patients using genomic data and identify important genes and pathways. Further work is needed to better organize and analyze other data types to improve diagnostic precision. The goal is to develop a more efficient diagnostic workflow to support precision medicine approaches for these complex diseases.
The experience of a multidisciplinary team in the early diagnosis of Alzheimer's disease
1. www.decisionscienceforum.com
CASE STUDY MODIAG – ARTIFICIAL INTELLIGENCE AND NEURODEGENERATIVE DISEASES
The experience of a multidisciplinary team in the early diagnosis of Alzheimer's disease
Paola Bertolazzi
Mara D’Onofrio
2. OUTLINE
Mara D’Onofrio:
• Priority of the Neurodegenerative Diseases: Alzheimer's and Parkinson's
• The challenge of earlier and more accurate diagnosis
• Biomarkers: Why?
Paola Bertolazzi:
• MoDiag Project solution
• AI, Machine Learning
3. NEURODEGENERATIVE DISEASES:
THE CHALLENGE OF EARLIER AND MORE ACCURATE DIAGNOSIS
Mara D’Onofrio, MD, PhD
Head of Genomics Laboratory
European Brain Research Institute (EBRI)
“Rita Levi-Montalcini”, Roma, Italy
4. DEMENTIA AFFECTS MEMORY,
COGNITIVE ABILITIES AND BEHAVIOUR
• Alzheimer's disease and Parkinson's disease are the main conditions associated with dementia worldwide and the major health threats to elderly people (also Frontotemporal Dementia, Vascular and Lewy Body Dementias).
• The boundaries between diseases are indistinct and mixed forms often coexist (WHO 2017-2025).
• Dementia prevalence increases with age.
5. DEMENTIA IS ONE OF THE FASTEST GROWING
PUBLIC HEALTH PROBLEMS
World Alzheimer Report 2018
6. BIG PHARMA ENDS HUNT FOR DRUGS TO
TREAT ALZHEIMER’S AND PARKINSON’S
DISEASES
Why?
• Wrong hypotheses!
• Diagnosis not early or accurate enough for therapeutic success!
• Treatments started too late!
• Multifactorial, complex diseases!
7. MANY SYSTEMS CONTRIBUTE TO NEURODEGENERATION
• Vulnerability of specific neuronal populations
• Protein aggregation and misfolding
• Synaptopathy more than neuronal death
• Familial and sporadic forms
• Long latency before the first symptoms
8. THE NEED FOR ACCURACY IN PATIENT STRATIFICATION:
DISEASE OVERLAP
• Different inclusion bodies for different neurodegenerative disorders
• Overlapping syndromes
[Figure: overlap between Parkinson's Disease, Atypical Parkinsonism/Lewy Body Dementia, Pure Lewy Body Dementia, Alzheimer's Disease, Dementia with NFT only, and Pathological aging. Inclusion types shown: Lewy Bodies, NFT, Amyloid plaques, Amyloid Beta oligomers.]
10. BIOMARKERS: WHY?
Biomarkers are objective measures of a biological or pathogenic process:
To evaluate disease risk or prognosis
To guide clinical diagnosis
To monitor therapeutic interventions
As endpoints for clinical trials
There is a need for biomarkers that
reflect the core of the disease
Klennow, 2012
12. A DIFFERENT PERSPECTIVE
Are researchers and academics really sharing, using and disseminating data and using registries in the best possible way?
Doshi, 2013
13. NEURODEGENERATIVE DISEASES: MACHINE LEARNING
AND THE PROBLEM OF DATA
Paola Bertolazzi, MD
MoDiag Scientific coordinator
ACT OR
Advance Control Technology
& OPERATION RESEARCH
CNR
Istituto di Analisi dei Sistemi e Informatica
Antonio Ruberti
14. AI/ML AND DATA FOR SUPPORTING BETTER AND EARLIER DIAGNOSIS: AI
1956, Dartmouth College: Allen Newell (CMU), Herbert Simon (CMU), John McCarthy (MIT), Marvin Minsky (MIT) and Arthur Samuel (IBM)
[Figure: areas of ARTIFICIAL INTELLIGENCE, each labelled with a decade]
• NATURAL LANGUAGE (50s)
• EXPERT SYSTEMS (80s)
• THEOREM PROVING (50s)
• ROBOTICS (30s)
• IMAGE RECOGNITION (50s)
• STRATEGIC GAME SYSTEMS (90s)
• DATA MINING (90s)
• MACHINE LEARNING, NEURAL NETWORK (50s)
• DEEP LEARNING
15. ML FOR SUPPORTING BETTER AND EARLIER DIAGNOSIS
1959, IBM: Arthur Samuel
MACHINE LEARNING & DATA: LEARNING FROM DATA
• SUPERVISED / UNSUPERVISED
• PREDICTION, RECOGNITION, CLASSIFICATION
• LOGIC MODELS / BLACK BOX
16. ML FOR SUPPORTING BETTER AND EARLIER DIAGNOSIS: DATA
• MANY DATA SOURCES
• SOURCE HETEROGENEITY
• DATA HETEROGENEITY
DATA TYPES
• Genomics: sequences, mutations, methylations, transcriptome
• Biospecimen: proteins, metabolites
• Predisposition factors and comorbidity: family and medical history, drugs
• Physical exams
• Neurological exams
• Neuropsychological tests
• Neuroimaging
• Protocols
17. MACHINE LEARNING FROM
BIOMEDICAL DATA:
OBSTACLES/CHALLENGES
• Deep Learning:
– How to collect billions of instances
– Need for models with semantics
• Machine Learning:
– How to collect thousands of instances
– How to perform ML
• Future: Data Standardization
• Now: Data Integration
18. IDEAS AND OBJECTIVES OF MODIAG PROJECT
• Web service platform
– easy data management
– collect data
– better/early diagnosis
• Based on data integration and ML techniques
[Figure: WEB SERVICE comprising USER INTERFACE, INTEGRATED DATA BASE and MACHINE LEARNING; other labels: Diagnosis, Rules, Medical records]
19. PARTNERS AND CONSULTANTS:
A MULTIDISCIPLINARY TEAM
PROJECT PARTNERS
• AI/ML methods and tech: ACT OR and IASI
• DATA & IT tech: ACT OR and IASI
• BIOMOLECULAR DATA PRODUCTION AND DOMAIN COMPETENCE: EBRI Rita Levi-Montalcini
CONSULTANTS
• Center for Cognitive Deficits and Dementia, Department of Human Neurosciences, “Sapienza” University, Policlinico Umberto I, Roma, Italy (Prof. Giuseppe Bruno)
• Department of Medicine, Geriatrics, Perugia University, Perugia Hospital, Perugia, Italy (Prof. Patrizia Mecocci)
• Center for diagnosis and therapy of Parkinson’s disease, IRCCS San Raffaele Pisana, Roma, Italy (Prof. Fabrizio Stocchi)
• Center for diagnosis and therapy of Alzheimer’s disease, University of Thessaloniki, Thessaloniki, Greece (Prof. Magda Tsolaki)
20. MODIAG PROJECT MAIN ACTIVITIES
• DATA SET IDENTIFICATION AND MODELLING
• DATA BASE DESIGN AND IMPLEMENTATION
• SERVICE WORKFLOW DEFINITION
• WEB PLATFORM DESIGN AND IMPLEMENTATION
• PLATFORM AND SERVICE VALIDATION
• ML FRAMEWORK DESIGN AND IMPLEMENTATION
• LABORATORY EXPERIMENTS
21. DATA SET IDENTIFICATION/MODELLING: SCALABILITY AND CUSTOMIZABILITY
DATA SOURCES
• Public data bases: ADNI, AddNeuroMed
• Real world data: Perugia, Sapienza
[Figure: ADNI, Perugia, Roma and AddNeuroMed data sets]
The European Union AddNeuroMed program and the US-based Alzheimer's Disease Neuroimaging Initiative (ADNI) are two large multi-center initiatives to collect and validate biomarker data for Alzheimer's disease. Both initiatives use the same MRI data acquisition scheme.
22. DATA BASE DESIGN: OUR
METHODOLOGY
• Alzheimer’s Disease domain Ontology
• DB Conceptual model
• DB population
• DB Management procedures design
– Querying
– Adding new kinds of information
– Integrating new data bases
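The three management procedures listed above (querying, adding new kinds of information, integrating new data bases) can be illustrated with a minimal sketch. It assumes an entity-attribute-value layout in SQLite; the slides do not describe the actual MoDiag schema, so every table, column and variable name below is hypothetical and the values are made up.

```python
import sqlite3

# Hypothetical schema (an assumption, not the MoDiag data base):
# each measurement is one row, so a new variable needs no schema change.
SCHEMA = """
CREATE TABLE patient (id INTEGER PRIMARY KEY, source TEXT NOT NULL);
CREATE TABLE variable (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL, unit TEXT);
CREATE TABLE observation (
    patient_id INTEGER NOT NULL REFERENCES patient(id),
    variable_id INTEGER NOT NULL REFERENCES variable(id),
    value REAL NOT NULL
);
"""

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def add_variable(conn, name, unit=None):
    """Register a new kind of information; return its id (existing or new)."""
    conn.execute("INSERT OR IGNORE INTO variable (name, unit) VALUES (?, ?)", (name, unit))
    return conn.execute("SELECT id FROM variable WHERE name = ?", (name,)).fetchone()[0]

def add_patient(conn, source):
    """Add a patient coming from a given source data base; return the new id."""
    return conn.execute("INSERT INTO patient (source) VALUES (?)", (source,)).lastrowid

def add_observation(conn, patient_id, variable_name, value, unit=None):
    var_id = add_variable(conn, variable_name, unit)
    conn.execute("INSERT INTO observation VALUES (?, ?, ?)", (patient_id, var_id, value))

def query_variable(conn, variable_name):
    """Return (source, value) pairs for one variable across all sources."""
    return conn.execute(
        """SELECT p.source, o.value FROM observation o
           JOIN patient p ON p.id = o.patient_id
           JOIN variable v ON v.id = o.variable_id
           WHERE v.name = ? ORDER BY p.id""",
        (variable_name,),
    ).fetchall()

# Made-up values, for illustration only.
conn = open_db()
p1 = add_patient(conn, "ADNI")
p2 = add_patient(conn, "Perugia")
add_observation(conn, p1, "cognitive_score", 27.0, "points")
add_observation(conn, p2, "cognitive_score", 22.0, "points")
add_observation(conn, p2, "csf_marker", 480.0, "pg/mL")  # a new kind of information
rows = query_variable(conn, "cognitive_score")
```

In this layout, integrating a new data base amounts to inserting its patients under a new `source` value, which is one possible way to meet the scalability goal stated earlier.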
23. DATA BASE DESIGN:
ONTOLOGY DESIGN
• Main difficulties
– Collaboration with domain experts
– No standardization of:
• Kinds of tests
• Ways of performing tests
• Units of measurement for evaluation
• Challenges
• A shared ontology could, in a short time, yield the world's largest collection of data for a single disease
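A minimal sketch of what a shared vocabulary buys in practice: records from two sources that name and measure the same quantities differently are mapped onto common names and units. The site names, field names and conversion rules are all hypothetical, chosen only to illustrate the standardization problems listed above.

```python
# Hypothetical shared vocabulary: canonical variable name -> canonical unit.
CANONICAL_UNITS = {"body_weight": "kg", "memory_test_score": "fraction"}

# Hypothetical per-source mappings: local field -> (canonical name, converter).
SOURCE_MAPPINGS = {
    "site_a": {
        "weight_kg": ("body_weight", lambda v: v),
        "memory_score_0_30": ("memory_test_score", lambda v: v / 30.0),
    },
    "site_b": {
        "weight_lb": ("body_weight", lambda v: v * 0.45359237),
        "memory_pct": ("memory_test_score", lambda v: v / 100.0),
    },
}

def harmonize(source, record):
    """Map one source record onto the shared vocabulary.

    Returns the harmonized record and the list of fields that have no
    mapping yet, so they can be reviewed with domain experts.
    """
    mapping = SOURCE_MAPPINGS[source]
    harmonized = {"source": source}
    unmapped = []
    for field, value in record.items():
        if field in mapping:
            name, convert = mapping[field]
            harmonized[name] = round(convert(value), 3)
        else:
            unmapped.append(field)
    return harmonized, unmapped

# Made-up records, for illustration only.
rec_a, missing_a = harmonize("site_a", {"weight_kg": 70.0, "memory_score_0_30": 24})
rec_b, missing_b = harmonize("site_b", {"weight_lb": 154.0, "memory_pct": 80, "shoe_size": 42})
```

Returning the unmapped fields instead of dropping them silently keeps the domain experts in the loop, which the slide names as one of the main difficulties.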
25. MODIAG ML FRAMEWORK DESIGN
• ML techniques
• Supervised versus unsupervised
– Unsupervised: clustering for better stratification
– Supervised: for classification
• Black box versus semantic models
– Black box: for detecting good data sets
– Semantic models: for earlier diagnosis
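The distinctions above can be made concrete with a small sketch. It uses plain k-means as the unsupervised example and a one-rule decision stump as a stand-in for a "semantic" (human-readable) model; neither is confirmed as the technique used in MoDiag, and the two markers and all values are invented for illustration.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest current centroid.
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))

    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

def fit_stump(rows, labels, feature_names):
    """Supervised, readable model for two classes: the single rule
    'feature <= threshold' with the fewest training errors."""
    best = None
    for f, name in enumerate(feature_names):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for below in sorted(set(labels)):
                errors = sum((r[f] <= threshold) != (y == below)
                             for r, y in zip(rows, labels))
                if best is None or errors < best[0]:
                    best = (errors, name, threshold, below)
    errors, name, threshold, below = best
    return {"feature": name, "threshold": threshold,
            "label_if_below": below, "errors": errors}

# Invented measurements of two hypothetical markers per patient.
rows = [(1.0, 5.0), (1.2, 4.8), (0.9, 5.3), (3.0, 5.1), (3.2, 4.9), (2.9, 5.2)]
diagnosis = ["CTL", "CTL", "CTL", "AD", "AD", "AD"]

cluster_labels, centroids = kmeans(rows, k=2)                # no diagnosis used
rule = fit_stump(rows, diagnosis, ["marker_a", "marker_b"])  # diagnosis used
```

The clustering never sees the diagnosis, so it can suggest patient strata, while the stump returns an explicit rule that a clinician can read and challenge, unlike a black-box classifier.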
26. MODIAG ML APPROACHES: DIFFERENT TYPES OF DATA
PHENOTYPE MEASURES DATA
• Physical exams
• Neurological exams
• Neuropsychological tests
• Neuroimaging
RISK FACTOR DATA
• Sequences, mutations, methylations, transcriptome
• Biospecimen: proteins, metabolites
• Family and medical history, drugs
27. MACHINE LEARNING RESULTS: RISK FACTOR DATA
• Most studies on:
– integrated data sets containing data strongly correlated with diagnosis, such as neuropsychological tests or MRI results
• Very few studies on:
– Biospecimen
– Medical History
– Family History
– Vitals
• Big efforts in collecting data and new ML
strategies are needed to study these data
28. ML RESULTS: TRANSCRIPTOME DATA ANALYSIS
The European AddNeuroMed project (https://www.synapse.org, EU) is a large multi-center initiative designed to collect and validate clinical, transcriptomic (mRNA levels of all ~30,000 genes) and proteomic data for Alzheimer's disease (AD).
MoDiag aim: to apply ML approaches to integrate blood transcriptomic data of AD patients for stratification and a more refined diagnosis (Total = 744 including AD, MCI and Controls).
29. ML PRELIMINARY RESULTS: AddNeuroMed transcriptome
Classifier: AD vs CTL (Alzheimer’s Disease vs Controls)
Max model performance with as few as 100 genes
30. ML PRELIMINARY RESULTS: AddNeuroMed transcriptome
Classifier: AD vs MCI (Alzheimer’s Disease vs Mild Cognitive Impairment)
Max model performance with as few as 140 genes
31. ML PRELIMINARY RESULTS: AddNeuroMed transcriptome
Classifier: MCI vs CTL (Mild Cognitive Impairment vs Controls)
Max model performance with as few as 110 genes
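The three result slides report that performance peaks with a small gene subset. The slides do not say which classifier or ranking method was used, so the sketch below is only one way such a "performance versus number of genes" curve could be produced: genes are ranked by a Welch-style t score on the training half and a nearest-centroid classifier is scored on the held-out half. The data are synthetic, not AddNeuroMed, and all function names and parameters are assumptions.

```python
import random
from statistics import mean, stdev

def make_synthetic(n_per_class=40, n_genes=200, n_informative=10, shift=4.0, seed=0):
    """Synthetic expression matrix: only the first n_informative genes differ in AD."""
    rng = random.Random(seed)
    samples, labels = [], []
    for label in ("CTL", "AD"):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == "AD":
                for g in range(n_informative):
                    row[g] += shift
            samples.append(row)
            labels.append(label)
    return samples, labels

def rank_genes(samples, labels):
    """Gene indices sorted by decreasing absolute Welch-style t score (AD vs CTL)."""
    ad = [s for s, y in zip(samples, labels) if y == "AD"]
    ctl = [s for s, y in zip(samples, labels) if y == "CTL"]
    scores = []
    for g in range(len(samples[0])):
        a = [s[g] for s in ad]
        c = [s[g] for s in ctl]
        se = (stdev(a) ** 2 / len(a) + stdev(c) ** 2 / len(c)) ** 0.5
        scores.append(abs(mean(a) - mean(c)) / se)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)

def nearest_centroid_accuracy(train, test, genes):
    """Fit class centroids on train using the given genes; return test accuracy."""
    (x_train, y_train), (x_test, y_test) = train, test
    centroids = {}
    for label in set(y_train):
        rows = [x for x, y in zip(x_train, y_train) if y == label]
        centroids[label] = [mean(r[g] for r in rows) for g in genes]
    correct = 0
    for x, y in zip(x_test, y_test):
        pred = min(centroids, key=lambda lab: sum(
            (x[g] - c) ** 2 for g, c in zip(genes, centroids[lab])))
        correct += pred == y
    return correct / len(y_test)

samples, labels = make_synthetic()
# Even-indexed samples train, odd-indexed samples test (both classes in each half).
train = (samples[0::2], labels[0::2])
test = (samples[1::2], labels[1::2])

# Rank on the training half only, so gene selection never sees the test samples.
ranking = rank_genes(*train)
curve = {k: nearest_centroid_accuracy(train, test, ranking[:k])
         for k in (1, 5, 10, 50, 200)}
```

Ranking genes inside the training half matters: selecting genes on the full data set before splitting would leak test information and inflate the reported performance.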
33. Conclusions and future work
• ML on genomics data: excellent results were obtained for Alzheimer’s disease vs Controls and Mild Cognitive Impairment vs Controls; good results for classifying Alzheimer’s disease vs Mild Cognitive Impairment patients.
• Clinical data: more effort is needed on new ML strategies and data organization.
• The new knowledge could allow the design of a more efficient, precise and cost-saving diagnostic workflow.
34. PRECISION MEDICINE: THE ABILITY TO TAILOR DIAGNOSIS, PROGNOSIS AND THERAPY, IDEALLY TO THE INDIVIDUAL PATIENT
Robust, accurate and sensitive biomarkers are critical to this endeavour
35. REFERENCES
Data collection
ADNI
AddNeuroMed
Machine learning literature
Mueller et al. 2005, "The Alzheimer's disease neuroimaging initiative", Neuroimaging Clinics, 15(4), 869-877.
Mahyoub et al. 2018, "Comparison Analysis of Machine Learning Algorithms to Rank Alzheimer’s Disease Risk Factors by Importance", International Conference on Developments in eSystems Engineering.
Gamberger et al. 2016, "Clusters of male and female Alzheimer’s disease patients in the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database", Brain Informatics.
Galili et al. 2014, "Categorize, Cluster, and Classify: A 3-C Strategy for Scientific Discovery in the Medical Informatics Platform of the Human Brain Project", in Džeroski et al. (Eds.): DS 2014, LNAI 8777, pp. 73–86, Springer International Publishing Switzerland.