Integrative Artificial Intelligence
Approach to Predict T1D
Kenneth Young, PhD
May 13, 2022
Health Informatics Institute
University of South Florida
Topics
• Artificial Intelligence (AI)
• dkNET Project: AI to Predict T1D
Integrative AI approach to predict T1D 2
Artificial Intelligence
• Artificial Intelligence (AI) refers to machines or
computers that emulate the cognitive functions
associated with the human mind.
• Learning
• Problem solving
• A popular branch of computer science that aims to build
intelligent machines capable of performing intelligent
tasks.
• Activities that would necessitate human intelligence.
AI Progression
• Deep learning algorithms started to become mainstream in the early
2010s.
• Deep learning algorithms are AI techniques whose neural
networks are capable of unsupervised learning from
unstructured data.
AI Learning Approaches
• Three related concepts:
• Artificial Intelligence (AI)
• Machine learning (ML)
• Deep learning (DL)
• Three primary types of learning:
• Supervised
• Unsupervised
• Reinforcement
Deep Learning Model Types
[Figure: deep learning model types (CNNs, DNNs, LSTMs, RNNs, GANs, RBFNs, MLPs, SOMs, DBNs, RBMs, AEs)]
• Deep Learning Model Types:
• Convolutional Neural Networks (CNNs)
• Long Short-Term Memory Networks (LSTMs)
• Recurrent Neural Networks (RNNs)
• Generative Adversarial Networks (GANs)
• Radial Basis Function Networks (RBFNs)
• Multilayer Perceptrons (MLPs)
• Self-Organizing Maps (SOMs)
• Deep Belief Networks (DBNs)
• Restricted Boltzmann Machines (RBMs)
• Autoencoders (AEs)
Deep Learning
• Artificial neural networks are loosely modeled on
the human brain, with neuron-like nodes
connected like a web.
• Deep learning maps inputs to outputs and
finds correlations.
• Deep learning networks are composed of several layers.
• The layers consist of nodes (“neurons”).
• Hidden layers are the layers between the
input and output layers. Together with
nonlinear activation functions, they allow for
non-linearity, which resolves the XOR, or
“exclusive or”, problem in artificial neural
network (ANN) research.
• In some deep learning models, such as
LSTMs, the output of one step feeds the
input of the next, and the network retains
information about previous inputs
through internal memory.
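The XOR point can be illustrated with a tiny network. The sketch below is not taken from the project code: the weights are set by hand (assumed values, not learned) to show that one hidden layer with a nonlinear activation is enough to represent XOR, which a network without a hidden layer cannot do.

```python
# Minimal sketch: a 2-2-1 network with hand-set (not learned) weights.
def relu(z):
    """Nonlinear activation used by the hidden units."""
    return max(0.0, z)


def xor_net(x1, x2):
    h1 = relu(x1 + x2)        # active when at least one input is on
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are on
    return h1 - 2.0 * h2      # linear output layer


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_net(a, b))
```

Removing the hidden units leaves a single linear function of x1 and x2, which cannot output 1 for exactly one active input and 0 otherwise.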
Using AI (Deep Learning)
• Feature learning: Hierarchy of
increasing complexity and
abstraction
• Makes deep learning networks
capable of handling very large, high-
dimensional data sets with billions of
parameters that pass through
nonlinear functions
• Capable of discovering latent
structures within unlabeled,
unstructured data: [Text, pictures,
video, audio]
Example of feature hierarchy learned by a deep learning model on faces from Lee et al. (2009).
AI Prediction
• Google’s DeepMind predicts the likelihood of a patient developing acute
kidney injury (AKI), a life-threatening condition
• University of Nottingham developed AI to predict the risk of early death
due to chronic diseases in a large middle-aged population
• AI used to predict progression of diabetic kidney disease
• IBM’s Watson can diagnose heart disease
• Deep learning used for the prediction of variant effects on expression and
disease risk
• Deep learning for inferring gene relationships from single-cell expression
data
• Prediction of Alzheimer’s Disease Based on Bidirectional LSTM
AI Prediction
• Researchers at MIT and Wyss Institute at
Harvard University developed an AI tool to
aid in the detection of skin cancer
• Successfully distinguished suspicious pigmented lesions (SPLs)
from non-suspicious lesions in photos of patients’ skin
with ~90% accuracy
• A pre-trained deep convolutional neural
network (DCNN) determines the
suspiciousness of individual pigmented
lesions
• Yellow = consider further inspection
• Red = requires further inspection or referral to
dermatologist
Images source: Harvard University researchers
dkNET Project: Integrative Artificial
Intelligence Approach to Predict T1D
• Goals:
• Use artificial intelligence (AI) and machine learning (ML) tools to develop
novel computational approaches for synthesis and analysis of multi-omics,
clinical, and environmental data to evaluate the prediction of type 1 diabetes
(T1D).
• Develop an AI framework and pipelines for fusion and analysis of multi-level
and multi-scale data.
• AI capable of learning temporal and static data.
Underlying Assumptions
• Results may not generalize to the general T1D population.
• Using data from the TEDDY nested case-control study (NCCS) population, we will
examine the T1D-diagnosed participants and the participants with persistent and
confirmed multiple-autoantibody islet autoimmunity (IA) in the nested case-control cohort.
• Quality control (QC) metrics, background correction, and data normalization
are performed on applicable data prior to analysis.
TEDDY Study
• This project uses data from an NIDDK-funded study called The
Environmental Determinants of Diabetes in the Young (TEDDY) study.
• The TEDDY study investigates:
• Genetic and genetic-environmental interactions, including gestational
infection or other gestational events.
• Childhood infections or other environmental factors after birth in relation to
the development of prediabetes autoimmunity and T1D.
TEDDY Study Continued
• The long-term goal of the TEDDY study is the identification of factors
which trigger T1D in genetically susceptible individuals or which
protect against the disease.
• Identification of such factors will lead to a better understanding of
disease pathogenesis and result in new strategies to prevent, delay or
reverse T1D.
• The TEDDY participants are followed for 15 years for the appearance
of various beta-cell autoantibodies and diabetes.
• The participants are no longer followed once they reach a study
endpoint.
TEDDY Study Centers
• Florida, Georgia, Washington, Colorado, Germany, Finland, Sweden
TEDDY Study Research Findings
• This work builds upon published TEDDY results.
• TEDDY researchers have found that autoimmune destruction of
insulin-producing cells typically begins in the first two years of life.
Type 1 Diabetes
• Type 1 diabetes (T1D) is a complex autoimmune disease resulting in
the destruction of β-cells, leading to deficient insulin production
over time.
• Diabetes is a worldwide epidemic: prevalence figures estimate 382
million people living with diabetes in 2013. By 2035 this number is
projected to rise to 592 million.
• The prevalence of T1D in individuals younger than 20 years of age
increased by 23% from 2001 to 2009.
• The CDC’s 2020 National Diabetes Statistics report shows the
prevalence of T1D in the U.S. increased nearly 30% from 2017 to
2020.
T1D Triggers and Risk Factors
• TEDDY researchers believe that T1D may be triggered by environmental factors
and genetic traits.
• The main genes associated with T1D are human leukocyte antigen (HLA) DR3,
DR3-DQ2, DR4, and DR4-DQ8.
• T1D risks may be the same for the general population and for genetically at-risk individuals.
• Genetic and immunopathogenic studies have directly implicated cytokines in the
pathogenesis of T1D.
• At least five autoantibody-reactivities are predictive of T1D:
• ICA, IAA, GADA, IA-2A, and ZnT8A.
• Autoantibodies against insulin (IAA) are usually the first to occur in children at risk for T1D.
• The association between antibody prevalence and T1D confirms the importance
of antibody detection in at-risk individuals prior to clinical onset.
AI Methods
• A data-driven AI approach is used to explore the TEDDY study data to predict
T1D. This may provide insight into the important features that may
cause or protect against T1D.
• Deep Learning Neural Networks:
• Recurrent Neural Networks (RNN, LSTM)
• Multilayer Perceptrons (MLP)
• Combination of Neural Networks
• Static and Sequential Feature Modeling
• TEDDY Data:
• Static: Family history, maternal history, SNPs
• Dynamic: Proteomics, gene expression, test results, clinical visits
AI Tools
• The primary development is programmed in Python version 3.7.
• For AI modeling, TensorFlow version 2.5.0 and Keras version 2.3.1
are used.
• Statistical and machine learning programming is performed
using the Visual Studio Community 2019 integrated development
environment version 16.10.2.
• The Snakemake workflow management tool is used to create
reproducible and scalable data analyses that run on
high-performance computing (HPC) environments.
• GitHub is used for version control and source code management.
• Note: These versions may change during the course of this project.
AI Data
• Data are drawn from the TEDDY study.
• Limited to the first two years of data from the TEDDY nested
case-control study (NCCS) for the first iteration of the AI model.
• From the case-control cohort of genetically at-risk (HLA-susceptibility
genotypes) TEDDY study participants.
• Comprise temporal (time-series) and static features.
• Diverse types, including multi-omics data and environmental factor
measurements collected every three to six months for fifteen years.
AI Workflow
• Acquire Data: (1) static, (2) temporal
• Preprocess Data: (1) mask, (2) drop, (3) aggregate, (4) normalize
• Data Imputation:
  (1) Static: mean, median, most frequent
  (2) Temporal: interpolate, LOCF, NOCB, case complete
• Split Data: (1) train, (2) validation, (3) test
• Feature Selection: (1) SVC, (2) SVM, (3) random forest, (4) none
• Scale Data: (1) static, (2) temporal
• AI Model: (1) compile, (2) fit, (3) predict, (4) evaluate
• AI Model Outputs: (1) accuracy, (2) loss, (3) precision, (4) recall, (5) AUC/ROC, (6) predictions
• AI Model Interpretation: (1) feature importance (static, temporal)
• AI Model Interpretation Visualization: (1) SHAP plots
Preprocessing: Data Acquisition
• Load various data files acquired from the TEDDY Study.
• Data are limited to TEDDY NCCS cohorts.
• Consists of time-series (temporal) and static data.
• Various –omics, environmental, and exposure data
Preprocessing: Data Masking
• The masking process de-identifies or obfuscates the data.
• Enhances the privacy and security of the data.
• Prevents potential research bias.
Preprocessing: Data Aggregation
• Aggregate various data from the TEDDY study, which can include
-omics, environmental, diet, and exposure data.
• These data come in various structures and can require domain
knowledge and time to prepare and aggregate.
• Data may be grouped, joined, pivoted, transposed, and merged.
• Data come from various sources and must be joined.
• The data must be transformed into a 3D time-series structure to feed the
LSTM network.
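The 2D-to-3D step can be sketched as follows. This is an illustrative stdlib-only version, not the project's package: the field names (`subject`, `month`, `f1`, `f2`) and the zero fill value are assumptions, and a real pipeline would more likely use pandas/NumPy. The target layout is (samples, timesteps, features), which is the input shape Keras recurrent layers expect.

```python
def to_3d(records, subjects, timepoints, features, fill=0.0):
    """Pivot long-format visit records into a nested list shaped
    (samples, timesteps, features). Missing visits get `fill`
    (placeholder only; they would normally be imputed)."""
    index = {(r["subject"], r["month"]): r for r in records}
    cube = []
    for s in subjects:
        series = []
        for t in timepoints:
            row = index.get((s, t), {})
            series.append([row.get(f, fill) for f in features])
        cube.append(series)
    return cube


# Hypothetical long-format rows: one row per subject per visit.
records = [
    {"subject": "A", "month": 3, "f1": 1.0, "f2": 2.0},
    {"subject": "A", "month": 6, "f1": 3.0, "f2": 4.0},
    {"subject": "B", "month": 3, "f1": 5.0, "f2": 6.0},
]
cube = to_3d(records, ["A", "B"], [3, 6], ["f1", "f2"])
shape = (len(cube), len(cube[0]), len(cube[0][0]))
print(shape)
```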
Preprocessing: Data Dropping
• Certain data columns (“features”) are dropped as they obfuscate the
data.
• Data that do not meet the limit of detection (LOD) are either
dropped or used as a binary indicator.
• Data are censored: time points > 24 months are dropped.
• Dropped data add no value to the model and only hinder the model's
predictive power.
• Examples of dropped data: participant identifiers, vial barcode numbers,
collection dates of specimens, samples below LOD.
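A minimal sketch of the dropping rules described on this slide. The column names, the `value` field, and the LOD threshold are hypothetical; the project's actual code is not shown in the slides. Here measurements are turned into a binary detected flag rather than dropped, which is one of the two options the slide mentions.

```python
# Hypothetical identifier columns; in practice these would be removed
# only after records from different sources have been joined.
ID_COLUMNS = {"participant_id", "vial_barcode", "collection_date"}
MAX_MONTH = 24  # censor time points beyond 24 months


def prepare(rows, lod):
    """Censor late visits, drop identifier columns, and replace the raw
    measurement with a binary flag: 1 if at or above the LOD, else 0."""
    kept = []
    for row in rows:
        if row["month"] > MAX_MONTH:
            continue
        clean = {k: v for k, v in row.items() if k not in ID_COLUMNS}
        clean["detected"] = int(clean.pop("value") >= lod)
        kept.append(clean)
    return kept


rows = [
    {"participant_id": "P1", "vial_barcode": "B1", "collection_date": "d1", "month": 3, "value": 0.2},
    {"participant_id": "P1", "vial_barcode": "B2", "collection_date": "d2", "month": 27, "value": 5.0},
    {"participant_id": "P2", "vial_barcode": "B3", "collection_date": "d3", "month": 12, "value": 1.5},
]
prepared = prepare(rows, lod=1.0)
print(prepared)
```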
Data Imputation
• When missing values exist in a dataset, it is important to reason about why the data are missing
and how their missingness may affect the data analysis or lead to false conclusions.
• If we ignore these missing data, statistical power may be reduced. Even more important is the
potential for bias, which may point to incorrect conclusions.
• With the AI Framework, various imputation methods can be employed:
• Temporal Data:
1. Interpolate – used by AI model
2. Last observation carry forward (LOCF)
3. Next observation carry backward (NOCB)
4. Case Complete (CC)
• Static Data:
1. Mean
2. Median – used by AI model
3. Most Frequent
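The imputation options listed above can be sketched in plain Python. This is an illustration of the techniques, not the framework's implementation; it assumes equally spaced visits and uses `None` for a missing value.

```python
from statistics import median


def locf(values):
    """Last observation carried forward; leading gaps stay missing."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


def nocb(values):
    """Next observation carried backward; trailing gaps stay missing."""
    return locf(values[::-1])[::-1]


def interpolate(values):
    """Linear interpolation between the nearest observed neighbours."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out


def impute_median(column):
    """Static features: replace missing entries with the column median."""
    fill = median(v for v in column if v is not None)
    return [fill if v is None else v for v in column]


series = [1.0, None, None, 4.0, None]
print(interpolate(series), locf(series), nocb(series))
print(impute_median([2.0, None, 10.0, 4.0]))
```

Note that interpolation and NOCB leave a trailing gap unfilled, and LOCF leaves a leading gap unfilled, so a fallback rule is still needed at the edges of a series.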
Feature Selection
• Feature selection is the process of reducing the number of input variables when
developing a predictive AI model.
• It may be beneficial to reduce the number of input variables:
• Reduce computational cost
• Improve performance of AI model
• Various methods exist to select the important features of the data:
• Random Forest
• Support Vector Machine
• Support Vector Classification
• Select K-Best
• Chi-square
Data Splitting
• The data are split into training, validation,
and test datasets.
• The AI framework uses scikit-learn's
train_test_split function.
• Options: Stratified Shuffle Split, K-Folds cross
validation iterator.
• Balance data: Dataset splits are stratified by
y-value.
• When stratified, train_test_split splits the original dataset in
such a way that the proportion of both classes
(binary classification) is preserved in the training,
validation, and test datasets.
• Splits ((training, validation), test): (80:20):10
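A stdlib-only sketch of a stratified two-stage split. It reads the ratio on the slide as "hold out 10% for testing, then split the remainder 80:20 into training and validation", which is an assumption about the notation. The framework itself uses scikit-learn's train_test_split, where stratification is requested by passing the labels as `stratify`; the helper below only illustrates the idea.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx) so that every class is split in
    (approximately) the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


# Toy labels: 60 controls (0) and 40 cases (1).
labels = [0] * 60 + [1] * 40
rest, test = stratified_split(labels, 0.10)            # 10% test
rest_labels = [labels[i] for i in rest]
tr, va = stratified_split(rest_labels, 0.20, seed=1)   # 80:20 train/validation
print(len(tr), len(va), len(test))
```

`tr` and `va` index into `rest`, so `rest[i]` maps them back to positions in the original dataset.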
Data Scaling
• AI algorithms perform better when numerical input variables are scaled to
a standard range.
• Common to scale data prior to fitting neural network model.
• Data often consist of many different input variables or features (columns),
and each may have a different range of values or units of measure.
• Robust standardization (robust data scaling) is used as it can accommodate
data with outliers. This scaler removes the median and scales the data
according to the quantile range (defaults to IQR: interquartile range).
• Data are scaled and transformed. The scikit-learn scaler's fit_transform is
used to standardize features. Training data are fit and transformed, while
validation and test data are only transformed.
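The fit-on-training, transform-everything-else pattern can be sketched without scikit-learn. `SimpleRobustScaler` is a hypothetical stand-in written for this illustration (the slides name scikit-learn's robust scaler); it centers on the median and divides by the interquartile range.

```python
from statistics import median, quantiles


class SimpleRobustScaler:
    """Illustrative stand-in for a robust scaler: (x - median) / IQR."""

    def fit(self, column):
        q1, _, q3 = quantiles(column, n=4, method="inclusive")
        self.center = median(column)
        self.scale = (q3 - q1) or 1.0  # avoid dividing by zero
        return self

    def transform(self, column):
        return [(v - self.center) / self.scale for v in column]


train = [1.0, 2.0, 3.0, 4.0, 100.0]   # 100.0 is an outlier
test = [2.5, 5.0]

scaler = SimpleRobustScaler().fit(train)   # statistics come from training data only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)       # reused, never re-fit
print(train_scaled, test_scaled)
```

Because the median and IQR ignore the extreme value, the four typical training points stay within one unit of zero while the outlier remains visibly far away.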
AI Deep Learning Model
• We developed a deep learning model
capable of learning from time-series
features and static features.
• Uses Keras and TensorFlow to construct
multi-layer perceptron neural networks
(MLP) and recurrent neural networks
(RNN)
• Concatenate neural networks
Simple MLP
Simple RNN
AI Model
• A neural network framework was developed that can combine
temporal and static data.
• The TEDDY data are temporal (time-series) and comprise
various –omics data in addition to static data (environment,
diet, exposure, SNP).
• The temporal samples are collected at three-month intervals
(time-series).
• The data outcome is binary (0 or 1). 1 represents persistent
confirmed islet autoimmunity (IA), which in the TEDDY study
is defined using MIAA, GADA, and IA-2A autoantibodies.
• Future work will predict T1D as the binary outcome.
AI Model Hyperparameter tuning
• Building AI models is an iterative process to optimize the
model’s performance and compute resources.
• The settings adjusted during each iteration are called
hyperparameters, which govern the training process.
• A hyperparameter is a training parameter set by an AI
engineer before training the model. These parameters are not
learned by the machine learning model during the training
process.
• These decisions impact model metrics, such as accuracy.
AI Model Hyperparameters
• Example hyperparameters to tune:
• Batch size
• Number of epochs
• Number of hidden layers
• Dropout
• Learning rate
• Optimizer
• Regularization
• Gradient clipping
• Activation functions
• Loss/cost functions
• Tools exist to aid in hyperparameter tuning:
• Ray Tune
• KerasTuner
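As a rough illustration of what a tuner iterates over, the sketch below enumerates a small exhaustive grid. This is plain grid search, not the Ray Tune or KerasTuner API, and the hyperparameter names and candidate values are assumed for the example.

```python
from itertools import product

# Hypothetical search space; values are illustrative only.
search_space = {
    "batch_size": [16, 32],
    "learning_rate": [1e-3, 1e-4],
    "dropout": [0.2, 0.5],
}

# One dict per combination; each would be used to build and train a model.
configs = [
    dict(zip(search_space, values))
    for values in product(*search_space.values())
]
print(len(configs), configs[0])
```

The number of combinations grows multiplicatively with each added hyperparameter, which is one reason dedicated tuning tools use smarter search strategies.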
Deep Learning Models
• Concatenated multi-layer
perceptron neural
networks (MLP) and
recurrent neural networks
(RNN)
AI Model: Concatenated
• Concatenated multi-layer
perceptron neural networks
(MLP) and bidirectional long
short-term memory (LSTM)
recurrent neural networks
(RNN), with hidden layers
• Single output layer for binary
prediction (1, 0)
AI Testing and Validating
• AI uses training and
validation data to classify the
object as a Chihuahua or
muffin
• Similar approaches are used
to classify an image as either
benign or malignant tumor
• Entire AI process must be
tested and validated, from
data preprocessing to the
model outputs
AI Model Output: Accuracy
• Training and validation data
fed into the AI model will
produce accuracy results of
the training and validation
data
• Accuracy for training and
validation of the AI model
AI Model Output: Loss
• Binary cross-entropy
• Loss for training and validation
of the AI model
• The graph shows the model
loss for the training and
validation data
• If the validation loss
continually increases after a
specific epoch, this could be
due to overfitting
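Binary cross-entropy, the loss named on this slide, can be written out directly. A minimal sketch with made-up probabilities; the clipping constant `eps` is an assumption added to avoid taking the log of zero.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


uninformed = binary_cross_entropy([1, 0], [0.5, 0.5])   # equals ln(2)
confident = binary_cross_entropy([1, 0], [0.9, 0.1])    # correct and confident
wrong = binary_cross_entropy([1, 0], [0.1, 0.9])        # confidently wrong
print(uninformed, confident, wrong)
```

A model that always predicts 0.5 scores ln(2), about 0.693, so a loss near that value means the predictions carry little information beyond chance on balanced data.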
AI Model Output: Precision
• What proportion of
positive identifications is
actually correct?
• Precision = TP / (TP + FP)
• 0.78125 = 25 / (25+7)
AI Model Output: Recall
• Model Recall
“Sensitivity”
• What proportion of
actual positives are
identified correctly?
• Recall = TP / (TP + FN)
• 0.6098 = 25 / (25 + 16)
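The precision and recall values quoted on these two slides follow directly from the confusion-matrix counts. In the sketch below, the true-negative count is not given on the slides; it is inferred from the 83 test records reported under Current Project Outcomes, so treat it as an assumption.

```python
tp, fp, fn = 25, 7, 16        # counts quoted on the precision and recall slides
n_test = 83                   # test records reported in the project outcomes
tn = n_test - tp - fp - fn    # assumption: inferred, not stated on the slides

precision = tp / (tp + fp)    # share of positive calls that are correct
recall = tp / (tp + fn)       # share of actual positives that are found
accuracy = (tp + tn) / n_test

print(precision, recall, accuracy)
```

Under that assumption the accuracy rounds to 0.7229, which agrees with the value listed for test run 1 in the iterative testing table.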
AI Model Output: AUC - ROC
• An AUC closer to one
indicates better
separability between the classes
• The ROC curve shows the
trade-off between
sensitivity (or TPR) and
specificity (1 – FPR)
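One way to see what AUC measures: it equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative. The sketch below computes it by comparing all positive/negative pairs; the labels and scores are toy values, not project data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

Three of the four positive/negative pairs are ordered correctly here, so the AUC is 0.75; a value of 0.5 would correspond to random ranking.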
AI Model Output: Iterative Testing Metrics
• Building AI models is
an iterative process.
• To validate the
model, data are
repeatedly drawn at random
and evaluated by
the model.
• The results of this
model still require
further QA and
validation testing.
Test Run 8 Output:
Test Run               Accuracy         Loss             Precision       Recall           AUC              F-Score
1                      0.7229           0.6151           0.7812          0.6098           0.7113           0.7192
2                      0.7229           0.6069           0.7813          0.6098           0.7340           0.7192
3                      0.6987           0.6052           0.7222          0.6341           0.7288           0.6975
4                      0.7469           0.6152           0.7632          0.7073           0.7465           0.7465
5                      0.7469           0.5963           0.7941          0.6586           0.7387           0.7445
6                      0.7108           0.6563           0.7179          0.6829           0.7305           0.7106
7                      0.7349           0.6320           0.7209          0.7561           0.7445           0.7345
8                      0.7229           0.6169           0.7813          0.6098           0.7520           0.7192
Mean ± SD (runs 1-5)   0.72766 ± 0.02   0.60774 ± 0.01   0.7684 ± 0.03   0.64392 ± 0.04   0.73186 ± 0.01   0.72538 ± 0.02
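The unlabeled summary row appears to be the mean and sample standard deviation of the first five runs rather than all eight. A quick check on the accuracy column, using the values from the table:

```python
from statistics import mean, stdev

# Accuracy column, test runs 1 to 8, copied from the table.
accuracy = [0.7229, 0.7229, 0.6987, 0.7469, 0.7469, 0.7108, 0.7349, 0.7229]

first_five = accuracy[:5]
mean_5 = round(mean(first_five), 5)
sd_5 = round(stdev(first_five), 2)     # sample standard deviation
mean_8 = round(mean(accuracy), 5)

print(mean_5, sd_5)   # compare with the summary row
print(mean_8)         # mean over all eight runs, for comparison
```

The five-run figures reproduce the reported 0.72766 ± 0.02, while the eight-run mean is slightly lower, so the summary is worth recomputing once all runs are final.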
AI Model Output: Important Temporal Features
• Top temporal features, ranked
by mean absolute SHAP value.
• The color bar corresponds to
raw values: high raw values
appear red and low raw values
appear blue.
• Each variable appears as its own
point.
• The distribution shows how a
variable may influence the
model.
• Features extending to the right may
influence a (1) prediction.
• Features extending to the left may
influence a (0) prediction.
AI Model Output: Important Static Features
• Top Static Features
• The distribution shows how
a variable may influence the
model.
• Features extending to the right
may influence a (1)
prediction.
• Features extending to the left
may influence a (0)
prediction.
• Blue = low raw value
• Red = high raw value
Project Limitations
• This project analyzes high-dimensional clinical research data; not all data were available
at the time of this research.
• T1D cases available from the TEDDY study have nearly doubled, and TEDDY is in the process
of developing a second case-control study. The additional data could improve the
statistical power and accuracy of this project.
• T1D is an endpoint in the TEDDY study: study participants who are diagnosed with
T1D are no longer followed after their study endpoint visit. This limits the amount of data
available.
• AI model output and important features must be validated and reviewed by TEDDY
researchers.
• Results may be difficult to generalize to the general T1D population.
• Some data required censoring, imputation, and exclusion.
• Right censoring is a concern, as ordinary linear regression is often ineffective at handling
the censoring of observations.
Current Project Outcomes
• We successfully developed an AI framework that incorporates -omics and
environmental data to predict outcomes.
• Developed packages to prepare data for the AI model and transform 2D
data into 3D data that can be fed into the AI model.
• The AI model is capable of combining temporal (time-series) and static
data.
• Combines (concatenates) AI neural networks, LSTM and MLP.
• Currently predicting IA± on 83 test records with a mean accuracy of 0.72.
• While the AI model still needs to be validated internally by the project
team and TEDDY researchers, a combination of –omics data and static data
from the TEDDY study can be run through the AI model. The project is still
in development and being tested.
Next Steps
• Enhance current AI models.
• Tune hyperparameters, perform quality assurance (QA), and validate and test models.
• The current model is being developed to predict IA±, but additional data
are needed to feed the model and predict T1D as an outcome.
• Acquire and prepare data for additional Neural Network (NN) models.
• Program, validate, and test additional NN models for additional data.
• Quality control of AI framework and model, internal/construct validity.
• Generalize AI framework to incorporate additional methods and models for
predictions of various diseases.
• Collaborate with TEDDY researchers to validate AI model outcomes and
important features.
Acknowledgements
• Health Informatics Institute at the University of South
Florida (USF)
• Dr. Jeffrey Krischer
• Dr. Kristian Lynch
• Leo Moreno
• Chris Shaffer
• TEDDY Publications Committee
• TEDDY Study Group:
• NIH, TEDDY DCC, clinical center investigators, clinical center
staff, and TEDDY study participants.
• dkNET / NIDDK
Integrative AI approach to predict T1D 49
May 13, 2022

Weitere ähnliche Inhalte

Ähnlich wie dkNET Webinar "Integrative Artificial Intelligence Approach to Predict T1D" 05/13/2022

Determination of Various Diseases in Two Most Consumed Fruits using Artificia...
Determination of Various Diseases in Two Most Consumed Fruits using Artificia...Determination of Various Diseases in Two Most Consumed Fruits using Artificia...
Determination of Various Diseases in Two Most Consumed Fruits using Artificia...
ijtsrd
 
Diabetes prediciton model ppt.ppt
Diabetes prediciton model ppt.pptDiabetes prediciton model ppt.ppt
Diabetes prediciton model ppt.ppt
satvikpatil5
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
Thinkful
 

Ähnlich wie dkNET Webinar "Integrative Artificial Intelligence Approach to Predict T1D" 05/13/2022 (20)

Tech sem 2_dilip
Tech sem 2_dilipTech sem 2_dilip
Tech sem 2_dilip
 
Géant and DECIDE - improving quality of life for sufferers from Alzheimer's D...
Géant and DECIDE - improving quality of life for sufferers from Alzheimer's D...Géant and DECIDE - improving quality of life for sufferers from Alzheimer's D...
Géant and DECIDE - improving quality of life for sufferers from Alzheimer's D...
 
Big Data in Distributed Analytics,Cybersecurity And Digital Forensics
Big Data in Distributed Analytics,Cybersecurity And Digital ForensicsBig Data in Distributed Analytics,Cybersecurity And Digital Forensics
Big Data in Distributed Analytics,Cybersecurity And Digital Forensics
 
Artificial Intelligence and radiology.pptx
Artificial  Intelligence and radiology.pptxArtificial  Intelligence and radiology.pptx
Artificial Intelligence and radiology.pptx
 
Unveiling Tomorrow_ The Future of Data Science.pdf
Unveiling Tomorrow_ The Future of Data Science.pdfUnveiling Tomorrow_ The Future of Data Science.pdf
Unveiling Tomorrow_ The Future of Data Science.pdf
 
Data science lecture1_doaa_mohey
Data science lecture1_doaa_moheyData science lecture1_doaa_mohey
Data science lecture1_doaa_mohey
 
Big Data - A view
Big Data - A viewBig Data - A view
Big Data - A view
 
Determination of Various Diseases in Two Most Consumed Fruits using Artificia...
Determination of Various Diseases in Two Most Consumed Fruits using Artificia...Determination of Various Diseases in Two Most Consumed Fruits using Artificia...
Determination of Various Diseases in Two Most Consumed Fruits using Artificia...
 
G12_PPT_final.pptx
G12_PPT_final.pptxG12_PPT_final.pptx
G12_PPT_final.pptx
 
Diabetes prediciton model ppt.ppt
Diabetes prediciton model ppt.pptDiabetes prediciton model ppt.ppt
Diabetes prediciton model ppt.ppt
 
Getting started in Data Science (April 2017, Los Angeles)
Getting started in Data Science (April 2017, Los Angeles)Getting started in Data Science (April 2017, Los Angeles)
Getting started in Data Science (April 2017, Los Angeles)
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data Science
 
Imaging Data Commons (IDC) - Introduction and intital approach
Imaging Data Commons (IDC) - Introduction and intital approachImaging Data Commons (IDC) - Introduction and intital approach
Imaging Data Commons (IDC) - Introduction and intital approach
 
Graph AI Industrial Applications: From Explainability To Discovery
Graph AI Industrial Applications: From Explainability To DiscoveryGraph AI Industrial Applications: From Explainability To Discovery
Graph AI Industrial Applications: From Explainability To Discovery
 
Informatics Engineering, an International Journal (IEIJ)
Informatics Engineering, an International Journal (IEIJ)Informatics Engineering, an International Journal (IEIJ)
Informatics Engineering, an International Journal (IEIJ)
 
ppt1 - Copy (1).pptx
ppt1 - Copy (1).pptxppt1 - Copy (1).pptx
ppt1 - Copy (1).pptx
 
Computational intelligence for big data analytics bda 2013
Computational intelligence for big data analytics   bda 2013Computational intelligence for big data analytics   bda 2013
Computational intelligence for big data analytics bda 2013
 
The Internet of Things: What's next?
The Internet of Things: What's next? The Internet of Things: What's next?
The Internet of Things: What's next?
 
RETINAL IMAGE CLASSIFICATION USING NEURAL NETWORK BASED ON A CNN METHODS
RETINAL IMAGE CLASSIFICATION USING NEURAL NETWORK BASED ON A CNN METHODSRETINAL IMAGE CLASSIFICATION USING NEURAL NETWORK BASED ON A CNN METHODS
RETINAL IMAGE CLASSIFICATION USING NEURAL NETWORK BASED ON A CNN METHODS
 

Mehr von dkNET

dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET
 
dkNET Webinar: Tabula Sapiens 03/22/2024
dkNET Webinar: Tabula Sapiens 03/22/2024dkNET Webinar: Tabula Sapiens 03/22/2024
dkNET Webinar: Tabula Sapiens 03/22/2024
dkNET
 
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
dkNET Webinar "The Multi-Omic Response to Exercise Training Across Rat Tissue...
dkNET
 
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET Webinar: The Collaborative Microbial Metabolite Center – Democratizing ...
dkNET
 
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
dkNET Webinar: An Encyclopedia of the Adipose Tissue Secretome to Identify Me...
dkNET
 
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
dkNET Webinar: A Single Cell Atlas of Human and Mouse White Adipose Tissue 11...
dkNET
 
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
dkNET Webinar "The National Sleep Research Resource (NSRR) - Opportunities fo...
dkNET
 
dkNET Webinar "Integrative Artificial Intelligence Approach to Predict T1D" 05/13/2022

  • 1. Integrative Artificial Intelligence Approach to Predict T1D • Kenneth Young, PhD • Health Informatics Institute, University of South Florida • May 13, 2022
  • 2. Topics • Artificial Intelligence (AI) • dkNET Project: AI to Predict T1D Integrative AI approach to predict T1D 2
  • 3. Artificial Intelligence • Artificial Intelligence (AI) refers to machines or computers that emulate the cognitive functions associated with the human mind. • Learning • Problem solving • A popular branch of computer science that aims to build intelligent machines capable of performing intelligent tasks. • Activities that would necessitate human intelligence.
  • 4. AI Progression • Deep learning algorithms started to become mainstream in the early 2010s. • Deep learning algorithms are AI techniques whose neural networks are capable of unsupervised learning from unstructured data.
  • 5. AI Learning Approaches • Three related concepts: • Artificial Intelligence (AI) • Machine learning (ML) • Deep learning (DL) • Three primary types of learning: • Supervised • Unsupervised • Reinforcement
  • 6. Deep Learning Model Types • Convolutional Neural Networks (CNNs) • Long Short-Term Memory Networks (LSTMs) • Recurrent Neural Networks (RNNs) • Generative Adversarial Networks (GANs) • Radial Basis Function Networks (RBFNs) • Multilayer Perceptrons (MLPs) • Self-Organizing Maps (SOMs) • Deep Belief Networks (DBNs) • Restricted Boltzmann Machines (RBMs) • Autoencoders (AEs) • [Diagram: the model types above, plus DNNs, arranged around the label "Deep Learning Model Types"]
  • 7. Deep Learning • Artificial neural networks are built like the human brain, with neuron nodes connected like a web. • Deep learning maps inputs to outputs and finds correlations. • Deep learning is composed of several layers. • The layers consist of nodes (“neurons”). • Hidden layers are those layers with nodes other than the input and output nodes; they allow for non-linearity and resolve the XOR (“exclusive or”) problem in artificial neural network (ANN) research. • The output of some deep learning models, such as LSTMs, feeds the input of the next layer, and the model memorizes previous inputs through internal memory.
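A minimal sketch of why a hidden layer matters for the XOR problem mentioned above. The weights are hand-picked for illustration (they are not learned and do not come from the talk): two ReLU hidden units followed by a linear output reproduce XOR, which a single linear unit cannot do.

```python
def relu(z):
    """Rectified linear unit: the non-linearity supplied by the hidden layer."""
    return max(0.0, z)

def xor_net(x1, x2):
    """Tiny 2-2-1 network with hand-picked (hypothetical) weights."""
    h1 = relu(x1 + x2)        # active when at least one input is on
    h2 = relu(x1 + x2 - 1.0)  # active only when both inputs are on
    return h1 - 2.0 * h2      # linear output layer

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

Removing the ReLU makes the whole network linear again, so the hidden layer would no longer help.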
  • 8. Using AI (Deep Learning) • Feature learning: Hierarchy of increasing complexity and abstraction • Makes deep learning networks capable of handling very large, high-dimensional data sets with billions of parameters that pass through nonlinear functions • Capable of discovering latent structures within unlabeled, unstructured data: text, pictures, video, audio • [Figure: Example of feature hierarchy learned by a deep learning model on faces, from Lee et al. (2009).]
  • 9. AI Prediction • Google’s DeepMind predicts the likelihood of a patient developing acute kidney injury (AKI), a life-threatening condition • University of Nottingham developed AI to predict the risk of early death due to chronic diseases in a large middle-aged population • AI used to predict progression of diabetic kidney disease • IBM’s Watson can diagnose heart disease • Deep learning used for the prediction of variant effects on expression and disease risk • Deep learning for inferring gene relationships from single-cell expression data • Prediction of Alzheimer’s disease based on bidirectional LSTM
  • 10. AI Prediction • Researchers at MIT and the Wyss Institute at Harvard University developed an AI tool to aid in the detection of skin cancer • Successfully distinguished suspicious pigmented lesions (SPLs) from non-suspicious lesions in photos of patients’ skin with ~90% accuracy • A pre-trained deep convolutional neural network (DCNN) determines the suspiciousness of individual pigmented lesions • Yellow = consider further inspection • Red = requires further inspection or referral to dermatologist • Images source: Harvard University researchers
  • 11. dkNET Project: Integrative Artificial Intelligence Approach to Predict T1D • Goals: • Use artificial intelligence (AI) and machine learning (ML) tools to develop novel computational approaches for synthesis and analysis of multi-omics, clinical, and environmental data to evaluate the prediction of type 1 diabetes (T1D). • Develop an AI framework and pipelines for fusion and analysis of multi-level and multi-scale data. • AI capable of learning temporal and static data.
  • 12. Underlying Assumptions • Results may not apply to the general T1D population. • Using data from the TEDDY nested case-control study (NCCS) population, we will examine the T1D and the persistent and confirmed multiple-autoantibody (IA) patient populations of the nested case-control cohort. • Quality control (QC) metrics, background correction, and data normalization are performed on applicable data prior to analysis.
  • 13. TEDDY Study • This project uses data from the NIDDK-funded study The Environmental Determinants of Diabetes in the Young (TEDDY). • The TEDDY study investigates: • Genetic and genetic-environmental interactions, including gestational infection or other gestational events. • Childhood infections or other environmental factors after birth in relation to the development of prediabetes autoimmunity and T1D.
  • 14. TEDDY Study Continued • The long-term goal of the TEDDY study is the identification of factors that trigger T1D in genetically susceptible individuals or that protect against the disease. • Identification of such factors will lead to a better understanding of disease pathogenesis and result in new strategies to prevent, delay, or reverse T1D. • The TEDDY participants are followed for 15 years for the appearance of various beta-cell autoantibodies and diabetes. • The participants are no longer followed once they reach a study endpoint.
  • 15. TEDDY Study Centers • Florida • Georgia • Washington • Colorado • Germany • Finland • Sweden
  • 16. TEDDY Study Research Findings • This work builds upon published TEDDY results. • TEDDY researchers have found that autoimmune destruction of insulin-producing cells typically begins in the first two years of life.
  • 17. Type 1 Diabetes • Type 1 diabetes (T1D) is a complex autoimmune disease resulting in the destruction of β-cells, leading to deficient insulin production over time. • Diabetes is a worldwide epidemic: an estimated 382 million people were living with diabetes in 2013, and this number is projected to rise to 592 million by 2035. • The prevalence of T1D in individuals younger than 20 years of age increased by 23% from 2001 to 2009. • The CDC’s 2020 National Diabetes Statistics Report shows the prevalence of T1D in the U.S. increased nearly 30% from 2017 to 2020.
  • 18. T1D Triggers and Risk Factors • TEDDY researchers believe that T1D may be triggered by environmental factors and genetic traits. • The main genes associated with T1D are human leukocyte antigen (HLA) DR3, DR3-DQ2, DR4, and DR4-DQ8. • T1D risks may be the same for the general population and for those genetically at risk. • Genetic and immunopathogenic studies have directly implicated cytokines in the pathogenesis of T1D. • At least five autoantibody reactivities are predictive of T1D: • ICA, IAA, GADA, IA-2A, and ZnT8A. • Autoantibodies against insulin (IAA) are usually the first to occur in children at risk for T1D. • The association between antibody prevalence and T1D confirms the importance of antibody detection in at-risk individuals prior to clinical onset.
  • 19. AI Methods • A data-driven AI approach to explore the TEDDY study data to predict T1D. This may provide insight into the important features that may cause or protect against T1D. • Deep Learning Neural Networks: • Recurrent Neural Networks (RNN, LSTM) • Multilayer Perceptrons (MLP) • Combination of Neural Networks • Static and Sequential Feature Modeling • TEDDY Data: • Static: Family history, maternal history, SNPs • Dynamic: Proteomics, gene expression, test results, clinical visits
  • 20. AI Tools • The primary development is programmed in Python version 3.7. • For AI modeling, TensorFlow version 2.5.0 and Keras version 2.3.1 are used. • Statistical and machine learning programming is performed in the Visual Studio Community 2019 integrated development environment, version 16.10.2. • The Snakemake workflow management tool is used to create reproducible and scalable data analyses that run on high performance computing (HPC) environments. • GitHub is used for version control and source code management. • Note: These versions may change during the course of this project.
  • 21. AI Data • Utilized from the TEDDY study. • Limited to the first two years of data from the TEDDY nested case-control study (NCCS) for the first iteration of the AI model. • From the case-control cohort of genetically at-risk (HLA-susceptibility genotypes) participants in the TEDDY study. • Comprises temporal (time-series) and static features. • Diverse types, including multi-omics data and environmental factor measurements taken every three to six months for fifteen years.
  • 22. AI Workflow • Acquire Data: Static, Temporal • Preprocess Data: Mask, Drop, Aggregate, Normalize • Data Imputation: Static (Mean, Median, Most Frequent); Temporal (Interpolate, LOCF, NOCB, Case Complete) • Split Data: Train, Validation, Test • Feature Selection: SVC, SVM, Random Forest, None • Scale Data: Static, Temporal • AI Model: Compile, Fit, Predict, Evaluate • AI Model Outputs: Accuracy, Loss, Precision, Recall, AUC/ROC, Predictions • AI Model Interpretation: Feature Importance (Static, Temporal) • AI Model Interpretation Visualization: SHAP Plots
  • 23. Preprocessing: Data Acquisition • Load various data files acquired from the TEDDY study. • Data are limited to TEDDY NCCS cohorts. • Consists of time-series (temporal) and static data. • Various -omics, environmental, and exposure data
  • 24. Preprocessing: Data Masking • The masking process de-identifies or obfuscates the data. • Enhances the privacy and security of the data. • Prevents potential research bias.
  • 25. Preprocessing: Data Aggregation • Aggregate various data from the TEDDY study, which can include -omics, environmental, diet, and exposure data. • These data come in various structures that can require domain knowledge and time to prepare and aggregate. • Data may be grouped, joined, pivoted, transposed, and merged. • Data are from various sources and must be joined. • The data must be transformed to a 3D time-series structure to feed the LSTM network.
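The 2D-to-3D transformation described above can be sketched as follows. This is an illustrative, stdlib-only version, not the project's actual package: the function name `to_3d`, the row keys `subject` and `visit`, and the feature names are all hypothetical. It pivots long-format visit rows into the [samples][timesteps][features] layout that recurrent layers such as LSTMs typically expect.

```python
from collections import defaultdict

def to_3d(rows, feature_names, n_timesteps, fill=None):
    """Pivot long-format rows into [samples][timesteps][features].

    rows: dicts with hypothetical keys 'subject' and 'visit' (0-based index)
    plus one key per feature. Visits that are absent are filled with `fill`.
    """
    by_subject = defaultdict(dict)
    for row in rows:
        by_subject[row["subject"]][row["visit"]] = [
            row.get(name, fill) for name in feature_names
        ]
    subjects = sorted(by_subject)
    empty = [fill] * len(feature_names)
    tensor = [
        [by_subject[s].get(t, list(empty)) for t in range(n_timesteps)]
        for s in subjects
    ]
    return subjects, tensor

rows = [
    {"subject": "A", "visit": 0, "protein_x": 1.2, "gene_y": 0.4},
    {"subject": "A", "visit": 1, "protein_x": 1.5, "gene_y": 0.6},
    {"subject": "B", "visit": 0, "protein_x": 0.9, "gene_y": 0.1},
]
subjects, tensor = to_3d(rows, ["protein_x", "gene_y"], n_timesteps=2)
print(subjects)
print(tensor)
```

Subject B has no second visit, so that time step is left as missing values for a later imputation step to handle.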
  • 26. Preprocessing: Data Dropping • Certain data columns (“features”) are dropped as they obfuscate the data. • Data that do not meet the level of detection (LOD) are either dropped or used as binary. • Censor data: drop data with time points > 24 months. • Dropped data add no value to the model and only hinder the model's predictive power. • Example data dropped: participant identifiers, vial barcode numbers, collection dates of specimens, samples below LOD.
  • 27. Data Imputation • When missing values exist in a dataset, it is important to reason about why the data are missing and how their missingness may affect the data analysis or lead to false conclusions. • If we ignore these missing data, statistical power may be reduced. Even more important is the potential to bias results, which may point to incorrect conclusions. • With the AI framework, various imputation methods can be employed: • Temporal Data: 1. Interpolate (used by AI model) 2. Last observation carried forward (LOCF) 3. Next observation carried backward (NOCB) 4. Case Complete (CC) • Static Data: 1. Mean 2. Median (used by AI model) 3. Most Frequent
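The temporal and static imputation options listed above can be illustrated with a small stdlib-only sketch. The function names and the example series are hypothetical, and gaps are represented as `None`; the project's own implementation is not shown in the talk.

```python
from statistics import median

def locf(values):
    """Last observation carried forward; leading gaps stay None."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out

def nocb(values):
    """Next observation carried backward; trailing gaps stay None."""
    return locf(values[::-1])[::-1]

def interpolate(values):
    """Linear interpolation between observed neighbours (interior gaps only)."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = values[left] + step * (i - left)
    return out

def impute_median(values):
    """Static-feature imputation: replace gaps with the observed median."""
    m = median(v for v in values if v is not None)
    return [m if v is None else v for v in values]

series = [None, 2.0, None, None, 8.0, None]
print(locf(series))         # [None, 2.0, 2.0, 2.0, 8.0, 8.0]
print(nocb(series))         # [2.0, 2.0, 8.0, 8.0, 8.0, None]
print(interpolate(series))  # [None, 2.0, 4.0, 6.0, 8.0, None]
print(impute_median([1.0, None, 3.0, 10.0]))  # [1.0, 3.0, 3.0, 10.0]
```

Note that each temporal method leaves some edge gaps unfilled, which is one reason a pipeline may combine methods or fall back to complete cases.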
  • 28. Feature Selection • Feature selection is the process of reducing the number of input variables when developing a predictive AI model. • It may be beneficial to reduce the number of input variables: • Reduce computational cost • Improve performance of AI model • Various methods exist to select the important features of the data: • Random Forest • Support Vector Machine • Support Vector Classification • Select K-Best • Chi-square
  • 29. Data Splitting • The data are split into training, validation, and test datasets. • The AI framework uses scikit-learn's train_test_split function. • Options: Stratified Shuffle Split, K-Folds cross-validation iterator. • Balance data: Dataset splits are stratified by y-value. • The train_test_split function splits the original dataset in such a way that the proportion of both classes (binary classification) is preserved in the training, validation, and test datasets. • Splits ((training, validation), test): (80:20):10
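A stdlib-only sketch of a stratified three-way split, to show what "stratified by y-value" means in practice. The talk uses scikit-learn's `train_test_split` (where stratification is requested with the `stratify` argument); the function below is a hypothetical stand-in. It also assumes one reading of "(80:20):10": 10% of the records are held out for testing and the remainder is split 80:20 into training and validation.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.10, val_frac=0.20, seed=0):
    """Return (train, val, test) index lists with class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)                      # shuffle within each class
        n_test = round(len(idxs) * test_frac)
        n_val = round((len(idxs) - n_test) * val_frac)
        test += idxs[:n_test]
        val += idxs[n_test:n_test + n_val]
        train += idxs[n_test + n_val:]
    return train, val, test

labels = [0] * 50 + [1] * 50
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 72 18 10
```

Because the split is done per class, each subset keeps the 50:50 balance of the example labels.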
  • 30. Data Scaling • AI algorithms perform better when numerical input variables are scaled to a standard range. • It is common to scale data prior to fitting a neural network model. • Data often consist of many different input variables or features (columns), and each may have a different range of values or units of measure. • Robust standardization (robust data scaling) is used as it can accommodate data with outliers. This scaler removes the median and scales the data according to the quantile range (defaults to the interquartile range, IQR). • Data are scaled and transformed. The scikit-learn scaler's fit_transform method is used to standardize features. Training data are fit and transformed, while validation and test data are only transformed.
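A single-feature sketch of the robust scaling described above, written with the standard library so the arithmetic is visible. The class name `RobustScaler1D` is hypothetical; in scikit-learn the corresponding tool is `sklearn.preprocessing.RobustScaler`. The key point is that the median and IQR are learned from the training data only and then reused for validation and test data.

```python
from statistics import median, quantiles

class RobustScaler1D:
    """Stdlib-only sketch of median/IQR scaling for one feature."""

    def fit(self, values):
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        self.center_ = median(values)
        self.scale_ = (q3 - q1) or 1.0  # guard against constant features
        return self

    def transform(self, values):
        return [(v - self.center_) / self.scale_ for v in values]

train = [1.0, 2.0, 3.0, 4.0, 100.0]   # 100.0 is an outlier
scaler = RobustScaler1D().fit(train)  # statistics come from training data only
print(scaler.transform(train))        # [-1.0, -0.5, 0.0, 0.5, 48.5]
print(scaler.transform([5.0]))        # [1.0]  (validation/test: transform only)
```

The outlier stays large after scaling, but it does not distort the center or spread used for the other values, which is the advantage over mean and standard deviation scaling.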
  • 31. AI Deep Learning Model • We developed a deep learning model capable of learning from time-series features and static features. • Uses Keras and TensorFlow to construct multi-layer perceptron (MLP) neural networks and recurrent neural networks (RNN) • Concatenates the neural networks • [Figures: Simple MLP; Simple RNN]
  • 32. AI Model • A neural network framework was developed that can combine temporal and static data. • The TEDDY data are temporal (time-series) and comprise various -omics data in addition to static data (environment, diet, exposure, SNP). • The temporal samples are collected at three-month intervals (time-series). • The data outcome is binary (0 or 1). 1 represents persistent confirmed islet autoimmunity (IA), which, in the TEDDY study, is defined using MIAA, GADA, and IA2A autoantibodies. • Future work will predict T1D as the binary outcome.
  • 33. AI Model Hyperparameter Tuning • Building AI models is an iterative process to optimize the model’s performance and compute resources. • The settings adjusted during each iteration are called hyperparameters, which govern the training process. • A hyperparameter is a training parameter set by an AI engineer before training the model. These parameters are not learned by the machine learning model during the training process. • These decisions impact model metrics, such as accuracy.
  • 34. AI Model Hyperparameters • Example hyperparameters to tune: • Batch size • Number of epochs • Number of hidden layers • Dropout • Learning rate • Optimizer • Regularization • Gradient clipping • Activation functions • Loss/cost functions • Tools exist to aid in hyperparameter tuning: • Ray Tune • KerasTuner
  • 35. Deep Learning Models • Concatenated multi-layer perceptron (MLP) neural networks and recurrent neural networks (RNN)
  • 36. AI Model: Concatenated • Concatenated multi-layer perceptron (MLP) neural networks and bi-directional long short-term memory (LSTM) recurrent neural networks (RNN), with hidden layers • Single output layer for binary prediction (1, 0)
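To make the concatenation idea concrete, here is a deliberately simplified forward pass in plain Python. It is a sketch under several assumptions, not the project's Keras/TensorFlow model: a basic Elman-style recurrent layer stands in for the bi-directional LSTM, all weights and inputs are arbitrary illustrative numbers, and nothing is trained. It shows only the data flow: a static branch and a temporal branch are computed separately, their outputs are concatenated, and a single sigmoid unit produces the binary prediction.

```python
import math

def dense(x, weights, bias, activation):
    """Fully connected layer: one weight row per output unit."""
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def simple_rnn(sequence, w_in, w_rec, bias):
    """Elman-style recurrent pass; returns the final hidden state."""
    h = [0.0] * len(bias)
    for x_t in sequence:
        h = [math.tanh(sum(w * xi for w, xi in zip(w_in[k], x_t))
                       + sum(w * hj for w, hj in zip(w_rec[k], h))
                       + bias[k])
             for k in range(len(bias))]
    return h

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

static_x = [1.0, 0.0, 2.0]                         # 3 static features
temporal_x = [[0.1, 0.2], [0.3, 0.1], [0.5, 0.4]]  # 3 visits x 2 features

# Static branch (MLP-like) and temporal branch (recurrent), hypothetical weights.
static_h = dense(static_x, [[0.2, -0.1, 0.3], [0.0, 0.5, -0.2]], [0.0, 0.1], relu)
temporal_h = simple_rnn(temporal_x,
                        [[0.4, -0.3], [0.1, 0.2]],
                        [[0.1, 0.0], [0.0, 0.1]],
                        [0.0, 0.0])

merged = static_h + temporal_h  # concatenation of the two branches
prob = dense(merged, [[0.5, -0.4, 0.3, 0.2]], [0.0], sigmoid)[0]
print(len(merged), prob)
```

The output `prob` is a value between 0 and 1 that would be thresholded to give the (1, 0) prediction.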
  • 37. AI Testing and Validating • AI uses training and validation data to classify the object as a Chihuahua or muffin • Similar approaches are used to classify an image as either benign or malignant tumor • Entire AI process must be tested and validated, from data preprocessing to the model outputs
  • 38. AI Model Output: Accuracy • Training and validation data fed into the AI model will produce accuracy results of the training and validation data • Accuracy for training and validation of the AI model
  • 39. AI Model Output: Loss • Binary cross-entropy • Loss for training and validation of the AI model • The graph shows the model loss for the training and validation data • If the validation loss continually increases after a specific epoch, this could be due to overfitting
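For reference, binary cross-entropy averages -[y·log(p) + (1 - y)·log(1 - p)] over the records, where y is the true label and p the predicted probability of class 1. A minimal sketch follows; the function name and the clipping constant `eps` are illustrative choices, not values from the talk.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch of predictions."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() stays finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

print(binary_cross_entropy([1, 0], [0.5, 0.5]))  # uninformative guess: ln(2)
print(binary_cross_entropy([1, 0], [0.9, 0.1]))  # confident and correct: lower
print(binary_cross_entropy([1, 0], [0.1, 0.9]))  # confident and wrong: higher
```

A model that always predicts 0.5 scores ln(2) ≈ 0.693, which is a useful baseline when reading a loss curve.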
  • 40. AI Model Output: Precision • What proportion of positive identifications is actually correct? • Precision = TP / (TP + FP) • 0.78125 = 25 / (25+7)
  • 41. AI Model Output: Recall • Model recall (“sensitivity”) • What proportion of actual positives are identified correctly? • Recall = TP / (TP + FN) • 0.6098 = 25 / (25+16)
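The precision and recall figures above can be reproduced from the counts on the slides. The sketch below goes one step further under a stated assumption: taking the 83 test records mentioned in the project outcomes, the true-negative count is inferred as 83 - 25 - 7 - 16 = 35 (it is not given in the talk). With that assumption, the accuracy and a support-weighted F1 score can be computed from the same confusion matrix.

```python
tp, fp, fn = 25, 7, 16       # counts shown on the precision and recall slides
n_test = 83                  # test records, per the project outcomes
tn = n_test - tp - fp - fn   # inferred (assumption), not stated in the talk

precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / n_test

# Per-class F1, then an average weighted by each class's number of true records.
f1_pos = 2 * tp / (2 * tp + fp + fn)
f1_neg = 2 * tn / (2 * tn + fn + fp)
weighted_f1 = ((tp + fn) * f1_pos + (tn + fp) * f1_neg) / n_test

print(round(precision, 4), round(recall, 4),
      round(accuracy, 4), round(weighted_f1, 4))
```

Under this assumption the accuracy comes out at about 0.7229 and the weighted F1 at about 0.7192.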
  • 42. AI Model Output: AUC - ROC • An AUC closer to one indicates better separability between the two classes • The ROC curve shows the trade-off between sensitivity (TPR) and specificity (1 – FPR)
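One way to read "separability" is that the AUC equals the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative one. The sketch below computes the AUC directly from that definition; the function name and the four example scores are illustrative only.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, scores))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.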
  • 43. AI Model Output: Iterative Testing Metrics • Building AI models is an iterative process. • To validate the model, data are repeated at random and evaluated by the model. • The results of this model still require further QA and validation testing. • Test run output (8 runs):
      Test Run   Accuracy         Loss             Precision       Recall           AUC              F-Score
      1          0.7229           0.6151           0.7812          0.6098           0.7113           0.7192
      2          0.7229           0.6069           0.7813          0.6098           0.734            0.7192
      3          0.6987           0.6052           0.7222          0.6341           0.7288           0.6975
      4          0.7469           0.6152           0.7632          0.7073           0.7465           0.7465
      5          0.7469           0.5963           0.7941          0.6586           0.7387           0.7445
      6          0.7108           0.6563           0.7179          0.6829           0.7305           0.7106
      7          0.7349           0.632            0.7209          0.7561           0.7445           0.7345
      8          0.7229           0.6169           0.7813          0.6098           0.752            0.7192
      Summary    0.72766 ± 0.02   0.60774 ± 0.01   0.7684 ± 0.03   0.64392 ± 0.04   0.73186 ± 0.01   0.72538 ± 0.02
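The summary values can be checked with a few lines of Python. Recomputing from the per-run numbers suggests that the reported means (for example 0.72766 for accuracy and 0.73186 for AUC) coincide with the mean over runs 1 to 5 rather than over all eight runs; this is an observation from the arithmetic, not something stated in the talk. The use of the sample standard deviation for the "±" value is likewise an assumption.

```python
from statistics import mean, stdev

# Per-run values copied from the test run output.
accuracy = [0.7229, 0.7229, 0.6987, 0.7469, 0.7469, 0.7108, 0.7349, 0.7229]
auc = [0.7113, 0.734, 0.7288, 0.7465, 0.7387, 0.7305, 0.7445, 0.752]

for name, runs in (("accuracy", accuracy), ("AUC", auc)):
    print(name,
          "| runs 1-5:", round(mean(runs[:5]), 5),
          "+/-", round(stdev(runs[:5]), 2),
          "| all 8 runs:", round(mean(runs), 5))
```

Either way, the mean accuracy rounds to roughly 0.72 to 0.73 across the runs shown.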
  • 44. AI Model Output: Important Temporal Features • Top temporal features, ranked by mean absolute SHAP value. • The color bar corresponds to raw values: a high raw value appears red, a low raw value appears blue. • Each variable appears as its own point. • The distribution shows how a variable may influence the model. • Features extending to the right may influence a (1) prediction. • Features extending to the left may influence a (0) prediction.
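The ranking described on this slide, mean absolute SHAP value per feature, can be sketched as follows. The SHAP values and feature names below are made-up placeholders for illustration; they are not TEDDY data, and the function name `rank_by_mean_abs_shap` is an assumption. The sketch starts from an already computed matrix of SHAP values (one row per record, one column per feature) and does not call any SHAP library.

```python
def rank_by_mean_abs_shap(shap_values, feature_names):
    """Rank features by the mean of |SHAP value| across records, largest first."""
    if not shap_values:
        raise ValueError("shap_values must contain at least one record")
    n_records = len(shap_values)
    scores = []
    for j, name in enumerate(feature_names):
        mean_abs = sum(abs(row[j]) for row in shap_values) / n_records
        scores.append((name, mean_abs))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical SHAP matrix: 3 records x 3 features (placeholder values).
toy_shap = [
    [0.2, -0.5, 0.0],
    [-0.4, 0.1, 0.1],
    [0.3, -0.6, -0.1],
]
toy_names = ["feature_a", "feature_b", "feature_c"]  # hypothetical names

for name, score in rank_by_mean_abs_shap(toy_shap, toy_names):
    print(f"{name}: {score:.3f}")
```

Taking the absolute value before averaging matters: a feature that pushes some predictions strongly toward (1) and others strongly toward (0) would otherwise average out near zero and look unimportant.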
  • 45. AI Model Output: Important Static Features • Top static features • The distribution shows how a variable may influence the model. • Features extending to the right may influence a (1) prediction. • Features extending to the left may influence a (0) prediction. • Blue = low raw value • Red = high raw value
  • 46. Project Limitations • This project analyzes high-dimensional clinical research data, and not all data were available at the time of this research. • The number of T1D cases available from the TEDDY study has nearly doubled, and TEDDY is developing a second case-control study; the additional data could improve the statistical power and accuracy of this project. • T1D is an endpoint in the TEDDY study, so participants who are diagnosed with T1D are no longer followed after their study endpoint visit. This limits the amount of data available. • AI model output and important features must be validated and reviewed by TEDDY researchers. • Results may be difficult to generalize to the general T1D population. • Some data required censoring, imputation, and exclusion. • Right censoring is a concern, as ordinary linear regression is often ineffective at handling censored observations.
  • 47. Current Project Outcomes • We successfully developed an AI framework that incorporates -omics and environmental data to predict outcomes. • Developed packages to prepare data for the AI model and transform 2D data into 3D data that can be fed into the AI model. • The AI model is capable of combining temporal (time-series) and static data. • It combines (concatenates) two neural networks, an LSTM and an MLP. • Currently predicting IA± on 83 test records with a mean accuracy of 0.72. • A combination of -omics data and static data from the TEDDY study can be run through the AI model, but the model still needs to be validated internally by the project team and TEDDY researchers. The project is still in development and being tested.
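The 2D-to-3D transformation mentioned above can be illustrated with a small sketch. Recurrent layers such as LSTMs typically expect input shaped (subjects, timesteps, features), whereas longitudinal study data usually arrive as a flat table with one row per subject visit. The project's own packages are not shown in the deck, so everything here is an assumption: the function name `long_to_3d`, the row layout, and the choice to fill missing visits with a pad value.

```python
def long_to_3d(rows, n_timesteps, n_features, pad_value=0.0):
    """Convert (subject_id, time_index, features) rows into a 3D nested list.

    Returns (subject_ids, tensor), where tensor[i][t][f] is feature f of
    subject i at time step t. Missing visits keep pad_value (an assumption).
    """
    subject_ids = sorted({sid for sid, _, _ in rows})
    position = {sid: i for i, sid in enumerate(subject_ids)}
    tensor = [
        [[pad_value] * n_features for _ in range(n_timesteps)]
        for _ in subject_ids
    ]
    for sid, t, feats in rows:
        if not 0 <= t < n_timesteps:
            raise ValueError(f"time index {t} out of range for subject {sid}")
        if len(feats) != n_features:
            raise ValueError(f"expected {n_features} features, got {len(feats)}")
        tensor[position[sid]][t] = list(feats)
    return subject_ids, tensor


# Hypothetical 2D input: one row per subject visit, two features per visit.
rows = [
    ("s1", 0, [1.0, 2.0]),
    ("s1", 1, [3.0, 4.0]),
    ("s2", 0, [5.0, 6.0]),  # s2 has no second visit, so it is padded
]
ids, x = long_to_3d(rows, n_timesteps=2, n_features=2)
print(len(x), len(x[0]), len(x[0][0]))  # 2 2 2
```

In practice the nested list would be converted to an array before being passed to the temporal branch of the model, while the static features stay 2D (subjects, features) for the MLP branch. Padding with a constant is only one option; masking or imputation may be more appropriate depending on the data.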
  • 48. Next Steps • Enhance current AI models. • Tune hyperparameters, perform quality assurance (QA), and validate and test models. • The current model is being developed to predict IA±, but additional data are needed to feed the model and predict T1D as an outcome. • Acquire and prepare data for additional neural network (NN) models. • Program, validate, and test additional NN models for additional data. • Perform quality control of the AI framework and model, including internal and construct validity. • Generalize the AI framework to incorporate additional methods and models for predictions of various diseases. • Collaborate with TEDDY researchers to validate AI model outcomes and important features.
  • 49. Acknowledgements • Health Informatics Institute at the University of South Florida (USF) • Dr. Jeffrey Krischer • Dr. Kristian Lynch • Leo Moreno • Chris Shaffer • TEDDY Publications Committee • TEDDY Study Group: • NIH, TEDDY DCC, clinical center investigators, clinical center staff, and TEDDY study participants. • dkNET / NIDDK