Protein family specific models using deep neural networks and transfer learning improve virtual screening and highlight the need for more data


1) The document presents research on using deep neural networks and transfer learning to improve virtual screening for drug discovery. 2) The researchers trained protein family-specific models based on the DenseNet architecture on training sets of different sizes, and built the family-specific models with transfer learning and fine-tuning. 3) The family-specific models outperformed the baseline models on standard evaluation metrics, which highlights both the value of target-specific models and the need for more data to train them.


Protein Family-Specific Models Using Deep Neural Networks and Transfer Learning Improve Virtual Screening and Highlight the Need for More Data
Presenter: Aydin Ayanzadeh
Authors: Fergus Imrie, Anthony R. Bradley, Mihaela van der Schaar, and Charlotte M. Deane*

Agenda
● Introduction
● Training Set Size
● DenseNet
● Transfer Learning
● Fine Tuning
● Evaluation Metrics
● Results
○ Quantitative Results
○ Qualitative Results
● Visualization

Introduction
● Machine learning
● Computer-aided drug discovery
● CNNs
● DenseNet
● Transfer Learning

Introduction
● Data sets: DUD-E and ChEMBL.
● Virtual screening is a computational technique used in drug discovery to search libraries of small molecules and identify the structures most likely to bind to a drug target.
● A major challenge in virtual screening is the heterogeneity of binding between different targets, which arises from the structural diversity of proteins.

DenseNet
Figure 2. Schematic of the DenseNet architecture used in our model.
Model Description
● Dense connections
○ Strengthen feature propagation
○ Encourage feature reuse
○ Reduce the number of parameters
○ Maintain low-complexity features
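
The feature reuse mentioned above comes from each layer in a dense block receiving the concatenation of the block input and the outputs of all earlier layers. The following is a minimal sketch of the resulting channel arithmetic only. The function name and the numbers (32 input channels, growth rate 16, 4 layers) are illustrative assumptions and are not taken from the slides or the paper.

```python
def dense_block_channels(in_channels, growth_rate, num_layers):
    """Channel bookkeeping for one dense block.

    Each layer sees the block input concatenated with the outputs of all
    earlier layers, and contributes `growth_rate` new channels.
    Returns (input channels seen by each layer, output channels of the block).
    """
    seen_by_layer = []
    channels = in_channels
    for _ in range(num_layers):
        seen_by_layer.append(channels)  # concatenation of everything so far
        channels += growth_rate         # this layer adds its own feature maps
    return seen_by_layer, channels


# Illustrative values only (assumed, not from the source).
seen, out_channels = dense_block_channels(in_channels=32, growth_rate=16, num_layers=4)
print(seen, out_channels)  # [32, 48, 64, 80] 96
```

Because every layer adds only a small, fixed number of channels while still seeing all earlier features, the layers can stay narrow, which is how dense connections keep the parameter count down.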

Transfer Learning
● Transfer learning is the reuse of a pre-trained model on a new problem.
● Very popular in deep learning
● ImageNet
● Ensemble learning

Fine Tuning
● Fine-tuning classifier
● Fine-tuning all layers
Figure 3. (a-d) Illustration of the different training regimes adopted to construct family-specific models. White corresponds to layers of the model that have been trained on all training data; blue, to layers that have been trained first on all training data and then fine-tuned on data from a specific protein family; and orange, to layers that have been trained only on data from a specific protein family.
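
The colour key in Figure 3 can be read as a per-layer status for each training regime. The sketch below is one hedged reading of that key, not the authors' code. The layer names and regime names are hypothetical, and the slides do not say which regime corresponds to which panel (a-d), so no panel letters are assigned.

```python
# Hypothetical layer names; the slides do not list the model's layers.
FEATURE_LAYERS = ["dense_block_1", "dense_block_2", "dense_block_3"]
CLASSIFIER_LAYERS = ["classifier"]

# Status labels follow the colour key of Figure 3.
ALL_DATA = "white: trained on all training data"
FINE_TUNED = "blue: trained on all data, then fine-tuned on one family"
FAMILY_ONLY = "orange: trained only on one family"


def layer_status(regime):
    """Map each layer to its Figure 3 colour under an assumed regime name."""
    layers = FEATURE_LAYERS + CLASSIFIER_LAYERS
    if regime == "general":
        return {name: ALL_DATA for name in layers}
    if regime == "fine_tune_classifier":
        status = {name: ALL_DATA for name in FEATURE_LAYERS}
        status.update({name: FINE_TUNED for name in CLASSIFIER_LAYERS})
        return status
    if regime == "fine_tune_all":
        return {name: FINE_TUNED for name in layers}
    if regime == "family_only":
        return {name: FAMILY_ONLY for name in layers}
    raise ValueError(f"unknown regime: {regime}")


def layers_updated_on_family_data(regime):
    """Layers whose weights change when training on family-specific data."""
    return [name for name, s in layer_status(regime).items() if s != ALL_DATA]


print(layers_updated_on_family_data("fine_tune_classifier"))  # ['classifier']
```

In a deep learning framework, the layers not returned by `layers_updated_on_family_data` would be the ones to freeze during the family-specific training step.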

Evaluation Metrics
● False positive: predicting an event when there was no event.
● False negative: predicting no event when there was in fact an event.
● ROC curves summarize the trade-off between the true positive rate and the false positive rate of a predictive model across different probability thresholds.
● Precision is the fraction of predicted positives that are actually positive.
● Recall is the fraction of all actual positives that the model correctly identifies.
● Precision-recall curves summarize the trade-off between the true positive rate (recall) and the positive predictive value (precision) of a predictive model across different probability thresholds.
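
The definitions above can be made concrete with a small worked example. This is a minimal sketch in plain Python. The labels and scores are made-up toy values (1 = active, 0 = decoy), and the area under the ROC curve is computed with the pairwise ranking formulation rather than with any particular library.

```python
def precision_recall_fpr(labels, scores, threshold):
    """Precision, recall (true positive rate) and false positive rate at one threshold."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y:
            tp += 1
        elif predicted_positive:
            fp += 1  # false positive: predicted an event, there was none
        elif y:
            fn += 1  # false negative: missed a real event
        else:
            tn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, recall, fpr


def roc_auc(labels, scores):
    """AUC ROC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only).
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]

precision, recall, fpr = precision_recall_fpr(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
print(round(precision, 3), round(recall, 3), round(fpr, 3), round(auc, 3))
# 0.667 0.667 0.2 0.933
```

Sweeping `threshold` over the score range and plotting recall against the false positive rate traces the ROC curve; plotting precision against recall traces the precision-recall curve.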

Quantitative Results
Table 2. Mean AUC ROC, AUC PRC, and ROC Enrichment Across Targets in the DUD-E Data Set for Our Method, DenseFS, Compared to Baseline CNN and the AutoDock Vina Scoring Function
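
The slides do not define ROC enrichment. The sketch below assumes the common definition, the ratio of the true positive rate to the false positive rate at a chosen false positive rate. Tie handling and interpolation conventions vary between implementations, so this is one possible convention and not necessarily the one behind the table. The data are toy values.

```python
def roc_enrichment(labels, scores, target_fpr):
    """TPR / FPR at the point of the ranking where FPR reaches `target_fpr`.

    Assumed convention: admit floor(target_fpr * n_decoys) decoys and count
    every active ranked above the next decoy.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    allowed_fp = int(target_fpr * n_neg + 1e-9)  # small epsilon guards float rounding
    if n_pos == 0 or allowed_fp < 1:
        raise ValueError("need at least one active and one admitted decoy")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    for _, y in ranked:
        if y:
            tp += 1
        else:
            if fp == allowed_fp:
                break  # the next decoy would exceed the target FPR
            fp += 1
    return (tp / n_pos) / (fp / n_neg)


# Toy ranking (assumed): 2 actives and 10 decoys.
toy_labels = [1, 0, 1] + [0] * 9
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.15, 0.10, 0.05]

re_10 = roc_enrichment(toy_labels, toy_scores, target_fpr=0.1)
print(re_10)  # 10.0
```

A value of 1 corresponds to random ranking under this definition, and the maximum possible value at a given false positive rate is its reciprocal, so the toy ranking reaches the ceiling of 10 at a rate of 0.1.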

Quantitative Results
Table 5. Mean AUC ROC, AUC PRC, and ROC Enrichment Across Targets in the MUV Test Set for Our Method, DenseFS, Compared to Baseline CNN and the AutoDock Vina Scoring Function

Visualization
Figure. Visualization of the known active CHEMBL293409 ligand (a) docked against the DUD-E target ANDR. (b and c) Results of the visualization procedure for Baseline CNN and DenseFS, respectively. Areas of green indicate a score for that region above 0.5, whereas red represents a score below 0.5, with the intensity depending on the magnitude of the difference. The Baseline CNN assigned the complex an overall score of 0.34, while DenseFS scored the complex at 0.91.
● Protein families:
○ kinase (26 targets)
○ protease (15)
○ nuclear receptor (11)
○ GPCR (5)
○ other (45)

Qualitative Results
● Kinases, proteases, and nuclear receptors
