The Molecular
Autoencoder
Dan Elton
1/24/2018 Dan Elton, P.W. Chung Group Meeting 2
What is Machine Learning?
"Machine Learning is a field of study that gives computers the ability to learn without
being explicitly programmed" - Arthur Samuel, 1959
"A computer program is said to learn from experience E with respect to some class of
tasks T and performance measure P, if its performance at tasks in T, as measured by
P, improves with the experience E." - Tom M. Mitchell.
Supervised learning
• Regression
• Classification
Model Y = f(x) to match data (x, y)
• Parametric models
  • Linear models
  • Polynomial model
  • Logistic model
  • Neural network model
  • Convolutional neural network
• Nonparametric models
  • Kernel ridge regression
  • Decision tree
  • Gaussian process regression
  • Kernel SVM
Unsupervised learning
• Clustering
• Dimensionality reduction
• Autoencoders
Reinforcement learning
• Robotics, etc.
Supervised learning workflow
Source: scikit-learn.org
What is a neural network?
Dendrites
(input wires)
Terminal axons
(output wires)
What is a neural network?
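Input layer → hidden layer → output layer
$a^{(i)} = f\left(W^{(i)} a^{(i-1)}\right)$, where $a^{(i)}$ are the activations of layer i, $a^{(i-1)}$ is the input or the activations from layer i-1, $W^{(i)}$ are the weights, and $f$ is the activation function.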
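The layer rule on this slide (activations of layer i computed from the input or the activations of layer i-1, the weights, and an activation function) can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's code: the layer sizes, the choice of tanh, the function name `dense_forward`, and the omission of a bias term are all assumptions made for the example.

```python
import numpy as np

def dense_forward(a_prev, W, f=np.tanh):
    """Activations of layer i from those of layer i-1: a_i = f(W a_{i-1}).

    The bias term is left out here because the slide does not mention one.
    """
    return f(W @ a_prev)

rng = np.random.default_rng(0)
x = rng.normal(size=3)          # input layer: 3 features (illustrative size)
W1 = rng.normal(size=(4, 3))    # weights from input layer to hidden layer
W2 = rng.normal(size=(2, 4))    # weights from hidden layer to output layer

hidden = dense_forward(x, W1)        # hidden-layer activations, shape (4,)
output = dense_forward(hidden, W2)   # output-layer activations, shape (2,)
```

Stacking more calls to `dense_forward` gives a deeper network; training would additionally need a loss and gradients, which are not shown here.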
Activation functions
• Binary step: closest to biological neurons, but no gradient info =(
• Logistic/Sigmoid
• arctan()
• Rectified Linear Unit (ReLU): maintains a nice large gradient
• Exponential Linear Unit (ELU)
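For reference, the activation functions named on this slide can be written out directly. The following is a small NumPy sketch; the function names and the ELU default `alpha=1.0` are choices made for this example rather than anything stated on the slide.

```python
import numpy as np

def binary_step(x):
    # 1 for x >= 0, else 0; the derivative is zero almost everywhere
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):
    # logistic function, squashes inputs into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def arctan(x):
    # squashes inputs into (-pi/2, pi/2)
    return np.arctan(x)

def relu(x):
    # gradient is exactly 1 for every positive input
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    # identity for x > 0, smooth exponential saturation for x <= 0
    return np.where(x > 0, x, alpha * np.expm1(x))

x = np.array([-2.0, 0.0, 2.0])
values = {f.__name__: f(x) for f in (binary_step, sigmoid, arctan, relu, elu)}
```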
What is convolution?
Input
Output
1-dimensional convolution with a filter, also known as a “kernel”
Convolution with stride = 2
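A 1-dimensional convolution with a stride can be sketched as a sliding dot product. The signal, the kernel values, and the function name `conv1d` below are made up for illustration; as is usual in deep learning, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def conv1d(signal, kernel, stride=1):
    """Slide the kernel along the signal, moving `stride` positions per step."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k = len(kernel)
    starts = range(0, len(signal) - k + 1, stride)
    return np.array([np.dot(signal[s:s + k], kernel) for s in starts])

signal = [1, 2, 3, 4, 5, 6, 7]   # illustrative input
kernel = [1, 0, -1]              # illustrative filter

out_stride1 = conv1d(signal, kernel)            # 5 output values
out_stride2 = conv1d(signal, kernel, stride=2)  # 3 output values
```

With stride 2 the filter skips every other position, so the output is roughly half as long.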
What is convolution?
Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
“Feature map”
2-dimensional convolution with the 3x3 filter shown below.
Note that the edges are lost. There are ways to prevent this, such as padding the edges with zeros.
1 0 1
0 1 0
1 0 1
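The 3x3 filter of ones and zeros shown on the slide can be applied to a small image to see both the feature map and the loss of edges. The 5x5 binary image below is an assumption chosen for illustration (the slide's image did not survive extraction), and `conv2d_valid` is a hypothetical helper name.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """2D sliding-window product without padding ("valid" mode)."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

# Illustrative 5x5 input image (an assumption, not taken from the slide).
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])

feature_map = conv2d_valid(image, kernel)             # 3x3: the edges are lost
padded_map = conv2d_valid(np.pad(image, 1), kernel)   # zero padding keeps 5x5
```

Padding by one pixel of zeros on every side restores the original 5x5 size, and the interior of `padded_map` matches `feature_map`.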
What are convolutional neural nets?
By most accounts, the CNN was invented by Yann LeCun. He developed “LeNet” in 1998 at AT&T Bell Laboratories for reading digits.
Architecture of LeNet:
What are convolutional neural nets?
“2D” images are actually 3D, because they have 3 color channels.
A 3D diagram conveys best what a CNN actually does. The depth of
the non-input layers is the # of filters. Typically the # of filters in each
successive layer increases while the size of the filters decreases:
What are convolutional neural nets?
By many accounts, the current deep learning boom began when Krizhevsky, Sutskever, and Hinton used a CNN to win the 2012 ImageNet image classification competition. The resulting publication has 13,000+ citations.
A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, 1097-1105 (2012).
The architecture they used has 60 million parameters and 650,000 neurons.
Why do CNNs work so well?
They learn a hierarchical set of features the same way the mammalian visual cortex does!
Hubel & Wiesel, 1959
Receptive fields of single
neurons in the cat’s
striate cortex
Slide from Yann LeCun
What is an autoencoder?
• The “latent space” is also called the “low dimensional manifold”, “compressed
representation”, or “thought vector”
• See “Decoding the Thought Vector” for amazing examples of how faces are
compressed: http://gabgoh.github.io/ThoughtVectors/
Source: Keras blog
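To make the idea of a compressed latent representation concrete, here is a deliberately tiny autoencoder in NumPy. It is only a sketch under several assumptions: the encoder and decoder are single linear maps (real autoencoders use nonlinear layers), the data are synthetic, and the sizes, learning rate, and step count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples in 8 dimensions that lie on a 2D subspace.
Z_true = rng.normal(size=(200, 2))
X = Z_true @ rng.normal(size=(2, 8))

latent_dim = 2
W_enc = 0.1 * rng.normal(size=(8, latent_dim))   # encoder: 8 -> 2
W_dec = 0.1 * rng.normal(size=(latent_dim, 8))   # decoder: 2 -> 8

def reconstruction_loss(X, W_enc, W_dec):
    # mean squared error between the input and its reconstruction
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

initial_loss = reconstruction_loss(X, W_enc, W_dec)

lr = 0.01
for _ in range(2000):
    Z = X @ W_enc                 # latent codes (the compressed representation)
    R = Z @ W_dec - X             # reconstruction residual
    grad_dec = 2.0 * Z.T @ R / R.size
    grad_enc = 2.0 * X.T @ (R @ W_dec.T) / R.size
    W_dec -= lr * grad_dec        # plain gradient descent on the MSE
    W_enc -= lr * grad_enc

final_loss = reconstruction_loss(X, W_enc, W_dec)
```

Because the toy data really are two-dimensional, a two-unit latent layer is enough to reconstruct them well, and the loss drops as training proceeds.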
What is a variational autoencoder?
• During training, the encoder output is sampled from the enforced distribution as mean + random_noise * standard_deviation; during testing, the output is the mean.
• Minimize the Kullback–Leibler divergence between the encoded distribution and the prior.
D.P. Kingma, M. Welling
Auto-Encoding Variational Bayes
The International Conference on Learning Representations (ICLR), Banff, 2014
[arXiv preprint].
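The two ingredients on this slide, sampling the latent vector during training and penalizing the Kullback–Leibler divergence, can be sketched as follows. This assumes a diagonal Gaussian encoder and a standard normal prior, and it parameterizes the spread by the log-variance, which is a common convention but not something the slide specifies. Note that the noise is scaled by the standard deviation, that is, the square root of the variance. The function names are hypothetical.

```python
import numpy as np

def sample_latent(mean, log_var, rng, training=True):
    """Training: mean + noise * std. Testing: just the mean."""
    if not training:
        return mean
    std = np.exp(0.5 * log_var)                  # std = sqrt(variance)
    noise = rng.standard_normal(mean.shape)
    return mean + noise * std

def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ), summed over latent units."""
    return -0.5 * np.sum(1.0 + log_var - mean ** 2 - np.exp(log_var))

rng = np.random.default_rng(0)
mean = np.array([0.5, -1.0])
log_var = np.array([0.0, -2.0])

z_train = sample_latent(mean, log_var, rng, training=True)
z_test = sample_latent(mean, log_var, rng, training=False)
kl = kl_to_standard_normal(mean, log_var)
```

In a full variational autoencoder this KL term is added to the reconstruction loss, so the encoder is pulled toward the prior while still having to reconstruct its input.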
What are recurrent neural networks?
Recurrent neural networks (RNNs) have loops.
The simplest RNN, shown on the left, contains one feedback loop.
The mathematics and calculation of gradients (i.e., backpropagation) can be made isomorphic to that of a feed-forward neural network via time unrolling.
Output we are
interested in
inputs
All of these beautiful figures are taken from http://colah.github.io/posts/2015-08-Understanding-LSTMs/ Copyright by Christopher Olah.
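Time unrolling can be illustrated by writing the feedback loop as an ordinary for-loop: the same weights are reused at every time step, so the unrolled computation looks like a deep feed-forward network with shared weights. The sketch below assumes the common update h_t = tanh(W_x x_t + W_h h_{t-1} + b); the sizes and the name `rnn_forward` are made up for the example.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b, h0=None):
    """Run a simple RNN over a sequence, returning the hidden state at each step."""
    h = np.zeros(W_h.shape[0]) if h0 is None else h0
    states = []
    for x_t in inputs:                            # one "layer" per time step
        h = np.tanh(W_x @ x_t + W_h @ h + b)      # same weights at every step
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
inputs = rng.normal(size=(5, 3))    # sequence of 5 time steps, 3 features each
W_x = rng.normal(size=(4, 3))       # input-to-hidden weights
W_h = rng.normal(size=(4, 4))       # hidden-to-hidden (feedback) weights
b = np.zeros(4)

states = rnn_forward(inputs, W_x, W_h, b)   # shape (5, 4)
last_state = states[-1]                     # often the output of interest
```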
What are recurrent neural networks?
RNNs can be run in many different ways:
• Video classification: input all frames of a video, output a classification for each frame.
• Translation (“seq2seq”): input Spanish, output English.
• Sentiment analysis: input text, output positive or negative sentiment.
• Image captioning: input an image, output a sequence of words.
What is a gated recurrent unit?
RNNs have trouble capturing long-range dependencies.
Suppose we need the output at time t+1 to depend on x0 and x1, which happened in the distant past of the input stream.
Technically this is called the vanishing gradient problem: the dependence (gradient) becomes exponentially small with the number of layers (time steps) it has to pass through. There is also an exploding gradient problem, where the gradient increases exponentially.
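The exponential behaviour can be seen in the simplest possible case. The sketch below assumes a scalar linear recurrence h_t = w * h_{t-1}, which is a simplification chosen for illustration; by the chain rule, the gradient of h_T with respect to h_0 picks up one factor of w per time step.

```python
def gradient_through_time(w, steps):
    """d h_T / d h_0 for the scalar recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= w          # chain rule: one factor of w per time step
    return grad

vanishing = gradient_through_time(0.5, 50)   # |w| < 1: shrinks toward zero
exploding = gradient_through_time(1.5, 50)   # |w| > 1: grows without bound
```

After 50 steps the first gradient is far too small to carry a useful learning signal, while the second is too large for stable updates.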
What is an LSTM?
Sepp Hochreiter & Jürgen Schmidhuber (right) invented the Long Short-Term Memory (LSTM) unit in 1997 to solve the vanishing gradient problem. LSTMs were recently used by Google for machine translation with accuracy approaching human level. Apple uses LSTMs in Siri, and so on.
The LSTM looks complicated, but it is actually based on an extremely simple idea: add a memory cell.
Output state
How does an LSTM work?
“forget” gate, “input” gate, “read out” gate
tanh(), sigmoid/logistic
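One step of an LSTM with the three gates named on this slide can be sketched as below. This follows the commonly used formulation (sigmoid gates, a tanh candidate, and an additive memory-cell update); stacking all four weight blocks into a single matrix `W`, the gate ordering inside it, and the sizes are implementation choices made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W has shape (4*d, n+d), b has shape (4*d,)."""
    d = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0:d])            # "forget" gate: how much old memory to keep
    i = sigmoid(z[d:2 * d])        # "input" gate: how much new content to write
    o = sigmoid(z[2 * d:3 * d])    # "read out" gate: how much memory to expose
    g = np.tanh(z[3 * d:4 * d])    # candidate content for the memory cell
    c = f * c_prev + i * g         # memory cell update (additive path)
    h = o * np.tanh(c)             # output state
    return h, c

n, d = 3, 2                        # input and output dimensionality
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * d, n + d))
b = np.zeros(4 * d)

h, c = lstm_step(rng.normal(size=n), np.zeros(d), np.zeros(d), W, b)
```

The additive update of `c` is what lets information, and gradients, pass across many time steps when the forget gate stays close to 1.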
LSTM vs. Gated Recurrent Unit (GRU)
The GRU [1] makes major changes to the LSTM:
• Output and memory cells are merged
• “forget” and “input” gates are merged into a single “update” gate
• Performance is similar to the LSTM [3] or slightly better [2,4], but with fewer free parameters (6 vs. 12 for a 1D input/output)
1. Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation" (2014), arXiv:1406.1078
2. Jozefowicz et al., "An Empirical Exploration of Recurrent Network Architectures," Proceedings of the 32nd International Conference on Machine Learning, 2015
3. Klaus Greff, Rupesh Kumar Srivastava, Jan Koutník, Bas R. Steunebrink, and Jürgen Schmidhuber, "LSTM: A Search Space Odyssey" (2015), arXiv:1503.04069
4. Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio, "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling" (2014), arXiv:1412.3555
If the dimensionality of the input is n and the dimensionality of the output is d, then the number of parameters is:
LSTM: 4*d*(n+d+1)
GRU: 3*d*(n+d)
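The two formulas above are easy to evaluate for any layer size. The helper names below are hypothetical, and the functions simply encode the expressions as written on the slide; note that, as written, the LSTM expression includes a "+1" bias term per unit while the GRU expression does not.

```python
def lstm_param_count(n, d):
    """Parameters of an LSTM layer, per the slide: 4*d*(n+d+1)."""
    return 4 * d * (n + d + 1)

def gru_param_count(n, d):
    """Parameters of a GRU layer, per the slide: 3*d*(n+d)."""
    return 3 * d * (n + d)

# The 1D input/output case quoted on the slide: 12 vs. 6
lstm_1d = lstm_param_count(1, 1)
gru_1d = gru_param_count(1, 1)
```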
What are SMILES strings?
SMILES (simplified molecular-input line-entry system) encodes 2D molecular graphs as 1D strings.
Example
CC(=O)NCCC1=CNc2c1cc(OC)cc2 CN1CCC[C@H]1C2=CN=CC=C2
The only ambiguity in SMILES strings:
• They do not capture 3D structure. However, for small molecules and most application areas this doesn’t matter much, since such molecules generally have only one conformation, so it is implicitly contained. It would only matter for something like proteins, which might fold into more than one conformation, or if the molecules are interacting with something like an interface.
FC(F)FCCC(=O)O
One-hot encoding
There are 35 characters (C, N, O, @, -, =, etc.).
The maximum molecule length is 120 characters; molecules shorter than this are padded with 0s.
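The one-hot encoding described here can be sketched as follows. The slide does not list the full 35-character alphabet, so the `CHARSET` below is a small hypothetical subset that happens to cover the nicotine example; tokenizing one character at a time and padding with all-zero rows are likewise assumptions made for this illustration.

```python
import numpy as np

# Hypothetical alphabet for illustration; the real one has 35 characters.
CHARSET = ["C", "N", "O", "c", "1", "2", "(", ")", "=", "[", "]", "@", "H"]
MAX_LEN = 120   # maximum molecule length stated on the slide

def one_hot_smiles(smiles, charset=CHARSET, max_len=MAX_LEN):
    """Encode a SMILES string as a (max_len, len(charset)) array of 0s and 1s."""
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    index = {ch: k for k, ch in enumerate(charset)}
    encoded = np.zeros((max_len, len(charset)), dtype=np.float32)
    for pos, ch in enumerate(smiles):
        encoded[pos, index[ch]] = 1.0   # raises KeyError for unknown characters
    return encoded                       # rows past the string stay all zero

nicotine = "CN1CCC[C@H]1C2=CN=CC=C2"
x = one_hot_smiles(nicotine)
```

Each row of `x` corresponds to one position in the string and each column to one character, so the array plays the role of a 1D "image" that convolution layers can scan.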
Three 1-dimensional convolution layers
Gated recurrent unit (GRU) layers with 501-element memory cells
“Time-distributed dense layer” (the same dense layer applied independently to each timestep)
“Flattening”: reshapes a 2D array into a 1D array
Two dense (fully connected) neural network layers, with 435 and 292 neurons, respectively
Latent layer: mean and standard deviation units
Custom layer to sample the Gaussian distributions during training
Overall autoencoder architecture
Architecture
Dense (fully connected) neural network layer, 292 neurons
one-hot inputs
9 convolution filters of length 9
9 convolution filters of length 9
11 convolution filters of length 10
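To see how the three convolution layers shrink the sequence, the output length of each layer can be computed from the filter lengths listed above. This sketch assumes an input length of 120 (the maximum molecule length), stride 1, and no padding; none of those settings are stated on this slide, and the helper name is hypothetical.

```python
def conv1d_output_length(input_length, kernel_length, stride=1):
    """Output length of a 1D convolution without padding."""
    return (input_length - kernel_length) // stride + 1

# (number of filters, filter length) for the three layers on the slide
layers = [(9, 9), (9, 9), (11, 10)]

length = 120            # assumed input length
shapes = []
for n_filters, kernel_length in layers:
    length = conv1d_output_length(length, kernel_length)
    shapes.append((length, n_filters))   # (sequence length, channels)

# Size of the 1D array produced by flattening the last layer's output
flattened_size = shapes[-1][0] * shapes[-1][1]
```

Under these assumptions the flattened vector is what feeds the dense layers that lead to the latent layer.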
How does one determine architecture?
The JSON file for the molecular autoencoder reveals more than 200 hyperparameters.
The most important are:
• Number of layers
• Types of layers
• Size and # of filters in CNN layers
• # of hidden cells in GRU layers (also called # of units)
• Number of latent variables
There are various ways of regularizing that can be turned on in several or all
layers:
• L1/ L2 weight regularization
• Weight sharing
• Dropout (currently most popular)
How does one determine architecture?
1. This Week in Machine Learning (TWiML) Podcast, interview with Matthew Zeiler and others.
2. J. Snoek, H. Larochelle, and R. P. Adams, "Practical Bayesian Optimization of Machine Learning Algorithms," Advances in Neural Information Processing Systems, 2951-2959 (2012)
3. Google Research Blog: Using Machine Learning to Explore Neural Network Architecture
4. Sean C. Smithson, Guang Yang, Warren J. Gross, and Brett H. Meyer, "Neural Networks Designing Neural Networks: Multi-Objective Hyper-Parameter Optimization," arXiv:1611.02120
• Historically, design for deep networks has been a black art. This is part of the reason deep learning jobs have such high salaries. [1] There are many heuristics but no overarching theory guiding design yet.
• Bayesian optimization is one approach. [2]
• People at Google use reinforcement learning and genetic algorithms to design complex deep networks, like the GoogLeNet shown above, producing designs that perform as well as those from human designers. [3]
• People have even used neural networks to design neural nets. [4]
Latent space projection into 2D via t-SNE
250,000 commercially available drug-like molecules from the ZINC database
150,000 organic LED molecules, combinatorially generated [1]
1. Rafael Gómez-Bombarelli et al., “Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach,” Nat. Mater. 15, pp. 1120–1127 (2016)
Data sets that are available
Name                                                      Description                        # of molecules   Size
GDB-17-Set (50 million), http://gdb.unibe.ch/downloads/   -                                  50,000,000       -
GDB-13 C-N molecules                                      -                                  -                -
GDB-13 C-N-O molecules                                    -                                  -                -
ZINC database, zinc.docking.org                           Commercially available molecules   22,724,825       -
Adversarially trained autoencoder
1. Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, "Generative Adversarial Networks" (2014), arXiv:1406.2661
2. A. Makhzani, J. Shlens, N. Jaitly, and I. Goodfellow, in International Conference on Learning Representations (2016), arXiv:1511.05644
3. Artur Kadurin et al., “The Cornucopia of Meaningful Leads: Applying Deep Adversarial Autoencoders for New Molecule Development in Oncology,” Oncotarget 8.7 (2017): 10883–10890. PMC. Web. 2 Aug. 2017.
Generative adversarial networks [1] (GANs) have exploded in popularity since 2014. Adversarial autoencoders [2] (AAEs) apply the GAN framework to variational autoencoder training.
The adversarial autoencoder is an autoencoder that is regularized by matching the aggregated posterior, q(z), derived from the data distribution, to an arbitrary prior, p(z). Here p(z) is “the Normal distribution N(5,1)”.
Application to oncology molecular lead discovery (2017) [3]
“Molecular Tinder” for screening OLED molecules
From Aspuru-Guzik group: http://chimad.northwestern.edu/docs/DDD_WS_II/12_Aspuru_Guzik.p
Teacher forcing

 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 

Kürzlich hochgeladen (20)

Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 

Molecular autoencoder

  • 2. 1/24/2018 Dan Elton, P.W. Chung Group Meeting. What is Machine Learning? "Machine Learning is a field of study that gives computers the ability to learn without being explicitly programmed" - Arthur Samuel, 1959. "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with the experience E." - Tom M. Mitchell. Supervised learning: regression and classification; model Y = f(x) to match data (x, y). Parametric models: linear models, polynomial model, logistic model, neural network model, convolutional neural network. Non-parametric models: kernel ridge regression, decision tree, Gaussian process regression, kernel SVM. Unsupervised learning: clustering, dimensionality reduction, autoencoders. Reinforcement learning: robotics, etc.
  • 3. Supervised learning workflow. Source: scikit-learn.org
  • 4. What is a neural network? [Figure: a biological neuron, with dendrites labeled as input wires and terminal axons labeled as output wires.]
  • 5. What is a neural network? [Figure: input layer, hidden layer, output layer.] Each layer computes $a^{(i)} = f(W^{(i)} a^{(i-1)})$, where $a^{(i)}$ are the activations of layer i, $a^{(i-1)}$ is the input or the activations from layer i-1, $W^{(i)}$ are the weights, and $f$ is the activation function.
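The layer rule on this slide (the activations of layer i are the activation function applied to the weights times the activations of layer i-1) can be sketched in a few lines of plain Python. The network size and the weight values below are made up for illustration, and bias terms are left out because the slide does not mention them.

```python
import math

def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(weights, inputs, activation):
    """Apply activation(W * inputs) for one layer; W is a list of rows."""
    return [activation(sum(w * x for w, x in zip(row, inputs)))
            for row in weights]

# Hypothetical network: 2 inputs -> 3 hidden units -> 1 output.
W_hidden = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
W_output = [[1.0, -1.0, 0.5]]

x = [1.0, 2.0]
hidden = dense_layer(W_hidden, x, sigmoid)       # activations of the hidden layer
output = dense_layer(W_output, hidden, sigmoid)  # activations of the output layer
print(hidden, output)
```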
  • 6. Activation functions. Binary step: closest to biological neurons, but provides no gradient information. Logistic/sigmoid. arctan(). Rectified Linear Unit (ReLU): maintains a nice large gradient. Exponential Linear Unit (ELU).
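A minimal sketch of the activation functions named on this slide, with a numerical derivative to show why the binary step gives no gradient information while ReLU keeps a gradient of 1 for positive inputs. The function names and the ELU default alpha = 1.0 are choices made for this example.

```python
import math

def binary_step(z):
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def arctan(z):
    return math.atan(z)

def relu(z):
    return max(0.0, z)

def elu(z, alpha=1.0):
    # Linear for positive inputs, smooth exponential saturation for negative ones.
    return z if z > 0 else alpha * (math.exp(z) - 1.0)

def numerical_gradient(f, z, h=1e-6):
    """Central finite-difference estimate of df/dz."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

for f in (binary_step, sigmoid, arctan, relu, elu):
    print(f.__name__, f(2.0), numerical_gradient(f, 2.0))
```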
  • 7. What is convolution? [Figure: input and output of a 1-dimensional convolution with the filter, aka “kernel”.] [Figure: convolution with stride = 2.]
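A small sketch of 1-dimensional convolution with a stride, in plain Python. As is common in deep learning libraries, the kernel is slid over the input without being flipped. The input signal and the kernel values are invented for the example.

```python
def conv1d(signal, kernel, stride=1):
    """Slide the kernel over the signal, moving `stride` positions per step."""
    k = len(kernel)
    return [sum(signal[start + j] * kernel[j] for j in range(k))
            for start in range(0, len(signal) - k + 1, stride)]

signal = [1, 2, 3, 4, 5, 6, 7]   # hypothetical input
kernel = [1, 2, 1]               # hypothetical filter

stride_1 = conv1d(signal, kernel, stride=1)
stride_2 = conv1d(signal, kernel, stride=2)
print(stride_1)  # 5 outputs
print(stride_2)  # 3 outputs: every second position is skipped
```

With stride = 2 the output is roughly half as long, which is one way a network reduces the size of its intermediate representations.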
  • 8. What is convolution? Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution [Figure: 2-dimensional convolution with the 3x3 filter (1 0 1; 0 1 0; 1 0 1), producing a “feature map”.] Note that the edges are lost. There are ways to prevent this, such as padding the edges with zeros.
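The loss of the edges, and the zero-padding fix, can be seen in a short sketch. It reads the nine values listed on the slide as a 3x3 filter (an assumption about their layout) and applies it to a made-up 5x5 binary image.

```python
def conv2d_valid(image, kernel):
    """2D convolution without padding: the output shrinks by kernel size - 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def zero_pad(image, pad):
    """Surround the image with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in image]
    padded += [[0] * width for _ in range(pad)]
    return padded

kernel = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 1]]

# Hypothetical 5x5 input image.
image = [[1, 1, 1, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 0],
         [0, 1, 1, 0, 0]]

feature_map = conv2d_valid(image, kernel)             # 3x3: the edges are lost
same_size = conv2d_valid(zero_pad(image, 1), kernel)  # 5x5: padding keeps the size
print(feature_map)
```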
  • 9. What are convolutional neural nets? By most accounts the CNN was invented by Yann LeCun. He developed “LeNet” in 1998 at AT&T Bell Laboratories for reading digits. [Figure: architecture of LeNet.]
  • 10. What are convolutional neural nets? “2D” images are actually 3D, because they have 3 color channels. A 3D diagram best conveys what a CNN actually does. The depth of each non-input layer is its number of filters. Typically the number of filters in each successive layer increases while the size of the filters decreases. [Figure: 3D diagram of a CNN.]
  • 11. What are convolutional neural nets? By many accounts the current deep learning boom began when Krizhevsky, Sutskever and Hinton used a CNN to win the 2012 ImageNet image classification competition. The resulting publication has 13,000+ citations: A. Krizhevsky, I. Sutskever and G. E. Hinton, “ImageNet classification with deep convolutional neural networks”, Advances in Neural Information Processing Systems, 1097-1105 (2012). The architecture they used has 60 million parameters and 650,000 neurons.
  • 12. Why do CNNs work so well? They learn a hierarchical set of features the same way the mammalian visual cortex does! Hubel & Wiesel, 1959: “Receptive fields of single neurons in the cat’s striate cortex”. Slide from Yann LeCun.
  • 13. What is an autoencoder? • The “latent space” is also called the “low dimensional manifold”, “compressed representation”, or “thought vector” • See “Decoding the Thought Vector” for amazing examples of how faces are compressed: http://gabgoh.github.io/ThoughtVectors/ Source: Keras blog
  • 14. What is a variational autoencoder? • During training, the output is sampled from the enforced distribution as mean + random_noise * standard_deviation; during testing the output is the mean. • Minimize the Kullback–Leibler divergence between the encoded distribution and the enforced distribution. D. P. Kingma, M. Welling, “Auto-Encoding Variational Bayes”, The International Conference on Learning Representations (ICLR), Banff, 2014 [arXiv preprint].
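The sampling rule and the divergence term can be sketched as follows. This is a minimal illustration, assuming a diagonal Gaussian latent distribution parameterized by a mean and a log-variance and a standard normal N(0, 1) as the enforced distribution; the noise is scaled by the standard deviation (the square root of the variance). The function names are invented for the example.

```python
import math
import random

def sample_latent(mean, log_var, training, rng=random):
    """Training: mean + standard_deviation * noise. Testing: just the mean."""
    if not training:
        return list(mean)
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]

def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, exp(log_var)) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mean, log_var))

mean, log_var = [1.0, -2.0], [0.0, 0.0]   # hypothetical encoder outputs
print(sample_latent(mean, log_var, training=True, rng=random.Random(0)))
print(sample_latent(mean, log_var, training=False))
print(kl_to_standard_normal(mean, log_var))
```

The divergence is zero only when the encoded distribution already equals N(0, 1), so minimizing it pulls every encoding toward the enforced distribution.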
  • 15. What are recurrent neural networks? Recurrent neural networks (RNNs) have loops. The simplest RNN, shown on the left, contains one feedback loop. The mathematics and calculation of gradients (i.e. backpropagation) can be made isomorphic to that of a feed-forward neural network via time unrolling. [Figure labels: inputs; output we are interested in.] All of these beautiful figures are taken from http://colah.github.io/posts/2015-08-Understanding-LSTMs/ Copyright by Christopher Olah.
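Time unrolling can be made concrete with a scalar network that has a single feedback loop. In this sketch the same weights are applied at every time step, so the loop over inputs is equivalent to a feed-forward chain with one layer per step. The weight values and the tanh activation are assumptions made for the example.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One time step: the new state depends on the input and the previous state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Unroll the loop in time, reusing the same weights at each step."""
    h = h0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_x, w_h, b)
        states.append(h)
    return states

states = run_rnn([1.0, 0.0, 0.0])
print(states)  # the first input still influences the later states
```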
  • 16. What are recurrent neural networks? RNNs can be run in many different ways, including “seq2seq”. Ex.: video classification: input all frames of a video, output a classification for each frame. Ex.: translation: input Spanish, output English. Ex.: sentiment analysis: input text, output positive or negative sentiment. Ex.: image captioning: input an image, output a sequence of words.
  • 17. What is a gated recurrent unit? RNNs have trouble capturing long-range dependencies. Suppose we need the output at time t+1 to depend on x0 and x1, which happened in the distant past of the input stream. Technically this is called the vanishing gradient problem: the dependence (gradient) becomes exponentially small with the number of layers it has to pass through. There is also an exploding gradient problem, where the gradient increases exponentially.
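A rough numerical illustration of both problems, using a scalar recurrence with no inputs. The gradient of the final state with respect to the initial state is a product of one factor per time step, so it shrinks or grows exponentially depending on the size of those factors. The weights, step count and starting state are arbitrary choices for this sketch.

```python
import math

def gradient_through_time(w_h, steps, activation="tanh", h0=0.5):
    """d h_T / d h_0 for the recurrence h_t = activation(w_h * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        pre = w_h * h
        if activation == "tanh":
            h = math.tanh(pre)
            grad *= w_h * (1.0 - h * h)   # chain rule through tanh
        else:                             # linear recurrence, no squashing
            h = pre
            grad *= w_h
    return grad

vanishing = gradient_through_time(0.5, 50)                       # factors below 1
exploding = gradient_through_time(1.5, 50, activation="linear")  # factors above 1
print(vanishing, exploding)
```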
  • 18. What is an LSTM? Sepp Hochreiter and Jürgen Schmidhuber (right) invented the Long Short-Term Memory (LSTM) unit in 1997 to solve the vanishing gradient problem. LSTMs were recently used by Google for human-level accuracy machine translation. Apple uses LSTMs in Siri, etc. The LSTM looks complicated but it is actually based on an extremely simple idea: add a memory cell. [Figure labels: output; state.]
  • 19. How does an LSTM work? [Figure labels: “forget” gate; “input” gate; read-out gate; tanh(); sigmoid/logistic.]
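The gates named in the figure can be sketched for a scalar input and a scalar state. This follows the commonly used LSTM formulation (forget, input and read-out gates with sigmoid activations, a tanh candidate value, and a memory cell); the parameter names and values are invented for the example. With one weight for the input, one for the previous output and one bias per gate, the scalar cell has 12 parameters.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step for scalar input x, output h and memory cell c."""
    forget = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])     # what to keep
    write = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])      # "input" gate
    candidate = math.tanh(p["w_c"] * x + p["u_c"] * h_prev + p["b_c"])
    read_out = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])   # read-out gate
    c = forget * c_prev + write * candidate   # update the memory cell
    h = read_out * math.tanh(c)               # expose part of the cell as output
    return h, c

# Hypothetical parameters: all zero except biases that hold the memory fixed.
params = {f"{kind}_{gate}": 0.0 for gate in "fico" for kind in ("w", "u", "b")}
params["b_f"] = 20.0    # forget gate close to 1: keep the old cell value
params["b_i"] = -20.0   # input gate close to 0: ignore the new candidate

h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.7, p=params)
print(h, c)  # c stays close to 0.7
```

Because the cell value is carried forward by a multiplication with a gate that can sit near 1, information can survive many time steps without being repeatedly squashed.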
  • 20. LSTM vs. Gated Recurrent Unit (GRU). The GRU [1] makes major changes to the LSTM: • Output and memory cells are merged • “forget” and “input” gates are merged into a single “update” gate • Performance is similar to the LSTM [3] or slightly better [2,4], but with fewer free parameters (6 vs. 12 for a 1D input/output). If the dimensionality of the input is n and the dimensionality of the output is d, then the number of parameters is 4*d*(n+d+1) for the LSTM and 3*d*(n+d) for the GRU. References: [1] Cho, Kyunghyun, van Merrienboer, Bart, Gulcehre, Caglar, Bougares, Fethi, Schwenk, Holger, and Bengio, Yoshua. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. (2014) arXiv:1406.1078 [2] Jozefowicz et al. An Empirical Exploration of Recurrent Network Architectures, Proceedings of the 32nd International Conference on Machine Learning, 2015 [3] Klaus Greff, Rupesh Kumar Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen Schmidhuber, LSTM: A Search Space Odyssey (2015) arXiv:1503.04069 [4] Chung, Junyoung, Gulcehre, Caglar, Cho, KyungHyun, and Bengio, Yoshua. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. (2014) arXiv:1412.3555
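The two parameter-count formulas on this slide are easy to check numerically. The sketch below uses them exactly as given; note that the GRU expression as written has no bias term, and some implementations add one, which would change the count. The function names are made up for the example.

```python
def lstm_parameter_count(n, d):
    """Formula from the slide: 4 blocks of d*(n + d + 1) parameters."""
    return 4 * d * (n + d + 1)

def gru_parameter_count(n, d):
    """Formula from the slide: 3 blocks of d*(n + d) parameters."""
    return 3 * d * (n + d)

# 1D input and 1D output, as quoted on the slide.
print(lstm_parameter_count(1, 1), gru_parameter_count(1, 1))
```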
  • 21. What are SMILES strings? SMILES (simplified molecular-input line-entry system) strings encode 2D molecular graphs into 1D. Examples: CC(=O)NCCC1=CNc2c1cc(OC)cc2; CN1CCC[C@H]1C2=CN=CC=C2; FC(F)FCCC(=O)O. The only ambiguity in SMILES strings: • They do not capture 3D structure. However, for small molecules and most application areas this doesn’t matter much, as molecules generally have only one conformation, so it is implicitly contained. It would only matter for something like proteins, which might fold into more than one conformation, or if the molecules are interacting with something like an interface.
  • 22. One-hot encoding. There are 35 characters (C, N, O, @, -, =, etc.). The maximal molecule length is 120; molecules shorter than this are padded with 0s.
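A minimal sketch of character-level one-hot encoding with zero padding to 120 positions. The character set here is built from two example SMILES strings purely for illustration, so it is much smaller than the 35 characters mentioned on the slide; the helper names are hypothetical.

```python
MAX_LENGTH = 120  # maximal molecule length from the slide

def build_charset(smiles_list):
    """Collect the distinct characters that occur in the given strings."""
    return sorted(set("".join(smiles_list)))

def one_hot_encode(smiles, charset, max_length=MAX_LENGTH):
    """Return a max_length x len(charset) matrix; unused rows stay all zero."""
    if len(smiles) > max_length:
        raise ValueError("SMILES string is longer than max_length")
    index = {ch: i for i, ch in enumerate(charset)}
    matrix = [[0] * len(charset) for _ in range(max_length)]
    for position, ch in enumerate(smiles):
        matrix[position][index[ch]] = 1
    return matrix

def one_hot_decode(matrix, charset):
    """Invert the encoding, stopping at the first all-zero (padding) row."""
    chars = []
    for row in matrix:
        if 1 not in row:
            break
        chars.append(charset[row.index(1)])
    return "".join(chars)

examples = ["CC(=O)NCCC1=CNc2c1cc(OC)cc2", "CN1CCC[C@H]1C2=CN=CC=C2"]
charset = build_charset(examples)
encoded = one_hot_encode(examples[1], charset)
print(len(encoded), len(encoded[0]), one_hot_decode(encoded, charset))
```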
  • 23. Architecture. Overall autoencoder architecture, as labeled in the diagram: one-hot inputs; three 1-dimensional convolution layers (9 convolution filters of length 9, 9 convolution filters of length 9, 11 convolution filters of length 10); “flattening”, which reshapes a 2D array to a 1D array; two dense (fully connected) neural network layers, with 435 and 292 neurons, respectively; latent layer: mean and standard deviation units; custom layer to sample the Gaussian distributions during training; dense (fully connected) neural network layer, 292 neurons; gated recurrent unit (GRU) layers with 501-element memory cells; “time distributed dense layer” (a separate dense layer applied to each timestep).
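To get a feel for the sizes involved, the sketch below pushes a 120 x 35 one-hot input through the three convolution layers listed on the slide and then flattens the result. It assumes stride 1 and no padding, which the slide does not state, so the numbers are illustrative rather than a description of the actual model.

```python
def conv1d_output_shape(length, filters, kernel_size, stride=1):
    """Shape of a 1D convolution output with no padding: (positions, filters)."""
    return ((length - kernel_size) // stride + 1, filters)

input_shape = (120, 35)                   # sequence length x number of characters
conv_layers = [(9, 9), (9, 9), (11, 10)]  # (number of filters, filter length)

shapes = [input_shape]
for filters, kernel_size in conv_layers:
    shapes.append(conv1d_output_shape(shapes[-1][0], filters, kernel_size))

flattened = shapes[-1][0] * shapes[-1][1]  # "flattening": 2D array -> 1D array
print(shapes, flattened)
```

Under these assumptions the flattened vector is then reduced by the dense layers to the much smaller latent representation.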
  • 24. How does one determine architecture? The JSON file for the molecular autoencoder reveals more than 200 hyperparameters. The most important are: • Number of layers • Types of layers • Size and # of filters in CNN layers • # of hidden cells in GRU layers (also called # of units) • Number of latent variables. There are various ways of regularizing that can be turned on in several or all layers: • L1/L2 weight regularization • Weight sharing • Dropout (currently most popular)
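Two of the regularizers listed here are simple enough to sketch directly. The dropout below is the common "inverted" variant, which rescales the surviving activations during training so that nothing needs to change at test time; that choice, the function names and the example values are assumptions made for this illustration.

```python
import random

def dropout(activations, rate, training, rng):
    """Randomly zero a fraction `rate` of activations during training."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def l1_penalty(weights, strength):
    """L1 weight regularization term added to the loss."""
    return strength * sum(abs(w) for w in weights)

def l2_penalty(weights, strength):
    """L2 weight regularization term added to the loss."""
    return strength * sum(w * w for w in weights)

rng = random.Random(0)
activations = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(activations, rate=0.5, training=True, rng=rng)
print(dropped, l1_penalty([1.0, -2.0], 0.1), l2_penalty([1.0, -2.0], 0.1))
```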
  • 25. How does one determine architecture? • Historically, design for deep networks has been a black art. This is part of the reason deep learning jobs have such high salaries [1]. There are many heuristics but no overarching theory guiding design yet. • Bayesian optimization is one approach [2]. • People at Google use reinforcement learning and genetic algorithms to design complex deep networks, like the GoogLeNet shown above, producing designs that perform as well as those from human designers [3]. • People have even used neural networks to design neural nets [4]. References: [1] This Week in Machine Learning (TWiML) Podcast, interview with Matthew Zeiler and others. [2] J. Snoek, H. Larochelle, R. P. Adams, Practical Bayesian optimization of machine learning algorithms, Advances in Neural Information Processing Systems, 2951-2959 (2012) [3] Google Research Blog: Using Machine Learning to Explore Neural Network Architecture [4] Sean C. Smithson, Guang Yang, Warren J. Gross, Brett H. Meyer, Neural Networks Designing Neural Networks: Multi-Objective Hyper-Parameter Optimization, arXiv:1611.02120
  • 26. Latent space projection into 2D via t-SNE. [Figure panels: 250,000 commercially available drug-like molecules from the ZINC database; 150,000 organic LED molecules, combinatorially generated [1].] [1] Rafael Gómez-Bombarelli et al., “Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach”, Nat. Mater. 15, pp. 1120–1127 (2016)
  • 27. Data sets that are available (columns: name; description; # of molecules; size). GDB-17-Set (50 million), http://gdb.unibe.ch/downloads/: 50,000,000 molecules. GDB-13 C-N molecules. GDB-13 C-N-O molecules. ZINC database, zinc.docking.org: commercially available molecules; 22,724,825 molecules.
  • 28. Adversarially trained autoencoder. Generative adversarial networks [1] (GANs) have exploded in popularity since 2014. Adversarial autoencoders [2] (AAE) apply the GAN framework to variational autoencoder training. The adversarial autoencoder is an autoencoder that is regularized by matching the aggregated posterior, q(z), derived from the data distribution, to an arbitrary prior, p(z). Here p(z) is the Normal distribution N(5,1). Application to oncology molecular lead discovery (2017) [3]. References: [1] Goodfellow, Ian J.; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua (2014) "Generative Adversarial Networks". arXiv:1406.2661 [2] A. Makhzani, J. Shlens, N. Jaitly, and I. Goodfellow, in International Conference on Learning Representations (2016), arXiv:1511.05644 [3] Kadurin, Artur et al. “The Cornucopia of Meaningful Leads: Applying Deep Adversarial Autoencoders for New Molecule Development in Oncology.” Oncotarget 8.7 (2017): 10883–10890. PMC. Web. 2 Aug. 2017.
  • 29. “Molecular Tinder” for screening OLED molecules. From Aspuru-Guzik group: http://chimad.northwestern.edu/docs/DDD_WS_II/12_Aspuru_Guzik.p
  • 30. Teacher forcing