Talk at nucl.ai 2016 in Vienna
Can neural networks sing, dance, remix and rhyme? And most importantly, can they talk back? This talk will introduce deep neural nets with textual and auditory understanding and some of the recent breakthroughs made in these fields. It will then show some of the exciting possibilities these technologies hold for “creative” use and for explorations of human-machine interaction, where the main theme is “augmentation, not automation”.
http://events.nucl.ai/track/cognitive/#deep-neural-networks-that-talk-back-with-style
11. NEURAL NETWORKS CAN SAMPLE (x_t → x_t+1)
RECURRENT NEURAL NETWORK (RNN/LSTM) SAMPLING
▸ (Naive) Sampling
▸ Scheduled Sampling (ML) Bengio et al. 2015
▸ Sequence Level (RL) Ranzato et al. 2016
▸ Reward Augmented Maximum Likelihood (ML+RL) Norouzi et al., forthcoming
12. LSTM SAMPLING (GRAVES 2013)
▸ Approach used for most recent “creative” generations
▸ (char-rnn, torch-rnn, etc.)
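A minimal sketch of this naive sampling step, assuming a trained network that outputs a softmax distribution over the vocabulary at each step; the temperature-scaled draw is the usual knob for trading conservativeness against diversity, and the `model.step` interface in the comments is a hypothetical stand-in:

import numpy as np

def sample_next(probs, temperature=1.0):
    """Draw the next token index from one step of RNN softmax output.
    temperature < 1 sharpens the distribution, > 1 flattens it."""
    logits = np.log(probs + 1e-8) / temperature
    weights = np.exp(logits - np.max(logits))   # subtract max for stability
    weights /= weights.sum()
    return np.random.choice(len(weights), p=weights)

# generation loop (hypothetical model interface): feed each sampled
# token back in as the next input
# token, state = seed_token, model.initial_state
# for _ in range(200):
#     probs, state = model.step(token, state)
#     token = sample_next(probs, temperature=0.7)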
13. SCHEDULED SAMPLING (BENGIO ET AL 2015)
▸ Start training with the ground truth as input and slowly move towards using the model’s own predictions as next steps
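A sketch of the per-step coin flip at the core of scheduled sampling, using the inverse-sigmoid decay schedule suggested in the paper; the token variables are illustrative stand-ins for a real training loop:

import math
import random

def epsilon(step, k=1000.0):
    """Inverse-sigmoid decay from Bengio et al. 2015:
    starts near 1 (always ground truth) and decays towards 0."""
    return k / (k + math.exp(step / k))

def next_input(ground_truth_token, predicted_token, step):
    """With probability epsilon feed the ground truth, otherwise feed the
    model's own previous prediction, so training gradually matches the
    free-running regime used at inference time."""
    if random.random() < epsilon(step):
        return ground_truth_token
    return predicted_token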
14. SEQUENCE LEVEL TRAINING (RANZATO ET AL 2016)
▸ Use model predictions as next steps, with a sequence-level reward/loss (e.g. BLEU) provided through Reinforcement Learning
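A toy REINFORCE-style step in the spirit of sequence-level training: sample a complete sequence from the model, score it with a sequence reward (BLEU in the paper), and weight the log-likelihood of the sampled tokens by that reward. The `model` interface and `sequence_reward` are hypothetical stand-ins, not the paper's code:

import numpy as np

def reinforce_step(model, sequence_reward, baseline=0.0, max_len=50):
    """Sample one sequence and form the REINFORCE loss:
    -(reward - baseline) * sum of log-probs of the sampled tokens."""
    tokens, log_probs = [], []
    token, state = model.start_token, model.initial_state
    for _ in range(max_len):
        probs, state = model.step(token, state)    # hypothetical interface
        token = np.random.choice(len(probs), p=probs)
        tokens.append(token)
        log_probs.append(np.log(probs[token]))
        if token == model.end_token:
            break
    advantage = sequence_reward(tokens) - baseline  # e.g. BLEU vs. reference
    return -advantage * np.sum(log_probs), tokens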
15. REWARD AUGMENTED MAXIMUM LIKELIHOOD (NOROUZI ET AL., FORTHCOMING)
▸ Generate targets sampled around the correct solution
▸ “giving it mostly wrong examples to learn the right ones”
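A sketch of RAML's target sampling under a Hamming-distance reward: instead of training only on the ground truth y*, training targets are drawn with probability proportional to exp(r(y, y*) / τ), here realised by first sampling the number of substitutions and then placing them uniformly. The edit-based reward and all names below are illustrative assumptions:

import math
import random

def sample_raml_target(target, vocab, tau=0.8):
    """Sample a training target 'around' the ground truth (substitutions only).
    With reward = -Hamming distance, a target with e substitutions is drawn
    with probability proportional to C(n, e) * (V-1)^e * exp(-e / tau)."""
    n, V = len(target), len(vocab)
    weights = [math.comb(n, e) * (V - 1) ** e * math.exp(-e / tau)
               for e in range(n + 1)]
    e = random.choices(range(n + 1), weights=weights)[0]
    noisy = list(target)
    for i in random.sample(range(n), e):
        noisy[i] = random.choice([t for t in vocab if t != target[i]])
    return noisy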
31. EARLY LSTM MUSIC COMPOSITION (2002)
Douglas Eck and Jürgen Schmidhuber (2002) Learning the Long-Term Structure of the Blues
32. AUDIO GENERATION: MIDI
Douglas Eck and Jürgen Schmidhuber (2002) Learning the Long-Term Structure of the Blues
▸ https://soundcloud.com/graphific/pyotr-lstm-tchaikovsky
A Recurrent Latent Variable Model for Sequential Data (2015), J. Chung, K. Kastner, L. Dinh, K. Goel, A. Courville, Y. Bengio
+ “modded” VRNN
33. AUDIO GENERATION: MIDI
Douglas Eck and Jürgen Schmidhuber (2002) Learning the Long-Term Structure of the Blues
▸ https://soundcloud.com/graphific/neural-remix-net
A Recurrent Latent Variable Model for Sequential Data (2015), J. Chung, K. Kastner, L. Dinh, K. Goel, A. Courville, Y. Bengio
+ “modded” VRNN
42. Deep Learning with Python
Python has a wide range of deep learning-related libraries available, from low level to high level:
deeplearning.net/software/theano
caffe.berkeleyvision.org
tensorflow.org
lasagne.readthedocs.org/en/latest
and of course:
keras.io
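For instance, a minimal character-level LSTM defined in Keras, in the spirit of the char-rnn examples above; the layer sizes and input shape are illustrative, not from the talk:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

maxlen, vocab_size = 40, 64   # illustrative: window length, alphabet size

model = Sequential()
model.add(LSTM(128, input_shape=(maxlen, vocab_size)))  # one recurrent layer
model.add(Dense(vocab_size))
model.add(Activation('softmax'))   # distribution over the next character
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')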
47. Questions?
love letters? existential dilemmas? academic questions? gifts?
find me at:
www.csc.kth.se/~roelof/
roelof@kth.se
@graphific
Consulting / Projects / Contracts / $$$ / more love letters?
http://www.graph-technologies.com/
roelof@graph-technologies.com
48. WHAT ABOUT CONVNETS?
▸ Awesome for interpreting features
▸ Recurrence can be “kind of” achieved with (see sketch below):
▸ long splicing filters
▸ pooling layers
▸ smart architectures
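A rough sketch of the Kim (2014) style sentence-classification ConvNet in Keras: parallel 1-D convolutions with several filter widths over word embeddings, each max-pooled over time and concatenated. All hyperparameters below are illustrative assumptions:

from keras.models import Model
from keras.layers import (Input, Embedding, Conv1D, GlobalMaxPooling1D,
                          Concatenate, Dense)

seq_len, vocab_size, embed_dim = 50, 20000, 128   # illustrative sizes

words = Input(shape=(seq_len,))
x = Embedding(vocab_size, embed_dim)(words)

# parallel 1-D convolutions with different filter widths, max-pooled over time
pooled = [GlobalMaxPooling1D()(Conv1D(100, w, activation='relu')(x))
          for w in (3, 4, 5)]

out = Dense(2, activation='softmax')(Concatenate()(pooled))  # e.g. pos / neg
model = Model(inputs=words, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam')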
49. NLP
Yoon Kim (2014) Convolutional Neural Networks for Sentence Classification
Xiang Zhang, Junbo Zhao, Yann LeCun (2015) Character-level Convolutional Networks for Text Classification
50. AUDIO
Keunwoo Choi, Jeonghee Kim, George Fazekas, and Mark Sandler (2016) Auralisation of Deep Convolutional Neural Networks: Listening to Learned Features
51. AUDIO
Keunwoo Choi, George Fazekas, Mark Sandler (2016) Explaining Deep Convolutional Neural Networks on Music Classification
audio at:
https://keunwoochoi.wordpress.com/2016/03/23/what-cnns-see-when-cnns-see-spectrograms/