Deep Learning for NLP: An Introduction to Neural Word Embeddings
Guest Lecture for Language Technology Course. 4 Dec 2014, KTH, Stockholm

  1. Deep Learning for NLP: An Introduction to Neural Word Embeddings* Roelof Pieters, PhD candidate KTH/CSC, CIO/CTO Feeda AB. KTH, December 4, 2014. roelof@kth.se www.csc.kth.se/~roelof/ @graphific (*and some more fun stuff…)
  2. 1. DEEP LEARNING / 2. NLP: WORD EMBEDDINGS
  3. 1. DEEP LEARNING / 2. NLP: WORD EMBEDDINGS
  4. A couple of headlines… [all November ’14]
  5. Deep Learning = Machine Learning. “Improving some task T based on experience E with respect to performance measure P.” — T. Mitchell 1997. “Learning denotes changes in the system that are adaptive in the sense that they enable the system to do the same task (or tasks drawn from a population of similar tasks) more effectively the next time.” — H. Simon 1983, “Why Should Machines Learn?” (quoted in Mitchell 1997)
  6. Deep Learning: What? Representation learning: attempts to automatically learn good features or representations. Deep learning: attempts to learn multiple levels of representation of increasing complexity/abstraction.
  7. ML: Traditional Approach. For each new problem/question: 1. Gather as much LABELED data as you can get. 2. Throw some algorithms at it (mainly put in an SVM and keep it at that). 3. If you actually have tried more algos: pick the best. 4. Spend hours hand-engineering some features / feature selection / dimensionality reduction (PCA, SVD, etc). 5. Repeat…
  8. History • Perceptron (’57-69…) • Multi-Layered Perceptrons (’86) • SVMs (popularized 00s) • RBM (‘92+) • “2006” (Rosenblatt 1957 vs Minsky & Papert)
  9. History • Perceptron (’57-69…) • Multi-Layered Perceptrons (’86) • SVMs (popularized 00s) • RBM (‘92+) • “2006” (Rumelhart, Hinton & Williams, 1986)
  10. Backprop Renaissance • Multi-Layered Perceptrons (’86) • Uses Backpropagation (Bryson & Ho 1969): back-propagates the error signal computed at the output layer to get derivatives for learning, in order to update the weight vectors until convergence is reached
  11. Backprop Renaissance: Forward Propagation • Sum inputs, produce activation, feed-forward
  12. Backprop Renaissance: Back Propagation (of error) • Calculate total error at the top • Calculate contributions to error at each step going backwards
  13. Backpropagation • Compute gradient of example-wise loss wrt parameters • Simply applying the derivative chain rule wisely • If computing the loss (example, parameters) is O(n) computation, then so is computing the gradient
  14. Simple Chain Rule
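The chain rule behind backprop can be seen on a tiny one-hidden-layer network. A minimal numpy sketch (not taken from the slides; the data, layer sizes, and learning rate are illustrative assumptions):

    import numpy as np

    # Toy regression data: 4 examples, 3 features, 1 target (illustrative only)
    X = np.random.randn(4, 3)
    y = np.random.randn(4, 1)

    # One hidden layer with tanh activation
    W1, b1 = np.random.randn(3, 5) * 0.1, np.zeros(5)
    W2, b2 = np.random.randn(5, 1) * 0.1, np.zeros(1)

    for step in range(100):
        # Forward propagation: sum inputs, produce activation, feed forward
        h = np.tanh(X @ W1 + b1)              # hidden activations
        y_hat = h @ W2 + b2                   # network output
        loss = 0.5 * np.mean((y_hat - y) ** 2)

        # Back propagation: apply the chain rule layer by layer, top to bottom
        d_yhat = (y_hat - y) / len(X)         # dLoss/dy_hat
        dW2 = h.T @ d_yhat                    # dLoss/dW2
        db2 = d_yhat.sum(axis=0)
        d_h = d_yhat @ W2.T                   # error propagated to the hidden layer
        d_pre = d_h * (1 - h ** 2)            # through tanh: d tanh(z)/dz = 1 - tanh(z)^2
        dW1 = X.T @ d_pre
        db1 = d_pre.sum(axis=0)

        # Gradient descent update
        for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
            param -= 0.1 * grad

Computing these gradients costs the same order of work as the forward pass, which is the point made on slide 13.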
  15. History • Perceptron (’57-69…) • Multi-Layered Perceptrons (’86) • SVMs (popularized 00s) • RBM (‘92+) • “2006” (Cortes & Vapnik 1995) Kernel SVM
  16. History • Perceptron (’57-69…) • Multi-Layered Perceptrons (’86) • SVMs (popularized 00s) • RBM (‘92+) • “2006” • RBM: form of log-linear Markov Random Field (MRF) • Bipartite graph, with no intra-layer connections
  17. RBM: Structure • Energy Function
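The energy function itself appears only as a figure on the slide; for a binary RBM with visible units v, hidden units h, biases a, b and weights W, the standard form is

    E(v, h) = - \sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i W_{ij} h_j, \qquad P(v, h) \propto e^{-E(v, h)}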
  18. RBM: Training • Training Function: often by contrastive divergence (CD) (Hinton 1999; Hinton 2000) • Gibbs sampling • Gradient Descent • Goal: compute weight updates. More info: Geoffrey Hinton (2010). A Practical Guide to Training Restricted Boltzmann Machines
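A rough sketch of a CD-1 update (a single Gibbs step), following the recipe in Hinton's practical guide rather than anything shown on the slide; binary units, the batch matrix V, and the learning rate are illustrative assumptions:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_update(V, W, a, b, lr=0.1):
        # One CD-1 weight update for a binary RBM.
        # V: batch (n_samples, n_visible); W: (n_visible, n_hidden); a, b: visible/hidden biases.
        # Positive phase: hidden probabilities and samples driven by the data
        p_h = sigmoid(V @ W + b)
        h = (np.random.rand(*p_h.shape) < p_h).astype(float)

        # Negative phase: one Gibbs step (reconstruct v, then h again)
        p_v = sigmoid(h @ W.T + a)
        p_h_recon = sigmoid(p_v @ W + b)

        # Contrastive divergence gradient: <v h>_data - <v h>_reconstruction
        n = V.shape[0]
        W += lr * (V.T @ p_h - p_v.T @ p_h_recon) / n
        a += lr * (V - p_v).mean(axis=0)
        b += lr * (p_h - p_h_recon).mean(axis=0)
        return W, a, b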
  19. History • Perceptron (’57-69…) • Multi-Layered Perceptrons (’86) • SVMs (popularized 00s) • RBM (‘92+) • “2006”: 1. More labeled data (“Big Data”) 2. GPUs 3. “layer-wise unsupervised feature learning”
  20. Stacking Single Layer Learners. One of the big ideas from 2006: layer-wise unsupervised feature learning - Stacking Restricted Boltzmann Machines (RBM) -> Deep Belief Network (DBN) - Stacking regularized auto-encoders -> deep neural nets
  21. Deep Belief Network (DBN) • Stacked RBM • Introduced by Hinton et al. (2006) • 1st RBM hidden layer == 2nd RBM input layer • Feature Hierarchy
  22. Biological Justification. Deep Learning = Brain “inspired”. Audio/Visual Cortex has multiple stages == Hierarchical
  23. Biological Justification. Deep Learning = Brain “inspired”. Audio/Visual Cortex has multiple stages == Hierarchical. “Brainiacs” vs “Pragmatists”
  24. Biological Justification. Deep Learning = Brain “inspired”. Audio/Visual Cortex has multiple stages == Hierarchical. • Computational Biology • CVAP. “Brainiacs” vs “Pragmatists”
  25. Biological Justification. Deep Learning = Brain “inspired”. Audio/Visual Cortex has multiple stages == Hierarchical. • Computational Biology • CVAP • Jorge Dávila-Chacón • “that guy”. “Brainiacs” vs “Pragmatists”
  26. Different Levels of Abstraction
  27. Different Levels of Abstraction: Feature Representation. Hierarchical Learning • Natural progression from low level to high level structure as seen in natural complexity • Easier to monitor what is being learnt and to guide the machine to better subspaces • A good lower level representation can be used for many distinct tasks
  28. Generalizable Learning • Shared Low Level Representations • Multi-Task Learning • Unsupervised Training • Partial Feature Sharing • Mixed Mode Learning • Composition of Functions
  29. Classic Deep Architecture: Input layer, Hidden layers, Output layer
  30. Modern Deep Architecture: Input layer, Hidden layers, Output layer
  31. Modern Deep Architecture: Input layer, Hidden layers, Output layer. Movie time: http://www.cs.toronto.edu/~hinton/adi/index.htm
  32. Deep Learning
  33. Why go Deep? Hierarchies • Efficient • Generalization • Distributed • Sharing • Unsupervised* • Black Box • Training Time • Major PWNAGE! • Much Data
  34. No More Handcrafted Features!
  35. Deep Learning: Why? “I’ve worked all my life in Machine Learning, and I’ve never seen one algorithm knock over benchmarks like Deep Learning” — Andrew Ng
  36. Deep Learning: Why? Beat state of the art in many areas: • Language Modeling (2012, Mikolov et al) • Image Recognition (Krizhevsky won 2012 ImageNet competition) • Sentiment Classification (2011, Socher et al) • Speech Recognition (2010, Dahl et al) • MNIST hand-written digit recognition (Ciresan et al, 2010)
  37. Deep Learning: Why for NLP? One Model rules them all? DL approaches have been successfully applied to: automatic summarization, coreference resolution, discourse analysis, machine translation, morphological segmentation, named entity recognition (NER), natural language generation, natural language understanding, optical character recognition (OCR), part-of-speech tagging, parsing, question answering, relationship extraction, sentence boundary disambiguation, sentiment analysis, speech recognition, speech segmentation, topic segmentation and recognition, word segmentation, word sense disambiguation, information retrieval (IR), information extraction (IE), speech processing
  38. 1. DEEP LEARNING / 2. NLP: WORD EMBEDDINGS
  39. Word Representation • NLP treats words mainly (rule-based/statistical approaches at least) as atomic symbols: Love, Candy, Store • or in vector space: [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] • also known as “one hot” representation • its problem?
  40. Word Representation • NLP treats words mainly (rule-based/statistical approaches at least) as atomic symbols: Love, Candy, Store • or in vector space: [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] • also known as “one hot” representation • its problem? Candy [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] AND Store [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 …] = 0 !
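A tiny sketch of that problem (the toy vocabulary and indices are made up for illustration): any two distinct one-hot vectors have dot product 0, so “candy” and “store” look exactly as unrelated as any other pair of words:

    import numpy as np

    vocab = ["love", "candy", "store"]          # toy vocabulary
    def one_hot(word, vocab):
        v = np.zeros(len(vocab))
        v[vocab.index(word)] = 1.0
        return v

    candy, store = one_hot("candy", vocab), one_hot("store", vocab)
    print(candy @ store)   # 0.0: one-hot vectors carry no notion of similarity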
  41. Distributional representations. “You shall know a word by the company it keeps” (J. R. Firth 1957). One of the most successful ideas of modern statistical NLP! (figure: these words represent “banking”) • Hard (class based) clustering models • Soft clustering models
  42. Language Modeling • Word Embeddings (Bengio et al, 2001; Bengio et al, 2003), based on the idea of distributed representations for symbols (Hinton 1986) • Neural Word embeddings (Mnih and Hinton 2007; Collobert & Weston 2008; Turian et al 2010; Collobert et al. 2011; Mikolov et al. 2011)
  43. Neural distributional representations • Neural word embeddings • Combine vector space semantics with the prediction of probabilistic models • Words are represented as a dense vector: Candy = (a dense real-valued vector, shown as a figure)
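In contrast to the one-hot case above, dense vectors support graded similarity. A minimal sketch with made-up 4-dimensional “embeddings” (the numbers are purely illustrative, not learned values):

    import numpy as np

    emb = {
        "candy": np.array([0.8, 0.1, 0.6, 0.0]),
        "store": np.array([0.7, 0.2, 0.5, 0.1]),
        "love":  np.array([0.0, 0.9, 0.1, 0.8]),
    }

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    print(cosine(emb["candy"], emb["store"]))  # close to 1: related words end up nearby
    print(cosine(emb["candy"], emb["love"]))   # much lower: less related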
  44. Word Embeddings: Vector Space Model. Figure (edited) from Bengio, “Representation Learning and Deep Learning”, July 2012, UCLA. In a perfect world:
  45. Word Embeddings: Vector Space Model. In a perfect world. Input: - the country of my birth - the place where I was born
  46. Word Embeddings: Vector Space Model. In a perfect world: “the country of my birth”. Input: - the country of my birth - the place where I was born
  47. Word Embeddings: Vector Space Model. In a perfect world: “the country of my birth”, “the place where I was born”. Input: - the country of my birth - the place where I was born
  48. Word Embeddings: Vector Space Model. In a perfect world: “the country of my birth”, “the place where I was born”, ? …
  49. Recursive Neural Tensor Network (RNTN) (Socher et al. 2011; Socher 2014) • Top-down hierarchical net (vs feed forward) • NLP! • Sequence based classification, windows of several events, entire scenes (rather than images), entire sentences (rather than words) • Features = Vectors • A tensor = multi-dimensional matrix, or multiple matrices of the same size
  50. Recursive Neural Tensor Network
  51. Recursive Neural Tensor Network
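A rough numpy sketch of the composition step in a recursive (tensor) network: two child vectors are stacked and mixed through a weight matrix, and the tensor variant adds a bilinear term per output dimension. Sizes and parameters here are illustrative assumptions, not Socher's trained model:

    import numpy as np

    d = 4                                        # embedding size (illustrative)
    W = np.random.randn(d, 2 * d) * 0.1          # standard recursive-net weights
    V = np.random.randn(2 * d, 2 * d, d) * 0.1   # tensor: one (2d x 2d) slice per output dim
    b = np.zeros(d)

    def compose(c1, c2):
        # Combine two child vectors into one parent vector (RNTN-style)
        c = np.concatenate([c1, c2])                                  # stack the children
        bilinear = np.array([c @ V[:, :, k] @ c for k in range(d)])  # tensor term
        return np.tanh(W @ c + bilinear + b)

    left, right = np.random.randn(d), np.random.randn(d)
    parent = compose(left, right)   # phrase vector, fed further up the parse tree
    print(parent.shape)             # (4,)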
  52. Compositionality. Principle of compositionality: the “meaning (vector) of a complex expression (sentence) is determined by: - the meanings of its constituent expressions (words) and - the rules (grammar) used to combine them” — Gottlob Frege (1848 - 1925)
  53. Compositionality: parse tree (NP, PP/IN, NP, DT, NN, PRP$, NN)
  54. Compositionality: parse tree (NP, PP/IN, NP, DT, NN, PRP$, NN; IN, DT, NN, PRP, NN)
  55. Compositionality: parse tree (NP, IN, NP, PRP, NN, DT, NN)
  56. Compositionality: parse tree (NP, IN, NP, DT, NN, PRP, NN, PP, NP (S / ROOT))
  57. Compositionality: parse tree with “rules” (NP, IN, NP, DT, NN, PRP, NN, PP, NP (S / ROOT))
  58. Compositionality: parse tree with “rules” and “meanings” (NP, IN, NP, DT, NN, PRP, NN, PP, NP (S / ROOT))
  59. Vector Space + Word Embeddings: Socher
  60. Vector Space + Word Embeddings: Socher
  61. Word Embeddings: Turian. Code & info: http://metaoptimize.com/projects/wordreprs/
  62. Word Embeddings: Turian. t-SNE visualizations of word embeddings. Left: Number Region; Right: Jobs Region. From Turian et al. 2011
  63. Word Embeddings: Mikolov • Recurrent Neural Network (Mikolov et al. 2010; Mikolov et al. 2013a) W(“woman”) − W(“man”) ≃ W(“aunt”) − W(“uncle”); W(“woman”) − W(“man”) ≃ W(“queen”) − W(“king”). Figures from Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations
  64. Word Embeddings: Mikolov • Mikolov et al. 2013b. Figures from Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013b). Efficient Estimation of Word Representations in Vector Space
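The analogy arithmetic above can be reproduced with pretrained word2vec vectors. A sketch using gensim; the vectors file name is an assumption (any word2vec-format file will do):

    from gensim.models import KeyedVectors

    # Assumed path: any word2vec-format vectors, e.g. the GoogleNews vectors
    vectors = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

    # W("king") - W("man") + W("woman") ≈ W("queen")
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))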
  65. Wanna Play? • cuda-convnet2 (Alex Krizhevsky, Toronto) (C++/CUDA, optimized for GTX 580) https://code.google.com/p/cuda-convnet2/ • Caffe (Berkeley) (C++/CUDA, with Python bindings) http://caffe.berkeleyvision.org/ • OverFeat (NYU) http://cilvr.nyu.edu/doku.php?id=code:start
  66. Wanna Play? • Theano - CPU/GPU symbolic expression compiler in Python (from the LISA lab at the University of Montreal) http://deeplearning.net/software/theano/ • Pylearn2 - library designed to make machine learning research easy http://deeplearning.net/software/pylearn2/ • Torch - Matlab-like environment for state-of-the-art machine learning algorithms in Lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu) http://torch.ch/ • More info: http://deeplearning.net/software_links/
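Theano's core idea (declare a symbolic expression, get gradients for free, compile to CPU/GPU code) fits in a few lines; a minimal sketch, not taken from the slides:

    import theano
    import theano.tensor as T

    # Symbolic variables and a scalar expression built from them
    x = T.dvector("x")
    w = T.dvector("w")
    loss = T.sum((T.dot(w, x) - 1.0) ** 2)

    # Theano derives the gradient symbolically and compiles both to native code
    grad_w = T.grad(loss, w)
    f = theano.function(inputs=[x, w], outputs=[loss, grad_w])

    print(f([1.0, 2.0], [0.5, 0.5]))   # loss and dloss/dw for a toy input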
  67. Wanna Play with Me? As PhD candidate KTH/CSC (Academic/Research): “Always interested in discussing Machine Learning, Deep Architectures, Graphs, and NLP” roelof@kth.se www.csc.kth.se/~roelof/. As CIO/CTO Feeda (Internship / Entrepreneurship): “Always looking for additions to our brand new R&D team” [Internships upcoming on the KTH exjobb website…] roelof@feeda.com www.feeda.com
  68. We’re Hiring! roelof@feeda.com www.feeda.com Feeda • Software Developers • Data Scientists
  69. Addendum: Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Chris Manning, Andrew Ng and Chris Potts. 2013. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. EMNLP 2013. Code & demo: http://nlp.stanford.edu/sentiment/index.html
  70. Addendum: Eric H. Huang, Richard Socher, Christopher D. Manning, Andrew Y. Ng. Improving Word Representations via Global Context and Multiple Word Prototypes.
