
Deep Learning for Information Retrieval

9,405 views

Published on

Guest Lecture at DD2476 Search Engines and Information Retrieval Systems. 28 April 2015

Published in: Education


  1. @graphific Roelof Pieters. Guest Lecture: Deep Learning for Information Retrieval. 28 April 2015. www.csc.kth.se/~roelof/ roelof@kth.se roelof@graph-technologies.com Gve Systems, Graph Technologies R&D. DD2476 Search Engines and Information Retrieval Systems https://www.kth.se/social/course/DD2476/ Slides online at http://www.slideshare.net/roelofp/deep-learning-for-information-retrieval
  2. About Me • (-10y) CS dropout (Amsterdam Technical Univ.) • (2y) MSc Social Anthropology, Stockholm University • Current: PhD candidate at KTH/CSC with focus on: • Deep Learning for Natural Language Processing (Distributed Semantics) • Graph-based approaches for Knowledge Representation • Multi-modal models • Current: Data Science Consultant at Graph Technologies R&D and Gve Systems • Recommender Systems • Deep Learning • Realtime Graph-based Search Engines
  3. Information Retrieval (IR) - Hedvig Kjellström, lecture 1
  4. Data landscape is changing 1. Amount of digital data is growing at an increasing rate (IoT, digitalization, wearables, phones/tablets) 2. Data types are shifting as well: 1. from text to audio-visual 2. from professional to personal/social (social media) 3. from semi-structured to unstructured
  5. [Jussi Karlgren, NLP Sthlm Meetup 2014]
  6. Data landscape is changing. Triple V’s of Big Data: 1. Volume 2. Velocity 3. Variety
  7. Making sense of Data. Typical ML Regression
  8. Making sense of Data. Neural Net / Typical ML Regression
  9. Degrees of Complexity. perceptron demo
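
The perceptron demo itself is not part of this transcript. As a rough stand-in, here is a minimal sketch of the classic perceptron learning rule, assuming a step activation and the logical AND function as toy data. The function names and the learning rate are illustrative choices, not taken from the lecture.

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Perceptron rule: adjust weights only when a point is misclassified."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            # (target - pred) is 0 for correct points, +1 or -1 otherwise
            w += lr * (target - pred) * xi
            b += lr * (target - pred)
    return w, b

def predict(X, w, b):
    """Step activation over the learned linear boundary."""
    return (X @ w + b > 0).astype(int)

# Toy data (assumption): logical AND, which is linearly separable
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])

w, b = train_perceptron(X, y_and)
print(predict(X, w, b))
```

A single perceptron can only draw one linear boundary, which is why the following slides move on to networks with hidden layers.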
  10. Neural Net (figure from Lior Rokach, Ben-Gurion University)
  11. Neural Net (figure from Lior Rokach, Ben-Gurion University)
  12. Neural Net (figure from Lior Rokach, Ben-Gurion University)
  13. Neural Net (figure from Lior Rokach, Ben-Gurion University)
  14. Neural Net (figure from Lior Rokach, Ben-Gurion University)
  15. Neural Net. multilayer nn demo
  16. Deep Learning ??
  17. Deep Learning ?? • Learning multiple layers • “Back propagation” • Can “theoretically” learn any function! Prior to 2006: • Very slow and inefficient • SVMs, random forests, etc. were the state of the art (SOTA)
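
To make the "multiple layers" point concrete, the sketch below shows a two-layer network computing XOR, a function no single perceptron can represent. The weights are hand-picked for illustration (an assumption of this example); in practice they would be learned with back propagation, which is not shown here.

```python
import numpy as np

def step(z):
    """Threshold activation: 1 where the input is positive, else 0."""
    return (z > 0).astype(int)

# Hand-picked weights (illustrative, not learned):
# hidden unit 0 fires for OR(x1, x2), hidden unit 1 fires for AND(x1, x2)
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])      # each column holds one hidden unit's weights
b1 = np.array([-0.5, -1.5])
# output fires for "OR and not AND", which is XOR
W2 = np.array([1.0, -1.0])
b2 = -0.5

def xor_net(X):
    hidden = step(X @ W1 + b1)
    return step(hidden @ W2 + b2)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(xor_net(X))
```

The hidden layer re-represents the inputs so that the output unit faces a linearly separable problem, which is the basic idea behind stacking layers.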
  18. 2006+: the 3 Deep Learning Conspirators
  19. (no text on slide)
  20. (no text on slide)
  21. Deep Learning: Why? “I’ve worked all my life in Machine Learning, and I’ve never seen one algorithm knock over benchmarks like Deep Learning” (Andrew Ng)
  22. Different Levels of Abstraction
  23. Hierarchical Learning • Natural progression from low level to high level structure as seen in natural complexity. Different Levels of Abstraction. Feature Representation.
  24. Hierarchical Learning • Natural progression from low level to high level structure as seen in natural complexity • Easier to monitor what is being learnt and to guide the machine to better subspaces. Different Levels of Abstraction. Feature Representation.
  25. Hierarchical Learning • Natural progression from low level to high level structure as seen in natural complexity • Easier to monitor what is being learnt and to guide the machine to better subspaces • A good lower level representation can be used for many distinct tasks. Different Levels of Abstraction. Feature Representation.
  26. Hierarchical Learning • Natural progression from low level to high level structure as seen in natural complexity • Easier to monitor what is being learnt and to guide the machine to better subspaces • A good lower level representation can be used for many distinct tasks. Different Levels of Abstraction. Feature Representation.
  27. Classic Deep Architecture: Input layer, Hidden layers, Output layer
  28. Modern Deep Architecture: Input layer, Hidden layers, Output layer. movie time: http://www.cs.toronto.edu/~hinton/adi/index.htm
  29. Why Deep Learning ? [Kudos to Richard Socher, for this eloquent summary :) ] • Manually designed features are often over-specified, incomplete and take a long time to design and validate • Learned Features are easy to adapt, fast to learn • Deep learning provides a very flexible, (almost?) universal, learnable framework for representing world, visual and linguistic information. • Deep learning can learn unsupervised (from raw text/audio/images/whatever content) and supervised (with specific labels like positive/negative)
  30. Word Embeddings
  31. What about NLP ? Some of the challenges in Language Understanding: 1. Language is ambiguous: Every sentence has many possible interpretations. 2. Language is productive: We will always encounter new words or new constructions. 3. Language is culturally specific.
  32. Language Representation • NLP treats words mainly (rule-based/statistical approaches at least) as atomic symbols: Love, Candy, Store • or in vector space: [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] • also known as “one hot” representation. • Its problem ? Candy [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 …] AND Store [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 …] = 0 !
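
The problem with one-hot vectors can be shown in a few lines. This is a minimal sketch with a hypothetical three-word vocabulary: any two distinct words have a dot product of zero, so the representation carries no notion of similarity.

```python
import numpy as np

vocab = ["love", "candy", "store"]          # hypothetical toy vocabulary
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    """Vector of zeros with a single 1 at the word's vocabulary index."""
    v = np.zeros(len(vocab), dtype=int)
    v[index[word]] = 1
    return v

candy, store = one_hot("candy"), one_hot("store")
overlap = int(candy @ store)   # dot product of two different one-hot vectors
print(overlap)
```

However related "candy" and "store" may be in meaning, their vectors are orthogonal, which is what motivates the distributional representations on the following slides.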
  33. Language Representation - Johan Boye, lecture 2. Term-document matrix = Sparse!
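
As a small illustration of the sparsity claim, the sketch below builds a term-document count matrix for three made-up documents and measures the fraction of zero cells. The documents and variable names are assumptions for this example only.

```python
from collections import Counter

# Hypothetical toy corpus; real collections are far larger and far sparser
docs = [
    "deep learning for information retrieval",
    "search engines rank documents",
    "neural networks learn word vectors",
]

doc_counts = [Counter(d.split()) for d in docs]
vocab = sorted({term for counts in doc_counts for term in counts})

# Rows are terms, columns are documents, cells are raw term counts
matrix = [[counts[term] for counts in doc_counts] for term in vocab]

zeros = sum(cell == 0 for row in matrix for cell in row)
sparsity = zeros / (len(vocab) * len(docs))
print(f"{len(vocab)} terms x {len(docs)} docs, {sparsity:.0%} zeros")
```

Because most terms appear in only a few documents, practical systems store such matrices in sparse formats (for example an inverted index) rather than as dense arrays.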
  34. Distributional representations. “You shall know a word by the company it keeps” (J. R. Firth 1957). One of the most successful ideas of modern statistical NLP! [these words represent banking] • Hard (class based) clustering models • Soft clustering models
  35. Distributional hypothesis. He filled the wampimuk, passed it around and we all drunk some. We found a little, hairy wampimuk sleeping behind the tree. (McDonald & Ramscar 2001)
  36. Distributional semantics. Landauer and Dumais (1997), Turney and Pantel (2010), …
  37. Distributional semantics. Distributional meaning as co-occurrence vector:
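
One simple way to obtain such a co-occurrence vector is to count the words that appear within a fixed window around each target word. The sketch below assumes a window of two tokens on either side and uses shortened, punctuation-free versions of the wampimuk sentences; both are choices made for this example.

```python
from collections import defaultdict

def cooccurrence(sentences, window=2):
    """Count, for every word, the words seen within `window` tokens of it."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts

# Simplified versions of the slide's example sentences (assumption)
sentences = [
    "he filled the wampimuk passed it around",
    "we found a little hairy wampimuk sleeping behind the tree",
]
counts = cooccurrence(sentences)
print(dict(counts["wampimuk"]))
```

The resulting counts for "wampimuk" mix contexts from both sentences, which is exactly how an unknown word picks up meaning from the company it keeps.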
  38. Distributional representations • Taking it further: • Continuous word embeddings • Combine vector space semantics with the prediction of probabilistic models • Words are represented as a dense vector: Candy =
  39. Word Embeddings: Socher. Vector Space Model, adapted from Bengio, “Representation Learning and Deep Learning”, July, 2012, UCLA. In a perfect world:
  40. Word Embeddings: Socher. Vector Space Model, adapted from Bengio, “Representation Learning and Deep Learning”, July, 2012, UCLA. In a perfect world: the country of my birth; the place where I was born
  41. Why Neural Networks for NLP? • Can theoretically (given enough units) approximate “any” function • and fit to “any” kind of data • Efficient for NLP: hidden layers can be used as word lookup tables • Dense distributed word vectors + efficient NN training algorithms: • Can scale to billions of words !
  42. Word Embeddings: Socher. Vector Space Model. Figure (edited) from Bengio, “Representation Learning and Deep Learning”, July, 2012, UCLA. In a perfect world: the country of my birth; the place where I was born; ? …
  43. Compositionality. Principle of compositionality: the “meaning (vector) of a complex expression (sentence) is determined by: - the meanings of its constituent expressions (words) and - the rules (grammar) used to combine them” (Gottlob Frege, 1848 - 1925)
  44. Compositionality • How do we handle the compositionality of language in our models?
  45. Compositionality • How do we handle the compositionality of language in our models? • Recursion: the same operator (same parameters) is applied repeatedly on different components
  46. RNN 1: Recurrent Neural Networks • How do we handle the compositionality of language in our models? • Option 1: Recurrent Neural Networks (RNN) (we ignore recurrent NN’s for this talk)
  47. RNN 2: Recursive Neural Networks • How do we handle the compositionality of language in our models? • Option 2: Recursive Neural Networks (also sometimes called RNN)
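
The recursion idea, one operator with shared parameters applied at every node of a parse tree, can be sketched as follows. This is a minimal illustration assuming the common composition p = tanh(W [c1; c2] + b), with random untrained weights and random word vectors; the dimensions, names, and the bracketing of the phrase are all assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4  # illustrative embedding size

# Shared parameters: the same W and b are used at every tree node
W = rng.normal(scale=0.1, size=(dim, 2 * dim))
b = np.zeros(dim)

# Random stand-ins for learned word embeddings
embeddings = {w: rng.normal(size=dim)
              for w in ["the", "country", "of", "my", "birth"]}

def compose(tree):
    """Return a vector for a word (str) or a binary subtree (left, right)."""
    if isinstance(tree, str):
        return embeddings[tree]
    left, right = tree
    children = np.concatenate([compose(left), compose(right)])
    return np.tanh(W @ children + b)

# Assumed binary parse: ((the country) (of (my birth)))
tree = (("the", "country"), ("of", ("my", "birth")))
phrase_vec = compose(tree)
print(phrase_vec.shape)
```

Every phrase, whatever its length, ends up as a vector of the same size as a word vector, so phrases and words can live in one shared vector space.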
  48. Recursive Neural Tensor Network
  49. Recursive Neural Tensor Network. Socher, R., Lin, C.C., Ng, A.Y., Manning, C.D. (2011) Parsing Natural Scenes and Natural Language with Recursive Neural Networks. code & info: http://www.socher.org/index.php/Main/ParsingNaturalScenesAndNaturalLanguageWithRecursiveNeuralNetworks
  50. Recurrent NN for Vector Space. NP PP/IN NP DT NN PRP$ NN (Parse Tree)
  51. Recurrent NN: Compositionality / Recurrent NN for Vector Space. NP PP/IN NP DT NN PRP$ NN (Parse Tree). IN DT NN PRP NN (Compositionality)
  52. Recurrent NN: Compositionality / Recurrent NN for Vector Space. NP IN NP PRP NN (Parse Tree). DT NN (Compositionality)
  53. Recurrent NN: Compositionality / Recurrent NN for Vector Space. NP IN NP DT NN PRP NN PP NP (S / ROOT). “rules” “meanings” (Compositionality)
  54. Recurrent NN: Compositionality / Recurrent NN for Vector Space. Vector Space + Word Embeddings: Socher
  55. Recurrent NN for Vector Space. Vector Space + Word Embeddings: Socher
  56. Word Embeddings: Turian (2010). Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning. code & info: http://metaoptimize.com/projects/wordreprs/
  57. Word Embeddings: Turian (2010). Turian, J., Ratinov, L., Bengio, Y. (2010). Word representations: A simple and general method for semi-supervised learning. code & info: http://metaoptimize.com/projects/wordreprs/
  58. Word Embeddings: Demo
  59. Word Embeddings: Collobert & Weston (2011). Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P. (2011). Natural Language Processing (almost) from Scratch
  60. Polysemous-embeddings: Stanford (2012). Huang, E. H., Socher, R., Manning, C. D., Ng, A. Y. (2012). Improving Word Representations via Global Context and Multiple Word Prototypes
  61. Linguistic Regularities: Mikolov (2013). Mikolov, T., Yih, W., & Zweig, G. (2013). Linguistic Regularities in Continuous Space Word Representations. code & info: https://code.google.com/p/word2vec/
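
The kind of regularity studied in that paper is usually demonstrated with vector arithmetic such as king - man + woman ≈ queen. The sketch below uses tiny hand-made 3-dimensional vectors (an assumption; real embeddings are learned and have hundreds of dimensions) purely to show the mechanics of the offset-and-nearest-neighbour lookup.

```python
import numpy as np

# Hand-made toy vectors; dimensions loosely mean (royal, male, female)
vectors = {
    "king":  np.array([1.0, 1.0, 0.0]),
    "queen": np.array([1.0, 0.0, 1.0]),
    "man":   np.array([0.0, 1.0, 0.0]),
    "woman": np.array([0.0, 0.0, 1.0]),
    "child": np.array([0.0, 0.2, 0.2]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c):
    """Solve 'a is to b as c is to ?' by nearest neighbour to b - a + c."""
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("man", "king", "woman"))
```

With trained embeddings the same lookup would run over the whole vocabulary, for instance through a library such as gensim, and the match is approximate rather than exact.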
  62. Word Embeddings for MT: Mikolov (2013). Mikolov, T., Le, Q. V., Sutskever, I. (2013). Exploiting Similarities among Languages for Machine Translation
  63. Word Embeddings for MT: Kiros (2014). Kiros, R., Zemel, R. S., Salakhutdinov, R. (2014). A Multiplicative Model for Learning Distributed Text-Based Attribute Representations
  64. Recursive Deep Models & Sentiment: Socher (2013). Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C., Ng, A., Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. code & demo: http://nlp.stanford.edu/sentiment/index.html
  65. Paragraph Vectors: Le & Mikolov (2014). Le, Q., Mikolov, T. (2014). Distributed Representations of Sentences and Documents • add context (sentence, paragraph, document) to word vectors during training ! Results on Stanford Sentiment Treebank dataset:
  66. Paragraph Vectors: Dai et al. (2014). Dai, A., Olah, C., Le, Q., Corrado, G. (2014). Document Embedding with Paragraph Vectors
  67. Paragraph Vectors: Dai et al. (2014). Dai, A., Olah, C., Le, Q., Corrado, G. (2014). Document Embedding with Paragraph Vectors
  68. Paragraph Vectors: Dai et al. (2014). Dai, A., Olah, C., Le, Q., Corrado, G. (2014). Document Embedding with Paragraph Vectors. Nearest neighbours to the machine learning paper “Distributed Representations of Sentences and Documents” in arXiv.
  69. Joint Image-Word Embeddings
  70. Some Current Approaches 1. Multimodal representation learning 2. Generating descriptions of images 3. Ranking images and captions (“image-sentence ranking”)
  71. Bags of Visual Words. Source credit: K. Grauman, B. Leibe
  72. Bags of Visual Words (Sivic & Zisserman 2003). Standard BoW issues, however. What we get: But we want: • visual word order/relations • location • scale/viewpoint invariance • …
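
The limitation listed above follows directly from how a bag of visual words is built: each local descriptor is mapped to its nearest codebook entry and only the counts are kept. The sketch below uses a made-up 2-D codebook and four made-up patch descriptors (real descriptors such as SIFT are 128-dimensional and the codebook is learned by clustering) to show that reordering the patches leaves the histogram unchanged.

```python
import numpy as np

def bag_of_visual_words(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and count assignments."""
    dists = np.linalg.norm(
        descriptors[:, None, :] - codebook[None, :, :], axis=2
    )
    words = dists.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook))

# Hypothetical cluster centres ("visual words") and patch descriptors
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
patches = np.array([[0.1, 0.0], [0.9, 1.1], [0.0, 0.9], [1.0, 0.9]])

hist = bag_of_visual_words(patches, codebook)
shuffled = bag_of_visual_words(patches[::-1], codebook)
print(hist, shuffled)
```

Since both calls give the same histogram, any information about where the patches were in the image, or how they relate to each other, is gone.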
  73. Zero-shot Learning. DeViSE model • skip-gram text model on Wikipedia corpus of 5.7 million documents (5.4 billion words) - approach from (Mikolov et al. ICLR 2013). Frome, A., Corrado, G.S., Shlens, J., Bengio, S., Dean, J., Mikolov, T., Ranzato, M.A. (2013). DeViSE: A deep visual-semantic embedding model
  74. Encoder-Decoder pipeline (Kiros et al 2014). Encoder: A deep convolutional network (CNN) and long short-term memory recurrent network (LSTM) for learning a joint image-sentence embedding. Decoder: A new neural language model that combines structure and content vectors for generating words one at a time in sequence. Kiros, R., Salakhutdinov, R., Zemel, R. S. (2014). Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models
  75. Encoder-Decoder pipeline • captures Multimodal linguistic regularities
  76. Encoder-Decoder pipeline • captures Multimodal linguistic regularities (PCA projection of (300-dimensional) word and image representations)
  77. Joint Visual-Semantic embedding. CNN+LSTM: Vinyals, O., Toshev, A., Bengio, S., Erhan, D. (2015). Show and Tell: A Neural Image Caption Generator. CNN+RNN: Karpathy, A., Fei-Fei, L. (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions
  78. Joint Visual-Semantic embedding. Karpathy, A., Fei-Fei, L. (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions
  79. Joint Visual-Semantic embedding. Karpathy, A., Fei-Fei, L. (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions
  80. Joint Visual-Semantic embedding. Karpathy, A., Fei-Fei, L. (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions
  81. Joint Visual-Semantic embedding. Karpathy, A., Fei-Fei, L. (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions. demo
  82. Any Questions?
  83. Wanna Play ? Code! Download example code samples from https://github.com/graphific/DL-Meetup-intro • git clone --recursive https://github.com/graphific/DL-Meetup-intro.git (more at http://deeplearning.net/ )
  84. Wanna Play ? General Deep Learning • Theano - CPU/GPU symbolic expression compiler in Python (from LISA lab at University of Montreal). http://deeplearning.net/software/theano/ • Pylearn2 - library designed to make machine learning research easy. http://deeplearning.net/software/pylearn2/ • Torch - Matlab-like environment for state-of-the-art machine learning algorithms in Lua (from Ronan Collobert, Clement Farabet and Koray Kavukcuoglu) http://torch.ch/ • more info: http://deeplearning.net/software_links/
  85. Wanna Play ? NLP • RNNLM (Mikolov) http://rnnlm.org • NB-SVM https://github.com/mesnilgr/nbsvm • Word2Vec (skipgrams/cbow) https://code.google.com/p/word2vec/ (original) http://radimrehurek.com/gensim/models/word2vec.html (python) • GloVe http://nlp.stanford.edu/projects/glove/ (original) https://github.com/maciejkula/glove-python (python) • Socher et al / Stanford RNN Sentiment code: http://nlp.stanford.edu/sentiment/code.html • Deep Learning without Magic Tutorial: http://nlp.stanford.edu/courses/NAACL2013/
  86. Wanna Play ? Computer Vision • cuda-convnet2 (Alex Krizhevsky, Toronto) (C++/CUDA, optimized for GTX 580) https://code.google.com/p/cuda-convnet2/ • Caffe (Berkeley) (Cuda/OpenCL, Theano, Python) http://caffe.berkeleyvision.org/ • OverFeat (NYU) http://cilvr.nyu.edu/doku.php?id=code:start
  87. (no text on slide)
  88. Impact on Computer Vision
  89. Impact on Computer Vision (from Clarifai)
  90. Impact on Audio Processing. Speech Recognition
  91. Impact on Audio Processing. TIMIT Speech Recognition (from: Clarifai)
  92. Impact on Natural Language Processing. POS: Toutanova et al. (2003); C&W 2011. NER: Ando & Zhang (2005); C&W 2011
  93. Impact on Natural Language Processing. Named Entity Recognition:
