SlideShare ist ein Scribd-Unternehmen logo
1 von 26
Downloaden Sie, um offline zu lesen
Deep Learning forDeep Learning for
NLP ApplicationsNLP Applications
What is DeepWhat is Deep
Learning?Learning?
Just a NeuralJust a Neural
Network!Network!
"Deep learning" refers to
Deep Neural Networks
A Deep Neural Network is
simply a Neural Network
with multiple hidden layers
Neural Networks have
been around since the
1970s
So why now?So why now?
LargeLarge
Networks areNetworks are
hard to trainhard to train
Vanishing gradients make
backpropagation harder
Overfitting becomes a
serious issue
So we settled (for the time
being) with simpler, more
useful variations of Neural
Networks
Then,Then,
suddenly ...suddenly ...
We realized we can stack
these simpler Neural
Networks, making them
easier to train
We derived more efficient
parameter estimation and
model regularization
methods
Also, Moore's law kicked in
and GPU computation
became viable
So what's the bigSo what's the big
deal?deal?
MASSIVE improvements inMASSIVE improvements in
Computer VisionComputer Vision
Speech RecognitionSpeech Recognition
Baidu (with Andrew Ng as their chief) has built a state-
of-the-art speech recognition system with Deep
Learning
Their dataset: 7000 hours of conversation couple with
background noise synthesis for a total of 100,000
hours
They processed this through a massive GPU cluster
Cross DomainCross Domain
RepresentationsRepresentations
What if you wanted to take
an image and generate a
description of it?
The beauty of
representation learning is
it's ability to be distributed
across tasks
This is the real power of
Neural Networks
But Samiur, whatBut Samiur, what
about NLP?about NLP?
Deep Learning NLPDeep Learning NLP
Distributed word representations
Dependency Parsing
Sentiment Analysis
And many others ...
StandardStandard
Bag of Words
A one-hot encoding
20k to 50k dimensions
Can be improved by
factoring in document
frequency
Word embeddingWord embedding
Neural Word embeddings
Uses a vector space
that attempts to
predict a word given a
context window
200-400 dimensions
motel [0.06, -0.01, 0.13, 0.07, -0.06, -0.04, 0, -0.04]
hotel [0.07, -0.03, 0.07, 0.06, -0.06, -0.03, 0.01, -0.05]
Word RepresentationsWord Representations
Word embeddings make semantic similarity and
synonyms possible
Word embeddings have coolWord embeddings have cool
properties:properties:
Dependency ParsingDependency Parsing
Converting sentences to a
dependency based grammar
Simplifying this to the verbs
and it's agents is called
Semantic Role Labeling
SentimentSentiment
AnalysisAnalysis
Recursive Neural Networks
Can model tree
structures very well
This makes it great for
other NLP tasks too
(such as parsing)
Get to theGet to the
applications partapplications part
already!already!
ToolsTools
Python
Theano/PyLearn2
Gensim (for word2vec)
nolearn (uses scikit-learn
Java/Clojure/Scala
DeepLearning4j
neuralnetworks by Ivan Vasilev
APIs
Alchemy API
Meta Mind
Problem: Funding SentenceProblem: Funding Sentence
ClassifierClassifier
Build a binary classifier that is able to take any sentence
from a news article and tell if it's about funding or not.
eg. "Mattermark is today announcing that it has raised a
round of $6.5 million"
Word VectorsWord Vectors
Used Gensim's Word2Vec implementation to train
unsupervised word vectors on the UMBC Webbase
Corpus (~100M documents, ~48GB of text)
Then, iterated 20 times on text in news articles in the
tech news domain (~1M documents, ~300MB of text)
Sentence VectorsSentence Vectors
How can you compose word vectors to make
sentence vectors?
Use paragraph vector model proposed by Quoc Le
Feed into an RNN constructed by a dependency
tree of the sentence
Use some heuristic function to combine the string
of word vectors
What did we try?What did we try?
TF-IDF + Naive Bayes
Word2Vec + Composition Methods
Word2Vec + TF-IDF + Composition Methods
Word2Vec + TF-IDF + Semantic Role Labeling (SRL) +
Composition Methods
Composition MethodsComposition Methods
Where wi represents the i'th word vector,
wv the word vector for the verb, and a0 and a1 are
agents
What worked?What worked?
Word2Vec + TFIDF + SRL + Circular Convolution
/Additive
The first method with simple TFIDF/Naive Bayes
performed extremely poorly because of it's large
dimensionality
Combining TFIDF with Word2Vec provided a small,
but noticeable improvement
Adding SRL and a more sophisticated composition
method increased performance by almost 5%
What else could we try?What else could we try?
Can we apply this method to generate general
purpose document vectors?
We are currently using LDA (a topic analysis
method) or simple TFIDF to create document
vectors
How will this method compare to the already
proposed paragraph vector method by Quoc Le?
Can we associate these document vectors with much
smaller query strings?
eg. Search for artificial intelligence against our
companies and get better results than keyword
search
Who's doing ML atWho's doing ML at
Mattermark?Mattermark?
mattermark
We need more people! Refer anyone that
you know that does Data Science/ML

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
 
Visual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on LanguageVisual-Semantic Embeddings: some thoughts on Language
Visual-Semantic Embeddings: some thoughts on Language
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed models
 
NLP Bootcamp
NLP BootcampNLP Bootcamp
NLP Bootcamp
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
Practical Deep Learning for NLP
Practical Deep Learning for NLP Practical Deep Learning for NLP
Practical Deep Learning for NLP
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
 
Zero shot learning through cross-modal transfer
Zero shot learning through cross-modal transferZero shot learning through cross-modal transfer
Zero shot learning through cross-modal transfer
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
 
Recurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas MikolovRecurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas Mikolov
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLP
 
Talk from NVidia Developer Connect
Talk from NVidia Developer ConnectTalk from NVidia Developer Connect
Talk from NVidia Developer Connect
 
NLP from scratch
NLP from scratch NLP from scratch
NLP from scratch
 
Sentence representations and question answering (YerevaNN)
Sentence representations and question answering (YerevaNN)Sentence representations and question answering (YerevaNN)
Sentence representations and question answering (YerevaNN)
 
Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question Answering
 
Deep Learning for Natural Language Processing
Deep Learning for Natural Language ProcessingDeep Learning for Natural Language Processing
Deep Learning for Natural Language Processing
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
 
[KDD 2018 tutorial] End to-end goal-oriented question answering systems
[KDD 2018 tutorial] End to-end goal-oriented question answering systems[KDD 2018 tutorial] End to-end goal-oriented question answering systems
[KDD 2018 tutorial] End to-end goal-oriented question answering systems
 

Andere mochten auch

Query Linguistic Intent Detection
Query Linguistic Intent DetectionQuery Linguistic Intent Detection
Query Linguistic Intent Detection
butest
 
Cognitive Modeling & Intelligent Tutors
Cognitive Modeling & Intelligent TutorsCognitive Modeling & Intelligent Tutors
Cognitive Modeling & Intelligent Tutors
Cody Ray
 
Neural Network Classification and its Applications in Insurance Industry
Neural Network Classification and its Applications in Insurance IndustryNeural Network Classification and its Applications in Insurance Industry
Neural Network Classification and its Applications in Insurance Industry
Inderjeet Singh
 
Application of machine learning in industrial applications
Application of machine learning in industrial applicationsApplication of machine learning in industrial applications
Application of machine learning in industrial applications
Anish Das
 
Soft computing (ANN and Fuzzy Logic) : Dr. Purnima Pandit
Soft computing (ANN and Fuzzy Logic)  : Dr. Purnima PanditSoft computing (ANN and Fuzzy Logic)  : Dr. Purnima Pandit
Soft computing (ANN and Fuzzy Logic) : Dr. Purnima Pandit
Purnima Pandit
 

Andere mochten auch (20)

Query Linguistic Intent Detection
Query Linguistic Intent DetectionQuery Linguistic Intent Detection
Query Linguistic Intent Detection
 
Sentiment Polarity Analysis for Generating Search Result Snippets based on Pa...
Sentiment Polarity Analysis for Generating Search Result Snippets based on Pa...Sentiment Polarity Analysis for Generating Search Result Snippets based on Pa...
Sentiment Polarity Analysis for Generating Search Result Snippets based on Pa...
 
IRE Semantic Annotation of Documents
IRE Semantic Annotation of Documents IRE Semantic Annotation of Documents
IRE Semantic Annotation of Documents
 
Neural Network Applications In Machining: A Review
Neural Network Applications In Machining: A ReviewNeural Network Applications In Machining: A Review
Neural Network Applications In Machining: A Review
 
Logica | Intelligent Self learning - a helping hand in financial crime
Logica | Intelligent Self learning - a helping hand in financial crimeLogica | Intelligent Self learning - a helping hand in financial crime
Logica | Intelligent Self learning - a helping hand in financial crime
 
An Optimal Iterative Algorithm for Extracting MUCs in a Black-box Constraint ...
An Optimal Iterative Algorithm for Extracting MUCs in a Black-box Constraint ...An Optimal Iterative Algorithm for Extracting MUCs in a Black-box Constraint ...
An Optimal Iterative Algorithm for Extracting MUCs in a Black-box Constraint ...
 
Black Box Methods for Inferring Parallel Applications' Properties in Virtual ...
Black Box Methods for Inferring Parallel Applications' Properties in Virtual ...Black Box Methods for Inferring Parallel Applications' Properties in Virtual ...
Black Box Methods for Inferring Parallel Applications' Properties in Virtual ...
 
Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...Application of machine learning and cognitive computing in intrusion detectio...
Application of machine learning and cognitive computing in intrusion detectio...
 
Cognitive Modeling & Intelligent Tutors
Cognitive Modeling & Intelligent TutorsCognitive Modeling & Intelligent Tutors
Cognitive Modeling & Intelligent Tutors
 
Project based learning methodologies for Embedded Systems and Intelligent Sys...
Project based learning methodologies for Embedded Systems and Intelligent Sys...Project based learning methodologies for Embedded Systems and Intelligent Sys...
Project based learning methodologies for Embedded Systems and Intelligent Sys...
 
Ai and neural networks
Ai and neural networksAi and neural networks
Ai and neural networks
 
word2vec - From theory to practice
word2vec - From theory to practiceword2vec - From theory to practice
word2vec - From theory to practice
 
Neural
NeuralNeural
Neural
 
Representation Learning of Vectors of Words and Phrases
Representation Learning of Vectors of Words and PhrasesRepresentation Learning of Vectors of Words and Phrases
Representation Learning of Vectors of Words and Phrases
 
[SmartNews] Globally Scalable Web Document Classification Using Word2Vec
[SmartNews] Globally Scalable Web Document Classification Using Word2Vec[SmartNews] Globally Scalable Web Document Classification Using Word2Vec
[SmartNews] Globally Scalable Web Document Classification Using Word2Vec
 
Home Automation: Design and Construction of an intelligent design for Cooling...
Home Automation: Design and Construction of an intelligent design for Cooling...Home Automation: Design and Construction of an intelligent design for Cooling...
Home Automation: Design and Construction of an intelligent design for Cooling...
 
Neural Network Classification and its Applications in Insurance Industry
Neural Network Classification and its Applications in Insurance IndustryNeural Network Classification and its Applications in Insurance Industry
Neural Network Classification and its Applications in Insurance Industry
 
Application of machine learning in industrial applications
Application of machine learning in industrial applicationsApplication of machine learning in industrial applications
Application of machine learning in industrial applications
 
Word2vec algorithm
Word2vec algorithmWord2vec algorithm
Word2vec algorithm
 
Soft computing (ANN and Fuzzy Logic) : Dr. Purnima Pandit
Soft computing (ANN and Fuzzy Logic)  : Dr. Purnima PanditSoft computing (ANN and Fuzzy Logic)  : Dr. Purnima Pandit
Soft computing (ANN and Fuzzy Logic) : Dr. Purnima Pandit
 

Ähnlich wie Deep Learning for NLP Applications

DataChat_FinalPaper
DataChat_FinalPaperDataChat_FinalPaper
DataChat_FinalPaper
Urjit Patel
 

Ähnlich wie Deep Learning for NLP Applications (20)

Challenges in transfer learning in nlp
Challenges in transfer learning in nlpChallenges in transfer learning in nlp
Challenges in transfer learning in nlp
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
From Linked Data to Semantic Applications
From Linked Data to Semantic ApplicationsFrom Linked Data to Semantic Applications
From Linked Data to Semantic Applications
 
Video+Language: From Classification to Description
Video+Language: From Classification to DescriptionVideo+Language: From Classification to Description
Video+Language: From Classification to Description
 
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
 
DataChat_FinalPaper
DataChat_FinalPaperDataChat_FinalPaper
DataChat_FinalPaper
 
Envisioning the Future of Language Workbenches
Envisioning the Future of Language WorkbenchesEnvisioning the Future of Language Workbenches
Envisioning the Future of Language Workbenches
 
228-SE3001_2
228-SE3001_2228-SE3001_2
228-SE3001_2
 
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
IRJET- Survey on Deep Learning Approaches for Phrase Structure Identification...
 
Tutorial on word2vec
Tutorial on word2vecTutorial on word2vec
Tutorial on word2vec
 
Chatbot_Presentation
Chatbot_PresentationChatbot_Presentation
Chatbot_Presentation
 
Deep learning for NLP
Deep learning for NLPDeep learning for NLP
Deep learning for NLP
 
White paper - Job skills extraction with LSTM and Word embeddings - Nikita Sh...
White paper - Job skills extraction with LSTM and Word embeddings - Nikita Sh...White paper - Job skills extraction with LSTM and Word embeddings - Nikita Sh...
White paper - Job skills extraction with LSTM and Word embeddings - Nikita Sh...
 
Bridging the gap between AI and UI - DSI Vienna - full version
Bridging the gap between AI and UI - DSI Vienna - full versionBridging the gap between AI and UI - DSI Vienna - full version
Bridging the gap between AI and UI - DSI Vienna - full version
 
Importance Of Being Driven
Importance Of Being DrivenImportance Of Being Driven
Importance Of Being Driven
 
Video + Language 2019
Video + Language 2019Video + Language 2019
Video + Language 2019
 
Video + Language
Video + LanguageVideo + Language
Video + Language
 
Word Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology ClassesWord Tagging with Foundational Ontology Classes
Word Tagging with Foundational Ontology Classes
 
Class Diagram Extraction from Textual Requirements Using NLP Techniques
Class Diagram Extraction from Textual Requirements Using NLP TechniquesClass Diagram Extraction from Textual Requirements Using NLP Techniques
Class Diagram Extraction from Textual Requirements Using NLP Techniques
 
D017232729
D017232729D017232729
D017232729
 

Kürzlich hochgeladen

scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
HenryBriggs2
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
mphochane1998
 

Kürzlich hochgeladen (20)

Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
scipt v1.pptxcxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...
 
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptxS1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
S1S2 B.Arch MGU - HOA1&2 Module 3 -Temple Architecture of Kerala.pptx
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments""Lesotho Leaps Forward: A Chronicle of Transformative Developments"
"Lesotho Leaps Forward: A Chronicle of Transformative Developments"
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 

Deep Learning for NLP Applications

  • 1. Deep Learning forDeep Learning for NLP ApplicationsNLP Applications
  • 2. What is DeepWhat is Deep Learning?Learning?
  • 3. Just a NeuralJust a Neural Network!Network! "Deep learning" refers to Deep Neural Networks A Deep Neural Network is simply a Neural Network with multiple hidden layers Neural Networks have been around since the 1970s
  • 4. So why now?So why now?
  • 5. LargeLarge Networks areNetworks are hard to trainhard to train Vanishing gradients make backpropagation harder Overfitting becomes a serious issue So we settled (for the time being) with simpler, more useful variations of Neural Networks
  • 6. Then,Then, suddenly ...suddenly ... We realized we can stack these simpler Neural Networks, making them easier to train We derived more efficient parameter estimation and model regularization methods Also, Moore's law kicked in and GPU computation became viable
  • 7. So what's the bigSo what's the big deal?deal?
  • 8. MASSIVE improvements inMASSIVE improvements in Computer VisionComputer Vision
  • 9. Speech RecognitionSpeech Recognition Baidu (with Andrew Ng as their chief) has built a state- of-the-art speech recognition system with Deep Learning Their dataset: 7000 hours of conversation couple with background noise synthesis for a total of 100,000 hours They processed this through a massive GPU cluster
  • 10. Cross DomainCross Domain RepresentationsRepresentations What if you wanted to take an image and generate a description of it? The beauty of representation learning is it's ability to be distributed across tasks This is the real power of Neural Networks
  • 11. But Samiur, whatBut Samiur, what about NLP?about NLP?
  • 12. Deep Learning NLPDeep Learning NLP Distributed word representations Dependency Parsing Sentiment Analysis And many others ...
  • 13. StandardStandard Bag of Words A one-hot encoding 20k to 50k dimensions Can be improved by factoring in document frequency Word embeddingWord embedding Neural Word embeddings Uses a vector space that attempts to predict a word given a context window 200-400 dimensions motel [0.06, -0.01, 0.13, 0.07, -0.06, -0.04, 0, -0.04] hotel [0.07, -0.03, 0.07, 0.06, -0.06, -0.03, 0.01, -0.05] Word RepresentationsWord Representations Word embeddings make semantic similarity and synonyms possible
  • 14. Word embeddings have coolWord embeddings have cool properties:properties:
  • 15. Dependency ParsingDependency Parsing Converting sentences to a dependency based grammar Simplifying this to the verbs and it's agents is called Semantic Role Labeling
  • 16. SentimentSentiment AnalysisAnalysis Recursive Neural Networks Can model tree structures very well This makes it great for other NLP tasks too (such as parsing)
  • 17. Get to theGet to the applications partapplications part already!already!
  • 18. ToolsTools Python Theano/PyLearn2 Gensim (for word2vec) nolearn (uses scikit-learn Java/Clojure/Scala DeepLearning4j neuralnetworks by Ivan Vasilev APIs Alchemy API Meta Mind
  • 19. Problem: Funding SentenceProblem: Funding Sentence ClassifierClassifier Build a binary classifier that is able to take any sentence from a news article and tell if it's about funding or not. eg. "Mattermark is today announcing that it has raised a round of $6.5 million"
  • 20. Word VectorsWord Vectors Used Gensim's Word2Vec implementation to train unsupervised word vectors on the UMBC Webbase Corpus (~100M documents, ~48GB of text) Then, iterated 20 times on text in news articles in the tech news domain (~1M documents, ~300MB of text)
  • 21. Sentence VectorsSentence Vectors How can you compose word vectors to make sentence vectors? Use paragraph vector model proposed by Quoc Le Feed into an RNN constructed by a dependency tree of the sentence Use some heuristic function to combine the string of word vectors
  • 22. What did we try?What did we try? TF-IDF + Naive Bayes Word2Vec + Composition Methods Word2Vec + TF-IDF + Composition Methods Word2Vec + TF-IDF + Semantic Role Labeling (SRL) + Composition Methods
  • 23. Composition MethodsComposition Methods Where wi represents the i'th word vector, wv the word vector for the verb, and a0 and a1 are agents
  • 24. What worked?What worked? Word2Vec + TFIDF + SRL + Circular Convolution /Additive The first method with simple TFIDF/Naive Bayes performed extremely poorly because of it's large dimensionality Combining TFIDF with Word2Vec provided a small, but noticeable improvement Adding SRL and a more sophisticated composition method increased performance by almost 5%
  • 25. What else could we try?What else could we try? Can we apply this method to generate general purpose document vectors? We are currently using LDA (a topic analysis method) or simple TFIDF to create document vectors How will this method compare to the already proposed paragraph vector method by Quoc Le? Can we associate these document vectors with much smaller query strings? eg. Search for artificial intelligence against our companies and get better results than keyword search
  • 26. Who's doing ML atWho's doing ML at Mattermark?Mattermark? mattermark We need more people! Refer anyone that you know that does Data Science/ML