ODSC, London. Sep, 2018
Inside the Black Box:
How Does a Neural Network Understand
Names?
Kfir Bar, Chief Scientist, Basis Technology
Named entity recognition (NER)
Automatically find names of people, organizations, locations, and more in text across many languages.
Example: According to Elon Musk, Mars rocket will fly ‘short flights’ next year.
Context is important
Checker shadow illusion (Edward Adelson, neuroscientist, MIT): the squares marked A and B are the same color.
But sometimes it gets ambiguous...
"Can't play Spain? Improve your playing via easy step-by-step video lessons!"
"Mom is a great TV show" (Mom vs. Mother)
NER as a sequence-labeling problem
➔ Processing one word after another
➔ Assigning a label to each word, based on local as well as global features
➔ Labels are B-PER, I-PER, B-LOC, I-LOC, OTHER (O), etc. (a.k.a. IOB)
I/O am/O working/O for/O Basis/B-ORG Technology/I-ORG
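To make the IOB scheme concrete, here is a minimal sketch of how per-word labels could be grouped back into entity spans. The function name `iob_to_spans` is hypothetical and not taken from the talk; it assumes O (or OTHER) marks non-entity words.

```python
def iob_to_spans(tokens, labels):
    """Group IOB labels into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A B- label always opens a new entity.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            # An I- label continues the entity opened by the matching B-.
            current_tokens.append(token)
        else:
            # O / OTHER (or a stray I-) closes any open entity.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


tokens = "I am working for Basis Technology".split()
labels = ["O", "O", "O", "O", "B-ORG", "I-ORG"]
print(iob_to_spans(tokens, labels))  # [('ORG', 'Basis Technology')]
```

The B- prefix is what lets two adjacent entities of the same type stay separate.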
Use multiple engines
Dictionaries
Rule-based engine
AI-based engine
Decisions
Traditional ML vs. Deep Learning
Traditional ML: "I love this movie" -> feature extraction (words, part-of-speech tags, lemmas, Brown clusters) -> vectorization [00010010110000101001…..001] -> modeling -> ☺ Positive
Deep learning: "I love this movie" -> embeddings lookup -> modeling -> ☺ Positive
Embeddings lookup output, one vector per word:
[0.323, -0.3434, 0.901, …, -0.267]
[-0.4923, 0.554, 0.001, …, -0.365]
[1.58845, 0.478, 0.0901, …, -0.171]
…
[-0.0592, 0.588, -0.01, …, -0.111]
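The difference between the two input representations can be sketched in a few lines. This is an illustration only: the slide's traditional vector encodes many hand-built features, while the sketch reduces it to a one-hot word vector, and the embedding table is filled with random numbers standing in for learned values. All names here are hypothetical.

```python
import random

vocab = ["i", "love", "this", "movie"]
word_to_id = {word: i for i, word in enumerate(vocab)}


def one_hot(word):
    """Sparse vector: a single 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[word_to_id[word]] = 1
    return vec


# Hypothetical embedding table: one dense row per word. Real tables are
# learned during training; random values stand in for them here.
random.seed(0)
EMBED_DIM = 4
embeddings = [[random.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in vocab]


def embed(word):
    """Dense vector: a row lookup in the embedding table."""
    return embeddings[word_to_id[word]]


sentence = "I love this movie".lower().split()
sparse = [one_hot(w) for w in sentence]
dense = [embed(w) for w in sentence]
print(sparse[1])                  # [0, 1, 0, 0]
print(len(dense), len(dense[0]))  # 4 4
```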
Word embeddings
Berlin - Germany + Japan = Tokyo
Other words plotted on the slide: German, European, Europe, Africa
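The analogy arithmetic can be tried on toy vectors. The three-dimensional vectors below are invented for illustration (roughly "Germany-ness", "Japan-ness", "capital-ness") and are not real embeddings; `analogy` is a hypothetical helper.

```python
import math

# Invented toy vectors, not trained embeddings.
vectors = {
    "Germany": [1.0, 0.0, 0.0],
    "Berlin":  [1.0, 0.0, 1.0],
    "Japan":   [0.0, 1.0, 0.0],
    "Tokyo":   [0.0, 1.0, 1.0],
    "Europe":  [0.8, 0.0, 0.1],
    "Africa":  [0.0, 0.0, 0.1],
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    # The three query words are excluded from the candidates.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("Berlin", "Germany", "Japan"))  # Tokyo
```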
Feed forward network for NER
Natural Language Processing (Almost) from Scratch (Collobert et al., 2011)
[Diagram: the input words "listen", "to", "while", "I", "Spain" feed Layer 1, Layer 2, and an output layer that scores labels such as B-PER, I-PER, B-LOC, ...]
Recurrent neural network (RNN)
[Diagram: the input words "listen", "to", "while", "I", "Spain" pass one at a time through a single recurrent layer (Layer 1) into an output layer over the labels B-PER, I-PER, B-LOC, ...]
➔ At each time step we process one word concatenated with the output from the previous time step
➔ It can carry information across many time steps
[Diagram: the network unrolled over time steps t-1, t, t+1, emitting B-PER, I-PER, OTHER]
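The recurrence described in the bullets can be written as a single step function applied word by word. This is a minimal sketch assuming a plain tanh RNN with random, untrained weights; the dimensions and names (`rnn_step`, `run_rnn`) are hypothetical.

```python
import math
import random

random.seed(0)
INPUT_DIM, HIDDEN_DIM = 3, 4


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# One weight matrix applied to the concatenation [word vector ; previous output].
W = rand_matrix(HIDDEN_DIM, INPUT_DIM + HIDDEN_DIM)
b = [0.0] * HIDDEN_DIM


def rnn_step(x, h_prev):
    """Process one word vector together with the previous step's output."""
    concat = x + h_prev  # list concatenation
    return [
        math.tanh(sum(w * v for w, v in zip(row, concat)) + bias)
        for row, bias in zip(W, b)
    ]


def run_rnn(inputs):
    h = [0.0] * HIDDEN_DIM  # nothing has been seen before the first word
    states = []
    for x in inputs:
        h = rnn_step(x, h)
        states.append(h)
    return states


# Made-up word vectors for a five-word sentence.
sentence = [[random.uniform(-1, 1) for _ in range(INPUT_DIM)] for _ in range(5)]
states = run_rnn(sentence)
print(len(states), len(states[0]))  # 5 4
```

In a tagger, each state would then go through an output layer that scores the labels for that word.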
Long Short-Term Memory (LSTM)
It can forget information when necessary
[Diagram: three LSTM cells unrolled over time steps t-1, t, t+1, emitting B-PER, I-PER, OTHER]
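The "forgetting" comes from a gate that scales the previous cell state. The sketch below follows the commonly used LSTM formulation with forget, input, and output gates; the weights are random and untrained, biases are omitted for brevity, and all names are hypothetical.

```python
import math
import random

random.seed(0)
INPUT_DIM, HIDDEN_DIM = 3, 4


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


# One weight matrix per gate, each applied to [x ; h_prev].
W_f, W_i, W_o, W_g = (rand_matrix(HIDDEN_DIM, INPUT_DIM + HIDDEN_DIM) for _ in range(4))


def lstm_step(x, h_prev, c_prev):
    v = x + h_prev
    f = [sigmoid(z) for z in matvec(W_f, v)]    # forget gate: how much old memory to keep
    i = [sigmoid(z) for z in matvec(W_i, v)]    # input gate: how much new content to write
    o = [sigmoid(z) for z in matvec(W_o, v)]    # output gate: how much memory to expose
    g = [math.tanh(z) for z in matvec(W_g, v)]  # candidate content
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


h, c = [0.0] * HIDDEN_DIM, [0.0] * HIDDEN_DIM
for _ in range(3):  # three time steps over made-up word vectors
    x = [random.uniform(-1, 1) for _ in range(INPUT_DIM)]
    h, c = lstm_step(x, h, c)
print(len(h), len(c))  # 4 4
```

A forget-gate value near 0 wipes that coordinate of the cell state; a value near 1 keeps it.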
LSTM for Sequence Labeling
Washington/B-PER said/OTHER in/OTHER Chicago/B-LOC last/OTHER ...
[Diagram: one LSTM cell per word, chained left to right, each emitting the label for its word]
Bidirectional LSTM for Sequence Labeling
Bidirectional LSTM-CRF Models for Sequence Tagging (Huang et al., 2015)
Washington/B-PER said/OTHER in/OTHER Chicago/B-LOC last/OTHER ...
[Diagram: each word feeds two LSTM cells, one in a left-to-right chain and one in a right-to-left chain; their outputs are combined (+) to predict the label]
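The bidirectional idea is independent of the cell type: run the same kind of recurrence once left to right and once right to left, then combine the two states at each position. In this sketch a simple decaying sum stands in for the LSTM cell, and the slide's "+" is read as pairing the two states (concatenation); both choices are assumptions made for illustration.

```python
def run_recurrence(inputs, step, initial_state):
    """Apply `step` along the sequence, collecting the state after each item."""
    state, states = initial_state, []
    for x in inputs:
        state = step(x, state)
        states.append(state)
    return states


def bidirectional(inputs, step, initial_state):
    forward = run_recurrence(inputs, step, initial_state)
    # Run over the reversed sequence, then flip back so positions line up.
    backward = run_recurrence(inputs[::-1], step, initial_state)[::-1]
    return [(f, b) for f, b in zip(forward, backward)]


def decay_step(x, state):
    """Stand-in for an LSTM cell: half of the old state plus the new input."""
    return 0.5 * state + x


print(bidirectional([1.0, 0.0, 0.0], decay_step, 0.0))
# [(1.0, 1.0), (0.5, 0.0), (0.25, 0.0)]
```

The forward state at each position reflects the words to its left, the backward state the words to its right, so the label decision can use context from both sides.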
Multilayer LSTM for Sequence Labeling
Washington/B-PER said/OTHER in/OTHER Chicago/B-LOC last/OTHER
[Diagram: bidirectional LSTM layers stacked on top of each other; the combined (+) outputs of one layer feed the next, and the top layer predicts the labels. The stack is shown first with two layers, then with three.]
Alternative decoding using Conditional Random Fields (CRF)
[Diagram: the bidirectional LSTM no longer labels each word on its own. For each of the words Washington, said, in, Chicago, last it scores every candidate label (B-PER, I-PER, B-LOC, I-LOC, OTHER), and the CRF chooses the label sequence B-PER OTHER OTHER B-LOC OTHER]
Decoding with CRF
The global score of a specific sequence of labels
[Diagram: a lattice with the five candidate labels (B-PER, I-PER, B-LOC, I-LOC, OTHER) at each of the five word positions; a label sequence is one path through the lattice]
T[O, I-PER] < T[B-PER, I-PER], where T is the transition score from one label to the next
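One common way to define such a global score is to add up the per-word label scores and the transition scores between neighboring labels. The sketch below assumes that formulation; the slide does not spell out the formula, and every number here is invented for illustration.

```python
LABELS = ["B-PER", "I-PER", "B-LOC", "I-LOC", "O"]

# Transition scores T[prev, next]; everything is neutral except the two
# entries compared on the slide (values invented).
transitions = {(a, b): 0.0 for a in LABELS for b in LABELS}
transitions[("O", "I-PER")] = -5.0     # I-PER right after O is discouraged
transitions[("B-PER", "I-PER")] = 2.0  # I-PER right after B-PER is encouraged


def sequence_score(emissions, transitions, labels):
    """Per-word label scores plus label-to-label transition scores."""
    score = sum(emissions[t][label] for t, label in enumerate(labels))
    score += sum(transitions[(prev, nxt)] for prev, nxt in zip(labels, labels[1:]))
    return score


# Invented per-word scores for a two-word name.
emissions = [
    {"B-PER": 2.0, "I-PER": 0.0, "B-LOC": 0.0, "I-LOC": 0.0, "O": 1.0},
    {"B-PER": 0.0, "I-PER": 1.5, "B-LOC": 0.0, "I-LOC": 0.0, "O": 0.0},
]
print(sequence_score(emissions, transitions, ["B-PER", "I-PER"]))  # 5.5
print(sequence_score(emissions, transitions, ["O", "I-PER"]))      # -2.5
```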
[Diagram: the same label lattice; argmax picks the highest-scoring label sequence]
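The argmax over all label sequences is usually computed with the Viterbi algorithm rather than by enumerating every path. The sketch below assumes the score is a sum of per-word label scores and transition scores; the three-label set and all numbers are invented so the example stays small.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence and its score."""
    # best[t][y]: best score of any path that ends in label y at position t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[(p, y)])
            scores[y] = best[-1][prev] + transitions[(prev, y)] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    last = max(labels, key=lambda y: best[-1][y])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1], best[-1][last]


LABELS = ["B-PER", "I-PER", "O"]
transitions = {(a, b): 0.0 for a in LABELS for b in LABELS}
transitions[("O", "I-PER")] = -5.0
transitions[("B-PER", "I-PER")] = 1.0

# Invented scores; taken word by word, the first word would be labeled O.
emissions = [
    {"B-PER": 1.0, "I-PER": 0.0, "O": 1.2},
    {"B-PER": 0.0, "I-PER": 2.0, "O": 0.5},
    {"B-PER": 0.0, "I-PER": 0.0, "O": 2.0},
]
print(viterbi(emissions, transitions, LABELS))
# (['B-PER', 'I-PER', 'O'], 6.0)
```

Picking the best label per word independently would give O, I-PER, O; scoring the whole sequence avoids that inconsistent output.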
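Character encoding
[Diagram: each word is also read character by character (for example s a i d); this character-level encoding is combined (+) with the word's own vector before the bidirectional LSTM, which predicts B-PER OTHER OTHER B-LOC OTHER for "Washington said in Chicago last ..."]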
Character encoding results
              English   Arabic   Korean
BiLSTM           83.5     80.3     82.3
BiLSTM+Char      85.1     82.5     86.0
*Results are F scores measured over Basis’ evaluation set
Char encode, word encode, decode
Washington said in Chicago last... -> Char encoding -> Word encoding -> Decoding -> Labels
Reported combinations
                          Char encoder   Word encoder   Decoder
Collobert et al. (2011)   None           CNN            CRF
Mesnil et al. (2013)      None           RNN            RNN
Nguyen et al. (2016)      None           RNN            GRU
Huang et al. (2015)       None           LSTM           CRF
Lample et al. (2016)      LSTM           LSTM           CRF
Chiu & Nichols (2016)     CNN            LSTM           CRF
Zhai et al. (2017)        CNN            LSTM           LSTM
Yang et al. (2016)        GRU            GRU            CRF
Strubell et al. (2017)    None           Dilated CNN    CRF
Shen et al. (2018)        CNN            CNN            LSTM
Borrowed from Shen et al. (2018)
What does LSTM actually learn?
The dying algorithm - predicts death for oncological patients
By Siddhartha Mukherjee, Jan 2018
“Here is the strange rub of such a deep learning system: It learns, but it cannot tell us why it has learned… ...the algorithm looks vacantly at us when we ask, Why? It is, like death, another black box.”
Bidirectional LSTM for NER
Washington/B-PER said/OTHER in/OTHER Chicago/B-LOC last/OTHER ...
[Diagram: the bidirectional LSTM tagger, with the cell vector of one LSTM chain singled out]
Let’s look at this cell vector over time
What does LSTM actually learn?
Neuron 280 - gets positive around some punctuation marks
Neuron 189 - gets negative around potential locations
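Looking at one neuron over time amounts to reading a single coordinate of the cell vector at every word. A minimal sketch of that bookkeeping follows; the cell states are invented three-dimensional placeholders, not activations from the model in the talk, and the helper names are hypothetical.

```python
def neuron_trace(tokens, cell_states, neuron):
    """Pair each token with one coordinate of the cell vector at that step."""
    return [(token, state[neuron]) for token, state in zip(tokens, cell_states)]


def extremes(trace):
    """Return the (token, value) pairs with the highest and lowest activation."""
    most_positive = max(trace, key=lambda pair: pair[1])
    most_negative = min(trace, key=lambda pair: pair[1])
    return most_positive, most_negative


# Invented cell states (3 neurons per step), for illustration only.
tokens = ["Washington", "said", "in", "Chicago", "last"]
cell_states = [
    [0.1, -0.2, 0.3],
    [0.0, 0.4, 0.1],
    [0.2, 0.1, -0.1],
    [0.1, -0.9, 0.2],
    [0.3, 0.2, 0.0],
]
trace = neuron_trace(tokens, cell_states, neuron=1)
print(extremes(trace))  # (('said', 0.4), ('Chicago', -0.9))
```

Collecting such traces over many sentences is one way to spot neurons that react to punctuation or to likely locations.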
Thank you!
Questions?
kfir@basistech.com
@kfirbar
