SlideShare ist ein Scribd-Unternehmen logo
1 von 35
Successes and Challenges
Emily Pitler, Google AI
Representations
from Natural
Language Data
State-of-the-art in Natural Language Understanding in 2017
→ → Custom Recurrent Architectures
P 2
Oct. 2018: One Model with Task-specific Tuning in Minutes
P 3
BERT: Bidirectional Encoder Representations from Transformers
P 4
https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
Transformers: Attention is All You Need
P 5Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin, NIPS 2017
From One-Hot Vectors to Word Embeddings &
Self-Attention
P 6
animal...street...it
0000…10001…01000…0
one-hot
1.4…3.74.9…6.42.5…8.0
embedding
The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
From One-Hot Vectors to Word Embeddings &
Self-Attention
P 7
animal...street...it
0000…10001…01000…0
one-hot
1.4…3.74.9…6.42.5…8.0
embedding
The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
query, key, value
From One-Hot Vectors to Word Embeddings &
Self-Attention
P 8
animal...street...it
0000…10001…01000…0
one-hot
1.4…3.74.9…6.42.5…8.0
embedding
The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
query, key, value
From One-Hot Vectors to Word Embeddings &
Self-Attention
P 9
animal...street...it
0000…10001…01000…0
one-hot
0.1
0.2
0.7
(self-)attention
1.4…3.74.9…6.42.5…8.0
embedding
The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
query, key, value
From One-Hot Vectors to Word Embeddings &
Self-Attention
P 10
animal...street...it
0000…10001…01000…0
one-hot
0.1
0.2
0.7
(self-)attention
1.4…3.74.9…6.42.5…8.0
embedding
The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
OpenAI: Generative Pretraining
The animal tired Acceptable
<s> the … too <s> the … tired
P 11
Transformer Transformer Transformer
Transformer TransformerTransformer
Transformer Transformer Transformer
Transformer TransformerTransformer
Understanding Can Need “Future” Information
How far is Jacksonville from Miami?
Jacksonville is in the First Coast region of northeast Florida and is centered on the
banks of the St. Johns River, about 25 miles (40 km) south of the Georgia state line
and about 340 miles (550 km) north of Miami.
VERB NOUN
Mark which area you want to distress. Mark, which area do you want to distress?
P 12
Naive Bidirectionality: Words Can “See Themselves”
The animal tired The animal tired
<s> the … too <s> the … too
P 13
Transformer Transformer Transformer
Transformer TransformerTransformer
Transformer Transformer Transformer
Transformer TransformerTransformer
Training BERT
Masked Language Model (Fill-in-the-blank)
Deep learning (also [MASK] [MASK] deep structured learning or [MASK]
learning) is part of a broader family of machine learning methods
[MASK] on [MASK] data representations, as opposed to task-specific
algorithms.
[MASK] is allergic to peaches. Is
P 14https://en.wikipedia.org/wiki/Deep_learning
https://en.wikipedia.org/wiki/Daniel_Tiger%27s_Neighborhood
BooksCorpus: Zhu, Kiros, Zemel, Salakhutdinov, Urtasun, Torralba, Fidler, CVPR 2015
Basic BERT Recipe
P 15
Basic BERT Recipe
P 16
Basic BERT Recipe
P 17
SWAG: Zellers, Bisk, Schwartz, Choi, EMNLP 2018 SQuAD: Rajpurkar, Zhang, Lopyrev, Liang, EMNLP 2016
Results: Commonsense Reasoning and Question Answering
P 18
MRPC: Dolan and Brockett, IWP 2005
Pretraining Tasks Matter...and Bigger = Better *
P 19
Do I Need Full BERT Models for All My Tasks?
P 20Houlsby, Giurgiu, Jastrzebski, Morrone, de Laroussilhe, Gesmundo, Attariyan, Gelly, arxiv Feb 2019
Try It Out, Get Faster Training with TPUs
P 21
Mismatches between Training and
Realistic Inputs
Two Case Studies: Mixed Language Text and Identifying Commands
P 22
P 23
Multiple Languages: Frequent “In-the-Wild”, Rare in Training
P 24
Multiple Languages: Frequent “In-the-Wild”, Rare in Training
“A Fast, Compact, Accurate Model for Language Identification
of Codemixed Text”
Zhang, Riesa , Gillick , Bakalov, Baldridge, Weiss, EMNLP 2018 P 25
Accuracy and Speed of Token-level Language Id
P 26
Accuracy and Speed of Token-level Language Id
P 27
Useful Preprocessing Step Across Tasks
P 28
Mismatches between Training and
Realistic Inputs
Two Case Studies: Mixed Language Text and Identifying Commands
P 29
Noun-Verb Ambiguity
“lives” / Noun → /laIvz/
“lives” / Verb → /lIvz/
flies
NOUN
Mark VERB
P 30Elkahky, Webster, Andor, Pitler, EMNLP 2018
Certain insects can damage plumerias, such as mites, flies, or aphids. NOUN
Mark which area you want to distress. VERB
P 31
“A Challenge Set and Methods for Noun-Verb Ambiguity”,
EMNLP 2018
Accuracy on Noun-Verb Disambiguation
P 32
Pronunciation of Homographs Accuracy
P 33
Mark VERB
Webster, Recasens, Axelrod, Baldridge, TACL 2019 Kwiatkowski, Palomaki, Redfield, Collins, Parikh, Alberti, Epstein, Polosukhin, Kelcey, Devlin, Lee, Toutanova, Jones, Chang, Dai, Uszkoreit, Le, Petrov, TACL 2019
Released Datasets with “In-the-Wild” Natural Challenges
P 34
Summary
P 35

Weitere ähnliche Inhalte

Ähnlich wie Emily Pitler - Representations from Natural Language Data: Successes and Challenges

ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015RIILP
 
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and ApplicationsICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and ApplicationsForward Gradient
 
Word2vec: From intuition to practice using gensim
Word2vec: From intuition to practice using gensimWord2vec: From intuition to practice using gensim
Word2vec: From intuition to practice using gensimEdgar Marca
 
MEBI 591C/598 – Data and Text Mining in Biomedical Informatics
MEBI 591C/598 – Data and Text Mining in Biomedical InformaticsMEBI 591C/598 – Data and Text Mining in Biomedical Informatics
MEBI 591C/598 – Data and Text Mining in Biomedical Informaticsbutest
 
State of NLP and Amazon Comprehend
State of NLP and Amazon ComprehendState of NLP and Amazon Comprehend
State of NLP and Amazon ComprehendEgor Pushkin
 
Biemann ibm cog_comp_jan2015_noanim
Biemann ibm cog_comp_jan2015_noanimBiemann ibm cog_comp_jan2015_noanim
Biemann ibm cog_comp_jan2015_noanimdiannepatricia
 
ELKL 5 Language documentation for linguistics and technology
ELKL 5 Language documentation for linguistics and technologyELKL 5 Language documentation for linguistics and technology
ELKL 5 Language documentation for linguistics and technologyDafydd Gibbon
 
Portuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and HowPortuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and HowValeria de Paiva
 
Babak Rasolzadeh: The importance of entities
Babak Rasolzadeh: The importance of entitiesBabak Rasolzadeh: The importance of entities
Babak Rasolzadeh: The importance of entitiesZoltan Varju
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Bhaskar Mitra
 
Recent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingRecent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingSeth Grimes
 
More on Indexing Text Operations (1).pptx
More on Indexing  Text Operations (1).pptxMore on Indexing  Text Operations (1).pptx
More on Indexing Text Operations (1).pptxMahsadelavari
 
Big Data Spain 2017 - Deriving Actionable Insights from High Volume Media St...
Big Data Spain 2017  - Deriving Actionable Insights from High Volume Media St...Big Data Spain 2017  - Deriving Actionable Insights from High Volume Media St...
Big Data Spain 2017 - Deriving Actionable Insights from High Volume Media St...Apache OpenNLP
 
Named Entity Recognition for Twitter Microposts (only) using Distributed Word...
Named Entity Recognition for Twitter Microposts (only) using Distributed Word...Named Entity Recognition for Twitter Microposts (only) using Distributed Word...
Named Entity Recognition for Twitter Microposts (only) using Distributed Word...fgodin
 
Contemporary Models of Natural Language Processing
Contemporary Models of Natural Language ProcessingContemporary Models of Natural Language Processing
Contemporary Models of Natural Language ProcessingKaterina Vylomova
 
Assessment of english language learners final
Assessment of english language learners finalAssessment of english language learners final
Assessment of english language learners finalcswstyle
 
NLP in Practice - Part I
NLP in Practice - Part INLP in Practice - Part I
NLP in Practice - Part IDelip Rao
 

Ähnlich wie Emily Pitler - Representations from Natural Language Data: Successes and Challenges (20)

ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
ESR10 Joachim Daiber - EXPERT Summer School - Malaga 2015
 
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and ApplicationsICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
 
Word2vec: From intuition to practice using gensim
Word2vec: From intuition to practice using gensimWord2vec: From intuition to practice using gensim
Word2vec: From intuition to practice using gensim
 
MEBI 591C/598 – Data and Text Mining in Biomedical Informatics
MEBI 591C/598 – Data and Text Mining in Biomedical InformaticsMEBI 591C/598 – Data and Text Mining in Biomedical Informatics
MEBI 591C/598 – Data and Text Mining in Biomedical Informatics
 
State of NLP and Amazon Comprehend
State of NLP and Amazon ComprehendState of NLP and Amazon Comprehend
State of NLP and Amazon Comprehend
 
Biemann ibm cog_comp_jan2015_noanim
Biemann ibm cog_comp_jan2015_noanimBiemann ibm cog_comp_jan2015_noanim
Biemann ibm cog_comp_jan2015_noanim
 
ELKL 5 Language documentation for linguistics and technology
ELKL 5 Language documentation for linguistics and technologyELKL 5 Language documentation for linguistics and technology
ELKL 5 Language documentation for linguistics and technology
 
Portuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and HowPortuguese Linguistic Tools: What, Why and How
Portuguese Linguistic Tools: What, Why and How
 
Babak Rasolzadeh: The importance of entities
Babak Rasolzadeh: The importance of entitiesBabak Rasolzadeh: The importance of entities
Babak Rasolzadeh: The importance of entities
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
 
Recent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingRecent Advances in Natural Language Processing
Recent Advances in Natural Language Processing
 
More on Indexing Text Operations (1).pptx
More on Indexing  Text Operations (1).pptxMore on Indexing  Text Operations (1).pptx
More on Indexing Text Operations (1).pptx
 
Big Data Spain 2017 - Deriving Actionable Insights from High Volume Media St...
Big Data Spain 2017  - Deriving Actionable Insights from High Volume Media St...Big Data Spain 2017  - Deriving Actionable Insights from High Volume Media St...
Big Data Spain 2017 - Deriving Actionable Insights from High Volume Media St...
 
Parameter setting
Parameter settingParameter setting
Parameter setting
 
Lausanne 2019 #3
Lausanne 2019 #3Lausanne 2019 #3
Lausanne 2019 #3
 
Named Entity Recognition for Twitter Microposts (only) using Distributed Word...
Named Entity Recognition for Twitter Microposts (only) using Distributed Word...Named Entity Recognition for Twitter Microposts (only) using Distributed Word...
Named Entity Recognition for Twitter Microposts (only) using Distributed Word...
 
Contemporary Models of Natural Language Processing
Contemporary Models of Natural Language ProcessingContemporary Models of Natural Language Processing
Contemporary Models of Natural Language Processing
 
Assessment of english language learners final
Assessment of english language learners finalAssessment of english language learners final
Assessment of english language learners final
 
NLP in Practice - Part I
NLP in Practice - Part INLP in Practice - Part I
NLP in Practice - Part I
 
Lidia Pivovarova
Lidia PivovarovaLidia Pivovarova
Lidia Pivovarova
 

Mehr von MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

Mehr von MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Kürzlich hochgeladen

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 

Kürzlich hochgeladen (20)

Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 

Emily Pitler - Representations from Natural Language Data: Successes and Challenges

  • 1. Successes and Challenges Emily Pitler, Google AI Representations from Natural Language Data
  • 2. State-of-the-art in Natural Language Understanding in 2017 → → Custom Recurrent Architectures P 2
  • 3. Oct. 2018: One Model with Task-specific Tuning in Minutes P 3
  • 4. BERT: Bidirectional Encoder Representations from Transformers P 4
  • 5. https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html Transformers: Attention is All You Need P 5Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin, NIPS 2017
  • 6. From One-Hot Vectors to Word Embeddings & Self-Attention P 6 animal...street...it 0000…10001…01000…0 one-hot 1.4…3.74.9…6.42.5…8.0 embedding The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 7. From One-Hot Vectors to Word Embeddings & Self-Attention P 7 animal...street...it 0000…10001…01000…0 one-hot 1.4…3.74.9…6.42.5…8.0 embedding The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 8. query, key, value From One-Hot Vectors to Word Embeddings & Self-Attention P 8 animal...street...it 0000…10001…01000…0 one-hot 1.4…3.74.9…6.42.5…8.0 embedding The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 9. query, key, value From One-Hot Vectors to Word Embeddings & Self-Attention P 9 animal...street...it 0000…10001…01000…0 one-hot 0.1 0.2 0.7 (self-)attention 1.4…3.74.9…6.42.5…8.0 embedding The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 10. query, key, value From One-Hot Vectors to Word Embeddings & Self-Attention P 10 animal...street...it 0000…10001…01000…0 one-hot 0.1 0.2 0.7 (self-)attention 1.4…3.74.9…6.42.5…8.0 embedding The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
  • 11. OpenAI: Generative Pretraining The animal tired Acceptable <s> the … too <s> the … tired P 11 Transformer Transformer Transformer Transformer TransformerTransformer Transformer Transformer Transformer Transformer TransformerTransformer
  • 12. Understanding Can Need “Future” Information How far is Jacksonville from Miami? Jacksonville is in the First Coast region of northeast Florida and is centered on the banks of the St. Johns River, about 25 miles (40 km) south of the Georgia state line and about 340 miles (550 km) north of Miami. VERB NOUN Mark which area you want to distress. Mark, which area do you want to distress? P 12
  • 13. Naive Bidirectionality: Words Can “See Themselves” The animal tired The animal tired <s> the … too <s> the … too P 13 Transformer Transformer Transformer Transformer TransformerTransformer Transformer Transformer Transformer Transformer TransformerTransformer
  • 14. Training BERT Masked Language Model (Fill-in-the-blank) Deep learning (also [MASK] [MASK] deep structured learning or [MASK] learning) is part of a broader family of machine learning methods [MASK] on [MASK] data representations, as opposed to task-specific algorithms. [MASK] is allergic to peaches. Is P 14https://en.wikipedia.org/wiki/Deep_learning https://en.wikipedia.org/wiki/Daniel_Tiger%27s_Neighborhood BooksCorpus: Zhu, Kiros, Zemel, Salakhutdinov, Urtasun, Torralba, Fidler, CVPR 2015
  • 18. SWAG: Zellers, Bisk, Schwartz, Choi, EMNLP 2018 SQuAD: Rajpurkar, Zhang, Lopyrev, Liang, EMNLP 2016 Results: Commonsense Reasoning and Question Answering P 18
  • 19. MRPC: Dolan and Brockett, IWP 2005 Pretraining Tasks Matter...and Bigger = Better * P 19
  • 20. Do I Need Full BERT Models for All My Tasks? P 20Houlsby, Giurgiu, Jastrzebski, Morrone, de Laroussilhe, Gesmundo, Attariyan, Gelly, arxiv Feb 2019
  • 21. Try It Out, Get Faster Training with TPUs P 21
  • 22. Mismatches between Training and Realistic Inputs Two Case Studies: Mixed Language Text and Identifying Commands P 22
  • 23. P 23 Multiple Languages: Frequent “In-the-Wild”, Rare in Training
  • 24. P 24 Multiple Languages: Frequent “In-the-Wild”, Rare in Training
  • 25. “A Fast, Compact, Accurate Model for Language Identification of Codemixed Text” Zhang, Riesa , Gillick , Bakalov, Baldridge, Weiss, EMNLP 2018 P 25
  • 26. Accuracy and Speed of Token-level Language Id P 26
  • 27. Accuracy and Speed of Token-level Language Id P 27
  • 28. Useful Preprocessing Step Across Tasks P 28
  • 29. Mismatches between Training and Realistic Inputs Two Case Studies: Mixed Language Text and Identifying Commands P 29
  • 30. Noun-Verb Ambiguity “lives” / Noun → /laIvz/ “lives” / Verb → /lIvz/ flies NOUN Mark VERB P 30Elkahky, Webster, Andor, Pitler, EMNLP 2018
  • 31. Certain insects can damage plumerias, such as mites, flies, or aphids. NOUN Mark which area you want to distress. VERB P 31 “A Challenge Set and Methods for Noun-Verb Ambiguity”, EMNLP 2018
  • 32. Accuracy on Noun-Verb Disambiguation P 32
  • 34. Mark VERB Webster, Recasens, Axelrod, Baldridge, TACL 2019 Kwiatkowski, Palomaki, Redfield, Collins, Parikh, Alberti, Epstein, Polosukhin, Kelcey, Devlin, Lee, Toutanova, Jones, Chang, Dai, Uszkoreit, Le, Petrov, TACL 2019 Released Datasets with “In-the-Wild” Natural Challenges P 34