Benchmarking approaches to
transfer learning in NLP
Yury Kashnitskiy, Gianluigi Bardelloni
General problem:
1. Scarce well-labeled data in NLP tasks
2. Loads of raw texts available
3. How to utilize raw data to improve
performance in supervised NLP tasks?
Put simply: how can we make use of all this unlabeled text?
Business problem:
1. NLP tasks typically require a lot of labeled data
2. Labeling data is expensive
3. A disciplined approach to minimizing the required training
set size is highly desirable
Background: transfer learning in Computer Vision
[Figure: pretraining and fine-tuning stages]
General idea:
1. A neural net is trained on raw
data to predict a word given its
context
2. Meanwhile it learns a vector
(“embedding”) for each word
3. Embeddings can be
transferred and used in
supervised learning tasks
(e.g., part-of-speech tagging)
This approach leads to state-of-the-art (SotA)
results in many NLP tasks
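The three steps above can be illustrated with a toy example. The sketch below is not from the talk: it assumes a continuous bag-of-words (CBOW) setup with a full softmax, and the function name `train_cbow`, the corpus, and all hyperparameters are made up for illustration. It trains on raw tokens only and returns one vector per word, which a downstream supervised model could then use as input features.

```python
import math
import random


def train_cbow(tokens, dim=8, window=2, epochs=50, lr=0.1, seed=0):
    """Toy CBOW: predict each word from the mean of its context embeddings."""
    rng = random.Random(seed)
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)
    # Input embeddings (one vector per word) and output-layer weights.
    emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    out = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for pos, word in enumerate(tokens):
            lo, hi = max(0, pos - window), min(len(tokens), pos + window + 1)
            ctx = [index[tokens[j]] for j in range(lo, hi) if j != pos]
            if not ctx:
                continue
            # Hidden layer: average of the context word embeddings.
            hidden = [sum(emb[c][k] for c in ctx) / len(ctx) for k in range(dim)]
            scores = [sum(out[v][k] * hidden[k] for k in range(dim))
                      for v in range(size)]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            norm = sum(exps)
            probs = [e / norm for e in exps]
            target = index[word]
            total += -math.log(probs[target])
            # Cross-entropy gradient w.r.t. the scores is (probs - one_hot).
            grad_hidden = [0.0] * dim
            for v in range(size):
                g = probs[v] - (1.0 if v == target else 0.0)
                for k in range(dim):
                    grad_hidden[k] += g * out[v][k]
                    out[v][k] -= lr * g * hidden[k]
            for c in ctx:
                for k in range(dim):
                    emb[c][k] -= lr * grad_hidden[k] / len(ctx)
        losses.append(total)
    return {w: emb[index[w]] for w in vocab}, losses


# Raw, unlabeled text is the only input; no labels are involved.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vectors, losses = train_cbow(corpus)
print(len(vectors), "words,", len(vectors["cat"]), "dimensions each")
```

In practice the embedding table would be trained on a far larger corpus and then reused, frozen or fine-tuned, in the supervised model.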
ULMFiT:
1. Take a language model
pretrained on a large general
corpus (e.g., Wikipedia)
2. Fine-tune the language
model on your domain
(e.g., chats with
customers)
3. Fine-tune the classifier
Task 1. Amazon product reviews classification (English)
Validation accuracy:
Logistic Regression + Tf-Idf: 72.5%
ULMFiT: 79.5%
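For readers unfamiliar with the baseline, here is a minimal sketch of a Tf-Idf plus logistic regression classifier in plain Python. It is not the code behind the 72.5% figure: the reviews, labels, function names, and hyperparameters are invented for illustration, and a real experiment would more likely rely on a library implementation.

```python
import math
from collections import Counter


def fit_idf(docs):
    """Learn smoothed inverse document frequencies from tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in sorted(df.items())}


def tfidf(doc, idf):
    """Map one tokenized document to an L2-normalized tf-idf vector."""
    counts = Counter(doc)  # terms outside the learned vocabulary are ignored
    vec = [counts[t] * w for t, w in idf.items()]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def train_logreg(X, y, epochs=200, lr=0.5):
    """Binary logistic regression trained with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - label
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(x, w, b):
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)


# Hypothetical toy data: 1 = positive review, 0 = negative review.
train = [("great product works well", 1), ("love it great quality", 1),
         ("terrible product broke quickly", 0), ("bad quality waste of money", 0)]
valid = [("really great product love it", 1), ("terrible waste broke quickly", 0)]

idf = fit_idf([text.split() for text, _ in train])
w, b = train_logreg([tfidf(text.split(), idf) for text, _ in train],
                    [label for _, label in train])
hits = [predict(tfidf(text.split(), idf), w, b) == label for text, label in valid]
accuracy = sum(hits) / len(hits)
print(f"validation accuracy: {accuracy:.1%}")
```

The idf weights are fitted on the training texts only, so validation accuracy is measured on vectors built from a vocabulary the model saw during training.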
Interpreting Logistic Regression with eli5
Examples of generated text (pure LSTM, PyTorch examples):
"This is a product that gets great secrets to the store's difficulty,
though it's compact and well made. Ordered one in the
mornings, for price. It's good though. Lasts 6 months, really need
to ask Amazon.com) as soon as you have the off!"
"I have been using Panasonic for a year and my mom was on
them to replace my Quadra Action and been pleased with the
gel."
Task 2. Classifying chats with customers (Dutch)
Validation accuracy:
Logistic Regression + Tf-Idf: 73.5%
ULMFiT: 70.2%
Logistic Regression + ELMo: 66%
Things to try:
1. Training ULMFiT
models on Dutch
texts
2. Fine-tuning a BERT
classifier
3. Trying other
models: GPT-2, other
OpenAI transformers,
etc.
The goal is to develop best practices for
transfer learning in Dutch classification tasks
What we expect from the collaboration:
1. Trying different transfer learning approaches
2. Both public and private data (English and Dutch)
3. Sharing code & ideas
In 3 months:
share preliminary results – code, models, guides
Proposed tasks
1. Benchmarking different approaches on several datasets to
see what works best. The main focus is on BERT and ULMFiT,
but other approaches are not excluded.
Kaggle datasets:
- Amazon healthcare reviews (English)
- Amazon pet products reviews (English), Kaggle competition
- Clickbait news detection (English), Kaggle competition
- Book reviews sentiment prediction (Dutch)
2. Training own models for ULMFiT (Dutch)
3. Exploring Byte Pair Encoding as preprocessing for ULMFiT
4. Exploring preprocessing steps to improve BERT classifier
5. Dealing with typos and noisy text when using BERT
6. Fine-tuning BERT language models and exploring its effect on
classification
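Item 3 refers to Byte Pair Encoding (BPE), which builds a subword vocabulary by repeatedly merging the most frequent adjacent symbol pair. The sketch below shows only that merge-learning loop on a toy word list. The function name `learn_bpe`, the `</w>` end-of-word marker, and the tie-breaking rule are illustrative choices, not details from the slides.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn BPE merge rules from a list of words (toy, character-level start)."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken lexicographically for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


words = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
merges = learn_bpe(words, num_merges=3)
print(merges)
```

On noisy chat text, subword units like these let rare or misspelled words be represented as sequences of known pieces instead of a single unknown token.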
Trying other approaches:
1. Huggingface transfer learning tutorial + code
2. Fine-tuning classification head over LSTMs (pure Python)
3. GPT-2 transformers
4. Other OpenAI transformers
Other tasks:
1. Investigating text augmentations and their effect on
classification accuracy
2. Active learning in NLP
3. Hierarchical text classification
4. Few-shot learning (e.g., Unsupervised Data Augmentation)
Yury Kashnitskiy, Gianluigi Bardelloni
yury.kashnitskiy@kpn.com
gianluigi.bardelloni@kpn.com