Generative Pseudo Labeling
윤용선
0. Paper
• GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
• Authors: Kexin Wang, Nandan Thakur, Nils Reimers, Iryna Gurevych
• Published: 2021.12 (arXiv)
• https://arxiv.org/abs/2112.07577
0. Preliminaries
• Information Retrieval
  • The task of finding documents relevant to a query (relevant = able to answer it)
  • Open-domain QA: IR + MRC
  • Method: select the document with the highest score (similarity) to the query
• Sparse embedding vs. dense embedding
  • Keywords and proper nouns suit sparse; synonyms and paraphrases suit dense
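The sparse vs. dense contrast above can be illustrated with a toy sketch. The function names, the example texts, and the hand-made 2-d vectors are all assumptions made for illustration; a real system would use BM25-style term weighting and a trained encoder.

```python
from typing import Dict, Sequence


def sparse_score(query: str, doc: str) -> int:
    """Toy lexical score: count distinct query terms that also occur in the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def dense_score(query_vec: Sequence[float], doc_vec: Sequence[float]) -> float:
    """Dot-product similarity between two embedding vectors."""
    return sum(q * d for q, d in zip(query_vec, doc_vec))


def best_doc(scores: Dict[str, float]) -> str:
    """Return the id of the document with the highest score."""
    return max(scores, key=scores.get)


query = "car price"
docs = {"d1": "automobile cost guide", "d2": "car wash tips"}

# Hand-made 2-d vectors standing in for a trained encoder's output (illustrative only).
query_vec = [1.0, 1.0]
doc_vecs = {"d1": [0.9, 0.8], "d2": [0.7, 0.0]}

sparse_scores = {doc_id: sparse_score(query, text) for doc_id, text in docs.items()}
dense_scores = {doc_id: dense_score(query_vec, vec) for doc_id, vec in doc_vecs.items()}

print(best_doc(sparse_scores))  # keyword overlap favors d2
print(best_doc(dense_scores))   # the toy embeddings favor the paraphrase d1
```

In this toy setup the lexical scorer only sees the shared keyword "car", while the dense scorer can rank the paraphrased document first because similarity is computed in vector space.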
- Fast search (Maximum Inner Product Search)
- Underwhelming performance
- Good performance
- Extremely slow
Retriever -> Reranker -> Reader
1. Introduction
• Recently, information retrieval methods based on dense vector spaces have become popular as a way to address the limitations of sparse vectors.
• Dense retrieval methods require large amounts of training data to work well.
• Dense retrieval methods are extremely sensitive to domain shifts.
• Models trained on MS MARCO perform rather poorly on questions about COVID-19 scientific literature.
• These models did not learn how to represent this topic well in a vector space.
• We present Generative Pseudo Labeling (GPL), an unsupervised domain adaptation method for dense retrieval models.
2. Method
• For a given target corpus, we generate three queries for each passage using a T5 encoder-decoder model.
• For each generated query, we use an existing retrieval system to retrieve 50 negative passages.
• For each (query, positive, negative) tuple, we compute the margin score using a cross-encoder.
• We train the bi-encoder on these margin scores.
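The steps above can be sketched as a small data-building loop. This is only a sketch under stated assumptions: `generate_queries`, `retrieve`, and `cross_encoder_score` are hypothetical callables standing in for the T5 query generator, the existing retriever, and the cross-encoder, and scoring every retrieved negative (rather than sampling) is a simplification.

```python
from typing import Callable, Dict, List, Tuple


def build_gpl_training_data(
    corpus: Dict[str, str],
    generate_queries: Callable[[str, int], List[str]],   # hypothetical query generator
    retrieve: Callable[[str, int], List[str]],           # hypothetical retriever, returns passage ids
    cross_encoder_score: Callable[[str, str], float],    # hypothetical cross-encoder scorer
    queries_per_passage: int = 3,
    negatives_per_query: int = 50,
) -> List[Tuple[str, str, str, float]]:
    """Return (query, positive_id, negative_id, margin) pseudo-labeled tuples."""
    examples = []
    for pos_id, passage in corpus.items():
        # Step 1: generate queries for the passage; the passage is their positive.
        for query in generate_queries(passage, queries_per_passage):
            pos_score = cross_encoder_score(query, passage)
            # Step 2: mine candidate negatives with the existing retriever.
            for neg_id in retrieve(query, negatives_per_query):
                if neg_id == pos_id:
                    continue  # the source passage itself is not a negative
                # Step 3: pseudo-label the tuple with the cross-encoder margin.
                neg_score = cross_encoder_score(query, corpus[neg_id])
                examples.append((query, pos_id, neg_id, pos_score - neg_score))
    return examples
```

The resulting margins are the soft labels that the bi-encoder is later trained to reproduce.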
• Multiple Negatives Ranking (MNR) loss considers only the coarse relationship between queries and passages, i.e., the matching passage is treated as relevant while all other passages are treated as irrelevant.
• However, the query generator might generate queries that the passage cannot answer. Further, other passages might actually be relevant to a given query as well.
• MarginMSE loss uses a powerful cross-encoder to soft-label (query, passage) pairs. It then teaches the dense retriever to mimic the score margin between the positive and negative query-passage pairs.
In GPL,
- Bad query -> low positive score -> query and passage end up distant
- False negative -> high negative score -> query and passage stay similar
MarginMSE Loss
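A minimal sketch of the MarginMSE objective described above: the student's margin is the dot-product score of (query, positive) minus that of (query, negative), and the loss is the mean squared difference to the cross-encoder's margin. The function and argument names are illustrative assumptions, and plain Python lists stand in for embedding tensors.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot-product similarity between two vectors."""
    return sum(a * b for a, b in zip(u, v))


def margin_mse_loss(
    query_vecs: Sequence[Sequence[float]],
    pos_vecs: Sequence[Sequence[float]],
    neg_vecs: Sequence[Sequence[float]],
    teacher_margins: Sequence[float],
) -> float:
    """Mean squared error between student margins and teacher (cross-encoder) margins."""
    squared_errors = []
    for q, p, n, target in zip(query_vecs, pos_vecs, neg_vecs, teacher_margins):
        student_margin = dot(q, p) - dot(q, n)
        squared_errors.append((student_margin - target) ** 2)
    return sum(squared_errors) / len(squared_errors)
```

Because the target is a margin rather than a hard label, a low-quality query (small teacher margin) or a false negative (small or negative teacher margin) simply yields a smaller target gap instead of forcing the pair apart.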
3. Experiments
Experimental Setup
• Query generator: docT5query
• Negative miners (retrievers): msmarco-distilbert-base-v3, msmarco-MiniLM-L-6-v3
• Retrieve 50 negatives with each retriever, then sample uniformly from them
• Cross-encoder: msmarco-MiniLM-L-6-v2
• Student: MS MARCO DistilBERT + mean pooling + dot product
• 140k training steps, batch size 32 (no large batch size needed)
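The negative-sampling step in this setup might look like the sketch below. It assumes the candidates from all retrievers are pooled and de-duplicated before one negative is drawn uniformly; the pooling detail and the name `sample_negative` are assumptions, not something the slides specify.

```python
import random
from typing import Dict, List, Optional


def sample_negative(
    candidates_by_retriever: Dict[str, List[str]],
    positive_id: str,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Uniformly sample one negative passage id from the pooled retriever results."""
    rng = rng or random.Random()
    # Pool the candidates of every retriever, dropping duplicates and the positive itself.
    pool = sorted(
        {pid for ids in candidates_by_retriever.values() for pid in ids if pid != positive_id}
    )
    return rng.choice(pool) if pool else None


# Example with made-up passage ids mined by the two retrievers.
mined = {
    "msmarco-distilbert-base-v3": ["p1", "p2", "p3"],
    "msmarco-MiniLM-L-6-v3": ["p2", "p4"],
}
negative = sample_negative(mined, positive_id="p1", rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the sampling reproducible across runs.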
Evaluation
• Six domain-specific text retrieval tasks from the BEIR benchmark
• Evaluation is done using nDCG@10
• Goal: rank more relevant documents higher
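For reference, nDCG@k can be computed as in the sketch below. It assumes the linear-gain form of DCG (relevance divided by log2 of rank + 1); some implementations use an exponential gain instead, so absolute values may differ from a given evaluation toolkit.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (linear gain)."""
    # enumerate() is 0-based, so the 1-based rank r contributes rel / log2(r + 1).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(
    ranked_relevances: Sequence[float],
    all_relevances: Sequence[float],
    k: int = 10,
) -> float:
    """nDCG@k: DCG of the system ranking divided by the DCG of the ideal ranking.

    ranked_relevances: relevance labels of the returned documents, in ranked order.
    all_relevances: relevance labels of every judged document for the query.
    """
    ideal = dcg_at_k(sorted(all_relevances, reverse=True), k)
    return dcg_at_k(ranked_relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that places the most relevant documents first scores 1.0, and pushing a relevant document further down the list lowers the score.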
Baselines
• Zero-shot
  • MS MARCO: DistilBERT dense retriever trained with MarginMSE
  • BM25: lexical matching with Elasticsearch
• Pre-training-based domain adaptation
  • SimCSE: encode the same sentence with different dropout masks + MNR loss
  • ICT: sample one sentence from the passage as the pseudo-query
  • TSDAE: denoising autoencoder
• Generation-based domain adaptation
  • QGen: generated queries + Multiple Negatives Ranking loss
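Of the baselines above, ICT is simple enough to sketch without a model. The version below assumes a naive punctuation-based sentence splitter and always removes the sampled sentence from the remaining context; both choices, and the name `ict_example`, are simplifying assumptions.

```python
import random
import re
from typing import Optional, Tuple


def ict_example(passage: str, rng: random.Random) -> Optional[Tuple[str, str]]:
    """Build one (pseudo_query, context) pair by sampling a sentence from the passage."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", passage.strip()) if s]
    if len(sentences) < 2:
        return None  # nothing would be left to serve as context
    idx = rng.randrange(len(sentences))
    pseudo_query = sentences[idx]
    context = " ".join(sentences[:idx] + sentences[idx + 1:])
    return pseudo_query, context
```

Each pair can then be used as a positive (query, passage) example for contrastive pre-training on the target corpus.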
4. Results
5. Analysis
Influence of Training Steps
• GPL performance begins to saturate after around 100K steps.
• With TSDAE pre-training, performance improves consistently.
Influence of Corpus Size
• We find that with more than 10K passages, GPL can already outperform the zero-shot baseline.
Robustness against Query Generation
• Generating 3 queries per passage appears to be optimal; generating more queries per passage does not yield further improvements.
Sensitivity to Starting Checkpoints
• We also evaluate directly fine-tuning a DistilBERT model using QGen.
6. Conclusion
• In this work, we propose GPL, a novel unsupervised domain adaptation method for dense retrieval models.
• Pseudo-labeling overcomes two important shortcomings of previous methods:
  • Not all generated queries are of high quality.
  • Training with mined hard negatives can be noisy.
• We observe that GPL performs well on all the datasets and significantly outperforms other approaches.
• As a limitation, GPL requires a relatively complex training setup; future work can focus on simplifying this training pipeline.



Editor's Notes

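  1. The bi-encoder trained on MS MARCO performs worse than BM25. With a cross-encoder as well, the BM25 retriever is better than the MS MARCO retriever. Among the pre-training + domain adaptation methods, TSDAE is the best. Otherwise, GPL is the best. Training DistilBERT with TSDAE and then with GPL is even better. Reranking is better still.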