Paying Attention to Multi-Word Expressions
in Neural Machine Translation
Matīss Rikters1 and Ondřej Bojar2
1University of Latvia, Faculty of Computing
2Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics
The 16th Machine Translation Summit
Nagoya, Japan
September 20, 2017
Contents
• Introduction
• Related Work
• Workflow
• Data
• NMT Systems
• Experiments
• Results
• Manual Inspection
• Attention Inspection
• Conclusions
Introduction
• Raining cats and dogs En → Lv
• Lietu kaķi un suņi
• Suņu un kaķu
• Raining kaķi un suņi
• Līst kā pa Jāņiem
Related Work
• Extracting MWE candidates and integrating them in SMT (Skadiņa, 2016)
• Tagging candidate phrases in source sentence and forcing the decoder
to generate multiple words at once for the target phrase (Tang et al., 2016)
• Inclusion of structural biases from word-based alignment models,
such as positional bias, Markov conditioning, fertility and agreement over
translation directions, in attentional NMT (Cohn et al., 2016)
• Automatically extracting smaller parts of training segment pairs and adding
them to NMT training data (Chen et al., 2016)
• No related work specifically targeting MWEs in NMT
More Related Work
• Translating Phrases in Neural Machine Translation (Wang et al., 2017)
• Results of the WMT17 Neural MT Training Task (Bojar et al., 2017)
Workflow
• Tag corpora with morphological taggers (UDPipe, LV Tagger)
• Identify MWE candidates (MWE Toolkit)
• Align identified MWE candidates (MPAligner)
• Shuffle MWEs into training corpora; train NMT systems (Neural Monkey)
• Identify changes
Identifying Multi-word Expressions
En → Lv
• 210 patterns (Skadiņa, 2016)
• 60 000 multi-word expressions
En → Cs
• 23 patterns (Majchrakova et al., 2012; Pecina 2008)
• 400 000 multi-word expressions
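Pattern-based identification can be illustrated with a minimal sketch that matches part-of-speech sequences against tagged tokens. The three patterns, the tag names and the function `find_mwe_candidates` are hypothetical and chosen for illustration; they are not the pattern sets cited above, and this is not how the MWE Toolkit is invoked.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word form, POS tag)

# Hypothetical patterns over coarse POS tags; the actual pattern sets
# (210 for En-Lv, 23 for En-Cs) are not listed on the slide.
PATTERNS: List[Tuple[str, ...]] = [
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
    ("VERB", "ADP", "NOUN"),
]


def find_mwe_candidates(sentence: List[Token],
                        patterns: List[Tuple[str, ...]] = PATTERNS) -> List[str]:
    """Return every contiguous word span whose POS tags match a pattern."""
    tags = [tag for _, tag in sentence]
    candidates = []
    for start in range(len(sentence)):
        for pattern in patterns:
            end = start + len(pattern)
            if tuple(tags[start:end]) == pattern:
                words = [word for word, _ in sentence[start:end]]
                candidates.append(" ".join(words))
    return candidates


tagged = [("heavy", "ADJ"), ("rain", "NOUN"), ("fell", "VERB"),
          ("on", "ADP"), ("Riga", "PROPN")]
print(find_mwe_candidates(tagged))  # ['heavy rain']
```

In a real pipeline the candidates from both languages would then be passed to an aligner to obtain parallel MWE pairs.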
Data
Training data layout (corpus portions interleaved with MWE data)
• En → Lv: 1M, 1xMWE, 1M, 2xMWE, 2M, 2xMWE, 0.5M
• En → Cs: 2.5M, 1xMWE, 2.5M, 2xMWE, 5M, 2xMWE, 5M
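One way to read this layout is as consecutive corpus portions with blocks of MWE data placed between them. The sketch below assumes that "1xMWE" and "2xMWE" mean one and two copies of the extracted MWE set; the function `interleave` and the toy data are hypothetical and scaled down from the En → Lv row.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def interleave(corpus: Sequence[Pair], mwes: Sequence[Pair],
               portions: Sequence[int], copies: Sequence[int]) -> List[Pair]:
    """Split `corpus` into consecutive portions and put MWE copies between them.

    portions[i] is the size of the i-th corpus portion; copies[i] is how many
    copies of `mwes` follow portion i (so len(copies) == len(portions) - 1).
    """
    if sum(portions) != len(corpus):
        raise ValueError("portion sizes must add up to the corpus size")
    if len(copies) != len(portions) - 1:
        raise ValueError("need one MWE block between each pair of portions")
    result: List[Pair] = []
    start = 0
    for i, size in enumerate(portions):
        result.extend(corpus[start:start + size])
        start += size
        if i < len(copies):
            result.extend(list(mwes) * copies[i])
    return result


# Toy stand-in for: 1M, 1xMWE, 1M, 2xMWE, 2M, 2xMWE, 0.5M
corpus = [(f"src {i}", f"tgt {i}") for i in range(9)]
mwes = [("mwe src", "mwe tgt")]
mixed = interleave(corpus, mwes, portions=[2, 2, 4, 1], copies=[1, 2, 2])
print(len(mixed))  # 14
```

Whether the MWE blocks are additionally shuffled with neighbouring sentences is not stated on the slide, so the sketch keeps them as contiguous blocks.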
Data
• WMT17 News Translation Task
• Training
• En → Lv
• 4.5M parallel sentences for the baseline
• 4.8M after adding MWEs/MWE sentences
• En → Cs
• 49M parallel sentences for the baseline
• 17M after adding MWEs/MWE sentences
• Evaluation
• En → Lv
• 2003 sentences in total
• 611 sentences with at least one MWE
• En → Cs
• 6000 sentences in total
• 112 sentences with at least one MWE
NMT Systems
• Neural Monkey
• Embedding size 350
• Encoder state size 350
• Decoder state size 350
• Max sentence length 50
• BPE merges 30000
https://github.com/ufal/neuralmonkey
Experiments
Two forms of presenting MWEs to the NMT system
• Adding only the parallel MWEs themselves (MWE phrases),
each pair forming a new “sentence pair” in the parallel corpus
• Adding full sentences that contain the identified MWEs (MWE sentences)
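The difference between the two forms can be sketched as follows. This is a minimal illustration under stated assumptions: plain substring matching stands in for the actual MWE identification and alignment, and the function names and toy sentences are hypothetical.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (source text, target text)


def mwe_phrase_pairs(mwes: List[Pair]) -> List[Pair]:
    """Form 1: each aligned MWE becomes its own short 'sentence pair'."""
    return list(mwes)


def mwe_sentence_pairs(corpus: List[Pair], mwes: List[Pair]) -> List[Pair]:
    """Form 2: full sentence pairs in which an aligned MWE occurs on both sides."""
    selected = []
    for src, tgt in corpus:
        if any(m_src in src and m_tgt in tgt for m_src, m_tgt in mwes):
            selected.append((src, tgt))
    return selected


mwes = [("raining cats and dogs", "līst kā pa Jāņiem")]
corpus = [
    ("it is raining cats and dogs today", "šodien līst kā pa Jāņiem"),
    ("the weather is fine", "laiks ir labs"),
]

extra_phrases = mwe_phrase_pairs(mwes)              # only the MWE pair itself
extra_sentences = mwe_sentence_pairs(corpus, mwes)  # only the first sentence pair
print(len(extra_phrases), len(extra_sentences))     # 1 1
```

Either list would then be added to the baseline training corpus before training.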
Results
Dataset          En → Cs          En → Lv
                 Dev     MWE      Dev     MWE
Baseline         13.71   10.25    11.29   9.32
+MWE phrases     -       -        11.94   10.31
+MWE sentences   13.99   10.44    -       -
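The gains over the baseline follow directly from the reported scores. The short sketch below only redoes that arithmetic; the dictionary layout and the helper `gain` are illustrative.

```python
# Scores copied from the results table: (Dev, MWE) per system.
scores = {
    "En-Cs": {"Baseline": (13.71, 10.25), "+MWE sentences": (13.99, 10.44)},
    "En-Lv": {"Baseline": (11.29, 9.32), "+MWE phrases": (11.94, 10.31)},
}


def gain(pair: str, system: str) -> tuple:
    """Absolute (Dev, MWE) difference between `system` and the baseline."""
    base_dev, base_mwe = scores[pair]["Baseline"]
    dev, mwe = scores[pair][system]
    return (round(dev - base_dev, 2), round(mwe - base_mwe, 2))


print(gain("En-Cs", "+MWE sentences"))  # (0.28, 0.19)
print(gain("En-Lv", "+MWE phrases"))    # (0.65, 0.99)
```

In both language pairs the reported score improves on the full Dev set and on the MWE subset, with the largest difference on the En → Lv MWE subset.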
Manual Inspection
Alignment Inspection
Conclusions
• First experiments with handling multi-word expressions in neural
machine translation – two methods for MWE integration in NMT
training data
• Open-source scripts for a complete workflow of identifying, extracting
and integrating MWEs into the NMT training and translation workflow
• Started work on an open-source tool for visualizing NMT attention
alignments (Rikters et al., 2017)
References
• Chen, W., Matusov, E., Khadivi, S., and Peter, J.-T. (2016). Guided alignment training for
topic-aware neural machine translation. AMTA 2016, Vol., page 121.
• Cohn, T., Hoang, C. D. V., Vymolova, E., Yao, K., Dyer, C., and Haffari, G. (2016). Incorporating
structural alignment biases into an attentional neural translation model. In Proceedings of the
2016 Conference of the North American Chapter of the Association for Computational Linguistics:
Human Language Technologies, pages 876–885, San Diego, California. Association for
Computational Linguistics.
• Majchrakova, D., Dusek, O., Hajic, J., Karcova, A., and Garabik, R. (2012). Semi-automatic
detection of multiword expressions in the Slovak dependency treebank.
• Pecina, P. (2008). Reference data for Czech collocation extraction. In Proc. of the LREC Workshop
Towards a Shared Task for MWEs (MWE 2008), pages 11–14.
• Rikters, M., Fishel, M., and Bojar, O. (2017). Visualizing Neural Machine Translation Attention and
Confidence. The Prague Bulletin of Mathematical Linguistics, volume 109.
• Skadiņa, I. (2016). Multi-word expressions in English - Latvian. In Human Language Technologies –
The Baltic Perspective: Proceedings of the Seventh International Conference Baltic HLT 2016,
volume 289, page 97. IOS Press.
• Tang, Y., Meng, F., Lu, Z., Li, H., and Yu, P. L. H. (2016). Neural machine translation with external
phrase memory. CoRR, abs/1606.01792.
Code & Presentation


Editor's notes

  1. Wang et al. (2017) propose a method to translate phrases in NMT by integrating a phrase memory, which stores target phrases from a phrase-based statistical machine translation (SMT) system, into the encoder-decoder architecture of NMT. Curriculum learning in the WMT17 Neural MT Training Task (Bojar et al., 2017): learning first on short target (Czech) sentences only and gradually adding longer sentences to the batches as the training progresses.