KantanNeural™ from A to Z
1/3: To NMT or not to NMT?
Dimitar Shterionov
The Rise of MT
[Figure: quality of MT over time. Vertical axis: relative quality; horizontal axis: time, marked at 1954, 1966, 1970, 1982, 1993, 2003, 2005, 2016 and 2020.]
31/07/2017 KantanFest, Dublin, Ireland 2
Breakthrough in Neural MT
Yet another MT paradigm?
Which technique is faster?
Which technique is better?
How can I integrate NMT in my pipeline?
How can I compare PBSMT and NMT?
How can I improve my NMT engine?
When to use PBSMT and when NMT?
Is NMT better than PBSMT???
Can NMT be better than PBSMT???
• Various empirical evaluations (since 2015)
…
Scientific Rigour – NMT vs PBSMT
• Experiment Setup
  • Identical Training, Test and Tune Data
  • NMT training limited to 4 days
• Evaluation:
  • Automated Scores: F-Measure, TER, BLEU
  • Ranking with KantanLQR™, A/B Testing
• Publications and Presentations
  • EAMT 2017
  • MT Summit 2017
  • LocWorld34 NMT GALA Track
• A small parenthesis…
There are so many factors:
  • Learning algorithm and rate
  • Number of epochs
  • ANN properties
  • Data – preprocessing, segmentation
You need the right data!
Training: Identical Corpora

Language Arc                   Parallel Sentences  TWC          UWC      Domain(s)
English->German                8,820,562           110,150,238  859,167  Legal/Medical
English->Chinese (Simplified)  6,522,064           84,426,931   956,864  Legal/Technical
English->Japanese              8,545,366           87,252,129   676,244  Legal/Technical
English->Italian               2,756,185           35,295,535   765,930  Medical
English->Spanish               3,681,332           44,917,538   952,089  Legal
Training: Automated Scores

                               SMT                               NMT
Language Arc                   F-Measure  BLEU    TER     Time   F-Measure  BLEU    TER     Perplexity  Time
English->German                62.00%     54.08%  54.31%  18h    62.53%     47.53%  53.41%  3.02        92h
English->Chinese (Simplified)  77.16%     45.36%  46.85%  6h     71.85%     39.39%  47.01%  2.00        10h
English->Japanese              80.04%     63.27%  43.77%  9h     69.51%     40.55%  49.46%  1.89        68h
English->Italian               69.74%     56.98%  42.54%  8h     64.88%     42.00%  48.73%  2.70        83h
English->Spanish               71.53%     54.78%  41.87%  9h     69.41%     49.24%  44.89%  2.59        71h

“In information theory, perplexity is a measurement of how well a
probability distribution or probability model predicts a sample. It may be
used to compare probability models. A low perplexity indicates the
probability distribution is good at predicting the sample.”
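
As a rough illustration of the definition quoted above, perplexity can be computed as the exponential of the average negative log-probability that a model assigns to each token of a sample. The sketch below assumes that standard definition; the function name and the probability values are hypothetical, and the slides do not state how the Perplexity column was actually computed.

```python
import math


def perplexity(token_probs):
    """Return exp of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical probabilities a model assigned to each token of a sample.
confident = [0.5, 0.5, 0.5, 0.5]   # model predicts the sample fairly well
uncertain = [0.1, 0.1, 0.1, 0.1]   # model predicts the sample poorly

print(perplexity(confident))
print(perplexity(uncertain))
```

The lower value for the first list reflects the point of the quotation: the better the model predicts the sample, the lower the perplexity.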
[Figure: bar chart of the automated scores (SMT-FM, SMT-BLEU, SMT-TER, NMT-FM, NMT-BLEU, NMT-TER) per language arc: English->German, English->Chinese(S), English->Japanese, English->Italian, English->Spanish; scale 0 to 90. The values are those of the Automated Scores table above.]
Alternative translations (BLEU score of each MT output in parentheses)

EN→DE
Source: All dossiers must be individually analysed by the ministry responsible for the economy and scientific policy.
Reference: Jeder Antrag wird von den Dienststellen des zuständigen Ministers für Wirtschaft und Wissenschaftspolitik individuell geprüft.
PBSMT (BLEU 58%): Alle Unterlagen müssen einzeln analysiert werden von den Dienststellen des zuständigen Ministers für Wirtschaft und Wissenschaftspolitik.
NMT (BLEU 0%): Alle Unterlagen müssen von dem für die Volkswirtschaft und die wissenschaftliche Politik zuständigen Ministerium einzeln analysiert werden.

ES→EN
Source: En este punto muestro mi desacuerdo con el informe.
Reference: On this point, I am not in agreement with the report before us.
PBSMT (BLEU 72%): At this point, I am not in agreement with the report.
NMT (BLEU 7%): In this point I disagree with the report.

ES→EN
Source: Debemos apoyarles a todos para que alcancen este objetivo.
Reference: We must give them all our support to reach that goal.
PBSMT (BLEU 100%): We must give them all our support to reach that goal.
NMT (BLEU 0%): We have to support everyone to achieve this goal.
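
The gap between the BLEU percentages in the examples above comes from how BLEU counts n-gram overlap with a single reference. The sketch below is a simplified, unsmoothed sentence-level BLEU with whitespace tokenization. It is an illustration only, not the scorer used for the slides, so it is not expected to reproduce every percentage shown. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Simplified BLEU: clipped n-gram precisions, geometric mean, brevity penalty."""
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clipped matches: an n-gram counts at most as often as in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matches == 0:
            return 0.0  # unsmoothed: one n-gram order without matches zeroes the score
        log_precisions.append(math.log(matches / total))
    brevity_penalty = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "We must give them all our support to reach that goal."
pbsmt = "We must give them all our support to reach that goal."
nmt = "We have to support everyone to achieve this goal."

print(sentence_bleu(pbsmt, reference))  # identical to the reference -> 1.0
print(sentence_bleu(nmt, reference))    # shares no bigram with the reference -> 0.0
```

The NMT output is an acceptable paraphrase, yet it shares no two-word sequence with the reference, so an unsmoothed overlap metric gives it no credit at all.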
Ranking
Average Scores from A/B Testing (in percent)

        EN→ZH-CN  EN→JA  EN→DE  EN→IT  EN→ES  AVERAGE
Same    37        21     13     24     10     21
SMT     24        21     34     19     28     25.2
NMT     39        58     53     56     62     53.6
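
The AVERAGE figures in the ranking chart can be checked with a few lines of arithmetic. This sketch assumes that the three data series appear in the order of the chart legend (Same, SMT, NMT); that mapping is inferred from the slide build-up rather than stated explicitly. The variable names are hypothetical.

```python
# Share of A/B judgements per language arc, in percent.
# Column order: EN→ZH-CN, EN→JA, EN→DE, EN→IT, EN→ES.
# Assumption: series are listed in legend order (Same, SMT, NMT).
ab_scores = {
    "Same": [37, 21, 13, 24, 10],
    "SMT": [24, 21, 34, 19, 28],
    "NMT": [39, 58, 53, 56, 62],
}

# Mean share per outcome across the five language arcs.
averages = {name: round(sum(v) / len(v), 1) for name, v in ab_scores.items()}

# Sanity check: the three shares for each arc should add up to about 100.
column_totals = [sum(col) for col in zip(*ab_scores.values())]

print(averages)
print(column_totals)
```

The computed means match the AVERAGE values on the slide (21, 25.2 and 53.6), and each language arc sums to 100 percent apart from one rounding difference.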
BLEU underestimation of NMT
• Take the translations from the NMT engine considered better than their PBSMT counterparts.
• How many of those are scored by BLEU lower than their PBSMT counterparts?
• Do the same for the PBSMT translations.

        EN→ZH-CN  EN→JP  EN→DE  EN→IT  EN→ES  Average
NMT     40%       59%    55%    34%    53%    48%
PBSMT   12%       0%     9%     9%     0%     6%
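
The counting procedure described above can be sketched as follows. This is a minimal illustration, assuming each segment carries a human A/B preference and one sentence-level BLEU score per engine. The `Segment` class, the function name and the toy values are hypothetical and are not data from the experiment.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    preferred: str     # human A/B judgement: "NMT", "PBSMT" or "Same"
    bleu_nmt: float    # sentence-level BLEU of the NMT output
    bleu_pbsmt: float  # sentence-level BLEU of the PBSMT output


def underestimation_rate(segments, system):
    """Share of segments where humans preferred `system` but BLEU scored it lower."""
    preferred = [s for s in segments if s.preferred == system]
    if not preferred:
        return 0.0
    if system == "NMT":
        lower = sum(1 for s in preferred if s.bleu_nmt < s.bleu_pbsmt)
    else:
        lower = sum(1 for s in preferred if s.bleu_pbsmt < s.bleu_nmt)
    return lower / len(preferred)


# Toy data for illustration only.
toy_segments = [
    Segment("NMT", 0.20, 0.45),
    Segment("NMT", 0.30, 0.60),
    Segment("NMT", 0.70, 0.50),
    Segment("PBSMT", 0.35, 0.80),
    Segment("Same", 1.00, 1.00),
]

print(underestimation_rate(toy_segments, "NMT"))
print(underestimation_rate(toy_segments, "PBSMT"))
```

Segments judged "Same" are ignored for both engines, since the question only concerns translations that evaluators ranked as better.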
Take-away messages…
• NMT is a new efficient paradigm for MT
• NMT does not solve the problem of language … but it is getting there
• NMT can be much better than PBSMT
• Evaluating NMT:
  • BLEU, TER, F-Measure may underestimate NMT when compared to PBSMT
  • Using KantanLQR™ (A/B Testing) facilitates MT ranking
To NMT or not to NMT?
Quality Evaluation
Thank you…
Who Is Emmanuel Katto Uganda? His Career, personal life etc.
 
ALL NFL NETWORK CONTACTS- April 29, 2024
ALL NFL NETWORK CONTACTS- April 29, 2024ALL NFL NETWORK CONTACTS- April 29, 2024
ALL NFL NETWORK CONTACTS- April 29, 2024
 
9990611130 Find & Book Russian Call Girls In Ghazipur
9990611130 Find & Book Russian Call Girls In Ghazipur9990611130 Find & Book Russian Call Girls In Ghazipur
9990611130 Find & Book Russian Call Girls In Ghazipur
 

Kantanfest: Dimitar Shterionov - Part 1

  • 1. KantanNeural™ from A to Z 1/3: To NMT or not to NMT? Dimitar Shterionov
  • 2. The Rise of MT. [Figure: quality of MT over time; x-axis: time (1954, 1966, 1970, 1982, 1993, 2003, 2005, 2016, 2020); y-axis: relative quality.] 31/07/2017 KantanFest, Dublin, Ireland
  • 3. Breakthrough in Neural MT
  • 4. Yet another MT paradigm?
  • 5. Yet another MT paradigm?
      - Which technique is faster?
      - Which technique is better?
      - How can I integrate NMT in my pipeline?
      - How can I compare PBSMT and NMT?
      - How can I improve my NMT engine?
      - When to use PBSMT and when NMT?
  • 6. Yet another MT paradigm? (Same questions as on slide 5.) Is NMT better than PBSMT?
  • 7. Yet another MT paradigm? (Same questions as on slide 5.) Can NMT be better than PBSMT?
  • 8. Scientific Rigour – NMT vs PBSMT: various empirical evaluations (since 2015) …
  • 9. Scientific Rigour – NMT vs PBSMT
      - Experiment setup
          - Identical training, test and tuning data
          - NMT training limited to 4 days
          - Evaluation:
              - Automated scores: F-Measure, TER, BLEU
              - Ranking with KantanLQR™, A/B testing
      - Publications and presentations
          - EAMT 2017
          - MT Summit 2017
          - LocWorld34 NMT GALA Track
  • 10. Scientific Rigour – NMT vs PBSMT
      - A small parenthesis… There are so many factors:
          - Learning algorithm and rate
          - Number of epochs
          - ANN properties
          - Data: preprocessing, segmentation
      - You need the right data!
  • 11. Training: Identical Corpora (TWC: total word count; UWC: unique word count)

      Language Arc                   Parallel Sentences          TWC      UWC  Domain(s)
      English->German                         8,820,562  110,150,238  859,167  Legal/Medical
      English->Chinese(Simplified)            6,522,064   84,426,931  956,864  Legal/Technical
      English->Japanese                       8,545,366   87,252,129  676,244  Legal/Technical
      English->Italian                        2,756,185   35,295,535  765,930  Medical
      English->Spanish                        3,681,332   44,917,538  952,089  Legal
  • 12. Training: Automated Scores

                                     SMT                               NMT
      Language Arc                   F-Measure  BLEU    TER     Time   F-Measure  BLEU    TER     Perplexity  Time
      English->German                62.00%     54.08%  54.31%  18h    62.53%     47.53%  53.41%  3.02        92h
      English->Chinese(Simplified)   77.16%     45.36%  46.85%  6h     71.85%     39.39%  47.01%  2.00        10h
      English->Japanese              80.04%     63.27%  43.77%  9h     69.51%     40.55%  49.46%  1.89        68h
      English->Italian               69.74%     56.98%  42.54%  8h     64.88%     42.00%  48.73%  2.70        83h
      English->Spanish               71.53%     54.78%  41.87%  9h     69.41%     49.24%  44.89%  2.59        71h

      “In information theory, perplexity is a measurement of how well a probability distribution or probability model predicts a sample. It may be used to compare probability models. A low perplexity indicates the probability distribution is good at predicting the sample.”
  • 13. Training: Automated Scores. [Bar chart: SMT-FM, SMT-BLEU, SMT-TER, NMT-FM, NMT-BLEU and NMT-TER for each language arc (English->German, English->Chinese(S), English->Japanese, English->Italian, English->Spanish); y-axis from 0 to 90. The accompanying table repeats the scores from slide 12.]
  • 14. Training: Automated Scores. [Same chart and table as on slide 13.]
  • 15. Alternative translations (sentence-level BLEU score after each MT output)
      EN→DE
        Source:    All dossiers must be individually analysed by the ministry responsible for the economy and scientific policy.
        Reference: Jeder Antrag wird von den Dienststellen des zuständigen Ministers für Wirtschaft und Wissenschaftspolitik individuell geprüft.
        PBSMT:     Alle Unterlagen müssen einzeln analysiert werden von den Dienststellen des zuständigen Ministers für Wirtschaft und Wissenschaftspolitik. (BLEU: 58%)
        NMT:       Alle Unterlagen müssen von dem für die Volkswirtschaft und die wissenschaftliche Politik zuständigen Ministerium einzeln analysiert werden. (BLEU: 0%)
      ES→EN
        Source:    En este punto muestro mi desacuerdo con el informe.
        Reference: On this point, I am not in agreement with the report before us.
        PBSMT:     At this point, I am not in agreement with the report. (BLEU: 72%)
        NMT:       In this point I disagree with the report. (BLEU: 7%)
      ES→EN
        Source:    Debemos apoyarles a todos para que alcancen este objetivo.
        Reference: We must give them all our support to reach that goal.
        PBSMT:     We must give them all our support to reach that goal. (BLEU: 100%)
        NMT:       We have to support everyone to achieve this goal. (BLEU: 0%)
  • 16. Ranking: average scores from A/B testing (in percent). [Chart, first build step: only the first series (37, 21, 13, 24, 10; average 21) is shown.]
  • 17. Ranking: average scores from A/B testing (in percent). [Chart, second build step: the second series (24, 21, 34, 19, 28; average 25.2) is added.]
  • 18. Ranking: average scores from A/B testing (in percent). [Chart, complete; series listed in legend order.]

               EN→ZH-CN  EN→JA  EN→DE  EN→IT  EN→ES  Average
      Same     37        21     13     24     10     21
      SMT      24        21     34     19     28     25.2
      NMT      39        58     53     56     62     53.6
  • 19. BLEU underestimation of NMT
      - Take the translations from the NMT engine that were considered better than their PBSMT counterparts.
      - How many of those are scored lower by BLEU than their PBSMT counterparts?
      - Do the same for the PBSMT translations.

               EN→ZH-CN  EN→JP  EN→DE  EN→IT  EN→ES  Average
      NMT      40%       59%    55%    34%    53%    48%
      PBSMT    12%       0%     9%     9%     0%     6%
  • 20. Take-away messages…
      - NMT is a new efficient paradigm for MT
      - NMT does not solve the problem of language
      - NMT can be much better than PBSMT
      - Evaluating NMT:
          - BLEU, TER, F-Measure may underestimate NMT when compared to PBSMT
          - Using KantanLQR™ (A/B Testing) facilitates MT ranking
  • 21. Take-away messages… (as on slide 20, with one addition: NMT does not solve the problem of language … but it is getting there.) To NMT or not to NMT?
  • 22. Quality Evaluation. Thank you…
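
The perplexity definition quoted on slide 12 can be made concrete with a small calculation. The following is a minimal sketch, assuming perplexity is computed as the exponential of the average negative log-probability per token; the function name and the toy probabilities are illustrative only, and the slides do not state how the perplexity column in the table was obtained.

```python
import math


def perplexity(token_probs):
    """Perplexity of a model on a sample, given the probability the model
    assigned to each observed token (assumed definition: exp of the mean
    negative log-probability)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Hypothetical per-token probabilities for two models on the same sample.
confident_model = [0.5, 0.5, 0.5, 0.5]
uncertain_model = [0.1, 0.1, 0.1, 0.1]

print(round(perplexity(confident_model), 2))  # lower perplexity: better prediction
print(round(perplexity(uncertain_model), 2))  # higher perplexity: worse prediction
```

Under this definition, a perplexity of 2 means the model is on average as uncertain as if it were choosing between two equally likely tokens at each step.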

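
The third example on slide 15 shows why an n-gram metric can score a perfectly acceptable paraphrase at 0%. Below is a minimal sketch of an unsmoothed sentence-level BLEU (n-grams up to 4, single reference, whitespace tokenisation, lowercase). These choices are assumptions made for illustration; this is not the scorer used for the talk, and it is not expected to reproduce every percentage on the slide.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clipped matches: an n-gram counts at most as often as in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, one empty n-gram order zeroes the score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty for hypotheses shorter than the reference.
    brevity = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity * math.exp(sum(log_precisions) / max_n)


reference = "We must give them all our support to reach that goal."
pbsmt = "We must give them all our support to reach that goal."
nmt = "We have to support everyone to achieve this goal."

print(sentence_bleu(pbsmt, reference))  # identical to the reference
print(sentence_bleu(nmt, reference))    # valid paraphrase, but no shared bigram
```

The NMT output shares a few single words with the reference but not a single two-word sequence, so the unsmoothed score collapses to zero even though the translation is adequate.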
Editor's notes

  1–13. According to the PBSMT paradigm, a sentence is translated phrase by phrase. The translation of each phrase is derived from a phrase table (i.e., a representation of a translation model). Then these phrase-level translations are combined in a sentence in a way that maximises the likelihood for a correct sentence in the target language (i.e., using a language model). Sometimes a third model is used to fix the casing.
  14–17. (Give the audience 30 seconds to read the translations, then ask which translation they prefer.)
  18. Next, we investigated our hypothesis that BLEU underestimates NMT quality. To do so, we needed to find discrepancies between human evaluation and BLEU scores. First, for each language pair, we took from the set of translations evaluated by the reviewers those where all three reviewers marked NMT as better. Next, within this set we counted the translations whose BLEU score was lower than that of their PBSMT counterparts. Third, we computed the ratio of the two counts. We did the same for PBSMT: take the set of translations judged better, count those with a BLEU score lower than their NMT counterparts, and compute the ratio between the two numbers. Our results show that BLEU is indeed not that reliable for NMT. Furthermore, they indicate that BLEU underestimates NMT quality, thus confirming our hypothesis.

      Now, can we actually trust BLEU? Several remarks are in order. First, the numbers shown in our table are similar across language pairs. This means that the effect of the BLEU underestimation is the same among the NMT engines; that is, we can still compare NMT engines based on BLEU and get a sense of their quality differences. Second, we notice the same tendency in the F-Measure score, which is also a metric based on n-grams. This indicates that the issues indeed arise from the underlying principles of PBSMT and NMT (recall the 2D picture with the points linked to the John/Mary sentences), and it can push future research in quality estimation in a particular direction. Third, something not shown in a table or a graph: remember that our engines were trained under a time restriction. Assume we let the training continue until the neural network reaches its full potential, that is, until it models the training data optimally. Given that the test data is very similar to the training data, the engine would then also model each test sentence very well, even at the phrase level. As such, the scores (BLEU, F-Measure and TER) would improve and get closer to, or even surpass, the PBSMT scores. This statement is supported by other research (e.g., Google's paper from November last year) that reports very good scores, but where each model was trained for almost two weeks.
  19–20. A translation production line nowadays typically combines an MT component with human post-editing. The MT component is simply a means of getting a raw translation of the original text, which is then modified in the next step to meet certain translation quality standards. Even so, the choice of the right MT toolset affects the efficiency of this pipeline.
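
The counting procedure described in note 18 can be sketched in a few lines. This is a minimal illustration, assuming each evaluated segment carries one vote per reviewer and one sentence-level BLEU score per system; the `Segment` structure, the function name and the toy data are hypothetical and do not come from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    human_votes: List[str]  # one vote per reviewer: "NMT", "PBSMT" or "SAME"
    bleu_nmt: float         # sentence-level BLEU of the NMT output
    bleu_pbsmt: float       # sentence-level BLEU of the PBSMT output


def underestimation_ratio(segments, system):
    """Share of segments unanimously preferred for `system` whose BLEU score
    is nevertheless lower than that of the other system."""
    def bleu(seg, name):
        return seg.bleu_nmt if name == "NMT" else seg.bleu_pbsmt

    other = "PBSMT" if system == "NMT" else "NMT"
    # Step 1: keep segments where every reviewer preferred `system`.
    preferred = [s for s in segments
                 if s.human_votes and all(v == system for v in s.human_votes)]
    if not preferred:
        return 0.0
    # Step 2: count those that BLEU scores lower than the other system.
    underrated = [s for s in preferred if bleu(s, system) < bleu(s, other)]
    # Step 3: ratio of the two counts.
    return len(underrated) / len(preferred)


# Hypothetical toy data, for illustration only.
segments = [
    Segment(["NMT", "NMT", "NMT"], bleu_nmt=0.00, bleu_pbsmt=0.58),
    Segment(["NMT", "NMT", "NMT"], bleu_nmt=0.40, bleu_pbsmt=0.30),
    Segment(["PBSMT", "PBSMT", "PBSMT"], bleu_nmt=0.20, bleu_pbsmt=0.50),
    Segment(["NMT", "PBSMT", "SAME"], bleu_nmt=0.50, bleu_pbsmt=0.50),
]

print(underestimation_ratio(segments, "NMT"))
print(underestimation_ratio(segments, "PBSMT"))
```

Segments without a unanimous preference are ignored, which follows the note's restriction to translations that all three reviewers marked as better.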