The document discusses the limitations of automated metrics such as BLEU for evaluating machine translation quality and proposes alternatives such as professional human annotation of machine translation errors using frameworks like MQM/DQF. It presents error profiles of machine translation systems derived from such annotations, which provide more useful information than automated scores for understanding issues and improving translation quality. The document advocates moving beyond fully automated metrics toward human analysis of errors and source-language phenomena, building test suites that relate source-side barriers to target-side errors and thereby improve machine translation systems.
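To make the contrast concrete, the sketch below computes a corpus-level BLEU score and then builds a small MQM/DQF-style error profile from human annotations. It assumes the `sacrebleu` package; the example sentences, error categories, and severity weights are illustrative placeholders, not the canonical MQM penalty table.

```python
# A minimal sketch contrasting an automated BLEU score with an
# MQM/DQF-style error profile. Assumes `sacrebleu` is installed;
# annotations and severity weights below are illustrative only.
from collections import Counter

import sacrebleu

hypotheses = ["The cat sit on the mat .", "He go to school yesterday ."]
references = [["The cat sits on the mat .", "He went to school yesterday ."]]

# BLEU yields a single opaque number: it says the output is imperfect,
# but not why or where.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.1f}")

# Hypothetical MQM/DQF-style annotations from a human reviewer,
# each a (category, severity) pair.
annotations = [
    ("Fluency/Grammar", "minor"),
    ("Fluency/Grammar", "major"),
    ("Accuracy/Mistranslation", "major"),
]

# The error profile: per-category counts show *what* kinds of issues
# dominate, which is actionable for improving the system.
profile = Counter(category for category, _ in annotations)
for category, count in profile.most_common():
    print(f"{category}: {count}")

# Illustrative severity weights (not the official MQM defaults):
# fold the annotations into a single per-word quality score.
weights = {"minor": 1, "major": 5, "critical": 10}
word_count = sum(len(h.split()) for h in hypotheses)
penalty = sum(weights[severity] for _, severity in annotations)
mqm_style_score = 1 - penalty / word_count
print(f"MQM-style quality score: {mqm_style_score:.2f}")
```

Unlike the single BLEU number, the category counts point directly at recurring problems (here, grammar errors), which is the kind of diagnostic signal the document argues for.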