Machine Translation Quality - Are We There Yet? - Olga Beregovaya (Welocalize)

MT Quality – LSP Perspective
Olga Beregovaya
VP, Technology Solutions

• What do translators appreciate?
• What do translators struggle most with?
• Engineering – impact on quality?
• Final output quality?
In Translators’ Own Words

THE POST-EDITOR PRODUCES:
Publishable quality
The post-editor is responsible for ensuring that client quality requirements and the style guide are met
The post-editor is expected to adhere to client style guide preferences with regard to:
Infinitive / Imperative
Passive / Active
Formal / Informal
Different Styles for Headers, Lists, Tables
Special Handling of UI Options (Bilingual, English, Target?)
Converting All the Measurements Based On the Local Conventions
+ Disambiguate Terminology
+ Correct all the grammatical errors
But does the machine produce sufficient output?

THE POST-EDITOR RECEIVES:
ERROR CATEGORY                    GERMAN    FRENCH  JAPANESE   RUSSIAN   CHINESE   SPANISH   ITALIAN BRAZILIAN
WRONG TERMINOLOGY                   6.46      4.93     13.63      5.00      6.20      9.63      3.78      1.13
WRONG SPELLING                      2.00      0.86      0.88      0.13      0.30      1.13      0.56      1.27
SOURCE NOT TRANSLATED               6.38      5.36      3.88      5.13      3.60      2.50      1.22      1.73
COMPLIANCE WITH CLIENT SPECS        2.46      0.86      3.00      2.13      0.70      0.63      0.44      2.60
LITERAL TRANSLATION                 7.85      8.64      5.00      4.00      9.40      5.38      7.67      7.93
TEXT/INFO ADDED                     2.69      1.36      2.13      1.25      0.80      1.88      0.44      0.80
CAPITALIZATION                      2.69      3.43      0.00      2.63      0.50      1.75      3.33      2.60
WRONG WORD FORM                     6.77      7.79      0.13      9.88      0.60      6.75      3.67      6.75
WRONG PART OF SPEECH                2.62      3.21      2.00      1.88      0.60      2.13      3.67      1.33
PUNCTUATION                         4.46      3.00      0.75      3.38      4.10      2.13      1.22      3.53
SENTENCE STRUCTURE                 12.54     10.00     14.25      8.00     13.00      5.38      6.11      3.67
TAGS + MARK-UP                      1.23      0.14      0.13      0.50      0.20      0.38      0.44      0.20
LOCALE ADAPTATION                   0.46      0.29      0.75      0.63      0.20      0.75      0.44      0.13
SPACING                             0.92      0.36      2.25      1.25      4.00      0.50      0.33      0.40
OTHER                               1.92      1.50      1.88      0.13      0.50      0.13      1.44      0.27
TOTAL ERRORS                       61.46     51.71     50.63     45.88     44.70     41.00     34.78     32.53

The most time-consuming issues that translators need to fix are:
• Sentence structure (word order)
• MT output too literal
• Wrong terminology
• Word form disagreements
• Source term left untranslated
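
The ranking above can be cross-checked against the error table with a few lines of Python. This is only a sketch: it averages each category's figures across the eight languages and sorts them, on the assumption that a higher average count is a rough proxy for post-editing time. The slide does not say how "time-consuming" was measured, and the dictionary layout and the function name `rank_categories` are illustrative.

```python
from statistics import mean

# Figures copied from the error table, in column order:
# German, French, Japanese, Russian, Chinese, Spanish, Italian, Brazilian.
ERRORS = {
    "Wrong terminology":            [6.46, 4.93, 13.63, 5.00, 6.20, 9.63, 3.78, 1.13],
    "Wrong spelling":               [2.00, 0.86, 0.88, 0.13, 0.30, 1.13, 0.56, 1.27],
    "Source not translated":        [6.38, 5.36, 3.88, 5.13, 3.60, 2.50, 1.22, 1.73],
    "Compliance with client specs": [2.46, 0.86, 3.00, 2.13, 0.70, 0.63, 0.44, 2.60],
    "Literal translation":          [7.85, 8.64, 5.00, 4.00, 9.40, 5.38, 7.67, 7.93],
    "Text/info added":              [2.69, 1.36, 2.13, 1.25, 0.80, 1.88, 0.44, 0.80],
    "Capitalization":               [2.69, 3.43, 0.00, 2.63, 0.50, 1.75, 3.33, 2.60],
    "Wrong word form":              [6.77, 7.79, 0.13, 9.88, 0.60, 6.75, 3.67, 6.75],
    "Wrong part of speech":         [2.62, 3.21, 2.00, 1.88, 0.60, 2.13, 3.67, 1.33],
    "Punctuation":                  [4.46, 3.00, 0.75, 3.38, 4.10, 2.13, 1.22, 3.53],
    "Sentence structure":           [12.54, 10.00, 14.25, 8.00, 13.00, 5.38, 6.11, 3.67],
    "Tags + mark-up":               [1.23, 0.14, 0.13, 0.50, 0.20, 0.38, 0.44, 0.20],
    "Locale adaptation":            [0.46, 0.29, 0.75, 0.63, 0.20, 0.75, 0.44, 0.13],
    "Spacing":                      [0.92, 0.36, 2.25, 1.25, 4.00, 0.50, 0.33, 0.40],
    "Other":                        [1.92, 1.50, 1.88, 0.13, 0.50, 0.13, 1.44, 0.27],
}


def rank_categories(errors, top_n=5):
    """Return the top_n category names, ordered by average across languages."""
    averages = {name: mean(values) for name, values in errors.items()}
    return sorted(averages, key=averages.get, reverse=True)[:top_n]


print(rank_categories(ERRORS))
```

On these figures the five highest averages are the same five categories the slide lists, in the same order.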
OR, IN A NUTSHELL…

The Translator Gains…
Productivity gains range from 56% down to negative (a net loss), depending on:
- Content type
- Engine output quality
- How fast human translation (HT) already is, and how much MT helps
- Correlation?
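
One common way to express such a gain is the relative change in throughput between translating from scratch and post-editing. The sketch below assumes that definition; the slides do not state how the 56% figure was computed, and the function name and the words-per-hour figures are hypothetical.

```python
def productivity_gain(ht_words_per_hour: float, pe_words_per_hour: float) -> float:
    """Percentage change in throughput when post-editing MT instead of translating from scratch."""
    if ht_words_per_hour <= 0:
        raise ValueError("human translation speed must be positive")
    return 100 * (pe_words_per_hour - ht_words_per_hour) / ht_words_per_hour


# Hypothetical throughputs, chosen only to illustrate both ends of the range.
print(productivity_gain(500, 780))  # a gain: post-editing is faster than HT
print(productivity_gain(500, 450))  # a negative gain: MT slowed the translator down
```

A fast human translator raises the baseline, so the same post-editing speed yields a smaller gain, which is one reason the range is so wide.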

TOP 6 ON THE TRANSLATORS’ LOVE IT-LIST
1. Source of inspiration: reduces thinking and translation choice time
2. Provides reference: very useful to translators new to a specific domain
3. Reduces typing and lookup time by handling repetitive terminology and structures well
4. …thereby takes away the more monotonous efforts of translation
5. Post-editors notice improvements over time, and appreciate it more if they ‘co-own’ the engine
6. MT output can be funny
LOL!

TOP 3 ON THE TRANSLATORS’ S*#!T-LIST
1. Wrong sentence structure
• Major impact on the post-editing effort (Spanish and Portuguese produce the fewest errors)
• Japanese has the highest error rate and the lowest productivity gains (supported by the cognitive effort error ranking research)
2. Wrong and inconsistent terminology
• Very time-consuming to check and fix terminology, on top of the issues that fuzzy matches already bring
• A major problem for new products where the terminology is not settled yet
• Inconsistent output for UI references
3. Correct MT to an agreed standard (=quality expectations)
• A challenging concept in the beginning for post-editors: they think they should edit less if the quality is bad
S*#!T

FEEDBACK LOOP – Essential!
SOURCE TEXT: Single-phase options range from 1.4kW to 7.7kW while three-phase PDUs, packed with output receptacles, range from 8.6kW to 21.6kW.
MT OUTPUT: Single-fase 7.7kW Opties variëren van 1.4kW om en driefasige PDU's, boordevol Output-aansluitingen, variëren van 8,6 kW tot 21.6kW.
POST-EDITED OUTPUT: 1,4 kW ... 7,7 kW ... 21,6 kW
SPECIFIC ERRORS/CHANGES MADE: Numbers and measurement units are not converted properly and no spaces inserted by MT engine (3 out of 4 occurrences, 1 is correct however, strange...)

SOURCE TEXT: Single-phase options range from 1.4kW to 7.7kW while three-phase PDUs, packed with output receptacles, range from 8.6kW to 21.6kW.
MT OUTPUT: • Biedt maximaal 24 TB <fmt id="1" tooltip="SUPERSCRIPT" endtooltip="SUPERSCRIPT"> 2 </fmt> maximale capaciteit per-uitbreidingsbehuizing toe te voegen.
POST-EDITED OUTPUT: • Biedt een maximale capaciteit van 24 TB<fmt id="1" tooltip="SUPERSCRIPT" endtooltip="SUPERSCRIPT">2</fmt> per uitbreidingsbehuizing.
SPECIFIC ERRORS/CHANGES MADE: No space should be inserted in front of and behind a number in superscript (in this case a "2"). ...>2<... and not: > 2 <

SOURCE TEXT: <fmt id="1" tooltip="b" endtooltip="b">Interface Speed:</fmt> 6 Gb/s SAS
MT OUTPUT: <fmt id="1" tooltip="b" endtooltip="b"> Interfacesnelheid: 6 </fmt> Gb/s SAS
POST-EDITED OUTPUT: • Biedt een maximale capaciteit van 24 TB<fmt id="1" tooltip="SUPERSCRIPT" endtooltip="SUPERSCRIPT">2</fmt> per uitbreidingsbehuizing.
SPECIFIC ERRORS/CHANGES MADE: The number is inserted before the tag and should be after the tag

SOURCE TEXT: <fmt id="1" tooltip="b" endtooltip="b">Intermixed Drive Capacities:</fmt> Yes
MT OUTPUT: <fmt id="1" tooltip="b" endtooltip="b"> Intermixed Capaciteit van de schijven: Ja </fmt>
POST-EDITED OUTPUT: ...</fmt> Ja
SPECIFIC ERRORS/CHANGES MADE: The string is inserted before the tag and should be after the tag (and again spacing before and after tags inserted)

SOURCE TEXT: A new feature — DR Rapid Data Access — adds tighter integration with backup software applications, starting with Symantec OpenStorage-enabled backup applications.
MT OUTPUT: Een nieuwe functie - DR-Rapid Data Access - voegt strakkere integratie met back-uptoepassingen, beginnend met Symantec OpenStorage geschikte back-uptoepassingen.
POST-EDITED OUTPUT: ... — DR Rapid Data Access — ...
SPECIFIC ERRORS/CHANGES MADE: Please ensure any special characters like — (ChrW(151)) are preserved when inserting a TM proposal, and not replaced by a normal hyphen (ChrW(45)).

Can these errors be learned and corrected automatically? Can we simplify or omit the “feedback loop”?
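
Some of the errors reported above are regular enough that rule-based post-processing could plausibly catch them. The Python sketch below is an illustration under stated assumptions, not the approach used in the program described here: the regular expressions, the unit list (`kW`, `TB`) and all function names are hypothetical, and the Dutch convention (decimal comma, space before the unit) is inferred from the post-edited examples. Content that lands on the wrong side of a tag is not handled.

```python
import re

# Assumed Dutch convention: decimal comma plus a space before the unit.
# The unit list is illustrative, not exhaustive.
_NUMBER_UNIT = re.compile(r"(\d+)[.,](\d+)\s*(kW|TB)\b")

# Whitespace directly inside a <fmt ...>...</fmt> pair.
_FMT_PADDING = re.compile(r"(<fmt\b[^>]*>)\s*(.*?)\s*(</fmt>)")

EM_DASH = "\u2014"


def fix_measurements(text: str) -> str:
    """Rewrite e.g. '21.6kW' as '21,6 kW'."""
    return _NUMBER_UNIT.sub(r"\1,\2 \3", text)


def trim_fmt_padding(text: str) -> str:
    """Remove spaces the engine inserted just inside <fmt> tags."""
    return _FMT_PADDING.sub(r"\1\2\3", text)


def dropped_em_dashes(source: str, target: str) -> int:
    """Count em dashes present in the source but missing from the target."""
    return max(0, source.count(EM_DASH) - target.count(EM_DASH))


print(fix_measurements("variëren van 8,6 kW tot 21.6kW"))
print(trim_fmt_padding('24 TB <fmt id="1"> 2 </fmt>'))
print(dropped_em_dashes("A new feature \u2014 DR Rapid Data Access \u2014 adds",
                        "Een nieuwe functie - DR-Rapid Data Access - voegt"))
```

Rules like these cover only the mechanical cases. Terminology, word order and literal renderings still depend on post-editor feedback reaching the engine.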

POST-EDITING QUALITY RESULTS
No fails on one of our 28-language PE programs, thanks to correct terminology choices and few, consistent errors.

LOCALIZATION TAG PLACEMENT
This is what a plain-text engine will do:
To become verified and lift your sending limit, please confirm your email
address, then add a credit or prepaid card to your account and {30} {31}
{32} {33} {34} {35}confirm{36} {37} {38} it.{39}.
{30}Para hacerse verificado y levantar su límite de envío, por favor
confirme su dirección de correo electrónico, luego añada un crédito o
tarjeta de prepago a su cuenta de y
confírmelo.{31}{32}{33}{34}{35}{36}{37}{38}{39}
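
A simple QA check can flag this kind of output. The sketch below assumes placeholders always have the `{number}` form shown above. It reports pairs of neighbouring placeholders that enclose text in the source but sit directly next to each other in the MT output, which suggests the engine pushed them aside instead of placing them around the translated words. The function names are hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"\{(\d+)\}")


def wrapping_pairs(segment: str) -> set:
    """Return (left, right) ids of neighbouring placeholders with text between them."""
    matches = list(PLACEHOLDER.finditer(segment))
    pairs = set()
    for left, right in zip(matches, matches[1:]):
        between = segment[left.end():right.start()]
        if between.strip():  # ignore pairs separated only by whitespace
            pairs.add((int(left.group(1)), int(right.group(1))))
    return pairs


def collapsed_pairs(source: str, target: str) -> list:
    """Pairs that enclose text in the source but no longer do in the target."""
    return sorted(wrapping_pairs(source) - wrapping_pairs(target))


source = ("To become verified and lift your sending limit, please confirm your "
          "email address, then add a credit or prepaid card to your account and "
          "{30} {31} {32} {33} {34} {35}confirm{36} {37} {38} it.{39}.")
target = ("{30}Para hacerse verificado y levantar su límite de envío, por favor "
          "confirme su dirección de correo electrónico, luego añada un crédito o "
          "tarjeta de prepago a su cuenta de y "
          "confírmelo.{31}{32}{33}{34}{35}{36}{37}{38}{39}")

print(collapsed_pairs(source, target))
```

For the segment above, the check reports the pairs around "confirm" and "it.", the two spans whose formatting was lost in the Spanish output.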

This is a<ph id="1" x="<b>">{1}</ph>test<ph id="1"
x="</b>">{2}</ph>
Dies ist ein <ph id="1" x="<b>">{1}</ph>Test<ph id="1"
x="</b>">{2}</ph>.
AND THIS IS WHAT’S NEEDED

• More transparency into the workings of the engine and its training
• Faster systems, shorter turnaround on large systems
• More “wizards” for training and deployment
• Easier testing methodologies without full deployments
• More standardized scoring and comparison metrics
• Predictive analysis of quality – confidence and utility scores
• Normalization integrated into workflow and standardized
• Industry-wide proper name and title library
• Better transliteration standards
• Morphologically aware terminology choices
• More research on post-editing environments
1. How to display source/target
2. How to display multiple suggestions
3. Autocomplete
4. Better ways to calculate the productivity improvements with post-editing
• More interoperability, so translators can stay in the CAT tool they prefer
• Simplified workflows connecting MT engines and other tools
Translator Wishlist
