TAUS ROUNDTABLE 2014
22 May / Moscow (Russia)
Thursday, 22 May / 11:20 – 11:50
Readability: Cornerstone of Google’s L10N Quality
Evaluation System
Maxim Lobanov, Google
Maxim Lobanov
Senior Language Specialist at Google
For Czech, Hungarian, Romanian, Russian, Slovak, and Ukrainian languages
mlobanov@google.com
Localization Quality Evaluation (LQE)
What: Measure quality of translations provided by vendors
How: Error-typology-based review of translations
Who: Professional qualified linguists
LQE tool
– An internal tool to evaluate translation quality
– Used with Google Translator Toolkit
– All LQE results are accumulated in the LQE server
Types of errors
Projects are rated according to 6 categories of linguistic errors:
– Compliance (deviation from style guide or instructions)
– Grammar (violation of language rules)
– Meaning (translation has a different meaning than the source)
– Punctuation / Spelling (typos, punctuation, etc.)
– Terminology (deviation from glossary/term database)
– Readability (poor flow of speech, unnatural wording, etc.)
Categories that are not errors:
– Client edits (corrections that are not the translator's fault)
– Kudos (to mark excellent translations)
Each error type has 3 severity levels with different weights
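The slide does not give the severity names or their weights, so the following is only a minimal sketch of how a weighted error tally per category might look. The severity labels (`minor`, `major`, `critical`), the weight values, and the function name `weighted_penalties` are all assumptions made for illustration; only the category names come from the slide.

```python
from enum import Enum


class Category(Enum):
    # The six error categories named on the slide.
    COMPLIANCE = "Compliance"
    GRAMMAR = "Grammar"
    MEANING = "Meaning"
    PUNCTUATION_SPELLING = "Punctuation / Spelling"
    TERMINOLOGY = "Terminology"
    READABILITY = "Readability"
    # Categories that are recorded but do not count as errors.
    CLIENT_EDIT = "Client edits"
    KUDOS = "Kudos"


NON_ERROR_CATEGORIES = {Category.CLIENT_EDIT, Category.KUDOS}

# Hypothetical severity names and weights; the real values are not in the slides.
SEVERITY_WEIGHTS = {"minor": 1.0, "major": 2.0, "critical": 3.0}


def weighted_penalties(annotations):
    """Sum the severity weights per error category.

    `annotations` is an iterable of (Category, severity) pairs.
    Client edits and kudos are skipped because they are not errors.
    """
    totals = {}
    for category, severity in annotations:
        if category in NON_ERROR_CATEGORIES:
            continue
        totals[category] = totals.get(category, 0.0) + SEVERITY_WEIGHTS[severity]
    return totals


sample = [
    (Category.READABILITY, "major"),
    (Category.GRAMMAR, "minor"),
    (Category.KUDOS, "minor"),
    (Category.READABILITY, "minor"),
]
penalties = weighted_penalties(sample)
```

With these assumed weights, the two Readability annotations add up to a larger penalty than the single Grammar annotation, and the Kudos entry does not contribute at all.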
Key LQE metrics - LQE score
Errors per 1000 words
– Number of errors normalized to a 1,000-word document
– Pass threshold: projects with fewer than 3 errors per 1,000 words are good enough to be published
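The normalization above is simple arithmetic, sketched below. The function names and the constant `PASS_THRESHOLD` are hypothetical, and the slides do not say whether the error count is a raw count or a severity-weighted sum, so the sketch accepts either.

```python
# Threshold taken from the slide: fewer than 3 errors per 1,000 words passes.
PASS_THRESHOLD = 3.0


def errors_per_thousand(error_count: float, word_count: int) -> float:
    """Normalize an error count to a document of 1,000 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return error_count * 1000 / word_count


def passes_lqe(error_count: float, word_count: int,
               threshold: float = PASS_THRESHOLD) -> bool:
    """Return True when the normalized score is strictly below the threshold."""
    return errors_per_thousand(error_count, word_count) < threshold


# Example: 5 errors in a 2,500-word project.
score = errors_per_thousand(5, 2500)
ok = passes_lqe(5, 2500)
```

In this example the project scores 2 errors per 1,000 words and passes. A project with exactly 3 errors per 1,000 words does not pass under this reading, because the slide says "less than 3".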
"LQE score vs. real-life quality" mystery
Good LQE score ≠ good quality...
– Even with no regular errors (grammar, punctuation, spelling, terminology, etc.), a translation could still be bad
Readability!
– Error category to mark inauthentic or awkward translations
– Literal translations
– Poor tone and style
– Poor knowledge of target audience
– Dull wordy texts
– No local flavor
– Lack of effort to get context and understand the products
Readability is VIP
– Really adds value to localization quality
– Helps end-users
– Helps Googlers
– Improves the writing skills of translators
– Increases attention to detail
– Makes the translation job meaningful
– Helps the industry
And even more!
– This is one of the drivers behind Google's mission:
to organize the world's information and make it universally accessible
and useful.
Challenges
– Mentality
– Consistency
– Lack of standards
– Context
– Churn
– Underuse
What's next
– Further standardization
– More training and analysis
– LQE score reflecting real-life quality
– Readability and LQE applied to MT
Questions?

