- MT@EC is a machine translation system developed by the European Commission to provide automated translations for all 24 official EU languages.
- It was launched in 2013 to address the growing translation needs of the EU, which far exceed the translation capacity of the Commission.
- MT@EC serves two scenarios: understanding texts written in other languages (inbound), and dissemination (outbound), where it helps human translators draft translations more efficiently.
- The system continues to be improved through customization pilots with public institutions and by incorporating translator feedback to enhance quality over time.
TAUS MT Showcase, MT@EC for European public administrations and online services, Spyridon Pilos, European Commission
1. MT@EC for European Public Administrations and Online Services
Spyridon Pilos, European Commission
TAUS Machine Translation Showcase 2014
Wednesday, 4 June, Dublin (Ireland)
The research within the project MosesCore leading to these results has received funding from the European Union 7th Framework Programme, grant agreement no 288487
2. MT@EC
European Commission machine translation
for public administrations and digital services
in the European Union
Spyridon Pilos
Head of Language applications, IT unit
Directorate-General for Translation (DGT)
Dublin, 4 June 2014
3. European Commission machine translation
• European Commission and languages
• MT@EC: machine translation for EU users
• What next?
6. Why do we need machine translation?
• The Commission…
• DGT has 1700 translators
• Over 2 M pages translated in 2013
• But…
…just to make europa.eu fully multilingual
almost 6.8 M documents to be translated
or 8 500 translators/year!
The result:
Thousands of non-translated documents
(and this does not include user generated content)
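The figures on this slide can be checked with a few lines of arithmetic. The sketch below only restates the numbers quoted above; the variable names are illustrative, and the slide mixes units (pages for 2013 output, documents for the backlog) without giving a conversion, so the two rates are not directly comparable.

```python
# Figures quoted on the slide.
translators = 1_700
pages_2013 = 2_000_000              # "over 2 M pages", treated as a lower bound
backlog_documents = 6_800_000       # documents needed for a fully multilingual europa.eu
translator_years_needed = 8_500     # the slide's estimate for that backlog

# Observed output per translator in 2013, in pages.
pages_per_translator = pages_2013 / translators

# Throughput implied by the slide's backlog estimate, in documents.
documents_per_translator_year = backlog_documents / translator_years_needed

# Years the current staff would need at that implied rate,
# assuming they translated nothing else.
years_for_current_staff = translator_years_needed / translators

print(round(pages_per_translator))      # 1176
print(documents_per_translator_year)    # 800.0
print(years_for_current_staff)          # 5.0
```

In other words, the backlog estimate corresponds to five times the current translator headcount working for a full year.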
7. MT and EC: a long history
Started in the 1970s
• Eurotra (78-92): research, high expectations
• Rule-based ECMT (75-97): costly to develop, not scalable
(18 language pairs in 20 years; coverage of post-2004
languages never attempted; system shut down in 2010)
Data-driven systems (statistical MT):
• cheap and quick to develop… if you have good data
• EC needs solution for all EU languages… and has good data
EC action plan (2009), Inter-service task force (2010)
• The goal: MT@EC offering machine translation for all
languages to and from English, operational in July 2013
8. Two Usage Scenarios for MT@EC: MT for understanding (inbound)
[Diagram: MT between L1 and L2, L3, …, Ln]
Robustness, Coverage
Practically unlimited demand; free web-based services cover much of it
Requirements for MT@EC
• Provide MT as a (simple and robust) service
• Optimise quality for understandability (gisting)
• Deal with many domains, document types, formats, …
• Scale to huge volumes
9. Two Usage Scenarios for MT@EC: MT for dissemination (outbound)
[Diagram: MT between L1 and L2, L3, …, Ln]
Textual quality
Publishable quality can only be authored by humans; Translation Memories & CAT-Tools used by professional translators
Requirements for MT@EC
• Provide MT as a tool within a CAT workflow
• Develop new ways to incorporate feedback
  - explicit feedback on MT quality, implicit feedback via TM
  - improvements requiring language-specific knowledge
  - towards hybrid approaches
• Optimise quality for post-editing
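To make "MT as a tool within a CAT workflow" and "implicit feedback via TM" concrete, here is a minimal sketch of one common pattern: offer a translation memory match when one is close enough, fall back to machine translation otherwise, and store the translator's final version back in the memory. Everything in it is an assumption for illustration: the function names, the 0.75 threshold, the dictionary-based memory and the use of Python's `difflib` for similarity. It does not describe how MT@EC or any particular CAT tool works.

```python
from difflib import SequenceMatcher


def best_tm_match(segment, translation_memory):
    """Return (score, source, target) for the most similar TM entry, or None."""
    best = None
    for source, target in translation_memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    return best


def suggest(segment, translation_memory, machine_translate, threshold=0.75):
    """Prefer a close TM match; otherwise ask the MT engine (hypothetical callable)."""
    match = best_tm_match(segment, translation_memory)
    if match is not None and match[0] >= threshold:
        return {"origin": "TM", "score": match[0], "text": match[2]}
    return {"origin": "MT", "score": None, "text": machine_translate(segment)}


def record_post_edit(segment, final_translation, translation_memory):
    """Implicit feedback: the translator's final version becomes a TM entry."""
    translation_memory[segment] = final_translation


# Illustrative data only; the stand-in "engine" just tags its input.
memory = {"The Commission adopted the proposal.":
          "La Commission a adopté la proposition."}
fake_engine = lambda text: "[MT] " + text

print(suggest("The Commission adopted the proposal.", memory, fake_engine)["origin"])
print(suggest("Hello", memory, fake_engine)["origin"])
```

The point of the design is that post-edited segments feed the memory, so the same sentence is served from the TM the next time instead of being machine-translated again.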
10. MT@EC: a European Commission product
• Released: 26 June 2013 (version 1.0)
• Languages: all 24 EU official languages, 552 language pairs (61 direct)
• Technology: statistical machine translation using the open source software Moses, co-funded by EU Framework Programmes for research and innovation
• Development: by DGT, 2010-2013, co-funded by the ISA* programme (action 2.8)
* ISA: Interoperability solutions for public administrations
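The figure of 552 language pairs follows from counting every ordered (source, target) combination of the 24 languages. The sketch below reproduces that count and shows one way a request without a direct engine could be routed. The slide does not say which 61 pairs are direct or how the others are served; pivoting through English and the `route` helper are assumptions made here for illustration.

```python
from itertools import permutations

# ISO 639-1 codes of the 24 official EU languages.
EU_LANGUAGES = [
    "bg", "cs", "da", "de", "el", "en", "es", "et", "fi", "fr", "ga", "hr",
    "hu", "it", "lt", "lv", "mt", "nl", "pl", "pt", "ro", "sk", "sl", "sv",
]

# Every ordered (source, target) pair of distinct languages.
all_pairs = list(permutations(EU_LANGUAGES, 2))

# Pairs with English on one side ("to and from English").
english_pairs = [(s, t) for s, t in all_pairs if "en" in (s, t)]

print(len(all_pairs))      # 552
print(len(english_pairs))  # 46


def route(source, target, direct_pairs):
    """Return the engine hops for a request.

    Assumption: pairs without a direct engine are pivoted through English.
    """
    if (source, target) in direct_pairs:
        return [(source, target)]
    return [(source, "en"), ("en", target)]


# Using only the English pairs as "direct" for this illustration.
print(route("fr", "de", set(english_pairs)))
```

Pairs with English account for 46 of the 61 direct pairs; the remaining 15 are not identified on the slide.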
11. MT@EC description
• Delivery:
  - web user interface (human to machine)
  - web services (machine to machine)
• Security: host (EC data centre) + access (ECAS) + transfer (sTesta)
• Special features:
  - Source document format/formatting maintained
  - Specific output formats for translation: tmx and xliff
  - Can translate multiple documents to multiple languages
  - Translation can also be returned by email
  - Indication of quality for language pairs
  - Feedback mechanism
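The tmx output format mentioned above is an XML exchange format for translation memories. As a rough illustration of what such a file contains, the sketch below builds a minimal TMX 1.4 document with Python's standard library. The `build_tmx` helper, the header values and the sample segment are assumptions for this example; the slide does not show MT@EC's actual output.

```python
import xml.etree.ElementTree as ET

# Qualified name of the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(segments, source_lang, target_lang):
    """Serialise (source, target) text pairs as a minimal TMX 1.4 string."""
    tmx = ET.Element("tmx", version="1.4")
    # Header attributes required by TMX 1.4; the values are placeholders.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source_text, target_text in segments:
        # One translation unit per segment, with one variant per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, source_text), (target_lang, target_text)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Illustrative segment pair, not taken from any real translation memory.
print(build_tmx([("Machine translation", "Traduction automatique")], "en", "fr"))
```

Because each translation unit pairs a source segment with its translation, a file like this can be loaded into a CAT tool as a translation memory for post-editing.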
12. Quality evaluation and improvement…
• “Maturity Check” (April-May 2011)
• Can baseline MT engines already be used as such?
• Identify main sources of problems for various languages,
cluster them across languages
• Real-life trial (July 2011-June 2013)
• Make first MT results available to translators
• Auto-MT for 10..19 “best” language pairs (now: all)
• On-demand MT for others (now all languages get MT)
• Automatic scores
• BLEU scores for internal tuning and regression testing
• Can help to identify domains/document types where MT
is most useful, but also point to systematic difficulties
… with the help of DGT translators
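For readers unfamiliar with the BLEU scores mentioned above, the sketch below shows the core of the metric: clipped n-gram precision for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. It is a simplified version that assumes one reference per hypothesis, whitespace tokenisation and no smoothing; it is not the scoring setup used by DGT, and the example sentences are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with a single reference per hypothesis (token lists)."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clipping: an n-gram is credited at most as often as it
            # occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with no match gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalise output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


reference = "the commission adopted the proposal on friday".split()
hypothesis = "the commission adopted the proposal on monday".split()
print(corpus_bleu([hypothesis], [reference]))
```

For regression testing, the same held-out test set would be scored before and after an engine change; a drop in the score flags the change for closer inspection.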
13. Maturity check 2011 (EN->X)
[Figure: bar chart, 0% to 100%, showing the share of useful versus useless MT sentences per target language: ES, FR, IT, PT, RO, DE, DA, NL, SV, BG, CS, PL, SK, SL, EL, MT, LT, LV, ET, FI, HU. Annotations group the languages by family (Romance, Germanic, Slavic, Baltic, Semitic, Hellenic, Finno-Ugric) and by morphology (analytic, inflected, highly inflected, composita, strong agglutination).]
DGT's SMT maturity check outcome as a (useful/useless) sentences ratio + morphology
Language differences
14. From the translator's point of view
+ Aid for typing
+ time savings
+ “original” proposed solution
+ guides the terminological research
- gender/number and word order
- can be "fluent", but with mistranslations
- omissions and additions
- risk of error when incorrect terminology is suggested
- quality dependent on the quality of the originals
15. MT@EC is already available to…
• … the staff of European institutions and bodies:
  - European Commission
  - European Parliament
  - Council of the European Union
  - European Court of Justice
  - Court of Auditors
  - Economic and Social Committee
  - Committee of the Regions
  - European Central Bank
  - European Investment Bank
  - Translation Centre
• … and more
→ DGT took into account the needs of translators and other staff when designing the service
20. MT@EC is also integrated into EC digital services
→ operational
Service: Description/URL
IMI: Internal Market Information System.
http://ec.europa.eu/internal_market/imi-net/index_en.html
SOLVIT: an online problem-solving network concerning misapplication of Internal Market law by public authorities.
http://ec.europa.eu/solvit/
→ DGT supports and advises for better integration on the customer side
21. Integration into EC digital services
→ under development (indicative list)
Service: Description/URL
nLex: A common gateway to National Law.
http://eur-lex.europa.eu/n-lex/
TED: TED (Tenders Electronic Daily) is the online version of the 'Supplement to the Official Journal of the European Union', dedicated to European public procurement.
http://ted.europa.eu/
e-Justice: The future electronic one-stop-shop in the area of justice.
http://e-justice.europa.eu/
Joinup: An open collaborative platform supporting interoperability in Europe.
https://joinup.ec.europa.eu/
22. Integration into EC digital services
→ initiated (indicative list)
Service: Description/URL
ODR: Platform to facilitate the out-of-court resolution of consumer disputes (Alternative Dispute Resolution).
http://ec.europa.eu/consumers/redress_cons/adr_en.htm
EURES: The European Job Mobility portal, networking the European employment services.
https://ec.europa.eu/eures/
EQF: The portal supporting the implementation of the European Qualifications Framework for lifelong learning.
http://ec.europa.eu/eqf/home_en.htm
ESCO: The multilingual classification of European Skills, Competences, Qualifications and Occupations; identifies and categorises skills and competences, qualifications and occupations in 22 European languages. Supports EURES and other similar portals.
https://ec.europa.eu/esco/
23. MT@EC for public administrations
Free real-life trial in 2014:
• Staff can have direct free access to the standard MT@EC service (upon request)
• Organisations can participate in a "customisation" pilot project, where DGT builds specific engines with their data (based on bilateral cooperation agreements)
→ DGT to understand better their needs and constraints and develop appropriate service delivery models
24. Customisation pilots
• Pilot A: Connect an information system to the standard MT@EC service.
• Pilot B: DGT builds custom engines (with the organisation's data), available to all through MT@EC.
• Pilot C: DGT builds custom engines (with the organisation's data), available only to that organisation through MT@EC.
• Pilot D: DGT builds custom engines (with the organisation's data) for the organisation to run on its own premises.
• Pilot E: DGT assists the organisation in building its own custom engines to run on its own premises.
25. MT@EC: right for the EU
Quality:
• built on data derived from EU translations
(Euramis translation memory system: 800 M segments in
24 languages and annual growth rate > 20% )
• designed for EU relevant collaboration
• team of computational linguists working with
translators and linguists in DGT
• work to improve MT for all EU languages
Security
Customer support
26. MT@EC: what next
• CEF (Connecting Europe Facility)
• A funding programme for building and deploying
infrastructures.
• Includes deploying mature technologies to build, enable and
operate pan-European Digital Services.
• Includes an Automated Translation (AT) platform as one of
its core building blocks for digital services.
• A key component of the AT platform is MT@EC.
27. The automated translation platform
• To facilitate cross-border information exchange and
enable cross-border access to online content and
services provided by the digital service infrastructures
of the CEF.
• To offer MT services to EU institutions and public
administrations in the Member States.
• To build on the existing Commission Machine
Translation service (MT@EC)
• Emphasis is placed on secure, quality, customisable
machine translation.
→ Follow this space:
http://ec.europa.eu/digital-agenda/en/connecting-europe-facility