The document summarizes a presentation on human language technologies in a multilingual Europe. Some key points:
- There are 24 official EU languages and many regional/minority languages that have equal status but most are under-supported by language technologies and face digital extinction.
- The META-NET alliance coordinates language technology research across Europe but the field remains fragmented. There is a need for high-quality, deployable language technologies to support applications like translation, conversational interfaces, and a multilingual digital single market.
- A proposed "Multilingual Value Programme" would help enable the multilingual digital single market through technologies for translating, analyzing, processing and curating natural language content.
- A long-term
Human Language Technologies in a Multilingual Europe
1. Georg Rehm
georg.rehm@dfki.de
DFKI GmbH, Language Technology Lab – Berlin, Germany
META-NET, General Secretary
Human Language Technologies
in a Multilingual Europe
2. Outline
• Multilingual Europe
• Analysis I: Technology Support for Europe’s Languages
• Analysis II: Status and Current Developments
• Example: LT for the Digital Single Market
• Missions and Opportunities
• Towards the Human Language Project
EP STOA Workshop: Language Equality in the Digital Age (10 Jan. 2017)
3. • Multilingualism is at the very heart of the European idea.
• 24 EU languages – all languages have the same status.
• Dozens of regional and minority languages as well as
languages of immigrants and trade partners.
• Economic challenges:
– If the DSM is not multilingual, there will be 20+ isolated markets!
– Language barriers are market barriers!
• Social and public challenges:
– Empower all citizens to use their mother tongues.
– Provide multilingual digital public services.
– Enable cross-border, cross-lingual, cross-cultural communication.
Towards a European public sphere and e-participation.
– Restore trust in media (fake news debate, filter bubble issue etc.)
4. Analysis I: Technology Support
for Europe’s Languages
5. • 60 research centres in 34 countries (founded in 2010)
Chair of Executive Board: Jan Hajic (CUNI)
Dep.: J. van Genabith (DFKI), A. Vasiljevs (Tilde)
General Secretary: Georg Rehm (DFKI)
• Multilingual Europe Technology Alliance: 826 members in 67 countries
(published in 2013) (31 volumes; published in 2012)
T4ME (META-NET), CESAR, METANET4U, META-NORD
10. Analysis II: Status and
Current Developments
11. Status and Current Developments
• Multilingual Europe: our languages enjoy equal status, yet the digital extinction
of the majority of European languages is a very severe danger.
• Language Technology Research and Innovation in Europe: world-class
research, excellent results (examples: Moses, recent NMT results of QT21),
strong SME base, thousands of LSPs; fragmentation; need for coordination.
• Big need for high-quality, high-coverage, precise, robust, deployable
Language Technologies: translation, conversational interfaces, text and
media analytics, personal assistants, multilingual DSM etc.
• Artificial Intelligence: important breakthroughs and massive investments in
R&D and applications (mostly in the US and Asia) – huge opportunity for Europe!
• The European Language Challenge cannot be abandoned or outsourced.
– Europe must not make its digital infrastructure dependent
on non-European solutions. This is why the EU is building
GALILEO as an alternative to GPS, GLONASS and BeiDou.
• Big need for Language Technologies made in Europe for Europe!
12. Example:
Language Technology for the
Digital Single Market
13. • Top priority in the European Union.
• Expected to add €400 billion to European GDP
and hundreds of thousands of new jobs.
• Unfortunately, the language topic is not
included in the EC’s Digital Single Market
strategy (published in May 2015).
14.
15. MDSM: Needed Applications
• Crosslingual SME presales communication and aftersales services
• Multilingual websites, product catalogues, product descriptions
• Crosslingual business intelligence (e.g., based on UGC)
• Crosslingual communication for SMEs, public institutions, citizens
• Multilingual (big) data, language and knowledge value chains
• Multilingual knowledge bases and knowledge graphs (and services)
• Multilingual conversational interfaces for connected devices (IoT)
• Crosslingual social media analytics for EU-wide societal issues
• Multilingual text and report generation (knowledge/data to text)
• All services must be domain-adaptable (avoid one size fits all)
• etc.
16. Multilingual Value Programme
• Multilingual Value Programme
– Suggested three-year programme
– Requires modest investment
• “Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content”
• Three components address the main needs of the Multilingual DSM (MDSM) and how to put them into practice:
1. Multilingual Application Areas
2. Multilingual Services
3. Research
Strategic Research and Innovation Agenda: “Language as a Data Type and Key Challenge for Big Data. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content”
SRIA Editorial Team, Version 0.9 (July 2016); Version 1.0 to be published in 2017
18. Missions and Opportunities
• Languages & European Society: Enable all European citizens
to communicate and operate in their mother tongues (online & offline).
• Languages & Media: Address – technologically – the massively
increasing social, political and commercial relevance of content and
communication (fake news debate, filter bubble challenge).
• Languages & Market: Realise the Multilingual DSM, including
multilingual content, crosslingual text analytics, multilingual generation.
• Languages & Digital Tech: Future-proof our languages.
• Languages & Devices: Robust, precise, high-quality spoken
language interfaces for billions of connected things – and all languages.
• Excellent opportunity for Europe, European research, European
education, European industry, European innovation, European culture!
• Goal: Move Europe into the pole position in this field!
19. Towards the
Human Language Project
20. Multilingual Europe through Technology: Multilingual Europe in January 2017
• Multilingual Strategy of the EU: more tech support for multilingualism
• Language Technologies for Europe's digital public services
• Technologies for the Multilingual Digital Single Market
• Language Technologies for Big Data text analytics
• Language Technologies R&D&I (H2020, WP 2018-20)
• The Human Language Project: long-term R&D&I, post-H2020
Diagram annotations:
• Open calls and upcoming service contracts
• Dec. 2016: EC brainstorming meeting on future LT priorities in Horizon 2020 and FP9. Need for a new strategy paper?
• Jan. 2017: STOA workshop and study on LT for Europe
• Dec. 2017: LT Session at BDVA Summit in Valencia
• 2017: MDSM SRIA V1.0
• Policy change and initiative towards a European digital public sphere enabled by MT/LT
• DG CONNECT; DGT and DG CONNECT
• WP 2018-20 (incl. IoT, I4.0, assistants, robots etc.)
• Shared programme between EU and MS
• Suggested MLV Programme (Strategic Research and Innovation Agenda, SRIA Editorial Team, Version 0.9, July 2016)
• CEF AT, ELRC
21. Multilingual Europe through Technology: Multilingual Europe in January 2017 (same diagram as on slide 20)
Observations:
• Current initiatives are too small and unbalanced; they
concentrate on innovation and technology deployment.
• Danger of losing touch with research and novel,
potentially paradigm-shifting developments.
• Difficult to kick-start new, paradigm-shifting research.
• We need a coordinated, concerted and consolidated
push in basic research, applied R&D and innovation!
22. Human Language Project – Interdisciplinary R&D&I Programme
• Basic Research: results in new methods, approaches
• Applied R&D: results in novel technologies
• Innovation: results in novel or improved products or services
Research Themes – Needs and Gaps (market-driven)
• Computational Linguistics
• Artificial Intelligence
• Language Technology
• Linguistics
• Computer Science
• Cognitive Science
• other related fields
• New, groundbreaking
methods, paradigms,
approaches
• Foster technologies,
products, innovation,
economy
• Foster education
HLP: Umbrella programme
to turbo-charge and to
coordinate all European
R&D&I activities in a
systematic way including
EP, EC, Member States.
23. Human Language Project
• Goal: Deep Natural Language Understanding.
• Breakthroughs in Artificial Intelligence plus a fresh
look at Linguistics for the Next Generation of LT!
• All official European and many additional languages
• Broad coverage, high quality, high precision
• Across modalities: text, text types, speech, image, video etc.
• Across platforms: messaging, telephony, social, mobile, IoT etc.
• Across cultures: knowledge, customs, formalities, humour,
emotion, subjectivity, biases, opinions, filter bubble etc.
24. Human Language Project
• Collaboration and coordination between EC, EP,
Member States and all other stakeholders.
• Mix of funding sources:
– EU projects: Horizon 2020 (WP 2018-2020) + FP9 (2021+)
– National/regional funding sources
• Setup: basic research, applied research, innovation,
commercialisation – tightly intertwined
• Timeframe: 10 years
• Policy change towards “LT-enabled multilingualism”
• Public procurement: EU/EC, MS administrations
should demand certain language technologies
25. HLP Topics: Key Ingredients for
Future European LT Research
Artificial Intelligence
including cognition, perception, vision,
cross-modal, cross-platform, cross-culture, IoT etc.
Knowledge Technology
• Extend knowledge bases
• Semantic Web, ontologies, linked data, interoperability
• More complex models
• Multilingual resources that are grounded, extensible
• Subjectivity, objectivity, further novel dimensions
• Web-scale reasoning
Machine Learning
• Combine DNNs and symbolic processing
• ML for knowledge acquisition and extension
• DNNs embedded into modular systems including symbolic knowledge bases
• Make it possible to inspect and also to optimise DNNs (beyond end-to-end)
Language Technology
• (Computational) Linguistics research towards deep language understanding
• From corpora to DNNs to annotated data to highly improved symbolic methods
• Language portability
• Full and Deep Language Understanding by 2030 – Human Language Project