www.eu-bridge.eu, 24 May 2012
Sebastian Stüker: EU-BRIDGE *** Advanced text and speech translation services for educational applications
The work leading to these results has received funding from the European Union under grant agreement n° 287658
EU-BRIDGE
Bridges Across the Language Divide
Sebastian Stüker, KIT
Not this project, + different architecture
from "The hitch-hiker's Guide to the Galaxy" (TV series), 1981 © BBC, orig. animation artwork by Rod Lord
not BABEL FISH
It's Happening Right Now
•  Speech and language technologies (posing hard problems)
•  A strong need, in particular in Europe
•  Growth of computing power/memory
•  Availability of resources
•  Scientific progress
The Vision
• Bridges Between Languages in Europe:
•  Technology to bridge the European language divide
•  Building/maintaining a sustainable European language infrastructure
•  Language divide is expressed in language and speech
•  Easily accessible services
• Bridges Between Science and Society
•  Language technology requires continued scientific attention
•  Exploitation and insertion requires suitable adaptation
•  Metrics:
•  WER, BLEU, TER, F-Scores
•  User Friendliness, Productivity, Sales, Distribution Channels, Customer Support
•  Identify use cases and applications
•  Effective transition and insertion
EU-BRIDGE in a Nutshell
•  Here: Not science fiction, but results coming soon (2014/2015)
•  FP7 IP EU-BRIDGE: Bridges Across the Language Divide
•  Development of a speech translation service infrastructure
•  Targeted to make progress in particular regarding market insertion
•  Project footprint:
•  Feb 2012 - Jan 2015
•  Budget € 10.5m, EC funding € 7.8m
•  10 partners
•  1 service infrastructure, 4 use cases
Goals and Project Plan
Four use cases
1.  Captioning and translation of subtitles for TV programs
2.  Simultaneous translation of academic lectures
3.  Speech translation services for the European Parliament
4.  Translation of webinars
Four major objectives
1.  Performance
2.  Language portability
3.  Reduction of dependency on data
4.  Rapid technology transition and market insertion
EU-BRIDGE Partners
[Diagram: Engines, Services, Use Cases]
•  Engines: ASR, MT
•  Use Cases: Use Case 2
•  Language Service Customization, Adaptation
•  Develop and Insert
•  Improved Technology
•  Language Services for User and Developer Communities
[Diagram: Engines, Services, Use Cases]
•  Engines: ASR, MT
•  Services: Language Service
•  Use Cases: Use Case 2, Lecture Translation
Technical Challenges
[Figure: projects placed along three axes]
•  Axes: Languages, Robustness (Speaking Style, Noise), Domain
•  Axis ticks: 1, 3, 500
•  Projects shown: EuroMatrixPlus, GALE, TC-STAR, C-STAR, Nespole, Verbmobil, Babel
Project Organization
[Diagram: work packages, from laboratory to market]
•  Research: WP 1, WP 2
•  Engines: WP 3
•  Services: WP 4
•  Products: WP 5 (Caption Transl., Lecture Transl., Webinar; partners shown: RBM, EUP/RBM/KIT, ADX)
•  WP 6: Evaluation
•  Standardized, in-house tests: WER, BLEU
•  External campaigns
•  Field testing: dev costs, cost reduction, usability
•  Labels: build & improve; run and schedule; utilize
•  Labels: Deployable Engines; Transcription and Translation Services; Market Insertion
•  Labels: Laboratory, Market, Transcription, Translation, Data Dependency, Data Costs
WP 4 Services
•  Task 4.1 Core Services: ASR, MT, Text Processing
•  Design and implementation of core service APIs and integration of core engines
WP 3 Engines
•  Task 3.1 ASR and MT
•  Task 3.2 Multithreaded + Stream Decoding
•  Task 3.3 Language Dependent Linguistic Processing
•  Task 3.4 LID
•  Task 3.5 Text Processing
[Diagram components: LID, TP, ASR, MT]
WP 5 Products
•  Task 5.1 Caption Translation
•  Task 5.2 Lecture Translation
•  Task 5.3 Multilingual Tele- and Video Conferencing
WP 4 Services
•  Task 4.1 Core Services: ASR, MT, Text Processing
•  Design and implementation of core service APIs and integration of core engines
•  Task 4.2 Running Services in the cloud
•  Design and implementation of client-server architecture + APIs
[Diagram components: LID, TP, ASR, MT]
Evaluations
“You Improve what you Evaluate”
•  Evaluations organized around Use Cases
•  Align Metrics with Product/Service Goals
Co-opetition
•  Partners engage in friendly competition
•  Goal is progress/complementarity, not site-to-site “horse-race”
•  Partners participate with standard or optimized/tuned systems
Evaluations
•  Standard open benchmarks
•  Calibrate technology internationally in open competition
•  IWSLT, WMT: cover lecture task and multilingual MT
•  EU-BRIDGE partners are (co-)organizers of these evals and can thus influence the process to suit EU-BRIDGE’s needs
•  Don’t need to create/market a new campaign
•  Internal EU-BRIDGE Tasks
•  Need to evaluate & optimize technology for EU-BRIDGE use cases
•  Different goals/sub-goals pursued (e.g. captioning, NE, shortening, ...)
•  Evals/Assessment need to change frequently during the project
•  Comparisons with outside teams not needed and not helpful
•  Field Tests and Measures:
•  Evaluate on Living online ‘Organism’
•  Optimize for extrinsic measures, not intrinsic ones
IWSLT 2014
•  Evaluation of Speech Translation Technology on Talks and Lectures
•  Working on TED data, because of efficiency of creating training and evaluation data
Use Cases
Captioning and Translation of Multimedia Content
•  BBC Weather Data
•  Euronews
•  Sky News
Multilingual Lectures, Meetings
•  TED Lectures
•  University Lectures
•  European Parliament Voting Sessions
European Parliament Interpreter Support
•  Terminology Support: Tool Field Tests
•  Named Entities: Eval, Tool Integration & Tests
Webinar Translation
•  Integration into the Andrexen platform
Lecture Translation
• Continuous Monologue
• Speaking-Style
•  Planned, not read, not memorized
•  Fast, spontaneous, fragmentary, and no punctuation!
•  Noises, coughing, non-verbal events (e.g., singing)
• Vocabulary
•  Very large
•  Specialized terms
•  Foreign Words
• Speed, Real-time
• Service-Infrastructure
•  Many parallel lectures;
•  Automatic, robust assignment of compute power
Lectures
Speech-Translation Systems
•  Human Interpreters are too expensive
•  Automatic Speech Translation an affordable solution:
•  Still lots of errors, room for improvement
•  But, better than nothing
Service Infrastructure
Launch, June 11 2012
Deployment
Data Resources
KIT lecture corpus:
•  Collected to fit the needs for training the systems:
•  ASR AM: Large amounts of in-domain audio data with careful transcriptions
•  MT Translation Model: Parallel sentences of in-domain data for the translation
•  ASR+MT LM: Large amounts of monolingual sentences in the required domain
•  Any kind of meta data that might help (e.g., lecturers’ slides)
•  Data collection took place at KIT’s lecture halls:
•  Started with computer science lectures
•  Gradually spread to lectures from all departments at the university
Automatic Speech Recognition
•  Janus-based ASR system: HMM/GMM-based quinphone system with 4,000 models, MVDR front-end, 4-gram LM
•  Acoustic Model:
•  Trained on all lecture data in order to get a speaker-independent model
•  Created 5 speaker-adapted AMs: speaker-independent model + Viterbi training and bMMIE training on the speaker-dependent data
•  Language Model:
•  Interpolation from 28 text corpora
•  Interpolation weights tuned on a random selection from the AM training data
•  300k vocabulary selected by ML count estimation method
•  German has a lot of compounds:
•  Sub-word vocabulary for compounds
•  Vocabulary adaptation by deriving queries from slides (OOV 2.25% → 0.75% at 300k vocabulary size)
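The tuning of interpolation weights mentioned above can be sketched as follows. This is a minimal illustration, assuming linear interpolation and expectation-maximization on held-out text; the slides do not say which tuning procedure was used. The function name, the input format and the toy numbers are hypothetical, and the example uses two component LMs rather than 28.

```python
def tune_interpolation_weights(component_probs, iterations=50):
    """Estimate mixture weights for linearly interpolated language models.

    component_probs[t][k] is the probability that component LM k assigns
    to the t-th held-out word given its history (hypothetical input format).
    """
    num_models = len(component_probs[0])
    weights = [1.0 / num_models] * num_models  # start from uniform weights
    for _ in range(iterations):
        expected = [0.0] * num_models
        for probs in component_probs:
            # Probability of this word under the current mixture.
            mixture = sum(w * p for w, p in zip(weights, probs))
            # E-step: share of the word's probability owed to each component.
            for k in range(num_models):
                expected[k] += weights[k] * probs[k] / mixture
        # M-step: new weight is the average share over the held-out words.
        weights = [count / len(component_probs) for count in expected]
    return weights


# Toy held-out data: component 0 predicts every word better than component 1.
held_out = [(0.20, 0.01), (0.10, 0.02), (0.05, 0.04), (0.30, 0.01)]
weights = tune_interpolation_weights(held_out)
print(weights)
```

The weights always sum to one, and the component that predicts the held-out text better receives the larger weight.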
ASR Performance
•  Tested the ASR system on the dev set of six lecturers
Lecturer                  1      2      3      4      5      6
Speaker Independent AM    34.8%  21.1%  28.4%  22.9%  22.7%  19.0%
Speaker Dependent AM      –      18.9%  27.6%  22.6%  21.5%  17.8%
Adapted LM+Vocabulary     23.9%  17.3%  –      18.1%  –      15.4%
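The slide does not label the metric; assuming the percentages are word error rates (WER, listed among the project metrics), a minimal sketch of how WER is computed follows. It uses the word-level edit distance between a reference transcript and the recognizer output, divided by the reference length. The function name and the example sentences are illustrative only.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("das ist ein test", "das ist test"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.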
  
OOV Reduction
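As a rough illustration of what an OOV rate measures, the sketch below counts the share of running words that are missing from the recognizer vocabulary, before and after adding terms taken from slides. Adding slide terms directly is a simplification assumed here; the talk describes deriving queries from slides. All names and the toy data are hypothetical.

```python
def oov_rate(tokens, vocabulary):
    """Fraction of running words that are not in the vocabulary."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    unknown = sum(1 for token in tokens if token not in vocabulary)
    return unknown / len(tokens)


# Toy lecture transcript and vocabularies (made up for illustration).
lecture_tokens = (
    "wir betrachten heute die fouriertransformation und die faltung".split()
)
base_vocabulary = {"wir", "betrachten", "heute", "die", "und"}
slide_terms = {"fouriertransformation"}  # assumed to come from the slides
adapted_vocabulary = base_vocabulary | slide_terms

rate_before = oov_rate(lecture_tokens, base_vocabulary)
rate_after = oov_rate(lecture_tokens, adapted_vocabulary)
print(rate_before, rate_after)
```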
ASR Post-Processing
•  Numbers are converted to digit sequences
•  Common symbols are substituted, e.g., “Prozent” → ‘%’
•  Punctuation, i.e. ‘.’ and ‘,’, is inserted:
•  Prediction via a 4-gram model
•  Pause information used to adjust the priors
•  Simple, rule-based conversion of equations:
•  “F of x” → ‘f(x)’
•  Punctuation is used to structure text into sentences and chunks for translation
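Two of the rule-based steps above, symbol substitution and equation conversion, plus the splitting of punctuated text into sentences, could look roughly like this. It is a sketch under assumptions: the rule table, the regular expressions and the function names are invented for illustration and are not the project's actual rules. Number conversion and punctuation prediction are left out.

```python
import re

# Hypothetical rule table; the slides give "Prozent" -> "%" as one example.
SYMBOLS = {"Prozent": "%"}

# Matches spoken function application such as "F of x" (single letters only).
EQUATION = re.compile(r"\b([A-Za-z]) of ([A-Za-z])\b")


def post_process(text):
    """Apply symbol substitution and simple equation conversion."""
    for word, symbol in SYMBOLS.items():
        text = re.sub(r"\b" + re.escape(word) + r"\b", symbol, text)
    return EQUATION.sub(
        lambda m: f"{m.group(1).lower()}({m.group(2).lower()})", text
    )


def split_sentences(text):
    """Split punctuated text after each full stop."""
    return [s for s in re.split(r"(?<=\.)\s+", text.strip()) if s]


print(post_process("F of x"))
print(post_process("zehn Prozent"))
print(split_sentences("Das ist gut. Wir machen weiter."))
```

The word boundaries in the equation pattern keep ordinary phrases such as "a lot of people" untouched, since only single-letter words on both sides of "of" match.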
Machine Translation
Statistical Phrase Based MT System
•  Trained on different out-of-domain data and the lecture corpus
•  Applied the same compound word splitting as for the ASR system for consistency; names etc. were excluded from splitting by applying a named entity tagger
•  Used a discriminatively trained word alignment approach
•  Specific models for short- and long-range reorderings (also on the training data)
•  Online system:
•  Phrase table filtered with ASR vocabulary
•  Simplified POS tagging
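Filtering the phrase table with the ASR vocabulary can be pictured as below: the recognizer can only output words from its own vocabulary, so phrase pairs whose source side contains any other word can never match and may be dropped. The tuple layout (source phrase, target phrase, score) and all entries are assumptions made for this sketch.

```python
def filter_phrase_table(entries, asr_vocabulary):
    """Keep phrase pairs whose source words all occur in the ASR vocabulary."""
    kept = []
    for source, target, score in entries:
        if all(word in asr_vocabulary for word in source.split()):
            kept.append((source, target, score))
    return kept


# Toy phrase table and vocabulary (made up for illustration).
phrase_table = [
    ("das haus", "the house", 0.7),
    ("die faltung", "the convolution", 0.6),
    ("der zeppelin", "the zeppelin", 0.4),
]
asr_vocabulary = {"das", "haus", "die", "faltung", "der"}

filtered = filter_phrase_table(phrase_table, asr_vocabulary)
print(filtered)
```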
Machine Translation
Adaptation to the lecture domain
•  Domain independent translation model trained on all data
•  In-Domain TM only on the lecture data (re-use alignments)
•  Combined via log-linear combination
•  For LM: log-linear combination of large LM, in-domain LM and TED LM
•  Translations of special terms learned from Wikipedia and Wiktionary
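A log-linear combination scores a translation candidate as a weighted sum of the log-probabilities given by the individual models. The sketch below shows the arithmetic only; the feature names, weights and probabilities are invented for illustration, and in a real system the weights would be tuned on development data.

```python
import math


def log_linear_score(feature_probs, weights):
    """Weighted sum of log-probabilities over all model features."""
    return sum(weights[name] * math.log(p) for name, p in feature_probs.items())


# Hypothetical feature weights for the combined models.
weights = {
    "tm_general": 1.0,
    "tm_in_domain": 1.0,
    "lm_large": 1.0,
    "lm_in_domain": 1.0,
    "lm_ted": 1.0,
}

# Two toy candidates; the second is preferred by the in-domain models.
candidates = {
    "candidate_a": {"tm_general": 0.1, "tm_in_domain": 0.1, "lm_large": 0.1,
                    "lm_in_domain": 0.1, "lm_ted": 0.1},
    "candidate_b": {"tm_general": 0.1, "tm_in_domain": 0.5, "lm_large": 0.1,
                    "lm_in_domain": 0.3, "lm_ted": 0.1},
}

best = max(candidates, key=lambda name: log_linear_score(candidates[name], weights))
print(best)
```

Raising the weight of an in-domain feature makes the decoder favor candidates that the lecture-specific model prefers, which is the point of keeping the in-domain models separate.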
MT Results
Lecturer      BLEU
Lecturer 1    13.80
Lecturer 2    22.58
Lecturer 3    14.24
Lecturer 4    20.83
Lecturer 5    24.50
Lecturer 6    24.13
  
Thoughts on the Output Modality
Text instead of synthesized speech
•  Text can be easily distributed over the WWW
•  Laptops, smartphones, tablets
•  Nowadays ubiquitous
•  No proprietary software, just a browser
•  Listening to synthesized speech can be very tiresome
•  Artificial voices not perfect
•  Original speech present in addition
•  Translation system commits errors
•  Translated text contains errors; synthesis quality suffers from that
•  Temporal Navigation
•  Once translation has been heard, it is gone
•  Text enables users to move back and forth in the translation; supports understanding the translation
Interface
Client for the lecturer:
•  Must be as simple as possible!
•  Client needs to know:
•  Who is speaking?
•  What lecture is it?
•  Lecturer needs to know:
•  Is it turned on?
•  Is it doing something?
Interface
Displaying the results:
•  Translation result + ASR result (?!)
•  Scrolling back and forth in time
•  Make text readable !!!
Where do we go?
•  Interfaces need to get simpler!
•  Systems need to become omnivores:
•  Get all meta information as early as possible (before the lecture!)
•  Slides, papers, web sites, text books, etc.
•  Find a way for obtaining comparable corpora
•  Make system self maintaining and autonomous
•  Automatic unsupervised acoustic model adaptation after every lecture
•  Automatically detect speaker, lecture and language: Access the university information system
•  Offer additional services
•  Archive of the lectures for the students (for search)
•  Translation of the slides
•  Summary of the lecture
•  Get the students in the loop
•  Automatic corrections by the students: during the lecture and afterwards
•  Make it a game
Andrexen - Webinar
• Role definition (Speaker / Listener)
•  Functionality was defined and implemented, and relevant MCloud services integrated
• Data flow (Architecture / Slideshow / Voice / Text)
•  Features implemented to support the parallel transmission of slides, speaker audio, translation text and intelligent user location to provide integrated UC experience
Andrexen - Webinar
Adapting to industry requirements:
• Real time streaming
• Support legacy systems:
- Training narrow bandwidth (8 kHz) acoustic model
• Specific vocabulary:
- Vocabulary and language model adaptation
Andrexen - Webinar
•  Translator integrated as virtual participant
•  Is “chatting” the translation results

 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 

Kürzlich hochgeladen (20)

BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 

EU-BRIDGE: Bridges Across the Language Divide

  • 1. EU-BRIDGE: Bridges Across the Language Divide. Sebastian Stüker, KIT. Advanced text and speech translation services for educational applications. www.eu-bridge.eu, 24 May 2012. The work leading to these results has received funding from the European Union under grant agreement n° 287658.
  • 3. Not the BABEL FISH: not this project, and a different architecture. (Image from "The hitch-hiker's Guide to the Galaxy" (TV series), 1981 © BBC, orig. animation artwork by Rod Lord.)
  • 4. It's Happening Right Now
      • Speech and language technologies (posing hard problems)
      • A strong need, in particular in Europe
      • Growth of computing power/memory
      • Availability of resources
      • Scientific progress
  • 5. The Vision
      • Bridges Between Languages in Europe:
          • Technology to bridge the European language divide
          • Building/maintaining a sustainable European language infrastructure
          • Language divide is expressed in language and speech
          • Easily accessible services
      • Bridges Between Science and Society
          • Language technology requires continued scientific attention
          • Exploitation and insertion require suitable adaptation
          • Metrics:
              • WER, BLEU, TER, F-Scores
              • User Friendliness, Productivity, Sales, Distribution Channels, Customer Support
          • Identify use cases and applications
          • Effective transition and insertion
  • 6. EU-BRIDGE in a Nutshell
      • Here: Not science fiction, but results coming soon (2014/2015)
      • FP7 IP EU-BRIDGE: Bridges Across the Language Divide
      • Development of a speech translation service infrastructure
      • Targeted to make progress in particular regarding market insertion
      • Project footprint:
          • Feb 2012 - Jan 2015
          • Budget € 10.5m, EC funding € 7.8m
          • 10 partners
          • 1 service infrastructure, 4 use cases
  • 7. Goals and Project Plan
      • Four use cases
          1. Captioning and translation of subtitles for TV programs
          2. Simultaneous translation of academic lectures
          3. Speech translation services for the European Parliament
          4. Translation of webinars
      • Four major objectives
          1. Performance
          2. Language portability
          3. Reduction of dependency on data
          4. Rapid technology transition and market insertion
  • 8. EU-BRIDGE Partners
  • 13. Diagram labels: Engines (ASR, MT); Services (Language Service; Customization, Adaptation); Use Cases (Use Case 2). Develop and Insert Improved Technology. Language Services for User and Developer Communities.
  • 14. Diagram labels: Engines (ASR, MT); Services (Language Service); Use Cases (Use Case 2: Lecture Translation).
  • 15. Technical Challenges. Figure axes: Languages (1, 3, 500), Robustness (Speaking Style, Noise), Domain. Projects shown: EuroMatrixPlus, GALE, TC-STAR, C-STAR, Nespole, Verbmobil, Babel.
  • 16. Project Organization
      • Work packages: Research (WP 1, WP 2); Engines (WP 3); Services (WP 4); Products (WP 5)
      • Products: Caption Transl. (RBM); Lecture Transl. (EUP/RBM/KIT); Webinar (ADX)
      • Research topics: Transcription, Translation, Data Dependency, Data Costs
      • Outputs: Deployable Engines; Transcription and Translation Services; Market Insertion
      • Activities: run and schedule; build & improve; utilize
      • WP6: Evaluation, from Laboratory to Market
          • Standardized, in-house tests: WER, BLEU
          • External campaigns
          • Field testing: dev costs, cost reduction, usability
  • 17. WP 3 Engines and WP 4 Services
      • WP 3 Engines
          • Task 3.1 ASR and MT
          • Task 3.2 Multithreaded + Stream Decoding
          • Task 3.3 Language Dependent Linguistic Processing
          • Task 3.4 LID
          • Task 3.5 Text Processing
      • WP 4 Services
          • Task 4.1 Core Services: ASR, MT, Text Processing. Design and implementation of core service APIs and integration of core engines
      • Diagram components: LID, TP, ASR, MT
  • 18. WP 4 Services and WP 5 Products
      • WP 4 Services
          • Task 4.1 Core Services: ASR, MT, Text Processing. Design and implementation of core service APIs and integration of core engines
          • Task 4.2 Running Services in the cloud. Design and implementation of client-server architecture + APIs
      • WP 5 Products
          • Task 5.1 Caption Translation
          • Task 5.2 Lecture Translation
          • Task 5.3 Multilingual Tele- and Video Conferencing
      • Diagram components: LID, TP, ASR, MT
  • 19. Evaluations
      • "You Improve what you Evaluate"
          • Evaluations organized around Use Cases
          • Align Metrics with Product/Service Goals
      • Co-opetition
          • Partners engage in friendly competition
          • Goal is progress/complementarity, not site-to-site "horse-race"
          • Partners participate with standard or optimized/tuned systems
  • 20. Evaluations
      • Standard open benchmarks
          • Calibrate technology internationally in open competition
          • IWSLT, WMT: cover lecture task and multilingual MT
          • EU-BRIDGE partners are (co-)organizers of these evals and can thus influence the process to suit EU-BRIDGE's needs
          • Don't need to create/market a new campaign
      • Internal EU-BRIDGE Tasks
          • Need to evaluate & optimize technology for EU-BRIDGE use cases
          • Different goals/sub-goals pursued (e.g. captioning, NE, shortening, ...)
          • Evals/Assessment need to change frequently during the project
          • Comparisons with outside teams not needed and not helpful
      • Field Tests and Measures:
          • Evaluate on Living online 'Organism'
          • Optimize for extrinsic measures, not intrinsic ones
  • 21. IWSLT 2014
      • Evaluation of Speech Translation Technology on Talks and Lectures
      • Working on TED data, because of efficiency of creating training and evaluation data
  • 22. Use Cases
      • Captioning and Translation of Multimedia Content
          • BBC Weather Data
          • Euronews
          • Sky News
      • Multilingual Lectures, Meetings
          • TED Lectures
          • University Lectures
          • European Parliament Voting Sessions
      • European Parliament Interpreter Support
          • Terminology Support: Tool Field Tests
          • Named Entities: Eval, Tool Integration & Tests
      • Webinar Translation
          • Integration into the Andrexen platform
  • 24. Lecture Translation
      • Continuous Monologue
      • Speaking-Style
          • Planned, not read, not memorized
          • Fast, spontaneous, fragmentary, and no punctuation!
          • Noises, coughing, non-verbal events (e.g., singing)
      • Vocabulary
          • Very large
          • Specialized terms
          • Foreign Words
      • Speed, Real-time
      • Service-Infrastructure
          • Many parallel lectures
          • Automatic, robust assignment of compute power
  • 25. Lectures
  • 26. Speech-Translation Systems
      • Human Interpreters are too expensive
      • Automatic Speech Translation is an affordable solution:
          • Still lots of errors, room for improvement
          • But, better than nothing
  • 27. Service Infrastructure
  • 28. Launch, June 11 2012
  • 29. Deployment
  • 30. Data Resources
      • KIT lecture corpus, collected to fit the needs for training the systems:
          • ASR AM: Large amounts of in-domain audio data with careful transcriptions
          • MT Translation Model: Parallel sentences of in-domain data for the translation
          • ASR+MT LM: Large amounts of monolingual sentences in the required domain
          • Any kind of meta data that might help (e.g., lecturers' slides)
      • Data collection took place at KIT's lecture halls:
          • Started with computer science lectures
          • Gradually spread to lectures from all departments at the university
  • 31. Data Resources
  • 32. Automatic Speech Recognition
      • Janus based ASR system: HMM/GMM based quinphone system with 4,000 models, MVDR front-end, 4-gram LM
      • Acoustic Model:
          • Trained on all lecture data in order to get a speaker independent model
          • Created 5 speaker adapted AMs: speaker independent model + Viterbi training and bMMIE training on the speaker dependent data
      • Language Model:
          • Interpolation from 28 text corpora
          • Interpolation weights tuned on random selection from AM training data
          • 300k vocab selected by ML count estimation method
      • German has a lot of compounds:
          • Sub-word vocabulary for compounds
      • Vocabulary adaptation by deriving queries from slides (OOV 2.25% → 0.75% at 300k vocab size)
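The language model bullets on slide 32 mention interpolating models from 28 text corpora, with weights tuned on held-out data. The slides do not say how the weights were tuned; the sketch below assumes linear interpolation with the standard EM update, and the function name and the toy probabilities are hypothetical.

```python
def tune_interpolation_weights(component_probs, iterations=50):
    """Estimate linear interpolation weights with EM (an assumed method).

    component_probs[t][i] is the probability that component model i
    assigns to the t-th held-out word given its history. All values
    must be positive.
    """
    n_models = len(component_probs[0])
    weights = [1.0 / n_models] * n_models  # start from uniform weights
    for _ in range(iterations):
        expected = [0.0] * n_models
        for probs in component_probs:
            # Probability of this word under the current mixture.
            mix = sum(w * p for w, p in zip(weights, probs))
            # E-step: share of the mixture mass owed to each component.
            for i, (w, p) in enumerate(zip(weights, probs)):
                expected[i] += w * p / mix
        # M-step: new weight is the average responsibility.
        weights = [e / len(component_probs) for e in expected]
    return weights


# Toy held-out data for two hypothetical component models.
held_out = [[0.2, 0.1], [0.3, 0.1], [0.05, 0.4]]
weights = tune_interpolation_weights(held_out)
```

With 28 component models the same loop applies unchanged; only the length of each inner list grows.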
  • 33. ASR Performance
      • Tested the ASR system on the dev set of six lecturers

      Lecturer                 |   1   |   2   |   3   |   4   |   5   |   6
      Speaker Independent AM   | 34.8% | 21.1% | 28.4% | 22.9% | 22.7% | 19.0%
      Speaker Dependent AM     |   –   | 18.9% | 27.6% | 22.6% | 21.5% | 17.8%
      Adapted LM+Vocabulary    | 23.9% | 17.3% |   –   | 18.1% |   –   | 15.4%
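Slide 33 does not label its metric. Assuming the percentages are word error rates (WER, listed among the project metrics on slide 5), the sketch below shows how such a figure is commonly computed and how the relative gain from adaptation can be read off the table. The function names and the example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Assumes a non-empty reference.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline, improved):
    """Relative improvement of an error rate over a baseline."""
    return (baseline - improved) / baseline


wer = word_error_rate("a b c d", "a x c")      # one substitution, one deletion
gain = relative_reduction(34.8, 23.9)          # lecturer 1, first vs. last row
```

Under that assumption, lecturer 1 improves by roughly 31% relative when the language model and vocabulary are adapted.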
  • 34. OOV Reduction
  • 35. ASR Post-Processing
      • Numbers are converted to digit sequences
      • Common symbols are substituted, e.g., "Prozent" → '%'
      • Punctuation, i.e. '.' and ',', is inserted:
          • Prediction via a 4-gram model
          • Pause information used to adjust the priors
      • Simple, rule based conversion of equations:
          • "F of x" → 'f(x)'
      • Punctuation is used to structure text into sentences and chunks for translation
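The rule-based steps on slide 35 can be illustrated with a few lines of Python. This is only a sketch: the lookup tables, the regular expression and the function name are assumptions made for illustration, and the punctuation prediction step (4-gram model plus pause information) is left out.

```python
import re

# Hypothetical, deliberately tiny lookup tables.
NUMBER_WORDS = {"null": "0", "eins": "1", "zwei": "2",
                "drei": "3", "vier": "4", "fünf": "5"}
SYMBOLS = {"Prozent": "%"}

# Matches a single-letter function name, "of", and a single-letter argument.
FUNCTION_PATTERN = re.compile(r"\b([A-Za-z]) of ([a-z])\b")


def postprocess(text):
    """Apply number, symbol and equation rules to raw ASR output."""
    tokens = []
    for token in text.split():
        token = NUMBER_WORDS.get(token.lower(), token)  # number word -> digit
        token = SYMBOLS.get(token, token)               # word -> symbol
        tokens.append(token)
    text = " ".join(tokens)
    # "F of x" -> "f(x)"
    return FUNCTION_PATTERN.sub(
        lambda m: f"{m.group(1).lower()}({m.group(2)})", text)


result = postprocess("F of x steigt um fünf Prozent")
```

A production system would need a full number grammar for German compounds such as "fünfundzwanzig"; the table above covers single digits only.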
  • 36. Machine Translation
      • Statistical Phrase Based MT System
          • Trained on different out-of-domain data and the lecture corpus
          • Applied the same compound word splitting as for the ASR system for consistency; names etc. were excluded from splitting by applying a named entity tagger
          • Used a discriminatively trained word alignment approach
          • Specific models for short and long range reorderings (also on the training data)
      • Online system:
          • Phrase table filtered with ASR vocabulary
          • Simplified POS tagging
  • 37. Machine Translation
      • Adaptation to the lecture domain
          • Domain independent translation model trained on all data
          • In-domain TM trained only on the lecture data (re-use alignments)
          • Combined via log-linear combination
          • For LM: log-linear combination of large LM, in-domain LM and TED LM
          • Translations of special terms learned from Wikipedia and Wiktionary
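The log-linear combination named on slide 37 scores a candidate as a weighted sum of log probabilities from the individual models. The sketch below shows the idea for a general and an in-domain translation model; the weights, probabilities and candidate translations are invented for illustration and are not taken from the project.

```python
import math


def log_linear_score(model_probs, weights):
    """Weighted sum of log probabilities, one term per model."""
    return sum(weights[name] * math.log(p) for name, p in model_probs.items())


# Hypothetical weights; in practice these are tuned on development data.
weights = {"general_tm": 0.4, "in_domain_tm": 0.6}

# Hypothetical probabilities for two translations of German "Vorlesung".
candidates = {
    "lecture": {"general_tm": 0.2, "in_domain_tm": 0.7},
    "reading": {"general_tm": 0.5, "in_domain_tm": 0.1},
}

scores = {word: log_linear_score(probs, weights)
          for word, probs in candidates.items()}
best = max(scores, key=scores.get)
```

Here the in-domain model outweighs the general model's preference, which is the effect domain adaptation aims for.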
  • 38. MT Results

      Lecturer     | BLEU
      Lecturer 1   | 13.80
      Lecturer 2   | 22.58
      Lecturer 3   | 14.24
      Lecturer 4   | 20.83
      Lecturer 5   | 24.50
      Lecturer 6   | 24.13
  • 39. Thoughts on the Output Modality
      • Text instead of synthesized speech
      • Text can be easily distributed over the WWW
          • Laptops, smartphones, tablets
          • Nowadays ubiquitous
          • No proprietary software, just a browser
      • Listening to synthesized speech can be very tiresome
          • Artificial voices not perfect
          • Original speech present in addition
      • Translation system commits errors
          • Translated text contains errors; synthesis quality suffers from that
      • Temporal Navigation
          • Once translation has been heard, it is gone
          • Text enables users to move back and forth in the translation; supports understanding the translation
  • 40. Interface
      • Client for the lecturer:
          • Must be as simple as possible!
          • Client needs to know:
              • Who is speaking?
              • What lecture is it?
          • Lecturer needs to know:
              • Is it turned on?
              • Is it doing something?
  • 41. Interface
      • Displaying the results:
          • Translation result + ASR result (?!)
          • Scrolling back and forth in time
          • Make text readable!
  • 42. Where do we go?
      • Interfaces need to get simpler!
      • The system needs to become an omnivore:
          • Get all meta information as early as possible (before the lecture!)
          • Slides, papers, web sites, text books, etc.
          • Find a way for obtaining comparable corpora
      • Make the system self-maintaining and autonomous
          • Automatic unsupervised acoustic model adaptation after every lecture
          • Automatically detect speaker, lecture and language: access the university information system
      • Offer additional services
          • Archive of the lectures for the students (for search)
          • Translation of the slides
          • Summary of the lecture
      • Get the students in the loop
          • Automatic corrections by the students: during the lecture and afterwards
          • Make it a game
  • 43. Andrexen - Webinar
      • Role definition (Speaker / Listener)
          • Functionality was defined and implemented, and relevant MCloud services integrated
      • Data flow (Architecture / Slideshow / Voice / Text)
          • Features implemented to support the parallel transmission of slides, speaker audio, translation text and intelligent user location to provide integrated UC experience
  • 44. Andrexen - Webinar
      • Adapting to industry requirements:
          • Real time streaming
          • Support legacy systems: training narrow bandwidth (8 kHz) acoustic model
          • Specific vocabulary: vocabulary and language model adaptation
  • 45. Andrexen - Webinar
      • Translator integrated as virtual participant
      • Is "chatting" the translation results