Oct.28.2017
Ewa Szymanska, PhD
Head of Rakuten Institute of Technology Singapore
2
Source: https://unsplash.com/ by Element5 Digital
3
“I am watching shows in Chinese to get used to ‘actual’ spoken Mandarin, and not just what I see in my textbooks”
- VIKI user
4
* Images from Rakuten VIKI, Rakuten TV
5
1.8 billion people are learning foreign languages
Source: The Washington Post: https://www.washingtonpost.com/news/worldviews/wp/2015/04/23/the-worlds-languages-in-7-maps-and-charts
[Charts: languages with most native speakers; most commonly studied foreign languages]
6
Online individual language learning market is growing at 12% CAGR
Source: Rosetta Stone Investor Day 2017
7
I. Entertaining Content II. Global Users III. Technology
*Photo by Jakob Owens on Unsplash
8
1 Interactive subtitles
2 Quizzes
3 Video dictionary
* Images from Rakuten VIKI
9
1 Interactive subtitles
Fast adoption: 30,000 DAU (daily active users)
High engagement: Korean Learn Mode users view 10% more than Viki average
High satisfaction: 83 NPS (net promoter score)
*cnet.com @ CBS Interactive Inc. Apr 13, 2017; Keia.org, Korean Economic Institute, Apr 2017; Forbes Oct 24, 2017; The Verge, Sep 28, 2017
Show availability
“Daughter Back”, “Return of Happiness”, “Ice and Fire of Youth”
“My Love from the Star”, “Boys Over Flowers”, “Descendants of the Sun”
Learn Chinese (Japan) Learn Korean (USA)
* Images from Rakuten VIKI
[ Learn Mode collection on viki.com ]
11
2 Drama Vocab Quiz [ languagequiz.viki.com ]
• 60,000+ quizzes taken
• 35,000+ users completed the quiz
• Very positive social media engagement
12
3 Video-based Dictionary
Integrate with the classroom curriculum:
13
“ If you talk to a man
in a language he understands,
that goes to his head.
If you talk to him in his language,
that goes to his heart. ”
- Nelson Mandela
14
Oct 28, 2017
Stanley Kok
Principal Research Scientist
Rakuten Institute of Technology (Singapore)
16
Splitting a sentence into pieces, each preserving its original semantics
你是辣妹,也是名门贵族
你 是 辣妹 , 也是 名门贵族 -> you are (a) hot chick and also (of) the gentry
你 是 辣妹 , 也是 名门贵 族 -> you are (a) hot chick and also tribe
17
努力的人才会成功
努力 的 人 才 会 成功
only hardworking people will succeed
努力 的 人才 会 成功
hardworking talent will succeed
18
Tokenization
19
Dictionary Lookup
20
Many open-source tokenizers available
Good, but not perfect
Different mistakes
Why not use more (or all) of them to improve
tokenization?
Strengths of one tokenizer overcome the shortcomings of another
21
How to quantify “goodness” of tokenization?
Take human learner’s perspective
#Dictionary look-ups needed to understand all tokens
Non-existent tokens assumed to need large #lookups (10)
你 是 辣妹 (you / are / hot chick): 1 + 1 + 1 = 3
你 是 辣 妹 (you / are / spicy / younger sister): 1 + 1 + 1 + 1 = 4
你 是辣 妹 (you / ? / younger sister): 1 + 10 + 1 = 12
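The cost rule above can be sketched in a few lines of Python. This is a minimal illustration, not the speakers' implementation: the name `lookup_cost`, the toy dictionary, and the assumption that every known token costs exactly one look-up are hypothetical. Only the penalty of 10 for a non-existent token comes from the slide.

```python
def lookup_cost(tokens, dictionary, missing_cost=10):
    """Count the dictionary look-ups a learner needs to understand all tokens.

    Assumption: a token found in the dictionary costs 1 look-up;
    a token that does not exist is charged `missing_cost` (10 on the slide).
    """
    return sum(1 if token in dictionary else missing_cost for token in tokens)


# Toy dictionary (assumption): the tokens that have an entry.
toy_dictionary = {"你", "是", "辣妹", "辣", "妹"}

candidates = [
    ["你", "是", "辣妹"],
    ["你", "是", "辣", "妹"],
    ["你", "是辣", "妹"],
]

costs = [lookup_cost(tokens, toy_dictionary) for tokens in candidates]
best = min(candidates, key=lambda tokens: lookup_cost(tokens, toy_dictionary))
```

With this toy dictionary, `costs` reproduces the three totals on the slide and `best` is the segmentation that keeps 辣妹 together.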
22
Can do better than picking lowest cost
tokenization from tokenizers
Treat common tokens as “anchor points”
Pick best tokens from remaining ones
23
你 是 辣妹 也是 名门贵 族 -> you are hot chick and also tribe (cost 15)
你 是辣 妹 也是 名门贵族 -> you younger sister and also (of) the gentry (cost 14)
你 是 辣妹 也是 名门贵族 (cost 5)
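One way to read the anchor-point idea is sketched below. The talk does not spell out the algorithm, so this is an assumption: token boundaries on which all tokenizers agree act as anchors, and between two consecutive anchors the cheapest chunk offered by any tokenizer is kept. All function names and the toy dictionary are hypothetical.

```python
def end_offsets(tokens):
    """Character offsets at which a token ends."""
    offsets, position = set(), 0
    for token in tokens:
        position += len(token)
        offsets.add(position)
    return offsets


def cost(tokens, dictionary, missing_cost=10):
    """Look-up cost: 1 per known token, `missing_cost` per unknown token."""
    return sum(1 if t in dictionary else missing_cost for t in tokens)


def tokens_between(tokens, start, end):
    """Tokens of one tokenization that lie inside the span [start, end)."""
    chunk, position = [], 0
    for token in tokens:
        if position >= start and position + len(token) <= end:
            chunk.append(token)
        position += len(token)
    return chunk


def combine(tokenizations, dictionary):
    """Keep shared boundaries as anchors; pick the cheapest chunk in between."""
    shared = sorted(set.intersection(*(end_offsets(t) for t in tokenizations)))
    combined, start = [], 0
    for end in shared:
        chunks = [tokens_between(t, start, end) for t in tokenizations]
        combined.extend(min(chunks, key=lambda c: cost(c, dictionary)))
        start = end
    return combined


# Toy dictionary (assumption) and the two tokenizer outputs from the slide.
toy_dictionary = {"你", "是", "辣妹", "妹", "也是", "族", "名门贵族"}
tokenizer_a = ["你", "是", "辣妹", "也是", "名门贵", "族"]
tokenizer_b = ["你", "是辣", "妹", "也是", "名门贵族"]
combined = combine([tokenizer_a, tokenizer_b], toy_dictionary)
```

Under these assumptions the combined output takes 是 辣妹 from the first tokenizer and 名门贵族 from the second, which is cheaper than either tokenizer alone.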
24
Dictionaries are important for language learning
Manual approach provides high-quality dictionary,
but not scalable
About 7000 languages in the world
About 49 million bilingual dictionaries (7000 x 7000 language pairs)
Thus need automatic approach
25
Lots of online dictionaries available
Could we automatically learn new dictionaries
from them?
Focus on Chinese-English (C-E) & Korean-English (K-E) bilingual dictionaries
26
Lots of dictionaries online
Some are C-E and K-E, but many are not
Many dictionaries are C-X and X-E
Use language X as bridge/pivot
C-X + X-E => C-E, e.g.,
辣妹->fille sexy + fille sexy ->hot chick
=> 辣妹-> hot chick
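The two-hop composition can be sketched as a join on the pivot language. This is a minimal sketch under stated assumptions: dictionaries are represented as plain Python dicts mapping a headword to a set of translations, and the function name `pivot` is hypothetical. The single entry is the example from the slide.

```python
from collections import defaultdict


def pivot(source_to_x, x_to_target):
    """Compose source->X and X->target dictionaries into source->target."""
    composed = defaultdict(set)
    for source_word, pivot_words in source_to_x.items():
        for pivot_word in pivot_words:
            # Follow the second hop only when the pivot word has an entry.
            composed[source_word].update(x_to_target.get(pivot_word, ()))
    # Drop source words for which no second hop was found.
    return {word: targets for word, targets in composed.items() if targets}


chinese_french = {"辣妹": {"fille sexy"}}
french_english = {"fille sexy": {"hot chick"}}
chinese_english = pivot(chinese_french, french_english)
```

Each pivot hop can introduce wrong senses, which is consistent with the accuracy figures being reported per dictionary rather than assumed to be perfect.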
27
Take 2 hops for now
Chinese-English dictionary has 750K entries
90% correct
Korean-English dictionary has 100K entries
99% correct
28
Learn bilingual dictionary using
• Seed lexicon
• Monolingual data (plentiful)
Maps bilingual phrases to vector space
Example pairs: dolphin / 海豚, Tokyo / 东京, Sushi / 寿司
29
30
31
Artifact of standard machine translation pipeline
Parallel sentences aligned word for word
Compute probability of mapping tokens of a
source language to those of a target language
A correct source token will be more
consistently aligned to its corresponding
target token(s)
Add high-probability mappings to dictionary
32
Chinese English P(C|E) P(E|C) AveProb
辣妹 hot chick 0.8 0.9 0.85
是辣 is curry 0.1 0.1 0.1
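The averaging and filtering step can be sketched as follows. This is an illustration only: the function name `select_entries` and the cut-off of 0.5 are assumptions, since the slide says "high-probability" without giving a threshold. The two rows are the ones in the table above.

```python
def select_entries(phrase_table, threshold=0.5):
    """Keep phrase pairs whose average translation probability is high.

    Each row is (chinese, english, P(C|E), P(E|C)).
    `threshold` is an assumed cut-off, not a value from the talk.
    """
    entries = []
    for chinese, english, p_c_given_e, p_e_given_c in phrase_table:
        average = (p_c_given_e + p_e_given_c) / 2
        if average >= threshold:
            entries.append((chinese, english, average))
    return entries


phrase_table = [
    ("辣妹", "hot chick", 0.8, 0.9),
    ("是辣", "is curry", 0.1, 0.1),
]
selected = select_entries(phrase_table)
```

Averaging both directions penalises pairs that look plausible in only one direction of the alignment.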
33
Chinese-English Dictionary
3 million Chinese tokens (Jan’17)
89% in dictionary
Korean-English Dictionary
4 million Korean tokens (Jan’17)
86% in dictionary
34
[Chart: #KoreanTokens vs. #Definitions; x-axis 1 to 39, y-axis 0 to 500,000]
[Chart: #ChineseTokens vs. #Definitions; x-axis 1 to 28, y-axis 0 to 450,000]
35
Match parallel sentences to
Phrase table
Dictionary
36
他 放弃 梦想
He gave up his dreams
Chinese English AveProb
放弃 gave up his 0.74
放弃 quit 0.83
放弃 abdicate 0.68
Phrase Table
37
他 放弃 梦想
He gave up his dreams
Chinese English AveProb
放弃 gave up his 0.74
放弃 quit 0.83
放弃 abdicate 0.68
Phrase Table
Best Match
他 放弃 梦想
He gave up his dreams
best match
38
Chinese English AveProb
放弃 gave up his 0.74
放弃 quit 0.83
放弃 abdicate 0.68
Phrase Table
best match
Chinese English
放弃 abandon
放弃 give up
放弃 abdicate
Dictionary
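A possible reading of the two "best match" steps is sketched below. The talk does not state the matching rule, so both rules here are assumptions: the phrase-table entry is chosen as the highest-scoring candidate that literally occurs in the English sentence, and the dictionary definition is chosen by word overlap with that phrase. Function names are hypothetical.

```python
def best_phrase(source_token, english_sentence, phrase_table):
    """Highest-scoring phrase-table candidate that occurs in the sentence."""
    padded = f" {english_sentence.lower()} "
    hits = [
        (probability, phrase)
        for token, phrase, probability in phrase_table
        if token == source_token and f" {phrase.lower()} " in padded
    ]
    return max(hits)[1] if hits else None


def best_definition(phrase, definitions):
    """Dictionary definition sharing the most words with the phrase."""
    words = set(phrase.lower().split())

    def overlap(definition):
        return len(words & set(definition.lower().split()))

    best = max(definitions, key=overlap)
    return best if overlap(best) > 0 else None


phrase_table = [
    ("放弃", "gave up his", 0.74),
    ("放弃", "quit", 0.83),
    ("放弃", "abdicate", 0.68),
]
phrase = best_phrase("放弃", "He gave up his dreams", phrase_table)
definition = best_definition(phrase, ["abandon", "give up", "abdicate"])
```

Note that "quit" has the higher AveProb but does not occur in this sentence, so the sentence context, not the score alone, decides the match in this sketch.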
Drama Vocabulary Quiz
Liling Tan
Rakuten Institute of Technology (Singapore)
28 Oct 2017 @ Rakuten Tech. Conference
40
Overview
•Introduction
•Demo
•How did We Create the Quiz?
41
Introduction
•Quizzes are fun and could be viral
•But manually creating quizzes is tedious
•We created #DramaVocabQuiz that generates new
vocabulary quizzes automatically
42
43
44
45
46
47
48
How do we Generate
Quizzes
Automatically?
49
Korean Drama Word List
• The word 미남 [minam] “handsome guy” can be followed by multiple suffixes at once, -이시라구요 [-issilaguyo], to form a single word meaning “someone said that he is handsome”.
• We only extract the root word 미남 [minam], and count it as a unique word type
50
Korean Drama Word List
51
Korean Drama Word List
52
Korean Drama Word List
53
Splitting Word List into 3 Difficulty Levels
54
Generate the Distractors
• Distractor 1: Select the top 5th to 20th closest words (cosine)
• Distractor 2: Use Distractor 1 as negative and question word as
positive, select 1st to 20th closest word (cosmul)
References:
• Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. In ICLR.
• Omer Levy and Yoav Goldberg. 2014. Linguistic Regularities in Sparse and Explicit Word Representations. In CoNLL.
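The two distractor rules can be sketched in pure Python on toy vectors. Several things here are assumptions rather than facts from the talk: that one word is drawn at random from each rank range, the function names, the toy 2-D vectors, and the use of the multiplicative (cosmul) objective of Levy and Goldberg with cosines shifted to [0, 1]. A real system would use trained word embeddings.

```python
import math
import random


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def nearest(word, vectors):
    """Other words, sorted by decreasing cosine similarity to `word`."""
    others = [w for w in vectors if w != word]
    return sorted(others, key=lambda w: cosine(vectors[word], vectors[w]), reverse=True)


def nearest_cosmul(positive, negative, vectors, eps=1e-6):
    """Words close to `positive` and far from `negative` (multiplicative score)."""
    def shifted(a, b):
        return (1 + cosine(vectors[a], vectors[b])) / 2

    others = [w for w in vectors if w not in (positive, negative)]
    return sorted(
        others,
        key=lambda w: shifted(w, positive) / (shifted(w, negative) + eps),
        reverse=True,
    )


def make_distractors(word, vectors, rng, first=5, last=20):
    # Distractor 1: one of the 5th to 20th closest words by cosine.
    distractor1 = rng.choice(nearest(word, vectors)[first - 1:last])
    # Distractor 2: question word positive, distractor 1 negative, top 20 by cosmul.
    distractor2 = rng.choice(nearest_cosmul(word, distractor1, vectors)[:last])
    return distractor1, distractor2


# Toy vocabulary (assumption): 30 words spread along an arc.
vectors = {f"w{i}": (math.cos(0.1 * i), math.sin(0.1 * i)) for i in range(30)}
d1, d2 = make_distractors("w0", vectors, random.Random(0))
```

Skipping the four closest words for the first distractor plausibly avoids near-synonyms of the answer, while the second rule pushes the other distractor away from the first.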
55
Language Learners Like Quizzes!!
• 60,000+ quizzes taken
• 35,000+ unique users completed quiz
• 16% of the users repeated quiz
56
Word Frequency is a Good Indicator of Difficulty
[Bar chart: Easy, Medium, Hard; y-axis 0 to 10]
Easy = Frequent words
Medium = Less frequent words
Hard = Least frequent words
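The frequency-based split can be sketched as below. The talk does not say where the cut-offs lie, so splitting the ranked list into three equally sized bands is an assumption, as are the function name and the toy word counts.

```python
from collections import Counter


def split_by_frequency(words, levels=("easy", "medium", "hard")):
    """Rank word types by frequency and cut the ranking into equal bands."""
    ranked = [word for word, _ in Counter(words).most_common()]
    band = -(-len(ranked) // len(levels))  # ceiling division
    return {
        level: ranked[i * band:(i + 1) * band]
        for i, level in enumerate(levels)
    }


# Toy corpus of root words (assumption), most frequent first.
corpus = (
    ["사랑"] * 5 + ["미남"] * 4 + ["친구"] * 3
    + ["학교"] * 2 + ["바다"] * 2 + ["우산"]
)
levels = split_by_frequency(corpus)
```

Here the two most frequent root words land in the easy band and the two rarest in the hard band.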
57
Conclusion
Watch Drama,
Learn Language
Quiz: https://languagequiz.viki.com
Techblog: https://techblog.rakuten.co.jp/2017/05/26/lang-quiz/
Oct.28.2017
Pang Zineng
Senior Technologist
Rakuten Institute of Technology Singapore
59
* Images from Rakuten VIKI
60
Web Search: pages
In-Video Search: clips
* Images from Rakuten VIKI
61
Web Search
• The metadata of the site
• The metadata of the page
• The word tokens in the page
• The topic of the page
• The originality of the page
• Hyperlinks (page rank)
Labels: site identifier, page identifier, content ranking, search relevancy
In-Video Search
• The metadata of the video
• The metadata of this clip (timestamp, length, URI, etc.)
• The caption text of the clip
• The frames & audio signal
• Complexity of the sentence
• Diversity of the clips
Labels: video identifier, clip identifier, search relevancy, content ranking
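To make the in-video side concrete, here is a minimal sketch of a clip record and a caption-based lookup. It is not the system described in the talk: the `Clip` fields, the whitespace tokenization and the function names are all hypothetical, and ranking by sentence complexity or clip diversity is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    video_id: str      # video identifier
    start_sec: float   # clip identifier: timestamp within the video
    length_sec: float
    caption: str       # caption text used for search relevancy


def build_index(clips):
    """Inverted index from caption token to the clips that contain it."""
    index = {}
    for clip in clips:
        for token in set(clip.caption.lower().split()):
            index.setdefault(token, []).append(clip)
    return index


def search(index, query):
    """Return the clips whose caption contains the query token."""
    return index.get(query.lower(), [])


clips = [
    Clip("drama-1", 12.0, 3.5, "He gave up his dreams"),
    Clip("drama-2", 48.0, 2.0, "Never give up"),
]
index = build_index(clips)
hits = search(index, "dreams")
```

The point of the sketch is the unit of retrieval: a hit is a clip inside a video, addressed by video identifier plus timestamp, rather than a whole page.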
* Images from Rakuten VIKI
62
Job:
• Make some data ready for consumption.
Questions:
• How does the data come?
• What needs to be done for it to be ready?
• How will the data be consumed?
[Diagram components: Data provider, FTP, Raw Data, Trigger / monitor function, Pre-processing function, database, Data access function, API, Data consumer]
63
Job:
• Let outsiders use a function.
Questions:
• How frequently will the function be used?
• What data does the function need?
[Diagram components: Web Application, API Endpoint, API Cache, Request Queue, Application logic, Application Cache, Internal/External Data]
64
Motion Dictionary
[Diagram components: Rakuten VIKI video contents, Rakuten TV video contents, Other video contents, Search function, 3rd Party Platform]
* Images from Rakuten VIKI
65
Interactive Subtitles (version 2) and Interactive Subtitles (version 3)
[Diagram components: dictionary function, voice function, tokenization function; Japanese, Korean and Chinese Dictionary Data; Japanese, Korean and Chinese Tokenization Data; backends labeled 3rd party solution, open source framework and In-house solution]
* Images from Rakuten VIKI
66
Interactive Subtitles (version 2) and Interactive Subtitles (version 3)
[Diagram components: dictionary function, voice function, tokenization function; Japanese, Korean, Chinese and Global Dictionary Data; Japanese, Korean, Chinese and Global Tokenization Data; backends labeled 3rd party solution, open source framework and In-house solution]
* Images from Rakuten VIKI
67
Vocab Quiz (version 1)
[Diagram components: Take Quiz function, Chinese Quiz Data, Korean Quiz Data]
* Images from Rakuten VIKI
68
Vocab Quiz (version 2)
[Diagram components: Take Quiz function, voice function, Chinese Quiz Data, Korean Quiz Data]
* Images from Rakuten VIKI
69
Fast iteration in R&D won’t be possible
if we had many things bundled or coupled.
-- Pang
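The decoupling argument can be illustrated with a small sketch: callers depend on one tokenization function, while the per-language backend behind it can be swapped without touching them. The registry, the function names and the stand-in backends are hypothetical and are not taken from the Rakuten codebase.

```python
from typing import Callable, Dict, List

Tokenizer = Callable[[str], List[str]]

# Registry of per-language backends (hypothetical design).
_BACKENDS: Dict[str, Tokenizer] = {}


def register(language: str, backend: Tokenizer) -> None:
    """Plug in or replace the backend used for one language."""
    _BACKENDS[language] = backend


def tokenize(language: str, text: str) -> List[str]:
    """The single entry point that products such as subtitles would call."""
    return _BACKENDS[language](text)


# Start with a stand-in for an off-the-shelf backend...
register("ko", str.split)
before = tokenize("ko", "미남 이시라구요")

# ...then swap in a different backend without changing any caller.
register("ko", lambda text: [part for part in text.split() if part != "이시라구요"])
after = tokenize("ko", "미남 이시라구요")
```

Because only `register` changes when a backend is replaced, an experiment with a new tokenizer for one language does not require redeploying the features built on top of `tokenize`.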
Vocab Quiz
• https://languagequiz.viki.com/
Learn Mode (PC/Mac only)
• https://www.viki.com/collections/316981l-learn-the-basics-chinese
• https://www.viki.com/collections/316939l-learn-the-basics-korean
Motion Dictionary
• TBD

Weitere ähnliche Inhalte

Andere mochten auch

Challenge for statup's cto from big company nagaaki hoshi
Challenge for statup's cto from big company nagaaki hoshiChallenge for statup's cto from big company nagaaki hoshi
Challenge for statup's cto from big company nagaaki hoshiRakuten Group, Inc.
 
Rakutenとsreと私 yanagimoto koichi
Rakutenとsreと私 yanagimoto koichiRakutenとsreと私 yanagimoto koichi
Rakutenとsreと私 yanagimoto koichiRakuten Group, Inc.
 
AI AND FUNDAMENTAL GAME TECHNOLOGIESIN FINAL FANTASY XV
AI AND FUNDAMENTAL GAME TECHNOLOGIESIN FINAL FANTASY XVAI AND FUNDAMENTAL GAME TECHNOLOGIESIN FINAL FANTASY XV
AI AND FUNDAMENTAL GAME TECHNOLOGIESIN FINAL FANTASY XVRakuten Group, Inc.
 
はてなのインフラの歴史、そしてMackerelへ至る道とこれから
はてなのインフラの歴史、そしてMackerelへ至る道とこれから はてなのインフラの歴史、そしてMackerelへ至る道とこれから
はてなのインフラの歴史、そしてMackerelへ至る道とこれから Rakuten Group, Inc.
 
Value Delivery through RakutenBig Data Intelligence Ecosystem and Technology
Value Delivery through RakutenBig Data Intelligence Ecosystem  and  TechnologyValue Delivery through RakutenBig Data Intelligence Ecosystem  and  Technology
Value Delivery through RakutenBig Data Intelligence Ecosystem and TechnologyRakuten Group, Inc.
 
WannaEat: A computer vision-based, multi-platform restaurant lookup app
WannaEat: A computer vision-based, multi-platform restaurant lookup appWannaEat: A computer vision-based, multi-platform restaurant lookup app
WannaEat: A computer vision-based, multi-platform restaurant lookup appRakuten Group, Inc.
 
時間がないといって、オペレーション改善を怠るな~オペレーション改善奮闘記~ Emi muroya
時間がないといって、オペレーション改善を怠るな~オペレーション改善奮闘記~ Emi muroya時間がないといって、オペレーション改善を怠るな~オペレーション改善奮闘記~ Emi muroya
時間がないといって、オペレーション改善を怠るな~オペレーション改善奮闘記~ Emi muroyaRakuten Group, Inc.
 
Rakuten app productivity initiative for developers marcus saw
Rakuten app productivity initiative for developers marcus sawRakuten app productivity initiative for developers marcus saw
Rakuten app productivity initiative for developers marcus sawRakuten Group, Inc.
 
Rakuten Technology Conference 2017 A Distributed SQL Database For Data Analy...
Rakuten Technology Conference 2017 A Distributed SQL Database  For Data Analy...Rakuten Technology Conference 2017 A Distributed SQL Database  For Data Analy...
Rakuten Technology Conference 2017 A Distributed SQL Database For Data Analy...Rakuten Group, Inc.
 
Java ee7 with apache spark for the world's largest credit card core systems, ...
Java ee7 with apache spark for the world's largest credit card core systems, ...Java ee7 with apache spark for the world's largest credit card core systems, ...
Java ee7 with apache spark for the world's largest credit card core systems, ...Rakuten Group, Inc.
 
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platformcloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data PlatformRakuten Group, Inc.
 
Change the engineer life by batch system renewal
Change the engineer life by batch system renewalChange the engineer life by batch system renewal
Change the engineer life by batch system renewalRakuten Group, Inc.
 
Building your own static site Using Hugo
Building your own static site Using HugoBuilding your own static site Using Hugo
Building your own static site Using HugoRakuten Group, Inc.
 
RTC 2017 - The Power of Parallelism
RTC 2017 - The Power of ParallelismRTC 2017 - The Power of Parallelism
RTC 2017 - The Power of ParallelismRakuten Group, Inc.
 
Artificial Intelligence for Happiness of People
Artificial Intelligence for Happiness of PeopleArtificial Intelligence for Happiness of People
Artificial Intelligence for Happiness of PeopleRakuten Group, Inc.
 

Andere mochten auch (20)

Challenge for statup's cto from big company nagaaki hoshi
Challenge for statup's cto from big company nagaaki hoshiChallenge for statup's cto from big company nagaaki hoshi
Challenge for statup's cto from big company nagaaki hoshi
 
Don't manage too hard!
Don't manage too hard! Don't manage too hard!
Don't manage too hard!
 
Human-Centric Machine Learning
Human-Centric Machine LearningHuman-Centric Machine Learning
Human-Centric Machine Learning
 
Rakutenとsreと私 yanagimoto koichi
Rakutenとsreと私 yanagimoto koichiRakutenとsreと私 yanagimoto koichi
Rakutenとsreと私 yanagimoto koichi
 
AI AND FUNDAMENTAL GAME TECHNOLOGIESIN FINAL FANTASY XV
AI AND FUNDAMENTAL GAME TECHNOLOGIESIN FINAL FANTASY XVAI AND FUNDAMENTAL GAME TECHNOLOGIESIN FINAL FANTASY XV
AI AND FUNDAMENTAL GAME TECHNOLOGIESIN FINAL FANTASY XV
 
はてなのインフラの歴史、そしてMackerelへ至る道とこれから
はてなのインフラの歴史、そしてMackerelへ至る道とこれから はてなのインフラの歴史、そしてMackerelへ至る道とこれから
はてなのインフラの歴史、そしてMackerelへ至る道とこれから
 
Value Delivery through RakutenBig Data Intelligence Ecosystem and Technology
Value Delivery through RakutenBig Data Intelligence Ecosystem  and  TechnologyValue Delivery through RakutenBig Data Intelligence Ecosystem  and  Technology
Value Delivery through RakutenBig Data Intelligence Ecosystem and Technology
 
WannaEat: A computer vision-based, multi-platform restaurant lookup app
WannaEat: A computer vision-based, multi-platform restaurant lookup appWannaEat: A computer vision-based, multi-platform restaurant lookup app
WannaEat: A computer vision-based, multi-platform restaurant lookup app
 
COBOL to Apache Spark
COBOL to Apache SparkCOBOL to Apache Spark
COBOL to Apache Spark
 
時間がないといって、オペレーション改善を怠るな~オペレーション改善奮闘記~ Emi muroya
時間がないといって、オペレーション改善を怠るな~オペレーション改善奮闘記~ Emi muroya時間がないといって、オペレーション改善を怠るな~オペレーション改善奮闘記~ Emi muroya
時間がないといって、オペレーション改善を怠るな~オペレーション改善奮闘記~ Emi muroya
 
Rakuten app productivity initiative for developers marcus saw
Rakuten app productivity initiative for developers marcus sawRakuten app productivity initiative for developers marcus saw
Rakuten app productivity initiative for developers marcus saw
 
Rakuten Technology Conference 2017 A Distributed SQL Database For Data Analy...
Rakuten Technology Conference 2017 A Distributed SQL Database  For Data Analy...Rakuten Technology Conference 2017 A Distributed SQL Database  For Data Analy...
Rakuten Technology Conference 2017 A Distributed SQL Database For Data Analy...
 
Java ee7 with apache spark for the world's largest credit card core systems, ...
Java ee7 with apache spark for the world's largest credit card core systems, ...Java ee7 with apache spark for the world's largest credit card core systems, ...
Java ee7 with apache spark for the world's largest credit card core systems, ...
 
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platformcloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
 
Change the engineer life by batch system renewal
Change the engineer life by batch system renewalChange the engineer life by batch system renewal
Change the engineer life by batch system renewal
 
Building your own static site Using Hugo
Building your own static site Using HugoBuilding your own static site Using Hugo
Building your own static site Using Hugo
 
Realizing AI Conversational Bot
Realizing AI Conversational BotRealizing AI Conversational Bot
Realizing AI Conversational Bot
 
RTC 2017 - The Power of Parallelism
RTC 2017 - The Power of ParallelismRTC 2017 - The Power of Parallelism
RTC 2017 - The Power of Parallelism
 
Riemannian Geometry in Egison
Riemannian Geometry in EgisonRiemannian Geometry in Egison
Riemannian Geometry in Egison
 
Artificial Intelligence for Happiness of People
Artificial Intelligence for Happiness of PeopleArtificial Intelligence for Happiness of People
Artificial Intelligence for Happiness of People
 

Ähnlich wie Enable Fast Iteration in R&D- Use modular, loosely coupled architectures so changes don't have widespread impact- Automate testing and deployments to streamline the development cycle - Implement continuous integration/delivery to get feedback quickly- Empower cross-functional teams with autonomy over their work- Adopt agile methodologies like Scrum, Kanban to support experimentation- Colocate teams physically to facilitate collaboration and rapid problem-solving- Leverage cloud infrastructure for flexible, on-demand compute resources- Invest in tools that enhance developer productivity like IDEs, version control etc.- Foster a culture

Shut Up! No one is listening! Web 2.0 and Mobile Media Are Speaking.
Shut Up! No one is listening! Web 2.0 and Mobile Media Are Speaking.Shut Up! No one is listening! Web 2.0 and Mobile Media Are Speaking.
Shut Up! No one is listening! Web 2.0 and Mobile Media Are Speaking.Courtney Teague
 
Intelligent Chatbot on WeChat
Intelligent Chatbot on WeChatIntelligent Chatbot on WeChat
Intelligent Chatbot on WeChatAI Frontiers
 
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and ApplicationsICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and ApplicationsForward Gradient
 
Evaluation of online learning
Evaluation of online learningEvaluation of online learning
Evaluation of online learningshatha al abeer
 
Tutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisTutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisFabio Benedetti
 
[Rakuten TechConf2014] [D-2] The Pattern-Matching-Oriented Programming Langua...
[Rakuten TechConf2014] [D-2] The Pattern-Matching-Oriented Programming Langua...[Rakuten TechConf2014] [D-2] The Pattern-Matching-Oriented Programming Langua...
[Rakuten TechConf2014] [D-2] The Pattern-Matching-Oriented Programming Langua...Rakuten Group, Inc.
 
Nlp and Neural Networks workshop
Nlp and Neural Networks workshopNlp and Neural Networks workshop
Nlp and Neural Networks workshopQuantUniversity
 
KiwiPyCon 2014 - NLP with Python tutorial
KiwiPyCon 2014 - NLP with Python tutorialKiwiPyCon 2014 - NLP with Python tutorial
KiwiPyCon 2014 - NLP with Python tutorialAlyona Medelyan
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingIla Group
 
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...Data Science Milan
 
16-nlp (2).ppt
16-nlp (2).ppt16-nlp (2).ppt
16-nlp (2).ppttestbest6
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPMENGSAYLOEM1
 
UX STRAT Europe 2019: Zhaochang He, VMware
UX STRAT Europe 2019: Zhaochang He, VMwareUX STRAT Europe 2019: Zhaochang He, VMware
UX STRAT Europe 2019: Zhaochang He, VMwareUX STRAT
 
Can Deep Learning solve the Sentiment Analysis Problem
Can Deep Learning solve the Sentiment Analysis ProblemCan Deep Learning solve the Sentiment Analysis Problem
Can Deep Learning solve the Sentiment Analysis ProblemMark Cieliebak
 
Innovations in AI-Powered Assessments and Feedback
Innovations in AI-Powered Assessments and FeedbackInnovations in AI-Powered Assessments and Feedback
Innovations in AI-Powered Assessments and Feedbackorrenprunckun
 
Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineer...
Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineer...Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineer...
Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineer...Preetha Chatterjee
 

Ähnlich wie Enable Fast Iteration in R&D- Use modular, loosely coupled architectures so changes don't have widespread impact- Automate testing and deployments to streamline the development cycle - Implement continuous integration/delivery to get feedback quickly- Empower cross-functional teams with autonomy over their work- Adopt agile methodologies like Scrum, Kanban to support experimentation- Colocate teams physically to facilitate collaboration and rapid problem-solving- Leverage cloud infrastructure for flexible, on-demand compute resources- Invest in tools that enhance developer productivity like IDEs, version control etc.- Foster a culture (20)

Shut Up! No one is listening! Web 2.0 and Mobile Media Are Speaking.
Shut Up! No one is listening! Web 2.0 and Mobile Media Are Speaking.Shut Up! No one is listening! Web 2.0 and Mobile Media Are Speaking.
Shut Up! No one is listening! Web 2.0 and Mobile Media Are Speaking.
 
Intelligent Chatbot on WeChat
Intelligent Chatbot on WeChatIntelligent Chatbot on WeChat
Intelligent Chatbot on WeChat
 
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and ApplicationsICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
ICDM 2019 Tutorial: Speech and Language Processing: New Tools and Applications
 
Evaluation of online learning
Evaluation of online learningEvaluation of online learning
Evaluation of online learning
 
Tutorial of Sentiment Analysis
Tutorial of Sentiment AnalysisTutorial of Sentiment Analysis
Tutorial of Sentiment Analysis
 
[Rakuten TechConf2014] [D-2] The Pattern-Matching-Oriented Programming Langua...
[Rakuten TechConf2014] [D-2] The Pattern-Matching-Oriented Programming Langua...[Rakuten TechConf2014] [D-2] The Pattern-Matching-Oriented Programming Langua...
[Rakuten TechConf2014] [D-2] The Pattern-Matching-Oriented Programming Langua...
 
Nlp and Neural Networks workshop
Nlp and Neural Networks workshopNlp and Neural Networks workshop
Nlp and Neural Networks workshop
 
Let's pretend
Let's pretendLet's pretend
Let's pretend
 
KiwiPyCon 2014 - NLP with Python tutorial
KiwiPyCon 2014 - NLP with Python tutorialKiwiPyCon 2014 - NLP with Python tutorial
KiwiPyCon 2014 - NLP with Python tutorial
 
The NLP Muppets revolution!
The NLP Muppets revolution!The NLP Muppets revolution!
The NLP Muppets revolution!
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
50 Shades of Text - Leveraging Natural Language Processing (NLP), Alessandro ...
 
16-nlp (2).ppt
16-nlp (2).ppt16-nlp (2).ppt
16-nlp (2).ppt
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLP
 
UX STRAT Europe 2019: Zhaochang He, VMware
UX STRAT Europe 2019: Zhaochang He, VMwareUX STRAT Europe 2019: Zhaochang He, VMware
UX STRAT Europe 2019: Zhaochang He, VMware
 
Can Deep Learning solve the Sentiment Analysis Problem
Can Deep Learning solve the Sentiment Analysis ProblemCan Deep Learning solve the Sentiment Analysis Problem
Can Deep Learning solve the Sentiment Analysis Problem
 
Python dictionaries
Python dictionariesPython dictionaries
Python dictionaries
 
Innovations in AI-Powered Assessments and Feedback
Innovations in AI-Powered Assessments and FeedbackInnovations in AI-Powered Assessments and Feedback
Innovations in AI-Powered Assessments and Feedback
 
1004-nlp.ppt
1004-nlp.ppt1004-nlp.ppt
1004-nlp.ppt
 
Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineer...
Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineer...Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineer...
Exploratory Study of Slack Q&A Chats as a Mining Source for Software Engineer...
 

Mehr von Rakuten Group, Inc.

コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話Rakuten Group, Inc.
 
楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のりRakuten Group, Inc.
 
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...Rakuten Group, Inc.
 
DataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組みDataSkillCultureを浸透させる楽天の取り組み
DataSkillCultureを浸透させる楽天の取り組みRakuten Group, Inc.
 
大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開大規模なリアルタイム監視の導入と展開
大規模なリアルタイム監視の導入と展開Rakuten Group, Inc.
 
楽天における大規模データベースの運用
楽天における大規模データベースの運用楽天における大規模データベースの運用
楽天における大規模データベースの運用Rakuten Group, Inc.
 
楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャー楽天サービスを支えるネットワークインフラストラクチャー
楽天サービスを支えるネットワークインフラストラクチャーRakuten Group, Inc.
 
楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割楽天の規模とクラウドプラットフォーム統括部の役割
楽天の規模とクラウドプラットフォーム統括部の役割Rakuten Group, Inc.
 
Rakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Services and Infrastructure Team.pdf
Rakuten Services and Infrastructure Team.pdfRakuten Group, Inc.
 
The Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfThe Data Platform Administration Handling the 100 PB.pdf
The Data Platform Administration Handling the 100 PB.pdfRakuten Group, Inc.
 
Supporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfSupporting Internal Customers as Technical Account Managers.pdf
Supporting Internal Customers as Technical Account Managers.pdfRakuten Group, Inc.
 
Making Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfMaking Cloud Native CI_CD Services.pdf
Making Cloud Native CI_CD Services.pdfRakuten Group, Inc.
 
How We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfHow We Defined Our Own Cloud.pdf
How We Defined Our Own Cloud.pdfRakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoRakuten Group, Inc.
 
Travel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoTravel & Leisure Platform Department's tech info
Travel & Leisure Platform Department's tech infoRakuten Group, Inc.
 
Introduction of GORA API Group technology
Introduction of GORA API Group technologyIntroduction of GORA API Group technology
Introduction of GORA API Group technologyRakuten Group, Inc.
 
100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情100PBを越えるデータプラットフォームの実情
100PBを越えるデータプラットフォームの実情Rakuten Group, Inc.
 
社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャー社内エンジニアを支えるテクニカルアカウントマネージャー
社内エンジニアを支えるテクニカルアカウントマネージャーRakuten Group, Inc.
 

Mehr von Rakuten Group, Inc. (20)

コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
コードレビュー改善のためにJenkinsとIntelliJ IDEAのプラグインを自作してみた話
 
楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり楽天における安全な秘匿情報管理への道のり
楽天における安全な秘匿情報管理への道のり
 
What Makes Software Green?
What Makes Software Green?What Makes Software Green?
What Makes Software Green?
 
Simple and Effective Knowledge-Driven Query Expansion for QA-Based Product At...
  • 1. Oct.28.2017 Ewa Szymanska, PhD Head of Rakuten Institute of Technology Singapore
  • 3. “I am watching shows in Chinese to get used to ‘actual’ spoken Mandarin, and not just what I see in my textbooks” (VIKI user)
  • 4. * Images from Rakuten VIKI, Rakuten TV
  • 5. 1.8 billion people are learning foreign languages. Charts: languages with most native speakers; most commonly studied foreign languages. Source: The Washington Post: https://www.washingtonpost.com/news/worldviews/wp/2015/04/23/the-worlds-languages-in-7-maps-and-charts
  • 6. The online individual language learning market is growing at 12% CAGR. Source: Rosetta Stone Investor Day 2017
  • 7. I. Entertaining Content; II. Global Users; III. Technology. *Photo by Jakob Owens on Unsplash
  • 9. 1. Interactive subtitles. Fast adoption: 30,000 DAU (daily active users). High engagement: Korean Learn Mode users view 10% more than Viki average. High satisfaction: 83 NPS (net promoter score). *cnet.com @ CBS Interactive Inc. Apr 13, 2017; Keia.org, Korean Economic Institute, Apr 2017; Forbes Oct 24, 2017; The Verge, Sep 28, 2017
  • 10. Shows availability “Daughter Back” “Return of Happiness” “Ice and Fire of Youth” “My Love from the Star” “Boys Over Flowers” “Descendants of the Sun” Learn Chinese (Japan) Learn Korean (USA) * Images from Rakuten VIKI [ Learn Mode collection on viki.com ]
  • 11. 2. Drama Vocab Quiz [ languagequiz.viki.com ]: 60,000+ quizzes taken; 35,000+ users completed the quiz; very positive social media engagement.
  • 12. 3. Video-based Dictionary. Integrate with the classroom curriculum.
  • 13. “If you talk to a man in a language he understands, that goes to his head. If you talk to him in his language, that goes to his heart.” - Nelson Mandela
  • 15. Oct 28, 2017 Stanley Kok Principal Research Scientist Rakuten Institute of Technology (Singapore)
  • 16. Tokenization: splitting a sentence into pieces, each preserving its original semantics. Sentence: 你是辣妹,也是名门贵族. Correct split: 你 / 是 / 辣妹 / , / 也是 / 名门贵族 = you are (a) hot chick and also (of) the gentry. Incorrect split: 你 / 是 / 辣妹 / , / 也是 / 名门贵 / 族 = you are (a) hot chick and also ... tribe.
  • 17. Sentence: 努力的人才会成功. Split 1: 努力 / 的 / 人 / 才 / 会 / 成功 = only hardworking people will succeed. Split 2: 努力 / 的 / 人才 / 会 / 成功 = hardworking talent will succeed.
  • 20. Many open-source tokenizers are available. They are good, but not perfect, and they make different mistakes. Why not use more (or all) of them to improve tokenization? The strengths of one tokenizer can overcome the shortcomings of another.
  • 21. How to quantify the “goodness” of a tokenization? Take the human learner’s perspective: count the number of dictionary look-ups needed to understand all tokens. Non-existent tokens are assumed to need a large number of look-ups (10). Examples: 你 / 是 / 辣妹 (you, are, hot chick): 1 + 1 + 1 = 3. 你 / 是 / 辣 / 妹 (you, are, spicy, younger sister): 1 + 1 + 1 + 1 = 4. 你 / 是辣 / 妹 (you, ?, younger sister): 1 + 10 + 1 = 12.
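The look-up cost on slide 21 can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the toy dictionary and the names `DICTIONARY`, `UNKNOWN_COST`, and `lookup_cost` are assumptions, not part of the talk.

```python
# Hypothetical toy dictionary; a real system would use a full lexicon.
DICTIONARY = {"你", "是", "辣妹", "辣", "妹"}
UNKNOWN_COST = 10  # penalty from the slide for tokens missing from the dictionary


def lookup_cost(tokens, dictionary=DICTIONARY, unknown_cost=UNKNOWN_COST):
    """Number of dictionary look-ups a learner needs to understand all tokens."""
    return sum(1 if tok in dictionary else unknown_cost for tok in tokens)


# The three candidate tokenizations shown on the slide.
candidates = [
    ["你", "是", "辣妹"],
    ["你", "是", "辣", "妹"],
    ["你", "是辣", "妹"],
]
costs = [lookup_cost(c) for c in candidates]
best = min(candidates, key=lookup_cost)  # the lowest-cost tokenization
```

With this toy dictionary the three costs come out as 3, 4, and 12, matching the slide.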
  • 22. We can do better than picking the lowest-cost tokenization from the tokenizers: treat common tokens as “anchor points”, then pick the best tokens from the remaining ones.
  • 23. First tokenizer: 你 / 是 / 辣妹 / 也是 / 名门贵 / 族 (you are hot chick and also ... tribe), cost 15. Second tokenizer: 你 / 是辣 / 妹 / 也是 / 名门贵族 (you ... younger sister and also (of) the gentry), cost 14. Combined: 你 / 是 / 辣妹 / 也是 / 名门贵族, cost 5.
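One way to read the anchor-point idea is sketched below. It is a minimal interpretation, not the speaker's implementation: tokens that every tokenizer produces at the same character positions are kept as anchors, and each stretch between anchors is filled with whichever tokenizer's pieces have the lowest look-up cost. The toy dictionary and all function names are assumptions.

```python
# Hypothetical toy dictionary and cost, following the look-up cost from slide 21.
DICTIONARY = {"你", "是", "辣妹", "妹", "也是", "名门贵族", "族"}


def cost(tokens):
    return sum(1 if tok in DICTIONARY else 10 for tok in tokens)


def spans(tokens):
    """Character (start, end) offsets of each token."""
    out, pos = [], 0
    for tok in tokens:
        out.append((pos, pos + len(tok)))
        pos += len(tok)
    return out


def combine(tokenizations):
    """Keep spans shared by all tokenizers; fill each gap with the cheapest pieces."""
    text = "".join(tokenizations[0])
    assert all("".join(t) == text for t in tokenizations)
    all_spans = [spans(t) for t in tokenizations]
    anchors = sorted(set(all_spans[0]).intersection(*map(set, all_spans[1:])))
    result, pos = [], 0
    # The sentinel span at the end of the text flushes the final gap.
    for start, end in anchors + [(len(text), len(text))]:
        if start > pos:
            options = [
                [text[s:e] for s, e in sp if pos <= s and e <= start]
                for sp in all_spans
            ]
            result.extend(min(options, key=cost))
        if end > start:
            result.append(text[start:end])
        pos = end
    return result


first = ["你", "是", "辣妹", "也是", "名门贵", "族"]
second = ["你", "是辣", "妹", "也是", "名门贵族"]
combined = combine([first, second])
```

Here 你 and 也是 act as anchors; the first gap is taken from the first tokenizer and the last gap from the second, which reproduces the costs 15, 14, and 5 from slide 23.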
  • 24. Dictionaries are important for language learning. A manual approach provides a high-quality dictionary, but it is not scalable: there are about 7,000 languages in the world, and thus about 49 million bilingual dictionaries. We therefore need an automatic approach.
  • 25. Lots of online dictionaries are available. Could we automatically learn new dictionaries from them? We focus on Chinese-English (C-E) and Korean-English (K-E) bilingual dictionaries.
  • 26. Lots of dictionaries are online. Some are C-E and K-E, but many are not. Many dictionaries are C-X and X-E. Use language X as a bridge/pivot: C-X + X-E => C-E, e.g., 辣妹 -> fille sexy + fille sexy -> hot chick => 辣妹 -> hot chick.
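The pivot step on slide 26 amounts to composing two mappings. The sketch below assumes each dictionary is a plain mapping from a headword to a set of translations; the function name `compose` and the second entry (海豚 / dauphin / dolphin) are illustrative assumptions.

```python
from collections import defaultdict


def compose(src_to_pivot, pivot_to_tgt):
    """Chain a C-X dictionary with an X-E dictionary to get C-E candidates."""
    result = defaultdict(set)
    for src, pivots in src_to_pivot.items():
        for pivot in pivots:
            result[src].update(pivot_to_tgt.get(pivot, ()))
    # Drop source words for which no pivot word had a translation.
    return {src: tgts for src, tgts in result.items() if tgts}


c_to_x = {"辣妹": {"fille sexy"}, "海豚": {"dauphin"}, "寿司": {"sushi"}}
x_to_e = {"fille sexy": {"hot chick"}, "dauphin": {"dolphin"}}  # no entry for "sushi"
c_to_e = compose(c_to_x, x_to_e)
```

Every composed entry is only a candidate: an ambiguous pivot word can introduce wrong translations, which is consistent with the accuracy figures below 100% reported on the next slide.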
  • 27. Take 2 hops for now. The Chinese-English dictionary has 750K entries, 90% correct. The Korean-English dictionary has 100K entries, 99% correct.
  • 28. Learn a bilingual dictionary using a seed lexicon and monolingual data (plentiful). Maps bilingual phrases to a vector space, e.g., dolphin / 海豚, Tokyo / 东京, sushi / 寿司.
  • 31. An artifact of the standard machine translation pipeline: parallel sentences are aligned word for word. Compute the probability of mapping tokens of a source language to those of a target language. A correct source token will be more consistently aligned to its corresponding target token(s). Add high-probability mappings to the dictionary.
  • 32. Phrase table:
      Chinese | English | P(C|E) | P(E|C) | AveProb
      辣妹 | hot chick | 0.8 | 0.9 | 0.85
      是辣 | is curry | 0.1 | 0.1 | 0.1
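The AveProb column is the mean of the two directional probabilities, and "high-probability mappings" can be selected with a cut-off. The sketch below uses the two rows from slide 32; the threshold value of 0.5 and the names `ave_prob` and `THRESHOLD` are assumptions, since the talk does not state the cut-off used.

```python
# Rows from the slide: (Chinese, English, P(C|E), P(E|C)).
rows = [
    ("辣妹", "hot chick", 0.8, 0.9),
    ("是辣", "is curry", 0.1, 0.1),
]
THRESHOLD = 0.5  # hypothetical cut-off, not given in the talk


def ave_prob(p_c_given_e, p_e_given_c):
    """Average of the two directional alignment probabilities."""
    return (p_c_given_e + p_e_given_c) / 2


dictionary = {}
for zh, en, p_ce, p_ec in rows:
    score = ave_prob(p_ce, p_ec)
    if score >= THRESHOLD:
        dictionary.setdefault(zh, []).append((en, score))
```

Under this assumed threshold, 辣妹 -> hot chick is added to the dictionary while the mis-tokenized 是辣 -> is curry is filtered out.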
  • 33. Chinese-English dictionary: 3 million Chinese tokens (Jan ’17), 89% in dictionary. Korean-English dictionary: 4 million Korean tokens (Jan ’17), 86% in dictionary.
  • 34. [Two charts: #KoreanTokens vs. #Definitions; #ChineseTokens vs. #Definitions]
  • 35. Match parallel sentences to the phrase table and the dictionary.
  • 36. Parallel sentence: 他 放弃 梦想 / He gave up his dreams. Phrase table (Chinese | English | AveProb): 放弃 | gave up his | 0.74; 放弃 | quit | 0.83; 放弃 | abdicate | 0.68.
  • 37. Parallel sentence: 他 放弃 梦想 / He gave up his dreams. Phrase table (Chinese | English | AveProb): 放弃 | gave up his | 0.74; 放弃 | quit | 0.83; 放弃 | abdicate | 0.68. The best match in the phrase table is highlighted.
  • 38. Parallel sentence: 他 放弃 梦想 / He gave up his dreams. Phrase table (Chinese | English | AveProb): 放弃 | gave up his | 0.74; 放弃 | quit | 0.83; 放弃 | abdicate | 0.68. Dictionary (Chinese | English): 放弃 | abandon; 放弃 | give up; 放弃 | abdicate. The best match is highlighted in both the phrase table and the dictionary.
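The slides do not spell out how the best match is chosen. One plausible rule, shown here purely as an assumption, is to prefer phrase-table entries whose English side actually occurs in the parallel English sentence, and to break ties (or fall back) by AveProb. The names `contains_phrase` and `best_match` are hypothetical.

```python
def contains_phrase(sentence, phrase):
    """True if the words of `phrase` appear contiguously in `sentence`."""
    words, target = sentence.lower().split(), phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def best_match(sentence, candidates):
    """Pick a translation for one source token, given (english, ave_prob) pairs."""
    in_sentence = [c for c in candidates if contains_phrase(sentence, c[0])]
    pool = in_sentence or candidates  # fall back to all candidates if none occur
    return max(pool, key=lambda c: c[1])[0]


# Phrase-table entries for 放弃 from the slide.
candidates = [("gave up his", 0.74), ("quit", 0.83), ("abdicate", 0.68)]
choice = best_match("He gave up his dreams", candidates)
```

Under this assumed rule, "gave up his" is chosen for the example sentence even though "quit" has the higher AveProb, because only the former occurs in the parallel sentence.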
  • 39. Drama Vocabulary Quiz Liling Tan Rakuten Institute of Technology (Singapore) 28 Oct 2017 @ Rakuten Tech. Conference
  • 41. Introduction: Quizzes are fun and could be viral, but manually creating quizzes is tedious. We created #DramaVocabQuiz, which generates new vocabulary quizzes automatically.
  • 48. How do we Generate Quizzes Automatically?
  • 49. Korean Drama Word List: The word 미남 [minam] “handsome guy” can be followed by multiple suffixes at once, -이시라 구요 [-issilaguyo], to form a single word meaning “someone said that he is handsome”. We only extract the root word 미남 [minam] and count it as a unique word type.
  • 53. Splitting Word List into 3 Difficulty Levels
  • 54. Generate the Distractors. Distractor 1: select from the 5th to 20th closest words (cosine). Distractor 2: use Distractor 1 as negative and the question word as positive, then select from the 1st to 20th closest words (cosmul). References: Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. In ICLR. Omer Levy and Yoav Goldberg. 2014. Linguistic Regularities in Sparse and Explicit Word Representations. In CoNLL.
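The distractor recipe on slide 54 can be sketched with plain Python. This is an illustration under assumptions: the embeddings are random toy vectors rather than trained word vectors, the choice within each rank window is made uniformly at random, and the cosmul score follows the multiplicative form from Levy and Goldberg (2014) with similarities shifted into [0, 1]. All function names are hypothetical.

```python
import math
import random


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_cosine(emb, word):
    """All other words, most similar to `word` first."""
    others = [w for w in emb if w != word]
    return sorted(others, key=lambda w: cosine(emb[w], emb[word]), reverse=True)


def rank_by_cosmul(emb, positive, negative, eps=1e-6):
    """Words close to `positive` but far from `negative` (multiplicative score)."""
    def score(w):
        pos = (1 + cosine(emb[w], emb[positive])) / 2
        neg = (1 + cosine(emb[w], emb[negative])) / 2
        return pos / (neg + eps)

    others = [w for w in emb if w not in (positive, negative)]
    return sorted(others, key=score, reverse=True)


def make_distractors(emb, word, rng):
    d1 = rng.choice(rank_by_cosine(emb, word)[4:20])     # ranks 5 to 20
    d2 = rng.choice(rank_by_cosmul(emb, word, d1)[:20])  # ranks 1 to 20
    return d1, d2


# Toy embeddings: 30 hypothetical words with random 8-dimensional vectors.
rng = random.Random(0)
emb = {f"w{i}": [rng.gauss(0, 1) for _ in range(8)] for i in range(30)}
d1, d2 = make_distractors(emb, "w0", rng)
```

Skipping the four closest words for the first distractor avoids near-synonyms of the answer, and using the first distractor as a negative pushes the second one toward a different region of the vector space.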
  • 55. Language Learners Like Quizzes!! 60,000+ quizzes taken; 35,000+ unique users completed the quiz; 16% of the users repeated the quiz.
  • 56. Word Frequency is a Good Indicator of Difficulty. [Bar chart over Easy, Medium, Hard; y-axis from 0 to 10.] Easy = frequent words; Medium = less frequent words; Hard = least frequent words.
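A frequency-based split into three levels might look like the sketch below. The equal-sized buckets, the function name `split_by_frequency`, and the toy token list are assumptions; the talk only states that more frequent words are treated as easier.

```python
from collections import Counter


def split_by_frequency(tokens, levels=("easy", "medium", "hard")):
    """Rank word types by frequency and cut the ranking into equal-sized levels."""
    ranked = [word for word, _ in Counter(tokens).most_common()]
    size = -(-len(ranked) // len(levels))  # ceiling division
    return {
        level: ranked[i * size:(i + 1) * size]
        for i, level in enumerate(levels)
    }


# Hypothetical root words extracted from subtitles, repeated by frequency.
tokens = (
    ["사랑"] * 5 + ["미남"] * 3 + ["약속"] * 2 + ["운명"] * 2 + ["복수"] + ["재벌"]
)
word_levels = split_by_frequency(tokens)
```

In this toy example the two most frequent words land in the easy level and the two rarest in the hard level.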
  • 57. Conclusion: Watch Drama, Learn Language. Quiz: https://languagequiz.viki.com Techblog: https://techblog.rakuten.co.jp/2017/05/26/lang-quiz/
  • 58. Oct.28.2017 Pang Zineng Senior Technologist Rakuten Institute of Technology Singapore
  • 59. * Images from Rakuten VIKI
  • 60. Web Search: pages. In-Video Search: clips. * Images from Rakuten VIKI
  • 61. Web Search signals: the metadata of the site; the metadata of the page; the word tokens in the page; the topic of the page; the originality of the page; hyperlinks (page rank). In-Video Search signals: the metadata of the video; the metadata of the clip (timestamp, length, URI, etc.); the caption text of the clip; the frames and audio signal; the complexity of the sentence; the diversity of the clips. Roles labeled for Web Search: site identifier, page identifier, search relevancy, content ranking. Roles labeled for In-Video Search: video identifier, clip identifier, search relevancy, content ranking. * Images from Rakuten VIKI
  • 62. Job: make some data ready for consumption. Questions: How does the data come? What needs to be done for it to be ready? How will the data be consumed? Diagram components: data provider, FTP/API, raw data, trigger/monitor function, pre-processing function, database, data access function, data consumer.
  • 63. Job: let outsiders use a function. Questions: How frequently will the function be used? What data does the function need? Diagram components: web application, API endpoint, API cache, request queue, application logic, application cache, internal/external data.
  • 64. Rakuten TV video contents Other video contents Rakuten VIKI video contents Search function 3rd Party Platform Motion Dictionary * Images from Rakuten VIKI
  • 65. Japanese Dictionary Data dictionary function voice function 3rd party solution Korean Dictionary Data Chinese Dictionary Data 3rd party solution open source framework Interactive Subtitles (version 2) Interactive Subtitles (version 3) * Images from Rakuten VIKI tokenization function Korean Tokenization Data Chinese Tokenization Data Japanese Tokenization Data open source framework open source framework open source framework Korean Tokenization Data Chinese Tokenization Data In-house solution In-house solution
  • 66. Japanese Dictionary Data dictionary function voice function 3rd party solution Korean Dictionary Data Chinese Dictionary Data 3rd party solution open source framework Interactive Subtitles (version 2) Interactive Subtitles (version 3) * Images from Rakuten VIKI tokenization function Japanese Tokenization Data open source framework Global Tokenization Data In-house solution Global Dictionary Data In-house solution Korean Tokenization Data Chinese Tokenization Data In-house solution In-house solution
  • 67. Take Quiz function Vocab Quiz (version 1) * Images from Rakuten VIKI Chinese Quiz Data Korean Quiz Data
  • 68. Chinese Quiz Data Take Quiz function voice function Vocab Quiz (version 2) * Images from Rakuten VIKI Korean Quiz Data
  • 69. “Fast iteration in R&D won’t be possible if we had many things bundled or coupled.” -- Pang. Vocab Quiz: https://languagequiz.viki.com/ Learn Mode (PC/Mac only): https://www.viki.com/collections/316981l-learn-the-basics-chinese and https://www.viki.com/collections/316939l-learn-the-basics-korean Motion Dictionary: TBD