GOTO Berlin Conference 2013

A use case of online machine learning
using Jubatus
2013/10/18
NTT DATA Corporation System Platforms Sector
OSS Professional Services
Toru Shimogaki

Copyright © 2013 NTT DATA Corporation
Who is Toru Shimogaki
• “The Elephant Wizard”, team lead of “NTT DATA OSS Professional Services”
• Deep and wide experience deploying open source software technologies for enterprise customers

10+ years: contributor (e.g., pg_bulkload)

6+ years: leads the Japanese Hadoop community

Co-author, 2nd edition (in Japan)
About Jubatus (1/2)
• OSS machine learning platform developed by NTT Research Laboratories and Preferred Infrastructure, Inc.
• Motivation: take the right action at the right time, at the right place

(Source : Hadoop Summit 2013)

About Jubatus (2/2)
• Distributed processing framework and streaming machine learning libraries
  • Classification, regression, recommendation, graph mining, anomaly detection
• In particular, the Jubatus classifier has a small footprint and responds with very low latency, so it scales easily to many simultaneous requests.
(Source : Hadoop Summit 2013)

Background
• “SUUMO”: an online service for the real-estate business
  • Improve usability for smartphone access
  • Guide users who don’t know how to search for residences
  • Efficiently reach first-time customers and non-regular users who visit only briefly

“SUUMO” Real-Estate Service: Customer Reach and Brand Awareness
Compared to our competitors, “SUUMO” has the largest customer reach and has scored highest in brand awareness.

[Figure: two bar charts comparing SUUMO with real-estate competitors A, B, and C. Left: customer reach (unique users, millions). Right: brand awareness (%).]
• SUUMO unique users (UU): 9.9 million/month
• Aided SUUMO awareness: 77.2%
• Awareness of the character 【SUUMO】: 87.6%

2013 RECRUIT SUMAI CO., LTD All Rights Reserved

Web service for smartphones (beta version)
• Repeatedly present two candidate choices so that the system learns the user’s preferences

Existing search services simply list candidates with filtering.

• Present two typical and salient choices; the user simply chooses the one they prefer.
• Ask 10 times, with a countdown.
• Present the resulting recommendation, based on the acquired user preference.

Inside this SUUMO web service
• This SUUMO web service is implemented with a combination of several algorithms
  • Multidimensional Scaling (MDS)
  • Jubatus classifier (Passive Aggressive algorithm)
  • etc.

Building the search space with MDS
• Build a search space for each station using Multidimensional Scaling (MDS)
  • Daily batch processing using R

• Goal: achieve O(log n) search (like binary search)
  • Using MDS, convert multidimensional vectors into lower-dimensional ones while preserving the distance relations among houses
Rent
Walking time (minutes on foot)
Size
Deposit
Age of the building
...
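
The MDS step can be illustrated with a small sketch. The slides say only that MDS runs as a daily batch in R; the MDS variant, the preprocessing, and the exact attributes are not stated. The Python code below therefore assumes classical (metric) MDS on standardized attributes, and the function name `classical_mds` and the sample houses are hypothetical.

```python
import numpy as np

def classical_mds(features: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Embed the rows of `features` into `n_components` dimensions (classical MDS)."""
    # Standardize each attribute so that large-valued ones (rent) do not dominate.
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # Squared Euclidean distance between every pair of houses.
    diff = z[:, None, :] - z[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    # Double-center the squared distances to obtain a Gram matrix.
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # Keep the eigenvectors that belong to the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical houses: rent, minutes on foot, size (m^2), deposit, building age (years).
houses = np.array([
    [ 80000.0,  5.0, 25.0,  80000.0, 10.0],
    [ 85000.0,  7.0, 27.0,  85000.0, 12.0],
    [150000.0,  3.0, 60.0, 300000.0,  2.0],
    [ 60000.0, 15.0, 20.0,      0.0, 30.0],
])
coords = classical_mds(houses, n_components=2)
```

In this sample, the first two houses are similar in every attribute, so they stay close together in the two-dimensional space, while the third house lands far from both.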

Learning the user’s preferences using Jubatus
• Classify the user’s real-estate preferences using Jubatus
  • Goal: reflect user actions in the results in real time (which cannot be done with R)
  • Uses the Passive Aggressive algorithm
  • If the score of an area falls below a threshold, remove it from the search area
  • This classification runs with low latency and is easy to scale
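
The online update and the threshold rule can be sketched as follows. This is not the Jubatus API: Jubatus performs the learning on its own server, and the slides do not describe how areas are turned into feature vectors. The sketch assumes the basic binary Passive Aggressive update (no slack term) and assumes each area is represented by a point in the MDS space. `PassiveAggressive`, `prune`, the area names, and the threshold value are all hypothetical.

```python
import numpy as np

class PassiveAggressive:
    """Binary Passive Aggressive classifier (basic PA update, no slack term)."""

    def __init__(self, n_features: int) -> None:
        self.w = np.zeros(n_features)

    def score(self, x: np.ndarray) -> float:
        return float(self.w @ x)

    def update(self, x: np.ndarray, y: int) -> None:
        """Learn from one example; y is +1 (chosen) or -1 (rejected)."""
        loss = max(0.0, 1.0 - y * self.score(x))
        norm_sq = float(x @ x)
        if loss > 0.0 and norm_sq > 0.0:
            # Smallest step that makes this example satisfy the margin.
            self.w += (loss / norm_sq) * y * x

def prune(areas: dict, model: PassiveAggressive, threshold: float) -> dict:
    """Keep only the areas whose score is at or above the threshold."""
    return {name: x for name, x in areas.items() if model.score(x) >= threshold}

# Hypothetical areas, each described by a point in a 2-D MDS space.
areas = {
    "cheap_far": np.array([-1.0, 1.0]),
    "mid": np.array([0.0, 0.2]),
    "pricey_near": np.array([1.0, -1.0]),
}
model = PassiveAggressive(n_features=2)
# One click: the user preferred "pricey_near" over "cheap_far".
model.update(areas["pricey_near"], +1)
model.update(areas["cheap_far"], -1)
remaining = prune(areas, model, threshold=-0.5)
```

Each click costs one vector update, which is why this kind of learner can react within a single request instead of waiting for the next batch run.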

[Figure panels: initial state, 1 clicked, 2 clicked, 6 clicked, 10 clicked]

• Using this approach:
  • Keep outlying candidates that the MDS-space search would exclude but that may still interest the user because of unexpected attributes.
  • Keep sufficient diversity to present “salient” candidates: distant choices are needed to estimate the user’s preference without losing features.
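
One simple way to obtain two distant choices is to pick the pair of remaining candidates that are farthest apart in the embedded space. The slides do not say how the service actually selects the pair, so the sketch below is only an assumption; `most_distant_pair` and the sample candidates are hypothetical.

```python
from itertools import combinations

import numpy as np

def most_distant_pair(points: dict) -> tuple:
    """Return the names of the two candidates that are farthest apart."""
    return max(
        combinations(sorted(points), 2),
        key=lambda pair: float(np.linalg.norm(points[pair[0]] - points[pair[1]])),
    )

# Hypothetical candidates in a 2-D MDS space.
candidates = {
    "a": np.array([0.0, 0.0]),
    "b": np.array([0.1, 0.0]),
    "c": np.array([3.0, 4.0]),
}
pair = most_distant_pair(candidates)
```

This brute-force search compares every pair, so it is only suitable for small candidate sets; a production service would need something cheaper.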

Wrap up
• Introduced a use case of online machine learning using Jubatus
  • A beta service for smartphones has now been released on SUUMO, one of the largest residential real-estate services in Japan.
• Today I explained
  • Building a Multidimensional Scaling search space
  • Using the Jubatus classifier to learn users’ preferences adaptively

• Future work
  • Learn from the activity of similar users, using logs
  • Semantic analysis of sentences entered by the client

NTT DATA Corporation System Platforms Sector
OSS Professional Services
URL:
http://oss.nttdata.co.jp/hadoop/
mail: hadoop@kits.nttdata.co.jp
Copyright © 2013 NTT DATA Corporation

Weitere ähnliche Inhalte

Andere mochten auch

小町のレス数が予測できるか試してみた
小町のレス数が予測できるか試してみた小町のレス数が予測できるか試してみた
小町のレス数が予測できるか試してみたJubatusOfficial
 
jubarecommenderの紹介
jubarecommenderの紹介jubarecommenderの紹介
jubarecommenderの紹介JubatusOfficial
 
新聞から今年の漢字を予測する
新聞から今年の漢字を予測する新聞から今年の漢字を予測する
新聞から今年の漢字を予測するJubatusOfficial
 
単語コレクター(文章自動校正器)
単語コレクター(文章自動校正器)単語コレクター(文章自動校正器)
単語コレクター(文章自動校正器)JubatusOfficial
 
かまってちゃん小町
かまってちゃん小町かまってちゃん小町
かまってちゃん小町JubatusOfficial
 
Jubatus 新機能ハイライト
Jubatus 新機能ハイライトJubatus 新機能ハイライト
Jubatus 新機能ハイライトJubatusOfficial
 
コンテンツマーケティングでレコメンドエンジンが必要になる背景とその活用
コンテンツマーケティングでレコメンドエンジンが必要になる背景とその活用コンテンツマーケティングでレコメンドエンジンが必要になる背景とその活用
コンテンツマーケティングでレコメンドエンジンが必要になる背景とその活用JubatusOfficial
 
データ圧縮アルゴリズムを用いたマルウェア感染通信ログの判定
データ圧縮アルゴリズムを用いたマルウェア感染通信ログの判定データ圧縮アルゴリズムを用いたマルウェア感染通信ログの判定
データ圧縮アルゴリズムを用いたマルウェア感染通信ログの判定JubatusOfficial
 
まだCPUで消耗してるの?Jubatusによる近傍探索のGPUを利用した高速化
まだCPUで消耗してるの?Jubatusによる近傍探索のGPUを利用した高速化まだCPUで消耗してるの?Jubatusによる近傍探索のGPUを利用した高速化
まだCPUで消耗してるの?Jubatusによる近傍探索のGPUを利用した高速化JubatusOfficial
 
Jubakit の紹介
Jubakit の紹介Jubakit の紹介
Jubakit の紹介kmaehashi
 
発言小町からのプロファイリング
発言小町からのプロファイリング発言小町からのプロファイリング
発言小町からのプロファイリングJubatusOfficial
 
地域の魅力を伝えるツアーガイドAI
地域の魅力を伝えるツアーガイドAI地域の魅力を伝えるツアーガイドAI
地域の魅力を伝えるツアーガイドAIJubatusOfficial
 
機械学習チュートリアル@Jubatus Casual Talks
機械学習チュートリアル@Jubatus Casual Talks機械学習チュートリアル@Jubatus Casual Talks
機械学習チュートリアル@Jubatus Casual TalksYuya Unno
 

Andere mochten auch (20)

小町のレス数が予測できるか試してみた
小町のレス数が予測できるか試してみた小町のレス数が予測できるか試してみた
小町のレス数が予測できるか試してみた
 
jubarecommenderの紹介
jubarecommenderの紹介jubarecommenderの紹介
jubarecommenderの紹介
 
新聞から今年の漢字を予測する
新聞から今年の漢字を予測する新聞から今年の漢字を予測する
新聞から今年の漢字を予測する
 
単語コレクター(文章自動校正器)
単語コレクター(文章自動校正器)単語コレクター(文章自動校正器)
単語コレクター(文章自動校正器)
 
jubabanditの紹介
jubabanditの紹介jubabanditの紹介
jubabanditの紹介
 
かまってちゃん小町
かまってちゃん小町かまってちゃん小町
かまってちゃん小町
 
JubaQLご紹介
JubaQLご紹介JubaQLご紹介
JubaQLご紹介
 
Jubatus 新機能ハイライト
Jubatus 新機能ハイライトJubatus 新機能ハイライト
Jubatus 新機能ハイライト
 
コンテンツマーケティングでレコメンドエンジンが必要になる背景とその活用
コンテンツマーケティングでレコメンドエンジンが必要になる背景とその活用コンテンツマーケティングでレコメンドエンジンが必要になる背景とその活用
コンテンツマーケティングでレコメンドエンジンが必要になる背景とその活用
 
Jubaanomalyについて
JubaanomalyについてJubaanomalyについて
Jubaanomalyについて
 
データ圧縮アルゴリズムを用いたマルウェア感染通信ログの判定
データ圧縮アルゴリズムを用いたマルウェア感染通信ログの判定データ圧縮アルゴリズムを用いたマルウェア感染通信ログの判定
データ圧縮アルゴリズムを用いたマルウェア感染通信ログの判定
 
まだCPUで消耗してるの?Jubatusによる近傍探索のGPUを利用した高速化
まだCPUで消耗してるの?Jubatusによる近傍探索のGPUを利用した高速化まだCPUで消耗してるの?Jubatusによる近傍探索のGPUを利用した高速化
まだCPUで消耗してるの?Jubatusによる近傍探索のGPUを利用した高速化
 
Jubakit の紹介
Jubakit の紹介Jubakit の紹介
Jubakit の紹介
 
発言小町からのプロファイリング
発言小町からのプロファイリング発言小町からのプロファイリング
発言小町からのプロファイリング
 
銀座のママ
銀座のママ銀座のママ
銀座のママ
 
小町の溜息
小町の溜息小町の溜息
小町の溜息
 
JUBARHYME
JUBARHYMEJUBARHYME
JUBARHYME
 
Jubatus 1.0 の紹介
Jubatus 1.0 の紹介Jubatus 1.0 の紹介
Jubatus 1.0 の紹介
 
地域の魅力を伝えるツアーガイドAI
地域の魅力を伝えるツアーガイドAI地域の魅力を伝えるツアーガイドAI
地域の魅力を伝えるツアーガイドAI
 
機械学習チュートリアル@Jubatus Casual Talks
機械学習チュートリアル@Jubatus Casual Talks機械学習チュートリアル@Jubatus Casual Talks
機械学習チュートリアル@Jubatus Casual Talks
 

Ähnlich wie A use case of online machine learning using Jubatus

Guidelines for Android application design.pptx
Guidelines for Android application design.pptxGuidelines for Android application design.pptx
Guidelines for Android application design.pptxdebasish duarah
 
Why an innovative mobile strategy needs a robust API
Why an innovative mobile strategy needs a robust APIWhy an innovative mobile strategy needs a robust API
Why an innovative mobile strategy needs a robust APIManmohan Gupta
 
Why an Innovative Mobile Strategy Requires a Robust API
Why an Innovative Mobile Strategy Requires a Robust API Why an Innovative Mobile Strategy Requires a Robust API
Why an Innovative Mobile Strategy Requires a Robust API Software AG
 
IRJET- Location based Voice Reminder
IRJET-  	  Location based Voice ReminderIRJET-  	  Location based Voice Reminder
IRJET- Location based Voice ReminderIRJET Journal
 
IRJET- Chore Pay – Magnified Task Colligation
IRJET-  	  Chore Pay – Magnified Task ColligationIRJET-  	  Chore Pay – Magnified Task Colligation
IRJET- Chore Pay – Magnified Task ColligationIRJET Journal
 
IRJET- Voice Controlled Personal Assistant Bot with Smart Storage
IRJET- Voice Controlled Personal Assistant Bot with Smart StorageIRJET- Voice Controlled Personal Assistant Bot with Smart Storage
IRJET- Voice Controlled Personal Assistant Bot with Smart StorageIRJET Journal
 
How Changing Mobile Technology Is Changing The Way We Do Business
How Changing Mobile Technology Is Changing The Way We Do Business How Changing Mobile Technology Is Changing The Way We Do Business
How Changing Mobile Technology Is Changing The Way We Do Business Osaka University
 
Azure WP7 fire starter
Azure WP7 fire starterAzure WP7 fire starter
Azure WP7 fire starterSam Basu
 
Fujitsu IT Future 2013 : Touching the Cloud par Joseph Reger CTO Fujitsu Tech...
Fujitsu IT Future 2013 : Touching the Cloud par Joseph Reger CTO Fujitsu Tech...Fujitsu IT Future 2013 : Touching the Cloud par Joseph Reger CTO Fujitsu Tech...
Fujitsu IT Future 2013 : Touching the Cloud par Joseph Reger CTO Fujitsu Tech...Fujitsu France
 
Running Head HUMAN-COMPUTER INTERFACE .docx
Running Head HUMAN-COMPUTER INTERFACE                            .docxRunning Head HUMAN-COMPUTER INTERFACE                            .docx
Running Head HUMAN-COMPUTER INTERFACE .docxwlynn1
 
Running Head HUMAN-COMPUTER INTERFACE .docx
Running Head HUMAN-COMPUTER INTERFACE                            .docxRunning Head HUMAN-COMPUTER INTERFACE                            .docx
Running Head HUMAN-COMPUTER INTERFACE .docxjeanettehully
 
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docxjesusamckone
 
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docxaulasnilda
 
SAP Screen Personas and SAP Fiori session from TechEd 2013
SAP Screen Personas and SAP Fiori session from TechEd 2013SAP Screen Personas and SAP Fiori session from TechEd 2013
SAP Screen Personas and SAP Fiori session from TechEd 2013Peter Spielvogel
 
1Running Head HUMAN-COMPUTER INTERFACE .docx
1Running Head HUMAN-COMPUTER INTERFACE                     .docx1Running Head HUMAN-COMPUTER INTERFACE                     .docx
1Running Head HUMAN-COMPUTER INTERFACE .docxherminaprocter
 
Role of Operators in the Mobile App Delivery Ecosystem
Role of Operators in the Mobile App Delivery EcosystemRole of Operators in the Mobile App Delivery Ecosystem
Role of Operators in the Mobile App Delivery EcosystemRelayware
 
Human Computer Interaction .docx
Human Computer  Interaction .docxHuman Computer  Interaction .docx
Human Computer Interaction .docxsaeed afridi
 
Mobile UX breakfast briefing - Dubai september 2013
Mobile UX breakfast briefing - Dubai september 2013Mobile UX breakfast briefing - Dubai september 2013
Mobile UX breakfast briefing - Dubai september 2013User Vision
 
Survey, comparison & evaluation of cross platform mobile application developm...
Survey, comparison & evaluation of cross platform mobile application developm...Survey, comparison & evaluation of cross platform mobile application developm...
Survey, comparison & evaluation of cross platform mobile application developm...Soumya Kanti Datta
 

Ähnlich wie A use case of online machine learning using Jubatus (20)

Guidelines for Android application design.pptx
Guidelines for Android application design.pptxGuidelines for Android application design.pptx
Guidelines for Android application design.pptx
 
Why an innovative mobile strategy needs a robust API
Why an innovative mobile strategy needs a robust APIWhy an innovative mobile strategy needs a robust API
Why an innovative mobile strategy needs a robust API
 
Why an Innovative Mobile Strategy Requires a Robust API
Why an Innovative Mobile Strategy Requires a Robust API Why an Innovative Mobile Strategy Requires a Robust API
Why an Innovative Mobile Strategy Requires a Robust API
 
Rajput Bandhu
Rajput BandhuRajput Bandhu
Rajput Bandhu
 
IRJET- Location based Voice Reminder
IRJET-  	  Location based Voice ReminderIRJET-  	  Location based Voice Reminder
IRJET- Location based Voice Reminder
 
IRJET- Chore Pay – Magnified Task Colligation
IRJET-  	  Chore Pay – Magnified Task ColligationIRJET-  	  Chore Pay – Magnified Task Colligation
IRJET- Chore Pay – Magnified Task Colligation
 
IRJET- Voice Controlled Personal Assistant Bot with Smart Storage
IRJET- Voice Controlled Personal Assistant Bot with Smart StorageIRJET- Voice Controlled Personal Assistant Bot with Smart Storage
IRJET- Voice Controlled Personal Assistant Bot with Smart Storage
 
How Changing Mobile Technology Is Changing The Way We Do Business
How Changing Mobile Technology Is Changing The Way We Do Business How Changing Mobile Technology Is Changing The Way We Do Business
How Changing Mobile Technology Is Changing The Way We Do Business
 
Azure WP7 fire starter
Azure WP7 fire starterAzure WP7 fire starter
Azure WP7 fire starter
 
Fujitsu IT Future 2013 : Touching the Cloud par Joseph Reger CTO Fujitsu Tech...
Fujitsu IT Future 2013 : Touching the Cloud par Joseph Reger CTO Fujitsu Tech...Fujitsu IT Future 2013 : Touching the Cloud par Joseph Reger CTO Fujitsu Tech...
Fujitsu IT Future 2013 : Touching the Cloud par Joseph Reger CTO Fujitsu Tech...
 
Running Head HUMAN-COMPUTER INTERFACE .docx
Running Head HUMAN-COMPUTER INTERFACE                            .docxRunning Head HUMAN-COMPUTER INTERFACE                            .docx
Running Head HUMAN-COMPUTER INTERFACE .docx
 
Running Head HUMAN-COMPUTER INTERFACE .docx
Running Head HUMAN-COMPUTER INTERFACE                            .docxRunning Head HUMAN-COMPUTER INTERFACE                            .docx
Running Head HUMAN-COMPUTER INTERFACE .docx
 
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
 
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
13Running Head HUMAN-COMPUTER INTERFACEHuman-.docx
 
SAP Screen Personas and SAP Fiori session from TechEd 2013
SAP Screen Personas and SAP Fiori session from TechEd 2013SAP Screen Personas and SAP Fiori session from TechEd 2013
SAP Screen Personas and SAP Fiori session from TechEd 2013
 
1Running Head HUMAN-COMPUTER INTERFACE .docx
1Running Head HUMAN-COMPUTER INTERFACE                     .docx1Running Head HUMAN-COMPUTER INTERFACE                     .docx
1Running Head HUMAN-COMPUTER INTERFACE .docx
 
Role of Operators in the Mobile App Delivery Ecosystem
Role of Operators in the Mobile App Delivery EcosystemRole of Operators in the Mobile App Delivery Ecosystem
Role of Operators in the Mobile App Delivery Ecosystem
 
Human Computer Interaction .docx
Human Computer  Interaction .docxHuman Computer  Interaction .docx
Human Computer Interaction .docx
 
Mobile UX breakfast briefing - Dubai september 2013
Mobile UX breakfast briefing - Dubai september 2013Mobile UX breakfast briefing - Dubai september 2013
Mobile UX breakfast briefing - Dubai september 2013
 
Survey, comparison & evaluation of cross platform mobile application developm...
Survey, comparison & evaluation of cross platform mobile application developm...Survey, comparison & evaluation of cross platform mobile application developm...
Survey, comparison & evaluation of cross platform mobile application developm...
 

Mehr von NTT DATA OSS Professional Services

Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力NTT DATA OSS Professional Services
 
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~NTT DATA OSS Professional Services
 
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントPostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントNTT DATA OSS Professional Services
 
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~NTT DATA OSS Professional Services
 
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~NTT DATA OSS Professional Services
 
商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのことNTT DATA OSS Professional Services
 

Mehr von NTT DATA OSS Professional Services (20)

Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力Global Top 5 を目指す NTT DATA の確かで意外な技術力
Global Top 5 を目指す NTT DATA の確かで意外な技術力
 
Spark SQL - The internal -
Spark SQL - The internal -Spark SQL - The internal -
Spark SQL - The internal -
 
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
Apache Kafkaって本当に大丈夫?~故障検証のオーバービューと興味深い挙動の紹介~
 
Hadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返りHadoopエコシステムのデータストア振り返り
Hadoopエコシステムのデータストア振り返り
 
HDFS Router-based federation
HDFS Router-based federationHDFS Router-based federation
HDFS Router-based federation
 
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイントPostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
PostgreSQL10を導入!大規模データ分析事例からみるDWHとしてのPostgreSQL活用のポイント
 
Apache Hadoopの新機能Ozoneの現状
Apache Hadoopの新機能Ozoneの現状Apache Hadoopの新機能Ozoneの現状
Apache Hadoopの新機能Ozoneの現状
 
Distributed data stores in Hadoop ecosystem
Distributed data stores in Hadoop ecosystemDistributed data stores in Hadoop ecosystem
Distributed data stores in Hadoop ecosystem
 
Structured Streaming - The Internal -
Structured Streaming - The Internal -Structured Streaming - The Internal -
Structured Streaming - The Internal -
 
Apache Hadoopの未来 3系になって何が変わるのか?
Apache Hadoopの未来 3系になって何が変わるのか?Apache Hadoopの未来 3系になって何が変わるのか?
Apache Hadoopの未来 3系になって何が変わるのか?
 
Apache Hadoop and YARN, current development status
Apache Hadoop and YARN, current development statusApache Hadoop and YARN, current development status
Apache Hadoop and YARN, current development status
 
HDFS basics from API perspective
HDFS basics from API perspectiveHDFS basics from API perspective
HDFS basics from API perspective
 
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
SIerとオープンソースの美味しい関係 ~コミュニティの力を活かして世界を目指そう~
 
20170303 java9 hadoop
20170303 java9 hadoop20170303 java9 hadoop
20170303 java9 hadoop
 
ブロックチェーンの仕組みと動向(入門編)
ブロックチェーンの仕組みと動向(入門編)ブロックチェーンの仕組みと動向(入門編)
ブロックチェーンの仕組みと動向(入門編)
 
Application of postgre sql to large social infrastructure jp
Application of postgre sql to large social infrastructure jpApplication of postgre sql to large social infrastructure jp
Application of postgre sql to large social infrastructure jp
 
Application of postgre sql to large social infrastructure
Application of postgre sql to large social infrastructureApplication of postgre sql to large social infrastructure
Application of postgre sql to large social infrastructure
 
Apache Hadoop 2.8.0 の新機能 (抜粋)
Apache Hadoop 2.8.0 の新機能 (抜粋)Apache Hadoop 2.8.0 の新機能 (抜粋)
Apache Hadoop 2.8.0 の新機能 (抜粋)
 
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~データ活用をもっともっと円滑に!~データ処理・分析基盤編を少しだけ~
データ活用をもっともっと円滑に! ~データ処理・分析基盤編を少しだけ~
 
商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと商用ミドルウェアのPuppet化で気を付けたい5つのこと
商用ミドルウェアのPuppet化で気を付けたい5つのこと
 

Kürzlich hochgeladen

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 

Kürzlich hochgeladen (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 

A use case of online machine learning using Jubatus

  • 1. GOTO Berlin Conference 2013 A use case of online machine learning using Jubatus 2013/10/18 NTT DATA Corporation System Platforms Sector OSS Professional Services Toru Shimogaki Copyright © 2013 NTT DATA Corporation
  • 2. Who is Toru Shimogaki  The Elephant Wizard as a team lead of “NTT DATA OSS Professional Services”  Deep and wide experience deploying Open Source Software technologies for enterprise customers 10+ years Contributor Ex. pg_bulkload Copyright © 2013 NTT DATA Corporation 6+ years Leads Japanese Hadoop Community Co-Author 2nd Edition in Japan 2
  • 3. About Jubatus (1/2): OSS machine learning platform developed by NTT Research Laboratories and Preferred Infrastructure, Inc. Motivation: take the right action at the right time, at the right place. (Source: Hadoop Summit 2013)
  • 4. About Jubatus (2/2): Distributed processing framework and streaming machine learning libraries: classification, regression, recommendation, graph mining, anomaly detection. In particular, the Jubatus classifier has a small footprint and responds with very low latency, so it is easy to scale for multiple simultaneous requests. (Source: Hadoop Summit 2013)
  • 5. Background: “SUUMO” is an online service for the real-estate business. Improve usability for smartphone access; guide users who don’t know how to search for residences; efficiently reach first-time customers and irregular, short-session users.
  • 6. “SUUMO” real-estate service, customer reach and brand awareness: Compared to our competitors, “SUUMO” has the largest customer reach and has scored highest in brand awareness. [Charts: customer reach (unique users, millions) and brand awareness (%) for SUUMO and competitors A, B, C.] SUUMO unique users: 9.9 million/month; aided SUUMO awareness: 77.2%; awareness of the SUUMO character: 87.6%. (2013 RECRUIT SUMAI CO., LTD All Rights Reserved)
  • 7. Background: “SUUMO” is an online service for the real-estate business. Improve usability for smartphone access; guide users who don’t know how to search for residences; efficiently reach first-time customers and irregular, short-session users.
  • 8. Web service for smartphones (beta version): The system repeatedly presents two candidate choices and learns the user’s preference from the answers. Existing search services simply list candidates with filtering. Here, the service presents two typical and salient choices, and the user simply picks the preferable one. This is asked 10 times with a countdown. The service then presents recommendations based on the acquired user preference.
  • 9. Inside this SUUMO web service: The service is implemented with a combination of algorithms: Multidimensional Scaling (MDS), the Jubatus classifier (Passive Aggressive algorithm), etc.
  • 10. Building the search space by MDS: A search space is built for each station by Multidimensional Scaling (MDS), as daily batch processing using R. Goal: achieve O(log n) search (like binary search). MDS converts multidimensional vectors into lower-dimensional ones while keeping the distance relations among houses. Attributes: rent, minutes on foot, size, deposit, age of the building, ...
  • 11. Learning the user’s preference using Jubatus: The user’s preference for real estate is classified using Jubatus. Goal: reflect user actions in the results in real time (not possible with R). The Passive Aggressive algorithm is used. If the score of an area falls below a threshold, the area is removed from the search area. This classification runs with low latency and is easy to scale. [Figure panels: initial state, then after 1, 2, 6 and 10 clicks.] With this approach: (a) outlying candidates are kept, i.e. ones that an MDS-space search would exclude but that may still interest the user because of unexpected attributes; (b) sufficient diversity is kept in order to present “salient” candidates. Distant choices are necessary for estimating the user’s preference without losing features.
  • 12. Wrap-up: Introduced a use case of online machine learning using Jubatus. A beta service for smartphones is now released on SUUMO, one of the largest residence services in Japan. Today I explained: building a Multidimensional Scaling search space; using the Jubatus classifier to understand users’ preferences adaptively. Future work: learning from similar users’ activity using logs; semantic analysis of sentences entered by the client.
  • 13. NTT DATA Corporation, System Platforms Sector, OSS Professional Services. URL: http://oss.nttdata.co.jp/hadoop/ mail: hadoop@kits.nttdata.co.jp
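
Slide 10 describes reducing each house's attribute vector to a lower-dimensional space with MDS so that the search can behave like a binary search. The slides do not show how this was done (the talk only says it was a daily batch in R), so the following is a minimal Python sketch under assumptions: it uses classical (Torgerson) MDS reduced to a single dimension, finds the leading eigenvector by power iteration, and then looks up the nearest house with `bisect`. The function names, the feature values, and the choice of one dimension are all hypothetical.

```python
import bisect
import math
from typing import List


def classical_mds_1d(points: List[List[float]], iters: int = 200) -> List[float]:
    """Embed points on a line so that pairwise distances are kept as well as
    one dimension allows (classical MDS, leading eigenvector only)."""
    n = len(points)
    # Squared Euclidean distances between every pair of points.
    d2 = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    row_mean = [sum(r) / n for r in d2]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the leading eigenvector of B.
    # Assumption: the start vector is not orthogonal to that eigenvector.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # all points coincide
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * x for x in v]


def nearest(sorted_coords: List[float], target: float) -> int:
    """Index of the coordinate closest to target, found in O(log n).
    Assumes sorted_coords is non-empty and sorted ascending."""
    i = bisect.bisect_left(sorted_coords, target)
    candidates = [k for k in (i - 1, i) if 0 <= k < len(sorted_coords)]
    return min(candidates, key=lambda k: abs(sorted_coords[k] - target))


# Hypothetical houses: [rent, minutes on foot, size], already scaled to 0..1.
houses = [[0.2, 0.1, 0.3], [0.8, 0.2, 0.9], [0.5, 0.9, 0.4], [0.3, 0.3, 0.2]]
coords = classical_mds_1d(houses)
order = sorted(range(len(houses)), key=lambda i: coords[i])
sorted_coords = [coords[i] for i in order]
print(order[nearest(sorted_coords, coords[0])])  # house closest to house 0's position
```

The sign of the embedding is arbitrary, and attributes on different scales would need normalising before distances are computed. A production system would more likely keep two or more dimensions and use a library eigen-solver.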
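
Slide 11 describes updating a Passive Aggressive classifier after each user choice and dropping areas whose score falls below a threshold. As an illustration only, here is a minimal sketch of the textbook binary Passive Aggressive update (the PA-I variant) in plain Python. It does not use the Jubatus client and is not the service's implementation. The feature layout, the labelling of the chosen candidate as +1 and the rejected one as -1, the `prune` helper, and the threshold value are assumptions made for the example.

```python
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


class PassiveAggressive:
    """Textbook binary Passive Aggressive learner (PA-I variant)."""

    def __init__(self, dim: int, c: float = 1.0) -> None:
        self.w: Vector = [0.0] * dim
        self.c = c  # upper bound on the step size

    def score(self, x: Vector) -> float:
        return dot(self.w, x)

    def update(self, x: Vector, y: int) -> None:
        """Learn from one example; y is +1 (preferred) or -1 (rejected)."""
        loss = max(0.0, 1.0 - y * self.score(x))  # hinge loss
        norm_sq = dot(x, x)
        if loss == 0.0 or norm_sq == 0.0:
            return  # passive: the margin is already satisfied
        tau = min(self.c, loss / norm_sq)  # aggressive: smallest correction, capped
        self.w = [wi + tau * y * xi for wi, xi in zip(self.w, x)]


def prune(model: PassiveAggressive, areas: Dict[str, Vector],
          threshold: float) -> Dict[str, Vector]:
    """Keep only the areas whose score is at or above the threshold."""
    return {name: x for name, x in areas.items() if model.score(x) >= threshold}


# Hypothetical one-hot area features: [low rent, near station, large size].
areas = {
    "low_rent": [1.0, 0.0, 0.0],
    "near_station": [0.0, 1.0, 0.0],
    "large_size": [0.0, 0.0, 1.0],
}
model = PassiveAggressive(dim=3)

# One question: the user picks "low_rent" over "near_station".
model.update(areas["low_rent"], +1)
model.update(areas["near_station"], -1)

kept = prune(model, areas, threshold=-0.5)
print(sorted(kept))  # ['large_size', 'low_rent']
```

Because each update touches only the weight vector, the cost per click is linear in the number of features, which is consistent with the low-latency requirement on the slide. An area the user has said nothing about keeps a neutral score and stays in the search area.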