Lessons learnt at building
recommendation services
at industry scale
Domonkos Tikk
Gravity R&D
Industry keynote @ ECIR 2016
@domonkostikk
Credits to colleagues
3/24/2016
Bottyán Németh
Product Owner and co-founder
István Pilászy
Head of Development and co-founder
Balázs Hidasi
Head of Data Mining & Research
Gábor Vincze
Head of Global Service
György Dózsa
Head of Web Integrations
and many others…
IR ⊃ Recsys
Information Retrieval without query
IR ?
Who we are and what we do
Gravity R&D is a recommender system vendor
We have provided recommendation as a service to customers all around the globe since 2009
The journey Gravity made from 2009-2016
How do we imagine growth?
How does it actually happen?
The impact of Netflix Prize
Short summary of Netflix Prize
• 2006–2009
• Predict movie ratings (explicit feedback)
• Content based filtering (CBF) did not work
• Classical CF methods (item-kNN, user-kNN) did not
work
• Matrix factorization was extremely effective
• We were fully in love with matrix factorization
Schematic of matrix factorization
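• Model
▪ How we approximate user preferences
▪ $\hat{r}_{u,i} = p_u^T q_i$
• Objective function (error function)
▪ What do we want to minimize or optimize?
▪ E.g. optimize for RMSE with regularization: $L = \sum_{(u,i) \in Train} (\hat{r}_{u,i} - r_{u,i})^2 + \lambda_U \sum_{u=1}^{S_U} \|p_u\|^2 + \lambda_I \sum_{i=1}^{S_I} \|q_i\|^2$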
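A minimal sketch of the model and objective above, assuming plain stochastic gradient descent as the learner (the slides do not say which optimizer was used). The function names, hyperparameters and toy ratings are illustrative assumptions, not Gravity's implementation.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=200, seed=0):
    """Fit user factors P and item factors Q by SGD on the regularized squared error."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # r_ui minus the prediction p_u^T q_i
            p_old = P[u].copy()            # keep the old p_u for the q_i update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * p_old - reg * Q[i])
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error of the predictions on the given (user, item, rating) triples."""
    return float(np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings])))

# Hypothetical toy data: (user index, item index, rating on a 1-5 scale).
toy_ratings = [(0, 0, 1.0), (0, 2, 4.0), (0, 4, 3.0), (1, 1, 4.0),
               (2, 0, 4.0), (2, 3, 4.0), (3, 2, 4.0), (3, 4, 2.0)]
P, Q = train_mf(toy_ratings, n_users=4, n_items=5)
print("train RMSE:", rmse(toy_ratings, P, Q))
```

Prediction for any (user, item) pair is then `P[u] @ Q[i]`, including pairs that were never rated.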
[Figure: schematic of learning: the rating matrix R (S_U × S_I) is approximated by the product of the feature matrices P (S_U × K) and Q (K × S_I), shown with example rating and feature values.]
Make investors interested
• Reference
• Team
• Technology
• Business model
Netflix Prize demo / 1
• In 2009 we created a public demo, mainly for investors
• Users can rate movies and get recommendations
• What do you expect from a demo?
▪ Be relevant even after 1 rating
▪ Users will provide their favorite movies first
▪ Be relevant after 2 ratings: both movies should affect the results
Netflix Prize demo / 2
• Using a good MF model with K=200 factors and biases
• Use linear regression to compute the user feature vector
• Recs after rating a romantic movie (Notting Hill, 1999):
OK Score Title
 4.6916 The_Shawshank_Redemption/1994
 4.6858 House,_M.D.:_Season_1/2004
 4.6825 Lost:_Season_1/2004
 4.5903 Anne_of_Green_Gables:_The_Sequel/1987
 4.5497 Lord_of_the_Rings:_The_Return_of_the_King/2003
Netflix Prize demo / 3
• Idea: turn off item bias during recommendation.
• Results are fully relevant
• Even with 10 factors, it is very good
OK Score Title
 4.3323 Love_Actually/2003
 4.3015 Runaway_Bride/1999
 4.2811 My_Best_Friend's_Wedding/1997
 4.2790 You've_Got_Mail/1998
 4.1564 About_a_Boy/2002
Netflix Prize demo / 4
• Now give a 5-star rating to Saving Private Ryan (1998)
• Almost no change in the list
OK Score Title
 4.5911 You've_Got_Mail/1998
 4.5085 Love_Actually/2003
 4.3944 Sleepless_in_Seattle/1993
 4.3625 Runaway_Bride/1999
 4.3274 My_Best_Friend's_Wedding/1997
Netflix Prize demo / 5
• Idea: set item biases to zero before computing user feature vector
• 5th rec is romantic + war
• Conclusion: MF is good, but rating and ranking are very different
OK Score Title
 4.5094 You've_Got_Mail/1998
 4.3445 Black_Hawk_Down/2001
 4.3298 Sleepless_in_Seattle/1993
 4.3114 Love_Actually/2003
! 4.2805 Apollo_13/1995
The rough start
The business model question
Business model: Trabant vs. Rolls Royce
Trabant
• Cheap for client
• Simple functionality
• Low performance
• No customization
• Limited warranty
• Works if sold in large quantities
Rolls Royce
• Expensive for client
• Complex functionality
• High performance
• Full customization
• Full warranty (SLA)
• Few sales can bring enough return
Our decision in 2009 was: Rolls Royce
• Expensive for client
• Complex functionality
• High performance
• Full customization
• Full warranty (SLA)
• Few sales can bring enough return
# of requests
Vatera.hu, the largest online marketplace in Hungary: served by one “server”
Alexa TOP100 video chat website (~40M recommendation requests / day):
▪ Served by 5 application servers and 1 DB
▪ Too many events to store in MySQL → using Cassandra (v0.6)
▪ Training time for IALS too long → speedup by IALS1
▪ Max. 5 sec latency in “product” availability
Using new/beta technologies
Cassandra (v0.6)
Nginx (v0.5) (22% of top 1M sites)
Kafka (v0.8)
MySQL auto. failover
Reaching the limits
Even if a technology is widely used, once you reach its limits, optimization is very costly and time consuming.
Java GC – service collapsed because of increased minor GC times due to a JVM bug (26th of January 2013)
Maintaining MySQL with lots of data (optimize table, slave replication lag, faster storage device)
Complexity increases
There is always a business request or an algorithmic development that requires more resources.
Optimizations
# of items
How to store item models / metadata in memory to serve requests fast?
VS.
Auto-increment IDs for the items?
$2^{31}$ (~2 billion) is not enough
Preconceptions
More data yields better results
CTR is the right proxy: quick decision on A/B tests
Daily retrain is enough
Training frequency
CTR decreased in the morning
Tasks are different in real-world applications
Industry vs. academia
• In academic papers
▪ 50% explicit feedback
▪ 50% implicit feedback
o 49.9% personal
o 0.1% item2item
• At gravityrd.com:
▪ 1% explicit feedback
▪ 99% implicit feedback
o 15% personal
o 84% item2item
• Sites where rating is crucial tend to create their own rec engine
• Even if there are explicit ratings, there is more implicit feedback
Implicit vs. explicit ratings
• Standard SGD-based learning does not work (complexity issues)
• Implicit ALS
• Approximate versions of IALS
▪ with coordinate descent*
▪ with conjugate gradient**
* I. Pilászy, D. Zibriczky, D. Tikk: Fast ALS-based matrix factorization for explicit and implicit feedback datasets, RecSys 2010.
** G. Takács, I. Pilászy, D. Tikk: Applications of the conjugate gradient method for implicit feedback collaborative filtering, RecSys 2011.
What is the problem with the explicit objective function?
• $L = \sum_{(u,i) \in T} (\hat{r}_{u,i} - r_{u,i})^2 + \lambda_U \sum_{u=1}^{S_U} \|P_u\|^2 + \lambda_I \sum_{i=1}^{S_I} \|Q_i\|^2$
• The matrix to be factorized contains 0s and 1s
▪ If we consider only the positive events (1s)
o Predicting 1s everywhere minimizes $L$ trivially
o Some minor differences may occur due to regularization
• Modified objective function (including zeros)
▪ $L = \sum_{u=1,i=1}^{S_U,S_I} (\hat{r}_{u,i} - r_{u,i})^2 + \lambda_U \sum_{u=1}^{S_U} \|P_u\|^2 + \lambda_I \sum_{i=1}^{S_I} \|Q_i\|^2$
▪ Number of terms increased
▪ #zeros ≫ #ones
o An all-zero prediction gives a pretty good $L$
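The two degenerate solutions can be checked on a tiny example. This sketch ignores the regularization terms and uses a hypothetical 4 × 5 event matrix with 3 positive events; the helper name is an assumption.

```python
import numpy as np

def squared_error(R, R_hat, mask=None):
    """Sum of squared errors, optionally restricted to the cells where mask is True."""
    diff = (R_hat - R) ** 2
    return float(diff[mask].sum()) if mask is not None else float(diff.sum())

# Hypothetical toy event matrix: 4 users x 5 items, 3 positive events.
R = np.zeros((4, 5))
R[0, 1] = R[2, 3] = R[3, 0] = 1.0
positives = R == 1.0

all_ones = np.ones_like(R)
all_zeros = np.zeros_like(R)

# Only positive events in the sum: predicting 1 everywhere is a perfect fit.
loss_pos_only_all_ones = squared_error(R, all_ones, mask=positives)
# Every cell in the sum: predicting 0 everywhere is wrong only on the 3 positives,
# while predicting 1 everywhere is wrong on the 17 zeros.
loss_full_all_zeros = squared_error(R, all_zeros)
loss_full_all_ones = squared_error(R, all_ones)
print(loss_pos_only_all_ones, loss_full_all_zeros, loss_full_all_ones)
```

Neither solution says anything about which unseen items a user would prefer, which is why the unweighted objective is not useful for implicit data.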
Why “explicit” optimization suffers
• Complexity of the best explicit method
▪ $O(|T| \cdot K)$
▪ Linear in the number of observed ratings
• Implicit feedback
▪ One should consider negative implicit feedback (“missing rating”)
▪ There is no real missing rating in the matrix
o An element is either 0 or 1, no empty cells
▪ Complexity: $O(S_U S_I K)$
▪ Sparse data (< 1%, in general)
▪ $S_U S_I \gg |T|$
iALS – objective function
• $L = \sum_{u=1,i=1}^{S_U,S_I} w_{u,i} (\hat{r}_{u,i} - r_{u,i})^2 + \lambda_U \sum_{u=1}^{S_U} \|P_u\|^2 + \lambda_I \sum_{i=1}^{S_I} \|Q_i\|^2$
• Weighted MSE
• $w_{u,i} = \begin{cases} w_{u,i} & \text{if } (u,i) \in T \\ w_0 & \text{otherwise} \end{cases}$, with $w_0 \ll w_{u,i}$
• Typical weights: $w_0 = 1$, $w_{u,i} = 100 \cdot supp(u,i)$
• Create two matrices from the events
▪ (1) Preference matrix
o Binary
o 1 represents the presence of an event
▪ (2) Confidence matrix
o Interprets our certainty on the corresponding values in the first matrix
o Negative feedback is much less certain
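A minimal sketch of building the two matrices from an event log, assuming that supp(u, i) means the number of events recorded for the (user, item) pair. Dense arrays are used only for readability; the function name and the event list are hypothetical.

```python
import numpy as np

def build_matrices(events, n_users, n_items, w0=1.0, alpha=100.0):
    """Return the binary preference matrix and the confidence (weight) matrix."""
    support = np.zeros((n_users, n_items))
    for u, i in events:
        support[u, i] += 1.0                                  # assumed: supp(u, i) = event count
    preference = (support > 0).astype(float)                  # 1 where any event was observed
    confidence = np.where(support > 0, alpha * support, w0)   # w0 for the "missing" cells
    return preference, confidence

# Hypothetical event log of (user index, item index) pairs.
events = [(0, 0), (0, 0), (1, 2)]
preference, confidence = build_matrices(events, n_users=2, n_items=3)
print(preference)
print(confidence)
```

In a real system both matrices would stay implicit: only the positive cells are stored, and the constant weight w0 covers everything else.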
Complexity of iALS
• Total cost: $O(K^3 (S_U + S_I) + K^2 N^+)$
▪ Linear in the number of events
▪ Cubic in the number of features
• In practice: $S_U + S_I \ll N^+$, so for small $K$ the second term dominates
▪ Quadratic in the number of features
• Approximate versions are even faster
▪ CG scales linearly in the number of features for small $K$
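The two cost terms can be seen in a sketch of one user update of implicit ALS. It assumes the standard trick of precomputing the Gram matrix Q^T Q once per sweep, so each positive event adds a K × K correction (the K^2 N+ term) and each user needs one K × K solve (the K^3 term). Function and argument names are illustrative.

```python
import numpy as np

def ials_user_step(Q, gram, pos_items, pos_weights, w0=1.0, reg=0.1):
    """Recompute one user's feature vector with the item factors Q held fixed.

    gram: precomputed Q.T @ Q, shared by all users in the sweep.
    pos_items: indices of the user's positive items, pos_weights: their weights w_ui.
    """
    K = Q.shape[1]
    Qp = Q[pos_items]                                       # (n_pos, K)
    # Correction for the positives: sum_i (w_ui - w0) q_i q_i^T, O(K^2) per event.
    extra = (Qp * (pos_weights - w0)[:, None]).T @ Qp
    A = w0 * gram + extra + reg * np.eye(K)
    b = Qp.T @ pos_weights                                  # preference is 1 on positives, 0 elsewhere
    return np.linalg.solve(A, b)                            # O(K^3) per user
```

The result equals the solution of the full weighted least squares problem over all items, without ever touching the zero cells individually.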
Training time using speed-ups
• ~1000 users
• ~170k items
• ~19M events
[Chart: running time (s) vs. number of features (K, from 5 to 100) for ALS, CG and CD; running-time axis from 0 to 800 s]
Item-2-item scenario
Task 2: item-2-item recommendations
• What is item-to-item recommendation?
▪ People who viewed this also viewed: …
▪ Viewed, watched, purchased, liked, favored, etc.
• Ignoring the current user
• The recommendation should be relevant to the current item
• Very common scenario
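A minimal sketch of "people who viewed this also viewed", assuming cosine similarity between the sets of users who interacted with each item. The quadratic loop over item pairs is for illustration only and would not scale to the catalog sizes discussed in this talk; names and events are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def item_knn(events, top_n=5):
    """Return, for each item, the most similar items by cosine similarity of user sets."""
    users_of = defaultdict(set)
    for user, item in events:
        users_of[item].add(user)
    similar = {}
    for a in users_of:
        scored = []
        for b in users_of:
            if a == b:
                continue
            common = len(users_of[a] & users_of[b])
            if common:
                cosine = common / sqrt(len(users_of[a]) * len(users_of[b]))
                scored.append((cosine, b))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))   # best first, ties by item id
        similar[a] = [b for _, b in scored[:top_n]]
    return similar

# Hypothetical (user, item) view events.
events = [("u1", "A"), ("u1", "B"), ("u2", "A"), ("u2", "B"), ("u3", "A"), ("u3", "C")]
print(item_knn(events))
```

The current user never appears in the lookup: the list depends only on the item being viewed.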
Data volume and time
• Data characteristics (after data retention):
▪ Number of active users: 100k – 100M
▪ Number of active items: 1k – 100M
▪ Number of relations between them: 10M – 10B
• Response time: must be within 200 ms
• We cannot give 199 ms to MF prediction + 1 ms to business logic
Time complexity of MF for implicit feedback
• During training
▪ $N^+$ = #events, $S_U$ = #users, $S_I$ = #items
▪ implicit ALS: $O(K^3 (S_U + S_I) + K^2 N^+)$
o with Coordinate Descent: $O(K^2 (S_U + S_I) + K N^+)$
o with CG: the same, but more stable.
▪ BPR: $O(K N^+)$
▪ CliMF: $O(K N^+ \cdot \text{avg(user support)})$
• During recommendation: $I \cdot K$
• Not practical if $I$ > 100k, $K$ > 100
• You have to increase $K$ as $I$ grows
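A rough sketch of the recommendation-time cost, assuming brute-force scoring of every item for each request. The helper names are hypothetical.

```python
import numpy as np

def top_n_bruteforce(p_u, Q, n=10):
    """Score all items for one user and return the indices of the n best, best first."""
    scores = Q @ p_u                                  # I * K multiply-adds
    top = np.argpartition(-scores, n - 1)[:n]         # unordered top n
    return top[np.argsort(-scores[top])]

def multiply_adds_per_request(n_items, k):
    """Number of multiply-add operations needed to score every item once."""
    return n_items * k

print(multiply_adds_per_request(100_000, 100))       # 10 million per request
```

At the 100k items and 100 features mentioned above, every single request costs ten million multiply-adds before any business logic runs.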
i2i recommendations with SVD / 2
• Recommendations should seem relevant
• You can expect that movies of the same trilogy are similar to each other
• We defined the following metric:
▪ For movies A and B of a trilogy, check if B is amongst the top-5 most similar items of A. Score: 0 or 1
▪ A trilogy can provide 6 such pairs (12 for tetralogies)
▪ Sum this up over all trilogies
• We used a custom movie dataset
• Good metric for CF item-to-item, bad metric for CBF item-to-item
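The metric can be written down in a few lines. This is a sketch with hypothetical movie identifiers; `top_similar` is assumed to map each movie to its ranked list of most similar items.

```python
def series_score(series, top_similar, top_n=5):
    """Count ordered pairs (A, B) from the same series where B is in A's top-n similar items."""
    score = 0
    for movies in series:
        for a in movies:
            neighbours = top_similar.get(a, [])[:top_n]
            for b in movies:
                if a != b and b in neighbours:
                    score += 1
    return score

# Hypothetical trilogy and similarity lists.
series = [["A1", "A2", "A3"]]
top_similar = {"A1": ["A2", "A3"], "A2": ["A1", "X"], "A3": []}
print(series_score(series, top_similar))
```

A trilogy contributes at most 3 × 2 = 6 ordered pairs and a tetralogy at most 4 × 3 = 12, matching the counts above.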
i2i recommendations with SVD / 3
• Evaluating SVD with different numbers of factors
• Using cosine similarity between SVD feature vectors
• More factors provide better results
• Why not use the original space?
• Who wants to run SVD with 500 factors?
• Score of the neighbor method (using cosine similarity between original vectors): 169
K      10  20  50  100  200  500  1000  1500
score  72  82  95  96   106  126  152   158
i2i recommendations with SVD / 4
• What does a 200-factor SVD recommend for Kill Bill: Vol. 1?
• Really bad recommendations
OK Cos Sim Title
 0.299 Kill Bill: Vol. 2
 0.273 Matthias, Matthias
 0.223 The New Rijksmuseum
 0.199 Naked
 0.190 Grave Danger
i2i recommendations with SVD / 5
• What does a 1500-factor SVD recommend for Kill Bill: Vol. 1?
• Good, but uses lots of CPU
• But that is an easy domain, with 20k movies!
OK Cos Sim Title
 0.292 Kill Bill: Vol. 2
! 0.140 Inglourious Basterds
! 0.133 Pulp Fiction
 0.131 American Beauty
! 0.125 Reservoir Dogs
Implementing an item-to-item method / 1
We implemented the following article:
Noam Koenigstein and Yehuda Koren. "Towards scalable and accurate item-oriented recommendations." Proceedings of the 7th ACM Conference on Recommender Systems. ACM, 2013.
• They define a new metric for i2i evaluation: MPR (Mean Percentile Rank). If a user visits A and then B, recommend for A and see the position of B in that list.
• They propose a new method (EIR, Euclidean Item Recommender) that assigns a feature vector to each item, so that if A is close to B, then users frequently visit B after A.
• They don’t compare it with a pure popularity method
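A minimal sketch of the MPR idea as described above, assuming the percentile is the zero-based position of B in the ranked list for A divided by the list length minus one (0 is best, about 0.5 is what a random ranking would give). The exact normalization in the cited paper may differ; the names and data are hypothetical.

```python
def mean_percentile_rank(transitions, ranked_lists):
    """Average normalized position of the item actually visited next.

    transitions: (A, B) pairs meaning a user visited A and then B.
    ranked_lists: for each A, all candidate items ranked for A, best first.
    Assumes B appears in the list for A and each list has at least two items.
    """
    total = 0.0
    for a, b in transitions:
        ranking = ranked_lists[a]
        total += ranking.index(b) / (len(ranking) - 1)
    return total / len(transitions)

# Hypothetical data: B is ranked first for A, D is ranked last.
ranked_lists = {"A": ["B", "C", "D"]}
print(mean_percentile_rank([("A", "B"), ("A", "D")], ranked_lists))
```

Because popular items are visited after almost anything, a pure popularity ranking can score well on this metric without being item-specific at all.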
Implementing an item-to-item method / 2
Results on a custom movie dataset:
• SVD and other methods can’t beat the new method
• The popularity method is better than or on par with the new method
• Recommendations for Pulp Fiction:
SVD                        New method
Reservoir Dogs             A Space Odyssey
Inglourious Basterds       A Clockwork Orange
Four Rooms                 The Godfather
The Shawshank Redemption   Eternal Sunshine of the Spotless Mind
Fight Club                 Mulholland Drive
Implementing an item-to-item method / 3
Comparison
method              metadata similarity (larger is better)   MPR (smaller is better)
cosine              7.54                                     0.68
Jaccard             7.59                                     0.68
Association rules   6.44                                     0.68
pop                 1.65                                     0.25
random              1.44                                     0.50
EIR                 5.00                                     0.25
Summary of EIR
• This method is better in MPR than many other methods
• It is on par with the popularity method
• It is worse in metadata-based similarity
• Sometimes recommendations look as if they were random
• Sensitive to the parameters
• Very few articles deal with CF item-to-item recs
Case studies on CTR
Case studies on CTR / 1
CTR almost doubled when we switched from IALS1 to item-kNN on a site where users and items are the same
Case studies on CTR / 2
Comparison of BPR vs. item-kNN on a classified site, for item-to-item recommendations
Item-kNN is the winner
[Chart comparing item-kNN and BPR]
Case studies on CTR / 3
Using BPR vs. item-kNN on a video site for personal recommendations
Measuring the number of clicks on recommendations
Result: 4% more clicks for BPR
[Chart comparing BPR and item-kNN]
Critiques of MF
• Lots of parameters to tune
• Needs many iterations over the data
• If there is no interconnection between two item sets, they can still get similar feature vectors.
• Sensitive to noise in data and cold-start
• Not the best for item-to-item recs, especially when many neighbors already exist
When to use MF
• One dense domain (e.g. movies), with not too many items (e.g. less than 100k)
• Feedback is taste-based
• For personalized recommendations (e.g. newsletter)
• Always do A/B testing
• Smart blending (e.g. using it for highly supported items)
• Usually better for offline evaluation metrics
Where we are now
Copyright © 2016 by Gravity R&D Zrt. All rights reserved.
Gravity’s Products and Features
Omnichannel Recommendations
• Mobile / Desktop / iPhone & Android Apps
Dynamic & personalized retargeting
• Through ad networks and third party sites
Smart Search
• Autocomplete, Autocorrect, Search result re-ranking
Personalized Emails & Push Notifications
Technology overview
• Performance: Gravity’s performance-oriented architecture enables real-time response to the ever-changing environment and user behavior
• Algorithms: more than 100 different recommendation algorithms enable true personalization and help reach the highest KPIs in different domains
• Infrastructure: fast response times all around the globe and data security thanks to the private cloud infrastructure located in 4 different data centers
• Flexibility: the advanced business rule engine with an intuitive user interface makes it possible to satisfy various business requirements
Performance: 140M requests served daily
Algorithms: 30 man-years invested
Infrastructure: 4 data centers globally
Flexibility: 100s of logics configurable
Infrastructure
Currently 200+ hosts and 3500+ services monitored
[Chart: number of servers per year, 2008–2016; vertical axis from 0 to 250]
4 data centers around the globe
SJC: 20+ servers
AMS: 60+ servers
BUD: 80+ servers
SIN: 30+ servers
Using lots of technologies
Using lots of algorithms (100+)
[Chart: number of times an algorithm is used]
New directions
Deep learning: session-based recommendations
• User profile → separate sessions
▪ User identification problem
▪ Sessions with different purposes
o Buying for herself / as a present
o Purchasing products for a specific need (e.g. TV now, fridge 2 weeks later)
o The intent / goal of browsing sessions of the same user can differ
• Usual solution: item-to-item recommendations
▪ Previous history is not considered
▪ No personalized experience
▪ Extra round for finding the best fit
• Next event prediction:
▪ Given the events in the session (so far), what is the most likely next event?
Session based recommendations with RNN
• Item-to-session recommendations
• Using RNNs (GRU, LSTM)
• Network with many features
• Distinctive features
▪ Session-parallel mini-batches
▪ Sampling on the output layer
▪ Ranking loss
o BPR
o TOP1
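A NumPy sketch of the two ranking losses for a single target item and its sampled negatives, written from the definitions in the session-based RNN paper cited on these slides. It only computes loss values; in a real model an autodiff framework would backpropagate through the network. Function names are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_loss(pos_score, neg_scores):
    """BPR: negative mean log-probability that the target outscores each sampled negative."""
    return float(-np.mean(np.log(sigmoid(pos_score - neg_scores))))

def top1_loss(pos_score, neg_scores):
    """TOP1: relative rank of the target plus a term pushing negative scores towards zero."""
    return float(np.mean(sigmoid(neg_scores - pos_score) + sigmoid(neg_scores ** 2)))

# Hypothetical scores: one target item and three sampled negatives.
negatives = np.array([0.5, -1.0, 0.0])
print(bpr_loss(2.0, negatives), top1_loss(2.0, negatives))
```

Both losses depend only on score differences between the target and the negatives, which is what makes them ranking losses rather than rating-prediction losses.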
[Figure: network architecture: input (actual item, 1-of-N coding), embedding layer, GRU layers, feedforward layers, output (scores on items)]
Session-parallel mini-batches
* Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, Domonkos Tikk: Session-based Recommendations with Recurrent Neural Networks, to appear at ICLR 2016, available on arXiv.
Results
• Significant improvement over the baselines
• +20-30% in recall@20 and MRR@20 over item-kNN
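For reference, a sketch of how recall@k and MRR@k are commonly computed for next-event prediction, assuming one target item per test event. The function names and toy lists are hypothetical.

```python
def recall_at_k(ranked_lists, targets, k=20):
    """Fraction of test events whose target item appears in the top k."""
    hits = sum(1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k])
    return hits / len(targets)

def mrr_at_k(ranked_lists, targets, k=20):
    """Mean reciprocal rank of the target item, counted as 0 when it is outside the top k."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top = ranked[:k]
        if target in top:
            total += 1.0 / (top.index(target) + 1)
    return total / len(targets)

# Hypothetical ranked recommendation lists and the items actually chosen next.
ranked_lists = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
targets = ["a", "a", "x"]
print(recall_at_k(ranked_lists, targets, k=2), mrr_at_k(ranked_lists, targets, k=2))
```

Recall@k only asks whether the target made the list, while MRR@k also rewards putting it near the top.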
Direct usage of content for recommendations
• User’s decision (click or not click)
▪ Title
▪ Image
▪ Description
• Pipeline
▪ Automatic feature extraction from content (text, images, music, video)
▪ Feed features to the RNN recommender
• Other usages
▪ “Truly similar” item recommendation
▪ “X is to Y like A is to B” recommendations
▪ Etc.
• High potential
Recoplatform: RaaS for SMBs
• www.recoplatform.com
• Self-service solution
• Automated, quick and easy integration
• Priced to scale with business size
Technology
Product
Business model
Algorithms
Cross the river when you come to it
Thank you!
Email: domi@gravityrd.com
Twitter: @domonkostikk
Web: www.gravityrd.com
F: facebook.com/gravityrd
Blog: blog.gravityrd.com
Yes, we are hiring: hr@gravityrd.com

The Machine Learning Workflow with AzureIvo Andreev
 
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...Aalto University
 
ODSC East 2020 : Continuous_learning_systems
ODSC East 2020 : Continuous_learning_systemsODSC East 2020 : Continuous_learning_systems
ODSC East 2020 : Continuous_learning_systemsAnuj Gupta
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkIvo Andreev
 
Efficiency gains in inversion based interpretation through computer
Efficiency gains in inversion based interpretation through computerEfficiency gains in inversion based interpretation through computer
Efficiency gains in inversion based interpretation through computerDustin Dewett
 
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...South Tyrol Free Software Conference
 
Continuous Learning Systems: Building ML systems that learn from their mistakes
Continuous Learning Systems: Building ML systems that learn from their mistakesContinuous Learning Systems: Building ML systems that learn from their mistakes
Continuous Learning Systems: Building ML systems that learn from their mistakesAnuj Gupta
 

Ähnlich wie Lessons learnt at building recommendation services at industry scale (20)

Kaggle Gold Medal Case Study
Kaggle Gold Medal Case StudyKaggle Gold Medal Case Study
Kaggle Gold Medal Case Study
 
Kaggle Days Paris - Alberto Danese - ML Interpretability
Kaggle Days Paris - Alberto Danese - ML InterpretabilityKaggle Days Paris - Alberto Danese - ML Interpretability
Kaggle Days Paris - Alberto Danese - ML Interpretability
 
[CS570] Machine Learning Team Project (I know what items really are)
[CS570] Machine Learning Team Project (I know what items really are)[CS570] Machine Learning Team Project (I know what items really are)
[CS570] Machine Learning Team Project (I know what items really are)
 
Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning Systems
 
Horizon: Deep Reinforcement Learning at Scale
Horizon: Deep Reinforcement Learning at ScaleHorizon: Deep Reinforcement Learning at Scale
Horizon: Deep Reinforcement Learning at Scale
 
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
Strata 2016 -  Lessons Learned from building real-life Machine Learning SystemsStrata 2016 -  Lessons Learned from building real-life Machine Learning Systems
Strata 2016 - Lessons Learned from building real-life Machine Learning Systems
 
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...Predictive Model and Record Description with Segmented Sensitivity Analysis (...
Predictive Model and Record Description with Segmented Sensitivity Analysis (...
 
Machine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional ManagersMachine Learning Foundations for Professional Managers
Machine Learning Foundations for Professional Managers
 
DutchMLSchool. ML: A Technical Perspective
DutchMLSchool. ML: A Technical PerspectiveDutchMLSchool. ML: A Technical Perspective
DutchMLSchool. ML: A Technical Perspective
 
The Machine Learning Workflow with Azure
The Machine Learning Workflow with AzureThe Machine Learning Workflow with Azure
The Machine Learning Workflow with Azure
 
OR Ndejje Univ.pptx
OR Ndejje Univ.pptxOR Ndejje Univ.pptx
OR Ndejje Univ.pptx
 
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
Model-Based User Interface Optimization: Part IV: ADVANCED TOPICS - At SICSA ...
 
ODSC East 2020 : Continuous_learning_systems
ODSC East 2020 : Continuous_learning_systemsODSC East 2020 : Continuous_learning_systems
ODSC East 2020 : Continuous_learning_systems
 
The Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it WorkThe Power of Auto ML and How Does it Work
The Power of Auto ML and How Does it Work
 
OR Ndejje Univ (1).pptx
OR Ndejje Univ (1).pptxOR Ndejje Univ (1).pptx
OR Ndejje Univ (1).pptx
 
Efficiency gains in inversion based interpretation through computer
Efficiency gains in inversion based interpretation through computerEfficiency gains in inversion based interpretation through computer
Efficiency gains in inversion based interpretation through computer
 
Big data
Big dataBig data
Big data
 
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
 
Continuous Learning Systems: Building ML systems that learn from their mistakes
Continuous Learning Systems: Building ML systems that learn from their mistakesContinuous Learning Systems: Building ML systems that learn from their mistakes
Continuous Learning Systems: Building ML systems that learn from their mistakes
 
Skillwise Big data
Skillwise Big dataSkillwise Big data
Skillwise Big data
 

Mehr von Domonkos Tikk

Recommenders on video sharing portals - business and algorithmic aspects
Recommenders on video sharing portals - business and algorithmic aspectsRecommenders on video sharing portals - business and algorithmic aspects
Recommenders on video sharing portals - business and algorithmic aspectsDomonkos Tikk
 
General factorization framework for context-aware recommendations
General factorization framework for context-aware recommendationsGeneral factorization framework for context-aware recommendations
General factorization framework for context-aware recommendationsDomonkos Tikk
 
Tartalomgazdagítás (content enrichment)
Tartalomgazdagítás (content enrichment) Tartalomgazdagítás (content enrichment)
Tartalomgazdagítás (content enrichment) Domonkos Tikk
 
Idomaar crowd rec_reference_fw
Idomaar crowd rec_reference_fwIdomaar crowd rec_reference_fw
Idomaar crowd rec_reference_fwDomonkos Tikk
 
Big Data in Online Classifieds
Big Data in Online ClassifiedsBig Data in Online Classifieds
Big Data in Online ClassifiedsDomonkos Tikk
 
Context-aware similarities within the factorization framework - presented at ...
Context-aware similarities within the factorization framework - presented at ...Context-aware similarities within the factorization framework - presented at ...
Context-aware similarities within the factorization framework - presented at ...Domonkos Tikk
 
Slides from CARR 2012 WS - Enhancing Matrix Factorization Through Initializat...
Slides from CARR 2012 WS - Enhancing Matrix Factorization Through Initializat...Slides from CARR 2012 WS - Enhancing Matrix Factorization Through Initializat...
Slides from CARR 2012 WS - Enhancing Matrix Factorization Through Initializat...Domonkos Tikk
 
Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Imp...
Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Imp...Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Imp...
Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Imp...Domonkos Tikk
 
Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 worksh...
Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 worksh...Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 worksh...
Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 worksh...Domonkos Tikk
 

Mehr von Domonkos Tikk (9)

Recommenders on video sharing portals - business and algorithmic aspects
Recommenders on video sharing portals - business and algorithmic aspectsRecommenders on video sharing portals - business and algorithmic aspects
Recommenders on video sharing portals - business and algorithmic aspects
 
General factorization framework for context-aware recommendations
General factorization framework for context-aware recommendationsGeneral factorization framework for context-aware recommendations
General factorization framework for context-aware recommendations
 
Tartalomgazdagítás (content enrichment)
Tartalomgazdagítás (content enrichment) Tartalomgazdagítás (content enrichment)
Tartalomgazdagítás (content enrichment)
 
Idomaar crowd rec_reference_fw
Idomaar crowd rec_reference_fwIdomaar crowd rec_reference_fw
Idomaar crowd rec_reference_fw
 
Big Data in Online Classifieds
Big Data in Online ClassifiedsBig Data in Online Classifieds
Big Data in Online Classifieds
 
Context-aware similarities within the factorization framework - presented at ...
Context-aware similarities within the factorization framework - presented at ...Context-aware similarities within the factorization framework - presented at ...
Context-aware similarities within the factorization framework - presented at ...
 
Slides from CARR 2012 WS - Enhancing Matrix Factorization Through Initializat...
Slides from CARR 2012 WS - Enhancing Matrix Factorization Through Initializat...Slides from CARR 2012 WS - Enhancing Matrix Factorization Through Initializat...
Slides from CARR 2012 WS - Enhancing Matrix Factorization Through Initializat...
 
Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Imp...
Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Imp...Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Imp...
Fast ALS-Based Tensor Factorization for Context-Aware Recommendation from Imp...
 
Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 worksh...
Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 worksh...Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 worksh...
Recommender Systems Evaluation: A 3D Benchmark - presented at RUE 2012 worksh...
 

Kürzlich hochgeladen

Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书rnrncn29
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Sonam Pathan
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作ys8omjxb
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Paul Calvano
 
Unidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxUnidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxmibuzondetrabajo
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书rnrncn29
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhimiss dipika
 
Q4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxQ4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxeditsforyah
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationLinaWolf1
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一z xss
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predieusebiomeyer
 
Internet of Things Presentation (IoT).pptx
Internet of Things Presentation (IoT).pptxInternet of Things Presentation (IoT).pptx
Internet of Things Presentation (IoT).pptxErYashwantJagtap
 
NSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationNSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationMarko4394
 

Kürzlich hochgeladen (17)

Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
 
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
Call Girls In The Ocean Pearl Retreat Hotel New Delhi 9873777170
 
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in  Rk Puram 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Rk Puram 🔝 9953056974 🔝 Delhi escort Service
 
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
Potsdam FH学位证,波茨坦应用技术大学毕业证书1:1制作
 
Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24Font Performance - NYC WebPerf Meetup April '24
Font Performance - NYC WebPerf Meetup April '24
 
Unidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxUnidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptx
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
 
Contact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New DelhiContact Rya Baby for Call Girls New Delhi
Contact Rya Baby for Call Girls New Delhi
 
Q4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptxQ4-1-Illustrating-Hypothesis-Testing.pptx
Q4-1-Illustrating-Hypothesis-Testing.pptx
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
 
PHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 DocumentationPHP-based rendering of TYPO3 Documentation
PHP-based rendering of TYPO3 Documentation
 
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
办理(UofR毕业证书)罗切斯特大学毕业证成绩单原版一比一
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predi
 
Internet of Things Presentation (IoT).pptx
Internet of Things Presentation (IoT).pptxInternet of Things Presentation (IoT).pptx
Internet of Things Presentation (IoT).pptx
 
NSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentationNSX-T and Service Interfaces presentation
NSX-T and Service Interfaces presentation
 

Lessons learnt at building recommendation services at industry scale

  • 1. Lessons learnt at building recommendation services at industry scale Domonkos Tikk Gravity R&D Industry keynote @ ECIR 2016 @domonkostikk
  • 2. Credits to colleagues 3/24/2016 Bottyán Németh Product Owner and co-founder István Pilászy Head of Development and co-founder Balázs Hidasi Head of Data Mining & Research Gábor Vincze Head of Global Service György Dózsa Head of Web Integrations and many others…
  • 3. IR ⊃ Recsys 3/24/2016 Information Retrieval without query IR ?
  • 4. Who we are and what we do: Gravity R&D is a recommender system vendor. We have provided recommendation as a service since 2009 to customers all around the globe
  • 5. The journey Gravity made from 2009-2016 3/24/2016
  • 6. How we imagine growth? 6 ?
  • 7. How we imagine growth? 7
  • 8. How it actually happens? 8 ?
  • 9. How it actually happens? 9
  • 10. The impact of Netflix Prize
  • 11. Short summary of Netflix Prize
      • 2006–2009
      • Predict movie ratings (explicit feedback)
      • Content based filtering (CBF) did not work
      • Classical CF methods (item-kNN, user-kNN) did not work
      • Matrix factorization was extremely effective
      • We were fully in love with matrix factorization
  • 12. Schematic of matrix factorization
      • Model: how we approximate user preferences
          o $\hat{r}_{u,i} = p_u^T q_i$
      • Objective function (error function): what do we want to minimize or optimize?
          o E.g. optimize for RMSE with regularization: $L = \sum_{(u,i) \in Train} (r_{u,i} - \hat{r}_{u,i})^2 + \dots$
      • Learning
      • [Figure: $R \approx P Q$, where $R$ is $S_U \times S_I$, $P$ is $S_U \times K$ and $Q$ is $K \times S_I$]
  • 13. [Figure: example rating matrix R with its factor matrices P and Q]
  • 15. [Figure: example rating matrix R with its factor matrices P and Q]
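The model and objective on the matrix factorization slide can be illustrated with a minimal sketch. It assumes plain stochastic gradient descent on the regularized squared error; the deck does not say which learner was used here, and the toy ratings, learning rate, regularization weight and function names are all hypothetical.

```python
import random

def predict(P, Q, u, i):
    # Model: r_hat(u, i) = p_u^T q_i
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def train_mf(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=300, seed=0):
    """Fit P (n_users x k) and Q (n_items x k) by SGD on squared error + L2 penalty."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on (r - p.q)^2 + reg * (|p|^2 + |q|^2)
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    sq = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (sq / len(ratings)) ** 0.5

# Hypothetical toy data: (user, item, rating) triples.
ratings = [(0, 0, 1), (0, 1, 4), (0, 2, 3), (1, 1, 4),
           (2, 0, 4), (2, 2, 4), (3, 1, 4), (3, 3, 2)]

before = rmse(ratings, *train_mf(ratings, 4, 4, epochs=0))  # untrained factors
P, Q = train_mf(ratings, 4, 4)
after = rmse(ratings, P, Q)
print(f"train RMSE before: {before:.3f}, after: {after:.3f}")
```

The empty cells of R are then filled with `predict(P, Q, u, i)`.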
  • 16. Make investors interested 3/24/2016 • Reference • Team • Technology • Business model
  • 17. Netflix Prize demo / 1
      • In 2009 we created a public demo, mainly for investors
      • Users can rate movies and get recommendations
      • What do you expect from a demo?
          o Be relevant even after 1 rating
          o Users will provide their favorite movies first
          o Be relevant after 2 ratings: both movies should affect the results
  • 18. Netflix Prize demo / 2
      • Using a good MF model with K=200 factors and biases
      • Use linear regression to compute the user feature vector
      • Recs after rating a romantic movie (Notting Hill, 1999):
          Score   Title
          4.6916  The_Shawshank_Redemption/1994
          4.6858  House,_M.D.:_Season_1/2004
          4.6825  Lost:_Season_1/2004
          4.5903  Anne_of_Green_Gables:_The_Sequel/1987
          4.5497  Lord_of_the_Rings:_The_Return_of_the_King/2003
  • 19. Netflix Prize demo / 3
      • Idea: turn off item bias during recommendation
      • Results are fully relevant
      • Even with 10 factors, it is very good
          Score   Title
          4.3323  Love_Actually/2003
          4.3015  Runaway_Bride/1999
          4.2811  My_Best_Friend's_Wedding/1997
          4.2790  You've_Got_Mail/1998
          4.1564  About_a_Boy/2002
  • 20. Netflix Prize demo / 4
      • Now give a 5-star rating to Saving Private Ryan / 1998
      • Almost no change in the list
          Score   Title
          4.5911  You've_Got_Mail/1998
          4.5085  Love_Actually/2003
          4.3944  Sleepless_in_Seattle/1993
          4.3625  Runaway_Bride/1999
          4.3274  My_Best_Friend's_Wedding/1997
  • 21. Netflix Prize demo / 5
      • Idea: set item biases to zero before computing the user feature vector
      • 5th rec is romantic + war
      • Conclusion: MF is good, but rating and ranking are very different
          OK  Score   Title
              4.5094  You've_Got_Mail/1998
              4.3445  Black_Hawk_Down/2001
              4.3298  Sleepless_in_Seattle/1993
              4.3114  Love_Actually/2003
          !   4.2805  Apollo_13/1995
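The demo slides describe computing a new user's feature vector by linear regression and then switching item biases off. The following is a small sketch of that idea, assuming ridge regression against fixed item factors. Item names, factor values, biases, the global mean `MU` and the regularization weight are hypothetical and chosen only so that the effect is visible with two factors.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                for j in range(c, n + 1):
                    M[r][j] -= factor * M[c][j]
    return [M[r][n] / M[r][r] for r in range(n)]

def fold_in_user(rated, Q, bias, mu, reg=1.0, use_bias=True):
    """Ridge regression for a user vector p: min sum (r - mu - b_i - p.q_i)^2 + reg |p|^2."""
    k = len(next(iter(Q.values())))
    A = [[reg if a == b else 0.0 for b in range(k)] for a in range(k)]
    y = [0.0] * k
    for item, r in rated:
        q = Q[item]
        target = r - mu - (bias[item] if use_bias else 0.0)
        for a in range(k):
            y[a] += target * q[a]
            for b in range(k):
                A[a][b] += q[a] * q[b]
    return solve(A, y)

def recommend(p, Q, bias, exclude, use_bias=True, n=2):
    """Rank unrated items by p.q_i, optionally adding the item bias."""
    scores = {i: sum(x * y for x, y in zip(p, q)) + (bias[i] if use_bias else 0.0)
              for i, q in Q.items() if i not in exclude}
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical 2-factor item model: factor 0 ~ "romance", factor 1 ~ "war".
Q = {"romance_1": [1.0, 0.0], "romance_2": [0.9, 0.1], "romance_3": [0.8, 0.0],
     "war_1": [0.0, 1.0], "war_2": [0.1, 0.9], "popular_hit": [0.1, 0.1]}
bias = {"romance_1": 0.1, "romance_2": 0.0, "romance_3": -0.1,
        "war_1": 0.2, "war_2": 0.0, "popular_hit": 1.0}
MU = 3.6  # hypothetical global mean rating

# One 5-star rating on a romantic movie.
one = [("romance_1", 5.0)]
p_one = fold_in_user(one, Q, bias, MU)
with_bias = recommend(p_one, Q, bias, {"romance_1"}, use_bias=True)
without_bias = recommend(p_one, Q, bias, {"romance_1"}, use_bias=False)

# Add a 5-star rating on a war movie; biases zeroed in the regression and in the ranking.
two = one + [("war_1", 5.0)]
p_two = fold_in_user(two, Q, bias, MU, use_bias=False)
mixed = recommend(p_two, Q, bias, {"romance_1", "war_1"}, use_bias=False)
print(with_bias, without_bias, mixed)
```

In this toy setup the high-bias item tops the list when biases are used for ranking, the romantic items win once the bias is dropped, and after the second rating both genres appear. This mirrors the direction of the slides, not their actual numbers.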
  • 23. The business model question Trabant Rolls Royce
  • 24. Business model: Trabant vs. Rolls Royce
      • Trabant
          o Cheap for client
          o Simple functionality
          o Low performance
          o No customization
          o Limited warranty
          o Works if sold in large quantities
      • Rolls Royce
          o Expensive for client
          o Complex functionality
          o High performance
          o Full customization
          o Full warranty (SLA)
          o Few sales can bring enough return
  • 25. Our decision in 2009 was: Rolls Royce
      • Expensive for client
      • Complex functionality
      • High performance
      • Full customization
      • Full warranty (SLA)
      • Few sales can bring enough return
  • 26. # of requests
      • Vatera.hu, the largest online marketplace in Hungary: served by one “server”
      • Alexa TOP100 video chat webpage (~40M recommendation requests / day):
          o Served by 5 application servers and 1 DB
          o Too many events to store in MySQL → using Cassandra (v0.6)
          o Training time for IALS too long → speedup by IALS1
          o Max. 5 sec latency in “product” availability
  • 27. Using new/beta technologies
      • Cassandra (v0.6)
      • Nginx (v0.5) (22% of top 1M sites)
      • Kafka (v0.8)
      • MySQL auto. failover
  • 28. Reaching the limits
      • Even if a technology is widely used, once you reach its limits, optimization is very costly and time consuming
      • Java GC: the service collapsed because of increased minor GC times due to a JVM bug (26th of January 2013)
      • Maintaining MySQL with lots of data (optimize table, slave replication lag, faster storage device)
  • 29. Complexity increases: there is always a business request or an algorithmic development that requires more resources
  • 31. # of items
      • How to store the item model / metadata in memory to serve requests fast?
      • Auto-increment IDs for the items? $2^{31}$ (~2 billion) is not enough
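The ~2 billion figure matches the largest value of a signed 32-bit integer. As a small illustration, assuming item IDs are stored in a signed 32-bit field (the deck does not name the column or variable type), the check below shows where such an ID stops fitting and that a 64-bit field still holds it. `fits_int32` is a hypothetical helper.

```python
import struct

INT32_MAX = 2**31 - 1  # largest signed 32-bit value

def fits_int32(item_id):
    """True if the ID can be stored in a signed 32-bit field."""
    return 0 <= item_id <= INT32_MAX

next_id = INT32_MAX + 1  # the first auto-increment value that no longer fits

try:
    struct.pack("<i", next_id)      # "<i": 4-byte signed integer
    packed_32 = True
except struct.error:
    packed_32 = False

packed_64 = struct.pack("<q", next_id)  # "<q": 8-byte signed integer

print(INT32_MAX, fits_int32(next_id), packed_32, len(packed_64))
```

Widening the ID doubles the per-ID memory footprint (8 bytes instead of 4), which is part of the in-memory storage question raised on the same slide.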
  • 32. Preconceptions
      • More data yields better results
      • CTR is the right proxy: quick decision on A/B tests
      • Daily retrain is enough
  • 34. Tasks are different in real-world applications
  • 35. Industry vs. academia
      • In academic papers:
          o 50% explicit feedback
          o 50% implicit feedback (49.9% personal, 0.1% item2item)
      • At gravityrd.com:
          o 1% explicit feedback
          o 99% implicit feedback (15% personal, 84% item2item)
      • Sites where rating is crucial tend to create their own rec engine
      • Even if there are explicit ratings, there is more implicit feedback
  • 36. Implicit vs. explicit ratings
      • Standard SGD-based learning does not work (complexity issues)
      • Implicit ALS
      • Approximate versions of IALS
          o with coordinate descent*
          o with conjugate gradient**
      * I. Pilászy, D. Zibriczky, D. Tikk: Fast ALS-based matrix factorization for explicit and implicit feedback datasets. RecSys 2010.
      ** G. Takács, I. Pilászy, D. Tikk: Applications of the conjugate gradient method for implicit feedback collaborative filtering. RecSys 2011.
  • 37. What is the problem with the explicit objective function?
      • $L = \sum_{(u,i) \in T} (r_{u,i} - \hat{r}_{u,i})^2 + \lambda_U \sum_{u=1}^{S_U} \|P_u\|^2 + \lambda_I \sum_{i=1}^{S_I} \|Q_i\|^2$
      • The matrix to be factorized contains 0s and 1s
          o If we consider only the positive events (1s):
              - Predicting 1s everywhere minimizes $L$ trivially
              - Some minor differences may occur due to regularization
      • Modified objective function (including zeros)
          o $L = \sum_{u=1,i=1}^{S_U,S_I} (r_{u,i} - \hat{r}_{u,i})^2 + \lambda_U \sum_{u=1}^{S_U} \|P_u\|^2 + \lambda_I \sum_{i=1}^{S_I} \|Q_i\|^2$
          o Number of terms increased
          o #zeros ≫ #ones:
              - An all-zero prediction gives a pretty good $L$
  • 38. Why “explicit” optimization suffers
      • Complexity of the best explicit method
          o $O(|T| K)$
          o Linear in the number of observed ratings
      • Implicit feedback
          o One should consider negative implicit feedback (“missing rating”)
          o There is no real missing rating in the matrix: an element is either 0 or 1, no empty cells
          o Complexity: $O(S_U S_I K)$
          o Sparse data (< 1%, in general)
          o $S_U S_I \gg |T|$
  • 39. iALS: objective function
      • $L = \sum_{u=1,i=1}^{S_U,S_I} w_{u,i} (r_{u,i} - \hat{r}_{u,i})^2 + \lambda_U \sum_{u=1}^{S_U} \|P_u\|^2 + \lambda_I \sum_{i=1}^{S_I} \|Q_i\|^2$
      • Weighted MSE
      • The weight is $w_{u,i}$ if $(u,i) \in T$ and $w_0$ otherwise, with $w_0 \ll w_{u,i}$
      • Typical weights: $w_0 = 1$, $w_{u,i} = 100 \cdot supp(u,i)$
      • Create two matrices from the events
          o (1) Preference matrix
              - Binary
              - 1 represents the presence of an event
          o (2) Confidence matrix
              - Interprets our certainty on the corresponding values in the first matrix
              - Negative feedback is much less certain
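The weighted objective can be written out directly for a toy case. This sketch evaluates the loss only, not the ALS solver, using the "typical weights" from the slide. The event dictionary, the function name `ials_loss` and the default regularization weights are hypothetical.

```python
def ials_loss(events, P, Q, w0=1.0, w_scale=100.0, reg_u=0.1, reg_i=0.1):
    """Weighted squared error over ALL (user, item) cells plus L2 penalties.

    events: dict {(u, i): support}, the number of events seen for that pair.
    P, Q:   lists of user and item feature vectors.
    """
    loss = 0.0
    for u, pu in enumerate(P):
        for i, qi in enumerate(Q):
            supp = events.get((u, i), 0)
            r = 1.0 if supp > 0 else 0.0             # preference matrix entry
            w = w_scale * supp if supp > 0 else w0   # confidence matrix entry
            pred = sum(a * b for a, b in zip(pu, qi))
            loss += w * (r - pred) ** 2
    loss += reg_u * sum(x * x for pu in P for x in pu)
    loss += reg_i * sum(x * x for qi in Q for x in qi)
    return loss

# Hypothetical toy data: 2 users, 3 items, K = 2.
events = {(0, 0): 2, (1, 1): 1}
P_zero = [[0.0, 0.0], [0.0, 0.0]]
Q_zero = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

# With all-zero factors only the observed cells contribute: 100*2 + 100*1.
zero_loss = ials_loss(events, P_zero, Q_zero)
print(zero_loss)
```

Because observed cells carry weights of 100 or more and unobserved cells only 1, the all-zero prediction is no longer a cheap near-optimum, which is the purpose of the confidence matrix.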
  • 40. Complexity of iALS
      • Total cost: $O(K^3 (S_U + S_I) + K^2 N^+)$
          o Linear in the number of events
          o Cubic in the number of features
      • In practice $S_U + S_I \ll N^+$, so for small $K$ the second term dominates
          o Quadratic in the number of features
      • Approximate versions are even faster
          o CG scales linearly in the number of features for small $K$
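To make "for small K the second term dominates" concrete: ignoring constant factors, $K^3 (S_U + S_I) < K^2 N^+$ exactly when $K < N^+ / (S_U + S_I)$. The sketch below evaluates both terms for a hypothetical dataset; the sizes are illustrative and not taken from the deck.

```python
def ials_cost_terms(k, n_users, n_items, n_events):
    """Return the two terms of K^3 (S_U + S_I) + K^2 N+ (constants ignored)."""
    return k**3 * (n_users + n_items), k**2 * n_events

def crossover_k(n_users, n_items, n_events):
    """K below which the K^2 N+ term is the larger one."""
    return n_events / (n_users + n_items)

# Hypothetical sizes: 1M users, 100k items, 100M events.
S_U, S_I, N_PLUS = 1_000_000, 100_000, 100_000_000

k_star = crossover_k(S_U, S_I, N_PLUS)
for k in (10, 50, 200):
    cubic, quadratic = ials_cost_terms(k, S_U, S_I, N_PLUS)
    dominant = "K^2 N+" if quadratic > cubic else "K^3 (S_U + S_I)"
    print(f"K={k}: cubic term {cubic:.2e}, quadratic term {quadratic:.2e}, dominant: {dominant}")
print(f"crossover at K of about {k_star:.1f}")
```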
  • 41. Training time using speed-ups
      • ~1000 users
      • ~170k items
      • ~19M events
      • [Figure: running time (s) vs. number of features (K, from 5 to 100) for ALS, CG and CD]
  • 43. Task 2: item-2-item recommendations
      • What is item-to-item recommendation?
          o People who viewed this also viewed: …
          o Viewed, watched, purchased, liked, favored, etc.
      • Ignoring the current user
      • The recommendation should be relevant to the current item
      • Very common scenario
  • 45. Data volume and time
      • Data characteristics (after data retention):
          o Number of active users: 100k – 100M
          o Number of active items: 1k – 100M
          o Number of relations between them: 10M – 10B
      • Response time: must be within 200 ms
      • We cannot give 199 ms to MF prediction + 1 ms to business logic
  • 46. Time complexity of MF for implicit feedback
      • During training
          o $N^+$ = #events, $S_U$ = #users, $S_I$ = #items
          o implicit ALS: $O(K^3 (S_U + S_I) + K^2 N^+)$
              - with Coordinate Descent: $O(K^2 (S_U + S_I) + K N^+)$
              - with CG: the same, but more stable
          o BPR: $O(K N^+)$
          o CliMF: $O(K N^+ \cdot \text{avg(user support)})$
      • During recommendation: $I \cdot K$
      • Not practical if $I > 100\text{k}$, $K > 100$
      • You have to increase $K$ as $I$ grows
  • 47. i2i recommendations with SVD / 2
      • Recommendations should seem relevant
      • You can expect that movies of the same trilogy are similar to each other
      • We defined the following metric:
          o For movies A and B of a trilogy, check if B is amongst the top-5 most similar items of A. Score: 0 or 1
          o A trilogy can provide 6 such pairs (12 for tetralogies)
          o Sum this up for all trilogies
      • We used a custom movie dataset
      • Good metric for CF item-to-item, bad metric for CBF item-to-item
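The trilogy metric translates almost directly into code. This is a sketch under the assumption that some `top_similar(item)` function returns the most similar items, best first; the function names and the toy neighbor lists are hypothetical.

```python
from itertools import permutations

def series_score(series_list, top_similar, n=5):
    """Count ordered pairs (A, B) from the same series where B is in A's top-n."""
    score = 0
    for series in series_list:
        # Ordered pairs: 6 for a trilogy, 12 for a tetralogy.
        for a, b in permutations(series, 2):
            if b in top_similar(a)[:n]:
                score += 1
    return score

# Hypothetical neighbor lists for one trilogy (part_1, part_2, part_3).
neighbors = {
    "part_1": ["part_2", "part_3", "other_x"],
    "part_2": ["part_1", "other_x", "other_y"],
    "part_3": ["other_x", "other_y", "other_z"],
}
score = series_score([("part_1", "part_2", "part_3")],
                     lambda item: neighbors.get(item, []))
print(score)  # number of hits out of 6 possible pairs
```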
  • 48. i2i recommendations with SVD / 3
      • Evaluating SVD with different numbers of factors
      • Using cosine similarity between SVD feature vectors
      • More factors provide better results
      • Why not use the original space?
      • Who wants to run SVD with 500 factors?
      • Score of the neighbor method (using cosine similarity between original vectors): 169
          K      10  20  50  100  200  500  1000  1500
          score  72  82  95   96  106  126   152   158
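A minimal sketch of the neighbor method mentioned above: cosine similarity between item vectors, then the top-n most similar items. Vectors are stored as sparse dictionaries so the same code covers "original" vectors (user → interaction) and dense factor vectors (index → value). The item names and interaction data are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse vectors given as {key: value} dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_similar(item, vectors, n=5):
    """Return the n items most similar to `item`, best first."""
    sims = {other: cosine(vectors[item], vec)
            for other, vec in vectors.items() if other != item}
    return sorted(sims, key=sims.get, reverse=True)[:n]

# Hypothetical "original space" item vectors: which users interacted with each item.
item_users = {
    "series_part_1": {"u1": 1, "u2": 1, "u3": 1},
    "series_part_2": {"u1": 1, "u2": 1, "u3": 1, "u4": 1},
    "related_film":  {"u2": 1, "u3": 1, "u5": 1},
    "documentary":   {"u5": 1, "u6": 1},
}
closest = top_similar("series_part_1", item_users, n=2)
print(closest)
```

Every query computes a similarity against all other items, so in practice the neighbor lists would be precomputed offline rather than evaluated per request.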
  • 49. i2i recommendations with SVD / 4
      • What does a 200-factor SVD recommend for Kill Bill: Vol. 1?
      • Really bad recommendations
          Cos Sim  Title
          0.299    Kill Bill: Vol. 2
          0.273    Matthias, Matthias
          0.223    The New Rijksmuseum
          0.199    Naked
          0.190    Grave Danger
  • 50. i2i recommendations with SVD / 5
      • What does a 1500-factor SVD recommend for Kill Bill: Vol. 1?
      • Good, but uses lots of CPU
      • But that is an easy domain, with 20k movies!
          OK  Cos Sim  Title
              0.292    Kill Bill: Vol. 2
          !   0.140    Inglourious Basterds
          !   0.133    Pulp Fiction
              0.131    American Beauty
          !   0.125    Reservoir Dogs
  • 51. Implementing an item-to-item method / 1
      • We implemented the following article: Noam Koenigstein and Yehuda Koren. "Towards scalable and accurate item-oriented recommendations." Proceedings of the 7th ACM conference on Recommender systems. ACM, 2013.
      • They define a new metric for i2i evaluation, MPR (Mean Percentile Rank): if a user visits A and then B, recommend for A and see the position of B in that list
      • They propose a new method (EIR, Euclidean Item Recommender) that assigns a feature vector to each item, so that if A is close to B, then users frequently visit B after A
      • They don’t compare it with a pure popularity method
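The MPR metric can be sketched in a few lines. The normalization used here, position divided by list length minus one so that 0 is the top and 1 the bottom of the list, is an assumption. The slide only says that the position of B in A's list is measured. `recommend` and the toy rankings are hypothetical.

```python
def mean_percentile_rank(transitions, recommend):
    """Average percentile position of B in the ranked list recommended for A.

    transitions: list of (A, B) pairs, meaning a user visited A and then B.
    recommend:   function A -> full ranking of candidate items, best first.
    Returns 0.0 if B is always ranked first and 1.0 if it is always ranked last.
    """
    total = 0.0
    for a, b in transitions:
        ranked = recommend(a)
        position = ranked.index(b)          # 0-based rank of the item actually visited
        total += position / (len(ranked) - 1)
    return total / len(transitions)

# Hypothetical ranking for item "A" over five candidates.
rankings = {"A": ["B", "C", "D", "E", "F"]}
transitions = [("A", "B"), ("A", "D"), ("A", "F")]  # B ranked 1st, 3rd and 5th

mpr = mean_percentile_rank(transitions, lambda item: rankings[item])
print(mpr)
```

Under this normalization a random ranking scores about 0.5 on average, and smaller values are better.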
  • 52. Implementing an item-to-item method / 2
      • Results on a custom movie dataset:
          o SVD and other methods can’t beat the new method
          o The popularity method is better than or on par with the new method
      • Recommendations for Pulp Fiction:
          SVD                        New method
          Reservoir Dogs             A Space Odyssey
          Inglourious Basterds       A Clockwork Orange
          Four Rooms                 The Godfather
          The Shawshank Redemption   Eternal Sunshine of the Spotless Mind
          Fight Club                 Mulholland Drive
  • 53. Implementing an item-to-item method / 3
      • Comparison
          method             metadata similarity (larger is better)   MPR (smaller is better)
          cosine             7.54                                      0.68
          Jaccard            7.59                                      0.68
          Association rules  6.44                                      0.68
          pop                1.65                                      0.25
          random             1.44                                      0.50
          EIR                5.00                                      0.25
  • 54. Summary of EIR
      • This method is better in MPR than many other methods
      • It is on par with the popularity method
      • It is worse in metadata-based similarity
      • Sometimes the recommendations look random
      • Sensitive to the parameters
      • Very few articles deal with CF item-to-item recs
  • 56. Case studies on CTR / 1: CTR almost doubled when we switched from IALS1 to item-kNN on a site where users and items are the same
  • 58. Case studies on CTR / 2: comparison of BPR vs. item-kNN on a classified site, for item-to-item recommendations. Item-kNN is the winner
  • 60. Case studies on CTR / 3: BPR vs. item-kNN on a video site for personal recommendations, measuring the number of clicks on recommendations. Result: 4% more clicks for BPR
  • 62. Critiques of MF
      • Lots of parameters to tune
      • Needs many iterations over the data
      • If there is no inter-connection between two item sets, they can get similar feature vectors
      • Sensitive to noise in data and cold-start
      • Not the best for item-to-item recs, especially when many neighbors already exist
  • 63. When to use MF • One dense domain (e.g. movies) with not too many items (e.g. fewer than 100k) • Feedback is taste-based • For personalized recommendations (e.g. newsletter) • Always do A/B testing • Smart blending (e.g. using it for items with high support) • Usually better on offline evaluation metrics
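One possible reading of "smart blending" is sketched below: trust the MF score more as an item accumulates feedback, and fall back to another scorer (here an item-kNN score) for items with little support. The linear ramp, the threshold of 50 events and all names are illustrative assumptions; the slides do not specify how the blending is actually done.

```python
def blended_score(mf_score, knn_score, support, min_support=50):
    """Blend two scores, weighting MF by how much feedback the item has.

    support:     number of feedback events observed for the item.
    min_support: hypothetical threshold above which MF is fully trusted.
    """
    weight = min(1.0, support / min_support)
    return weight * mf_score + (1.0 - weight) * knn_score


# An item with no feedback relies on the fallback score only,
# a well-supported item relies on the MF score only.
low = blended_score(mf_score=0.9, knn_score=0.2, support=0)
high = blended_score(mf_score=0.9, knn_score=0.2, support=500)
```

Whatever blending rule is chosen, the slide's advice still applies: validate it with an A/B test rather than with offline metrics alone.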
  • 65. Copyright © 2016 by Gravity R&D Zrt. All rights reserved. Gravity's Products and Features: Omnichannel Recommendations • Mobile / Desktop / iPhone & Android Apps. Dynamic & personalized retargeting • Through ad networks and third party sites. Smart Search • Autocomplete, Autocorrect, Search result re-ranking. Personalized Emails & Push Notifications
  • 66. Technology overview • Performance: Gravity's performance-oriented architecture enables real-time response to the constantly changing environment and user behavior • Algorithms: more than 100 different recommendation algorithms enable true personalization and help reach the highest KPIs in different domains • Infrastructure: fast response times all around the globe and data security, thanks to the private cloud infrastructure located in 4 different data centers • Flexibility: the advanced business rule engine with an intuitive user interface makes it possible to satisfy various business requirements. Key figures: Performance: 140M requests served daily; Algorithms: 30 man-years invested; Infrastructure: 4 data centers globally; Flexibility: 100s of logics configurable
  • 67. Infrastructure Currently 200+ hosts and 3500+ services monitored [Chart: number of servers per year, 2008-2016; vertical axis from 0 to 250]
  • 68. 4 data centers around the globe: SJC 20+ servers, AMS 60+ servers, BUD 80+ servers, SIN 30+ servers
  • 69. Using lots of technologies
  • 70. Using lots of algorithms (100+) [Chart: number of times an algorithm is used]
  • 72. Deep learning: Session based recommendations
      • User profile
          ▪ Separate sessions
          ▪ User identification problem
          ▪ Sessions with different purposes
              o Buying for oneself vs. as a present
              o Purchases of products that meet a specific need (e.g. TV now, fridge 2 weeks later)
              o The intent / goal of different browsing sessions of the same user can differ
      • Usual solution: item-to-item recommendations
          ▪ Previous history is not considered
          ▪ No personalized experience
          ▪ Extra round for finding the best fit
      • Next event prediction
          ▪ Given the events in the session so far, what is the most likely next event?
  • 73. Session based recommendations with RNN
      • Item-to-session recommendations
      • Using RNNs (GRU, LSTM)
      • Network with many features
      • Distinctive features
          ▪ Session-parallel mini-batches
          ▪ Sampling on the output layer
          ▪ Ranking loss
              o BPR
              o TOP1
      [Figure: network architecture. Input: actual item, 1-of-N coding; embedding layer; GRU layers; feedforward layers; output: scores on items]
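The two ranking losses named on this slide can be sketched for a single training example as below: `pos_score` is the network's score for the item that actually came next, `neg_scores` are the scores of sampled negative items. The formulas follow the form in which these losses are commonly written for this model; the averaging over negatives and the function names are assumptions, not the authors' code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bpr_loss(pos_score, neg_scores):
    """BPR: -mean(log sigmoid(pos - neg)) over the sampled negatives."""
    return -sum(math.log(sigmoid(pos_score - s)) for s in neg_scores) / len(neg_scores)


def top1_loss(pos_score, neg_scores):
    """TOP1: mean(sigmoid(neg - pos) + sigmoid(neg^2)) over the sampled negatives.

    The second term acts as a regularizer pushing negative scores towards zero.
    """
    return sum(sigmoid(s - pos_score) + sigmoid(s * s) for s in neg_scores) / len(neg_scores)


# Both losses shrink when the positive item outscores the negatives.
bpr_tied = bpr_loss(0.0, [0.0])
bpr_good = bpr_loss(2.0, [0.0])
top1_tied = top1_loss(0.0, [0.0])
```

Both are pairwise losses: they only compare the positive item against sampled negatives, which is what makes sampling on the output layer workable for large item catalogs.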
  • 74. Session-parallel mini-batches. Reference: Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, Domonkos Tikk: Session-based Recommendations with Recurrent Neural Networks, to appear at ICLR 2016, available on arXiv.
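The idea of session-parallel mini-batches can be sketched as follows: each slot of the batch walks through one session event by event, and when a session ends, the next unused session is put into that slot and the slot's hidden state is reset. This is a simplified illustration written for this transcript, not the authors' implementation; the generator name, the `reset` flags and the rule to stop once a slot cannot be refilled are assumptions.

```python
def session_parallel_batches(sessions, batch_size):
    """Yield (inputs, targets, reset) triples.

    sessions: list of item-id lists, each with at least two events.
    reset[i] is True when slot i starts a new session, i.e. the hidden
    state of that slot should be zeroed before this step.
    """
    if len(sessions) < batch_size:
        return
    slot_session = list(range(batch_size))  # which session each slot reads
    slot_pos = [0] * batch_size             # position inside that session
    reset = [True] * batch_size
    next_session = batch_size
    while True:
        inputs = [sessions[s][p] for s, p in zip(slot_session, slot_pos)]
        targets = [sessions[s][p + 1] for s, p in zip(slot_session, slot_pos)]
        yield inputs, targets, reset
        reset = [False] * batch_size
        for i in range(batch_size):
            slot_pos[i] += 1
            if slot_pos[i] + 1 >= len(sessions[slot_session[i]]):
                if next_session >= len(sessions):
                    return  # simplification: stop once a slot cannot be refilled
                slot_session[i] = next_session
                slot_pos[i] = 0
                reset[i] = True
                next_session += 1


toy_sessions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
batches = list(session_parallel_batches(toy_sessions, batch_size=2))
```

In the toy run, the second slot finishes its two-event session after one step and is refilled with the third session, so the second batch mixes a continuing session with a freshly started one.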
  • 75. Results • Significant improvement over the baselines • +20-30% in recall@20 and MRR@20 over item-kNN
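For reference, recall@k and MRR@k for next-item prediction can be computed as in this minimal sketch. Each test case has exactly one target item (the event that actually came next); the function name and the toy data are made up for illustration.

```python
def recall_and_mrr_at_k(ranked_lists, targets, k=20):
    """Return (recall@k, MRR@k) for one target item per ranked list."""
    hits = 0
    reciprocal_rank_sum = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k = ranked[:k]
        if target in top_k:
            hits += 1
            reciprocal_rank_sum += 1.0 / (top_k.index(target) + 1)
    n = len(targets)
    return hits / n, reciprocal_rank_sum / n


# Toy example with k=2: only the first target appears in the top 2.
recall, mrr = recall_and_mrr_at_k(
    [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]],
    ["a", "a", "z"],
    k=2,
)
```

Recall@k only asks whether the target made it into the top k, while MRR@k also rewards placing it near the top of the list.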
  • 76. Direct usage of content for recommendations
      • User's decision (click or not click) is based on
          ▪ Title
          ▪ Image
          ▪ Description
      • Pipeline
          ▪ Automatic feature extraction from content (text, images, music, video)
          ▪ Feed features to the RNN recommender
      • Other usages
          ▪ "Truly similar" item recommendation
          ▪ "X is to Y as A is to B" recommendations
          ▪ Etc.
      • High potential
  • 77. Recoplatform: RaaS for SMBs • www.recoplatform.com • Self service solution • Automated quick and easy integration • Priced to scale with business size
  • 79. Cross the river when you come to it
  • 80. Thank you! Email: domi@gravityrd.com Twitter: @domonkostikk Web: www.gravityrd.com F: facebook.com/gravityrd Blog: blog.gravityrd.com Yes, we are hiring: hr@gravityrd.com

Editor's notes

  1. Little people last sayings