Data Con LA 2020
Description
Watchworthy is a personalized TV recommendation app that leverages over 1B votes cast by TV fans on Ranker lists. Visitors to Ranker.com can vote the newest and most talked-about TV shows up and down on various lists, from "Best New Horror Shows" to "Funniest Sitcoms Ever Made." These votes constitute a trove of anonymous crowdsourced data that gives us valuable insight into taste correlations. Despite this massive volume of user data to train on, we face the classic "cold start" issue common to user-to-item datasets: how do we recommend brand-new shows that relatively few people have voted on? The problem is further complicated by a requirement to rely on voting behavior (rather than metadata) as much as possible when building the recommendation engine. We present a unique approach that parses existing user voting profiles to create additional users, called "split users". These split users are grouped into separate training sets to create multiple sub-models. By building an ensemble from the sub-models and applying a "most pleasure" strategy, we achieve the goal of recommending new shows.
Speakers
Vincent S, Ranker, VP of Data Science
Keryu Ong, Ranker, Senior Data Scientist
How Ranker Turned Pop Culture Lists Into Personalized TV Recommendations
1. How Ranker Turned Pop Culture Lists Into Personalized TV Recommendations
DataCon LA 2020
2. OVERVIEW
Introduction
Dr. Vincent Seah
VP, Data Science
Ranker since July 2019
Ph.D. Mechanical Engineering
UCLA
Fullscreen Media (acq. by AT&T)
KPMG US
Inkiru (acq. by Walmart Labs)
Hoodiny Entertainment Group (acq. by PRISA)
linkedin.com/in/drnotsoevil
Ker-Yu Ong
Senior Data Scientist
Ranker since April 2020
M.Sc. Data Science
University of San Francisco
Deloitte San Francisco
Deloitte Singapore
linkedin.com/in/keryu-ong
3. OVERVIEW
● CEO, Clark Benson
● Media publisher turning engagement into IP
● Over 100 employees
● Headquartered in Los Angeles, with an office in NYC
● 40M monthly unique visitors worldwide
● More than 1B votes cast over last 10 years
● Fan-powered votable content with 10,000 lists covering everything from TV and movies to sports, food, and lifestyle
● Products built in-house:
○ Ranker Insights
○ Watchworthy App
○ Data Science Apps
4. WATCHWORTHY
Cross-Platform, Personalized Show Recommendations Based on 1B Data Points
ONBOARDING IN-APP
● Mobile app with an unparalleled ability to give users targeted, personalized TV recommendations
● Using pure, first-party voting data from the Ranker website
● Available on Android and iOS
8. CHALLENGE
Voter Sentiment versus Metadata
Actor-Based Recs
Voter sentiment casts a wider net of recs across genres and decades
Genre-Based Recs
"Rom-Com"
9. CHALLENGE
“If I like this older TV show, what new TV show should I watch?”
Breaking Bad (2008) Chernobyl (2019)
10. RELATED WORK
In Good Company
● A Fairness-aware Hybrid Recommender System
○ G. Farnadi, P. Kouki, S.K. Thompson, S. Srinivasan, L. Getoor [2018]
○ “...A fair recommender system should provide rankings to the protected group that are the same as the unprotected group…”
● Group Recommender Systems: A Virtual User Approach Based on Precedence Mining
○ V.R. Kagita, A.K. Pujari, V. Padmanabhan [2015]
○ “...introducing a virtual user that can more effectively represent a group...”
● Personalized Real-Time Movie Recommendation System: Practical Prototype and Evaluation
○ J. Zhang, Y. Wang, Z. Yuan, Q. Jin [2019]
○ “...virtual opinion leader is conceived to represent the whole cluster…”
● Innovations in Graph Representation Learning
○ A. Epasto, B. Perozzi [2019]
○ “...we developed Splitter, an unsupervised embedding method that allows the nodes in a graph to have multiple embeddings to better encode their participation in multiple communities…”
11. APPROACH
Recap
Recap: How do we recommend new shows when:
● A user’s input taste profile is dominated by older shows
● The rec algo training data is dominated by older shows
Reframed this as a class imbalance problem
Solvable via classification techniques
● Minority-class Upsampling (SMOTE)
● Majority-class Downsampling
● Data Augmentation
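As a rough illustration of the class-imbalance framing, the sketch below downsamples the majority class (votes on older shows) in a toy vote list. The tuple layout, the `NEW_SHOW_YEAR` cutoff, the placeholder show names, and the function name are all assumptions made for this example; the talk does not describe the actual pipeline.

```python
import random

NEW_SHOW_YEAR = 2019  # assumed cutoff for a "new" show; not given in the talk


def downsample_majority(votes, seed=0):
    """Randomly drop votes on older shows until they match the new-show count.

    `votes` is a list of (user, show, release_year) tuples (hypothetical layout).
    If older shows are already the minority, nothing is dropped.
    """
    new = [v for v in votes if v[2] >= NEW_SHOW_YEAR]
    old = [v for v in votes if v[2] < NEW_SHOW_YEAR]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    kept_old = rng.sample(old, k=min(len(old), len(new)))
    return new + kept_old


# Toy data with placeholder show names.
votes = [
    ("u1", "show_a", 2005), ("u1", "show_b", 2010), ("u1", "show_c", 2019),
    ("u2", "show_a", 2005), ("u2", "show_d", 2008), ("u3", "show_b", 2010),
]
balanced = downsample_majority(votes)
```

In this toy set, one vote on a new show is kept alongside one randomly chosen vote on an older show.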
14. EXPERIMENTS
Things We Tried
● Upsampling votes from bridge voters
● Downsampling votes from non-bridge voters
● Applying different thresholds for
○ Vote count
○ Vote type
○ Vote spread
Challenge:
Because we were preserving each user’s voting pattern, upsampling did not change the distribution of bridge voters’ votes
15. EXPERIMENTS
Challenge, Illustrated
Original
User Show Year
Keryu 2019
Keryu 2010
Keryu 2005
Upsampled
User Show Year
Keryu 2019
Keryu 2010
Keryu 2005
Keryu_2 2019
Keryu_2 2010
Keryu_2 2005
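The challenge can be checked on the toy data from this slide. The sketch below clones the whole profile under a new user id, as in the "Upsampled" table, and compares the share of votes per release year before and after. The `year_shares` helper is hypothetical and only the user and year columns are used.

```python
from collections import Counter


def year_shares(votes, user_prefix):
    """Share of votes per release year across a user and that user's copies."""
    years = Counter(year for user, year in votes if user.startswith(user_prefix))
    total = sum(years.values())
    return {year: count / total for year, count in years.items()}


original = [("Keryu", 2019), ("Keryu", 2010), ("Keryu", 2005)]

# Naive upsampling: duplicate the full voting pattern as a second user.
upsampled = original + [("Keryu_2", year) for _, year in original]

same_distribution = year_shares(original, "Keryu") == year_shares(upsampled, "Keryu")
```

The row count doubles, but each release year still holds one third of the votes, so the new show is no better represented than before.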
16. EXPERIMENTS
Foray into “Splitting”: Upsampling Bridge Votes
Original
User Show Year
Keryu 2019
Keryu 2010
Keryu 2005
Upsampled
User Show Year
Keryu 2019
Keryu 2010
Keryu 2005
Keryu_21 2019
Keryu_21 2010
Keryu_22 2019
Keryu_22 2005
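A minimal sketch of the splitting shown on this slide, assuming a "bridge vote" pairs a vote on a new show with one vote on an older show. The `split_bridge_voter` function and the 2019 cutoff are assumptions for illustration; the `_21`, `_22` suffixes simply mirror the slide's naming.

```python
def split_bridge_voter(user, years, new_year=2019):
    """Create one split user per older vote, each paired with the new-show vote(s)."""
    new_votes = [y for y in years if y >= new_year]
    old_votes = [y for y in years if y < new_year]
    rows = []
    for i, old in enumerate(old_votes, start=1):
        split_id = f"{user}_2{i}"  # follows the slide's naming: Keryu_21, Keryu_22
        rows += [(split_id, y) for y in new_votes] + [(split_id, old)]
    return rows


original = [("Keryu", 2019), ("Keryu", 2010), ("Keryu", 2005)]
upsampled = original + split_bridge_voter("Keryu", [year for _, year in original])
```

Unlike duplicating the full profile, this changes the distribution: in the toy case the 2019 show accounts for 3 of 7 rows instead of 1 of 3.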
17. SPLIT SAMPLING
Foray into “Splitting”: Multiple Models
● What about individual models for each bridge vote?
User Show Year
Keryu 2019
Keryu 2010
Keryu 2005
User Show Year
Keryu 2019
Keryu 2010
User Show Year
Keryu 2019
Keryu 2005
User Show Year
Keryu 2019
Keryu 2010
Keryu 2005
Original
m_0
m_1
m_2
18. SPLIT SAMPLING
Methodology
1. Bin shows into release year decades
2. Split bridge voters’ votes by bridge decade:
a. 1990s to new
b. 2000s to new
c. 2010s to new etc.
3. Build an overall model and individual decade-specific models
4. Ensemble to get the maximum number of new shows per recommendation stream
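The steps above can be sketched as follows. This is one possible reading, not the authors' implementation: the 2019 cutoff for "new", the definition of a bridge voter as anyone with at least one new-show vote, the function names, and the rule of keeping the recommendation list with the most new shows are all assumptions. Step 3 (model training) is left out because the talk does not name the algorithm, and `show_x` and `show_y` are placeholder titles.

```python
from collections import defaultdict

NEW_YEAR = 2019  # assumed cutoff for "new"; the talk does not give one


def decade(year):
    """Step 1: bin a release year into its decade, e.g. 2005 -> 2000."""
    return year // 10 * 10


def split_by_bridge_decade(votes):
    """Step 2: group bridge voters' votes as (older decade -> new).

    `votes` holds (user, show, year) tuples. Returns {decade: [votes]},
    one training set per older decade.
    """
    by_user = defaultdict(list)
    for vote in votes:
        by_user[vote[0]].append(vote)

    training_sets = defaultdict(list)
    for user_votes in by_user.values():
        new = [v for v in user_votes if v[2] >= NEW_YEAR]
        if not new:
            continue  # not a bridge voter under the assumed definition
        old = [v for v in user_votes if v[2] < NEW_YEAR]
        for old_decade in sorted({decade(v[2]) for v in old}):
            in_decade = [v for v in old if decade(v[2]) == old_decade]
            training_sets[old_decade].extend(in_decade + new)
    return dict(training_sets)


def pick_stream(streams, new_shows):
    """Step 4: keep the recommendation list with the most new shows.

    Ties go to the earliest model in `streams`.
    """
    return max(streams.values(),
               key=lambda recs: sum(show in new_shows for show in recs))


votes = [
    ("Keryu", "Chernobyl", 2019), ("Keryu", "Breaking Bad", 2008),
    ("Keryu", "show_x", 1995), ("Ann", "Breaking Bad", 2008),
]
sets = split_by_bridge_decade(votes)

# Hypothetical outputs of an overall model and a 2000s-to-new model.
streams = {"overall": ["Breaking Bad", "show_x"], "2000s": ["Chernobyl", "show_y"]}
best = pick_stream(streams, new_shows={"Chernobyl"})
```

Here `sets` holds a 1990s-to-new and a 2000s-to-new training set built from the one bridge voter, and `best` is the 2000s stream because it is the only one that surfaces a new show.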
20. EXAMPLE
Grey's Anatomy
Law & Order: Special Victims Unit
Stranger Things
The Big Bang Theory
The Closer
The Crown
This Is Us
black-ish
Bob's Burgers
Breaking Bad
Family Guy
Fresh Off the Boat
Rick and Morty
Riverdale
The Vampire Diaries
22. EXAMPLE
Grey's Anatomy
Law & Order: Special Victims Unit
Stranger Things
The Big Bang Theory
The Closer
The Crown
This Is Us
black-ish
Bob's Burgers
Breaking Bad
Family Guy
Fresh Off the Boat
Rick and Morty
Riverdale
The Vampire Diaries
with split sampling
original