Recommender SaaS in Practice
1. Recommender SaaS in Practice
Tianjian Chen
Jianbo Zhao
Xin Sun
Baidu Inc. 2013
http://www.baidu.com
2. About Us
• Baidu.com Inc.
  • Leading internet company in China
  • Reaches over 500 million Internet users
  • Over 8 billion PV/day across web search, online advertising and social network services
3. The Recommender SaaS Project
• Provide On-Site Recommender System for Every Website
  • http://tuijian.baidu.com (Chinese Version Only For Now)
[Diagram labels: Website; Original Web Page; Recommender System SaaS; Update; On-Site Content Recommendations; Content; Users; Combination]
5. Single On-Site RS Diagram
[Diagram labels: Recommender Trigger; New-Item; Item Indexing; Probabilistic Prediction; Item Recalling; Real-time User Log; User Modeling; Result List; Control Strategy]
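The stages named on this slide can be illustrated with a toy pipeline. This is only a minimal sketch under stated assumptions, not the engine described in the talk: the class `SingleSiteRecommender`, its methods, the term-overlap score (a crude stand-in for "probabilistic prediction"), and the block list (a stand-in for "control strategy") are all hypothetical.

```python
from collections import defaultdict


class SingleSiteRecommender:
    """Toy single-site pipeline: index -> recall -> predict -> control."""

    def __init__(self, blocked_items=None):
        self.index = defaultdict(set)        # term -> item ids (item indexing)
        self.item_terms = {}                 # item id -> set of terms
        self.user_profiles = defaultdict(lambda: defaultdict(float))
        self.blocked = set(blocked_items or ())  # hypothetical site-owner rule

    def add_item(self, item_id, terms):
        """New-item path: store the item and update the inverted index."""
        self.item_terms[item_id] = set(terms)
        for term in terms:
            self.index[term].add(item_id)

    def log_view(self, user_id, item_id):
        """Real-time user log: fold a page view into the user model."""
        for term in self.item_terms.get(item_id, ()):
            self.user_profiles[user_id][term] += 1.0

    def recommend(self, user_id, current_item, k=3):
        profile = self.user_profiles.get(user_id, {})
        # Item recalling: candidates sharing a term with the page or profile.
        terms = set(self.item_terms.get(current_item, ())) | set(profile)
        candidates = set()
        for term in terms:
            candidates |= self.index.get(term, set())
        candidates.discard(current_item)

        # Prediction stand-in: share of the profile weight an item covers.
        total = sum(profile.values()) or 1.0

        def score(item_id):
            return sum(profile.get(t, 0.0) for t in self.item_terms[item_id]) / total

        # Control strategy stand-in: drop blocked items, then rank.
        allowed = [i for i in candidates if i not in self.blocked]
        return sorted(allowed, key=lambda i: (-score(i), i))[:k]
```

Calling `recommend(user_id, current_item)` plays the role of the recommender trigger; ties are broken by item id so the result list is deterministic.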
7. Scale Out to Thousands of Sites
[Diagram labels: Recommender Web API; Tracking API; Recommender Engine Cluster; Engine Instance (x2); Invert-Indexer Cluster; K-V Storage Cluster; Site 1, Site 4, Site 5, Site 6, Site 7, Site 9; User; C-F; Model; Result; Stream Computing Cluster; Web Crawler; User Tracking System]
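The slide shows several sites grouped under each engine instance but does not say how sites are assigned. One common possibility, assumed here purely for illustration, is a stable hash of the site identifier; the function name `instance_for_site` and the string site IDs are hypothetical.

```python
import hashlib


def instance_for_site(site_id: str, num_instances: int) -> int:
    """Map a site to an engine instance with a stable hash (assumed scheme)."""
    if num_instances <= 0:
        raise ValueError("num_instances must be positive")
    digest = hashlib.sha1(site_id.encode("utf-8")).digest()
    # Use the first 8 bytes so the result is stable across processes,
    # unlike Python's built-in hash() for strings.
    return int.from_bytes(digest[:8], "big") % num_instances
```

A plain modulo reshuffles most sites when the instance count changes, so a production system would more likely use consistent hashing or an explicit assignment table.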
8. Global User Modeling
[Diagram labels: User Tracking Log; JOIN in Memory in Real-Time; Hot Web Page Cache; Web Crawler; Based on Stream Computing; 10 Gbps Bandwidth; User Browsing Session; 50 Million Web Pages Cached; Billions of Cookies; User Preference Modeling]
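The in-memory join of the tracking log against the page cache might look roughly like the following. This is a single-process sketch, not the stream-computing implementation from the talk; the function `build_preferences`, the `(cookie_id, url)` event shape, and the cache mapping URLs to topic lists are assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_preferences(tracking_events, page_cache):
    """Join (cookie_id, url) events with cached page topics, per cookie.

    tracking_events: iterable of (cookie_id, url) tuples (assumed shape).
    page_cache: dict mapping url -> list of topics (assumed shape).
    """
    preferences = defaultdict(Counter)
    for cookie_id, url in tracking_events:
        topics = page_cache.get(url)
        if topics is None:
            # Page not in the hot cache; this sketch simply skips the event.
            continue
        preferences[cookie_id].update(topics)
    return preferences
```

Because the cache lookup is a dictionary read, each event is processed in roughly constant time, which is the property that makes a real-time join plausible at this scale.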
9. Project Status
• Beta release launched in April 2013
• More than 1000 websites joined the beta test
• > 100 million page views every day
• Avg. CTR 3%
  • Ranges from 2% to 20% depending on the type of website
10. Inside a Recommender Engine Instance
• Combination of Multiple Sub Recommender Engines
Columns: Item Type | Item Based CF | Content Based | Item Popularity
Movie/Video: X
News Web: X, X
Pic Gallery: X, X
Novel Library: X, X
Yellow Page: X, X
[X] means that the engine yields a performance gain when recommending that item type
11. Mono RS Engine CTR Comparison
Item Type       Item Based CF   Content Based   Item Popularity
Movie/Video     > 6%            ~ 0.5%          > 2%
News Web        ~ 7%            > 25%           ~ 0.5%
Pic Gallery     > 6%            ~ 4%            ~ 1%
Novel Library   > 10%           ~ 8%            ~ 1%
Yellow Page     ~ 1%            ~ 1%            > 15%
• IBCF is handy, but not a silver bullet
• To our surprise, IP doesn’t work for News Recommendation
• No one likes old yellow page posts, even if they are semantically or statistically relevant.
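To make the comparison concrete, the CTR figures on this slide can be turned into a per-item-type choice of engine. The sketch below is one naive possibility, not the aggregation method used in the project: it drops the ">" and "~" qualifiers and treats the figures as point estimates, and the names `MONO_ENGINE_CTR`, `best_engine`, and `engine_weights` are hypothetical.

```python
# Single-engine CTRs from this slide, with ">" and "~" dropped (approximation).
MONO_ENGINE_CTR = {
    "Movie/Video":   {"item_based_cf": 0.06, "content_based": 0.005, "item_popularity": 0.02},
    "News Web":      {"item_based_cf": 0.07, "content_based": 0.25,  "item_popularity": 0.005},
    "Pic Gallery":   {"item_based_cf": 0.06, "content_based": 0.04,  "item_popularity": 0.01},
    "Novel Library": {"item_based_cf": 0.10, "content_based": 0.08,  "item_popularity": 0.01},
    "Yellow Page":   {"item_based_cf": 0.01, "content_based": 0.01,  "item_popularity": 0.15},
}


def best_engine(item_type: str) -> str:
    """Return the sub-engine with the highest listed CTR for an item type."""
    ctrs = MONO_ENGINE_CTR[item_type]
    return max(ctrs, key=ctrs.get)


def engine_weights(item_type: str) -> dict:
    """Naive blending weights: each engine's CTR divided by the row total."""
    ctrs = MONO_ENGINE_CTR[item_type]
    total = sum(ctrs.values())
    return {engine: ctr / total for engine, ctr in ctrs.items()}
```

With these numbers, content-based recommendation is selected for news sites and item popularity for yellow pages, which matches the observations in the bullets above.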
12. Things That Need to Be Figured Out
• Aggregation method for different recommendation engines
• Performance loss caused by the site owners’ preset rules
• Item longevity detection / prediction
• URL normalization
• And…
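For the URL normalization item above, a minimal sketch using Python's standard `urllib.parse` is shown below. The specific rules (lowercasing, dropping default ports and fragments, sorting query parameters, removing `utm_` parameters) are assumptions chosen for illustration; the talk does not say which rules the project applies, and `normalize_url` is a hypothetical name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)            # assumed list of parameters to drop
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Map equivalent spellings of a page URL to one canonical string."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()   # note: userinfo is discarded
    port = parts.port
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    # Keep non-tracking parameters and sort them for a stable order.
    pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    query = urlencode(sorted(pairs))
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((scheme, netloc, path, query, ""))
```

Sorting query parameters is safe only for sites where parameter order carries no meaning, so a multi-site service would likely need per-site rules.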
13. Influence of User Browsing Context
[Figure: two relative-CTR bar charts. One compares the Long Term Model (Months) with the Short Term Model (Minutes); the other compares Landing on Leaf Page with Landing on Portal Page. Bar labels: 3x, 5x, 1x, 1x.]
14. Conclusions
• Break big problems down into small ones
• Wrap simple components up to build complex services
• No silver bullet for an open RecSys cloud
• Besides accuracy and relevance, time efficiency is also important
15. Q & A Time
Thanks
chentianjian@baidu.com