Recommender SaaS in Practice
Tianjian Chen
Jianbo Zhao
Xin Sun

Baidu Inc. 2013

http://www.baidu.com
About Us
• Baidu.com Inc.
  • Leading internet company in China
  • Reaches over 500 million Internet users
  • Over 8 billion page views (PV) per day across web search, online advertising and social network services

The Recommender SaaS Project
• Provide On-Site Recommender System for Every Website
  • http://tuijian.baidu.com (Chinese Version Only For Now)

[Diagram labels: Website; Original Web Page; Recommender System SaaS; Update; On-Site Content Recommendations; Content; Users; Combination]

Recommendation Widgets

[Figure: two widget layouts shown next to the original content: Popup / Panel Slider and Embedded Box]

Single On-Site RS Diagram
[Diagram labels: Recommender Trigger; New-Item; Item Indexing; Item Recalling; Probabilistic Prediction; Real-time User Log; User Modeling; Result List Control Strategy]
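
The stages named in the diagram can be read as a small pipeline. The sketch below is one possible reading, not the system described in the slides: the data flow between stages is assumed, the class and method names are hypothetical, and a normalized term-overlap score stands in for the probabilistic prediction step.

```python
from collections import defaultdict


class SingleSiteRecommender:
    """Toy single-site pipeline; all names here are illustrative assumptions."""

    def __init__(self, blocked=()):
        self.items = {}                    # item id -> set of terms
        self.index = defaultdict(set)      # term -> item ids (item indexing)
        self.profile = defaultdict(lambda: defaultdict(float))  # user -> term weights
        self.blocked = set(blocked)        # site owner's preset rule (control strategy)

    def add_item(self, item_id, terms):
        """New-item path: store the item and add it to the inverted index."""
        self.items[item_id] = set(terms)
        for term in terms:
            self.index[term].add(item_id)

    def log_view(self, user, item_id):
        """Real-time user log path: update the user model from a page view."""
        for term in self.items.get(item_id, ()):
            self.profile[user][term] += 1.0

    def recommend(self, user, current_item, k=3):
        """Trigger path: recall candidates, score them, apply the control strategy."""
        prefs = self.profile[user]
        total = sum(prefs.values())
        if total == 0:
            return []
        # Item recalling: every item sharing at least one term with the profile.
        candidates = set()
        for term in prefs:
            candidates |= self.index[term]
        candidates.discard(current_item)
        # Stand-in for probabilistic prediction: share of profile weight matched.
        scored = []
        for item_id in candidates:
            matched = sum(prefs[t] for t in self.items[item_id] if t in prefs)
            scored.append((matched / total, item_id))
        ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
        # Result list control strategy: drop blocked items, then truncate.
        return [item_id for _, item_id in ranked if item_id not in self.blocked][:k]
```

With items "a" and "d" tagged sports and football, "b" tagged sports, and "d" blocked, a user who has viewed "a" would be offered only "b".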

A Direct Solution for Scalability

Scale Out to Thousands of Sites
[Architecture diagram labels: Recommender Web API; Tracking API; Recommender Engine Cluster; Engine Instance (two shown); Site 1, Site 4, Site 5, Site 6, Site 7, Site 9; Invert-Indexer Cluster; K-V Storage Cluster; User; C-F; Model; Result; Stream Computing Cluster; Web Crawler; User Tracking System]
Global User Modeling
[Diagram labels: User Tracking Log; Hot Web Page Cache; Web Crawler; JOIN in Memory in Real-Time; User Browsing Session; User Preference Modeling]
• Based on stream computing
• 10 Gbps bandwidth
• 50 million web pages cached
• Billions of cookies
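
The in-memory join of the tracking log with the page cache can be illustrated in a few lines. This is a minimal sketch under assumptions: events are (cookie, url) pairs, the cache maps a URL to a list of topic labels, and the function name is hypothetical. The stream computing framework and the real data layout are not described in the slides.

```python
from collections import Counter, defaultdict


def model_preferences(events, page_cache):
    """Join tracking events with cached page topics and count topics per cookie.

    events: iterable of (cookie, url) pairs, consumed one at a time like a stream.
    page_cache: dict mapping url -> list of topic labels (assumed cache layout).
    Returns (preferences, misses), where misses lists URLs absent from the cache.
    """
    preferences = defaultdict(Counter)
    misses = []
    for cookie, url in events:
        topics = page_cache.get(url)
        if topics is None:
            # Not cached yet; a real system might hand this URL to the crawler.
            misses.append(url)
            continue
        preferences[cookie].update(topics)
    return preferences, misses
```

Keeping the cache in memory means each event costs a single dictionary lookup, which is what makes a per-event join feasible at high log volume.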
Project Status
• Beta release launched in April 2013
• More than 1,000 websites joined the beta test
• More than 100 million page views every day
• Average CTR of 3%
  • Ranges from 2% to 20% depending on the type of website

Inside a Recommender Engine Instance
• Combination of Multiple Sub Recommender Engines
[Table: item types (Movie/Video, News Web, Pic Gallery, Novel Library, Yellow Page) by sub-engine (Item Based CF, Content Based, Item Popularity), with an X in each cell where the engine helps]

[X] means that the engine gives a performance gain when recommending that item type.
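
One common way to combine several sub-engines is a weighted blend of their normalized scores. The sketch below shows that generic technique only. The slides do not say how the engines are combined, and the function name, engine names and weights are hypothetical.

```python
def blend(engine_scores, weights, k=5):
    """Blend per-engine item scores into one ranked list.

    engine_scores: dict engine name -> {item id: raw score}.
    weights: dict engine name -> weight; engines with weight 0 are skipped.
    Scores are divided by each engine's maximum so the scales are comparable.
    """
    combined = {}
    for engine, scores in engine_scores.items():
        weight = weights.get(engine, 0.0)
        if weight == 0.0 or not scores:
            continue
        top = max(scores.values())
        for item, score in scores.items():
            normalized = score / top if top > 0 else 0.0
            combined[item] = combined.get(item, 0.0) + weight * normalized
    # Highest blended score first; ties broken by item id for determinism.
    return sorted(combined, key=lambda item: (-combined[item], item))[:k]
```

A site could keep one weight table per item type, for example switching off an engine that shows no gain for that type.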

Mono RS Engine CTR Comparison
Item Type       Item Based CF   Content Based   Item Popularity
Movie/Video     > 6%            ~ 0.5%          > 2%
News Web        ~ 7%            > 25%           ~ 0.5%
Pic Gallery     > 6%            ~ 4%            ~ 1%
Novel Library   > 10%           ~ 8%            ~ 1%
Yellow Page     ~ 1%            ~ 1%            > 15%

• Item-based CF (IBCF) is handy, but it is not a silver bullet
• To our surprise, item popularity (IP) doesn’t work for news recommendation
• No one likes old yellow page posts, even if they are semantically or statistically relevant.
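
For readers unfamiliar with item-based CF, the sketch below shows the textbook form: cosine similarity over session co-occurrence counts. It is a generic illustration with hypothetical function names, not the engine measured in the table.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(sessions):
    """Compute item-item cosine similarity from browsing sessions.

    sessions: iterable of lists of item ids viewed together.
    Returns dict item -> {other item: similarity}.
    """
    views = defaultdict(int)      # item -> number of sessions containing it
    together = defaultdict(int)   # (item_a, item_b) -> sessions containing both
    for session in sessions:
        items = sorted(set(session))
        for item in items:
            views[item] += 1
        for a, b in combinations(items, 2):
            together[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), count in together.items():
        score = count / sqrt(views[a] * views[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def similar_items(sims, item, k=3):
    """Return up to k items most similar to the given item."""
    neighbours = sims.get(item, {})
    return sorted(neighbours, key=lambda j: (-neighbours[j], j))[:k]
```

Because the similarity depends only on co-views, an item nobody has viewed yet has no neighbours, which is one reason a single engine cannot cover every item type.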

Things That Need to Be Figured Out
• Aggregation method for different recommendation engines
• Performance loss caused by the site owners’ preset rules

• Item longevity detection / prediction
• URL normalization
• And…
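
URL normalization matters because the same page can be logged under many spellings. The sketch below uses Python's standard urllib.parse with an assumed rule set (lowercase host, drop default ports and fragments, drop utm_ parameters, sort the query, strip trailing slashes). The slides do not list the rules actually needed, so treat every rule here as an example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Map equivalent spellings of a URL to one canonical string.

    Assumed rules: lowercase scheme and host, drop default ports, fragments
    and utm_* tracking parameters, sort query parameters, strip trailing
    slashes. Any user:password part of the URL is dropped as well.
    """
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((scheme, netloc, path, urlencode(query), ""))
```

For example, `HTTP://Example.com:80/news/?b=2&utm_source=x&a=1#top` and `http://example.com/news?a=1&b=2` both normalize to the second form.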

Influence of User Browsing Context
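[Figure: two bar charts of relative CTR. One compares the Long Term Model (Months) with the Short Term Model (Minutes); the other compares Landing on Leaf Page with Landing on Portal Page. Bar labels: 3x, 5x, 1x, 1x]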

Conclusions
• Break big problems down into small ones
• Wrap simple things up to build complex services
• No silver bullet for an open RecSys cloud
• Besides accuracy and relevance, time efficiency is also important

Q & A Time
Thanks
chentianjian@baidu.com
