Background Motivation Model & Metric Experimental Setup Results Summary
Incorporating Clicks, Attention and Satisfaction
into a SERP Evaluation Model
Aleksandr Chuklin¶,§ Maarten de Rijke§
chuklin@google.com derijke@uva.nl
¶Google Research Europe
§University of Amsterdam
AC–MdR Incorporating Clicks, Attention and Satisfaction. . . 1
Background
Search Engine Result Page (SERP) Evaluation
Main problem
Combining relevance of individual SERP items ($R_k$) into a whole-page metric.
Examples
Precision at N:
$$P@N = \frac{1}{N} \sum_{k=1}^{N} R_k$$
Discounted Cumulative Gain (DCG):
$$DCG@N = \sum_{k=1}^{N} \frac{1}{\log_2(1 + k)} \cdot R_k$$
Model-Based Metrics (Chuklin et al. 2013):
$$Utility@N = \sum_{k=1}^{N} P(C_k = 1) \cdot R_k$$
Example ranking shown on the slide: document 3, document 4, document 1, document 2, document 5.
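As a minimal sketch, the three example metrics can be computed as follows. The relevance labels and click probabilities are hypothetical values chosen for illustration; they do not come from the slides, and in a model-based metric the click probabilities would come from a trained click model.

```python
import math
from typing import Sequence


def precision_at_n(relevance: Sequence[float], n: int) -> float:
    """P@N: mean relevance of the top-n items."""
    return sum(relevance[:n]) / n


def dcg_at_n(relevance: Sequence[float], n: int) -> float:
    """DCG@N: relevance discounted by 1 / log2(1 + k), with k starting at 1."""
    return sum(r / math.log2(1 + k) for k, r in enumerate(relevance[:n], start=1))


def utility_at_n(relevance: Sequence[float], click_prob: Sequence[float], n: int) -> float:
    """Model-based utility: relevance weighted by the click probability P(C_k = 1)."""
    return sum(p * r for p, r in zip(click_prob[:n], relevance[:n]))


# Hypothetical binary relevance labels and click probabilities for a 5-item SERP.
relevance = [1, 0, 1, 1, 0]
click_prob = [0.6, 0.3, 0.2, 0.1, 0.05]

print(precision_at_n(relevance, 5))
print(dcg_at_n(relevance, 5))
print(utility_at_n(relevance, click_prob, 5))
```

The three functions differ only in the weight given to each position: a constant 1/N, a logarithmic discount, or a model-predicted click probability.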
Main Goal of This Paper
Better measure for SERP utility
Namely, improve this (Chuklin et al. 2013):
$$\sum_{k=1}^{N} P(C_k = 1) \cdot R_k$$
Motivation
Complex Heterogeneous SERPs
Motivation 1: Non-Trivial Attention Patterns
[Figure: panel (c), "Mouse Data".]
Image credits: F. Diaz, R.W. White, G. Buscher, and D. Liebling. Robust models of mouse movement on dynamic
web search results pages. In CIKM, 2013. ACM Press
Motivation 2: Satisfaction Without Clicks
High direct page utility (measured by DCG or ERR) leads to a higher abandonment rate (SERPs with no clicks).
[Figure: abandonment rate plotted against direct page utility.]
Image credits: from A. Chuklin and P. Serdyukov. Good abandonments in factoid queries. In WWW, 2012.
Problems of Existing Models and Evaluation Metrics
existing models mostly do not model non-trivial user
attention patterns
existing models do not use explicit user satisfaction data
Model & Metric
Clicks + Attention + Satisfaction (CAS) Model
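[Diagram of the CAS model: for each SERP item $k$, the features $\varphi_k$ determine the examination $E_k$, which in turn affects the click $C_k$; examinations and clicks together determine the utility and the satisfaction $S$.]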
Click Model
Examination assumption: a click happens only when an item is examined and attractive:
$$P(C_k = 1) = P(E_k = 1) \cdot P(C_k = 1 \mid E_k = 1)$$
N.B. Here we assume that $P(C_k = 1 \mid E_k = 1) = \alpha(R_k)$, where $R_k$ comes from the raters and $\alpha$ is a logistic function.
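The factorization above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the rating is treated as a single number, and the `weight` and `bias` parameters of the logistic function are hypothetical placeholders for values that would be learned from data.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def attractiveness(rating: float, weight: float = 1.0, bias: float = -1.0) -> float:
    """alpha(R_k): logistic function of the rater-assigned relevance.

    weight and bias are hypothetical parameters, not values from the slides.
    """
    return logistic(weight * rating + bias)


def click_probability(p_exam: float, rating: float) -> float:
    """P(C_k = 1) = P(E_k = 1) * P(C_k = 1 | E_k = 1)."""
    return p_exam * attractiveness(rating)


# An item that is never examined is never clicked, however relevant it is.
print(click_probability(0.0, 3.0))
print(click_probability(0.8, 3.0))
```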
Attention (Examination) Model
Logistic regression model:
$$P(E_k = 1) = \varepsilon(\varphi_k),$$
where $\varphi_k$ is a vector of features for SERP item $k$.

Feature group   Features                                                # of features
rank            user-perceived rank of the SERP item                    1
                (can be different from k)
CSS classes     SERP item type (Web, News, Weather, Currency,           10
                Knowledge Panel, etc.)
geometry        offset from the top, first or second column (binary),   5
                width (w), height (h), w × h
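A rough sketch of how such a feature vector could feed a logistic regression is given below. Only the five item types named on the slide are encoded, so the vector here has 11 entries rather than the 16 listed in the table. The function names, the weights and the normalized geometry values are all assumptions made for illustration.

```python
import math
from typing import List

# Item types named on the slide; the full model uses 10 such features.
ITEM_TYPES = ["Web", "News", "Weather", "Currency", "Knowledge Panel"]


def features(rank: int, item_type: str, top_offset: float,
             second_column: bool, width: float, height: float) -> List[float]:
    """Build phi_k: rank, one-hot item type, and the five geometry features."""
    one_hot = [1.0 if item_type == t else 0.0 for t in ITEM_TYPES]
    geometry = [top_offset, float(second_column), width, height, width * height]
    return [float(rank)] + one_hot + geometry


def examination_probability(phi: List[float], weights: List[float], bias: float) -> float:
    """P(E_k = 1) = epsilon(phi_k), here a plain logistic regression."""
    z = bias + sum(w * x for w, x in zip(weights, phi))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical item: a News block at rank 1, with geometry as page fractions.
phi = features(rank=1, item_type="News", top_offset=0.1,
               second_column=False, width=0.6, height=0.2)
# Hypothetical weights: only the rank has an effect in this toy setting.
weights = [-0.5] + [0.0] * 10
print(examination_probability(phi, weights, bias=1.0))
```

Because the rank is only one of the features, two items at the same rank can receive different examination probabilities once the type and geometry weights are non-zero.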
Satisfaction Model
in previous models, satisfaction comes only from clicked results;
in our model it also comes from the SERP items that simply attracted attention;
$$P(S = 1) = \sigma(\tau_0 + U) = \sigma\Big(\tau_0 + \sum_k P(E_k = 1)\, u_d(D_k) + \sum_k P(C_k = 1)\, u_r(R_k)\Big)$$
where $D_k$ and $R_k$ are ratings assigned by the raters for direct snippet relevance and result relevance respectively. $u_d$ and $u_r$ are linear functions of rating histograms.
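The satisfaction formula can be sketched as follows, assuming that each rating histogram is a list of counts per grade and that $u_d$ and $u_r$ are dot products with weight vectors. The three-grade scale, the weights and the probabilities below are hypothetical and serve only to make the example runnable.

```python
import math
from typing import Sequence


def linear_utility(histogram: Sequence[float], weights: Sequence[float]) -> float:
    """u(.) as a linear function of a rating histogram (weights are hypothetical)."""
    return sum(w * h for w, h in zip(weights, histogram))


def cas_utility(p_exam, p_click, d_hists, r_hists, w_d, w_r) -> float:
    """U = sum_k P(E_k=1) u_d(D_k) + sum_k P(C_k=1) u_r(R_k)."""
    attention_term = sum(pe * linear_utility(d, w_d) for pe, d in zip(p_exam, d_hists))
    click_term = sum(pc * linear_utility(r, w_r) for pc, r in zip(p_click, r_hists))
    return attention_term + click_term


def satisfaction_probability(utility: float, tau0: float) -> float:
    """P(S = 1) = sigma(tau_0 + U)."""
    return 1.0 / (1.0 + math.exp(-(tau0 + utility)))


# Two SERP items, three rating grades, three raters per item (all hypothetical).
p_exam = [0.9, 0.5]
p_click = [0.4, 0.1]
d_hists = [[0, 1, 2], [3, 0, 0]]   # direct snippet relevance (D) counts per grade
r_hists = [[0, 0, 3], [2, 1, 0]]   # result relevance (R) counts per grade
w_d = [0.0, 0.1, 0.2]
w_r = [0.0, 0.2, 0.4]

U = cas_utility(p_exam, p_click, d_hists, r_hists, w_d, w_r)
print(U, satisfaction_probability(U, tau0=-1.0))
```

In this toy setting the first item contributes to the utility through both terms, while an item with a useful snippet but no clicks would still contribute through the attention term alone.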
The CAS Metric
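Utility that determines the satisfaction probability:
$$U = \underbrace{\sum_k P(E_k = 1)\, u_d(D_k)}_{\text{NEW}} + \underbrace{\sum_k P(C_k = 1)\, u_r(R_k)}_{\text{Chuklin et al. 2013}}$$
Compared with Chuklin et al. 2013, the CAS metric:
has an additional term (marked NEW);
is trained on mousing and satisfaction (in addition to clicks).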
Experimental Setup
Dataset
199 queries with explicit unambiguous feedback (satisfied / not satisfied);
1,739 rated results: direct snippet relevance (D) and result relevance (R).
Baselines and CAS Model Variants
Baselines:
UBM: a model that agrees well with online team-draft experimental outcomes;
PBM: position-based model, a robust model with fewer parameters than UBM;
random: a model that predicts click and satisfaction with fixed probabilities (learned from the data);
uUBM: from Chuklin et al. 2013. Similar to UBM, but parameters are trained on a different and much bigger dataset.
CAS model variants:
CASnod: a stripped-down version that does not use (D) labels;
CASnosat: a version of the CAS model that does not include the satisfaction term while optimizing the model;
CASnoreg: a version of the CAS model that does not use regularization while training. All other models were trained with L2-regularization.
Results
Is the New Metric Really New?
Correlation Between Metrics
Table: Correlation between metrics, measured by average Pearson's correlation coefficient.

          CASnosat  CASnoreg  CAS    UBM    PBM    DCG    uUBM
CASnod    0.593     0.564     0.633  0.470  0.487  0.546  0.441
CASnosat            0.664     0.715  0.707  0.668  0.735  0.684
CASnoreg                      0.974  0.363  0.379  0.417  0.341
CAS                                  0.377  0.394  0.440  0.360
UBM                                         0.814  0.972  0.882
PBM                                                0.906  0.965
DCG                                                       0.943
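For reference, a single Pearson correlation between two metrics can be computed as in the sketch below. The per-SERP scores are made-up numbers, and the averaging step mentioned in the table caption is not reproduced because the slides do not describe it.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores that two metrics assign to the same four SERPs.
metric_a = [0.2, 0.5, 0.7, 0.9]
metric_b = [1.1, 1.9, 2.4, 3.0]
print(pearson(metric_a, metric_b))
```

A value close to 1 means the two metrics rank and scale SERPs almost identically, which is the sense in which a low value in the table marks a metric as measuring something different.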
Is the New Metric Measuring the Right Thing?
Metric Correlation with True Satisfaction
[Figure: Pearson correlation coefficient between different model-based metrics (CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random, DCG, uUBM) and the user-reported satisfaction.]
Bonus Point
Log-Likelihood of Click Prediction
[Figure: Log-likelihood of the click data for CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random and uUBM. Note that uUBM was trained on a totally different dataset.]
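A minimal sketch of a click log-likelihood is shown below. It assumes that each item's click is scored independently against a predicted probability, which is a simplification, and it does not reproduce any normalization used for the figure, since the slides do not state one. The observed clicks and predictions are invented.

```python
import math
from typing import Sequence


def click_log_likelihood(clicks: Sequence[int], click_prob: Sequence[float]) -> float:
    """Sum over items of the log-probability of the observed click outcome."""
    eps = 1e-12  # guard against log(0) for overconfident predictions
    total = 0.0
    for c, p in zip(clicks, click_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += math.log(p) if c else math.log(1.0 - p)
    return total


# Hypothetical observed clicks on a 4-item SERP and two competing predictions.
observed = [1, 0, 0, 1]
good_model = [0.8, 0.1, 0.2, 0.6]
poor_model = [0.5, 0.5, 0.5, 0.5]
print(click_log_likelihood(observed, good_model))
print(click_log_likelihood(observed, poor_model))
```

Values are always non-positive, and a value closer to zero indicates that the model assigned higher probability to what users actually did.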
Summary
A model-based metric needs to model satisfaction explicitly
and use it for training.
Direct snippet relevance (D) is essential for predicting
satisfaction.
The CAS metric is quite different from the previously used
metrics, making it an interesting addition to TREC.
When used as a model, CAS consistently predicts user
satisfaction with a relatively small penalty in click prediction.
Acknowledgments
All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.
Evaluating the User Model
Log-Likelihood of Satisfaction Prediction
[Figure: Log-likelihood of the satisfaction prediction for CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random and uUBM. Some models have log-likelihood below −0.8, hence there are no boxes for them.]
Analyzing the Attention Features
CASrank is the model that only uses the rank to predict attention;
CASnogeom only uses the rank and SERP item type information and does not use geometry;
CASnoclass does not use the CSS class features (SERP item type).
[Figures, each comparing CASrank, CASnogeom, CASnoclass, CASnod and CAS: Pearson correlation with satisfaction; log-likelihood of clicks; log-likelihood of satisfaction.]
Heterogeneous SERPs
12% of the SERPs in our data are heterogeneous, and our metric does well for them: CAS reaches a correlation of 0.60, against 0.38 for the best baseline (UBM).
Table: Pearson correlation between utility of heterogeneous SERP and user-reported satisfaction.

CAS      UBM        PBM         random  DCG       uUBM
0.60     0.38       -0.05       -0.39   0.24      -0.08

CASrank  CASnogeom  CASnoclass  CASnod  CASnosat  CASnoreg
0.15     -0.04      0.27        -0.04   0.48      0.67
Spammers
Some raters were filtered out as spammers, but there was still
some natural disagreement:
Table: Filtered out workers and agreement scores for remaining workers.

label  % of workers removed  % of ratings removed  Cohen's kappa  Krippendorff's alpha
(D)    32%                   27%                   0.339          0.144
(R)    41%                   29%                   0.348          0.117
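To make the agreement scores easier to interpret, here is a small sketch of Cohen's kappa for two raters labelling the same items. The labels are invented, and the slides do not say how the score was aggregated over many crowd workers, so this shows only the basic two-rater definition.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance.

    Undefined (division by zero) when both raters use one identical label throughout.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability that both raters pick the same label independently.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical relevance labels from two raters for the same six results.
rater_1 = ["rel", "rel", "non", "non", "rel", "non"]
rater_2 = ["rel", "non", "non", "non", "rel", "rel"]
print(cohens_kappa(rater_1, rater_2))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so values around 0.34 indicate agreement that is clearly above chance but far from unanimous.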
Building Influence in 2019
 
Influence Not Influencers
Influence Not InfluencersInfluence Not Influencers
Influence Not Influencers
 
The Next Era of Web Marketing: 2019 & Beyond
The Next Era of Web Marketing: 2019 & BeyondThe Next Era of Web Marketing: 2019 & Beyond
The Next Era of Web Marketing: 2019 & Beyond
 
How to Kick Butt with Your Email Outreach
How to Kick Butt with Your Email OutreachHow to Kick Butt with Your Email Outreach
How to Kick Butt with Your Email Outreach
 
The Big 7 Startup Marketing Mistakes
The Big 7 Startup Marketing MistakesThe Big 7 Startup Marketing Mistakes
The Big 7 Startup Marketing Mistakes
 
Why Your Saas Marketing Sucks (and how to fix it)
Why Your Saas Marketing Sucks (and how to fix it)Why Your Saas Marketing Sucks (and how to fix it)
Why Your Saas Marketing Sucks (and how to fix it)
 
SEO on the SERPs - Brighton SEO Closing Talk
SEO on the SERPs - Brighton SEO Closing TalkSEO on the SERPs - Brighton SEO Closing Talk
SEO on the SERPs - Brighton SEO Closing Talk
 
7 Lessons That Would Have Made Me a Better Entrepreneur
7 Lessons That Would Have Made Me a Better Entrepreneur7 Lessons That Would Have Made Me a Better Entrepreneur
7 Lessons That Would Have Made Me a Better Entrepreneur
 
The Search & SEO World in 2018
The Search & SEO World in 2018The Search & SEO World in 2018
The Search & SEO World in 2018
 
SEO in 2017/18
SEO in 2017/18SEO in 2017/18
SEO in 2017/18
 
Why Startups Suck at Marketing
Why Startups Suck at MarketingWhy Startups Suck at Marketing
Why Startups Suck at Marketing
 
The Invisible Giant that Mucks Up Our Marketing
The Invisible Giant that Mucks Up Our MarketingThe Invisible Giant that Mucks Up Our Marketing
The Invisible Giant that Mucks Up Our Marketing
 
B2B SEO in 2017
B2B SEO in 2017B2B SEO in 2017
B2B SEO in 2017
 
How to Survive Google's Trojan Horsing of the Web
How to Survive Google's Trojan Horsing of the WebHow to Survive Google's Trojan Horsing of the Web
How to Survive Google's Trojan Horsing of the Web
 
What Startup Execs Need to Know About SEO in 2017
What Startup Execs Need to Know About SEO in 2017What Startup Execs Need to Know About SEO in 2017
What Startup Execs Need to Know About SEO in 2017
 
Inside Google's Numbers in 2017
Inside Google's Numbers in 2017Inside Google's Numbers in 2017
Inside Google's Numbers in 2017
 
Why We Can't Do SEO WIthout CRO
Why We Can't Do SEO WIthout CROWhy We Can't Do SEO WIthout CRO
Why We Can't Do SEO WIthout CRO
 
The Digital Marketer's Framework
The Digital Marketer's FrameworkThe Digital Marketer's Framework
The Digital Marketer's Framework
 

Kürzlich hochgeladen

Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa494f574xmv
 
Unidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxUnidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxmibuzondetrabajo
 
TRENDS Enabling and inhibiting dimensions.pptx
TRENDS Enabling and inhibiting dimensions.pptxTRENDS Enabling and inhibiting dimensions.pptx
TRENDS Enabling and inhibiting dimensions.pptxAndrieCagasanAkio
 
ETHICAL HACKING dddddddddddddddfnandni.pptx
ETHICAL HACKING dddddddddddddddfnandni.pptxETHICAL HACKING dddddddddddddddfnandni.pptx
ETHICAL HACKING dddddddddddddddfnandni.pptxNIMMANAGANTI RAMAKRISHNA
 
IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119APNIC
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书zdzoqco
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书rnrncn29
 
Company Snapshot Theme for Business by Slidesgo.pptx
Company Snapshot Theme for Business by Slidesgo.pptxCompany Snapshot Theme for Business by Slidesgo.pptx
Company Snapshot Theme for Business by Slidesgo.pptxMario
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predieusebiomeyer
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书rnrncn29
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxDyna Gilbert
 

Kürzlich hochgeladen (11)

Film cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasaFilm cover research (1).pptxsdasdasdasdasdasa
Film cover research (1).pptxsdasdasdasdasdasa
 
Unidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptxUnidad 4 – Redes de ordenadores (en inglés).pptx
Unidad 4 – Redes de ordenadores (en inglés).pptx
 
TRENDS Enabling and inhibiting dimensions.pptx
TRENDS Enabling and inhibiting dimensions.pptxTRENDS Enabling and inhibiting dimensions.pptx
TRENDS Enabling and inhibiting dimensions.pptx
 
ETHICAL HACKING dddddddddddddddfnandni.pptx
ETHICAL HACKING dddddddddddddddfnandni.pptxETHICAL HACKING dddddddddddddddfnandni.pptx
ETHICAL HACKING dddddddddddddddfnandni.pptx
 
IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119
 
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
办理多伦多大学毕业证成绩单|购买加拿大UTSG文凭证书
 
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
『澳洲文凭』买詹姆士库克大学毕业证书成绩单办理澳洲JCU文凭学位证书
 
Company Snapshot Theme for Business by Slidesgo.pptx
Company Snapshot Theme for Business by Slidesgo.pptxCompany Snapshot Theme for Business by Slidesgo.pptx
Company Snapshot Theme for Business by Slidesgo.pptx
 
SCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is prediSCM Symposium PPT Format Customer loyalty is predi
SCM Symposium PPT Format Customer loyalty is predi
 
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
『澳洲文凭』买拉筹伯大学毕业证书成绩单办理澳洲LTU文凭学位证书
 
Top 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptxTop 10 Interactive Website Design Trends in 2024.pptx
Top 10 Interactive Website Design Trends in 2024.pptx
 

Incorporating Clicks, Attention and Satisfaction into a SERP Evaluation Model

  • 1. Background Motivation Model & Metric Experimental Setup Results Summary Incorporating Clicks, Attention and Satisfaction into a SERP Evaluation Model Aleksandr Chuklin¶,§ Maarten de Rijke§ chuklin@google.com derijke@uva.nl ¶Google Research Europe §University of Amsterdam AC–MdR Incorporating Clicks, Attention and Satisfaction. . . 1
  • 3. Search Engine Result Page (SERP) Evaluation. Main problem: combining the relevance of individual SERP items ($R_k$) into a whole-page metric.
  • 7. Search Engine Result Page (SERP) Evaluation: Examples. Precision at N: $P@N = \frac{1}{N} \sum_{k=1}^{N} R_k$. Discounted Cumulative Gain (DCG): $DCG@N = \sum_{k=1}^{N} \frac{1}{\log_2(1 + k)} \cdot R_k$. Model-Based Metrics (Chuklin et al. 2013): $Utility@N = \sum_{k=1}^{N} P(C_k = 1) \cdot R_k$. [Illustration: a SERP listing document 3, document 4, document 1, document 2, document 5.]
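The three example metrics above can be sketched in Python as follows. This is only an illustrative sketch: the function names are made up here, `rels` is assumed to hold numeric relevance labels $R_k$ in rank order, and `click_probs` stands for $P(C_k = 1)$, which in a model-based metric would come from a trained click model rather than being passed in by hand.

```python
import math


def precision_at_n(rels, n):
    """P@N: sum of the top-n relevance labels divided by n."""
    return sum(rels[:n]) / n


def dcg_at_n(rels, n):
    """DCG@N: relevance discounted by 1 / log2(1 + k), with rank k starting at 1."""
    return sum(r / math.log2(1 + k) for k, r in enumerate(rels[:n], start=1))


def utility_at_n(click_probs, rels, n):
    """Utility@N: relevance weighted by the click probability P(C_k = 1)."""
    return sum(p * r for p, r in zip(click_probs[:n], rels[:n]))
```

The only difference between the three functions is the per-rank weight: a constant $1/N$, a logarithmic discount, or a click probability supplied by a user model.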
  • 8. Main Goal of This Paper. A better measure for SERP utility; namely, improve this (Chuklin et al. 2013): $\sum_{k=1}^{N} P(C_k = 1) \cdot R_k$.
  • 10. Complex Heterogeneous SERPs
  • 11. Motivation 1: Non-Trivial Attention Patterns. [Figure: mouse data on a SERP.] Image credits: F. Diaz, R.W. White, G. Buscher, and D. Liebling. Robust models of mouse movement on dynamic web search results pages. In CIKM, 2013. ACM Press.
  • 12. Motivation 2: Satisfaction Without Clicks. High direct page utility (measured by DCG or ERR) leads to a higher abandonment rate (SERPs with no clicks). [Figure; axis label: direct page utility.] Image credits: A. Chuklin and P. Serdyukov. Good abandonments in factoid queries. In WWW, 2012.
  • 15. Problems of Existing Models and Evaluation Metrics: existing models mostly do not model non-trivial user attention patterns; existing models do not use explicit user satisfaction data.
  • 17. Clicks + Attention + Satisfaction (CAS) Model. [Diagram nodes: SERP; per-item $\varphi_k$, $E_k$, $C_k$ for each SERP item $k$; $S$; Utility.]
  • 20. Click Model. Examination assumption: a click happens only when an item was examined and attractive: $P(C_k = 1) = P(E_k = 1) \cdot P(C_k = 1 \mid E_k = 1)$. N.B. Here we assume that $P(C_k = 1 \mid E_k = 1) = \alpha(R_k)$, where $R_k$ comes from the raters and $\alpha$ is a logistic function.
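A minimal sketch of the examination assumption, for illustration only. The slide states only that $\alpha$ is a logistic function of the rating; treating $R_k$ as a single number with one hypothetical `bias` and one hypothetical `weight` is an assumption made here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def attractiveness(rating, bias, weight):
    """alpha(R_k) = P(C_k = 1 | E_k = 1); bias and weight are hypothetical parameters."""
    return sigmoid(bias + weight * rating)


def click_probability(p_examined, rating, bias, weight):
    """P(C_k = 1) = P(E_k = 1) * P(C_k = 1 | E_k = 1)."""
    return p_examined * attractiveness(rating, bias, weight)
```

An item that is never examined gets click probability zero regardless of its rating, which is exactly what the examination assumption encodes.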
  • 24. Attention (Examination) Model. Logistic regression model: $P(E_k = 1) = \varepsilon(\varphi_k)$, where $\varphi_k$ is a vector of features for SERP item $k$.
       Feature group | Features                                                                                       | # of features
       rank          | user-perceived rank of the SERP item (can be different from $k$)                               | 1
       CSS classes   | SERP item type (Web, News, Weather, Currency, Knowledge Panel, etc.)                           | 10
       geometry      | offset from the top, first or second column (binary), width ($w$), height ($h$), $w \times h$  | 5
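The attention model can be sketched as a plain logistic regression over the listed feature groups. Everything below is an assumption for illustration: the function names, the argument layout, the units of the geometry values, and the weights. The slide names only five of the ten item types, so the list of types is left to the caller rather than filled in.

```python
import math


def item_features(rank, item_type, known_types, top_offset, second_column, width, height):
    """Build phi_k: rank, one-hot SERP item type, and the five geometry features."""
    one_hot = [1.0 if item_type == t else 0.0 for t in known_types]
    geometry = [top_offset, 1.0 if second_column else 0.0, width, height, width * height]
    return [float(rank)] + one_hot + geometry


def examination_probability(features, weights, bias):
    """P(E_k = 1) = epsilon(phi_k), here a logistic function of a weighted sum."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)
```

With all ten item types supplied, `item_features` yields 1 + 10 + 5 = 16 values, matching the feature counts in the table.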
  • 29. Satisfaction Model. In previous models, satisfaction comes only from clicked results; in our model it also comes from the SERP items that simply attracted attention: $P(S = 1) = \sigma(\tau_0 + U) = \sigma\left(\tau_0 + \sum_k P(E_k = 1)\, u_d(D_k) + \sum_k P(C_k = 1)\, u_r(R_k)\right)$, where $D_k$ and $R_k$ are the ratings assigned by the raters for direct snippet relevance and result relevance, respectively, and $u_d$ and $u_r$ are linear functions of rating histograms.
  • 32. The CAS Metric. Utility that determines the satisfaction probability: $U = \underbrace{\sum_k P(E_k = 1)\, u_d(D_k)}_{\text{NEW}} + \underbrace{\sum_k P(C_k = 1)\, u_r(R_k)}_{\text{Chuklin et al. 2013}}$. Compared with the metric of Chuklin et al. 2013, the CAS metric has an additional term and is trained on mousing and satisfaction (in addition to clicks).
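As a rough sketch, the utility $U$ and the resulting satisfaction probability could be computed as below. The slides say only that $u_d$ and $u_r$ are linear functions of rating histograms; modelling them as a dot product without an intercept, and all names and weights used here, are assumptions for illustration.

```python
import math


def linear_utility(histogram, weights):
    """u(.) as a weighted sum over a rating histogram (assumed form, no intercept)."""
    return sum(w * h for w, h in zip(weights, histogram))


def cas_utility(p_exam, p_click, d_hists, r_hists, w_d, w_r):
    """U = sum_k P(E_k=1) u_d(D_k) + sum_k P(C_k=1) u_r(R_k)."""
    attention_term = sum(pe * linear_utility(d, w_d) for pe, d in zip(p_exam, d_hists))
    click_term = sum(pc * linear_utility(r, w_r) for pc, r in zip(p_click, r_hists))
    return attention_term + click_term


def satisfaction_probability(utility, tau0):
    """P(S = 1) = sigma(tau0 + U)."""
    return 1.0 / (1.0 + math.exp(-(tau0 + utility)))
```

Setting `w_d` to all zeros removes the attention term and leaves only the click-weighted sum, the part labelled Chuklin et al. 2013 on the slide.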
  • 35. Dataset: 199 queries with explicit unambiguous feedback (satisfied / not satisfied); 1,739 rated results, rated for direct snippet relevance (D) and result relevance (R).
  • 37. Baselines and CAS Model Variants.
       - UBM: a model that agrees well with online team-draft experimental outcomes;
       - PBM: position-based model, a robust model with fewer parameters than UBM;
       - random: a model that predicts click and satisfaction with fixed probabilities (learned from the data);
       - uUBM: from Chuklin et al. 2013. Similar to UBM, but parameters are trained on a different and much bigger dataset;
       - CASnod: a stripped-down version that does not use (D) labels;
       - CASnosat: a version of the CAS model that does not include the satisfaction term while optimizing the model;
       - CASnoreg: a version of the CAS model that does not use regularization while training. All other models were trained with L2-regularization.
  • 39. Is the New Metric Really New? Correlation Between Metrics. Table: Correlation between metrics measured by average Pearson's correlation coefficient.
                | CASnosat | CASnoreg | CAS   | UBM   | PBM   | DCG   | uUBM
       CASnod   | 0.593    | 0.564    | 0.633 | 0.470 | 0.487 | 0.546 | 0.441
       CASnosat |          | 0.664    | 0.715 | 0.707 | 0.668 | 0.735 | 0.684
       CASnoreg |          |          | 0.974 | 0.363 | 0.379 | 0.417 | 0.341
       CAS      |          |          |       | 0.377 | 0.394 | 0.440 | 0.360
       UBM      |          |          |       |       | 0.814 | 0.972 | 0.882
       PBM      |          |          |       |       |       | 0.906 | 0.965
       DCG      |          |          |       |       |       |       | 0.943
  • 40. Is the New Metric Measuring the Right Thing? Metric Correlation with True Satisfaction. [Figure: Pearson correlation coefficient between different model-based metrics and the user-reported satisfaction, shown for CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random, DCG and uUBM.]
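For reference, Pearson's correlation coefficient between a metric's scores and satisfaction labels can be computed as in the sketch below. The function name and the toy inputs are illustrative; how the slides average the coefficient across data splits is not stated and is not reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A call such as `pearson(metric_scores, satisfied_flags)`, with one metric value and one 0/1 satisfaction label per SERP, gives a single coefficient of the kind plotted above.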
  • 41. Bonus Point: Log-Likelihood of Click Prediction. [Figure: log-likelihood of the click data for CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random and uUBM.] Note that uUBM was trained on a totally different dataset.
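The log-likelihood of binary click observations under predicted click probabilities can be sketched as follows. The clipping constant `eps` and the choice to return an unnormalized sum are assumptions made here; the slides do not say how the plotted values are normalized (for example, per query or per session).

```python
import math


def bernoulli_log_likelihood(probs, outcomes, eps=1e-12):
    """Sum of log P(observed outcome) for predicted probabilities of a binary event."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += math.log(p) if y else math.log(1.0 - p)
    return total
```

Values closer to zero indicate better predictions; the same function applies to satisfaction prediction if `outcomes` holds the satisfied / not satisfied labels.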
  • 46. Summary. A model-based metric needs to model satisfaction explicitly and use it for training. Direct snippet relevance (D) is essential for predicting satisfaction. The CAS metric is quite different from the previously used metrics, making it an interesting addition to TREC. When used as a model, CAS consistently predicts user satisfaction with a relatively small penalty in click prediction.
  • 47. Acknowledgments. All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.
  • 49. Evaluating the User Model: Log-Likelihood of Satisfaction Prediction. [Figure: log-likelihood of the satisfaction prediction for CASnod, CASnosat, CASnoreg, CAS, UBM, PBM, random and uUBM.] Some models have log-likelihood below −0.8, hence there are no boxes for them.
  • 50. Analyzing the Attention Features. CASrank is the model that uses only the rank to predict attention; CASnogeom uses only the rank and SERP item type information and does not use geometry; CASnoclass does not use the CSS class features (SERP item type). [Figures, each comparing CASrank, CASnogeom, CASnoclass, CASnod and CAS: Pearson correlation with satisfaction; log-likelihood of clicks; log-likelihood of satisfaction.]
  • 51. Heterogeneous SERPs. 12% of the SERPs in our data are heterogeneous and our metric does well for them. Table: Pearson correlation between utility of heterogeneous SERP and user-reported satisfaction.
       CAS     | UBM       | PBM      | random | DCG      | uUBM
       0.60    | 0.38      | -0.05    | -0.39  | 0.24     | -0.08
       CASrank | CASnogeom | CASclass | CASnod | CASnosat | CASnoreg
       0.15    | -0.04     | 0.27     | -0.04  | 0.48     | 0.67
  • 52. Background Motivation Model & Metric Experimental Setup Results Summary Spammers Some raters were filtered out as spammers, but there was still some natural disagreement: Table: Filtered out workers and agreement scores for remaining workers. % of workers % of ratings Cohen’s Krippendorf’s label removed removed kappa alpha (D) 32% 27% 0.339 0.144 (R) 41% 29% 0.348 0.117 AC–MdR Incorporating Clicks, Attention and Satisfaction. . . 34