Conversion Models
ABSOLUTELY AMAZING learning to rank training data ?
Activate 2019
Discount Code ctwact19 for 40% off!
Doug Turnbull, http://o19s.com
WE'RE HIRING!
Relevance Cornucopia🦃 Training Event:
http://o19s.com/blog/2019/09/11/announcing-relevance-cornucopia/
(Early Bird (gobble gobble) till end of Sept)
● Week of Nov 10
● "Think Like a Relevance Engineer" for Solr or Elasticsearch
● "Learning to Rank" & "Natural Language Search" training
● Delivered by our crack team of expert relevance consultants!
What I'm currently up to...
THEY'RE
HIRING!
(see Dennis Chaney's talk)
https://www.lexisnexis.com/en-us/about-us/careers.page
Outline
1. What holds orgs back from AI-Powered Search?
2. Click Models help?
3. Click Models for The Rest of Us
What holds orgs
back from this?
http://aipoweredsearch.com
Discount Code: ctwact19
How most 'Machine Learning Search' Projects Fail
Garbage Training Data In -> Our Jerk-face AI Search -> Garbage Results Out
This difficulty is a major theme in our
community
● From User Actions to Better Rankings, Agnes Van Belle, Haystack EU 2018
● Learning Learning To Rank, Torsten Köster & Fabian Klenk & René Kriegler, Haystack EU 2018
● Learning to rank (LTR) in an Activity Marketplace, Ashraf Aaref & Felipe Besson, MICES 2018
Through 4 iterations of LtR: "Consistent theme of being hindered by judgment quality"
V1 LTR model failed. We need to "Redefine our criteria for measuring relevance" and "Judge the judgements very often"
(entire talk about this problem)
Why is this so hard!?
First: what is the training data?
Judgment List:
grade,keywords,docId
4,Rambo,7555 # Rambo
3,Rambo,1370 # Rambo III
0,Rambo,102947 # First Daughter
4,Rocky,1366 # Rocky
...
Doc 7555 is perfectly relevant for query "Rambo"
Doc 102947 is very irrelevant for "Rambo"
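A minimal sketch of reading a judgment list like the one above into a lookup table. The function name `load_judgments` and the nested-dict layout are assumptions made for illustration, and the trailing `# title` annotations from the slide are left out of the sample data.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the slide (title comments omitted).
JUDGMENTS_CSV = """grade,keywords,docId
4,Rambo,7555
3,Rambo,1370
0,Rambo,102947
4,Rocky,1366
"""


def load_judgments(text):
    """Return {keywords: {docId: grade}} from a grade,keywords,docId CSV."""
    judgments = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        judgments[row["keywords"]][row["docId"]] = int(row["grade"])
    return dict(judgments)


judgments = load_judgments(JUDGMENTS_CSV)
print(judgments["Rambo"])
```

Keying by query first makes it easy to pull every graded document for one query, which is what per-query metrics and LtR training both need.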
Measuring how good search is...
grade,keywords,docId
4,Rambo,7555 # Rambo
3,Rambo,1370 # Rambo III
0,Rambo,102947 # First Daughter
4,Rocky,1366 # Rocky
...
Our Search Solution:
Keywords  NDCG@5  ERR@5
Rambo     0.95    0.56
Rocky     0.58    0.21
Offline testing: How is our tuning going?
Rambo: going pretty good!
Rocky: not so great… let's focus here
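A rough sketch of how NDCG@k and ERR@k can be computed from graded judgments. It assumes one common formulation (gain `2^grade - 1`, `log2` position discount, maximum grade 4); other variants exist, and the function names are hypothetical. It does not reproduce the numbers in the table above, which come from the speaker's own system.

```python
import math


def dcg_at_k(grades, k):
    """Discounted cumulative gain over the top-k grades, in ranked order."""
    return sum((2 ** g - 1) / math.log2(rank + 2)
               for rank, g in enumerate(grades[:k]))


def ndcg_at_k(ranked_grades, all_grades, k):
    """DCG of the returned ranking divided by DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(all_grades, reverse=True), k)
    return dcg_at_k(ranked_grades, k) / ideal if ideal > 0 else 0.0


def err_at_k(ranked_grades, k, max_grade=4):
    """Expected reciprocal rank: users stop at the first satisfying result."""
    err, p_continue = 0.0, 1.0
    for rank, g in enumerate(ranked_grades[:k], start=1):
        p_satisfied = (2 ** g - 1) / (2 ** max_grade)
        err += p_continue * p_satisfied / rank
        p_continue *= 1 - p_satisfied
    return err


rambo_grades = [4, 3, 0]   # grades from the judgment list for "Rambo"
bad_ranking = [0, 4, 3]    # hypothetical engine puts "First Daughter" first
print(ndcg_at_k(bad_ranking, rambo_grades, 5))
print(err_at_k(rambo_grades, 5))
```

Both metrics are only as trustworthy as the grades fed into them, which is the point the rest of the talk builds on.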
… and for training Learning to Rank
Judgments are training data...
grade,keywords,docId
4,Rambo,7555 # Rambo
3,Rambo,1370 # Rambo III
0,Rambo,102947 # First Daughter
4,Rocky,1366 # Rocky
...
Elite Search Team: Train model -> Our LtR Model -> Analyze results
Keywords  NDCG@5
Rambo     0.95
Rocky     0.58
Of course there are manual judgments
http://github.com/o19s/quepid
For a good talk on a robust human judgment program, see Tito Sierra and Tara Diedrichson's Haystack talk "Making the Case for Human Judgment Relevance Testing" https://haystackconf.com/2019/human-judgement/
(Usually not enough data for LtR training data)
For LtR: use implicit data from user behavior
Less
'Opinion'?
How to do this - maybe something like this!?
Clickstream, graded with rules such as:
    if purchased:
        grade = 4
    elif clicked and dwell_secs >= 5:
        grade = 3
    elif clicked:
        grade = 2
    elif shown:  # shown, but not clicked
        grade = 1
Resulting judgment list:
grade,keywords,docId
4,Rambo,7555 # Rambo
3,Rambo,1370 # Rambo III
0,Rambo,102947 # First Daughter
4,Rocky,1366 # Rocky
...
Is this a good approach?
Thoughts?
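To make the question concrete, here is the rule-of-thumb grading above as a small runnable sketch. The `Impression` record and its field names are assumptions made for illustration; this is the naive approach being questioned, not a recommendation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Impression:
    """One result shown for one query (hypothetical clickstream record)."""
    keywords: str
    doc_id: str
    shown: bool = True
    clicked: bool = False
    dwell_secs: float = 0.0
    purchased: bool = False


def naive_grade(imp: Impression) -> Optional[int]:
    """Map a single impression to a grade, strongest signal first."""
    if imp.purchased:
        return 4
    if imp.clicked and imp.dwell_secs >= 5:
        return 3
    if imp.clicked:
        return 2
    if imp.shown:  # shown, but not clicked
        return 1
    return None  # never shown: no evidence either way


print(naive_grade(Impression("Rambo", "7555", clicked=True, dwell_secs=30)))
```

Note that the rules only ever see what the engine chose to show, which is exactly the weakness the next slides pick apart.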
Self-reinforcing bad search
q=stuck on mars
Search Engine: 'Santa Claus Conquers Martians' most relevant!
Users only interact with what the search engine shows them
ML reinforces search's current (bad?) behavior
Position bias: 'Santa Claus…' is clicked more because it's in position 1
Presentation bias: where is "The Martian"?
Domain-specific considerations
Lack of a clear 'Conversion': what if this is just IMDB, with users getting info on the movie? What if users just want to research an expensive purchase first?
What are YOUR users' goals? Shopping vs research vs known-item search vs passive browsing vs … all have different fingerprints
UI layout? How does a grid vs a list influence users' click behavior? What about a chat-bot system or Alexa-style question answering!?
'Good Abandonments': what if your snippets answer the user's question without them clicking on a thing!?
How you get judgments is a model too!
Clickstream -> Your Intuition <your assumptions go here> -> judgment list:
grade,keywords,docId
4,Rambo,7555 # Rambo
3,Rambo,1370 # Rambo III
0,Rambo,102947 # First Daughter
4,Rocky,1366 # Rocky
...
This means when you hear...
"I think that clicking and spending > 5 seconds on the page indicates a relevant document!"
"I think that we should oversample clicks farther down the page to compensate for position bias"
"Carefully inspecting the product is an indication of relevance"
NDCG - but based on what judgment methodology?
"We improved NDCG 20% through X ML search technique!" - Overconfident search consultant
We need to study these models too
Clickstream -> Judgment Aggregation Solution 1 -> Hard-Coded Ranking 1 (A)
Clickstream -> Judgment Aggregation Solution 2 -> Hard-Coded Ranking 2 (B)
Show users the hard-coded ranking corresponding to each judgment list
- A/B test the judgment system
- Consensus with other judgment systems (i.e. manual)
- Continue to evolve & improve
This is why this is so hard
- Search behaviors / UIs constantly evolving
- Your domain & product considerations dominate
- SERP UIs have biases
Ok enough ranting
Click Models
What is a click model
CLICKS
q=waffle maker
So hot right now
Really really really ridiculously good looking
What is this? A search result for q=ANTS?
Click Models for Web Search by Chuklin, Markov, de Rijke
https://www.morganclaypool.com/doi/abs/10.2200/S00654ED1V01Y201507ICR043
Attractiveness vs Satisfaction
Attractiveness (~Perceived Relevance), denoted 'A'
- The snippet *looked* useful/interesting for what I need - tied to clicks
- All click models provide A
≠
Satisfaction (~Actual Relevance), denoted 'S'
- The document satisfied my information need
- Some click models attempt S
CTR: The World's Dumbest Click Model
Result 1: A=0.45
Result 2: A=0.25
Result 3: A=0.15
(we know this is dominated by position bias)
CTR/Avg Posn CTR: The World's Second Simplest Click Model
(aka COEC - clicks over expected clicks)
Result 1: A = 0.45 / 0.50 = 0.9 (this query's CTR for posn 1 / avg CTR for posn 1 over all queries)
Result 2: A = 0.25 / 0.20 = 1.25
Result 3: A = 0.15 / 0.16 = 0.9375
Personalized Click Prediction in Sponsored Search, Cheng, Cantú-Paz
http://www.wsdm-conference.org/2010/proceedings/docs/p351.pdf
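The COEC idea above can be sketched in a few lines: divide a document's observed clicks by the clicks an average document would have earned in the positions where it was shown. The function name `coec`, the per-position averages, and the 100-impression counts are assumptions chosen so the ratios match the slide's numbers.

```python
# Average CTR per position over all queries (values taken from the slide).
AVG_CTR_BY_POSN = {1: 0.50, 2: 0.20, 3: 0.16}


def coec(clicks, impression_positions, avg_ctr_by_posn):
    """Clicks over expected clicks for one query/doc pair.

    impression_positions lists the position of every impression, so a doc
    that moves around the page is normalized by where it was actually shown.
    """
    expected = sum(avg_ctr_by_posn[p] for p in impression_positions)
    return clicks / expected if expected > 0 else 0.0


# 45 clicks in 100 impressions at position 1: a CTR of 0.45.
print(coec(45, [1] * 100, AVG_CTR_BY_POSN))
print(coec(25, [2] * 100, AVG_CTR_BY_POSN))
print(coec(15, [3] * 100, AVG_CTR_BY_POSN))
```

A score above 1 means the document out-performs its position; below 1 means the position is doing most of the work.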
Probabilistic Models ~ e.g. Position Based Model
Observed:
Cd: Document d clicked
Hidden:
Ed: User examined document d
Ad: User found doc d attractive
Parameters:
γr: Examine probability for rank r across all queries
αdq: Attractiveness for doc d, query q
P(Cd) = P(Ed) * P(Ad)
~ γr * αdq
(rank examine prob * doc attractiveness for query)
PBM ~ Two Unknowns, One Equation
P(Cd) ~ γr * αdq
γr: find best examine probability for observed clicks
αdq: find best attractiveness for doc/query pair
Assumptions:
- It's definitely examined, P(Ed)=1, if it's clicked!
- It's definitely attractive, P(Ad)=1, if it's clicked!
- Unlikely something was examined if users never click on that position (or is the document unattractive?)
- Unlikely something is attractive if users seem to examine that position (see posn clicked a lot) but don't click this particular document
Assumptions -> TERRIFYING MAAAAAATH!!!
Iteratively improve attractiveness & examine probabilities over the search sessions until they converge to the most likely values
For each session with query/doc pair (t - iteration):
- Clicked: the 'assumptions' apply
- Not clicked: then probably not attractive if this posn is examined a lot (trust me 😊)
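The slide's equations did not survive extraction, so what follows is a sketch of the standard expectation-maximization updates for the Position Based Model as described in the Chuklin, Markov, de Rijke book, not a transcription of the slide. The session format, the function name `pbm_em`, the 0.5 starting values, and the toy data are all assumptions.

```python
from collections import defaultdict


def pbm_em(sessions, n_iters=50):
    """Estimate PBM attractiveness (alpha) and examination (gamma) with EM.

    sessions: list of (query, results), where results is a rank-ordered
    list of (doc_id, clicked) pairs.
    """
    max_rank = max(len(results) for _, results in sessions)
    gamma = [0.5] * max_rank              # examine probability per rank
    alpha = defaultdict(lambda: 0.5)      # attractiveness per (query, doc)
    for _ in range(n_iters):
        a_sum, a_cnt = defaultdict(float), defaultdict(int)
        g_sum, g_cnt = [0.0] * max_rank, [0] * max_rank
        for query, results in sessions:
            for rank, (doc, clicked) in enumerate(results):
                a, g = alpha[(query, doc)], gamma[rank]
                if clicked:
                    # A click implies both examined and attractive.
                    p_attr = p_exam = 1.0
                else:
                    # Posterior of each hidden cause given no click.
                    p_no_click = max(1.0 - g * a, 1e-12)
                    p_attr = (1.0 - g) * a / p_no_click
                    p_exam = (1.0 - a) * g / p_no_click
                a_sum[(query, doc)] += p_attr
                a_cnt[(query, doc)] += 1
                g_sum[rank] += p_exam
                g_cnt[rank] += 1
        for key in a_sum:
            alpha[key] = a_sum[key] / a_cnt[key]
        gamma = [g_sum[r] / g_cnt[r] for r in range(max_rank)]
    return dict(alpha), gamma


# Toy data: doc A out-clicks doc B at both ranks.
sessions = (
    [("q", [("A", True), ("B", True)])]
    + [("q", [("A", True), ("B", False)])] * 3
    + [("q", [("B", True), ("A", True)])]
    + [("q", [("B", True), ("A", False)])]
    + [("q", [("B", False), ("A", True)])]
    + [("q", [("B", False), ("A", False)])]
)
alpha, gamma = pbm_em(sessions)
print(alpha, gamma)
```

Because clicks only pin down the product of the two parameters, the absolute values depend on the starting point; the relative ordering of documents is the more dependable output.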
Solving for satisfaction
Shoutout: Solving for Satisfaction, Liz Haubert
https://haystackconf.com/2019/satisfaction/
Dynamic Bayesian Network
A Dynamic Bayesian Network Click Model for Web Search Ranking by Chapelle, Zhang
http://olivier.chapelle.cc/pub/DBN_www2009.pdf
Wikimedia Foundation's use of DBN:
https://blog.wikimedia.org/2017/10/17/elasticsearch-learning-to-rank-plugin/
For each rank, chained from rank r-1 to rank r and so on:
Er: result at rank r examined
Ar: result at rank r found attractive, with probability αdq
Cd: document d clicked
Sr: user satisfied by the result at rank r, with probability sdq
We can compute 'attractiveness' and 'satisfaction' of doc for query
γ: if you clicked and were satisfied, you stop; if you were not satisfied, you examine the next result with probability γ
Simplified DBN: last clicked result satisfied me
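A minimal sketch of the Simplified DBN estimate described above, using plain counting: everything down to the last click counts as examined, attractiveness is clicks over examinations, and satisfaction is how often a clicked document was the last click. The session format and the function name `sdbn` are assumptions, as is the choice to treat every result as examined in sessions with no clicks.

```python
from collections import defaultdict


def sdbn(sessions):
    """Return (attractiveness, satisfaction) dicts keyed by (query, doc_id).

    sessions: list of (query, results), where results is a rank-ordered
    list of (doc_id, clicked) pairs.
    """
    examined = defaultdict(int)
    clicks = defaultdict(int)
    last_clicks = defaultdict(int)
    for query, results in sessions:
        clicked_ranks = [r for r, (_, c) in enumerate(results) if c]
        # Assumption: with no click, the user scanned the whole list.
        last = clicked_ranks[-1] if clicked_ranks else len(results) - 1
        for rank, (doc, clicked) in enumerate(results[: last + 1]):
            key = (query, doc)
            examined[key] += 1
            if clicked:
                clicks[key] += 1
                if rank == last:
                    last_clicks[key] += 1
    attractiveness = {k: clicks.get(k, 0) / n for k, n in examined.items()}
    satisfaction = {k: last_clicks.get(k, 0) / c
                    for k, c in clicks.items() if c > 0}
    return attractiveness, satisfaction


sessions = [
    ("waffle maker", [("a", True), ("b", False), ("c", True)]),
    ("waffle maker", [("a", True), ("b", False), ("c", False)]),
]
attractiveness, satisfaction = sdbn(sessions)
print(attractiveness)
print(satisfaction)
```

Documents that were never clicked get an attractiveness score but no satisfaction estimate, since the model has nothing to say about them.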
We are not building Web Search
Web Search:
● Low visibility: just the SERP clicks, we don't see what happens beyond...
● High volume: simpler assumptions help map just clicks to satisfaction
Most other search apps (most of us - 'Average Joes'):
● More visibility: clicks, conversions, and more from the session after search!
● Lower volume: may not be able to rely on simpler assumptions for satisfaction
Click models for the rest of us
Click models for the rest of us
● Click models CAN be used to overcome SERP UI biases to derive attractiveness for Average Joes
● What about satisfaction? Aka 'actual relevance'
● Can we use our advantage to measure that directly?
Avg Joes have enough data to derive attractiveness
q=waffle maker
Attractiveness: 0.7, 0.9, 0.4 (one value per result)
Most of us have some kind of 'post click' tracking
Conversions: Direct/explicit goal completed by user - like "purchase"
Pseudo-conversions: "goals" not directly recognized by user or clear in analytics - like "read article" or "add to cart"
Indications of interest: not quite "goals" but indications user is happy - like "click plus dwell"
'Shallow' events dense; 'deeper' events sparse
q=heart attack
Attractiveness: 0.7 (click!)
These clicks are fleeting to users
Top of funnel/path -> Click+Dwell -> Click+Dwell+Scroll -> Read Reviews -> Add to Cart -> Checkout -> End of funnel/path
Top of funnel/path: most people should get here...
End of funnel/path: ...a few will get all the way through...
If the user can't be bothered to do a shallow event, attractiveness is discounted
q=waffle maker
Attractiveness: 0.7
User immediately hits back button! Time on page = 0.001s
Not actually relevant
If the user moves deep into the page, attractiveness is confirmed
q=waffle maker
Attractiveness: 0.7
Add to Cart, Bought
Definitely relevant
Discount attractiveness based on event not achieved
q=heart attack
Attractiveness: 0.7 (click!)
Click+Dwell -> Click+Dwell+Scroll -> Read Reviews -> Add to Cart -> Checkout
Quit here? Discount A: 0.01
Quit here? Discount A: 0.95
Update over multiple sessions...
q=waffle maker
Attractiveness: 0.7
Further 'post query' evidence:
Session 1: Bought (D=0.65)
Session 2: Immediately returned to SERP (D=0.01)
Session 3: Stayed on page, read reviews (D=0.20)
$J = \text{Attractiveness} \times \frac{\sum \text{Discount}}{\text{num\_sessions}}$
$J = 0.7 \times \frac{0.65 + 0.01 + 0.20}{3} \approx 0.7 \times 0.29 \approx 0.20$
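As a worked check of the arithmetic, here is the discounting step as a tiny function. The name `discounted_judgment` is hypothetical; the inputs are the slide's numbers. The average discount over the three sessions is about 0.29, and multiplying by the attractiveness of 0.7 gives roughly 0.20.

```python
def discounted_judgment(attractiveness, session_discounts):
    """J = attractiveness * mean(per-session discount)."""
    if not session_discounts:
        # No post-click evidence: fall back to the click model's estimate.
        return attractiveness
    avg_discount = sum(session_discounts) / len(session_discounts)
    return attractiveness * avg_discount


# Bought, bounced straight back to the SERP, stayed and read reviews.
discounts = [0.65, 0.01, 0.20]
j = discounted_judgment(0.7, discounts)
print(round(sum(discounts) / len(discounts), 2), round(j, 2))
```

Falling back to raw attractiveness when there are no sessions is a design choice for this sketch; skipping such pairs entirely would be equally defensible.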
User Value-Cost Model
What is the value of a page for the user?
We can't really measure the value, but we can indirectly measure the cost to the user in time & money
Back immediately: "...I can't be bothered..." -> Discount heavily
Click+Dwell, Click+Dwell+Scroll, Read Reviews: "… this was at least worth some of my time towards my goal..." -> Discount moderately
Bayes justification to judgments
$P(J \mid V) = \frac{P(V \mid J) \, P(J)}{P(V)}$
P(J | V): Judgment in the context of value
P(J): Prior, earlier belief in relevance given by attractiveness as derived from click model
P(V | J): Probability of user getting value in the context of it being deemed relevant to this query
P(V): Probability of user getting value regardless of query
Bayes approach to judgments
$J = \frac{\text{avgPageValueForThisQuery} \times A}{\text{avgPageValue}}$
When avg_page_value = 0.3
q=waffle maker
Attractiveness: 0.7
Further 'post query' evidence:
Session 1: Bought (user_value=0.65)
Session 2: Immediately returned to SERP (user_value=0.01)
Session 3: Stayed on page, read reviews (user_value=0.20)
$J = \frac{\text{Attractiveness} \times \frac{\sum \text{user\_value}}{\text{num\_sessions}}}{\text{avg\_page\_value}} = \frac{0.7 \times \frac{0.65 + 0.01 + 0.20}{3}}{0.3} \approx \frac{0.7 \times 0.29}{0.3} \approx 0.67$
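A small sketch of the Bayes-style judgment, assuming the three quantities map onto Bayes' rule the way the previous slide labels them: attractiveness as the prior, the average user value for this query as the likelihood, and the page's overall average value as the evidence. The function name `bayes_judgment` is hypothetical. With the slide's inputs, the value ratio alone is about 0.96 and the full product with the 0.7 attractiveness is about 0.67.

```python
def bayes_judgment(attractiveness, session_values, avg_page_value):
    """J = P(V | J) * P(J) / P(V), estimated from session-level user values."""
    value_for_query = sum(session_values) / len(session_values)  # ~P(V | J)
    return value_for_query * attractiveness / avg_page_value


session_values = [0.65, 0.01, 0.20]
j = bayes_judgment(0.7, session_values, avg_page_value=0.3)
print(round(j, 2))
```

Nothing here keeps the result at or below 1: a page that performs far better for this query than it does on average can push J above 1, so a real pipeline would likely clamp or rescale before turning J into a grade.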
Your take-home reading:
Zhong et al., Incorporating Post-Click Behaviors into a Click Model
https://zhangyuc.github.io/files/zhang11kdd.pdf
Questions

Conversion Models: A Systematic Method of Building Learning to Rank Training Data - Doug Turnbull, OpenSource Connections

  • 1. Conversion Models ABSOLUTELY AMAZING learning to rank training data ? Activate 2019 Discount Code ctwact19 for 40% off! Doug Turnbull, http://o19s.com WE'RE HIRING!
  • 2. Relevance Cornucopia🦃 Training Event: http://o19s.com/blog/2019/09/11/announcing-relevance-cornucopia/ (Early Bird (gobble gobble) till end of Sept) ● Week of Nov 10 ● "Think Like a Relevance Engineer" for Solr or Elasticsearch ● "Learning to Rank" & "Natural Language Search" training ● Delivered by our crack team of expert relevance consultants!
  • 3. What I'm currently up to... THEY'RE HIRING! (see Dennis Chaney's talk) https://www.lexisnexis.com/en-us/about-us/careers.page
  • 4. Outline 1. What holds orgs back from AI-Powered Search? 2. Click Models help? 3. Click Models for The Rest of Us
  • 5. What holds orgs back from this? http://aipoweredsearch.com Discount Code: ctwact19
  • 6. How most 'Machine Learning Search' Projects Fail Our Jerk-face AI Search Garbage Training Data In Garbage Results Out
  • 7. This difficulty is a major theme in our community From User Actions to Better Rankings Agnes Van Belle, Haystack EU 2018; Learning Learning To Rank Torsten Köster & Fabian Klenk & René Kriegler, Haystack EU 2018; Learning to rank (LTR) in an Activity Marketplace Ashraf Aaref & Felipe Besson - MICES 2018 Through 4 iterations of LtR "Consistent theme of being hindered by judgment quality" V1 LTR model failed. We need to "Redefine our criteria for measuring relevance" and "Judge the judgements very often" (entire talk about this problem)
  • 8. Why is this so hard!?
  • 9. First: what is the training data? grade,keywords,docId 4,Rambo,7555 # Rambo 3,Rambo,1370 # Rambo III 0,Rambo,102947 # First Daughter 4,Rocky,1366 # Rocky ... Doc 7555 is perfectly relevant for query "Rambo" Doc 102947 very irrelevant for "Rambo" Judgment List:
  • 10. Measuring how good is search... grade,keywords,docId 4,Rambo,7555 # Rambo 3,Rambo,1370 # Rambo III 0,Rambo,102947 # First Daughter 4,Rocky,1366 # Rocky ... Our Search Solution Keywords NDCG@5 ERR@5 Rambo 0.95 0.56 Rocky 0.58 0.21 Offline testing: How is our tuning going? Rambo: going pretty good! Rocky: not so great… let's focus here
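A minimal sketch of how a judgment list like the one on this slide can be turned into an NDCG@5 number. The slide does not say which DCG variant was used; the exponential gain `2^grade - 1` with a `log2` position discount below is one common choice and is an assumption here, as are the helper names and the rule that unjudged documents count as grade 0.

```python
import math

# Judgment list from the slide: (keywords, docId) -> grade
judgments = {
    ("Rambo", 7555): 4,    # Rambo
    ("Rambo", 1370): 3,    # Rambo III
    ("Rambo", 102947): 0,  # First Daughter
}

def dcg_at_k(grades, k):
    """Discounted cumulative gain over the first k grades (assumed variant: 2^g - 1 gain)."""
    return sum((2 ** g - 1) / math.log2(pos + 2) for pos, g in enumerate(grades[:k]))

def ndcg_at_k(keywords, ranked_doc_ids, k=5):
    """NDCG@k of one ranking, normalized by the best ordering of the judged docs."""
    # Assumption: documents with no judgment are treated as grade 0.
    ranked = [judgments.get((keywords, doc_id), 0) for doc_id in ranked_doc_ids]
    ideal = sorted((g for (kw, _), g in judgments.items() if kw == keywords), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0

good = ndcg_at_k("Rambo", [7555, 1370, 102947])
bad = ndcg_at_k("Rambo", [102947, 1370, 7555])
```

Whatever variant is chosen, the score is only as trustworthy as the grades feeding it, which is the point the following slides build on.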
  • 11. … and for training Learning to Rank grade,keywords,docId 4,Rambo,7555 # Rambo 3,Rambo,1370 # Rambo III 0,Rambo,102947 # First Daughter 4,Rocky,1366 # Rocky ... Our LtR Model Keywords NDCG@5 Rambo 0.95 Rocky 0.58 Train model Judgments are training data... Analyze Results Elite Search Team
  • 12. Of course there's manual judgments http://github.com/o19s/quepid For a good talk on a robust human judgment program, see Tito Sierra and Tara Diedrichson's Haystack Talk "Making the Case for Human Judgment Relevance Testing" https://haystackconf.com/2019/human-judgement/ (Usually not enough data for LtR training data)
  • 13. For LtR: use implicit data from user behavior Less 'Opinion'?
  • 14. How to do this - maybe something like this!? if purchased=True: grade = 4 if clicked + dwell for 5 secs: grade = 3 if click: grade = 2 if shown, but not clicked: grade = 1 Clickstream grade,keywords,docId 4,Rambo,7555 # Rambo 3,Rambo,1370 # Rambo III 0,Rambo,102947 # First Daughter 4,Rocky,1366 # Rocky ... Is this a good approach? Thoughts?
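The rule of thumb on this slide can be written out as a small function. This is only a sketch of the naive approach the slide is questioning, not a recommendation; the `Interaction` record and its field names are hypothetical, and the 5-second dwell threshold is taken from the slide.

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    """One search result shown to one user (hypothetical clickstream schema)."""
    keywords: str
    doc_id: int
    clicked: bool = False
    dwell_secs: float = 0.0
    purchased: bool = False

def naive_grade(event: Interaction) -> int:
    """Apply the slide's rules, strongest signal first."""
    if event.purchased:
        return 4
    if event.clicked and event.dwell_secs >= 5:
        return 3
    if event.clicked:
        return 2
    return 1  # shown, but not clicked

events = [
    Interaction("Rambo", 7555, clicked=True, dwell_secs=40.0, purchased=True),
    Interaction("Rambo", 1370, clicked=True, dwell_secs=12.0),
    Interaction("Rambo", 102947),
]
# Rows in the same grade,keywords,docId layout as the judgment list
rows = [(naive_grade(e), e.keywords, e.doc_id) for e in events]
```

Note that nothing in this mapping accounts for where a result was shown, which is exactly the weakness the next slide describes.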
  • 15. Self-reinforcing bad search Search Engine 'Santa Claus Conquers Martians' most relevant! Users only interact with what the search engine shows them ML reinforces search's current (bad?) behavior Position bias: 'Santa Claus…' clicked more as it's in position 1 Presentation bias: where is "The Martian"? q=stuck on mars
  • 16. Domain-specific considerations Lack of a clear 'Conversion' - what if this is just IMDB getting info on the movie?; what if users just want to research an expensive purchase first? What are YOUR users' goals? Shopping vs research vs known-item search vs passive browsing vs … all have different fingerprints UI layout? How does a grid vs a list influence users' click behaviors? What about a chat-bot system or Alexa-style question answering!?? 'Good Abandonments' - what if your snippets answer the user's question without them clicking on a thing!
  • 17. How you get judgments is a model too! Your Intuition <your assumptions go here> Clickstream grade,keywords,docId 4,Rambo,7555 # Rambo 3,Rambo,1370 # Rambo III 0,Rambo,102947 # First Daughter 4,Rocky,1366 # Rocky ...
  • 18. This means when you hear... "I think that clicking and spending > 5 seconds on the page indicates a relevant document!" "I think that we should oversample clicks farther down the page to compensate for position bias" "Carefully inspecting the product is an indication of relevance"
  • 19. NDCG - but based on what judgment methodology? "We improved NDCG 20% through X ML search technique!" Overconfident search consultant
  • 20. We need to study these models too Clickstream -> Judgment Aggregation Solution 1 -> Hard-Coded Ranking 1 (A); Clickstream -> Judgment Aggregation Solution 2 -> Hard-Coded Ranking 2 (B) Show users the hard-coded ranking corresponding to each judgment list - A/B Test the Judgment system - Consensus with other judgment systems (i.e. manual) - Continue to evolve & improve
  • 21. This is why this is so hard - Search behaviors / UIs constantly evolving - Your domain & products considerations dominate - SERP UIs have biases
  • 24. What is a click model CLICKS q=waffle maker So hot right now Really really really ridiculously good looking What is this? A search result for q=ANTS? Click Models for Web Search by Chuklin, Markov, de Rijke https://www.morganclaypool.com/doi/abs/10.2200/S00654ED1V01Y201507ICR043
  • 25. Attractiveness vs Satisfaction Attractiveness (~Perceived Relevance, denoted 'A'): the snippet *looked* useful/interesting for what I need - tied to clicks. All click models provide A. ≠ Satisfaction (~Actual Relevance, denoted 'S'): the document satisfied my information need. Some click models attempt S.
  • 26. A=0.45 A=0.25 A=0.15 CTR: The World's Dumbest Click Model (we know this is dominated by position bias) So Hot Right Now
  • 27. CTR / Avg Posn CTR: The World's Second Simplest Click Model (aka COEC - clicks over expected clicks) Numerator: this query's CTR for posn 1; denominator: avg CTR for posn 1 over all queries. A = 0.45 / 0.50 = 0.9; A = 0.25 / 0.20 = 1.25; A = 0.15 / 0.16 = 0.9375 So Hot Right Now Personalized Click Prediction in Sponsored Search, Cheng, Cantú-Paz http://www.wsdm-conference.org/2010/proceedings/docs/p351.pdf
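The arithmetic on this slide, spelled out as a sketch. The function name and the dictionary layout (position mapped to CTR) are assumptions for illustration; the numbers are the ones from the slide. It follows the slide's per-position ratio and assumes every average CTR is non-zero.

```python
def coec_by_position(query_ctr, avg_ctr):
    """Divide this query's CTR at each position by the average CTR for that position."""
    return {pos: query_ctr[pos] / avg_ctr[pos] for pos in query_ctr}

# This query's CTR per position, and the average CTR per position over all queries
query_ctr = {1: 0.45, 2: 0.25, 3: 0.15}
avg_ctr = {1: 0.50, 2: 0.20, 3: 0.16}

attractiveness = coec_by_position(query_ctr, avg_ctr)
```

A value above 1 means the result draws more clicks than its position alone would predict, so the second result here looks more attractive than the first despite its lower raw CTR.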
  • 28. Probabilistic Models ~ e.g. Position Based Model Observed: $C_d$, document d clicked. $E_d$: user examined document d. $A_d$: user found doc d attractive. $\gamma_r$: examine probability for rank r across all queries. $\alpha_{dq}$: attractiveness for doc d, query q. $P(C_d) = P(E_d) \cdot P(A_d) \approx \gamma_r \cdot \alpha_{dq}$
  • 29. PBM ~ Two Unknowns, One Equation $P(C_d) \approx \gamma_r \cdot \alpha_{dq}$ Find best examine probability for observed clicks; find best attractiveness for doc/query pair. Assumptions: It's definitely examined, $P(E_d)=1$, if it's clicked! It's definitely attractive, $P(A_d)=1$, if it's clicked! Unlikely something was examined if users never click on that position (or is the document unattractive?). Unlikely something is attractive if users seem to examine that position (the position is clicked a lot) but don't click this particular document.
  • 30. Assumptions -> TERRIFYING MAAAAAATH!!! Iteratively improve attractiveness & examine probabilities over the search sessions until they converge to the most likely values. For each session with the query/doc pair (t = iteration): Clicked: the 'assumptions' above apply. Not clicked: probably not attractive if this posn is examined a lot (trust me 😊)
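The iterative procedure sketched on these slides is expectation-maximization. Below is a minimal version of the standard EM updates for the Position Based Model, under stated assumptions: the input format (a flat list of `(query, doc, rank, clicked)` tuples), the function name, the starting value of 0.5 and the fixed iteration count are all illustrative choices, not taken from the talk.

```python
from collections import defaultdict

def fit_pbm(impressions, iterations=50, init=0.5):
    """Estimate PBM attractiveness (per query/doc) and examine probability (per rank) with EM."""
    impressions = list(impressions)
    alpha = {(q, d): init for q, d, _, _ in impressions}  # attractiveness
    gamma = {r: init for _, _, r, _ in impressions}       # examine probability
    for _ in range(iterations):
        a_sum, a_cnt = defaultdict(float), defaultdict(int)
        g_sum, g_cnt = defaultdict(float), defaultdict(int)
        for q, d, r, clicked in impressions:
            a, g = alpha[(q, d)], gamma[r]
            if clicked:
                # A click means the result was definitely examined and attractive.
                p_attr = p_exam = 1.0
            else:
                # No click: posterior that it was attractive (but not examined),
                # and posterior that it was examined (but not attractive).
                no_click = max(1.0 - g * a, 1e-12)
                p_attr = a * (1.0 - g) / no_click
                p_exam = g * (1.0 - a) / no_click
            a_sum[(q, d)] += p_attr
            a_cnt[(q, d)] += 1
            g_sum[r] += p_exam
            g_cnt[r] += 1
        # New estimates are the averages of the posteriors over all impressions.
        alpha = {k: a_sum[k] / a_cnt[k] for k in a_sum}
        gamma = {k: g_sum[k] / g_cnt[k] for k in g_sum}
    return alpha, gamma

impressions = [
    ("waffle maker", "A", 1, True), ("waffle maker", "B", 2, False),
    ("waffle maker", "A", 1, True), ("waffle maker", "B", 2, True),
    ("waffle maker", "C", 3, False), ("waffle maker", "C", 3, False),
]
alpha, gamma = fit_pbm(impressions)
```

With so little data the two unknowns cannot be separated well; in practice each document needs to be seen at several ranks, and each rank with several documents, before the estimates mean much.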
  • 31. Solving for satisfaction Shoutout: Solving for Satisfaction, Liz Haubert https://haystackconf.com/2019/satisfaction/
  • 32. Dynamic Bayesian Network A Dynamic Bayesian Network Click Model for Web Search Ranking by Chapelle, Zhang http://olivier.chapelle.cc/pub/DBN_www2009.pdf Wikimedia Foundation's use of DBN: https://blog.wikimedia.org/2017/10/17/elasticsearch-learning-to-rank-plugin/ [Diagram: for each rank r, nodes $E_r$ (examined), $A_r$ (attractive, $\alpha_{dq}$), $S_r$ (satisfied, $s_{dq}$) and $C_d$ (clicked), chained from rank r-1 to rank r] We can compute 'attractiveness' and 'satisfaction' of doc for query. If you clicked and were satisfied, you stop; otherwise you examine the next result with probability γ. Simplified DBN: last clicked result satisfied me
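For the simplified DBN, where the last clicked result is taken as the one that satisfied the user, the estimates reduce to counting. The sketch below assumes a session format of `(query, [(doc, clicked), ...])` in rank order; the function name is hypothetical, and skipping sessions with no clicks is an assumption (implementations differ on how they treat those).

```python
from collections import defaultdict

def fit_sdbn(sessions):
    """Count-based attractiveness and satisfaction estimates for the simplified DBN."""
    examined = defaultdict(int)     # times a doc was at or above the last click
    clicks = defaultdict(int)       # times it was clicked
    last_clicks = defaultdict(int)  # times it was the last click in the session
    for query, results in sessions:
        clicked_positions = [i for i, (_, clicked) in enumerate(results) if clicked]
        if not clicked_positions:
            continue  # assumption: sessions without clicks carry no evidence here
        last = clicked_positions[-1]
        # Everything down to the last click is assumed examined; nothing below it.
        for i, (doc, clicked) in enumerate(results[: last + 1]):
            key = (query, doc)
            examined[key] += 1
            if clicked:
                clicks[key] += 1
                if i == last:
                    last_clicks[key] += 1
    attractiveness = {k: clicks.get(k, 0) / n for k, n in examined.items()}
    satisfaction = {k: last_clicks.get(k, 0) / n for k, n in clicks.items() if n > 0}
    return attractiveness, satisfaction

sessions = [
    ("q", [("A", True), ("B", False), ("C", True), ("D", False)]),
    ("q", [("A", False), ("B", True), ("C", False)]),
]
attractiveness, satisfaction = fit_sdbn(sessions)
```

Document D never gets an attractiveness estimate in this example because it only ever appears below the last click.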
  • 33. We are not building Web Search Web Search: ● Low visibility: just the SERP clicks, we don't see what happens beyond... ● High volume: simpler assumptions help map just clicks to satisfaction
  • 34. Most of us - 'Average Joes' Most other search apps: ● More visibility: clicks, conversions, and more from the session after search! ● Lower volume: may not be able to rely on simpler assumptions for satisfaction
  • 35. Click models for the rest of us
  • 36. Click models for the rest of us ● Click Model CAN be used to overcome SERP UI biases to derive attractiveness for Average Joes ● What about satisfaction? Aka 'actual relevance' ● Can we use our advantage to measure that directly?
  • 37. q=waffle maker 0.7 0.9 0.4 Avg Joes have enough data to derive attractiveness Attractiveness:
  • 38. Most of us have some kind of 'post click' tracking Conversions: Direct/explicit goal completed by user - like "purchase" Pseudo-conversions: "goals" not directly recognized by user or clear in analytics - like "read article" or "add to cart" Indications of interest: not quite "goals" but indications user is happy - like "click plus dwell"
  • 39. q=heart attack 0.7 'Shallow' events dense; 'deeper' events sparse Attractiveness: click! These clicks are fleeting to users Top of funnel/path Click+ Dwell Click+ Dwell+ Scroll Read Reviews Add to Cart Checkout End of funnel/path Most people should get here... ...a few will get all the way through...
  • 40. q=waffle maker 0.7 If user can't bother to do shallow event, attractiveness discounted Attractiveness: User immediately hits back button! Time on page = 0.001s Not actually relevant
  • 41. q=waffle maker 0.7 If user moves deep into page, attractiveness confirmed Attractiveness: Add to Cart Bought Definitely relevant
  • 42. q=heart attack 0.7 Discount attractiveness based on event not achieved Attractiveness: click! Click+ Dwell Click+ Dwell+ Scroll Read Reviews Add to Cart Checkout Quit here? Discount A: 0.01 Quit here? Discount A: 0.95
  • 43. Update over multiple sessions... q=waffle maker, Attractiveness: 0.7 Further 'post query' evidence: Session 1: Bought, D=0.65; Session 2: Immediately returned to SERP, D=0.01; Session 3: Stayed on page, read reviews, D=0.20 $J = A \cdot \frac{\sum_{\text{sessions}} D}{\text{num\_sessions}} = 0.7 \times \frac{0.65 + 0.01 + 0.2}{3} \approx 0.7 \times 0.29 \approx 0.20$
  • 44. User Value-Cost Model What is the value of a page for the user? We can't really measure the value, but we can indirectly measure the cost to the user in time & money. Back immediately: "...I can't be bothered..." Discount heavily. Click+ Dwell, Click+ Dwell+ Scroll, Read Reviews…: "this was at least worth some of my time towards my goal..." Discount moderately.
  • 46. Bayes justification to judgments $P(J \mid V) = \frac{P(V \mid J) \cdot P(J)}{P(V)}$ $P(J)$: prior, earlier belief in relevance given by attractiveness as derived from click model. $P(V \mid J)$: probability of user getting value in the context of it being deemed relevant to this query. $P(V)$: probability of user getting value regardless of query. $P(J \mid V)$: judgment in the context of value.
  • 47. Bayes approach to judgments $J = \frac{\text{avgPageValueForThisQuery} \cdot A}{\text{avgPageValue}}$
  • 48. When avg_page_value = 0.3 q=waffle maker, Attractiveness: 0.7 Further 'post query' evidence: Session 1: Bought, D=0.65; Session 2: Immediately returned to SERP, user_value=0.01; Session 3: Stayed on page, read reviews, user_value=0.20 $J = \frac{A \cdot \frac{\sum_{\text{sessions}} \text{user\_value}}{\text{num\_sessions}}}{\text{avg\_page\_value}} = \frac{0.7 \times \frac{0.65 + 0.01 + 0.2}{3}}{0.3} \approx \frac{0.7 \times 0.29}{0.3} \approx 0.67$
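A sketch of the two aggregation steps from the last few slides, evaluated term by term on the waffle maker sessions. The function names are hypothetical; the formulas are the ones written on the slides (attractiveness times the mean per-session value, then optionally divided by the average page value over all queries). The mean of the three session values is about 0.29.

```python
def mean(values):
    return sum(values) / len(values)

def discounted_judgment(attractiveness, session_values):
    """Attractiveness scaled by the mean per-session discount."""
    return attractiveness * mean(session_values)

def bayes_judgment(attractiveness, session_values, avg_page_value):
    """Same, but relative to the value an average page delivers regardless of query."""
    return attractiveness * mean(session_values) / avg_page_value

# Per-session values from the slides: bought, bounced back to SERP, read reviews
session_values = [0.65, 0.01, 0.20]

j_discounted = discounted_judgment(0.7, session_values)
j_bayes = bayes_judgment(0.7, session_values, avg_page_value=0.3)
```

Because the second form divides by a site-wide average, it can exceed 1 for pages that deliver far more value than is typical; whether to clip or rescale before turning it into a grade is a design choice the slides leave open.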
  • 49. Your take-home reading: Zhong et al., Incorporating Post-Click Behaviors into a Click Model https://zhangyuc.github.io/files/zhang11kdd.pdf

Editor's notes

  1. Good abandonments
  2. What does the input data look like? Map it out?
  3. Pros/Cons Cons: As overall relevance improves, the denominator in COEC also improves
  4. Might need to see this a bit more in terms of what the input is
  5. Add long tail data
  6. Stronger intro of idea of priors and posteriors