Direct Answers for Search
Queries in the Long Tail

Michael Bernstein, Jaime Teevan, Susan Dumais,
Dan Liebling, and Eric Horvitz
MIT CSAIL and Microsoft Research



MIT HUMAN-COMPUTER INTERACTION
Answers: Direct Search Results
Manually constructed for popular queries

weather boston
the girl with the dragon tattoo
memorial day 2012
AAPL

Good answers reduce clicks on result pages;
users trigger answers repeatedly once discovered.
[Chilton + Teevan 2009]

But answers cover only popular query types, because answers are:
- high cost
- high maintenance
Prevalence of Uncommon Searches
No answers for many information needs

molasses substitutes

increase volume windows xp

dissolvable stitches speed

dog body temperature

CHI 2013 deadline
               …
Tail Answers
Direct results for queries in the long tail

molasses substitutes




Tail Answers improve the search
experience for less common queries,
and fully compensate for poor results.
The Long Tail of Answers
[Chart: # occurrences per information need. Head needs such as
“weather” and “movies” get manually curated answers; Tail Answers
target the long tail, e.g., “chi 2017 location”.]

Hard to find structured information
Not enough query volume for dedicated teams
Crowds in Tail Answers
Crowd Data: search logs (75 million Bing search trails)
Paid Crowds: on-demand workers (Extract, Proofread, Title)

Crowds can support the
long tail of user goals in
interactive systems.
Tail Answers Pipeline
Find URLs that satisfy fact-finding
information needs, then extract answers
from those pages
Tail Answers Pipeline
1   Identify answer candidates

2   Filter candidates

3   Extract answer content
Identify Answer Candidates
Crowd data: 75 million search sessions
All information needs are answer candidates:
queries leading to a clickthrough on a single URL


    force quit mac
  force quit on macs
how to force quit mac      →  URL
           …
Abstraction: Search Trails
[White, Bilenko, and Cucerzan 2007]

URL path from query to 30-minute session timeout:
query → URL1 → URL2 → URL3
[Diagram: several example trails, each passing through URL1]
Example Answer Candidates
   force quit mac
 force quit on macs       →  URL1
how to force quit mac

    410 area code         →  URL2
area code 410 location
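
A minimal Python sketch of this identification step, under the slide’s definition (all queries whose clickthroughs land on the same URL form one answer candidate). The SearchTrail structure and function names are illustrative, not from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class SearchTrail:
    query: str        # the query that started the trail
    urls: list[str]   # pages visited after the query, in order

def identify_candidates(trails: list[SearchTrail]) -> dict[str, set[str]]:
    """Map each clicked URL to the set of queries that led there."""
    candidates: defaultdict[str, set[str]] = defaultdict(set)
    for trail in trails:
        if trail.urls:  # skip trails with no clickthrough
            candidates[trail.urls[0]].add(trail.query)
    return dict(candidates)

trails = [
    SearchTrail("force quit mac", ["URL1"]),
    SearchTrail("force quit on macs", ["URL1"]),
    SearchTrail("how to force quit mac", ["URL1"]),
    SearchTrail("410 area code", ["URL2"]),
]
print(identify_candidates(trails))
# {'URL1': {'force quit mac', 'force quit on macs', 'how to force quit mac'},
#  'URL2': {'410 area code'}}
```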
Tail Answers Pipeline
1   Identify answer candidates



2   Filter candidates



3   Extract answer content
Filtering Answer Candidates
Focus on fact-finding information needs [Kellar 2007]
Exclude popular but unanswerable candidates:
         radio
        pandora            →  pandora.com
 pandora radio log in
Filtering Answer Candidates
Three filters remove answer candidates that
do not address fact-finding information needs:
     Navigation behavior
     Pages addressing search needs

     Query behavior
     Unambiguous needs

     Answer type
     Succinct answers
Filter by Navigation Behavior
Destination Probability for a URL:
P(session length = 2 | URL in trail)
i.e., the probability of ending the session at that URL
immediately after clicking through from the search results

query → URL1
query → URL1
query → URL1 → URL2 → URL3
query → URL1 → URL4

URL1 destination probability = 2/4 = 0.5
(two of the four trails containing URL1 end there)
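
A small sketch of this computation. Trails here list only the pages visited after the query, so a session of length 2 (query plus one page) appears as a single-URL trail; this representation is an assumption for illustration.

```python
def destination_probability(trails: list[list[str]], url: str) -> float:
    """Fraction of trails containing `url` whose session ended right
    there, i.e., the trail is that single page and nothing more."""
    containing = [t for t in trails if url in t]
    if not containing:
        return 0.0
    ended_there = sum(1 for t in containing if t == [url])
    return ended_there / len(containing)

trails = [
    ["URL1"],                  # session ended at URL1
    ["URL1"],                  # session ended at URL1
    ["URL1", "URL2", "URL3"],  # searcher kept browsing
    ["URL1", "URL4"],
]
print(destination_probability(trails, "URL1"))  # 0.5
```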
Filter by Navigation Behavior
Destination Probability Filter:
remove URLs with a low probability that searchers
end their session there (lots of back navigations and later clicks)
This keeps queries where searchers actually addressed
an information need
Filter by Query Behavior
What answers are these searchers looking for?

dissolvable stitches
(how long they last?)
(what they’re made of?)


732 area code
(city and state?)
(count of active phone numbers?)
Filter by Query Behavior
A minority of searchers use question words:

how long dissolvable stitches last


where is 732 area code

Filter out candidates with fewer than 1% of
clickthroughs coming from question queries (sketched below)
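
A hedged sketch of this filter; the question-word list and the one-query-per-clickthrough representation are assumptions, not taken from the paper.

```python
QUESTION_WORDS = {"how", "what", "where", "when", "why", "who", "which"}

def is_question_query(query: str) -> bool:
    words = query.lower().split()
    return bool(words) and words[0] in QUESTION_WORDS

def passes_query_filter(clickthrough_queries: list[str],
                        threshold: float = 0.01) -> bool:
    """Keep a candidate only if at least `threshold` of its
    clickthroughs came from question-word queries."""
    if not clickthrough_queries:
        return False
    share = (sum(map(is_question_query, clickthrough_queries))
             / len(clickthrough_queries))
    return share >= threshold

print(passes_query_filter(["dissolvable stitches",
                           "how long dissolvable stitches last"]))  # True
```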
Filter by Answer Type
Can a concise answer address this need?
Ask paid crowdsourcing workers to select:
–  Short: a phrase or sentence (today’s focus)
   “The optimal fish frying temperature is 350°F.”
–  List: a small set of directions or alternatives
   “To change your password over Remote Desktop: 1) Click on
   Start > Windows Security. 2) Click the Change Password
   button. [...]”
–  Summary: synthesizes a large amount of content,
   e.g., “Impact of Budget Cuts on Teachers”
Creating Tail Answers
1   Identify answer candidates



2   Filter candidates



3   Extract answer content
Extracting the Tail Answer
We now have answer candidates with:
    Factual responses
    Succinct responses
However, the answer is buried:
                            dissolvable stitches
                            dissolvable stitches how long
                            dissolvable stitches absorption

Crowdsourcing Workflow
Reliably extract relevant answer from the URL
via paid crowdsourcing (CrowdFlower)
Extract   → Vote
Proofread → Vote
Title     → Vote
   [Bernstein et al. 2010, Little et al. 2010]
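
An illustrative skeleton of this stage-plus-vote structure; `collect_candidates` and `collect_votes` are placeholders standing in for real paid-crowd tasks, and the plurality vote is an assumption about how votes are aggregated.

```python
from collections import Counter

def plurality_vote(ballots: list[str]) -> str:
    """Return the candidate that received the most votes."""
    return Counter(ballots).most_common(1)[0][0]

def run_stage(collect_candidates, collect_votes, source: str) -> str:
    """One stage: workers independently produce candidates,
    then other workers vote on the best one."""
    candidates = collect_candidates(source)
    return plurality_vote(collect_votes(candidates))

def create_tail_answer(page_text, collect_candidates, collect_votes):
    extraction = run_stage(collect_candidates, collect_votes, page_text)
    proofread = run_stage(collect_candidates, collect_votes, extraction)
    title = run_stage(collect_candidates, collect_votes, proofread)
    return {"title": title, "answer": proofread}

# Trivial stubs standing in for crowd calls, just to exercise the flow:
stub_workers = lambda text: [text.strip()] * 3   # three identical "workers"
stub_votes = lambda cands: cands                 # one vote per candidate
print(create_tail_answer("  350°F is optimal.  ", stub_workers, stub_votes))
```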
Quality Challenge: Overgenerating

Typical extraction length:
[Illustration: a full page of placeholder text, with the worker’s
extraction highlighted across nearly all of it]
Inclusion/Exclusion Gold Standard
Inclusion/exclusion lists test worker agreement
with a few annotated examples:
      Text that must be present
      Text that must not be present
Implementable via a negative look-ahead regex (see the sketch after this slide)




Gold standard questions: [Le et al. 2010]
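
A minimal sketch of such a check using look-ahead assertions; the required and forbidden strings below are hypothetical examples.

```python
import re

def build_gold_check(must_include: list[str],
                     must_exclude: list[str]) -> re.Pattern:
    """Accept an extraction only if every required string is present
    and no forbidden string appears."""
    excludes = "".join(f"(?!.*{re.escape(s)})" for s in must_exclude)
    includes = "".join(f"(?=.*{re.escape(s)})" for s in must_include)
    return re.compile(f"^{excludes}{includes}", re.IGNORECASE | re.DOTALL)

check = build_gold_check(must_include=["dissolvable stitches"],
                         must_exclude=["advertisement"])
print(bool(check.match("Dissolvable stitches dissolve in 1-2 weeks.")))  # True
print(bool(check.match("Advertisement: stitches on sale today")))        # False
```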
Quality Challenge: Overgenerating

Extraction length with gold standards:
[Illustration: the same page, with the extraction now limited to the
short answer span]
Tail Answers Pipeline
1   Identify answer candidates



2   Filter candidates



3   Extract answer content
75 million search trails → 19,167 answer candidates
Median answer triggered once a day
44 cents to create each answer
Evaluation: Answer Quality
Hand-coded for correctness and writing errors
(two–three redundant coders)

83% of Tail Answers had no writing errors
87% of Tail Answers were completely correct or
had only a minor error (e.g., title not matching the content)
False positives in crowd data:
dynamic web pages
Field Experiment
How do Tail Answers impact searchers’
subjective impressions of the result page?
Method:
Recruited 361 users to issue queries that trigger
Tail Answers on a modified version of Bing
Field Experiment Design
Within-subjects 2×2 design:
     Tail Answers vs. no Tail Answers
     Good Ranking vs. Bad Ranking
Measurement: 7-point Likert responses
    1. The result page is useful
    2. No need to click through to a result
Analysis: linear mixed-effects model
(a generalization of ANOVA; a sketch follows)
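
A hedged sketch of fitting such a model with statsmodels; the file name and columns are assumptions, not the study’s actual data.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical layout: one Likert response per participant x condition,
# with columns participant, answer (0/1), good_ranking (0/1), useful (1-7).
df = pd.read_csv("ratings.csv")

model = smf.mixedlm(
    "useful ~ answer * good_ranking",  # two main effects + interaction
    data=df,
    groups=df["participant"],          # random intercept per participant
)
print(model.fit().summary())
```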
Tail Answers’ Usefulness Is
Comparable to Good Result Ranking
Results

Tail Answers main effect:    0.34 pts (7-pt Likert)
Ranking main effect:         0.68 pts
Interaction effect:          1.03 pts

[Chart: mean usefulness ratings (1–7) by condition; Tail Answers
raise ratings most when ranking is bad]
All results significant at p < 0.001
Answers Make Result Clickthroughs
Less Necessary
Results

Tail Answers main effect:        1.01 pts (7-pt Likert)
Result ranking main effect:      0.50 pts
Interaction effect:              0.91 pts

[Chart: mean “no need to click” ratings (1–7) by condition; Tail
Answers raise ratings most when ranking is bad]
All results significant at p < 0.001
Tail Answers impact
subjective ratings half as
much as good ranking,
and fully compensate for
poor results.
…but we need to improve the trigger queries.
Ongoing Challenges
Spreading incorrect or unverified information




Cannibalizing pageviews from the
original content pages

Extension: A.I.-driven Answers
Use open information extraction systems to
propose answers, and crowds to verify them
Crowd-authored




Authored by AI, verified by crowds
Extension: Better Result Snippets
Improve result pages for popular queries
Automatically extracted




Crowd-authored
Extension: Domain-Specific Answers
Design for specific information needs
Crowds structuring new data types
Direct Answers for Search
Queries in the Long Tail


Crowd data can support many uncommon
user goals in interactive systems.
