SlideShare ist ein Scribd-Unternehmen logo
1 von 21
Learning To Rank
with Apache Solr and Fusion
Trey Grainger
Chief Algorithms Officer
Andy Liu
Senior Data Scientist
The Problem: Classic Similarity
Keyword search isn’t always enough for relevance
The Problem
User searches for “outdoor rock speaker”
Should see this:Sees this:
The Problem
• Improving search relevance is hard,
• TF-IDF and BM25 are good for text-keyword but what about other
models of relevance?
• Text matching is sometimes not the best solution
• Users don’t always say what they mean
The Solution:
Fusion 4 Signals + Solr 7 LTR
The Solution : Learning to Rank Overview
• Learning to rank lets you pick “features” of a document that
“matter” and teach the machine how to rank a set of items.
• One possible source of ordering is user behavior (i.e. the only clicks were
on the speaker shaped like a rock)
• Solr provides a Learning to Rank implementation.
• Fusion provides a way of capturing user behavior through signals.
The Solution: Learning to Rank Overview
The Solution: Learning to Rank Overview
The Solution: Fusion Signals Overview
The Solution
• Define features (relevancy factors)
• Derive Ground Truth using Fusion’s signals
• Use Solr’s Learning to Rank implementation
Some notes
• Fusion’s normal click boosting is an alternative and pretty good
• It is possible to use them together or one where the other
doesn’t work
• Do other more simple things first, learning to rank without an
adequate schema won’t accomplish much.
Some notes
• Using click signals for ground truth
• Pros:
• Voluminous
• Cheap
• Reflects a captive user’s intent (especially when supplemented with purchase, add to
cart events)
• Tacitly, implicitly labeled data the key to an OOTB “self-learning” system
• Cons
• Noisy
• Potential for reinforcing existing ranking
Putting it together
Building an LTR Pipeline
…but is it better?
• Models compared:
• Solr Out-of-the-box BM25
ranking using textual
features only
• Logistic Regression using all
features except the signals
feature
• Logistic Regression using all
features
Why is it better?
• Summary of Benefits:
• LTR offers automated relevancy tuning
• Using Fusion to implement LTR greatly reduces the time and complexity
required to train and deploy LTR models in production
• Leveraging Fusion’s signals as features in an LTR model offers an easy way of
boosting search relevance performance beyond what is possible using textual
features alone
A/B and experiments
• Do this carefully.
• A/B testing is the safest way to make sure you don’t ruin different user
experiences.
• Stay tuned for a future webinar on Experiments and A/B testing
Where to learn more?
• Grab the technical paper (with step by step instructions):
https://lucidworks.com/ebook/learning-to-rank/
• Grab the code: https://github.com/lucidworks/fusion-ltr-
webinar#fusionsolr-setup
Thank you
Register by Sep 6
to save $200
SEPTEMBER 9-12,
2019 WASHINGTON DC
Check out the site here: https:/ / activate-conf.com/
JOIN ANDY AND TREY AT ACTIVATE
• Productionizing Python ML Models Using Fusion 5, Sanket Shahane,
Andy Liu
• Natural Language Search with Knowledge Graphs, Trey Grainger
• Closing Keynote: The Next Generation of AI-powered Search, Trey
Grainger
AI, ML & DATA SCIENCE TRACK
• Supporting Query Tagging/Suggestion in Fusion 4.2, Uber
• Building a Health QA Chatbot with Solr, Healthwise Incorporated
• Tackling a “Small Data” Search Challenge at Airbnb Experiences,
Airbnb
• Using Deep Learning and Customized Solr Components to Improve
Search Relevancy at Target, Target
THE SEARCH AND AI
CONFERENCE
SEPTEMBER 9-
12,2019 WASHINGTON DC
Check out the site here: https://activate-conf.com/
Register by Sep 6
to save $200
LIVE Q&A:
Enter your questions in the chat box now
for Trey to answer live

Weitere ähnliche Inhalte

Ähnlich wie Learning to Rank with Apache Solr and Fusion

50 Shades of Fail KScope16
50 Shades of Fail KScope1650 Shades of Fail KScope16
50 Shades of Fail KScope16Christian Berg
 
Introduction to Test Driven Development
Introduction to Test Driven DevelopmentIntroduction to Test Driven Development
Introduction to Test Driven DevelopmentSiva Arunachalam
 
Sharpest tool in the box: Choosing the right authoring tool for your learning...
Sharpest tool in the box: Choosing the right authoring tool for your learning...Sharpest tool in the box: Choosing the right authoring tool for your learning...
Sharpest tool in the box: Choosing the right authoring tool for your learning...Brightwave Group
 
Retrieval Performance Bound Analysis for Single Term Queries
Retrieval Performance Bound Analysis for Single Term QueriesRetrieval Performance Bound Analysis for Single Term Queries
Retrieval Performance Bound Analysis for Single Term QueriesTwitter Inc.
 
TDD - Seriously, try it! (updated '22)
TDD - Seriously, try it! (updated '22)TDD - Seriously, try it! (updated '22)
TDD - Seriously, try it! (updated '22)Nacho Cougil
 
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Into...
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Into...Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Into...
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Into...Ortus Solutions, Corp
 
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Adob...
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Adob...Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Adob...
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Adob...Uma Ghotikar
 
Avoiding test hell
Avoiding test hellAvoiding test hell
Avoiding test hellYun Ki Lee
 
Tuning ML Models: Scaling, Workflows, and Architecture
Tuning ML Models: Scaling, Workflows, and ArchitectureTuning ML Models: Scaling, Workflows, and Architecture
Tuning ML Models: Scaling, Workflows, and ArchitectureDatabricks
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
Sample_CPT_Presentation-by_Dongwei_Mei.pdf
Sample_CPT_Presentation-by_Dongwei_Mei.pdfSample_CPT_Presentation-by_Dongwei_Mei.pdf
Sample_CPT_Presentation-by_Dongwei_Mei.pdfSURYAPRAKASH281978
 
Mulesoft torronto meetup_16
Mulesoft torronto meetup_16Mulesoft torronto meetup_16
Mulesoft torronto meetup_16Anurag Dwivedi
 
Critical Capabilities to Shifting Left the Right Way
Critical Capabilities to Shifting Left the Right WayCritical Capabilities to Shifting Left the Right Way
Critical Capabilities to Shifting Left the Right WaySmartBear
 
Client Technical Analysis of Legacy Software and Future Replacement
Client Technical Analysis of Legacy Software and Future ReplacementClient Technical Analysis of Legacy Software and Future Replacement
Client Technical Analysis of Legacy Software and Future ReplacementVictorSzoltysek
 
Performance and Abstractions
Performance and AbstractionsPerformance and Abstractions
Performance and AbstractionsMetosin Oy
 
TDD - Seriously, try it! - Trójmiasto Java User Group (17th May '23)
TDD - Seriously, try it! - Trójmiasto Java User Group (17th May '23)TDD - Seriously, try it! - Trójmiasto Java User Group (17th May '23)
TDD - Seriously, try it! - Trójmiasto Java User Group (17th May '23)ssusercaf6c1
 
TDD - Seriously, try it! - Trjjmiasto JUG (17th May '23)
TDD - Seriously, try it! - Trjjmiasto JUG (17th May '23)TDD - Seriously, try it! - Trjjmiasto JUG (17th May '23)
TDD - Seriously, try it! - Trjjmiasto JUG (17th May '23)Nacho Cougil
 
Building trust within the organization, first steps towards DevOps
Building trust within the organization, first steps towards DevOpsBuilding trust within the organization, first steps towards DevOps
Building trust within the organization, first steps towards DevOpsGuido Serra
 

Ähnlich wie Learning to Rank with Apache Solr and Fusion (20)

50 Shades of Fail KScope16
50 Shades of Fail KScope1650 Shades of Fail KScope16
50 Shades of Fail KScope16
 
Introduction to Test Driven Development
Introduction to Test Driven DevelopmentIntroduction to Test Driven Development
Introduction to Test Driven Development
 
odsc_2023.pdf
odsc_2023.pdfodsc_2023.pdf
odsc_2023.pdf
 
Sharpest tool in the box: Choosing the right authoring tool for your learning...
Sharpest tool in the box: Choosing the right authoring tool for your learning...Sharpest tool in the box: Choosing the right authoring tool for your learning...
Sharpest tool in the box: Choosing the right authoring tool for your learning...
 
Retrieval Performance Bound Analysis for Single Term Queries
Retrieval Performance Bound Analysis for Single Term QueriesRetrieval Performance Bound Analysis for Single Term Queries
Retrieval Performance Bound Analysis for Single Term Queries
 
TDD - Seriously, try it! (updated '22)
TDD - Seriously, try it! (updated '22)TDD - Seriously, try it! (updated '22)
TDD - Seriously, try it! (updated '22)
 
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Into...
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Into...Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Into...
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Into...
 
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Adob...
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Adob...Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Adob...
Introduction to Unit Testing, BDD and Mocking using TestBox & MockBox at Adob...
 
Eurosport's Kodakademi #2
Eurosport's Kodakademi #2Eurosport's Kodakademi #2
Eurosport's Kodakademi #2
 
Avoiding test hell
Avoiding test hellAvoiding test hell
Avoiding test hell
 
Tuning ML Models: Scaling, Workflows, and Architecture
Tuning ML Models: Scaling, Workflows, and ArchitectureTuning ML Models: Scaling, Workflows, and Architecture
Tuning ML Models: Scaling, Workflows, and Architecture
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
Sample_CPT_Presentation-by_Dongwei_Mei.pdf
Sample_CPT_Presentation-by_Dongwei_Mei.pdfSample_CPT_Presentation-by_Dongwei_Mei.pdf
Sample_CPT_Presentation-by_Dongwei_Mei.pdf
 
Mulesoft torronto meetup_16
Mulesoft torronto meetup_16Mulesoft torronto meetup_16
Mulesoft torronto meetup_16
 
Critical Capabilities to Shifting Left the Right Way
Critical Capabilities to Shifting Left the Right WayCritical Capabilities to Shifting Left the Right Way
Critical Capabilities to Shifting Left the Right Way
 
Client Technical Analysis of Legacy Software and Future Replacement
Client Technical Analysis of Legacy Software and Future ReplacementClient Technical Analysis of Legacy Software and Future Replacement
Client Technical Analysis of Legacy Software and Future Replacement
 
Performance and Abstractions
Performance and AbstractionsPerformance and Abstractions
Performance and Abstractions
 
TDD - Seriously, try it! - Trójmiasto Java User Group (17th May '23)
TDD - Seriously, try it! - Trójmiasto Java User Group (17th May '23)TDD - Seriously, try it! - Trójmiasto Java User Group (17th May '23)
TDD - Seriously, try it! - Trójmiasto Java User Group (17th May '23)
 
TDD - Seriously, try it! - Trjjmiasto JUG (17th May '23)
TDD - Seriously, try it! - Trjjmiasto JUG (17th May '23)TDD - Seriously, try it! - Trjjmiasto JUG (17th May '23)
TDD - Seriously, try it! - Trjjmiasto JUG (17th May '23)
 
Building trust within the organization, first steps towards DevOps
Building trust within the organization, first steps towards DevOpsBuilding trust within the organization, first steps towards DevOps
Building trust within the organization, first steps towards DevOps
 

Mehr von Lucidworks

Search is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategySearch is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategyLucidworks
 
Drive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceDrive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceLucidworks
 
How Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsHow Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsLucidworks
 
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks
 
Connected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesConnected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesLucidworks
 
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Lucidworks
 
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...Lucidworks
 
Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Lucidworks
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Lucidworks
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteLucidworks
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentLucidworks
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeLucidworks
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Lucidworks
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchLucidworks
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Lucidworks
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyLucidworks
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Lucidworks
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceLucidworks
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchLucidworks
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondLucidworks
 

Mehr von Lucidworks (20)

Search is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategySearch is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce Strategy
 
Drive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceDrive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in Salesforce
 
How Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsHow Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant Products
 
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
 
Connected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesConnected Experiences Are Personalized Experiences
Connected Experiences Are Personalized Experiences
 
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
 
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
 
Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and Rosette
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - Europe
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 Research
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise Search
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and Beyond
 

Kürzlich hochgeladen

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Kürzlich hochgeladen (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Learning to Rank with Apache Solr and Fusion

  • 1. Learning To Rank with Apache Solr and Fusion Trey Grainger Chief Algorithms Officer Andy Liu Senior Data Scientist
  • 2. The Problem: Classic Similarity Keyword search isn’t always enough for relevance
  • 3. The Problem User searches for “outdoor rock speaker” Should see this:Sees this:
  • 4. The Problem • Improving search relevance is hard, • TF-IDF and BM25 are good for text-keyword but what about other models of relevance? • Text matching is sometimes not the best solution • Users don’t always say what they mean
  • 5. The Solution: Fusion 4 Signals + Solr 7 LTR
  • 6. The Solution : Learning to Rank Overview • Learning to rank lets you pick “features” of a document that “matter” and teach the machine how to rank a set of items. • One possible source of ordering is user behavior (i.e. the only clicks were on the speaker shaped like a rock) • Solr provides a Learning to Rank implementation. • Fusion provides a way of capturing user behavior through signals.
  • 7. The Solution: Learning to Rank Overview
  • 8. The Solution: Learning to Rank Overview
  • 9. The Solution: Fusion Signals Overview
  • 10. The Solution • Define features (relevancy factors) • Derive Ground Truth using Fusion’s signals • Use Solr’s Learning to Rank implementation
  • 11. Some notes • Fusion’s normal click boosting is an alternative and pretty good • It is possible to use them together or one where the other doesn’t work • Do other more simple things first, learning to rank without an adequate schema won’t accomplish much.
  • 12. Some notes • Using click signals for ground truth • Pros: • Voluminous • Cheap • Reflects a captive user’s intent (especially when supplemented with purchase, add to cart events) • Tacitly, implicitly labeled data the key to an OOTB “self-learning” system • Cons • Noisy • Potential for reinforcing existing ranking
  • 14. Building an LTR Pipeline
  • 15. …but is it better? • Models compared: • Solr Out-of-the-box BM25 ranking using textual features only • Logistic Regression using all features except the signals feature • Logistic Regression using all features
  • 16. Why is it better? • Summary of Benefits: • LTR offers automated relevancy tuning • Using Fusion to implement LTR greatly reduces the time and complexity required to train and deploy LTR models in production • Leveraging Fusion’s signals as features in an LTR model offers an easy way of boosting search relevance performance beyond what is possible using textual features alone
  • 17. A/B and experiments • Do this carefully. • A/B testing is the safest way to make sure you don’t ruin different user experiences. • Stay tuned for a future webinar on Experiments and A/B testing
  • 18. Where to learn more? • Grab the technical paper (with step by step instructions): https://lucidworks.com/ebook/learning-to-rank/ • Grab the code: https://github.com/lucidworks/fusion-ltr- webinar#fusionsolr-setup
  • 20. Register by Sep 6 to save $200 SEPTEMBER 9-12, 2019 WASHINGTON DC Check out the site here: https:/ / activate-conf.com/ JOIN ANDY AND TREY AT ACTIVATE • Productionizing Python ML Models Using Fusion 5, Sanket Shahane, Andy Liu • Natural Language Search with Knowledge Graphs, Trey Grainger • Closing Keynote: The Next Generation of AI-powered Search, Trey Grainger AI, ML & DATA SCIENCE TRACK • Supporting Query Tagging/Suggestion in Fusion 4.2, Uber • Building a Health QA Chatbot with Solr, Healthwise Incorporated • Tackling a “Small Data” Search Challenge at Airbnb Experiences, Airbnb • Using Deep Learning and Customized Solr Components to Improve Search Relevancy at Target, Target THE SEARCH AND AI CONFERENCE SEPTEMBER 9- 12,2019 WASHINGTON DC Check out the site here: https://activate-conf.com/ Register by Sep 6 to save $200
  • 21. LIVE Q&A: Enter your questions in the chat box now for Trey to answer live