Using LWE/Solr/Lucene for eCom
Grant Ingersoll, Lucid Imagination
@gsingers
Apache Solr and Lucene and their logos are trademarks of the Apache Software Foundation
Difference Makers
  • Case Study 1: Relevance Matters
      Large Electronics Manufacturer
      Top selling product on page 10 for a search by product name
  • Case Study 2: Don't Overthink It
      Large Online Retailer
      Simply adding auto-suggest added millions to the bottom line at very little cost
  • Case Study 3: Test, Test, Test
      Amazon Recommendation System
      http://glinden.blogspot.com/2006/04/early-amazon-shopping-cart.html
Topics
  • The Stack
  • Knowing Users
  • Minimum Features for eCom
  • Extended Features
  • Not Just Search
  • What's Missing?
  • What's Next?
eCom Stack Choices
Users: Get to Know Them!
  • Audience Poll:
      How many of you are developers?
      How many of the developers know what the top 10 queries are on your site?
      How many of the non-developers know?
  • Your users represent 100% of your opportunity to sell your products ;-)
  • Shouldn't you know what they are searching for?
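One way to answer the "top 10 queries" question is to count normalized queries in a search log. The sketch below assumes the log is already available as a plain list of query strings; the function name and the normalization (trim and lowercase) are illustrative choices, not something from the slides.

```python
from collections import Counter


def top_queries(query_log, n=10):
    """Return the n most frequent queries as (query, count) pairs.

    query_log is assumed to be an iterable of raw query strings.
    Queries are trimmed and lowercased so that "iPod" and "ipod "
    are counted together; blank queries are skipped.
    """
    counts = Counter(q.strip().lower() for q in query_log if q.strip())
    return counts.most_common(n)


if __name__ == "__main__":
    sample_log = ["iPod", "ipod ", "hdmi cable", "ipdo"]
    for query, count in top_queries(sample_log, n=10):
        print(f"{count:5d}  {query}")
```

In practice the log would come from the search front end or the engine's request logs, and a real report would likely also break queries down into individual terms.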
Search Analytics
  • "If you can't measure it, you can't manage it"
      Attributed to Peter Drucker; however, see *
  • Ultimately, it's all about conversion
      May not be the best measure for judging search
  • Is there One Right Answer on your site, or multiple?
      Known Item search vs. Keyword/Category
  * http://edkless.com/2009/06/peter-drucker-and-time-sheets/
Useful Metrics
  • Mean Reciprocal Rank or Precision @ 10
      Known Item vs. Keyword/Category
  • "Show me the money" -- Top Product Analysis
      Identity Search: if your top product is named X and someone searches for X, is X on the first page? Is it number 1?
      Is a top product underperforming as it relates to search?
  • Top X Queries and Query Terms
  • Zero Results and % of Zero Results
  • Avg. # of facets/filters/spellchecks clicked per session
  • Avg. # of searches per user session
  • Auto-suggest usage
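As a rough illustration of the first and fourth metrics, the sketch below computes Mean Reciprocal Rank, Precision @ k, and the share of zero-result queries. The input shapes (ranked lists of document ids, sets of relevant ids, a list of hit counts) are assumptions made for this example; in practice the relevant set might be estimated from click-stream data.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """1/rank of the first relevant result, or 0.0 if none is returned."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top k positions that hold a relevant result."""
    top = ranked_ids[:k]
    return sum(1 for doc_id in top if doc_id in relevant_ids) / k


def zero_result_rate(result_counts):
    """Fraction of queries that returned no results at all."""
    result_counts = list(result_counts)
    if not result_counts:
        return 0.0
    return sum(1 for n in result_counts if n == 0) / len(result_counts)


if __name__ == "__main__":
    runs = [
        (["a", "b", "c"], {"a"}),  # known item found at rank 1
        (["x", "y", "z"], {"z"}),  # found at rank 3
        (["p"], {"q"}),            # not found
    ]
    print("MRR:", mean_reciprocal_rank(runs))
    print("P@4:", precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=4))
    print("Zero-result rate:", zero_result_rate([3, 0, 7, 0]))
```

MRR suits known-item searches, where there is one right answer, while Precision @ 10 fits keyword and category searches, where several results can be relevant.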
Minimum Search Features
  • High Quality Relevance for keyword and known item search
      P@10 or MRR close to 1
  • Sub-second response time under load
  • All achievable in LWE/Solr/Lucene
Faceting
  • LWE/Solr support faceting by:
      Field
      Date/Number Ranges
      Pivot ("what if" faceting)
      Hierarchical (via domain modeling)
      Dynamic (via Carrot2)
  • Single and multi-select faceting supported
  • Facet by Function
      In Development: https://issues.apache.org/jira/browse/SOLR-1581
  • http://wiki.apache.org/solr/SimpleFacetParameters
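The field, range, pivot, and multi-select options above map onto Solr request parameters. The sketch below only builds a request URL; the field names (brand, category, price), the price range, the brandTag label, and the /solr/select endpoint are assumptions for illustration, and which parameters are available depends on the Solr version in use.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, query, selected_brand=None):
    """Build a Solr request URL that asks for several kinds of facets.

    Field names and the price range are hypothetical; adjust them to
    the schema at hand.
    """
    params = [
        ("q", query),
        ("facet", "true"),
        # Field facets. The {!ex=...} local param keeps the brand counts
        # unfiltered by the tagged brand filter (multi-select faceting).
        ("facet.field", "{!ex=brandTag}brand"),
        ("facet.field", "category"),
        # Numeric range facet over price, in buckets of 100.
        ("facet.range", "price"),
        ("f.price.facet.range.start", "0"),
        ("f.price.facet.range.end", "1000"),
        ("f.price.facet.range.gap", "100"),
        # Pivot facet: brand counts nested within each category.
        ("facet.pivot", "category,brand"),
    ]
    if selected_brand:
        # Tagged filter query so the brand facet above can exclude it.
        params.append(("fq", '{!tag=brandTag}brand:"%s"' % selected_brand))
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(facet_query_url("http://localhost:8983/solr/select", "ipod", "Apple"))
```

Tagging the filter and excluding it from the facet is what lets a shopper tick several brands while still seeing counts for the brands not yet selected.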
More Features
  • Extensible Language Analysis
      Multilingual Support
      Synonyms
      Overrides on a per-word basis
      Pluggable Framework
  • Frequent/Incremental Updates
      How often do you update your index?
      Near Real Time (IndexReader.open())
  • Column Stride Fields (4.0)
Relevance Controls
  • Function Queries
      Ratings/Reviews
      Margin/Inventory/Price/Location
      Can Sort by Functions
      /solr/browse?q=ipod&bf=price
  • Editorial Controls (QueryElevationComponent)
      Fine grained controls
      /solr/elevate?q=YYYY&enableElevation=true
  • Landing Pages (if done in search)
      Implement: Docs with field that is filtered on or a separate index/core
      Editorial Controls
  • Click Scoring (LWE only)
      Popularity based ranking
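The two request paths on this slide can be assembled programmatically. The sketch below builds the same kind of URLs; the host and port follow the localhost example in the editor's notes, while the rating and margin fields and the product(...) sort function are hypothetical and would need matching fields in the schema. The bf parameter is assumed to be handled by a dismax-style request handler such as /browse.

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # assumed local Solr instance


def boosted_query_url(query, boost_function, handler="/browse"):
    """Additive boost: bf adds a function's value to the relevance score."""
    return SOLR + handler + "?" + urlencode({"q": query, "bf": boost_function})


def function_sort_url(query, sort_function, direction="desc", handler="/browse"):
    """Sort results by the value of a function instead of by score."""
    params = {"q": query, "sort": f"{sort_function} {direction}"}
    return SOLR + handler + "?" + urlencode(params)


def elevated_query_url(query, handler="/elevate"):
    """Ask the QueryElevationComponent to apply its editorial overrides."""
    params = {"q": query, "enableElevation": "true"}
    return SOLR + handler + "?" + urlencode(params)


if __name__ == "__main__":
    print(boosted_query_url("ipod", "price"))
    print(function_sort_url("ipod", "product(rating,margin)"))
    print(elevated_query_url("ipod"))
```

Note that bf=price, as on the slide, pushes more expensive items upward; boosting on ratings, margin, or inventory follows the same pattern with a different function.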
Beyond the Search Box
  • Many eCom sites actually power all navigation by the search engine
  • Many other tools in the Stack to help users discover content
      Auto Suggest
      Spell Checking
      More Like This
      Spatial
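To make the auto-suggest idea concrete, here is a minimal in-memory sketch that suggests past queries by prefix, most frequent first. It is not how Solr or LWE implement suggestions; the function names and the use of a query log as the suggestion source are assumptions for illustration only.

```python
from collections import Counter


def build_suggester(query_log):
    """Return a suggest(prefix, limit) function backed by query counts."""
    counts = Counter(q.strip().lower() for q in query_log if q.strip())

    def suggest(prefix, limit=5):
        prefix = prefix.strip().lower()
        matches = [(q, c) for q, c in counts.items() if q.startswith(prefix)]
        # Most frequent first; ties are broken alphabetically.
        matches.sort(key=lambda pair: (-pair[1], pair[0]))
        return [q for q, _ in matches[:limit]]

    return suggest


if __name__ == "__main__":
    suggest = build_suggester(
        ["ipod", "ipod nano", "ipod", "iphone case", "hdmi cable"]
    )
    print(suggest("ip"))
```

A linear scan like this is only reasonable for small logs; a production suggester would typically rely on the search engine's own suggest or terms features.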
Complementary Tools
  • Apache Mahout
      Recommendation Systems
        Crude Solr/Mahout Rec Integration at https://github.com/gsingers/ApacheCon2010
      Classifiers/Clustering
        User Analysis, Content Analysis, etc.
  • Social
      BazaarVoice, etc.
  • Business Rules Engine
      Drools or others
What's Missing?
  • UI Controls for non-devs:
      Synonyms (LWE has UI/REST support)
      Facets (Field support in LWE)
      Relevance Control (LWE REST API Support)
  • Business Rules Integration
  • Deeper Taxonomy Support
  • More performance reports (LWE has some)
  • Facet Management tools
      Labels
      Sort order other than Count or Alphabetical
      Editorial facet control

What's Next?
  • Some sample code and more discussion at http://www.lucidimagination.com/blog/2011/01/25/implementing-the-ecommerce-checklist-with-apache-solr-and-lucidworks/
Resources
  • Principles for Effective Search in E-Commerce Design
  • http://lucene.li/2T
  • http://www.lucidimagination.com/search/?q=ecommerce
  • grant@lucidimagination.com
  • @gsingers
  • http://www.lucidimagination.com

Editor's Notes

  1. Case 1: Don't think relevance matters? This single result was costing lots of money every single day.
     Case 2: Think about how long it takes to add auto-suggest. How long to add NLP to search?
     Case 3: Take a long-term view, test hypotheses.
  2. Many things can go wrong between search and conversion that aren't related to search.
     Estimate MRR or P@10 based on click stream analysis.
  3. Is a top product underperforming as it relates to search? In other words, is a user less likely to buy when searching for a top product versus other navigation options?
     Also, the usual performance metrics.
     Others?
  4. http://localhost:8983/solr/browse?q=ipod&bf=price
  5. All of these things are fairly easily built