Optimizing Solr Performance and User Experience for E-Commerce
15 April 2010

Slides posted at the end of this presentation; full replay available within ~48 hours of live webcast

Agenda
    Introductions
    Apache Lucene/Solr and eCommerce - Grant
         Lucene/Solr Powered Sales
         Common Scenarios
         Search Needs for eCommerce
    Use Case: Sheet Music Plus - Brian
         Leveraging Solr to sell Music online
         Gauging search success through timely metrics



                Lucid Imagination, Inc. – http://www.lucidimagination.com
Introductions
    Grant Ingersoll
      Co-Founder of Lucid Imagination
      Lucene/Solr/Mahout committer




    Brian Doll
      Search Architect at Sheetmusicplus.com
      12+ years of experience in eCommerce, finance, retail,
      and media



A few Lucene/Solr-Powered Commerce Sites




         Buy.com



eCommerce Scenarios
  Users can’t buy it if they can’t find it!
    Search/Discovery is often mission critical in eCommerce
    Users don’t know how to spell
    Users often don’t even know how to describe it
  Online stores typically look like:
    SKUs (products): 1K to 500K typically, but we have customers with
    10-50M+ SKUs in Solr
    High Volume Search: Millions of queries/day (peaks @ Xmas in US)
    Lots of Metadata with 1 or 2 text fields
    Many product names are “tricky”:
    GPSGolfPro, iPod, Tchaikovsky, Garmin nuvi 255w
eCommerce Checklist for Search
     Keyword search
     High-quality relevance (precision within the top 10 results)
     Faceting/Discovery
     Flexible language analysis tools
       Stemming, protected words, case changes,
       alpha-numeric
     Multilingual support
     Frequent Incremental Updates
       Ratings, Reviews, New Products
     Best Bets
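
The relevance item in the checklist above can be measured with precision at k: the share of the first k results that are actually relevant. A minimal sketch, assuming relevance judgments are available as a set of product IDs (the IDs and the function name are made up for illustration):

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k result slots filled by relevant items."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Dividing by k (not by the number of results returned) penalizes
    # queries that return fewer than k results.
    return hits / k


# Hypothetical result list for one query and its judged-relevant SKUs.
results = ["sku-1", "sku-2", "sku-3", "sku-4", "sku-5"]
relevant = {"sku-1", "sku-3", "sku-9"}
score = precision_at_k(results, relevant, k=5)
```

Averaging this score over a sample of frequent queries gives a single number to track as analysis or boosts change.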
eCommerce Checklist for Search, continued
   Auto-suggest
   Did you mean?
   Related Searches/Items
   Editorial Relevance Controls
     Sales, Margins, Inventory, Fixed results, Exclusions, Ratings
   Admin:
     Scalability, Fault Tolerance, Easy Setup and Config
   Recommendations (See Mahout)
   Analytics and other Business Tools

eCommerce and Solr

     Most of the checklist comes out of the box with Apache Solr
     Primary missing piece: Analytics + high level business tools
       But one could argue that is a feature, not a bug!
       Many people already have their own reporting/analytics
       Often quite easy to hook Solr/Lucene into those solutions




Solr E-Commerce Case Study: Sheetmusicplus.com

   Overview
     About Sheetmusicplus and our Solr infrastructure
     Onsite search: understanding the value
     Understanding your traffic, your search and your customers
     Setting search strategies
     Understanding Customer Value
     Q&A




About sheetmusicplus.com

       Largest selection of sheet music
       12-year-old online business
       600k+ SKUs
       1.5M+ Songs
       Faceted navigation and on-site search with Solr:
        Up to 35 application requests per second this year (~3 million per day)
        Up to 11 Solr queries per second this year (~1 million per day)
       Ruby on Rails app servers running on Passenger / Apache
       MySQL / Memcached / VMWare
       Lots of needles in a huge haystack
What’s the value of on-site search?

     Value of a new prospect
     Value of an existing customer
     eCommerce conversion rate of the search page
     Dollar value of the search page
     Dollar value of top search terms




Why Solr?

     Strategies:
     A set of patterns to match search operations with various
     subsets of use cases to optimize results for different users
     We love that the structure of your data is part of your Solr
     environment (schema.xml)
     And the way you query against that data is part of your app
     Solr provides a simple, flexible API that is easy to
     experiment with




Step 1




         Data                                                    Strategy



Get to know your visitors




   On-site search terms are not as varied as we had thought



Surprisingly, lots of people search like this:




     piano







   While others search like this:
    Piano

     The Clash London Calling Guitar Tablature Medium Difficulty



Understanding your data

     We have very diverse data
         Most product attributes are inconsistent
     We receive product information in 136 different formats
       Product title
       Contributor roles: Artist, Composer, Performer, Arranger, etc.
       Musical Genres, instrumentation, difficulty level, format
       Algorithmically applied facets based on all data elements
       Some items have scores of facets, some have only a few



Analyzing your data

     How might we categorize all of our potential fields?
     What data elements might contain a person’s name?
     What elements might refer to a musical instrument?
     What do we do about unstructured publisher descriptions?


     How can you provide an effective search service
     with the data you have?




A “one size fits all” search strategy may not be ideal.

     piano




     The Clash London Calling Guitar Tablature Medium Difficulty



What if we applied a strategy pattern to our search queries?




How we set up our search strategies

     Each strategy must define the following:
       A unique name and description to help with analytics
       A hash of Solr fields to search, each containing:

        The copy field name
        The type of search (phrase or term searching)
        The boost value
        The slop value


       The default sort order of that particular strategy
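
The strategy definition above can be sketched as a small data structure. This is only an illustration: the talk does not show how the hash was turned into a Solr request, so the mapping onto Solr's DisMax parameters (`qf`, `pf`, `ps`) is an assumption, and the class names, field names, and boost values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FieldRule:
    copy_field: str      # name of the Solr copy field to search
    search_type: str     # "term" or "phrase"
    boost: float         # relative weight of this field
    slop: int = 0        # phrase slop; only used by phrase rules


@dataclass
class SearchStrategy:
    name: str            # unique name, also useful in analytics
    description: str
    fields: list = field(default_factory=list)
    default_sort: str = "score desc"

    def to_solr_params(self, user_query):
        """Translate the strategy into DisMax request parameters (assumed mapping)."""
        term_rules = [r for r in self.fields if r.search_type == "term"]
        phrase_rules = [r for r in self.fields if r.search_type == "phrase"]
        params = {
            "defType": "dismax",
            "q": user_query,
            "qf": " ".join(f"{r.copy_field}^{r.boost}" for r in term_rules),
            "sort": self.default_sort,
        }
        if phrase_rules:
            params["pf"] = " ".join(f"{r.copy_field}^{r.boost}" for r in phrase_rules)
            # DisMax accepts one phrase slop (ps) for all pf fields, so this
            # sketch uses the largest slop among the phrase rules.
            params["ps"] = str(max(r.slop for r in phrase_rules))
        return params


# Hypothetical strategy for artist-style queries.
artist_strategy = SearchStrategy(
    name="artist",
    description="Queries that look like a contributor name",
    fields=[
        FieldRule("contributors_text", "term", 3.0),
        FieldRule("title_text", "phrase", 2.0, slop=2),
    ],
)
params = artist_strategy.to_solr_params("pink floyd")
```

Keeping each strategy as plain data makes it cheap to add a new one or adjust a boost without touching the query-building code.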




Grouping your data

     There are significant performance benefits in
     minimizing the number of fields to search against
     Group the fields that share common traits and have a
     similar level of influence on your search results to boost
     performance
          Example:
          “Contributors” are all equally important to our visitors. Many
          searches include an artist name, but just as often those names
          are composers, arrangers or popular performers. We can
          create a single copy field that aggregates all of those fields
          into one searchable Solr field
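
In Solr itself this aggregation is declared with copyField rules in schema.xml. The effect can be illustrated outside Solr with a short sketch; the source field names and the `contributors` target name below are hypothetical, and unlike copyField this version also drops duplicate names.

```python
# Hypothetical source fields that all hold a person's or group's name.
CONTRIBUTOR_FIELDS = ("artist", "composer", "arranger", "performer")


def add_contributors_field(product):
    """Return a copy of the product document with a combined 'contributors' field."""
    names = []
    for field_name in CONTRIBUTOR_FIELDS:
        value = product.get(field_name)
        if not value:
            continue
        # Accept either a single name or a list of names.
        values = [value] if isinstance(value, str) else list(value)
        for name in values:
            if name not in names:
                names.append(name)
    doc = dict(product)
    doc["contributors"] = names
    return doc


doc = add_contributors_field({
    "title": "London Calling",
    "artist": "The Clash",
    "composer": ["Joe Strummer", "Mick Jones"],
    "performer": "The Clash",
})
```

A query then needs to search only the one combined field instead of four separate ones.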


Step 2




         Data                                                    Strategy



Building a search strategy

      What (copy) fields do you search against?
      What boost is applied to each of those fields?
      What sort order do you default to?
      What other factors might influence your results?
        (We occasionally influence results with sales ranking data)




Having more than one strategy is much easier
than trying to get a single strategy exactly right
Your set of strategies will be unique to your site
We use a classifier algorithm to help determine
which strategy to apply to an incoming
search request
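
The talk does not describe how the classifier works. Purely as an illustration of routing a query to a strategy, here is a lookup-based sketch; the vocabularies, the word-count rule, and the strategy names are all hypothetical, not Sheetmusicplus.com's method.

```python
# Hypothetical vocabularies; a real site would build these from its catalog.
KNOWN_ARTISTS = {"pink floyd", "haydn", "louis armstrong"}
KNOWN_INSTRUMENTS = {"piano", "guitar", "ukulele", "oboe"}
KNOWN_GENRES = {"jazz", "classical", "standards"}


def classify_query(query):
    """Pick a strategy name for a raw search string using simple lookups."""
    # Lowercase and collapse whitespace so "  Piano " matches "piano".
    normalized = " ".join(query.lower().split())
    if normalized in KNOWN_ARTISTS:
        return "artist"
    if normalized in KNOWN_INSTRUMENTS:
        return "instrument"
    if normalized in KNOWN_GENRES:
        return "genre"
    # Assumed rule of thumb: long, specific queries are treated as title searches.
    if len(normalized.split()) >= 4:
        return "title"
    return "default"


strategy_name = classify_query("  Piano ")
```

Because unmatched queries fall through to a default strategy, new categories can be added one at a time as analytics reveal them.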




Implementing these strategies provided an
86% increase in per-search value.


We can now fine-tune individual categories of
searches, as well as specific phrases.




Expanding your portfolio of search strategies

      How are customers searching your data?
      What categories of searches might you need to address?
      Sheetmusicplus.com examples:
        Artist search {Pink Floyd, Haydn, Louis Armstrong}
        Instrument search {Guitar, ukulele, oboe}
        Genre search {Jazz, classical, standards}
        Title search {“I left my heart in San Fernando’s Hideaway”}
      Evaluate: How does this relate to metadata?


Monitor and refine your strategies

      Use Google Analytics (or a similar tool) to understand
      what your users are searching for
      Deliver search strategies to help them find what they want
      Don't bother chasing obscure queries that
      don't generate revenue
      Focus on delivering customer value
      Win!




Track customer value per strategy

     Use Google Analytics to track the value of your strategies
       Add a dummy URL parameter to each of your search results
       indicating which strategy was used during that search
                           /some/resulting/doc?s=4
       In Google Analytics, search your page views for ‘s=4’
        Graph of page views over time, eCommerce value per page, etc.
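
Tagging result links with the strategy parameter can be sketched as follows. The parameter name `s` and the example path come from the slide; the function name is hypothetical, and the sketch assumes a link should carry only one strategy tag, replacing any existing one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_with_strategy(url, strategy_id, param="s"):
    """Return the URL with the strategy parameter set, keeping other parameters."""
    parts = urlsplit(url)
    # Keep existing query parameters, dropping any earlier strategy tag.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, str(strategy_id)))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )


tagged = tag_with_strategy("/some/resulting/doc", 4)
```

Applying this to every link on a results page lets the analytics tool attribute later page views and purchases to the strategy that produced the result.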




Summary

    Data --> Strategy
    Tuning your search algorithm is an endless game, and if
    you focus on pleasing customer X, it'll cost you.
    Don't try to please everybody. Please the people who
    make you money!




Q&A
 Go to http://bit.ly/solr-ecom to download slides;
Full replay available within 48 hours of live webcast




Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

KĂŒrzlich hochgeladen (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Developing and implementing search strategies specifically for e-commerce sites

  • 1. Optimizing Solr performance and User Experience for E-Commerce (15 April 2010)
  • 2. Agenda
      - Introductions
      - Apache Lucene/Solr and eCommerce - Grant
          - Lucene/Solr Powered Sales
          - Common Scenarios
          - Search Needs for eCommerce
      - Use Case: Sheet Music Plus - Brian
          - Leveraging Solr to sell Music online
          - Gauging search success through timely metrics
      - Note: slides posted at the end of this presentation; full replay available within ~48 hours of live webcast
      - Lucid Imagination, Inc. – http://www.lucidimagination.com
  • 3. Introductions
      - Grant Ingersoll
          - Co-Founder of Lucid Imagination
          - Lucene/Solr/Mahout committer
      - Brian Doll
          - Search Architect at Sheetmusicplus.com
          - 12+ years experience in eCommerce, finance, retail and media
  • 4. A few Lucene/Solr-Powered Commerce Sites
      - Buy.com
  • 5. eCommerce Scenarios
      - Users can’t buy it if they can’t find it!
          - Search/Discovery is often mission critical in eCommerce
          - Users don’t know how to spell
          - Users often don’t even know how to describe it
      - Online stores typically look like:
          - SKUs (products): 1K to 500K typically, but we have customers with 10-50M+ SKUs in Solr
          - High Volume Search: Millions of queries/day (peaks @ Xmas in US)
          - Lots of Metadata with 1 or 2 text fields
          - Many product names are “tricky”: GPSGolfPro, iPod, Tchaikovsky, Garmin nuvi 255w
  • 6. eCommerce Checklist for Search
      - Keyword search
      - High Quality relevance (precision @ < 10)
      - Faceting/Discovery
      - Flexible language analysis tools: stemming, protected words, case changes, alpha-numeric
      - Multilingual support
      - Frequent Incremental Updates: Ratings, Reviews, New Products
      - Best Bets
  • 7. eCommerce Checklist for Search, continued
      - Auto-suggest
      - Did you mean?
      - Related Searches/Items
      - Editorial Relevance Controls: Sales, Margins, Inventory, Fixed results, Exclusions, Ratings
      - Admin: Scalability, Fault Tolerance, Easy Setup and Config
      - Recommendations (See Mahout)
      - Analytics and other Business Tools
  • 8. eCommerce and Solr
      - Most of the checklist comes out of the box with Apache Solr
      - Primary missing piece: Analytics + high level business tools
      - But one could argue that is a feature, not a bug!
          - Many people already have their own reporting/analytics
          - Often quite easy to hook in Solr/Lucene to those solutions
  • 9. Solr E-Commerce Case Study: Sheetmusicplus.com
      - Overview
          - About Sheetmusicplus and our Solr infrastructure
          - Onsite search: understanding the value
          - Understanding your traffic, your search and your customers
          - Setting search strategies
          - Understanding Customer Value
          - Q&A
  • 10. About sheetmusicplus.com
      - Largest selection of sheet music
      - 12-year-old online business
      - 600k+ SKUs
      - 1.5M+ Songs
      - Faceted navigation and on-site search with Solr:
          - Up to 35 application requests per second this year/~3 million per day
          - Up to 11 Solr queries per second this year/~1 million per day
      - Ruby on Rails app servers running on Passenger / Apache
      - MySQL / Memcached / VMware
      - Lots of needles in a huge haystack
  • 11. What’s the value of on-site search?
      - Value of a new prospect
      - Value of an existing customer
      - eCommerce conversion rate of the search page
      - Dollar value of the search page
      - Dollar value of top search terms
  • 12. Why Solr?
      - Strategies: A set of patterns to match search operations with various subsets of use cases to optimize results for different users
      - We love that the structure of your data is part of your Solr environment (schema.xml)
      - And the way you query against that data is part of your app
      - Solr provides an easy and flexible API that is easy to experiment with
  • 13. Step 1 Data Strategy
  • 14. Get to know your visitors: on-site search terms are not as varied as we had thought
  • 15. Surprisingly, lots of people search like this: piano
  • 16. Surprisingly, lots of people search like this: piano
      - While others search like this: Piano The Clash London Calling Guitar Tablature Medium Difficulty
  • 17. Understanding your data
      - We have very diverse data
      - Most product attributes are inconsistent
      - We receive product information in 136 different formats
      - Product title
      - Contributor roles: Artist, Composer, Performer, Arranger, etc.
      - Musical Genres, instrumentation, difficulty level, format
      - Algorithmically applied facets based on all data elements
      - Some items have scores of facets, some have only a few
  • 18. Analyzing your data
      - How might we categorize all of our potential fields?
      - What data elements might contain a person’s name?
      - What elements might refer to a musical instrument?
      - What do we do about unstructured publisher descriptions?
      - How can you provide an effective search service with the data you have?
  • 19. A “one size fits all” search strategy may not be ideal.
      - piano
      - The Clash London Calling Guitar Tablature Medium Difficulty
  • 20. What if … we applied a strategy pattern to our search queries?
  • 21. How we set up our search strategies
      - Each strategy must define the following:
          - A unique name and description to help with analytics
          - A hash of Solr fields to search, containing:
              - The copy field name
              - The type of search (phrase or term searching)
              - The boost value
              - The slop value
          - The default sort order of that particular strategy
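
The strategy definition on slide 21 could be modelled roughly as follows. This is a minimal sketch, not the Sheet Music Plus implementation: the class names, the field names `contributors` and `title`, and the choice to render the strategy as DisMax request parameters (`qf`, `pf`, `ps`) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FieldRule:
    """One searchable (copy) field inside a strategy. All names are hypothetical."""
    copy_field: str      # name of the aggregated Solr field
    phrase: bool         # True = phrase search, False = term search
    boost: float         # relative weight of this field
    slop: int = 0        # phrase slop; only meaningful for phrase rules


@dataclass
class SearchStrategy:
    name: str
    description: str
    default_sort: str
    fields: list[FieldRule] = field(default_factory=list)

    def to_params(self, user_query: str) -> dict[str, str]:
        """Translate the strategy into DisMax-style request parameters."""
        term = [r for r in self.fields if not r.phrase]
        phrase = [r for r in self.fields if r.phrase]
        params = {"defType": "dismax", "q": user_query, "sort": self.default_sort}
        if term:
            params["qf"] = " ".join(f"{r.copy_field}^{r.boost:g}" for r in term)
        if phrase:
            params["pf"] = " ".join(f"{r.copy_field}^{r.boost:g}" for r in phrase)
            # DisMax takes a single phrase slop (ps), so use the largest requested.
            params["ps"] = str(max(r.slop for r in phrase))
        return params


# Assumed example strategy: favour phrase matches on contributor names.
artist = SearchStrategy(
    name="artist",
    description="Queries that look like a person or band name",
    default_sort="score desc",
    fields=[
        FieldRule("contributors", phrase=True, boost=10, slop=1),
        FieldRule("title", phrase=False, boost=2),
    ],
)
params = artist.to_params("the clash")
```

Keeping the name and description on the strategy object is what later makes per-strategy analytics possible; the query parameters are derived from it rather than hand-written per request.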
  • 22. Grouping your data
      - There are significant performance benefits in minimizing the number of fields to search against
      - Group the fields that share common traits and have a similar level of influence on your search results to boost performance
      - Example: “Contributors” are all equally important to our visitors. Many searches include an artist name, but just as often those names are composers, arrangers or popular performers.
      - We can create a single copy field that aggregates all of those fields into one searchable Solr field
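
In Solr itself this grouping is normally declared with `copyField` rules in schema.xml. The Python below only illustrates the effect of such a rule on a single document, under assumed field names (`artist`, `composer`, `arranger`, `performer`, `contributors`) and sample data that do not come from the slides.

```python
# Hypothetical source fields that all name a person or group attached to a product.
CONTRIBUTOR_FIELDS = ("artist", "composer", "arranger", "performer")


def with_contributors(product: dict) -> dict:
    """Return a copy of the product with one aggregated 'contributors' field."""
    names = []
    for field_name in CONTRIBUTOR_FIELDS:
        value = product.get(field_name, [])
        # Accept either a single string or a list of strings.
        values = [value] if isinstance(value, str) else list(value)
        for name in values:
            if name and name not in names:  # keep order, drop duplicates
                names.append(name)
    return {**product, "contributors": names}


# Illustrative document only.
doc = with_contributors({
    "title": "London Calling",
    "artist": "The Clash",
    "composer": ["Joe Strummer", "Mick Jones"],
    "performer": "The Clash",
})
```

A query then needs to touch only `contributors` instead of four separate fields, which is the performance point the slide makes.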
  • 23. Step 2 Data Strategy
  • 24. Building a search strategy
      - What (copy) fields do you search against?
      - What boost is applied to each of those fields?
      - What sort order do you default to?
      - What other factors might you influence your results with? (We occasionally influence results with sales ranking data)
  • 25. Having more than one strategy is much easier than trying to get a single one exactly right
      - It’ll be unique to your site
      - We have a classifier algorithm to help determine which strategy we should apply to an incoming search request
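
The slides do not describe how the classifier works, so the following is only a stand-in: a simple lookup-based router that shows where such a component sits. The vocabularies, the strategy names and the length threshold are all invented for the example.

```python
# Hypothetical vocabularies; a real site would derive these from its own catalog.
INSTRUMENTS = {"piano", "guitar", "ukulele", "oboe"}
GENRES = {"jazz", "classical", "standards"}
ARTISTS = {"pink floyd", "haydn", "louis armstrong", "the clash"}


def choose_strategy(query: str) -> str:
    """Pick a strategy name for a raw query using simple lookups."""
    normalized = " ".join(query.lower().split())
    tokens = normalized.split()
    if normalized in ARTISTS:
        return "artist"
    if len(tokens) == 1 and tokens[0] in INSTRUMENTS:
        return "instrument"
    if len(tokens) == 1 and tokens[0] in GENRES:
        return "genre"
    if len(tokens) >= 4:
        # Long, specific queries go to a broad multi-field strategy.
        return "detailed"
    return "title"  # fallback when nothing else matches
```

With this sketch, `choose_strategy("piano")` routes to the instrument strategy, while the long Clash query from the earlier slides routes to the detailed one.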
  • 26. Implementing these strategies provided an 86% increase in per-search value. We can now fine-tune individual categories of searches, as well as specific phrases.
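
The slides do not define "per-search value". One plausible reading, assumed here, is revenue attributed to search divided by the number of searches. The numbers in the sketch are made up and are not Sheet Music Plus data.

```python
def per_search_value(revenue: float, searches: int) -> float:
    """Revenue attributed to search divided by the number of searches."""
    if searches <= 0:
        raise ValueError("searches must be positive")
    return revenue / searches


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Made-up numbers purely for illustration.
before = per_search_value(revenue=1_000.0, searches=10_000)
after = per_search_value(revenue=1_500.0, searches=10_000)
change = percent_change(before, after)
```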
  • 27. Expanding your portfolio of search strategies
      - How are customers searching your data?
      - What categories of searches might you need to address?
      - Sheetmusicplus.com examples:
          - Artist search {Pink Floyd, Haydn, Louis Armstrong}
          - Instrument search {Guitar, ukulele, oboe}
          - Genre search {Jazz, classical, standards}
          - Title search {“I left my heart in San Fernando’s Hideaway”}
      - Evaluate: How does this relate to metadata?
  • 28. Monitor and refine your strategies
      - Use Google Analytics (or a similar tool) to understand what your users are searching for
      - Deliver search strategies to help them find what they want
      - Don't bother chasing obscure queries that don't generate revenue
      - Focus on delivering customer value
      - Win!
  • 29. Track customer value per strategy
      - Use Google Analytics to track the value of your strategies
      - Add a dummy URL parameter to each of your search results indicating which strategy was used during that search: /some/resulting/doc?s=4
      - In Google Analytics, search your page views for ‘s=4’
          - Graph of page views over time, eCommerce value per page, etc.
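
The tagging step on slide 29 might look like this. It is a sketch using only the Python standard library; the parameter name `s` and the path come from the slide, while the helper names and the sample URLs in the counter are assumptions.

```python
from collections import Counter
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_with_strategy(url: str, strategy_id: int, param: str = "s") -> str:
    """Add (or overwrite) the strategy marker in a result URL's query string."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    query.append((param, str(strategy_id)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def strategy_of(url: str, param: str = "s") -> Optional[str]:
    """Read the strategy marker back out of a logged page-view URL."""
    return dict(parse_qsl(urlsplit(url).query)).get(param)


tagged = tag_with_strategy("/some/resulting/doc", 4)

# Count page views per strategy from a (hypothetical) list of logged URLs.
views = Counter(strategy_of(u) for u in
                [tagged, "/other/doc?s=4", "/third/doc?ref=home&s=2"])
```

The same marker can then be used as a filter in whatever analytics tool records the page views, which is the role Google Analytics plays in the talk.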
  • 30. Summary
      - Data --> Strategy
      - Tuning your search algorithm is an endless game, and if you focus on pleasing customer X, it'll cost you.
      - Don't try to please everybody. Please the people who make you money!
  • 31. Q&A
      - Go to http://bit.ly/solr-ecom to download slides; full replay available within 48 hours of live webcast