SOLR Facts



 65% of IT organizations were able to reduce the costs of developing
 and deploying their search application by 50% or more as a result of
 using SOLR




 Source: Survey of 26 Solr/Lucene users conducted by TechValidate




     10/23/2012                © 2011 Crown Partners. All Rights Reserved.   1
SOLR Facts



 43% of IT organizations index or update 1,000,001 to 5,000,000 or
 more documents each week with SOLR.







 Source: Survey of 26 Solr/Lucene users conducted by TechValidate




SOLR Facts



 “We were able to decrease risk by allowing our catalog of 6 million-plus
 items and 50 million user profiles to be searched well beyond the
 possibilities with MySQL.”




 Source: Executive, Small Business Computer Software Company


SOLR Facts



  “With SOLR’s Dis-Max query parser, we were able to
 drastically increase the relevance of returned search
 results.”




 Source: IT Architect, Small Business Media & Entertainment Company




SOLR
Optimizing SOLR to Improve Search

Presented by: Rob Miller, Jason Grim & Ryan Street
08/15/2012




About Crown

 Certified Magento Development Team

 SOLR experts in Advanced Search

 Integration between SOLR and Magento, ERP




Agenda

 Overview of SOLR

 Basic Solr Troubleshooting

  – Common SOLR problems and solutions

 Advanced optimization of search results

  – Making changes in the Solr configuration to improve your results

 Improving search speed

  – Optimizing Magento to improve search speed




Crown’s First SOLR Webinar

 Crown’s SOLR 1.0 Webinar

  – Support for Spelling/Synonyms/Stop Words

  – Improved Layered Navigation

  – September 21, 2011

  – http://bit.ly/solrmagentowebinar




Basic Troubleshooting




Useful SOLR Tools
 Web Interface
 Luke
 Command Line




Magento Cannot Connect to SOLR
 Do you have the right URL and Port?
 Can your SOLR and Magento servers communicate with each other?
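
Both checks on this slide can be scripted. The following is a minimal Python sketch, not Magento's own connection test: it assumes Solr exposes a ping handler at `/solr/admin/ping` (the exact path depends on the Solr version and core layout), and the host and port passed in are placeholders for whatever is configured in Magento.

```python
import urllib.error
import urllib.request


def build_ping_url(host: str, port: int, path: str = "/solr/admin/ping") -> str:
    """Assemble a ping URL from the host, port, and path configured in Magento.

    The default path is an assumption; adjust it to match your Solr install.
    """
    return f"http://{host}:{port}{path}"


def solr_is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts, and HTTP errors.
        return False
```

Running `solr_is_reachable(build_ping_url("localhost", 8983))` from the Magento server itself tests the URL, the port, and the network path between the two machines in one step.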




Magento and Solr Show Bad or No Results
 Bad Data
 Change from Final to Partial Commit
 Check the command-line logs for critical errors during indexing




Where to find more answers…
 Magento Forums
  – http://www.magentocommerce.com/boards/
 Magento Answers

  – http://www.magentocommerce.com/answers/welcome

 Dr. Gento

  – http://www.drgento.com

 And of course… Crown!
  – http://www.crownpartners.com/




Advanced Optimization of Search Results




Configuration: Minimum Must Match

  Must Match Formats

   – 2

   – 75%

   – 2<-25%

   – 2<-1 5<80%

  Setting is language-specific

  Does NOT require a reindex (query-time parameter)
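
To see what each format means in practice, here is a small Python sketch that computes how many optional clauses must match for a given `mm` value. It is an illustrative re-implementation of the documented behaviour, not Solr's own code; the function name is made up for this example and edge cases may differ from a real Solr server.

```python
import re


def min_should_match(optional_clauses: int, spec: str) -> int:
    """Return how many optional clauses must match for a given mm spec."""
    spec = spec.strip()
    result = optional_clauses

    if "<" in spec:
        # Conditional form such as "2<-1 5<80%": each "N<rule" applies only
        # when there are more than N optional clauses; otherwise the result
        # so far stands (all clauses if no threshold was exceeded).
        spec = re.sub(r"\s*<\s*", "<", spec)
        for part in spec.split():
            bound, rule = part.split("<", 1)
            if optional_clauses <= int(bound):
                return result
            result = min_should_match(optional_clauses, rule)
        return result

    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Fractions are truncated, so "75%" rounds the requirement down and
        # "-25%" rounds the number of clauses that may be missing down.
        amount = optional_clauses * abs(percent) // 100
        result = optional_clauses - amount if percent < 0 else amount
    else:
        count = int(spec)
        result = optional_clauses + count if count < 0 else count

    # Never require fewer than 0 or more than the clauses available.
    return max(0, min(optional_clauses, result))
```

For a four-word query, `min_should_match(4, "75%")` and `min_should_match(4, "2<-25%")` both require 3 matching words, while for a three-word query the first requires 2 and the second requires all 3.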




Query Boosting Results

   Boost individual product attributes

   Query time configuration

   Language specific
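
As a rough illustration of query-time boosting, the Python sketch below builds a DisMax request URL. The `qf` (query fields with `^` weights) and `bq` (boost query) parameters are standard DisMax parameters, but every field name and weight shown is hypothetical and would need to match your own schema. The same values can also be set as request-handler defaults in solrconfig.xml instead of being sent with each request.

```python
from urllib.parse import urlencode


def build_dismax_query(user_query: str) -> str:
    """Build a /select URL that boosts some product attributes over others."""
    params = {
        "defType": "dismax",
        "q": user_query,
        # qf: fields to search; a ^weight makes matches in that field count more.
        "qf": "name^5 sku^3 description",  # hypothetical fields and weights
        # bq: an extra clause that raises the score of documents matching it.
        "bq": "in_stock:true^2",           # hypothetical field and weight
        # Only the product ID is requested.
        "fl": "id",
    }
    return "/select?" + urlencode(params)
```

Because these are query-time parameters, the weights can be tuned and retested without reindexing.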




Where to Find More Answers

  Apache’s Wiki

   – http://wiki.apache.org/solr/

  Dr. Gento

   – http://www.drgento.com

  And of course… Crown!

   – http://www.crownpartners.com/




Improving Search Speed



SOLR and Magento Relationship
  User submits a search query

  Magento connects to SOLR and sends over query

  SOLR processes query and returns Magento Product IDs

  Magento loads the product IDs and displays them to the user
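
The two-stage flow above can be sketched in Python with each stage timed separately, which helps show where a slow search spends its time. Everything here is a stand-in: `fake_solr`, `fake_magento`, and the tiny catalog are hypothetical placeholders for the real Solr request and the real Magento product load.

```python
import time
from typing import Callable, Dict, List, Tuple


def timed_search(
    query: str,
    search_index: Callable[[str], List[int]],
    load_products: Callable[[List[int]], List[Dict]],
) -> Tuple[List[Dict], Dict[str, float]]:
    """Run both stages of a search and record how long each one takes."""
    timings: Dict[str, float] = {}

    start = time.perf_counter()
    product_ids = search_index(query)      # stage 1: the index returns IDs only
    timings["solr_seconds"] = time.perf_counter() - start

    start = time.perf_counter()
    products = load_products(product_ids)  # stage 2: the shop loads full records
    timings["magento_seconds"] = time.perf_counter() - start

    return products, timings


# In-memory stand-ins for the two systems (illustrative data only).
CATALOG = {1: {"id": 1, "name": "Red shoe"}, 2: {"id": 2, "name": "Blue shoe"}}


def fake_solr(query: str) -> List[int]:
    return [pid for pid, item in CATALOG.items() if query.lower() in item["name"].lower()]


def fake_magento(ids: List[int]) -> List[Dict]:
    return [CATALOG[pid] for pid in ids]


products, timings = timed_search("shoe", fake_solr, fake_magento)
```

Comparing the two entries in `timings` for a real request shows whether the index lookup or the product load is the slower stage.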




Is SOLR the Problem?

  Check the QTime of a query (reported in milliseconds)

   – /select params={…} hits=79 status=0 QTime=48

  Solr Performance Enhancements

   – http://wiki.apache.org/solr/SolrPerformanceFactors
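
When there are many requests to review, the QTime values can be pulled out of the log with a short script. This Python sketch assumes log lines shaped like the one on this slide (`hits=… status=… QTime=…`); the 500 ms threshold in the usage note is an arbitrary illustration, not a recommendation from the presenters.

```python
import re
from typing import Dict, Iterable, List, Optional

# Matches the trailing part of a request log line such as:
#   /select params={...} hits=79 status=0 QTime=48
LOG_PATTERN = re.compile(
    r"hits=(?P<hits>\d+)\s+status=(?P<status>\d+)\s+QTime=(?P<qtime>\d+)"
)


def parse_request_log(line: str) -> Optional[Dict[str, int]]:
    """Extract hits, status, and QTime (milliseconds) from one log line."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def slow_queries(lines: Iterable[str], threshold_ms: int) -> List[str]:
    """Return the log lines whose QTime exceeds threshold_ms."""
    slow = []
    for line in lines:
        parsed = parse_request_log(line)
        if parsed is not None and parsed["qtime"] > threshold_ms:
            slow.append(line)
    return slow
```

For example, `slow_queries(open("solr.log"), 500)` would list every request that took longer than half a second inside Solr, where `solr.log` stands for wherever your servlet container writes its log. If that list is empty while searches still feel slow, the time is being spent outside Solr.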




MySQL Optimization

  Update MySQL to the latest version

  Make sure your MySQL settings are tuned per Magento’s recommendations

   – http://www.magentocommerce.com/whitepaper/

  Use the Memory (HEAP) storage engine for temp tables

   – http://dev.mysql.com/doc//refman/5.0/en/memory-storage-engine.html

  Leverage MySQL query caching




Where to Find More Answers

  Magento Forums

   – http://www.magentocommerce.com/boards/

  Magento U Performance and Optimization for System Administrators

   – http://www.magentocommerce.com/services/training

  Dr. Gento

   – http://www.drgento.com

  And of course… Crown!

   – http://www.crownpartners.com/




Questions?




Thank You!

 Rob Miller: rmiller@crownpartners.com

 Jason Grim: jgrim@crownpartners.com

 Ryan Street: rstreet@crownpartners.com





Start Your Search Engines: Optimizing Solr to Improve Results


Editor's Notes

  1. Complex migration and integration projects are a backbone of our company.
     – ExactTarget Gold Partner with full integration between ExactTarget and Magento.
     – We’ve built e-commerce sites from the ground up, handled complicated product catalog migrations for large B2B companies, and integrated email, e-commerce, digital experience, and business analytics solutions for B2C retail companies.
  2. For the next hour we will be speaking about the integration of Solr and Magento and making the setup work best for your e-commerce site. Today we are going to go over more advanced topics such as:
     – Basic troubleshooting: useful Solr tools and common problems and solutions.
     – Advanced optimization of search results: making changes in the Solr configuration to improve your results. In the previous presentation we covered modifications made directly in Magento; today we will be covering changes made to Solr.
     – Improving search speed: optimizing Magento to improve search.
  3. Solr is an open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document handling. Magento Enterprise integrates with Solr right out of the box.
  4. We did a more in-depth introduction to Solr in September 2011. You can watch the full video by going to the URL displayed or by going to the webinar section of Magento's website. It covers setup, indexing, and fine-tuning search results through Magento.
  5. So now let us go over useful Solr tools and common problems and solutions.
  6. Web Interface (5 minutes)
     – Schema file: if you make file changes, you can confirm Solr has loaded them by looking for them in this file.
     – Show config file: if you make file changes, you can confirm Solr has loaded them by looking for them in this file.
     – Schema Browser: number of docs in the index; actual indexed fields and some statistics about them.
     – Ping URL: the URL used to test if Solr is running properly.
     – Solr Stats: request handlers used and other high-level stats and configurations.
     – readDir path
     Luke (5 minutes)
     – Lucene Index Browser
     – Tokenized terms for search
     Command Line (5 minutes)
     – Show logs during index
     – Show logs during query
  7. Do you have the right URL and port? For example, the default port is 8080 for Tomcat and 8983 for Jetty.
     – Show the test button.
     – What the button actually does.
     – Ping URL to Solr and the response.
     – PHP setting to fix it and why. (90% of the time it's fixed by this.)
  8. What the problem is: bad data, Solr not committing changes.
     – Final commit vs. partial commit.
     – How to diagnose this issue: tail the log and look for "rollback". It tells you which product ID has the critical error.
  9. Direct configuration changes in Solr to better suit your business needs. There are two different types of settings in Solr: query time and index time. Query-time settings take effect when a query is run; these do not require a reindex of data. Index-time settings are used during indexing; if you change an index-time setting, you must reindex to see the change take effect.
  10. When dealing with queries, there are 3 types of "clauses" that Lucene knows about: mandatory, prohibited, and "optional" (aka "SHOULD"). By default, all words or phrases specified in the "q" param are treated as "optional" clauses unless they are preceded by a "+" or a "-". When dealing with these "optional" clauses, the "mm" option makes it possible to say that a certain minimum number of those clauses must match. Specifying this minimum number can be done in complex ways, equating to ideas like:
     – At least 2 of the optional clauses must match, regardless of how many clauses there are: "2"
     – At least 75% of the optional clauses must match, rounded down: "75%"
     – If there are fewer than 3 optional clauses, they all must match; if there are 3 or more, then 75% must match, rounded up: "2<-25%"
     – If there are fewer than 3 optional clauses, they all must match; for 3 to 5 clauses, one less than the number of clauses must match; for 6 or more clauses, 80% must match, rounded down: "2<-1 5<80%"
     This is modified in the query-time configuration file solrconfig.xml. This setting is language-specific. No reindex will be needed.
  11. There may be situations when products need to be promoted, or boosted, in your search results. With Solr's "Boost Query" parameter this can easily be accomplished. This is modified in the query-time configuration file solrconfig.xml. This setting is language-specific. No reindex will be needed.
  12. How Magento and Solr work together:
     – First of all, it's not Solr, it's Magento.
     – Solr returns product IDs, not data. Magento does the data grabbing.
  13. Check the query-time logging for the QTime, in milliseconds. There is further Solr optimization that we do not have time to cover here; go to: http://wiki.apache.org/solr/SolrPerformanceFactors
  14. Make sure you have the most recent version of MySQL. Make sure your MySQL settings are tuned per Magento's recommendations. Use the Memory (HEAP) storage engine for temp tables. Leverage MySQL query caching as recommended by Magento.