3. We deliver business-driven web services that enable our customers to conduct better business on the Internet. We base our work on our customers’ strategy and needs.
5. About me, Kalle Virta Software architect and developer High performance and complex integrations Almost 10 years in the business Seen Drupal from version 3 A lot of big Drupal sites / systems under my belt
10. Your enhanced stack mem-cached MySQL server Linux + Apache Varnish Apache SOLR Did you notice? It’s still blue.
11. The new guys Varnish is an HTTP cache and does it well – but it doesn’t help at all on your customized-for-every-person social media site. Memcached is a good idea, and you can even use it with Cache Router to cache Drupal stuff, including your own modules, but… it still just caches stuff. SOLR, however, is a different story…
12. SOLR Apache SOLR is a search server built around Lucene (which is a search library) and written in Java. It needs a Java container, e.g. Jetty or Tomcat. Put simply, you can store your content in it in XML form and then search it. SOLR will tokenize and do all kinds of (configurable) magic to the data when indexing it, but it can also store the original data (not always possible with search indexers).
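To make "save your stuff in XML form" concrete, here is a minimal sketch of building one update message in SOLR's `<add><doc>` XML format. The field names (`id`, `title`, `tags`) are made up for illustration; the real ones depend on your schema.xml.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc):
    """Serialize one dict as a SOLR <add><doc>...</doc></add> update message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # A multi-valued field repeats the <field> element once per value.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document; field names are assumptions, not the apachesolr schema.
example = build_add_xml(
    {"id": "node/1", "title": "Laser mouse", "tags": ["mouse", "laser"]}
)
```

The resulting string is what would be POSTed to the server's update handler; how it gets there (HTTP client, authentication) is left out of this sketch.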
13. SOLR for searching Obviously, all the features of SOLR make it well suited for sitewide search. You can actually find stuff with SOLR: all the fields in the search can be biased, that is, you can tune which fields make the score go higher when they contain a hit. SOLR also does one really neat thing for searching…
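Field biasing can be sketched as a query URL that uses SOLR's DisMax `qf` parameter, where each field gets a `field^weight` boost. The base URL, the field names and the weights below are assumptions for illustration, and whether `defType=dismax` is available depends on your SOLR version and solrconfig.xml.

```python
from urllib.parse import urlencode


def build_search_url(base_url, keywords, boosts):
    """Build a select URL where hits in some fields count more than in others."""
    # "title^5.0 body^1.0" means a hit in title weighs five times a hit in body.
    qf = " ".join(f"{field}^{weight}" for field, weight in boosts.items())
    params = {"q": keywords, "defType": "dismax", "qf": qf, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical server address and field names.
url = build_search_url(
    "http://localhost:8983/solr", "laser mouse", {"title": 5.0, "body": 1.0}
)
```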
15. The old advanced search Search mouse Product category Product sub-cat Manufacturer Price range - Search Too many search results (794), narrow your search and try again
16. The faceted search Current search: mouse. Order by price. Sub-category: wireless mice (296), wired mice (96), laser mice (163), Show all. Manufacturer: Logitech (194), Microsoft (36), HP (3), Show all. Price range: 0-50 € (384), 50-100 € (129), 100-300 € (50). Results: Logitech LS1 Laser Mouse, 29 €: A cheap laser mouse that’ll get you through even the most problematic of PowerPoint presentations. Logitech G3 Gaming Mouse, 59 €: A great laser mouse with more buttons than you’ll ever have time to configure. A steal. Microsoft Super Mouse, 49 €: A great mouse from the company that brought you the best product of all times, Windows Me. Apple Mighty Mouse, 129 €: The mouse the image happens to be of. Never tried it. Looks pretty nice, though. Page 1 2 3 4 5 6 7 8 9 10
17. SOLR for faceted searching Apache SOLR lets you facet search results, that is, show possible search filters and give counts for them. Faceting with SOLR can also be achieved in Drupal, and this is where a Drupal contrib module comes into play. With the apachesolr module (http://drupal.org/project/apachesolr) you can do all this with a couple of clicks in your Drupal installation.
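Under the hood, faceting is a handful of request parameters plus a counts section in the response. The sketch below builds the `facet=true` / `facet.field` parameters and turns SOLR's default flat JSON list (value, count, value, count, …) into pairs. The field name `manufacturer` and the sample response are assumptions, with counts borrowed from the mock-up slide.

```python
from urllib.parse import urlencode


def facet_params(keywords, facet_fields):
    """Query string asking SOLR for hit counts per value of each facet field."""
    params = [("q", keywords), ("wt", "json"), ("facet", "true"), ("facet.mincount", "1")]
    # facet.field is repeated once per field to facet on.
    params += [("facet.field", name) for name in facet_fields]
    return urlencode(params)


def parse_facets(response):
    """Turn the flat [value, count, value, count, ...] lists into (value, count) pairs."""
    fields = response.get("facet_counts", {}).get("facet_fields", {})
    return {name: list(zip(flat[0::2], flat[1::2])) for name, flat in fields.items()}


# Hand-written sample in the shape of a SOLR JSON response (not real output).
sample = {"facet_counts": {"facet_fields": {
    "manufacturer": ["Logitech", 194, "Microsoft", 36, "HP", 3],
}}}
facets = parse_facets(sample)
```

Each pair maps directly to one line of a facet block, e.g. "Logitech (194)".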
18. SOLRfy your Drupal search 1/3 Download the SOLR package from http://www.apache.org/dyn/closer.cgi/lucene/solr/ Unpack it and check your server’s firewall settings to allow traffic to port 8983. Check that you have a Java runtime (JRE) installed.
19. SOLRfy your Drupal search 2/3 Then get Drupal’s “apachesolr” module; there are two XML files in the package, solrconfig.xml and schema.xml. Go back to your SOLR directory and rename the example directory to “drupal” so you’ll find it more easily. Drop the two XML files into that drupal/solr/conf directory. Go to that drupal directory and fire up Apache SOLR with “java -jar start.jar”.
20. SOLRfy your Drupal search 3/3 Now you can turn on the “apachesolr” module in Drupal. Tune the SOLR server settings in Drupal, reindex all content and then start clicking on those filtering/faceting settings in apachesolr. You’ll have to turn the facets on as blocks. But your search experience will be something else entirely …and once you see how searching with SOLR works, you’re not going back.
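If the setup on the three slides above doesn't seem to work, a quick sanity check of the two prerequisites (Java on the PATH, something listening on port 8983) can save time. This is a minimal sketch; the host name is an assumption, and an open port only tells you that something answers there, not that it is SOLR.

```python
import shutil
import socket


def java_available():
    """True if a `java` executable can be found on the PATH."""
    return shutil.which("java") is not None


def port_open(host, port, timeout=2.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered by a firewall, or timed out.
        return False
```

Usage would be along the lines of `port_open("localhost", 8983)` on the SOLR box itself, and the same call with the server's address from the Drupal host to test the firewall rule.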
21. Apachesolr module Automatically creates facets for taxonomy terms, for every vocabulary – you can just turn them on. Automatically creates facets for CCK fields using dropdown/radio widgets (i.e. with a set of options). Exposes hooks for CCK fields (to make facets out of them). Exposes a hook for altering the query (to some extent). Easy to use.
22. Faceting without SOLR You can do faceting without SOLR too. The “Faceted search” module will do it for you. But at only 10K nodes, SOLR is three times as fast. With 100K+ nodes, faceted search without SOLR is practically unusable …but for small sites, SOLR is not necessary for faceting.
23. So you can SEARCH with SOLR …but my site does A LOT more
24. SOLRify the rest of your Drupal universe You probably know the performance problems on your site. If it’s somehow personalized, you usually can’t do anything about them with caching. How about using SOLR for it? The Apache Solr Views module (at a very mature “dev” state ;) and Views 3 (dev too) will talk together and integrate with the apachesolr module and its SOLR index. When this is stable and fully functional…
26. But my problems are in my custom modules SELECT title, description, mediatype FROM media LEFT JOIN media_types ON media_type_id = type_id LEFT JOIN media_tag ON media_tag.mid = media.id WHERE name LIKE '%s' OR description LIKE '%s' OR media.id IN (SELECT mid FROM promoted_media)
27. Custom modules Custom modules can be designed with Apache SOLR in mind. When you realize all the potential there is in an indexer that can index XML files, the sky is the limit. Whenever you have a data structure that’s too complex for MySQL to search efficiently – and that happens fairly often – you might benefit from indexing that data into SOLR and using SOLR as the read-only “db”.
28. Custom modules – making SOLR do the reading Tables media, media_revision, media_version, media_workflow, media_tag, tag and files are combined into a single “row” for SOLR to index.
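The single "row" idea can be sketched as a function that collapses one media item and its related rows into one flat document. Every column name here (`revision`, `state`, `name`, `path`) is hypothetical, since the slide only shows table names; the point is the shape of the result, not the schema.

```python
def flatten_media(media, revisions, tags, files):
    """Collapse one media item and its related rows into a single flat document."""
    # Only the newest revision is indexed; older ones stay in the database.
    latest = max(revisions, key=lambda r: r["revision"])
    return {
        "id": f"media/{media['id']}",
        "title": media["title"],
        "description": media["description"],
        "revision": latest["revision"],
        "workflow_state": latest["state"],
        # Joined tables become multi-valued fields instead of JOINs at read time.
        "tags": sorted(t["name"] for t in tags),
        "files": [f["path"] for f in files],
    }


# Made-up rows standing in for the results of the original JOIN-heavy queries.
doc = flatten_media(
    {"id": 7, "title": "Intro video", "description": "Product intro"},
    [{"revision": 1, "state": "draft"}, {"revision": 2, "state": "published"}],
    [{"name": "video"}, {"name": "intro"}],
    [{"path": "media/7/intro.mp4"}],
)
```

The joins still run, but once per write at indexing time rather than on every read.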
29. Custom modules – making SOLR do the reading You know you need a better structure when you can’t avoid running LEFT JOINs or subqueries – and running them gets too slow. When you’ve optimized your code several times and restructuring your database would mean creating a read-optimized cache of everything, then SOLR might be just the thing to get you through.
30. Custom modules – making SOLR do the reading Write → MySQL server → Index → Apache SOLR → Read
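The write/index/read split can be sketched with stand-ins: below, `sqlite3` plays the role of the MySQL server and a tiny in-memory word index plays the role of SOLR. Both substitutions, and all names, are assumptions made so the example runs on its own; a real module would send the document to SOLR instead.

```python
import sqlite3


class ReadIndex:
    """In-memory stand-in for the search server: maps each word to item ids."""

    def __init__(self):
        self.postings = {}

    def index(self, item_id, text):
        for word in text.lower().split():
            self.postings.setdefault(word, set()).add(item_id)

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


def save_media(db, index, title, description):
    """Write to the database first, then push the same data to the read index."""
    cur = db.execute(
        "INSERT INTO media (title, description) VALUES (?, ?)", (title, description)
    )
    db.commit()
    index.index(cur.lastrowid, f"{title} {description}")
    return cur.lastrowid


db = sqlite3.connect(":memory:")  # stand-in for the MySQL server
db.execute("CREATE TABLE media (id INTEGER PRIMARY KEY, title TEXT, description TEXT)")
index = ReadIndex()
first = save_media(db, index, "Laser mouse", "A cheap laser mouse")
second = save_media(db, index, "Keyboard", "Wireless keyboard")
hits = index.search("mouse")  # reads never touch the database
```

The database stays the source of truth, so the index can always be thrown away and rebuilt from it.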
31. Libraries to use with custom modules The apachesolr module uses a SOLR client library written in PHP and licensed under the New BSD license (http://code.google.com/p/solr-php-client/). There’s also a PECL extension, but I’m not aware of any speed comparisons. There are also contrib Drupal modules that give you an API for accessing SOLR.
33. Not a magic bullet 1/2 Apache SOLR is a hassle with all the Java containers and such; you’ll probably have to run it on a separate server. You should always run requests through Drupal or a script that will authenticate and authorize calls to SOLR (SOLR shouldn’t be exposed unless all the data is public). Sometimes the extra server might be better used as an extra MySQL node. Sometimes you can just fix your stuff and make it as fast as it would be on Apache SOLR.
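A script sitting between users and SOLR could look roughly like this: it forwards only the parameters it knows, caps the result size, and adds a server-side filter query for what the user may see. The `access_role` field is hypothetical and would have to exist in your index, and the roles are assumed to come from your own session handling, never from the request.

```python
from urllib.parse import urlencode


def build_safe_params(user_params, roles, max_rows=50):
    """Build the query string sent to SOLR from untrusted input and trusted roles."""
    if not roles:
        raise PermissionError("user has no roles that allow searching")
    # Only q and rows are taken from the user; everything else is dropped.
    params = [("q", str(user_params.get("q", "*:*")))]
    try:
        rows = int(user_params.get("rows", 10))
    except (TypeError, ValueError):
        rows = 10
    params.append(("rows", str(max(1, min(rows, max_rows)))))
    # Server-side filter: the user cannot remove or override it.
    role_clause = " OR ".join(f'"{role}"' for role in sorted(roles))
    params.append(("fq", f"access_role:({role_clause})"))
    return urlencode(params)


query = build_safe_params(
    {"q": "mouse", "rows": "100000", "stream.url": "http://example.invalid/"},
    {"member", "editor"},
)
```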
34. Not a magic bullet 2/2 And then there’s the fact that SOLR is built mainly for the English language. So make sure SOLR will do what you want in the language you want it to do it in.
35. Recap SOLR will right now give your Drupal site a fast, faceted search with really easy setup (thanks to the apachesolr module). SOLR will soon give a boost to the performance and search abilities of your views. SOLR will right now give you a lot more power for searching your custom databases and complicated content types, if used by a module developer. It’s still not a magic bullet – it has its downsides.
36. Sounds easy? Been there, done that? Exove is recruiting. Send your CV to jobs@exove.com
37. Thank you for your time Questions? If you’d rather ask me in private, drop a mail to kalle@exove.com