Introduction to Solr, presented at a Bangkok meetup in April 2014:
http://www.meetup.com/bkk-web/events/172090992/
Covers high-level use cases for Solr. Demos include support for the Thai language (with a GitHub link to the source).
Includes slides showcasing the Solr ecosystem, as well as a couple of ideas for possible Solr-specific learning projects.
11. Understanding full-text search
SELECT *
FROM database
WHERE field LIKE '%word%'
This DOES NOT scale
Instead:
break text into tokens
domain-specific processing (e.g. lower-casing)
build fast-access structures
algorithms for term, phrase, and proximity search
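The steps above can be sketched in a few lines of Python. This is a toy positional inverted index, not how Solr or Lucene implement it; the function names and sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Break text into tokens and lower-case them."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Fast-access structure: token -> {doc_id: [positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token].setdefault(doc_id, []).append(pos)
    return index


def term_search(index, word):
    """Ids of documents that contain the word."""
    return set(index.get(word.lower(), {}))


def phrase_search(index, phrase):
    """Ids of documents where the phrase tokens appear consecutively."""
    tokens = tokenize(phrase)
    if not tokens:
        return set()
    hits = set()
    for doc_id, starts in index.get(tokens[0], {}).items():
        for start in starts:
            if all(start + i in index.get(tok, {}).get(doc_id, [])
                   for i, tok in enumerate(tokens)):
                hits.add(doc_id)
                break
    return hits


def proximity_search(index, first, second, max_distance):
    """Ids of documents where both words occur within max_distance tokens."""
    a = index.get(first.lower(), {})
    b = index.get(second.lower(), {})
    return {d for d in a.keys() & b.keys()
            if any(abs(p - q) <= max_distance for p in a[d] for q in b[d])}


# Hypothetical sample documents.
docs = {
    1: "Solr is a search server",
    2: "Full-text SEARCH does not use LIKE",
    3: "The server runs a full text search",
}
index = build_index(docs)
print(term_search(index, "search"))                    # documents 1, 2 and 3
print(phrase_search(index, "full text search"))        # documents 2 and 3
print(proximity_search(index, "server", "search", 2))  # document 1
```

Each lookup touches only the entries for the query words, instead of scanning every row the way LIKE '%word%' does.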
12. Basic search engine features
Search (Duh!): keyword, phrase, field-specific
Positive and negative terms
Sort: relevancy, recency
Pagination
Compact summary in results
SPEED
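As a rough illustration, most of these features map onto standard Solr request parameters (q, sort, start, rows, fl, hl). The sketch below only builds the request URL. The host, the "collection1" core name, and the field names (name, features, manufacturedate_dt) are assumptions based on the Solr 4 example setup, so adjust them to your own schema.

```python
from urllib.parse import urlencode

# Assumption: local Solr 4 example instance with the default "collection1" core.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def build_search_url(query, page=0, page_size=10, sort="score desc"):
    """Build a select URL covering search, sort, pagination and summaries."""
    params = {
        "q": query,                 # keyword, phrase, field-specific, +/- terms
        "sort": sort,               # "score desc" sorts by relevancy
        "start": page * page_size,  # pagination offset
        "rows": page_size,          # page size
        "fl": "id,name,score",      # compact result: only these fields
        "hl": "true",               # highlighted snippets
        "hl.fl": "name,features",   # fields to highlight (assumed names)
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


# Required term, prohibited term, and a field-specific phrase, second page.
url = build_search_url('+ipod -belkin name:"ipod video"', page=1)
print(url)

# Recency instead of relevancy (assumes a date field with this name exists).
print(build_search_url("ipod", sort="manufacturedate_dt desc"))
```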
13. Advanced search engine features
Facet/taxonomy-based navigation with live counts
Language-specific processing
Domain-specific text processing (WiFi = Wi-Fi = WIFI)
Geographic search
More-like-this, did-you-mean, autocomplete
Scaling/Clustering
NOT web crawling - different, but related
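Two of these ideas, domain-specific normalization and facets with live counts, fit in a small Python sketch. Solr does this through configurable analyzer chains and its faceting component; the code below is only a toy stand-in, and the product data is invented.

```python
import re
from collections import Counter


def normalize(term):
    """Collapse case and punctuation variants: WiFi, Wi-Fi, WIFI -> wifi."""
    return re.sub(r"[^a-z0-9]", "", term.lower())


# Hypothetical catalogue.
products = [
    {"name": "WiFi router", "category": "network"},
    {"name": "Wi-Fi adapter", "category": "network"},
    {"name": "WIFI camera", "category": "camera"},
    {"name": "USB cable", "category": "cable"},
]


def search_with_facets(term):
    """Return matching products plus per-category counts for the matches."""
    key = normalize(term)
    hits = [p for p in products
            if key in {normalize(word) for word in p["name"].split()}]
    facets = Counter(p["category"] for p in hits)
    return hits, facets


hits, facets = search_with_facets("wi-fi")
print(len(hits), dict(facets))  # 3 hits: network 2, camera 1
```

The counts are "live" because they are computed from the current result set rather than from the whole catalogue.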
20. DEMO - Basic
Unzip
Go to example directory
Run Solr
Import some documents from example docs
grep -l store *.xml | xargs ./post.sh
Show off Solr 4 admin panel
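For anyone without a Unix shell, the import step can be approximated in Python. This is a sketch, not a replacement for post.sh: the update URL and the commit=true parameter are assumptions about a local Solr 4 example instance, and the function names are made up.

```python
from pathlib import Path
from urllib.request import Request, urlopen

# Assumption: update handler of a local Solr 4 example instance.
UPDATE_URL = "http://localhost:8983/solr/update?commit=true"


def files_containing(directory, word, pattern="*.xml"):
    """Like `grep -l word *.xml`: list files whose text contains the word."""
    return sorted(
        path for path in Path(directory).glob(pattern)
        if word in path.read_text(encoding="utf-8", errors="ignore")
    )


def post_file(path, url=UPDATE_URL):
    """POST one XML file to Solr; needs a running Solr instance."""
    request = Request(url, data=Path(path).read_bytes(),
                      headers={"Content-Type": "application/xml"})
    with urlopen(request) as response:
        return response.status
```

With Solr running, calling post_file on each path returned by files_containing("exampledocs", "store") should index the same documents as the shell pipeline.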
21. DEMO - Browse handler
Restart Solr with -Dsolr.clustering.enabled=true
Visit http://localhost:8983/solr/browse/
Show off:
Search
Facets - Categories and Ranges
Spatial/Geo-distance
Clusters
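The facet and spatial parts of this demo can also be requested directly. The sketch below builds one such request using standard Solr parameters (facet.field, facet.range.*, and the geofilt filter with sfield, pt, d). The core name, the field names (cat, price, store), the coordinates, and the range bounds are assumptions modelled on the Solr 4 example data.

```python
from urllib.parse import urlencode

# Assumption: local Solr 4 example instance with the default "collection1" core.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def build_facet_geo_url(query="*:*", lat_lon="45.15,-93.85", distance_km=5):
    """Build a URL asking for category facets, price ranges and a geo filter."""
    params = [
        ("q", query),
        ("rows", 10),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.field", "cat"),     # category facet (assumed field name)
        ("facet.mincount", 1),      # hide empty categories
        ("facet.range", "price"),   # range facet (assumed field name)
        ("facet.range.start", 0),
        ("facet.range.end", 600),
        ("facet.range.gap", 50),
        ("fq", "{!geofilt}"),       # keep documents within d km of pt
        ("sfield", "store"),        # location field (assumed field name)
        ("pt", lat_lon),
        ("d", distance_km),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


url = build_facet_geo_url()
print(url)
```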
22. DEMO - Thai specific
Index Thai and English text
Search in English, Thai, auto-transliterated Thai
Show Analysis screen
Code at: https://github.com/arafalov/solr-thai-test
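What the Analysis screen shows can also be requested over HTTP through Solr's field analysis handler. The sketch below only builds the request URL. It assumes the handler is registered at /analysis/field and that a field type named text_th exists, as in the stock Solr 4 example configuration; the linked repository may use different names.

```python
from urllib.parse import urlencode

# Assumption: field analysis handler of a local Solr 4 example instance.
ANALYSIS_URL = "http://localhost:8983/solr/collection1/analysis/field"


def build_analysis_url(text, field_type="text_th"):
    """Build a URL that asks Solr how a field type would tokenize the text."""
    params = {
        "analysis.fieldtype": field_type,  # assumed field type name
        "analysis.fieldvalue": text,       # text to run through the analyzer
        "wt": "json",
    }
    return ANALYSIS_URL + "?" + urlencode(params)


# Thai is written without spaces between words, so tokenization matters.
url = build_analysis_url("ภาษาไทย")
print(url)
```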
24. Start for free
Download, unzip, cd example; java -jar start.jar
Go through the basic tutorial in docs/tutorial.html
Copy the example directory, modify schema.xml until happy
If coming from Elasticsearch, look at example-schemaless
Do NOT follow this path to production
The example schema is a kitchen sink!!!
25. Accelerate your learning
Buy my book - seriously. That’s what it’s for
All code/data is at: https://github.com/arafalov/solr-indexing-book
Buy Solr in Action - just published and a great reference
Use my www.solr-start.com resource and join the mailing list
Join the solr-user mailing list - full of advanced hackers
Watch Lucid Revolution videos for background
Start helping out on Stack Overflow #solr
Blog about what you learned, tweet with #Solr
26. Pick a project - make it happen
Solr + Dart => Better search experience for Dart packages
Solr consultants discovery website
Visualise Solr search request - step by step
Solr + your language => is client library up to date?
ToDoMVC for Solr clients
Package LARGE dataset for others (e.g. Project Gutenberg)
Rebuild lernu.net Esperanto dictionary with Solr backend
27. With Solr, how far can I go?
Cloudera (Big Data) has > $1,000,000,000 USD in investments - opportunities?
8M+ searches/day, 40 languages, 100ms NRT, 1024 cores, 256 shards, 32 servers on #solr at Bloomberg http://bit.ly/1jmG72G (via @FlaxSearch)
28. Other Search-related books
Designing the Search Experience: The Information Architecture of Discovery - by a TwigKit creator +1
Search Analytics for Your Site: Conversations with Your Customers by Louis Rosenfeld - see also Quepid
Enterprise Search by Martin White