SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Downloaden Sie, um offline zu lesen
Introduction to
Apache Solr
Software is eating the world"
The search is eating the software
April 2014
2
Alexandre Rafalovitch
www.outerthoughts.com
Web search engines !
are quite sophisticated
3
4
But the real search needs !
are!
much DEEPER and BROADER
5
Searching code
6
Searching people and companies
7
Searching products
8
Searching library material
9
Searching languages
10
Understanding full-text search
SELECT * 

FROM database

WHERE field LIKE ‘%word%’"
This DOES NOT Scale"
Instead: "
break text into tokens"
domain-specific processing (e.g. lower-casing)"
build fast-access structures"
algorithms for term, phrases, proximity search
11
Basic search engine features
Search (Duh!): keyword, phrase, field-specific"
Positive and negative terms"
Sort: relevancy, recency"
Pagination"
Compact summary in results"
SPEED
12
Advanced search engine features
Facets/Taxonomy - based navigation with live counts"
Language-specific processing"
Domain-specific text processing (WiFi = Wi-Fi = WIFI)"
Geographic search"
More-like-this, did-you-mean, autocomplete"
Scaling/Clustering"
NOT web crawling - different, but related
13
Search engine solutions?
Solr"
Elastic Search"
Xapian"
Sphinx"
Zoie"
Groonga"
Searchdaimon"
{F}lexSearch"
Algolia (SaaS)"
Searchify (SaaS)"
ForageJS"
Lunr.js"
FACT-Finder"
DtSearch"
MarkLogic"
Verity"
Fast"
Most databases"
!
!
…AND MORE
14
Used with permission from SemaText
Open Source Search Evolution
15
Secret Ingredient - Lucene
Solr"
Elastic Search"
Zoie"
SwiftType"
PyLucene (Python wrapper)"
Lucene.net (C# port)
Scalable, high-performance
indexing"
Incremental indexing"
Full-text search"
Information-Retrieval algorithms"
Implemented in Java"
Written in 1999, still going strong
16
Secret Ingredient - Solr
Certified distributions"
LucidWorks"
HelioSearch"
Big Data platforms"
Cloudera"
Hortonworks HDP"
Hosted and SaaS"
Amazon CloudSearch"
WebSolr, SolrHQ, SearchBox
Lucene full-text-search"
XML and REST config"
Schema/Schemaless"
SolrCloud (clustering)"
Caching"
Near real-time"
Rich-document indexing (Tika inside)"
Plugins, components, processors
17
Solr Ecosystem sample
Drupal"
Project Blacklight"
LuxDB"
SolrMeter"
CrafterCMS"
Typo3"
Magenta"
HippoCMS"
ColdFusion"
SolrNet"
DataStax"
Dovecot"
NGData Lily"
Basho Riak"
YaCy"
Apache ManifoldCF"
Apache Camel"
FranzAllegrograph"
BitNami Solr Stack"
Carrot2!
Broadleaf Commerce"
Cloudera CDK!
CodeLibs Fess (フェス)!
Splunk"
Alfresco"
Rosette by BasisTech!
Luwak by Flax!
Quepid by OSC!
TwigKit!
SPM by SemaText!
SILK by LucidWorks!
Banana (O/S Solr
Kibana)
18
DEMO Time
19
DEMO - Basic
Unzip"
Go to example directory"
Run Solr"
Import some documents from example docs"
grep -l store *.xml | xargs ./post.sh"
Show off Solr 4 admin panel
20
DEMO - Browse handler
Restart Solr with -Dsolr.clustering.enabled=true"
Visit http://localhost:8983/solr/browse/ "
Show off"
Search"
Facets - Categories and Ranges"
Spatial/Geo-distance"
Clusters
21
DEMO - Thai specific
Index Thai and English text"
Search in English, Thai,Auto-transliterated Thai"
ShowAnalysis screen"
Code at: https://github.com/arafalov/solr-thai-test
22
Getting into Solr
23
Start for free
Download, unzip, cd example; java -jar start.jar"
Go through basic tutorial in docs/tutorial.html"
Copy example directory, modify schema.xml until happy"
If coming from ElasticSearch, look at example-schemaless"
Do NOT follow this path to production"
example schema is a kitchen sink !!!
24
Accelerate your learning
Buy my book - seriously. That’s what it’s for"
All code/data is at: https://github.com/arafalov/solr-indexing-book "
Buy Solr InAction - just published and is a great reference"
Use my www.solr-start.com resource and join the mailing list"
Join solr-user mailing list - full of advanced hackers"
Watch Lucid Revolution videos for background"
Start helping out on Stack Overflow #solr"
Blog what you learned, twit with #Solr
25
Pick a project - make it happen
Solr + Dart => Better search experience for Dart packages"
Solr consultants discovery website"
Visualise Solr search request - step by step"
Solr + your language => is client library up to date?"
ToDoMVC for Solr clients"
Package LARGE dataset for others (e.g. Project Gutenberg)"
Rebuild lernu.net Esperanto dictionary with Solr backend
26
With Solr, how far can I go?
Cloudera (BigData) has > 1,000,000,000 $USD
investments - opportunities?"
8M+ searches/day, 40 languages, 100ms NRT, 1024 cores,
256 shards, 32 servers on #solr at Bloomberg http://bit.ly/
1jmG72G (via @FlaxSearch)
27
Other Search-related books
Designing the Search Experience: The Information
Architecture of Discovery - by a TwigKit creator +1"
SearchAnalytics for Your Site: Conversations with Your
Customers by Louis Rosenfeld - see also Quepid"
Enterprise Search by Martin White
28
29
Alexandre Rafalovitch
www.outerthoughts.com

Weitere ähnliche Inhalte

Was ist angesagt?

Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Enterprise Search Using Apache Solr
Enterprise Search Using Apache SolrEnterprise Search Using Apache Solr
Enterprise Search Using Apache Solrsagar chaturvedi
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesRahul Jain
 
Building Intelligent Search Applications with Apache Solr and PHP5
Building Intelligent Search Applications with Apache Solr and PHP5Building Intelligent Search Applications with Apache Solr and PHP5
Building Intelligent Search Applications with Apache Solr and PHP5israelekpo
 
Introduction to Apache Solr.
Introduction to Apache Solr.Introduction to Apache Solr.
Introduction to Apache Solr.ashish0x90
 
Apache Solr/Lucene Internals by Anatoliy Sokolenko
Apache Solr/Lucene Internals  by Anatoliy SokolenkoApache Solr/Lucene Internals  by Anatoliy Sokolenko
Apache Solr/Lucene Internals by Anatoliy SokolenkoProvectus
 
20130310 solr tuorial
20130310 solr tuorial20130310 solr tuorial
20130310 solr tuorialChris Huang
 
New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache SolrEdureka!
 
Battle of the giants: Apache Solr vs ElasticSearch
Battle of the giants: Apache Solr vs ElasticSearchBattle of the giants: Apache Solr vs ElasticSearch
Battle of the giants: Apache Solr vs ElasticSearchRafał Kuć
 
Intro to Apache Lucene and Solr
Intro to Apache Lucene and SolrIntro to Apache Lucene and Solr
Intro to Apache Lucene and SolrGrant Ingersoll
 
Data Science with Solr and Spark
Data Science with Solr and SparkData Science with Solr and Spark
Data Science with Solr and SparkLucidworks
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrTrey Grainger
 
Introduction to Apache solr
Introduction to Apache solrIntroduction to Apache solr
Introduction to Apache solrKnoldus Inc.
 
Introduction Apache Solr & PHP
Introduction Apache Solr & PHPIntroduction Apache Solr & PHP
Introduction Apache Solr & PHPHiraq Citra M
 
Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!Murshed Ahmmad Khan
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Alexandre Rafalovitch
 
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...Lucidworks
 

Was ist angesagt? (20)

Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash course
 
Enterprise Search Using Apache Solr
Enterprise Search Using Apache SolrEnterprise Search Using Apache Solr
Enterprise Search Using Apache Solr
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
 
Solr Architecture
Solr ArchitectureSolr Architecture
Solr Architecture
 
Building Intelligent Search Applications with Apache Solr and PHP5
Building Intelligent Search Applications with Apache Solr and PHP5Building Intelligent Search Applications with Apache Solr and PHP5
Building Intelligent Search Applications with Apache Solr and PHP5
 
Introduction to Apache Solr.
Introduction to Apache Solr.Introduction to Apache Solr.
Introduction to Apache Solr.
 
Apache Solr/Lucene Internals by Anatoliy Sokolenko
Apache Solr/Lucene Internals  by Anatoliy SokolenkoApache Solr/Lucene Internals  by Anatoliy Sokolenko
Apache Solr/Lucene Internals by Anatoliy Sokolenko
 
20130310 solr tuorial
20130310 solr tuorial20130310 solr tuorial
20130310 solr tuorial
 
New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache Solr
 
Battle of the giants: Apache Solr vs ElasticSearch
Battle of the giants: Apache Solr vs ElasticSearchBattle of the giants: Apache Solr vs ElasticSearch
Battle of the giants: Apache Solr vs ElasticSearch
 
Intro to Apache Lucene and Solr
Intro to Apache Lucene and SolrIntro to Apache Lucene and Solr
Intro to Apache Lucene and Solr
 
Data Science with Solr and Spark
Data Science with Solr and SparkData Science with Solr and Spark
Data Science with Solr and Spark
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solr
 
Introduction to Apache solr
Introduction to Apache solrIntroduction to Apache solr
Introduction to Apache solr
 
Apache Solr
Apache SolrApache Solr
Apache Solr
 
Introduction Apache Solr & PHP
Introduction Apache Solr & PHPIntroduction Apache Solr & PHP
Introduction Apache Solr & PHP
 
Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!Apache Solr! Enterprise Search Solutions at your Fingertips!
Apache Solr! Enterprise Search Solutions at your Fingertips!
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
 
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
 
Building a Search Engine Using Lucene
Building a Search Engine Using LuceneBuilding a Search Engine Using Lucene
Building a Search Engine Using Lucene
 

Andere mochten auch

Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache SolrChristos Manios
 
Using Apache Solr
Using Apache SolrUsing Apache Solr
Using Apache Solrpittaya
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineTrey Grainger
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginsearchbox-com
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered LuceneErik Hatcher
 
Solr: Search at the Speed of Light
Solr: Search at the Speed of LightSolr: Search at the Speed of Light
Solr: Search at the Speed of LightErik Hatcher
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solrTrey Grainger
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solrlucenerevolution
 
Configuring Apache Solr for Thai Text Search
Configuring Apache Solr for Thai Text SearchConfiguring Apache Solr for Thai Text Search
Configuring Apache Solr for Thai Text Searchsagarote
 
OSDF 2013 - Autopsy 3: Extensible Desktop Forensics by Brian Carrier
OSDF 2013 - Autopsy 3: Extensible Desktop Forensics by Brian CarrierOSDF 2013 - Autopsy 3: Extensible Desktop Forensics by Brian Carrier
OSDF 2013 - Autopsy 3: Extensible Desktop Forensics by Brian CarrierBasis Technology
 
Optimizing multilingual search in SOLR
Optimizing multilingual search in SOLROptimizing multilingual search in SOLR
Optimizing multilingual search in SOLRBasis Technology
 
Rosette Search Essentials for Elasticsearch
Rosette Search Essentials for ElasticsearchRosette Search Essentials for Elasticsearch
Rosette Search Essentials for ElasticsearchBasis Technology
 
Moving Beyond Entity Extraction to Entity Resolution - Human Language Technol...
Moving Beyond Entity Extraction to Entity Resolution - Human Language Technol...Moving Beyond Entity Extraction to Entity Resolution - Human Language Technol...
Moving Beyond Entity Extraction to Entity Resolution - Human Language Technol...Basis Technology
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to SolrErik Hatcher
 
Solr introduction
Solr introductionSolr introduction
Solr introductionLap Tran
 
Simple fuzzy Name Matching in Elasticsearch - Graham Morehead
Simple fuzzy Name Matching in Elasticsearch - Graham MoreheadSimple fuzzy Name Matching in Elasticsearch - Graham Morehead
Simple fuzzy Name Matching in Elasticsearch - Graham MoreheadBasis Technology
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationslucenerevolution
 

Andere mochten auch (20)

Solr Presentation
Solr PresentationSolr Presentation
Solr Presentation
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Using Apache Solr
Using Apache SolrUsing Apache Solr
Using Apache Solr
 
Building a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engineBuilding a real time, solr-powered recommendation engine
Building a real time, solr-powered recommendation engine
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component plugin
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered Lucene
 
Solr: Search at the Speed of Light
Solr: Search at the Speed of LightSolr: Search at the Speed of Light
Solr: Search at the Speed of Light
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solr
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solr
 
Solr Flair
Solr FlairSolr Flair
Solr Flair
 
Configuring Apache Solr for Thai Text Search
Configuring Apache Solr for Thai Text SearchConfiguring Apache Solr for Thai Text Search
Configuring Apache Solr for Thai Text Search
 
OSDF 2013 - Autopsy 3: Extensible Desktop Forensics by Brian Carrier
OSDF 2013 - Autopsy 3: Extensible Desktop Forensics by Brian CarrierOSDF 2013 - Autopsy 3: Extensible Desktop Forensics by Brian Carrier
OSDF 2013 - Autopsy 3: Extensible Desktop Forensics by Brian Carrier
 
Optimizing multilingual search in SOLR
Optimizing multilingual search in SOLROptimizing multilingual search in SOLR
Optimizing multilingual search in SOLR
 
Rosette Search Essentials for Elasticsearch
Rosette Search Essentials for ElasticsearchRosette Search Essentials for Elasticsearch
Rosette Search Essentials for Elasticsearch
 
Moving Beyond Entity Extraction to Entity Resolution - Human Language Technol...
Moving Beyond Entity Extraction to Entity Resolution - Human Language Technol...Moving Beyond Entity Extraction to Entity Resolution - Human Language Technol...
Moving Beyond Entity Extraction to Entity Resolution - Human Language Technol...
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Solr introduction
Solr introductionSolr introduction
Solr introduction
 
Simple fuzzy Name Matching in Elasticsearch - Graham Morehead
Simple fuzzy Name Matching in Elasticsearch - Graham MoreheadSimple fuzzy Name Matching in Elasticsearch - Graham Morehead
Simple fuzzy Name Matching in Elasticsearch - Graham Morehead
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 

Ähnlich wie Introduction to Apache Solr

Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher lucenerevolution
 
Search Intelligence @elo7.com
Search Intelligence @elo7.comSearch Intelligence @elo7.com
Search Intelligence @elo7.comFernando Meyer
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
PyCon India 2012: Rapid development of website search in python
PyCon India 2012: Rapid development of website search in pythonPyCon India 2012: Rapid development of website search in python
PyCon India 2012: Rapid development of website search in pythonChetan Giridhar
 
How to build a custom search engine
How to build a custom search engineHow to build a custom search engine
How to build a custom search enginesearchbox-com
 
Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for DrupalChris Caple
 
State-of-the-Art Drupal Search with Apache Solr
State-of-the-Art Drupal Search with Apache SolrState-of-the-Art Drupal Search with Apache Solr
State-of-the-Art Drupal Search with Apache SolrRobert Douglass
 
Best Great Ideas on Java Research Papers
Best Great Ideas on Java Research PapersBest Great Ideas on Java Research Papers
Best Great Ideas on Java Research Paperssuzanneriverabme
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialSourcesense
 
SplunkLive London 2014 Developer Presentation
SplunkLive London 2014  Developer PresentationSplunkLive London 2014  Developer Presentation
SplunkLive London 2014 Developer PresentationDamien Dallimore
 
Open source library management software
Open source library management softwareOpen source library management software
Open source library management softwareAnn Marie Pipkin
 
Content Outside of CONTENTdm: Part 1: Exhibit Creation Tool using <b>...&l...
Content Outside of CONTENTdm: Part 1: Exhibit Creation Tool using <b>...&l...Content Outside of CONTENTdm: Part 1: Exhibit Creation Tool using <b>...&l...
Content Outside of CONTENTdm: Part 1: Exhibit Creation Tool using <b>...&l...tutorialsruby
 
<img src="../i/r_14.png" />
<img src="../i/r_14.png" /><img src="../i/r_14.png" />
<img src="../i/r_14.png" />tutorialsruby
 

Ähnlich wie Introduction to Apache Solr (20)

Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher
 
Solr Masterclass Bangkok, June 2014
Solr Masterclass Bangkok, June 2014Solr Masterclass Bangkok, June 2014
Solr Masterclass Bangkok, June 2014
 
Search Intelligence @elo7.com
Search Intelligence @elo7.comSearch Intelligence @elo7.com
Search Intelligence @elo7.com
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
PyCon India 2012: Rapid development of website search in python
PyCon India 2012: Rapid development of website search in pythonPyCon India 2012: Rapid development of website search in python
PyCon India 2012: Rapid development of website search in python
 
How to build a custom search engine
How to build a custom search engineHow to build a custom search engine
How to build a custom search engine
 
Intro to Apache Solr for Drupal
Intro to Apache Solr for DrupalIntro to Apache Solr for Drupal
Intro to Apache Solr for Drupal
 
State-of-the-Art Drupal Search with Apache Solr
State-of-the-Art Drupal Search with Apache SolrState-of-the-Art Drupal Search with Apache Solr
State-of-the-Art Drupal Search with Apache Solr
 
Bollean Search - NageshRao
Bollean Search - NageshRaoBollean Search - NageshRao
Bollean Search - NageshRao
 
Best Great Ideas on Java Research Papers
Best Great Ideas on Java Research PapersBest Great Ideas on Java Research Papers
Best Great Ideas on Java Research Papers
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr Tutorial
 
SplunkLive London 2014 Developer Presentation
SplunkLive London 2014  Developer PresentationSplunkLive London 2014  Developer Presentation
SplunkLive London 2014 Developer Presentation
 
Open source library management software
Open source library management softwareOpen source library management software
Open source library management software
 
Content Outside of CONTENTdm: Part 1: Exhibit Creation Tool using <b>...&l...
Content Outside of CONTENTdm: Part 1: Exhibit Creation Tool using <b>...&l...Content Outside of CONTENTdm: Part 1: Exhibit Creation Tool using <b>...&l...
Content Outside of CONTENTdm: Part 1: Exhibit Creation Tool using <b>...&l...
 
<img src="../i/r_14.png" />
<img src="../i/r_14.png" /><img src="../i/r_14.png" />
<img src="../i/r_14.png" />
 
psager
psagerpsager
psager
 
psager
psagerpsager
psager
 
Solr 8 interview
Solr 8 interview Solr 8 interview
Solr 8 interview
 

Mehr von Alexandre Rafalovitch

From content to search: speed-dating Apache Solr (ApacheCON 2018)
From content to search: speed-dating Apache Solr (ApacheCON 2018)From content to search: speed-dating Apache Solr (ApacheCON 2018)
From content to search: speed-dating Apache Solr (ApacheCON 2018)Alexandre Rafalovitch
 
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasks
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasksSearching for AI - Leveraging Solr for classic Artificial Intelligence tasks
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasksAlexandre Rafalovitch
 
Rapid Solr Schema Development (Phone directory)
Rapid Solr Schema Development (Phone directory)Rapid Solr Schema Development (Phone directory)
Rapid Solr Schema Development (Phone directory)Alexandre Rafalovitch
 
Solr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachSolr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachAlexandre Rafalovitch
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseAlexandre Rafalovitch
 

Mehr von Alexandre Rafalovitch (6)

JSON in Solr: from top to bottom
JSON in Solr: from top to bottomJSON in Solr: from top to bottom
JSON in Solr: from top to bottom
 
From content to search: speed-dating Apache Solr (ApacheCON 2018)
From content to search: speed-dating Apache Solr (ApacheCON 2018)From content to search: speed-dating Apache Solr (ApacheCON 2018)
From content to search: speed-dating Apache Solr (ApacheCON 2018)
 
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasks
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasksSearching for AI - Leveraging Solr for classic Artificial Intelligence tasks
Searching for AI - Leveraging Solr for classic Artificial Intelligence tasks
 
Rapid Solr Schema Development (Phone directory)
Rapid Solr Schema Development (Phone directory)Rapid Solr Schema Development (Phone directory)
Rapid Solr Schema Development (Phone directory)
 
Solr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachSolr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approach
 
Solr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by CaseSolr vs. Elasticsearch - Case by Case
Solr vs. Elasticsearch - Case by Case
 

Kürzlich hochgeladen

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Association for Project Management
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxnegromaestrong
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 

Kürzlich hochgeladen (20)

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 

Introduction to Apache Solr

  • 1. Introduction to Apache Solr Software is eating the world" The search is eating the software April 2014
  • 3. Web search engines ! are quite sophisticated 3
  • 4. 4
  • 5. But the real search needs ! are! much DEEPER and BROADER 5
  • 7. Searching people and companies 7
  • 11. Understanding full-text search SELECT * 
 FROM database
 WHERE field LIKE ‘%word%’" This DOES NOT Scale" Instead: " break text into tokens" domain-specific processing (e.g. lower-casing)" build fast-access structures" algorithms for term, phrases, proximity search 11
  • 12. Basic search engine features Search (Duh!): keyword, phrase, field-specific" Positive and negative terms" Sort: relevancy, recency" Pagination" Compact summary in results" SPEED 12
  • 13. Advanced search engine features Facets/Taxonomy - based navigation with live counts" Language-specific processing" Domain-specific text processing (WiFi = Wi-Fi = WIFI)" Geographic search" More-like-this, did-you-mean, autocomplete" Scaling/Clustering" NOT web crawling - different, but related 13
  • 14. Search engine solutions? Solr" Elastic Search" Xapian" Sphinx" Zoie" Groonga" Searchdaimon" {F}lexSearch" Algolia (SaaS)" Searchify (SaaS)" ForageJS" Lunr.js" FACT-Finder" DtSearch" MarkLogic" Verity" Fast" Most databases" ! ! …AND MORE 14
  • 15. Used with permission from SemaText Open Source Search Evolution 15
  • 16. Secret Ingredient - Lucene Solr" Elastic Search" Zoie" SwiftType" PyLucene (Python wrapper)" Lucene.net (C# port) Scalable, high-performance indexing" Incremental indexing" Full-text search" Information-Retrieval algorithms" Implemented in Java" Written in 1999, still going strong 16
  • 17. Secret Ingredient - Solr Certified distributions" LucidWorks" HelioSearch" Big Data platforms" Cloudera" Hortonworks HDP" Hosted and SaaS" Amazon CloudSearch" WebSolr, SolrHQ, SearchBox Lucene full-text-search" XML and REST config" Schema/Schemaless" SolrCloud (clustering)" Caching" Near real-time" Rich-document indexing (Tika inside)" Plugins, components, processors 17
  • 18. Solr Ecosystem sample Drupal" Project Blacklight" LuxDB" SolrMeter" CrafterCMS" Typo3" Magenta" HippoCMS" ColdFusion" SolrNet" DataStax" Dovecot" NGData Lily" Basho Riak" YaCy" Apache ManifoldCF" Apache Camel" FranzAllegrograph" BitNami Solr Stack" Carrot2! Broadleaf Commerce" Cloudera CDK! CodeLibs Fess (フェス)! Splunk" Alfresco" Rosette by BasisTech! Luwak by Flax! Quepid by OSC! TwigKit! SPM by SemaText! SILK by LucidWorks! Banana (O/S Solr Kibana) 18
  • 20. DEMO - Basic Unzip" Go to example directory" Run Solr" Import some documents from example docs" grep -l store *.xml | xargs ./post.sh" Show off Solr 4 admin panel 20
  • 21. DEMO - Browse handler Restart Solr with -Dsolr.clustering.enabled=true" Visit http://localhost:8983/solr/browse/ " Show off" Search" Facets - Categories and Ranges" Spatial/Geo-distance" Clusters 21
  • 22. DEMO - Thai specific Index Thai and English text" Search in English, Thai,Auto-transliterated Thai" ShowAnalysis screen" Code at: https://github.com/arafalov/solr-thai-test 22
  • 24. Start for free Download, unzip, cd example; java -jar start.jar" Go through basic tutorial in docs/tutorial.html" Copy example directory, modify schema.xml until happy" If coming from ElasticSearch, look at example-schemaless" Do NOT follow this path to production" example schema is a kitchen sink !!! 24
  • 25. Accelerate your learning Buy my book - seriously. That’s what it’s for" All code/data is at: https://github.com/arafalov/solr-indexing-book " Buy Solr InAction - just published and is a great reference" Use my www.solr-start.com resource and join the mailing list" Join solr-user mailing list - full of advanced hackers" Watch Lucid Revolution videos for background" Start helping out on Stack Overflow #solr" Blog what you learned, twit with #Solr 25
  • 26. Pick a project - make it happen Solr + Dart => Better search experience for Dart packages" Solr consultants discovery website" Visualise Solr search request - step by step" Solr + your language => is client library up to date?" ToDoMVC for Solr clients" Package LARGE dataset for others (e.g. Project Gutenberg)" Rebuild lernu.net Esperanto dictionary with Solr backend 26
  • 27. With Solr, how far can I go? Cloudera (BigData) has > 1,000,000,000 $USD investments - opportunities?" 8M+ searches/day, 40 languages, 100ms NRT, 1024 cores, 256 shards, 32 servers on #solr at Bloomberg http://bit.ly/ 1jmG72G (via @FlaxSearch) 27
  • 28. Other Search-related books Designing the Search Experience: The Information Architecture of Discovery - by a TwigKit creator +1" SearchAnalytics for Your Site: Conversations with Your Customers by Louis Rosenfeld - see also Quepid" Enterprise Search by Martin White 28