Elasticsearch and SolrCloud 
a performance comparison 
Tom Mortimer - Technical Director 
27th November 2014 
charlie@flax.co.uk 
www.flax.co.uk/blog 
+44 (0) 8700 118334 
Twitter: @FlaxSearch
Who are Flax? 
We design, build and support open source powered 
search applications 
Based in Cambridge U.K., technology agnostic & 
independent – but open source exponents & committers 
UK Authorized Partner of 
Customers include Reed Specialist Recruitment, Mydeco, 
NLA, Gorkana, Financial Times, News UK, EMBL-EBI, 
Accenture, University of Cambridge, UK Government...
Who are Flax? 
We design, build and support open source powered 
search applications 
Based in Cambridge U.K., technology agnostic & 
independent – but open source exponents & committers 
UK Authorized Partner of 
Customers in recruitment, government, e-commerce, 
news & media, bioinformatics, consulting, law...
Solr
• Open source search server based on Lucene
• Created in 2004 by Yonik Seeley
• Became an Apache project in 2006
• Merged with Lucene in 2011
• Web API
• XML config, XML/JSON data formats
• SolrCloud features added in 2012
• Uses Apache ZooKeeper for cluster management
Elasticsearch
• Open source search server based on Lucene
• Created in 2010 by Shay Banon
• RESTful Web API
• Everything is JSON
• Distributed and NRT by design
• Own Zen Discovery module for cluster management
Solr vs. Elasticsearch
• Both have large, dynamic communities
• Well-funded commercial backing
• Widely used in many diverse projects
• Elasticsearch is easier to set up and configure
• Elasticsearch query DSL
• But: is Elasticsearch as tolerant of network faults? (Jepsen tests by Kyle Kingsbury)
• How does performance compare?
• Note that we don't have a preference... we use both!
Why does performance matter?
• Won't it be the same, as they both use Lucene?
• Can't you just throw hardware at it?
• Hardware is cheaper than developers
• Well, no.
Why does performance matter?
• There's a lot more to them than just a web API on top of Lucene.
• Several of our customers have fixed hardware budgets
• May have to use limited internal resources
• With large indexes or complex queries, need to squeeze every last bit of performance out of the hardware
What performance studies are out there?
• Not many found by a Google search.
  http://blog.socialcast.com/realtime-search-solr-vs-elasticsearch/
• Solr was much faster than Elasticsearch, except for NRT searches with concurrent indexing (where the situation was reversed).
• But: this was over 3 years ago, before SolrCloud
Our experience
• Client with complex filtering requirements for content licensing, tens of millions of documents, limited hardware budget, no NRT requirement.
• Performed tests 18 months ago on EC2.
• Solr was approximately 20 times faster!
• More recently, Solr was 4 times faster for a project requiring geospatial filtering
• What about now?
This study
• Recent versions of Elasticsearch (1.4.0) and Solr (4.10.2)
• Concentrated on indexing performance, query times with and without concurrent indexing, QPS, filters and facets.
• Hardware kindly provided by BigStep.com
• Full Metal Cloud (real instances, not VMs)
• Optimised for high performance
• Can be faster than your own dedicated hardware!
The results?
• Not really very interesting
• SolrCloud and Elasticsearch were both very fast
• Similar performance with concurrent indexing or not
• Solr could handle higher QPS
Cluster configuration
• Two machines, each with 96GB RAM
• Two instances of SolrCloud or Elasticsearch on each
• Each instance has 24GB JVM heap
• Four shards
• No replicas
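A four-shard, no-replica layout like the one listed above could be requested roughly as follows. This is a minimal sketch, not the configuration used in the study: the index/collection name `docs`, the hosts and the ports are hypothetical, and the code only builds the requests rather than sending them.

```python
# Sketch only: builds (but does not send) requests for a four-shard,
# no-replica layout. The name "docs" and the host URLs are hypothetical.
import json
from urllib.parse import urlencode

NUM_SHARDS = 4


def es_create_index_request(host="http://localhost:9200", index="docs"):
    """Return (url, json_body) for an Elasticsearch create-index call (HTTP PUT)."""
    body = {"settings": {"number_of_shards": NUM_SHARDS,
                         "number_of_replicas": 0}}
    return "%s/%s" % (host, index), json.dumps(body)


def solr_create_collection_url(host="http://localhost:8983/solr", collection="docs"):
    """Return the URL for a SolrCloud Collections API CREATE call (HTTP GET)."""
    params = {
        "action": "CREATE",
        "name": collection,
        "numShards": NUM_SHARDS,
        "replicationFactor": 1,  # one copy of each shard, i.e. no extra replicas
    }
    return "%s/admin/collections?%s" % (host, urlencode(params))
```

With the requests library, the first would be sent with `requests.put(url, data=body)` and the second with `requests.get(url)`.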
Cluster configuration in BigStep
Data
• 40M documents created by using a Markov chain on a seed document (on Stoicism) from gutenberg.org
  “Below planets. this Below lay this the lay infinite the void infinite without void beginning, without middle, beginning, or middle, end, or this end occupied...”
• Small (5-20 word) and larger (200-1000 word) docs
• Randomly assigned ints for “source” and “level”, to simulate licensing filters and for facets.
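The document generation described above can be sketched with a simple word-level Markov chain. The slides do not say which chain order, field names or value ranges were used, so the first-order chain, the `id` and `text` fields and the ranges for `source` and `level` below are all assumptions made for illustration.

```python
# Sketch of a first-order, word-level Markov chain document generator.
# Chain order, field names and value ranges are assumptions.
import random
from collections import defaultdict


def build_chain(seed_text):
    """Map each word to the list of words that follow it in the seed text."""
    words = seed_text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate_text(chain, length, rng):
    """Random-walk the chain to produce `length` words."""
    starts = sorted(chain)
    word = rng.choice(starts)
    out = [word]
    while len(out) < length:
        followers = chain.get(word)
        # Dead end (a word with no recorded follower): restart at random.
        word = rng.choice(followers) if followers else rng.choice(starts)
        out.append(word)
    return " ".join(out)


def make_doc(doc_id, chain, rng, min_words=5, max_words=20):
    """Build one synthetic document with random 'source' and 'level' ints."""
    return {
        "id": str(doc_id),
        "text": generate_text(chain, rng.randint(min_words, max_words), rng),
        "source": rng.randint(0, 99),  # assumed range
        "level": rng.randint(0, 9),    # assumed range
    }
```

Calling `make_doc` with `min_words=200, max_words=1000` would give documents in the larger size band.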
Indexing
• Python script and requests library
• Single process for small index, four processes for larger index
• Single process for indexing concurrent with search
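The indexing script itself is not shown in the slides. As a rough sketch of what indexing over HTTP from Python can look like, the helpers below build request bodies for the Elasticsearch bulk API and for Solr's JSON update handler. The batching helper and the `id` field are assumptions, and nothing here is taken from the study's code.

```python
# Sketch only: builds request bodies for batch indexing over HTTP.
import json


def es_bulk_body(docs):
    """Build a newline-delimited Elasticsearch bulk body that indexes `docs`."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_id": doc["id"]}}))  # action line
        lines.append(json.dumps(doc))                            # source line
    return "\n".join(lines) + "\n"  # the bulk API expects a trailing newline


def solr_update_body(docs):
    """Build a Solr JSON update body: a plain JSON array of documents."""
    return json.dumps(list(docs))


def batches(iterable, size):
    """Yield lists of up to `size` items, so each request stays bounded."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Each body would then be posted with `requests.post`, to the index's `_bulk` endpoint for Elasticsearch and to the collection's `/update` handler (with a JSON content type) for Solr. The exact URLs depend on the version and setup.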
Searching
• Python and requests
• Each query time logged for analysis
• Single process for query time testing
• Multiple processes to test QPS
• All tests performed warm
• Queries consisted of three randomly chosen terms combined with OR
• Filters randomly generated
• Facets / Elasticsearch aggregations
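A query of the kind described above (three random terms ORed together, a random filter, and a facet or aggregation) might be built as in the sketch below. The field names `text`, `source` and `level`, the shape of the filter (a set of allowed `source` values) and the vocabulary are assumptions. The Elasticsearch body uses the 1.x-era `filtered` query, which was removed in later versions.

```python
# Sketch only: builds Solr parameters and an Elasticsearch 1.x-style body.
import random


def random_terms(vocabulary, rng, n=3):
    """Pick n distinct query terms from a list of words."""
    return rng.sample(vocabulary, n)


def random_source_filter(rng, n_values=10, max_source=99):
    """Pick a random set of allowed 'source' values (assumed filter shape)."""
    return sorted(rng.sample(range(max_source + 1), n_values))


def solr_params(terms, sources):
    """Solr query parameters: OR query, filter query, facet on 'level'."""
    return {
        "q": " OR ".join("text:%s" % t for t in terms),
        "fq": "source:(%s)" % " OR ".join(str(s) for s in sources),
        "facet": "true",
        "facet.field": "level",
        "wt": "json",
    }


def es_body(terms, sources):
    """Elasticsearch request body: OR match, terms filter, terms aggregation."""
    return {
        "query": {
            "filtered": {
                "query": {"match": {"text": {"query": " ".join(terms),
                                             "operator": "or"}}},
                "filter": {"terms": {"source": sources}},
            }
        },
        "aggs": {"levels": {"terms": {"field": "level"}}},
    }
```

The Solr parameters would go to the collection's `/select` handler via `requests.get(url, params=...)`, and the Elasticsearch body to the index's `_search` endpoint as JSON.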
40M Small documents
• Elasticsearch indexed them in 30 minutes
• Total index size was 8.8 GB (easily cacheable)
• Solr indexed them in 43 minutes
• Total index size was 7.6 GB
40M Small documents (concurrent indexing) 
Elasticsearch: 0.01s mean, 99% < 0.06s 
Solr: 0.01s mean, 99% < 0.10s
40M Large documents
• Elasticsearch indexed them in 179 minutes
• Total index size was 363 GB (not completely cacheable)
• Solr indexed them in 119 minutes
• Total index size was 226 GB
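The indexing times and index sizes reported for the small and large document sets can be turned into throughput and per-document size with simple arithmetic. The sketch below uses only the figures from the slides and assumes that 1 GB means 10^9 bytes, which the slides do not state.

```python
# Derived figures from the reported indexing times and index sizes.
DOCS = 40_000_000

# (minutes to index, index size in GB), as reported on the slides
RESULTS = {
    ("small", "Elasticsearch"): (30, 8.8),
    ("small", "Solr"): (43, 7.6),
    ("large", "Elasticsearch"): (179, 363),
    ("large", "Solr"): (119, 226),
}


def docs_per_second(minutes):
    """Average indexing throughput over the whole run."""
    return DOCS / (minutes * 60)


def bytes_per_doc(size_gb):
    """Average on-disk index size per document (assumes 1 GB = 10**9 bytes)."""
    return size_gb * 1e9 / DOCS


for (kind, engine), (minutes, size_gb) in sorted(RESULTS.items()):
    print("%-5s %-13s %7.0f docs/s %6.0f bytes/doc"
          % (kind, engine, docs_per_second(minutes), bytes_per_doc(size_gb)))
```

On these figures Elasticsearch indexed the small documents at roughly 22,000 docs/s against about 15,500 for Solr, while for the large documents Solr was ahead at roughly 5,600 docs/s against about 3,700.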
40M Large documents (search with facets) 
Elasticsearch: 0.21s mean, 99% < 0.75s 
Solr: 0.25s mean, 99% < 0.84s
40M Large documents (with 10 filters) 
Elasticsearch: 0.21s mean, 99% < 0.72s 
Solr: 0.09s mean, 99% < 0.50s
40M Large documents (concurrent indexing) 
Elasticsearch: 0.16s mean, 99% < 0.86s 
Solr: 0.09s mean, 99% < 0.46s
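The "mean" and "99% <" figures on the result slides are summaries of the logged per-query times. A minimal sketch of that kind of summary follows. It uses the nearest-rank definition of the 99th percentile, which is an assumption, since the slides do not say how the percentile was computed.

```python
# Sketch: summarise logged query times as (mean, 99th percentile).
def summarize(query_times):
    """Return (mean, p99) for a non-empty list of query times in seconds."""
    if not query_times:
        raise ValueError("no query times logged")
    ordered = sorted(query_times)
    n = len(ordered)
    mean = sum(ordered) / n
    # Nearest-rank percentile: rank = ceil(0.99 * n), in integer arithmetic.
    rank = (99 * n + 99) // 100
    return mean, ordered[rank - 1]
```

For example, `summarize(times)` on a list of logged times returns a pair that could be reported as "mean, 99% below p99".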
40M Large documents (QPS)
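A QPS figure is the total number of completed queries divided by wall-clock time while several clients run at once. The study used multiple Python processes; the sketch below uses threads instead, purely to keep the example short and self-contained, and `run_query` is a hypothetical callable standing in for one search request.

```python
# Sketch: measure queries per second with concurrent clients.
# Threads are used here for brevity; the study used multiple processes.
import time
from concurrent.futures import ThreadPoolExecutor


def measure_qps(run_query, workers, queries_per_worker):
    """Run queries from `workers` concurrent clients; return (total, qps)."""

    def client(_worker_id):
        for _ in range(queries_per_worker):
            run_query()
        return queries_per_worker

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(client, range(workers)))
    elapsed = time.perf_counter() - start
    return total, total / elapsed
```

Increasing `workers` until the measured QPS stops rising gives an estimate of the saturation point for a given cluster.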
Conclusions
• SolrCloud seems to be slightly faster. However, performance was acceptable in all cases.
• SolrCloud can apparently support a significantly higher number of queries per second (tested without concurrent indexing, however).
Limitations and problems
• Validity of generated documents?
• Validity of random queries?
• Searches did not fetch any document data
• Did not test highlighting, range facets, geolocation, etc.
• Only tested one type of cluster configuration (Elasticsearch is very flexible about node role).
• Did not tune JVM parameters
• Did not perform profiling to identify reasons for differences
What's next
• Would have also liked to have compared BigStep with Amazon EC2.
• If there is any interest, I hope to address some of these problems in the near future.
• We'll open source the code (next week?) on www.github.com/flaxsearch
What to take away from this?
• Elasticsearch and Solr are both awesome
• They currently seem very close in terms of performance (according to this limited study)
• However, all search applications are different
• Solr and Elasticsearch may have quite different performance characteristics in certain cases.
• Hard to predict.
• If performance is important to you, it will pay to try both.
Thanks!
• To you, for listening
• To BigStep for the use of Full Metal Cloud
• Any questions? - tom@flax.co.uk

More Related Content

What's hot

Battle of the Giants round 2
Battle of the Giants round 2Battle of the Giants round 2
Battle of the Giants round 2Rafał Kuć
 
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UNSolr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UNLucidworks
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedBeyondTrees
 
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Karel Minarik
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchclintongormley
 
Building a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchBuilding a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchMark Greene
 
Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Sematext Group, Inc.
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchRuslan Zavacky
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.Jurriaan Persyn
 
ElasticSearch AJUG 2013
ElasticSearch AJUG 2013ElasticSearch AJUG 2013
ElasticSearch AJUG 2013Roy Russo
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneRahul Jain
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to ElasticsearchClifford James
 
Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with ElasticsearchSamantha Quiñones
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesRahul Jain
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Philips Kokoh Prasetyo
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Federico Panini
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and SparkAudible, Inc.
 
ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014Roy Russo
 

What's hot (19)

Battle of the Giants round 2
Battle of the Giants round 2Battle of the Giants round 2
Battle of the Giants round 2
 
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UNSolr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
 
Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)Your Data, Your Search, ElasticSearch (EURUKO 2011)
Your Data, Your Search, ElasticSearch (EURUKO 2011)
 
Cool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearchCool bonsai cool - an introduction to ElasticSearch
Cool bonsai cool - an introduction to ElasticSearch
 
Building a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearchBuilding a CRM on top of ElasticSearch
Building a CRM on top of ElasticSearch
 
Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2Side by Side with Elasticsearch & Solr, Part 2
Side by Side with Elasticsearch & Solr, Part 2
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
 
ElasticSearch AJUG 2013
ElasticSearch AJUG 2013ElasticSearch AJUG 2013
ElasticSearch AJUG 2013
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to Elasticsearch
 
Managing Your Content with Elasticsearch
Managing Your Content with ElasticsearchManaging Your Content with Elasticsearch
Managing Your Content with Elasticsearch
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
 
Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)Elasticsearch quick Intro (English)
Elasticsearch quick Intro (English)
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and Spark
 
ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014
 

Viewers also liked

Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...Lucidworks
 
Getting the Most Out of Your NoSQL DB
Getting the Most Out of Your NoSQL DBGetting the Most Out of Your NoSQL DB
Getting the Most Out of Your NoSQL DBBigstep
 
Side by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and SolrSide by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and SolrSematext Group, Inc.
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersSematext Group, Inc.
 
Working with deeply nested documents in Apache Solr
Working with deeply nested documents in Apache SolrWorking with deeply nested documents in Apache Solr
Working with deeply nested documents in Apache SolrAnshum Gupta
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr PerformanceLucidworks
 
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Lucidworks
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloudVarun Thacker
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerSematext Group, Inc.
 
Solr/Elasticsearch for CF Developers (and others)
Solr/Elasticsearch for CF Developers (and others)Solr/Elasticsearch for CF Developers (and others)
Solr/Elasticsearch for CF Developers (and others)Mary Jo Sminkey
 
Solr Performance Monitoring with SPM
Solr Performance Monitoring with SPMSolr Performance Monitoring with SPM
Solr Performance Monitoring with SPMSematext Group, Inc.
 
Scaling to 30,000 Requests Per Second and Beyond with MongoDB
Scaling to 30,000 Requests Per Second and Beyond with MongoDBScaling to 30,000 Requests Per Second and Beyond with MongoDB
Scaling to 30,000 Requests Per Second and Beyond with MongoDBmchesnut
 
Elastic Search Performance Optimization - Deview 2014
Elastic Search Performance Optimization - Deview 2014Elastic Search Performance Optimization - Deview 2014
Elastic Search Performance Optimization - Deview 2014Gruter
 
Solar Investment Case
Solar Investment CaseSolar Investment Case
Solar Investment Casemacsolar77
 
Building a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solrBuilding a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solrlucenerevolution
 
Modernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchModernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchTaylor Lovett
 

Viewers also liked (20)

Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
High Performance Solr
High Performance SolrHigh Performance Solr
High Performance Solr
 
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
 
Getting the Most Out of Your NoSQL DB
Getting the Most Out of Your NoSQL DBGetting the Most Out of Your NoSQL DB
Getting the Most Out of Your NoSQL DB
 
Side by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and SolrSide by Side with Elasticsearch and Solr
Side by Side with Elasticsearch and Solr
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Working with deeply nested documents in Apache Solr
Working with deeply nested documents in Apache SolrWorking with deeply nested documents in Apache Solr
Working with deeply nested documents in Apache Solr
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr Performance
 
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloud
 
Scaling Solr with Solr Cloud
Scaling Solr with Solr CloudScaling Solr with Solr Cloud
Scaling Solr with Solr Cloud
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash course
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
 
Solr/Elasticsearch for CF Developers (and others)
Solr/Elasticsearch for CF Developers (and others)Solr/Elasticsearch for CF Developers (and others)
Solr/Elasticsearch for CF Developers (and others)
 
Solr Performance Monitoring with SPM
Solr Performance Monitoring with SPMSolr Performance Monitoring with SPM
Solr Performance Monitoring with SPM
 
Scaling to 30,000 Requests Per Second and Beyond with MongoDB
Scaling to 30,000 Requests Per Second and Beyond with MongoDBScaling to 30,000 Requests Per Second and Beyond with MongoDB
Scaling to 30,000 Requests Per Second and Beyond with MongoDB
 
Elastic Search Performance Optimization - Deview 2014
Elastic Search Performance Optimization - Deview 2014Elastic Search Performance Optimization - Deview 2014
Elastic Search Performance Optimization - Deview 2014
 
Solar Investment Case
Solar Investment CaseSolar Investment Case
Solar Investment Case
 
Building a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solrBuilding a near real time search engine & analytics for logs using solr
Building a near real time search engine & analytics for logs using solr
 
Modernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with ElasticsearchModernizing WordPress Search with Elasticsearch
Modernizing WordPress Search with Elasticsearch
 

Similar to Solr and Elasticsearch, a performance study

Elastic search overview
Elastic search overviewElastic search overview
Elastic search overviewABC Talks
 
Solr and ElasticSearch demo and speaker feb 2014
Solr  and ElasticSearch demo and speaker feb 2014Solr  and ElasticSearch demo and speaker feb 2014
Solr and ElasticSearch demo and speaker feb 2014nkabra
 
MongoDB meetup at Hike
MongoDB meetup at HikeMongoDB meetup at Hike
MongoDB meetup at HikeBharvi Dixit
 
Qui Quaerit, Reperit. AWS Elasticsearch in Action
Qui Quaerit, Reperit. AWS Elasticsearch in ActionQui Quaerit, Reperit. AWS Elasticsearch in Action
Qui Quaerit, Reperit. AWS Elasticsearch in ActionGlobalLogic Ukraine
 
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...Fwdays
 
In search of: A meetup about Liferay and Search 2016-04-20
In search of: A meetup about Liferay and Search   2016-04-20In search of: A meetup about Liferay and Search   2016-04-20
In search of: A meetup about Liferay and Search 2016-04-20Tibor Lipusz
 
Finding knowledge, data and answers on the Semantic Web
Finding knowledge, data and answers on the Semantic WebFinding knowledge, data and answers on the Semantic Web
Finding knowledge, data and answers on the Semantic Webebiquity
 
Elasticsearch for Westcoast
Elasticsearch for WestcoastElasticsearch for Westcoast
Elasticsearch for WestcoastCharlie Hull
 
Exploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherExploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherObjectRocket
 
Faceted search using Solr and Ontopia
Faceted search using Solr and OntopiaFaceted search using Solr and Ontopia
Faceted search using Solr and OntopiaGeir Ove Grønmo
 
AWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch ServiceAWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch ServiceAmazon Web Services
 
BigData Search Simplified with ElasticSearch
BigData Search Simplified with ElasticSearchBigData Search Simplified with ElasticSearch
BigData Search Simplified with ElasticSearchTO THE NEW | Technology
 
Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginnersNeil Baker
 
Getting started with Laravel & Elasticsearch
Getting started with Laravel & ElasticsearchGetting started with Laravel & Elasticsearch
Getting started with Laravel & ElasticsearchPeter Steenbergen
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solrmacrochen
 
How to run Elasticsearch on Azure in just a few minutes
How to run Elasticsearch on Azure in just a few minutesHow to run Elasticsearch on Azure in just a few minutes
How to run Elasticsearch on Azure in just a few minutesChristoph Wurm
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearchAnton Udovychenko
 
Synchronicity: Just-In-Time Discovery of Lost Web Pages
Synchronicity: Just-In-Time Discovery of Lost Web PagesSynchronicity: Just-In-Time Discovery of Lost Web Pages
Synchronicity: Just-In-Time Discovery of Lost Web PagesMichael Nelson
 

Similar to Solr and Elasticsearch, a performance study (20)

Elastic search overview
Elastic search overviewElastic search overview
Elastic search overview
 
Solr and ElasticSearch demo and speaker feb 2014
Solr  and ElasticSearch demo and speaker feb 2014Solr  and ElasticSearch demo and speaker feb 2014
Solr and ElasticSearch demo and speaker feb 2014
 
MongoDB meetup at Hike
MongoDB meetup at HikeMongoDB meetup at Hike
MongoDB meetup at Hike
 
Qui Quaerit, Reperit. AWS Elasticsearch in Action
Qui Quaerit, Reperit. AWS Elasticsearch in ActionQui Quaerit, Reperit. AWS Elasticsearch in Action
Qui Quaerit, Reperit. AWS Elasticsearch in Action
 
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
 
In search of: A meetup about Liferay and Search 2016-04-20
In search of: A meetup about Liferay and Search   2016-04-20In search of: A meetup about Liferay and Search   2016-04-20
In search of: A meetup about Liferay and Search 2016-04-20
 
Finding knowledge, data and answers on the Semantic Web
Finding knowledge, data and answers on the Semantic WebFinding knowledge, data and answers on the Semantic Web
Finding knowledge, data and answers on the Semantic Web
 
Elasticsearch for Westcoast
Elasticsearch for WestcoastElasticsearch for Westcoast
Elasticsearch for Westcoast
 
Exploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better TogetherExploring MongoDB & Elasticsearch: Better Together
Exploring MongoDB & Elasticsearch: Better Together
 
Faceted search using Solr and Ontopia
Faceted search using Solr and OntopiaFaceted search using Solr and Ontopia
Faceted search using Solr and Ontopia
 
AWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch ServiceAWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch Service
 
BigData Search Simplified with ElasticSearch
BigData Search Simplified with ElasticSearchBigData Search Simplified with ElasticSearch
BigData Search Simplified with ElasticSearch
 
Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginners
 
Elastic pivorak
Elastic pivorakElastic pivorak
Elastic pivorak
 
Getting started with Laravel & Elasticsearch
Getting started with Laravel & ElasticsearchGetting started with Laravel & Elasticsearch
Getting started with Laravel & Elasticsearch
 
Solr 101
Solr 101Solr 101
Solr 101
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solr
 
How to run Elasticsearch on Azure in just a few minutes
How to run Elasticsearch on Azure in just a few minutesHow to run Elasticsearch on Azure in just a few minutes
How to run Elasticsearch on Azure in just a few minutes
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearch
 
Synchronicity: Just-In-Time Discovery of Lost Web Pages
Synchronicity: Just-In-Time Discovery of Lost Web PagesSynchronicity: Just-In-Time Discovery of Lost Web Pages
Synchronicity: Just-In-Time Discovery of Lost Web Pages
 

More from Charlie Hull

Lucene, Solr and java 9 - opportunities and challenges
Lucene, Solr and java 9 - opportunities and challengesLucene, Solr and java 9 - opportunities and challenges
Lucene, Solr and java 9 - opportunities and challengesCharlie Hull
 
Solr and Elasticsearch, a performance study

  • 1. Elasticsearch and SolrCloud: a performance comparison
      Tom Mortimer - Technical Director
      27th November 2014
      charlie@flax.co.uk
      www.flax.co.uk/blog
      +44 (0) 8700 118334
      Twitter: @FlaxSearch
  • 5. Who are Flax?
      We design, build and support open source powered search applications
      Based in Cambridge U.K., technology agnostic & independent – but open source exponents & committers
      UK Authorized Partner of
      Customers include Reed Specialist Recruitment, Mydeco, NLA, Gorkana, Financial Times, News UK, EMBL-EBI, Accenture, University of Cambridge, UK Government...
  • 6. Who are Flax?
      We design, build and support open source powered search applications
      Based in Cambridge U.K., technology agnostic & independent – but open source exponents & committers
      UK Authorized Partner of
      Customers in recruitment, government, e-commerce, news & media, bioinformatics, consulting, law...
  • 8. Solr
      • Open source search server based on Lucene
      • Created in 2004 by Yonik Seeley
      • Became an Apache project in 2006
      • Merged with Lucene in 2011
      • Web API
      • XML config, XML/JSON data formats
      • SolrCloud features added in 2012
      • Uses Apache ZooKeeper for cluster management
  • 9. Elasticsearch
      • Open source search server based on Lucene
      • Created in 2010 by Shay Banon
      • RESTful Web API
      • Everything is JSON
      • Distributed and NRT (near-real-time) by design
      • Own Zen Discovery module for cluster management
  • 11. Solr vs. Elasticsearch
      • Both have large, dynamic communities
      • Well-funded commercial backing
      • Widely used in many diverse projects
      • Elasticsearch is easier to set up and configure
      • Elasticsearch query DSL
      • But: is Elasticsearch as tolerant of network faults? (Jepsen tests by Kyle Kingsbury)
      • How does performance compare?
      • Note that we don't have a preference... we use both!
  • 13. Why does performance matter?
      • Won't it be the same, as they both use Lucene?
      • Can't you just throw hardware at it?
      • Hardware is cheaper than developers
      • Well, no.
  • 14. Why does performance matter?
      • There's a lot more to them than just a web API on top of Lucene.
      • Several of our customers have fixed hardware budgets
      • May have to use limited internal resources
      • With large indexes or complex queries, need to squeeze every last bit of performance out of the hardware
  • 16. What performance studies are out there?
      • Not many found by a Google search.
        http://blog.socialcast.com/realtime-search-solr-vs-elasticsearch/
      • Solr much faster than Elasticsearch, except for NRT searches with concurrent indexing (where the situation was reversed).
      • But: this was over 3 years ago, before SolrCloud
  • 17. Our experience
      • Client with complex filtering requirements for content licensing, tens of millions of documents, limited hardware budget, no NRT requirement.
      • Performed tests 18 months ago on EC2.
      • Solr was approximately 20 times faster!
      • More recently, Solr was 4 times faster for a project requiring geospatial filtering
      • What about now?
  • 18. This study
      • Recent versions of Elasticsearch (1.4.0) and Solr (4.10.2)
      • Concentrated on indexing performance, query times with and without concurrent indexing, QPS, filters and facets.
      • Hardware kindly provided by BigStep.com
      • Full Metal Cloud (real instances, not VMs)
      • Optimised for high performance
      • Can be faster than your own dedicated hardware!
  • 21. The results?
      • Not really very interesting
      • SolrCloud and Elasticsearch were both very fast
      • Similar performance with concurrent indexing or not
      • Solr could handle higher QPS
  • 22. Cluster configuration
      • Two machines, each with 96GB RAM
      • Two instances of SolrCloud or Elasticsearch on each
      • Each instance has 24GB JVM heap
      • Four shards
      • No replicas
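The slides do not show how the four-shard, no-replica layout was set up. As a minimal sketch, assuming a hypothetical index/collection name `perftest` and a hypothetical base URL, the equivalent requests could be built like this. The Elasticsearch body uses the `number_of_shards` and `number_of_replicas` index settings; the Solr URL uses the Collections API `CREATE` action, where `replicationFactor=1` means a single copy of each shard (no extra replicas).

```python
import json
from urllib.parse import urlencode

NUM_SHARDS = 4  # from the slide: four shards, no replicas


def es_create_index_body(num_shards=NUM_SHARDS):
    """JSON body for creating an Elasticsearch index (sent with PUT /<index>)."""
    return json.dumps({
        "settings": {
            "number_of_shards": num_shards,
            "number_of_replicas": 0,  # no replicas
        }
    })


def solr_create_collection_url(base_url, collection, num_shards=NUM_SHARDS):
    """URL for the SolrCloud Collections API CREATE action."""
    params = urlencode({
        "action": "CREATE",
        "name": collection,           # hypothetical collection name
        "numShards": num_shards,
        "replicationFactor": 1,       # one copy per shard, i.e. no replicas
    })
    return f"{base_url}/admin/collections?{params}"
```

Both functions only build the request; sending it (for example with the requests library mentioned in the slides) is left out so the sketch runs without a cluster.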
  • 24. Data
      • 40M documents created by using a Markov chain on a seed document (on Stoicism) from gutenberg.org
        “Below planets. this Below lay this the lay infinite the void infinite without void beginning, without middle, beginning, or middle, end, or this end occupied...”
      • Small (5-20 word) and larger (200-1000 word) docs
      • Randomly assigned ints for “source” and “level”, to simulate licensing filters and for facets.
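The generator itself is not shown in the slides. A minimal sketch of the general technique, assuming a first-order word-level Markov chain, might look like the following. The field name `text` and the value ranges for `source` and `level` (`num_sources`, `num_levels`) are assumptions for illustration, not values from the study.

```python
import random
from collections import defaultdict


def build_chain(seed_text):
    """Map each word to the list of words that follow it in the seed text."""
    words = seed_text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate_doc(chain, rng, min_words, max_words, num_sources=100, num_levels=10):
    """Generate one synthetic document with random 'source' and 'level' ints.

    Assumes the seed text had at least two words, so the chain is not empty.
    """
    starts = sorted(chain)  # every word that has at least one follower
    length = rng.randint(min_words, max_words)
    word = rng.choice(starts)
    words = [word]
    while len(words) < length:
        followers = chain.get(word)
        # At a dead end (a word with no recorded follower), restart randomly.
        word = rng.choice(followers) if followers else rng.choice(starts)
        words.append(word)
    return {
        "text": " ".join(words),
        "source": rng.randrange(num_sources),  # simulated licensing filter value
        "level": rng.randrange(num_levels),    # simulated facet value
    }
```

Calling `generate_doc(chain, random.Random(), 5, 20)` would give a small document and `generate_doc(chain, random.Random(), 200, 1000)` a larger one, matching the two size ranges on the slide.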
  • 25. Indexing
      • Python script and requests library
      • Single process for small index, four processes for larger index
      • Single process for indexing concurrent with search
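The indexing script is not reproduced in the slides. As a rough sketch of what batched indexing payloads could look like, the functions below build a newline-delimited body for Elasticsearch's `_bulk` endpoint (in the 1.x form, with `_type`) and a JSON array for Solr's `/update` handler. The index name `perftest`, the type name `doc`, and the document fields are hypothetical.

```python
import json


def es_bulk_body(docs, index="perftest", doc_type="doc"):
    """Newline-delimited body for Elasticsearch's _bulk endpoint.

    `docs` is an iterable of (id, document) pairs.
    """
    lines = []
    for doc_id, doc in docs:
        # Action line, followed by the document source on its own line.
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def solr_update_body(docs):
    """JSON array body for Solr's /update handler (Content-Type: application/json)."""
    return json.dumps([dict(doc, id=doc_id) for doc_id, doc in docs])
```

Each body would then be POSTed in batches; the batch size and the commit strategy used in the study are not stated.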
  • 26. Searching
      • Python and requests
      • Each query time logged for analysis
      • Single process for query time testing
      • Multiple processes to test QPS
      • All tests performed warm
      • Queries consisted of three randomly chosen terms combined with OR
      • Filters randomly generated
      • Facets / Elasticsearch aggregations
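The exact queries are not given in the slides. One plausible shape, sketched under several assumptions: the body text lives in a field called `text`, filters are a set of allowed `source` values, facets are computed on `level`, and no documents are fetched (`rows=0` / `size: 0`). The Elasticsearch body uses the `filtered` query, which is the 1.x form and was removed in later versions.

```python
import json
import random


def random_terms(vocabulary, rng, n=3):
    """Pick n distinct terms from a list of words."""
    return rng.sample(vocabulary, n)


def solr_params(terms, sources, facet_field="level"):
    """Query parameters for a Solr /select request."""
    params = [
        ("q", " OR ".join(terms)),
        ("df", "text"),   # default search field (assumed name)
        ("rows", "0"),    # do not fetch document data
        ("wt", "json"),
    ]
    if sources:
        allowed = " OR ".join(str(s) for s in sources)
        params.append(("fq", f"source:({allowed})"))
    params += [("facet", "true"), ("facet.field", facet_field)]
    return params


def es_body(terms, sources, facet_field="level"):
    """Request body for an Elasticsearch 1.x _search request."""
    query = {"match": {"text": {"query": " ".join(terms), "operator": "or"}}}
    if sources:
        query = {"filtered": {"query": query,
                              "filter": {"terms": {"source": sources}}}}
    return json.dumps({
        "size": 0,  # do not fetch document data
        "query": query,
        "aggs": {"by_" + facet_field: {"terms": {"field": facet_field}}},
    })
```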
  • 27. 40M Small documents
      • Elasticsearch indexed them in 30 minutes
      • Total index size was 8.8 GB (easily cacheable)
      • Solr indexed them in 43 minutes
      • Total index size was 7.6 GB
  • 28. 40M Small documents (concurrent indexing)
      Elasticsearch: 0.01s mean, 99% < 0.06s
      Solr: 0.01s mean, 99% < 0.10s
  • 29. 40M Large documents
      • Elasticsearch indexed them in 179 minutes
      • Total index size was 363 GB (not completely cacheable)
      • Solr indexed them in 119 minutes
      • Total index size was 226 GB
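The "cacheable" remarks can be checked with back-of-envelope arithmetic from the cluster configuration. This is only a rough estimate: it assumes all RAM not given to the JVM heaps is available to the OS page cache, and that 1 GB is 10^9 bytes.

```python
MACHINES = 2
RAM_PER_MACHINE_GB = 96
INSTANCES = 4              # two instances on each machine
HEAP_PER_INSTANCE_GB = 24
NUM_DOCS = 40_000_000


def page_cache_budget_gb():
    """Upper bound on RAM left for the OS page cache across the cluster."""
    return MACHINES * RAM_PER_MACHINE_GB - INSTANCES * HEAP_PER_INSTANCE_GB


def fits_in_cache(index_size_gb):
    """True if the whole index could, at best, be held in the page cache."""
    return index_size_gb <= page_cache_budget_gb()


def bytes_per_doc(index_size_gb, num_docs=NUM_DOCS):
    """Average index bytes per document, taking 1 GB as 1e9 bytes."""
    return index_size_gb * 1e9 / num_docs
```

Under these assumptions the budget is 192 - 96 = 96 GB, so both small indexes (8.8 GB and 7.6 GB) fit comfortably, while neither large index (363 GB and 226 GB) can be cached in full.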
  • 30. 40M Large documents (search with facets)
      Elasticsearch: 0.21s mean, 99% < 0.75s
      Solr: 0.25s mean, 99% < 0.84s
  • 31. 40M Large documents (with 10 filters)
      Elasticsearch: 0.21s mean, 99% < 0.72s
      Solr: 0.09s mean, 99% < 0.50s
  • 32. 40M Large documents (concurrent indexing)
      Elasticsearch: 0.16s mean, 99% < 0.86s
      Solr: 0.09s mean, 99% < 0.46s
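The result slides report a mean and a "99% <" figure for each test. How the 99% figure was computed is not stated; a minimal sketch, assuming a nearest-rank 99th percentile over the logged query times, could be:

```python
import math
from statistics import mean


def percentile(times, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(times)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarise(times):
    """Mean and 99th percentile of a list of query times in seconds."""
    return {"mean": mean(times), "p99": percentile(times, 99)}


def qps(num_queries, elapsed_seconds):
    """Queries per second over a timed run."""
    return num_queries / elapsed_seconds
```

For the QPS tests, `qps` would be applied to the total number of queries completed by all client processes within the measured wall-clock interval.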
  • 34. Conclusions
      • SolrCloud seems to be slightly faster. However, performance was acceptable in all cases.
      • SolrCloud can apparently support a significantly higher number of queries per second (tested without concurrent indexing, however).
  • 35. Limitations and problems
      • Validity of generated documents?
      • Validity of random queries?
      • Searches did not fetch any document data
      • Did not test highlighting, range facets, geolocation, etc.
      • Only tested one type of cluster configuration (Elasticsearch is very flexible about node role).
      • Did not tune JVM parameters
      • Did not perform profiling to identify reasons for differences
  • 36. What's next
      • Would also have liked to compare BigStep with Amazon EC2.
      • If there is any interest, I hope to address some of these problems in the near future.
      • We'll open source the code (next week?) on www.github.com/flaxsearch
  • 38. What to take away from this?
      • Elasticsearch and Solr are both awesome
      • They currently seem very close in terms of performance (according to this limited study)
      • However, all search applications are different
      • Solr and Elasticsearch may have quite different performance characteristics in certain cases.
      • Hard to predict.
      • If performance is important to you, it will pay to try both.
  • 39. Thanks!
      • To you, for listening
      • To BigStep, for the use of Full Metal Cloud
      • Any questions? - tom@flax.co.uk