Distributed Database Architecture
Search and Indexing
Nick Kabra
Distributed Database Architecture 1
Presentation Agenda
Team Introduction
Basics and History
Use Cases & Current Usage
Highlights
From the web
Migration
Appendix
DISCLAIMER: This is a knowledge-sharing session and not a recommendation for any specific technology / product.
Team Introduction
Name:
Designation:
Experience with Search and Indexing:
How long have you been working with Solr or ElasticSearch:
Basics
• Lucene: a search engine library packaged as a set of JAR files
• Solr and ES are used for indexing and searching
• Both are built on top of the Lucene API
• Solr and ES take the Lucene API and build features on top of it; the API is accessed through a web server
• Think of a smaller version of Google, which has indexed and ranked the web pages
• Search platform for web sites; search platform for an organization
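To make "indexing and searching" concrete, here is a minimal sketch of an inverted index, the data structure at the heart of Lucene-style engines. It is an illustration only: the whitespace tokenizer and the function names are assumptions for this example, not Lucene, Solr or ES APIs.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def build_index(docs):
    """Indexing: map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Searching: return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "Solr is built on Lucene",
    2: "ElasticSearch is built on Lucene",
    3: "Kibana draws dashboards",
}
index = build_index(docs)
print(sorted(search(index, "built lucene")))  # documents 1 and 2
print(sorted(search(index, "solr lucene")))   # document 1 only
```

A real engine adds analysis (stemming, stop words), relevance ranking and on-disk segment storage on top of this idea.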
History
• Solr was open-sourced in 2006, and Solr 1.3 followed in 2008.
• ES was released in 2010, with additional features.
• The two differ in design and architecture.
Key Players: Solr and ElasticSearch
Solr
Latest version: Solr 4.6.1, released on Jan 28, 2014
Collection: main logical structure for Solr
Architecture
• Distributed
• Fault tolerant, with automatic replicas
• Coord: Apache Solr + ZooKeeper ensemble, so a quorum is needed
• Leader per shard
• Automatic leader election
ElasticSearch (ES)
Latest version: ElasticSearch 1.0.0, released on Feb 12, 2014
Index: main logical structure for ES
Architecture
• Distributed
• Fault tolerant, with automatic replicas
• Coord: only ElasticSearch nodes + zen discovery; exposed to split brain
• Single leader
• Automatic leader election
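The quorum and split-brain points can be illustrated with a little arithmetic. The sketch below is not taken from either product; the function names are hypothetical. It shows why requiring a strict majority of nodes before electing a leader prevents two halves of a partitioned cluster from each electing their own. In ES 1.x the related setting is discovery.zen.minimum_master_nodes, which defaults to 1.

```python
def quorum(total_nodes):
    """Smallest strict majority of the voting nodes."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def split_brain_possible(partition_sizes, required_nodes):
    """True if more than one network partition could elect its own leader."""
    electable = sum(1 for size in partition_sizes if size >= required_nodes)
    return electable > 1


# A 3-node cluster that a network fault splits into groups of 2 and 1 nodes.
partitions = [2, 1]
print(quorum(3))                                  # majority of 3 nodes
print(split_brain_possible(partitions, 1))        # threshold of 1: both sides can elect
print(split_brain_possible(partitions, quorum(3)))  # majority threshold: only one side can
```

With a threshold of one node, each side of the partition can elect a leader; with the majority threshold, at most one side can.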
Resume recommendations - Use Case 1
Challenge
• Company ABC helps other firms hire skilled developers and project managers, and empowers customers to find the right job candidate from a database of 8 million profiles.
• Need fast and predictable performance.
• Include geo-spatial search.
Success
• Customers hire through company ABC.
• ABC stores the searches made by customers.
• Identify candidates, skills and compensation structures to enhance the customer search experience with better matches.
• Make recommendations to customers on salaries, future market needs, etc.
• Eliminate duplicate profiles with real-time indexing and percolation.
• Provide an enhanced customer experience and faster responses.
Opportunity
• Use ES as the search engine with realtime indexing
and nested querying.
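As an illustration of nested querying, the sketch below builds an ES query body as a plain Python dictionary. The "nested", "bool", "match" and "range" clauses belong to the ES query DSL; the "skills" path and its "name" and "years" fields are hypothetical and assume the profiles were indexed with a nested mapping for that field.

```python
import json


def nested_skill_query(skill, min_years):
    """Find profiles with one nested skill entry matching both conditions."""
    return {
        "query": {
            "nested": {
                "path": "skills",  # hypothetical nested field on each profile
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"skills.name": skill}},
                            {"range": {"skills.years": {"gte": min_years}}},
                        ]
                    }
                },
            }
        }
    }


body = nested_skill_query("java", 5)
print(json.dumps(body, indent=2))
```

Because the query is nested, both conditions must hold for the same skill entry, so a profile with ten years of Python and one year of Java would not match a search for five years of Java.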
Integration - Use Case 2
THE FULL CIRCLE
• Logstash: takes logs, then scrubs, parses and enriches the data
• ElasticSearch: search and analyze in real time
• Kibana: visualization engine for dynamic dashboards created in real time or on the fly
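The "take logs, scrub, parse and enrich" step can be sketched in a few lines of Python. This is not Logstash configuration; the log line format, the field names and the e-mail masking rule are assumptions made for the example.

```python
import re
from datetime import datetime

# Assumed line format: "<ISO timestamp> <LEVEL> <free-text message>"
LINE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def parse(line):
    """Parse: turn a raw line into a dict of fields, or None if it does not match."""
    match = LINE.match(line.strip())
    return match.groupdict() if match else None


def scrub(event):
    """Scrub: mask e-mail addresses in the message."""
    cleaned = dict(event)
    cleaned["message"] = EMAIL.sub("<email>", event["message"])
    return cleaned


def enrich(event):
    """Enrich: add derived fields that are handy for dashboards."""
    enriched = dict(event)
    timestamp = datetime.strptime(event["ts"], "%Y-%m-%dT%H:%M:%S")
    enriched["hour"] = timestamp.hour
    enriched["is_error"] = event["level"] == "ERROR"
    return enriched


def process(line):
    """Run one raw log line through parse, scrub and enrich."""
    event = parse(line)
    return None if event is None else enrich(scrub(event))


print(process("2014-02-12T09:30:00 ERROR login failed for bob@example.com"))
```

The resulting dictionary is the kind of structured document that would then be indexed in ES and charted in Kibana.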
Chat agent for 460 million documents – Use Case 3
Challenge
6,000 customers around the world, from one-person businesses to international organizations such as LG, Apple and Adobe, use LiveChat daily to communicate with their own customers.
LiveChat customers run 3.6 million queries and 220 million “get” operations per day on 460 million documents. LiveChat keeps these documents updated with 70 million indexing operations every day.
Solution
• Reduce query time from 2 seconds to 100 ms
• Streamline updating from hours to seconds
• Guarantee maximum uptime
• Scale to meet the needs of 6,000 customers
• Store and search on 460 million documents
• Process 3.6 million queries per day
Advantage
• Scalability, indexing and full-text search let users search through chat archives.
• Faceting makes it possible to pull various statistics for LiveChat clients.
• ES acts as the single datastore, so data updates are available immediately: each document in LiveChat is now updated on average 20 to 30 times every 20 to 60 seconds.
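To put the daily volumes into perspective, the short calculation below converts them into average per-second rates. It assumes load is spread evenly over 24 hours, which real traffic is not, so peak rates will be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(per_day):
    """Average rate per second for a daily total, assuming an even spread."""
    return per_day / SECONDS_PER_DAY


daily_totals = {
    "queries": 3_600_000,
    "get operations": 220_000_000,
    "indexing operations": 70_000_000,
}

for name, total in daily_totals.items():
    print(f"{name}: {per_second(total):,.1f} per second on average")
```

On average that is roughly 42 queries, 2,500 "get" operations and 810 indexing operations every second.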
Current Uses
• Use Case 1
• Use Case 2
• Use Case 3
• Use Case 4
• Use Case X
Highlights
• Schema and config: solrconfig.xml (Solr), elasticsearch.yml (ES); the number of replicas can be changed live, while in ES the number of primary shards is fixed once an index is created
• Scaling: nodes auto-balanced; SOLR-3755 (shard splitting); add a document
• Nesting (address, users & rights, boolean, parent-child)
• Index = different types of documents and analyzers
• Node discovery and fault discovery; ZooKeeper
• Multiple documents per schema, and parent-child
• Percolator
• Aggregations + facets in ES / facets in Solr
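Two of these highlights are easier to grasp with a toy example. Percolation reverses the usual search: queries are stored first, and each incoming document is checked against them. Faceting counts how many documents fall under each value of a field. The sketch below is plain Python rather than the ES or Solr API, and the query names and fields are made up for illustration.

```python
from collections import Counter


def matches(query_terms, doc_text):
    """A stored query matches when every one of its terms occurs in the document."""
    words = set(doc_text.lower().split())
    return all(term in words for term in query_terms)


def percolate(registered_queries, doc_text):
    """Return the names of the stored queries that the new document matches."""
    return sorted(
        name for name, terms in registered_queries.items() if matches(terms, doc_text)
    )


def facet_counts(docs, field):
    """Count documents per distinct value of the given field."""
    return Counter(doc[field] for doc in docs if field in doc)


registered = {
    "java-devs": ["java", "developer"],
    "pm-roles": ["project", "manager"],
}
print(percolate(registered, "Senior Java developer in Boston"))

profiles = [{"city": "Boston"}, {"city": "Austin"}, {"city": "Boston"}]
print(facet_counts(profiles, "city").most_common())
```

In the resume use case, percolation is what lets a saved customer search fire as soon as a matching profile is indexed.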
Highlights (contd. 2)
• Auto load balancing and auto-sharding
• Marvel metrics on 03/13/2014
• Split-brain problem in ES
• Structured query DSL and query control
• Real-time indexing / near-real-time indexing
• Query routing, and Solr 5816 to be introduced
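Query routing rests on a simple idea: hash a routing key and take it modulo the number of shards, so that all documents with the same key land on one shard and a query carrying that key needs to visit only that shard. The sketch below uses CRC32 purely as a stand-in hash; it is not the hash function that ES or Solr actually use, and the function name is hypothetical.

```python
import zlib


def shard_for(routing_key, num_shards):
    """Pick a shard by hashing the routing key (CRC32 is a stand-in hash)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return zlib.crc32(routing_key.encode("utf-8")) % num_shards


num_shards = 5
for key in ["customer-1", "customer-2", "customer-1"]:
    print(key, "-> shard", shard_for(key, num_shards))
```

The same key always maps to the same shard for a given shard count. Changing the shard count changes the mapping, which is one reason the number of shards is usually fixed when an index is created.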
ElasticSearch / Solr funnel
Solr
• UIMA
• Text analysis debugger, spell check
• Decision tree faceting / drilldown
• Cloudera, MapR and DataStax support Solr
ElasticSearch
• Filters for queries across nested documents
• Query handling analyzer and language, term suggester, autocomplete
• Real-time GET with query routing
• Hortonworks and Couchbase support ElasticSearch
FROM THE WEB
Web CPA
FYI only: we found several customers that moved from Solr to ElasticSearch, but no article describing a move from ES to Solr.
Caveat: this is not meant to prejudge either product; it would be good to hear what customers themselves say.
Let us also check this site: http://www.ymc.ch/en/why-we-chose-solr-4-0-instead-of-elasticsearch
http://www.mgt-commerce.com/magento-elasticsearch.html
Foursquare = http://engineering.foursquare.com/2012/08/09/foursquare-now-uses-elastic-search-and-on-a-related-note-slashem-also-works-with-elastic-search/
Jetwick = http://karussell.wordpress.com/2011/02/07/why-jetwick-moved-from-solr-to-elasticsearch/
Netricos = http://www.netricos.com/blog/posts/how-we-are-using-elastic-search
StumbleUpon = http://www.elasticsearch.org/case-study/stumbleupon/
UK govt. site = https://gds.blog.gov.uk/2012/08/03/from-solr-to-elasticsearch/
Wikimedia = http://thenextweb.com/insider/2014/01/06/wikimedia-will-replace-search-elasticsearch-beta-users-february-users-march-april/#!xDKnd
2 Parts of a whole – The Math
Solr
Solr performs very well on small indexes that don’t change very often.
ElasticSearch
Scalability, auto-sharding, GUI admin, schemaless operation, real-time search, nested queries, routing, and the way indexing and queries are handled (giving faster query execution and better indexing) give ES a distinct advantage.
Migration
Step 1
Use a river plugin to migrate from the existing Solr to ES.
Step 2
The plugin pulls the content from the existing Solr cluster and indexes it in ES.
Step 3
When you decide to switch to Elasticsearch permanently, switch your indexing so that content goes directly from your sources to Elasticsearch. Keeping Solr in the middle is not a recommended setup.
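The pull-and-reindex idea in Step 2 can be sketched without the river plugin. This is a hand-rolled illustration, not the plugin's implementation: it assumes documents are paged out of Solr's /select handler as JSON and pushed to the ES _bulk endpoint, and it leaves out the HTTP calls themselves. The index name, type name and id field are hypothetical, and the "_type" entry reflects ES 1.x.

```python
import json


def solr_page_params(start, rows):
    """Query parameters for fetching one page of documents from Solr's /select handler."""
    return {"q": "*:*", "start": start, "rows": rows, "wt": "json"}


def to_bulk_body(docs, index, doc_type, id_field="id"):
    """Build the newline-delimited body expected by the ES _bulk endpoint."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc[id_field]}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(doc))     # document source line
    # The bulk body must end with a newline; an empty batch yields an empty body.
    return "\n".join(lines) + "\n" if lines else ""


# Documents as they might come back from one Solr page (made-up data).
solr_docs = [
    {"id": "1", "title": "Java developer"},
    {"id": "2", "title": "Project manager"},
]
print(solr_page_params(0, 500))
print(to_bulk_body(solr_docs, "profiles", "profile"))
```

A full migration would loop over pages until Solr returns no more documents and POST each body to ES. Fields that were indexed but not stored in Solr cannot be recovered this way and would need re-indexing from the original sources.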
Conclusion
If we have a small site and need search features without the distributed bells and whistles, both Solr and ElasticSearch are efficient.
If we are planning a large installation that requires running distributed search with nesting, scalability, sharding and real-time indexing, ElasticSearch can do a better job.
Where do we go from here?
Both products are trying to catch up with each other's capabilities.
The best way to define this is:
Some possible next steps…
Question to ask
Thank you!
201-925-0488
nikkabs@gmail.com
Architecture – Global Head
Questions session
Appendix
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss ConfederationEfruzAsilolu
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangeThinkInnovation
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制vexqp
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATIONLakpaYanziSherpa
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowgargpaaro
 

Kürzlich hochgeladen (20)

7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
SR-101-01012024-EN.docx  Federal Constitution  of the Swiss ConfederationSR-101-01012024-EN.docx  Federal Constitution  of the Swiss Confederation
SR-101-01012024-EN.docx Federal Constitution of the Swiss Confederation
 
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With OrangePredicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
Predicting HDB Resale Prices - Conducting Linear Regression Analysis With Orange
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 

Solr and ElasticSearch demo and speaker feb 2014

  • 1. Distributed Database Architecture Search and Indexing Nick Kabra Distributed Database Architecture 1
  • 2. Presentation Agenda: Team Introduction; Basics and History; Use Cases & Current Usage; Highlights; From the web; Migration; Appendix. DISCLAIMER: This is a knowledge-sharing session and not a recommendation for any specific technology / product.
  • 3. Team Introduction: Name; Designation; Experience with Search and Indexing; How long have you been working with Solr or ElasticSearch?
  • 4. Basics: Lucene is a search engine library packaged as a set of JAR files. Solr and ES are used for indexing and searching, and both are built on top of the Lucene API. They take the Lucene API and build features on top of it; the API is accessed through a web server. Each can be thought of as a smaller version of Google, which has indexed and ranked the web pages: a search platform for web sites or for an organization.
  • 5. History: Solr was donated to the Apache Software Foundation in 2006, and Solr 1.3 followed in 2008. ES was first released in 2010 and brought additional features. The two differ in design and architecture.
  • 6. Key Players: Solr and ElasticSearch. Solr: latest version is Solr 4.6.1, released on Jan 28, 2014; the main logical structure is the collection; architecture: distributed, fault tolerant with automatic replicas, coordination through Apache Solr plus a ZooKeeper ensemble (so quorum-based), a leader per shard, automatic leader election. ElasticSearch (ES): latest version is ElasticSearch 1.0.0, released on Feb 12, 2014; the main logical structure is the index; architecture: distributed, fault tolerant with automatic replicas, coordination through ElasticSearch nodes only plus zen discovery (exposed to split brain), a single leader, automatic leader election.
  • 7. Resume recommendations (Use Case 1). Challenge: Company ABC helps other firms hire skilled developers and project managers, and wants to empower customers to find the right job candidate from a database of 8 million profiles. It needs fast and predictable performance, including geo-spatial search. Opportunity: Use ES as the search engine with realtime indexing and nested querying. Success: Customers hire using company ABC. ABC stores the searches made by customers. It identifies candidates, skills, and compensation structure to enhance the customer search experience with better matches. It makes recommendations to customers on salaries, future market needs, etc. It eliminates duplicate profiles with realtime indexing and percolation. It provides an enhanced customer experience and faster responses.
  • 8. Integration (Use Case 2): the full circle. Logstash: takes logs, then scrubs, parses, and enriches the data. ElasticSearch: search and analyze in real time. Kibana: visualization engine for dynamic dashboards created in real time or on the fly.
  • 9. Chat agent for 460 million documents (Use Case 3). Challenge: 6,000 customers from around the world use LiveChat daily to communicate with their customers, from one-person businesses to international organizations like LG, Apple, Adobe, etc. LiveChat customers conduct 3.6 million queries and 220 million "get" operations per day on 460 million documents. LiveChat keeps these documents updated with 70 million indexing operations every day. Solution Advantage: reduce query time from 2 seconds to 100 ms; streamline updating from hours to seconds; guarantee maximum uptime; scale to meet the needs of 6,000 customers; store and search on 460 million documents; process 3.6 million queries per day. Scalability, indexing, and full-text search allow users to search through chat archives. Faceting makes it possible to pull various statistics for LiveChat clients. ES acts as a single datastore, and data updates are available immediately: each document is now updated in LiveChat on average 20 to 30 times every 20 to 60 seconds.
  • 10. Current Uses: Use Case 1; Use Case 2; Use Case 3; Use Case 4; Use Case X.
  • 11. Highlights: Schema and config (solrconfig.xml, elasticsearch.yml; change no. of shards and replicas live). Scaling (nodes autobalanced; SOLR-3755 or shard splitting; add a document). Nesting (address, users & rights, boolean, parent-children). Index = different types of documents and analyzer. Node discovery and fault discovery; ZooKeeper. Multiple documents per schema and parent-child. Percolator. Aggregation + facets in ES; facets in Solr.
  • 12. Highlights (contd. 2): Auto-load balancer and auto-sharding. Marvel metrics on 03/13/2014. Split-brain problem in ES. Structured query DSL and query control. Real-time indexing / near real-time indexing. Query routing, and Solr 5816 to be introduced.
  • 13. ElasticSearch / Solr funnel. Solr: UIMA; text analysis debugger and spell check; decision tree faceting / drilldown; Cloudera, MapR, and DataStax support. ElasticSearch: filters for queries across nested documents; query handling analyzer and language, term suggester, autocomplete; realtime GET with query routing; Hortonworks and Couchbase support.
  • 14. From the web. This is only an FYI: found some customers moving from Solr to ElasticSearch, but could not find any article which mentioned that clients moved from ES to Solr. Caveat: no prejudice, but it would be good to hear what customers say. Let us also check these sites: http://www.ymc.ch/en/why-we-chose-solr-4-0-instead-of-elasticsearch and http://www.mgt-commerce.com/magento-elasticsearch.html. Foursquare = http://engineering.foursquare.com/2012/08/09/foursquare-now-uses-elastic-search-and-on-a-related-note-slashem-also-works-with-elastic-search/ Jetwick = http://karussell.wordpress.com/2011/02/07/why-jetwick-moved-from-solr-to-elasticsearch/ Netricos = http://www.netricos.com/blog/posts/how-we-are-using-elastic-search StumbleUpon = http://www.elasticsearch.org/case-study/stumbleupon/ UK govt. site = https://gds.blog.gov.uk/2012/08/03/from-solr-to-elasticsearch/ Wikimedia = http://thenextweb.com/insider/2014/01/06/wikimedia-will-replace-search-elasticsearch-beta-users-february-users-march-april/#!xDKnd
  • 15. 2 Parts of a whole: The Math. Solr: performs very well on small indexes that don't change very often. ElasticSearch: scalability, auto-sharding, GUI admin, schemaless operation, real-time behavior, nested queries, routing, and the way indexing and queries are handled (which gives faster query execution and better indexing) provide a distinct advantage to using ES.
  • 16. Migration. Step 1: Use a river plugin to migrate from the existing Solr installation to ES. Step 2: The river pulls the content from the existing Solr cluster and indexes it in ES. Step 3: Once you decide to switch to Elasticsearch permanently, switch your indexing so that content goes directly from your sources to Elasticsearch. Keeping Solr in the middle is not a recommended setup.
  • 17. Conclusion: If we have a small site and need search features without the distributed bells and whistles, both Solr and ElasticSearch are efficient. If we are planning a large installation that requires running distributed search with nesting, scalability, sharding, and real-time behavior, ElasticSearch can do a better job. Both products are trying to catch up with each other's capabilities.
  • 18. Where do we go from here? The best way to define this is: some possible next steps. Question to ask.
  • 19. Thank you! 201-925-0488 nikkabs@gmail.com Architecture – Global Head
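
The migration steps on slide 16 (pull content from Solr, index it in ES) can be sketched without a river plugin. The following is a minimal illustration, not the deck's method: it assumes the documents have already been read from a Solr JSON response (`response.docs`) and only shows how they could be turned into a request body for the Elasticsearch `_bulk` endpoint. The function name, the index name `profiles`, and the sample documents are hypothetical. Elasticsearch 1.x, the generation covered in the deck, also expects a `_type` in the action line or URL; that is left out here.

```python
import json
from typing import Any, Dict, Iterable

# Solr adds internal fields such as _version_ to stored documents.
# They are dropped here because they carry no meaning in Elasticsearch.
SOLR_INTERNAL_FIELDS = {"_version_"}


def solr_docs_to_bulk_body(
    docs: Iterable[Dict[str, Any]],
    index_name: str,
    id_field: str = "id",
) -> str:
    """Build a newline-delimited JSON body for the Elasticsearch _bulk API.

    Each document becomes two lines: an action line naming the target
    index and document id, followed by the document source itself.
    """
    lines = []
    for doc in docs:
        source = {k: v for k, v in doc.items() if k not in SOLR_INTERNAL_FIELDS}
        action = {"index": {"_index": index_name, "_id": str(doc[id_field])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    if not lines:
        return ""
    # The bulk body must be terminated by a final newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical Solr response, shaped like the JSON a select query returns.
    solr_response = {
        "response": {
            "numFound": 2,
            "docs": [
                {"id": "1", "name": "Jane Doe", "skills": ["java"], "_version_": 101},
                {"id": "2", "name": "John Roe", "skills": ["python"], "_version_": 102},
            ],
        }
    }
    print(solr_docs_to_bulk_body(solr_response["response"]["docs"], "profiles"), end="")
```

The resulting string would be sent as the body of a POST to the cluster's `_bulk` endpoint. Once the switch is permanent, the same function could be fed directly from the original data sources, which matches the advice in Step 3 not to keep Solr in the middle.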