SlideShare ist ein Scribd-Unternehmen logo
1 von 60
Downloaden Sie, um offline zu lesen
Brahe - Flexible Indexing At Scale
Ben Brown
Software Architect, Cerner Corporation
Who I Am
• Ben Brown
Software Architect
• Cerner
Healthcare IT Company
• Semantic Solutions
Team of 10
Search Services
Fun Stuff
NLP, Medical Ontologies, ML
Chart Search
Taking This
Photo: http://bit.ly/Y7kTJt
Chart Search
Turning it into this
Chart Search Does
• Faceting
• NLP
• Semantic Concept Markup
Makes for a heavy record
(Especially on Solr 1.4)
Where We Started
Started Major Engineering in 2009
IBM Dev Works: http://ibm.co/14ZrtqX
Where We Started
Started Major Engineering in 2009
IBM Dev Works: http://ibm.co/14ZrtqX
Scale
• Clusters partitioned by client
• Raw and processed data in HDFS
• All processing & indexing done through map
reduce
Shard Size
Limiting Factor ~26 Million Discrete Results Per
Shard
Average of 35 Shards Per Client
Range 5 to 140
Query Touch Points
Query Touch Points
One User Action ~ 4 Queries
35 Shards - 432 Touch Points
140 Shards - 1692 Touch Points
• Works, but not efficient
• Chance for variance killing performance
• Failure is a massive config headache
Growth
• Hashed ID does not play well with resizing
• Deploy Again
• Reindex Everything
Document Hash modulo Shard Count
Doc One:Hash(abc123) = 15
Doc Two: Hash(efg456) = 8
Doc Three: Hash(hij789) = 7
3 Shards
Doc One -> Shard 0
Doc Two -> Shard 2
Doc Three -> Shard 1
4 Shards
Doc One -> Shard 3
Doc Two -> Shard 0
Doc Three -> Shard 3
We Have a Problem
Painful Growth
Lots of Deploys
Variance Risk
Image: http://bit.ly/Y7oBD6
What Would Be Better?
Load Balance at the Client
Automated Failover
Easy Deployments
Simplified Splitting
Minimized Touch Points
Disconnected Stages
Solution
Shift Master to HBase
Image: http://bit.ly/ZXO2na
Why HBase?
Lexically organized keys
Efficient key range scans
Efficient time based scans
We're pretty good at operating it
Coordinate With ZooKeeper
|-- Index name
|-- Version
|-- Solr Schema/Config
|-- Table Name + Connection Info
|-- Shard Number
|-- Shard Boundary Info
|-- Replica Number
|-- Ephemeral Claim
|-- Solr Connection Info
|-- Ephemeral Online
Custom Core Admin
Work with ZooKeeper for claim process
Creates solr core after claims
Controls pulling data from HBase
Claim Process
Claim Process
Claim Process
Image: http://bit.ly/Or317R
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Claim Process
Coordinate With ZooKeeper
|-- Index name
|-- Version
|-- Solr Schema/Config
|-- Table Name + Connection Info
|-- Shard Number
|-- Shard Boundary Info
|-- Replica Number
|-- Ephemeral Claim
|-- Solr Connection Info
|-- Ephemeral Online
Queries
• Client inspects ZooKeeper
• Finds online nodes
o Only for the keyspace it cares about
o Issues distributed queries if necessary
• Balances in the Client
• Retries if queries fail
Ends Thoughts
• Keep things simple
• Disconnect your stages
• Keep your touchpoints at a minimum
• Organize your data around your queries
• Use what you’re good at
CONTACT
Ben Brown
http://linkd.in/ZZIBK4
@b_brown
ENGINEERING BLOG
https://engineering.cerner.com/
WE’RE HIRING!
http://www.cerner.com/About_Cerner/Careers/
Bonus Slides!
Brahe   mass scale flexible indexing
Brahe   mass scale flexible indexing
Brahe   mass scale flexible indexing
Brahe   mass scale flexible indexing
Brahe   mass scale flexible indexing
Brahe   mass scale flexible indexing

Weitere ähnliche Inhalte

Ähnlich wie Brahe mass scale flexible indexing

Was l iberty for java batch and jsr352
Was l iberty for java batch and jsr352Was l iberty for java batch and jsr352
Was l iberty for java batch and jsr352
sflynn073
 
Enhance ServiceNow with Automated Discovery for Mainframe and IBM i
Enhance ServiceNow with Automated Discovery for Mainframe and IBM iEnhance ServiceNow with Automated Discovery for Mainframe and IBM i
Enhance ServiceNow with Automated Discovery for Mainframe and IBM i
Precisely
 

Ähnlich wie Brahe mass scale flexible indexing (20)

Was l iberty for java batch and jsr352
Was l iberty for java batch and jsr352Was l iberty for java batch and jsr352
Was l iberty for java batch and jsr352
 
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflowsCloud nativecomputingtechnologysupportinghpc cognitiveworkflows
Cloud nativecomputingtechnologysupportinghpc cognitiveworkflows
 
Insights on Knative and how it changes the serverless landscape
Insights on Knative and how it changes the serverless landscapeInsights on Knative and how it changes the serverless landscape
Insights on Knative and how it changes the serverless landscape
 
Enhance ServiceNow with Automated Discovery for Mainframe and IBM i
Enhance ServiceNow with Automated Discovery for Mainframe and IBM iEnhance ServiceNow with Automated Discovery for Mainframe and IBM i
Enhance ServiceNow with Automated Discovery for Mainframe and IBM i
 
Gemfire Introduction
Gemfire Introduction Gemfire Introduction
Gemfire Introduction
 
EEDC 2010. Scaling Web Applications
EEDC 2010. Scaling Web ApplicationsEEDC 2010. Scaling Web Applications
EEDC 2010. Scaling Web Applications
 
Benchmarking Hadoop and Big Data
Benchmarking Hadoop and Big DataBenchmarking Hadoop and Big Data
Benchmarking Hadoop and Big Data
 
General 05 integration design vs migration design
General 05   integration design vs migration designGeneral 05   integration design vs migration design
General 05 integration design vs migration design
 
Building Reactive Applications With Node.Js And Red Hat JBoss Data Grid (Gald...
Building Reactive Applications With Node.Js And Red Hat JBoss Data Grid (Gald...Building Reactive Applications With Node.Js And Red Hat JBoss Data Grid (Gald...
Building Reactive Applications With Node.Js And Red Hat JBoss Data Grid (Gald...
 
Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures Bitfusion Nimbix Dev Summit Heterogeneous Architectures
Bitfusion Nimbix Dev Summit Heterogeneous Architectures
 
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid WarehouseUsing the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
Using the Power of Big SQL 3.0 to Build a Big Data-Ready Hybrid Warehouse
 
Informix 14.1 launch Webinar
Informix 14.1 launch WebinarInformix 14.1 launch Webinar
Informix 14.1 launch Webinar
 
J1 - Keynote Data Platform - Rohan Kumar
J1 - Keynote Data Platform - Rohan KumarJ1 - Keynote Data Platform - Rohan Kumar
J1 - Keynote Data Platform - Rohan Kumar
 
Protecting Your Power Systems with Cloud-based HA/DR
Protecting Your Power Systems with Cloud-based HA/DRProtecting Your Power Systems with Cloud-based HA/DR
Protecting Your Power Systems with Cloud-based HA/DR
 
TDWI San Diego 2014: Wendy Lucas Describes how BLU Acceleration Delivers In-T...
TDWI San Diego 2014: Wendy Lucas Describes how BLU Acceleration Delivers In-T...TDWI San Diego 2014: Wendy Lucas Describes how BLU Acceleration Delivers In-T...
TDWI San Diego 2014: Wendy Lucas Describes how BLU Acceleration Delivers In-T...
 
File Manager for z/OS - Overview
File Manager for z/OS - OverviewFile Manager for z/OS - Overview
File Manager for z/OS - Overview
 
Informix 14.1 launch webinar
Informix 14.1 launch webinarInformix 14.1 launch webinar
Informix 14.1 launch webinar
 
Presentation design - key concepts and approaches for designing your deskto...
Presentation   design - key concepts and approaches for designing your deskto...Presentation   design - key concepts and approaches for designing your deskto...
Presentation design - key concepts and approaches for designing your deskto...
 
presentation slides
presentation slidespresentation slides
presentation slides
 
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...HBaseCon 2013:  Evolving a First-Generation Apache HBase Deployment to Second...
HBaseCon 2013: Evolving a First-Generation Apache HBase Deployment to Second...
 

Mehr von lucenerevolution

Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
lucenerevolution
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - final
lucenerevolution
 

Mehr von lucenerevolution (20)

Text Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and LuceneText Classification Powered by Apache Mahout and Lucene
Text Classification Powered by Apache Mahout and Lucene
 
State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here! State of the Art Logging. Kibana4Solr is Here!
State of the Art Logging. Kibana4Solr is Here!
 
Search at Twitter
Search at TwitterSearch at Twitter
Search at Twitter
 
Building Client-side Search Applications with Solr
Building Client-side Search Applications with SolrBuilding Client-side Search Applications with Solr
Building Client-side Search Applications with Solr
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Scaling Solr with SolrCloud
Scaling Solr with SolrCloudScaling Solr with SolrCloud
Scaling Solr with SolrCloud
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and ParboiledImplementing a Custom Search Syntax using Solr, Lucene, and Parboiled
Implementing a Custom Search Syntax using Solr, Lucene, and Parboiled
 
Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs Using Solr to Search and Analyze Logs
Using Solr to Search and Analyze Logs
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
 
Real-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and StormReal-time Inverted Search in the Cloud Using Lucene and Storm
Real-time Inverted Search in the Cloud Using Lucene and Storm
 
Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?Solr's Admin UI - Where does the data come from?
Solr's Admin UI - Where does the data come from?
 
Schemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST APISchemaless Solr and the Solr Schema REST API
Schemaless Solr and the Solr Schema REST API
 
High Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with LuceneHigh Performance JSON Search and Relational Faceted Browsing with Lucene
High Performance JSON Search and Relational Faceted Browsing with Lucene
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
 
Recent Additions to Lucene Arsenal
Recent Additions to Lucene ArsenalRecent Additions to Lucene Arsenal
Recent Additions to Lucene Arsenal
 
Turning search upside down
Turning search upside downTurning search upside down
Turning search upside down
 
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
Spellchecking in Trovit: Implementing a Contextual Multi-language Spellchecke...
 
Shrinking the haystack wes caldwell - final
Shrinking the haystack   wes caldwell - finalShrinking the haystack   wes caldwell - final
Shrinking the haystack wes caldwell - final
 

Kürzlich hochgeladen

Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
AnaAcapella
 
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdfVishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
ssuserdda66b
 

Kürzlich hochgeladen (20)

Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdfVishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 

Brahe mass scale flexible indexing