SlideShare a Scribd company logo
1 of 26
Download to read offline
Introduction to Elasticsearch
Praveen Manvi July 2016
Agenda
• Overview
– History, Product overview
– ES Vocabulary
– Feature set
• Demo
– Setup/ Configuration
– Eco system
– APIs for Index/Search & monitor
What is ElasticSearch?
– Document (Json) oriented search engine
– Distributed
– Horizontally scalable and Highly Available
– Multi-tenancy enabled
– API centric & RESTful
– Built on Lucene search engine library
& used for
– full-text search, structured search, analytics, or all
three in combination
• Elastic search has become de facto search
solution
• few popular examples
• GitHub uses Elasticsearch to query 130 billion lines of
code.
• Wikipedia uses Elasticsearch to provide full-text search
with highlighted search snippets, and search-as-you-
type and did-you-mean suggestions.
• Stack Overflow combines full-text search with
geolocation queries and uses more-like-this to find
related questions and answers.
History
Shay Benon @kimchy
Doug Cutting @cutting
Started Lucene in 1999, released under apache in
2005.
Now part of cloudera supporting rival solution solr
and commercial offerings
Elasticsearch released in February 2010.
Worked on this for 6 years (started with compass)
Now part of http://elastic.co commercial offerings
Building Blocks
Term Description ( ~analogy with relational database)
Cluster ~Database cluster
Group of nodes
Node ~Instance of database
A JVM process, usually a machine
Index ~Database schema
Hosts mapping types and their definitions contains
many shards
Mapping Type ~Database Table
Field description, indexing requirements
Document ~Database row
Json document.
Shard A Lucene index. Scalable unit and heart of search
engine (primary and replica)
Physical Layout
Logical Layout
Lucene Inverted Index
value add over lucene
• Distributed
– Combines results with fork join against multiple indexes, with the new building blocks
• Transaction Log
– The transaction log guarantees durability, Operations are automatically replayed when a
shard is reopened
– It also simplifies shard relocation/recovery, Helps when moving a shard from one node
to another by being able to replay the changes while transferring committed segments
• Flush/Refresh/Monitor APIs
– For managing the cluster/node/index statuses
• Query DSL
– provides huge set of grammar for search syntax
mapping/index/search docs
Document Metadata Fields
• _id - The id of the document
• _type - The document type
• _source - enabled Stores the original document that
was indexed
• _all enabled Indexes all values of all document fields
• _timestamp disabled timestamp associated with the
document
• _ttl disabled optionally defines an expiration time
• _size disabled indexes the size of the uncompressed
Search Controller
Query DSL
Search request in place
Search Types
• COUNT
• Returns no hits, only total count matching the query,
thus executes in a
• single round trip to the shards
• SCAN
• Allows to iterate over large amounts of data using a
cursor to paginate and hence memory efficient, helpful
for re-indexing and decorating data outside the ES.
• SEARCH
• General search
Aggregation
Aggregations
Nested Aggregations
Few interesting Features
• Bulk Indexing
– Send multiple docs to ES
• Multi Get APIs
– Get multiple documents in a single API
• Percolator
– The idea is to have ES to notify your application when new content matches your filters
instead of having to constantly poll the search engine to check for new updates. Great
for building alerts
• Pagination
• Highlighting
Eco System
(debug tools/development)
Client SDKs
Plugins
•head
•Elastic HQ
•Marvel
•BigDesk
[ES_HOME/bin]./plugin install head
Configuration
• Enabling store compression uses 55% less
storage (LZF/snappy)
• Disabling the '_all' field saves you 13% in
storage.
• Removing _source saves ~26% storage on disk
• ES_HEAP_SIZE set it ½ of the machine memory
(os file cache)
• bootstrap.mlockall to true avoids swap
References
• https://www.youtube.com/watch?v=5444z-L2V2A&spfreload=1 - “Lucene now
and then” from Lucene creator Doug Cutting @ twitter, Gives history and how
lucene evolved.
• https://www.youtube.com/watch?v=lpZ6ZajygDY - from elastic search creator
Shay Benon (Its 3 years old, but its very good content on data design patterns)
• https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html -
Official documentation from elasticsearch
• https://www.manning.com/books/elasticsearch-in-action - From this place
diagrams were picked in this presentation

More Related Content

What's hot

Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginnersNeil Baker
 
Introduction à ElasticSearch
Introduction à ElasticSearchIntroduction à ElasticSearch
Introduction à ElasticSearchFadel Chafai
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data AnalyticsFelipe
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaObjectRocket
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchRuslan Zavacky
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1Maruf Hassan
 
Workshop: Learning Elasticsearch
Workshop: Learning ElasticsearchWorkshop: Learning Elasticsearch
Workshop: Learning ElasticsearchAnurag Patel
 
Intro to elasticsearch
Intro to elasticsearchIntro to elasticsearch
Intro to elasticsearchJoey Wen
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackRich Lee
 
Elastic stack Presentation
Elastic stack PresentationElastic stack Presentation
Elastic stack PresentationAmr Alaa Yassen
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search medcl
 
Introduction to Kibana
Introduction to KibanaIntroduction to Kibana
Introduction to KibanaVineet .
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchIsmaeel Enjreny
 

What's hot (20)

Elasticsearch for beginners
Elasticsearch for beginnersElasticsearch for beginners
Elasticsearch for beginners
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Introduction à ElasticSearch
Introduction à ElasticSearchIntroduction à ElasticSearch
Introduction à ElasticSearch
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Elasticsearch for Data Analytics
Elasticsearch for Data AnalyticsElasticsearch for Data Analytics
Elasticsearch for Data Analytics
 
An Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and KibanaAn Intro to Elasticsearch and Kibana
An Intro to Elasticsearch and Kibana
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Kibana overview
Kibana overviewKibana overview
Kibana overview
 
Elasticsearch presentation 1
Elasticsearch presentation 1Elasticsearch presentation 1
Elasticsearch presentation 1
 
Workshop: Learning Elasticsearch
Workshop: Learning ElasticsearchWorkshop: Learning Elasticsearch
Workshop: Learning Elasticsearch
 
Intro to elasticsearch
Intro to elasticsearchIntro to elasticsearch
Intro to elasticsearch
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
 
ELK Stack
ELK StackELK Stack
ELK Stack
 
Elastic stack Presentation
Elastic stack PresentationElastic stack Presentation
Elastic stack Presentation
 
quick intro to elastic search
quick intro to elastic search quick intro to elastic search
quick intro to elastic search
 
Introduction to Kibana
Introduction to KibanaIntroduction to Kibana
Introduction to Kibana
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 

Similar to Introduction to elasticsearch

Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Vinay Kumar
 
Explore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingExplore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingInexture Solutions
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedBeyondTrees
 
Episerver and search engines
Episerver and search enginesEpiserver and search engines
Episerver and search enginesMikko Huilaja
 
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Oleksiy Panchenko
 
Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkDatabricks
 
Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01David Smiley
 
06 integrate elasticsearch
06 integrate elasticsearch06 integrate elasticsearch
06 integrate elasticsearchErhwen Kuo
 
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
(ATS6-PLAT02) Accelrys Catalog and Protocol ValidationBIOVIA
 
Elastic & Azure & Episever, Case Evira
Elastic & Azure & Episever, Case EviraElastic & Azure & Episever, Case Evira
Elastic & Azure & Episever, Case EviraMikko Huilaja
 
Elasticsearch Meetup - August 2018 - SocialCops
Elasticsearch Meetup - August 2018 - SocialCopsElasticsearch Meetup - August 2018 - SocialCops
Elasticsearch Meetup - August 2018 - SocialCopsAayush Sarva
 
ElasticSearch as (only) datastore
ElasticSearch as (only) datastoreElasticSearch as (only) datastore
ElasticSearch as (only) datastoreTomas Sirny
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of GruterBig Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of GruterData Con LA
 
Filebeat Elastic Search Presentation.pptx
Filebeat Elastic Search Presentation.pptxFilebeat Elastic Search Presentation.pptx
Filebeat Elastic Search Presentation.pptxKnoldus Inc.
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and SparkAudible, Inc.
 

Similar to Introduction to elasticsearch (20)

Roaring with elastic search sangam2018
Roaring with elastic search sangam2018Roaring with elastic search sangam2018
Roaring with elastic search sangam2018
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 
Explore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth UsingExplore Elasticsearch and Why It’s Worth Using
Explore Elasticsearch and Why It’s Worth Using
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
 
Episerver and search engines
Episerver and search enginesEpiserver and search engines
Episerver and search engines
 
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
 
Hyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache SparkHyperspace: An Indexing Subsystem for Apache Spark
Hyperspace: An Indexing Subsystem for Apache Spark
 
Introduction to Azure Data Lake
Introduction to Azure Data LakeIntroduction to Azure Data Lake
Introduction to Azure Data Lake
 
Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01Populate your Search index, NEST 2016-01
Populate your Search index, NEST 2016-01
 
Elastic pivorak
Elastic pivorakElastic pivorak
Elastic pivorak
 
Elastic search
Elastic searchElastic search
Elastic search
 
06 integrate elasticsearch
06 integrate elasticsearch06 integrate elasticsearch
06 integrate elasticsearch
 
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
(ATS6-PLAT02) Accelrys Catalog and Protocol Validation
 
Elastic & Azure & Episever, Case Evira
Elastic & Azure & Episever, Case EviraElastic & Azure & Episever, Case Evira
Elastic & Azure & Episever, Case Evira
 
Elasticsearch Meetup - August 2018 - SocialCops
Elasticsearch Meetup - August 2018 - SocialCopsElasticsearch Meetup - August 2018 - SocialCops
Elasticsearch Meetup - August 2018 - SocialCops
 
ElasticSearch as (only) datastore
ElasticSearch as (only) datastoreElasticSearch as (only) datastore
ElasticSearch as (only) datastore
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of GruterBig Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
Big Data Day LA 2015 - What's New Tajo 0.10 and Beyond by Hyunsik Choi of Gruter
 
Filebeat Elastic Search Presentation.pptx
Filebeat Elastic Search Presentation.pptxFilebeat Elastic Search Presentation.pptx
Filebeat Elastic Search Presentation.pptx
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and Spark
 

Recently uploaded

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 

Recently uploaded (20)

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

Introduction to elasticsearch

  • 2. Agenda • Overview – History, Product overview – ES Vocabulary – Feature set • Demo – Setup/ Configuration – Eco system – APIs for Index/Search & monitor
  • 3. What is ElasticSearch? – Document (Json) oriented search engine – Distributed – Horizontally scalable and Highly Available – Multi-tenancy enabled – API centric & RESTful – Built on Lucene search engine library & used for – full-text search, structured search, analytics, or all three in combination
  • 4. • Elastic search has become de facto search solution • few popular examples • GitHub uses Elasticsearch to query 130 billion lines of code. • Wikipedia uses Elasticsearch to provide full-text search with highlighted search snippets, and search-as-you- type and did-you-mean suggestions. • Stack Overflow combines full-text search with geolocation queries and uses more-like-this to find related questions and answers.
  • 5. History Shay Benon @kimchy Doug Cutting @cutting Started Lucene in 1999, released under apache in 2005. Now part of cloudera supporting rival solution solr and commercial offerings Elasticsearch released in February 2010. Worked on this for 6 years (started with compass) Now part of http://elastic.co commercial offerings
  • 6. Building Blocks Term Description ( ~analogy with relational database) Cluster ~Database cluster Group of nodes Node ~Instance of database A JVM process, usually a machine Index ~Database schema Hosts mapping types and their definitions contains many shards Mapping Type ~Database Table Field description, indexing requirements Document ~Database row Json document. Shard A Lucene index. Scalable unit and heart of search engine (primary and replica)
  • 10. value add over lucene • Distributed – Combines results with fork join against multiple indexes, with the new building blocks • Transaction Log – The transaction log guarantees durability, Operations are automatically replayed when a shard is reopened – It also simplifies shard relocation/recovery, Helps when moving a shard from one node to another by being able to replay the changes while transferring committed segments • Flush/Refresh/Monitor APIs – For managing the cluster/node/index statuses • Query DSL – provides huge set of grammar for search syntax
  • 12. Document Metadata Fields • _id - The id of the document • _type - The document type • _source - enabled Stores the original document that was indexed • _all enabled Indexes all values of all document fields • _timestamp disabled timestamp associated with the document • _ttl disabled optionally defines an expiration time • _size disabled indexes the size of the uncompressed
  • 16. Search Types • COUNT • Returns no hits, only total count matching the query, thus executes in a • single round trip to the shards • SCAN • Allows to iterate over large amounts of data using a cursor to paginate and hence memory efficient, helpful for re-indexing and decorating data outside the ES. • SEARCH • General search
  • 20.
  • 21. Few interesting Features • Bulk Indexing – Send multiple docs to ES • Multi Get APIs – Get multiple documents in a single API • Percolator – The idea is to have ES to notify your application when new content matches your filters instead of having to constantly poll the search engine to check for new updates. Great for building alerts • Pagination • Highlighting
  • 25. Configuration • Enabling store compression uses 55% less storage (LZF/snappy) • Disabling the '_all' field saves you 13% in storage. • Removing _source saves ~26% storage on disk • ES_HEAP_SIZE set it ½ of the machine memory (os file cache) • bootstrap.mlockall to true avoids swap
  • 26. References • https://www.youtube.com/watch?v=5444z-L2V2A&spfreload=1 - “Lucene now and then” from Lucene creator Doug Cutting @ twitter, Gives history and how lucene evolved. • https://www.youtube.com/watch?v=lpZ6ZajygDY - from elastic search creator Shay Benon (Its 3 years old, but its very good content on data design patterns) • https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html - Official documentation from elasticsearch • https://www.manning.com/books/elasticsearch-in-action - From this place diagrams were picked in this presentation