Knowing Elasticsearch
By Shagun Rathore
So, what is the ELK Stack?
● "ELK" is the acronym for three open source projects:
Elasticsearch, Logstash, and Kibana.
● Elasticsearch is a search and analytics engine.
● Logstash is a server‑side data processing pipeline that
ingests data from multiple sources simultaneously,
transforms it, and then sends it to a "stash" like
Elasticsearch.
● Kibana lets users visualize Elasticsearch data with
charts and graphs.
ELK vs Elastic stack
"ELK stack" refers only to Elasticsearch, Logstash, and
Kibana. "Elastic Stack" is the broader term: it also includes
other components such as Beats and X-Pack.
Elasticsearch
● Elasticsearch is a distributed, open-source, RESTful,
highly scalable search and analytics engine built on the
Apache Lucene library. It works with all types of data,
including textual, numerical, geospatial, structured, and
unstructured.
● It is the heart of the Elastic Stack.
● Elasticsearch is developed in Java and used by many large
organizations around the world.
● Releases up to 7.10 were licensed under the Apache License
version 2.0. Later releases ship under different license
terms, so check the license of the version you deploy.
What is Elasticsearch used for?
● Application search
● Website search
● Enterprise search
● Logging and log analytics
● Infrastructure metrics and container monitoring
● Application performance monitoring
● Geospatial data analysis and visualization
● Security analytics
● Business analytics
How does Elasticsearch work?
Raw data flows into Elasticsearch from a variety of sources,
including logs, system metrics, and web applications.
Data ingestion is the process by which this raw data is
parsed, normalized, and enriched before it is indexed in
Elasticsearch.
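The parse, normalize, and enrich steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log line format, the field names, and the ingest function are hypothetical, and real deployments would typically use Logstash, Beats, or an ingest pipeline for this work.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical log format, chosen only for this illustration.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)" (?P<status>\d{3})'
)

def ingest(raw_line):
    """Parse, normalize, and enrich one raw log line into a JSON-ready dict."""
    match = LOG_PATTERN.match(raw_line)
    if match is None:
        raise ValueError(f"unparseable line: {raw_line!r}")
    fields = match.groupdict()                                # parse
    ts = datetime.fromisoformat(fields["ts"])
    fields["ts"] = ts.astimezone(timezone.utc).isoformat()    # normalize to UTC
    fields["status"] = int(fields["status"])                  # normalize type
    fields["method"] = fields["method"].upper()               # normalize case
    fields["is_error"] = fields["status"] >= 500              # enrich
    return fields

doc = ingest('10.0.0.7 [2024-03-12T08:15:02+01:00] "get /cart" 503')
print(json.dumps(doc, indent=2))
```

The resulting dict is the kind of JSON document that would then be sent to Elasticsearch for indexing.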
Once indexed in Elasticsearch, users can run complex queries
against their data and use aggregations to retrieve complex
summaries of their data.
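As a rough illustration of a query combined with an aggregation, the sketch below builds a search request body in the Elasticsearch Query DSL and then mimics, on a handful of in-memory documents, what the aggregation would compute. The field names (message, status) and the sample documents are assumptions made for the example, and the local matching is a simplification of real full-text scoring.

```python
import json
from collections import Counter

# Search body in Query DSL: a full-text match plus a terms aggregation.
# "size": 0 asks for aggregation results only, without the matching hits.
search_body = {
    "query": {"match": {"message": "timeout"}},
    "aggs": {"by_status": {"terms": {"field": "status"}}},
    "size": 0,
}

# Local stand-in for the aggregation, run on hypothetical documents.
docs = [
    {"message": "upstream timeout", "status": 504},
    {"message": "connection timeout", "status": 504},
    {"message": "timeout while reading body", "status": 408},
    {"message": "ok", "status": 200},
]
matching = [d for d in docs if "timeout" in d["message"].split()]
buckets = Counter(d["status"] for d in matching)

print(json.dumps(search_body))
print(buckets.most_common())
```

Each bucket pairs a distinct status value with the number of matching documents that carry it, which is the kind of summary the terms aggregation returns.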
From Kibana, users can create powerful visualizations of
their data, share dashboards, and manage the Elastic Stack.
Document
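A document is a collection of fields expressed in JSON
format.
Every document resides inside an index. Older versions also
assigned each document to a mapping type; types were
deprecated in 6.x and removed in later releases.
Every document is identified within its index by a unique
identifier, stored in the _id field.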
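For illustration, a stored document can be pictured as a JSON object that carries the index name, the document identifier, and the original fields under _source. The index name, the ID, and every field below are made up for this sketch.

```python
import json

# Hypothetical document; index name, ID, and fields are illustrative only.
stored = {
    "_index": "customers",
    "_id": "42",
    "_source": {
        "name": "Asha Verma",
        "signup_date": "2024-03-12",
        "active": True,
        "tags": ["beta", "newsletter"],
        "location": {"lat": 28.61, "lon": 77.21},
    },
}

print(json.dumps(stored, indent=2))
```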
What is an Elasticsearch index?
An Elasticsearch index is a collection of documents that are related to each
other. Elasticsearch stores data as JSON documents. Each document correlates a
set of keys (names of fields or properties) with their corresponding values
(strings, numbers, Booleans, dates, arrays of values, geolocations, or other
types of data).
Elasticsearch uses a data structure called an inverted index, which is
designed to allow very fast full-text searches. An inverted index lists every
unique word that appears in any document and identifies all of the documents
each word occurs in.
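A toy version of this structure fits in a few lines of Python. It is a minimal sketch, not how Lucene stores its index: tokenization here is just lowercasing and splitting on whitespace, and the function names are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each unique term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)

def search(index, query):
    """Return IDs of documents containing every query term (AND semantics)."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()

docs = {1: "Quick brown fox", 2: "Quick blue hare", 3: "Lazy brown dog"}
index = build_inverted_index(docs)
print(search(index, "brown"))        # documents 1 and 3
print(search(index, "quick brown"))  # document 1 only
```

A lookup touches only the entries for the query terms instead of scanning every document, which is why this layout suits full-text search.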
During the indexing process, Elasticsearch stores documents and builds an
inverted index to make the document data searchable in near real-time.
Indexing is initiated with the index API, through which you can add or update
a JSON document in a specific index.
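The shape of an index API call can be sketched without contacting a cluster. The helper below only assembles the HTTP method, path, and JSON body; the function name and the sample index and document are assumptions for the example, and sending the request (with a client library or an HTTP tool) is left out.

```python
import json
from typing import Optional
from urllib.parse import quote

def index_request(index: str, document: dict, doc_id: Optional[str] = None):
    """Return (method, path, body) for an index API call, without sending it."""
    body = json.dumps(document)
    if doc_id is None:
        # No ID supplied: POST lets Elasticsearch generate one.
        return "POST", f"/{quote(index)}/_doc", body
    # Explicit ID: PUT adds the document, or replaces an existing one with that ID.
    return "PUT", f"/{quote(index)}/_doc/{quote(doc_id, safe='')}", body

print(index_request("products", {"name": "lamp"}, "1"))
print(index_request("products", {"name": "desk"}))
```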
Shards
Indexes are horizontally subdivided into shards.
Each shard holds documents with the full set of fields, but
only a subset of the documents in the index.
Because of this horizontal split, each shard is a
self-contained unit (a Lucene index in its own right) that
can be stored on any node. A primary shard is an original
horizontal partition of the index; primary shards are then
copied into replica shards.
Shards
● A shard is a partition (piece) of an index.
● Shards split the index horizontally.
● You define the number of primary shards when the index is created; the
number of replicas can be changed later.
● The shard that receives writes first is called the primary shard.
● Replication in Elasticsearch is done with replica shards.
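How a document ends up in one particular shard can be sketched as a hash of its routing value (by default the document ID) taken modulo the number of primary shards. The sketch below uses zlib.crc32 purely as a stand-in hash; Elasticsearch uses its own hash function internally, and the function name and IDs are hypothetical.

```python
import zlib

def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Simplified routing: hash the routing value, modulo the primary shard count."""
    # crc32 is a stand-in; Elasticsearch uses a different hash internally.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards

ids = [f"user-{n}" for n in range(1, 9)]
for doc_id in ids:
    print(doc_id, "-> shard", pick_shard(doc_id, 5))

# Changing the shard count changes the result of the modulo for many IDs,
# so those documents would no longer be found where they were written.
moved = [d for d in ids if pick_shard(d, 5) != pick_shard(d, 6)]
print("IDs that would map to a different shard with 6 shards:", moved)
```

This dependence on the shard count is one reason the number of primary shards is chosen when the index is created.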
Replicas
● Elasticsearch allows a user to create replicas of their indexes and
shards. Replication not only helps in increasing the availability of data
in case of failure, but also improves the performance of searching by
carrying out a parallel search operation in these replicas.
● A replica contains the same data as its primary shard.
● A replica is never allocated to the same node as its primary shard.
● Allows for fault tolerance.
● Scales search throughput.
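The rule that a replica never shares a node with its primary can be illustrated with a simple round-robin placement. This is a sketch under assumptions, not the allocation algorithm Elasticsearch runs: the function name and node names are hypothetical, and a copy that cannot get a node of its own is simply left unassigned (None).

```python
def allocate(num_primaries, num_replicas, nodes):
    """Round-robin placement that keeps every copy of a shard on a distinct node."""
    placement = {}
    for shard in range(num_primaries):
        for copy in range(num_replicas + 1):
            role = "primary" if copy == 0 else f"replica-{copy}"
            if copy < len(nodes):
                # Offsetting by the copy number guarantees distinct nodes per shard.
                placement[(shard, role)] = nodes[(shard + copy) % len(nodes)]
            else:
                # Not enough nodes: this copy stays unassigned.
                placement[(shard, role)] = None
    return placement

placement = allocate(3, 1, ["node-a", "node-b", "node-c"])
for (shard, role), node in sorted(placement.items()):
    print(f"shard {shard} {role}: {node}")
```

If node-a fails in this layout, every shard still has one copy on another node, which is the fault tolerance the bullets above describe.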
Node
● A node is a single server in a cluster.
● A node has a unique name in the cluster.
Cluster
● A cluster is a collection of one or more nodes (servers).
● It allows searching and indexing across all nodes in
the cluster.
● Each node is one running Elasticsearch instance; each
shard it holds is a Lucene index.
● Every cluster is identified by its unique name. (This
is important for multi-cluster setups.)
Cluster Status
Your cluster is in one of three states, depending on the allocation of
primary and replica shards.
● Green, when all primary shards and all replica shards are allocated.
● Yellow, when all primary shards are allocated but one or more
replica shards are unallocated.
● Red, when one or more primary shards are unallocated.
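The three rules above translate directly into a small function. This is a sketch of the decision logic only; the function name and the (is_primary, is_allocated) input shape are assumptions for the example. On a real cluster the status is reported by the cluster health API.

```python
def cluster_status(shards):
    """Derive the status color from (is_primary, is_allocated) pairs."""
    shards = list(shards)
    if any(primary and not allocated for primary, allocated in shards):
        return "red"      # at least one primary shard is unallocated
    if any(not primary and not allocated for primary, allocated in shards):
        return "yellow"   # all primaries allocated, some replica is not
    return "green"        # every primary and replica is allocated

print(cluster_status([(True, True), (False, True)]))
print(cluster_status([(True, True), (False, False)]))
print(cluster_status([(True, False), (False, True)]))
```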
Comparison between Elasticsearch and RDBMS
Node Types
Master Eligible Node (Default: True)
A master-eligible node can be elected master. The elected master handles cluster-wide management:
creating and deleting indexes, tracking which nodes belong to the cluster, and allocating shards.
Data Node (Default: True)
Data nodes hold the shards. Index, delete, search, and other document operations are performed on
data nodes.
Ingest Node (Default: True)
Ingest nodes preprocess documents before they are indexed, similar in purpose to a lightweight Logstash.
Node Types
Coordinating Only Node (Default: false)
A coordinating-only node acts as a smart load balancer that routes requests to the
appropriate nodes.
It also handles the reduce phase of a search, merging the per-shard results.
It distributes bulk indexing requests.
Machine Learning Node
Machine learning is an X-Pack feature that is not free.
On this node, you can run machine learning jobs and handle machine learning API requests.
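The search reduction performed by a coordinating node can be pictured as a scatter and gather: each shard returns its own top hits, and the coordinator merges them into a global top list. The sketch below is an illustration under assumptions, with in-memory dicts standing in for shards, a plain term count standing in for the relevance score, and hypothetical function names.

```python
import heapq

def shard_search(shard_docs, term, k):
    """Per-shard phase: score local documents and return the local top k."""
    scored = [(text.lower().split().count(term), doc_id)
              for doc_id, text in shard_docs.items()]
    hits = [(score, doc_id) for score, doc_id in scored if score > 0]
    return heapq.nlargest(k, hits)

def coordinate(shards, term, k):
    """Coordinating phase: fan out to every shard, then reduce to a global top k."""
    partial = [hit for shard in shards for hit in shard_search(shard, term, k)]
    return heapq.nlargest(k, partial)

shards = [
    {"a": "error error disk", "b": "disk ok"},
    {"c": "error timeout", "d": "error error error"},
]
print(coordinate(shards, "error", 2))
```

Each shard only ships its k best candidates, so the coordinator merges small lists rather than every matching document.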
What programming languages does Elasticsearch support?
Elasticsearch supports a variety of languages and official
clients are available for:
● Java
● JavaScript (Node.js)
● Go
● .NET (C#)
● PHP
● Perl
● Python
● Ruby
Amazon Elasticsearch
Amazon Elasticsearch Service (renamed Amazon OpenSearch Service in
2021) is a fully managed service that makes it easy for you to deploy,
secure, and run Elasticsearch cost effectively at scale.
You can build, monitor, and troubleshoot your applications using the
tools you love, at the scale you need.
The service provides support for open source Elasticsearch APIs,
managed Kibana, integration with Logstash and other AWS services, and
built-in alerting and SQL querying.
Amazon Elasticsearch Service lets you pay only for what you use – there
are no upfront costs or usage requirements. With Amazon Elasticsearch
Service, you get the ELK stack you need, without the operational
overhead.
Benefits
Easy to deploy and manage
With Amazon Elasticsearch Service you can deploy your
Elasticsearch cluster in minutes. The service simplifies
management tasks such as hardware provisioning, software
installation and patching, failure recovery, backups, and
monitoring.
To monitor your clusters, Amazon Elasticsearch Service includes
built-in event monitoring and alerting, so you are notified of
changes in your data and can address issues proactively.
Benefits
Highly scalable and available
Amazon Elasticsearch Service lets you store up to 3 PB of data in
a single cluster, enabling you to run large log analytics
workloads via a single Kibana interface.
You can easily scale your cluster up or down via a single API
call or a few clicks in the AWS console.
Amazon Elasticsearch Service is designed to be highly available
using multi-AZ deployments, which allows you to replicate data
between three Availability Zones in the same region.
Benefits
Highly secure
For your data in Elasticsearch Service, you can achieve
network isolation with Amazon VPC, encrypt data at-rest and
in-transit using keys you create and control through AWS
KMS, and manage authentication and access control with
Amazon Cognito and AWS IAM policies.
Amazon Elasticsearch Service is also HIPAA eligible and
compliant with PCI DSS, SOC, ISO, and FedRAMP standards to
help you meet industry-specific or regulatory requirements.
Benefits
Cost-effective
With Amazon Elasticsearch Service, you pay only for the resources you
consume.
You can select on-demand pricing with no upfront costs or long-term
commitments, or achieve significant cost savings with Reserved
Instance pricing.
As a fully managed service, Amazon Elasticsearch Service further lowers
your total cost of operations by eliminating the need for a dedicated
team of Elasticsearch experts to monitor and manage your clusters.
Use Cases
Application monitoring
Store, analyze, and correlate application and infrastructure log data to find
and fix issues faster and improve application performance.
Enable trace data analysis for your distributed applications to quickly
identify performance issues. You can receive automated alerts if your
application is underperforming, enabling you to proactively address any
issues.
An online travel company, for example, can use Amazon Elasticsearch Service to
analyze logs from its applications to identify and resolve performance
bottlenecks or availability issues, ensuring a streamlined booking experience.
Use Cases
Security information and event management (SIEM)
Centralize and analyze logs from disparate applications and
systems across your network for real-time threat detection
and incident management.
A telecom company, for example, can use Amazon Elasticsearch
Service with Kibana to quickly index, search, and visualize
logs from its routers, applications, and other devices to
find and prevent security threats such as data breaches,
unauthorized login attempts, DoS attacks, and fraud.
Use Cases
Search
Provide a fast, personalized search experience for your applications,
websites, and data lake catalogs, allowing your users to quickly find
relevant data.
For example, a real estate business can use Amazon Elasticsearch
Service to help its consumers find homes in their desired location, in
a certain price range from among millions of real-estate properties.
You get access to all of Elasticsearch’s search APIs, supporting
natural language search, auto-completion, faceted search, and
location-aware search.
Use Cases
Infrastructure monitoring
Collect logs and metrics from your servers, routers, switches,
and virtualized machines to get comprehensive visibility into
your infrastructure, reducing mean time to detect (MTTD) and
mean time to resolve (MTTR) issues and lowering system downtime.
A gaming company, for example, can use Amazon Elasticsearch
Service to monitor and analyze server logs to identify any server
performance issues that could lead to application downtime.
Advantages
● Elasticsearch is developed in Java, which makes it compatible with almost every
platform.
● Elasticsearch is near real time: with the default refresh interval, an added
document becomes searchable after about one second.
● Elasticsearch is distributed, which makes it easy to scale and to integrate into
any large organization.
● Creating full backups is straightforward: current versions provide snapshot and
restore for this purpose, while older material describes the gateway concept.
● Handling multi-tenancy is easier in Elasticsearch than in Apache Solr.
● Elasticsearch returns JSON objects as responses, which makes it possible to call
the Elasticsearch server from a large number of programming languages.
● Elasticsearch supports almost every document type except those that do not support
text rendering.
Elasticsearch in action
Thank YOU...
