www.semplicityinc.com
FROM THE TRENCHES: SCALING A LARGE LOG MANAGEMENT DEPLOYMENT
War stories, tips & gotchas – it’s all here.
Prepared by SEMplicity, Inc.
George Boitano
(617) 524-0171
gboitano@semplicityinc.com
www.semplicityinc.com
© Copyright 2019 SEMplicity, Inc.
Connecting the Best of Both Worlds
Who We Are
– SEMplicity provides Elastic professional services for legacy SIEM
modernization.
– SEMplicity is an official licensed Elastic Managed Services Provider (MSP).
– SEMplicity is the largest Micro Focus services provider for ArcSight with 10
years of experience with legacy SIEM and log management.
Modern Log Management with Elastic
The Challenge
A retailer needs much faster log search response times:
• using legacy log storage, searches spanning time periods of more than an
hour take minutes or hours to return;
• legacy log storage is very expensive at present, and getting worse as log
volumes increase;
• legacy log storage technology is frozen, without any roadmap for advanced
visualizations, machine learning, etc.
Enter FastSearch, our Elastic log management deployment:
• planned users include SOC analysts, incident response and hunt team;
• initial use case: fast searching of log records using keywords and free text;
• follow-on use case: Elastic storage of sensitive compliance logs (PCI, HIPAA) at
evidentiary standards;
• roadmap includes advanced analyst and executive visualizations, alerting,
incident response research dashboards, and unsupervised machine learning
anomaly detection.
Requirements Metrics
Metric        Service Level
Retention     30 days or more
Volume        Approximately 120K events per second (EPS)
Storage       More than 31.3 TB per day
Performance   30-day single keyword search returns in under 6 seconds

Log Source        30-Day Storage
Windows Servers   55 TB
Firewalls         270 TB
Web Proxies       110 TB
Other Sources     About 500 TB
ECE or Elastic Cloud Enterprise
We deployed Elastic Cloud Enterprise (ECE) internally on repurposed hardware.
Here are the benefits of ECE:
• Sensitive or regulated data is stored and remains available within the internal
network.
• Centralized management of Elasticsearch deployments for:
    • Provisioning
    • Scaling
    • Monitoring
    • Upgrades (minimal to no downtime)
    • Backup and restore
Hardware
Our biggest challenge was designing the ECE (Elastic Cloud Enterprise)
installation for our client around the hardware we inherited:
• 60 or so Red Hat servers with 256 GB of memory and varying storage
capabilities;
• some with small SSD drives, some with only spinning disks, some with both;
• most servers have between 19 TB and 24 TB of storage available.
To get the most out of the available resources, we decided on an
ECE implementation with a RAM:Storage ratio of 1:98, as described later.
High Level ECE Design
Availability Zones
Here is the Elastic recommended configuration for 3 availability zones:
Availability Zones
Because of the hardware (disk) available, we consolidated this design a bit. Here is the
actual design we used:
ECE Clusters
The first cluster (now called a deployment) we set up was for sizing. It
starts as a single-node, single-shard cluster. The disk allocated and the number
of nodes/shards change depending on the event source, so we try to
keep enough free space available in this sizing cluster for onboarding.
Determining Storage Density
Elastic Cloud Enterprise (ECE) provisions clusters at a ratio of 1 GB of RAM for
every 32 GB of storage by default. Since we have servers with 256 GB RAM and 24 TB of
storage, we arrived at a memory-to-storage ratio of 1:98 for each allocator. This
can be calculated using this formula:
• Storage / RAM = Storage Density
This gets more complex when you factor in the instance size you plan to use.
ECE allows you to create instances with RAM allocations like this:
• For large deployments, smaller instances can be problematic. We have seen
ingest issues with instances of 16 GB and smaller (remember, we’re dealing with high
volumes). An instance with 64 GB RAM is allocated 16 processors, whereas
an instance with 16 GB RAM is allocated 4 processors.
• Calculate storage density with instance size in mind: make
sure your storage density allows the maximum number of instances to
be created.
Storage Density Example
Say you have a server you plan to use as an ECE allocator with 24 TB of storage
and 256 GB RAM.
• The storage density would be roughly 1:93.
• If you decide to use mostly 64 GB nodes (which, annoyingly, ECE calls
instances), that would be 5.9 TB of storage per node. That’s roughly 4 nodes per
allocator.
• I’m rounding off numbers here, though. Realistically, it’s 93.75 GB of storage to
every 1 GB of RAM. That means setting your RAM:Disk ratio to 1:93 leaves
around 192 GB of storage unused. This is compounded when you take into
consideration that only 4 nodes will fit on each allocator.
• Using 1:93 as your storage density, you will realistically only get 23.8 TB of
available storage.
This isn’t a huge problem normally, but when you have systems with different
RAM-to-disk ratios, it gets difficult to avoid wasted disk space.
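The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function and key names are made up here, 1 TB is taken as 1,000 GB to match the slide's figures, and the ratio is assumed to be rounded down to a whole number as in the example.

```python
# Sketch of the storage-density arithmetic from the example slide.
# Names are illustrative only; nothing here is an ECE API.

def allocator_plan(storage_gb: int, ram_gb: int, instance_ram_gb: int) -> dict:
    """Work out the whole-number RAM:storage ratio and what it leaves unused."""
    exact_density = storage_gb / ram_gb      # GB of storage per 1 GB of RAM
    ratio = int(exact_density)               # assumption: ratio rounded down
    usable_gb = ratio * ram_gb               # storage usable at 1:ratio
    return {
        "exact_density": exact_density,
        "ratio": ratio,
        "usable_gb": usable_gb,
        "unused_gb": storage_gb - usable_gb,
        "instances": ram_gb // instance_ram_gb,
        "storage_per_instance_gb": ratio * instance_ram_gb,
    }


# 24 TB of storage (taken as 24,000 GB), 256 GB RAM, 64 GB instances
plan = allocator_plan(storage_gb=24_000, ram_gb=256, instance_ram_gb=64)
for key, value in plan.items():
    print(f"{key}: {value}")
```

Running the same function against each distinct hardware profile shows quickly how much disk a given ratio would strand on each allocator.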
Shard Sizing (Number of Shards)
There are different approaches to shard sizing. This is how
we arrived at the number of shards and nodes to handle the requirement.
In addition to setting up a sizing deployment, we also set up a monitoring
deployment to view the indexing statistics as well as Logstash performance. With
the index mapping template tuned for disk optimization, we started sending Proxy
events which had been enriched with Logstash and properly parsed for keyword,
IP, text and other field types.
• Determine the daily index size by indexing for a couple of days (during the
week).
    • 1,300 GB (2,600 GB with replica) + 25% for growth = 3,250 GB
• Determine the total index size based on the number of retention days.
    • 3,250 GB * 30 days = 97,500 GB
• Determine the number of shards. A good rule of thumb is to shoot for shard
sizes of 60 GB or less.
    • 3,250 GB / 60 GB ≈ 54 shards
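The three steps above can be expressed as a small calculation. This is a sketch under stated assumptions: the function name and parameters are invented for illustration, one replica is assumed, and the shard count is rounded to the nearest whole number to match the slide's figure of 54.

```python
# Sketch of the shard-sizing steps; names and defaults are illustrative.

def shard_plan(daily_primary_gb: float, replicas: int = 1, growth: float = 0.25,
               retention_days: int = 30, target_shard_gb: float = 60):
    """Return (daily index size, total index size, shard count)."""
    # Daily size including replica copies and growth headroom
    daily_gb = daily_primary_gb * (1 + replicas) * (1 + growth)
    # Total size held over the retention period
    total_gb = daily_gb * retention_days
    # Rounded to the nearest whole shard, as on the slide. Using math.ceil
    # instead would give 55 and keep every shard strictly under the target.
    shards = round(daily_gb / target_shard_gb)
    return daily_gb, total_gb, shards


daily_gb, total_gb, shards = shard_plan(1300)
print(daily_gb, total_gb, shards)
```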
Instance or Node Sizing
Because of the higher EPS for this log source, we decided to go with
instances of 64 GB RAM and 6.13 TB of storage. To determine the number of
nodes (instances):
a) Start with the total size of the index for the 30-day retention period
(97,500 GB), which is derived from the daily index size.
b) Take the storage available on one node (6,130 GB).
c) The number of nodes is 97,500 / 6,130 ≈ 16 nodes. Since we have 3
zones and nodes are distributed equally, we went with 18 nodes total.
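As a rough sketch of steps a) through c), the node count can be computed by rounding up to whole nodes and then up to a multiple of the zone count. The function name is hypothetical, and the equal-per-zone rounding is an interpretation of "nodes are distributed equally".

```python
import math

# Sketch of the node-count calculation; the function name is illustrative.

def nodes_needed(total_index_gb: float, node_storage_gb: float, zones: int):
    """Return (minimum nodes, nodes after evening out across zones)."""
    minimum = math.ceil(total_index_gb / node_storage_gb)
    # Round up so every availability zone holds the same number of nodes
    per_zone = math.ceil(minimum / zones)
    return minimum, per_zone * zones


minimum_nodes, total_nodes = nodes_needed(97_500, 6_130, zones=3)
print(minimum_nodes, total_nodes)
```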
Shard Sizing Details
A couple of things to keep in mind about shard sizing:
• Generally speaking, the more (smaller) shards you have, the faster ingest will
be (to a point), but more shards will also slow search times;
• Fewer (larger) shards will have the opposite effect;
• This is very dependent on EPS, number of nodes, and total index size;
• It is highly recommended to set up a Sizing Cluster with Monitoring and
thoroughly test each event source prior to sizing your production clusters.
Your search/ingest requirements will vary and these requirements will directly
impact your shard size and number of shards.
Logstash Architecture
Determining the Logstash architecture (for us) involved a lot of testing for each
log source.
Luckily, our data was already being collected in various ways and sent to
Kafka. Nearly all of it had been processed by ArcSight, so it was already in
CEF (Common Event Format).
Pulling data from Kafka with Logstash is simple. You subscribe to the Kafka
topic (ours are separated by event type) using a Logstash input plugin.
Even with this simplification, much testing was required to determine how many
instances of Logstash were needed for the EPS output from Kafka.
Logstash Architecture Details
• We decided to use two of our lower-disk servers for Logstash instances.
• Logstash does not run under ECE. You can deploy it as a Docker container, but
we are not doing that.
• We do send Logstash metrics and health data to our monitoring cluster, to
help with tuning and debugging.
• Here is one of the configurations for collecting Proxy events:
Tuning Logstash Ingestion
When pulling data from Kafka, the number of Kafka partitions available is
important:
• Logstash can use at most as many consumer threads as there are
partitions available;
• If a proxy topic has 60 available partitions, Logstash can only use 60
consumer threads: any more than that will simply remain idle and unused.
Depending on the EPS and the filters used by Logstash, you may consider
splitting the partitions across several Logstash instances.
Tuning Logstash Applied
For our Proxy cluster, we split the topic among 4 Logstash instances, each running
15 consumer threads.
• Each Logstash instance could use 15 consumer threads, for a total of 60.
• General guidelines for ingestion:
    • Lower EPS (6k/s): fewer Logstash instances with a higher number of
    consumer threads each;
    • Higher EPS (45k/s): more Logstash instances with a lower number of
    consumer threads each.
• It’s also important to note that raising the Logstash JVM heap, up to a
maximum of 30 GB, can improve throughput for each instance.
Using default settings, Logstash instances seem to max out around 2,000 EPS. By
testing different setups, you can improve this drastically.
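The partition-to-thread rule described above can be sketched as a small planning helper. The function names are hypothetical; the only rule encoded is the one stated on the slides, that consumer threads beyond the partition count sit idle.

```python
# Sketch of the partition/consumer-thread rule; names are illustrative.

def split_consumer_threads(partitions: int, instances: int) -> list:
    """Spread one consumer thread per partition as evenly as possible."""
    base, extra = divmod(partitions, instances)
    return [base + 1 if i < extra else base for i in range(instances)]


def idle_threads(partitions: int, instances: int, threads_each: int) -> int:
    """Threads that would sit idle because there are no partitions left."""
    return max(0, instances * threads_each - partitions)


print(split_consumer_threads(60, 4))   # the Proxy topic split from the slide
print(idle_threads(60, 4, 20))         # over-provisioned threads
```

With 60 partitions and 4 instances this reproduces the 15 threads per instance used for the Proxy cluster.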
Logstash Mapping
CEF is pretty good at normalizing data. However, there are some things you can
do with Logstash to further enrich events.
• Concatenating fields, dropping fields, or mapping GeoIP data for IP addresses, for instance:
Logstash all_content and copy_to
Our client also requested that we add all fields to a single field called
“all_content”.
• Quite often, analysts may not know the field(s) in which a certain string
resides.
• This increases ingest workload and storage by quite a bit; however, it can be
quite useful for searching the whole event for strings.
To implement, modify the CEF Logstash mapping template with a number of copy_to parameters:
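The original template is not reproduced here, so the following is only a sketch of what a mapping with copy_to parameters could look like, generated in Python for convenience. The field names are hypothetical CEF-style examples, and the exact template wrapper depends on the Elasticsearch version in use.

```python
import json

# Sketch: build a mapping in which every field is copied into one
# catch-all text field. Field names below are hypothetical examples.

def with_all_content(field_types: dict, target: str = "all_content") -> dict:
    """Add a copy_to pointing at `target` to every field mapping."""
    properties = {
        name: {"type": ftype, "copy_to": target}
        for name, ftype in field_types.items()
    }
    # The catch-all field itself is analyzed text so free-text search works
    properties[target] = {"type": "text"}
    return {"mappings": {"properties": properties}}


template = with_all_content({
    "sourceHostName": "keyword",   # hypothetical field
    "requestUrl": "keyword",       # hypothetical field
    "message": "text",             # hypothetical field
})
print(json.dumps(template, indent=2))
```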
Logstash Indexing
Disable indexing on fields that are not going to be searched, like certain numeric
fields.
This is specified in the Logstash mapping template. It speeds up ingestion and
reduces the storage required.
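A minimal sketch of this idea uses the Elasticsearch mapping parameter "index": false on fields that will never be searched. The helper and the field names are hypothetical; which fields are safe to leave unindexed depends on your own search requirements.

```python
import json

# Sketch: switch off indexing for fields that are never searched.
# Field names below are hypothetical examples.

def disable_indexing(properties: dict, unsearched: set) -> dict:
    """Return a copy of the mapping with "index": false on chosen fields."""
    updated = {}
    for name, spec in properties.items():
        spec = dict(spec)              # copy so the input is left untouched
        if name in unsearched:
            spec["index"] = False      # serialized as false in JSON
        updated[name] = spec
    return updated


properties = {
    "bytesIn": {"type": "long"},
    "bytesOut": {"type": "long"},
    "requestUrl": {"type": "keyword"},
}
tuned = disable_indexing(properties, {"bytesIn", "bytesOut"})
print(json.dumps(tuned, indent=2))
```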
Problem: Ingestion Delays
Early on, while still setting up and tuning ECE clusters, we were frequently
making changes, growing and shrinking nodes, etc.
During this process, we discovered that growing Deployments can
sometimes result in new indices not being properly
spread across cluster instances.
We would frequently end up with a Deployment containing 20+ instances,
where only one or two instances were creating all shards and replicas.
ECE tries to keep all shards equally distributed, so when you create a new
instance, all new shards are created there until its shard count matches that of the
older instances.
This causes some serious problems with ingestion when you have 120k+
EPS.
Ingestion Delay Symptoms
Symptoms:
1. Logs are delayed in becoming available for search. Logs should be
available within 1 minute of ingestion. We were seeing delays of several
hours between ingestion and availability.
2. More than 3 shards are allocated to an instance or a node:
• As indicated in the previous slide, all shards for new indices were
created on (allocated to) the new instances.
Solution:
a) Confirm that the routing allocation is set to “all” so that the
shards are allocated evenly across the available instances.
b) To make maximum use of the available processing capacity, set the CPU
hard limit to “false” under the Data section of the advanced Elastic
configuration.
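One way to spot the second symptom is to count shards per node for a given index. The sketch below assumes rows shaped like the JSON output of the Elasticsearch `_cat/shards?format=json` API (keys `index` and `node`); the index and instance names are made up, and the threshold of 3 comes from the symptom above.

```python
from collections import Counter

# Sketch: flag nodes holding too many shards of one index.
# Input rows mimic `GET _cat/shards?format=json`; sample names are made up.

def shards_per_node(cat_shards: list, index: str) -> Counter:
    """Count assigned shards (primaries and replicas) per node for an index."""
    return Counter(
        row["node"] for row in cat_shards
        if row["index"] == index and row.get("node")  # skip unassigned shards
    )


def overloaded_nodes(cat_shards: list, index: str, max_shards: int = 3) -> dict:
    """Return the nodes holding more than `max_shards` shards of the index."""
    counts = shards_per_node(cat_shards, index)
    return {node: n for node, n in counts.items() if n > max_shards}


sample = (
    [{"index": "proxy-demo", "node": "instance-new"} for _ in range(5)]
    + [{"index": "proxy-demo", "node": "instance-old"}]
    + [{"index": "proxy-demo", "node": None}]
)
print(overloaded_nodes(sample, "proxy-demo"))
```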
Setting cluster.routing.allocation
Make sure cluster.routing.allocation.enable is set to “all”:
GET _cluster/settings?include_defaults=true&filter_path=**.routing.allocation.enable*
If the output shows “enable” : “none”, you can reset it by setting the value to
null.
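As a sketch of the reset step, the helper below inspects a settings response and builds the body for a `PUT _cluster/settings` request that nulls the override. The helper names are hypothetical and the sample response is made up; it assumes the nested (non-flat) response layout and only prepares the JSON body rather than sending it.

```python
import json

# Sketch: build the request body that resets the allocation setting.
# Helper names and the sample response are illustrative.

SETTING = "cluster.routing.allocation.enable"


def explicit_value(scope: dict):
    """Walk the nested settings of one scope; None if the key is not set."""
    node = scope
    for key in SETTING.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def reset_body(settings: dict) -> dict:
    """Null the setting in every scope where it is overridden away from 'all'."""
    body = {}
    for scope in ("persistent", "transient"):
        value = explicit_value(settings.get(scope, {}))
        if value is not None and value != "all":
            body[scope] = {SETTING: None}   # null restores the default
    return body


sample_response = {
    "persistent": {"cluster": {"routing": {"allocation": {"enable": "none"}}}},
    "transient": {},
}
body = reset_body(sample_response)
print(json.dumps(body))
```

An empty body means no override was found and nothing needs to be sent.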
Problem: Boot Loops & ECE Debugging
Symptom:
• If there is a syntax error in a user bundle or configuration file, ECE
does not report the actual error;
• Instead, we only see a “boot loop error detected” message in the ECE
Admin UI as it tries to apply the config changes to the new instances of
Elasticsearch within each Docker container.
Solution:
• Make sure that the deployment strategy is set to “rolling”.
This allowed us to review the log file for the actual syntax errors. If set to any
strategy other than rolling, any new instances created are terminated after
the failure. This deletes the log files, making diagnosis of the root cause very
difficult.
Cross-Cluster Search
• When dealing with large amounts of data, it becomes necessary to have
not only multiple indices, but multiple clusters.
• In ECE, clusters are referred to as Deployments. Each Deployment is a
secure logical silo. This was designed around a multi-tenant architecture,
with each Deployment having its own Kibana instance to access its data. The
idea was to prevent cross-cluster searching across multiple clients.
• In cases like ours, where a single customer has enough data to warrant
multiple Deployments, cross-cluster searching is necessary. Its absence is a
drawback for large customers using ECE.
While ECE does not currently support cross-cluster searching, it is planned.
We’ve been assured it will be included in the next major release, which
should be early February at the latest.
Take-Aways
Know your EPS and plan for it to increase:
• EPS may increase when indexed due to replica shards or field mapping.
Take time to consider your hardware prior to installing ECE so that you
can use similar RAM:Disk ratios:
• Identical hardware will make your job easier in the long run.
Don't forget to set aside hardware for your Logstash architecture:
• It’s tempting and possible to put Logstash on your ECE servers, but for
large deployments the Logstash instances will need large amounts of memory, which will
constrain Elasticsearch resources on that server.
Consider indexing options on fields to reduce the number of writes to
disk:
• Reduce keywords mapped;
• You may not need to index some fields.
Questions & Answers (we hope)

Elasticsearch's aggregations & esctl in action or how i built a cli tool...FaithWestdorp
 
Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Ex...
 Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Ex... Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Ex...
Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Ex...FaithWestdorp
 
Introduction to machine learning using Elastic
Introduction to machine learning using ElasticIntroduction to machine learning using Elastic
Introduction to machine learning using ElasticFaithWestdorp
 
Upgrade your attack model: finding and stopping fileless attacks with MITRE A...
Upgrade your attack model: finding and stopping fileless attacks with MITRE A...Upgrade your attack model: finding and stopping fileless attacks with MITRE A...
Upgrade your attack model: finding and stopping fileless attacks with MITRE A...FaithWestdorp
 
Elastic Observability
Elastic Observability Elastic Observability
Elastic Observability FaithWestdorp
 
Threat hunting with Elastic APM
Threat hunting with Elastic APMThreat hunting with Elastic APM
Threat hunting with Elastic APMFaithWestdorp
 
Guide to Data Visualization in Kibana
Guide to Data Visualization in KibanaGuide to Data Visualization in Kibana
Guide to Data Visualization in KibanaFaithWestdorp
 
Elastic's recommendation on keeping services up and running with real-time vi...
Elastic's recommendation on keeping services up and running with real-time vi...Elastic's recommendation on keeping services up and running with real-time vi...
Elastic's recommendation on keeping services up and running with real-time vi...FaithWestdorp
 
Esctl in action elastic user group presentation aug 25 2020
Esctl in action   elastic user group presentation aug 25 2020Esctl in action   elastic user group presentation aug 25 2020
Esctl in action elastic user group presentation aug 25 2020FaithWestdorp
 

Mehr von FaithWestdorp (18)

Using Elastiknn for exact and approximate nearest neighbor search
Using Elastiknn for exact and approximate nearest neighbor searchUsing Elastiknn for exact and approximate nearest neighbor search
Using Elastiknn for exact and approximate nearest neighbor search
 
Observability from the Home
Observability from the HomeObservability from the Home
Observability from the Home
 
Elasticsearch Goes to Congress
Elasticsearch Goes to CongressElasticsearch Goes to Congress
Elasticsearch Goes to Congress
 
Eliminate your zombie technology ray myers - 11-5-2020
Eliminate your zombie technology   ray myers - 11-5-2020Eliminate your zombie technology   ray myers - 11-5-2020
Eliminate your zombie technology ray myers - 11-5-2020
 
Mejorando las busquedas en nuestras aplicaciones web con elasticsearch
Mejorando las busquedas en nuestras aplicaciones web con elasticsearchMejorando las busquedas en nuestras aplicaciones web con elasticsearch
Mejorando las busquedas en nuestras aplicaciones web con elasticsearch
 
Evolving with Elastic: GetSet Learning
Evolving with Elastic: GetSet LearningEvolving with Elastic: GetSet Learning
Evolving with Elastic: GetSet Learning
 
EmPOW: Integrating Attack Behavior Intelligence into Logstash Plugins
EmPOW: Integrating Attack Behavior Intelligence into Logstash PluginsEmPOW: Integrating Attack Behavior Intelligence into Logstash Plugins
EmPOW: Integrating Attack Behavior Intelligence into Logstash Plugins
 
Examining OpenData with a Search Index using Elasticsearch
Examining OpenData with a Search Index using ElasticsearchExamining OpenData with a Search Index using Elasticsearch
Examining OpenData with a Search Index using Elasticsearch
 
Logstash and Maxmind: not just for GEOIP anymore
Logstash and Maxmind: not just for GEOIP anymoreLogstash and Maxmind: not just for GEOIP anymore
Logstash and Maxmind: not just for GEOIP anymore
 
Elasticsearch's aggregations & esctl in action or how i built a cli tool...
Elasticsearch's aggregations & esctl in action  or how i built a cli tool...Elasticsearch's aggregations & esctl in action  or how i built a cli tool...
Elasticsearch's aggregations & esctl in action or how i built a cli tool...
 
Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Ex...
 Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Ex... Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Ex...
Searching for NLP: Using Elasticsearch to Create MVPs of NLP-enabled User Ex...
 
Introduction to machine learning using Elastic
Introduction to machine learning using ElasticIntroduction to machine learning using Elastic
Introduction to machine learning using Elastic
 
Upgrade your attack model: finding and stopping fileless attacks with MITRE A...
Upgrade your attack model: finding and stopping fileless attacks with MITRE A...Upgrade your attack model: finding and stopping fileless attacks with MITRE A...
Upgrade your attack model: finding and stopping fileless attacks with MITRE A...
 
Elastic Observability
Elastic Observability Elastic Observability
Elastic Observability
 
Threat hunting with Elastic APM
Threat hunting with Elastic APMThreat hunting with Elastic APM
Threat hunting with Elastic APM
 
Guide to Data Visualization in Kibana
Guide to Data Visualization in KibanaGuide to Data Visualization in Kibana
Guide to Data Visualization in Kibana
 
Elastic's recommendation on keeping services up and running with real-time vi...
Elastic's recommendation on keeping services up and running with real-time vi...Elastic's recommendation on keeping services up and running with real-time vi...
Elastic's recommendation on keeping services up and running with real-time vi...
 
Esctl in action elastic user group presentation aug 25 2020
Esctl in action   elastic user group presentation aug 25 2020Esctl in action   elastic user group presentation aug 25 2020
Esctl in action elastic user group presentation aug 25 2020
 

Kürzlich hochgeladen

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 

Kürzlich hochgeladen (20)

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 

Scale Large Log Management Deployment with Elastic

  • 1. FROM THE TRENCHES: SCALING A LARGE LOG MANAGEMENT DEPLOYMENT
      War stories, tips & gotchas – it’s all here.
      Prepared by SEMplicity, Inc. George Boitano, (617) 524-0171, gboitano@semplicityinc.com, www.semplicityinc.com
      © Copyright 2019 SEMplicity, Inc.
  • 2. Connecting the Best of Both Worlds: Who We Are
      – SEMplicity provides Elastic professional services for legacy SIEM modernization.
      – SEMplicity is an official licensed Elastic Managed Services Provider (MSP).
      – SEMplicity is the largest Micro Focus services provider for ArcSight, with 10 years of experience with legacy SIEM and log management.
  • 3. Modern Log Management with Elastic
  • 4. The Challenge
      A retailer needs much faster log search response times:
      • with legacy log storage, searches spanning more than an hour take minutes or hours to return;
      • legacy log storage is already very expensive, and costs keep rising as log volumes increase;
      • legacy log storage technology is frozen, with no roadmap for advanced visualizations, machine learning, etc.
      Enter FastSearch, our Elastic log management deployment:
      • planned users include SOC analysts, incident response and the hunt team;
      • initial use case: fast searching of log records using keywords and free text;
      • follow-on use case: Elastic storage of sensitive compliance logs (PCI, HIPAA) at evidentiary standards;
      • the roadmap includes advanced analyst and executive visualizations, alerting, incident response research dashboards, and unsupervised machine learning anomaly detection.
  • 5. Requirements Metrics
      Metric        Service Level
      Retention     30 days or more
      Volume        Approximately 120K events per second (EPS)
      Storage       More than 31.3 TB per day
      Performance   30-day single-keyword search returns in under 6 seconds

      Log Source        30-Day Storage
      Windows Servers   55 TB
      Firewalls         270 TB
      Web Proxies       110 TB
      Other Sources     About 500 TB
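As a quick sanity check on the requirements, the 30-day storage figures per log source can be summed and divided by the retention period. This is only a sketch: the 500 TB for other sources is the slide's approximate figure, and the variable and function names are illustrative, not part of the original deck. The result lands at roughly 31 TB per day, in the same range as the stated storage requirement.

```python
# 30-day storage per log source in TB, copied from the requirements table.
STORAGE_30_DAY_TB = {
    "Windows Servers": 55,
    "Firewalls": 270,
    "Web Proxies": 110,
    "Other Sources": 500,  # the table says "about 500", so this is approximate
}
RETENTION_DAYS = 30


def total_storage_tb(storage_by_source):
    """Sum the 30-day storage across all log sources."""
    return sum(storage_by_source.values())


def daily_storage_tb(storage_by_source, retention_days):
    """Average storage consumed per day over the retention period."""
    return total_storage_tb(storage_by_source) / retention_days


total_tb = total_storage_tb(STORAGE_30_DAY_TB)
per_day_tb = daily_storage_tb(STORAGE_30_DAY_TB, RETENTION_DAYS)
print(f"30-day total: {total_tb} TB, about {per_day_tb:.1f} TB per day")
```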
  • 6. ECE or Elastic Cloud Enterprise
      Deployed internally on repurposed hardware, Elastic Cloud Enterprise (ECE) offers these benefits:
      • Sensitive or regulated data is stored or available within the internal network.
      • Centralized management of Elasticsearch deployments for:
          • Provisioning
          • Scaling
          • Monitoring
          • Upgrades (minimal to no downtime)
          • Backup and restore
  • 7. Hardware
      Our biggest challenge was designing the ECE (Elastic Cloud Enterprise) installation for our client around the hardware we inherited:
      • 60 or so Red Hat servers with 256 GB of memory and varying storage capabilities;
      • some with small SSD drives, some with only spinning disks, some with both;
      • most servers have between 19 TB and 24 TB of storage available.
      To get the most out of the available resources, we decided on an ECE implementation with a RAM:storage ratio of 1:98, as described later.
  • 8. High Level ECE Design
  • 9. Availability Zones
      Here is the Elastic recommended configuration for 3 availability zones:
  • 10. Availability Zones
      Because of the hardware (disk) available, we consolidated this design a bit. Here is the actual design we used:
  • 11. ECE Clusters
      The first cluster (now called a deployment) we set up was for sizing. It starts as a single-node, single-shard cluster. The disk allocated and the number of nodes and shards change depending on the event source, so we try to keep enough free space available in this sizing cluster for onboarding.
  • 12. Determining Storage Density
      Elastic Cloud Enterprise (ECE) provisions clusters at a ratio of 1 GB of RAM for every 32 GB of storage. Since we have servers with 256 GB of RAM and 24 TB of storage, we arrived at a memory-to-storage ratio of 1:98 for each allocator. The ratio is calculated with this formula:
      • Storage / RAM = Storage Density
      This gets more complex when you think about the instance size you plan to use. ECE allows you to create instances with RAM allocations like this:
      • For large deployments, smaller instances can be problematic. We have seen ingest issues with instances of 16 GB and smaller (remember, we’re dealing with high volumes). An instance with 64 GB of RAM is allocated 16 processors, whereas an instance with 16 GB of RAM is allocated 4 processors.
      • Calculate storage density with instance size in mind: make sure your storage density allows the maximum number of instances to be created.
  • 13. Storage Density Example
      Say you have a server you plan to use as an ECE allocator, with 24 TB of storage and 256 GB of RAM.
      • The storage density would be roughly 1:93.
      • If you decide to use mostly 64 GB nodes (which, annoyingly, ECE calls instances), that would be 5.9 TB of storage per node. That’s roughly 4 nodes per allocator.
      • I’m rounding off numbers here, though. Realistically, it’s 93.75 GB of storage for every 1 GB of RAM, so setting your RAM:disk ratio to 1:93 leaves around 192 GB of storage unused. This is compounded when you take into consideration that only 4 nodes will fit on each allocator.
      • Using 1:93 as your storage density, you will realistically only get 23.8 TB of available storage. This isn’t a huge problem normally, but when you have systems with different RAM-to-disk ratios, it gets difficult to avoid wasted disk space.
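The arithmetic in this example can be reproduced with a short calculation. The sketch below assumes decimal units (24 TB = 24,000 GB) and assumes the configured ratio is the exact density rounded down to a whole number, which is what the 1:93 figure implies; the function names are illustrative only.

```python
import math


def storage_density(storage_gb, ram_gb):
    """Exact GB of storage available per GB of RAM on an allocator."""
    return storage_gb / ram_gb


def allocator_plan(storage_gb, ram_gb, instance_ram_gb):
    """Work out instance count and wasted disk for a whole-number RAM:disk ratio."""
    exact = storage_density(storage_gb, ram_gb)
    ratio = math.floor(exact)  # assumption: the ratio is rounded down to an integer
    usable_gb = ratio * ram_gb
    return {
        "exact_density": exact,
        "configured_ratio": ratio,
        "instances_per_allocator": ram_gb // instance_ram_gb,
        "storage_per_instance_gb": ratio * instance_ram_gb,
        "usable_storage_gb": usable_gb,
        "unused_storage_gb": storage_gb - usable_gb,
    }


plan = allocator_plan(storage_gb=24_000, ram_gb=256, instance_ram_gb=64)
for key, value in plan.items():
    print(f"{key}: {value}")
```

Running the same function for each distinct hardware profile shows how much disk a shared ratio would leave unused on each one.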
  • 14. Shard Sizing (Number of Shards)
      There are different approaches to shard sizing. This is how we arrived at the number of shards and nodes needed to handle the requirement. In addition to setting up a sizing deployment, we also set up a monitoring deployment to view the indexing statistics as well as Logstash performance. With the index mapping template tuned for disk optimization, we started sending proxy events that had been enriched with Logstash and properly parsed into keyword, IP, text and other field types.
      • Determine the daily index size by indexing for a couple of days (during the week):
          • 1300 GB (2600 GB with replica) + 25% for growth = 3250 GB
      • Determine the total index size based on the number of retention days:
          • 3250 GB * 30 days = 97500 GB
      • Determine the number of shards. A good rule of thumb is to shoot for shard sizes of 60 GB or less:
          • 3250 GB / 60 GB = 54 shards
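The three sizing steps can be expressed as a small calculation. This is a sketch using the proxy figures from the slide; the function names and defaults (one replica, 25% growth, 60 GB target shard size) simply mirror those figures. Note that 3250 / 60 is about 54.2, so rounding to the nearest whole number gives the slide's 54 shards, while rounding up to 55 would keep every shard strictly under the 60 GB target.

```python
import math


def daily_index_gb(primary_gb, replicas=1, growth=0.25):
    """Daily index size including replica copies and a growth allowance."""
    return primary_gb * (1 + replicas) * (1 + growth)


def total_index_gb(daily_gb, retention_days):
    """Index size for the full retention period."""
    return daily_gb * retention_days


def shard_count(daily_gb, target_shard_gb=60, strict=False):
    """Shards per daily index; strict=True rounds up so no shard exceeds the target."""
    shards = daily_gb / target_shard_gb
    return math.ceil(shards) if strict else round(shards)


daily = daily_index_gb(1300)
total = total_index_gb(daily, 30)
print(f"daily index: {daily:.0f} GB, 30-day total: {total:.0f} GB")
print(f"shards: {shard_count(daily)} (or {shard_count(daily, strict=True)} rounding up)")
```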
  • 15. Instance or Node Sizing
      Because of the higher EPS for this log source, we decided to go with instances of 64 GB RAM and 6.13 TB of storage. To determine the number of nodes or instances:
      a) Take the total size of the index for the 30-day retention period (97500 GB), which is derived from the daily index size.
      b) Take the total storage of a node (6130 GB).
      c) The number of nodes is 97500 / 6130 = 16 nodes. Since we have 3 zones and nodes are distributed equally across them, we went with 18 nodes total.
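The node count follows from dividing the retained index size by the storage of one node, then rounding up to a multiple of the zone count so each zone holds the same number of nodes. A minimal sketch with the slide's numbers; the function name is illustrative:

```python
import math


def nodes_needed(total_index_gb, node_storage_gb, zones):
    """Return (minimum nodes by storage, nodes after balancing across zones)."""
    minimum = math.ceil(total_index_gb / node_storage_gb)
    per_zone = math.ceil(minimum / zones)
    return minimum, per_zone * zones


minimum, balanced = nodes_needed(total_index_gb=97_500, node_storage_gb=6_130, zones=3)
print(f"minimum nodes: {minimum}, balanced across zones: {balanced}")
```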
  • 16. Shard Sizing Details
      A couple of things to keep in mind about shard sizing:
      • Generally speaking, the more (smaller) shards you have, the faster ingest will be (to a point). More shards will also slow search times;
      • Fewer (larger) shards will have the opposite effect;
      • This is very dependent on EPS, number of nodes, and total index size;
      • It is highly recommended to set up a sizing cluster with monitoring and thoroughly test each event source before sizing your production clusters. Your search and ingest requirements will vary, and they will directly affect your shard size and number of shards.
  • 17. Logstash Architecture
      Determining the Logstash architecture (for us) involved a lot of testing for each log source. Luckily, our data was already being collected in various ways and sent to Kafka. Nearly all of it had been processed by ArcSight, so it was already in CEF (Common Event Format). Pulling data from Kafka with Logstash is simple: you subscribe to the Kafka topic (ours are separated by event type) using a Logstash input plugin. Even with that simplification, much testing was required to determine how many instances of Logstash were needed for the EPS output from Kafka.
  • 18. Logstash Architecture Details
      • We decided to use two of our lower-disk servers for Logstash instances.
      • Logstash does not run under ECE. You can deploy it as a Docker container, but we are not doing that.
      • We do send Logstash metrics and health data to our monitoring cluster, to help with tuning and debugging.
      • Here is one of the configurations for collecting proxy events:
  • 19. Tuning Logstash Ingestion
      When pulling data from Kafka, the number of Kafka partitions available is important:
      • Logstash can only use as many consumer threads as there are partitions available;
      • if a proxy topic has 60 available partitions, Logstash can only use 60 consumer threads: any more than that will simply remain idle and unused.
      Depending on the EPS and the filters used by Logstash, you may consider splitting partitions across several Logstash instances.
  • 20. Tuning Logstash Applied
      For our proxy cluster, we split the topic among 4 Logstash instances, each running 15 consumer threads.
      • Each Logstash instance could use 15 consumer threads, for a total of 60.
      • General guidelines for ingestion:
          • lower EPS (6k/s): fewer Logstash instances with a higher number of consumer threads each;
          • higher EPS (45k/s): more Logstash instances with a lower number of consumer threads each.
      • It’s also important to note that raising the Logstash JVM heap, up to a maximum of 30 GB, can improve throughput for each instance. Using default settings, Logstash instances seem to max out around 2,000 EPS. By testing different setups, you can improve this drastically.
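Because consumer threads beyond the partition count sit idle, the split across instances can be planned so that the thread total equals the partition count. The helper below is a hypothetical planning aid, not something from the deck; each resulting number would be the value given to the `consumer_threads` option of the Logstash Kafka input on that instance.

```python
def consumer_threads_per_instance(partitions, instances):
    """Spread Kafka partitions across Logstash instances as evenly as possible.

    The thread total always equals the partition count, so no thread is idle.
    """
    if partitions < 1 or instances < 1:
        raise ValueError("partitions and instances must both be at least 1")
    base, extra = divmod(partitions, instances)
    # The first `extra` instances take one additional thread each.
    return [base + (1 if i < extra else 0) for i in range(instances)]


plan = consumer_threads_per_instance(partitions=60, instances=4)
print(plan, "total threads:", sum(plan))
```

With 60 partitions and 4 instances this gives 15 threads each, matching the proxy setup described on the slide.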
  • 21. Logstash Mapping
      CEF is pretty good at normalizing data. However, there are some things you can do with Logstash to further enrich events.
      • Concatenating fields, dropping fields, or mapping IP addresses to GeoIP data, for instance:
  • 22. Logstash all_content and copy_to
      Our client also requested that we add all fields to a single field called “all_content”.
      • Quite often, analysts may not know the field(s) in which a certain string resides.
      • This increases ingest workload and storage by quite a bit; however, it can be quite useful for searching the whole event for strings.
      To implement, modify the CEF Logstash mapping template with a number of copy_to parameters:
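To illustrate the idea, the sketch below builds a mapping fragment in which every field carries Elasticsearch's `copy_to` parameter pointing at `all_content`. The field names (`sourceAddress`, `requestUrl`, `message`) are hypothetical stand-ins for CEF fields, and the typeless `mappings.properties` layout assumes a recent Elasticsearch version; the client's real template is not shown in this transcript.

```python
import json


def add_copy_to(properties, target="all_content"):
    """Return new field mappings where every field copies its value into `target`."""
    result = {name: {**spec, "copy_to": target} for name, spec in properties.items()}
    result[target] = {"type": "text"}  # the catch-all field itself is full-text searchable
    return result


# Hypothetical CEF-style fields, for illustration only.
fields = {
    "sourceAddress": {"type": "ip"},
    "requestUrl": {"type": "keyword"},
    "message": {"type": "text"},
}

template_fragment = {"mappings": {"properties": add_copy_to(fields)}}
print(json.dumps(template_fragment, indent=2))
```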
  • 23. Logstash Indexing
      Disable indexing on fields that are not going to be searched, such as certain numeric fields. This is specified in the Logstash mapping template. It speeds up ingestion and reduces the storage required.
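In an Elasticsearch mapping, indexing is switched off per field with `"index": false`. A minimal sketch, assuming hypothetical numeric fields `bytesIn` and `bytesOut` that analysts never search on; the actual fields chosen for this deployment are not listed in the transcript.

```python
import json


def disable_indexing(properties, field_names):
    """Return new field mappings with indexing turned off for the named fields."""
    result = {name: dict(spec) for name, spec in properties.items()}
    for name in field_names:
        result[name]["index"] = False  # serialized as "index": false in the template
    return result


# Hypothetical fields, for illustration only.
fields = {
    "bytesIn": {"type": "long"},
    "bytesOut": {"type": "long"},
    "requestUrl": {"type": "keyword"},
}

tuned = disable_indexing(fields, ["bytesIn", "bytesOut"])
print(json.dumps({"mappings": {"properties": tuned}}, indent=2))
```

A field mapped this way is still returned with the document but cannot be queried, so the list of fields should be agreed with the analysts first.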
  • 24. Problem: Ingestion Delays
      Early on, while still setting up and tuning ECE clusters, we were frequently making changes, growing and shrinking nodes, etc. During this process, we discovered that growing deployments can sometimes result in new indices not being properly spread across cluster instances. We would frequently end up with a deployment containing 20+ instances where only one or two instances were creating all shards and replicas. ECE tries to keep all shards equally distributed, so when you create a new instance, all new shards are created there until its shard count matches that of the older instances. This causes serious problems with ingestion when you have 120k+ EPS.
  • 25. Ingestion Delay Symptoms
      Symptoms:
      1. Logs are delayed in becoming available for search. Logs should be available within 1 minute of ingestion; we were seeing delays of several hours between ingestion and availability.
      2. More than 3 shards are allocated to an instance or node:
          • as described on the previous slide, all shards for new indices were created on (allocated to) the new instances.
      Solution:
      a) Confirm that routing allocation is set to “all” so that shards are allocated evenly across the available instances.
      b) To make maximum use of the available processing capacity, set the CPU hard limit to “false” under the Data section of the advanced Elastic configuration.
  • 26. Setting cluster.routing.allocation
      Make sure cluster.routing.allocation.enable is set to “all”:
      GET _cluster/settings?include_defaults=true&filter_path=**.routing.allocation.enable*
      If the output shows “enable” : “none”, you can reset it by setting the value to null.
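Resetting the value means sending null for the setting through the cluster update settings API, which removes the explicit override so the default applies again. The sketch below only builds and prints the request body. Whether the override lives in the persistent or the transient scope is not stated on the slide, so clearing both is an assumption here.

```python
import json

SETTING = "cluster.routing.allocation.enable"


def reset_allocation_body(setting=SETTING):
    """Build a cluster settings body that clears the setting in both scopes.

    A null value removes the explicit override, so the default ("all") applies.
    """
    return {"persistent": {setting: None}, "transient": {setting: None}}


body = json.dumps(reset_allocation_body(), indent=2)
print("PUT _cluster/settings")
print(body)
```

After applying it, re-running the GET request from the slide should show the value as “all” under the defaults.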
  • 27. Problem: Boot Loops & ECE Debugging
      Symptom:
      • If there is a syntax error in a user bundle or configuration file, ECE does not report the actual error;
      • instead, we only see a “boot loop error detected” message in the ECE Admin UI as it tries to apply the config changes to the new instances of Elasticsearch within each Docker container.
      Solution:
      • Make sure that the deployment strategy is set to “rolling”. This let us review the log file for the actual syntax errors. With any strategy other than rolling, new instances are terminated after the failure, which deletes the log files and makes diagnosing the root cause very difficult.
  • 28. Cross-Cluster Search
      • When dealing with large amounts of data, it becomes necessary to have not only multiple indices, but multiple clusters.
      • In ECE, clusters are referred to as deployments. Each deployment is a secure logical silo. This was designed around a multi-tenant architecture, with each tenant having its own Kibana instance to access the data in its cluster. The idea was to prevent cross-cluster searching across multiple clients.
      • In cases like ours, where a single customer has enough data to warrant multiple deployments, cross-cluster searching is necessary. This is a drawback for large customers using ECE. While ECE does not currently support cross-cluster searching, it is planned. We’ve been assured it will be included in the next major release, which should be early February at the latest.
  • 29. Take-Aways
      Know your EPS and plan for it to increase:
      • EPS may increase when indexed, due to replica shards or field mapping.
      Take time to consider your hardware before installing ECE, so that you can use similar RAM:disk ratios:
      • identical hardware will make your job easier in the long run.
      Don’t forget to set aside hardware for your Logstash architecture:
      • it’s tempting and possible to put Logstash on your ECE servers, but for large deployments the Logstash instances will need large amounts of memory, which will constrain Elasticsearch resources on that server.
      Consider indexing options on fields to reduce the number of writes to disk:
      • reduce the number of keyword fields mapped;
      • you may not need to index some fields.
  • 30. Questions & Answers (we hope)
      George Boitano, (617) 524-0171, gboitano@semplicityinc.com, www.semplicityinc.com