Tuning Elasticsearch Indexing Pipeline for Logs
Radu Gheorghe
Rafał Kuć
Who are we?
Radu Rafał
Logsene
The next hour
Logs
The tools
Logsene
2.0 SNAPSHOT, 8.9.0, 1.5 RC2
Let the games begin
Logstash
Multiple inputs
Lots of filters
Several outputs
Lots of plugins
How Logstash works
input (thread per input): file, tcp, redis, ...
filter (multiple workers): grok, geoip, ...
output (multiple workers): elasticsearch, solr, ...
Scaling Logstash
Logstash basic
input {
  syslog {
    port => 13514
  }
}
output {
  elasticsearch {
    protocol => "http"
    manage_template => false
    index => "test-index"
    index_type => "test-type"
  }
}
Logstash basic
4K events per second
~130% CPU utilization
299MB RAM used
Logstash with mutate
output {
  elasticsearch {
    protocol => "http"
    manage_template => false
    index => "test-index"
    index_type => "test-type"
    flush_size => 1000
    workers => 5
  }
}
filter {
  mutate {
    remove_field => [ "severity", "facility", "priority", "@version", "timestamp", "host" ]
  }
}
3 filter threads: -w 3
Logstash with mutate
5K events per second
~250% CPU utilization
289MB RAM used
Logstash with grok and tcp
filter {
  grok {
    match => [ "message", "<%{NUMBER:priority}>%{SYSLOGTIMESTAMP:date} %{DATA:hostname} %{DATA:tag} %{DATA:what}:%{DATA:number}:" ]
  }
  mutate {
    remove_field => [ "message", "@version", "@timestamp", "host" ]
  }
}
input {
  tcp {
    port => 13514
  }
}
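To make the grok pattern above concrete, here is a minimal Python sketch of roughly equivalent field extraction. The regular expression only approximates the grok patterns (NUMBER as digits, SYSLOGTIMESTAMP as "Mon d HH:MM:SS", DATA as a non-greedy match), and the sample line is made up for illustration; neither comes from the original tests.

```python
import re

# Rough regex equivalents of the grok patterns used above (simplified assumptions):
#   NUMBER -> digits, SYSLOGTIMESTAMP -> "Mon d HH:MM:SS", DATA -> non-greedy ".*?"
SYSLOG_LINE = re.compile(
    r"<(?P<priority>\d+)>"
    r"(?P<date>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>.*?) (?P<tag>.*?) (?P<what>.*?):(?P<number>.*?):"
)


def parse_line(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = SYSLOG_LINE.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line, made up for illustration.
sample = "<13>Apr 13 10:20:30 web01 app[123] request:42: done"
print(parse_line(sample))
```

Each named group plays the role of one grok capture, so the resulting dict has the same field names the Logstash filter would add to the event.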
Logstash with grok and tcp
8K events per second
~310% CPU utilization
327MB RAM used
Logstash with JSON lines
input {
  tcp {
    port => 13514
    codec => "json_lines"
  }
}
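For reference, the json_lines codec expects one JSON object per line. The sketch below shows how a sender might serialize events into that shape; the function name and the sample events are hypothetical, and actually shipping the payload (for example over a TCP socket to port 13514) is left out.

```python
import json


def to_json_lines(events):
    """Serialize each event as one compact JSON object per line, newline-terminated."""
    return "".join(json.dumps(event, separators=(",", ":")) + "\n" for event in events)


# Hypothetical events, made up for illustration.
events = [
    {"hostname": "web01", "what": "request", "number": 42},
    {"hostname": "web02", "what": "response", "number": 43},
]
payload = to_json_lines(events)
print(payload, end="")
```

Because the fields arrive already structured, the receiving side can skip grok parsing entirely.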
Logstash with JSON lines
8K events per second
~260% CPU utilization
322MB RAM used
Rsyslog
Very fast
Very light
How rsyslog works
im*: imfile, imtcp, imjournal, ...
mm*: mmnormalize, mmjsonparse, ...
om*: omelasticsearch, omredis, ...
Using rsyslog
Rsyslog basic
module(load="impstats"
       interval="10"
       resetCounters="on"
       log.file="/tmp/stats")
module(load="imtcp")
module(load="omelasticsearch")

# build a JSON document from syslog properties; quotes inside values are escaped
template(name="plain-syslog" type="list") {
  constant(value="{")
  constant(value="\"@timestamp\":\"")   property(name="timereported" dateFormat="rfc3339")
  constant(value="\",\"host\":\"")      property(name="hostname")
  constant(value="\",\"severity\":\"")  property(name="syslogseverity-text")
  constant(value="\",\"facility\":\"")  property(name="syslogfacility-text")
  constant(value="\",\"syslogtag\":\"") property(name="syslogtag" format="json")
  constant(value="\",\"message\":\"")   property(name="msg" format="json")
  constant(value="\"}")
}

input(type="imtcp" port="13514")

action(type="omelasticsearch"
       template="plain-syslog"
       searchIndex="test-index"
       searchType="test-type"
       bulkmode="on"
       action.resumeretrycount="-1"
)
*http://blog.sematext.com/2015/04/13/monitoring-rsyslogs-performance-with-imstats-and-elasticsearch
Rsyslog basic
6K events per second
~20% CPU utilization
50MB RAM used
Rsyslog queue and workers
main_queue(
  queue.size="100000"            # capacity of the main queue
  queue.dequeuebatchsize="5000"  # process messages in batches of 5K
  queue.workerthreads="4"        # 4 threads for the main queue
)
action(name="send-to-es"
       type="omelasticsearch"
       template="plain-syslog"        # use the template defined earlier
       searchIndex="test-index"
       searchType="test-type"
       bulkmode="on"                  # use bulk API
       action.resumeretrycount="-1"   # retry indefinitely if ES is unreachable
)
Rsyslog queue and workers
25K events per second
~100% CPU utilization (1 core)
75MB RAM used (queue dependent)
Rsyslog + mmnormalize
module(load="mmnormalize")
action(type="mmnormalize"
       ruleBase="/opt/rsyslog_rulebase.rb"
       useRawMsg="on"
)
template(name="lumberjack" type="list") {
  property(name="$!all-json")
}
$ cat /opt/rsyslog_rulebase.rb
rule=:<%priority:number%>%date:date-rfc3164% %host:word% %syslogtag:word% %what:char-to:x3a%:%number:char-to:x3a%:
Rsyslog + mmnormalize
16K events per second
~200% CPU utilization
100MB RAM used
(queue dependent)
Rsyslog with JSON parsing
module(load="mmjsonparse")
action(type="mmjsonparse")
Rsyslog with JSON parsing
20K events per second
~130% CPU utilization
70MB RAM used (queue dependent)
Disk-assisted queues
main_queue(
  queue.filename="main_queue"    # write to disk if needed
  queue.maxdiskspace="5g"        # when to stop writing to disk
  queue.highwatermark="200000"   # start spilling to disk at this size
  queue.lowwatermark="100000"    # stop spilling when it gets back to this size
  queue.saveonshutdown="on"      # write queue contents to disk on shutdown
  queue.dequeueBatchSize="5000"
  queue.workerthreads="4"
  queue.size="10000000"          # absolute max queue size
)
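The interplay of the two watermarks can be illustrated with a small simulation. This is a simplified model based only on the comments in the config above, not rsyslog's actual implementation: spilling starts when the queue reaches the high watermark and stops only once it drains back to the low watermark. The function name and the sample queue sizes are made up.

```python
def spill_states(queue_sizes, high=200_000, low=100_000):
    """For each observed queue size, report whether the queue is spilling to disk.

    Simplified model of the watermarks above: spilling starts once the size
    reaches `high` and stops only after it drops back to `low`.
    """
    spilling = False
    states = []
    for size in queue_sizes:
        if not spilling and size >= high:
            spilling = True
        elif spilling and size <= low:
            spilling = False
        states.append(spilling)
    return states


# Hypothetical sequence of queue sizes over time.
sizes = [50_000, 150_000, 200_000, 150_000, 100_000, 150_000]
print(spill_states(sizes))  # [False, False, True, True, False, False]
```

The gap between the two watermarks keeps the queue from flapping between memory and disk when its size hovers around a single threshold.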
Elasticsearch
How Elasticsearch works
JSON bulk, single doc
primary: analysis, inverted index, transaction log
replicate
replica: analysis, inverted index, transaction log
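As an illustration of the "JSON bulk" input, here is a minimal sketch of how a bulk request body is laid out: an action line followed by the document source, one JSON object per line. The index and type names are borrowed from the test configs above; the `_type` entry matches Elasticsearch versions of that era, and the documents are made up.

```python
import json


def bulk_body(docs, index="test-index", doc_type="test-type"):
    """Build a bulk request body: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    # The bulk API expects newline-delimited JSON with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical documents, made up for illustration.
body = bulk_body([{"message": "first"}, {"message": "second"}])
print(body, end="")
```

Sending many documents in one such request amortizes the per-request overhead, which is why both shippers above batch their output.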
ES horizontal scaling
1 node, 1 shard
2 nodes, 1 shard each
3 nodes, 1 shard each
3 nodes, 4 shards each
3 nodes, 4 shards and 4 replicas each
Elasticsearch for tools tests
Nothing is indexed
Nothing is stored
_source disabled
_all disabled
No JVM tuning
refresh: -1
translog: 30m sync (size: 2g, interval: 30m)
Tuning Elasticsearch
refresh_interval: 5s*
doc_values: true
store.throttle.max_bytes_per_sec: 200mb
*http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/
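A sketch of how the first two settings might look in a create-index request body, assuming an Elasticsearch 1.x-style mapping. The field name `clientip` is hypothetical, the type name is borrowed from the test configs above, and the store throttle setting is left out because where it is applied (index or cluster level) depends on the version.

```python
import json


def index_body(refresh_interval="5s", keyword_fields=("clientip",)):
    """Build a create-index body with a refresh interval and doc_values-backed fields."""
    properties = {
        # "clientip" is a hypothetical field name; not_analyzed string fields
        # were the usual candidates for doc_values in Elasticsearch 1.x.
        name: {"type": "string", "index": "not_analyzed", "doc_values": True}
        for name in keyword_fields
    }
    return {
        "settings": {"index": {"refresh_interval": refresh_interval}},
        "mappings": {"test-type": {"properties": properties}},
    }


print(json.dumps(index_body(), indent=2))
```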
Tests: hardware and data
2 x EC2 c3.large instances (2 vCPU, 3.75GB RAM, 2x16GB SSD in RAID0)
vs
Logs
Apache logs
Test requests
Filters: filter by client IP, filter by word in user agent, wildcard filter on domain
Aggregations: date histogram, top 10 response codes, # of unique IPs, top IPs per response per time
Test runs
1. Write throughput
2. Capacity of a single index
3. Capacity with time-based indices on hot/cold setup
Write throughput (one index)
Capacity of one index (3200 EPS)
20 seconds @ 40 - 50M
Capacity of one index (400 EPS)
15 seconds @ 40 - 50M
Time-based indices: ideal shard size
smaller indices: lighter indexing, easier to isolate hot data from cold data, easier to relocate
bigger indices: less RAM, less management overhead, smaller cluster state
without indexing, equal latency when dividing 32M data into 1/2/4/8/16/32M indices
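With time-based indices, each event is routed to an index named after its timestamp. A minimal sketch of daily index naming follows; the `logs` prefix and the dotted date format are assumptions for illustration, not taken from the tests.

```python
from datetime import datetime, timezone


def daily_index(timestamp, prefix="logs"):
    """Return the name of the daily index an event with this timestamp belongs to."""
    return f"{prefix}-{timestamp.strftime('%Y.%m.%d')}"


event_time = datetime(2015, 4, 13, 10, 20, 30, tzinfo=timezone.utc)
print(daily_index(event_time))  # logs-2015.04.13
```

Choosing a shorter or longer time window in the name is one way to move toward the smaller or bigger indices compared above.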
Time-based. 2 hot and 2 cold nodes
Before: 3200 After: 4800
Time-based. 2 hot and 2 cold nodes
Before: 15s After: 5s
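One way to implement a hot/cold policy is Elasticsearch's shard allocation filtering: nodes carry a custom attribute, and each index is pinned to a tier through an `index.routing.allocation.require.*` setting. The sketch below only computes that setting from an index's age; the attribute name `tag`, the one-day hot window, and the function name are assumptions, not details from the talk.

```python
from datetime import date


def allocation_settings(index_day, today, hot_days=1, attribute="tag"):
    """Return index settings that pin a daily index to hot or cold nodes by age."""
    tier = "hot" if (today - index_day).days < hot_days else "cold"
    return {f"index.routing.allocation.require.{attribute}": tier}


today = date(2015, 4, 14)
print(allocation_settings(date(2015, 4, 14), today))  # today's index stays hot
print(allocation_settings(date(2015, 4, 13), today))  # yesterday's index moves to cold
```

Applying the returned settings to an older index makes Elasticsearch relocate its shards to the cold nodes, leaving the hot nodes free for indexing.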
That's all folks!
What to remember?
log in JSON
parallelize when possible
use time-based indices
use hot / cold nodes policy
We are hiring
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig Logging?
Dig working with and in open source?
We’re hiring worldwide!
http://sematext.com/about/jobs.html
Thank you!
Radu Gheorghe
@radu0gheorghe
radu.gheorghe@sematext.com
Rafał Kuć
@kucrafal
rafal.kuc@sematext.com
Sematext
@sematext
http://sematext.com

20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...Akihiro Suda
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationBradBedford3
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 

Kürzlich hochgeladen (20)

SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
Taming Distributed Systems: Key Insights from Wix's Large-Scale Experience - ...
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdfInnovate and Collaborate- Harnessing the Power of Open Source Software.pdf
Innovate and Collaborate- Harnessing the Power of Open Source Software.pdf
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 

Editor's notes

  1. Rafał starts and passes mic to Radu
  2. Rafał slide – describe the talk briefly! Ask how many people in the audience have used the tools.
  3. Radu slide: we did some tests; we’ll share configs and benchmarks. Here are the versions. Logstash 1.5: the final version will be out soon. Rsyslog 8.9: the current stable (note: most distros ship 5.x or 7.x). ES is a search engine based on Apache Lucene; the current version is 1.5, and the next major is 2.0, with lots of changes, many of them related to Lucene 5.0. These are not the only tools for logging: there are many others, both open source and commercial, that can receive logs, parse them, buffer them and index them.
  4. Rafał slide
  5. Rafał slide. Ask how many people know about Logstash.
  6. Rafał slide
  7. Rafał slide
  8. Radu: Assume we want to centralize syslog. Forward syslog via TCP/UDP on a port to Logstash. On the Logstash side, you can use the TCP input to listen on that port and parse syslog messages. You’d use the ES output to forward to ES; it can use the Java binary protocol, but HTTP is better. Logstash comes with a template for the ES index, but for perf tests we’ll use our own. Specify where the data goes (index and type, like a DB and a table).
  9. Radu - 1.3 CPUs
  10. Radu – segue to tuning, pass the mic
  11. Rafał: flush_size is 1000, lowered from the default of 5000.
  12. Rafał
  13. Rafał
  14. Rafał: The syslog input is just TCP + grok. We changed that, and we are not parsing the syslog format exactly: we wanted to parse additional things and to show how to parse unstructured data.
  15. The bottleneck was: hardware (high CPU usage); the JSON lines codec is not parallelized, while grok is. If you want to do your homework, you can do another run with the JSON filter instead of the codec, which makes parallelization possible.
  16. Radu: Many people hate it, maybe because of the docs. I like it because it’s light and fast and has surprisingly rich functionality.
  17. Like Logstash, it’s modular: you can use inputs to get data in, message modifiers to parse data and outputs to pass it on. The flow of data is a bit different. Inputs may have multiple threads, and they write to a main queue. On the main queue, worker threads can do filtering, format messages using templates (will talk about those later) and run actions (parsing/output). You can have action queues as well, with their own threads, which makes them asynchronous. You can have rulesets, which let you separate input-parse-output flows (e.g. one ruleset for local logs, one for remote logs).
  18. Typical setup is to have it on each server, push to ES directly, buffer if necessary
  19. Load modules: impstats is for monitoring, then TCP and ES. Start the TCP listener. Template: what the JSON that we send to ES will look like. Action: send to ES using the template, specify index/type, use bulks, retry on failure.
  20. Bigger memory buffer. Increase bulk size. Moar worker threads.
  21. Not using more CPU because ES is using the rest; Rafał will talk about that in a bit. RAM usage has increased because of the queue size.
  22. Clear win
  23. But it’s not really apples to apples, because rsyslog has dedicated syslog parsers. Still, it’s not only for syslog: it can parse unstructured data via mmnormalize. You refer to a rulebase, which looks much like grok patterns, with two differences. First, patterns like number or date normally aren’t regexes but specific parsers: faster but less flexible. (The one above is equivalent to the Logstash grok seen earlier.) Second, it builds a parse tree on startup, which helps with speed if you have many rules.
  24. Radu
  25. More throughput with less CPU usage
  26. Before moving on, one more thing: in production you probably want to use disk-assisted (DA) queues instead of in-memory queues like the ones we had here. A DA queue is an in-memory queue that can spill to disk. Enable it by specifying a file name, and give it a threshold. Spilling is smart: the queue is normally in memory; when it reaches the high watermark it starts writing to disk, but it does so in batches, and it goes back to memory once it drops to the low watermark. Side benefit: it can save and reload the memory queue contents when restarting rsyslog.
  27. Rafał
  28. Rafał: Index a document. It goes to ES: first to the transaction log, next to the inverted index. It is replicated at the transaction log level.
  29. Rafał
  30. Rafał
  31. Rafał
  32. Rafał
  33. Rafał
  34. Rafał
  35. Rafał: Throttling: the default is 20 MB/s and we are using 200 MB/s, so we are actually going for 10 times more (we are using SSD drives here).
  36. Rafał
  37. Rafał: Cheaper filters and aggregations are at the top; the more expensive ones are at the bottom.
  38. Radu: Index as fast as we can. How much data can we put in a single index at a decent indexing rate before searches take too long? A good practice is to have time-based indices (e.g. keep logs for a week, have one index per day). We want to benchmark that, plus separating indexing load from search load by putting today’s index on different nodes than the "old" ones.
  39. Rafał: The rate slowly goes down, because merges happen and because the index is slowly getting bigger.
  40. Rafał: 40-50 m @ 20 seconds. The most expensive query takes 20 seconds on average. Filters (the quick ones) take less than a second. Some aggs take up to 5 seconds on average.
  41. Rafał: Spikes are because of merges; the big spike is because a merge happens, and after the merge the queries are actually faster. The most expensive queries take 15 seconds.
  42. Radu: We want to benchmark time-based (TB) indices, because: indexing is better because of merging; searching recent data is better because the index is smaller; deleting entire indices is better. But what granularity? There are use cases for small (high indexing rate, short retention, CPU constraint) vs. big (low indexing rate, long retention, memory constraint). Granularity doesn’t affect cold search performance.
  43. Rafał: Talk about the hot and cold setup. The drop is because the cold nodes were full.
  44. Rafał
  45. Radu
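
The "homework" from note 15 could look roughly like the sketch below: the tcp input reads plain lines, and a json filter does the parsing in the filter stage, where the filter workers can run in parallel. This is a sketch under assumptions, not the configuration used in the tests: the port and the -w 3 setting are carried over from the earlier slides, the output section is assumed to stay unchanged, and no numbers were measured for this variant.

```
input {
  tcp {
    port => 13514
    # Plain "line" codec instead of "json_lines": each line arrives
    # unparsed in the "message" field.
    codec => "line"
  }
}

filter {
  # Parse the JSON text in the filter stage, which runs on the filter
  # worker threads (for example, start Logstash with -w 3).
  json {
    source => "message"
  }
}
```

The parsed fields are placed at the root of the event, so a "message" key in the incoming JSON replaces the raw line. Whether this beats the codec depends on how many cores are free for the filter workers.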