27. Rsyslog queue and workers
main_queue(
  queue.size="100000"               # capacity of the main queue
  queue.dequeueBatchSize="5000"     # process messages in batches of 5K
  queue.workerThreads="4"           # 4 worker threads for the main queue
)
action(name="send-to-es"
  type="omelasticsearch"
  template="plain-syslog"           # use the template defined earlier
  searchIndex="test-index"
  searchType="test-type"
  bulkmode="on"                     # use the bulk API
  action.resumeretrycount="-1"      # retry indefinitely if ES is unreachable
)
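The plain-syslog template referenced above was defined earlier in the talk and is not on this slide; a minimal sketch of what such a JSON template typically looks like (the exact field set here is our assumption):
template(name="plain-syslog" type="list") {
  constant(value="{")
  constant(value="\"timestamp\":\"")   property(name="timereported" dateFormat="rfc3339")
  constant(value="\",\"host\":\"")     property(name="hostname")
  constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
  constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
  constant(value="\",\"tag\":\"")      property(name="syslogtag" format="json")
  constant(value="\",\"message\":\"")  property(name="msg" format="json")
  constant(value="\"}")                # each message becomes one JSON document for ES
}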
28. Rsyslog queue and workers
25K events per second
~100% CPU utilization (1 core)
75MB RAM used (queue dependent)
36. Disk-assisted queues
main_queue(
  queue.filename="main_queue"       # write to disk if needed
  queue.maxdiskspace="5g"           # when to stop writing to disk
  queue.highwatermark="200000"      # start spilling to disk at this size
  queue.lowwatermark="100000"       # stop spilling when it gets back to this size
  queue.saveonshutdown="on"         # write queue contents to disk on shutdown
  queue.dequeueBatchSize="5000"
  queue.workerThreads="4"
  queue.size="10000000"             # absolute max queue size
)
38. How Elasticsearch works
[diagram] a document is sent to Elasticsearch as JSON (bulk or single doc); the primary shard writes it to the transaction log, then analyzes it into the inverted index; the change is replicated to the replica shard, which does the same
46. Tests: hardware and data
2 x EC2 c3.large instances (2 vCPU, 3.5GB RAM, 2x16GB SSD in RAID0)
vs
Apache logs
47. Test requests
Filters:
- filter by client IP
- filter by word in user agent
- wildcard filter on domain
Aggregations:
- date histogram
- top 10 response codes
- # of unique IPs
- top IPs per response, per time
48. Test runs
1. Write throughput
2. Capacity of a single index
3. Capacity with time-based indices on a hot/cold setup
52. Time-based indices: ideal shard size
smaller indices:
- lighter indexing
- easier to isolate hot data from cold data
- easier to relocate
bigger indices:
- less RAM
- less management overhead
- smaller cluster state
without indexing, query latency was equal when dividing 32M of data into 1/2/4/8/16/32M indices
56. What to remember?
log in JSON
parallelize when possible
use time-based indices
use a hot/cold nodes policy
57. We are hiring
Dig Search?
Dig Analytics?
Dig Big Data?
Dig Performance?
Dig Logging?
Dig working with and in open source?
We’re hiring worldwide!
http://sematext.com/about/jobs.html
Rafal slide – describe the talk briefly
!!! Ask people how many of the audience used the tools
Radu slide
we did some tests, we’ll share configs and benchmarks – here are the versions
Logstash 1.5 – the final version will be up soon
Rsyslog 8.9 – the current stable (note: most distros come with 5.x or 7.x)
ES is a search engine based on Apache Lucene
Current version is 1.5; the next major is 2.0, with lots of changes, many of them related to Lucene 5.0
These are not the only tools for logging – there are many other tools, both open source and commercial, that can receive logs, parse them, buffer them and index them
Rafal slide
Rafal slide
* Ask how many people know about Logstash
Rafal slide
Rafal slide
Radu
Assume we want to centralize syslog
Forward syslog via TCP/UDP on a port to Logstash
On the Logstash side, you can use the TCP input to listen to that port and parse syslog messages
You’d use the ES output to forward to ES
you can use the binary Java protocol, but HTTP is better
Logstash comes with an index template for ES, but for the perf tests we’ll use our own
Specify where to index (index and type – like a DB and a table)
Radu
- 1.3 CPUs
Radu – segue to tuning, pass the mic
Rafal
Flush size – 1000, lowered from the default of 5000
Rafal
Rafal
Rafal
Syslog is just TCP + Grok
We changed that and we are not parsing the syslog format exactly – we wanted to parse additional things and to show how to parse unstructured data
The bottleneck was:
- hardware (high CPU usage)
- the JSON lines codec is not parallelized, while grok is
- but if you want to do your homework, you can do another run with the JSON filter instead of the codec, which gives you the possibility of parallelization
Radu
Many people hate it, maybe because of docs
I like it because it’s light and fast and has surprisingly rich functionality
Like Logstash, it’s modular, you can use inputs to get data in, message modifiers to parse data and outputs to pass it on
The flow of data is a bit different
Inputs may have multiple threads, and they write to a main queue
On the main queue, worker threads can do filtering, format messages using templates (will talk later) and run actions (parsing/output)
You can have action queues as well, with their own threads => async
You can have rulesets, which let you separate flows of input – parse – output (e.g. one ruleset for local logs, one for remote logs) – a sketch follows below
Typical setup is to have it on each server, push to ES directly, buffer if necessary
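A minimal sketch of such a split into rulesets, with a dedicated action queue (ports, names and thread counts are ours, not from the slides):
module(load="imtcp")                                # TCP input module
input(type="imtcp" port="514"   ruleset="local")    # local syslog goes to one flow
input(type="imtcp" port="20514" ruleset="remote")   # remote logs get their own flow
ruleset(name="local") {
  action(type="omfile" file="/var/log/local.log")   # just write local logs to a file
}
ruleset(name="remote") {
  action(type="omelasticsearch"
    template="plain-syslog"
    bulkmode="on"
    queue.type="linkedList"                         # dedicated action queue => async
    queue.workerThreads="2")
}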
Load modules
Impstats is for monitoring, then tcp and ES – a sketch of the impstats load is below
Start the tcp listener
Template – what the JSON that we send to ES will look like
Action – send to ES, using the template, specify index/type, use bulks, retry on failure
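The impstats part, as a sketch (the interval and the output file are our choices, not necessarily the values used in the tests):
module(load="impstats"
  interval="10"                             # emit rsyslog's internal counters every 10 seconds
  resetCounters="on"                        # report per-interval deltas rather than totals
  log.file="/var/log/rsyslog-stats.log")    # keep the stats out of the normal log stream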
Not using more because ES is using the rest – Rafal will talk about that in a bit
RAM has increased because of the queue size
Clear win
But it’s not really apples to apples, because rsyslog has dedicated syslog parsers
Still, not only for syslog, can parse unstructured data via mmnormalize
Refer to a rulebase, which looks much like grok patterns, with two differences:
- normally, patterns like number or date aren’t regexes but specific parsers – faster but less flexible; the rulebase shown here is equivalent to the Logstash grok seen earlier
- it builds a parse tree on startup, which helps with speed if you have many rules
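A minimal sketch of wiring mmnormalize in (the rulebase path and the sample rule are hypothetical):
module(load="mmnormalize")                  # the normalization module
action(type="mmnormalize"
  rulebase="/etc/rsyslog.d/apache.rb"       # liblognorm rulebase holding the grok-like rules
  useRawMsg="on")                           # run the parsers on the raw message
# a rulebase line looks roughly like:
# rule=:%clientip:ipv4% %ident:word% %auth:word% [%timestamp:char-to:]%] ...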
Radu
More throughput with less CPU usage
Before moving on, one more thing: in production you probably want to use disk-assisted queues instead of in-memory queues like the ones we had here
A DA queue is an in-memory queue that can spill to disk – specify that via a file name and give it thresholds
Spilling is smart:
- normally everything stays in memory
- when the queue reaches the high watermark it starts writing to disk, in batches, and goes back to memory-only once it drops to the low watermark
Side-benefit: can save and reload memory queue contents when restarting rsyslog
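The same idea applies to action queues; a sketch of a disk-assisted one (the sizes are examples, not our test values):
action(type="omelasticsearch"
  template="plain-syslog"
  bulkmode="on"
  queue.type="linkedList"                   # in-memory queue...
  queue.filename="es_action_q"              # ...made disk-assisted by giving it a file name
  queue.highwatermark="200000"              # start spilling to disk at this size
  queue.lowwatermark="100000"               # back to memory-only below this
  queue.maxdiskspace="5g"
  queue.saveonshutdown="on")                # persist queue contents across restarts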
Rafal
Rafał
Index a document
It goes to ES
first to the transaction log
next to the inverted index
It is replicated at the transaction log level
Rafał
Rafał
Rafał
Rafał
Rafał
Rafał
Rafał
Throttling – the default is 20MB/s, we are using 200MB/s, so we are actually going for 10 times more (we are using SSD drives here)
Rafał
Rafał
Cheaper filters and aggregations are at the top; the more expensive ones are at the bottom
Radu
Index as fast as we can
How much data we could put in a single index at a decent indexing rate before searches took too long
A good practice is to have time-based indices (e.g. keep logs for a week, with one index per day). We want to benchmark that, plus separating indexing load from search load by putting today’s index on different nodes than the "old" ones
Rafal
Rate slowly goes down, because merges happen and because the index is slowly getting bigger
Rafał
40-50M @ 20 seconds
The most expensive query takes 20 sec on average
Filters (the quick ones) take sub-second times
Some aggs take up to 5 seconds on average
Rafał
Spikes are because of merges; the big spike is when a large merge happens, and after the merge the queries are actually faster
Most expensive queries take 15 seconds
Radu
We want to benchmark time-based indices. Because:
- indexing is better, since merging is lighter
- searching recent data is better, because the index is smaller
- deleting entire indices is better
But what granularity?
Use cases for small indices (high indexing rate, short retention, CPU constraint) vs big ones (low indexing rate, long retention, memory constraint)
Granularity doesn’t affect cold-search performance
Rafal
Tell about hot and cold setup
The drop is because cold nodes were full