Building a multi-data-center log aggregation framework at Squarespace on top of the open source ELK stack, featuring Elasticsearch, Logstash, Kibana, Filebeat and Kafka.
12. 01 Logs for all environments: corporate, QA, staging, production
02 Logs for all software services: monolith, microservices
03 Logs for all system components: search, caching, discovery, etc.
04 Logs for all data centers
05 Enough room for ad-hoc log aggregation by different teams
06 Scaling != $$$
Goals
22. Elastic Stack
Application Process
(e.g. Java)
Filebeat Tags: source_host and environment
Routing: automatic routing to the corresponding Kafka cluster based on data center and environment
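A minimal sketch of what this Filebeat side could look like. The paths, hostnames, and topic naming scheme are assumptions for illustration, not the actual Squarespace configuration:

```yaml
# filebeat.yml (sketch) -- tag each event with source_host and environment,
# then ship to the Kafka cluster for this data center / environment.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log          # assumption: app log location
    fields:
      source_host: "${HOSTNAME}"    # assumption: injected via environment
      environment: production
    fields_under_root: true

output.kafka:
  # assumption: broker list templated per data center and environment
  # at deploy time (e.g. by Ansible), which is what makes routing "automatic"
  hosts: ["kafka-dc1-prod-1:9092", "kafka-dc1-prod-2:9092"]
  topic: "logs-%{[environment]}"
```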
26. Elastic Stack
Kafka: 10 nodes
Ingestion bottleneck: helped identify the ingestion bottleneck and ruled out Filebeat as the root cause
Retention: gave us retention beyond Filebeat’s local buffer; we now have an 8-hour buffer
Operational issues: very high-traffic logs would rotate quickly, and Filebeat would hold onto deleted file handles and fill up disks on servers
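The 8-hour buffer boils down to one broker-side setting. A sketch of the relevant `server.properties` lines (per-topic `retention.ms` overrides would work equally well; the check interval shown is an assumption):

```properties
# Keep log segments for 8 hours before deletion
log.retention.hours=8
# How often the broker checks for expired segments (assumed default-ish value)
log.retention.check.interval.ms=300000
```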
31. Elastic Stack
Log filters: specify how to parse individual log types using the full power
of Logstash filter plugins
Logstash Indexers: 35 nodes
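A hedged sketch of what one such Logstash pipeline might look like: a Kafka input feeding per-log-type filters. The topic, type name, and broker address are assumptions:

```conf
input {
  kafka {
    bootstrap_servers => "kafka-dc1-prod-1:9092"  # assumption
    topics            => ["logs-production"]      # assumption
    codec             => "json"
  }
}

filter {
  # Per-log-type parsing: any Logstash filter plugin is available here
  if [type] == "nginx-access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => ["timestamp", "ISO8601"]
  }
}
```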
32. Elastic Stack
Index definitions (new or existing): index durations and retention per
environment can be configured using Ansible and applied automatically
Handles routing within the indexers and index retention time (Curator)
Elasticsearch Workhorses: 16 nodes, 1.5 TB disk, 64 GB RAM each
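Retention of this kind is typically expressed as a Curator action file. A sketch, assuming daily indexes named by date; the prefix and the 14-day count are illustrative, not the actual per-environment values:

```yaml
actions:
  1:
    action: delete_indices
    description: "Apply per-environment retention to daily log indexes"
    options:
      ignore_empty_list: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: logs-production-      # assumption: index naming scheme
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 14               # assumption: retention for this environment
```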
33. Elastic Stack
Index definitions (new or existing): specify how many shards and
replicas are required; any field -> data type mappings can also be
configured using Ansible and automatically applied to the workhorses
Elasticsearch Workhorses: 16 nodes, 1.5 TB disk, 64 GB RAM each
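Shards, replicas, and mappings of this kind would typically land in a legacy-style index template (the pre-6.x format, which matches the Shield/Marvel era of this deck). The pattern, counts, and field names below are assumptions:

```json
{
  "template": "logs-production-*",
  "settings": {
    "number_of_shards": 8,
    "number_of_replicas": 1
  },
  "mappings": {
    "_default_": {
      "properties": {
        "source_host": { "type": "keyword" },
        "environment": { "type": "keyword" }
      }
    }
  }
}
```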
39. 01 Output logs in a predictable format (e.g. JSON), save a lot of time!
02 Pay attention to Elasticsearch shard sizes!
● Shard sizes should be as even as possible; we target 20-30 GB shards
● Helps when moving shards during constant cluster rebalancing
● We recommend daily or weekly indexes and tweaking retention settings
● Consider index lifespan, number of shards, and sizes of logs ingested per lifespan
Lessons Learned
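The 20-30 GB shard target above implies a simple sizing calculation per index lifespan. A hypothetical sketch (function name and the 25 GB midpoint target are assumptions):

```python
import math

def shards_for_index(ingest_gb_per_lifespan: float,
                     target_shard_gb: float = 25.0) -> int:
    """Primary shard count so each shard lands near the 20-30 GB target.

    ingest_gb_per_lifespan: GB of logs ingested over one index lifespan
    (one day for daily indexes, one week for weekly ones).
    """
    return max(1, math.ceil(ingest_gb_per_lifespan / target_shard_gb))

# e.g. a daily index receiving 200 GB/day -> 8 primary shards of ~25 GB each
print(shards_for_index(200))
```

Low-volume log types would get 1 shard and possibly a weekly index instead, which is why the lifespan, shard count, and ingest size have to be considered together.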
40. 03 Use X-Pack security (Shield)
04 Use monitoring (Marvel)
● Export the monitoring metrics from every node into a separate ES cluster
● Monitor Kibana and Logstash using that separate cluster
● We had to add two security realms to our ES configuration: LDAP, local filesystem
● The fully-privileged admin user was hitting our LDAP servers hard!
Lessons Learned
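The two-realm setup can be sketched in Shield-era `elasticsearch.yml` syntax: ordering the local file realm before LDAP means the fully-privileged admin user authenticates locally and stops hammering the LDAP servers. Realm names and the LDAP URL are assumptions:

```yaml
shield.authc.realms:
  file1:
    type: file        # local filesystem realm, checked first
    order: 0
  ldap1:
    type: ldap
    order: 1
    url: "ldaps://ldap.example.com:636"   # assumption: placeholder host
    # assumption: group-to-role mapping configured separately
```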
43. Future: Elastic Stack
2 processes per node: run two Elasticsearch processes on each server
Beefier nodes: double disk capacity from 1.5 to 3 TBs
Retention: 30 days or more of retention for all indexes, as necessary
Elasticsearch Workhorses: 16 nodes, 3 TB disk, 64 GB RAM each