Elasticsearch In Netflix
Danny Yuan, Jae Bae
Welcome
Hashtag: #ES_in_Netflix
@Elasticsearch - Elasticsearch
@stonse - Sudhir Tonse
@g9yuayon - Danny Yuan
@metacret - Jae Bae
Who Are We?
Software engineers in Netflix’s
Platform Engineering team,
working on large scale data
infrastructure
Building and operating Netflix’s
cloud real-time query service
Why Are We Here?
How We Use Elasticsearch
Why Elasticsearch
How We Run Elasticsearch
To Seek Your Feedback
How We Use Elasticsearch
Querying Log Events
Tracking Service Deployments
Querying Log Events
A Little Historical Perspective
Netflix is a log generating company
that also happens to stream movies
- Adrian Cockcroft

photo credit: http://www.flickr.com/photos/decade_null/142235888/sizes/o/in/photostream/
A Humble Beginning
Things Changed
[Diagram: many Application instances]
70,000,000,000
1,500,000
Making Sense of Billions of Events
So We Evolved

hgrep -C 10 -k 5,2,3 'users.*[1-9]{3}' *catalina.out s3//bucket
select * from log_events where dateint=20140101
Field Name      Field Value
Client          “API”
Server          “Cryptex”
StatusCode      200
ResponseTime    73
[Diagram: Server Farm, Log data, Log Collectors]
What Could Go Wrong?
You thought parallelization would save the day?
Think again
What Is Missing?
Interactive Exploration
Functional Requirements
Arbitrary Boolean Queries
Aggregated Query
- Top N Query
- Trend
- Distribution
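
As a rough illustration, the requirements above could map onto a single Elasticsearch search body like the sketch below. The field names Client, Server, StatusCode and ResponseTime come from the sample log event shown earlier; the `timestamp` field, the bucket sizes, and the helper name `build_log_query` are assumptions made for this example. Aggregation parameter names also differ between Elasticsearch versions (older releases use `interval` where newer ones use `fixed_interval`), so treat this as a sketch rather than the query the team actually ran.

```python
import json


def build_log_query(top_n=5):
    """Build a search body: a boolean filter plus Top N, trend, and distribution."""
    return {
        "size": 0,  # only the aggregations are wanted, not individual hits
        "query": {
            "bool": {
                # Arbitrary boolean query: API client AND NOT status 200
                "must": [{"term": {"Client": "API"}}],
                "must_not": [{"term": {"StatusCode": 200}}],
            }
        },
        "aggs": {
            # Top N: the servers with the most matching events
            "top_servers": {"terms": {"field": "Server", "size": top_n}},
            # Trend: matching events per minute (assumed field and interval)
            "trend": {
                "date_histogram": {"field": "timestamp", "fixed_interval": "1m"}
            },
            # Distribution: response times in 50 ms buckets (assumed width)
            "distribution": {
                "histogram": {"field": "ResponseTime", "interval": 50}
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_log_query(), indent=2))
```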
Non-Functional Requirements
- Interactive (response within seconds)
- Quickly locates the right log events
- Minimal programming effort
It’s All about Extracting Small Data
Out of Big Data
Now Back to the Use Case
Intelligent Alerts
Guided Debugging in the Right Context
A Useful Pattern
Aggregated Query -> Individual Query
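
The pattern can be sketched in plain Python without a cluster: first aggregate to find the interesting bucket, then fetch the individual events behind it. The sample events below are made up for illustration, and both function names are hypothetical.

```python
from collections import Counter

# Made-up sample events, for illustration only.
EVENTS = [
    {"Server": "Cryptex", "StatusCode": 500, "ResponseTime": 120},
    {"Server": "Cryptex", "StatusCode": 200, "ResponseTime": 73},
    {"Server": "Cryptex", "StatusCode": 500, "ResponseTime": 310},
    {"Server": "Gateway", "StatusCode": 500, "ResponseTime": 95},
    {"Server": "Gateway", "StatusCode": 200, "ResponseTime": 40},
]


def top_error_servers(events, n=1):
    """Aggregated query: count 5xx events per server and return the top n."""
    counts = Counter(e["Server"] for e in events if e["StatusCode"] >= 500)
    return counts.most_common(n)


def error_events_for(events, server):
    """Individual query: return the raw 5xx events behind one bucket."""
    return [e for e in events if e["Server"] == server and e["StatusCode"] >= 500]


if __name__ == "__main__":
    [(server, count)] = top_error_servers(EVENTS)
    print(server, count)
    for event in error_events_for(EVENTS, server):
        print(event)
```

The aggregate step narrows billions of events to one suspect bucket, and the second step returns only the few events a person needs to read.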
Examples
- S3 diagnostics
- Tracking email campaigns
- Request traces
Status:200
RequestId     Parent Id   Node Id   Service Name   Status
4965-4a74     0           123       Edge Service   200
4965-4a74     123         456       Gateway        200
4965-4a74     456         789       Service A      200
4965-4a74e    456         abc       Service B      200
Edge Service (456) ---> Gateway (789)

Data Name        Value
Request ID       4965-4a74
Response Time    25 ms
Endpoints        /rest/service
Status Code      200
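
To show how trace rows like those in the request-trace example could be turned back into a call tree, here is a minimal sketch. It assumes that all four rows belong to one request and that a Parent Id of 0 marks the root; neither assumption is stated on the slides, and the function names are hypothetical.

```python
from collections import defaultdict

# (parent id, node id, service name) from the request-trace rows.
ROWS = [
    ("0", "123", "Edge Service"),
    ("123", "456", "Gateway"),
    ("456", "789", "Service A"),
    ("456", "abc", "Service B"),
]


def build_children(rows):
    """Index the rows by parent id so each node's callees can be looked up."""
    children = defaultdict(list)
    for parent, node, service in rows:
        children[parent].append((node, service))
    return children


def render(children, parent="0", depth=0):
    """Walk the tree depth-first and return one indented line per service."""
    lines = []
    for node, service in children.get(parent, []):
        lines.append("  " * depth + f"{service} ({node})")
        lines.extend(render(children, node, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render(build_children(ROWS))))
```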
Why Elasticsearch?
Automatic Sharding and Replication
Flexible Schema
- Schemaless
- Reasonable defaults
Nice Extension Model
- Customizable REST Actions

- Site Plugins
- River Plugins
- Discovery Module
Ecosystem - Plugins, Kibana
Tracking Service Deployments
{ edda }
Built by Netflix Monitoring Eng Team
Tracks History and Changes to Service
Deployments

Keeps Many Revisions
Tracks Dozens of Document Types
Why Elasticsearch?
Schemas may change at any time
Go schemaless
Users may search for any combination of fields

This is what a search engine is designed for
Users often need only a few fields
Projection via “fields” query
Need range queries on date and revisions
Natively supported by Elasticsearch
Route by document ID
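
The last three points (projection, range queries, routing) can be combined in one request, sketched below. The field names `id`, `stime` and `revision` and the function name are hypothetical, since the slides do not show the real mapping. The sketch uses `_source` filtering for projection; the "fields" option named on the slide belongs to the Elasticsearch releases of that time and has changed meaning in later versions. Routing by document ID is read here as keeping all revisions of one document on the same shard, which is an interpretation rather than something the deck spells out.

```python
def revision_search(doc_id, since, min_revision, wanted_fields):
    """Return (query-string params, request body) for a revision search."""
    # Routing sends the search only to the shard chosen for this routing value.
    params = {"routing": doc_id}
    body = {
        # Projection: return just the fields the caller asked for.
        "_source": list(wanted_fields),
        "query": {
            "bool": {
                "must": [
                    {"term": {"id": doc_id}},
                    # Range queries on date and revision number.
                    {"range": {"stime": {"gte": since}}},
                    {"range": {"revision": {"gte": min_revision}}},
                ]
            }
        },
    }
    return params, body


if __name__ == "__main__":
    params, body = revision_search("i-0123", "2014-01-01", 3, ["id", "revision"])
    print(params)
    print(body)
```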
Running ES in Netflix
Operational Challenges
Back pressure when indexing
Diverse configurations and data
Dynamic flow of log events
Needs extensive monitoring and alerting
Tolerating outage at different scales
Favor Pulling Over Pushing
Choose Config with Data
Integrating ES
AMI for Deployment by Asgard
Archaius for Configuration
Eureka for Server Discovery
Suro for Data Delivery
Servo for Monitoring Metrics
Zone-aware Replication
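
One way to get zone-aware replication in Elasticsearch is shard allocation awareness. The sketch below builds a cluster-settings body for it. The attribute name `zone` and the example zone names are assumptions, each node must also be started with a matching zone attribute (the node-level setting name depends on the Elasticsearch version), and the slides do not say whether Netflix configured it exactly this way.

```python
def zone_awareness_settings(zones):
    """Build a cluster-settings body that spreads shard copies across zones."""
    return {
        "persistent": {
            # Take the node attribute "zone" into account when placing copies.
            "cluster.routing.allocation.awareness.attributes": "zone",
            # Forced awareness: never pile all copies into the surviving zones.
            "cluster.routing.allocation.awareness.force.zone.values": ",".join(zones),
        }
    }


if __name__ == "__main__":
    # Hypothetical zone names, for illustration only.
    print(zone_awareness_settings(["us-east-1a", "us-east-1c", "us-east-1d"]))
```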
Multi-region Deployment
Discovery over Cassandra

Region-aware replication
Favor Index Rolling Over TTL
A dedicated service manages index rolling

Uses index template and routing
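
Index rolling usually means writing each day's events to a dated index and dropping whole indices once they age out, which is cheaper than deleting documents one by one. The sketch below shows the bookkeeping such a service might do. The `prefix-YYYY.MM.DD` naming scheme, the retention window, and the function names are assumptions, as the deck does not describe the dedicated service in detail.

```python
from datetime import date, timedelta


def index_name(prefix, day):
    """Name of the daily index for a given day, e.g. logs-2014.01.03."""
    return f"{prefix}-{day:%Y.%m.%d}"


def rolling_plan(prefix, today, retention_days, existing):
    """Return (index to write to today, expired indices to drop whole)."""
    keep = {
        index_name(prefix, today - timedelta(days=i))
        for i in range(retention_days)
    }
    expired = sorted(
        name
        for name in existing
        if name.startswith(prefix + "-") and name not in keep
    )
    return index_name(prefix, today), expired


if __name__ == "__main__":
    existing = ["logs-2014.01.01", "logs-2014.01.02", "logs-2014.01.03"]
    print(rolling_plan("logs", date(2014, 1, 3), 2, existing))
```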
Worth Trying G1
Not recommended by ES team, but

Has fewer and shorter GC pauses

Occasional SIGSEGV, but it’s okay
Simple Majority for Master Election
Split-brain problem


discovery.zen.minimum_master_nodes

Dynamically updated
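
A simple majority is more than half of the master-eligible nodes, so two halves of a partitioned cluster cannot both elect a master. The sketch below computes that quorum and wraps it in a cluster-settings body, which is how the value could be changed at runtime as the cluster grows or shrinks. The function names are hypothetical, and `discovery.zen.minimum_master_nodes` applies to the Zen discovery releases of the time; later Elasticsearch versions removed the setting.

```python
def minimum_master_nodes(master_eligible):
    """Smallest node count that is a strict majority of master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def quorum_settings(master_eligible):
    """Cluster-settings body that applies the quorum dynamically."""
    return {
        "persistent": {
            "discovery.zen.minimum_master_nodes": minimum_master_nodes(master_eligible)
        }
    }


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, quorum_settings(n))
```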
Future Work
Automatic incremental backup and restore


Auto scaling

Fully automated deployment

Support more use cases
We’re Hiring
Thank You!
