Distributed Logging Architecture in the Container Era
LinuxCon Japan 2016, June 13, 2016
Satoshi "Moris" Tagomori (@tagomoris)
Fluentd, MessagePack-Ruby, Norikra, ...
Treasure Data, Inc.
http://www.linuxfoundation.org/news-media/announcements/2016/06/chaosuan-crunchy-data-qbox-storageos-and-treasure-data-join-cloud
Topics
• Microservices and logging in various industries
• Difficulties of logging with containers
• Distributed logging architecture
• Patterns of distributed logging architecture
• Case Study: Docker and Fluentd
Logging
Logging in Various Industries
• Web access logs
• Views/visitors on media
• Views/clicks on Ads
• Commercial transactions (EC, Game, ...)
• Data from devices
• Operation logs on Apps of phones
• Various sensor data
Microservices and Logging
• Monolithic service
• a service produces all data about a user's behavior
• Microservices
• many services produce data about a user's access
• logs must be collected from many services to know what is happening
[Diagrams: users, service (application), and logs for the monolithic case; users and logs for the microservices case]
Logging and Containers
Containers: "a must" for microservices
• Dividing a service into services
• a service requires less computing resources (VM -> containers)
• Making services independent from each other
• but it is very difficult :(
• some dependencies must be resolved even in the development environment (containers on desktop)
Redesign Logging: Why?
• No permanent storage
• No fixed physical/network address
• No fixed mapping between servers and roles
• We should parse/label logs at the source, and ship these logs by pushing to the destination ASAP
Containers: immutable & disposable
• No permanent storage
• Where to write logs?
• files in the container → gone w/ container instance 😞
• directories shared from hosts → hosts are shared by many containers/services ☹
• TODO: ship logs from container to anywhere ASAP
Containers: unfixed addresses
• No fixed physical / network address
• Where should we go to fetch logs?
• Service discovery (e.g., Consul) → one more component 😞
• rsync? ssh+tail? or ..? Is it installed in containers? → one more tool to depend on ☹
• TODO: push logs to anywhere from containers
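The push model in the TODO above can be sketched in Python as follows. This is only an illustration under stated assumptions: the newline-delimited JSON framing, the helper names `encode_record` and `push_record`, and the tag `web.access` are hypothetical, and this is not Fluentd's own forward protocol.

```python
import json
import socket
import time


def encode_record(tag, record, timestamp=None):
    """Serialize one log event as a single JSON line (hypothetical framing)."""
    event = {
        "tag": tag,
        "time": int(time.time()) if timestamp is None else timestamp,
        "record": record,
    }
    return (json.dumps(event, sort_keys=True) + "\n").encode("utf-8")


def push_record(sock, tag, record, timestamp=None):
    """Push one event over an already-connected socket.

    The container sends as soon as the event exists, so nothing has to
    discover the container's address and come to fetch the log later.
    """
    sock.sendall(encode_record(tag, record, timestamp))


if __name__ == "__main__":
    # A local socket pair stands in for a real log endpoint.
    sender, receiver = socket.socketpair()
    push_record(sender, "web.access", {"path": "/", "status": 200})
    sender.close()
    print(receiver.makefile("rb").readline().decode("utf-8").strip())
    receiver.close()
```

Because the sender only needs one outbound address, no log-fetching tool has to be installed in the container image.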
Containers: instances per role
• No fixed mapping between servers and roles
• How can we parse / store these logs?
• Central repository of log syntax → very hard to maintain 😞
• Label logs by source address → many containers/roles in a host ☹
• TODO: label & parse logs at source of logs
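Labeling and parsing at the source can be sketched in Python as below. The access-log layout, the regular expression, and the label names (`service`, `container`) are assumptions made for the example, not something the slides specify.

```python
import re

# Hypothetical access-log layout: <ip> "<METHOD> <path>" <status>
ACCESS_LINE = re.compile(
    r'(?P<ip>\S+) "(?P<method>[A-Z]+) (?P<path>\S+)" (?P<status>\d{3})'
)


def parse_and_label(line, labels):
    """Turn a raw line into a structured record and attach source labels."""
    match = ACCESS_LINE.match(line)
    if match is None:
        # Keep unparsable lines as-is rather than dropping them.
        record = {"message": line}
    else:
        record = match.groupdict()
        record["status"] = int(record["status"])
    # The source knows its own role, so the labels are attached here
    # instead of being guessed later from a network address.
    record.update(labels)
    return record


if __name__ == "__main__":
    labels = {"service": "web", "container": "c1"}
    print(parse_and_label('10.0.0.1 "GET /index.html" 200', labels))
```

Since each record carries its own labels, downstream nodes do not need a central repository of log syntax.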
Distributed Logging Architecture
Core Architecture
• Collector nodes
• Aggregator nodes
• Destinations
[Diagram: collector nodes (Docker containers + agent), aggregator nodes, destinations (storage, database, ...)]
Collecting and Storing Data
• Parse/Label (collector)
• Raw logs are not good for processing
• Convert logs to structured data (key-value pairs)
• Split/Sort (aggregator)
• Mixed logs are not good for searching
• Split whole data stream into streams per service
• Store (destination)
• Format logs (records) as the destination expects
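The split step on the aggregator can be sketched in Python as follows. It assumes each record is already a dict carrying a `service` label added at the source; the key name and the `unknown` fallback are hypothetical.

```python
from collections import defaultdict


def split_by_service(records, key="service"):
    """Split one mixed stream of records into one list per service."""
    streams = defaultdict(list)
    for record in records:
        # Records without a label go to a catch-all stream instead of being lost.
        streams[record.get(key, "unknown")].append(record)
    return dict(streams)


if __name__ == "__main__":
    mixed = [
        {"service": "web", "status": 200},
        {"service": "ads", "clicks": 3},
        {"service": "web", "status": 500},
    ]
    for service, stream in split_by_service(mixed).items():
        print(service, stream)
```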
Scaling Logging
• Network traffic
• CPU load to parse / format
• Parse logs on each collector (distributed)
• Format logs on aggregator (to be distributed)
• Capability
• Make aggregators redundant
• Controlling delay
• to be sure how soon we can know what's happening in our systems
Patterns
Aggregation Patterns
[Diagram: 2x2 matrix of source aggregation (NO / YES) against destination aggregation (NO / YES)]
Source Side Aggregation Patterns
[Diagram: w/o source aggregation vs. w/ source aggregation; labels: collector, aggregator / destination, aggregate container]
Without Source Aggregation
• Pros:
• Simple configuration
• Cons:
• fixed aggregator (endpoint) address
• many network connections
• high load in aggregator
With Source Aggregation
• Pros:
• fewer connections
• lower load in aggregator
• less configuration in containers (by specifying localhost)
• highly flexible configuration (by redeploying only the aggregate containers)
• Cons:
• slightly more resources (+1 container per host)
Destination Side Aggregation Patterns
[Diagram: w/o destination aggregation vs. w/ destination aggregation; labels: aggregator, collector, destination]
Without Destination Aggregation
• Pros:
• Fewer nodes
• Simpler configuration
• Cons:
• Storage side changes affect the collector side
• Worse performance: many small write requests on storage
With Destination Aggregation
• Pros:
• Collector side configuration is free from storage side changes
• Better performance by fine-tuning the destination side aggregator
• Cons:
• More nodes
• Somewhat more complex configuration
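One way a destination side aggregator can avoid many small write requests is to buffer records and write them in batches. The Python sketch below illustrates that idea only; the class name `BatchingWriter`, the `write_batch` callback, and the batch size are hypothetical and do not describe Fluentd's actual buffering.

```python
class BatchingWriter:
    """Buffer records and hand them to storage in larger batches."""

    def __init__(self, write_batch, max_records=1000):
        self._write_batch = write_batch  # callable that performs one storage write
        self._max_records = max_records
        self._buffer = []

    def append(self, record):
        self._buffer.append(record)
        if len(self._buffer) >= self._max_records:
            self.flush()

    def flush(self):
        # Skip the write entirely when there is nothing buffered.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._write_batch(batch)


if __name__ == "__main__":
    writes = []
    writer = BatchingWriter(writes.append, max_records=3)
    for i in range(7):
        writer.append({"seq": i})
    writer.flush()
    print(len(writes), "storage writes for 7 records")
```

A real aggregator would also flush on a timer so that delay stays bounded when traffic is low; that part is left out here.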
Scaling Patterns
• Scaling Up Endpoints: HTTP/TCP load balancer, or huge queue + workers
• Scaling Out Endpoints: round-robin clients
[Diagram: load balancer, backend nodes, collector nodes, aggregator nodes]
Scaling Up Endpoints
• Pros:
• Simple configuration in collector nodes
• Cons:
• Limits on scaling up
Scaling Out Endpoints
• Pros:
• Unlimited scaling by adding aggregator nodes
• Cons:
• Complex configuration
• Requires round-robin support in clients
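The client-side round-robin feature mentioned above can be sketched in Python as below. The class name `RoundRobinSender`, the injected `send` callable, and the endpoint names are hypothetical; failing over to the next endpoint on `OSError` is an assumption of this example, not a description of any particular client.

```python
import itertools


class RoundRobinSender:
    """Rotate over aggregator endpoints, skipping ones that fail."""

    def __init__(self, endpoints, send):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._cycle = itertools.cycle(self._endpoints)
        self._send = send  # callable(endpoint, record) that raises OSError on failure

    def emit(self, record):
        # Try each endpoint at most once, starting from the next one in rotation.
        last_error = None
        for _ in range(len(self._endpoints)):
            endpoint = next(self._cycle)
            try:
                self._send(endpoint, record)
                return endpoint
            except OSError as error:
                last_error = error
        raise last_error


if __name__ == "__main__":
    def fake_send(endpoint, record):
        print("sent", record, "to", endpoint)

    sender = RoundRobinSender(["agg-1:24224", "agg-2:24224"], fake_send)
    for i in range(4):
        sender.emit({"seq": i})
```

Adding an aggregator node then means adding one entry to the endpoint list, which is also why every client has to know the full list.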
• Scaling up endpoints, without destination aggregation: systems in early stages
• Scaling up endpoints, with destination aggregation: collecting logs over the Internet, or using queues
• Scaling out endpoints, without destination aggregation: impossible :( (collector nodes must know all endpoints → uncontrollable)
• Scaling out endpoints, with destination aggregation: collecting logs in a datacenter
Case Studies
Case Study: Docker+Fluentd
• Destination aggregation + scaling up
• Fluent logger + Fluentd
• Source aggregation + scaling up
• Docker json logger + Fluentd + Elasticsearch
• Docker fluentd logger + Fluentd + Kafka
• Source/Destination aggregation + scaling out
• Docker fluentd logger + Fluentd
Why Fluentd?
• Docker Fluentd logging driver
• Docker containers can send logs to Fluentd directly - less overhead
• Pluggable architecture
• Various destination systems
• Small memory footprint
• Source aggregation requires +1 container per host
• Less additional resource usage ( < 100MB )
Destination aggregation + scaling up
• Sending logs directly over TCP by Fluentd logger library in application code
• Same pattern as New Relic
• Easy to implement - good for startups
Source aggregation + scaling up
• Kubernetes: JSON logger + Fluentd + Elasticsearch
• Applications write logs to STDOUT
• Docker writes logs as JSON in files
• Fluentd reads logs from files, parses JSON objects, writes logs to Elasticsearch
• EFK stack (like ELK stack)
http://kubernetes.io/docs/getting-started-guides/logging-elasticsearch/
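The "read file, parse JSON objects" step can be sketched in Python as follows. It assumes the line layout written by Docker's json-file logging driver (`log`, `stream`, and `time` keys); merging fields when the application message is itself a JSON object is an extra assumption of this example, and `parse_docker_json_line` is a hypothetical helper, not part of Fluentd.

```python
import json


def parse_docker_json_line(line):
    """Parse one line of a Docker json-file log into a flat record."""
    entry = json.loads(line)
    record = {
        "stream": entry.get("stream"),
        "time": entry.get("time"),
        "message": entry.get("log", "").rstrip("\n"),
    }
    # Assumption: if the application logged a JSON object, expose its fields.
    try:
        payload = json.loads(record["message"])
    except ValueError:
        payload = None
    if isinstance(payload, dict):
        record.update(payload)
    return record


if __name__ == "__main__":
    sample = json.dumps(
        {"log": "hello\n", "stream": "stdout", "time": "2016-06-13T00:00:00Z"}
    )
    print(parse_docker_json_line(sample))
```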
Source aggregation + scaling up/out
• Docker fluentd logging driver + Fluentd + Kafka
• Applications write logs to STDOUT
• Docker sends logs to localhost Fluentd
• Fluentd gets logs over TCP, pushes logs into Kafka
• Highly scalable & less overhead - very good for huge deployments
Source/Destination aggregation + scaling out
• Docker fluentd logging driver + Fluentd
• Applications write logs to STDOUT
• Docker sends logs to localhost Fluentd
• Fluentd gets logs over TCP, sends logs to Aggregator Fluentd w/ round-robin load balancing
• Highly flexible - good for complex data processing requirements
[Diagram: destination is any other storage]
What's the Best?
• Writing logs from containers: several ways to do it
• Docker logging driver
• Write logs to files + read/parse them
• Send logs from apps directly
• Make the platform scalable!
• Source aggregation: Fluentd on localhost
• Scalable storage: (Kafka, external services, ...)
• No destination aggregation + Scaling up
• Non-scalable storage: (Filesystems, RDBMSs, ...)
• Destination aggregation + Scaling out
Why Is OSS Important for Logging?
Why OSS?
• The logging layer is an interface
• transparency
• interoperability
• Keep the platform scalable
• number of nodes
• number of types of source/destination
Use OSS,
Make Logging Scalable
Thank you!

Weitere ähnliche Inhalte

Was ist angesagt?

InfluxDB Internals
InfluxDB InternalsInfluxDB Internals
InfluxDB InternalsInfluxData
 
FluentD for end to end monitoring
FluentD for end to end monitoringFluentD for end to end monitoring
FluentD for end to end monitoringPhil Wilkins
 
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...InfluxData
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and FluentdN Masahiro
 
Kafka website activity architecture
Kafka website activity architectureKafka website activity architecture
Kafka website activity architectureOmid Vahdaty
 
Building realtime data pipeline with Apache Kafka
Building realtime data pipeline with Apache KafkaBuilding realtime data pipeline with Apache Kafka
Building realtime data pipeline with Apache KafkaNagarajan Selvaraj
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellN Masahiro
 
Mapbox.com: Serving maps from 8 regions
Mapbox.com: Serving maps from 8 regionsMapbox.com: Serving maps from 8 regions
Mapbox.com: Serving maps from 8 regionsJohan
 
Data Security Governanace and Consumer Cloud Storage
Data Security Governanace and Consumer Cloud StorageData Security Governanace and Consumer Cloud Storage
Data Security Governanace and Consumer Cloud StorageDaniel Rohan
 
Oleksandr Nitavskyi "Kafka deployment at Scale"
Oleksandr Nitavskyi "Kafka deployment at Scale"Oleksandr Nitavskyi "Kafka deployment at Scale"
Oleksandr Nitavskyi "Kafka deployment at Scale"Fwdays
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBScott Mansfield
 
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...PROIDEA
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon
 
How Criteo is managing one of the largest Kafka Infrastructure in Europe
How Criteo is managing one of the largest Kafka Infrastructure in EuropeHow Criteo is managing one of the largest Kafka Infrastructure in Europe
How Criteo is managing one of the largest Kafka Infrastructure in EuropeRicardo Paiva
 
Why You Definitely Don’t Want to Build Your Own Time Series Database
Why You Definitely Don’t Want to Build Your Own Time Series DatabaseWhy You Definitely Don’t Want to Build Your Own Time Series Database
Why You Definitely Don’t Want to Build Your Own Time Series DatabaseInfluxData
 
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...confluent
 
How is Kafka so Fast?
How is Kafka so Fast?How is Kafka so Fast?
How is Kafka so Fast?Ricardo Paiva
 

Was ist angesagt? (20)

Fluentd 101
Fluentd 101Fluentd 101
Fluentd 101
 
InfluxDB Internals
InfluxDB InternalsInfluxDB Internals
InfluxDB Internals
 
FluentD for end to end monitoring
FluentD for end to end monitoringFluentD for end to end monitoring
FluentD for end to end monitoring
 
Logging for Containers
Logging for ContainersLogging for Containers
Logging for Containers
 
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
InfluxDB IOx Tech Talks: Replication, Durability and Subscriptions in InfluxD...
 
Fluent Bit: Log Forwarding at Scale
Fluent Bit: Log Forwarding at ScaleFluent Bit: Log Forwarding at Scale
Fluent Bit: Log Forwarding at Scale
 
Docker and Fluentd
Docker and FluentdDocker and Fluentd
Docker and Fluentd
 
Kafka website activity architecture
Kafka website activity architectureKafka website activity architecture
Kafka website activity architecture
 
Building realtime data pipeline with Apache Kafka
Building realtime data pipeline with Apache KafkaBuilding realtime data pipeline with Apache Kafka
Building realtime data pipeline with Apache Kafka
 
Fluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshellFluentd v1.0 in a nutshell
Fluentd v1.0 in a nutshell
 
Mapbox.com: Serving maps from 8 regions
Mapbox.com: Serving maps from 8 regionsMapbox.com: Serving maps from 8 regions
Mapbox.com: Serving maps from 8 regions
 
Data Security Governanace and Consumer Cloud Storage
Data Security Governanace and Consumer Cloud StorageData Security Governanace and Consumer Cloud Storage
Data Security Governanace and Consumer Cloud Storage
 
Oleksandr Nitavskyi "Kafka deployment at Scale"
Oleksandr Nitavskyi "Kafka deployment at Scale"Oleksandr Nitavskyi "Kafka deployment at Scale"
Oleksandr Nitavskyi "Kafka deployment at Scale"
 
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDBEVCache: Lowering Costs for a Low Latency Cache with RocksDB
EVCache: Lowering Costs for a Low Latency Cache with RocksDB
 
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
Atmosphere 2014: Centralized log management based on Logstash and Kibana - ca...
 
HBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at DidiHBaseCon2017 Apache HBase at Didi
HBaseCon2017 Apache HBase at Didi
 
How Criteo is managing one of the largest Kafka Infrastructure in Europe
How Criteo is managing one of the largest Kafka Infrastructure in EuropeHow Criteo is managing one of the largest Kafka Infrastructure in Europe
How Criteo is managing one of the largest Kafka Infrastructure in Europe
 
Why You Definitely Don’t Want to Build Your Own Time Series Database
Why You Definitely Don’t Want to Build Your Own Time Series DatabaseWhy You Definitely Don’t Want to Build Your Own Time Series Database
Why You Definitely Don’t Want to Build Your Own Time Series Database
 
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
Kafka Summit SF 2017 - MultiCluster, MultiTenant and Hierarchical Kafka Messa...
 
How is Kafka so Fast?
How is Kafka so Fast?How is Kafka so Fast?
How is Kafka so Fast?
 

Ähnlich wie Distributed Logging Architecture in the Container Era

Open Source SQL Databases
Open Source SQL DatabasesOpen Source SQL Databases
Open Source SQL DatabasesEmanuel Calvo
 
Apache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageSATOSHI TAGOMORI
 
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...Frank Lyaruu
 
Reactive Development: Commands, Actors and Events. Oh My!!
Reactive Development: Commands, Actors and Events.  Oh My!!Reactive Development: Commands, Actors and Events.  Oh My!!
Reactive Development: Commands, Actors and Events. Oh My!!David Hoerster
 
Gib 2021 - Intro to BizTalk Migrator
Gib 2021 - Intro to BizTalk MigratorGib 2021 - Intro to BizTalk Migrator
Gib 2021 - Intro to BizTalk MigratorDaniel Toomey
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónMongoDB
 
Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Don Demcsak
 
MongoDB and Machine Learning with Flowable
MongoDB and Machine Learning with FlowableMongoDB and Machine Learning with Flowable
MongoDB and Machine Learning with FlowableFlowable
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Emprovise
 
When to Use MongoDB
When to Use MongoDBWhen to Use MongoDB
When to Use MongoDBMongoDB
 
The Wix Microservice Stack
The Wix Microservice StackThe Wix Microservice Stack
The Wix Microservice StackTomer Gabel
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications OpenEBS
 
Centralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container OperationsCentralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container OperationsKublr
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven productsLars Albertsson
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleEvan Chan
 
Breaking the Monolith Road to Containers
Breaking the Monolith Road to ContainersBreaking the Monolith Road to Containers
Breaking the Monolith Road to ContainersAmazon Web Services
 
Kubernetes – An open platform for container orchestration
Kubernetes – An open platform for container orchestrationKubernetes – An open platform for container orchestration
Kubernetes – An open platform for container orchestrationinovex GmbH
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Lucas Jellema
 

Ähnlich wie Distributed Logging Architecture in the Container Era (20)

Open Source SQL Databases
Open Source SQL DatabasesOpen Source SQL Databases
Open Source SQL Databases
 
Apache Geode Meetup, London
Apache Geode Meetup, LondonApache Geode Meetup, London
Apache Geode Meetup, London
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby Usage
 
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
ApacheCon Core: Service Discovery in OSGi: Beyond the JVM using Docker and Co...
 
Timesten Architecture
Timesten ArchitectureTimesten Architecture
Timesten Architecture
 
Reactive Development: Commands, Actors and Events. Oh My!!
Reactive Development: Commands, Actors and Events.  Oh My!!Reactive Development: Commands, Actors and Events.  Oh My!!
Reactive Development: Commands, Actors and Events. Oh My!!
 
Gib 2021 - Intro to BizTalk Migrator
Gib 2021 - Intro to BizTalk MigratorGib 2021 - Intro to BizTalk Migrator
Gib 2021 - Intro to BizTalk Migrator
 
Conceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producciónConceptos básicos. Seminario web 6: Despliegue de producción
Conceptos básicos. Seminario web 6: Despliegue de producción
 
Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)Big Data (NJ SQL Server User Group)
Big Data (NJ SQL Server User Group)
 
MongoDB and Machine Learning with Flowable
MongoDB and Machine Learning with FlowableMongoDB and Machine Learning with Flowable
MongoDB and Machine Learning with Flowable
 
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
Highlights of AWS ReInvent 2023 (Announcements and Best Practices)
 
When to Use MongoDB
When to Use MongoDBWhen to Use MongoDB
When to Use MongoDB
 
The Wix Microservice Stack
The Wix Microservice StackThe Wix Microservice Stack
The Wix Microservice Stack
 
Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications Latest (storage IO) patterns for cloud-native applications
Latest (storage IO) patterns for cloud-native applications
 
Centralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container OperationsCentralizing Kubernetes and Container Operations
Centralizing Kubernetes and Container Operations
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at ScaleFiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
FiloDB: Reactive, Real-Time, In-Memory Time Series at Scale
 
Breaking the Monolith Road to Containers
Breaking the Monolith Road to ContainersBreaking the Monolith Road to Containers
Breaking the Monolith Road to Containers
 
Kubernetes – An open platform for container orchestration
Kubernetes – An open platform for container orchestrationKubernetes – An open platform for container orchestration
Kubernetes – An open platform for container orchestration
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
 

Kürzlich hochgeladen

Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 

Kürzlich hochgeladen (20)

Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 

Distributed Logging Architecture in the Container Era

  • 1. Distributed Logging Architecture in Container Era LinuxCon Japan 2016 at Jun 13 2016 Satoshi "Moris" Tagomori (@tagomoris)
  • 2. Satoshi "Moris" Tagomori (@tagomoris) Fluentd, MessagePack-Ruby, Norikra, ... Treasure Data, Inc.
  • 3.
  • 5. Topics • Microservices and logging in various industries • Difficulties of logging with containers • Distributed logging architecture • Patterns of distributed logging architecture • Case Study: Docker and Fluentd
  • 7. Logging in Various Industries • Web access logs • Views/visitors on media • Views/clicks on Ads • Commercial transactions (EC, Game, ...) • Data from devices • Operation logs on Apps of phones • Various sensor data
  • 8. Microservices and Logging • Monolithic service • a service produces all data about a user's behavior • Microservices • many services produce data about a user's access • logs must be collected from many services to know what is happening [Diagram labels: Users, Service (Application), Logs]
  • 10. Containers: "a must" for microservices • Dividing a service into services • a service requires less computing resources (VM -> containers) • Making services independent from each other • but it is very difficult :( • some dependencies must be resolved even in the development environment (containers on desktop)
  • 11. Redesign Logging: Why? • No permanent storage • No fixed physical/network address • No fixed mapping between servers and roles • We should parse/label logs at the source and ship them by pushing to the destination ASAP
  • 12. Containers: immutable & disposable • No permanent storage • Where to write logs? • files in the container → gone w/ container instance 😞 • directories shared from hosts → hosts are shared by many containers/services ☹ • TODO: ship logs from container to anywhere ASAP
  • 13. Containers: unfixed addresses • No fixed physical / network address • Where should we go to fetch logs? • Service discovery (e.g., Consul) → one more component 😞 • rsync? ssh+tail? or ..? Is it installed in containers? → one more tool to depend on ☹ • TODO: push logs to anywhere from containers
  • 14. Containers: instances per role • No fixed mapping between servers and roles • How can we parse / store these logs? • Central repository of log syntax → very hard to maintain 😞 • Label logs by source address → many containers/roles on a host ☹ • TODO: label & parse logs at the source of logs
  • 16. Core Architecture • Collector nodes (Docker containers + agent) • Aggregator nodes • Destinations (Storage, Database, ...)
  • 17. Collecting and Storing Data • Parse/Label (collector) • Raw logs are not good for processing • Convert logs to structured data (key-value pairs) • Split/Sort (aggregator) • Mixed logs are not good for searching • Split the whole data stream into streams per service • Store (destination) • Format logs (records) as the destination expects
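The parse/label and split steps on slide 17 can be sketched in a few lines of Python. This is only an illustration, not Fluentd's implementation: the raw log format (`<time> <LEVEL> <message>`), the function names, and the service names are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical raw log format, assumed for illustration: "<time> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def parse_and_label(raw_line, service):
    """Collector step: turn a raw line into key-value pairs and label its source."""
    match = LINE_PATTERN.match(raw_line)
    if match is None:
        # Keep unparseable lines instead of dropping them silently.
        return {"service": service, "message": raw_line, "parse_error": True}
    record = match.groupdict()
    record["service"] = service
    return record


def split_by_service(records):
    """Aggregator step: split one mixed stream into one stream per service."""
    streams = defaultdict(list)
    for record in records:
        streams[record["service"]].append(record)
    return dict(streams)


# Hypothetical services "auth" and "billing" sharing one mixed stream.
mixed = [
    parse_and_label("2016-06-13T10:00:00Z INFO user logged in", "auth"),
    parse_and_label("2016-06-13T10:00:01Z ERROR payment failed", "billing"),
    parse_and_label("not a structured line", "auth"),
]
streams = split_by_service(mixed)
```

Because each record already carries its `service` label when it leaves the collector, the aggregator can split the stream without knowing which host or address the record came from.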
  • 18. Scaling Logging • Network traffic • CPU load to parse / format • Parse logs on each collector (distributed) • Format logs on aggregator (to be distributed) • Capability • Make aggregators redundant • Controlling delay • to guarantee how soon we can know what's happening in our systems
  • 21. Source Side Aggregation Patterns • w/o source aggregation • w/ source aggregation [Diagram labels: collector, aggregate container, aggregator / destination]
  • 22. Without Source Aggregation • Pros: • Simple configuration • Cons: • fixed aggregator (endpoint) address • many network connections • high load in aggregator [Diagram labels: collector, aggregator]
  • 23. With Source Aggregation • Pros: • fewer connections • lower load in aggregator • less configuration in containers (by specifying localhost) • highly flexible configuration (by redeploying only the aggregate containers) • Cons: • slightly more resources (+1 container per host) [Diagram labels: aggregate container, aggregator]
  • 24. Destination Side Aggregation Patterns • w/o destination aggregation • w/ destination aggregation [Diagram labels: collector, aggregator, destination]
  • 25. Without Destination Aggregation • Pros: • Fewer nodes • Simpler configuration • Cons: • Storage-side changes affect the collector side • Worse performance: many small write requests on storage
  • 26. With Destination Aggregation • Pros: • Collector-side configuration is free from storage-side changes • Better performance with fine tuning on the destination-side aggregator • Cons: • More nodes • Slightly more complex configuration [Diagram label: aggregator]
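Slides 25 and 26 contrast many small write requests with tuned writes from a destination-side aggregator. A minimal sketch of that buffering idea follows; the `BufferedWriter` class, its `max_records` threshold, and the in-memory `batches` list standing in for storage are all hypothetical, and real aggregators typically also flush on a timer.

```python
class BufferedWriter:
    """Collects records and writes them to storage in batches."""

    def __init__(self, write_batch, max_records=3):
        self._write_batch = write_batch  # callable that stores a list of records
        self._max_records = max_records
        self._buffer = []

    def append(self, record):
        self._buffer.append(record)
        if len(self._buffer) >= self._max_records:
            self.flush()

    def flush(self):
        if self._buffer:
            self._write_batch(self._buffer)
            # Start a fresh list so the batch already handed over is not mutated.
            self._buffer = []


batches = []  # stands in for the storage: each element is one write request
writer = BufferedWriter(batches.append, max_records=3)
for i in range(7):
    writer.append({"seq": i})
writer.flush()  # flush the remainder, e.g. on shutdown
```

Here seven records reach the storage as three write requests instead of seven, and the batch size can be tuned on the aggregator without touching any collector.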
  • 27. Scaling Patterns • Scaling Up Endpoints • HTTP/TCP load balancer • Huge queue + workers • Scaling Out Endpoints • Round-robin clients [Diagram labels: Load balancer, Backend nodes, Collector nodes, Aggregator nodes]
  • 28. Scaling Up Endpoints • Pros: • Simple configuration in collector nodes • Cons: • Limits on scaling up [Diagram labels: Load balancer, Backend nodes]
  • 29. Scaling Out Endpoints • Pros: • Unlimited scaling by adding aggregator nodes • Cons: • Complex configuration • Clients must support round-robin
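The "round-robin clients" that scaling out depends on can be sketched as below. This is an assumption-laden illustration rather than Fluentd's forwarding logic: `RoundRobinSender`, the `send` callable, and the endpoint names are hypothetical, and failure handling is reduced to skipping an endpoint that raises `OSError`.

```python
import itertools


class RoundRobinSender:
    """Spreads records over several aggregator endpoints in turn."""

    def __init__(self, endpoints, send):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._endpoints = list(endpoints)
        self._cycle = itertools.cycle(self._endpoints)
        self._send = send  # callable(endpoint, record); raises OSError on failure

    def emit(self, record):
        # Try each endpoint at most once per record, starting from the next in turn.
        for _ in range(len(self._endpoints)):
            endpoint = next(self._cycle)
            try:
                self._send(endpoint, record)
                return endpoint
            except OSError:
                continue
        raise RuntimeError("no aggregator endpoint accepted the record")


# Hypothetical endpoints; "agg-2" is simulated as being down.
delivered = []


def fake_send(endpoint, record):
    if endpoint == "agg-2:24224":
        raise OSError("connection refused")
    delivered.append((endpoint, record["seq"]))


sender = RoundRobinSender(["agg-1:24224", "agg-2:24224", "agg-3:24224"], fake_send)
used = [sender.emit({"seq": i}) for i in range(4)]
```

The sketch also shows the cost named on the slide: every client has to carry the full endpoint list, so adding an aggregator means reconfiguring the clients.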
  • 30. Scaling Up Endpoints • Without Destination Aggregation: systems in early stages • With Destination Aggregation: collecting logs over the Internet, or using queues • Scaling Out Endpoints • Without Destination Aggregation: impossible :( (collector nodes must know all endpoints → uncontrollable) • With Destination Aggregation: collecting logs in a datacenter
  • 32. Case Study: Docker+Fluentd • Destination aggregation + scaling up • Fluent logger + Fluentd • Source aggregation + scaling up • Docker json logger + Fluentd + Elasticsearch • Docker fluentd logger + Fluentd + Kafka • Source/Destination aggregation + scaling out • Docker fluentd logger + Fluentd
  • 33. Why Fluentd? • Docker Fluentd logging driver • Docker containers can send logs to Fluentd directly - less overhead • Pluggable architecture • Various destination systems • Small memory footprint • Source aggregation requires +1 container per host • Less additional resource usage ( < 100MB )
  • 34. Destination aggregation + scaling up • Sending logs directly over TCP with the Fluentd logger library in application code • Same pattern as New Relic • Easy to implement - good for startups [Diagram label: Application code]
  • 35. Source aggregation + scaling up • Kubernetes: JSON logger + Fluentd + Elasticsearch • Applications write logs to STDOUT • Docker writes logs as JSON in files • Fluentd reads logs from files, parses JSON objects, and writes logs to Elasticsearch • EFK stack (like ELK stack) http://kubernetes.io/docs/getting-started-guides/logging-elasticsearch/ [Diagram labels: Application code, Files (JSON), Elasticsearch]
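The "parse JSON objects" step on slide 35 can be illustrated as follows. Docker's json-file logging driver writes one JSON object per line with `log`, `stream`, and `time` keys; the function name, the `container` label, and the sample line are assumptions made for this sketch, and this is not how Fluentd itself is implemented.

```python
import json


def parse_docker_json_line(line, container_name):
    """Parse one line of a Docker json-file log and label it with its source."""
    entry = json.loads(line)
    return {
        "container": container_name,  # hypothetical label added by the collector
        "stream": entry["stream"],
        "time": entry["time"],
        "message": entry["log"].rstrip("\n"),
    }


# Sample line in the json-file layout; the content is made up.
sample = '{"log":"GET /index.html 200\\n","stream":"stdout","time":"2016-06-13T01:02:03.000000000Z"}'
record = parse_docker_json_line(sample, "web-1")
```

The resulting dictionary is the kind of structured record that could then be formatted for a destination such as Elasticsearch.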
  • 36. Source aggregation + scaling up/out • Docker fluentd logging driver + Fluentd + Kafka • Applications write logs to STDOUT • Docker sends logs to localhost Fluentd • Fluentd gets logs over TCP and pushes logs into Kafka • Highly scalable & less overhead - very good for huge deployments [Diagram labels: Application code, Kafka]
  • 37. Source/Destination aggregation + scaling out • Docker fluentd logging driver + Fluentd • Applications write logs to STDOUT • Docker sends logs to localhost Fluentd • Fluentd gets logs over TCP and sends logs to the aggregator Fluentd w/ round-robin load balancing • Highly flexible - good for complex data processing requirements [Diagram labels: Application code, Any other storage]
  • 38. What's the Best? • Writing logs from containers: there are several ways to do it • Docker logging driver • Write logs to files + read/parse them • Send logs from apps directly • Make the platform scalable! • Source aggregation: Fluentd on localhost • Scalable storage (Kafka, external services, ...): no destination aggregation + scaling up • Non-scalable storage (filesystems, RDBMSs, ...): destination aggregation + scaling out
  • 39. Why Is OSS Important for Logging?
  • 40. Why OSS? • The logging layer is an interface • transparency • interoperability • Keep the platform scalable • number of nodes • number of types of source/destination
  • 41. Use OSS, Make Logging Scalable Thank you!