Large-scale logging
made easy
Aliaksandr Valialkin, CTO at VictoriaMetrics
Open source
monitoring conference
2023
Logging? What is it?
A log line consists of:
● timestamp
● level
● location
● message
The purpose of logging
The purpose of logging: debugging
● Which errors have occurred in the app during the last hour?
● Why did the app return an unexpected response?
● Why wasn’t the app working correctly yesterday?
● What was the app doing during a particular time range?
The purpose of logging: security
● Who dropped the database in production?
● Which IP addresses were used to log in as admin during the last hour?
● Who performed a particular action at a given time?
● How many failed login attempts occurred during the last day?
The purpose of logging: stats and metrics
● How many requests were served per hour during the last day?
● How many unique users accessed the app during the last month?
● How many requests were served for a particular IP range yesterday?
● What percentage of requests finished with errors during the last hour?
● What was the 95th percentile of request duration for a given web page yesterday?
Traditional logging
● Save logs to files on the local filesystem
● Use command-line tools for log analysis: cat, grep, awk, sort, uniq, head, tail, etc.
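This workflow can be sketched with nothing but coreutils; the file path and log contents below are invented for illustration:

```shell
# Create a throwaway sample log file (contents are made up).
cat > /tmp/app.log <<'EOF'
2023-11-07T10:00:01Z INFO server started
2023-11-07T10:01:12Z ERROR cannot open config file
2023-11-07T10:02:33Z INFO request served
2023-11-07T10:03:44Z ERROR connection refused
EOF

# Show only the error lines...
grep ERROR /tmp/app.log

# ...and count them.
grep -c ERROR /tmp/app.log
```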
Traditional logging: advantages
● Easy to set up and operate
● Easy to debug
● Easy to analyze logs with command-line tools and bash scripts
● Has worked well for over 50 years (since the 1970s)
Traditional logging: disadvantages
● Hard to analyze logs from hundreds of hosts (hello, Kubernetes and microservices)
● Slow search speed over large log files (e.g. scanning a 1TB log file may take an hour)
● Imperfect support for structured logging (e.g. logs with arbitrary labels)
The solution: large-scale logging
Large-scale logging: core principles
● Push logs from a large number of apps to a centralized system
● Provide fast queries over all the ingested logs
● Support structured logging
Large-scale logging: solutions
● Cloud (DataDog, Sumo Logic, New Relic, etc.)
● On-prem (Elasticsearch, OpenSearch, Grafana Loki, VictoriaLogs, etc.)
Large-scale logging: cloud vs on-prem
Large-scale logging: operational complexity
● Cloud: easy - the cloud provider operates the system
● On-prem: harder - you need to set up and operate the system yourself
Large-scale logging: security
● Cloud: questionable - who has access to your logs?
● On-prem: good - your logs stay under your control
Large-scale logging: price
● Cloud: very expensive (millions of €)
● On-prem: depends on the cost efficiency of the chosen system
Large-scale logging: on-prem comparison
Large-scale logging: on-prem: setup and operation
● Elasticsearch: hard because of non-trivial indexing configs for logs
● Grafana Loki: hard because of its microservice architecture and complex configs
● VictoriaLogs: easy - it runs out of the box as a single binary with default configs
Large-scale logging: on-prem: costs
● Elasticsearch: high - it needs a lot of RAM and disk space
● Grafana Loki: medium - it needs a lot of RAM for high-cardinality labels
● VictoriaLogs: low - a single VictoriaLogs instance can replace a 30-node Elasticsearch or Loki cluster
Large-scale logging: on-prem: full-text search support
● Elasticsearch: yes, but it needs proper index configuration
● Grafana Loki: yes, but very slow
● VictoriaLogs: yes - works out of the box for all ingested log fields and labels without additional configs
Large-scale logging: on-prem: how to efficiently query 100TB of logs?
● Elasticsearch: run a cluster with 200TB of disk space and 6TB of RAM. Infrastructure costs at GCE or AWS: ~€50K/month
● Grafana Loki: impossible - the query would take hours to execute
● VictoriaLogs: run a single node with 6TB of disk space and 200GB of RAM. Infrastructure costs at GCE or AWS: ~€2K/month
Large-scale logging: on-prem: integration with CLI tools
● Elasticsearch: poor
● Grafana Loki: poor
● VictoriaLogs: excellent
VictoriaLogs for large-scale logging
● Satisfies the requirements for large-scale logging
○ Efficiently stores logs from a large number of distributed apps
○ Provides fast full-text search
○ Supports both structured and unstructured logs
● Provides traditional logging features
○ Ease of use
○ Great integration with CLI tools - grep, awk, head, tail, less, etc.
VictoriaLogs: CLI integration
with demo
Which errors have occurred in all the apps during the last hour?
_time:1h error
This is a LogsQL query:
● _time:1h is a filter on the log timestamp: select logs for the last hour
● error is a word filter: select all logs containing the word “error”
Which errors have occurred in all the apps during the last hour?
● A simple bash wrapper around curl sends the LogsQL query
● Plain old CLI tools are connected via Unix pipes
● The result can be saved to a file at any stage with “… > response_file” for later analysis
● The response consists of JSON lines containing the log message, the log stream (aka the app instance) and the log timestamp
● Other log fields can be requested if needed
DEMO
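The wrapper from the demo might look like the sketch below. The HTTP query endpoint follows the VictoriaLogs docs; the host, port and the `logsql` function name are assumptions for illustration:

```shell
# Hypothetical bash wrapper around curl for querying VictoriaLogs.
# /select/logsql/query is the documented query endpoint; adjust the
# host and port to match your deployment.
logsql() {
    curl -s "http://localhost:9428/select/logsql/query" --data-urlencode "query=$*"
}

# Usage (requires a running VictoriaLogs instance):
#   logsql '_time:1h error' | wc -l
```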
Show only log messages
jq -r ._msg
DEMO
How many errors have occurred during the last hour?
● Plain old “wc -l” counts the number of logs containing the “error” word
DEMO
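The count is a single `wc -l`; the invented JSON lines below stand in for the server response:

```shell
# Two sample response lines stand in for the query output.
printf '%s\n' \
  '{"_msg":"error: disk full","_time":"2023-11-07T10:00:01Z"}' \
  '{"_msg":"error: timeout","_time":"2023-11-07T10:00:05Z"}' \
  > /tmp/errors.jsonl

# One JSON line per log entry, so line count == log count.
wc -l < /tmp/errors.jsonl
```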
Which apps generated the most errors during the last hour?
Traditional bash-fu:
● Get the _stream field from every JSON line
● Sort the _stream values
● Count the number of unique _stream values
● Sort the counts of unique _stream values in reverse order
● Return the top 8 _stream values with the highest counts
The result contains the _stream values together with their counts.
DEMO
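The steps above can be sketched as one pipeline over made-up JSON lines; `grep -o` stands in for `jq` to keep the sketch dependency-free:

```shell
# Rank app instances (_stream values) by the number of matching logs.
printf '%s\n' \
  '{"_stream":"app-a","_msg":"error one"}' \
  '{"_stream":"app-b","_msg":"error two"}' \
  '{"_stream":"app-a","_msg":"error three"}' \
  | grep -o '"_stream":"[^"]*"' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -8
```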
Fluentbit-gke errors during the last hour
_stream filter: select logs with kubernetes_container_name=”fluentbit-gke”
DEMO
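A sketch of the full query, using the stream-filter syntax from the LogsQL docs (verify against your VictoriaLogs version):

```
_time:1h error _stream:{kubernetes_container_name="fluentbit-gke"}
```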
The number of per-minute errors for the last 10 minutes
● Select the _time field from the JSON lines
● Trim the _time values to minutes
● Sort the _time values
● Count the unique _time values
The result contains _time values trimmed to the minute, together with the number of logs for each minute.
DEMO
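Trimming RFC3339 timestamps to the minute is a single `cut`; the timestamps below are invented:

```shell
# Characters 1-16 of an RFC3339 timestamp cover YYYY-MM-DDTHH:MM.
printf '%s\n' \
  '2023-11-07T10:00:01Z' \
  '2023-11-07T10:00:59Z' \
  '2023-11-07T10:01:30Z' \
  | cut -c1-16 \
  | sort \
  | uniq -c
```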
Non-200 status codes for the last week
Find logs with the “status=” phrase, but without the “status=200” phrase
DEMO
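Based on LogsQL phrase filters and the `-` negation operator, the query might look like this (an assumption to check against the LogsQL docs):

```
_time:1w "status=" -"status=200"
```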
Top client IPs for the last 4 weeks with 400 or 404 response status codes
● Find logs with the “remote_addr=” phrase and with the “status=400” or “status=404” phrases
● Extract the IP address from remote_addr=... by dropping the “remote_addr=” prefix
DEMO
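The extraction and ranking can be sketched with `grep -o` and `sed`; the sample access-log lines are invented:

```shell
# Extract client IPs from "remote_addr=..." and rank them by frequency.
printf '%s\n' \
  'remote_addr=10.0.0.1 status=404' \
  'remote_addr=10.0.0.2 status=400' \
  'remote_addr=10.0.0.1 status=404' \
  | grep -o 'remote_addr=[0-9.]*' \
  | sed 's/remote_addr=//' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head
```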
Per-day stats for the given IP during the last 10 days
● Search for log messages with the given IP
● A bit of bash-fu: extract the log timestamp, trim it to days and count the number of entries per day
DEMO
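The per-day counting follows the same pattern as the per-minute case, just trimming to the date; the JSON lines are invented:

```shell
# Inside a '"_time":"..."' match, characters 10-19 are the YYYY-MM-DD date.
printf '%s\n' \
  '{"_time":"2023-11-06T10:00:01Z"}' \
  '{"_time":"2023-11-07T09:00:01Z"}' \
  '{"_time":"2023-11-07T11:30:00Z"}' \
  | grep -o '"_time":"[^"]*"' \
  | cut -c10-19 \
  | sort \
  | uniq -c
```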
Per-level stats for the last 5 days, excluding info logs
Select logs where the “level” field isn’t equal to “info”, “INFO” or an empty string
DEMO
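The same filter-and-count can be sketched with awk over sample lines of the form "level=..." (the lines are invented; real responses would be JSON):

```shell
# Count logs per level, skipping empty and info/INFO levels.
printf '%s\n' 'level=warn msg1' 'level=error msg2' 'level=info msg3' 'level=warn msg4' \
  | awk -F'level=' '{
      split($2, parts, " ");
      lvl = parts[1];
      if (lvl != "" && tolower(lvl) != "info") counts[lvl]++;
    }
    END { for (l in counts) print counts[l], l }' \
  | sort -rn
```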
A system for large-scale logging MUST provide excellent CLI integration
Don’t like CLI and bash? Then use the web UI!
VictoriaLogs: (temporary) drawbacks
● Missing data extraction and advanced stats functionality in LogsQL (but these can be replaced with traditional CLI tools, as shown earlier)
● Missing cluster version (but a single-node VictoriaLogs can replace a 30-node Elasticsearch or Loki cluster)
● Missing integration with Grafana (but VictoriaLogs has its own web UI, which is going to be better than Grafana for logs)
VictoriaLogs: recap
● Easy to set up and operate
● The lowest RAM and disk space usage (up to 30x less than Elasticsearch and Grafana Loki)
● Fast full-text search
● Excellent integration with traditional command-line tools for log analysis
● Accepts logs from all the popular log shippers (Filebeat, Fluentbit, Logstash, Vector, Promtail)
● Open source and free to use!
VictoriaLogs: useful links
● General docs - https://docs.victoriametrics.com/VictoriaLogs/
● LogsQL docs - https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

PostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFSPostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFS
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 1
Scylla Summit 2022: Scylla 5.0 New Features, Part 1Scylla Summit 2022: Scylla 5.0 New Features, Part 1
Scylla Summit 2022: Scylla 5.0 New Features, Part 1
 
Growing the Delta Ecosystem to Rust and Python with Delta-RS
Growing the Delta Ecosystem to Rust and Python with Delta-RSGrowing the Delta Ecosystem to Rust and Python with Delta-RS
Growing the Delta Ecosystem to Rust and Python with Delta-RS
 
Prometheus 101
Prometheus 101Prometheus 101
Prometheus 101
 
Scylla Compaction Strategies
Scylla Compaction StrategiesScylla Compaction Strategies
Scylla Compaction Strategies
 
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...
Prometheus in Practice: High Availability with Thanos (DevOpsDays Edinburgh 2...
 
Persistent Storage with Containers with Kubernetes & OpenShift
Persistent Storage with Containers with Kubernetes & OpenShiftPersistent Storage with Containers with Kubernetes & OpenShift
Persistent Storage with Containers with Kubernetes & OpenShift
 
Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup) Polyglot persistence @ netflix (CDE Meetup)
Polyglot persistence @ netflix (CDE Meetup)
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
Getting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and GrafanaGetting Started Monitoring with Prometheus and Grafana
Getting Started Monitoring with Prometheus and Grafana
 
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfOSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
 
Top 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & TricksTop 10 Cypher Tuning Tips & Tricks
Top 10 Cypher Tuning Tips & Tricks
 
Terraform modules and (some of) best practices
Terraform modules and (some of) best practicesTerraform modules and (some of) best practices
Terraform modules and (some of) best practices
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
 
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and PitfallsRunning Apache Spark on Kubernetes: Best Practices and Pitfalls
Running Apache Spark on Kubernetes: Best Practices and Pitfalls
 
Kubernetes dealing with storage and persistence
Kubernetes  dealing with storage and persistenceKubernetes  dealing with storage and persistence
Kubernetes dealing with storage and persistence
 
MAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19cMAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19c
 
Linking Metrics to Logs using Loki
Linking Metrics to Logs using LokiLinking Metrics to Logs using Loki
Linking Metrics to Logs using Loki
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 

Ähnlich wie OSMC 2023 | Large-scale logging made easy by Alexandr Valialkin

Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
London devops logging
London devops loggingLondon devops logging
London devops logging
Tomas Doran
 
Aws uk ug #8 not everything that happens in vegas stay in vegas
Aws uk ug #8   not everything that happens in vegas stay in vegasAws uk ug #8   not everything that happens in vegas stay in vegas
Aws uk ug #8 not everything that happens in vegas stay in vegas
Peter Mounce
 
NetflixOSS meetup lightning talks and roadmap
NetflixOSS meetup lightning talks and roadmapNetflixOSS meetup lightning talks and roadmap
NetflixOSS meetup lightning talks and roadmap
Ruslan Meshenberg
 

Ähnlich wie OSMC 2023 | Large-scale logging made easy by Alexandr Valialkin (20)

Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
VictoriaLogs: Open Source Log Management System - Preview
VictoriaLogs: Open Source Log Management System - PreviewVictoriaLogs: Open Source Log Management System - Preview
VictoriaLogs: Open Source Log Management System - Preview
 
Serverless for High Performance Computing
Serverless for High Performance ComputingServerless for High Performance Computing
Serverless for High Performance Computing
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
Scaling ELK Stack - DevOpsDays Singapore
Scaling ELK Stack - DevOpsDays SingaporeScaling ELK Stack - DevOpsDays Singapore
Scaling ELK Stack - DevOpsDays Singapore
 
London devops logging
London devops loggingLondon devops logging
London devops logging
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloud
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
Aws uk ug #8 not everything that happens in vegas stay in vegas
Aws uk ug #8   not everything that happens in vegas stay in vegasAws uk ug #8   not everything that happens in vegas stay in vegas
Aws uk ug #8 not everything that happens in vegas stay in vegas
 
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ...
 
Interactive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark StreamingInteractive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark Streaming
 
Scalable, good, cheap
Scalable, good, cheapScalable, good, cheap
Scalable, good, cheap
 
NetflixOSS meetup lightning talks and roadmap
NetflixOSS meetup lightning talks and roadmapNetflixOSS meetup lightning talks and roadmap
NetflixOSS meetup lightning talks and roadmap
 
Protecting the Web at a scale using consul and Elk / Valentin Chernozemski (S...
Protecting the Web at a scale using consul and Elk / Valentin Chernozemski (S...Protecting the Web at a scale using consul and Elk / Valentin Chernozemski (S...
Protecting the Web at a scale using consul and Elk / Valentin Chernozemski (S...
 
Ensuring Performance in a Fast-Paced Environment (CMG 2014)
Ensuring Performance in a Fast-Paced Environment (CMG 2014)Ensuring Performance in a Fast-Paced Environment (CMG 2014)
Ensuring Performance in a Fast-Paced Environment (CMG 2014)
 
Monitoring with Clickhouse
Monitoring with ClickhouseMonitoring with Clickhouse
Monitoring with Clickhouse
 
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ...
 
Protecting the Data Lake
Protecting the Data LakeProtecting the Data Lake
Protecting the Data Lake
 
使用 Elasticsearch 及 Kibana 進行巨量資料搜尋及視覺化-曾書庭
使用 Elasticsearch 及 Kibana 進行巨量資料搜尋及視覺化-曾書庭使用 Elasticsearch 及 Kibana 進行巨量資料搜尋及視覺化-曾書庭
使用 Elasticsearch 及 Kibana 進行巨量資料搜尋及視覺化-曾書庭
 

Kürzlich hochgeladen

Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
ZurliaSoop
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
Kayode Fayemi
 
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
Proofreading- Basics to Artificial Intelligence Integration - Presentation:Sl...
David Celestin
 
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven CuriosityUnlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Unlocking Exploration: Self-Motivated Agents Thrive on Memory-Driven Curiosity
Hung Le
 

Kürzlich hochgeladen (20)

SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdfSOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
SOLID WASTE MANAGEMENT SYSTEM OF FENI PAURASHAVA, BANGLADESH.pdf
 
Zone Chairperson Role and Responsibilities New updated.pptx
Zone Chairperson Role and Responsibilities New updated.pptxZone Chairperson Role and Responsibilities New updated.pptx
Zone Chairperson Role and Responsibilities New updated.pptx
 
Lions New Portal from Narsimha Raju Dichpally 320D.pptx
Lions New Portal from Narsimha Raju Dichpally 320D.pptxLions New Portal from Narsimha Raju Dichpally 320D.pptx
Lions New Portal from Narsimha Raju Dichpally 320D.pptx
 
ICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdfICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdf
 
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
Jual obat aborsi Jakarta 085657271886 Cytote pil telat bulan penggugur kandun...
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio III
 
BIG DEVELOPMENTS IN LESOTHO(DAMS & MINES
BIG DEVELOPMENTS IN LESOTHO(DAMS & MINESBIG DEVELOPMENTS IN LESOTHO(DAMS & MINES
BIG DEVELOPMENTS IN LESOTHO(DAMS & MINES
 
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait Cityin kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
in kuwait௹+918133066128....) @abortion pills for sale in Kuwait City
 
LITTLE ABOUT LESOTHO FROM THE TIME MOSHOESHOE THE FIRST WAS BORN

OSMC 2023 | Large-scale logging made easy by Aliaksandr Valialkin

The purpose of logging: stats and metrics
● How many requests were served per hour during the last day?
● How many unique users were accessing the app during the last month?
● How many requests were served for a particular IP range yesterday?
● What percentage of requests finished with errors during the last hour?
● What was the 95th percentile of request duration for the given web page yesterday?
Traditional logging
● Save logs to files on the local filesystem
● Use command-line tools for log analysis: cat, grep, awk, sort, uniq, head, tail, etc.
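The traditional workflow can be sketched with a few standard tools; the sample log lines and their format below are made up for illustration:

```shell
# Traditional log analysis: plain CLI tools over local logs.
# The sample log lines and their format are fabricated for illustration.
logs='2023-11-07T10:00:01Z error cannot open file /etc/app.conf
2023-11-07T10:00:02Z info request served in 5ms
2023-11-07T10:01:03Z error connection refused'

# How many errors are in the log?
error_count=$(printf '%s\n' "$logs" | grep -c ' error ')
echo "$error_count"    # prints 2

# Show the last 10 error lines (logs are already time-ordered):
printf '%s\n' "$logs" | grep ' error ' | tail -n 10
```

In practice the same pipelines run over log files (`grep ' error ' /var/log/app.log`), composing grep, awk, sort, uniq, head and tail as needed.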
Traditional logging: advantages
● Easy to set up and operate
● Easy to debug
● Easy to analyze logs with command-line tools and bash scripts
● Has worked perfectly for 50 years (since the 1970s)
Traditional logging: disadvantages
● Hard to analyze logs from hundreds of hosts (hello, Kubernetes and microservices)
● Slow search speed over large log files (e.g. scanning a 1TB log file may take an hour)
● Imperfect support for structured logging (e.g. logs with arbitrary labels)
Large-scale logging: core principles
● Push logs from a large number of apps to a centralized system
● Provide fast queries over all the ingested logs
● Support structured logging
Large-scale logging: solutions
● Cloud (DataDog, Sumo Logic, New Relic, etc.)
● On-prem (Elasticsearch, OpenSearch, Grafana Loki, VictoriaLogs, etc.)
Large-scale logging: operational complexity
● Cloud: easy - the cloud provider operates the system
● On-prem: harder - you need to set up and operate the system yourself
Large-scale logging: security
● Cloud: questionable - who has access to your logs?
● On-prem: good - your logs are under your control
Large-scale logging: price
● Cloud: very expensive (millions of €)
● On-prem: depends on the cost efficiency of the chosen system
Large-scale logging: on-prem: setup and operation
● Elasticsearch: hard, because of non-trivial indexing configs for logs
● Grafana Loki: hard, because of its microservice architecture and complex configs
● VictoriaLogs: easy, because it runs out of the box as a single binary with default configs
Large-scale logging: on-prem: costs
● Elasticsearch: high - it needs a lot of RAM and disk space
● Grafana Loki: medium - it needs a lot of RAM for high-cardinality labels
● VictoriaLogs: low - a single VictoriaLogs instance can replace a 30-node Elasticsearch or Loki cluster
Large-scale logging: on-prem: full-text search support
● Elasticsearch: yes, but it needs proper index configuration
● Grafana Loki: yes, but very slow
● VictoriaLogs: yes - works out of the box for all the ingested log fields and labels, without additional configs
Large-scale logging: on-prem: how to efficiently query 100TB of logs?
● Elasticsearch: run a cluster with 200TB of disk space and 6TB of RAM. Infrastructure costs at GCE or AWS: ~€50K/month
● Grafana Loki: impossible, because the query will take hours to execute
● VictoriaLogs: run a single node with 6TB of disk space and 200GB of RAM. Infrastructure costs at GCE or AWS: ~€2K/month
Large-scale logging: on-prem: integration with CLI tools
● Elasticsearch: poor
● Grafana Loki: poor
● VictoriaLogs: excellent
VictoriaLogs for large-scale logging
● Satisfies the requirements for large-scale logging
○ Efficiently stores logs from a large number of distributed apps
○ Provides fast full-text search
○ Supports both structured and unstructured logs
● Provides traditional logging features
○ Ease of use
○ Great integration with CLI tools - grep, awk, head, tail, less, etc.
Which errors have occurred in all the apps during the last hour?

_time:1h error

This LogsQL query combines two filters:
● _time:1h - a filter on the log timestamp: select logs for the last hour
● error - a word filter: select all logs containing the “error” word

The query is sent via a simple bash wrapper around curl, and the results are processed with plain old CLI tools connected via Unix pipes. The result can be saved to a file at any stage with “… > response_file” for later analysis.

The response consists of JSON lines with the log message (_msg), the log stream aka app instance (_stream) and the log timestamp (_time). Other log fields can be requested if needed.
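A minimal sketch of that workflow. The query endpoint follows the VictoriaLogs docs, but the host and port are assumptions, and the response lines below are fabricated so the pipeline can be shown without a running server:

```shell
# Query VictoriaLogs over HTTP (host/port are assumptions):
#   curl -s http://localhost:9428/select/logsql/query --data-urlencode 'query=_time:1h error'
#
# The response is JSON lines, one log entry per line. A fabricated sample:
response='{"_msg":"error: cannot open file","_stream":"{app=nginx}","_time":"2023-11-07T10:00:01Z"}
{"_msg":"error: connection refused","_stream":"{app=api}","_time":"2023-11-07T10:00:02Z"}'

# Plain old CLI tools connected via Unix pipes:
printf '%s\n' "$response" | head -n 5           # peek at the first few logs
error_count=$(printf '%s\n' "$response" | wc -l)
echo "$error_count"                              # the number of matching logs
```

The same `wc -l` at the end of the pipe answers "how many errors have occurred during the last hour?".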
Show only log messages

Pipe the response through jq -r ._msg
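The talk uses `jq -r ._msg` for this; since jq may not be installed everywhere, the sketch below swaps in sed as a named stand-in that only handles simple JSON lines without escaped quotes:

```shell
# Extract only the log messages from JSON lines.
# jq -r ._msg is the robust way; sed is a portable stand-in that only
# works for _msg values without embedded escaped quotes.
response='{"_msg":"error: cannot open file","_time":"2023-11-07T10:00:01Z"}
{"_msg":"error: connection refused","_time":"2023-11-07T10:00:02Z"}'

msgs=$(printf '%s\n' "$response" | sed 's/.*"_msg":"\([^"]*\)".*/\1/')
printf '%s\n' "$msgs"
```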
How many errors have occurred during the last hour?

Pipe the response through plain old “wc -l” to count the number of logs containing the “error” word.
Which apps generated the most errors during the last hour?

Traditional bash-fu over the response:
● Get the _stream field from every JSON line
● Sort the _stream values
● Count the number of unique _stream values
● Sort the counts of unique _stream values in reverse order
● Return the top 8 _stream values with the highest counts
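The steps above can be sketched end-to-end on a fabricated response; the _stream values are simplified (no embedded quotes) so a plain sed extraction works:

```shell
# Top apps by error count: extract _stream, then sort | uniq -c | sort -rn | head.
# The response lines are fabricated; _stream values are simplified.
response='{"_msg":"error A","_stream":"{app=nginx}"}
{"_msg":"error B","_stream":"{app=api}"}
{"_msg":"error C","_stream":"{app=nginx}"}'

top_streams=$(printf '%s\n' "$response" \
  | sed 's/.*"_stream":"\([^"]*\)".*/\1/' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -n 8)
printf '%s\n' "$top_streams"
```

The `sort | uniq -c | sort -rn | head` idiom is the classic "top N by frequency" pipeline and reappears in several queries below.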
Fluentbit-gke errors during the last hour

Add a _stream filter to the query: select logs with kubernetes_container_name="fluentbit-gke"
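As a sketch, the stream filter can be written directly into the LogsQL query; the `_stream:{...}` syntax follows the LogsQL docs, while the host and port in the commented curl call are assumptions:

```shell
# Restrict the query to a single log stream. The query string is a sketch
# based on the LogsQL stream-filter syntax.
query='_time:1h _stream:{kubernetes_container_name="fluentbit-gke"} error'
echo "$query"
# Against a running instance (host/port are assumptions):
#   curl -s http://localhost:9428/select/logsql/query --data-urlencode "query=$query"
```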
The number of per-minute errors for the last 10 minutes

● Select the _time field from the JSON lines
● Trim the _time values to minutes
● Sort the _time values
● Count unique _time values

The output contains the _time values trimmed to the minute, together with the number of logs for each minute.
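A sketch of that pipeline on fabricated response lines; cutting an RFC3339 timestamp to its first 16 characters trims it to the minute:

```shell
# Per-minute log counts: extract _time, trim to the minute, sort, count.
# The response lines are fabricated.
response='{"_msg":"error A","_time":"2023-11-07T10:00:01Z"}
{"_msg":"error B","_time":"2023-11-07T10:00:59Z"}
{"_msg":"error C","_time":"2023-11-07T10:01:03Z"}'

per_minute=$(printf '%s\n' "$response" \
  | sed 's/.*"_time":"\([^"]*\)".*/\1/' \
  | cut -c1-16 \
  | sort \
  | uniq -c)
printf '%s\n' "$per_minute"   # one line per minute, count first
```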
Non-200 status codes for the last week

Find logs with the “status=” phrase, but without the “status=200” phrase.
Top client IPs for the last 4 weeks with 400 or 404 response status codes

● Find logs with the “remote_addr=” phrase, and with the “status=404” or “status=400” phrases
● Extract the IP address from remote_addr=...
● Drop the “remote_addr=” prefix
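A sketch of the extraction on fabricated log messages; the `remote_addr=... status=...` field layout is an assumption (nginx-style access logs):

```shell
# Top client IPs: grab remote_addr=..., drop the prefix, count and rank.
# The log messages are fabricated.
msgs='remote_addr=1.2.3.4 status=404 path=/admin
remote_addr=5.6.7.8 status=400 path=/login
remote_addr=1.2.3.4 status=404 path=/secret'

top_ips=$(printf '%s\n' "$msgs" \
  | grep -o 'remote_addr=[0-9.]*' \
  | sed 's/^remote_addr=//' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head)
printf '%s\n' "$top_ips"
```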
Per-day stats for the given IP during the last 10 days

● Search for log messages with the given IP
● A bit of bash-fu: extract the log timestamp, cut it to days and count the number of per-day entries
Per-level stats for the last 5 days, excluding info logs

Select logs where the “level” field isn’t equal to “info”, “INFO” or an empty string.
A system for large-scale logging MUST provide excellent CLI integration
Don’t like CLI and bash? Then use the web UI!
VictoriaLogs: (temporary) drawbacks
● Missing data extraction and advanced stats functionality in LogsQL (but it can be replaced with traditional CLI tools, as shown above)
● Missing cluster version (but a single-node VictoriaLogs can replace a 30-node Elasticsearch or Loki cluster)
● Missing integration with Grafana (but there is a built-in web UI, which is going to be better than Grafana for logs)
VictoriaLogs: recap
● Easy to set up and operate
● The lowest RAM and disk space usage (up to 30x less than Elasticsearch and Grafana Loki)
● Fast full-text search
● Excellent integration with traditional command-line tools for log analysis
● Accepts logs from all the popular log shippers (Filebeat, Fluentbit, Logstash, Vector, Promtail)
● Open source and free to use!
VictoriaLogs: useful links
● General docs - https://docs.victoriametrics.com/VictoriaLogs/
● LogsQL docs - https://docs.victoriametrics.com/VictoriaLogs/LogsQL.html