Scalable Streaming Data
Pipelines with Redis
Avram Lyon
@ajlyon / github.com/avram
redisconf / May 10, 2016
MOBILE GAMES - PUBLISHER AND DEVELOPER
What kind of data?
• App opened
• Killed a walker
• Bought something
• Heartbeat
• Memory usage report
• App error
• Declined a review prompt
• Finished the tutorial
• Clicked on that button
• Lost a battle
• Found a treasure chest
• Received a push message
• Finished a turn
• Sent an invite
• Scored a Yahtzee
• Spent 100 silver coins
• Anything else any game designer or developer wants to learn about
How much?
Recently:
Peak:
2.8 million events / minute
2.4 billion events / day
Primary Data Stream
Collection
Kinesis
Warehousing
Enrichment
Realtime Monitoring
Kinesis
Public API
Collection
HTTP
Collection
SQS
SQS
SQS
Studio A
Studio B
Studio C
Kinesis
SQS Failover
Redis
Caching App Configurations
System Configurations
Kinesis
SQS Failover
S3
Kinesis
Elasticsearch?
Enricher
Data
Warehouse
Forwarder
Ariel
(Realtime)
Idempotence
Aggregation
Idempotence
Idempotence
What’s in the box?
=
Where does this flow?
Ariel / Real-Time
Operational monitoring
Business alerts
Dashboarding
Data Warehouse
Funnel analysis
Ad-hoc batch analysis
Reporting
Behavior analysis
Elasticsearch
Ad-hoc realtime analysis
Fraud detection
Top-K summaries
Exploration
Ad-Hoc Forwarding
Data integration with partners
Game-specific systems
Kinesis
a short aside
Kinesis
• Distributed, sharded streams. Akin to Kafka.
• Get an iterator over the stream, and checkpoint with the current stream pointer occasionally.
• Workers coordinate shard leases and checkpoints in DynamoDB (via KCL)
Shard 0
Shard 1
Shard 2
Shard 0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Checkpointing
Checkpoint for Shard 0: 10 Given: Worker checkpoints every 5
Worker A 🔥
Worker B
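The checkpointing slide can be read as the following small simulation. It is only a sketch under stated assumptions: `run_worker`, the `checkpoints` dictionary (standing in for the lease table in DynamoDB) and the crash point at record 13 are all hypothetical, chosen to illustrate why a replacement worker re-reads records that the failed worker had already handled.

```python
from typing import Dict, Iterable, List, Optional


def run_worker(records: Iterable[int], checkpoints: Dict[str, int], shard: str,
               every: int = 5, crash_after: Optional[int] = None) -> List[int]:
    """Process records newer than the stored checkpoint for `shard`.

    A checkpoint is written after every `every` processed records.
    `crash_after` (hypothetical) simulates the worker dying mid-stream.
    """
    start = checkpoints.get(shard, 0)
    processed: List[int] = []
    for seq in records:
        if seq <= start:
            continue  # already covered by the last checkpoint
        processed.append(seq)
        if len(processed) % every == 0:
            checkpoints[shard] = seq  # persist the stream pointer
        if crash_after is not None and seq == crash_after:
            break  # worker dies before its next checkpoint
    return processed


checkpoints: Dict[str, int] = {}
# Worker A handles records 1..13, checkpoints at 5 and 10, then fails.
worker_a = run_worker(range(1, 23), checkpoints, "shard-0", crash_after=13)
# Worker B takes over the lease and resumes after checkpoint 10.
worker_b = run_worker(range(1, 23), checkpoints, "shard-0")
replayed = sorted(set(worker_a) & set(worker_b))
print(replayed)  # records seen by both workers
```

Under these assumptions, records 11 to 13 are delivered twice, which is the kind of duplication that downstream stages need to tolerate.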
Auxiliary Idempotence
• Idempotence keys at each stage
• Redis sets of idempotence keys by time window
• Gives resilience against various types of failures
Auxiliary Idempotence
• Gotcha: Expiring a set is O(N) in its number of members
• Broke up into small sets, partitioned by the first 2 bytes of the MD5 of the idempotence key
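A minimal sketch of the partitioning idea, assuming a key layout of `idem:<stage>:<window>:<partition>`. The key layout, the function names and the `InMemorySets` class are hypothetical. `InMemorySets` only mimics the return value of Redis `SADD` (1 when the member is new, 0 otherwise) so that the example runs without a server.

```python
import hashlib
from typing import Dict, Set


def idempotence_set_key(stage: str, window: str, idempotence_key: str) -> str:
    """Pick the small set for this key from the first 2 bytes of its MD5."""
    digest = hashlib.md5(idempotence_key.encode("utf-8")).digest()
    partition = digest[:2].hex()  # 4 hex characters, 65536 possible partitions
    return f"idem:{stage}:{window}:{partition}"


class InMemorySets:
    """Stand-in for Redis sets; sadd() returns 1 for a new member, else 0."""

    def __init__(self) -> None:
        self.sets: Dict[str, Set[str]] = {}

    def sadd(self, key: str, member: str) -> int:
        members = self.sets.setdefault(key, set())
        if member in members:
            return 0
        members.add(member)
        return 1


def first_time_seen(store: InMemorySets, stage: str, window: str,
                    idempotence_key: str) -> bool:
    """True if this stage has not processed the key in this time window."""
    key = idempotence_set_key(stage, window, idempotence_key)
    return store.sadd(key, idempotence_key) == 1
```

With a real Redis client the same `sadd` call would apply, and each partitioned set could be given its own expiry, so that no single deletion has to walk a very large set.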
Collection
Kinesis
Warehousing
Enrichment
Realtime Monitoring
Kinesis
Public API
1. Deserialize event batch
2. Apply changes to application properties
3. Get current device and application properties
4. Get known facts about sending device
5. Emit each enriched event to Kinesis
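The five steps above could be sketched roughly as follows. Everything here is an assumption made for illustration: the JSON batch format, the `set_properties` field, keying properties by `device_token`, and the `emit` callback that stands in for a Kinesis producer. The slides do not describe the real data model.

```python
import json
from typing import Callable, Dict


def enrich_batch(raw: bytes,
                 properties: Dict[str, dict],
                 device_facts: Dict[str, dict],
                 emit: Callable[[dict], None]) -> int:
    """Enrich every event in a serialized batch and hand it to `emit`."""
    events = json.loads(raw)                              # 1. deserialize event batch
    for event in events:
        device = event["device_token"]
        props = properties.setdefault(device, {})
        props.update(event.get("set_properties", {}))     # 2. apply property changes
        enriched = dict(event)
        enriched["properties"] = dict(props)              # 3. current properties
        enriched["device_facts"] = dict(device_facts.get(device, {}))  # 4. known facts
        emit(enriched)                                    # 5. emit the enriched event
    return len(events)
```

In this sketch `properties` plays the role of the cached configuration and property store, so a later event from the same device carries the properties set by an earlier one.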
Collection
Kinesis
Enrichment
Kinesis
SQS Failover
Kinesis
S3
Elasticsearch?
S3 Backups
to HDFS
Enricher
Data
Warehouse
Forwarder
Idempotence
Ariel
Realtime
Idempotence
Aggregation
Idempotence
Now we have a stream of well-described, denormalized event facts.
Pipeline to HDFS
• Partitioned by event name and game, buffered in-memory and written to S3
• Picked up every hour by Spark job
• Converts to Parquet, loaded to HDFS
A closer look at Ariel
Dashboards
Alarms
Ariel Goals
• Low time-to-visibility
• Easy configuration
• Low cost per configured metric
Configuration
Live Metrics (Ariel)
Enriched Event Data
name: game_end
time: 2015-07-15 10:00:00.000 UTC
_devices_per_turn: 1.0
event_id: 12345
device_token: AAAA
user_id: 100
name: game_end
time: 2015-07-15 10:01:00.000 UTC
_devices_per_turn: 14.1
event_id: 12346
device_token: BBBB
user_id: 100
name: Cheating Games
predicate: _devices_per_turn > 1.5
target: event_id
type: DISTINCT
id: 1
name: Cheating Players
predicate: _devices_per_turn > 1.5
target: user_id
type: DISTINCT
id: 2
name: game_end
time: 2015-07-15 10:01:00.000 UTC
_devices_per_turn: 14.1
event_id: 12347
device_token: BBBB
user_id: 100
PFADD /m/1/2015-07-15-10-00 12346
PFADD /m/1/2015-07-15-10-00 12347
PFADD /m/2/2015-07-15-10-00 100
PFADD /m/2/2015-07-15-10-00 100
PFCOUNT /m/1/2015-07-15-10-00
2
PFCOUNT /m/2/2015-07-15-10-00
1
Configured Metrics
Collector
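One way to picture how a collector might turn a configured metric and an enriched event into a `PFADD` command is sketched below. The `Metric` class, the use of a Python callable for the predicate and the helper names are hypothetical. The key pattern `/m/<metric id>/<bucket>` follows the slide, and the 30-minute bucket size is an assumption. The value added is always the field named by the metric's `target`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass
class Metric:
    id: int
    name: str
    predicate: Callable[[dict], bool]  # stands in for the configured predicate string
    target: str                        # event field whose distinct values are counted


def bucket_for(ts: datetime, minutes: int = 30) -> str:
    """Floor a timestamp to its bucket and format it like the slide's keys."""
    floored = ts.replace(minute=ts.minute - ts.minute % minutes,
                         second=0, microsecond=0)
    return floored.strftime("%Y-%m-%d-%H-%M")


def pfadd_commands(event: dict, metrics: List[Metric]) -> List[str]:
    """Return one PFADD command for each metric whose predicate matches."""
    commands = []
    for metric in metrics:
        if metric.predicate(event):
            key = f"/m/{metric.id}/{bucket_for(event['time'])}"
            commands.append(f"PFADD {key} {event[metric.target]}")
    return commands


def cheating(event: dict) -> bool:
    return event.get("_devices_per_turn", 0) > 1.5


METRICS = [
    Metric(1, "Cheating Games", cheating, "event_id"),
    Metric(2, "Cheating Players", cheating, "user_id"),
]
```

Because both metrics share a predicate and differ only in `target`, the same event feeds two HyperLogLog keys that count different things.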
HyperLogLog
• High-level algorithm (four bullet-point version stolen from my colleague, Cristian)
• b bits of the hash are used as a register index (Redis uses b = 14, i.e. m = 16384 registers)
• The rest of the hash is inspected for its leading run of zeroes (N)
• The register at that index is replaced with max(currentValue, N + 1)
• An estimator function is used to calculate the approximate cardinality
http://content.research.neustar.biz/blog/hll.html
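The four bullets can be turned into a compact textbook sketch. This is not Redis's implementation: the SHA-1 based 64-bit hash, the choice of the top bits for the index, the class name `TinyHLL` and the classic estimator with a small-range correction are all assumptions chosen to keep the example short.

```python
import hashlib
import math


class TinyHLL:
    """Textbook HyperLogLog; illustrative only, not Redis's exact algorithm."""

    def __init__(self, b: int = 14) -> None:
        self.b = b
        self.m = 1 << b                 # 16384 registers when b = 14
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # 64-bit hash taken from SHA-1 (an assumption made for this sketch).
        h = int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")
        rest_bits = 64 - self.b
        index = h >> rest_bits                   # b bits select the register
        rest = h & ((1 << rest_bits) - 1)        # the remaining bits
        n = rest_bits - rest.bit_length()        # leading zeroes in the remainder
        self.registers[index] = max(self.registers[index], n + 1)

    def count(self) -> int:
        # Classic estimator: bias-corrected harmonic mean of 2^register.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting on empty registers).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)
```

Adding the same item twice leaves every register unchanged, which is why the structure counts distinct values rather than events.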
We can count different things
Collector
Kinesis
Aggregation
Ariel
PFCOUNT
Are installs anomalous?
Collector
Idempotence
PFADD
Web
Workers
Pipeline Delay
• Pipelines back up
• Dashboards get outdated
• Alarms fire!
Alarm Clocks
• Push timestamp of current events to per-game pub/sub channel
• Worker takes 99th percentile age of last N events per title as delay
• Use that time for alarm calculations
• Overlay delays on dashboards
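A rough sketch of the clock calculation, assuming event timestamps arrive through direct calls to `observe` rather than a pub/sub channel. The class name, the window size and the nearest-rank percentile method are hypothetical choices, since the slides do not say how the percentile is computed.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class EventClock:
    """Tracks the ages of the last N events per game and derives a delay."""

    def __init__(self, window: int = 1000) -> None:
        self.ages: Dict[str, Deque[float]] = defaultdict(
            lambda: deque(maxlen=window))

    def observe(self, game: str, event_time: float, now: float) -> None:
        """Record how old an event was when it reached this stage (seconds)."""
        self.ages[game].append(max(0.0, now - event_time))

    def delay(self, game: str) -> float:
        """99th percentile age of the retained events (nearest-rank method)."""
        ages = sorted(self.ages[game])
        if not ages:
            return 0.0
        rank = (99 * len(ages) + 99) // 100   # integer ceil(0.99 * n)
        return ages[rank - 1]

    def effective_now(self, game: str, now: float) -> float:
        """The time that alarm calculations should treat as 'now'."""
        return now - self.delay(game)
```

Evaluating alarms against `effective_now` rather than the wall clock means a backed-up pipeline shifts the evaluation window instead of looking like a sudden drop in events.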
Ariel, now with clocks
Event Clock
Kinesis
Aggregation
PFCOUNT
Are installs anomalous?
Collector
Idempotence
PFADD
Web
Workers
Ariel 1.0
• ~30K metrics configured
• Aggregation into 30-minute buckets
• 12 kilobytes per HLL set (plus overhead)
Challenges
• Dataset size. RedisLabs non-cluster max = 100GB
• Packet/s limits: 250K in EC2-Classic
• Alarm granularity
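A back-of-the-envelope calculation shows why dataset size becomes a challenge. It uses the figures from the slide and one strong assumption: every configured metric produces a full 12 KiB HLL in every 30-minute bucket. Real usage would likely be lower (idle metrics, smaller encodings for low cardinalities), so this is an upper bound rather than a measurement.

```python
def daily_hll_bytes(metrics: int, bucket_minutes: int, bytes_per_hll: int) -> int:
    """Upper bound on HLL bytes written per day, ignoring per-key overhead."""
    buckets_per_day = 24 * 60 // bucket_minutes
    return metrics * buckets_per_day * bytes_per_hll


# ~30K metrics, 30-minute buckets, 12 KiB per HLL (figures from the slide).
worst_case = daily_hll_bytes(30_000, 30, 12 * 1024)
print(f"{worst_case / 1e9:.1f} GB per day at most")  # prints "17.7 GB per day at most"
```

At that rate, a 100 GB limit would be reached in under a week of retained buckets.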
Hybrid Datastore:
Requirements
• Need to keep HLL sets to count distinct
• Redis is relatively finite
• HLL outside of Redis is messy
Hybrid Datastore: Plan
• Move older HLL sets to DynamoDB
• They’re just strings!
• Cache reports aggressively
• Fetch backing HLL data from DynamoDB as needed on web layer, merge using on-instance Redis
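The merge step could look roughly like this. It assumes a redis-py style client (the `set`, `pfmerge`, `pfcount` and `delete` calls exist there), blobs that are raw Redis HLL strings previously read with `GET` and stored elsewhere, and a hypothetical scratch key prefix. `RecordingClient` is only a stand-in that logs the call sequence, so the sketch runs without a Redis server.

```python
from typing import List, Sequence


def merged_count(client, blobs: Sequence[bytes],
                 scratch_prefix: str = "scratch:hll") -> int:
    """Load stored HLL strings into scratch keys, merge them and count.

    A real deployment would want a unique prefix per request so that
    concurrent merges do not overwrite each other's scratch keys.
    """
    keys = [f"{scratch_prefix}:{i}" for i in range(len(blobs))]
    dest = f"{scratch_prefix}:merged"
    try:
        for key, blob in zip(keys, blobs):
            client.set(key, blob)          # an HLL is just a string value
        client.pfmerge(dest, *keys)        # union of all loaded HLLs
        return client.pfcount(dest)
    finally:
        client.delete(dest, *keys)         # keep the scratchpad small


class RecordingClient:
    """Stand-in that records calls instead of talking to Redis."""

    def __init__(self) -> None:
        self.calls: List[tuple] = []

    def set(self, key, value):
        self.calls.append(("SET", key))

    def pfmerge(self, dest, *sources):
        self.calls.append(("PFMERGE", dest, *sources))

    def pfcount(self, *keys):
        self.calls.append(("PFCOUNT", *keys))
        return 0  # a real server would return the estimated cardinality

    def delete(self, *keys):
        self.calls.append(("DEL", *keys))
```

Passing a real client in place of `RecordingClient` would return the estimated distinct count across all the older buckets that were fetched.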
Ariel, now with hybrid datastore
DynamoDB
Report Caches
Old Data Migration
Event Clock
Kinesis
Aggregation
PFCOUNT
Are installs anomalous?
Collector
Idempotence
PFADD
Web
Workers
Merge Scratchpad
Much less memory…
Redis Roles
• Idempotence
• Configuration Caching
• Aggregation
• Clock
• Scratchpad for merges
• Cache of reports
• Staging of DWH extracts
Other Considerations
• Multitenancy. We run parallel stacks and give games an assigned affinity, to insulate from pipeline delays
• Backfill. System is forward-looking only; can replay Kinesis backups to backfill, or backfill from warehouse
Why Not _____?
• Druid
• Flink
• InfluxDB
• RethinkDB
Thanks!
Questions?
scopely.com/jobs
@ajlyon
avram@scopely.com
github.com/avram

Scalable Streaming Data Pipelines with Redis

  • 1. Scalable Streaming Data Pipelines with Redis Avram Lyon @ajlyon / github.com/avram redisconf / May 10, 2016
  • 2. MOBILE GAMES - PUBLISHER AND DEVELOPER
  • 3. What kind of data? • App opened • Killed a walker • Bought something • Heartbeat • Memory usage report • App error • Declined a review prompt • Finished the tutorial • Clicked on that button • Lost a battle • Found a treasure chest • Received a push message • Finished a turn • Sent an invite • Scored a Yahtzee • Spent 100 silver coins • Anything else any game designer or developer wants to learn about
  • 4. How much? Recently: Peak: 2.8 million events / minute 2.4 billion events / day
  • 6. Collection HTTP Collection SQS SQS SQS Studio A Studio B Studio C Kinesis SQS Failover Redis Caching App Configurations System Configurations
  • 9. Where does this flow? Ariel / Real-Time Operational monitoring Business alerts Dashboarding Data Warehouse Funnel analysis Ad-hoc batch analysis Reporting Behavior analysis Elasticsearch Ad-hoc realtime analysis Fraud detection Top-K summaries Exploration Ad-Hoc Forwarding Data integration with partners Game-specific systems
  • 11. Kinesis • Distributed, sharded streams. Akin to Kafka. • Get an iterator over the stream— and checkpoint with current stream pointer occasionally. • Workers coordinate shard leases and checkpoints in DynamoDB (via KCL) Shard 0 Shard 1 Shard 2
  • 12. Shard 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 Checkpointing Checkpoint for Shard 0: 10 Given: Worker checkpoints every 5 Worker A 🔥 Worker B
  • 13. Auxiliary Idempotence • Idempotence keys at each stage • Redis sets of idempotence keys by time window • Gives resilience against various types of failures
  • 15. Auxiliary Idempotence • Gotcha: Set expiry is O(N) • Broke up into small sets, partitioned by first 2 bytes of md5 of idempotence key
  • 17. 1. Deserialize event batch 2. Apply changes to application properties 3. Get current device and application properties 4. Get known facts about sending device 5. Emit each enriched event to Kinesis Collection Kinesis Enrichment
  • 18. Kinesis SQS Failover Kinesis S3 Elasticsearch ? S3 Backups to HDFS Enricher Data Warehouse Forwarder Idempotence Ariel Realtime Idempotence Aggregation Idempotence
  • 19. Now we have a stream of well-described, denormalized event facts.
  • 20. Pipeline to HDFS • Partitioned by event name and game, buffered in-memory and written to S3 • Picked up every hour by Spark job • Converts to Parquet, loaded to HDFS
  • 21. A closer look at Ariel
  • 23. Ariel Goals • Low time-to-visibility • Easy configuration • Low cost per configured metric
  • 25. Live Metrics (Ariel) Enriched Event Data name: game_end time: 2015-07-15 10:00:00.000 UTC _devices_per_turn: 1.0 event_id: 12345 device_token: AAAA user_id: 100 name: game_end time: 2015-07-15 10:01:00.000 UTC _devices_per_turn: 14.1 event_id: 12346 device_token: BBBB user_id: 100 name: Cheating Games predicate: _devices_per_turn > 1.5 target: event_id type: DISTINCT id: 1 name: Cheating Players predicate: _devices_per_turn > 1.5 target: user_id type: DISTINCT id: 2 name: game_end time: 2015-07-15 10:01:00.000 UTC _devices_per_turn: 14.1 event_id: 12347 device_token: BBBB user_id: 100 PFADD /m/1/2015-07-15-10-00 12346 PFADD /m/1/2015-07-15-10-00 12347 PFADD /m/2/2015-07-15-10-00 BBBB PFADD /m/2/2015-07-15-10-00 BBBB PFCOUNT /m/1/2015-07-15-10-00 2 PFCOUNT /m/2/2015-07-15-10-00 1 Configured Metrics Collector
  • 27. HyperLogLog • High-level algorithm (four bullet-point version stolen from my colleague, Cristian) • b bits of the hashed value are used as a register index (Redis uses b = 14, i.e. m = 16384 registers) • The rest of the hash is inspected for its run of leading zeroes (N) • The register pointed to by the index is replaced with max(currentValue, N + 1) • An estimator function is used to calculate the approximate cardinality http://content.research.neustar.biz/blog/hll.html
  • 28. Live Metrics (Ariel) Enriched Event Data name: game_end time: 2015-07-15 10:00:00.000 UTC _devices_per_turn: 1.0 event_id: 12345 device_token: AAAA user_id: 100 name: game_end time: 2015-07-15 10:01:00.000 UTC _devices_per_turn: 14.1 event_id: 12346 device_token: BBBB user_id: 100 name: Cheating Games predicate: _devices_per_turn > 1.5 target: event_id type: DISTINCT id: 1 name: Cheating Players predicate: _devices_per_turn > 1.5 target: user_id type: DISTINCT id: 2 name: game_end time: 2015-07-15 10:01:00.000 UTC _devices_per_turn: 14.1 event_id: 12347 device_token: BBBB user_id: 100 PFADD /m/1/2015-07-15-10-00 12346 PFADD /m/1/2015-07-15-10-00 12347 PFADD /m/2/2015-07-15-10-00 BBBB PFADD /m/2/2015-07-15-10-00 BBBB PFCOUNT /m/1/2015-07-15-10-00 2 PFCOUNT /m/2/2015-07-15-10-00 1 Configured Metrics We can count different things Collector
  • 30. Pipeline Delay • Pipelines back up • Dashboards get outdated • Alarms fire!
  • 31. Alarm Clocks • Push timestamp of current events to per-game pub/sub channel • Worker takes 99th percentile age of last N events per title as delay • Use that time for alarm calculations • Overlay delays on dashboards
  • 32. Ariel, now with clocks Event ClockKinesis Aggregation PFCOUNT Are installs anomalous? Collector Idempotence PFADD Web Workers
  • 33. Ariel 1.0 • ~30K metrics configured • Aggregation into 30-minute buckets • 12 kilobytes per HLL set (plus overhead)
  • 34. Challenges • Dataset size. RedisLabs non-cluster max = 100GB • Packet/s limits: 250K in EC2-Classic • Alarm granularity
  • 35. Hybrid Datastore: Requirements • Need to keep HLL sets to count distinct • Redis is relatively finite • HLL outside of Redis is messy
  • 36. Hybrid Datastore: Plan • Move older HLL sets to DynamoDB • They’re just strings! • Cache reports aggressively • Fetch backing HLL data from DynamoDB as needed on web layer, merge using on-instance Redis
  • 37. Ariel, now with hybrid datastore DynamoDB Report Caches Old Data Migration Event Clock Kinesis Aggregation PFCOUNT Are installs anomalous? Collector Idempotence PFADD Web Workers Merge Scratchpad
  • 39. Redis Roles • Idempotence • Configuration Caching • Aggregation • Clock • Scratchpad for merges • Cache of reports • Staging of DWH extracts
  • 40. Other Considerations • Multitenancy. We run parallel stacks and give games an assigned affinity, to insulate from pipeline delays • Backfill. System is forward-looking only; can replay Kinesis backups to backfill, or backfill from warehouse
  • 41. Why Not _____? • Druid • Flink • InfluxDB • RethinkDB
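
Slides 13 and 15 describe keeping idempotence keys in Redis sets per time window, split into many small sets by the first 2 bytes of the key's md5 so that expiring any one set stays cheap. A minimal sketch of how such a set name could be derived follows. The `idem:` key layout, the one-hour window, and the `InMemorySets` stand-in (used here in place of a real Redis client so the example runs on its own) are assumptions for illustration, not the talk's actual code.

```python
import hashlib


def idempotence_set_key(idempotence_key: str, event_time_s: int, window_s: int = 3600) -> str:
    """Name of the small set that should hold this idempotence key (hypothetical layout)."""
    window_start = event_time_s - (event_time_s % window_s)
    digest = hashlib.md5(idempotence_key.encode("utf-8")).digest()
    partition = digest[:2].hex()  # first 2 bytes of the md5: up to 65,536 small sets per window
    return f"idem:{window_start}:{partition}"


class InMemorySets:
    """Stand-in for a Redis client; sadd mirrors SADD's return value for one member."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, member):
        members = self._sets.setdefault(key, set())
        if member in members:
            return 0
        members.add(member)
        return 1


def first_time_seen(client, idempotence_key: str, event_time_s: int) -> bool:
    """True if this key has not been recorded in its time window before."""
    set_key = idempotence_set_key(idempotence_key, event_time_s)
    return client.sadd(set_key, idempotence_key) == 1
```

With a real Redis client, each small set would presumably also get an expiry a little longer than the window, which is where the partitioning pays off.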
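
The four-bullet HyperLogLog description on slide 27 can be sketched in pure Python. This is a toy for illustration only: the SHA-1 based 64-bit hash, the class name `ToyHLL`, and the simple small-range correction are assumptions, and Redis's real implementation differs in its hash function, register encoding, and bias handling.

```python
import hashlib
import math

B = 14                # index bits, as on the slide
M = 1 << B            # 16384 registers
REST_BITS = 64 - B    # bits left over after taking the index


def _hash64(item: str) -> int:
    """64-bit hash taken from SHA-1 (an assumption; any well-mixed hash would do)."""
    return int.from_bytes(hashlib.sha1(item.encode("utf-8")).digest()[:8], "big")


class ToyHLL:
    def __init__(self):
        self.registers = [0] * M

    def add(self, item: str) -> None:
        h = _hash64(item)
        index = h >> REST_BITS                         # top b bits pick the register
        rest = h & ((1 << REST_BITS) - 1)              # remaining bits
        leading_zeroes = REST_BITS - rest.bit_length() # N on the slide
        self.registers[index] = max(self.registers[index], leading_zeroes + 1)

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / M)
        raw = alpha * M * M / sum(2.0 ** -r for r in self.registers)
        empty = self.registers.count(0)
        if raw <= 2.5 * M and empty:
            # Small-range correction: fall back to linear counting.
            return round(M * math.log(M / empty))
        return round(raw)
```

Adding the same item twice leaves every register unchanged, which is why PFADD is safe to repeat for duplicate events.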
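
The Live Metrics slides (25 and 28) show configured metrics being matched against enriched events and turned into PFADD calls on per-bucket keys. A sketch of that matching step is below. Representing predicates as Python lambdas, the 30-minute bucket (borrowed from slide 33), and the function names are assumptions; the sketch adds the value of whichever field each metric's `target` names and returns the commands instead of sending them to Redis.

```python
from datetime import datetime

BUCKET_MINUTES = 30  # assumption: bucket size taken from the "Ariel 1.0" slide

# Configured metrics from the slide; predicates are written as lambdas for illustration.
METRICS = [
    {"id": 1, "name": "Cheating Games", "target": "event_id",
     "predicate": lambda e: e["_devices_per_turn"] > 1.5},
    {"id": 2, "name": "Cheating Players", "target": "user_id",
     "predicate": lambda e: e["_devices_per_turn"] > 1.5},
]


def bucket_key(metric_id: int, when: datetime) -> str:
    """Key in the slide's /m/<metric id>/<bucket start> form."""
    start = when.replace(minute=when.minute - when.minute % BUCKET_MINUTES,
                         second=0, microsecond=0)
    return f"/m/{metric_id}/{start:%Y-%m-%d-%H-%M}"


def commands_for(event: dict) -> list:
    """Return the PFADD commands this event would trigger, one per matching metric."""
    commands = []
    for metric in METRICS:
        if metric["predicate"](event):
            key = bucket_key(metric["id"], event["time"])
            commands.append(("PFADD", key, str(event[metric["target"]])))
    return commands
```

Reading a metric back is then a PFCOUNT on the same bucket key.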
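
The "Alarm Clocks" slide (31) describes estimating pipeline delay as the 99th percentile age of the last N events per title. A minimal sketch of that calculation is below. The `EventClock` class, the window of 1000 events, and the nearest-rank percentile are assumptions, and the per-game pub/sub channel that would feed `observe` is left out.

```python
import math
from collections import deque


class EventClock:
    """Tracks the age of the last N events for one title (hypothetical helper)."""

    def __init__(self, window: int = 1000):
        self._ages = deque(maxlen=window)  # oldest observations fall off automatically

    def observe(self, event_ts: float, now: float) -> None:
        """Record how old an event was when it reached this stage."""
        self._ages.append(max(0.0, now - event_ts))

    def delay(self, percentile: float = 99.0) -> float:
        """Nearest-rank percentile of the recorded ages, in seconds."""
        if not self._ages:
            return 0.0
        ordered = sorted(self._ages)
        rank = math.ceil(percentile * len(ordered) / 100)
        return ordered[max(rank, 1) - 1]

    def pipeline_time(self, now: float) -> float:
        """The 'current time' alarms should use: wall clock minus estimated delay."""
        return now - self.delay()
```

Evaluating alarms against `pipeline_time` rather than the wall clock keeps a backed-up pipeline from looking like a sudden drop in events.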
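
The hybrid datastore plan (slide 36) relies on HLL sets being plain strings: older sets live in DynamoDB and are merged on the web layer using an on-instance Redis as a scratchpad. One way that merge could look is sketched below. The function name, the scratch key layout, and the assumption that `scratch` is a redis-py style client exposing `set`, `pfmerge`, `pfcount`, and `delete` are illustrative; fetching the blobs from DynamoDB is out of scope.

```python
import uuid


def merged_distinct_count(scratch, hll_blobs) -> int:
    """Approximate distinct count across several serialized HLL strings.

    scratch   -- redis-py style client for the on-instance Redis (assumption)
    hll_blobs -- raw HLL values previously read out of Redis and stored elsewhere
    """
    if not hll_blobs:
        return 0
    prefix = f"scratch:merge:{uuid.uuid4().hex}"  # unique per call to avoid collisions
    part_keys = []
    dest = f"{prefix}:dest"
    try:
        for i, blob in enumerate(hll_blobs):
            key = f"{prefix}:part:{i}"
            scratch.set(key, blob)           # an HLL is just a string value
            part_keys.append(key)
        scratch.pfmerge(dest, *part_keys)    # union of all parts
        return scratch.pfcount(dest)
    finally:
        # Always clean up the scratchpad, even if a command failed.
        scratch.delete(dest, *part_keys)
```

Caching the resulting report, as the slide suggests, avoids repeating this round trip for popular time ranges.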

Editor's Notes

  1. We also expect this to grow with the growth of our userbase, the launch of new titles, and of course with every addition of new, useful functionality.
  2. We’re just looking at one simple transformation of a stream, and the consumption of that stream by a variety of consumers. Since we’re using Kinesis, we can read the same stream in parallel from multiple applications safely. We’ll consider major challenges moving from left to right across this architecture.
  3. Primary collection is intended to be at-least-once. We currently support SQS and HTTP, and all batches carry idempotence information to allow deduplication. At this stage we have minimal logic: we are focused on letting game servers and clients successfully unload their batches of user events so they can be durably stored in our systems. System configuration lives in DynamoDB; we use Netflix Archaius. App configuration lives in DynamoDB; we cache it in memory on instances and in Redis. Goals of SQS: register and receive events asynchronously; provide elasticity when senders spike; reduce CPU burn for senders.
  4. Autoscaling group containing a simple Java service, deployed as a golden AMI provisioned with Packer and Ansible, using CloudFormation. We make lots of these; we call them our satellites. Usually we name them after moons. The little orange symbol means we’re using Amazon’s KCL, so the fleet negotiates workers’ shard control using a lease table in DynamoDB. Monitoring is New Relic and lots of StatsD sent to Datadog. So every time we see a gray square, assume we’re talking about 1-50 EC2 instances across several availability zones in one AWS region.
  5. But first an aside on Kinesis.
  6. Checkpointing and auxiliary idempotence. The data in our stream has monotonically increasing pointers (huge, huge numbers!). In our case, 1-22 and beyond. A worker on this shard appears and checkpoints after every 5 successfully processed records, but it dies after processing record 12. When Worker B appears, it sees the checkpoint at 10 and picks up processing the shard at 11. But this means we’ll reprocess 11 and 12! Similar issues can occur with out-of-order processing of data.
  7. Expensive. Bloom filters may be a viable option some day
  8. Expensive. Bloom filters may be a viable option some day
  9. This stage is the latency-sensitive one.
  10. This lets all downstream systems act on data without needing to hit any more systems.
  11. We have considered a streaming ingest, but this has proven easier to reason about and has sufficient liveness at the moment.
  12. Introduced in Redis 2.8.9 (http://antirez.com/news/75). But I don’t really want to get into this too much…
  13. The first complete implementation of this had three major components: collector, web and workers.
  14. Caveat: not all metrics were HLL; we also support sums, which take only several bytes. But only the sparsest of distinct metrics would require less than 12KB for a time window.
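
The checkpointing scenario in note 6 (checkpoint every 5 records, Worker A dies after record 12, Worker B resumes from checkpoint 10) can be replayed with a small simulation. The `run_worker` function and the `seen` set standing in for the Redis idempotence sets are hypothetical and exist only to show why records 11 and 12 are processed twice and how a deduplication check absorbs that.

```python
def run_worker(records, checkpoint, checkpoint_every=5, crash_after=None):
    """Process records after `checkpoint`; return (processed, last_checkpoint)."""
    processed = []
    for seq in records:
        if seq <= checkpoint:
            continue                      # already covered by the stored checkpoint
        processed.append(seq)
        if len(processed) % checkpoint_every == 0:
            checkpoint = seq              # persist progress every N records
        if crash_after is not None and seq == crash_after:
            break                         # simulated worker death
    return processed, checkpoint


shard = list(range(1, 23))                # records 1-22, as on the slide

a_processed, stored = run_worker(shard, checkpoint=0, crash_after=12)
b_processed, _ = run_worker(shard, checkpoint=stored)
replayed = sorted(set(a_processed) & set(b_processed))

# Downstream deduplication: only act on a record the first time it is seen.
seen = set()
effects = []
for seq in a_processed + b_processed:
    if seq not in seen:
        seen.add(seq)
        effects.append(seq)
```

Here `replayed` contains exactly the records between the last checkpoint and the crash, and `effects` contains each record once.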
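
The Ariel 1.0 figures (about 30K metrics, 30-minute buckets, 12 kilobytes per HLL set) give a rough upper bound on daily storage, which helps explain why the 100GB limit on the Challenges slide became a concern. The arithmetic below assumes every metric is a dense HLL written in every bucket and ignores per-key overhead; as note 14 says, sums and sparse HLLs are much smaller, so real usage would be lower.

```python
METRICS = 30_000            # "~30K metrics configured"
BUCKET_MINUTES = 30         # "Aggregation into 30-minute buckets"
HLL_BYTES = 12 * 1024       # "12 kilobytes per HLL set", overhead ignored

buckets_per_day = 24 * 60 // BUCKET_MINUTES
bytes_per_day = METRICS * buckets_per_day * HLL_BYTES
gib_per_day = bytes_per_day / 2**30

# Days until a 100 GB dataset limit is reached under these worst-case assumptions.
days_to_limit = 100 * 10**9 / bytes_per_day

print(f"{buckets_per_day} buckets/day, {gib_per_day:.1f} GiB/day, "
      f"limit reached in about {days_to_limit:.1f} days")
```

Under these assumptions the limit is reached in under a week of retained data, which is consistent with the plan to move older HLL sets out of Redis.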