EVCache at Netflix

Agenda
●Caching at Netflix
●What is EVCache?
●Additional Features
●Code & Internals
●Architecture Lessons

How we view caches
Globally available
Eventually-consistent
Ephemeral storage mechanism
Tunable replication
As an optimization for online services or
As primary storage for bulk computation
(recommendations, predictions, etc.)

EVCache Use @ Netflix
70+ distinct EVCache clusters
Used by nearly 200 applications
Data replicated over 3 AWS regions
Over 1 Million replications per second
65+ Billion objects
30+ Million ops/second (1.8 Trillion+ per day)
160+ Terabytes of data stored
Clusters from 3 to hundreds of instances
12000+ memcached instances of varying size

Ephemeral Volatile memCache (EVCache)
Clustered memcached optimized for AWS and
tuned for Netflix use cases.

[Diagram: EVCache Server (Memcached, Prana sidecar, Monitoring & Other Processes); Eureka; Client Application (Client Library, EVCache Client)]

Why Optimize for AWS
●Instances disappear
●Zones disappear
●Regions can disappear (Chaos Kong)
●These do happen (and we test all the time)
●Network can be lossy
○Throttling
○Dropped packets
●Customer requests move between regions

How we Optimized for AWS
●Multiple copies of data per region
●Clients are local replica aware
●Writes to all local replicas by the client
●Reads are local and retry on other copies
●Replication across regions with a custom
replication system
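
The read and write rules above can be sketched in a few lines of Java. This is a minimal illustration, not the EVCache client's actual code: the class `ZoneAwareCacheSketch`, its methods, and the use of one in-memory map per zone in place of memcached are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration; these are not the EVCache client's real classes. */
final class ZoneAwareCacheSketch {
    /** One in-memory map stands in for the memcached replica in each zone. */
    private final List<Map<String, String>> replicas = new ArrayList<>();
    private final int localZone;

    ZoneAwareCacheSketch(int zoneCount, int localZone) {
        for (int i = 0; i < zoneCount; i++) {
            replicas.add(new ConcurrentHashMap<>());
        }
        this.localZone = localZone;
    }

    /** Writes go to every replica in the region. */
    void set(String key, String value) {
        for (Map<String, String> replica : replicas) {
            replica.put(key, value);
        }
    }

    /** Reads try the local zone first, then retry on the other copies. */
    Optional<String> get(String key) {
        for (int i = 0; i < replicas.size(); i++) {
            int zone = (localZone + i) % replicas.size();
            String value = replicas.get(zone).get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    /** Simulates losing every entry held by one zone's replica. */
    void dropZone(int zone) {
        replicas.get(zone).clear();
    }

    public static void main(String[] args) {
        ZoneAwareCacheSketch cache = new ZoneAwareCacheSketch(3, 0);
        cache.set("key", "value");
        cache.dropZone(0); // the local copy disappears
        System.out.println(cache.get("key").orElse("miss"));
    }
}
```

In this sketch a read only misses once every copy in the region has lost the entry, which is the property the bullets describe.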

Reading
[Diagram: Zones A, B, and C, each with a Client Application (Client Library, EVCache Client)]

Writing
[Diagram: Zones A, B, and C, each with a Client Application (Client Library, EVCache Client)]

Use Case: Fronting Services
[Diagram: Client Application (Client Library, EVCache Client, Service Client); rows of instances labeled S and C]

Use Case: As the Data Store
[Diagram: Offline Services (Offline / Nearline Computation); Online Services (Online Client Application with Client Library and EVCache Client)]

Use Case: Transient Data Store
[Diagram: three clients, each with a Client Library and EVCache Client]

Additional Features
●Global cross-region replication
●Secondary indexing
●Cache warming
●Consistency checking
All powered by metadata flowing through Kafka

Cross-Region Replication
Why Replicate?
Maintain duplicate caches in each region
Invalidate stale cache entries in other regions' caches
What do we replicate?
set
delete
Where do we replicate?
One or more other regions, depending on application
requirements

Replicate deletes, and send invalidations on set
Usually used for caches with persistent store
Entry fetched from persistent store on next cache miss
(demand-fill)
Replicate set
Used for offline/nearline computation
Commonly no persistent store
Duplicate cache in multiple regions
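
The two replication modes above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual replication system: the class `ReplicationModesSketch`, the `Event` type, and the mode names are hypothetical; an in-memory queue stands in for the message system; and the relay reading the current value from the local cache when it applies a replicated set is an assumption of this sketch.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical illustration of the two replication modes; not the real system. */
final class ReplicationModesSketch {
    enum Mode { INVALIDATE_ON_SET, REPLICATE_SET }

    /** Metadata only: the operation and the key (assumed message shape). */
    static final class Event {
        final String op;
        final String key;

        Event(String op, String key) {
            this.op = op;
            this.key = key;
        }
    }

    final Map<String, String> localCache = new HashMap<>();
    final Map<String, String> remoteCache = new HashMap<>();
    private final Queue<Event> queue = new ArrayDeque<>(); // stands in for the message system
    private final Mode mode;

    ReplicationModesSketch(Mode mode) {
        this.mode = mode;
    }

    void set(String key, String value) {
        localCache.put(key, value);
        queue.add(new Event("set", key));
    }

    void delete(String key) {
        localCache.remove(key);
        queue.add(new Event("delete", key));
    }

    /** Applies queued events to the other region's cache. */
    void drainToRemote() {
        Event event;
        while ((event = queue.poll()) != null) {
            String current = localCache.get(event.key);
            boolean copyValue = event.op.equals("set")
                    && mode == Mode.REPLICATE_SET
                    && current != null;
            if (copyValue) {
                remoteCache.put(event.key, current); // duplicate the entry remotely
            } else {
                remoteCache.remove(event.key); // delete, or a set treated as an invalidation
            }
        }
    }
}
```

With `INVALIDATE_ON_SET`, the remote region refills the entry from its persistent store on the next miss; with `REPLICATE_SET`, the remote cache holds a copy without needing one.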

[Diagram: Region A and Region B, each with EVCache, Replication, Repl Writer, Kafka, Application, and Client. Labeled steps: 1 set or delete; 2 send metadata; 3 poll msg; 6 set or delete; 7 read]

Cross-Region Replication (ping-pong)
[Diagram: Region A and Region B, each with an App, EVCache, and Replication. Labeled steps: 2 set; 3 send metadata; 4 replicate; 5 set; 7 get]

Choices for Underlying Message System
● AWS SQS
○ Message queueing service
○ Reliable and fast (but with occasional spikes in latency)
○ No guaranteed ordering of messages
○ Messages are processed at-least-once and removed from queue
○ Forward a message to multiple queues to process multiple times
○ Cost based on messages, bandwidth, etc.
● Apache Kafka
○ Open-source publish-subscribe system
○ Reliable and fast enough
○ Messages are ordered within partition
○ Allows processing of same message by different applications

Secondary Indexing
●Why index?
○memcached does not provide a usable index
○Debugging
○Warm up lost instances
○Data insight
●Indexing provided by ElasticSearch

Cache Warming (Deployments)
[Diagram: Zone A with a Client Application (Client Library, EVCache Client), a Cache Warmer, and Kafka]

Minimal Code Example
Create EVCache Object
    EVCache evCache = new EVCache.Builder()
        .setAppName("EVCACHE_TEST")
        .setCachePrefix("pre")
        .setDefaultTTL(900)
        .build();
Write Data
    evCache.set("key", "value");
Read Data
    evCache.get("key");
Delete Data
    evCache.delete("key");

Client-side Hashing
Ketama Consistent Hashing algorithm
If one server is replaced, few keys are shuffled
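
The effect of consistent hashing can be sketched with a small hash ring. This is a simplified ring in the spirit of Ketama, not the exact Ketama algorithm or the EVCache client's implementation: the class `HashRingSketch`, the 100 points per server, and the use of the first 8 bytes of an MD5 digest are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Simplified consistent-hash ring; a sketch, not the exact Ketama algorithm. */
final class HashRingSketch {
    private static final int POINTS_PER_SERVER = 100; // assumed value for illustration
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Places several points per server on the ring; the list must not be empty. */
    HashRingSketch(List<String> servers) {
        for (String server : servers) {
            for (int i = 0; i < POINTS_PER_SERVER; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    /** A key belongs to the first server point at or after its hash, wrapping around. */
    String serverFor(String key) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** Uses the first 8 bytes of the MD5 digest as a 64-bit ring position. */
    static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts how many of keyCount sample keys change owner between two server lists. */
    static int movedKeys(List<String> before, List<String> after, int keyCount) {
        HashRingSketch oldRing = new HashRingSketch(before);
        HashRingSketch newRing = new HashRingSketch(after);
        int moved = 0;
        for (int i = 0; i < keyCount; i++) {
            String key = "key-" + i;
            if (!oldRing.serverFor(key).equals(newRing.serverFor(key))) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> before = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            before.add("server-" + i);
        }
        List<String> after = new ArrayList<>(before);
        after.set(3, "server-new"); // replace one server
        System.out.println("moved keys: " + movedKeys(before, after, 10000));
    }
}
```

Only keys that were owned by the replaced server, or that now land on its replacement, change owner; keys on the other servers stay where they were.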

Architecture Lessons
(from outages)

Failure Scenarios
●Load Spikes on the Service
●Dropped Packets (and virtual NIC limits)
●Write-back Cascading Failure

Load Spikes (Personalized Fallbacks)
[Diagram: nodes labeled S, Z, U, U, and L, plus Cassandra]
Dropped Packets
[Diagram: Client Application (Client Library, EVCache Client)]

Write-back cascading failure
[Diagram, two steps: nodes labeled A, B, C, D, and S, plus Cassandra]

Client failure resilience
●Operations fast fail
○No servers in Eureka
○Connection reset
●Exponential backoff
●Read/Write Queues
○When full, fast fail
●Replication write failure
○Secondary path through SQS as a backup
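
Two of the ideas above, failing fast when a queue is full and backing off exponentially, can be sketched as follows. This is a minimal illustration, not the EVCache client's code: the class `FastFailSketch`, its method names, and the choice of doubling a base delay up to a cap are assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical illustration of fast-fail queues and exponential backoff. */
final class FastFailSketch {
    private final BlockingQueue<String> writeQueue;

    FastFailSketch(int capacity) {
        this.writeQueue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false immediately when the queue is full instead of blocking the caller. */
    boolean enqueueWrite(String operation) {
        return writeQueue.offer(operation);
    }

    /** Delay before retry number `attempt` (0-based): the base delay doubled per attempt, capped. */
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        FastFailSketch client = new FastFailSketch(2);
        System.out.println(client.enqueueWrite("set a")); // accepted
        System.out.println(client.enqueueWrite("set b")); // accepted
        System.out.println(client.enqueueWrite("set c")); // rejected: queue is full
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println(backoffMillis(attempt, 100, 5000));
        }
    }
}
```

Rejecting work at a full queue keeps the calling thread from stalling on a slow or unreachable cache server, and the capped delay keeps retries from piling onto a server that is already struggling.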

EVCache Open Source
github.com/netflix/evcache

Dependencies
Server:
● memcached (cache process)
● Prana (sidecar)
● Servo client (metrics)
● Eureka client (instance discovery)
Client:
● Servo client (metrics)
● Eureka client (instance discovery)
External:
● Atlas (metrics ingestion & reporting)
● Eureka service

Consistency Checking
[Diagram: Zone A and Zone B, each with a Client Application (Client Library, EVCache Client); Kafka; Consistency Checker]

(Netflix) Multi-region Architecture
[Diagram: three regions (US West 2, US East 1, EU West 1), each showing A, B, and C]

When to Use Caches
●Predictable response time with varying loads
●Improve throughput
●Reduce server costs
●Store results of idempotent computations
●Fallbacks when service is not responding
●Sharing data across multiple disparate
services

Know Your Limits
There’s probably a limitation in your
infrastructure that you don’t know about
CPU & Memory are easy, network is hard
Cascading failures
