1. Introduction to Redis
Remote dictionary server
Grégory BOISSINOT
@gboissinot
#redismeetup
Paris Redis Meetup #1
2016, October 24
2. In-memory key-value store (DB)
with optional persistence to disk
key value
Disk
(RDB format)
Optional
in-memory
0.1µs for memory vs 10ms for disk
All data in memory
At risk but very performant
key value
key value
Snapshotting – Periodic Dump (RDB)
Append Only File (AOF)
3. gboissinot: ~$ ./redis-cli -n 0
127.0.0.1:6379> SET the-most-popular-key-value-store "REDIS"
OK
127.0.0.1:6379> QUIT
gboissinot: ~$
~INSERT INTO ...
~SELECT * FROM ...
gboissinot: ~$ ./redis-cli -n 0
127.0.0.1:6379> GET the-most-popular-key-value-store
"REDIS"
127.0.0.1:6379> QUIT
gboissinot: ~$
4. An external solution for your Application
Primary
Data Store
Redis
Master
Application
Redis
Slave
Index
Store
//With Jedis API (a Java Redis Client)
import redis.clients.jedis.Jedis;
…
jedis.set("foo", "bar");
…
String value = jedis.get("foo");
...
Redis is mainly used to complement your existing Data solutions
Programmer’s View of Redis
5. Redis philosophy
“I see Redis definitively more as a flexible tool than a solution specialized
to solve a specific problem: his mixed soul of cache, store, and
messaging server shows this very well.”
Salvatore Sanfilippo’s words, author of Redis
@antirez
6. Expiration in seconds
…
TTL LastPrice:RedisBook
8
GET LastPrice:RedisBook
38.5
…
TTL LastPrice:RedisBook
-2
GET LastPrice:RedisBook
== (nil)
Eviction by setting an expiration date
SET LastPrice:RedisBook 38.5 EX 12
One simple way to decrease memory usage
The key has
expired
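The TTL and GET replies above can be mimicked with a toy store. This is a minimal sketch in plain Python, not Redis code: the class name `ExpiringStore` and the injectable `clock` parameter are assumptions made for illustration, and the remaining time is truncated rather than rounded.

```python
import time

class ExpiringStore:
    """Toy key-value store with per-key expiration (hypothetical, not Redis itself)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock, handy for tests
        self._data = {}          # key -> value
        self._deadline = {}      # key -> absolute expiry time

    def set(self, key, value, ex=None):
        # Like SET key value [EX seconds]; a plain SET clears any previous expiry.
        self._data[key] = value
        if ex is None:
            self._deadline.pop(key, None)
        else:
            self._deadline[key] = self._clock() + ex

    def _purge_if_expired(self, key):
        # The key is only checked when somebody accesses it.
        deadline = self._deadline.get(key)
        if deadline is not None and self._clock() >= deadline:
            del self._data[key]
            del self._deadline[key]

    def get(self, key):
        self._purge_if_expired(key)
        return self._data.get(key)   # None plays the role of (nil)

    def ttl(self, key):
        self._purge_if_expired(key)
        if key not in self._data:
            return -2                # key does not exist (or has expired)
        if key not in self._deadline:
            return -1                # key exists but has no expiration
        return int(self._deadline[key] - self._clock())
```

With a fake clock, setting `LastPrice:RedisBook` with `ex=12` and reading it 4 seconds later gives a TTL of 8; after 12 seconds, TTL returns -2 and GET returns nothing, as on the slide.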
7. Eviction from Redis
Multiple available behaviors
lru: the least recently used
ttl: closest to expiration
Choose how keys are selected for eviction when the max memory is reached
maxmemory <bytes>
maxmemory-policy <evict-strategy>
Max 45% with Snapshotting
Max 95 % with AOF
Always 5% for overhead
Evict Policy Values
volatile-ttl
volatile-lru
allkeys-lru
volatile-random
allkeys-random
noeviction (default since Redis 3.0; volatile-lru before)
redis.conf
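To make the allkeys-lru idea concrete, here is a small sketch in plain Python. It is an illustration only: `LruCache` and `max_keys` are hypothetical names, the limit counts keys instead of bytes, and it keeps an exact LRU order whereas Redis approximates LRU by sampling keys.

```python
from collections import OrderedDict

class LruCache:
    """Exact LRU eviction over all keys (a simplification of allkeys-lru)."""

    def __init__(self, max_keys):
        self.max_keys = max_keys      # stands in for maxmemory
        self._data = OrderedDict()    # oldest access first, newest last

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)   # a read counts as a use
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        while len(self._data) > self.max_keys:
            self._data.popitem(last=False)   # evict the least recently used key

    def keys(self):
        return list(self._data)
```

With `max_keys=2`, writing `a` and `b`, reading `a`, then writing `c` evicts `b`, because `a` was used more recently.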
8. Advanced In-Memory Data Structure Store
KEY VALUE
Strings
Hashes
Lists
Sets
Sorted Sets
Primitive
Container
Binary-safe string
with max allowed size
512MB
Several Logical Data Types – Data is not opaque
...
9. Redis under the hood
Redis Client
Redis
Data Structure
Server
● A TCP Server using the client-server model
● Every Redis command is atomic
○ Including the ones that do multiple things
● Redis is single-threaded
○ While one command is executing, no other commands will run
RESP (REdis Serialization Protocol): Text-based protocol
Send
commands
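As an illustration of the text-based protocol, a client sends each command as a RESP array of bulk strings. The helper below is a minimal sketch; the function name `encode_command` is an assumption, and it only covers the request side.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]             # array header: number of arguments
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))   # length-prefixed string
    return b"".join(parts)

payload = encode_command("SET", "foo", "bar")
```

Here `payload` is the byte string `*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n`: still readable by a human, and binary-safe thanks to the length prefixes.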
10. Some promises
• Focused on speed (fast for storing and retrieving data)
~ as fast as in-memory
Mostly O(1) execution time complexity
• Rich API (comparable to what an RDBMS provides)
Simple for developers: more than 160 commands
Data structures are directly exposed – No abstraction layer
• Can store lots of keys (limited by the memory)
2^32 keys (> 4 billion elements)
• Server-side scripting (like stored procedures) with Lua
A script is a pure function
Can be compiled in memory
Atomic like any other commands
11. Internal Scripting with Lua
./redis-cli eval "return redis.call('set',KEYS[1], ARGV[1])" 1 foo 42
OK
./redis-cli GET foo
42
Fast embeddable scripting language
Number of keys
13. ./redis-benchmark -q -n 100000
PING_INLINE: 120627.27 requests per second
PING_BULK: 120772.95 requests per second
SET: 118343.19 requests per second
GET: 117647.05 requests per second
….
Example: A simple benchmark on a laptop (core i7 @2.2GHz, 16 GB RAM)
Redis in a low latency fashion
14. Many use cases
Caching
Online User Data
(Short-lived data such as Session)
Job & Queue Management
High speed Data Ingest
(counting things)
More or less technical use cases
In-database Analytics
(caching query, etc)
Time Series Data
(IoT, etc)
19. Tracking website visitors (1/2)
IP
192.168.7.0
IP
192.168.17.3
IP
192.168.19.2
IP
192.168.3.4
SADD visitors:19:30 192.168.7.0
SADD visitors:19:30 192.168.17.3
SADD visitors:19:30 192.168.19.2
SADD visitors:19:30 192.168.3.4
SADD visitors:19:30 192.168.7.0
SCARD visitors:19:30
(integer) 4
Redis key
Set
Temporal focus
visitors:19:30
key
namespace
20. Tracking website visitors (2/2)
IP
192.168.7.0
IP
192.168.17.3
IP
192.168.19.2
IP
192.168.3.4
IP
192.168.7.4
IP
192.168.19.34
IP
192.168.19.2
IP
192.168.3.4
visitors:2016:10:23 visitors:2016:10:24
//Today but not yesterday
SDIFF visitors:2016:10:24 visitors:2016:10:23
//Today or yesterday
SUNION visitors:2016:10:24 visitors:2016:10:23
//Today and yesterday
SINTER visitors:2016:10:24 visitors:2016:10:23
Set Set
Temporal focus
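The three commands map directly onto ordinary set algebra. The sketch below uses Python sets and assumes that the first four IPs listed on the slide belong to visitors:2016:10:23 and the last four to visitors:2016:10:24; that split is a reading of the diagram, not something the slide states.

```python
# Assumed split of the slide's IPs between the two days.
yesterday = {"192.168.7.0", "192.168.17.3", "192.168.19.2", "192.168.3.4"}   # visitors:2016:10:23
today = {"192.168.7.4", "192.168.19.34", "192.168.19.2", "192.168.3.4"}      # visitors:2016:10:24

today_only = today - yesterday    # SDIFF  visitors:2016:10:24 visitors:2016:10:23
either_day = today | yesterday    # SUNION visitors:2016:10:24 visitors:2016:10:23
both_days = today & yesterday     # SINTER visitors:2016:10:24 visitors:2016:10:23
```

Note that SDIFF is order-sensitive (the first key minus the others), while SUNION and SINTER are not.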
21. Storing and updating user attributes
HSET employee:123 name Gregory
HSET employee:123 mail gregory.boissinot@gmail.com
HSET employee:123 status "married"
HGET employee:123 name
== Gregory
HGETALL employee:123
1) name
2) Gregory
3) mail
4) gregory.boissinot@gmail.com
….
name Gregory
mail gregory.boissinot@gmail.com
status married
sub-key value
Hash
Redis key
employee:123
Optimized :
O(1) complexity
24. Propagating health information
Very lightweight messaging system – pub/sub model
Fire-and-forget notification
In-memory messaging: no persistence, no cache
./redis-cli
PSUBSCRIBE services:*
….
"<health>…</health>"
./redis-cli
PUBLISH services:123 "<health>…</health>"
(integer) 1
Be careful: Not suitable for reliable off-line event notifications
Pattern-matching subscriptions
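The fire-and-forget behaviour can be sketched in a few lines of plain Python. `PubSub` is a hypothetical class, and `fnmatchcase` only approximates Redis glob-style patterns; the point is that a message published while nobody listens is simply lost.

```python
from fnmatch import fnmatchcase

class PubSub:
    """In-memory, fire-and-forget messaging with pattern subscriptions."""

    def __init__(self):
        self._subscribers = []   # list of (pattern, inbox)

    def psubscribe(self, pattern):
        inbox = []               # messages received by this subscriber
        self._subscribers.append((pattern, inbox))
        return inbox

    def publish(self, channel, message):
        delivered = 0
        for pattern, inbox in self._subscribers:
            if fnmatchcase(channel, pattern):
                inbox.append((channel, message))
                delivered += 1
        return delivered         # like the (integer) reply of PUBLISH

bus = PubSub()
lost = bus.publish("services:123", "nobody is listening yet")
inbox = bus.psubscribe("services:*")
received = bus.publish("services:123", "<health>ok</health>")
```

The first publish reaches 0 subscribers and is never replayed; the second reaches 1, matching the `(integer) 1` reply on the slide.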
25. In-memory index (1/2)
HSET user:123 id 123
HSET user:123 name Gregory
HSET user:123 age 35
...
ZADD user.age.index 35 123
ZADD user.age.index 33 124
ZADD user.age.index 13 125
ZRANGEBYSCORE user.age.index 30 40
124
123
ZCOUNT user.age.index 30 40
HGETALL user:123
HGETALL user:124
MULTI
HSET user:123 age 36
QUEUED
ZADD user.age.index 36 123
QUEUED
EXEC
A secondary index for searching
users by age
Object indexing: Multi-criteria search
Beware of index inconsistency: the hash and the index must be updated together
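A secondary index of this kind is just a list of (score, member) pairs kept sorted. The sketch below imitates it with `bisect`; `AgeIndex` and `set_age` are hypothetical names, and integer user ids are an assumption.

```python
import bisect

class AgeIndex:
    """Sorted (age, user_id) pairs, standing in for the user.age.index sorted set."""

    def __init__(self):
        self._entries = []

    def add(self, age, user_id):
        # Like ZADD: re-adding a member replaces its previous score.
        self._entries = [e for e in self._entries if e[1] != user_id]
        bisect.insort(self._entries, (age, user_id))

    def range_by_age(self, low, high):
        # Like ZRANGEBYSCORE: inclusive bounds, ordered by ascending score.
        start = bisect.bisect_left(self._entries, (low, float("-inf")))
        stop = bisect.bisect_right(self._entries, (high, float("inf")))
        return [user_id for _, user_id in self._entries[start:stop]]

users = {123: {"name": "Gregory", "age": 35}, 124: {"age": 33}, 125: {"age": 13}}
index = AgeIndex()
for user_id, fields in users.items():
    index.add(fields["age"], user_id)

def set_age(user_id, age):
    # Both structures change together; this is what MULTI/EXEC protects.
    users[user_id]["age"] = age
    index.add(age, user_id)
```

`index.range_by_age(30, 40)` yields 124 then 123, since results come back in score order. Updating only the hash, or only the index, is exactly the inconsistency the slide warns about.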
26. In-memory index (2/2)
Object indexing: Multi-criteria search
ZADD mySet 0 0056:0028.44:90
ZADD mySet 0 0034:0011.00:852
ZRANGEBYLEX mySet [0056:0010.00 [0056:0030.00
All products in room 56 having a price between 10 and 30
room:price:product_id
• All elements are inserted with the same score
• Force lexicographical ordering
All elements between
min and max
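The trick relies on zero-padded fields, so that string order matches numeric order. A minimal sketch, assuming a hypothetical `make_member` helper and using a sorted Python list in place of the sorted set (for ASCII members, Python string order matches byte order):

```python
import bisect

def make_member(room, price, product_id):
    # Zero-pad room and price so lexicographic order equals numeric order.
    return "%04d:%07.2f:%d" % (room, price, product_id)

def range_by_lex(sorted_members, low, high):
    # Like ZRANGEBYLEX with inclusive "[" bounds.
    start = bisect.bisect_left(sorted_members, low)
    stop = bisect.bisect_right(sorted_members, high)
    return sorted_members[start:stop]

members = sorted([make_member(56, 28.44, 90), make_member(34, 11.00, 852)])
matches = range_by_lex(members, "0056:0010.00", "0056:0030.00")
```

Only `0056:0028.44:90` falls inside the range. Without padding, a price such as `9.99` would sort after `28.44` and the range query would return wrong results.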
27. Combine Redis with an event store
1. Bucket on time slot with a list of events
Redis can be the counter/aggregate/sorted set storage
RPUSH actions<time slice> <encoded JSON event>
2. Aggregate structure with a sorted set
ZADD actions:ts <time_stamp> <encoded JSON event>
ZCOUNT actions:ts <timestamp 1> <timestamp 2>
3. Index event by user_id and total number of events by user
SET event:<id> <encoded JSON event>
ZADD actions:users <user_id> <event_id>
ZCOUNT actions:users <user_id> <user_id>
28. HyperLogLog Data Structure
Counting the number of distinct elements efficiently
PFADD visitors 129.3.05.33
PFADD visitors 134.4.55.52
…
PFADD visitors 129.3.05.33
…
PFCOUNT visitors
(integer) 33
Provides an approximation of the number of unique elements in a set using
a small, constant amount of memory
At most 12 KB per key to count
with a standard error of 0.81%
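Both figures follow from the layout of the dense representation: 16384 registers of 6 bits each. The arithmetic below uses the usual HyperLogLog error estimate of 1.04 / sqrt(m), where m is the number of registers; the variable names are illustrative.

```python
import math

REGISTERS = 16384          # 2**14 registers in the dense representation
BITS_PER_REGISTER = 6

dense_size_bytes = REGISTERS * BITS_PER_REGISTER // 8   # 12288 bytes = 12 KB
standard_error = 1.04 / math.sqrt(REGISTERS)            # 1.04 / 128, about 0.81%
```

The size does not depend on how many elements are counted, which is why a key tracking millions of visitors costs no more than one tracking a few hundred.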
32. Master-Slave Replication Process (1/2)
Master
DB
Slave
DB
Transaction
(write)
Transaction
(read)
Replication of write commands (async)
By default
● Replication is asynchronous
● No quorums
● Slaves are read-only by default
● Slaves can become master on
the fly
It is always possible to lose data
● Do not forget: The main Redis
use case is a Cache
Asynchronous replication can produce inconsistencies
33. Master-Slave replication (2/2)
Master-slave model performed asynchronously
for redundancy and scalability
Master
Slave
Slave
Slave
Slave
SYNC
command
SYNC
command
By default, clients cannot write to slaves
Slave can be master on the fly
SYNC
command
34. Allows you to have a bigger dataset, as you can use more memory
Driven at client-side
Can be at driver-side
Some clients have built-in sharding support (e.g. Jedis)
With a proxy that uses the Redis protocol and does the sharding for you (e.g. Twemproxy)
Can be at application-side
(sharding support yourself on top of an existing client)
Sharding (Partitioning)
Horizontal partitioning solution - No built-in solution but multiple possibilities
0 < userIdKey < 100 → Redis instance 1
Otherwise → Redis instance 2
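Application-side sharding comes down to a routing function that picks an instance for each key. The sketch below shows the range rule from the slide next to a hash-based alternative; the instance names and both function names are hypothetical, and CRC32-modulo is only one possible hashing choice.

```python
import zlib

def shard_by_range(user_id):
    # The rule from the slide: 0 < userIdKey < 100 goes to instance 1.
    return "redis-1" if 0 < user_id < 100 else "redis-2"

def shard_by_hash(key, instances):
    # Alternative: spread keys evenly using a stable hash of the key.
    return instances[zlib.crc32(key.encode("utf-8")) % len(instances)]

instances = ["redis-1", "redis-2", "redis-3"]
target = shard_by_hash("user:123", instances)
```

Range rules are easy to reason about but can produce hot spots; hash rules balance better, but adding an instance changes the mapping for most keys unless a scheme such as consistent hashing is used.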
36. ./redis-cli info | grep memory
./redis-cli info commandstats
….
./redis-cli
127.0.0.1:6379> slowlog get
127.0.0.1:6379> latency latest
127.0.0.1:6379> latency history <event name>
Monitoring - Built-in Admin Commands
Included into the Redis distribution
./redis-benchmark
./redis-check-rdb (named redis-check-dump before Redis 3.2)
./redis-check-aof
Native Repair tools
37. Monitoring - Monitor Command
<<Admin>>
Client
Client
Redis
Server
Client
Client
monitor command
Track all other client commands
Inside the Redis distribution
⇒ However, it doesn't fit production systems with multiple Redis servers
40. Fail-over with Redis Sentinel
• Official Redis HA/failover solution
Periodically checks aliveness of Redis instances
On master failure, chooses slave & promotes to master
Notifies clients & slaves about the new master
• Offers a monitoring system with support of automatic failover
• Enhance some tools such as “Redis Read Repair”
Not totally ready for production
41. Redis Sentinel vs Redis Cluster
Monitors master & slave instances
Notifies about changed behavior
Handles automatic failover
A pod of Sentinels can monitor
multiple Redis master & slave
nodes
Redis Sentinel
Data sharding solution with
automatic management,
handling failover and
replication
Requires a smart client
library that knows about
Redis Cluster
Redis Cluster
43. Multiple Databases inside Redis
database 0
database 2
database 4
database 15
database 1
database 3
127.0.0.1:6379> SELECT 13
127.0.0.1:6379[13]> …
A way to segregate key spaces
…
44. Mostly Single-Threaded Architecture
Your command is the only
one running into the Redis
instance
No locks
Strengths
Your command is the only
one running into the Redis
instance
“Bad” commands can
block for several
seconds
Poorly designed in-server
scripts can block forever
Weaknesses
Not CPU intensive: for one Redis instance, max 2 CPU cores needed
45. Single-Threaded event loop
while (true) {
processEvents () {
check for network events
new clients (IPv4, IPv6, etc.)
connected clients running commands
process scheduled events
10 times per second
replication sanity check
force-expire keys
persistence security check
…
}
}
Redis server
startup
Read config
file
Load existing
data from disk
Listen to
clients
46. Redis Expiration Process
Key expiration
Passive
On key demand
Active
Sampling of 100 records/second and
delete expired keys
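The active side can be pictured as a periodic job that inspects a random sample of keys carrying an expiry and deletes those whose deadline has passed. This is a rough sketch only: `active_expire_cycle` is a hypothetical function, and the default `sample_size` is an arbitrary illustrative value, not a Redis setting.

```python
import random

def active_expire_cycle(deadlines, now, sample_size=20, rng=random):
    """Delete expired keys found in a random sample; return how many were removed.

    deadlines maps each key that has an expiry to its absolute deadline.
    """
    sample = rng.sample(sorted(deadlines), min(sample_size, len(deadlines)))
    expired = [key for key in sample if deadlines[key] <= now]
    for key in expired:
        del deadlines[key]
    return len(expired)

deadlines = {"a": 1, "b": 5, "c": 100}
removed = active_expire_cycle(deadlines, now=10)
```

Sampling keeps each cycle cheap, at the price of leaving some expired keys in memory until a later cycle or until a client touches them (the passive path).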
47. Master-Slave Initial Replication Process
Master
Slave
hset
hset
rpop ... hset,hset,rpop
Save
1. - Master initiates background saving
- Master buffers all new write
commands
2. - Master transfers backup file to
slaves
- Slaves write file to disk, then load
3. - Master transfers buffered command
stream to slave
- Slave executes commands
Data replication uses disk for now: not very suitable with lots of writes
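The three steps of the initial synchronization can be traced with plain dictionaries. This sketch ignores networking, the RDB file format, and failures; `initial_sync` is a hypothetical function in which writes are simple (key, value) assignments.

```python
def initial_sync(master, writes_during_save):
    """Return the slave's dataset after an initial synchronization."""
    # 1. Master takes a point-in-time snapshot (background save)...
    snapshot = dict(master)
    # ...and keeps serving writes, buffering them for the slave.
    buffered = []
    for key, value in writes_during_save:
        master[key] = value
        buffered.append((key, value))
    # 2. The snapshot is transferred; the slave loads it.
    slave = dict(snapshot)
    # 3. The buffered command stream is replayed on the slave, in order.
    for key, value in buffered:
        slave[key] = value
    return slave

master = {"a": "1"}
slave = initial_sync(master, [("b", "2"), ("a", "3")])
```

After step 3 the slave holds the same data as the master; from then on, new write commands are streamed asynchronously.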
48. Security with Redis
Basic unencrypted AUTH command
rename-command FLUSHDB ""
rename-command FLUSHALL ""
rename-command CONFIG ""
….
Designed for trusted clients in trusted environments
50. Use Multiple (Aggregate) Argument Commands
SET MSET
GET MGET
LINDEX LRANGE
HGET HMGET
HSET HMSET
51. Pipelining
Aggregate multiple commands together without waiting for replies
Client Server
Cmd1
Response1
Cmd2
Response2
Cmd3
Response3
Client Server
{Cmd1,
Cmd2,
Cmd3}
Execute
Response1,2,3
Non-Pipelined Pipelined
Pipelined commands queued on client
Pipelined results consume memory on server until all commands completed
Note: Pipelines are NOT transactional or atomic
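The gain from pipelining is the number of network round trips saved. The toy below counts them; `FakeConnection` is a hypothetical stand-in for a client connection and supports only SET and GET.

```python
class FakeConnection:
    """Counts round trips; each send() waits once for all replies of the batch."""

    def __init__(self):
        self.round_trips = 0
        self._store = {}

    def send(self, commands):
        self.round_trips += 1
        return [self._execute(command) for command in commands]

    def _execute(self, command):
        name, *args = command
        if name == "SET":
            self._store[args[0]] = args[1]
            return "OK"
        if name == "GET":
            return self._store.get(args[0])
        raise ValueError("unsupported command: %s" % name)

commands = [("SET", "a", "1"), ("SET", "b", "2"), ("GET", "a")]

conn = FakeConnection()
for command in commands:
    conn.send([command])             # non-pipelined: one round trip per command
unpipelined_trips = conn.round_trips

conn = FakeConnection()
replies = conn.send(commands)        # pipelined: one round trip for the batch
pipelined_trips = conn.round_trips
```

Three round trips become one. The commands still execute one by one on the server, and other clients' commands may run in between, which is why a pipeline is not a transaction.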
52. Sorted Sets - O(log(N)+M)
● ZRANGE key start stop
● ZRANGEBYSCORE
key min max
Strings - O(1)
● GET key
● SET key value
● EXISTS key
● DEL key
Hashes - O(1)
● HGET key field
● HSET key field value
● HEXISTS key field
● HDEL key field
Hashes - O(N)
● HMGET key f1 [f2 …]
● HKEYS key
● HGETALL key
Sets - O(1)
● SADD, SREM, SCARD
key
● SPOP key
Sets - O(N)
● SDIFF key1 key2
● SUNION key1 key2
Sets - O(C)
● SINTER key1 key2
Sorted Sets - O(1)
● ZCARD
Sorted Sets - O(log(N))
● ZADD key score
member
● ZREM key member
● ZRANK key member
Pay attention to Execution Time Complexity
Avoid slow commands - Use the most efficient structure for the task at hand
Do not forget, Redis is mostly Single-Threaded. Pay attention to slow commands!
54. Watch closely Data Structure
Logical Structure vs Physical Structure
Logical
Set of Integers
Physical
IntSet
16
bits
32
bits
64
bits
N<2^15 2^15<N<2^31 N>2^31
● Avoid Small Strings as values
○ One string uses 90 bytes on 64-bit architectures
● Be careful with special encoding
○ E.g. Set
Use memory analyzer tool on RDB file
Column “bytes_saved_if_converted”
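The IntSet diagram can be read as a rule: the whole set is stored with the narrowest integer width that fits its largest member. A small sketch, assuming signed ranges and a hypothetical function name:

```python
def intset_encoding(values):
    """Return the narrowest signed width (in bits) able to hold every value."""
    for bits in (16, 32, 64):
        low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if all(low <= value <= high for value in values):
            return bits
    raise OverflowError("value does not fit in 64 bits")

small = intset_encoding([1, 2, 3])
upgraded = intset_encoding([1, 2, 3, 2 ** 15])
```

A single large member is enough to widen every element: adding 2^15 to a set of small integers moves all of them from 16 to 32 bits.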
56. Additional Metadata per key
DEBUG OBJECT myKey
Value at:0x7fae62a5b4b0 refcount:1 encoding:ziplist
serializedlength:18 lru:2093815 lru_seconds_idle:3
8 bytes per key
57. Use Hash data structures when possible
Many data types are optimized to use less space
account:3391:field1 value1
account:3391:field2 value2
account:3391:field3 value3
account:3391
account:3391:field4 value4
field1 → value1
field2 → value2
field3 → value3
field4 → value4
One pointer
(only 8 bytes for all fields)
A single hash with all the fields
Too many pointers – wasted bytes
(8 bytes each)
When N is small, the usual time for HGET and HSET commands is still O(1)
58. Memory Optimization @Instagram
MEDIA_ID USER_ID
HSET media0 media11 user123
HSET media0 media12 user123
HSET media0 media13 user87
...
HSET media1521 media1521001 user256
HSET media1521 media1521002 user237
...
Max 1000
elements
Max 1000
elements
Use fewer keys to avoid 8 bytes per key
media key
bucket key
user id
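The bucketing rule can be written as a function from a media id to a (bucket key, field) pair. This sketch assumes numeric media ids and a bucket obtained by integer division by 1000, in line with "Max 1000 elements"; the helper names and the dictionary standing in for Redis hashes are hypothetical.

```python
BUCKET_SIZE = 1000   # at most 1000 fields per hash

def bucket_for(media_id):
    """Return (bucket key, field) for a numeric media id."""
    return "media%d" % (media_id // BUCKET_SIZE), "media%d" % media_id

buckets = {}   # bucket key -> {field: user id}, standing in for Redis hashes

def set_owner(media_id, user_id):
    key, field = bucket_for(media_id)      # HSET <bucket key> <field> <user id>
    buckets.setdefault(key, {})[field] = user_id

def get_owner(media_id):
    key, field = bucket_for(media_id)      # HGET <bucket key> <field>
    return buckets.get(key, {}).get(field)

set_owner(11, "user123")
set_owner(12, "user123")
set_owner(1521001, "user256")
```

Three mappings end up in two top-level keys. With millions of media ids the number of keys drops by a factor of up to 1000, and small hashes can stay in their compact encoding.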
59. Named Tuple Strategy Technique
HSET user:123 firstname Gregory
HSET user:123 lastname Boissinot
HSET user:123 age 35
HSET user:124 firstname Janie
HSET user:124 lastname Boissinot
HSET user:124 age 11
[“firstname”, “Gregory”, “lastname”, “BOISSINOT”, “age”, “35”]
user:123 ziplist
[“firstname”, “Janie”, “lastname”, “BOISSINOT”, “age”, “11”]
user:124 ziplist
[“firstname”, “0”, “lastname”, “1”, “age”, “2”]
user:index ziplist
Must be used only if you have > 50 000 objects
[“Gregory”, “BOISSINOT”, “35”]
user:123 ziplist
[“Janie”, “BOISSINOT”, “11”]
user:124 ziplist
Aggregation is done on
application-side
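On the application side, the technique amounts to storing values positionally and keeping the field names once. A minimal sketch, where `FIELDS` plays the role of user:index and the `pack`/`unpack` helper names are hypothetical:

```python
FIELDS = ("firstname", "lastname", "age")   # stored once, like user:index

def pack(record):
    # Keep only the values, in the agreed field order.
    return [str(record[name]) for name in FIELDS]

def unpack(values):
    # Rebuild the named record on the application side.
    return dict(zip(FIELDS, values))

stored = pack({"firstname": "Gregory", "lastname": "BOISSINOT", "age": 35})
restored = unpack(stored)
```

Field names are no longer repeated in every object, which saves memory across many objects. The cost is that every reader must share the same field order, and adding a field requires care.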
61. The missing Redis features
No off-the-shelf REST API
Authentication but no authorization
No queries (but querying data)
For Data Structure
No default TTL per data type
No max entries
No eviction percentage
63. Redis Module
Full Text Search Enhanced JSON Graph Operations Secondary Indexes
Linear Algebra SQL Support Image Processing
N-Dimensions
Queries
• Enables users to go further than Lua scripts for their specific needs
• More data structures, more commands → more use cases
• A Module Hub as a marketplace (redismodules.com)
Composability of Redis
MODULE LOAD <my module>
64. Redis with modules
• Dynamically (server-) loaded libraries
• (will be mostly) written in C
• (nearly) as fast as the Redis core
• Modules let you process the data, compose (call core commands and other
modules) and extend (new commands, new structures)
• Can be created by anyone with a Redis Labs certification process
A new community-driven ecosystem
65. Redis Module API
Exposes three conceptual layers
1. Operation Layer: admin, memory, disk, replication, arguments, replies
2. High-Level Layer: client-like access to core and modules commands
3. Low-Level Layer: (almost) native access to core data structures in memory
Add-ons using Redis API
Available Webinar: http://blog.hackerearth.com/webinar-developing-redis-module
From Itamar Haber – Chief Developer Advocate – Redis Labs
66. Stream Data Type (RCP 11)
• Similar in function to Apache Kafka
TWRITE key <entry>
BACKLOG <count>
TEVICT key <offset>
Each element
has an offset
New
Entry
time
68. Redis in the Cloud
• Redis Labs
• Azure Redis Cache
• Amazon ElastiCache
• Redis To Go
• Open Redis
• Redis Green
69. Redis Labs solution
A version of Redis that runs a combination of RAM
and flash
Option to configure RAM-to-flash ratios
True high availability
Redis Labs Node
Open Source
Zero latency proxy Cluster Manager
REST API
Proprietary
70. Spring Data Redis
• Redis Repositories (~ORM)
• CRUD Methods + Redis Template
• Cache Mapper
• JDK Collection and Atomic Counter with Redis as Backend
• Secondary indexes
• Redis data modeling
• Data optimization is more difficult
Try to make Redis transparent
72. Aerospike, a possible successor...
Memcached | Redis | Aerospike
Key-Value | Key-Value | Key-Value
Only in-memory | In-memory with persistence | In-memory with persistence (tuned for RAM and SSDs)
Multi-threaded | Single-threaded | Multi-threaded
Only Strings | Rich data structures (Strings, Lists, Sets, ZSets, etc.) | Rich data structures too
Sharding | Application-level sharding or client-side sharding | Auto-sharding, auto-balancing
Rich community | Very rich community | Few users
Keep an eye on other products
74. A Multi Utility Tool (1/2)
A very fast in-memory DB
A powerful NoSQL product such as MongoDB, Riak, Cassandra
Not a technology such as Hazelcast
Suitable for collecting large amounts of data in real-time in RAM
However, predicting data variety and volume is indispensable
A first-class citizen store
Use special encoding for small objects
You can’t really guarantee data consistency
Always tricky scenarios
Just add it in your stack
75. A Multi Utility Tool (2/2)
Redis is not optimized for maximum security but for maximum performance and
simplicity
Not well designed for Java clients
The main clients are Python or Ruby clients
Still improving to be/stay the most advanced key-value store
Redis Labs Enterprise in progress
Just add it in your stack