Frank will share the motivation behind 3D XPoint memory, the currently shipping Optane SSD product, the key reasons it is better than NAND-based SSDs, and a few open-source database use cases for Optane SSDs.
Scylla Summit 2017: Intel Optane SSDs as the New Accelerator in Your Data Center
Intel® Optane™ SSDs and Scylla
Providing the Speed of an In-Memory Database with Persistency
Tomer Sandler and Frank Ober
Tomer Sandler
Solution Architect @ ScyllaDB
Frank Ober
Data Center Solution Architect @ Intel®
Agenda
▪ Introduction
▪ Intel® Optane™ SSD DC P4800X
▪ Scylla as an In-Memory Like Solution
▪ How We Knew Optane™ is Going to “Rock”
▪ Setup and Workloads
▪ Results
▪ TCO: Enterprise SSD vs. Intel® Optane™
▪ Summary
Introduction
The Challenge
Providing a solution with the performance of an in-memory database, without compromising on throughput, latency, or data persistence.
How...
Using Scylla and Intel® Optane™ SSD DC P4800X to resolve cold-cache
and data persistence challenges.
Intel® Optane™ SSD DC P4800X
Scylla as an In-Memory Like Solution
Scylla as an In-Memory Like Solution
▪ In-Memory Database Requirements
o Sub-millisecond response time
o High throughput
o Support a large number of concurrent clients
▪ In-Memory Database Challenges
o Cold cache and long warmup times
o Persistency and high availability
o Scalability
o Simplistic data models
Scylla as an In-Memory Like Solution
▪ Scylla provides
o Persistent data storage
o High throughput, low latency data access
o Rich data model capabilities
▪ Scylla scales (and scales...)
▪ Scylla needs very fast storage media to pair with, to ease the latency of fetching and storing information
How We Knew Optane™ is Going to “Rock”
How We Knew Optane™ is Going to “Rock”
▪ We used Diskplorer to measure the drive’s capabilities
o Small wrapper around fio that is used to graph the relationship between concurrency (I/O depth), throughput, and IOps
o Concurrency is the number of parallel operations that a disk or array can sustain. As concurrency increases, latency rises, and IOps gains diminish beyond an optimal point
RandRead test with a 4K buffer:
● Optimal concurrency is ~24
● Throughput: 1.0M IOps
● Latency: 18µs
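The concurrency/throughput/latency relationship Diskplorer graphs follows Little's Law (concurrency = throughput × latency). A quick sanity check on the reported numbers, as an illustration (not part of the original deck):

```python
# Little's Law sanity check on the measured 4K random-read numbers
# (an illustration, not part of the deck).
iops = 1_000_000      # measured throughput: 1.0M IOps
latency_s = 18e-6     # measured latency: 18 microseconds
concurrency = iops * latency_s
print(f"in-flight I/Os at saturation: {concurrency:.0f}")
```

1.0M IOps at 18µs implies roughly 18 I/Os in flight, in the same ballpark as the reported optimal concurrency of ~24; the gap is plausibly extra queueing past the knee of the curve.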
Setup and Workloads
Setup and Workloads
▪ 3 Scylla v2.0 RC servers: 2 x 14 Core CPUs, 128GB DRAM, 2 x Intel®
Optane™ SSD DC P4800X
o CPU: Intel® Xeon® CPU E5-2690 v4 @ 2.60GHz
o Storage: RAID-0 on top of 2 Optane™ drives – total of 750GB per server
o Network: 2 bonded 10Gb Intel® x540 NICs. Bonding type: layer3+4
▪ 3 Client servers: 2 x 14 Core CPUs, 128GB DRAM, using the
cassandra-stress tool with a user profile workload
▪ Set the # of IO queues equal to the # of shards
o /etc/scylla.d/io.conf: SEASTAR_IO="--num-io-queues=54
--max-io-requests=432"
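The io.conf values above relate as follows; this sketch is an illustration of the arithmetic, not official Scylla tuning guidance:

```python
# How the quoted io.conf settings relate (illustration only): one I/O
# queue per Scylla shard, with max-io-requests sized as a multiple of
# the queue count -- here, 8 concurrent requests per queue.
shards = 54                      # shards per server, from the deck
num_io_queues = shards           # one I/O queue per shard
max_io_requests = 432            # value used in io.conf
requests_per_queue = max_io_requests // num_io_queues
print(requests_per_queue)
```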
Setup and Workloads
▪ Cassandra-stress: User defined mode that allows running
performance tests on custom data models, using yaml files for
configuration
▪ Simple K/V schema used to populate ~50% of the storage capacity
▪ Utilizing all of the server’s RAM (128GB), replication factor set to 3
(RF=3), and the consistency level is set to one (CL=ONE)
▪ Tested 1 / 5 / 10 KByte payloads
o Challenge the default 512B sector size
o Max. IOps for each payload, at very low latency for reads
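A cassandra-stress user profile for such a K/V schema might look like the following sketch; all names and sizes here are illustrative assumptions matching the 1 KB payload case, not the deck's actual yaml file:

```yaml
# Hypothetical user profile (names are illustrative, not from the deck):
# a simple K/V table with a 64-byte key and a 1 KB blob value.
keyspace: stress_ks
keyspace_definition: |
  CREATE KEYSPACE stress_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
table: kv
table_definition: |
  CREATE TABLE kv (
    key blob PRIMARY KEY,
    value blob
  );
columnspec:
  - name: key
    size: fixed(64)
  - name: value
    size: fixed(1024)
insert:
  partitions: fixed(1)
queries:
  read1:
    cql: SELECT * FROM kv WHERE key = ?
    fields: samerow
```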
Setup and Workloads
▪ Two scenarios for read tests
o Large working set much larger than the RAM capacity. This scenario lowers the
probability of finding a read partition in Scylla’s cache
o Small working set that will create a higher probability of a partition being
cached in Scylla’s memory
▪ Latency measurements
o Cassandra stress client end-to-end latency results
o Scylla-server side latency results (using `nodetool tablehistograms` command)
Results
Latency Test Results
| Payload Size | Test Case (RF=3) | Total Requests per Sec | cassandra-stress 95% Latency (ms) | Scylla-server 95% Latency (ms) | Disk Throughput per Server (GBps) | Load per Server |
|---|---|---|---|---|---|---|
| 1 KB (key: 64b, blob: 1kb) | Write, 300M partitions (~50% disk space) | Avg: ~196K, Max: 220K | 2.0 | – | Avg: ~1.25, Max: 2.65 | ~65% |
| | Read, large spread (~75% from disk) | 198K | 0.7 | 0.478 | Avg: ~1.65, Max: 2.2 | ~32% |
| | Read, small spread (all in-memory) | 198K | 0.4 | 0.023 | None | ~15% |
| 5 KB (key: 64b, blob: 5kb) | Write, 75M partitions (~54% disk space) | Avg: ~166K, Max: 180K | 2.8 | – | Avg: ~2.75, Max: 4.2 | ~65% |
| | Read, large spread (~75% from disk) | 168K | 0.9 | 0.405 | Avg: ~1.22, Max: 1.84 | ~36% |
| | Read, small spread (all in-memory) | 168K | 0.5 | 0.0405 | None | ~18% |
Latency Test Results
| Payload Size | Test Case (RF=3) | Total Requests per Sec | cassandra-stress 95% Latency (ms) | Scylla-server 95% Latency (ms) | Disk Throughput per Server (GBps) | Load per Server |
|---|---|---|---|---|---|---|
| 10 KB (key: 64b, blob: 10kb) | Write, 36M partitions (~50% disk space) | 120K | 2.45 | – | Avg: ~3.7, Max: 4.5 | ~65% |
| | Read, large spread 1 (75% from disk) | 120K | 1.0 | 0.398 | Avg: ~0.95, Max: 1.72 | ~30% |
| | Read, large spread 2 (75% from disk) | 166K | 1.2 | 0.481 | Avg: ~1.35, Max: 2.27 | ~40% |
| | Read, small spread (all in-memory) | 166K (120K) | 0.6 (0.5) | 0.063 (0.051) | None | ~22% |
Throughput Test Results
| Payload Size | Test Case (RF=1) | Total Requests per Sec | cassandra-stress 95% Latency (ms) | cassandra-stress threads per client | Disk Throughput per Server (GBps) | Load per Server |
|---|---|---|---|---|---|---|
| 128B (key: 64b, blob: 128b) | Write, 600M partitions (~8% disk space) | Avg: ~1.95M, Max: 3.05M | 7.3 | 520 | Avg: ~0.55, Max: 1.12 | ~95% |
| | Read 300M, large spread (~50% from disk) | Avg: ~976K, Max: 1.35M | 2.5 | 120 | Avg: ~2.3, Max: 4.29 | ~94% |
| | Read 600M, large spread (~60% from disk) | Avg: ~771K, Max: 986K | 2.95 | 120 | Avg: ~3.35, Max: 4.53 | ~94% |
| | Read, small spread (all in-memory) | Avg: ~2.19M, Max: 2.21M | 2.6 | 300 | None | ~96% |
▪ 128B payload with RF and CL = ONE
▪ 12 cassandra-stress instances (each instance populating a different range).
▪ Read large spread test ran twice, once on the full range (600M partitions) and once on half the
range (300M partitions)
TCO
TCO: Enterprise SSD vs. Intel® Optane™
Intel® Optane™ provides great latency results, and is also more than 50% cheaper than DRAM or Enterprise SSD configurations
Summary
What Did We Learn?
▪ Scylla’s C++ per-core scaling architecture and unique I/O scheduling can fully utilize your infrastructure’s potential for running high-throughput, low-latency workloads
▪ Intel® Optane™ and Scylla achieve the performance of an all in-memory database
▪ Intel® Optane™ and Scylla resolve the cold-cache and data persistence challenges without compromising on throughput, latency, or performance
▪ Data resides on nonvolatile storage
▪ Scylla server’s 95% write/read latency < 0.5 msec at 165K requests per sec
▪ TCO: 50% cheaper than an all in-memory solution
THANK YOU
Any questions? Please stay in touch:
Tomer@scylladb.com
Frank.Ober@intel.com
Check our blogs:
- Intel Optane Review
- Intel Optane and Scylla