SF CASSANDRA USERS MARCH 2016
CQL PERFORMANCE WITH APACHE
CASSANDRA 3.0
Aaron Morton
@aaronmorton
CEO
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
About The Last Pickle.
Work with clients to deliver and improve Apache Cassandra
based solutions.
Apache Cassandra Committer and DataStax MVPs.
Based in New Zealand, Australia, France & USA.
How We Got Here
Storage Engine 3.0
Write Path
Read Path
How We Got Here
Way back in 2011…
2011
Blog: Cassandra Query Plans
http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html
2012
Talk: Technical Deep Dive -
Query Performance
https://www.youtube.com/watch?v=gomOKhMV0zc
2012
Explain Read & Write
performance in 45 minutes.
Skip Forward to 2016
Blog: Introduction To The
Apache Cassandra 3.x Storage
Engine
http://thelastpickle.com/blog/2016/03/04/introductiont-to-the-apache-cassandra-3-storage-engine.html
Skip Forward to 2016
“Why don’t I do another talk
about Cassandra
performance?”
Skip Forward to 2016
It was a busy 4 years…
Skip Forward to 2016
CQL 3, Collection Types,
UDTs, UDFs, UDAs,
Materialised Views, Triggers,
SASI, …
Skip Forward to 2016
Explain Read & Write
performance in 45 minutes.
So Let's Avoid
CQL 3, Collection Types,
UDTs, UDFs, UDAs,
Materialised Views, Triggers,
SASI, …
How We Got Here
Storage Engine 3.0
Write Path
Read Path
High Level Storage Engine 3.0
Storage Engine 3.0 Files
Data.db
Index.db
Filter.db
Storage Engine 3.0 Files
CompressionInfo.db
Statistics.db
Digest.crc32
CRC.db
Summary.db
TOC.txt
CQL Recap
create table my_table (
    partition_1 text,
    cluster_1 text,
    foo text,
    bar text,
    baz text,
    PRIMARY KEY (partition_1, cluster_1)
);
CQL Recap
WARNING:
FAKE DATA AHEAD
CQL With Thrift Pre 3.0
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
CQL Pre 3.0
Clustering Keys Repeated
Column Names Repeated
Timestamps Repeated
Fixed Width Encoding
No Knowledge Of Row Contents
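To make the repetition concrete, here is a minimal Java sketch that rebuilds the cell names from the listing above for one CQL row. The class and method names are hypothetical, and the `clustering:column` strings mirror only the cassandra-cli display form, not the on-disk composite encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LegacyCellNamesSketch {
    // Builds the cell names shown for one CQL row: a row marker ("clust_a:")
    // followed by one cell per regular column.
    static List<String> cellNames(String clustering, List<String> columns) {
        List<String> names = new ArrayList<>();
        names.add(clustering + ":");              // row marker cell
        for (String column : columns) {
            names.add(clustering + ":" + column); // clustering value repeated per cell
        }
        return names;
    }

    // Characters spent repeating the clustering value across the cells of one row.
    static int repeatedClusteringChars(String clustering, List<String> columns) {
        return clustering.length() * (columns.size() + 1);
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("foo", "bar", "baz");
        System.out.println(cellNames("clust_a", columns));
        System.out.println(repeatedClusteringChars("clust_a", columns));
    }
}
```

Every extra regular column adds another copy of the clustering value and the column name, which is the overhead the 3.0 format sets out to remove.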
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Metadata
Cell Presence
SerializationHeader
For each SSTable*.
Stored in each SSTable.
Held in memory.
SerializationHeader
public class SerializationHeader
{
    private final AbstractType<?> keyType;
    private final List<AbstractType<?>> clusteringTypes;
    private final PartitionColumns columns;
    private final EncodingStats stats;
    …
}
EncodingStats
Collected on the fly by the
Memtable.
EncodingStats
public class EncodingStats
{
    public final long minTimestamp;
    public final int minLocalDeletionTime;
    public final int minTTL;
    …
}
SerializationHeader
public class SerializationHeader
{
    public void writeTimestamp(long timestamp, DataOutputPlus out) throws IOException
    {
        out.writeUnsignedVInt(timestamp - stats.minTimestamp);
    }
    …
}
VIntCoding
public class VIntCoding
{
    public static void writeUnsignedVInt(long value, DataOutput output) throws IOException
    {
        int size = VIntCoding.computeUnsignedVIntSize(value);
        if (size == 1)
        {
            output.write((int) value);
            return;
        }
        output.write(VIntCoding.encodeVInt(value, size), 0, size);
    }
    …
}
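The two snippets above combine delta encoding with variable-length integers. A rough, self-contained Java sketch of why that saves space follows. It assumes a length-prefixed unsigned vint carrying roughly 7 payload bits per byte and capped at 9 bytes. The class name, the size formula and the sample `minTimestamp` are illustrative assumptions and may not match `VIntCoding` byte for byte.

```java
final class DeltaVIntSketch {
    // Bytes needed for an unsigned vint under the assumption above:
    // 1 byte for values below 2^7, 2 bytes below 2^14, and so on, up to 9 bytes.
    static int unsignedVIntSize(long value) {
        int magnitude = Long.numberOfLeadingZeros(value | 1); // "| 1" keeps magnitude <= 63
        return 9 - ((magnitude - 1) / 7);
    }

    // Size of a timestamp written as a delta from the SSTable-wide minimum.
    static int deltaSize(long timestamp, long minTimestamp) {
        return unsignedVIntSize(timestamp - minTimestamp);
    }

    public static void main(String[] args) {
        long minTimestamp = 1_357_000_000_000_000L; // hypothetical EncodingStats.minTimestamp
        long timestamp = 1_357_000_000_739_000L;    // hypothetical cell timestamp (microseconds)
        System.out.println("fixed width bytes: " + Long.BYTES);
        System.out.println("vint bytes, raw:   " + unsignedVIntSize(timestamp));
        System.out.println("vint bytes, delta: " + deltaSize(timestamp, minTimestamp));
    }
}
```

The raw microsecond timestamp barely benefits from the vint on its own. The saving comes from writing only the small difference from the minimum timestamp.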
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Metadata
Cell Presence
Storage Engine 3.0 Data.db
Storage Engine 3.0 Partition Header
Partition Key
Partition Deletion Information
Storage Engine 3.0 Partition Header
Storage Engine 3.0 Row
Clustering Information
Row Level Liveness
Row Level Deletion
Column Presence
Columns
Storage Engine 3.0 Row
Storage Engine 3.0 Clustering Block
Clustering Cell Presence
Clustering Cells
Storage Engine 3.0 Clustering Block
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Cell Metadata
Cell Presence
Aggregated Cell Metadata
Only store Cell Timestamp, TTL, and
Local Deletion Time if different to
the Row.
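One way to picture this is a flags byte that records which metadata a cell inherits from its row. The Java sketch below is only an illustration: the flag names and bit values are hypothetical rather than taken from the Cassandra source, and Local Deletion Time is left out for brevity.

```java
final class CellMetadataSketch {
    // Hypothetical flag bits, for illustration only.
    static final int USE_ROW_TIMESTAMP = 0x01;
    static final int USE_ROW_TTL = 0x02;

    // Sets a flag for each piece of metadata that matches the row-level value.
    static int flags(long cellTimestamp, int cellTtl, long rowTimestamp, int rowTtl) {
        int flags = 0;
        if (cellTimestamp == rowTimestamp) flags |= USE_ROW_TIMESTAMP;
        if (cellTtl == rowTtl) flags |= USE_ROW_TTL;
        return flags;
    }

    // Number of optional metadata fields the cell still has to write itself.
    static int fieldsWritten(int flags) {
        int written = 0;
        if ((flags & USE_ROW_TIMESTAMP) == 0) written++;
        if ((flags & USE_ROW_TTL) == 0) written++;
        return written;
    }

    public static void main(String[] args) {
        int sameAsRow = flags(10L, 0, 10L, 0);
        int newerCell = flags(11L, 0, 10L, 0);
        System.out.println("same as row writes " + fieldsWritten(sameAsRow) + " fields");
        System.out.println("newer cell writes " + fieldsWritten(newerCell) + " fields");
    }
}
```

When a whole row is written in one statement, every cell shares the row's timestamp and TTL, so each cell writes only its flags byte and its value.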
Aggregated Cell Metadata

Simple Cell Component                        Byte Size
Flags                                        1
Optional Cell Timestamp (delta)              varint 1…n
Optional Cell Local Deletion Time (delta)    varint 1…n
Optional Cell TTL (delta)                    varint 1…n
Optional Cell Value                          See Below

Fixed Width Cell Value                       Byte Size
Value                                        1…n

Variable Width Cell Value                    Byte Size
Value Length                                 varint 1…n
Value                                        1…n
Apache Cassandra 3.0 Storage Engine
Storage Engine 3.0 Improvements
Delta Encoding
Variable Int Encoding
Clustering Written Once
Aggregated Cell Metadata
Cell Presence
Cell Presence
The SSTable stores the list of Cells
present in the SSTable.
Each Row stores a bitmap of the Cells present
in that Row, relative to the SSTable's list.
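A small Java sketch of the idea, assuming a set bit means "present" and at most 64 columns. The class and method names are hypothetical, and the real on-disk encoding may differ (for example in bit polarity or in how wide tables are handled).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CellPresenceSketch {
    // Bit i is set when column i of the SSTable-level column list has a cell in this row.
    static long presenceBitmap(List<String> sstableColumns, Set<String> rowColumns) {
        if (sstableColumns.size() > 64) {
            throw new IllegalArgumentException("sketch supports at most 64 columns");
        }
        long bitmap = 0L;
        for (int i = 0; i < sstableColumns.size(); i++) {
            if (rowColumns.contains(sstableColumns.get(i))) {
                bitmap |= 1L << i;
            }
        }
        return bitmap;
    }

    // Recovers the column names for a row from its bitmap and the SSTable-level list.
    static List<String> columnsIn(long bitmap, List<String> sstableColumns) {
        List<String> present = new ArrayList<>();
        for (int i = 0; i < sstableColumns.size(); i++) {
            if ((bitmap & (1L << i)) != 0) {
                present.add(sstableColumns.get(i));
            }
        }
        return present;
    }

    public static void main(String[] args) {
        List<String> sstableColumns = Arrays.asList("bar", "baz", "foo");
        Set<String> row = new HashSet<>(Arrays.asList("foo", "bar"));
        long bitmap = presenceBitmap(sstableColumns, row);
        System.out.println(Long.toBinaryString(bitmap));
        System.out.println(columnsIn(bitmap, sstableColumns));
    }
}
```

The column names are written once per SSTable, and each row pays only a few bits to say which of them it contains.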
Storage Engine 3.0 Row
Remember Where We Came From
[default@dev] list my_table;
-------------------
RowKey: part_a
=> (column=clust_a:, value=, timestamp=1357…739000)
=> (column=clust_a:foo, value=some foo, timestamp=1357…739000)
=> (column=clust_a:bar, value=and bar, timestamp=1357…739000)
=> (column=clust_a:baz, value=no baz, timestamp=1357…739000)
=> (column=clust_b:, value=, timestamp=1357…739000)
=> (column=clust_b:foo, value=no foo, timestamp=1357…739000)
=> (column=clust_b:bar, value=no bar, timestamp=1357…739000)
=> (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
How We Got Here
Storage Engine 3.0
Write Path
Read Path
Write Path
Commit Log
Merge Into Memtable
Commit Log
Allocate space in the current
commit log segment.
Allocate Segment
o.a.c.m.CommitLog.WaitingOnSegmentAllocation.95thPercentile
Merge Into Memtable
Find the Partition.
Loop trying to update the
Rows in it using CAS.
Merge Into Memtable
If more than 10MB of allocations
are wasted, move to
pessimistic locking on the
Partition object.
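The optimistic loop and the fallback to locking can be sketched in Java as below. This is a simplified illustration, not the Cassandra implementation: the class name, the copy-on-write `TreeMap` and the per-row size estimate are assumptions, and only the 10MB threshold comes from the slide.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

final class CasMergeSketch {
    static final long WASTE_LIMIT_BYTES = 10L * 1024 * 1024; // threshold from the slide
    static final long ASSUMED_BYTES_PER_ROW = 64L;           // hypothetical size estimate

    private final AtomicReference<Map<String, String>> rows =
            new AtomicReference<>(Collections.<String, String>emptyMap());
    private final AtomicLong wastedBytes = new AtomicLong();
    private final Object lock = new Object();

    // Merges one row; contended partitions serialise writers before the CAS loop.
    void merge(String clustering, String value) {
        if (wastedBytes.get() > WASTE_LIMIT_BYTES) {
            synchronized (lock) {
                casLoop(clustering, value);
            }
        } else {
            casLoop(clustering, value);
        }
    }

    // Copy the current rows, apply the update, and try to swap the copy in.
    private void casLoop(String clustering, String value) {
        while (true) {
            Map<String, String> current = rows.get();
            Map<String, String> updated = new TreeMap<>(current);
            updated.put(clustering, value);
            if (rows.compareAndSet(current, Collections.unmodifiableMap(updated))) {
                return;
            }
            // Another writer won the race: the copy just built is wasted allocation.
            wastedBytes.addAndGet(updated.size() * ASSUMED_BYTES_PER_ROW);
        }
    }

    Map<String, String> snapshot() {
        return rows.get();
    }

    long wastedBytes() {
        return wastedBytes.get();
    }

    public static void main(String[] args) {
        CasMergeSketch partition = new CasMergeSketch();
        partition.merge("clust_a", "some foo");
        partition.merge("clust_b", "no foo");
        System.out.println(partition.snapshot());
    }
}
```

Without contention the CAS succeeds on the first attempt and nothing is wasted. The lock only comes into play once failed attempts have thrown away enough memory to make retries more expensive than waiting.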
How We Got Here
Storage Engine 3.0
Write Path
Read Path
Read Paths
Ignoring Index Read paths.
Read Commands
PartitionRangeReadCommand
SinglePartitionReadCommand
AbstractClusteringIndexFilter
ClusteringIndexNamesFilter
(When we know the column names.)
ClusteringIndexSliceFilter
(When we do not know the column names.)
ClusteringIndexNamesFilter
When we know what
Columns to select, we know
when the search is over.
ClusteringIndexNamesFilter
1. Get Partition From Memtables.
2. Filter named columns into a temporary
result.
3. Select SSTables that may contain Partition
Key.
4. Order SSTables by max timestamp, descending.
5. Read from SSTables in order.
Names Filter Short Circuits
If result has a Partition Deletion
newer than next SSTable max
timestamp.
Stop Search.
Names Filter Short Circuits
If read all Columns and max
timestamp of next SSTable less than
selected Columns min timestamp.
Stop Search.
Names Filter Short Circuits
Note: list of Columns
remaining to select is pruned
after every SSTable is read
based on max timestamp.
Names Filter Short Circuits
If search clustering value not within
clustering range in the SSTable.
Skip SSTable.
Names Filter Short Circuits
If SSTable Cell not in search set.
Skip reading value.
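The names filter steps and the "read all Columns" short circuit can be sketched in Java as follows. It is a simplified model under stated assumptions: the `SSTable`, `Cell` and `Result` classes are hypothetical stand-ins, and the Memtable step, partition deletions and clustering ranges are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class NamesFilterSketch {
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    static final class SSTable {
        final long maxTimestamp;
        final Map<String, Cell> cells;
        SSTable(long maxTimestamp, Map<String, Cell> cells) {
            this.maxTimestamp = maxTimestamp;
            this.cells = cells;
        }
    }

    static final class Result {
        final Map<String, Cell> cells = new HashMap<>();
        int sstablesRead = 0;
    }

    static Result read(List<SSTable> sstables, Set<String> wanted) {
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());
        Result result = new Result();
        for (SSTable sstable : ordered) {
            // Short circuit: every column is found and this SSTable holds nothing newer.
            if (result.cells.keySet().containsAll(wanted)
                    && sstable.maxTimestamp < minTimestamp(result.cells)) {
                break;
            }
            result.sstablesRead++;
            for (String name : wanted) { // cells outside the search set are never looked at
                Cell candidate = sstable.cells.get(name);
                Cell existing = result.cells.get(name);
                if (candidate != null
                        && (existing == null || candidate.timestamp > existing.timestamp)) {
                    result.cells.put(name, candidate);
                }
            }
        }
        return result;
    }

    static long minTimestamp(Map<String, Cell> cells) {
        long min = Long.MAX_VALUE;
        for (Cell cell : cells.values()) {
            min = Math.min(min, cell.timestamp);
        }
        return min;
    }

    // Fake data: the newest SSTable has both wanted columns, so the older one is never read.
    static Result demo() {
        Map<String, Cell> newest = new HashMap<>();
        newest.put("foo", new Cell("some foo", 300L));
        newest.put("bar", new Cell("and bar", 290L));
        Map<String, Cell> older = new HashMap<>();
        older.put("foo", new Cell("old foo", 200L));
        List<SSTable> sstables = Arrays.asList(new SSTable(200L, older), new SSTable(300L, newest));
        return read(sstables, new HashSet<>(Arrays.asList("foo", "bar")));
    }

    public static void main(String[] args) {
        Result result = demo();
        System.out.println("sstables read: " + result.sstablesRead);
        System.out.println("foo = " + result.cells.get("foo").value);
    }
}
```

Because the set of wanted columns is known up front, the read can stop as soon as no remaining SSTable could contain a newer value for any of them.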
ClusteringIndexSliceFilter
When we do not know which
columns to select, the search
ends when it is exhausted.
ClusteringIndexSliceFilter
Used with:
Distinct.
Not all clustering columns
restricted.
ClusteringIndexSliceFilter
1. Get Partition From Memtables.
2. Create Iterators for Partitions.
3. Select SSTables that may contain Partition
Key.
4. Order SSTables by max timestamp, descending.
5. Create Iterators for SSTables in order.
Slice Filter Short Circuits
If SSTable max timestamp is before
max seen Partition Deletion
timestamp.
Stop Search.
Slice Filter Short Circuits
If search clustering value not within
clustering range in the SSTable.
Skip SSTable.
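The slice filter selection can be sketched the same way. This Java example only decides which SSTables would get an iterator. The `SSTable` fields are hypothetical metadata, clustering values are compared as plain strings, and it assumes the partition deletion timestamp is available without reading the data file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SliceFilterSketch {
    static final long NO_DELETION = Long.MIN_VALUE;

    static final class SSTable {
        final String name;
        final long maxTimestamp;
        final long partitionDeletion; // NO_DELETION when the partition was not deleted
        final String minClustering;
        final String maxClustering;
        SSTable(String name, long maxTimestamp, long partitionDeletion,
                String minClustering, String maxClustering) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
            this.partitionDeletion = partitionDeletion;
            this.minClustering = minClustering;
            this.maxClustering = maxClustering;
        }
    }

    // Returns the names of the SSTables that would need an iterator for the slice.
    static List<String> select(List<SSTable> sstables, String sliceStart, String sliceEnd) {
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());
        long maxDeletion = NO_DELETION;
        List<String> selected = new ArrayList<>();
        for (SSTable sstable : ordered) {
            // Stop: everything in this and all older SSTables is shadowed by a deletion.
            if (sstable.maxTimestamp < maxDeletion) {
                break;
            }
            maxDeletion = Math.max(maxDeletion, sstable.partitionDeletion);
            // Skip: the SSTable's clustering range does not overlap the slice.
            boolean overlaps = sstable.minClustering.compareTo(sliceEnd) <= 0
                    && sstable.maxClustering.compareTo(sliceStart) >= 0;
            if (overlaps) {
                selected.add(sstable.name);
            }
        }
        return selected;
    }

    // Fake data: s2 is skipped on clustering range, s1 is cut off by the deletion seen in s3.
    static List<String> demo() {
        return select(Arrays.asList(
                new SSTable("s1", 200L, NO_DELETION, "clust_a", "clust_c"),
                new SSTable("s2", 260L, NO_DELETION, "clust_x", "clust_z"),
                new SSTable("s3", 300L, 250L, "clust_a", "clust_c")),
                "clust_a", "clust_b");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Unlike the names filter, there is no "found everything" exit here. Only deletions and clustering ranges can prune the list of SSTables.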
So…
3.x is awesome.
Start using it as soon as
possible.
Thanks.
Aaron Morton
@aaronmorton
Co-Founder & Principal Consultant
www.thelastpickle.com


"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.X

  • 1. SF CASSANDRA USERS MARCH 2016 CQL PERFORMANCE WITH APACHE CASSANDRA 3.0 Aaron Morton @aaronmorton CEO Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  • 2. About The Last Pickle. Work with clients to deliver and improve Apache Cassandra based solutions. Apache Cassandra Committer and DataStax MVPs. Based in New Zealand, Australia, France & USA.
  • 3. How We Got Here Storage Engine 3.0 Write Path Read Path
  • 4. How We Got Here Way back in 2011…
  • 5. 2011 Blog: Cassandra Query Plans http://thelastpickle.com/blog/2011/07/04/Cassandra-Query-Plans.html
  • 6. 2012 Talk: Technical Deep Dive - Query Performance https://www.youtube.com/watch?v=gomOKhMV0zc
  • 7. 2012 Explain Read & Write performance in 45 minutes.
  • 8. Skip Forward to 2016 Blog: Introduction To The Apache Cassandra 3.x Storage Engine http://thelastpickle.com/blog/2016/03/04/introductiont-to-the-apache-cassandra-3-storage-engine.html
  • 9. Skip Forward to 2016 “Why don’t I do another talk about Cassandra performance.”
  • 10. Skip Forward to 2016 It was a busy 4 years…
  • 11. Skip Forward to 2016 CQL 3, Collection Types, UDTs, UDFs, UDAs, Materialised Views, Triggers, SASI, …
  • 12. Skip Forward to 2016 Explain Read & Write performance in 45 minutes.
  • 13. So Let's Avoid CQL 3, Collection Types, UDTs, UDFs, UDAs, Materialised Views, Triggers, SASI, …
  • 14. How We Got Here Storage Engine 3.0 Write Path Read Path
  • 15. High Level Storage Engine 3.0
  • 16. Storage Engine 3.0 Files Data.db Index.db Filter.db
  • 17. Storage Engine 3.0 Files CompressionInfo.db Statistics.db Digest.crc32 CRC.db Summary.db TOC.txt
  • 18. CQL Recap
        create table my_table (
            partition_1 text,
            cluster_1 text,
            foo text,
            bar text,
            baz text,
            PRIMARY KEY (partition_1, cluster_1)
        );
  • 20. CQL With Thrift Pre 3.0
        [default@dev] list my_table;
        -------------------
        RowKey: part_a
        => (column=clust_a:, value=, timestamp=1357…739000)
        => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
        => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
        => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
        => (column=clust_b:, value=, timestamp=1357…739000)
        => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
        => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
        => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
  • 21. CQL Pre 3.0 Clustering Keys Repeated Column Names Repeated Timestamps Repeated Fixed Width Encoding No Knowledge Of Row Contents
  • 22. Storage Engine 3.0 Improvements Delta Encoding Variable Int Encoding Clustering Written Once Aggregated Metadata Cell Presence
  • 23. SerializationHeader For each SSTable*. Stored in each SSTable. Held in memory.
  • 24. SerializationHeader
        public class SerializationHeader {
            private final AbstractType<?> keyType;
            private final List<AbstractType<?>> clusteringTypes;
            private final PartitionColumns columns;
            private final EncodingStats stats;
            …
        }
  • 25. EncodingStats Collected on the fly by the Memtable.
  • 26. EncodingStats
        public class EncodingStats {
            public final long minTimestamp;
            public final int minLocalDeletionTime;
            public final int minTTL;
            …
        }
  • 27. SerializationHeader
        public class SerializationHeader {
            // Timestamps are written as a delta from the minimum timestamp in the stats.
            public void writeTimestamp(long timestamp, DataOutputPlus out) throws IOException {
                out.writeUnsignedVInt(timestamp - stats.minTimestamp);
            }
            …
        }
  • 28. VIntCoding
        public class VIntCoding {
            public static void writeUnsignedVInt(long value, DataOutput output) throws IOException {
                int size = VIntCoding.computeUnsignedVIntSize(value);
                // Values that fit in a single byte are written directly.
                if (size == 1) {
                    output.write((int) value);
                    return;
                }
                output.write(VIntCoding.encodeVInt(value, size), 0, size);
            }
            …
        }
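The effect of combining delta encoding with variable-length integers can be sketched as follows. This is only an illustration: it uses a simple 7-bits-per-byte varint (LEB128 style), which is an assumption and not the wire format of Cassandra's `VIntCoding`. The class name, method names, and sample timestamps are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: delta encoding plus a simple LEB128-style varint.
// This is NOT Cassandra's VIntCoding format; it only shows why deltas are small.
final class DeltaVIntSketch {

    // Writes the value 7 bits at a time, low bits first.
    // The high bit of each byte signals that more bytes follow.
    static byte[] writeUnsignedVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Reverses writeUnsignedVarint for a buffer holding exactly one varint.
    static long readUnsignedVarint(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }

    // Stores only the distance from the smallest timestamp seen (the "stats" value).
    static byte[] encodeTimestamp(long timestamp, long minTimestamp) {
        return writeUnsignedVarint(timestamp - minTimestamp);
    }

    static long decodeTimestamp(byte[] encoded, long minTimestamp) {
        return minTimestamp + readUnsignedVarint(encoded);
    }
}
```

With a made-up microsecond timestamp such as `1_457_000_000_000_000L`, the raw value needs 8 bytes in this varint scheme, while a delta of 1000 from the minimum needs 2 bytes and a delta of 0 needs 1.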
  • 29. Storage Engine 3.0 Improvements Delta Encoding Variable Int Encoding Clustering Written Once Aggregated Metadata Cell Presence
  • 30. CQL With Thrift Pre 3.0
        [default@dev] list my_table;
        -------------------
        RowKey: part_a
        => (column=clust_a:, value=, timestamp=1357…739000)
        => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
        => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
        => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
        => (column=clust_b:, value=, timestamp=1357…739000)
        => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
        => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
        => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
  • 32. Storage Engine 3.0 Partition Header Partition Key Partition Deletion Information
  • 33. Storage Engine 3.0 Partition Header
  • 34. Storage Engine 3.0 Row Clustering Information Row Level Liveness Row Level Deletion Column Presence Columns
  • 36. Storage Engine 3.0 Clustering Block Clustering Cell Presence Clustering Cells
  • 37. Storage Engine 3.0 Clustering Block
  • 38. Storage Engine 3.0 Improvements Delta Encoding Variable Int Encoding Clustering Written Once Aggregated Cell Metadata Cell Presence
  • 39. CQL With Thrift Pre 3.0
        [default@dev] list my_table;
        -------------------
        RowKey: part_a
        => (column=clust_a:, value=, timestamp=1357…739000)
        => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
        => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
        => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
        => (column=clust_b:, value=, timestamp=1357…739000)
        => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
        => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
        => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
  • 40. Aggregated Cell Metadata Only store Cell Timestamp, TTL, and Local Deletion Time if different to the Row.
  • 41. Aggregated Cell Metadata
        Simple Cell
          Component                                          Byte Size
          Flags                                              1
          Optional Cell Timestamp (delta) varint             1…n
          Optional Cell Local Deletion Time (delta) varint   1…n
          Optional Cell TTL (delta) varint                   1…n
          Optional Cell Value                                See Below
        Fixed Width Cell Value                               Byte Size
          Value                                              1…n
        Variable Width Cell Value                            Byte Size
          Value Length varint                                1…n
          Value                                              1…n
        Apache Cassandra 3.0 Storage Engine
  • 42. Storage Engine 3.0 Improvements Delta Encoding Variable Int Encoding Clustering Written Once Aggregated Cell Metadata Cell Presence
  • 43. Cell Presence The SSTable stores the list of Cells in this SSTable. Each Row stores a bitmap of the Cells in this Row, with reference to the SSTable list.
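A rough sketch of the cell presence idea: the SSTable keeps one ordered list of column names, and each row keeps a bitmap saying which of those columns it holds. The class and method names are hypothetical, and the single `long` bitmap (at most 64 columns) is a simplifying assumption, not the on-disk encoding.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-row cell presence against an SSTable-level column list.
final class CellPresenceSketch {

    // Bit i is set when the row contains the i-th column of the SSTable's column list.
    static long presenceBitmap(List<String> sstableColumns, List<String> rowColumns) {
        if (sstableColumns.size() > 64) {
            throw new IllegalArgumentException("sketch supports at most 64 columns");
        }
        long bitmap = 0L;
        for (int i = 0; i < sstableColumns.size(); i++) {
            if (rowColumns.contains(sstableColumns.get(i))) {
                bitmap |= 1L << i;
            }
        }
        return bitmap;
    }

    // Recovers the row's column names from the bitmap and the shared column list.
    static List<String> columnsInRow(List<String> sstableColumns, long bitmap) {
        List<String> present = new ArrayList<>();
        for (int i = 0; i < sstableColumns.size(); i++) {
            if ((bitmap & (1L << i)) != 0) {
                present.add(sstableColumns.get(i));
            }
        }
        return present;
    }
}
```

The row never repeats the column names; it only carries the bitmap, and the names are resolved through the list held once per SSTable.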
  • 45. Remember Where We Came From
        [default@dev] list my_table;
        -------------------
        RowKey: part_a
        => (column=clust_a:, value=, timestamp=1357…739000)
        => (column=clust_a:foo, value=some foo, timestamp=1357…739000)
        => (column=clust_a:bar, value=and bar, timestamp=1357…739000)
        => (column=clust_a:baz, value=no baz, timestamp=1357…739000)
        => (column=clust_b:, value=, timestamp=1357…739000)
        => (column=clust_b:foo, value=no foo, timestamp=1357…739000)
        => (column=clust_b:bar, value=no bar, timestamp=1357…739000)
        => (column=clust_b:baz, value=lots baz, timestamp=1357…739000)
  • 46. How We Got Here Storage Engine 3.0 Write Path Read Path
  • 48. Commit Log Allocate space in the current commit log segment.
  • 50. Merge Into Memtable Find the Partition. Loop trying to update the Rows in it using CAS.
  • 51. Merge Into Memtable If more than 10MB of allocations have been wasted, move to pessimistic locking on the Partition object.
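The optimistic CAS loop with a pessimistic fallback could look roughly like the sketch below. It assumes an immutable map snapshot swapped through an `AtomicReference`; the class name, the way wasted bytes are counted, and the use of a `ReentrantLock` are all assumptions for illustration, not the Memtable implementation.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: copy-on-write rows updated with compare-and-set,
// switching to a lock once too much copying work has been thrown away.
final class PartitionSketch {
    static final long WASTE_LIMIT_BYTES = 10L * 1024 * 1024; // the 10MB from the slide

    private final AtomicReference<Map<String, String>> rows =
            new AtomicReference<>(Collections.<String, String>emptyMap());
    private final AtomicLong wastedBytes = new AtomicLong();
    private final ReentrantLock lock = new ReentrantLock();

    void upsert(String clustering, String value) {
        // Once contention has wasted enough allocations, writers queue on the lock.
        boolean pessimistic = wastedBytes.get() > WASTE_LIMIT_BYTES;
        if (pessimistic) {
            lock.lock();
        }
        try {
            while (true) {
                Map<String, String> current = rows.get();
                Map<String, String> updated = new TreeMap<>(current);
                updated.put(clustering, value);
                if (rows.compareAndSet(current, Collections.unmodifiableMap(updated))) {
                    return; // our copy won the race
                }
                // Another writer got in first; the copy we built is wasted work.
                // Counting key and value lengths is a crude stand-in for real allocation size.
                wastedBytes.addAndGet(clustering.length() + value.length());
            }
        } finally {
            if (pessimistic) {
                lock.unlock();
            }
        }
    }

    Map<String, String> snapshot() {
        return rows.get();
    }
}
```

The CAS loop stays in place even when the lock is held, because writers that started before the threshold was crossed may still be racing without the lock.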
  • 52. How We Got Here Storage Engine 3.0 Write Path Read Path
  • 55. AbstractClusteringIndexFilter ClusteringIndexNamesFilter (When we know the column names.) ClusteringIndexSliceFilter (When we do not know the column names.)
  • 56. ClusteringIndexNamesFilter When we know what Columns to select, we know when the search is over.
  • 57. ClusteringIndexNamesFilter 1. Get Partition From Memtables. 2. Filter named columns into a temporary result. 3. Select SSTables that may contain Partition Key. 4. Order in descending timestamp order. 5. Read from SSTables in order.
  • 58. Names Filter Short Circuits If the result has a Partition Deletion newer than the next SSTable's max timestamp, stop the search.
  • 59. Names Filter Short Circuits If all Columns have been read and the next SSTable's max timestamp is less than the selected Columns' min timestamp, stop the search.
  • 60. Names Filter Short Circuits Note: the list of Columns remaining to select is pruned, based on max timestamp, after every SSTable is read.
  • 61. Names Filter Short Circuits If the search clustering value is not within the clustering range in the SSTable, skip the SSTable.
  • 62. Names Filter Short Circuits If an SSTable Cell is not in the search set, skip reading its value.
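The names filter steps can be sketched in a few lines. The sketch below covers only the ordering by max timestamp and the "all columns read and next SSTable is older" short circuit; Memtables, partition deletions, clustering ranges, and the pruning of remaining columns are left out. All class, field, and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a names-filter read across SSTables.
final class NamesFilterSketch {
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    static final class SSTable {
        final Map<String, Cell> cells = new HashMap<>();
        long maxTimestamp = Long.MIN_VALUE;
    }

    static final class ReadResult {
        final Map<String, Cell> cells = new HashMap<>();
        int sstablesRead = 0;
    }

    // Test helper: every listed column gets a cell written at the given timestamp.
    static SSTable sstableOf(long timestamp, String... columns) {
        SSTable sstable = new SSTable();
        for (String column : columns) {
            sstable.cells.put(column, new Cell(column + "@" + timestamp, timestamp));
        }
        sstable.maxTimestamp = timestamp;
        return sstable;
    }

    static ReadResult read(Set<String> wanted, List<SSTable> sstables) {
        // Newest SSTable first, so older SSTables can be ruled out early.
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());

        ReadResult result = new ReadResult();
        for (SSTable sstable : ordered) {
            // Short circuit: every wanted column is found, and nothing in this
            // (or any later) SSTable can be newer than what we already hold.
            if (result.cells.keySet().containsAll(wanted)
                    && sstable.maxTimestamp < minTimestamp(result.cells)) {
                break;
            }
            result.sstablesRead++;
            for (String column : wanted) {
                Cell candidate = sstable.cells.get(column);
                Cell existing = result.cells.get(column);
                if (candidate != null
                        && (existing == null || candidate.timestamp > existing.timestamp)) {
                    result.cells.put(column, candidate);
                }
            }
        }
        return result;
    }

    static long minTimestamp(Map<String, Cell> cells) {
        long min = Long.MAX_VALUE;
        for (Cell cell : cells.values()) {
            min = Math.min(min, cell.timestamp);
        }
        return min;
    }
}
```

With three SSTables written at timestamps 100, 200, and 300, a query for `foo` and `bar` touches only the newest one when it holds both columns; if one column lives only in the oldest SSTable, the search has to keep going.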
  • 63. ClusteringIndexSliceFilter When we do not know which columns to select, the search ends when it is exhausted.
  • 65. ClusteringIndexSliceFilter 1. Get Partition From Memtables. 2. Create Iterators for Partitions. 3. Select SSTables that may contain Partition Key. 4. Order in reverse max timestamp order. 5. Create Iterators for SSTables in order.
  • 66. Slice Filter Short Circuits If the SSTable's max timestamp is before the max seen Partition Deletion timestamp, stop the search.
  • 67. Slice Filter Short Circuits If the search clustering value is not within the clustering range in the SSTable, skip the SSTable.
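The two short circuits listed for the slice filter can be sketched as an SSTable selection loop. This is a simplified illustration: it decides which SSTables would get an iterator, assuming each SSTable exposes its max timestamp, clustering range, and any partition deletion timestamp up front. All names are hypothetical, and clustering values are compared as plain strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of choosing SSTables for a slice read.
final class SliceFilterSketch {
    static final long NO_DELETION = Long.MIN_VALUE;

    static final class SSTable {
        final String name;
        final long maxTimestamp;
        final String minClustering;
        final String maxClustering;
        final long partitionDeletion; // NO_DELETION when the partition is not deleted here

        SSTable(String name, long maxTimestamp, String minClustering,
                String maxClustering, long partitionDeletion) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
            this.minClustering = minClustering;
            this.maxClustering = maxClustering;
            this.partitionDeletion = partitionDeletion;
        }
    }

    static List<String> sstablesToRead(List<SSTable> sstables, String sliceStart, String sliceEnd) {
        // Reverse max timestamp order: newest first.
        List<SSTable> ordered = new ArrayList<>(sstables);
        ordered.sort(Comparator.comparingLong((SSTable s) -> s.maxTimestamp).reversed());

        long maxSeenDeletion = NO_DELETION;
        List<String> selected = new ArrayList<>();
        for (SSTable sstable : ordered) {
            // Stop: everything from here on is older than a partition deletion already seen.
            if (sstable.maxTimestamp < maxSeenDeletion) {
                break;
            }
            // Skip: the requested slice does not overlap this SSTable's clustering range.
            boolean overlaps = sstable.maxClustering.compareTo(sliceStart) >= 0
                    && sstable.minClustering.compareTo(sliceEnd) <= 0;
            if (!overlaps) {
                continue;
            }
            selected.add(sstable.name);
            maxSeenDeletion = Math.max(maxSeenDeletion, sstable.partitionDeletion);
        }
        return selected;
    }
}
```

Unlike the names filter, nothing here knows when the result is complete, so without a partition deletion every overlapping SSTable is selected.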
  • 68. So… 3.x is awesome. Start using it as soon as possible.
  • 70. Aaron Morton @aaronmorton Co-Founder & Principal Consultant www.thelastpickle.com