Cassandra 0.7



Friday, December 10, 2010
Features
                    • Live schema modification
                    • Secondary indexes
                    • Hadoop OutputFormat
                    • (Very) large rows
                            •   up to 2 billion columns

                    • NetworkTopologyStrategy
Operations

                    • efficient Streaming
                    • Per-ColumnFamily settings of memtable
                            thresholds
                    • Much more (optional) metadata about
                            columns



Operations backports
                    •       HH disable (0.6.2)

                    •       compaction priority (0.6.3)

                    •       HH hourly scan (0.6.3)

                    •       JMX metrics for row-level bloom filters (0.6.3)

                    •       Flow control (0.6.4, 5)

                    •       HH paging (0.6.5)

                    •       Dynamic snitch (0.6.5)

                    •       Tombstone removal in minor compaction (0.6.6)

Compatibility

                    • Fully backwards-compatible with 0.6 data
                    • Some Thrift API changes
                            •   String row keys become byte[]

                            •   keyspace is set once per connection

                    • Requires drain + cluster restart
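
Since row keys are raw bytes in 0.7, a client that used String keys under 0.6 has to choose an encoding itself. A minimal JDK-only sketch, assuming UTF-8 is the encoding wanted; the RowKeys class and its methods are illustrative and not part of Cassandra or any client library:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Cassandra API.
final class RowKeys {
    // Encode a String key as bytes; UTF-8 is an assumption of this sketch.
    static byte[] toKey(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    // Decode a key that was written with toKey.
    static String fromKey(byte[] key) {
        return new String(key, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] key = toKey("jbellis");
        System.out.println(key.length + " bytes, decodes to " + fromKey(key));
    }
}
```

Whatever encoding is chosen, every reader and writer of the same ColumnFamily needs to agree on it, because the server no longer interprets the key.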

Features



Live schema changes


                    • Details: http://www.riptano.com/blog/live-schema-updates-cassandra-07




Data model tradeoffs


                    • Twitter: “Fifteen months ago, it took two
                            weeks to perform ALTER TABLE on the
                            statuses [tweets] table.”




A static ColumnFamily




A dynamic ColumnFamily




SELECT * FROM tweets
 WHERE user_id IN (SELECT follower FROM followers WHERE user_id = ?)

[Diagram: followers, timeline, and tweets, linked by "?" marks]
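
The join above is what a relational schema would run on every timeline read. A minimal sketch of the denormalized alternative, where each tweet is copied into a per-user timeline at write time so that a read becomes a single row lookup. In-memory maps stand in for ColumnFamilies, and every class and method name here is hypothetical:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in; the maps play the role of ColumnFamilies.
final class TimelineSketch {
    private final Map<String, Set<String>> followers = new HashMap<>(); // author -> followers
    private final Map<String, List<String>> timeline = new HashMap<>(); // user -> tweets to show

    void follow(String follower, String author) {
        followers.computeIfAbsent(author, k -> new LinkedHashSet<>()).add(follower);
    }

    // Write path: copy the tweet into the timeline row of every follower.
    void tweet(String author, String body) {
        for (String follower : followers.getOrDefault(author, Set.of())) {
            timeline.computeIfAbsent(follower, k -> new ArrayList<>()).add(author + ": " + body);
        }
    }

    // Read path: one row lookup, no join.
    List<String> timelineOf(String user) {
        return timeline.getOrDefault(user, List.of());
    }

    public static void main(String[] args) {
        TimelineSketch t = new TimelineSketch();
        t.follow("alice", "bob");
        t.tweet("bob", "hello");
        System.out.println(t.timelineOf("alice"));
    }
}
```

The trade is more work per write in exchange for cheap reads, which suits a store where writes are inexpensive.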
SuperColumns = full denormalization




A little deeper


                    • http://twissandra.com
                    • http://github.com/jhermes/twissjava



Secondary indexes




A static ColumnFamily




demo time


                    • Reading the slides after the talk? See http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes




Hadoop OutputFormat
job.setOutputFormatClass(ColumnFamilyOutputFormat.class);
ConfigHelper.setOutputColumnFamily(job.getConfiguration(), KS, CF);
...
// Sum the counts for one word and emit them as a single mutation.
public void reduce(Text word, Iterable<IntWritable> values, Context context)
    throws IOException, InterruptedException
{
    int sum = 0;
    for (IntWritable val : values)
        sum += val.get();
    // outputKey and getMutation are not shown on the slide.
    context.write(outputKey, Collections.singletonList(getMutation(word, sum)));
}




Large rows


                    • 0.6: smaller of {2GB, memory limit}
                    • 0.7: in_memory_compaction_limit_in_mb


NetworkTopologyStrategy

                    • RackAwareStrategy is tuned for 3 replicas
                            and 2 data centers
                            •   renamed to OldNetworkTopologyStrategy

                    • NTS allows configuring replicas per data
                            center, per Keyspace
                            •   ignores replication_factor directive
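
A rough sketch of what "replicas per data center" means for placement: walk the ring from the key's position and take nodes until each data center has its configured count. This ignores racks and is not the actual NetworkTopologyStrategy code; the Node class, node names, data center names, and counts are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration only; not Cassandra's NetworkTopologyStrategy.
final class PerDcPlacement {
    static final class Node {
        final String name;
        final String dc;

        Node(String name, String dc) {
            this.name = name;
            this.dc = dc;
        }
    }

    // ring: nodes in token order; start: index of the first node at or after the key's token.
    static List<String> replicas(List<Node> ring, int start, Map<String, Integer> perDc) {
        Map<String, Integer> taken = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < ring.size(); i++) {
            Node n = ring.get((start + i) % ring.size());
            int want = perDc.getOrDefault(n.dc, 0);
            int have = taken.getOrDefault(n.dc, 0);
            if (have < want) {           // this data center still needs a replica
                taken.put(n.dc, have + 1);
                result.add(n.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Node> ring = List.of(
            new Node("n1", "DC1"), new Node("n2", "DC2"), new Node("n3", "DC1"),
            new Node("n4", "DC2"), new Node("n5", "DC1"));
        System.out.println(replicas(ring, 1, Map.of("DC1", 2, "DC2", 1)));
    }
}
```

The total number of replicas is simply the sum of the per-data-center counts, which is why a separate replication_factor has nothing left to say.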



Operations



Efficient Streaming
                    • The following slides show how, in 0.7, only the
                            data portion of the sstables being moved is
                            sent to the new node (it is contiguous on
                            disk, so no random I/O); the new node then
                            rebuilds the indexes, etc.
                    • This minimizes the impact on existing
                            nodes


[Diagram: ring of nodes A, F, L, T, W; range (A-L]]
[Diagram: same ring; range (A-L] split into (A-F] and (F-L]]
[Diagram: same ring; sstable components Data, Index, Filter]
[Diagram: same ring; Index and Filter only]
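
The ring pictures can be read as a range split: before F joins, L owns (A-L]; afterwards F owns (A-F] and L keeps (F-L]. A minimal sketch of that ownership rule using a sorted set, with single-letter tokens as in the diagrams. The RingSketch class and its methods are hypothetical and only model token ownership, not streaming itself:

```java
import java.util.TreeSet;

// Hypothetical sketch: single-letter tokens double as node names.
final class RingSketch {
    private final TreeSet<String> tokens = new TreeSet<>();

    void join(String token) {
        tokens.add(token);
    }

    // A node owns (previous token, its own token]; the lowest token wraps around.
    String rangeOf(String token) {
        String prev = tokens.lower(token);
        if (prev == null) {
            prev = tokens.last();
        }
        return "(" + prev + "-" + token + "]";
    }

    // A key belongs to the first token at or after it, wrapping to the lowest token.
    String ownerOf(String key) {
        String owner = tokens.ceiling(key);
        return owner != null ? owner : tokens.first();
    }

    public static void main(String[] args) {
        RingSketch ring = new RingSketch();
        for (String t : new String[] {"A", "L", "T", "W"}) {
            ring.join(t);
        }
        System.out.println(ring.rangeOf("L") + " before F joins");
        ring.join("F");
        System.out.println(ring.rangeOf("F") + " and " + ring.rangeOf("L") + " after F joins");
    }
}
```

Only the keys that fall in (A-F] change owner, so that is the only data L has to send to F.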
Per-CF memtable thresholds



          • Easier tuning for large numbers of ColumnFamilies




Column Metadata


                    • 0.6: comparator, subcomparator
                    • 0.7: default_validation_class,
                            column_metadata
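
One way to picture the two 0.7 settings: column_metadata attaches a validator to specific named columns, and default_validation_class covers every other column. A small sketch of that lookup, with predicates standing in for validator classes. The ColumnValidation class, the column names, and the 8-byte check are illustrative assumptions, not Cassandra's implementation:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of per-column validation with a default fallback.
final class ColumnValidation {
    private final Predicate<byte[]> defaultValidator;                          // plays default_validation_class
    private final Map<String, Predicate<byte[]>> perColumn = new HashMap<>();  // plays column_metadata

    ColumnValidation(Predicate<byte[]> defaultValidator) {
        this.defaultValidator = defaultValidator;
    }

    void declare(String column, Predicate<byte[]> validator) {
        perColumn.put(column, validator);
    }

    // Use the column's own validator if one was declared, else the default.
    boolean isValid(String column, byte[] value) {
        return perColumn.getOrDefault(column, defaultValidator).test(value);
    }

    public static void main(String[] args) {
        ColumnValidation cf = new ColumnValidation(v -> true);   // default: accept any bytes
        cf.declare("birth_year", v -> v.length == 8);            // assumed: value must be 8 bytes
        System.out.println(cf.isValid("birth_year", new byte[8]));
        System.out.println(cf.isValid("birth_year", new byte[3]));
        System.out.println(cf.isValid("nickname", new byte[3]));
    }
}
```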




Native code


                    • JNA introduced in 0.6.5 for mlockall
                    • Extended to hard links in 0.6.6


Flow Control (0.6.4)
                    • Replica nodes drop hopeless requests on
                            the floor
                            •   Coordinator node is unaffected

                            •   TimedOutException signals client to back off

                            •   Requires enough memory to buffer
                                RPCTimeout’s worth of requests

                    • (In the short term, you’re still screwed)
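
A sketch of the "drop hopeless requests" idea: when a replica gets around to a queued request, it checks how long the request has waited and discards it if the RPC timeout has already passed. The queue, the Request class, and the explicit clock parameter are assumptions of this sketch, not Cassandra internals:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of a replica's request queue.
final class HopelessRequestDropper {
    static final class Request {
        final String id;
        final long enqueuedAtMillis;

        Request(String id, long enqueuedAtMillis) {
            this.id = id;
            this.enqueuedAtMillis = enqueuedAtMillis;
        }
    }

    private final Queue<Request> queue = new ArrayDeque<>();
    private final long rpcTimeoutMillis;
    private int dropped = 0;

    HopelessRequestDropper(long rpcTimeoutMillis) {
        this.rpcTimeoutMillis = rpcTimeoutMillis;
    }

    void enqueue(String id, long nowMillis) {
        queue.add(new Request(id, nowMillis));
    }

    // Drain the queue at time nowMillis and return the ids still worth processing.
    List<String> drain(long nowMillis) {
        List<String> processed = new ArrayList<>();
        Request r;
        while ((r = queue.poll()) != null) {
            if (nowMillis - r.enqueuedAtMillis > rpcTimeoutMillis) {
                dropped++;               // waited longer than the timeout: doing the work is wasted effort
            } else {
                processed.add(r.id);
            }
        }
        return processed;
    }

    int droppedCount() {
        return dropped;
    }

    public static void main(String[] args) {
        HopelessRequestDropper replica = new HopelessRequestDropper(10_000);
        replica.enqueue("old", 0);
        replica.enqueue("fresh", 9_000);
        System.out.println(replica.drain(12_000) + ", dropped " + replica.droppedCount());
    }
}
```

The queue still has to hold everything that arrives within one timeout window, which is the memory requirement the slide mentions.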
Flow control in 0.5


                    • Why backpressure doesn’t fit Cassandra



Dynamic snitch

public void sortByProximity(List<InetAddress> addresses);
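
A sketch of how an implementation of this method might order replicas: keep a latency score per host and sort the list in place, lowest score first. The score map, recordLatency, and the ip helper are hypothetical; the scoring used by the real dynamic snitch is more involved than a single last-seen latency.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical latency-based ordering; not Cassandra's DynamicEndpointSnitch.
final class LatencySnitchSketch {
    private final Map<InetAddress, Double> scores = new HashMap<>();

    void recordLatency(InetAddress host, double millis) {
        scores.put(host, millis);
    }

    // Sorts in place, matching the signature on the slide; hosts with no score sort last.
    public void sortByProximity(List<InetAddress> addresses) {
        addresses.sort(Comparator.comparingDouble(
            (InetAddress a) -> scores.getOrDefault(a, Double.MAX_VALUE)));
    }

    // Test helper: builds 10.0.0.x without any DNS lookup.
    static InetAddress ip(int lastOctet) {
        try {
            return InetAddress.getByAddress(new byte[] {10, 0, 0, (byte) lastOctet});
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LatencySnitchSketch snitch = new LatencySnitchSketch();
        snitch.recordLatency(ip(1), 30.0);
        snitch.recordLatency(ip(2), 5.0);
        List<InetAddress> replicas = new ArrayList<>(List.of(ip(1), ip(2), ip(3)));
        snitch.sortByProximity(replicas);
        System.out.println(replicas);
    }
}
```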




Everything else



0.7 performance
                    • Reads roughly 100% faster, thanks largely to
                            removing String creation
                    • Row-cached reads up to 8x faster after
                            optimizations by tjake and jbellis
                    • Optimizations for reads of large rows
                    • 0.7.1? ~15% improvement everywhere from
                            ByteBuffer optimizations


Thrift: the libpq of Cassandra



                    • OOMs on malformed packets
                    • Python Unicode string issues
                    • PHP support is buggy and maintainerless


Client support from Riptano

                    • Hector
                            •   Building JPA/JDO layer on top

                    • pycassa
                    • phpcassa
                    • Soon: cassandra gem

After 0.7.0

                    • IndexOperator.GT
                    • Triggers / plugins
                    • Entity groups
                    • On-disk data format improvements
                            (Compression, compound keys?)



Summary




Cassandra 0.7, Los Angeles High Scalability Group
