Cassandra 0.7, Los Angeles High Scalability Group

Cassandra 0.7

Friday, December 10, 2010

Features
• Live schema modification
• Secondary indexes
• Hadoop OutputFormat
• (Very) large rows: up to 2 billion columns

• NetworkTopologyStrategy

Operations

• Efficient streaming
• Per-ColumnFamily memtable thresholds
• Much more (optional) metadata about columns


Operations backports
• HH disable (0.6.2)

• compaction priority (0.6.3)

• HH hourly scan (0.6.3)

• JMX metrics for row-level bloom filters (0.6.3)

• Flow control (0.6.4, 5)

• HH paging (0.6.5)

• Dynamic snitch (0.6.5)

• Tombstone removal in minor compaction (0.6.6)


Compatibility

• Fully backwards-compatible with 0.6 data
• Some Thrift API changes
• String row keys become byte[]

• keyspace is set once per connection

• Requires drain + cluster restart


Features


Live schema changes

• Details: http://www.riptano.com/blog/live-schema-updates-cassandra-07


Data model tradeoffs

• Twitter: “Fifteen months ago, it took two weeks to perform ALTER TABLE on the statuses [tweets] table.”


A static ColumnFamily


A dynamic ColumnFamily


SELECT * FROM tweets
WHERE user_id IN (SELECT follower FROM followers WHERE user_id = ?)

[Diagram: followers, timeline, and tweets, with the links between them marked "?"]

SuperColumns = full denormalization


A little deeper

• http://twissandra.com
• http://github.com/jhermes/twissjava


Secondary indexes


demo time

• Reading the slides after the talk? See http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes
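
For readers without the demo, here is a small conceptual sketch of the kind of question a secondary index answers: "which rows have this value in this column?". It is a toy in-memory model, not Cassandra's implementation or API; the class `IndexedColumnFamily` and its methods are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical toy model: rows keyed by row key, plus an index on ONE column
// that maps each column value to the set of row keys holding that value.
class IndexedColumnFamily {
    private final String indexedColumn;
    private final Map<String, Map<String, String>> rows = new HashMap<>();
    private final Map<String, SortedSet<String>> index = new HashMap<>();

    IndexedColumnFamily(String indexedColumn) {
        this.indexedColumn = indexedColumn;
    }

    // Writes a column and keeps the index in step with the row data.
    void insert(String rowKey, String column, String value) {
        Map<String, String> row = rows.computeIfAbsent(rowKey, k -> new HashMap<>());
        String old = row.put(column, value);
        if (!column.equals(indexedColumn)) {
            return;
        }
        if (old != null) {
            // Remove the stale index entry left by the previous value.
            SortedSet<String> keys = index.get(old);
            if (keys != null) {
                keys.remove(rowKey);
                if (keys.isEmpty()) {
                    index.remove(old);
                }
            }
        }
        index.computeIfAbsent(value, v -> new TreeSet<>()).add(rowKey);
    }

    // Answers "indexedColumn == value" without scanning every row.
    SortedSet<String> rowKeysWhereEquals(String value) {
        SortedSet<String> keys = index.get(value);
        return keys == null ? new TreeSet<>() : new TreeSet<>(keys);
    }
}
```

With an index on a hypothetical `state` column, inserting rows for several users and then calling `rowKeysWhereEquals("TX")` returns only the matching row keys, and overwriting a user's `state` moves that user to the new value's entry.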


Hadoop OutputFormat
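job.setOutputFormatClass(ColumnFamilyOutputFormat.class);
ConfigHelper.setOutputColumnFamily(job.getConfiguration(), KS, CF);
...
// Sums the counts for one word and writes the total as a single mutation.
// outputKey and getMutation are defined elsewhere in the job (not shown here).
public void reduce(Text word, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException
{
    int sum = 0;
    for (IntWritable val : values)
        sum += val.get();
    context.write(outputKey, Collections.singletonList(getMutation(word, sum)));
}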


Large rows

• 0.6: smaller of {2GB, memory limit}
• 0.7: in_memory_compaction_limit_in_mb


NetworkTopologyStrategy

• RackAwareStrategy is tuned for 3 replicas and 2 data centers
• renamed to OldNetworkTopologyStrategy

• NTS allows configuring replicas per data center, per Keyspace
• ignores replication_factor directive
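
To make "replicas per data center" concrete, here is a minimal sketch of the idea: the replica count is a map from data center name to a number, and the total is derived from that map rather than from a single replication_factor. This is an illustration only, not Cassandra's code; `ReplicaPlan` and the data center names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of per-data-center replica settings for one keyspace.
class ReplicaPlan {
    private final Map<String, Integer> replicasPerDc = new LinkedHashMap<>();

    // Sets the replica count for one data center; returns this for chaining.
    ReplicaPlan with(String dataCenter, int replicas) {
        if (replicas < 0) {
            throw new IllegalArgumentException("replicas must be >= 0");
        }
        replicasPerDc.put(dataCenter, replicas);
        return this;
    }

    // Data centers that were never configured hold no replicas.
    int replicasIn(String dataCenter) {
        return replicasPerDc.getOrDefault(dataCenter, 0);
    }

    // The total is the sum of the per-DC counts; no single global factor is consulted.
    int totalReplicas() {
        int total = 0;
        for (int n : replicasPerDc.values()) {
            total += n;
        }
        return total;
    }
}
```

For example, `new ReplicaPlan().with("DC1", 3).with("DC2", 2)` describes a keyspace with three replicas in one data center and two in another.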


Operations


Efficient Streaming
• The following slides show how, in 0.7, we send only the data portion of the sstables being moved to a new node (it is contiguous on disk, so no random I/O); the new node then rebuilds the indexes etc. itself
• This minimizes the impact on existing nodes
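
The "receiver rebuilds the index" step can be sketched as a single sequential pass over the received data that records where each row starts. This is a simplified illustration, not Cassandra's sstable format: the `Row` type, its size calculation, and `rebuildIndex` are assumptions made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: rebuild a key -> offset index from streamed row data.
class SSTableRebuildSketch {
    static final class Row {
        final String key;
        final byte[] value;

        Row(String key, byte[] value) {
            this.key = key;
            this.value = value;
        }

        // Assumed layout for illustration: key bytes followed by value bytes.
        long serializedSize() {
            return key.getBytes(StandardCharsets.UTF_8).length + value.length;
        }
    }

    // One sequential scan in on-disk order; no random I/O is needed.
    static TreeMap<String, Long> rebuildIndex(List<Row> dataInDiskOrder) {
        TreeMap<String, Long> index = new TreeMap<>();
        long offset = 0;
        for (Row row : dataInDiskOrder) {
            index.put(row.key, offset);
            offset += row.serializedSize();
        }
        return index;
    }
}
```

The sending node only has to read its data file front to back; the cost of producing the index lands on the node that is joining.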


[Diagrams: a ring of nodes A, F, L, T, W; the range (A-L] is split into (A-F] and (F-L]; sstable components labeled Data, Index, Filter]

Per-CF memtable thresholds

• Easier tuning for large numbers of ColumnFamilies


Column Metadata

• 0.6: comparator, subcomparator
• 0.7: default_validation_class, column_metadata


Native code

• JNA introduced in 0.6.5 for mlockall
• Extended to hard links in 0.6.6


Flow Control (0.6.4)
• Replica nodes drop hopeless requests on the floor
• Coordinator node is unaffected

• TimedOutException signals client to back off

• Requires enough memory to buffer RPCTimeout’s worth of requests

• (In the short term, you’re still screwed)
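
A minimal sketch of "drop hopeless requests", assuming a replica keeps a queue of pending requests and knows the RPC timeout: when a request is dequeued after the timeout has already passed, the coordinator has given up on it, so the replica skips it instead of doing the work. `DroppingStage` and its methods are hypothetical names, not Cassandra classes.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical replica-side queue that discards requests older than the RPC timeout.
class DroppingStage {
    private static final class Request {
        final String name;
        final long enqueuedAtMillis;

        Request(String name, long enqueuedAtMillis) {
            this.name = name;
            this.enqueuedAtMillis = enqueuedAtMillis;
        }
    }

    private final Queue<Request> queue = new ArrayDeque<>();
    private final long rpcTimeoutMillis;
    private int dropped = 0;

    DroppingStage(long rpcTimeoutMillis) {
        this.rpcTimeoutMillis = rpcTimeoutMillis;
    }

    void enqueue(String name, long nowMillis) {
        queue.add(new Request(name, nowMillis));
    }

    // Returns the next request still worth processing, or null if the queue is empty.
    // Requests that waited longer than the timeout are counted and skipped.
    String nextLive(long nowMillis) {
        Request r;
        while ((r = queue.poll()) != null) {
            if (nowMillis - r.enqueuedAtMillis > rpcTimeoutMillis) {
                dropped++;
                continue;
            }
            return r.name;
        }
        return null;
    }

    int droppedCount() {
        return dropped;
    }
}
```

The queue still has to hold everything that arrives within one timeout window, which is where the memory requirement on the slide comes from.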

Flow control in 0.5

• Why backpressure doesn’t fit Cassandra


Dynamic snitch

public void sortByProximity(List<InetAddress> addresses);
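
One way to picture a latency-aware implementation of this method: keep a running latency score per host and sort the candidate replicas by it. The sketch below is an assumption-laden illustration, not Cassandra's dynamic snitch; the class name, the moving-average weights, and the treatment of hosts with no samples are all made up for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical snitch that orders hosts by an observed-latency score.
class LatencySnitchSketch {
    private final Map<InetAddress, Double> scores = new HashMap<>();

    // Folds a new latency sample into a moving average (weights are an assumption).
    void recordLatency(InetAddress host, double millis) {
        scores.merge(host, millis, (old, sample) -> 0.75 * old + 0.25 * sample);
    }

    // Sorts in place, lowest score first. Hosts with no samples score 0.0 and so
    // sort first; the sort is stable, so ties keep their original order.
    public void sortByProximity(List<InetAddress> addresses) {
        addresses.sort(Comparator.comparingDouble(
                (InetAddress a) -> scores.getOrDefault(a, 0.0)));
    }

    // Convenience helper for building literal IPv4 addresses without DNS lookups.
    static InetAddress ip(int a, int b, int c, int d) {
        try {
            return InetAddress.getByAddress(new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

After recording a few samples per host, passing a mutable list of replicas to `sortByProximity` moves the hosts that have been answering fastest to the front.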


Everything else


0.7 performance
• Reads roughly 100% faster, thanks largely to removing String creation
• Row-cached reads up to 8x faster after optimizations by tjake and jbellis
• Optimizations for reads of large rows
• 0.7.1? ~15% improvement everywhere from ByteBuffer optimizations


Thrift: the libpq of Cassandra

• OOMs on malformed packets
• Python Unicode string issues
• PHP support is buggy and maintainerless


Client support from Riptano

• Hector
• Building JPA/JDO layer on top

• pycassa
• phpcassa
• Soon: cassandra gem


After 0.7.0

• IndexOperator.GT
• Triggers / plugins
• Entity groups
• On-disk data format improvements (Compression, compound keys?)


Summary

