SlideShare ist ein Scribd-Unternehmen logo
1 von 43
Back to Basics with
CQL3
Matt Overstreet
OpenSource Connections

OpenSource Connections
Outline
•
•
•
•
•

Overview
Architecture
Data Modeling
Good At/Bad At
Using Cassandra

OpenSource Connections
Outline
•
•
•
•
•

Overview
Architecture
Data Modeling
Good At/Bad At
Using Cassandra

• What is Big Data?
• How does Cassandra fit?

OpenSource Connections
What is Big Data?
• The three V’s (and a C)

velocity
volume
Variety
Complexity
OpenSource Connections
What is Big Data
• Brewer’s CAP theorem
o
o
o
o

Consistency - all nodes have same world view
Availability - requests can be serviced
Partition tolerance - network/machine failure
Can’t have all 3 -- Pick 2!

• Examples
o MySQL – Consistent, Available
o HBase – Consistent, Partition Tolerant
o Cassandra – Available, Partition Tolerant
– and “Tunably Consistent”!

OpenSource Connections
What is Big Data?
• Common theme: Denormalize everything!
o What’s that?
• JOIN all the tables in the database...
• … well not all the tables

o Why?
• You can shard database at any point
• All related data is co-located

• What this means for you
o
o
o
o
o

No joins
No transactions - potential for inconsistency
Vastly simplified querying
No data-modeling -- Instead, query-modeling
“Infinite and easy” scaling potential
OpenSource Connections
How Does Cassandra Fit?
• No single point of failure
• Optimized for writes, still good with reads
• Can decide between Consistency and Availably
concerns

OpenSource Connections
Outline
•
•
•
•
•

Overview
Architecture
Data Modeling
Good At/Bad At
Using Cassandra

• Ring architecture
• Data partitioning
o Operations
o Writes
o Reads

OpenSource Connections
Ring Architecture
• No single point of failure
• Nodes talk via gossip
• Democratic - all nodes
are equal

OpenSource Connections
Data Partitioning

Original partitioning method.
OpenSource Connections
Data Partitioning

Flexible partitioning with virtual nodes.
OpenSource Connections
Operations: Writes

Requests sent out to nodes and replicants.
OpenSource Connections
Operations: Reads

Coordinator node reaches out to relevant replicants.
OpenSource Connections
Outline
•
•
•
•
•

Overview
Architecture
Data Modeling
Good At/Bad At
Using Cassandra

•
•
•
•

Internals
Cassandra Query Language
Modeling Strategy
Example

OpenSource Connections
C* Data Model
Keyspace

OpenSource Connections
C* Data Model
Keyspace
Column Family

Column Family

OpenSource Connections
C* Data Model
Keyspace
Column Family

Column Family

OpenSource Connections
C* Data Model
Keyspace
Column Family

Column Family

OpenSource Connections
C* Data Model
Row Key

OpenSource Connections
C* Data Model
Row Key

Column
Column Name

Column Value
(or Tombstone)
Timestamp
Time-to-live

OpenSource Connections
C* Data Model
Row Key

Column
Column Name

Column Value
(or Tombstone)
Timestamp
Time-to-live

● Row Key, Column Name, Column
Value have types
● Column Name has comparator
● RowKey has partitioner
● Rows can have any number of
columns - even in same column family
● Rows can have many columns
● Column Values can be omitted
● Time-to-live is useful!
● Tombstones
OpenSource Connections
C* Data Model: Writes

Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Insert into
MemTable
● Dump to
CommitLog
● No read
● Very Fast!
● Blocks on CPU
before O/I!
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
C* Data Model: Writes

Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Insert into
MemTable
● Dump to
CommitLog
● No read
● Very Fast!
● Blocks on CPU
before O/I!
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
C* Data Model: Writes

Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Insert into
MemTable
● Dump to
CommitLog
● No read
● Very Fast!
● Blocks on CPU
before O/I!
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
C* Data Model:
Reads
Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Get values from Memtable
● Get values from row
cache if present
● Otherwise check bloom
filter to find appropriate
SSTables
● Check Key Cache for fast
SSTable Search
● Get values from SSTables
● Repopulate Row Cache
● Super Fast Col. retrieval
● Fast row slicing
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
C* Data Model:
Reads
Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Get values from Memtable
● Get values from row
cache if present
● Otherwise check bloom
filter to find appropriate
SSTables
● Check Key Cache for fast
SSTable Search
● Get values from SSTables
● Repopulate Row Cache
● Super Fast Col. retrieval
● Fast row slicing
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
C* Data Model:
Reads
Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Get values from Memtable
● Get values from row
cache if present
● Otherwise check bloom
filter to find appropriate
SSTables
● Check Key Cache for fast
SSTable Search
● Get values from SSTables
● Repopulate Row Cache
● Super Fast Col. retrieval
● Fast row slicing
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
C* Data Model:
Reads
Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Get values from Memtable
● Get values from row
cache if present
● Otherwise check bloom
filter to find appropriate
SSTables
● Check Key Cache for fast
SSTable Search
● Get values from SSTables
● Repopulate Row Cache
● Super Fast Col. retrieval
● Fast row slicing
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
C* Data Model:
Reads
Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Get values from Memtable
● Get values from row
cache if present
● Otherwise check bloom
filter to find appropriate
SSTables
● Check Key Cache for fast
SSTable Search
● Get values from SSTables
● Repopulate Row Cache
● Super Fast Col. retrieval
● Fast row slicing
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
C* Data Model:
Reads
Mem
Table

CommitLog
Row
Cache

Bloom
Filter

● Get values from Memtable
● Get values from row
cache if present
● Otherwise check bloom
filter to find appropriate
SSTables
● Check Key Cache for fast
SSTable Search
● Get values from SSTables
● Repopulate Row Cache
● Super Fast Col. retrieval
● Fast row slicing
Key
Cache
Key
Cache
Key
Cache
Key
Cache

SSTable
SSTable
SSTable
SSTable

OpenSource Connections
Internals: Twitter Example
• 4 ColumnFamilies
o
o
o
o

followers
following
tweets
timeline

OpenSource Connections
Internals: Twitter Example
• 4 ColumnFamilies
o
o
o
o

followers
following
tweets
timeline

• Nate follows Patricia
o
o
o
o

SET followers[Patricia][Nate] = „‟;
SET following[Nate][Patricia] = „‟;
storing data in column names (not values)
denormalized, redundant!

• Get all Nate’s followers
o GET followers[Patricia]
o => Nate,Eric,Scott,Matt,Doug,Kate
o No JOIN!

OpenSource Connections
Internals: Twitter Example
• Nate tweets
o SET tweets[Nate][2013-07-19 T 09:20] = “Wonderful morning.
This coffee is great.”
o SET tweets[Nate][2013-07-19 T 09:21] = “Oops, smoke is
coming out of the SQL server!”
o SET tweets[Nate][2013-07-19 T 09:51] = “Now my coffee is
cold :-(”

• Get Nate’s tweets
o GET tweets[Nate]
…(what you’d expect)...

OpenSource Connections
CQL (Cassandra Query Language)
CREATE TABLE users (
id timeuuid PRIMARY KEY,
lastname varchar,
firstname varchar,
dateOfBirth timestamp );

OpenSource Connections
CQL (Cassandra Query Language)
CREATE TABLE users (
id timeuuid PRIMARY KEY,
lastname varchar,
firstname varchar,
dateOfBirth timestamp );
INSERT INTO users (id,lastname, firstname, dateofbirth)
VALUES (now(),'Berryman',‟John','1975-09-15');

OpenSource Connections
CQL (Cassandra Query Language)
CREATE TABLE users (
id timeuuid PRIMARY KEY,
lastname varchar,
firstname varchar,
dateOfBirth timestamp );
INSERT INTO users (id,lastname, firstname, dateofbirth)
VALUES (now(),‟Berryman‟,‟John‟,‟1975-09-15‟);
UPDATE users SET firstname = ‟John‟
WHERE id = f74c0b20-0862-11e3-8cf6-b74c10b01fc6;

OpenSource Connections
CQL (Cassandra Query Language)
CREATE TABLE users (
id timeuuid PRIMARY KEY,
lastname varchar,
firstname varchar,
dateOfBirth timestamp );
INSERT INTO users (id,lastname, firstname, dateofbirth)
VALUES (now(),'Berryman',‟John','1975-09-15');
UPDATE users SET firstname = 'John‟
WHERE id = f74c0b20-0862-11e3-8cf6-b74c10b01fc6;
SELECT dateofbirth,firstname,lastname FROM users ;
dateofbirth
| firstname | lastname
--------------------------+-----------+---------1975-09-15 00:00:00-0400 |
John | Berryman
OpenSource Connections
The CQL/Cassandra Mapping
CREATE TABLE employees (
company text,
name text,
age int,
role text,
PRIMARY KEY (company,name)
);

OpenSource Connections
The CQL/Cassandra Mapping
CREATE TABLE employees (
company text,
name text,
age int,
role text,
PRIMARY KEY (company,name)
);

company | name | age | role
--------+------+-----+----OSC | eric | 38 | ceo
OSC | john | 37 | dev
RKG | anya | 29 | lead
RKG | ben | 27 | dev
RKG | chad | 35 | ops

OpenSource Connections
The CQL/Cassandra Mapping
company | name | age | role
--------+------+-----+----OSC | eric | 38 | ceo
OSC | john | 37 | dev
RKG | anya | 29 | lead
RKG | ben | 27 | dev
RKG | chad | 35 | ops

CREATE TABLE employees (
company text,
name text,
age int,
role text,
PRIMARY KEY (company,name)
);
eric:age
OS
C

eric:role

john:age

john:role

38

dev

37

dev

anya:age
RK
G

anya:role

ben:age

ben:role

chad:age

chad:role

29

lead

27

dev

35

ops

OpenSource Connections
Modeling Strategy
• Don’t think about the data structure
• Do think of the questions you’ll ask
• Consider efficient operations for Cassandra
o
o
o
o

Writing (4K writes per second per core)
Retrieving a row
Retrieving a row slice
Retrieving in natural order (which you control)

• Write the data in the way you will query it
• Disk space is cheap
• Seperate read-heavy and write-heavy task
o Make wise use of caches

OpenSource Connections
Modeling Strategy: Anti-Patterns
• Read-then-write
• Heavy deletes
o Scatters dead columns throughout SSTables
o Won’t be corrected until first compaction after
gc_grace_seconds (10days)

• Distributed queue
• JOIN-like behavior
• Super wide-row sneak attack (>2B columns)

OpenSource Connections
QUESTIONS?

OpenSource Connections

Weitere ähnliche Inhalte

Was ist angesagt?

ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loading
alex_araujo
 
Building a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDBBuilding a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDB
MongoDB
 
MongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksMongoDB: tips, trick and hacks
MongoDB: tips, trick and hacks
Scott Hernandez
 

Was ist angesagt? (20)

ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loading
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Cassandra 2.0 better, faster, stronger
Cassandra 2.0   better, faster, strongerCassandra 2.0   better, faster, stronger
Cassandra 2.0 better, faster, stronger
 
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
Cassandra Summit 2014: Reading Cassandra SSTables Directly for Offline Data A...
 
Building a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDBBuilding a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDB
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseries
 
Sparkstreaming
SparkstreamingSparkstreaming
Sparkstreaming
 
Cassandra by example - the path of read and write requests
Cassandra by example - the path of read and write requestsCassandra by example - the path of read and write requests
Cassandra by example - the path of read and write requests
 
Optimizing Your Cluster with Coordinator Nodes (Eric Lubow, SimpleReach) | Ca...
Optimizing Your Cluster with Coordinator Nodes (Eric Lubow, SimpleReach) | Ca...Optimizing Your Cluster with Coordinator Nodes (Eric Lubow, SimpleReach) | Ca...
Optimizing Your Cluster with Coordinator Nodes (Eric Lubow, SimpleReach) | Ca...
 
Apache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machineryApache Cassandra, part 2 – data model example, machinery
Apache Cassandra, part 2 – data model example, machinery
 
Time series with apache cassandra strata
Time series with apache cassandra   strataTime series with apache cassandra   strata
Time series with apache cassandra strata
 
Cassandra Fundamentals - C* 2.0
Cassandra Fundamentals - C* 2.0Cassandra Fundamentals - C* 2.0
Cassandra Fundamentals - C* 2.0
 
Building Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::ClientBuilding Scalable, Distributed Job Queues with Redis and Redis::Client
Building Scalable, Distributed Job Queues with Redis and Redis::Client
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache Cassandra
 
MongoDB: tips, trick and hacks
MongoDB: tips, trick and hacksMongoDB: tips, trick and hacks
MongoDB: tips, trick and hacks
 
Apache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra InternalsApache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra Internals
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
Advanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMXAdvanced Apache Cassandra Operations with JMX
Advanced Apache Cassandra Operations with JMX
 
Cassandra and Rails at LA NoSQL Meetup
Cassandra and Rails at LA NoSQL MeetupCassandra and Rails at LA NoSQL Meetup
Cassandra and Rails at LA NoSQL Meetup
 
PostgreSQL
PostgreSQLPostgreSQL
PostgreSQL
 

Andere mochten auch

Andere mochten auch (20)

Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerceDon't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
 
Cassandra Community Webinar | In Case of Emergency Break Glass
Cassandra Community Webinar | In Case of Emergency Break GlassCassandra Community Webinar | In Case of Emergency Break Glass
Cassandra Community Webinar | In Case of Emergency Break Glass
 
Webinar: Eventual Consistency != Hopeful Consistency
Webinar: Eventual Consistency != Hopeful ConsistencyWebinar: Eventual Consistency != Hopeful Consistency
Webinar: Eventual Consistency != Hopeful Consistency
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
 
Webinar: Don't Leave Your Data in the Dark
Webinar: Don't Leave Your Data in the DarkWebinar: Don't Leave Your Data in the Dark
Webinar: Don't Leave Your Data in the Dark
 
How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?How much money do you lose every time your ecommerce site goes down?
How much money do you lose every time your ecommerce site goes down?
 
Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6Webinar | Introducing DataStax Enterprise 4.6
Webinar | Introducing DataStax Enterprise 4.6
 
Webinar | How Clear Capital Delivers Always-on Appraisals on 122 Million Prop...
Webinar | How Clear Capital Delivers Always-on Appraisals on 122 Million Prop...Webinar | How Clear Capital Delivers Always-on Appraisals on 122 Million Prop...
Webinar | How Clear Capital Delivers Always-on Appraisals on 122 Million Prop...
 
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing H...
 
Webinar: 2 Billion Data Points Each Day
Webinar: 2 Billion Data Points Each DayWebinar: 2 Billion Data Points Each Day
Webinar: 2 Billion Data Points Each Day
 
Cassandra Community Webinar | Practice Makes Perfect: Extreme Cassandra Optim...
Cassandra Community Webinar | Practice Makes Perfect: Extreme Cassandra Optim...Cassandra Community Webinar | Practice Makes Perfect: Extreme Cassandra Optim...
Cassandra Community Webinar | Practice Makes Perfect: Extreme Cassandra Optim...
 
Webinar | From Zero to 1 Million with Google Cloud Platform and DataStax
Webinar | From Zero to 1 Million with Google Cloud Platform and DataStaxWebinar | From Zero to 1 Million with Google Cloud Platform and DataStax
Webinar | From Zero to 1 Million with Google Cloud Platform and DataStax
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache Cassandra
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodes
 
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
 
Cassandra Community Webinar | Make Life Easier - An Introduction to Cassandra...
Cassandra Community Webinar | Make Life Easier - An Introduction to Cassandra...Cassandra Community Webinar | Make Life Easier - An Introduction to Cassandra...
Cassandra Community Webinar | Make Life Easier - An Introduction to Cassandra...
 
Webinar: Building Blocks for the Future of Television
Webinar: Building Blocks for the Future of TelevisionWebinar: Building Blocks for the Future of Television
Webinar: Building Blocks for the Future of Television
 
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
Webinar: DataStax Training - Everything you need to become a Cassandra RockstarWebinar: DataStax Training - Everything you need to become a Cassandra Rockstar
Webinar: DataStax Training - Everything you need to become a Cassandra Rockstar
 
Webinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in ProductionWebinar: Diagnosing Apache Cassandra Problems in Production
Webinar: Diagnosing Apache Cassandra Problems in Production
 
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
 

Ähnlich wie Cassandra Community Webinar: Back to Basics with CQL3

Cassandra Summit 2014: Understanding CQL3 Inside and Out
Cassandra Summit 2014: Understanding CQL3 Inside and OutCassandra Summit 2014: Understanding CQL3 Inside and Out
Cassandra Summit 2014: Understanding CQL3 Inside and Out
DataStax Academy
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandra
rantav
 
Understanding How CQL3 Maps to Cassandra's Internal Data Structure
Understanding How CQL3 Maps to Cassandra's Internal Data StructureUnderstanding How CQL3 Maps to Cassandra's Internal Data Structure
Understanding How CQL3 Maps to Cassandra's Internal Data Structure
DataStax
 
Cassandra
CassandraCassandra
Cassandra
robjk
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
PL dream
 
Cassandra
CassandraCassandra
Cassandra
exsuns
 

Ähnlich wie Cassandra Community Webinar: Back to Basics with CQL3 (20)

Cassandra Summit 2014: Understanding CQL3 Inside and Out
Cassandra Summit 2014: Understanding CQL3 Inside and OutCassandra Summit 2014: Understanding CQL3 Inside and Out
Cassandra Summit 2014: Understanding CQL3 Inside and Out
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandra
 
Intro to Cassandra
Intro to CassandraIntro to Cassandra
Intro to Cassandra
 
Understanding How CQL3 Maps to Cassandra's Internal Data Structure
Understanding How CQL3 Maps to Cassandra's Internal Data StructureUnderstanding How CQL3 Maps to Cassandra's Internal Data Structure
Understanding How CQL3 Maps to Cassandra's Internal Data Structure
 
Deep Dive into Cassandra
Deep Dive into CassandraDeep Dive into Cassandra
Deep Dive into Cassandra
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
Cassandra
CassandraCassandra
Cassandra
 
Migrating To PostgreSQL
Migrating To PostgreSQLMigrating To PostgreSQL
Migrating To PostgreSQL
 
Meetup cassandra for_java_cql
Meetup cassandra for_java_cqlMeetup cassandra for_java_cql
Meetup cassandra for_java_cql
 
A Preliminary survey of RDF/Neo4j as backends for KnetMiner
A Preliminary survey of RDF/Neo4j as backends for KnetMinerA Preliminary survey of RDF/Neo4j as backends for KnetMiner
A Preliminary survey of RDF/Neo4j as backends for KnetMiner
 
Cassandra
CassandraCassandra
Cassandra
 
Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)Introduciton to Apache Cassandra for Java Developers (JavaOne)
Introduciton to Apache Cassandra for Java Developers (JavaOne)
 
Recursive Query Throwdown
Recursive Query ThrowdownRecursive Query Throwdown
Recursive Query Throwdown
 
Time series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long versionTime series with Apache Cassandra - Long version
Time series with Apache Cassandra - Long version
 
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J..."Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...
"Einstürzenden Neudaten: Building an Analytics Engine from Scratch", Tobias J...
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
 
Nyc summit intro_to_cassandra
Nyc summit intro_to_cassandraNyc summit intro_to_cassandra
Nyc summit intro_to_cassandra
 
Cassandra
CassandraCassandra
Cassandra
 
So You Want to Write a Connector?
So You Want to Write a Connector? So You Want to Write a Connector?
So You Want to Write a Connector?
 
Cassandra Deep Diver & Data Modeling
Cassandra Deep Diver & Data ModelingCassandra Deep Diver & Data Modeling
Cassandra Deep Diver & Data Modeling
 

Mehr von DataStax

Mehr von DataStax (20)

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise Graph
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache Kafka
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for Dummies
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerce
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking Applications
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Cassandra Community Webinar: Back to Basics with CQL3

  • 1. Back to Basics with CQL3 Matt Overstreet OpenSource Connections OpenSource Connections
  • 3. Outline • • • • • Overview Architecture Data Modeling Good At/Bad At Using Cassandra • What is Big Data? • How does Cassandra fit? OpenSource Connections
  • 4. What is Big Data? • The three V’s (and a C) velocity volume Variety Complexity OpenSource Connections
  • 5. What is Big Data • Brewer’s CAP theorem o o o o Consistency - all nodes have same world view Availability - requests can be serviced Partition tolerance - network/machine failure Can’t have all 3 -- Pick 2! • Examples o MySQL – Consistent, Available o HBase – Consistent, Partition Tolerant o Cassandra – Available, Partition Tolerant – and “Tunably Consistent”! OpenSource Connections
  • 6. What is Big Data? • Common theme: Denormalize everything! o What’s that? • JOIN all the tables in the database... • … well not all the tables o Why? • You can shard database at any point • All related data is co-located • What this means for you o o o o o No joins No transactions - potential for inconsistency Vastly simplified querying No data-modeling -- Instead, query-modeling “Infinite and easy” scaling potential OpenSource Connections
  • 7. How Does Cassandra Fit? • No single point of failure • Optimized for writes, still good with reads • Can decide between Consistency and Availably concerns OpenSource Connections
  • 8. Outline • • • • • Overview Architecture Data Modeling Good At/Bad At Using Cassandra • Ring architecture • Data partitioning o Operations o Writes o Reads OpenSource Connections
  • 9. Ring Architecture • No single point of failure • Nodes talk via gossip • Democratic - all nodes are equal OpenSource Connections
  • 10. Data Partitioning Original partitioning method. OpenSource Connections
  • 11. Data Partitioning Flexible partitioning with virtual nodes. OpenSource Connections
  • 12. Operations: Writes Requests sent out to nodes and replicants. OpenSource Connections
  • 13. Operations: Reads Coordinator node reaches out to relevant replicants. OpenSource Connections
  • 14. Outline • • • • • Overview Architecture Data Modeling Good At/Bad At Using Cassandra • • • • Internals Cassandra Query Language Modeling Strategy Example OpenSource Connections
  • 16. C* Data Model Keyspace Column Family Column Family OpenSource Connections
  • 17. C* Data Model Keyspace Column Family Column Family OpenSource Connections
  • 18. C* Data Model Keyspace Column Family Column Family OpenSource Connections
  • 19. C* Data Model Row Key OpenSource Connections
  • 20. C* Data Model Row Key Column Column Name Column Value (or Tombstone) Timestamp Time-to-live OpenSource Connections
  • 21. C* Data Model Row Key Column Column Name Column Value (or Tombstone) Timestamp Time-to-live ● Row Key, Column Name, Column Value have types ● Column Name has comparator ● RowKey has partitioner ● Rows can have any number of columns - even in same column family ● Rows can have many columns ● Column Values can be omitted ● Time-to-live is useful! ● Tombstones OpenSource Connections
  • 22. C* Data Model: Writes Mem Table CommitLog Row Cache Bloom Filter ● Insert into MemTable ● Dump to CommitLog ● No read ● Very Fast! ● Blocks on CPU before O/I! Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 23. C* Data Model: Writes Mem Table CommitLog Row Cache Bloom Filter ● Insert into MemTable ● Dump to CommitLog ● No read ● Very Fast! ● Blocks on CPU before O/I! Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 24. C* Data Model: Writes Mem Table CommitLog Row Cache Bloom Filter ● Insert into MemTable ● Dump to CommitLog ● No read ● Very Fast! ● Blocks on CPU before O/I! Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 25. C* Data Model: Reads Mem Table CommitLog Row Cache Bloom Filter ● Get values from Memtable ● Get values from row cache if present ● Otherwise check bloom filter to find appropriate SSTables ● Check Key Cache for fast SSTable Search ● Get values from SSTables ● Repopulate Row Cache ● Super Fast Col. retrieval ● Fast row slicing Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 26. C* Data Model: Reads Mem Table CommitLog Row Cache Bloom Filter ● Get values from Memtable ● Get values from row cache if present ● Otherwise check bloom filter to find appropriate SSTables ● Check Key Cache for fast SSTable Search ● Get values from SSTables ● Repopulate Row Cache ● Super Fast Col. retrieval ● Fast row slicing Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 27. C* Data Model: Reads Mem Table CommitLog Row Cache Bloom Filter ● Get values from Memtable ● Get values from row cache if present ● Otherwise check bloom filter to find appropriate SSTables ● Check Key Cache for fast SSTable Search ● Get values from SSTables ● Repopulate Row Cache ● Super Fast Col. retrieval ● Fast row slicing Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 28. C* Data Model: Reads Mem Table CommitLog Row Cache Bloom Filter ● Get values from Memtable ● Get values from row cache if present ● Otherwise check bloom filter to find appropriate SSTables ● Check Key Cache for fast SSTable Search ● Get values from SSTables ● Repopulate Row Cache ● Super Fast Col. retrieval ● Fast row slicing Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 29. C* Data Model: Reads Mem Table CommitLog Row Cache Bloom Filter ● Get values from Memtable ● Get values from row cache if present ● Otherwise check bloom filter to find appropriate SSTables ● Check Key Cache for fast SSTable Search ● Get values from SSTables ● Repopulate Row Cache ● Super Fast Col. retrieval ● Fast row slicing Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 30. C* Data Model: Reads Mem Table CommitLog Row Cache Bloom Filter ● Get values from Memtable ● Get values from row cache if present ● Otherwise check bloom filter to find appropriate SSTables ● Check Key Cache for fast SSTable Search ● Get values from SSTables ● Repopulate Row Cache ● Super Fast Col. retrieval ● Fast row slicing Key Cache Key Cache Key Cache Key Cache SSTable SSTable SSTable SSTable OpenSource Connections
  • 31. Internals: Twitter Example • 4 ColumnFamilies o o o o followers following tweets timeline OpenSource Connections
  • 32. Internals: Twitter Example • 4 ColumnFamilies o o o o followers following tweets timeline • Nate follows Patricia o o o o SET followers[Patricia][Nate] = „‟; SET following[Nate][Patricia] = „‟; storing data in column names (not values) denormalized, redundant! • Get all Nate’s followers o GET followers[Patricia] o => Nate,Eric,Scott,Matt,Doug,Kate o No JOIN! OpenSource Connections
  • 33. Internals: Twitter Example • Nate tweets o SET tweets[Nate][2013-07-19 T 09:20] = “Wonderful morning. This coffee is great.” o SET tweets[Nate][2013-07-19 T 09:21] = “Oops, smoke is coming out of the SQL server!” o SET tweets[Nate][2013-07-19 T 09:51] = “Now my coffee is cold :-(” • Get Nate’s tweets o GET tweets[Nate] …(what you’d expect)... OpenSource Connections
  • 34. CQL (Cassandra Query Language) CREATE TABLE users ( id timeuuid PRIMARY KEY, lastname varchar, firstname varchar, dateOfBirth timestamp ); OpenSource Connections
  • 35. CQL (Cassandra Query Language) CREATE TABLE users ( id timeuuid PRIMARY KEY, lastname varchar, firstname varchar, dateOfBirth timestamp ); INSERT INTO users (id,lastname, firstname, dateofbirth) VALUES (now(),'Berryman',‟John','1975-09-15'); OpenSource Connections
  • 36. CQL (Cassandra Query Language) CREATE TABLE users ( id timeuuid PRIMARY KEY, lastname varchar, firstname varchar, dateOfBirth timestamp ); INSERT INTO users (id,lastname, firstname, dateofbirth) VALUES (now(),‟Berryman‟,‟John‟,‟1975-09-15‟); UPDATE users SET firstname = ‟John‟ WHERE id = f74c0b20-0862-11e3-8cf6-b74c10b01fc6; OpenSource Connections
  • 37. CQL (Cassandra Query Language) CREATE TABLE users ( id timeuuid PRIMARY KEY, lastname varchar, firstname varchar, dateOfBirth timestamp ); INSERT INTO users (id,lastname, firstname, dateofbirth) VALUES (now(),'Berryman',‟John','1975-09-15'); UPDATE users SET firstname = 'John‟ WHERE id = f74c0b20-0862-11e3-8cf6-b74c10b01fc6; SELECT dateofbirth,firstname,lastname FROM users ; dateofbirth | firstname | lastname --------------------------+-----------+---------1975-09-15 00:00:00-0400 | John | Berryman OpenSource Connections
  • 38. The CQL/Cassandra Mapping CREATE TABLE employees ( company text, name text, age int, role text, PRIMARY KEY (company,name) ); OpenSource Connections
  • 39. The CQL/Cassandra Mapping CREATE TABLE employees ( company text, name text, age int, role text, PRIMARY KEY (company,name) ); company | name | age | role --------+------+-----+----OSC | eric | 38 | ceo OSC | john | 37 | dev RKG | anya | 29 | lead RKG | ben | 27 | dev RKG | chad | 35 | ops OpenSource Connections
  • 40. The CQL/Cassandra Mapping company | name | age | role --------+------+-----+----OSC | eric | 38 | ceo OSC | john | 37 | dev RKG | anya | 29 | lead RKG | ben | 27 | dev RKG | chad | 35 | ops CREATE TABLE employees ( company text, name text, age int, role text, PRIMARY KEY (company,name) ); eric:age OS C eric:role john:age john:role 38 dev 37 dev anya:age RK G anya:role ben:age ben:role chad:age chad:role 29 lead 27 dev 35 ops OpenSource Connections
  • 41. Modeling Strategy • Don’t think about the data structure • Do think of the questions you’ll ask • Consider efficient operations for Cassandra o o o o Writing (4K writes per second per core) Retrieving a row Retrieving a row slice Retrieving in natural order (which you control) • Write the data in the way you will query it • Disk space is cheap • Seperate read-heavy and write-heavy task o Make wise use of caches OpenSource Connections
  • 42. Modeling Strategy: Anti-Patterns • Read-then-write • Heavy deletes o Scatters dead columns throughout SSTables o Won’t be corrected until first compaction after gc_grace_seconds (10days) • Distributed queue • JOIN-like behavior • Super wide-row sneak attack (>2B columns) OpenSource Connections

Hinweis der Redaktion

  1. In Cassandra 1.1?