Cassandra 
Pretty Cool
History 
Google Bigtable
Amazon Dynamo
Today
Why Should You Care 
● Horizontal Scaling (basically auto sharding) 
● Multiple Nodes - Highly Available 
● Really Fast Writes 
● Not too shabby at reads either - SLICES!! 
● Bright Future
The Cluster 
● replication factor (rf) 
● read consistency (r) 
● write consistency (w) 
● clustering - shard on partition key
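A rough sketch of how the three knobs above interact. The rule `r + w > rf` is the usual replica-overlap condition, not something stated on the slide, and the function names here are hypothetical helpers rather than any driver API.

```python
def quorum(rf: int) -> int:
    """Smallest majority of rf replicas."""
    return rf // 2 + 1


def reads_see_latest_write(rf: int, r: int, w: int) -> bool:
    """True when every set of r read replicas overlaps every set of w write replicas."""
    if not (1 <= r <= rf and 1 <= w <= rf):
        raise ValueError("r and w must be between 1 and rf")
    return r + w > rf


RF = 3
checks = {
    "read ONE / write ONE": reads_see_latest_write(RF, 1, 1),
    "read QUORUM / write QUORUM": reads_see_latest_write(RF, quorum(RF), quorum(RF)),
    "read ONE / write ALL": reads_see_latest_write(RF, 1, RF),
}
for name, ok in checks.items():
    print(name, "->", "overlapping" if ok else "may read stale data")
```

Lower r and w favour speed and availability, higher values favour consistency; the sketch only checks the overlap arithmetic.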
The One Ring
Storage - Vnodes
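The ring and vnode slides were pictures, so here is a minimal sketch of the idea, under stated assumptions: each node owns several tokens on a ring, and a partition key is stored on the next distinct nodes clockwise from its token. The `Ring` class, the node names and the use of MD5 are illustrative only (Cassandra's default partitioner is Murmur3).

```python
import bisect
import hashlib


def token(key: str) -> int:
    # MD5 is used here only as a stable, illustrative hash function.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes_per_node=8):
        # Each node owns several tokens (its vnodes) scattered around the ring.
        self._entries = sorted(
            (token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, partition_key: str, rf: int):
        """Walk clockwise from the key's token, collecting rf distinct nodes."""
        start = bisect.bisect_right(self._tokens, token(partition_key))
        owners = []
        for step in range(len(self._entries)):
            node = self._entries[(start + step) % len(self._entries)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == rf:
                break
        return owners


ring = Ring(["node-a", "node-b", "node-c"])
owners = ring.replicas("111", rf=2)
print(owners)
```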
Data Model 
● Wide rows 
● Slice queries
● Denormalization 
● Index tables
Data Model - Simple Key 
CREATE TABLE email_app.emails (
  user_id text,
  subject text,
  to_add text,
  cc text,
  body text,
  PRIMARY KEY (user_id)  -- user_id is the ROW KEY
);
Data Model - Simple Inserts 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('999', 'wat', 'horse@b.com', 'giraffe@b.com', 'is going on?');
Data Model Simple Inserts Result 
SELECT * FROM email_app.emails;
Row key 111:
  subject | to_add | cc       | body
  party   | cat@   | hippo@   | at my place
Row key 999:
  subject | to_add | cc       | body
  wat     | horse@ | giraffe@ | is going on?
Mental Model - Nested Hash 
Row Keys: 111, 999
Column (under each row key): subject, to, cc, body
Values: one value per column
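The nested hash can be sketched as a plain dictionary of dictionaries. This is only a model of the mental picture on the slide, not how Cassandra stores data on disk; the `insert` helper is hypothetical.

```python
# row key -> {column name -> value}
table = {}


def insert(table, user_id, **columns):
    """Write the given columns under the row key, creating the row if needed."""
    table.setdefault(user_id, {}).update(columns)


insert(table, "111", subject="party", to_add="cat@b.com",
       cc="hippo@b.com", body="at my place")
insert(table, "999", subject="wat", to_add="horse@b.com",
       cc="giraffe@b.com", body="is going on?")

for row_key, columns in table.items():
    print(row_key, columns)
```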
Data Model - Simple Insert - Again 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
Row key 111:
  subject | to_add | cc       | body
  party   | cat@   | hippo@   | at my place
Row key 999:
  subject | to_add | cc       | body
  wat     | horse@ | giraffe@ | is going on?
IDEMPOTENT - running the same insert again changes nothing
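A small sketch of why the repeated insert is idempotent, using the same dictionary model of a row. It also shows that an insert on an existing row key overwrites columns rather than failing, since a CQL INSERT behaves as an upsert. The `insert` helper and the replacement body text are made up for illustration.

```python
import copy

table = {}  # row key -> {column name -> value}


def insert(table, user_id, **columns):
    """Upsert: create the row or overwrite the given columns in place."""
    table.setdefault(user_id, {}).update(columns)


row = dict(subject="party", to_add="cat@b.com",
           cc="hippo@b.com", body="at my place")

insert(table, "111", **row)
snapshot = copy.deepcopy(table)

insert(table, "111", **row)          # same statement, second time
unchanged = table == snapshot        # nothing was added or altered

insert(table, "111", body="at your place")  # same key, new value: overwrite
print(unchanged, table["111"]["body"])
```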
Data Model - Composite Key 1 
CREATE TABLE email_app.emails (
  user_id text,
  subject text,
  to_add text,
  cc text,
  body text,
  PRIMARY KEY (user_id, subject)  -- user_id: ROW KEY, subject: CLUSTERING KEY
);
Data Model - Composite Insert 1 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
Same as Before. 
Right???
Data Model Composite Insert Result 
SELECT * FROM emails WHERE user_id = '111';
Row key 111 (column names are prefixed with the subject):
  party|to_add | party|cc | party|body
  cat@         | hippo@   | at my place
Mental Model - Nested Hash 
Row Key (user_id): 111
Clustering Column (subject): party
Column (under the subject): to_add, cc, body
Values: one value per column
Data Model - Composite Insert 2 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'swim', 'cat@b.com', 'hippo@b.com', 'in the pool');
Composite Insert 2 Result 
SELECT * FROM emails WHERE user_id = '111';
Row key 111 (column names are prefixed with the subject):
  party|to_add | party|cc | party|body
  cat@         | hippo@   | at my place
  swim|to_add  | swim|cc  | swim|body
  cat@         | hippo@   | in the pool
Sorted by clustering column - "subject"
Mental Model - Nested Sorted Hash 
Row Key (user_id): 111
Clustering Column (subject): party, swim (kept sorted)
Column (under each subject): to, cc, body
Values: one value per column
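The nested sorted hash can be sketched with one extra dictionary level: row key, then clustering value, then columns, with the clustering level read back in sorted order. The names below are hypothetical and the model ignores everything about real storage.

```python
from collections import defaultdict

# user_id -> {subject -> {column name -> value}}
partitions = defaultdict(dict)


def insert(user_id, subject, **columns):
    partitions[user_id].setdefault(subject, {}).update(columns)


def select(user_id):
    """Return (subject, columns) pairs ordered by the clustering column."""
    rows = partitions.get(user_id, {})
    return [(subject, rows[subject]) for subject in sorted(rows)]


# Inserted out of order on purpose.
insert("111", "swim", to_add="cat@b.com", cc="hippo@b.com", body="in the pool")
insert("111", "party", to_add="cat@b.com", cc="hippo@b.com", body="at my place")

subjects = [subject for subject, _ in select("111")]
print(subjects)
```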
Why sorted? 
SLICE QUERIES!!
SELECT * FROM emails WHERE user_id = '111'
AND (subject) >= ('s') AND (subject) < ('t');
Row key 111:
  party|to_add | party|cc | party|body
  cat@         | hippo@   | at my place
  swim|to_add  | swim|cc  | swim|body     <- the slice ('s' <= subject < 't')
  cat@         | hippo@   | in the pool
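Because the clustering values are kept sorted, a slice is just a contiguous range that can be found with two binary searches. A minimal sketch, assuming a hypothetical `Partition` class that stands in for the single row key 111:

```python
import bisect


class Partition:
    """One row key's data, ordered by a single clustering column."""

    def __init__(self):
        self._keys = []   # clustering values, kept sorted
        self._rows = {}   # clustering value -> {column name -> value}

    def insert(self, clustering, **columns):
        if clustering not in self._rows:
            bisect.insort(self._keys, clustering)
            self._rows[clustering] = {}
        self._rows[clustering].update(columns)

    def slice(self, start, end):
        """Rows with start <= clustering value < end, in sorted order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(key, self._rows[key]) for key in self._keys[lo:hi]]


emails_111 = Partition()
emails_111.insert("party", to_add="cat@b.com", cc="hippo@b.com", body="at my place")
emails_111.insert("swim", to_add="cat@b.com", cc="hippo@b.com", body="in the pool")

result = emails_111.slice("s", "t")
print(result)
```

Only the `swim` entry falls in the range, which mirrors the query on the slide.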
DM - Compound Composite Key 
CREATE TABLE email_app.emails (
  user_id text,
  subject text,
  to_add text,
  cc text,
  body text,
  PRIMARY KEY ((user_id, subject), to_add)  -- (user_id, subject): ROW KEY, to_add: CLUSTERING KEY
);
Composite / Compound Inserts 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'wat', 'horse@b.com', 'giraffe@b.com', 'is going on?');
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
Composite Insert 2 Result 
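SELECT * FROM emails WHERE user_id = '111';  -- not allowed: the row key is (user_id, subject), so subject is required too
SELECT * FROM emails WHERE user_id = '111'
AND subject = 'party';
Row key 111:party (column names are prefixed with to_add):
  cat@|cc | cat@|body
  hippo@  | at my place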
Data Model - Composite Insert 1 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'dog@b.com', 'hippo@b.com', 'all the time');
SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';
Row key 111:party (column names are prefixed with to_add):
  cat@|cc | cat@|body
  hippo@  | at my place
  dog@|cc | dog@|body
  hippo@  | all the time
Sorting / slice on - "to_add"
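With a compound row key, the pair (user_id, subject) identifies the partition and `to_add` orders the entries inside it. A rough model using a tuple as the dictionary key; the helper names are hypothetical.

```python
# (user_id, subject) -> {to_add -> {column name -> value}}
partitions = {}


def insert(user_id, subject, to_add, **columns):
    partition = partitions.setdefault((user_id, subject), {})
    partition.setdefault(to_add, {}).update(columns)


def select(user_id, subject):
    """Both parts of the row key are needed to find the partition."""
    partition = partitions.get((user_id, subject), {})
    return [(to_add, partition[to_add]) for to_add in sorted(partition)]


insert("111", "wat", "horse@b.com", cc="giraffe@b.com", body="is going on?")
insert("111", "party", "dog@b.com", cc="hippo@b.com", body="all the time")
insert("111", "party", "cat@b.com", cc="hippo@b.com", body="at my place")

recipients = [to_add for to_add, _ in select("111", "party")]
print(recipients)
```

In this model user 111 has two separate partitions, one per subject, so there is no single entry to look up by `user_id` alone.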
DM - Compound Composite Key 2 
CREATE TABLE email_app.emails (
  user_id text,
  subject text,
  to_add text,
  cc text,
  body text,
  PRIMARY KEY ((user_id, subject), to_add, cc)  -- (user_id, subject): ROW KEY, to_add and cc: CLUSTERING KEYS
);
Composite / Clustered Inserts 
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'dog@b.com', 'hippo@b.com', 'all the time');
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'At my place');
INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
VALUES ('111', 'party', 'cat@b.com', 'mouse@b.com', 'At my place');
DM - Composite / Clustered Inserts 
SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';
Row key 111|party (column names are prefixed with to_add and cc):
  cat@|hippo@|body | cat@|mouse@|body | dog@|hippo@|body
  At my place      | At my place      | all the time
Slice on (to_add) OR (to_add, cc)
Mental Model - Nested Sorted Hash 
Row Key (user_id + subject): 111|party
Clustering Columns (to_add, then cc): cat -> hippo, mouse; dog -> hippo
Column (under each to_add/cc pair): body
Values: one value per column
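With two clustering columns, the entries inside a partition sort by the pair (to_add, cc), so a slice can match on `to_add` alone or on both. A minimal sketch, assuming a hypothetical `Partition` class that stands in for row key 111|party:

```python
import bisect


class Partition:
    """One row key's data, ordered by the clustering pair (to_add, cc)."""

    def __init__(self):
        self._keys = []   # sorted (to_add, cc) tuples
        self._rows = {}   # (to_add, cc) -> {column name -> value}

    def insert(self, to_add, cc, **columns):
        key = (to_add, cc)
        if key not in self._rows:
            bisect.insort(self._keys, key)
            self._rows[key] = {}
        self._rows[key].update(columns)

    def slice(self, *prefix):
        """Rows whose clustering key starts with prefix: (to_add,) or (to_add, cc)."""
        start = bisect.bisect_left(self._keys, prefix)
        matched = []
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break  # sorted order: once the prefix stops matching, we are done
            matched.append((key, self._rows[key]))
        return matched


party = Partition()
party.insert("dog@b.com", "hippo@b.com", body="all the time")
party.insert("cat@b.com", "hippo@b.com", body="At my place")
party.insert("cat@b.com", "mouse@b.com", body="At my place")

by_to = [key for key, _ in party.slice("cat@b.com")]
by_to_and_cc = [key for key, _ in party.slice("cat@b.com", "mouse@b.com")]
print(by_to, by_to_and_cc)
```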
Part 2 / 8 of this 7 hour talk 
● Denormalization 
● Index Column Families 
● Cassandra Internals (memtables, SSTables, compaction, repair)
Part 8 / 8: The Future 
● Continually improving 
● More and more adoption 
● Awesome projects 
● http://www.datastax.com/documentation/cassandra/2.0/pdf/cassandra20.pdf
● http://planetcassandra.org/

HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database
