w/ @
MEET
Background
‣ White paper Blend
• Amazon Dynamo (2007)
• Google Bigtable (2006)
‣ What about Facebook?
‣ Open Source Successes
‣ & cue DataStax
‣ so What’s in a name?
Why Cassandra?
Because the world has changed
Story Time
What we believe:
‣ The network is reliable.
‣ Latency is zero.
‣ Bandwidth is infinite.
‣ The network is secure.
‣ Topology doesn't change.
‣ There is one administrator.
‣ Transport cost is zero.
‣ The network is homogeneous.
http://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
Hey, I’ve got some data!
I’ve got some bigger data!
lol wut?
uh oh…
Big data
???
High availability
http://planetcassandra.org/nosql-performance-benchmarks/
Architecture
San Francisco
New York
Stockholm
‣ Masterless - Peer to Peer
‣ High Availability
‣ No SPOFs
‣ Multi-DC
‣ Runs on Commodity Hardware
‣ Linear Scalability: More throughput?
‣ Fast
‣ Distributed
‣ Operationally easy to manage
Architected w/ scale in mind
[Diagram: a Cassandra Cluster with two data centers, Data Center - East and Data Center - West, one drawn with eight nodes and the other with four; each data center is split into Rack 1 and Rack 2. The cluster's token range (Murmur3) runs from -2^63 to +2^63.]
What is a C* cluster?
A peer-to-peer set of nodes
‣ Node – one Cassandra instance
‣ Rack – a logical set of nodes
‣ Data Center – a logical set of racks
‣ Cluster – the full set of nodes which map to a single complete token ring
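The token ring idea can be sketched in a few lines. This is only an illustration under stated assumptions: `token_for` uses a truncated MD5 digest as a stand-in for Murmur3 (which is not in the Python standard library), each node gets a single evenly spaced token (no virtual nodes), and `replicas_for` walks the ring clockwise without looking at racks or data centers. All function names are hypothetical.

```python
import bisect
import hashlib

# Token range from the slide: -2^63 up to (just under) +2^63.
MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1


def token_for(partition_key: str) -> int:
    """Stand-in hash (NOT real Murmur3): first 8 bytes of MD5 as a signed 64-bit int."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def build_ring(node_names):
    """Assumption: one evenly spaced token per node. Returns (token, node) pairs sorted by token."""
    step = 2**64 // len(node_names)
    return [(MIN_TOKEN + i * step, name) for i, name in enumerate(node_names)]


def replicas_for(ring, partition_key, replication_factor):
    """First replica is the first node whose token is >= the key's token (wrapping around);
    further replicas are the next nodes clockwise on the ring."""
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, token_for(partition_key)) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(replication_factor)]


ring = build_ring(["Node 1", "Node 2", "Node 3", "Node 4", "Node 5"])
print(replicas_for(ring, "The Beatles", 3))
```

Every node can run the same calculation, which is why no master is needed to decide where a partition lives.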
Peer to Peer
[Diagram: a five-node ring (Nodes 1-5). The 1st copy of the data is on Node 1, the 2nd copy on Node 2, and the 3rd copy on Node 3.]
Data Centers
[Diagram: copies of the data placed on nodes across two data centers, DC: USA and DC: EUROPE.]
Parallel Write
Consistency Level = QUORUM
Replication Factor = 3
[Diagram: the write goes in parallel to the three replicas (1st copy on Node 1, 2nd copy on Node 2, 3rd copy on Node 3). Ack times shown: 5 µs, 12 µs, 500 µs, 12 µs.]
Tunable Consistency
Parallel Read
Consistency Level = QUORUM
Replication Factor = 3
[Diagram: the read goes in parallel to the replicas (1st copy on Node 1, 2nd copy on Node 2, 3rd copy on Node 3).]
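The arithmetic behind QUORUM with a replication factor of 3 can be written out as a small sketch. The function names are hypothetical, and treating 5 µs, 12 µs and 500 µs from the write slide as the three replica ack times is an assumption made only for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def time_to_consistency(ack_times_us, required_acks):
    """Time at which the required number of replica acks has arrived."""
    if required_acks > len(ack_times_us):
        raise ValueError("not enough replicas to satisfy the consistency level")
    return sorted(ack_times_us)[required_acks - 1]


def read_sees_latest_write(read_acks, write_acks, replication_factor):
    """Read and write replica sets must overlap in at least one node."""
    return read_acks + write_acks > replication_factor


rf = 3
print(quorum(rf))                                         # 2
print(time_to_consistency([5, 12, 500], quorum(rf)))      # 12: the slow replica is not waited for
print(read_sees_latest_write(quorum(rf), quorum(rf), rf)) # True
```

Under these assumptions a QUORUM write completes once the second-fastest replica has answered, so one slow or unavailable replica does not hold up the request.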
Hints
Continuous availability
Data Model and Storage Model
Let’s think about a music database…
CQL
How to create, use and drop keyspaces/schemas?
•  To create a keyspace
CREATE KEYSPACE musicdb
WITH replication = {
    'class': 'SimpleStrategy',
    'replication_factor' : 3
};
•  To assign the working default keyspace for a cqlsh session
USE musicdb;
•  To delete a keyspace and all internal data objects
DROP KEYSPACE musicdb;
What is the syntax of the CREATE TABLE statement?
•  The CQL below creates a table in the current keyspace
Primary key declared in separate clause:
CREATE TABLE performer (
    name VARCHAR,
    type VARCHAR,
    country VARCHAR,
    style VARCHAR,
    founded INT,
    born INT,
    died INT,
    PRIMARY KEY (name)
);
Primary key declared inline:
CREATE TABLE performer (
    name VARCHAR PRIMARY KEY,
    type VARCHAR,
    country VARCHAR,
    style VARCHAR,
    founded INT,
    born INT,
    died INT
);
Storage model
What is a CQL table and how is it related to a column family?
•  A CQL table is a column family
•  CQL tables provide two-dimensional views of a column family, which contains
potentially multi-dimensional data, due to composite keys and collections
•  CQL table and column family are largely interchangeable terms
•  Not surprising when you recall tables and relations, columns and attributes, rows
and tuples in relational databases
•  Supported by a declarative language, the Cassandra Query Language (CQL)
•  Data Definition Language, subset of CQL
•  SQL-like syntax, but with somewhat different semantics
•  Convenient for defining and expressing Cassandra database schemas
What are row, row key, column key, and column value?
•  Row is the smallest unit that stores related data in Cassandra
•  Rows – individual rows constitute a column family
•  Row key – uniquely identifies a row in a column family
•  Row – stores pairs of column keys and column values
•  Column key – uniquely identifies a column value in a row
•  Column value – stores one value or a collection of values
©2014 DataStax Training. Use only with permission. Slide 4
[Diagram: a row is identified by its row key and stores column keys (or column names) cola, colb, colc, cold, each paired with a column value (or cell) va, vb, vc, vd.]
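One way to picture these terms is a two-level map: row key to row, and within a row, column key to column value. The sketch below is a mental model only, not how Cassandra stores data; the `column_family` dictionary and the `put`/`get` helpers are hypothetical names.

```python
# Hypothetical in-memory picture of a column family:
# row key -> {column key -> column value}
column_family = {}


def put(row_key, column_key, column_value):
    """Store one column value under its row key and column key."""
    column_family.setdefault(row_key, {})[column_key] = column_value


def get(row_key, column_key):
    """Fetch one column value; raises KeyError if the row or column is absent."""
    return column_family[row_key][column_key]


put("John Lennon", "born", 1940)
put("John Lennon", "country", "England")
put("The Beatles", "founded", 1957)
# A column value may also hold a collection of values.
put("The Beatles", "styles", {"Rock"})

print(get("John Lennon", "born"))
print(sorted(column_family["The Beatles"]))
```

Rows do not need the same set of column keys: "John Lennon" has `born` while "The Beatles" has `founded`.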
Partitions
What are partition, partition key, row, column, and cell?
•  Table with single-row partitions
•  Column family view
performer       born  country  died  founded  style  type
John Lennon     1940  England  1980           Rock   artist
Paul McCartney  1942  England                 Rock   artist
The Beatles           England        1957     Rock   band
[Diagram labels: partition key, rows, columns, cells]
Data model
What is a data modeling framework?
•  Defines transitions between models
•  Query-driven methodology
•  Formal analysis and validation
•  Defines a scientific approach to data modeling
•  Modeling rules
•  Mapping patterns
•  Schema optimization techniques
[Diagram: Conceptual Model → Logical Model → Physical Model, with the transitions labeled Query-Driven Methodology and Analysis & Validation.]
What is a conceptual data model?
•  Conceptual data model for music data
•  ER diagram (Chen notation)
•  Describes entities, relationships, roles, keys, cardinalities
•  What is possible and what is not in existing or future data
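[ER diagram. Entities and attributes: Performer (name, country, style), specialized by IsA (disjoint, covering) into Artist (born, died) and Band (founded); Album (title, year, genre, format, cover image); Track (number, title); User (id, email, name, preferences); Activity (id, timestamp), specialized by IsA (disjoint, not covering) into Rate (rating) and Play. Relationships: Performer releases Album (1:n); Band has member Artist (n:m, with period); Album has Track (1:n); User performs Activity; Activity involvedIn Track.]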
ACCESS PATTERNS
Q1: Find performers for a specified style; order by performer (ASC).
Q2: Find information for a specified performer (artist or band).
Q3: Find information for a specified album (title and year).
Q4: Find albums for a specified performer; order by album release year (DESC) and title (ASC).
Q5: Find albums for a specified genre; order by performer (ASC), year (DESC), and title (ASC).
Q6: Find albums and performers for a specified track title; order by performer (ASC), year (DESC), and title (ASC).
Q7: Find tracks for a specified album (title and year); order by track number (ASC).
Q8: Find information for a specified user.
Q9: Find activities for a specified user; order by activity time (DESC).
Q10: Find statistics for a specified track.
Q11: Find user activities for a specified track; order by activity time (DESC).
Q12: Find user activities for a specified activity type.
…
A sample data model
Notation: K = partition key column; C↑ / C↓ = clustering column in ascending / descending order; S = static column; IDX = secondary index.
Performer: name K, type, country, style, founded, born, died
Performers_by_style: style K, name C↑
Albums_by_performer: performer K, year C↓, title C↑, genre
Albums_by_genre: genre K, performer C↑, year C↓, title C↑
Tracks_by_album: album K, year K, number C↑, performer S, genre S, title
Albums_by_track: track K, performer C↑, year C↓, title C↑
Album: title K, year K, performer, genre, tracks (map)
User: id K, name, email, preferences (set)
Activities_by_user: user K, activity (timeuuid) C↓, type IDX, album_title, album_year, track_title, rating
Activities_by_track: album_title K, album_year K, track_title K, activity (timeuuid) C↓, user, type, rating
Track_stats: album_title K, album_year K, track_title K, num_ratings (counter), sum_ratings (counter), num_plays (counter)
[Diagram: arrows between the tables are labeled with the access patterns (Q1-Q12) they serve.]
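To show how the K / C↑ / C↓ notation maps onto a CQL table definition, here is a small sketch that renders a CREATE TABLE statement from a column list. The helper `create_table_cql` is hypothetical, the roles are spelled "K", "ASC" and "DESC" instead of the arrows, and the column types (VARCHAR, INT) are assumptions since the diagram does not give them.

```python
def create_table_cql(table, columns):
    """Render CQL from (name, cql_type, role) triples.
    role is "K" (partition key), "ASC"/"DESC" (clustering column) or None."""
    partition = [name for name, _, role in columns if role == "K"]
    clustering = [(name, role) for name, _, role in columns if role in ("ASC", "DESC")]

    lines = [f"  {name} {cql_type}," for name, cql_type, _ in columns]
    # A multi-column partition key needs its own parentheses.
    pk = partition[0] if len(partition) == 1 else "(" + ", ".join(partition) + ")"
    key_parts = [pk] + [name for name, _ in clustering]
    lines.append(f"  PRIMARY KEY ({', '.join(key_parts)})")

    cql = f"CREATE TABLE {table} (\n" + "\n".join(lines) + "\n)"
    if clustering:
        order = ", ".join(f"{name} {role}" for name, role in clustering)
        cql += f"\nWITH CLUSTERING ORDER BY ({order})"
    return cql + ";"


# Albums_by_performer: performer K, year C↓, title C↑, genre (types assumed)
print(create_table_cql("albums_by_performer", [
    ("performer", "VARCHAR", "K"),
    ("year", "INT", "DESC"),
    ("title", "VARCHAR", "ASC"),
    ("genre", "VARCHAR", None),
]))
```

The partition key decides which nodes hold the rows for Q4, while the clustering order lets the result come back already sorted by year (DESC) and title (ASC).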
Writes
How does Cassandra write so fast?
•  Cassandra is a log-structured storage engine
•  Data is sequentially appended, not placed in pre-set locations
RDBMS: seeks and writes values to various pre-set locations
Cassandra: continuously appends to a log
What are the key components of the write path?
•  Each node implements four key components to handle its writes
•  Memtables – in-memory tables corresponding to CQL tables, with indexes
•  CommitLog – append-only log, replayed to restore a downed node's Memtables
•  SSTables – Memtable snapshots periodically flushed to disk, clearing heap
•  Compaction – periodic process to merge and streamline SSTables
•  When any node receives a write request
1.  The record is appended to the CommitLog, and
2.  The record is written to the Memtable for this record's target CQL table
3.  Periodically, Memtables flush to SSTables, clearing JVM heap and CommitLog
4.  Periodically, Compaction runs to merge and streamline SSTables
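The first three steps can be mimicked with a toy node. This is a sketch under assumptions, not Cassandra's implementation: the `ToyNode` class is hypothetical, everything lives in memory, and a flush is triggered by a partition count (`flush_threshold`) rather than by the size and time limits a real node uses.

```python
class ToyNode:
    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only list of (partition_key, columns)
        self.memtable = {}     # partition_key -> {column: value}
        self.sstables = []     # immutable, key-sorted snapshots, oldest first
        self.flush_threshold = flush_threshold  # assumption: flush by partition count

    def write(self, partition_key, columns):
        # Step 1: append to the commit log.
        self.commit_log.append((partition_key, dict(columns)))
        # Step 2: apply to the memtable (mutates the in-memory partition).
        self.memtable.setdefault(partition_key, {}).update(columns)
        # Step 3: flush once the memtable is "full".
        if len(self.memtable) >= self.flush_threshold:
            self.flush()
        return "ack"

    def flush(self):
        # Freeze the memtable into a sorted, immutable snapshot.
        snapshot = tuple(
            (key, tuple(sorted(cols.items())))
            for key, cols in sorted(self.memtable.items())
        )
        self.sstables.append(snapshot)
        self.memtable = {}
        self.commit_log = []   # flushed entries no longer need replaying


node = ToyNode()
node.write(1, {"first": "Oscar", "last": "Orange", "level": 42})
node.write(2, {"first": "Ricky", "last": "Red"})
node.write(3, {"first": "Betty", "last": "Blue", "level": 63})
print(len(node.sstables), node.memtable, node.commit_log)
```

Note that a write never seeks on disk: it is one append plus one in-memory update, which is the point of the log-structured design.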
How does the write path flow on a node?
[Diagram: the Client sends a write, e.g. <3, Betty, Blue, 63>, to the Coordinator. Each write request is appended to the CommitLog (append-only, on the node file system) and applied to the Memtable (in node memory, corresponds to a CQL table), then acknowledged. Memtable contents: partition key1 first:Oscar last:Orange level:42; partition key2 first:Ricky last:Red; partition key3 first:Betty last:Blue level:63. Periodically, the current Memtable state is flushed to an SSTable. Periodically, Compaction compacts related SSTables.]
What are Memtables and how are they flushed to disk?
•  Memtables are in-memory representations of a CQL table
•  Each node has a Memtable for each CQL table in the keyspace
•  Each Memtable accrues writes and provides reads for data not yet flushed
•  Updates to Memtables mutate the in-memory partition
•  When a Memtable flushes to disk
1.  Current Memtable data is written to a new immutable SSTable on disk
2.  JVM heap space is reclaimed from the flushed data
3.  Corresponding CommitLog entries are marked as flushed
[Diagram: a Memtable holding partition key1 (first:Oscar last:Orange level:42), partition key2 (first:Ricky last:Red), and partition key3 (first:Betty last:Blue level:63).]
What is an SSTable and what are its characteristics?
•  An SSTable ("sorted string table") is
•  an immutable file of sorted partitions
•  written to disk through fast, sequential I/O
•  a record of the state of a Memtable when flushed
•  The current data state of a CQL table consists of
•  its corresponding Memtable plus
•  all current SSTables flushed from that Memtable
•  SSTables are periodically compacted from many to one
What is an SSTable and what are its characteristics?
•  For each SSTable, two structures are created
•  Partition index – list of its primary keys and row start positions
•  Partition summary – in-memory sample of its partition index (default: 1 partition key of 128)
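A rough sketch of how a sampled summary narrows the search in the full index follows. The function names are hypothetical, the index is a plain sorted Python list of (key, offset) pairs, and keys are compared as strings; a real SSTable orders partitions by token and keeps the index on disk.

```python
import bisect

SAMPLE_EVERY = 128  # the slide's default: 1 partition key of 128


def build_summary(partition_index, sample_every=SAMPLE_EVERY):
    """partition_index: list of (partition_key, file_offset) sorted by key.
    Returns (partition_key, position_in_index) for every Nth index entry."""
    return [
        (key, i)
        for i, (key, _) in enumerate(partition_index)
        if i % sample_every == 0
    ]


def lookup(partition_index, summary, key, sample_every=SAMPLE_EVERY):
    """Use the small summary to pick one slice of the index, then scan only that slice."""
    sampled_keys = [k for k, _ in summary]
    slot = bisect.bisect_right(sampled_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first partition in this SSTable
    start = summary[slot][1]
    for candidate, offset in partition_index[start:start + sample_every]:
        if candidate == key:
            return offset
    return None


index = [(f"pk{i:05d}", i * 100) for i in range(1000)]
summary = build_summary(index)
print(len(summary), lookup(index, summary, "pk00300"))
```

The summary is about 1/128 the size of the index, so it can stay in memory while the index scan touches at most 128 entries.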
[Diagram: a Memtable in node memory and several SSTables on disk, each SSTable paired with its own Summary and Index.]
What is compaction?
•  Updates do mutate Memtable partitions, but its SSTables are immutable
•  no SSTable seeks/overwrites
•  SSTables just accrue new timestamped updates
•  So, SSTables must be periodically compacted
•  related SSTables are merged
•  most recent version of each column is compiled to one partition in one new SSTable
•  partitions marked for deletion are evicted
•  old SSTables are deleted
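The merge rules above can be sketched as follows. This is a simplified model with assumed names and shapes: each SSTable is a dict with "rows" (partition key to {column: (value, timestamp)}) and "deleted" (partition key to deletion timestamp), and deleted data is dropped immediately, whereas a real cluster keeps deletion markers for a grace period first.

```python
def compact(sstables):
    """Merge several SSTables into one, keeping the newest version of each column."""
    merged, deleted_at = {}, {}
    for table in sstables:
        for pk, ts in table.get("deleted", {}).items():
            deleted_at[pk] = max(ts, deleted_at.get(pk, ts))
        for pk, columns in table.get("rows", {}).items():
            target = merged.setdefault(pk, {})
            for col, (value, ts) in columns.items():
                # Most recent timestamp wins for each column.
                if col not in target or ts > target[col][1]:
                    target[col] = (value, ts)

    rows = {}
    for pk in sorted(merged):
        cutoff = deleted_at.get(pk)
        # Keep only columns written after the partition was marked for deletion.
        live = {c: vt for c, vt in merged[pk].items() if cutoff is None or vt[1] > cutoff}
        if live:
            rows[pk] = live
    return {"rows": rows, "deleted": {}}


older = {"rows": {"pk1": {"first": ("Oscar", 10), "level": (41, 10)},
                  "pk2": {"first": ("Ricky", 11)}},
         "deleted": {}}
newer = {"rows": {"pk1": {"level": (42, 20)}},
         "deleted": {"pk2": 30}}
print(compact([older, newer]))
```

After the merge the old SSTables can be deleted, because the new one holds everything still live.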
[Diagram: Compaction merges several SSTables on disk into one new SSTable with its own Summary and Index.]
Note: Compaction and the Read Path are discussed in further detail later in this course.
What is sstable2json?
•  bin/sstable2json is a utility which exports an SSTable in JSON format, for testing and debugging
•  -k include only a specified set of keys, given in HEX format (limit: 500)
•  -x exclude a specified set of keys (limit: 500)
•  -e enumerate keys only
./sstable2json [full_path_to_SSTable_Data_file] | more
READS
How does the read path flow on each node?
[Diagrams: three cases for a Read <pk7> sent by the Coordinator to a node holding the table "player", with a MemTable and an optional Row Cache in node memory and SSTables on the node file system.
Row cache hit: the Row Cache already holds pk7, so the row is returned from it.
Key cache hit: the Row Cache misses; the Bloom Filter of each SSTable is checked, and the Key Cache supplies the position of pk7 in the SSTables that may hold it.
Row and Key miss: the Row Cache and Key Cache both miss; after the Bloom Filters, the Partition Summary and Partition Index are used to find pk7 in each SSTable.
The stored fragments of pk7 (first:Betty, last:Blue and level:63, each with timestamp 541; first:Elizabeth with timestamp 994; level:42 with timestamp 1114) resolve to the row pk7 first:Elizabeth last:Blue level:42.]
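The pk7 example from the read-path slides can be replayed with a short sketch. It is a simplification under assumptions: the `read` function is hypothetical, each SSTable carries an exact Python set in place of a real Bloom filter, the key cache, partition summary and partition index are left out, and filling the row cache on a miss is assumed for illustration. Which fragment sits in the MemTable versus an SSTable is also a guess.

```python
def read(pk, row_cache, memtable, sstables):
    """Return (path_taken, row). memtable and sstable["rows"] map
    partition key -> {column: (value, timestamp)}."""
    if pk in row_cache:
        return "row cache hit", row_cache[pk]

    fragments = [memtable.get(pk, {})]
    for sstable in sstables:
        # Stand-in for the Bloom filter check: skip SSTables that cannot hold pk.
        if pk in sstable["maybe_keys"]:
            fragments.append(sstable["rows"].get(pk, {}))

    newest = {}
    for fragment in fragments:
        for col, (value, ts) in fragment.items():
            # The column version with the highest timestamp wins.
            if col not in newest or ts > newest[col][1]:
                newest[col] = (value, ts)

    row = {col: value for col, (value, _) in newest.items()}
    row_cache[pk] = row  # assumption: cache the merged row for later reads
    return "row cache miss", row


memtable = {"pk7": {"level": (42, 1114)}}
sstables = [
    {"maybe_keys": {"pk1", "pk7"},
     "rows": {"pk7": {"first": ("Betty", 541), "last": ("Blue", 541), "level": (63, 541)}}},
    {"maybe_keys": {"pk2", "pk7"},
     "rows": {"pk7": {"first": ("Elizabeth", 994)}}},
]
cache = {}
print(read("pk7", cache, memtable, sstables))
print(read("pk7", cache, memtable, sstables))
```

The first call merges three fragments into first:Elizabeth, last:Blue, level:42; the second call is answered from the row cache without touching the MemTable or SSTables.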
What is a Bloom filter and how does it optimize a read?
•  A probabilistic data structure testing if a key may be in an SSTable
•  each SSTable has a Bloom filter on disk, used from off-heap memory
•  false positives are possible, false negatives are not
•  larger tables have a higher possibility of false positives
•  1 GB to 2 GB per billion partitions in an SSTable
•  Eliminates seeking a partition key in any SSTable without it
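A minimal Bloom filter makes the "maybe present / definitely absent" behaviour concrete. This is a teaching sketch, not Cassandra's filter: the `BloomFilter` class is hypothetical, the bit-array size and hash count are arbitrary, and SHA-256 with a seed prefix stands in for the hash functions a real implementation would use.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests (an assumption).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely not here"; True means "possibly here".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


sstable_filter = BloomFilter()
for key in ("pk1", "pk2", "pk7"):
    sstable_filter.add(key)

print(sstable_filter.might_contain("pk7"))   # True: added keys are never missed
print(sstable_filter.might_contain("pk42"))  # usually False; a True here would be a false positive
```

A read can skip every SSTable whose filter answers False, and only pays for a disk lookup when the answer is True. The more keys share a fixed-size filter, the more bits are set and the more often an absent key looks present.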
[Diagram: a read checks the Bloom Filter of each SSTable; on a hit, the Key Cache and Partition Index locate pk7 in that SSTable.]
How do you execute CQL queries in cqlsh?
Where do I learn more?
For the Dev: http://www.datastax.com/dev
Docs: http://docs.datastax.com/en/index.html
Planet C*: http://planetcassandra.org/
Driver Guide: http://planetcassandra.org/getting-started-with-apache-cassandra-and-java/
My favorite blogs: http://tobert.github.io/
http://patrickmcfadin.com/
http://rustyrazorblade.com/
https://ahappyknockoutmouse.wordpress.com/author/anukeus/
http://thelastpickle.com/blog/
My favorite C* book: http://www.amazon.com/Cassandra-High-Availability-Robbie-Strickland/dp/1783989122
DataStax Academy: https://academy.datastax.com/
Free Training: http://www.datastax.com/what-we-offer/products-services/training
yo
u Cassandra
Dani Traphagen


http://datastax.com


dani.traphagen@datastax.com


@dtrapezoid

Weitere ähnliche Inhalte

Was ist angesagt?

Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra nehabsairam
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016DataStax
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architectureMarkus Klems
 
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화OpenStack Korea Community
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecturenickmbailey
 
Apache Sentry for Hadoop security
Apache Sentry for Hadoop securityApache Sentry for Hadoop security
Apache Sentry for Hadoop securitybigdatagurus_meetup
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...DataStax Academy
 
A glimpse of cassandra 4.0 features netflix
A glimpse of cassandra 4.0 features   netflixA glimpse of cassandra 4.0 features   netflix
A glimpse of cassandra 4.0 features netflixVinay Kumar Chella
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basicsnickmbailey
 
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...ScyllaDB
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandraNguyen Quang
 
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...DataStax
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsCommand Prompt., Inc
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseDataStax
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLScyllaDB
 
NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraFolio3 Software
 
Galera cluster for high availability
Galera cluster for high availability Galera cluster for high availability
Galera cluster for high availability Mydbops
 

Was ist angesagt? (20)

Cassandra
CassandraCassandra
Cassandra
 
Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra
 
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
Myths of Big Partitions (Robert Stupp, DataStax) | Cassandra Summit 2016
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
[OpenStack Days Korea 2016] Track1 - All flash CEPH 구성 및 최적화
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecture
 
Apache Sentry for Hadoop security
Apache Sentry for Hadoop securityApache Sentry for Hadoop security
Apache Sentry for Hadoop security
 
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
Apache Cassandra and DataStax Enterprise Explained with Peter Halliday at Wil...
 
A glimpse of cassandra 4.0 features netflix
A glimpse of cassandra 4.0 features   netflixA glimpse of cassandra 4.0 features   netflix
A glimpse of cassandra 4.0 features netflix
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
Use ScyllaDB Alternator to Use Amazon DynamoDB API, Everywhere, Better, More ...
 
Cassandra 101
Cassandra 101Cassandra 101
Cassandra 101
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
 
PostgreSQL Administration for System Administrators
PostgreSQL Administration for System AdministratorsPostgreSQL Administration for System Administrators
PostgreSQL Administration for System Administrators
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
 
Modeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQLModeling Data and Queries for Wide Column NoSQL
Modeling Data and Queries for Wide Column NoSQL
 
NOSQL Database: Apache Cassandra
NOSQL Database: Apache CassandraNOSQL Database: Apache Cassandra
NOSQL Database: Apache Cassandra
 
Galera cluster for high availability
Galera cluster for high availability Galera cluster for high availability
Galera cluster for high availability
 

Andere mochten auch

Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsnarsiman
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraDataStax
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013jbellis
 
A Deep Dive Into Understanding Apache Cassandra
A Deep Dive Into Understanding Apache CassandraA Deep Dive Into Understanding Apache Cassandra
A Deep Dive Into Understanding Apache CassandraDataStax Academy
 
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...DataStax Academy
 
Cassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestCassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestDuyhai Doan
 
Cassandra Summit 2014: CQL Under the Hood
Cassandra Summit 2014: CQL Under the HoodCassandra Summit 2014: CQL Under the Hood
Cassandra Summit 2014: CQL Under the HoodDataStax Academy
 
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...DataStax
 
DataStax: A deep look at the CQL WHERE clause
DataStax: A deep look at the CQL WHERE clauseDataStax: A deep look at the CQL WHERE clause
DataStax: A deep look at the CQL WHERE clauseDataStax Academy
 
Cassandra for the ops dos and donts
Cassandra for the ops   dos and dontsCassandra for the ops   dos and donts
Cassandra for the ops dos and dontsDuyhai Doan
 
24 compliments for guys they’ll never forget
24 compliments for guys they’ll never forget24 compliments for guys they’ll never forget
24 compliments for guys they’ll never forgetwomansvibe
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)DataStax Academy
 
Modern Algorithms and Data Structures - 1. Bloom Filters, Merkle Trees
Modern Algorithms and Data Structures - 1. Bloom Filters, Merkle TreesModern Algorithms and Data Structures - 1. Bloom Filters, Merkle Trees
Modern Algorithms and Data Structures - 1. Bloom Filters, Merkle TreesLorenzo Alberton
 

Andere mochten auch (14)

Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache Cassandra
 
Cassandra Summit EU 2013
Cassandra Summit EU 2013Cassandra Summit EU 2013
Cassandra Summit EU 2013
 
A Deep Dive Into Understanding Apache Cassandra
A Deep Dive Into Understanding Apache CassandraA Deep Dive Into Understanding Apache Cassandra
A Deep Dive Into Understanding Apache Cassandra
 
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...
Cassandra Day London 2015: Introduction to Apache Cassandra and DataStax Ente...
 
Cassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestCassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapest
 
Cassandra Summit 2014: CQL Under the Hood
Cassandra Summit 2014: CQL Under the HoodCassandra Summit 2014: CQL Under the Hood
Cassandra Summit 2014: CQL Under the Hood
 
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
C* Keys: Partitioning, Clustering, & CrossFit (Adam Hutson, DataScale) | Cass...
 
Ricki k
Ricki kRicki k
Ricki k
 
DataStax: A deep look at the CQL WHERE clause
DataStax: A deep look at the CQL WHERE clauseDataStax: A deep look at the CQL WHERE clause
DataStax: A deep look at the CQL WHERE clause
 
Cassandra for the ops dos and donts
Cassandra for the ops   dos and dontsCassandra for the ops   dos and donts
Cassandra for the ops dos and donts
 
24 compliments for guys they’ll never forget
24 compliments for guys they’ll never forget24 compliments for guys they’ll never forget
24 compliments for guys they’ll never forget
 
How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)How to size up an Apache Cassandra cluster (Training)
How to size up an Apache Cassandra cluster (Training)
 
Modern Algorithms and Data Structures - 1. Bloom Filters, Merkle Trees
Modern Algorithms and Data Structures - 1. Bloom Filters, Merkle TreesModern Algorithms and Data Structures - 1. Bloom Filters, Merkle Trees
Modern Algorithms and Data Structures - 1. Bloom Filters, Merkle Trees
 

Ähnlich wie Intro to Cassandra

SRV405 Deep Dive on Amazon Redshift
SRV405 Deep Dive on Amazon RedshiftSRV405 Deep Dive on Amazon Redshift
SRV405 Deep Dive on Amazon RedshiftAmazon Web Services
 
NOSQL and Cassandra
NOSQL and CassandraNOSQL and Cassandra
NOSQL and Cassandrarantav
 
Data Warehousing in the Era of Big Data
Data Warehousing in the Era of Big DataData Warehousing in the Era of Big Data
Data Warehousing in the Era of Big DataAmazon Web Services
 
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...
Best Practices for Data Warehousing with Amazon Redshift | AWS Public Sector ...Amazon Web Services
 
Data Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftData Warehousing with Amazon Redshift
Data Warehousing with Amazon RedshiftAmazon Web Services
 
Ben Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra ProjectBen Coverston - The Apache Cassandra Project
Ben Coverston - The Apache Cassandra ProjectMorningstar Tech Talks
 
Cassandra Community Webinar: Back to Basics with CQL3
Cassandra Community Webinar: Back to Basics with CQL3Cassandra Community Webinar: Back to Basics with CQL3
Cassandra Community Webinar: Back to Basics with CQL3DataStax
 
Amazon Redshift Deep Dive - February Online Tech Talks
Amazon Redshift Deep Dive - February Online Tech TalksAmazon Redshift Deep Dive - February Online Tech Talks
Amazon Redshift Deep Dive - February Online Tech TalksAmazon Web Services
 
Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Jon Haddad
 
Scaling web applications with cassandra presentation
Scaling web applications with cassandra presentationScaling web applications with cassandra presentation
Scaling web applications with cassandra presentationMurat Çakal
 
The Right Data for the Right Job
The Right Data for the Right JobThe Right Data for the Right Job
The Right Data for the Right JobEmily Curtin
 
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSAWS SSA Webinar 20 - Getting Started with Data Warehouses on AWS
AWS SSA Webinar 20 - Getting Started with Data Warehouses on AWSCobus Bernard
 
Cassandra introduction mars jug
Cassandra introduction mars jugCassandra introduction mars jug
Cassandra introduction mars jugDuyhai Doan
 
Big data analytics with Spark & Cassandra
Big data analytics with Spark & Cassandra Big data analytics with Spark & Cassandra
Big data analytics with Spark & Cassandra Matthias Niehoff
 
Deep Dive into Cassandra
Deep Dive into CassandraDeep Dive into Cassandra
Deep Dive into CassandraBrent Theisen
 

Intro to Cassandra

  • 2. Background
      ‣ White paper blend: Amazon Dynamo (2007) and Google Bigtable (2006)
      ‣ What about Facebook?
      ‣ Open Source Successes
      ‣ & cue DataStax
      ‣ So what’s in a name?
  • 6. What we believe:
      ‣ The network is reliable.
      ‣ Latency is zero.
      ‣ Bandwidth is infinite.
      ‣ The network is secure.
      ‣ Topology doesn't change.
      ‣ There is one administrator.
      ‣ Transport cost is zero.
      ‣ The network is homogeneous.
      http://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
  • 7. Hey, I’ve got some data!
  • 8. I’ve got some bigger data!
  • 13. Architected w/ scale in mind
      ‣ Masterless - Peer to Peer
      ‣ High Availability
      ‣ No SPOFs
      ‣ Multi-DC
      ‣ Runs on Commodity Hardware
      ‣ Linear Scalability: More throughput?
      ‣ Fast
      ‣ Distributed
      ‣ Operationally easy to manage
      [Map: San Francisco, New York, Stockholm]
  • 14. What is a C* cluster? A peer-to-peer set of nodes
      ‣ Node – one Cassandra instance
      ‣ Rack – a logical set of nodes
      ‣ Data Center – a logical set of racks
      ‣ Cluster – the full set of nodes which map to a single complete token ring
      [Diagram: a Cassandra cluster with Data Center - East (Nodes 1 to 8) and Data Center - West (Nodes 1 to 4), each divided into Rack 1 and Rack 2; token range -2^63 to +2^63 (Murmur3)]
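The token ring on slide 14 can be illustrated with a small Python sketch. This is only a teaching model under stated assumptions: `token_for` uses an MD5-based stand-in hash rather than the real Murmur3 partitioner, each node gets a single evenly spaced token, and the helper names (`build_ring`, `owner`) are hypothetical.

```python
import bisect
import hashlib

MIN_TOKEN = -2**63      # lower bound of the signed 64-bit token range
MAX_TOKEN = 2**63 - 1   # upper bound of the signed 64-bit token range


def token_for(partition_key: str) -> int:
    """Map a partition key to a signed 64-bit token (stand-in hash, NOT Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def build_ring(node_names):
    """Assign one evenly spaced token per node; returns sorted (token, node) pairs."""
    step = 2**64 // len(node_names)
    return [(MIN_TOKEN + i * step, name) for i, name in enumerate(node_names)]


def owner(ring, token):
    """Return the first node whose token is >= the given token, wrapping around the ring."""
    tokens = [t for t, _ in ring]
    idx = bisect.bisect_left(tokens, token)
    return ring[idx % len(ring)][1]


ring = build_ring(["Node 1", "Node 2", "Node 3", "Node 4"])
print(owner(ring, token_for("The Beatles")))
```

Because every node can run the same lookup, any node can work out which peer owns a key, which is what makes the masterless design possible.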
  • 15. Peer to Peer
      [Diagram: a five-node ring; Node 1 holds the 1st copy, Node 2 the 2nd copy, Node 3 the 3rd copy]
  • 16. Data Centers
      [Diagram: two five-node rings, DC: USA and DC: EUROPE, each holding copies of the data on Nodes 1, 2 and 3]
  • 17. Tunable Consistency: parallel write
      Write Consistency Level = QUORUM, Replication Factor = 3
      [Diagram: the write goes in parallel to Node 1 (1st copy), Node 2 (2nd copy) and Node 3 (3rd copy) on a five-node ring; ack times shown: 5 µs, 12 µs, 500 µs, 12 µs]
  • 18. Continuous availability: parallel read
      Read Consistency Level = QUORUM, Replication Factor = 3
      [Diagram: the read goes in parallel to the replicas on Node 1 (1st copy), Node 2 (2nd copy) and Node 3 (3rd copy); label: Hints]
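The QUORUM settings on the two slides above can be worked through in a few lines of Python. This is a rough sketch, not driver code: the function names are hypothetical, and the three latencies are taken from the ack times on slide 17 on the assumption that they belong to the three replicas.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def time_to_consistency(ack_latencies_us, required_acks):
    """Latency at which the required number of acks has arrived, or None if it never does."""
    if required_acks > len(ack_latencies_us):
        return None
    return sorted(ack_latencies_us)[required_acks - 1]


def read_sees_latest_write(read_acks: int, write_acks: int, replication_factor: int) -> bool:
    """Read and write replica sets must overlap for a read to see the latest write."""
    return read_acks + write_acks > replication_factor


rf = 3
acks = [5, 12, 500]  # assumed per-replica ack times in microseconds
print(quorum(rf))
print(time_to_consistency(acks, quorum(rf)))
print(read_sees_latest_write(quorum(rf), quorum(rf), rf))
```

With RF = 3 the coordinator only waits for the two fastest replicas, so the slow 500 µs replica does not delay the response, and a QUORUM read always overlaps a QUORUM write on at least one replica.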
  • 20. CQL
  • 21. How to create, use and drop keyspaces/schemas?
      • To create a keyspace:
          CREATE KEYSPACE musicdb WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
      • To assign the working default keyspace for a cqlsh session:
          USE musicdb;
      • To delete a keyspace and all internal data objects:
          DROP KEYSPACE musicdb;
  • 22. What is the syntax of the CREATE TABLE statement?
      • The CQL below creates a table in the current keyspace
      Primary key declared in separate clause:
          CREATE TABLE performer (
            name VARCHAR,
            type VARCHAR,
            country VARCHAR,
            style VARCHAR,
            founded INT,
            born INT,
            died INT,
            PRIMARY KEY (name)
          );
      Primary key declared inline:
          CREATE TABLE performer (
            name VARCHAR PRIMARY KEY,
            type VARCHAR,
            country VARCHAR,
            style VARCHAR,
            founded INT,
            born INT,
            died INT
          );
  • 24. What is a CQL table and how is it related to a column family?
      • A CQL table is a column family
          • CQL tables provide two-dimensional views of a column family, which contains potentially multi-dimensional data, due to composite keys and collections
      • CQL table and column family are largely interchangeable terms
          • Not surprising when you recall tables and relations, columns and attributes, rows and tuples in relational databases
      • Supported by a declarative language, the Cassandra Query Language (CQL)
          • Data Definition Language, a subset of CQL
          • SQL-like syntax, but with somewhat different semantics
          • Convenient for defining and expressing Cassandra database schemas
  • 25. What are row, row key, column key, and column value?
      • Row is the smallest unit that stores related data in Cassandra
          • Rows – individual rows constitute a column family
          • Row key – uniquely identifies a row in a column family
          • Row – stores pairs of column keys and column values
          • Column key – uniquely identifies a column value in a row
          • Column value – stores one value or a collection of values
      [Diagram: a row with its row key, column keys (or column names) cola, colb, colc, cold, and column values (or cells) va, vb, vc, vd]
      ©2014 DataStax Training. Use only with permission. Slide 4
  • 26. What are partition, partition key, row, column, and cell?
      • Table with single-row partitions
      • Column family view
      performer        born  country  died  founded  style  type
      John Lennon      1940  England  1980  -        Rock   artist
      Paul McCartney   1942  England  -     -        Rock   artist
      The Beatles      -     England  -     1957     Rock   band
      (Callouts: performer is the partition key; each line is a partition holding one row; born through type are columns; individual values are cells.)
  • 28. What is a data modeling framework?
      • Defines transitions between models
          • Query-driven methodology
          • Formal analysis and validation
      • Defines a scientific approach to data modeling
          • Modeling rules
          • Mapping patterns
          • Schema optimization techniques
      [Diagram: Conceptual Model → Logical Model → Physical Model, connected by Query-Driven Methodology and Analysis & Validation]
  • 29. What is a conceptual data model?
      • Conceptual data model for music data
      • ER diagram (Chen notation)
      • Describes entities, relationships, roles, keys, cardinalities
      • What is possible and what is not in existing or future data
      [ER diagram: Performer (name, founded, country, style) with IsA subtypes Artist (born, died) and Band (disjoint, covering); Album (title, year, genre, format, cover image); Track (number, title); User (id, email, name, preferences); Activity (id, timestamp) with IsA subtypes Rate (rating) and Play (disjoint, not covering). Relationships: releases, has member (period), has, performs, involvedIn.]
  • 30. A sample data model
      ACCESS PATTERNS
      Q1: Find performers for a specified style; order by performer (ASC).
      Q2: Find information for a specified performer (artist or band).
      Q3: Find information for a specified album (title and year).
      Q4: Find albums for a specified performer; order by album release year (DESC) and title (ASC).
      Q5: Find albums for a specified genre; order by performer (ASC), year (DESC), and title (ASC).
      Q6: Find albums and performers for a specified track title; order by performer (ASC), year (DESC), and title (ASC).
      Q7: Find tracks for a specified album (title and year); order by track number (ASC).
      Q8: Find information for a specified user.
      Q9: Find activities for a specified user; order by activity time (DESC).
      Q10: Find statistics for a specified track.
      Q11: Find user activities for a specified track; order by activity time (DESC).
      Q12: Find user activities for a specified activity type.
      …
      TABLES (K = partition key, C↑ / C↓ = clustering column in ascending / descending order, S = static column, IDX = secondary index)
      Performer: name K, type, country, style, founded, born, died
      Performers_by_style: style K, name C↑
      Albums_by_performer: performer K, year C↓, title C↑, genre
      Albums_by_genre: genre K, performer C↑, year C↓, title C↑
      Tracks_by_album: album K, year K, number C↑, performer S, genre S, title
      Albums_by_track: track K, performer C↑, year C↓, title C↑
      Album: title K, year K, performer, genre, tracks (map)
      User: id K, name, email, preferences (set)
      Activities_by_user: user K, activity (timeuuid) C↓, type IDX, album_title, album_year, track_title, rating
      Activities_by_track: album_title K, album_year K, track_title K, activity (timeuuid) C↓, user, type, rating
      Track_stats: album_title K, album_year K, track_title K, num_ratings (counter), sum_ratings (counter), num_plays (counter)
      [Diagram: arrows labelled Q1 to Q12 connect the tables to the access patterns they serve]
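To make the query-driven idea concrete, here is a small in-memory Python sketch of the Albums_by_performer table serving Q4. It is only an illustration: a dictionary stands in for partitions, a sorted list stands in for clustering order (year DESC, title ASC), and the helper names and sample rows are assumptions rather than part of the deck.

```python
from collections import defaultdict

# partition key (performer) -> rows kept in clustering order (year DESC, title ASC)
albums_by_performer = defaultdict(list)


def insert_album(performer, year, title, genre):
    """Add a row to the performer's partition and keep the clustering order."""
    rows = albums_by_performer[performer]
    rows.append({"year": year, "title": title, "genre": genre})
    rows.sort(key=lambda r: (-r["year"], r["title"]))


def q4(performer):
    """Q4: albums for a performer, already ordered by year (DESC) and title (ASC)."""
    return [(r["year"], r["title"]) for r in albums_by_performer.get(performer, [])]


# Illustrative sample rows
insert_album("The Beatles", 1966, "Revolver", "Rock")
insert_album("The Beatles", 1970, "Let It Be", "Rock")
insert_album("The Beatles", 1969, "Yellow Submarine", "Rock")
insert_album("The Beatles", 1969, "Abbey Road", "Rock")

print(q4("The Beatles"))
```

The point of the design is that the ordering work happens at write time, so the read for Q4 is a single-partition lookup with no sorting or joining.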
  • 32. How does Cassandra write so fast?
      • Cassandra is a log-structured storage engine
      • Data is sequentially appended, not placed in pre-set locations
      [Diagram: an RDBMS seeks and writes values to various pre-set locations; Cassandra continuously appends to a log]
  • 33. What are the key components of the write path?
      • Each node implements four key components to handle its writes
          • Memtables – in-memory tables corresponding to CQL tables, with indexes
          • CommitLog – append-only log, replayed to restore a downed node's Memtables
          • SSTables – Memtable snapshots periodically flushed to disk, clearing heap
          • Compaction – periodic process to merge and streamline SSTables
      • When any node receives a write request
          1. The record is appended to the CommitLog, and
          2. The record is appended to the Memtable for this record's target CQL table
          3. Periodically, Memtables flush to SSTables, clearing JVM heap and CommitLog
          4. Periodically, Compaction runs to merge and streamline SSTables
  • 34. How does the write path flow on a node?
      [Diagram: the client sends the write <3, Betty, Blue, 63> through the coordinator to the node. Each write request is appended to the CommitLog (append only, node file system) and applied to the Memtable (node memory, corresponds to a CQL table), then acknowledged. The Memtable holds partition key1 (first:Oscar, last:Orange, level:42), partition key2 (first:Ricky, last:Red) and partition key3 (first:Betty, last:Blue, level:63). Periodically the current state is flushed to an SSTable; periodically Compaction compacts related SSTables.]
  • 35. What are Memtables and how are they flushed to disk?
      • Memtables are in-memory representations of a CQL table
          • Each node has a Memtable for each CQL table in the keyspace
          • Each Memtable accrues writes and provides reads for data not yet flushed
          • Updates to Memtables mutate the in-memory partition
      • When a Memtable flushes to disk
          1. Current Memtable data is written to a new immutable SSTable on disk
          2. JVM heap space is reclaimed from the flushed data
          3. Corresponding CommitLog entries are marked as flushed
      [Diagram: a Memtable holding partition key1 (first:Oscar, last:Orange, level:42), partition key2 (first:Ricky, last:Red) and partition key3 (first:Betty, last:Blue, level:63)]
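The write path on slides 33 to 35 can be sketched as a toy Python class. This is a simplified model under stated assumptions: the class and its `flush_threshold` parameter are hypothetical, everything lives in memory, and a flush is triggered by partition count instead of the size-based and time-based triggers a real node would use.

```python
class Node:
    """Toy model of one node's write path: CommitLog, Memtable, SSTables."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # append-only list of (partition_key, columns)
        self.memtable = {}     # partition_key -> {column: value}
        self.sstables = []     # immutable, sorted snapshots of flushed Memtables
        self.flush_threshold = flush_threshold  # assumed trigger: number of partitions

    def write(self, partition_key, columns):
        # 1. Append the record to the CommitLog.
        self.commit_log.append((partition_key, dict(columns)))
        # 2. Apply the record to the Memtable (updates mutate the in-memory partition).
        self.memtable.setdefault(partition_key, {}).update(columns)
        # 3. Flush once the Memtable holds enough partitions.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()
        return "ack"

    def flush(self):
        # Write the Memtable as an immutable SSTable sorted by partition key,
        # then reclaim the Memtable and the CommitLog entries it covered.
        sstable = tuple(
            sorted((key, tuple(sorted(cols.items()))) for key, cols in self.memtable.items())
        )
        self.sstables.append(sstable)
        self.memtable = {}
        self.commit_log = []


node = Node()
node.write("key1", {"first": "Oscar", "last": "Orange", "level": 42})
node.write("key2", {"first": "Ricky", "last": "Red"})            # triggers a flush
node.write("key3", {"first": "Betty", "last": "Blue", "level": 63})
print(len(node.sstables), node.memtable)
```

Note that a write never seeks to an existing location: it appends to the log and mutates memory, and disk is only touched again in bulk during a flush.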
  • 36. What is an SSTable and what are its characteristics?
      • An SSTable ("sorted string table") is
          • an immutable file of sorted partitions
          • written to disk through fast, sequential I/O
          • a record of the state of a Memtable when it was flushed
      • The current data state of a CQL table consists of
          • its corresponding Memtable plus
          • all current SSTables flushed from that Memtable
      • SSTables are periodically compacted from many to one
  • 37. What is an SSTable and what are its characteristics?
      • For each SSTable, two structures are created
          • Partition index – list of its primary keys and row start positions
          • Partition summary – in-memory sample of its partition index (default: 1 partition key out of every 128)
      [Diagram: a Memtable (corresponds to a CQL table) and its flushed SSTables, each SSTable with its own Summary and Index]
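How a sampled partition summary narrows a lookup in the partition index can be sketched as follows. This is a minimal illustration, assuming the index is a sorted list of (key, offset) pairs held in memory; the function names, key format and offsets are hypothetical.

```python
import bisect

SAMPLE_INTERVAL = 128  # default from the slide: 1 partition key out of every 128


def build_summary(partition_index, interval=SAMPLE_INTERVAL):
    """Sample every interval-th entry of the sorted index as (key, position in index)."""
    return [(partition_index[i][0], i) for i in range(0, len(partition_index), interval)]


def lookup(partition_index, summary, key, interval=SAMPLE_INTERVAL):
    """Use the summary to pick one index slice, then scan only that slice for the key."""
    sampled_keys = [k for k, _ in summary]
    slot = bisect.bisect_right(sampled_keys, key) - 1
    if slot < 0:
        return None  # key sorts before the first partition in this SSTable
    start = summary[slot][1]
    for candidate, offset in partition_index[start:start + interval]:
        if candidate == key:
            return offset
    return None


# Hypothetical index: 1000 sorted partition keys with made-up row start positions
index = [(f"pk{i:05d}", i * 100) for i in range(1000)]
summary = build_summary(index)
print(len(summary), lookup(index, summary, "pk00300"))
```

The summary is small enough to stay in memory, and each lookup reads at most one slice of 128 index entries instead of the whole index.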
  • 38. What is compaction?
      • Updates do mutate Memtable partitions, but SSTables are immutable
          • no SSTable seeks/overwrites
          • SSTables just accrue new timestamped updates
      • So, SSTables must be periodically compacted
          • related SSTables are merged
          • most recent version of each column is compiled to one partition in one new SSTable
          • partitions marked for deletion are evicted
          • old SSTables are deleted
      [Diagram: a Memtable (corresponds to a CQL table) and several SSTables, each with Summary and Index, merged by Compaction]
      Note: Compaction and the Read Path are discussed in further detail later in this course.
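The merge rules listed on the compaction slide can be sketched in Python. This is a simplified model: SSTables are plain dictionaries, deletions are represented by a hypothetical `TOMBSTONE` marker on a column, and deleted data is dropped immediately, whereas a real node keeps deletion markers for a grace period first.

```python
TOMBSTONE = object()  # hypothetical marker for a deleted column value


def compact(sstables):
    """Merge SSTables into one: keep the newest version of each column, drop deleted data.

    Each SSTable is modelled as {partition_key: {column: (timestamp, value)}}.
    """
    merged = {}
    for sstable in sstables:
        for key, columns in sstable.items():
            target = merged.setdefault(key, {})
            for column, (timestamp, value) in columns.items():
                if column not in target or timestamp > target[column][0]:
                    target[column] = (timestamp, value)

    result = {}
    for key in sorted(merged):  # the new SSTable is again sorted by partition key
        live = {c: tv for c, tv in merged[key].items() if tv[1] is not TOMBSTONE}
        if live:  # partitions with nothing but deletions are evicted
            result[key] = live
    return result


older = {
    "pk1": {"first": (100, "Oscar")},
    "pk7": {"first": (541, "Betty"), "last": (541, "Blue"), "level": (541, 63)},
}
newer = {
    "pk1": {"first": (200, TOMBSTONE)},
    "pk7": {"first": (994, "Elizabeth")},
}
print(compact([older, newer]))
```

After the merge, the two input SSTables would be deleted and only the single compacted SSTable remains.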
  • 39. What is sstable2json?
      • bin/sstable2json is a utility which exports an SSTable in JSON format, for testing and debugging
          • -k include only a specified set of keys, given in HEX format (limit: 500)
          • -x exclude a specified set of keys (limit: 500)
          • -e enumerate keys only
      ./sstable2json [full_path_to_SSTable_Data_file] | more
  • 40. READS
  • 41. How does the read path flow on each node? (row cache hit)
      [Diagram: the coordinator sends Read <pk7> to the node. The optional row cache holds pk1, pk2 and pk7, so the read is a hit and the row pk7 first:Elizabeth last:Blue level:42 is returned. Behind the cache, the MemTable (e.g., player) holds pk7 level:42 (timestamp 1114); one SSTable holds pk7 first:Betty, last:Blue, level:63 (all timestamp 541); another SSTable holds pk7 first:Elizabeth (timestamp 994). Labels: node memory (off heap, on heap), node file system.]
  • 42. How does the read path flow on each node? (key cache hit)
      [Diagram: same data as slide 41, but the row cache holds only pk1 and pk2, so Read <pk7> misses it. Each SSTable's Bloom filter is checked, and the key cache holds pk7 (hit), so the partition is located in the SSTables directly. The result merges the MemTable and SSTable values into pk7 first:Elizabeth last:Blue level:42.]
  • 43. How does the read path flow on each node? (row and key cache miss)
      [Diagram: same data as slide 41. The row cache holds only pk1 and pk2 (miss) and the key cache does not hold pk7 (miss). After the Bloom filter check, the read goes through each SSTable's partition summary and then its partition index to find pk7. The result is again pk7 first:Elizabeth last:Blue level:42.]
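The merged row shown on the read path slides follows from taking the newest timestamp per column across the MemTable and the SSTables. Below is a minimal Python sketch using the values from the diagrams; the dictionary layout and the function name `merge_read` are assumptions, and caches, Bloom filters and indexes are left out.

```python
def merge_read(key, memtable, sstables):
    """Return the row for key, taking the newest timestamped value of every column."""
    newest = {}
    for source in [memtable, *sstables]:
        for column, (timestamp, value) in source.get(key, {}).items():
            if column not in newest or timestamp > newest[column][0]:
                newest[column] = (timestamp, value)
    return {column: value for column, (timestamp, value) in newest.items()}


# Values and timestamps as shown on the slides
memtable = {"pk7": {"level": (1114, 42)}}
sstable_a = {"pk7": {"first": (541, "Betty"), "last": (541, "Blue"), "level": (541, 63)}}
sstable_b = {"pk7": {"first": (994, "Elizabeth")}}

print(merge_read("pk7", memtable, [sstable_a, sstable_b]))
```

This is why a read may have to touch several SSTables: no single file is guaranteed to hold the latest value of every column.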
  • 44. What is a Bloom filter and how does it optimize a read?
      • A probabilistic data structure testing if a key may be in an SSTable
          • each SSTable has a Bloom filter on disk, used from off-heap memory
          • false positives are possible, false negatives are not
          • larger tables have a higher possibility of false positives
          • 1 GB to 2 GB per billion partitions in an SSTable
      • Eliminates seeking a partition key in any SSTable without it
      [Diagram: Read <pk7> is checked against the Bloom filter of each of three SSTables; only the two SSTables that may contain pk7 (hit) are read via the key cache and partition index]
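A Bloom filter is small enough to sketch in full. The version below is a generic textbook construction, not Cassandra's implementation: the bit-array size, the number of hash functions and the use of SHA-256 with a seed prefix are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


sstable_filter = BloomFilter()
for partition_key in ["pk1", "pk2", "pk7"]:
    sstable_filter.add(partition_key)

print(sstable_filter.might_contain("pk7"))
```

On a read, an SSTable whose filter answers False can be skipped without any disk seek; a True answer still has to be confirmed against the SSTable itself.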
  • 45. How do you execute CQL queries in cqlsh?
  • 46. Where do I learn more?
      For the Dev: http://www.datastax.com/dev
      Docs: http://docs.datastax.com/en/index.html
      Planet C*: http://planetcassandra.org/
      Driver Guide: http://planetcassandra.org/getting-started-with-apache-cassandra-and-java/
      My favorite blogs:
          http://tobert.github.io/
          http://patrickmcfadin.com/
          http://rustyrazorblade.com/
          https://ahappyknockoutmouse.wordpress.com/author/anukeus/
          http://thelastpickle.com/blog/
      My favorite C* book: http://www.amazon.com/Cassandra-High-Availability-Robbie-Strickland/dp/1783989122
      DataStax Academy: https://academy.datastax.com/
      Free Training: http://www.datastax.com/what-we-offer/products-services/training