BASLE BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA
HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH
Apache Cassandra
Under The Hood
Robert Bialek
Who Am I
Apache Cassandra Under The Hood2 15.09.2018
Senior Principal Consultant and Trainer at Trivadis GmbH in Munich.
– Master of Science in Computer Engineering.
– At Trivadis since 2004.
– Trivadis Partner since 2012.
Focus:
– Data and service high availability, disaster recovery.
– Architecture design, optimization, automation.
– Troubleshooting.
– Trainer: O-RAC, O-DG.
Agenda
1. Introduction
2. Key Components
3. Data Replication
4. Scalability
5. Read/Write Operations
6. Data Consistency
7. Summary
Introduction
What is Apache Cassandra?
Distributed NoSQL (wide column) partitioned row store database, which runs within a
JVM.
Decentralized, highly fault tolerant database with no single point of failure.
Horizontally scalable system (computing resources/performance).
Initially developed at Facebook, released as an open source project in July 2008.
– Based on Amazon's Dynamo and Google's Bigtable.
Apache Cassandra & CAP Theorem?
According to CAP (Brewer’s) theorem “it is impossible for a distributed data store to
simultaneously provide more than two out of the following three guarantees”
– Consistency
– Availability
– Partition tolerance
Apache Cassandra is an AP system.
– Data is eventually consistent (though consistency is tunable).
– Does not adhere to all ACID properties.
Cassandra for Enterprise Applications
Support 24x7x365.
Enterprise features, e.g.: DSE Advanced Security, DSE Analytics, DSE Search, DSE
Graph, DSE Advanced Replication, DSE Tiered Storage, DSE NodeSync, ...
Administration and monitoring with DSE OpsCenter (real-time monitoring, tuning,
provisioning, backup, security management).
According to DataStax, 2x or more throughput compared to Apache Cassandra.
Documentation, client drivers and DSE for development are free to use.
Who is Using Cassandra Database?
Source: http://cassandra.apache.org
– Apple: over 75,000 nodes storing over 10 PB of data.
– Netflix: 2,500 nodes, 420 TB, over 1 trillion requests per day.
– Chinese search engine Easou: 270 nodes, 300 TB, over 800 million requests per day.
– eBay: over 100 nodes, 250 TB.
Source: https://www.datastax.com/customers
– Microsoft, UBS, Sony, Sky, ING, NEC, Coursera, Cisco, Walmart, NVIDIA, Samsung, …
Key Components
Node – Basic Database Infrastructure
Commodity hardware, ideally with local storage (to reduce dependencies).
Hosts software and configuration files:
– cassandra.yaml, cassandra-rackdc.properties, …
Hosts data and accompanying structures:
– Data.db (SSTable), Index.db, Statistics.db, CompressionInfo.db, Digest.crc32, Filter.db, TOC.txt
(Figure: Cassandra node; in DSE, a transactional node.)
Keyspaces & Tables
Table (Column Family)
– Stores data based on a primary key.
• Primary key: partition key plus optional clustering columns.
– Physically split into partitions.
– Denormalization (data duplication) is necessary.
Keyspace
– Grouping of data, similar to a schema.
– Defines replication properties.
Partitioner – Data Distribution
Determines which node receives data, based on the token of the partition key.
Supplied partitioners (custom ones can be created):
– Murmur3Partitioner (default)
– RandomPartitioner
– ByteOrderedPartitioner
Example: Data 'Cassandra' → Partitioner → Token 356242581507269238
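To make the partitioner's role concrete, here is a minimal Python sketch that hashes a partition key to a token. It is a toy under stated assumptions: it uses MD5 from the standard library (the general idea behind RandomPartitioner) instead of Murmur3, the function name `toy_token` is hypothetical, and the token space 1 to 40 is borrowed from the example partitioner used on the ring slides. It will therefore not reproduce the token value shown above.

```python
import hashlib

# Toy token space (assumption for illustration, not Cassandra's real range).
TOKEN_MIN, TOKEN_MAX = 1, 40


def toy_token(partition_key: str) -> int:
    """Map a partition key to a token in [TOKEN_MIN, TOKEN_MAX]."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    span = TOKEN_MAX - TOKEN_MIN + 1
    # The same key always hashes to the same token, so every node can
    # compute independently where a row belongs.
    return TOKEN_MIN + int.from_bytes(digest, "big") % span


if __name__ == "__main__":
    for key in ("Cassandra", "Dynamo", "Bigtable"):
        print(key, "->", toy_token(key))
```

The point of the sketch is determinism: any coordinator can compute the token from the key alone, without asking a central directory.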
Cassandra Ring – Single Token Architecture
Example partitioner token range: 1 – 40
Cassandra ring with four nodes, one token per node:
– initial_token: 1 → token range 31 – 40, 1
– initial_token: 10 → token range 2 – 10
– initial_token: 20 → token range 11 – 20
– initial_token: 30 → token range 21 – 30
Data → Partitioner → Token
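The ownership rule in the ring above (a node owns the tokens after its predecessor's token up to and including its own, with wrap-around) can be sketched in a few lines of Python. The function name `owner` is hypothetical; the four initial_token values are the ones from the slide.

```python
from bisect import bisect_left

# initial_token values of the four nodes in the example ring
NODE_TOKENS = [1, 10, 20, 30]


def owner(token: int, node_tokens=NODE_TOKENS) -> int:
    """Return the initial_token of the node that owns `token`.

    A node owns the range (previous node's token, own token]. Tokens
    above the highest initial_token wrap around to the first node.
    """
    ring = sorted(node_tokens)
    idx = bisect_left(ring, token)   # first node token >= requested token
    return ring[idx % len(ring)]     # idx == len(ring) wraps to ring[0]


if __name__ == "__main__":
    for t in (1, 5, 10, 11, 30, 31, 40):
        print(f"token {t:2d} -> node with initial_token {owner(t)}")
```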
Cassandra Ring – Virtual Nodes Architecture
Example partitioner token range: 1 – 40
Cassandra ring with four nodes, each configured with num_tokens: 5:
– Node 1, token ranges: 1 – 2, 11 – 12, 21 – 22, 33 – 34, 39 – 40
– Node 2, token ranges: 3 – 4, 9 – 10, 13 – 14, 23 – 24, 29 – 30
– Node 3, token ranges: 7 – 8, 17 – 18, 27 – 28, 31 – 32, 37 – 38
– Node 4, token ranges: 5 – 6, 15 – 16, 19 – 20, 25 – 26, 35 – 36
Data → Partitioner → Token
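A short Python sketch of the virtual-node layout above: each node holds several small, non-contiguous token ranges, and together they must cover the token space exactly once. The node names are hypothetical, and the sketch assumes that the range 13 – 14 belongs to the second node so that every token from 1 to 40 has exactly one owner.

```python
# Token ranges per node, following the slide (node names are hypothetical).
VNODES = {
    "node1": [(1, 2), (11, 12), (21, 22), (33, 34), (39, 40)],
    "node2": [(3, 4), (9, 10), (13, 14), (23, 24), (29, 30)],
    "node3": [(7, 8), (17, 18), (27, 28), (31, 32), (37, 38)],
    "node4": [(5, 6), (15, 16), (19, 20), (25, 26), (35, 36)],
}


def build_owner_map(vnodes):
    """Return {token: node}; fail if two nodes claim the same token."""
    owners = {}
    for node, ranges in vnodes.items():
        for low, high in ranges:
            for token in range(low, high + 1):
                if token in owners:
                    raise ValueError(
                        f"token {token} claimed by {owners[token]} and {node}"
                    )
                owners[token] = node
    return owners


OWNERS = build_owner_map(VNODES)

if __name__ == "__main__":
    for token in (1, 13, 26, 40):
        print(token, "->", OWNERS[token])
```

Because every node owns many small ranges, adding or removing a node moves a little data from many peers instead of a large block from one neighbour.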
Snitches – Ring Topology
Determines the physical location (datacenter and rack) of a Cassandra node.
Dynamic snitching (enabled by default):
– Monitors the read performance and ring health.
Supplied snitches:
– SimpleSnitch/DseSimpleSnitch (default)
– GossipingPropertyFileSnitch
– PropertyFileSnitch
– Ec2Snitch/Ec2MultiRegionSnitch/GoogleCloudSnitch/CloudstackSnitch
– RackInferringSnitch
(Figure: two datacenters, DC 1 and DC 2, each with Rack 1 and Rack 2.)
Gossip – Internode Communication
Peer-to-peer communication protocol to exchange
ring state information.
Gossip process runs every second and exchanges
messages with up to three other nodes in the ring.
Eventually, all nodes learn (indirectly) about all
other nodes.
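How ring state spreads indirectly can be illustrated with a small simulation. This is a sketch under simplifying assumptions, not Cassandra's gossip protocol: each node initially knows only itself and one seed node (node 0), a "round" stands for one gossip interval, and the function name `gossip_rounds` and its parameters are hypothetical.

```python
import random


def gossip_rounds(node_count=10, fanout=3, seed=42, max_rounds=1000):
    """Count rounds until every node knows about every other node.

    Per round, every node exchanges its view with up to `fanout` peers
    picked at random from the nodes it already knows about.
    Returns None if the views have not converged after `max_rounds`.
    """
    rng = random.Random(seed)
    known = {n: {n, 0} for n in range(node_count)}  # node -> known nodes
    everyone = set(range(node_count))
    for round_no in range(1, max_rounds + 1):
        for node in range(node_count):
            peers = sorted(known[node] - {node})
            for peer in rng.sample(peers, min(fanout, len(peers))):
                merged = known[node] | known[peer]  # two-way exchange
                known[node] = set(merged)
                known[peer] = set(merged)
        if all(view == everyone for view in known.values()):
            return round_no
    return None


if __name__ == "__main__":
    print("rounds until all nodes know the full ring:", gossip_rounds())
```

No node ever talks to all others in one round, yet all views converge, which is the property the slide describes.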
Scalability
Cassandra Ring – Scale Out
Increases computing power and
throughput of a Cassandra ring.
Online and transparent to the
applications.
(Figure: a new node, prepared with software and configuration files, contacts a seed node for ring information, starts joining the ring, generates its tokens, receives its data by streaming (bootstrap), and finishes joining the Cassandra ring.)
Cassandra Ring – Scale In
Decreases computing power of a
Cassandra ring.
Online and transparent to the
applications.
(Figure: a node leaving the Cassandra ring: DECOMMISSION → data streaming → remove tokens → DECOMMISSIONED.)
Data Replication
Replication – Data High Availability
To ensure data and service high availability, Cassandra stores data on multiple
nodes in a cluster.
All replicas are equally important (no primary or
secondary data).
Replication strategy and replication factor (RF) are
defined at the keyspace (application) level.
– RF can be set differently in different data centers.
Two replication strategies are available:
– SimpleStrategy
– NetworkTopologyStrategy
(Figure: two datacenters, DC 1 and DC 2, each with Rack 1 and Rack 2.)
Replication – SimpleStrategy (RF: 2)
(Figure: Data Center 1 with all nodes in Rack 1.)
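SimpleStrategy placement can be sketched as follows: the first replica goes to the node that owns the token, and the remaining replicas go to the next nodes clockwise on the ring, ignoring racks and datacenters. The function name `simple_strategy_replicas` and the four-node ring are illustrative assumptions.

```python
from bisect import bisect_left


def simple_strategy_replicas(token, node_tokens, rf=2):
    """Return the initial tokens of the nodes holding replicas of `token`.

    The owning node comes first, followed by the next rf - 1 nodes
    clockwise on the ring, without regard to rack or datacenter.
    """
    ring = sorted(node_tokens)
    if rf > len(ring):
        raise ValueError("replication factor exceeds number of nodes")
    start = bisect_left(ring, token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(rf)]


if __name__ == "__main__":
    ring = [1, 10, 20, 30]          # hypothetical single-token ring
    for t in (15, 25, 35):
        print(f"token {t} -> replicas on nodes {simple_strategy_replicas(t, ring)}")
```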
Replication – NetworkTopologyStrategy (RF/DC: 2)
(Figure: Data Center 1 and Data Center 2, each with Rack 1 and Rack 2.)
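A simplified Python sketch of topology-aware placement: for each datacenter, walk the ring clockwise from the token and pick nodes of that datacenter, preferring racks that do not hold a replica yet. The eight-node ring, the names `RING` and `nts_replicas`, and the fallback rule are assumptions for illustration; the real NetworkTopologyStrategy handles more cases.

```python
from bisect import bisect_left

# (initial_token, datacenter, rack): a hypothetical eight-node ring
RING = [
    (5, "DC1", "R1"), (10, "DC2", "R1"), (15, "DC1", "R1"), (20, "DC2", "R1"),
    (25, "DC1", "R2"), (30, "DC2", "R2"), (35, "DC1", "R2"), (40, "DC2", "R2"),
]


def nts_replicas(token, ring, rf_per_dc):
    """Pick replicas per datacenter, preferring nodes on distinct racks."""
    ring = sorted(ring)
    tokens = [t for t, _, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    clockwise = ring[start:] + ring[:start]
    result = {}
    for dc, rf in rf_per_dc.items():
        chosen, skipped, racks = [], [], set()
        for node in clockwise:
            if node[1] != dc:
                continue
            if node[2] in racks:
                skipped.append(node)      # rack already holds a replica
            else:
                chosen.append(node)
                racks.add(node[2])
        # Same-rack nodes are used only when there are too few racks.
        result[dc] = (chosen + skipped)[:rf]
    return result


if __name__ == "__main__":
    placement = nts_replicas(3, RING, {"DC1": 2, "DC2": 2})
    for dc, nodes in placement.items():
        print(dc, [token for token, _, _ in nodes])
```

For token 3 the sketch skips the second Rack 1 node in each datacenter, so the two replicas per datacenter land on different racks.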
Read/Write Operations
Read Request Flow on a Cassandra Node
(Figure: read path. In memory: memtable, row cache, Bloom filter, partition key cache, partition summary, compression offset map. On disk: partition index, SSTables.)
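One piece of the read path, the Bloom filter, lends itself to a small sketch: before touching an SSTable on disk, the node asks an in-memory filter whether the partition key can be present at all. The classes `BloomFilter` and `SSTable`, the function `read`, and all sizes below are hypothetical simplifications; caches, the partition summary and the partition index are left out.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: false positives possible, false negatives not."""

    def __init__(self, size_bits=256, hashes=3):
        self.size_bits, self.hashes, self.bits = size_bits, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    def __init__(self, rows):
        self.rows = dict(rows)        # partition key -> value
        self.filter = BloomFilter()
        for key in self.rows:
            self.filter.add(key)
        self.disk_reads = 0           # counts simulated disk accesses

    def get(self, key):
        if not self.filter.might_contain(key):
            return None               # definitely absent: skip the disk
        self.disk_reads += 1
        return self.rows.get(key)


def read(key, memtable, sstables):
    """Collect all fragments for `key`: memtable first, then SSTables."""
    fragments = [memtable[key]] if key in memtable else []
    for table in sstables:
        value = table.get(key)
        if value is not None:
            fragments.append(value)
    return fragments
```

A read may have to combine fragments from the memtable and several SSTables, which is why cheaply ruling out SSTables matters.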
Write Request Flow on a Cassandra Node
(Figure: write path. In memory: memtable. On disk: commit log and the SSTable files Data.db (SSTable), Index.db, Statistics.db, CompressionInfo.db, Digest.crc32, Filter.db, TOC.txt. A compaction process works on the SSTables.)
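The write path can be sketched as three steps: append to the commit log, apply to the memtable, and flush a full memtable to an immutable SSTable. The class `ToyNode`, its `memtable_limit` parameter and the in-memory lists standing in for disk files are assumptions for illustration only.

```python
class ToyNode:
    """Commit log + memtable + SSTables, heavily simplified."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []      # append-only; stands in for the on-disk log
        self.memtable = {}        # in memory: key -> (value, timestamp)
        self.sstables = []        # immutable, sorted snapshots "on disk"
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        self.commit_log.append((key, value, timestamp))  # 1. durability first
        self.memtable[key] = (value, timestamp)          # 2. then the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # 3. write the memtable out as a sorted, immutable SSTable
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()   # flushed data no longer needs the log


if __name__ == "__main__":
    node = ToyNode()
    node.write("b", 1, 100)
    node.write("a", 2, 110)       # reaches the limit and triggers a flush
    print(node.sstables)
```

Writes never modify an existing SSTable; they only append, which is why a later compaction step is needed.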
Upserts on a Cassandra Node
Partition Key: TAG
Primary Key: TAG, ID
Partition TAG: CASSANDRA

SSTables (existing data on disk):
ID  C1  C2     TSTAMP
1   2   TEST1  100
ID  C1  C2     TSTAMP
2   3   TEST2  50

Memtable (each statement adds a new entry, nothing is changed in place):

INSERT INTO t (TAG, ID, C1, C2)
VALUES ('CASSANDRA', 1, 5, 'TEST3');
ID  C1  C2     TSTAMP
1   5   TEST3  150

UPDATE t SET C2 = 'PROD1'
WHERE TAG = 'CASSANDRA' AND ID = 1;
ID  C2     TSTAMP
1   PROD1  200

DELETE FROM t
WHERE TAG = 'CASSANDRA' AND ID = 2;
ID  Tombstone (marked_deleted)  TSTAMP
2                               250
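Since inserts, updates and deletes only add new entries, a read has to reconcile them. The Python sketch below applies a "newest timestamp wins" rule per column to the fragments from the slide and hides rows covered by a newer tombstone. The names `FRAGMENTS`, `TOMBSTONE` and `merge_on_read` are hypothetical, and the model (integer timestamps, row-level tombstones only) is a simplification.

```python
TOMBSTONE = object()  # marker for a row-level delete

# (row ID, {column: value} or TOMBSTONE, write timestamp), as on the slide
FRAGMENTS = [
    (1, {"C1": 2, "C2": "TEST1"}, 100),   # SSTable
    (2, {"C1": 3, "C2": "TEST2"}, 50),    # SSTable
    (1, {"C1": 5, "C2": "TEST3"}, 150),   # INSERT
    (1, {"C2": "PROD1"}, 200),            # UPDATE
    (2, TOMBSTONE, 250),                  # DELETE
]


def merge_on_read(fragments):
    """Reconcile fragments: the newest timestamp wins per column."""
    cells = {}      # row ID -> {column: (value, timestamp)}
    deleted = {}    # row ID -> timestamp of the newest tombstone
    for row_id, data, ts in fragments:
        if data is TOMBSTONE:
            deleted[row_id] = max(ts, deleted.get(row_id, ts))
            continue
        row = cells.setdefault(row_id, {})
        for column, value in data.items():
            if column not in row or ts > row[column][1]:
                row[column] = (value, ts)
    result = {}
    for row_id, row in cells.items():
        cutoff = deleted.get(row_id)
        # Cells written before the tombstone are hidden by it.
        live = {c: v for c, (v, ts) in row.items()
                if cutoff is None or ts > cutoff}
        if live:
            result[row_id] = live
    return result


if __name__ == "__main__":
    print(merge_on_read(FRAGMENTS))
```

Row 1 ends up with C1 from the INSERT and C2 from the later UPDATE, while row 2 disappears from the result because its tombstone is newer than its data.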
Compaction Process on a Cassandra Node
Input SSTables:
ID  C1  C2     TSTAMP
1   2   TEST1  100
ID  C1  C2     TSTAMP
2   3   TEST2  50
ID  C1  C2     TSTAMP
1   5   TEST3  150
ID  C2     TSTAMP
1   PROD1  200
ID  Tombstone (marked_deleted)  TSTAMP
2                               250
ID  C1  C2     TSTAMP
3   4   TEST3  120

gc_grace_seconds reached? No

New SSTable:
ID  C1  C2     TSTAMP
1   5   PROD1  300
ID  Tombstone (marked_deleted)  TSTAMP
2                               250
ID  C1  C2     TSTAMP
3   4   TEST3  120

Compaction Strategies
– SizeTieredCompactionStrategy (STCS)
– LeveledCompactionStrategy (LCS)
– TimeWindowCompactionStrategy (TWCS)
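The compaction step above can be sketched in Python: all input SSTables are merged, the newest value per column survives, and a tombstone is carried into the new SSTable until gc_grace_seconds has passed. The names `SSTABLES` and `compact` are hypothetical. The sketch treats TSTAMP values as seconds and tracks them per column, which simplifies what Cassandra stores.

```python
# Each SSTable is a list of (row ID, {column: value} or None, timestamp);
# None marks a row-level tombstone. Values follow the slide.
SSTABLES = [
    [(1, {"C1": 2, "C2": "TEST1"}, 100)],
    [(2, {"C1": 3, "C2": "TEST2"}, 50)],
    [(1, {"C1": 5, "C2": "TEST3"}, 150)],
    [(1, {"C2": "PROD1"}, 200)],
    [(2, None, 250)],
    [(3, {"C1": 4, "C2": "TEST3"}, 120)],
]


def compact(sstables, now, gc_grace_seconds):
    """Merge SSTables; return (live rows, tombstones that must be kept)."""
    cells, tombstones = {}, {}
    for table in sstables:
        for row_id, data, ts in table:
            if data is None:
                tombstones[row_id] = max(ts, tombstones.get(row_id, ts))
                continue
            row = cells.setdefault(row_id, {})
            for column, value in data.items():
                if column not in row or ts > row[column][1]:
                    row[column] = (value, ts)
    rows, kept = {}, {}
    for row_id in set(cells) | set(tombstones):
        cutoff = tombstones.get(row_id)
        live = {c: v for c, (v, ts) in cells.get(row_id, {}).items()
                if cutoff is None or ts > cutoff}
        if live:
            rows[row_id] = live
        # A tombstone younger than gc_grace_seconds is written to the
        # new SSTable so that other replicas can still learn of the delete.
        if cutoff is not None and now - cutoff < gc_grace_seconds:
            kept[row_id] = cutoff
    return rows, kept


if __name__ == "__main__":
    print(compact(SSTABLES, now=300, gc_grace_seconds=100))
```

With `now=300` and `gc_grace_seconds=100` the tombstone for row 2 is kept, matching the "No" branch of the slide; with a later `now` it would be dropped.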
Data Consistency
Data Consistency – Overview
Cassandra offers tunable data consistency for read and write operations.
Two types of read requests:
– Direct read request.
– Digest read request.
Inconsistent data can be repaired automatically by:
– Background read repair request.
– NodeSync continuous background repair (only DSE 6).
Inconsistent data can be repaired manually by:
– Anti-entropy repair.
Tunable Consistency
A tradeoff between data consistency and availability
WRITE Consistency Level    READ Consistency Level
ALL                        ALL
EACH_QUORUM                Not supported.
QUORUM                     QUORUM
LOCAL_QUORUM               LOCAL_QUORUM
ONE, TWO, THREE            ONE, TWO, THREE
LOCAL_ONE                  LOCAL_ONE
ANY                        Not supported.
Not supported.             SERIAL
Not supported.             LOCAL_SERIAL
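The arithmetic behind the tradeoff is short enough to write down. A quorum is a majority of the replicas, and a read is guaranteed to overlap with the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The function names below are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when read and write replica sets must overlap: R + W > RF."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print("QUORUM for RF=3:", q)
    print("QUORUM reads + QUORUM writes overlap:",
          reads_see_latest_write(q, q, rf))
    print("ONE reads + ONE writes overlap:",
          reads_see_latest_write(1, 1, rf))
```

Lower levels such as ONE answer faster and tolerate more failed replicas, at the price of possibly stale reads.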
Read Requests & Tunable Consistency (1)
One DC, CONSISTENCY=QUORUM, RF=3
(Figure: the coordinator sends a direct read to one replica and a digest read to a second replica; speculative_retry lets it query another replica when a response is slow.)
Read Requests & Tunable Consistency (2)
One DC, CONSISTENCY=QUORUM, RF=3
(Figure: the coordinator sends a direct read and a digest read; with read_repair_chance=0.10, roughly 10% of reads also trigger a background read repair.)
Read Requests & Tunable Consistency (3)
Two DC, CONSISTENCY=QUORUM, RF=3
(Figure: DC=1 and DC=2; the coordinator sends one direct read and three digest reads to replicas spread across both datacenters.)
Read Requests & Tunable Consistency (4)
Two DC, CONSISTENCY=LOCAL_QUORUM, RF=3
(Figure: DC=1 and DC=2; the coordinator sends one direct read and one digest read, both to replicas in its local datacenter.)
Write Requests & Tunable Consistency (1)
One DC, CONSISTENCY=ONE, RF=3
(Figure: the coordinator forwards the write to all three replicas and acknowledges it to the client as soon as one replica has responded.)
Write Requests & Tunable Consistency (2)
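One DC, CONSISTENCY=QUORUM, RF=3
(Figure: the coordinator forwards a DELETE to all three replicas and waits for two acknowledgements. For a replica that is down, the coordinator keeps a hint (hinted handoff). If the delete does not reach that replica before the tombstone is purged elsewhere, the deleted data can reappear as a possible zombie.)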
Data Consistency – Anti-Entropy Repair
Manual data repair:
– A Merkle tree is built for each replica.
– Merkle trees are compared between all replicas.
Repair can be performed:
– Sequential.
– Parallel.
– Datacenter parallel.
Source: DSE 6.0 Architecture Guide
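The Merkle tree comparison can be sketched with the standard library. Each replica hashes its data ranges into leaves and combines them pairwise up to a root; equal roots mean the replicas agree, otherwise the differing leaves identify the ranges to stream. The function names are hypothetical, both replicas are assumed to use the same number of ranges, and for brevity the sketch compares leaves directly instead of descending only into differing subtrees.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Build a Merkle tree bottom-up; return its levels, leaf hashes first."""
    level = [_h(repr(leaf).encode()) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: duplicate the last hash
            level = level + [level[-1]]
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_ranges(replica_a, replica_b):
    """Indices of data ranges whose hashes differ between two replicas."""
    a, b = merkle_levels(replica_a), merkle_levels(replica_b)
    if a[-1] == b[-1]:                     # equal roots: replicas are in sync
        return []
    return [i for i, (x, y) in enumerate(zip(a[0], b[0])) if x != y]


if __name__ == "__main__":
    replica_1 = ["range-0 v1", "range-1 v1", "range-2 v1", "range-3 v1"]
    replica_2 = ["range-0 v1", "range-1 v2", "range-2 v1", "range-3 v1"]
    print("ranges to repair:", differing_ranges(replica_1, replica_2))
```

Only hashes cross the network during the comparison; data is streamed solely for the ranges that differ.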
Summary
Cassandra is a very powerful distributed and decentralized NoSQL database with no
single point of failure.
It guarantees service and data availability in case of a partitioned network, though
the data might be stale.
Designed for large data stores that require a performant and scalable system.
The application data model needs to be designed for Cassandra.
Many ways to interact with the database:
– CQLSH (Cassandra Query Language Shell).
– Drivers and tools provided by DataStax.
DataStax offers support for enterprise customers and good documentation.
Robert Bialek
Senior Principal Consultant
Tel. +49 89 99 27 59 38
robert.bialek@trivadis.com

 
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...Trivadis
 
TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...
TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...
TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...Trivadis
 
TechEvent 2019: The sleeping Power of Data; Eberhard Lösch - Trivadis
TechEvent 2019: The sleeping Power of Data; Eberhard Lösch - TrivadisTechEvent 2019: The sleeping Power of Data; Eberhard Lösch - Trivadis
TechEvent 2019: The sleeping Power of Data; Eberhard Lösch - TrivadisTrivadis
 

Mehr von Trivadis (20)

Azure Days 2019: Azure Chatbot Development for Airline Irregularities (Remco ...
Azure Days 2019: Azure Chatbot Development for Airline Irregularities (Remco ...Azure Days 2019: Azure Chatbot Development for Airline Irregularities (Remco ...
Azure Days 2019: Azure Chatbot Development for Airline Irregularities (Remco ...
 
Azure Days 2019: Trivadis Azure Foundation – Das Fundament für den ... (Nisan...
Azure Days 2019: Trivadis Azure Foundation – Das Fundament für den ... (Nisan...Azure Days 2019: Trivadis Azure Foundation – Das Fundament für den ... (Nisan...
Azure Days 2019: Trivadis Azure Foundation – Das Fundament für den ... (Nisan...
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
 
Azure Days 2019: Master the Move to Azure (Konrad Brunner)
Azure Days 2019: Master the Move to Azure (Konrad Brunner)Azure Days 2019: Master the Move to Azure (Konrad Brunner)
Azure Days 2019: Master the Move to Azure (Konrad Brunner)
 
Azure Days 2019: Keynote Azure Switzerland – Status Quo und Ausblick (Primo A...
Azure Days 2019: Keynote Azure Switzerland – Status Quo und Ausblick (Primo A...Azure Days 2019: Keynote Azure Switzerland – Status Quo und Ausblick (Primo A...
Azure Days 2019: Keynote Azure Switzerland – Status Quo und Ausblick (Primo A...
 
Azure Days 2019: Grösser und Komplexer ist nicht immer besser (Meinrad Weiss)
Azure Days 2019: Grösser und Komplexer ist nicht immer besser (Meinrad Weiss)Azure Days 2019: Grösser und Komplexer ist nicht immer besser (Meinrad Weiss)
Azure Days 2019: Grösser und Komplexer ist nicht immer besser (Meinrad Weiss)
 
Azure Days 2019: Get Connected with Azure API Management (Gerry Keune & Stefa...
Azure Days 2019: Get Connected with Azure API Management (Gerry Keune & Stefa...Azure Days 2019: Get Connected with Azure API Management (Gerry Keune & Stefa...
Azure Days 2019: Get Connected with Azure API Management (Gerry Keune & Stefa...
 
Azure Days 2019: Infrastructure as Code auf Azure (Jonas Wanninger & Daniel H...
Azure Days 2019: Infrastructure as Code auf Azure (Jonas Wanninger & Daniel H...Azure Days 2019: Infrastructure as Code auf Azure (Jonas Wanninger & Daniel H...
Azure Days 2019: Infrastructure as Code auf Azure (Jonas Wanninger & Daniel H...
 
Azure Days 2019: Wie bringt man eine Data Analytics Plattform in die Cloud? (...
Azure Days 2019: Wie bringt man eine Data Analytics Plattform in die Cloud? (...Azure Days 2019: Wie bringt man eine Data Analytics Plattform in die Cloud? (...
Azure Days 2019: Wie bringt man eine Data Analytics Plattform in die Cloud? (...
 
Azure Days 2019: Azure@Helsana: Die Erweiterung von Dynamics CRM mit Azure Po...
Azure Days 2019: Azure@Helsana: Die Erweiterung von Dynamics CRM mit Azure Po...Azure Days 2019: Azure@Helsana: Die Erweiterung von Dynamics CRM mit Azure Po...
Azure Days 2019: Azure@Helsana: Die Erweiterung von Dynamics CRM mit Azure Po...
 
TechEvent 2019: Kundenstory - Kein Angebot, kein Auftrag – Wie Du ein individ...
TechEvent 2019: Kundenstory - Kein Angebot, kein Auftrag – Wie Du ein individ...TechEvent 2019: Kundenstory - Kein Angebot, kein Auftrag – Wie Du ein individ...
TechEvent 2019: Kundenstory - Kein Angebot, kein Auftrag – Wie Du ein individ...
 
TechEvent 2019: Oracle Database Appliance M/L - Erfahrungen und Erfolgsmethod...
TechEvent 2019: Oracle Database Appliance M/L - Erfahrungen und Erfolgsmethod...TechEvent 2019: Oracle Database Appliance M/L - Erfahrungen und Erfolgsmethod...
TechEvent 2019: Oracle Database Appliance M/L - Erfahrungen und Erfolgsmethod...
 
TechEvent 2019: Security 101 für Web Entwickler; Roland Krüger - Trivadis
TechEvent 2019: Security 101 für Web Entwickler; Roland Krüger - TrivadisTechEvent 2019: Security 101 für Web Entwickler; Roland Krüger - Trivadis
TechEvent 2019: Security 101 für Web Entwickler; Roland Krüger - Trivadis
 
TechEvent 2019: Trivadis & Swisscom Partner Angebote; Konrad Häfeli, Markus O...
TechEvent 2019: Trivadis & Swisscom Partner Angebote; Konrad Häfeli, Markus O...TechEvent 2019: Trivadis & Swisscom Partner Angebote; Konrad Häfeli, Markus O...
TechEvent 2019: Trivadis & Swisscom Partner Angebote; Konrad Häfeli, Markus O...
 
TechEvent 2019: DBaaS from Swisscom Cloud powered by Trivadis; Konrad Häfeli ...
TechEvent 2019: DBaaS from Swisscom Cloud powered by Trivadis; Konrad Häfeli ...TechEvent 2019: DBaaS from Swisscom Cloud powered by Trivadis; Konrad Häfeli ...
TechEvent 2019: DBaaS from Swisscom Cloud powered by Trivadis; Konrad Häfeli ...
 
TechEvent 2019: Status of the partnership Trivadis and EDB - Comparing Postgr...
TechEvent 2019: Status of the partnership Trivadis and EDB - Comparing Postgr...TechEvent 2019: Status of the partnership Trivadis and EDB - Comparing Postgr...
TechEvent 2019: Status of the partnership Trivadis and EDB - Comparing Postgr...
 
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
TechEvent 2019: More Agile, More AI, More Cloud! Less Work?!; Oliver Dörr - T...
 
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...
TechEvent 2019: Kundenstory - Vom Hauptmann zu Köpenick zum Polizisten 2020 -...
 
TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...
TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...
TechEvent 2019: Vom Rechenzentrum in die Oracle Cloud - Übertragungsmethoden;...
 
TechEvent 2019: The sleeping Power of Data; Eberhard Lösch - Trivadis
TechEvent 2019: The sleeping Power of Data; Eberhard Lösch - TrivadisTechEvent 2019: The sleeping Power of Data; Eberhard Lösch - Trivadis
TechEvent 2019: The sleeping Power of Data; Eberhard Lösch - Trivadis
 

Kürzlich hochgeladen

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 

Kürzlich hochgeladen (20)

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 

TechEvent Apache Cassandra

  • 1. Apache Cassandra Under The Hood
    Robert Bialek
    BASLE BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH
  • 2. Who Am I (Apache Cassandra Under The Hood, 15.09.2018)
    Senior Principal Consultant and Trainer at Trivadis GmbH in Munich.
    – Master of Science in Computer Engineering.
    – At Trivadis since 2004.
    – Trivadis Partner since 2012.
    Focus:
    – Data and service high availability, disaster recovery.
    – Architecture design, optimization, automation.
    – Troubleshooting.
    – Trainer: O-RAC, O-DG.
  • 3. Agenda
    1. Introduction
    2. Key Components
    3. Data Replication
    4. Scalability
    5. Read/Write Operations
    6. Data Consistency
    7. Summary
  • 4. Introduction
  • 5. What is Apache Cassandra?
    A distributed NoSQL (wide-column) partitioned row store database that runs within a JVM.
    A decentralized, highly fault-tolerant database with no single point of failure.
    A horizontally scalable system (computing resources/performance).
    Initially developed at Facebook and released as an open source project in July 2008.
    – Based on Amazon's Dynamo and Google's Bigtable.
  • 6. Apache Cassandra & CAP Theorem?
    According to the CAP (Brewer's) theorem, "it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees":
    – Consistency
    – Availability
    – Partition tolerance
    Apache Cassandra is an AP system.
    – Results are eventually consistent (though consistency is tunable).
    – Does not adhere to all ACID properties.
  • 7. Cassandra for Enterprise Applications
    Support 24x7x365.
    Enterprise features, e.g.: DSE Advanced Security, DSE Analytics, DSE Search, DSE Graph, DSE Advanced Replication, DSE Tiered Storage, DSE NodeSync, ...
    Administration and monitoring with DSE OpsCenter (real-time monitoring, tuning, provisioning, backup, security management).
    According to DataStax, 2x or more throughput compared to Apache Cassandra.
    Documentation, client drivers and DSE for development are free to use.
  • 8. Who is Using Cassandra Database?
    Source: http://cassandra.apache.org
    – Apple: over 75,000 nodes storing over 10 PB of data.
    – Netflix: 2,500 nodes, 420 TB, over 1 trillion requests per day.
    – Chinese search engine Easou: 270 nodes, 300 TB, over 800 million requests per day.
    – eBay: over 100 nodes, 250 TB.
    Source: https://www.datastax.com/customers
    – Microsoft, UBS, Sony, Sky, ING, NEC, Coursera, Cisco, Walmart, NVIDIA, Samsung, …
  • 9. Key Components
  • 10. Node – Basic Database Infrastructure
    Commodity hardware, ideally with local storage (to reduce dependencies).
    Hosts software and configuration files:
    – cassandra.yaml, cassandra-rackdc.properties, …
    Hosts data and accompanying structures:
    – Index.db, Data.db (SSTable), Statistics.db, CompressionInfo.db, Digest.crc32, Filter.db, TOC.txt
    [Diagram: Cassandra Node (DSE: Transactional Node)]
  • 11. Keyspaces & Tables
    Table (Column Family)
    – Stores data based on a primary key.
      • Primary key: partition key plus, optionally, clustering columns.
    – Physically split into partitions.
    – Denormalization (data duplication) is necessary.
    Keyspace
    – Grouping of data, similar to a schema.
    – Defines replication properties.
  • 12. Partitioner – Data Distribution
    Determines which node receives data, based on the token of the partition key.
    Supplied partitioners (custom ones can be created):
    – Murmur3Partitioner (default)
    – RandomPartitioner
    – ByteOrderedPartitioner
    [Diagram: Data 'Cassandra' → Partitioner → Token 356242581507269238]
  • 13. Cassandra Ring – Single Token Architecture
    Example partitioner token range: 1 – 40
    – initial_token: 1, token range: 31 – 40, 1
    – initial_token: 10, token range: 2 – 10
    – initial_token: 20, token range: 11 – 20
    – initial_token: 30, token range: 21 – 30
    [Diagram: Data → Token → Cassandra Ring]
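The single-token ownership on slide 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cassandra's implementation: it uses the four `initial_token` values and the toy token range 1 – 40 from the slide, and the name `owner` is hypothetical.

```python
from bisect import bisect_left

# initial_token values of the four nodes on the slide (toy token range 1..40)
TOKENS = [1, 10, 20, 30]


def owner(token, tokens=TOKENS):
    """Return the initial_token of the node that owns `token`.

    A node owns every token greater than its predecessor's token and up to
    (and including) its own token; tokens above the highest initial_token
    wrap around to the first node on the ring.
    """
    ring = sorted(tokens)
    index = bisect_left(ring, token)  # first node whose token is >= `token`
    return ring[index % len(ring)]    # wrap around past the last node


if __name__ == "__main__":
    for t in (1, 5, 10, 11, 21, 31, 40):
        print(f"token {t:2d} -> node with initial_token {owner(t)}")
```

With the slide's values, tokens 31 – 40 and 1 map to the node with `initial_token: 1`, and tokens 2 – 10 map to the node with `initial_token: 10`, which matches the ranges listed on the slide.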
  • 14. Cassandra Ring – Virtual Nodes Architecture
    Example partitioner token range: 1 – 40
    – num_tokens: 5, token ranges: 1 – 2, 11 – 12, 21 – 22, 33 – 34, 39 – 40
    – num_tokens: 5, token ranges: 3 – 4, 9 – 10, 29 – 30, 23 – 24, 39 – 40
    – num_tokens: 5, token ranges: 7 – 8, 17 – 18, 27 – 28, 31 – 32, 37 – 38
    – num_tokens: 5, token ranges: 5 – 6, 15 – 16, 19 – 20, 25 – 26, 35 – 36
    [Diagram: Data → Partitioner → Token → Cassandra Ring]
  • 15. Snitches – Ring Topology
    Determines the physical location (datacenter and rack) of a Cassandra node.
    Dynamic snitching (enabled by default):
    – Monitors the read performance and ring health.
    Snitches:
    – SimpleSnitch/DseSimpleSnitch (default)
    – GossipingPropertyFileSnitch
    – PropertyFileSnitch
    – Ec2Snitch/Ec2MultiRegionSnitch/GoogleCloudSnitch/CloudstackSnitch
    – RackInferringSnitch
    [Diagram labels: DC 1, DC 2; Rack 1, Rack 1, Rack 2, Rack 2]
  • 16. Gossip – Internode Communication
    A peer-to-peer communication protocol for exchanging ring state information.
    The gossip process runs every second and exchanges messages with up to three other nodes in the ring.
    Eventually, all nodes learn (indirectly) about all other nodes.
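How quickly "eventually" arrives can be illustrated with a toy simulation of the gossip exchange described on slide 16. This is a simplified sketch, not Cassandra's gossip protocol: it assumes that each node contacts exactly three random peers per round and that both sides merge their views; the function name `gossip_rounds` and all parameters are hypothetical.

```python
import random


def gossip_rounds(n_nodes=8, fanout=3, seed=42, max_rounds=100):
    """Return the number of rounds until every node knows every other node.

    Each node starts out knowing only itself. In every round, each node picks
    `fanout` random peers and both sides merge their sets of known nodes.
    Returns None if the views have not converged after `max_rounds`.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    known = [{i} for i in range(n_nodes)]
    for round_no in range(1, max_rounds + 1):
        for node in range(n_nodes):
            others = [p for p in range(n_nodes) if p != node]
            for peer in rng.sample(others, fanout):
                merged = known[node] | known[peer]
                known[node] = set(merged)
                known[peer] = set(merged)
        if all(len(view) == n_nodes for view in known):
            return round_no
    return None


if __name__ == "__main__":
    print("rounds until convergence:", gossip_rounds())
```

In this model, state spreads indirectly: a node learns about peers it never contacted because the peers it did contact pass on what they already know.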
  • 17. Scalability
  • 18. Cassandra Ring – Scale Out
    Increases the computing power and throughput of a Cassandra ring.
    Online and transparent to the applications.
    [Diagram labels: Ring Information, START Joining Ring, Generate Tokens, FINISH Joining Ring, Cassandra Ring, SEED Node, Bootstrap Data Streaming, Software & Configuration Files]
  • 19. Cassandra Ring – Scale In
    Decreases the computing power of a Cassandra ring.
    Online and transparent to the applications.
    [Diagram labels: Cassandra Ring, DECOMMISSION, Data Streaming, Remove Tokens, DECOMMISSIONED]
  • 20. Data Replication
  • 21. Replication – Data High Availability
    To ensure data and service high availability, Cassandra stores data on multiple nodes in a cluster.
    All replicas are equally important (no primary or secondary data).
    The replication strategy and replication factor (RF) are defined at the keyspace (application) level.
    – RF can be set differently in different data centers.
    Two replication strategies are available:
    – SimpleStrategy
    – NetworkTopologyStrategy
    [Diagram labels: DC 1, DC 2; Rack 1, Rack 1, Rack 2, Rack 2]
  • 22. Replication – SimpleStrategy (RF: 2)
    [Diagram labels: Data Center 1; Rack 1, Rack 1, Rack 1, Rack 1]
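Replica placement under SimpleStrategy can be sketched as follows: the first replica goes to the node that owns the token, and the remaining replicas go to the next nodes clockwise on the ring, ignoring racks and data centers. The sketch assumes the toy single-token ring from slide 13 (tokens 1, 10, 20, 30); the function name `simple_strategy_replicas` is hypothetical.

```python
from bisect import bisect_left


def simple_strategy_replicas(token, node_tokens, rf):
    """Return the node tokens that hold replicas of `token`.

    The first replica is the owner of the token; the other rf - 1 replicas
    are the next nodes clockwise on the ring. Topology is not considered.
    """
    ring = sorted(node_tokens)
    start = bisect_left(ring, token) % len(ring)
    count = min(rf, len(ring))  # cannot place more replicas than nodes
    return [ring[(start + k) % len(ring)] for k in range(count)]


if __name__ == "__main__":
    ring = [1, 10, 20, 30]
    for t in (15, 25, 35):
        print(f"token {t}: replicas on nodes {simple_strategy_replicas(t, ring, rf=2)}")
```

With RF 2, token 25 lands on the nodes with tokens 30 and 1, so the placement wraps around the end of the ring.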
  • 23. Replication – NetworkTopologyStrategy (RF/DC: 2)
    [Diagram labels: Data Center 1, Data Center 2; Rack 1, Rack 1, Rack 2, Rack 2]
  • 24. Read/Write Operations
  • 25. Read Request Flow on a Cassandra Node
    [Diagram labels (memory and disk): Memtable, Row Cache, Bloom Filter, Partition Key Cache, Compression Offset Map, Partition Summary, Partition Index, SSTables]
  • 26. Write Request Flow on a Cassandra Node
    [Diagram labels (memory and disk): Memtable, Commit Log, Index.db, Data.db (SSTable), Statistics.db, CompressionInfo.db, Digest.crc32, Filter.db, TOC.txt, Compaction Process]
  • 27. Upserts on a Cassandra Node
    Partition key: TAG. Primary key: TAG, ID.
    SSTables (TAG: CASSANDRA):
      ID  C1  C2     TSTAMP
      1   2   TEST1  100
      2   3   TEST2  50
    Statements:
      INSERT INTO t (TAG, ID, C1, C2) VALUES ('CASSANDRA', 1, 5, 'TEST3');
      UPDATE t SET C2 = 'PROD1' WHERE TAG = 'CASSANDRA' AND ID = 1;
      DELETE FROM t WHERE TAG = 'CASSANDRA' AND ID = 2;
    Memtable (TAG: CASSANDRA):
      ID  C1  C2     TSTAMP
      1   5   TEST3  150
      1       PROD1  200
      2   Tombstone (marked_deleted)  250
  • 28. Compaction Process on a Cassandra Node
    Input:
      ID  C1  C2     TSTAMP
      1   2   TEST1  100
      2   3   TEST2  50
      1   5   TEST3  150
      1       PROD1  200
      2   Tombstone (marked_deleted)  250
      3   4   TEST3  120
    gc_grace_seconds reached? No
    New SSTable:
      ID  C1  C2     TSTAMP
      1   5   PROD1  300
      2   Tombstone (marked_deleted)  250
      3   4   TEST3  120
    Compaction strategies:
    – SizeTieredCompactionStrategy (STCS)
    – LeveledCompactionStrategy (LCS)
    – TimeWindowCompactionStrategy (TWCS)
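The merge shown on slides 27 and 28 follows a "newest timestamp wins" rule per column, with tombstones hiding older data. The following sketch reproduces that rule on the slide's sample rows. It is a simplified model, not Cassandra's storage engine: the dictionary layout, the function name `compact` and the boolean `gc_grace_expired` flag are assumptions made for illustration.

```python
def compact(sstables, gc_grace_expired=False):
    """Merge row fragments, keeping the newest value per (row id, column).

    Each fragment is a dict with "id", "ts" and either "cols" (column values)
    or "tombstone": True (row deletion marker). Returns (rows, tombstones).
    Tombstones are only dropped once gc_grace_seconds has expired.
    """
    cells = {}       # (row id, column) -> (timestamp, value)
    tombstones = {}  # row id -> deletion timestamp
    for table in sstables:
        for fragment in table:
            row_id, ts = fragment["id"], fragment["ts"]
            if fragment.get("tombstone"):
                tombstones[row_id] = max(tombstones.get(row_id, -1), ts)
                continue
            for column, value in fragment["cols"].items():
                key = (row_id, column)
                if key not in cells or ts > cells[key][0]:
                    cells[key] = (ts, value)
    rows = {}
    for (row_id, column), (ts, value) in cells.items():
        if ts > tombstones.get(row_id, -1):  # data older than a delete is hidden
            rows.setdefault(row_id, {})[column] = value
    kept = {} if gc_grace_expired else dict(tombstones)
    return rows, kept


# Sample data taken from the slides
OLD = [{"id": 1, "ts": 100, "cols": {"C1": 2, "C2": "TEST1"}},
       {"id": 2, "ts": 50, "cols": {"C1": 3, "C2": "TEST2"}}]
NEW = [{"id": 1, "ts": 150, "cols": {"C1": 5, "C2": "TEST3"}},
       {"id": 1, "ts": 200, "cols": {"C2": "PROD1"}},
       {"id": 2, "ts": 250, "tombstone": True},
       {"id": 3, "ts": 120, "cols": {"C1": 4, "C2": "TEST3"}}]

if __name__ == "__main__":
    print(compact([OLD, NEW]))
```

Row 1 ends up with C1 = 5 and C2 = PROD1, row 2 survives only as a tombstone, and row 3 is carried over unchanged, which mirrors the new SSTable on slide 28.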
  • 29. Data Consistency
  • 30. Data Consistency – Overview
    Cassandra offers tunable data consistency for read and write operations.
    Two types of read requests:
    – Direct read request.
    – Digest read request.
    Inconsistent data can be repaired automatically by:
    – Background read repair request.
    – NodeSync continuous background repair (only DSE 6).
    Inconsistent data can be repaired manually by:
    – Anti-Entropy Repair.
  • 31. Tunable Consistency
    A tradeoff between data consistency and availability.
      WRITE Consistency Level   READ Consistency Level
      ALL                       ALL
      EACH_QUORUM               Not supported.
      QUORUM                    QUORUM
      LOCAL_QUORUM              LOCAL_QUORUM
      ONE, TWO, THREE           ONE, TWO, THREE
      LOCAL_ONE                 LOCAL_ONE
      ANY                       Not supported.
      Not supported.            SERIAL
      Not supported.            LOCAL_SERIAL
  • 32. Read Requests & Tunable Consistency (1)
    One DC, CONSISTENCY=QUORUM, RF=3
    [Diagram labels: Coordinator, Direct Read, Digest Read, speculative_retry!]
  • 33. Read Requests & Tunable Consistency (2)
    One DC, CONSISTENCY=QUORUM, RF=3
    [Diagram labels: Coordinator, Direct Read, Digest Read, Background Read Repair, read_repair_chance=0.10]
  • 34. Read Requests & Tunable Consistency (3)
    Two DC, CONSISTENCY=QUORUM, RF=3
    [Diagram labels: Coordinator, DC=1, DC=2, Direct Read, Digest Read, Digest Read, Digest Read]
  • 35. Read Requests & Tunable Consistency (4)
    Two DC, CONSISTENCY=LOCAL_QUORUM, RF=3
    [Diagram labels: Coordinator, DC=1, DC=2, Direct Read, Digest Read]
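The number of replicas contacted in the four read examples follows from the quorum arithmetic: a quorum is the sum of the replication factors divided by two (rounded down) plus one. The helper names below are hypothetical, and the overlap check (reads plus writes exceed RF) is the usual rule of thumb for reading your own writes, not something stated on the slides.

```python
def quorum(replication_factors):
    """Replicas required for QUORUM across all data centers.

    `replication_factors` holds one RF value per data center.
    """
    return sum(replication_factors) // 2 + 1


def local_quorum(local_rf):
    """Replicas required for LOCAL_QUORUM in the coordinator's data center."""
    return local_rf // 2 + 1


def read_sees_latest_write(read_replicas, write_replicas, rf):
    """True if every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


if __name__ == "__main__":
    print("One DC, RF=3, QUORUM:       ", quorum([3]))
    print("Two DCs, RF=3, QUORUM:      ", quorum([3, 3]))
    print("Two DCs, RF=3, LOCAL_QUORUM:", local_quorum(3))
```

For two data centers with RF 3 each, QUORUM needs four replicas (one direct read plus three digest reads in example 3), while LOCAL_QUORUM needs only two replicas in the local data center (example 4).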
  • 36. Write Requests & Tunable Consistency (1)
    One DC, CONSISTENCY=ONE, RF=3
    [Diagram labels: Coordinator]
  • 37. Write Requests & Tunable Consistency (2)
    One DC, CONSISTENCY=QUORUM, RF=3
    [Diagram labels: Coordinator, DELETE, Possible ZOMBIE, Hinted Handoff]
  • 38. Data Consistency – Anti-Entropy Repair
    Manual data repair:
    – A Merkle tree is built for each replica.
    – Merkle trees are compared between all replicas.
    Repair can be performed:
    – Sequential.
    – Parallel.
    – Datacenter parallel.
    Source: DSE 6.0 Architecture Guide
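The idea behind the Merkle tree comparison can be sketched as follows: each replica hashes its data bottom-up, and two replicas only need to descend into subtrees whose hashes differ. This is a generic illustration, not Cassandra's repair code: it assumes SHA-256, one leaf per data chunk, equal-length inputs and a leaf count that is a power of two; the function names are hypothetical.

```python
import hashlib


def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(leaves):
    """Return all tree levels, from the leaf hashes up to the single root hash.

    `leaves` is a list of bytes; its length must be a power of two.
    """
    level = [_hash(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(replica_a, replica_b):
    """Return the leaf indexes on which two replicas disagree.

    Starts at the root and descends only into subtrees with different hashes.
    """
    a, b = merkle_levels(replica_a), merkle_levels(replica_b)
    if a[-1] == b[-1]:
        return []  # identical roots: the replicas are in sync
    suspects = [0]  # index of the root on the top level
    for depth in range(len(a) - 2, -1, -1):
        children = []
        for index in suspects:
            for child in (2 * index, 2 * index + 1):
                if a[depth][child] != b[depth][child]:
                    children.append(child)
        suspects = children
    return suspects


if __name__ == "__main__":
    good = [b"a", b"b", b"c", b"d"]
    stale = [b"a", b"X", b"c", b"d"]
    print("chunks to repair:", differing_leaves(good, stale))
```

Only the chunks reported as different would need to be streamed between the replicas; matching subtrees are skipped after a single hash comparison.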
  • 39. Summary
  • 40. Summary
    Cassandra is a very powerful distributed and decentralized NoSQL database with no single point of failure.
    It guarantees service and data availability in case of a partitioned network, though the data might be stale.
    Designed for large data stores that require a performant and scalable system.
    The application data model needs to be designed for Cassandra.
    Many ways to interact with the database:
    – CQLSH (Cassandra Query Language Shell).
    – Drivers and tools provided by DataStax.
    DataStax offers support for enterprise customers and good documentation.
  • 41. Robert Bialek, Senior Principal Consultant, Tel. +49 89 99 27 59 38, robert.bialek@trivadis.com