Cassandra - A Decentralized
Structured Storage System
Nguyen Tuan Quang
Saltlux – Vietnam Development Center
2016.03.21
Agenda
• Database System Outlines
• Cassandra Overview
• Data Model & Architecture
• Key features
• Comparison
Database Market
Relational DBMS
• Since 1970
• Use SQL to manipulate data
• Excellent for management applications (accounting, reservations,
staff management, etc.)
Relational DBMS
• Schemas aren't designed for sparse data
• Databases are simply not designed to be distributed
New Trends and Requirements
CAP Theorem
• Consistency: all nodes see the same data at the same time
• Partition tolerance: the system continues to operate despite
arbitrary message loss
• Availability: every request receives a response about whether it
succeeded or failed
Consistency Level
• Strong (Sequential): After an update completes, any
subsequent access returns the updated value
• Weak (weaker than Sequential): The system does not
guarantee that subsequent accesses return the
updated value
• Eventual: All updates propagate to every replica in the
distributed system, but this may take some time.
Eventually, all replicas will be consistent.
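The difference between a stale read and an eventually consistent one can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Cassandra propagates writes: the class name `EventuallyConsistentStore`, its methods, and the single pending-update queue are all hypothetical.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy model: a write lands on one replica first and reaches the rest later."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self._pending = deque()  # (replica index, key, value) not yet applied

    def write(self, key, value):
        # Apply to replica 0 immediately; queue the update for the others.
        self.replicas[0][key] = value
        for i in range(1, len(self.replicas)):
            self._pending.append((i, key, value))

    def read(self, key, replica):
        # A read sees only what the chosen replica has received so far.
        return self.replicas[replica].get(key)

    def propagate(self):
        # Deliver every queued update in the order it was written.
        while self._pending:
            i, key, value = self._pending.popleft()
            self.replicas[i][key] = value


store = EventuallyConsistentStore(3)
store.write("x", 1)
print(store.read("x", replica=0))  # 1
print(store.read("x", replica=2))  # None: the update has not arrived yet
store.propagate()
print(store.read("x", replica=2))  # 1
```

Until `propagate()` runs, a read may return the old value (weak consistency); once it has run, every replica agrees (eventual consistency).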
Cassandra
• Apache Cassandra was initially developed at Facebook
to power its Inbox Search
• Its design combines ideas from Amazon’s highly
available Dynamo with Google’s BigTable
data model
Use-case: Facebook Inbox Search
• Cassandra was developed to address this problem.
• Cassandra was tested on 50+ TB of user message data in a
150-node cluster.
• A per-user index of all messages is searched in 2 ways:
– Term search: search by a keyword
– Interactions search: search by a user ID
Use-cases: Apple
• Judging by job listings, Cassandra is Apple's dominant NoSQL database
– MongoDB - 35 job listings (iTunes, Customer Systems Platform, and others)
– Couchbase - 4 job listings (iTunes Social)
– HBase - 33 job listings (Maps, Siri, iAd, iCloud, and more)
– Cassandra - 70 job listings (Maps, iAd, iCloud, iTunes, and more)
Replication and Multi Data Center Replication
Use-cases: Netflix
Use-cases - Apple
Data Model
• Keyspace is the outermost container for data in Cassandra
• Columns are grouped into Column Families.
• Each Column has
– Name
– Value
– Timestamp
Data Model for Tornado Metasearch
• Keyspace: metasearch
• Column Family: Metasearch_korean
• Row 1 Key
– TOPIC_URL: URL1
– TOPIC_CONTENT: CONTENT 1
– TOPIC_TITLE: TOPIC_TITLE1
• Row 2 Key
– TOPIC_URL: URL2
– TOPIC_CONTENT: CONTENT 2
– TOPIC_TITLE: TOPIC_TITLE2
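The nesting described above (keyspace, column family, row key, column) can be sketched with plain Python dictionaries. This is only an illustration: the `Column` class, the `put` helper, and the row keys `row1` and `row2` are hypothetical names, and the rule that the newest timestamp wins is stated here as an assumption about how the timestamp is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int  # supplied by the writer; assumed to decide which write wins


def put(keyspace, family, row_key, name, value, timestamp):
    """Store a column under keyspace -> column family -> row key -> column name."""
    row = keyspace.setdefault(family, {}).setdefault(row_key, {})
    current = row.get(name)
    # Keep the existing column if it carries a newer timestamp.
    if current is None or timestamp >= current.timestamp:
        row[name] = Column(name, value, timestamp)


metasearch = {}  # the keyspace: column family -> row key -> columns
rows = {
    "row1": {"TOPIC_URL": "URL1", "TOPIC_CONTENT": "CONTENT 1", "TOPIC_TITLE": "TOPIC_TITLE1"},
    "row2": {"TOPIC_URL": "URL2", "TOPIC_CONTENT": "CONTENT 2", "TOPIC_TITLE": "TOPIC_TITLE2"},
}
for row_key, columns in rows.items():
    for name, value in columns.items():
        put(metasearch, "Metasearch_korean", row_key, name, value, timestamp=1)

# A late-arriving write with an older timestamp does not replace the newer value.
put(metasearch, "Metasearch_korean", "row1", "TOPIC_URL", "STALE", timestamp=0)
print(metasearch["Metasearch_korean"]["row1"]["TOPIC_URL"].value)  # URL1
```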
System Architecture
• Partitioning: how data is partitioned across nodes
• Replication: how data is duplicated across nodes
• Cluster Membership: how nodes are added to or removed from the cluster
Partitioning
• Nodes are logically arranged in a ring topology.
• The hashed value of the key associated with a data partition is used
to assign it to a node in the ring.
• The hash space wraps around after its largest value, which closes
the ring.
• Lightly loaded nodes move position on the ring to relieve heavily
loaded nodes.
Partitions, Partition Key
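A minimal sketch of ring partitioning, assuming each node owns a single position and a key belongs to the first node at or after its hash, wrapping past the end. The `Ring` class and `token` function are hypothetical names, and MD5 is used only as a convenient, stable hash for the illustration.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a position on the ring (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class Ring:
    def __init__(self, nodes):
        # Each node owns one position; sorting lets us walk the ring in order.
        self._entries = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, key: str) -> str:
        # First node at or after the key's position; the modulo wraps
        # past the last node back to the first, closing the ring.
        i = bisect.bisect_left(self._tokens, token(key)) % len(self._entries)
        return self._entries[i][1]


ring = Ring(["A", "B", "C", "D", "E", "F"])
print(ring.owner("key1"), ring.owner("key2"))
```

Because ownership depends only on sorted positions, the same key maps to the same node no matter the order in which nodes were listed.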
Replication
• Each data item is replicated at N nodes, where N is the replication factor.
• Different replication policies:
– Rack Unaware: replicate data at the N-1 successive nodes after its
coordinator
– Rack Aware: uses ZooKeeper to elect a leader, which tells nodes
the ranges they are replicas for
– Datacenter Aware: similar to Rack Aware, but the leader is chosen at
the datacenter level instead of the rack level
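The Rack Unaware policy is simple enough to sketch directly: the coordinator keeps one copy and the next N-1 nodes on the ring keep the rest. The function name `rack_unaware_replicas` and the node list are assumptions made for this example.

```python
def rack_unaware_replicas(ring_order, coordinator, n):
    """Return the coordinator plus the N-1 nodes that follow it on the ring."""
    if n > len(ring_order):
        raise ValueError("replication factor exceeds cluster size")
    start = ring_order.index(coordinator)
    # Step around the ring, wrapping from the last node back to the first.
    return [ring_order[(start + i) % len(ring_order)] for i in range(n)]


ring_order = ["A", "B", "C", "D", "E", "F"]
print(rack_unaware_replicas(ring_order, "B", 3))  # ['B', 'C', 'D']
print(rack_unaware_replicas(ring_order, "F", 3))  # ['F', 'A', 'B']
```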
Partitioning and Replication
[Figure: a ring with positions labeled 0, 1, and 1/2 and nodes A to F; keys are placed at h(key1) and h(key2) and replicated with N=3]
* Figure taken from the slides of Avinash Lakshman and Prashant Malik (authors of the paper).
Cassandra Key features
• Big Data Scalability
– Scalable to petabytes
– New nodes = linear performance increase
– Add new nodes online
Cassandra Key features
• No Single Point of Failure
– All nodes are the same
– Read/write from any nodes
– Can replicate from different data centers
Cassandra Key features
• Easy Replica/Data Distribution
– Transparently handled by Cassandra
– Multiple data centers are supported
– Exploit the benefits of cloud computing
Cassandra Key features
• No need for caching software
– The peer-to-peer architecture removes the need for a special caching layer
– Database cluster uses memory of its own nodes to cache data
Cassandra Key features
• Tunable Data Consistency
– Choose between strong and eventual consistency
– Can be done on per-operation basis, and for both reads and writes
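One common way to reason about this trade-off is the overlap rule: with N replicas, a read that contacts R replicas is guaranteed to see a write acknowledged by W replicas whenever R + W > N. The sketch below applies that rule to three consistency levels (ONE, QUORUM, ALL); the helper names are hypothetical, and the code is a back-of-the-envelope check rather than a driver API.

```python
def replicas_required(level: str, n: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    levels = {"ONE": 1, "QUORUM": n // 2 + 1, "ALL": n}
    return levels[level]


def is_strong(read_level: str, write_level: str, n: int) -> bool:
    # Read and write sets must overlap in at least one replica: R + W > N.
    r = replicas_required(read_level, n)
    w = replicas_required(write_level, n)
    return r + w > n


N = 3
print(is_strong("QUORUM", "QUORUM", N))  # True: 2 + 2 > 3
print(is_strong("ONE", "ONE", N))        # False: 1 + 1 <= 3
print(is_strong("ONE", "ALL", N))        # True: 1 + 3 > 3
```

Because the levels are chosen per operation, a latency-sensitive read can use ONE while a critical write uses ALL, and the pair still overlaps.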
MongoDB vs. Cassandra
Comparison with MySQL
• MySQL, > 50 GB of data
– Average write: ~300 ms
– Average read: ~350 ms
• Cassandra, > 50 GB of data
– Average write: 0.12 ms
– Average read: 15 ms
• Figures provided by the Cassandra authors, measured on Facebook data.
Key features Recaps
• Distributed and Decentralized
– Every node is identical; unlike master/slave systems, no node needs
to be set up as a master to organize the others
– As a result, there is no single point of failure
• High Availability & Fault Tolerance
– You can replace failed nodes in the cluster with no downtime, and
you can replicate data to multiple data centers to offer improved
local performance and prevent downtime if one data center
experiences a catastrophe such as fire or flood.
• Tunable Consistency
– It allows you to easily decide the level of consistency you require, in
balance with the level of availability
Key features Recaps
• Elastic Scalability
– Elastic scalability refers to a special property of horizontal scalability.
It means that your cluster can seamlessly scale up and scale back
down.
Thank You
