SlideShare ist ein Scribd-Unternehmen logo
1 von 10
Downloaden Sie, um offline zu lesen
Cassandra

    Rob Keisler
CSCI 638 -- Summer 2011
What is Cassandra?

● A distributed storage system with a flexible schema and
  high-write throughput

● Developed by Facebook; turned over to Apache

● At its core, Cassandra borrows from both:
   ○ Amazon's Dynamo Infrastructure
   ○ Google's BigTable Data Model
Cassandra's Infrastructure
Cassandra's Data Model

● Rows (keyspace)
● Column Families 
● Columns and Super Columns
   ○ User can specify sorting by name or timestamp


                                                 Column        SuperColumn
  KeyA   ColumnA        ColumnB    ColumnC
                                              Byte [] Name     Byte [] Name

  KeyB   ColumnX        ColumnY   Column Z    Byte [] Value    List<Column>
                                                                  Columns
                                             Int64 Timestamp
  KeyA   SuperColumnI        SuperColumnJ


  KeyB   SuperColumnM        SuperColumnN
Cassandra's Data Model (in JSON)

● Key > Column Family > Column 
    {
        "keyA":{
           "Users":{
              "emailAddress":{"timestamp":"1", "value":"foo@bar.com"},
              "webSite":{"timestamp":"4", "value":"http://bar.com"}
           },
           "Stats":{
              "visits":{"timestamp":"3", "value":"243"}
           }
        },
        "keyB":{
           "Users":{
              "emailAddress":{"timestamp":"1", "value":"user2@bar.com"},
              "twitter":{"timestamp":"4", "value":"user2"}
           }
        }
    }
Cassandra's Data Model (in JSON)

● Key > Column Family > Super Column > Column 
    {
      "KeyA": {
        "Tags": {
          "cassandra": {
            "incubator": {"timestamp": "http://incubator.apache.org/cassandra/"},
            "jira": {"timestamp": "http://issues.apache.org/jira/browse/CASSANDRA"}
          },
          "thrift": {
            "jira": {"timestamp": "http://issues.apache.org/jira/browse/THRIFT"}
       }
      }
     }
    }
Differences from Dynamo

● Partitioning
   ○ Dynamo distributes virtual nodes on the hash ring using
     the performance of the host node
   ○ Cassandra distributes host nodes by examining load
     information on the hash ring and moving lightly loaded
     nodes to alleviate those with high load

● Replication
   ○ "Rack Unaware"
   ○ "Rack Aware"
   ○ "Datacenter Aware"
Differences from Dynamo

● Failure Detection
   ○ Dynamo uses a gossip-based protocol for membership
     changes; a node is assumed failed if it does not respond
   ○ Cassandra uses the same gossip-based protocol but uses
     a φ (phi) Accrual Failure Detector
       ■ Does not emit a boolean up or down
       ■ Emits a value which represents a suspicion level
       ■ The suspicion threshold is dynamically adjusted via
         the gossip messages
           ■ Sliding windows determined by arrival times 
           ■ Statistical distribution model created
Differences from BigTable

● Data Model
   ○ BigTable stores <K,V> pairs in SSTables by Column
     Family with historical versions
   ○ Cassandra drops historical versions and adds the super
     column concept

● Storage
   ○ BigTable uses the Google File System (GFS)
   ○ Cassandra uses the local file system
Cassandra

http://cassandra.apache.org/

Weitere ähnliche Inhalte

Was ist angesagt?

OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017HBaseCon
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase HBaseCon
 
Scalable Data Modeling by Example (Carlos Alonso, Job and Talent) | Cassandra...
Scalable Data Modeling by Example (Carlos Alonso, Job and Talent) | Cassandra...Scalable Data Modeling by Example (Carlos Alonso, Job and Talent) | Cassandra...
Scalable Data Modeling by Example (Carlos Alonso, Job and Talent) | Cassandra...DataStax
 
OpenTSDB 2.0
OpenTSDB 2.0OpenTSDB 2.0
OpenTSDB 2.0HBaseCon
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user groupAdam Doyle
 
Cassandra Lunch #59 Functions in Cassandra
Cassandra Lunch #59  Functions in CassandraCassandra Lunch #59  Functions in Cassandra
Cassandra Lunch #59 Functions in CassandraAnant Corporation
 
Scalable real-time processing techniques
Scalable real-time processing techniquesScalable real-time processing techniques
Scalable real-time processing techniquesLars Albertsson
 
Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa HBaseCon
 
Data- How Does It Work-
Data- How Does It Work-Data- How Does It Work-
Data- How Does It Work-Boyang Niu
 
Small intro to Big Data - Old version
Small intro to Big Data - Old versionSmall intro to Big Data - Old version
Small intro to Big Data - Old versionSoftwareMill
 
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...DataStax
 
DataStax and Esri: Geotemporal IoT Search and Analytics
DataStax and Esri: Geotemporal IoT Search and AnalyticsDataStax and Esri: Geotemporal IoT Search and Analytics
DataStax and Esri: Geotemporal IoT Search and AnalyticsDataStax Academy
 
Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Omid Vahdaty
 
Engineering fast indexes
Engineering fast indexesEngineering fast indexes
Engineering fast indexesDaniel Lemire
 
openTSDB - Metrics for a distributed world
openTSDB - Metrics for a distributed worldopenTSDB - Metrics for a distributed world
openTSDB - Metrics for a distributed worldOliver Hankeln
 
J-Day Kraków: Listen to the sounds of your application
J-Day Kraków: Listen to the sounds of your applicationJ-Day Kraków: Listen to the sounds of your application
J-Day Kraków: Listen to the sounds of your applicationMaciej Bilas
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Jelena Zanko
 
Apache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series databaseApache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series databaseFlorian Lautenschlager
 

Was ist angesagt? (20)

OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017OpenTSDB: HBaseCon2017
OpenTSDB: HBaseCon2017
 
Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase Update on OpenTSDB and AsyncHBase
Update on OpenTSDB and AsyncHBase
 
Scalable Data Modeling by Example (Carlos Alonso, Job and Talent) | Cassandra...
Scalable Data Modeling by Example (Carlos Alonso, Job and Talent) | Cassandra...Scalable Data Modeling by Example (Carlos Alonso, Job and Talent) | Cassandra...
Scalable Data Modeling by Example (Carlos Alonso, Job and Talent) | Cassandra...
 
OpenTSDB 2.0
OpenTSDB 2.0OpenTSDB 2.0
OpenTSDB 2.0
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user group
 
Cassandra Lunch #59 Functions in Cassandra
Cassandra Lunch #59  Functions in CassandraCassandra Lunch #59  Functions in Cassandra
Cassandra Lunch #59 Functions in Cassandra
 
Scalable real-time processing techniques
Scalable real-time processing techniquesScalable real-time processing techniques
Scalable real-time processing techniques
 
Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa Rolling Out Apache HBase for Mobile Offerings at Visa
Rolling Out Apache HBase for Mobile Offerings at Visa
 
Data- How Does It Work-
Data- How Does It Work-Data- How Does It Work-
Data- How Does It Work-
 
Small intro to Big Data - Old version
Small intro to Big Data - Old versionSmall intro to Big Data - Old version
Small intro to Big Data - Old version
 
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
SASI: Cassandra on the Full Text Search Ride (DuyHai DOAN, DataStax) | C* Sum...
 
DataStax and Esri: Geotemporal IoT Search and Analytics
DataStax and Esri: Geotemporal IoT Search and AnalyticsDataStax and Esri: Geotemporal IoT Search and Analytics
DataStax and Esri: Geotemporal IoT Search and Analytics
 
JSONiq - The SQL of NoSQL
JSONiq - The SQL of NoSQLJSONiq - The SQL of NoSQL
JSONiq - The SQL of NoSQL
 
Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...Amazon aws big data demystified | Introduction to streaming and messaging flu...
Amazon aws big data demystified | Introduction to streaming and messaging flu...
 
Engineering fast indexes
Engineering fast indexesEngineering fast indexes
Engineering fast indexes
 
openTSDB - Metrics for a distributed world
openTSDB - Metrics for a distributed worldopenTSDB - Metrics for a distributed world
openTSDB - Metrics for a distributed world
 
druid.io
druid.iodruid.io
druid.io
 
J-Day Kraków: Listen to the sounds of your application
J-Day Kraków: Listen to the sounds of your applicationJ-Day Kraków: Listen to the sounds of your application
J-Day Kraków: Listen to the sounds of your application
 
Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20Imply at Apache Druid Meetup in London 1-15-20
Imply at Apache Druid Meetup in London 1-15-20
 
Apache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series databaseApache Solr as a compressed, scalable, and high performance time series database
Apache Solr as a compressed, scalable, and high performance time series database
 

Andere mochten auch

Understanding Digital Signal Processing by Lyons
Understanding Digital Signal Processing by LyonsUnderstanding Digital Signal Processing by Lyons
Understanding Digital Signal Processing by LyonsGovind Sridharan
 
易春香
易春香易春香
易春香zxedu
 
親友團說明
親友團說明  親友團說明
親友團說明 珮儀 江
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsAcunu
 
Swami mamunigal aani moolam srisailesa vaibhavam
Swami mamunigal aani moolam srisailesa vaibhavamSwami mamunigal aani moolam srisailesa vaibhavam
Swami mamunigal aani moolam srisailesa vaibhavamGovind Sridharan
 
20 things I learned about Browsers and the Web
20 things I learned about Browsers and the Web20 things I learned about Browsers and the Web
20 things I learned about Browsers and the WebGovind Sridharan
 

Andere mochten auch (6)

Understanding Digital Signal Processing by Lyons
Understanding Digital Signal Processing by LyonsUnderstanding Digital Signal Processing by Lyons
Understanding Digital Signal Processing by Lyons
 
易春香
易春香易春香
易春香
 
親友團說明
親友團說明  親友團說明
親友團說明
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Swami mamunigal aani moolam srisailesa vaibhavam
Swami mamunigal aani moolam srisailesa vaibhavamSwami mamunigal aani moolam srisailesa vaibhavam
Swami mamunigal aani moolam srisailesa vaibhavam
 
20 things I learned about Browsers and the Web
20 things I learned about Browsers and the Web20 things I learned about Browsers and the Web
20 things I learned about Browsers and the Web
 

Ähnlich wie Cassandra

Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka MeetupCliff Gilmore
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache CassandraStu Hood
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage systemArunit Gupta
 
NoSQL - Cassandra & MongoDB.pptx
NoSQL -  Cassandra & MongoDB.pptxNoSQL -  Cassandra & MongoDB.pptx
NoSQL - Cassandra & MongoDB.pptxNaveen Kumar
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandraPL dream
 
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraCassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraDave Gardner
 
Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Introduce Apache Cassandra - JavaTwo Taiwan, 2012Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Introduce Apache Cassandra - JavaTwo Taiwan, 2012Boris Yen
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUGStu Hood
 
cassandra_presentation_final
cassandra_presentation_finalcassandra_presentation_final
cassandra_presentation_finalSergioBruno21
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Robbie Strickland
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real WorldJeremy Hanna
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataChen Robert
 
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...Lviv Startup Club
 
Deep Dive into Cassandra
Deep Dive into CassandraDeep Dive into Cassandra
Deep Dive into CassandraBrent Theisen
 
Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Md. Shohel Rana
 
How Opera Syncs Tens of Millions of Browsers and Sleeps Well at Night
How Opera Syncs Tens of Millions of Browsers and Sleeps Well at NightHow Opera Syncs Tens of Millions of Browsers and Sleeps Well at Night
How Opera Syncs Tens of Millions of Browsers and Sleeps Well at NightScyllaDB
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScyllaDB
 

Ähnlich wie Cassandra (20)

Chicago Kafka Meetup
Chicago Kafka MeetupChicago Kafka Meetup
Chicago Kafka Meetup
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
On Rails with Apache Cassandra
On Rails with Apache CassandraOn Rails with Apache Cassandra
On Rails with Apache Cassandra
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
NoSQL - Cassandra & MongoDB.pptx
NoSQL -  Cassandra & MongoDB.pptxNoSQL -  Cassandra & MongoDB.pptx
NoSQL - Cassandra & MongoDB.pptx
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
 
Cassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache CassandraCassandra's Sweet Spot - an introduction to Apache Cassandra
Cassandra's Sweet Spot - an introduction to Apache Cassandra
 
Cassandra Overview
Cassandra OverviewCassandra Overview
Cassandra Overview
 
Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Introduce Apache Cassandra - JavaTwo Taiwan, 2012Introduce Apache Cassandra - JavaTwo Taiwan, 2012
Introduce Apache Cassandra - JavaTwo Taiwan, 2012
 
Cassandra Talk: Austin JUG
Cassandra Talk: Austin JUGCassandra Talk: Austin JUG
Cassandra Talk: Austin JUG
 
cassandra_presentation_final
cassandra_presentation_finalcassandra_presentation_final
cassandra_presentation_final
 
Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015Lambda at Weather Scale - Cassandra Summit 2015
Lambda at Weather Scale - Cassandra Summit 2015
 
Apache Cassandra in the Real World
Apache Cassandra in the Real WorldApache Cassandra in the Real World
Apache Cassandra in the Real World
 
Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting data
 
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...
Vitalii Bondarenko - “Azure real-time analytics and kappa architecture with K...
 
Deep Dive into Cassandra
Deep Dive into CassandraDeep Dive into Cassandra
Deep Dive into Cassandra
 
Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Cassandra - A Distributed Database System
Cassandra - A Distributed Database System
 
How Opera Syncs Tens of Millions of Browsers and Sleeps Well at Night
How Opera Syncs Tens of Millions of Browsers and Sleeps Well at NightHow Opera Syncs Tens of Millions of Browsers and Sleeps Well at Night
How Opera Syncs Tens of Millions of Browsers and Sleeps Well at Night
 
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by ScyllaScylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
Scylla Summit 2016: Analytics Show Time - Spark and Presto Powered by Scylla
 
Presentation
PresentationPresentation
Presentation
 

Kürzlich hochgeladen

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 

Cassandra

  • 1. Cassandra Rob Keisler CSCI 638 -- Summer 2011
  • 2. What is Cassandra? ● A distributed storage system with a flexible schema and high-write throughput ● Developed by Facebook; turned over to Apache ● At its core, Cassandra borrows from both: ○ Amazon's Dynamo Infrastructure ○ Google's BigTable Data Model
  • 4. Cassandra's Data Model ● Rows (keyspace) ● Column Families  ● Columns and Super Columns ○ User can specify sorting by name or timestamp Column SuperColumn KeyA ColumnA ColumnB ColumnC Byte [] Name Byte [] Name KeyB ColumnX ColumnY Column Z Byte [] Value List<Column> Columns Int64 Timestamp KeyA SuperColumnI SuperColumnJ KeyB SuperColumnM SuperColumnN
  • 5. Cassandra's Data Model (in JSON) ● Key > Column Family > Column  { "keyA":{ "Users":{ "emailAddress":{"timestamp":"1", "value":"foo@bar.com"}, "webSite":{"timestamp":"4", "value":"http://bar.com"} }, "Stats":{ "visits":{"timestamp":"3", "value":"243"} } }, "keyB":{ "Users":{ "emailAddress":{"timestamp":"1", "value":"user2@bar.com"}, "twitter":{"timestamp":"4", "value":"user2"} } } }
  • 6. Cassandra's Data Model (in JSON) ● Key > Column Family > Super Column > Column  {   "KeyA": {     "Tags": {       "cassandra": {         "incubator": {"timestamp": "http://incubator.apache.org/cassandra/"},         "jira": {"timestamp": "http://issues.apache.org/jira/browse/CASSANDRA"}       },       "thrift": {         "jira": {"timestamp": "http://issues.apache.org/jira/browse/THRIFT"}    }   }  } }
  • 7. Differences from Dynamo ● Partitioning ○ Dynamo distributes virtual nodes on the hash ring using the performance of the host node ○ Cassandra distributes host nodes by examining load information on the hash ring and moving lightly loaded nodes to alleviate those with high load ● Replication ○ "Rack Unaware" ○ "Rack Aware" ○ "Datacenter Aware"
  • 8. Differences from Dynamo ● Failure Detection ○ Dynamo uses a gossip-based protocol for membership changes; a node is assumed failed if it does not respond ○ Cassandra uses the same gossip-based protocol but uses a φ (phi) Accrual Failure Detector ■ Does not emit a boolean up or down ■ Emits a value which represents a suspicion level ■ The suspicion threshold is dynamically adjusted via the gossip messages ■ Sliding windows determined by arrival times  ■ Statistical distribution model created
  • 9. Differences from BigTable ● Data Model ○ BigTable stores <K,V> pairs in SSTables by Column Family with historical versions ○ Cassandra drops historical versions and adds the super column concept ● Storage ○ BigTable uses the Google File System (GFS) ○ Cassandra uses the local file system