An introduction
 to Cassandra

                           Pedro Gomes
              pedrogomes@lsd.di.uminho.pt
          Braga Geek Nights - April 2010
Context
•   NoSQL movement - "Not only SQL"
    • unstructured data
    • web-oriented interfaces
    • scale problems
•   20+ emerging non-relational databases (e.g. Voldemort)
    • Document stores
    • Graph databases
    • Key-value and wide-column stores
Cassandra - introduction
• Named after the Greek prophetess Cassandra.
• Based on Amazon Dynamo and Google
  Bigtable
• Built at Facebook, open sourced in 2008
• Scalable, decentralized, structured data
  store
Why Cassandra?
•   Highly available
•   Eventually consistent
•   Decentralized
•   Elastic
•   Fault tolerant
•   Flexible Schema
A little internals...
• Built for scale - consistent hashing
  [Figure: consistent hashing ring with node labels A, B, F, I, M and N, and a "New node" joining the ring]
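
The ring on this slide can be sketched in a few lines of Python. This is a toy model, not Cassandra's implementation: the `Ring` class, the `token` helper, the use of MD5 and the node names are all assumptions made for illustration. It shows the property that matters for scaling: when a new node joins, the only keys that change owner are the ones the new node takes over.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumption of this sketch)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy consistent-hashing ring: a key belongs to the first node clockwise from it."""

    def __init__(self, nodes=()):
        self._tokens = []   # sorted node tokens
        self._owners = {}   # node token -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        t = token(node)
        bisect.insort(self._tokens, t)
        self._owners[t] = node

    def owner(self, key: str) -> str:
        # First node token >= key token, wrapping around to the start of the ring.
        i = bisect.bisect_left(self._tokens, token(key))
        return self._owners[self._tokens[i % len(self._tokens)]]


keys = [f"post-{i}" for i in range(1000)]
ring = Ring(["A", "F", "M"])
before = {k: ring.owner(k) for k in keys}

ring.add("N")  # a new node joins the ring
after = {k: ring.owner(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Every key that changed owner now belongs to the new node; the rest stay put.
assert all(after[k] == "N" for k in moved)
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

With a plain `hash(key) % number_of_nodes` scheme, by contrast, adding a node would reassign most keys.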
Partitioners
• Order preserving
• Random
• Custom...
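
The difference between the first two partitioners can be sketched as follows. The function names are hypothetical and the code is only an illustration: a random partitioner places a key by a hash of the key (MD5 here), while an order-preserving one uses the key itself, so neighbouring keys stay next to each other on the ring.

```python
import hashlib


def random_token(key: str) -> int:
    # Hash-based placement: ring position is unrelated to key order,
    # which spreads load evenly but scatters neighbouring keys.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def order_preserving_token(key: str) -> str:
    # The key itself is the token, so sorted keys sit side by side on the ring.
    return key


keys = ["dave", "alice", "carol", "bob"]
by_order = sorted(keys, key=order_preserving_token)
by_hash = sorted(keys, key=random_token)

# With order-preserving placement a key range is one contiguous slice of the ring.
in_range = [k for k in by_order if "b" <= k < "d"]

print("order preserving:", by_order)
print("random:          ", by_hash)
print("range [b, d):    ", in_range)
```

The trade-off this illustrates: order-preserving placement makes range scans cheap but can concentrate load on a few nodes, while random placement balances load at the cost of ordered scans.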
Consistency
• CAP theorem: Consistency, Availability, Partition Tolerance
 • Trade consistency for availability
    •   Eventual consistency
    •   Read Repair, Hinted Handoff, Proactive Repair
  • A choice, not an obligation
Consistency - N,W,R
• Define your consistency:
 • Define the replication factor N
 • For writes and reads, choose the number
    of nodes W or R that must respond
   • ALL, ONE, QUORUM, ZERO.
   • W + R > N: every read overlaps the latest write (strong consistency)
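
The W + R > N rule can be checked with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and ZERO is modelled as a write that waits for no acknowledgement, which is an assumption of this example.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def read_sees_latest_write(n: int, w: int, r: int) -> bool:
    """True when any r replicas must include at least one of the w that took the write."""
    return w + r > n


N = 3  # replication factor
write_levels = {"ZERO": 0, "ONE": 1, "QUORUM": quorum(N), "ALL": N}
read_levels = {"ONE": 1, "QUORUM": quorum(N), "ALL": N}

for w_name, w in write_levels.items():
    for r_name, r in read_levels.items():
        verdict = "consistent" if read_sees_latest_write(N, w, r) else "eventual"
        print(f"W={w_name:<6} R={r_name:<6} -> {verdict}")
```

For N = 3, QUORUM writes with QUORUM reads (2 + 2 > 3) overlap, while ONE with ONE (1 + 1 ≤ 3) does not, so that combination relies on eventual consistency.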
Data model
• Keyspaces - the top-level container for your keys and column families
• Column families - groups of columns
• Columns - a tuple of column name, value,
  and timestamp
• Super columns - a column whose value is a set of
  columns
• I will show pictures next, don’t worry.
Data model - Column Families
• Using the blog example:
 • Posts

   Key           Columns
   Geek Nights   Title: Geek Nights | Author: Pedro   | Body: The...
   Cassandra     Title: Cassandra   | Author: Pedro   | Body: This...    | Tags: Data, ...
   Stuff         Title: Stuff       | Author: Someone | Body: Something
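
One way to picture this column family is as a map from row key to a map of columns. The sketch below is only a mental model in Python, not how Cassandra stores data; the `Column` type, the `row` helper and the timestamp values are made up for the example.

```python
from typing import NamedTuple


class Column(NamedTuple):
    """A column as described on the earlier slide: name, value and timestamp."""
    name: str
    value: str
    timestamp: int


def row(timestamp: int, **values: str) -> dict:
    """Build one row: a map from column name to Column (timestamps are invented here)."""
    return {name: Column(name, value, timestamp) for name, value in values.items()}


# Column family "Posts": row key -> columns. Rows need not share the same columns.
posts = {
    "Geek Nights": row(1, Title="Geek Nights", Author="Pedro", Body="The..."),
    "Cassandra": row(2, Title="Cassandra", Author="Pedro", Body="This...", Tags="Data, ..."),
    "Stuff": row(3, Title="Stuff", Author="Someone", Body="Something"),
}

for key, columns in posts.items():
    print(key, "->", sorted(columns))

# Only the "Cassandra" row carries a Tags column: the schema is flexible per row.
extra = set(posts["Cassandra"]) - set(posts["Geek Nights"])
print("extra columns:", extra)
```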
Data model - Super Columns
          • Comments

 Key           SuperColumn        Columns
 Geek Nights   4/5/2010 20:00     Author: Ricardo | Comment: I think... | email: email@
               4/5/2010 19:00     Author: Jack    | Comment: IMO ...    | email: email@
 Cassandra     1/4/2010 14:00     Author: Filipe  | Comment: My POV..   | email: email@
               1/4/2010 14:00     Author: Jon     | Comment: ...        | email: email@
 Stuff         1/4/2010 14:00     Author: Filipe  | Comment: Great...   | email: email@
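
A super column family adds one more level of nesting: row key, then super column name, then columns. The following Python sketch is only a mental model. It uses the timestamp strings from the slide as super column names, which is an assumption of the example, and it leaves out the "Cassandra" row because its two super columns show the same time and would collide as dictionary keys.

```python
# Super column family "Comments": row key -> super column name -> columns.
comments = {
    "Geek Nights": {
        "4/5/2010 20:00": {"Author": "Ricardo", "Comment": "I think...", "email": "email@"},
        "4/5/2010 19:00": {"Author": "Jack", "Comment": "IMO ...", "email": "email@"},
    },
    "Stuff": {
        "1/4/2010 14:00": {"Author": "Filipe", "Comment": "Great...", "email": "email@"},
    },
}

# Reading one value means walking the full path: key, super column, column.
author = comments["Geek Nights"]["4/5/2010 20:00"]["Author"]
print(author)

# All comments on a post come back from a single row lookup.
for name, columns in comments["Geek Nights"].items():
    print(name, columns["Author"], columns["Comment"])
```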
Data model
<Keyspace Name="BloggyAppy">

   <!-- CF definitions -->
   <ColumnFamily CompareWith="BytesType" Name="BlogEntries"/>
   <ColumnFamily CompareWith="TimeUUIDType" Name="Comments"
       CompareSubcolumnsWith="BytesType" ColumnType="Super"/>

</Keyspace>




• Think about your schema
API

• Thrift RPC
 • Java, PHP, C++....
API
•   insert(KeySpace, Key, Column_path, Value, Timestamp, Consistency_level)

•   get(KeySpace, Key, Column_path, Consistency_level)

•   batch_mutate

•   multi_get

•   range

•   ...
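
To make the shape of `insert` and `get` concrete, here is a toy in-memory stand-in written in Python. It is not the Thrift client API: the `ToyStore` class is hypothetical, the column path is assumed to be a (column family, column) pair, the consistency level is accepted but ignored because there is only one process, and conflicts are resolved by keeping the value with the highest timestamp.

```python
class ToyStore:
    """Single-process stand-in that mirrors the argument order of insert/get."""

    def __init__(self):
        # (keyspace, key, column family, column) -> (value, timestamp)
        self._cells = {}

    def insert(self, keyspace, key, column_path, value, timestamp,
               consistency_level="ONE"):
        cell = (keyspace, key) + tuple(column_path)
        current = self._cells.get(cell)
        # Keep the write only if it is newer than what is already stored.
        if current is None or timestamp > current[1]:
            self._cells[cell] = (value, timestamp)

    def get(self, keyspace, key, column_path, consistency_level="ONE"):
        return self._cells[(keyspace, key) + tuple(column_path)]


store = ToyStore()
title = ("BlogEntries", "Title")  # hypothetical column path

store.insert("BloggyAppy", "Geek Nights", title, "Geek Nights", 1)
store.insert("BloggyAppy", "Geek Nights", title, "Geek Nights Braga", 3)
store.insert("BloggyAppy", "Geek Nights", title, "Old title", 2)  # stale, ignored

print(store.get("BloggyAppy", "Geek Nights", title))
```

The timestamp argument is what lets replicas that received writes in different orders settle on the same value.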
Have fun

• Clients for many languages
• Lucandra
• Hadoop support
• ...
End


• Questions?
