My experience with Cassandra concepts

I recently read about Cassandra concepts and internals to understand how it works and why
it is suited to handling large volumes of data. This is a very interesting and also complex
subject, and I have merely scratched the surface so far.
Cassandra is an open-source, scalable, and highly available "NoSQL" distributed database
management system from Apache. It is classified under the column-family NoSQL
category. It was initially developed at Facebook, released as open source, and later became
an Apache project. Its core design draws on Amazon’s Dynamo (for distribution and
replication) and Google’s Bigtable (for the data model).
Its support for dynamic columns and distributed counters addresses a major problem: many
statistics can be aggregated as the data arrives, rather than with a map/reduce job at a
later stage.
Another attractive property of Cassandra is that it can serve much of its data from cache
(if given enough RAM).
Cassandra Data Model
The Cassandra data model consists of a keyspace (analogous to a database), column
families (analogous to tables in the relational model), keys and columns. Here’s what the
basic Cassandra table (also known as a column family) structure looks like:

Figure 1-1: Structure of a super column family in Cassandra

Don’t think of a relational table
Instead, think of a nested, sorted map data structure.
The following relational model analogy is often used to introduce Cassandra to newcomers:

Figure 1-2: Relational vs. Cassandra Model

This analogy helps make the transition from the relational to non-relational world. But don’t
use this analogy while designing Cassandra column families. Instead, think of the Cassandra
column family as a map of a map: an outer map keyed by a row key, and an inner map
keyed by a column key. Both maps are sorted.
SortedMap<RowKey, SortedMap<ColumnKey, ColumnValue>>
Why?
A nested sorted map is a more accurate analogy than a relational table, and will help you
make the right decisions about your Cassandra data model.
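To make the map-of-maps picture concrete, here is a minimal in-memory sketch in Java. It is only an analogy for how a column family is organized, not Cassandra's API or storage engine. The class name `ColumnFamilySketch`, its methods, and the sample user data are all hypothetical and chosen for illustration.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for one column family; not Cassandra's API.
class ColumnFamilySketch {
    // Outer map: row key -> row. Inner map: column key -> column value.
    private final SortedMap<String, SortedMap<String, String>> rows = new TreeMap<>();

    // Writes one column, creating the row on first use.
    void put(String rowKey, String columnKey, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnKey, value);
    }

    // Returns the row's columns in column-key order, or an empty map if the row is absent.
    SortedMap<String, String> row(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>());
    }

    // Point lookup by row key and column key; null if either is absent.
    String get(String rowKey, String columnKey) {
        return row(rowKey).get(columnKey);
    }

    public static void main(String[] args) {
        ColumnFamilySketch users = new ColumnFamilySketch();
        // Columns are written in arbitrary order (illustrative data).
        users.put("user42", "name", "Ada");
        users.put("user42", "email", "ada@example.com");
        users.put("user42", "city", "London");

        // The inner map keeps the column keys sorted.
        System.out.println(users.row("user42").keySet()); // prints [city, email, name]
        System.out.println(users.get("user42", "name"));  // prints Ada
    }
}
```

Note that each row carries its own set of column keys; nothing forces two rows to share the same columns, which is where the relational-table analogy starts to mislead.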

Figure 1-3: Cassandra Data Model
How?
A map gives efficient key lookup, and the sorted nature gives efficient scans. In
Cassandra, we can use row keys and column keys to do efficient lookups and range
scans.
The number of column keys is unbounded. In other words, you can have wide rows.
A column key can itself hold a value. In other words, you can have a valueless column.

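In practice, rows are usually distributed across the cluster by a hash of the row key, so
range scans over row keys are generally not available and the outer map is better thought
of as unsorted:
Map<RowKey, SortedMap<ColumnKey, ColumnValue>>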
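The ideas of wide rows, valueless columns, and range scans over column keys can be sketched with a single sorted inner map. Again, this is an in-memory analogy under my own assumptions rather than Cassandra code. The class `WideRowSketch`, its methods, and the timestamp-style column keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of one wide row whose column keys carry all the data.
class WideRowSketch {
    // Inner sorted map of a single row: column key -> column value.
    private final TreeMap<String, String> columns = new TreeMap<>();

    // Adds a valueless column: the key is the data, the value is left empty.
    void addValueless(String columnKey) {
        columns.put(columnKey, "");
    }

    // Range scan: column keys from 'from' (inclusive) to 'to' (exclusive), in sorted order.
    List<String> scan(String from, String to) {
        return new ArrayList<>(columns.subMap(from, to).keySet());
    }

    // Number of columns in the row; nothing in the structure caps it.
    int width() {
        return columns.size();
    }

    public static void main(String[] args) {
        WideRowSketch logins = new WideRowSketch();
        // Illustrative login timestamps, written out of order.
        logins.addValueless("2013-03-02T09:15");
        logins.addValueless("2013-01-10T18:40");
        logins.addValueless("2013-02-21T07:05");

        // Scan only the February columns.
        System.out.println(logins.scan("2013-02", "2013-03")); // prints [2013-02-21T07:05]
        System.out.println(logins.width());                    // prints 3
    }
}
```

Because the column keys are sorted, the scan touches only the contiguous slice it needs, which is the property that makes time-series style wide rows attractive.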
Conclusion
It’s important to think carefully about your data and your technology choices, and
sometimes it can be difficult to do that in a data vacuum. Cassandra, Hive, and Hadoop are
often considered suitable tools for many large-scale data challenges.
Your mileage may vary, but feel free to ask us questions in the comments!
