Monitor Cassandra Performance and Resources with Metrics

Presented By: Shubhrank Rastogi
Monitoring Cassandra
with an EYE

Agenda
● A little about Cassandra
● Cassandra Internal Concepts
○ Read Path
○ Write Path
○ Compactions
○ JVM GC
● Quick Q’s on Metrics, Monitoring
● Monitoring Cassandra
● Category-wise metrics with visuals

Quick Overview - About Cassandra
● Apache Cassandra is a free and open-source, distributed, wide column store,
NoSQL database management system designed to handle large amounts of
data across many commodity servers, providing high availability with no single
point of failure.
● A Cassandra cluster has no special nodes i.e. the cluster has no masters, no slaves
or elected leaders.
● It provides various features:
○ Fault-tolerant
○ Scalable
○ Distributed
○ Tunable consistency

Let’s Take a PATH - WRITE PATH
● A write request in Cassandra takes the following path at the node level:
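
As a rough illustration of the node-level write path, the toy model below appends each write to a commit log, applies it to a memtable, and flushes the memtable to an immutable SSTable once a threshold is reached. The class name `ToyNode`, the `flush_threshold` parameter, and the dict-based storage are hypothetical simplifications, not Cassandra's actual implementation.

```python
class ToyNode:
    """Toy model of a node-level write path (not Cassandra's real code)."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # append-only log, used for durability
        self.memtable = {}     # in-memory table: key -> (value, timestamp)
        self.sstables = []     # flushed, immutable tables (modelled as dicts)
        self.flush_threshold = flush_threshold  # hypothetical: max keys per memtable

    def write(self, key, value, timestamp):
        # 1. Append to the commit log first so the write survives a crash.
        self.commit_log.append((key, value, timestamp))
        # 2. Apply the write to the in-memory memtable.
        self.memtable[key] = (value, timestamp)
        # 3. Flush once the memtable reaches the threshold.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the memtable as a new SSTable sorted by key, then discard
        # the memtable and the commit log entries it covered.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()


node = ToyNode(flush_threshold=2)
node.write("user:1", "alice", timestamp=1)
node.write("user:2", "bob", timestamp=2)       # second key triggers a flush
node.write("user:1", "alice-v2", timestamp=3)  # lands in a fresh memtable
print(len(node.sstables), node.memtable)
```

Note that the updated value for `user:1` now lives in the memtable while the older value sits in an SSTable, which is why reads may need to consult several structures.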

WRITE PATH - Cont.
● A write request in Cassandra takes the following path at the cluster level:

Let’s Take a PATH - READ PATH
● A read request takes the following path at the SSTable level:

Compaction is necessary!
○ Why?
■ to save disk space.
■ to clean up redundant data.
■ to make responses faster.
○ How does Cassandra store data?
■ adds data by timestamping each write.
■ generates multiple SSTables.
○ Why is that bad?
■ Disk utilization is high.
■ Response time is high.
○ Solution:
■ remove redundant and deleted data with the help of
tombstones.
■ merge many SSTables into fewer, larger SSTables on each node.
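
The idea behind compaction can be sketched as a merge that keeps only the newest version of each key. In this minimal sketch each SSTable is modelled as a dict of `key -> (value, timestamp)`, and `TOMBSTONE` and `compact` are hypothetical names chosen for illustration. Real Cassandra keeps tombstones for a grace period before purging them; the sketch drops them immediately for simplicity.

```python
TOMBSTONE = object()  # marker standing in for a deleted value


def compact(sstables):
    """Merge several SSTables into one, keeping the newest version of each key.

    Each SSTable is modelled as a dict: key -> (value, timestamp).
    """
    merged = {}
    for table in sstables:
        for key, (value, ts) in table.items():
            # Last write wins: keep the entry with the highest timestamp.
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    # Drop keys whose newest version is a tombstone (a delete marker).
    return {k: v for k, v in merged.items() if v[0] is not TOMBSTONE}


old = {"a": ("1", 10), "b": ("2", 11)}
new = {"a": ("1-updated", 20), "b": (TOMBSTONE, 21), "c": ("3", 22)}
result = compact([old, new])
print(result)  # {'a': ('1-updated', 20), 'c': ('3', 22)}
```

Two input tables holding five entries collapse into one table with two live entries, which is where the disk-space and read-time savings come from.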

JVM GC to the rescue!
● The Cassandra process runs on the JVM and uses both on-heap and off-heap
memory for its components.
● One situation that you definitely want to minimize:
○ a garbage collection pause, aka a stop-the-world event
● A pause occurs when a region of memory is full and the JVM needs to make
space to continue. Its effects:
○ operations are suspended.
○ the node can appear as down to other nodes.
○ read/write operations wait, which increases read/write
latency.

Why do we need Metrics of
anything?

To measure something or
Probably to Monitor
something.

And why do we need to
Monitor something?

● To make that something successful in the long run
● To detect anomalies and anticipate the future
● To optimize that something
● To improve that something
● To save some $$$$

“ Necessity is the mother of
Invention and to make that
Invention successful, we need
to Monitor it. ”

Now Let’s MONITOR Cassandra
● What can we monitor in Cassandra?
○ Performance
■Write Path
■Read Path
○ Capacity (Resources)
■Server/Node
■JVM
○ Operations
■Compactions
■Eventual Consistency

Performance based - Metrics
● Read and Write Latency in READ and WRITE PATH
○ The recent read latency and write latency counters are important for making sure operations are completing
in a consistent manner.
○ Affected by -
■ Replication factor
■ Compactions (improve READ only)
■ MemTable flush
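
One way to turn cumulative latency counters into a "recent" latency figure is to divide the change in total latency by the change in operation count between two samples. The sketch below assumes each sample is a dict with hypothetical fields `count` and `total_latency_us`; how the samples are collected (JMX, an exporter, an agent) is left out.

```python
def average_latency_us(prev, curr):
    """Average latency per operation between two samples of cumulative counters.

    Each sample is a dict with 'count' (operations so far) and
    'total_latency_us' (summed latency so far, in microseconds).
    Returns None when no operations happened in the interval.
    """
    ops = curr["count"] - prev["count"]
    if ops <= 0:
        return None
    return (curr["total_latency_us"] - prev["total_latency_us"]) / ops


prev = {"count": 1000, "total_latency_us": 500_000}
curr = {"count": 1500, "total_latency_us": 800_000}
avg = average_latency_us(prev, curr)
print(avg)  # 600.0 microseconds per operation in this interval
```

Tracking this per-interval value, rather than the lifetime average, makes a sudden slowdown visible shortly after it starts.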

Cont:
● LiveSSTable count - the number of live SSTables, reported across all keyspaces or narrowed down to a specific
keyspace and table.
○ Read latency and disk usage tend to grow with the live SSTable count.
○ Compaction reduces the count according to the configured strategy, but uses a large amount of CPU.

Cont:
● ThreadPool
○ Each of the thread pools provides statistics on the number of tasks that are active, pending, and
completed.
○ Metrics:
■PendingTasks
■CompletedTasks
○ A sustained increase in the pending task count for these pools indicates when to add capacity.
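
A simple way to spot a rising pending-task trend is to fit a straight line through recent samples and look at its slope. The function below is a minimal sketch using an ordinary least-squares slope; the name `pending_trend` and the sample values are hypothetical, and any alert threshold would need tuning per cluster.

```python
def pending_trend(samples):
    """Least-squares slope of pending tasks per sampling interval.

    'samples' is a list of pending-task counts taken at regular intervals.
    A clearly positive slope suggests the pool is falling behind.
    """
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


growing = pending_trend([0, 2, 4, 6, 8])   # backlog grows by 2 per interval
steady = pending_trend([3, 3, 3, 3])       # backlog is flat
print(growing, steady)
```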

Cont:
● MemTable flush count
○ Tuning memtable thresholds can improve write performance by controlling how often memtables are flushed to SSTables.
○ A flush executes when the commit log space threshold or the memtable cleanup threshold has been exceeded.
○ How you tune memtable thresholds depends on your data and write load. Increase memtable thresholds
under either of these conditions:
■ The write load includes a high volume of updates on a smaller set of data.
■ A steady stream of continuous writes occurs.
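
To make the cleanup threshold concrete, the sketch below works through the arithmetic, assuming the documented default of `memtable_cleanup_threshold = 1 / (memtable_flush_writers + 1)` and treating the threshold as a fraction of the total space permitted for memtables. The function names and the example sizes are hypothetical; check the `cassandra.yaml` of your version before relying on these defaults.

```python
def default_cleanup_threshold(memtable_flush_writers):
    """Default cleanup threshold, assuming 1 / (memtable_flush_writers + 1)."""
    return 1 / (memtable_flush_writers + 1)


def flush_trigger_mb(memtable_space_mb, memtable_flush_writers):
    """Approximate memtable usage (in MB) at which a flush is triggered.

    'memtable_space_mb' is the total space permitted for memtables.
    """
    return memtable_space_mb * default_cleanup_threshold(memtable_flush_writers)


threshold = default_cleanup_threshold(1)   # 0.5 with a single flush writer
trigger = flush_trigger_mb(2048, 1)        # 1024.0 MB out of 2048 MB
print(threshold, trigger)
```

Raising the threshold or the permitted space means fewer, larger flushes, which suits the update-heavy and steady-write workloads listed above.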

Operations based: Metrics
● Compactions
○ Metrics
■PendingTasks
■CompletedTasks
○ Monitoring compaction performance is an important aspect of knowing when to add capacity to your
cluster.
○ Compaction affects both the READ and WRITE paths, directly or indirectly.

Cont:
● Eventual Consistency
tells us how consistent our data replication is across nodes.
○ HintedHandoff Metrics
keep track of data that has not yet been replicated to nodes.
■ Hints_created: number of hints created.
○ All data stored in Cassandra is replicated across the cluster based on its replication
factor.
○ When the coordinator node cannot write to a replica node, it stores the data on itself as a hint, then re-attempts
the write once the replica node becomes available.
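
The hinted handoff behaviour described above can be sketched with a toy coordinator that stores a hint when a replica is down and replays it later. The `Replica` and `Coordinator` classes and the `hints_created` counter are hypothetical illustrations, not Cassandra's actual classes, and the sketch ignores timestamps and hint expiry.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True     # whether the replica is reachable
        self.data = {}     # key -> value stored on this replica


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = []          # (replica, key, value) awaiting delivery
        self.hints_created = 0   # counter in the spirit of Hints_created

    def write(self, key, value):
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
            else:
                # Replica unreachable: keep the write locally as a hint.
                self.hints.append((replica, key, value))
                self.hints_created += 1

    def replay_hints(self):
        # Deliver hints to replicas that are back up; keep the rest.
        remaining = []
        for replica, key, value in self.hints:
            if replica.up:
                replica.data[key] = value
            else:
                remaining.append((replica, key, value))
        self.hints = remaining


r1, r2 = Replica("r1"), Replica("r2")
coordinator = Coordinator([r1, r2])
r2.up = False
coordinator.write("k", "v")    # r2 misses the write, a hint is stored
r2.up = True
coordinator.replay_hints()     # r2 catches up
print(coordinator.hints_created, r2.data)
```

A steadily climbing hint counter is a signal that some replicas are frequently unreachable or overloaded.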

Resources based - Metrics
● JVM based resources
○ Metrics:
■JVM_gc_collection_seconds_count
■JVM_heap_memory_bytes_used
■JVM_memory_pool_bytes_used
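
A common derived signal from GC metrics is the fraction of wall-clock time spent in garbage collection over an interval. The sketch below assumes a cumulative GC time counter is available alongside the count metric (for example a `..._seconds_sum` series in Prometheus-style exporters, which is an assumption here); the sample field names `wall_seconds` and `gc_seconds_sum` are hypothetical.

```python
def gc_time_fraction(prev, curr):
    """Fraction of wall-clock time spent in GC between two samples.

    Each sample is a dict with 'wall_seconds' (time of the sample) and
    'gc_seconds_sum' (cumulative seconds spent in GC so far).
    Returns None if the samples are not in increasing time order.
    """
    elapsed = curr["wall_seconds"] - prev["wall_seconds"]
    if elapsed <= 0:
        return None
    return (curr["gc_seconds_sum"] - prev["gc_seconds_sum"]) / elapsed


prev = {"wall_seconds": 0.0, "gc_seconds_sum": 12.0}
curr = {"wall_seconds": 60.0, "gc_seconds_sum": 15.0}
fraction = gc_time_fraction(prev, curr)
print(fraction)  # 3 seconds of GC in a 60 second window
```

A fraction that keeps rising usually shows up together with the read/write latency increases caused by stop-the-world pauses.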

