Christopher Batey
@chbatey
LJC: Building fault tolerant Java applications with Apache Cassandra
@chbatey
Who am I?
• Technical Evangelist for Apache Cassandra
• Founder of Stubbed Cassandra
• Help out Apache Cassandra users
• DataStax
• Builds enterprise ready version of Apache Cassandra
• Previous: Cassandra backed apps at BSkyB
@chbatey
Overview
• ‘Building fault tolerant Java applications with Apache Cassandra’
-Fault tolerance?
-Building an ‘always on’ (HA) service
-Cassandra architecture overview
-Cassandra failure scenarios
-Cassandra Java Driver
@chbatey
The evolution of a developer
• A few years ago I used to write a lot of code in an IDE
• Recently not so much:
- Config management (Ansible, Puppet etc)
- Solution architecture
• Frequently bootstrapping new services
@chbatey
Faults?
• Infrastructure failure
- Node
- Rack
- Data center
• Dependency failure
- HTTP services
- Databases
@chbatey
Building a web app
Who here has downtime on a Friday night for ‘the release’?
@chbatey
Running multiple copies of your app
@chbatey
So when one dies…
@chbatey
Thou shalt be stateless…
• Using container session store = 1/nth of your customers unhappy if a server dies
@chbatey
Storing state in a database?
Session Store?
@chbatey
The drawbacks
• Latency :(
• Tricky to set up
• New DB/Cache = new component in your architecture to scale and make HA
• Tightly coupled to your container now
• Move to embedded container in executable jars :(
@chbatey
Client side state?
SecretState
@chbatey
Still in one DC?
@chbatey
Handling hardware failure
@chbatey
Master/slave
•Master serves all writes
•Read from master and optionally slaves
@chbatey
Peer-to-Peer
• No master
• Read/write to any
• Consistency?
@chbatey
Decisions decisions… CAP theorem
Are these really that different??
Relational Database
Mongo, Redis
Highly Available Databases:
Voldemort, Cassandra
@chbatey
PACELC
• If Partition: Trade off Availability vs Consistency
• Else: Trade off Latency vs Consistency
• http://dbmsmusings.blogspot.co.uk/2010/04/problems-with-cap-and-yahoos-little.html
@chbatey
Cassandra
@chbatey
Cassandra
• Distributed masterless database (Dynamo)
• Column family data model (Google BigTable)
@chbatey
Datacenter and rack aware
• Distributed masterless database (Dynamo)
• Column family data model (Google BigTable)
• Multi data centre replication built in from the start
[Diagram: data centres labelled Europe and USA]
@chbatey
Cassandra
• Distributed masterless database (Dynamo)
• Column family data model (Google BigTable)
• Multi data centre replication built in from the start
• Analytics with Apache Spark
[Diagram: data centres labelled Online and Analytics]
@chbatey
Dynamo 101
• The parts Cassandra took
- Consistent hashing
- Replication
- Strategies for replication
- Gossip
- Hinted handoff
- Anti-entropy repair
• And the parts it left behind
- Key/Value
- Vector clocks
@chbatey
Picking the right nodes
• You don’t want a full table scan on a 1000 node cluster!
• Dynamo to the rescue: Consistent Hashing
• Then the replication strategy takes over:
- Network topology
- Simple
@chbatey
Murmur3 Example
• Data:
Primary key   Columns
jim           age: 36   car: ford   gender: M
carol         age: 37   car: bmw    gender: F
johnny        age: 12               gender: M
suzy          age: 10               gender: F
• Murmur3 hash values:
Primary key   Murmur3 hash value
jim           -2245462676723223822
carol         7723358927203680754
johnny        -6723372854036780875
suzy          1168604627387940318
@chbatey
Murmur3 Example
Four node cluster:
Node   Murmur3 start range     Murmur3 end range
A      -9223372036854775808    -4611686018427387905
B      -4611686018427387904    -1
C      0                       4611686018427387903
D      4611686018427387904     9223372036854775807
© 2014 DataStax, All Rights Reserved.
@chbatey
Pictures are better
Hash range: -9223372036854775808 to 9223372036854775807
[Ring diagram: the hash range split into four contiguous token ranges owned by nodes A, B, C and D]
@chbatey
Murmur3 Example
Data is distributed as:
Node   Start range             End range               Primary key   Hash value
A      -9223372036854775808    -4611686018427387905    johnny        -6723372854036780875
B      -4611686018427387904    -1                      jim           -2245462676723223822
C      0                       4611686018427387903     suzy          1168604627387940318
D      4611686018427387904     9223372036854775807     carol         7723358927203680754
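The lookup from hash value to node can be sketched in a few lines of Java. This is a minimal illustration, not Cassandra's implementation: the class and method names are made up, the four range starts are hard-coded from the slide, and the hash values are taken from the table rather than computed with Murmur3.

```java
import java.util.TreeMap;

// Illustrative sketch only: finds which node owns a token in the
// four-node example. Names are hypothetical, not Cassandra APIs.
class TokenRingSketch {
    // Key: inclusive start of a node's token range. Value: owning node.
    private static final TreeMap<Long, String> RANGE_STARTS = new TreeMap<>();

    static {
        RANGE_STARTS.put(Long.MIN_VALUE, "A");          // -9223372036854775808
        RANGE_STARTS.put(-4611686018427387904L, "B");
        RANGE_STARTS.put(0L, "C");
        RANGE_STARTS.put(4611686018427387904L, "D");
    }

    // floorEntry returns the range with the greatest start <= token.
    static String ownerOf(long token) {
        return RANGE_STARTS.floorEntry(token).getValue();
    }

    public static void main(String[] args) {
        System.out.println("johnny -> " + ownerOf(-6723372854036780875L));
        System.out.println("jim    -> " + ownerOf(-2245462676723223822L));
        System.out.println("suzy   -> " + ownerOf(1168604627387940318L));
        System.out.println("carol  -> " + ownerOf(7723358927203680754L));
    }
}
```

A sorted map keeps the lookup logarithmic in the number of ranges, which is the point of consistent hashing: no node has to scan the cluster to find the owner of a key.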
@chbatey
Replication
@chbatey
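Replication
[Diagram: a client sends a WRITE at CL = 1 to a coordinator (C) in DC1, which forwards it to a remote coordinator (RC) in DC2; DC1 and DC2 each have RF 3]
We have replication!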
@chbatey
Replication strategy
• Simple
- Give it to the next node in the ring
- Don’t use this in production
• NetworkTopology
- Every Cassandra node knows its DC and Rack
- Replicas won’t be put on the same rack unless Replication Factor > # of racks
- Unfortunately Cassandra can’t create servers and racks on the fly to fix this :(
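The 'next node in the ring' idea behind the Simple strategy can be sketched as below. This is a rough illustration under stated assumptions: the node names and the method are hypothetical, and it deliberately ignores racks and data centres, which is exactly why the slide says not to use this strategy in production.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: picks replicas by walking clockwise round the ring.
class SimpleReplicaPlacementSketch {
    // ring: nodes in token order; ownerIndex: the node that owns the key's token.
    static List<String> replicasFor(List<String> ring, int ownerIndex, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        // Cannot place more copies than there are nodes.
        int copies = Math.min(replicationFactor, ring.size());
        for (int i = 0; i < copies; i++) {
            // Wrap around the end of the list to form a ring.
            replicas.add(ring.get((ownerIndex + i) % ring.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("A", "B", "C", "D");
        // A key owned by C with RF 3 is also stored on D and A.
        System.out.println(replicasFor(ring, 2, 3));
    }
}
```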
@chbatey
Tunable Consistency
• Data is replicated N times
• Every query you execute is given a consistency level:
- ALL
- QUORUM
- LOCAL_QUORUM
- ONE
• Christos Kalantzis, "Eventual Consistency != Hopeful Consistency": http://youtu.be/A6qzx_HE3EU?list=PLqcm6qE9lgKJzVvwHprow9h7KMpb5hcUU
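The arithmetic behind QUORUM and LOCAL_QUORUM is small enough to sketch. A quorum is a majority of replicas, i.e. (sum of replication factors / 2) + 1 using integer division. The class and method names below are hypothetical helpers, not driver APIs.

```java
// Illustrative sketch only: how many replicas must respond at each level.
class QuorumMathSketch {
    // QUORUM: a majority of all replicas, summed over every data centre.
    static int quorum(int... replicationFactorPerDc) {
        int total = 0;
        for (int rf : replicationFactorPerDc) {
            total += rf;
        }
        return total / 2 + 1;
    }

    // LOCAL_QUORUM: a majority of the replicas in the coordinator's own data centre.
    static int localQuorum(int localReplicationFactor) {
        return localReplicationFactor / 2 + 1;
    }

    // A read overlaps the latest acknowledged write when R + W > RF.
    static boolean readOverlapsWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        System.out.println("QUORUM, one DC at RF 3:   " + quorum(3));
        System.out.println("QUORUM, two DCs at RF 3:  " + quorum(3, 3));
        System.out.println("LOCAL_QUORUM at RF 3:     " + localQuorum(3));
    }
}
```

With two data centres at RF 3 each, QUORUM needs replicas in both data centres to respond, whereas LOCAL_QUORUM only needs two local ones. That is why LOCAL_QUORUM keeps working when the other data centre is unreachable.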
@chbatey
Handling hardware failure
Async Replication
@chbatey
Load balancing
• Data centre aware policy
• Token aware policy
• Latency aware policy
• Whitelist policy
[Diagram: an APP instance in each of DC1 and DC2, with async replication between the data centres]
@chbatey
But what happens when they come back?
• Hinted handoff to the rescue
• Coordinators keep writes for downed nodes for a configurable amount of time, default 3 hours
• If a node is down for longer than that, run a repair
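The hint window can be sketched as follows. This is a toy model with hypothetical names: a real coordinator stores hints per target node and persists them, whereas this keeps a single in-memory list to show only the time-window rule.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a coordinator's hint buffer for one downed replica.
class HintStoreSketch {
    private final Duration hintWindow;
    private final List<String> hints = new ArrayList<>();

    HintStoreSketch(Duration hintWindow) {
        this.hintWindow = hintWindow;
    }

    // Keeps the write as a hint only while the replica is inside the window.
    // Returns false once the window has passed, meaning a repair is needed.
    boolean offer(String mutation, Duration replicaDownFor) {
        if (replicaDownFor.compareTo(hintWindow) > 0) {
            return false;
        }
        hints.add(mutation);
        return true;
    }

    // Called when the replica comes back: hand over the hints and clear them.
    List<String> replay() {
        List<String> toReplay = new ArrayList<>(hints);
        hints.clear();
        return toReplay;
    }

    public static void main(String[] args) {
        HintStoreSketch store = new HintStoreSketch(Duration.ofHours(3));
        System.out.println(store.offer("write-1", Duration.ofMinutes(30)));
        System.out.println(store.offer("write-2", Duration.ofHours(4)));
        System.out.println(store.replay());
    }
}
```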
@chbatey
Anti entropy repair
• Not exciting but mandatory :)
• New in 2.1: incremental repair (awesome)
@chbatey
Don’t forget to be social
• Each node talks to a few of its peers and shares information
"Did you hear node 1 was down???"
@chbatey
Scaling shouldn’t be hard
• Throw more nodes at a cluster
• Bootstrapping + joining the ring
• For large data sets this can take some time
@chbatey
Cassandra failure modes
@chbatey
Types of failure
• Server failure
- Fail to write to disk
- Whole node fail
- Socket read time out
• Consistency failure
- Read timeout
- Write timeout
- Unavailable
• Application failures
- Idempotent?
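Why idempotency matters for application failures: after a timeout you do not know whether the operation was applied, so a blind retry is only safe if applying it twice is harmless. A minimal sketch, with a hypothetical helper that is not part of any driver:

```java
import java.util.function.Supplier;

// Illustrative sketch only: retries an operation only when it is idempotent.
class IdempotentRetrySketch {
    static <T> T execute(Supplier<T> operation, boolean idempotent, int maxAttempts) {
        // Non-idempotent operations (e.g. incrementing a counter) get one attempt.
        int attempts = idempotent ? Math.max(1, maxAttempts) : 1;
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= attempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try again if allowed
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = execute(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("simulated timeout");
            }
            return "ok";
        }, true, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```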
@chbatey
Read and write timeout
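[Diagram: the application sends a request at C=QUORUM to coordinator C; with replication factor 3 the replicas are R1, R2 and R3, two of them time out, and the application gets a read timeout]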
@chbatey
Unavailable
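[Diagram: the application sends a request at C=QUORUM to coordinator C; with replication factor 3 too few of the replicas R1, R2 and R3 are up, so the coordinator returns Unavailable]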
@chbatey
Coordinator issue
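[Diagram: the application sends a request at C=QUORUM to coordinator C; replication factor 3 with replicas R1, R2 and R3; the coordinator itself never answers (???)]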
Set a socket read timeout!
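The three failure scenarios differ in who notices the problem and when. A rough sketch of the decision, using hypothetical names rather than the driver's exception types:

```java
// Illustrative sketch only: classifies what the application sees for one request.
class RequestOutcomeSketch {
    enum Outcome { SUCCESS, UNAVAILABLE, TIMEOUT, CLIENT_SOCKET_TIMEOUT }

    static Outcome classify(boolean coordinatorResponds, int requiredReplicas,
                            int replicasKnownUp, int replicasAnsweredInTime) {
        // Coordinator issue: nothing comes back, so only the client's own
        // socket read timeout can end the wait.
        if (!coordinatorResponds) {
            return Outcome.CLIENT_SOCKET_TIMEOUT;
        }
        // Unavailable: the coordinator already knows too few replicas are up
        // and rejects the request without attempting it.
        if (replicasKnownUp < requiredReplicas) {
            return Outcome.UNAVAILABLE;
        }
        // Timeout: the request was attempted but too few replicas answered in time.
        if (replicasAnsweredInTime < requiredReplicas) {
            return Outcome.TIMEOUT;
        }
        return Outcome.SUCCESS;
    }

    public static void main(String[] args) {
        // QUORUM with replication factor 3 requires 2 replicas.
        System.out.println(classify(true, 2, 1, 0));
        System.out.println(classify(true, 2, 3, 1));
        System.out.println(classify(false, 2, 3, 3));
    }
}
```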
@chbatey
How do I test this? The test double
• Release It! - great book
• Wiremock - mocking HTTP services
• Saboteur - adding network latency
If you want to build a fault tolerant application, you had better test faults!
@chbatey
Stubbed Cassandra
[Diagram: the application connects to Stubbed Cassandra over the native protocol; the acceptance test primes it on the admin port (REST) and runs verification on the admin port, both via admin endpoints]
@chbatey
Test with a real Cassandra
• Reduced throughput / increased latency
- Nodes down
- Racks down
- DC down
• Automate this if possible
• Load test your schema
- Cassandra stress
- JMeter Cassandra plugin
@chbatey
Summary
• Don’t build services with a single point of failure
• Cassandra deployed correctly means you can handle node, rack and entire DC failure
• Know when to pick availability over consistency
• Accessing Cassandra from Java is boringly easy so I left it out
@chbatey
Thanks for listening
• Follow me on twitter @chbatey
• Cassandra + fault tolerance posts aplenty:
- http://christopher-batey.blogspot.co.uk/
@chbatey
Write path on an individual server
• A commit log for durability
• Data is also written to an in-memory structure (memtable) and then to disk once the memory structure is full (an SSTable)
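The write path on one server can be modelled in a few lines. This is a toy sketch with hypothetical names: the flush threshold counts entries rather than bytes, and the commit log is never trimmed, both of which differ from a real node.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative sketch only: commit log + memtable + flush to an immutable SSTable.
class WritePathSketch {
    private final int memtableLimit;
    final List<String> commitLog = new ArrayList<>();
    final List<SortedMap<String, String>> sstables = new ArrayList<>();
    private TreeMap<String, String> memtable = new TreeMap<>();

    WritePathSketch(int memtableLimit) {
        this.memtableLimit = memtableLimit;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // 1. append to the commit log for durability
        memtable.put(key, value);         // 2. update the sorted in-memory structure
        if (memtable.size() >= memtableLimit) {
            flush();                      // 3. memtable full: write it out
        }
    }

    // The flushed table is sorted and never modified again.
    private void flush() {
        sstables.add(Collections.unmodifiableSortedMap(memtable));
        memtable = new TreeMap<>();
    }

    int memtableSize() {
        return memtable.size();
    }

    public static void main(String[] args) {
        WritePathSketch node = new WritePathSketch(2);
        node.write("jim", "36");
        node.write("carol", "37");
        System.out.println("sstables: " + node.sstables + ", memtable size: " + node.memtableSize());
    }
}
```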
@chbatey
Cassandra + Java
@chbatey
DataStax Java Driver
• Open source
@chbatey
Modelling in Cassandra
CREATE TABLE customer_events(
    customer_id text,
    staff_id text,
    time timeuuid,
    store_type text,
    event_type text,
    tags map<text, text>,
    PRIMARY KEY ((customer_id), time));
Partition Key: customer_id
Clustering Column(s): time
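One way to picture this primary key: the partition key decides which group of rows you land in, and the clustering column keeps rows sorted inside that group. A small in-memory sketch, assuming plain long timestamps stand in for the timeuuid column and with hypothetical class and method names:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: partition key -> rows sorted by clustering column.
class PartitionSketch {
    private final Map<String, TreeMap<Long, String>> partitions = new HashMap<>();

    void insert(String customerId, long time, String eventType) {
        partitions.computeIfAbsent(customerId, k -> new TreeMap<>()).put(time, eventType);
    }

    // Events for one customer with start < time < end, returned in time order.
    // Touches a single partition, so no other customer's data is read.
    List<String> eventsBetween(String customerId, long start, long end) {
        TreeMap<Long, String> rows = partitions.get(customerId);
        if (rows == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(rows.subMap(start, false, end, false).values());
    }

    public static void main(String[] args) {
        PartitionSketch table = new PartitionSketch();
        table.insert("chbatey", 30L, "BUY");
        table.insert("chbatey", 10L, "LOGIN");
        table.insert("chbatey", 20L, "BASKET_ADD");
        table.insert("someone-else", 15L, "LOGIN");
        System.out.println(table.eventsBetween("chbatey", 0L, 100L));
    }
}
```

Queries that name the partition key and slice on the clustering column map onto this structure directly, which is why they are cheap.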
@chbatey
Get all the events
public List<CustomerEvent> getAllCustomerEvents() {
    // Full table scan: fine for a demo, avoid on a large cluster
    return session.execute("select * from customers.customer_events")
            .all().stream()
            .map(mapCustomerEvent())
            .collect(Collectors.toList());
}

// Converts a driver Row into a CustomerEvent
private Function<Row, CustomerEvent> mapCustomerEvent() {
    return row -> new CustomerEvent(
            row.getString("customer_id"),
            row.getUUID("time"),
            row.getString("staff_id"),
            row.getString("store_type"),
            row.getString("event_type"),
            row.getMap("tags", String.class, String.class));
}
@chbatey
All events for a particular customer
private PreparedStatement getEventsForCustomer;

@PostConstruct
public void prepareStatements() {
    // Prepare once at startup, then bind and execute per request
    getEventsForCustomer =
            session.prepare("select * from customers.customer_events where customer_id = ?");
}

public List<CustomerEvent> getCustomerEvents(String customerId) {
    BoundStatement boundStatement = getEventsForCustomer.bind(customerId);
    return session.execute(boundStatement)
            .all().stream()
            .map(mapCustomerEvent())
            .collect(Collectors.toList());
}
@chbatey
Customer events for a time slice
public List<CustomerEvent> getCustomerEventsForTime(String customerId, long startTime, long endTime) {
    // eq, gt and lt are static imports from QueryBuilder
    Select.Where getCustomers = QueryBuilder.select()
            .all()
            .from("customers", "customer_events")
            .where(eq("customer_id", customerId))
            .and(gt("time", UUIDs.startOf(startTime)))
            .and(lt("time", UUIDs.endOf(endTime)));

    return session.execute(getCustomers).all().stream()
            .map(mapCustomerEvent())
            .collect(Collectors.toList());
}
@chbatey
Mapping API
@Table(keyspace = "customers", name = "customer_events")
public class CustomerEvent {
    @PartitionKey
    @Column(name = "customer_id")
    private String customerId;

    @ClusteringColumn
    private UUID time;

    @Column(name = "staff_id")
    private String staffId;

    @Column(name = "store_type")
    private String storeType;

    @Column(name = "event_type")
    private String eventType;

    private Map<String, String> tags;

    // constructor, getters etc.
}

@chbatey
Mapping API
@Accessor
public interface CustomerEventDao {
    @Query("select * from customers.customer_events where customer_id = :customerId")
    Result<CustomerEvent> getCustomerEvents(String customerId);

    @Query("select * from customers.customer_events")
    Result<CustomerEvent> getAllCustomerEvents();

    // The query string is split with + so that it remains a valid Java literal
    @Query("select * from customers.customer_events where customer_id = :customerId "
            + "and time > minTimeuuid(:startTime) and time < maxTimeuuid(:endTime)")
    Result<CustomerEvent> getCustomerEventsForTime(String customerId, long startTime, long endTime);
}

@Bean
public CustomerEventDao customerEventDao() {
    MappingManager mappingManager = new MappingManager(session);
    return mappingManager.createAccessor(CustomerEventDao.class);
}
@chbatey
Adding some type safety
public enum StoreType {
    ONLINE, RETAIL, FRANCHISE, MOBILE
}

@Table(keyspace = "customers", name = "customer_events")
public class CustomerEvent {
    @PartitionKey
    @Column(name = "customer_id")
    private String customerId;

    @ClusteringColumn()
    private UUID time;

    @Column(name = "staff_id")
    private String staffId;

    @Column(name = "store_type")
    @Enumerated(EnumType.STRING) // could be EnumType.ORDINAL
    private StoreType storeType;

    // remaining fields unchanged
}
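The choice between storing an enum by name or by ordinal can be shown with plain Java. This sketch does not use the mapper; the helper names are hypothetical and only illustrate what each representation would put in the column.

```java
// Illustrative sketch only: the two ways an enum value can be represented.
class EnumMappingSketch {
    enum StoreType { ONLINE, RETAIL, FRANCHISE, MOBILE }

    // By name: readable in the database and survives reordering of constants.
    static String asString(StoreType type) {
        return type.name();
    }

    // By ordinal: compact, but changes meaning if constants are reordered.
    static int asOrdinal(StoreType type) {
        return type.ordinal();
    }

    static StoreType fromString(String stored) {
        return StoreType.valueOf(stored);
    }

    static StoreType fromOrdinal(int stored) {
        return StoreType.values()[stored];
    }

    public static void main(String[] args) {
        System.out.println(asString(StoreType.RETAIL) + " / " + asOrdinal(StoreType.RETAIL));
    }
}
```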

@chbatey
User defined types
CREATE TYPE store (name text, type text, postcode text);

CREATE TABLE customer_events_type(
    customer_id text,
    staff_id text,
    time timeuuid,
    store frozen<store>,
    event_type text,
    tags map<text, text>,
    PRIMARY KEY ((customer_id), time));

@chbatey
Mapping user defined types
@UDT(keyspace = "customers", name = "store")
public class Store {
    private String name;
    private StoreType type;
    private String postcode;
    // getters etc.
}

@Table(keyspace = "customers", name = "customer_events_type")
public class CustomerEventType {
    @PartitionKey
    @Column(name = "customer_id")
    private String customerId;

    @ClusteringColumn()
    private UUID time;

    @Column(name = "staff_id")
    private String staffId;

    @Frozen
    private Store store;

    @Column(name = "event_type")
    private String eventType;

    private Map<String, String> tags;

    // constructor, getters etc.
}

// In the accessor interface:
@Query("select * from customers.customer_events_type")
Result<CustomerEventType> getAllCustomerEventsWithStoreType();
@chbatey
Snooze…
@chbatey

LJC: Fault tolerance with Apache Cassandra

  • 1. Christopher Batey @chbatey LJC: Building fault tolerant Java applications with Apache Cassandra
  • 2. @chbatey Who am I? • Technical Evangelist for Apache Cassandra • Founder of Stubbed Cassandra • Help out Apache Cassandra users • DataStax • Builds enterprise ready version of Apache Cassandra • Previous: Cassandra backed apps at BSkyB
  • 3. @chbatey Overview • ‘Building fault tolerant Java applications with Apache Cassandra’ -Fault tolerance? -Building an ‘always on’ (HA) service -Cassandra architecture overview -Cassandra failure scenarios -Cassandra Java Driver
  • 4. @chbatey The evolution of a developer • A few years ago I used to write a lot of code in an IDE • Recently not so much: - Config management (Ansible, Puppet etc) - Solution architecture • Frequently bootstrapping new services
  • 6. @chbatey Faults? • Infrastructure failure - Node - Rack - Data center • Dependency failure - HTTP services - Databases
  • 7. @chbatey Building a web app Who here has downtime on a Friday night for ‘the release’?
  • 10. @chbatey Thou shalt be stateless… • Using container session store = 1/nth of your customers unhappy if a server dies
  • 11. @chbatey Storing state in a database? Session Store?
  • 12. @chbatey The drawbacks • Latency :( • Tricky to setup • New DB/Cache = new component in your architecture to scale and make HA • Tightly coupled to your container now • Move to embedded container in executable jars :(
  • 17. @chbatey Master/slave •Master serves all writes •Read from master and optionally slaves
  • 18. @chbatey Peer-to-Peer • No master • Read/write to any • Consistency?
  • 19. @chbatey Decisions decisions… CAP theorem Are these really that different?? Relational Database Mongo, Redis Highly Available Databases: Voldermort, Cassandra
  • 20. @chbatey PAC-ELC • If Partition: Trade off Availability vs Consistency • Else: Trade off Latency vs Consistency • http://dbmsmusings.blogspot.co.uk/2010/04/problems- with-cap-and-yahoos-little.html
  • 22. @chbatey Cassandra • Distributed masterless database (Dynamo) • Column family data model (Google BigTable)
  • 23. @chbatey Datacenter and rack aware • Distributed masterless database (Dynamo) • Column family data model (Google BigTable) • Multi data centre replication built in from the start • Diagram labels: Europe, USA
  • 24. @chbatey Cassandra • Distributed masterless database (Dynamo) • Column family data model (Google BigTable) • Multi data centre replication built in from the start • Analytics with Apache Spark • Diagram labels: Online, Analytics
  • 25. @chbatey Dynamo 101 • The parts Cassandra took - Consistent hashing - Replication - Strategies for replication - Gossip - Hinted handoff - Anti-entropy repair • And the parts it left behind - Key/Value - Vector clocks
  • 26. @chbatey Picking the right nodes • You don’t want a full table scan on a 1000 node cluster! • Dynamo to the rescue: Consistent Hashing • Then the replication strategy takes over: - Network topology - Simple
  • 27. @chbatey Murmur3 Example • Data:
 Primary key | Columns
 jim | age: 36, car: ford, gender: M
 carol | age: 37, car: bmw, gender: F
 johnny | age: 12, gender: M
 suzy | age: 10, gender: F
 • Murmur3 hash values:
 Primary key | Murmur3 hash value
 jim | -2245462676723223822
 carol | 7723358927203680754
 johnny | -6723372854036780875
 suzy | 1168604627387940318
  • 28. @chbatey Murmur3 Example Four node cluster:
 Node | Murmur3 start range | Murmur3 end range
 A | -9223372036854775808 | -4611686018427387903
 B | -4611686018427387904 | -1
 C | 0 | 4611686018427387903
 D | 4611686018427387904 | 9223372036854775807
 Company Confidential. © 2014 DataStax, All Rights Reserved.
  • 29. @chbatey Pictures are better • Hash range: -9223372036854775808 to 9223372036854775807 • Ring diagram: nodes A, B, C and D each own one quarter of the range: A from -9223372036854775808 to -4611686018427387903, B from -4611686018427387904 to -1, C from 0 to 4611686018427387903, D from 4611686018427387904 to 9223372036854775807
  • 30. @chbatey Murmur3 Example Data is distributed as:
 Node | Start range | End range | Primary key | Hash value
 A | -9223372036854775808 | -4611686018427387903 | johnny | -6723372854036780875
 B | -4611686018427387904 | -1 | jim | -2245462676723223822
 C | 0 | 4611686018427387903 | suzy | 1168604627387940318
 D | 4611686018427387904 | 9223372036854775807 | carol | 7723358927203680754
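The lookup in the Murmur3 example can be sketched in a few lines of plain Java. This is only an illustration of the idea: the class name `TokenRingSketch` and the method `ownerOf` are made up for this sketch, the range boundaries are copied from the four-node table, and the code mirrors that table rather than how Cassandra stores tokens internally.

```java
import java.util.TreeMap;

// Illustrative only: maps a Murmur3 token to the node that owns its range.
final class TokenRingSketch {
    // Key = first token of each node's range, copied from the four-node table.
    private static final TreeMap<Long, String> RANGE_STARTS = new TreeMap<>();

    static {
        RANGE_STARTS.put(Long.MIN_VALUE, "A");          // -9223372036854775808
        RANGE_STARTS.put(-4611686018427387904L, "B");
        RANGE_STARTS.put(0L, "C");
        RANGE_STARTS.put(4611686018427387904L, "D");
    }

    // floorEntry returns the greatest range start that is <= token,
    // which identifies the range (and so the node) containing the token.
    static String ownerOf(long token) {
        return RANGE_STARTS.floorEntry(token).getValue();
    }

    public static void main(String[] args) {
        System.out.println("johnny -> " + ownerOf(-6723372854036780875L));
        System.out.println("jim    -> " + ownerOf(-2245462676723223822L));
        System.out.println("suzy   -> " + ownerOf(1168604627387940318L));
        System.out.println("carol  -> " + ownerOf(7723358927203680754L));
    }
}
```

A sorted map keeps the lookup logarithmic in the number of ranges, so no node has to be asked whether it holds a key.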
  • 33. @chbatey Replication strategy • Simple - Give it to the next node in the ring - Don’t use this in production • NetworkTopology - Every Cassandra node knows its DC and Rack - Replicas won’t be put on the same rack unless Replication Factor > # of racks - Unfortunately Cassandra can’t create servers and racks on the fly to fix this :(
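The Simple strategy ("give it to the next node in the ring") can be sketched as a walk around the ring. This is a toy model, not Cassandra's code: `SimpleReplicationSketch`, `replicasFor` and the four-node ring A to D are assumptions taken from the earlier example, and rack or data centre awareness is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: replica placement by walking the ring clockwise.
final class SimpleReplicationSketch {
    // Nodes in ring order, as in the four-node example.
    private static final List<String> RING = Arrays.asList("A", "B", "C", "D");

    // Returns the primary node followed by the next (rf - 1) nodes clockwise.
    static List<String> replicasFor(String primary, int rf) {
        int start = RING.indexOf(primary);
        if (start < 0) {
            throw new IllegalArgumentException("unknown node: " + primary);
        }
        // A ring cannot hold more copies than it has nodes.
        int count = Math.min(rf, RING.size());
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            replicas.add(RING.get((start + i) % RING.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        // suzy's primary node is C; with a replication factor of 3 the walk wraps around to A.
        System.out.println(replicasFor("C", 3));
    }
}
```

Because the walk ignores racks, neighbouring replicas can end up on the same rack, which is one reason the slide says not to use this strategy in production.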
  • 34. @chbatey Tunable Consistency • Data is replicated N times • Every query that you execute you give a consistency - ALL - QUORUM - LOCAL_QUORUM - ONE • Christos Kalantzis, Eventual Consistency != Hopeful Consistency: http://youtu.be/A6qzx_HE3EU?list=PLqcm6qE9lgKJzVvwHprow9h7KMpb5hcUU
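The arithmetic behind the consistency levels is small enough to write down. The sketch below uses made-up names (`QuorumSketch`, `quorum`, `overlaps`, `tolerableFailures`). It assumes the usual definition of a quorum as a majority of replicas, and the usual rule that a read is guaranteed to see the latest write when the read and write replica counts add up to more than the replication factor.

```java
// Illustrative only: the arithmetic behind QUORUM reads and writes.
final class QuorumSketch {
    // A quorum is a majority of the replicas: floor(rf / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // The read set and write set must share at least one replica
    // for a read to be guaranteed to see the latest write.
    static boolean overlaps(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    // Number of replicas that can be down while QUORUM requests still succeed.
    static int tolerableFailures(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3;
        System.out.println("QUORUM for RF 3: " + quorum(rf));
        System.out.println("QUORUM read + QUORUM write overlap: " + overlaps(quorum(rf), quorum(rf), rf));
        System.out.println("ONE read + ONE write overlap: " + overlaps(1, 1, rf));
        System.out.println("Replicas that may be down at QUORUM: " + tolerableFailures(rf));
    }
}
```

With a replication factor of 3, QUORUM needs two replicas, so one node can be down without failing requests, while ONE reads paired with ONE writes trade that guarantee for availability and latency.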
  • 39. @chbatey Load balancing •Data centre aware policy •Token aware policy •Latency aware policy •Whitelist policy APP APP Async Replication DC1 DC2
  • 40. @chbatey But what happens when they come back? • Hinted handoff to the rescue • Coordinators keep writes for downed nodes for a configurable amount of time, default 3 hours • If a node is down for longer than that, run a repair
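A toy model may make the hint window concrete. Everything in it is an assumption for illustration: the class `HintedHandoffSketch`, its methods, and the single coordinator and single replica it models. Only the three hour default comes from the slide. Real hinted handoff lives inside Cassandra, not in application code.

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative only: one coordinator keeping hints for one replica.
final class HintedHandoffSketch {
    static final Duration HINT_WINDOW = Duration.ofHours(3); // default quoted on the slide

    private final Queue<String> hints = new ArrayDeque<>();
    private final List<String> replicaData = new ArrayList<>();
    private boolean replicaUp = true;
    private long downSinceMillis;

    void markDown(long nowMillis) {
        replicaUp = false;
        downSinceMillis = nowMillis;
    }

    // Returns true if the write reached the replica or was stored as a hint.
    boolean write(String mutation, long nowMillis) {
        if (replicaUp) {
            replicaData.add(mutation);
            return true;
        }
        if (nowMillis - downSinceMillis <= HINT_WINDOW.toMillis()) {
            hints.add(mutation); // replayed when the replica comes back
            return true;
        }
        return false; // outside the window: only a repair restores this write
    }

    // Replays stored hints, in order, once the replica is reachable again.
    void markUp() {
        replicaUp = true;
        while (!hints.isEmpty()) {
            replicaData.add(hints.poll());
        }
    }

    List<String> replicaData() {
        return replicaData;
    }
}
```

In this model a write made one hour into an outage is replayed on recovery, while a write made four hours in is not, which is the gap that repair has to close.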
  • 41. @chbatey Anti entropy repair • Not exciting but mandatory :) • New in 2.1 - incremental repair <— awesome
  • 42. @chbatey Don’t forget to be social • Each node talks to a few of its peers and shares information Did you hear node 1 was down???
  • 43. @chbatey Scaling shouldn’t be hard • Throw more nodes at a cluster • Bootstrapping + joining the ring • For large data sets this can take some time
  • 45. @chbatey Types of failure • Server failure • Fail to write to disk • Whole node fail • Socket read time out • Consistency failure • Read timeout • Write timeout • Unavailable • Application failures • Idempotent?
  • 46. @chbatey Read and write timeout Application C R1 R2 R3 C=QUORUM Replication factor: 3 timeout timeout Read timeout
  • 49. @chbatey How do I test this? The test double • Release it - great book • Wiremock - mocking HTTP services • Saboteur - adding network latency If you want to build a fault tolerant application you better test faults!
  • 50. @chbatey Stubbed Cassandra Application Acceptance test prime on admin port (REST) verification on admin port Admin endpoints Native protocol
  • 51. @chbatey Test with a real Cassandra • Reduced throughput / increased latency - Nodes down - Racks down - DC down
  • 52. @chbatey Test with a real Cassandra • Reduced throughput / increased latency - Nodes down - Racks down - DC down • Automate this if possible • Load test your schema - Cassandra stress - JMeter Cassandra plugin
  • 53. @chbatey Summary •Don’t build services with a single point of failure •Cassandra deployed correctly means you can handle node, rack and entire DC failure •Know when to pick availability over consistency •Accessing Cassandra from Java is boringly easy so I left it out
  • 54. @chbatey Thanks for listening • Follow me on twitter @chbatey • Cassandra + Fault tolerance posts aplenty: - http://christopher-batey.blogspot.co.uk/
  • 55. @chbatey Write path on an individual server • A commit log for durability • Data is also written to an in-memory structure (memtable) and then flushed to disk as an SSTable once the memtable is full
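The write path on a single server can be mimicked with three in-memory collections. This is a deliberately simplified sketch with made-up names (`WritePathSketch`, `memtableLimit`). Nothing touches a disk, the memtable is flushed by entry count rather than by size in bytes, and commit log cleanup after a flush is not modelled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: commit log, memtable and SSTables held in memory.
final class WritePathSketch {
    private final int memtableLimit;
    private final List<String> commitLog = new ArrayList<>();
    private TreeMap<String, String> memtable = new TreeMap<>();
    private final List<SortedMap<String, String>> sstables = new ArrayList<>();

    WritePathSketch(int memtableLimit) {
        this.memtableLimit = memtableLimit;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // 1. append-only log entry for durability
        memtable.put(key, value);         // 2. sorted in-memory structure
        if (memtable.size() >= memtableLimit) {
            flush();                      // 3. memtable full: write it out
        }
    }

    // A flushed memtable becomes an immutable, sorted table.
    private void flush() {
        sstables.add(Collections.unmodifiableSortedMap(memtable));
        memtable = new TreeMap<>();
    }

    int commitLogSize() { return commitLog.size(); }
    int memtableSize() { return memtable.size(); }
    int sstableCount() { return sstables.size(); }
}
```

Writes only ever append to the log and update memory, which is why this path stays fast; the sorted, immutable tables are produced later in bulk.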
  • 58. @chbatey Modelling in Cassandra
 CREATE TABLE customer_events(
     customer_id text,
     staff_id text,
     time timeuuid,
     store_type text,
     event_type text,
     tags map<text, text>,
     PRIMARY KEY ((customer_id), time));
 Partition key: customer_id. Clustering column(s): time.
  • 59. @chbatey Get all the events
 // Reads the whole table and maps every row to a CustomerEvent.
 public List<CustomerEvent> getAllCustomerEvents() {
     return session.execute("select * from customers.customer_events")
             .all().stream()
             .map(mapCustomerEvent())
             .collect(Collectors.toList());
 }

 // Converts a driver Row into the CustomerEvent domain object.
 private Function<Row, CustomerEvent> mapCustomerEvent() {
     return row -> new CustomerEvent(
             row.getString("customer_id"),
             row.getUUID("time"),
             row.getString("staff_id"),
             row.getString("store_type"),
             row.getString("event_type"),
             row.getMap("tags", String.class, String.class));
 }
  • 60. @chbatey All events for a particular customer
 private PreparedStatement getEventsForCustomer;

 // Prepare the statement once at startup and reuse it for every query.
 @PostConstruct
 public void prepareStatements() {
     getEventsForCustomer = session.prepare("select * from customers.customer_events where customer_id = ?");
 }

 public List<CustomerEvent> getCustomerEvents(String customerId) {
     BoundStatement boundStatement = getEventsForCustomer.bind(customerId);
     return session.execute(boundStatement)
             .all().stream()
             .map(mapCustomerEvent())
             .collect(Collectors.toList());
 }
  • 61. @chbatey Customer events for a time slice
 // eq, gt and lt are static imports from QueryBuilder.
 public List<CustomerEvent> getCustomerEventsForTime(String customerId, long startTime, long endTime) {
     Select.Where getCustomers = QueryBuilder.select()
             .all()
             .from("customers", "customer_events")
             .where(eq("customer_id", customerId))
             .and(gt("time", UUIDs.startOf(startTime)))
             .and(lt("time", UUIDs.endOf(endTime)));

     return session.execute(getCustomers).all().stream()
             .map(mapCustomerEvent())
             .collect(Collectors.toList());
 }
  • 62. @chbatey Mapping API @Table(keyspace = "customers", name = "customer_events")
 public class CustomerEvent {
 @PartitionKey
 @Column(name = "customer_id")
 private String customerId;
 
 @ClusteringColumn
 private UUID time;
 
 @Column(name = "staff_id")
 private String staffId;
 
 @Column(name = "store_type")
 private String storeType;
 
 @Column(name = "event_type")
 private String eventType;
 
 private Map<String, String> tags;
 // constructor / getters etc
 }

  • 63. @chbatey Mapping API @Accessor
 public interface CustomerEventDao {
 @Query("select * from customers.customer_events where customer_id = :customerId")
 Result<CustomerEvent> getCustomerEvents(String customerId);
 
 @Query("select * from customers.customer_events")
 Result<CustomerEvent> getAllCustomerEvents();
 
 @Query("select * from customers.customer_events where customer_id = :customerId and time > minTimeuuid(:startTime) and time < maxTimeuuid(:endTime)")
 Result<CustomerEvent> getCustomerEventsForTime(String customerId, long startTime, long endTime);
 } 
 @Bean
 public CustomerEventDao customerEventDao() {
 MappingManager mappingManager = new MappingManager(session);
 return mappingManager.createAccessor(CustomerEventDao.class);
 }
  • 64. @chbatey Adding some type safety public enum StoreType {
 ONLINE, RETAIL, FRANCHISE, MOBILE
 }

 @Table(keyspace = "customers", name = "customer_events")
 public class CustomerEvent {
 @PartitionKey
 @Column(name = "customer_id")
 private String customerId;
 
 @ClusteringColumn()
 private UUID time;
 
 @Column(name = "staff_id")
 private String staffId;
 
 @Column(name = "store_type")
 @Enumerated(EnumType.STRING) // could be EnumType.ORDINAL
 private StoreType storeType;

  • 65. @chbatey User defined types create TYPE store (name text, type text, postcode text) ;
 
 
 CREATE TABLE customer_events_type( customer_id text, staff_id text, time timeuuid, store frozen<store>, event_type text, tags map<text, text>, PRIMARY KEY ((customer_id), time));

  • 66. @chbatey Mapping user defined types @UDT(keyspace = "customers", name = "store")
 public class Store {
 private String name;
 private StoreType type;
 private String postcode;
 // getters etc
 }

 @Table(keyspace = "customers", name = "customer_events_type")
 public class CustomerEventType {
 @PartitionKey
 @Column(name = "customer_id")
 private String customerId;
 
 @ClusteringColumn()
 private UUID time;
 
 @Column(name = "staff_id")
 private String staffId;
 
 @Frozen
 private Store store;
 
 @Column(name = "event_type")
 private String eventType;
 
 private Map<String, String> tags;

  • 67. @chbatey Mapping user defined types @UDT(keyspace = "customers", name = "store")
 public class Store {
 private String name;
 private StoreType type;
 private String postcode;
 // getters etc
 }

 @Table(keyspace = "customers", name = "customer_events_type")
 public class CustomerEventType {
 @PartitionKey
 @Column(name = "customer_id")
 private String customerId;
 
 @ClusteringColumn()
 private UUID time;
 
 @Column(name = "staff_id")
 private String staffId;
 
 @Frozen
 private Store store;
 
 @Column(name = "event_type")
 private String eventType;
 
 private Map<String, String> tags;
 @Query("select * from customers.customer_events_type")
 Result<CustomerEventType> getAllCustomerEventsWithStoreType();
  • 69. @chbatey Summary •Don’t build services with a single point of failure •Cassandra deployed correctly means you can handle node, rack and entire DC failure •Know when to pick availability over consistency •Accessing Cassandra from Java is boringly easy so I left it out
  • 70. @chbatey Thanks for listening • Follow me on twitter @chbatey • Cassandra + Fault tolerance posts aplenty: - http://christopher-batey.blogspot.co.uk/