Cassandra On EC2

Matthew F. Dennis // @mdennis

Instance Sizes
● m1.xlarge is by far the most common size
● m1.large is ok for many use cases
● m2.4xlarge in some cases
● keep the entire dataset in memory
● c1.xlarge / cc1.4xlarge
● Smallish but very hot set of data
– regardless of how much data is on disk
● Extremely high request rate
● Encrypted node-node communications and high traffic
● Usually better off with many m1.xlarge instances because of
the extra memory, but not always

Configuration
● Stripe All Ephemeral Drives
● data directory and commit log on same volume
● Only applies to EC2 and SSDs, not physical HW
● Why?
● 6-8 GB heap on m1.xlarge
● 3-4 GB heap on m1.large
● Phi Convict Threshold? Maybe ...

EBS versus Ephemeral
● Ephemeral drives are:
● Generally faster for C*
● More stable (no pauses/freezes; outages?)
● Cheaper
● Easier to initially configure
● Striped EBS?
● yeah, about that …
● TL;DL don't use EBS for C* on EC2

Multi-Zone
● Alternate zones in your token topology
● No really, this is important, alternate zones
– We should probably fix this ...
● “complicated, but possible” to add new zones
after initial deployment
● Never move a *token* to a different region or
zone
● If you think that is what you want to do, what you really
want is to bootstrap a new node at token-1 in the new
region/zone and then decommission the old one
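A minimal sketch of what "alternate zones" means for token assignment. It assumes RandomPartitioner (token space 0 to 2**127) and evenly spaced tokens; `plan_ring` and the zone names are illustrative and not part of Cassandra.

```python
RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token space


def plan_ring(node_count, zones):
    """Return (token, zone) pairs: evenly spaced tokens, zones alternating around the ring."""
    if node_count % len(zones) != 0:
        # An uneven count would put two nodes from the same zone next to each other.
        raise ValueError("node_count should be a multiple of the number of zones")
    return [
        (i * RING_SIZE // node_count, zones[i % len(zones)])
        for i in range(node_count)
    ]
```

Walking the ring in token order then never visits the same zone twice in a row, e.g. `plan_ring(6, ["us-east-1a", "us-east-1b", "us-east-1c"])` yields zones a, b, c, a, b, c.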

Multi-Region C* on EC2
● Connectivity is the complicated part
● Ec2MultiRegionSnitch is not the entire answer
– https://issues.apache.org/jira/browse/CASSANDRA-2452
● Don't try to make a “fail over” DC, just go with active-active
● If you insist, then do the fail over in your application and configure C* the
same as you would active-active
● Generally requires a lot more storage
● Doesn't matter though because you're using ephemeral drives (right?)
and don't want a TB of data on each node anyway

Multi-Region Connectivity Options
● VPN
● Encrypted node-node communication
● CPU utilization is often a downside
● VPNCubed / VPCPlus
● I've never deployed it, heard good things about it though
● Amazon VPC
● anyone know if a single VPC can span regions yet?
● SSH Tunnels
● EC2 security groups
● IPTables
● Encrypted node-node + public IP binding + AWS security groups +
IPTables (EIPs may simplify this, never actually tried it)

Recovery From Failures
● Don't “fix” EC2 nodes, replace them
● bootstrap at token-1, remove old token
– bootstrap can be slow, but will get better

● Other than that it's the same in EC2 as not ...
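A small sketch of the "token-1" arithmetic, assuming RandomPartitioner (token space 0 to 2**127); `replacement_token` is a hypothetical helper, not a Cassandra tool.

```python
RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token space


def replacement_token(old_token):
    """Token for the replacement node: one below the old token, wrapping at the start of the ring."""
    return (old_token - 1) % RING_SIZE
```

The result would go into `initial_token` in the new node's cassandra.yaml before it bootstraps; the old token is removed once the new node has joined.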

Node Maintenance
● “Maintenance” On EC2?
● Usually not required (just replace the node)
● If it is, just stop C*; CL + hinted handoff (HH), repair, or read repair (RR) will fix it
● Same as physical HW
● https://issues.apache.org/jira/browse/CASSANDRA-2034

● Stop Trying To Decom Nodes Just To Replace a Disk !!!

Backups
● C* snapshots and push to S3
● Directory Watcher that pushes new files to S3
● SimpleGeo: https://github.com/simplegeo/tablesnap
● Netflix: http://slidesha.re/NFOnCassBkup
● Keep a log of all incoming writes
● Not specific to S3
● Can be coupled with snapshots / S3
● Useful for other reasons as well
● Compression in transit to S3 (or wherever) can be done on
a separate EC2 instance to avoid burning CPU
● Usually not worth the extra complexity / cost
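A rough sketch of the "directory watcher" idea, not how tablesnap or the Netflix tooling actually work. It polls the data directory and hands each file it has not seen before to a caller-supplied `upload` callable (hypothetical; a real one would push to S3). Skipping names that contain `-tmp` is an assumption about how in-progress SSTables are named.

```python
import os


def find_new_files(data_dir, already_seen):
    """Return files under data_dir not yet in already_seen, and record them as seen."""
    new_files = []
    for root, _dirs, files in os.walk(data_dir):
        for name in sorted(files):
            if "-tmp" in name:
                # Assumption: files still being written carry a tmp marker in their name.
                continue
            path = os.path.join(root, name)
            if path not in already_seen:
                already_seen.add(path)
                new_files.append(path)
    return new_files


def watch_once(data_dir, already_seen, upload):
    """One polling pass: pass every newly found file to upload() and return the list."""
    found = find_new_files(data_dir, already_seen)
    for path in found:
        upload(path)
    return found
```

Because SSTables are not modified once written, shipping each file a single time is enough; calling `watch_once` in a loop with a sleep gives a basic watcher.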

Changing Node Sizes
● Start a new instance
● rsync data from original node to new node (first pass, C* still running)
● Shut down C* on original node
● rsync again from original node to new node (picks up changes since the first pass)
● Start C* on new node
● Shut down original instance
● NB: Assumes same token, region, zone, etc
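The steps above can be sketched as an ordered command plan. This is only an illustration: the data directory and the `service cassandra stop/start` commands are assumptions that depend on how C* was installed, and starting or terminating the EC2 instances themselves is left out.

```python
def resize_plan(old_host, new_host, data_dir="/var/lib/cassandra/",
                stop_cmd=("sudo", "service", "cassandra", "stop"),
                start_cmd=("sudo", "service", "cassandra", "start")):
    """Return (host, argv) steps for moving a node's data onto a differently sized instance."""
    # Run on the new host, pulling from the old one; -a preserves file attributes,
    # --delete drops files that have since disappeared on the old node.
    pull = ("rsync", "-a", "--delete", f"{old_host}:{data_dir}", data_dir)
    return [
        (new_host, pull),       # first pass while the old node still serves traffic
        (old_host, stop_cmd),   # stop C* so the files stop changing
        (new_host, pull),       # second pass copies only what changed since the first
        (new_host, start_cmd),  # new node comes up with the same token
    ]
```

The first rsync does the bulk of the copying while the old node is still up, so the window with C* stopped only has to cover the much smaller second pass.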

Elastic Load Balancers
● They're awesome, use them
● Could be more awesome (e.g. better integration with Route 53)
● What I really want is TCP anycast for ELB across regions (AWS could
make it work)
● Balance across regions with GeoIP / GeoDNS
● Zerigo, TZOHA, Neustar, “homegrown”, etc
● Route 53? You wish (though Route 53 itself is run over anycast)
– “in the future we plan for Route 53 to also give you greater control over
… the route your users take to reach an endpoint” --Werner Vogels
● Put them in front of your app servers, not your C* instances
● Keep your app servers stateless or at least “weakly” stateless (e.g. no sticky
sessions required)

AMIs versus Scripted Setup
● DataStax publishes C* AMIs
● Chef Recipes as well
● Or roll your own …
● Whatever you do, just make sure it's automated
and repeatable
● *personally* I prefer scripting the setup
remotely, but this is … “less than ideal”
● PSSH is, in general, awesome

WTF?!
● Your zone X is not the same as my zone X
● Consistent within an EC2 account
● Problematic across accounts
● Does not apply to regions (i.e. your region X is my region X)
● Public DNS names for EIPs resolve to private IPs from within AWS
● EBS volumes sometimes just “freeze”
● AWS: “yeah, that happens sometimes under load”
● steal% sometimes 20% or more (1%-3% is “normal”)
● This is AWS literally stealing your money
● Thankfully not all that common, but watch out for it
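One way to watch for this is to compute steal% from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. A minimal sketch; the helper names are illustrative, and tools such as top or iostat report the same figure.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counters."""
    fields = line.split()
    if fields[0] != "cpu" or len(fields) < 9:
        raise ValueError("expected the aggregate 'cpu' line with a steal column")
    return [int(value) for value in fields[1:]]


def steal_percent(before, after):
    """Percentage of CPU time stolen by the hypervisor between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    # Columns: user nice system idle iowait irq softirq steal (guest time is
    # already counted in user/nice, so only the first eight are summed).
    total = sum(deltas[:8])
    return 100.0 * deltas[7] / total if total else 0.0
```

Sampling the first line of `/proc/stat` a few seconds apart and passing both parsed samples to `steal_percent` gives the figure to compare against the 1%-3% "normal" range.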

Missing AWS Features
● ELB over anycast
● Probably doable by AWS, but not others ...
● GeoDNS from Route53
● No really, WTF Doesn't Route53 Do GeoDNS ?!?!
● Multi-Region VPC
● Local SSDs

We're Hiring !
● Developers
● QA
● Community Manager
● Sales / SE
● Interns
– Dev
– Support
– QA
● Smart People Interested In Cassandra

Cassandra On EC2

Q?
(yes, I'll post the slides on slideshare)
