Cassandra Day NY 2014: From Proof of Concept to Production

©2014 DataStax
@tjake
T Jake Luciani 
Apache Cassandra Committer & PMC
Proof-of-Concept to Production

The way we build software
1. Proof of Concept
2. ??
3. Production
4. Profit!
Do Nothing!
Preparation!
Development
Testing
Performance
Operations
Monitoring

©2013 DataStax Confidential. Do not distribute without consent.
Cassandra Preparation
• Before going to production with C*, you must validate your assumptions and
have a plan for when you:
• lose nodes, disks, or networks
• have spikes of traffic
• need to add more nodes
• upgrade Cassandra, Java, or hardware
• …
Plan for all the nightmare
scenarios. This gives you
confidence in your system

Before we begin
• Be comfortable on the command line!
• When something is going wrong you need to be able to get to the
problem quickly, ask the right questions, and provide diagnostic
information.
• cassandra: nodetool, cqlsh
• disk: iostat
• cpu: top/htop
• network: iftop
• java: jstatd, jstack, jmx, visualvm (ok, not command line)
• cssh (csshx on osx)

Phase 1: Data Modeling
• You’ve modeled your application in Cassandra
• You’ve de-normalized based on queries
• Stop. Stress test time…
• C* 2.1 native CQL stress tool (works with 2.0)
• CASSANDRA-6164
• https://github.com/tjake/cassandra/archive/6164.zip

CQL Stress tool
• Why? Because you can push your cluster to the limit and see how *your*
queries run on *your* hardware
• cassandra-stress write -schema yaml=my.yaml
• cassandra-stress read -schema yaml=my.yaml query=simple1

CQL Stress
YAML File + Demo

Drain Dump

Hardware
• Currently C* isn’t well suited for > 1 TB per node
• Except DSE Hadoop nodes, which can be much larger
• Ideally 1U or smaller (blades)
• separate network, power, disk
• If you have larger machines
• VMs with a disk per VM
• Containers?
• On EC2, use I2 instances

Unix level stuff
• turn off swap
• turn off cpuspeed
• switch to deadline kernel scheduler
• socket buffers resize
• install numactl
• raise limits.conf esp (nofile and
• stress your disks using something like bonnie++ to get an idea of the
raw limits
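Some of the settings above can be checked from a script rather than by eye. The following is a minimal sketch, assuming the Linux text formats of /proc/swaps (a header line followed by one line per active swap area) and /sys/block/<device>/queue/scheduler (the active scheduler shown in square brackets). The function names and sample inputs are illustrative and not part of any Cassandra tooling.

```python
def active_swap_areas(proc_swaps_text):
    """Return the swap areas listed in the text of /proc/swaps."""
    lines = [ln for ln in proc_swaps_text.splitlines() if ln.strip()]
    # The first non-empty line is the column header; the rest are swap areas.
    return [ln.split()[0] for ln in lines[1:]]


def active_io_scheduler(scheduler_text):
    """Return the scheduler marked with [brackets], or None if none is marked."""
    for name in scheduler_text.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


if __name__ == "__main__":
    # Hand-written sample inputs; on a real box, read the files named above.
    swaps = "Filename Type Size Used Priority\n/dev/sda2 partition 8388604 0 -1\n"
    print("swap areas:", active_swap_areas(swaps))
    print("scheduler:", active_io_scheduler("noop [deadline] cfq"))
```

An empty list from the first function means swap is off; anything other than "deadline" from the second means the scheduler still needs switching.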

Deployment
• Chef/Puppet/Ansible/etc.
• Simpler rollout and rollback
• You should release your artifacts to a central location
• Do this for Cassandra too
• Makes upgrades easier

Monitoring
• Stress your system and learn where it breaks down
• Use that to create your alerts
• Know your SLAs
• Define them at each layer of your architecture
• OpsCenter for all things C*
• You can also easily integrate C* metrics into other metrics systems
• http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2
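One way to turn stress results into alerts is to take a high percentile of the latencies seen at a load the cluster handles comfortably, then alert when production exceeds it by some margin. A minimal sketch, assuming you have per-request latencies in milliseconds from your own stress run. The nearest-rank percentile, the 99th percentile default, and the 1.5x headroom factor are illustrative choices, not recommendations from the talk.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers (0 < pct <= 100)."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100.0 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


def alert_threshold_ms(stress_latencies_ms, pct=99, headroom=1.5):
    """Alert threshold = chosen percentile under stress times a headroom factor."""
    return percentile(stress_latencies_ms, pct) * headroom


if __name__ == "__main__":
    # Hand-written sample latencies, not measurements from a real cluster.
    latencies = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 8.0, 12.0, 40.0]
    print(alert_threshold_ms(latencies))
```

The same calculation can be repeated per layer (driver, service, front end) to match the SLA defined at each one.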

C* Monitoring
• C*-specific things to monitor
• pending compactions
• exception count
• disk space
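Pending compactions can be scraped from the text that `nodetool compactionstats` prints. A minimal sketch, assuming the output contains a line of the form `pending tasks: N` (the exact layout may differ between Cassandra versions, so check it against your own cluster). The function names and the limit of 20 are hypothetical.

```python
import re


def pending_compactions(compactionstats_output):
    """Extract N from the first 'pending tasks: N' line, or None if absent."""
    match = re.search(r"^\s*pending tasks:\s*(\d+)", compactionstats_output,
                      re.MULTILINE | re.IGNORECASE)
    return int(match.group(1)) if match else None


def should_alert(pending, limit):
    """True when the pending count is known and above the chosen limit."""
    return pending is not None and pending > limit


if __name__ == "__main__":
    sample = "pending tasks: 37\n"  # hand-written sample, not captured output
    count = pending_compactions(sample)
    print(count, should_alert(count, limit=20))
```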

Cassandra Ops
• Understand operational basics like:
• bootstrapping
• repair
• rebuild
• scrub

Choose your own consistency
• When things go wrong you are in control
• Build consistency controls into your application
• In a pinch you can lower consistency and stay available
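Building consistency controls into the application can be as simple as trying a list of consistency levels, strongest first. A minimal sketch under stated assumptions: `execute` is a hypothetical callable that wraps whatever driver you use, and `Unavailable` is a stand-in exception defined here, not a real driver class. The level names are passed through as plain strings.

```python
class Unavailable(Exception):
    """Stand-in for a driver's 'not enough replicas' error (hypothetical)."""


def execute_with_fallback(execute, query, levels=("QUORUM", "ONE")):
    """Try each consistency level in order; return (result, level_used).

    `execute(query, level)` is a hypothetical callable wrapping your driver.
    It should raise Unavailable when the level cannot be satisfied.
    """
    last_error = None
    for level in levels:
        try:
            return execute(query, level), level
        except Unavailable as err:
            last_error = err  # remember the failure and try a weaker level
    if last_error is None:
        raise ValueError("levels must not be empty")
    raise last_error


if __name__ == "__main__":
    def fake_execute(query, level):
        # Simulates a cluster that has lost quorum but still has one replica up.
        if level == "QUORUM":
            raise Unavailable("quorum not reachable")
        return ["row"]

    print(execute_with_fallback(fake_execute, "SELECT ...", ("QUORUM", "ONE")))
```

Returning the level that was actually used lets the caller log or flag results that were served at a weaker guarantee than usual.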

Backups
• Backups in C* are primarily to protect against human error
• C* provides lightweight local snapshots
• Traditional full backup of data in C* is hard to do
• Your data needs to be de-duped since each node's files contain data
from many replicas
• If you need a full traditional backup, you are best off doing full machine
backups
• At a minimum, back up the system tables (in case you lose the entire box)

Cassandra upgrades
• Read the release notes! NEWS.txt
• Read the change log! CHANGES.txt
• Understand the changes and how they impact your system
• Do this even if you don’t plan on upgrading.
• Someone else may have fixed a potential issue for you.
• Always snapshot your data before upgrading

Canary node
• When rolling out a new version of C* or your application, roll it out only
to a single node and watch it
• Quickly see if something is terribly wrong
• Gives you the ability to verify new functionality before full rollout
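"Watch it" can be made concrete by comparing the canary against the rest of the cluster. A minimal sketch, assuming you can collect error and request counts for the canary and for the other nodes over the same time window. The function name, the choice of error rate as the signal, and the 2x tolerance are illustrative assumptions.

```python
def canary_looks_bad(canary_errors, canary_requests,
                     fleet_errors, fleet_requests, tolerance=2.0):
    """True if the canary's error rate exceeds tolerance x the fleet's rate."""
    if canary_requests == 0:
        return True  # a canary serving no traffic tells us nothing; treat as bad
    canary_rate = canary_errors / canary_requests
    fleet_rate = fleet_errors / fleet_requests if fleet_requests else 0.0
    return canary_rate > fleet_rate * tolerance


if __name__ == "__main__":
    # Hand-written counts for illustration only.
    print(canary_looks_bad(5, 1000, 10, 10000))
    print(canary_looks_bad(1, 1000, 10, 10000))
```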

Pre-Prod Environments
• Hard to do in large scale systems
• Requires work like replaying traffic to a second cluster
• Doesn’t need to be 1:1, but should offer a subset of real data to test with

C* level stuff
• cassandra.yaml
• Use stress to size your write and read pools
• internode_compression: dc
• lower request timeouts (improves tail latency)
• set concurrent compactors to 1/4 of your cores
• in 2.1 we have off-heap memtables
• Turn on Authentication
• Keeps you/apps from accidentally connecting to prod
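The "1/4 of your cores" rule for compactors is easy to compute per machine. A minimal sketch, assuming the result is meant for the `concurrent_compactors` setting in cassandra.yaml. Rounding down and never going below 1 are assumptions made here; the slide only gives the ratio.

```python
import os


def suggested_concurrent_compactors(cores=None):
    """A quarter of the cores, rounded down, but never less than 1."""
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() can return None
    return max(1, cores // 4)


if __name__ == "__main__":
    n = suggested_concurrent_compactors()
    # Prints a line in the shape of the cassandra.yaml setting.
    print("concurrent_compactors: %d" % n)
```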

Thanks!
Questions?
@tjake
