TechTalk v2.0 - Performance tuning Cassandra + AWS
1. Eddie Garcia, VP of InfoSec and Services, Gazzang, Inc.
I/O Performance tuning for Cassandra running on AWS with Gazzang
2. Today's Agenda
• Tips and Tricks to achieve high performance when running Cassandra on AWS
• Configuration tuning for Cassandra
• Tools to benchmark raw file system I/O
• AWS available AMIs to boost performance
• Stress testing on AWS i2 HVM instances
• Configuring AWS EC2 instances with SSDs and EBS storage with PIOPS
4/24/14 © Gazzang, Inc. -- CONFIDENTIAL -- 2
3. Performance tuning
• Tuning at every layer
– Tune the AWS layer
– Tune the Cassandra layer
– Tune the file system / security layer
4. Tune the AWS layer
5. Tune the AWS layer
• i2 HVM instances will provide better I/O than other instance types
• i2 instances will support SSD TRIM for better SSD health and performance over time
• Use the Amazon Linux distribution AMI, or kernel version 3.8 or greater, for higher I/O performance
• Use the Amazon Linux distribution AMI for built-in SR-IOV (single root I/O virtualization) drivers to enable higher-performance AWS Enhanced Networking when running in a VPC
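The kernel recommendation above can be checked mechanically. The following is a minimal sketch, assuming the kernel release string has the usual `major.minor...` prefix (as reported by `uname -r`); the function name `kernel_at_least` is hypothetical and not part of any tool mentioned in the talk.

```python
import re


def kernel_at_least(release: str, minimum=(3, 8)) -> bool:
    """Return True if a kernel release string such as
    '3.4.73-64.112.amzn1.x86_64' is at or above `minimum` (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    found = (int(match.group(1)), int(match.group(2)))
    # Compare as integer tuples so that 3.10 sorts above 3.8.
    return found >= tuple(minimum)
```

On a running instance, `kernel_at_least(platform.release())` from the standard `platform` module would apply the same check to the live kernel.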
6. Amazon Linux AMI Instance Types and Sizes
http://aws.amazon.com/amazon-linux-ami/
7. Amazon Linux AMI Instance Types and Cost on-demand in US East
http://aws.amazon.com/ec2/pricing/
9. Tune the Cassandra layer
• Follow DataStax published Cassandra best practices http://www.datastax.com/documentation/cassandra/2.0/cassandra/install/installRecommendSettings.html
• The data directory should go on the mounted ephemeral instance storage; avoid EBS storage for maximum I/O performance
• IMPORTANT: You must have a backup strategy when using ephemeral storage, for example using S3 for backups
• RAID-0 (stripe) of SSDs is supported, but Cassandra also does a great job of using all mounted drives without RAID
• Scale by adding smaller instances vs. increasing instance size
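Using all mounted drives without RAID is normally done by listing one data directory per drive under the `data_file_directories` key in cassandra.yaml. The sketch below only renders such a fragment as text; the second mount point `/mount/ssd2` and the `cassandra/data` subdirectory are assumptions chosen for illustration, not values from the talk.

```python
def render_data_dirs(mount_points):
    """Render a cassandra.yaml `data_file_directories` fragment with
    one entry per mounted drive (no RAID)."""
    lines = ["data_file_directories:"]
    for mount in mount_points:
        # The 'cassandra/data' subdirectory is an illustrative assumption.
        lines.append(f"    - {mount.rstrip('/')}/cassandra/data")
    return "\n".join(lines)


# /mount/ssd2 is a hypothetical second ephemeral SSD.
print(render_data_dirs(["/mount/ssd1", "/mount/ssd2"]))
```

The printed fragment would be merged into cassandra.yaml by hand or by whatever configuration management is already in use.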
10. Tune the Cassandra layer
• Cassandra writes immutable sstable files to disk. It then compacts multiple sstables into 1 larger sstable, with some cleanup occurring along the way, which also helps TRIM
• The more OS memory the better: on read, the sstables are cached as normal memory-mapped files loaded into OS memory
• Increasing the JVM heap size can cause performance issues for Cassandra during garbage collection: "Death by Garbage Collection"
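The compaction idea can be illustrated with a toy model. This is a deliberately simplified sketch, not Cassandra's actual implementation: each sstable is assumed to be a list of `(key, timestamp, value)` tuples, a value of `None` stands in for a tombstone, and tombstones are dropped immediately (real Cassandra keeps them for a grace period).

```python
TOMBSTONE = None  # marker for a deleted key in this toy model


def compact(sstables):
    """Merge several immutable runs into one larger run.

    Each sstable is a list of (key, timestamp, value) tuples.
    For every key the entry with the newest timestamp wins; keys whose
    newest entry is a tombstone are dropped (the "cleanup" step).
    """
    newest = {}
    for table in sstables:
        for key, ts, value in table:
            if key not in newest or ts > newest[key][0]:
                newest[key] = (ts, value)
    return [
        (key, ts, value)
        for key, (ts, value) in sorted(newest.items())
        if value is not TOMBSTONE
    ]


older = [("a", 1, "x"), ("b", 1, "y")]
newer = [("a", 2, "z"), ("b", 3, TOMBSTONE), ("c", 2, "w")]
print(compact([older, newer]))
```

The inputs are never modified, which mirrors the immutability of sstables: compaction writes a new file and the old ones are deleted afterwards.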
11. Tune the file system / security layer
12. Tune the file system layer
• Format the file system with ext4 vs ext3 or xfs if supported by your chosen Linux distribution
• Use the most current Linux version for your distribution; many performance fixes are supported only in newer kernels
• Use IOZone or other file system tests before and after configuration changes to benchmark raw file I/O before loading your Cassandra data
13. Tune the file security layer
• Use block-level encryption, dedicating the entire SSD volume
• Encrypt the cluster before loading data whenever possible
• Use systems that support hardware encryption acceleration like Intel AES-NI
http://aws.amazon.com/ec2/instance-types
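Whether an instance exposes AES-NI can be checked from the CPU flags. The sketch below assumes an x86 Linux system, where `/proc/cpuinfo` contains a `flags` line that includes `aes` when the instruction set is available; the function name `has_aes_flag` is hypothetical.

```python
def has_aes_flag(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text
    lists the 'aes' CPU flag."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        # Match the whole word 'aes', not substrings of other flags.
        if name.strip() == "flags" and "aes" in value.split():
            return True
    return False
```

A typical call would be `has_aes_flag(open("/proc/cpuinfo").read())`. The flag only shows that the CPU offers the instructions; whether the encryption layer actually uses them depends on its own configuration.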
14. Test and measure
15. Performance Testing
• When testing performance, reduce the number of variables that can affect the test
– Stopping and starting a server can switch your instance to a different host with different performance
– The time of day when you run tests can affect the performance
– Eliminate cached in-memory data from prior tests, which may contaminate your results
– Avoid testing on systems with unknown state and size of data
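One way to eliminate cached data between runs on Linux is to flush dirty pages and then drop the kernel page cache. This is a minimal sketch, assuming a Linux host and root privileges when it is pointed at the real control file; it only covers the OS page cache, not any caches held inside the Cassandra process. The `control_file` parameter is an addition for illustration so the function can be exercised against an ordinary file.

```python
import os


def drop_page_cache(control_file="/proc/sys/vm/drop_caches"):
    """Flush dirty pages, then ask the Linux kernel to drop the page
    cache, dentries and inodes by writing '3' to the control file."""
    os.sync()  # write dirty pages to disk first so they can be freed
    with open(control_file, "w") as handle:
        handle.write("3\n")
```

Calling `drop_page_cache()` as root before each benchmark run gives every run the same cold-cache starting point.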
16. Cassandra Test Environment
[Diagram labels: Cassandra Stress Client; Cassandra Nodes 1-6; storage: EBS clear text, EBS 4K PIOPS, SSD clear text, SSD encrypted; IOZone tests; Cassandra stress tests; S3 backups]
17. Test Environment Specifications
Instance: i2.2xlarge
AZ: us-east-1a
AMI Information: amzn-ami-hvm-2013.09.2.x86_64-ebs (ami-e9a18d80)
Linux Distribution: Amazon Linux AMI release 2013.09
Kernel Version: 3.4.73-64.112.amzn1.x86_64
Drive Layout:
Filesystem             Size  Used  Avail  Use%  Mounted on
/dev/xvda1             7.9G  1.8G  6.1G   23%   /            (EBS backed for tests, ephemeral is better)
tmpfs                  30G   0     30G    0%    /dev/shm
/dev/xvdb              734G  197M  697G   1%    /mount/ssd1  (Cleartext test SSD)
/dev/mapper/encrypted  734G  36G   662G   6%    /encrypted   (Encrypted test SSD)
Cassandra Stress Client: m1.medium
Cassandra Cluster: 6 Nodes
DataStax enterprise: dse-libcassandra-3.2.2-1.noarch
Cassandra: version 1.2.12.2
Java HotSpot(TM) 64-Bit Server VM/1.6.0_45
18. IOZone SSD vs. Non-SSD
IOZone test configuration
time iozone -ORa -s 163840 -r 16384
Iozone: Performance Test of File I/O
Version $Revision: 3.420 $
Compiled for 64 bit mode.
Build: linux-AMD64
OPS Mode. Output is in operations per second.
Excel chart generation enabled
Auto Mode
File size set to 163840 KB
Record Size 16384 KB
Command line used: iozone -ORa -s 163840 -r 16384
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
http://www.iozone.org/
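The `time iozone ...` invocation can also be driven from a script so that the same command is repeated on each mount point. The sketch below is an assumption-laden illustration: the helper `timed_run` is hypothetical, the flags are copied from the slide, and it relies on IOzone writing its temporary test file into the current working directory when no file name is given, so `cwd` selects the file system under test.

```python
import subprocess
import time

# Flags copied from the slide; iozone must be installed for this to run.
IOZONE_CMD = ["iozone", "-ORa", "-s", "163840", "-r", "16384"]


def timed_run(cmd, cwd=None):
    """Run `cmd` with working directory `cwd` and return
    (exit_code, wall_clock_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, cwd=cwd, stdout=subprocess.DEVNULL)
    return completed.returncode, time.perf_counter() - start
```

For example, `timed_run(IOZONE_CMD, cwd="/mount/ssd1")` and `timed_run(IOZONE_CMD, cwd="/encrypted")` would time the cleartext and encrypted SSD mounts from the test environment with identical parameters.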
19. Cassandra Test Environment
[Diagram labels: Cassandra Node; EBS clear text; EBS 4K PIOPS; encrypted; SSD; SSD encrypted; IOZone tests]
real 1m6.360s
user 0m0.084s
sys 0m0.911s
real 0m15.223s
user 0m0.115s
sys 0m1.391s
real 0m9.951s
user 0m0.291s
sys 0m3.595s
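To compare the three runs, the `real` values can be converted to seconds. This is a small sketch assuming the `XmY.YYYs` format printed by the shell's `time` keyword; the helper name `to_seconds` is hypothetical, and the sketch does not assume which storage configuration produced which timing.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '1m6.360s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


real_times = ["1m6.360s", "0m15.223s", "0m9.951s"]  # 'real' values from the slide
seconds = [to_seconds(value) for value in real_times]
slowest = max(seconds)
for stamp, secs in zip(real_times, seconds):
    # Ratio of the slowest run to this run (1.00 for the slowest itself).
    print(f"{stamp:>10} = {secs:7.3f} s ({slowest / secs:.2f}x vs slowest)")
```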
20. Cassandra stress
The cassandra-stress tool
• A Java-based stress testing utility for benchmarking and load testing a Cassandra cluster.
• The binary installation of the tool also includes a daemon, which in larger-scale testing can prevent potential skews in the test results by keeping the JVM warm.
• Modes of operation:
– Inserting: Loads test data.
– Reading: Reads test data.
– Indexed range slicing: Works with RandomPartitioner on indexed tables.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCStress_t.html
21. Current Cassandra stress test configuration
• Cassandra stress test command
– <cassandra home>/tools/bin/cassandra-stress -l 3 -o insert -n 100000000 -i 1 -e ONE -c 10 -d <Cassandra Node IPs> -t 150 -f T1.csv &
• In the stress test, stress client nodes 1 to 3 each target two separate Cassandra nodes. Client node #4 targets all Cassandra nodes.
– Client#1 -> CAS 1, 2
– Client#2 -> CAS 3, 4
– Client#3 -> CAS 5, 6
– Client#4 -> CAS 1, 2, 3, 4, 5, 6
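The client-to-node mapping and the command line above can be combined in a small generator. This is a sketch under stated assumptions: the host names are hypothetical placeholders, the `<cassandra home>` placeholder is kept from the slide, and `-d` is given a comma-separated node list, which is how the legacy cassandra-stress tool accepts multiple nodes.

```python
# Which Cassandra nodes each stress client targets (from the slide).
CLIENT_TARGETS = {
    1: [1, 2],
    2: [3, 4],
    3: [5, 6],
    4: [1, 2, 3, 4, 5, 6],
}


def stress_command(client, node_addresses, cassandra_home="<cassandra home>"):
    """Build the cassandra-stress argument list for one stress client.

    `node_addresses` maps a Cassandra node number to its host name or IP.
    """
    hosts = ",".join(node_addresses[n] for n in CLIENT_TARGETS[client])
    return [
        f"{cassandra_home}/tools/bin/cassandra-stress",
        "-l", "3", "-o", "insert", "-n", "100000000", "-i", "1",
        "-e", "ONE", "-c", "10", "-d", hosts, "-t", "150", "-f", "T1.csv",
    ]


# Hypothetical host names; substitute the real node IPs.
nodes = {n: f"cas{n}.example.internal" for n in range(1, 7)}
for client in sorted(CLIENT_TARGETS):
    print(f"Client#{client}:", " ".join(stress_command(client, nodes)))
```

Generating the four command lines from one table keeps every flag except the target list identical across clients, which removes one source of accidental variation between runs.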
22. Cassandra Test Environment
[Diagram labels: Stress Clients 1-4; Cassandra Nodes 1-6; SSD clear text; SSD encrypted; Cassandra stress tests]
24. Summary
• Test in your environment with your data; results will vary greatly depending on OS, HW and application configurations
– Baseline before you tune
– Tune
– Test after tuning
– Measure
– Rinse and repeat twice
• Security and performance are not mutually exclusive; encryption can coexist with high I/O performance
• Do your homework: configure and run tests that map to your use case
25. About Gazzang
• Headquartered in Austin, Texas
• Focus on securing sensitive data in cloud and big data environments
• Enable customers to meet compliance requirements like HIPAA, PCI, FIPS and FERPA
• Satisfy internal security mandates
• Protect valuable client information
26. Gazzang is focused on data at-rest encryption
Security in the cloud is a layered approach
Data in process (in application)
Data at rest (storage)
Data in transit (SSL)
4/24/14 Gazzang - All rights reserved 2013
27. Gazzang is focused on data at-rest encryption and key management
Security in the cloud is a layered approach
Data in process (in application)
Data at rest (storage)
Data in transit (SSL)
28. Thank you!
Gazzang, Inc
www.gazzang.com
Eddie Garcia
VP of InfoSec and Services
eddie.garcia@gazzang.com