Getting The Most Out
Of Your Flash/SSDs
Young Paik
Technical Marketing Director
young@aerospike.com

Aerospike (aer·o·spike) [air-oh-spahyk]
noun, 1. tip of a rocket that enhances speed and stability
Introduction
Flash/SSDs (used interchangeably) are still
relatively new.
Getting the most out of them requires a good
understanding of how they work and how
Aerospike uses them.

© 2014 Aerospike. All rights reserved. Confidential

Agenda
➤ SSDs vs. Rotational Drives
➤ What Aerospike Does To Make The Most of SSDs
➤ The Factors That Most Improve The Performance of SSDs
➤ Testing SSDs
➤ More on Testing SSDs
➤ Even more on Testing SSDs
➤ Final Preparations For Your Drives

SSDs vs. Rotational Drives
Differences Matter
Some will tell you that their databases will work
on SSDs and that no changes are necessary.
There are differences between SSDs and
rotational drives that are important. You must do
more than simply swap out your old drive and put
in an SSD to get the best performance.

Comparing Old and New
There are differences between rotational and SSD disks that are independent of the
database you are using.
Characteristic | Rotational | SSD | Notes
Random read | Poor | Excellent | This is where SSDs shine the most. With no moving parts, SSDs are clearly the choice for random reads.
Random write | Poor | Good | Similar to reads, but SSDs are not quite as fast with random writes as they are with reads.
Sequential write | Good | Excellent | Rotational drives narrow the gap here. While they are close in pure write performance, any reads during these writes require head movement on rotational drives.
Rewritability (durability) | Excellent | Poor | This is where SSDs are the weakest. NAND (Flash) chips have limits on how many times you can write to the same area. Databases must take this into account to avoid “hotspots.” Databases that do not are relying on the operating system (i.e. the TRIM command) to alleviate these issues. Aerospike manages this differently.

What Aerospike Does To Make The Most Of SSDs
Techniques
In order to make the best use of SSDs, Aerospike has
designed an architecture that does the following:
Uses raw disk | Aerospike does not use a file system, which would only slow down the database.
Writes in large blocks | Rather than writing many small items, it is much more efficient to write a few large ones. Aerospike uses block sizes that are integral multiples of 128 KB.
Reads in small blocks | Reads are done in 512-byte data segments.
Handles defragmentation on a regular basis | All databases must delete data. This fragments the data on disk, which makes the disk harder to use efficiently. Aerospike reclaims this space through a continual process called defragmentation. This means you do not need the TRIM command used on most operating systems.
Works with vendors | Aerospike works closely with SSD manufacturers to test hardware and provide feedback for the best performance.

Accessing An Object In Aerospike
Writing A New Standard Data Type Record With SSDs

[Figure: Client, Master Node, DRAM (Index) and SSD (DATA), with an asynchronous write from the Master Node to the SSD. Block size: 128 KB by default.]

1) Client finds the Master Node from the partition map.
2) Client makes a write request to the Master Node.
3) Master Node makes an entry in the index (in DRAM) and queues the write in a temporary write buffer.
4) Master Node coordinates the write with replica nodes (not shown).
5) Master Node returns success to the client.
6) Master Node asynchronously writes data to the SSD in blocks.
7) Index in DRAM points to the location on the SSD.
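The write steps above can be sketched as a toy model in Python. This is only an illustration under assumptions, not Aerospike's implementation: the class name `MasterNodeSketch`, its fields, and the in-memory list standing in for the SSD are all hypothetical, replication (step 4) is left out, and the "asynchronous" flush is simply triggered when the buffer is full.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 128 * 1024  # default block size named on the slide


@dataclass
class MasterNodeSketch:
    """Hypothetical toy model of steps 3, 5, 6 and 7 of the write path."""
    block_size: int = BLOCK_SIZE
    # key -> (block_id, offset, length); block_id is None while still buffered
    index: dict = field(default_factory=dict)
    buffer: bytearray = field(default_factory=bytearray)
    pending: list = field(default_factory=list)   # keys held in the buffer
    device: list = field(default_factory=list)    # stands in for the SSD

    def write(self, key: str, value: bytes) -> bool:
        if len(value) > self.block_size:
            raise ValueError("record larger than one block")
        # Flush first if the record would not fit in the current buffer.
        if len(self.buffer) + len(value) > self.block_size:
            self.flush()
        offset = len(self.buffer)
        self.buffer += value
        self.index[key] = (None, offset, len(value))  # step 3: index entry
        self.pending.append(key)
        return True  # step 5: success before the block reaches the device

    def flush(self) -> None:
        """Step 6: write the whole buffer as one block, then fix up the index."""
        if not self.buffer:
            return
        block_id = len(self.device)
        self.device.append(bytes(self.buffer))
        for key in self.pending:
            located, offset, length = self.index[key]
            if located is None:  # step 7: index now points at the device
                self.index[key] = (block_id, offset, length)
        self.buffer = bytearray()
        self.pending = []

    def read(self, key: str) -> bytes:
        block_id, offset, length = self.index[key]
        source = self.buffer if block_id is None else self.device[block_id]
        return bytes(source[offset:offset + length])
```

The point of the sketch is that many small records reach the device as one large write, while the index always knows where each record currently lives.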

Defragmentation In Aerospike
How Space Is Freed Up
SSD (DATA)

Aerospike writes the data in large data
blocks.

[Figure: the SSD laid out as blocks 1 through 8. Block size: 128 KB by default.]


As new data is added to the disk, new
blocks will be continually written to
the SSD.


Over time, some records will be
deleted or updated, resulting in
fragmented usage on the flash/SSD
disk. This unused space must be freed
up.


Some databases use a nightly process
called “compaction,” which is an
intensive process. Aerospike runs a
regular process (every few minutes) that
looks for blocks below some level of use
(called the high watermark).

In this example, if the high watermark is 50%, blocks 1 and 3 in the figure are below
50% occupied. The defragmenter will take the data in these blocks and merge
them into another block.


The defragmenter will write the new block (block 7) and free blocks 1 and
3 for new writes.

Because this runs constantly, there is no special time when the performance of
the database is bad.

This algorithm operates best when the SSD is less than 50% occupied. As disk use
grows above this, the performance of the defragmenter will decrease.
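The defragmentation pass described above can be sketched in a few lines of Python. This is a simplified illustration, not Aerospike's code: the function name `defragment`, the `(key, size)` record tuples and the sequential packing strategy are assumptions made for the example. The 50% threshold is the value used on the slide.

```python
def defragment(blocks, block_size, watermark_pct=50):
    """Pick blocks whose live data is below the watermark and repack it.

    blocks: dict mapping block id -> list of (key, size) live records.
    Returns (freed_ids, merged), where merged is a list of new blocks,
    each a list of (key, size) records that fit within block_size.
    """
    # A block is a candidate when its live bytes fall below the watermark.
    freed_ids = [
        block_id for block_id, records in blocks.items()
        if sum(size for _, size in records) * 100 < watermark_pct * block_size
    ]
    merged, current, used = [], [], 0
    for block_id in freed_ids:
        for key, size in blocks[block_id]:
            if used + size > block_size:  # current new block is full
                merged.append(current)
                current, used = [], 0
            current.append((key, size))
            used += size
    if current:
        merged.append(current)
    return freed_ids, merged


# Mirrors the slide: blocks 1 and 3 are under 50% full and get merged.
example = {
    1: [("a", 30)],
    2: [("b", 40), ("c", 40)],
    3: [("d", 20)],
    4: [("e", 90)],
}
freed, new_blocks = defragment(example, block_size=100)
```

Because only sparsely used blocks are touched, each pass copies little data, which is why running it continuously is cheaper than a periodic full compaction.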

Aerospike Certification Tool (ACT) for SSDs
■ Industry Standard Flash (SSD / PCI-E) Benchmark
■ Open Source Tool used by Flash Vendors to certify drives
The Factors That Most Improve The Performance of SSDs
How To Prepare Your System
➤ Select the correct hardware
    ➤ SSD
    ➤ Disk Controller
➤ Configure the hardware
➤ Configure Aerospike

Most Important Factors for SSD Performance
Factor | Importance (rough) | Notes
Interface (SATA v. PCIe) | Very High | One of the most critical choices is the interface. Today, the difference in price and layout is huge, so the choice is usually easy for customers to make. If very low latency is absolutely required, use PCIe. Costs are 2x-5x what they would be on SATA.
Consumer v. Enterprise | Very High | A few years ago the difference between these types was small, but today very few consumer-rated drives pass Aerospike certification.
Make/model | Very High | Differences in specific models from the same maker can be very large. In some cases, the manufacturer may have quietly made changes to the hardware and firmware, but not changed the model number.
Disk controller (RAID, HBA) | Very High | Aerospike prefers direct control of each SSD. RAID controllers will add latency, without much added benefit (Aerospike is already replicated).
Over-provisioning (OP) | Very High | Over-provisioning allocates space on the drive for use by the controller. The amount the manufacturer sets varies from one model to another. Typical amounts are 6%-28%.
Used before | High | If the SSD has been in use for a long time for other purposes, the disk will be unevenly worn, causing poor performance.
NCQ | Medium | Native Command Queuing is a SATA extension that allows the disk to internally optimize how commands are executed. Rarely a problem on modern equipment.
Scheduler | Low | This is the I/O scheduler for the Linux kernel. Aerospike prefers the NOOP scheduler and automatically selects it.

Selecting The Correct SSD Model
Given the most important factors, obviously it is important to choose the
correct model. Aerospike publishes a list that it updates with information on
models that have passed testing.

These SSDs can be found at:
https://support.aerospike.com/customer/portal/articles/1315402-recommended-ssds

Selecting The Correct Disk Controller
Warning: Be very careful with the disk controller. Aerospike uses controllers in a
way that goes against conventional wisdom.

Best practices:
➤ Do not use RAID across the SSDs. Aerospike stores small objects and is
much more sensitive to latency than bandwidth.
➤ When possible, use direct attach (SATA or PCIe).
➤ If you can’t use direct attach, try one of the following:
    ➤ Use HBAs without RAID
    ➤ Configure each SSD as a separate RAID 0 array
➤ Spread the SSDs among as many controllers as possible.
➤ All servers will have a limit to the number of drives that will perform
well. 4 is a common number.
➤ If your company has a standard configuration for Hadoop, these often
have similar hardware needs to Aerospike.
➤ Some controllers have special software to boost performance. E.g.
the LSI 2208 chip has FastPath available for specific models.
Check with your vendor.

Over-provisioning (OP)
OP can make the difference between bad performance
and great performance.
2 types of OP:
➤ Manufacturer’s OP
➤ User OP

Manufacturers typically set 6%-8% for consumer-rated
drives and 14%-28% for enterprise-rated drives. This varies
depending on the model and capacity.

Over-Provisioning: What You Can Do
Adding user over-provisioning can be done in one
of the following ways:
➤ Manufacturer’s software
➤ Host Protected Area (HPA): Linux has a command
called hdparm that you can use to set the HPA.
➤ Disk partitions: You can also leave some space on
the disk unpartitioned. The unpartitioned
space will be used by the controller.

No matter which method you use, it is good to
reserve 21% for use by the controller.

Comparing OP Methods
Method | HPA (Host Protected Area) | Partitioning
Ease of use | Use hdparm 9.37+. Most versions of Linux come with earlier versions. | Use the built-in fdisk command.
Performance | Both methods have the same performance. | Both methods have the same performance.
Device ID | Must specify the basic device (e.g. /dev/sdb) | Must specify the specific partition (e.g. /dev/sdb1)
Notes | hdparm may not work through your RAID controller. | All commands must specify the full partition. Not doing so may result in using disks that are not over-provisioned.

OP Using Host Protected Area (HPA)
In order to use the HPA, it is easiest to use the
command hdparm (must have version 9.37+). You
can get a copy of this at:
http://sourceforge.net/projects/hdparm/

OP Using Host Protected Area (HPA) - Example
First find the number of sectors (must be root or use sudo):
> sudo /opt/hdparm-9.43/hdparm -N /dev/sdb
/dev/sdb:
 max sectors = 500118192/500118192, HPA is disabled

Then multiply by the fraction of the drive to keep visible (79%, which leaves 21% for OP):
500,118,192 x 0.79 = 395,093,372 sectors (rounded)
> sudo /opt/hdparm-9.43/hdparm -Np395093372 --yes-i-know-what-i-am-doing /dev/sdb
/dev/sdb:
 setting max visible sectors to 395093372 (permanent)
 max sectors = 395093372/500118192, HPA is enabled

Finally, reboot. This is necessary to make sure the new
settings take hold.
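The over-provisioning arithmetic is easy to get wrong by hand, so here is a minimal Python sketch of it. The helper name `visible_units` is hypothetical; it uses integer math with round-half-up so the result matches the figure used in the hdparm example, and the same calculation applies to cylinder counts when over-provisioning with a partition.

```python
def visible_units(total_units: int, op_percent: int = 21) -> int:
    """Return how many sectors (or cylinders) stay visible to the host.

    total_units: the drive's full size as reported by hdparm -N or fdisk.
    op_percent:  share of the drive reserved for the controller.
    """
    if not 0 <= op_percent < 100:
        raise ValueError("op_percent must be in the range 0..99")
    keep_percent = 100 - op_percent
    # Integer arithmetic with round-half-up avoids floating point surprises.
    return (total_units * keep_percent + 50) // 100


sectors = visible_units(500118192)   # sector count from the HPA example
cylinders = visible_units(19140)     # cylinder count of a 19140-cylinder disk
print(sectors, cylinders)
```

The first value is what goes after `-Np` in the hdparm command; the second is the last cylinder to enter in fdisk for a disk with 19140 cylinders.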

OP Using Partitions - Example
In this example we will over-provision the disk /dev/sdb by creating a single
partition that is 79% of the overall capacity (15121 = 19140 x 0.79):
> sudo /sbin/fdisk /dev/sdb
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-19140, default 1): 1
Last cylinder, +cylinders or +size{K,M,G} (1-19140, default 19140): 15121

Command (m for help): p

Disk /dev/sdb: 157.4 GB, 157437394944 bytes
255 heads, 63 sectors/track, 19140 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xeff8f3ae

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       15121   121459401   83  Linux

Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.

We recommend rebooting the server once this has been done. Note that for this
disk you will need to use /dev/sdb1 as the device.
Testing SSDs
Did You Choose Well?
The only way to be sure how these all work in
your environment is to test.
The best way is to use the Aerospike Certification
Tool (ACT). This is a tool that Aerospike has
open-sourced for testing SSD configurations.

Aerospike ACT
The ACT accesses SSDs similarly to the way the
Aerospike database does: reads with concurrent large
block writes. By default the tests run for a period of
24 hours.
The tests are based on factors of “x”.
1x represents 2,000 reads/s and 1,000 writes/s per SSD
2x represents 4,000 reads/s and 2,000 writes/s per SSD
etc.

1x represents decent performance of an SSD in 2010. Today,
several models of SSDs perform well at 3x. These tests must be
run for 24 hours to ensure stability.
Test with greater and greater “x” levels until the SSD performs
poorly.
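The "x" scale above is a simple multiplier, shown here as a short Python sketch. The function name `act_load` and the optional `devices` parameter are hypothetical; the per-SSD rates of 2,000 reads/s and 1,000 writes/s at 1x come from the slide.

```python
READS_PER_X = 2000    # reads per second per SSD at 1x
WRITES_PER_X = 1000   # writes per second per SSD at 1x


def act_load(x: int, devices: int = 1) -> tuple:
    """Return (reads_per_sec, writes_per_sec) across all devices at load x."""
    if x < 1 or devices < 1:
        raise ValueError("x and devices must be positive")
    return READS_PER_X * x * devices, WRITES_PER_X * x * devices


print(act_load(3))             # one SSD at 3x
print(act_load(3, devices=4))  # four SSDs, each at 3x
```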
Methodology For Single Disk
The basic methodology is:
➤ Test a single drive at 3x
➤ Retest with different configurations (OP, disk
controller, settings, etc)
➤ If the best of these passes, retest at a
higher x. If not, lower the test load to 2x.
➤ Repeat these tests until you have discovered the
limits of performance.
➤ Finally, test at twice the highest level passed to
make sure the disk can handle large bursts of
traffic. If a disk passes the test criteria at Nx and
completes the test at twice that speed, it is said to
pass at Nx.
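The search procedure above can be written down as a small Python sketch. Everything here is an assumption made for illustration: `passes(x)` and `completes(x)` are hypothetical callbacks that each stand for a full 24-hour ACT run at load x, results are assumed to get worse as x grows, and stepping down when the 2x burst test fails is one possible reading of the methodology.

```python
def highest_passing_x(passes, completes, start=3, max_x=16):
    """Return the highest N where passes(N) and completes(2 * N), or 0.

    passes(x):    True if a run at load x meets the pass criteria.
    completes(x): True if a run at load x finishes at all.
    """
    x = start
    if passes(x):
        # Keep raising the load while the drive still passes.
        while x < max_x and passes(x + 1):
            x += 1
    else:
        # Lower the load until the drive passes, or give up at 0.
        x -= 1
        while x > 0 and not passes(x):
            x -= 1
    # Burst check: the drive must also complete a run at twice the load.
    while x > 0 and not completes(2 * x):
        x -= 1
    return x


# Stand-in callbacks; in practice each call is a 24-hour test.
rating = highest_passing_x(lambda x: x <= 4, lambda x: x <= 8)
```

With these stand-in callbacks the drive passes up to 4x and completes up to 8x, so it would be rated at 4x.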
What Is Passing?
Aerospike defines passing with the following
criteria:
No more than 5% of all transactions exceed 1 ms
No more than 1% of all transactions exceed 8 ms
No more than 0.1% of all transactions exceed 64 ms
You may define your own criteria.
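The pass criteria translate directly into a check. The Python sketch below is illustrative only: the names `ACT_LIMITS` and `passes_criteria` are hypothetical, and the input is assumed to be a mapping from a latency threshold in milliseconds to the percentage of transactions that exceeded it.

```python
# Latency threshold in ms -> maximum percentage of transactions above it.
ACT_LIMITS = {1: 5.0, 8: 1.0, 64: 0.1}


def passes_criteria(pct_over, limits=ACT_LIMITS):
    """Return True if every measured percentage is within its limit."""
    return all(pct_over[ms] <= limit for ms, limit in limits.items())


good = passes_criteria({1: 4.2, 8: 0.3, 64: 0.0})
bad = passes_criteria({1: 6.0, 8: 0.3, 64: 0.0})  # too many over 1 ms
```

Passing a different `limits` mapping is how you would apply your own criteria.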

Analyzing The Results
When you run the ACT analysis tool, you will see output like this (time slices are
hourly):
      trans %>(ms)         device %>(ms)
slice 1 2 4 8 16 32 64 1 2 4 8 16 32 64
----- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
1 21.01 1.59 0.04 0.00 0.00 0.00 0.00 20.88 1.57 0.04 0.00 0.00 0.00 0.00
2 23.34 1.58 0.03 0.00 0.00 0.00 0.00 23.19 1.56 0.03 0.00 0.00 0.00 0.00
3 23.89 1.66 0.04 0.00 0.00 0.00 0.00 23.75 1.64 0.04 0.00 0.00 0.00 0.00
4 25.39 2.06 0.05 0.00 0.00 0.00 0.00 25.24 2.03 0.05 0.00 0.00 0.00 0.00
5 26.72 2.41 0.07 0.00 0.00 0.00 0.00 26.57 2.38 0.07 0.00 0.00 0.00 0.00
6 26.68 2.37 0.07 0.00 0.00 0.00 0.00 26.53 2.34 0.06 0.00 0.00 0.00 0.00
7 24.93 1.82 0.04 0.00 0.00 0.00 0.00 24.78 1.79 0.04 0.00 0.00 0.00 0.00
8 25.61 1.99 0.05 0.00 0.00 0.00 0.00 25.46 1.97 0.05 0.00 0.00 0.00 0.00
9 25.68 1.96 0.05 0.00 0.00 0.00 0.00 25.53 1.94 0.05 0.00 0.00 0.00 0.00
10 26.79 2.28 0.06 0.00 0.00 0.00 0.00 26.64 2.25 0.06 0.00 0.00 0.00 0.00
11 24.69 1.63 0.03 0.00 0.00 0.00 0.00 24.54 1.61 0.03 0.00 0.00 0.00 0.00
12 25.73 1.92 0.04 0.00 0.00 0.00 0.00 25.58 1.90 0.04 0.00 0.00 0.00 0.00
13 26.86 2.26 0.06 0.00 0.00 0.00 0.00 26.70 2.23 0.06 0.00 0.00 0.00 0.00
14 26.17 2.03 0.05 0.00 0.00 0.00 0.00 26.02 2.01 0.05 0.00 0.00 0.00 0.00
15 26.40 2.10 0.05 0.00 0.00 0.00 0.00 26.24 2.07 0.05 0.00 0.00 0.00 0.00
16 26.70 2.18 0.06 0.00 0.00 0.00 0.00 26.54 2.15 0.05 0.00 0.00 0.00 0.00
17 26.57 2.13 0.05 0.00 0.00 0.00 0.00 26.41 2.11 0.05 0.00 0.00 0.00 0.00
18 26.53 2.11 0.05 0.00 0.00 0.00 0.00 26.37 2.09 0.05 0.00 0.00 0.00 0.00
19 26.53 2.11 0.05 0.00 0.00 0.00 0.00 26.37 2.08 0.05 0.00 0.00 0.00 0.00
20 25.43 1.79 0.04 0.00 0.00 0.00 0.00 25.27 1.77 0.04 0.00 0.00 0.00 0.00
21 27.56 2.40 0.06 0.00 0.00 0.00 0.00 27.40 2.37 0.06 0.00 0.00 0.00 0.00
22 27.61 2.43 0.07 0.00 0.00 0.00 0.00 27.45 2.40 0.07 0.00 0.00 0.00 0.00
23 25.21 1.71 0.04 0.00 0.00 0.00 0.00 25.05 1.68 0.04 0.00 0.00 0.00 0.00
24 26.61 2.10 0.05 0.00 0.00 0.00 0.00 26.45 2.08 0.05 0.00 0.00 0.00 0.00
----- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
avg 25.78 2.03 0.05 0.00 0.00 0.00 0.00 25.62 2.00 0.05 0.00 0.00 0.00 0.00
max 27.61 2.43 0.07 0.00 0.00 0.00 0.00 27.45 2.40 0.07 0.00 0.00 0.00 0.00
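If you want to post-process this table, each data row can be split into its two groups of seven columns. The Python sketch below assumes the layout shown above (a row label followed by seven "trans" and seven "device" percentages for the 1, 2, 4, 8, 16, 32 and 64 ms thresholds); the function name `parse_row` is hypothetical and is not part of the ACT tools.

```python
BUCKETS_MS = (1, 2, 4, 8, 16, 32, 64)


def parse_row(line: str):
    """Split one table row into (label, trans, device).

    trans and device map each latency threshold in ms to the
    percentage of operations that exceeded it.
    """
    fields = line.split()
    if len(fields) != 1 + 2 * len(BUCKETS_MS):
        raise ValueError("expected a label followed by 14 numbers")
    label = fields[0]
    values = [float(v) for v in fields[1:]]
    trans = dict(zip(BUCKETS_MS, values[:7]))
    device = dict(zip(BUCKETS_MS, values[7:]))
    return label, trans, device


row = "max 27.61 2.43 0.07 0.00 0.00 0.00 0.00 27.45 2.40 0.07 0.00 0.00 0.00 0.00"
label, trans, device = parse_row(row)
```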

Methodology For Multiple Disks
In this case, you already know the performance of
a single drive. What you are actually testing for is
if this will scale linearly with the controller(s) you
have.
➤ Test 2 drives in parallel and increase the
number of drives until the performance is
obviously unacceptable or you have reached the
number of drives you wish to test.
As with the single disk, if a disk setup passes the
test criteria at Nx and completes the test at
twice that speed, it is said to pass at Nx.
Running ACT Tests
To run ACT tests (e.g. for drive /dev/sdb), follow the steps below. This will
require root or sudo.
1. Download and compile the ACT. Follow the included directions to
compile.
http://aerospike.github.io/act/
2. Prepare the drive(s) for use:
<ACT_DIR>/actprep /dev/sdb
3. Create a config file for the ACT run:
python <ACT_DIR>/act_config_helper.py
4. Execute the ACT on the config file (since these will run for a long
time, it is useful to put it into the background):
<ACT_DIR>/act [config_file] > [log_file] &
5. Test to make sure it is running and outputting data. The “-t 10”
means to put the data into 10 second slices (default is 3600):
<ACT_DIR>/latency_calc/act_latency.py -l [log_file] -t 10
6. Wait for the test to complete (24 hours).
Example: Creating Config Files
> python act_config_helper.py
Enter the number of devices you want to create config for: 1
Enter either raw device if over-provisioned using hdparm or
partition if over-provisioned using fdisk
Enter device name # 1(e.g. /dev/sdb or /dev/sdb1): /dev/sdb
Duration for the test (default :24 hours) [ENTER]
Configure test duration ? (N for using default) (y/N) :n
Use advanced mode for configuration ? (y/N) n
"1x" load is 2000 reads per sec and 1000 writes per sec
Enter the load you want to test the devices ( e.g. enter 1 for 1x
test):3
Do you want to Create the config (Save to a file) ? : (y/N) y
Config File actconfig_3x_1d.txt successfully created

The result will be the output file “actconfig_3x_1d.txt”. If you have multiple SSDs,
the load will be applied to each device. Defaults for the ACT are for small objects (1.5
KB) and can be changed in the advanced options.

Analyzing The Results
Analyze the final output log:
<ACT_DIR>/latency_calc/act_latency.py -l [log_file]

Final Preparations
Once you have your hardware properly configured,
there are some final steps before you use the SSDs.
You must blank out the drives (similar to a format
with a filesystem) by running the dd command on
each of the drives. These can be run in parallel, but
must be done by root or with sudo:
> sudo dd if=/dev/zero of=/dev/<DEVICE_ID> bs=128k &
If you used partitioning to OP the drives, make sure to use the partition
id (e.g. /dev/sdb1).
WARNING: Do not run this on the disk with your operating system
(usually /dev/sda)!
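Because dd on the wrong device is destructive, a guard before wiping can help. The Python sketch below is one possible check and rests on assumptions: the helper names are hypothetical, it only looks at text in the format of Linux's /proc/mounts, and it will not notice drives used through LVM, device-mapper or swap, so it does not replace checking the device name yourself.

```python
def mounted_devices(mounts_text: str) -> set:
    """Collect /dev/... entries from text in /proc/mounts format."""
    devices = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("/dev/"):
            devices.add(fields[0])
    return devices


def safe_to_wipe(device: str, mounts_text: str) -> bool:
    """False if the device, or any partition on it, is mounted."""
    return not any(
        mounted.startswith(device) for mounted in mounted_devices(mounts_text)
    )


sample = "/dev/sda1 / ext4 rw 0 0\nproc /proc proc rw 0 0\n"
os_disk_ok = safe_to_wipe("/dev/sda", sample)
data_disk_ok = safe_to_wipe("/dev/sdb", sample)
```

On a real server the text would come from reading /proc/mounts, and the dd command would only be run for devices where the check returns True. The prefix match is deliberately conservative and may reject a device whose name is a prefix of another mounted device.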

Troubleshooting Common Issues


➤ Tests show much greater than expected latency
    ➤ Make sure you have properly configured over-provisioning. This is a common issue.
    ➤ If you are doing a multi-disk test, the problem may lie in a single disk. Variances in
    manufacturing may lead to a single poor drive dragging down the latencies reported for all drives. Also make
    sure your drives are fresh. Old drives may have hotspots.
➤ Test won’t complete
    ➤ Your load may be overwhelming your controller or the drive. A log message will let you
    know if it is stopping because it cannot keep up.
    ➤ If there is no error message in the log, sometimes logging out of the server will stop the
    ACT process. You must use nohup or a similar mechanism to ensure the process will run
    for the full 24 hours.
➤ Operating system gives odd errors
    ➤ You may have inadvertently run actprep or dd on the OS drive. Even the best of us have
    done this.

Q&A
Thank You
Send all questions/comments/complaints to
YOUNG PAIK
YOUNG@AEROSPIKE.COM

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperSaurav Haloi
 
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationPercona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationKenny Gryp
 
Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance DataWorks Summit/Hadoop Summit
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryCloudera, Inc.
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...StreamNative
 
Disaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoFDisaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoFShapeBlue
 
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...DataStax Academy
 
Ceph Object Storage Reference Architecture Performance and Sizing Guide
Ceph Object Storage Reference Architecture Performance and Sizing GuideCeph Object Storage Reference Architecture Performance and Sizing Guide
Ceph Object Storage Reference Architecture Performance and Sizing GuideKaran Singh
 
Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into CassandraDataStax
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingDataWorks Summit
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing GuideJose De La Rosa
 
Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)Sage Weil
 
What is aerospike database and why is it vastly superior to other database an...
What is aerospike database and why is it vastly superior to other database an...What is aerospike database and why is it vastly superior to other database an...
What is aerospike database and why is it vastly superior to other database an...Aerospike
 
Crossing the Streams: the New Streaming Foreign-Key Join Feature in Kafka Str...
Crossing the Streams: the New Streaming Foreign-Key Join Feature in Kafka Str...Crossing the Streams: the New Streaming Foreign-Key Join Feature in Kafka Str...
Crossing the Streams: the New Streaming Foreign-Key Join Feature in Kafka Str...confluent
 
Extreme Linux Performance Monitoring and Tuning
Extreme Linux Performance Monitoring and TuningExtreme Linux Performance Monitoring and Tuning
Extreme Linux Performance Monitoring and TuningMilind Koyande
 
Apache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In PracticeApache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In PracticeDremio Corporation
 

Was ist angesagt? (20)

Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationPercona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
 
Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance
 
Introduction to redis
Introduction to redisIntroduction to redis
Introduction to redis
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
 
Introduction to Aerospike
Introduction to AerospikeIntroduction to Aerospike
Introduction to Aerospike
 
Disaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoFDisaggregating Ceph using NVMeoF
Disaggregating Ceph using NVMeoF
 
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
 
Ceph Object Storage Reference Architecture Performance and Sizing Guide
Ceph Object Storage Reference Architecture Performance and Sizing GuideCeph Object Storage Reference Architecture Performance and Sizing Guide
Ceph Object Storage Reference Architecture Performance and Sizing Guide
 
Caching Strategies
Caching StrategiesCaching Strategies
Caching Strategies
 
Bulk Loading Data into Cassandra
Bulk Loading Data into CassandraBulk Loading Data into Cassandra
Bulk Loading Data into Cassandra
 
Apache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data ProcessingApache Tez - A unifying Framework for Hadoop Data Processing
Apache Tez - A unifying Framework for Hadoop Data Processing
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
 
Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)Storage tiering and erasure coding in Ceph (SCaLE13x)
Storage tiering and erasure coding in Ceph (SCaLE13x)
 
Block Storage For VMs With Ceph
Block Storage For VMs With CephBlock Storage For VMs With Ceph
Block Storage For VMs With Ceph
 
What is aerospike database and why is it vastly superior to other database an...
What is aerospike database and why is it vastly superior to other database an...What is aerospike database and why is it vastly superior to other database an...
What is aerospike database and why is it vastly superior to other database an...
 
Crossing the Streams: the New Streaming Foreign-Key Join Feature in Kafka Str...
Crossing the Streams: the New Streaming Foreign-Key Join Feature in Kafka Str...Crossing the Streams: the New Streaming Foreign-Key Join Feature in Kafka Str...
Crossing the Streams: the New Streaming Foreign-Key Join Feature in Kafka Str...
 
Extreme Linux Performance Monitoring and Tuning
Extreme Linux Performance Monitoring and TuningExtreme Linux Performance Monitoring and Tuning
Extreme Linux Performance Monitoring and Tuning
 
Apache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In PracticeApache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In Practice
 









Getting The Most Out Of Your Flash/SSDs

  • 1. Getting The Most Out Of Your Flash/SSDs Young Paik Technical Marketing Director young@aerospike.com Aerospike aer . o . spike [air-oh- spahyk] noun, 1. tip of a rocket that enhances speed and stability
  • 2. Introduction Flash/SSDs (used interchangeably) are still relatively new. Getting the most out of them requires a good understanding of how they work and how Aerospike uses them. © 2014 Aerospike. All rights reserved. Confidential Pg. 2
  • 3. Agenda
      - SSDs vs. Rotational Drives
      - What Aerospike Does To Make The Most of SSDs
      - The Factors That Most Improve The Performance of SSDs
      - Testing SSDs
      - More on Testing SSDs
      - Even more on Testing SSDs
      - Final Preparations For Your Drives
  • 5. Differences Matter. Some will tell you that their databases will work on SSDs and that no changes are necessary. However, there are important differences between SSDs and rotational drives. To get the best performance, you must do more than simply swap out your old drive and put in an SSD.
  • 6. Comparing Old and New. There are differences between rotational drives and SSDs that are independent of the database you are using.
      Characteristic | Rotational | SSD | Notes
      Random read | Poor | Excellent | This is where SSDs shine the most. With no moving parts, SSDs are clearly the choice for random reads.
      Random write | Poor | Good | Similar to reads, but SSDs are not quite as fast with random writes as they are with reads.
      Sequential write | Good | Excellent | Rotational drives narrow the gap here. While they are close in pure write performance, any reads during these writes will require the movement of the heads on rotational drives.
      Rewritability (durability) | Excellent | Poor | This is where SSDs are the weakest. NAND (Flash) chips have limits to how many times you can write to the same area. Databases must take this into account to avoid "hotspots." Databases that do not are relying on the operating system (i.e. the TRIM command) to alleviate these issues. Aerospike manages this differently.
  • 7. What Aerospike Does To Make The Most Of SSDs
  • 8. Techniques To make the best use of SSDs, Aerospike has designed an architecture that does the following:
      ➤ Uses raw disk: Aerospike does not use a file system, which would only slow down the database.
      ➤ Writes in large blocks: Rather than writing many small items, it is much more efficient to write a few large ones. Aerospike uses block sizes that are integral multiples of 128 KB.
      ➤ Reads in small blocks: Reads are done in 512-byte data segments.
      ➤ Handles defragmentation on a regular basis: All databases must delete data. This fragments the data on disk, which makes the disk harder to use efficiently. Aerospike reclaims the space through a continual process called defragmentation, so you do not need the TRIM command used on most operating systems.
      ➤ Works with vendors: Aerospike works closely with SSD manufacturers to test hardware and provide feedback for the best performance.
  • 9. Accessing An Object In Aerospike: Writing A New Standard Data Type Record With SSDs (Diagram: Client, Master Node, DRAM (Index), SSD (Data); asynchronous write; block size 128 KB by default.)
      1) Client finds the Master Node from the partition map.
      2) Client makes a write request to the Master Node.
      3) Master Node makes an entry in the index (in DRAM) and queues the write in a temporary write buffer.
      4) Master Node coordinates the write with replica nodes (not shown).
      5) Master Node returns success to the client.
      6) Master Node asynchronously writes data in blocks.
      7) Index in DRAM points to the location on the SSD.
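The write path on this slide can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Aerospike's implementation: the class and method names are hypothetical, replication (step 4) is left out, and the buffer is flushed synchronously when it fills rather than asynchronously.

```python
BLOCK_SIZE = 128 * 1024  # bytes; the slide gives 128 KB as the default block size


class WritePathSketch:
    """Toy model (hypothetical names): a DRAM index plus a write buffer
    that is flushed to the 'SSD' one whole block at a time."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.index = {}            # key -> (block_id, offset, length), kept in "DRAM"
        self.buffer = bytearray()  # temporary write buffer
        self.blocks = []           # stand-in for the SSD: list of flushed blocks

    def write(self, key, value):
        if len(value) > self.block_size:
            raise ValueError("record larger than one block")
        # Flush first if the record would not fit in the current buffer.
        if len(self.buffer) + len(value) > self.block_size:
            self.flush()
        offset = len(self.buffer)
        self.buffer += value
        # The buffer becomes block number len(self.blocks) once it is flushed.
        self.index[key] = (len(self.blocks), offset, len(value))
        return "ok"  # success is reported before the data reaches the SSD

    def flush(self):
        # One large write instead of many small ones.
        if self.buffer:
            self.blocks.append(bytes(self.buffer))
            self.buffer = bytearray()

    def read(self, key):
        block_id, offset, length = self.index[key]
        if block_id == len(self.blocks):  # record is still in the write buffer
            return bytes(self.buffer[offset:offset + length])
        return self.blocks[block_id][offset:offset + length]
```

The point of the sketch is the split of responsibilities: the index answers "where is the record" from memory, while the device only ever sees large block writes.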
  • 10. Defragmentation In Aerospike: How Space Is Freed Up. Aerospike writes the data in large data blocks. (Diagram: SSD data area divided into blocks 1 to 8; block size is 128 KB by default.)
  • 11. Defragmentation In Aerospike: How Space Is Freed Up. As new data is added to the disk, new blocks are continually written to the SSD.
  • 12. Defragmentation In Aerospike: How Space Is Freed Up. Over time, some records will be deleted or updated, resulting in fragmented usage of the flash/SSD disk. This unused space must be freed up.
  • 13. Defragmentation In Aerospike: How Space Is Freed Up. Some databases use a nightly process called “compaction,” which is an intensive process. Aerospike runs a regular process (every few minutes) that looks for blocks below some level of use (called the high watermark). In this example, if the high watermark is 50%, blocks 1 and 3 in the diagram are less than 50% occupied. The defragmenter will take the data in these blocks and merge them into another block.
  • 14. Defragmentation In Aerospike: How Space Is Freed Up. The defragmenter will write the new block (block 7) and free blocks 1 and 3 for new writes. Because this runs constantly, there is no special time when the performance of the database is bad. This algorithm operates best when the SSD is less than 50% occupied. As disk use grows above this, the performance of the defragmenter will decrease.
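The defragmentation pass described on slides 10 to 14 can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function name is hypothetical, block occupancy is counted in records rather than bytes, and the pass runs once instead of continually.

```python
def defragment(blocks, capacity, threshold=0.5):
    """Hypothetical one-shot defragmentation pass.

    blocks:    dict mapping block id -> list of live records in that block
    capacity:  records per block (a stand-in for the block size in bytes)
    threshold: blocks filled below this fraction are rewritten
    Returns (new_blocks, freed_ids).
    """
    # Find blocks whose occupancy is below the threshold.
    sparse = sorted(bid for bid, recs in blocks.items()
                    if len(recs) / capacity < threshold)
    # Collect the live records from those blocks.
    live = [rec for bid in sparse for rec in blocks[bid]]
    # Keep every other block untouched.
    result = {bid: list(recs) for bid, recs in blocks.items()
              if bid not in sparse}
    # Pack the live records into fresh blocks after the highest used id.
    next_id = max(blocks, default=0) + 1
    for start in range(0, len(live), capacity):
        result[next_id] = live[start:start + capacity]
        next_id += 1
    return result, sparse
```

With six blocks in use where blocks 1 and 3 are under half full, the sketch merges their records into a new block 7 and reports blocks 1 and 3 as free, mirroring the slide's example.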
  • 15. Aerospike Certification Tool (ACT) for SSDs ■ Industry Standard Flash (SSD / PCI-E) Benchmark ■ Open Source Tool used by Flash Vendors to certify drives
  • 16. The Factors That Most Improve The Performance of SSDs
  • 17. How To Prepare Your System
      ➤ Select the correct hardware
          - SSD
          - Disk controller
      ➤ Configure the hardware
      ➤ Configure Aerospike
  • 18. Most Important Factors for SSD Performance
      Factor | Importance (rough) | Notes
      Interface (SATA v. PCIe) | Very High | One of the most critical choices is the interface. Today, the difference in price and layout is huge, so the choice is quite easy for customers to make. If very low latency is absolutely required, use PCIe. Costs are 2x-5x what they would be on SATA.
      Consumer v. Enterprise | Very High | A few years ago the difference between these types was small, but today very few consumer rated drives pass Aerospike certification.
      Make/model | Very High | Differences in specific models from the same maker can be very large. In some cases, the manufacturer may have quietly made changes to the hardware and firmware, but not changed the model number.
      Disk controller (RAID, HBA) | Very High | Aerospike prefers direct control of each SSD. RAID controllers will add latency, without much added benefit (Aerospike is already replicated).
      Over-provisioning (OP) | Very High | Over-provisioning allocates space on the drive for use by the controller. The amount the manufacturer has set varies from one model to another. Typical amounts are 6% - 28%.
      Used before | High | If the SSD has been in use for a long time for other purposes, the disk will be unevenly worn, causing poor performance.
      NCQ | Medium | Native Command Queuing is a SATA extension that allows the disk to internally optimize how commands are executed. Rarely a problem on modern equipment.
      Scheduler | Low | This is the I/O scheduler for the Linux kernel. Aerospike prefers the NOOP scheduler and automatically selects it.
  • 19. Selecting The Correct SSD Model Given these factors, it is important to choose the correct model. Aerospike publishes and updates a list of models that have passed testing. The list can be found at: https://support.aerospike.com/customer/portal/articles/1315402-recommended-ssds
  • 20. Selecting The Correct Disk Controller Warning: Be very careful with the disk controller. Aerospike uses controllers in a way that goes against conventional wisdom. Best practices:
      ➤ Do not use RAID across the SSDs. Aerospike stores small objects and is much more sensitive to latency than bandwidth.
      ➤ When possible, use direct attach (SATA or PCIe).
      ➤ If you can’t use direct attach, try one of the following:
          - Use HBAs without RAID
          - Configure each SSD as a separate RAID 0 array
      ➤ Spread the SSDs among as many controllers as possible.
      ➤ All servers will have a limit to the number of drives that will perform well. 4 is a common number.
      ➤ If your company has a standard configuration for Hadoop, it often has similar hardware needs to Aerospike.
      ➤ Some controllers have special software to boost performance. E.g. the LSI 2208 chip has FastPath available for specific models. Check with your vendor.
  • 21. Over-provisioning (OP) OP can make the difference between bad performance and great performance. There are 2 types of OP:
      - Manufacturer’s OP
      - User OP
    Manufacturers typically set 6%-8% for consumer rated drives and 14%-28% for enterprise rated drives. This varies depending on the model and capacity.
  • 22. Over-Provisioning: What You Can Do User over-provisioning can be added in one of the following ways:
      - Manufacturer’s software
      - Host Protected Area (HPA): Linux has a command called hdparm that you can use to set the HPA.
      - Disk partitions: You can also leave some space on the disk unpartitioned. The unpartitioned space will be used by the controller.
    No matter which method you use, it is good to reserve 21% for use by the controller.
  • 23. Comparing OP Methods
      | HPA (Host Protected Area) | Partitioning
      Ease of use | Use hdparm 9.37+. Most versions of Linux come with earlier versions. | Use built-in fdisk command
      Performance | Both methods have the same performance
      Device ID | Must specify the basic device (e.g. /dev/sdb) | Must specify the specific partition (e.g. /dev/sdb1)
      Notes | hdparm may not work through your RAID controller | All commands must specify the full partition. Not doing so may result in using disks not OPed.
  • 24. OP Using Host Protected Area (HPA) In order to use the HPA, it is easiest to use the command hdparm (must have version 9.37+). You can get a copy of this at: http://sourceforge.net/projects/hdparm/
  • 25. OP Using Host Protected Area (HPA) - Example First find the number of sectors (must be root or use sudo):
      > sudo /opt/hdparm-9.43/hdparm -N /dev/sdb
      /dev/sdb:
       max sectors = 500118192/500118192, HPA is disabled
    Then multiply by the fraction that stays visible (79%, leaving 21% for OP): 500,118,192 x 0.79 = 395,093,372 sectors (rounded)
      > sudo /opt/hdparm-9.43/hdparm -Np395093372 --yes-i-know-what-i-am-doing /dev/sdb
      /dev/sdb:
       setting max visible sectors to 395093372 (permanent)
       max sectors = 395093372/500118192, HPA is enabled
    Finally, reboot. This is necessary to make sure the new settings take hold.
  • 26. OP Using Partitions - Example In this example we will over-provision the disk /dev/sdb by creating a single partition that is 79% of the overall capacity (15121 = 19140 x 0.79, rounded):
      > sudo /sbin/fdisk /dev/sdb
      Command (m for help): n
      Command action
         e   extended
         p   primary partition (1-4)
      p
      Partition number (1-4): 1
      First cylinder (1-19140, default 1): 1
      Last cylinder, +cylinders or +size{K,M,G} (1-19140, default 19140): 15121
      Command (m for help): p
      Disk /dev/sdb: 157.4 GB, 157437394944 bytes
      255 heads, 63 sectors/track, 19140 cylinders
      Units = cylinders of 16065 * 512 = 8225280 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 512 bytes / 512 bytes
      Disk identifier: 0xeff8f3ae
         Device Boot      Start         End      Blocks   Id  System
      /dev/sdb1               1       15121   121459401   83  Linux
      Command (m for help): w
      The partition table has been altered!
      Calling ioctl() to re-read partition table.
      Syncing disks.
    We recommend rebooting the server once this has been done. Note that for this disk you will need to use /dev/sdb1 as the device.
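The 79% arithmetic used in the HPA and partition examples can be checked with a short Python helper. The function name and the round-to-nearest choice are assumptions made for illustration; integer math is used so the result does not depend on floating-point rounding.

```python
def visible_units(total_units, op_percent=21):
    """Units (sectors or cylinders) left visible after reserving
    op_percent of the drive for over-provisioning.

    Hypothetical helper: rounds to the nearest whole unit using
    integer arithmetic only.
    """
    keep_percent = 100 - op_percent
    return (total_units * keep_percent + 50) // 100


# Sector count from the hdparm example, cylinder count from the fdisk example.
print(visible_units(500118192))
print(visible_units(19140))
```

The first result is the value to pass to hdparm -Np, and the second is the last cylinder to enter in fdisk.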
  • 28. Did You Choose Well? The only way to be sure how these choices work in your environment is to test. The best way is to use the Aerospike Certification Tool (ACT), which Aerospike has open sourced for testing SSD configurations.
  • 29. Aerospike ACT The ACT accesses SSDs similarly to the way the Aerospike database does: reads with concurrent large block writes. By default the tests run for a period of 24 hours. The tests are based on factors of “x”:
      ➤ 1x represents 2,000 reads/s and 1,000 writes/s per SSD
      ➤ 2x represents 4,000 reads/s and 2,000 writes/s per SSD
      ➤ etc.
    1x represents decent performance of an SSD in 2010. Today, several models of SSDs perform well at 3x. These tests must be run for 24 hours to ensure stability. Test with greater and greater “x” levels until the SSD performs poorly.
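As a quick illustration of the “x” factors, the load for any level can be computed as below. This is a hypothetical helper, not part of the ACT; it only applies the 1x definition from the slide (2,000 reads/s and 1,000 writes/s per SSD).

```python
READS_PER_X = 2000   # reads per second per SSD at 1x (from the slide)
WRITES_PER_X = 1000  # writes per second per SSD at 1x (from the slide)


def act_load(x, devices=1):
    """Return the per-SSD and total load for an Nx test (hypothetical helper)."""
    reads = READS_PER_X * x
    writes = WRITES_PER_X * x
    return {
        "reads_per_ssd": reads,
        "writes_per_ssd": writes,
        "total_reads": reads * devices,
        "total_writes": writes * devices,
    }


print(act_load(3))
print(act_load(3, devices=4))
```

Because the load is defined per SSD, a multi-drive test at the same “x” multiplies the total traffic the controller has to carry.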
  • 30. Methodology For Single Disk The basic methodology is:
      ➤ Test a single drive at 3x.
      ➤ Retest with different configurations (OP, disk controller, settings, etc.).
      ➤ If the best of these passes the standards, retest at a higher x. If not, lower the test level to 2x.
      ➤ Repeat these tests until you have discovered the limits of performance.
      ➤ Finally, test at twice the highest level passed to make sure the disk can handle large bursts of traffic.
    If a disk passes the test criteria at Nx and completes the test at twice that speed, it is said to pass at Nx.
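The search for a drive's rating can be sketched as a loop. This is a rough sketch under several assumptions, none of which come from the slides: `passes` and `completes` are hypothetical callbacks standing in for a full 24-hour ACT run, levels are stepped one “x” at a time, and the rating is lowered if the double-speed burst test does not complete. Retesting across different configurations is left out.

```python
def find_rating(passes, completes, start=3, max_x=64):
    """Return the highest N such that the drive passes at Nx and
    completes a run at 2Nx, or 0 if no level qualifies.

    passes(x):    hypothetical callback, True if the criteria are met at x
    completes(x): hypothetical callback, True if the run at x finishes
    """
    x = start
    if passes(x):
        # Keep raising the level until the drive stops passing.
        while x + 1 <= max_x and passes(x + 1):
            x += 1
    else:
        # Lower the level until the drive passes (or we run out of levels).
        x -= 1
        while x >= 1 and not passes(x):
            x -= 1
    # Burst check: the drive must also complete a run at twice the level.
    while x >= 1 and not completes(2 * x):
        x -= 1
    return x
```

In practice each callback costs a day of testing, so the order in which levels are tried matters far more than it does in this sketch.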
  • 31. What Is Passing? Aerospike defines passing with the following criteria:
      ➤ No more than 5% of all transactions exceed 1 ms
      ➤ No more than 1% of all transactions exceed 8 ms
      ➤ No more than 0.1% of all transactions exceed 64 ms
    You may define your own criteria.
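The passing criteria above translate directly into a small check. The function and the input format are assumptions made for illustration (the ACT does not provide this helper), and the sample percentages are invented, not measured results.

```python
# (latency threshold in ms, maximum % of transactions allowed above it)
CRITERIA = ((1, 5.0), (8, 1.0), (64, 0.1))


def passes_criteria(pct_over, criteria=CRITERIA):
    """pct_over maps a latency threshold in ms to the percentage of
    transactions that exceeded it. Hypothetical helper."""
    return all(pct_over[ms] <= limit for ms, limit in criteria)


# Made-up percentages for illustration only.
print(passes_criteria({1: 4.2, 8: 0.3, 64: 0.0}))
print(passes_criteria({1: 6.0, 8: 0.3, 64: 0.0}))
```

Passing your own thresholds through the `criteria` argument is one way to apply stricter or looser limits than the defaults listed on the slide.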
  • 32. Analyzing The Results When you run the ACT analysis tool, you will see output like this (time slices are hourly):
              trans                                             device
              %>(ms)                                            %>(ms)
      slice        1      2      4      8     16     32     64       1      2      4      8     16     32     64
      -----   ------ ------ ------ ------ ------ ------ ------  ------ ------ ------ ------ ------ ------ ------
          1    21.01   1.59   0.04   0.00   0.00   0.00   0.00   20.88   1.57   0.04   0.00   0.00   0.00   0.00
          2    23.34   1.58   0.03   0.00   0.00   0.00   0.00   23.19   1.56   0.03   0.00   0.00   0.00   0.00
          3    23.89   1.66   0.04   0.00   0.00   0.00   0.00   23.75   1.64   0.04   0.00   0.00   0.00   0.00
          4    25.39   2.06   0.05   0.00   0.00   0.00   0.00   25.24   2.03   0.05   0.00   0.00   0.00   0.00
          5    26.72   2.41   0.07   0.00   0.00   0.00   0.00   26.57   2.38   0.07   0.00   0.00   0.00   0.00
          6    26.68   2.37   0.07   0.00   0.00   0.00   0.00   26.53   2.34   0.06   0.00   0.00   0.00   0.00
          7    24.93   1.82   0.04   0.00   0.00   0.00   0.00   24.78   1.79   0.04   0.00   0.00   0.00   0.00
          8    25.61   1.99   0.05   0.00   0.00   0.00   0.00   25.46   1.97   0.05   0.00   0.00   0.00   0.00
          9    25.68   1.96   0.05   0.00   0.00   0.00   0.00   25.53   1.94   0.05   0.00   0.00   0.00   0.00
         10    26.79   2.28   0.06   0.00   0.00   0.00   0.00   26.64   2.25   0.06   0.00   0.00   0.00   0.00
         11    24.69   1.63   0.03   0.00   0.00   0.00   0.00   24.54   1.61   0.03   0.00   0.00   0.00   0.00
         12    25.73   1.92   0.04   0.00   0.00   0.00   0.00   25.58   1.90   0.04   0.00   0.00   0.00   0.00
         13    26.86   2.26   0.06   0.00   0.00   0.00   0.00   26.70   2.23   0.06   0.00   0.00   0.00   0.00
         14    26.17   2.03   0.05   0.00   0.00   0.00   0.00   26.02   2.01   0.05   0.00   0.00   0.00   0.00
         15    26.40   2.10   0.05   0.00   0.00   0.00   0.00   26.24   2.07   0.05   0.00   0.00   0.00   0.00
         16    26.70   2.18   0.06   0.00   0.00   0.00   0.00   26.54   2.15   0.05   0.00   0.00   0.00   0.00
         17    26.57   2.13   0.05   0.00   0.00   0.00   0.00   26.41   2.11   0.05   0.00   0.00   0.00   0.00
         18    26.53   2.11   0.05   0.00   0.00   0.00   0.00   26.37   2.09   0.05   0.00   0.00   0.00   0.00
         19    26.53   2.11   0.05   0.00   0.00   0.00   0.00   26.37   2.08   0.05   0.00   0.00   0.00   0.00
         20    25.43   1.79   0.04   0.00   0.00   0.00   0.00   25.27   1.77   0.04   0.00   0.00   0.00   0.00
         21    27.56   2.40   0.06   0.00   0.00   0.00   0.00   27.40   2.37   0.06   0.00   0.00   0.00   0.00
         22    27.61   2.43   0.07   0.00   0.00   0.00   0.00   27.45   2.40   0.07   0.00   0.00   0.00   0.00
         23    25.21   1.71   0.04   0.00   0.00   0.00   0.00   25.05   1.68   0.04   0.00   0.00   0.00   0.00
         24    26.61   2.10   0.05   0.00   0.00   0.00   0.00   26.45   2.08   0.05   0.00   0.00   0.00   0.00
      -----   ------ ------ ------ ------ ------ ------ ------  ------ ------ ------ ------ ------ ------ ------
        avg    25.78   2.03   0.05   0.00   0.00   0.00   0.00   25.62   2.00   0.05   0.00   0.00   0.00   0.00
        max    27.61   2.43   0.07   0.00   0.00   0.00   0.00   27.45   2.40   0.07   0.00   0.00   0.00   0.00
  • 33. Methodology For Multiple Disks In this case, you already know the performance of a single drive. What you are actually testing is whether this performance scales linearly with the controller(s) you have.
      ➤ Test 2 drives in parallel and increase the number of drives until the performance is obviously unacceptable or you have reached the number of drives you wish to test.
    As with the single disk, if a disk setup passes the test criteria at Nx and completes the test at twice that speed, it is said to pass at Nx.
  • 34. Running ACT Tests To run ACT tests (e.g. for drive /dev/sdb), follow the steps below. This will require root or sudo.
      1. Download and compile the ACT. Follow the included directions to compile. http://aerospike.github.io/act/
      2. Prepare the drive(s) for use:
         <ACT_DIR>/actprep /dev/sdb
      3. Create a config file for the ACT run:
         python <ACT_DIR>/act_config_helper.py
      4. Execute the ACT on the config file (since the test runs for a long time, it is useful to put it into the background):
         <ACT_DIR>/act [config_file] > [log_file] &
      5. Test to make sure it is running and outputting data. The “-t 10” means to put the data into 10 second slices (default is 3600):
         <ACT_DIR>/latency_calc/act_latency.py -l [log_file] -t 10
      6. Wait for the test to complete (24 hours).
  • 35. Example: Creating Config Files
      > python act_config_helper.py
      Enter the number of devices you want to create config for: 1
      Enter either raw device if over-provisioned using hdparm or partition if over-provisioned using fdisk
      Enter device name # 1(e.g. /dev/sdb or /dev/sdb1): /dev/sdb
      Duration for the test (default :24 hours) [ENTER]
      Configure test duration ? (N for using default) (y/N) :n
      Use advanced mode for configuration ? (y/N) n
      "1x" load is 2000 reads per sec and 1000 writes per sec
      Enter the load you want to test the devices ( e.g. enter 1 for 1x test):3
      Do you want to Create the config (Save to a file) ? : (y/N) y
      Config File actconfig_3x_1d.txt successfully created
    The result will be the output file “actconfig_3x_1d.txt”. If you have multiple SSDs, the load will be applied to each device. Defaults for the ACT are for small objects (1.5 KB) and can be changed in the advanced options.
  • 36. Analyzing The Results Analyze the final output log:
      <ACT_DIR>/latency_calc/act_latency.py -l [log_file]
    The output is the same hourly latency table shown on slide 32.
  • 38. Final Preparations Once you have your hardware properly configured, there are some final steps before you use the SSDs. You must blank out the drives (similar to formatting with a filesystem) by running the dd command on each of the drives. These can be run in parallel, but must be done by root or with sudo:
      > sudo dd if=/dev/zero of=/dev/<DEVICE_ID> bs=128k &
    If you used partitioning to OP the drives, make sure to use the partition id (e.g. /dev/sdb1). WARNING: Do not run this on the disk with your operating system (usually /dev/sda)!
  • 39. Troubleshooting Common Issues
      ➤ Tests show much greater than expected latency
          - Make sure you have properly configured over-provisioning. This is a common issue.
          - If you are doing a multi-disk test, the problem may lie in a single disk. Variances in manufacturing may lead to a single bad drive making the latencies look poor for all drives.
          - Also make sure your drives are fresh. Old drives may have hotspots.
      ➤ Test won’t complete
          - Your load may be overwhelming your controller or the drive. A log message will let you know if the test is stopping because it cannot keep up.
          - If there is no error message in the log, logging out of the server may have stopped the ACT process. You must use nohup or a similar mechanism to ensure the process will run for the full 24 hours.
      ➤ Operating system gives odd errors
          - You may have inadvertently run actprep or dd on the OS drive. Even the best of us have done this.
  • 40. Q&A
  • 41. Thank You Send all questions/comments/complaints to YOUNG PAIK YOUNG@AEROSPIKE.COM

Editor's notes

  1. Fastest; best uptime; predictable performance; consistency