Knowing how to set up good benchmarks is invaluable in understanding the performance of a system. Writing correct and useful benchmarks is hard, and verifying the results is difficult and error-prone. Done right, benchmarks guide teams toward improving the performance of their systems. Done wrong, hours of effort may result in a worse-performing application, upset customers, or worse! In this talk, we will discuss what you need to know to write better benchmarks for distributed systems. We will look at examples of bad benchmarks and learn which biases can invalidate the measurements, in the hope of correctly applying our new-found skills and avoiding such pitfalls in the future.
4. Benchmark = How Fast?
• Your process vs Goal
• Your process vs Best Practices
5. Today
• How Not to Write Benchmarks
• Benchmark Setup & Results:
  - You're wrong about machines
  - You're wrong about stats
  - You're wrong about what matters
• Becoming Less Wrong
• Having Fun with Riak
28. Wrong About Stats
[Chart: "Convergence of Median on Samples"; Latency vs Time; series: Stable Samples, Stable Median, Decaying Samples, Decaying Median]
29. Website Serving Images
• Access 1 image 1000 times
• Latency measured for each access
• Start measuring immediately
• 3 runs
• Find mean
• Dev machine
[Diagram: Web Request → Server → Cache → S3]
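The setup above bakes the cold-cache miss into the mean because measurement starts immediately. A toy sketch (the in-process cache and the 50 ms "S3" delay are made up for illustration) shows how a single un-warmed measurement skews the result:

```python
import time

CACHE = {}

def fetch_image(key):
    """Toy request path: the first access misses the cache and pays the S3 cost."""
    if key not in CACHE:
        time.sleep(0.05)          # simulated slow backend fetch on a cache miss
        CACHE[key] = b"image-bytes"
    return CACHE[key]

latencies = []
for _ in range(1000):
    start = time.perf_counter()
    fetch_image("cat.jpg")
    latencies.append(time.perf_counter() - start)

# "Start measuring immediately" folds the one cold-cache miss into the mean;
# discarding a warm-up period reveals the steady-state latency instead.
mean_all = sum(latencies) / len(latencies)
mean_warm = sum(latencies[1:]) / len(latencies[1:])
print(f"mean with cold start: {mean_all * 1000:.3f} ms")
print(f"mean after warm-up:   {mean_warm * 1000:.3f} ms")
```

The same reasoning applies to JIT warm-up, connection pools, and OS page caches: decide deliberately whether cold-start behavior is part of what you are measuring.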
37. "Programmers waste enormous amounts of time thinking about … the speed of noncritical parts of their programs ... Forget about small efficiencies … 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." -- Donald Knuth
38–41. Wrong About What Matters
• Premature optimization
• Unrepresentative workloads
• Memory pressure
• Load balancing
• Reproducibility of measurements
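On the reproducibility point: a single benchmark number carries no error bar, so it cannot distinguish a real regression from run-to-run noise. A minimal sketch (the synthetic `run_benchmark` stands in for a full benchmark run):

```python
import random
import statistics

random.seed(0)

def run_benchmark():
    """Stand-in for one complete benchmark run; returns a mean latency in ms."""
    return statistics.mean(random.gauss(20, 4) for _ in range(100))

# Repeating the run and reporting the spread makes it possible to tell
# whether a "10% improvement" is signal or noise.
runs = [run_benchmark() for _ in range(10)]
mean = statistics.mean(runs)
stdev = statistics.stdev(runs)
print(f"mean of runs: {mean:.2f} ms  (stdev {stdev:.2f} ms over {len(runs)} runs)")
```

If the claimed improvement is smaller than the run-to-run standard deviation, the benchmark has not demonstrated anything yet.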
50–51. Microbenchmarking: Blessing & Curse
• Choose your N wisely
• Measure side effects
• Beware of clock resolution
• Dead Code Elimination
• Constant work per iteration
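The clock-resolution and choose-your-N points can be sketched in a few lines (the `op` function and `N` are arbitrary placeholders; in a JIT-compiled language you would also need a result sink to defeat dead-code elimination, which CPython does not perform):

```python
import time

def op():
    # Constant work per iteration: same input every call.
    return sum(range(100))

# Timing a single call is dominated by timer resolution and call overhead;
# batching N iterations and dividing amortizes both, which is why choosing
# N wisely matters (timeit does this internally).
start = time.perf_counter()
op()
single = time.perf_counter() - start

N = 100_000
start = time.perf_counter()
sink = 0
for _ in range(N):
    sink += op()   # accumulate the result so the work cannot be skipped
batched = (time.perf_counter() - start) / N

print(f"single-shot estimate: {single * 1e9:.0f} ns")
print(f"batched estimate:     {batched * 1e9:.0f} ns")
```

Note the tension: a larger N smooths out clock-resolution error but also hides per-call variance, and if the work per iteration is not constant (e.g. a growing data structure), dividing by N measures nothing meaningful.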
53. Follow-up Material
• How NOT to Measure Latency by Gil Tene
  – http://www.infoq.com/presentations/latency-pitfalls
• Taming the Long Latency Tail on highscalability.com
  – http://highscalability.com/blog/2012/3/12/google-taming-the-long-latency-tail-when-more-machines-equal.html
• Performance Analysis Methodology by Brendan Gregg
  – http://www.brendangregg.com/methodology.html
• Silverman's Mode Detection Method by Matt Adereth
  – http://adereth.github.io/blog/2014/10/12/silvermans-mode-detection-method-explained/