Benchmarking: You’re Doing It Wrong
Aysylu Greenberg
@aysylu22

To Write Good Benchmarks… Need to be Full Stack

Benchmark = How Fast?
your process vs Goal
your process vs Best Practices
Today
• How Not to Write Benchmarks
• Benchmark Setup & Results:
- You’re wrong about machines
- You’re wrong about stats
- You’re wrong about what matters
• Becoming Less Wrong
• Having Fun with Riak
HOW NOT TO WRITE BENCHMARKS

Website Serving Images
• Access 1 image 1000 times
• Latency measured for each access
• Start measuring immediately
• 3 runs
• Find mean
• Dev environment
[Diagram: Web Request, Server, Cache, S3]

WHAT’S WRONG WITH THIS BENCHMARK?

YOU’RE WRONG ABOUT THE MACHINE
Wrong About the Machine
• Cache, cache, cache, cache!

It’s Caches All The Way Down
[Diagram: Web Request, Server, Cache, S3]

Caches in Benchmarks
[Figures credited to Prof. Saman Amarasinghe, MIT 2009]
Wrong About the Machine
• Cache, cache, cache, cache!
• Warmup & timing
• Periodic interference
• Test != Prod
• Power mode changes
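One way to act on the warmup and timing point is to run some unmeasured requests first, so caches are populated, and only then record per-request latency. The following is a minimal Python sketch, not code from the talk: `fetch`, `warmup`, and `samples` are hypothetical names, the default counts are arbitrary, and the clock is injectable only so the sketch can be checked deterministically.

```python
import time
from typing import Callable, List


def measure_latencies(
    fetch: Callable[[], object],   # hypothetical: performs one request
    warmup: int = 100,             # assumed warmup count; tune per system
    samples: int = 1000,           # assumed number of measured requests
    clock: Callable[[], int] = time.perf_counter_ns,
) -> List[int]:
    """Return per-call latencies in nanoseconds, excluding warmup calls."""
    # Warmup phase: exercise the system so caches fill, but record nothing.
    for _ in range(warmup):
        fetch()

    # Measurement phase: one latency sample per call.
    latencies = []
    for _ in range(samples):
        start = clock()
        fetch()
        latencies.append(clock() - start)
    return latencies
```

Keeping the raw per-call samples, rather than a single total, leaves room to look at the distribution afterwards instead of only a mean.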
YOU’RE WRONG ABOUT THE STATS

Wrong About Stats
• Too few samples

Wrong About Stats
[Figure: Convergence of Median on Samples. Axes: Latency vs. Time. Series: Stable Samples, Stable Median, Decaying Samples, Decaying Median]
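The convergence figure suggests a simple check for "too few samples": watch how the median moves as samples accumulate. Below is a rough Python sketch under assumptions of my own: the `window` and `tolerance` defaults are arbitrary illustrations, not values from the talk.

```python
from statistics import median
from typing import List, Sequence


def running_median(samples: Sequence[float]) -> List[float]:
    """Median of the first k samples, for k = 1..len(samples)."""
    return [median(samples[:k]) for k in range(1, len(samples) + 1)]


def has_converged(
    samples: Sequence[float],
    window: int = 10,         # assumed: how many recent medians to compare
    tolerance: float = 0.01,  # assumed: allowed relative spread (1%)
) -> bool:
    """True if the running median stayed within `tolerance` over the last `window` samples."""
    meds = running_median(samples)
    if len(meds) < window:
        return False
    recent = meds[-window:]
    lo, hi = min(recent), max(recent)
    return hi - lo <= tolerance * abs(hi)
```

A stable series such as `[10] * 20` passes the check, while a steadily decaying series does not, which mirrors the two cases plotted on the slide.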
Multimodal Distribution
[Figure: # occurrences vs. Latency; labels: 50%, 99%, 5 ms, 10 ms]

Wrong About Stats
• Too few samples
• Gaussian (not)
• Multimodal distribution
• Outliers
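To see why a mean can mislead on non-Gaussian latency data, here is a small Python sketch that compares the mean with nearest-rank percentiles. The data is made up for illustration (two modes at 5 ms and 10 ms plus one outlier), and `percentile` is a helper written for this sketch, not a library call.

```python
import math
from statistics import mean
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up bimodal data: 60 fast requests, 39 slow requests, one outlier.
latencies_ms = [5.0] * 60 + [10.0] * 39 + [500.0]

summary = {
    "mean": mean(latencies_ms),
    "p50": percentile(latencies_ms, 50),
    "p99": percentile(latencies_ms, 99),
    "max": max(latencies_ms),
}
```

In this made-up data set the single outlier pulls the mean above the 99th percentile, so the mean describes none of the actual requests; reporting the median, high percentiles, and the maximum side by side avoids that.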
YOU’RE WRONG ABOUT WHAT MATTERS

Wrong About What Matters
• Premature optimization

“Programmers waste enormous amounts of time thinking about … the speed of noncritical parts of their programs ... Forget about small efficiencies … 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
-- Donald Knuth
Wrong About What Matters
• Premature optimization
• Unrepresentative workloads
• Memory pressure
• Load balancing
• Reproducibility of measurements
BECOMING LESS WRONG

User Actions Matter
X > Y for workload Z with trade offs A, B, and C
- http://www.toomuchcode.org/

Profiling
Code instrumentation
Aggregate over logs
Traces
Microbenchmarking: Blessing & Curse
+ Quick & cheap
+ Answers narrow ?s well
- Often misleading results
- Not representative of the program

Choose Your N Wisely
[Figure credited to Prof. Saman Amarasinghe, MIT 2009]

Microbenchmarking: Blessing & Curse
• Choose your N wisely
• Measure side effects
• Beware of clock resolution
• Dead Code Elimination
• Constant work per iteration

Non-Constant Work Per Iteration
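Several of the microbenchmarking points can be sketched together in Python: estimate the clock's resolution, time a batch of N iterations so the total is well above that resolution, and consume each result so the work cannot be discarded as dead code. This is an illustrative sketch, not the talk's code: `work` is a hypothetical function under test, and the clock is injectable only so the sketch can be checked deterministically. CPython performs little dead code elimination, but JIT-compiled runtimes may, so the habit carries over.

```python
import time
from typing import Callable, Tuple


def clock_resolution_ns(
    clock: Callable[[], int] = time.perf_counter_ns,
    probes: int = 1000,
) -> int:
    """Smallest non-zero step observed between consecutive clock reads."""
    smallest = None
    for _ in range(probes):
        t0 = clock()
        t1 = clock()
        while t1 == t0:  # spin until the clock visibly advances
            t1 = clock()
        delta = t1 - t0
        if smallest is None or delta < smallest:
            smallest = delta
    return smallest


def time_per_iteration_ns(
    work: Callable[[int], int],  # hypothetical function under test
    n: int,
    clock: Callable[[], int] = time.perf_counter_ns,
) -> Tuple[float, int]:
    """Time n calls as one batch; return (ns per call, accumulated result)."""
    sink = 0
    start = clock()
    for i in range(n):
        sink += work(i)  # consume the result so the call has an observable effect
    elapsed = clock() - start
    return elapsed / n, sink
```

The per-call figure is only meaningful if `work(i)` costs roughly the same for every `i`; if the cost grows with `i`, the average hides non-constant work per iteration. Choosing `n` so that the batch lasts many multiples of `clock_resolution_ns()` keeps the clock's granularity from dominating the result.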
Follow-up Material
• How NOT to Measure Latency by Gil Tene
– http://www.infoq.com/presentations/latency-pitfalls
• Taming the Long Latency Tail on highscalability.com
– http://highscalability.com/blog/2012/3/12/google-taming-the-long-latency-tail-when-more-machines-equal.html
• Performance Analysis Methodology by Brendan Gregg
– http://www.brendangregg.com/methodology.html
• Silverman’s Mode Detection Method by Matt Adereth
– http://adereth.github.io/blog/2014/10/12/silvermans-mode-detection-method-explained/
HAVING FUN WITH RIAK

Setup
• SSD 30 GB
• M3 large
• Riak version 1.4.2-0-g61ac9d8
• Ubuntu 12.04.5 LTS
• 4 byte keys, 10 KB values

Get Latency
[Figure: Latency (usec), ticks from 1850 to 2350, vs. Number of Keys; label: L3]

Takeaway #1: Cache
Takeaway #2: Outliers
Takeaway #3: Workload

Benchmarking (RICON 2014)

  • 1. Benchmarking: You’re Doing It Wrong Aysylu Greenberg @aysylu22
  • 2.
  • 3. To Write Good Benchmarks… Need to be Full Stack
  • 4. Benchmark = How Fast? your process vs Goal your process vs Best Practices
  • 5. Today • How Not to Write Benchmarks • Benchmark Setup & Results: - You’re wrong about machines - You’re wrong about stats - You’re wrong about what matters • Becoming Less Wrong • Having Fun with Riak
  • 6. HOW NOT TO WRITE BENCHMARKS
  • 7. Website Serving Images • Access 1 image 1000 times • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 8. WHAT’S WRONG WITH THIS BENCHMARK?
  • 9. YOU’RE WRONG ABOUT THE MACHINE
  • 10. Wrong About the Machine • Cache, cache, cache, cache!
  • 11. It’s Caches All The Way Down Web Request Server Cache S3
  • 12. It’s Caches All The Way Down
  • 13. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 14. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 15. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 16. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 17. Caches in Benchmarks Prof. Saman Amarasinghe, MIT 2009
  • 18. Website Serving Images • Access 1 image 1000 times • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 19. Wrong About the Machine • Cache, cache, cache, cache! • Warmup & timing
  • 20. Website Serving Images • Access 1 image 1000 times • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 21. Wrong About the Machine • Cache, cache, cache, cache! • Warmup & timing • Periodic interference
  • 22. Website Serving Images • Access 1 image 1000 times • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 23. Wrong About the Machine • Cache, cache, cache, cache! • Warmup & timing • Periodic interference • Test != Prod
  • 24. Website Serving Images • Access 1 image 1000 times • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev environment Web Request Server Cache S3
  • 25. Wrong About the Machine • Cache, cache, cache, cache! • Warmup & timing • Periodic interference • Test != Prod • Power mode changes
  • 26. YOU’RE WRONG ABOUT THE STATS
  • 27. Wrong About Stats • Too few samples
  • 28. Wrong About Stats [Figure: Convergence of Median on Samples. Axes: Latency vs. Time. Series: Stable Samples, Stable Median, Decaying Samples, Decaying Median]
  • 29. Website Serving Images • Access 1 image 1000 times • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev machine Web Request Server Cache S3
  • 30. Wrong About Stats • Too few samples • Gaussian (not)
  • 31. Website Serving Images • Access 1 image 1000 times • Latency measured for each access • Start measuring immediately • 3 runs • Find mean • Dev machine Web Request Server Cache S3
  • 32. Wrong About Stats • Too few samples • Gaussian (not) • Multimodal distribution
  • 33. Multimodal Distribution [Figure: # occurrences vs. Latency. Marks: 50%, 99%, 5 ms, 10 ms]
  • 34. Wrong About Stats • Too few samples • Gaussian (not) • Multimodal distribution • Outliers
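Why "Find mean" misleads on a multimodal distribution can be shown with a small worked example. The data below is made up for illustration (90 requests at 5 ms, 10 at 10 ms), and the nearest-rank `percentile` helper is an assumption of this sketch, not a method named in the talk.

```python
import math
from statistics import mean

def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

# Illustrative bimodal latencies: a fast mode and a slow mode.
latencies_ms = [5.0] * 90 + [10.0] * 10

avg = mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

print(f"mean={avg} ms, p50={p50} ms, p99={p99} ms")
# The mean (5.5 ms) is a latency that no request actually had.
```

Reporting the median together with a high percentile describes both modes, while the mean describes neither.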
  • 35. YOU’RE WRONG ABOUT WHAT MATTERS
  • 36. Wrong About What Matters • Premature optimization
  • 37. “Programmers waste enormous amounts of time thinking about … the speed of noncritical parts of their programs ... Forget about small efficiencies …97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” -- Donald Knuth
  • 38. Wrong About What Matters • Premature optimization • Unrepresentative workloads
  • 39. Wrong About What Matters • Premature optimization • Unrepresentative workloads • Memory pressure
  • 40. Wrong About What Matters • Premature optimization • Unrepresentative workloads • Memory pressure • Load balancing
  • 41. Wrong About What Matters • Premature optimization • Unrepresentative workloads • Memory pressure • Load balancing • Reproducibility of measurements
  • 43. User Actions Matter X > Y for workload Z with trade offs A, B, and C - http://www.toomuchcode.org/
  • 44. Profiling Code instrumentation Aggregate over logs Traces
  • 45. Microbenchmarking: Blessing & Curse + Quick & cheap + Answers narrow questions well - Often misleading results - Not representative of the program
  • 46. Microbenchmarking: Blessing & Curse • Choose your N wisely
  • 47. Choose Your N Wisely Prof. Saman Amarasinghe, MIT 2009
  • 48. Microbenchmarking: Blessing & Curse • Choose your N wisely • Measure side effects
  • 49. Microbenchmarking: Blessing & Curse • Choose your N wisely • Measure side effects • Beware of clock resolution
  • 50. Microbenchmarking: Blessing & Curse • Choose your N wisely • Measure side effects • Beware of clock resolution • Dead Code Elimination
  • 51. Microbenchmarking: Blessing & Curse • Choose your N wisely • Measure side effects • Beware of clock resolution • Dead Code Elimination • Constant work per iteration
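Several of the microbenchmarking cautions above can be sketched together. The `work` function and the iteration count are hypothetical choices for illustration. CPython itself does not eliminate unused results, so the `sink` accumulator only demonstrates the habit that matters under optimizing compilers and JITs.

```python
import time

# Clock resolution: one call shorter than this cannot be timed reliably alone.
resolution = time.get_clock_info("perf_counter").resolution

def time_batch(fn, n):
    """Time n calls as one batch; return (seconds per call, accumulated result)."""
    sink = 0
    start = time.perf_counter()
    for i in range(n):
        sink += fn(i)  # consume the result so the work cannot be discarded
    elapsed = time.perf_counter() - start
    return elapsed / n, sink

def work(i):
    """Hypothetical operation doing roughly the same work for every i."""
    return (i * i) % 7

per_call, sink = time_batch(work, 100_000)
print(f"clock resolution: {resolution:.1e} s, per call: {per_call:.1e} s")
```

Timing a batch of N calls and dividing keeps the measured interval well above the clock resolution, at the cost of hiding per-call variation. The per-call figure also includes loop overhead.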
  • 53. Follow-up Material • How NOT to Measure Latency by Gil Tene – http://www.infoq.com/presentations/latency-pitfalls • Taming the Long Latency Tail on highscalability.com – http://highscalability.com/blog/2012/3/12/google-taming-the-long-latency-tail-when-more-machines-equal.html • Performance Analysis Methodology by Brendan Gregg – http://www.brendangregg.com/methodology.html • Silverman’s Mode Detection Method by Matt Adereth – http://adereth.github.io/blog/2014/10/12/silvermans-mode-detection-method-explained/
  • 55. Setup • SSD 30 GB • M3 large • Riak version 1.4.2-0-g61ac9d8 • Ubuntu 12.04.5 LTS • 4 byte keys, 10 KB values
  • 56. [Figure: Get Latency. Axes: Latency (usec) vs. Number of Keys. Annotation: L3]
  • 60. Benchmarking: You’re Doing It Wrong Aysylu Greenberg @aysylu22