Beyond The Numbers
         Baron Schwartz
Who Am I?

            ●   baron@percona.com
            ●   @xaprb
            ●   linkedin.com/in/xaprb
            ●   xaprb.com/blog
Who Am I?

●   Maatkit                ●   Percona Toolkit
●   Innotop                ●   Monitoring Plugins
●   Aspersa                ●   Online Tools
●   JavaScript Libraries
●   Consulting
●   Support
●   Remote DBA
●   Engineering
●   Conferences & Training
●   Percona Server
●   Percona XtraBackup
●   Percona XtraDB Cluster
●   Percona Toolkit
●   Many More
Today's Agenda

●   Benchmarks
●   Aggregation and Distributions
●   Performance, Capacity & Utilization
●   Rules of Thumb
●   Queueing Theory and Scalability
Benchmarks
What's Missing?

                  ●   Distribution
                  ●   Time Series
                  ●   Response Times
                  ●   Parameters
                  ●   Goals
                  ●   System Specs
What's Misleading?

                 ●   Logarithmic X-Axis
                 ●   Interpolation
What's Good?

               ●   Y-Axis Reaches 0
               ●   No Fake-Smoothing
Behind a Single Dot
Look At All That Data...
What's With The Grid Lines?!?!?
Better Benchmarks

●   What does an ideal benchmark report look like?
Clear Benchmark Goals

●   Validating hardware configuration
●   Comparing two systems
●   Checking for regressions
●   Capacity planning
●   Reproducing bad behavior to solve it
●   Stress-testing to find bottlenecks
Hardware and Software

●   Specs for CPU, disk, memory, network
●   Software versions (OS, SUT, benchmark)
●   Filesystem, RAID controller
●   Disk queue scheduler
Presenting Results

●   Ideally, make raw results available
●   Include metrics from OS (CPU, RAM, IO,
    network)
●   Generate some plots to summarize
    ●   This is where the rubber meets the road!
Better Aggregate Measures

●   Average
●   Percentiles
    ●   95th
    ●   99th
●   Maximum
●   Observation Duration
    ●   Question: how bad can 95th percentile be?
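
The aggregate measures listed above can be computed from raw response times with a few lines of code. The following is a minimal sketch, not taken from the talk: the nearest-rank percentile method, the function names, and the sample data are all assumptions chosen for illustration. It also suggests one possible reading of the question about the 95th percentile: the figure says nothing about the slowest 5% of requests.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(samples):
    """Average, 95th and 99th percentile, and maximum of the samples."""
    return {
        "count": len(samples),
        "average": sum(samples) / len(samples),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }

# Hypothetical response times in milliseconds: 95 fast requests, 5 very slow.
response_ms = [2.0] * 95 + [500.0] * 5
summary = summarize(response_ms)
print(summary)
```

With this made-up data the 95th percentile equals the fastest response time, while the average, 99th percentile, and maximum all reveal the slow requests.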
More Aggregate Measures

●   Median (50th Percentile)
●   Standard Deviation
●   Index of Dispersion
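
As a rough illustration of why these extra measures matter, the sketch below computes them with Python's standard `statistics` module. It assumes the usual definition of the index of dispersion as variance divided by mean; the two sample series are hypothetical and were chosen so that both have the same average.

```python
import statistics

def dispersion_summary(samples):
    """Median, population standard deviation, and index of dispersion
    (variance divided by mean) of the samples."""
    mean = statistics.mean(samples)
    variance = statistics.pvariance(samples)
    return {
        "mean": mean,
        "median": statistics.median(samples),
        "stddev": statistics.pstdev(samples),
        "index_of_dispersion": variance / mean,
    }

# Two hypothetical series with the same average but very different spread.
steady = dispersion_summary([10.0, 10.0, 10.0, 10.0])
bursty = dispersion_summary([1.0, 1.0, 1.0, 37.0])
print(steady)
print(bursty)
```

Both series average 10, but only the median and the dispersion measures show that the second one is dominated by a single outlier.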
Better...
Better Still...
Keep It Coming...
Throughput AND Response Time
Performance

●   What is Performance?
●   Two Metrics
    ●   Response Time (time per task)
    ●   Throughput (tasks per time)
●   They're not reciprocals
    ●   More on this later
What Performance Isn't

●   CPU Usage
●   Load Average
●   Other metrics of resource consumption
Performance

●   I often focus on response time
    ●   It represents user experience
    ●   Throughput indicates capacity rather than
        performance
●   For benchmarking, throughput is primary
Utilization

●   The portion of time during which the
    resource is busy
    ●   i.e. there is at least one thing in progress
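
The definition above can be made concrete with a small sketch. It assumes we have the start and end time of every request (hypothetical data, not from the talk) and measures the fraction of an observation window during which at least one request was in progress. Overlapping requests are merged so that busy time is not counted twice.

```python
def utilization(busy_intervals, window_start, window_end):
    """Fraction of the window during which at least one request is in progress.

    busy_intervals is an iterable of (start, end) times; overlaps are merged.
    """
    busy = 0.0
    current_start = current_end = None
    for start, end in sorted(busy_intervals):
        # Clip each request to the observation window.
        start, end = max(start, window_start), min(end, window_end)
        if start >= end:
            continue
        if current_end is None or start > current_end:
            # A gap: close the previous busy period and open a new one.
            if current_end is not None:
                busy += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the current busy period.
            current_end = max(current_end, end)
    if current_end is not None:
        busy += current_end - current_start
    return busy / (window_end - window_start)

# Hypothetical requests (seconds); the first two overlap.
requests = [(0.0, 0.3), (0.2, 0.5), (0.7, 0.8)]
u = utilization(requests, 0.0, 1.0)
print(round(u, 3))
```

Note that the two overlapping requests contribute 0.5 seconds of busy time, not 0.6, which is why utilization alone does not reveal concurrency.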
Utilization is Confusing

●   Be very careful with tools that report
    utilization
●   From the Linux iostat man page:
    ●   “%util: Percentage of CPU time during which
        I/O requests were issued to the device
        (bandwidth utilization for the device). Device
        saturation occurs when this value is close to
        100%.”
●   Can you parse that? Is it true?
Capacity

●   What is Capacity?
Capacity – My Definition

 Capacity is the maximum throughput
 ... at achievable concurrency
 ... with acceptable performance
 ... as defined by response time
 ... meeting specified constraints
 ... over specified observation intervals.
Capacity Example

●   What is the capacity of the system at a
    concurrency of 32, with the 95th-percentile
    response time in every 10-second interval
    not to exceed 2ms, over a 60-minute duration?
●   To determine this, we need goal-seeking
    benchmark software
    ●   Most benchmark software can't do this
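
Without goal-seeking software, one workable approximation is to run the benchmark several times at different load levels and pick the best run that still meets the constraint. The sketch below assumes each run records its throughput and a 95th-percentile response time per measurement interval; the field names, the `capacity` function, and all numbers are hypothetical.

```python
def capacity(runs, p95_limit_ms):
    """Highest throughput among the runs in which every interval's
    95th-percentile response time stayed within the limit.
    Returns None if no run qualifies."""
    passing = [
        run["throughput"]
        for run in runs
        if max(run["interval_p95_ms"]) <= p95_limit_ms
    ]
    return max(passing) if passing else None

# Hypothetical results from three runs at a concurrency of 32.
runs = [
    {"throughput": 1000, "interval_p95_ms": [1.1, 1.2, 1.3]},
    {"throughput": 2000, "interval_p95_ms": [1.5, 1.9, 1.8]},
    {"throughput": 3000, "interval_p95_ms": [1.9, 2.6, 2.1]},
]
result = capacity(runs, p95_limit_ms=2.0)
print(result)
```

The third run has the highest throughput but breaks the 2ms limit in one interval, so under this definition it does not count toward capacity.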
Benchmarks, etc. Recap

●   Most benchmarks reveal very little
●   Benchmark reports reveal even less
●   It's good to go beyond the surface
Amdahl's Law

●   “The speedup of a program using multiple
    processors in parallel computing is limited
    by the time needed for the sequential
    fraction of the program.” - Wikipedia
●   It's basically a law of diminishing returns.
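
The diminishing returns are easy to see numerically. This sketch uses the standard form of Amdahl's Law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of processors. The 90% parallel fraction is a made-up example value.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Speedup predicted by Amdahl's Law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

# Hypothetical program in which 90% of the work can be parallelized.
for n in (1, 2, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

However many processors are added, the speedup in this example can never reach 10, because the serial 10% always remains.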
Should I Defragment My Disk?

●   Method 1: Google “defragment”
●   Method 2: Try it and see
●   Method 3: Measure if the disk is a
    bottleneck
Spolsky -vs- Millsap
Amdahl's Law

●   Don't try to optimize little things.
Little's Law

●   N = XR
●   That is,
    ●   Concurrency = Throughput * Response Time
●   This holds regardless of queueing, arrival
    rate distribution, response time
    distribution, etc.
Little's Law Example

●   If disk IOs average 4ms...
●   And there are 280 IOs per second...
●   Then the disk's average concurrency is:
    ●   N = 280 * .004
    ●   N = 1.12
●   Do you believe this?
    ●   When might it not be true?
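
The arithmetic above, written out as a small sketch. The function names are illustrative only. Rearranging the law as X = N / R also shows why throughput and response time are reciprocals only when concurrency is exactly 1.

```python
def concurrency(throughput_per_sec, response_time_sec):
    """Little's Law: N = X * R."""
    return throughput_per_sec * response_time_sec

def throughput(concurrency_level, response_time_sec):
    """Little's Law rearranged: X = N / R."""
    return concurrency_level / response_time_sec

n = concurrency(280, 0.004)
print(round(n, 2))

# Same 4ms response time, different concurrency, different throughput.
print(round(throughput(1, 0.004)))
print(round(throughput(8, 0.004)))
```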
Little's Law Example #2

●   If disk utilization is 98%
●   And there are 280 IOs per second
●   What do we know?
Utilization Law

●   U = SX
    ●   Also independent of distributions, etc...
●   That is,
    ●   Utilization = Service Time * Throughput
●   Utilization = 98% and Throughput = 280
    ●   S = U/X
    ●   Service Time = .98 / 280 = .0035 seconds (3.5ms)
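
The same calculation as a short sketch, with illustrative function names. The numbers are the ones from the example: 98% utilization at 280 IOs per second.

```python
def service_time(utilization, throughput_per_sec):
    """Utilization Law rearranged: S = U / X."""
    return utilization / throughput_per_sec

def utilization(service_time_sec, throughput_per_sec):
    """Utilization Law: U = S * X."""
    return service_time_sec * throughput_per_sec

s = service_time(0.98, 280)
print(round(s * 1000, 2), "ms")

# Round trip: the computed service time reproduces the utilization.
print(round(utilization(s, 280), 2))
```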
Queueing Theory

●   How can we predict the amount of
    queueing in a system?
●   How can we predict its response times?
●   How can we predict capacity?
Erlang Queueing

●   Erlang's formulas model the probability of
    queueing for a given arrival rate, service
    time, and number of servers.
●   A “server” is anything capable of serving
    a request.
    ●   CPUs
    ●   Disks
CPU -vs- Disk Queueing

●   Scenario: 4-CPU, 4-disk (RAID0) server
●   Thought experiment:
    ●   How do processes queue for CPU?
    ●   How do I/O requests queue on disks?
Notation

●   Typically see something like M/M/1
●   Each letter is a placeholder in A/S/n
    ●   A = Arrival distribution
    ●   S = Service-time distribution
    ●   n = Number of servers
●   A and S can be one of:
    ●   Markov
    ●   Deterministic
    ●   General
CPUs -vs- Disks

●   CPUs: M/M/4
●   Disks: 4 x {M/M/1}
M/M/1 Queueing

[Diagram; source: cmg.org]
M/M/n Queueing

[Diagram; source: cmg.org]
Erlang C Function

●   M/M/n queueing is modeled by Erlang C
    ●   See http://en.wikipedia.org/wiki/Erlang_(unit)
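
For readers who want to experiment, here is a minimal sketch of the Erlang C formula and the mean response time it implies for an M/M/n queue. The function names and the workload (4ms service time, 800 requests per second in total) are assumptions for illustration. The example compares one shared queue feeding four servers (M/M/4) with four separate single-server queues (4 x M/M/1) at the same per-server utilization.

```python
import math

def erlang_c(servers, offered_load):
    """Probability that an arriving request must wait in an M/M/n queue.

    offered_load is arrival rate times service time (in erlangs) and
    must be smaller than the number of servers.
    """
    if offered_load >= servers:
        raise ValueError("unstable: offered load must be below server count")
    rho = offered_load / servers
    top = offered_load ** servers / math.factorial(servers) / (1.0 - rho)
    bottom = sum(offered_load ** k / math.factorial(k) for k in range(servers))
    return top / (bottom + top)

def mean_response_time(servers, arrival_rate, service_time):
    """Mean response time (service plus queueing) for an M/M/n queue."""
    load = arrival_rate * service_time
    wait = erlang_c(servers, load) * service_time / (servers - load)
    return service_time + wait

S = 0.004             # hypothetical service time: 4ms
TOTAL_RATE = 800.0    # hypothetical total arrival rate per second

shared = mean_response_time(4, TOTAL_RATE, S)        # one queue, 4 servers
separate = mean_response_time(1, TOTAL_RATE / 4, S)  # each of 4 queues
print(round(shared * 1000, 2), "ms vs", round(separate * 1000, 2), "ms")
```

At 80% utilization in this made-up case, the shared queue responds in roughly 7ms while each separate queue averages 20ms, which is the kind of difference the CPU versus disk thought experiment is pointing at.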
What's Wrong With Erlang C?

●   You must validate your arrival times.
●   You must validate your service times.
●   The equation is hard to work with.
●   In practice, it's hard to use Erlang C.
Scalability

●   Queueing causes non-linear scaling.
●   But first, let's talk about linearity.
System Scalability
[Plot: Throughput (y-axis) vs. Concurrency (x-axis), annotated "Why?"]
Universal Scalability Law
[Plot: Throughput (y-axis) vs. Concurrency (x-axis), comparing Linear, Amdahl, and USL curves]
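
The three curves named on this slide can be reproduced from Neil Gunther's Universal Scalability Law as it is commonly written: X(N) = λN / (1 + σ(N - 1) + κN(N - 1)). Setting κ to zero gives Amdahl scaling, and setting both σ and κ to zero gives linear scaling. The sketch below uses hypothetical coefficients; in practice they would be fitted to measured data.

```python
import math

def usl_throughput(n, lam, sigma=0.0, kappa=0.0):
    """Throughput at concurrency n under the Universal Scalability Law.

    lam   : throughput at concurrency 1
    sigma : contention (serialization) coefficient
    kappa : coherency (crosstalk) coefficient
    """
    return lam * n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))

def usl_peak_concurrency(sigma, kappa):
    """Concurrency at which USL throughput peaks (requires kappa > 0)."""
    return math.sqrt((1.0 - sigma) / kappa)

# Hypothetical coefficients, for illustration only.
LAM, SIGMA, KAPPA = 1000.0, 0.05, 0.001

for n in (1, 8, 32, 64):
    linear = usl_throughput(n, LAM)
    amdahl = usl_throughput(n, LAM, sigma=SIGMA)
    usl = usl_throughput(n, LAM, sigma=SIGMA, kappa=KAPPA)
    print(n, round(linear), round(amdahl), round(usl))

print("peak near N =", round(usl_peak_concurrency(SIGMA, KAPPA), 1))
```

Unlike the Amdahl curve, which only flattens, the USL curve reaches a peak and then declines as concurrency keeps growing.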
Amdahl Scalability
USL Scalability
USL Scalability Modeling
USL Performance Modeling
Scalability Limitations

●   Locks
●   Synchronization points
●   Shared resources
●   Duplicated data to be kept in sync
●   Weakest-link problems
RAID10 On EBS

●   Which is faster?
    ●   RAID 10 over 10 EBS volumes
    ●   RAID 10 over 20 EBS volumes
●   Hint: http://goo.gl/Xm92Y
    ●   Also, http://goo.gl/fAEIL
Debunking “Linear”

●   Ask to see the actual numbers.
    ●   They shouldn't be rounded off suspiciously.
    ●   They must be truly linear.
    ●   They must intersect the point (0, 0).
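
One way to apply these checks to published numbers is to fit a straight line through the origin and see how far the points stray from it. This is a rough sketch under that assumption, not a method from the talk; the function names and both data sets are hypothetical.

```python
def fit_through_origin(points):
    """Least-squares slope k for the line y = k * x, forced through (0, 0)."""
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    return sum_xy / sum_xx

def max_relative_deviation(points):
    """Largest relative gap between a point and the fitted line."""
    k = fit_through_origin(points)
    return max(abs(y - k * x) / (k * x) for x, y in points if x > 0)

# Hypothetical (concurrency, throughput) pairs.
truly_linear = [(1, 1000), (2, 2000), (4, 4000), (8, 8000)]
claimed_linear = [(1, 1000), (2, 1950), (4, 3700), (8, 6600), (16, 10400)]

print(round(max_relative_deviation(truly_linear), 3))
print(round(max_relative_deviation(claimed_linear), 3))
```

A result near zero is consistent with linear scaling through the origin. A large value, as in the second data set, means the "linear" claim does not survive a look at the actual numbers.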
Debunking, Example #1
Is it Linear?
It's Not Linear
Resources

●   Naomi Robbins' Blog
    ●   http://blogs.forbes.com/naomirobbins/
●   Percona White Papers
    ●   http://www.percona.com/
●   Neil J. Gunther
    ●   Guerrilla Capacity Planning
●   http://www.contextneeded.com/
Questions?
baron@percona.com
           @xaprb

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 

Kürzlich hochgeladen (20)

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 

Benchmarks, Performance, Scalability, and Capacity: What's Behind the Numbers (presentation transcript)

  • 1. Beyond The Numbers Baron Schwartz
  • 2. Who Am I? ● baron@percona.com ● @xaprb ● linkedin.com/in/xaprb ● xaprb.com/blog
  • 3. Who Am I? ● Maatkit ● Percona Toolkit ● Innotop ● Monitoring Plugins ● Aspersa ● Online Tools ● JavaScript Libraries
  • 4. Consulting ● Percona Server ● Support ● Percona XtraBackup ● Remote DBA ● Percona XtraDB Cluster ● Engineering ● Percona Toolkit ● Conferences & Training ● Many More
  • 5. Today's Agenda ● Benchmarks ● Aggregation and Distributions ● Performance, Capacity & Utilization ● Rules of Thumb ● Queueing Theory and Scalability
  • 7. What's Missing? ● Distribution ● Time Series ● Response Times ● Parameters ● Goals ● System Specs
  • 8. What's Misleading? ● Logarithmic X-Axis ● Interpolation
  • 9. What's Good? ● Y-Axis Reaches 0 ● No Fake-Smoothing
  • 11. Look At All That Data...
  • 12. What's With The Grid Lines?!?!?
  • 13. Better Benchmarks What does an ideal benchmark report look like?
  • 14. Clear Benchmark Goals ● Validating hardware configuration ● Comparing two systems ● Checking for regressions ● Capacity planning ● Reproducing bad behavior to solve it ● Stress-testing to find bottlenecks
  • 15. Hardware and Software ● Specs for CPU, disk, memory, network ● Software versions (OS, SUT, benchmark) ● Filesystem, RAID controller ● Disk queue scheduler
  • 16. Presenting Results ● Ideally, make raw results available ● Include metrics from OS (CPU, RAM, IO, network) ● Generate some plots to summarize ● This is where the rubber meets the road!
  • 17. Better Aggregate Measures ● Average ● Percentiles (95th, 99th) ● Maximum ● Observation Duration ● Question: how bad can 95th percentile be?
  • 18. More Aggregate Measures ● Median (50th Percentile) ● Standard Deviation ● Index of Dispersion
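A minimal sketch of the aggregate measures listed on slides 17 and 18, assuming response times are available as a plain list of numbers. The `percentile` and `summarize` helpers and the sample data are hypothetical, and the nearest-rank percentile definition is only one of several in common use.

```python
import math
import statistics

def percentile(sorted_values, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]

def summarize(response_times):
    """Return the aggregate measures named on the slides for one observation interval."""
    values = sorted(response_times)
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values)  # population variance of the sample
    return {
        "average": mean,
        "median": percentile(values, 50),
        "p95": percentile(values, 95),
        "p99": percentile(values, 99),
        "maximum": values[-1],
        "stddev": math.sqrt(variance),
        "index_of_dispersion": variance / mean,  # variance-to-mean ratio
    }

# Hypothetical sample (milliseconds): 95 fast requests and 5 very slow ones.
sample = [2.0] * 95 + [500.0] * 5
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this made-up sample the 95th percentile still reports the fast value while one request in twenty takes 500 ms, which is one way to read the question on slide 17.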
  • 23. Performance ● What is Performance? ● Two Metrics ● Response Time (time per task) ● Throughput (tasks per time) ● They're not reciprocals ● More on this later
  • 24. What Performance Isn't ● CPU Usage ● Load Average ● Other metrics of resource consumption
  • 25. Performance ● I often focus on response time ● It represents user experience ● Throughput indicates capacity rather than performance ● For benchmarking, throughput is primary
  • 26. Utilization ● The portion of time during which the resource is busy ● i.e. there is at least one thing in progress
  • 27. Utilization is Confusing ● Be very careful with tools that report utilization ● From the Linux iostat man page: ● “%util: Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.” ● Can you parse that? Is it true?
  • 28. Capacity ● What is Capacity?
  • 30. Capacity – My Definition Capacity is the maximum throughput ... at achievable concurrency ... with acceptable performance ... as defined by response time ... meeting specified constraints ... over specified observation intervals.
  • 31. Capacity Example ● What is the capacity of the system at a concurrency of 32 with 10-second 95th-percentile response time not to exceed 2ms over a 60-minute duration? ● To determine this, we need goal-seeking benchmark software ● Most benchmark software can't do this
  • 32. Benchmarks, etc Recap ● Most benchmarks reveal very little ● Benchmark reports reveal even less ● It's good to go beyond the surface
  • 33. Amdahl's Law ● “The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program.” - Wikipedia ● It's basically a law of diminishing returns.
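The quoted statement can be sketched numerically. The formula below is the standard form of Amdahl's Law; the function name and the 10% serial fraction are illustrative assumptions, not figures from the talk.

```python
def amdahl_speedup(serial_fraction, processors):
    """Speedup on `processors` CPUs when `serial_fraction` of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

# Assumed workload: 10% of the run time is sequential.
for n in (1, 2, 4, 8, 16, 64, 1024):
    print(f"{n:5d} processors -> speedup {amdahl_speedup(0.10, n):.2f}")
```

With a 10% sequential fraction the speedup can never reach 10, however many processors are added, which is the diminishing-returns behavior the slide describes.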
  • 34. Should I Defragment My Disk? ● Method 1: Google “defragment” ● Method 2: Try it and see ● Method 3: Measure if the disk is a bottleneck
  • 37. Amdahl's Law ● Don't try to optimize little things.
  • 38. Little's Law ● N = XR ● That is, ● Concurrency = Throughput * Response Time ● This holds regardless of queueing, arrival rate distribution, response time distribution, etc.
  • 39. Little's Law Example ● If disk IOs average 4ms... ● And there are 280 IOs per second... ● Then the disk's average concurrency is: ● N = 280 * .004 ● N = 1.12 ● Do you believe this? ● When might it not be true?
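The arithmetic on slide 39 can be checked in a few lines. This is only a sketch; the function name is hypothetical, and the inputs are the averages given on the slide.

```python
def littles_law_concurrency(throughput_per_sec, response_time_sec):
    """Little's Law, N = X * R: average number of requests in the system."""
    return throughput_per_sec * response_time_sec

# Slide values: 280 IOs per second, 4 ms average response time.
n = littles_law_concurrency(280, 0.004)
print(f"average concurrency: {n:.2f}")
```

Both inputs must be averages over the same observation interval for the result to be meaningful.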
  • 40. Little's Law Example #2 ● If disk utilization is 98% ● And there are 280 IOs per second ● What do we know?
  • 41. Utilization Law ● U = SX ● Also independent of distributions, etc... ● That is, ● Utilization = Service Time * Throughput ● Utilization = 98% and Throughput = 280 per second ● S = U/X ● Service Time = .98 / 280 = .0035 seconds (3.5 ms)
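A short sketch of the Utilization Law calculation. The last step rests on an assumption the slides do not state: that the 4 ms average response time from the earlier Little's Law example describes the same disk, in which case the gap between response time and service time would be time spent queued.

```python
def service_time(utilization, throughput_per_sec):
    """Utilization Law rearranged: S = U / X."""
    return utilization / throughput_per_sec

s = service_time(0.98, 280)
print(f"service time: {s * 1000:.2f} ms")

# Assumption: the 4 ms average response time from the earlier example applies here too.
assumed_response_time = 0.004
queue_time = assumed_response_time - s
print(f"implied average queue time: {queue_time * 1000:.2f} ms")
```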
  • 42. Queueing Theory ● How can we predict the amount of queueing in a system? ● How can we predict its response times? ● How can we predict capacity?
  • 43. Erlang Queueing ● Erlang's formulas model the probability of queueing for a given arrival rate, service time, and number of servers. ● A “server” is anything capable of serving a request. ● CPUs ● Disks
  • 44. CPU -vs- Disk Queueing ● Scenario: 4-CPU, 4-disk (RAID0) server ● Thought experiment: ● How do processes queue for CPU? ● How do I/O requests queue on disks?
  • 45. Notation ● Typically see something like M/M/1 ● Each letter is a placeholder in A/S/n ● A = Arrival distribution ● S = Service-time distribution ● n = Number of servers ● A and S can be one of: ● Markov ● Deterministic ● General
  • 46. CPUs -vs- Disks ● CPUs: M/M/4 ● Disks: 4 x {M/M/1}
  • 47. M/M/1 Queueing (figure; source: cmg.org)
  • 48. M/M/n Queueing (figure; source: cmg.org)
  • 49. Erlang C Function ● M/M/n queueing is modeled by Erlang C ● See http://en.wikipedia.org/wiki/Erlang_(unit)
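A minimal sketch of the Erlang C function and the mean response time it implies, using the textbook M/M/n formulas. The function names and the workload numbers are assumptions for illustration. The comparison at the end mirrors the earlier "CPUs: M/M/4" versus "Disks: 4 x {M/M/1}" slide: one shared queue feeding four servers against four independent single-server queues that each receive a quarter of the arrivals.

```python
import math

def erlang_c(servers, offered_load):
    """Probability that an arriving request must queue in an M/M/n system.

    offered_load is arrival_rate * service_time, in Erlangs.
    """
    if offered_load >= servers:
        raise ValueError("offered load must be below the number of servers")
    rho = offered_load / servers
    top = offered_load ** servers / math.factorial(servers) / (1.0 - rho)
    bottom = sum(offered_load ** k / math.factorial(k) for k in range(servers)) + top
    return top / bottom

def mean_response_time(servers, arrival_rate, service_time):
    """Mean response time: service time plus mean queueing delay."""
    offered_load = arrival_rate * service_time
    rho = offered_load / servers
    wait = erlang_c(servers, offered_load) * service_time / (servers * (1.0 - rho))
    return service_time + wait

# Assumed workload: 800 requests/s in total, 4 ms service time, 80% utilization.
shared = mean_response_time(4, 800, 0.004)  # one queue, four servers (M/M/4)
split = mean_response_time(1, 200, 0.004)   # each of four separate M/M/1 queues
print(f"M/M/4:     {shared * 1000:.2f} ms")
print(f"4 x M/M/1: {split * 1000:.2f} ms")
```

At the same utilization the shared queue gives a noticeably lower mean response time under these assumptions. The result depends on the Markovian arrival and service assumptions built into the model.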
  • 50. What's Wrong With Erlang C? ● You must validate your arrival times. ● You must validate your service times. ● The equation is hard to work with. ● In practice, it's hard to use Erlang C.
  • 51. Scalability ● Queueing causes non-linear scaling. ● But first, let's talk about linearity.
  • 52. System Scalability (figure: throughput plotted against concurrency, annotated "Why?")
  • 53. Universal Scalability Law (figure: throughput plotted against concurrency for the Linear, Amdahl, and USL curves)
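The three curves named on slide 53 can be sketched with Neil Gunther's usual form of the Universal Scalability Law, C(N) = N / (1 + sigma(N - 1) + kappa N(N - 1)). The formula is not printed on the slide, and the coefficient values below are made up for illustration. Setting both coefficients to zero gives linear scaling, and setting only kappa to zero gives Amdahl's Law.

```python
import math

def usl_capacity(n, sigma, kappa):
    """Relative capacity at concurrency n.

    sigma models contention (serialization); kappa models coherency (crosstalk) cost.
    """
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))

# Hypothetical coefficients, chosen only to show the shape of the curves.
SIGMA, KAPPA = 0.05, 0.001

for n in (1, 2, 4, 8, 16, 32, 64, 128):
    linear = usl_capacity(n, 0.0, 0.0)
    amdahl = usl_capacity(n, SIGMA, 0.0)
    usl = usl_capacity(n, SIGMA, KAPPA)
    print(f"N={n:4d}  linear={linear:7.2f}  amdahl={amdahl:6.2f}  usl={usl:6.2f}")

# Concurrency at which the USL curve peaks before throughput starts to fall.
peak = math.sqrt((1.0 - SIGMA) / KAPPA)
print(f"USL peak near N = {peak:.1f}")
```

Unlike the Amdahl curve, which flattens toward a ceiling, the USL curve turns downward past its peak when kappa is greater than zero.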
  • 58. Scalability Limitations ● Locks ● Synchronization points ● Shared resources ● Duplicated data to be kept in sync ● Weakest-link problems
  • 59. RAID10 On EBS ● Which is faster? ● RAID 10 over 10 EBS volumes ● RAID 10 over 20 EBS volumes ● Hint: http://goo.gl/Xm92Y ● Also, http://goo.gl/fAEIL
  • 60. Debunking “Linear” ● Ask to see the actual numbers. ● They shouldn't be rounded off suspiciously. ● They must be truly linear. ● They must intersect the point (0, 0).
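One way to apply this checklist to published numbers: a line through (0, 0) means throughput divided by concurrency is the same at every point. The sketch below tests for that. The function name, the 2% tolerance, and both data sets are hypothetical.

```python
def is_truly_linear(points, tolerance=0.02):
    """True if throughput/concurrency is constant within `tolerance` (relative spread).

    points is a list of (concurrency, throughput) pairs with concurrency > 0.
    A constant ratio means the points lie on a straight line through (0, 0).
    """
    ratios = [throughput / concurrency for concurrency, throughput in points]
    mean_ratio = sum(ratios) / len(ratios)
    return (max(ratios) - min(ratios)) / mean_ratio <= tolerance

# Hypothetical results: the first set is linear, the second only looks close on a chart.
claimed = [(1, 1000), (2, 2000), (4, 4000), (8, 8000)]
actual = [(1, 1000), (2, 1950), (4, 3700), (8, 6600)]
print(is_truly_linear(claimed))
print(is_truly_linear(actual))
```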
  • 64. Resources ● Naomi Robbins' Blog ● http://blogs.forbes.com/naomirobbins/ ● Percona White Papers ● http://www.percona.com/ ● Neil J. Gunther ● Guerrilla Capacity Planning ● http://www.contextneeded.com/