@Twitter | Velocity 2013 1
A Systematic Approach to Capacity Planning in the Real World
Bryce Yan, Arun Kejariwal
(@bryce_yan, @arun_kejariwal)
Capacity Engineering @ Twitter
June 2013
User Experience
•  Anytime, Anywhere, Any device
•  Real-time performance
•  Additional challenges
  Fault Tolerance
  Variability [2]
[1] Xu et al. NSDI 2013 - https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final77.pdf
[2] http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/Berkeley-Latency-Mar2012.pdf
Approaches to Capacity Planning
•  Throw hardware at the problem
o  How much?
o  What kind? (Inventory management etc.)
•  Reactive approach
[Figure labels: Poor UX; Bottom line]
Capacity Planning is Non-trivial
•  Organic growth
  Over 200M monthly active users [1]
•  Events planned or unplanned [2, 3]
  Events/incidents (e.g., Super Bowl ’13 blackout)
  Behavioral response
o  Demographics, Cultural
o  Retweets, Photos, Vines
  Tax different services/applications
o  Different capacity requests
[1] https://twitter.com/twitter/status/281051652235087872
[2] http://arstechnica.com/information-technology/2012/10/hurricane-sandy-takes-data-centers-offline-with-flooding-power-outages/
[3] http://www.zdnet.com/amazons-compute-cloud-has-a-networking-hiccup-7000005776/
Capacity Planning is Non-trivial (cont’d)
•  Evolving product development landscape
  New features
  New products
•  New hardware platforms
  Purchase pipeline
  How much and when to buy – Cost performance trade-off
•  Overall goal
  User Experience
  Operational footprint
Capacity Modeling Overview
Capacity Modeling
•  Takes core drivers as inputs to generate usage demand
  Forecasts the amount of work based on core driver projections
•  Relates the work metric to a primary resource to identify the capacity threshold
  Primary resources
o  Computing power (CPU, RAM)
o  Storage (disk I/O, disk space)
o  Network (network bandwidth)
•  Generates hardware demand based on the limiting primary resource
Core Drivers
•  Underlying business metrics that drive demand for more capacity
  Active Users
  Tweets per second (TPS)
  Favorites per second (FPS)
  Requests per second (RPS)
•  Normalized by Active Users to isolate user engagement
•  Project user engagement and Active Users independently
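The decomposition above can be sketched in a few lines: fit one trend to Active Users, fit a separate trend to the per-user engagement ratio, and multiply the two projections to recover the core driver. This is only a minimal sketch, assuming least-squares linear trends for both series; the function names and the sample history are hypothetical, not taken from the talk.

```python
def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def project_core_driver(active_users, driver_totals, horizon):
    """Forecast a core driver as (projected users) x (projected per-user engagement)."""
    # Normalize by Active Users to isolate engagement.
    engagement = [d / u for d, u in zip(driver_totals, active_users)]
    # Project user growth and engagement independently.
    u_slope, u_icpt = fit_line(active_users)
    e_slope, e_icpt = fit_line(engagement)
    start = len(active_users)
    return [
        (u_slope * t + u_icpt) * (e_slope * t + e_icpt)
        for t in range(start, start + horizon)
    ]


# Hypothetical history: users grow by 10 per period, photos per user by 0.1.
users = [100.0, 110.0, 120.0, 130.0]
photos = [200.0, 231.0, 264.0, 299.0]
forecast = project_core_driver(users, photos, horizon=2)
print(forecast)
```

Keeping the two fits separate means a change in engagement (for example a new feature) can be modeled without touching the user-growth projection.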
Core Drivers (cont’d)
[Figure: Active Users aka User Growth. Active User Count vs. Time; series: Active Users, Linear (Active Users)]
[Figure: Normalized Core Drivers for Engagement. Per Active User Values vs. Time; series: Favorites, Retweets, Poly. (Favorites), Linear (Retweets)]
Core Drivers (cont’d)
[Figure: User Growth: Active Users vs. Time; series: Active Users, Linear (Active Users)]
[Figure: Engagement: Photos/Active User vs. Time; series: Photos, Linear (Photos)]
[Figure: Core Driver: Photos per Day vs. Time; series: Photos, Photos Forecast]
Capacity Threshold
•  Primary resource scalability threshold
  Determined by load testing
o  Synthetic load
o  Replaying production traffic
o  Real-time production traffic
  Test systems may be
o  Isolated replicas of production
o  Staging systems in production
o  Production systems
[Figure: Average Response Times vs CPU. Axes: Service Response Time, CPU; one point marked X]
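One possible way to read a threshold off load-test results is to take the highest CPU utilization reached before response time first exceeds a latency target. The slides do not say how the marked point is chosen, so the rule below, the latency target, and the sample measurements are all assumptions made for illustration.

```python
def capacity_threshold(samples, latency_target_ms):
    """Return the highest CPU utilization reached before latency first exceeds the target.

    samples: iterable of (cpu_percent, avg_response_ms) pairs from a load test.
    Returns None if even the lowest load level misses the target.
    """
    threshold = None
    for cpu, latency in sorted(samples):
        if latency > latency_target_ms:
            break
        threshold = cpu
    return threshold


# Hypothetical load-test measurements: (CPU %, average response time in ms).
load_test = [(20, 40.0), (40, 42.0), (60, 47.0), (70, 60.0), (80, 110.0), (90, 400.0)]
cpu_limit = capacity_threshold(load_test, latency_target_ms=100.0)
print(cpu_limit)
```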
Hardware Demand
•  Core driver → capacity threshold → scaling formula → server count
•  Example
  Core driver: Requests per Second
  Per server request throughput determined by capacity threshold
  Scaling formula for Sizing
o  Number of Servers = (RPS) / Per Server Threshold
[Figure: Core Driver (RPS) and Server Count vs. Time; series: RPS (Actuals), RPS (Forecast), # Servers (Actuals), # Servers (Forecast)]
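The scaling formula from this slide can be sketched as a small function. Rounding up to a whole server and the optional `headroom` parameter are assumptions added here for illustration; the slide gives only the plain ratio.

```python
import math


def servers_needed(rps, per_server_rps, headroom=0.0):
    """Number of Servers = RPS / Per Server Threshold, rounded up to a whole server.

    headroom is a hypothetical safety margin (0.2 means plan for 20% more load).
    """
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    return math.ceil(rps * (1.0 + headroom) / per_server_rps)


# Hypothetical RPS forecast and per-server threshold taken from a load test.
rps_forecast = [9000, 10500, 12250]
server_forecast = [servers_needed(rps, per_server_rps=500) for rps in rps_forecast]
print(server_forecast)
```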
Statistical Approach to Capacity Modeling
Capacity Planning Methodology
•  Predict expected value based on historical and temporal statistical analysis
  Metrics
o  Average, Standard deviation, 95th, 99th percentile
  Techniques
o  Moving Average – EMA (exponential moving average)
o  Correlation
o  β analysis
o  MACD
o  Forecasting - ARIMA
•  Limitations
  Changing usage patterns
o  Organic growth, behavioral, cultural
  Event driven
o  Super Bowl: How will the game turn out?
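The summary metrics listed on this slide can be computed with the Python standard library. This is a sketch only: the nearest-rank percentile definition and the population standard deviation are choices made here, and the talk does not say which variants were used.

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(values):
    """Average, standard deviation, and 95th/99th percentiles of a metric."""
    return {
        "mean": statistics.fmean(values),
        "stdev": statistics.pstdev(values),
        "p95": percentile(values, 95),
        "p99": percentile(values, 99),
    }


# Hypothetical metric samples: the integers 1..100.
summary = summarize(list(range(1, 101)))
print(summary)
```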
Capacity Planning Methodology (contd.)
•  Correlation Analysis
  Assess the relation between resource metric(s) and core driver
  Caution: Correlation does not imply causation 
[Figure: Core Driver, Network, and CPU vs. Time]
Capacity Planning Methodology (contd.)
Core Driver Correlations
               CoreDriver1  CoreDriver2  CoreDriver3  CoreDriver4  CoreDriver5  CoreDriver6  CoreDriver7
Core Driver 1  1            0.95         0.99         0.98         0.97         0.94         0.81
Core Driver 2               1            0.89         0.95         0.87         0.98         0.86
Core Driver 3                            1            0.97         0.99         0.88         0.75
Core Driver 4                                         1            0.94         0.95         0.8
Core Driver 5                                                      1            0.85         0.71
Core Driver 6                                                                   1            0.79
Core Driver 7                                                                                1
•  Correlation matrix 
  Capture interactions in a Service Oriented Architecture (SOA)
  Other Use: User engagement
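A correlation matrix like the one on this slide can be built from pairwise Pearson coefficients. The sketch below assumes Pearson correlation (the slides do not name the coefficient) and uses hypothetical driver names and data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant series has zero variance and raises ZeroDivisionError.
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(series):
    """Map each pair of named series to its Pearson correlation."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Hypothetical core driver time series.
drivers = {
    "tps": [1.0, 2.0, 3.0, 4.0, 5.0],
    "fps": [2.0, 4.0, 6.0, 8.0, 10.0],
    "rps": [5.0, 4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(drivers)
print(matrix["tps"]["fps"], matrix["tps"]["rps"])
```

The matrix is symmetric with ones on the diagonal, which is why the slide shows only one triangle.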
Capacity Planning Methodology (contd.)
[Figure: Rolling Correlation vs. Time]
•  Correlation varies over time
  Growing user base
  New products, features
•  Rolling correlation analysis – captures the time-varying nature
  Raw time series 
  EMA
  Challenge: What should be the window width?
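Rolling correlation on a raw time series can be sketched as a Pearson coefficient recomputed over a sliding window. The window width is left as a parameter because, as noted above, choosing it is an open question; the sample data is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rolling_correlation(xs, ys, window):
    """Correlation over each trailing window; one value per full window."""
    if window < 2 or window > len(xs) or len(xs) != len(ys):
        raise ValueError("need equal-length series and 2 <= window <= len(xs)")
    return [
        pearson(xs[i - window:i], ys[i - window:i])
        for i in range(window, len(xs) + 1)
    ]


# Hypothetical series: the resource tracks the driver at first, then diverges.
driver = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
resource = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]
rolling = rolling_correlation(driver, resource, window=3)
print(rolling)
```

A narrow window reacts quickly to a change in the relationship but is noisy; a wide window is smoother but slower to show the change.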
Capacity Planning Methodology (contd.)
•  Relative Growth: β Analysis
  How does INTC move with respect to S&P 500?
[Figure: Daily Returns of S&P 500 and INTC, 12/13/08 to 5/9/09; β: 1.35]
Capacity Planning Methodology (contd.)
•  Relative Growth: β Analysis
  Relative growth of a core driver and a resource driver
[Figure: Core Driver and Resource vs. Time; β: 1.08]
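By analogy with the stock example, β can be sketched as the covariance of the resource's period-over-period growth with the core driver's growth, divided by the variance of the driver's growth. That is the usual finance definition; the slides do not give the exact formula used, so treat it and the sample data as assumptions.

```python
def growth_rates(levels):
    """Period-over-period relative change of a series of levels."""
    return [(b - a) / a for a, b in zip(levels, levels[1:])]


def beta(resource_levels, driver_levels):
    """Beta of the resource with respect to the core driver, computed on growth rates."""
    r = growth_rates(resource_levels)
    d = growth_rates(driver_levels)
    n = len(d)
    mean_r = sum(r) / n
    mean_d = sum(d) / n
    cov = sum((x - mean_d) * (y - mean_r) for x, y in zip(d, r))
    var = sum((x - mean_d) ** 2 for x in d)
    return cov / var


# Hypothetical levels: the resource grows twice as fast as the core driver.
core_driver = [100.0, 110.0, 132.0, 145.2]   # growth: 10%, 20%, 10%
resource = [50.0, 60.0, 84.0, 100.8]         # growth: 20%, 40%, 20%
b = beta(resource, core_driver)
print(b)
```

A β above 1 suggests the resource grows faster than the driver, and a β below 1 suggests it grows more slowly.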
Capacity Planning Methodology (contd.)
•  β varies over time
  New products, features 
  New metric to log
[Figure: Rolling Beta vs. Time]
Capacity Planning Methodology (contd.)
•  Growth: Detecting breakout
  MACD: Moving Average Convergence Divergence
  Difference of n- and m-width EMAs, n > m
  Diverging EMAs
o  Commonly used as a buy/sell signal in the context of a stock
o  Early detection of potential capacity ask
[Figure: MACD and MACD Signal vs. Time]
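A minimal MACD sketch follows. It assumes the conventional definitions from technical analysis: smoothing factor 2 / (width + 1), the MACD line as the short-width EMA minus the long-width EMA, a signal line that is an EMA of the MACD line, and default widths of 12, 26, and 9. None of these specifics appear in the slides, and the `breakouts` helper and the sample data are hypothetical.

```python
def ema(values, width):
    """Exponential moving average seeded with the first value."""
    alpha = 2.0 / (width + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(out[-1] + alpha * (v - out[-1]))
    return out


def macd(values, short_width=12, long_width=26, signal_width=9):
    """Return (MACD line, signal line) for a series."""
    short = ema(values, short_width)
    long_ = ema(values, long_width)
    line = [s - l for s, l in zip(short, long_)]
    return line, ema(line, signal_width)


def breakouts(line, signal):
    """Indices where the MACD line crosses above its signal line."""
    return [
        i for i in range(1, len(line))
        if line[i - 1] <= signal[i - 1] and line[i] > signal[i]
    ]


# Hypothetical core driver: flat for 30 periods, then steady growth.
series = [100.0] * 30 + [100.0 + 5.0 * i for i in range(1, 31)]
macd_line, signal_line = macd(series)
print(breakouts(macd_line, signal_line))
```

In a capacity setting, an upward crossing is read as an early hint that growth is accelerating, the counterpart of the buy signal in the stock context.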
Acknowledgements
•  Winston Lee, Capacity Engineer, Twitter
•  Management team
Join the Flock
•  We are hiring!!
  https://twitter.com/JoinTheFlock
  https://twitter.com/jobs
  Contact us: @bryce_yan, @arun_kejariwal
Like problem solving?
Like challenges?
Be at the cutting edge
Make an impact

Weitere ähnliche Inhalte

Was ist angesagt?

MapR Edge : Act Locally Learn Globally
MapR Edge : Act Locally Learn GloballyMapR Edge : Act Locally Learn Globally
MapR Edge : Act Locally Learn Globallyridhav
 
High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance ComputingNous Infosystems
 
What's New in 6.3 + Data On-Boarding
What's New in 6.3 + Data On-BoardingWhat's New in 6.3 + Data On-Boarding
What's New in 6.3 + Data On-BoardingSplunk
 
Going Server-less for Web-Services that need to Crunch Large Volumes of Data
Going Server-less for Web-Services that need to Crunch Large Volumes of DataGoing Server-less for Web-Services that need to Crunch Large Volumes of Data
Going Server-less for Web-Services that need to Crunch Large Volumes of DataDenis C. Bauer
 
Splunk Ninjas: New Features and Search Dojo
Splunk Ninjas: New Features and Search DojoSplunk Ninjas: New Features and Search Dojo
Splunk Ninjas: New Features and Search DojoSplunk
 
Anomaly detection in real-time data streams using Heron
Anomaly detection in real-time data streams using HeronAnomaly detection in real-time data streams using Heron
Anomaly detection in real-time data streams using HeronArun Kejariwal
 
(BDT207) Use Streaming Analytics to Exploit Perishable Insights | AWS re:Inve...
(BDT207) Use Streaming Analytics to Exploit Perishable Insights | AWS re:Inve...(BDT207) Use Streaming Analytics to Exploit Perishable Insights | AWS re:Inve...
(BDT207) Use Streaming Analytics to Exploit Perishable Insights | AWS re:Inve...Amazon Web Services
 
How novel compute technology transforms life science research
How novel compute technology transforms life science researchHow novel compute technology transforms life science research
How novel compute technology transforms life science researchDenis C. Bauer
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Mathieu Dumoulin
 
IRJET- Optimization of Completion Time through Efficient Resource Allocation ...
IRJET- Optimization of Completion Time through Efficient Resource Allocation ...IRJET- Optimization of Completion Time through Efficient Resource Allocation ...
IRJET- Optimization of Completion Time through Efficient Resource Allocation ...IRJET Journal
 
[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Servic...
[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Servic...[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Servic...
[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Servic...Vinu Charanya
 
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...MapR Technologies
 
Keynote 1 the rise of stream processing for data management & micro serv...
Keynote 1  the rise of stream processing for data management & micro serv...Keynote 1  the rise of stream processing for data management & micro serv...
Keynote 1 the rise of stream processing for data management & micro serv...Sabri Skhiri
 
Data-Drive DevOps: Mining Machine Data for "Metrics that Matter"
Data-Drive DevOps: Mining Machine Data for "Metrics that Matter"Data-Drive DevOps: Mining Machine Data for "Metrics that Matter"
Data-Drive DevOps: Mining Machine Data for "Metrics that Matter"Splunk
 
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Spark Summit
 
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Sri Ambati
 
Using Cloud CAE Delivered by AWS HPC to Optimize Next-Gen Medical Devices - B...
Using Cloud CAE Delivered by AWS HPC to Optimize Next-Gen Medical Devices - B...Using Cloud CAE Delivered by AWS HPC to Optimize Next-Gen Medical Devices - B...
Using Cloud CAE Delivered by AWS HPC to Optimize Next-Gen Medical Devices - B...Amazon Web Services
 
Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...GetInData
 
SplunkLive! - Splunk for IT Operations
SplunkLive! - Splunk for IT OperationsSplunkLive! - Splunk for IT Operations
SplunkLive! - Splunk for IT OperationsSplunk
 

Was ist angesagt? (20)

MapR Edge : Act Locally Learn Globally
MapR Edge : Act Locally Learn GloballyMapR Edge : Act Locally Learn Globally
MapR Edge : Act Locally Learn Globally
 
High Performance Computing
High Performance ComputingHigh Performance Computing
High Performance Computing
 
What's New in 6.3 + Data On-Boarding
What's New in 6.3 + Data On-BoardingWhat's New in 6.3 + Data On-Boarding
What's New in 6.3 + Data On-Boarding
 
Going Server-less for Web-Services that need to Crunch Large Volumes of Data
Going Server-less for Web-Services that need to Crunch Large Volumes of DataGoing Server-less for Web-Services that need to Crunch Large Volumes of Data
Going Server-less for Web-Services that need to Crunch Large Volumes of Data
 
Splunk Ninjas: New Features and Search Dojo
Splunk Ninjas: New Features and Search DojoSplunk Ninjas: New Features and Search Dojo
Splunk Ninjas: New Features and Search Dojo
 
Anomaly detection in real-time data streams using Heron
Anomaly detection in real-time data streams using HeronAnomaly detection in real-time data streams using Heron
Anomaly detection in real-time data streams using Heron
 
(BDT207) Use Streaming Analytics to Exploit Perishable Insights | AWS re:Inve...
(BDT207) Use Streaming Analytics to Exploit Perishable Insights | AWS re:Inve...(BDT207) Use Streaming Analytics to Exploit Perishable Insights | AWS re:Inve...
(BDT207) Use Streaming Analytics to Exploit Perishable Insights | AWS re:Inve...
 
I'm being followed by drones
I'm being followed by dronesI'm being followed by drones
I'm being followed by drones
 
How novel compute technology transforms life science research
How novel compute technology transforms life science researchHow novel compute technology transforms life science research
How novel compute technology transforms life science research
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
 
IRJET- Optimization of Completion Time through Efficient Resource Allocation ...
IRJET- Optimization of Completion Time through Efficient Resource Allocation ...IRJET- Optimization of Completion Time through Efficient Resource Allocation ...
IRJET- Optimization of Completion Time through Efficient Resource Allocation ...
 
[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Servic...
[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Servic...[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Servic...
[Kubecon 2017 Austin, TX] How We Built a Framework at Twitter to Solve Servic...
 
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Strea...
 
Keynote 1 the rise of stream processing for data management & micro serv...
Keynote 1  the rise of stream processing for data management & micro serv...Keynote 1  the rise of stream processing for data management & micro serv...
Keynote 1 the rise of stream processing for data management & micro serv...
 
Data-Drive DevOps: Mining Machine Data for "Metrics that Matter"
Data-Drive DevOps: Mining Machine Data for "Metrics that Matter"Data-Drive DevOps: Mining Machine Data for "Metrics that Matter"
Data-Drive DevOps: Mining Machine Data for "Metrics that Matter"
 
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
 
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
Machine Learning Interpretability - Mateusz Dymczyk - H2O AI World London 2018
 
Using Cloud CAE Delivered by AWS HPC to Optimize Next-Gen Medical Devices - B...
Using Cloud CAE Delivered by AWS HPC to Optimize Next-Gen Medical Devices - B...Using Cloud CAE Delivered by AWS HPC to Optimize Next-Gen Medical Devices - B...
Using Cloud CAE Delivered by AWS HPC to Optimize Next-Gen Medical Devices - B...
 
Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...
 
SplunkLive! - Splunk for IT Operations
SplunkLive! - Splunk for IT OperationsSplunkLive! - Splunk for IT Operations
SplunkLive! - Splunk for IT Operations
 

Andere mochten auch

Design+Performance Velocity 2015
Design+Performance Velocity 2015Design+Performance Velocity 2015
Design+Performance Velocity 2015Steve Souders
 
Days In Green (DIG): Forecasting the life of a healthy service
Days In Green (DIG): Forecasting the life of a healthy serviceDays In Green (DIG): Forecasting the life of a healthy service
Days In Green (DIG): Forecasting the life of a healthy serviceArun Kejariwal
 
Days In Green : Forecasting the Life of a Healthy Service @Twitter
Days In Green : Forecasting the Life of a Healthy Service @TwitterDays In Green : Forecasting the Life of a Healthy Service @Twitter
Days In Green : Forecasting the Life of a Healthy Service @TwitterVibhav Garg
 
Anteojito, revista completa, agosto 11 de 1966
Anteojito, revista completa,  agosto 11 de 1966Anteojito, revista completa,  agosto 11 de 1966
Anteojito, revista completa, agosto 11 de 1966Martin Alberto Belaustegui
 
Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit...
Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit...Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit...
Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit...Piyush Kumar
 
A Tool for Practical Garbage Collection Analysis In the Cloud
A Tool for Practical Garbage Collection Analysis In the CloudA Tool for Practical Garbage Collection Analysis In the Cloud
A Tool for Practical Garbage Collection Analysis In the CloudArun Kejariwal
 
Metrics, Metrics Everywhere (but where the heck do you start?)
Metrics, Metrics Everywhere (but where the heck do you start?)Metrics, Metrics Everywhere (but where the heck do you start?)
Metrics, Metrics Everywhere (but where the heck do you start?)SOASTA
 
Location Planning and Analysis
Location Planning and AnalysisLocation Planning and Analysis
Location Planning and AnalysisIza Marie
 
Simple Log Analysis and Trending
Simple Log Analysis and TrendingSimple Log Analysis and Trending
Simple Log Analysis and TrendingMike Brittain
 
Statistical Learning Based Anomaly Detection @ Twitter
Statistical Learning Based Anomaly Detection @ TwitterStatistical Learning Based Anomaly Detection @ Twitter
Statistical Learning Based Anomaly Detection @ TwitterArun Kejariwal
 
6. process selection and facility layout
6. process selection and facility layout6. process selection and facility layout
6. process selection and facility layoutSudipta Saha
 

Andere mochten auch (20)

Design+Performance Velocity 2015
Design+Performance Velocity 2015Design+Performance Velocity 2015
Design+Performance Velocity 2015
 
Days In Green (DIG): Forecasting the life of a healthy service
Days In Green (DIG): Forecasting the life of a healthy serviceDays In Green (DIG): Forecasting the life of a healthy service
Days In Green (DIG): Forecasting the life of a healthy service
 
Com t'ho explico
Com t'ho explicoCom t'ho explico
Com t'ho explico
 
Velocity 2015-final
Velocity 2015-finalVelocity 2015-final
Velocity 2015-final
 
Days In Green : Forecasting the Life of a Healthy Service @Twitter
Days In Green : Forecasting the Life of a Healthy Service @TwitterDays In Green : Forecasting the Life of a Healthy Service @Twitter
Days In Green : Forecasting the Life of a Healthy Service @Twitter
 
Tric y Trake 15 junio 1967
Tric y Trake  15 junio 1967Tric y Trake  15 junio 1967
Tric y Trake 15 junio 1967
 
Anteojito, revista completa, agosto 11 de 1966
Anteojito, revista completa,  agosto 11 de 1966Anteojito, revista completa,  agosto 11 de 1966
Anteojito, revista completa, agosto 11 de 1966
 
La diversidad de los seres vivos
La diversidad de los seres vivosLa diversidad de los seres vivos
La diversidad de los seres vivos
 
Mapa ilustrado de Estados Unidos
Mapa ilustrado de Estados Unidos Mapa ilustrado de Estados Unidos
Mapa ilustrado de Estados Unidos
 
Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit...
Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit...Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit...
Mitigating User Experience from 'Breaking Bad': The Twitter Approach [Velocit...
 
A Tool for Practical Garbage Collection Analysis In the Cloud
A Tool for Practical Garbage Collection Analysis In the CloudA Tool for Practical Garbage Collection Analysis In the Cloud
A Tool for Practical Garbage Collection Analysis In the Cloud
 
Metrics, Metrics Everywhere (but where the heck do you start?)
Metrics, Metrics Everywhere (but where the heck do you start?)Metrics, Metrics Everywhere (but where the heck do you start?)
Metrics, Metrics Everywhere (but where the heck do you start?)
 
Mistery box
Mistery boxMistery box
Mistery box
 
山水
山水山水
山水
 
Formació en competències
Formació en competènciesFormació en competències
Formació en competències
 
Location Planning and Analysis
Location Planning and AnalysisLocation Planning and Analysis
Location Planning and Analysis
 
St patricks day
St patricks daySt patricks day
St patricks day
 
Simple Log Analysis and Trending
Simple Log Analysis and TrendingSimple Log Analysis and Trending
Simple Log Analysis and Trending
 
Statistical Learning Based Anomaly Detection @ Twitter
Statistical Learning Based Anomaly Detection @ TwitterStatistical Learning Based Anomaly Detection @ Twitter
Statistical Learning Based Anomaly Detection @ Twitter
 
6. process selection and facility layout
6. process selection and facility layout6. process selection and facility layout
6. process selection and facility layout
 

Ähnlich wie A Systematic Approach to Capacity Planning in the Real World

Build User-Facing Analytics Application That Scales Using StarRocks (DLH).pdf
Build User-Facing Analytics Application That Scales Using StarRocks (DLH).pdfBuild User-Facing Analytics Application That Scales Using StarRocks (DLH).pdf
Build User-Facing Analytics Application That Scales Using StarRocks (DLH).pdfAlbert Wong
 
Re-Platforming Applications for the Cloud
Re-Platforming Applications for the CloudRe-Platforming Applications for the Cloud
Re-Platforming Applications for the CloudCarter Wickstrom
 
Real-Time Analytics With StarRocks (DWH+DL).pdf
Real-Time Analytics With StarRocks (DWH+DL).pdfReal-Time Analytics With StarRocks (DWH+DL).pdf
Real-Time Analytics With StarRocks (DWH+DL).pdfAlbert Wong
 
Automated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise ApplicationsAutomated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise ApplicationsSAIL_QU
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY
 
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET Journal
 
performancetestinganoverview-110206071921-phpapp02.pdf
performancetestinganoverview-110206071921-phpapp02.pdfperformancetestinganoverview-110206071921-phpapp02.pdf
performancetestinganoverview-110206071921-phpapp02.pdfMAshok10
 
Innovate2010 jazz keynote
Innovate2010 jazz keynoteInnovate2010 jazz keynote
Innovate2010 jazz keynoteoslc
 
Web Performance Bootcamp 2014
Web Performance Bootcamp 2014Web Performance Bootcamp 2014
Web Performance Bootcamp 2014Daniel Austin
 
Chapter 10
Chapter 10Chapter 10
Chapter 10bodo-con
 
All about that reactive ui
All about that reactive uiAll about that reactive ui
All about that reactive uiPaul van Zyl
 
3 Keys to Performance Testing at the Speed of Agile
3 Keys to Performance Testing at the Speed of Agile3 Keys to Performance Testing at the Speed of Agile
3 Keys to Performance Testing at the Speed of AgileNeotys
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDogRedis Labs
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaDatabricks
 
AWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
AWS July Webinar Series: Amazon Redshift Reporting and Advanced AnalyticsAWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
AWS July Webinar Series: Amazon Redshift Reporting and Advanced AnalyticsAmazon Web Services
 
Web Performance BootCamp 2013
Web Performance BootCamp 2013Web Performance BootCamp 2013
Web Performance BootCamp 2013Daniel Austin
 

Ähnlich wie A Systematic Approach to Capacity Planning in the Real World (20)

Build User-Facing Analytics Application That Scales Using StarRocks (DLH).pdf
Build User-Facing Analytics Application That Scales Using StarRocks (DLH).pdfBuild User-Facing Analytics Application That Scales Using StarRocks (DLH).pdf
Build User-Facing Analytics Application That Scales Using StarRocks (DLH).pdf
 
Re-Platforming Applications for the Cloud
Re-Platforming Applications for the CloudRe-Platforming Applications for the Cloud
Re-Platforming Applications for the Cloud
 
Real-Time Analytics With StarRocks (DWH+DL).pdf
Real-Time Analytics With StarRocks (DWH+DL).pdfReal-Time Analytics With StarRocks (DWH+DL).pdf
Real-Time Analytics With StarRocks (DWH+DL).pdf
 
Automated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise ApplicationsAutomated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise Applications
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
 
Shikha fdp 62_14july2017
Shikha fdp 62_14july2017Shikha fdp 62_14july2017
Shikha fdp 62_14july2017
 
Druid @ branch
Druid @ branch Druid @ branch
Druid @ branch
 
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
IRJET-Framework for Dynamic Resource Allocation and Efficient Scheduling Stra...
 
Why do Users kill HPC Jobs?
Why do Users kill HPC Jobs?Why do Users kill HPC Jobs?
Why do Users kill HPC Jobs?
 
performancetestinganoverview-110206071921-phpapp02.pdf
performancetestinganoverview-110206071921-phpapp02.pdfperformancetestinganoverview-110206071921-phpapp02.pdf
performancetestinganoverview-110206071921-phpapp02.pdf
 
Innovate2010 jazz keynote
Innovate2010 jazz keynoteInnovate2010 jazz keynote
Innovate2010 jazz keynote
 
Web Performance Bootcamp 2014
Web Performance Bootcamp 2014Web Performance Bootcamp 2014
Web Performance Bootcamp 2014
 
ADF Performance Monitor
ADF Performance MonitorADF Performance Monitor
ADF Performance Monitor
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
All about that reactive ui
All about that reactive uiAll about that reactive ui
All about that reactive ui
 
3 Keys to Performance Testing at the Speed of Agile
3 Keys to Performance Testing at the Speed of Agile3 Keys to Performance Testing at the Speed of Agile
3 Keys to Performance Testing at the Speed of Agile
 
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
Monitoring and Scaling Redis at DataDog - Ilan Rabinovitch, DataDog
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at Humana
 
AWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
AWS July Webinar Series: Amazon Redshift Reporting and Advanced AnalyticsAWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
AWS July Webinar Series: Amazon Redshift Reporting and Advanced Analytics
 
Web Performance BootCamp 2013
Web Performance BootCamp 2013Web Performance BootCamp 2013
Web Performance BootCamp 2013
 

A Systematic Approach to Capacity Planning in the Real World

  • 1. @Twitter | Velocity 2013. A Systematic Approach to Capacity Planning in the Real World. Bryce Yan, Arun Kejariwal (@bryce_yan, @arun_kejariwal), Capacity Engineering @ Twitter, June 2013
  • 2. User Experience
      • Anytime, Anywhere, Any device
      • Real-time performance
      • Additional challenges
          o Fault Tolerance
          o Variability [2]
      [1] Xu et al. NSDI 2013 - https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final77.pdf
      [2] http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/Berkeley-Latency-Mar2012.pdf
  • 3. Approaches to Capacity Planning
      • Throw hardware at the problem
      • Reactive approach
          o How much?
          o What kind? (Inventory management etc.)
      Poor UX, Bottom line
  • 4. Capacity Planning is Non-trivial
      • Organic growth
          o Over 200M monthly active users [1]
      • Events planned or unplanned [2, 3]
          o Events/incidents (e.g., Super Bowl '13 blackout)
          o Behavioral response
              - Demographics, Cultural
              - Retweets, Photos, Vines
          o Tax different services/applications
              - Different capacity requests
      [1] https://twitter.com/twitter/status/281051652235087872
      [2] http://arstechnica.com/information-technology/2012/10/hurricane-sandy-takes-data-centers-offline-with-flooding-power-outages/
      [3] http://www.zdnet.com/amazons-compute-cloud-has-a-networking-hiccup-7000005776/
  • 5. Capacity Planning is Non-trivial (cont'd)
      • Evolving product development landscape
          o New features
          o New products
      • New hardware platforms
          o Purchase pipeline
          o How much and when to buy: cost-performance trade-off
      • Overall goal
          o User Experience
          o Operational footprint
  • 6. Capacity Modeling Overview
  • 7. Capacity Modeling
      • Takes core drivers as inputs to generate usage demand
          o Forecasts the amount of work based on core driver projections
      • Relates the work metric to a primary resource to identify the capacity threshold
          o Primary resources
              - Computing power (CPU, RAM)
              - Storage (disk I/O, disk space)
              - Network (network bandwidth)
      • Generates hardware demand based on the limiting primary resource
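The last step of slide 7, sizing on the limiting primary resource, can be sketched as follows. This is a minimal illustration rather than the authors' tooling: the function name, the resource names, and all numbers are hypothetical, and it assumes demand and per-server capacity are expressed in the same unit for each resource.

```python
import math

def servers_needed(demand, per_server_capacity):
    """Return (server_count, limiting_resource) for a forecast demand.

    Both arguments map a resource name to a number in the same unit.
    The resource that requires the most servers is the limiting one.
    """
    counts = {
        resource: math.ceil(demand[resource] / per_server_capacity[resource])
        for resource in demand
    }
    limiting = max(counts, key=counts.get)
    return counts[limiting], limiting

# Hypothetical forecast demand and per-server capacity thresholds.
demand = {"cpu_cores": 1200, "ram_gb": 3000, "network_gbps": 40}
per_server = {"cpu_cores": 16, "ram_gb": 64, "network_gbps": 1}

count, limiting = servers_needed(demand, per_server)
print(count, limiting)  # 75 cpu_cores
```

Here CPU needs 75 servers, RAM 47, and network 40, so CPU is the limiting resource and sets the hardware demand.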
  • 8. Core Drivers
      • Underlying business metrics that drive demand for more capacity
          o Active Users
          o Tweets per second (TPS)
          o Favorites per second (FPS)
          o Requests per second (RPS)
      • Normalized by Active Users to isolate user engagement
      • Project user engagement and Active Users independently
  • 9. Core Drivers (cont'd)
      [Figure: Normalized Core Drivers for Engagement. Per-active-user values over time for Favorites (polynomial trend) and Retweets (linear trend).]
      [Figure: Active Users aka User Growth. Active user count over time with a linear trend.]
  • 10. Core Drivers (cont'd)
      [Figure: User Growth: Active Users. Active users over time with a linear trend.]
      [Figure: Engagement: Photos/Active User. Photos per active user over time with a linear trend.]
      [Figure: Core Driver: Photos per Day. Photos over time with the photos forecast.]
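Slides 8 to 10 describe projecting user growth and per-user engagement independently and combining them into a core driver forecast. A minimal sketch of that idea, assuming a straight-line trend for both series (the slides also show a polynomial fit for Favorites); the function names and the sample data are hypothetical:

```python
def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...

    Returns (slope, intercept).
    """
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def forecast_core_driver(active_users, driver_totals, horizon):
    """Forecast a core driver as (projected users) x (projected per-user rate)."""
    # Normalize by active users to isolate engagement.
    per_user = [d / u for d, u in zip(driver_totals, active_users)]
    u_slope, u_icpt = fit_line(active_users)
    e_slope, e_icpt = fit_line(per_user)
    start = len(active_users)
    return [
        (u_slope * t + u_icpt) * (e_slope * t + e_icpt)
        for t in range(start, start + horizon)
    ]

# Hypothetical history: active users and photos per day at four time steps.
active_users = [100.0, 110.0, 120.0, 130.0]
photos_per_day = [200.0, 231.0, 264.0, 299.0]

forecast = forecast_core_driver(active_users, photos_per_day, horizon=2)
print([round(v, 1) for v in forecast])  # [336.0, 375.0]
```

Keeping the two trends separate means a change in engagement (for example, a new feature) can be modeled without refitting the user-growth projection.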
  • 11. Capacity Threshold
      • Primary resource scalability threshold
          o Determined by load testing
              - Synthetic load
              - Replaying production traffic
              - Real-time production traffic
          o Test systems may be
              - Isolated replicas of production
              - Staging systems in production
              - Production systems
      [Figure: Average Response Times vs CPU. Service response time plotted against CPU, with the threshold marked X.]
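One way to read a threshold off load-test results like the chart on slide 11 is to take the highest CPU level at which response time still meets a latency limit. The sketch below assumes that rule; the slides do not say how the X was chosen, and the function name, the 100 ms limit, and the samples are hypothetical.

```python
def capacity_threshold(samples, max_response_ms):
    """Return the highest CPU utilization reached before the first breach.

    samples: (cpu_percent, response_ms) pairs from a load test.
    Returns None if even the lowest load level breaches the limit.
    """
    threshold = None
    for cpu, response_ms in sorted(samples):
        if response_ms > max_response_ms:
            break
        threshold = cpu
    return threshold

# Hypothetical load-test results: response time climbs sharply past 80% CPU.
load_test = [(10, 40), (30, 42), (50, 47), (70, 60), (80, 95), (90, 240)]

threshold = capacity_threshold(load_test, max_response_ms=100)
print(threshold)  # 80
```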
  • 12. Hardware Demand
      • Core driver → capacity threshold → scaling formula → server count
      • Example
          o Core driver: Requests per Second
          o Per server request throughput determined by capacity threshold
          o Scaling formula for sizing
              - Number of Servers = (RPS) / Per Server Threshold
      [Figure: Core Driver (RPS) / Server Count over time. RPS (Actuals), RPS (Forecast), # Servers (Actuals), # Servers (Forecast).]
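The scaling formula on slide 12 is simple enough to write down directly. The sketch below rounds up to a whole server, which is an assumption added here (the slide gives only the division); the per-server threshold and the forecast values are hypothetical.

```python
import math

def server_count(rps, per_server_rps):
    """Number of Servers = RPS / Per Server Threshold, rounded up."""
    return math.ceil(rps / per_server_rps)

# Hypothetical RPS forecast and a per-server threshold of 500 RPS.
rps_forecast = [12000, 15000, 18100]
servers_forecast = [server_count(rps, 500) for rps in rps_forecast]
print(servers_forecast)  # [24, 30, 37]
```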
  • 13. Statistical Approach to Capacity Modeling
  • 14. Capacity Planning Methodology
      • Predict expected value based on historical and temporal statistical analysis
          o Metrics
              - Average, standard deviation, 95th and 99th percentiles
          o Techniques
              - Moving average: EMA (exponential moving average)
              - Correlation
              - β analysis
              - MACD
              - Forecasting: ARIMA
      • Limitations
          o Changing usage patterns
              - Organic growth, behavioral, cultural
          o Event driven
              - Super Bowl: how will the game turn out?
  • 15. Capacity Planning Methodology (contd.)
      • Correlation Analysis
          o Assess the relation between resource metric(s) and core driver
          o Caution: Correlation does not imply causation
      [Figure: Core Driver, Network, and CPU over time.]
  • 16. Capacity Planning Methodology (contd.)
      • Correlation matrix
          o Capture interactions in a Service Oriented Architecture (SOA)
          o Other use: User engagement
      Core Driver Correlations (CD = Core Driver; upper triangle shown)
                       CD 1   CD 2   CD 3   CD 4   CD 5   CD 6   CD 7
      Core Driver 1    1      0.95   0.99   0.98   0.97   0.94   0.81
      Core Driver 2           1      0.89   0.95   0.87   0.98   0.86
      Core Driver 3                  1      0.97   0.99   0.88   0.75
      Core Driver 4                         1      0.94   0.95   0.8
      Core Driver 5                                1      0.85   0.71
      Core Driver 6                                       1      0.79
      Core Driver 7                                              1
  • 17. Capacity Planning Methodology (contd.)
      • Correlation varies over time
          o Growing user base
          o New products, features
      • Rolling correlation analysis: capture the time-varying nature
          o Raw time series
          o EMA
          o Challenge: What should be the window width?
      [Figure: Rolling Correlation over time.]
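Rolling correlation, as named on slide 17, recomputes the Pearson correlation over a sliding window so that changes in the relationship become visible. A minimal standard-library sketch on raw series; the window width of 3 and the data are hypothetical, and a window in which either series is constant would divide by zero.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def rolling_correlation(xs, ys, window):
    """Correlation over each consecutive window of the given width."""
    return [
        pearson(xs[i - window:i], ys[i - window:i])
        for i in range(window, len(xs) + 1)
    ]

# Hypothetical series: CPU tracks the core driver, then decouples.
core_driver = [1, 2, 3, 4, 5, 6]
cpu = [2, 4, 6, 8, 7, 6]

rolling = rolling_correlation(core_driver, cpu, window=3)
print([round(r, 2) for r in rolling])  # [1.0, 1.0, 0.5, -1.0]
```

A narrow window reacts quickly but is noisy, while a wide window is smoother but slow to show a change, which is the window-width challenge the slide raises.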
  • 18. Capacity Planning Methodology (contd.)
      • Relative Growth: β Analysis
          o How does INTC move with respect to the S&P 500?
      [Figure: Daily returns of S&P 500 and INTC, 12/13/08 to 5/9/09. β: 1.35]
  • 19. Capacity Planning Methodology (contd.)
      • Relative Growth: β Analysis
          o Relative growth of a core driver and a resource driver
      [Figure: Core Driver and Resource over time. β: 1.08]
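In finance, β is the covariance of an asset's returns with the market's returns divided by the variance of the market's returns. The sketch below applies that definition with the core driver in the role of the market and the resource in the role of the asset. This mapping and the use of period-over-period percentage changes are assumptions; the slides do not give the exact computation, and the data is hypothetical.

```python
def returns(series):
    """Period-over-period fractional change of a series."""
    return [(b - a) / a for a, b in zip(series, series[1:])]

def beta(resource, core_driver):
    """beta = cov(resource returns, driver returns) / var(driver returns)."""
    r = returns(resource)
    d = returns(core_driver)
    n = len(d)
    mean_r, mean_d = sum(r) / n, sum(d) / n
    cov = sum((x - mean_d) * (y - mean_r) for x, y in zip(d, r))
    var = sum((x - mean_d) ** 2 for x in d)
    return cov / var

# Hypothetical data: resource usage changes 1.5x as fast as the core driver.
core_driver = [100.0, 110.0, 132.0, 118.8]
resource = [200.0, 230.0, 299.0, 254.15]

b = beta(resource, core_driver)
print(round(b, 2))  # 1.5
```

A β above 1 means the resource grows faster than the core driver in relative terms, so capacity for it has to be added faster than the driver forecast alone would suggest.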
  • 20. Capacity Planning Methodology (contd.)
      • β varies over time
          o New products, features
          o New metric to log
      [Figure: Rolling Beta over time.]
  • 21. Capacity Planning Methodology (contd.)
      • Growth: Detecting breakout
          o MACD: Moving Average Convergence Divergence
              - Difference of n- and m-width EMAs, n > m
          o Diverging EMAs
              - Commonly used as a buy/sell signal in the context of a stock
              - Early detection of potential capacity ask
      [Figure: MACD and Signal over time.]
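A minimal MACD sketch for the breakout detection on slide 21. It follows the usual trading convention: the MACD line is the short-width EMA minus the long-width EMA, and the signal line is an EMA of the MACD line. The widths 12, 26, and 9 are the conventional trading defaults, not values from the slides, and treating an upward crossing of the signal line as a "breakout" is an assumption. The input series is hypothetical (excess over a flat baseline, then a ramp).

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def macd(values, short=12, long=26, signal=9):
    """Return (macd_line, signal_line) for a series."""
    fast = ema(values, short)
    slow = ema(values, long)
    line = [f - s for f, s in zip(fast, slow)]
    return line, ema(line, signal)

def breakouts(values, **widths):
    """Indices where the MACD line crosses above the signal line."""
    line, sig = macd(values, **widths)
    return [
        i for i in range(1, len(line))
        if line[i - 1] <= sig[i - 1] and line[i] > sig[i]
    ]

# Hypothetical core driver: flat for 30 steps, then steady growth.
series = [0.0] * 30 + [5.0 * k for k in range(1, 21)]

print(breakouts(series))  # [30]
```

The crossing is flagged at the first step of the ramp, which is the sense in which diverging EMAs give an early indication of a potential capacity ask.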
  • 22. Acknowledgements
      • Winston Lee, Capacity Engineer, Twitter
      • Management team
  • 23. Join the Flock
      • We are hiring!!
          o https://twitter.com/JoinTheFlock
          o https://twitter.com/jobs
          o Contact us: @bryce_yan, @arun_kejariwal
      Like problem solving? Like challenges? Be at the cutting edge. Make an impact.