SlideShare ist ein Scribd-Unternehmen logo
1 von 22
Downloaden Sie, um offline zu lesen
Sixth International Workshop on Serverless Computing (WoSC6) 2020 Dec 7, 2020
Temporal Performance Modeling of Serverless Computing Platforms
N. Mahmoudi and H. Khazaei, “Temporal Performance Modelling of Serverless
Computing Platforms,” in Proceedings of the 6th International Workshop on
Serverless Computing, 2020, p. 1–6., doi: 10.10.1145/3429880.3430092.
https://www.serverlesscomputing.org/wosc6/#p1
Hamzeh Khazaei
hkh@yorku.ca
Performant and Available
Computing Systems (PACS) Lab
Nima Mahmoudi
nmahmoud@ualberta.ca
Sixth International Workshop on Serverless Computing (WoSC6) 2020 Dec 7, 2020
TOC
Introduction
System Description
Analytical Modelling
Experimental Validation
Conclusion
2
Introduction
3
Serverless Computing
● Runtime operation and management done by the provider
○ Reduces the overhead for the software owner
○ Provisioning
○ Scaling resources
● Software is developed by writing functions
○ Well-defined interface
○ Functions deployed separately
4
Serverless Computing
Image source: https://aws.amazon.com/lambda/
5
The Need for a Performance Model
● No previous work has been done for performance modelling of Serverless Computing
platforms
● Accurate performance modelling can beneficial in many ways:
○ Ensure the Quality of Service (QoS)
○ Improve performance metrics
○ Predict/optimize infrastructure cost
○ Move from best-effort to performance guarantees
● It can benefit both serverless provider and application developer
6
System Description
7
Function States, Cold Starts, and Warm Starts
● Function States:
○ Initializing: Performing initialization tasks to prepare the function for incoming requests. Includes
infrastructure initialization and application initialization.
○ Running: Running the tasks required to process a request.
○ Idle: Provisioned instance that is not running any workloads. The instances in this state are not billed.
● Cold Start Requests:
○ A request that needs to go through initialization steps due to lack of provisioned capacity.
● Warm Start Requests:
○ Only includes request processing time since idle instance was available
8
Autoscaling
Expiration Threshold
Scaling Out:
Scaling In:
9
Other Important Characteristics
● Initialization Time: The amount of time instance spends in the initializing state.
● Response Time: No queuing, so it is equal to service time. It remains stable
throughout time for cold and warm start requests.
● Maximum Concurrency Level: Maximum number of instances that can be in the state
running in parallel.
● Request Routing: To facilitate scaling in, requests are routed to recently created
instances first.
10
Analytical Modelling
11
Overview
Warm Pool Model:
12
● Cold Start Rate
○ The system behaves like an Erlang Loss System (M/G/m/m).
○ The blocked requests are either rejected (reached maximum concurrency level) or cause a
cold start (and thus the creation of a new instance)
● Arrival Rate for Each Instance
○ Requests blocked by instance n are either processed by instance n+1 or blocked by it.
○ The difference between blocked rates gives us individual arrival rates.
● Server Expiration Rate
○ Can be calculated knowing individual arrival rates and expiration threshold.
● Warm Pool Model
○ Each state represents the number of instances in the warm pool.
13
● For each state, we can also calculate:
○ Probability of Rejection: Probability of being blocked when reaching
maximum concurrency.
○ Probability of Cold Start: Probability of being blocked in other states.
○ Average Response Time:
○ Mean Number of Instances in Warm Pool:
■ Running
■ Idle
○ Utilization: Ratio of instances in running state over all instances.
● All predictions can be found in a time-bounded fashion (e.g., answers the
question “what happens to my QoS in the next 5 minutes?”)
14
Experimental Validation
15
Experimental Setup
● Experiments done on AWS Lambda
○ Python 3.6 runtime with 128MB of RAM on us-east-1 region
○ A mixture of CPU and IO intensive tasks
● Client was a virtual machine on Compute Canada Arbutus
○ 8 vCPUs, 16GB of RAM, 1000Mbps connectivity, single-digit milliseconds latency to AWS
servers
○ Python with in-house workload generation tool pacswg
○ Official boto3 library for API communication
○ Communicated directly with Lambda API, no intermediary interfaces like API Gateway
● Predictions are made 5 minutes into the future
● Assumed oracle request pattern prediction
16
Sample Workload
17
Experimental Results
18
Experimental Results (2)
19
Conclusion
20
Conclusion
● Accurate and tractable analytical performance model
● Ability to predict important performance/cost related metrics
● Can predict QoS
● Can benefit serverless providers
○ Ability to predict QoS under different loads on deployment time
○ Can be used in the management to prevent performance degradation
● Could be useful to application developers
○ Predict how their system will perform in the immediate future
○ Help them optimize their memory configuration to occur minimal cost that satisfies
performance requirements throughout their daily request patterns
● Can be used in the management systems to warm-up instances to prevent SLA violations
21
Summary
Website: nima-dev.com
Twitter: @nima_mahmoudi
22

Weitere ähnliche Inhalte

Was ist angesagt?

Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward
 
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward
 
Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017
Ioannis Papapanagiotou
 

Was ist angesagt? (20)

End to end testing a web application with Clojure
End to end testing a web application with ClojureEnd to end testing a web application with Clojure
End to end testing a web application with Clojure
 
Ajax
AjaxAjax
Ajax
 
Spring batch in action
Spring batch in actionSpring batch in action
Spring batch in action
 
H2020 finsec-ibm- aidan-shribman-finsec-skydive 260820
H2020 finsec-ibm- aidan-shribman-finsec-skydive 260820H2020 finsec-ibm- aidan-shribman-finsec-skydive 260820
H2020 finsec-ibm- aidan-shribman-finsec-skydive 260820
 
202104 technical challenging and our solutions - golang taipei
202104   technical challenging and our solutions - golang taipei202104   technical challenging and our solutions - golang taipei
202104 technical challenging and our solutions - golang taipei
 
What we learnt at carousell tw for golang gathering #31
What we learnt at carousell tw for golang gathering #31What we learnt at carousell tw for golang gathering #31
What we learnt at carousell tw for golang gathering #31
 
Flink Forward Berlin 2017: Matt Zimmer - Custom, Complex Windows at Scale Usi...
Flink Forward Berlin 2017: Matt Zimmer - Custom, Complex Windows at Scale Usi...Flink Forward Berlin 2017: Matt Zimmer - Custom, Complex Windows at Scale Usi...
Flink Forward Berlin 2017: Matt Zimmer - Custom, Complex Windows at Scale Usi...
 
Flink Forward Berlin 2017: Francesco Versaci - Integrating Flink and Kafka in...
Flink Forward Berlin 2017: Francesco Versaci - Integrating Flink and Kafka in...Flink Forward Berlin 2017: Francesco Versaci - Integrating Flink and Kafka in...
Flink Forward Berlin 2017: Francesco Versaci - Integrating Flink and Kafka in...
 
Monitoring NGINX (plus): key metrics and how-to
Monitoring NGINX (plus): key metrics and how-toMonitoring NGINX (plus): key metrics and how-to
Monitoring NGINX (plus): key metrics and how-to
 
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
Apache Software Foundation: How To Contribute, with Apache Flink as Example (...
 
Monolithic to microservices
Monolithic to microservicesMonolithic to microservices
Monolithic to microservices
 
The Dark Side Of Go -- Go runtime related problems in TiDB in production
The Dark Side Of Go -- Go runtime related problems in TiDB  in productionThe Dark Side Of Go -- Go runtime related problems in TiDB  in production
The Dark Side Of Go -- Go runtime related problems in TiDB in production
 
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
Flink Forward Berlin 2018: Shriya Arora - "Taming large-state to join dataset...
 
OSMC 2018 | Visualization of your distributed infrastructure by Nicolai Buchwitz
OSMC 2018 | Visualization of your distributed infrastructure by Nicolai BuchwitzOSMC 2018 | Visualization of your distributed infrastructure by Nicolai Buchwitz
OSMC 2018 | Visualization of your distributed infrastructure by Nicolai Buchwitz
 
Cassandra To Infinity And Beyond
Cassandra To Infinity And BeyondCassandra To Infinity And Beyond
Cassandra To Infinity And Beyond
 
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
Flink Forward San Francisco 2019: Massive Scale Data Processing at Netflix us...
 
Anatomy of an action
Anatomy of an actionAnatomy of an action
Anatomy of an action
 
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
OSMC 2018 | Stream connector: Easily sending events and/or metrics from the C...
 
Life of a startup - Sjoerd Mulder - Codemotion Amsterdam 2017
Life of a startup - Sjoerd Mulder - Codemotion Amsterdam 2017Life of a startup - Sjoerd Mulder - Codemotion Amsterdam 2017
Life of a startup - Sjoerd Mulder - Codemotion Amsterdam 2017
 
Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017Netflix Data Benchmark @ HPTS 2017
Netflix Data Benchmark @ HPTS 2017
 

Ähnlich wie Temporal Performance Modelling of Serverless Computing Platforms - WoSC6

Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ UberKafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
confluent
 
Archmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on DruidArchmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on Druid
Imply
 
Next Generation Automation in Ruckus Wireless
Next Generation Automation in Ruckus WirelessNext Generation Automation in Ruckus Wireless
Next Generation Automation in Ruckus Wireless
David Ko
 
Who needs containers in a serverless world
Who needs containers in a serverless worldWho needs containers in a serverless world
Who needs containers in a serverless world
Matthias Luebken
 

Ähnlich wie Temporal Performance Modelling of Serverless Computing Platforms - WoSC6 (20)

Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ UberKafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
 
Nexmark with beam
Nexmark with beamNexmark with beam
Nexmark with beam
 
Netty training
Netty trainingNetty training
Netty training
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE Architectures
 
Netty training
Netty trainingNetty training
Netty training
 
Serving deep learning models in a serverless platform (IC2E 2018)
Serving deep learning models in a serverless platform (IC2E 2018)Serving deep learning models in a serverless platform (IC2E 2018)
Serving deep learning models in a serverless platform (IC2E 2018)
 
Archmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on DruidArchmage, Pinterest’s Real-time Analytics Platform on Druid
Archmage, Pinterest’s Real-time Analytics Platform on Druid
 
Cloud arch patterns
Cloud arch patternsCloud arch patterns
Cloud arch patterns
 
Next Generation Automation in Ruckus Wireless
Next Generation Automation in Ruckus WirelessNext Generation Automation in Ruckus Wireless
Next Generation Automation in Ruckus Wireless
 
Scalable complex event processing on samza @UBER
Scalable complex event processing on samza @UBERScalable complex event processing on samza @UBER
Scalable complex event processing on samza @UBER
 
Machine Learning Infrastructure
Machine Learning InfrastructureMachine Learning Infrastructure
Machine Learning Infrastructure
 
Container world 2019 Canary Release
Container world 2019 Canary ReleaseContainer world 2019 Canary Release
Container world 2019 Canary Release
 
How Uber scaled its Real Time Infrastructure to Trillion events per day
How Uber scaled its Real Time Infrastructure to Trillion events per dayHow Uber scaled its Real Time Infrastructure to Trillion events per day
How Uber scaled its Real Time Infrastructure to Trillion events per day
 
Hadoop summit - Scaling Uber’s Real-Time Infra for Trillion Events per Day
Hadoop summit - Scaling Uber’s Real-Time Infra for  Trillion Events per DayHadoop summit - Scaling Uber’s Real-Time Infra for  Trillion Events per Day
Hadoop summit - Scaling Uber’s Real-Time Infra for Trillion Events per Day
 
Journey and evolution of Presto@Grab
Journey and evolution of Presto@GrabJourney and evolution of Presto@Grab
Journey and evolution of Presto@Grab
 
Enabling Presto to handle massive scale at lightning speed
Enabling Presto to handle massive scale at lightning speedEnabling Presto to handle massive scale at lightning speed
Enabling Presto to handle massive scale at lightning speed
 
Testing data streaming applications
Testing data streaming applicationsTesting data streaming applications
Testing data streaming applications
 
Who needs containers in a serverless world
Who needs containers in a serverless worldWho needs containers in a serverless world
Who needs containers in a serverless world
 
Structured Streaming in Spark
Structured Streaming in SparkStructured Streaming in Spark
Structured Streaming in Spark
 
Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3Scaling Monitoring At Databricks From Prometheus to M3
Scaling Monitoring At Databricks From Prometheus to M3
 

Kürzlich hochgeladen

Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
Epec Engineered Technologies
 
Verification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxVerification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptx
chumtiyababu
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
Kamal Acharya
 

Kürzlich hochgeladen (20)

Standard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power PlayStandard vs Custom Battery Packs - Decoding the Power Play
Standard vs Custom Battery Packs - Decoding the Power Play
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
AIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech studentsAIRCANVAS[1].pdf mini project for btech students
AIRCANVAS[1].pdf mini project for btech students
 
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Verification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptxVerification of thevenin's theorem for BEEE Lab (1).pptx
Verification of thevenin's theorem for BEEE Lab (1).pptx
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptxA CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
A CASE STUDY ON CERAMIC INDUSTRY OF BANGLADESH.pptx
 
Online electricity billing project report..pdf
Online electricity billing project report..pdfOnline electricity billing project report..pdf
Online electricity billing project report..pdf
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
Hospital management system project report.pdf
Hospital management system project report.pdfHospital management system project report.pdf
Hospital management system project report.pdf
 

Temporal Performance Modelling of Serverless Computing Platforms - WoSC6

  • 1. Sixth International Workshop on Serverless Computing (WoSC6) 2020 Dec 7, 2020 Temporal Performance Modeling of Serverless Computing Platforms N. Mahmoudi and H. Khazaei, “Temporal Performance Modelling of Serverless Computing Platforms,” in Proceedings of the 6th International Workshop on Serverless Computing, 2020, p. 1–6., doi: 10.10.1145/3429880.3430092. https://www.serverlesscomputing.org/wosc6/#p1 Hamzeh Khazaei hkh@yorku.ca Performant and Available Computing Systems (PACS) Lab Nima Mahmoudi nmahmoud@ualberta.ca
  • 2. Sixth International Workshop on Serverless Computing (WoSC6) 2020 Dec 7, 2020 TOC Introduction System Description Analytical Modelling Experimental Validation Conclusion 2
  • 4. Serverless Computing ● Runtime operation and management done by the provider ○ Reduces the overhead for the software owner ○ Provisioning ○ Scaling resources ● Software is developed by writing functions ○ Well-defined interface ○ Functions deployed separately 4
  • 5. Serverless Computing Image source: https://aws.amazon.com/lambda/ 5
  • 6. The Need for a Performance Model ● No previous work has been done for performance modelling of Serverless Computing platforms ● Accurate performance modelling can beneficial in many ways: ○ Ensure the Quality of Service (QoS) ○ Improve performance metrics ○ Predict/optimize infrastructure cost ○ Move from best-effort to performance guarantees ● It can benefit both serverless provider and application developer 6
  • 8. Function States, Cold Starts, and Warm Starts ● Function States: ○ Initializing: Performing initialization tasks to prepare the function for incoming requests. Includes infrastructure initialization and application initialization. ○ Running: Running the tasks required to process a request. ○ Idle: Provisioned instance that is not running any workloads. The instances in this state are not billed. ● Cold Start Requests: ○ A request that needs to go through initialization steps due to lack of provisioned capacity. ● Warm Start Requests: ○ Only includes request processing time since idle instance was available 8
  • 10. Other Important Characteristics ● Initialization Time: The amount of time instance spends in the initializing state. ● Response Time: No queuing, so it is equal to service time. It remains stable throughout time for cold and warm start requests. ● Maximum Concurrency Level: Maximum number of instances that can be in the state running in parallel. ● Request Routing: To facilitate scaling in, requests are routed to recently created instances first. 10
  • 13. ● Cold Start Rate ○ The system behaves like an Erlang Loss System (M/G/m/m). ○ The blocked requests are either rejected (reached maximum concurrency level) or cause a cold start (and thus the creation of a new instance) ● Arrival Rate for Each Instance ○ Requests blocked by instance n are either processed by instance n+1 or blocked by it. ○ The difference between blocked rates gives us individual arrival rates. ● Server Expiration Rate ○ Can be calculated knowing individual arrival rates and expiration threshold. ● Warm Pool Model ○ Each state represents the number of instances in the warm pool. 13
  • 14. ● For each state, we can also calculate: ○ Probability of Rejection: Probability of being blocked when reaching maximum concurrency. ○ Probability of Cold Start: Probability of being blocked in other states. ○ Average Response Time: ○ Mean Number of Instances in Warm Pool: ■ Running ■ Idle ○ Utilization: Ratio of instances in running state over all instances. ● All predictions can be found in a time-bounded fashion (e.g., answers the question “what happens to my QoS in the next 5 minutes?”) 14
  • 16. Experimental Setup ● Experiments done on AWS Lambda ○ Python 3.6 runtime with 128MB of RAM on us-east-1 region ○ A mixture of CPU and IO intensive tasks ● Client was a virtual machine on Compute Canada Arbutus ○ 8 vCPUs, 16GB of RAM, 1000Mbps connectivity, single-digit milliseconds latency to AWS servers ○ Python with in-house workload generation tool pacswg ○ Official boto3 library for API communication ○ Communicated directly with Lambda API, no intermediary interfaces like API Gateway ● Predictions are made 5 minutes into the future ● Assumed oracle request pattern prediction 16
  • 21. Conclusion ● Accurate and tractable analytical performance model ● Ability to predict important performance/cost related metrics ● Can predict QoS ● Can benefit serverless providers ○ Ability to predict QoS under different loads on deployment time ○ Can be used in the management to prevent performance degradation ● Could be useful to application developers ○ Predict how their system will perform in the immediate future ○ Help them optimize their memory configuration to occur minimal cost that satisfies performance requirements throughout their daily request patterns ● Can be used in the management systems to warm-up instances to prevent SLA violations 21