© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Blair Layton, Business Development Manager,
APAC
April, 2016
Scaling Applications for Large
Promotions and Events
What is a
Large Scale Event?
What is a Large Scale Event?
An event where you need more capacity than normally
allocated for a period of time
Typically from minutes to days, but could be a couple of
weeks
Often associated with a sudden surge of users
Hard to architect and provision for at a reasonable cost
Consumers get angry when it all goes wrong!
What is a Large Scale Event?
For you, it could be as simple as needing twice as much
capacity for a short promotion
Everyone’s Large Scale Event is different, but the
underlying concepts are the same
What Problems do you Face?
Unknown infrastructure requirements
• Cost?
Short duration of the event
• Massive investment in infrastructure that is otherwise idle or
underutilized
• Often tight deadlines to get the system live
Legacy system integration
Understanding system behaviour, required metrics
Getting the right architecture
Finding the right talent
You Don’t Want This!
One question is
constant!
How do we scale,
especially the
database?
So let’s start from day
one, user one (you)
Day One, User One
A single EC2 Instance
• With full stack on this host
• Web app
• Database
• Management
• Etc.
A single Elastic IP
Route 53 for DNS
[Diagram: User, Amazon Route 53, Elastic IP, EC2 instance]
“We’re gonna need a bigger box”
Simplest approach
Can now leverage Provisioned IOPS (PIOPS)
High I/O instances
High memory instances
High CPU instances
High storage instances
Easy to change instance sizes
Will hit a limit eventually
[Diagram: instance sizes t2.micro, m3.large, r3.8xlarge]
Day One, User One:
We could potentially get to a
few hundred to a few
thousand depending on
application complexity and
traffic
No failover
No redundancy
Too many eggs in one
basket
[Diagram: User, Amazon Route 53, Elastic IP, EC2 instance]
Day Two, User >1
First let’s separate out our
single host into more than one.
Web
Database
• Make use of a database
service?
[Diagram: User, Amazon Route 53, Elastic IP, web instance, database instance]
Start with the right
databases for the job
So decide wisely.
Look for the key
points of scale.
User >100
First let’s separate out our
single host into more than one
Web
Database
• Use RDS to make your life
easier
[Diagram: User, Amazon Route 53, Elastic IP, web instance, RDS DB instance]
User > 1000
Next let’s address our lack of
failover and redundancy
Elastic Load Balancing
Another web instance
• In another Availability Zone
Enable Amazon RDS Multi-AZ
[Diagram: User, Amazon Route 53, Elastic Load Balancing, two Availability Zones, 2 web instances, RDS DB instance active (Multi-AZ), RDS DB instance standby (Multi-AZ)]
User > 10,000s–100,000s
[Diagram: User, Amazon Route 53, Elastic Load Balancing, two Availability Zones, 8 web instances, RDS DB instance active (Multi-AZ), RDS DB instance standby (Multi-AZ), 4 RDS DB instance read replicas]
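One way an application can take advantage of the read replicas in this layout is to send writes to the active instance and spread reads across the replicas. The following is a minimal sketch only: the endpoint names, the `ReadWriteRouter` class and the round-robin policy are assumptions made for illustration and are not from the original deck.

```python
import itertools


class ReadWriteRouter:
    """Pick a database endpoint per statement: primary for writes, replicas for reads."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replica_cycle = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Crude classification: only plain SELECT statements go to a replica.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replica_cycle) if is_read else self.primary


# Hypothetical endpoint names, for illustration only.
router = ReadWriteRouter(
    primary="primary.db.example.internal",
    replicas=["replica-1.db.example.internal", "replica-2.db.example.internal"],
)
```

RDS read replicas are updated asynchronously, so a read that must see a just-committed write may still need to go to the active instance.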
This will take us pretty far
honestly, but we care about
performance and
efficiency, so let’s clean
this up a bit
Shift Some Load Around
Let’s lighten the load on our
web and database instances
Move static content from the web
instance to Amazon S3 and
CloudFront
Move dynamic content delivery from
Elastic Load Balancing to
CloudFront
Move session/state and DB
caching to ElastiCache or
DynamoDB
[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, one Availability Zone, web instance, RDS DB instance active (Multi-AZ), ElastiCache, Amazon DynamoDB]
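Moving database caching out of the web and database instances is often done with a cache-aside read path: check the cache first, fall back to the database on a miss, then populate the cache. The sketch below assumes an in-memory `TTLCache` as a stand-in for a cache service such as ElastiCache; the class, the `get_with_cache` helper and the `load_from_db` callback are hypothetical names, not part of any AWS API.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache service; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)


def get_with_cache(cache, key, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    value = cache.get(key)
    if value is None:  # a stored None is also treated as a miss in this sketch
        value = load_from_db(key)
        cache.set(key, value)
    return value
```

With this shape, repeated reads of the same key within the TTL never reach the database, which is the load the slide proposes to shift away from the RDS instance.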
User > 500k
[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, DynamoDB, two Availability Zones, 6 web instances, 2 ElastiCache nodes, 2 RDS DB instance read replicas, RDS DB instance active (Multi-AZ), RDS DB instance standby (Multi-AZ)]
Time to make some
radical improvements at
the web & app layers
SOAing
Move services into their own tiers
or modules. Treat each of these
as 100% separate pieces of your
infrastructure and scale them
independently.
Amazon.com and AWS do this
extensively! It offers flexibility and
greater understanding of each
component.
Loose Coupling Sets You Free!
The looser they're coupled, the bigger they scale
• Use independent components
• Design everything as a black box
• Decouple interactions
• Favor services with built-in redundancy and scalability rather than
building your own
[Diagram: Tight coupling, with Controller A and Controller B connected directly. Loose coupling, with Controller A and Controller B connected through queues (Q)]
Use Amazon SQS as Buffers
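The buffering idea can be sketched in a few lines. This example uses Python's in-process `queue.Queue` as a stand-in for Amazon SQS, so it only illustrates the shape of the interaction: Controller A enqueues work and returns, and Controller B drains the queue at its own pace. The order names and the `None` stop marker are assumptions made for the example.

```python
import queue
import threading

# In-process stand-in for a message queue such as Amazon SQS.
buffer = queue.Queue()
results = []


def controller_a(orders):
    """Producer: enqueue work and return immediately, without calling B directly."""
    for order in orders:
        buffer.put(order)


def controller_b():
    """Consumer: drain the queue at its own pace until it sees the stop marker."""
    while True:
        order = buffer.get()
        if order is None:  # stop marker
            buffer.task_done()
            break
        results.append(f"processed {order}")
        buffer.task_done()


worker = threading.Thread(target=controller_b)
worker.start()
controller_a(["order-1", "order-2", "order-3"])
buffer.put(None)  # tell the consumer to stop
worker.join()
```

Because neither controller calls the other, each side can be scaled or restarted independently, and a burst from the producer accumulates in the queue instead of overloading the consumer.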
Users > 1 Million
[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancer, one Availability Zone, 4 web instances, 2 worker instances, 2 internal app instances, RDS DB instance active (Multi-AZ), 2 RDS DB instance read replicas, Amazon DynamoDB, Amazon SQS, ElastiCache, Amazon CloudWatch, Amazon SES]
The next big steps
From 5 to 10 Million Users
You may start to run into issues with your database around
contention on the write master.
How can you solve it?
Federation (splitting into multiple DBs based on function)
Sharding (splitting one data set across multiple hosts)
Moving some functionality to other types of DBs (NoSQL)
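The difference between federation and sharding can be shown with a small routing sketch. Federation picks a database by function, while sharding picks a host for a key within one data set. The database names, shard names and the `shard_for` helper below are hypothetical, and hash-modulo placement is just one simple scheme among several.

```python
import hashlib


def shard_for(key, shards):
    """Map a key to one shard using a stable hash (not Python's per-process hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Federation: hypothetical mapping of functional areas to separate databases.
FEDERATED_DBS = {"users": "users-db", "orders": "orders-db"}

# Sharding: hypothetical hosts that each hold a slice of the same user data set.
USER_SHARDS = ["users-shard-0", "users-shard-1", "users-shard-2", "users-shard-3"]
```

A call such as `shard_for("user-42", USER_SHARDS)` always returns the same shard for the same key. With plain modulo placement, changing the number of shards remaps most keys, which is why schemes such as consistent hashing are often considered when the shard count is expected to grow.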
From 5 to 10 Million Users
You may start to run into issues with speed and performance
of your applications
Make sure you have monitoring, metrics, & logging in place
• If you can’t build it internally, outsource it! (third-party SaaS)
Pay attention to what customers are saying works well vs.
what doesn’t, and use this as direction
Try to squeeze as much performance as possible out of each
service or component
Singaporean Gaming
Company
Sizing for Peak Loads
Promotions cause huge spikes in user activity
Auto-scaling works for the web and middle tier
RDS instances have to be sized for peak loads
Adopted our recommendations in a staged approach
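The contrast between a tier that scales with load and a tier sized for the peak can be made concrete with some rough arithmetic. The function below is a sketch under stated assumptions: the per-instance capacity, the 20% headroom and the group bounds are invented numbers for illustration, not figures from the customer or from AWS.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance, headroom=0.2,
                     minimum=2, maximum=100):
    """Instances required to serve a load with spare headroom, clamped to group bounds."""
    raw = requests_per_second * (1 + headroom) / capacity_per_instance
    return max(minimum, min(maximum, math.ceil(raw)))


# Hypothetical loads for a web tier where one instance handles about 350 requests/s.
quiet_day = instances_needed(10, 350)        # clamped up to the minimum
promotion_peak = instances_needed(1000, 350)
```

An auto-scaled web tier can move between the quiet-day and promotion figures as traffic changes, whereas, as the slide notes, the RDS instance has to be provisioned for the promotion figure ahead of time.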
[Diagram, stage 1: User, Amazon Route 53 (Geo Routing), CloudFront, Amazon S3, US East with Availability Zone #1 and Availability Zone #2, Amazon EC2 in each zone with Auto Scaling, Amazon CloudWatch, RDS DB instance active (Multi-AZ), RDS DB instance standby (Multi-AZ)]
[Diagram, stage 2: as stage 1, plus an RDS DB instance read replica]
[Diagram, stage 3: as stage 2, plus DynamoDB]
[Diagram, stage 4: as stage 3, plus ElastiCache (Memcached)]
[Diagram, stage 5: as stage 4, plus ElastiCache (Redis master), a Redis slave, and Amazon Redshift]
Lessons Learned
Listen to AWS Business Development and Solution
Architects ;)
Gaming promotions much easier to handle
Unpredicted loads also easier to handle
Senior operations person moving to a new game
Customers get a much better gaming experience!
Singaporean Telco
Customer Success Stories
Telecommunications Company
iPhone 5s/5c, 6/6+ and Samsung Note III launch
Needed a system to handle a huge number of concurrent
requests
Failed previously at the iPhone 5 launch
Management directive to succeed at all costs!
Telco
[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancer, one Availability Zone, 4 web instances, Amazon DynamoDB, 2 ElastiCache nodes, Amazon CloudWatch]
Great Success!
Tested with 150,000 concurrent users
All phones gone within 2 minutes
No phones misallocated or unallocated
Management said the system was too fast!
Actual launch went smoothly
Lessons
AWS can provide infrastructure for applications to scale to
very high numbers of concurrent users
Managed services allow for quick deployment and changes
to infrastructure
Impossible for the customer to execute internally
Massive cost savings, even with huge over-provisioning
“With our systems on AWS, we
can scale our resources more
than 130-fold in 30 minutes,
enabling us to support more
than 2,500 orders per second”
KT Chiu
Founder and Chief Executive Officer
TixCraft