Whether you are promoting the launch of a new phone, offering tickets to a popular singer’s concert, running an online promotion for an e-commerce store, or reporting on a breaking news story, all of these scenarios need an architecture that is agile and cost-effective. Discover how the Amazon Web Services platform helps you scale your applications up and down to handle unforeseen and planned user spikes. Find out how customers built their application architectures to solve their scaling problems and the benefits they gained by using Amazon Web Services. In this session, we will share a live demo of step-by-step scaling on our platform.
Blair Layton, Business Development Manager – Database Services, Amazon Web Services, APAC
3. What is a Large Scale Event?
An event where you need more capacity than is normally allocated, for a period of time
Typically minutes to days, but could be a couple of weeks
Often associated with a sudden surge of users
Hard to architect and provision for at a reasonable cost
Consumers get angry when it all goes wrong!
8. What is a Large Scale Event?
For you, it could be as simple as needing twice as much capacity for a short promotion
Everyone’s Large Scale Event is different, but the underlying concepts are the same
9. What Problems do you Face?
Unknown infrastructure requirements
• Cost?
Short duration of the event
• Massive investment in infrastructure that is otherwise idle or underutilized
• Often tight deadlines to get the system live
Legacy system integration
Understanding system behaviour and the required metrics
Getting the right architecture
Finding the right talent
15. Day One, User One
A single EC2 instance
• With the full stack on this host
• Web app
• Database
• Management
• Etc.
A single Elastic IP
Route 53 for DNS
[Diagram: User → Amazon Route 53 → Elastic IP → EC2 Instance]
16. “We’re gonna need a bigger box”
Simplest approach
Can now leverage Provisioned IOPS (PIOPS)
High I/O instances
High memory instances
High CPU instances
High storage instances
Easy to change instance sizes
Will hit a limit eventually
t2.micro → m3.large → r3.8xlarge
17. Day One, User One:
We could potentially get to a few hundred to a few thousand users, depending on application complexity and traffic
No failover
No redundancy
Too many eggs in one basket
[Diagram: User → Amazon Route 53 → Elastic IP → EC2 Instance]
18. Day Two, User >1
First let’s separate out our single host into more than one
Web
Database
• Make use of a database service?
[Diagram: User → Amazon Route 53 → Elastic IP → Web Instance → Database Instance]
21. User >100
First let’s separate out our single host into more than one
Web
Database
• Use RDS to make your life easier
[Diagram: User → Amazon Route 53 → Elastic IP → Web Instance → RDS DB Instance]
22. User >1000
Next let’s address our lack of failover and redundancy issues
Elastic Load Balancing
Another web instance
• In another Availability Zone
Enable Amazon RDS Multi-AZ
[Diagram: User → Amazon Route 53 → Elastic Load Balancing → Web Instances in two Availability Zones → RDS DB Instance Active (Multi-AZ) with a Standby in the second AZ]
23. Users >10k–100k
[Diagram: User → Amazon Route 53 → Elastic Load Balancing → many Web Instances spread across two Availability Zones → RDS DB Instance Active (Multi-AZ) with a Standby, plus multiple RDS Read Replicas in each AZ]
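Once read replicas are in the picture, the application has to decide which endpoint each query goes to. A minimal sketch of read/write splitting, assuming hypothetical endpoint names; a real deployment would use the actual RDS endpoints and usually driver-level or proxy-level routing rather than string inspection:

```python
import itertools

# Hypothetical endpoints standing in for an RDS primary and its read
# replicas; the names are illustrative, not real AWS resources.
class ReadWriteRouter:
    """Send writes to the primary; fan reads out over replicas round-robin."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def endpoint_for(self, query):
        # Crude heuristic: anything that is not a SELECT goes to the primary.
        if query.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.primary

router = ReadWriteRouter(
    primary="primary.example.internal",
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))    # → replica-1.example.internal
print(router.endpoint_for("INSERT INTO orders ..."))  # → primary.example.internal
print(router.endpoint_for("SELECT 1"))                # → replica-2.example.internal
```

Note that replicas are eventually consistent, so reads that must see their own writes still belong on the primary.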
24. This will take us pretty far, honestly, but we care about performance and efficiency, so let’s clean this up a bit
25. Shift Some Load Around
Let’s lighten the load on our web and database instances
Move static content from the web instance to Amazon S3 and CloudFront
Move dynamic content from Elastic Load Balancing to CloudFront
Move session/state and DB caching to ElastiCache or DynamoDB
[Diagram: User → Amazon Route 53 → Amazon CloudFront (backed by Amazon S3 and Elastic Load Balancing) → Web Instance → RDS DB Instance Active (Multi-AZ), with ElastiCache and Amazon DynamoDB for sessions and caching]
26. Users >500k
[Diagram: User → Amazon Route 53 → Amazon CloudFront + Amazon S3 → Elastic Load Balancing → Web Instances across two Availability Zones, each zone with ElastiCache → RDS DB Instance Active (Multi-AZ) with a Standby and a Read Replica in each AZ → DynamoDB for session/state]
27. Time to make some radical improvements at the web & app layers
28. SOAing (Service-Oriented Architecture)
Move services into their own tiers or modules. Treat each of these as 100% separate pieces of your infrastructure and scale them independently.
Amazon.com and AWS do this extensively! It offers flexibility and a greater understanding of each component.
29. Loose Coupling Sets You Free!
The looser they’re coupled, the bigger they scale
• Use independent components
• Design everything as a black box
• Decouple interactions
• Favor services with built-in redundancy and scalability over building your own
[Diagram: Tight coupling – Controller A calls Controller B directly; Loose coupling – Controller A and Controller B communicate through Amazon SQS queues used as buffers]
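The queue-as-buffer pattern in the diagram can be sketched locally. Here `queue.Queue` stands in for an Amazon SQS queue so the example runs self-contained; with real SQS, Controller A would call `send_message` and Controller B would poll with `receive_message`, but the decoupling idea is the same:

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for an SQS queue buffering the two controllers
results = []

def controller_a():
    """Producer: enqueue work and return immediately, never blocking on B."""
    for i in range(5):
        jobs.put({"order_id": i})

def controller_b():
    """Consumer: drain the queue at its own pace."""
    while True:
        msg = jobs.get()
        if msg is None:          # sentinel: no more work
            break
        results.append(msg["order_id"])
        jobs.task_done()

worker = threading.Thread(target=controller_b)
worker.start()
controller_a()
jobs.put(None)                   # tell the consumer to stop
worker.join()
print(results)  # → [0, 1, 2, 3, 4]
```

Because A never waits on B, each side can be scaled (or fail and retry) independently, which is exactly what the slide means by loose coupling.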
30. Users >1 Million
[Diagram: User → Amazon Route 53 → Amazon CloudFront + Amazon S3 → Elastic Load Balancing → Web Instances → RDS DB Instance Active (Multi-AZ) with two Read Replicas, Amazon DynamoDB, and ElastiCache; Amazon SQS feeding Worker Instances; Internal App Instances; Amazon CloudWatch for monitoring; Amazon SES for email]
32. From 5 to 10 Million Users
You may start to run into issues with your database around contention on the write master.
How can you solve it?
Federation (splitting into multiple DBs based on function)
Sharding (splitting one data set up across multiple hosts)
Moving some functionality to other types of DBs (NoSQL)
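The sharding option above can be sketched in a few lines. This is the simplest hash-modulo scheme, with hypothetical shard endpoint names, not necessarily what any given customer used; real systems often prefer consistent hashing or a lookup table so that adding shards does not remap every key:

```python
import hashlib

# Four hypothetical shard endpoints; names are illustrative only.
SHARDS = [
    "shard-0.example.internal",
    "shard-1.example.internal",
    "shard-2.example.internal",
    "shard-3.example.internal",
]

def shard_for(user_id):
    """Deterministically map a user to one shard via a stable hash.

    Using a cryptographic hash (rather than Python's built-in hash())
    keeps the mapping identical across processes and restarts.
    """
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The same user always lands on the same shard, so reads find their writes.
print(shard_for("alice") == shard_for("alice"))  # → True
```

Federation, by contrast, needs no hash at all: the routing key is simply the function (users DB, orders DB, products DB, and so on).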
33. From 5 to 10 Million Users
You may start to run into issues with the speed and performance of your applications
Make sure you have monitoring, metrics, & logging in place
• If you can’t build it internally, outsource it! (third-party SaaS)
Pay attention to what customers say works well vs. what doesn’t, and use this as direction
Try to squeeze as much performance as possible out of each service or component
35. Sizing for Peak Loads
Promotions cause huge spikes in user activity
Auto Scaling works for the web and middle tiers
RDS instances have to be sized for peak loads
The customer adopted our recommendations in a staged approach
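Sizing for a peak is, at its core, simple arithmetic. A back-of-the-envelope sketch; the throughput figures and the 25% headroom are illustrative assumptions, not AWS guidance, and real per-instance throughput should come from load testing:

```python
import math

def desired_instances(expected_peak_rps, per_instance_rps, headroom=0.25):
    """Fleet size for a promotion: peak load divided by per-instance
    throughput, padded with headroom so a single hot second doesn't
    saturate the tier. All numbers here are illustrative."""
    needed = expected_peak_rps * (1 + headroom) / per_instance_rps
    return math.ceil(needed)

# e.g. a promotion expected to peak at 2,500 requests/s, with each web
# instance handling ~150 requests/s in load tests:
print(desired_instances(2500, 150))  # → 21
```

For the auto-scaled web and middle tiers this number becomes the scaling target; for RDS, which the slide notes cannot scale out on demand the same way, it has to be provisioned up front.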
36. [Diagram: User → Geo Routing with Amazon Route 53 → CloudFront + Amazon S3 → US East: Amazon EC2 with Auto Scaling across Availability Zones #1 and #2 → RDS DB Instance Active (Multi-AZ) with a Standby; Amazon CloudWatch for monitoring]
37. [Diagram: User → Geo Routing with Amazon Route 53 → CloudFront + Amazon S3 → US East: Amazon EC2 with Auto Scaling across two Availability Zones → RDS DB Instance Active (Multi-AZ) with a Standby, plus an RDS DB instance read replica; Amazon CloudWatch]
38. [Diagram: User → Geo Routing with Amazon Route 53 → CloudFront + Amazon S3 + DynamoDB → US East: Amazon EC2 with Auto Scaling across two Availability Zones → RDS DB Instance Active (Multi-AZ) with a Standby, plus a read replica; Amazon CloudWatch]
39. [Diagram: User → Geo Routing with Amazon Route 53 → CloudFront + Amazon S3 + DynamoDB → US East: Amazon EC2 with Auto Scaling across two Availability Zones, now with ElastiCache Memcached → RDS DB Instance Active (Multi-AZ) with a Standby, plus a read replica; Amazon CloudWatch]
40. [Diagram: User → Geo Routing with Amazon Route 53 → CloudFront + Amazon S3 + DynamoDB → US East: Amazon EC2 with Auto Scaling across two Availability Zones, with ElastiCache Memcached and ElastiCache (Redis Master) plus a Redis Slave → RDS DB Instance Active (Multi-AZ) with a Standby, plus a read replica; Amazon CloudWatch; Amazon Redshift]
41. Lessons Learned
Listen to AWS Business Development and Solution Architects ;)
Gaming promotions much easier to handle
Unpredicted loads also easier to handle
Senior operations person moving to a new game
Customers get a much better gaming experience!
43. Customer Success Stories
Telecommunications Company
iPhone 5s/5c, 6/6+ and Samsung Note III launch
Needed a system to handle a huge number of concurrent requests
Failed previously at the iPhone 5 launch
Management directive to succeed at all costs!
45. Great Success!
Tested with 150,000 concurrent users
All phones gone within 2 minutes
No phones misallocated or unallocated
Management said the system was too fast!
Actual launch went smoothly
46. Lessons
AWS can provide infrastructure for applications to scale to very high numbers of concurrent users
Managed services allow for quick deployment and changes to infrastructure
Impossible for the customer to execute internally
Massive cost savings, even with huge over-provisioning
47. “With our systems on AWS, we can scale our resources more than 130-fold in 30 minutes, enabling us to support more than 2,500 orders per second”
KT Chiu, Founder and Chief Executive Officer, TixCraft