Scalable Databases for Fast Growing Startups
Blair Layton
Business Development Manager – Database Services
Amazon Web Services - APAC

Agenda
• Self-Managed or Managed Database Services?
• NoSQL or Relational?
• Performance Tips and Tricks
• How to scale from 1 to 10,000,000 users?
• How do I save money?
• Summary
• Q&A

Self-Managed or
Managed Database Services?

Why Managed Databases?
[Pie chart of database administration effort by task. Slices: backup & recovery, data load & unload; performance tuning; scripting & coding; security; planning; install, upgrade, patch and migrate; documentation, licensing & training. Slice sizes recoverable from the chart: 40%, 25%, 5%, 5%.]

If You Host Your Databases On-premises
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
OS installation
you
App optimization

If You Host Your Databases in EC2
you:
OS patches
DB s/w patches
Database backups
Scaling
High availability
DB s/w installs
App optimization
AWS:
Power, HVAC, net
Rack & stack
Server maintenance
OS installation

If You Choose a Managed
Database Service
Power, HVAC, net
Rack & stack
Server maintenance
OS patches
DB s/w patches
Database backups
App optimization
High availability
DB s/w installs
OS installation
you
Scaling

differentiated effort increases the
uniqueness of an application

AWS Database Services
Scalable High Performance Application Storage in the Cloud
Database: Amazon RDS, Amazon DynamoDB, Amazon Redshift, Amazon ElastiCache
[Diagram: AWS stack layers: Deployment & Administration; Application Services; Compute, Storage, Database; Networking; AWS Global Infrastructure]

Relational Databases
Fully managed; zero admin
MySQL, Oracle, Postgres, SQL Server
Trillions of I/O requests/month
Amazon
RDS

Flipboard relies on Amazon RDS
• Flipboard is an online magazine with millions of users and billions of “flips” per month
• Uses Amazon RDS and its Multi-AZ capabilities to store mission critical user data
"We were able to go from concept to delivered product in about six months with just a handful of engineers."
- Greg Scallan, Chief Architect, Flipboard

Key Features
• Manageability
– Rapid deployment with pre-configured parameters
– Patch Management
– Monitoring and Metrics
• Availability and Data Durability
– Automated Backups and Point-In-Time-Recovery
– DB Snapshots
– Automatic Host Replacement (Single-AZ)
– Multi-AZ deployments
• Scalability
– Push-Button Scaling: Storage, Memory and Compute
– Read Replicas

RDS for Production Workloads
Amazon RDS Configuration: Push-Button Scaling, Multi-AZ, Read Replicas, Provisioned IOPS
Goals: Improve Availability, Increase Throughput, Reduce Latency
[Diagram: a Region with two Availability Zones, showing Multi-AZ, Read Replicas, Push-Button Scaling, and Provisioned IOPS]

In-Memory Cache
Elastic and reliable
Memcached or Redis
Amazon
ElastiCache

ElastiCache: Fully Managed Cache Service
• Easy to Deploy
– Deploy master-slave(s) configuration with a few button clicks or API calls
• Easy to Migrate
– Compatible with memcached or Redis
– Existing code will work when you update node endpoints
• Easy to Administer
– ElastiCache automatically replaces failed nodes and patches software as needed
– CloudWatch enables you to monitor cache performance metrics
• Easy to Secure
– Supports VPC and Security Group configurations
• Easy to Scale
– Provides assisted scale up and scale out capability

Hot Items
[Diagram: Application Server reading hot items from the cache]
Small, frequently accessed items are ideal candidates for read caching
• Reduce server-side latency to <1ms
• Eliminate “hot spot” performance barriers
• Offload heavy read activity from database
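
The read-caching idea on this slide can be sketched in a few lines. This is a minimal illustration, not ElastiCache client code: `HotItemCache` is a hypothetical in-process stand-in for Memcached or Redis, `slow_database_lookup` is a made-up placeholder for a real query, and the 30-second TTL is an arbitrary assumption.

```python
import time


class HotItemCache:
    """Hypothetical in-process stand-in for a cache such as Memcached or Redis."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # called on a miss, e.g. a database read
        self._ttl = ttl_seconds    # how long a cached value stays valid
        self._clock = clock
        self._items = {}           # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1         # served from memory, database untouched
            return entry[1]
        self.misses += 1           # absent or expired: fall back to the loader
        value = self._loader(key)
        self._items[key] = (now + self._ttl, value)
        return value


def slow_database_lookup(key):
    """Placeholder for a real database query (assumption for this sketch)."""
    return f"profile-for-{key}"


cache = HotItemCache(slow_database_lookup, ttl_seconds=30.0)
for _ in range(5):
    cache.get("user:42")
print(cache.hits, cache.misses)
```

Only the first of the five reads reaches the loader; the rest are served from memory, which is the offloading effect the slide describes.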

NoSQL Database
Durable low latency
Massive and seamless scalability
Amazon
DynamoDB

Durable Low Latency – At Scale
WRITES
• Continuously replicated to 3 AZs
• Quorum acknowledgment
• Persisted to disk (custom SSD)
READS
• Strongly or eventually consistent
• No trade-off in latency
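
To make "quorum acknowledgment" concrete, here is a generic sketch of a majority-quorum write across three replicas. It is not DynamoDB's actual implementation, which the slides do not describe; the `Replica` class, the 2-of-3 quorum size, and the use of `ConnectionError` to model an unreachable replica are all assumptions for illustration.

```python
class Replica:
    """Hypothetical replica living in one Availability Zone."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def put(self, key, value):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = value


def quorum_write(replicas, key, value, quorum=2):
    """Return True once at least `quorum` replicas have stored the value."""
    acks = 0
    for replica in replicas:
        try:
            replica.put(key, value)
            acks += 1
        except ConnectionError:
            continue  # a failed replica simply does not acknowledge
    return acks >= quorum


replicas = [Replica("az-a"), Replica("az-b"), Replica("az-c", healthy=False)]
print(quorum_write(replicas, "score:alice", 1200))
```

With one of three replicas down the write is still acknowledged; with two down it is not, which is the durability trade-off a quorum encodes.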

Petabyte scale
Massively parallel
Relational data warehouse
Amazon
Redshift
a lot faster
a lot cheaper
a whole lot simpler

Parallelize and Distribute Everything
• Load
• Query
• Resize
• Backup
• Restore
[Diagram: SQL Clients / BI Tools in the Client VPC connect to a Leader Node; the Leader Node connects over 10 GigE (HPC) to three Compute Nodes of 16TB each; Ingestion, Backup, and Restore run against Amazon S3]

Databases on EC2
• Any database that runs on Windows or Linux!
• Why?
– No managed service exists from AWS, e.g. MongoDB
– Full control
– Exceed limits of managed service, e.g. > 3TB of storage on RDS

Spectrum of Database Options
Axes: SQL to NoSQL; Do-it Yourself to Fully Managed; Low Cost to High Cost
Legend: Not available on AWS
SQL, Managed: MySQL, Oracle, SQL Server, PostgreSQL; Amazon Redshift
SQL, Do-it Yourself: MySQL, Oracle, SQL Server, PostgreSQL, MariaDB, Vertica, ParAccel…
NoSQL, Do-it Yourself: MongoDB, Cassandra, Redis, Memcache
NoSQL, Managed: DynamoDB, ElastiCache (Memcache), ElastiCache (Redis), SimpleDB

Thinking About the Questions
• Should I use SQL or NoSQL?
• Should I use MySQL or PostgreSQL?
• Should I use Redis, Memcache, or ElastiCache?
• Should I use MongoDB, Cassandra, or DynamoDB?

Actually, Thinking About the Right Questions
• What are my scale and latency needs?
• What are my transactional and consistency needs?
• What are my read/write, storage and IOPS needs?
• What are my time to market and server control needs?

Factors to Consider
Factors | SQL | NoSQL
Application | App with complex business logic? | Web app with lots of users?
Transactions | Complex transactions, joins, updates? | Simple data model, updates, queries?
Scale | Developer managed | Automatic, on-demand scaling
Performance | Developer architected | Consistent, high performance at scale
Availability | Architected for fail-over | Seamless and transparent
Core Skills | SQL + Java/Ruby/Python/PHP | NoSQL + Java/Ruby/Python/PHP
Best of both worlds: possible to use SQL and NoSQL models in one app

Performance Tips and Tricks
• Understand your workload
– Read:Write ratio, I/O requirements, CPU requirements
• Identify bottlenecks
– CPU, Memory, Disk I/O, Network latency/bandwidth
– Use CloudWatch and OS metrics
• Choose the right instance type
– High CPU, High Memory, High Storage, etc.
• Understand EBS!

Amazon EBS Magnetic
Amazon Elastic Block Store (EBS)
• IOPS: ~100 IOPS steady-state, with best-effort bursts to hundreds. Expect variability of 40-200 IOPS.
• Throughput: variable by workload, best effort to 10s of MB/s.
• Latency: varies; reads typically <20 ms, writes typically <10 ms.
• Capacity: as provisioned, up to 1 TB.

Amazon EBS General Purpose
• IOPS: 3 IOPS per GB consistent, with bursts to 3,000 IOPS. Bursting works on a bucket principle: the bucket fills while I/O is not used and empties as it is used.
• Throughput: variable by workload, best effort to 64 MB/s.
• Latency: Low and consistent.
• Capacity: As provisioned, up to 1 TB.
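
The bucket principle above lends itself to a quick back-of-the-envelope calculation. The sketch below uses the slide's 3 IOPS per GB baseline and 3,000 IOPS burst ceiling; the bucket size of 5,400,000 I/O credits is an assumption that does not appear in the slides, so substitute the figure from current EBS documentation before relying on the result.

```python
def baseline_iops(size_gb, iops_per_gb=3):
    """Steady-state IOPS for a General Purpose volume (3 IOPS per GB per the slide)."""
    return size_gb * iops_per_gb


def seconds_of_burst(size_gb, bucket_credits, burst_iops=3000, iops_per_gb=3):
    """How long a full bucket sustains the burst rate before it runs dry.

    While bursting, the bucket drains at burst_iops and refills at the
    baseline rate, so the net drain is the difference between the two.
    """
    base = baseline_iops(size_gb, iops_per_gb)
    if base >= burst_iops:
        return float("inf")  # baseline already meets the burst ceiling
    return bucket_credits / (burst_iops - base)


ASSUMED_BUCKET_CREDITS = 5_400_000  # assumption, not stated in the slides

print(baseline_iops(100))
print(seconds_of_burst(100, ASSUMED_BUCKET_CREDITS))
```

Under these assumptions a 100 GB volume has a 300 IOPS baseline and can burst at 3,000 IOPS for 2,000 seconds from a full bucket, which is why small volumes suit spiky rather than sustained workloads.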

Amazon EBS Provisioned IOPS
• IOPS*: within 10% of the provisioned rate, up to 4,000 IOPS, 99.9% of a given year.
• Throughput*: 16 KB per I/O = up to 64 MB/s, as provisioned.
• Latency: low and consistent, at the recommended queue depth (QD).
• Capacity: as provisioned, up to 1 TB.
* Depends on sufficient Amazon EC2 bandwidth to Amazon EBS.

Why the *?
[Diagram: Amazon EC2 sending an I/O to Amazon EBS]
Just because Amazon EC2 sends more work doesn’t mean there’s enough bandwidth to handle it!

Why the *?
[Diagram: Amazon EC2 sending an I/O]
Without more bandwidth, more EBS volumes or higher PIOPS won’t help!

EBS-Optimized
[Diagram: Amazon EC2 sending a “boatload” of I/O to EBS w/ PIOPS]
Oh, YEAH!!

Architecting for Performance
• IOPS consistency requires EBS-optimized instances
• Maximum throughput delivered by Amazon EBS is limited by Amazon EC2 bandwidth
• EBS throughput = EBS IOPS × Block size
– Ex: 64 MB/s = 4000 IOPS × 16 KB
• At the same throughput, maximum IOPS rises as block size falls (relative to 16 KB blocks):
– Max at 8 KB = 2x
– Max at 4 KB = 4x*
– Max at 2 KB = 8x*
*Maximum IOPS is also limited to ~100,000 per 32 vCPU, irrespective of block size/throughput.
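
The throughput formula can be checked with a small calculation. This sketch assumes decimal units (1 MB = 1,000 KB), since that is what makes the slide's 4000 IOPS × 16 KB come out to exactly 64 MB/s; the function names are illustrative only.

```python
def throughput_mb_per_s(iops, block_kb):
    """EBS throughput = IOPS x block size, using 1 MB = 1,000 KB as the slide does."""
    return iops * block_kb / 1000


def iops_for_throughput(throughput_mb_s, block_kb):
    """IOPS needed to fill a given throughput at a given block size."""
    return throughput_mb_s * 1000 / block_kb


print(throughput_mb_per_s(4000, 16))
for block_kb in (16, 8, 4, 2):
    print(block_kb, iops_for_throughput(64, block_kb))
```

Halving the block size doubles the IOPS needed to fill the same 64 MB/s, which is where the 2x, 4x, and 8x multipliers come from. The per-instance IOPS ceiling in the footnote still applies regardless of this arithmetic.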

Additional Hints
• Mount partitions with “noatime” and “nodiratime”
– Removes a write every time a read is done
• Turn off file system read ahead if possible
– Especially for OLTP systems
• Use vendor storage solutions
– Oracle ASM
• Optimize kernel settings

Scaling from
1 to 10,000,000 Users

Hi, I have NO IDEA what I am doing!!

So let’s start from day one, user one (you)

Day One, User One:
• We could potentially get to a few hundred to a few thousand users, depending on application complexity and traffic
• No failover
• No redundancy
• Too many eggs in one basket
[Diagram: User, Amazon Route 53, Elastic IP, EC2 Instance]

“We’re gonna need a bigger box”
• Simplest approach
• Can now leverage PIOPS
• High I/O instances
• High memory instances
• High CPU instances
• High storage instances
• Easy to change instance sizes
• Will eventually hit a limit
[Diagram: instance sizes t2.small, m3.2xlarge, r3.8xlarge]

Day Two, User >1
First let’s separate out our single host into more than one.
• Web
• Database
– Make use of a database service?
[Diagram: User, Amazon Route 53, Elastic IP, Web Instance, Database Instance]

Start with the right
databases for the job

User >100
First let’s separate out our single host into more than one
• Web
• Database
– Use RDS to make your life easier
[Diagram: User, Amazon Route 53, Elastic IP, Web Instance, RDS DB Instance]

User > 1000
Next let’s address our lack of failover and redundancy issues
• Elastic Load Balancing
• Another web instance
– In another Availability Zone
• Enable Amazon RDS multi-AZ
[Diagram: User, Amazon Route 53, and Elastic Load Balancing in front of two Availability Zones; one holds a Web Instance and the RDS DB Instance Active (Multi-AZ), the other a Web Instance and the RDS DB Instance Standby (Multi-AZ)]

User > 10,000s to 100,000s
[Diagram: User, Amazon Route 53, and Elastic Load Balancing in front of eight Web Instances spread across two Availability Zones, backed by an RDS DB Instance Active (Multi-AZ), an RDS DB Instance Standby (Multi-AZ), and four RDS DB Instance Read Replicas]
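
Read replicas only take load off the primary if the application sends reads to them. The sketch below shows one simple way to split traffic, assuming round-robin selection and a naive "starts with SELECT" test; the endpoint hostnames are hypothetical. Replication to read replicas is asynchronous, so reads that must see the latest write should still go to the primary.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas (round-robin)."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # cycle() yields the replicas in order, forever; None means no replicas
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        # Naive classification for illustration: only plain SELECTs count as reads
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self._primary


# Hypothetical endpoint names, for illustration only.
router = ReadWriteRouter(
    "primary.db.example.internal",
    ["replica-1.db.example.internal", "replica-2.db.example.internal"],
)
print(router.endpoint_for("SELECT * FROM users WHERE id = 1"))
print(router.endpoint_for("UPDATE users SET name = 'x' WHERE id = 1"))
```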

This will take us pretty far
honestly, but we care about
performance and
efficiency, so let’s clean
this up a bit

Shift Some Load Around
Let’s lighten the load on our web and database instances
• Move static content from the web instance to Amazon S3 and CloudFront
• Move dynamic content from Elastic Load Balancing to CloudFront
• Move session/state and DB caching to ElastiCache or DynamoDB
[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, Web Instance and RDS DB Instance Active (Multi-AZ) in one Availability Zone, ElastiCache, Amazon DynamoDB]

User >500k+
[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, and DynamoDB, with two Availability Zones; each zone holds three Web Instances, ElastiCache, and an RDS DB Instance Read Replica; one zone holds the RDS DB Instance Active (Multi-AZ) and the other the RDS DB Instance Standby (Multi-AZ)]

From 500K to 1 Million Users
• Getting serious now
• Significant user base
• Plenty of attention if things go wrong
• Interesting phase for startups with funding rounds

Time to make some
radical improvements at
the web & app layers

SOAing
Move services into their own tiers
or modules. Treat each of these
as 100% separate pieces of your
infrastructure and scale them
independently. Use queues!
Amazon.com and AWS do this
extensively! It offers flexibility and
greater understanding of each
component.
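
The "Use queues!" advice can be illustrated with a producer handing jobs to independent workers. In this sketch Python's in-process `queue.Queue` stands in for a managed queue such as Amazon SQS, and `run_jobs` and the job names are hypothetical; the point is only that the producer and the workers share nothing but the queue, so each side can be scaled separately.

```python
import queue
import threading


def run_jobs(job_names, handler, worker_count=2):
    """Push jobs onto a queue and let a pool of workers drain it."""
    jobs = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = jobs.get()
            try:
                if job is None:      # sentinel: no more work for this worker
                    return
                outcome = handler(job)
                with lock:
                    results.append(outcome)
            finally:
                jobs.task_done()     # runs for real jobs and sentinels alike

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for name in job_names:
        jobs.put(name)
    for _ in threads:
        jobs.put(None)               # one sentinel per worker
    jobs.join()
    for thread in threads:
        thread.join()
    return sorted(results)


print(run_jobs(["resize-image", "send-email", "update-feed"], str.upper))
```

Raising `worker_count` adds capacity on the consuming side without touching the code that enqueues work, which is the independent scaling the slide recommends.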

Users > 1 Million
[Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, and an Elastic Load Balancer in front of four Web Instances in an Availability Zone; RDS DB Instance Active (Multi-AZ) with two RDS DB Instance Read Replicas; Amazon DynamoDB, ElastiCache, Amazon SQS feeding two Worker Instances, two Internal App Instances, Amazon SES, and Amazon CloudWatch]

From 5 to 10 Million Users
You may start to run into contention on the database write master.
How can you solve it?
• Federation (splitting into multiple DBs based on function)
• Sharding (splitting one data set up across multiple hosts)
• Moving some functionality to other types of databases
– NoSQL for hot tables, lookup tables, leaderboards/scoring, metadata
– Data warehouse for analytics: user behavior, performance monitoring, A/B testing results, KPIs/dashboards.
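
Federation and sharding differ in what they split on, which a short sketch makes clearer. The database names, table names, and the choice of a SHA-256 hash with modulo placement are assumptions for illustration; the slides do not prescribe a scheme.

```python
import hashlib

# Federation: each function of the application gets its own database.
# Names below are hypothetical.
FEDERATED_DATABASES = {
    "users": "users-db",
    "orders": "orders-db",
    "forums": "forums-db",
}


def database_for_table(table):
    """Federation: route by function (table group)."""
    return FEDERATED_DATABASES[table]


def shard_for_key(key, shard_count):
    """Sharding: route rows of one data set by a stable hash of the key.

    hashlib is used instead of the built-in hash() because the latter is
    randomized per process for strings and would not be stable across hosts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


print(database_for_table("orders"))
print(shard_for_key("user:42", 4) == shard_for_key("user:42", 4))
```

Simple modulo placement remaps most keys whenever the shard count changes, so resharding needs a migration plan; schemes such as consistent hashing exist to reduce that movement.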

Saving $$$
• Use managed database services
– Focus your limited resources on the application
– ElastiCache can reduce your database costs
• Understand how to scale from the start
– Save redesign work and unhappy customers
– Start and stop instances as required
• Use the AWS platform
– Don’t reinvent the wheel, concentrate on your core competency
– Using CloudFront will reduce your costs on EC2 dramatically
• Purchase RIs and use spot instances
• Constantly monitor and right-size your environment

Sorry, How do I Scale my
Database?

Summary
• Decide on self-managed or managed database services
• Choose the right database for your use case and skillsets to start with
• Use Multi-AZ for your infrastructure
• Choose the right instance family and size for your workloads
• Understand the 3 types of EBS (Magnetic, General Purpose and PIOPS)
• Make use of self-scaling services (Elastic Load Balancing, Amazon S3, Amazon SNS, Amazon SQS, Amazon SES, etc.)
• Build in redundancy at every level
• Blend SQL & NoSQL wisely
• Use a data warehouse to offload large analytical queries from your main database
• Cache data both inside and outside your infrastructure
• Purchase RIs and use Spot instances
• Split tiers into individual services (SOA)
• Use autoscaling once you are ready for it
• Use automation tools in your infrastructure
• Make sure you have good metrics, monitoring, and logging tools in place
• Don’t reinvent the wheel
