SlideShare ist ein Scribd-Unternehmen logo
1 von 67
Downloaden Sie, um offline zu lesen
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
July 7th, 2016
Amazon Aurora Deep Dive
Ronan Guilfoyle, AWS Solution Architect
Mario Kostelac, Product Engineer, Intercom
MySQL-compatible relational database
Performance and availability of
commercial databases
Simplicity and cost-effectiveness of
open-source databases
What is Amazon Aurora?
Expedia: Online travel marketplace
 Real-time business intelligence and analytics on a
growing corpus of online travel marketplace data.
 Current Microsoft SQL Server–based architecture
is too expensive. Performance degrades as data
volume grows.
 Cassandra with Solr index requires large memory
footprint and hundreds of nodes, adding cost.
Aurora benefits:
 Aurora meets scale and performance
requirements with much lower cost.
 25,000 inserts/sec with peak up to 70,000. 30
millisecond average response time for write and
17 millisecond for read, with 1 month of data.
World’s leading online travel
company, with a portfolio that
includes 150+ travel sites in 70
countries.
For more detail see: re:Invent 2015 | (DAT312) Using Amazon Aurora for Enterprise Workloads
Alfresco: Enterprise content management
 Scaling Alfresco document repositories to billions
of documents.
 Support user applications that require sub-
second response times.
Aurora benefits:
 Scaled to 1 billion documents with a throughput
of 3 million per hour, which is 10 times faster than
their current environment.
 Moving from large data centers to cost-effective
management with AWS and Aurora.
Leading the convergence of enterprise
content management and business process
management. More than 1,800 organizations
in 195 countries rely on Alfresco, including
leaders in financial services, healthcare, and
the public sector.
For more see: re:Invent 2015 | (DAT309) Scaling Massive Content Stores with Amazon Aurora
Relational databases were not designed for cloud
Multiple layers of
functionality all in a
monolithic stack
SQL
Transactions
Caching
Logging
Not much has changed in the last 30 years
Even when you scale it out, you’re still replicating the same stack
SQL
Transactions
Caching
Logging
SQL
Transactions
Caching
Logging
Application
SQL
Transactions
Caching
Logging
SQL
Transactions
Caching
Logging
Application
SQL
Transactions
Caching
Logging
SQL
Transactions
Caching
Logging
Storage
Application
Reimagining the relational database
What if you were inventing the database today?
You wouldn’t design it the way we did in 1970
You’d build something
 that can scale out …
 that is self-healing …
 that leverages existing AWS services …
A service-oriented architecture applied to databases
Moved the logging and storage layer into a
multitenant, scale-out database-optimized
storage service.
Integrated with other AWS services like
Amazon EC2, Amazon VPC, Amazon
DynamoDB, Amazon SWF, and Amazon
Route 53 for control plane operations.
Integrated with Amazon S3 for continuous
backup with 99.999999999% durability.
Control PlaneData Plane
Amazon
DynamoDB
Amazon SWF
Amazon Route 53
Logging + Storage
SQL
Transactions
Caching
Amazon S3
1
2
3
SQL benchmark results
4 client machines with 1,000 connections each
WRITE PERFORMANCE READ PERFORMANCE
Single client machine with 1,600 connections
Using MySQL SysBench with Amazon Aurora R3.8XL with 32 cores and 244 GB RAM
https ://d0.a wsstat ic . com /product -m ark eting/Aurora /R DS_ Auro ra_Perf orm ance_Assessm ent_Benchm ark ing_v 1-2 .pdf
Beyond benchmarks
If only real-world applications saw benchmark performance.
POSSIBLE DISTORTIONS
Real-world requests contend with each other.
Real-world metadata rarely fits in the data dictionary cache.
Real-world data rarely fits in the buffer cache.
Real-world production databases need to run at high availability.
Scaling user connections
SysBench OLTP workload
250 tables
Connections Amazon Aurora
Amazon RDS MySQL
30 K IOPS (single AZ)
50 40,000 10,000
500 71,000 21,000
5,000 110,000 13,000
8x
UP TO
FA STER
Scaling table count
SysBench write-only workload
1,000 connections, default settings
Tables
Amazon
Aurora
MySQL
I2.8XL
local SSD
MySQL
I2.8XL
RAM disk
RDS
MySQL
30 K IOPS
(single AZ)
10 60,000 18,000 22,000 25,000
100 66,000 19,000 24,000 23,000
1,000 64,000 7,000 18,000 8,000
10,000 54,000 4,000 8,000 5,000
11x
UP TO
FA STER
Number of write operations per second
Scaling dataset size
SYSBENCH WRITE-ONLY (Statements)
DB Size Amazon Aurora
RDS MySQL
30 K IOPS (single AZ)
1GB 107,000 8,400
10GB 107,000 2,400
100GB 101,000 1,500
1TB 26,000 1,200
67x
U P TO
FASTER
DB Size Amazon Aurora
RDS MySQL
30K IOPS (single AZ)
80GB 12,582 585
800GB 9,406 69
CLOUDHARMONY TPC-C (Transactions)
136x
U P TO
FASTER
Running with read replicas
SysBench write-only workload
250 tables
Updates per
second Amazon Aurora
RDS MySQL
30 K IOPS (single AZ)
1,000 2.62 ms 0 s
2,000 3.42 ms 1 s
5,000 3.94 ms 60 s
10,000 5.38 ms 300 s
500x
UP TO
LOW ER LA G
Do fewer I/Os
Minimize network packets
Cache prior results
Offload the database engine
DO LESS WORK
Process asynchronously
Reduce latency path
Use lock-free data structures
Batch operations together
BE MORE EFFICIENT
How do we achieve these results?
DATABASES ARE ALL ABOUT I/O
NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND
HIGH-THROUGHPUT PROCESSING DOES NOT ALLOW CONTEXT SWITCHES
I/O traffic in RDS MySQL
BINLOG DATA DOUBLE-WRITELOG FRM FILES
T Y P E O F W R IT E
MYSQL WITH STANDBY
EBS mirrorEBS mirror
AZ 1 AZ 2
Amazon S3
EBS
Amazon Elastic
Block Store (EBS)
Primary
Instance
Standby
Instance
1
2
3
4
5
Issue write to Amazon EBS—EBS issues to mirror, acknowledge
when both done.
Stages write to standby instance using storage level replication.
Issues write to EBS on standby instance.
I/O FLOW
Steps 1, 3, 4 are sequential and synchronous.
This amplifies both latency and jitter.
Many types of write operations for each user operation.
Have to write data blocks twice to avoid torn write operations.
OBSERVATIONS
780 K write transactions.
7,388 K I/Os per million transactions (excludes mirroring,
standby).
Average 7.4 I/Os per transaction.
PERFORMANCE
30 minute SysBench write-only workload, 100 GB dataset, RDS Single AZ, 30 K PIOPS
I/O traffic in Aurora (database)
AZ 1 AZ 3
Primary
Instance
Amazon S3
AZ 2
Replica
Instance
AMAZON AURORA
ASYNC
4/6 QUORUM
DISTRIBUTED
WRITES
BINLOG DATA DOUBLE-WRITELOG FRM FILES
T Y P E O F W R IT E S
30 minute SysBench writeonly workload, 100GB dataset
IO FLOW
Only write redo log records; all steps asynchronous.
No data block writes (checkpoint, cache replacement).
6x more log writes, but 9x less network traffic.
Tolerant of network and storage outlier latency.
OBSERVATIONS
27,378 K transactions 35x MORE
950K I/Os per 1M transactions (6x amplification) 7.7x LESS
PERFORMANCE
Gather redo log records—fully ordered by LSN.
Shuffle to appropriate segments—partially ordered.
Batch witre to storage nodes and issue write operations.
I/O traffic in Aurora (storage node)
LOG RECORDS
Primary
Instance
INCOMING QUEUE
STORAGE NODE
S3 BACKUP
1
2
3
4
5
6
7
8
UPDATE
QUEUE
ACK
HOT
LOG
DATA
BLOCKS
POINT IN TIME
SNAPSHOT
GC
SCRUB
COALESCE
SORT
GROUP
PEER TO PEER GOSSIPPeer
Storage
Nodes
All steps are asynchronous.
Only steps 1 and 2 are in the foreground latency path.
Input queue is 46x less than MySQL (unamplified, per node).
Favors latency-sensitive operations.
Use disk space to buffer against spikes in activity.
OBSERVATIONS
I/O FLOW
① Receive record and add to in-memory queue.
② Persist record and acknowledge.
③ Organize records and identify gaps in log.
④ Gossip with peers to fill in holes.
⑤ Coalesce log records into new data block versions.
⑥ Periodically stage log and new block versions to S3.
⑦ Periodically garbage-collect old versions.
⑧ Periodically validate CRC codes on blocks.
Asynchronous group commits
Read
Write
Commit
Read
Read
T1
Commit (T1)
Commit (T2)
Commit (T3)
LSN 10
LSN 12
LSN 22
LSN 50
LSN 30
LSN 34
LSN 41
LSN 47
LSN 20
LSN 49
Commit (T4)
Commit (T5)
Commit (T6)
Commit (T7)
Commit (T8)
LSN GROWTH
Durable LSN at head node
COMMIT QUEUE
Pending commits in LSN order
TIME
GROUP
COMMIT
TRANSACTIONS
Read
Write
Commit
Read
Read
T1
Read
Write
Commit
Read
Read
Tn
TRADITIONAL APPROACH AMAZON AURORA
Maintain a buffer of log records to write out to disk.
Issue write operations when buffer is full, or time out waiting for
write operations.
First writer has latency penalty when write rate is low.
Request I/O with first write, fill buffer till write picked up.
Individual write durable when 4 of 6 storage nodes acknowledge.
Advance DB durable point up to earliest pending
acknowledgement.
Re-entrant connections multiplexed to active threads.
Kernel-space epoll() inserts into latch-free event queue.
Dynamically size threads pool.
Gracefully handles 5000+ concurrent client sessions on r3.8xl.
Standard MySQL—one thread per connection.
Doesn’t scale with connection count.
MySQL EE—connections assigned to thread group.
Requires careful stall threshold tuning.
CLIENTCONNECTION
CLIENTCONNECTION
LATCH FREE
TASK QUEUE
epoll()
MYSQL THREAD MODEL AURORA THREAD MODEL
Adaptive thread pool
I/O Traffic in Aurora (read replica)
PAGE CACHE
UPDATE
Aurora Master
30% Read
70% Write
Aurora Replica
100% New Reads
Shared Multi-AZ Storage
MySQL Master
30% Read
70% Write
MySQL Replica
30% New Reads
70% Write
SINGLE-THREADED
BINLOG APPLY
Data Volume Data Volume
Logical: Ship SQL statements to replica.
Write workload similar on both instances.
Independent storage.
Can result in data drift between master and
replica.
Physical: Ship redo from master to replica.
Replica shares storage. No writes performed.
Cached pages have redo applied.
Advance read view when all commits seen.
MYSQL READ SCALING AMAZON AURORA READ SCALING
Availability
“Performance only matters if your database is up”
Storage node availability
Quorum system for read/write; latency tolerant.
Peer-to-peer gossip replication to fill in holes.
Continuous backup to S3 (designed for 11 9s
durability).
Continuous scrubbing of data blocks.
Continuous monitoring of nodes and disks for repair.
10 GB segments as unit of repair or hotspot
rebalance to quickly rebalance load.
Quorum membership changes do not stall write
operations.
AZ 1 AZ 2 AZ 3
Amazon
S3
Traditional databases
Have to replay logs since the last
checkpoint.
Typically 5 minutes between checkpoints.
Single-threaded in MySQL; requires a
large number of disk accesses.
Amazon Aurora
Underlying storage replays redo records
on demand as part of a disk read.
Parallel, distributed, asynchronous.
No replay for startup.
Checkpointed Data Redo Log
Crash at T0 requires
a re-application of the
SQL in the redo log since
last checkpoint.
T0 T0
Crash at T0 will result in redo logs being
applied to each segment on demand, in
parallel, asynchronously.
Instant crash recovery
Survivable caches
We moved the cache out of the
database process.
Cache remains warm in the event
of a database restart.
Lets you resume fully loaded
operations much faster.
Instant crash recovery +
survivable cache = quick and
easy recovery from DB failures.
SQL
Transactions
Caching
SQL
Transactions
Caching
SQL
Transactions
Caching
Caching process is outside the DB process
and remains warm across a database restart.
Faster, more predictable failover
App
RunningFailure Detection DNS Propagation
Recovery Recovery
DB
Failure
MYSQL
App
Running
Failure Detection DNS Propagation
Recovery
DB
Failure
AURORA WITH MARIADB DRIVER
1 5 – 2 0 s e c .
3 – 2 0 s e c .
High availability with Read Replicas
Amazon S3
AZ 1 AZ 2 AZ 3
Aurora Primary
instance
Cluster volume spans 3 AZs
Aurora Replica Aurora Replica
db.r3.8xlarge db.r3.2xlarge
Priority: tier-1
db.r3.8xlarge
Priority: tier-0
High availability with Read Replicas
Amazon S3
AZ 1 AZ 2 AZ 3
Aurora Primary
instance
Cluster volume spans 3 AZs
Aurora Replica
Aurora Primary
Instance
db.r3.8xlarge db.r3.2xlarge
Priority: tier-1
db.r3.8xlarge
ALTER SYSTEM CRASH [{INSTANCE | DISPATCHER | NODE}]
ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN
[DISK index | NODE index] FOR INTERVAL interval
ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type
[TO {ALL | read_replica | availability_zone}] FOR INTERVAL interval
Simulate failures using SQL
To cause the failure of a component at the database node:
To simulate the failure of disks:
To simulate the failure of networking:
Thank you!
http://aws.amazon.com/rds/aurora
Please remember to rate this
session under My Agenda on
awssummit.london
Mario Kostelac
Product engineer
@mariokostelac
Intercom and Aurora
!=
😭
main DB
conversations DB
SELECT MAX(id) FROM
conversations;
>2B
What does 2B rows in MySQL table mean?
- your table can’t fit in RAM (not cool)
- unpredictable performance (not cool)
- you have a fixed schema (cool?)
that’s SQL, I want fixed schema
NO!
that’s SQL, I want fixed schema
Options for migrations
• ALTER TABLE statement?
• Percona migrations?
⏰ I !
My Intermission
?
Testing time
• synthetic testing
• production-like testing
sysbench
Aurora > sysbench
Migration plan
Migration plan
MySQL prod MySQL rollbackAurora
Where are we now?
• aurora is more forgiving
• uptime is great
• predictable performance
• schema changes are slow, but safe operation
• fast failovers
• read replicas
We think about
product again!
we are product company!
Mario Kostelac
@mariokostelac
mario@intercom.io
all used icons are official AWS icons or downloaded from freepik.com

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon AuroraNEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora
 
AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)
AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)
AWS re:Invent 2016: Amazon Aurora Deep Dive (GPST402)
 
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
AWS re:Invent 2016: How to Scale and Operate Elasticsearch on AWS (DEV307)
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
Amazon RDS Deep Dive
Amazon RDS Deep DiveAmazon RDS Deep Dive
Amazon RDS Deep Dive
 
Deep Dive on Elastic Load Balancing
Deep Dive on Elastic Load BalancingDeep Dive on Elastic Load Balancing
Deep Dive on Elastic Load Balancing
 
Introduction to Amazon Aurora
Introduction to Amazon AuroraIntroduction to Amazon Aurora
Introduction to Amazon Aurora
 
Startup Showcase - Mojang
Startup Showcase - MojangStartup Showcase - Mojang
Startup Showcase - Mojang
 
Getting started with amazon aurora - Toronto
Getting started with amazon aurora - TorontoGetting started with amazon aurora - Toronto
Getting started with amazon aurora - Toronto
 
Amazon RDS & Amazon Aurora: Relational Databases on AWS - SRV206 - Atlanta AW...
Amazon RDS & Amazon Aurora: Relational Databases on AWS - SRV206 - Atlanta AW...Amazon RDS & Amazon Aurora: Relational Databases on AWS - SRV206 - Atlanta AW...
Amazon RDS & Amazon Aurora: Relational Databases on AWS - SRV206 - Atlanta AW...
 
Amazon Aurora Let's Talk About Performance
Amazon Aurora Let's Talk About PerformanceAmazon Aurora Let's Talk About Performance
Amazon Aurora Let's Talk About Performance
 
(DAT312) Using Amazon Aurora for Enterprise Workloads
(DAT312) Using Amazon Aurora for Enterprise Workloads(DAT312) Using Amazon Aurora for Enterprise Workloads
(DAT312) Using Amazon Aurora for Enterprise Workloads
 
What’s New in Amazon Aurora for MySQL and PostgreSQL
What’s New in Amazon Aurora for MySQL and PostgreSQLWhat’s New in Amazon Aurora for MySQL and PostgreSQL
What’s New in Amazon Aurora for MySQL and PostgreSQL
 
Getting Started with Amazon Aurora
Getting Started with Amazon AuroraGetting Started with Amazon Aurora
Getting Started with Amazon Aurora
 
What’s New in Amazon Aurora
What’s New in Amazon AuroraWhat’s New in Amazon Aurora
What’s New in Amazon Aurora
 
Database Migration – Simple, Cross-Engine and Cross-Platform Migration
Database Migration – Simple, Cross-Engine and Cross-Platform MigrationDatabase Migration – Simple, Cross-Engine and Cross-Platform Migration
Database Migration – Simple, Cross-Engine and Cross-Platform Migration
 
Building Your First Big Data Application on AWS
Building Your First Big Data Application on AWSBuilding Your First Big Data Application on AWS
Building Your First Big Data Application on AWS
 
Amazon Redshift Deep Dive
Amazon Redshift Deep Dive Amazon Redshift Deep Dive
Amazon Redshift Deep Dive
 
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech TalksDeep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
Deep Dive on Amazon EC2 Instances - January 2017 AWS Online Tech Talks
 
RDS Postgres and Aurora Postgres | AWS Public Sector Summit 2017
RDS Postgres and Aurora Postgres | AWS Public Sector Summit 2017RDS Postgres and Aurora Postgres | AWS Public Sector Summit 2017
RDS Postgres and Aurora Postgres | AWS Public Sector Summit 2017
 

Andere mochten auch

Andere mochten auch (17)

Deep Dive on Amazon Aurora - Covering New Feature Announcements
Deep Dive on Amazon Aurora - Covering New Feature AnnouncementsDeep Dive on Amazon Aurora - Covering New Feature Announcements
Deep Dive on Amazon Aurora - Covering New Feature Announcements
 
AWS re:Invent 2016: Amazon Aurora Best Practices: Getting the Best Out of You...
AWS re:Invent 2016: Amazon Aurora Best Practices: Getting the Best Out of You...AWS re:Invent 2016: Amazon Aurora Best Practices: Getting the Best Out of You...
AWS re:Invent 2016: Amazon Aurora Best Practices: Getting the Best Out of You...
 
Getting Started with Amazon Aurora
Getting Started with Amazon AuroraGetting Started with Amazon Aurora
Getting Started with Amazon Aurora
 
Amazon Aurora New Features - September 2016 Webinar Series
Amazon Aurora New Features - September 2016 Webinar SeriesAmazon Aurora New Features - September 2016 Webinar Series
Amazon Aurora New Features - September 2016 Webinar Series
 
Getting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpacesGetting Started with Amazon WorkSpaces
Getting Started with Amazon WorkSpaces
 
Creating Your Virtual Data Center: Amazon VPC Fundamentals and Connectivity O...
Creating Your Virtual Data Center: Amazon VPC Fundamentals and Connectivity O...Creating Your Virtual Data Center: Amazon VPC Fundamentals and Connectivity O...
Creating Your Virtual Data Center: Amazon VPC Fundamentals and Connectivity O...
 
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
AWS re:Invent 2016: Getting Started with Amazon Aurora (DAT203)
 
AWS June 2016 Webinar Series - Amazon Aurora Deep Dive - Optimizing Database ...
AWS June 2016 Webinar Series - Amazon Aurora Deep Dive - Optimizing Database ...AWS June 2016 Webinar Series - Amazon Aurora Deep Dive - Optimizing Database ...
AWS June 2016 Webinar Series - Amazon Aurora Deep Dive - Optimizing Database ...
 
Scaling MySQL in Amazon Web Services
Scaling MySQL in Amazon Web ServicesScaling MySQL in Amazon Web Services
Scaling MySQL in Amazon Web Services
 
Getting started with Amazon ElastiCache
Getting started with Amazon ElastiCacheGetting started with Amazon ElastiCache
Getting started with Amazon ElastiCache
 
Getting Started with Amazon Aurora
Getting Started with Amazon AuroraGetting Started with Amazon Aurora
Getting Started with Amazon Aurora
 
Amazon Aurora
Amazon AuroraAmazon Aurora
Amazon Aurora
 
AWS re:Invent 2016: IAM Best Practices to Live By (SAC317)
AWS re:Invent 2016: IAM Best Practices to Live By (SAC317)AWS re:Invent 2016: IAM Best Practices to Live By (SAC317)
AWS re:Invent 2016: IAM Best Practices to Live By (SAC317)
 
AWS re:Invent 2016: Deep Dive on Amazon Relational Database Service (DAT305)
AWS re:Invent 2016: Deep Dive on Amazon Relational Database Service (DAT305)AWS re:Invent 2016: Deep Dive on Amazon Relational Database Service (DAT305)
AWS re:Invent 2016: Deep Dive on Amazon Relational Database Service (DAT305)
 
Amazon RDS with Amazon Aurora | AWS Public Sector Summit 2016
Amazon RDS with Amazon Aurora | AWS Public Sector Summit 2016Amazon RDS with Amazon Aurora | AWS Public Sector Summit 2016
Amazon RDS with Amazon Aurora | AWS Public Sector Summit 2016
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
(ISM304) Oracle to Amazon RDS MySQL & Aurora: How Gallup Made the Move
(ISM304) Oracle to Amazon RDS MySQL & Aurora: How Gallup Made the Move(ISM304) Oracle to Amazon RDS MySQL & Aurora: How Gallup Made the Move
(ISM304) Oracle to Amazon RDS MySQL & Aurora: How Gallup Made the Move
 

Ähnlich wie Deep Dive on Amazon Aurora

Ähnlich wie Deep Dive on Amazon Aurora (20)

AWS November Webinar Series - Aurora Deep Dive
AWS November Webinar Series - Aurora Deep DiveAWS November Webinar Series - Aurora Deep Dive
AWS November Webinar Series - Aurora Deep Dive
 
What's New in Amazon Aurora
What's New in Amazon AuroraWhat's New in Amazon Aurora
What's New in Amazon Aurora
 
Getting Started with Amazon Aurora
Getting Started with Amazon AuroraGetting Started with Amazon Aurora
Getting Started with Amazon Aurora
 
Introducing Amazon Aurora
Introducing Amazon AuroraIntroducing Amazon Aurora
Introducing Amazon Aurora
 
SRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon AuroraSRV407 Deep Dive on Amazon Aurora
SRV407 Deep Dive on Amazon Aurora
 
getting started with amazon aurora
getting started with amazon auroragetting started with amazon aurora
getting started with amazon aurora
 
What's New in Amazon Aurora
What's New in Amazon AuroraWhat's New in Amazon Aurora
What's New in Amazon Aurora
 
Getting Started with Amazon Aurora
 Getting Started with Amazon Aurora Getting Started with Amazon Aurora
Getting Started with Amazon Aurora
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
Aurora Deep Dive | AWS Floor28
Aurora Deep Dive | AWS Floor28Aurora Deep Dive | AWS Floor28
Aurora Deep Dive | AWS Floor28
 
Amazon Aurora TechConnect
Amazon Aurora TechConnect Amazon Aurora TechConnect
Amazon Aurora TechConnect
 
(DAT207) Amazon Aurora: The New Amazon Relational Database Engine
(DAT207) Amazon Aurora: The New Amazon Relational Database Engine(DAT207) Amazon Aurora: The New Amazon Relational Database Engine
(DAT207) Amazon Aurora: The New Amazon Relational Database Engine
 
DAT202_Getting started with Amazon Aurora
DAT202_Getting started with Amazon AuroraDAT202_Getting started with Amazon Aurora
DAT202_Getting started with Amazon Aurora
 
What’s New in Amazon Aurora
What’s New in Amazon AuroraWhat’s New in Amazon Aurora
What’s New in Amazon Aurora
 
Getting Started with Amazon Aurora
Getting Started with Amazon AuroraGetting Started with Amazon Aurora
Getting Started with Amazon Aurora
 
What’s New in Amazon Aurora
What’s New in Amazon AuroraWhat’s New in Amazon Aurora
What’s New in Amazon Aurora
 
Amazon (AWS) Aurora
Amazon (AWS) AuroraAmazon (AWS) Aurora
Amazon (AWS) Aurora
 
AWS August Webinar Series - Introducing Amazon Aurora
AWS August Webinar Series - Introducing Amazon AuroraAWS August Webinar Series - Introducing Amazon Aurora
AWS August Webinar Series - Introducing Amazon Aurora
 
AWS re:Invent re:Cap - 새로운 관계형 데이터베이스 엔진: Amazon Aurora - 양승도
AWS re:Invent re:Cap - 새로운 관계형 데이터베이스 엔진: Amazon Aurora - 양승도AWS re:Invent re:Cap - 새로운 관계형 데이터베이스 엔진: Amazon Aurora - 양승도
AWS re:Invent re:Cap - 새로운 관계형 데이터베이스 엔진: Amazon Aurora - 양승도
 
AWS January 2016 Webinar Series - Amazon Aurora for Enterprise Database Appli...
AWS January 2016 Webinar Series - Amazon Aurora for Enterprise Database Appli...AWS January 2016 Webinar Series - Amazon Aurora for Enterprise Database Appli...
AWS January 2016 Webinar Series - Amazon Aurora for Enterprise Database Appli...
 

Mehr von Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Deep Dive on Amazon Aurora

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. July 7th, 2016 Amazon Aurora Deep Dive Ronan Guilfoyle, AWS Solution Architect Mario Kostelac, Product Engineer, Intercom
  • 2. MySQL-compatible relational database Performance and availability of commercial databases Simplicity and cost-effectiveness of open-source databases What is Amazon Aurora?
  • 3. Expedia: Online travel marketplace  Real-time business intelligence and analytics on a growing corpus of online travel marketplace data.  Current Microsoft SQL Server–based architecture is too expensive. Performance degrades as data volume grows.  Cassandra with Solr index requires large memory footprint and hundreds of nodes, adding cost. Aurora benefits:  Aurora meets scale and performance requirements with much lower cost.  25,000 inserts/sec with peak up to 70,000. 30 millisecond average response time for write and 17 millisecond for read, with 1 month of data. World’s leading online travel company, with a portfolio that includes 150+ travel sites in 70 countries. For more detail see: re:Invent 2015 | (DAT312) Using Amazon Aurora for Enterprise Workloads
  • 4. Alfresco: Enterprise content management  Scaling Alfresco document repositories to billions of documents.  Support user applications that require sub- second response times. Aurora benefits:  Scaled to 1 billion documents with a throughput of 3 million per hour, which is 10 times faster than their current environment.  Moving from large data centers to cost-effective management with AWS and Aurora. Leading the convergence of enterprise content management and business process management. More than 1,800 organizations in 195 countries rely on Alfresco, including leaders in financial services, healthcare, and the public sector. For more see: re:Invent 2015 | (DAT309) Scaling Massive Content Stores with Amazon Aurora
  • 5. Relational databases were not designed for cloud Multiple layers of functionality all in a monolithic stack SQL Transactions Caching Logging
  • 6. Not much has changed in the last 30 years Even when you scale it out, you’re still replicating the same stack SQL Transactions Caching Logging SQL Transactions Caching Logging Application SQL Transactions Caching Logging SQL Transactions Caching Logging Application SQL Transactions Caching Logging SQL Transactions Caching Logging Storage Application
  • 7. Reimagining the relational database What if you were inventing the database today? You wouldn’t design it the way we did in 1970 You’d build something  that can scale out …  that is self-healing …  that leverages existing AWS services …
  • 8. A service-oriented architecture applied to databases Moved the logging and storage layer into a multitenant, scale-out database-optimized storage service. Integrated with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control plane operations. Integrated with Amazon S3 for continuous backup with 99.999999999% durability. Control PlaneData Plane Amazon DynamoDB Amazon SWF Amazon Route 53 Logging + Storage SQL Transactions Caching Amazon S3 1 2 3
  • 9. SQL benchmark results 4 client machines with 1,000 connections each WRITE PERFORMANCE READ PERFORMANCE Single client machine with 1,600 connections Using MySQL SysBench with Amazon Aurora R3.8XL with 32 cores and 244 GB RAM https ://d0.a wsstat ic . com /product -m ark eting/Aurora /R DS_ Auro ra_Perf orm ance_Assessm ent_Benchm ark ing_v 1-2 .pdf
  • 10. Beyond benchmarks If only real-world applications saw benchmark performance. POSSIBLE DISTORTIONS Real-world requests contend with each other. Real-world metadata rarely fits in the data dictionary cache. Real-world data rarely fits in the buffer cache. Real-world production databases need to run at high availability.
  • 11. Scaling user connections SysBench OLTP workload 250 tables Connections Amazon Aurora Amazon RDS MySQL 30 K IOPS (single AZ) 50 40,000 10,000 500 71,000 21,000 5,000 110,000 13,000 8x UP TO FA STER
  • 12. Scaling table count SysBench write-only workload 1,000 connections, default settings Tables Amazon Aurora MySQL I2.8XL local SSD MySQL I2.8XL RAM disk RDS MySQL 30 K IOPS (single AZ) 10 60,000 18,000 22,000 25,000 100 66,000 19,000 24,000 23,000 1,000 64,000 7,000 18,000 8,000 10,000 54,000 4,000 8,000 5,000 11x UP TO FA STER Number of write operations per second
  • 13. Scaling dataset size SYSBENCH WRITE-ONLY (Statements) DB Size Amazon Aurora RDS MySQL 30 K IOPS (single AZ) 1GB 107,000 8,400 10GB 107,000 2,400 100GB 101,000 1,500 1TB 26,000 1,200 67x U P TO FASTER DB Size Amazon Aurora RDS MySQL 30K IOPS (single AZ) 80GB 12,582 585 800GB 9,406 69 CLOUDHARMONY TPC-C (Transactions) 136x U P TO FASTER
  • 14. Running with read replicas SysBench write-only workload 250 tables Updates per second Amazon Aurora RDS MySQL 30 K IOPS (single AZ) 1,000 2.62 ms 0 s 2,000 3.42 ms 1 s 5,000 3.94 ms 60 s 10,000 5.38 ms 300 s 500x UP TO LOW ER LA G
  • 15. Do fewer I/Os Minimize network packets Cache prior results Offload the database engine DO LESS WORK Process asynchronously Reduce latency path Use lock-free data structures Batch operations together BE MORE EFFICIENT How do we achieve these results? DATABASES ARE ALL ABOUT I/O NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND HIGH-THROUGHPUT PROCESSING DOES NOT ALLOW CONTEXT SWITCHES
  • 16. I/O traffic in RDS MySQL BINLOG DATA DOUBLE-WRITELOG FRM FILES T Y P E O F W R IT E MYSQL WITH STANDBY EBS mirrorEBS mirror AZ 1 AZ 2 Amazon S3 EBS Amazon Elastic Block Store (EBS) Primary Instance Standby Instance 1 2 3 4 5 Issue write to Amazon EBS—EBS issues to mirror, acknowledge when both done. Stages write to standby instance using storage level replication. Issues write to EBS on standby instance. I/O FLOW Steps 1, 3, 4 are sequential and synchronous. This amplifies both latency and jitter. Many types of write operations for each user operation. Have to write data blocks twice to avoid torn write operations. OBSERVATIONS 780 K write transactions. 7,388 K I/Os per million transactions (excludes mirroring, standby). Average 7.4 I/Os per transaction. PERFORMANCE 30 minute SysBench write-only workload, 100 GB dataset, RDS Single AZ, 30 K PIOPS
  • 17. I/O traffic in Aurora (database) AZ 1 AZ 3 Primary Instance Amazon S3 AZ 2 Replica Instance AMAZON AURORA ASYNC 4/6 QUORUM DISTRIBUTED WRITES BINLOG DATA DOUBLE-WRITELOG FRM FILES T Y P E O F W R IT E S 30 minute SysBench writeonly workload, 100GB dataset IO FLOW Only write redo log records; all steps asynchronous. No data block writes (checkpoint, cache replacement). 6x more log writes, but 9x less network traffic. Tolerant of network and storage outlier latency. OBSERVATIONS 27,378 K transactions 35x MORE 950K I/Os per 1M transactions (6x amplification) 7.7x LESS PERFORMANCE Gather redo log records—fully ordered by LSN. Shuffle to appropriate segments—partially ordered. Batch witre to storage nodes and issue write operations.
  • 18. I/O traffic in Aurora (storage node) LOG RECORDS Primary Instance INCOMING QUEUE STORAGE NODE S3 BACKUP 1 2 3 4 5 6 7 8 UPDATE QUEUE ACK HOT LOG DATA BLOCKS POINT IN TIME SNAPSHOT GC SCRUB COALESCE SORT GROUP PEER TO PEER GOSSIPPeer Storage Nodes All steps are asynchronous. Only steps 1 and 2 are in the foreground latency path. Input queue is 46x less than MySQL (unamplified, per node). Favors latency-sensitive operations. Use disk space to buffer against spikes in activity. OBSERVATIONS I/O FLOW ① Receive record and add to in-memory queue. ② Persist record and acknowledge. ③ Organize records and identify gaps in log. ④ Gossip with peers to fill in holes. ⑤ Coalesce log records into new data block versions. ⑥ Periodically stage log and new block versions to S3. ⑦ Periodically garbage-collect old versions. ⑧ Periodically validate CRC codes on blocks.
  • 19. Asynchronous group commits Read Write Commit Read Read T1 Commit (T1) Commit (T2) Commit (T3) LSN 10 LSN 12 LSN 22 LSN 50 LSN 30 LSN 34 LSN 41 LSN 47 LSN 20 LSN 49 Commit (T4) Commit (T5) Commit (T6) Commit (T7) Commit (T8) LSN GROWTH Durable LSN at head node COMMIT QUEUE Pending commits in LSN order TIME GROUP COMMIT TRANSACTIONS Read Write Commit Read Read T1 Read Write Commit Read Read Tn TRADITIONAL APPROACH AMAZON AURORA Maintain a buffer of log records to write out to disk. Issue write operations when buffer is full, or time out waiting for write operations. First writer has latency penalty when write rate is low. Request I/O with first write, fill buffer till write picked up. Individual write durable when 4 of 6 storage nodes acknowledge. Advance DB durable point up to earliest pending acknowledgement.
  • 20. Re-entrant connections multiplexed to active threads. Kernel-space epoll() inserts into latch-free event queue. Dynamically size threads pool. Gracefully handles 5000+ concurrent client sessions on r3.8xl. Standard MySQL—one thread per connection. Doesn’t scale with connection count. MySQL EE—connections assigned to thread group. Requires careful stall threshold tuning. CLIENTCONNECTION CLIENTCONNECTION LATCH FREE TASK QUEUE epoll() MYSQL THREAD MODEL AURORA THREAD MODEL Adaptive thread pool
  • 21. I/O Traffic in Aurora (read replica) PAGE CACHE UPDATE Aurora Master 30% Read 70% Write Aurora Replica 100% New Reads Shared Multi-AZ Storage MySQL Master 30% Read 70% Write MySQL Replica 30% New Reads 70% Write SINGLE-THREADED BINLOG APPLY Data Volume Data Volume Logical: Ship SQL statements to replica. Write workload similar on both instances. Independent storage. Can result in data drift between master and replica. Physical: Ship redo from master to replica. Replica shares storage. No writes performed. Cached pages have redo applied. Advance read view when all commits seen. MYSQL READ SCALING AMAZON AURORA READ SCALING
  • 22. Availability “Performance only matters if your database is up”
  • 23. Storage node availability Quorum system for read/write; latency tolerant. Peer-to-peer gossip replication to fill in holes. Continuous backup to S3 (designed for 11 9s durability). Continuous scrubbing of data blocks. Continuous monitoring of nodes and disks for repair. 10 GB segments as unit of repair or hotspot rebalance to quickly rebalance load. Quorum membership changes do not stall write operations. AZ 1 AZ 2 AZ 3 Amazon S3
  • 24. Traditional databases Have to replay logs since the last checkpoint. Typically 5 minutes between checkpoints. Single-threaded in MySQL; requires a large number of disk accesses. Amazon Aurora Underlying storage replays redo records on demand as part of a disk read. Parallel, distributed, asynchronous. No replay for startup. Checkpointed Data Redo Log Crash at T0 requires a re-application of the SQL in the redo log since last checkpoint. T0 T0 Crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously. Instant crash recovery
  • 25. Survivable caches We moved the cache out of the database process. Cache remains warm in the event of a database restart. Lets you resume fully loaded operations much faster. Instant crash recovery + survivable cache = quick and easy recovery from DB failures. SQL Transactions Caching SQL Transactions Caching SQL Transactions Caching Caching process is outside the DB process and remains warm across a database restart.
  • 26. Faster, more predictable failover App RunningFailure Detection DNS Propagation Recovery Recovery DB Failure MYSQL App Running Failure Detection DNS Propagation Recovery DB Failure AURORA WITH MARIADB DRIVER 1 5 – 2 0 s e c . 3 – 2 0 s e c .
  • 27. High availability with Read Replicas Amazon S3 AZ 1 AZ 2 AZ 3 Aurora Primary instance Cluster volume spans 3 AZs Aurora Replica Aurora Replica db.r3.8xlarge db.r3.2xlarge Priority: tier-1 db.r3.8xlarge Priority: tier-0
  • 28. High availability with Read Replicas Amazon S3 AZ 1 AZ 2 AZ 3 Aurora Primary instance Cluster volume spans 3 AZs Aurora Replica Aurora Primary Instance db.r3.8xlarge db.r3.2xlarge Priority: tier-1 db.r3.8xlarge
  • 29. ALTER SYSTEM CRASH [{INSTANCE | DISPATCHER | NODE}] ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN [DISK index | NODE index] FOR INTERVAL interval ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type [TO {ALL | read_replica | availability_zone}] FOR INTERVAL interval Simulate failures using SQL To cause the failure of a component at the database node: To simulate the failure of disks: To simulate the failure of networking:
  • 31. Please remember to rate this session under My Agenda on awssummit.london
  • 34.
  • 35.
  • 36. !=
  • 37.
  • 38.
  • 39.
  • 40.
  • 41. 😭
  • 44. >2B
  • 45. What does 2B rows in MySQL table mean? - your table can’t fit in RAM (not cool) - unpredictable performance (not cool) - you have a fixed schema (cool?)
  • 46. that’s SQL, I want fixed schema
  • 47. NO! that’s SQL, I want fixed schema
  • 48. Options for migrations • ALTER TABLE statement? • Percona migrations?
  • 49.
  • 52.
  • 53.
  • 54. Testing time • synthetic testing • production-like testing
  • 55.
  • 57.
  • 58.
  • 60.
  • 62. Migration plan MySQL prod MySQL rollbackAurora
  • 63.
  • 64. Where are we now? • aurora is more forgiving • uptime is great • predictable performance • schema changes are slow, but safe operation • fast failovers • read replicas
  • 65.
  • 66. We think about product again! we are product company!
  • 67. Mario Kostelac @mariokostelac mario@intercom.io all used icons are official AWS icons or downloaded from freepik.com