© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Deep Dive on Amazon Aurora with
MySQL Compatibility
DAT304
Kamal Gupta
Senior Development Manager, Amazon Aurora MySQL
Vision: Amazon Relational Database Service
Choice of open source and commercial databases

RDS Platform
OPEN SOURCE ENGINES | COMMERCIAL ENGINES | CLOUD NATIVE ENGINE
> Advanced monitoring
> Routine maintenance
> Push-button scaling
> Automatic failover
> Backup & recovery
> Cross-region replication
> Isolation & security
> Industry compliance
> Automated patching
Amazon Aurora…
Enterprise database at open source price

• Speed and availability of high-end commercial databases
• Simplicity and cost-effectiveness of open source databases
• Drop-in compatibility with MySQL and PostgreSQL
• Simple pay-as-you-go pricing

Delivered as a managed service
Aurora customer adoption
Aurora is used by ¾ of the top 100 AWS customers
Fastest growing service in AWS history
Traditional Database Architecture
Monolithic stack in a single box

Compute: SQL | Transactions | Caching | Logging
Local Storage
Traditional Database Architecture
Decoupled storage from compute

Compute: SQL | Transactions | Caching | Logging
Network
Storage
Traditional Distributed Database Stack

[Diagram: several applications, each talking to its own database node; every node runs the same monolithic stack (SQL, Transactions, Caching, Logging) over its own storage.]

• Same monolithic stack on every node
• Distributed consensus algorithms perform poorly (multiple phases, multiple rounds, sync points)
Aurora: Scale-out, Distributed Architecture

[Diagram: a master and replicas (each with SQL, Transactions, Caching) across AZ1, AZ2, and AZ3 on top of a shared storage volume.]

• Push the log applicator to storage
• 4/6 write quorum & local tracking

No more trade-offs!
• Write performance
• Read scale out
• AZ + 1 failure tolerance
• Instant database redo recovery
Log Applicator in Action
The log is the database!

[Diagram: a master and a replica above six storage nodes (1-6); transactions T1-T4 generate log records; replication to the replica is asynchronous; pages such as PAGE1 are materialized by on-demand log apply.]
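To make on-demand log apply concrete, here is a minimal Python sketch. It is an illustration of the idea only, not Aurora's implementation, and every name in it is invented for the example: a storage node accepts redo log records as its only write traffic and coalesces them into a page version when the page is actually read.

```python
# Illustrative sketch of "the log is the database": writes are log records,
# and pages are materialized from the log on demand (all names invented).
from dataclasses import dataclass, field

@dataclass
class LogRecord:
    lsn: int        # monotonically increasing log sequence number
    page_id: int
    change: str     # redo payload, simplified to a string

@dataclass
class PageVersion:
    lsn: int = 0
    contents: list = field(default_factory=list)

class StorageNode:
    def __init__(self):
        self.pages = {}      # page_id -> coalesced PageVersion
        self.pending = {}    # page_id -> log records not yet applied

    def receive_log(self, rec):
        # The only write path: append a log record. No full-page writes.
        self.pending.setdefault(rec.page_id, []).append(rec)

    def read_page(self, page_id):
        # On-demand log apply: coalesce pending records into the page only
        # when it is read (or lazily, during background garbage collection).
        page = self.pages.setdefault(page_id, PageVersion())
        for rec in sorted(self.pending.pop(page_id, []), key=lambda r: r.lsn):
            page.contents.append(rec.change)
            page.lsn = rec.lsn
        return page

node = StorageNode()
node.receive_log(LogRecord(lsn=1, page_id=1, change="T1"))
node.receive_log(LogRecord(lsn=2, page_id=1, change="T2"))
print(node.read_page(1))  # PAGE1 materialized with both changes applied
```

The point of the sketch: a read never requires a prior full-page write, because the page at any LSN can be rebuilt from the log records themselves.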
MySQL vs. Aurora I/O profile

MYSQL WITH REPLICA: the primary instance must write every TYPE OF WRITE (binlog, data, double-write, log, FRM files) to mirrored Amazon Elastic Block Store (EBS) volumes, replicate to a replica instance across AZ 1 and AZ 2 (steps 1-5), and back up to Amazon S3. Steps 1, 3, and 4 are synchronous, which leads to jitter.

MySQL I/O profile for a 30 min Sysbench run:
  0.78MM transactions
  7.4 I/Os per transaction

AMAZON AURORA: the primary instance issues only log records, asynchronously, as 4/6-quorum distributed writes to six storage nodes across AZ 1, AZ 2, and AZ 3; replica instances share the same volume; backup streams to Amazon S3.

Aurora I/O profile for a 30 min Sysbench run:
  27MM transactions (35X MORE)
  0.95 I/Os per transaction (7.7X LESS)
Write and read throughput
Aurora MySQL is 5x faster than MySQL

[Charts: write throughput (0-250,000) and read throughput (0-800,000) for MySQL 5.6, MySQL 5.7, MySQL 8.0, Aurora 5.6, and Aurora 5.7.]

Using Sysbench with 250 tables and 200,000 rows per table on R4.16XL
Bulk data load performance
Aurora MySQL loads data 2.5x faster than MySQL

[Chart: runtime in seconds (0-800) for data loading and index build, MySQL vs. Amazon Aurora.]

10 Sysbench tables, 10MM rows each, on R4.16XL
Read scale out

MYSQL READ SCALING: the MySQL master (30% read, 70% write) ships the binlog to a MySQL replica, which must repeat the same write workload (70% write) via single-threaded binlog apply, leaving room for only 30% new reads. Replication is logical, using complete changes, onto independent storage (a separate data volume per instance).

AMAZON AURORA READ SCALING: the Aurora master (30% read, 70% write) sends only page cache updates to the Aurora replica, which serves 100% new reads. Replication is physical, using delta changes, with NO writes on the replica, over shared multi-AZ storage.
Aurora MySQL logical vs. physical replica lag

"In MySQL, we saw replica lag spike to almost 12 minutes, which is almost absurd from an application's perspective. With Aurora, the maximum read replica lag across 4 replicas never exceeded 20 ms."

[Charts: binlog replica lag (sec.), Aurora logical replica lag (seconds), and Aurora physical replica lag (msec).]
Quorum and local tracking in action
No heavy-weight distributed commits

[Diagram: the master runs transactions T1-T4 and flushes their log records ([T1]-[T4]) in parallel to six storage nodes (1-6) in the shared storage volume; a local tracker records, per transaction, how many of the 6 copies have acknowledged (quorum is 4/6).]

Parallel flush, with the tracker advancing as acknowledgments arrive (commits are acknowledged in order, so T3 waits behind T2 even after reaching quorum):

  Durability: 0 0 0 0 | Waiting Tx: T1 T2 T3 T4 | Committed Tx: none
  Durability: 4 3 1 0 | Waiting Tx: T2 T3 T4 | Committed Tx: T1
  Durability: 6 3 5 2 | Waiting Tx: T2 T3 T4 | Committed Tx: T1
  Durability: 6 4 5 6 | Waiting Tx: none | Committed Tx: T1 T2 T3 T4
  Durability: 6 6 6 6 | Waiting Tx: none | Committed Tx: T1 T2 T3 T4
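The tracker's behavior can be sketched in a few lines of Python. This is a toy model under stated assumptions (a fixed 4/6 quorum, acknowledgments delivered per LSN; all names invented), not Aurora's code: log records go to all six storage nodes in parallel, and commits are acknowledged in LSN order once each record reaches quorum, preserving write-ahead-logging semantics without any distributed consensus round.

```python
# Toy model of 4/6 write quorum with local commit tracking (names invented).
QUORUM = 4

class CommitTracker:
    def __init__(self):
        self.acks = {}        # lsn -> storage-node acknowledgments received
        self.committed = []   # transactions acked to clients, in LSN order
        self.waiting = []     # (lsn, txn) pairs awaiting quorum, LSN-ordered

    def submit(self, lsn, txn):
        # The log record is already in flight to all six storage nodes.
        self.acks[lsn] = 0
        self.waiting.append((lsn, txn))

    def on_storage_ack(self, lsn):
        self.acks[lsn] += 1
        # Acknowledge commits strictly in LSN order: a later transaction
        # never becomes visible before an earlier one (WAL is preserved).
        while self.waiting and self.acks[self.waiting[0][0]] >= QUORUM:
            _, txn = self.waiting.pop(0)
            self.committed.append(txn)   # ack back to the client here

tracker = CommitTracker()
tracker.submit(lsn=1, txn="T1")
tracker.submit(lsn=2, txn="T2")
for _ in range(QUORUM):
    tracker.on_storage_ack(2)             # T2 reaches quorum first...
assert tracker.committed == []            # ...but must wait behind T1
for _ in range(QUORUM):
    tracker.on_storage_ack(1)
assert tracker.committed == ["T1", "T2"]  # acked in order
```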
Performance variability under load
Amazon Aurora >200x more consistent

[Chart: write response time in seconds (0-12) over time in seconds (0-600); Amazon Aurora stays flat and low while MySQL 5.6 on EBS spikes repeatedly.]

SysBench OLTP (write-only) workload with 250 tables and 200,000 rows per table on R4.16XL
What else have we done to drive throughput?

MYSQL THREAD MODEL: a thread per client connection.
AURORA THREAD MODEL: client connections are multiplexed via epoll() onto a latch-free task queue.
What else have we done to drive throughput? (continued)

MYSQL THREAD MODEL vs. AURORA THREAD MODEL: as above, client connections feed a latch-free task queue via epoll().

MYSQL LOCK MANAGER: Scan, Delete, and Insert operations contend on shared lock-manager state.
AURORA LOCK MANAGER: Scan, Delete, and Insert operations proceed concurrently against the lock manager.
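For a rough feel of the multiplexed model, here is a toy epoll() event loop in Python (Linux-only `select.epoll`). This is a generic illustration of connection multiplexing plus a task queue, not Aurora's engine code, and all names are invented:

```python
# Toy connection multiplexer: one poller thread watches many client sockets
# with epoll() and enqueues ready work; a small worker pool drains the queue.
import queue, select, socket, threading

tasks = queue.Queue()   # stand-in for the latch-free task queue

def poller(server):
    ep = select.epoll()
    ep.register(server.fileno(), select.EPOLLIN)
    conns = {}
    while True:
        for fd, _events in ep.poll(timeout=1):
            if fd == server.fileno():
                conn, _addr = server.accept()
                conns[conn.fileno()] = conn
                ep.register(conn.fileno(), select.EPOLLIN)
            elif (data := conns[fd].recv(4096)):
                tasks.put((conns[fd], data))   # hand the "query" to a worker

def worker():
    while True:
        conn, data = tasks.get()
        conn.sendall(b"ok\n")   # execute the request and reply

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen()
# Daemon threads, for sketch purposes; a real server would join/supervise.
threading.Thread(target=poller, args=(server,), daemon=True).start()
for _ in range(4):
    threading.Thread(target=worker, daemon=True).start()
```

The contrast with a thread-per-connection model: thousands of idle connections cost almost nothing here, because threads are tied to runnable work rather than to sockets.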
Performance improvement over time
Aurora MySQL, 2015-2018

[Charts: max write throughput (0-250) and max read throughput (0-800), 2015 through 2018.]

Launched with R3.8xl: 32 cores, 256GB memory
Now support R4.16xl: 64 cores, 512GB memory
R5.24xl coming soon: 96 cores, 768GB memory

Besides many performance optimizations, we are also upgrading the HW platform
What about performance parameters?
Pre-tuned or auto-tuned for different hardware configurations
Recovery time vs. write performance
No more trade-offs!

  Configuration              Recovery time (sec.)   Writes per second
  MySQL (16GB checkpoint)    376.0                   90,606
  MySQL (1GB checkpoint)      99.0                   26,129
  MySQL (128MB checkpoint)    40.0                    4,382
  Amazon Aurora                0.5                  207,398

SysBench OLTP (write-only) 10GiB workload with 250 tables and 200,000 rows
Distributed databases comparison

[Matrix comparing SQL (1 node), manual sharding, NoSQL, and Aurora Single Master on Scalability, Availability, Manageability, and Compatibility.]
Existing Multi-Master solutions

Distributed lock manager (masters M1-M3 over SHARED STORAGE):
heavyweight synchronization, pessimistic and negative scaling.
Ex: Oracle RAC, DB2 PureScale, Sybase

Global ordering with read-write set (each node runs the full SQL / Transactions / Caching / Logging stack; transactions T1, T2, T3, ... T100 funnel through a GLOBAL ORDERING UNIT):
the global entity becomes a scaling bottleneck.
Ex: Galera, TangoDB, FaunaDB

Paxos leader with 2PC (data ranges #1-#5, each with a leader "L"):
heavy-weight consensus protocol, hot partitions, and struggles with cross-partition queries.
Ex: Spanner, CockroachDB, Ignite
Aurora Multi-Master Architecture

[Diagram: an Orange Master, a Blue Master, and Replicas (each with SQL, Transactions, Caching) over a shared storage volume spanning AZ1, AZ2, and AZ3; writes from T1 and T2 land on storage nodes, and a conflicting write ("3?") is resolved at the storage layer.]

• No pessimistic locking
• No global ordering
• No global commit-coordination
• Optimistic conflict resolution
• Decoupled system
• Microservices architecture

Cluster services: membership, heartbeat, replication, metadata
AuroraMulti-Master –how (happypath)
Blue master Orange master
C1 C2
Non-conflicting writes originating on different masters on
different tables
Blue Master Orange MasterTime
BeginTrx (BT1)1 BeginTrx (OT1)
2 Update (table1)
3
Update (table2)
Page 1
Page 1
Page 1
Page 1
Page 1
Page 2
Page 2
Page 2
Page 2
Page 2
Page 2
Commit (BT1) Commit (OT1)
Page 1
OK OK
Aurora Multi-Master: how (physical conflict)

Conflicting writes originating on different masters on the same table:

Time 1: Blue: BeginTrx (BT1); Orange: BeginTrx (OT1)
Time 2: Blue: Update (row1, table1)
Time 3: Orange: Update (row1, table1)
Then:   Blue: Commit (BT1) -> OK; Orange: Rollback (OT1) -> RETRY

Both masters touch the same page; the storage nodes accept one version (Blue's here), and the losing transaction (OT1) is rolled back for the client to retry.
Aurora Multi-Master: how (logical conflict)

Conflicting writes originating on different masters on the same table, where one transaction has already committed:

Time 1: Blue: BeginTrx (BT1); Orange: BeginTrx (OT1)
Time 2: Blue: Update (row1, table1)
Time 3: Blue: Commit (BT1) -> OK
Time 4: Orange: Update (row1, table1) and Rollback (OT1) -> RETRY
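On the application side, the pattern these three timelines imply is a simple retry loop: with optimistic conflict resolution, a transaction that loses a conflict is rolled back and should be retried. A hedged Python sketch follows; the exception type, back-off policy, and transaction body are assumptions for illustration, not a prescribed Aurora API.

```python
# Generic optimistic-concurrency retry loop (illustrative assumptions only).
import random, time

class ConflictError(Exception):
    """Stand-in for the engine reporting a write-write conflict (RETRY)."""

def with_retries(run_transaction, max_attempts=5):
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()       # BEGIN ... COMMIT inside
        except ConflictError:
            if attempt == max_attempts:
                raise
            # Back off with jitter so two masters don't re-collide in step.
            time.sleep(random.uniform(0, 0.05 * attempt))

def sample_txn():
    # e.g., UPDATE table1 SET ... WHERE row = 1, then COMMIT
    if random.random() < 0.5:
        raise ConflictError()
    return "OK"

print(with_retries(sample_txn))
```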
Aurora Multi-Master: scaling and availability

[Chart: aggregated throughput (0-60,000) over time in minutes (0-21) for a Sysbench workload on 4 R4.XL nodes.]
Aurora Multi-Master Global Reads: what?

[Diagram: John updates his status from "Single" to "Engaged" on one node and sends a post ("Proposed to Sara") that lands on another; replication between nodes is asynchronous. When Bob reads John's posts and status, a local read can return the post alongside the stale "Single" status (Local Read: ☹), while a global read returns a consistent view (Global Read: 😊).]
Aurora Multi-Master Global Reads: how?

[Diagram: a client reads through node N1 at vector clock T = (T1, T2, T3); N1 performs its local read at T1 and waits for replication from N2 and N3 to catch up to T2 and T3 before returning globally consistent results over the shared distributed storage volume.]

• No waits on the write path
• Adds latency ONLY to globally consistent reads
• Configurable per session
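The waiting rule can be sketched directly from the diagram. In this illustrative Python model (the class and method names are invented), the serving node tracks the highest position it has applied from each peer and delays only the globally consistent read until those positions reach the requested vector clock; writers never wait.

```python
# Sketch of a global read at a vector clock T = (T1, T2, T3): block the read,
# never the writes, until replication from every peer has caught up.
import threading

class Node:
    def __init__(self, peers):
        # applied[p] = highest position replicated and applied from peer p
        self.applied = {p: 0 for p in peers}
        self.cv = threading.Condition()

    def on_replicated(self, peer, position):
        # Called by the (asynchronous) replication stream; writers never wait.
        with self.cv:
            self.applied[peer] = max(self.applied[peer], position)
            self.cv.notify_all()

    def global_read(self, vector_clock, do_read):
        # Latency is added ONLY here, on the globally consistent read path.
        with self.cv:
            self.cv.wait_for(lambda: all(
                self.applied[p] >= t for p, t in vector_clock.items()))
        return do_read()

n1 = Node(peers=["N2", "N3"])
threading.Timer(0.1, n1.on_replicated, args=("N2", 7)).start()
threading.Timer(0.2, n1.on_replicated, args=("N3", 9)).start()
print(n1.global_read({"N2": 7, "N3": 9}, lambda: "consistent result"))
```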
Aurora Multi-Master: Summary

Linear scaling
Microservices architecture
6 copies, 2 copies per AZ
Optimistic conflict resolution
Continuous availability
Enterprise-grade durability
SQL compatible: supports indexes, constraints, triggers, procedures, functions, etc.
Driving down query latency
Batched scans | Hash joins | Asynchronous key prefetch
QP Performance Improvement
Well-known decision support benchmark

[Chart: query response time reduction (0x-20x) for queries Q1-Q22.]

• Peak speedup ~18x
• >2x speedup: 10 of 22 queries
Driving down query latency: Parallel Query

• Parallel, distributed processing
• Push-down processing closer to the data
• Reduces buffer pool pollution

[Diagram: the DATABASE NODE pushes down predicates to the STORAGE NODES and aggregates the results.]
Parallel Query Architecture

[Diagram: query processor, network storage driver, MVCC, converter, and aggregator; the storage layer returns a clean stream and a dirty stream.]
Parallel Query: performance results
Well-known decision support benchmark

"We were able to test Aurora's parallel query feature and the performance gains were very good. To be specific, we were able to reduce the instance type from r3.8xlarge to r3.2xlarge. For this use case, parallel query was a great win for us."
Jyoti Shandil, Cloud Data Architect

[Chart: query response time reduction (0x-120x) for queries Q1-Q22.]

• Peak speedup ~120x
• >10x speedup: 8 of 22 queries
“AZ+1” failure tolerance
Why?
 In a large fleet, always some failures
 AZ failures have ”shared fate”
AZ 1 AZ 2 AZ 3
Quorum
break on
AZ failure
2/3 read
2/3 write
AZ 1 AZ 2 AZ 3
Quorum
survives
AZ failure
3/6 read
4/6 write
How?
 6 copies, 2 copies per AZ
 2/3 quorum will not work
Continuous backup
• Take periodic snapshots of each segment in parallel; stream the redo logs to Amazon S3
• Backup happens continuously without performance or availability impact
• At restore, retrieve the appropriate segment snapshots and log streams to storage nodes
• Apply log streams to segment snapshots in parallel and asynchronously

[Diagram: a timeline per segment (Segment 1-3) of segment snapshots and log records; restoring to a recovery point uses, for each segment, the most recent snapshot before that point plus the log records up to it.]
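The restore step in the last two bullets reduces to a per-segment selection rule, sketched below with assumed data shapes (snapshot and log-record times as plain numbers): take the most recent snapshot at or before the recovery point, then replay only the log records between that snapshot and the recovery point, independently for each segment.

```python
# Sketch of restore selection: newest snapshot before the recovery point,
# plus that segment's log records up to the recovery point (shapes assumed).
def restore_plan(segments, recovery_point):
    plan = {}
    for seg_id, seg in segments.items():
        snap = max((s for s in seg["snapshots"] if s <= recovery_point),
                   default=None)   # None -> replay from the beginning
        logs = [t for t in seg["log_records"]
                if (snap or 0) < t <= recovery_point]
        plan[seg_id] = {"snapshot": snap, "replay": logs}
    return plan   # each segment then restores in parallel, asynchronously

segments = {
    1: {"snapshots": [10, 20, 30], "log_records": [12, 18, 24, 29, 33]},
    2: {"snapshots": [15, 25],     "log_records": [16, 22, 27, 31]},
}
print(restore_plan(segments, recovery_point=28))
# segment 1 -> snapshot 20 + logs [24]; segment 2 -> snapshot 25 + logs [27]
```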
Database backtrack
Backtrack brings the database to a point in time without requiring restore from backups
• Backtrack from an unintentional DML or DDL operation
• Backtrack is not destructive. You can backtrack multiple times to find the right point in time

[Diagram: timeline t0-t4; rewinding to t1 makes the t1-t2 range invisible, writes continue as t3 and t4, and a second rewind to t3 makes t3-t4 invisible.]
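For reference, backtrack is driven through the RDS API. `backtrack_db_cluster` is the actual boto3 operation; the cluster identifier and rewind window below are placeholders, and the call applies only to Aurora MySQL clusters created with backtrack enabled.

```python
# Rewind an Aurora MySQL cluster ten minutes (cluster name is a placeholder).
import boto3
from datetime import datetime, timedelta, timezone

rds = boto3.client("rds")

resp = rds.backtrack_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",
    BacktrackTo=datetime.now(timezone.utc) - timedelta(minutes=10),
    # If the exact time is outside the backtrack window, snap to the earliest.
    UseEarliestTimeOnPointInTimeUnavailable=True,
)
print(resp["Status"])   # e.g., "pending" while the rewind is in progress
```

Because backtrack is not destructive, this call can be repeated with different timestamps until the right point in time is found.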
Instant crash redo recovery

Traditional database:
- Replay logs since the last checkpoint
- Slow replay in a single thread
- A crash at T0 requires a re-application of the SQL in the redo log since the last checkpoint (checkpointed data + redo log)

Amazon Aurora:
- No checkpointing
- No replay needed for startup
- A crash at T0 results in redo logs being applied to each segment on demand, in parallel, asynchronously
Read replica and fast failover

Up to 15 promotable read replicas across multiple availability zones
Replicas share the storage with the master: no loss of data
Configurable failover order

[Diagram: MASTER and READ REPLICAs over a SHARED DISTRIBUTED STORAGE VOLUME.]
Continuous availability with multi-master

[Diagram: an app connected to read-write Masters 1, 2, ... N across Availability Zones 1-3 within a Region, all over a shared distributed storage volume. When a master fails: its transactions are aborted and its connections terminated; other nodes operate as before, with access to the ENTIRE database; connections are redistributed; the failed master recovers independently; once recovery is complete, new connections are added.]

• Continuous availability through failures and planned maintenance
• Continuous monitoring and automatic recovery of failed master nodes
Global replication
Faster disaster recovery and enhanced data locality
Performance Insights

Dashboard showing database load
• Easy: e.g., drag and drop
• Powerful: drill down using zoom in

Identifies source of bottlenecks
• Sort by top SQL
• Slice by host, user, wait events

Adjustable time frame
• Hour, day, week, month
• Up to 2 years of data; 7 days free

[Screenshot: load dashboard with a Max vCPU line, a CPU bottleneck, and the SQL with high CPU highlighted.]
Simplified management

• Automatic storage scaling up to 64TB
• Automatic restriping, mirror repair, hot spot management, encryption
• Reader endpoint with load balancing
• Reader endpoint auto-scaling * NEW *
• Custom reader endpoints

[Diagram: MASTER and READ REPLICAs over a SHARED DISTRIBUTED STORAGE VOLUME, fronted by a READER END-POINT and a second, custom READER END-POINT #2.]
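Custom reader endpoints are created through the RDS API; `create_db_cluster_endpoint` is the real boto3 operation, while the identifiers and member list below are placeholders. The resulting endpoint load-balances across just the listed replicas:

```python
# Create a custom reader endpoint covering two specific replicas
# (all identifiers are placeholders).
import boto3

rds = boto3.client("rds")

rds.create_db_cluster_endpoint(
    DBClusterIdentifier="my-aurora-cluster",
    DBClusterEndpointIdentifier="analytics-readers",
    EndpointType="READER",
    StaticMembers=["my-aurora-replica-2", "my-aurora-replica-3"],
)
```

This is useful for steering a heavy workload (e.g., analytics) at a subset of replicas while the default reader endpoint balances everything else.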
Aurora Serverless...

Responds to your application load automatically
Scales capacity up and down in < 10 seconds
New instances have a warm buffer pool
Multi-tenant proxy is highly available
How does it work...

[Diagram: within a Region / Availability Zone 1, the app connects through a multi-tenant NLB / database proxy layer to an instance drawn from a warm pool of Aurora instances, over a shared distributed storage volume; a monitoring service drives scaling decisions.]
How does it work in practice?

[Chart: transactions per second (tps, left axis, 0-3000) and Aurora capacity units (ACU, right axis, 1-128) over a run of roughly 7,200 seconds; capacity scales up and down following the load.]
Introducing Web Service Data API

Access your database from Lambda applications
SQL statements packaged as HTTP requests
Connection pooling managed behind proxy

[Diagram: Web Service Data API in front of Aurora Serverless.]
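A call through the Data API looks like this with boto3, shown with today's `rds-data` interface (the service and the `execute_statement` operation are real; the ARNs, database, and SQL are placeholders). The caller holds no database connection: the statement travels as an HTTPS request and pooling happens behind the proxy.

```python
# Run a SQL statement over HTTPS via the Data API (ARNs are placeholders).
import boto3

client = boto3.client("rds-data")

resp = client.execute_statement(
    resourceArn="arn:aws:rds:us-east-1:123456789012:cluster:my-serverless",
    secretArn="arn:aws:secretsmanager:us-east-1:123456789012:secret:mydb",
    database="mydb",
    sql="SELECT id, name FROM users WHERE id = :id",
    parameters=[{"name": "id", "value": {"longValue": 42}}],
)
print(resp["records"])
```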
Related breakouts

Wednesday, November 28
DAT415 - Amazon Aurora Multi-Master: Scaling Out Database Write Performance
11:30 AM–12:30 PM | Venetian, Level 2, Veronese 2406

Thursday, November 29
DAT362 - Accelerate Your Analytic Queries with Amazon Aurora Parallel Query
4:00 PM–5:00 PM | Venetian, Level 2, Veronese 2406

Wednesday, November 28
DAT427 - Going Deep on Amazon Aurora Serverless
4:00 PM–5:00 PM | Aria East, Plaza Level, Orovada 3
Thank you!

Kamal Gupta
kamalg@amazon.com


Speaker notes

  1. (Transition cues: Overall, In summary, Net-net, So, Let's, Now, Alright.)
  2. (2 min) HELLO everyone It’s my PRIVILEGE to be with ALL of you, and a WARM welcome to DAY 1 at REINVENT 2018 You know, OVER 7 years ago, WHEN we started building AURORA, WE had a simple mission, we WANTED, ANY person, ANYWHERE in the world to be able to RUN and MANAGE databases, AND all they would need is their BUSINESS APPLICATION. They wouldn’t need to worry about PROVISIONING, need HIGHLY skilled operators MANAGING their databases, make TRADEOFFS between PERFORMANCE, AVAILABILITY, DURABILITY and COST. AND by doing so, we BELIEVE we can TRANSFORM EVERYONE in the world to RE-IMAGINE the DATABASES in the cloud. hi my name is Kamal Gupta and I am a senior engineering manager at AWS  TODAY, WE are going to SHOW you, in the NEXT hour, not only the NEW INNOVATIONS in Aurora, but also, how we DID it. AND how we are adding new capabilities like multimaster, parallel query, serverless, global databases into our Aurora offering that my team is building. Joining me today, I am EXCITED to have Sirish Chandrasekaran, principal product manager at AWS  Let’s deep dive into Aurora MySQL
  3. (1 min 15 sec) Our vision for Amazon Relational Database Service is to offer you choices and recommendations so that you can decide what's best for your application. On the one hand, AWS offers open source engines for customers who simply like their simplicity and cost-effectiveness, but the problem is that they lack the enterprise-grade performance and reliability our customers need for their mission-critical applications. We also offer old-guard commercial database engines for customers who need enterprise-grade performance and reliability, even though they are quite expensive, with lock-ins and punitive licensing terms. One of the earliest pieces of feedback we got from our customers was to build something that combines the best of both worlds. And we have created Aurora for you.
  4. (20 sec) With Aurora, you NO longer have to make the trade-offs. It provides you the commercial-grade performance, durability, and availability at the simplicity and cost-effectiveness of open-source solutions. And it's delivered to you as a managed service.
  5. (20 sec) Here are some of the customers who have been using Aurora: Airbnb, Zynga, Hulu, Ancestry, Nasdaq, some big names. As you can see, Aurora continues to be the fastest growing service in AWS history.
  6. (20 seconds) So, with that intro, I will first talk about Performance, and then Availability and Manageability.
  7. (20 sec) You know, when databases first came out, it looked something like this. Monolithic architecture in a single box. With local storage, we were trading availability and durability to get better performance
  8. (20 seconds) Over time, we decoupled storage from compute, which allowed us to scale, customize, and manage each layer independently, but the monolithic stack remained the same.
  9. (30 seconds) And then we added more such boxes. As you can see, it’s the same SQL stack everywhere. Nothing changed! Moreover, we need heavy-weight distributed consensus for data replication and they perform poorly because of multiple phases, multiple rounds, sync points etc.
  10. (2.5 minutes) With Aurora, we did two big contributions: We pushed down Log applicator down to the storage => that allowed us to construct pages from the logs themselves. This is really cool because we don’t have to write full pages anymore. So unlike traditional databases which write both logs and pages, we just have to write logs. This means we have significantly less network IO, fundamentally less work on the engine - you don't need checkpointing anymore, you don't need flushing of pages or cache eviction any more. Instead of heavy weight distributed consensus for data replication. We use 4/6 write Quorum & Local tracking. The reason we can avoid distributed consensus is because we exploit monotonically increasing Log Sequence Number (LSN) by the Master that allows us to order the writes. And so SN's just accept the writes. There is no voting involved. We are going to see both these things in action. As a result, YOU get significantly better write performance YOU get Read scale out because they share the same storage with the master. YOU get AZ+1 failure tolerance => Aurora stores 6 copies, two copies per AZ. Even in the presence of background radiation, an entire AZ goes down on top, Aurora can handle it. No problem. YOU get an Instant database redo recovery because we don't have to explicitly do anything at startup other than doing some math to find out the point at which we crashed. Overall: You NO longer have to make a trade-off between performance, availability, and durability.
  11. (2 minutes) Lets see how log applicator works in action. Here, we are running 4 transactions with Master and replica. We have the storage at the bottom with each log 6-ways replication. Lets say we commit a Trx T1. As you can see all SNs and replica received the changes. And so if we try to read, both master and replica will get the page with the orange Trx. Now - lets say we commit T2/T3/T4. Please note that SN already coalesced purple but left blue and green log records on the side because it can't. This is because replica clock is still at purple. And if both master and replica try to read, they will get the right image. For master, storage will apply the log records, blue and green, kept on the side, on the fly And at some point, replica clock will go green and we can garbage collect the remaining log records by coalescing the changes. Hopefully you can see, how we can construct pages from the logs themselves.
12. (1 minute) Let's see what benefits we got out of this. Here is how the I/O profile looks for Aurora and MySQL. On the left, we have MySQL on EBS; the thing to note is that it has to replicate all kinds of data. With Aurora, on the right, we only have to replicate log records. As a result, we do 7.7x less I/O and 35x more work, despite the 6x amplification from keeping six copies. The other thing to note is that steps 1, 3, and 4 on the left are synchronous and lead to jitter, while Aurora's 4/6 quorum is much more resilient to tail latency. We will see in a second why that matters for your applications.
13. (20 seconds) And so we ran a Sysbench workload on Aurora and MySQL, and we got an order of magnitude more writes and 2.5x more reads compared to stock MySQL running on EBS.
14. (10 seconds) Here is another example, with bulk load plus indexing. Again, Aurora is 2.5x faster.
15. (1 minute) Let's talk about read scale-out for your OLTP reads or analytics queries. On the left, we have MySQL's native binlog-based replication, typically used in the MySQL community. On the right, we have Aurora physical replication. Unlike MySQL, which has to transfer full rows or statements, Aurora only transfers log records (nothing but the delta changes), and those are compressed. Unlike MySQL binlog replication, Aurora doesn't need to write anything on the replica: no extra write I/O or storage involved. Also note that Aurora only needs to update pages that are already in the replica's cache, so there is no read I/O either. In fact, we only transfer what's in the replica cache; we filter it out on the master itself. Even better.
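A minimal sketch of that in-cache-only apply, assuming a hypothetical cache layout (this is not Aurora's code): shipped redo records are applied only to pages the replica already caches; anything else can be skipped, since the page will be read from shared storage on demand.

```python
def apply_redo_on_replica(cache, redo_stream):
    # Each record: (page_id, lsn, {row: value}) shipped from the master.
    for page_id, lsn, change in redo_stream:
        page = cache.get(page_id)
        if page is None:
            continue                  # not cached: no read I/O, no write I/O
        page["rows"].update(change)   # in-cache update only; storage has the rest
        page["lsn"] = lsn

cache = {"P1": {"lsn": 0, "rows": {}}}            # P2 is not in the replica cache
apply_redo_on_replica(cache, [("P1", 5, {"a": 1}),
                              ("P2", 6, {"b": 2})])
print(cache)   # only the cached page P1 changed
```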
16. (30 seconds) Here you can see the comparison of a binlog replica with an Aurora physical replica. Both are on the same Aurora instance, same software, same hardware, to keep it an apples-to-apples comparison. The binlog graph on the left is in seconds, and it spiked to five minutes within the first 10 minutes under heavy load. On the right we have the Aurora physical replica; as you can see, it consistently stays under 20 ms for hours and hours under the same load.
17. (2 minutes) Let's see how the write quorum and local tracking work in action. We have the same setup, four transactions with storage at the bottom, and a quorum tracker on the right. There are four waiting transactions, and none of them is committed yet. In a traditional database, we keep the WAL sequential, buffer the writes, and flush them sequentially; as soon as a write is flushed, we consider that transaction committed and ack back to the client. Instead, we issue the writes to storage immediately, in parallel, and use the write tracker to ack in the right order, only once everything is flushed up to that point; otherwise we would break write-ahead logging. As you can see, there is no distributed consensus like Paxos or Raft, with multiple phases or sync points, in the storage. It's all quorum plus local tracking by the individual SNs, because we leverage the sequencing from the head node. Now, I didn't talk about reads here, but there too we use a different tracking mechanism instead of relying on any sort of consensus. Refer to the Aurora SIGMOD papers for details if you are interested.
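Here is a minimal sketch of that in-order ack logic (the `WriteTracker` name is invented; the SIGMOD paper calls the contiguous durable point the volume durable LSN, or VDL): quorum acks may arrive out of order, but a commit is acknowledged only once every LSN below it is durable.

```python
class WriteTracker:
    def __init__(self):
        self.durable = set()   # LSNs whose writes reached the 4/6 quorum
        self.vdl = 0           # highest LSN with no gaps below it

    def on_durable(self, lsn):
        self.durable.add(lsn)
        while self.vdl + 1 in self.durable:
            self.vdl += 1      # advance the ack point over contiguous LSNs

    def can_ack_commit(self, commit_lsn):
        # Never ack a commit while an earlier write is still in flight,
        # otherwise write-ahead logging would be broken.
        return commit_lsn <= self.vdl

t = WriteTracker()
for lsn in (2, 3, 1):          # quorum acks arrive out of order
    t.on_durable(lsn)
print(t.can_ack_commit(3))     # True: LSNs 1..3 are all durable, ack in order
```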
18. (1 minute 30 sec) So, we looked at Sysbench response times under heavy load. With Aurora we used 10K connections; with MySQL we used only 500 connections, because it starts thrashing after that and gets even worse. As expected, the response times for Aurora are not only lower but have much less variation. More precisely, based on the standard deviations of the two data sets, Amazon Aurora is more than 200x more consistent than MySQL, and the average response time is about 25x lower. Note that Aurora is pushing 45x more throughput in this example. You might wonder what's going on with the spikes in MySQL: what you see is the impact of database checkpoints. During a checkpoint, MySQL does a lot of writes, which slows down user transactions, hence the variability in the MySQL response times. Three reasons why Aurora is so much better: 1) lightweight consensus, 2) the ability to flush out of order, and 3) no checkpointing, because we construct pages from the logs themselves.
19. (1 minute) Here are some examples of the software innovations we did to give you a world-class database. Let's take a look at the thread model. MySQL, on the left, follows a thread-per-connection model; clearly, that doesn't scale with connections. Aurora instead uses a thread pool with epoll and a latch-free task queue, which allows it to scale much better with connections. Here is another example: when you push more writes, you get more contention in the system, and if we simply locked the whole lock table like MySQL does, a lot of our other effort would go in vain. Instead, Aurora allows concurrent access to any given lock chain.
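The sketch below contrasts the two models under loose assumptions (a toy TCP server, not MySQL's or Aurora's code): instead of one OS thread per connection, a single event-loop thread multiplexes sockets (Python's selectors module is epoll-backed on Linux) and hands ready connections to a fixed worker pool via a queue, standing in for the latch-free task queue.

```python
import queue
import selectors
import socket
import threading

tasks = queue.Queue()        # stands in for the latch-free task queue

def worker():
    # A fixed pool of workers executes requests, independent of client count.
    while True:
        conn = tasks.get()
        data = conn.recv(4096)          # "run the statement"
        if data:
            conn.sendall(b"ok\n")
        conn.close()                    # one request per connection in this toy

for _ in range(8):                      # 8 workers, whether 10 or 10,000 clients
    threading.Thread(target=worker, daemon=True).start()

sel = selectors.DefaultSelector()       # epoll-backed on Linux
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen()
sel.register(listener, selectors.EVENT_READ)

while True:                             # one thread multiplexes all connections
    for key, _ in sel.select():
        if key.fileobj is listener:
            conn, _ = listener.accept()
            sel.register(conn, selectors.EVENT_READ)
        else:
            sel.unregister(key.fileobj)   # a worker owns it until it replies
            tasks.put(key.fileobj)
```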
21. (20 seconds) Besides the software improvements, the hardware also improved, and combined you can see Aurora getting better and better. Aurora now delivers 200K writes and a whopping 700K reads per second on a single R4.16XL instance, and it's getting better every day!
22. (20 seconds) A lot of customers ask me how to tune Aurora. Well, Aurora automatically pre-tunes and auto-tunes its parameters for different hardware configurations for you. Unless you are doing something really peculiar, you will get the best performance out of the box.
23. (1 minute) Now, there are a few parameters in MySQL, like innodb_flush_log_at_trx_commit, innodb_log_file_size, or sync_binlog, that allow for better write performance, but there is usually a trade-off. Here is one such example with innodb_log_file_size: you can get better performance in MySQL, but it also increases the recovery time if there is a database failure. The reason is that this parameter fundamentally delays the checkpoint, and by increasing that interval you accumulate more redo log. When recovering from a crash, MySQL has to replay those logs in a single thread; the more logs to replay, the longer recovery takes. With Amazon Aurora, there are no checkpoints, and it doesn't even matter. In essence, you don't have to trade performance against availability or durability with Aurora.
24. (45 sec) Okay, let's talk about multi-master. Before we jump in, some quick background on the space. We first had SQL running on one node, but that was hard to scale. To scale, we manually sharded, but that was very hard to manage as partitions became hot or schema changes had to span the partitions. Then, to simplify, we built NoSQL systems, but they fundamentally lack transaction support, and it is very hard for our customers to build apps on eventually consistent systems; the customers I talk to love the transaction model, it is very easy to reason about. With Aurora single-master we addressed most of this, but a few gaps remained. With multi-master, we are addressing most of those gaps by adding write scalability and database write availability for our customers.
25. (2 min) Let's take a look at how some of the existing multi-master solutions work. 1) First, there is the shared-disk model with the caches fused together. The challenge with these systems is that they use pessimistic locking. The other challenge is that they require high cache-coherence traffic, on a per-lock basis, so you either need expensive interconnects between the nodes (typically put together in a small room in the data center) or you suffer from hot blocks ping-ponging across the nodes. 2) Then there are systems that use the read-write-set technique. Basically, as part of a transaction, you first read all the objects and later modify them based on the values you read. At commit time, if anyone else changed any of those objects since you read them, you simply abort the transaction; otherwise you commit. If we follow that for all transactions in some particular order, we can guarantee that all nodes independently come to the same decision; but coming up with this global order ends up becoming the bottleneck in these systems. 3) Finally, there are NoSQL-style systems, with query processing on top, where data is range-partitioned. They elect a Paxos leader within each partition, but if there is skew in the access pattern, which is quite typical, you end up with hot partitions; for example, if you partition by date-time, you are always inserting into the last range. They also typically use a heavy-weight consensus protocol for commits.
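A minimal sketch of the read-write-set technique from point 2, with a hypothetical single-node `Store` (real systems validate against a global order, which is exactly the bottleneck mentioned above): each transaction records the versions it read, and at commit it aborts if any of them changed.

```python
class Store:
    def __init__(self):
        self.data, self.version = {}, {}

    def begin(self):
        return {"reads": {}, "writes": {}}

    def read(self, txn, key):
        txn["reads"][key] = self.version.get(key, 0)   # remember what we saw
        return self.data.get(key)

    def write(self, txn, key, value):
        txn["writes"][key] = value                     # buffered until commit

    def commit(self, txn):
        # Validate: abort if anything we read changed underneath us.
        for key, seen in txn["reads"].items():
            if self.version.get(key, 0) != seen:
                return False
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.version[key] = self.version.get(key, 0) + 1
        return True

s = Store()
t1, t2 = s.begin(), s.begin()
s.read(t1, "x"); s.write(t1, "x", 1)
s.read(t2, "x"); s.write(t2, "x", 2)
print(s.commit(t1), s.commit(t2))   # True False: the later committer aborts
```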
26. (2.5 min) With Aurora multi-master, there is no pessimistic locking, no explicit global ordering, and no global commit coordination. The architecture is based on three techniques. First, it uses optimistic conflict resolution in the storage. To understand this better, say the orange master runs T1 and the blue master runs T2. If T1 and T2 modify different pages, there is no conflict and hence no sync required. However, if T1 and T2 both touch P2, then one of them wins and the other has to retry, based on the quorum. As you can see, that doesn't require any heavy-weight consensus protocol: again we rely on quorum and local tracking, with partitioned, monotonic LSN sequencing from the individual database nodes, to order the writes. Second, since the logging layer is pushed down, Aurora decouples the transaction layer from the logging layer, which lets it separate physical conflicts on pages from logical conflicts between transactions: transaction conflicts are handled through MVCC, and physical conflicts through optimistic conflict resolution. Third, there is no direct coupling between the storage partitions, or between the database nodes in the cluster. It is a microservices architecture: independent, minimal, resilient services run in the cluster to handle the async coordination, and any of them temporarily going down does not impact the whole cluster. Net-net: Aurora only coordinates when it has to coordinate. Let's see this in action.
27. (1 min) Let's say we have two clients, C1 and C2, talking to the blue and orange masters respectively. We start from the simple case where the two clients write to two different tables. Both clients start transactions, BT1 and OT1, on their respective master nodes. They both issue an update, but to two different tables, and they can both commit; no explicit sync required.
28. (1 min) Same setup, but now the two clients want to write to the same entry in the same table. Again, both clients start transactions BT1 and OT1 on their respective master nodes. They both issue an update, this time to the same table, modifying the same entry. When they both try to commit, one of them wins and the other loses.
29. (1 min) Now, it's possible for two transactions to conflict even though there is no physical conflict. Let's see that. Same setup again, with two clients writing to the same row in the same table. Both clients start transactions BT1 and OT1 on their respective master nodes. C1 sends an update and gets a quorum, and the changes get replicated to the orange master. Now, if C2 updates the same row, storage is totally okay with it, because the changes were made on top of the latest image; but we detect the conflict in the database itself, through MVCC, and roll back the transaction. No distributed locking needed. And of course C1 commits successfully.
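Pulling slides 27 through 29 together, here is a minimal sketch of the physical-conflict case (the class names and version check are illustrative assumptions, not Aurora's implementation): each storage copy accepts a page write only if it was made on top of the page version that copy currently holds, so of two masters racing on the same page, only one can assemble a 4/6 quorum and the loser retries.

```python
COPIES, WRITE_QUORUM = 6, 4

class StorageCopy:
    def __init__(self):
        self.page_version = {}

    def try_write(self, page, base_version):
        # Accept only writes made on top of the version this copy holds.
        if self.page_version.get(page, 0) != base_version:
            return False
        self.page_version[page] = base_version + 1
        return True

def master_write(copies, page, base_version):
    acks = sum(c.try_write(page, base_version) for c in copies)
    return acks >= WRITE_QUORUM        # the loser rolls back and retries

copies = [StorageCopy() for _ in range(COPIES)]
print(master_write(copies, "P2", 0))   # blue master wins: True
print(master_write(copies, "P2", 0))   # orange master, same base: False
```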
30. (1 min) We ran Sysbench on a multi-master cluster. As we scaled up the cluster at the 5-minute mark, throughput went up from 14K to 27K. At t=10, we added two more nodes, and you can see throughput went up to 48K. At t=15, one of the machines went down and aggregate throughput came down to 38K, then went back to 48K when the affected node came back up at t=16. This is really cool!
31. (1 min) Switching gears: what about reads? How do reads work in multi-master; more precisely, how do we offer linearizability? Let me illustrate the problem with an example. John and Bob are friends. One day John proposes to Sara, so he updates his status to let everyone know. Bob, who was checking his updates, sees that John finally proposed, but then sees that John's status is still single, and concludes it probably didn't work out. If Bob had done a global read, however, he would have found out that John is engaged to Sara, and he would immediately call John to congratulate him. Local reads let you read your own changes, but if you want to read all the changes in the cluster, you need global reads.
32. (1 min 15 sec) Say we have three nodes, N1, N2, and N3. C1 issues a request to N1. N1 sends a hello request to N2 and N3, which respond with timestamps T2 and T3 respectively, the times at which they saw N1's hello request. N1 then waits for replication to catch up to T2 and T3 from N2 and N3 respectively, and once it's caught up, performs the read and returns the results. This is a very simplified view; there is quite a bit of engineering and complexity involved to make it work in practice. As you can see, there is no wait on the write path: it only adds latency to reads that need global consistency, and it's configurable per session.
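A simplified sketch of that read barrier (hypothetical `Node` objects; the hello round is reduced to reading each peer's position): the serving node computes the maximum position its peers report, waits until its own replication stream has caught up, and only then runs the read.

```python
import time

class Node:
    def __init__(self, lsn=0):
        self.lsn = lsn                    # position this node has applied/committed

def global_read(local, peers, run_query, poll=0.001):
    barrier = max(p.lsn for p in peers)   # 1. "hello" round: collect positions
    while local.lsn < barrier:            # 2. wait for replication to catch up
        time.sleep(poll)                  #    (only this read waits; writes never do)
    return run_query()                    # 3. the read now sees all peers' commits

n1, n2, n3 = Node(5), Node(7), Node(6)
n1.lsn = max(n2.lsn, n3.lsn)              # simulate replication catching N1 up
print(global_read(n1, [n2, n3], lambda: "John: engaged to Sara"))
```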
33. (30 seconds) Net-net, Aurora achieves linear write scaling through optimistic conflict resolution; continuous availability through the microservices architecture; enterprise-grade durability through six copies, two per AZ, plus continuous backup of your data to highly durable S3 storage; and it supports the indexes, constraints, triggers, procedures, and functions you need for your relational database application.
34. (45 sec) So much for OLTP; let's talk about your OLAP queries. Here are some of the optimizations we did for OLAP. Batched scans: the idea is to scan tuples in batches from the InnoDB buffer pool, to avoid latching and traversing the same pages again and again, and to enable JIT optimizations; mainly for in-memory workloads. Hash joins: these improve equi-join performance; build a hash table from one side and scan the other side to probe it (see the sketch below). There is lots of complexity around skew and duplicates: you have to minimize the number of passes, not to mention decide when to choose a hash join over other join operators like index joins or nested-loop joins. Asynchronous key prefetch: prefetches pages into memory for index joins using BKA; quite useful for non-equi joins, or for equi-joins where one side is small and the big side has a high-cardinality index on the join column; for out-of-cache workloads.
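Here is the minimal shape of the hash-join idea referenced above (a generic textbook sketch, not Aurora's operator, which also has to handle spilling, skew, and duplicates): build a hash table on the smaller input, then stream the larger input through it to probe.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    table = defaultdict(list)
    for row in build_rows:               # build phase: hash the smaller side
        table[row[key]].append(row)      # duplicates per key are kept
    for row in probe_rows:               # probe phase: one scan of the big side
        for match in table.get(row[key], ()):
            yield {**match, **row}

customers = [{"c_id": 1, "name": "ann"}, {"c_id": 2, "name": "bob"}]
orders = [{"c_id": 1, "o_id": 10}, {"c_id": 1, "o_id": 11}]
print(list(hash_join(customers, orders, "c_id")))   # two joined rows
```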
35. (15 seconds) We ran a TPC-H-like workload; here you can see the performance improvement from all those optimizations. Roughly half the queries are more than 2x better, with a peak speedup of roughly 18x.
36. (1 min) Parallel query pushes processing down to thousands of storage nodes. Moving processing closer to the data reduces network traffic, and it reduces buffer pool pollution. Why does that matter? We will see in a second.
37. (1 min) On the left, a request is sent to the SN, including the pages, the page LSNs, and the function to evaluate. In return we get two streams back: clean and dirty. The clean stream is the set of records that have not been modified since the query started; it is sent to the aggregator to merge with the partial clean streams from the other SNs. The dirty stream goes through the MVCC converter to get the right versions, has the function applied, and is fed back into the aggregator. The combined result is sent back to the client and to the next step in query execution. We already push down predicates and projections. This is an active area of work; we can do much more and exploit storage in unique ways that were not possible before. There are more challenges involved, for example: how we scan the list of pages to process without holding latches for long; how we do flow control; how we run each request in a secure container on the storage; and how we seamlessly handle storage node failures. There will be a parallel query chalk talk on Thursday if you are interested in more details.
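A toy sketch of the clean/dirty split (the page layout and the `storage_scan` function are invented for illustration): pages untouched since the query's start LSN are filtered right on the storage node, while newer pages are returned raw so the head node's MVCC machinery can pick the correct versions first.

```python
def storage_scan(pages, query_start_lsn, predicate):
    clean, dirty = [], []
    for page in pages:
        if page["lsn"] <= query_start_lsn:
            # Unmodified since the query began: filter right on the node.
            clean.extend(r for r in page["rows"] if predicate(r))
        else:
            dirty.append(page)   # modified: head node must resolve MVCC versions
    return clean, dirty

pages = [{"lsn": 3, "rows": [{"x": 1}, {"x": 9}]},
         {"lsn": 8, "rows": [{"x": 7}]}]
clean, dirty = storage_scan(pages, query_start_lsn=5,
                            predicate=lambda r: r["x"] > 5)
print(clean)   # [{'x': 9}]: already filtered at the storage node
print(dirty)   # the newer page goes back raw for MVCC resolution
```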
38. (30 sec) This is on top of the previous improvements. As you can see, some queries are two orders of magnitude better, and several are an order of magnitude better. To be clear, this is no Amazon Redshift, but it is clearly a good option if you are doing some lightweight analytics.
39. (30 sec) Processing closer to the data significantly reduces the data transfer between the head node and the SNs. As a result, there is significantly less impact on OLTP performance, thanks to the reduction in network traffic and in buffer pool pollution. We used a 150 GB dataset precisely to show the impact, because an 8XL instance's buffer pool is around 150 GB: if we bring in pages for OLAP queries, they evict pages needed by OLTP queries.
40. Let's talk about availability.
41. Why six copies? Even in the presence of a background rate of node and disk failures, with an entire AZ going down on top, we keep going. How? Even if we lose three copies (an entire AZ plus one more node), we still have three copies left, enough to re-establish quorum. If we instead used a 2-out-of-3 quorum and two copies were lost, we would lose data.
42. We continuously back up your data to S3. How does it work? Aurora divides the database into 10 GB segments; we take snapshots of those segments and stream the delta redo logs to S3. On restore, we fetch those snapshots and apply the delta log streams on top, in parallel. This all happens in the storage layer and has no performance impact on the database nodes.
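As a rough model of that per-segment restore (the data layout is hypothetical; each segment here is just a dict): every segment independently fetches its snapshot and replays only its own delta redo, so segments restore in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def restore_segment(segment):
    state = dict(segment["snapshot"])          # fetch this segment's snapshot
    for lsn, key, value in segment["delta_log"]:
        state[key] = value                     # replay only this segment's redo
    return state

segments = [
    {"snapshot": {"a": 1}, "delta_log": [(5, "a", 2)]},
    {"snapshot": {"b": 3}, "delta_log": [(6, "c", 4)]},
]
with ThreadPoolExecutor() as pool:             # every segment restores in parallel
    print(list(pool.map(restore_segment, segments)))
```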
43. Now, there are times when we accidentally delete a table or forget a WHERE clause in a DELETE statement. Backtrack lets you quickly get the data back without fully restoring from backups; it is a relatively quick operation, a couple of minutes versus hours. In this example, you first backtrack to t1 and let it run some transactions. Note that you can backtrack back and forth to find the right point; it is not a destructive operation.
44. In traditional databases, we have to replay the logs since the last checkpoint, in a single thread. With Aurora, the redo logs are already being applied to each segment in parallel, asynchronously. At startup, we don't have to do anything other than some math to find the point at the time of the crash.
45. You can have up to 15 replicas. Because the replicas share the storage with the master, there is no loss of data on failover, and you can define the failover order.
46. Let's see what multi-master availability looks like in practice.
47. Two primary use cases: disaster recovery, and enhancing data locality by bringing data closer to your customers' applications in different regions.
48. You continue to get high throughput, and the lag across regions stays under a second even at peak throughput, which is quite impressive. For DR, you can switch your apps to a different region in under a minute; that is basically your RTO, with an RPO of under a few seconds. How did we do it? A multi-tenant, distributed replication fleet attaches itself as a replica on one side and a writer on the other, and does compressed physical replication between the two.
49. Here we compared logical and physical replication across regions using Sysbench. For logical replication, we used multi-threaded apply with 64 parallel workers. As we ramped the workload past 25K QPS, the logical replica was unable to keep up, with the lag rising consistently, while the physical lag stayed under a second even at peak throughput.
50. Let's talk about manageability.
51. (1 min) We announced Performance Insights support for Aurora MySQL earlier this year. It's a single place for you to monitor and root-cause load issues: it lets you group by waits, SQL, users, or hosts, over time or over any of those metrics. For example, the problem could be high CPU, lock waits, or I/O latency; you can then take action by tuning the SQL statements or adding more resources. It's done so seamlessly that it doesn't impact the performance of your database.
52. (30 seconds) With Aurora, you don't have to manage storage; it automatically grows for you. You don't have to manage read replicas; we can auto-scale them for you based on your workload. And custom reader endpoints let you separate, for example, analytics replicas from OLTP replicas.
53. (30 seconds) With Serverless, we can now manage the writer instance for you, automatically scaling it up and down (including down to zero) depending on the load. So if you have a dev/test workload, or a sporadic, cyclic, or unpredictable workload, Serverless may be a great option for you. It's a great way to save cost, as you only pay for the time you actually use.
54. (2 min) To understand how, let's first look at the different layers. There is a multi-tenant, distributed proxy layer in front, and your database connections are spread across that fleet, so there is no single point of failure. Then we have a warm pool of instances of different sizes kept on the side to quickly scale up or down. Finally, there is a monitoring service running on the left, watching your database instances and taking actions as needed. Here is how seamless scaling works: we first attach the new instance as a replica; then we ask the database to find a safe scaling point with no active transactions. Once it finds such a point, it starts looping all incoming traffic back to the proxy fleet, along with the coordinates of the replica instance. The proxy reads that payload, redirects all the network streams from the old host to the new host along with any new traffic, and finally sends a close message to the old machine. So you get no broken connections and no app impact.
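A highly simplified sketch of that handoff (all names and the dict-based "databases" are stand-ins for illustration): catch the warm instance up, wait for a safe point with no active transactions, repoint the proxy sessions, and only then close the old host, so client connections never drop.

```python
def attach_as_replica(new_db, old_db):
    new_db["data"] = dict(old_db["data"])      # stand-in for physical catch-up

def wait_for_safe_point(db):
    assert db["active_txns"] == 0              # stand-in for the real wait loop

def scale(proxy_sessions, old_db, new_db):
    attach_as_replica(new_db, old_db)          # 1. warm-pool instance catches up
    wait_for_safe_point(old_db)                # 2. scale only with no active txns
    for session in proxy_sessions:
        session["backend"] = new_db            # 3. repoint network streams
    old_db["closed"] = True                    # 4. proxy closes the old host

old = {"data": {"k": 1}, "active_txns": 0}
new = {"data": {}, "active_txns": 0}
sessions = [{"client": "c1", "backend": old}]
scale(sessions, old, new)
print(sessions[0]["backend"] is new)           # True: same session, new host
```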
  55. (30 seconds) We ran a quick simulation and you can see for yourself.
56. (20 seconds) We are also announcing a web-service Data API for your Lambda apps on top of Serverless: you simply send us an HTTP request and don't even need to worry about connection pooling.
57. (10 seconds) If you'd like to know more about multi-master, Serverless, or parallel query in Aurora MySQL, we have the following sessions coming up.
  58. Thank you all and have a great rest of your conference.