© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Deep Dive on Amazon Aurora with
MySQL Compatibility
DAT304
Kamal Gupta
Senior Development Manager, Amazon Aurora MySQL
Vision: Amazon Relational Database Service
Choice of open source and commercial databases

RDS Platform
OPEN SOURCE ENGINES | COMMERCIAL ENGINES | CLOUD NATIVE ENGINE
> Advanced monitoring
> Routine maintenance
> Push-button scaling
> Automatic failover
> Backup & recovery
> Cross-region replication
> Isolation & security
> Industry compliance
> Automated patching
Amazon Aurora…
Enterprise database at open source price

• Speed and availability of high-end commercial databases
• Simplicity and cost-effectiveness of open source databases
• Drop-in compatibility with MySQL and PostgreSQL
• Simple pay-as-you-go pricing

Delivered as a managed service
Aurora customer adoption
Aurora is used by ¾ of the top 100 AWS customers
Fastest growing service in AWS history
Traditional Database Architecture
Monolithic stack in a single box

Compute: SQL | Transactions | Caching | Logging
Local Storage
Traditional Database Architecture
Decoupled storage from compute

Compute: SQL | Transactions | Caching | Logging
Network
Storage
Traditional Distributed Database Stack

[Diagram: several applications, each talking to its own database node; every node runs the same monolithic stack (SQL, Transactions, Caching, Logging) over its own storage.]

• Same monolithic stack on every node
• Distributed consensus algorithms perform poorly (multiple phases, multiple rounds, sync points)
Aurora: Scale-out, Distributed Architecture

[Diagram: a master and replicas (each with SQL, Transactions, Caching) across AZ1, AZ2, and AZ3 on top of a shared storage volume.]

• Push the log applicator to storage
• 4/6 write quorum & local tracking

No more trade-offs!
• Write performance
• Read scale out
• AZ + 1 failure tolerance
• Instant database redo recovery
Log Applicator in Action
The log is the database!

[Diagram: a master and a replica above six storage nodes (1-6); transactions T1-T4 generate log records; replication to the replica is asynchronous; pages such as PAGE1 are materialized by on-demand log apply.]
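To make on-demand log apply concrete, here is a minimal Python sketch. It is an illustration of the idea only, not Aurora's implementation, and every name in it is invented for the example: a storage node accepts redo log records as its only write traffic and coalesces them into a page version when the page is actually read.

```python
# Illustrative sketch of "the log is the database": writes are log records,
# and pages are materialized from the log on demand (all names invented).
from dataclasses import dataclass, field

@dataclass
class LogRecord:
    lsn: int        # monotonically increasing log sequence number
    page_id: int
    change: str     # redo payload, simplified to a string

@dataclass
class PageVersion:
    lsn: int = 0
    contents: list = field(default_factory=list)

class StorageNode:
    def __init__(self):
        self.pages = {}      # page_id -> coalesced PageVersion
        self.pending = {}    # page_id -> log records not yet applied

    def receive_log(self, rec):
        # The only write path: append a log record. No full-page writes.
        self.pending.setdefault(rec.page_id, []).append(rec)

    def read_page(self, page_id):
        # On-demand log apply: coalesce pending records into the page only
        # when it is read (or lazily, during background garbage collection).
        page = self.pages.setdefault(page_id, PageVersion())
        for rec in sorted(self.pending.pop(page_id, []), key=lambda r: r.lsn):
            page.contents.append(rec.change)
            page.lsn = rec.lsn
        return page

node = StorageNode()
node.receive_log(LogRecord(lsn=1, page_id=1, change="T1"))
node.receive_log(LogRecord(lsn=2, page_id=1, change="T2"))
print(node.read_page(1))  # PAGE1 materialized with both changes applied
```

The point of the sketch: a read never requires a prior full-page write, because the page at any LSN can be rebuilt from the log records themselves.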
MySQL vs. Aurora I/O profile

MYSQL WITH REPLICA: the primary instance must write every TYPE OF WRITE (binlog, data, double-write, log, FRM files) to mirrored Amazon Elastic Block Store (EBS) volumes, replicate to a replica instance across AZ 1 and AZ 2 (steps 1-5), and back up to Amazon S3. Steps 1, 3, and 4 are synchronous, which leads to jitter.

MySQL I/O profile for a 30 min Sysbench run:
  0.78MM transactions
  7.4 I/Os per transaction

AMAZON AURORA: the primary instance issues only log records, asynchronously, as 4/6-quorum distributed writes to six storage nodes across AZ 1, AZ 2, and AZ 3; replica instances share the same volume; backup streams to Amazon S3.

Aurora I/O profile for a 30 min Sysbench run:
  27MM transactions (35X MORE)
  0.95 I/Os per transaction (7.7X LESS)
Write and read throughput
Aurora MySQL is 5x faster than MySQL

[Charts: write throughput (0-250,000) and read throughput (0-800,000) for MySQL 5.6, MySQL 5.7, MySQL 8.0, Aurora 5.6, and Aurora 5.7.]

Using Sysbench with 250 tables and 200,000 rows per table on R4.16XL
Bulk data load performance
Aurora MySQL loads data 2.5x faster than MySQL

[Chart: runtime in seconds (0-800) for data loading and index build, MySQL vs. Amazon Aurora.]

10 Sysbench tables, 10MM rows each, on R4.16XL
Read scale out

MYSQL READ SCALING: the MySQL master (30% read, 70% write) ships the binlog to a MySQL replica, which must repeat the same write workload (70% write) via single-threaded binlog apply, leaving room for only 30% new reads. Replication is logical, using complete changes, onto independent storage (a separate data volume per instance).

AMAZON AURORA READ SCALING: the Aurora master (30% read, 70% write) sends only page cache updates to the Aurora replica, which serves 100% new reads. Replication is physical, using delta changes, with NO writes on the replica, over shared multi-AZ storage.
Aurora MySQL logical vs. physical replica lag

"In MySQL, we saw replica lag spike to almost 12 minutes, which is almost absurd from an application's perspective. With Aurora, the maximum read replica lag across 4 replicas never exceeded 20 ms."

[Charts: binlog replica lag (sec.), Aurora logical replica lag (seconds), and Aurora physical replica lag (msec).]
Quorum and local tracking in action
No heavy-weight distributed commits

[Diagram: the master runs transactions T1-T4 and flushes their log records ([T1]-[T4]) in parallel to six storage nodes (1-6) in the shared storage volume; a local tracker records, per transaction, how many of the 6 copies have acknowledged (quorum is 4/6).]

Parallel flush, with the tracker advancing as acknowledgments arrive (commits are acknowledged in order, so T3 waits behind T2 even after reaching quorum):

  Durability: 0 0 0 0 | Waiting Tx: T1 T2 T3 T4 | Committed Tx: none
  Durability: 4 3 1 0 | Waiting Tx: T2 T3 T4 | Committed Tx: T1
  Durability: 6 3 5 2 | Waiting Tx: T2 T3 T4 | Committed Tx: T1
  Durability: 6 4 5 6 | Waiting Tx: none | Committed Tx: T1 T2 T3 T4
  Durability: 6 6 6 6 | Waiting Tx: none | Committed Tx: T1 T2 T3 T4
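The tracker's behavior can be sketched in a few lines of Python. This is a toy model under stated assumptions (a fixed 4/6 quorum, acknowledgments delivered per LSN; all names invented), not Aurora's code: log records go to all six storage nodes in parallel, and commits are acknowledged in LSN order once each record reaches quorum, preserving write-ahead-logging semantics without any distributed consensus round.

```python
# Toy model of 4/6 write quorum with local commit tracking (names invented).
QUORUM = 4

class CommitTracker:
    def __init__(self):
        self.acks = {}        # lsn -> storage-node acknowledgments received
        self.committed = []   # transactions acked to clients, in LSN order
        self.waiting = []     # (lsn, txn) pairs awaiting quorum, LSN-ordered

    def submit(self, lsn, txn):
        # The log record is already in flight to all six storage nodes.
        self.acks[lsn] = 0
        self.waiting.append((lsn, txn))

    def on_storage_ack(self, lsn):
        self.acks[lsn] += 1
        # Acknowledge commits strictly in LSN order: a later transaction
        # never becomes visible before an earlier one (WAL is preserved).
        while self.waiting and self.acks[self.waiting[0][0]] >= QUORUM:
            _, txn = self.waiting.pop(0)
            self.committed.append(txn)   # ack back to the client here

tracker = CommitTracker()
tracker.submit(lsn=1, txn="T1")
tracker.submit(lsn=2, txn="T2")
for _ in range(QUORUM):
    tracker.on_storage_ack(2)             # T2 reaches quorum first...
assert tracker.committed == []            # ...but must wait behind T1
for _ in range(QUORUM):
    tracker.on_storage_ack(1)
assert tracker.committed == ["T1", "T2"]  # acked in order
```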
Performance variability under load
Amazon Aurora >200x more consistent

[Chart: write response time in seconds (0-12) over time in seconds (0-600); Amazon Aurora stays flat and low while MySQL 5.6 on EBS spikes repeatedly.]

SysBench OLTP (write-only) workload with 250 tables and 200,000 rows per table on R4.16XL
What else have we done to drive throughput?

MYSQL THREAD MODEL: a thread per client connection.
AURORA THREAD MODEL: client connections are multiplexed via epoll() onto a latch-free task queue.
What else have we done to drive throughput? (continued)

MYSQL THREAD MODEL vs. AURORA THREAD MODEL: as above, client connections feed a latch-free task queue via epoll().

MYSQL LOCK MANAGER: Scan, Delete, and Insert operations contend on shared lock-manager state.
AURORA LOCK MANAGER: Scan, Delete, and Insert operations proceed concurrently against the lock manager.
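For a rough feel of the multiplexed model, here is a toy epoll() event loop in Python (Linux-only `select.epoll`). This is a generic illustration of connection multiplexing plus a task queue, not Aurora's engine code, and all names are invented:

```python
# Toy connection multiplexer: one poller thread watches many client sockets
# with epoll() and enqueues ready work; a small worker pool drains the queue.
import queue, select, socket, threading

tasks = queue.Queue()   # stand-in for the latch-free task queue

def poller(server):
    ep = select.epoll()
    ep.register(server.fileno(), select.EPOLLIN)
    conns = {}
    while True:
        for fd, _events in ep.poll(timeout=1):
            if fd == server.fileno():
                conn, _addr = server.accept()
                conns[conn.fileno()] = conn
                ep.register(conn.fileno(), select.EPOLLIN)
            elif (data := conns[fd].recv(4096)):
                tasks.put((conns[fd], data))   # hand the "query" to a worker

def worker():
    while True:
        conn, data = tasks.get()
        conn.sendall(b"ok\n")   # execute the request and reply

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen()
# Daemon threads, for sketch purposes; a real server would join/supervise.
threading.Thread(target=poller, args=(server,), daemon=True).start()
for _ in range(4):
    threading.Thread(target=worker, daemon=True).start()
```

The contrast with a thread-per-connection model: thousands of idle connections cost almost nothing here, because threads are tied to runnable work rather than to sockets.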
Performance improvement over time
Aurora MySQL, 2015-2018

[Charts: max write throughput (0-250) and max read throughput (0-800), 2015 through 2018.]

Launched with R3.8xl: 32 cores, 256GB memory
Now support R4.16xl: 64 cores, 512GB memory
R5.24xl coming soon: 96 cores, 768GB memory

Besides many performance optimizations, we are also upgrading the HW platform
What about performance parameters?
Pre-tuned or auto-tuned for different hardware configurations
Recovery time vs. write performance
No more trade-offs!

  Configuration              Recovery time (sec.)   Writes per second
  MySQL (16GB checkpoint)    376.0                   90,606
  MySQL (1GB checkpoint)      99.0                   26,129
  MySQL (128MB checkpoint)    40.0                    4,382
  Amazon Aurora                0.5                  207,398

SysBench OLTP (write-only) 10GiB workload with 250 tables and 200,000 rows
Distributed databases comparison

[Matrix comparing SQL (1 node), manual sharding, NoSQL, and Aurora Single Master on Scalability, Availability, Manageability, and Compatibility.]
Existing Multi-Master solutions

Distributed lock manager (masters M1-M3 over SHARED STORAGE):
heavyweight synchronization, pessimistic and negative scaling.
Ex: Oracle RAC, DB2 PureScale, Sybase

Global ordering with read-write set (each node runs the full SQL / Transactions / Caching / Logging stack; transactions T1, T2, T3, ... T100 funnel through a GLOBAL ORDERING UNIT):
the global entity becomes a scaling bottleneck.
Ex: Galera, TangoDB, FaunaDB

Paxos leader with 2PC (data ranges #1-#5, each with a leader "L"):
heavy-weight consensus protocol, hot partitions, and struggles with cross-partition queries.
Ex: Spanner, CockroachDB, Ignite
Aurora Multi-Master Architecture

[Diagram: an Orange Master, a Blue Master, and Replicas (each with SQL, Transactions, Caching) over a shared storage volume spanning AZ1, AZ2, and AZ3; writes from T1 and T2 land on storage nodes, and a conflicting write ("3?") is resolved at the storage layer.]

• No pessimistic locking
• No global ordering
• No global commit-coordination
• Optimistic conflict resolution
• Decoupled system
• Microservices architecture

Cluster services: membership, heartbeat, replication, metadata
AuroraMulti-Master –how (happypath)
Blue master Orange master
C1 C2
Non-conflicting writes originating on different masters on
different tables
Blue Master Orange MasterTime
BeginTrx (BT1)1 BeginTrx (OT1)
2 Update (table1)
3
Update (table2)
Page 1
Page 1
Page 1
Page 1
Page 1
Page 2
Page 2
Page 2
Page 2
Page 2
Page 2
Commit (BT1) Commit (OT1)
Page 1
OK OK
Aurora Multi-Master: how (physical conflict)

Conflicting writes originating on different masters on the same table:

Time 1: Blue: BeginTrx (BT1); Orange: BeginTrx (OT1)
Time 2: Blue: Update (row1, table1)
Time 3: Orange: Update (row1, table1)
Then:   Blue: Commit (BT1) -> OK; Orange: Rollback (OT1) -> RETRY

Both masters touch the same page; the storage nodes accept one version (Blue's here), and the losing transaction (OT1) is rolled back for the client to retry.
Aurora Multi-Master: how (logical conflict)

Conflicting writes originating on different masters on the same table, where one transaction has already committed:

Time 1: Blue: BeginTrx (BT1); Orange: BeginTrx (OT1)
Time 2: Blue: Update (row1, table1)
Time 3: Blue: Commit (BT1) -> OK
Time 4: Orange: Update (row1, table1) and Rollback (OT1) -> RETRY
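On the application side, the pattern these three timelines imply is a simple retry loop: with optimistic conflict resolution, a transaction that loses a conflict is rolled back and should be retried. A hedged Python sketch follows; the exception type, back-off policy, and transaction body are assumptions for illustration, not a prescribed Aurora API.

```python
# Generic optimistic-concurrency retry loop (illustrative assumptions only).
import random, time

class ConflictError(Exception):
    """Stand-in for the engine reporting a write-write conflict (RETRY)."""

def with_retries(run_transaction, max_attempts=5):
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()       # BEGIN ... COMMIT inside
        except ConflictError:
            if attempt == max_attempts:
                raise
            # Back off with jitter so two masters don't re-collide in step.
            time.sleep(random.uniform(0, 0.05 * attempt))

def sample_txn():
    # e.g., UPDATE table1 SET ... WHERE row = 1, then COMMIT
    if random.random() < 0.5:
        raise ConflictError()
    return "OK"

print(with_retries(sample_txn))
```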
Aurora Multi-Master: scaling and availability

[Chart: aggregated throughput (0-60,000) over time in minutes (0-21) for a Sysbench workload on 4 R4.XL nodes.]
Aurora Multi-Master Global Reads: what?

[Diagram: John updates his status from "Single" to "Engaged" on one node and sends a post ("Proposed to Sara") that lands on another; replication between nodes is asynchronous. When Bob reads John's posts and status, a local read can return the post alongside the stale "Single" status (Local Read: ☹), while a global read returns a consistent view (Global Read: 😊).]
Aurora Multi-Master Global Reads: how?

[Diagram: a client reads through node N1 at vector clock T = (T1, T2, T3); N1 performs its local read at T1 and waits for replication from N2 and N3 to catch up to T2 and T3 before returning globally consistent results over the shared distributed storage volume.]

• No waits on the write path
• Adds latency ONLY to globally consistent reads
• Configurable per session
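The waiting rule can be sketched directly from the diagram. In this illustrative Python model (the class and method names are invented), the serving node tracks the highest position it has applied from each peer and delays only the globally consistent read until those positions reach the requested vector clock; writers never wait.

```python
# Sketch of a global read at a vector clock T = (T1, T2, T3): block the read,
# never the writes, until replication from every peer has caught up.
import threading

class Node:
    def __init__(self, peers):
        # applied[p] = highest position replicated and applied from peer p
        self.applied = {p: 0 for p in peers}
        self.cv = threading.Condition()

    def on_replicated(self, peer, position):
        # Called by the (asynchronous) replication stream; writers never wait.
        with self.cv:
            self.applied[peer] = max(self.applied[peer], position)
            self.cv.notify_all()

    def global_read(self, vector_clock, do_read):
        # Latency is added ONLY here, on the globally consistent read path.
        with self.cv:
            self.cv.wait_for(lambda: all(
                self.applied[p] >= t for p, t in vector_clock.items()))
        return do_read()

n1 = Node(peers=["N2", "N3"])
threading.Timer(0.1, n1.on_replicated, args=("N2", 7)).start()
threading.Timer(0.2, n1.on_replicated, args=("N3", 9)).start()
print(n1.global_read({"N2": 7, "N3": 9}, lambda: "consistent result"))
```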
Aurora Multi-Master: Summary

Linear scaling
Microservices architecture
6 copies, 2 copies per AZ
Optimistic conflict resolution
Continuous availability
Enterprise-grade durability
SQL compatible: supports indexes, constraints, triggers, procedures, functions, etc.
Driving down query latency
Batched scans | Hash joins | Asynchronous key prefetch
QP Performance Improvement
Well-known decision support benchmark

[Chart: query response time reduction (0x-20x) for queries Q1-Q22.]

• Peak speedup ~18x
• >2x speedup: 10 of 22 queries
Driving down query latency: Parallel Query

• Parallel, distributed processing
• Push-down processing closer to the data
• Reduces buffer pool pollution

[Diagram: the DATABASE NODE pushes down predicates to the STORAGE NODES and aggregates the results.]
Parallel Query Architecture

[Diagram: query processor, network storage driver, MVCC, converter, and aggregator; the storage layer returns a clean stream and a dirty stream.]
Parallel Query: performance results
Well-known decision support benchmark

"We were able to test Aurora's parallel query feature and the performance gains were very good. To be specific, we were able to reduce the instance type from r3.8xlarge to r3.2xlarge. For this use case, parallel query was a great win for us."
Jyoti Shandil, Cloud Data Architect

[Chart: query response time reduction (0x-120x) for queries Q1-Q22.]

• Peak speedup ~120x
• >10x speedup: 8 of 22 queries
“AZ+1” failure tolerance
Why?
 In a large fleet, always some failures
 AZ failures have ”shared fate”
AZ 1 AZ 2 AZ 3
Quorum
break on
AZ failure
2/3 read
2/3 write
AZ 1 AZ 2 AZ 3
Quorum
survives
AZ failure
3/6 read
4/6 write
How?
 6 copies, 2 copies per AZ
 2/3 quorum will not work
Continuous backup
• Take periodic snapshots of each segment in parallel; stream the redo logs to Amazon S3
• Backup happens continuously without performance or availability impact
• At restore, retrieve the appropriate segment snapshots and log streams to storage nodes
• Apply log streams to segment snapshots in parallel and asynchronously

[Diagram: a timeline per segment (Segment 1-3) of segment snapshots and log records; restoring to a recovery point uses, for each segment, the most recent snapshot before that point plus the log records up to it.]
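The restore step in the last two bullets reduces to a per-segment selection rule, sketched below with assumed data shapes (snapshot and log-record times as plain numbers): take the most recent snapshot at or before the recovery point, then replay only the log records between that snapshot and the recovery point, independently for each segment.

```python
# Sketch of restore selection: newest snapshot before the recovery point,
# plus that segment's log records up to the recovery point (shapes assumed).
def restore_plan(segments, recovery_point):
    plan = {}
    for seg_id, seg in segments.items():
        snap = max((s for s in seg["snapshots"] if s <= recovery_point),
                   default=None)   # None -> replay from the beginning
        logs = [t for t in seg["log_records"]
                if (snap or 0) < t <= recovery_point]
        plan[seg_id] = {"snapshot": snap, "replay": logs}
    return plan   # each segment then restores in parallel, asynchronously

segments = {
    1: {"snapshots": [10, 20, 30], "log_records": [12, 18, 24, 29, 33]},
    2: {"snapshots": [15, 25],     "log_records": [16, 22, 27, 31]},
}
print(restore_plan(segments, recovery_point=28))
# segment 1 -> snapshot 20 + logs [24]; segment 2 -> snapshot 25 + logs [27]
```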
Database backtrack
Backtrack brings the database to a point in time without requiring restore from backups
• Backtrack from an unintentional DML or DDL operation
• Backtrack is not destructive. You can backtrack multiple times to find the right point in time

[Diagram: timeline t0-t4; rewinding to t1 makes the t1-t2 range invisible, writes continue as t3 and t4, and a second rewind to t3 makes t3-t4 invisible.]
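For reference, backtrack is driven through the RDS API. `backtrack_db_cluster` is the actual boto3 operation; the cluster identifier and rewind window below are placeholders, and the call applies only to Aurora MySQL clusters created with backtrack enabled.

```python
# Rewind an Aurora MySQL cluster ten minutes (cluster name is a placeholder).
import boto3
from datetime import datetime, timedelta, timezone

rds = boto3.client("rds")

resp = rds.backtrack_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",
    BacktrackTo=datetime.now(timezone.utc) - timedelta(minutes=10),
    # If the exact time is outside the backtrack window, snap to the earliest.
    UseEarliestTimeOnPointInTimeUnavailable=True,
)
print(resp["Status"])   # e.g., "pending" while the rewind is in progress
```

Because backtrack is not destructive, this call can be repeated with different timestamps until the right point in time is found.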
Instant crash redo recovery

Traditional database:
- Replay logs since the last checkpoint
- Slow replay in a single thread
- A crash at T0 requires a re-application of the SQL in the redo log since the last checkpoint (checkpointed data + redo log)

Amazon Aurora:
- No checkpointing
- No replay needed for startup
- A crash at T0 results in redo logs being applied to each segment on demand, in parallel, asynchronously
Read replica and fast failover

Up to 15 promotable read replicas across multiple availability zones
Replicas share the storage with the master: no loss of data
Configurable failover order

[Diagram: MASTER and READ REPLICAs over a SHARED DISTRIBUTED STORAGE VOLUME.]
Continuous availability with multi-master

[Diagram: an app connected to read-write Masters 1, 2, ... N across Availability Zones 1-3 within a Region, all over a shared distributed storage volume. When a master fails: its transactions are aborted and its connections terminated; other nodes operate as before, with access to the ENTIRE database; connections are redistributed; the failed master recovers independently; once recovery is complete, new connections are added.]

• Continuous availability through failures and planned maintenance
• Continuous monitoring and automatic recovery of failed master nodes
Global replication
Faster disaster recovery and enhanced data locality
Performance Insights

Dashboard showing database load
• Easy: e.g., drag and drop
• Powerful: drill down using zoom in

Identifies source of bottlenecks
• Sort by top SQL
• Slice by host, user, wait events

Adjustable time frame
• Hour, day, week, month
• Up to 2 years of data; 7 days free

[Screenshot: load dashboard with a Max vCPU line, a CPU bottleneck, and the SQL with high CPU highlighted.]
Simplified management

• Automatic storage scaling up to 64TB
• Automatic restriping, mirror repair, hot spot management, encryption
• Reader endpoint with load balancing
• Reader endpoint auto-scaling * NEW *
• Custom reader endpoints

[Diagram: MASTER and READ REPLICAs over a SHARED DISTRIBUTED STORAGE VOLUME, fronted by a READER END-POINT and a second, custom READER END-POINT #2.]
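Custom reader endpoints are created through the RDS API; `create_db_cluster_endpoint` is the real boto3 operation, while the identifiers and member list below are placeholders. The resulting endpoint load-balances across just the listed replicas:

```python
# Create a custom reader endpoint covering two specific replicas
# (all identifiers are placeholders).
import boto3

rds = boto3.client("rds")

rds.create_db_cluster_endpoint(
    DBClusterIdentifier="my-aurora-cluster",
    DBClusterEndpointIdentifier="analytics-readers",
    EndpointType="READER",
    StaticMembers=["my-aurora-replica-2", "my-aurora-replica-3"],
)
```

This is useful for steering a heavy workload (e.g., analytics) at a subset of replicas while the default reader endpoint balances everything else.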
Aurora Serverless...

Responds to your application load automatically
Scales capacity up and down in < 10 seconds
New instances have a warm buffer pool
Multi-tenant proxy is highly available
How does it work...

[Diagram: within a Region / Availability Zone 1, the app connects through a multi-tenant NLB / database proxy layer to an instance drawn from a warm pool of Aurora instances, over a shared distributed storage volume; a monitoring service drives scaling decisions.]
How does it work in practice?

[Chart: transactions per second (tps, left axis, 0-3000) and Aurora capacity units (ACU, right axis, 1-128) over a run of roughly 7,200 seconds; capacity scales up and down following the load.]
Introducing Web Service Data API

Access your database from Lambda applications
SQL statements packaged as HTTP requests
Connection pooling managed behind proxy

[Diagram: Web Service Data API in front of Aurora Serverless.]
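A call through the Data API looks like this with boto3, shown with today's `rds-data` interface (the service and the `execute_statement` operation are real; the ARNs, database, and SQL are placeholders). The caller holds no database connection: the statement travels as an HTTPS request and pooling happens behind the proxy.

```python
# Run a SQL statement over HTTPS via the Data API (ARNs are placeholders).
import boto3

client = boto3.client("rds-data")

resp = client.execute_statement(
    resourceArn="arn:aws:rds:us-east-1:123456789012:cluster:my-serverless",
    secretArn="arn:aws:secretsmanager:us-east-1:123456789012:secret:mydb",
    database="mydb",
    sql="SELECT id, name FROM users WHERE id = :id",
    parameters=[{"name": "id", "value": {"longValue": 42}}],
)
print(resp["records"])
```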
Related breakouts

Wednesday, November 28
DAT415 - Amazon Aurora Multi-Master: Scaling Out Database Write Performance
11:30 AM–12:30 PM | Venetian, Level 2, Veronese 2406

Thursday, November 29
DAT362 - Accelerate Your Analytic Queries with Amazon Aurora Parallel Query
4:00 PM–5:00 PM | Venetian, Level 2, Veronese 2406

Wednesday, November 28
DAT427 - Going Deep on Amazon Aurora Serverless
4:00 PM–5:00 PM | Aria East, Plaza Level, Orovada 3
Thank you!

Kamal Gupta
kamalg@amazon.com


Speaker notes

  1. (Transition cues: Overall, In summary, Net-net, So, Let's, Now, Alright.)
  2. (2 min) HELLO everyone It’s my PRIVILEGE to be with ALL of you, and a WARM welcome to DAY 1 at REINVENT 2018 You know, OVER 7 years ago, WHEN we started building AURORA, WE had a simple mission, we WANTED, ANY person, ANYWHERE in the world to be able to RUN and MANAGE databases, AND all they would need is their BUSINESS APPLICATION. They wouldn’t need to worry about PROVISIONING, need HIGHLY skilled operators MANAGING their databases, make TRADEOFFS between PERFORMANCE, AVAILABILITY, DURABILITY and COST. AND by doing so, we BELIEVE we can TRANSFORM EVERYONE in the world to RE-IMAGINE the DATABASES in the cloud. hi my name is Kamal Gupta and I am a senior engineering manager at AWS  TODAY, WE are going to SHOW you, in the NEXT hour, not only the NEW INNOVATIONS in Aurora, but also, how we DID it. AND how we are adding new capabilities like multimaster, parallel query, serverless, global databases into our Aurora offering that my team is building. Joining me today, I am EXCITED to have Sirish Chandrasekaran, principal product manager at AWS  Let’s deep dive into Aurora MySQL
  3. (1 min 15 sec) Our vision for Amazon Relational Database Service is to offer you choices and recommendations so that you can decide what's best for your application. On the one hand, AWS offers open source engines for customers who simply like their simplicity and cost-effectiveness, but the problem is that they lack the enterprise-grade performance and reliability our customers need for their mission-critical applications. We also offer old-guard commercial database engines for customers who need enterprise-grade performance and reliability, even though they are quite expensive, with lock-ins and punitive licensing terms. One of the earliest pieces of feedback we got from our customers was to build something that combines the best of both worlds. And we have created Aurora for you.
  4. (20 sec) With Aurora, you NO longer have to make the trade-offs. It provides you the commercial-grade performance, durability, and availability at the simplicity and cost-effectiveness of open-source solutions. And it's delivered to you as a managed service.
  5. (20 sec) Here are some of the customers who have been using Aurora: Airbnb, Zynga, Hulu, Ancestry, Nasdaq, some big names. As you can see, Aurora continues to be the fastest growing service in AWS history.
  6. (20 seconds) So, with that intro, I will first talk about Performance, and then Availability and Manageability.
  7. (20 sec) You know, when databases first came out, it looked something like this. Monolithic architecture in a single box. With local storage, we were trading availability and durability to get better performance
  8. (20 seconds) Over time, we decoupled storage from compute, which allowed us to scale, customize, and manage each layer independently, but the monolithic stack remained the same.
  9. (30 seconds) And then we added more such boxes. As you can see, it’s the same SQL stack everywhere. Nothing changed! Moreover, we need heavy-weight distributed consensus for data replication and they perform poorly because of multiple phases, multiple rounds, sync points etc.
  10. (2.5 minutes) With Aurora, we did two big contributions: We pushed down Log applicator down to the storage => that allowed us to construct pages from the logs themselves. This is really cool because we don’t have to write full pages anymore. So unlike traditional databases which write both logs and pages, we just have to write logs. This means we have significantly less network IO, fundamentally less work on the engine - you don't need checkpointing anymore, you don't need flushing of pages or cache eviction any more. Instead of heavy weight distributed consensus for data replication. We use 4/6 write Quorum & Local tracking. The reason we can avoid distributed consensus is because we exploit monotonically increasing Log Sequence Number (LSN) by the Master that allows us to order the writes. And so SN's just accept the writes. There is no voting involved. We are going to see both these things in action. As a result, YOU get significantly better write performance YOU get Read scale out because they share the same storage with the master. YOU get AZ+1 failure tolerance => Aurora stores 6 copies, two copies per AZ. Even in the presence of background radiation, an entire AZ goes down on top, Aurora can handle it. No problem. YOU get an Instant database redo recovery because we don't have to explicitly do anything at startup other than doing some math to find out the point at which we crashed. Overall: You NO longer have to make a trade-off between performance, availability, and durability.
  11. (2 minutes) Lets see how log applicator works in action. Here, we are running 4 transactions with Master and replica. We have the storage at the bottom with each log 6-ways replication. Lets say we commit a Trx T1. As you can see all SNs and replica received the changes. And so if we try to read, both master and replica will get the page with the orange Trx. Now - lets say we commit T2/T3/T4. Please note that SN already coalesced purple but left blue and green log records on the side because it can't. This is because replica clock is still at purple. And if both master and replica try to read, they will get the right image. For master, storage will apply the log records, blue and green, kept on the side, on the fly And at some point, replica clock will go green and we can garbage collect the remaining log records by coalescing the changes. Hopefully you can see, how we can construct pages from the logs themselves.
12. (1 minute) Let's see what benefits we got out of this. Here is how the I/O profile looks for Aurora and MySQL. On the left, we have MySQL on EBS; the thing to note is that it has to replicate all kinds of data. With Aurora, on the right, we only have to replicate log records. As a result, we do 7.7x less I/O and 35x more work, despite the 6x amplification from keeping six copies. The other thing to note is that steps 1, 3, and 4 on the left are synchronous and lead to jitter, while Aurora's 4/6 quorum is much more resilient to tail latency. We will see in a second why that matters for your applications.
13. (20 seconds) And so we ran a Sysbench workload on Aurora and MySQL, and we got an order of magnitude more writes and 2.5x more reads compared to stock MySQL running on EBS.
14. (10 seconds) Here is another example, with bulk load plus indexing. Again, Aurora is 2.5x faster.
15. (1 minute) Let's talk about read scale-out for your OLTP reads or analytics queries. On the left, we have MySQL's native binlog-based replication, typically used in the MySQL community. On the right, we have Aurora physical replication. Unlike MySQL, which has to transfer full rows or statements, Aurora only transfers log records (nothing but the delta changes), and those are compressed. Unlike MySQL binlog replication, Aurora doesn't need to write anything on the replica: no extra write I/O or storage involved. Also note that Aurora only needs to update pages that are already in the replica's cache, so there is no read I/O either. In fact, we only transfer what's in the replica cache; we filter it out on the master itself. Even better.
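A minimal sketch of that in-cache-only apply, assuming a hypothetical cache layout (this is not Aurora's code): shipped redo records are applied only to pages the replica already caches; anything else can be skipped, since the page will be read from shared storage on demand.

```python
def apply_redo_on_replica(cache, redo_stream):
    # Each record: (page_id, lsn, {row: value}) shipped from the master.
    for page_id, lsn, change in redo_stream:
        page = cache.get(page_id)
        if page is None:
            continue                  # not cached: no read I/O, no write I/O
        page["rows"].update(change)   # in-cache update only; storage has the rest
        page["lsn"] = lsn

cache = {"P1": {"lsn": 0, "rows": {}}}            # P2 is not in the replica cache
apply_redo_on_replica(cache, [("P1", 5, {"a": 1}),
                              ("P2", 6, {"b": 2})])
print(cache)   # only the cached page P1 changed
```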
16. (30 seconds) Here you can see the comparison of a binlog replica with an Aurora physical replica. Both are on the same Aurora instance, same software, same hardware, to keep it an apples-to-apples comparison. The binlog graph on the left is in seconds, and it spiked to five minutes within the first 10 minutes under heavy load. On the right we have the Aurora physical replica; as you can see, it consistently stays under 20 ms for hours and hours under the same load.
17. (2 minutes) Let's see how the write quorum and local tracking work in action. We have the same setup, four transactions with storage at the bottom, and a quorum tracker on the right. There are four waiting transactions, and none of them is committed yet. In a traditional database, we keep the WAL sequential, buffer the writes, and flush them sequentially; as soon as a write is flushed, we consider that transaction committed and ack back to the client. Instead, we issue the writes to storage immediately, in parallel, and use the write tracker to ack in the right order, only once everything is flushed up to that point; otherwise we would break write-ahead logging. As you can see, there is no distributed consensus like Paxos or Raft, with multiple phases or sync points, in the storage. It's all quorum plus local tracking by the individual SNs, because we leverage the sequencing from the head node. Now, I didn't talk about reads here, but there too we use a different tracking mechanism instead of relying on any sort of consensus. Refer to the Aurora SIGMOD papers for details if you are interested.
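Here is a minimal sketch of that in-order ack logic (the `WriteTracker` name is invented; the SIGMOD paper calls the contiguous durable point the volume durable LSN, or VDL): quorum acks may arrive out of order, but a commit is acknowledged only once every LSN below it is durable.

```python
class WriteTracker:
    def __init__(self):
        self.durable = set()   # LSNs whose writes reached the 4/6 quorum
        self.vdl = 0           # highest LSN with no gaps below it

    def on_durable(self, lsn):
        self.durable.add(lsn)
        while self.vdl + 1 in self.durable:
            self.vdl += 1      # advance the ack point over contiguous LSNs

    def can_ack_commit(self, commit_lsn):
        # Never ack a commit while an earlier write is still in flight,
        # otherwise write-ahead logging would be broken.
        return commit_lsn <= self.vdl

t = WriteTracker()
for lsn in (2, 3, 1):          # quorum acks arrive out of order
    t.on_durable(lsn)
print(t.can_ack_commit(3))     # True: LSNs 1..3 are all durable, ack in order
```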
18. (1 minute 30 sec) So, we looked at Sysbench response times under heavy load. With Aurora we used 10K connections; with MySQL we used only 500 connections, because it starts thrashing after that and gets even worse. As expected, the response times for Aurora are not only lower but have much less variation. More precisely, based on the standard deviations of the two data sets, Amazon Aurora is more than 200x more consistent than MySQL, and the average response time is about 25x lower. Note that Aurora is pushing 45x more throughput in this example. You might wonder what's going on with the spikes in MySQL: what you see is the impact of database checkpoints. During a checkpoint, MySQL does a lot of writes, which slows down user transactions, hence the variability in the MySQL response times. Three reasons why Aurora is so much better: 1) lightweight consensus, 2) the ability to flush out of order, and 3) no checkpointing, because we construct pages from the logs themselves.
19. (1 minute) Here are some examples of the software innovations we did to give you a world-class database. Let's take a look at the thread model. MySQL, on the left, follows a thread-per-connection model; clearly, that doesn't scale with connections. Aurora instead uses a thread pool with epoll and a latch-free task queue, which allows it to scale much better with connections. Here is another example: when you push more writes, you get more contention in the system, and if we simply locked the whole lock table like MySQL does, a lot of our other effort would go in vain. Instead, Aurora allows concurrent access to any given lock chain.
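The sketch below contrasts the two models under loose assumptions (a toy TCP server, not MySQL's or Aurora's code): instead of one OS thread per connection, a single event-loop thread multiplexes sockets (Python's selectors module is epoll-backed on Linux) and hands ready connections to a fixed worker pool via a queue, standing in for the latch-free task queue.

```python
import queue
import selectors
import socket
import threading

tasks = queue.Queue()        # stands in for the latch-free task queue

def worker():
    # A fixed pool of workers executes requests, independent of client count.
    while True:
        conn = tasks.get()
        data = conn.recv(4096)          # "run the statement"
        if data:
            conn.sendall(b"ok\n")
        conn.close()                    # one request per connection in this toy

for _ in range(8):                      # 8 workers, whether 10 or 10,000 clients
    threading.Thread(target=worker, daemon=True).start()

sel = selectors.DefaultSelector()       # epoll-backed on Linux
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen()
sel.register(listener, selectors.EVENT_READ)

while True:                             # one thread multiplexes all connections
    for key, _ in sel.select():
        if key.fileobj is listener:
            conn, _ = listener.accept()
            sel.register(conn, selectors.EVENT_READ)
        else:
            sel.unregister(key.fileobj)   # a worker owns it until it replies
            tasks.put(key.fileobj)
```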
21. (20 seconds) Besides the software improvements, the hardware also improved, and combined you can see Aurora getting better and better. Aurora now delivers 200K writes and a whopping 700K reads per second on a single R4.16XL instance, and it's getting better every day!
22. (20 seconds) A lot of customers ask me how to tune Aurora. Well, Aurora automatically pre-tunes and auto-tunes its parameters for different hardware configurations for you. Unless you are doing something really peculiar, you will get the best performance out of the box.
23. (1 minute) Now, there are a few parameters in MySQL, like innodb_flush_log_at_trx_commit, innodb_log_file_size, or sync_binlog, that allow for better write performance, but there is usually a trade-off. Here is one such example with innodb_log_file_size: you can get better performance in MySQL, but it also increases the recovery time if there is a database failure. The reason is that this parameter fundamentally delays the checkpoint, and by increasing that interval you accumulate more redo log. When recovering from a crash, MySQL has to replay those logs in a single thread; the more logs to replay, the longer recovery takes. With Amazon Aurora, there are no checkpoints, and it doesn't even matter. In essence, you don't have to trade performance against availability or durability with Aurora.
24. (45 sec) Okay, let's talk about multi-master. Before we jump in, some quick background on the space. We first had SQL running on one node, but that was hard to scale. To scale, we manually sharded, but that was very hard to manage as partitions became hot or schema changes had to span the partitions. Then, to simplify, we built NoSQL systems, but they fundamentally lack transaction support, and it is very hard for our customers to build apps on eventually consistent systems; the customers I talk to love the transaction model, it is very easy to reason about. With Aurora single-master we addressed most of this, but a few gaps remained. With multi-master, we are addressing most of those gaps by adding write scalability and database write availability for our customers.
25. (2 min) Let's take a look at how some of the existing multi-master solutions work. 1) First, there is the shared-disk model with the caches fused together. The challenge with these systems is that they use pessimistic locking. The other challenge is that they require high cache-coherence traffic, on a per-lock basis, so you either need expensive interconnects between the nodes (typically put together in a small room in the data center) or you suffer from hot blocks ping-ponging across the nodes. 2) Then there are systems that use the read-write-set technique. Basically, as part of a transaction, you first read all the objects and later modify them based on the values you read. At commit time, if anyone else changed any of those objects since you read them, you simply abort the transaction; otherwise you commit. If we follow that for all transactions in some particular order, we can guarantee that all nodes independently come to the same decision; but coming up with this global order ends up becoming the bottleneck in these systems. 3) Finally, there are NoSQL-style systems, with query processing on top, where data is range-partitioned. They elect a Paxos leader within each partition, but if there is skew in the access pattern, which is quite typical, you end up with hot partitions; for example, if you partition by date-time, you are always inserting into the last range. They also typically use a heavy-weight consensus protocol for commits.
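A minimal sketch of the read-write-set technique from point 2, with a hypothetical single-node `Store` (real systems validate against a global order, which is exactly the bottleneck mentioned above): each transaction records the versions it read, and at commit it aborts if any of them changed.

```python
class Store:
    def __init__(self):
        self.data, self.version = {}, {}

    def begin(self):
        return {"reads": {}, "writes": {}}

    def read(self, txn, key):
        txn["reads"][key] = self.version.get(key, 0)   # remember what we saw
        return self.data.get(key)

    def write(self, txn, key, value):
        txn["writes"][key] = value                     # buffered until commit

    def commit(self, txn):
        # Validate: abort if anything we read changed underneath us.
        for key, seen in txn["reads"].items():
            if self.version.get(key, 0) != seen:
                return False
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.version[key] = self.version.get(key, 0) + 1
        return True

s = Store()
t1, t2 = s.begin(), s.begin()
s.read(t1, "x"); s.write(t1, "x", 1)
s.read(t2, "x"); s.write(t2, "x", 2)
print(s.commit(t1), s.commit(t2))   # True False: the later committer aborts
```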
26. (2.5 min) With Aurora multi-master, there is no pessimistic locking, no explicit global ordering, and no global commit coordination. The architecture is based on three techniques. First, it uses optimistic conflict resolution in the storage. To understand this better, say the orange master runs T1 and the blue master runs T2. If T1 and T2 modify different pages, there is no conflict and hence no sync required. However, if T1 and T2 both touch P2, then one of them wins and the other has to retry, based on the quorum. As you can see, that doesn't require any heavy-weight consensus protocol: again we rely on quorum and local tracking, with partitioned, monotonic LSN sequencing from the individual database nodes, to order the writes. Second, since the logging layer is pushed down, Aurora decouples the transaction layer from the logging layer, which lets it separate physical conflicts on pages from logical conflicts between transactions: transaction conflicts are handled through MVCC, and physical conflicts through optimistic conflict resolution. Third, there is no direct coupling between the storage partitions, or between the database nodes in the cluster. It is a microservices architecture: independent, minimal, resilient services run in the cluster to handle the async coordination, and any of them temporarily going down does not impact the whole cluster. Net-net: Aurora only coordinates when it has to coordinate. Let's see this in action.
27. (1 min) Let's say we have two clients, C1 and C2, talking to the blue and orange masters respectively. We start from the simple case where the two clients write to two different tables. Both clients start transactions, BT1 and OT1, on their respective master nodes. They both issue an update, but to two different tables, and they can both commit; no explicit sync required.
28. (1 min) Same setup, but now the two clients want to write to the same entry in the same table. Again, both clients start transactions BT1 and OT1 on their respective master nodes. They both issue an update, this time to the same table, modifying the same entry. When they both try to commit, one of them wins and the other loses.
29. (1 min) Now, it's possible for two transactions to conflict even though there is no physical conflict. Let's see that. Same setup again, with two clients writing to the same row in the same table. Both clients start transactions BT1 and OT1 on their respective master nodes. C1 sends an update and gets a quorum, and the changes get replicated to the orange master. Now, if C2 updates the same row, storage is totally okay with it, because the changes were made on top of the latest image; but we detect the conflict in the database itself, through MVCC, and roll back the transaction. No distributed locking needed. And of course C1 commits successfully.
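Pulling slides 27 through 29 together, here is a minimal sketch of the physical-conflict case (the class names and version check are illustrative assumptions, not Aurora's implementation): each storage copy accepts a page write only if it was made on top of the page version that copy currently holds, so of two masters racing on the same page, only one can assemble a 4/6 quorum and the loser retries.

```python
COPIES, WRITE_QUORUM = 6, 4

class StorageCopy:
    def __init__(self):
        self.page_version = {}

    def try_write(self, page, base_version):
        # Accept only writes made on top of the version this copy holds.
        if self.page_version.get(page, 0) != base_version:
            return False
        self.page_version[page] = base_version + 1
        return True

def master_write(copies, page, base_version):
    acks = sum(c.try_write(page, base_version) for c in copies)
    return acks >= WRITE_QUORUM        # the loser rolls back and retries

copies = [StorageCopy() for _ in range(COPIES)]
print(master_write(copies, "P2", 0))   # blue master wins: True
print(master_write(copies, "P2", 0))   # orange master, same base: False
```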
30. (1 min) We ran Sysbench on a multi-master cluster. As we scaled up the cluster at the 5-minute mark, throughput went up from 14K to 27K. At t=10, we added two more nodes, and you can see throughput went up to 48K. At t=15, one of the machines went down and aggregate throughput came down to 38K, then went back to 48K when the affected node came back up at t=16. This is really cool!
31. (1 min) Switching gears: what about reads? How do reads work in multi-master; more precisely, how do we offer linearizability? Let me illustrate the problem with an example. John and Bob are friends. One day John proposes to Sara, so he updates his status to let everyone know. Bob, who was checking his updates, sees that John finally proposed, but then sees that John's status is still single, and concludes it probably didn't work out. If Bob had done a global read, however, he would have found out that John is engaged to Sara, and he would immediately call John to congratulate him. Local reads let you read your own changes, but if you want to read all the changes in the cluster, you need global reads.
32. (1 min 15 sec) Say we have three nodes, N1, N2, and N3. C1 issues a request to N1. N1 sends a hello request to N2 and N3, which respond with timestamps T2 and T3 respectively, the times at which they saw N1's hello request. N1 then waits for replication to catch up to T2 and T3 from N2 and N3 respectively, and once it's caught up, performs the read and returns the results. This is a very simplified view; there is quite a bit of engineering and complexity involved to make it work in practice. As you can see, there is no wait on the write path: it only adds latency to reads that need global consistency, and it's configurable per session.
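A simplified sketch of that read barrier (hypothetical `Node` objects; the hello round is reduced to reading each peer's position): the serving node computes the maximum position its peers report, waits until its own replication stream has caught up, and only then runs the read.

```python
import time

class Node:
    def __init__(self, lsn=0):
        self.lsn = lsn                    # position this node has applied/committed

def global_read(local, peers, run_query, poll=0.001):
    barrier = max(p.lsn for p in peers)   # 1. "hello" round: collect positions
    while local.lsn < barrier:            # 2. wait for replication to catch up
        time.sleep(poll)                  #    (only this read waits; writes never do)
    return run_query()                    # 3. the read now sees all peers' commits

n1, n2, n3 = Node(5), Node(7), Node(6)
n1.lsn = max(n2.lsn, n3.lsn)              # simulate replication catching N1 up
print(global_read(n1, [n2, n3], lambda: "John: engaged to Sara"))
```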
33. (30 seconds) Net-net, Aurora achieves linear write scaling through optimistic conflict resolution; continuous availability through the microservices architecture; enterprise-grade durability through six copies, two per AZ, plus continuous backup of your data to highly durable S3 storage; and it supports the indexes, constraints, triggers, procedures, and functions you need for your relational database application.
34. (45 sec) So much for OLTP; let's talk about your OLAP queries. Here are some of the optimizations we did for OLAP. Batched scans: the idea is to scan tuples in batches from the InnoDB buffer pool, to avoid latching and traversing the same pages again and again, and to enable JIT optimizations; mainly for in-memory workloads. Hash joins: these improve equi-join performance; build a hash table from one side and scan the other side to probe it (see the sketch below). There is lots of complexity around skew and duplicates: you have to minimize the number of passes, not to mention decide when to choose a hash join over other join operators like index joins or nested-loop joins. Asynchronous key prefetch: prefetches pages into memory for index joins using BKA; quite useful for non-equi joins, or for equi-joins where one side is small and the big side has a high-cardinality index on the join column; for out-of-cache workloads.
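Here is the minimal shape of the hash-join idea referenced above (a generic textbook sketch, not Aurora's operator, which also has to handle spilling, skew, and duplicates): build a hash table on the smaller input, then stream the larger input through it to probe.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    table = defaultdict(list)
    for row in build_rows:               # build phase: hash the smaller side
        table[row[key]].append(row)      # duplicates per key are kept
    for row in probe_rows:               # probe phase: one scan of the big side
        for match in table.get(row[key], ()):
            yield {**match, **row}

customers = [{"c_id": 1, "name": "ann"}, {"c_id": 2, "name": "bob"}]
orders = [{"c_id": 1, "o_id": 10}, {"c_id": 1, "o_id": 11}]
print(list(hash_join(customers, orders, "c_id")))   # two joined rows
```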
35. (15 seconds) We ran a TPC-H-like workload; here you can see the performance improvement from all those optimizations. Roughly half the queries are more than 2x better, with a peak speedup of roughly 18x.
36. (1 min) Parallel query pushes processing down to thousands of storage nodes. Moving processing closer to the data reduces network traffic, and it reduces buffer pool pollution. Why does that matter? We will see in a second.
37. (1 min) On the left, a request is sent to the SN, including the pages, the page LSNs, and the function to evaluate. In return we get two streams back: clean and dirty. The clean stream is the set of records that have not been modified since the query started; it is sent to the aggregator to merge with the partial clean streams from the other SNs. The dirty stream goes through the MVCC converter to get the right versions, has the function applied, and is fed back into the aggregator. The combined result is sent back to the client and to the next step in query execution. We already push down predicates and projections. This is an active area of work; we can do much more and exploit storage in unique ways that were not possible before. There are more challenges involved, for example: how we scan the list of pages to process without holding latches for long; how we do flow control; how we run each request in a secure container on the storage; and how we seamlessly handle storage node failures. There will be a parallel query chalk talk on Thursday if you are interested in more details.
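A toy sketch of the clean/dirty split (the page layout and the `storage_scan` function are invented for illustration): pages untouched since the query's start LSN are filtered right on the storage node, while newer pages are returned raw so the head node's MVCC machinery can pick the correct versions first.

```python
def storage_scan(pages, query_start_lsn, predicate):
    clean, dirty = [], []
    for page in pages:
        if page["lsn"] <= query_start_lsn:
            # Unmodified since the query began: filter right on the node.
            clean.extend(r for r in page["rows"] if predicate(r))
        else:
            dirty.append(page)   # modified: head node must resolve MVCC versions
    return clean, dirty

pages = [{"lsn": 3, "rows": [{"x": 1}, {"x": 9}]},
         {"lsn": 8, "rows": [{"x": 7}]}]
clean, dirty = storage_scan(pages, query_start_lsn=5,
                            predicate=lambda r: r["x"] > 5)
print(clean)   # [{'x': 9}]: already filtered at the storage node
print(dirty)   # the newer page goes back raw for MVCC resolution
```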
38. (30 sec) This is on top of the previous improvements. As you can see, some queries are two orders of magnitude better, and several are an order of magnitude better. To be clear, this is no Amazon Redshift, but it is clearly a good option if you are doing some lightweight analytics.
39. (30 sec) Processing closer to the data significantly reduces the data transfer between the head node and the SNs. As a result, there is significantly less impact on OLTP performance, thanks to the reduction in network traffic and in buffer pool pollution. We used a 150 GB dataset precisely to show the impact, because an 8XL instance's buffer pool is around 150 GB: if we bring in pages for OLAP queries, they evict pages needed by OLTP queries.
40. Let's talk about availability.
41. Why six copies? Even in the presence of a background rate of node and disk failures, with an entire AZ going down on top, we keep going. How? Even if we lose three copies (an entire AZ plus one more node), we still have three copies left, enough to re-establish quorum. If we instead used a 2-out-of-3 quorum and two copies were lost, we would lose data.
42. We continuously back up your data to S3. How does it work? Aurora divides the database into 10 GB segments; we take snapshots of those segments and stream the delta redo logs to S3. On restore, we fetch those snapshots and apply the delta log streams on top, in parallel. This all happens in the storage layer and has no performance impact on the database nodes.
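As a rough model of that per-segment restore (the data layout is hypothetical; each segment here is just a dict): every segment independently fetches its snapshot and replays only its own delta redo, so segments restore in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def restore_segment(segment):
    state = dict(segment["snapshot"])          # fetch this segment's snapshot
    for lsn, key, value in segment["delta_log"]:
        state[key] = value                     # replay only this segment's redo
    return state

segments = [
    {"snapshot": {"a": 1}, "delta_log": [(5, "a", 2)]},
    {"snapshot": {"b": 3}, "delta_log": [(6, "c", 4)]},
]
with ThreadPoolExecutor() as pool:             # every segment restores in parallel
    print(list(pool.map(restore_segment, segments)))
```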
43. Now, there are times when we accidentally delete a table or forget a WHERE clause in a DELETE statement. Backtrack lets you quickly get the data back without fully restoring from backups; it is a relatively quick operation, a couple of minutes versus hours. In this example, you first backtrack to t1 and let it run some transactions. Note that you can backtrack back and forth to find the right point; it is not a destructive operation.
44. In traditional databases, we have to replay the logs since the last checkpoint, in a single thread. With Aurora, the redo logs are already being applied to each segment in parallel, asynchronously. At startup, we don't have to do anything other than some math to find the point at the time of the crash.
45. You can have up to 15 replicas. Because the replicas share the storage with the master, there is no loss of data on failover, and you can define the failover order.
46. Let's see what multi-master availability looks like in practice.
47. Two primary use cases: disaster recovery, and enhancing data locality by bringing data closer to your customers' applications in different regions.
48. You continue to get high throughput, and the lag across regions stays under a second even at peak throughput, which is quite impressive. For DR, you can switch your apps to a different region in under a minute; that is basically your RTO, with an RPO of under a few seconds. How did we do it? A multi-tenant, distributed replication fleet attaches itself as a replica on one side and a writer on the other, and does compressed physical replication between the two.
49. Here we compared logical and physical replication across regions using Sysbench. For logical replication, we used multi-threaded apply with 64 parallel workers. As we ramped the workload past 25K QPS, the logical replica was unable to keep up, with the lag rising consistently, while the physical lag stayed under a second even at peak throughput.
50. Let's talk about manageability.
51. (1 min) We announced Performance Insights support for Aurora MySQL earlier this year. It's a single place for you to monitor and root-cause load issues: it lets you group by waits, SQL, users, or hosts, over time or over any of those metrics. For example, the problem could be high CPU, lock waits, or I/O latency; you can then take action by tuning the SQL statements or adding more resources. It's done so seamlessly that it doesn't impact the performance of your database.
52. (30 seconds) With Aurora, you don't have to manage storage; it automatically grows for you. You don't have to manage read replicas; we can auto-scale them for you based on your workload. And custom reader endpoints let you separate, for example, analytics replicas from OLTP replicas.
53. (30 seconds) With Serverless, we can now manage the writer instance for you, automatically scaling it up and down (including down to zero) depending on the load. So if you have a dev/test workload, or a sporadic, cyclic, or unpredictable workload, Serverless may be a great option for you. It's a great way to save cost, as you only pay for the time you actually use.
54. (2 min) To understand how, let's first look at the different layers. There is a multi-tenant, distributed proxy layer in front, and your database connections are spread across that fleet, so there is no single point of failure. Then we have a warm pool of instances of different sizes kept on the side to quickly scale up or down. Finally, there is a monitoring service running on the left, watching your database instances and taking actions as needed. Here is how seamless scaling works: we first attach the new instance as a replica; then we ask the database to find a safe scaling point with no active transactions. Once it finds such a point, it starts looping all incoming traffic back to the proxy fleet, along with the coordinates of the replica instance. The proxy reads that payload, redirects all the network streams from the old host to the new host along with any new traffic, and finally sends a close message to the old machine. So you get no broken connections and no app impact.
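A highly simplified sketch of that handoff (all names and the dict-based "databases" are stand-ins for illustration): catch the warm instance up, wait for a safe point with no active transactions, repoint the proxy sessions, and only then close the old host, so client connections never drop.

```python
def attach_as_replica(new_db, old_db):
    new_db["data"] = dict(old_db["data"])      # stand-in for physical catch-up

def wait_for_safe_point(db):
    assert db["active_txns"] == 0              # stand-in for the real wait loop

def scale(proxy_sessions, old_db, new_db):
    attach_as_replica(new_db, old_db)          # 1. warm-pool instance catches up
    wait_for_safe_point(old_db)                # 2. scale only with no active txns
    for session in proxy_sessions:
        session["backend"] = new_db            # 3. repoint network streams
    old_db["closed"] = True                    # 4. proxy closes the old host

old = {"data": {"k": 1}, "active_txns": 0}
new = {"data": {}, "active_txns": 0}
sessions = [{"client": "c1", "backend": old}]
scale(sessions, old, new)
print(sessions[0]["backend"] is new)           # True: same session, new host
```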
  55. (30 seconds) We ran a quick simulation and you can see for yourself.
56. (20 seconds) We are also announcing a web-service Data API for your Lambda apps on top of Serverless: you simply send us an HTTP request and don't even need to worry about connection pooling.
57. (10 seconds) If you'd like to know more about multi-master, Serverless, or parallel query in Aurora MySQL, we have the following sessions coming up.
  58. Thank you all and have a great rest of your conference.