Scaling DB Access for 100s of billions of queries per day
Watch the video with slide synchronization on InfoQ.com!
https://www.infoq.com/presentations/paypal-scale-db/
Presented at QCon New York
www.qconnewyork.com
TLDR - Hera
• Open Source
• High Efficiency Reliable Access to data stores
Out Of DB Connections?
• Scaling
• Multiplexing
• Sharding
• Resiliency
• Manageability
• Our switch to Go
About Petrica
• Works @ PayPal
• Developing the database proxy
• Database-backed publish-subscribe messaging system
• PayPal Here backend
• Golang contributor
• Previously @
• Worked on distributed systems, embedded systems, desktop apps
©2019 PayPal Inc. Confidential and proprietary. 3
About Kenneth
• Works @ PayPal
• paypal.de
• PayPal Verisign Two Factor Authentication
• Coded shared memory for sharding
• Migrated to Go (hera)
• Previously @
• Visualization of Genomic Metabolism, Artificial Chemistry, and EPICS Monitoring
About PayPal – 2018 facts
• 100+ currencies
• 200+ markets
• 267M active customer accounts
• 9.9B payment transactions
• $578B total payment volume (up 27% YoY*)
• 2,700 applications
• 4,500 engineers
• 17,000 releases
PayPal OLTP Environment scale
• 2,000+ database instances
• 3,000+ database nodes
• 300+ schemas
• 74 PB total storage
• 10+ availability zones
• 200,000+ application servers
• 100+ billion calls/day
• 1+ billion calls/hour
Key Take-aways
• Massive scale
• Complexity
• Growth
Typical Database Access by Application
Without a pool, each service request pays for its own connection:
• connect: 100 ms
• request: 1 ms (three requests)
• disconnect
• Total: 103 ms
With a pool, the connection is pre-established, cached and re-used for each request:
• lease: ~0 ms
• request: 1 ms (three requests)
• release
• Total: ~3 ms
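The lease/release cycle can be sketched in Go with a buffered channel acting as the pool. This is a minimal illustration, not the pool any PayPal service uses: the `conn` type, `dial`, and the pool size are hypothetical, and the connect delay is shortened from the 100 ms on the slide.

```go
package main

import (
	"fmt"
	"time"
)

// conn stands in for a real database connection (hypothetical type).
type conn struct{ id int }

// dial simulates the expensive connect step (100 ms on the slide, shortened here).
func dial(id int) *conn {
	time.Sleep(time.Millisecond)
	return &conn{id: id}
}

// pool keeps pre-established connections in a buffered channel.
type pool struct{ idle chan *conn }

// newPool pays the connect cost once, up front, for every pooled connection.
func newPool(size int) *pool {
	p := &pool{idle: make(chan *conn, size)}
	for i := 0; i < size; i++ {
		p.idle <- dial(i)
	}
	return p
}

// lease hands out an idle connection, blocking while all are in use.
func (p *pool) lease() *conn { return <-p.idle }

// release puts the connection back for the next request.
func (p *pool) release(c *conn) { p.idle <- c }

func main() {
	p := newPool(2)
	c := p.lease()
	fmt.Println("leased connection", c.id)
	p.release(c)
	fmt.Println("idle connections:", len(p.idle))
}
```

Because `lease` blocks when the channel is empty, a request on a busy node waits for a connection even if the database itself has spare capacity.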
From Few to Thousands Microservices
• Configure the pool cache large enough for availability
• Add more services when scaling up
• Hit a connection limit
Solutions?
• Lower the connection pool size in each microservice node
• Improve microservice code to use the connection more efficiently
• Buy better database hardware
Partitioned Connection Pool
• While servicing requests, the microservices use only some DB connections
• In some nodes all the connections can be temporarily busy
• Client requests will incur latency even though the database has enough capacity
Solutions?
Scale with a shared pool!
Shared connection pool
Typical service request
• Between the service request and the service response, the service sends several SQL requests over its DB connection
• Database connection is allocated but mostly idle
Multiplexing
• Serving other requests when the client seems idle?
• Microservices send their requests through a shared pool
• Services perceive that they have a database connection
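A minimal sketch of the multiplexing idea, assuming a backend connection is borrowed only while a request runs. The `backend` and `mux` types are hypothetical, and the sketch ignores transactions, which need a connection held across several requests.

```go
package main

import (
	"fmt"
	"sync"
)

// backend stands in for one real database connection (hypothetical type).
type backend struct{ id int }

func (b *backend) exec(sql string) string {
	return fmt.Sprintf("backend %d ran %q", b.id, sql)
}

// mux lends a backend connection to a client only while a request runs.
type mux struct{ free chan *backend }

func newMux(n int) *mux {
	m := &mux{free: make(chan *backend, n)}
	for i := 0; i < n; i++ {
		m.free <- &backend{id: i}
	}
	return m
}

// do borrows a backend for a single request and returns it right after.
func (m *mux) do(sql string) string {
	b := <-m.free
	defer func() { m.free <- b }()
	return b.exec(sql)
}

func main() {
	m := newMux(2) // 2 database connections ...
	var wg sync.WaitGroup
	for c := 0; c < 10; c++ { // ... shared by 10 client sessions
		wg.Add(1)
		go func(client int) {
			defer wg.Done()
			_ = m.do(fmt.Sprintf("select %d", client))
		}(c)
	}
	wg.Wait()
	fmt.Println("free backends after load:", len(m.free))
}
```

Each client calls `do` as if it owned a connection, while only two backends ever exist.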
Hera – Highly Efficient Reliable Access to data stores
• Shared connection pool multiplexer
• Microservice-first DB proxy
Principles and constraints
• Correctness
• Resiliency, -ilities, performance
• Minimal configuration
• Thin client
• Minimal change to wire protocol
Let’s talk about other features!
Scale Features
• Multiplexing
• Read-write split
• Sharding
Read-Write Split
• Oracle RAC (Real Application Clusters): each database node is exchanging lock information
• The Hera node separates write queries and directs them to one node
• Reduces traffic between DB nodes
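One way to picture the split is a router that sends writes to a single node and spreads reads over the rest. The sketch below is an assumption-laden illustration: the keyword-based `isRead` classifier, the node names, and the round-robin choice are all hypothetical, and the slides do not say how Hera itself tells reads from writes.

```go
package main

import (
	"fmt"
	"strings"
)

// isRead is a deliberately naive classifier (an assumption, not Hera's rule):
// plain SELECT statements are reads; anything else, including
// SELECT ... FOR UPDATE, is treated as a write.
func isRead(sql string) bool {
	s := strings.ToLower(strings.TrimSpace(sql))
	return strings.HasPrefix(s, "select") && !strings.Contains(s, "for update")
}

// router sends writes to one node and spreads reads over the others.
type router struct {
	writeNode string
	readNodes []string
	next      int
}

func (r *router) route(sql string) string {
	if !isRead(sql) || len(r.readNodes) == 0 {
		return r.writeNode
	}
	n := r.readNodes[r.next%len(r.readNodes)]
	r.next++
	return n
}

func main() {
	// Node names are hypothetical.
	r := &router{writeNode: "rac-1", readNodes: []string{"rac-2", "rac-3", "rac-4"}}
	for _, q := range []string{
		"select * from loan where id = :id",
		"update loan set status = :s where id = :id",
		"select * from loan where id = :id for update",
	} {
		fmt.Printf("%-48s -> %s\n", q, r.route(q))
	}
}
```

Keeping all writes on one node is what reduces the lock traffic between the cluster nodes.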
Sharding
• Client vs Server
• Finding the Shard
• Sample Query Run
• Convert Legacy App
• Move Scuttle Bin’s Data
Sharding: Client vs Server
• C++ client with a C++ lib in front of Database (shard 0), Database (shard 1), Database (shard 2)
• Same logic in a Java client and Java lib
• Duplicate logic in a Go client and Go lib
• Copy logic in a Python client and Python lib
• Node.js client and Node.js lib too, with many workers
Sharding: Keep Logic in Server
• The Hera Multiplexer sits between the clients and Database (shard 0), Database (shard 1), Database (shard 2)
• Clients (C++, Java, Go, Python, Node.js) keep only their libs: C++ Lib, Hera Java Lib, Hera Go Lib, Python Lib, Node.js Lib
Sharding: Implementation
• Shard Key
• Hash function (shard key value) % K
• Scuttle bin to logical shard mapping (Shard 0, Shard 1, … Shard M)
• Logical to physical shard mapping (DB Shard 0 … DB Shard N)
1. Shard Key: account_id=2000
2. Murmur3Hash(2000)%1024=280
3. Select shard_id from hera_shard_map where scuttle_id = 280;
   shard_id
   ---------
   1
4. TWO_TASK_1 = LOAN_SH1
   TWO_TASK_1='tcp(loan-sh1:3306)/loan'
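The four lookup steps can be sketched as follows. This is an illustration under stated assumptions: FNV-1a from the Go standard library stands in for Murmur3, so the scuttle bin it computes will not match the 280 on the slide; the in-memory `shardMap` stands in for the hera_shard_map table; and the shard 0 data source name and the even/odd split are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

const scuttleBins = 1024 // K on the slide

// scuttleID hashes the shard key value into a scuttle bin.
// FNV-1a is a stand-in here; the slides use Murmur3.
func scuttleID(shardKey int64) int {
	h := fnv.New32a()
	h.Write([]byte(strconv.FormatInt(shardKey, 10)))
	return int(h.Sum32() % scuttleBins)
}

// shardMap mirrors hera_shard_map: scuttle bin -> logical shard.
type shardMap map[int]int

// physical mirrors the TWO_TASK_<n> settings: logical shard -> data source.
var physical = map[int]string{
	0: "tcp(loan-sh0:3306)/loan", // hypothetical entry
	1: "tcp(loan-sh1:3306)/loan", // from the slide
}

// resolve walks shard key -> scuttle bin -> logical shard -> physical shard.
func resolve(m shardMap, shardKey int64) (scuttle, shard int, dsn string) {
	scuttle = scuttleID(shardKey)
	shard = m[scuttle] // bins missing from the map fall back to shard 0
	return scuttle, shard, physical[shard]
}

func main() {
	m := shardMap{}
	for bin := 0; bin < scuttleBins; bin++ {
		m[bin] = bin % 2 // hypothetical even/odd split across two shards
	}
	scuttle, shard, dsn := resolve(m, 2000)
	fmt.Printf("account_id=2000 -> scuttle %d -> shard %d -> %s\n", scuttle, shard, dsn)
}
```

The two levels of indirection are what allow a scuttle bin to move later: only the bin's entry in the map changes, never the hash function.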
Sharding: Converting Application Query
- Shard Key: account_id
- select * from loan, appfile where loan.id = ? and appfile.loan_id = loan.id
Sharding Compatible Query
- Shard Key: account_id
- select * from loan, appfile where loan.id = ? and appfile.loan_id = loan.id and loan.account_id = ? and appfile.account_id = loan.account_id
Sharding: Hera JDBC SQL Rewrite
- Shard Key: account_id
- select * from loan, appfile where loan.id = :loan_id and appfile.loan_id = loan.id and loan.account_id = :account_id and appfile.account_id = loan.account_id
- Hera can now pick out the shard key name and value
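Once binds are named, finding the shard key is a matter of locating the bind compared against the shard key column. The sketch below uses a naive regular expression rather than a SQL parser; `shardKeyBind` and the bind values are hypothetical and are not taken from the Hera source.

```go
package main

import (
	"fmt"
	"regexp"
)

// shardKeyBind looks for "<shardKey> = :<bind>" in the SQL text and returns
// the bind name. It is a naive pattern match, not a SQL parser.
func shardKeyBind(sql, shardKey string) (string, bool) {
	re := regexp.MustCompile(`(?i)\b` + regexp.QuoteMeta(shardKey) + `\s*=\s*:(\w+)`)
	m := re.FindStringSubmatch(sql)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	sql := "select * from loan, appfile where loan.id = :loan_id and appfile.loan_id = loan.id " +
		"and loan.account_id = :account_id and appfile.account_id = loan.account_id"
	binds := map[string]int64{"loan_id": 77, "account_id": 2000} // hypothetical bind values

	if name, ok := shardKeyBind(sql, "account_id"); ok {
		fmt.Printf("shard key %s = %d\n", name, binds[name])
	} else {
		fmt.Println("query does not bind the shard key")
	}
}
```

With positional `?` binds the pattern finds nothing, which is why the query is rewritten to named binds first.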
Sharding: Legacy App Conversion
• All queries directed to shard 0
• Logs queries that don't bind to shard key
• Whitelist sends one value to a specific shard
• Limits risk of failures to 1 value
• Hera uses Shard 1, but same DB
• Fast rollback on errors
• Repeat for larger sets
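The whitelist step can be pictured as an override checked before the default routing. This is a minimal sketch with hypothetical names (`shardRouter`, `shardFor`); the slides do not show how Hera stores the whitelist.

```go
package main

import "fmt"

// shardRouter sends every key to the default shard unless it is whitelisted.
type shardRouter struct {
	defaultShard int
	whitelist    map[int64]int // shard key value -> logical shard
}

func (r *shardRouter) shardFor(key int64) int {
	if shard, ok := r.whitelist[key]; ok {
		return shard
	}
	return r.defaultShard
}

func main() {
	r := &shardRouter{defaultShard: 0, whitelist: map[int64]int{}}

	r.whitelist[2000] = 1 // send a single account_id to logical shard 1
	fmt.Println(r.shardFor(2000), r.shardFor(2001))

	delete(r.whitelist, 2000) // fast rollback: drop the whitelist entry
	fmt.Println(r.shardFor(2000))
}
```

Since only one key value is affected at a time, a routing mistake hits a single account rather than a whole scuttle bin.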
Sharding: Physical Whitelist
• Validates permissions and data copy
Sharding: Logical Move for One Scuttle Bin
• If successful, do a physical data move next
Sharding: Moving Scuttle Bin
• Start data copy
• Typically, tables are partitioned by scuttle bin
• Block writes
• Data fully copied
• Update map
• Use scuttle bin in new location
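The move sequence above can be sketched with in-memory maps standing in for the shards. Everything here is hypothetical (`cluster`, `copyBin`, `move`): a real move copies table partitions with database tooling, and the second copy pass is a simplification of catching up on rows written during the first pass.

```go
package main

import (
	"errors"
	"fmt"
)

var errBlocked = errors.New("writes blocked: scuttle bin is moving")

// cluster is an in-memory stand-in for the shards and the shard map.
type cluster struct {
	shardMap map[int]int              // scuttle bin -> shard
	data     map[int]map[int][]string // shard -> scuttle bin -> rows
	blocked  map[int]bool             // scuttle bins with writes blocked
}

func (c *cluster) write(bin int, row string) error {
	if c.blocked[bin] {
		return errBlocked
	}
	s := c.shardMap[bin]
	c.data[s][bin] = append(c.data[s][bin], row)
	return nil
}

// copyBin copies all rows of a bin from its current shard to the target.
func (c *cluster) copyBin(bin, to int) {
	from := c.shardMap[bin]
	c.data[to][bin] = append([]string(nil), c.data[from][bin]...)
}

func (c *cluster) move(bin, to int) {
	c.copyBin(bin, to)     // start data copy while writes still flow
	c.blocked[bin] = true  // block writes
	c.copyBin(bin, to)     // catch up: data fully copied
	c.shardMap[bin] = to   // update map
	delete(c.blocked, bin) // use scuttle bin in new location
}

func main() {
	c := &cluster{
		shardMap: map[int]int{280: 0},
		data:     map[int]map[int][]string{0: {}, 1: {}},
		blocked:  map[int]bool{},
	}
	_ = c.write(280, "loan row A")
	c.move(280, 1)
	_ = c.write(280, "loan row B") // lands on shard 1 after the move
	fmt.Println("bin 280 on shard", c.shardMap[280], "rows:", c.data[1][280])
}
```

The sketch leaves the old copy on the source shard; cleaning it up would be a separate step.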
Sharding: Take Away
• Queries must bind shard key & value
• Manage risk converting legacy code
• Controlled data redistribution
Resiliency Principles
• Deal with exceptional cases
▪ Surge in load
▪ Slow SQL requests
▪ Issues with the DB
• Limit or avoid impact
• Self heal
Resiliency Features
• Bouncer
• Surge Queue
• Slow Query Eviction
• Read Replica
• Transparent Application Failover
Resiliency: Oversubscription - Bouncer
• Services 1 to 4 connect to the Hera node
• Service 4's connection is TCP-accepted and then closed without an SSL handshake
Resiliency: Oversubscription – Surge Queue
• Service 3 can make a request; it waits, queued
• Generally, 1 ms later, Service 3 gets run
• Waiting more than 1 s gives an error to Service 3
Resiliency: Slow Query Eviction
• When Hera is overloaded, a slow query (1 s in the diagram) is removed
• Hundreds of queries can run once the slow query is removed
Monitoring Usage
• Graph for each Hera node: free DB connections (acpt) over time
Resiliency: Eviction – Actual event
• Happened a few years back, when automatic eviction went unnoticed.
• Evicting an expensive query makes space for many typical queries.
Read Replicas
• Application: business logic plus a failover module
• Hera: Pool 1 and Pool 2, each a MUX in front of replica pools
• Databases: Read Replica 1, Read Replica 2, Write DB
• R1 & R2: different data replication lag
Transparent Application Failover – Read Replica Retry
• Application: business logic only
• Hera: one pool whose MUX reaches the R1 and R2 pools
• Build failover on server side
• Also failover when query is slow
Manageability
• Maintenance without customer impact
• Recommend clients recycle often
• Oracle RAC maintenance
Recycle Connections for Manageability
• Service 2 and Service 3 connect to the Hera nodes through a TCP load balancer
• Node going away
• Node returns
DB Maintenance
• Oracle RAC (Real Application Clusters) with nodes 1 to 4 behind the Hera node; node 2 is RW
• Preparing Oracle RAC node 3 maintenance
• DBAs remove node 3 from Oracle configs
• Insert into hera_maint (inst_id, status, status_time, module, machine) values (3, 'F', [unix epoch]+2, [hera pool name], [host])
• After [unix epoch]+2
Take Aways
• Keep controls near those who need them
• Avoid being unnecessarily involved
Transition to Go
Previous version
- Written in C++, asynchronous code
- Scalable
- Efficient
- Complex
New changes
- New functional requirements
- New non-functional requirements (efficiency)
How about writing in Go?
- Go programming is synchronous
- Scales efficiently out of the box
- Does it meet our requirements?
- Is stop-the-world garbage collection time an issue?
Going with Go
- Proof-of-concept
- 1st release: no sharding
- Full feature with C++ dependencies
- Full feature, all Go
Mission Accomplished
- Running in production for 1 year
- Latency and CPU met the requirements
- No client changes
- Had some issues and learnings
Learning Example
Spawn 10000 goroutines that sleep for 10 seconds to simulate processing, running on my 2-core test machine. How many OS threads do I expect at runtime?

package main

import (
	"sync"
	"time"
)

var wg sync.WaitGroup

// routine simulates 10 seconds of processing.
func routine() {
	time.Sleep(time.Second * 10)
}

func main() {
	routines := 10000
	wg.Add(routines)
	for i := 0; i < routines; i++ {
		go func() {
			routine()
			wg.Done()
		}()
	}
	wg.Wait()
}

$> ./mytest &
[1] 12345
$> ps -L -C mytest --no-headers | wc -l
6
$>

Using a system call to sleep, how many OS threads do I expect at runtime?

// Same program; add "syscall" to the imports.
func routine() {
	//time.Sleep(time.Second * 10)
	syscall.Select(0, nil, nil, nil,
		&syscall.Timeval{Sec: 10})
}

$ ./mytest
runtime: program exceeds 10000-thread limit
fatal error: thread exhaustion
runtime stack:
….

. . . "A Go program creates a new thread only when a goroutine is ready to run but all the existing threads are blocked in system calls, cgo calls, or are locked to other goroutines due to use of runtime.LockOSThread"
from "runtime/debug" func SetMaxThreads()
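For reference, the limit named in the fatal error can be read and changed through `debug.SetMaxThreads`, which returns the previous setting. The sketch below only shows the call; the `setThreadLimit` wrapper and the value 20000 are arbitrary examples, and raising the limit does not address goroutines that block OS threads.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setThreadLimit changes the runtime's OS thread limit and returns the old one.
func setThreadLimit(n int) int {
	return debug.SetMaxThreads(n)
}

func main() {
	old := setThreadLimit(20000) // 20000 is an arbitrary example value
	fmt.Println("previous thread limit:", old)
}
```

The Go documentation describes the limit as a way to take a runaway program down before it takes down the operating system, so it is better treated as a safety net than as a tuning knob.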
Key Take Away
• System calls block the OS thread. Ex: os.File read
• When using a C/C++ library, be aware of the system calls (mutexes, file reads, socket reads, sleep)
• Golang locks, socket reads, and sleep do not block the OS thread
• In your load tests, verify the number of OS threads
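Besides counting threads with `ps`, a process can report the figure itself. A minimal sketch, assuming the standard `threadcreate` profile from `runtime/pprof` is an acceptable proxy: it counts threads the runtime has created, which may be higher than the number alive right now. The `osThreadsCreated` helper is a hypothetical name.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
)

// osThreadsCreated reports how many OS threads the Go runtime has created
// so far, using the predefined "threadcreate" profile.
func osThreadsCreated() int {
	return pprof.Lookup("threadcreate").Count()
}

func main() {
	fmt.Println("goroutines:", runtime.NumGoroutine())
	fmt.Println("OS threads created:", osThreadsCreated())
}
```

Logging this number during a load test makes it easy to spot a thread count that grows with the number of goroutines.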
Open source
github.com/paypal/hera
✓ Code, documentation, examples
✓ Standard clients
• Java: JDBC driver
• Golang: database/sql driver
Google group heradatalink
github.com/paypal/hera
Internal Mirror
Build
Deploy
PayPal Config
Acknowledgements
Kamlakar
Mukundan
Liana
Sid
Yaping
Shuping
Stephanie
Varun
Qing
Chun
Dwain
Paresh
Srini
Dheeraj
Pradeep
Pramod
Samrat
Akash
Ashwin
and countless others
Next Steps
Need more database connections? Hera can help scale
• Multiplexing
• Sharding
• Resiliency
• Manageability
Open Source
• github.com/paypal/hera
• Google Groups – heradatalink
Our Future
• Scaling Tools
• More Clients
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 


Scaling DB Access for Billions of Queries Per Day @PayPal

  • 1. Scaling DB Access for 100s of Billions of Queries per Day
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/paypal-scale-db/
  • 3. Presented at QCon New York www.qconnewyork.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4. TL;DR: Hera is open source, High Efficiency Reliable Access to data stores. Out of DB connections? • Scaling • Multiplexing • Sharding • Resiliency • Manageability • Our switch to Go
  • 5. About Petrica • Works @ PayPal • Developing the database proxy • Database-backed publish-subscribe messaging system • PayPal Here backend • Golang contributor • Previously @ • Worked on distributed systems, embedded systems, desktop apps ©2019 PayPal Inc. Confidential and proprietary.
  • 6. About Kenneth • Works @ PayPal • paypal.de • PayPal Verisign Two Factor Authentication • Coded shared memory for sharding • Migrated to Go (hera) • Previously @ • Visualization of Genomic Metabolism, Artificial Chemistry, and EPICS Monitoring
  • 8. About PayPal – 2018 facts: 100+ currencies; 200+ markets; 267M active customer accounts; 9.9B payments transactions; $578B total payments volume, up 27% YoY*; 2,700 applications; 4,500 engineers; 17,000 releases
  • 9. PayPal OLTP environment scale: 2000+ database instances; 100+ billion calls/day; 3000+ database nodes; 74 PB total storage; 300+ schemas; 10+ availability zones; 200,000+ application servers; 1+ billion calls/hour ©2015 PayPal Inc. Confidential and proprietary.
  • 10. Key Take-aways • Massive scale • Complexity • Growth
  • 11. Typical Database Access by Application. Diagram: the microservice receives a request, connects to the database, sends three requests, disconnects, and returns the response.
  • 12. Typical Database Access by Application. Same flow with timings: connect 100 ms, then three requests at 1 ms each, for 103 ms in total.
  • 13. Typical Database Access by Application, with a pool. The connection is pre-established (the 100 ms connect), cached and re-used for each request: lease ~0 ms, three requests at 1 ms each, release, for ~3 ms in total.
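The lease and release flow on slide 13 can be sketched in Go. This is only an illustration under stated assumptions: `Conn`, `Pool`, `NewPool`, `Lease` and `Release` are hypothetical names that do not come from the talk, and a buffered channel stands in for the pool cache. A real service would normally rely on the pool provided by its database driver.

```go
package main

import "fmt"

// Conn stands in for an established database connection (hypothetical type).
type Conn struct{ ID int }

// Pool hands out pre-established connections over a buffered channel.
type Pool struct{ idle chan *Conn }

// NewPool pays the expensive connect once per connection, up front.
func NewPool(size int) *Pool {
	p := &Pool{idle: make(chan *Conn, size)}
	for i := 0; i < size; i++ {
		p.idle <- &Conn{ID: i}
	}
	return p
}

// Lease blocks until a cached connection is available.
func (p *Pool) Lease() *Conn { return <-p.idle }

// Release returns the connection to the cache for the next request.
func (p *Pool) Release(c *Conn) { p.idle <- c }

func main() {
	p := NewPool(2)
	c := p.Lease()
	fmt.Println("leased connection", c.ID, "idle left:", len(p.idle))
	p.Release(c)
	fmt.Println("idle after release:", len(p.idle))
}
```

Because `Lease` blocks when the channel is empty, a pool that is too small shows up as request latency rather than as an error, which is the trade-off the following slides discuss.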
  • 14. From Few to Thousands of Microservices. One microservice: configure the pool cache large enough for availability.
  • 15. From Few to Thousands of Microservices. Two microservices: configure the pool cache large enough for availability; add more services when scaling up.
  • 16. From Few to Thousands of Microservices. Three microservices: configure the pool cache large enough for availability; add more services when scaling up.
  • 17. From Few to Thousands of Microservices. Four microservices: configure the pool cache large enough for availability; add more services when scaling up.
  • 18. From Few to Thousands of Microservices. Five microservices: configure the pool cache large enough for availability; add more services when scaling up; hit a connection limit.
  • 19. Solutions? • Lower the connection pool size in each microservice node • Improve microservice code to use the connection more efficiently • Buy better database hardware
  • 20. Partitioned Connection Pool. Two microservices, each with its own pool, serve requests. While servicing requests, the microservices use only some DB connections.
  • 21. Partitioned Connection Pool. While servicing requests, the microservices use only some DB connections. On some nodes, all the connections can be temporarily busy.
  • 22. Partitioned Connection Pool. While servicing requests, the microservices use only some DB connections. On some nodes, all the connections can be temporarily busy. Client requests will incur latency even though the database has enough capacity.
  • 24. Shared Connection Pool. Two microservices send their requests through one shared pool.
  • 25. Typical service request. Between the service request and the service response, the service issues four SQL requests over its DB connection. The database connection is allocated but mostly idle.
  • 26. Serving other requests when the client seems idle? Diagram: additional SQL requests fill the idle gaps on the same DB connection.
  • 27. Multiplexing. Two microservices send requests through a shared pool. Services perceive that they have a database connection.
  • 28. Multiplexing. Same diagram: services perceive that they have a database connection.
  • 29. Hera – Highly Efficient Reliable Access to data stores. Two microservices send requests through Hera, a shared connection pool multiplexer.
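The multiplexing idea on slides 25 to 29 can be illustrated with a toy Go model in which many client sessions share a few backend connections, borrowing one only while a statement runs. It is a sketch, not Hera's implementation: `Mux`, `NewMux`, `Exec` and `Run` are hypothetical names, and no SQL is actually sent.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Mux shares a small set of backend connections among many client sessions.
type Mux struct {
	backends chan int // IDs of idle backend connections
	inUse    int32    // backends busy right now
	peak     int32    // highest value inUse has reached
}

func NewMux(n int) *Mux {
	m := &Mux{backends: make(chan int, n)}
	for i := 0; i < n; i++ {
		m.backends <- i
	}
	return m
}

// Exec borrows a backend only for the duration of one statement.
func (m *Mux) Exec(sql string) {
	id := <-m.backends // blocks while every backend is busy
	n := atomic.AddInt32(&m.inUse, 1)
	for { // record the peak without taking a lock
		p := atomic.LoadInt32(&m.peak)
		if n <= p || atomic.CompareAndSwapInt32(&m.peak, p, n) {
			break
		}
	}
	_, _ = id, sql // a real proxy would forward sql on backend id here
	atomic.AddInt32(&m.inUse, -1)
	m.backends <- id
}

// Run simulates client sessions that each issue several statements and
// returns the peak number of backends that were in use at once.
func Run(clients, backends, requests int) int32 {
	m := NewMux(backends)
	var wg sync.WaitGroup
	for c := 0; c < clients; c++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := 0; r < requests; r++ {
				m.Exec("select 1")
			}
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&m.peak)
}

func main() {
	fmt.Println("peak backends in use:", Run(100, 4, 10))
}
```

Each of the 100 simulated sessions behaves as if it owned a connection, yet the database side never sees more than four in use.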
  • 30. Hera: Microservice-First DB Proxy. Principles and constraints • Correctness • Resiliency, -ilities, performance • Minimal configuration • Thin client • Minimal change to wire protocol
  • 31. Let’s talk about other features!
  • 33. Read-Write Split. Oracle RAC (Real Application Clusters) with four read-write (RW) nodes. Each database node exchanges lock information with the others.
  • 34. Read-Write Split. A Hera node sits in front of Oracle RAC, whose nodes are now marked RW or R (read). Separate the write queries and direct them to one node; this reduces traffic between DB nodes.
  • 35. Sharding • Client vs Server • Finding the Shard • Sample Query Run • Convert Legacy App • Move Scuttle Bin’s Data
  • 36. Sharding: Client vs Server. A C++ client uses a C++ library to reach Database (shard 0), Database (shard 1) and Database (shard 2).
  • 37. Sharding: Same Logic in Java Client. A Java client with a Java library joins the C++ client and library; both reach shards 0, 1 and 2.
  • 38. Sharding: Duplicate Logic in Go Client. A Go client with a Go library is added alongside the C++ and Java clients and libraries.
  • 39. Sharding: Copy Logic in Python. A Python client with a Python library is added alongside the C++, Java and Go ones.
  • 40. Sharding: Node.js Client Too. A Node.js client with a Node.js library is added alongside the C++, Java, Go and Python ones.
  • 41. Sharding: Keep Logic in Server. The C++, Java, Go, Python and Node.js clients use their libraries (C++ Lib, Hera Java Lib, Hera Go Lib, Python Lib, Node.js Lib) to talk to the Hera multiplexer, whose workers connect to Database (shard 0), Database (shard 1) and Database (shard 2).
  • 42. Sharding: Implementation. Diagram: shard key -> hash function (shard key value) % K -> scuttle bin to logical shard mapping (Shard 0, Shard 1, … Shard M) -> logical to physical shard mapping (DB Shard 0 … DB Shard N). Step 1. Shard key: account_id=2000
  • 43. Sharding: Implementation. Same diagram. Step 1. Shard key: account_id=2000. Step 2. Murmur3Hash(2000) % 1024 = 280
  • 44. Sharding: Implementation. Same diagram. Step 1. Shard key: account_id=2000. Step 2. Murmur3Hash(2000) % 1024 = 280. Step 3. select shard_id from hera_shard_map where scuttle_id = 280; returns shard_id 1
  • 45. Sharding: Implementation. Same diagram. Step 1. Shard key: account_id=2000. Step 2. Murmur3Hash(2000) % 1024 = 280. Step 3. select shard_id from hera_shard_map where scuttle_id = 280; returns shard_id 1. Step 4. TWO_TASK_1 = LOAN_SH1; TWO_TASK_1='tcp(loan-sh1:3306)/loan'
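The four lookup steps on slides 42 to 45 can be sketched in Go as follows. Several things here are assumptions rather than Hera's code: FNV-1a from the standard library stands in for Murmur3 (so the bin numbers differ from the 280 shown on the slides), the two in-memory maps stand in for the `hera_shard_map` table and the `TWO_TASK_n` settings, and `Router`, `Route`, `scuttleBin` and the host name `loan-sh0` are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

const scuttleBins = 1024 // K on the slides

// scuttleBin hashes the shard key value into one of K bins.
// Assumption: FNV-1a is a stand-in; the slides use Murmur3.
func scuttleBin(shardKey int64) uint32 {
	h := fnv.New32a()
	h.Write([]byte(strconv.FormatInt(shardKey, 10)))
	return h.Sum32() % scuttleBins
}

// Router holds the two mappings from the diagram.
type Router struct {
	binToShard map[uint32]int // scuttle bin -> logical shard
	shardToDSN map[int]string // logical shard -> physical database
}

// Route resolves a shard key value to its logical shard and database.
// Bins without an entry fall back to shard 0 (the map's zero value).
func (r *Router) Route(shardKey int64) (int, string) {
	shard := r.binToShard[scuttleBin(shardKey)]
	return shard, r.shardToDSN[shard]
}

func main() {
	r := &Router{
		binToShard: map[uint32]int{},
		shardToDSN: map[int]string{
			0: "tcp(loan-sh0:3306)/loan", // hypothetical host
			1: "tcp(loan-sh1:3306)/loan",
		},
	}
	r.binToShard[scuttleBin(2000)] = 1 // place account 2000's bin on shard 1

	shard, dsn := r.Route(2000)
	fmt.Println("bin:", scuttleBin(2000), "shard:", shard, "dsn:", dsn)
}
```

Keeping the bin count fixed and moving only the bin-to-shard entries is what lets data be redistributed later without rehashing every key.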
  • 46. Sharding: Converting Application Query. Shard key: account_id. Query: select * from loan, appfile where loan.id = ? and appfile.loan_id = loan.id
  • 47. Sharding Compatible Query. Shard key: account_id. Query: select * from loan, appfile where loan.id = ? and appfile.loan_id = loan.id and loan.account_id = ? and appfile.account_id = loan.account_id
  • 48. Sharding: Hera JDBC SQL Rewrite. Shard key: account_id. Query: select * from loan, appfile where loan.id = :loan_id and appfile.loan_id = loan.id and loan.account_id = :account_id and appfile.account_id = loan.account_id. Hera can now pick out the shard key name and value.
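Once binds are named, checking whether a query binds the shard key becomes a simple text scan. The Go sketch below is a deliberately naive illustration of that check, not Hera's parser: `BindsShardKey` and `bindName` are hypothetical names, and the regular expression would also match a `:name` that appears inside a string literal or comment.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// bindName matches named bind parameters such as :account_id.
var bindName = regexp.MustCompile(`:([A-Za-z_][A-Za-z0-9_]*)`)

// BindsShardKey reports whether sql binds a parameter named after the shard key.
func BindsShardKey(sql, shardKey string) bool {
	for _, m := range bindName.FindAllStringSubmatch(sql, -1) {
		if strings.EqualFold(m[1], shardKey) {
			return true
		}
	}
	return false
}

func main() {
	legacy := "select * from loan, appfile where loan.id = :loan_id " +
		"and appfile.loan_id = loan.id"
	sharded := legacy + " and loan.account_id = :account_id " +
		"and appfile.account_id = loan.account_id"

	for _, q := range []string{legacy, sharded} {
		fmt.Println(BindsShardKey(q, "account_id"), q)
	}
}
```

A check like this is also enough to log the queries that still need converting, since any statement for which it returns false cannot be routed by shard key.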
  • 49. Sharding: Legacy App Conversion • All queries are directed to shard 0 • Logs queries that don’t bind the shard key
  • 50. Sharding: Legacy App Conversion • A whitelist sends one value to a specific shard • Limits the risk of failures to one value • Hera uses Shard 1, but on the same DB • Fast rollback on errors • Repeat for larger sets
  • 51. Sharding: Physical Whitelist • Validates permissions and data copy
  • 52. Sharding: Logical Move for One Scuttle Bin • If successful, do a physical data move next
  • 53. Sharding: Moving Scuttle Bin • Start data copy • Typically, tables are partitioned by scuttle bin
  • 54. Sharding: Moving Scuttle Bin • Start data copy • Typically, tables are partitioned by scuttle bin • Block writes
  • 55. Sharding: Moving Scuttle Bin • Start data copy • Typically, tables are partitioned by scuttle bin • Block writes • Data fully copied
  • 56. Sharding: Moving Scuttle Bin • Start data copy • Typically, tables are partitioned by scuttle bin • Block writes • Data fully copied • Update map • Use scuttle bin in new location
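The order of operations on slides 53 to 56 can be written down as a small Go sketch. All names here (`ShardMap`, `MoveBin`, `ShardForWrite`, the `copyData` hook) are hypothetical, and the in-memory map only models the bookkeeping; the actual data copy is left to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrWritesBlocked = errors.New("writes blocked: scuttle bin is moving")

// ShardMap tracks where each scuttle bin lives and which bins are paused.
type ShardMap struct {
	mu      sync.Mutex
	shardOf map[int]int  // scuttle bin -> shard
	blocked map[int]bool // bins whose writes are paused
}

func NewShardMap() *ShardMap {
	return &ShardMap{shardOf: map[int]int{}, blocked: map[int]bool{}}
}

// ShardForWrite returns the shard that should receive a write for bin.
func (m *ShardMap) ShardForWrite(bin int) (int, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.blocked[bin] {
		return 0, ErrWritesBlocked
	}
	return m.shardOf[bin], nil
}

func (m *ShardMap) setBlocked(bin int, b bool) {
	m.mu.Lock()
	m.blocked[bin] = b
	m.mu.Unlock()
}

// MoveBin follows the slide order. copyData is a caller-supplied hook:
// final=false is the bulk copy, final=true the catch-up after writes stop.
func (m *ShardMap) MoveBin(bin, to int, copyData func(final bool) error) error {
	if err := copyData(false); err != nil { // start data copy
		return err
	}
	m.setBlocked(bin, true) // block writes
	defer m.setBlocked(bin, false)
	if err := copyData(true); err != nil { // data fully copied
		return err // map untouched, so the bin stays where it was
	}
	m.mu.Lock()
	m.shardOf[bin] = to // update map
	m.mu.Unlock()
	return nil // writes resume against the new location
}

func main() {
	m := NewShardMap()
	m.shardOf[280] = 0
	err := m.MoveBin(280, 1, func(final bool) error { return nil })
	shard, _ := m.ShardForWrite(280)
	fmt.Println("move error:", err, "bin 280 now on shard", shard)
}
```

Writes are paused only between the bulk copy and the map update, so the window in which clients see errors is limited to the catch-up copy.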
  • 57. Sharding: Take Away • Queries must bind shard key & value • Manage risk when converting legacy code • Controlled data redistribution
  • 58. Resiliency Principles • Deal with exceptional cases: surge in load, slow SQL requests, issues with the DB
  • 59. Resiliency Principles • Deal with exceptional cases: surge in load, slow SQL requests, issues with the DB • Limit or avoid impact • Self-heal
  • 60. Resiliency Features • Bouncer • Surge Queue • Slow Query Eviction • Read Replica • Transparent Application Failover
  • 61. Resiliency: Oversubscription – Bouncer. Services 1 to 4 connect to a Hera node. Service 4's connection is TCP-accepted and then closed without SSL.
  • 62. Resiliency: Oversubscription – Surge Queue. Services 1 to 3 connect to a Hera node. Service 3 can make a request; it waits, queued.
  • 63. Resiliency: Oversubscription – Surge Queue. Generally, 1 ms later, Service 3 gets run.
  • 64. Resiliency: Oversubscription – Surge Queue. Waiting more than 1 s gives an error to Service 3.
  • 65. Resiliency: Slow Query Eviction. Services 1 to 3 connect to a Hera node. When Hera is overloaded, a slow query (1 s) is evicted.
  • 66. Resiliency: Slow Query Eviction. Hundreds of queries can run once the slow query is removed.
  • 67. Monitoring Usage. Services 1 to 3 connect to a Hera node. Chart: free DB connections (acpt) over time, for each Hera node.
  • 68. Resiliency: Eviction – Actual Event. Chart: free DB connections (acpt) over time, for each Hera node. Happened a few years back, when automatic eviction went unnoticed. Evicting an expensive query makes space for many typical queries.
  • 69. Read Replicas. Diagram labels: Application (business logic, failover module, Pool 1, Pool 2); Hera (MUX, R1 pools, R2 pool); Read Replica 1, Read Replica 2, Write DB. R1 & R2: different data replication lag.
  • 70. Transparent Application Failover – Read Replica Retry. Diagram labels: Application (business logic, Pool); Hera (MUX, R1 pools, R2 pool); Read Replica 1, Read Replica 2, Write DB. • Build failover on the server side • Also fail over when a query is slow
  • 71. Manageability • Maintenance without customer impact • Recommend clients recycle often • Oracle RAC maintenance
  • 72. Recycle Connections – Node Going Away. Services 2 and 3 reach three Hera nodes through a TCP load balancer.
  • 73. Recycle Connections for Manageability. Services 2 and 3 reach three Hera nodes through a TCP load balancer.
  • 74. Recycle Connections – Node Returns. Services 2 and 3 reach three Hera nodes through a TCP load balancer.
  • 75. DB Maintenance. A Hera node in front of Oracle RAC (Real Application Clusters) nodes 1 to 4. • Preparing Oracle RAC node 3 maintenance • DBAs remove node 3 from Oracle configs • insert into hera_maint (inst_id, status, status_time, module, machine) values (3, 'F', [unix epoch]+2, [hera pool name], [host])
  • 76. DB Maintenance. Same diagram, after [unix epoch]+2. Take Aways • Keep controls near those who need them • Avoid being unnecessarily involved
  • 78. Transition to Go. Previous version: written in C++, asynchronous code; scalable; efficient; complex.
  • 79. Transition to Go. Previous version: written in C++, asynchronous code; scalable; efficient; complex. New changes: new functional requirements; new non-functional requirements (efficiency).
  • 80. Transition to Go. Previous version: written in C++, asynchronous code; scalable; efficient; complex. New changes: new functional requirements; new non-functional requirements (efficiency). How about writing in Go? Go programming is synchronous; scales efficiently out of the box. Does it meet our requirements? Is stop-the-world garbage collection time an issue?
  • 81. Transition to Go: Going with Go. Proof of concept -> 1st release: no sharding -> full feature with C++ dependencies -> full feature, all Go.
  • 82. Mission Accomplished • Running in production for 1 year • Latency and CPU met the requirements • No client changes • Had some issues and learnings
  • 83. Learning Example. Spawn 10000 goroutines that sleep for 10 seconds to simulate processing, running on my 2-core test machine. How many OS threads do I expect at runtime?
      package main
      import ("sync"; "time")
      var wg sync.WaitGroup
      // routine simulates processing with a Go-level sleep.
      func routine() { time.Sleep(time.Second * 10) }
      func main() {
          routines := 10000
          wg.Add(routines)
          for i := 0; i < routines; i++ {
              go func() { routine(); wg.Done() }()
          }
          wg.Wait()
      }
  • 84. Learning Example. Same program as slide 83. Result:
      $> ./mytest &
      [1] 12345
      $> ps -L -C mytest --no-headers | wc -l
      6
  • 85. Learning Example. Same setup, but using a system call to sleep. How many OS threads do I expect at runtime? Only routine changes (it now needs the "syscall" import); main is as on slide 83:
      func routine() {
          // time.Sleep(time.Second * 10)
          syscall.Select(0, nil, nil, nil, &syscall.Timeval{Sec: 10})
      }
  • 86. Learning Example. Same program as slide 85. Result:
      $ ./mytest
      runtime: program exceeds 10000-thread limit
      fatal error: thread exhaustion
      runtime stack: ….
  • 87. Learning Example. “A Go program creates a new thread only when a goroutine is ready to run but all the existing threads are blocked in system calls, cgo calls, or are locked to other goroutines due to use of runtime.LockOSThread” (from the documentation of func SetMaxThreads in “runtime/debug”)
  • 88. Key Take Away • System calls lock the OS thread, e.g. an os.File read • When using a C/C++ library, be aware of its system calls (mutexes, file reads, socket reads, sleep) • Go's own locks, socket reads and sleep do not lock the OS thread • In your load tests, verify the number of OS threads
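One way to follow the last take-away from inside a Go program is to read the `threadcreate` profile from `runtime/pprof`, which counts the OS threads the runtime has created. The sketch below repeats the learning example at a smaller scale (200 goroutines, 300 ms) so it stays far below the 10000-thread limit. It assumes a Unix-like system where `syscall.Select` is available; `osThreads`, `spawn`, `goSleep` and `sysSleep` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"runtime/pprof"
	"sync"
	"syscall"
	"time"
)

// osThreads reports how many OS threads the runtime has created so far.
func osThreads() int { return pprof.Lookup("threadcreate").Count() }

// spawn runs n goroutines that each call wait, and returns when all finish.
func spawn(n int, wait func()) {
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			wait()
		}()
	}
	wg.Wait()
}

// goSleep parks the goroutine inside the Go scheduler; no thread is held.
func goSleep() { time.Sleep(300 * time.Millisecond) }

// sysSleep blocks in a raw system call, which holds an OS thread meanwhile.
func sysSleep() {
	syscall.Select(0, nil, nil, nil, &syscall.Timeval{Usec: 300000})
}

func main() {
	spawn(200, goSleep)
	fmt.Println("threads after time.Sleep:    ", osThreads())
	spawn(200, sysSleep)
	fmt.Println("threads after syscall.Select:", osThreads())
}
```

Exact counts depend on the machine and Go version, but the second number should be much larger than the first, matching the gap between slides 84 and 86.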
  • 89. Open source: github.com/paypal/hera ✓ Code, documentation, examples ✓ Standard clients • Java: JDBC driver • Golang: database/sql driver. Google group: heradatalink. Diagram labels: github.com/paypal/hera, Internal Mirror, Build, Deploy, PayPal Config.
  • 90. Acknowledgements: Kamlakar Mukundan Liana Sid Yaping Shuping Stephanie Varun Qing Chun Dwain Paresh Srini Dheeraj Pradeep Pramod Samrat Akash Ashwin and countless others
  • 91. Next Steps. Need more database connections? Hera can help scale • Multiplexing • Sharding • Resiliency • Manageability. Open source: github.com/paypal/hera. Google Groups: heradatalink. Our future • Scaling tools • More clients
  • 92. github.com/paypal/hera Google group: heradatalink
  • 93. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/paypal-scale-db/