** Talk Given at OSCon 2010 **
Clients often approach us for help scaling their systems, and all too often their long-term vision is a muddle of approaches picked up from popular blogs that were solving very different problems. Scaling your database is difficult enough by itself; you don't want to get tripped up by not understanding where you're going. In Database Scalability Patterns we will attempt to distill the information, hype, and discussion around scaling databases, and break down the common patterns we've seen. "Buzzwords" we'll cover (and hopefully debuzz) include:
Vertical Scaling
Horizontal Partitioning
Horizontal Scaling
Read Slaves
Multi-Master
Vertical Partitioning
Federated Data Storage
More important than just describing what these things are (although that's a good first step), we'll also discuss the points in the life cycle of your database when you need to be thinking about the different options in front of you. We'll factor in the type of application you're working on (think OLTP vs. OLAP, or social networking vs. corporate application) and the environment you'll be working in (scaling "in the cloud" is very different from DIY in the datacenter), and we will talk about the types of tools you'll need to accomplish these goals (not all replication systems are the same, and some won't help at all).
20. Read Slaves / Master - Slave
Scale Read Load
[Diagram: the app's writes go to the master database; the database writes data to the slaves; reads go to any of the three slave dbs]
Wednesday, July 21, 2010
21. Read Slaves / Master - Slave
Scale Read Load
[Diagram: the app writes data everywhere (the database, where writes go, and three memcached nodes); reads go to any of the memcached nodes]
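The "app writes data everywhere" variant can be sketched roughly as follows. This is a minimal illustration, not the setup from the talk: plain dictionaries stand in for the database and the memcached nodes, and the class name `CachedStore` and the fall-back-to-database behavior on a cache miss are assumptions added for the example.

```python
class CachedStore:
    """Write to the database and every cache node; read from any cache node."""

    def __init__(self, database, cache_nodes):
        if not cache_nodes:
            raise ValueError("at least one cache node is required")
        self.database = database        # stand-in for the database (a dict)
        self.cache_nodes = cache_nodes  # stand-ins for memcached nodes (dicts)

    def write(self, key, value):
        # The app, not the database, is responsible for pushing data everywhere.
        self.database[key] = value
        for node in self.cache_nodes:
            node[key] = value

    def read(self, key, node_index=0):
        # Reads can be served by any cache node.
        node = self.cache_nodes[node_index % len(self.cache_nodes)]
        if key in node:
            return node[key]
        # Assumption: on a cache miss, fall back to the database and refill.
        value = self.database[key]
        node[key] = value
        return value
```

The cost of this pattern lands in the application: every write path has to know about every cache node.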
25. Read Slaves / Master - Slave
Scale Read Load
• Typically
  • Full Copy of Data On Each Node
  • Asynchronous
• Consider
  • Partial Copy
  • Synchronous
  • Don’t use a RDBMS?
Requires Application Changes
“easy”
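The application change that read slaves require is mostly routing: writes go to the master, reads go to one of the slaves. A minimal sketch, assuming connections are opaque objects and that a statement starting with SELECT is a read; the class name `ReadWriteRouter` and the round-robin choice are hypothetical, not something the talk prescribes.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master, slaves):
        if not slaves:
            raise ValueError("at least one slave is required")
        self.master = master
        self._slaves = itertools.cycle(slaves)  # simple round-robin

    def connection_for(self, sql):
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        # With asynchronous replication a slave may lag behind the master,
        # so a read issued right after a write can return stale data.
        return next(self._slaves) if is_read else self.master
```

For example, `ReadWriteRouter("master", ["slave1", "slave2"]).connection_for("SELECT 1")` returns `"slave1"` on the first call, while any INSERT or UPDATE is routed to `"master"`.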
31. Multi-Master
many different ways to implement this,
few that actually work in production
write to any node, database syncs data
can reduce cpu, doesn’t reduce i/o
failover solution
not a scalability solution
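Why multi-master can reduce CPU but not I/O comes down to simple arithmetic. The sketch below assumes every node holds a full copy of the data, so each node must eventually apply every write, while client-facing work is spread across the nodes. The function name and the dictionary keys are made up for the illustration.

```python
def multi_master_write_load(total_writes_per_sec, node_count):
    """Estimate per-node load in a full-copy multi-master setup."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {
        # Client connections and statement handling are spread across nodes.
        "client_writes_handled_per_node": total_writes_per_sec / node_count,
        # Every write is still synced to, and applied on, every node.
        "writes_applied_per_node": total_writes_per_sec,
    }
```

With 1000 writes per second and 4 nodes, each node handles 250 client writes but still applies all 1000, which is why adding masters does not buy write capacity.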
37. Horizontal Partitioning
• Divide schema by job operations
• Move each piece to own server
• Duplicate some data as needed
• You must separate dependencies in the app code first!
[Diagram: the schema split into three pieces, labeled items, forums, and users]
Each node is a new instance of vertical scaling
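A rough sketch of what this looks like in application code, using the items, forums, and users pieces from the slide. The host names, the `server_for` helper, and the record shapes are hypothetical. The second function shows the kind of dependency that has to be separated first: once forums and users live on different servers, a SQL join between them becomes a lookup in the app.

```python
# Hypothetical mapping from each piece of the schema to its own server.
SERVER_FOR_AREA = {
    "items": "items-db.example.internal",
    "forums": "forums-db.example.internal",
    "users": "users-db.example.internal",
}


def server_for(area):
    """Return the server that owns the given piece of the schema."""
    try:
        return SERVER_FOR_AREA[area]
    except KeyError:
        raise ValueError(f"unknown functional area: {area!r}") from None


def attach_author_names(posts, users_by_id):
    """Join forum posts to users in the app instead of in SQL."""
    return [
        dict(post, author_name=users_by_id[post["user_id"]]["name"])
        for post in posts
    ]
```

Here `posts` would come from the forums server and `users_by_id` from the users server, each fetched with its own query.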
40. Horizontal Scaling
• data split across servers based on algorithm
• data dropped into buckets (multiple?)
• someone must keep track of data, and provide lookup services
[Diagram: the app passes data through a "magic hash algorithm"]
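One way the "magic hash algorithm" plus lookup service could fit together is sketched below. This is an illustration under stated assumptions, not the scheme from the talk: keys are hashed into a fixed number of buckets, and a directory (the hypothetical `ShardDirectory`) records which server currently holds each bucket. SHA-256 and the bucket count of 64 are arbitrary choices for the example.

```python
import hashlib

BUCKET_COUNT = 64  # arbitrary; fixed so that keys never change buckets


def bucket_for(key, bucket_count=BUCKET_COUNT):
    """Hash a key into one of a fixed number of buckets."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % bucket_count


class ShardDirectory:
    """Keeps track of which server holds each bucket (the lookup service)."""

    def __init__(self, servers, bucket_count=BUCKET_COUNT):
        if not servers:
            raise ValueError("at least one server is required")
        self.bucket_count = bucket_count
        # Several buckets live on each server to begin with.
        self.server_of_bucket = {
            b: servers[b % len(servers)] for b in range(bucket_count)
        }

    def server_for(self, key):
        return self.server_of_bucket[bucket_for(key, self.bucket_count)]

    def move_bucket(self, bucket, server):
        # Rebalancing moves whole buckets; keys are never rehashed.
        self.server_of_bucket[bucket] = server
```

Keeping more buckets than servers means a new server can take over whole buckets without changing where any key hashes to.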
41. Universal Truths of Scaling Databases
Vertical Scalability is Helpful for Every Pattern
Even in a horizontally scaled, fully distributed database, the number of nodes needed is affected by vertical scalability
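The relationship is plain division. As a minimal sketch, assuming load spreads evenly and each node has a fixed capacity (both simplifications, and the function name is hypothetical):

```python
import math


def nodes_needed(total_load, capacity_per_node):
    """Number of nodes required to carry a load at a given per-node capacity."""
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    return math.ceil(total_load / capacity_per_node)
```

For a load of 10000 queries per second, nodes that each handle 1000 require 10 nodes, while nodes that each handle 3000 require only 4.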
42. Universal Truths of Scaling Databases
New Nodes Are Never Free
• Add points of failure
• Add management costs
• Add complexity to architecture
• Add complexity to your app code
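The "points of failure" cost can be put in numbers. The sketch below assumes nodes fail independently and that the system needs every node up to be fully functional; real deployments may be better or worse than this, and the function name is made up for the example.

```python
def all_nodes_up_probability(per_node_availability, node_count):
    """Probability that every node is up, assuming independent failures."""
    if not 0.0 <= per_node_availability <= 1.0:
        raise ValueError("per_node_availability must be between 0 and 1")
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return per_node_availability ** node_count
```

At 99% availability per node, ten nodes are all up only about 90.4% of the time under these assumptions.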
43. MyFirstDB
↓
Vertical Partitioning
↓
Vertical Scaling
↓
Read Slaves
↓
Horizontal Partitioning