6. OmniTI
• Helping customers navigate explosive growth with technology.
100MM+ users
$1B+ gross online sales
Open and closed source thought leaders, experts and authors
Wednesday, September 18, 13
13. What is Scalability?
A service is said to be scalable if increasing the resources in a system increases performance in proportion to the resources added.
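One way to make this definition concrete is to compare the growth in throughput with the growth in resources. The sketch below is illustrative only: the function name `scaling_efficiency` and the sample numbers are assumptions, not measurements from the talk.

```python
def scaling_efficiency(base_resources, base_throughput, new_resources, new_throughput):
    """Ratio of throughput growth to resource growth; 1.0 means proportional scaling."""
    resource_factor = new_resources / base_resources
    throughput_factor = new_throughput / base_throughput
    return throughput_factor / resource_factor


# Hypothetical numbers: growing from 2 servers to 4.
linear = scaling_efficiency(2, 1000, 4, 2000)     # throughput also doubles
sublinear = scaling_efficiency(2, 1000, 4, 1500)  # throughput grows only 1.5x
print(linear, sublinear)
```

A value near 1.0 matches the definition above; a value well below 1.0 suggests that added resources are being lost to contention or coordination.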
23. Why scale databases?
• Provide better performance for existing users
• Store a larger volume of data
• Improve system availability
• Geographic dispersion
• Support a higher volume of users
65. Horizontal Scaling (Scale Out)
• Pros
  • Cheaper in hardware cost
  • Flexibility
  • Higher fault tolerance
• Cons
  • Complex to implement
  • Expensive to maintain
  • Bigger footprint in the data center
  • No built-in support in databases
119. Optimize Queries /Explain Analyze
explain (analyze, buffers) select col1, col2 from demo_ios where col2 between 0.01 and 0.02;
QUERY PLAN
----------------------------------------------------------------
Index Only Scan using idx_demo_ios on demo_ios (cost=0.00..35330.93 rows=993633 width=16) (actual time=58.100..3250.589 rows=1000392 loops=1)
  Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
  Heap Fetches: 0
  Buffers: shared hit=923073 read=3848
Total runtime: 4297.405 ms
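Two quick checks can be derived from the plan output above: how many buffers were served from shared_buffers, and how close the planner's row estimate was. The arithmetic below is only a sketch of how one might read those figures; the variable names are illustrative.

```python
# Figures copied from the plan output above.
shared_hit = 923073      # buffers found in shared_buffers
shared_read = 3848       # buffers read from outside shared_buffers (OS cache or disk)
estimated_rows = 993633  # planner estimate
actual_rows = 1000392    # rows actually returned

hit_ratio = shared_hit / (shared_hit + shared_read)
estimate_error = abs(actual_rows - estimated_rows) / actual_rows

print(f"buffer hit ratio: {hit_ratio:.2%}")
print(f"row estimate error: {estimate_error:.2%}")
```

Here the hit ratio is above 99% and the row estimate is off by less than 1%, so neither cache misses nor stale statistics look like the main cost; the runtime is dominated by the roughly one million rows the query returns.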
134. Partitioning
• As a table grows, queries eventually slow down, even with indexing
• Allows data to be added, removed, and queried quickly
• Partition pruning: queries scan only the relevant partitions
• Manage partitions
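The points above can be modeled in a few lines. This is plain Python that mimics the idea, not Postgres syntax: the monthly partition scheme and all function names are assumptions made for illustration.

```python
from datetime import date

# Hypothetical monthly partitions, keyed by (year, month).
partitions = {}


def partition_key(day):
    return (day.year, day.month)


def insert(day, row):
    """Route a row to the partition that covers its date."""
    partitions.setdefault(partition_key(day), []).append((day, row))


def query(start, end):
    """Scan only partitions that can contain dates in [start, end]."""
    lo, hi = partition_key(start), partition_key(end)
    scanned = [key for key in sorted(partitions) if lo <= key <= hi]
    rows = [row for key in scanned
            for day, row in partitions[key] if start <= day <= end]
    return rows, scanned


def drop_partition(key):
    """Remove a whole partition at once instead of deleting row by row."""
    partitions.pop(key, None)


insert(date(2013, 7, 4), "july order")
insert(date(2013, 8, 15), "august order")
insert(date(2013, 9, 18), "september order")

rows, scanned = query(date(2013, 9, 1), date(2013, 9, 30))
drop_partition((2013, 7))
```

The query touches one partition out of three, which is the effect of pruning, and old data leaves by dropping a partition rather than by a large delete.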
153. Partitioning /functional
[Diagram: Configuration Data, Transaction Data, and Session Data stores, used by Configuration Tools, Reporting Tools, Monitoring Tools, Web Applications, and Other Applications]
• Partition data based on functionality
• Separate Postgres clusters
• Start with separate schemas
• No relationships between data
• Helps spread the load across servers
• Less complex compared to sharding!
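On the application side, functional partitioning can be as simple as a lookup from functional area to cluster. The host names, database names, and the `dsn_for` helper below are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical connection strings, one Postgres cluster per functional area.
CLUSTERS = {
    "configuration": "host=config-db dbname=config",
    "transactions": "host=txn-db dbname=txn",
    "sessions": "host=session-db dbname=sessions",
}


def dsn_for(functional_area):
    """Return the connection string for the cluster that owns this data."""
    try:
        return CLUSTERS[functional_area]
    except KeyError:
        raise ValueError(f"unknown functional area: {functional_area!r}") from None


session_dsn = dsn_for("sessions")
```

Because the mapping is static and there are no relationships across areas, no per-row routing logic is needed, which is why this tends to be simpler than sharding.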
165. pgbouncer
• A lightweight connection pooler
• Reduces the number of newly created connections on the DB server
• Abstracts the databases from the application
• Helps implement smooth and easy failover
• Connection pooling options
  • Session, transaction, and statement pooling
• Beware! Transaction pooling doesn’t support session-level prepared statements
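The effect of pooling on connection creation can be shown with a toy pool. This is not how pgbouncer is implemented: `FakeConnection` and `Pool` are hypothetical stand-ins, and a real pooler queues waiting clients rather than raising an error when the pool is full.

```python
import queue


class FakeConnection:
    """Stand-in for a real database connection (hypothetical)."""
    created = 0

    def __init__(self):
        FakeConnection.created += 1


class Pool:
    """Hands out idle connections and opens new ones only up to max_size."""

    def __init__(self, max_size):
        self._idle = queue.LifoQueue()
        self._max_size = max_size
        self._opened = 0

    def acquire(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection
        except queue.Empty:
            if self._opened >= self._max_size:
                raise RuntimeError("pool exhausted")
            self._opened += 1
            return FakeConnection()

    def release(self, conn):
        self._idle.put(conn)


pool = Pool(max_size=2)
for _ in range(100):      # 100 client "transactions"
    conn = pool.acquire()
    pool.release(conn)    # handed back when the transaction ends
```

One hundred sequential transactions are served by a single server connection, which is the saving a pooler provides when clients connect and disconnect frequently.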
175. Caching
• Memcached
  • Open source, high-performance distributed memory object caching system
  • Speeds up dynamic web applications by alleviating database load
  • An in-memory key-value store for small chunks of arbitrary data
• Redis
  • Open source, advanced key-value store
  • Works with an in-memory and persistent dataset
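A common way to use such a cache is the cache-aside pattern: read from the cache first and query the database only on a miss. In the sketch below a plain dict stands in for Memcached or Redis, and `query_database` and `get_user` are hypothetical names; no real client library is used.

```python
import time

cache = {}                  # stand-in for Memcached or Redis (hypothetical)
stats = {"db_queries": 0}   # counts trips to the "database"


def query_database(user_id):
    """Placeholder for an expensive database query."""
    stats["db_queries"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(user_id, ttl_seconds=60):
    """Cache-aside read: try the cache first, fall back to the database."""
    entry = cache.get(user_id)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0]                        # cache hit
    value = query_database(user_id)            # cache miss
    cache[user_id] = (value, time.monotonic() + ttl_seconds)
    return value


first = get_user(42)
second = get_user(42)
```

The second call is answered from the cache, so the database sees one query instead of two. The expiry time bounds how stale a cached value can get.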
217. Replication /built-in
• postgres_fdw
  • Postgres 9.3 feature
  • Allows access to data stored in external PostgreSQL servers
  • Cross-version queries
    • Postgres 9.3 can query Postgres 9.1
  • Applications
    • Run queries remotely on a slave DB
    • Data warehouse data refreshes
[Diagram: Read Slave 1 and the DW System, connected through postgres_fdw]
241. Sharding
• Sharding is the process of splitting up your data so it resides in different tables or, often, different physical databases.
• Application-aware sharding
• Application-transparent sharding
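In application-aware sharding, the application itself decides which database holds a given key. A minimal sketch using a stable hash is shown below; the shard names and the `shard_for` helper are assumptions for illustration, not something the talk prescribes.

```python
import hashlib

# Hypothetical shard names; each would be a separate physical database.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key):
    """Map a key to a shard with a stable hash (Python's hash() is salted per process)."""
    digest = hashlib.sha1(str(key).encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


placement = {user_id: shard_for(user_id) for user_id in range(1000)}
```

Simple modulo placement remaps most keys when the number of shards changes; a lookup table or consistent hashing are common alternatives when shards are expected to be added.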
270. Obstacles for Scaling Postgres
• Postgres table bloat
• FK relationships
• Insufficient logging
• Insufficient caching
• Insufficient monitoring and metrics
• ORMs
• Single point of failure
• Lack of communication between teams
281. Beyond Postgres
• Avoid serialization in application code
• Feature flags
• Browse-only mode (read-only mode)
• Don’t use the database for queuing
  • RabbitMQ
• Reconsider options for full-text search
  • tsearch provided by Postgres
  • Solr, Lucene
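Feature flags and a browse-only mode can be combined: a single flag turns off writes while reads keep working. The sketch below assumes an in-process dict as the flag store; the flag names, `ReadOnlyError`, and `save_order` are hypothetical.

```python
# Hypothetical flag store; in practice this might live in a config service.
flags = {"read_only_mode": False, "recommendations": True}


class ReadOnlyError(Exception):
    """Raised when a write is attempted while the site is in browse-only mode."""


def is_enabled(name):
    return flags.get(name, False)


def save_order(orders, order):
    """Reject writes while browse-only mode is on; reads keep working."""
    if is_enabled("read_only_mode"):
        raise ReadOnlyError("site is in browse-only mode")
    orders.append(order)


orders = []
save_order(orders, "order-1")
flags["read_only_mode"] = True   # e.g. during a failover or maintenance
try:
    save_order(orders, "order-2")
    rejected = False
except ReadOnlyError:
    rejected = True
```

With the flag on, the site can keep serving reads from a replica while the primary is unavailable, instead of going down entirely.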
284. Further reading . . .
• Scalable Internet Architectures - Theo Schlossnagle
• Web Operations: Keeping the Data On Time - John Allspaw, Jesse Robbins
• PostgreSQL 9.0 High Performance - Greg Smith