3. Special Thanks
Jonas Bonér
twitter: @jboner
http://jonasboner.com/
http://www.slideshare.net/jboner/scalability-availability-stability-patterns
4. Background
• Scalable Apps maintain performance under load
  • More requests, More users, More data
• Available Apps maintain the experience during failures
  • Hardware failures, Network splits/partitioning
• Simple Designs tend to scale better
11. Background
Master the Tradeoffs (For your app and your data!)
• Good Performance is good
• Predictably Good Performance is king!
• Measure everything (can't fix what you don't know)
• Understand your data
• Understand your user experience
• Don't be a failure of your own success
23. Know what to scale!
• CPU or IO Bound?
• Scale up or Scale out?
• Waiting on IO? What? Disk/Net/Other System?
• How many components are used per request?
• Know who and what the slowest will be!
35. Asynchronous and Non-Blocking
• Don't wait, go do something else
• Never block
• All callbacks all the time can get messy!
• Good language/framework support
  • functional closures
  • co-routines
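As a minimal sketch of "don't wait, go do something else" with co-routines, the example below uses Python's `asyncio`. The `fetch` function, its names, and its delays are hypothetical stand-ins for real network or disk calls.

```python
import asyncio


async def fetch(name: str, delay_s: float) -> str:
    # asyncio.sleep stands in for a network or disk wait. It yields control
    # to the event loop instead of blocking the thread.
    await asyncio.sleep(delay_s)
    return f"{name} done"


async def handle_request() -> list:
    # All three waits are in flight at the same time.
    # gather returns the results in call order, not completion order.
    return await asyncio.gather(
        fetch("db", 0.05),
        fetch("cache", 0.01),
        fetch("search", 0.03),
    )


def run_demo() -> list:
    return asyncio.run(handle_request())
```

Co-routines keep the code in straight-line form, which avoids the nesting that "all callbacks all the time" tends to produce.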
36. Load Balancing
• Multiple endpoints to perform work
• Can be semantically aware
• Chainable: DNS, hardware, software
• Endpoints can be Hardware, VM, process, thread, co-routine, fiber, etc.
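The simplest software balancer just rotates through its endpoints. The sketch below assumes endpoints are plain callables; `RoundRobinBalancer` and `make_worker` are hypothetical names used only for illustration.

```python
import itertools
from typing import Callable, Sequence


class RoundRobinBalancer:
    """Hands each request to the next endpoint in a fixed rotation."""

    def __init__(self, endpoints: Sequence[Callable[[str], str]]) -> None:
        if not endpoints:
            raise ValueError("need at least one endpoint")
        self._cycle = itertools.cycle(endpoints)

    def dispatch(self, request: str) -> str:
        endpoint = next(self._cycle)
        return endpoint(request)


def make_worker(name: str) -> Callable[[str], str]:
    # Hypothetical endpoint. In practice this could be a VM, a process,
    # a thread, or a co-routine behind the same call interface.
    def worker(request: str) -> str:
        return f"{name}:{request}"

    return worker


balancer = RoundRobinBalancer([make_worker("a"), make_worker("b")])
replies = [balancer.dispatch(r) for r in ["r1", "r2", "r3"]]
```

A semantically aware balancer would replace the rotation with a choice based on the request itself, for example routing by user id.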
40. Parallel Computing
• Divide and Conquer
• Worker queues
• Map Reduce
• UE = Unit of Execution
  • VM, process, thread, co-routine, fiber, callback
41. Parallel Computing
Worker Queues
• Good for offloading tasks
• Need bounded time check in master
• Async result processing
• Fork/Join pattern
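A hedged sketch of fork/join with a bounded time check, using the standard `concurrent.futures` thread pool as the worker queue. `run_task`, `fork_join`, and the task names are assumptions made for this example.

```python
import concurrent.futures
import time


def run_task(duration_s: float) -> str:
    # Hypothetical offloaded task; sleep stands in for real work.
    time.sleep(duration_s)
    return f"finished after {duration_s}s"


def fork_join(tasks: dict, timeout_s: float):
    """Fork every task to the pool, then join with a bounded wait per task."""
    results, timed_out = {}, []
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        # Fork: submit returns immediately with a future for each task.
        futures = {name: pool.submit(run_task, d) for name, d in tasks.items()}
        # Join: the master waits at most timeout_s on each future in turn.
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout_s)
            except concurrent.futures.TimeoutError:
                timed_out.append(name)
    # Note: leaving the "with" block still waits for slow tasks to finish;
    # the timeout only bounds how long the master waits for each result.
    return results, timed_out


results, timed_out = fork_join({"fast": 0.0, "slow": 1.0}, timeout_s=0.2)
```

The master decides what to do with `timed_out` (retry, degrade, or report); a worker that never checks in should not stall the whole request.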
42. Parallel Computing
MapReduce
• Used internally at Google
• Variation of Fork and Join
• Distributed
• Originally used for log processing
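To make the fork/join shape of MapReduce concrete, here is a single-machine sketch that counts HTTP status codes in log lines. It is not Google's implementation: the log format ("METHOD PATH STATUS"), the function names, and the use of a thread pool in place of distributed workers are all assumptions.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_line(line: str):
    # Assumed log format: "METHOD PATH STATUS" (non-empty lines only).
    # Emit one (status, 1) pair per line.
    status = line.split()[-1]
    return [(status, 1)]


def shuffle(mapped):
    # Group every emitted value under its key.
    grouped = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    return key, sum(values)


def map_reduce(lines):
    with ThreadPoolExecutor(max_workers=4) as pool:
        mapped = list(pool.map(map_line, lines))  # fork: map phase
        grouped = shuffle(mapped)
        reduced = list(pool.map(lambda kv: reduce_counts(*kv), grouped.items()))
    return dict(reduced)  # join: one merged result


counts = map_reduce(["GET / 200", "GET /missing 404", "POST /login 200"])
```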
43. Parallel Computing
MapReduce
• Google's MapReduce
• Hadoop
• Amazon's Elastic MapReduce
• Riak uses it internally for queries
48. Master Record: Scaling
• Traditionally Scale Up
• Technology will help here
  • SSD (50k-100k IOPS)
  • More memory/cores per box
  • Faster network connectivity
  • Clustering Appliances
60. Sharding: Over-provision
• Use N partitions
• Use Y replicas
• Use message based requests
• First back wins
• Therefore user wins (Google Search)
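The "first back wins" idea can be sketched with `asyncio`: send the same request to every replica of the key's partition and return whichever answer arrives first. The partition count, the CRC-based partition function, and the simulated replica delays are all assumptions for illustration.

```python
import asyncio
import zlib

N_PARTITIONS = 4  # assumed value of N


def partition_for(key: str) -> int:
    # A stable hash keeps a key on the same partition across processes.
    return zlib.crc32(key.encode("utf-8")) % N_PARTITIONS


async def query_replica(partition: int, replica: int, key: str, delay_s: float) -> str:
    # Hypothetical replica; the sleep simulates its response time.
    await asyncio.sleep(delay_s)
    return f"p{partition}/r{replica}:{key}"


async def first_back_wins(key: str, replica_delays: list) -> str:
    """Query all Y replicas (one per entry in replica_delays) and keep the first reply."""
    partition = partition_for(key)
    tasks = [
        asyncio.create_task(query_replica(partition, r, key, d))
        for r, d in enumerate(replica_delays)
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the slower replicas' answers are no longer needed
    await asyncio.gather(*pending, return_exceptions=True)
    return done.pop().result()


winner = asyncio.run(first_back_wins("user:42", [0.2, 0.01, 0.3]))
```

The cost is extra work on the replicas that lose the race, which is why this pattern goes together with over-provisioning.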
66. NoSQL in the wild
• Google: Bigtable, Colossus
• Twitter: Redis
• Amazon: Dynamo, SimpleDB
• Yahoo: HBase (Hadoop)
• Facebook: Cassandra, HBase
67. Caching
• Cache early and often
• Usually biggest bang for the buck
• Referential Transparency
• Polyglot APIs coming
• NoSQL stores
• Cache invalidation is still hard!
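Referential transparency is what makes a result safe to cache: the same inputs always give the same output, so a stored answer never goes stale. A minimal sketch with the standard `functools.lru_cache`; `shipping_cost` and its rates are invented for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def shipping_cost(weight_kg: int, zone: str) -> float:
    # Pure function: the result depends only on the arguments, so the
    # cached value is always as good as a fresh computation.
    base = {"domestic": 5.0, "international": 20.0}[zone]
    return base + 1.5 * weight_kg


first = shipping_cost(2, "domestic")   # computed
second = shipping_cost(2, "domestic")  # served from the cache
stats = shipping_cost.cache_info()
```

A function that reads a database row or the clock is not referentially transparent, and caching it brings back the invalidation problem.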
72. HTTP Caching
• Lives in browsers, proxies, CDNs, apps
• Hard to control, so do it right!
• Master page controls other resources
  • master page not cached (at least not for too long)
  • read-only resources
  • change link in master page
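One common way to apply "read-only resources, change link in master page" is to put a content hash in each resource URL. The sketch below assumes that scheme; `versioned_url`, `cache_headers`, and the specific `Cache-Control` values are illustrative choices, not taken from the slides.

```python
import hashlib


def versioned_url(path: str, content: bytes) -> str:
    """Embed a content hash in the file name so new content gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, dot, ext = path.rpartition(".")
    if not dot:
        return f"{path}.{digest}"  # no extension: append the hash
    return f"{stem}.{digest}.{ext}"


def cache_headers(is_master_page: bool) -> dict:
    if is_master_page:
        # The master page must be revalidated so that changed links are seen.
        return {"Cache-Control": "no-cache"}
    # Versioned resources never change under a given URL: cache for a year.
    return {"Cache-Control": "public, max-age=31536000, immutable"}


old_url = versioned_url("static/app.css", b"body{}")
new_url = versioned_url("static/app.css", b"body{color:red}")
```

Because the old URL is simply never referenced again, no cache along the path has to be told to drop anything.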
78. Cache Invalidation
• TTL (Time to Live)
• Bounded FIFO or LIFO
• Explicit cache invalidation
• Explicit non-use of read-only resource
• Harder problem the more master items are used
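Three of the strategies above (TTL, bounded FIFO, explicit invalidation) fit in one small class. This is a sketch under stated assumptions: `TTLCache` is a hypothetical name, it is not thread-safe, and the injectable `clock` parameter exists only to make expiry testable.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Bounded FIFO cache whose entries also expire after ttl_s seconds."""

    def __init__(self, max_items: int, ttl_s: float, clock=time.monotonic) -> None:
        self._max_items = max_items
        self._ttl_s = ttl_s
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value), oldest first

    def put(self, key, value) -> None:
        self._items.pop(key, None)  # re-inserting moves the key to the back
        self._items[key] = (self._clock() + self._ttl_s, value)
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)  # bounded FIFO: evict the oldest

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # TTL expiry
            return default
        return value

    def invalidate(self, key) -> None:
        self._items.pop(key, None)  # explicit invalidation


now = [0.0]  # fake clock for the demo
cache = TTLCache(max_items=2, ttl_s=10.0, clock=lambda: now[0])
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    cache.put(k, v)
```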
79. Scalability Key Points
• The problem is not where you think ;)
• Autoscaling is a myth
• Can't fix what you can't measure
• Scaling master record writes is hard
• Scaling reads is more tractable
• What is the opex cost of your choices?
94. You can only pick 2
Consistency
Availability
Partition Tolerance
95. Centralized Systems
• If the system is centralized
  • no P (network partitions)
• So you get both:
  • Availability
  • Consistency
96. Distributed Systems
• If the system is distributed
  • you will have P! (network partitions)
• So you get to pick one:
  • Availability
  • Consistency
97. CAP in reality
• There is only one choice to make:
• When there is a network partition, which do you sacrifice?
  • Availability
  • Consistency
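The choice can be shown with a toy replica that is configured to sacrifice one property while partitioned. This is a deliberately simplified model, not a real replication protocol; `Replica`, `Unavailable`, and the "CP"/"AP" mode strings are assumptions for this sketch.

```python
class Unavailable(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


class Replica:
    def __init__(self, mode: str, value: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.value = value
        self.partitioned = False  # True while cut off from the other replicas

    def read(self) -> str:
        if self.partitioned and self.mode == "CP":
            # Sacrifice availability: refuse rather than risk a stale answer.
            raise Unavailable("cannot confirm the latest value")
        # Sacrifice consistency: answer with whatever this replica has,
        # which may be out of date while the partition lasts.
        return self.value


ap = Replica("AP", "v1")
cp = Replica("CP", "v1")
ap.partitioned = cp.partitioned = True
ap_answer = ap.read()  # possibly stale, but the caller gets a reply
```

With no partition both replicas behave identically, which is the point of the slide: the trade-off only appears when the network splits.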
101. Eventually Consistent
• Great tradeoff for the right kind of data
• Can't be used everywhere
• Works in more places than you think
• Solved speed of light problem
108. Availability Key Points
• Always have a dial tone
• Syntactically correct is good
• Semantically correct is better
• Be transparent
109. Background
Beating the dead horse
• Good Performance is good
• Predictably Good Performance is king!
• Measure everything (can't fix what you don't know)
• Understand your data
• Understand your user experience
• Don't be a failure of your own success
110. Background
Understand your data!
• Good Performance is good
• Predictably Good Performance is king!
• Measure everything (can't fix what you don't know)
• Understand your data
• Understand your user experience
• Don't be a failure of your own success
111. Background
Understand your user!
• Good Performance is good
• Predictably Good Performance is king!
• Measure everything (can't fix what you don't know)
• Understand your data
• Understand your user experience
• Don't be a failure of your own success
112. Background
Understand the experience!
• Good Performance is good
• Predictably Good Performance is king!
• Measure everything (can't fix what you don't know)
• Understand your data
• Understand your user experience
• Don't be a failure of your own success
113. Background
Master the Tradeoffs (For your app and your data!)
• Good Performance is good
• Predictably Good Performance is king!
• Measure everything (can't fix what you don't know)
• Understand your data
• Understand your user experience
• Don't be a failure of your own success