The Economies of Scaling Software
Abdelmonaim Remani
@PolymathicCoder
Creative Commons Attribution Non-Commercial License 3.0 Unported

JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software

  1. 1. The Economies of Scaling Software Abdelmonaim Remani @PolymathicCoder
  2. 2. Creative Commons Attribution Non-Commercial License 3.0 Unported. The graphics and logos in this presentation belong to their rightful owners.
  3. 3. About Me • Platform Architect at just.me Inc. • JavaOne RockStar and frequent speaker at many developer events and conferences including JavaOne, JAX, OSCON, OREDEV, 33rd Degree, etc... • Open-source advocate and contributor • Active community member • The NorCal Java User Group • The Silicon Valley Dart Meetup • Bio: http://about.me/PolymathicCoder • Twitter: @PolymathicCoder • Email: abdelmonaim.remani@gmail.com • SlideShare: http://www.slideshare.net/PolymathicCoder/ | @PolymathicCoder
  4. 4. http://speakerscore.com/jazoon-scalability Follow @PolymathicCoder
  5. 5. The Title of the Talk • The Economies of Scale • “In microeconomics, economies of scale are the cost advantages that enterprises obtain due to size [...] often operational efficiency is [...] greater with increasing scale [...]” (Wikipedia)
  6. 6. Let’s Go!
  7. 7. Blurred Lines… • Only the enterprise used to worry about scalability • The rise of social and the abundance of mobile • An exponential growth of internet traffic • The creation of a spoiled user-base • “I want to see the closest Moroccan restaurants to my current location on a map, along with consumer ratings and whether any of my friends have checked in within the last 30 days” • The lines are blurred between consumer applications and enterprise applications
  8. 8. The Bar Is Higher! Scalability is everyone’s problem…
  9. 9. What is Scalability?
  10. 10. The Common Definition • The ability of an application to handle an increasing amount of work without performance degradation • Not a good definition! It implies: • You’ll need to scale forever • But scalability is relative; it is bound by one’s specific needs • You’ll need to be fully scalable from day one • But scalability is evolutionary; it is a gradual process • There are no external constraints • Unrealistic
  11. 11. A Better Definition • The ability of an application to gracefully evolve within the constraints of its ecosystem in order to handle the maximum potential amount of work without performance degradation • Work? • Simultaneous requests • Performance degradation? • Increased latency or decreased throughput
  12. 12. A Black Art! • Don’t be surprised if • Your application supports one million users • You add one more feature • A load of 500,000 users crashes your system or renders it unusable
  13. 13. Latency Is Your Enemy
  14. 14. Syllogismo • To scale is to reduce latency • To reduce latency is to address bottlenecks • Therefore, to scale is to address bottlenecks • The usual suspects • The CPU • The Storage I/O • The Network I/O • All inter-related
  15. 15. Overcoming The CPU Bottleneck
  16. 16. Overcoming the CPU Bottleneck • Nothing affects the CPU more than the instructions it is summoned to execute • This is about your application • How it is written (Architecture, code base, etc…) • How it is deployed
  17. 17. A Scalable Architecture
  18. 18. Architecture? • “Things that people perceive as hard-to-change” -Martin Fowler • http://martinfowler.com/ieeeSoftware/whoNeedsArchitect.pdf • Decisions you commit to; the ones you will be stuck with forever
  19. 19. Be Wise… Think Twice… • Choose the right technologies • Platform • Languages • Frameworks • Libraries • Make the right abstractions • Loosely-coupled components • Functional abstractions • Technical abstractions • Make sure that the latter is subordinate to the former and not the other way around
  20. 20. Write Good Code
  21. 21. Write Good Code • Think your algorithms through and mind their complexity (Asymptotic Complexity, Cyclomatic Complexity, etc…) • SOLIDify your design • Single Responsibility, Open-Closed, Liskov Substitution, Interface Segregation, and Dependency Inversion • Understand the limitations of your technology and leverage its strengths
  22. 22. Quality… Quality… Quality! • Obsess over testing • TDD/BDD • Tools • Static code analyzers (PMD, FindBugs, etc…) • Profilers (Detect memory leaks, bottlenecks, etc…) • Etc…
  23. 23. Know Thy S#!t • Read • The Classics (The Mythical Man-Month, etc…) • GoF’s “Design Patterns” • Eric Evans’ “Domain-Driven Design” • Every book by Martin Fowler • Uncle Bob’s “Clean Code” • Josh Bloch’s “Effective Java” • Brian Goetz’s “Java Concurrency in Practice” • Tech Papers/Blogs • Etc...
  24. 24. The Inevitable
  25. 25. You do all that… You’ll end up with… At best… The fading tradition of making cow dung piles http://news.ukpha.org/2011/01/the-fading-tradition-of-making-cow-dung-piles/
  26. 26. Still better than…
  27. 27. Technical Debt • What is it? • The quick-and-dirty you are not proud of • What you would have done differently had you had the time • It’s a matter of time before it starts to smell really bad • What to do? • The fact that you recognize it as debt is a good thing in itself • Keep tabs and refactor often • Cut the right corners • Don’t mortgage architecture (Don’t lock yourself out)
  28. 28. Write Code That Scales Up
  29. 29. Vertical Scaling • Vertical Scaling (Scaling Up) • On a single-node system • Adding more computing resources to the node (Getting a beefier machine) • Writing code to harness the full power of the one node
  30. 30. Parallelism At The Node Level • Writing concurrent code: units of code that execute simultaneously • Simple business logic within containers is already multithreaded • Executing complex business logic within a reasonable time • Break it into smaller steps • Execute them in parallel • Aggregate data back
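The break/execute/aggregate steps on the slide above can be sketched as follows. This is a minimal illustration and not code from the talk: the function names and the sum-of-squares workload are hypothetical, and a thread pool is used only for brevity (in CPython, CPU-bound work would usually need a process pool to run on several cores).

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Break the work into smaller steps: consecutive slices of `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """One independent step; it shares no mutable state with the other steps."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, chunk_size=1000, workers=4):
    """Hypothetical example workload: split, run the steps in a pool, aggregate."""
    chunks = chunked(items, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))  # execute in parallel
    return sum(partials)  # aggregate the partial results back
```

Because each step only reads its own chunk, no locking is needed; the only shared point is the final aggregation.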
  31. 31. Easier Said Than Done… • Moore’s Law • Performance gain is automatically realized by software (Code is faster on faster hardware) • Nothing is forever… • The era of the multi-core chip • We need to write code to take advantage of all cores
  32. 32. Easier Said Than Done… • Synchronize state across threads across multiple cores • Good luck! • Rely on frameworks and libraries (Fork/Join, Akka, etc…) • Go immutable • Not always straightforward or possible • Go functional (Scala, Clojure, etc…)
  33. 33. It Gets More Interesting… • Amdahl’s Law • Throwing in more cores does not necessarily result in a performance gain • Diminishing returns at some point no matter how many cores you throw in
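Amdahl's Law can be made concrete with a few lines of arithmetic. The sketch below uses the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of cores; the function name is an assumption made for illustration.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Theoretical speedup for a workload whose `parallel_fraction` (0..1)
    can be spread over `cores`; the remainder stays serial."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# With 90% of the work parallelizable, the speedup can never reach 10x,
# however many cores are added: the serial 10% dominates.
for n in (2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```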
  34. 34. Miscellaneous • Leverage Probabilistic data structures and algorithms • Bloom Filters, Quotient filters, etc… • Go Reactive • http://www.reactivemanifesto.org/ • RxJava, Spring Reactor, etc…
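As an illustration of the probabilistic data structures mentioned above, here is a minimal Bloom filter sketch. The class name, the default sizes, and the choice of seeded SHA-256 hashes are assumptions made for readability, not tuned values. A Bloom filter may report false positives but never false negatives.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: answers "definitely not present" or "possibly present"."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means the item was never added; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(item))
```

A typical use is to skip an expensive lookup (disk, network) whenever `might_contain` returns False.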
  35. 35. Write Code That Scales Out
  36. 36. Horizontal Scaling • Horizontal Scaling (Scaling Out) • On a distributed system (A cluster) • Adding more nodes • Writing code to harness the full power of the cluster
  37. 37. Topology • A typical cluster consists of • A number of identical application server nodes behind a load balancer
  38. 38. Topology • A typical cluster consists of • A number of identical application server nodes behind a load balancer • A number? • It depends on how many you actually need and can afford • Elastic Scaling / Auto-Scaling • The number of live nodes within the cluster shrinks and grows depending on the load • New ones are provisioned or terminated as needed
  39. 39. Topology • A typical cluster consists of • A number of identical application server nodes behind a load balancer • Identical? • Application nodes are cloned off of image files (Ex. AWS EC2 AMIs, etc...) • Configuration management tools (Chef, Puppet, Salt, etc...)
  40. 40. Topology • A typical cluster consists of • A number of identical application server nodes behind a load balancer • Load balancer? • Load is evenly distributed across live nodes according to some algorithm (Round-Robin typically)
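Round-robin distribution, as named on the slide above, can be sketched in a few lines. The class and the node names are hypothetical; a real load balancer would also track node health and remove dead nodes from the rotation.

```python
import itertools


class RoundRobinBalancer:
    """Hands out nodes in a fixed rotation, one per incoming request."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._cycle = itertools.cycle(nodes)

    def next_node(self):
        return next(self._cycle)


# Hypothetical node names; five requests over three nodes wrap around.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
print([balancer.next_node() for _ in range(5)])
# ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```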
  41. 41. Managing State • Session data • Session Replication • Session Affinity / Sticky Session • Requests from the same client are routed to the same node • When the node dies, the session data dies with it • Shared Session / Distributed Session • Session data is in a “centralized” location • Go Stateless • No session data (Any node would do)
  42. 42. Parallelism At The Cluster Level • Leverage Map/Reduce • “A programming model for processing large data sets with a parallel, distributed algorithm on a cluster” • Apache Hadoop
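The Map/Reduce programming model can be illustrated with the classic word count, run here in a single process. This is only a sketch of the model under that simplifying assumption: in a framework such as Apache Hadoop the map and reduce calls would run on different nodes and the shuffle would move data across the network. All function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Each `map_phase` call depends on one document only and each `reduce_phase` call on one key only, which is what lets a cluster run them in parallel.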
  43. 43. Miscellaneous • How to handle HTTPS? • Terminate it at the load balancer • Wildcard SSL certificate • Distributed Lock Manager (DLM) • Synchronize access to shared resources • Google Chubby, Apache ZooKeeper, etc… • Distributed Transactions • X/Open XA
  44. 44. Deployment
  45. 45. Deployment • Multiple Environments • Development, Test, Stage, and Production • Automatic Configuration Management • Practice Continuous Delivery • Leverage The Cloud • IaaS, PaaS, SaaS, and NaaS
  46. 46. Overcoming The Storage I/O Bottleneck
  47. 47. The Storage I/O Bottleneck • Storage I/O is usually the most significant bottleneck
  48. 48. The Persistent Datastore
  49. 49. What Datastore to Use? • Relational of course! • Normalized schema guaranteeing data integrity • ACID Transactions • Not biased towards specific access patterns • Flexible query language • As datasets grow • Scale up (Buy beefier machines) • Database tuning / query optimization • Create materialized views • De-normalize • Etc…
  50. 50. Mucho Data! • No other choice but scaling out the RDBMS • Master/Slave clusters • Sharding • Failed big time! • RDBMS is designed to run on one machine • Eric Brewer’s CAP Theorem of distributed systems • Pick 2 out of 3: Consistency, Availability, and Partition Tolerance • The relational model is designed to favor CA, hence it can never support P
  51. 51. NoSQL • A wide range of specialized datastores with the goal of addressing the challenges of the relational model • “The whole point of seeking alternatives is that you need to solve a problem that relational databases are a bad fit for” –Eric Evans • A wide variety • Key-Value Datastores • Columnar Datastores • Document Datastores • Graph Datastores
  52. 52. Polyglot Persistence • Within the application • Data is complex and accessed in many different ways • Why should we fit it into one storage model? • Polyglot Persistence is about • Leveraging multiple data stores based on the specific way the data is stored and accessed • For more info: • Check out my talk on YouTube from JAX Conf 2012 • “The Rise of NoSQL and Polyglot Persistence” • http://bit.ly/PCWtWi
  53. 53. Caching
  54. 54. Caching • A cache is typically a simple key-value data structure • Instead of incurring the overhead of data retrieval or computation every time, you check the cache first • You can’t cache everything; caches can be configured to use different eviction algorithms depending on the use case (LRU, LFU, Bélády's Algorithm, etc...) • Use aggressively! • What to cache? • Frequently accessed data (Session data, feeds, etc…) • Results of intensive computations
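Of the eviction algorithms listed above, LRU (least recently used) is the simplest to show. The sketch below is a hypothetical, single-threaded implementation built on `collections.OrderedDict`; production caches add locking, statistics, and memory-based limits.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded key-value cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # a read counts as a "use"
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```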
  55. 55. Caching • Where to cache? • On disk • File System: Slow and sequential access • DB: A bit better (Data is arranged in structures designed for efficient access, indexes, etc…) • Generally a terrible idea (SSDs make things a bit better) • In-Memory: Fast and random access, but volatile • Something in between: Persistent caches (Redis, etc…) • What type of cache? • Local, Replicated, Distributed, and Clustered
  56. 56. Caching • How to cache? • Most caches implement a very simple interface • Always attempt to get from cache first using a key • If it is a hit, you saved yourself the overhead • If it is a miss, compute or read from the data store then put in cache for subsequent gets • When you update you can evict stale data • You can set a TTL when you put • Many other common operations...
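The get/put/evict/TTL interface described above can be sketched as follows. `SimpleCache` and `read_through` are hypothetical names, the cache is in-memory and single-threaded, and the clock is injectable only so that expiry can be exercised without sleeping.

```python
import time


class SimpleCache:
    """Minimal in-memory cache with get, put (optional TTL), and evict."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at or None)

    def get(self, key):
        """Return (hit, value); expired entries count as misses."""
        entry = self._entries.get(key)
        if entry is None:
            return False, None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._entries[key]
            return False, None
        return True, value

    def put(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._entries[key] = (value, expires_at)

    def evict(self, key):
        self._entries.pop(key, None)


def read_through(cache, key, load, ttl=None):
    """Try the cache first; on a miss, call `load(key)` and cache the result."""
    hit, value = cache.get(key)
    if hit:
        return value  # hit: the overhead of loading is saved
    value = load(key)  # miss: compute or read from the data store
    cache.put(key, value, ttl=ttl)
    return value
```

On an update to the underlying data, calling `cache.evict(key)` drops the stale copy so that the next read reloads it.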
  57. 57. Caching Patterns • Caching Query Results • Key: Hash of the query itself • How about parameterized queries? • Key: Hash of the query itself + Hash of parameter values • Method/Function Memoization • Key: Method name • How about methods with parameters? • Key: Hash of the method name + Hash of parameter values • Caching Objects • Key: Identity of the object
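The key schemes on the slide above can be sketched like this. The helper names are hypothetical, SHA-256 and JSON serialization are arbitrary choices for the hashing step, and the memoizer assumes that all arguments are hashable and that the function has no side effects.

```python
import functools
import hashlib
import json


def _digest(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def query_cache_key(sql, params=()):
    """Key = hash of the query text + hash of its parameter values."""
    return _digest(sql) + ":" + _digest(json.dumps(list(params), default=str))


def memoize(func):
    """Cache results keyed on the function name plus its argument values."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        key = (func.__qualname__, args)
        if key not in cache:
            cache[key] = func(*args)
        return cache[key]

    wrapper.cache = cache  # exposed so callers can inspect or clear it
    return wrapper
```

Two calls with the same query text but different parameter values produce different keys, so their results never overwrite each other.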
  58. 58. Caching Patterns • Time-series datasets (Ex. Real-time feed) • Most of the time pseudo/near real-time is enough • Use caching to throttle access to resources • Cache the query result with an expiry of t • Fresh data is only read once every t
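Throttling with an expiry of t can be sketched as a decorator that re-reads the source at most once per interval. The decorator name is hypothetical, and the clock parameter exists only so the behavior can be checked without waiting.

```python
import time


def refreshed_every(t_seconds, clock=time.monotonic):
    """Decorator: call the wrapped zero-argument fetch at most once per
    `t_seconds`; in between, serve the last fetched value."""

    def decorator(fetch):
        state = {"value": None, "fetched_at": None}

        def wrapper():
            now = clock()
            stale = (state["fetched_at"] is None
                     or now - state["fetched_at"] >= t_seconds)
            if stale:
                state["value"] = fetch()  # the only place the resource is hit
                state["fetched_at"] = now
            return state["value"]

        return wrapper

    return decorator
```

However many users request the feed, the backing resource sees at most one read per interval; the cost is that readers may see data up to t seconds old.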
  59. 59. Caching Gotchas • Profile your code to assess what to cache, and whether you need to cache at all • Stale state might bite you hard • Incoherence: Inconsistent copies of objects cached with multiple keys • Stale nested aggregates • Network overhead of misses might outweigh the performance gain of hits • Consider writing/updating cache when writing/updating the persistence store
  60. 60. Featured Solutions • EhCache • Memcached • Oracle Coherence • Redis • A persistent NoSQL datastore • Built-in data structures like sets and lists • Supports intelligent keys and namespaces
  61. 61. Overcoming The Network I/O Bottleneck
  62. 62. The Network I/O Bottleneck • Network I/O can bring you down just as much
  63. 63. Asynchronous Processing
  64. 64. Asynchronous Processing • Resource-intensive tasks cannot be handled practically during an HTTP session • Synchronous processing is overused and not necessary most of the time
  65. 65. Asynchronous Processing Patterns • Pseudo-Asynchronous Processing • Flow • Process data / operations in advance • User requests data or operation • Respond synchronously with pre-processed result • Sometimes not possible (Dynamic content, etc...)
  66. 66. Asynchronous Processing Patterns • True Asynchronous Processing • Flow • User requests data or an operation • Acknowledge • Ex. A REST endpoint that returns a “202 Accepted” HTTP status code • Do the processing at your own convenience • Allow the user to check progress • Optionally notify when processing is completed
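The acknowledge-then-process flow above can be sketched with an in-memory job queue. `JobService` and its methods are hypothetical; in a deployed system the queue would be an external broker and `submit` would sit behind an HTTP endpoint that returns the 202 status. Error handling is left out for brevity.

```python
import queue
import threading
import uuid


class JobService:
    """Accepts work immediately, processes it later, and reports progress."""

    def __init__(self):
        self._jobs = {}
        self._queue = queue.Queue()
        self._lock = threading.Lock()

    def submit(self, func, *args):
        """Acknowledge: enqueue the job and return (202, job_id) right away."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self._jobs[job_id] = {"status": "pending", "result": None}
        self._queue.put((job_id, func, args))
        return 202, job_id

    def status(self, job_id):
        """Lets the user check progress."""
        with self._lock:
            return dict(self._jobs[job_id])

    def work_once(self):
        """Process one queued job; a background worker would loop on this."""
        job_id, func, args = self._queue.get()
        result = func(*args)
        with self._lock:
            self._jobs[job_id] = {"status": "done", "result": result}
        self._queue.task_done()
```

A caller would typically run `work_once` in a loop on one or more worker threads, while request handlers only call `submit` and `status`.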
  67. 67. Techniques • Leverage Job/Work/Task Queues • JMS (Java Message Service) – JSR 914 • AMQP (Advanced Message Queuing Protocol): RabbitMQ, ActiveMQ, etc… • AWS SQS • Redis Lists • Etc… • Task Scheduling • Jobs triggered periodically (Cron, Quartz, etc…) • Batch Processing
  68. 68. Content Delivery Network
  69. 69. Content Delivery Network (CDN) • Static content • Binary (Video, Audio, etc…) • Web objects (HTML, JavaScript, CSS, etc…) • Do NOT serve through your application server • Use a CDN • “A large distributed system of servers deployed in multiple data centers across the internet” • Akamai • AWS CloudFront
  70. 70. CDN Gotchas • Dirty Caches • script.js is a script file deployed on CDN • Multiple copies of script.js will be replicated across all edge nodes of the CDN • Clients/browsers will keep their own copies of script.js locally • We update script.js • Since the new and old versions have the same URI • New clients will be served the old version by the CDN • Old clients will continue to use the old version from their local cache
  71. 71. CDN Gotchas • Dirty Caches • What to do? • Simply append a version number to file names • script-v1.js, script-v2.js, etc… • Force invalidation of all copies on edge nodes • Set HTTP caching headers properly
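Appending a version number to a file name, as suggested above, can be automated at build time. The sketch below uses hypothetical helper names; `fingerprinted_name` is a variation that is not on the slide, deriving the suffix from a hash of the file content so that the name changes only when the content does.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, version):
    """script.js, version 2 -> script-v2.js (the directory part is kept)."""
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}-v{version}{p.suffix}"))


def fingerprinted_name(path, content):
    """Variation: use a short content hash instead of a manual version number."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return str(p.with_name(f"{p.stem}-{digest}{p.suffix}"))


print(versioned_name("script.js", 2))  # script-v2.js
```

Because every release gets a new URI, neither the edge nodes nor the browsers can serve a stale copy under the new name.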
  72. 72. Domain Name Service
  73. 73. Domain Name Service (DNS) • Do NOT rely on your domain name registrar’s free DNS • Use a scalable DNS solution • AWS Route 53 • DynECT • UltraDNS • Etc… • Domain Sharding • Browsers limit the number of connections per host (Max of 6 usually) • Creating multiple subdomains (CNAME entries) allows more resources to be downloaded in parallel • Watch out for: DNS lookup overhead, HTTPS cost, Browser’s Same-Origin Policy, etc…
  74. 74. Remoting
  75. 75. Remoting • In a SOA (Service Oriented Architecture) • RPC calls to multiple services • Data Exchange (Plain vs. Binary) • SOAP / REST with XML or JSON • Google Protocol Buffers, Apache Thrift, Apache Avro, etc… • Protocol • JMS • HTTP • SPDY
  76. 76. Qualifying Scalability
  77. 77. Qualifying Scalability • Instrumentation: Bake it into the code early • Monitoring • Health (Application / Infrastructure) • Key Performance Indicators (KPIs) • Number of requests handled, throughput, latency, Apdex index, etc... • Logs • Testing • Load/Stress testing
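Of the KPIs listed above, the Apdex index has a simple closed form: requests at or under a target threshold T count as satisfied, those between T and 4T as tolerating (worth half), and the rest as frustrated. The function name and the sample latencies below are made up for illustration.

```python
def apdex(latencies, threshold):
    """Apdex score in [0, 1] for response times in the same unit as `threshold`."""
    if not latencies:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for x in latencies if x <= threshold)
    tolerating = sum(1 for x in latencies if threshold < x <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(latencies)


# Hypothetical samples in seconds, with a 0.5 s target:
# two satisfied, one tolerating (0.6 s), one frustrated (3.0 s).
print(apdex([0.1, 0.2, 0.6, 3.0], 0.5))  # 0.625
```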
  78. 78. Disaster Recovery
  79. 79. When Disaster Hits… • Goal • Fault-tolerant system • Restore service and recover data ASAP in case of a disaster • Be proactive • Develop a Disaster Recovery Plan (DRP) • Practice and test your DRP by doing failure drills
  80. 80. Scaling Teams
  81. 81. Scaling Teams • Hiring • Always hire top talent • You are as strong as your weakest link • Develop a process to bring people in • Turnkey Hardware/Software Setup (Vagrant, etc...) • Arrange for proper access/accounts • Develop a knowledge base (Architecture documentation, FAQs, etc...) • Development Process • Be Agile • Refine in the spirit of Six Sigma
  82. 82. Scaling Teams • Team Structure • Small is good • Form ad-hoc teams from pools of Agile breeds • Product Owners • Team Members • Team Lead (Scrum Master) • Engineers • QAs • Architecture Owners • Give them ownership of their DevOps
  83. 83. The Take-home
  84. 84. The Take-home Message • The early bird gets the worm • Design to scale from day one • Plan for capacity early • Your needs determine how scalable “your scalable” needs to be • Do not over-engineer • Do not bite off more than you can chew • Building a scalable system is a process • Commit to a road map around bottlenecks • Guided by planned business features • Learn from others’ experiences (Twitter, Netflix, etc...)
  85. 85. Take it slow… You’ll get there… Work smarter not harder…
  86. 86. Questions?
  87. 87. http://speakerscore.com/jazoon-scalability Thanks for the attention! Follow @PolymathicCoder abdelmonaim.remani@gmail.com http://blog.polymathiccoder.com
