2. Scaling your Web App
Topics to be covered :
1. What is Scaling ?
2. Why do we need to scale ?
3. How do we scale ?
- Scaling up Vs. Scaling out
4. What is Sharding ?
5. RDBMS vs NoSQL (Perspective : Scalability)
4. Desirable Properties of a Web App
Scalability
High Availability
Performance
Manageability
Low Cost
Feature Rich
Generates $$$ :D
5. What is Scaling ?
Scalability
It is the ability of your system to handle a
growing amount of work in a graceful manner,
or to be readily enlarged as demand
increases.
Scaling and performance are different.
6. So What is Performance ?
Performance :
The amount of useful work accomplished by a
computer system compared to the time and
resources used.
Better performance means more work
accomplished in less time and/or using fewer
resources.
7. Ok. Now What ?
I have an application up and running on
servers and it's doing pretty well.
Why should I think about scaling it ?
Will it really require scaling ?
8. Why do I need scaling ?
Who knows, your app might be the next FB or
Twitter...
How will you handle so many users doing so many
things over the network ? It might grow to processing
millions of requests / second. So,
Scale it !!!!!! :)
9. OK. Cool.
I really need to think about scaling my
application now.
Wait a minute !!!!!
How should I do it ??
How Should I Scale ??
10. Well there are a couple of ways to
scale your application
1. Scaling Up
2. Scaling Out
11. Scaling Up vs. Scaling Out
Scaling Up :
− More CPU, Bigger HD, More RAM, etc.
− Even the biggest, fastest single computer that exists is still
not as fast as two such computers together.
− i.e. Diminishing returns => Not a good solution
Nah!! Not Efficient !!!!
12. Scaling Up Vs. Scaling Out
Scaling Out :
− Add more nodes
Master / Slave architecture
Sharding
14. Master / Slave Architecture
Pros :
− Increased READ speed
− Takes READ load off of master
− Allows us to join across all tables
Cons:
− Doesn't buy increased write throughput
− Single point of failure (the master)
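
The pros and cons above can be seen in a small sketch. This is a toy in-memory model, not a real database driver: the class name `ReplicatedDatabase`, its methods and the round-robin policy are all assumptions made for illustration, and replication is shown as instantaneous, which real replication usually is not.

```python
import itertools


class ReplicatedDatabase:
    """Toy master/replica router (hypothetical, in-memory only)."""

    def __init__(self, replica_count):
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]
        # Round-robin over replica indexes to spread the READ load.
        self._next_replica = itertools.cycle(range(replica_count))
        self.reads_served = [0] * replica_count

    def write(self, key, value):
        # Every write goes through the single master, so adding replicas
        # adds no write throughput, and losing the master stops all writes.
        self.master[key] = value
        # Assumption: replication is immediate. Real setups are usually
        # asynchronous, so replicas can briefly lag behind the master.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Reads never touch the master in this sketch.
        index = next(self._next_replica)
        self.reads_served[index] += 1
        return self.replicas[index].get(key)


db = ReplicatedDatabase(replica_count=3)
db.write("user:1", "alice")
values = [db.read("user:1") for _ in range(6)]
```

Six reads are spread evenly over the three replicas, while the one write still had to go through the master.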
15. What is Sharding ?
Sharding is the method of splitting your database
across several servers (called a Cluster).
Each shard can consist of one or multiple machines.
No machine has all your data on it.
More machines =>
More RAM and CPU =>
More operations/sec =>
Improved Throughput.
Yeyyy !!!!
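
One common way to decide which shard owns a piece of data is to hash its key. The sketch below assumes hash-based sharding over a fixed number of in-memory shards; the slides do not say which sharding scheme is meant, and `ShardedStore` and its methods are hypothetical names.

```python
import hashlib


class ShardedStore:
    """Toy hash-sharded key/value store (hypothetical, in-memory only)."""

    def __init__(self, shard_count):
        # Each dict stands in for one shard (one or more machines).
        self.shards = [{} for _ in range(shard_count)]

    def shard_for(self, key):
        # Use a stable hash: Python's built-in hash() for strings is
        # randomized per process, so it cannot be used for routing.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        # Only the owning shard is contacted for a single-key lookup.
        return self.shards[self.shard_for(key)].get(key)


store = ShardedStore(shard_count=4)
for user_id in range(1000):
    store.put(f"user:{user_id}", {"id": user_id})

shard_sizes = [len(shard) for shard in store.shards]
```

No shard holds all the data, and reads and writes for different keys land on different shards. Note that a plain modulo scheme like this one moves most keys when the shard count changes.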
17. Sharding
Pros :
− Increased READ and WRITE throughput
− No single point of failure
Individual features can fail, but the whole system won't go
down at once.
Cons :
− Can't run join queries across shards
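
When a join across shards is needed, the work usually moves into the application. The sketch below assumes users are sharded by user id and orders by order id, with made-up sample data; `find_user` and `orders_with_user_names` are hypothetical helpers.

```python
# Assumption: two user shards keyed by user_id % 2, two order shards
# keyed by order_id % 2. The data is invented for the example.
user_shards = [
    {2: "bob", 4: "dana"},     # even user ids
    {1: "alice", 3: "carol"},  # odd user ids
]
order_shards = [
    {100: {"user_id": 1, "total": 30}, 102: {"user_id": 2, "total": 15}},
    {101: {"user_id": 1, "total": 12}, 103: {"user_id": 3, "total": 8}},
]


def find_user(user_id):
    # Route the lookup to the one shard that owns this user.
    return user_shards[user_id % len(user_shards)].get(user_id)


def orders_with_user_names():
    # The database cannot do "orders JOIN users" for us, so the app
    # visits every order shard and looks up each user separately.
    rows = []
    for shard in order_shards:
        for order_id, order in shard.items():
            rows.append((order_id, find_user(order["user_id"]), order["total"]))
    return sorted(rows)


joined = orders_with_user_names()
```

The result matches what a SQL join would return, but it costs one pass over every order shard plus one extra lookup per order.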
19. Scaling : RDBMS
RDBMS guarantee ACID operations, but when a
relational database outgrows one server, it is no
longer that easy to use. In other words, they don't scale
out very well in a distributed system.
The CAP theorem states that a distributed (i.e. scalable)
system cannot guarantee all of the following properties at
the same time:
− Consistency
− Availability
− Partition tolerance
Many NoSQL databases relax strict consistency in favour of
availability. That's one reason they scale out more easily.
20. Why NoSQL Databases Scale &
RDBMS Do Not (?)
Data sharding requires distinct data entities that can be
distributed and processed independently.
An RDBMS struggles with this because a logical entity is
usually spread across several related tables.
NoSQL databases do not distribute a logical entity across multiple tables;
it's always stored in one place.
NoSQL databases only enforce consistency inside a single entity, and
sometimes not even that.
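
The difference between the two layouts can be sketched with plain Python data. The table names, field names and sample order below are assumptions for illustration only, not taken from any particular database.

```python
# Relational layout: one logical order is spread over two tables,
# so loading it needs a join on order_id.
orders_table = [{"order_id": 1, "customer": "alice"}]
order_items_table = [
    {"order_id": 1, "sku": "book", "qty": 2},
    {"order_id": 1, "sku": "pen", "qty": 5},
]


def load_order_relational(order_id):
    order = next(row for row in orders_table if row["order_id"] == order_id)
    items = [
        {"sku": row["sku"], "qty": row["qty"]}
        for row in order_items_table
        if row["order_id"] == order_id
    ]
    return {"order_id": order_id, "customer": order["customer"], "items": items}


# Document layout: the whole order is one self-contained entity,
# so it can sit on a single shard and be fetched by key.
order_documents = {
    1: {
        "order_id": 1,
        "customer": "alice",
        "items": [{"sku": "book", "qty": 2}, {"sku": "pen", "qty": 5}],
    }
}


def load_order_document(order_id):
    return order_documents[order_id]
```

Both functions return the same order, but the document version needs only one lookup in one place, which is what makes the entity easy to move to any shard.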
22. Does it mean that my app gives a
better performance now ?
No, it doesn't. Performance depends on how correctly you
implement a scalable solution for your case.
Besides, several other factors affect it, like
− Disk I/O
− Network
− Caching
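
As a small illustration of the caching factor, the sketch below puts Python's standard `functools.lru_cache` in front of a pretend backend query. The function names and the call counter are hypothetical; a real app would more likely use a shared cache service.

```python
from functools import lru_cache

backend_calls = 0  # counts how often the "database" is actually hit


def query_backend(user_id):
    # Stand-in for a slow query that costs disk I/O and a network trip.
    global backend_calls
    backend_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}


@lru_cache(maxsize=128)
def cached_name(user_id):
    # Repeated requests for the same user are answered from memory.
    return query_backend(user_id)["name"]


names = [cached_name(7) for _ in range(100)]
```

A hundred requests for the same user reach the backend only once. The same servers now do far less work per request, which is a performance gain and not a scaling one.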
24. MySQL OR MongoDB
“The real thing to point out is that if you are
being held back from making something super
awesome because you can’t choose a
database, you are doing it wrong.
If you know mysql, just use it. Optimize when
you actually need to.
Use it like a k/v store, use it like a rdbms, but for
god sake, build your killer app! None of this
will matter to most apps.
Facebook still uses MySQL, a lot. Wikipedia
uses MySQL, a lot. FriendFeed uses MySQL,
25. MySQL OR MongoDB
“What am I going to build my next app on?
Probably Postgres.
Will I use NoSQL? Maybe. I might also use
Hadoop and Hive. I might keep everything in
flat files.
Maybe I’ll start hacking on Maglev. I’ll use
whatever is best for the job.
If I need reporting or ACIDIty, I won’t be using
any NoSQL.
If I need caching, I’ll probably use Tokyo Tyrant.
26. Conclusion of the Debate
If there’s anything to take away from the
RDBMS vs NoSQL debate,
it’s just to be happy there are more tools, because
more cool tools mean a win-win situation for
everyone.
27. Summary
We covered the following topics :
1. What is Scaling ?
2. Why do we need to scale ?
3. How do we scale ?
- Scaling up Vs. Scaling out
4. What is Sharding ?
5. RDBMS vs NoSQL (Perspective : Scalability)