The German travel meta-search engine Swoodoo was hit by heavy load spikes caused by TV advertisements. Learn which caching, hosting and database strategies we implemented successfully and which did not work well. The talk covers file-based caching, APC, memcached and sharded database layouts, as well as our experiences with fully virtualized hosting.
Caching, sharding, distributing - Scaling best practices
1. Caching, sharding, distributing - Scaling best practices.
18.11.2009
Lars Jankowfsky
CTO swoodoo AG
Wednesday, 18 November 2009
2. About me:
PHP, C++, Developer, Software Architect since 1992
PHP since 1998
Many successful projects from 2 to 20 developers
Currently running three projects using eXtreme Programming
CTO and (Co-)Founder swoodoo AG
(Co-)Founder OXID eSales AG
3. LOAD?
Average 17, Maximum 138
4. Scaling?
Scaling Distributing
Caching Sharding
7. SOA Scaling
Your App Your App Your App
Your App Your App Your App
8. SOA Scaling
GUI/Frontend
API
Your App
Engine
Database
9. SOA Scaling
GUI/Frontend GUI/Frontend GUI/Frontend
API API
Engine
Database
10. SOA PRO
Scalable!
You can add Servers where you need them
Easier to maintain
More robust
Easy to introduce high availability (HA)
Cloud...
11. SOA CON
A lot of work....
Difficult to test when doing TDD
Complex deployment
13. Virtual Machines Distributing
[Diagram: GUI, API, Engine and DB virtual machines spread across Server 1 and Server 2]
14. Virtual Machines Distributing
[Diagram: more GUI and API virtual machines distributed across the servers, alongside the Engine and DB virtual machines]
15. Virtual Machines PRO
Easy to distribute on new hardware as needed
Isolated, separated services even on one machine
Easy to install when using templates (DB, GUI...)
Very good for testing, staging
16. Virtual Machines CON
Hardware failure....
Costs (at least for VMware)
Performance penalty (15%)
Limitations (VMware only 4 CPUs, vSphere 8...)
Some resources can't be virtualized (disk I/O)
19. Caching
GUI/Frontend
API
Engine
Database
20. Files PRO
simple, easy to begin with
good for a "share nothing" architecture
21. Files CON
hits the HDD
consumes memory (file system cache)
local cache, can't be reused by different servers
manual handling of expiration
serialization penalty
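The points above can be made concrete with a minimal sketch of a file-based cache. It is not the code used at swoodoo; the class name `FileCache`, the `.cache` file suffix and the use of `pickle` are assumptions made for illustration, and keys are assumed to be safe file names.

```python
import os
import pickle
import tempfile
import time


class FileCache:
    """Tiny file-based cache; all names here are illustrative assumptions."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hypothetical naming scheme: one file per key (key must be a safe file name).
        return os.path.join(self.directory, f"{key}.cache")

    def set(self, key, value, ttl_seconds):
        # Serialization penalty: the value is pickled on every write.
        payload = pickle.dumps((time.time() + ttl_seconds, value))
        # Write to a temp file, then rename, so readers never see a partial file.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        os.replace(tmp_path, self._path(key))

    def get(self, key, default=None):
        try:
            with open(self._path(key), "rb") as handle:
                expires_at, value = pickle.load(handle)
        except FileNotFoundError:
            return default
        # Manual expiration: nothing removes stale files unless we do it here.
        if time.time() >= expires_at:
            os.remove(self._path(key))
            return default
        return value
```

Every `get` reads from disk (or the file system cache) and unpickles the value, and the files live on one machine only, which matches the drawbacks listed above.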
22. APC PRO
Opcode cache
Invalidation and size limits are automatically handled
good for a "share nothing" architecture
23. APC CON
bloats web server (Apache) process memory
local cache, can't be reused by different servers
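The idea of a local in-process cache with automatic size limits and expiry can be sketched as follows. This is written in Python rather than PHP and does not reproduce APC's API or its eviction behaviour; `LocalCache` and the least-recently-used eviction are assumptions chosen for illustration.

```python
import time
from collections import OrderedDict


class LocalCache:
    """In-process cache with a size limit and per-entry expiry (illustrative)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (time.monotonic() + ttl_seconds, value)
        self._entries.move_to_end(key)
        # Size limit handled automatically: drop least recently used entries.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Expired entries are invalidated on access.
            del self._entries[key]
            return default
        self._entries.move_to_end(key)
        return value
```

No serialization and no network hop are involved, but the data lives inside the process, so it adds to process memory and is invisible to other servers.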
24. memcached PRO
can be used by several servers
Invalidation and size limits are automatically handled
25. memcached CON
network roundtrip penalty
serialization penalty
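A shared cache is typically used in a cache-aside pattern, sketched below. `SharedCacheStub` is an in-process stand-in, not a real memcached client, and `cached_lookup` is a hypothetical helper; with a real client, each `get` and `set` would be a network round trip.

```python
import json
import time


class SharedCacheStub:
    """In-process stand-in for a memcached client (assumption, not a real client)."""

    def __init__(self):
        self._store = {}

    def set(self, key, raw_bytes, ttl_seconds):
        self._store[key] = (time.monotonic() + ttl_seconds, raw_bytes)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or time.monotonic() >= entry[0]:
            # Missing or expired: behave like a cache miss.
            self._store.pop(key, None)
            return None
        return entry[1]


def cached_lookup(cache, key, compute, ttl_seconds=300):
    """Cache-aside: try the shared cache first, compute and store on a miss."""
    raw = cache.get(key)
    if raw is not None:
        # Serialization penalty: values travel as bytes and must be decoded.
        return json.loads(raw.decode("utf-8"))
    value = compute()
    cache.set(key, json.dumps(value).encode("utf-8"), ttl_seconds)
    return value
```

Because the cache is external to the web server process, several servers can pass the same cache object (or connect to the same cache host) and share the computed result.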
26. Conclusion Caching
File System APC
memcached
27. Conclusion Caching
APC memcached
opcode cache
rarely used local data
29. Single Table Database
Data
30. Single Table PRO
simple, easy to begin with
31. Single Table CON
slow
read/write locks are problematic
doesn't scale properly
32. Offline/Online Table Database
Online table: read only, MyISAM
Offline table: write only, InnoDB
Online table is generated from the offline table once per hour
33. Offline/Online Table PRO
simple architecture
separation between read & write access
very fast reads
34. Offline/Online Table CON
writes not scalable
generation process will take longer with more data
"stale" data might occur in the read table, no "live" feeling
after generation, the read table is "cold" again. Slow!
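The offline/online split can be sketched with SQLite standing in for MySQL. The table names (`offers_offline`, `offers_online`, `offers_new`), the columns and the aggregation are hypothetical, and the warm-up query at the end is one possible mitigation for the "cold" table, not something taken from the talk.

```python
import sqlite3


def regenerate_online_table(conn):
    """Rebuild the read table from the write table, then swap it in."""
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS offers_new")
    # Build the fresh copy next to the live one; readers keep using the old table.
    cur.execute(
        "CREATE TABLE offers_new AS "
        "SELECT route, MIN(price) AS price FROM offers_offline GROUP BY route"
    )
    # Swap: the old read table is dropped and the new one takes its name.
    cur.execute("DROP TABLE IF EXISTS offers_online")
    cur.execute("ALTER TABLE offers_new RENAME TO offers_online")
    conn.commit()
    # Warm-up (assumption): touch the new table once so the first visitor
    # does not pay for a completely cold table.
    cur.execute("SELECT COUNT(*) FROM offers_online").fetchone()
```

Rows written to the offline table after a run stay invisible to readers until the next run, which is the "stale" data effect listed above, and each run scans the whole offline table, so it takes longer as data grows.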
36. Sharding #1 Generation PRO
Scalable!
Still fast with hundreds of millions of records
Separates database logic from the system, easily scalable
Moving, Adding, Deleting shards on the fly
query can be run on various machines in parallel -> Fast!
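How a sharded layout routes records and runs one query on all shards in parallel can be sketched as follows. The slides do not show the actual sharding scheme, so hashing the key with `crc32`, the `ShardedStore` class and the in-memory lists standing in for database shards are all assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class ShardedStore:
    """Routes records to shards by key and fans queries out to every shard."""

    def __init__(self, shard_count):
        # Each shard is a plain list here; in production it would be a database.
        self.shards = [[] for _ in range(shard_count)]

    def shard_for(self, key):
        # crc32 is stable across processes, unlike Python's built-in hash().
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def insert(self, key, record):
        self.shards[self.shard_for(key)].append((key, record))

    def query_all(self, predicate):
        """Run the same filter on every shard in parallel and merge the results."""

        def scan(shard):
            return [record for _, record in shard if predicate(record)]

        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(pool.map(scan, self.shards))
        return [record for partial in partials for record in partial]
```

Plain modulo hashing remaps most keys when the shard count changes, so moving or adding shards on the fly would need something more, for example a lookup table from key range to shard.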
37. Sharding #1 Generation CON
Queries are limited by shards, you can't join across all shards
Complex to develop, special "protocol" needed for the queries
Custom queries not possible, no SQL any more in your app.
Difficult to maintain data (import, export, purge...)
After failure or power loss it takes a while to rebuild tables
Memory table leak
39. Sharding #2 Generation PRO
More stable (InnoDB vs. MEMORY)
Fast failover
Slave hardware can be used for production shards
40. Sharding #2 Generation CON
Slower (MEMORY is faster than InnoDB)
but that's OK, we got additional machines (slaves...)
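Failover between the master and slave of one shard might look like the sketch below. The talk does not describe its failover mechanism, so `Node`, `ShardUnavailable` and `read_with_failover` are hypothetical names, and the `healthy` flag simply simulates a machine being down.

```python
class ShardUnavailable(Exception):
    """Raised by a node that cannot serve the request."""


class Node:
    """One database machine holding a copy of a shard (simulated)."""

    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = data
        self.healthy = healthy

    def read(self, key):
        if not self.healthy:
            raise ShardUnavailable(self.name)
        return self.data.get(key)


def read_with_failover(nodes, key):
    """Try each node of a shard in order (master first); return the first answer."""
    last_error = None
    for node in nodes:
        try:
            return node.name, node.read(key)
        except ShardUnavailable as error:
            # Remember the failure and move on to the next copy.
            last_error = error
    raise ShardUnavailable("all nodes down") from last_error
```

Because the slave already holds the data, a read can move to it immediately instead of waiting for tables to be rebuilt.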