1. Web frameworks don't matter
Some tips, tricks and patterns for designing, scaling and
maintaining large-scale web applications.
Tomas (t0m) Doran
Nordic Perl Workshop 2010
São Paulo.pm Perl Workshop 2010
2. Introduction
• t0m - Catalyst core team. Moose
committer. 99 CPAN dists. Silly hair. Idiot ;)
• This is a rant talk about (web) application
design.
• You get to listen to 3 hours of this today.
(Sorry about that)
• Please stop me, ask questions, disagree etc.
(It’ll be more fun, for all of us)
3. Application design
• Is hard!
• You won’t get it all right first time.
• The web parts are not the main parts
• Yes, even in a web application
4. Success
• You need to be flexible, requirements will
change.
• You need to be flexible, your planned
solutions won’t work
• You need to be flexible, the performance
inflection points won’t be where you
predict
5. Getting the data model right
• Hint: you won’t.
• Consistency is the key
• KISS
• No broken windows
6. Getting the data model right
• You need a consistent and standalone
model of your application
• Routing URIs to actions isn’t a hard
problem
• Keeping the data from becoming a pile of
shit is a hard problem
7. Loosely coupled
• You will throw parts of your codebase away
• Components can be tested / replaced
independently
• Dependency injection
9. Dependency Injection
• Any of the components can be faked /
mocked / replaced
• Testing becomes much easier (having a test
MogileFS is a pain in the ass, really!)
• Write factories to build instances so
dependency injection doesn’t cost.
• Bread::Board?
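A minimal sketch of the idea, in Python purely for illustration. The names `InMemoryFileStore`, `AvatarService` and `build_avatar_service` are hypothetical and not from the talk. The service is handed its store rather than constructing one, so a test can pass in a fake instead of standing up a real MogileFS, and a small factory hides the wiring from callers.

```python
class InMemoryFileStore:
    """Test double standing in for a real file store such as MogileFS."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


class AvatarService:
    """Application logic; it receives its store instead of building one."""

    def __init__(self, store):
        self._store = store

    def save_avatar(self, user_id, image_bytes):
        key = f"avatar/{user_id}"
        self._store.put(key, image_bytes)
        return key

    def load_avatar(self, user_id):
        return self._store.get(f"avatar/{user_id}")


def build_avatar_service(store=None):
    """Factory: callers get a wired-up service without doing the wiring.

    Assumption for this sketch: the default is the in-memory store. In a
    real application the default would be the production store client.
    """
    return AvatarService(store if store is not None else InMemoryFileStore())
```

Anything with `put` and `get` methods can be injected, which is what makes the component replaceable independently of the rest of the application.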
10. Efficiency
• Web stuff should be fast!
• Doing extra work on your web servers is
bad news
• Lots of web processes is probably not what
you want
11. The PHP fallacy
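Even if extra context switching has zero overhead,
you serve people sooner if you queue.
[Diagram: the upper timeline interleaves the two requests (A B A B A B A B); the lower timeline runs them back to back (A B)]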
A finishes significantly before B in the lower diagram
B finishes at the same time in both
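The arithmetic behind the two timelines can be checked with a small simulation. This is a sketch under stated assumptions: two requests of four work units each, one unit per time slice, and zero context-switch cost. The function names are made up for the example.

```python
def round_robin_finish_times(work, slice_size=1):
    """Interleave requests one slice at a time; return finish time per request."""
    remaining = dict(work)
    finish = {}
    clock = 0
    while remaining:
        for name in list(remaining):
            step = min(slice_size, remaining[name])
            clock += step
            remaining[name] -= step
            if remaining[name] == 0:
                finish[name] = clock
                del remaining[name]
    return finish


def queued_finish_times(work):
    """Run requests to completion in arrival order; return finish time per request."""
    finish = {}
    clock = 0
    for name, units in work.items():
        clock += units
        finish[name] = clock
    return finish


requests = {"A": 4, "B": 4}
interleaved = round_robin_finish_times(requests)  # A done at 7, B done at 8
queued = queued_finish_times(requests)            # A done at 4, B done at 8
```

B is no worse off when queued, and A is served in roughly half the time.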
12. Caching strategies
• Page fragments
• Objects and data structures
• Varnish/ESI, mod_cache/SSI
• memcached
• Materialized views (triggers!)
13. Simple web architecture
[Diagram, top to bottom: load balancer, six web servers, data store]
This is how you start and how most people think about a web
architecture
14. Add a reverse proxy
[Diagram, top to bottom: load balancer, reverse proxy, six web servers, data store]
Your users are not on a Gb/s pipe.
Don’t keep your expensive application processes in IO wait
sending bytes
15. Cache expensive lookups
[Diagram, top to bottom: load balancer, reverse proxy, six web servers with memcache alongside, data store]
Memcache shown here, but this isn’t the only strategy
Materialized views in the data store layer - denormalisation
without the pain.
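As a rough illustration of caching an expensive lookup, here is a cache-aside sketch. An in-process dict stands in for memcached, and `LookupCache`, its `loader` callback and the TTL handling are assumptions made for the example, not any real client API.

```python
import time


class LookupCache:
    """Cache-aside: try the cache first, fall back to the expensive loader."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader        # the expensive lookup, e.g. a DB query
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: no expensive work
        value = self._loader(key)    # miss or expired: do the work once
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a shared cache such as memcached the shape is the same, except that the entries live outside the web process and are visible to every web server.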
16. Using a page assembly layer
[Diagram, top to bottom: PAL, reverse proxy, six web servers with memcache alongside, data store]
Allows different chunks of content to have different caching
strategies.
Cache even authenticated users (mostly)
17. Complex web architecture
[Diagram, top to bottom: PAL, reverse proxy, six web servers with memcache and a message queue alongside, job server, data store]
Anything that is going to block is run as an
asynchronous job
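The "anything that blocks becomes a job" rule might look like the sketch below. A `queue.Queue` and a thread stand in for the message queue and job server in the diagram, and `handle_request` and `start_job_server` are hypothetical names. A real deployment would use a separate queue service and separate worker processes.

```python
import queue
import threading


def start_job_server(jobs, results):
    """Start a worker that drains the queue; a None job tells it to stop."""
    def worker():
        while True:
            job = jobs.get()
            if job is None:
                jobs.task_done()
                break
            name, func, args = job
            results[name] = func(*args)   # the slow, blocking work happens here
            jobs.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def handle_request(jobs, job_name, func, *args):
    """Web-side handler: enqueue the work and return immediately."""
    jobs.put((job_name, func, args))
    return {"status": 202, "job": job_name}
```

The web process only pays for the enqueue, so it is free to serve the next request while the job server does the blocking work.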
18. Cache stampede!
• Cache flushes can become terminal to your
application.
• You need to ensure you can avoid this (or
at least know it’s there)
• Implement switches to ‘cool down’ your
app
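One common way to blunt a stampede (a swapped-in technique, not necessarily what the talk had in mind) is to let only one caller per key regenerate a missing entry while the others wait for it. A minimal in-process sketch, with `StampedeGuard` as a made-up name:

```python
import threading

_MISSING = object()


class StampedeGuard:
    """After a flush, only one caller per key runs the expensive loader."""

    def __init__(self, loader):
        self._loader = loader
        self._values = {}
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        value = self._values.get(key, _MISSING)
        if value is not _MISSING:
            return value
        with self._lock_for(key):
            # Re-check: another caller may have filled it while we waited.
            value = self._values.get(key, _MISSING)
            if value is _MISSING:
                value = self._loader(key)
                self._values[key] = value
            return value

    def flush(self):
        self._values.clear()
```

Across several web servers the same idea needs a shared lock or a "regenerating" marker stored in the cache itself. This sketch only covers callers inside one process.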
19. Development, deployment and testing
• You must use version control
• You must have version numbers
• You must have package management
• You must have a stage environment
21. Development, deployment and testing
• Everyone in your team needs to be able to
do everything
• No superstars: being run over by a bus is
more likely than you think
24. The Zen swap
• Migrating continually updated data around
with no downtime
• E.g. moving Xen machines between hosts,
moving database info around.
• For database, relies on triggers
25. The Zen swap
• Add a modification-time column to the ‘from’ table.
• Copy (and munge) all the data, row by row,
noting when you start
• Copy (and munge) all the data changed
since you started
• Repeat until the set of dirty data is very
small
27. The Zen swap
• Stop the universe
• Do one final pass of the data
• Add triggers for reads/writes in the old
column to use the new column
• Start the universe
• Needs transactional DDL to be seamless
• Even without, use to minimise downtime
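The copy loop can be sketched as a toy simulation. Everything here is an assumption for illustration: `Table` is an in-memory stand-in for the ‘from’ table, its `stamp` counter plays the role of the modification-time column, and `writes_during_pass` fakes the live traffic that arrives while a pass is running.

```python
class Table:
    """In-memory stand-in for the 'from' table with a modification stamp."""

    def __init__(self):
        self.rows = {}    # row_id -> (modified_stamp, payload)
        self.stamp = 0    # logical clock, bumped on every write

    def write(self, row_id, payload):
        self.stamp += 1
        self.rows[row_id] = (self.stamp, payload)


def copy_changed(source, target, since, munge):
    """Copy (and munge) rows modified after `since`; return how many."""
    copied = 0
    for row_id, (modified, payload) in list(source.rows.items()):
        if modified > since:
            target[row_id] = munge(payload)
            copied += 1
    return copied


def migrate(source, target, munge, writes_during_pass, small_enough=1):
    """Repeat copy passes until the dirty set is small, then do a final pass."""
    since = 0
    passes = 0
    while True:
        started = source.stamp            # note when this pass starts
        copy_changed(source, target, since, munge)
        passes += 1
        for row_id, payload in writes_during_pass(passes):
            source.write(row_id, payload)  # simulated live traffic
        since = started
        dirty = sum(1 for modified, _ in source.rows.values() if modified > since)
        if dirty <= small_enough:
            break
    # "Stop the universe": no more writes, so one final pass catches everything.
    copy_changed(source, target, since, munge)
    return passes
```

The downtime is bounded by the final pass, which only has to move the small dirty set rather than the whole table.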
28. Accidental complexity
• Premature generalisation is the root of all
evil.
• Lack of polymorphism
• Insufficient use of delegation
• Commonality develops independently. You
MUST refactor
29. Load and scalability testing
• Simple ab can tell you a lot
• NYTProf
• You need to test your system with your
hardware and your data
• A little tuning can go a long way.
30. Health monitoring
• Log useful stuff from your app!
• Syslog
• Nagios
• Healthcheck pages
• Munin
• Splunk
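A healthcheck page can be as small as a function that runs a few named checks and reports the result. This is a sketch only: `run_healthcheck` and the check names are hypothetical, and the 200/503 status convention is an assumption about what your monitoring expects.

```python
def run_healthcheck(checks):
    """Run each named check; return (http_status, body) for a healthcheck page."""
    lines = []
    healthy = True
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception as exc:
            # A check that blows up counts as a failure, with the reason shown.
            ok = False
            lines.append(f"{name}: FAIL ({exc})")
        else:
            lines.append(f"{name}: {'OK' if ok else 'FAIL'}")
        healthy = healthy and ok
    return (200 if healthy else 503), "\n".join(lines)
```

Each check would typically be a cheap probe of one dependency (database, cache, queue), so the page says which part is unhealthy rather than just that something is.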
31. Performance monitoring
• Per hit stats (db queries, memcache hits,
times taken)
• Query comments
• Graphs are awesome (RRD is kinda hateful)
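Per-hit stats and query comments can share one small object per request. The sketch below assumes "query comments" means tagging each SQL statement with a comment naming the request and the code that issued it, so that it shows up in the database's own logs. `HitStats` and its fields are invented for the example.

```python
import time


class HitStats:
    """Collects per-request counters and tags SQL with its origin."""

    def __init__(self, request_id):
        self.request_id = request_id
        self.db_queries = 0
        self.cache_hits = 0
        self.cache_misses = 0
        self._started = time.perf_counter()

    def tag_query(self, sql, origin):
        """Count the query and prefix it with a comment identifying its source."""
        if "*/" in origin or "*/" in str(self.request_id):
            raise ValueError("comment text must not contain '*/'")
        self.db_queries += 1
        return f"/* request={self.request_id} origin={origin} */ {sql}"

    def record_cache(self, hit):
        if hit:
            self.cache_hits += 1
        else:
            self.cache_misses += 1

    def summary(self):
        """One log line per hit, suitable for syslog or later graphing."""
        elapsed_ms = (time.perf_counter() - self._started) * 1000
        return (
            f"request={self.request_id} db_queries={self.db_queries} "
            f"cache_hits={self.cache_hits} cache_misses={self.cache_misses} "
            f"elapsed_ms={elapsed_ms:.1f}"
        )
```

Logging `summary()` once at the end of every request gives you the raw numbers to graph, and the comment on each query lets you trace a slow statement back to the page that issued it.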