DrupalCampLA 2014 - Drupal backend performance and scalability
1. Drupal Backend Performance and Scalability
Drupal Backend
Performance and
Scalability
DrupalCamp LA 2014
Ashok Modi (btmash)
Christoph Weber (ChristophWeber)
2. Drupal Backend Performance and Scalability
About Presentation
• Similar content to other performance related
presentations.
• There are tons of performance presentations at
camp on specific topics.
• Lot of material.
• May cover rest in BoFs
• Won't really go into shared hosting.
• Small part of presentation (code related
optimizations) may apply.
3. Drupal Backend Performance and Scalability
About Presentation
• May talk bit about cloud (dedicated sessions at
camp)
• Have a question? Ask!
• Have something to share? Come on up!
• We can all learn from each other.
4. Drupal Backend Performance and Scalability
Performance related
• Asynchronous PHP and Real Time Messaging
• Steve Rhoades
• Headless Drupal (multiple sessions!)
• Josh Koenig (Drupal 8 demos)
• Steve Rifkin (AngularJS)
• Matt Chapman (KnockoutJS / Backbone)
• Matt Wrather (Drupal 7 demos)
5. Drupal Backend Performance and Scalability
Performance related
• HHVM (multiple sessions!)
• Sara Golemon
• Josh Koenig
• Get your head “in the cloud” with AWS
• Jeremy Lindblom
• Cloud Lightning for Drupal
• Chris Charlton
• Performance for Fun and Profit
• Christoph Weber
6. Drupal Backend Performance and Scalability
Goals and Objectives
• Define them.
• Do you want a faster page response for end
user?
• Handle more traffic?
• Minimize downtime?
• Related...but different.
• Gets harder and harder to achieve better
performance.
• More Infrastructure.
• Patching / Hacking Drupal.
7. Drupal Backend Performance and Scalability
Diagnosis
• Proper diagnosis is essential before proposing and
implementing a solution.
• Based on proper data.
• Analysis of data.
• Possible paths of optimization.
8. Drupal Backend Performance and Scalability
Validation
• Avoid the 'wild goose chase'.
• Validate results on a test server.
• Replicate the data.
• Backup and Migrate is useful.
• Migrate is a heavier but also useful approach.
• Recreate the site.
• Gather a performance difference between test and
production server.
• Measure again and see if relative times remain the
same.
9. Drupal Backend Performance and Scalability
Points of Optimization.
• Introduction
• Tools to measure and diagnose issues.
• Speed optimizations.
10. Drupal Backend Performance and Scalability
Hardware - Introduction
• Physical device matters (maybe not as much now?).
• Multiple cores are the norm.
• 32 > 16 > 8 > 4 > 2 > 1
• Lots of RAM (caching file system, db, and page
rendering as much as possible).
• Multiple disks (split one server into various disks /
servers)
• SSD is much faster than regular HD.
• Tuning DB on SSD is different from DB on HD.
11. Drupal Backend Performance and Scalability
LAMP Stack
• Traditionally most common stack for hosting
Drupal / similar applications.
• Linux
• Apache
• MySQL
• PHP
• Not discussing Windows.
• Can discuss outside.
• There are many options out there.
• Grow out your stack to many other technologies.
12. Drupal Backend Performance and Scalability
Multiple Servers
• Master DB server, multiple web servers.
• Use a load balancer (or something like HAProxy
or other round robin).
• Set up slave DB servers for select queries.
• Or set up a cluster (mysql/galera in open source)
• Do it only if you have the budget or resources.
• Complexity is expensive.
• Tuning a system can avoid or delay a split.
• Site by 2bits runs on 1 server.
• Read more at http://goo.gl/XueVY
13. Drupal Backend Performance and Scalability
Testing Tools
• Apache Benchmark (DEMO)
• ab –n 100 –c 10 http://www.example.com
• ab –n 10 –c 10 –C PHPSESSID=<sessid> http://
www.example.com
• Do 10 concurrent requests for up to 100
requests.
• Average response time per second.
• How many requests handled per second.
• Jmeter
• Similar to Apache Benchmark.
14. Drupal Backend Performance and Scalability
Testing Tools
• Grinder
• http://grinder.sourceforge.net/
• Gatling
• http://gatling.io/
• LoadStorm
• Web service to load test site.
15. Drupal Backend Performance and Scalability
Console Monitoring Tools
• Top
• Real time monitoring.
• Load average.
• CPU utilization.
• Memory usage.
• List of processes.
• htop
• Similar to top but for multiple cores.
• Faster.
• Very slick.
16. Drupal Backend Performance and Scalability
Console Monitoring Tools
• atop
• Shows network statistics.
• Runs a collection daemon in the background.
• vmstat
• Report memory statistics
• netstat
• Shows active network connections
• netstat –an
• netstat –an | grep EST
17. Drupal Backend Performance and Scalability
Graphical Monitoring
• Cacti
• http://www.cacti.net
• Available as a package on Ubuntu, Debian
(various other *nix/bsd flavors).
• Easy to understand graphs.
• Displays history over day, week, month, year.
• Graphs available to display stats for CPU,
memory, network,Apache, MySQL.
• Many others written by others available online.
18. Drupal Backend Performance and Scalability
Graphical Monitoring
• Munin
• http://munin.projects.linpro.no
• Very similar to Cacti (doesn’t require a db).
• Nagios
• Very powerful (alerts by email, sms, etc).
• Drupal module for integration.
• Lots of configuration.
• Panopta and New Relic offer hosted monitoring.
• VPS providers may also offer *something*.
19. Drupal Backend Performance and Scalability
Linux / BSD
• Use proven, stable distribution (Debian, Ubuntu
LTS, RHEL, CentOS)
• Use recent versions.
• Use whatever your staff has expertise in.
• Try to avoid bloat.
• Don't install PostGreSQL if you are using
MySQL, no desktop version, java, etc.
• Balance compiling own version vs. using packages.
• Compiling gives full control.
• Possible pain to upgrade.
20. Drupal Backend Performance and Scalability
Apache
• Most popular, supported, feature rich.
• Stable.
• Usually enabled with too many modules.
• mod_proxy, mod_cgi, mod_dav, etc may be
unnecessary.
• Smaller processes = less memory
• More users can access site.
• apachectl -M - show all enabled modules.
• apachetop
• Reads / Analyzes apache access logs.
21. Drupal Backend Performance and Scalability
Apache Optimizations
• MaxClients (prevent swapping / thrashing)
• Too low - cannot serve enough clients.
• Too high - you run out of memory and start
swapping. Server dies and you cannot serve any
clients.
• MaxRequestsPerChild
• Tune it to terminate process faster and free up
memory.
• KeepAlive
• Keep it enabled.
22. Drupal Backend Performance and Scalability
Varnish
• HTTP accelerator
• Serve millions of pages of content will little impact
on server.
• Used on Drupal.org, grammys.com, etc.
• Set up as reverse proxy to web server (Apache,
Nginx, etc) if cannot serve itself.
• Serve anonymous page requests and static files.
• D6 core will not serve anonymously - pressflow.
• D7 core and varnish play nicely.
• Requires tuning.
• Look at http://drupal.org/project/varnish
23. Drupal Backend Performance and Scalability
Varnish (cont'd)
• Define IP/Port in backend <name> for each server.
• Define multiple backends for different servers.
• backend b1 {.host="127.0.0.1"; .port="81"}
• Use director to group backends for round robin.
• return (pass); // Do not cache
• return (look); // Return cache or lookup backend, cache, and
serve.
• unset beresp.http.Set-Cookie; // Remove cookie - what allows
caching.
• Lullabot’s article: http://goo.gl/7JFrP
• Basic setup for D7: http://goo.gl/l7601
• Tested on own blog last year - handled about 3k requests
per second.Apache w/ mod_php handled about 30 - 50 in
comparison.
24. Drupal Backend Performance and Scalability
Nginx
• http://nginx.net
• Stable
• More lightweight than Apache.
• Uses less memory.
• Less functionality.
• Good enough for 90% of use cases out there.
• Easy to set up.
• http://wiki.nginx.org/Drupal for base setup.
• Run PHP as FastCGI process.
• Can also do this with Apache (install php-fpm and
don't look back).
25. Drupal Backend Performance and Scalability
Nginx (cont'd)
• Lots of options.
• worker_processes 24; // Max number of processes. 1
per cpu
• worker_connections 1000; // "Max Clients"
• keepalive_timeout 30; // Keepalive.
• gzip enabled; // mod_gzip
• http://dak1n1.com/blog/12-nginx-performance-tuning
• https://github.com/perusio/drupal-with-nginx
• Tuning specifically for Drupal (uses nginx cache).
• Server went from serving 200+ reqs / second from
base to 3000+ reqs / second from varnish to 4000+
reqs / second from nginx.
26. Drupal Backend Performance and Scalability
Other web servers.
• Cherokee
• http://www.cherokee-project.com/
• Benchmarks are very promising.
• Configured through web ui.
• Have not used it.
• Atleast 2x as fast as Nginx.
• G-WAN
• http://gwan.ch/
• Very new.
• Potentially 3x - 4x faster than Nginx.
• Any others?
27. Drupal Backend Performance and Scalability
MySQL/MariaDB
• Most popular database for Drupal.
• Easy to set up, lots to tune.
• Pressflow, D7+ install with InnoDB as default, which
requires tuning even for small sites.
• Various pluggable engines (InnoDB, MyISAM,Aria, etc)
• Forks
• Percona - Closest to MySQL 5.5
• MariaDB - More changes.
• Drizzle - Rewritten in C++.
• MySQL 5.5 is a big difference.
• More to tune.
• http://goo.gl/hU8tW
28. Drupal Backend Performance and Scalability
MySQL Monitoring
• mtop / mytop
• Like top but for MySQL.
• Real time monitoring (no history).
• Shows slow queries and locks.
• If you have neither - SHOW FULL PROCESSLIST;
• mysqlreport
• Deprecated but still useful.
• http://hackmysql.com/mysqlreport
• Reports on server - no recommendations
(documentation explains everything about stats)
• mysqltuner
• Comes with percona / mariadb.
29. Drupal Backend Performance and Scalability
MySQL Engines
• MyISAM
• Fast reads.
• Less overhead.
• Poor concurrency (table level locking).
• InnoDB
• Transactional.
• Slower (SELECT COUNT(...))
• Better concurrency (row level locking).
• Forks
• Percona comes with XtraDB (InnoDB replacement).
• MariaDB also comes with XtraDB,Aria (MyISAM).
• XtraDB contains patches that did not get in.
• Same tuning settings.
30. Drupal Backend Performance and Scalability
MySQL Tuning
• Lots of things - focus on a few.
• innodb_buffer_pool_size
• Very important.
• Set up to 80% of memory allocated for DB to this.
• If DB is small, use memory elsewhere.
• innodb_flush_log_at_trx_commit
• Each update flushes log by default (expensive).
• 0 => No flush on transaction.
• 2 => Flush cache on transaction.
• log still flushed every second.
• No flush loses 1-2 seconds on OS crash. Cache flush
loses 1 second on hard server crash.
31. Drupal Backend Performance and Scalability
MySQL Tuning (cont'd)
• innodb_log_file_size
• Important for sites with lots of writes.
• table_cache
• Opening tables can be expensive.
• Keep tables open in cache.
• See output from mysqltuner (usually >1024)
• thread_cache
• Increase if lots of quick connections.
• query_cache_size
• Will cache query results.
• Generally 32M - 512M.
32. Drupal Backend Performance and Scalability
MySQL Replication
• Used on Drupal.org
• INSERT/UPDATE/DELETE goes to master.
• SELECT goes to slave(s).
• Provide noticeable improvements.
• Supported in D7.
• D6 => Pressflow.
• Beware of complexity.
• Connection bet master/slave goes down, bad day.
• Extensive tuning could alleviate need for slave.
33. Drupal Backend Performance and Scalability
MySQL Replication
• MySQL Cluster
• Scales well.
• High Availability.
• Expensive.
• Galera
• Relatively new (2009)
• Allows Master/Master setup.
• Recommend atleast 3 servers (and odd numbers)
• 1 out of sync -> quorum decides which one needs to
be in line with others.
• Various cloud options (Amazon RDS, etc.)
• Slower but higher throughput.
34. Drupal Backend Performance and Scalability
MongoDB
• Document Oriented.
• 'no-sql'
• b.collection.insert|add|update({parameters})
• Retrieve subsets.
• Manages collections of objects in json-like format.
• Supports up to 64 indexes
• 1 for ascending order, -1 for descending order.
• Supports replication.
• Built-in clustering.
• Very fast.
• Still need to architect despite being ‘schemaless’
35. Drupal Backend Performance and Scalability
MongoDB (cont'd)
• http://drupal.org/project/mongodb
• D6, D7
• Cache, Field Storage, Blocks, Queues, Sessions,
Watchdog.
• Does a lot of heavy lifting.
• See most gains from field storage, queues, watchdog.
• Ad-hoc test to update 50000 nodes.Took 3.5 hours w/
reg. database.Took 40 minutes with mongo.
• For anything exported into mongodb, previous sql
queries need to become mongo queries.
• Use EntityFieldQuery for entities.
• Backwards compatible queries :)
36. Drupal Backend Performance and Scalability
PHP
• Use a recent, stable release.
• D7 requires 5.2.x, as do a few 6.x contributed modules.
• D8 will require 5.4.x.
• Install an opcode cacher / accelerator.
• Useful in bringing down memory usage.
• APC
• eAccelerator
• XCache
• Zend optimizer (commercial)
• PHP 5.4 comes with OpCache (yay!)
• Compile into hiphop. (2 sessions on topic)
37. Drupal Backend Performance and Scalability
Running PHP
• mod_php
• Standard module used by Apache.
• Well tested, supported.
• Resource hog.
• FastCGI (PHP-FPM)
• Proxy requests from web server to FastCGI process
• Supported by Apache, NginX, Cherokee, etc.
• Runs as a separate process (or pool).
• Secure.
• Slightly slower than mod_php.
• Much more stable behavior (limited # of php
processes).
• Lots of online documentation.
38. Drupal Backend Performance and Scalability
Debugging PHP
• XDebug
• http://xdebug.org
• Display traces on error conditions.
• Trace functions.
• Profile PHP scripts.
• Manually used for testing D8 performance on WSCCI
initiative.
• kCacheGrind
• Provide visualization on performance bottlenecks.
• http://kcachegrind.sourceforge.net/html/Home.html
39. Drupal Backend Performance and Scalability
Op-code caching
• Lower memory usage.
• Decrease in CPU utilization.
• Usage on http://calarts.edu lowered memory usage
from 45M down to less than 10M.
• May crash.
• May require restarts after updating code
• Won't always work.
• Network connections.
• Sorting arrays.
• Faster queries.
• Bad code is bad.
40. Drupal Backend Performance and Scalability
Drupal
• Database intensive.
• Resource hog.
• Memory Intensive.
• (D8 >) D7 > D6 > D5
• Disable unnecessary modules.
• Views UI, Rules UI, <module> UI in
production.
• Create your own custom field consisting of all
the data you need in one row.
• Make sure cron runs regularly.
42. Drupal Backend Performance and Scalability
Debugging Drupal
• Everything from before.
• Devel
• http://drupal.org/project/devel
• Total page execution.
• Query execution times.
• Memory utilization.
• Combine with stress testing.
• DB Tuner
• http://drupal.org/project/dbtuner
• Trace
• http://drupal.org/project/trace
43. Drupal Backend Performance and Scalability
Drupal Caching
• Helpful in not repeatedly processing same content.
• Great for anonymous who would not see differing
content.
• Many caches in core.
• Bootstrap
• Field
• ...
• Page
• Many from contrib modules (views, rules).
44. Drupal Backend Performance and Scalability
Useful contrib caching
• EntityCache
• http://drupal.org/project/entitycache
• Stays in cache until expiry or content is deleted/updated.
• Boost
• http://drupal.org/project/boost
• Create html versions of pages and serve those.
• Requires changes to .htaccess file (apache)
• Does not load drupal once content is cached.
• Can display site while in maintenance mode.
• Varnish/Nginx have this built-in.
• Views content cache.
• Block Cache Alter
• Panels Cache - serve authenticated users.
45. Drupal Backend Performance and Scalability
Pluggable caching
• Use $conf variable in settings.php
• $conf['cache_backends'][] = '/path/to/cache_type_1.inc';
• $conf['cache_backends'][] = '/path/to/cache_type_2.inc';
• $conf['cache_class_<bin>'] = 'CacheClass1';
• $conf['cache_class_<bin>'] = 'CacheClass2';
• Allows you to use a custom caching module.
• APC (http://drupal.org/project/apc)
• Very Fast.
• Cannot use across multiple web servers.
• Memcache (http://drupal.org/project/memcache)
• Scalable.
• Redis (http://drupal.org/project/redis)
• Fast.
46. Drupal Backend Performance and Scalability
Memcached
• Distributed object caching in memory.
• Can span multiple servers.
• D6, D7.
• Scalable.
• Does not clear cache on cron.
• Call from CLI.
• Not persistent.
47. Drupal Backend Performance and Scalability
Redis
• More than just for caching.
• Queue,Watchdog
• Create your own complex data structure.
• http://drupal.org/project/redis_ssi for powering
logged in users solely through redis.
• 1000s of requests per second.
• Also takes drupal out of the picture.
• Very fast.
• Can store data on HD.
• Recoverable cache!
48. Drupal Backend Performance and Scalability
Search
• Drupal core search.
• Slow.
• Google Custom Search Engine
• Better alternative.
• Search API
• Very promising alternative.
• Pluggable system to support various backends.
• MongoDB.
• ApacheSolr / Lucene
• SphinxSE
49. Drupal Backend Performance and Scalability
ApacheSolr
• Fast.
• Scalable.
• Easy-ish to configure.
• Various companies offer Solr as a Service.
• Available as standalone Drupal module.
• Search API Backend module.
• Views plugins
• Drive non-search pages using Solr!
50. Drupal Backend Performance and Scalability
Other Options
• Optimized Distribution
• Pressflow (D6).
• Only supports MySQL.
• Supports reverse proxies.
• Requires PHP5.
• Cocomore.
• Same as pressflow.
• Pressflow for D7 is very similar to D7 core.
• Need performance backports from D8.
51. Drupal Backend Performance and Scalability
Other Options (cont'd)
• 'Patch' Drupal.
• Hack Core.
• Need to know what you're doing.
• Sometimes it is necessary.
• Create a patches directory.
• Better yet, use drush make and call your
patches in there.
• Create own module and alter DB schema from
there.
52. Drupal Backend Performance and Scalability
Advice for developers.
• Take advantage of caching.
• Use memory wisely.
• Unset a variable if you don't need it anymore.
• Save a variable to static memory (see
drupal_static())
• Take advantage of AHAH functionality.
• Fewer queries.
• No page rendering.
• Save bandwidth.
• Learn to use jQuery / learn AngularJS/EmberJS.
53. Drupal Backend Performance and Scalability
Thank you
• Have a question?
• Want to talk more about performance?
• Let's talk after :)