2. Designing for high
performance
The process is usually the same for major
refactoring and building a new site for high
performance
It’s always easier to replace an existing site,
because you have real data
Designing a high-performance site from a
customer’s estimates alone can leave you far
from the solution that is actually needed
3. Designing for high
performance
For this session, we’ll imagine a situation where
we have an existing site with actual data
available
Our most recent project of this kind was
exactly that: a mature site (running since
1998!) about to be reincarnated for the
fourth time
5. First look: identify the
problem
When a site is not performing well, there can
be many different causes
Analyze it
Profile under load
Look at the logs
Look at the server loads under load
6. First look: identify the
problem
Make sure you’re not hitting some simple
bottleneck
Too many services running on a single machine
A crazy database query killing the site
A broken router adding a 3-second delay to every request
(seen that, for real)
And many, many others
7. Problem identified
When you’ve arrived at the conclusion that you
actually have too much volume, figure out:
too much of what?
Too much content? I’ve seen 12 million nodes plus
60 million comments on a single installation, that’s a
lot.
Too many requests per second? Make sure they are
page requests. Static files are easy to fix: look at
cache headers, aggregation, Varnish, Nginx, CDNs.
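To check whether the volume really is page requests, the access log can be split into static-file and page requests. A minimal sketch, assuming the request paths have already been extracted from the log; the extension list and the name `splitRequests` are illustrative and not taken from any particular tool:

```javascript
// Extensions treated as static files (an illustrative list; adjust for your site).
const STATIC_EXTENSIONS = ["css", "js", "png", "jpg", "jpeg", "gif", "ico", "svg", "woff"];

// Count static-file requests versus page requests in a list of request paths.
function splitRequests(paths) {
  const counts = { static: 0, page: 0 };
  for (const path of paths) {
    // Ignore the query string, then take the extension of the last path segment.
    const pathname = path.split("?")[0];
    const lastSegment = pathname.split("/").pop();
    const dot = lastSegment.lastIndexOf(".");
    const extension = dot === -1 ? "" : lastSegment.slice(dot + 1).toLowerCase();
    if (STATIC_EXTENSIONS.indexOf(extension) !== -1) {
      counts.static += 1;
    } else {
      counts.page += 1;
    }
  }
  return counts;
}

// Hypothetical sample paths: two pages and two static files.
const samplePaths = ["/node/1", "/sites/default/files/logo.png", "/misc/drupal.js?v=7", "/user/login"];
console.log(splitRequests(samplePaths));
```

If most of the volume turns out to be static files, the fix belongs in front of Drupal (cache headers, Varnish, a CDN) rather than in Drupal itself.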
8. Problem identified
Too many Drupal page requests per second?
Anonymous?
If anonymous, it’s usually easy to fix, as long as it’s
cacheable. We’ll go into the whole “cacheable” thing later.
If it’s cacheable, look at page cache, Boost, Varnish, CDNs.
Logged in?
Drupal’s page cache is turned off, and the requests bypass
all the caches
This usually is a more difficult problem to solve
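Whether a request counts as anonymous or logged in is typically decided from the session cookie. A minimal sketch, assuming Drupal’s default session cookie naming (a name starting with `SESS`, or `SSESS` over HTTPS); the function names and the two decision labels are hypothetical:

```javascript
// Return true when the Cookie header carries what looks like a Drupal session cookie.
// Assumption: default naming, "SESS" or "SSESS" followed by a hash.
function hasDrupalSession(cookieHeader) {
  if (!cookieHeader) {
    return false;
  }
  return cookieHeader.split(";").some(function (pair) {
    const name = pair.split("=")[0].trim();
    return /^S?SESS[a-z0-9]+$/i.test(name);
  });
}

// Requests without a session can be served from the shared cache;
// requests with one normally have to go to Drupal.
function cacheDecision(cookieHeader) {
  return hasDrupalSession(cookieHeader) ? "pass-to-drupal" : "serve-from-cache";
}

console.log(cacheDecision("has_js=1"));
console.log(cacheDecision("has_js=1; SESSabc123=xyz"));
```

In practice this decision is made in the cache layer in front of Drupal; the sketch only shows the rule being applied.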
9. Problem identified: too
many logged in users
There’s still one case that’s pretty common and still
easy enough to solve:
logged in users with a small amount of
personalized content
(small as a percentage of the CPU cost of building the page in Drupal)
10. Problem identified: too
many logged in users
[Diagram: page layout for a logged-in user]
logged in as user
content area: common content for everybody
highlights: common
your friends’ favorites
11. Problem identified: too
many logged in users
Let’s set a couple of prerequisites
You’re running in your own environment
You have Varnish configured in front of the Drupal
site
You have some skills in programming with Drupal
You got all of this? Ok, let’s
continue.
13. What’s Cache Control
It’s similar to the ESI module, with some benefits
It’s mainly aimed at caching blocks or block-like
content on the page
It usually needs some programming
For the kind of problem it fits, it’s the best
solution and can make your site faster by orders
of magnitude
14. [Diagram: user browser, Varnish, Drupal]
User first gets the common page for everybody
from Varnish
Then a JavaScript routine checks whether the
user is logged in or not
The JavaScript either makes the hidden
for-anonymous content visible or fetches this
user’s content with an Ajax request
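The flow above can be sketched roughly as follows. This is not Cache Control’s actual code or API; `personalize`, `isLoggedIn` and `fetchUserContent` are hypothetical names, and the page is modelled as plain data instead of real DOM elements so the logic stays self-contained:

```javascript
// slots: [{ id, anonymousHtml }] describes the personalizable areas of the cached page.
// isLoggedIn: boolean result of the login check (how it is detected is left open here).
// fetchUserContent: function(ids) returning a Promise of { id: html } for this user.
async function personalize(slots, isLoggedIn, fetchUserContent) {
  const rendered = {};
  if (!isLoggedIn) {
    // Anonymous visitor: reveal the for-anonymous content already in the page.
    for (const slot of slots) {
      rendered[slot.id] = slot.anonymousHtml;
    }
    return rendered;
  }
  // Logged-in user: ask the backend for all personal parts in a single request.
  const ids = slots.map(function (slot) { return slot.id; });
  const personal = await fetchUserContent(ids);
  for (const slot of slots) {
    // Fall back to the anonymous version if the backend returned nothing for a slot.
    rendered[slot.id] = Object.prototype.hasOwnProperty.call(personal, slot.id)
      ? personal[slot.id]
      : slot.anonymousHtml;
  }
  return rendered;
}

// Usage with a stubbed backend call (hypothetical slot names and markup).
const demoSlots = [
  { id: "user-box", anonymousHtml: "<a href=\"/user\">Log in</a>" },
  { id: "favorites", anonymousHtml: "<p>Staff picks</p>" }
];
const stubFetch = async function (ids) {
  return { "user-box": "<p>Logged in as user</p>", "favorites": "<p>Your friends’ favorites</p>" };
};
personalize(demoSlots, true, stubFetch).then(function (result) {
  console.log(result);
});
```

The point of the design is that the anonymous path costs no backend request at all, and the logged-in path costs exactly one.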
15. Problem identified: too
many logged in users
[Diagram: page layout with anonymous and logged-in variants]
login box
logged in as user
content area: common content for everybody
highlights: common
staff picks
your friends’ favorites
17. Benefits of Cache Control
Burdens the back-end significantly less due to
only loading the needed parts
Loads multiple blocks and/or areas with a single
request
Gives the user something to look at while
loading the hard parts of the page – and it does
make the site feel faster
Plays well with some other modules, such as
CAPTCHA
18. What about ESI
ESI (Edge Side Includes) is a partial loading
technique supported by Varnish and some CDNs,
e.g. Akamai
It basically makes Varnish do the partial page
loading
Varnish first fetches the common version from cache
Then it looks through the page for any ESI markup
Then it loads all the ESI-marked parts of the page from
cache or from Drupal
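The three steps above can be illustrated with a small sketch. It is not how Varnish is implemented; it only mimics the idea of replacing `<esi:include src="…"/>` tags with fragments taken from a cache or, on a miss, from the backend. `resolveEsi` and `fetchFromBackend` are hypothetical names, and the fragment URLs are made up:

```javascript
// pageHtml: the cached common page, possibly containing <esi:include src="..."/> tags.
// fragmentCache: a Map from fragment URL to fragment HTML.
// fetchFromBackend: function(src) returning the fragment HTML (stands in for Drupal).
function resolveEsi(pageHtml, fragmentCache, fetchFromBackend) {
  let backendRequests = 0;
  const html = pageHtml.replace(/<esi:include\s+src="([^"]+)"\s*\/>/g, function (tag, src) {
    if (!fragmentCache.has(src)) {
      // Cache miss: each missing fragment costs one separate backend request.
      backendRequests += 1;
      fragmentCache.set(src, fetchFromBackend(src));
    }
    return fragmentCache.get(src);
  });
  return { html: html, backendRequests: backendRequests };
}

// Usage: one fragment is already cached, the other has to come from the backend.
const esiCache = new Map([["/esi/highlights", "<ul><li>Common highlight</li></ul>"]]);
const esiPage = '<div><esi:include src="/esi/highlights"/></div>' +
  '<div><esi:include src="/esi/user-box"/></div>';
const esiResult = resolveEsi(esiPage, esiCache, function (src) {
  return "<p>fragment for " + src + "</p>";
});
console.log(esiResult.html);
console.log(esiResult.backendRequests);
```

Note that every fragment missing from the cache costs its own backend request.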
19. How is Cache Control
different than ESI
ESI needs to wait until the whole page is loaded
before giving anything to the user
ESI loads each portion of the page in a separate
HTTP request (still the case in D7, this might
change in D8), burdening the server with even
more bootstraps than without any cache
20. HEY… HOW ABOUT THAT
USER GENERATED
CONTENT THAT MAKES
VARNISH PURGE
EVERYTHING ALL THE
TIME?
21. Different problem
As stated, Cache Control works well for specific
problems, but it too is in trouble when the
Varnish cache gets purged all the time
That usually happens on a heavily UGC (User
Generated Content) oriented site
22. Different problem: UGC
When a single page on a site gets new content
every 2-30 seconds
Caching is of no use, and purging multiple pages
at that rate makes no sense
That data needs a way of refreshing even more
frequently
And we’re still talking about a page that doesn’t
update after it has loaded (so no Socket.IO stuff
on this slide deck, sorry)
23. Different problem: UGC
[Diagram: page layout with a frequently updating area]
logged in as user
content area: common content for everybody,
and this is getting updates every 30 seconds
highlights: common
your friends’ favorites
24. Solution: A new cache
layer
We add a new, fast-paced cache layer on the
page
We’ll try to purge and reload that cache as fast as
humanly possible in Drupal
We’ll minimize our efforts on the backend
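The idea of a fast-paced cache layer can be sketched as a tiny cache with a very short lifetime plus an explicit purge. In the setup described here this would live on the Drupal side; the sketch is in JavaScript only for illustration, and `FastCache`, the 5-second lifetime and the injectable clock are all assumptions:

```javascript
// A very small cache whose entries expire quickly and can be purged on demand.
class FastCache {
  constructor(ttlMs, now) {
    this.ttlMs = ttlMs;
    // Injectable clock so the behavior can be checked without waiting.
    this.now = now || function () { return Date.now(); };
    this.entries = new Map();
  }

  // Return the cached value, or rebuild it when it is missing or expired.
  get(key, rebuild) {
    const entry = this.entries.get(key);
    const time = this.now();
    if (entry && time - entry.storedAt < this.ttlMs) {
      return entry.value;
    }
    const value = rebuild();
    this.entries.set(key, { value: value, storedAt: time });
    return value;
  }

  // Drop an entry right away, for example when new content is posted.
  purge(key) {
    this.entries.delete(key);
  }
}

// Usage: the expensive build runs once, later reads within 5 seconds hit the cache.
const latestCache = new FastCache(5000);
let buildCount = 0;
const buildLatest = function () {
  buildCount += 1;
  return JSON.stringify({ items: ["newest comment"] });
};
latestCache.get("latest", buildLatest);
latestCache.get("latest", buildLatest);
console.log(buildCount);
```

Purging on new content keeps the data fresh, while the short lifetime bounds how stale it can get if a purge is missed.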
25. Solution: A new cache
layer
Let’s load the whole page from Varnish and then
refresh the fast-paced part with JavaScript
To minimize the load on the backend, skip the
theming layer and just load JSON
Sound good?
26. Solution: A new cache
layer
Until you realize you have to theme everything in
JavaScript, and that’s not fun
Even if you use a JavaScript templating engine,
you still have to keep your themes up to date in
two places
28. Front themer
When theming in Javascript, Front themer
makes your life a bit easier
It allows you to map your Drupal theme’s theme
implementations to very simple Javascript
versions
It’s designed to help out with simple elements,
such as boxes and lists
It might need you to tweak your theming
functions a bit to make them work better with it
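The idea of mapping theme implementations to simple JavaScript versions can be sketched like this. This is not Front themer’s actual API; the registry, the `theme` function and the `item_list` and `box` hook names are assumptions chosen only to show the shape of such a mapping for boxes and lists:

```javascript
// Escape text before placing it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical registry: one simple JavaScript implementation per theme hook.
const themeRegistry = {
  item_list: function (vars) {
    const items = vars.items.map(function (item) {
      return "<li>" + escapeHtml(item) + "</li>";
    });
    return "<ul>" + items.join("") + "</ul>";
  },
  box: function (vars) {
    // vars.content is expected to be already-rendered HTML.
    return '<div class="box"><h2>' + escapeHtml(vars.title) + "</h2>" + vars.content + "</div>";
  }
};

// Render a hook by name, mirroring how a theme function is called by name in Drupal.
function theme(hook, vars) {
  if (!Object.prototype.hasOwnProperty.call(themeRegistry, hook)) {
    throw new Error("No JavaScript theme implementation for: " + hook);
  }
  return themeRegistry[hook](vars);
}

// Usage: JSON as it might arrive from the backend, rendered without a server-side theme layer.
const boxData = { title: "Latest comments", items: ["First <b>", "Second"] };
const boxHtml = theme("box", {
  title: boxData.title,
  content: theme("item_list", { items: boxData.items })
});
console.log(boxHtml);
```

Keeping the JavaScript versions this small is what makes it realistic to keep them in step with the Drupal theme.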
29. Solution: A new cache
layer
And the back-end?
Exove has a module coming out to help get
grouped and cached JSON outputs fast from
Views
It’s not something to be used for integrations but
just for the faster cache layer
It’s going to be released this fall together with
a site that uses it
Until then, just use Views and Views Datasource
30. SO, WAIT A MINUTE
THESE ARE ALL HACKS,
RIGHT?
Not quite.
31. Drupal doing high
performance
You can’t really use Drupal for high performance
out of the box
Hacks, or rather extensions, are needed; done as
proper contrib modules, they are safe and
convenient to use
Drupal was made extensible for exactly this
reason: it can be made better by extending it
32. What would we like to see
in Drupal 8
We’d like to see real JSON output from
Drupal, preferably piece by piece
We’d also like to see a thinner bootstrap with
lazy-loading for pretty much everything
A REST interface for doing more stuff in the front
end, e.g. with JS frameworks
33. You can see a pattern here. This is all
covered by the WSCCI and Scotch
initiatives. We’re waiting for Drupal 8 to
be a lot better.
And Cache Control is going to rock on Drupal 8.
34. and there are always going to be hacks
to get Drupal to do more
35. THANK YOU FOR YOUR
TIME
PS. We’re hiring. www.exove.fi/careers