The document discusses common mistakes that are often found during website audits. It covers 5 categories: content architecture, display architecture, site architecture, security, and performance. Some examples of mistakes mentioned include having similar content types, not reusing fields, extra modules installed that are not useful, reinventing functionality that Drupal already provides well, outdated core/contrib modules, and complex queries without indexes. The document provides best practices for each category such as planning content architecture ahead of time, separating logic from presentation, using the right hooks for custom modules, keeping software updated, and optimizing databases before caching. It emphasizes the importance of testing, environments, and maintenance for the website lifecycle.
5 Common Mistakes You are Making on your Website
1. 5 Common Mistakes you are making
on your Website
Hernâni Borges de Freitas
Technical Consultant
hernani@acquia.com
@hernanibf
2. About me
• .PT / UK
• Acquia Professional Services
EMEA
• Technical Consultant
• Drupal* many things
• Passionate about web and
communities
• Travel lover
4. Site Audit
• In a limited time, we review your
website to make sure it follows best
practices and does not present risks
regarding:
• Architecture (Content , Functionality,
Display)
• Security
• Performance
• Infrastructure
• Website Life Cycle (Development,
Deployment, Maintenance).
5. This webinar
• Common mistakes we find in site audits,
looking at 5 categories:
• Architecture (Content , Functionality,
Display)
• Security
• Performance
• Infrastructure
• Website Life Cycle (Development,
Deployment, Maintenance).
6. Content architecture
“Editors don’t understand what to create. ”
“The page content type article is similar to news. We
just used it for a few months to create special
news on the homepage.”
“We needed to change this template because we
wanted to show everything in that location and
we use school_location and teacher_city.”
7. Content architecture
Symptoms
• Similar content types
• Fields not reused
• Content types with almost no nodes
Chasing it
• Take a look at field report page.
• Content type structure.
• Simple database queries
• SELECT COUNT(*), type FROM node GROUP BY type
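The count-per-type query above can be tried out as a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 with an in-memory stand-in for the node table, and the sample content types (article, news, special_news) and the threshold of 5 nodes are hypothetical, not taken from a real audit.

```python
import sqlite3


def nodes_per_type(conn):
    """Return (type, total) rows, least-used content types first."""
    return conn.execute(
        "SELECT type, COUNT(*) AS total FROM node "
        "GROUP BY type ORDER BY total, type"
    ).fetchall()


def build_demo_db():
    """Create an in-memory stand-in for the node table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT, title TEXT)"
    )
    rows = [("article", f"Article {i}") for i in range(40)]
    rows += [("news", f"News {i}") for i in range(25)]
    rows += [("special_news", "Homepage special")]
    conn.executemany("INSERT INTO node (type, title) VALUES (?, ?)", rows)
    return conn


conn = build_demo_db()
counts = nodes_per_type(conn)

# Content types with almost no nodes are candidates for merging or removal.
# The threshold of 5 is an arbitrary value chosen for this sketch.
rarely_used = [node_type for node_type, total in counts if total < 5]
print(counts)
print(rarely_used)
```

Note that a content type with zero nodes never appears in this result, so the list of defined content types still has to be compared by hand.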
8. Content architecture
Best practices
• Plan your content architecture ahead. This is probably
the most important part of your site.
• Think before creating a new field or content type.
• Reuse and standardize as many content types and fields
as possible.
• This will help in maintenance
• This will help in user experience
• This will help in performance
9. Display architecture
“Views_london, views_paris, views_lisbon show jobs
available in these cities.”
“The scores block in the sports section? Some PHP
code is controlling its visibility in the block
configuration.”
“We need those node_load() calls in preprocess_page
because we need to show those nodes on the
homepage.”
10. Display architecture
Chasing it
• Understand how pages are built.
• Look at views and how reusable they are.
• How many custom templates do you have?
• How much logic do you have in templates?
• How easy is it to switch themes (mobile,
special occasions)?
• How long does it take to produce a
totally new design for your site?
11. Display architecture
Best practices
• Clearly separate what is logic from what is
presentation.
• No code handling logic in template files (*.tpl.php)
• Custom logic in modules
• Custom logic in preprocess functions if needed
• Customize the right templates.
• Theme developer module can help.
• Start with a solid foundation to manage display and
presentation, and excel at it.
12. Site architecture
Symptoms
• Modules installed
• Number of modules that are not useful at all.
• Hacked core and modules
• “There is a module for that” – does not
mean you need to use it!
• Modules used for things they were not
designed to do.
• PHP Code in database
13. Reinventing the wheel
“This is a custom module we designed to create
forms on the fly that can be sent by email to
site admins!”
“That custom module adds small hidden
tokens to control spam on our website.”
14. Extra complexity
“We thought we needed content translation but in
the end our website is just in English.”
“Right now we only have one type of user, but
in the future we might need to have more
roles, so we already have content_access.”
“Authcache module is used to speed up pages
for our 20 journalists.”
15. Site architecture
Chasing it
• Evaluate number of modules and
functionality they are providing.
• Evaluate whether all modules are effectively
used or whether better alternatives are available
on drupal.org
• Use the Hacked! module
(http://drupal.org/project/hacked) to
compare the code in use with the original versions.
16. Custom modules
Symptoms
• Not following coding standards
• Can be a warning for what is coming…
• Not using the right hooks
• Excessive usage of hook_init, hook_nodeapi
• Not using the API
• Reinventing something that Drupal is already doing
well
• Hardcoded values (nids, tids, vids, URLs).
• All code in .module file
17. Best practices
• Balance custom code / contributed code or reusable
ways of solving problems.
• Couldn’t that query be a view?
• Couldn’t context or panels create that page?
• Couldn’t that custom action be controlled by a rule?
• Find the best modules for your use case and excel at
using them.
• Search and plan before implementation. Test it in
short sprints.
• A site architecture is something that changes
over time; reevaluate it periodically.
18. Security
“That webservice path is impossible to find, it
does not need authentication. Only the mobile
app uses it.”
“You would need to be an administrator to
access that page.”
“We are the only ones who can access the
server, therefore we are not too worried
about it.”
The things we found on your website!
19. Security
Basic problems
• Outdated core and contributed modules.
• Bad configuration
• Users have permissions to do things they shouldn’t
• Admins have easy passwords (similar to
usernames, hacked email accounts...)
• File uploads are not checked
• Code repository contains extra gifts
• Database dumps, files with information that should not be
there...
21. Security
CSRF – Cross site request forgery
• HTML Email
• <img src="http://example.com/admin/cookies/10/delete" />
• HTTP POST to forms
• You expect the request to come from your site but it can
come from anywhere
• Drupal protects against both attacks using tokens and the Form API
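The token idea behind this protection can be sketched in a few lines. This is a minimal illustration of the general technique in Python, not Drupal's actual implementation: the names SITE_SECRET, make_token and is_valid, and the session and action strings, are all hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical per-site private key; a real site would persist it securely.
SITE_SECRET = secrets.token_bytes(32)


def make_token(session_id: str, action: str) -> str:
    """Derive a token bound to one user session and one action."""
    message = f"{session_id}:{action}".encode()
    return hmac.new(SITE_SECRET, message, hashlib.sha256).hexdigest()


def is_valid(session_id: str, action: str, token: str) -> bool:
    """Accept the request only if the submitted token matches."""
    expected = make_token(session_id, action)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, token)


# The site embeds the token in its own form or link for this session.
token = make_token("session-abc", "delete-cookie-10")

# A forged request (for example from an <img> tag in an HTML email) cannot
# know the token, so it is rejected even though the browser sends cookies.
print(is_valid("session-abc", "delete-cookie-10", token))
print(is_valid("session-abc", "delete-cookie-10", "guessed"))
```

The point of binding the token to both the session and the action is that a token leaked for one user or one operation cannot be replayed for another.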
22. Performance
What is your website doing?
• How long do most pages take to load
(common lists, node pages, homepage?)
• Why do they take so long? DB queries,
application requests?
• What about edge cases? Clear cache for
instance?
• What is your caching strategy?
• What are your logs telling you?
23. Performance
• How long do most pages take to load?
• Devel can immediately show some problems
• XHProf can do the rest
• New Relic (newrelic.com) is pure gold!
• Why are CPU and memory wasted?
• Typically
• Complex queries that take too much time
• Functions called too many times
• Edge cases that are happening all the time
24. Performance
Why is the database so slow? Why is it only slow now?
• Databases not optimized to grow
• Complex queries that do not use indexes
• Complex queries generated automatically
SELECT node.nid AS nid, users.picture AS users_picture, users.uid AS users_uid,
users.name AS users_name, users.mail AS users_mail, node.title AS node_title,
GREATEST(node.changed, node_comment_statistics.last_comment_timestamp)
AS node_comment_statistics_last_updated
FROM node node
INNER JOIN users users ON node.uid = users.uid
INNER JOIN node_comment_statistics node_comment_statistics
ON node.nid = node_comment_statistics.nid
ORDER BY node_comment_statistics_last_updated DESC
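Whether a query uses an index can be checked by asking the database for its query plan. The sketch below is a simplified stand-in and not the query from the slide: it uses Python's sqlite3 and SQLite's EXPLAIN QUERY PLAN (on MySQL the equivalent check is EXPLAIN), and the table contents and the index name idx_node_type are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The textual description is the last column of each plan row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node ("
    "nid INTEGER PRIMARY KEY, type TEXT, title TEXT, changed INTEGER)"
)
conn.executemany(
    "INSERT INTO node (type, title, changed) VALUES (?, ?, ?)",
    [("news" if i % 2 else "article", f"Node {i}", i) for i in range(1000)],
)

sql = "SELECT nid, title FROM node WHERE type = ?"

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, sql, ("news",))

# After adding an index on the filtered column it can search the index.
conn.execute("CREATE INDEX idx_node_type ON node (type)")
plan_after = query_plan(conn, sql, ("news",))

print(plan_before)
print(plan_after)
```

A query like the one on the slide, which sorts on a computed GREATEST(...) value, is harder to fix this way because the sort key is not a stored column that an ordinary index can cover.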
25. Performance
Is using InnoDB always better?
SELECT COUNT(*) FROM (SELECT DISTINCT node.nid AS nid FROM node node
LEFT JOIN og_ancestry og_ancestry ON node.nid = og_ancestry.nid INNER JOIN
users users ON node.uid = users.uid INNER JOIN node_comment_statistics
node_comment_statistics ON node.nid = node_comment_statistics.nid WHERE
og_ancestry.group_nid = 5 ) count_alias
• Use the Views Litepager module instead, if possible.
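The idea behind a lite pager is to skip the expensive COUNT(*) query and only find out whether a next page exists. A minimal sketch of that technique, assuming a simplified node table with a hypothetical group_nid column instead of the og_ancestry join, and not the module's actual code:

```python
import sqlite3


def fetch_page(conn, group_nid, page, page_size=10):
    """Fetch one page of nids plus a flag telling whether a next page exists.

    Instead of counting all matching rows, ask for one row more than the
    page size; if it comes back, there is at least one more page.
    """
    rows = conn.execute(
        "SELECT nid FROM node WHERE group_nid = ? "
        "ORDER BY nid LIMIT ? OFFSET ?",
        (group_nid, page_size + 1, page * page_size),
    ).fetchall()
    has_next = len(rows) > page_size
    return [row[0] for row in rows[:page_size]], has_next


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, group_nid INTEGER)")
# Hypothetical data: 25 nodes in group 5, followed by 3 nodes in group 7.
conn.executemany(
    "INSERT INTO node (group_nid) VALUES (?)", [(5,)] * 25 + [(7,)] * 3
)

first_page = fetch_page(conn, 5, 0)
last_page = fetch_page(conn, 5, 2)
print(first_page)
print(last_page)
```

The trade-off is that the pager can only offer previous and next links, since the total number of pages is never computed.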
26. Performance
Optimizing before caching
• “My Site is Slow” - talk at DrupalCamp Madrid / DrupalCamp
London
•http://www.slideshare.net/hernanibf/london2013
27. Performance
Can it be cached?
• Ensure caching and aggregation are enabled. Yes, look at it!
• Review caching strategy:
• https://www.acquia.com/blog/when-and-how-caching-can-save-your-site-part-2-authenticated-users
• Make sure caching is actually helping you.
• Don’t clear it too often.
• It should not benefit only a minority of users.
• Evaluate complexity before choosing a direction.
28. Infrastructure
This is where your website ends..
• What is the right size? How do you grow?
• Are the different servers well tuned?
• Apache / PHP
• MySQL
• Varnish
• What are your logs telling you?
29. Infrastructure
“Our DB Server has 48 GB of memory. Enough to
handle all requests!”
• my.cnf
• innodb_buffer_pool_size = 1024M
• Adjust limits according to your resources.
• http://mysqltuner.pl
• Your slowest component determines your overall
performance.
30. Infrastructure
“We don’t need that many web servers. As
Varnish is set in front and working as a reverse
proxy, most of the traffic will be cached.”
31. Infrastructure
“Our external firewall controls all sorts of attacks.
We don’t use any specific firewall in the
servers.”
• 50-70% of attacks are internal. Remote connections to the DB,
Memcached and Solr should be forbidden.
• It is hard to keep track of details in fast-moving environments.
32. Website Life Cycle
This is going to be most of the work!
• What is your deployment architecture?
• How hard is it to change?
• How do you test changes?
• How relaxed are you when you leave your desk?
33. Deployment
“We just copy the code directly to the server by
FTP.”
“Any developer can just take a snapshot from
production and install on their laptop.”
“Don’t touch that module. We made some
changes to the original code.”
34. Development
Control your code!
• Every piece of code should be under VCS.
• Git, Mercurial, Bazaar, SVN, CVS
• Copying to backup folders is not VCS.
• Yes, log messages should not be empty…
• No, your holiday pictures should not be under VCS.
• No, your database dumps shouldn’t be there either.
35. Maintenance
“We can only test that in production.”
“Yes we have a staging environment. But its data is
from last summer.”
“Sometimes problems occur when we upgrade.
But we always have a backup.”
36. Environments
Do once, prepare many!
• Several environments should exist
• Development, Staging and Production.
• It should be possible to deploy from VCS to them!
• Environments should be up to date and accessible
• Environments should be as similar as possible to real
life
• Environments should be easy to destroy and
replicate
37. Maintenance
This is going to be most of the work!
• Be prepared for changes
• You don’t control them most of the time!
• Pay attention to security updates
• Review your logs periodically
• Review your website architecture periodically