This document discusses empowering Magnolia for enterprise use cases. It describes Aperto, an agency that has been using Magnolia since 2006 for clients. The document then covers two main topics: 1) customizing workflows for specific business requirements, and 2) approaches for handling large volumes of user-generated content (UGC), such as comments. For workflows, it provides examples of enhancing features and usability. For UGC, it describes separating this processing to avoid bottlenecks and ensure availability during peak loads. It also demonstrates a custom solution using Google Web Toolkit for comment moderation.
Empowering Magnolia for Enterprise Use Cases - Experience Report
1. Empowering Magnolia for Enterprise Use Cases
Magnolia Conference, Technical Track | Basel, 16. September 2010
2. About us
Sebastian Frick, Technical Project Manager
Jörg von Frantzius, System Architect
3. Some facts about Aperto
Internet agency in Berlin
Offering conception, design, development, online marketing
Building projects with Magnolia since 2006 for clients like Siemens, Bertelsmann, EADS, INSM, Frankfurt School and others
Contributed frontend and concept for the Standard Templating Kit
4. What we are talking about today
Part 1: workflow specific enhancements
Part 2: approach on dealing with user generated content
5. Part 1: workflow specific enhancements
Workflows - out of box features
Typical business requirements
Customization examples
Best practices / recommendations
6. Workflows - out of box features
Standard 4-eyes-workflow for publishing process
Sending of e-mail notifications
Management of multiple workflows
Time-based de-/activation
Commenting
Inbox for editors for managing workflow items
7. Standard 4 Eyes-Workflow
Group „Editors“: activates content
Group „Publishers“: approves content (published) or rejects content
8. Mapping Configuration in AdminCentral
Workflows depending on paths in CMS
Different repositories can share one workflow or run their own
9. OpenWFE – XML definition
XML contains
Process-definitions
Participants (e.g. group, role, user)
Fields (variables)
Conditional expressions (if, while, loop)
Many more expressions or patterns (see OpenWFE manual)
10. There do exist some tools, but...
OpenWFE IDE
DroFlo – visual editor
Not mature or Magnolia-specific enough: modelling a workflow usually takes a good text editor with syntax highlighting and a developer.
11. Typical business requirements > workflow process
Enhanced number of steps or states (e.g. 8-eyes-workflow)
Automatic or manual selection of next receiver
Non-linear pattern (e.g. one item assigned to different groups at the same time)
Workflow engine for different scenarios than publishing (e.g. internal processes)
13. Manual selection of receiver
Activation dialog is extendible like any usual Magnolia dialog
Variables set in the dialog can be retrieved via OpenWFE elements
14. Automatic selection of receiver
Possible scenarios
by language of content
by section in site tree
by role
15. Dynamic selection of receiver
Make use of commands or custom functions for „outsourcing“ business logic
to custom Java classes or external services
Method added in OpenWFE‘s function-map.xml
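As a rough illustration of the kind of business logic that could be moved into a custom Java class, the sketch below picks a receiver group by section in the site tree and by content language (two of the scenarios from the previous slide). It is plain Java and does not use the Magnolia or OpenWFE APIs. The class name `ReceiverResolver`, the group names and the paths are all assumptions made for this example, and wiring such a class into a command or into function-map.xml is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: none of these names come from Magnolia or OpenWFE.
final class ReceiverResolver {
    // Site-tree path prefix -> group base name; insertion order defines rule priority.
    private final Map<String, String> sectionGroups = new LinkedHashMap<>();
    private final String defaultGroup;

    ReceiverResolver(String defaultGroup) {
        this.defaultGroup = defaultGroup;
    }

    ReceiverResolver mapSection(String pathPrefix, String group) {
        sectionGroups.put(pathPrefix, group);
        return this;
    }

    // Picks the group by section first, then appends the content language.
    String resolve(String path, String language) {
        String base = defaultGroup;
        for (Map.Entry<String, String> rule : sectionGroups.entrySet()) {
            if (path.startsWith(rule.getKey())) {
                base = rule.getValue();
                break;
            }
        }
        return base + "-" + language;
    }

    public static void main(String[] args) {
        ReceiverResolver resolver = new ReceiverResolver("wf-publishers")
                .mapSection("/press/", "wf-press")
                .mapSection("/products/", "wf-products");
        System.out.println(resolver.resolve("/press/2010/release", "de"));
        System.out.println(resolver.resolve("/about/team", "en"));
    }
}
```

Keeping the rules in one class means the workflow definition only needs to ask for "the next receiver" and does not have to repeat the conditions for every step.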
16. Typical business requirements – workflow usability
Display of current number of workflow items
Display of current process status & participant
Better traceability: workflow history
19. Custom column providing additional workflow information
Sitetree implementation can be exchanged by configuration
Adding additional columns is quite easy (via Java)
Meta-information can be retrieved from the WorkflowItem object
20. Current number of items in inbox
Example for dynamic display of current workflow items via AJAX-based polling
AdminCentral frontend is extendible, but we‘re looking forward to the new MagnoliaUI
22. Enhancing workflow-dialog by history tab
History info of a workflow item can be built from attributes available in the WorkflowItem object
23. Enhancing workflow-dialog by references tab
Since content on one page can be located in several repositories, showing references may be helpful
Relations (UUID) and activation state can be retrieved via Magnolia standard functions
29. Best practices / recommendations #3
Avoid redundancies in workflow definitions whenever possible
Make use of Java-based commands (easier to maintain)
30. Best practices / recommendations #4
If groups are used for determining workflow participants:
don‘t add ACLs to workflow groups
use separate groups instead
define a proper naming convention
31. Best practices / recommendations #5
Don‘t underestimate testing efforts
Set up a testing plan
Have an idea of how to monitor single workitems before the development phase starts
Iterative development and testing
Regular acceptance testing; adjustments will usually be necessary
32. Links
Evaluation matrix of workflow engines
Magnolia Workflow introduction
Home of OpenWFE
33. Part 2: approach on dealing with user generated content
2.1. UGC: what‘s the problem?
2.2. General solution + 2 implementation approaches
2.3. GWT in AdminCentral
34. Client‘s UGC requirements
(UGC = User Generated Comments, e.g. page comments)
Client‘s website has page-commenting feature
At peak load times, thousands of users want to post their page comments
within a couple of minutes
Client‘s requirement:
Sustained content delivery during UGC peak loads!
Basel | 10.09.2009 | Magnolia Conference | Technical Track 34
35. But …
Magnolia does scale just great!
So, what‘s the problem?
36. The problem: UGC POST requests differ from content requests
Content requests
can be satisfied from cache, i.e. fast response time per request
no bottleneck, i.e. performance scales linearly with number of servers
network bandwidth can be maxed out, given a
good caching hierarchy and
sufficient hardware sizing
Not the case with UGC POST requests…
37. The problem: UGC POST requests have a much larger performance impact!
UGC POST requests
cannot be satisfied from cache!
because they require a DB insert
requests take orders of magnitude longer,
meanwhile blocking your HTTP worker threads
system can take fewer of these requests simultaneously
before becoming unavailable
DB will become the bottleneck at some point
UGC load can exceed any hardware sizing
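To make the effect of blocked worker threads concrete, here is a back-of-the-envelope calculation: with a fixed pool of HTTP worker threads, the upper bound on throughput is the number of threads divided by the time each request holds a thread. The pool size and both latencies are made-up illustrative values, not measurements from the project.

```java
// Back-of-the-envelope capacity estimate; all numbers are illustrative assumptions.
final class WorkerCapacity {
    // Upper bound on requests per second for one server:
    // each worker thread can finish 1000 / latencyMillis requests per second.
    static double maxRequestsPerSecond(int workerThreads, double latencyMillis) {
        return workerThreads * 1000.0 / latencyMillis;
    }

    public static void main(String[] args) {
        int threads = 200;          // assumed HTTP worker pool size
        double cachedMillis = 5.0;  // assumed latency of a cached content request
        double postMillis = 500.0;  // assumed latency of a UGC POST with DB insert
        System.out.printf("cached: %.0f req/s%n", maxRequestsPerSecond(threads, cachedMillis));
        System.out.printf("POST:   %.0f req/s%n", maxRequestsPerSecond(threads, postMillis));
    }
}
```

Under these assumptions the same server handles at most 40000 cached requests per second but only 400 POST requests per second. Once POST requests arrive faster than that, every worker thread is busy and content requests on the same server have to wait as well.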
38. Solution (system architecture)
Website availability can only be ensured by
separating content delivery from UGC processing,
through separate operating system processes
UGC processing can run on dedicated hardware if necessary
So it could also be shifted into the cloud
39. Solution (system architecture): consequences
Consequences of separation:
Even if UGC processes fail (all threads busy):
Magnolia processes happily continue to serve content requests
Worst case only means: the end user still sees the web page contents,
with an additional error message „page commenting currently unavailable“
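The worst case on the UGC side can be sketched with a small, bounded worker pool that rejects new comments once all threads are busy, instead of queueing them without limit. This is only a model of the failure mode inside the UGC process. The content-delivery process is separate and not modelled here. The class `CommentGateway`, the pool sizes and the "comment accepted" message are assumptions for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a UGC front end with a small, bounded worker pool.
final class CommentGateway {
    static final String UNAVAILABLE = "page commenting currently unavailable";

    private final ThreadPoolExecutor pool;

    CommentGateway(int threads, int queueSize) {
        // AbortPolicy rejects new work instead of queueing without bound.
        pool = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize), new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns a status message; does not block the caller when the pool is saturated.
    String post(Runnable storeComment) {
        try {
            pool.execute(storeComment);
            return "comment accepted";
        } catch (RejectedExecutionException e) {
            return UNAVAILABLE;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }

    public static void main(String[] args) {
        CommentGateway gateway = new CommentGateway(1, 1);
        CountDownLatch slowDb = new CountDownLatch(1);
        Runnable slowInsert = () -> {
            try {
                slowDb.await(); // simulates a DB insert that does not return yet
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println(gateway.post(slowInsert)); // occupies the only thread
        System.out.println(gateway.post(slowInsert)); // fills the queue
        System.out.println(gateway.post(slowInsert)); // rejected: pool and queue are full
        slowDb.countDown();
        gateway.shutdown();
    }
}
```

With one thread and one queue slot, the third call is rejected and returns the error message. Because the rejection happens in the UGC process, the processes serving page content are not involved at all.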
40. Magnolia approach for page comments
Commenting module
(http://documentation.magnolia-cms.com/modules/commenting.html)
In order to separate processes:
have dedicated Magnolia instances that serve only the commenting repository
Must have multiple of these instances for scalability
UGC is not published from author to publish servers,
but still all publish servers must see same content:
must set up shared JCR repository using Jackrabbit clustering
(http://wiki.apache.org/jackrabbit/Clustering)
42. Decision Magnolia approach vs. custom solution
Problems that we saw for us:
Increased system complexity (set up Jackrabbit clustering, set up dedicated
Magnolia instances with commenting repository)
Our lack of practical experience with clustered Jackrabbit:
How hard is it to set up?
How does it scale, in terms of lock contention?
How does it behave under high load?
What long-term consequences does journaling have (performance, maintenance)?
For us: incurred complexity and risks outweigh advantages
43. Custom architecture chosen for UGC processing
Page comments are rendered in browser by Javascript,
using Google Web Toolkit (GWT)
UGC requests served by separate Tomcat instances,
containing only a single REST webservice
Comments data is stored in clustered RDBMS
Comment moderation implemented with GWT
Proven software stack on server-side,
we know which screws to turn for optimization
45. Comment moderation: UI requirements
Comment moderation requires lots of tedious manual work,
UI shouldn‘t make a hard job even worse
So UI should have:
High usability
In particular: immediate responsiveness where possible
(i.e. no noticeable delay between click and visual response)
46. Comment moderation in admin interface with GWT
Solution: Google Web Toolkit (GWT)
(Java translated to Javascript + great tooling)
suitable for functional UIs (i.e. without pixel-grained styling)
server roundtrips can be minimized:
As much logic as wanted can be executed in browser
(e.g. status message update upon selection change)
As much state as wanted can be held in browser
(e.g. caching of previously shown rows in a paging table)
Economical implementation through development and debugging in Java
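The idea of holding state in the browser to save roundtrips can be sketched as a small page cache for a paging table. This is plain Java, not GWT code. Real GWT RPC calls are asynchronous, which this synchronous sketch ignores, and `PageCache` is a hypothetical class that is not part of GWT or org.gwtlib.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical client-side cache for a paging table; not part of GWT or org.gwtlib.
final class PageCache<T> {
    private final Map<Integer, List<T>> pages = new HashMap<>();
    private final IntFunction<List<T>> loader; // stands in for a server roundtrip
    private int roundtrips = 0;

    PageCache(IntFunction<List<T>> loader) {
        this.loader = loader;
    }

    // Returns the rows of a page, asking the server only on first access.
    List<T> page(int index) {
        List<T> rows = pages.get(index);
        if (rows == null) {
            roundtrips++;
            rows = loader.apply(index);
            pages.put(index, rows);
        }
        return rows;
    }

    int roundtrips() {
        return roundtrips;
    }

    public static void main(String[] args) {
        PageCache<String> cache = new PageCache<>(index -> {
            List<String> rows = new ArrayList<>();
            rows.add("comment " + (index * 2));
            rows.add("comment " + (index * 2 + 1));
            return rows;
        });
        cache.page(0);
        cache.page(1);
        cache.page(0); // served from the cache, no roundtrip
        System.out.println("roundtrips: " + cache.roundtrips());
    }
}
```

Paging back to an already visited page costs no server call, which is what gives the moderator an immediate response. A real moderation UI would also need to invalidate cached pages when comments change. That is left out here.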
47. Comment moderation demo
Demo…
For the technically interested:
GWT RPC turned out to be fast and reliable and,
most of all, much easier to program than a custom REST webservice
JPA2 entities are serialized transparently through GWT RPC,
by using net.sf.gilead
Paging table based on org.gwtlib
Following: the big picture…
49. Thank you for your interest!
Our contacts:
Sebastian.frick@aperto.de
Joerg.Frantzius@aperto.de
On the web:
http://www.aperto.de
http://blog.aperto.de
http://www.twitter.com/aperto