1. A Content Repository for TYPO3 5.0
TYPO3 Developer Days 25.-29.04.2007, Dietikon / Switzerland
2. Special guest: David Nuescheler
Responsible for the technology strategy and ongoing product
development at Day, which he joined in 1994
Specification lead on JSR 170 and JSR 283.
Also a committer on the Apache Jackrabbit Project and a
member of the Apache Software Foundation
He will now tell us more about JCR
Inspiring people to
share
3. Why a CR for TYPO3?
Flexible and extensible data structure
Object based storage and retrieval
Combines advantages of navigational and relational databases
Security can be enforced on a higher level
Cleaner and easier to use for the developer
4. More reasons for a CR
Data source abstraction instead of database abstraction
Data can be stored in different ways, a database is only one of
them
Due to the higher level of abstraction, database specific
functions and specialties like transactions, stored procedures,
partitioning ... can be used on the CR implementation level
Depending on the CR implementation the speed gain for read
access to the content tree can be immense
5. The Jackrabbit “shortcut”
As no PHP-based CR implementation exists, we looked for
alternatives
Jackrabbit is the JSR 170 reference implementation, providing
all required and optional features of the specification
Using it from PHP is possible with the PHP-Java-Bridge
Provides a way to write and run the PHP-based unit tests that are
needed for implementing a pure PHP-based CR
Are we crazy?
6. A native PHP Content Repository
TYPO3 5.0 will still run completely without Java - by accessing
the PHP-based TYPO3 CR, based on the APIs defined in JSR 170
and JSR 283
The goal: A flexible and powerful content repository for TYPO3
written in PHP
We are not crazy
It is not impossible
Maybe not all of the standard will be implemented – but don’t
tell anyone...
7. Current status
phpCR: The JSR-170 API exists as PHP interfaces, thanks to
Travis Swicegood
The Jackrabbit bridge has proven to be a working setup,
although it does not handle the full API yet - maybe it never
will
We have a large set of unit tests available for the
phpCRJackrabbit package
A first batch of those tests has been generalized to be usable
for any implementation of the phpCR interfaces
8. Missing things
A domain model for the CMS part of the project
A way for defining node types based on that model
9. Defining the CMS domain model
We need to focus on the pure domain of the CMS
A first step is to find the common set of objects that form the
domain of content management
So, let’s see...
10. Defining the CMS domain model
Candidate domain objects shown on the slide:
Page, Page Tree, Sitemap, Content Element, Plugin, Template,
Record, Category, System Folder, Backend Module, Workspace
11. A possible hierarchy of things
Assignment: try to come up with a hierarchy of objects that
represent the content we currently have - and trim where
possible
You have 10 minutes...
12. Node types
To make good use of a CR, one needs to provide useful node
types
A node type specifies
allowed and/or required sub nodes of a node
allowed and/or required properties of a node
supertypes of a node, i.e. inheritance
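A minimal sketch of what such a node type could look like as a data structure, written in Python for brevity. The class `NodeType`, the functions `effective` and `validate`, and the `demo:` type names are hypothetical illustrations, not part of JSR 170 or phpCR. Required sub nodes are left out to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class NodeType:
    """Hypothetical, simplified node type (not the JSR 170 API)."""
    name: str
    supertypes: list = field(default_factory=list)        # inheritance
    required_props: set = field(default_factory=set)      # must be present
    allowed_props: set = field(default_factory=set)       # may be present
    allowed_children: set = field(default_factory=set)    # allowed sub node types


def effective(node_type, registry):
    """Merge a type's own constraints with those inherited from its supertypes."""
    required = set(node_type.required_props)
    allowed = set(node_type.allowed_props)
    children = set(node_type.allowed_children)
    for parent_name in node_type.supertypes:
        p_required, p_allowed, p_children = effective(registry[parent_name], registry)
        required |= p_required
        allowed |= p_allowed
        children |= p_children
    # A required property is implicitly an allowed one.
    return required, allowed | required, children


def validate(props, child_types, node_type, registry):
    """Return a list of constraint violations for one node (empty if valid)."""
    required, allowed, children = effective(node_type, registry)
    errors = [f"missing property: {p}" for p in sorted(required - set(props))]
    errors += [f"property not allowed: {p}" for p in sorted(set(props) - allowed)]
    errors += [f"child type not allowed: {c}" for c in sorted(set(child_types) - children)]
    return errors


# Assumed example types, for illustration only.
registry = {
    "demo:base": NodeType("demo:base", required_props={"title"}),
    "demo:page": NodeType(
        "demo:page",
        supertypes=["demo:base"],
        allowed_props={"subtitle"},
        allowed_children={"demo:contentElement"},
    ),
}
```

Here `demo:page` inherits the required `title` property from `demo:base`, so a page without a title is reported as invalid even though `demo:page` itself never mentions that property.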
13. The node types of magnolia
[Diagram: Magnolia node types, each shown with its supertype in angle brackets]
mgnl:contentNode <nt:hierarchyNode>
mgnl:content <nt:hierarchyNode>
mgnl:reserve <nt:hierarchyNode>
mgnl:metaData <nt:hierarchyNode>
mgnl:resource <nt:resource>
mgnl:user <mgnl:content>
mgnl:role <mgnl:content>
mgnl:group <mgnl:content>
Also shown in the diagram: <nt:base>, <mix:versionable>, *
All nodes can have arbitrary properties...
14. Our node types?
The node types should (partly) reflect the domain model
Specifically the parts of the domain model that need to be
persisted
Coming up with a reasonable system of node types is not trivial
We need to work further on the domain model before the next
steps make sense...
15. CR configuration from code
Currently MySQL tables are created when installing an extension
The definition is a plain SQL file
Further data comes from $TCA as defined in ext_tables.php
and/or tca.php
Automation needs to stay around, of course
We need to create node types instead of tables and fields
16. CR configuration from code
Goals
Get rid of multiple places for defining things
Make it as transparent as possible
Create node types based on PHP objects
Use reflection to gather information about the objects
Create node type definition accordingly
What objects need a corresponding node type?
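The reflection idea on this slide can be sketched roughly as follows. Python's dataclass introspection stands in for PHP reflection here. The `Page` class, the `JCR_TYPES` mapping, the `demo` prefix and `node_type_from_class` are all assumptions made for illustration, not an existing TYPO3 or phpCR API.

```python
from dataclasses import dataclass, fields

# Assumed mapping from Python types to JCR property type names.
JCR_TYPES = {str: "string", int: "long", float: "double", bool: "boolean"}


@dataclass
class Page:
    """Hypothetical domain object that should be persisted in the CR."""
    title: str
    hidden: bool = False
    sorting: int = 0


def node_type_from_class(cls, prefix="demo"):
    """Inspect a dataclass and derive a simple node type description from it."""
    local_name = cls.__name__[0].lower() + cls.__name__[1:]
    return {
        "name": f"{prefix}:{local_name}",
        # One property definition per declared field, typed via JCR_TYPES.
        "properties": {f.name: JCR_TYPES[f.type] for f in fields(cls)},
    }


page_type = node_type_from_class(Page)
```

The point of the design is that the class is the single place where the structure is defined, and the node type description is derived from it instead of being maintained separately.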
17. Changes to existing node types
Changing and removing a node type is possible
But what about node types that are in use?
Jackrabbit currently rejects nontrivial changes
We will probably only change node types on explicit request
Changing a node type may fail if the result would be
inconsistent repository content
Existing data needs to be removed before a node type can be
removed
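The "remove data first" rule can be illustrated with a toy in-memory repository. This is only a sketch of the rule. The `Repository` class, its methods and the `NodeTypeInUse` exception are hypothetical and do not reflect how Jackrabbit implements it.

```python
class NodeTypeInUse(Exception):
    """Raised when a node type cannot be removed because nodes still use it."""


class Repository:
    """Toy repository: tracks registered node types and the type of each node."""

    def __init__(self):
        self.node_types = set()
        self.nodes = {}  # node path -> node type name

    def register(self, name):
        self.node_types.add(name)

    def add_node(self, path, node_type):
        if node_type not in self.node_types:
            raise KeyError(f"unknown node type: {node_type}")
        self.nodes[path] = node_type

    def remove_node(self, path):
        del self.nodes[path]

    def unregister(self, name):
        # Refuse removal while any node of this type still exists.
        users = sorted(p for p, t in self.nodes.items() if t == name)
        if users:
            raise NodeTypeInUse(f"{name} is still used by {', '.join(users)}")
        self.node_types.discard(name)


repo = Repository()
repo.register("demo:page")
repo.add_node("/home", "demo:page")
try:
    repo.unregister("demo:page")      # fails: /home still uses the type
    outcome = "removed"
except NodeTypeInUse as error:
    outcome = str(error)
repo.remove_node("/home")             # remove the existing data first...
repo.unregister("demo:page")          # ...then the removal succeeds
```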
18. CR configuration from code
JSR 170 had no defined API for registering node types
JSR 283 will have it, and we will use that by
adding it to the phpCR interfaces
adding some wrapper for Jackrabbit
An intermediate step is the generation of a file containing the
node type definition in Compact Namespace and Node Type
Definition (CND) notation
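As a rough sketch of that intermediate step, the function below renders a node type description as CND text. It covers only a small subset of the notation (namespace declaration, supertypes, typed properties with an optional `mandatory` flag, child node definitions). The function `to_cnd`, the `demo` namespace and its URI are assumptions for illustration.

```python
def to_cnd(prefix, uri, name, supertypes, properties, children):
    """Render one node type as CND text (small subset of the notation).

    properties: {property name: (type name, mandatory flag)}
    children:   {child node name: required node type}
    """
    lines = [f"<{prefix} = '{uri}'>"]          # namespace declaration
    header = f"[{name}]"
    if supertypes:
        header += " > " + ", ".join(supertypes)
    lines.append(header)
    for prop_name, (prop_type, mandatory) in properties.items():
        line = f"  - {prop_name} ({prop_type})"
        if mandatory:
            line += " mandatory"
        lines.append(line)
    for child_name, required_type in children.items():
        lines.append(f"  + {child_name} ({required_type})")
    return "\n".join(lines)


cnd = to_cnd(
    prefix="demo",
    uri="http://example.org/demo/1.0",         # placeholder namespace URI
    name="demo:page",
    supertypes=["nt:hierarchyNode"],
    properties={
        "demo:title": ("string", True),
        "demo:subtitle": ("string", False),
    },
    children={"*": "demo:contentElement"},     # any child name, one allowed type
)
```

Writing `cnd` to a file would give a definition that could then be registered by hand or by a wrapper until a registration API is available.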
19. Storing actual content
One way is to store e.g. the text of a text content element as we
do today, i.e. as a string
What about links in the text?
To be aware of links, we’d need to parse it and maintain a
reference index
A possible syntax:
<a href="${link:{uuid:{522c0cac-7d67-4324-869f-7553426f95b0},
repository:{website},workspace:{default},
path:{/help/user-mailing-list}}}">some link</a>
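Parsing stored text and maintaining a reference index could look roughly like this. The regular expression follows the `${link:{uuid:{...},repository:{...},workspace:{...},path:{...}}}` syntax proposed on the slide. The `index_links` function and the `/home/intro` source path are hypothetical.

```python
import re
from collections import defaultdict

# Matches the link placeholder syntax proposed above.
LINK = re.compile(
    r"\$\{link:\{uuid:\{(?P<uuid>[0-9a-f-]+)\},"
    r"repository:\{(?P<repository>[^}]*)\},"
    r"workspace:\{(?P<workspace>[^}]*)\},"
    r"path:\{(?P<path>[^}]*)\}\}\}"
)


def index_links(texts):
    """Map each target UUID to the set of source node paths that link to it.

    texts: {source node path: stored text of that node}
    """
    index = defaultdict(set)
    for source_path, text in texts.items():
        for match in LINK.finditer(text):
            index[match["uuid"]].add(source_path)
    return dict(index)


stored = {
    "/home/intro": (
        '<a href="${link:{uuid:{522c0cac-7d67-4324-869f-7553426f95b0},'
        "repository:{website},workspace:{default},"
        'path:{/help/user-mailing-list}}}">some link</a>'
    ),
}
reference_index = index_links(stored)
```

The index would have to be rebuilt for a node every time its text changes, which is the maintenance cost this approach carries.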
20. Storing actual content
An alternative could be to break up the content in smaller nodes
A working example is the DOM tree of an HTML document
Advantages
No need for a separate reference index
Queries for links are always easily possible
Disadvantages
Adds quite some complexity
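To make the trade-off concrete, here is a small sketch that breaks an HTML fragment into a tree of nodes and then finds links by walking that tree. The dict-based node layout, `NodeBuilder` and `find_links` are assumptions for illustration. The sketch assumes well-formed fragments without void elements such as `<br>`.

```python
from html.parser import HTMLParser


class NodeBuilder(HTMLParser):
    """Turn an HTML fragment into a tree of simple dict 'nodes'."""

    def __init__(self):
        super().__init__()
        self.root = {"name": "root", "props": {}, "children": []}
        self.stack = [self.root]  # path from the root to the currently open node

    def handle_starttag(self, tag, attrs):
        node = {"name": tag, "props": dict(attrs), "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only text
            text = {"name": "#text", "props": {"value": data}, "children": []}
            self.stack[-1]["children"].append(text)


def find_links(node):
    """Collect href values of all 'a' nodes below (and including) node."""
    found = []
    if node["name"] == "a" and "href" in node["props"]:
        found.append(node["props"]["href"])
    for child in node["children"]:
        found.extend(find_links(child))
    return found


builder = NodeBuilder()
builder.feed('<p>See the <a href="/help/user-mailing-list">mailing list</a>'
             ' or the <a href="/help/faq">FAQ</a>.</p>')
links = find_links(builder.root)
```

Links are ordinary nodes here, so no separate index is needed, but even one short paragraph already becomes a tree of seven nodes below the root.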
21. Open tasks & next steps
An awful lot of them...
22. Thanks for listening
Karsten Dambekalns <karsten@typo3.org>