2. Overall presentation goal
Learn how to use content repository APIs
to store and manipulate rich documents
www.devoxx.com
3. Speaker's qualifications
Florent Guillaume is Head of R&D at Nuxeo, a leading
vendor of Open Source ECM platforms
Florent Guillaume is a member of international
standardization technical committees: JSR-283, OASIS CMIS
Florent Guillaume architected and wrote an ECM system
(Nuxeo CPS) and is a lead architect on another (Nuxeo EP)
Florent Guillaume wrote a low-level storage engine for
Nuxeo Core
4. What is a document?
Store information
Attached file (binary stream)
Metadata
Retrieve information
Search
Security
Be visible and editable
Out of scope for a repository
5. What is a "rich" document?
Depends on your definition of "rich"
7. What is a "rich" document?
Depends on your definition of "rich"
Associated binary streams
Multiple attached files
Thumbnails (vignettes)
Renditions
Complex metadata
Lists of records
Sub-records (XML-like)
9. Filesystem
Decide where to store files
Decide how to uniquely identify files
Use java.io.FileOutputStream for binaries
Decide how to serialize metadata
All by hand
java.io.ObjectOutputStream, Serializable
etc.
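As a hedged illustration of the "all by hand" approach, the sketch below serializes a metadata record to its own file with java.io.ObjectOutputStream and reads it back. The DocumentMetadata class, its fields, and the ".meta" file suffix are assumptions made for this example, not part of the talk.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical metadata record; the field names are illustrative only.
class DocumentMetadata implements Serializable {
    private static final long serialVersionUID = 1L;
    final String title;
    final List<String> subjects;

    DocumentMetadata(String title, List<String> subjects) {
        this.title = title;
        this.subjects = new ArrayList<String>(subjects);
    }
}

class MetadataStore {
    private final File rootDir;

    MetadataStore(File rootDir) {
        this.rootDir = rootDir;
    }

    // One metadata file per document id, e.g. "42.meta" (naming is an assumption).
    private File fileFor(long id) {
        return new File(rootDir, id + ".meta");
    }

    void write(long id, DocumentMetadata meta) throws IOException {
        ObjectOutputStream out =
                new ObjectOutputStream(new FileOutputStream(fileFor(id)));
        try {
            out.writeObject(meta);
        } finally {
            out.close();
        }
    }

    DocumentMetadata read(long id) throws IOException, ClassNotFoundException {
        ObjectInputStream in =
                new ObjectInputStream(new FileInputStream(fileFor(id)));
        try {
            return (DocumentMetadata) in.readObject();
        } finally {
            in.close();
        }
    }
}
```

Everything here (file naming, serialization format, class versioning) is a decision the application has to make and maintain on its own.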
10. Filesystem example
Read/write
public void createDocument(Long id, String title) throws IOException {
    File file = new File(ROOT_DIR, id.toString());
    OutputStream out = new FileOutputStream(file);
    try {
        out.write(title.getBytes("UTF-8"));
    } finally {
        out.close();
    }
}

public MyDocument getDocument(Long id) throws IOException {
    File file = new File(ROOT_DIR, id.toString());
    InputStream in = new FileInputStream(file);
    try {
        // Read the whole file: a single read() may return fewer bytes
        // than the file holds, and titles may exceed one buffer.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] bytes = new byte[4096];
        int n;
        while ((n = in.read(bytes)) != -1) {
            buf.write(bytes, 0, n);
        }
        String title = new String(buf.toByteArray(), "UTF-8");
        MyDocument doc = new MyDocument();
        doc.setId(id);
        doc.setTitle(title);
        return doc;
    } finally {
        in.close();
    }
}
14. JDBC
Define a SQL model for your data
Tables, Columns, Data types
Decide where to store BLOBs
Column or filesystem
Emit SQL statements to do all read/write operations
15. JDBC example
Class defining the model
class MyDocument {
    private Long id;
    private String title;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }
}
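A minimal sketch of the "emit SQL statements" step for this model, using plain java.sql. The table name document, its columns, and the JdbcDocumentStore class are assumptions for illustration; a real setup also needs a JDBC driver and a Connection, which are not shown.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Assumed schema (hypothetical):
//   CREATE TABLE document (id BIGINT PRIMARY KEY, title VARCHAR(255))
class JdbcDocumentStore {
    static final String INSERT_SQL =
            "INSERT INTO document (id, title) VALUES (?, ?)";
    static final String SELECT_SQL =
            "SELECT title FROM document WHERE id = ?";

    private final Connection connection;

    JdbcDocumentStore(Connection connection) {
        this.connection = connection;
    }

    void createDocument(long id, String title) throws SQLException {
        PreparedStatement ps = connection.prepareStatement(INSERT_SQL);
        try {
            ps.setLong(1, id);
            ps.setString(2, title);
            ps.executeUpdate();
        } finally {
            ps.close();
        }
    }

    // Returns the stored title, or null when no row matches the id.
    String getTitle(long id) throws SQLException {
        PreparedStatement ps = connection.prepareStatement(SELECT_SQL);
        try {
            ps.setLong(1, id);
            ResultSet rs = ps.executeQuery();
            try {
                return rs.next() ? rs.getString(1) : null;
            } finally {
                rs.close();
            }
        } finally {
            ps.close();
        }
    }
}
```

Every read and write is spelled out by hand, which is the main cost of this approach as the model grows.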
21. Hibernate example
Read/write
class DocumentManager {

    public void createDocument(Long id, String title) {
        Session session = HibernateUtil.getSessionFactory().getCurrentSession();
        session.beginTransaction();
        MyDocument doc = new MyDocument();
        doc.setId(id);
        doc.setTitle(title);
        session.save(doc);
        session.getTransaction().commit();
        session.close();
    }

    public MyDocument getDocument(Long id) {
        Session session = HibernateUtil.getSessionFactory().getCurrentSession();
        session.beginTransaction();
        Query query = session.createQuery("FROM MyDocument WHERE id = :id");
        query.setParameter("id", id);
        MyDocument doc = (MyDocument) query.uniqueResult();
        session.getTransaction().commit();
        session.close();
        return doc;
    }
}
22. Hibernate drawbacks
Document classes have to be defined by hand
No standard, just application-defined classes
Object-relational mapping too flexible
Too much choice can be a curse
Binaries are still a problem (no streaming)
Hibernate's "binary" is a byte[] → memory hog
Blob support inconsistent
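To make the "no streaming" point concrete, here is a hedged sketch in plain java.io (not Hibernate code) of what streaming a binary looks like: the copy uses a fixed-size buffer, so memory use does not grow with the size of the attachment, unlike a byte[] holding the whole value. The StreamCopy class name and the 8 KB buffer size are assumptions for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StreamCopy {
    // Copies in fixed-size chunks; memory use is bounded by the buffer,
    // not by the size of the binary being copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }
}
```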
23. JPA
Model your documents as classes
Define an object-relational mapping using annotations
Use JPA to automate read/writes
24. JPA example
Class defining the model and the mapping
@Entity
class MyDocument {
    private Long id;
    private String title;

    @Id
    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }
}
25. JPA example
Read/write
@Stateless
class DocumentManagerBean {

    @PersistenceContext
    EntityManager em;

    public void createDocument(Long id, String title) {
        MyDocument doc = new MyDocument();
        doc.setId(id);
        doc.setTitle(title);
        em.persist(doc);
    }

    public MyDocument getDocument(Long id) {
        return em.find(MyDocument.class, id);
    }

    public List<MyDocument> listDocuments() {
        Query query = em.createQuery("SELECT doc FROM MyDocument doc");
        return query.getResultList();
    }
}
26. JPA drawbacks
Same as Hibernate
Although annotations are very convenient
28. Beyond mere storage
Standard document abstraction
Security
Locking
Versioning
Full text search
Types and inheritance
Folders? ordering?
29. JCR
Content Repository API for Java™ Technology
JSR-170, released in June 2005
Initiated by Day Software
Also BEA, Documentum, FileNet, IBM, Oracle, Vignette
and others
Apache Jackrabbit is the RI
30. JCR goals
Java API
Fine-grained, hierarchical storage model
Be the "SQL" of hierarchical storage
Lots of functionality
31. JCR features
CRUD
Hierarchy of nodes
Simple properties, Lists, Binaries
Queries
Versioning, Locking, References, ...
32. JCR 2
JSR-283
First public review July 2007
Final release expected early 2009
Nuxeo is a contributor to the specification
33. JCR 2 features
Fix JSR-170 inconsistencies
Several compliance levels
New property types
Improved features
Versioning, Access control, Observation
Retention & Hold
Shareable nodes
Java query API
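34. JCR - nodes and properties
[Diagram: a node has a node type, child nodes, and properties; each property has a property type and a value. The example names in the figure are "foo" and "bar".]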
35. JCR - node hierarchy
[Diagram: tree starting at (root)]
Folder nodes: foo, bar, gee
Document nodes: thing, doc
File node: vignette
36. JCR - property types
JCR 1: String, Binary, Date, Long, Double, Boolean, Name, Path, Reference
Added in JCR 2: Decimal, WeakReference, URI
37. JCR - hierarchy with properties
[Diagram: tree starting at (root)]
Folder nodes: foo, bar, gee
Document nodes: thing, doc
Properties of doc:
title (String) = my doc
description (String) = ...
date (Date) = 2008-12-11
creator (String) = florent
File node: vignette
Properties of vignette:
filename (String) = img.png
data (Binary) = <binary>
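The shape described on this slide can be sketched as a toy in-memory model. This is NOT the javax.jcr API, only an illustration of nodes with a type, named children, and named properties. The ToyNode class is hypothetical, the node type names are made up, and placing doc under the bar folder is an assumption, since the transcript does not preserve which folder holds it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: same shape as a node hierarchy, not the javax.jcr API.
class ToyNode {
    final String name;
    final String type;
    final Map<String, ToyNode> children = new LinkedHashMap<String, ToyNode>();
    final Map<String, Object> properties = new LinkedHashMap<String, Object>();

    ToyNode(String name, String type) {
        this.name = name;
        this.type = type;
    }

    // Creates a child node of the given type and returns it.
    ToyNode addNode(String childName, String childType) {
        ToyNode child = new ToyNode(childName, childType);
        children.put(childName, child);
        return child;
    }

    // Sets a property and returns this node so calls can be chained.
    ToyNode setProperty(String propName, Object value) {
        properties.put(propName, value);
        return this;
    }

    // Resolves a relative path such as "bar/doc/vignette";
    // returns null if any step of the path is missing.
    ToyNode getNode(String relPath) {
        ToyNode current = this;
        for (String step : relPath.split("/")) {
            current = current.children.get(step);
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}
```

Building the example tree then reads roughly like the figure: root.addNode("bar", "Folder").addNode("doc", "Document").setProperty("title", "my doc"), followed by a vignette child carrying filename and data properties.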
42. JCR - API
Query
public List<Node> getDocuments() throws RepositoryException {
    QueryManager queryManager = session.getWorkspace().getQueryManager();
    // Two ways to select all my:document nodes are shown.
    // XPath form:
    Query query = queryManager.createQuery(
            "//element(*, my:document)", Query.XPATH);
    // SQL form (replaces the XPath query above; only this one is executed):
    query = queryManager.createQuery(
            "SELECT * FROM my:document", Query.SQL);
    List<Node> documents = new LinkedList<Node>();
    NodeIterator it = query.execute().getNodes();
    while (it.hasNext()) {
        documents.add(it.nextNode());
    }
    return documents;
}
43. Nuxeo
Founded in 2000
Sustained growth for 8 years
Pioneering Open Source ECM software vendor
International organization, customers, partners,
community
40+ employees
Business oriented Open Source
44. Nuxeo ECM
3+ years of work
Focus on the platform
Document Core
Nuxeo EP, Nuxeo WebEngine, Nuxeo RCP
Components everywhere (OSGi)
Large feature set
Big deployments
45. Nuxeo Core
High-level document-oriented Java API
Complex document abstraction
Independent of actual storage backend
EJB remoting
REST bindings (JAX-RS)
SOAP bindings (JAX-WS)
50. CMIS
Draft v 0.5 published in September 2008 by EMC, IBM,
Microsoft
Alfresco, Open Text, Oracle, SAP also on board from the
start
Oasis TC formed in November 2008
Adullact, Booz Allen Hamilton, Day, Ektron, Exalead,
Fidelity, Flatirons, Magnolia, Mitre, Nuxeo, Saperion, Sun,
Vamosa, Vignette (as of 2008-12-01)
CMIS 1.0 expected mid-2009
51. CMIS goals
Simple document model
Independent of protocol
SOAP, REST (AtomPub) bindings
Not tied to a programming language
Platform, vendor independent
Basic set of ECM functions
"Greatest common denominator"
62. CMIS query example
Standard SQL-92
Extensions for multi-valued properties
Extensions for hierarchical searches
Extensions for fulltext search
SELECT OBJECT_ID, SCORE() AS SC, DESTINATION, DEPARTURE_DATES
FROM TRAVEL_BROCHURE
WHERE IN_TREE( , 'ID00093854763')
AND CONTAINS( , 'PARADISE ISLAND CRUISE')
AND '2010-01-01' < ANY DEPARTURE_DATES
AND CONTINENT <> 'ANTARCTICA'
ORDER BY SC DESC
63. CMIS AtomPub bindings
Additional headers for behavior control
MIME types
Service: application/atomsvc+xml
Feed: application/atom+xml;type=feed
Entry: application/atom+xml;type=entry
Query: application/cmisquery+xml
AllowableActions: application/cmisallowableactions+xml
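As a hedged illustration of how a client might ask for one of these media types, the sketch below builds (but does not send) an HTTP GET with an Accept header, using java.net.http from Java 11 or later. The CmisAtomPubRequests class is hypothetical, and any URL passed in (for example http://localhost:8080/cmis/service) is a placeholder, not an endpoint defined by the talk or by CMIS.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class CmisAtomPubRequests {
    // Media types as listed on the slide.
    static final String SERVICE_TYPE = "application/atomsvc+xml";
    static final String FEED_TYPE = "application/atom+xml;type=feed";
    static final String ENTRY_TYPE = "application/atom+xml;type=entry";

    // Builds (but does not send) a GET for the given URL,
    // asking the server for a single media type.
    static HttpRequest get(String url, String mediaType) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", mediaType)
                .GET()
                .build();
    }
}
```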
64. CMIS Web Services bindings
All the WSDL files are provided
Check the spec ;-)
78. Nuxeo Core 2 and CMIS
Next-generation storage based on CMIS model
No "impedance mismatch" in models
Nuxeo extensions if needed
Leverage the Visible SQL Storage backend
Distributed and clusterable
Faster remote access and caching
True clusters
Facilitate cloud-based backends
79. Summary
Many storage choices
Define your model, choose the right API
JCR is very capable
CMIS is coming fast
Nuxeo gives the best of both
82. Thank you for your attention!
http://www.nuxeo.com
http://doc.nuxeo.org
http://jcp.org/en/jsr/detail?id=170
http://jcp.org/en/jsr/detail?id=283
http://jackrabbit.apache.org/
http://www.oasis-open.org/committees/cmis/