4. Solving SharePoint Type Problems
With An Open Source Stack
Richard Esplin
Community Technology
5. Agenda
● Making the case for content management
● Best practices: the platform approach
● Introducing CMIS
● Live examples
6. What is Alfresco?
● Enterprise content management platform across cloud, on-premise, or both
● API for content applications that can run in the cloud, on-premise, or both
● Content hub for your enterprise tablets
[Diagram labels: cloud, on-premise, hybrid cloud sync]
7. What is “content”?
● Data
● Don't mistake Code for Content
● Unstructured Data
● Structured data works well in a relational data store, XML store, or
key-value store
● Unstructured Binary Data
● Unstructured non-binary data works well in source control
● Examples:
● Audio, Video, Images, Office Documents, Engineering Files,
Reports
8. What is a “content-centric application”?
● Applications that access binary files
● Files are often generated collaboratively
● Often must deal with large numbers of files
● May include a mix of structured and unstructured
content
● May also include business processes
9. A few examples
● Web site with catalogs, white papers, and videos
● Expense report review and approval
● Contract negotiation, creation, and review
● Research study authoring
● Sales / Marketing collateral creation and communication
● Course guide authoring and publishing
● Images and media in games
● Media curation, transformation, and delivery
● Legal compliance and corporate records management
10. Or the business is saying . . .
● I've got a ton of files,
● I've got people that
produce and consume
them,
● I've got systems that
use them,
● I want to make it
easier!
Doug Waldron (cc attribution share-alike)
http://www.flickr.com/photos/dougww/922328173/
11. Let's build it ourselves!
Pasukaru76 (cc attribution) http://www.flickr.com/photos/pasukaru76/4277763808/
12. DIY approach seems simple . . .
● “This is simple stuff.”
● Grab a web-application toolkit
● Favorite front-end / presentation framework
● Store a bunch of files
● Relational Database
● Data Model / Metadata
● Comments / Ratings
● Tagging / Categorization
13. File storage options
● On disk
● Amazon S3 or an internal CAS filer
● Source code control repository
● XML database
● NoSQL document store
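A minimal sketch of the on-disk option, assuming a content-addressed layout similar in spirit to a CAS filer: each file is stored under the SHA-256 digest of its bytes, so identical uploads are written once. The function names `put` and `get` and the directory layout are illustrative assumptions, not part of any product named in these slides.

```python
import hashlib
import os
import tempfile


def put(store_dir, data):
    """Store bytes under their SHA-256 digest and return the digest."""
    digest = hashlib.sha256(data).hexdigest()
    # Fan out into subdirectories so no single directory grows too large.
    target_dir = os.path.join(store_dir, digest[:2])
    os.makedirs(target_dir, exist_ok=True)
    path = os.path.join(target_dir, digest)
    if not os.path.exists(path):  # identical content is written only once
        with open(path, "wb") as handle:
            handle.write(data)
    return digest


def get(store_dir, digest):
    """Return the bytes previously stored under the given digest."""
    with open(os.path.join(store_dir, digest[:2], digest), "rb") as handle:
        return handle.read()


if __name__ == "__main__":
    store = tempfile.mkdtemp()
    key = put(store, b"quarterly report")
    print(key)
    print(get(store, key))
```

Note what this does not cover: metadata, versioning, permissions, and search all still need a home, which is where the DIY code begins to grow.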
14. Relational may not cut it
● Good at text and numbers. Not so good at
binary.
● Good at static table definitions. Not so good at
dynamic aspects.
● Size limits.
● Random seek (streaming).
● Search: Some relational databases can index
into blobs, but not all.
15. Once files are figured out . . .
● Ensure security
● Execute a workflow
● Transform the content between types
● Schedule a job
● Provide shared drive access
● Versioning
● Replication
● API Access
● Integrate with authoring tools
Lots of custom code!
16. The optimistic scenario
gobucks2 (cc attribution non-commercial share-alike) http://www.flickr.com/photos/69331170@N00/2854583096
18. Evaluating DIY reasonableness
● Number and size of documents
● Number and concurrency of users
● Number and nature of integration points
● Business process volatility and complexity
● Time and cost of
● Integrating all of these services / sub-systems
● Maintaining all of that code . . . forever
● Access to off-the-shelf alternatives
24. Platform approach
● The common problems have been solved
● Content Platform = Repository + Services
● Find a platform that meets your needs
● Extend the platform with your own business logic
● Customize the UI that the platform provides
● Or write your own front-end using whatever language or
framework makes sense
● Meets your current needs while providing a roadmap
for the future
25. Evaluating content platforms
● Agility
● Applicable to a broad set of solutions vs a vertical specific solution
● Scale up, scale down
● Developer ergonomics
● Fast and friendly developer model
● Developer familiarity
● Open Source
● Troubleshooting
● Bug tracking
● Community
● Standards compliance
● Easier integration
● Lower migration costs
26. General architecture
[Diagram labels: Web Applications, Knowledge Portals, Web Services, App Server, CRM, Business Process Engine, Portal Server, Virtual File System, High Availability]
27. Corporate Systems
[Diagram labels: Desktop, WebDAV, CMIS, CIFS, SharePoint Protocol, JSR-168, Connectors, Social Media Channels, Mobile, Open Web APIs, CMIS-based Web Services APIs, Alfresco Sync, Public Alfresco Cloud]
29. What is CMIS?
● Content Management Interoperability Services
● Language-independent, vendor-neutral API for content
management
● Least-common-denominator (some vendors have extensions)
● CRUD functions for nodes
● Check-in / check-out
● Associations
● Permissions (Access Control Lists)
● Policies
● Queries
● Repository Traversal
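To make check-in / check-out concrete, here is a toy in-memory model. It is not the CMIS API or any client library: the class and method names are hypothetical, and it only illustrates the idea that checking out creates a private working copy and checking in appends that copy to the version history.

```python
class ToyDocument:
    """In-memory stand-in for a versionable document (illustration only)."""

    def __init__(self, name, content):
        self.name = name
        self.versions = [content]  # version history, oldest first
        self.working_copy = None   # set only while the document is checked out

    def is_checked_out(self):
        return self.working_copy is not None

    def check_out(self):
        if self.is_checked_out():
            raise RuntimeError("already checked out")
        # The working copy starts as a copy of the latest version.
        self.working_copy = self.versions[-1]

    def update_working_copy(self, content):
        if not self.is_checked_out():
            raise RuntimeError("check out before editing")
        self.working_copy = content

    def check_in(self):
        if not self.is_checked_out():
            raise RuntimeError("nothing to check in")
        # Checking in turns the working copy into a new version.
        self.versions.append(self.working_copy)
        self.working_copy = None


if __name__ == "__main__":
    doc = ToyDocument("contract.txt", "draft 1")
    doc.check_out()
    doc.update_working_copy("draft 2")
    doc.check_in()
    print(doc.versions)
```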
30. What is CMIS?
● OASIS standard
● 30+ ECM vendors agreed to implement
● Two parts
● Interoperability through standard SOAP and AtomPub
bindings
– JSON bindings coming soon
● SQL-based query language for rich content
repositories
● Vendor specific extensions may be useful
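Because the query language is SQL-like, string values placed into a query need escaping. The sketch below builds a `cmis:name like` query and escapes backslashes and single quotes with a backslash, which is how I understand the CMIS 1.0 escaping rules; check the specification for your repository before relying on it. The helper names are hypothetical and not part of any CMIS client library.

```python
def quote_literal(value):
    """Quote a string for a CMIS query, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def name_like_query(type_id, pattern):
    """Build a query matching objects of type_id whose cmis:name is LIKE pattern.

    The pattern is passed through as a LIKE pattern, so % and _ keep their
    wildcard meaning.
    """
    return "select * from %s where cmis:name like %s" % (
        type_id, quote_literal(pattern))


if __name__ == "__main__":
    print(name_like_query("cmis:folder", "%alf%"))
    print(name_like_query("cmis:document", "O'Brien%"))
```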
34. Types
● Document: Content, Renditions, Version History
● Folder: Container, Hierarchy, Filing
● Relationship: Source Object, Target Object
● Policy: Target Object
● ACL: Target Object
● Described by Type Definitions
35. Type Definitions
● Object type: Type Id, Parent, Display Name, Queryable, Controllable, …
● Property (many per object type): Property Id, Display Name, Type, Required, Default Value
● Base types: Document, Folder, Relationship, Policy
● Document adds: Versionable, Allow Content
● Relationship adds: Source Types, Target Types
● Custom types extend the base types
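The relationship between object types, property definitions, and custom types can be sketched as plain data structures. This is an illustrative model only, not the API of any CMIS client library: the class names are made up for the example, and the custom type `demo:contract` with its property `demo:contractNumber` is hypothetical. It shows a custom type inheriting the property definitions of its parent.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PropertyDefinition:
    property_id: str
    display_name: str
    property_type: str
    required: bool = False
    default_value: Optional[str] = None


@dataclass
class TypeDefinition:
    type_id: str
    display_name: str
    parent: Optional["TypeDefinition"] = None
    queryable: bool = True
    properties: Dict[str, PropertyDefinition] = field(default_factory=dict)

    def add_property(self, prop):
        self.properties[prop.property_id] = prop

    def all_properties(self):
        """Return this type's properties plus those inherited from its parents."""
        merged = dict(self.parent.all_properties()) if self.parent else {}
        merged.update(self.properties)
        return merged


if __name__ == "__main__":
    document = TypeDefinition("cmis:document", "Document")
    document.add_property(
        PropertyDefinition("cmis:name", "Name", "string", required=True))

    # Hypothetical custom type that extends the document base type.
    contract = TypeDefinition("demo:contract", "Contract", parent=document)
    contract.add_property(
        PropertyDefinition("demo:contractNumber", "Contract Number", "string"))

    print(sorted(contract.all_properties()))
```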
36. Apache Chemistry
● Open Source implementations of CMIS
● Umbrella project for all CMIS related projects within the
ASF
● OpenCMIS (Java, client and server)
● cmislib (Python, client)
● phpclient (PHP, client)
● DotCMIS (.NET, client)
● De-facto reference for CMIS and used by CMIS technical
committee to test 1.1 features
38. My setup
● Debian Mint Wheezy
● OpenJDK 1.6.0_24
● Python 2.7.2
● Alfresco Community Edition 4.0.d
● OpenCMIS Workbench 0.7.0
39. CMIS Workbench
● Download
● http://chemistry.apache.org/java/developing/tools/dev-tools-workbench.html
● Connect to Alfresco
● http://localhost:8080/alfresco/cmisatom
● Good tool for figuring out what CMIS can do
● Check out the Groovy Console!
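The `cmisatom` URL above is the AtomPub binding, so responses are Atom XML with CMIS properties embedded. As a rough sketch of what a client does with such a response, the code below pulls property values out of an Atom entry using only the standard library. The sample entry is hand-written and heavily trimmed for illustration (a real response carries many more elements), and the function name is hypothetical. Multi-valued properties are reduced to their first value.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the CMIS 1.0 AtomPub binding.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "cmisra": "http://docs.oasis-open.org/ns/cmis/restatom/200908/",
    "cmis": "http://docs.oasis-open.org/ns/cmis/core/200908/",
}

# Hand-written, trimmed sample entry (an assumption for this example).
SAMPLE_ENTRY = """
<atom:entry xmlns:atom="http://www.w3.org/2005/Atom"
            xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/"
            xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/">
  <atom:title>testdoc.txt</atom:title>
  <cmisra:object>
    <cmis:properties>
      <cmis:propertyString propertyDefinitionId="cmis:name">
        <cmis:value>testdoc.txt</cmis:value>
      </cmis:propertyString>
      <cmis:propertyId propertyDefinitionId="cmis:objectTypeId">
        <cmis:value>cmis:document</cmis:value>
      </cmis:propertyId>
    </cmis:properties>
  </cmisra:object>
</atom:entry>
"""


def entry_properties(xml_text):
    """Map each propertyDefinitionId in an Atom entry to its first value."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.findall("cmisra:object/cmis:properties/*", NS):
        value = prop.find("cmis:value", NS)
        props[prop.get("propertyDefinitionId")] = (
            None if value is None else value.text)
    return props


if __name__ == "__main__":
    for key, value in sorted(entry_properties(SAMPLE_ENTRY).items()):
        print("%s: %s" % (key, value))
```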
40. Python
● In the shell:
virtualenv .
./bin/easy_install cmislib
./bin/python
● Then, at the Python prompt:
# Connect to the repository through the AtomPub binding
from cmislib.model import CmisClient
client = CmisClient("http://192.168.56.1:8080/alfresco/cmisatom", "admin", "admin")
repo = client.defaultRepository
repo.id
repo.name
# Inspect what the repository supports
for (k,v) in repo.getCapabilities().iteritems():
    print "%s: %s" %(k,v)
for (k,v) in repo.getRepositoryInfo().iteritems():
    print "%s: %s" %(k,v)
# Create a folder under the root
root = repo.getRootFolder()
root.name
folder = root.createFolder('cmis-demo')
folder.id
folder.name
for (k,v) in folder.properties.iteritems():
    print "%s: %s" %(k,v)
# Create a text document in the new folder
props = {}
props["cmis:objectTypeId"] = "cmis:document"
doc = folder.createDocumentFromString('testdoc.txt', props, contentString="This is a test showing how to create a text document", contentType='text/plain')
doc.isCheckedOut()
# Rename the document, then delete it
props = {}
props['cmis:name'] = "test-updated.txt"
doc = doc.updateProperties(props)
doc.name
doc.delete()
len(folder.getChildren())
# Query by name
result = repo.query("select * from cmis:folder where cmis:name like '%alf%'")
len(result)
for i in result:
    print i.name
# Full-text query
result = repo.query("select * from cmis:document where contains('name')")
for i in result:
    print i.name
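The create, rename, and delete steps from this session can be wrapped in one function so the test document is always cleaned up. This is a sketch under assumptions: the function name is hypothetical, it uses only the calls shown on this slide (`createDocumentFromString`, `updateProperties`, `delete`, and the `name` attribute), and it expects a folder object such as the one returned by `root.createFolder('cmis-demo')`. It is written so that it does not depend on Python 2 print syntax.

```python
def create_rename_delete(folder, name, new_name, text):
    """Create a text document, rename it, then delete it.

    Returns a tuple of (name after creation, name after rename).
    """
    props = {"cmis:objectTypeId": "cmis:document"}
    doc = folder.createDocumentFromString(
        name, props, contentString=text, contentType="text/plain")
    try:
        created_name = doc.name
        # updateProperties returns the updated object, as in the session above.
        doc = doc.updateProperties({"cmis:name": new_name})
        return created_name, doc.name
    finally:
        # Runs whether or not the rename succeeded, so no test file is left behind.
        doc.delete()
```

Against a live repository the call would look like `create_rename_delete(folder, 'testdoc.txt', 'test-updated.txt', 'This is a test')`.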
41. PHP and Drupal
● Drupal CMIS Views
● http://drupal.org/project/cmis_views
● Built on Drupal CMIS
● http://drupal.org/project/cmis
● Configure a repository in settings.php
● Enable cmis_sync
● Bundles an early release of phplib
● Currently read-only
● Good for exposing unstructured data alongside a
structured web page
42. Where to learn more
● cmis.alfresco.com includes a public CMIS server and links
to CMIS resources (check out the cheat sheet)
● Read the CMIS specification
● Apache Chemistry site has clients, lightweight server,
documentation
● “Getting Started with CMIS” tutorial shows how to use
cURL to hit AtomPub bindings directly
● Slideshare has some CMIS related presentations from
Alfresco DevCon
44. Attribution and Licensing
● Copyright 2012, Alfresco Software
● Some images used in this presentation are
licensed under the Creative Commons by-
attribution non-commercial share-alike license.
● Original work in this presentation is licensed
under the Creative Commons by-attribution
license.
● Thanks to Jeff Potts for allowing me to base my
presentation on his.