Work on integrating semantic technologies developed in several R&D projects is now progressing at full speed. Expect to see creative new uses of semantic technologies in Nuxeo open source content management products in 2011!
4. Nuxeo: an open source ECM vendor
Our Focus is Enterprise Content Management
ECM as a Platform for Content Applications
Open Source as Efficient Development Model
Modern architecture for 21st Century business
“Lean, mobile, social, interoperable”
A Social Marketplace in action
Innovation driven by a community of customers, partners, and our core developers
5. Nuxeo ECM - From Platform to Products
Business Solutions: Construction, Media, Government, Life Sciences
Correspondence Management, Contracts Management, Records Management, Invoice Processing
Horizontal Packages: Document Management, Digital Asset Management, Case Management Framework, Structured Content Server, Document Aggregator
Platform: Nuxeo Enterprise Platform. Complete set of components covering all aspects of ECM
Content Infrastructure: Nuxeo Core. Lightweight, scalable, embeddable content repository
9. Linked Open Data in 2007
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
10. 2008
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
11. 2009
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
12. 2010
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
13. Good for Enterprise apps too!
Diagram source: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/
14. Key Enablers
Open Data and Linked Open Data
Advances in automatic content analysis (linguistics, image processing) and machine learning
Classical logic and classical AI
Computing power (Moore’s law + MapReduce)
18. Semantic ECM
Content: Text, Sound, Image, Video
Meaning: Metadata, Tags, Entities, Relations, Reasoning
20. Goals for Semantic ECM
Repurpose existing content
Improve search and collaboration
Make information contextual
Extract and use information from your content
Make your content smarter!
21. Challenges
Extract meaning from content
Enrich content with knowledge
Enhance interaction with content thanks to added meaning
23. Business value from semantic ECM
Efficiency gains: 20% to 90% (e.g. in search, collaboration)
Effectiveness gains: better returns from your assets (e.g. news and images from AFP)
Strategic edge: growth, value capture, new services, gain unfair strategic advantage (e.g. vertical ontologies for CEVAs / CCAs)
25. Project under the French FUI program, with 9 partners, and a budget of 4.7 M€
Goal: to develop algorithms and collaborative tools for extracting knowledge from unstructured documents and images
Started in 2008, finishing in Dec. 2010, with results already integrated as a Nuxeo plugin
26. European project under the FP7, with 13 partners (6 SMEs) and an 8.5 M€ budget
Goal: create a semantic software “stack” that will be used by CMS vendors to add semantic features to their products
Started in Jan. 2009, will last until Dec. 2012
First tangible result: FISE, already integrated in a Nuxeo plugin
32. What is a semantic engine?
• Unstructured content => Knowledge
• Language guessing
• Topic classification (Business, Sports, Media, ...)
• Named entity extraction and linking
• Relationships and properties extraction
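The stages listed above can be illustrated with a toy pipeline. This is only a sketch under stated assumptions: it is not the FISE API or any Nuxeo code, and the stopword lists, topic keywords, gazetteer, and function names are all hypothetical. A real engine would use statistical models rather than keyword lookups, and relationship extraction is left out here.

```python
import re

# Hypothetical stopword lists used for language guessing (illustrative only).
STOPWORDS = {
    "en": {"the", "and", "of", "in", "is", "to"},
    "fr": {"le", "la", "et", "les", "des", "est"},
}

# Hypothetical keyword sets standing in for a trained topic classifier.
TOPIC_KEYWORDS = {
    "Business": {"market", "company", "shares", "profit"},
    "Sports": {"match", "team", "goal", "season"},
}

# Hypothetical gazetteer mapping entity names to Linked Data URIs.
GAZETTEER = {
    "Paris": "http://dbpedia.org/resource/Paris",
    "France": "http://dbpedia.org/resource/France",
}


def tokens(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())


def guess_language(text):
    """Pick the language whose stopwords occur most often (ties: first key)."""
    words = tokens(text)
    return max(STOPWORDS, key=lambda lang: sum(w in STOPWORDS[lang] for w in words))


def classify_topic(text):
    """Pick the topic sharing the most keywords with the text (ties: first key)."""
    words = set(tokens(text))
    return max(TOPIC_KEYWORDS, key=lambda topic: len(words & TOPIC_KEYWORDS[topic]))


def extract_entities(text):
    """Find gazetteer names in the text and link each to its URI."""
    return {
        name: uri
        for name, uri in GAZETTEER.items()
        if re.search(r"\b" + re.escape(name) + r"\b", text)
    }


def enhance(text):
    """Turn unstructured text into a small bundle of structured knowledge."""
    return {
        "language": guess_language(text),
        "topic": classify_topic(text),
        "entities": extract_entities(text),
    }


result = enhance("The company opened an office in Paris and the market reacted.")
```

Here `result` holds the guessed language, the best-matching topic, and the linked entities for the sample sentence, which is the kind of enhancement a content repository could store as metadata.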
42. Nuxeo DM Improvements
Automated document categorization (language, subject, geographic coverage based on fixed lists)
Semantic entity detection and linking
Available as add-ons on the Nuxeo Marketplace in December!
43. Nuxeo DM: Upcoming Work
Stanbol + Scribo integration
Multilingual support
Extraction of relations between entities
Topic classification and linking to external taxonomies
44. Nuxeo DAM
Clustering pictures by similarity
Face detection
Face recognition using contextual information
Speech-to-text integration for full-text search on audio and video files
45. Nuxeo CMF / Correspondence
Document OCR and structure extraction
Scanned document categorization (e.g. invoice vs. contract vs. claim...) and routing
Structured field extraction with configurable document masks
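As a rough illustration of structured field extraction, the sketch below treats a "document mask" as a set of named regular expressions applied to OCR output. That reading is an assumption: the slides do not say how masks are defined (they could equally be page zones on the scanned image), and `INVOICE_MASK`, `extract_fields`, and the sample text are hypothetical.

```python
import re

# Hypothetical mask for invoices: one regular expression per field,
# each with a single capture group holding the value to extract.
INVOICE_MASK = {
    "invoice_number": r"Invoice\s+No\.?\s*[:#]?\s*(\w[\w-]*)",
    "date": r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total\s*:?\s*([0-9]+(?:\.[0-9]{2})?)",
}


def extract_fields(ocr_text, mask):
    """Apply each pattern in the mask to the OCR text.

    Returns a dict of field name to extracted string, or None when
    the pattern does not match.
    """
    fields = {}
    for name, pattern in mask.items():
        match = re.search(pattern, ocr_text, flags=re.IGNORECASE)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up OCR output for a scanned invoice.
sample = "Invoice No: INV-2010-042\nDate: 2010-11-15\nTotal: 1250.00 EUR"
fields = extract_fields(sample, INVOICE_MASK)
```

Keeping the mask as plain data means a new document type only needs a new dictionary of patterns, which is one way to read "configurable" here.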