2. 2
Emergent Analytics
Extensible enterprise information management
paradigm
Add semantics to all aspects of the enterprise's
information systems
− All information becomes easily accessible using
SPARQL
− Add new information easily
− Understand how everything is related and what it is
Provides the capability to analyze information
enterprise wide
4. 4
Information Technology
The technology that enables the management of
all types of information
− Create it – works great
− Store it – works great
− Change it – works great
− Find it – not so good
− Analyze it – very complex, very difficult
− Use it – works great if you are inside the application
that creates it, otherwise BIG problem
− Commonly called SILOS
We all want FEDERATION
7. 7
New Information Management Paradigm
Semantic Technology is a layer of description that
sits within the current IT infrastructure
− We build the descriptions using OWL and RDF
− We access the descriptions at run-time using
SPARQL
OWL and RDF are unusual because they are both a
description language and an information model
− Enables a radical transformation of IT capabilities
− Completely distributed information management
− FEDERATION
8. 8
Information Federation
Enterprises are made up of many domains within
domains
Sales, Operations, R&D, Executive management,
manufacturing, …
Logistics, HR, Finance, intelligence …
Each domain fields its own applications and creates its
own information to execute its mission
It is normally not possible to federate and integrate applications
within domains, across domains or with partners
Enterprises will not take the next step in analytic
capability until they first solve the INFORMATION
federation problem
9. 9
What are RDF and OWL for?
They are only used for one thing....
To DESCRIBE things
ANYTHING
Machines can
UNDERSTAND
the descriptions
10. 10
Federation Requires Description
Information discovery, reuse, and integration all
depend on description
− If we do not know what something is we cannot possibly know how to
integrate it with other things or even how it should be used
If we describe everything well enough, we are in a
position to have a knowledge-based web
− integrate and interoperate
− Analyze any combination of information
RDF & OWL enable information federation
− both machines and people can understand the descriptions
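A minimal sketch of what "machines can understand the descriptions" can mean in practice. It stores descriptions as subject-predicate-object triples in plain Python and follows rdfs:subClassOf links to work out what an instance is. The `ex:` names and the helpers `match` and `types_of` are assumptions made for illustration; this is not an RDF library or a SPARQL engine.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical descriptions; every "ex:" name is illustrative only.
GRAPH: Set[Triple] = {
    ("ex:Employee", "rdf:type", "owl:Class"),
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Employee", "rdfs:subClassOf", "ex:Person"),
    ("ex:emp42", "rdf:type", "ex:Employee"),
    ("ex:emp42", "ex:worksIn", "ex:Finance"),
}


def match(graph: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in graph:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def types_of(graph: Iterable[Triple], instance: str) -> Set[str]:
    """Return the direct types of an instance plus all their superclasses."""
    graph = list(graph)
    found: Set[str] = set()
    frontier = [o for _, _, o in match(graph, s=instance, p="rdf:type")]
    while frontier:
        cls = frontier.pop()
        if cls not in found:
            found.add(cls)
            # Walk up the class hierarchy described in the graph itself.
            frontier.extend(
                o for _, _, o in match(graph, s=cls, p="rdfs:subClassOf"))
    return found
```

Here `types_of(GRAPH, "ex:emp42")` returns both `ex:Employee` and `ex:Person`, even though only the first was stated directly: the description of the class hierarchy supplies the rest.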
11. 11
Defense Advanced Research Projects Agency
Relational Database Technology
TCP/IP
OWL/RDF
− DARPA creates the DARPA Agent Markup Language (DAML) program
in 2000 to facilitate information federation - DAML.org
− W3C takes the work funded by DARPA and others to create
the Resource Description Framework (RDF) and Web
Ontology Language (OWL) specifications
These standards comprise an excellent information
management technology architecture
There are no other standards that can be used to
accomplish the goal of information federation
12. 12
World Wide Web Architecture
[Figure: web architecture diagram. Labels: Mature; Active Research and Standards Activity; Commercial Cutting Edge]
14. 14
Semantic Software Architecture
All components support RDF, OWL and/or
SPARQL as well as other web technologies
− OWL modeling tools
− RDF stores
− Spyders
− Federators
− SPARQL endpoints
− Visualization tools
− Analytic tools
− SPARQL endpoint registry
15. 15
Spyder
Software component that transforms relational data
formats to RDF using the mapping ontology
Adds the semantics of any domain ontology to
any database
Provides SPARQL endpoint for relational
databases
Generates information about sources to optimize
performance
Exposes the full power of SQL
Allows the mappings themselves to be analyzed
Minimizes or eliminates the need for triple stores
Easier to use than ETL
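As a rough illustration of the relational-to-RDF idea described above, the sketch below reads rows from an in-memory SQLite table and emits triples according to a small mapping. It is not the Spyder implementation. The `MAPPING` layout, the `hr:` names, the demo table, and the helper functions are all assumptions made for this example.

```python
import sqlite3
from typing import Any, Dict, Iterator, Tuple

Triple = Tuple[str, str, str]

# Hypothetical mapping: one table to one domain class, columns to properties.
MAPPING: Dict[str, Any] = {
    "table": "employee",
    "key_column": "id",
    "class": "hr:Employee",
    "subject_prefix": "hr:employee/",
    "columns": {"name": "hr:name", "dept": "hr:department"},
}


def make_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database to stand in for a real source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "Ada", "Finance"), (2, "Grace", None)],
    )
    return conn


def rows_to_triples(conn: sqlite3.Connection,
                    mapping: Dict[str, Any]) -> Iterator[Triple]:
    """Translate each row into triples using the terms named in the mapping."""
    cols = [mapping["key_column"], *mapping["columns"]]
    # Identifiers come from the trusted mapping, never from user input.
    sql = "SELECT {} FROM {} ORDER BY {}".format(
        ", ".join(cols), mapping["table"], mapping["key_column"])
    for row in conn.execute(sql):
        subject = "{}{}".format(mapping["subject_prefix"], row[0])
        yield (subject, "rdf:type", mapping["class"])
        for col, value in zip(cols[1:], row[1:]):
            if value is not None:  # a NULL cell produces no triple
                yield (subject, mapping["columns"][col], str(value))
```

Calling `list(rows_to_triples(make_demo_db(), MAPPING))` yields five triples for the two demo rows. The data stays in the database and is translated on request, which is the sense in which a triple store may not be needed.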
16. 16
Federator
Enables users to query multiple RDF graphs exposed by
Spyders as if they were a single graph
− Uses the source metadata provided by Spyders to optimize
performance
Works against the native information sources
− Does not require RDF to be moved into a triple store before it is
queried
− Delegates the maximum amount of processing as far down as
possible
Better solution than traditional ETL based processes
− Uses the domain ontology and mapping ontology
Supports complex analytics
− Integrated with rules engine
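The federation idea can be sketched in a few lines, under heavy simplification. Each source below advertises which predicates it can answer (standing in for the source metadata mentioned above), and the federator forwards a pattern only to sources that can contribute. The classes `Source` and `Federator`, the `ex:` names, and the sample data are hypothetical; a real federator would plan and delegate full SPARQL queries.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]


class Source:
    """Stand-in for one endpoint: triples plus metadata on its predicates."""

    def __init__(self, name: str, triples: Iterable[Triple]) -> None:
        self.name = name
        self.triples = set(triples)
        self.predicates = {p for _, p, _ in self.triples}  # source metadata
        self.calls = 0  # how many times this source was actually queried

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        self.calls += 1
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o))


class Federator:
    """Answer patterns over several sources as if they were one graph."""

    def __init__(self, sources: Iterable[Source]) -> None:
        self.sources = list(sources)

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> List[Triple]:
        results: List[Triple] = []
        for source in self.sources:
            # Use the metadata to skip sources that cannot contribute.
            if p is not None and p not in source.predicates:
                continue
            results.extend(source.match(s, p, o))
        return results


# Hypothetical data held in two separate systems.
HR = Source("hr", [("ex:emp1", "ex:worksIn", "ex:Finance"),
                   ("ex:emp2", "ex:worksIn", "ex:Sales")])
PAYROLL = Source("payroll", [("ex:emp1", "ex:salaryBand", "B3"),
                             ("ex:emp2", "ex:salaryBand", "B1")])


def salary_bands_in(fed: Federator, department: str) -> Dict[str, str]:
    """Join across sources: salary band of each employee in a department."""
    bands: Dict[str, str] = {}
    for employee, _, _ in fed.match(p="ex:worksIn", o=department):
        for _, _, band in fed.match(s=employee, p="ex:salaryBand"):
            bands[employee] = band
    return bands
```

With `fed = Federator([HR, PAYROLL])`, the call `salary_bands_in(fed, "ex:Finance")` joins the two systems without copying either into a shared store, and each source is queried only for the predicate it holds.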
22. 22
Ontology Architecture
An ontology architecture is the system of ontologies
required to accomplish a goal
− Very much like a software architecture
The goal for an Enterprise Information Web (EIW) is federation of
information sources across business units to enable enterprise
reporting and analysis
− The ontology architecture of an EIW is designed to solve the
information federation problem
− While enabling sophisticated analytics
24. 24
EIW Ontology Architecture for Federation
[Figure: architecture diagram. Components: Reporting/Analytics; SPARQL; The Federator; Human Resources Domain Ontology; Relational Mapping Ontology (two); Source Ontology (two); RDBMS (two)]
25. 25
Domain Ontology
The Domain Ontology is a conceptual description of a
business domain
− The “domain” is defined by the business processes, rules, information
sources, and any required analytics
Instances in this ontology are the same instances which
are currently stored in information sources (databases)
Exposes all information of the domain to any user or
application using the business terminology of the domain
− In some cases, these business terms are defined by standards
26. 26
Relational Mapping Ontology
Describes how concepts in the domain ontology relate
to data in databases
Enables the translation of data from a relational format
to RDF format, using terminology defined in the
Domain Ontology
We have created a document that defines the
Relational Mapping Ontology
− This document should be released to the public this year
− The D2RQ language was not sufficient for our mission
http://www.knoodl.com/ui/groups/Mapping_Ontology_Community
27. 27
Relational Schema Ontology
Represents metadata about a relational database
schema as instance data
− All columns are instances and have properties relating them to
their tables
Enables analysis of the way a database is mapped
to the Domain Ontology (via the Relational Mapping
Ontology)
− How many columns are mapped to properties in the Domain
Ontology?
− How many are mapped to classes?
− How is Person represented in the customer management system?
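The questions above can be answered mechanically once schema metadata is held as instance data. The sketch below is a plain-Python illustration: the `rs:`, `map:`, `db:`, and `hr:` names are invented for this example and are not taken from the actual Relational Schema Ontology or Relational Mapping Ontology.

```python
from typing import Any, Dict, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical schema and mapping metadata, stored as instance data.
TRIPLES: Set[Triple] = {
    ("db:employee", "rdf:type", "rs:Table"),
    ("db:employee.id", "rdf:type", "rs:Column"),
    ("db:employee.id", "rs:columnOf", "db:employee"),
    ("db:employee.name", "rdf:type", "rs:Column"),
    ("db:employee.name", "rs:columnOf", "db:employee"),
    ("db:employee.legacy_flag", "rdf:type", "rs:Column"),
    ("db:employee.legacy_flag", "rs:columnOf", "db:employee"),
    ("db:employee.name", "map:mapsToProperty", "hr:name"),
    ("db:employee.id", "map:mapsToClass", "hr:Employee"),
}


def subjects(triples: Iterable[Triple], predicate: str,
             obj: Optional[str] = None) -> Set[str]:
    """Subjects that have the predicate (and the object, when one is given)."""
    return {s for s, p, o in triples
            if p == predicate and (obj is None or o == obj)}


def mapping_report(triples: Iterable[Triple]) -> Dict[str, Any]:
    """Count how many columns are mapped to properties and to classes."""
    triples = list(triples)
    columns = subjects(triples, "rdf:type", "rs:Column")
    to_property = columns & subjects(triples, "map:mapsToProperty")
    to_class = columns & subjects(triples, "map:mapsToClass")
    return {
        "columns": len(columns),
        "mapped_to_property": len(to_property),
        "mapped_to_class": len(to_class),
        "unmapped": sorted(columns - to_property - to_class),
    }
```

For the sample data, `mapping_report(TRIPLES)` reports three columns, one mapped to a property, one mapped to a class, and `db:employee.legacy_flag` left unmapped.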
28. 28
Analytics Ontology
Enables us to describe questions, queries, reports, forms
− we represent questions as instances and relate them to the
queries that provide their answers
Queries are related to Domain Ontology concepts
Domain Ontology concepts are mapped to data sources
Enables "gap analysis" of analytic requirements
− are the concepts used in the query to answer this question
mapped to the necessary data sources?
Long-term can be used to model-drive a reporting tool
− create instances of "Reports" and the tool builds them
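The "gap analysis" described above amounts to following three links: question to query, query to concept, concept to data source. A minimal sketch, assuming invented `an:`, `map:`, `q:`, `query:`, `hr:`, and `db:` names rather than the real Analytics Ontology vocabulary:

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical questions, queries, concepts, and mappings.
TRIPLES: Set[Triple] = {
    ("q:headcountByDept", "rdf:type", "an:Question"),
    ("q:headcountByDept", "an:answeredBy", "query:headcount"),
    ("query:headcount", "an:usesConcept", "hr:Employee"),
    ("query:headcount", "an:usesConcept", "hr:department"),
    ("q:attritionRate", "rdf:type", "an:Question"),
    ("q:attritionRate", "an:answeredBy", "query:attrition"),
    ("query:attrition", "an:usesConcept", "hr:Employee"),
    ("query:attrition", "an:usesConcept", "hr:terminationDate"),
    ("hr:Employee", "map:mappedTo", "db:employee"),
    ("hr:department", "map:mappedTo", "db:employee.dept"),
}


def objects(triples: Iterable[Triple], subject: str,
            predicate: str) -> Set[str]:
    """Objects reachable from the subject through the predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def gaps(triples: Iterable[Triple], question: str) -> List[str]:
    """Concepts a question depends on that no data source is mapped to."""
    triples = list(triples)
    missing: Set[str] = set()
    for query in objects(triples, question, "an:answeredBy"):
        for concept in objects(triples, query, "an:usesConcept"):
            if not objects(triples, concept, "map:mappedTo"):
                missing.add(concept)
    return sorted(missing)
```

In the sample data, `gaps(TRIPLES, "q:headcountByDept")` is empty, while `gaps(TRIPLES, "q:attritionRate")` returns `["hr:terminationDate"]`, the concept that still needs a mapped source before the question can be answered.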
29. 29
Process Ontology
− Enables description of business processes
RDF/OWL version of BPMN
− Enables analysis of the information flows of business
process steps in terms of the HR Domain Ontology
− Long-term will enable execution of processes described
as instances of the ontology
− Short-term enables us to link processes with other
artifacts in the domain
Domain Ontology concepts
Standards
documentation
Discussions - anything
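One way to picture the information-flow analysis is to describe each process step by the domain concepts it consumes and produces, then ask which inputs no step supplies. The sketch below does this in plain Python. The `bp:`, `proc:`, `step:`, and `hr:` names are hypothetical and are not drawn from BPMN or from the actual Process Ontology.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical hiring process described in terms of HR domain concepts.
TRIPLES: List[Triple] = [
    ("proc:hire", "bp:hasStep", "step:postJob"),
    ("proc:hire", "bp:hasStep", "step:interview"),
    ("proc:hire", "bp:hasStep", "step:makeOffer"),
    ("step:postJob", "bp:produces", "hr:JobPosting"),
    ("step:interview", "bp:consumes", "hr:JobPosting"),
    ("step:interview", "bp:consumes", "hr:Candidate"),
    ("step:interview", "bp:produces", "hr:InterviewResult"),
    ("step:makeOffer", "bp:consumes", "hr:InterviewResult"),
    ("step:makeOffer", "bp:produces", "hr:Offer"),
]


def external_inputs(triples: Iterable[Triple], process: str) -> List[str]:
    """Concepts consumed by the process's steps but produced by none of them."""
    triples = list(triples)
    steps = {o for s, p, o in triples if s == process and p == "bp:hasStep"}
    consumed = {o for s, p, o in triples
                if s in steps and p == "bp:consumes"}
    produced = {o for s, p, o in triples
                if s in steps and p == "bp:produces"}
    return sorted(consumed - produced)
```

For the sample process, `external_inputs(TRIPLES, "proc:hire")` returns `["hr:Candidate"]`: candidate information must arrive from outside the process, which is the kind of dependency this analysis is meant to surface.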
30. 30
How Hard is this?
Many people believe that it is too hard, that there are not enough
trained people, and that it takes too long to build the descriptions
− So millions of dollars and many years have been spent trying to
develop an automated way of doing the modeling
− Automated machine learning has not been invented
− The machines must be bootstrapped with descriptions
The first bullet is a fallacy
− It is not very hard
− There are plenty of people that can do this work
− It does not take very long to build the models
31. 31
Federation Solution
Enterprise Information Web
Any information from any system can be shared with any other system on
the enterprise networks or the World Wide Web
Steps
Describe all of the terms and artifacts in each domain using RDF and OWL
We currently do this description work, but we do not use machine-readable
standards; we use Excel, Word, PowerPoint, and Visio
The formal description of a domain is called a domain ontology
Describe how all of the information managed in each domain is related to
the domain vocabulary
Use these descriptions to say how domains are related
Query the Domain vocabularies for any information
The result is an Enterprise Information Web that meets the goals of
information sharing and analysis
32. 32
Enterprise Information Web
[Figure: diagram of the Enterprise Information Web. Labels: Relational DBs (Finance, HR, Logistics); Web Service (sensors, weather, location); Web sites; Applications; Domain Descriptions Knowledgebase; RDF Web Service; Federator Web Service; user]
1. Information Systems
2. Expose as RDF web services or SPARQL endpoints
3. EIW contains self-described data
4. ESB is a big federated knowledgebase of any information
5. Any authorized user or system can query the ESB for any information
33. 33
Leverage Existing Investment
We leverage existing infrastructure
Same networks, same security, same applications,
same organizations
A lot of this description work is being done now; it
simply requires some redirection
Must use standards like any other federation
The result of this relatively minor change and
expense is an astounding advance in information
management capability
36. 36
Visualization
The W3C has not adopted a standard for the visual
representation of OWL or RDF models
OWL and RDF will not become widely used standards
without good visualization of models
We do not believe any existing modeling standard will do;
OWL is too different
We need OWL design patterns to fundamentally
change information management capability at DOD
and elsewhere
The capability will be in beta test in December on knoodl.com