An Ontology-Driven Framework for
Data Transformation in Scientific
Workflows
Shawn Bowers
Bertram Ludäscher
San Diego Supercomputer Center
University of California, San Diego
Bowers & Ludäscher – Ontology-Driven Data Transformations, DILS’04, Leipzig
Outline
• Background (SEEK Project)
• Scientific Workflows
• The Problem: Reusing Structurally
Incompatible Services
• The Ontology-Driven Framework
• Future Work
Science Environment for
Ecological Knowledge (SEEK)
• Domain Science Driver
– Ecology (LTER), biodiversity, …
• Analysis & Modeling System
– Design and execution of ecological models and analysis
– End user focus
– {application,upper}-ware
• Semantic Mediation System
– Data integration of hard-to-relate sources and processes
– Semantic types and ontologies
– Upper middleware
• EcoGrid
– Access to ecology data and tools
– {middle,under}-ware
Architecture (cf. US cyberinfrastructure,
UK e-Science)
this paper
Outline
• The SEEK Project
• Scientific Workflows
– Focus: analysis & component integration on
top of data integration
• The Problem: Reusing Structurally
Incompatible Services
• The Ontology-Driven Framework
• Future Work
Promoter Identification in Kepler
[SSDBM’03]
• Problems
– Many components (web
services) are NOT
designed to fit!
“The problem P that X solves
is simple, and X doesn’t
solve it well”
– Semantically
meaningful connections
are structurally
incompatible
• Approach
– Distinguish structural
type and semantic type
– Structural type: e.g.
XML Schema
– Semantic type: e.g.
OWL expressions
– Exploit the (optional!)
semantic type as much as
possible
A Very Simple Scientific Workflow
[Figure: workflow connecting S1 (life stage property) and S2 (mortality rate for period), with ports P1 to P5]
Observations: population samples for life stages of the common field grasshopper [Begon et al., 1996]
Phase         Observed
Eggs          44,000
Instar I       3,513
Instar II      2,529
Instar III     1,922
Instar IV      1,461
Adults         1,300
Life stage periods: periods of development in terms of phases
Period        Phases
Nymphal       {Instar I, Instar II, Instar III, Instar IV}
k-value for each period of observation: [(nymphal, 0.44)]
Scientific Workflows
A scientific workflow consists of a network
of connected services …
A service can be any software
component (including a web service or
even a data source) …
Each service (optionally) takes input and
(optionally) produces output
Scientific Workflows
SEEK adopts a Ptolemy II “workflow” model:
– A service is called an actor
– Each actor has zero or more input and output ports
(and possibly parameters)
– Data flows through a workflow based on
connections made from output to input ports
– (ignored here: different models of computation, directors, …)
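The actor and port vocabulary above can be pictured with a small Python sketch. This is not Ptolemy II or Kepler code: the classes `Port`, `Actor`, and `Workflow` are hypothetical names chosen for illustration, and the directions assigned to P1, P4, and P5 are assumptions (only the P2 to P3 connection is taken from the slides).

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    """A named input or output port owned by an actor."""
    name: str
    direction: str  # "in" or "out"


@dataclass
class Actor:
    """A service wrapped as an actor with zero or more ports."""
    name: str
    ports: dict = field(default_factory=dict)

    def add_port(self, name, direction):
        port = Port(name, direction)
        self.ports[name] = port
        return port


@dataclass
class Workflow:
    """Directed connections from output ports to input ports."""
    connections: list = field(default_factory=list)

    def connect(self, source, target):
        # Data flows only from an output port to an input port.
        if source.direction != "out" or target.direction != "in":
            raise ValueError("connect an output port to an input port")
        self.connections.append((source.name, target.name))


s1 = Actor("S1 (life stage property)")
s2 = Actor("S2 (mortality rate for period)")
p1 = s1.add_port("P1", "in")    # assumed direction
p2 = s1.add_port("P2", "out")
p3 = s2.add_port("P3", "in")
p4 = s2.add_port("P4", "in")    # assumed direction
p5 = s2.add_port("P5", "out")   # assumed direction

wf = Workflow()
wf.connect(p2, p3)
print(wf.connections)
```

Models of computation and directors are left out here, just as they are on the slide.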
Service Reusability
A scientist wishes to connect two
(independent) services
[Figure: desired connection from output port Ps of the source service to input port Pt of the target service]
Service Reusability
In Ptolemy II/Kepler (and in web services),
input and output ports (message parts)
have structural types (XML Schema)
[Figure: desired connection from Ps to Pt; each port is annotated with its structural type]
Service Reusability
Unless “designed to fit,” independent
services are structurally incompatible
→ Generally, the source output type will not
be a subtype of the target input type
[Figure: desired connection from Ps to Pt; the structural type of Ps is incompatible (⋠) with the structural type of Pt]
Service Reusability
A transformation mapping (d) is required to
connect the services … artificially
creating subtype compatibility
If such a d exists, the services are
“structurally feasible”
[Figure: desired connection from Ps to Pt; the structural type of Ps is incompatible (⋠) with the structural type of Pt, but d(Ps) is a subtype (≺) of the structural type of Pt]
Service Reusability
SEEK annotates services with semantic types
for discovery and interoperability of services
[Figure: desired connection from Ps to Pt; the semantic types of Ps and Pt, drawn from ontologies (OWL), are compatible (⊑)]
Service Reusability
Services can be semantically compatible,
but structurally incompatible
[Figure: desired connection from Ps to Pt; the semantic types, drawn from ontologies (OWL), are compatible (⊑); the structural types are incompatible (⋠), but d(Ps) is a subtype (≺) of the structural type of Pt]
Example Structural Types (XML)
structType(P2):
root population = (sample)*
elem sample = (meas, lsp)
elem meas = (cnt, acc)
elem cnt = xsd:integer
elem acc = xsd:double
elem lsp = xsd:string
<population>
  <sample>
    <meas>
      <cnt>44000</cnt>
      <acc>0.95</acc>
    </meas>
    <lsp>Eggs</lsp>
  </sample>
  …
</population>
structType(P3):
root cohortTable = (measurement)*
elem measurement = (phase, obs)
elem phase = xsd:string
elem obs = xsd:integer
<cohortTable>
  <measurement>
    <phase>Eggs</phase>
    <obs>44000</obs>
  </measurement>
  …
</cohortTable>
Example Semantic Types
Portion of SEEK measurement ontology
[Figure: ontology diagram with classes Observation, MeasContext, Entity, MeasProperty, AccuracyQualifier, EcologicalProperty, AbundanceCount, LifeStageProperty, NumericValue, and SpatialLocation; properties and cardinalities: hasContext (0:*), hasProperty (0:*), itemMeasured (1:*), appliesTo (1:1), hasLocation (1:1), hasCount (1:1), hasValue (1:1)]
Same in OWL, a description logic standard (here, Sparrow syntax):
Observation subClassOf forall hasContext/MeasContext and
forall hasProperty/MeasProperty and
exists itemMeasured/Entity.
MeasContext subClassOf exists appliesTo/Entity and
atMost 1/appliesTo.
EcologicalProperty subClassOf Entity.
LifeStageProperty subClassOf EcologicalProperty.
AbundanceCount subClassOf EcologicalProperty and
exists hasLocation/SpatialLocation and
atMost 1/hasLocation and
exists hasCount/NumericValue and
atMost 1/hasCount.
Example Semantic Types
Semantic types for P2 and P3
[Figure: diagrams of semType(P2) and semType(P3), with semType(P2) ⊑ semType(P3); semType(P2) adds an AccuracyQualifier (hasProperty, hasValue) to the structure of semType(P3)]
semType(P3) subClassOf Observation and
exists hasContext/(MeasContext and
exists appliesTo/LifeStageProperty and
atMost 1/appliesTo) and
exists itemMeasured/AbundanceCount and
atMost 1/itemMeasured.
semType(P2) subClassOf Observation and
exists hasContext/(MeasContext and
exists appliesTo/LifeStageProperty and
atMost 1/appliesTo) and
exists itemMeasured/AbundanceCount and
atMost 1/itemMeasured and
exists hasProperty/AccuracyQualifier and
atMost 1/hasProperty.
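The claim semType(P2) ⊑ semType(P3) can be illustrated with a deliberately naive check. The sketch below is not a description logic reasoner: it treats the right-hand side of each axiom as a set of conjunct strings and tests only whether every conjunct of the more general type also appears in the more specific one. The names `SEM_P2`, `SEM_P3`, and `syntactically_subsumed` are hypothetical, and the context class is spelled `MeasContext` as in the ontology fragment.

```python
# Conjuncts of each semantic type, copied from the axioms as plain strings.
SEM_P3 = frozenset({
    "Observation",
    "exists hasContext/(MeasContext and "
    "exists appliesTo/LifeStageProperty and atMost 1/appliesTo)",
    "exists itemMeasured/AbundanceCount",
    "atMost 1/itemMeasured",
})

# semType(P2) has every conjunct of semType(P3) plus the accuracy qualifier.
SEM_P2 = SEM_P3 | {
    "exists hasProperty/AccuracyQualifier",
    "atMost 1/hasProperty",
}


def syntactically_subsumed(sub, sup):
    """Return True if every conjunct of `sup` also occurs in `sub`.

    This is sound only for identical conjunct strings; a real reasoner
    would also use the class hierarchy of the ontology.
    """
    return sup <= sub


print(syntactically_subsumed(SEM_P2, SEM_P3))  # P2 is the more specific type
print(syntactically_subsumed(SEM_P3, SEM_P2))  # the converse does not hold
```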
The Ontology-Driven Framework
Define semantic registration mappings (“semantic views”) to connect structural and semantic types
Use registration mappings to (semi-)automate transformation, based on derived structural correspondences
Depending on the ontologies and registration mappings, it may not be possible to find an appropriate d … (since the correspondence is often underspecified)
The Ontology-Driven Framework
[Figure: desired connection from Ps to Pt; an output registration mapping links the structural type of Ps to its semantic type, and an input registration mapping links the structural type of Pt to its semantic type; the semantic types, drawn from ontologies (OWL), are compatible (⊑)]
Registration Example (simple XPaths)
/population/sample == semType(P2)
/population/sample/meas/cnt == semType(P2).itemMeasured
/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount
/population/sample/meas/acc == semType(P2).hasProperty
/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue
/population/sample/lsp/text() == semType(P2).hasContext.appliesTo
root population = (sample)*
elem sample = (meas, lsp)
elem meas = (cnt, acc)
elem cnt = xsd:integer
elem acc = xsd:double
elem lsp = xsd:string
<population>
  <sample>
    <meas>
      <cnt>44000</cnt>
      <acc>0.95</acc>
    </meas>
    <lsp>Eggs</lsp>
  </sample>
  …
</population>
structType(P2)
Each sample is an instance of the semantic type
Each sample’s cnt represents the itemMeasured object
Each sample’s cnt’s value represents the hasCount value of
the corresponding itemMeasured object
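To make the registration rules for P2 concrete, here is a minimal Python sketch that evaluates each simple XPath against one sample document and lists what it selects for the corresponding contextual path. It is an illustration only: `select` and `annotate` are hypothetical helpers, the evaluator handles just absolute child paths with an optional trailing `text()`, and the count is written as 44000 so that it is a valid `xsd:integer`.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<population>
  <sample>
    <meas><cnt>44000</cnt><acc>0.95</acc></meas>
    <lsp>Eggs</lsp>
  </sample>
</population>
"""

# Registration rules for P2: simple XPath -> contextual path in the ontology.
REGISTRATION_P2 = {
    "/population/sample": "semType(P2)",
    "/population/sample/meas/cnt": "semType(P2).itemMeasured",
    "/population/sample/meas/cnt/text()": "semType(P2).itemMeasured.hasCount",
    "/population/sample/meas/acc": "semType(P2).hasProperty",
    "/population/sample/meas/acc/text()": "semType(P2).hasProperty.hasValue",
    "/population/sample/lsp/text()": "semType(P2).hasContext.appliesTo",
}


def select(root, xpath):
    """Evaluate a simple absolute XPath against `root`.

    Returns text values for rules ending in text(), element tags otherwise.
    """
    steps = xpath.strip("/").split("/")
    want_text = steps[-1] == "text()"
    if want_text:
        steps = steps[:-1]
    if steps[0] != root.tag:
        return []
    nodes = root.findall("/".join(steps[1:])) if len(steps) > 1 else [root]
    return [n.text if want_text else n.tag for n in nodes]


def annotate(root, registration):
    """Map each contextual path to the fragments its rule selects."""
    return {sem: select(root, xpath) for xpath, sem in registration.items()}


annotations = annotate(ET.fromstring(SAMPLE.strip()), REGISTRATION_P2)
for sem_path, fragments in annotations.items():
    print(sem_path, "<-", fragments)
```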
Registration Example (simple XPaths)
/cohortTable/measurement == semType(P3)
/cohortTable/measurement/obs == semType(P3).itemMeasured
/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount
/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo
root cohortTable = (measurement)*
elem measurement = (phase, obs)
elem phase = xsd:string
elem obs = xsd:integer
<cohortTable>
  <measurement>
    <phase>Eggs</phase>
    <obs>44000</obs>
  </measurement>
  …
</cohortTable>
structType(P3)
… similarly for P3 …
The Ontology-Driven Framework
[Figure: desired connection from Ps to Pt; output and input registration mappings link each structural type to its semantic type; the semantic types, drawn from ontologies (OWL), are compatible (⊑), and a correspondence relates the structural type of Ps to the structural type of Pt]
Correspondence Example
Source-side semantic registration mapping:
/population/sample == semType(P2)
/population/sample/meas/cnt == semType(P2).itemMeasured
/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount
/population/sample/meas/acc == semType(P2).hasProperty
/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue
/population/sample/lsp/text() == semType(P2).hasContext.appliesTo
Target-side semantic registration mapping:
/cohortTable/measurement == semType(P3)
/cohortTable/measurement/obs == semType(P3).itemMeasured
/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount
/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo
Source structure:
population
  sample *
    meas
      cnt    xsd:integer
      acc    xsd:double
    lsp      xsd:string
Target structure:
cohortTable
  measurement *
    phase    xsd:string
    obs      xsd:integer
We want to “compose”
the registrations to obtain
structural correspondences
/population/sample == semType(P2)
/cohortTable/measurement == semType(P3)
These fragments correspond
/population/sample/meas/cnt == semType(P2).itemMeasured
/cohortTable/measurement/obs == semType(P3).itemMeasured
These fragments correspond
/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount
/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount
These fragments correspond
/population/sample/lsp/text() == semType(P2).hasContext.appliesTo
/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo
These fragments correspond
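The pairings above can be reproduced with a short Python sketch of the composition step. It is a simplification under a stated assumption: two rules are matched only when their contextual paths are identical once the leading `semType(...)` is dropped, whereas the framework matches paths through subconcept reasoning over the ontology. The helper names `relative` and `compose` are hypothetical.

```python
SOURCE_REG = [
    ("/population/sample", "semType(P2)"),
    ("/population/sample/meas/cnt", "semType(P2).itemMeasured"),
    ("/population/sample/meas/cnt/text()", "semType(P2).itemMeasured.hasCount"),
    ("/population/sample/meas/acc", "semType(P2).hasProperty"),
    ("/population/sample/meas/acc/text()", "semType(P2).hasProperty.hasValue"),
    ("/population/sample/lsp/text()", "semType(P2).hasContext.appliesTo"),
]

TARGET_REG = [
    ("/cohortTable/measurement", "semType(P3)"),
    ("/cohortTable/measurement/obs", "semType(P3).itemMeasured"),
    ("/cohortTable/measurement/obs/text()", "semType(P3).itemMeasured.hasCount"),
    ("/cohortTable/measurement/phase/text()", "semType(P3).hasContext.appliesTo"),
]


def relative(sem_path):
    """Drop the leading semType(...) so paths from both ports are comparable."""
    _head, _dot, rest = sem_path.partition(".")
    return rest


def compose(source_reg, target_reg):
    """Pair source and target selections registered to the same path."""
    by_path = {relative(p): q for q, p in source_reg}
    return [
        (by_path[relative(p)], q)
        for q, p in target_reg
        if relative(p) in by_path
    ]


for q_source, q_target in compose(SOURCE_REG, TARGET_REG):
    print(q_source, "->", q_target)
```

The two `acc` rules on the source side find no partner, which matches the slides: the target type carries no accuracy qualifier.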
The Ontology-Driven Framework
[Figure: desired connection from Ps to Pt; output and input registration mappings link each structural type to its semantic type; the semantic types, drawn from ontologies (OWL), are compatible (⊑); the correspondence between the structural types is used to generate the transformation d(Ps)]
Example Result (XQuery)
Based on the structural correspondences
and certain assumptions, we derive the
transformation XQuery:
<cohortTable>
  { for $s in /population/sample return
    <measurement>
      { for $l in $s/lsp return <phase>{$l/text()}</phase> }
      { for $c in $s/meas/cnt return <obs>{$c/text()}</obs> }
    </measurement>
  }
</cohortTable>
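For readers without an XQuery processor at hand, the same transformation can be sketched with Python's standard library. This is a hand-written counterpart, not output of the framework; the function name `transform` is hypothetical, and `phase` is emitted before `obs` to follow the target content model (phase, obs).

```python
import xml.etree.ElementTree as ET


def transform(population):
    """Build a cohortTable element from a population element."""
    cohort = ET.Element("cohortTable")
    for sample in population.findall("sample"):
        measurement = ET.SubElement(cohort, "measurement")
        # /population/sample/lsp/text() -> /cohortTable/measurement/phase/text()
        for lsp in sample.findall("lsp"):
            ET.SubElement(measurement, "phase").text = lsp.text
        # /population/sample/meas/cnt/text() -> /cohortTable/measurement/obs/text()
        for cnt in sample.findall("meas/cnt"):
            ET.SubElement(measurement, "obs").text = cnt.text
    return cohort


source = ET.fromstring(
    "<population><sample>"
    "<meas><cnt>44000</cnt><acc>0.95</acc></meas>"
    "<lsp>Eggs</lsp>"
    "</sample></population>"
)
result = ET.tostring(transform(source), encoding="unicode")
print(result)
```

The accuracy value is dropped, since no structural correspondence maps `acc` to the target.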
Assumptions Made
(or why this may not work for you…)
• Common XPath prefixes refer to the same
element
• Elements in correspondences have
compatible cardinalities
– source is equivalent or stricter than target
(e.g., + is stricter than *)
• Primitive data types are compatible
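The cardinality assumption can be read as a comparison of occurrence bounds. The sketch below is one possible reading, not the paper's definition: the marker table `BOUNDS` and the function `is_at_least_as_strict` are hypothetical, and the `?` and unmarked cases are added by analogy with the `+` and `*` markers mentioned above.

```python
# Occurrence bounds (min, max) per cardinality marker; None means unbounded.
BOUNDS = {
    "": (1, 1),      # exactly one (no marker)
    "?": (0, 1),     # optional
    "+": (1, None),  # one or more
    "*": (0, None),  # zero or more
}


def is_at_least_as_strict(source, target):
    """True if every count allowed by `source` is also allowed by `target`."""
    s_min, s_max = BOUNDS[source]
    t_min, t_max = BOUNDS[target]
    min_ok = s_min >= t_min
    max_ok = t_max is None or (s_max is not None and s_max <= t_max)
    return min_ok and max_ok


print(is_at_least_as_strict("+", "*"))  # + is stricter than *
print(is_at_least_as_strict("*", "+"))  # * allows zero items, + does not
```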
Framework Operations and Properties
In the paper, we define:
– A semantic registration mapping R as a set of rules q↔p,
where q is a substructure selection (query) and p is a
contextual path (a path in an ontology)
– A structural correspondence as a rule qs → qt, where qs
and qt are substructure selections over the source and
target, resp.
– The semantic composition of registration mappings Rs
and Rt, which returns a set of structural correspondence
rules
– The semantic subpath operation (subconcept), which
is used by the semantic composition to find matching
substructure selection rules
– Registration mapping properties (cardinality
consistency and partial complete registrations) and
discuss the impact on determining structural
transformations
– The simple XPath and Semantic Path languages for
defining registration mappings, and the corresponding
semantic join operator to find correspondences
Outline
• The SEEK Project
• Scientific Workflows
• The Problem: Reusing Structurally
Incompatible Services
• The Ontology-Driven Framework
• A Simple Framework Implementation
• Future Work
Future Work
• Extend the registration mapping language
– XPath is too limited …
→ try a more general query language (e.g.,
XPath + variables)
→ relational/Datalog-based substructure
selection (query)
• Formalize the properties of registration
mappings and their effect on automated
transformation
• Introduce conversion routines (e.g., for
units) at the ontology level; apply them in
transformations
• Extend transformations to different
computation models and workflow
scheduling algorithms
• Add to the Kepler Scientific Workflow
System
Acknowledgements
• NSF/ITR Science Environment for Ecological Knowledge
• NSF/ITR Geosciences Network
• NIH Biomedical Informatics
Research Network
• DOE Scientific Data
Management Center
Questions …

Games, Queries, and Argumentation Frameworks: Time for a Family ReunionGames, Queries, and Argumentation Frameworks: Time for a Family Reunion
Games, Queries, and Argumentation Frameworks: Time for a Family ReunionBertram Ludäscher
 
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!Bertram Ludäscher
 
[Flashback] Integration of Active and Deductive Database Rules
[Flashback] Integration of Active and Deductive Database Rules[Flashback] Integration of Active and Deductive Database Rules
[Flashback] Integration of Active and Deductive Database RulesBertram Ludäscher
 
[Flashback] Statelog: Integration of Active & Deductive Database Rules
[Flashback] Statelog: Integration of Active & Deductive Database Rules[Flashback] Statelog: Integration of Active & Deductive Database Rules
[Flashback] Statelog: Integration of Active & Deductive Database RulesBertram Ludäscher
 
Answering More Questions with Provenance and Query Patterns
Answering More Questions with Provenance and Query PatternsAnswering More Questions with Provenance and Query Patterns
Answering More Questions with Provenance and Query PatternsBertram Ludäscher
 
Computational Reproducibility vs. Transparency: Is It FAIR Enough?
Computational Reproducibility vs. Transparency: Is It FAIR Enough?Computational Reproducibility vs. Transparency: Is It FAIR Enough?
Computational Reproducibility vs. Transparency: Is It FAIR Enough?Bertram Ludäscher
 
Which Model Does Not Belong: A Dialogue
Which Model Does Not Belong: A DialogueWhich Model Does Not Belong: A Dialogue
Which Model Does Not Belong: A DialogueBertram Ludäscher
 
From Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science TalesFrom Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science TalesBertram Ludäscher
 
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsPossible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsBertram Ludäscher
 
Deduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
Deduktive Datenbanken & Logische Programme: Eine kleine ZeitreiseDeduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
Deduktive Datenbanken & Logische Programme: Eine kleine ZeitreiseBertram Ludäscher
 
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...Bertram Ludäscher
 
Dissecting Reproducibility: A case study with ecological niche models in th...
Dissecting Reproducibility:  A case study with ecological niche models  in th...Dissecting Reproducibility:  A case study with ecological niche models  in th...
Dissecting Reproducibility: A case study with ecological niche models in th...Bertram Ludäscher
 
Incremental Recomputation: Those who cannot remember the past are condemned ...
Incremental Recomputation:  Those who cannot remember the past are condemned ...Incremental Recomputation:  Those who cannot remember the past are condemned ...
Incremental Recomputation: Those who cannot remember the past are condemned ...Bertram Ludäscher
 
Validation and Inference of Schema-Level Workflow Data-Dependency Annotations
Validation and Inference of Schema-Level Workflow Data-Dependency AnnotationsValidation and Inference of Schema-Level Workflow Data-Dependency Annotations
Validation and Inference of Schema-Level Workflow Data-Dependency AnnotationsBertram Ludäscher
 
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses ApproachKnowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses ApproachBertram Ludäscher
 
Whole-Tale: The Experience of Research
Whole-Tale: The Experience of ResearchWhole-Tale: The Experience of Research
Whole-Tale: The Experience of ResearchBertram Ludäscher
 
ETC & Authors in the Driver's Seat
ETC & Authors in the Driver's SeatETC & Authors in the Driver's Seat
ETC & Authors in the Driver's SeatBertram Ludäscher
 
From Provenance Standards and Tools to Queries and Actionable Provenance
From Provenance Standards and Tools to Queries and Actionable ProvenanceFrom Provenance Standards and Tools to Queries and Actionable Provenance
From Provenance Standards and Tools to Queries and Actionable ProvenanceBertram Ludäscher
 
Wild Ideas at TDWG'17: Embrace multiple possible worlds; abandon techno-ligion
Wild Ideas at TDWG'17: Embrace multiple possible worlds; abandon techno-ligionWild Ideas at TDWG'17: Embrace multiple possible worlds; abandon techno-ligion
Wild Ideas at TDWG'17: Embrace multiple possible worlds; abandon techno-ligionBertram Ludäscher
 
Using YesWorkflow hybrid queries to reveal data lineage from data curation ac...
Using YesWorkflow hybrid queries to reveal data lineage from data curation ac...Using YesWorkflow hybrid queries to reveal data lineage from data curation ac...
Using YesWorkflow hybrid queries to reveal data lineage from data curation ac...Bertram Ludäscher
 

Mehr von Bertram Ludäscher (20)

Games, Queries, and Argumentation Frameworks: Time for a Family Reunion
Games, Queries, and Argumentation Frameworks: Time for a Family ReunionGames, Queries, and Argumentation Frameworks: Time for a Family Reunion
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion
 
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
Games, Queries, and Argumentation Frameworks: Time for a Family Reunion!
 
[Flashback] Integration of Active and Deductive Database Rules
[Flashback] Integration of Active and Deductive Database Rules[Flashback] Integration of Active and Deductive Database Rules
[Flashback] Integration of Active and Deductive Database Rules
 
[Flashback] Statelog: Integration of Active & Deductive Database Rules
[Flashback] Statelog: Integration of Active & Deductive Database Rules[Flashback] Statelog: Integration of Active & Deductive Database Rules
[Flashback] Statelog: Integration of Active & Deductive Database Rules
 
Answering More Questions with Provenance and Query Patterns
Answering More Questions with Provenance and Query PatternsAnswering More Questions with Provenance and Query Patterns
Answering More Questions with Provenance and Query Patterns
 
Computational Reproducibility vs. Transparency: Is It FAIR Enough?
Computational Reproducibility vs. Transparency: Is It FAIR Enough?Computational Reproducibility vs. Transparency: Is It FAIR Enough?
Computational Reproducibility vs. Transparency: Is It FAIR Enough?
 
Which Model Does Not Belong: A Dialogue
Which Model Does Not Belong: A DialogueWhich Model Does Not Belong: A Dialogue
Which Model Does Not Belong: A Dialogue
 
From Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science TalesFrom Research Objects to Reproducible Science Tales
From Research Objects to Reproducible Science Tales
 
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of UsPossible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
Possible Worlds Explorer: Datalog & Answer Set Programming for the Rest of Us
 
Deduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
Deduktive Datenbanken & Logische Programme: Eine kleine ZeitreiseDeduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
Deduktive Datenbanken & Logische Programme: Eine kleine Zeitreise
 
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
[Flashback 2005] Managing Scientific Data: From Data Integration to Scientifi...
 
Dissecting Reproducibility: A case study with ecological niche models in th...
Dissecting Reproducibility:  A case study with ecological niche models  in th...Dissecting Reproducibility:  A case study with ecological niche models  in th...
Dissecting Reproducibility: A case study with ecological niche models in th...
 
Incremental Recomputation: Those who cannot remember the past are condemned ...
Incremental Recomputation:  Those who cannot remember the past are condemned ...Incremental Recomputation:  Those who cannot remember the past are condemned ...
Incremental Recomputation: Those who cannot remember the past are condemned ...
 
Validation and Inference of Schema-Level Workflow Data-Dependency Annotations
Validation and Inference of Schema-Level Workflow Data-Dependency AnnotationsValidation and Inference of Schema-Level Workflow Data-Dependency Annotations
Validation and Inference of Schema-Level Workflow Data-Dependency Annotations
 
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses ApproachKnowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
Knowledge Representation & Reasoning and the Hierarchy-of-Hypotheses Approach
 
Whole-Tale: The Experience of Research
Whole-Tale: The Experience of ResearchWhole-Tale: The Experience of Research
Whole-Tale: The Experience of Research
 
ETC & Authors in the Driver's Seat
ETC & Authors in the Driver's SeatETC & Authors in the Driver's Seat
ETC & Authors in the Driver's Seat
 
From Provenance Standards and Tools to Queries and Actionable Provenance
From Provenance Standards and Tools to Queries and Actionable ProvenanceFrom Provenance Standards and Tools to Queries and Actionable Provenance
From Provenance Standards and Tools to Queries and Actionable Provenance
 
Wild Ideas at TDWG'17: Embrace multiple possible worlds; abandon techno-ligion
Wild Ideas at TDWG'17: Embrace multiple possible worlds; abandon techno-ligionWild Ideas at TDWG'17: Embrace multiple possible worlds; abandon techno-ligion
Wild Ideas at TDWG'17: Embrace multiple possible worlds; abandon techno-ligion
 
Using YesWorkflow hybrid queries to reveal data lineage from data curation ac...
Using YesWorkflow hybrid queries to reveal data lineage from data curation ac...Using YesWorkflow hybrid queries to reveal data lineage from data curation ac...
Using YesWorkflow hybrid queries to reveal data lineage from data curation ac...
 

Kürzlich hochgeladen

RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 

Kürzlich hochgeladen (20)

RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 

An ontology-driven framework for data transformation in scientific workflows

  • 1. An Ontology-Driven Framework for Data Transformation in Scientific Workflows. Shawn Bowers and Bertram Ludäscher, San Diego Supercomputer Center, University of California, San Diego
  • 2. Bowers & Ludäscher – Ontology-Driven Data Transformations, DILS’04, Leipzig. Outline: • Background (SEEK Project) • Scientific Workflows • The Problem: Reusing Structurally Incompatible Services • The Ontology-Driven Framework • Future Work
  • 3. Outline: • Background (SEEK Project) • Scientific Workflows • The Problem: Reusing Structurally Incompatible Services • The Ontology-Driven Framework • Future Work
  • 4. Science Environment for Ecological Knowledge (SEEK). Architecture (cf. US cyberinfrastructure, UK e-Science):
      • Domain Science Driver: ecology (LTER), biodiversity, …
      • Analysis & Modeling System: design and execution of ecological models and analysis; end-user focus; {application,upper}-ware
      • Semantic Mediation System: data integration of hard-to-relate sources and processes; semantic types and ontologies; upper middleware
      • EcoGrid: access to ecology data and tools; {middle,under}-ware
      [Figure label: “this paper”]
  • 5. Outline: • The SEEK Project • Scientific Workflows (focus: analysis & component integration on top of data integration) • The Problem: Reusing Structurally Incompatible Services • The Ontology-Driven Framework • Future Work
  • 6. Promoter Identification in Kepler [SSDBM’03].
      • Problems: many components (web services) are NOT designed to fit (“The problem P that X solves is simple, and X doesn’t solve it well”); semantically meaningful connections are structurally incompatible
      • Approach: distinguish structural type and semantic type; structural type: e.g. XML Schema; semantic type: e.g. OWL expressions; exploit the (optional!) semantic type as much as possible
  • 7. A Very Simple Scientific Workflow. [Workflow figure: services S1 (life stage property) and S2 (mortality rate for period); ports P1, P2, P3, P4, P5]
  • 8. A Very Simple Scientific Workflow. [Workflow figure: services S1 (life stage property) and S2 (mortality rate for period); ports P1, P2, P3, P4, P5]
      • Observations: population samples for life stages of the common field grasshopper [Begon et al, 1996] (Phase: Observed): Eggs: 44,000; Instar I: 3,513; Instar II: 2,529; Instar III: 1,922; Instar IV: 1,461; Adults: 1,300
  • 9. A Very Simple Scientific Workflow. [Workflow figure: services S1 (life stage property) and S2 (mortality rate for period); ports P1, P2, P3, P4, P5]
      • Observations: population samples for life stages of the common field grasshopper [Begon et al, 1996] (Phase: Observed): Eggs: 44,000; Instar I: 3,513; Instar II: 2,529; Instar III: 1,922; Instar IV: 1,461; Adults: 1,300
      • Life stage periods: periods of development in terms of phases (Period: Phases): Nymphal: {Instar I, Instar II, Instar III, Instar IV}
  • 10. A Very Simple Scientific Workflow. [Workflow figure: services S1 (life stage property) and S2 (mortality rate for period); ports P1, P2, P3, P4, P5]
      • Observations: population samples for life stages of the common field grasshopper [Begon et al, 1996] (Phase: Observed): Eggs: 44,000; Instar I: 3,513; Instar II: 2,529; Instar III: 1,922; Instar IV: 1,461; Adults: 1,300
      • Life stage periods: periods of development in terms of phases (Period: Phases): Nymphal: {Instar I, Instar II, Instar III, Instar IV}
      • Result: k-value for each period of observation: [(nymphal, 0.44)]
  • 11. Scientific Workflows. A scientific workflow consists of a network of connected services. A service can be any software component (including a web service or even a data source). Each service (optionally) takes input and (optionally) produces output.
  • 12. Scientific Workflows. SEEK adopts a Ptolemy II “workflow” model:
      – A service is called an actor
      – Each actor has zero or more input and output ports (and possibly parameters)
      – Data flows through a workflow based on connections made from output to input ports
      – (ignored here: different models of computation, directors, …)
      [Workflow figure: services S1 (life stage property) and S2 (mortality rate for period); ports P1, P2, P3, P4, P5]
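The actor-and-port model on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not Ptolemy II or Kepler code: the `Actor` and `Workflow` classes are hypothetical, and the assignment of ports to S1 and S2 (P1 in, P2 out; P3 and P4 in, P5 out) is an assumed reading of the workflow figure.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    """A service with named input and output ports (hypothetical model)."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


@dataclass
class Workflow:
    actors: list = field(default_factory=list)
    connections: list = field(default_factory=list)  # (output port, input port)

    def connect(self, out_port: str, in_port: str) -> None:
        """Link an output port to an input port; reject any other direction."""
        all_outputs = {p for a in self.actors for p in a.outputs}
        all_inputs = {p for a in self.actors for p in a.inputs}
        if out_port not in all_outputs or in_port not in all_inputs:
            raise ValueError(f"cannot connect {out_port} -> {in_port}")
        self.connections.append((out_port, in_port))


# Assumed port layout for the example workflow.
s1 = Actor("S1 (life stage property)", inputs=["P1"], outputs=["P2"])
s2 = Actor("S2 (mortality rate for period)", inputs=["P3", "P4"], outputs=["P5"])
workflow = Workflow(actors=[s1, s2])
workflow.connect("P2", "P3")  # data flows from S1's output to S2's input
```

Connections are stored as (output port, input port) pairs, so the direction of data flow stays explicit and a reversed connection such as P3 to P2 is rejected.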
  • 13. Outline: • The SEEK Project • Scientific Workflows • The Problem: Reusing Structurally Incompatible Services • The Ontology-Driven Framework • Future Work
  • 14. Service Reusability. A scientist wishes to connect two (independent) services. [Figure: desired connection from port Ps of the source service to port Pt of the target service]
  • 15. Service Reusability. In Ptolemy II/Kepler (and in web services), input and output ports (message parts) have structural types (XML Schema). [Figure: desired connection from Ps to Pt; each port has a structural type]
  • 16. Service Reusability. Unless “designed to fit,” independent services are structurally incompatible ⇒ generally, the source output type will not be a subtype of the target input type. [Figure: desired connection from Ps to Pt; structural type of Ps ⋠ structural type of Pt (incompatible)]
  • 17. Service Reusability. A transformation mapping (d) is required to connect the services, artificially creating subtype compatibility. If such a d exists, the services are “structurally feasible”. [Figure: desired connection from Ps to Pt; structural type of Ps ⋠ structural type of Pt (incompatible); d maps Ps to d(Ps), with d(Ps) ≺ structural type of Pt]
  • 18. Service Reusability. SEEK annotates services with semantic types for discovery and interoperability of services. [Figure: desired connection from Ps to Pt; semantic types drawn from ontologies (OWL); semantic type of Ps ⊑ semantic type of Pt (compatible)]
  • 19. Service Reusability. Services can be semantically compatible, but structurally incompatible. [Figure: desired connection from Ps to Pt; semantic type of Ps ⊑ semantic type of Pt (compatible, ontologies in OWL); structural type of Ps ⋠ structural type of Pt (incompatible); d maps Ps to d(Ps), with d(Ps) ≺ structural type of Pt]
  • 20. Example Structural Types (XML). [Workflow figure: services S1 (life stage property) and S2 (mortality rate for period); ports P1, P2, P3, P4, P5]
      structType(P2):
        root population = (sample)*
        elem sample = (meas, lsp)
        elem meas = (cnt, acc)
        elem cnt = xsd:integer
        elem acc = xsd:double
        elem lsp = xsd:string
      Example instance of structType(P2):
        <population>
          <sample>
            <meas> <cnt>44,000</cnt> <acc>0.95</acc> </meas>
            <lsp>Eggs</lsp>
          </sample>
          …
        </population>
      structType(P3):
        root cohortTable = (measurement)*
        elem measurement = (phase, obs)
        elem phase = xsd:string
        elem obs = xsd:integer
      Example instance of structType(P3):
        <cohortTable>
          <measurement> <phase>Eggs</phase> <obs>44,000</obs> </measurement>
          …
        </cohortTable>
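For these two structural types, a hand-written transformation from the `population` structure to the `cohortTable` structure might look like the following Python sketch. It only illustrates what such a mapping has to do; it is not the transformation derived by the framework. The function name `population_to_cohort_table` is hypothetical, the second sample's accuracy value is made up for the example, and counts are written without the thousands separator so that they are valid `xsd:integer` values.

```python
import xml.etree.ElementTree as ET

# Sample instance of structType(P2); the second <acc> value is illustrative.
SOURCE = """<population>
  <sample><meas><cnt>44000</cnt><acc>0.95</acc></meas><lsp>Eggs</lsp></sample>
  <sample><meas><cnt>3513</cnt><acc>0.95</acc></meas><lsp>Instar I</lsp></sample>
</population>"""


def population_to_cohort_table(population: ET.Element) -> ET.Element:
    """Restructure each <sample> into a <measurement> of a <cohortTable>."""
    table = ET.Element("cohortTable")
    for sample in population.findall("sample"):
        measurement = ET.SubElement(table, "measurement")
        # lsp (life stage) becomes phase; cnt (count) becomes obs.
        ET.SubElement(measurement, "phase").text = sample.findtext("lsp")
        ET.SubElement(measurement, "obs").text = sample.findtext("meas/cnt")
        # acc has no counterpart in the target type and is dropped.
    return table


result = population_to_cohort_table(ET.fromstring(SOURCE))
print(ET.tostring(result, encoding="unicode"))
```

The accuracy value is discarded because the target type has no element for it, which matches the case where the source carries more information than the target requires.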
  • 21. Example Semantic Types. Portion of SEEK measurement ontology. [Figure: concepts Observation, MeasContext, MeasProperty, Entity, AccuracyQualifier, EcologicalProperty, AbundanceCount, LifeStageProperty, NumericValue, SpatialLocation; properties hasContext, appliesTo, hasProperty, itemMeasured, hasLocation, hasCount, hasValue; edge cardinalities 0:*, 1:1 and 1:*]
  • 22. Example Semantic Types. Portion of SEEK measurement ontology. [Figure: same ontology diagram as on the previous slide] Same in OWL, a description logic standard (here, Sparrow syntax):
      Observation subClassOf forall hasContext/MeasContext and forall hasProperty/MeasProperty and exists itemMeasured/Entity.
      MeasContext subClassOf exists appliesTo/Entity and atMost 1/appliesTo.
      EcologicalProperty subClassOf Entity.
      LifeStageProperty subClassOf EcologicalProperty.
      AbundanceCount subClassOf EcologicalProperty and exists hasLocation/SpatialLocation and atMost 1/hasLocation and exists hasCount/NumericValue and atMost 1/hasCount.
  • 23. Example Semantic Types. Semantic types for P2 and P3. [Workflow figure: services S1 (life stage property) and S2 (mortality rate for period); ports P1, P2, P3, P4, P5] [Figure: semType(P2) ⊑ semType(P3). Both are Observations with hasContext (1:1) MeasContext, which appliesTo (1:1) LifeStageProperty, and itemMeasured (1:1) AbundanceCount with hasCount (1:1) Number Value; semType(P2) additionally has hasProperty (1:1) AccuracyQualifier with hasValue (1:1)]
  • 24. Example Semantic Types. Semantic types for P2 and P3. [Figures as on the previous slide: semType(P2) ⊑ semType(P3)]
      semType(P3) subClassOf Observation and exists hasContext/(MeasurementContext and exists appliesTo/LifeStageProperty and atMost 1/appliesTo) and exists itemMeasured/AbundanceCount and atMost 1/itemMeasured.
      semType(P2) subClassOf Observation and exists hasContext/(MeasurementContext and exists appliesTo/LifeStageProperty and atMost 1/appliesTo) and exists itemMeasured/AbundanceCount and atMost 1/itemMeasured and exists hasProperty/AccuracyQualifier and atMost 1/hasProperty.
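Why semType(P2) ⊑ semType(P3) holds can be seen by comparing the conjuncts of the two definitions: every conjunct of semType(P3) also appears in semType(P2). The Python sketch below checks exactly that containment. It is a deliberately simplified stand-in for a description-logic reasoner, under the assumption that each semantic type is treated as the conjunction of its listed conjuncts; the name `is_subsumed_by` is hypothetical, and the check can miss subsumptions that a real OWL reasoner would find.

```python
# Conjuncts copied from the Sparrow definitions, treated as opaque strings.
SEM_TYPE_P3 = frozenset({
    "Observation",
    "exists hasContext/(MeasurementContext and exists appliesTo/"
    "LifeStageProperty and atMost 1/appliesTo)",
    "exists itemMeasured/AbundanceCount",
    "atMost 1/itemMeasured",
})

# semType(P2) repeats all conjuncts of semType(P3) and adds two more.
SEM_TYPE_P2 = SEM_TYPE_P3 | {
    "exists hasProperty/AccuracyQualifier",
    "atMost 1/hasProperty",
}


def is_subsumed_by(sub: frozenset, sup: frozenset) -> bool:
    """Return True if every conjunct of `sup` also occurs in `sub`.

    A conjunction with additional conjuncts is at least as specific,
    so containment of conjunct sets implies sub ⊑ sup (but not vice versa).
    """
    return sup <= sub


print(is_subsumed_by(SEM_TYPE_P2, SEM_TYPE_P3))  # P2 is more specific
print(is_subsumed_by(SEM_TYPE_P3, SEM_TYPE_P2))  # P3 lacks the accuracy conjuncts
```

The extra `hasProperty/AccuracyQualifier` conjuncts make semType(P2) the more specific type, which is why output from P2 is semantically acceptable as input to P3 but not the other way round.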
  • 25. Outline • The SEEK Project • Scientific Workflows • The Problem: Reusing Structurally Incompatible Services • The Ontology-Driven Framework • Future Work
  • 26. The Ontology-Driven Framework
    – Define semantic registration mappings (“semantic views”) to connect structural and semantic types
    – Use registration mappings to (semi-)automate transformation, based on derived structural correspondences
    – Depending on the ontologies and registration mappings, it may not be possible to find an appropriate d … (since the correspondence is often underspecified)
  • 27. The Ontology-Driven Framework
    [Diagram: a source service with output port Ps and a target service with input port Pt, joined by the desired connection. Each port has a structural type and a semantic type, linked by a registration mapping (output side for Ps, input side for Pt). The semantic types are defined over ontologies (OWL) and must be compatible: semantic type of Ps ⊑ semantic type of Pt.]
  • 28. Registration Example (simple XPaths)
    Registration mapping for P2:
      /population/sample == semType(P2)
      /population/sample/meas/cnt == semType(P2).itemMeasured
      /population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount
      /population/sample/meas/acc == semType(P2).hasProperty
      /population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue
      /population/sample/lsp/text() == semType(P2).hasContext.appliesTo
    structType(P2):
      root population = (sample)*
      elem sample = (meas, lsp)
      elem meas = (cnt, acc)
      elem cnt = xsd:integer
      elem acc = xsd:double
      elem lsp = xsd:string
    Sample document:
      <population>
        <sample>
          <meas> <cnt>44,000</cnt> <acc>0.95</acc> </meas>
          <lsp>Eggs</lsp>
        </sample>
        …
      </population>
  • 29. Registration Example (simple XPaths), same mapping, schema and sample document as slide 28. Rule /population/sample == semType(P2): each sample is an instance of the semantic type.
  • 30. Registration Example (simple XPaths), same mapping, schema and sample document as slide 28. Rule /population/sample/meas/cnt == semType(P2).itemMeasured: each sample’s cnt represents the itemMeasured object.
  • 31. Registration Example (simple XPaths), same mapping, schema and sample document as slide 28. Rule /population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount: the value of each sample’s cnt represents the hasCount value of the corresponding itemMeasured object.
  • 32. Registration Example (simple XPaths)
    … similarly for P3 …
    Registration mapping for P3:
      /cohortTable/measurement == semType(P3)
      /cohortTable/measurement/obs == semType(P3).itemMeasured
      /cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount
      /cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo
    structType(P3):
      root cohortTable = (measurement)*
      elem measurement = (phase, obs)
      elem phase = xsd:string
      elem obs = xsd:integer
    Sample document:
      <cohortTable>
        <measurement>
          <phase>Eggs</phase>
          <obs>44,000</obs>
        </measurement>
        …
      </cohortTable>
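    To make the registration rules on slides 28 to 32 concrete, here is a minimal sketch that stores the P2 mapping as (simple XPath, semantic path) pairs and evaluates each XPath against the sample document with Python's standard `xml.etree.ElementTree`. The helpers `select` and `annotate` are hypothetical and only handle the absolute child-step paths with an optional trailing `text()` that appear on the slides.

```python
import xml.etree.ElementTree as ET

# Sample document from the slide (the count is kept as the literal "44,000").
SAMPLE_XML = """
<population>
  <sample>
    <meas><cnt>44,000</cnt><acc>0.95</acc></meas>
    <lsp>Eggs</lsp>
  </sample>
</population>
"""

# Registration mapping for P2: (simple XPath, semantic path) pairs.
REGISTRATION_P2 = [
    ("/population/sample", "semType(P2)"),
    ("/population/sample/meas/cnt", "semType(P2).itemMeasured"),
    ("/population/sample/meas/cnt/text()", "semType(P2).itemMeasured.hasCount"),
    ("/population/sample/meas/acc", "semType(P2).hasProperty"),
    ("/population/sample/meas/acc/text()", "semType(P2).hasProperty.hasValue"),
    ("/population/sample/lsp/text()", "semType(P2).hasContext.appliesTo"),
]


def select(root, xpath):
    """Evaluate an absolute simple XPath (child steps, optional text())."""
    steps = xpath.strip("/").split("/")
    want_text = steps[-1] == "text()"
    if want_text:
        steps = steps[:-1]
    if steps[0] != root.tag:
        return []
    # ElementTree paths are relative to the element, so drop the root step.
    nodes = root.findall("/".join(steps[1:])) if len(steps) > 1 else [root]
    return [n.text or "" for n in nodes] if want_text else nodes


def annotate(root, registration):
    """Map each semantic path to the tags or text values its XPath selects."""
    result = {}
    for xpath, sem_path in registration:
        hits = select(root, xpath)
        result[sem_path] = [h if isinstance(h, str) else h.tag for h in hits]
    return result


for sem_path, hits in annotate(ET.fromstring(SAMPLE_XML), REGISTRATION_P2).items():
    print(sem_path, "<-", hits)
```

    Each selected fragment is thereby labelled with the ontology path it instantiates, which is the information the later composition step relies on.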
  • 33. The Ontology-Driven Framework
    [Diagram: as on slide 27 (source service port Ps, target service port Pt, their structural and semantic types, output and input registration mappings, ontologies in OWL, compatibility ⊑), now with a Correspondence added.]
  • 34. Correspondence Example
    Source-side semantic registration mapping:
      /population/sample == semType(P2)
      /population/sample/meas/cnt == semType(P2).itemMeasured
      /population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount
      /population/sample/meas/acc == semType(P2).hasProperty
      /population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue
      /population/sample/lsp/text() == semType(P2).hasContext.appliesTo
    Target-side semantic registration mapping:
      /cohortTable/measurement == semType(P3)
      /cohortTable/measurement/obs == semType(P3).itemMeasured
      /cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount
      /cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo
    Source structure: population contains sample*; sample contains meas and lsp (xsd:string); meas contains cnt (xsd:integer) and acc (xsd:double)
    Target structure: cohortTable contains measurement*; measurement contains phase (xsd:string) and obs (xsd:integer)
  • 35. Correspondence Example, same source and target registration mappings and structures as slide 34. We want to “compose” the registrations to obtain structural correspondences.
  • 36. Correspondence Example. /population/sample == semType(P2) and /cohortTable/measurement == semType(P3): these fragments correspond.
  • 37. Correspondence Example. /population/sample/meas/cnt == semType(P2).itemMeasured and /cohortTable/measurement/obs == semType(P3).itemMeasured: these fragments correspond.
  • 38. Correspondence Example. /population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount and /cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount: these fragments correspond.
  • 39. Correspondence Example. /population/sample/lsp/text() == semType(P2).hasContext.appliesTo and /cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo: these fragments correspond.
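    The pairing shown on slides 36 to 39 can be sketched as a join of the two registration mappings on their semantic paths. This is a simplified illustration under a stated assumption: paths match only when they are identical after the leading semType(...) is dropped, and semType(P2) ⊑ semType(P3) is taken as already established. The paper's composition uses a semantic subpath (subconcept) test instead. The names `strip_root` and `compose` are hypothetical.

```python
from typing import Dict, List, Tuple

Registration = List[Tuple[str, str]]  # (simple XPath, semantic path)

REG_SOURCE: Registration = [
    ("/population/sample", "semType(P2)"),
    ("/population/sample/meas/cnt", "semType(P2).itemMeasured"),
    ("/population/sample/meas/cnt/text()", "semType(P2).itemMeasured.hasCount"),
    ("/population/sample/meas/acc", "semType(P2).hasProperty"),
    ("/population/sample/meas/acc/text()", "semType(P2).hasProperty.hasValue"),
    ("/population/sample/lsp/text()", "semType(P2).hasContext.appliesTo"),
]

REG_TARGET: Registration = [
    ("/cohortTable/measurement", "semType(P3)"),
    ("/cohortTable/measurement/obs", "semType(P3).itemMeasured"),
    ("/cohortTable/measurement/obs/text()", "semType(P3).itemMeasured.hasCount"),
    ("/cohortTable/measurement/phase/text()", "semType(P3).hasContext.appliesTo"),
]


def strip_root(sem_path: str) -> str:
    """Drop the leading semType(...) so paths of different ports compare."""
    _head, _dot, rest = sem_path.partition(".")
    return rest


def compose(source: Registration, target: Registration) -> List[Tuple[str, str]]:
    """Return (source XPath, target XPath) pairs registered to the same path."""
    by_path: Dict[str, List[str]] = {}
    for xpath, sem in source:
        by_path.setdefault(strip_root(sem), []).append(xpath)
    return [
        (qs, qt)
        for qt, sem in target
        for qs in by_path.get(strip_root(sem), [])
    ]


for qs, qt in compose(REG_SOURCE, REG_TARGET):
    print(qs, "->", qt)
```

    The two acc rules on the source side find no partner, since the target registers nothing under hasProperty; the accuracy value is simply not carried over.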
  • 40. The Ontology-Driven Framework
    [Diagram: as on slide 33 (source service port Ps, target service port Pt, their structural and semantic types, registration mappings, ontologies in OWL, compatibility ⊑, correspondence), now with the final step added: generate the transformation d(Ps).]
  • 41. Example Result (XQuery)
    Based on the structural correspondences and certain assumptions, we derive the transformation XQuery:
      <cohortTable> {
        for $s in /population/sample
        return
          <measurement>
            { for $c in $s/meas/cnt return <obs>{$c/text()}</obs> }
            { for $l in $s/lsp return <phase>{$l/text()}</phase> }
          </measurement>
      } </cohortTable>
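    For readers without an XQuery processor at hand, the derived query can be mirrored loop for loop in Python with the standard `xml.etree.ElementTree`. This is a hand-written equivalent for illustration, not output of the framework; the function name `transform` is hypothetical.

```python
import xml.etree.ElementTree as ET


def transform(population: ET.Element) -> ET.Element:
    """Rebuild a cohortTable from a population element, as the XQuery does."""
    cohort = ET.Element("cohortTable")
    # for $s in /population/sample return <measurement>...</measurement>
    for sample in population.findall("sample"):
        measurement = ET.SubElement(cohort, "measurement")
        # for $c in $s/meas/cnt return <obs>{$c/text()}</obs>
        for cnt in sample.findall("meas/cnt"):
            ET.SubElement(measurement, "obs").text = cnt.text
        # for $l in $s/lsp return <phase>{$l/text()}</phase>
        for lsp in sample.findall("lsp"):
            ET.SubElement(measurement, "phase").text = lsp.text
    return cohort


source = ET.fromstring(
    "<population><sample>"
    "<meas><cnt>44,000</cnt><acc>0.95</acc></meas><lsp>Eggs</lsp>"
    "</sample></population>"
)
print(ET.tostring(transform(source), encoding="unicode"))
```

    Like the XQuery, this emits obs before phase, whereas structType(P3) on slide 32 lists (phase, obs); if the target validates element order strictly, the two inner loops would presumably need to be swapped.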
  • 42. Assumptions Made (or why this may not work for you…)
    • Common XPath prefixes refer to the same element
    • Elements in correspondences have compatible cardinalities
      – source is equivalent to or stricter than target (e.g., + is stricter than *)
    • Primitive data types are compatible
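    The cardinality assumption can be illustrated with a small check. The sketch assumes, as an interpretation not spelled out on the slide, that each occurrence indicator denotes a range of allowed counts and that "stricter" means the source range lies inside the target range. The table `RANGES` and the function `cardinality_compatible` are hypothetical.

```python
import math

# Assumed reading of occurrence indicators as (min, max) occurrence ranges.
RANGES = {
    "1": (1, 1),         # exactly one
    "?": (0, 1),         # optional
    "+": (1, math.inf),  # one or more
    "*": (0, math.inf),  # zero or more
}


def cardinality_compatible(source: str, target: str) -> bool:
    """True if every count the source allows is also allowed by the target."""
    s_min, s_max = RANGES[source]
    t_min, t_max = RANGES[target]
    return t_min <= s_min and s_max <= t_max


# "+" is stricter than "*", so a "+" source may feed a "*" target.
print(cardinality_compatible("+", "*"))
print(cardinality_compatible("*", "+"))
```

    In the running example both sample and measurement are starred, so the check passes trivially for the top-level correspondence.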
  • 43. Framework Operations and Properties
    In the paper, we define:
    – A semantic registration mapping R as a set of rules q ↔ p, where q is a substructure selection (query) and p is a contextual path (a path in an ontology)
    – A structural correspondence as a rule qs → qt, where qs and qt are substructure selections over the source and target, respectively
    – The semantic composition of registration mappings Rs and Rt, which returns a set of structural correspondence rules
    – The semantic subpath operation (subconcept), which is used by the semantic composition to find matching substructure selection rules
  • 44. Framework Operations and Properties
    In the paper, we define:
    – Registration mapping properties (cardinality consistency, partial and complete registrations) and discuss their impact on determining structural transformations
    – The simple XPath and Semantic Path languages for defining registration mappings, and the corresponding semantic join operator to find correspondences
  • 45. Outline • The SEEK Project • Scientific Workflows • The Problem: Reusing Structurally Incompatible Services • The Ontology-Driven Framework • A Simple Framework Implementation • Future Work
  • 46. Future Work
    • Extend the registration mapping language
      – XPath is too limited …
      → try a more general query language (e.g., XPath + variables)
      → relational/Datalog-based substructure selection (query)
    • Formalize the properties of registration mappings and their effect on automated transformation
    • Introduce conversion routines (e.g., for units) at the ontology level; apply them in transformations
    • Extend transformations to different computation models and workflow scheduling algorithms
    • Add to the Kepler Scientific Workflow System
  • 47. Acknowledgements • NSF/ITR Science Environment for Ecological Knowledge • NSF/ITR Geosciences Network • NIH Biomedical Informatics Research Network • DOE Scientific Data Management Center
  • 48. Questions …