This document discusses the evolution of workflow provenance information in the presence of custom inference rules. It presents three inference rules for provenance data: an actor who carried out an activity also carried out all of its subactivities; an object used for an event implies that all of its parts were used for that event; and an information object was present at any event at which a physical object carrying it was present. It examines how to handle deletions from a provenance knowledge base under these rules, either by also deleting every inferred fact that is no longer supported or by deleting inferred facts only when the request requires it, and considers the complexity of the approaches.
Evolution of Workflow Provenance Information in the Presence of Custom Inference Rules
1. Evolution of Workflow Provenance Information in the Presence of Custom Inference Rules
C. Strubulis, Y. Tzitzikas, M. Doerr and G. Flouris
{strubul, tzitzik, martin, fgeo}@ics.forth.gr
Institute of Computer Science, Foundation for Research and Technology - Hellas (ICS-FORTH), Herakleion, Crete
Computer Science Department, University of Crete
SWPM Workshop, 28/05/2012
2. Outline
• Provenance-based inference rules
• Knowledge evolution of provenance information
3. What is Provenance?
• Etymology
  ▫ French verb "provenir"
  ▫ Meaning: to come forth, originate
• The Merriam-Webster Online Dictionary
  ▫ the origin, source
  ▫ the history of ownership of a valued object or work of art or literature
4. Why is Provenance Important?
[Figure: two images, anon4877_base_20060331.jpg and anon4877_lesion_20060401.jpg]
• Reproducibility
• Data quality
• Attribution
• Informational
• How were these images created?
• Was any pre-processing applied to the raw data?
• Who created them?
• What's the difference?
• Are they really from the same patient?
Provenance is as important as (or more important than!) the experimental results.
5. Motivation
• Reduce the amount of provenance information that has to be stored (produced by scientific workflow systems)
• Reduce the time and human effort in the case of manual ingestion of provenance metadata in repositories
• Eliminate errors starting from the input data
  ▫ Reduction of the error search space -> easier error correction
6. Storage Space Challenge
[Figure: plot of storage space against data production time. Labels: extra data for provenance; adoption of inference mechanisms.]
7. User Input Challenge
[Figure: diagram of manual provenance recording. Labels as extracted: Record, Provenance, Digital, Metadata Ingestion, RDF/S, Repository.]
8. User Input Challenge
[Figure: time axis running from Day 1 to Day 11111. Label: adoption of inference mechanisms.]
9. Error Correction Challenge
[Figure: diagram with the labels Scientific workflow, Provenance record, Data, and Metadata ingestion. Callouts: "Which one to correct?" and "From where to start searching?". Label: adoption of inference mechanisms.]
10. Approach-Results
• Our Approach
  ▫ Dynamic completion of the stored knowledge by logical assumptions (inferences)
  ▫ Identify and specify some basic provenance-based inference rules
  ▫ In addition, we tackle the knowledge evolution requirements
    - The question is how we can satisfy update requests while still supporting the aforementioned provenance-based inference rules.
    - Operations: Disassociation, Contraction
[Figure label: Query results + Inferences]
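The idea of completing the stored knowledge dynamically can be illustrated with a small query-time sketch. It is only an assumption-laden illustration, not the authors' implementation: the dictionaries `PART_OF` and `CARRIED_OUT_BY` and the function `actors_of` are hypothetical names, the data is taken loosely from the laser scanning example used later in the slides, and each activity is assumed to have at most one direct superactivity. Only one actor association is stored; the others are inferred when a query is answered.

```python
from typing import Dict, Optional, Set

# Hypothetical stored (explicit) knowledge.
# activity -> its direct superactivity ("P9 forms part of")
PART_OF: Dict[str, str] = {
    "Detailed sequence of shots": "Laser scanning acquisition",
    "Capture 1": "Detailed sequence of shots",
    "Capture 10": "Detailed sequence of shots",
}
# activity -> actors recorded for it ("P14 carried out by")
CARRIED_OUT_BY: Dict[str, Set[str]] = {
    "Laser scanning acquisition": {"Starc Institute"},
}


def actors_of(activity: str) -> Set[str]:
    """Return stored actors plus those inherited from superactivities.

    Nothing is written back to the store: the inferred associations
    exist only in the query result.
    """
    actors: Set[str] = set()
    seen: Set[str] = set()  # guards against accidental cycles in PART_OF
    current: Optional[str] = activity
    while current is not None and current not in seen:
        seen.add(current)
        actors |= CARRIED_OUT_BY.get(current, set())
        current = PART_OF.get(current)
    return actors


print(sorted(actors_of("Capture 1")))  # ['Starc Institute']
```

In this sketch the store holds a single actor fact regardless of how many captures exist, which is the storage-saving effect the slides aim for.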
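11. The Assumed Provenance Model
• Small part of CIDOC CRM
• Most provenance models have similar concepts!!!
[Figure: class diagram. Recoverable classes and properties are listed below.]
Classes
  ▫ E5 Event
  ▫ E7 Activity (IsA E5 Event)
  ▫ E39 Actor
  ▫ E73 Information Object
  ▫ E24 Physical Man-Made Thing
  ▫ E22 Man-Made Object (Device) (IsA E24 Physical Man-Made Thing)
Properties
  ▫ P9 forms part of (E7 Activity -> E7 Activity)
  ▫ P14 carried out by (E7 Activity -> E39 Actor)
  ▫ P16 was used for (E22 Man-Made Object -> E7 Activity)
  ▫ P46 forms part of (E22 Man-Made Object -> E22 Man-Made Object)
  ▫ P128 carries (E24 Physical Man-Made Thing -> E73 Information Object)
  ▫ P12 was present at (E24 Physical Man-Made Thing -> E5 Event; E73 Information Object -> E5 Event)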
12. The three inference rules
R1:
If an actor has carried out one activity,
then (s)he has carried out all of its subactivities.
R2:
If an object (device) was used for an event,
then all parts of that object were also used for
that event.
R3:
If a physical object that carries an information object
was present at an event, then that information object
was also present at the event.
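The three rules can be read as forward-chaining rules over subject-property-object triples. The following is a minimal sketch under that assumption; the triple layout, the function names `apply_rules_once` and `closure`, and the example facts (in particular the event at which the column is present) are illustrative choices, not taken from the slides.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

P9 = "P9 forms part of"      # (subactivity, P9, activity)
P12 = "P12 was present at"   # (thing, P12, event)
P14 = "P14 carried out by"   # (activity, P14, actor)
P16 = "P16 was used for"     # (object, P16, event)
P46 = "P46 forms part of"    # (part, P46, whole object)
P128 = "P128 carries"        # (physical object, P128, information object)


def apply_rules_once(kb: Set[Triple]) -> Set[Triple]:
    """Return the facts derivable in one step that are not yet in kb."""
    inferred: Set[Triple] = set()
    for s, p, o in kb:
        for s2, p2, o2 in kb:
            # R1: actors propagate from an activity to its subactivities.
            if p == P9 and p2 == P14 and o == s2:
                inferred.add((s, P14, o2))
            # R2: if an object was used for an event, so were its parts.
            if p == P46 and p2 == P16 and o == s2:
                inferred.add((s, P16, o2))
            # R3: an information object is present wherever its carrier is.
            if p == P128 and p2 == P12 and s == s2:
                inferred.add((o, P12, o2))
    return inferred - kb


def closure(kb: Set[Triple]) -> Set[Triple]:
    """Apply R1-R3 repeatedly until no new fact appears."""
    result = set(kb)
    while True:
        new = apply_rules_once(result)
        if not new:
            return result
        result |= new


# Hypothetical explicit facts, loosely based on the examples in the slides.
explicit: Set[Triple] = {
    ("Detailed sequence of shots", P9, "Laser scanning acquisition"),
    ("Capture 1", P9, "Detailed sequence of shots"),
    ("Laser scanning acquisition", P14, "Starc Institute"),
    ("Multiviewdome device", P16, "Detailed sequence of shots"),
    ("Nikon D90", P46, "Multiviewdome device"),
    ("Part of column of Ramesses II", P128, "Information in hieroglyphics"),
    ("Part of column of Ramesses II", P12, "Detailed sequence of shots"),
}
full = closure(explicit)
for fact in sorted(full - explicit):
    print(fact)  # only the inferred facts
```

The nested loop is quadratic in the number of stored triples per pass, which is acceptable for an illustration; an indexed store would be the natural choice for real data.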
14. Actors - Activities
• If an actor has carried out one activity, then (s)he has carried out all of its subactivities.
[Figure: activity graph. Capture 1 ... Capture 10 each form part of (P9) Detailed sequence of shots, which forms part of (P9) Laser scanning acquisition. The actors shown are Starc Institute and John, linked by P14 carried out by edges; further P14 carried out by edges appear at the level of Detailed sequence of shots and of the captures.]
15. Devices - Activities
• If an object was used for an event, then all parts of the object were used for that event too.
[Figure: Multiviewdome device P16 was used for Detailed sequence of shots. Nikon D90, AF-5_NIKKOR 18-105mm, ..., Nikon D300 each P46 forms part of Multiviewdome device. Further P16 was used for edges link the parts to Detailed sequence of shots.]
16. Information Objects - Events
• If a physical object that carries an information object was present at an event, then that information object was present at the event too.
[Figure: graph with the nodes 3D reconstruction, Part of column of Ramesses II, Detailed sequence of shots, Capture 1_10, and Information in hieroglyphics. Part of column of Ramesses II P128 carries Information in hieroglyphics; P12 was present at edges link the column and the information object to events.]
17. Outline
• Provenance-based inference rules
• Knowledge evolution of provenance information
18. Knowledge Evolution
[Figure labels: Inference rules; Evolution]
• Updating our knowledge is essential!!
• Requests for adding/removing information
• The use of inference rules introduces difficulties with respect to the evolution of knowledge
19. Example
• Consider a KB containing the activities of Laser Scan Acquisition
• Starc is propagated to all the subactivities of Laser Scan Acquisition by rule R1
• Update request: Starc was not responsible for Capture 1_1
• There are two ways to handle the request
[Figure: Laser scanning acquisition P14 carried out by Starc Institute. Detailed sequence of shots P9 forms part of Laser scanning acquisition. Capture 1 and Capture 10 P9 forms part of Detailed sequence of shots, with a P14 carried out by edge at the capture level.]
20. Foundational vs Coherence
• Foundational Viewpoint
  ▫ Each piece of our knowledge serves as a justification for other beliefs
  ▫ Implicit facts are supported by the explicit ones
  ▫ Explicit knowledge is more important than implicit knowledge
• Coherence Viewpoint
  ▫ Every piece of knowledge is self-justified
  ▫ Implicit knowledge needs no support from explicit knowledge
  ▫ Explicit and implicit knowledge have the same value
21. Deletion of a fact
• Foundational:
  ▫ All implicit data that is no longer supported must also be deleted
• Coherence:
  ▫ Delete implicit data only if it is necessary due to the deletion request
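The difference between the two viewpoints can be made concrete for rule R1. The sketch below is one possible reading of the two bullet points above, not the authors' algorithm: the names `PARENT`, `closure`, `delete_foundational` and `delete_coherence` are hypothetical, each activity is assumed to have at most one superactivity, and the coherence variant simply makes every surviving fact explicit rather than storing a minimal set.

```python
from typing import Dict, Set, Tuple

Fact = Tuple[str, str]  # (activity, actor): the activity was carried out by the actor

# Hypothetical activity hierarchy: activity -> direct superactivity (P9).
PARENT: Dict[str, str] = {
    "Detailed sequence of shots": "Laser scanning acquisition",
    "Capture 1": "Detailed sequence of shots",
    "Capture 10": "Detailed sequence of shots",
}


def ancestors(activity: str) -> Set[str]:
    """All direct and indirect superactivities of an activity."""
    result: Set[str] = set()
    while activity in PARENT:
        activity = PARENT[activity]
        result.add(activity)
    return result


def closure(explicit: Set[Fact]) -> Set[Fact]:
    """Explicit facts plus everything rule R1 derives from them."""
    activities = set(PARENT) | set(PARENT.values()) | {a for a, _ in explicit}
    return {
        (act, actor)
        for act in activities
        for src, actor in explicit
        if act == src or src in ancestors(act)
    }


def delete_foundational(explicit: Set[Fact], activity: str, actor: str) -> Set[Fact]:
    """Remove every explicit fact that supports the unwanted one.

    Implicit facts that lose their support disappear with it.
    """
    blocked = {activity} | ancestors(activity)
    return {(a, x) for a, x in explicit if not (x == actor and a in blocked)}


def delete_coherence(explicit: Set[Fact], activity: str, actor: str) -> Set[Fact]:
    """Remove only the facts that would re-derive the unwanted one.

    All other previously implied facts are kept by storing them explicitly.
    """
    blocked = {activity} | ancestors(activity)
    return closure(explicit) - {(a, actor) for a in blocked}


explicit: Set[Fact] = {("Laser scanning acquisition", "Starc Institute")}
after_foundational = closure(delete_foundational(explicit, "Capture 1", "Starc Institute"))
after_coherence = closure(delete_coherence(explicit, "Capture 1", "Starc Institute"))
print(after_foundational)  # set()
print(after_coherence)     # {('Capture 10', 'Starc Institute')}
```

In both variants the association with the superactivities has to go, because R1 would otherwise derive the deleted fact again; the variants differ only in whether the remaining inferred associations survive.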
22. Example
• Consider a KB containing the activities of Laser Scan Acquisition
• Starc is propagated to all the subactivities of Laser Scan Acquisition by rule R1
• Update request: Starc was not responsible for Capture 1_1
• Two cases:
  ▫ Actor disassociation (foundational)
  ▫ Actor contraction (coherence)
[Figure: Laser scanning acquisition P14 carried out by Starc Institute. Detailed sequence of shots P9 forms part of Laser scanning acquisition. Capture 1 and Capture 10 P9 forms part of Detailed sequence of shots, with a P14 carried out by edge at the capture level.]
23.
• Update request: Starc was not responsible for Capture 1
[Figure: two side-by-side copies of the activity graph, labelled Foundational and Coherence. Each shows Starc Institute, Laser scanning acquisition, Detailed sequence of shots, Capture 1 and Capture 10, connected by P9 forms part of and P14 carried out by edges.]
24. Complexity analysis
• Similar operations can also be defined for rules R2 and R3
25. Conclusion
• Provenance-based Inference Rules
  ▫ We motivated the need for provenance-based inference rules to
    - reduce the storage space requirements
    - ease the ingestion of metadata and the error correction
  ▫ We identified three basic rules accompanied by real-world examples.
• Provenance-based Inference Rules and Knowledge Evolution
  ▫ The use of inference rules introduces difficulties with respect to the evolution of knowledge
  ▫ We identified two ways to deal with deletions in this context
  ▫ Even though we confined ourselves to CIDOC CRM and to three specific inference rules, the general ideas behind our work (including the distinction between foundational and coherence semantics of deletion) can be applied to other models and/or sets of inference rules.