Acquisition and Understanding of
   Process Knowledge Using
   Problem Solving Methods

               Jose Manuel Gómez Pérez

                  Facultad de Informática
             Universidad Politécnica de Madrid
              Campus de Montegancedo sn
             28660 Boadilla del Monte, Madrid

                 http://www.oeg-upm.net

                   jmgomez@fi.upm.es
                  Phone: 34.91.3363670
                    Fax: 34.91.3524819



PhD thesis                                       Date: 08/05/2009
Outline




          Introduction and motivation
          Open research problems and work hypotheses
          Acquisition of process knowledge by SMEs
          Provenance analysis of process executions by SMEs
          Evaluation
          Conclusions and future research problems




Jose Manuel Gómez Pérez – Acquisition and Understanding of Process Knowledge Using Problem Solving Methods, PhD thesis        2
Knowledge Acquisition: Towards SME empowerment

[Figure: evolution of knowledge acquisition (KA) from KA by Knowledge Engineers to KA by Subject Matter Experts]

  Actors:      Subject Matter Expert (SME), Knowledge Engineer (KE)
  Milestones:  The Knowledge Acquisition Bottleneck; The Knowledge Level; The Role Differentiation Principle; DARPA’s KSE; KA Frameworks; DARPA’s HPKB & RKF
  KA by Knowledge Engineers (KEs):      knowledge modeling; knowledge programming; ontologies; Problem Solving Methods; ontology editors; KRR languages (OCML, KARL)
  KA by Subject Matter Experts (SMEs):  knowledge formulation by SMEs (SHAKEN); KB edition by SMEs (KRAKEN, DISCIPL-RKF); KB maintenance (CHIMAERA); collaborative knowledge creation (semantic wikis)

Knowledge types
Source: the Halo project KR analysis phase for the domains of Chemistry, Biology, and Physics

  RUL    (inference)          CLS   (classification)    FACT   (factual knowledge)   MAT   (mathematics)
  CMP    (comparison)         TAB   (tables)            PCS    (processes)           CAUS  (cause-effect)
  DAT    (data structures)    PROC  (procedural)        EXP    (experiments)         US    (underspecified)
  TRANS  (translation)        NF    (non functional)    SPACE  (spatial)             PWR   (part-whole)
                              TIME  (temporal)          GRA    (diagrammatic)

•      Processes are a special knowledge type that
          • Relates to the sequence of operations and involved resources leading to the production of some outcome
          • Encapsulates preconditions, results, contents, actors, and causes
•      Process knowledge is complex
          • It builds on top of other simpler knowledge types, like facts and rules

Example questions:
  “What is released/added/increased upon binding of two amino acids?”
  “A piece of solid calcium is heated in oxygen gas. ...”
  “Find correct RNA sequence for a given DNA sequence.”

Why processes are important

•        Processes appear in 37% (on average) across the domains of Biology, Chemistry, and Physics
•        The most important knowledge type in Chemistry (53%)
•        Second in Biology (35%)
•        Fourth in Physics (22%)

Source: The Halo project KR analysis phase for the domains of Chemistry, Biology, and Physics

Work objectives
Objective 1: To enable SMEs to formulate processes without KEs
Objective 2: To support SMEs in understanding process executions

What:   PCS (processes)
How:    PSMs
          • Provide reusable guidelines to formulate process knowledge
          • Support reasoning
          • Describe the main rationale behind a process
Whom:   SMEs

PSM perspectives
Task-method decomposition
  • Hierarchically defines how tasks decompose into simpler (sub)tasks
  • Describes tasks at several levels of detail
  • Provides alternative ways to achieve a task

Interaction
  • Black-box perspective
  • Knowledge transformation within the PSM

Knowledge flow
  • PSM establishes and controls the sequence of actions required to perform a task
  • Defines knowledge required at each task step

Diagram legend: Task, Method, Role

Provenance analysis of process executions




                                                  ?
In summary

  •      This thesis proposes the use of PSMs as a novel approach for
         supporting SMEs both in the formulation of process knowledge
         and in the provenance analysis of process executions

  •      It also explores to what extent it is possible to build tools
         that take KEs out of the formulation and analysis loop

  •      Ultimately, it aims at showing that it is possible to
         engage users
            • To generate computer-readable content
              represented in formal languages
            • To apply knowledge representation and reasoning
              techniques to analyze the outcomes of
              automated, knowledge-intensive processes




Open research problems and work hypotheses: Objective 1

Objective 1: To provide SMEs with the means to formulate process knowledge in
their domains of expertise without the intervention of KEs

•      H1: Empowering SMEs can increase KB quality and reduce costs

•      H2: The complexity of process knowledge requires providing SMEs with
       specific means to acquire and reason with processes

•      H3: PSMs can reduce the complexity of acquiring process knowledge by SMEs

•      H4: The proposed methods and tools abstract SMEs from the underlying
       representation

•      H5: The proposed methods and tools produce sound and complete executable
       process models

•      H6: The proposed methods and tools are flexible and reusable across domains

Open research problems and work hypotheses: Objective 2
Objective 2: To support SMEs in analyzing and understanding process executions

•      H7: The analytical capabilities of PSMs can provide SMEs with
       meaningful interpretations of process executions

•      H8: The method proposed identifies the main rationale behind
       processes by detecting occurrences of PSMs in their execution logs

•      H9: The method proposed can use the hierarchical structure of PSMs
       to describe process executions at different levels of detail

Acquisition of process knowledge by SMEs
Objective 1: To provide SMEs with the means to formulate process knowledge in
their domains of expertise without the intervention of KEs

• Four main contributions

     • C1: a process metamodel, which provides the terminology
       necessary to express process entities in scientific domains, and
       the relations between them
     • C2: a PSM library, which provides high-level, reusable
       abstractions for process representation and a method for its
       development
     • C3: a graphical process modeling and reasoning environment, which
       applies the previous contributions in order to enable SMEs to
       formulate process knowledge
     • C4: a method for the automatic synthesis of executable process
       models from SME-authored process diagrams, supported by an
       underlying representation and reasoning formalism

Contribution 1: The process metamodel

•   Resources (roles)
      • Containers of domain concepts that can play a particular role
•   Actions
      • Inspired by activities in EO and TOVE
•   Relations
•   Forks
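
A minimal sketch of how the four metamodel elements listed above could be held in memory. The slide only names the element kinds, so every class name, field name, and the branching semantics given to Fork here are illustrative assumptions, not the thesis's actual metamodel.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """A role: a container of domain concepts that can play that role."""
    role: str                          # e.g. "TOOL" or "RESOURCE"
    concepts: frozenset = frozenset()  # domain concepts filling the role


@dataclass
class Action:
    """A process step with the resources it reads and produces (assumed fields)."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


@dataclass(frozen=True)
class Relation:
    """A named link between two process entities (hypothetical shape)."""
    name: str
    source: str
    target: str


@dataclass
class Fork:
    """A branching point in the process (assumed semantics)."""
    branches: list = field(default_factory=list)


# Hypothetical instances, reusing names from the jump example later in the deck.
tool = Resource(role="TOOL", concepts=frozenset({"mitochondrion"}))
accumulate = Action(name="accumulateEnergy", inputs=[tool])
used_by = Relation(name="IS_USED_BY", source="mitochondrion", target="accumulateEnergy")
```

Resources and relations are frozen because they act as values, while actions and forks stay mutable so a diagram editor could extend them incrementally.
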

Contribution 2: Building a PSM library for acquisition of process knowledge

Extends work done in the Halo analysis phase by the Omniscience and Ontoprise teams

[Figure: library development steps: Identification → Decomposition and abstraction → Reusable PSM library]

•      755 AP questions analyzed
•      >100 domain-specific processes
•      Four main process categories
          •      Join
          •      Split
          •      Modify
          •      Locate

Contribution 2: A PSM example (decompose & combine)
“Crystallization occurs when certain pairs of oppositely charged ions
attract each other so strongly that they form an insoluble ionic solid. This
process coexists with dissolution processes in precipitation reactions”

The addition of a colorless solution of potassium iodide (KI) to a colorless
solution of lead nitrate [Pb(NO3)2] produces a yellow precipitate of lead
iodide (PbI2) that slowly settles to the bottom of the beaker.

name                  decompose & combine
goal                  member(Recombination set, Element) and
                      member(Constituents set, Piece) and
                      part-of(Piece, Element) and
                      part-of(Piece, Combination) and
                      properties(Element, ep) and
                      properties(Combination, cp) and
                      not equal(ep, cp)
actions               decompose, combine
input action          decompose
output action         combine
input roles           Recombination set, Decomposer, Combinator
output roles          Combination, Byproduct

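The descriptor above maps naturally onto a small record type. The following is a sketch only: the `PSM` class, its field names, and the `is_well_formed` check are assumptions made for illustration, and the goal literals are kept as uninterpreted strings copied from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PSM:
    """Hypothetical in-memory form of a PSM library entry."""
    name: str
    goal: tuple           # conjunction of goal literals, kept as plain strings
    actions: tuple
    input_action: str
    output_action: str
    input_roles: tuple
    output_roles: tuple

    def is_well_formed(self) -> bool:
        """Input and output actions must be among the declared actions."""
        return (self.input_action in self.actions
                and self.output_action in self.actions)


DECOMPOSE_AND_COMBINE = PSM(
    name="decompose & combine",
    goal=(
        "member(Recombination set, Element)",
        "member(Constituents set, Piece)",
        "part-of(Piece, Element)",
        "part-of(Piece, Combination)",
        "properties(Element, ep)",
        "properties(Combination, cp)",
        "not equal(ep, cp)",
    ),
    actions=("decompose", "combine"),
    input_action="decompose",
    output_action="combine",
    input_roles=("Recombination set", "Decomposer", "Combinator"),
    output_roles=("Combination", "Byproduct"),
)
```

Calling `DECOMPOSE_AND_COMBINE.is_well_formed()` is one cheap sanity check a library loader could run before offering the PSM to an SME.
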
Contribution 3: The graphical process modeling environment

[Figure: the graphical process modeling environment, annotated with:]
  • Process metamodel
  • PSM library (e.g. decompose & recombine)
  • Domain process to which this process diagram is bound
  • Consistency maintenance (knowledge base and process data flow)
  • Domain-level reasoning and control flow evaluation

Contribution 4: The process representation and reasoning formalism

  •       Bridges the gap between the knowledge level and the operational level
  •       Focus on three main aspects
            • Process frame
            • Data flow
            • Control flow

[Figure: example process diagram, annotated with: Input action; Output action; Conditional precedence (control flow)]

“In a long-distance jump competition, an athlete can jump only after his
mitochondria have accumulated enough energy for his muscles to contract.”

Contribution 4: Addressing the frame problem
•     Two states (pre and post) per process action
         •     Pre state: portion of the process frame in the scope of an action
         •     Post state: updated pre state of the action after its execution
•     Actions read from their pre state and write into their post state

[Figure: pre state and post state of action Dissolve]

•     At modeling time we automatically synthesize process
      rules that manage the process frame during execution
         •     Setup rules: build the pre state of the input actions of the
               process
         •     Precedence rules: describe what actions can be
               connected with each other by means of their outputs and
               inputs
         •     Transition rules: describe the transition between pre and
               post states

Explicit manipulation of the process frame allows runtime management of data
and control flow

Contribution 4: The Code synthesis mechanism

•     Action-centric algorithm
•     Each action results in a set
      of process rules in F-logic

                      input actions    intermediate actions    output actions
setup rules                 x                   -                     -
transition rules            x                   x                     x
precedence rules            -                   x                     x

setup
 FORALL m, e, v
     m:mitochondrion@preState(accumulateEnergy) AND
     m:TOOL@preState(accumulateEnergy) AND
     e:energy@preState(accumulateEnergy) AND
     e:RESOURCE@preState(accumulateEnergy) AND
     e[hasEnergyValue -> v]@preState(accumulateEnergy)
 <-
     m:mitochondrion AND
     e:energy AND
     e[hasEnergyValue -> v].

transition
 FORALL m, e, j
     j:jump@postState(muscleContraction) AND
     j:OUTPUT@postState(muscleContraction) AND
     muscleContraction[PROVIDES -> j]@postState(muscleContraction)
 <-
     m:muscle@preState(muscleContraction) AND
     m:TOOL@preState(muscleContraction) AND
     m[IS_USED_BY -> muscleContraction]@preState(muscleContraction) AND
     e:energy@preState(muscleContraction) AND
     e:RESOURCE@preState(muscleContraction) AND
     e[IS_CONSUMED_BY -> muscleContraction]@preState(muscleContraction).

precedence
 FORALL e, v
     e:energy@preState(muscleContraction) AND
     e[hasEnergyValue -> v]@preState(muscleContraction)
 <-
     e:energy@postState(accumulateEnergy) AND
     e[hasEnergyValue -> v]@postState(accumulateEnergy).
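
The mapping between action types and rule kinds on this slide can be sketched in a few lines of Python. This is only an illustration, not the thesis implementation: the function names and the way an action's position is derived from the flow (no predecessor means input, no successor means output) are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Rule kinds per action position, taken from the table on this slide.
RULE_KINDS: Dict[str, List[str]] = {
    "input": ["setup", "transition"],
    "intermediate": ["transition", "precedence"],
    "output": ["transition", "precedence"],
}

def position(action: str, flow: List[Tuple[str, str]]) -> str:
    """Classify an action by its place in the flow (assumed convention)."""
    has_predecessor = any(dst == action for _, dst in flow)
    has_successor = any(src == action for src, _ in flow)
    if not has_predecessor:
        return "input"
    if not has_successor:
        return "output"
    return "intermediate"

def rules_to_synthesize(actions: List[str],
                        flow: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Action-centric pass: one set of rule kinds per action."""
    return {a: RULE_KINDS[position(a, flow)] for a in actions}

plan = rules_to_synthesize(
    ["accumulateEnergy", "muscleContraction"],
    [("accumulateEnergy", "muscleContraction")],
)
for action, kinds in plan.items():
    print(action, kinds)
```

For the jump example this yields setup and transition rules for accumulateEnergy, and transition and precedence rules for muscleContraction, in line with the three rules listed on the slide.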

Contribution 4: Domain-level reasoning within processes
  “The minimum amount of energy needed to jump is 5 calories”

  “The length of the jump is directly proportional to the amount of
  energy accumulated”

check
 FORALL anEnergy, v
    enough_energy_for_contraction(anEnergy)
    @check_enough_energy_for_contraction(accumulateEnergy)
 <-
    anEnergy:energy@preState(muscleContraction) AND
    anEnergy:energy[hasEnergyValue -> v]@preState(muscleContraction) AND
    greater(v, 5).

update
 FORALL length, anEnergy, v
    aJump(out(hasLength, length)):jump@update(muscleContraction) AND
    aJump(out(hasLength, length))[hasLength -> length]@update(muscleContraction)
 <-
    anEnergy:energy@preState(muscleContraction) AND
    anEnergy:energy[hasEnergyValue -> v]@preState(muscleContraction) AND
    multiply(length, 2, v).

transition
 FORALL j, m, e, length
    j:jump@postState(muscleContraction) AND
    j:OUTPUT@postState(muscleContraction) AND
    muscleContraction[PROVIDES -> j]@postState(muscleContraction) AND
    j[hasLength -> length]@postState(muscleContraction)
 <-
    m:muscle@preState(muscleContraction) AND
    m:TOOL@preState(muscleContraction) AND
    m[IS_USED_BY -> muscleContraction]@preState(muscleContraction) AND
    e:energy@preState(muscleContraction) AND
    e:RESOURCE@preState(muscleContraction) AND
    e[IS_CONSUMED_BY -> muscleContraction]@preState(muscleContraction) AND
    enough_energy_for_contraction(e)
    @check_enough_energy_for_contraction(accumulateEnergy) AND
    j:jump@update(muscleContraction) AND
    j[hasLength -> length]@update(muscleContraction).

Contribution 4: Sample question




“In a long-distance jump competition, an
athlete can jump only after his mitochondria
have accumulated enough energy for his
muscles to contract.”

 At least what amount of energy does a long jump
 athlete need to consume in order to jump more than
 8 m?

√ a. 100 cal
√ b. 50 cal
√ c. 250 cal
     d. 1 cal

 energy1:energy[hasValue -> 100].
 FORALL j, oa, v <-
     "long jump":PROCESS@ProcessModule AND
     "long jump"[OUTPUT_ACTION ->> oa]@ProcessModule AND
     j:Jump[hasValue -> v]@postState(oa) AND
     greater(v, 8).
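
To see how the check and update rules of the previous slide decide this question, here is a rough Python analogue. It is a sketch under stated assumptions: the constants 5 and 2 come from greater(v, 5) and multiply(length, 2, v), the reading of multiply as length * 2 = v is an assumption (the slides do not spell out the argument convention), and the function names are invented for the example.

```python
from typing import Optional

MIN_ENERGY = 5.0  # from the check rule: greater(v, 5)
FACTOR = 2.0      # from the update rule: multiply(length, 2, v)

def jump_length(energy: float) -> Optional[float]:
    """Length of the jump, or None when the energy check fails.

    Assumption: multiply(length, 2, v) is read as length * 2 = v.
    """
    if not energy > MIN_ENERGY:
        return None
    return energy / FACTOR

def jumps_more_than(energy: float, metres: float) -> bool:
    """Analogue of the query: is there a jump whose length exceeds `metres`?"""
    length = jump_length(energy)
    return length is not None and length > metres

for label, calories in [("a", 100), ("b", 50), ("c", 250), ("d", 1)]:
    print(label, calories, jumps_more_than(calories, 8))
```

Under this reading, options a to c satisfy the query and option d does not, since 1 cal is below the minimum energy required for the contraction. The same holds if the factor is applied the other way round.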


Contribution 4: Properties of the process models

      •      Sound and complete
               • Based on F-logic’s proof theory plus additional proof for the
                 process formalism
               • A process action is sound ↔ its post state can be deduced from its
                 pre state
               • A process action is complete ↔ it allows deducing all the possible
                 clauses of its post state from the clauses in the pre state
               • A process model is sound and complete ↔ all its actions are sound
                 and complete
      •      Optimized
               • Attribute and concept names ground
                  • person(Peter) instead of instanceOf(person, Peter)
                  • Allows OntoBroker to index tuples by class and attribute name
               • Process rules are generally stratified
                  • Critical in the presence of negation (forks and loops)
                  • Avoid costly well-founded evaluation mode
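
As an aside on the stratification point, the following sketch shows a generic, textbook-style stratification test for rules with negation. It is not OntoBroker's implementation; the rule encoding (head predicate, positive body predicates, negated body predicates) and the example predicates are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# (head predicate, positive body predicates, negated body predicates)
Rule = Tuple[str, List[str], List[str]]

def is_stratified(rules: List[Rule]) -> bool:
    """Return True if no predicate depends on itself through negation."""
    predicates = {rule[0] for rule in rules}
    for _, pos, neg in rules:
        predicates.update(pos, neg)
    stratum: Dict[str, int] = {p: 0 for p in predicates}
    changed = True
    while changed:
        changed = False
        for head, pos, neg in rules:
            # The head sits at least as high as its positive dependencies
            # and strictly higher than its negated dependencies.
            required = max(
                [stratum[head]]
                + [stratum[p] for p in pos]
                + [stratum[q] + 1 for q in neg]
            )
            if required >= len(predicates):
                return False  # a cycle through negation keeps raising strata
            if required > stratum[head]:
                stratum[head] = required
                changed = True
    return True

# Hypothetical fork: take the left branch unless it is blocked.
fork = [("takeLeft", ["atFork"], ["blocked"]), ("blocked", ["obstacle"], [])]
# Hypothetical mutual negation, which is not stratified.
cycle = [("p", [], ["q"]), ("q", [], ["p"])]
print(is_stratified(fork), is_stratified(cycle))
```

Rule sets that pass such a test can be evaluated stratum by stratum, which is why stratified process rules do not need the costlier well-founded evaluation mode.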

Provenance analysis of process executions by SMEs
                                                                                                                    Objective 2: To support SMEs in
                                                                                                                    analyzing and understanding process
                                                                                                                    executions


• Two main contributions
         • C5: A method and algorithm that uses PSMs as high-level,
           reusable process abstractions and visualization paradigm to
           identify and explain the reasoning strategies and rationale of
           executed processes
         • C6: An architecture and integrated environment for the analysis
           of process executions at the knowledge level




Contribution 5: Towards knowledge provenance


•      PSMs as semantic overlays on top of existing process documentation
         •     Task: what is going to be achieved by executing a process
         •     PSM: how it is achieved
•      Provenance, from a knowledge perspective
         •     How provenance relates to the execution of a process
         •     Simplify process analysis by proposing decompositions into
               simpler subprocesses
         •     Visualize provenance at different levels of detail
•      Supporting SMEs in two main ways
         •     Validation of process executions
         •     Identification of reasoning patterns in process executions

[Figure. Source: myGrid]

Contribution 5: The twig join function

         •      Based on XML pattern matching algorithms on Directed
                Acyclic Graphs (Bruno et al., 2002)
         •      twig_join detects the occurrence of a pattern in an XML DAG
         •      Given
                  •     P, a process
                  •     T, a task potentially describing P
                  •     M, a PSM providing a strategy on how to achieve T
                  •     i(T), the set of input roles of T
                  •     o(T), the set of output roles of T
                  •     D, the DAG resulting from documenting the execution of P
         •      twig_join(D, i(T), o(T)) checks whether a twig exists for M that
                connects i(T) with o(T) in D
         •      In this case, PSM M is the pattern to be identified in the
                process documentation DAG D
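
The idea behind twig_join can be illustrated with a naive Python sketch. It is not the holistic twig join algorithm of Bruno et al. and not the thesis implementation: it simply checks, by reachability, whether a tree-shaped role pattern can be embedded in a DAG. The explicit `pattern` and `role_of` parameters, and all node and role names, are assumptions added for the example (the slide's signature takes only D, i(T) and o(T), with M implicit).

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # node (or role) -> direct successors

def reachable(graph: Graph, start: str) -> Set[str]:
    """All nodes reachable from `start` through one or more edges."""
    seen: Set[str] = set()
    stack = list(graph.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen

def matches(dag: Graph, role_of: Dict[str, str], pattern: Graph,
            node: str, role: str) -> bool:
    """True if `node` plays `role` and every child role occurs downstream."""
    if role_of.get(node) != role:
        return False
    descendants = reachable(dag, node)
    return all(
        any(matches(dag, role_of, pattern, d, child) for d in descendants)
        for child in pattern.get(role, [])
    )

def twig_join(dag: Graph, role_of: Dict[str, str], pattern: Graph,
              input_roles: List[str], output_roles: List[str]) -> bool:
    """Check whether the pattern connects the input roles to the output roles in the DAG."""
    # The pattern itself must lead from the input roles to every output role.
    covered: Set[str] = set()
    for role in input_roles:
        covered |= reachable(pattern, role)
    if not all(out in covered for out in output_roles):
        return False
    # Every input role needs a node in the DAG where the twig can be rooted.
    return all(
        any(matches(dag, role_of, pattern, n, role) for n in role_of)
        for role in input_roles
    )

# Hypothetical process documentation DAG and role bridges.
dag = {"scan": ["aligned"], "aligned": ["averaged"], "averaged": ["atlas"], "atlas": []}
role_of = {"scan": "input", "aligned": "intermediate",
           "averaged": "intermediate", "atlas": "output"}
pattern = {"input": ["intermediate"], "intermediate": ["output"]}
print(twig_join(dag, role_of, pattern, ["input"], ["output"]))
```

Child roles are matched against any descendant rather than only direct successors, which mirrors the ancestor-descendant edges commonly used in twig patterns.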

Contribution 5: A twig join example

[Figure: domain entities linked through bridges (mapping) to PSM entities; the matched pattern is marked "twig join!"]

Contribution 5: The matching algorithm
•   twig_join recursively applied at each decomposition level
•   Each task decomposed by one or several PSMs (task-method
    decomposition view)
•   Knowledge flow defines the sequence of evaluation
•   Backtracking possible at PSM and role levels

[Figure: twig_join(Ti, D) followed by decompose(Ti); the subtasks are then checked with twig_join(T11, D), twig_join(T12, D), twig_join(T13, D) and twig_join(T14, D) in knowledge-flow order. Panels: Task-method decomposition, Knowledge flow, Interaction]
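
The recursive scheme on this slide can be sketched as follows. This is a minimal illustration, not the thesis algorithm: the `Task` structure, the `detects` callback (standing in for a twig_join test against D) and the decision to reject a task when none of its PSMs matches are all assumptions, and role-level backtracking is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class Task:
    name: str
    # Candidate PSMs: PSM name -> subtasks in knowledge-flow order.
    methods: Dict[str, List["Task"]] = field(default_factory=dict)

def match(task: Task, detects: Callable[[str], bool]) -> Optional[dict]:
    """Return a nested explanation of the match, or None if there is none."""
    if not detects(task.name):  # stands in for twig_join(task, D)
        return None
    if not task.methods:  # primitive task: nothing to decompose
        return {"task": task.name}
    for psm, subtasks in task.methods.items():
        children = []
        for sub in subtasks:  # evaluated in knowledge-flow order
            result = match(sub, detects)
            if result is None:
                break  # this PSM fails: backtrack and try the next one
            children.append(result)
        else:
            return {"task": task.name, "psm": psm, "subtasks": children}
    return None

# Hypothetical task with two candidate PSMs; T99 does not occur in D.
root = Task("Ti", methods={
    "psmA": [Task("T11"), Task("T99")],
    "psmB": [Task("T11"), Task("T12")],
})
found_in_d = {"Ti", "T11", "T12"}
explanation = match(root, lambda name: name in found_in_d)
print(explanation)
```

Here psmA is abandoned once T99 fails to match and psmB is tried next, which is the PSM-level backtracking mentioned on the slide.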
Contribution 6: A Knowledge-Oriented Provenance Environment



[Figure panels: PSM-ontology bridges; Matching visualization; Provenance query; Matching detection]




Objective 1

      •      Evaluation settings
               • 2 Chemistry SMEs, 2 Biology SMEs, and 2 Physics SMEs




                    Judith               Lennart               Christianne               Martina                 Markus   Andreas



               • Syllabus
                         • Chemistry: Stoichiometry, solutions and equilibrium (Brown & Lemay,
                           pages 75-83, 113-133, and 613-653)
                         • Biology: Cell and DNA structure and processes (Campbell and Reece,
                           pages 112-124, 217-223, 239-245, 293-301, 304-311, and 317-319)
                         • Physics: Kinematics and Dynamics (Serway and Faughn, chapters 2, 3,
                           and 4)
      •      Two main dimensions: usability and utility


Evaluation results: utilization of the PSM library
Objective 1

SME                   # of processes modeled
SME1 (Physics)                  0
SME2 (Biology)                  2
SME3 (Biology)                  6
SME4 (Chemistry)                0
SME5 (Chemistry)                3
SME6 (Physics)                  0
Total                          11

H1: SME empowerment can increase KB quality and reduce costs
H3: PSMs can reduce the complexity of process KA
H6: The proposed methods and tools are flexible and reusable

SME                   Processes                                PSMs
SME2 (Biology)        Transition from G2 phase to mitosis      n.a.
                      Mitosis                                  n.a.
SME3 (Biology)        Mitosis                                  decompose & combine
                      Carbohydrate metabolism                  consume, transform
                      Cellular respiration                     decompose, consume
                      Detoxification                           transform
                      Photosynthesis                           form by combination
                      Ribosome protein synthesis               situate & combine
SME5 (Chemistry)      Complete ionic equation                  form by combination
                      Molecular equation                       decompose & combine
                      Net ionic equation                       form by combination

Evaluation results: performance of process models
Objective 1

H5: The proposed methods and tools produce sound and complete executable
process models

Time per query and ratio with respect to configuration C0

Query         C0 time   C0 ratio    C1 time   C1 ratio    C2 time   C2 ratio
SME3-q0          31       1,00          0       0,00         16       0,52
SME3-q1          63       1,00         16       0,25         16       0,25
SME3-q2          31       1,00         16       0,52         16       0,52
SME3-q3          47       1,00         16       0,34         16       0,34
SME3-q4          15       1,00          0       0,00          0       0,00
SME3-q5          32       1,00         16       0,50          0       0,00
SME3-q6         203       1,00        219       1,08        234       1,15
SME3-q7          63       1,00         31       0,49         31       0,49
SME3-q8          47       1,00         31       0,66         16       0,34
SME3-q9          62       1,00         32       0,52         16       0,26
SME3-q10        203       1,00        218       1,07        203       1,00
Average        79,7       1,00       59,5       0,75       56,4       0,71
Median           47       1,00         16       0,34         16       0,34
Min              15       1,00          0       0,00          0       0,00
Max             203       1,00        219       1,08        234       1,15

<1 - faster
=1 - same time as config C0
>1 - slower

•      C0
         •     Well-founded evaluation on
         •     Concept/attr. names ground off
•      C1
         •     Well-founded evaluation on
         •     Concept/attr. names ground on
•      C2
         •     Well-founded evaluation off
         •     Concept/attr. names ground on
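
For readers who want to reproduce the ratio columns, the arithmetic is simply the time under a configuration divided by the time under C0. The short Python sketch below recomputes three rows; the times are copied from the table, while the helper name and data layout are made up for the example.

```python
# Times copied from the table for three SME3 queries, per configuration.
times = {
    "SME3-q1": {"C0": 63, "C1": 16, "C2": 16},
    "SME3-q6": {"C0": 203, "C1": 219, "C2": 234},
    "SME3-q9": {"C0": 62, "C1": 32, "C2": 16},
}

def ratio_to_c0(row: dict, config: str) -> float:
    """Time under `config` relative to the baseline C0 (below 1 means faster)."""
    return round(row[config] / row["C0"], 2)

for query, row in times.items():
    print(query, ratio_to_c0(row, "C1"), ratio_to_c0(row, "C2"))
```

A value of 1.08 for SME3-q6 under C1, for instance, means that query ran about 8% slower than under C0.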


Evaluation results: utility and usability
Objective 1

Utility
•       Physics SMEs did not use processes
•       Not so important for Chemistry SMEs
•       SME2 (Biology): “It makes the representation of biological models
        easier”
•       SME3 (Biology): “The modeling of processes is very useful. It must be
        possible to ask questions about the various states of a process. And
        asking questions with T&D worked okay”

Usability
•       System Usability (SU) scale
•       SMEs answered a questionnaire about the system with a quantitative
        value ranging between 0 and 100
•       Average obtained: 64,5

H2: Due to its complexity, SMEs require specific means for process KA
H4: The method and tools proposed abstract SMEs from the underlying KRR
formalism
Evaluation settings (Provenance Challenge)
Objective 2

[Figure panels: Brain Atlas workflow; Brain Atlas provenance data flow; Catalogation PSM]

Evaluation results
Objective 2

•      Focus on precision and recall metrics
•      Identified at three different layered contexts
         • Method
         • Task
         • Decomposition-level

H7: PSMs can provide SMEs with explanations of process executions
H8: The method proposed identifies the main rationale behind processes by
detecting PSM occurrences
H9: PSMs describe process executions at different levels of detail

[Figure legend: Perfect match; Partial match; No match]
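
As a reminder of how the two metrics are computed, here is a small Python sketch. The PSM identifiers and the detected and expected sets are purely hypothetical and do not reproduce the results of the evaluation.

```python
from typing import Set, Tuple

def precision_recall(detected: Set[str], expected: Set[str]) -> Tuple[float, float]:
    """Precision and recall of detected PSM occurrences against expected ones."""
    hits = len(detected & expected)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(expected) if expected else 0.0
    return precision, recall

# Hypothetical occurrences at one context level (not data from the thesis).
expected = {"psm_a", "psm_b", "psm_c"}
detected = {"psm_a", "psm_b"}
precision, recall = precision_recall(detected, expected)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case every detected occurrence is correct (precision 1.00) but one expected occurrence is missed (recall 0.67).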
Conclusions
Objective 1: To enable SMEs to
acquire processes without KEs
        •      Qualitative evidence rather than statistical proof (only 6 SMEs)
        •      However, evidence was found that it is possible to engage users in
               acquiring process knowledge without the intervention of KEs
        •      SMEs using the PSM library (SME3 and SME5) produced more and
               better-quality process models (82%) than the rest (SME2)
        •      The method used to create the PSM library has also shown evidence
               of being reusable in other domains
Objective 2: To support SMEs in
understanding process executions
        •      Semantic overlays, e.g. PSMs, on top of process documentation
               provide the required abstractions to analyze provenance from a
               knowledge perspective
        •      Provenance analysis by SMEs benefits from a hierarchical structure in
               such overlays
        •      The matching algorithm has not been applied to large PSM libraries
               and provenance logs

The ubiquity of processes

[Figure: processes across domains: Business, Biology, Healthcare, Climate prediction, Ecology, Chemistry]

Future research problems
     •      The Web is driving a new computing paradigm through the
            involvement of users forming online communities
     •      Additionally, the focus is shifting from data to process
     •      The solutions proposed live in the Semantic Web in the small
     •      Challenge: move to the Web in the large

     •      Process representation and reasoning
              •     More expressivity (events, qualitative reasoning)
              •     Incomplete, inconsistent, contradictory knowledge bases
              •     Uncertainty, nonmonotonicity
     •      Performance, coverage, scale
              •     Distributed reasoning algorithms
              •     Conciliation of partial results
              •     Heuristics (assumptions, defaults)
              •     Caching
     •      Collaboration in user communities
              •     Share and reuse processes
              •     Compare and recommend processes
              •     Process-specific query mechanisms
     •      Process validation, trust maintenance
              •     Process reliability and validation
              •     Trust


Previous steps towards provenance analysis




  •      PASOA (Groth et al., 2008)
         process documentation
  •      Interaction p-assertions
            • Who exchanges messages
  •      Relationship p-assertions
            • How input data becomes
              output data
  •      Actor-state p-assertions
            • Non-functional properties

  Figure: PASOA interaction p-assertion




Contribution 2: PSMs for the acquisition of process knowledge




  •      PSMs separate process from domain knowledge by explicitly
         representing the different roles domain entities play in a process
  •      Focus on how to achieve processes
  •      Provide SMEs with guidelines and templates to formulate process
         models

  Figure: Modelling framework for process knowledge, extending TMDA




Contribution 2: The PSM library for Process Modelling




  Reusable across the three domains


The Halo Project



      •      “Project Halo aims to create an intelligent computer system able
             to answer novel questions in scientific domains with expertise
             equivalent to Advanced Placement (AP) competence level”

      •      Three scientific domains: Chemistry, Biology, and Physics
      •      Focus on tools that allow Subject Matter Experts (SMEs) to
             author scientific knowledge
       The elongation of the leading strand during DNA synthesis:
       a. produces Okazaki fragments
       b. depends on the action of DNA polymerase
       c. does not require a template strand

       The primer that initiates the synthesis of a new DNA strand is usually:
       a. RNA
       b. DNA
       c. an Okazaki fragment
       d. a structural protein
       e. a thymine dimer

       Which part of the animal cell is required only in the first stage
       of mitosis and what is the name of such stage?
       a. chromatin and prophase
       b. chromatid and prometaphase
       c. centromere and anaphase
       d. plasma membrane and telophase

Validating process models: The testing & debugging perspective




  Figure labels: Test set, Query, Test results




The process representation and reasoning formalism

      •      Bridges the gap between the knowledge level and the operational level
      •      Three main aspects to address
               •     Process frame
                         • On what portion of the knowledge base can a process action have effect?
                         • How can the effects of actions on the knowledge base be propagated to
                           successive actions?
               •     Data flow: path followed by the data throughout a process
               •     Control flow: run-time evaluation and enactment of the actual flow of data




  Figure labels: Input action, Data flow, Conditional precedence (control flow), Data flow, Output action
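The distinction between data flow and control flow can be illustrated with a minimal Python sketch. It is not the formalism used in the thesis (which is rule-based); the classes `Action` and `Precedence`, the function `enact`, and the action names other than Dissolve are hypothetical and introduced only for illustration. Each action transforms the data it receives (data flow), and each precedence link carries a guard that is evaluated at run time to decide which action comes next (control flow).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Data = Dict[str, float]


@dataclass
class Action:
    name: str
    run: Callable[[Data], Data]  # data flow: reads inputs, returns outputs


@dataclass
class Precedence:
    source: str
    target: str
    guard: Callable[[Data], bool]  # control flow: evaluated at run time


def enact(start: str, actions: Dict[str, Action],
          links: List[Precedence], data: Data) -> List[str]:
    """Run actions from `start`, following the first link whose guard holds."""
    trace: List[str] = []
    current: Optional[str] = start
    while current is not None:
        data.update(actions[current].run(data))
        trace.append(current)
        current = next(
            (link.target for link in links
             if link.source == current and link.guard(data)),
            None,
        )
    return trace


# Hypothetical process: Weigh -> Dissolve -> (Dilute if too concentrated) -> Report
actions = {a.name: a for a in [
    Action("Weigh", lambda d: {"mass_g": 5.0}),
    Action("Dissolve", lambda d: {"concentration": d["mass_g"] / d["volume_l"]}),
    Action("Dilute", lambda d: {"concentration": d["concentration"] / 2}),
    Action("Report", lambda d: {}),
]}
links = [
    Precedence("Weigh", "Dissolve", lambda d: True),
    Precedence("Dissolve", "Dilute", lambda d: d["concentration"] > 4.0),
    Precedence("Dissolve", "Report", lambda d: d["concentration"] <= 4.0),
    Precedence("Dilute", "Report", lambda d: True),
]

data = {"volume_l": 1.0}
trace = enact("Weigh", actions, links, data)
print(trace, data["concentration"])
```

The sketch assumes an acyclic process; iterative actions would need an explicit termination condition.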

Addressing the frame problem

      •      Two states (pre and post) per process action
               •      Pre state: portion of the process frame in the scope of an action
               •      Post state: updated pre state of the action after its execution
      •      Actions read from their pre state and write into their post state
  Figure labels: Pre state of action Dissolve, Post state of action Dissolve




      •      We automatically synthesize rules at modeling time that create
             these modules upon process execution
               •     Setup rules: build the pre state of the input actions of the process
               •     Precedence rules: describe what actions can be connected with
                     each other by means of their outputs and inputs
               •     Transition rules: describe the transition between pre and post states

      Explicit manipulation of the process frame allows runtime management of data and control flow
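The interplay of pre states, post states, and the three rule types can be sketched in Python as follows. This is only an analogy under stated assumptions: in the thesis the rules are synthesized automatically from the process model, whereas here they are written by hand as plain functions. The class `ProcessFrame`, the three function names, the action Filter, and the knowledge base contents are hypothetical; only the action name Dissolve comes from the slide.

```python
from typing import Callable, Dict, List

State = Dict[str, object]


class ProcessFrame:
    """Holds one pre state and one post state per action (hypothetical structure)."""

    def __init__(self) -> None:
        self.pre: Dict[str, State] = {}
        self.post: Dict[str, State] = {}


def setup_rule(frame: ProcessFrame, action: str,
               kb: State, inputs: List[str]) -> None:
    """Build the pre state of an input action from the relevant part of the KB."""
    frame.pre[action] = {key: kb[key] for key in inputs}


def transition_rule(frame: ProcessFrame, action: str,
                    effect: Callable[[State], State]) -> None:
    """The action reads its pre state and writes the result into its post state."""
    post = dict(frame.pre[action])        # the pre state itself is left untouched
    post.update(effect(frame.pre[action]))
    frame.post[action] = post


def precedence_rule(frame: ProcessFrame, source: str, target: str,
                    mapping: Dict[str, str]) -> None:
    """Connect outputs of `source` to inputs of `target`."""
    pre = frame.pre.setdefault(target, {})
    for output_name, input_name in mapping.items():
        pre[input_name] = frame.post[source][output_name]


# Hypothetical knowledge base; only part of it is in the scope of Dissolve.
kb = {"solute": "NaCl", "solvent": "water", "unrelated_fact": 42}
frame = ProcessFrame()

setup_rule(frame, "Dissolve", kb, ["solute", "solvent"])
transition_rule(frame, "Dissolve",
                lambda pre: {"solution": f"{pre['solute']} in {pre['solvent']}"})
precedence_rule(frame, "Dissolve", "Filter", {"solution": "mixture"})

print(frame.pre["Dissolve"])
print(frame.post["Dissolve"])
print(frame.pre["Filter"])
```

Keeping the pre state immutable once built means the effects of an action are visible only through its post state, which is what lets later actions pick them up through precedence links.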


Correlation between the process metamodel and the formalism


  Diagram labels
    Process metamodel: Action, Atomic action, Iterative action
    Process representation and reasoning formalism: Process rule, Setup rule, Precedence rule, Transition rule, Domain rule, Update rule, Check rule
    Multiplicities shown: 1 to 1..*, and 1 to *




F-logic as process representation and reasoning language

    •      Process diagrams authored by SMEs are automatically translated into
           F-logic (OntoBroker) code
    •      Some key properties
             •     Module system (encoding the process frame; pre and post states)
             •     Negation (control flow evaluation)
             •     Evaluation methods: dynamic filtering (Kifer and Lozinskii, 1986) and
                   well-founded evaluation (Van Gelder et al., 1991)
             •     Performant and scalable, as shown by OpenRuleBench (Liang et al., 2009)
    •      Functional advantages
             •     Enabling a single entry point for reasoning across the whole system
             •     Domain-level reasoning within processes, through rules
             •     Keeping introspective properties for reasoning with meta-level information,
                   like subprocesses and intermediate results
    •      Support for expressing and reasoning with processes, allowing
           inference of process outcomes, grounded in the particular domains of
           application
    •      The synthesized code implements data and control flow within
           processes

Evaluation results: utilization of process metamodel
                                                                                                                          Objective 1



  SME                  Processes                              # of resources   # of relations   # of forks   # of actions   Total
  SME2 (Biology)       Transition from G2 phase to mitosis          12               21              1             3           37
  SME2 (Biology)       Mitosis                                      34               37              0             5           76
  SME3 (Biology)       Mitosis                                      29               11              0             5           45
  SME3 (Biology)       Carbohydrate metabolism                       5                6              0             2           13
  SME3 (Biology)       Cellular respiration                          5                7              0             2           14
  SME3 (Biology)       Detoxification                                4                4              0             1            9
  SME3 (Biology)       Photosynthesis                                6                6              0             1           13
  SME3 (Biology)       Ribosome protein synthesis                    2                2              0             1            5
  SME5 (Chemistry)     Complete ionic equation                       7                7              0             1           15
  SME5 (Chemistry)     Molecular equation                            4                4              0             1            9
  SME5 (Chemistry)     Net ionic equation                            3                3              0             1            7
  Total                                                            111              108              1            23          243
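The totals in the table can be checked with a few lines of Python. The figures are copied from the slide; the variable names and the labels distinguishing the two Mitosis rows are introduced here only for the check.

```python
# (process, resources, relations, forks, actions, total) as reported on the slide
rows = [
    ("Transition from G2 phase to mitosis", 12, 21, 1, 3, 37),
    ("Mitosis (first row)", 34, 37, 0, 5, 76),
    ("Mitosis (second row)", 29, 11, 0, 5, 45),
    ("Carbohydrate metabolism", 5, 6, 0, 2, 13),
    ("Cellular respiration", 5, 7, 0, 2, 14),
    ("Detoxification", 4, 4, 0, 1, 9),
    ("Photosynthesis", 6, 6, 0, 1, 13),
    ("Ribosome protein synthesis", 2, 2, 0, 1, 5),
    ("Complete ionic equation", 7, 7, 0, 1, 15),
    ("Molecular equation", 4, 4, 0, 1, 9),
    ("Net ionic equation", 3, 3, 0, 1, 7),
]
reported_totals = (111, 108, 1, 23, 243)

# Each row total should equal the sum of its four element counts.
row_mismatches = [
    process
    for process, resources, relations, forks, actions, total in rows
    if resources + relations + forks + actions != total
]

# Each column total should equal the sum over the eleven processes.
column_sums = tuple(sum(row[i] for row in rows) for i in range(1, 6))

print(row_mismatches)
print(column_sums == reported_totals)
```

Resources and relations together account for 219 of the 243 modelled elements, so forks and actions make up only a small share of what the SMEs authored.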




Contribution 6: A hierarchical visualization paradigm

  Figure: Task-method decomposition view
  Legend: Perfect match, Partial match, No match
Future research problems

      • The Web is driving a new computing paradigm through the
        involvement of users forming online communities
      • Migrating from data to process
      • The solutions proposed live in the Semantic Web in the small
      • Challenge: move to the Web in the large
               • Process representation and reasoning
                         • More expressivity required (events, qualitative reasoning)
                         • Incomplete, inconsistent, even contradictory knowledge bases
                         • Uncertainty, nonmonotonicity
               • Performance, coverage, and scale
                         • Distributed reasoning algorithms, conciliation of partial results,
                           heuristics, caching, well-founded assumptions, defaults
               • Collaboration in communities of users
                         • Share, merge, compare, and recommend process models
                         • Process-specific query mechanisms
               • Web-scale process validation and trust maintenance
                         • Process reliability, trust and validation
Acquisition And Understanding Of Process Knowledgev1 1

  • 1. Acquisition and Understanding of Process Knowledge Using Problem Solving Methods Jose Manuel Gómez Pérez Facultad de Informática Universidad Politécnica de Madrid Campus de Montegancedo sn 28660 Boadilla del Monte, Madrid http://www.oeg-upm.net jmgomez@fi.upm.es Phone: 34.91.3363670 Fax: 34.91.3524819 PhD thesis Date: 08/05/2009
  • 2. Outline Introduction and motivation Open research problems and work hypotheses Acquisition of process knowledge by SMEs Provenance analysis of process executions by SMEs Evaluation Conclusions and future research problems Jose Manuel Gómez Pérez – Acquisition and Understanding of Process Knowledge Using Problem Solving Methods, PhD thesis 2
  • 3. Knowledge Acquisition: Towards SME empowerment DARPA’s KA Frameworks HPKB & RKF The Knowledge The Role Acquisition Bottleneck Differentiation Principle The Knowledge Level KA by Subject Matter Experts DARPA’s KSE KA by (SMEs) Subject Matter Expert (SME) Knowledge Engineers KB edition by SMES (KEs) KRAKEN Knowledge Knowledge DISCIPL-RKF formulation modeling KRR Languages by SMEs KB OCML maintenance Ontologies Ontology editors KARL SHAKEN CHIMAERA Collaborative Knowledge Engineer (KE) Knowledge knowledge creation programming Problem Solving SEMANTIC WIKIS Methods Jose Manuel Gómez Pérez – Acquisition and Understanding of Process Knowledge Using Problem Solving Methods, PhD thesis 3
  • 4. Knowledge types
      • Processes are special knowledge types that
          • Relate to the sequence of operations and involved resources leading to the production of some outcome
          • Encapsulate preconditions, results, contents, actors, and causes
      • Process knowledge is complex
          • It builds on top of other simpler knowledge types, like facts and rules
      Knowledge types: RUL (inference), CLS (classification), FACT (factual knowledge), MAT (mathematics), CMP (comparison), TAB (tables), PCS (processes), CAUS (cause-effect), DAT (data structures), PROC (procedural), EXP (experiments), US (underspecified), TRANS (translation), NF (non functional), SPACE (spatial), PWR (part-whole), TIME (temporal), GRA (diagrammatic)
      Example questions: “What is released/added/increased upon binding of two amino acids?” “A piece of solid calcium is heated in oxygen gas. ...” “Find correct RNA sequence for a given DNA sequence.”
      Source: the Halo project KR analysis phase for the domains of Chemistry, Biology, and Physics
  • 5. Why processes are important • Processes appear in 37% (average) in the domains of Biology, Chemistry, and Physics • The most important knowledge type in Chemistry (53%) • Second in Biology (35%) • Fourth in Physics (22%) Source: The Halo project KR analysis phase for the domains of Chemistry, Biology, and Physics
  • 6. Work objectives
      Objective 1: To enable SMEs to formulate processes without KEs
      Objective 2: To support SMEs in understanding process executions
      [Diagram; recoverable labels: Provide reusable guidelines to formulate process knowledge; Support reasoning; Describe the main rationale behind a process; How, What, Whom; PSMs, PCS, SMEs]
  • 7. PSM perspectives
      • Task-method decomposition
          • Hierarchically defines how tasks decompose into simpler (sub)tasks
          • Describes tasks at several levels of detail
          • Provides alternative ways to achieve a task
      • Interaction
          • Black-box perspective
          • PSM establishes and controls the sequence of actions required to perform a task
          • Defines knowledge required at each task step
      • Knowledge flow
          • Knowledge transformation within the PSM
      Legend: Task, Method, Role
  • 8. Provenance analysis of process executions ?
  • 9. In summary • This thesis proposes the use of PSMs as a novel approach for supporting SMEs both in the formulation of process knowledge and in the provenance analysis of process executions • It also explores to what extent it is possible to build such tools that take KEs out of the formulation and analysis loop • Ultimately, it aims at showing that it is possible to engage users • To generate computer-readable content represented in formal languages • To apply knowledge representation and reasoning techniques to analyze the outcomes of automated, knowledge-intensive processes
  • 10. Outline: Introduction and motivation; Open research problems and work hypotheses; Acquisition of process knowledge by SMEs; Provenance analysis of process executions by SMEs; Evaluation; Conclusions and future research problems
  • 11. Open research problems and work hypotheses: Objective 1
      Objective 1: To provide SMEs with the means to formulate process knowledge in their domains of expertise without the intervention of KEs
      • H1: Empowering SMEs can increase KB quality and reduce costs
      • H2: The complexity of process knowledge requires providing SMEs with specific means to acquire and reason with processes
      • H3: PSMs can reduce the complexity of acquiring process knowledge by SMEs
      • H4: The proposed methods and tools abstract SMEs from the underlying representation
      • H5: The proposed methods and tools produce sound and complete executable process models
      • H6: The proposed methods and tools are flexible and reusable across domains
  • 12. Open research problems and work hypotheses: Objective 2 Objective 2: To support SMEs in analyzing and understanding process executions • H7: The analytical capabilities of PSMs can provide SMEs with meaningful interpretations of process executions • H8: The method proposed identifies the main rationale behind processes by detecting occurrences of PSMs in their execution logs • H9: The method proposed can use the hierarchical structure of PSMs to describe process executions at different levels of detail
  • 13. Outline: Introduction and motivation; Open research problems and work hypotheses; Acquisition of process knowledge by SMEs; Provenance analysis of process executions by SMEs; Evaluation; Conclusions and future research problems
  • 14. Acquisition of process knowledge by SMEs
      Objective 1: To provide SMEs with the means to formulate process knowledge in their domains of expertise without the intervention of KEs
      • Four main contributions
          • C1: a process metamodel, which provides the terminology necessary to express process entities in scientific domains, and the relations between them
          • C2: a PSM library, which provides high-level, reusable abstractions for process representation and a method for its development
          • C3: a graphical process modeling and reasoning environment, which applies the previous contributions in order to enable SMEs to formulate process knowledge
          • C4: a method for the automatic synthesis of executable process models from SME-authored process diagrams, supported by an underlying representation and reasoning formalism
  • 15. Contribution 1: The process metamodel
      • Resources (roles)
          • Containers of domain concepts that can play a particular role
      • Actions
          • Inspired by activities in EO and TOVE
      • Relations
      • Forks
  • 16. Contribution 2: Building a PSM library for acquisition of process knowledge
      Extends work done in the Halo analysis phase by Omniscience and Ontoprise teams
      • 755 AP questions analyzed
      • >100 domain-specific processes
      • Four main process categories: Join, Split, Modify, Locate
      [Diagram; recoverable labels: Identification; Decomposition and abstraction; Reusable PSM library]
  • 17. Contribution 2: A PSM example (decompose & combine)
      “Crystallization occurs when certain pairs of oppositely charged ions attract each other so strongly that they form an insoluble ionic solid. This process coexists with dissolution processes in precipitation reactions”
      The addition of a colorless solution of potassium iodide (KI) to a colorless solution of lead nitrate [Pb(NO3)2] produces a yellow precipitate of lead iodide (PbI2) that slowly settles to the bottom of the beaker.
      name:          decompose & combine
      goal:          member(Recombination set, Element) and member(Constituents set, Piece) and part-of(Piece, Element) and part-of(Piece, Combination) and properties(Element, ep) and properties(Combination, cp) and not equal(ep, cp)
      actions:       decompose, combine
      input action:  decompose
      output action: combine
      input roles:   Recombination set, Decomposer, Combinator
      output roles:  Combination, Byproduct
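The PSM fields on slide 17 can be read as a plain record. The sketch below is only an illustration of that reading: the `PSM` class, its method names, and the partial role binding for the precipitation example are assumptions introduced here, not part of the thesis tooling.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PSM:
    """Hypothetical container mirroring the fields listed on the slide."""
    name: str
    goal: str
    actions: Tuple[str, ...]
    input_action: str
    output_action: str
    input_roles: Tuple[str, ...]
    output_roles: Tuple[str, ...]

    def is_well_formed(self) -> bool:
        # The input and output actions must be among the declared actions.
        return {self.input_action, self.output_action} <= set(self.actions)

    def unbound_roles(self, binding: Dict[str, object]) -> Tuple[str, ...]:
        # Roles that have not yet been filled with domain entities.
        roles = self.input_roles + self.output_roles
        return tuple(role for role in roles if role not in binding)


decompose_and_combine = PSM(
    name="decompose & combine",
    goal=("member(Recombination set, Element) and member(Constituents set, Piece) "
          "and part-of(Piece, Element) and part-of(Piece, Combination) "
          "and properties(Element, ep) and properties(Combination, cp) "
          "and not equal(ep, cp)"),
    actions=("decompose", "combine"),
    input_action="decompose",
    output_action="combine",
    input_roles=("Recombination set", "Decomposer", "Combinator"),
    output_roles=("Combination", "Byproduct"),
)

# Assumed, partial binding of roles to entities from the precipitation example.
binding = {
    "Recombination set": ["KI", "Pb(NO3)2"],
    "Combination": "PbI2",
}

print(decompose_and_combine.is_well_formed())
print(decompose_and_combine.unbound_roles(binding))
```

Keeping roles separate from the domain entities that fill them is what lets the same PSM be reused across domains: only the binding changes.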
  • 18. Contribution 3: The graphical process modeling environment [Screenshot; recoverable callouts: Domain process to which this process diagram is bound; Consistency maintenance (knowledge base and process data flow); Process metamodel; PSM library (e.g. decompose & recombine); Domain-level reasoning and control flow evaluation]
  • 19. Contribution 4: The process representation and reasoning formalism
      • Bridges the gap between the knowledge level and the operational level
      • Focus on three main aspects
          • Process frame
          • Data flow
          • Control flow
      “In a long-distance jump competition, an athlete can jump only after his mitochondria have accumulated enough energy for his muscles to contract.”
      [Diagram; recoverable labels: Input action; Conditional precedence (control flow); Output action]
  • 20. Contribution 4: Addressing the frame problem
      • Two states (pre and post) per process action
          • Pre state: portion of the process frame in the scope of an action
          • Post state: updated pre state of the action after its execution
          • Actions read from their pre state and write into their post state
      • At modeling time we automatically synthesize process rules that manage the process frame during execution
          • Setup rules: build the pre state of the input actions of the process
          • Precedence rules: describe what actions can be connected with each other by means of their outputs and inputs
          • Transition rules: describe the transition between pre and post states
      Explicit manipulation of the process frame allows runtime management of data and control flow
      [Diagram; recoverable labels: Pre state of action Dissolve; Post state of action Dissolve]
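The pre/post state mechanism on slide 20 can be mimicked in a few lines. This is a minimal sketch under stated assumptions: facts are modelled as `(subject, attribute, value)` tuples, actions are assumed to be given in execution order, and `run_process` together with the two transition functions are hypothetical names. The thesis itself synthesizes F-logic rules rather than Python.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List, Tuple

Fact = Tuple[str, str, object]            # (subject, attribute, value)
State = FrozenSet[Fact]
Transition = Callable[[State], Iterable[Fact]]


def run_process(order: List[str],
                input_actions: List[str],
                successors: Dict[str, List[str]],
                transitions: Dict[str, Transition],
                kb: State):
    """Propagate facts through per-action pre and post states."""
    pre = {action: set() for action in order}
    post: Dict[str, State] = {}
    # Setup: only input actions receive their pre state from the knowledge base.
    for action in input_actions:
        pre[action] |= kb
    for action in order:                   # assumed to be in execution order
        # Transition: an action reads its pre state and writes its post state.
        post[action] = frozenset(transitions[action](frozenset(pre[action])))
        # Precedence: the post state feeds the pre state of connected actions.
        for nxt in successors.get(action, []):
            pre[nxt] |= post[action]
    return {a: frozenset(s) for a, s in pre.items()}, post


def accumulate_energy(pre_state: State) -> Iterable[Fact]:
    return pre_state                       # passes the energy facts on unchanged


def muscle_contraction(pre_state: State) -> Iterable[Fact]:
    has_energy = any(attr == "hasEnergyValue" for _, attr, _ in pre_state)
    return {("jump1", "type", "jump")} if has_energy else set()


kb = frozenset({("energy1", "hasEnergyValue", 100)})
pre, post = run_process(
    order=["accumulateEnergy", "muscleContraction"],
    input_actions=["accumulateEnergy"],
    successors={"accumulateEnergy": ["muscleContraction"]},
    transitions={"accumulateEnergy": accumulate_energy,
                 "muscleContraction": muscle_contraction},
    kb=kb,
)
print(post["muscleContraction"])
```

Because every action writes only into its own post state, the knowledge base and earlier states stay untouched, which is the point of scoping the process frame per action.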
  • 21. Contribution 4: The Code synthesis mechanism
      • Action-centric algorithm
      • Each action results into a set of process rules in F-logic

                           input actions   intermediate actions   output actions
      setup rules                x                  -                    -
      transition rules           x                  x                    x
      precedence rules           -                  x                    x

      setup
        FORALL m, e, v
          m:mitochondrion@preState(accumulateEnergy) AND
          m:TOOL@preState(accumulateEnergy) AND
          e:energy@preState(accumulateEnergy) AND
          e:RESOURCE@preState(accumulateEnergy) AND
          e[hasEnergyValue -> v]@preState(accumulateEnergy)
        <-
          m:mitochondrion AND
          e:energy AND
          e[hasEnergyValue -> v].

      transition
        FORALL m, e, j
          j:jump@postState(muscleContraction) AND
          j:OUTPUT@postState(muscleContraction) AND
          muscleContraction[PROVIDES -> j]@postState(muscleContraction)
        <-
          m:muscle@preState(muscleContraction) AND
          m:TOOL@preState(muscleContraction) AND
          m[IS_USED_BY -> muscleContraction]@preState(muscleContraction) AND
          e:energy@preState(muscleContraction) AND
          e:RESOURCE@preState(muscleContraction) AND
          e[IS_CONSUMED_BY -> muscleContraction]@preState(muscleContraction).

      precedence
        FORALL e, v
          e:energy@preState(muscleContraction) AND
          e[hasEnergyValue -> v]@preState(muscleContraction)
        <-
          e:energy@postState(accumulateEnergy) AND
          e[hasEnergyValue -> v]@postState(accumulateEnergy).
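The table on slide 21 says which kinds of process rules each action yields, depending on its position in the process. A rough sketch of that action-centric pass follows. It only plans which rule kinds to emit and generates no F-logic. Deriving an action's position from the edges of the diagram (no predecessor means input, no successor means output) is an assumption made here, as are all function names.

```python
from typing import Dict, List, Tuple

# Rule kinds per action position, as read from the table on the slide.
RULES_BY_POSITION = {
    "input": ("setup", "transition"),
    "intermediate": ("precedence", "transition"),
    "output": ("precedence", "transition"),
}


def position(action: str,
             predecessors: Dict[str, List[str]],
             successors: Dict[str, List[str]]) -> str:
    # Assumption: position is inferred from the process graph.
    if not predecessors.get(action):
        return "input"
    if not successors.get(action):
        return "output"
    return "intermediate"


def plan_rules(actions: List[str],
               edges: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (action, rule kind) pairs, one action at a time."""
    predecessors: Dict[str, List[str]] = {}
    successors: Dict[str, List[str]] = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
        predecessors.setdefault(dst, []).append(src)
    plan = []
    for action in actions:
        kinds = RULES_BY_POSITION[position(action, predecessors, successors)]
        plan.extend((action, kind) for kind in kinds)
    return plan


plan = plan_rules(
    actions=["accumulateEnergy", "muscleContraction"],
    edges=[("accumulateEnergy", "muscleContraction")],
)
for action, kind in plan:
    print(action, kind)
```

For the long jump example this yields a setup and a transition rule for accumulateEnergy, and a precedence and a transition rule for muscleContraction, in line with the rules shown on the slide.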
  • 22. Contribution 4: Domain-level reasoning within processes
      “The minimum amount of energy needed to jump is 5 calories”
      “The length of the jump is directly proportional to the amount of energy accumulated”

      transition
        FORALL j, m, e, length
          j:jump@postState(muscleContraction) AND
          j:OUTPUT@postState(muscleContraction) AND
          muscleContraction[PROVIDES -> j]@postState(muscleContraction) AND
          j[hasLength -> length]@postState(muscleContraction)
        <-
          m:muscle@preState(muscleContraction) AND
          m:TOOL@preState(muscleContraction) AND
          m[IS_USED_BY -> muscleContraction]@preState(muscleContraction) AND
          e:energy@preState(muscleContraction) AND
          e:RESOURCE@preState(muscleContraction) AND
          e[IS_CONSUMED_BY -> muscleContraction]@preState(muscleContraction) AND
          enough_energy_for_contraction(e)@check_enough_energy_for_contraction(accumulateEnergy) AND
          j:jump@update(muscleContraction) AND
          j[hasLength -> length]@update(muscleContraction).

      check
        FORALL anEnergy, v
          enough_energy_for_contraction(anEnergy)@check_enough_energy_for_contraction(accumulateEnergy)
        <-
          anEnergy:energy@preState(muscleContraction) AND
          anEnergy:energy[hasEnergyValue -> v]@preState(muscleContraction) AND
          greater(v, 5).

      update
        FORALL length, anEnergy, v
          aJump(out(hasLength, length)):jump@update(muscleContraction) AND
          aJump(out(hasLength, length))[hasLength -> length]@update(muscleContraction)
        <-
          anEnergy:energy@preState(muscleContraction) AND
          anEnergy:energy[hasEnergyValue -> v]@preState(muscleContraction) AND
          multiply(length, 2, v).
  • 23. Contribution 4: Sample question
      “In a long-distance jump competition, an athlete can jump only after his mitochondria have accumulated enough energy for his muscles to contract.”
      At least, what amount of energy does a long jump athlete need to consume in order to jump more than 8m long?
        √ a. 100 cal
        √ b. 50 cal
        √ c. 250 cal
          d. 1 cal

      energy1:energy[hasValue -> 100].

      FORALL j, oa, v <-
        “long jump”:PROCESS@ProcessModule AND
        “long jump”[OUTPUT_ACTION ->> oa]@ProcessModule AND
        j:Jump[hasValue -> v]@postState(oa) AND
        greater(v, 8).
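The check and update rules of slide 22, combined with the sample question of slide 23, amount to a small computation. The sketch below is an assumption-laden paraphrase in Python: `multiply(length, 2, v)` is read here as `length * 2 == v`, which is one possible reading of the built-in, and the helper names are hypothetical.

```python
from typing import List, Optional

MIN_ENERGY = 5        # check rule: greater(v, 5)
MIN_JUMP_LENGTH = 8   # query: greater(v, 8)


def enough_energy_for_contraction(energy: float) -> bool:
    # Mirrors the check rule: the energy value must exceed 5.
    return energy > MIN_ENERGY


def jump_length(energy: float, factor: float = 0.5) -> Optional[float]:
    # Mirrors the update rule. ASSUMPTION: multiply(length, 2, v) is read as
    # length * 2 == v, hence factor 0.5; only proportionality is certain.
    if not enough_energy_for_contraction(energy):
        return None   # the transition does not fire, so no jump is produced
    return factor * energy


def qualifying_options(options: List[float]) -> List[float]:
    # Keep the energy values whose resulting jump is longer than 8.
    result = []
    for energy in options:
        length = jump_length(energy)
        if length is not None and length > MIN_JUMP_LENGTH:
            result.append(energy)
    return result


print(qualifying_options([100, 50, 250, 1]))
```

Under this reading the options 100, 50, and 250 qualify and 1 is rejected by the check rule, matching the three ticked answers on the slide.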
  • 24. Contribution 4: Properties of the process models • Sound and complete • Based on F-logic’s proof theory plus additional proof for the process formalism • A process action is sound ↔ its post state can be deduced from its pre state • A process action is complete ↔ it allows deducing all the possible clauses of its post state from the clauses in the pre state • A process model is sound and complete ↔ all its actions are sound and complete • Optimized • Attribute and concept names ground • person(Peter) instead of instanceOf(person, Peter) • Allows OntoBroker to index tuples by class and attribute name • Process rules are generally stratified • Critical in the presence of negation (forks and loops) • Avoid costly well-founded evaluation mode
  • 25. Outline: Introduction and motivation; Open research problems and work hypotheses; Acquisition of process knowledge by SMEs; Provenance analysis of process executions by SMEs; Evaluation; Conclusions and future research problems
  • 26. Provenance analysis of process executions by SMEs Objective 2: To support SMEs in analyzing and understanding process executions • Two main contributions • C5: A method and algorithm that uses PSMs as high-level, reusable process abstractions and visualization paradigm to identify and explain the reasoning strategies and rationale of executed processes • C6: An architecture and integrated environment for the analysis of process executions at the knowledge level
  • 27. Contribution 5: Towards knowledge provenance • PSMs as semantic overlays on top of existing process documentation • Task: What is going to be achieved by executing a process • PSM: HOW • Provenance, from a knowledge perspective • How provenance relates to the execution of a process • Simpler process analysis proposing decompositions into simpler subprocesses • Visualize provenance at different levels of detail • Supporting SMEs in two main ways • Validation of process executions • Identification of reasoning patterns in process executions Source: myGrid
  • 28. Contribution 5: The twig join function • Based on XML pattern matching algorithms on Directed Acyclic Graphs (Bruno et al., 2002) • twig_join detects the occurrence of a pattern in an XML DAG • Given • P, a process • T, a task potentially describing P • M, a PSM providing a strategy on how to achieve T • i(T), the set of input roles of T • o(T), the set of output roles of T • D, the DAG resulting from documenting the execution of P • twig_join(D,i(T),o(T)) checks whether a twig exists for M that connects i(T) with o(T) in D • In this case, PSM M is the pattern to be identified in the process documentation DAG D
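As a rough intuition for what twig_join(D, i(T), o(T)) checks, the sketch below tests whether the nodes bound to the input roles are connected to the nodes bound to the output roles in the documentation DAG. It is a deliberately simplified stand-in based on plain reachability, not the holistic twig join of Bruno et al. (2002), and the node names in the example DAG are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def reachable(dag: Dict[str, List[str]], start: str) -> Set[str]:
    """All nodes reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in dag.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def twig_join(dag: Dict[str, List[str]],
              inputs: Iterable[str],
              outputs: Iterable[str]) -> bool:
    """Simplified check: do the input nodes connect to the output nodes?"""
    inputs, outputs = list(inputs), list(outputs)
    if not inputs or not outputs:
        return False
    closures = {i: reachable(dag, i) for i in inputs}
    # Every output must be derivable from at least one input ...
    outputs_explained = all(any(o in closures[i] for i in inputs) for o in outputs)
    # ... and every input must contribute to at least one output.
    inputs_used = all(any(o in closures[i] for o in outputs) for i in inputs)
    return outputs_explained and inputs_used


# Hypothetical documentation DAG for a decompose & combine execution.
dag = {
    "solution_KI": ["decompose"],
    "solution_PbNO3": ["decompose"],
    "decompose": ["ions"],
    "ions": ["combine"],
    "combine": ["PbI2"],
}
print(twig_join(dag, ["solution_KI", "solution_PbNO3"], ["PbI2"]))
print(twig_join(dag, ["PbI2"], ["solution_KI"]))
```

A real twig join also matches the internal structure of the pattern M (its actions and roles), which this reachability test ignores.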
  • 29. Contribution 5: A twig join example [Diagram; recoverable labels: Domain entities; Bridges (mapping); PSM entities; twig join!]
  • 30. Contribution 5: The matching algorithm
      • twig_join recursively applied at each decomposition level
      • Each task decomposed by one or several PSMs (task-method decomposition view)
      • Knowledge flow defines the sequence of evaluation
      • Backtracking possible at PSM and role levels
      [Diagram; recoverable labels: Task-method decomposition; Knowledge flow; Interaction; twig_join(Ti, D); decompose(Ti); twig_join(T11, D); twig_join(T12, D); twig_join(T13, D); twig_join(T14, D)]
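The recursive use of twig_join with backtracking over alternative PSMs might be sketched as follows. This is an illustration only: the data layout (`tasks`, `methods`), the reachability-based `twig_join` stand-in, and the task and PSM names are assumptions, and backtracking at role level is left out.

```python
from typing import Dict, List, Optional, Tuple


def reaches(dag: Dict[str, List[str]], src: str, dst: str) -> bool:
    """Depth-first test for a path from src to dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(dag.get(node, []))
    return False


def twig_join(task: dict, dag: Dict[str, List[str]]) -> bool:
    # Simplified stand-in: every output is reachable from some input.
    return all(any(reaches(dag, i, o) for i in task["inputs"])
               for o in task["outputs"])


def match(name: str,
          tasks: Dict[str, dict],
          methods: Dict[str, List[Tuple[str, List[str]]]],
          dag: Dict[str, List[str]]) -> Optional[dict]:
    """Match a task, then recurse into the subtasks of the first PSM that fits."""
    if not twig_join(tasks[name], dag):
        return None
    candidates = methods.get(name, [])
    if not candidates:                     # leaf task: nothing to decompose
        return {"task": name, "psm": None, "subtasks": []}
    for psm, subtasks in candidates:       # alternative PSMs for this task
        children = []
        for sub in subtasks:               # subtasks in knowledge-flow order
            result = match(sub, tasks, methods, dag)
            if result is None:
                break                      # backtrack: try the next PSM
            children.append(result)
        else:
            return {"task": name, "psm": psm, "subtasks": children}
    return None


dag = {"a": ["b"], "b": ["c"]}
tasks = {
    "T1": {"inputs": ["a"], "outputs": ["c"]},
    "T11": {"inputs": ["a"], "outputs": ["b"]},
    "T12": {"inputs": ["b"], "outputs": ["c"]},
    "T_bad": {"inputs": ["c"], "outputs": ["a"]},
}
methods = {"T1": [("psm_wrong", ["T_bad"]), ("psm_right", ["T11", "T12"])]}

result = match("T1", tasks, methods, dag)
print(result["psm"], [child["task"] for child in result["subtasks"]])
```

The first candidate fails on its subtask, so the search falls back to the second PSM. The nested result gives one description of the execution per decomposition level.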
  • 31. Contribution 6: A Knowledge-Oriented Provenance Environment [Architecture diagram; label text as extracted: PSM- Matching Ontology visualization bridges Provenance Matching query detection]
  • 32. Outline: Introduction and motivation; Open research problems and work hypotheses; Acquisition of process knowledge by SMEs; Provenance analysis of process executions by SMEs; Evaluation; Conclusions and future research problems
  • 33. Objective 1 • Evaluation settings • 2 Chemistry SMEs, 2 Biology SMEs, and 2 Physics SMEs (Judith, Lennart, Christianne, Martina, Markus, Andreas) • Syllabus • Chemistry: Stoichiometry, solutions and equilibrium (Brown & Lemay, pages 75-83, 113-133, and 613-653) • Biology: Cell and DNA structure and processes (Campbell and Reece, pages 112-124, 217-223, 239-245, 293-301, 304-311, and 317-319) • Physics: Kinematics and Dynamics (Serway and Faughn, chapters 2, 3, and 4) • Two main dimensions: usability and utility
  • 34. Evaluation results: utilization of the PSM library (Objective 1)
      H1: SME empowerment can increase KB quality and reduce costs
      H3: PSMs can reduce the complexity of process KA
      H6: The proposed methods and tools are flexible and reusable

      SME                 # of processes modeled
      SME1 (Physics)      0
      SME2 (Biology)      2
      SME3 (Biology)      6
      SME4 (Chemistry)    0
      SME5 (Chemistry)    3
      SME6 (Physics)      0
      Total               11

      SME                 Process                              PSMs
      SME2 (Biology)      Transition from G2 phase to mitosis  n.a.
      SME2 (Biology)      Mitosis                              n.a.
      SME3 (Biology)      Mitosis                              decompose & combine
      SME3 (Biology)      Carbohydrate metabolism              consume, transform
      SME3 (Biology)      Cellular respiration                 decompose, consume
      SME3 (Biology)      Detoxification                       transform
      SME3 (Biology)      Photosynthesis                       form by combination
      SME3 (Biology)      Ribosome protein synthesis           situate & combine
      SME5 (Chemistry)    Complete ionic equation              form by combination
      SME5 (Chemistry)    Molecular equation                   decompose & combine
      SME5 (Chemistry)    Net ionic equation                   form by combination
  • 35. Evaluation results: performance of process models with respect to configuration C0 (Objective 1)
      H5: The proposed methods and tools produce sound and complete executable process models
      • C0: Well-founded evaluation on; Concept/attr. names ground off
      • C1: Well-founded evaluation on; Concept/attr. names ground on
      • C2: Well-founded evaluation off; Concept/attr. names ground on

      Query        C0     ratio    C1     ratio    C2     ratio
      SME3-q0      31     1,00     0      0,00     16     0,52
      SME3-q1      63     1,00     16     0,25     16     0,25
      SME3-q2      31     1,00     16     0,52     16     0,52
      SME3-q3      47     1,00     16     0,34     16     0,34
      SME3-q4      15     1,00     0      0,00     0      0,00
      SME3-q5      32     1,00     16     0,50     0      0,00
      SME3-q6      203    1,00     219    1,08     234    1,15
      SME3-q7      63     1,00     31     0,49     31     0,49
      SME3-q8      47     1,00     31     0,66     16     0,34
      SME3-q9      62     1,00     32     0,52     16     0,26
      SME3-q10     203    1,00     218    1,07     203    1,00
      Average      79,7   1,00     59,5   0,75     56,4   0,71
      Median       47     1,00     16     0,34     16     0,34
      Min          15     1,00     0      0,00     0      0,00
      Max          203    1,00     219    1,08     234    1,15

      Ratio relative to C0: <1 faster; =1 same time as config C0; >1 slower
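The ratio columns of the table on slide 35 can be reproduced by dividing each measurement by the C0 measurement of the same query. A small sketch with three of the reported rows follows. The variable names are hypothetical, and the measurement unit is not stated on the slide.

```python
# (C0, C1, C2) measurements for three of the queries reported on the slide.
timings = {
    "SME3-q0": (31, 0, 16),
    "SME3-q6": (203, 219, 234),
    "SME3-q9": (62, 32, 16),
}


def relative_to_c0(value: float, baseline: float) -> float:
    # Below 1 means faster than C0, above 1 means slower.
    return round(value / baseline, 2)


ratios = {
    query: tuple(relative_to_c0(v, c0) for v in (c0, c1, c2))
    for query, (c0, c1, c2) in timings.items()
}
for query, row in ratios.items():
    print(query, row)
```

SME3-q6 is the case where both optimized configurations are slower than the baseline (ratios above 1).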
  • 36. Evaluation results: utility and usability (Objective 1)
      H2: Due to its complexity, SMEs require specific means for process KA
      H4: The method and tools proposed abstract SMEs from the underlying KRR formalism
      • Physics SMEs did not use processes
      • Not so important for Chemistry SMEs
      • SME2 (Biology): “It makes the representation of biological models easier”
      • SME3 (Biology): “The modeling of processes is very useful. It must be possible to ask questions about the various states of a process. And asking questions with T&D worked okay”
      • System Usability (SU) scale
          • SMEs answered a questionnaire about the system with a quantitative value ranging between 0 and 100
          • Average obtained: 64,5
  • 37. Evaluation settings (Provenance Challenge) (Objective 2) [Figure; recoverable labels: Brain Atlas Workflow; Brain Atlas Provenance Data Flow; Catalogation PSM]
  • 38. Evaluation results (Objective 2)
      H7: PSMs can provide SMEs with explanations of process executions
      H8: The method proposed identifies the main rationale behind processes by detecting PSM occurrences
      H9: PSMs describe process executions at different levels of detail
      • Focus on precision and recall metrics
      • Identified at three different layered contexts
          • Method
          • Task
          • Decomposition-level
      Legend: Perfect match; Partial match; No match
  • 39. Outline: Introduction and motivation; Open research problems and work hypotheses; Acquisition of process knowledge by SMEs; Provenance analysis of process executions by SMEs; Evaluation; Conclusions and future research problems
  • 40. Conclusions
      Objective 1: To enable SMEs to acquire processes without KEs
      • Qualitative evidence rather than statistical proof (only 6 SMEs)
      • However, evidence found that it is possible to engage users in acquiring process knowledge without the intervention of KEs
      • SMEs using the PSM library (SME3 and SME5) produced more and better quality process models (82%) than the rest (SME2)
      • The method used to create the PSM library has also shown evidence of being reusable in other domains
      Objective 2: To support SMEs in understanding process executions
      • Semantic overlays (e.g. PSMs) on top of process documentation provide the required abstractions to analyze provenance from a knowledge perspective
      • Provenance analysis by SMEs benefits from a hierarchical structure in such overlays
      • The matching algorithm has not been applied to large PSM libraries and provenance logs
  • 41. The ubiquity of processes [Figure; recoverable labels: Business; Biology; Healthcare; Climate prediction; Ecology; Chemistry]
  • 42. Future research problems
      • The Web is driving a new computing paradigm through the involvement of users forming online communities
      • Additionally, focus change from data to process
      • The solutions proposed live in the Semantic Web in the small
      • Challenge: move to the Web in the large
      Process representation and reasoning: More expressivity (events, qualitative reasoning); Incomplete, inconsistent, contradictory knowledge bases; Uncertainty, nonmonotonicity
      Performance, coverage, scale: Distributed reasoning algorithms; Conciliation of partial results; Heuristics (assumptions, defaults); Caching
      Collaboration in user communities: Share and reuse processes; Compare and recommend processes; Process-specific query mechanisms
      Process validation, trust maintenance: Process reliability and validation; Trust
  • 43. Acquisition and Understanding of Process Knowledge Using Problem Solving Methods Jose Manuel Gómez Pérez Facultad de Informática Universidad Politécnica de Madrid Campus de Montegancedo sn 28660 Boadilla del Monte, Madrid http://www.oeg-upm.net jmgomez@fi.upm.es Phone: 34.91.3363670 Fax: 34.91.3524819 PhD thesis Date: 08/05/2009
  • 44. Previous steps towards provenance analysis
      • PASOA (Groth et al., 2008) process documentation
          • Interaction p-assertions: who exchanges messages
          • Relationship p-assertions: how input data becomes output data
          • Actor-state p-assertions: non-functional properties
      [Figure caption: PASOA interaction p-assertion]
  • 45. Contribution 2: PSMs for the acquisition of process knowledge • PSMs separate process from domain knowledge by explicitly representing the different roles domain entities play in a process • Focus on how to achieve processes • Provide SMEs with guidelines and templates to formulate process models [Figure caption: Modelling framework for process knowledge, extending TMDA]
  • 46. Contribution 2: The PSM library for Process Modelling Reusable across the three domains
  • 47. The Halo Project
      • “Project Halo aims to create an intelligent computer system able to answer novel questions in scientific domains with expertise equivalent to Advanced Placement (AP) competence level”
      • Three scientific domains: Chemistry, Biology, and Physics
      • Focus on tools that allow Subject Matter Experts (SMEs) to author scientific knowledge
      Sample questions:
      The elongation of the leading strand during DNA synthesis: a. produces Okazaki fragments b. depends on the action of DNA polymerase c. does not require a template strand
      The primer that initiates the synthesis of a new DNA strand is usually: a. RNA b. DNA c. an Okazaki fragment d. a structural protein e. a thymine dimer
      Which part of the animal cell is required only in the first stage of mitosis and what is the name of such stage? a. chromatin and prophase b. chromatid and prometaphase c. centromere and anaphase d. plasma membrane and telophase
  • 48. Validating process models: The testing & debugging perspective [Screenshot; recoverable callouts: Test set; Query; Test results]
  • 49. The process representation and reasoning formalism
      • Bridges the gap between the knowledge level and the operational level
      • Three main aspects to address
          • Process frame
              • On what portion of the knowledge base can a process action have effect?
              • How can the effects of actions on the knowledge base be propagated to successive actions?
          • Data flow: path followed by the data throughout a process
          • Control flow: run-time evaluation and enactment of the actual flow of data
      [Diagram; recoverable labels: Data flow; Input action; Conditional precedence (control flow); Output action]
  • 50. Addressing the frame problem
      • Two states (pre and post) per process action
          • Pre state: portion of the process frame in the scope of an action
          • Post state: updated pre state of the action after its execution
          • Actions read from their pre state and write into their post state
      • We automatically synthesize rules at modeling time that create these modules upon process execution
          • Setup rules: build the pre state of the input actions of the process
          • Precedence rules: describe what actions can be connected with each other by means of their outputs and inputs
          • Transition rules: describe the transition between pre and post states
      Explicit manipulation of the process frame allows runtime management of data and control flow
      [Diagram; recoverable labels: Pre state of action Dissolve; Post state of action Dissolve]
  • 51. Correlation between the process metamodel and the formalism [Class diagram; recoverable labels: Process metamodel: Action, Atomic action, Iterative action. Process representation and reasoning formalism: Process rule, Setup rule, Precedence rule, Transition rule, Domain rule, Update rule, Check rule. Multiplicities shown: 1 to 1..*, 1 to *]
  • 52. F-logic as process representation and reasoning language
      • Process diagrams authored by SMEs are automatically translated into F-logic (OntoBroker) code
      • Some key properties
          • Module system (encoding the process frame; pre and post states)
          • Negation (control flow evaluation)
          • Evaluation methods: dynamic filtering (Kifer and Lozinskii, 1986) and well-founded evaluation (van Gelder et al., 1991)
          • Performant and scalable, as shown by OpenRuleBench (Liang et al., 2009)
      • Functional advantages
          • Enabling a single entry point for reasoning across the whole system
          • Domain-level reasoning within processes, through rules
          • Keeping introspective properties for reasoning with meta-level information, like subprocesses and intermediate results
          • Support for expressing and reasoning with processes, allowing inference of process outcomes, grounded in the particular domains of application
      • The synthesized code implements data and control flow within processes
  • 53. Evaluation results: utilization of process metamodel (Objective 1)

      SME                 Process                               # of resources   # of actions   # of relations   # of forks   Total
      SME2 (Biology)      Transition from G2 phase to mitosis   12               21             1                3            37
      SME2 (Biology)      Mitosis                               34               37             0                5            76
      SME3 (Biology)      Mitosis                               29               11             0                5            45
      SME3 (Biology)      Carbohydrate metabolism               5                6              0                2            13
      SME3 (Biology)      Cellular respiration                  5                7              0                2            14
      SME3 (Biology)      Detoxification                        4                4              0                1            9
      SME3 (Biology)      Photosynthesis                        6                6              0                1            13
      SME3 (Biology)      Ribosome protein synthesis            2                2              0                1            5
      SME5 (Chemistry)    Complete ionic equation               7                7              0                1            15
      SME5 (Chemistry)    Molecular equation                    4                4              0                1            9
      SME5 (Chemistry)    Net ionic equation                    3                3              0                1            7
      Total                                                     111              108            1                23           243
  • 54. Contribution 6: A hierarchical visualization paradigm [Screenshot; recoverable labels: Task-method decomposition view. Legend: Perfect match; Partial match; No match]
  • 55. Future research problems
      • The Web is driving a new computing paradigm through the involvement of users forming online communities
          • Migrating from data to process
      • The solutions proposed live in the Semantic Web in the small
          • Challenge: move to the Web in the large
      • Process representation and reasoning
          • More expressivity required (events, qualitative)
          • Incomplete, inconsistent, even contradictory knowledge bases
          • Uncertainty, nonmonotonicity
      • Performance, coverage, and scale
          • Distributed reasoning algorithms, conciliation of partial results, heuristics, caching, well-founded assumptions, defaults
      • Collaboration in communities of users
          • Share, merge, compare, and recommend process models
          • Process-specific query mechanisms
      • Web-scale process validation and trust maintenance
          • Process reliability, trust and validation