SlideShare ist ein Scribd-Unternehmen logo
1 von 33
On the Role of Fitness,
Precision, Generalization and
Simplicity in Process
Discovery

Joos Buijs
Boudewijn van Dongen
Wil van der Aalst




        http://www.win.tue.nl/coselog/
Advances in Process Mining

• Many process discovery and conformance checking algorithms
  and tools are available (cf. the various ProM packages).
• Also commercial software based on these ideas:
  Disco (Fluxicon), Reflect (Futura/Perceptive), BPMOne (Pallas Athena/Perceptive),
  ARIS Process Performance Manager (Software AG), Interstage Automated Process
  Discovery (Fujitsu), QPR ProcessAnalyzer/Analysis (QPR Software), flow
  (fourspark), Discovery Analyst (StereoLOGIC), etc.
• We applied process mining in over 100 organizations.



                    More than 75 people
                    involving more than 50
                    organizations created the
                    Process Mining Manifesto in
                    the context of the IEEE Task
                    Force on Process Mining.

                    Available in 13 languages
                                                                              PAGE 1
Example Process Discovery
(Vestia, Dutch housing agency, 208 cases, 5987 events)




                                                         PAGE 2
Example: Conformance Checking
(WOZ objections Dutch municipality, 745 objections, 9583 event, f= 0.988)




                                                                        PAGE 3
Challenge:
Four Competing Quality Criteria

 “able to replay event log”                 “Occam’s razor”

replay fitness                                simplicity

                               process
                              discovery



generalization                                precision
 “not overfitting the log”                “not underfitting the log”



                                                                PAGE 4
Example: one log four models
                                                                                                               b
                                                                                                            examine
                                                                                                           thoroughly
                                                                                                                                                                            g
                                                                                                                                                                         pay
                                                                                                               c                                                     compensation
                                                                                          a                examine                                 e
                                                                           start     register              casually                           decide                                   end
                                                                                                                                                                                                     #      trace
                                                                                     request
                                                                                                                                                                            h                        455 acdeh
                                                                                                               d                                                         reject
                                                                                                          check ticket                                                  request                      191 abdeg
                                                                                                                                               f     reinitiate
                                                                                                                                                      request                                        177 adceh
                                                                               N1 : fitness = +, precision = +, generalization = +, simplicity = +
                                                                                                                                                                                                     144 abdeh
                                                                                                                                                                                                     111 acdeg
                                                                                     a               c                        d                          e                      h
                                                                                                                                                                                                      82 adceg
                                                                          start    register       examine                   check                      decide                reject     end
                                                                                   request        casually                  ticket                                          request
                                                                                                                                                                                                      56 adbeh
                                                                               N2 : fitness = -, precision = +, generalization = -, simplicity = +
                                                                                                                                                                                                      47 acdefdbeh
 “able to replay event log”                 “Occam’s razor”
                                                                                                                                                                                                      38 adbeg
                                                                                                       examine                                check
                                                                                                      thoroughly        b             d       ticket                        g                         33 acdefbdeh
replay fitness                                simplicity                                                                                                            pay
                                                                                                                                                                compensation
                                                                                          a                                                                                                           14 acdefbdeg
                                                                           start     register   examine
                                                                                                             c                                                                         end            11 acdefdbeg
                                                                                     request    casually
                                                                                                                         e                f        reinitiate               h
                               process                                                                        decide                                request        reject
                                                                                                                                                                  request
                                                                                                                                                                                                         9 adcefcdeh
                              discovery                                        N3 : fitness = +, precision = -, generalization = +, simplicity = +                                                       8 adcefdbeh
                                                                                                                                                                                                         5 adcefbdeg
                                                                                       a              d                        c                           e                    g
                                                                                                                                                                                                         3 acdefbdefdbeg
generalization                                precision                             register
                                                                                    request
                                                                                                    check
                                                                                                    ticket
                                                                                                                         examine
                                                                                                                         casually
                                                                                                                                                        decide              pay
                                                                                                                                                                        compensation
                                                                                                                                                                                                         2 adcefdbeg
                                                                                       a              c                        d                          e                     g                        2 adcefbdefbdeg
 “not overfitting the log”                “not underfitting the log”                register      examine                    check                      decide              pay
                                                                                    request       casually                   ticket                                     compensation                     1 adcefdbefbdeh
                                                                                       a              d                        c                           e                    h                        1 adbefbdefdbeg
                                                                                    register        check                examine                        decide                reject
                                                                                    request         ticket               casually                                            request                     1 adcefdbefcdefdbeg
                                                                                      a               c                       d                           e                     h                   1391
                                                                       start                                                                                                                  end
                                                                                   register       examine                   check                      decide                reject
                                                                                   request        casually                  ticket                                          request


                                                                                                 …                 (all 21 variants seen in the log)


                                                                                      a              b                        d                           e                     g
                                                                                   register        examine                  check                      decide               pay
                                                                                   request        thoroughly                ticket                                      compensation

                                                                                      a              d                        b                           e                     h
                                                                                   register         check                 examine                      decide                reject
                                                                                   request          ticket               thoroughly                                         request

                                                                                      a              b                        d                           e                     h
                                                                                   register        examine                  check                      decide                reject
                                                                                   request        thoroughly                ticket                                          request                        PAGE 5
                                                                                N4 : fitness = +, precision = +, generalization = -, simplicity = -
#     trace
                                                                               455 acdeh
        Model N1                                                               191 abdeg
                                                                               177 adceh
                                                                               144 abdeh
                                                                               111 acdeg
                                                                                82 adceg
                                                                                56 adbeh
                          b                                                     47 acdefdbeh
                       examine
                      thoroughly                                                38 adbeg
                                                              g                 33 acdefbdeh
                                                             pay
                          c                              compensation           14 acdefbdeg
           a          examine               e
                                                                                11 acdefdbeg
start    register     casually          decide                          end
         request                                                                   9 adcefcdeh
                                                              h
                          d                                 reject                 8 adcefdbeh
                     check ticket                          request                 5 adcefbdeg
                                        f   reinitiate                             3 acdefbdefdbeg
                                             request
N1 : fitness = +, precision = +, generalization = +, simplicity = +                2 adcefdbeg
                                                                                   2 adcefbdefbdeg
                                                                                   1 adcefdbefbdeh
                                                                                   1 adbefbdefdbeg
                                                                                   1 adcefdbefcdefdbeg
                                                                                             PAGE 6
                                                                              1391
#     trace
                                                                              455 acdeh
        Model N2                                                              191 abdeg
                                                                              177 adceh
                                                                              144 abdeh
                                                                              111 acdeg
                                                                               82 adceg
                                                                               56 adbeh
                                                                               47 acdefdbeh
                                                                               38 adbeg
          a           c             d            e             h               33 acdefbdeh
start   register   examine        check        decide         reject   end     14 acdefbdeg
        request    casually       ticket                     request
   N2 : fitness = -, precision = +, generalization = -, simplicity = +         11 acdefdbeg
                                                                                  9 adcefcdeh
                                                                                  8 adcefdbeh
                                                                                  5 adcefbdeg
                                                                                  3 acdefbdefdbeg
                                                                                  2 adcefdbeg
                                                                                  2 adcefbdefbdeg
                                                                                  1 adcefdbefbdeh
                                                                                  1 adbefbdefdbeg
                                                                                  1 adcefdbefcdefdbeg
                                                                                            PAGE 7
                                                                             1391
#     trace
                                                                                          455 acdeh
        Model N3                                                                          191 abdeg
                                                                                          177 adceh
                                                                                          144 abdeh
                                                                                          111 acdeg
                                                                                           82 adceg
                                                                                           56 adbeh
                                                                                           47 acdefdbeh
                           examine                  check
                          thoroughly    b   d       ticket                     g           38 adbeg
                                                                        pay                33 acdefbdeh
                                                                    compensation
           a                                                                               14 acdefbdeg
start    register   examine                                                        end     11 acdefdbeg
         request    casually   c
                                        e       f      reinitiate
                                                                      reject
                                                                               h              9 adcefcdeh
                               decide                   request
                                                                     request                  8 adcefdbeh
 N3 : fitness = +, precision = -, generalization = +, simplicity = +
                                                                                              5 adcefbdeg
                                                                                              3 acdefbdefdbeg
                                                                                              2 adcefdbeg
                                                                                              2 adcefbdefbdeg
                                                                                              1 adcefdbefbdeh
                                                                                              1 adbefbdefdbeg
                                                                                              1 adcefdbefcdefdbeg
                                                                                                        PAGE 8
                                                                                         1391
#     trace
                                                                                               455 acdeh
Model N4                                                                                       191 abdeg
                                                                                               177 adceh
                                                                                               144 abdeh
              a             d                 c                e              g                111 acdeg
           register       check           examine            decide          pay
           request        ticket          casually                       compensation           82 adceg
              a             c                 d                e              g                 56 adbeh
           register     examine             check           decide           pay
           request      casually            ticket                       compensation           47 acdefdbeh
              a             d                 c                e              h                 38 adbeg
           register       check           examine           decide           reject
           request        ticket          casually                          request             33 acdefbdeh
              a            c                 d                e               h                 14 acdefbdeg
start                                                                                   end
           register     examine            check            decide           reject
           request      casually           ticket                           request             11 acdefdbeg

                       …             (all 21 variants seen in the log)
                                                                                                   9 adcefcdeh
                                                                                                   8 adcefdbeh
                                                                                                   5 adcefbdeg
             a             b                 d                e               g
           register     examine            check            decide           pay                   3 acdefbdefdbeg
           request     thoroughly          ticket                        compensation
                                                                                                   2 adcefdbeg
             a             d                 b                e               h
           register       check            examine          decide           reject                2 adcefbdefbdeg
           request        ticket          thoroughly                        request
                                                                                                   1 adcefdbefbdeh
             a             b                 d                e               h
          register       examine           check            decide          reject                 1 adbefbdefdbeg
          request       thoroughly         ticket                          request
                                                                                                   1 adcefdbefcdefdbeg
        N4 : fitness = +, precision = +, generalization = -, simplicity = -
                                                                                                             PAGE 9
                                                                                              1391
Another challenge: Huge search space

M    M M         M              M
  M M                M M             M M         M              M
        M   M     M M            M M                 M M
M                       M   M           M M       M M
  M M     M MM
                  M M     M    M    M     M MM          M M    M M
                                 M                M M     M      M
M M M M M M            M M             M M                    M M
                    M           M M          M M M M M
MM    M MM M          M MM M                M M
     M M M      MM               M MM M         MM    M MM      MM
MM    M M M M M M            M          M            M M       M
     M M M M M M M M M M MM M M M M
  M    M M M M         M M    M M       M    M M MM M M M M
    M M             M M                M M M M         M M M M
          M   MM              M    M M              M M
  M                       M               M MM
    M M     M MM
                    M M     M    M
                                   M  M     M MM          M M M M MM
  M M M M                M M MM          M M        M M     M
                  M M                                    M M    M M
  MM    M MM M          M MM M
                                     M
                                             M  M M M
      M M      M MM
                      M M          M MM M M M M         M MM M MM
   MM           M M            M          M           M M     M M
       M              M            M          M
  M       M MM          M M M           M M M       M M
    M M      M MM
                    M M      M    M
                                    M M      M MM         M M M M MM
   M M M M               M M M M                    M M     MM
                   M M                    M M                    M M
        M MM M                        M            M M M M
  MM
       M M      M MM    M MM M
                                    M M M MM M          M MM M MM
                       M M                M     M MM          M M
                                                       M M

                                                               PAGE 10
… with just a few interesting candidates

M    M M         M
  M M                M M       M    M M        M              M
        M   M     M M           M M                 M M
M                       M M M          M M      M M
  M M     M MM
                  M M     M        M     M MM          M M   M M
                                M               M M     M      M
M M M M M M            M M     M M    M M                   M M
      M MM M        M                       M M M M M
MM                    M MM M              M M
     M M M      MM
                     M M    M   M MM M         MM    M MM     MM
      M M M M
MM
     M M                             M MM M M M M M          M
            M M MM M       M MM       M                         M
  M
    M M
       M M M M         M M   M M M M M M MM M M M M M
          M   MM    M M           M              M   M M M M
  M                       M  M         MM   MM    M M
    M M     M MM
                    M M     M   M
                                  M  M     M MM         M M M M MM
  M M M M                M MM M M       M M       M M     M
                  M M                                         M M
        M MM M            M                   M M M M M
  MM
      M M      M MM     M    M MM M M M               M MM M MM
   MM           M M   M M     M      M M      M MM          M M
       M              M           M                  M M
          M MM          M M M               M      M M
  M                                    M M M
    M M      M MM
                    M M     M    M
                                   M M     M MM          M M M M MM
   M M M M                M M M M                  M M    MM
                   M M                  M M             M M    M M
  MM    M M   M M       M MM M
                                     M
                                            M  M M M
       M M      M MM
                       M M         M MM M M M M        M MM M MM
                                         M            M M   M M
                                                              PAGE 11
Two requirements

1. It should be possible to      “able to replay event log”                 “Occam’s razor”

                                replay fitness                                simplicity
   seamlessly balance the
   different quality criteria                                  process
                                                              discovery

   based on user-defined        generalization                                precision
   preferences.                  “not overfitting the log”                “not underfitting the log”




2. The algorithm should
   always return a "correct"
   process model and not
   waste time on model
   having deadlocks and
   other anomalies.
                                                                                     PAGE 12
Proposal: Evolutionary Tree Miner (ETM)

• Process trees as representation (= limit search space
  to "good" models).
• Genetic approach (= very flexible)
• Fitness function uses all four criteria (= seamlessly
  balance the different "forces")




                                                    PAGE 13
Representational Bias: Process Trees

         →           • Always sound because of the
                       block structure            A
     A       →       • Also Loop and OR operator
         ∧       E
                                            B    C       D
     X       D
                         A     B
 B       C                A (BA)*                E




                                                      PAGE 14
Petri Net Semantics
(used for comparison and conformance checking only)




                                                A
             A         B

Sequence
                           Parallellism         B
                   A

                   B                            A
Exclusive Choice

                   A
                                                B
                   B        Or Choice
Loop




                                                      PAGE 15
Steps of the Genetic ETM Algorithm



              Change
             Population

                            No
  Create                                 Return
              Measure     Stop
  Initial                                 Best
              Quality      ?     Yes
Population                             Individual




                                               PAGE 16
Population Change


                           Elite




Population i                               Population i+1




                         Selection




               Replace   Crossover   Mutation

                                                            PAGE 17
Four Metrics (see paper)

          “able to replay event log”                 “Occam’s razor”

         replay fitness                                simplicity

                                        process
                                       discovery



         generalization                                precision
          “not overfitting the log”                “not underfitting the log”




                                                                                1 = optimal
                                                                                0 = very bad

                                                                                       PAGE 18
Example



                              B            E

                 A            C                       G

                              D            F




          A = send e-mail, B = check credit,
          C = calculate capacity, D = check system,
          E = accept, F = reject, G = send e-mail
                                                          PAGE 19
Conventional Algorithms (1/3)
   ("best effort" mapping to process trees to allow for comparison)

alpha miner




                                                     low fitness
ILP miner




                                                    low precision
language-based region miner




                                                     low fitness
                                                                    PAGE 20
Conventional Algorithms (2/3)

heuristic miner




multi-phase miner




                                   PAGE 21
Conventional Algorithms (3/3)

genetic miner




state-based region miner




                                  PAGE 22
Often unsound result and no mechanism
to seamlessly balance the four criteria




                                     PAGE 23
Genetic Mining (ETM) While Considering
    Only One Criterion




                                                          best value
                                                           possible
                                                          for this log




ETM with weight zero to three out of four perspectives.                  PAGE 24
Considering Replay Fitness and One
    Other Criterion




ETM with weight zero to two out of four perspectives.   PAGE 25
Considering 3 of 4 Criteria




     replay fitness needs to have a larger weight   PAGE 26
Considering All Four Criteria with
Emphasis on Fitness




    fitness has weight 10

                                     PAGE 27
Initial Model Versus Discovered Model




    simulated          discovered by ETM




    B    E

A   C           G

    D    F

                                           PAGE 28
Real-Life Event Logs

• Event log L0 is the event log used before. L0 contains
  100 traces, 590 events and 7 activities.
• Event Log L1 contains 105 traces, 743 events in total,
  with 6 different activities.
• Event Log L2 contains 444 traces, 3.269 events in total,
  with 6 different activities.
• Event Log L3 contains 274 traces, 1:582 events in total,
  with 6 different activities.


Event logs L1, L2 and L3 are extracted from
the information systems of municipalities
participating in the CoSeLoG project
(http://www.win.tue.nl/coselog/).
                                                        PAGE 29
Results
          If unsound , the
          sound behavior
          is approximated
          when creating
          the process
          tree.




          Equal weights for
          all criteria.



            PAGE 30
Conclusion

 “able to replay event log”                 “Occam’s razor”

replay fitness                                simplicity

                               process
                              discovery



generalization                                precision
 “not overfitting the log”                “not underfitting the log”


                                                                       • First algorithm that allows for balancing
                                                                         all four perspectives.
                                                                       • Genetic algorithm is very flexible, but
                                                                         also very slow.
                                                                       • Process trees only used internally
                                                                         (choose your favorite representation)
                                                                       • Future work:
                                                                            − Improve speed
                                                                            − Distribute PM tasks
                                                                            − Discover configurable process trees
                                                                                                           PAGE 34
www.processmining.org
  www.win.tue.nl/ieeetfpm/
                             PAGE 35

Weitere ähnliche Inhalte

Was ist angesagt?

Jeu norme iso 9001
Jeu norme iso 9001Jeu norme iso 9001
Jeu norme iso 9001CIPE
 
Découvrez la Value Stream Mapping (VSM)
Découvrez la Value Stream Mapping (VSM)Découvrez la Value Stream Mapping (VSM)
Découvrez la Value Stream Mapping (VSM)XL Groupe
 
Mini projet power bi
Mini projet power bi Mini projet power bi
Mini projet power bi AfnouchAhmed
 
Thibaut Petit - Memento / Guide du Lean Construction
Thibaut Petit - Memento / Guide du Lean ConstructionThibaut Petit - Memento / Guide du Lean Construction
Thibaut Petit - Memento / Guide du Lean ConstructionThibaut PETIT
 
Gestion De Production Implantation
Gestion De Production ImplantationGestion De Production Implantation
Gestion De Production Implantationcharkaoui abdelkabir
 
Systèmes d'Exploitation - chp6-synchronisation
Systèmes d'Exploitation - chp6-synchronisationSystèmes d'Exploitation - chp6-synchronisation
Systèmes d'Exploitation - chp6-synchronisationLilia Sfaxi
 
SRM70 Présentation
SRM70 PrésentationSRM70 Présentation
SRM70 Présentationguest96f0df
 
Gimelec dossier industrie 4.0 les leviers de la transformation 2014
Gimelec dossier industrie 4.0 les leviers de la transformation 2014Gimelec dossier industrie 4.0 les leviers de la transformation 2014
Gimelec dossier industrie 4.0 les leviers de la transformation 2014polenumerique33
 
Reflex WMS - Logiciel de gestion d entrepôt
Reflex WMS - Logiciel de gestion d entrepôtReflex WMS - Logiciel de gestion d entrepôt
Reflex WMS - Logiciel de gestion d entrepôtHardis
 
TP2-UML-Correction
TP2-UML-CorrectionTP2-UML-Correction
TP2-UML-CorrectionLilia Sfaxi
 
Mieux rediger-les-user-stories-bonnes-pratiques-oeildecoach 2019
Mieux rediger-les-user-stories-bonnes-pratiques-oeildecoach 2019Mieux rediger-les-user-stories-bonnes-pratiques-oeildecoach 2019
Mieux rediger-les-user-stories-bonnes-pratiques-oeildecoach 2019Oeil de Coach
 
Projet décisionnel
Projet décisionnelProjet décisionnel
Projet décisionnelSiham JABRI
 
Webinar | Mener un projet de GED : Les 6 étapes incontournables !
Webinar | Mener un projet de GED : Les 6 étapes incontournables !Webinar | Mener un projet de GED : Les 6 étapes incontournables !
Webinar | Mener un projet de GED : Les 6 étapes incontournables !Sollan France
 
Systèmes d'Exploitation - chp2-gestion des processus
Systèmes d'Exploitation - chp2-gestion des processusSystèmes d'Exploitation - chp2-gestion des processus
Systèmes d'Exploitation - chp2-gestion des processusLilia Sfaxi
 

Was ist angesagt? (20)

Jeu norme iso 9001
Jeu norme iso 9001Jeu norme iso 9001
Jeu norme iso 9001
 
Découvrez la Value Stream Mapping (VSM)
Découvrez la Value Stream Mapping (VSM)Découvrez la Value Stream Mapping (VSM)
Découvrez la Value Stream Mapping (VSM)
 
Rédiger des User Stories
Rédiger des User StoriesRédiger des User Stories
Rédiger des User Stories
 
Mini projet power bi
Mini projet power bi Mini projet power bi
Mini projet power bi
 
Thibaut Petit - Memento / Guide du Lean Construction
Thibaut Petit - Memento / Guide du Lean ConstructionThibaut Petit - Memento / Guide du Lean Construction
Thibaut Petit - Memento / Guide du Lean Construction
 
Atelier KANBAN avec le jeu get KANBAN
Atelier KANBAN avec le jeu get KANBANAtelier KANBAN avec le jeu get KANBAN
Atelier KANBAN avec le jeu get KANBAN
 
Gestion De Production Implantation
Gestion De Production ImplantationGestion De Production Implantation
Gestion De Production Implantation
 
Juste à temps (Just in time)
Juste à temps (Just in time)Juste à temps (Just in time)
Juste à temps (Just in time)
 
Systèmes d'Exploitation - chp6-synchronisation
Systèmes d'Exploitation - chp6-synchronisationSystèmes d'Exploitation - chp6-synchronisation
Systèmes d'Exploitation - chp6-synchronisation
 
SRM70 Présentation
SRM70 PrésentationSRM70 Présentation
SRM70 Présentation
 
Les pratiques Scrum
Les pratiques ScrumLes pratiques Scrum
Les pratiques Scrum
 
Gimelec dossier industrie 4.0 les leviers de la transformation 2014
Gimelec dossier industrie 4.0 les leviers de la transformation 2014Gimelec dossier industrie 4.0 les leviers de la transformation 2014
Gimelec dossier industrie 4.0 les leviers de la transformation 2014
 
Reflex WMS - Logiciel de gestion d entrepôt
Reflex WMS - Logiciel de gestion d entrepôtReflex WMS - Logiciel de gestion d entrepôt
Reflex WMS - Logiciel de gestion d entrepôt
 
TP2-UML-Correction
TP2-UML-CorrectionTP2-UML-Correction
TP2-UML-Correction
 
Mieux rediger-les-user-stories-bonnes-pratiques-oeildecoach 2019
Mieux rediger-les-user-stories-bonnes-pratiques-oeildecoach 2019Mieux rediger-les-user-stories-bonnes-pratiques-oeildecoach 2019
Mieux rediger-les-user-stories-bonnes-pratiques-oeildecoach 2019
 
Projet décisionnel
Projet décisionnelProjet décisionnel
Projet décisionnel
 
Webinar | Mener un projet de GED : Les 6 étapes incontournables !
Webinar | Mener un projet de GED : Les 6 étapes incontournables !Webinar | Mener un projet de GED : Les 6 étapes incontournables !
Webinar | Mener un projet de GED : Les 6 étapes incontournables !
 
Industrie 4.0
Industrie 4.0Industrie 4.0
Industrie 4.0
 
User stories
User storiesUser stories
User stories
 
Systèmes d'Exploitation - chp2-gestion des processus
Systèmes d'Exploitation - chp2-gestion des processusSystèmes d'Exploitation - chp2-gestion des processus
Systèmes d'Exploitation - chp2-gestion des processus
 

Mehr von Wil van der Aalst

Process Mining: BPM on Steroids (CPOs@BPM&O 2019 Keynote)
Process Mining: BPM on Steroids (CPOs@BPM&O 2019 Keynote)Process Mining: BPM on Steroids (CPOs@BPM&O 2019 Keynote)
Process Mining: BPM on Steroids (CPOs@BPM&O 2019 Keynote)Wil van der Aalst
 
Everything You Always Wanted To Know About Petri Nets, But Were Afraid To Ask
Everything You Always Wanted To Know About Petri Nets, But Were Afraid To AskEverything You Always Wanted To Know About Petri Nets, But Were Afraid To Ask
Everything You Always Wanted To Know About Petri Nets, But Were Afraid To AskWil van der Aalst
 
20 years of Process Mining Research (ICPM 2019 keynote)
20 years of Process Mining Research (ICPM 2019 keynote)20 years of Process Mining Research (ICPM 2019 keynote)
20 years of Process Mining Research (ICPM 2019 keynote)Wil van der Aalst
 
Earth Movers’ Stochastic Conformance Checking
Earth Movers’ Stochastic Conformance CheckingEarth Movers’ Stochastic Conformance Checking
Earth Movers’ Stochastic Conformance CheckingWil van der Aalst
 
Using Process Mining to Remove Operational Friction in Shared Services
Using Process Mining to Remove Operational Friction in Shared ServicesUsing Process Mining to Remove Operational Friction in Shared Services
Using Process Mining to Remove Operational Friction in Shared ServicesWil van der Aalst
 
Object-Centric Process Mining: Dealing With Divergence and Convergence in Eve...
Object-Centric Process Mining: Dealing With Divergence and Convergence in Eve...Object-Centric Process Mining: Dealing With Divergence and Convergence in Eve...
Object-Centric Process Mining: Dealing With Divergence and Convergence in Eve...Wil van der Aalst
 
Process Mining In Today’s Platforms Economy: Opportunities and Challenges (WI...
Process Mining In Today’s Platforms Economy: Opportunities and Challenges (WI...Process Mining In Today’s Platforms Economy: Opportunities and Challenges (WI...
Process Mining In Today’s Platforms Economy: Opportunities and Challenges (WI...Wil van der Aalst
 
Event Logs: What kind of data does process mining require?
Event Logs: What kind of data does process mining require?Event Logs: What kind of data does process mining require?
Event Logs: What kind of data does process mining require?Wil van der Aalst
 
Configurable Declare: Designing Customizable Flexible Models
Configurable Declare: Designing Customizable Flexible ModelsConfigurable Declare: Designing Customizable Flexible Models
Configurable Declare: Designing Customizable Flexible ModelsWil van der Aalst
 
A Decade of Business Process Management Conferences: Reflections on a Develop...
A Decade of Business Process Management Conferences: Reflections on a Develop...A Decade of Business Process Management Conferences: Reflections on a Develop...
A Decade of Business Process Management Conferences: Reflections on a Develop...Wil van der Aalst
 
Process Mining: Understanding and Improving Desire Lines in Big Data
Process Mining: Understanding and Improving Desire Lines in Big DataProcess Mining: Understanding and Improving Desire Lines in Big Data
Process Mining: Understanding and Improving Desire Lines in Big DataWil van der Aalst
 
Business Process Configuration in the Cloud: How to Support and Analyze Multi...
Business Process Configuration in the Cloud: How to Support and Analyze Multi...Business Process Configuration in the Cloud: How to Support and Analyze Multi...
Business Process Configuration in the Cloud: How to Support and Analyze Multi...Wil van der Aalst
 
Discovering Concurrency: Learning (Business) Process Models from Examples
Discovering Concurrency: Learning (Business) Process Models from ExamplesDiscovering Concurrency: Learning (Business) Process Models from Examples
Discovering Concurrency: Learning (Business) Process Models from ExamplesWil van der Aalst
 
Distributed Process Discovery and Conformance Checking
Distributed Process Discovery and Conformance CheckingDistributed Process Discovery and Conformance Checking
Distributed Process Discovery and Conformance CheckingWil van der Aalst
 
Service Interaction: Patterns, Formalization, and Analysis
Service Interaction: Patterns, Formalization, and AnalysisService Interaction: Patterns, Formalization, and Analysis
Service Interaction: Patterns, Formalization, and AnalysisWil van der Aalst
 
Keynote Gartner Business Process Management Summit, February 2009, London
Keynote Gartner Business Process Management Summit, February 2009, London Keynote Gartner Business Process Management Summit, February 2009, London
Keynote Gartner Business Process Management Summit, February 2009, London Wil van der Aalst
 
Keynote on Process Mining at SSCI 2010 / CIDM 2011
Keynote on Process Mining at SSCI 2010 / CIDM 2011Keynote on Process Mining at SSCI 2010 / CIDM 2011
Keynote on Process Mining at SSCI 2010 / CIDM 2011Wil van der Aalst
 
Discovering Petri Nets: Evidence-Based Business Process Management
Discovering Petri Nets: Evidence-Based Business Process ManagementDiscovering Petri Nets: Evidence-Based Business Process Management
Discovering Petri Nets: Evidence-Based Business Process ManagementWil van der Aalst
 
TomTom for Business Process Managment (TomTom4BPM)
TomTom for Business Process Managment (TomTom4BPM)TomTom for Business Process Managment (TomTom4BPM)
TomTom for Business Process Managment (TomTom4BPM)Wil van der Aalst
 
Keynote at 18th International Conference on Cooperative Information Systems (...
Keynote at 18th International Conference on Cooperative Information Systems (...Keynote at 18th International Conference on Cooperative Information Systems (...
Keynote at 18th International Conference on Cooperative Information Systems (...Wil van der Aalst
 

Mehr von Wil van der Aalst (20)

Process Mining: BPM on Steroids (CPOs@BPM&O 2019 Keynote)
Process Mining: BPM on Steroids (CPOs@BPM&O 2019 Keynote)Process Mining: BPM on Steroids (CPOs@BPM&O 2019 Keynote)
Process Mining: BPM on Steroids (CPOs@BPM&O 2019 Keynote)
 
Everything You Always Wanted To Know About Petri Nets, But Were Afraid To Ask
Everything You Always Wanted To Know About Petri Nets, But Were Afraid To AskEverything You Always Wanted To Know About Petri Nets, But Were Afraid To Ask
Everything You Always Wanted To Know About Petri Nets, But Were Afraid To Ask
 
20 years of Process Mining Research (ICPM 2019 keynote)
20 years of Process Mining Research (ICPM 2019 keynote)20 years of Process Mining Research (ICPM 2019 keynote)
20 years of Process Mining Research (ICPM 2019 keynote)
 
Earth Movers’ Stochastic Conformance Checking
Earth Movers’ Stochastic Conformance CheckingEarth Movers’ Stochastic Conformance Checking
Earth Movers’ Stochastic Conformance Checking
 
Using Process Mining to Remove Operational Friction in Shared Services
Using Process Mining to Remove Operational Friction in Shared ServicesUsing Process Mining to Remove Operational Friction in Shared Services
Using Process Mining to Remove Operational Friction in Shared Services
 
Object-Centric Process Mining: Dealing With Divergence and Convergence in Eve...
Object-Centric Process Mining: Dealing With Divergence and Convergence in Eve...Object-Centric Process Mining: Dealing With Divergence and Convergence in Eve...
Object-Centric Process Mining: Dealing With Divergence and Convergence in Eve...
 
Process Mining In Today’s Platforms Economy: Opportunities and Challenges (WI...
Process Mining In Today’s Platforms Economy: Opportunities and Challenges (WI...Process Mining In Today’s Platforms Economy: Opportunities and Challenges (WI...
Process Mining In Today’s Platforms Economy: Opportunities and Challenges (WI...
 
Event Logs: What kind of data does process mining require?
Event Logs: What kind of data does process mining require?Event Logs: What kind of data does process mining require?
Event Logs: What kind of data does process mining require?
 
Configurable Declare: Designing Customizable Flexible Models
Configurable Declare: Designing Customizable Flexible ModelsConfigurable Declare: Designing Customizable Flexible Models
Configurable Declare: Designing Customizable Flexible Models
 
A Decade of Business Process Management Conferences: Reflections on a Develop...
A Decade of Business Process Management Conferences: Reflections on a Develop...A Decade of Business Process Management Conferences: Reflections on a Develop...
A Decade of Business Process Management Conferences: Reflections on a Develop...
 
Process Mining: Understanding and Improving Desire Lines in Big Data
Process Mining: Understanding and Improving Desire Lines in Big DataProcess Mining: Understanding and Improving Desire Lines in Big Data
Process Mining: Understanding and Improving Desire Lines in Big Data
 
Business Process Configuration in the Cloud: How to Support and Analyze Multi...
Business Process Configuration in the Cloud: How to Support and Analyze Multi...Business Process Configuration in the Cloud: How to Support and Analyze Multi...
Business Process Configuration in the Cloud: How to Support and Analyze Multi...
 
Discovering Concurrency: Learning (Business) Process Models from Examples
Discovering Concurrency: Learning (Business) Process Models from ExamplesDiscovering Concurrency: Learning (Business) Process Models from Examples
Discovering Concurrency: Learning (Business) Process Models from Examples
 
Distributed Process Discovery and Conformance Checking
Distributed Process Discovery and Conformance CheckingDistributed Process Discovery and Conformance Checking
Distributed Process Discovery and Conformance Checking
 
Service Interaction: Patterns, Formalization, and Analysis
Service Interaction: Patterns, Formalization, and AnalysisService Interaction: Patterns, Formalization, and Analysis
Service Interaction: Patterns, Formalization, and Analysis
 
Keynote Gartner Business Process Management Summit, February 2009, London
Keynote Gartner Business Process Management Summit, February 2009, London Keynote Gartner Business Process Management Summit, February 2009, London
Keynote Gartner Business Process Management Summit, February 2009, London
 
Keynote on Process Mining at SSCI 2010 / CIDM 2011
Keynote on Process Mining at SSCI 2010 / CIDM 2011Keynote on Process Mining at SSCI 2010 / CIDM 2011
Keynote on Process Mining at SSCI 2010 / CIDM 2011
 
Discovering Petri Nets: Evidence-Based Business Process Management
Discovering Petri Nets: Evidence-Based Business Process ManagementDiscovering Petri Nets: Evidence-Based Business Process Management
Discovering Petri Nets: Evidence-Based Business Process Management
 
TomTom for Business Process Managment (TomTom4BPM)
TomTom for Business Process Managment (TomTom4BPM)TomTom for Business Process Managment (TomTom4BPM)
TomTom for Business Process Managment (TomTom4BPM)
 
Keynote at 18th International Conference on Cooperative Information Systems (...
Keynote at 18th International Conference on Cooperative Information Systems (...Keynote at 18th International Conference on Cooperative Information Systems (...
Keynote at 18th International Conference on Cooperative Information Systems (...
 

Kürzlich hochgeladen

CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 

Kürzlich hochgeladen (20)

CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 

On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery

  • 1. On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery Joos Buijs Boudewijn van Dongen Wil van der Aalst http://www.win.tue.nl/coselog/
  • 2. Advances in Process Mining • Many process discovery and conformance checking algorithms and tools are available (cf. the various ProM packages). • Also commercial software based on these ideas: Disco (Fluxicon), Reflect (Futura/Perceptive), BPMOne (Pallas Athena/Perceptive), ARIS Process Performance Manager (Software AG), Interstage Automated Process Discovery (Fujitsu), QPR ProcessAnalyzer/Analysis (QPR Software), flow (fourspark), Discovery Analyst (StereoLOGIC), etc. • We applied process mining in over 100 organizations. More than 75 people involving more than 50 organizations created the Process Mining Manifesto in the context of the IEEE Task Force on Process Mining. Available in 13 languages PAGE 1
  • 3. Example Process Discovery (Vestia, Dutch housing agency, 208 cases, 5987 events) PAGE 2
  • 4. Example: Conformance Checking (WOZ objections Dutch municipality, 745 objections, 9583 event, f= 0.988) PAGE 3
  • 5. Challenge: Four Competing Quality Criteria “able to replay event log” “Occam’s razor” replay fitness simplicity process discovery generalization precision “not overfitting the log” “not underfitting the log” PAGE 4
  • 6. Example: one log four models b examine thoroughly g pay c compensation a examine e start register casually decide end # trace request h 455 acdeh d reject check ticket request 191 abdeg f reinitiate request 177 adceh N1 : fitness = +, precision = +, generalization = +, simplicity = + 144 abdeh 111 acdeg a c d e h 82 adceg start register examine check decide reject end request casually ticket request 56 adbeh N2 : fitness = -, precision = +, generalization = -, simplicity = + 47 acdefdbeh “able to replay event log” “Occam’s razor” 38 adbeg examine check thoroughly b d ticket g 33 acdefbdeh replay fitness simplicity pay compensation a 14 acdefbdeg start register examine c end 11 acdefdbeg request casually e f reinitiate h process decide request reject request 9 adcefcdeh discovery N3 : fitness = +, precision = -, generalization = +, simplicity = + 8 adcefdbeh 5 adcefbdeg a d c e g 3 acdefbdefdbeg generalization precision register request check ticket examine casually decide pay compensation 2 adcefdbeg a c d e g 2 adcefbdefbdeg “not overfitting the log” “not underfitting the log” register examine check decide pay request casually ticket compensation 1 adcefdbefbdeh a d c e h 1 adbefbdefdbeg register check examine decide reject request ticket casually request 1 adcefdbefcdefdbeg a c d e h 1391 start end register examine check decide reject request casually ticket request … (all 21 variants seen in the log) a b d e g register examine check decide pay request thoroughly ticket compensation a d b e h register check examine decide reject request ticket thoroughly request a b d e h register examine check decide reject request thoroughly ticket request PAGE 5 N4 : fitness = +, precision = +, generalization = -, simplicity = -
  • 7. # trace 455 acdeh Model N1 191 abdeg 177 adceh 144 abdeh 111 acdeg 82 adceg 56 adbeh b 47 acdefdbeh examine thoroughly 38 adbeg g 33 acdefbdeh pay c compensation 14 acdefbdeg a examine e 11 acdefdbeg start register casually decide end request 9 adcefcdeh h d reject 8 adcefdbeh check ticket request 5 adcefbdeg f reinitiate 3 acdefbdefdbeg request N1 : fitness = +, precision = +, generalization = +, simplicity = + 2 adcefdbeg 2 adcefbdefbdeg 1 adcefdbefbdeh 1 adbefbdefdbeg 1 adcefdbefcdefdbeg PAGE 6 1391
  • 8. # trace 455 acdeh Model N2 191 abdeg 177 adceh 144 abdeh 111 acdeg 82 adceg 56 adbeh 47 acdefdbeh 38 adbeg a c d e h 33 acdefbdeh start register examine check decide reject end 14 acdefbdeg request casually ticket request N2 : fitness = -, precision = +, generalization = -, simplicity = + 11 acdefdbeg 9 adcefcdeh 8 adcefdbeh 5 adcefbdeg 3 acdefbdefdbeg 2 adcefdbeg 2 adcefbdefbdeg 1 adcefdbefbdeh 1 adbefbdefdbeg 1 adcefdbefcdefdbeg PAGE 7 1391
  • 9. # trace 455 acdeh Model N3 191 abdeg 177 adceh 144 abdeh 111 acdeg 82 adceg 56 adbeh 47 acdefdbeh examine check thoroughly b d ticket g 38 adbeg pay 33 acdefbdeh compensation a 14 acdefbdeg start register examine end 11 acdefdbeg request casually c e f reinitiate reject h 9 adcefcdeh decide request request 8 adcefdbeh N3 : fitness = +, precision = -, generalization = +, simplicity = + 5 adcefbdeg 3 acdefbdefdbeg 2 adcefdbeg 2 adcefbdefbdeg 1 adcefdbefbdeh 1 adbefbdefdbeg 1 adcefdbefcdefdbeg PAGE 8 1391
  • 10. # trace 455 acdeh Model N4 191 abdeg 177 adceh 144 abdeh a d c e g 111 acdeg register check examine decide pay request ticket casually compensation 82 adceg a c d e g 56 adbeh register examine check decide pay request casually ticket compensation 47 acdefdbeh a d c e h 38 adbeg register check examine decide reject request ticket casually request 33 acdefbdeh a c d e h 14 acdefbdeg start end register examine check decide reject request casually ticket request 11 acdefdbeg … (all 21 variants seen in the log) 9 adcefcdeh 8 adcefdbeh 5 adcefbdeg a b d e g register examine check decide pay 3 acdefbdefdbeg request thoroughly ticket compensation 2 adcefdbeg a d b e h register check examine decide reject 2 adcefbdefbdeg request ticket thoroughly request 1 adcefdbefbdeh a b d e h register examine check decide reject 1 adbefbdefdbeg request thoroughly ticket request 1 adcefdbefcdefdbeg N4 : fitness = +, precision = +, generalization = -, simplicity = - PAGE 9 1391
  • 11. Another challenge: Huge search space M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M MM M M M M M M MM M M M M M M M M M M M M M M M M M M M M M M M M M M M M M MM M MM M M MM M M M M M M MM M MM M MM M MM MM MM M M M M M M M M M M M M M M M M M M M M M MM M M M M M M M M M M M M M M M M MM M M M M M M M M M M M M M M M M M MM M M M M M M M M MM M M M MM M M M M M M M MM M M M M MM M M M M M M MM M M M M M M M M M M M MM M MM M M MM M M M M M M M M M MM M M M MM M M M M M MM M MM MM M M M M M M M M M M M M M M MM M M M M M M M M M M M MM M M M M M M M MM M M M M MM M M M M M M M M M M MM M M M M M M M MM M M M M M M MM M M M MM M MM M M M M MM M M MM M MM M M M M MM M M M M PAGE 10
  • 12. … with just a few interesting candidates M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M M MM M M M M M MM M M M M M M M M M M M M M M M M M M M M M M M M MM M M M M M M M MM M MM M M M M M M MM M M M M MM M MM M MM MM M M M M MM M M M MM M M M M M M M M MM M M MM M M M M M M M M M M M M M M M M M MM M M M M M M MM M M M M M M M M M M M MM MM M M M M M MM M M M M M M M MM M M M M MM M M M M M MM M M M M M M M M M M M M MM M M M M M M M MM M M M MM M M MM M M M M MM M MM MM M M M M M M M M MM M M M M M M M M MM M M M M M M M M M M M M M MM M M M M M M M MM M M M M MM M M M M M M M M M M MM M M M M M M M M MM M M M M M MM M M M M M M M M M MM M M M MM M M M M M MM M MM M M M M M PAGE 11
  • 13. Two requirements 1. It should be possible to “able to replay event log” “Occam’s razor” replay fitness simplicity seamlessly balance the different quality criteria process discovery based on user-defined generalization precision preferences. “not overfitting the log” “not underfitting the log” 2. The algorithm should always return a "correct" process model and not waste time on model having deadlocks and other anomalies. PAGE 12
  • 14. Proposal: Evolutionary Tree Miner (ETM) • Process trees as representation (= limit search space to "good" models). • Genetic approach (= very flexible) • Fitness function uses all four criteria (= seamlessly balance the different "forces") PAGE 13
  • 15. Representational Bias: Process Trees → • Always sound because of the block structure A A → • Also Loop and OR operator ∧ E B C D X D A B B C A (BA)* E PAGE 14
  • 16. Petri Net Semantics (used for comparison and conformance checking only) A A B Sequence Parallellism B A B A Exclusive Choice A B B Or Choice Loop PAGE 15
  • 17. Steps of the Genetic ETM Algorithm Change Population No Create Return Measure Stop Initial Best Quality ? Yes Population Individual PAGE 16
  • 18. Population Change Elite Population i Population i+1 Selection Replace Crossover Mutation PAGE 17
  • 19. Four Metrics (see paper) “able to replay event log” “Occam’s razor” replay fitness simplicity process discovery generalization precision “not overfitting the log” “not underfitting the log” 1 = optimal 0 = very bad PAGE 18
  • 20. Example B E A C G D F A = send e-mail, B = check credit, C = calculate capacity, D = check system, E = accept, F = reject, G = send e-mail PAGE 19
  • 21. Conventional Algorithms (1/3) ("best effort" mapping to process trees to allow for comparison) alpha miner low fitness ILP miner low precision language-based region miner low fitness PAGE 20
  • 22. Conventional Algorithms (2/3) heuristic miner multi-phase miner PAGE 21
  • 23. Conventional Algorithms (3/3) genetic miner state-based region miner PAGE 22
  • 24. Often unsound result and no mechanism to seamlessly balance the four criteria PAGE 23
  • 25. Genetic Mining (ETM) While Considering Only One Criterion best value possible for this log ETM with weight zero to three out of four perspectives. PAGE 24
  • 26. Considering Replay Fitness and One Other Criterion ETM with weight zero to two out of four perspectives. PAGE 25
  • 27. Considering 3 of 4 Criteria replay fitness needs to have a larger weight PAGE 26
  • 28. Considering All Four Criteria with Emphasis on Fitness fitness has weight 10 PAGE 27
  • 29. Initial Model Versus Discovered Model simulated discovered by ETM B E A C G D F PAGE 28
  • 30. Real-Life Event Logs • Event log L0 is the event log used before. L0 contains 100 traces, 590 events and 7 activities. • Event Log L1 contains 105 traces, 743 events in total, with 6 different activities. • Event Log L2 contains 444 traces, 3.269 events in total, with 6 different activities. • Event Log L3 contains 274 traces, 1:582 events in total, with 6 different activities. Event logs L1, L2 and L3 are extracted from the information systems of municipalities participating in the CoSeLoG project (http://www.win.tue.nl/coselog/). PAGE 29
  • 31. Results If unsound , the sound behavior is approximated when creating the process tree. Equal weights for all criteria. PAGE 30
  • 32. Conclusion “able to replay event log” “Occam’s razor” replay fitness simplicity process discovery generalization precision “not overfitting the log” “not underfitting the log” • First algorithm that allows for balancing all four perspectives. • Genetic algorithm is very flexible, but also very slow. • Process trees only used internally (choose your favorite representation) • Future work: − Improve speed − Distribute PM tasks − Discover configurable process trees PAGE 34