An Evolutionary Perspective on
 Approximate RDF Query Answering



Christophe Guéret, Eyal Oren, Stefan Schlobach,
     Frank van Harmelen and Martijn Schut



         Vrije Universiteit, Amsterdam
Problem and context          Method proposed   Experimental results   Conclusion



 The next 30 minutes in 4 points...

              RDF
                      Data on the Web
                      Inconsistent, uncertain, heterogeneous, huge and growing!

              RDF Query answering
                      Finding data matching a criterion
                      ... but many queries are actually not satisfiable

              Approximate RDF Query answering
                      Finding some, almost valid, data

              The Evolutionary Perspective
                      Test different solutions
                      Progressive optimisation of the result



SUM 2008 - October 2, 2008                                                           2 / 24

       1    What’s the problem?
              Querying RDF datastores
              Standard techniques

       2    And Now for Something Completely Different
              Guessing the solution instead
              The way we do it

       3    Does it work?
              Evolution of the quality
              Some characteristics of this method

       4    TODO list



 Example

              RDF dataset
              <Ullman88> type Book .
              <Ullman88> label "Principles of Database and Knowledge-Base Systems" .
              <Ullman88> author b1 .
              b1 _1 ullman .
              ullman homepage <http://stanford.edu/~ullman/> .

              SPARQL query
              SELECT ?title WHERE {
                  ?publication type Book .
                  ?publication label ?title .
              }

              Expected answer
              ?title = "Principles of Database and Knowledge-Base Systems"

 Problem description

              Triple = subject, predicate, object
              Dataset = graph of triples
              Querying : find a pattern in the graph

              [Figure: a query and a graph [PSPARQL07]]

 Standard techniques

              Standard approach :
                 1    Find all the possible results for ?publication type Book
                              ?publication
                              <Ullman88>
                 2    Find all the possible results for ?publication label ?title
                              ?publication        ?title
                              <Ullman88>          "Principles of ..."
                 3    Do a join on the two tables and return the result
                              ?title = "Principles of ..."

              Fast thanks to the creation of indexes and query optimisation
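
The three steps above can be sketched in Python. This is only a minimal illustration under stated assumptions: it scans a plain list instead of using indexes, and the helper names (match_pattern, join, answer) are hypothetical, not taken from any particular triple store.

```python
# Toy dataset from the example slide, as (subject, predicate, object) tuples.
TRIPLES = [
    ("<Ullman88>", "type", "Book"),
    ("<Ullman88>", "label", "Principles of Database and Knowledge-Base Systems"),
    ("<Ullman88>", "author", "b1"),
    ("b1", "_1", "ullman"),
    ("ullman", "homepage", "<http://stanford.edu/~ullman/>"),
]

def match_pattern(pattern, triples):
    """Return one binding dict per triple matching the pattern.
    Terms starting with '?' are variables; other terms must match exactly."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.get(term, value) != value:
                    break  # same variable would get two different values
                binding[term] = value
            elif term != value:
                break      # constant does not match
        else:
            results.append(binding)
    return results

def join(left, right):
    """Nested-loop join: merge bindings that agree on their shared variables."""
    joined = []
    for a in left:
        for b in right:
            if all(a[v] == b[v] for v in a.keys() & b.keys()):
                joined.append({**a, **b})
    return joined

def answer(patterns, triples):
    """Steps 1 to 3: one result table per pattern, joined one after the other."""
    rows = [{}]
    for pattern in patterns:
        rows = join(rows, match_pattern(pattern, triples))
    return rows

QUERY = [("?publication", "type", "Book"), ("?publication", "label", "?title")]
titles = [row["?title"] for row in answer(QUERY, TRIPLES)]
print(titles)  # ['Principles of Database and Knowledge-Base Systems']
```

Note that a query with no exact match simply yields an empty list, which is the behaviour the next slide questions.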



 Motivation
              Standard techniques are designed to return results only when there are some
              Not designed for incomplete and approximate queries/answers
              Hard to distribute

              Approximate answers to precise queries
                      If the query is unsatisfiable, return the best almost-satisfying
                      solution found

              Precise answers to approximate queries
                      Return a subset of existing solutions instead of showing
                      them all

              Interactive querying
                      Use intermediate results to help the user improve the query

 Approach

              “I’m Feeling Lucky” approach :
                 1    Assign some random values to the variables
                              ?publication     =    <Ullman88>
                              ?title           =    Book
                 2    Verify if the solution is valid
                              Triple                          Is in the graph ?
                              <Ullman88> type Book                    yes
                              <Ullman88> label Book                   no
                 3    If the solution is OK, stop. Otherwise, try again with
                      something else

              Rely on membership testing (instead of lookup)
              The testing loop can be stopped at any time
              A result may satisfy part of the query
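
The guess-and-check loop above can be sketched as follows. This is a minimal sketch under assumptions: a Python set stands in for the membership test, candidate values are drawn from every term in the graph, and the names (feeling_lucky, satisfied, max_tries) are hypothetical.

```python
import random

TRIPLES = {
    ("<Ullman88>", "type", "Book"),
    ("<Ullman88>", "label", "Principles of Database and Knowledge-Base Systems"),
    ("<Ullman88>", "author", "b1"),
    ("b1", "_1", "ullman"),
    ("ullman", "homepage", "<http://stanford.edu/~ullman/>"),
}
# Assumption: any term of the graph is a possible value for any variable.
TERMS = sorted({term for triple in TRIPLES for term in triple})
QUERY = [("?publication", "type", "Book"), ("?publication", "label", "?title")]

def instantiate(pattern, binding):
    """Replace the variables of a pattern by their assigned values."""
    return tuple(binding.get(term, term) for term in pattern)

def satisfied(binding, query, triples):
    """Number of query patterns whose instantiated triple is in the graph."""
    return sum(instantiate(p, binding) in triples for p in query)

def feeling_lucky(query, triples, terms, max_tries, rng):
    variables = sorted({t for p in query for t in p if t.startswith("?")})
    best, best_score = None, -1
    for _ in range(max_tries):          # the loop can be stopped at any time
        guess = {v: rng.choice(terms) for v in variables}
        score = satisfied(guess, query, triples)
        if score > best_score:          # remember the best partial solution
            best, best_score = guess, score
        if score == len(query):         # fully valid solution: stop
            break
    return best, best_score

best, score = feeling_lucky(QUERY, TRIPLES, TERMS, max_tries=10_000,
                            rng=random.Random(0))
print(best, score)
```

The guess shown on the slide (?title = Book) scores 1 out of 2 with this counting, which is what "a result may satisfy part of the query" refers to.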



 Our choices

              Need to pay attention to two aspects
                 1    Each try should be a step closer to the solution
                              Random guessing may never end
                              Stopping the process at t + 1 should give better results
                              than at t
                 2    Testing a candidate solution must be fast
                              Will try a lot of solutions

              We made the following choices
                      Generation of solutions : Evolutionary algorithm
                      Verification of solutions : Bloom filter based testing
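
To illustrate the first requirement, here is a small evolutionary loop in Python. It is a sketch under assumptions and not the authors' algorithm: the fitness function (fraction of satisfied patterns), the single-variable mutation, the elitist selection and all names are hypothetical choices, and a Python set stands in for the Bloom filter test.

```python
import random

TRIPLES = {
    ("<Ullman88>", "type", "Book"),
    ("<Ullman88>", "label", "Principles of Database and Knowledge-Base Systems"),
    ("<Ullman88>", "author", "b1"),
    ("b1", "_1", "ullman"),
    ("ullman", "homepage", "<http://stanford.edu/~ullman/>"),
}
TERMS = sorted({term for triple in TRIPLES for term in triple})
QUERY = [("?publication", "type", "Book"), ("?publication", "label", "?title")]
VARIABLES = ["?publication", "?title"]

def fitness(candidate):
    """Fraction of query patterns whose instantiated triple is in the graph."""
    hits = sum(tuple(candidate.get(t, t) for t in p) in TRIPLES for p in QUERY)
    return hits / len(QUERY)

def mutate(candidate, rng):
    """Copy the candidate and reassign one randomly chosen variable."""
    child = dict(candidate)
    child[rng.choice(VARIABLES)] = rng.choice(TERMS)
    return child

def evolve(generations, population_size, rng):
    population = [{v: rng.choice(TERMS) for v in VARIABLES}
                  for _ in range(population_size)]
    history = []  # best fitness after each generation
    for _ in range(generations):
        offspring = [mutate(parent, rng) for parent in population]
        # Elitist selection: parents compete with offspring, so the best
        # candidate at generation t + 1 is never worse than at generation t.
        population = sorted(population + offspring, key=fitness,
                            reverse=True)[:population_size]
        history.append(fitness(population[0]))
    return population[0], history

best, history = evolve(generations=50, population_size=10, rng=random.Random(0))
print(best, history[-1])
```

Because the best candidate is always kept, stopping the loop later can only return an equal or better result.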

 Binary Bloom filters (1/2)

              Compact representation of information : a set of n = 8 bits

                                1      2    3      4   5   6     7     8

              Supports two operations
                      INSERT(KEY)   : Insert a key into the filter
                      CONTAINS(KEY) : Test for the presence of a key

              Use k = 3 hash functions to compute a set of bits from a key
                      HASH1(“Hello world”) = 8
                      HASH2(“Hello world”) = 6
                      HASH3(“Hello world”) = 3

 Binary Bloom filters (2/2)

              INSERT(“Hello world”)
                      Current OR “Hello world” = New
                      Bit-wise OR operation
                      Always successful (i.e. unlimited capacity)
                      Precision depends on the number of inserted elements m

              CONTAINS(“Bonjour !”)
                      Current AND “Bonjour !” = Test result
                      Bit-wise AND operation
                      A positive result can be a collision
                      p_error = (1 - e^{-km/n})^k
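
The two operations can be sketched in Python as below. This is a minimal sketch under assumptions: the k hash functions are simulated by salting SHA-256 with the hash index, the bit array is stored in one integer, and the error estimate uses the usual approximation with n bits, m inserted elements and k hash functions. The class and function names are hypothetical.

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, n_bits, k_hashes):
        self.n_bits = n_bits
        self.k_hashes = k_hashes
        self.bits = 0  # the n-bit array, stored as one Python integer

    def _mask(self, key):
        """Bit mask with the positions selected by the k hash functions."""
        mask = 0
        for i in range(self.k_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            position = int.from_bytes(digest[:8], "big") % self.n_bits
            mask |= 1 << position
        return mask

    def insert(self, key):
        # Bit-wise OR: always succeeds, whatever is already stored.
        self.bits |= self._mask(key)

    def contains(self, key):
        # Bit-wise AND: True when every bit of the key is set.
        # A True answer may be a collision with bits set by other keys.
        mask = self._mask(key)
        return (self.bits & mask) == mask

def false_positive_rate(n_bits, m_elements, k_hashes):
    """Approximate probability that CONTAINS wrongly answers True."""
    return (1 - math.exp(-k_hashes * m_elements / n_bits)) ** k_hashes

bloom = BloomFilter(n_bits=1024, k_hashes=3)
bloom.insert("Hello world")
print(bloom.contains("Hello world"))  # True: an inserted key is always found
print(bloom.contains("Bonjour !"))    # a True here would be a collision
print(false_positive_rate(8, 1, 3), false_positive_rate(8, 4, 3))
```

The last line shows how precision degrades as more elements are inserted into the same 8-bit filter.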

 A first (naive) approach

              Insert all the triples into a single Bloom filter.
                      INSERT (“<Ullman88>_type_Book”)
                      INSERT (“<Ullman88>_label_"Principles of ..."”)
                      ...
              Use the CONTAINS operation to verify a solution
                      CONTAINS (“<Ullman88>_type_Book”) ⇒ true
                      CONTAINS (“<Ullman88>_label_Book”) ⇒ false

              Not the best approach! Let’s see what happens in detail...
                      ?publication label ?title
                        → CONTAINS (“<Ullman88>_label_Book”)
                        → modify ?publication and ?title
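
The naive approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `BloomFilter` class, its sizes (1024 bits, 3 hash functions) and the SHA-256 based hashing are assumptions chosen for readability. Only the underscore-joined key format and the INSERT/CONTAINS operations come from the slide.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings (illustrative sizes, not tuned)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes positions by salting a SHA-256 digest.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def contains(self, key):
        # May return a false positive, never a false negative.
        return all(self.bits[pos] for pos in self._positions(key))


def triple_key(s, p, o):
    # Key format taken from the slide: terms joined with underscores.
    return f"{s}_{p}_{o}"


naive = BloomFilter()
naive.insert(triple_key("<Ullman88>", "type", "Book"))
naive.insert(triple_key("<Ullman88>", "label", '"Principles of ..."'))

# An inserted key is always found.
print(naive.contains(triple_key("<Ullman88>", "type", "Book")))
# A key that was never inserted is rejected, unless a false positive occurs.
print(naive.contains(triple_key("<Ullman88>", "label", "Book")))
```

A negative answer only says that the whole key is absent. It gives no hint about which of ?publication or ?title holds the wrong value, which is presumably why both have to be modified.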

 Graph parsing
              Every triple of the graph is inserted into 4 Bloom filters
                      Triple: <Ullman88> type Book

                      Filter   Key inserted
                      SPO      <Ullman88>_type_Book
                      SP       <Ullman88>_type
                      PO       type_Book
                      SO       <Ullman88>_Book

              Three domains are defined
                      S = { <Ullman88>, b1, ullman }
                      P = { type, label, author, _1, homepage }
                      O = { Book, "Principles of ...", b1, ullman, <http://...> }

              Each term is replaced by an integer (with a dictionary)
                      <Ullman88> → 46
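
A possible sketch of this parsing step is given below. It rests on several assumptions: plain Python sets stand in for the four Bloom filters, the `Dictionary` class and `index_graph` function are hypothetical names, identifiers are assigned from 0 in order of first appearance (the slide's value 46 is just an example), and keys are assumed to be built from the integer identifiers rather than from the original strings.

```python
from itertools import count


class Dictionary:
    """Maps each RDF term to a small integer, assigned on first sight."""

    def __init__(self):
        self._ids = {}
        self._next = count()

    def encode(self, term):
        if term not in self._ids:
            self._ids[term] = next(self._next)
        return self._ids[term]


def index_graph(triples):
    dictionary = Dictionary()
    # Plain sets stand in for the four Bloom filters of the slide.
    filters = {"spo": set(), "sp": set(), "po": set(), "so": set()}
    # The three domains: values that each position of a triple can take.
    domains = {"s": set(), "p": set(), "o": set()}
    for s, p, o in triples:
        si, pi, oi = (dictionary.encode(t) for t in (s, p, o))
        filters["spo"].add((si, pi, oi))
        filters["sp"].add((si, pi))
        filters["po"].add((pi, oi))
        filters["so"].add((si, oi))
        domains["s"].add(si)
        domains["p"].add(pi)
        domains["o"].add(oi)
    return dictionary, filters, domains


triples = [
    ("<Ullman88>", "type", "Book"),
    ("<Ullman88>", "label", '"Principles of ..."'),
]
dictionary, filters, domains = index_graph(triples)
print(sorted(filters["sp"]))
```

With the partial keys available, a candidate binding can be tested on a pair of positions at a time, so a failed check points at fewer variables than the single-filter approach does.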

 Evolutionary algorithm flowchart [Eiben2003]

              Set of populations + Set of operators

 Query parsing
              Definition of the chromosome for the individuals
                      ?publication1 ?publication2 ?title

              Creation of constraints to verify
                      Clause ?publication type Book .
                       bloom(spo | ?publication1 type Book)
                       bloom(sp  | ?publication1 type)
                       bloom(po  | type Book)   (no variable: removed because always true)
                      Clause ?publication label ?title .
                       bloom(spo | ?publication2 label ?title)
                       bloom(sp  | ?publication2 label)
                       bloom(po  | label ?title)
                       bloom(so  | ?publication2 ?title)
                      Equality constraint
                       equal(?publication1, ?publication2)
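
The query parsing step could look like the sketch below. The function name `parse_query` and the tuple encoding of constraints are hypothetical. The sketch also makes two assumptions that go beyond the slide: it emits all four projections (spo, sp, po, so) for every clause, so it produces an so constraint for the first clause that the slide does not list, and it treats "always true" as meaning "contains no variable".

```python
from collections import Counter


def is_var(term):
    return term.startswith("?")


def parse_query(clauses):
    """Build the chromosome and the constraints for a list of (s, p, o) clauses."""
    # Count in how many clauses each variable occurs.
    occurrences = Counter(t for clause in clauses for t in set(clause) if is_var(t))
    seen = Counter()
    chromosome, constraints, copies = [], [], {}

    for clause in clauses:
        local = {}  # variable -> the gene that represents it in this clause
        for term in clause:
            if is_var(term) and term not in local:
                if occurrences[term] > 1:
                    # A shared variable gets one gene per clause it occurs in.
                    seen[term] += 1
                    local[term] = f"{term}{seen[term]}"
                    copies.setdefault(term, []).append(local[term])
                else:
                    local[term] = term
                chromosome.append(local[term])
        s, p, o = (local.get(t, t) for t in clause)
        projections = (("spo", (s, p, o)), ("sp", (s, p)), ("po", (p, o)), ("so", (s, o)))
        for name, terms in projections:
            # A constraint without any variable does not depend on the candidate.
            if any(is_var(t) for t in terms):
                constraints.append(("bloom", name, terms))

    # Copies of the same variable must end up with the same value.
    for names in copies.values():
        for a, b in zip(names, names[1:]):
            constraints.append(("equal", a, b))
    return chromosome, constraints


query = [("?publication", "type", "Book"), ("?publication", "label", "?title")]
chromosome, constraints = parse_query(query)
print(chromosome)
for constraint in constraints:
    print(constraint)
```

Giving each occurrence of a shared variable its own gene lets every clause be checked separately, with the equality constraint tying the copies back together.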

 Evaluation of a candidate solution

              The solution is checked against all the constraints. Whenever a
              constraint is satisfied,
                      A global reward w is won
                      Each variable used is equally rewarded

              Rewards for: bloom(spo | ?publication2 label ?title)
                      reward(solution) += w
                      reward(?publication2) += w/2
                      reward(?title) += w/2
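
The reward scheme can be sketched as below, under stated assumptions: `evaluate` and the tuple encoding of constraints are hypothetical, the value `W = 1.0` is arbitrary, a plain set stands in for the Bloom filter, and equality constraints are assumed to be rewarded in the same way as Bloom constraints (the slide only shows the Bloom case).

```python
from collections import defaultdict

W = 1.0  # global reward per satisfied constraint (the value is an assumption)


def is_var(term):
    return term.startswith("?")


def evaluate(candidate, constraints, filters):
    """Return (total reward, per-variable rewards) for one candidate solution.

    candidate   maps variable names to terms,
    constraints are ("bloom", filter_name, terms) or ("equal", var_a, var_b),
    filters     maps filter names to containers supporting `in`.
    """
    total = 0.0
    per_variable = defaultdict(float)
    for constraint in constraints:
        if constraint[0] == "bloom":
            _, name, terms = constraint
            variables = [t for t in terms if is_var(t)]
            # Replace each variable by its candidate value, keep constants.
            bound = tuple(candidate.get(t, t) for t in terms)
            satisfied = bound in filters[name]
        else:  # equality constraint
            _, a, b = constraint
            variables = [a, b]
            satisfied = candidate[a] == candidate[b]
        if satisfied:
            total += W
            # The reward is shared equally among the variables used.
            for v in variables:
                per_variable[v] += W / len(variables)
    return total, dict(per_variable)


filters = {"spo": {("<Ullman88>", "label", '"Principles of ..."')}}
constraints = [("bloom", "spo", ("?publication2", "label", "?title"))]
candidate = {"?publication2": "<Ullman88>", "?title": '"Principles of ..."'}
total, rewards = evaluate(candidate, constraints, filters)
print(total, rewards)
```

The per-variable rewards are what the mutation operator uses later to pick the variable to change.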

 Creation of new individuals

              Select two individuals and do a one-point crossover
                      1. Randomly pick a pivot point
                      2. Swap the two parts

                      Before:   dblp:ullman   <Ullman88>    "Principles..."
                                <Ullman88>    dblp:ullman   _:b1
                      After:    dblp:ullman   <Ullman88>    _:b1
                                <Ullman88>    dblp:ullman   "Principles..."

              Mutate the least efficient variable
                      1. Select the variable with the lowest reward
                      2. Assign a random new value

                      Before:   dblp:ullman   <Ullman88>    "Principles..."
                      Rewards:  0             3×w           2×w
                      After:    <Ullman88>    <Ullman88>    "Principles..."
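
Both operators can be sketched in a few lines. The function names are hypothetical, and the sketch simplifies one point: it draws the new value from a single list passed as `domain`, whereas in the method each variable would presumably draw from the S, P or O domain that matches its position.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equally long chromosomes at a random pivot."""
    pivot = rng.randrange(1, len(parent_a))  # keeps at least one gene on each side
    child_a = parent_a[:pivot] + parent_b[pivot:]
    child_b = parent_b[:pivot] + parent_a[pivot:]
    return child_a, child_b


def mutate_least_rewarded(individual, rewards, domain, rng):
    """Give a random new value to the gene with the lowest reward."""
    worst = min(range(len(individual)), key=lambda i: rewards[i])
    mutated = list(individual)
    mutated[worst] = rng.choice(domain)  # may pick the old value again by chance
    return mutated


rng = random.Random(0)  # fixed seed so that runs are repeatable
a = ["dblp:ullman", "<Ullman88>", '"Principles..."']
b = ["<Ullman88>", "dblp:ullman", "_:b1"]
child_a, child_b = one_point_crossover(a, b, rng)
print(child_a, child_b)

domain = ["<Ullman88>", "dblp:ullman", "_:b1"]
mutated = mutate_least_rewarded(a, [0, 3, 2], domain, rng)
print(mutated)
```

Targeting the least rewarded variable, rather than a random one, is what makes the per-variable rewards of the evaluation step useful.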

       1    What’s the problem?
              Querying RDF datastores
              Standard techniques

       2    And Now for Something Completely Different
              Guessing the solution instead
              The way we do it

       3    Does it work?
              Evolution of the quality
              Some characteristics of this method

       4    TODO list

 Results on some (small) datasets
              Datasets: FOAF (15k triples) and DBLP (3M triples)
              Queries with, respectively, 4 and 11 different variables
              Average result for 200 individuals and 500 generations

              [Figure: two plots of fitness value against n-th generation, for generations 0 to 500]

              Solutions with maximum reward (52) are found for FOAF
              Not enough time for DBLP (max 319)

 Scalability & speed

              Low memory requirements
                      Only depends on the number of individuals and the size of
                      the Bloom filters

                      dataset    (a) parsing    (b) querying
                      FOAF             65 MB           15 MB
                      DBLP            230 MB          140 MB

                      Table: Average memory usage (mostly due to dictionary)

              Computation can be distributed
                      Candidate solutions are independent
                      The dictionary can be based on a DHT
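
To give a feeling for how small the Bloom filters themselves can be, the sketch below applies the standard sizing formulas for a Bloom filter with n items and target false positive rate p: m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The 1% rate is an assumption for illustration only, since the slides do not state the filter parameters, and `bloom_size` is a hypothetical helper.

```python
import math


def bloom_size(num_items, false_positive_rate):
    """Standard Bloom filter sizing: bits needed and number of hash functions."""
    bits = math.ceil(-num_items * math.log(false_positive_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / num_items * math.log(2)))
    return bits, hashes


for name, triples in (("FOAF", 15_000), ("DBLP", 3_000_000)):
    bits, hashes = bloom_size(triples, 0.01)  # 1% is an assumed target, not from the slides
    print(f"{name}: {bits / 8 / 1e6:.2f} MB per filter, {hashes} hash functions")
```

Under this assumption each filter needs under 10 bits per triple, which is in line with the table's remark that memory usage is mostly due to the dictionary.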

 Status and future work
              Current status
                      The search process can be slow to converge
                      Several parameters to tune (rewards, size of the population,
                      number of generations, ...)

              Current work
                 1    Improve benchmarking
                             Test with more queries and more datasets
                             Better study of the influence of the parameters
                 2    Improve evolution
                             Experiment with different types of crossover and mutation
                             Implement dynamic valuations for the rewards
                             Improve early results on the tabu search approach
                 3    Test other optimizers that are easy to parallelize and anytime
                             Swarm-based algorithm (PSO, ...) or another EA
                             CSP solver

Weitere ähnliche Inhalte

Was ist angesagt?

Combining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationCombining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationSonia Haiduc
 
Automatic Key Term Extraction from Spoken Course Lectures
Automatic Key Term Extraction from Spoken Course LecturesAutomatic Key Term Extraction from Spoken Course Lectures
Automatic Key Term Extraction from Spoken Course LecturesYun-Nung (Vivian) Chen
 
Automatic Key Term Extraction and Summarization from Spoken Course Lectures
Automatic Key Term Extraction and Summarization from Spoken Course LecturesAutomatic Key Term Extraction and Summarization from Spoken Course Lectures
Automatic Key Term Extraction and Summarization from Spoken Course LecturesYun-Nung (Vivian) Chen
 
Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...
Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...
Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...Shalin Hai-Jew
 
QALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebQALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebConstantin Orasan
 
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...Ahmed Magdy Ezzeldin, MSc.
 

Was ist angesagt? (8)

Combining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept LocationCombining IR with Relevance Feedback for Concept Location
Combining IR with Relevance Feedback for Concept Location
 
Automatic Key Term Extraction from Spoken Course Lectures
Automatic Key Term Extraction from Spoken Course LecturesAutomatic Key Term Extraction from Spoken Course Lectures
Automatic Key Term Extraction from Spoken Course Lectures
 
Automatic Key Term Extraction and Summarization from Spoken Course Lectures
Automatic Key Term Extraction and Summarization from Spoken Course LecturesAutomatic Key Term Extraction and Summarization from Spoken Course Lectures
Automatic Key Term Extraction and Summarization from Spoken Course Lectures
 
Extracting keywords from texts - Sanda Martincic Ipsic
Extracting keywords from texts - Sanda Martincic IpsicExtracting keywords from texts - Sanda Martincic Ipsic
Extracting keywords from texts - Sanda Martincic Ipsic
 
Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...
Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...
Setting Up a Qualitative or Mixed Methods Research Project in NVivo 10 to Cod...
 
2011 EASE - Motivation in Software Engineering: A Systematic Review Update
2011 EASE - Motivation in Software Engineering: A Systematic Review Update2011 EASE - Motivation in Software Engineering: A Systematic Review Update
2011 EASE - Motivation in Software Engineering: A Systematic Review Update
 
QALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic WebQALL-ME: Ontology and Semantic Web
QALL-ME: Ontology and Semantic Web
 
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
Arabic Question Answering: Challenges, Tasks, Approaches, Test-sets, Tools, A...
 

Andere mochten auch

Drugan Notes- Biological Perspective
Drugan Notes- Biological PerspectiveDrugan Notes- Biological Perspective
Drugan Notes- Biological PerspectiveKim Drugan
 
Ch. 2 -_the_biological_perspective
Ch. 2 -_the_biological_perspectiveCh. 2 -_the_biological_perspective
Ch. 2 -_the_biological_perspectivemcolon344
 
Every Crisis is Global, Social, Viral
Every Crisis is Global, Social, ViralEvery Crisis is Global, Social, Viral
Every Crisis is Global, Social, ViralGaurav Mishra
 
Learning styles from a multicultural perspective
Learning styles from a multicultural perspectiveLearning styles from a multicultural perspective
Learning styles from a multicultural perspectiveNamchalla LSS
 
Attitude Changes Everything
Attitude Changes EverythingAttitude Changes Everything
Attitude Changes Everythingpudding37
 
The bee effect: Action to effect change
The bee effect: Action to effect changeThe bee effect: Action to effect change
The bee effect: Action to effect changeAllen McClinton
 
Valuing ecosystem services: a biological perspective
Valuing ecosystem services: a biological perspectiveValuing ecosystem services: a biological perspective
Valuing ecosystem services: a biological perspectiveKent Holsinger
 
Managing Multicultural Individuals
Managing Multicultural IndividualsManaging Multicultural Individuals
Managing Multicultural IndividualsMargareta Heidt
 
Why Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspectiveWhy Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspectiveJames Hendler
 
C:\Multicultural Education Powerpoint
C:\Multicultural Education PowerpointC:\Multicultural Education Powerpoint
C:\Multicultural Education Powerpointnkiogima
 
Evolutionary perspective
Evolutionary perspectiveEvolutionary perspective
Evolutionary perspectivempape
 
our behaviour is the foundation of our attitude and self perception
our behaviour is the foundation of our attitude and self perceptionour behaviour is the foundation of our attitude and self perception
our behaviour is the foundation of our attitude and self perceptionParveen Bano
 
Theorizing the Future of Computer-Mediated Communication: The Changing Role o...
Theorizing the Future of Computer-Mediated Communication: The Changing Role o...Theorizing the Future of Computer-Mediated Communication: The Changing Role o...
Theorizing the Future of Computer-Mediated Communication: The Changing Role o...Jessica Vitak
 
Cognition, Learning, and Self-Tracking - Quantified Self 2011
Cognition, Learning, and Self-Tracking - Quantified Self 2011Cognition, Learning, and Self-Tracking - Quantified Self 2011
Cognition, Learning, and Self-Tracking - Quantified Self 2011nickwinter
 
ATTITUDE AND BEHAVIOUR
ATTITUDE AND BEHAVIOURATTITUDE AND BEHAVIOUR
ATTITUDE AND BEHAVIOURANTHONY ALU
 

Andere mochten auch (20)

Drugan Notes- Biological Perspective
Drugan Notes- Biological PerspectiveDrugan Notes- Biological Perspective
Drugan Notes- Biological Perspective
 
Ch. 2 -_the_biological_perspective
Ch. 2 -_the_biological_perspectiveCh. 2 -_the_biological_perspective
Ch. 2 -_the_biological_perspective
 
Every Crisis is Global, Social, Viral
Every Crisis is Global, Social, ViralEvery Crisis is Global, Social, Viral
Every Crisis is Global, Social, Viral
 
Learning styles from a multicultural perspective
Learning styles from a multicultural perspectiveLearning styles from a multicultural perspective
Learning styles from a multicultural perspective
 
2010 1 materialism1
2010 1 materialism12010 1 materialism1
2010 1 materialism1
 
Attitude Changes Everything
Attitude Changes EverythingAttitude Changes Everything
Attitude Changes Everything
 
The bee effect: Action to effect change
The bee effect: Action to effect changeThe bee effect: Action to effect change
The bee effect: Action to effect change
 
Valuing ecosystem services: a biological perspective
Valuing ecosystem services: a biological perspectiveValuing ecosystem services: a biological perspective
Valuing ecosystem services: a biological perspective
 
Managing Multicultural Individuals
Managing Multicultural IndividualsManaging Multicultural Individuals
Managing Multicultural Individuals
 
Inculcate Self Confidence & Self Belief
Inculcate Self Confidence & Self Belief Inculcate Self Confidence & Self Belief
Inculcate Self Confidence & Self Belief
 
Why Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspectiveWhy Watson Won: A cognitive perspective
Why Watson Won: A cognitive perspective
 
C:\Multicultural Education Powerpoint
C:\Multicultural Education PowerpointC:\Multicultural Education Powerpoint
C:\Multicultural Education Powerpoint
 
Evolutionary perspective
Evolutionary perspectiveEvolutionary perspective
Evolutionary perspective
 
our behaviour is the foundation of our attitude and self perception
our behaviour is the foundation of our attitude and self perceptionour behaviour is the foundation of our attitude and self perception
our behaviour is the foundation of our attitude and self perception
 
Theorizing the Future of Computer-Mediated Communication: The Changing Role o...
Theorizing the Future of Computer-Mediated Communication: The Changing Role o...Theorizing the Future of Computer-Mediated Communication: The Changing Role o...
Theorizing the Future of Computer-Mediated Communication: The Changing Role o...
 
Cognition, Learning, and Self-Tracking - Quantified Self 2011
Cognition, Learning, and Self-Tracking - Quantified Self 2011Cognition, Learning, and Self-Tracking - Quantified Self 2011
Cognition, Learning, and Self-Tracking - Quantified Self 2011
 
ATTITUDE AND BEHAVIOUR
ATTITUDE AND BEHAVIOURATTITUDE AND BEHAVIOUR
ATTITUDE AND BEHAVIOUR
 
Meth Powerpoint
Meth PowerpointMeth Powerpoint
Meth Powerpoint
 
Social cognition
Social cognitionSocial cognition
Social cognition
 
Social cognition
Social  cognitionSocial  cognition
Social cognition
 

Mehr von Christophe Guéret

HHAI June 2022 - KGs and Hybrid Intelligence
HHAI June 2022 - KGs and Hybrid IntelligenceHHAI June 2022 - KGs and Hybrid Intelligence
HHAI June 2022 - KGs and Hybrid IntelligenceChristophe Guéret
 
Informal presentation about RES
Evolutionary Perspective Approx RDF Query

  • 1. An Evolutionary Perspective on Approximate RDF Query Answering Christophe Guéret, Eyal Oren, Stefan Schlobach, Frank van Harmelen and Martijn Schut Vrije Universiteit, Amsterdam
  • 12. The next 30 minutes in 4 points... (SUM 2008, October 2, 2008)
      RDF
          Data on the Web
          Inconsistent, uncertain, heterogeneous, huge and growing!
      RDF query answering
          Finding data matching a criterion
          ... many queries are actually not satisfiable
      Approximate RDF query answering
          Finding some, almost valid, data
      The evolutionary perspective
          Test different solutions
          Progressive optimisation of the result
  • 13.
      1 What’s the problem?
          Querying RDF datastores
          Standard techniques
      2 And Now for Something Completely Different
          Guessing the solution instead
          The way we do it
      3 Does it work?
          Evolution of the quality
          Some characteristics of this method
      4 TODO list
  • 15. Example
      RDF dataset
          <Ullman88> type Book .
          <Ullman88> label "Principles of Database and Knowledge-Base Systems" .
          <Ullman88> author b1 .
          b1 _1 ullman .
          ullman homepage <http://stanford.edu/~ullman/> .
      SPARQL query
          SELECT ?title WHERE {
              ?publication type Book .
              ?publication label ?title .
          }
      Expected answer
          ?title = "Principles of Database and Knowledge-Base Systems"
  • 16. Problem description
      Triple = subject, predicate, object
      Dataset = graph of triples
      Querying: find a pattern in the graph
      [Figure: a query and a graph, from [PSPARQL07]]
  • 24. Standard techniques
      Standard approach:
      1 Find all the possible results for ?publication type Book
          ?publication
          <Ullman88>
      2 Find all the possible results for ?publication label ?title
          ?publication | ?title
          <Ullman88> | "Principles of ..."
      3 Do a join on the two tables and return the result
          ?title = "Principles of ..."
      Fast thanks to the creation of indexes and query optimisation
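The three steps of the standard approach can be sketched in a few lines of Python. This is only an illustrative toy, not the implementation of any real RDF store: the helper names `match` and `join` are hypothetical, the dataset is the example from the slides held in a plain list, and there are no indexes or query optimisation.

```python
# Example dataset from the slides as (subject, predicate, object) tuples.
TRIPLES = [
    ("<Ullman88>", "type", "Book"),
    ("<Ullman88>", "label", "Principles of Database and Knowledge-Base Systems"),
    ("<Ullman88>", "author", "b1"),
    ("b1", "_1", "ullman"),
    ("ullman", "homepage", "<http://stanford.edu/~ullman/>"),
]


def match(pattern, triples):
    """Steps 1 and 2: return one binding per triple matching the pattern.

    Terms starting with '?' are variables; everything else must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in a pattern must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def join(left, right):
    """Step 3: natural join, keeping pairs that agree on shared variables."""
    joined = []
    for a in left:
        for b in right:
            if all(a[v] == b[v] for v in a.keys() & b.keys()):
                joined.append({**a, **b})
    return joined


books = match(("?publication", "type", "Book"), TRIPLES)
labels = match(("?publication", "label", "?title"), TRIPLES)
titles = [row["?title"] for row in join(books, labels)]
print(titles)
```

Note that this style of evaluation yields an empty list as soon as one clause has no match, which is the behaviour the following slides argue against.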
  • 28. Motivation
      Standard techniques are
          Designed to return results only when there are some
          Not designed for incomplete and approximate queries/answers
          Hard to distribute
      Approximate answers to precise queries
          If the query is unsatisfiable, return the best almost-satisfying solution found
      Precise answers to approximate queries
          Return a subset of the existing solutions instead of showing them all
      Interactive querying
          Use intermediate results to help the user improve the query
  • 38. Approach
      “I’m Feeling Lucky” approach:
      1 Assign some random values to the variables
          ?publication = <Ullman88>
          ?title = Book
      2 Verify if the solution is valid
          Triple | Is in the graph?
          <Ullman88> type Book | yes
          <Ullman88> label Book | no
      3 If the solution is OK, stop. Otherwise, try again with something else
      Rely on membership testing (instead of lookup)
      The testing loop can be stopped at any time
      A result may satisfy part of the query
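The guess-and-check loop can be sketched as follows. This is a minimal sketch under stated assumptions: guesses are drawn uniformly at random (the pure "feeling lucky" variant, not the evolutionary algorithm introduced later), membership is tested against an exact Python set rather than a Bloom filter, and the names `satisfied` and `feeling_lucky` are hypothetical.

```python
import random

# Example dataset from the slides, stored as a set for membership testing.
TRIPLES = {
    ("<Ullman88>", "type", "Book"),
    ("<Ullman88>", "label", "Principles of Database and Knowledge-Base Systems"),
    ("<Ullman88>", "author", "b1"),
    ("b1", "_1", "ullman"),
    ("ullman", "homepage", "<http://stanford.edu/~ullman/>"),
}
QUERY = [("?publication", "type", "Book"), ("?publication", "label", "?title")]


def satisfied(assignment, query, triples):
    """Count the query clauses whose instantiated triple is in the graph."""
    count = 0
    for clause in query:
        triple = tuple(assignment.get(term, term) for term in clause)
        if triple in triples:
            count += 1
    return count


def feeling_lucky(query, triples, max_tries=10_000, seed=0):
    """Guess random assignments and keep the best one seen so far."""
    rng = random.Random(seed)
    terms = sorted({term for triple in triples for term in triple})
    variables = sorted({t for clause in query for t in clause if t.startswith("?")})
    best, best_score = None, -1
    for _ in range(max_tries):
        guess = {v: rng.choice(terms) for v in variables}
        score = satisfied(guess, query, triples)
        if score > best_score:
            best, best_score = guess, score
        if best_score == len(query):
            break  # every clause holds: a full solution
    return best, best_score


best, score = feeling_lucky(QUERY, TRIPLES)
print(best, score)
```

Lowering `max_tries` stops the loop earlier and returns the best partial assignment found so far, for instance one that satisfies only the `type` clause, as in the slide's example.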
  • 42. Our choices
      Need to pay attention to two aspects
      1 Each try should be a step closer to the solution
          Random guessing may never end
          Stopping the process at t + 1 should give better results than at t
      2 Testing a candidate solution must be fast
          Will try a lot of solutions
      We made the following choices
          Generation of solutions: evolutionary algorithm
          Verification of solutions: Bloom filter based testing
  • 43. Binary Bloom filters (1/2)
      Compact representation of information: a set of n = 8 bits, numbered 1 to 8
      Supports two operations
          INSERT(KEY): insert a key into the filter
          CONTAINS(KEY): test for the presence of a key
      Use k = 3 hash functions to compute a set of bits from a key
          HASH1(“HELLO WORLD”) = 8
          HASH2(“HELLO WORLD”) = 6
          HASH3(“HELLO WORLD”) = 3
  • 44. Binary Bloom filters (2/2)
      INSERT(“HELLO WORLD”)
          Bit-wise OR operation: current OR “Hello world” = new
          Always successful (i.e. unlimited capacity)
          Precision depends on the number of elements m
      CONTAINS(“BONJOUR !”)
          Bit-wise AND operation: current AND “Bonjour !” = test result
          A positive result can be a collision
          $p_{\text{error}} = (1 - e^{-km/n})^k$, with k hash functions, m inserted elements and n bits
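The two operations can be sketched as a small Python class. This is a hedged illustration rather than the filter used in the talk: the hash functions are derived from SHA-256 with a per-function prefix (an assumption made here for self-containment), the bit set is stored in a Python integer, and `false_positive_rate` evaluates the standard approximation for k hash functions, m inserted elements and n bits.

```python
import hashlib
import math


class BloomFilter:
    """Binary Bloom filter over n_bits bits using k hash functions."""

    def __init__(self, n_bits=8, k=3):
        self.n_bits = n_bits
        self.k = k
        self.bits = 0  # all bits start at 0

    def _mask(self, key):
        """Compute the set of bits for a key, one bit per hash function."""
        mask = 0
        for i in range(self.k):
            # Assumption: simulate k independent hashes by salting SHA-256.
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            position = int.from_bytes(digest[:8], "big") % self.n_bits
            mask |= 1 << position
        return mask

    def insert(self, key):
        """Bit-wise OR of the key's bits into the filter; never fails."""
        self.bits |= self._mask(key)

    def contains(self, key):
        """Bit-wise AND test; True may be a collision, False is certain."""
        mask = self._mask(key)
        return (self.bits & mask) == mask


def false_positive_rate(k, m_elements, n_bits):
    """Standard approximation (1 - e^(-k*m/n))^k of the collision probability."""
    return (1 - math.exp(-k * m_elements / n_bits)) ** k


bf = BloomFilter(n_bits=1024, k=3)
bf.insert("HELLO WORLD")
print(bf.contains("HELLO WORLD"))
print(false_positive_rate(k=3, m_elements=1, n_bits=8))
```

An inserted key is always reported as present, so the only possible error is a false positive, and its probability grows as more elements share the same bits.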
  • 48. A first (naive) approach
      Insert all the triples into a single Bloom filter.
          INSERT(“<Ullman88>_type_Book”)
          INSERT(“<Ullman88>_label_"Principles of ..."”)
          ...
      Use the CONTAINS operation to verify a solution
          CONTAINS(“<Ullman88>_type_Book”) ⇒ true
          CONTAINS(“<Ullman88>_label_Book”) ⇒ false
      Not the best approach! Let’s see what happens in detail...
          ?publication label ?title
          CONTAINS(“<Ullman88>_label_Book”) fails ⇒ modify ?publication and ?title
  • 55. Graph parsing
      Every triple of the graph is inserted into 4 Bloom filters. For <Ullman88> type Book:
        SPO: <Ullman88>_type_Book
        SP:  <Ullman88>_type
        PO:  type_Book
        SO:  <Ullman88>_Book
      Three domains are defined:
        S = <Ullman88>, b1, ullman
        P = type, label, author, _1, homepage
        O = Book, "Principles of ...", b1, ullman, <http://...>
      Each term is replaced by an integer (with a dictionary), e.g. <Ullman88> → 46
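A minimal sketch of this parsing step follows. Several things here are assumptions made for illustration: plain Python sets stand in for the four Bloom filters, identifiers are handed out sequentially from 0 (so they will not match the value 46 shown on the slide), and the names `Dictionary` and `parse_graph` are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


class Dictionary:
    """Maps each RDF term to an integer identifier and back."""

    def __init__(self) -> None:
        self.term_to_id: Dict[str, int] = {}
        self.id_to_term: List[str] = []

    def encode(self, term: str) -> int:
        if term not in self.term_to_id:
            self.term_to_id[term] = len(self.id_to_term)
            self.id_to_term.append(term)
        return self.term_to_id[term]


def parse_graph(triples: Iterable[Triple]):
    dictionary = Dictionary()
    # Assumption: sets stand in for the SPO, SP, PO and SO Bloom filters.
    filters: Dict[str, Set[tuple]] = {"spo": set(), "sp": set(), "po": set(), "so": set()}
    domains: Dict[str, Set[int]] = {"s": set(), "p": set(), "o": set()}
    for s, p, o in triples:
        si, pi, oi = dictionary.encode(s), dictionary.encode(p), dictionary.encode(o)
        filters["spo"].add((si, pi, oi))
        filters["sp"].add((si, pi))
        filters["po"].add((pi, oi))
        filters["so"].add((si, oi))
        # Each position keeps its own domain of candidate values.
        domains["s"].add(si)
        domains["p"].add(pi)
        domains["o"].add(oi)
    return filters, domains, dictionary


TRIPLES = [
    ("<Ullman88>", "type", "Book"),
    ("<Ullman88>", "label", '"Principles of ..."'),
]
filters, domains, dictionary = parse_graph(TRIPLES)
print(dictionary.term_to_id)
print(domains)
```

Keeping the partial keys (SP, PO, SO) is what later allows a candidate solution to be rewarded for being partly right.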
  • 56. Evolutionary algorithm flowchart [Eiben2003]: Set of populations + Set of operators
  • 62. Query parsing
      Definition of the chromosome for the individuals:
        ?publication1 ?publication2 ?title
      Creation of constraints to verify:
        Clause ?publication type Book .
          bloom(spo | ?publication1 type Book)
          bloom(sp | ?publication1 type)
          bloom(po | type Book)   (removed because always true)
        Clause ?publication label ?title .
          bloom(spo | ?publication2 label ?title)
          bloom(sp | ?publication2 label)
          bloom(po | label ?title)
          bloom(so | ?publication2 ?title)
        Equality constraint
          equal(?publication1, ?publication2)
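The translation from query clauses to a chromosome and a list of constraints might look like the sketch below. It is an illustration under assumptions: the function `parse_query` and the tuple encoding of constraints are hypothetical, and the sketch emits an `so` constraint for every clause, whereas the slide lists none for the first clause.

```python
from collections import Counter
from typing import Dict, List, Tuple

Clause = Tuple[str, str, str]
PROJECTIONS = {"spo": (0, 1, 2), "sp": (0, 1), "po": (1, 2), "so": (0, 2)}


def is_var(term: str) -> bool:
    return term.startswith("?")


def parse_query(clauses: List[Clause]):
    # Number of clauses in which each variable occurs.
    occurrences = Counter(t for c in clauses for t in set(c) if is_var(t))
    seen: Counter = Counter()
    copies: Dict[str, List[str]] = {}
    chromosome: List[str] = []
    constraints: List[tuple] = []
    for clause in clauses:
        local: Dict[str, str] = {}
        renamed: List[str] = []
        for term in clause:
            # A variable shared by several clauses gets one copy per clause.
            if is_var(term) and occurrences[term] > 1:
                if term not in local:
                    seen[term] += 1
                    local[term] = f"{term}{seen[term]}"
                    copies.setdefault(term, []).append(local[term])
                term = local[term]
            renamed.append(term)
        for term in renamed:
            if is_var(term) and term not in chromosome:
                chromosome.append(term)
        for name, idx in PROJECTIONS.items():
            terms = tuple(renamed[i] for i in idx)
            # Checks without any variable do not depend on the candidate; drop them.
            if any(is_var(t) for t in terms):
                constraints.append(("bloom", name, terms))
    # Copies of the same variable must end up with the same value.
    for names in copies.values():
        for a, b in zip(names, names[1:]):
            constraints.append(("equal", a, b))
    return chromosome, constraints


chromosome, constraints = parse_query([
    ("?publication", "type", "Book"),
    ("?publication", "label", "?title"),
])
print(chromosome)
for constraint in constraints:
    print(constraint)
```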
  • 63. Evaluation of a candidate solution
      The solution is checked against all the constraints. Whenever one is satisfied:
        A global reward w is won
        Each variable used is equally rewarded
      Rewards for bloom(spo | ?publication2 label ?title):
        reward(solution) += w
        reward(?publication2) += w/2
        reward(?title) += w/2
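The reward scheme can be sketched as follows. This is a hedged illustration: sets stand in for the Bloom filters, the function `evaluate` and the constraint tuples are hypothetical, and the sketch assumes that a constraint with n variables gives each of them w/n and that equality constraints are rewarded like any other constraint.

```python
from typing import Dict, List, Tuple

TRIPLES = [
    ("<Ullman88>", "type", "Book"),
    ("<Ullman88>", "label", '"Principles of ..."'),
]
# Assumption: exact sets replace the Bloom filters of the talk.
FILTERS = {
    "spo": {(s, p, o) for s, p, o in TRIPLES},
    "sp": {(s, p) for s, p, o in TRIPLES},
    "po": {(p, o) for s, p, o in TRIPLES},
    "so": {(s, o) for s, p, o in TRIPLES},
}
CONSTRAINTS = [
    ("bloom", "spo", ("?publication1", "type", "Book")),
    ("bloom", "spo", ("?publication2", "label", "?title")),
    ("equal", "", ("?publication1", "?publication2")),
]


def evaluate(solution: Dict[str, str], constraints: List[tuple],
             filters: Dict[str, set], w: float = 1.0) -> Tuple[float, Dict[str, float]]:
    total = 0.0
    rewards = {var: 0.0 for var in solution}
    for kind, name, terms in constraints:
        variables = [t for t in terms if t.startswith("?")]
        # Replace each variable by its candidate value; constants stay as they are.
        values = tuple(solution.get(t, t) for t in terms)
        if kind == "bloom":
            satisfied = values in filters[name]
        else:  # "equal": all copies must carry the same value
            satisfied = len(set(values)) == 1
        if satisfied:
            total += w  # global reward for the solution
            for var in variables:
                rewards[var] += w / len(variables)  # equal share per variable
    return total, rewards


good = {"?publication1": "<Ullman88>", "?publication2": "<Ullman88>",
        "?title": '"Principles of ..."'}
bad = dict(good, **{"?title": "Book"})
print(evaluate(good, CONSTRAINTS, FILTERS))
print(evaluate(bad, CONSTRAINTS, FILTERS))
```

The per-variable rewards are what the mutation step uses to decide which variable to change.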
  • 64. Creation of new individuals
      Select two individuals and do a one-point crossover:
        Randomly pick a pivot point
          dblp:ullman   <Ullman88>    | "Principles..."
          <Ullman88>    dblp:ullman   | _:b1
        Swap the two parts
          dblp:ullman   <Ullman88>    | _:b1
          <Ullman88>    dblp:ullman   | "Principles..."
      Mutate the least efficient variable:
        Select the variable with the lowest reward
          dblp:ullman (0)   <Ullman88> (3×w)   "Principles..." (2×w)
        Assign a random new value
          <Ullman88>   <Ullman88>   "Principles..."
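Both operators are short enough to sketch directly. The function names, the optional `pivot` argument, and the use of a single shared list of candidate values for mutation are assumptions of this sketch; the talk does not specify how the replacement value is drawn.

```python
import random
from typing import List, Optional, Sequence, Tuple


def one_point_crossover(parent_a: Sequence[str], parent_b: Sequence[str],
                        rng: random.Random,
                        pivot: Optional[int] = None) -> Tuple[List[str], List[str]]:
    # Pick a pivot strictly inside the chromosome, then swap the tails.
    if pivot is None:
        pivot = rng.randrange(1, len(parent_a))
    child_a = list(parent_a[:pivot]) + list(parent_b[pivot:])
    child_b = list(parent_b[:pivot]) + list(parent_a[pivot:])
    return child_a, child_b


def mutate_weakest(individual: Sequence[str], rewards: Sequence[float],
                   candidates: Sequence[str], rng: random.Random) -> List[str]:
    # The variable with the lowest reward is the one that gets a new value.
    weakest = min(range(len(individual)), key=lambda i: rewards[i])
    child = list(individual)
    child[weakest] = rng.choice(candidates)
    return child


rng = random.Random(0)
a = ["dblp:ullman", "<Ullman88>", '"Principles..."']
b = ["<Ullman88>", "dblp:ullman", "_:b1"]

# Pivot fixed after the second gene to reproduce the example from the slide.
child_a, child_b = one_point_crossover(a, b, rng, pivot=2)
print(child_a)
print(child_b)

# Rewards 0, 3w, 2w (with w = 1): the first variable is the one replaced.
mutant = mutate_weakest(a, [0.0, 3.0, 2.0], ["<Ullman88>", "_:b1"], rng)
print(mutant)
```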
  • 65.
      1 What’s the problem?
          Querying RDF datastores
          Standard techniques
      2 And Now for Something Completely Different
          Guessing the solution instead
          The way we do it
      3 Does it work?
          Evolution of the quality
          Some characteristics of this method
      4 TODO list
  • 66. Results on some (small) datasets
      Database: FOAF (15k triples) and DBLP (3M triples)
      Queries with, respectively, 4 and 11 different variables
      Average result for 200 individuals and 500 generations
      [Two plots: fitness value against n-th generation, 0 to 500]
      Solutions with maximum reward (52) are found for FOAF
      Not enough time for DBLP (max 319)
  • 67. Scalability & speed
      Low memory requirements
        Only depends on the number of individuals and the size of the Bloom filters
          dataset   (a) parsing   (b) querying
          FOAF      65 MB         15 MB
          DBLP      230 MB        140 MB
        Table: Average memory usage (mostly due to dictionary)
      Computation can be distributed
        Candidate solutions are independent
        The dictionary can be based on a DHT
  • 73. Status and future work
      Current status
        The search process can be slow to converge
        Several parameters to tune (rewards, size of the population, number of generations, ...)
      Current work
        1 Improve benchmarking
            Test with more queries and more datasets
            Better study of the influence of the parameters
        2 Improve evolution
            Experiment with different types of crossover and mutation
            Implement dynamic valuations for the rewards
            Improve early results on the tabu search approach
        3 Test other optimizers that are easy to parallelize and anytime
            Swarm-based algorithm (PSO, ...) or another EA
            CSP solver