Approximate and Incremental Processing of
Complex Queries against the Web of Data
Thanh Tran, Günter Ladwig, Andreas Wagner
DEXA 2011


Institute of Applied Informatics and Formal Description Methods (AIFB)




KIT – University of the State of Baden-Württemberg and
National Large-scale Research Center of the Helmholtz Association        www.kit.edu
Contents




       Introduction
       Overview
       Approximate & Incremental Processing
               Entity Search
               Approximate Structure Matching
               Structure-based Result Refinement and Computation
       Evaluation
       Conclusion




2    August 31st, 2011   DEXA 2011, Toulouse, France                  Institute of Applied Informatics and Formal Description Methods (AIFB)
INTRODUCTION


Introduction – Data Model
    Resource Description Framework (RDF)


    [Figure: example RDF data graph. Nodes: p1 to p5, a1, a2, c1, c2, i1, i2, u1, u2. Relation edges: authorOf, supervises, knows, worksAt, partOf, conference. Attribute edges: name (values P2, P5, U1).]
Introduction – Query Model
    Basic Graph Patterns

       Conjunctive queries over RDF data: graph pattern matching


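       [Figure: example query graph over the variables w, x, y, z, u, v. Relation edges: supervise (w, x), worksAt (x, z), partOf (z, u), author (x, y), conf (y, v). Attribute edges: age 29 on x, name AIFB on z, name KIT on u, name ICDE on v.]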
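
To make graph pattern matching concrete, here is a minimal sketch that evaluates a basic graph pattern over a handful of triples by naive backtracking. The triples only loosely follow the data graph figure; the edge directions, the `?` prefix for variables and the `match` helper are assumptions made for illustration. This is the plain exact-matching baseline, not the approximate technique proposed in the talk.

```python
# Toy data graph as (subject, predicate, object) triples (illustrative only).
DATA = {
    ("p1", "worksAt", "i1"),
    ("p1", "authorOf", "a1"),
    ("p3", "worksAt", "i2"),
    ("i1", "name", "AIFB"),
    ("i1", "partOf", "u1"),
    ("u1", "name", "KIT"),
}


def is_var(term):
    """Variables are written with a leading '?' (an assumed convention)."""
    return term.startswith("?")


def match(patterns, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in DATA:
        candidate = dict(binding)
        ok = True
        for q, d in zip(first, triple):
            if is_var(q):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(q, d) != d:
                    ok = False
                    break
            elif q != d:
                ok = False
                break
        if ok:
            yield from match(rest, candidate)


# A fragment of the example query: who works at a part of KIT named AIFB?
QUERY = [
    ("?x", "worksAt", "?z"),
    ("?z", "name", "AIFB"),
    ("?z", "partOf", "?u"),
    ("?u", "name", "KIT"),
]
results = list(match(QUERY))
print(results)
```

Every pattern adds a join over the data, which is the cost the rest of the talk tries to defer.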




Contribution

       Techniques for matching (basic) query patterns against graph-
       structured data have limits
       We might wish to trade completeness and exactness for
       responsiveness




             Our approach allows an “affordable” computation of an initial set
             of approximate results, which can be incrementally refined as
             needed.




Contribution – Pipeline Overview

       Pipeline of operations where approximate results are refined
       incrementally

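       [Figure: pipeline in which intermediate, approximate results are passed from stage to stage: Entity Search, Approximate Structure Matching, Structure-based Result Refinement, Structure-based Answer Computation. Supporting indexes shown beneath the stages: Entity & Neighborhood Index, Structure Index, Relation Index.]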





       ENTITY SEARCH


Entity Search

       Entity index
               Stores attribute edges of the data graph
               Enables lookup of entities by attribute and value
       Entity search
               Obtains candidate bindings for all variables in the query that have
               attribute edges
               Does not consider structure (i.e., relations between entities)
       Query decomposition and transformation
               Decompose query into entity queries to create a transformed
               query
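
The entity index lookup described above can be sketched as follows. This is a minimal illustration, assuming a dictionary that maps each (attribute, value) pair to the entities carrying it; the names `build_entity_index` and `entity_search` and the sample attribute edges are hypothetical and not taken from the slides.

```python
from collections import defaultdict

# Attribute edges of a toy data graph: (entity, attribute, value).
ATTRIBUTE_EDGES = [
    ("p1", "age", "29"),
    ("p3", "age", "29"),
    ("p2", "age", "45"),
    ("i1", "name", "AIFB"),
    ("u1", "name", "KIT"),
    ("c1", "name", "ICDE"),
]


def build_entity_index(edges):
    """Map each (attribute, value) pair to the entities that carry it."""
    index = defaultdict(set)
    for entity, attribute, value in edges:
        index[(attribute, value)].add(entity)
    return index


def entity_search(index, entity_query):
    """Return entities that satisfy every (attribute, value) constraint."""
    candidate_sets = [index.get(pair, set()) for pair in entity_query]
    return set.intersection(*candidate_sets) if candidate_sets else set()


index = build_entity_index(ATTRIBUTE_EDGES)
x_candidates = entity_search(index, [("age", "29")])
z_candidates = entity_search(index, [("name", "AIFB")])
print(x_candidates, z_candidates)
```

Each variable is looked up independently here, so relations between entities play no role yet, which matches the "does not consider structure" point above.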




Query Decomposition & Transformation


        [Figure: the example query graph, repeated from the Query Model slide.]

        Identify entity queries
        Breadth-first search starting from a random variable


Query Decomposition & Transformation
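        [Figure: the example query graph before and after collapsing entity queries.]

        Collapse entity queries: attribute edges are folded into the node of their variable
                x: age 29
                z: name AIFB
                u: name KIT
                v: name ICDE
        Remaining relation edges: supervise (w, x), worksAt (x, z), partOf (z, u), author (x, y), conf (y, v)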
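
The decomposition step can be sketched in a few lines. The sketch below assumes the query is given as triple patterns with `?`-prefixed variables and that the edge directions are as written; both are illustrative choices. The start variable of the breadth-first traversal is fixed here to keep the output reproducible, whereas the slides pick it at random.

```python
from collections import defaultdict, deque

# The example query as triple patterns (edge directions are assumed).
QUERY = [
    ("?w", "supervise", "?x"),
    ("?x", "age", "29"),
    ("?x", "worksAt", "?z"),
    ("?x", "author", "?y"),
    ("?y", "conf", "?v"),
    ("?z", "name", "AIFB"),
    ("?z", "partOf", "?u"),
    ("?u", "name", "KIT"),
    ("?v", "name", "ICDE"),
]


def is_var(term):
    return term.startswith("?")


def decompose(query):
    """Split a query into per-variable entity queries and relation edges."""
    entity_queries = defaultdict(list)
    relation_edges = []
    for s, p, o in query:
        if is_var(o):
            relation_edges.append((s, p, o))   # edge between two variables
        else:
            entity_queries[s].append((p, o))   # attribute edge, collapsed into s
    return dict(entity_queries), relation_edges


def bfs_order(relation_edges, start):
    """Visit variables breadth-first along relation edges, ignoring direction."""
    neighbors = defaultdict(list)
    for s, _, o in relation_edges:
        neighbors[s].append(o)
        neighbors[o].append(s)
    order, seen, queue = [], {start}, deque([start])
    while queue:
        var = queue.popleft()
        order.append(var)
        for nxt in neighbors[var]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


entity_queries, relation_edges = decompose(QUERY)
order = bfs_order(relation_edges, "?x")
print(entity_queries)
print(order)
```

Variables `?w` and `?y` end up without an entity query, since they have no attribute edges in the example.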

Entity Search Results

        Use entity index to obtain bindings for all entity queries in
        transformed query
        Entity queries are necessary conditions, but not sufficient
        Final results will be a subset

        Candidate bindings:
                x      z      u      v
                p1     i1     u1     c1
                p3     i1     u1     c1
                p5     i1     u1     c1
                p6     i1     u1     c1

        [Figure: the transformed query graph with collapsed entity queries.]







       APPROXIMATE STRUCTURE
       MATCHING

Approximate Structure Matching

        Only entity parts of the query have been matched
        Relation edges have yet to be processed
        Instead of performing exact equijoins we propose to perform a
        neighborhood join
           The k-neighborhood of an entity e is the set of entities in the data graph
           that can be reached from e via a path of relation edges of length k or less.

        Neighborhood join allows us to check whether two entities are
        connected via relation edges (but not which ones)
           A neighborhood join between two sets of entities E1, E2 is an equijoin
           between all pairs e1 ∈ E1, e2 ∈ E2 where e1 and e2 are considered
           equivalent if the intersection of their k-neighborhood is non-empty.

        Again: necessary, but not sufficient
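
The two definitions above can be written down directly with exact sets. This is a minimal sketch under two assumptions that the slides leave open: relation edges are traversed in both directions, and an entity belongs to its own k-neighborhood (path of length 0). The toy edges and function names are hypothetical.

```python
from collections import defaultdict

# Relation edges of a toy data graph (illustrative only).
RELATION_EDGES = [
    ("p1", "worksAt", "i1"),
    ("i1", "partOf", "u1"),
    ("p3", "worksAt", "i2"),
    ("i2", "partOf", "u2"),
]


def build_adjacency(edges):
    """Undirected adjacency over relation edges (direction is ignored)."""
    adjacency = defaultdict(set)
    for s, _, o in edges:
        adjacency[s].add(o)
        adjacency[o].add(s)
    return adjacency


def k_neighborhood(adjacency, entity, k):
    """Entities reachable from `entity` in at most k hops, including itself."""
    reached = {entity}
    frontier = {entity}
    for _ in range(k):
        frontier = {n for e in frontier for n in adjacency.get(e, ())} - reached
        reached |= frontier
    return reached


def neighborhood_join(adjacency, e1_set, e2_set, k):
    """Pair up entities whose k-neighborhoods share at least one entity."""
    hoods = {e: k_neighborhood(adjacency, e, k) for e in e1_set | e2_set}
    return {(e1, e2) for e1 in e1_set for e2 in e2_set if hoods[e1] & hoods[e2]}


adjacency = build_adjacency(RELATION_EDGES)
pairs = neighborhood_join(adjacency, {"p1", "p3"}, {"i1"}, k=1)
print(pairs)
```

In this toy graph the candidate pair (p3, i1) is dropped because the two entities are not connected within one hop. The join says nothing about which relation connects p1 and i1, so the surviving pair is still only a candidate.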

Neighborhood Join via Bloom Filters

        We store the k-neighborhood of each entity as a Bloom filter
        Bloom filter
                Space-efficient, probabilistic data structure for set membership tests
                False positives are possible (false negatives are not)
        We refine the results of the previous step
        To perform a neighborhood join between bindings E1, E2
                Load Bloom filters for one set of entities, say E1
                In a nested-loop manner, check whether entities in E2 are contained in
                these Bloom filters
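
A Bloom-filter-based variant of the join might look like the sketch below. The filter size, the number of hash functions, the SHA-256-based hashing and the precomputed neighborhoods are all assumptions chosen for illustration; the slides do not state how the filters are parameterized.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; a Python int serves as the bit array."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


# Hypothetical precomputed k-neighborhoods of the entities bound to x.
NEIGHBORHOODS = {
    "p1": {"i1", "a1"},
    "p3": {"i2", "a2"},
}


def build_filters(neighborhoods):
    """Store each entity's neighborhood in its own Bloom filter."""
    filters = {}
    for entity, hood in neighborhoods.items():
        bloom = BloomFilter()
        for neighbor in hood:
            bloom.add(neighbor)
        filters[entity] = bloom
    return filters


def bloom_neighborhood_join(filters, e1_set, e2_set):
    """Nested loop: keep (e1, e2) if e2 may lie in e1's neighborhood."""
    return {(e1, e2) for e1 in e1_set for e2 in e2_set if e2 in filters[e1]}


filters = build_filters(NEIGHBORHOODS)
pairs = bloom_neighborhood_join(filters, {"p1", "p3"}, {"i1"})
print(("p1", "i1") in pairs)
```

The pair (p1, i1) is always kept because Bloom filters have no false negatives. The pair (p3, i1) is very likely dropped, but a false positive could let it through, which is why later stages still have to verify the result.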




Neighborhood Join via Bloom Filters
        [Figure: the example query graph with the k=1 and k=2 neighborhoods of x marked.]

        Load Bloom filters for entities bound to x
        Check whether entities bound to w, y, z are in the neighborhood
        of x
        When k=2, Bloom filters for x also cover u and v




       STRUCTURE-BASED RESULT
       REFINEMENT

Structure-based Result Refinement

        From ASM (approximate structure matching) we know that entities in
        intermediate results are connected: necessary, but not sufficient

        With structure-based result refinement we find out whether they
        are connected via paths captured by query atoms
        Query is matched against a structure index graph
                Bisimulation-based summary of data graph that captures structural
                information
                Nodes in the data graph with the same “structure” are grouped
                together
                Much smaller than the data graph
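
One way to build such a summary is naive partition refinement, sketched below. It assumes forward bisimulation (nodes are grouped by their outgoing edge labels and the groups those edges lead to); the slides do not say which bisimulation variant is used, and the toy graph and function names are hypothetical.

```python
NODES = ["p1", "p3", "p5", "i1", "i2", "u1", "u2"]
EDGES = [
    ("p1", "worksAt", "i1"),
    ("p3", "worksAt", "i2"),
    ("p5", "worksAt", "u1"),
    ("i1", "partOf", "u1"),
    ("i2", "partOf", "u2"),
]


def bisimulation_partition(nodes, edges):
    """Refine one initial block until nodes in a block have equal signatures."""
    block_of = {n: 0 for n in nodes}
    while True:
        ids = {}
        new_block_of = {}
        for n in sorted(nodes):
            # Signature: current block plus (label, target block) of outgoing edges.
            outgoing = frozenset((p, block_of[o]) for s, p, o in edges if s == n)
            signature = (block_of[n], outgoing)
            new_block_of[n] = ids.setdefault(signature, len(ids))
        # Refinement only splits blocks, so an equal count means a fixpoint.
        if len(set(new_block_of.values())) == len(set(block_of.values())):
            return new_block_of
        block_of = new_block_of


def structure_index(nodes, edges):
    """Return the extension of each index node and the index graph edges."""
    block_of = bisimulation_partition(nodes, edges)
    extensions = {}
    for node, block in block_of.items():
        extensions.setdefault(block, set()).add(node)
    index_edges = {(block_of[s], p, block_of[o]) for s, p, o in edges}
    return extensions, index_edges


extensions, index_edges = structure_index(NODES, EDGES)
print(sorted(sorted(ext) for ext in extensions.values()))
```

Five data edges collapse into three index edges here; p5 ends up in its own block because its worksAt edge leads to a different kind of node than those of p1 and p3.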



Structure Index – Bisimulation

        [Figure: data graph G next to its structure index graph G~. Index nodes and their extensions as labeled on the slide: E1 = {p2, p4}, E2 = {p1, p3}, E3 = {i1, i2}, E4 = {a1, a2}, E5 = {u1, u2}, E6 = {c1, c2}; p5 forms a node of its own (also labeled E6 on the slide). Index edges carry the labels supervises, knows, authorOf, worksAt, partOf and conference.]
Structure-based Result Refinement

        We take advantage of this property:
          Whenever there is a match of a query graph q on G the query also
          matches on G~. Moreover, extensions of the index graph
          matches will contain all data graph matches, i.e. the bindings to
          query variables.

        Match the query against the structure index graph to obtain sets
        of extensions that contain potential query answers
        Bindings computed in previous ES/ASM steps can only be
        answers if they are contained in the matched extensions
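
The containment check in the last bullet can be sketched as a simple filter. The index matches below are hard-coded stand-ins for the result of matching the query against the structure index graph; their contents, the candidate bindings and the name `refine` are hypothetical.

```python
# Each index match maps a query variable to the extension of the index node
# it was matched to (hypothetical values).
INDEX_MATCHES = [
    {"x": {"p1", "p3"}, "z": {"i1", "i2"}, "u": {"u1", "u2"}},
]

# Candidate bindings left over from entity search and approximate
# structure matching (hypothetical values).
CANDIDATES = [
    {"x": "p1", "z": "i1", "u": "u1"},
    {"x": "p5", "z": "i1", "u": "u1"},
]


def refine(candidates, index_matches):
    """Keep bindings fully contained in the extensions of some index match."""
    return [
        binding
        for binding in candidates
        if any(
            all(binding[var] in match[var] for var in binding)
            for match in index_matches
        )
    ]


refined = refine(CANDIDATES, INDEX_MATCHES)
print(refined)
```

The binding with x = p5 is removed because p5 lies in no matched extension for x. Bindings that survive are still only candidates, since containment in the extensions is a necessary condition and not a sufficient one.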








       STRUCTURE-BASED ANSWER
       COMPUTATION

Structure-based Answer Computation

        Finally, results which exactly match the query are computed by
        the last refinement.
        Only for this step, we actually perform joins on the data.
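
As a rough illustration of this final step, the sketch below checks the remaining candidates against the actual relation edges. It is a simplified stand-in for the joins mentioned above: it only verifies bindings that already cover every variable, and the data, the query fragment and the name `exact_answers` are hypothetical.

```python
# Relation edges of a toy data graph (illustrative only).
RELATION_EDGES = {
    ("p1", "worksAt", "i1"),
    ("p5", "worksAt", "u1"),
    ("i1", "partOf", "u1"),
}

# Relation edges of a query fragment, written over variable names.
QUERY_RELATIONS = [("x", "worksAt", "z"), ("z", "partOf", "u")]

# Candidates that survived the approximate stages (hypothetical values).
CANDIDATES = [
    {"x": "p1", "z": "i1", "u": "u1"},
    {"x": "p3", "z": "i1", "u": "u1"},
]


def exact_answers(candidates, query_relations, relation_edges):
    """Keep bindings for which every query relation edge exists in the data."""
    return [
        binding
        for binding in candidates
        if all(
            (binding[s], p, binding[o]) in relation_edges
            for s, p, o in query_relations
        )
    ]


answers = exact_answers(CANDIDATES, QUERY_RELATIONS, RELATION_EDGES)
print(answers)
```

Because the earlier stages have already pruned most candidates, this exact check touches far fewer bindings than evaluating the full query from scratch would.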




EVALUATION


Evaluation

        Systems
                INC: the proposed approach
                VP: join processing using vertical partitioning with sextuple indexing
        Datasets
                DBLP: 13M triples
                LUBM: 0.7M – 6.7M triples
        Queries
                Generated 80 queries via random sampling
                Different shapes: path, star, graph




Results – Average Processing Time




Results – Average Processing Time
     Neighborhood Distance




Results – Precision vs. Time




Results – Precision




Conclusion

        We proposed a novel process for approximate and
        incremental processing of complex graph pattern queries
        Initial results are computed in a small fraction of the total time and
        then incrementally refined via approximate matching at low cost
        Increased responsiveness, as inexact results are available early
        Users can decide whether, and for which results, exactness and
        completeness are desirable
        Experiments show that our approach is relatively fast even when exact
        and complete results are computed, indicating that the proposed
        mechanism is able to reuse intermediate results




BACKUP SLIDES


MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdfMr Bounab Samir
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptxmary850239
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Celine George
 
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Association for Project Management
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operationalssuser3e220a
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1GloryAnnCastre1
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...DhatriParmar
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17Celine George
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWQuiz Club NITW
 

Kürzlich hochgeladen (20)

31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
How to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 DatabaseHow to Make a Duplicate of Your Odoo 17 Database
How to Make a Duplicate of Your Odoo 17 Database
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx
 
prashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Professionprashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Profession
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17
 
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
Team Lead Succeed – Helping you and your team achieve high-performance teamwo...
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 
Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1Reading and Writing Skills 11 quarter 4 melc 1
Reading and Writing Skills 11 quarter 4 melc 1
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 

Approx. & Incremental Processing of Complex Queries

  • 1. Approximate and Incremental Processing of Complex Queries against the Web of Data. Thanh Tran, Günter Ladwig, Andreas Wagner. DEXA 2011. Institute of Applied Informatics and Formal Description Methods (AIFB), KIT – University of the State of Baden-Württemberg and National Large-scale Research Center of the Helmholtz Association. www.kit.edu
  • 2. Contents: Introduction; Overview; Approximate & Incremental Processing (Entity Search; Approximate Structure Matching; Structure-based Result Refinement and Computation); Evaluation; Conclusion. August 31st, 2011, DEXA 2011, Toulouse, France, Institute of Applied Informatics and Formal Description Methods (AIFB)
  • 3. INTRODUCTION
  • 4. Introduction – Data Model: Resource Description Framework (RDF). [Figure: example RDF data graph with nodes p1–p5, a1, a2, c1, c2, i1, i2, u1, u2, connected by edges labeled authorOf, supervises, worksAt, knows, partOf, conference and name.]
  • 5. Introduction – Query Model: Basic Graph Patterns. Conjunctive queries over RDF data are answered by graph pattern matching. [Figure: example query graph over the variables u, v, w, x, y, z with edges labeled name, partOf, worksAt, supervise, age, author and conf, and the constants AIFB, KIT, ICDE and 29.]
  • 6. Contribution: Techniques for matching (basic) query patterns against graph-structured data have limits. We might wish to trade completeness and exactness for responsiveness. Our approach allows an “affordable” computation of an initial set of approximate results, which can be incrementally refined as needed.
  • 7. Contribution – Pipeline Overview: A pipeline of operations in which approximate results are refined incrementally. Stages: Entity Search → Approximate Structure Matching → Structure-based Result Refinement → Structure-based Answer Computation, with intermediate, approximate results passed from stage to stage. Indexes: Entity & Neighborhood Index, Structure Index, Relation Index.
  • 8. ENTITY SEARCH
  • 9. Entity Search. Entity index: stores the attribute edges of the data graph and enables lookup of entities by attribute and value. Entity search: obtains candidate bindings for all variables in the query that have attribute edges; it does not consider structure (i.e., relations between entities). Query decomposition and transformation: decompose the query into entity queries to create a transformed query.
  • 10. Query Decomposition & Transformation: Identify entity queries via a breadth-first search starting from a random variable. [Figure: the example query graph from slide 5.]
  • 11. Query Decomposition & Transformation: Collapse entity queries. [Figure: the example query graph and its transformed version, in which the attribute edges (name AIFB, name KIT, age 29, name ICDE) are collapsed into the variable nodes z, u, x and v, leaving the relation edges partOf, worksAt, supervise, author and conf.]
  • 12. Entity Search Results: Use the entity index to obtain bindings for all entity queries in the transformed query. Entity queries are necessary conditions, but not sufficient; the final results will be a subset. Candidate bindings (x, z, u, v): (p1, i1, u1, c1), (p3, i1, u1, c1), (p5, i1, u1, c1), (p6, i1, u1, c1).
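The entity search step on slides 9 to 12 can be illustrated with a minimal Python sketch. It assumes the entity index is a plain in-memory inverted index from (attribute, value) pairs to entities; the function names and the toy attribute edges are hypothetical and are not taken from the authors' system.

```python
from collections import defaultdict


def build_entity_index(attribute_triples):
    """Map each (attribute, value) pair to the set of entities carrying it."""
    index = defaultdict(set)
    for entity, attribute, value in attribute_triples:
        index[(attribute, value)].add(entity)
    return index


def entity_search(index, entity_query):
    """Return the entities satisfying every (attribute, value) pair of one entity query."""
    candidate_sets = [index.get(pair, set()) for pair in entity_query]
    if not candidate_sets:
        return set()
    return set.intersection(*candidate_sets)


# Hypothetical attribute edges, loosely modeled on the running example.
ATTRIBUTE_TRIPLES = [
    ("p1", "age", "29"),
    ("p2", "age", "35"),
    ("p3", "age", "29"),
    ("i1", "name", "AIFB"),
    ("i2", "name", "OtherInstitute"),
    ("u1", "name", "KIT"),
    ("c1", "name", "ICDE"),
]

# One entity query per variable that has attribute edges (assumed mapping).
ENTITY_QUERIES = {
    "x": [("age", "29")],
    "z": [("name", "AIFB")],
    "u": [("name", "KIT")],
    "v": [("name", "ICDE")],
}

index = build_entity_index(ATTRIBUTE_TRIPLES)
candidates = {var: entity_search(index, query) for var, query in ENTITY_QUERIES.items()}
```

The candidate sets ignore relation edges entirely, which is why they are only a superset of the final answers.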
  • 13. APPROXIMATE STRUCTURE MATCHING
  • 14. Approximate Structure Matching: So far, only the entity parts of the query have been matched; relation edges have yet to be processed. Instead of performing exact equijoins, we propose to perform a neighborhood join. The k-neighborhood of an entity e is the set of entities in the data graph that can be reached from e via a path of relation edges of length k or less. A neighborhood join allows us to check whether two entities are connected via relation edges (but not via which ones). A neighborhood join between two sets of entities E1, E2 is an equijoin between all pairs e1 ∈ E1, e2 ∈ E2, where e1 and e2 are considered equivalent if the intersection of their k-neighborhoods is non-empty. Again: necessary, but not sufficient.
  • 15. Neighborhood Join via Bloom Filters: We store the set of k-neighborhood entities as a bloom filter. A bloom filter is a space-efficient, probabilistic data structure for set membership tests; false positives are possible (false negatives are not). We refine the results of the previous step. To perform a neighborhood join between bindings E1, E2: load the bloom filters for one set of entities, say E1; then, in a nested-loop manner, check whether the entities in E2 are contained in the bloom filter.
  • 16. Neighborhood Join via Bloom Filters: Load the bloom filters for entities bound to x. Check whether entities bound to w, y, z are in the neighborhood of x. When k=2, the bloom filters for x also cover u and v. [Figure: the example query graph with the k=1 and k=2 neighborhoods of x marked.]
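The neighborhood join on slides 14 to 16 can be sketched in Python as follows. This is only an illustration under stated assumptions: relation edges are treated as undirected when computing the k-neighborhood, the start entity itself is left out of its neighborhood, the bloom filter is a toy built on SHA-256, and the join follows the slide 15 procedure (test whether e2 is in the filter of e1). All names and the toy edges are hypothetical.

```python
import hashlib
from collections import defaultdict, deque


class BloomFilter:
    """Toy bloom filter; sizes are arbitrary choices for this sketch."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May report false positives, never false negatives.
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def k_neighborhood(relation_edges, start, k):
    """Entities reachable from start via at most k relation edges (direction ignored)."""
    adjacency = defaultdict(set)
    for source, _label, target in relation_edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    seen.discard(start)
    return seen


def neighborhood_join(relation_edges, left_entities, right_entities, k):
    """Keep pairs (e1, e2) where e2 passes the bloom filter of e1's k-neighborhood."""
    filters = {}
    for e1 in left_entities:
        bloom = BloomFilter()
        for neighbor in k_neighborhood(relation_edges, e1, k):
            bloom.add(neighbor)
        filters[e1] = bloom
    # Nested loop over both binding sets, as described on slide 15.
    return [(e1, e2) for e1 in left_entities for e2 in right_entities if e2 in filters[e1]]


# Hypothetical relation edges.
RELATION_EDGES = [
    ("p1", "worksAt", "i1"),
    ("i1", "partOf", "u1"),
    ("p2", "worksAt", "i2"),
]

pairs_k1 = neighborhood_join(RELATION_EDGES, ["p1", "p2"], ["i1"], k=1)
pairs_k2 = neighborhood_join(RELATION_EDGES, ["p1"], ["u1"], k=2)
```

Because of possible false positives, the join may keep pairs that are not actually connected; it never drops a connected pair, which matches the "necessary, but not sufficient" character of this step.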
  • 17. STRUCTURE-BASED RESULT REFINEMENT
  • 18. Structure-based Result Refinement: From approximate structure matching (ASM) we know that the entities in intermediate results are connected, which is necessary but not sufficient. With structure-based result refinement we find out whether they are connected via the paths captured by the query atoms. The query is matched against a structure index graph: a bisimulation-based summary of the data graph that captures structural information. Nodes in the data graph with the same “structure” are grouped together, so the index graph is much smaller than the data graph.
  • 19. Structure Index: Bisimulation. [Figure: the data graph G from slide 4 next to its structure index graph G~, whose nodes are extensions grouping data graph nodes, e.g. {p2, p4}, {p1, p3}, {a1, a2}, {c1, c2}, {i1, i2}, {u1, u2} and {p5}, connected by edges labeled supervises, authorOf, conference, worksAt, partOf and knows.]
  • 20. Structure-based Result Refinement: We take advantage of this property: whenever there is a match of a query graph q on G, the query also matches on G~. Moreover, the extensions of the index graph matches contain all data graph matches, i.e., the bindings to query variables. We match the query against the structure index graph to obtain sets of extensions that contain potential query answers. Bindings computed in the previous ES/ASM steps can only be answers if they are contained in the matched extensions.
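The filtering idea on slides 18 to 20 can be sketched as below. The sketch assumes the structure index graph has already been computed (the bisimulation itself is not shown), matches the query against it by brute-force enumeration, and then keeps only candidate bindings whose entities fall into the matched extensions. The extension labels, the mapping of entities to extensions and all function names are hypothetical.

```python
from itertools import product


def match_on_index(query_edges, index_edges, index_nodes):
    """Enumerate assignments of query variables to index nodes that satisfy all query edges."""
    variables = sorted({v for source, _label, target in query_edges for v in (source, target)})
    matches = []
    for assignment in product(sorted(index_nodes), repeat=len(variables)):
        mapping = dict(zip(variables, assignment))
        if all((mapping[s], label, mapping[t]) in index_edges for s, label, t in query_edges):
            matches.append(mapping)
    return matches


def refine(bindings, index_matches, extension_of):
    """Keep bindings whose entities all lie in the extensions of at least one index match."""
    kept = []
    for binding in bindings:
        for match in index_matches:
            if all(extension_of[binding[var]] == match[var] for var in binding if var in match):
                kept.append(binding)
                break
    return kept


# Hypothetical structure index: each entity belongs to exactly one extension.
EXTENSION_OF = {
    "p1": "E_a", "p3": "E_a",
    "p5": "E_b",
    "i1": "E_i", "i2": "E_i",
    "u1": "E_u", "u2": "E_u",
}
INDEX_NODES = {"E_a", "E_b", "E_i", "E_u"}
INDEX_EDGES = {("E_a", "worksAt", "E_i"), ("E_i", "partOf", "E_u")}

# Relation part of a small query: x worksAt z, z partOf u.
QUERY_EDGES = [("x", "worksAt", "z"), ("z", "partOf", "u")]

# Candidate bindings as they might come out of the ES/ASM steps.
CANDIDATES = [
    {"x": "p1", "z": "i1", "u": "u1"},
    {"x": "p5", "z": "i1", "u": "u1"},
]

index_matches = match_on_index(QUERY_EDGES, INDEX_EDGES, INDEX_NODES)
refined = refine(CANDIDATES, index_matches, EXTENSION_OF)
```

In this toy setup the binding with x = p5 is dropped because its extension has no worksAt edge in the index graph, so it cannot take part in any data graph match.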
  • 21. STRUCTURE-BASED ANSWER COMPUTATION
  • 22. Structure-based Answer Computation: Finally, the results that exactly match the query are computed by the last refinement. Only in this step do we actually perform joins on the data.
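As a rough illustration of this last step, the Python sketch below checks each remaining candidate binding against the actual relation edges of the data graph. It is a simplification: it assumes every query variable is already bound in the candidates, so a per-binding edge lookup stands in for the joins mentioned on the slide. The function name and the toy data are hypothetical.

```python
def exact_answers(candidate_bindings, query_edges, data_edges):
    """Keep only bindings for which every query edge exists in the data graph."""
    data = set(data_edges)
    return [
        binding
        for binding in candidate_bindings
        if all((binding[s], label, binding[t]) in data for s, label, t in query_edges)
    ]


# Hypothetical data graph edges and query.
DATA_EDGES = [
    ("p1", "worksAt", "i1"),
    ("i1", "partOf", "u1"),
    ("p3", "worksAt", "i2"),
    ("i2", "partOf", "u2"),
]
QUERY = [("x", "worksAt", "z"), ("z", "partOf", "u")]

# Both candidates could survive the approximate steps; only the first is exact.
REMAINING = [
    {"x": "p1", "z": "i1", "u": "u1"},
    {"x": "p3", "z": "i1", "u": "u1"},
]

answers = exact_answers(REMAINING, QUERY, DATA_EDGES)
```

The second candidate is removed because p3 works at i2 rather than i1 in the toy data, which only the exact check against the data can detect.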
  • 23. EVALUATION
  • 24. Evaluation. Systems: INC, the proposed approach; VP, join processing using vertical partitioning with sextuple indexing. Datasets: DBLP, 13M triples; LUBM, 0.7M – 6.7M triples. Queries: 80 queries generated via random sampling, with different shapes (path, star, graph).
  • 25. Results – Average Processing Time
  • 26. Results – Average Processing Time, Neighborhood Distance
  • 27. Results – Precision vs. Time
  • 28. Results – Precision
  • 29. Conclusion: We proposed a novel process for approximate and incremental processing of complex graph pattern queries. Initial results are computed in a small fraction of the total time and then incrementally refined via approximate matching at low cost. Responsiveness increases because inexact results are available early. Users can decide whether, and for which results, exactness and completeness are desirable. Experiments show that our approach is relatively fast w.r.t. exact and complete results, indicating that the proposed mechanism is able to reuse intermediate results.
  • 31. BACKUP SLIDES