Linked Data and Services
                                                 Andreas Harth and Barry Norton


Institute AIFB




KIT – University of the State of Baden-Wuerttemberg and
National Laboratory of the Helmholtz Association                                  www.kit.edu
Outline

!    Motivation
!    Linked Data Principles
!    Query Processing over Linked Data
!    Linked Data Services (LIDS) and Linked Open
     Services (LOS)
!    Conclusion




Motivation

!      Semantic Web/Linked Data technologies are well-suited
       for data integration




       [Figure: Common Data Format/Access Protocol; Data Integration; Interactive Data Exploration]

     8/10/11    Taking the LIDS off Data Silos    Andreas Harth
Linked Data Principles*

1.     Use URIs to name things; not only documents, but
       also people, locations, concepts, etc.
2.     To enable agents (human users and machine agents
       alike) to look up those names, use HTTP URIs
3.     When someone looks up a URI we provide useful
       information; with 'useful' in the strict sense we usually
       mean structured data in RDF.
4.     Include links to other URIs allowing agents (machines
       and humans) to discover more things



      (*) http://www.w3.org/DesignIssues/LinkedData.html

Correspondence between thing-URI and
    source-URI


           User Agent


                              http://www.polleres.net/foaf.rdf#me

        HTTP            RDF
        GET




           Web Server


                               http://www.polleres.net/foaf.rdf


Correspondence between thing-URI and
    source-URI


             User Agent


                                 http://dbpedia.org/resource/Gordon_Brown

      HTTP   303 HTTP     RDF
       GET       GET



                                http://dbpedia.org/data/Gordon_Brown
             Web Server


                                http://dbpedia.org/page/Gordon_Brown
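
The two lookup styles on these slides (hash URI and 303 redirect) can be sketched as one small function. This is a minimal sketch, not the authors' implementation: the `redirects` table is a hypothetical in-memory stand-in for the 303 responses a real web server would send, and no network access takes place.

```python
from urllib.parse import urldefrag


def source_uri(thing_uri, redirects):
    """Return the source-URI that a lookup of thing_uri would end at.

    `redirects` is a hypothetical table simulating HTTP 303 responses.
    """
    # HTTP clients do not send the fragment, so a hash URI maps to its document.
    uri, _fragment = urldefrag(thing_uri)
    seen = set()
    # Follow simulated 303 redirects until a document is reached (loop-safe).
    while uri in redirects and uri not in seen:
        seen.add(uri)
        uri = redirects[uri]
    return uri


# Assumed redirect for a client that asks for RDF, as drawn on the slide.
REDIRECTS = {
    "http://dbpedia.org/resource/Gordon_Brown":
        "http://dbpedia.org/data/Gordon_Brown",
}

print(source_uri("http://www.polleres.net/foaf.rdf#me", REDIRECTS))
print(source_uri("http://dbpedia.org/resource/Gordon_Brown", REDIRECTS))
```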


Queries over Linked Data

SELECT ?f ?n WHERE {
  an:f#ah foaf:knows ?f.
  ?f foaf:name ?n.
}

SELECT ?x1 ?x2 WHERE {
  dblppub:HoganHP08 dc:creator ?a1.
  ?x1 owl:sameAs ?a1.
  ?x2 foaf:knows ?x1.
}




   ?f                                 ?n

Querying Data Across Sources

    !     Data warehousing or materialisation-based approaches
          (MAT)

          [Figure: CRAWL, INDEX, SERVE]

    !     Distributed query processing approaches (DQP)

          [Figure: SELECT * FROM… query distributed over relations R and S]

    15.03.2010   Andreas Harth   Data Summaries for On-Demand Queries over Linked Data
DQP on Linked Data

            [Figure: relational DQP evaluates SELECT * FROM… over relations R and S via ODBC;
             Linked Data DQP evaluates SELECT ?s WHERE… over triple patterns (TP) via HTTP GET]
Query Processing Overview

      SELECT ?f ?n WHERE {
        an:f#ah foaf:knows ?f.
        ?f foaf:name ?n.
      }



           TP (an:f#ah foaf:knows ?f)               TP (?f foaf:name ?n)

           For each TP: select source(s), then HTTP GET returns RDF




             ?f                                                            ?n
             http://danbri.org/foaf.rdf#danbri                             Dan Brickley
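
The overview above can be traced in a few lines of Python. This is a rough sketch under stated assumptions: `SOURCES` is a hypothetical in-memory stand-in for HTTP GET plus RDF parsing, sources are selected from the bound subject only, and prefixed names are kept unexpanded as they appear on the slide.

```python
# Hypothetical stand-in for the Web: source-URI -> triples it would return.
SOURCES = {
    "an:f": [("an:f#ah", "foaf:knows", "http://danbri.org/foaf.rdf#danbri")],
    "http://danbri.org/foaf.rdf": [
        ("http://danbri.org/foaf.rdf#danbri", "foaf:name", "Dan Brickley"),
    ],
}


def fetch(term):
    """Select the source for a bound term (drop the fragment) and 'GET' it."""
    return SOURCES.get(term.split("#", 1)[0], [])


def evaluate(pattern, bindings):
    """Evaluate one triple pattern under existing bindings."""
    bound = tuple(bindings.get(t, t) for t in pattern)
    results = []
    for triple in fetch(bound[0]):  # source chosen from the subject position
        new = dict(bindings)
        ok = True
        for p, t in zip(bound, triple):
            if p.startswith("?"):
                if new.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            results.append(new)
    return results


def run(patterns):
    """Join the triple patterns left to right."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [b for sol in solutions for b in evaluate(pattern, sol)]
    return solutions


QUERY = [("an:f#ah", "foaf:knows", "?f"), ("?f", "foaf:name", "?n")]
for row in run(QUERY):
    print(row["?f"], row["?n"])
```

The second pattern can only be looked up once ?f is bound, which is why the patterns are processed in order.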
Barry




Problem: Source Selection for Triple Patterns

     !     (?s       ?p           ?o)
     !     (#s       ?p           ?o)
     !     (?s       #p           ?o)
     !     (?s       ?p           #o)
     !     (#s       #p           ?o)
     !     (#s       ?p           #o)
     !     (?s       #p           #o)
     !     (#s       #p           #o)

     !     Given a triple pattern, which source can contribute bindings
           for the triple pattern?
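
To make the question concrete, here is a brute-force baseline: it inspects every triple of every source, which is exactly what a source-selection index is meant to avoid. A minimal sketch; the source URIs and triples in `SOURCES` are hypothetical.

```python
def is_var(term):
    """Variables are written ?x; everything else is a constant (#s, #p, #o)."""
    return term.startswith("?")


def match(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    bindings = {}
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A repeated variable must bind to the same value.
            if bindings.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return bindings


def contributing_sources(pattern, sources):
    """All sources holding at least one triple that matches the pattern."""
    return sorted(
        uri for uri, triples in sources.items()
        if any(match(pattern, t) is not None for t in triples)
    )


# Hypothetical example data.
SOURCES = {
    "http://example.org/a.rdf": [("ex:a", "foaf:knows", "ex:b")],
    "http://example.org/b.rdf": [("ex:b", "foaf:name", "B")],
}

print(contributing_sources(("?s", "foaf:name", "?o"), SOURCES))
```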

Schema-Level Indices [Stuckenschmidt et al. 2004]

     !     Keep index of properties and/or classes contained in
           sources

     !     (?s #p ?o), (?s rdf:type #o)

     !     Covers only queries containing schema-level elements
     !     Commonly used properties select potentially too many
           sources
                           SELECT ?f ?n WHERE {
                             an:f#ah foaf:knows ?f.
                             ?f foaf:name ?n.
                           }

                           SELECT ?x1 ?x2 WHERE {
                             dblppub:HoganHP08 dc:creator ?a1.
                             ?x1 owl:sameAs ?a1.
                             ?x2 foaf:knows ?x1.
                           }
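
A schema-level index of the kind described above might look as follows. This is a simplified sketch, not the structure from the cited work: it keeps one map from properties to sources and one from classes to sources, and the example sources are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def build_schema_index(sources):
    """Index which properties and classes occur in which source."""
    prop_index = defaultdict(set)
    class_index = defaultdict(set)
    for uri, triples in sources.items():
        for _s, p, o in triples:
            prop_index[p].add(uri)
            if p == RDF_TYPE:
                class_index[o].add(uri)
    return prop_index, class_index


def select_sources(pattern, prop_index, class_index):
    """Return candidate sources, or None if the index cannot help."""
    _s, p, o = pattern
    if p == RDF_TYPE and not o.startswith("?"):
        return set(class_index.get(o, ()))   # (?s rdf:type #o)
    if not p.startswith("?"):
        return set(prop_index.get(p, ()))    # (?s #p ?o)
    return None  # no schema-level element in the pattern


# Hypothetical example data.
SOURCES = {
    "http://example.org/a.rdf": [
        ("ex:a", "foaf:knows", "ex:b"),
        ("ex:a", "rdf:type", "foaf:Person"),
    ],
    "http://example.org/b.rdf": [("ex:b", "foaf:name", "B")],
}

PROPS, CLASSES = build_schema_index(SOURCES)
print(select_sources(("?s", "foaf:name", "?o"), PROPS, CLASSES))
```

A pattern such as (#s ?p ?o) returns None here, which mirrors the limitation that only queries with schema-level elements are covered.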

Direct Lookup (DL) [Hartig et al. 2009]

     !     Exploits correspondence between thing-URI and source-URI
     !     Linked Data sources (aka RDF files) typically return triples whose
           subject corresponds to the source
     !     Sometimes the sources return triples whose object corresponds to the
           source

     !     (#s ?p ?o), (#s #p ?o), (#s #p #o)
     !     (?s ?p #o), (?s #p #o)

     !     Incomplete wrt. patterns and also wrt. URI reuse across sources
     !     Limited parallelism; unclear how to schedule lookups
                            SELECT ?f ?n WHERE {
                              an:f#ah foaf:knows ?f.
                              ?f foaf:name ?n.
                            }

                            SELECT ?x1 ?x2 WHERE {
                              dblppub:HoganHP08 dc:creator ?a1.
                              ?x1 owl:sameAs ?a1.
                              ?x2 foaf:knows ?x1.
                            }
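
Source selection by direct lookup can be sketched as below. This is an illustration only: it handles hash URIs by dropping the fragment, ignores 303 redirects, and the helper names are hypothetical.

```python
from urllib.parse import urldefrag


def direct_lookup_sources(pattern):
    """Candidate source-URIs derived from a bound subject and/or object."""
    s, _p, o = pattern
    candidates = []
    for term in (s, o):
        if term.startswith(("http://", "https://")):
            candidates.append(urldefrag(term)[0])
    # Drop duplicates while keeping order.
    return list(dict.fromkeys(candidates))


def substitute(pattern, bindings):
    """Replace variables that already have a binding."""
    return tuple(bindings.get(t, t) for t in pattern)


pattern = ("?f", "foaf:name", "?n")
print(direct_lookup_sources(pattern))  # nothing to look up yet
bound = substitute(pattern, {"?f": "http://danbri.org/foaf.rdf#danbri"})
print(direct_lookup_sources(bound))
```

The second pattern yields a lookup only after ?f has been bound by an earlier pattern, which illustrates why parallelism is limited with this approach.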
Approximate Data Summaries
     !     Combined description of schema-level and instance-level
     !     Use approximation to reduce index size (incurs false positives)
     !     Possible to use entire query for source selection
     !     Parallel lookups since sources can be determined for the entire query

     !     (?s ?p ?o), (#s ?p ?o), (?s #p ?o), (?s ?p #o), (#s #p ?o),
           (#s ?p #o), (?s #p #o), (#s #p #o)
     !     and combinations of triple patterns


                          SELECT ?f ?n WHERE {
                            an:f#ah foaf:knows ?f.
                            ?f foaf:name ?n.
                          }

                          SELECT ?x1 ?x2 WHERE {
                            dblppub:HoganHP08 dc:creator ?a1.
                            ?x1 owl:sameAs ?a1.
                            ?x2 foaf:knows ?x1.
                          }
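
The idea of an approximate summary can be illustrated with hashed buckets. Note that this swaps in a deliberately simple technique (terms hashed into a fixed number of buckets) and is not the index structure used by the authors; the bucket count and function names are assumptions. It does show the trade-off named above: a smaller index, possible false positives, and no false negatives.

```python
import zlib

NUM_BUCKETS = 64  # assumed size; smaller means less space, more false positives


def bucket(term):
    """Map a term to a bucket number with a deterministic hash."""
    return zlib.crc32(term.encode("utf-8")) % NUM_BUCKETS


def summarise(triples):
    """Approximate a source by the set of bucket triples it occupies."""
    return {(bucket(s), bucket(p), bucket(o)) for s, p, o in triples}


def may_contribute(summary, pattern):
    """True if the source may hold a match (false positives are possible)."""
    want = [None if t.startswith("?") else bucket(t) for t in pattern]
    return any(
        all(w is None or w == b for w, b in zip(want, entry))
        for entry in summary
    )


def select_sources(summaries, patterns):
    """Sources relevant to any pattern of the query, determined up front."""
    return sorted(
        uri for uri, summary in summaries.items()
        if any(may_contribute(summary, p) for p in patterns)
    )


# Hypothetical example data.
TRIPLE = ("ex:b", "foaf:name", "B")
SUMMARIES = {
    "http://example.org/b.rdf": summarise([TRIPLE]),
    "http://example.org/empty.rdf": summarise([]),
}
print(select_sources(SUMMARIES, [("?s", "foaf:name", "?o")]))
```

Because all sources for the whole query are known before any lookup, the lookups can be issued in parallel.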


Implementation

!    Deploy wrappers "in the cloud"
!    Google App Engine: hosting of Java and Python
     webapps on Google’s Cloud infrastructure
!    Limited amount of processing time (6hrs/day)
!    Single-threaded applications

!    Suited for deploying wrappers
!    e.g. http://twitter2foaf.appspot.com/ converts Twitter
     user data to RDF




Linking Open Data Cloud 2007




Linking Open Data Cloud 2008




Linking Open Data Cloud 2009




Linking Open Data Cloud 2010




Geonames Services




    {"weatherObservation":
     {"clouds":"broken clouds",
      "weatherCondition":"drizzle",
      "observation":"LESO 251300Z 03007KT 340V040 CAVOK 23/15 Q1010",
      "windDirection":30,
      "ICAO":"LESO", ...

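
A wrapper that lifts such a JSON response to RDF could be sketched like this. Everything beyond the JSON keys is an assumption: the `http://example.org/weather#` vocabulary and the station URI are hypothetical, only string and integer values are handled, and only fields visible on the slide are used.

```python
import json

EX = "http://example.org/weather#"  # hypothetical vocabulary, not from the slides
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def observation_to_ntriples(payload, subject):
    """Turn the weatherObservation object into N-Triples lines."""
    obs = payload["weatherObservation"]
    lines = []
    for key in sorted(obs):
        value = obs[key]
        if isinstance(value, str):
            literal = json.dumps(value)  # quoted and escaped string literal
        elif isinstance(value, int):
            literal = '"%d"^^<%s>' % (value, XSD_INTEGER)
        else:
            raise TypeError("unsupported value for %s" % key)
        lines.append("<%s> <%s%s> %s ." % (subject, EX, key, literal))
    return lines


PAYLOAD = {
    "weatherObservation": {
        "clouds": "broken clouds",
        "weatherCondition": "drizzle",
        "windDirection": 30,
        "ICAO": "LESO",
    }
}

for line in observation_to_ntriples(PAYLOAD, "http://example.org/station/LESO"):
    print(line)
```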
Linked Open Service Principles
 REST Principles
 1. Application state and functionality are divided into resources
 2. Every resource is uniquely addressable
 3. All resources share a uniform interface:
    a) A constrained set of well-defined operations
    b) A constrained set of content types


              Linked Data Principles
              1. Use URIs as names for things.
              2. Use HTTP URIs so that people can look up those names.
              3. When someone looks up a URI, provide useful information, using
              the standards (RDF*, SPARQL).
              4. Include links to other URIs, so that they can discover more things.


                       Linked Open Service Principles
                       1. Describe services as LOD prosumers with input and output
                       descriptions as SPARQL graph patterns
                       2. Communicate RDF by RESTful content negotiation
                       3. The output should make explicit its relation with the input
LOS Weather Service




LOS Geo Resources




Resource-Based Linked Open Services
    Linked Data:
        GET, Accept: text/html                                  ->  303 REDIRECT /page
        GET, Accept: application/rdf+xml (or text/n3)           ->  303 REDIRECT /data

    Linked Service:
        GET /weather, Accept: application/rdf+xml (or text/n3)  ->  200 <rdf:Description>
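
The two interaction styles can be contrasted in a small dispatch function. This is a sketch only: the non-service path `/resource/x` is hypothetical, and returning 406 when no supported type is requested is an assumption that the slide does not cover.

```python
RDF_TYPES = ("application/rdf+xml", "text/n3")


def respond(path, accept):
    """Return (status, detail) for a GET with the given Accept header."""
    wants_rdf = any(t in accept for t in RDF_TYPES)
    if path == "/weather":
        # Linked Service: the RDF description is returned directly.
        if wants_rdf:
            return 200, "<rdf:Description>"
        return 406, ""  # assumption: not shown on the slide
    # Linked Data resource: redirect to a representation of the thing.
    if wants_rdf:
        return 303, "/data"
    if "text/html" in accept:
        return 303, "/page"
    return 406, ""  # assumption: not shown on the slide


print(respond("/resource/x", "text/html"))
print(respond("/resource/x", "application/rdf+xml"))
print(respond("/weather", "text/n3"))
```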
Interlinking Data with Data from Services?




Data Services

!    Given input, provide output
!    Input and output are related in a service-specific way
!    Do not change the state of the world

                           Input             relation            Output


                                                   defines

                                             Service




!    E.g. GeoNames findNearbyWikipedia service
     !    Input: lat/lon
     !    Output: places
     !    Relation: the output places are near the input place
Linked Data Services

!     We’d like to integrate data services with Linked Data
1.    LIDS need to adhere to Linked Data principles

!     We’d like to use data services in software programs
2.    LIDS need machine-readable descriptions of input and
      output

!     Compared to naïve approach: assign URI to service output

!     Relationship between input and output is explicitly
      described

!    Dynamicity is supported
1. Data Services as Linked Data

!    Input is given as URI

http://geowrap.openlids.org/findNearbyWikipedia?lat=37.416&lng=-122.152#point

     Service Endpoint:  http://geowrap.openlids.org/findNearbyWikipedia
     Parameters:        ?lat=37.416&lng=-122.152
     Input Identifier:  #point

!    Resolving the URI yields RDF that relates Input and Output

@prefix dbp: <http://dbpedia.org/resource/> .
@prefix : <http://geo..Wiki?lat=37.416&lng=-122.152#> .
:point
      foaf:based_near dbp:Palo_Alto%2C_California ;
      foaf:based_near dbp:Packard%27s_garage .
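
Assembling such an input URI from its three parts is mechanical. A minimal sketch; the function name `lids_input_uri` is hypothetical, while the endpoint, parameters, and identifier are the ones from the slide.

```python
from urllib.parse import urlencode


def lids_input_uri(endpoint, params, input_id):
    """Combine service endpoint, parameters, and input identifier into one URI."""
    return "%s?%s#%s" % (endpoint, urlencode(params), input_id)


uri = lids_input_uri(
    "http://geowrap.openlids.org/findNearbyWikipedia",
    {"lat": "37.416", "lng": "-122.152"},
    "point",
)
print(uri)
```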
2. LIDS Descriptions

!    LIDS characterised by
     !    Endpoint URI ep, which is the base for all input entities
     !    Local identifier i of input entity
     !    List of parameters Xi
     !    Basic graph pattern Ti describing conditions on parameters
     !    Basic graph pattern To describing minimum output data


!    Example:
     ep = <http://geowrap.openlids.org/findNearbyWikipedia>
     i = point
     Xi = {?lat, ?lng}
     Ti = ?point a Point . ?point geo:lat ?lat .
                           ?point geo:long ?lng
     To = ?point foaf:based_near ?feature

Interlink LIDS and Linked Data

                            !    Generate service URIs
                                 with input bindings,
                                 from evaluating:
                                 select Xi where Ti
                            !    sameAs: link binding for i to service URI
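
The interlinking step can be sketched for the findNearbyWikipedia description. This is an illustration under assumptions: the input pattern Ti is evaluated by hand over in-memory triples rather than by a SPARQL engine, prefixed names are left unexpanded, and the example place is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://geowrap.openlids.org/findNearbyWikipedia"  # ep
INPUT_ID = "point"                                            # i


def input_bindings(triples):
    """Hand-written evaluation of: select ?point ?lat ?lng where Ti."""
    points = {s for s, p, o in triples if p == "rdf:type" and o == "Point"}
    lat = {s: o for s, p, o in triples if p == "geo:lat"}
    lng = {s: o for s, p, o in triples if p == "geo:long"}
    return [(pt, lat[pt], lng[pt])
            for pt in sorted(points) if pt in lat and pt in lng]


def interlink(triples):
    """Generate one owl:sameAs link per binding of the input entity."""
    links = []
    for point, la, lo in input_bindings(triples):
        query = urlencode({"lat": la, "lng": lo})
        service_uri = "%s?%s#%s" % (ENDPOINT, query, INPUT_ID)
        links.append((point, "owl:sameAs", service_uri))
    return links


# Hypothetical example data.
DATA = [
    ("http://example.org/places#hq", "rdf:type", "Point"),
    ("http://example.org/places#hq", "geo:lat", "37.416"),
    ("http://example.org/places#hq", "geo:long", "-122.152"),
    ("http://example.org/places#nolat", "rdf:type", "Point"),
]

for link in interlink(DATA):
    print(link)
```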
Scale-Up Experiment: Link BTC to GeoNames

!    3 billion triples from the Billion Triple Challenge (BTC) 2010
     data set
!    Annotate with LIDS wrapper of GeoNames findNearby
     service
!    Annotation time: < 12 hours on a laptop!
!    ~ 12 hours for uncompressing the data set, cleaning
     results, and gathering statistics

!    Original BTC data: 74 different domains that linked to
     GeoNames URIs
!    Interlinking process added 891 new domains, now linked to LIDS
     geowrap
!    In total 2,448,160 new links were added
Query Answering using LIDS and Linked Data

                          !    Query execution
                               resolves URIs
                          !    => enlarges data set
                          !    LIDS are interlinked
                          !    Query is executed
                               again on new data set
                          !    Repeat until no new
                               links or no new data
                          !    Combine results
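
The loop above can be sketched as a fixpoint computation. This is a simplification, not the authors' engine: `web` is a hypothetical in-memory map from source-URI to triples, every URI in the data set is resolved (not only those touched by the query), and `query` and `interlink` are passed in as plain functions.

```python
def answer(query, seed_triples, web, interlink):
    """Repeat query, interlinking, and URI resolution until nothing new appears."""
    data = set(seed_triples)
    fetched = set()
    results = set()
    while True:
        results |= query(data)            # combine results across rounds
        size_before = len(data)
        data |= set(interlink(data))      # add links to LIDS
        for s, _p, o in list(data):
            for term in (s, o):
                doc = term.split("#", 1)[0]
                if doc in web and doc not in fetched:
                    fetched.add(doc)      # simulated HTTP GET
                    data |= set(web[doc])
        if len(data) == size_before:      # no new links and no new data
            return results


def names(triples):
    """Toy query: all foaf:name values."""
    return {o for _s, p, o in triples if p == "foaf:name"}


# Hypothetical example data.
SEED = [("ex:list", "foaf:topic", "http://example.org/u#kit")]
WEB = {"http://example.org/u": [("http://example.org/u#kit", "foaf:name", "KIT")]}

print(answer(names, SEED, WEB, lambda triples: set()))
```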
Experiment: Query Answering

!    Input:
     List of 562 (potential) universities from Facebook Graph
     API
!    Output:
     Facebook fans and DBpedia student numbers for 104
     universities

!    PREFIX u: <http://openlids.org/universities.rdf#>
     SELECT ?n ?f ?s WHERE {
       u:list foaf:topic ?u .
       ?u foaf:name ?n .
       ?u og:fan_count ?f .
       ?u d:numberOfStudents ?s
     }

Linked Services and PlanetData

!    Several areas seem likely to produce services:
     !    Stream, inc. Sensor, resources (latest values)
     !    Any others exposing dynamic resources
     !    Dynamic computations, inc. on-the-fly quality
          assessments
!    Other areas seem likely to consider service
     technologies and move towards more service-like
     HTTP interactions
     !    Access control (OpenID, OAuth, etc.)
!    Finally, remaining areas could serve to complement
     LIDS/LOS alignment
     !    Provenance
                                            KIT – University of the State of Baden-Wuerttemberg and
                                            National Laboratory of the Helmholtz Association

Weitere ähnliche Inhalte

Was ist angesagt?

Verifying Integrity Constraints of a RDF-based WordNet
Verifying Integrity Constraints of a RDF-based WordNetVerifying Integrity Constraints of a RDF-based WordNet
Verifying Integrity Constraints of a RDF-based WordNetAlexandre Rademaker
 
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...Herbert Van de Sompel
 
Web Data Management with RDF
Web Data Management with RDFWeb Data Management with RDF
Web Data Management with RDFM. Tamer Özsu
 
AAT LOD Microthesauri
AAT LOD MicrothesauriAAT LOD Microthesauri
AAT LOD MicrothesauriMarcia Zeng
 
Open library data and embrace the world library linked data
Open library data and embrace the world library linked dataOpen library data and embrace the world library linked data
Open library data and embrace the world library linked data皓仁 柯
 
TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...keesvb
 
EDF2012 Mariana Damova - Factforge
EDF2012   Mariana Damova - FactforgeEDF2012   Mariana Damova - Factforge
EDF2012 Mariana Damova - FactforgeEuropean Data Forum
 
The bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersThe bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersHerbert Van de Sompel
 

Was ist angesagt? (9)

The aDORe Federation Architecture
The aDORe Federation ArchitectureThe aDORe Federation Architecture
The aDORe Federation Architecture
 
Verifying Integrity Constraints of a RDF-based WordNet
Verifying Integrity Constraints of a RDF-based WordNetVerifying Integrity Constraints of a RDF-based WordNet
Verifying Integrity Constraints of a RDF-based WordNet
 
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
The OAI-ORE Interoperability Framework in the Context of the Current Scholarl...
 
Web Data Management with RDF
Web Data Management with RDFWeb Data Management with RDF
Web Data Management with RDF
 
AAT LOD Microthesauri
AAT LOD MicrothesauriAAT LOD Microthesauri
AAT LOD Microthesauri
 
Open library data and embrace the world library linked data
Open library data and embrace the world library linked dataOpen library data and embrace the world library linked data
Open library data and embrace the world library linked data
 
TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...
 
EDF2012 Mariana Damova - Factforge
EDF2012   Mariana Damova - FactforgeEDF2012   Mariana Damova - Factforge
EDF2012 Mariana Damova - Factforge
 
The bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking ServersThe bX project: Federating and Mining Usage Logs from Linking Servers
The bX project: Federating and Mining Usage Logs from Linking Servers
 

Ähnlich wie Linked Data and Sevices

Linked Data and Services
Linked Data and ServicesLinked Data and Services
Linked Data and ServicesBarry Norton
 
Searching Linked Data
Searching Linked DataSearching Linked Data
Searching Linked DataThanh Tran
 
Omitola birmingham cityuniv
Omitola birmingham cityunivOmitola birmingham cityuniv
Omitola birmingham cityunivTope Omitola
 
2011linked science4mccuskermcguinnessfinal
2011linked science4mccuskermcguinnessfinal2011linked science4mccuskermcguinnessfinal
2011linked science4mccuskermcguinnessfinalDeborah McGuinness
 
2010 06 ipaw_prv
2010 06 ipaw_prv2010 06 ipaw_prv
2010 06 ipaw_prvJun Zhao
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked dataLaura Po
 
2009 0807 Lod Gmod
2009 0807 Lod Gmod2009 0807 Lod Gmod
2009 0807 Lod GmodJun Zhao
 
Linked data HHS 2015
Linked data HHS 2015Linked data HHS 2015
Linked data HHS 2015Cason Snow
 
Linked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale IntegrationLinked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale Integrationrumito
 
Introducing the Linked Data Research Centre
Introducing the Linked Data Research CentreIntroducing the Linked Data Research Centre
Introducing the Linked Data Research CentreMichael Hausenblas
 
Information Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudInformation Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudDhaval Thakker
 
Make our Scientific Datasets Accessible and Interoperable on the Web
Make our Scientific Datasets Accessible and Interoperable on the WebMake our Scientific Datasets Accessible and Interoperable on the Web
Make our Scientific Datasets Accessible and Interoperable on the WebFranck Michel
 
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talkDistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talkGezim Sejdiu
 
Efficient source selection for sparql endpoint federation
Efficient source selection for sparql endpoint federationEfficient source selection for sparql endpoint federation
Efficient source selection for sparql endpoint federationMuhammad Saleem
 
Linked data 101: Getting Caught in the Semantic Web
Linked data 101: Getting Caught in the Semantic Web Linked data 101: Getting Caught in the Semantic Web
Linked data 101: Getting Caught in the Semantic Web Morgan Briles
 

Ähnlich wie Linked Data and Sevices (20)

Linked Data and Services
Linked Data and ServicesLinked Data and Services
Linked Data and Services
 
Searching Linked Data
Searching Linked DataSearching Linked Data
Searching Linked Data
 
Omitola birmingham cityuniv
Omitola birmingham cityunivOmitola birmingham cityuniv
Omitola birmingham cityuniv
 
2011linked science4mccuskermcguinnessfinal
2011linked science4mccuskermcguinnessfinal2011linked science4mccuskermcguinnessfinal
2011linked science4mccuskermcguinnessfinal
 
2010 06 ipaw_prv
2010 06 ipaw_prv2010 06 ipaw_prv
2010 06 ipaw_prv
 
Linked Data
Linked DataLinked Data
Linked Data
 
Linked Data
Linked DataLinked Data
Linked Data
 
Introduction to linked data
Introduction to linked dataIntroduction to linked data
Introduction to linked data
 
2009 0807 Lod Gmod
2009 0807 Lod Gmod2009 0807 Lod Gmod
2009 0807 Lod Gmod
 
Linked data HHS 2015
Linked data HHS 2015Linked data HHS 2015
Linked data HHS 2015
 
Linking Open Data
Linking Open DataLinking Open Data
Linking Open Data
 
Friday talk 11.02.2011
Friday talk 11.02.2011Friday talk 11.02.2011
Friday talk 11.02.2011
 
Linked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale IntegrationLinked Data Driven Data Virtualization for Web-scale Integration
Linked Data Driven Data Virtualization for Web-scale Integration
 
Introducing the Linked Data Research Centre
Introducing the Linked Data Research CentreIntroducing the Linked Data Research Centre
Introducing the Linked Data Research Centre
 
Information Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudInformation Extraction and Linked Data Cloud
Information Extraction and Linked Data Cloud
 
Make our Scientific Datasets Accessible and Interoperable on the Web
Make our Scientific Datasets Accessible and Interoperable on the WebMake our Scientific Datasets Accessible and Interoperable on the Web
Make our Scientific Datasets Accessible and Interoperable on the Web
 
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talkDistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
DistLODStats: Distributed Computation of RDF Dataset Statistics - ISWC 2018 talk
 
Efficient source selection for sparql endpoint federation
Efficient source selection for sparql endpoint federationEfficient source selection for sparql endpoint federation
Efficient source selection for sparql endpoint federation
 
Linked data 101: Getting Caught in the Semantic Web
Linked data 101: Getting Caught in the Semantic Web Linked data 101: Getting Caught in the Semantic Web
Linked data 101: Getting Caught in the Semantic Web
 
Linked GeoData - WhereCampDC 20110610
Linked GeoData - WhereCampDC 20110610Linked GeoData - WhereCampDC 20110610
Linked GeoData - WhereCampDC 20110610
 


Linked Data and Services

  • 1. Linked Data and Services. Andreas Harth and Barry Norton, Institute AIFB. KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association, www.kit.edu
  • 2. Outline
      • Motivation
      • Linked Data Principles
      • Query Processing over Linked Data
      • Linked Data Services (LIDS) and Linked Open Services (LOS)
      • Conclusion
  • 3. Motivation
      • Semantic Web/Linked Data technologies are well suited for data integration
      • [Figure: Common Data Format/Access Protocol, Data Integration, Interactive Data Exploration]
      • Footer: 8/10/11, Taking the LIDS off Data Silos, Andreas Harth
  • 4. Linked Data Principles (*)
      1. Use URIs to name things: not only documents, but also people, locations, concepts, etc.
      2. To enable agents (human users and machine agents alike) to look up those names, use HTTP URIs.
      3. When someone looks up a URI, we provide useful information; by 'useful' in the strict sense we usually mean structured data in RDF.
      4. Include links to other URIs, allowing agents (machines and humans) to discover more things.
      (*) http://www.w3.org/DesignIssues/LinkedData.html
  • 5. Correspondence between thing-URI and source-URI
      • [Figure: to look up the thing-URI http://www.polleres.net/foaf.rdf#me, the User Agent sends an HTTP GET for the source-URI http://www.polleres.net/foaf.rdf to the Web Server, which returns RDF]
  • 6. Correspondence between thing-URI and source-URI
      • [Figure: the User Agent sends an HTTP GET for the thing-URI http://dbpedia.org/resource/Gordon_Brown and the Web Server answers with HTTP 303; a second HTTP GET for the source-URI http://dbpedia.org/data/Gordon_Brown returns RDF. Also shown: http://dbpedia.org/page/Gordon_Brown]
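The two correspondences on slides 5 and 6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `source_uri` is a hypothetical helper, and the `REDIRECTS` table stands in for a real server answering with HTTP 303, so no network access is needed.

```python
from urllib.parse import urldefrag

# Hypothetical stand-in for a server that answers with HTTP 303 See Other.
REDIRECTS = {
    "http://dbpedia.org/resource/Gordon_Brown": "http://dbpedia.org/data/Gordon_Brown",
}


def source_uri(thing_uri: str) -> str:
    """Return the URI of the document expected to describe thing_uri."""
    # Hash URIs: the source-URI is the thing-URI without its fragment.
    without_fragment = urldefrag(thing_uri).url
    # Slash URIs: a 303 redirect (simulated here) points to the description.
    return REDIRECTS.get(without_fragment, without_fragment)
```

For example, `source_uri("http://www.polleres.net/foaf.rdf#me")` drops the fragment, while the DBpedia URI is mapped through the simulated redirect.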
  • 7. (no text on slide)
  • 8. Queries over Linked Data
      SELECT ?f ?n WHERE {
        an:f#ah foaf:knows ?f.
        ?f foaf:name ?n.
      }
      SELECT ?x1 ?x2 WHERE {
        dblppub:HoganHP08 dc:creator ?a1.
        ?x1 owl:sameAs ?a1.
        ?x2 foaf:knows ?x1.
      }
      • [Figure: result table with columns ?f and ?n]
  • 9. Querying Data Across Sources
      • Data warehousing or materialisation-based approaches (MAT): CRAWL, INDEX, SERVE
      • Distributed query processing approaches (DQP)
      • [Figure: a query SELECT * FROM… evaluated over relations R and S]
      • Footer: 15.03.2010, Andreas Harth, Data Summaries for On-Demand Queries over Linked Data
  • 10. DQP on Linked Data
      • [Figure: in relational DQP, SELECT * FROM… reaches the relations R and S over ODBC; on Linked Data, SELECT ?s WHERE… is broken into triple patterns (TP) that are answered via HTTP GET]
  • 11. Query Processing Overview
      • Query: SELECT ?f ?n WHERE { an:f#ah foaf:knows ?f. ?f foaf:name ?n. }
      • Triple patterns (TP): (an:f#ah foaf:knows ?f) and (?f foaf:name ?n)
      • For each triple pattern: select source(s), then HTTP GET, which returns RDF
      • Result: ?f = http://danbri.org/foaf.rdf#danbri, ?n = Dan Brickley
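The pipeline on slide 11 (triple pattern, source selection, HTTP GET, join) can be sketched as follows. This is a simplified illustration, not the authors' engine: the `WEB` dictionary replaces real HTTP GET requests, the `http://example.org/ah.rdf#ah` URI is a made-up stand-in for `an:f#ah`, and sources are selected only by stripping the fragment from a bound subject.

```python
from urllib.parse import urldefrag

# Hypothetical in-memory "Web": source-URI -> triples an HTTP GET would return.
WEB = {
    "http://example.org/ah.rdf": [
        ("http://example.org/ah.rdf#ah", "foaf:knows", "http://danbri.org/foaf.rdf#danbri"),
    ],
    "http://danbri.org/foaf.rdf": [
        ("http://danbri.org/foaf.rdf#danbri", "foaf:name", "Dan Brickley"),
    ],
}


def is_var(term):
    return term.startswith("?")


def select_sources(pattern):
    """Pick sources from the subject URI only (a deliberate simplification)."""
    subject = pattern[0]
    return [] if is_var(subject) else [urldefrag(subject).url]


def match(pattern, triple, binding):
    """Extend binding so that pattern equals triple; return None on mismatch."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def evaluate(patterns):
    """Evaluate the triple patterns left to right, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            # Substitute variables bound so far before selecting sources.
            bound = tuple(binding.get(term, term) for term in pattern)
            for source in select_sources(bound):
                for triple in WEB.get(source, []):
                    m = match(bound, triple, binding)
                    if m is not None:
                        extended.append(m)
        bindings = extended
    return bindings


QUERY = [
    ("http://example.org/ah.rdf#ah", "foaf:knows", "?f"),
    ("?f", "foaf:name", "?n"),
]
```

Calling `evaluate(QUERY)` binds `?f` from the first source and then uses that binding to choose the second source, which mirrors the two lookups drawn on the slide.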
  • 12. Barry
  • 13. Problem: Source Selection for Triple Patterns
      • (?s ?p ?o)
      • (#s ?p ?o)
      • (?s #p ?o)
      • (?s ?p #o)
      • (#s #p ?o)
      • (#s ?p #o)
      • (?s #p #o)
      • (#s #p #o)
      • Given a triple pattern, which source can contribute bindings for the triple pattern?
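The eight shapes listed on slide 13 are all combinations of bound (`#`) and variable (`?`) positions. A small sketch, with hypothetical helper names, that enumerates them and classifies a concrete pattern:

```python
from itertools import product


def shape(pattern):
    """Classify a triple pattern, e.g. ('?f', 'foaf:name', '?n') -> '(?s #p ?o)'."""
    marks = ("?" if term.startswith("?") else "#" for term in pattern)
    return "(" + " ".join(mark + name for mark, name in zip(marks, "spo")) + ")"


# All 2^3 shapes, generated from placeholder terms.
ALL_SHAPES = [
    shape(tuple("?x" if is_variable else "x" for is_variable in flags))
    for flags in product([True, False], repeat=3)
]
```

The shape of a pattern decides which of the source selection approaches on the following slides can handle it.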
  • 14. Schema-Level Indices [Stuckenschmidt et al. 2004]
      • Keep an index of the properties and/or classes contained in sources
      • Covered patterns: (?s #p ?o), (?s rdf:type #o)
      • Covers only queries containing schema-level elements
      • Commonly used properties potentially select too many sources
      • Example queries:
          SELECT ?x1 ?x2 WHERE { dblppub:HoganHP08 dc:creator ?a1. ?x1 owl:sameAs ?a1. ?x2 foaf:knows ?x1. }
          SELECT ?f ?n WHERE { an:f#ah foaf:knows ?f. ?f foaf:name ?n. }
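A schema-level index as described on slide 14 can be sketched as a map from properties and classes to the sources that use them. The function names and the `None` return value for uncovered patterns are assumptions made for this illustration, not part of the cited work.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def build_schema_index(sources):
    """Map each property and each class to the set of sources that use it."""
    index = defaultdict(set)
    for source, triples in sources.items():
        for _s, p, o in triples:
            index[("property", p)].add(source)
            if p == RDF_TYPE:
                index[("class", o)].add(source)
    return index


def select_sources(index, pattern):
    """Return candidate sources, or None if the pattern has no schema element."""
    _s, p, o = pattern
    if p.startswith("?"):
        return None  # e.g. (#s ?p ?o) is not covered by a schema-level index
    if p == RDF_TYPE and not o.startswith("?"):
        return set(index.get(("class", o), set()))
    return set(index.get(("property", p), set()))
```

With a widely used property such as `foaf:knows`, `select_sources` returns every source that mentions it, which is the "too many sources" drawback named on the slide.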
  • 15. Direct Lookup (DL) [Hartig et al. 2009]
      • Exploits the correspondence between thing-URI and source-URI
      • Linked Data sources (aka RDF files) typically return triples with a subject corresponding to the source: (#s ?p ?o), (#s #p ?o), (#s #p #o)
      • Sometimes the sources return triples with an object corresponding to the source: (?s ?p #o), (?s #p #o)
      • Incomplete wrt. patterns, but also wrt. URI reuse across sources
      • Limited parallelism; unclear how to schedule lookups
      • Example queries:
          SELECT ?x1 ?x2 WHERE { dblppub:HoganHP08 dc:creator ?a1. ?x1 owl:sameAs ?a1. ?x2 foaf:knows ?x1. }
          SELECT ?f ?n WHERE { an:f#ah foaf:knows ?f. ?f foaf:name ?n. }
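Direct lookup, as summarised on slide 15, derives candidate sources from the URIs that appear in subject or object position. The sketch below assumes full HTTP URIs and hash-style source-URIs only; it ignores 303 redirects, and `direct_lookup_sources` is a hypothetical name.

```python
from urllib.parse import urldefrag


def direct_lookup_sources(pattern):
    """Return the source-URIs obtained from the subject and object of a pattern."""
    subject, _predicate, obj = pattern
    candidates = []
    for term in (subject, obj):
        if term.startswith(("http://", "https://")):
            source = urldefrag(term).url
            if source not in candidates:
                candidates.append(source)
    return candidates
```

A pattern such as `("?s", "foaf:name", "?o")` yields no candidates at all, which illustrates why the approach is incomplete with respect to patterns.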
  • 16. Approximate Data Summaries
      • Combined description of schema level and instance level
      • Use approximation to reduce index size (incurs false positives)
      • Possible to use the entire query for source selection
      • Parallel lookups, since sources can be determined for the entire query
      • Covered patterns: (?s ?p ?o), (#s ?p ?o), (?s #p ?o), (?s ?p #o), (#s #p ?o), (#s ?p #o), (?s #p #o), (#s #p #o), and combinations of triple patterns
      • Example queries:
          SELECT ?x1 ?x2 WHERE { dblppub:HoganHP08 dc:creator ?a1. ?x1 owl:sameAs ?a1. ?x2 foaf:knows ?x1. }
          SELECT ?f ?n WHERE { an:f#ah foaf:knows ?f. ?f foaf:name ?n. }
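The idea of trading index size for false positives can be illustrated with a simple hash-bucket summary. This structure is swapped in purely for illustration, since the slide does not say which summary structure the authors use; the bucket count and class name are assumptions. It handles single triple patterns only, not combinations.

```python
import zlib

BUCKETS = 64  # assumption: kept small so that collisions (false positives) can occur


def bucket(term):
    """Deterministically hash a term into one of BUCKETS buckets."""
    return zlib.crc32(term.encode("utf-8")) % BUCKETS


class Summary:
    """Per-source set of hashed triples; may over-report but never misses a source."""

    def __init__(self):
        self.entries = {}  # source -> set of (s_bucket, p_bucket, o_bucket)

    def add(self, source, triple):
        hashed = tuple(bucket(term) for term in triple)
        self.entries.setdefault(source, set()).add(hashed)

    def sources_for(self, pattern):
        # Variables match any bucket; bound terms must hash to the same bucket.
        wanted = [None if term.startswith("?") else bucket(term) for term in pattern]
        return {
            source
            for source, hashed_triples in self.entries.items()
            if any(
                all(w is None or w == h for w, h in zip(wanted, hashed))
                for hashed in hashed_triples
            )
        }
```

Because both schema-level terms (properties, classes) and instance-level terms (subjects, objects) are hashed, every one of the eight pattern shapes can be answered, at the price of occasionally contacting a source that contributes nothing.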
  • 17. Implementation
      • Deploy wrappers "in the cloud"
      • Google App Engine: hosting of Java and Python webapps on Google's Cloud infrastructure
      • Limited amount of processing time (6hrs/day)
      • Single-threaded applications
      • Suited for deploying wrappers
      • e.g. http://twitter2foaf.appspot.com/ converts Twitter user data to RDF
  • 18. Linking Open Data Cloud 2007
  • 19. Linking Open Data Cloud 2008
  • 20. Linking Open Data Cloud 2009
  • 21. Linking Open Data Cloud 2010
  • 22. Geonames Services
  • 23. Geonames Services
  • 24. Geonames Services
      {"weatherObservation": {"clouds":"broken clouds", "weatherCondition":"drizzle", "observation":"LESO 251300Z 03007KT 340V040 CAVOK 23/15 Q1010", "windDirection":30, "ICAO":"LESO", ...
  • 25. Geonames Services
      {"weatherObservation": {"clouds":"broken clouds", "weatherCondition":"drizzle", "observation":"LESO 251300Z 03007KT 340V040 CAVOK 23/15 Q1010", "windDirection":30, "ICAO":"LESO", ...
  • 26. Linked Open Service Principles
      • REST Principles
          1. Application state and functionality are divided into resources
          2. Every resource is uniquely addressable
          3. All resources share a uniform interface:
             a) A constrained set of well-defined operations
             b) A constrained set of content types
      • Linked Data Principles
          1. Use URIs as names for things
          2. Use HTTP URIs so that people can look up those names
          3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
          4. Include links to other URIs, so that they can discover more things
      • Linked Open Service Principles
          1. Describe services as LOD prosumers with input and output descriptions as SPARQL graph patterns
          2. Communicate RDF by RESTful content negotiation
          3. The output should make explicit its relation with the input
  • 27. LOS Weather Service
  • 28. LOS Geo Resources
  • 29. Resource-Based Linked Open Services
      • Linked Data: GET with Accept: text/html is answered with 303 REDIRECT /page; GET with Accept: application/rdf+xml (or text/n3) is answered with 303 REDIRECT /data
      • Linked Service: GET /weather with Accept: application/rdf+xml (or text/n3) is answered with 200 and <rdf:Description>
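The interaction on slide 29 can be sketched as a single dispatch function. This is a toy model rather than a real HTTP server: the function name is hypothetical, the Accept header is matched by substring instead of being parsed, and the 406 fallback is an assumption that does not appear on the slide.

```python
RDF_TYPES = ("application/rdf+xml", "text/n3")


def respond(path, accept):
    """Return (status, redirect target or body) for a GET on path."""
    wants_rdf = any(media_type in accept for media_type in RDF_TYPES)
    if path == "/weather":
        # Linked Service: the RDF description is returned directly.
        return (200, "<rdf:Description>") if wants_rdf else (406, "")
    if wants_rdf:
        # Linked Data resource: redirect machines to the data document.
        return (303, "/data")
    if "text/html" in accept:
        # Linked Data resource: redirect browsers to the HTML page.
        return (303, "/page")
    return (406, "")
```

The same Accept header therefore leads to a redirect for the resource but to a direct 200 response for the service.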
  • 30. Interlinking Data with Data from Services?
  • 31. Data Services
      • Given input, provide output
      • Input and output are related in a service-specific way
      • Do not change the state of the world
      • [Figure: the Service defines the relation between Input and Output]
      • E.g. GeoNames findNearbyWikipedia service
          • Input: lat/lon
          • Output: places
          • Relation: output places that are nearby input place
  • 32. Linked Data Services
      • We’d like to integrate data services with Linked Data
          1. LIDS need to adhere to Linked Data principles
      • We’d like to use data services in software programs
          2. LIDS need machine-readable descriptions of input and output
      • Compared to naïve approach (assign URI to service output):
          • Relationship between input and output is explicitly described
          • Dynamicity is supported
  • 33. 1. Data Services as Linked Data
      • Input is given as URI: http://geowrap.openlids.org/findNearbyWikipedia?lat=37.416&lng=-122.152#point
          • Service Endpoint: http://geowrap.openlids.org/findNearbyWikipedia
          • Parameters: ?lat=37.416&lng=-122.152
          • Input Identifier: #point
      • Resolving the URI yields RDF:
          @prefix dbp: <http://dbpedia.org/resource/> .
          @prefix : <http://geo..Wiki?lat=37.416&lng=-122.152#> .
          :point foaf:based_near dbp:Palo_Alto%2C_California ;
                 foaf:based_near dbp:Packard%27s_garage .
      • In these triples, :point is the Input, foaf:based_near the Relation, and the dbp: resources the Output
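The URI on slide 33 is composed of endpoint, parameters, and input identifier. A minimal sketch of that composition, where `lids_input_uri` is a hypothetical helper name:

```python
from urllib.parse import urlencode


def lids_input_uri(endpoint, local_id, **params):
    """Build endpoint?parameters#local_id, the URI naming the service input."""
    return f"{endpoint}?{urlencode(params)}#{local_id}"


uri = lids_input_uri(
    "http://geowrap.openlids.org/findNearbyWikipedia",
    "point",
    lat="37.416",
    lng="-122.152",
)
```

Here `uri` equals the input URI shown on the slide; passing the coordinates as strings avoids any change in their decimal formatting.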
  • 34. 2. LIDS Descriptions
      • LIDS characterised by
          • Endpoint URI ep, which is the base for all input entities
          • Local identifier i of input entity
          • List of parameters Xi
          • Basic graph pattern Ti describing conditions on parameters
          • Basic graph pattern To describing minimum output data
      • Example:
          ep = <http://geowrap.openlids.org/findNearbyWikipedia>
          i = point
          Xi = {?lat, ?lng}
          Ti = ?point a Point . ?point geo:lat ?lat . ?point geo:long ?lng
          To = ?point foaf:based_near ?feature
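The five components of a LIDS description on slide 34 map naturally onto a small record type. The class and field names below are assumptions for illustration; the graph patterns are kept as plain strings, and selecting `?point` alongside the parameters is a choice made here so that the input entity can later be linked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LidsDescription:
    endpoint: str          # ep
    local_id: str          # i
    parameters: tuple      # Xi
    input_pattern: str     # Ti
    output_pattern: str    # To

    def input_query(self) -> str:
        """Render 'select Xi where Ti', also selecting the input entity ?i."""
        variables = " ".join((f"?{self.local_id}",) + self.parameters)
        return f"SELECT {variables} WHERE {{ {self.input_pattern} }}"


GEOWRAP = LidsDescription(
    endpoint="http://geowrap.openlids.org/findNearbyWikipedia",
    local_id="point",
    parameters=("?lat", "?lng"),
    input_pattern="?point a Point . ?point geo:lat ?lat . ?point geo:long ?lng",
    output_pattern="?point foaf:based_near ?feature",
)
```

`GEOWRAP.input_query()` then produces a query string that finds every entity in a data set that could serve as input to the service.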
  • 35. Interlink LIDS and Linked Data
      • Generate service URIs with input bindings, from evaluating: select Xi where Ti
      • sameAs: binding for i
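The interlinking step on slide 35 can be sketched as turning each query binding into one owl:sameAs triple. The binding rows below are invented sample data (the `http://example.org/places#hp` URI is hypothetical), and the rows are assumed to come from evaluating "select Xi where Ti" elsewhere.

```python
from urllib.parse import urlencode

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def interlink(endpoint, local_id, parameters, bindings):
    """For each binding row, link the bound input entity to its service URI."""
    links = []
    for row in bindings:
        query = urlencode([(name, row[name]) for name in parameters])
        service_uri = f"{endpoint}?{query}#{local_id}"
        links.append((row[local_id], OWL_SAME_AS, service_uri))
    return links


# Hypothetical result rows of "select Xi where Ti" over some data set.
rows = [{"point": "http://example.org/places#hp", "lat": "37.416", "lng": "-122.152"}]
links = interlink(
    "http://geowrap.openlids.org/findNearbyWikipedia", "point", ("lat", "lng"), rows
)
```

Each generated triple states that the entity found in the data and the service's input entity are the same, so dereferencing the service URI adds the service output to what is known about the entity.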
  • 36. Scale-Up Experiment: Link BTC to GeoNames
      • 3 billion triples from the Billion Triple Challenge (BTC) 2010 data set
      • Annotate with LIDS wrapper of GeoNames findNearby service
      • Annotation time: < 12 hours on a laptop!
      • ~ 12 hours for uncompressing the data set, cleaning results, and gathering statistics
      • Original BTC data: 74 different domains that linked to GeoNames URIs
      • Interlinking process added 891 new domains, now linked to LIDS geowrap
      • In total 2,448,160 new links were added
  • 37. Query Answering using LIDS and Linked Data
      • Query execution resolves URIs => enlarges data set
      • LIDS are interlinked
      • Query is executed again on new data set
      • Repeat until no new links or no new data
      • Combine results
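The loop on slide 37 is a fixpoint iteration. The sketch below assumes the caller supplies three functions, `query`, `resolve`, and `interlink`, which are placeholders for query evaluation, URI dereferencing, and LIDS interlinking; only the control flow is shown.

```python
def answer(query, dataset, resolve, interlink):
    """Re-run query while resolving URIs and interlinking still add triples."""
    data = set(dataset)
    results = set()
    while True:
        # Execute the query on the current data set and combine the results.
        results |= set(query(data))
        # Gather new links and newly resolved data.
        new = set(interlink(data)) | set(resolve(data))
        if new <= data:
            # No new links and no new data: stop.
            return results
        data |= new
```

Termination depends on `resolve` and `interlink` eventually returning nothing new, which holds whenever the reachable data is finite.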
  • 38. Experiment: Query Answering
      • Input: List of 562 (potential) universities from Facebook Graph API
      • Output: Facebook fans and DBpedia student numbers for 104 universities
      • Query:
          PREFIX u: <http://openlids.org/universities.rdf#>
          SELECT ?n ?f ?s WHERE {
            u:list foaf:topic ?u .
            ?u foaf:name ?n .
            ?u og:fan_count ?f .
            ?u d:numberOfStudents ?s
          }
  • 39. Linked Services and PlanetData
      • Several areas seem likely to produce services:
          • Stream, inc. Sensor, resources (latest values)
          • Any others exposing dynamic resources
          • Dynamic computations, inc. on-the-fly quality assessments
      • Other areas seem likely to consider service technologies and move towards more service-like HTTP interactions
          • Access control (OpenID, OAuth, etc.)
      • Finally, remaining areas could serve to complement LIDS/LOS alignment
          • Provenance