Exploring Linked Data content
                  through network analysis
                         Christophe Guéret (@cgueret)
                          Free University Amsterdam

      Co-explorers: Stefan Schlobach, Shenghui Wang,
              Paul Groth, Frank van Harmelen




http://latc-project.eu                                  http://www.vu.nl
Outline of the talk
     What is Linked Data?


     What is there to be analysed?


     Do we miss something?


     New research directions and first results



November 23, 2011       Analysis of Linked Data   2/35
Linked Data (aka Semantic Web)


                    Linked Data




                                   http://www.flickr.com/photos/erikcharlton/3337465138
What is the problem?
    Frank and Christophe publish some open data
    Roi wants to combine and enrich it


    Frank (published on the WWW):
                    Kennissen       Staad
                    Christophe      Amsterdam
                    Peter           Barcelona
                    David           Parijs

    Christophe (published on the WWW):
                    Ville           Pays
                    Barcelone       Espagne
                    Paris           France
                    Amsterdam       Pays-Bas

    Roi: reads both tables from the WWW


                                                               Marvel icons: mermer, DeviantArt
What is the problem?
       Kennissen      Staad                 Ville         Pays
       Christophe     Amsterdam             Barcelone     Espagne
       Peter          Barcelona      +      Paris         France        =   ?
       David          Parijs                Amsterdam     Pays-Bas
    Data integration issue
        “Kennissen”, “Staad”, “Ville”, “Pays” ?
        “Paris” = “Parijs” ?
        “Amsterdam” = “Amsterdam” ?

    A lot of work, which must be redone on every update
A solution
     Do data integration at the data level

     Use, and re-use, unambiguous identifiers

     Use meta-level descriptions of the identifiers

     Proposal: use the Web as a platform
         Identifiers = URIs
         Descriptions = de-referenced documents
Frank publishes his data
    Kennissen      Staad
    Christophe     Amsterdam
    Peter          Barcelona
    David          Parijs

   This is a “triple”
                               ex:Acquaintance

                        rdf:type       rdf:type         rdf:type


    ex:Christophe                  ex:Peter               ex:David

                ex:worksIn              ex:worksIn                 ex:worksIn

 dbpedia:Amsterdam            dbpedia:Barcelona         dbpedia:Paris



                                                                   Use of compact URIs
                                                                   dbpedia = http://dbpedia.org/resource/
                                                                   ex = http://example.org/
                                                                   rdf = http://www.w3.org/1999/02/22-rdf-syntax-ns#
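
As a minimal sketch of the compact-URI notation above, the snippet below expands prefixed names into full URIs and writes Frank's data as subject-predicate-object triples. The prefix table copies the three namespaces from the slide. The helper names `expand` and `expand_triple` and the plain-tuple representation are assumptions made for illustration, not the API of any RDF library.

```python
# Prefix table copied from the slide.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "ex": "http://example.org/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(curie: str) -> str:
    """Expand a compact URI such as 'ex:Peter' into a full URI (hypothetical helper)."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


def expand_triple(triple):
    """Expand every term of a (subject, predicate, object) tuple."""
    return tuple(expand(term) for term in triple)


# Frank's table rewritten as triples, following the graph on the slide.
FRANK = [
    ("ex:Christophe", "rdf:type", "ex:Acquaintance"),
    ("ex:Christophe", "ex:worksIn", "dbpedia:Amsterdam"),
    ("ex:Peter", "rdf:type", "ex:Acquaintance"),
    ("ex:Peter", "ex:worksIn", "dbpedia:Barcelona"),
    ("ex:David", "rdf:type", "ex:Acquaintance"),
    ("ex:David", "ex:worksIn", "dbpedia:Paris"),
]

if __name__ == "__main__":
    # Print one line per triple, with each full URI in angle brackets.
    for triple in FRANK:
        print(" ".join(f"<{uri}>" for uri in expand_triple(triple)) + " .")
```

Each printed line names a subject, a predicate and an object by full URI, so anyone re-using `dbpedia:Paris` refers to the same resource without a string-matching step.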




Christophe re-uses part of Frank's data to publish his data
    Ville          Pays
    Barcelone      Espagne
    Paris          France
    Amsterdam      Pays-Bas
                           ex:Acquaintance

                    rdf:type          rdf:type         rdf:type


    ex:Christophe                ex:Peter                ex:David

            ex:worksIn                ex:worksIn                  ex:worksIn

 dbpedia:Amsterdam        dbpedia:Barcelona            dbpedia:Paris

            ex:isIn                   ex:isIn                     ex:isIn

 dbpedia:Netherlands           dbpedia:Spain          dbpedia:France




Roi adds some more information
                                “Conocido”@es

                                       rdfs:label


                                ex:Acquaintance

                    rdf:type            rdf:type         rdf:type


    ex:Christophe                  ex:Peter                ex:David

            ex:worksIn                  ex:worksIn                  ex:worksIn

 dbpedia:Amsterdam          dbpedia:Barcelona            dbpedia:Paris

            ex:isIn                     ex:isIn                     ex:isIn

 dbpedia:Netherlands             dbpedia:Spain          dbpedia:France


                      ex:isIn           ex:isIn           ex:isIn

                                dbpedia:Europe

dbpedia:Amsterdam




Reasoning with Semantics                                                  Bonus!


      dbpedia:Amsterdam                  ex:isIn                 dbpedia:Amsterdam


                    ex:isIn                   rdf:type


      dbpedia:Netherlands     +   owl:TransitiveProperty     =            ex:isIn


                    ex:isIn


        dbpedia:Europe                                            dbpedia:Europe



    Example usage
         Materialize implicit information
         Check for consistency
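
A minimal sketch of what "materialize implicit information" can look like for the example above: every predicate declared as an `owl:TransitiveProperty` is closed under transitivity by naive fixed-point iteration. The function names and the tuple encoding are assumptions for illustration. A real reasoner covers far more of OWL and scales very differently.

```python
TRANSITIVE = "owl:TransitiveProperty"


def transitive_predicates(triples):
    """Predicates that the data itself declares to be transitive."""
    return {s for (s, p, o) in triples if p == "rdf:type" and o == TRANSITIVE}


def materialize(triples):
    """Return the input triples plus everything implied by transitivity."""
    closed = set(triples)
    predicates = transitive_predicates(closed)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for pred in predicates:
            edges = [(s, o) for (s, p, o) in closed if p == pred]
            for a, b in edges:
                for c, d in edges:
                    # a -> b and b -> d implies a -> d
                    if b == c and (a, pred, d) not in closed:
                        closed.add((a, pred, d))
                        changed = True
    return closed


FACTS = {
    ("dbpedia:Amsterdam", "ex:isIn", "dbpedia:Netherlands"),
    ("dbpedia:Netherlands", "ex:isIn", "dbpedia:Europe"),
    ("ex:isIn", "rdf:type", "owl:TransitiveProperty"),
}

if __name__ == "__main__":
    # Show only the triples that were inferred, not the stated ones.
    for triple in sorted(materialize(FACTS) - FACTS):
        print(triple)
```

On the three facts from the slide, the only new triple is the one linking `dbpedia:Amsterdam` to `dbpedia:Europe` via `ex:isIn`.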
Rough estimate of size
     295 data sets and 31 billion facts in the LOD Cloud




            Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
Lots of Data to analyze! :-)




                                        http://www.flickr.com/photos/argonne/3323018571
But analyzing what exactly?
     Table of facts published at different locations
     A distributed Knowledge Base
    Subject          Predicate             Object
 ex:Christophe      rdf:type         ex:Acquaintance
 ex:Christophe      ex:worksIn       dbpedia:Amsterdam
 ex:Peter           rdf:type         ex:Acquaintance
       ...               ...                  ...
                           Subject            Predicate               Object
                    dbpedia:Amsterdam        ex:isIn            dbpedia:Netherlands
                    dbpedia:Netherlands      ex:isIn            dbpedia:Europe
                               ...                  ...                 ...
                                                    Subject           Predicate          Object
                                              ex:Acquaintance        rdfs:label       “Conocido”@es
                                                          ...             ...              ...
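
To illustrate why these tables behave as one distributed knowledge base, the sketch below takes the rows shown above as three separately published sets, forms their union, and answers a question that no single source can answer alone. The source names, the helper functions and the in-memory sets are assumptions for illustration. In practice the triples would be fetched by de-referencing URIs.

```python
# The rows shown in the three tables, one set per publisher (names are illustrative).
SOURCES = {
    "frank": {
        ("ex:Christophe", "rdf:type", "ex:Acquaintance"),
        ("ex:Christophe", "ex:worksIn", "dbpedia:Amsterdam"),
        ("ex:Peter", "rdf:type", "ex:Acquaintance"),
    },
    "christophe": {
        ("dbpedia:Amsterdam", "ex:isIn", "dbpedia:Netherlands"),
        ("dbpedia:Netherlands", "ex:isIn", "dbpedia:Europe"),
    },
    "roi": {
        # rdfs:label is the standard RDF Schema labelling property.
        ("ex:Acquaintance", "rdfs:label", '"Conocido"@es'),
    },
}


def merge(sources):
    """Union of all published triples; shared URIs make the tables line up."""
    merged = set()
    for triples in sources.values():
        merged |= triples
    return merged


def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the set."""
    return sorted(o for (s, p, o) in triples if s == subject and p == predicate)


def countries_of_work(triples, person):
    """Follow ex:worksIn, then ex:isIn: a join that crosses two sources."""
    return sorted({
        country
        for city in objects(triples, person, "ex:worksIn")
        for country in objects(triples, city, "ex:isIn")
    })


if __name__ == "__main__":
    print(countries_of_work(merge(SOURCES), "ex:Christophe"))
```

Frank's data alone gives an empty answer. The merged set links `ex:Christophe` to `dbpedia:Netherlands` because both publishers used the same identifier for Amsterdam.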
Analysis workflow
 1. Gather a snapshot of triples
 2. Compute descriptive statistics
         Top resources (subject, predicate, object)
         Frequency of cross-link types (SP, SO, PO, ...)
         Connected components
         Path frequencies
         …


=> Already tricky, because the data is really big!
=> We should be able to get more out of the data
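
Two of the descriptive statistics listed above, sketched on a toy snapshot: the most frequent terms per triple position, and the connected components of the graph that links each subject to its object. The sample triples and function names are illustrative assumptions. At the scale of the LOD Cloud the same computations need distributed or streaming implementations.

```python
from collections import Counter

# A toy snapshot; real snapshots hold billions of triples.
TRIPLES = [
    ("ex:Christophe", "rdf:type", "ex:Acquaintance"),
    ("ex:Peter", "rdf:type", "ex:Acquaintance"),
    ("ex:Christophe", "ex:worksIn", "dbpedia:Amsterdam"),
    ("ex:Peter", "ex:worksIn", "dbpedia:Barcelona"),
    ("ex:David", "ex:worksIn", "dbpedia:Paris"),
    ("dbpedia:Amsterdam", "ex:isIn", "dbpedia:Netherlands"),
]


def top_resources(triples, position, n=3):
    """Most frequent terms in one position: 0 = subject, 1 = predicate, 2 = object."""
    return Counter(t[position] for t in triples).most_common(n)


def connected_components(triples):
    """Components of the undirected graph linking each subject to its object."""
    neighbours = {}
    for s, _, o in triples:
        neighbours.setdefault(s, set()).add(o)
        neighbours.setdefault(o, set()).add(s)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


if __name__ == "__main__":
    print(top_resources(TRIPLES, 1))
    print(sorted(len(c) for c in connected_components(TRIPLES)))
```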
Can we explain that?


                                   Suggestions
                                        Started the graph
                                        General knowledge
                                        Very well known




or that?


                                   Suggestions
                                       All published by Bio2RDF
                                       Well aware of each other
                                       Overlapping domain




Could we predict the impact of ...
     DBpedia being down for a while?


     SIOC renaming “User” to “UserAccount”?


     creating a dataset that turns out to be popular?




         Analysing a set of triples is not enough
Are we overlooking something?




It's not only about the resources
     Several entities related to the data
                                    ex:something             WWW


     Data publishers/consumers        Resources            Web servers

     Interactions between all of them




                                            WWW


There are different scales
      Triple level versus resource-group level
      Different data complexity at each scale
                                “Conocido”@es

                                                 rdfs:label


                                ex:Acquaintance

                           rdf:type              rdf:type        rdf:type


      ex:Christophe                   ex:Peter                  ex:David


               ex:worksIn                    ex:worksIn                ex:worksIn


   dbpedia:Amsterdam           dbpedia:Barcelona             dbpedia:Paris


                 ex:isIn                         ex:isIn                    ex:isIn


   dbpedia:Netherlands           dbpedia:Spain               dbpedia:France


                           ex:isIn               ex:isIn          ex:isIn

                                dbpedia:Europe




It is not a static network
     Size and topology evolve over time




         2007        2008                      2010




Linked Data is a Complex System
     Multiple scales of observation
     Emergence of properties
     The whole is more than the sum of the parts

=> Interactions/relations are important to
understand the system behavior

=> We can benefit from a large body of
research results from the study of Complex Systems
Initial findings and future work




                                              Ya3hs3/2531493704 on Flickr
New analysis workflow
 1. Gather a snapshot of triples


 2. Gather information about other types of interactions


 3. Create specific networks related to the research
    questions at hand


 4. Run metrics, interpret results
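
One hypothetical way to carry out step 3 is to collapse a triple snapshot into a weighted network between groups of resources, grouping either by hostname or by namespace. In this sketch the namespace heuristic (cut at the last `#` or `/`), the literal check (anything not starting with `http`) and all function names are simplifying assumptions, not the procedure used in the cited papers.

```python
from collections import Counter
from urllib.parse import urlsplit


def hostname(uri):
    """Host part of a URI, e.g. 'dbpedia.org'."""
    return urlsplit(uri).hostname


def namespace(uri):
    """Everything up to and including the last '#' or '/' (a common heuristic)."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[:cut + 1]


def project(triples, key):
    """Collapse triples into weighted directed edges between groups of resources."""
    edges = Counter()
    for s, _, o in triples:
        if not o.startswith("http"):
            continue  # simplification: treat anything else as a literal, not a node
        a, b = key(s), key(o)
        if a != b:  # only links that cross group boundaries
            edges[(a, b)] += 1
    return edges


TRIPLES = [
    ("http://example.org/Christophe", "http://example.org/worksIn",
     "http://dbpedia.org/resource/Amsterdam"),
    ("http://example.org/Peter", "http://example.org/worksIn",
     "http://dbpedia.org/resource/Barcelona"),
    ("http://example.org/Christophe",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://example.org/Acquaintance"),
    ("http://example.org/Acquaintance",
     "http://www.w3.org/2000/01/rdf-schema#label", '"Conocido"@es'),
]

if __name__ == "__main__":
    print(dict(project(TRIPLES, hostname)))
    print(dict(project(TRIPLES, namespace)))
```

The same `project` function yields different networks depending on the grouping key, so one snapshot can answer several research questions.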


The LOD is not what we think it is
    LOD Cloud 2009/2010 vs BTC 2009 crawl
        Crawled sample differs from the community-based view

    LOD Cloud has a lumpy structure


    Evolution of LOD Cloud
        Centrality changes
        Increased density and connectivity
                                                  Christophe Guéret, Shenghui Wang, Paul Groth et al. (2011)
                    Multi-scale Analysis of the Web Of Data: A Challenge to the Complex System's Community
                                                                       Advances in Complex Systems 14 (04)
The tools we need don't exist
     We need to flatten the networks to study them


     Some specific aspects of the system
         Existence of implicit links
         Multi-relational and dynamic
         Distributed
         Hypergraph of relations



                                                     Christophe Guéret, Shenghui Wang, Paul Groth et al. (2011)
                       Multi-scale Analysis of the Web Of Data: A Challenge to the Complex System's Community
                                                                          Advances in Complex Systems 14 (04)
Influence content<->social networks
     Generate and bind two networks
                                                          ex:a



                                                                                      ex:b



                                                                 ex:c



     Measure evolution of degree, betweenness,
     clustering over time
     Predict evolution
                                                                     Shenghui Wang, Paul Groth (2010)
                    Measuring the dynamic bi-directional influence between content and social networks
                           Proceedings of the 9th International Semantic Web Conference (ISWC2010)
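
A small sketch of tracking two of the metrics named on this slide, average degree and average clustering coefficient, across successive snapshots of a network. The three toy snapshots and the function names are made up for illustration and do not reproduce the data or method of the cited paper.

```python
def adjacency(edges):
    """Undirected adjacency sets from a list of (a, b) pairs, ignoring self-loops."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj) if adj else 0.0


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than two neighbours count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count each linked pair of neighbours once.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


# Toy snapshots indexed by time step (illustrative data only).
SNAPSHOTS = {
    0: [("a", "b"), ("b", "c")],
    1: [("a", "b"), ("b", "c"), ("a", "c")],
    2: [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")],
}

if __name__ == "__main__":
    for step, edges in sorted(SNAPSHOTS.items()):
        adj = adjacency(edges)
        print(step, round(average_degree(adj), 3), round(average_clustering(adj), 3))
```

Each metric becomes a time series, one value per snapshot, which is the kind of input a prediction step would then work from.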
Result for conferences




                                                                     Shenghui Wang, Paul Groth (2010)
                    Measuring the dynamic bi-directional influence between content and social networks
                           Proceedings of the 9th International Semantic Web Conference (ISWC2010)
Centrality to measure robustness
    Map the BTC2010 to two networks
        Semantic network based on namespaces
        Host network based on hostnames


    Measure robustness as the variance in betweenness
    centrality


    Find weak spots


    Optimize networks to increase robustness
                                                  Christophe Guéret, Paul Groth, Frank Van Harmelen et al. (2010)
                    Finding the Achilles Heel of the Web of Data : using network analysis for link-recommendation
                                      Proceedings of the 9th International Semantic Web Conference (ISWC2010)
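
The robustness measure on this slide can be sketched as follows: compute betweenness centrality for every node (here with Brandes' algorithm for unweighted, undirected graphs) and take the variance of the scores. The sketch assumes that lower variance means fewer bottleneck nodes and therefore a more robust network. The toy graphs and function names are illustrative and are not taken from the cited paper.

```python
from collections import deque
from statistics import pvariance


def betweenness(adj):
    """Brandes' algorithm on an unweighted, undirected graph {node: set(neighbours)}."""
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search that counts shortest paths from source.
        order = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


def robustness(adj):
    """Population variance of betweenness; assumed here: lower is more robust."""
    return pvariance(list(betweenness(adj).values()))


# A star has one weak spot (the hub); adding links between the leaves removes it.
STAR = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h"}}
FULL = {n: {"h", "a", "b", "c"} - {n} for n in "habc"}

if __name__ == "__main__":
    print(betweenness(STAR)["h"], robustness(STAR), robustness(FULL))
```

In the star, every shortest path between leaves runs through the hub, so the variance is high. Once the leaves are linked to each other, all scores drop to zero and so does the variance, which is the effect the last bullet on the slide aims for.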
Results on hostnames




                                                  Christophe Guéret, Paul Groth, Frank Van Harmelen et al. (2010)
                    Finding the Achilles Heel of the Web of Data : using network analysis for link-recommendation
                                      Proceedings of the 9th International Semantic Web Conference (ISWC2010)
Results on namespaces




                                                  Christophe Guéret, Paul Groth, Frank Van Harmelen et al. (2010)
                    Finding the Achilles Heel of the Web of Data : using network analysis for link-recommendation
                                      Proceedings of the 9th International Semantic Web Conference (ISWC2010)
Improving the network




                                                  Christophe Guéret, Paul Groth, Frank Van Harmelen et al. (2010)
                    Finding the Achilles Heel of the Web of Data : using network analysis for link-recommendation
                                      Proceedings of the 9th International Semantic Web Conference (ISWC2010)
Conclusion
     Take-home message
         Linked Data is not a simple knowledge base
         Network analysis tools give new insights into the data
         Results can be used to improve the network


     Future work
         Perform resource-centric rather than graph-centric
         analysis (a big bottleneck now)
         Tackle the time aspect of the data
         Find more analyses to perform and work out what they tell us


Weitere ähnliche Inhalte

Andere mochten auch

QER : query entity recognition
QER : query entity recognitionQER : query entity recognition
QER : query entity recognitionDhwaj Raj
 
The named entity recognition (ner)2
The named entity recognition (ner)2The named entity recognition (ner)2
The named entity recognition (ner)2Arabic_NLP_ImamU2013
 
Named Entity Recognition - ACL 2011 Presentation
Named Entity Recognition - ACL 2011 PresentationNamed Entity Recognition - ACL 2011 Presentation
Named Entity Recognition - ACL 2011 PresentationRichard Littauer
 
RDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization dataRDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization dataDave Lewis
 
Dynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsDynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsINRIA-OAK
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked DataEUCLID project
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked DataEUCLID project
 
Enhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER ModelsEnhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER ModelsJulien PLU
 
Natural language procssing
Natural language procssing Natural language procssing
Natural language procssing Rajnish Raj
 
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesA Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesPanos Alexopoulos
 
Exploiting Linked Open Data and Natural Language Processing for Classificati...
Exploiting Linked Open Data  and Natural Language Processing for Classificati...Exploiting Linked Open Data  and Natural Language Processing for Classificati...
Exploiting Linked Open Data and Natural Language Processing for Classificati...giuseppe_futia
 
Effective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionsEffective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionseXascale Infolab
 
Exploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalExploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalFaegheh Hasibi
 
Understanding Queries through Entities
Understanding Queries through EntitiesUnderstanding Queries through Entities
Understanding Queries through EntitiesPeter Mika
 
Being a PhD student: Experiences and Challenges
Being a PhD student: Experiences and ChallengesBeing a PhD student: Experiences and Challenges
Being a PhD student: Experiences and ChallengesFaegheh Hasibi
 

Andere mochten auch (20)

QER : query entity recognition
QER : query entity recognitionQER : query entity recognition
QER : query entity recognition
 
The named entity recognition (ner)2
The named entity recognition (ner)2The named entity recognition (ner)2
The named entity recognition (ner)2
 
Text mining
Text miningText mining
Text mining
 
Named Entity Recognition - ACL 2011 Presentation
Named Entity Recognition - ACL 2011 PresentationNamed Entity Recognition - ACL 2011 Presentation
Named Entity Recognition - ACL 2011 Presentation
 
RDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization dataRDF and other linked data standards — how to make use of big localization data
RDF and other linked data standards — how to make use of big localization data
 
Dynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsDynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data Platforms
 
Interaction with Linked Data
Interaction with Linked DataInteraction with Linked Data
Interaction with Linked Data
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
Discoverers of Surface Analysis
Discoverers of Surface AnalysisDiscoverers of Surface Analysis
Discoverers of Surface Analysis
 
Enhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER ModelsEnhancing Entity Linking by Combining NER Models
Enhancing Entity Linking by Combining NER Models
 
Natural language procssing
Natural language procssing Natural language procssing
Natural language procssing
 
Recipes for PhD
Recipes for PhDRecipes for PhD
Recipes for PhD
 
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in OntologiesA Vague Sense Classifier for Detecting Vague Definitions in Ontologies
A Vague Sense Classifier for Detecting Vague Definitions in Ontologies
 
NLP & DBpedia
 NLP & DBpedia NLP & DBpedia
NLP & DBpedia
 
Exploiting Linked Open Data and Natural Language Processing for Classificati...
Exploiting Linked Open Data  and Natural Language Processing for Classificati...Exploiting Linked Open Data  and Natural Language Processing for Classificati...
Exploiting Linked Open Data and Natural Language Processing for Classificati...
 
Effective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web CollectionsEffective Named Entity Recognition for Idiosyncratic Web Collections
Effective Named Entity Recognition for Idiosyncratic Web Collections
 
Exploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity RetrievalExploiting Entity Linking in Queries For Entity Retrieval
Exploiting Entity Linking in Queries For Entity Retrieval
 
Surface Analysis Techniques Feb & April 2013
Surface Analysis Techniques Feb & April 2013Surface Analysis Techniques Feb & April 2013
Surface Analysis Techniques Feb & April 2013
 
Understanding Queries through Entities
Understanding Queries through EntitiesUnderstanding Queries through Entities
Understanding Queries through Entities
 
Being a PhD student: Experiences and Challenges
Being a PhD student: Experiences and ChallengesBeing a PhD student: Experiences and Challenges
Being a PhD student: Experiences and Challenges
 

Ähnlich wie Exploring Linked Data content through network analysis

Linked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the CitizenLinked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the CitizenStefan Gradmann
 
Radically Open Cultural Heritage Data on the Web
Radically Open Cultural Heritage Data on the WebRadically Open Cultural Heritage Data on the Web
Radically Open Cultural Heritage Data on the WebJulie Allinson
 
wimmics and DBpedia FR
wimmics and DBpedia FRwimmics and DBpedia FR
wimmics and DBpedia FRJulienCojan
 
Session 5.2 multi-core meta-blocking for big linked data
Session 5.2   multi-core meta-blocking for big linked dataSession 5.2   multi-core meta-blocking for big linked data
Session 5.2 multi-core meta-blocking for big linked datasemanticsconference
 
Staab programming thesemanticweb
Staab programming thesemanticwebStaab programming thesemanticweb
Staab programming thesemanticwebAneta Tu
 
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic WebESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Webeswcsummerschool
 
Programming the Semantic Web
Programming the Semantic WebProgramming the Semantic Web
Programming the Semantic WebSteffen Staab
 
The Story behind Everything Is Connected: Multimedia narration of automatical...
The Story behind Everything Is Connected: Multimedia narration of automatical...The Story behind Everything Is Connected: Multimedia narration of automatical...
The Story behind Everything Is Connected: Multimedia narration of automatical...Miel Vander Sande
 
Detection of Contextual Identity Links in a Knowledge Base
Detection of Contextual Identity Links in a Knowledge BaseDetection of Contextual Identity Links in a Knowledge Base
Detection of Contextual Identity Links in a Knowledge BaseJoe Raad
 
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)net2-project
 

Ähnlich wie Exploring Linked Data content through network analysis (12)

Linked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the CitizenLinked Open Europeana: Semantics for the Citizen
Linked Open Europeana: Semantics for the Citizen
 
Radically Open Cultural Heritage Data on the Web
Radically Open Cultural Heritage Data on the WebRadically Open Cultural Heritage Data on the Web
Radically Open Cultural Heritage Data on the Web
 
wimmics and DBpedia FR
wimmics and DBpedia FRwimmics and DBpedia FR
wimmics and DBpedia FR
 
Session 5.2 multi-core meta-blocking for big linked data
Session 5.2   multi-core meta-blocking for big linked dataSession 5.2   multi-core meta-blocking for big linked data
Session 5.2 multi-core meta-blocking for big linked data
 
Staab programming thesemanticweb
Staab programming thesemanticwebStaab programming thesemanticweb
Staab programming thesemanticweb
 
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic WebESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
ESWC SS 2013 - Tuesday Keynote Steffen Staab: Programming the Semantic Web
 
Programming the Semantic Web
Programming the Semantic WebProgramming the Semantic Web
Programming the Semantic Web
 
The Story behind Everything Is Connected: Multimedia narration of automatical...
The Story behind Everything Is Connected: Multimedia narration of automatical...The Story behind Everything Is Connected: Multimedia narration of automatical...
The Story behind Everything Is Connected: Multimedia narration of automatical...
 
Linkedopencamera.it
Linkedopencamera.itLinkedopencamera.it
Linkedopencamera.it
 
Detection of Contextual Identity Links in a Knowledge Base
Detection of Contextual Identity Links in a Knowledge BaseDetection of Contextual Identity Links in a Knowledge Base
Detection of Contextual Identity Links in a Knowledge Base
 
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
SPARQL1.1 Tutorial, given in UChile by Axel Polleres (DERI)
 
Slides
SlidesSlides
Slides
 

Mehr von Christophe Guéret

HHAI June 2022 - KGs and Hybrid Intelligence
HHAI June 2022 - KGs and Hybrid IntelligenceHHAI June 2022 - KGs and Hybrid Intelligence
HHAI June 2022 - KGs and Hybrid IntelligenceChristophe Guéret
 
Informal presentation about RES
Informal presentation about RESInformal presentation about RES
Informal presentation about RESChristophe Guéret
 
Stop making tools! Nobody likes them anyway...
Stop making tools! Nobody likes them anyway...Stop making tools! Nobody likes them anyway...
Stop making tools! Nobody likes them anyway...Christophe Guéret
 
The Entity Registry System: Collaborative Editing of Entity Data in Poorly Co...
The Entity Registry System: Collaborative Editing of Entity Data in Poorly Co...The Entity Registry System: Collaborative Editing of Entity Data in Poorly Co...
The Entity Registry System: Collaborative Editing of Entity Data in Poorly Co...Christophe Guéret
 
Introduction about WorldWideSemanticWeb.org for the workshop "Making it Matter"
Introduction about WorldWideSemanticWeb.org for the workshop "Making it Matter"Introduction about WorldWideSemanticWeb.org for the workshop "Making it Matter"
Introduction about WorldWideSemanticWeb.org for the workshop "Making it Matter"Christophe Guéret
 
The Entity Registry System (ERS)
Exploring Linked Data content through network analysis

  • 1. Exploring Linked Data content through network analysis
      Christophe Guéret (@cgueret), Free University Amsterdam
      Co-explorers: Stefan Schlobach, Shenghui Wang, Paul Groth, Frank van Harmelen
      http://latc-project.eu, http://www.vu.nl
  • 2. Outline of the talk
      - What is Linked Data?
      - What is there to be analysed?
      - Do we miss something?
      - New research directions and first results
      November 23, 2011, Analysis of Linked Data
  • 3. Linked Data (aka Semantic Web)
      Image: http://www.flickr.com/photos/erikcharlton/3337465138
  • 4. What is the problem?
      - Frank and Christophe publish some open data
      - Roi wants to combine and enrich it
      Frank's table, published on the WWW:
        Kennissen  | Staad
        Christophe | Amsterdam
        Peter      | Barcelona
        David      | Parijs
      Christophe's table, published on the WWW:
        Ville     | Pays
        Barcelone | Espagne
        Paris     | France
        Amsterdam | Pays-Bas
      Marvel icons: mermer, DeviantArt
  • 5. What is the problem?
      (Kennissen, Staad) table + (Ville, Pays) table = ?
      Data integration issue:
      - “Kennissen”, “Staad”, “Ville”, “Pays”?
      - “Paris” = “Parijs”?
      - “Amsterdam” = “Amsterdam”?
      A lot of work, and it must be done again on every update
  • 6. A solution
      Do data integration at the data level:
      - Use, and re-use, unambiguous identifiers
      - Use meta-level descriptions of the identifiers
      Proposal: use the Web as a platform
      - Identifiers = URIs
      - Descriptions = de-referenced documents
  • 7. Frank publishes his data
      Source table:
        Kennissen  | Staad
        Christophe | Amsterdam
        Peter      | Barcelona
        David      | Parijs
      Published graph (each line is a “triple”):
        ex:Christophe  rdf:type    ex:Acquaintance
        ex:Peter       rdf:type    ex:Acquaintance
        ex:David       rdf:type    ex:Acquaintance
        ex:Christophe  ex:worksIn  dbpedia:Amsterdam
        ex:Peter       ex:worksIn  dbpedia:Barcelona
        ex:David       ex:worksIn  dbpedia:Paris
      Use of compact URIs:
        dbpedia = http://dbpedia.org/resource/
        ex = http://example.org/
        rdf = http://www.w3.org/1999/02/22-rdf-syntax-ns#
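The compact URIs on this slide can be expanded mechanically. The following is a minimal sketch, assuming triples are held as plain Python tuples; the prefix table is the one given on the slide, while the names `PREFIXES`, `expand` and `FRANK` are illustrative and not part of the talk.

```python
# Prefix table copied from the slide; the helper names are illustrative only.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "ex": "http://example.org/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(curie: str) -> str:
    """Expand a compact URI such as 'ex:Peter' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


# Frank's data from the slide, one (subject, predicate, object) tuple per triple.
FRANK = [
    ("ex:Christophe", "rdf:type", "ex:Acquaintance"),
    ("ex:Peter", "rdf:type", "ex:Acquaintance"),
    ("ex:David", "rdf:type", "ex:Acquaintance"),
    ("ex:Christophe", "ex:worksIn", "dbpedia:Amsterdam"),
    ("ex:Peter", "ex:worksIn", "dbpedia:Barcelona"),
    ("ex:David", "ex:worksIn", "dbpedia:Paris"),
]

# Every term of every triple becomes a globally unambiguous identifier.
expanded = [tuple(expand(term) for term in triple) for triple in FRANK]
```

Once expanded, "Paris" and "Parijs" are no longer separate strings to reconcile: both tables would point at the same URI.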
  • 8. Christophe re-uses part of Frank's data to publish his data
      Source table:
        Ville     | Pays
        Barcelone | Espagne
        Paris     | France
        Amsterdam | Pays-Bas
      Triples added to Frank's graph:
        dbpedia:Amsterdam  ex:isIn  dbpedia:Netherlands
        dbpedia:Barcelona  ex:isIn  dbpedia:Spain
        dbpedia:Paris      ex:isIn  dbpedia:France
  • 9. Roi adds some more information
      Triples added to the graph:
        ex:Acquaintance      rdfs:label  “Conocido”@es
        dbpedia:Netherlands  ex:isIn     dbpedia:Europe
        dbpedia:Spain        ex:isIn     dbpedia:Europe
        dbpedia:France       ex:isIn     dbpedia:Europe
  • 10. dbpedia:Amsterdam
  • 11. Reasoning with Semantics
      Bonus!
        dbpedia:Amsterdam    ex:isIn  dbpedia:Netherlands
        dbpedia:Netherlands  ex:isIn  dbpedia:Europe
      + ex:isIn  rdf:type  owl:TransitiveProperty
      = dbpedia:Amsterdam    ex:isIn  dbpedia:Europe
      Example usage:
      - Materialize implicit information
      - Check for consistency
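Materializing the implicit fact on this slide amounts to computing a transitive closure. The following is a minimal sketch, assuming triples are plain tuples; `transitive_closure` is an illustrative name, and this naive fixed-point loop is only meant to show the idea, not how an OWL reasoner is implemented.

```python
def transitive_closure(triples, predicate):
    """Return the triples plus every fact implied by `predicate` being transitive."""
    facts = set(triples)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        links = [(s, o) for (s, p, o) in facts if p == predicate]
        for s, mid in links:
            for mid2, o in links:
                if mid == mid2 and (s, predicate, o) not in facts:
                    facts.add((s, predicate, o))
                    changed = True
    return facts


# The two explicit facts from the slide.
DATA = {
    ("dbpedia:Amsterdam", "ex:isIn", "dbpedia:Netherlands"),
    ("dbpedia:Netherlands", "ex:isIn", "dbpedia:Europe"),
}

# Only the newly derived facts remain after subtracting the input.
inferred = transitive_closure(DATA, "ex:isIn") - DATA
```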
  • 12. Rough estimate of size
      295 data sets, 31B facts in LOD Cloud
      Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  • 13. Lots of Data to analyze! :-)
      Image: http://www.flickr.com/photos/argonne/3323018571
  • 14. But analyzing what exactly?
      - Table of facts published at different locations
      - A distributed Knowledge Base
      Location 1:
        Subject        | Predicate  | Object
        ex:Christophe  | rdf:type   | ex:Acquaintance
        ex:Christophe  | ex:worksIn | dbpedia:Amsterdam
        ex:Peter       | rdf:type   | ex:Acquaintance
        ...            | ...        | ...
      Location 2:
        Subject              | Predicate | Object
        dbpedia:Amsterdam    | ex:isIn   | dbpedia:Netherlands
        dbpedia:Netherlands  | ex:isIn   | dbpedia:Europe
        ...                  | ...       | ...
      Location 3:
        Subject          | Predicate  | Object
        ex:Acquaintance  | rdfs:label | “Conocido”@es
        ...              | ...        | ...
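Because all three tables share the same identifiers, combining them is a plain set union, and a question can be answered by following links across sources. The following is a minimal sketch using only the rows visible on the slide; the names `SOURCE_A`, `SOURCE_B`, `SOURCE_C`, `merge` and `match` are illustrative assumptions.

```python
# Rows visible on the slide; each list stands for one publishing location.
SOURCE_A = [
    ("ex:Christophe", "rdf:type", "ex:Acquaintance"),
    ("ex:Christophe", "ex:worksIn", "dbpedia:Amsterdam"),
    ("ex:Peter", "rdf:type", "ex:Acquaintance"),
]
SOURCE_B = [
    ("dbpedia:Amsterdam", "ex:isIn", "dbpedia:Netherlands"),
    ("dbpedia:Netherlands", "ex:isIn", "dbpedia:Europe"),
]
# rdfs:label is the RDF Schema property for human-readable labels.
SOURCE_C = [("ex:Acquaintance", "rdfs:label", '"Conocido"@es')]


def merge(*sources):
    """Shared identifiers make integration a simple union of triples."""
    return set().union(*sources)


def match(kb, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in kb
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


kb = merge(SOURCE_A, SOURCE_B, SOURCE_C)

# "In which country does Christophe work?" needs facts from two locations.
city = match(kb, s="ex:Christophe", p="ex:worksIn")[0][2]
country = match(kb, s=city, p="ex:isIn")[0][2]
```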
  • 15. Analysis workflow
      1. Gather a snapshot of triples
      2. Compute descriptive statistics
         - Top resources (subject, predicate, object)
         - Frequency of cross-link types (SP, SO, PO, ...)
         - Connected components
         - Path frequency
         - …
      => Tricky enough, since the data is really big!
      => We should be able to get more out of the data
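Two of the statistics listed in this workflow, top resources per position and connected components, can be sketched on a toy snapshot. This is a minimal in-memory illustration, assuming the snapshot fits in a Python list (which real crawls do not); the function names and the choice to treat every triple as an undirected subject-object edge are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Toy snapshot: triples taken from the earlier example slides.
SNAPSHOT = [
    ("ex:Christophe", "rdf:type", "ex:Acquaintance"),
    ("ex:Christophe", "ex:worksIn", "dbpedia:Amsterdam"),
    ("ex:Peter", "rdf:type", "ex:Acquaintance"),
    ("dbpedia:Amsterdam", "ex:isIn", "dbpedia:Netherlands"),
    ("dbpedia:Netherlands", "ex:isIn", "dbpedia:Europe"),
]


def top_resources(triples, position, n=3):
    """Most frequent terms at a position (0 = subject, 1 = predicate, 2 = object)."""
    return Counter(t[position] for t in triples).most_common(n)


def connected_components(triples):
    """Components of the undirected graph linking each subject to its object."""
    neighbours = defaultdict(set)
    for s, _, o in triples:
        neighbours[s].add(o)
        neighbours[o].add(s)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal from an unvisited node
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


predicate_counts = dict(top_resources(SNAPSHOT, position=1))
components = connected_components(SNAPSHOT)
```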
  • 16. Can we explain that?
      Suggestions:
      - Started the graph
      - General knowledge
      - Very well known
  • 17. or that?
      Suggestions:
      - All published by Bio2RDF
      - Well aware of each other
      - Overlapping domain
  • 18. Could we predict the impact of ...
      - DBpedia being down for a while?
      - SIOC renaming “User” into “UserAccount”?
      - creating a dataset that turns out to be popular?
      Analysing a set of triples is not enough
  • 19. Are we overlooking something?
  • 20. It's not only about the resources
      Several entities related to the data:
      - Data publishers/consumers
      - Resources
      - Web servers
      Interactions between all of them
  • 21. There are different scales
      - Triples level versus Resource groups level
      - Different data complexity at each scale
      Figure: the example graph built up on slides 7 to 9
  • 22. It is not a static network
      Size and topology evolve over time (snapshots shown: 2007, 2008, 2010)
  • 23. Linked Data is a Complex System
      - Multiple scales of observation
      - Emergence of properties
      - The whole is more than the sum of the parts
      => Interactions/relations are important to understand the system behavior
      => We can benefit from a large body of research results in Complex Systems study
  • 24. Initial findings and future work
      Image: Ya3hs3/2531493704 on Flickr
  • 25. New analysis workflow
      1. Gather a snapshot of triples
      2. Gather information about other types of interactions
      3. Create specific networks related to the research questions at hand
      4. Run metrics, interpret results
  • 26. The LOD is not what we think it is
      LOD Cloud 2009/2010 vs BTC 2009 crawl:
      - Crawled sample differs from the community based view
      - LOD Cloud has lumpy structure
      Evolution of LOD Cloud: centrality changes, increased density and connectivity
      Christophe Guéret, Shenghui Wang, Paul Groth et al. (2011). Multi-scale Analysis of the Web Of Data: A Challenge to the Complex System's Community. Advances in Complex Systems 14 (04)
  • 28. The tools we need don't exist
      We need to flatten the networks to study them
      Some specific aspects of the system:
      - Existence of implicit links
      - Multi-relational and dynamic
      - Distributed
      - Hypergraph of relations
      Christophe Guéret, Shenghui Wang, Paul Groth et al. (2011). Multi-scale Analysis of the Web Of Data: A Challenge to the Complex System's Community. Advances in Complex Systems 14 (04)
  • 29. Influence content <-> social networks
      - Generate and bind two networks (example nodes: ex:a, ex:b, ex:c)
      - Measure evolution of degree, betweenness, clustering over time
      - Predict evolution
      Shenghui Wang, Paul Groth (2010). Measuring the dynamic bi-directional influence between content and social networks. Proceedings of the 9th International Semantic Web Conference (ISWC2010)
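Tracking degree and clustering over time can be illustrated on two tiny snapshots. The following is a sketch only: the node names ex:a, ex:b and ex:c come from the slide, but the edges, the snapshot labels `t0` and `t1`, and the function names are hypothetical and do not reproduce the data or method of the cited paper.

```python
from itertools import combinations

# Hypothetical snapshots of one undirected network at two points in time.
SNAPSHOTS = {
    "t0": [("ex:a", "ex:b"), ("ex:b", "ex:c")],
    "t1": [("ex:a", "ex:b"), ("ex:b", "ex:c"), ("ex:a", "ex:c")],
}


def adjacency(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
    return linked / (len(nbrs) * (len(nbrs) - 1) / 2)


def summarise(edges):
    """Mean degree and mean local clustering coefficient of one snapshot."""
    adj = adjacency(edges)
    n = len(adj)
    return {
        "mean_degree": sum(len(nbrs) for nbrs in adj.values()) / n,
        "mean_clustering": sum(local_clustering(adj, x) for x in adj) / n,
    }


# One summary per snapshot gives a time series for each metric.
evolution = {label: summarise(edges) for label, edges in SNAPSHOTS.items()}
```

Closing the triangle between the two snapshots raises both metrics, which is the kind of change such a time series is meant to expose.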
  • 30. Result for conferences
      Shenghui Wang, Paul Groth (2010). Measuring the dynamic bi-directional influence between content and social networks. Proceedings of the 9th International Semantic Web Conference (ISWC2010)
  • 31. Centrality to measure robustness
      Map the BTC2010 to two networks:
      - Semantic network based on namespaces
      - Host network based on hostnames
      Measure robustness as the variance in betweenness centrality:
      - Find weak spots
      - Optimize networks to increase robustness
      Christophe Guéret, Paul Groth, Frank Van Harmelen et al. (2010). Finding the Achilles Heel of the Web of Data: using network analysis for link-recommendation. Proceedings of the 9th International Semantic Web Conference (ISWC2010)
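The host-network idea can be sketched in a few lines: collapse triples onto hostnames, compute betweenness centrality, and take its variance. This is an illustration under stated assumptions, not the implementation from the cited paper: the URIs are hypothetical, the betweenness routine is Brandes' algorithm for unweighted undirected graphs, and the population variance is one possible reading of "variance in betweenness centrality".

```python
from collections import defaultdict, deque
from statistics import pvariance
from urllib.parse import urlparse

# Hypothetical triples: three hosts that all link to one hub host.
TRIPLES = [
    ("http://a.example.org/x", "http://a.example.org/p", "http://hub.example.org/1"),
    ("http://b.example.org/y", "http://b.example.org/p", "http://hub.example.org/2"),
    ("http://c.example.org/z", "http://c.example.org/p", "http://hub.example.org/3"),
]


def host_network(triples):
    """Undirected edges between the hostnames of subject and object URIs."""
    edges = set()
    for s, _, o in triples:
        hs, ho = urlparse(s).hostname, urlparse(o).hostname
        if hs and ho and hs != ho:
            edges.add(frozenset((hs, ho)))
    return edges


def betweenness(edges):
    """Brandes' algorithm, unweighted and undirected, without normalisation."""
    adj = defaultdict(set)
    for edge in edges:
        u, v = tuple(edge)
        adj[u].add(v)
        adj[v].add(u)
    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        stack, preds = [], defaultdict(list)
        sigma = dict.fromkeys(adj, 0.0)  # number of shortest paths from source
        sigma[source] = 1.0
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies from the farthest nodes back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {node: value / 2 for node, value in score.items()}


scores = betweenness(host_network(TRIPLES))
spread = pvariance(scores.values())
```

In this star-shaped example every shortest path runs through the hub, so the hub carries all the betweenness and the variance is high; it is the kind of weak spot the slide refers to.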
  • 32. Results on hostnames
      Christophe Guéret, Paul Groth, Frank Van Harmelen et al. (2010). Finding the Achilles Heel of the Web of Data: using network analysis for link-recommendation. Proceedings of the 9th International Semantic Web Conference (ISWC2010)
  • 33. Results on namespaces
      Christophe Guéret, Paul Groth, Frank Van Harmelen et al. (2010). Finding the Achilles Heel of the Web of Data: using network analysis for link-recommendation. Proceedings of the 9th International Semantic Web Conference (ISWC2010)
  • 34. Improving the network
      Christophe Guéret, Paul Groth, Frank Van Harmelen et al. (2010). Finding the Achilles Heel of the Web of Data: using network analysis for link-recommendation. Proceedings of the 9th International Semantic Web Conference (ISWC2010)
  • 35. Conclusion
      Take home message:
      - Linked Data is not a simple knowledge base
      - Network analysis tools give new insights on the data
      - Results can be used to improve the network
      Future work:
      - Make resource-centric analysis rather than graph-centric analysis (big bottleneck now)
      - Tackle the time aspect of the data
      - Find more analyses to perform and what they tell us