Institute for Web Science & Technologies – WeST
A Systematic Investigation of Explicit and Implicit Schema Information on the Linked Open Data Cloud
Thomas Gottron, Malte Knauf, Stefan Scheglmann, Ansgar Scherp
ESWC 2013, Montpellier
Thomas Gottron, ESWC, 30.5.2013: Schema Information on the LOD Cloud
Explicit and Implicit Schema Information on LOD
Explicit: assigning class types (Entity, rdf:type, Class)
Implicit: modelling attributes (Entity, Property, Entity 2)
[Figure: example entity E1 with rdf:type foaf:Document and swrc:InProceedings, dc:title "Bad News ...", and dc:creator E2]
Main questions
a) How much information is encoded in the type set or property set of a resource?
b) How much information is still contained in the properties once we know the types of a resource?
c) How much information is still contained in the types once we know the properties of a resource?
d) To what degree can one kind of information (either properties or types) explain the other?
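The four questions map onto standard information-theoretic quantities. The definitions below are a sketch in the notation the slides use later (T for type sets, R for property sets). The normalization constants are assumptions: |TS| stands for the number of distinct type sets, and dividing by the smaller marginal entropy in d) is a choice that is consistent with the values reported on the results slides (where the column is labelled H0(T,R)).

```latex
\begin{align*}
\text{a)}\quad & H(T) = -\sum_{t} P(t)\,\log_2 P(t),
  \qquad H_0(T) = \frac{H(T)}{\log_2 |TS|} \\
\text{b)}\quad & H(R \mid T) = \sum_{t} P(t)\, H(R \mid T = t)
  = -\sum_{t}\sum_{r} P(t,r)\,\log_2 \frac{P(t,r)}{P(t)} \\
\text{c)}\quad & H(T \mid R) = -\sum_{r}\sum_{t} P(t,r)\,\log_2 \frac{P(t,r)}{P(r)} \\
\text{d)}\quad & I(T,R) = \sum_{t}\sum_{r} P(t,r)\,\log_2 \frac{P(t,r)}{P(t)\,P(r)},
  \qquad I_0(T,R) = \frac{I(T,R)}{\min\{H(T),\,H(R)\}}
\end{align*}
```

H(R) and H_0(R) are defined analogously to a), with the number of distinct property sets |PS| in the denominator.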
Outline
1. Questions a)-d)
2. Model schema
3. Information measures
4. Apply on LOD
5. Answers!
Probabilistic model of schema information
Joint distribution P(T,R) of type sets t (e.g. foaf:Document, swrc:InProceedings) and property sets r (e.g. dc:creator, dc:title). The last column and the last row are the marginal distributions P(T) and P(R).

P(T,R)   r1    r2    r3    r4    P(T)
t1       14%   2%    5%    8%    29%
t2       5%    15%   2%    3%    25%
t3       7%    3%    30%   5%    45%
P(R)     26%   20%   37%   17%
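The marginal distributions are the row and column sums of the joint table. A minimal sketch, using the example percentages from the slide. The function name is an assumption made for illustration. Because the slide prints rounded percentages, a recomputed marginal can differ from the printed one by a point (column r4 sums to 16, the slide shows 17).

```python
# Joint distribution P(T,R) from the example table, in percent.
# Rows are type sets t1..t3, columns are property sets r1..r4.
JOINT_PERCENT = [
    [14, 2, 5, 8],
    [5, 15, 2, 3],
    [7, 3, 30, 5],
]


def marginals(joint):
    """Return (row sums, column sums) of a joint table."""
    row_sums = [sum(row) for row in joint]
    col_sums = [sum(col) for col in zip(*joint)]
    return row_sums, col_sums


p_t, p_r = marginals(JOINT_PERCENT)
print("P(T) in %:", p_t)
print("P(R) in %:", p_r)
```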
Estimating probabilities
TS = number of distinct type sets, PS = number of distinct property sets.

Data set   Triples   TS          PS
Rest       22.3M     793         7,522
Datahub    910.1M    28,924      14,712
DBpedia    198.1M    1,026,272   391,170
Freebase   101.2M    69,732      162,023
Timbl      204.8M    4,139       9,619
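The slide does not spell out the estimator. One plausible reading, sketched here as an assumption, is a relative-frequency estimate: count how many resources share each (type set, property set) combination and divide by the number of resources. The function name and the toy resources are hypothetical and not taken from the data sets above.

```python
from collections import Counter


def estimate_joint(observations):
    """Estimate P(T,R) by relative frequency.

    `observations` is an iterable of (type_set, property_set) pairs,
    one pair per resource; each set is a frozenset of URIs.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical toy data: four resources.
doc = frozenset({"foaf:Document", "swrc:InProceedings"})
person = frozenset({"foaf:Person"})
authored = frozenset({"dc:creator", "dc:title"})
named = frozenset({"foaf:name"})

resources = [(doc, authored), (doc, authored), (doc, named), (person, named)]
joint = estimate_joint(resources)
print(joint[(doc, authored)])
```

Using frozensets as keys treats a whole set of types (or properties) as one outcome, which matches the slides' notion of type sets and property sets.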
Question a)
P(R)   r1    r2    r3    r4
       1%    97%   1%    1%
       26%   20%   37%   17%
       25%   25%   25%   25%
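The three example distributions differ in how evenly the probability mass is spread, which is what entropy measures. A small sketch computing the Shannon entropy of each. The labels "skewed", "mixed" and "uniform" are added here for readability and do not appear on the slide.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


examples = {
    "skewed": [0.01, 0.97, 0.01, 0.01],
    "mixed": [0.26, 0.20, 0.37, 0.17],
    "uniform": [0.25, 0.25, 0.25, 0.25],
}

for name, dist in examples.items():
    h = entropy(dist)
    # Dividing by log2(number of outcomes) scales the value into [0, 1].
    print(f"{name}: H = {h:.3f} bits, normalized = {h / math.log2(len(dist)):.3f}")
```

The uniform case reaches the maximum of log2(4) = 2 bits. The skewed case stays far below it, and the mixed case lies in between.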
a) Normalized marginal entropy
- Tendencies:
  - Entropy of property sets is higher (except on Datahub)
  - No very high values
  - No values close to zero

Data set   H0(T)       H0(R)
Rest       0.252   <   0.366
Datahub    0.263   >   0.250
DBpedia    0.093   <   0.324
Freebase   0.127   <   0.166
Timbl      0.214   <   0.276
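The slides do not state how the entropy is normalized. A sketch under the assumption that H0 divides the entropy by its maximum, log2 of the number of distinct outcomes. The unnormalized entropies H(T), H(R) and the TS/PS counts are the ones reported elsewhere in these slides, and under this assumption the results agree with the table to three decimals.

```python
import math

# (H(T), number of type sets, H(R), number of property sets) as reported in the slides.
REPORTED = {
    "Rest": (2.428, 793, 4.708, 7522),
    "Datahub": (3.904, 28924, 3.460, 14712),
    "DBpedia": (1.856, 1026272, 6.027, 391170),
    "Freebase": (2.037, 69732, 2.868, 162023),
    "Timbl": (2.568, 4139, 3.646, 9619),
}


def normalized_entropy(h_bits, n_outcomes):
    """Assumed normalization: divide by the maximum entropy log2(n)."""
    return h_bits / math.log2(n_outcomes)


h0 = {
    name: (normalized_entropy(h_t, n_ts), normalized_entropy(h_r, n_ps))
    for name, (h_t, n_ts, h_r, n_ps) in REPORTED.items()
}
for name, (h0_t, h0_r) in h0.items():
    print(f"{name:9s} H0(T) = {h0_t:.3f}  H0(R) = {h0_r:.3f}")
```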
Question b)-c)
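b)-c) Expected conditional entropy
P(T,R)   r1    r2    r3    r4    P(T)
t1       14%   2%    5%    8%    29%
t2       5%    15%   2%    3%    25%
t3       7%    3%    30%   5%    45%
P(R)     26%   20%   37%   17%

The values shown next to the table appear to be the conditional entropy of R for each type set and its contribution weighted by P(t):

         H(R|T=t)   P(t) · H(R|T=t)
t1       1.726      0.501
t2       1.610      0.403
t3       1.441      0.648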
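The expected conditional entropy weights the entropy of each row's conditional distribution by the probability of that row. A minimal sketch on the example table above. The helper names are assumptions, and because the slide's percentages are rounded (they sum to 99), the table is renormalized first, so the results may differ slightly from the figures on the slide.

```python
import math

# Example joint table P(T,R) in percent (rows t1..t3, columns r1..r4).
JOINT_PERCENT = [
    [14, 2, 5, 8],
    [5, 15, 2, 3],
    [7, 3, 30, 5],
]


def entropy(probs):
    """Shannon entropy in bits, skipping zero entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def normalize(table):
    """Rescale a table so that all cells sum to 1."""
    total = sum(sum(row) for row in table)
    return [[cell / total for cell in row] for row in table]


def conditional_entropy_rows(joint):
    """H(columns | rows) = sum over rows of P(row) * H(columns | row)."""
    result = 0.0
    for row in joint:
        p_row = sum(row)
        if p_row > 0:
            result += p_row * entropy([cell / p_row for cell in row])
    return result


joint = normalize(JOINT_PERCENT)
transposed = [list(col) for col in zip(*joint)]

h_r_given_t = conditional_entropy_rows(joint)       # H(R|T)
h_t_given_r = conditional_entropy_rows(transposed)  # H(T|R)
print(f"H(R|T) = {h_r_given_t:.3f} bits, H(T|R) = {h_t_given_r:.3f} bits")
```

Transposing the table swaps the roles of types and properties, so the same function answers both question b) and question c).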
b)-c) Expected conditional entropy
P(T,R)   r1    r2    r3    r4    P(T)
t1       22%   1%    0%    1%    24%
t2       1%    23%   1%    0%    26%
t3       1%    0%    48%   1%    50%
P(R)     24%   24%   49%   2%
b)-c) Expected conditional entropy
- Tendencies:
  - The types of a resource tell little about its properties
  - Properties tell more about types
  - Given information reduces the entropy

Data set   H(T|R)       H(R|T)   H(T)    H(R)
Rest       0.289   <    2.568    2.428   4.708
Datahub    1.319   >    0.876    3.904   3.460
DBpedia    0.688   <    4.856    1.856   6.027
Freebase   0.286   <    1.117    2.037   2.868
Timbl      0.386   <    1.464    2.568   3.646
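One way to read the tendencies off the table above is to ask what share of the marginal entropy is still left after conditioning. A short sketch using the reported values. The function name and the "share left" framing are illustrative additions, not part of the slides.

```python
# Entropies in bits as reported in the table above:
# (H(T|R), H(R|T), H(T), H(R)) per data set.
REPORTED = {
    "Rest": (0.289, 2.568, 2.428, 4.708),
    "Datahub": (1.319, 0.876, 3.904, 3.460),
    "DBpedia": (0.688, 4.856, 1.856, 6.027),
    "Freebase": (0.286, 1.117, 2.037, 2.868),
    "Timbl": (0.386, 1.464, 2.568, 3.646),
}


def remaining_fraction(conditional, marginal):
    """Share of the marginal entropy that is left after conditioning."""
    return conditional / marginal


fractions = {}
for name, (h_t_r, h_r_t, h_t, h_r) in REPORTED.items():
    types_left = remaining_fraction(h_t_r, h_t)   # types, once properties are known
    props_left = remaining_fraction(h_r_t, h_r)   # properties, once types are known
    fractions[name] = (types_left, props_left)
    print(f"{name:9s} types left: {types_left:.0%}  properties left: {props_left:.0%}")
```

On every data set except Datahub, a smaller share of the type entropy remains than of the property entropy, which is the sense in which properties tell more about types than vice versa.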
Question d)
d) Normalized Mutual Information (Redundancy)
P(T,R)   r1    r2    r3    r4    P(T)
t1       14%   2%    5%    8%    29%
t2       5%    15%   2%    3%    25%
t3       7%    3%    30%   5%    45%
P(R)     26%   20%   37%   17%
d) Normalized Mutual Information (Redundancy)
P(T,R)   r1    r2    r3    r4    P(T)
t1       22%   1%    0%    1%    24%
t2       1%    23%   1%    0%    26%
t3       1%    0%    48%   1%    50%
P(R)     24%   24%   49%   2%
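The two example tables can be compared by their normalized mutual information. The sketch below computes I(T,R) = H(T) + H(R) - H(T,R) and divides by the smaller marginal entropy. That normalization is an assumption, and the table names are labels added here. Both tables are renormalized because the rounded percentages do not sum to exactly 100.

```python
import math

# The two example joint tables from the slides, in percent.
TABLE_MIXED = [[14, 2, 5, 8], [5, 15, 2, 3], [7, 3, 30, 5]]
TABLE_ALIGNED = [[22, 1, 0, 1], [1, 23, 1, 0], [1, 0, 48, 1]]


def entropy(probs):
    """Shannon entropy in bits, skipping zero entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def redundancy(table):
    """Mutual information of rows and columns, normalized by min(H(T), H(R))."""
    total = sum(sum(row) for row in table)
    joint = [[cell / total for cell in row] for row in table]
    h_t = entropy([sum(row) for row in joint])
    h_r = entropy([sum(col) for col in zip(*joint)])
    h_tr = entropy([cell for row in joint for cell in row])
    mutual_information = h_t + h_r - h_tr
    return mutual_information / min(h_t, h_r)


print(f"mixed table:   {redundancy(TABLE_MIXED):.3f}")
print(f"aligned table: {redundancy(TABLE_ALIGNED):.3f}")
```

The second table, where each type set co-occurs almost exclusively with one property set, yields the clearly higher redundancy.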
d) Normalized Mutual Information (Redundancy)
- Tendencies:
  - Relatively high redundancy
  - Freebase: (weakly) pre-defined schema
  - Timbl: narrow domain (FOAF profiles)
  - DBpedia: decentralized schema

Data set   H0(T,R)   Rank
Rest       0.881     1
Datahub    0.747     4
DBpedia    0.635     5
Freebase   0.860     2
Timbl      0.850     3
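The redundancy values can be cross-checked against the entropies reported on the conditional-entropy slide. The sketch assumes that the mutual information is H(T) - H(T|R) and that it is normalized by min(H(T), H(R)). Under that assumption the results match the reported column to within about 0.01, which is plausible given that the inputs are rounded to three decimals.

```python
# (H(T|R), H(T), H(R), reported redundancy) per data set, from the slides.
REPORTED = {
    "Rest": (0.289, 2.428, 4.708, 0.881),
    "Datahub": (1.319, 3.904, 3.460, 0.747),
    "DBpedia": (0.688, 1.856, 6.027, 0.635),
    "Freebase": (0.286, 2.037, 2.868, 0.860),
    "Timbl": (0.386, 2.568, 3.646, 0.850),
}


def redundancy(h_t_given_r, h_t, h_r):
    """Assumed definition: I(T,R) / min(H(T), H(R)) with I(T,R) = H(T) - H(T|R)."""
    mutual_information = h_t - h_t_given_r
    return mutual_information / min(h_t, h_r)


computed = {
    name: redundancy(h_t_r, h_t, h_r)
    for name, (h_t_r, h_t, h_r, _) in REPORTED.items()
}
for name, value in computed.items():
    print(f"{name:9s} computed {value:.3f}  reported {REPORTED[name][3]:.3f}")
```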
Summary
- Present a method to analyze redundancy of schema information on LOD
- Observations
  - Schema not dominated by few TS or PS combinations
  - Attributes provide more information than types
  - Attributes indicate types better than vice versa
  - High redundancy of 63 to 88% on the analyzed segments of the LOD cloud
- Future Work
  - Analysis on data provider level
  - Temporal evolution
Thanks!
Contact:
Thomas Gottron
WeST – Institute for Web Science and Technologies
Universität Koblenz-Landau
gottron@uni-koblenz.de
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 

ESWC 2013: A Systematic Investigation of Explicit and Implicit Schema Information on the Linked Open Data Cloud

  • 1. Institute for Web Science & Technologies – WeST A Systematic Investigation of Explicit and Implicit Schema Information on the Linked Open Data Cloud Thomas Gottron, Malte Knauf, Stefan Scheglmann, Ansgar Scherp ESWC 2013, Montpellier
  • 2. Thomas Gottron, ESWC, 30.5.2013. Schema Information on the LOD Cloud. Explicit and Implicit Schema Information on LOD
       Explicit: assigning class types (Entity rdf:type Class), e.g. E1 rdf:type foaf:Document, swrc:InProceedings
       Implicit: modelling attributes (Entity Property Entity 2), e.g. E1 dc:creator E2, E1 dc:title "Bad News ..."
  • 3. Main questions
       a) How much information is encoded in the type set or property set of a resource?
       b) How much information is still contained in the properties, once we know the types of a resource?
       c) How much information is still contained in the types, once we know the properties of a resource?
       d) To what degree can one kind of information (either properties or types) explain the other?
  • 4. Outline: Questions a)-d), Model Schema, Information Measures, Apply on LOD, Answers!
  • 5. Probabilistic model of schema information
       Joint distribution P(T,R) over type sets T (e.g. foaf:Document, swrc:InProceedings) and property sets R (e.g. dc:creator, dc:title), with the marginal distributions P(T) and P(R):

       P(T,R)   r1    r2    r3    r4    P(T)
       t1       14%    2%    5%    8%   29%
       t2        5%   15%    2%    3%   25%
       t3        7%    3%   30%    5%   45%
       P(R)     26%   20%   37%   17%
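The marginal distributions in the table above are just row and column sums of the joint table. A minimal Python sketch, assuming the percentages can be treated as plain numbers (the names `joint_percent` and `marginals` are illustrative and not from the talk):

```python
# Joint distribution P(T, R) from the slide, in percent.
# Rows are type sets t1..t3, columns are property sets r1..r4.
joint_percent = [
    [14, 2, 5, 8],
    [5, 15, 2, 3],
    [7, 3, 30, 5],
]


def marginals(joint):
    """Return (row sums, column sums) of a joint table."""
    row_sums = [sum(row) for row in joint]
    col_sums = [sum(col) for col in zip(*joint)]
    return row_sums, col_sums


p_t, p_r = marginals(joint_percent)
print("P(T) in percent:", p_t)
print("P(R) in percent:", p_r)
```

The cells of the slide's table add up to 99, and the r4 column sums to 16 rather than the 17% printed on the slide, presumably because the slide shows rounded percentages.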
  • 6. Estimating probabilities

       Data set   Triples   TS (type sets)   PS (property sets)
       Rest        22.3M          793              7,522
       Datahub    910.1M       28,924             14,712
       Dbpedia    198.1M    1,026,272            391,170
       Freebase   101.2M       69,732            162,023
       Timbl      204.8M        4,139              9,619
  • 7. Question a)
       Three example distributions P(R):
       P(R):   1%   97%    1%    1%
       P(R):  26%   20%   37%   17%
       P(R):  25%   25%   25%   25%
  • 8. a) Normalized marginal entropy
       • Tendencies:
           • Entropy of property sets is higher
           • No very high values
           • No values close to zero

       Data set   H0(T)        H0(R)
       Rest       0.252   <    0.366
       Datahub    0.263   >    0.250
       Dbpedia    0.093   <    0.324
       Freebase   0.127   <    0.166
       Timbl      0.214   <    0.276
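To make the notion of normalized entropy concrete, the sketch below evaluates it on the three example distributions P(R) from Question a). It assumes that H0 is the Shannon entropy divided by its maximum possible value, log2 of the number of outcomes; the slides do not spell out the normalization, so treat that as an assumption. The function names are illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def normalized_entropy(probs):
    """Entropy divided by log2(number of outcomes).

    Assumption: this is the normalization behind H0 on the slides.
    """
    return entropy(probs) / math.log2(len(probs))


examples = {
    "skewed": [0.01, 0.97, 0.01, 0.01],
    "mixed": [0.26, 0.20, 0.37, 0.17],
    "uniform": [0.25, 0.25, 0.25, 0.25],
}

for name, dist in examples.items():
    print(f"{name}: H = {entropy(dist):.3f} bits, H0 = {normalized_entropy(dist):.3f}")
```

Under this normalization a uniform distribution scores 1 and a distribution dominated by a single outcome scores close to 0, which is the scale on which the table's values would be read.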
  • 9. Question b)-c)
  • 10. b)-c) Expected conditional entropy

        P(T,R)   r1    r2    r3    r4    P(T)   H(R|T=t)   P(t)·H(R|T=t)
        t1       14%    2%    5%    8%   29%    1.726      0.501
        t2        5%   15%    2%    3%   25%    1.610      0.403
        t3        7%    3%   30%    5%   45%    1.441      0.648
        P(R)     26%   20%   37%   17%

        The last two columns hold the per-row figures shown on the slide. Their labels are inferred: each value in the last column equals P(t) times the value before it.
  • 11. b)-c) Expected conditional entropy

        P(T,R)   r1    r2    r3    r4    P(T)
        t1       22%    1%    0%    1%   24%
        t2        1%   23%    1%    0%   26%
        t3        1%    0%   48%    1%   50%
        P(R)     24%   24%   49%    2%
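The expected conditional entropy for the two example tables can be sketched in Python as follows. Each row is renormalized before its entropy is taken, and the row entropies are then weighted by the row's share of the total mass. The table names `mixed` and `near_diagonal` and the helper names are illustrative. Because the slides show rounded percentages, the results need not match the slide's figures to the last digit.

```python
import math


def entropy(weights):
    """Shannon entropy in bits of the distribution obtained by normalizing weights."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def conditional_entropy(joint):
    """Expected entropy of the column variable given the row variable."""
    total = sum(sum(row) for row in joint)
    return sum((sum(row) / total) * entropy(row) for row in joint if sum(row) > 0)


def transpose(joint):
    return [list(col) for col in zip(*joint)]


# Joint tables from the two example slides, in percent (rows: t1..t3, columns: r1..r4).
mixed = [[14, 2, 5, 8], [5, 15, 2, 3], [7, 3, 30, 5]]
near_diagonal = [[22, 1, 0, 1], [1, 23, 1, 0], [1, 0, 48, 1]]

results = {}
for name, table in [("mixed", mixed), ("near_diagonal", near_diagonal)]:
    h_r_given_t = conditional_entropy(table)             # H(R|T)
    h_t_given_r = conditional_entropy(transpose(table))  # H(T|R)
    results[name] = (h_r_given_t, h_t_given_r)
    print(f"{name}: H(R|T) = {h_r_given_t:.3f}, H(T|R) = {h_t_given_r:.3f}")
```

In the near-diagonal table, knowing the type set almost determines the property set, so both conditional entropies come out much lower than in the mixed table.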
  • 12. b)-c) Expected conditional entropy
        • Tendencies:
            • The types of a resource tell little about its properties
            • Properties tell more about types
            • Given information reduces the entropy

        Data set   H(T|R)       H(R|T)   H(T)    H(R)
        Rest       0.289   <    2.568    2.428   4.708
        Datahub    1.319   >    0.876    3.904   3.460
        Dbpedia    0.688   <    4.856    1.856   6.027
        Freebase   0.286   <    1.117    2.037   2.868
        Timbl      0.386   <    1.464    2.568   3.646
  • 13. Question d)
  • 14. d) Normalized Mutual Information (Redundancy)

        P(T,R)   r1    r2    r3    r4    P(T)
        t1       14%    2%    5%    8%   29%
        t2        5%   15%    2%    3%   25%
        t3        7%    3%   30%    5%   45%
        P(R)     26%   20%   37%   17%
  • 15. d) Normalized Mutual Information (Redundancy)

        P(T,R)   r1    r2    r3    r4    P(T)
        t1       22%    1%    0%    1%   24%
        t2        1%   23%    1%    0%   26%
        t3        1%    0%   48%    1%   50%
        P(R)     24%   24%   49%    2%
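The redundancy of the two example tables can be sketched in Python as below. Mutual information is computed as H(T) + H(R) - H(T,R). The normalization by the smaller of the two marginal entropies is an assumption, since the slides do not state it. All names are illustrative.

```python
import math


def entropy(weights):
    """Shannon entropy in bits of the distribution obtained by normalizing weights."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def mutual_information(joint):
    """I(T;R) = H(T) + H(R) - H(T,R) for a joint table of non-negative weights."""
    rows = [sum(row) for row in joint]
    cols = [sum(col) for col in zip(*joint)]
    cells = [w for row in joint for w in row]
    return entropy(rows) + entropy(cols) - entropy(cells)


def redundancy(joint):
    """Mutual information divided by min(H(T), H(R)).

    Assumption: this is the normalization meant by 'normalized mutual information'.
    """
    rows = [sum(row) for row in joint]
    cols = [sum(col) for col in zip(*joint)]
    return mutual_information(joint) / min(entropy(rows), entropy(cols))


mixed = [[14, 2, 5, 8], [5, 15, 2, 3], [7, 3, 30, 5]]
near_diagonal = [[22, 1, 0, 1], [1, 23, 1, 0], [1, 0, 48, 1]]

print(f"mixed table:         redundancy = {redundancy(mixed):.3f}")
print(f"near-diagonal table: redundancy = {redundancy(near_diagonal):.3f}")
```

With this normalization the value lies between 0 (types and properties are independent) and 1 (one fully determines the other), and the near-diagonal table scores clearly higher than the mixed one.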
  • 16. d) Normalized Mutual Information (Redundancy)
        • Tendencies:
            • Relatively high redundancy
            • Freebase: (weakly) pre-defined schema
            • Timbl: narrow domain (FOAF profiles)
            • DBpedia: de-centralized schema

        Data set   H0(T,R)   Rank
        Rest       0.881     1
        Datahub    0.747     4
        Dbpedia    0.635     5
        Freebase   0.860     2
        Timbl      0.850     3
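The redundancy values in this table can be cross-checked against the entropies reported for questions b)-c). The Python sketch below assumes that the mutual information is H(T) - H(T|R) and that it is normalized by min(H(T), H(R)). The slides do not state the normalization, so this is an assumption, but it reproduces the reported values to within rounding. The dictionary simply restates numbers from the slides.

```python
# data set: (H(T), H(R), H(T|R), H(R|T), redundancy as reported on the slide)
reported = {
    "Rest": (2.428, 4.708, 0.289, 2.568, 0.881),
    "Datahub": (3.904, 3.460, 1.319, 0.876, 0.747),
    "Dbpedia": (1.856, 6.027, 0.688, 4.856, 0.635),
    "Freebase": (2.037, 2.868, 0.286, 1.117, 0.860),
    "Timbl": (2.568, 3.646, 0.386, 1.464, 0.850),
}

results = {}
for name, (h_t, h_r, h_t_given_r, h_r_given_t, slide_value) in reported.items():
    # Mutual information can be derived from either side; both should agree.
    mi_from_types = h_t - h_t_given_r
    mi_from_props = h_r - h_r_given_t
    # Assumed normalization: divide by the smaller marginal entropy.
    redundancy = mi_from_types / min(h_t, h_r)
    results[name] = (mi_from_types, mi_from_props, redundancy, slide_value)
    print(f"{name}: I = {mi_from_types:.3f} / {mi_from_props:.3f}, "
          f"redundancy = {redundancy:.3f} (slide: {slide_value})")
```

The two ways of deriving the mutual information agree to within rounding of the published figures, which is a useful sanity check on the reconstructed tables.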
  • 17. Summary
        • Present a method to analyze redundancy of schema information on LOD
        • Observations
            • Schema not dominated by few TS or PS combinations
            • Attributes provide more information than types
            • Attributes indicate types better than vice versa
            • High redundancy of 63 to 88% on the analyzed segments of the LOD cloud
        • Future Work
            • Analysis on data provider level
            • Temporal evolution
  • 18. Thanks! Contact: Thomas Gottron, WeST – Institute for Web Science and Technologies, Universität Koblenz-Landau, gottron@uni-koblenz.de
  • 19. References
        1. M. Konrath, T. Gottron, S. Staab, and A. Scherp, "Schemex—efficient construction of a data catalogue by stream-based indexing of linked data," Journal of Web Semantics, 2012.
        2. T. Gottron and R. Pickhardt, "A detailed analysis of the quality of stream-based schema construction on linked open data," in CSWS'12: Proceedings of the Chinese Semantic Web Symposium, 2012. To appear.
        3. T. Gottron, A. Scherp, B. Krayer, and A. Peters, "LODatio: Using a Schema-Based Index to Support Users in Finding Relevant Sources of Linked Data," in K-CAP'13: Proceedings of the Conference on Knowledge Capture, 2013.