Max Schmachtenberg 
Christian Bizer 
Heiko Paulheim 
Adoption of the Linked Data Best Practices 
in Different Topical Domains 
Schmachtenberg, Bizer, Paulheim: Adoption of the Linked Data Best Practices, 23.10.2014 Slide 1
The Linked Data Best Practices 
Central idea of Linked Data: Ease data discovery and
integration by complying with a set of best practices.
1. Linking Best Practices 
• Set RDF links pointing at instances in other data sources. 
2. Vocabulary Best Practices 
• Reuse terms from widely-used vocabularies. 
• Make definitions of proprietary terms dereferencable. 
• Link vocabulary terms to terms in other vocabularies. 
3. Metadata Best Practices 
• Publish machine-readable provenance and licensing metadata. 
• Publish metadata about alternative access methods (SPARQL, dumps) 
State of the LOD Cloud Report - 2011 
▪ http://lod-cloud.net/state/
▪ Based on information provided by dataset publishers via the datahub.io catalog
LOD Cloud - 2011 
Consists of 
295 datasets. 
Outline 
Goal: Update the State of the LOD Cloud report 
and LOD Cloud itself to 2014. 
1. Methodology 
2. Adoption of the Linking Best Practices 
3. Adoption of the Vocabulary Best Practices 
4. Adoption of the Metadata Best Practices 
5. Conclusions (in Relation to Schema.org) 
1. Methodology: Crawl of the Linked Data Web 
▪ Crawler: LDSpider, Crawl Date: April 2014
▪ Seeds: 560,000 seed URIs from
1. Example URIs in datahub.io catalog
2. URIs from BTC2012 dataset
3. URIs from datasets advertised on public-lod@w3.org mailing list
▪ Crawled Data Corpus
• 900,000 documents containing
• 8,038,000 resources
• 1014 datasets
• 77 datasets prevent crawling via robots.txt
• Distribution by dataset 
• Red line: documents 
• Blue line: resources 
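The robots.txt exclusion mentioned in the corpus statistics can be illustrated with Python's standard `urllib.robotparser`. This is only a sketch of the check a polite crawler performs before fetching a URI. The robots.txt body, the user agent name `example-crawler`, and the URLs are hypothetical and are not taken from the LDSpider configuration used for the crawl.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real crawler would fetch it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /data/
"""

def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Example URLs are assumptions for illustration only.
blocked = may_crawl(ROBOTS_TXT, "example-crawler", "http://example.org/data/resource1")
allowed = may_crawl(ROBOTS_TXT, "example-crawler", "http://example.org/about")
```

A dataset whose Linked Data URIs all fall under a disallowed path would be counted as preventing crawling in this sense.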
Categorization by Topical Domain 
▪ Used categorization from datahub.io for existing datasets.
▪ Manually categorized remaining datasets.
▪ Added new category Social Networking
▪ Growth without new category Social Networking: 94 %
▪ LODstats (http://stats.lod2.eu/) discovered similar number of datasets: 1048
2. Adoption of the Linking Best Practices 
Data publishers should set RDF links because:
1. Discoverability depends on being linked. 
2. RDF links ease data integration. 
Degrees 
▪ 56% of all datasets set RDF links pointing to other datasets.
• The remaining 44% are either only the target of RDF links from other datasets or are isolated.
▪ Datasets with Top In- and Outdegrees:
▪ Most widely used linking predicates: owl:sameAs, rdfs:seeAlso, foaf:knows
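How in- and outdegrees and the share of linking datasets can be derived from dataset-level links is sketched below. The dataset names and links are hypothetical, and counting each linked dataset once per source (ignoring self-links) is an assumption about the aggregation, not a statement of the method used in the report.

```python
from collections import defaultdict

# Hypothetical dataset-level links: (source dataset, target dataset).
links = [
    ("dataset-a", "dataset-b"),
    ("dataset-a", "dataset-c"),
    ("dataset-b", "dataset-c"),
]
datasets = {"dataset-a", "dataset-b", "dataset-c", "dataset-d"}

def degrees(datasets, links):
    """Count distinct linked datasets per dataset, ignoring self-links."""
    out_targets = defaultdict(set)
    in_sources = defaultdict(set)
    for source, target in links:
        if source != target:
            out_targets[source].add(target)
            in_sources[target].add(source)
    outdegree = {d: len(out_targets[d]) for d in datasets}
    indegree = {d: len(in_sources[d]) for d in datasets}
    return indegree, outdegree

indegree, outdegree = degrees(datasets, links)

# Share of datasets that set at least one outgoing RDF link.
linking_share = sum(1 for d in datasets if outdegree[d] > 0) / len(datasets)
```

In this toy graph `dataset-c` is only a link target and `dataset-d` is isolated, which mirrors the two cases named for the remaining 44%.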
“Crawlable” LOD Cloud 2014 
Degree Distributions 
▪ Dotted line: Social Networking (status.net, etc.)
▪ Solid line: Cross-Domain datasets (DBpedia, etc.)
▪ Largest Strongly Connected Component: 36% (377 datasets)
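A strongly connected component is a set of datasets in which every dataset can reach every other one by following RDF links. The sketch below finds the largest one with Kosaraju's algorithm, chosen here only for illustration; the slides do not say which algorithm or tool was used. Node names and edges are hypothetical.

```python
def largest_scc(nodes, edges):
    """Return the largest strongly connected component (Kosaraju's algorithm)."""
    forward = {n: [] for n in nodes}
    backward = {n: [] for n in nodes}
    for source, target in edges:
        forward[source].append(target)
        backward[target].append(source)

    # Pass 1: record nodes in order of DFS completion on the forward graph.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: walk the reversed graph in reverse completion order.
    assigned, best = set(), set()
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for prev in backward[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    component.add(prev)
                    stack.append(prev)
        if len(component) > len(best):
            best = component
    return best

# Hypothetical link graph: a, b and c link in a cycle; d is only a target.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
component = largest_scc(nodes, edges)
share = len(component) / len(nodes)
```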
Conclusion concerning Linking Best Practices 
▪ Some datasets put a lot of effort into linking.
▪ Many datasets only link to a small number of other datasets or do not set RDF links at all.
▪ Similar situation as in 2011.
3. Adoption of the Vocabulary Best Practices 
Goal: Help applications understand the data by 
1. Reusing terms from widely-used vocabularies. 
2. Making definitions of proprietary terms 
dereferencable. 
3. Linking vocabulary terms to terms in other 
vocabularies. 
Widely-Used and Proprietary Vocabularies 
▪ Strong agreement on some vocabularies.
▪ Proprietary vocabularies are used in addition to common ones, as data is often very specific.
Widely-Used Vocabularies 
Proprietary Vocabularies 
Dereferencability of Term URIs and Vocabulary Linking 
▪ 28% of the proprietary vocabularies provide dereferencable URIs.
▪ 21% set RDF links to other vocabularies (8% in 2011)
• Popular linking predicates: rdfs:range, rdfs:subClassOf
4. Adoption of the Metadata Best Practices
1. Publish machine-readable provenance information. 
2. Publish machine-readable licensing information. 
3. Publish metadata about alternative access methods 
(SPARQL endpoints, RDF dumps) 
Provenance and Licensing Metadata 
▪ 37% of the datasets provide provenance information
• Dublin Core is used more than W3C Prov
▪ 10% provide machine-readable licensing information
• Most used predicates: dc:license, cc:license
Dataset Level Metadata (VoID) 
▪ 15% of the datasets publish VoID descriptions.
▪ Via these descriptions, it is possible to discover SPARQL endpoints and dumps for about 10% of the data sources.
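Discovery via VoID typically relies on the `void:sparqlEndpoint` and `void:dataDump` properties of the VoID vocabulary. The sketch below extracts both from a description serialized as N-Triples. The example triples are hypothetical, and the regular expression is a simplification that only handles IRI-only triples; a real application would use a full RDF parser.

```python
import re

VOID = "http://rdfs.org/ns/void#"
ACCESS_PREDICATES = {
    VOID + "sparqlEndpoint": "sparql",
    VOID + "dataDump": "dump",
}

# Hypothetical VoID description serialized as N-Triples.
VOID_NT = """\
<http://example.org/dataset> <http://rdfs.org/ns/void#sparqlEndpoint> <http://example.org/sparql> .
<http://example.org/dataset> <http://rdfs.org/ns/void#dataDump> <http://example.org/dump.nt.gz> .
<http://example.org/dataset> <http://purl.org/dc/terms/title> "Example dataset" .
"""

# Matches only triples whose subject, predicate and object are all IRIs.
IRI_TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")

def access_methods(ntriples: str) -> dict:
    """Map each dataset IRI to its advertised SPARQL endpoints and dumps."""
    found = {}
    for line in ntriples.splitlines():
        match = IRI_TRIPLE.match(line)
        if match is None:
            continue  # literal objects, blank nodes, comments, empty lines
        dataset, predicate, target = match.groups()
        kind = ACCESS_PREDICATES.get(predicate)
        if kind is not None:
            found.setdefault(dataset, {}).setdefault(kind, []).append(target)
    return found

methods = access_methods(VOID_NT)
```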
Conclusion concerning Metadata Best Practices 
▪ Applications cannot rely on the availability of metadata, as only a small fraction of all data sources publishes such data.
▪ The Government and Library domains are positive exceptions.
▪ Similarly low numbers as in 2011.
“Full” LOD Cloud Diagram 
570 datasets 
▪ 374 datahub.io
▪ 196 our crawl
http://lod-cloud.net/ 
Growth of the “Full” LOD Cloud Diagram 
▪ 2011: 295 datasets
▪ 2014: 570 datasets (+ 93 %)
http://lod-cloud.net/ 
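The growth figure follows directly from the two dataset counts on this slide. A short check of the arithmetic:

```python
datasets_2011 = 295
datasets_2014 = 570

# Relative growth from 2011 to 2014, in percent.
growth_percent = (datasets_2014 - datasets_2011) / datasets_2011 * 100
print(f"{growth_percent:.0f} %")  # prints "93 %"
```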
Comparison of Linked Data and Schema.org 
Schema.org 
1. does not expect data publishers to set data links. 
2. relies on marking up data in HTML pages. 
3. Strong application pull by Google, Microsoft, Yahoo! 
Adoption 
WebDataCommons, 2013*: 
463,000 websites (PLDs) provide Microdata annotations. 
Google, 2014**: 
5 million websites provide Schema.org data. 
▪ Orders of magnitude more Schema.org data sources.
* WebDataCommons extracts Microdata, RDFa, Microformat data 
from the CommonCrawl (2.2 billion HTML pages from 12.8 million PLDs). 
** Guha in LDOW2014 Keynote 
Schema.org Topical Focus 
Different topics 
compared to 
Linked Data. 
Class / Property Distribution 
Microdata 2012 
▪ Only a small set of classes / properties is actually used.
▪ Less variety compared to Linked Data.
Shallowness of the Schema.org Data 
schema:Product schema:JobPosting 
Product Names 
• AppleMacBook Air MC968/A 11.6-Inch Laptop 
• Apple MacBook Air 11-in, Intel Core i5 1.60GHz, 64 GB, Lion 10.7 
JobPostings 
• More specific properties like skills are hardly used. 
• 57% of all hiringOrganizations are strings not instances. 
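The string-versus-instance distinction can be made concrete with a small sketch. It assumes the job postings have already been extracted into Python dictionaries, where a plain string stands for a literal value and a nested dictionary stands for an instance with its own properties. The records and the helper name `string_share` are hypothetical.

```python
# Hypothetical job postings, already extracted into Python dictionaries.
postings = [
    {"title": "Data Engineer", "hiringOrganization": "Example Corp"},
    {
        "title": "Librarian",
        "hiringOrganization": {"@type": "Organization", "name": "Example Library"},
    },
    {"title": "Analyst", "hiringOrganization": "Example Ltd"},
]

def string_share(items, prop):
    """Share of items whose value for prop is a plain string, not an instance."""
    values = [item[prop] for item in items if prop in item]
    if not values:
        return 0.0
    return sum(isinstance(value, str) for value in values) / len(values)

share = string_share(postings, "hiringOrganization")
```

Organizations given only as strings have to be matched by parsing and comparing the names, which is why identity resolution is harder on such data.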
Conclusion 
Linked Data                              | Schema.org
~ 1,000 sources                          | > 460,000 sources
covers wider range of specific topics    | topics focused on search engines
(government, libraries, science)         | (products, organizations)
contains more complex data structures    | very simple and shallow data structures
partial ontology agreement               | strong ontology agreement
identity resolution eased by RDF links   | identity resolution often requires value parsing
Thank you. 
References 
▪ Report
http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/
▪ Catalog
http://linkeddatacatalog.dws.informatik.uni-mannheim.de/
Acknowledgement
▪ This work was supported by
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝soniya singh
 

Recently uploaded (20)

VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
₹5.5k {Cash Payment}New Friends Colony Call Girls In [Delhi NIHARIKA] 🔝|97111...
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
 
(+971568250507 ))# Young Call Girls in Ajman By Pakistani Call Girls in ...
(+971568250507  ))#  Young Call Girls  in Ajman  By Pakistani Call Girls  in ...(+971568250507  ))#  Young Call Girls  in Ajman  By Pakistani Call Girls  in ...
(+971568250507 ))# Young Call Girls in Ajman By Pakistani Call Girls in ...
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
 
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
(INDIRA) Call Girl Pune Call Now 8250077686 Pune Escorts 24x7
 
CALL ON ➥8923113531 🔝Call Girls Lucknow Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Lucknow Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Lucknow Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Lucknow Lucknow best sexual service Online
 
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Green Park Escort Service Delhi N.C.R.
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024
 
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
𓀤Call On 7877925207 𓀤 Ahmedguda Call Girls Hot Model With Sexy Bhabi Ready Fo...
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...Nanded City ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready ...
Nanded City ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready ...
 
Hot Call Girls |Delhi |Hauz Khas ☎ 9711199171 Book Your One night Stand
Hot Call Girls |Delhi |Hauz Khas ☎ 9711199171 Book Your One night StandHot Call Girls |Delhi |Hauz Khas ☎ 9711199171 Book Your One night Stand
Hot Call Girls |Delhi |Hauz Khas ☎ 9711199171 Book Your One night Stand
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
 

Adoption of the Linked Data Best Practices in Different Topical Domains

  • 1. Max Schmachtenberg, Christian Bizer, Heiko Paulheim
    Adoption of the Linked Data Best Practices in Different Topical Domains
    Schmachtenberg, Bizer, Paulheim: Adoption of the Linked Data Best Practices, 23.10.2014
  • 2. The Linked Data Best Practices
    Central idea of Linked Data: ease data discovery and integration by complying with a set of best practices.
    1. Linking Best Practices
      • Set RDF links pointing at instances in other data sources.
    2. Vocabulary Best Practices
      • Reuse terms from widely-used vocabularies.
      • Make definitions of proprietary terms dereferencable.
      • Link vocabulary terms to terms in other vocabularies.
    3. Metadata Best Practices
      • Publish machine-readable provenance and licensing metadata.
      • Publish metadata about alternative access methods (SPARQL, dumps).
  • 3. State of the LOD Cloud Report - 2011
    • http://lod-cloud.net/state/
    • Based on information provided by dataset publishers via the datahub.io catalog
  • 4. LOD Cloud - 2011
    Consists of 295 datasets.
  • 5. Outline
    Goal: Update the State of the LOD Cloud report and the LOD Cloud itself to 2014.
    1. Methodology
    2. Adoption of the Linking Best Practices
    3. Adoption of the Vocabulary Best Practices
    4. Adoption of the Metadata Best Practices
    5. Conclusions (in Relation to Schema.org)
  • 6. 1. Methodology: Crawl of the Linked Data Web
    • Crawler: LDSpider, Crawl Date: April 2014
    • Seeds: 560,000 seed URIs from
      1. Example URIs in datahub.io catalog
      2. URIs from BTC2012 dataset
      3. URIs from datasets advertised on public-lod@w3.org mailing list
    • Crawled Data Corpus
      • 900,000 documents containing
      • 8,038,000 resources
      • 1014 datasets
      • 77 datasets prevent crawling via robots.txt
    • Distribution by dataset
      • Red line: documents
      • Blue line: resources
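Slide 6 notes that 77 datasets prevent crawling via robots.txt. The following is a minimal sketch of how a crawler might honor such rules, using Python's standard `urllib.robotparser`. It is not LDSpider's implementation; the user-agent name, the robots.txt bodies, and the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str,
                 user_agent: str = "example-ld-crawler") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the lines of the file; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt bodies for two datasets.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/\n"

blocked = is_crawlable(BLOCK_ALL, "http://example.org/resource/1")
allowed = is_crawlable(BLOCK_PRIVATE, "http://example.org/resource/1")
```

A dataset whose robots.txt matches the first pattern would be counted among those that prevent crawling, while the second still exposes its resource URIs.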
  • 7. Categorization by Topical Domain
    • Used categorization from datahub.io for existing datasets.
    • Manually categorized remaining datasets.
    • Added new category Social Networking.
    • Growth without new category Social Networking: 94 %
    • LODstats (http://stats.lod2.eu/) discovered a similar number of datasets: 1048
  • 8. 2. Adoption of the Linking Best Practices
    Data publishers should set RDF links as:
    1. Discoverability depends on being linked.
    2. RDF links ease data integration.
  • 9. Degrees
    • 56% of all datasets set RDF links pointing to other datasets.
      • The remaining 44% are either only the target of RDF links from other datasets or are isolated.
    • Datasets with top in- and outdegrees: (table not reproduced in this transcript)
    • Most widely used linking predicates: owl:sameAs, rdfs:seeAlso, foaf:knows
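To make the degree figures on slide 9 concrete, here is a small sketch of how in- and outdegrees, and the share of datasets that set outgoing links, could be computed from dataset-level links. The dataset names and links are invented for illustration and are not taken from the crawl.

```python
from collections import defaultdict

# Hypothetical dataset-level links: (source dataset, target dataset).
links = [
    ("A", "dbpedia"),
    ("B", "dbpedia"),
    ("B", "geonames"),
    ("dbpedia", "geonames"),
]
datasets = {"A", "B", "C", "dbpedia", "geonames"}  # "C" is isolated


def degrees(datasets, links):
    """Count distinct linked datasets per dataset (self-links are ignored)."""
    out_targets = defaultdict(set)
    in_sources = defaultdict(set)
    for source, target in links:
        if source != target:
            out_targets[source].add(target)
            in_sources[target].add(source)
    indeg = {d: len(in_sources[d]) for d in datasets}
    outdeg = {d: len(out_targets[d]) for d in datasets}
    return indeg, outdeg


indeg, outdeg = degrees(datasets, links)

# Share of datasets that set at least one link to another dataset.
share_linking = sum(1 for d in datasets if outdeg[d] > 0) / len(datasets)
```

In this toy graph three of the five datasets set outgoing links; the other two are either only link targets ("geonames") or isolated ("C"), mirroring the distinction made on the slide.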
  • 10. “Crawlable” LOD Cloud 2014
  • 11. Degree Distributions
    • Dotted line: Social Networking (status.net, etc.)
    • Solid line: Cross-Domain datasets (DBpedia, etc.)
    • Largest Strongly Connected Component: 36% (377 datasets)
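Slide 11 reports the size of the largest strongly connected component of the dataset link graph. As an illustration of what that measure means, the sketch below finds the largest strongly connected component with Kosaraju's two-pass algorithm. The slides do not say which algorithm or tool the authors used, and the graph here is a hypothetical toy example.

```python
def largest_scc(nodes, edges):
    """Return the largest strongly connected component as a set of nodes."""
    graph = {n: [] for n in nodes}
    reverse = {n: [] for n in nodes}
    for source, target in edges:
        graph[source].append(target)
        reverse[target].append(source)

    # Pass 1: iterative DFS on the original graph, recording finishing order.
    visited, order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph in reverse finishing order.
    assigned, best = set(), set()
    for start in reversed(order):
        if start in assigned:
            continue
        component, todo = set(), [start]
        assigned.add(start)
        while todo:
            node = todo.pop()
            component.add(node)
            for nxt in reverse[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    todo.append(nxt)
        if len(component) > len(best):
            best = component
    return best


# Hypothetical link graph: A, B, C link in a cycle; D and E hang off it.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E")]
component = largest_scc(nodes, edges)
share = len(component) / len(nodes)
```

Datasets in the same component can all reach one another by following RDF links, which is why the size of the largest component is a useful indicator of how well the cloud is interlinked.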
  • 12. Conclusion concerning Linking Best Practices
    • Some datasets put a lot of effort into linking.
    • Many datasets only link to a small number of other datasets or do not set RDF links at all.
    • Similar situation as in 2011.
  • 13. 3. Adoption of the Vocabulary Best Practices
    Goal: Help applications understand the data by
    1. Reusing terms from widely-used vocabularies.
    2. Making definitions of proprietary terms dereferencable.
    3. Linking vocabulary terms to terms in other vocabularies.
  • 14. Widely-Used and Proprietary Vocabularies
    • Strong agreement on some vocabularies.
    • Proprietary vocabularies are used in addition to common ones, as data is often very specific.
    Panels: Widely-Used Vocabularies, Proprietary Vocabularies
  • 15. Dereferencability of Term URIs and Vocabulary Linking
    • 28% of the proprietary vocabularies provide dereferencable URIs.
    • 21% set RDF links to other vocabularies (8% in 2011).
      • Popular linking predicates: rdfs:range, rdfs:subClassOf
  • 16. 4. Adoption of the Metadata Best Practices
    1. Publish machine-readable provenance information.
    2. Publish machine-readable licensing information.
    3. Publish metadata about alternative access methods (SPARQL endpoints, RDF dumps).
  • 17. Provenance and Licensing Metadata
    • 37% of the datasets provide provenance information.
      • Dublin Core is used more than W3C Prov.
    • 10% provide machine-readable licensing information.
      • Most used predicates: dc:license, cc:license
  • 18. Dataset Level Metadata (VoID)
    • 15% of the datasets publish VoID descriptions.
    • Via these descriptions, it is possible to discover SPARQL endpoints and dumps for about 10% of the data sources.
  • 19. Conclusion concerning Metadata Best Practices
    • Applications cannot rely on the availability of metadata, as only a small fraction of all data sources publishes such data.
    • The Government and Library domains are positive exceptions.
    • Similarly low numbers as in 2011.
  • 20. “Full” LOD Cloud Diagram
    570 datasets
    • 374 datahub.io
    • 196 our crawl
    http://lod-cloud.net/
  • 21. Growth of the “Full” LOD Cloud Diagram
    • 2011: 295 datasets
    • 2014: 570 datasets (+ 93 %)
    http://lod-cloud.net/
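The growth figure on slide 21 follows directly from the dataset counts on slides 4, 20, and 21. A short check of the arithmetic (the function and variable names are illustrative only):

```python
def growth_percent(old: int, new: int) -> float:
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100


datasets_2011 = 295
# 2014 total: datahub.io entries plus datasets found by the crawl (slide 20).
datasets_2014 = 374 + 196

growth = growth_percent(datasets_2011, datasets_2014)
print(f"{datasets_2011} -> {datasets_2014} datasets: +{growth:.0f} %")
```

The unrounded value is roughly 93.2 %, which the slide reports as + 93 %.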
  • 22. Comparison of Linked Data and Schema.org
    Schema.org
    1. does not expect data publishers to set data links.
    2. relies on marking up data in HTML pages.
    3. has strong application pull from Google, Microsoft, Yahoo!
  • 23. Adoption
    WebDataCommons, 2013*: 463,000 websites (PLDs) provide Microdata annotations.
    Google, 2014**: 5 million websites provide Schema.org data.
    • Orders of magnitude more Schema.org data sources.
    * WebDataCommons extracts Microdata, RDFa, Microformat data from the CommonCrawl (2.2 billion HTML pages from 12.8 million PLDs).
    ** Guha in LDOW2014 Keynote
  • 24. Schema.org Topical Focus
    Different topics compared to Linked Data.
  • 25. Class / Property Distribution Microdata 2012
    • Only a small set of classes / properties is actually used.
    • Less variety compared to Linked Data.
  • 26. Shallowness of the Schema.org Data
    schema:Product: Product Names
      • AppleMacBook Air MC968/A 11.6-Inch Laptop
      • Apple MacBook Air 11-in, Intel Core i5 1.60GHz, 64 GB, Lion 10.7
    schema:JobPosting: JobPostings
      • More specific properties like skills are hardly used.
      • 57% of all hiringOrganizations are strings, not instances.
  • 27. Conclusion
    Linked Data | Schema.org
    ~ 1,000 sources | > 460,000 sources
    covers wider range of specific topics (government, libraries, science) | topics focused on search engines (products, organizations)
    contains more complex data structures | very simple and shallow data structures
    partial ontology agreement | strong ontology agreement
    identity resolution eased by RDF links | identity resolution often requires value parsing
  • 28. Thank you.
    References
    • Report: http://linkeddatacatalog.dws.informatik.uni-mannheim.de/state/
    • Catalog: http://linkeddatacatalog.dws.informatik.uni-mannheim.de/
    Acknowledgement
    • This work was supported by