Detecting Good Practices and
Pitfalls when Publishing
Vocabularies on the Web
María Poveda-Villalón (1), Bernard Vatant (2), Mari Carmen Suárez-Figueroa (1), Asunción Gómez-Pérez (1)
(1) Ontology Engineering Group, Universidad Politécnica de Madrid, Spain
(2) Mondeca, Paris, France

mpoveda@fi.upm.es, bernard.vatant@mondeca.com, {mcsuarez, asun}@fi.upm.es

Speaker: Asunción Gómez-Pérez
Contact author: María Poveda-Villalón: mpoveda@fi.upm.es

Date: 10/28/13
Table of Contents
•  Introduction
•  Good practices and pitfalls for publishing vocabularies
•  Results and Analysis over LOV vocabularies
•  Conclusions and future work

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

Introduction
•  Different formats: RDFS, OWL, HTML
•  Different configurations
•  Do they ease or impede applications consuming vocabularies?
➢  Good practices & Pitfalls

Vocabularies bring semantics to data

Figure: "Linking Open Data cloud diagram", by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

In this work:
•  Detailed analysis of 355 vocabularies gathered in the LOV registry (http://lov.okfn.org/)
•  Why LOV: it provides complete information about each vocabulary, namely URI, namespace and prefix
•  Results:
1.  a non-exhaustive list of good practices and pitfalls for publishing LD vocabularies
2.  specific methods for detecting these good practices and pitfalls
3.  some metadata about ontology quality
4.  the inclusion of the pitfalls in services such as OOPS! (http://www.oeg-upm.net/oops) to help eager vocabulary managers

Table of Contents
•  Introduction
•  Good practices and pitfalls for publishing vocabularies
   •  Previous work
   •  Proposed good practices and pitfalls
•  Results and analysis over LOV vocabularies
•  Conclusions and future work

Previous work (I)
Linked Open Data 5 Star rating system (Tim Berners-Lee). http://www.w3.org/DesignIssues/LinkedData.html. 2006 (last change 2009).
LOD1. Available on the web (whatever format) but with an open licence, to be Open Data
LOD2. Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
LOD3. As LOD2, plus a non-proprietary format (e.g. CSV instead of Excel)
LOD4. All the above, plus use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
LOD5. All the above, plus link your data to other people's data to provide context

Is your linked data vocabulary 5-star? (Bernard Vatant). http://bvatant.blogspot.fr/2012/02/is-your-linked-data-vocabulary-5-star_9588.html. 2012.
LDV1. Publish your vocabulary on the Web at a stable URI
LDV2. Provide human-readable documentation and basic metadata such as creator, publisher, date of creation, last modification, version number
LDV3. Provide labels and descriptions, if possible in several languages, to make your vocabulary usable in multiple linguistic scopes
LDV4. Make your vocabulary available via its namespace URI, both as a formal file and as human-readable documentation, using content negotiation
LDV5. Link to other vocabularies by re-using elements rather than re-inventing
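
LDV4 asks for the same namespace URI to serve both a formal file and HTML documentation through content negotiation. A minimal sketch of how a client could probe this is given below. It is not the detection method used by the authors: the function names and the list of media types are assumptions made for illustration, and only the Python standard library is used.

```python
from urllib.request import Request, urlopen

# Assumed, non-exhaustive lists of media types; adjust them to your needs.
RDF_TYPES = {"application/rdf+xml", "text/turtle", "text/n3", "application/ld+json"}
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def classify_content_type(header_value):
    """Map a Content-Type header value to 'rdf', 'html' or 'other'."""
    # Drop parameters such as '; charset=utf-8' and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    if media_type in RDF_TYPES:
        return "rdf"
    if media_type in HTML_TYPES:
        return "html"
    return "other"


def negotiate(uri, accept, timeout=10):
    """Request `uri` with the given Accept header and classify the answer.

    urlopen follows HTTP redirects (including 303) by default, so the
    classification applies to the final response.
    """
    request = Request(uri, headers={"Accept": accept})
    with urlopen(request, timeout=timeout) as response:
        return classify_content_type(response.headers.get("Content-Type", ""))
```

Under these assumptions, a vocabulary would satisfy the RDF side of content negotiation if `negotiate(uri, "application/rdf+xml")` returns `"rdf"`, and the HTML side if `negotiate(uri, "text/html")` returns `"html"`.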

Previous work (II)
Archer, P., Goedertier, S., and Loutas, N. D7.1.3 – Study on persistent URIs, with identification of
best practices and recommendations on the topic for the MSs and the EC. Deliverable. December
17, 2012.

Heath, T., Bizer, C.: Linked data: Evolving the Web into a global data space (1st edition). Morgan &
Claypool. 2011.

Proposed good practices and pitfalls
Proposals

Good practices
GP1. Provide RDF description
GP2. Provide HTML documentation
GP3. Content negotiation for RDF
GP4. Content negotiation for HTML
GP5. Provide vann metadata
GP6. Well-established prefix

Pitfalls
P36. URI contains file extension
P37. Ontology not available
P38. No OWL ontology declaration
P39. Ambiguous namespace
P40. Namespace hijacking

Inspired by

Previous work brief reminder

Linked Open Data 5 Star
LOD1. on the web. Open.
LOD2. machine-readable
LOD3. non-proprietary
LOD4. open standards
LOD5. Link

Is your linked data vocabulary 5-star?
LDV1. vocabulary on the Web
LDV2. human-readable and metadata
LDV3. labels and descriptions
LDV4. content negotiation
LDV5. Link

Linked data: Evolving the Web into a global data space:
“Only define new terms in a namespace that you control.”

10 rules for persistent URIs

✔
• Follow the pattern
• Re-use existing identifiers
• Multiple representations
• Implement 303 redirects
• Use a dedicated server

✖
• Avoid stating ownership
• Avoid version numbers
• Avoid using auto-increment
• Avoid query strings
• Avoid file extensions
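
Pitfall P36 (URI contains file extension) mirrors the rule "Avoid file extensions". One possible way to check for it is sketched below. The slides do not say which extensions the authors look for, so the `SUSPECT_EXTENSIONS` set and the function name are assumptions made for illustration.

```python
from posixpath import splitext
from urllib.parse import urlsplit

# Illustrative list only; the extensions checked by the authors are not given.
SUSPECT_EXTENSIONS = {".owl", ".rdf", ".rdfs", ".ttl", ".n3", ".xml", ".html", ".php"}


def has_file_extension(uri):
    """Return True if the path part of `uri` ends in a known file extension."""
    # urlsplit separates the fragment (#Term) and query from the path.
    path = urlsplit(uri).path.rstrip("/")
    _, extension = splitext(path)
    return extension.lower() in SUSPECT_EXTENSIONS
```

Comparing against an explicit list, rather than flagging any dot in the path, avoids false positives on namespaces that contain a version segment such as `/ns/1.0/`.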


Results and analysis over LOV vocabularies (I)
Good practices and pitfalls frequency

355 vocabularies registered in LOV - 19th June, 2013

Good practices distribution (chart legend):
GP1. Provide RDF description
GP2. Provide HTML documentation
GP3. Content negotiation for RDF
GP4. Content negotiation for HTML
GP5. Provide vann metadata
GP6. Well-established prefix

Pitfalls distribution (chart legend):
P36. URI contains file extension
P37. Ontology not available
P38. No OWL ontology declaration
P39. Ambiguous namespace
P40. Namespace hijacking
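
Two of the items listed here, P38 (No OWL ontology declaration) and GP5 (Provide vann metadata), can be checked directly on the triples of a vocabulary. The sketch below assumes the vocabulary has already been parsed into `(subject, predicate, object)` tuples of plain strings; the function names are hypothetical, and requiring both VANN properties for GP5 is an assumption, not a rule stated in the slides.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_ONTOLOGY = "http://www.w3.org/2002/07/owl#Ontology"
VANN_PREFIX = "http://purl.org/vocab/vann/preferredNamespacePrefix"
VANN_URI = "http://purl.org/vocab/vann/preferredNamespaceUri"


def declares_owl_ontology(triples):
    """P38 is avoided when some resource is typed as owl:Ontology."""
    return any(p == RDF_TYPE and o == OWL_ONTOLOGY for _, p, o in triples)


def provides_vann_metadata(triples):
    """GP5 check, assuming both VANN properties must be present."""
    predicates = {p for _, p, _ in triples}
    return VANN_PREFIX in predicates and VANN_URI in predicates


# Hypothetical vocabulary used only to exercise the two checks.
example = [
    ("http://example.org/vocab", RDF_TYPE, OWL_ONTOLOGY),
    ("http://example.org/vocab", VANN_PREFIX, "ex"),
    ("http://example.org/vocab", VANN_URI, "http://example.org/vocab#"),
]
```

With the `example` triples both checks succeed; removing the first triple would make the vocabulary fall into P38.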
Results and analysis over LOV vocabularies (II)
Grid with vocabularies according to the number of good practices and pitfalls observed.
Available at http://goo.gl/zu9ZbW
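
The grid places each vocabulary according to how many good practices and how many pitfalls were observed for it. A minimal sketch of that grouping follows; the input format, the function name and the sample vocabularies are assumptions made for illustration and do not reproduce the published grid.

```python
from collections import defaultdict


def build_grid(observations):
    """Group vocabulary names by (number of good practices, number of pitfalls).

    `observations` maps a vocabulary name to a pair of sets:
    the good practice codes and the pitfall codes observed for it.
    """
    grid = defaultdict(list)
    for name, (good_practices, pitfalls) in sorted(observations.items()):
        grid[(len(good_practices), len(pitfalls))].append(name)
    return dict(grid)


# Hypothetical observations, not taken from the LOV analysis.
sample = {
    "vocab-a": ({"GP1", "GP2", "GP3"}, set()),
    "vocab-b": ({"GP1"}, {"P36", "P38"}),
    "vocab-c": ({"GP2", "GP4", "GP6"}, set()),
}
grid = build_grid(sample)
```

Each key of `grid` is one cell of the grid, so vocabularies that share a cell have the same counts even when the specific good practices or pitfalls differ.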


Conclusions
•  6 good practices and 5 pitfalls proposed
   •  Based on existing works
•  Implementation of the detection methods
•  Grid-based rating system proposed. Useful for:
   •  Vocabulary registry maintainers
   •  Vocabulary developers and creators
•  Execution over 355 vocabularies
   •  All good practices and pitfalls are observed
   •  Some of them surprisingly (e.g. P40. Namespace hijacking)
   •  LOV vocabularies seem to be well maintained and likely to be of high quality. Is this due to semi-handcrafted maintenance instead of crawlers?

Future work (I)

•  Take into account:
   •  metadata about licences
   •  other metadata, e.g. creators, authors, dates, languages, etc.
   •  linguistic information
   •  reused terms from other vocabularies
•  Provide guidelines to solve pitfalls and to follow good practices
•  Execute the methods over LOV on a regular basis
   •  Observe the evolution of the ecosystem
   •  Draw trends for vocabulary publication

Previous work brief reminder

Linked Open Data 5 Star
LOD1. on the web. Open.
LOD2. machine-readable
LOD3. non-proprietary
LOD4. open standards
LOD5. Link

Is your linked data vocabulary 5-star?
LDV1. vocabulary on the Web
LDV2. human-readable and metadata
LDV3. labels and descriptions
LDV4. content negotiation
LDV5. Link

Future work (II)
•  Integration with third party systems. E.g.
   •  LOV search
   •  OOPS! - OntOlogy Pitfall Scanner! (http://oeg-upm.net/oops/)
      ✓  Done for pitfalls
•  Assign importance levels for good practices and pitfalls
   ✓  Done for pitfalls
Questions?

Thanks!

Handwritten Text Recognition for manuscripts and early printed texts
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web

  • 1. Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web
      María Poveda-Villalón (1), Bernard Vatant (2), Mari Carmen Suárez-Figueroa (1), Asunción Gómez-Pérez (1)
      (1) Ontology Engineering Group. Universidad Politécnica de Madrid. Spain.
      (2) Mondeca, Paris, France.
      mpoveda@fi.upm.es, bernard.vatant@mondeca.com, {mcsuarez, asun}@fi.upm.es
      Speaker: Asunción Gómez-Pérez
      Contact author: María Poveda-Villalón (mpoveda@fi.upm.es)
      Date: 10/28/13
  • 2. Table of Contents
      • Introduction
      • Good practices and pitfalls for publishing vocabularies
      • Results and analysis over LOV vocabularies
      • Conclusions and future work
  • 3. Introduction
      Vocabularies bring semantics to data.
      • Different formats: RDFS, OWL, HTML
      • Different configurations
      • Do they ease or impede applications consuming vocabularies?
          → Good practices & pitfalls
      Figure: "Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/"
      In this work:
      • Detailed analysis of 355 vocabularies gathered in the LOV registry (http://lov.okfn.org/)
      • Why LOV: complete information about each vocabulary, namely URI, namespace and prefix
      • Results:
          1. a non-exhaustive list of good practices and pitfalls about publishing LD vocabularies
          2. specific methods for detecting such good practices and pitfalls
          3. some metadata about ontology quality
          4. the inclusion of pitfalls in services such as OOPS! (http://www.oeg-upm.net/oops) to help eager vocabulary managers
  • 4. Table of Contents
      • Introduction
      • Good practices and pitfalls for publishing vocabularies
          • Previous work
          • Proposed good practices and pitfalls
      • Results and analysis over LOV vocabularies
      • Conclusions and future work
  • 5. Previous work (I)
      Linked Open Data 5 Star rating system (Tim Berners-Lee). http://www.w3.org/DesignIssues/LinkedData.html. 2006 (last change 2009).
          LOD1. Available on the web (whatever format) but with an open licence, to be Open Data
          LOD2. Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)
          LOD3. As (2) plus non-proprietary format (e.g. CSV instead of Excel)
          LOD4. All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
          LOD5. All the above, plus: link your data to other people's data to provide context
      Is your linked data vocabulary 5-star? (Bernard Vatant). http://bvatant.blogspot.fr/2012/02/is-your-linked-data-vocabulary-5-star_9588.html. 2012.
          LDV1. Publish your vocabulary on the Web at a stable URI
          LDV2. Provide human-readable documentation and basic metadata such as creator, publisher, date of creation, last modification, version number
          LDV3. Provide labels and descriptions, if possible in several languages, to make your vocabulary usable in multiple linguistic scopes
          LDV4. Make your vocabulary available via its namespace URI, both as a formal file and human-readable documentation, using content negotiation
          LDV5. Link to other vocabularies by re-using elements rather than re-inventing
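A content negotiation check in the spirit of LDV4 might look like the minimal Python sketch below. It is not the authors' detection method: the sets of media types counted as "RDF" or "HTML", and the helper names `build_request`, `classify` and `negotiate`, are assumptions made for illustration. Only `negotiate` touches the network; the other helpers are pure functions.

```python
from urllib.request import Request, urlopen

# Assumption of this sketch: which media types count as RDF or HTML.
RDF_TYPES = {"application/rdf+xml", "text/turtle",
             "application/n-triples", "application/ld+json"}
HTML_TYPES = {"text/html", "application/xhtml+xml"}


def build_request(namespace_uri: str, accept: str) -> Request:
    """Prepare a GET request asking for one representation of the vocabulary."""
    return Request(namespace_uri, headers={"Accept": accept})


def classify(content_type_header: str) -> str:
    """Map a Content-Type header value to 'rdf', 'html' or 'other'."""
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if media_type in RDF_TYPES:
        return "rdf"
    if media_type in HTML_TYPES:
        return "html"
    return "other"


def negotiate(namespace_uri: str, accept: str, timeout: float = 10.0) -> str:
    """Dereference the namespace URI (urlopen follows redirects such as 303)
    and report which kind of representation the server returned."""
    with urlopen(build_request(namespace_uri, accept), timeout=timeout) as response:
        return classify(response.headers.get("Content-Type", ""))
```

Under these assumptions, a vocabulary would pass the RDF side of the check when `negotiate(uri, "application/rdf+xml")` returns `"rdf"`, and the HTML side when `negotiate(uri, "text/html")` returns `"html"`.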
  • 6. Previous work (II)
      Archer, P., Goedertier, S., and Loutas, N. D7.1.3 – Study on persistent URIs, with identification of best practices and recommendations on the topic for the MSs and the EC. Deliverable. December 17, 2012.
      Heath, T., Bizer, C.: Linked data: Evolving the Web into a global data space (1st edition). Morgan & Claypool. 2011.
  • 7. Proposed good practices and pitfalls
      Proposals
          Good practices
              GP1. Provide RDF description
              GP2. Provide HTML documentation
              GP3. Content negotiation for RDF
              GP4. Content negotiation for HTML
              GP5. Provide vann metadata
              GP6. Well-established prefix
          Pitfalls
              P36. URI contains file extension
              P37. Ontology not available
              P38. No OWL ontology declaration
              P39. Ambiguous namespace
              P40. Namespace hijacking
      Inspired by (previous work, brief reminder)
          Linked Open Data 5 Star
              LOD1. On the web. Open.
              LOD2. Machine-readable
              LOD3. Non-proprietary
              LOD4. Open standards
              LOD5. Link
          Is your linked data vocabulary 5-star?
              LDV1. Vocabulary on the Web
              LDV2. Human-readable and metadata
              LDV3. Labels and descriptions
              LDV4. Content negotiation
              LDV5. Link
          10 rules for persistent URIs
              • Follow the pattern
              • Re-use existing identifiers
              • Multiple representations
              • Implement 303 redirects
              • Use a dedicated server
              • Avoid stating ownership
              • Avoid version numbers
              • Avoid using auto-increment
              • Avoid query strings
              • Avoid file extensions
          Linked data: Evolving the Web into a global data space
              "Only define new terms in a namespace that you control."
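As an illustration of how a pitfall such as P36 (URI contains file extension) could be detected, here is a minimal Python sketch. The slides do not describe the actual detection method, so both the function name `has_file_extension` and the list of extensions are assumptions made for this example.

```python
import posixpath
from urllib.parse import urlsplit

# Assumption of this sketch: extensions that count as "file extensions" for P36.
KNOWN_EXTENSIONS = {".owl", ".rdf", ".ttl", ".n3", ".nt",
                    ".xml", ".html", ".htm", ".php"}


def has_file_extension(uri: str) -> bool:
    """Return True if the last path segment of the URI ends in a known extension."""
    path = urlsplit(uri).path                      # drops query string and fragment
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    _, extension = posixpath.splitext(last_segment)
    return extension.lower() in KNOWN_EXTENSIONS
```

Matching against an explicit extension list, rather than flagging any dot in the last segment, keeps version-like segments such as `0.1` from being reported as file extensions.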
  • 8. Table of Contents
      • Introduction
      • Good practices and pitfalls for publishing vocabularies
      • Results and analysis over LOV vocabularies
      • Conclusions and future work
  • 9. Results and analysis over LOV vocabularies (I)
      Good practices and pitfalls frequency
      355 vocabularies registered in LOV (19th June, 2013)
      [Chart: Good practices distribution]
          GP1. Provide RDF description
          GP2. Provide HTML documentation
          GP3. Content negotiation for RDF
          GP4. Content negotiation for HTML
          GP5. Provide vann metadata
          GP6. Well-established prefix
      [Chart: Pitfalls distribution]
          P36. URI contains file extension
          P37. Ontology not available
          P38. No OWL ontology declaration
          P39. Ambiguous namespace
          P40. Namespace hijacking
  • 10. Results and analysis over LOV vocabularies (I)
      Grid with vocabularies according to the number of good practices and pitfalls observed. Available at http://goo.gl/zu9ZbW
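A grid of this kind can be sketched in Python by grouping vocabularies by the pair (number of good practices, number of pitfalls). The input shape and the function name `build_grid` are assumptions of this sketch, and the prefixes in the usage note are hypothetical placeholders, not results from the LOV analysis.

```python
from collections import defaultdict


def build_grid(observations):
    """Group vocabulary prefixes by (good practice count, pitfall count).

    `observations` maps a vocabulary prefix to a pair of sets:
    (good practice codes observed, pitfall codes observed).
    """
    grid = defaultdict(list)
    for prefix, (good_practices, pitfalls) in observations.items():
        grid[(len(good_practices), len(pitfalls))].append(prefix)
    # Sort the prefixes in each cell so the output is deterministic.
    return {cell: sorted(prefixes) for cell, prefixes in grid.items()}
```

For example, with hypothetical data, `build_grid({"ex1": ({"GP1", "GP2"}, set()), "ex2": ({"GP1"}, {"P36"})})` places `ex1` in cell `(2, 0)` and `ex2` in cell `(1, 1)`.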
  • 11. Table of Contents
      • Introduction
      • Good practices and pitfalls for publishing vocabularies
      • Results and analysis over LOV vocabularies
      • Conclusions and future work
  • 12. Conclusions
      • 6 good practices and 5 pitfalls proposed
      • Based on existing works
      • Implementation of the detection methods
      • Grid-based rating system proposed. Useful for:
          • Vocabulary registry maintainers
          • Vocabulary developers and creators
      • Execution over 355 vocabularies
      • All good practices and pitfalls are observed
      • Some of them surprisingly (e.g., P40. Namespace hijacking)
      • LOV vocabularies seem to be well maintained and likely to be of high quality. Due to semi-handcrafted maintenance instead of crawlers?
  • 13. Future work (I)
      • Take into account:
          • metadata about licences
          • other metadata, e.g., creators, authors, dates, languages, etc.
          • linguistic information
          • reused terms from other vocabularies
      • Provide guidelines to solve pitfalls and to follow good practices
      • Execute methods over LOV on a regular basis
      • Observe evaluation of the ecosystem
      • Draw trends for vocabulary publication
      Linked Open Data 5 Star
          LOD1. On the web. Open.
          LOD2. Machine-readable
          LOD3. Non-proprietary
          LOD4. Open standards
          LOD5. Link
      Is your linked data vocabulary 5-star?
          LDV1. Vocabulary on the Web
          LDV2. Human-readable and metadata
          LDV3. Labels and descriptions
          LDV4. Content negotiation
          LDV5. Link
  • 14. Future work (II)
      • Integration with third party systems, e.g.:
          • LOV search
          • OOPS! - OntOlogy Pitfall Scanner! (http://oeg-upm.net/oops/)
              ✓ Done for pitfalls
      • Assign importance levels for good practices and pitfalls
              ✓ Done for pitfalls
  • 15. Questions? Thanks!