The importance of metadata for
datasets: the DCAT-AP European
standard
Giorgia Lodi –
giorgia.lodi@gmail.com
Summary
• The importance of metadata in (open) data management
• The DCAT standard, the European DCAT-AP version 1
and its new evolution
• Focus on the Italian extension of DCAT-AP, named DCAT-AP_IT
Data vs metadata
DATA
Physical representation of facts, atomic events, objective
phenomena, information suitable for communication,
interpretation and processing by human beings or automatic
means
METADATA
Data that defines other data (they are NOT the data itself!).
Examples: bibliographic reference, the author of a document,
the date of last modification of a dataset
Additional definitions
ONTOLOGY: a formal and explicit specification of a shared
representation (conceptualization) of a knowledge domain,
defined on the basis of specific requirements
CONTROLLED VOCABULARY: a set of predefined and authoritative
standard terms and codes, pre-selected for the purpose of
indexing and retrieving information
DESCRIPTIVE METADATA: identify and describe digital objects
The importance of metadata
• They allow a better understanding of the data they describe
• Facilitate the discoverability of the data
• If they are defined using standard and shared ontologies
and controlled vocabularies they facilitate:
• Information exchange
• Interoperability
• Reuse and valorisation of public information
FAIR principles
FINDABLE
The first step in (re)using data is to find them. Metadata and data should be easy to find for
both humans and computers. Machine-readable metadata are essential for automatic
discovery of datasets and services
ACCESSIBLE
Once users find the required data, they need to know how the data can be accessed
INTEROPERABLE
The data usually need to be integrated with other data. In addition, the data need to
interoperate with applications or workflows for analysis, storage, and processing
REUSABLE
Metadata and data should be well described so that they can be replicated and/or
combined in different settings
European Directive 1/2
ARTICLE 5 – AVAILABLE FORMATS
“…public sector bodies and public undertakings shall make
their documents available in any pre-existing format or
language and, where possible and appropriate, by electronic
means, in formats that are open, machine-readable,
accessible, findable and re-usable, together with their
metadata. Both the format and the metadata shall, where
possible, comply with formal open standards”
European Directive 2/2
ARTICLE 9 – PRACTICAL ARRANGEMENTS
“Member States shall make practical arrangements facilitating
the search for documents available for re-use, such as asset
lists of main documents with relevant metadata, accessible
where possible and appropriate online and in machine-
readable format, and portal sites that are linked to the
asset lists. Where possible, Member States shall facilitate the
cross-linguistic search for documents, in particular by
enabling metadata aggregation at Union level”
Italian legislation – open data definition
AVAILABLE (LEGAL REQUIREMENT): released in disaggregated form under the
terms of an open licence allowing its re-use, also for commercial
purposes
ACCESSIBLE (TECHNOLOGICAL REQUIREMENT): by machines, in an
open format and with associated metadata
FREE OF CHARGE (ECONOMIC REQUIREMENT): free of charge or at
marginal costs incurred for reproduction, making available and
dissemination
Metadata: European and Italian scenarios
• We still observe different levels of quality for metadata
• There are still different platforms being used for cataloguing
data based on metadata
• CKAN, DKAN, Socrata, Linked-data based platforms and
proprietary infrastructures
• There are still different thematic classifications for datasets
• There are still different ways to specify licenses
HOWEVER
Common model for metadata specification
A common European data model for metadata
helps overcome the previously mentioned
obstacles
The data model offers a harmonised and shared way
to specify metadata for datasets, with a focus on
information that is particularly relevant for re-users
DCAT-AP
DCAT-AP specifications
• DCAT: Data CATalog vocabulary. Web standard based on the RDF
standard. It provides a data model for the descriptions of datasets
(not only open datasets) as stored in catalogs
• DCAT-AP: European Data CATalog vocabulary, Application Profile.
Based on RDF, it is a set of constraints added to the DCAT
specification that facilitates data and metadata exchange
• National profiles (NO, DE, CH, IT, …): National Data CATalog vocabulary,
Application Profiles. Defined by the different Member States that
adhere to the European DCAT-AP initiative. Based on the RDF standard,
they typically include additional constraints or properties while
maintaining compliance with DCAT-AP
DCAT-AP extensions for specific types of data
• GeoDCAT-AP
- Facilitates metadata exchange between geospatial catalogs and
data catalogs in general
• StatDCAT-AP
- Extends the DCAT-AP specification with a small number of elements
that are relevant in order to describe statistical datasets
- Facilitates metadata exchange between statistical catalogs and data
catalogs in general
Useful extensions to enable interoperability among different data catalogs
DCAT vocabulary
• There are two versions of it
o Version 1.0 of 2014
o Version 2.0 of 2019
o Both versions directly reuse standard and well-known
ontologies (such as FOAF, Dublin Core and SKOS)
DCAT version 1
• It is the latest Web recommendation
• It includes four main concepts (classes) for describing data in catalogs
o Catalog: a collection of metadata about datasets
o Catalog Record: represents a metadata item in the catalog, primarily
concerning the registration information, such as who added the item and
when
o Dataset: a collection of data, published or curated by a single agent, and
available for access or download in one or more serializations or formats
o Distribution: represents a specific format of the dataset or a specific
endpoint. Examples of distributions include a downloadable CSV file, an
API or an RSS feed
DCAT version 1 – data model
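As a rough illustration, the four DCAT version 1 classes and the links between them can be sketched as plain Python data classes. This is only an in-memory model under stated assumptions, not an RDF implementation: the field names (`access_url`, `media_format`, `added_by`, `added_on`) and the `register` helper are hypothetical and chosen for readability.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Distribution:
    """One accessible form of a dataset (a CSV file, an API, an RSS feed)."""
    access_url: str
    media_format: str


@dataclass
class Dataset:
    """A collection of data published or curated by a single agent."""
    title: str
    publisher: str
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class CatalogRecord:
    """Registration information about one dataset entry in a catalog."""
    dataset: Dataset
    added_by: str
    added_on: str  # ISO 8601 date, kept as a string in this sketch


@dataclass
class Catalog:
    """A collection of metadata about datasets."""
    title: str
    datasets: List[Dataset] = field(default_factory=list)
    records: List[CatalogRecord] = field(default_factory=list)

    def register(self, dataset: Dataset, added_by: str, added_on: str) -> CatalogRecord:
        """Add a dataset and keep track of who registered it and when."""
        record = CatalogRecord(dataset, added_by, added_on)
        self.datasets.append(dataset)
        self.records.append(record)
        return record
```

A catalog is then populated with `catalog.register(dataset, added_by="editor", added_on="2019-10-01")`, which keeps the dataset description and its registration record separate, as the Catalog Record class intends.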
DCAT version 1 – conformance 1/2
• According to the standard, a catalog conforms to DCAT if:
o It is organized into datasets and distributions
o There exists an RDF description of the catalog (independently of the
specific RDF serialization used to represent it)
o The contents of all metadata fields that are held in the catalog, and
that contain data about the catalog itself and its datasets and
distributions, are included in this RDF description
o All classes and properties are used consistently with the semantics of the
specification
o Additional non-DCAT metadata fields and RDF data may also be included
DCAT version 1 – conformance 2/2
DCAT PROFILE
A DCAT profile can be defined. A profile adds additional constraints to
DCAT. The constraints can be
o A minimum set of metadata fields that are mandatory (in contrast to
the open world assumption of DCAT vocabulary)
o Classes and properties for additional metadata fields that are not
covered in DCAT
o Controlled vocabularies or URI sets as acceptable values for some
properties (e.g., language, themes, etc.)
o Requirements for specific access mechanisms (RDF syntaxes,
protocols) to the catalog’s RDF description
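The first and third kinds of constraint (mandatory fields and controlled vocabularies) can be sketched as a small checker. This is a minimal illustration, not part of any DCAT tooling: the `Profile` class, its field names and the example language codes are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Profile:
    """Hypothetical, simplified stand-in for a DCAT profile."""
    mandatory: Set[str]
    allowed_values: Dict[str, Set[str]] = field(default_factory=dict)

    def violations(self, record: Dict[str, str]) -> List[str]:
        """Return a list of human-readable problems found in one metadata record."""
        problems = []
        # Constraint 1: a minimum set of fields must be present and non-empty.
        for name in sorted(self.mandatory):
            if not record.get(name):
                problems.append(f"missing mandatory field: {name}")
        # Constraint 2: some fields only accept values from a controlled vocabulary.
        for name, allowed in sorted(self.allowed_values.items()):
            value = record.get(name)
            if value is not None and value not in allowed:
                problems.append(f"value not in controlled vocabulary: {name}={value}")
        return problems


# Example profile: two mandatory fields and a (made-up) list of language codes.
example_profile = Profile(
    mandatory={"title", "description"},
    allowed_values={"language": {"ENG", "ITA"}},
)
```

Calling `example_profile.violations({"title": "x", "language": "XX"})` reports both the missing description and the language value that falls outside the vocabulary.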
European DCAT-Application Profile
• Born in 2013, DCAT-AP is a specification based on DCAT that aims at meeting
specific application needs of data portals in Europe while providing semantic
interoperability with other applications
• It provides a common specification for describing public sector datasets in
Europe to enable the exchange of descriptions of datasets among data portals
• It allows:
o Data catalogs to describe their dataset collections using a standardised
description, while keeping their own system for documenting and storing
them
o Content aggregators, such as the European data portal or national data
portals, to aggregate such descriptions into a single point of access.
o Data consumers to more easily find datasets through a single point of
access
https://d1jdzavdzee8nu.cloudfront.net/sites/default/files/distribution/access_url/2019-05/e3f7bcdf-eaad-4741-9bf6-dc61327f4eea/DCAT_AP_1.2.1.pdf
DCAT-AP version 1.2.1
Mandatory elements of DCAT-AP – catalog
Catalog class – Mandatory
The Catalog is described with the following
mandatory properties:
• title → example: “Open Data Catalog of the
University of Bologna”
• description → short description of the content of
the Catalog
• publisher → who makes the catalog available
• dataset → list of all dataset objects that are
included in the catalog
Recommended
issued and modified → dates on which the catalog was
released and last modified, respectively
Mandatory elements of DCAT-AP – dataset
Dataset class – Mandatory
The Dataset is described with the
following mandatory properties:
• title → a short name that summarises the
content of the dataset
• description → description of
the content of the dataset
All the remaining properties are either
recommended (i.e., contact point,
distribution, keyword/tag,
publisher, theme) or optional
(e.g., conforms to, accrual
periodicity, has version, identifier,
language, landing page, spatial
and temporal coverage)
Some recommended elements of DCAT-AP – distribution
Distribution class – Recommended
If a Distribution is specified, the following property must be provided:
• access URL → URL that gives access to a Distribution of the Dataset
All the other properties are either recommended (i.e., licence, format, description) or optional (e.g., byte
size, download URL, language, title, modified)
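To make the mandatory properties concrete, the sketch below builds a small Turtle description of a catalog, a dataset and a distribution using only the Python standard library. The `dcat:` and `dct:` terms are the standard DCAT and Dublin Core ones; everything else (the `https://example.org/` URIs, the dataset title and the descriptions) is invented for illustration, and literals are left without language tags for brevity.

```python
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def literal(text):
    """Return a Turtle string literal with backslashes and quotes escaped."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def uri(value):
    """Wrap an absolute URI in angle brackets, as Turtle expects."""
    return "<" + value + ">"


def describe(subject, rdf_type, properties):
    """Serialize one resource as a Turtle block: type first, then its properties."""
    lines = [uri(subject) + " a " + rdf_type]
    lines += ["    " + predicate + " " + obj for predicate, obj in properties]
    return " ;\n".join(lines) + " .\n"


def build_example():
    base = "https://example.org/"  # hypothetical base URI
    catalog = base + "catalog"
    dataset = base + "dataset/1"
    distribution = base + "dataset/1/csv"
    header = "".join("@prefix %s: <%s> .\n" % item for item in PREFIXES.items())
    blocks = [
        # Catalog: title, description, publisher and dataset are mandatory.
        describe(catalog, "dcat:Catalog", [
            ("dct:title", literal("Open Data Catalog of the University of Bologna")),
            ("dct:description", literal("Illustrative catalog description")),
            ("dct:publisher", uri(base + "org/unibo")),
            ("dcat:dataset", uri(dataset)),
        ]),
        # Dataset: title and description are mandatory; distribution is recommended.
        describe(dataset, "dcat:Dataset", [
            ("dct:title", literal("Enrolled students")),
            ("dct:description", literal("Illustrative dataset description")),
            ("dcat:distribution", uri(distribution)),
        ]),
        # Distribution: access URL is mandatory once a distribution is given.
        describe(distribution, "dcat:Distribution", [
            ("dcat:accessURL", uri(base + "files/students.csv")),
        ]),
    ]
    return header + "\n" + "\n".join(blocks)
```

Printing `build_example()` yields three Turtle blocks, one per class. A real catalog would normally rely on an RDF library and add the recommended properties as well.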
DCAT-AP_IT
[ITA] Technical guidelines for data catalogs
Available online:
https://docs.italia.it/italia/daf/linee-guida-cataloghi-dati-dcat-ap-it/it/stabile/
DCAT-AP_IT
• It reuses existing ontologies (e.g., Dublin Core, FOAF)
in order to guarantee interoperability with the
European application profile
• It extends some of the core concepts of DCAT and DCAT-AP in order
to define additional constraints and properties
• It does not use some concepts (classes) and properties defined as
optional in DCAT-AP
• Three core classes
o Catalog – A collection of metadata that describe datasets
o Dataset – a collection of data, published or curated by a single agent, and
available for access or download in one or more serializations or formats
o Distribution – a specific available form of a dataset
• An OWL ontology has been defined in order to describe the profile
Mandatory elements of DCAT-AP_IT - Catalog
Catalog class – Mandatory (subclass of dcat:Catalog)
The Catalog is described with the
following mandatory properties:
• title → example: “Open Data Catalog
of the University of Bologna”
• description → short description of
the content of the Catalog
• publisher → who makes the catalog
available
• modified → the date of last
modification of the catalog
• dataset → list of all dataset objects
that are included in the catalog
Mandatory elements of DCAT-AP_IT - Dataset
Dataset class – Mandatory (subclass of dcat:Dataset)
The Dataset is described with the following
mandatory properties:
• identifier → example: “unibo:D.1”
• title → a short name that summarises its content
• description → description of the content
• modified → date of last modification
• theme → use of the controlled vocabulary
defined at the European level named Data
Theme (13 themes that can be associated with
the dataset)
• rightsHolder → who owns the rights on the
dataset (publisher is recommended and
creator is optional)
• accrual periodicity → the frequency of
update of the dataset. Use of the European
controlled vocabulary Frequencies
• distribution → mandatory property if the
dataset is open
Mandatory elements of DCAT-AP_IT - Distribution
Distribution class – Mandatory if the dataset is
open (subclass of dcat:Distribution)
The class is described by the following mandatory
properties:
• format → use of the European controlled
vocabulary File Type
• license → use of the Italian controlled
vocabulary Licences
(https://w3id.org/italia/controlled-vocabulary/licences)
• description → describes the content of the
distribution
• access URL → the URL of a web page
through which it is possible to get access to
the dataset
downloadURL is optional, but it may be useful to
specify it
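The DCAT-AP_IT rules above, including the conditional one ("distribution is mandatory if the dataset is open"), can be sketched as a simple check over dictionaries. The key names, the `is_open` flag and the example values are assumptions made for this sketch; the check only tests that each property is present and does not validate values against the controlled vocabularies.

```python
DATASET_MANDATORY = (
    "identifier", "title", "description", "modified",
    "theme", "rights_holder", "accrual_periodicity",
)
DISTRIBUTION_MANDATORY = ("format", "license", "description", "access_url")


def check_dataset(dataset):
    """Return a list of missing mandatory properties for one dataset description."""
    problems = [
        "dataset: missing " + name
        for name in DATASET_MANDATORY
        if not dataset.get(name)
    ]
    distributions = dataset.get("distributions", [])
    # Conditional rule: an open dataset needs at least one distribution.
    if dataset.get("is_open") and not distributions:
        problems.append("dataset: open dataset has no distribution")
    for index, dist in enumerate(distributions):
        problems += [
            "distribution %d: missing %s" % (index, name)
            for name in DISTRIBUTION_MANDATORY
            if not dist.get(name)
        ]
    return problems


# Illustrative record; every value except the identifier is a placeholder.
example_dataset = {
    "identifier": "unibo:D.1",
    "title": "Enrolled students",
    "description": "Illustrative dataset description",
    "modified": "2019-10-01",
    "theme": "EDUC",
    "rights_holder": "University of Bologna",
    "accrual_periodicity": "ANNUAL",
    "is_open": True,
    "distributions": [{
        "format": "CSV",
        "license": "CC BY 4.0",
        "description": "CSV export",
        "access_url": "https://example.org/dataset/1",
    }],
}
```

`check_dataset(example_dataset)` returns an empty list, while removing the distributions from an open dataset produces a single "open dataset has no distribution" problem.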
Example in RDF - catalog
Example in RDF - dataset
Example in RDF - distribution
Example in RDF – Agent and point of contact
Spatial coverage – GeoDCAT-AP
• Only a minimal part of the GeoDCAT-AP
extension is used in the current
DCAT-AP_IT to connect the profile with
the geospatial world
• Italian guidelines support the
implementation of the overall
GeoDCAT-AP specification
https://geodati.gov.it/geoportale/images/struttura/documenti/GeoDCAT-AP_IT-v1.0.pdf
What’s next?
DCAT and DCAT-AP version 2.0
DCAT version 2
• It is the new candidate recommendation as of the beginning of October
2019
• It revises the original version 1 to reflect years of practical use
and introduces important elements that characterize data in
catalogs, e.g.,
o data resources
o relationships between data resources
o some geospatial elements
o APIs or data services
https://www.w3.org/TR/vocab-dcat-2/
DCAT version 2 – 1/2
NOVEL ELEMENTS
• 3 new concepts (classes)
o Resource: represents a dataset, a data
service or any other resource that may be
described by a metadata record in a catalog.
It is not used directly but it is the parent class
for Catalog, Dataset and Data Service
o Data Service: A data service is a collection of
operations accessible through an interface
(API) that provide access to one or more
datasets or data processing functions
o Relationship: an association class for
attaching additional information to a
relationship between DCAT Resources → to
be verified in practice
DCAT version 2 – 2/2
NOVEL ELEMENTS
• Revision of the definitions of
o Catalog: collections of metadata about
datasets or data services
o Distribution: represents an accessible form
of a dataset such as a downloadable file
• New elements for dealing with Time and Space
for Dataset and Distribution
• License metadata can be specified for Dataset as well as
for Distribution
• Possibility to specify compressed formats for
Distribution (e.g., zip and tar.gz), while also indicating
the format included in the compression
• Possibility to specify roles and relationships
among data resources
DCAT version 2 – relationship
• The class Relationship is used to characterize a
relationship between datasets, and potentially
other resources, where the nature of the
relationship is known but is not adequately
characterized by the standard Dublin Core and
PROV-O properties
• The property hadRole defines the function of an
agent with respect to another entity or resource
o May be used in a qualified attribution to
specify the role of an Agent with respect to
an Entity
o The use of a controlled vocabulary for roles
is recommended
o A new way to specify roles for resources
(datasets, catalogs, data services)
DCAT version 2
• A data service typically provides selection, extraction,
combination, processing or transformation operations
over datasets that might be hosted locally or remote to
the service.
o The result of any request to a data service is a
representation of a part or all of a dataset or catalog
o Examples: a data discovery service, data
transformation services, such as coordinate
transformation services, re-sampling and
interpolation services, and various data processing
services, including simulation and modelling
services
o Three main properties characterize the data service:
endpointURL, endpointDescription, servesDataset
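The relationship between Resource, Dataset and Data Service, and the three properties that characterize a data service, can be sketched as follows. This is an illustrative Python model, not an RDF implementation: the snake_case field names mirror endpointURL, endpointDescription and servesDataset, while the `services_for` helper and all example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """Parent class for anything a catalog record can describe."""
    title: str


@dataclass
class Dataset(Resource):
    """A collection of data; inherits its descriptive fields from Resource."""


@dataclass
class DataService(Resource):
    """A set of operations, reachable through an API, that serve datasets."""
    endpoint_url: str = ""
    endpoint_description: str = ""  # e.g. a link to an API description document
    serves_dataset: List[Dataset] = field(default_factory=list)


def services_for(dataset: Dataset, services: List[DataService]) -> List[DataService]:
    """Return the services that declare they serve the given dataset."""
    return [service for service in services if dataset in service.serves_dataset]


air_quality = Dataset("Air quality measurements")
api = DataService(
    "Air quality API",
    endpoint_url="https://example.org/api",
    endpoint_description="https://example.org/api/openapi.json",
    serves_dataset=[air_quality],
)
```

With this model, `services_for(air_quality, [api])` returns the API, which is the lookup a portal would need in order to show how a dataset can be queried as well as downloaded.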
DCAT-AP version 2.0
• It is currently under development
• It is in public review until 4th of November; contributions are
discussed using the related github repository
• Two types of changes have been applied:
• Changes based on feedback on the usage of version
1.2.1
• Changes that adapt the profile to the new DCAT
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe/release/200
DCAT-AP version 2.0
DCAT-AP version 2.0 - catalog
Novel elements
• Possibility to specify a catalog of catalogs
• Possibility to specify a catalog of data services
• Possibility to specify a creator
• Spatial coverage becomes recommended
All the rest remains unchanged
DCAT-AP version 2.0 - dataset
Novel elements
• Temporal and spatial coverage
become recommended (some new
properties have been added for these
two concepts)
• Introduction of a set of optional
properties that derive from the new
DCAT (e.g., provenance, relationship,
creator)
All the rest remains unchanged
DCAT-AP version 2.0 - distribution
Novel elements
• New properties for compressed
formats
• Temporal and spatial resolution
• A new property named
availability that can assume the
following values: temporary,
experimental, available, stable
• New properties to link the
Distribution to a policy (rights)
and accessService to connect
the Distribution to Data services
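Because availability takes its value from a closed list, a publishing tool could reject anything else at input time. The sketch below assumes the four values listed above are written as plain lowercase strings; a real profile would refer to controlled-vocabulary URIs instead, and the `parse_availability` helper is hypothetical.

```python
from enum import Enum


class Availability(Enum):
    """The four availability values listed for a Distribution."""
    TEMPORARY = "temporary"
    EXPERIMENTAL = "experimental"
    AVAILABLE = "available"
    STABLE = "stable"


def parse_availability(value: str) -> Availability:
    """Normalise user input and raise ValueError for values outside the list."""
    return Availability(value.strip().lower())
```

For instance, `parse_availability(" Stable ")` returns `Availability.STABLE`, whereas an unknown value such as `"deprecated"` raises `ValueError`.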
Conclusions
• In the public sector an increasing number of Public
Administrations are adopting DCAT-AP(_IT)
• However, important changes are to be taken into account
• Challenge: how to rapidly adapt the current metadata
ecosystem in order to implement the new changes that
were introduced in DCAT and DCAT-AP?

Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
 
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar Training
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
Mathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptx
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubs
 

The importance of metadata for datasets: The DCAT-AP European standard

  • 1. The importance of metadata for datasets: the DCAT-AP European standard Giorgia Lodi – giorgia.lodi@gmail.com
  • 2. Summary • The importance of metadata in (open) data management • The standard DCAT and the European DCAT-AP version 1 and its new evolution • Focus on the Italian extension of DCAT-AP named DCAT-AP_IT
  • 3. Data vs metadata DATA Physical representation of facts, atomic events, objective phenomena, information suitable for communication, interpretation and processing by human beings or automatic means METADATA Data that defines other data (they are NOT the data itself!). Examples: bibliographic reference, the author of a document, the date of last modification of a dataset
  • 4. Additional definitions ONTOLOGY: a formal and explicit specification of a shared representation (conceptualization) of a knowledge domain, defined on the basis of specific requirements CONTROLLED VOCABULARY: a set of predefined and authoritative standard terms and codes, pre-selected for the purpose of indexing and retrieving information DESCRIPTIVE METADATA: identify and describe digital objects
  • 5. The importance of metadata • They allow a better understanding of the data they describe • They facilitate the discoverability of the data • If they are defined using standard and shared ontologies and controlled vocabularies, they facilitate: • Information exchange • Interoperability • Reuse and valorisation of public information
  • 6. FAIR principles FINDABLE The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services ACCESSIBLE Once users find the required data, they need to know how the data can be accessed INTEROPERABLE The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing REUSABLE Metadata and data should be well-described so that they can be replicated and/or combined in different settings
  • 7. European Directive 1/2 ARTICLE 5 – AVAILABLE FORMATS “…public sector bodies and public undertakings shall make their documents available in any pre-existing format or language and, where possible and appropriate, by electronic means, in formats that are open, machine-readable, accessible, findable and re-usable, together with their metadata. Both the format and the metadata shall, where possible, comply with formal open standards”
  • 8. European Directive 2/2 ARTICLE 9 – PRACTICAL ARRANGEMENTS “Member States shall make practical arrangements facilitating the search for documents available for re-use, such as asset lists of main documents with relevant metadata, accessible where possible and appropriate online and in machine-readable format, and portal sites that are linked to the asset lists. Where possible, Member States shall facilitate the cross-linguistic search for documents, in particular by enabling metadata aggregation at Union level”
  • 9. Italian legislation – open data definition AVAILABLE (LEGAL REQUIREMENT): disaggregated according to the terms of an open licence allowing its re-use, also for commercial purposes ACCESSIBLE (TECHNOLOGICAL REQUIREMENT): by machines, in an open format and with associated metadata FREE OF CHARGE (ECONOMIC REQUIREMENT): free of charge or at marginal costs incurred for reproduction, making available and dissemination
  • 11. Metadata: European and Italian scenarios • We still observe different levels of quality for metadata • There are still different platforms being used for cataloguing data based on metadata • CKAN, DKAN, Socrata, Linked-data based platforms and proprietary infrastructures • There are still different thematic classifications for datasets • There are still different ways to specify licenses HOWEVER
  • 12. Common model for metadata specification A common European data model for metadata is helping to overcome the previously mentioned obstacles. The data model offers a harmonized and shared way to specify metadata for datasets, with a focus on information that is particularly relevant for re-users
  • 14. DCAT-AP specifications • DCAT: Data CATalog vocabulary – Web standard based on the RDF standard. It provides a data model of the descriptions of datasets (not only open datasets) as stored in catalogs • DCAT-AP: European Data CATalog vocabulary – Application Profile – based on RDF, it is a set of constraints added to the DCAT specification that facilitates data and metadata exchange • National Data CATalog vocabulary – Application Profiles (NO, DE, CH, IT, …): defined by the different Member States that adhere to the European DCAT-AP initiative. Based on the RDF standard, they typically include additional constraints or properties while maintaining compliance with DCAT-AP
  • 15. DCAT-AP extensions for specific types of data • GeoDCAT-AP - Facilitates the metadata exchange between geospatial catalogs and data catalogs in general • StatDCAT-AP - Extends the DCAT-AP specification with a small number of elements that are relevant in order to describe statistical datasets - Facilitates the metadata exchange between statistical catalogs and data catalogs in general Useful extensions to enable interoperability among different data catalogs
  • 17. DCAT vocabulary • There are two versions of it o Version 1.0 of 2014 o Version 2.0 of 2019 o In both versions the vocabulary directly uses standard and well-known ontologies such as FOAF, Dublin Core and SKOS
  • 18. DCAT version 1 • It is the latest Web recommendation • It includes four main concepts (classes) for describing data in catalogs o Catalog: a collection of metadata about datasets o Catalog Record: represents a metadata item in the catalog, primarily concerning the registration information, such as who added the item and when o Dataset: a collection of data, published or curated by a single agent, and available for access or download in one or more serializations or formats o Distribution: represents different formats of the dataset or different endpoints. Examples of distributions include a downloadable CSV file, an API or an RSS feed
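The four classes on this slide can be pictured as plain data structures. The following is a minimal Python sketch, not the DCAT vocabulary itself: the class fields, the example titles and the example.org URLs are illustrative assumptions, and only the class names and their roles come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Distribution:
    """One available form of a dataset (downloadable file, API, feed)."""
    access_url: str
    media_type: str


@dataclass
class Dataset:
    """A collection of data published or curated by a single agent."""
    title: str
    publisher: str
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class CatalogRecord:
    """Registration information about one entry: who added it and when."""
    dataset: Dataset
    added_by: str
    added_on: str  # ISO 8601 date, kept as a string for simplicity


@dataclass
class Catalog:
    """A collection of metadata about datasets."""
    title: str
    datasets: List[Dataset] = field(default_factory=list)
    records: List[CatalogRecord] = field(default_factory=list)


# Hypothetical example data: one dataset offered as a CSV file and as an API.
air_quality = Dataset(
    title="Air quality measurements",
    publisher="Example City Council",
    distributions=[
        Distribution("https://example.org/air.csv", "text/csv"),
        Distribution("https://example.org/api/air", "application/json"),
    ],
)
catalog = Catalog(
    title="Example open data catalog",
    datasets=[air_quality],
    records=[CatalogRecord(air_quality, added_by="catalog-admin", added_on="2019-10-01")],
)
```

The sketch keeps the dataset and its catalog record as separate objects, mirroring the distinction the slide draws between the data description and its registration information.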
  • 19. DCAT version 1 – data model
  • 20. DCAT version 1 – conformance 1/2 • Based on the standard we say that a catalog is compliant with DCAT if: o It is organized in datasets and distributions o There exists an RDF description of the catalog (independently of the specific RDF serialization used to represent it) o The contents of all metadata fields that are held in the catalog, and that contain data about the catalog itself and its datasets and distributions, are included in this RDF description o All classes and properties are used consistently with the semantics of the specification o Additional non-DCAT properties may also be specified
  • 21. DCAT version 1 – conformance 2/2 DCAT PROFILE A DCAT profile can be defined. A profile adds additional constraints to DCAT. The constraints can be o A minimum set of metadata fields that are mandatory (in contrast to the open world assumption of DCAT vocabulary) o Classes and properties for additional metadata fields that are not covered in DCAT o Controlled vocabularies or URI sets as acceptable values for some properties (e.g., language, themes, etc.) o Requirements for specific access mechanisms (RDF syntaxes, protocols) to the catalog’s RDF description
  • 23. European DCAT-Application Profile • Born in 2013, DCAT-AP is a specification based on DCAT that aims at meeting specific application needs of data portals in Europe while providing semantic interoperability with other applications • It provides a common specification for describing public sector datasets in Europe to enable the exchange of descriptions of datasets among data portals • It allows: o Data catalogs to describe their dataset collections using a standardised description, while keeping their own system for documenting and storing them o Content aggregators, such as the European data portal or national data portals, to aggregate such descriptions into a single point of access o Data consumers to more easily find datasets through a single point of access https://d1jdzavdzee8nu.cloudfront.net/sites/default/files/distribution/access_url/2019-05/e3f7bcdf-eaad-4741-9bf6-dc61327f4eea/DCAT_AP_1.2.1.pdf
  • 25. Mandatory elements DCAT-AP – catalog Catalog class – Mandatory The Catalog is described with the following mandatory properties: • title → example “Open Data Catalog of the University of Bologna” • description → short description of the content of the Catalog • publisher → who makes the catalog available • dataset → list of all dataset objects that are included in the catalog Recommended: issued and modified → dates on which the catalog was released and last modified, respectively
  • 26. Mandatory elements DCAT-AP – dataset Dataset class – Mandatory The Dataset is described with the following mandatory properties: • title → represents in short the content of the dataset • description → description of the content of the dataset All the remaining properties are recommended (i.e., contact point, distribution, keyword/tag, publisher, theme) or optional (e.g., conforms to, accrual periodicity, has version, identifier, language, landing page, spatial and temporal coverage, etc.)
  • 27. Some recommended elements of DCAT-AP - distribution Distribution class – Recommended If a Distribution is specified, the following property must be provided: • Access URL → URL that gives access to a Distribution of the Dataset All the other properties are recommended (i.e., licence, format, description) or optional (e.g., byte size, download URL, language, title, modified, etc.)
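The mandatory properties listed for Catalog, Dataset and Distribution lend themselves to a simple completeness check. The sketch below is one possible way to do it in Python, assuming metadata records are held as dictionaries keyed by prefixed property names. The helper `missing_mandatory` and the sample records are hypothetical and not part of DCAT-AP.

```python
# Mandatory properties per class, as listed on the slides for DCAT-AP version 1.
MANDATORY = {
    "Catalog": ["dct:title", "dct:description", "dct:publisher", "dcat:dataset"],
    "Dataset": ["dct:title", "dct:description"],
    "Distribution": ["dcat:accessURL"],
}


def missing_mandatory(class_name, record):
    """Return the mandatory properties that are absent or empty in `record`."""
    return [prop for prop in MANDATORY[class_name] if not record.get(prop)]


# Hypothetical records: the catalog is complete, the dataset lacks a description.
catalog_record = {
    "dct:title": "Open Data Catalog of the University of Bologna",
    "dct:description": "Datasets published by the university",
    "dct:publisher": "https://example.org/agent/unibo",
    "dcat:dataset": ["https://example.org/dataset/D.1"],
}
dataset_record = {"dct:title": "Enrolled students"}

print(missing_mandatory("Catalog", catalog_record))   # no missing properties
print(missing_mandatory("Dataset", dataset_record))   # description is missing
```

A real profile check would normally run against the RDF description itself, for instance with a shape-based validator. The dictionary form is used here only to keep the example self-contained.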
  • 29. DCAT-AP_IT [ITA] Technical guidelines for data catalogs Available online https://docs.italia.it/italia/daf/linee-guida-cataloghi-dati-dcat-ap-it/it/stabile/
  • 30. DCAT-AP_IT • It reuses ontologies already available at the state of the art (e.g., Dublin-Core, FOAF, etc.) in order to guarantee interoperability with the European application profile • It extends some of the core concepts of DCAT and DCAT-AP in order to define additional constraints and properties • It does not use some concepts (classes) and properties defined as optional in DCAT-AP • Three core classes o Catalog – A collection of metadata that describe datasets o Dataset – a collection of data, published or curated by a single agent, and available for access or download in one or more serializations or formats o Distribution – a specific available form of a dataset • An OWL ontology has been defined in order to describe the profile
  • 31. Mandatory elements of DCAT-AP_IT - Catalog Catalog class – Mandatory (subclass of dcat:Catalog) The Catalog is described with the following mandatory properties: • title → example “Open Data Catalog of the University of Bologna” • description → short description of the content of the Catalog • publisher → who makes the catalog available • modified → the date of last modification of the catalog • dataset → list of all dataset objects that are included in the catalog
  • 32. Mandatory elements of DCAT-AP_IT - Dataset Dataset class – Mandatory (subclass of dcat:Dataset) The Dataset is described with the following mandatory properties: • identifier → example “unibo:D.1” • title → it describes in short its content • description → description of the content • modified → date of last modification • theme → use of the controlled vocabulary defined at the European level named Data theme (13 themes associated with the dataset) • rightsHolder → who owns the rights on the dataset (publisher is recommended and creator is optional) • accrual periodicity → the frequency of update of the dataset. Use of the European controlled vocabulary Frequencies • distribution → mandatory property if the dataset is open
  • 33. Mandatory elements of DCAT-AP_IT - Distribution Distribution class – Mandatory if the dataset is open (subclass of dcat:Distribution) The class is described by the following mandatory properties: • format → use of the European controlled vocabulary File Type • license → use of the Italian controlled vocabulary Licences (https://w3id.org/italia/controlled-vocabulary/licences) • description → describes the content of the distribution • access URL → a URL of a web page through which it is possible to get access to the dataset downloadURL is optional but it may be useful to specify it
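Two of the DCAT-AP_IT rules above go beyond a plain list of required fields: theme values must come from the European Data theme vocabulary, and a distribution is required only when the dataset is open. The following Python sketch illustrates both rules under stated assumptions. The `is_open` flag, the function name and the dictionary layout are hypothetical, while the 13 theme codes are those of the EU Publications Office Data theme authority table that the slide refers to.

```python
DATA_THEME_BASE = "http://publications.europa.eu/resource/authority/data-theme/"
DATA_THEMES = {
    "AGRI", "ECON", "EDUC", "ENER", "ENVI", "GOVE", "HEAL",
    "INTR", "JUST", "REGI", "SOCI", "TECH", "TRAN",
}


def check_dataset(dataset, is_open):
    """Return a list of problems found in a dataset description."""
    problems = []
    for theme in dataset.get("dcat:theme", []):
        # A theme is accepted only if it is a URI from the controlled vocabulary.
        code = theme[len(DATA_THEME_BASE):] if theme.startswith(DATA_THEME_BASE) else None
        if code not in DATA_THEMES:
            problems.append(f"unknown theme: {theme}")
    # The distribution becomes mandatory only for open datasets.
    if is_open and not dataset.get("dcat:distribution"):
        problems.append("open dataset without a distribution")
    return problems


# Hypothetical dataset with one valid theme, one invalid theme, no distribution.
sample = {
    "dcat:theme": [DATA_THEME_BASE + "ENVI", "http://example.org/themes/weather"],
    "dcat:distribution": [],
}
print(check_dataset(sample, is_open=True))
```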
  • 34. Example in RDF - catalog
  • 35. Example in RDF - dataset
  • 36. Example in RDF - distribution
  • 37. Example in RDF – Agent and point of contact
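The RDF examples on these slides were images and are not reproduced in this transcript. As a stand-in, the Python sketch below shows how a dataset description could be written out in Turtle syntax. It is not the author's original example: the dataset URI, title and date are invented, only generic `dcat:` and `dct:` terms are used, and the helper names `lit` and `dataset_to_turtle` are hypothetical.

```python
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def lit(text, lang=None, datatype=None):
    """Format a Turtle literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    if lang:
        return f'"{escaped}"@{lang}'
    if datatype:
        return f'"{escaped}"^^{datatype}'
    return f'"{escaped}"'


def dataset_to_turtle(uri, pairs):
    """Serialize one dataset as Turtle from (predicate, object) pairs."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    body = [f"<{uri}> a dcat:Dataset"]
    body += [f"    {predicate} {obj}" for predicate, obj in pairs]
    lines.append(" ;\n".join(body) + " .")
    return "\n".join(lines)


turtle = dataset_to_turtle(
    "https://example.org/dataset/D.1",  # invented URI
    [
        ("dct:identifier", lit("unibo:D.1")),
        ("dct:title", lit("Enrolled students", lang="en")),
        ("dct:modified", lit("2019-10-01", datatype="xsd:date")),
    ],
)
print(turtle)
```

In practice an RDF library would handle serialization and escaping more robustly. The string-based version is kept only so that the example needs no external dependency.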
  • 38. Spatial coverage – GeoDCAT-AP • Only a minimal part of the GeoDCAT-AP extension is used in the current DCAT-AP_IT to connect the profile with the geospatial world • Italian guidelines cover the implementation of the overall GeoDCAT-AP specification https://geodati.gov.it/geoportale/images/struttura/documenti/GeoDCAT-AP_IT-v1.0.pdf
  • 39. What’s next? DCAT and DCAT-AP version 2.0
  • 40. DCAT version 2 • It is the new candidate recommendation as of beginning of October 2019 • It changes the original version 1 in order to reflect years of practical use cases and introduce important elements that characterize data in catalogs e.g., o data resources o relationships between data resources o some geospatial elements o APIs or data services https://www.w3.org/TR/vocab-dcat-2/
  • 41. DCAT version 2 – 1/2 NOVEL ELEMENTS • 3 new concepts (classes) o Resource: represents a dataset, a data service or any other resource that may be described by a metadata record in a catalog. It is not used directly but it is the parent class for Catalog, Dataset and Data Service o Data Service: a collection of operations accessible through an interface (API) that provide access to one or more datasets or data processing functions o Relationship: an association class for attaching additional information to a relationship between DCAT Resources → to be verified in practice
  • 42. DCAT version 2 – 2/2 NOVEL ELEMENTS • Revision of the definitions of o Catalog: collections of metadata about datasets or data services o Distribution: represents an accessible form of a dataset such as a downloadable file • New elements for dealing with Time and Space for Dataset and Distribution • License metadata can be specified for the Dataset as well as for the Distribution • Possibility to specify compressed formats for Distribution (e.g., zip and tar.gz) by also indicating the format included in the compression • Possibility to specify roles and relationships among data resources
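The point about compressed formats can be illustrated with the three DCAT 2 distribution properties `dcat:compressFormat`, `dcat:packageFormat` and `dcat:mediaType`. The Python sketch below derives them from a file name. The suffix tables, the media type strings and the helper `describe_file` are illustrative assumptions rather than values prescribed by DCAT, and treating zip as a compression format rather than a package format is a simplification.

```python
# Illustrative lookup tables; a real catalog would use a controlled vocabulary.
COMPRESSION = {".gz": "application/gzip", ".zip": "application/zip"}
PACKAGING = {".tar": "application/x-tar"}
MEDIA = {".csv": "text/csv", ".json": "application/json"}


def describe_file(filename):
    """Peel suffixes from the right: compression, then packaging, then content."""
    description = {}
    stem = filename
    layers = (
        (COMPRESSION, "dcat:compressFormat"),
        (PACKAGING, "dcat:packageFormat"),
        (MEDIA, "dcat:mediaType"),
    )
    for table, prop in layers:
        for suffix, media_type in table.items():
            if stem.endswith(suffix):
                description[prop] = media_type
                stem = stem[: -len(suffix)]
                break
    return description


print(describe_file("students.csv.gz"))
print(describe_file("archive.tar.gz"))
```

For `students.csv.gz` the sketch records both the gzip compression and the CSV content inside it, which is the "format included in the compression" mentioned on the slide.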
  • 43. DCAT version 2 – relationship • The class Relationship is used to characterize a relationship between datasets, and potentially other resources, where the nature of the relationship is known but is not adequately characterized by the standard Dublin Core and PROV-O properties • The property hadRole defines the function of an agent with respect to another entity or resource o May be used in a qualified attribution to specify the role of an Agent with respect to an Entity o The use of a controlled vocabulary for roles is recommended o A new way to specify roles for resources (datasets, catalogs, data services)
  • 44. DCAT version 2 • A data service typically provides selection, extraction, combination, processing or transformation operations over datasets that might be hosted locally or remote to the service. o The result of any request to a data service is a representation of a part or all of a dataset or catalog o Examples: a data discovery service, data transformation services, such as coordinate transformation services, re-sampling and interpolation services, and various data processing services, including simulation and modelling services o Three main properties characterize the data service: endpointURL, endpointDescription, servesDataset
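The three properties that characterize a data service can be sketched as a small Python structure. The field names mirror `endpointURL`, `endpointDescription` and `servesDataset` from the slide. The example URLs and the lookup helper `services_for` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataService:
    endpoint_url: str          # dcat:endpointURL, root location of the service
    endpoint_description: str  # dcat:endpointDescription, e.g. an API description document
    serves_dataset: List[str] = field(default_factory=list)  # dcat:servesDataset


def services_for(dataset_uri, services):
    """Return the endpoint URLs of all services that serve the given dataset."""
    return [s.endpoint_url for s in services if dataset_uri in s.serves_dataset]


# Hypothetical services: a query API for one dataset and a discovery service for two.
services = [
    DataService(
        "https://example.org/api/air",
        "https://example.org/api/air/openapi.json",
        ["https://example.org/dataset/air-quality"],
    ),
    DataService(
        "https://example.org/search",
        "https://example.org/search/description",
        ["https://example.org/dataset/air-quality", "https://example.org/dataset/traffic"],
    ),
]
print(services_for("https://example.org/dataset/traffic", services))
```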
  • 45. DCAT-AP version 2.0 • It is currently under development • It is in public review until 4th of November; contributions are discussed using the related GitHub repository • Two types of changes have been applied: • Changes based on the feedback on the usage of version 1.2.1 • Changes that adapt the profile to the new DCAT https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe/release/200
  • 47. DCAT-AP version 2.0 - catalog Novel elements • Possibility to specify a catalog of catalogs • Possibility to specify a catalog of data services • Possibility to specify a creator • Spatial coverage becomes recommended All the rest remains unchanged
  • 48. DCAT-AP version 2.0 - dataset Novel elements • Temporal and spatial coverage become recommended (some new properties have been added for these two concepts) • Introduction of a set of optional properties that derive from the new DCAT (e.g., provenance, relationship, creator) All the rest remains unchanged
  • 49. DCAT-AP version 2.0 - distribution Novel elements • New properties for compressed formats • Temporal and spatial resolution • A new property named availability that assumes one of the following values: temporary, experimental, available, stable • New properties to link the Distribution to a policy (rights), and accessService to connect the Distribution to Data services
  • 50. Conclusions • In the public sector an increasing number of Public Administrations are adopting DCAT-AP(_IT) • However, important changes are to be taken into account • Challenge: how to rapidly adapt the current metadata ecosystem in order to implement the new changes that were introduced in DCAT and DCAT-AP?