1. www.cybele-project.eu
This project has received funding from the European
Union’s Horizon 2020 research and innovation
programme under grant agreement No. 825355.
Linked Data with hybrid services in Agriculture
Raul Palma¹, Rob Knapen²
¹ Poznan Supercomputing and Networking Center
² Wageningen University & Research
113th OGC Technical Committee meeting
Toulouse, 19th November 2019
2. Linked data publication
• Linked Data (LD) is becoming an increasingly popular method for publishing data on the Web
• Improves data accessibility for both humans and machines, e.g., for finding, reusing and integrating data
• Enables the discovery of more useful data through links (and inferencing), and the exploitation of data with semantic queries
• Growing number of datasets in the LOD cloud
1,239 datasets with 16,147 links (as of March 2019)
• Coverage of the LOD cloud
Large cross-domain datasets (DBpedia, Freebase, etc.)
Variable domain coverage (e.g., Geography, Government, BioInformatics)
• What about Agriculture?
Only a few datasets (e.g., AGRIS bibliographic records, the AGROVOC thesaurus, and other thesauri such as NALT)
What about farming data and data related to other agricultural activities?
http://lod-cloud.net/
3. Why is Linked Data relevant in Agriculture: Farming context
• Farm management
• Multiple activities and stakeholders
• Multiple applications, tools and devices
• Multiple data sources, types and formats
• Challenge
To combine/integrate those different and heterogeneous data sources in order to make economically and environmentally sound decisions
4. Data Integration in relevant projects (context)
• Data integration has been, and remains, one of the key challenges addressed in several recent projects related to the agri-food sector
5. Linked data principles and general tasks
• Simple set of principles & technologies
• URI, HTTP, RDF, SPARQL
• Involves a set of (common) general tasks
Datasets identification
Model specification
RDF data generation
Linking
Exploiting
Hyland et al.
Hausenblas et al.
Villazón-Terrazas et al.
Best Practices for Publishing Linked Data
5-star deployment scheme for Linked Open Data
6. Linked data guidelines & patterns
T. Heath and C. Bizer. Linked Data: Evolving the Web into a Global Data Space. http://linkeddatabook.com/editions/1.0/
B. Hyland, G. Atemezing, B. Villazón-Terrazas. Best Practices for Publishing Linked Data. W3C Working Group Note. https://www.w3.org/TR/ld-bp/
7. From guidelines to practice
8. Implementing Linked Data publication pipelines
• Goal: to define and deploy (semi-)automatic processes that carry out the steps needed to transform and publish different input datasets as Linked Data.
• A pipeline connects different data processing components to transform data into RDF and link it, and includes the mapping specifications used to process the input datasets.
• Each pipeline is configured to support a specific type of input dataset (same format, model and delivery form).
• Principles
Pipelines can be directly re-executed and re-applied (e.g., to extended or updated datasets)
Pipelines must be easily reusable
Pipelines must be easily adaptable to new input datasets
Pipeline execution should be as automatic as possible; the final target is fully automated processes.
Pipelines should support both (mostly) static data and data streams (e.g., sensor data)
• The resulting datasets, available as Linked Data, will provide an integrated view over the initial (disconnected and heterogeneous) datasets, in compliance with any privacy and access control needs
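The pipeline idea above can be sketched as a chain of small, reusable steps. This is only an illustration in Python, assuming a CSV input; the step names (`parse_rows`, `to_triples`, `to_ntriples`, `run_pipeline`), the `http://example.org/` namespace and the column names are hypothetical and do not correspond to any actual CYBELE component.

```python
import csv
import io
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]
BASE = "http://example.org/"  # hypothetical namespace, for illustration only

# Hypothetical mapping specification: input column -> RDF predicate
MAPPING = {"crop": BASE + "crop", "area_ha": BASE + "areaHa"}


def parse_rows(text: str) -> List[dict]:
    """Step 1: read the input dataset (here CSV text) into records."""
    return list(csv.DictReader(io.StringIO(text)))


def to_triples(rows: List[dict]) -> List[Triple]:
    """Step 2: apply the mapping specification to every record."""
    triples = []
    for row in rows:
        subject = f"{BASE}field/{row['id']}"
        for column, predicate in MAPPING.items():
            triples.append((subject, predicate, row[column]))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Step 3: serialise the triples as N-Triples with plain literals."""
    def escape(value: str) -> str:
        return value.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join(f'<{s}> <{p}> "{escape(o)}" .' for s, p, o in triples)


def run_pipeline(data, steps: List[Callable]):
    """Feed the output of each step into the next one."""
    for step in steps:
        data = step(data)
    return data


SAMPLE = "id,crop,area_ha\n1,wheat,2.5\n2,maize,4.0\n"
STEPS = [parse_rows, to_triples, to_ntriples]
print(run_pipeline(SAMPLE, STEPS))
```

Because the steps are plain functions, the same list can be re-executed on an updated dataset, and supporting a new input type mostly means swapping the parsing step and the mapping.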
9. Serving Linked Data with hybrid services
• Many practical linked data use cases have to address hybrid information needs¹:
Variety of data sources
Variety of data modalities
Variety of data processing techniques
• Although SPARQL queries can express data requests over RDF knowledge graphs, their support for hybrid information needs is limited
Query engines focus on retrieving RDF data and support a set of built-in services
• Approach: implement wrappers around the APIs that:
Assign HTTP URIs to the resources about which the API provides data
Upon URI dereference, rewrite the client’s request into a request against the API
Transform the API results to RDF and send them back to the client
¹ Nikolov, Andriy et al. “Ephedra: SPARQL Federation over RDF Data and Services.” International Semantic Web Conference (2017).
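The first two wrapper steps (assigning HTTP URIs and rewriting a dereference into an API request) might look roughly as follows. This is a sketch under stated assumptions: both base URLs and the function names are hypothetical placeholders, not the addresses of any real service.

```python
from urllib.parse import urlencode

LD_BASE = "http://ld.example.org/resource/"  # hypothetical Linked Data namespace
API_BASE = "https://api.example.org/rest/"   # hypothetical REST API root


def resource_uri(kind: str, identifier: str) -> str:
    """Assign an HTTP URI to a resource the API provides data about."""
    return f"{LD_BASE}{kind}/{identifier}"


def rewrite_to_api_request(uri: str, **params: str) -> str:
    """Rewrite a dereferenced resource URI into a request against the API."""
    if not uri.startswith(LD_BASE):
        raise ValueError(f"not a wrapped resource: {uri}")
    kind, _, identifier = uri[len(LD_BASE):].partition("/")
    if not kind or not identifier:
        raise ValueError(f"cannot split into kind and identifier: {uri}")
    url = f"{API_BASE}{kind}/{identifier}"
    return f"{url}?{urlencode(params)}" if params else url


uri = resource_uri("fields", "42")
print(uri)
print(rewrite_to_api_request(uri, year="2019"))
```

The third step, turning the API result into RDF, is independent of this rewriting and can be added as a separate function.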
10. Use case: AgroDataCube (ongoing work)
• AgroDataCube provides a large collection of both open and derived data from the Netherlands for use in agri-food applications (by Wageningen Environmental Research)
• AgroDataCube exposes a REST API with 7 resources:
Fields: to retrieve data from the crop registration datasets. Crop fields change per year, and are recorded by farmers with an indication of the crop that will be grown on the field.
Altitude: to retrieve AHN ('Actueel Hoogtebestand Nederland') data
Meteo: to retrieve data from the KNMI (Royal Netherlands Meteorological Institute) weather stations
Soil: to retrieve data from the BOFEK 2012 datasets and the Dutch soil map 1:50,000
Vegetation: to retrieve NDVI (Normalized Difference Vegetation Index) data
Codes: to retrieve more details about a specific crop or soil code returned by other requests
Regions: to retrieve administrative boundaries of provinces, municipalities, and postal code areas
• Data is returned in GeoJSON format
• Part of the CYBELE demonstrator “Optimising computations for crop yield forecasting”
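Since the API returns GeoJSON, a client typically starts by flattening a FeatureCollection into simple records. The sketch below uses only Python's standard `json` module; the sample payload and its property names (`fieldid`, `year`, `crop_name`) are invented for illustration and are not the actual AgroDataCube response schema.

```python
import json

# Invented sample payload; real responses will have different properties.
SAMPLE_RESPONSE = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [5.66, 51.97]},
      "properties": {"fieldid": 1, "year": 2018, "crop_name": "wheat"}
    },
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [5.70, 51.99]},
      "properties": {"fieldid": 2, "year": 2018, "crop_name": "maize"}
    }
  ]
}
"""


def read_features(geojson_text: str) -> list:
    """Return one flat dict per feature: its properties plus the geometry type."""
    doc = json.loads(geojson_text)
    if doc.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    return [
        {**feature["properties"], "geometry_type": feature["geometry"]["type"]}
        for feature in doc["features"]
    ]


for record in read_features(SAMPLE_RESPONSE):
    print(record)
```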
11. General steps
• Define/select semantic models to represent the data of the API resources
• Implement a wrapper around the API that transforms SPARQL requests into API calls on the fly and generates RDF data from the GeoJSON results
• Expose the generated RDF data via a SPARQL endpoint
• Query the REST API with SPARQL
Process (e.g., format) any required output on the fly
Link the generated RDF data with other datasets and thesauri (on the fly or with previously generated/discovered RDF links)
• Visualize and exploit the Linked Data
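The "generate RDF data from GeoJSON" step can be illustrated with a small converter that emits N-Triples. It uses the GeoSPARQL terms `geo:hasGeometry`, `geo:asWKT` and `geo:wktLiteral`; everything under `http://example.org/`, the property names of the input feature, and the restriction to Point geometries are assumptions made for this sketch, not the mapping actually used in the project.

```python
GEO = "http://www.opengis.net/ont/geosparql#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def point_to_wkt(geometry: dict) -> str:
    """Serialise a GeoJSON Point as WKT (longitude first, as in GeoJSON)."""
    if geometry["type"] != "Point":
        raise ValueError("this sketch only handles Point geometries")
    lon, lat = geometry["coordinates"][:2]
    return f"POINT({lon} {lat})"


def feature_to_ntriples(feature: dict) -> list:
    """Turn one GeoJSON feature into a list of N-Triples lines."""
    props = feature["properties"]
    subject = f"{EX}field/{props['fieldid']}"
    geometry = f"{subject}/geometry"
    crop = props["crop_name"]
    wkt = point_to_wkt(feature["geometry"])
    return [
        f'<{subject}> <{EX}cropName> "{crop}" .',
        f"<{subject}> <{GEO}hasGeometry> <{geometry}> .",
        f'<{geometry}> <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .',
    ]


FEATURE = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [5.66, 51.97]},
    "properties": {"fieldid": 1, "crop_name": "wheat"},
}
print("\n".join(feature_to_ntriples(FEATURE)))
```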
12. Ontologies for AgroDataCube
• General rule: reuse standard and/or widely used ontologies/vocabularies whenever possible, and extend as needed
• Selected resources:
FOODIE ontology
OLU vocabulary
SOSA/SSN
Soilphysics
…
13. FOODIE ontology
• Application vocabulary covering the different categories of information handled by farm management tools and applications
• In line with existing standards and best practices
Builds on the INSPIRE AF specification for agricultural data and the INSPIRE specification for the Annex I themes for geospatial data, based on ISO/OGC standards for geographical information
• Generated (semi-)automatically with the ShapeChange tool from a base model in UML¹
ShapeChange implements the ISO 19150-2 standard rules for mapping ISO geographic information UML models to OWL ontologies.
• Overall structure (ShapeChange output)
UML featureTypes and dataTypes modelled as classes, and their attributes as datatype or object properties
UML codeLists modelled as classes/concepts, and their attributes as concept members
Cardinality restrictions defined on properties (exactly, min, max)
Datatype property ranges defined according to the model/mappings
Object property ranges defined according to the model/mappings
inverseOf defined for object properties
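The structure described above can be illustrated with a toy converter. This is not ShapeChange and does not implement ISO 19150-2; it is a simplified sketch with a hypothetical `ex:` namespace and invented attribute names, showing how a feature type could become a class, its attributes datatype or object properties with ranges, and an exact cardinality an OWL restriction.

```python
PREFIXES = "\n".join([
    "@prefix ex: <http://example.org/model#> .",  # hypothetical namespace
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
])

# Assumed mapping of a few UML basic types to XSD datatypes
XSD_TYPES = {
    "CharacterString": "xsd:string",
    "Integer": "xsd:integer",
    "Real": "xsd:double",
}


def feature_type_to_turtle(name: str, attributes: list) -> str:
    """attributes: (attribute name, UML type, exact cardinality or None)."""
    lines = [f"ex:{name} a owl:Class ."]
    for attr, uml_type, exactly in attributes:
        if uml_type in XSD_TYPES:
            kind, rng = "owl:DatatypeProperty", XSD_TYPES[uml_type]
        else:
            # Any other type is treated as another class in the model
            kind, rng = "owl:ObjectProperty", f"ex:{uml_type}"
        lines.append(
            f"ex:{attr} a {kind} ; rdfs:domain ex:{name} ; rdfs:range {rng} ."
        )
        if exactly is not None:
            lines.append(
                f"ex:{name} rdfs:subClassOf [ a owl:Restriction ; "
                f"owl:onProperty ex:{attr} ; "
                f'owl:cardinality "{exactly}"^^xsd:nonNegativeInteger ] .'
            )
    return "\n".join(lines)


PLOT_ATTRIBUTES = [
    ("code", "CharacterString", 1),
    ("containsZone", "ManagementZone", None),
]
print(PREFIXES)
print(feature_type_to_turtle("Plot", PLOT_ATTRIBUTES))
```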
¹ Palma R., Reznik T., Esbri M., Charvat K., Mazurek C. An INSPIRE-based vocabulary for the publication of Agricultural Linked Data. Proceedings of the OWLED Workshop, collocated with ISWC 2015, Bethlehem PA, USA, October 11-15, 2015.
Figures: Datatype hierarchy, Codelist hierarchy, FeatureType hierarchy
14. FOODIE ontology
• Key feature: Plot, at a more detailed level than Site, which is already part of the INSPIRE AF data model
• Represents a continuous area of agricultural land with one type of crop species, cultivated by one user in one farming mode
• Two kinds of data associated:
• metadata information
• agro-related information
Next level: Management Zone
• Enables a more precise description of the land characteristics in a fine-grained area
15. FOODIE ontology
• The Intervention is the basic feature type for any kind of (farming) application with explicitly defined geometry, e.g., tillage or pruning.
Has multiple indirect associations with different concepts
16. Ephedra: API Wrapper
• Ephedra is a SPARQL federation engine aimed at processing hybrid queries, which provides a flexible declarative mechanism for including hybrid services into a SPARQL federation.
• Ephedra is a component of Metaphactory (https://www.metaphacts.com/), an end-to-end Knowledge Graph Platform for knowledge graph management, rapid application development, and end-user oriented interaction.
17. Creating SPARQL wrapper with Ephedra
• Describe the REST Service Signature (mapping)
18. Creating SPARQL wrapper with Ephedra
• Configure the AgroDataCube REST Service Repository
• Include this repository into the Ephedra federation
19. Expose generated RDF data via SPARQL endpoint
• SPARQL endpoint provided
http://metaphactory.foodie-cloud.org/sparql?repository=ephedra
• Use SPARQL SERVICE keyword
SERVICE <http://www.metaphacts.com/ontologies/platform/repository/federation#agrodatacube>
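A federated query against this endpoint might be assembled as sketched below. The endpoint URL and the SERVICE IRI are taken from the slide; the `ex:` predicates inside the SERVICE block are placeholders, because the real input and output parameters are defined by the Ephedra service descriptor, which is not reproduced here. The request is only constructed, not sent.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

ENDPOINT = "http://metaphactory.foodie-cloud.org/sparql?repository=ephedra"
SERVICE_IRI = (
    "http://www.metaphacts.com/ontologies/platform/repository/"
    "federation#agrodatacube"
)


def build_query(year: int, limit: int = 10) -> str:
    """Build a SPARQL query that delegates a pattern to the wrapped API."""
    return f"""PREFIX ex: <http://example.org/agrodatacube#>
SELECT ?field ?crop WHERE {{
  SERVICE <{SERVICE_IRI}> {{
    ?field ex:year {year} ;      # hypothetical input parameter
           ex:cropName ?crop .   # hypothetical output parameter
  }}
}} LIMIT {limit}"""


def build_request_url(query: str) -> str:
    """Add the URL-encoded query to the endpoint's existing query string."""
    parts = urlsplit(ENDPOINT)
    params = parse_qsl(parts.query) + [("query", query)]
    return urlunsplit(parts._replace(query=urlencode(params)))


query = build_query(2018)
print(query)
print(build_request_url(query))
```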
20. Query REST API with SPARQL
• Process (e.g., format) any required output on the fly
• Link the generated RDF data with other datasets and thesauri on the fly
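One simple way to picture on-the-fly linking is label matching against a thesaurus. The slide does not say which linking technique is used, so the sketch below is an assumption: it normalises crop names, looks them up in a small in-memory table and emits `skos:closeMatch` links. The concept URIs are placeholders under `example.org`, not real AGROVOC identifiers.

```python
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"

# Hypothetical thesaurus excerpt: normalised label -> placeholder concept URI
THESAURUS = {
    "wheat": "http://example.org/thesaurus/c_wheat",
    "maize": "http://example.org/thesaurus/c_maize",
}


def normalise(label: str) -> str:
    """Lower-case the label and collapse surrounding or repeated whitespace."""
    return " ".join(label.lower().split())


def link_crops(crops: dict) -> list:
    """crops maps a resource URI to its crop name; returns N-Triples links."""
    links = []
    for uri, name in crops.items():
        concept = THESAURUS.get(normalise(name))
        if concept is not None:
            links.append(f"<{uri}> <{SKOS_CLOSE_MATCH}> <{concept}> .")
    return links


CROPS = {
    "http://example.org/crop/1": "  Wheat ",
    "http://example.org/crop/2": "unknown crop",
}
print("\n".join(link_crops(CROPS)))
```

Names without a matching label simply produce no link, which keeps the generated data free of guessed correspondences.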
21. Visualize and exploit the linked data
• Demo app: http://metaphactory.foodie-cloud.org/resource/:AGROVOC-crops
24. Thank you!
Special thanks to the Metaphacts team
Questions: rpalma@man.poznan.pl