Linked data publication pipelines for agri-related use cases
1. This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
This project has received funding from
the European Union’s Horizon 2020
research and innovation programme
under grant agreement No 732064
This project is part
of BDV PPP
LINKED DATA PUBLICATION PIPELINES
FOR AGRI-RELATED USE CASES
Raul Palma1, Soumya Brahma1 , Marcin Krystek1, Karel
Charvát2, Raitis Berzins2
1Poznan Supercomputing and Networking Center, Poland
2WirelessInfo, Czech Republic
7th Leipzig Semantic Web Day
22nd May, 2019
Linked data publication
• Linked Data (LD) is becoming an increasingly popular method for publishing data on the Web
• Improves data accessibility for both humans and machines, e.g., for finding, reusing and integrating data
• Enables the discovery of more useful data through links (and inferencing), and the exploitation of data with semantic queries
• Growing number of datasets in the LOD cloud
• 1,234 datasets with 16,136 links (as of June 2018)
• Coverage of the LOD cloud
• Large cross-domain datasets (DBpedia, Freebase, etc.)
• Variable domain coverage (e.g., Geography, Government, BioInformatics)
• What about Agriculture?
• “Just” a few datasets (e.g., AGRIS bibliographic records, AGROVOC thesaurus and other thesauri such as NALT)
• Farming and agri-activities related data?
Source: http://lod-cloud.net/
Why is Linked Data relevant in Agriculture:
Farming context
• Farm management
• Multiple activities and stakeholders
• Multiple applications, tools and devices
• Multiple data sources, types and formats
• Challenge
• To combine/integrate those different and
heterogeneous data sources in order to
make economically and environmentally
sound decisions
Data Integration in relevant projects (context)
• Data integration challenges have been the focus of relevant projects
• FOODIE (EU FP7, ICT CIP, 2014-2017) aimed at building an open and interoperable cloud-based platform addressing, among others, the integration of data relevant to farming production, including their geospatial dimension, as well as their publication as Linked Data.
• SDI4Apps (EU FP7, ICT CIP, 2014-2017) aimed at building a cloud-based framework with an open API for data integration, focusing on the development of six pilot apps and drawing along the lines of INSPIRE, Copernicus and GEOSS.
• DataBio aims at showcasing the benefits of Big Data technologies in the raw material production from agriculture and other sectors for the bioeconomy industry, deploying an interoperable platform on top of the existing partners’ infrastructure.
• DataBio aims at delivering solutions for big data management, including i) the storage and querying of various big data sources; ii) the harmonization and integration of a large variety of data from many sources, using linked data as a federated layer.
Linked data principles & guidelines
• Simple set of principles & technologies
• URI, HTTP, RDF, SPARQL
• Involves a set of (common) tasks
• Datasets identification
• Model specification
• RDF data generation
• Linking
• Exploiting
Guidelines referenced: Hyland et al.; Hausenblas et al.; Villazón-Terrazas et al.; Best Practices for Publishing Linked Data; 5-star deployment scheme for Linked Open Data
Implementing Linked Data publication pipelines
• Goal: to define and deploy (semi-) automatic processes to carry out the necessary steps to transform and
publish different input datasets as Linked Data.
• A pipeline connects different data processing components to carry out the transformation of data into RDF
and their linking, and includes the mapping specifications to process the input datasets.
• Each pipeline is configured to support specific input dataset types (same format, model and delivery form).
• Principles
• Pipelines can be directly re-executed and re-applied
(e.g., extended/updated datasets)
• Pipelines must be easily reusable
• Pipelines must be easily adapted for new input datasets
• Pipeline execution should be as automatic as possible.
The final target is fully automated processes.
• Pipelines should support both (mostly) static data and data streams
(e.g., sensor data)
• The resulting datasets, available as Linked Data, will provide an integrated view over the initial
(disconnected and heterogeneous) datasets, in compliance with any privacy and access control needs
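The pipeline principles above (re-executable, reusable, adaptable) can be illustrated with a minimal sketch. It is not the DataBio implementation: the step functions, the `http://example.org/farm/` namespace and the sample records are all hypothetical, chosen only to show one configured chain of steps being re-applied to a new batch of the same dataset type.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]
Record = Dict[str, str]
Step = Callable[[Iterable[Record]], Iterable[Record]]

BASE = "http://example.org/farm/"  # hypothetical namespace, for illustration only


def run_pipeline(records: Iterable[Record],
                 steps: List[Step],
                 to_triples: Callable[[Record], Iterable[Triple]]) -> Iterator[Triple]:
    """Apply each processing step in order, then map every record to triples."""
    for step in steps:
        records = step(records)
    for record in records:
        yield from to_triples(record)


def drop_incomplete(records: Iterable[Record]) -> Iterator[Record]:
    """Skip records that lack an identifier or a crop value."""
    return (r for r in records if r.get("id") and r.get("crop"))


def normalise_crop(records: Iterable[Record]) -> Iterator[Record]:
    """Trim and lower-case the crop name so equal crops compare equal."""
    for r in records:
        yield {**r, "crop": r["crop"].strip().lower()}


def plot_to_triples(record: Record) -> Iterator[Triple]:
    """Mapping step: one (subject, predicate, object) triple per record."""
    yield (f"{BASE}plot/{record['id']}", f"{BASE}vocab#crop", record["crop"])


# One configuration per input dataset type; it can be re-run on updated data.
FIELD_PIPELINE: List[Step] = [drop_incomplete, normalise_crop]

batch_2017 = [{"id": "1", "crop": " Wheat "}, {"id": "", "crop": "barley"}]
batch_2018 = [{"id": "2", "crop": "Barley"}]

triples_2017 = list(run_pipeline(batch_2017, FIELD_PIPELINE, plot_to_triples))
triples_2018 = list(run_pipeline(batch_2018, FIELD_PIPELINE, plot_to_triples))
```

Because the steps consume and produce iterables, the same configuration would also accept a generator that yields records as they arrive, which is one simple way to think about supporting both static data and streams.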
Use cases (examples)
• UC1: Publication of farm related linked data from agri pilots in DataBio
project, particularly Pilot 8 [B1.4] Cereals and biomass crops.
• Goal: query and access heterogeneous agricultural data sources from Rostenice farm via an integrated
layer in order to make informed decisions and discover new knowledge
• UC2: Publication of Open EU/national datasets relevant for agri-food pilots
as Linked Data
• Goal: provide access to multiple, isolated data sources relevant for agri-pilots, and identify links with
farm datasets, from a single integrated layer
• UC3: Publication of sensor data as linked data on the fly from Pilot 9 [B2.1]
Machinery management
• Goal: provide access to sensor data integrated with other farm data and other relevant datasets
Datasets identification and collection
• UC1 Datasets
• Farm data (Rostenice holding) with the name of each field and the associated cereal crop classifications, arranged by year.
• Field boundaries, crop map and yield potential for most of the fields of the Rostenice farm in the Czech Republic.
• Yield records from two fields (Pivovarka, Predni) within the pilot farm that were harvested in 2017/2018.
Source data types: shapefiles
Datasets identification and collection
• UC2 datasets
• Czech datasets
• Czech LPIS data showing the actual land boundaries.
• Czech erosion zones (strongly (SEO) and moderately (MEO) erosion-endangered soil zones)
• Czech water bodies (e.g., the restricted area near water bodies has a 25 m buffer according to the nitrate directive).
• Soil type data for the whole Czech Republic.
• Polish datasets
• Polish LPIS data showing the cadastral land boundaries from all over the country.
• European datasets
• FADN (Farm Accountancy Data Network) data about the income of agricultural holdings and
the impacts of the Common Agricultural Policy from all EU member states
• Various open European geospatial datasets including
• (part of) Open Land Use (OLU)
• (part of) Open Transport Map (OTM)
• Smart Points of Interest (SPOI)
• (part of) Urban Atlas (pan-European comparable land use and land cover data for Large Urban Zones)
• (part of) CORINE Land Cover
• HILUCS (Hierarchical INSPIRE Land Use Classification System)
• Experimental sample dataset from the review platform Yelp (global coverage). The data cover the geographical location of businesses, reviews and reviewer information.
Source data types: shapefiles, JSON, CSV, relational databases
Datasets identification and collection
• UC3 dataset
• Sensor data collected and stored by the SensLog platform, including the readings of IoT devices on tractors.
• The update rate is high, with readings arriving frequently
Source data types: relational databases
Data models
• Various ontologies/vocabularies were selected and reused/extended for the representation of data
• For farm-related data, the INSPIRE-based FOODIE ontology was selected and extended as needed
• For the land parcel and cadastral data (for the Czech Republic and Poland), erosion-endangered soil zones, water buffers and soil type classification, the FOODIE ontology and, in some cases, its extensions were also used for modelling the classes and properties.
• For FADN, the main ontologies used were the Data Cube Vocabulary and its SDMX ISO standard extensions, which proved effective for representing such multidimensional data. Moreover, the Data Cube Vocabulary builds on well-known RDF vocabularies like SKOS, SCOVO, VoID, FOAF, Dublin Core, etc.
• For the Yelp dataset, various ontologies like review, FOAF, schema.org, POI, etc. were used to represent the classes and properties identified in the input data sources.
• For some other datasets (e.g., CORINE, HILUCS, OLU, OTM, Urban Atlas), simple ontologies/vocabularies were generated in line with standards and are available at https://github.com/FOODIE-cloud/ontology.
• For sensor data: the Semantic Sensor Network (SSN) ontology along with the SOSA (Sensor, Observation, Sample, and Actuator) ontology for describing sensors and their observations, the involved procedures, the studied features of interest, etc. Additionally, the Data Cube Vocabulary and its SDMX ISO standard extensions for multidimensional data
Data model for farming data
• (FOODIE) farming data model principles
• application vocabulary covering the different categories of information
dealt with by the farm management tools/apps (in FOODIE)
• in line with existing standards and best practices
• Resulting model*
• Builds on the INSPIRE AF specification for agricultural data, and
• the INSPIRE specification for themes in annex I for geospatial data, based on
• ISO/OGC standards for geographical information
• Created as a UML model
*consulted with experts from various institutions, e.g., EU DG JRC, EU Global
Navigation Satellite Systems Agency (GSA), Czech Ministry of Agriculture,
Global Earth Observation System of Systems (GEOSS), German Kuratorium für
Technik und Bauwesen in der Landwirtschaft (KTBL).
Challenge: Transform model into OWL
ontology
Transformation of model into OWL ontology
• ShapeChange implements the ISO 19150-2 standard rules for mapping ISO geographic information UML models to OWL ontologies.
• Semi-automatic process: besides the transformation configuration, additional pre- and post-processing tasks were needed
Reference: Palma R., Reznik T., Esbri M., Charvat K., Mazurek C., An INSPIRE-based vocabulary for the publication of Agricultural Linked Data. Proceedings of the OWLED Workshop, collocated with ISWC-2015, Bethlehem PA, USA, October 11-15, 2015
FOODIE ontology - overview
• Overall structure (ShapeChange output)
• UML featureTypes and dataTypes modelled as classes, and their attributes as datatype or object properties
• UML codeLists modelled as classes/concepts, and their attributes as concept members
• Cardinality restrictions defined on properties (exactly, min, max)
• Datatype property ranges defined according to model/mappings
• Object property ranges defined according to model/mappings
• Object property inverseOf relations defined
[Figure panels: Top hierarchy, FeatureType hierarchy, Codelist hierarchy, Datatype hierarchy]
FOODIE ontology – main classes overview
• The INSPIRE AF data model already includes Site, but lacks a feature at a more detailed level.
• Main concept: Plot
• Represents a continuous area of agricultural land with one type of crop species, cultivated by one user in one farming mode
• Two kinds of data associated:
• metadata information
• agro-related information
• Next level: Management Zone
• Enables a more precise description of the land characteristics in a fine-grained area
[Diagram: classes INSPIRE-AF:Site, foodie:Plot, foodie:ManagementZone, foodie:Alert, foodie:Intervention, foodie:CropSpecies, foodie:ProductionType, foodie:SoilNutrients; properties containsPlot, containsManagementZone, interventionPlot, speciesPlot, alertPlot, plotAlert, production, zoneNutrients]
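The Site, Plot and Management Zone levels can be pictured as a small containment graph. The sketch below is only an illustration: the namespaces are placeholders rather than the published FOODIE IRIs, the instances are invented, and the direction of `containsPlot` (Site to Plot) and `containsManagementZone` (Plot to Management Zone) is an assumption read off the property names.

```python
# Placeholder namespaces for illustration only; not the published FOODIE IRIs.
FOODIE = "http://example.org/foodie#"
EX = "http://example.org/farm/"

# Invented instance data: one site, two plots, three management zones.
triples = [
    (EX + "site1", FOODIE + "containsPlot", EX + "plot1"),
    (EX + "site1", FOODIE + "containsPlot", EX + "plot2"),
    (EX + "plot1", FOODIE + "containsManagementZone", EX + "zone1a"),
    (EX + "plot1", FOODIE + "containsManagementZone", EX + "zone1b"),
    (EX + "plot2", FOODIE + "containsManagementZone", EX + "zone2a"),
]


def objects(graph, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def management_zones_of_site(graph, site):
    """Follow Site -> Plot -> ManagementZone and collect the zones in order."""
    zones = []
    for plot in objects(graph, site, FOODIE + "containsPlot"):
        zones.extend(objects(graph, plot, FOODIE + "containsManagementZone"))
    return zones


zones = management_zones_of_site(triples, EX + "site1")
```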
FOODIE ontology – main classes overview
• The Intervention is the basic feature type for any kind of (farming)
application with explicitly defined geometry, e.g., tillage or pruning.
• Has multiple indirect associations with different concepts
[Diagram: classes foodie:Intervention, foodie:Treatment, foodie:TreatmentPlan, foodie:Product, foodie:ProductPreparation, foodie:ActiveIngredients, foodie:FormOfTreatmentValue, foodie:TreatmentPurposeValue; relations is-a, plan, productPlan, planProduct, preparationProduct, preparation, productTreatment, treatmentProduct, preparationPlan, ingredientProduct, formOfTreatment, treatmentPurpose]
RDF data generation
• Different tools were deployed, configured and used for the generation of RDF
data, depending on the source data type
• D2RQ: mainly for relational databases
• GeoTriples: mainly for shapefiles
• R2MLProcessor: mainly for JSON, CSV data sources
• All these tools require a mapping file (in RDF) specifying how to map the data
source elements to the target ontology concepts and properties.
• Mapping specifications use R2RML (RDB to RDF Mapping Language) and/or
extensions (e.g., the RDF Mapping Language (RML))
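The role of a mapping file can be sketched in a few lines. The example below is not R2RML or RML syntax and does not use any of the tools named above; it is a simplified analogue in which a Python dictionary plays the part of the mapping (subject template, target class, column-to-predicate pairs). The `http://example.org/` IRIs and the CSV row are made up for illustration.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping: how source columns relate to target terms.
MAPPING = {
    "subject_template": "http://example.org/field/{field_id}",
    "class": "http://example.org/vocab#Plot",
    "columns": {
        "name": "http://www.w3.org/2000/01/rdf-schema#label",
        "crop": "http://example.org/vocab#cropSpecies",
    },
}


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_ntriples(csv_text, mapping):
    """Turn each CSV row into N-Triples lines according to the mapping."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = mapping["subject_template"].format(**row)
        target_class = mapping["class"]
        lines.append(f"<{subject}> <{RDF_TYPE}> <{target_class}> .")
        for column, predicate in mapping["columns"].items():
            if row.get(column):  # skip empty cells
                literal = escape_literal(row[column])
                lines.append(f'<{subject}> <{predicate}> "{literal}" .')
    return lines


# Made-up sample row; only the field name is borrowed from the pilot description.
SAMPLE_CSV = "field_id,name,crop\n17,Pivovarka,winter wheat\n"
ntriples = rows_to_ntriples(SAMPLE_CSV, MAPPING)
```

Keeping the mapping as data, separate from the code that applies it, is what lets the same processor be reused for a new dataset by swapping only the mapping.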
RDF data generation
• The mapping file also specifies the connection details for the source dataset
• Based on the mapping file, the data source (e.g., database content, shapefile, JSON,
CSV, etc.) is either
• i) dumped to an RDF file; or
• ii) transformed on the fly as a virtual RDF graph (e.g., for data streaming)
• RDF files were loaded into Virtuoso triplestore (FOODIE endpoint)
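The difference between the two options can be sketched as follows. This is an analogy only: tools such as D2RQ expose a virtual graph by translating queries against the source database, whereas the sketch simply contrasts writing all triples to a file-like object with producing them lazily on request. The SOSA terms are real; the `http://example.org/senslog/` namespace and the reading are hypothetical.

```python
import io

SOSA = "http://www.w3.org/ns/sosa/"
XSD = "http://www.w3.org/2001/XMLSchema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/senslog/"  # hypothetical namespace


def observation_triples(rows):
    """Lazily yield N-Triples lines; nothing is computed until it is requested."""
    for row in rows:
        obs = f"<{BASE}observation/{row['id']}>"
        sensor = f"<{BASE}sensor/{row['sensor']}>"
        value, time = row["value"], row["time"]
        yield f"{obs} <{RDF_TYPE}> <{SOSA}Observation> ."
        yield f"{obs} <{SOSA}madeBySensor> {sensor} ."
        yield f'{obs} <{SOSA}hasSimpleResult> "{value}"^^<{XSD}double> .'
        yield f'{obs} <{SOSA}resultTime> "{time}"^^<{XSD}dateTime> .'


def dump(rows, out):
    """Option i: materialise every triple into a file-like object."""
    count = 0
    for line in observation_triples(rows):
        out.write(line + "\n")
        count += 1
    return count


# Invented reading for illustration.
readings = [{"id": "1", "sensor": "tractor-7", "value": 12.5,
             "time": "2019-05-22T10:00:00Z"}]

buffer = io.StringIO()
written = dump(readings, buffer)          # option i: RDF dump
stream = observation_triples(readings)    # option ii: produced on demand
first_line = next(stream)
```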
Linking the generated RDF datasets
• In order to link the resulting RDF datasets with other datasets, we follow different approaches:
• Apply existing tools like Silk or LIMES to discover equivalence relations
• For other relations, use queries (e.g., geospatial) to access the integrated data as needed
• In our experiments with equivalence relations, however, we also had to create some entries manually
• We found issues in handling large datasets in Silk, especially those accessed via SPARQL endpoints that we cannot control
• There were recent optimizations of LIMES and a new tool for geospatial linking (from Leipzig) that we plan to test
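To make the idea of discovering equivalence relations concrete, here is a toy sketch. It does not reproduce how Silk or LIMES work; it just compares labels from two made-up datasets with a string similarity measure from the Python standard library and emits `owl:sameAs` links above a threshold. The threshold of 0.9 and all IRIs are assumptions.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source, target, threshold=0.9):
    """Link each source resource to its best-matching target, if similar enough."""
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, OWL_SAME_AS, best_uri))
    return links


# Invented labels for two hypothetical datasets.
source = {
    "http://example.org/a/1": "Winter wheat",
    "http://example.org/a/2": "Sugar beet",
}
target = {
    "http://example.org/b/x": "winter wheat",
    "http://example.org/b/y": "Maize",
}

links = discover_links(source, target)
```

The nested loop compares every source label with every target label, so the cost grows with the product of the dataset sizes. That gives some intuition for why naive matching becomes a problem on large datasets.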
Triplestore statistics (examples)
Dataset name | Graph in FOODIE endpoint | Source | Triples
OLU** | http://w3id.org/foodie/olu# | Transformed from PostgreSQL | 127,926,060
SPOI | http://www.sdi4apps.eu/poi.rdf | Provided by WRLS (also available in FOODIE endpoint) | 407,629,170
NUTS | http://nuts.geovocab.org/ | Open Source (available in FOODIE endpoint) | 316,238
OTM*** | http://w3id.org/foodie/otm# | Transformed from PostgreSQL | 154,340,785
Yelp academic dataset | http://data.yelp.com/academic_dataset# | Transformed from JSON | 86,348,185
LPIS data (CZ) | http://w3id.org/foodie/open/cz/pLPIS_180616_WGS# | Transformed from shapefile | 24,491,282
FADN | http://ec.europa.eu/agriculture/FADN/{dataset}# | Transformed from CSV | 25,457,255
Pilot 8 farm data | private | Transformed from shapefile | 1,569,439
Total: over 1 billion triples!
The FOODIE triplestore is one of the largest semantic repositories related to agriculture, which has been
recognized by the EC Innovation Radar as “arable farming data integrator for Smart Farming”
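As a quick arithmetic check, the triple counts of the example datasets listed above can be summed. The rows are examples only; on their own they add up to roughly 828 million triples, so the stated total of over 1 billion presumably also counts graphs that are not shown in the table. The dictionary keys below are just labels chosen for this sketch.

```python
# Triple counts copied from the example rows of the statistics table.
listed_triples = {
    "OLU": 127_926_060,
    "SPOI": 407_629_170,
    "NUTS": 316_238,
    "OTM": 154_340_785,
    "Yelp academic dataset": 86_348_185,
    "LPIS data (CZ)": 24_491_282,
    "FADN": 25_457_255,
    "Pilot 8 farm data": 1_569_439,
}

listed_total = sum(listed_triples.values())

# Share of one billion covered by the listed example datasets alone.
share_of_billion = listed_total / 1_000_000_000

# The largest single graph among the listed examples.
largest_dataset = max(listed_triples, key=listed_triples.get)
```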
Exploiting the Linked Data – querying triplestore
• SPARQL endpoint: https://www.foodie-cloud.org/sparql
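A SPARQL endpoint is queried over HTTP using the SPARQL 1.1 Protocol, typically by passing the query text in a `query` parameter. The sketch below only builds such a request with the Python standard library and does not send it, since the endpoint may no longer be reachable. The sample query and the requested JSON result format are assumptions made for illustration.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://www.foodie-cloud.org/sparql"  # endpoint named on the slide


def build_sparql_request(endpoint, query):
    """Build an HTTP GET request carrying the query in the `query` parameter."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


# Generic sample query, not taken from the slides.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
request = build_sparql_request(ENDPOINT, QUERY)

# Sending the request needs network access and a live endpoint:
# import json
# with urllib.request.urlopen(request, timeout=30) as response:
#     results = json.load(response)
```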
Exploiting the Linked Data – querying sensor data as virtual RDF graph
• SPARQL endpoint: http://senslogrdf.foodie-cloud.org/sparql
• SNORQL search endpoint: http://senslogrdf.foodie-cloud.org/snorql/
• Web-based visualization: http://senslogrdf.foodie-cloud.org/
Exploiting the Linked Data – querying examples (1-2 linked datasets)
• Get information on Points of Interest in a given polygon (one dataset)
SELECT *
FROM <http://www.sdi4apps.eu/poi.rdf>
WHERE {
  ?Resource rdfs:label ?Label .
  ?Resource poi:class ?POI_Class .
  ?Resource geo:asWKT ?Coordinates .
  # Keep only POIs whose geometry intersects the given polygon
  FILTER(bif:st_intersects(?Coordinates, bif:st_geomFromText("POLYGON((6.11553983198 54.438016608357, 6.95050076948 47.230985358357, 13.36651639448 47.626493170857, 14.99249295698 54.701688483357, 6.11553983198 54.438016608357))"))) .
}
• Get SPOIs for a given Open Land Use parcel (linking 2 datasets)
SELECT *
FROM <http://www.sdi4apps.eu/poi.rdf>
WHERE {
  ?Resource rdfs:label ?Label .
  ?Resource poi:class ?POI_Class .
  ?Resource geo:asWKT ?Coordinates .
  # ?coordinates (lower case) is the parcel geometry returned by the subquery
  FILTER(bif:st_intersects(?Coordinates, bif:st_geomFromText(?coordinates))) .
  {
    SELECT (bif:st_astext(?x) AS ?coordinates)
    FROM <http://w3id.org/foodie/olu#>
    WHERE {
      # olu-instance: is the subject here, i.e. the IRI of the given OLU parcel
      olu-instance: geo:hasGeometry ?geometry .
      ?geometry geo:asWKT ?x
    }
  }
}
Exploiting the Linked Data – querying examples (3+ linked datasets)
• Show all the land parcels (OLU) that contain hotels (SPOI) lying no more than 50 meters from a major highway (OTM) (linking 3 datasets)
SELECT DISTINCT ?olu ?hilucs ?source ?municode ?specificLandUse
FROM <http://w3id.org/foodie/olu#>
WHERE {
  ?olu a olu:LandUse .
  ?olu geo:hasGeometry ?geometry .
  ?olu olu:hilucsLandUse ?hilucs .
  ?olu olu:geometrySource ?source .
  OPTIONAL { ?olu olu:municipalCode ?municode } .
  OPTIONAL { ?olu olu:specificLandUse ?specificLandUse } .
  ?geometry geo:asWKT ?coordinatesOLU .
  # Keep parcels that contain one of the POIs found by the subquery
  FILTER(bif:st_within(bif:st_geomFromText(?coordinatesPOI), ?coordinatesOLU)) .
  {
    SELECT DISTINCT ?Resource ?Label (bif:st_astext(?coordinatesPOIa) AS ?coordinatesPOI)
    FROM <http://www.sdi4apps.eu/poi.rdf>
    WHERE {
      ?Resource rdfs:label ?Label .
      ?Resource poi:class <http://gis.zcu.cz/SPOI/Ontology#lodging> .
      ?Resource geo:asWKT ?coordinatesPOIa .
      # Keep lodging POIs within the given tolerance of a road link
      FILTER(bif:st_within(?coordinatesPOIa, bif:st_geomFromText(?coordinatesOTM), 0.00045)) .
      {
        SELECT (bif:st_astext(?x) AS ?coordinatesOTM)
        FROM <http://w3id.org/foodie/otm#>
        WHERE {
          ?roadlink a otm:RoadLink .
          ?roadlink otm:roadName ?name .
          ?roadlink otm:functionalRoadClass ?class .
          ?roadlink otm:centerLineGeometry ?geometry .
          ?geometry geo:asWKT ?x .
          # Restrict to first-class roads inside the given bounding polygon
          FILTER(bif:st_intersects(?x, bif:st_geomFromText("POLYGON((14.426647 50.0751251,14.426647 50.07685089,14.43054696 50.07685089,14.43054696 50.0751251,14.426647 50.0751251))"))) .
          FILTER(STRSTARTS(STR(?class), "firstClass")) .
        }
      }
    }
  }
}
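The query above states the 50 meter limit as the number 0.00045. One plausible reading, taken here as an assumption, is that this is the distance expressed in decimal degrees of latitude. The sketch below checks that arithmetic on a spherical Earth with a mean radius of 6,371 km. The helper names are made up for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def meters_to_degrees_latitude(meters):
    """Angle in degrees subtended by a north-south distance on the sphere."""
    return math.degrees(meters / EARTH_RADIUS_M)


def meters_to_degrees_longitude(meters, latitude_deg):
    """Same for an east-west distance; parallels shrink with cos(latitude)."""
    parallel_radius = EARTH_RADIUS_M * math.cos(math.radians(latitude_deg))
    return math.degrees(meters / parallel_radius)


# 50 m north-south, and 50 m east-west at the latitude of the query polygon.
tolerance_lat = meters_to_degrees_latitude(50)
tolerance_lon_at_50n = meters_to_degrees_longitude(50, 50.0)
```

At about 50 degrees north, where the query polygon lies, the same 50 meters corresponds to a larger angle in longitude than in latitude, so a single tolerance in degrees is only an approximation of a metric buffer.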
Exploiting the Linked Data – search & navigate
• Faceted search endpoint: http://www.foodie-cloud.org/fct
Exploiting the Linked Data – visualisation
• Map visualisation: http://ng.hslayers.org/examples/foodie-zones/
Exploiting the Linked Data – visualisation
• Map visualisation: http://ng.hslayers.org/examples/produce-3d/
Object information
on click
Exploiting the Linked Data – visualisation
• Map visualisation: http://ng.hslayers.org/examples/olu_spoi
• OLU polygons colored by the number of SPOI that lie inside them
Exploiting the Linked Data – visualisation
• Map visualisation: http://app.hslayers.org/project-databio/land
• Usage scenarios:
• Find and show buffer zones around water bodies (the user specifies the distance), which define the areas within the fields with limited/restricted application of agrochemicals.
• Select farm/fields based on the ID_UZ attribute from the public CZ LPIS database and search EO data over all fields
• Visualize crop species based on the farm data (needs parcels with crop types, which are not available from open LPIS data)
• Select fields with different soil types
• Select all fields with a certain crop within a maximum distance from a certain point (e.g., for logistics or distribution of biomass)
• Show/select erosion zones for a specific farm ID
Exploiting the Linked Data – visualisation
• Metaphactory: http://foodie.metaphacts.cloud/resource/Start
Security of RDF Data
• SPARQL endpoints are web services capable of providing read-only access to a back-end graph DBMS.
• SPARQL endpoints are sometimes purpose-specific, so their access privileges must be limited
to some basic operations over the graph.
• The privileges provided by a given Virtuoso SPARQL endpoint are tied to specific user identities with specific
database roles and privileges.
• Virtuoso offers three methods for securing SPARQL endpoints:
• Digest Authentication via SQL Accounts
• OAuth Protocol based Authentication
• WebID Protocol based Authentication
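As an illustration of the first method listed above, the following is a minimal sketch of how a client might prepare a SPARQL query for an endpoint protected by HTTP Digest authentication, using only the Python standard library. The endpoint URL, user name and password are hypothetical placeholders, and the /sparql-auth path is assumed to be the Digest-protected endpoint of a default Virtuoso installation. The network call itself is left commented out so the sketch runs offline.

```python
import urllib.parse
import urllib.request


def make_digest_opener(endpoint_url, user, password):
    """Build a urllib opener that answers HTTP Digest challenges for endpoint_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None means "use these credentials for any realm announced by the server".
    password_mgr.add_password(None, endpoint_url, user, password)
    handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


def build_sparql_request(endpoint_url, query):
    """Encode a SPARQL query as a GET request that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        endpoint_url + "?" + params,
        headers={"Accept": "application/sparql-results+json"},
    )


# Hypothetical values, for illustration only.
ENDPOINT = "http://localhost:8890/sparql-auth"
request = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
opener = make_digest_opener(ENDPOINT, "demo_user", "demo_password")

# Sending the query would look like this (requires a running endpoint):
# with opener.open(request) as response:
#     print(response.read().decode("utf-8"))
```

The SQL account behind the credentials would still need the appropriate database roles; the sketch only covers the client side of the exchange.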
Linked data publication technologies overview
• Technologies used:
• D2RQ, GeoTriples and R2MLProcessor for exposing input datasets as virtual RDF graphs
• RDF for the representation of data
• FOODIE (farming ontology) providing the underlying vocabulary and relations, plus a number of other ontologies/vocabularies (existing and generated)
• Virtuoso for storing the semantic data
• Silk/LIMES for discovery of links
• SPARQL for querying semantic data
• HSLayers NG for visualisation of data
• Metaphactory for visualisation of data
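To give a feel for the link discovery step in the list above, here is a small sketch of label-based matching between two datasets. It is not how Silk or LIMES work internally; it only illustrates the general idea of comparing resources with a similarity measure and emitting link triples for pairs above a threshold. All URIs, labels and the threshold value are hypothetical, and owl:sameAs is assumed as the link predicate.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def discover_links(source, target, threshold=0.9):
    """Return (source_uri, target_uri, score) for label pairs scoring >= threshold.

    source and target map resource URIs to human-readable labels.
    """
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            # Case-insensitive string similarity in the range [0, 1].
            score = SequenceMatcher(None, s_label.lower(), t_label.lower()).ratio()
            if score >= threshold:
                links.append((s_uri, t_uri, score))
    return links


def to_ntriples(links):
    """Serialise discovered links as N-Triples lines."""
    return [f"<{s}> <{OWL_SAME_AS}> <{t}> ." for s, t, _ in links]


# Hypothetical sample data.
farm_crops = {
    "http://example.org/farm/crop/1": "Winter Wheat",
    "http://example.org/farm/crop/2": "Sugar beet",
}
thesaurus = {
    "http://example.org/thesaurus/c_100": "winter wheat",
    "http://example.org/thesaurus/c_200": "maize",
}

links = discover_links(farm_crops, thesaurus)
triples = to_ntriples(links)
for line in triples:
    print(line)
```

Dedicated link discovery tools add blocking, richer comparison operators and declarative link specifications, which this pairwise loop leaves out.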
Thank you for your attention!
Contact details
Editor's Notes
Typical farm activities carried out by farmers include monitoring field operations, managing finances and applying for subsidies, each of which depends on different software applications. Farmers need to use different tools to manage monitoring and data acquisition on-line in the field. They need to analyse information related to subsidies, and to communicate with tax offices, product resellers etc.
The different stakeholder groups involved in agricultural activities need integrated access to multiple and heterogeneous sources of information collected by multiple applications and devices.
The different groups of stakeholders involved in agricultural activities have to manage many heterogeneous sources of information that need to be combined in order to make economically and environmentally sound decisions. These decisions include (among others) the definition of policies (subsidies, standardisation and regulation, national strategies for rural development, climate change), valuation of ecological performance, development of sustainable agriculture, crop harvest timing and pricing, pest detection, etc.
from agriculture, forestry and fishery
to produce food, energy and biomaterials responsibly and sustainably.
, to enable users with different profiles to benefit from the underlying HPC capabilities.
The INSPIRE directive is an EU initiative that aims at building a Pan-European spatial data infrastructure (SDI), requiring EU Member States to make spatial data from multiple thematic areas available according to established implementing rules using appropriate services.
Based on ISO/OGC standards for geographical information, i.e., ISO 19100 series standards
AF: Agricultural and aquaculture facilities. The theme Agricultural and Aquaculture Facilities concerns the description of all the physical instruments and constructions with permanent or semi-permanent emplacement (inland or outland) that are related to agricultural and aquaculture activities.
For the purposes of FOODIE, we found that the INSPIRE AF data model lacks a feature at a more detailed level than Site, which is already part of that model.
Main concept: Plot
Represents a continuous area of agricultural land with one type of crop species, cultivated by one user in one farming mode
Two kinds of data associated:
metadata information
agro-related information
Next concept: Management Zone
Enables a more precise description of the land characteristics in fine-grained areas
The Intervention is the basic feature type for any kind of (farming) application with explicitly defined geometry, e.g., tillage or pruning.
Has multiple indirect associations with different concepts
Pre-processing
Source model preparation
ShapeChange tool configuration: encoding rules; mappings UML classes - OWL elements; namespaces definition
Base ontology fixes (INSPIRE common, ISO 19100 series standards)
Post-processing tasks
Manual fixes in the ontology
Manual creation of ontology elements of the base INSPIRE schemas (AF)
ShapeChange output
UML featureTypes and dataTypes modelled as classes, and their attributes as datatype or object properties
UML codeLists modelled as classes/concepts, and their attributes as concept members
Cardinality restrictions defined on properties (exactly, min, max)
Datatype property ranges defined according to model/mappings
Object property ranges defined according to model/mappings
inverseOf defined for object properties
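The kind of mapping described in these notes can be sketched roughly as follows. This is not ShapeChange's actual implementation or output; it is a simplified illustration of turning one UML feature type into an OWL class, its attributes into datatype or object properties with ranges, and its multiplicities into cardinality restrictions, serialised as Turtle. The ex: prefix and the attribute names (area, containsZone) are hypothetical.

```python
def restriction(class_iri, prop_iri, kind, value):
    """Turtle for a cardinality restriction on class_iri over prop_iri."""
    return (
        f"{class_iri} rdfs:subClassOf [ a owl:Restriction ; "
        f"owl:onProperty {prop_iri} ; "
        f'{kind} "{value}"^^xsd:nonNegativeInteger ] .'
    )


def feature_type_to_turtle(class_name, attributes, prefix="ex"):
    """Map a UML-like feature type description to Turtle statements.

    attributes: list of dicts with keys name, range, kind ("datatype" or
    "object"), and optional min / max multiplicities (None = unbounded).
    """
    class_iri = f"{prefix}:{class_name}"
    lines = [f"{class_iri} a owl:Class ."]
    for attr in attributes:
        prop_iri = f"{prefix}:{attr['name']}"
        prop_type = (
            "owl:DatatypeProperty" if attr["kind"] == "datatype" else "owl:ObjectProperty"
        )
        lines.append(f"{prop_iri} a {prop_type} ; rdfs:range {attr['range']} .")
        low, high = attr.get("min"), attr.get("max")
        if low is not None and low == high:
            # Equal bounds collapse into an exact cardinality.
            lines.append(restriction(class_iri, prop_iri, "owl:cardinality", low))
        else:
            if low is not None:
                lines.append(restriction(class_iri, prop_iri, "owl:minCardinality", low))
            if high is not None:
                lines.append(restriction(class_iri, prop_iri, "owl:maxCardinality", high))
    return "\n".join(lines)


# Hypothetical description of the Plot feature type.
plot_attributes = [
    {"name": "area", "range": "xsd:double", "kind": "datatype", "min": 1, "max": 1},
    {"name": "containsZone", "range": "ex:ManagementZone", "kind": "object", "min": 0, "max": None},
]
turtle = feature_type_to_turtle("Plot", plot_attributes)
print(turtle)
```

Prefix declarations, code lists and inverseOf axioms are left out to keep the sketch short.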