SlideShare a Scribd company logo
1 of 38
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
1
This project has received funding from
the European Union’s Horizon 2020
research and innovation programme
under grant agreement No 732064
This project is part
of BDV PPP
LINKED DATA PUBLICATION PIPELINES
FOR AGRI-RELATED USE CASES
Raul Palma1, Soumya Brahma1 , Marcin Krystek1, Karel
Charvát2, Raitis Berzins2
1Poznan Supercomputing and Networking Center, Poland
2WirelessInfo, Czech Republic 7th Leipzig Semantic Web Day
22nd May, 2019
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
2
Linked data publication
• LD is increasingly becoming a popular method for publishing data on the Web
• Improves data accessibility by both humans and machines, e.g., for finding, reuse and integration
• Enables to discover more useful data through the links (and inferencing), and to exploit data with semantic
queries
• Growing number of datasets in the LOD cloud
• 1,234 datasets with 16,136 links (as of June 2018)
• Coverage of the LOD cloud
• Large cross-domain datasets (dbpedia, freebase, etc.)
• Variable domain coverage (e.g., Geography,
Government, BioInformatics)
• What about Agriculture?
• “Just” few datasets (e.g., AGRIS biblio records,
AGROVOC thesaurus + other thesaurus like NALT)
• Farming and agri-activities related data?
http://lod-cloud.net/
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
3
Why is Linked Data relevant in Agriculture:
Farming context
• Farm management
• Multiple activities and stakeholders
• Multiple applications, tools and devices
• Multiple data sources, types and formats
• Challenge
• To combine/integrate those different and
heterogeneous data sources in order to
make economically and environmentally
sound decisions
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
4
Data Integration in relevant projects (context)
• Data integration challenges have been the focus of relevant projects
EU FP7, ICT CIP, 2014- 2017
EU FP7, ICT CIP, 2014- 2017
FOODIE aimed at building an open
and interoperable cloud-based
platform addressing among others the
integration of data relevant to
farming production including their
geo-spatial dimension, as well as their
publication as Linked data.
SDI4Apps aimed at building a cloud-
based framework with open API for
data integration focusing on the
development of six pilot apps, drawing
along the lines of INSPIRE, Copernicus
and GEOSS
DataBio aims at showcasing the benefits of Big
Data technologies in the raw material production
from agriculture & others for the bioeconomy
industry; deploying an interoperable platform on
top of the existing partners’ infrastructure.
DataBio aims at delivering solutions for big data
mgmt., including i) the storage and querying of
various big data sources; ii)
the harmonization and integration of a large
variety of data from many sources, using linked
data as a federated layer
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
5
Linked data principles & guidelines
• Simple set of principles & technologies
• URI, HTTP, RDF, SPARQL
• Involves a set of (common) tasks
Datasets
identification
Model specification
RDF data generation
Linking
Exploiting
Hyland et al.
Hausenblas et al.
Villazón-Terrazas et al.
Best Practices for Publishing Linked Data
5-star deployment scheme
for Linked Open Data
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
6
Implementing Linked Data publication pipelines
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
7
Implementing Linked Data publication pipelines
• Goal: to define and deploy (semi-) automatic processes to carry out the necessary steps to transform and
publish different input datasets as Linked Data.
• A pipeline connect different data processing components to carry out the transformation of data into RDF
and their linking, and includes the mapping specifications to process the input datasets.
• Each pipeline is configured to support specific input dataset types (same format, model and delivery form).
• Principles
• Pipelines can be directly re-executed and re-applied
(e.g., extended/updated datasets)
• Pipelines must be easily reusable
• Pipelines must be easily adapted for new input datasets
• Pipeline execution should be as automatic as possible.
The final target is to fully automated processes.
• Pipelines should support both: (mostly) static data and data streams
(e.g., sensor data)
• The resulting datasets available as Linked Data, will provide an integrated view over the initial
(disconnected and heterogeneous) datasets, in compliance with any privacy and access control needs
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
8
Use cases (examples)
• UC1: Publication of farm related linked data from agri pilots in DataBio
project, particularly Pilot 8 [B1.4] Cereals and biomass crops.
• Goal: query and access heterogeneous agricultural data sources from Rostenice farm via an integrated
layer in order to make informed decisions and discover new knowledge
• UC2: Publication of Open EU/national datasets relevant for agri-food pilots
as Linked Data
• Goal: provide access to multiple, isolated data sources relevant for agri-pilots, and identify links with
farm datasets, from a single integrated layer
• UC3: Publication of sensor data as linked data on the fly from Pilot 9 [B2.1]
Machinery management
• Goal: provide access to sensor data integrated with other farm data and other relevant datasets
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
9
Datasets identification and collection
• UC1 Datasets
• Farm data (Rostenice holding) that holds information
about each field names with the associated cereal
crop classifications and arranged by year.
• Data about the field boundaries and crop map and
yield potential of most of the fields in Rostenice farm
from Czech Republic.
• Yield records from two fields (Pivovarka, Predni) within
the pilot farm that were harvested in 2017/2018.
Source data types: shapefiles
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
10
Datasets identification and collection
• UC2 datasets
• Czech datasets
• Czech LPIS data showing the actual land boundaries.
• Czech erosion zones (strongly/SEO and moderately / MEO erosion-endangered soil zones)
• Czech water bodies (e.g., restricted area near to water bodies has 25m buffer according to
the nitrate directive).
• The data about soil types from all over Czech Republic.
• Polish datasets
• Polish LPIS data showing the cadastral land boundaries from all over the country.
• European datasets
• FADN (Farm Accountancy Data Network) data about the income of agricultural holdings and
the impacts of the Common Agricultural Policy from all EU member states
• Various open European geospatial datasets including
• (part of) Open Land Use (OLU)
• (part of) Open Transport Map (OTM)
• Smart Points of Interest (SPOI),
• (part of) Urban Atlas (pan-European comparable land use and land cover data for Large
Urban Zones )
• (part of) CORINE Land Cover
• HILUCS (Hierarchical INSPIRE Land Use Classification System )
• Experimental sample dataset from the review platform Yelp (global coverage). The
data contents are regarding the geographical location of a business, review and
reviewer information.
Source data types: shapefiles, JSON, CSV,
relational databases
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
11
Datasets identification and collection
• UC3 dataset
• Sensor data stored and collected by Senslog platform,
including the readings of IoT devices on tractors.
• Data updates is high, reading coming frequently
Source data types:
relational databases
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
12
Data models
• Various ontologies/vocabularies were selected and reused/extended for the
representation of data
• For farm-related data, INSPIRE based FOODIE ontology has been selected and extended as needed
• For the land parcel and cadastral data (for Czech republic & Poland), erosion-endangered soil zones, water buffer
and soil type classification, also FOODIE ontology and in some cases its extensions were used for modelling the
classes and properties.
• In case of the FADN the main ontologies used were Data Cube Vocabulary and its SDMX ISO standard extensions
that were much effective in aligning such multidimensional data. Moreover the Data Cube Vocabulary
encompasses well known RDF vocabularies like SKOS, SCOVO, VoiD, FOAF, Dublin Core, etc.
• For the Yelp dataset various ontologies like review, FOAF, schema.org, POI, etc. were used to represent the
classes and the properties identified from the input data sources.
• For some other datasets (e.g., corine, hilucs, olu, otm, urban atlas) simple ontologies/voabularies were generated
in line with standards and are available in https://github.com/FOODIE-cloud/ontology.
• For sensor data: Semantic Sensor Network (SSN) along with the SOSA (Sensor, Observation, Sample, and
Actuator) ontology for describing sensors and their observations, the involved procedures, the studied features of
interest, etc. Additionally, Data Cube Vocabulary and its SDMX ISO standard extensions for multidimensional data
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
13
Data model for farming data
• (FOODIE) farming data model principles
• application vocabulary covering the different categories of information
dealt by the farm mgmt. tools/apps (in FOODIE)
• in line with existing standards and best practices
• Resulting model*
• Builds on the INSPIRE AF specification for agricultural data, and
• the INSPIRE specification for themes in annex I for geospatial data, based on
• ISO/OGC standards for geographical information
• Created as an UML model
*consulted with experts from various institutions, e.g., EU DG JRC, EU Global
Navigation Satellite Systems Agency (GSA), Czech Ministry of Agriculture,
Global Earth Observation System of Systems (GEOSS), German Kuratorium für
Technik und Bauwesen in der Landwirtschaft (KTBL).
Challenge: Transform model into OWL
ontology
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
14
Transformation of model into OWL
ontology
Palma R., Reznik T., Esbri M., Charvat K., Mazurek C., An INSPIRE-
based vocabulary for the publication of Agricultural Linked Data.
Proceedings of the OWLED Workshop: collocated with the ISWC-
2015, Bethlehem PA, USA, October 11-15, 2015
ShapeChange implements ISO 19150-2
standard rules for mapping ISO geographic
information UML models to OWL
ontologies.
semi-automatic process: besides
transformation configuration,
additional pre and post processing
task were needed
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
15
FOODIE ontology - overview
• Overall structure (ShapeChange output)
• UML featureTypes and dataTypes modelled as classes, and
their attributes as datatype or object properties
• UML codeLists modelled as classes/concepts, and their
attributes as concept members
• Cardinalities restrictions defined on properties (exactly,
min, max)
• DataType properties ranges defined according to
model/mappings
• Object properties ranges defined according to
model/mappings
• Object properties inverseOf defined
Top hierarchy
FeatureType hierarchy
Codelist hierarchy
Datatype hierarchy
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
16
FOODIE ontology – main classes overview
• We found the lack of a feature on a more detailed level than Site that is already part of
the INSPIRE AF data model.
• Main concept: Plot
• Represents a continuous area of agricultural
land with one type of crop species, cultivated
by one user in one farming mode
• Two kinds of data associated:
• metadata information
• agro-related information
 Next level: Management Zone
• Enables a more precise description of the land
characteristics in fine-grained area
foodie:Plot
INSPIRE-AF:Site
foodie:Alert Foodie:InterventionFoodie:CropSpecies
Foodie:ManagementZone
containsPlot
containsManagementZone
interventionPlotspeciesPlot
alertPlot
plotAlert
Foodie:ProductionType
production
Foodie:SoilNutrients
zoneNutrients
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
17
FOODIE ontology – main classes overview
• The Intervention is the basic feature type for any kind of (farming)
application with explicitly defined geometry, e.g., tillage or pruning.
• Has multiple indirect associations with different concepts
Foodie:Intervention
Foodie:Treatment
Foodie:TreatmentPlan
Foodie:Product Foodie:ProductPreparation
Foodie:ActiveIngredients
is-a
plan
productPlan
planProduct
preparationProduct preparation
productTreatment
treatmentProduct
preparationPlan
ingredientProduct
Foodie:FormOfT
reatmentValue
Foodie:Treatme
ntPurposeValue
formOfTreatmenttreatmentPurpose
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
18
RDF data generation
• Different tools were deployed, configured and used for the generation of RDF
data, depending on the source data type
• D2RQ: mainly for relational databases
• Geotriples: mainly for shapefiles
• R2MLprocessor: mainly for JSON, CSV data sources
• All these tools require a mapping file (in RDF) specifying how to map the data
source elements to the target ontology concepts and properties.
• Mapping specifications use R2RML (RDB to RDF Mapping Language) and/or
extensions (e.g., RDF Mapping language (RML))
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
19
RDF data generation
• The mapping file also specifies the connection details for the source dataset
• Based on the mapping file, the data source (e.g., database content, shapefile, JSON,
CSV, etc.) is either
• i) dumped to an RDF file; or
• ii) transformed on the fly as a virtual RDF graph (e.g., for data streaming)
• RDF files were loaded into Virtuoso triplestore (FOODIE endpoint)
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
20
Linking the generated RDF datasets
• In order to link the resulting RDF datasets with
other datasets, we follow different approaches:
• Apply existing tools like Silk or LIMES to discover
equivalence relations
• For other relations use queries (e.g., geospatial) to
access the integrated data as per need.
• In our experiments with equivalence relations;
however we also had to do some manual
entries
• We found issues in handling large datasets in Silk,
specially those accessed via SPARQL endpoint that
we cannot control
There were recent optimizations of LIMES
and a new tool for geospatial linking (from
Leipzig) that we plan to test
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
21
Triplestore statistics (examples)
Dataset Name Graph in FOODIE endpoint Source Triples
OLU** http://w3id.org/foodie/olu# Transformed from PostgreSQL 127,926,060
SPOI http://www.sdi4apps.eu/poi.rdf Provided by WRLS (also
available in FOODIE endpoint)
407,629,170
NUTS http://nuts.geovocab.org/ Open Source (available in
FOODIE endpoint)
316,238
OTM*** http://w3id.org/foodie/otm# Transformed from PostgreSQL 154,340,785
Yelp academic
dataset
http://data.yelp.com/academic_dataset# Transformed from JSON 86,348,185
LPIS data (CZ) http://w3id.org/foodie/open/cz/pLPIS_18
0616_WGS#
Transformed from shapefile 24,491,282
FADN http://ec.europa.eu/agriculture/FADN/{d
ataset}#
Transformed from CSV 25,457,255
Pilot 8 farm data private Transformed from shapefile 1,569,439
Total: over 1 bilion triples!
FOODIE triplestore is one of the largest semantic repositories related to agriculture, which has been
recognized by the EC innovation radar as „arable farming data integrator for Smart Farming”
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
22
Exploiting the Linked Data – querying triplestore
• Sparql endpoint: https://www.foodie-cloud.org/sparql
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
23
Exploiting the Linked Data – querying sensor data as virtual RDF graph
• Sparql endpoint: http://senslogrdf.foodie-cloud.org/sparql
• SNORQL search endpoint: http://senslogrdf.foodie-cloud.org/snorql/
• Web-based visualization: http://senslogrdf.foodie-cloud.org/
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
24
Exploiting the Linked Data – querying examples(1-2 linked
datasets)
• get information of Points Of Interest in a
given polygon (one dataset)
• get SPOIs for given Open Land Use parcel
(linking 2 datasets)
SELECT *
FROM <http://www.sdi4apps.eu/poi.rdf>
WHERE {
?Resource rdfs:label ?Label .
?Resource poi:class ?POI_Class .
?Resource geo:asWKT ?Coordinates .
FILTER(bif:st_intersects (?Coordinates,
bif:st_geomFromText("POLYGON((6.11553983198 54.438016608357,
6.95050076948 47.230985358357, 13.36651639448 47.626493170857,
14.99249295698 54.701688483357, 6.11553983198
54.438016608357))"))) . }
SELECT *
FROM <http://www.sdi4apps.eu/poi.rdf>
WHERE {
?Resource rdfs:label ?Label .
?Resource poi:class ?POI_Class .
?Resource geo:asWKT ?Coordinates .
FILTER(bif:st_intersects (?Coordinates,
bif:st_geomFromText(?coordinates))) .
{
SELECT bif:st_astext(?x) as ?coordinates
FROM <http://w3id.org/foodie/olu#>
WHERE {
olu-instance: geo:hasGeometry ?geometry.
?geometry geo:asWKT ?x
}
}
}
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
25
Exploiting the Linked Data – querying examples (3+ linked
datasets)
• Show me all the land parcels (OLU) that
have hotels (SPOI) and that lie not more
than 50 meters away from the major
highway (OTM) (linking 3 datasets)?
SELECT DISTINCT ?olu ?hilucs ?source ?municode ?specificLandUse
FROM <http://w3id.org/foodie/olu#>
WHERE {
?olu a olu:LandUse .
?olu geo:hasGeometry ?geometry .
?olu olu:hilucsLandUse ?hilucs .
?olu olu:geometrySource ?source .
OPTIONAL {?olu olu:municipalCode ?municode} .
OPTIONAL {?olu olu:specificLandUse ?specificLandUse} .
?geometry geo:asWKT ?coordinatesOLU .
FILTER(bif:st_within(bif:st_geomFromText(?coordinatesPOI),?coordinatesOLU)).
{
SELECT DISTINCT ?Resource, ?Label, bif:st_astext(?coordinatesPOIa) as ?coordinatesPOI
FROM <http://www.sdi4apps.eu/poi.rdf>
WHERE {
?Resource rdfs:label ?Label .
?Resource poi:class <http://gis.zcu.cz/SPOI/Ontology#lodging> .
?Resource geo:asWKT ?coordinatesPOIa .
FILTER(bif:st_within(?coordinatesPOIa,bif:st_geomFromText(?coordinatesOTM),0.00045)) .
{
SELECT bif:st_astext(?x) as ?coordinatesOTM
FROM <http://w3id.org/foodie/otm#>
WHERE {
?roadlink a otm:RoadLink .
?roadlink otm:roadName ?name.
?roadlink otm:functionalRoadClass ?class.
?roadlink otm:centerLineGeometry ?geometry .
?geometry geo:asWKT ?x .
FILTER(bif:st_intersects (?x, bif:st_geomFromText("POLYGON((14.426647 50.0751251,14.426647 50.07685089,14.43054696
50.07685089,14.43054696 50.0751251,14.426647 50.0751251))"))) .
FILTER(STRSTARTS(STR(?class),"firstClass") ) .
}
}
}
}
}
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
26
Exploiting the Linked Data – search & navigate
• Faceted search endpoint: http://www.foodie-cloud.org/fct
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
27
Exploiting the Linked Data – visualisation
• Map visualisation: http://ng.hslayers.org/examples/foodie-zones/
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
28
Exploiting the Linked Data – visualisation
• Map visualisation: http://ng.hslayers.org/examples/produce-3d/
Object information
on click
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
29
Exploiting the Linked Data – visualisation
• Map visualisation: http://ng.hslayers.org/examples/olu_spoi
• OLU polygons colored by the number of SPOI that lie inside them
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
30
Exploiting the Linked Data – visualisation
• Map visualisation: http://app.hslayers.org/project-databio/land
• Usage scenarios:
• Find and show buffer zones around water bodies (user will specify the distance),
which define the areas within the fields with limited/restricted application of agro-
chemicals.
• select farm/fields based on the ID_UZ attribute from public CZ LPIS database and
search EO data over all fields
• visualize crop species based on the farm data (need parcels with crop types- not
available from open LPIS data)
• select fields with different soil types
• select all fields with certain crop in max distance from certain point (it could be for
logistic, distribution of biomass etc)
• show/select erosion zones for specific farm ID
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
31
Exploiting the Linked Data – visualisation
• Map visualisation: http://app.hslayers.org/project-databio/land
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
32
Exploiting the Linked Data – visualisation
• Metaphactory: http://foodie.metaphacts.cloud/resource/Start
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
33
Exploiting the Linked Data – visualisation
• Metaphactory: http://foodie.metaphacts.cloud/resource/Start
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
34
Exploiting the Linked Data – visualisation
• Metaphactory: http://foodie.metaphacts.cloud/resource/Start
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
35
Exploiting the Linked Data – visualisation
• Metaphactory: http://foodie.metaphacts.cloud/resource/Start
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
36
Security of RDF Data
• SPARQL endpoints are web services capable of providing Read-Only access to a back-end graph DBMS.
• SPARQL endpoints can be sometimes purpose-specific so their access privilege therefore must be limited
to some basic operations over the graph.
• The privileges provided by a given Virtuoso SPARQL endpoint include specific user identities with specific
database roles and privileges.
• Virtuoso offers three methods for securing SPARQL endpoints:
• Digest Authentication via SQL Accounts
• OAuth Protocol based Authentication
• WebID Protocol based Authentication
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
37
Linked data publication technologies overview
• Used technologies:
• D2RQ, Geotriples and R2MLProcessor for input
datasets as Virtual RDF Graphs
• RDF for the representation of data
• FOODIE (Farming ontology) providing the underlying
vocabulary and relations, plus a number of other
ontologies/vocabularies (existing and generated)
• Virtuoso for storing the semantic data
• Silk/LIMES for discovery of links
• Sparql for querying semantic data
• Hslayers NG for visualisation of data
• Metaphactory for visualisation of data
D2RQ
This document is part of a project that has received funding
from the European Union’s Horizon 2020 research and innovation programme
under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or
reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu.
38
Thank you for your attention!
Contact details

More Related Content

What's hot

Exposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueExposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueRaul Palma
 
BDE SC2 Workshop 3: DataBio
BDE SC2 Workshop 3: DataBioBDE SC2 Workshop 3: DataBio
BDE SC2 Workshop 3: DataBioBigData_Europe
 
Data bio big data worksop Brussels
Data bio big data worksop BrusselsData bio big data worksop Brussels
Data bio big data worksop BrusselsWirelessInfo
 
06 WP3 Fishery pilots
06 WP3 Fishery pilots06 WP3 Fishery pilots
06 WP3 Fishery pilotsplan4all
 
02 agriculture challenges, existing standardisation efforts and data bio agri...
02 agriculture challenges, existing standardisation efforts and data bio agri...02 agriculture challenges, existing standardisation efforts and data bio agri...
02 agriculture challenges, existing standardisation efforts and data bio agri...plan4all
 
01 introduction mildorf
01 introduction mildorf01 introduction mildorf
01 introduction mildorfplan4all
 
01 DataBio Workshop in Rome
01 DataBio Workshop in Rome01 DataBio Workshop in Rome
01 DataBio Workshop in Romeplan4all
 
The 2018 European Commission Data Package
The 2018 European Commission Data PackageThe 2018 European Commission Data Package
The 2018 European Commission Data PackageJean-François Dechamp
 
Open Data and Open Science in the European Commission
Open Data and Open Science in the European CommissionOpen Data and Open Science in the European Commission
Open Data and Open Science in the European CommissionJean-François Dechamp
 
H2020 big data and fiware an d iot
H2020 big data and fiware an d iotH2020 big data and fiware an d iot
H2020 big data and fiware an d iotWirelessInfo
 
Linked Data with hybrid services in Agriculture
Linked Data with hybrid services in AgricultureLinked Data with hybrid services in Agriculture
Linked Data with hybrid services in AgricultureRaul Palma
 
The European Commission's Open Data ambition (Marjan Grootveld) - EUDAT Summe...
The European Commission's Open Data ambition (Marjan Grootveld) - EUDAT Summe...The European Commission's Open Data ambition (Marjan Grootveld) - EUDAT Summe...
The European Commission's Open Data ambition (Marjan Grootveld) - EUDAT Summe...EUDAT
 
DMP exercise: linking data management activities to services - EUDAT Summer ...
DMP exercise: linking data management activities to services  - EUDAT Summer ...DMP exercise: linking data management activities to services  - EUDAT Summer ...
DMP exercise: linking data management activities to services - EUDAT Summer ...EUDAT
 
The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....
The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....
The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....EUDAT
 
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.eu
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.euHow FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.eu
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.euEUDAT
 
TRAINING OBJECTIVES
TRAINING OBJECTIVESTRAINING OBJECTIVES
TRAINING OBJECTIVESFAO
 
Cessda saw task4-6_haguefocusgrppresentation_0616
Cessda saw task4-6_haguefocusgrppresentation_0616Cessda saw task4-6_haguefocusgrppresentation_0616
Cessda saw task4-6_haguefocusgrppresentation_0616Neil Beagrie
 

What's hot (18)

Exposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch CatalogueExposing EO Linked (meta-)Data from OpenSearch Catalogue
Exposing EO Linked (meta-)Data from OpenSearch Catalogue
 
BDE SC2 Workshop 3: DataBio
BDE SC2 Workshop 3: DataBioBDE SC2 Workshop 3: DataBio
BDE SC2 Workshop 3: DataBio
 
Data bio big data worksop Brussels
Data bio big data worksop BrusselsData bio big data worksop Brussels
Data bio big data worksop Brussels
 
06 WP3 Fishery pilots
06 WP3 Fishery pilots06 WP3 Fishery pilots
06 WP3 Fishery pilots
 
02 agriculture challenges, existing standardisation efforts and data bio agri...
02 agriculture challenges, existing standardisation efforts and data bio agri...02 agriculture challenges, existing standardisation efforts and data bio agri...
02 agriculture challenges, existing standardisation efforts and data bio agri...
 
01 introduction mildorf
01 introduction mildorf01 introduction mildorf
01 introduction mildorf
 
Forestry Pilot
Forestry PilotForestry Pilot
Forestry Pilot
 
01 DataBio Workshop in Rome
01 DataBio Workshop in Rome01 DataBio Workshop in Rome
01 DataBio Workshop in Rome
 
The 2018 European Commission Data Package
The 2018 European Commission Data PackageThe 2018 European Commission Data Package
The 2018 European Commission Data Package
 
Open Data and Open Science in the European Commission
Open Data and Open Science in the European CommissionOpen Data and Open Science in the European Commission
Open Data and Open Science in the European Commission
 
H2020 big data and fiware an d iot
H2020 big data and fiware an d iotH2020 big data and fiware an d iot
H2020 big data and fiware an d iot
 
Linked Data with hybrid services in Agriculture
Linked Data with hybrid services in AgricultureLinked Data with hybrid services in Agriculture
Linked Data with hybrid services in Agriculture
 
The European Commission's Open Data ambition (Marjan Grootveld) - EUDAT Summe...
The European Commission's Open Data ambition (Marjan Grootveld) - EUDAT Summe...The European Commission's Open Data ambition (Marjan Grootveld) - EUDAT Summe...
The European Commission's Open Data ambition (Marjan Grootveld) - EUDAT Summe...
 
DMP exercise: linking data management activities to services - EUDAT Summer ...
DMP exercise: linking data management activities to services  - EUDAT Summer ...DMP exercise: linking data management activities to services  - EUDAT Summer ...
DMP exercise: linking data management activities to services - EUDAT Summer ...
 
The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....
The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....
The costs of making data FAIR (Marjan Grootveld) - EUDAT Summer School | www....
 
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.eu
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.euHow FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.eu
How FAIR are your data? (Sarah Jones) - EUDAT Summer School | www.eudat.eu
 
TRAINING OBJECTIVES
TRAINING OBJECTIVESTRAINING OBJECTIVES
TRAINING OBJECTIVES
 
Cessda saw task4-6_haguefocusgrppresentation_0616
Cessda saw task4-6_haguefocusgrppresentation_0616Cessda saw task4-6_haguefocusgrppresentation_0616
Cessda saw task4-6_haguefocusgrppresentation_0616
 

Similar to Linked data publication pipelines for agri-related use cases

BDV Webinar Series - Caj - Big Data Breakthroughs for Global Bio-economy Busi...
BDV Webinar Series - Caj - Big Data Breakthroughs for Global Bio-economy Busi...BDV Webinar Series - Caj - Big Data Breakthroughs for Global Bio-economy Busi...
BDV Webinar Series - Caj - Big Data Breakthroughs for Global Bio-economy Busi...Big Data Value Association
 
eROSA Policy WS1: Databio Project Overview
eROSA Policy WS1: Databio Project OvervieweROSA Policy WS1: Databio Project Overview
eROSA Policy WS1: Databio Project Overviewe-ROSA
 
BDV Webinar Series - Thanasis - Big Data Breakthroughs for Global Bio-economy...
BDV Webinar Series - Thanasis - Big Data Breakthroughs for Global Bio-economy...BDV Webinar Series - Thanasis - Big Data Breakthroughs for Global Bio-economy...
BDV Webinar Series - Thanasis - Big Data Breakthroughs for Global Bio-economy...Big Data Value Association
 
DataBio Architecture for Big Data and Big Data Visualisation
DataBio Architecture for Big Data and Big Data VisualisationDataBio Architecture for Big Data and Big Data Visualisation
DataBio Architecture for Big Data and Big Data Visualisationplan4all
 
BDV Webinar Series - Ephrem - Big Data Breakthroughs for Global Bio-economy B...
BDV Webinar Series - Ephrem - Big Data Breakthroughs for Global Bio-economy B...BDV Webinar Series - Ephrem - Big Data Breakthroughs for Global Bio-economy B...
BDV Webinar Series - Ephrem - Big Data Breakthroughs for Global Bio-economy B...Big Data Value Association
 
Iot and big data technologies for bio industry data bio
Iot and big data technologies for bio industry   data bioIot and big data technologies for bio industry   data bio
Iot and big data technologies for bio industry data bioWirelessInfo
 
A success story of applying big data in agriculture
A success story of applying big data in agricultureA success story of applying big data in agriculture
A success story of applying big data in agricultureBig Data Value Association
 
Big data and ai enhance production of bio resources. Aamu areena Caj Södergå...
Big data and ai enhance production of bio resources.  Aamu areena Caj Södergå...Big data and ai enhance production of bio resources.  Aamu areena Caj Södergå...
Big data and ai enhance production of bio resources. Aamu areena Caj Södergå...Caj Södergård
 
MEASURE H2020 project presented at OW2con'19, June 12-13, 2019, Paris.
 MEASURE H2020 project presented at OW2con'19, June 12-13, 2019, Paris.  MEASURE H2020 project presented at OW2con'19, June 12-13, 2019, Paris.
MEASURE H2020 project presented at OW2con'19, June 12-13, 2019, Paris. OW2
 
Linked Data Publication Pipelines for Agri-Related use cases
Linked Data Publication Pipelines for Agri-Related use casesLinked Data Publication Pipelines for Agri-Related use cases
Linked Data Publication Pipelines for Agri-Related use casesLeipziger Semantic Web Tag
 
E-infrastructure for open agri-food sciences: Vision & Roadmap
E-infrastructure for open agri-food sciences: Vision & RoadmapE-infrastructure for open agri-food sciences: Vision & Roadmap
E-infrastructure for open agri-food sciences: Vision & Roadmape-ROSA
 
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...PaNOSC
 
OpenAIRE factsheet: H2020 Open Data Pilot
OpenAIRE factsheet: H2020 Open Data PilotOpenAIRE factsheet: H2020 Open Data Pilot
OpenAIRE factsheet: H2020 Open Data PilotOpenAIRE
 
Data bio d6.2-data-management-plan_v1.0_2017-06-30_crea
Data bio d6.2-data-management-plan_v1.0_2017-06-30_creaData bio d6.2-data-management-plan_v1.0_2017-06-30_crea
Data bio d6.2-data-management-plan_v1.0_2017-06-30_creaWirelessInfo
 
B2FIND Overview February 2017 | www.eudat.eu |
B2FIND Overview February 2017 | www.eudat.eu | B2FIND Overview February 2017 | www.eudat.eu |
B2FIND Overview February 2017 | www.eudat.eu | EUDAT
 
Horizon 2020 Open Research Data Pilot, Jean-Claude Burgelman, DG RTD European...
Horizon 2020 Open Research Data Pilot, Jean-Claude Burgelman, DG RTD European...Horizon 2020 Open Research Data Pilot, Jean-Claude Burgelman, DG RTD European...
Horizon 2020 Open Research Data Pilot, Jean-Claude Burgelman, DG RTD European...OpenAIRE
 
TOOP project: Once Only Principle
TOOP project: Once Only PrincipleTOOP project: Once Only Principle
TOOP project: Once Only PrincipleSamos2019Summit
 

Similar to Linked data publication pipelines for agri-related use cases (20)

BDV Webinar Series - Caj - Big Data Breakthroughs for Global Bio-economy Busi...
BDV Webinar Series - Caj - Big Data Breakthroughs for Global Bio-economy Busi...BDV Webinar Series - Caj - Big Data Breakthroughs for Global Bio-economy Busi...
BDV Webinar Series - Caj - Big Data Breakthroughs for Global Bio-economy Busi...
 
eROSA Policy WS1: Databio Project Overview
eROSA Policy WS1: Databio Project OvervieweROSA Policy WS1: Databio Project Overview
eROSA Policy WS1: Databio Project Overview
 
BDV Webinar Series - Thanasis - Big Data Breakthroughs for Global Bio-economy...
BDV Webinar Series - Thanasis - Big Data Breakthroughs for Global Bio-economy...BDV Webinar Series - Thanasis - Big Data Breakthroughs for Global Bio-economy...
BDV Webinar Series - Thanasis - Big Data Breakthroughs for Global Bio-economy...
 
DataBio Architecture for Big Data and Big Data Visualisation
DataBio Architecture for Big Data and Big Data VisualisationDataBio Architecture for Big Data and Big Data Visualisation
DataBio Architecture for Big Data and Big Data Visualisation
 
Overview of data bio project results
Overview of data bio project resultsOverview of data bio project results
Overview of data bio project results
 
BDV Webinar Series - Ephrem - Big Data Breakthroughs for Global Bio-economy B...
BDV Webinar Series - Ephrem - Big Data Breakthroughs for Global Bio-economy B...BDV Webinar Series - Ephrem - Big Data Breakthroughs for Global Bio-economy B...
BDV Webinar Series - Ephrem - Big Data Breakthroughs for Global Bio-economy B...
 
Iot and big data technologies for bio industry data bio
Iot and big data technologies for bio industry   data bioIot and big data technologies for bio industry   data bio
Iot and big data technologies for bio industry data bio
 
A success story of applying big data in agriculture
A success story of applying big data in agricultureA success story of applying big data in agriculture
A success story of applying big data in agriculture
 
Big data and ai enhance production of bio resources. Aamu areena Caj Södergå...
Big data and ai enhance production of bio resources.  Aamu areena Caj Södergå...Big data and ai enhance production of bio resources.  Aamu areena Caj Södergå...
Big data and ai enhance production of bio resources. Aamu areena Caj Södergå...
 
MEASURE H2020 project presented at OW2con'19, June 12-13, 2019, Paris.
 MEASURE H2020 project presented at OW2con'19, June 12-13, 2019, Paris.  MEASURE H2020 project presented at OW2con'19, June 12-13, 2019, Paris.
MEASURE H2020 project presented at OW2con'19, June 12-13, 2019, Paris.
 
Linked Data Publication Pipelines for Agri-Related use cases
Linked Data Publication Pipelines for Agri-Related use casesLinked Data Publication Pipelines for Agri-Related use cases
Linked Data Publication Pipelines for Agri-Related use cases
 
European Data Spaces
European Data SpacesEuropean Data Spaces
European Data Spaces
 
E-infrastructure for open agri-food sciences: Vision & Roadmap
E-infrastructure for open agri-food sciences: Vision & RoadmapE-infrastructure for open agri-food sciences: Vision & Roadmap
E-infrastructure for open agri-food sciences: Vision & Roadmap
 
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
PaNOSC and Research Data Management / Battery2030+ Initiative Workshop / 12 M...
 
OpenAIRE factsheet: H2020 Open Data Pilot
OpenAIRE factsheet: H2020 Open Data PilotOpenAIRE factsheet: H2020 Open Data Pilot
OpenAIRE factsheet: H2020 Open Data Pilot
 
Open data pilot
Open data pilotOpen data pilot
Open data pilot
 
Data bio d6.2-data-management-plan_v1.0_2017-06-30_crea
Data bio d6.2-data-management-plan_v1.0_2017-06-30_creaData bio d6.2-data-management-plan_v1.0_2017-06-30_crea
Data bio d6.2-data-management-plan_v1.0_2017-06-30_crea
 
B2FIND Overview February 2017 | www.eudat.eu |
B2FIND Overview February 2017 | www.eudat.eu | B2FIND Overview February 2017 | www.eudat.eu |
B2FIND Overview February 2017 | www.eudat.eu |
 
Horizon 2020 Open Research Data Pilot, Jean-Claude Burgelman, DG RTD European...
Horizon 2020 Open Research Data Pilot, Jean-Claude Burgelman, DG RTD European...Horizon 2020 Open Research Data Pilot, Jean-Claude Burgelman, DG RTD European...
Horizon 2020 Open Research Data Pilot, Jean-Claude Burgelman, DG RTD European...
 
TOOP project: Once Only Principle
TOOP project: Once Only PrincipleTOOP project: Once Only Principle
TOOP project: Once Only Principle
 

More from Raul Palma

FDO as building block for digitization technology stacks
FDO as building block for digitization technology stacksFDO as building block for digitization technology stacks
FDO as building block for digitization technology stacksRaul Palma
 
RO-crate-FDO-ROHub
RO-crate-FDO-ROHubRO-crate-FDO-ROHub
RO-crate-FDO-ROHubRaul Palma
 
RELIANCE-reproducible-OS.pptx
RELIANCE-reproducible-OS.pptxRELIANCE-reproducible-OS.pptx
RELIANCE-reproducible-OS.pptxRaul Palma
 
RELIANCE-services-final.pptx
RELIANCE-services-final.pptxRELIANCE-services-final.pptx
RELIANCE-services-final.pptxRaul Palma
 
ROHub-Argos integration
ROHub-Argos integrationROHub-Argos integration
ROHub-Argos integrationRaul Palma
 
RELIANCE ROHub hackathon
RELIANCE ROHub hackathonRELIANCE ROHub hackathon
RELIANCE ROHub hackathonRaul Palma
 
Fostering the Smart Agriculture Development in North East Europe
Fostering the Smart Agriculture Development in North East EuropeFostering the Smart Agriculture Development in North East Europe
Fostering the Smart Agriculture Development in North East EuropeRaul Palma
 
Reliance project introduction
Reliance project introductionReliance project introduction
Reliance project introductionRaul Palma
 
Inspire hack 2017-linked-data
Inspire hack 2017-linked-dataInspire hack 2017-linked-data
Inspire hack 2017-linked-dataRaul Palma
 
Wielkopolska activities with potential to cluster to cluster collaboration EU...
Wielkopolska activities with potential to cluster to cluster collaboration EU...Wielkopolska activities with potential to cluster to cluster collaboration EU...
Wielkopolska activities with potential to cluster to cluster collaboration EU...Raul Palma
 
Towards the development of smart agriculture infrastructure in Wielkopolska r...
Towards the development of smart agriculture infrastructure in Wielkopolska r...Towards the development of smart agriculture infrastructure in Wielkopolska r...
Towards the development of smart agriculture infrastructure in Wielkopolska r...Raul Palma
 
An INSPIRE-based vocabulary for the publication of Agricultural Linked Data
An INSPIRE-based vocabulary for the publication of Agricultural Linked DataAn INSPIRE-based vocabulary for the publication of Agricultural Linked Data
An INSPIRE-based vocabulary for the publication of Agricultural Linked DataRaul Palma
 
Aspects of Reproducibility in Earth Science
Aspects of Reproducibility in Earth ScienceAspects of Reproducibility in Earth Science
Aspects of Reproducibility in Earth ScienceRaul Palma
 

More from Raul Palma (14)

FDO as building block for digitization technology stacks
FDO as building block for digitization technology stacksFDO as building block for digitization technology stacks
FDO as building block for digitization technology stacks
 
RO-crate-FDO-ROHub
RO-crate-FDO-ROHubRO-crate-FDO-ROHub
RO-crate-FDO-ROHub
 
RELIANCE-reproducible-OS.pptx
RELIANCE-reproducible-OS.pptxRELIANCE-reproducible-OS.pptx
RELIANCE-reproducible-OS.pptx
 
RELIANCE-services-final.pptx
RELIANCE-services-final.pptxRELIANCE-services-final.pptx
RELIANCE-services-final.pptx
 
ROHub-Argos integration
ROHub-Argos integrationROHub-Argos integration
ROHub-Argos integration
 
RELIANCE ROHub hackathon
RELIANCE ROHub hackathonRELIANCE ROHub hackathon
RELIANCE ROHub hackathon
 
Fostering the Smart Agriculture Development in North East Europe
Fostering the Smart Agriculture Development in North East EuropeFostering the Smart Agriculture Development in North East Europe
Fostering the Smart Agriculture Development in North East Europe
 
Reliance project introduction
Reliance project introductionReliance project introduction
Reliance project introduction
 
Inspire hack 2017-linked-data
Inspire hack 2017-linked-dataInspire hack 2017-linked-data
Inspire hack 2017-linked-data
 
Wielkopolska activities with potential to cluster to cluster collaboration EU...
Wielkopolska activities with potential to cluster to cluster collaboration EU...Wielkopolska activities with potential to cluster to cluster collaboration EU...
Wielkopolska activities with potential to cluster to cluster collaboration EU...
 
Towards the development of smart agriculture infrastructure in Wielkopolska r...
Towards the development of smart agriculture infrastructure in Wielkopolska r...Towards the development of smart agriculture infrastructure in Wielkopolska r...
Towards the development of smart agriculture infrastructure in Wielkopolska r...
 
An INSPIRE-based vocabulary for the publication of Agricultural Linked Data
An INSPIRE-based vocabulary for the publication of Agricultural Linked DataAn INSPIRE-based vocabulary for the publication of Agricultural Linked Data
An INSPIRE-based vocabulary for the publication of Agricultural Linked Data
 
ROHub
ROHubROHub
ROHub
 
Aspects of Reproducibility in Earth Science
Aspects of Reproducibility in Earth ScienceAspects of Reproducibility in Earth Science
Aspects of Reproducibility in Earth Science
 

Recently uploaded

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 

Recently uploaded (20)

Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 

Linked data publication pipelines for agri-related use cases

  • 1. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 1 This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 732064 This project is part of BDV PPP LINKED DATA PUBLICATION PIPELINES FOR AGRI-RELATED USE CASES Raul Palma1, Soumya Brahma1 , Marcin Krystek1, Karel Charvát2, Raitis Berzins2 1Poznan Supercomputing and Networking Center, Poland 2WirelessInfo, Czech Republic 7th Leipzig Semantic Web Day 22nd May, 2019
  • 2. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 2 Linked data publication • LD is increasingly becoming a popular method for publishing data on the Web • Improves data accessibility by both humans and machines, e.g., for finding, reuse and integration • Enables to discover more useful data through the links (and inferencing), and to exploit data with semantic queries • Growing number of datasets in the LOD cloud • 1,234 datasets with 16,136 links (as of June 2018) • Coverage of the LOD cloud • Large cross-domain datasets (dbpedia, freebase, etc.) • Variable domain coverage (e.g., Geography, Government, BioInformatics) • What about Agriculture? • “Just” few datasets (e.g., AGRIS biblio records, AGROVOC thesaurus + other thesaurus like NALT) • Farming and agri-activities related data? http://lod-cloud.net/
  • 3. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 3 Why is Linked Data relevant in Agriculture: Farming context • Farm management • Multiple activities and stakeholders • Multiple applications, tools and devices • Multiple data sources, types and formats • Challenge • To combine/integrate those different and heterogeneous data sources in order to make economically and environmentally sound decisions
  • 4. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 4 Data Integration in relevant projects (context) • Data integration challenges have been the focus of relevant projects EU FP7, ICT CIP, 2014- 2017 EU FP7, ICT CIP, 2014- 2017 FOODIE aimed at building an open and interoperable cloud-based platform addressing among others the integration of data relevant to farming production including their geo-spatial dimension, as well as their publication as Linked data. SDI4Apps aimed at building a cloud- based framework with open API for data integration focusing on the development of six pilot apps, drawing along the lines of INSPIRE, Copernicus and GEOSS DataBio aims at showcasing the benefits of Big Data technologies in the raw material production from agriculture & others for the bioeconomy industry; deploying an interoperable platform on top of the existing partners’ infrastructure. DataBio aims at delivering solutions for big data mgmt., including i) the storage and querying of various big data sources; ii) the harmonization and integration of a large variety of data from many sources, using linked data as a federated layer
  • 5. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 5 Linked data principles & guidelines • Simple set of principles & technologies • URI, HTTP, RDF, SPARQL • Involves a set of (common) tasks Datasets identification Model specification RDF data generation Linking Exploiting Hyland et al. Hausenblas et al. Villazón-Terrazas et al. Best Practices for Publishing Linked Data 5-star deployment scheme for Linked Open Data
  • 6. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 6 Implementing Linked Data publication pipelines
  • 7. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 7 Implementing Linked Data publication pipelines • Goal: to define and deploy (semi-) automatic processes to carry out the necessary steps to transform and publish different input datasets as Linked Data. • A pipeline connect different data processing components to carry out the transformation of data into RDF and their linking, and includes the mapping specifications to process the input datasets. • Each pipeline is configured to support specific input dataset types (same format, model and delivery form). • Principles • Pipelines can be directly re-executed and re-applied (e.g., extended/updated datasets) • Pipelines must be easily reusable • Pipelines must be easily adapted for new input datasets • Pipeline execution should be as automatic as possible. The final target is to fully automated processes. • Pipelines should support both: (mostly) static data and data streams (e.g., sensor data) • The resulting datasets available as Linked Data, will provide an integrated view over the initial (disconnected and heterogeneous) datasets, in compliance with any privacy and access control needs
  • 8. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 8 Use cases (examples) • UC1: Publication of farm related linked data from agri pilots in DataBio project, particularly Pilot 8 [B1.4] Cereals and biomass crops. • Goal: query and access heterogeneous agricultural data sources from Rostenice farm via an integrated layer in order to make informed decisions and discover new knowledge • UC2: Publication of Open EU/national datasets relevant for agri-food pilots as Linked Data • Goal: provide access to multiple, isolated data sources relevant for agri-pilots, and identify links with farm datasets, from a single integrated layer • UC3: Publication of sensor data as linked data on the fly from Pilot 9 [B2.1] Machinery management • Goal: provide access to sensor data integrated with other farm data and other relevant datasets
  • 9. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 9 Datasets identification and collection • UC1 Datasets • Farm data (Rostenice holding) that holds information about each field names with the associated cereal crop classifications and arranged by year. • Data about the field boundaries and crop map and yield potential of most of the fields in Rostenice farm from Czech Republic. • Yield records from two fields (Pivovarka, Predni) within the pilot farm that were harvested in 2017/2018. Source data types: shapefiles
  • 10. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 10 Datasets identification and collection • UC2 datasets • Czech datasets • Czech LPIS data showing the actual land boundaries. • Czech erosion zones (strongly/SEO and moderately / MEO erosion-endangered soil zones) • Czech water bodies (e.g., restricted area near to water bodies has 25m buffer according to the nitrate directive). • The data about soil types from all over Czech Republic. • Polish datasets • Polish LPIS data showing the cadastral land boundaries from all over the country. • European datasets • FADN (Farm Accountancy Data Network) data about the income of agricultural holdings and the impacts of the Common Agricultural Policy from all EU member states • Various open European geospatial datasets including • (part of) Open Land Use (OLU) • (part of) Open Transport Map (OTM) • Smart Points of Interest (SPOI), • (part of) Urban Atlas (pan-European comparable land use and land cover data for Large Urban Zones ) • (part of) CORINE Land Cover • HILUCS (Hierarchical INSPIRE Land Use Classification System ) • Experimental sample dataset from the review platform Yelp (global coverage). The data contents are regarding the geographical location of a business, review and reviewer information. Source data types: shapefiles, JSON, CSV, relational databases
  • 11. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 11 Datasets identification and collection • UC3 dataset • Sensor data stored and collected by Senslog platform, including the readings of IoT devices on tractors. • Data updates is high, reading coming frequently Source data types: relational databases
  • 12. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 12 Data models • Various ontologies/vocabularies were selected and reused/extended for the representation of data • For farm-related data, INSPIRE based FOODIE ontology has been selected and extended as needed • For the land parcel and cadastral data (for Czech republic & Poland), erosion-endangered soil zones, water buffer and soil type classification, also FOODIE ontology and in some cases its extensions were used for modelling the classes and properties. • In case of the FADN the main ontologies used were Data Cube Vocabulary and its SDMX ISO standard extensions that were much effective in aligning such multidimensional data. Moreover the Data Cube Vocabulary encompasses well known RDF vocabularies like SKOS, SCOVO, VoiD, FOAF, Dublin Core, etc. • For the Yelp dataset various ontologies like review, FOAF, schema.org, POI, etc. were used to represent the classes and the properties identified from the input data sources. • For some other datasets (e.g., corine, hilucs, olu, otm, urban atlas) simple ontologies/voabularies were generated in line with standards and are available in https://github.com/FOODIE-cloud/ontology. • For sensor data: Semantic Sensor Network (SSN) along with the SOSA (Sensor, Observation, Sample, and Actuator) ontology for describing sensors and their observations, the involved procedures, the studied features of interest, etc. Additionally, Data Cube Vocabulary and its SDMX ISO standard extensions for multidimensional data
  • 13. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 13 Data model for farming data • (FOODIE) farming data model principles • application vocabulary covering the different categories of information dealt by the farm mgmt. tools/apps (in FOODIE) • in line with existing standards and best practices • Resulting model* • Builds on the INSPIRE AF specification for agricultural data, and • the INSPIRE specification for themes in annex I for geospatial data, based on • ISO/OGC standards for geographical information • Created as an UML model *consulted with experts from various institutions, e.g., EU DG JRC, EU Global Navigation Satellite Systems Agency (GSA), Czech Ministry of Agriculture, Global Earth Observation System of Systems (GEOSS), German Kuratorium für Technik und Bauwesen in der Landwirtschaft (KTBL). Challenge: Transform model into OWL ontology
  • 14. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 14 Transformation of model into OWL ontology Palma R., Reznik T., Esbri M., Charvat K., Mazurek C., An INSPIRE- based vocabulary for the publication of Agricultural Linked Data. Proceedings of the OWLED Workshop: collocated with the ISWC- 2015, Bethlehem PA, USA, October 11-15, 2015 ShapeChange implements ISO 19150-2 standard rules for mapping ISO geographic information UML models to OWL ontologies. semi-automatic process: besides transformation configuration, additional pre and post processing task were needed
  • 15. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 15 FOODIE ontology - overview • Overall structure (ShapeChange output) • UML featureTypes and dataTypes modelled as classes, and their attributes as datatype or object properties • UML codeLists modelled as classes/concepts, and their attributes as concept members • Cardinalities restrictions defined on properties (exactly, min, max) • DataType properties ranges defined according to model/mappings • Object properties ranges defined according to model/mappings • Object properties inverseOf defined Top hierarchy FeatureType hierarchy Codelist hierarchy Datatype hierarchy
  • 16. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 16 FOODIE ontology – main classes overview • We found the lack of a feature on a more detailed level than Site that is already part of the INSPIRE AF data model. • Main concept: Plot • Represents a continuous area of agricultural land with one type of crop species, cultivated by one user in one farming mode • Two kinds of data associated: • metadata information • agro-related information  Next level: Management Zone • Enables a more precise description of the land characteristics in fine-grained area foodie:Plot INSPIRE-AF:Site foodie:Alert Foodie:InterventionFoodie:CropSpecies Foodie:ManagementZone containsPlot containsManagementZone interventionPlotspeciesPlot alertPlot plotAlert Foodie:ProductionType production Foodie:SoilNutrients zoneNutrients
  • 17. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 17 FOODIE ontology – main classes overview • The Intervention is the basic feature type for any kind of (farming) application with explicitly defined geometry, e.g., tillage or pruning. • Has multiple indirect associations with different concepts Foodie:Intervention Foodie:Treatment Foodie:TreatmentPlan Foodie:Product Foodie:ProductPreparation Foodie:ActiveIngredients is-a plan productPlan planProduct preparationProduct preparation productTreatment treatmentProduct preparationPlan ingredientProduct Foodie:FormOfT reatmentValue Foodie:Treatme ntPurposeValue formOfTreatmenttreatmentPurpose
  • 18. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 18 RDF data generation • Different tools were deployed, configured and used for the generation of RDF data, depending on the source data type • D2RQ: mainly for relational databases • Geotriples: mainly for shapefiles • R2MLprocessor: mainly for JSON, CSV data sources • All these tools require a mapping file (in RDF) specifying how to map the data source elements to the target ontology concepts and properties. • Mapping specifications use R2RML (RDB to RDF Mapping Language) and/or extensions (e.g., RDF Mapping language (RML))
  • 19. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 19 RDF data generation • The mapping file also specifies the connection details for the source dataset • Based on the mapping file, the data source (e.g., database content, shapefile, JSON, CSV, etc.) is either • i) dumped to an RDF file; or • ii) transformed on the fly as a virtual RDF graph (e.g., for data streaming) • RDF files were loaded into Virtuoso triplestore (FOODIE endpoint)
  • 20. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 20 Linking the generated RDF datasets • In order to link the resulting RDF datasets with other datasets, we follow different approaches: • Apply existing tools like Silk or LIMES to discover equivalence relations • For other relations use queries (e.g., geospatial) to access the integrated data as per need. • In our experiments with equivalence relations; however we also had to do some manual entries • We found issues in handling large datasets in Silk, specially those accessed via SPARQL endpoint that we cannot control There were recent optimizations of LIMES and a new tool for geospatial linking (from Leipzig) that we plan to test
  • 21. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 21 Triplestore statistics (examples) Dataset Name Graph in FOODIE endpoint Source Triples OLU** http://w3id.org/foodie/olu# Transformed from PostgreSQL 127,926,060 SPOI http://www.sdi4apps.eu/poi.rdf Provided by WRLS (also available in FOODIE endpoint) 407,629,170 NUTS http://nuts.geovocab.org/ Open Source (available in FOODIE endpoint) 316,238 OTM*** http://w3id.org/foodie/otm# Transformed from PostgreSQL 154,340,785 Yelp academic dataset http://data.yelp.com/academic_dataset# Transformed from JSON 86,348,185 LPIS data (CZ) http://w3id.org/foodie/open/cz/pLPIS_18 0616_WGS# Transformed from shapefile 24,491,282 FADN http://ec.europa.eu/agriculture/FADN/{d ataset}# Transformed from CSV 25,457,255 Pilot 8 farm data private Transformed from shapefile 1,569,439 Total: over 1 bilion triples! FOODIE triplestore is one of the largest semantic repositories related to agriculture, which has been recognized by the EC innovation radar as „arable farming data integrator for Smart Farming”
  • 22. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 22 Exploiting the Linked Data – querying triplestore • Sparql endpoint: https://www.foodie-cloud.org/sparql
  • 23. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 23 Exploiting the Linked Data – querying sensor data as virtual RDF graph • Sparql endpoint: http://senslogrdf.foodie-cloud.org/sparql • SNORQL search endpoint: http://senslogrdf.foodie-cloud.org/snorql/ • Web-based visualization: http://senslogrdf.foodie-cloud.org/
  • 24. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 24 Exploiting the Linked Data – querying examples(1-2 linked datasets) • get information of Points Of Interest in a given polygon (one dataset) • get SPOIs for given Open Land Use parcel (linking 2 datasets) SELECT * FROM <http://www.sdi4apps.eu/poi.rdf> WHERE { ?Resource rdfs:label ?Label . ?Resource poi:class ?POI_Class . ?Resource geo:asWKT ?Coordinates . FILTER(bif:st_intersects (?Coordinates, bif:st_geomFromText("POLYGON((6.11553983198 54.438016608357, 6.95050076948 47.230985358357, 13.36651639448 47.626493170857, 14.99249295698 54.701688483357, 6.11553983198 54.438016608357))"))) . } SELECT * FROM <http://www.sdi4apps.eu/poi.rdf> WHERE { ?Resource rdfs:label ?Label . ?Resource poi:class ?POI_Class . ?Resource geo:asWKT ?Coordinates . FILTER(bif:st_intersects (?Coordinates, bif:st_geomFromText(?coordinates))) . { SELECT bif:st_astext(?x) as ?coordinates FROM <http://w3id.org/foodie/olu#> WHERE { olu-instance: geo:hasGeometry ?geometry. ?geometry geo:asWKT ?x } } }
  • 25. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 25 Exploiting the Linked Data – querying examples (3+ linked datasets) • Show me all the land parcels (OLU) that have hotels (SPOI) and that lie not more than 50 meters away from the major highway (OTM) (linking 3 datasets)? SELECT DISTINCT ?olu ?hilucs ?source ?municode ?specificLandUse FROM <http://w3id.org/foodie/olu#> WHERE { ?olu a olu:LandUse . ?olu geo:hasGeometry ?geometry . ?olu olu:hilucsLandUse ?hilucs . ?olu olu:geometrySource ?source . OPTIONAL {?olu olu:municipalCode ?municode} . OPTIONAL {?olu olu:specificLandUse ?specificLandUse} . ?geometry geo:asWKT ?coordinatesOLU . FILTER(bif:st_within(bif:st_geomFromText(?coordinatesPOI),?coordinatesOLU)). { SELECT DISTINCT ?Resource, ?Label, bif:st_astext(?coordinatesPOIa) as ?coordinatesPOI FROM <http://www.sdi4apps.eu/poi.rdf> WHERE { ?Resource rdfs:label ?Label . ?Resource poi:class <http://gis.zcu.cz/SPOI/Ontology#lodging> . ?Resource geo:asWKT ?coordinatesPOIa . FILTER(bif:st_within(?coordinatesPOIa,bif:st_geomFromText(?coordinatesOTM),0.00045)) . { SELECT bif:st_astext(?x) as ?coordinatesOTM FROM <http://w3id.org/foodie/otm#> WHERE { ?roadlink a otm:RoadLink . ?roadlink otm:roadName ?name. ?roadlink otm:functionalRoadClass ?class. ?roadlink otm:centerLineGeometry ?geometry . ?geometry geo:asWKT ?x . FILTER(bif:st_intersects (?x, bif:st_geomFromText("POLYGON((14.426647 50.0751251,14.426647 50.07685089,14.43054696 50.07685089,14.43054696 50.0751251,14.426647 50.0751251))"))) . FILTER(STRSTARTS(STR(?class),"firstClass") ) . } } } } }
  • 26. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 26 Exploiting the Linked Data – search & navigate • Faceted search endpoint: http://www.foodie-cloud.org/fct
  • 27. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 27 Exploiting the Linked Data – visualisation • Map visualisation: http://ng.hslayers.org/examples/foodie-zones/
  • 28. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 28 Exploiting the Linked Data – visualisation • Map visualisation: http://ng.hslayers.org/examples/produce-3d/ Object information on click
  • 29. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 29 Exploiting the Linked Data – visualisation • Map visualisation: http://ng.hslayers.org/examples/olu_spoi • OLU polygons colored by the number of SPOI that lie inside them
  • 30. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 30 Exploiting the Linked Data – visualisation • Map visualisation: http://app.hslayers.org/project-databio/land • Usage scenarios: • Find and show buffer zones around water bodies (user will specify the distance), which define the areas within the fields with limited/restricted application of agro- chemicals. • select farm/fields based on the ID_UZ attribute from public CZ LPIS database and search EO data over all fields • visualize crop species based on the farm data (need parcels with crop types- not available from open LPIS data) • select fields with different soil types • select all fields with certain crop in max distance from certain point (it could be for logistic, distribution of biomass etc) • show/select erosion zones for specific farm ID
  • 31. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 31 Exploiting the Linked Data – visualisation • Map visualisation: http://app.hslayers.org/project-databio/land
  • 32. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 32 Exploiting the Linked Data – visualisation • Metaphactory: http://foodie.metaphacts.cloud/resource/Start
  • 33. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 33 Exploiting the Linked Data – visualisation • Metaphactory: http://foodie.metaphacts.cloud/resource/Start
  • 34. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 34 Exploiting the Linked Data – visualisation • Metaphactory: http://foodie.metaphacts.cloud/resource/Start
  • 35. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 35 Exploiting the Linked Data – visualisation • Metaphactory: http://foodie.metaphacts.cloud/resource/Start
  • 36. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 36 Security of RDF Data • SPARQL endpoints are web services capable of providing Read-Only access to a back-end graph DBMS. • SPARQL endpoints can be sometimes purpose-specific so their access privilege therefore must be limited to some basic operations over the graph. • The privileges provided by a given Virtuoso SPARQL endpoint include specific user identities with specific database roles and privileges. • Virtuoso offers three methods for securing SPARQL endpoints: • Digest Authentication via SQL Accounts • OAuth Protocol based Authentication • WebID Protocol based Authentication
  • 37. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 37 Linked data publication technologies overview • Used technologies: • D2RQ, Geotriples and R2MLProcessor for input datasets as Virtual RDF Graphs • RDF for the representation of data • FOODIE (Farming ontology) providing the underlying vocabulary and relations, plus a number of other ontologies/vocabularies (existing and generated) • Virtuoso for storing the semantic data • Silk/LIMES for discovery of links • Sparql for querying semantic data • Hslayers NG for visualisation of data • Metaphactory for visualisation of data D2RQ
  • 38. This document is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. Find us at www.databio.eu. 38 Thank you for your attention! Contact details

Editor's Notes

  1. Typical farm activities carried out by farmers include the monitoring field operations, managing the finances and applying for subsidies, depending on different software applications. Farmers need to use different tools to manage monitoring and data acquisition on‐line in the field. They need to analyse information related to subsidies, and to communicate with tax offices, product resellers etc. the different stakeholders groups involved in the agricultural activities need integrated access to multiple and heterogeneous sources of information collected by multiple applications and devices. The different groups of stakeholders involved in the agricultural activities have to manage many different and heterogeneous sources of information that need to be combined in order to make economically and environmentally sound decisions, which include (among others) the definition of policies (subsidies, standardisation and regulation, national strategies for rural development, climate change), valuation of ecological performances, development of sustainable agriculture, crop recollection timing and pricing, plagues detection, etc.
  2. from agriculture, forestry and fishery to produce food, energy and biomaterials responsibly and sustainably. , to enable users with different profiles to benefit from the underlying HPC capabilities.
  3. INSPIRE directive is an EU initiative that aims at building a Pan-European spatial data infrastructure (SDI) requiring EU Member States to make available spatial data, from multiple thematic areas, according to established implementing rules using appropriate services. Based on ISO/OGC standards for geographical information, i.e., ISO 19100 series standards AF: Agricultural and aquaculture facilities. The theme Agricultural and Aquaculture Facilities concerns the description of all the physical instruments and constructions with permanent or semi-permanent emplacement (inland or outland) that are related to agricultural and aquaculture activities.
  4. For the purposes of FOODIE, we found the lack of a feature on a more detailed level than Site that is already part of the INSPIRE AF data model. Main concept: Plot Represents a continuous area of agricultural land with one type of crop species, cultivated by one user in one farming mode Two kinds of data associated: metadata information agro-related information Next concept: Management Zone Enables a more precise description of the land characteristics in fine-grained area The Intervention is the basic feature type for any kind of (farming) application with explicitly defined geometry, e.g., tillage or pruning. Has multiple indirect associations with different concepts Pre-processing Source model preparation ShapeChange tool configuration: encoding rules; mappings UML classes - OWL elements; namespaces definition Base ontologies fixes (INPSIRE common, ISO 19100 series standards) Post-processing tasks Manual fixes in the ontology Manual creation of ontology elements of the base INSPIRE schemas (AF) ShapeChange output UML featureTypes and dataTypes modelled as classes, and their attributes as datatype or object properties UML codeLists modelled as classes/concepts, and their attributes as concept members Cardinalities restrictions defined on properties (exactly, min, max) DataType properties ranges defined according to model/mappings Object properties ranges defined according to model/mappings Object properties inverseOf defined