1. CIARD - creating a global framework for
information sharing in agricultural
research and innovation
Dr. Johannes Keizer
Office of Knowledge Exchange, Research and Extension
Food and Agriculture Organization of the UN
Keynote at Knowledge Technology Week
Kuala Lumpur, 2011, July 20
Building the CIARD Framework for Data and Information Sharing
Praha, July 12, - johannes keizer
2. “... FAO’s principal task is to
work to ensure that the world’s
knowledge of food and
agriculture is available to
those who need it when they
need it and in a form which
they can access and use ...”
3. More scientific data will be generated
in the next 5 years than in the
entire history of humankind
4. Contribution and Participation in Science
Territory size shows proportion of scientific papers published in
2001 by authors living there.
Copyright SASI Group (University of Sheffield) and Mark Newman (University of Michigan)
8. The Internet!
9. Aggregation States of Knowledge
10. Data and Information in Agricultural
Research and Extension
11. Distributed Repositories
• stats
• gene banks
• GIS data
• blogs
• journals
• open archives
• raw data
• technologies
• learning objects
• ………..
12. Task 1: making services
? ? ?
13. Task 2: getting knowledge
? ? ?
14. Task 3: working together
? ? ?
How can I get, in real time on my desktop, all the specimen data
on useful insects from everyone doing research on them?
How can I share my own data in real time with
colleagues working on the same topic?
17. The Project: agINFRA
Enforce web publishing of data
Produce linked open data from all datasets
Use common reference vocabularies to interlink datasets
Don’t wait! Wrap the legacy
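A minimal sketch of what "wrapping the legacy" could look like: a legacy record is turned into subject/predicate/object triples, and its free-text keywords are replaced by concept URIs from a shared vocabulary. The function name, the keyword lookup table and the example.org base URI are assumptions made for illustration; only the Dublin Core Terms namespace and the AGROVOC URI pattern come from the slides.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

DCT = "http://purl.org/dc/terms/"
AGROVOC = "http://aims.fao.org/aos/agrovoc/"

# Hypothetical lookup from legacy keyword strings to AGROVOC concept URIs.
# c_12332 is used for "maize", following the concept-scheme diagram in this deck.
KEYWORD_TO_CONCEPT: Dict[str, str] = {
    "maize": AGROVOC + "c_12332",
    "corn": AGROVOC + "c_12332",
}


def triplify(record_id: str, title: str, keywords: List[str], base_uri: str) -> List[Triple]:
    """Wrap one legacy record as triples, linking keywords to shared concepts."""
    subject = base_uri + record_id
    triples: List[Triple] = [(subject, DCT + "title", title)]
    for keyword in keywords:
        concept = KEYWORD_TO_CONCEPT.get(keyword.strip().lower())
        # Keywords without a known concept are skipped rather than guessed.
        if concept is not None:
            triples.append((subject, DCT + "subject", concept))
    return triples


if __name__ == "__main__":
    for triple in triplify("r1", "Maize trials", ["Maize", "unknown term"],
                           "http://example.org/record/"):
        print(triple)
```

The legacy system itself stays untouched; the wrapper only reads its records and emits triples, which is why it can be set up without waiting for the source database to be redesigned.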
18. The Infrastructure elements
VocBench: vocabulary server (concepts and entities triples)
RING: routemap to information nodes and gateways
Tools: LOD enabled software
Cloud Data Services: storage for RDF triples
Webservices + APIs to triple stores
LOD Generator: triplifier, concept and entity identifier
19. LOD Generator
LOD Generator process: triplifier, concept and entity identifier
21. Under Construction !!!!!
VocBench
AGROVOC Linked Open Data
AgroTagger
Triplifying AGRIS
Serendipity linking
Drupal front ends for triple stores
The CIARD R.I.N.G
22. The Infrastructure elements (overview diagram repeated from slide 18)
23. The VocBench
VocBench: concepts and entities triples
24. VocBench Features
Domain independent
Structure independent (e.g. thesauri, glossaries, etc.)
Supports RDF (SKOS, SKOS-XL), OWL
Supports collaborative editing
Supports editorial workflow, with user roles
Simple and advanced search
Supports data export: SKOS, relational format (MySQL)
26. The AGROVOC concept scheme
[Diagram: AGROVOC concepts (shown as 6211, 8171, 1474 and 12332) are SKOS concepts linked to one another by skos:broader and attached to the AGROVOC scheme through skos:inScheme and skos:topConceptOf. Other schemes in FAO are attached through skos:inScheme as well.]
[Labels are SKOS-XL labels: concept 12332 has skosxl:prefLabel :bar with skosxl:literalForm “maize” and skosxl:altLabel :foo with skosxl:literalForm “corn”. The labels :bar and :foo are related by has_synonym, and has_translation points to maïs (fr).]
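The label part of this model can be sketched as plain triples. This is an illustrative reading of the diagram, not the AGROVOC data itself: the two label URIs under example.org stand in for the :bar and :foo placeholders on the slide, and the helper function is hypothetical. The SKOS and SKOS-XL namespaces and property names are the standard W3C ones.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

CONCEPT = "http://aims.fao.org/aos/agrovoc/c_12332"
# Placeholder label URIs standing in for ":bar" and ":foo" on the slide.
LABEL_BAR = "http://example.org/label/bar"
LABEL_FOO = "http://example.org/label/foo"

TRIPLES: Set[Triple] = {
    (CONCEPT, RDF_TYPE, SKOS + "Concept"),
    (CONCEPT, SKOSXL + "prefLabel", LABEL_BAR),
    (CONCEPT, SKOSXL + "altLabel", LABEL_FOO),
    (LABEL_BAR, RDF_TYPE, SKOSXL + "Label"),
    (LABEL_BAR, SKOSXL + "literalForm", "maize"),
    (LABEL_FOO, RDF_TYPE, SKOSXL + "Label"),
    (LABEL_FOO, SKOSXL + "literalForm", "corn"),
}


def literal_forms(concept: str, label_property: str, triples: Set[Triple]) -> List[str]:
    """Follow concept -> label resource -> literal form for one label property."""
    labels = {o for s, p, o in triples if s == concept and p == label_property}
    return sorted(o for s, p, o in triples
                  if s in labels and p == SKOSXL + "literalForm")


if __name__ == "__main__":
    print(literal_forms(CONCEPT, SKOSXL + "prefLabel", TRIPLES))
    print(literal_forms(CONCEPT, SKOSXL + "altLabel", TRIPLES))
```

Because each label is a resource of its own rather than a bare string, relations such as has_synonym or has_translation can be stated between labels, which is the reason for using SKOS-XL here.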
29. The Infrastructure elements (overview diagram repeated from slide 18)
30. AgroTagger
• Identifies concepts in unstructured texts
• Uses AGROVOC as a controlled vocabulary
• Prototype under testing with excellent results
(entire repository of ICARDA indexed)
• Will in future produce structured RDF files
that can be used to link data, as “Open Calais” does
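To make "concept identification against a controlled vocabulary" concrete, here is a deliberately naive dictionary-matching sketch. It is not AgroTagger's algorithm, which the slides do not describe; the label table is a tiny hypothetical excerpt, and the "hazardous waste" URI under example.org is a placeholder rather than a real AGROVOC identifier.

```python
import re
from typing import Dict, List

AGROVOC = "http://aims.fao.org/aos/agrovoc/"

# Tiny hypothetical label table; a real run would load the full vocabulary.
LABEL_TO_CONCEPT: Dict[str, str] = {
    "maize": AGROVOC + "c_12332",
    "corn": AGROVOC + "c_12332",
    "hazardous waste": "http://example.org/concept/hazardous-waste",
}


def tag_text(text: str, label_to_concept: Dict[str, str], max_words: int = 3) -> List[str]:
    """Return the sorted concept URIs whose labels occur in the text."""
    words = re.findall(r"\w+", text.lower())
    found = set()
    # Try every word window up to max_words long, so multi-word labels match.
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            phrase = " ".join(words[start:start + size])
            concept = label_to_concept.get(phrase)
            if concept is not None:
                found.add(concept)
    return sorted(found)


if __name__ == "__main__":
    sample = "Corn and maize yields near hazardous waste sites"
    for uri in tag_text(sample, LABEL_TO_CONCEPT):
        print(uri)
```

Synonyms collapse onto one concept URI, so "corn" and "maize" yield a single tag. That is the property that makes the output usable for linking, whatever matching technique is used underneath.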
33. AGRIS Journal disambiguation
2,644,818 AGRIS records
2,171,113 records are journal records (82.09%)
1,788,083 journal records have been covered by the disambiguation process (82.35%)
14,658 journals have been correctly disambiguated
~20,000 strings still have to be examined: they refer to journal titles
Triples have been generated:
34. Triplifying AGRIS (small example)
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:ags="http://purl.org/agmes/1.1/" xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:dct="http://purl.org/dc/terms/">
<!-- One disambiguated journal: a single URI carrying all title variants found in AGRIS -->
<bibo:Journal rdf:about="http://aims.fao.org/aos/journal/c_b6e4ca85">
<bibo:ISSN>0101-9066</bibo:ISSN>
<dct:title><![CDATA[Circular técnica]]></dct:title>
<dct:alternative><![CDATA[Circular técnica (Centro Nacional de Pesquisa de Seringueira e Dendê)]]></dct:alternative>
<dct:alternative><![CDATA[Circular Tecnica - Centro Nacional de Pesquisa da Seringueira e Dende]]></dct:alternative>
<dct:alternative><![CDATA[Circular técnica - CNPSD]]></dct:alternative>
<dct:alternative><![CDATA[Circ. téc.]]></dct:alternative>
<ags:publisherPlace rdf:resource="http://aims.fao.org/aos/geopolitical.owl#Brazil"/>
<dct:publisher><![CDATA[Empresa Brasileira de Pesquisa Agropecuária, Centro Nacional de Pesquisa de Seringueira e Dendê]]></dct:publisher>
<dct:language>por</dct:language>
<dct:date>1980</dct:date>
<!-- Subjects are AGROVOC concept URIs, not free-text keywords -->
<dct:subject rdf:resource="http://aims.fao.org/aos/agrovoc/c_10795"/>
<dct:subject rdf:resource="http://aims.fao.org/aos/agrovoc/c_4650"/>
<dct:subject rdf:resource="http://aims.fao.org/aos/agrovoc/c_32372"/>
<dct:subject rdf:resource="http://aims.fao.org/aos/agrovoc/c_332"/>
<dct:subject rdf:resource="http://aims.fao.org/aos/agrovoc/c_3589"/>
<dct:subject rdf:resource="http://aims.fao.org/aos/agrovoc/c_5556"/>
</bibo:Journal>
</rdf:RDF>
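As a rough illustration of how such a record reads as triples, the sketch below parses a shortened copy of the journal description with Python's standard XML parser and lists one (subject, predicate, object) tuple per child element. It is a simplification that only handles this flat record shape, not a general RDF/XML parser; the function name is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bibo": "http://purl.org/ontology/bibo/",
    "dct": "http://purl.org/dc/terms/",
}

# Shortened copy of the slide's record, kept well-formed.
RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bibo="http://purl.org/ontology/bibo/"
         xmlns:dct="http://purl.org/dc/terms/">
  <bibo:Journal rdf:about="http://aims.fao.org/aos/journal/c_b6e4ca85">
    <bibo:ISSN>0101-9066</bibo:ISSN>
    <dct:title><![CDATA[Circular técnica]]></dct:title>
    <dct:alternative><![CDATA[Circ. téc.]]></dct:alternative>
    <dct:subject rdf:resource="http://aims.fao.org/aos/agrovoc/c_10795"/>
    <dct:subject rdf:resource="http://aims.fao.org/aos/agrovoc/c_4650"/>
  </bibo:Journal>
</rdf:RDF>
"""


def journal_triples(rdf_xml: str) -> List[Tuple[str, str, str]]:
    """List one triple per property element of each bibo:Journal."""
    root = ET.fromstring(rdf_xml.strip())
    rdf = "{%s}" % NS["rdf"]
    triples = []
    for journal in root.findall("bibo:Journal", NS):
        subject = journal.get(rdf + "about")
        for child in journal:
            # ElementTree tags look like "{namespace}local"; rebuild the full URI.
            namespace, _, local = child.tag[1:].partition("}")
            obj = child.get(rdf + "resource") or (child.text or "").strip()
            triples.append((subject, namespace + local, obj))
    return triples


if __name__ == "__main__":
    for triple in journal_triples(RECORD):
        print(triple)
```

Every title variant hangs off the same journal URI, which is the visible result of the disambiguation step on the previous slide.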
35. The Infrastructure elements (overview diagram repeated from slide 18)
37. Semantic Linking
http://aims.fao.org/aos/agrovoc/c_7825
38. Semantic Linking
http://aims.fao.org/aos/agrovoc/c_7825
http://eurovoc.europa.eu/218754
39. http://aims.fao.org/aos/agrovoc/c_7825
http://eurovoc.europa.eu/218754
40. http://aims.fao.org/aos/agrovoc/c_7825
http://agclass.nal.usda.gov/nalt/2011.xml#1780
http://eurovoc.europa.eu/218754
41. Linking data through common URIs
Concept: TOXIC SUBSTANCES
AGROVOC: http://aims.fao.org/aos/agrovoc/c_7825
NALT: http://agclass.nal.usda.gov/nalt/2011.xml#1780
Eurovoc: http://eurovoc.europa.eu/218754
UNBIS: Toxic Substances
Linked resources:
http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026
http://www.agnic.org/search/CAT85822953
http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF
http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon
Mapping example: http://aims.fao.org/aos/agrovoc/c_12332 owl:sameAs / skos:exactMatch http://eurovoc.europa.eu/219871
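The linking idea on this slide can be sketched as follows: concepts from different vocabularies that are declared equivalent form one cluster, and any document indexed with any member of the cluster becomes reachable from any other member. The three concept URIs and the document URLs are taken from the slide; which document is indexed with which vocabulary is partly an assumption (the AgNIC/NALT pairing in particular), and the function names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

AGROVOC_C = "http://aims.fao.org/aos/agrovoc/c_7825"
EUROVOC_C = "http://eurovoc.europa.eu/218754"
NALT_C = "http://agclass.nal.usda.gov/nalt/2011.xml#1780"

# Equivalence links between vocabularies (owl:sameAs or skos:exactMatch).
EXACT_MATCH: List[Tuple[str, str]] = [
    (AGROVOC_C, EUROVOC_C),
    (AGROVOC_C, NALT_C),
]

# Document -> concept it is indexed with (pairings partly assumed).
INDEXED_WITH: Dict[str, str] = {
    "http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026": AGROVOC_C,
    "http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF": EUROVOC_C,
    "http://www.agnic.org/search/CAT85822953": NALT_C,
}


def equivalent_concepts(start: str, links: Iterable[Tuple[str, str]]) -> Set[str]:
    """Breadth-first walk over symmetric equivalence links."""
    neighbours: Dict[str, Set[str]] = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


def related_documents(concept: str, links: Iterable[Tuple[str, str]],
                      indexed_with: Dict[str, str]) -> List[str]:
    """Documents indexed with the concept or any equivalent concept."""
    cluster = equivalent_concepts(concept, links)
    return sorted(doc for doc, c in indexed_with.items() if c in cluster)


if __name__ == "__main__":
    for doc in related_documents(EUROVOC_C, EXACT_MATCH, INDEXED_WITH):
        print(doc)
```

Starting from the Eurovoc concept, the walk reaches the AGROVOC and NALT concepts and therefore all three documents. Adding a UNBIS concept URI to the link list would pull in UN documents in the same way.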
44. The Infrastructure elements (overview diagram repeated from slide 18)
45. The CIARD RING
Roadmap to information nodes and
gateways
Community switchboard to find data
sources
Not only registry, but dynamic
instrument for data linking
46. RING - Charts and numbers
http://ring.ciard.net
47. RING – Numbers
http://ring.ciard.net/totals
Number of documents potentially reachable through the services registered in the RING.
Types of service considered: document repositories and bibliographic databases.
48. The Infrastructure elements (overview diagram repeated from slide 18)
50. The AIMS Community
51. Thank You!
http://www.ciard.net
http://ring.ciard.net
http://aims.fao.org
http://agris.fao.org
Credits: Imma Subirats, Yves Jaques, Valeria Pesce, Fabrizio Celli, Ahsan
Morshed, Catarina Caracciolo, Dickson Lukose, Gudrun Johannsen, Stefano
Anibaldi, Armando Stellato, Tom Baker and many others
Editor's notes
The main integration works through common semantics.
Core of agINFRA technology is a LOD store of shared encoded knowledge organization systems.
An automatic markup to link structured and unstructured data sources through these shared knowledge organization systems.
Sharing within the R.I.N.G.: partners register their services, no technical limitation.
LOD wrapper for all participating institutions: for all registered services a "triplification wrapper" will be set up.
The triplifier works with "agConcepts and agIdentities" to create linked data.
Steadily growing LOD ecosystem: the agINFRA LOD ecosystem offers web services for the WWW.
- All links are checked by a domain expert.
This is the AGROVOC SKOS model, which was developed and agreed in April 2010 in active collaboration with Tom Baker, who was a member of the W3C SKOS working group.
My team, in collaboration with the Indian Institute of Technology in Kanpur, is developing a similar service for our subject area.
Open Calais is very good in those areas in which it has its own elaborated concept scheme against which the texts are analyzed (“Places”, “Persons”, “Business Processes”, “Industry Terms”), but it is weak in specific topic analysis, which it calls “social tags”.
AgroTagger still lacks many of the sophisticated features of “Open Calais”, but it is much, much better at the subject analysis of the text.
If resources are marked up with semantically defined and machine-readable concepts, they can be linked and mashed up precisely, as we have seen in the example from the BBC. In this example we start with an AGRIS record on hazardous waste, which is indexed with AGROVOC. Already now we can easily link to material indexed with Eurovoc; here is an example from EUR-Lex. If the UNBIS thesaurus were restructured into a concept scheme and published as LOD, related UN documents could be attached automatically by the machine.
How does this work? A resource on the web is connected with each concept URI. The concepts in the three vocabularies carry the same literal and are connected by an owl:sameAs / skos:exactMatch relationship. As we are speaking about thesauri and not ontologies, we purposely left the choice of relation vague. The concepts could be matched with owl:sameAs, or the terms could be matched with skos:exactMatch. A lot of discussion on this is ongoing.
The chart on the homepage representing the distribution of services across "service types" (http://ring.ciard.net), implemented with support from John Fereira.
The geographic map on the homepage representing the geographic distribution of services.
A first attempt to provide some aggregated data on the number of contents / resources potentially reachable through the services registered in the RING: http://ring.ciard.net/totals