Implementing Linked Data in Low-Resource Conditions
Caterina Caracciolo, Johannes Keizer
{caterina.caracciolo},{johannes.keizer}@fao.org
Food and Agriculture Organization of the UN
09 September 2015
Goals for Today
• Give you a high-level view of what is needed
to do Linked Data
• Identify possible bottlenecks due to working
with few resources
• Based on our experience, give you some
suggestions to overcome those bottlenecks
Our background assumptions
Some restrictions are needed…
• Target audience: small-medium size institutions
– This talk is not meant to be a how-to guide for specific
technical problems, but rather a framework to help you plan
your entry into the linked open data world
• Target data
– We mainly think of textual data, e.g., lists of publications
produced by the institution, catalogues of specimens in
the local museum, factsheets on plants, events organized, …
Topics for today
• What is a “low-resource” condition
• Open Data and Linked Open Data
• An overview of Linked Data lifecycle
– Bottlenecks in terms of resources
– Our suggestions to overcome them
• The example of Agris
Low-resource condition = ?
1. IT competencies
• Few IT people, and they are overloaded
• Technology moves fast, and none of it is taught in school
• Staff need to keep themselves up to date
– But the working environment may not encourage this
– Or there may be language barriers
2. Other IT/IM/cultural issues
• Competency in legal issues: licenses, litigation?
• “It is my data”, even within the same organization
• Different “cultures” in the same workplace
– Domain specialists “know” the domain and the
data (e.g., the reports they produced) and do not
want to spend time on “techy stuff”
– IT/IM people may prefer to invest time in building a
better system once, instead of repeating ad-hoc
conversions; they would like to standardize more
All of these may require some investment of time
3. Software
• Outdated operating systems and software
– Because of cost of licenses, or cultural issues
4. Hardware
CPU, memory and technology constraints...
5. Electricity
may be unreliable, only occasionally available, expensive…
6. Internet connection
may be slow, dependent on the weather…
Data
The trend
Great attention to data
• Interoperability of data – data that can be
reused = processed in different applications
• Standard and open formats are seen as crucial
to interoperability
• Data made available over the web, for
maximum reuse
Open Data
Open data in a nutshell
• Like other “open” movements: open and free
• See http://opendefinition.org/
• Especially for government-generated data
• E.g., census, public investments, housing, environment, …
• A variety of formats used to expose the data
• XLS, CSV, XML, JSON, PPT, SDMX, …
• Preference for non-proprietary formats
– Most of the data around is “open”, more or less…
• But check whether your country has produced a national
policy on data!
Who does Open Data?
• National and regional initiatives (not exhaustive)
– opendataforafrica.org
– data.gov.uk
– usopendata.org
– opendatalatinamerica.org
– open-data.europa.eu
– data.gov.au
– data.gov.in
• Global and sectorial initiatives – e.g., GODAN
Why do people go for Open Data
• Increase transparency of governments and
institutions
• Create new business opportunities
• It is the way to go now
Linked Open Data
Linked Open Data in a nutshell
• Like other “open” movements: open and free
– You can have Linked Data with no open license,
but today we think of Linked Open Data (LOD)
• Any type of data, any domain
• The format of choice: RDF
– Various serializations possible: RDF/XML, Turtle,
N-Triples, N-Quads, JSON-LD, Notation 3, TriX
• Not just getting datasets out, but linked pieces
of data
Why should I go for Linked Data?
• To be able to reuse data published by others
• To promote business, whether by others or by
yourself
• Not to be isolated, left behind in the
information world
• Yes but… is the game worth the candle?
Agris - a LOD-based application
Then, Open Data or Linked Data?
• Can be seen as two steps along the same line
• You should decide based on your situation and
goals
– Open data requires less effort. Good if the data will be
used primarily by others, or if you have no direct interest
in linking to other datasets
– Linked Open Data may be more complex because
of the linking step. Good if you want to exploit the
data yourself, e.g. to enhance your library or document
repository catalogue with data produced by others
The Linked Data workflow
A typical Linked Data flow
[Figure: workflow diagram. Stages: “before” the LOD (data generation), conversion, LOD storage, LOD exposure, data consumption. Elements: “original” dataset, maintenance in original format, maintenance in RDF, RDF store, RDF dump, SPARQL endpoint, HTML/RDF via content negotiation, LOD-based applications.]
Some remarks on RDF
RDF
• RDF is simply triples
– Subject, predicate, object
– Example: ID (subject), dct:title (predicate), title (object)
• Triples may be serialized in various formats
– RDF/XML, Turtle, N-Triples, N-Quads, JSON-LD, TriX
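To make the triple idea concrete, here is a minimal sketch that writes one subject, predicate, object statement as a line of N-Triples, probably the simplest of the serializations listed. The book URI under example.org and the helper name `literal_triple` are assumptions made for illustration; only the Dublin Core terms namespace behind the `dct:` prefix is a real identifier.

```python
# Namespace of the Dublin Core terms vocabulary (the "dct:" prefix on the slide).
DCT = "http://purl.org/dc/terms/"


def literal_triple(subject_uri, predicate_uri, text):
    """Serialize one triple with a plain literal object as an N-Triples line."""
    # Backslashes and double quotes must be escaped inside an N-Triples literal.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject_uri}> <{predicate_uri}> "{escaped}" .'


# Hypothetical identifier for a book record; example.org is a placeholder domain.
book = "http://example.org/book/1"
line = literal_triple(book, DCT + "title", "Perfect Art of Navigation")
print(line)
```

The same statement could equally be written in Turtle or RDF/XML; the triple itself does not change, only its notation.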
The role of predicates
• … the dct:title in previous slide, to indicate the
“title” of a book
• Important to expose the data without
ambiguities
• The recommendation is to use standards, or de
facto standards, to facilitate reuse of data
• Search for the vocabulary appropriate to your
data, e.g. with http://lov.okfn.org/dataset/lov/index.html
– Look also at W3C Best Practices for Publishing Linked data
http://www.w3.org/TR/ld-bp/
Conversion from existing formats
Converting data to RDF
• Many converters to RDF exist
– A list at http://www.w3.org/wiki/ConverterToRdf
• Conversion could be done as a one-time
migration effort, or could be scheduled
regularly
– When done regularly, only to expose your data, your
established data maintenance is not affected
A simple example of conversion
My dummy table
ID book | Author      | Title                            | Subject
1       | John Dee    | Perfect Art of Navigation        | Navigation, geography
2       | Jethro Tull | The new horse-houghing husbandry | Horse husbandry
1. Get some RDF
[Figure: record 1 drawn as a graph with plain-text values]
– 1, Title, “The perfect Art of Navigation”
– 1, Author, John Dee
– 1, Subject, Navigation
2. Get some linked RDF
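[Figure: the same record with a URI as subject, standard predicates and a linked subject term]
– <URI> dct:title “The perfect Art of Navigation”
– <URI> dct:creator “John Dee”
– <URI> dct:subject http://aims.fao.org/aos/agrovoc/c_15908 (Agrovoc URI)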
3. Get some more links
[Figure: the same record with the author also replaced by a link]
– <URI> dct:title “The perfect Art of Navigation”
– <URI> dct:creator http://dbpedia.org/page/John_Dee
– <URI> dct:subject http://aims.fao.org/aos/agrovoc/c_15908 (Agrovoc URI)
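The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base URI `http://example.org/book/`, the CSV layout and the two hand-made lookup tables are hypothetical, the subject column is simplified to one value per row, and the only real identifiers are the Dublin Core namespace and the two link targets shown on the slides.

```python
import csv
import io

DCT = "http://purl.org/dc/terms/"
BASE = "http://example.org/book/"  # hypothetical base URI for the records

# Hand-made lookup tables from plain strings to URIs in other datasets.
# Both target URIs are the ones shown on the slides.
SUBJECT_LINKS = {"Navigation": "http://aims.fao.org/aos/agrovoc/c_15908"}
AUTHOR_LINKS = {"John Dee": "http://dbpedia.org/page/John_Dee"}

# The dummy table as CSV (assumed layout, one subject per row).
TABLE = """id,author,title,subject
1,John Dee,Perfect Art of Navigation,Navigation
2,Jethro Tull,The new horse-houghing husbandry,Horse husbandry
"""


def node(value, links):
    """Return a URI node if a link is known, otherwise a plain literal."""
    if value in links:
        return "<" + links[value] + ">"
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def convert(table_text):
    """Turn each table row into three N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(table_text)):
        subject = "<" + BASE + row["id"] + ">"
        objects = [
            ("title", node(row["title"], {})),               # titles stay literals
            ("creator", node(row["author"], AUTHOR_LINKS)),  # linked when known
            ("subject", node(row["subject"], SUBJECT_LINKS)),
        ]
        for term, obj in objects:
            lines.append(f"{subject} <{DCT}{term}> {obj} .")
    return lines


for triple in convert(TABLE):
    print(triple)
```

Values without a known link, such as the second author, simply stay as literals, so the conversion can run before the linking work is complete.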
Data maintenance
• If data is regularly converted to RDF, the “old”
maintenance flow is kept
– But with the extra step of linking
• If data is migrated to RDF once and for all, maintenance
may become a problem: you may need a GUI to edit
the RDF
Linking your data
What can be linked?
1. Vocabularies used to describe and annotate
the data - or ontologies
– i.e., the properties of the triples - your “Title”
and somebody else’s “Titulo”
2. The entities linked, the “objects”
– i.e., the object of the triple – a specific author in
your dataset to the same author in somebody
else’s dataset, or in Wikipedia
• Often, they are also called vocabularies,
which may create confusion
1. Linking vocabularies
• It is a research area
– Ontology Alignment Evaluation Initiative (OAEI)
– Note that “ontology” is often used as a generic
term, also to mean rather simple vocabularies to
describe data – ontology may sometimes also
include “individuals”, e.g., country names, ..
• Best solution is to go for standard vocabularies
from the start!
– When you design the conversion of your data
2. Linking “individuals”
• Relatively simple problem, but few out-of-the-box
tools
– Usually the problem is data “cleanliness”, e.g.,
different name spellings, abbreviations, …
• Best solution is to identify the top dataset(s)
to link to and start linking to it/them
– Either manually or semi-automatically (automatic
selection of candidate links, then manual check)
– Data validation is usually done outside the rest of the
data lifecycle
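The semi-automatic route can be sketched with nothing but the Python standard library: names are normalized to smooth out spelling noise, then `difflib.get_close_matches` proposes candidates that a person still has to confirm. The function names, the sample names and the 0.8 cutoff are assumptions chosen for illustration, not a recommended setting.

```python
import difflib


def normalize(name):
    """Lower-case, drop punctuation and sort the words, to reduce spelling noise."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    # Sorting the words makes "Dee, John" and "John Dee" compare as equal.
    return " ".join(sorted(kept.split()))


def candidate_links(local_names, target_names, cutoff=0.8):
    """Suggest, for each local name, similar names found in the target dataset."""
    by_norm = {normalize(t): t for t in target_names}
    suggestions = {}
    for name in local_names:
        close = difflib.get_close_matches(
            normalize(name), list(by_norm), n=3, cutoff=cutoff
        )
        suggestions[name] = [by_norm[c] for c in close]
    return suggestions


# Hypothetical sample data: names in our catalogue and in the dataset to link to.
local = ["Dee, John", "Jon Dee", "Isaac Newton"]
target = ["John Dee", "Jethro Tull"]
result = candidate_links(local, target)
for name, candidates in result.items():
    print(name, "->", candidates)  # candidates still need a manual check
```

An empty candidate list means the name needs manual attention or is simply missing from the target dataset.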
Hint: Drupal for your catalogue
Drupal = a content management
system
• Allows you to:
1. import data from CSV, XML, RSS feeds
2. create RDF
3. maintain the data from GUI
4. expose RDF
• Good for your catalogues of documents,
people, ..
• Need to know Drupal, but no programming
skills required
Similar tools
• AgriDrupal
– Drupal customized for small institutions
– Includes tools for automatic tagging with
AGROVOC, which is a linked resource
• ScratchPad
– Customized for biodiversity data
If you want to have your thesaurus
linked…
• This is our experience - AGROVOC
• Thesauri are used for document indexing
(dct:subject “navigation”)
• Steps:
– Convert the thesaurus into SKOS concept scheme
– Use VocBench for data maintenance, including
links
– Use SKOSMOS for data visualization and search
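As a rough idea of what the first step produces, the sketch below writes one thesaurus term as a `skos:Concept` with labels in two languages and a broader term, again as N-Triples. The scheme URI under example.org, the concept identifiers and the function name are hypothetical; the SKOS and RDF namespaces are the standard ones. A full conversion would also declare the scheme itself as a `skos:ConceptScheme`.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEME = "http://example.org/thesaurus"  # hypothetical concept scheme URI


def concept_triples(concept_id, labels, broader_id=None):
    """Return N-Triples lines describing one thesaurus term as a skos:Concept.

    labels maps a language code to the preferred label in that language.
    """
    uri = f"<{SCHEME}/{concept_id}>"
    lines = [
        f"{uri} <{RDF_TYPE}> <{SKOS}Concept> .",
        f"{uri} <{SKOS}inScheme> <{SCHEME}> .",
    ]
    for lang, label in sorted(labels.items()):
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{uri} <{SKOS}prefLabel> "{escaped}"@{lang} .')
    if broader_id is not None:
        # Link the term to its broader term in the same scheme.
        lines.append(f"{uri} <{SKOS}broader> <{SCHEME}/{broader_id}> .")
    return lines


# Hypothetical term "navigation" with a hypothetical broader term c_1.
triples = concept_triples(
    "c_2", {"en": "navigation", "es": "navegación"}, broader_id="c_1"
)
for t in triples:
    print(t)
```

Language-tagged labels are what later make a thesaurus usable as a multilingual vocabulary.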
Data storage
Triple stores
• Many are available, and so are many benchmarks
that compare their performance and functionality
– Cf. http://www.w3.org/wiki/RdfStoreBenchmarking
• Some tech know-how needed to choose the
best solution and keep it up and running
Data exposure
Various options
1. Provide a dump for download
2. Expose de-referenceable URIs
3. Expose a SPARQL endpoint
4. Expose web services
RDF dump for download
• Pros
– Simply a file to download
– For data consumers, access to data is under their control
-> efficient, fast
• Cons
– Keeping the dump in sync with the source data may be an issue
– Need to decide on a versioning policy
– Need to decide what to include in the dump (only
the data? Also the links? …)
De-referenceable URIs
• Pros:
– Data exposed is always up to date
– Serving content for URIs
– Simple back-ends are available to also serve an
HTML view, e.g. Pubby, Loddy
• Cons:
– Need to set up a content negotiation mechanism.
Not a big issue, but the server must be up 24/7
– Data is accessible but not searchable by humans
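Content negotiation means that the same URI returns HTML to a browser and RDF to a program, depending on the HTTP Accept header the client sends. The following is a simplified sketch of that decision, not a complete implementation: it ignores partial wildcards such as `text/*`, and the list of formats the server offers is an assumption.

```python
# Media types this hypothetical server can produce, in order of server preference.
SUPPORTED = ["text/html", "text/turtle", "application/rdf+xml", "application/ld+json"]


def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    items = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when the client gives none
        for param in fields[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        items.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(items, key=lambda item: -item[1])


def negotiate(header, supported=SUPPORTED):
    """Pick the media type to serve for a URI, or None if nothing is acceptable."""
    for media_type, q in parse_accept(header or "*/*"):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type in supported:
            return media_type
    return None


print(negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
print(negotiate("application/rdf+xml;q=0.5, text/turtle;q=0.9"))
```

In practice the web server or a back-end such as Pubby usually handles this step, so writing it by hand is rarely necessary.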
SPARQL endpoint
• Pros:
– Not much work involved: the endpoint is typically
provided by the triple store
• Cons:
– Requires 24/7 server availability
– No limitations on queries -> may be heavy on
the server side
• Other solutions under study, e.g.
http://linkeddatafragments.org
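On the client side, one small courtesy towards a server with limited resources is to cap every query with LIMIT. The sketch below only builds the request URL, using the `query` parameter defined by the SPARQL protocol for GET requests, and does not contact any server. The endpoint address and the function names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def titles_query(limit=100):
    """SPARQL query listing resources and their dct:title, capped by LIMIT."""
    return (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?book ?title WHERE { ?book dct:title ?title }\n"
        f"LIMIT {int(limit)}"
    )


def query_url(endpoint, query):
    """Build the GET URL for a query; the query text is URL-encoded."""
    return endpoint + "?" + urlencode({"query": query})


url = query_url(ENDPOINT, titles_query(limit=10))
print(url)
```

Some triple stores can also enforce result limits or timeouts on the server side, which addresses the same problem from the other end.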
Web Services
• Pros:
– Known technology, good performance
– More control over data access, less strain on the server
– May be built on top of an RDF store
• Cons:
– Need to be implemented
Multilinguality
Multilingual vocabularies can help
In practice…
An institution with limited
resources wants to move to Linked
Data. What to do?
You have at least two options
1. Consider your specific bottlenecks and go
ahead on your own
2. Organize a collaboration
– Effort on creating partnership, networks
AGRIS
An example of collaborative
approach to LOD
The AGRIS network
[Figure: a data coordination hub connected to partner institutions]
Can be much smaller or bigger!
… the original bibliographical record
… the same record in a mashup page
http://agris.fao.org/agris-search/search.do?recordID=QM2007000047
Data Flow
AGRIS dataflow and processing
The AGRIMetaMaker
[Figure: pie chart “Metadata tools used by AGRIS Providers”. Legend entries: WebAgris, AMM, OJS, Mendeley, WebAGRIS, PubMed, InMagic, DOAJ, GFIS system, Dspace, AgriDrupal, RISC, Others. Slice values as extracted: 26%, 22%, 14%, 11%, 9%, 4%, 3%, 2%, 2%, 2%, 1%, 1%, 3%.]
How linked data is produced
… using title and authors
… using keywords
… using keywords
… using the journal name or the ISSN
… using alignments between thesauri
http://agris.fao.org/agris-search/search.do?recordID=PL2009000495
http://agris.fao.org/agris-search/search.do?recordID=PH2011000084
Linking URIs
Linking vocabularies
Recap and Conclusions
1. Understand your own
constraints
2. Keep an eye on tech
improvements
3. Be smart from the start
In brief…
• Start small: one dataset only (or few)
• Start relevant: choose a key dataset, either
because central to your application, or
because widely used (visibility)
• Start from somewhere: try to reuse
experience as much as possible
• Go in steps: open first, then link
• Look for collaborations
4. In union there is strength
Find your own union
• Organize a consortium and maximize your
resources
• Look for experience and support from other
organizations
Thank you!
caterina.caracciolo@fao.org
johannes.keizer@fao.org
http://aims.fao.org

Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 

Kürzlich hochgeladen (20)

Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 

Implementing Linked Data on a Budget

  • 1. Implementing Linked Data in Low Resource Conditions Caterina Caracciolo, Johannes Keizer {caterina.caracciolo},{johannes.keizer}@fao.org Food and Agriculture Organization of the UN 09 September 2015
  • 2. Goals for Today • Give you a high level view of what is needed to do Linked Data • Identify possible bottlenecks due to working with little resources • Based on our experience, give you some suggestions to overcome those bottlenecks
  • 3. Our background assumptions Some restrictions are needed… • Target audience: small to medium-sized institutions – This talk is not meant to be a how-to guide for specific technical problems, but rather a framework to help you plan your entry into the linked open data world • Target data – We mainly think of textual data, e.g., lists of publications produced by the institution, catalogues of specimens in the local museum, factsheets on plants, events organized, etc.
  • 4. Topics for today • What is a “low-resource” condition • Open Data and Linked Open Data • An overview of Linked Data lifecycle – Bottlenecks in terms of resources – Our suggestions to overcome them • The example of Agris
  • 6. 1. IT competencies • Few IT people, over-busy • Technology fast moving, nothing taught in school • Need personal update – But working environment may not encourage this – Or there may be language barriers
  • 7. 2. Other IT/IM/cultural issues • Competency on legal issues – licenses, litigations? • “It is my data”, even in the same organization • Different “cultures” in the same workplace – Domain specialists “know” the domain and the data – e.g., the reports they produced - do not want to spend time with “techy stuff” – IT/IM people may prefer to spend time to make better system once, instead of repeating ad-hoc conversions - would like to standardize more All may require some investments in time
  • 8. 3. Software • Outdated operating systems and software – Because of cost of licenses, or cultural issues
  • 9. 4. Hardware CPU, memory and technology constraints...
  • 10. 5. Electricity may be unreliable
  • 15. Data
  • 16. The trend Great attention to data • Interoperability of data – data that can be reused = processed in different applications • Standard and open formats are seen as crucial to interoperability • Data made available over the web, for maximum reuse
  • 18. Open data in a nutshell • Like other “open” movements: open and free • See http://opendefinition.org/ • Especially for government-generated data • E.g., census, public investments, housing, environment, etc. • A variety of formats used to expose the data • XLS, CSV, XML, JSON, PPT, SDMX, etc. • Preference for non-proprietary formats – Most of the data around is “open”, more or less… • But check whether your country has produced a national policy on data!
  • 19. Who does Open Data? • National and regional initiatives (not exhaustive) – opendataforafrica.org – data.gov.uk – usopendata.org – opendatalatinamerica.org – open-data.europa.eu – data.gov.au – data.gov.in • Global and sectorial initiatives – e.g., GODAN
  • 20. Why do people go for Open Data • Increase transparency of governments and institutions • Create new business opportunities • It is the way to go now
  • 27. Linked Open Data in a nutshell • Like other “open” movements: open and free – You can have Linked Data with no open license – but today we think of Linked Open Data (LOD) • Any type of data, any domain • The format of choice: RDF – Various serializations possible – RDF/XML, Turtle, N-Triples, N-Quads, JSON-LD, Notation 3, TriX • Not just getting datasets out, but linked pieces of data
  • 28. Why should I go for Linked Data? • To be able to reuse data published by others • To promote business – made by others or yourself • Not to be isolated, left behind in the information world • Yes but… is the game worth the candle?
  • 29. Agris - a LOD-based application
  • 30. Then, Open Data or Linked Data? • Can be seen as two steps along the same line • You should decide based on your situation and goals – Open data requires less effort. Good if the data will be used primarily by others, or if you have no direct interest in linking to other datasets – Linked Open Data may be more complex because of the linking step. Good if you want to exploit the data yourself, e.g. to enhance your library or document repository catalogue with data produced by others
  • 31. The Linked Data workflow
  • 32. A typical Linked Data flow [diagram. Stages: “Before” the LOD, LOD storage, LOD exposure, Data consumption. Elements: “Original” dataset, Conversion, Maintenance in original format, Maintenance in RDF, RDF store, RDF dump, SPARQL endpoint, HTML/RDF content negotiation, LOD-based applications]
  • 35. RDF • RDF is simply triples – Subject – predicate – object, e.g. ID – dct:title – title • Triples may be serialized in various formats – RDF/XML, Turtle, N-Triples, N-Quads, JSON-LD, TriX
  • 36. The role of predicates • … the dct:title in the previous slide, to indicate the “title” of a book • Important to expose the data without ambiguities • The recommendation is to use standards, or de facto standards, to facilitate reuse of data • Search for the vocabulary appropriate to your data, e.g. with http://lov.okfn.org/dataset/lov/index.html – Look also at the W3C Best Practices for Publishing Linked Data http://www.w3.org/TR/ld-bp/
  • 38. Converting data to RDF • Many converters to RDF exist – A list is at http://www.w3.org/wiki/ConverterToRdf • Conversion could be done as a one-time migration effort, or could be scheduled regularly – When done regularly, for exposing your data, your established data maintenance is not affected
  • 39. A simple example of conversion
  • 40. My dummy table (columns: ID book | Author | Title | Subject) • 1 | John Dee | Perfect Art of Navigation | Navigation, geography • 2 | Jethro Tull | The new horse-houghing husbandry | Horse husbandry
  • 41. 1. Get some RDF [diagram: book 1 with Title “The perfect Art of Navigation”, Author John Dee, Subject Navigation]
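The dummy table and the first conversion step can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the tool used by the authors: the `http://example.org/book/` namespace, the column name `ID`, and the choice of N-Triples as output are assumptions made for the example; the `dct:` properties are the Dublin Core terms named on the slides.

```python
import csv
import io

DCT = "http://purl.org/dc/terms/"       # Dublin Core terms namespace (dct:)
BASE = "http://example.org/book/"       # hypothetical namespace for local records

# The dummy table from the slide, written as CSV (column "ID" is an assumed name).
TABLE = """ID,Author,Title,Subject
1,John Dee,Perfect Art of Navigation,"Navigation, geography"
2,Jethro Tull,The new horse-houghing husbandry,Horse husbandry
"""

def literal(text):
    """Quote a plain string as an N-Triples literal (minimal escaping only)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def rows_to_ntriples(table_text):
    """Turn each table row into one triple per cell, all with literal objects."""
    lines = []
    for row in csv.DictReader(io.StringIO(table_text)):
        subject = f"<{BASE}{row['ID']}>"
        lines.append(f"{subject} <{DCT}title> {literal(row['Title'])} .")
        lines.append(f"{subject} <{DCT}creator> {literal(row['Author'])} .")
        # A cell such as "Navigation, geography" holds several subjects.
        for keyword in row["Subject"].split(","):
            lines.append(f"{subject} <{DCT}subject> {literal(keyword.strip())} .")
    return lines

triples = rows_to_ntriples(TABLE)
print("\n".join(triples))
```

At this stage every object is still a plain string, so the data is RDF but not yet linked.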
  • 42. 2. Get some linked RDF [diagram: <URI> dct:title “The perfect Art of Navigation”; dct:creator “John Dee”; dct:subject http://aims.fao.org/aos/agrovoc/c_15908 (Agrovoc URI)]
  • 43. 3. Get some more links [diagram: <URI> dct:title “The perfect Art of Navigation”; dct:creator http://dbpedia.org/page/John_Dee; dct:subject http://aims.fao.org/aos/agrovoc/c_15908 (Agrovoc URI)]
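The two linking steps above amount to replacing literal objects with URIs wherever a link is known. The sketch below shows that substitution under stated assumptions: the record URI `http://example.org/book/1` is hypothetical, the link table is filled in by hand, and the pairing of the Agrovoc URI with the subject "Navigation" is assumed from the slide rather than verified. The URLs themselves are copied as they appear on the slides.

```python
DCT = "http://purl.org/dc/terms/"
BOOK = "http://example.org/book/1"   # hypothetical URI for the record

# Hand-made link table: literal value -> URI in an external dataset.
# The mapping of "Navigation" to this Agrovoc concept is an assumption.
LINKS = {
    "Navigation": "http://aims.fao.org/aos/agrovoc/c_15908",
    "John Dee": "http://dbpedia.org/page/John_Dee",
}

# Step 1 output: triples whose objects are all plain strings.
plain = [
    (BOOK, DCT + "title", "The perfect Art of Navigation"),
    (BOOK, DCT + "creator", "John Dee"),
    (BOOK, DCT + "subject", "Navigation"),
]

def link_objects(triples, links):
    """Serialize triples as N-Triples, using a URI object when a link is known."""
    lines = []
    for s, p, o in triples:
        if o in links:
            obj = f"<{links[o]}>"            # linked: object becomes a URI
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'             # unlinked: object stays a literal
        lines.append(f"<{s}> <{p}> {obj} .")
    return lines

linked = link_objects(plain, LINKS)
print("\n".join(linked))
```

The title stays a literal because it has no entry in the link table; only the creator and the subject become links.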
  • 45. Data maintenance • If data is regularly converted to RDF, the “old” maintenance flow is kept – But with the extra step of linking • If data is migrated to RDF once and for all, maintenance may become a problem – you may need a GUI
  • 47. What can be linked? 1. Vocabularies used to describe and annotate the data - or ontologies – i.e., the properties of the triples - your “Title” and somebody else’s “Titulo” 2. The entities linked, the “objects” – i.e., the object of the triple – a specific author in your dataset to the same author in somebody else’s dataset, or in Wikipedia • Often, they are also called vocabularies, which may create confusion
  • 48. 1. Linking vocabularies • It is a research area – Ontology Alignment Evaluation Initiative (OAEI) – Note that “ontology” is often used as a generic term, also to mean rather simple vocabularies to describe data – ontology may sometimes also include “individuals”, e.g., country names, .. • Best solution is to go for standard vocabularies from the start! – When you design the conversion of your data
  • 49. 2. Linking “individuals” • Relatively simple problem, but few out-of-the-box tools – Usually the problem is data “cleanliness” – e.g., different name spelling, abbreviations, … • Best solution is to identify the top dataset(s) to link and start linking to it/them – Either manually or semi-automatically (automatic selection of candidate links, then manual check) – Data validation usually outside the rest of the data lifecycle
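The semi-automatic approach (automatic selection of candidate links, then a manual check) can be sketched with Python's standard `difflib` module. This is only one possible way to propose candidates, not the method used by the authors: the normalization rule, the 0.8 similarity cutoff, the sample names, and the `http://example.org/authority/...` URI are all assumptions for illustration.

```python
import difflib

def normalize(name):
    """Lower-case, drop commas and dots, and sort tokens so that
    'Tull, Jethro' and 'Jethro Tull' compare as equal."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(sorted(cleaned.split()))

def candidate_links(local_names, target_labels, cutoff=0.8):
    """Propose up to three candidate URIs per local name for manual review."""
    index = {normalize(label): uri for label, uri in target_labels.items()}
    candidates = {}
    for name in local_names:
        matches = difflib.get_close_matches(
            normalize(name), list(index), n=3, cutoff=cutoff
        )
        candidates[name] = [index[m] for m in matches]
    return candidates

# Labels from the dataset we want to link to (second URI is hypothetical).
target = {
    "John Dee": "http://dbpedia.org/page/John_Dee",
    "Jethro Tull": "http://example.org/authority/jethro-tull",
}
# Names as they appear in the local catalogue, with typical "dirty" variants.
local = ["Jon Dee", "Tull, Jethro", "Unknown Author"]

result = candidate_links(local, target)
for name, uris in result.items():
    print(name, "->", uris or "no candidate, check manually")
```

Names with no candidate are left for a person to resolve, which keeps the automatic step cheap and the final decision manual.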
  • 50. Hint: Drupal for your catalogue
  • 51. Drupal = a content management system • Allows you to: 1. import data from CSV, XML, RSS feeds 2. create RDF 3. maintain the data from a GUI 4. expose RDF • Good for your catalogues of documents, people, etc. • Need to know Drupal, but no programming skills required
  • 52. Similar tools • AgriDrupal – Drupal customized for small institutions – Includes tools for automatic tagging with AGROVOC, which is a linked resource • ScratchPad – Customized for biodiversity data
  • 53. If you want to have your thesaurus linked… • This is our experience - AGROVOC • Thesauri are used for document indexing (dct:subject “navigation”) • Steps: – Convert the thesaurus into a SKOS concept scheme – Use VocBench for data maintenance, including links – Use SKOSMOS for data visualization and search
  • 55. Triple stores • Many are available, and many benchmarks compare their performance and functionality – Cf. http://www.w3.org/wiki/RdfStoreBenchmarking • Some tech know-how needed to choose the best solution and keep it up and running
  • 57. Various options 1. Provide a dump for download 2. Expose de-referenceable URIs 3. Expose a SPARQL endpoint 4. Expose web services
  • 58. RDF dump for download • Pros – Simply a file to download – For data consumers, access to data is under control -> efficient, fast • Cons – The issue may be to keep the dump in sync – Need to decide policy on versioning – Need to decide what to include in the dump (only the data? Also the links? ..)
  • 59. De-referenceable URIs • Pros: – Data exposed is always up-to-date – Serving content for URIs – Simple back-ends are available to also serve an HTML view - e.g. Pubby, Loddy • Cons: – Need to set up a content negotiation mechanism. Not a big issue, but the server must be up 24/7 – Data is accessible but not searchable by humans
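The content negotiation mechanism mentioned above boils down to reading the HTTP `Accept` header and choosing which representation of a URI to serve. The sketch below shows that decision only, with no web server attached. It is a simplified illustration: the list of supported formats is an assumption, and the matching ignores the specificity rules of the HTTP specification (it simply takes the highest q-value, and the first supported type in case of a tie).

```python
# Representations this hypothetical server can produce, in order of preference.
SUPPORTED = ["text/html", "text/turtle", "application/rdf+xml"]

def parse_accept(header):
    """Split an Accept header into (media_type, q) pairs; q defaults to 1.0."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0          # unreadable q-value: treat as "not wanted"
        prefs.append((media, q))
    return prefs

def choose_representation(header, supported=SUPPORTED):
    """Return the supported media type with the highest q, or None."""
    best, best_q = None, 0.0
    for media, q in parse_accept(header):
        for offered in supported:
            wildcard = media.endswith("/*") and offered.startswith(media[:-1])
            if (media in ("*/*", offered) or wildcard) and q > best_q:
                best, best_q = offered, q
    return best

# A browser asks for HTML, an RDF client asks for Turtle.
print(choose_representation("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
print(choose_representation("application/rdf+xml;q=0.5, text/turtle;q=0.9"))
```

When the function returns `None`, a server would typically answer with status 406 (Not Acceptable) or fall back to a default format.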
  • 60. SPARQL endpoint • Pros: – Not much work involved, typically the endpoint is provided by the triple store • Cons: – Requires 24/7 server availability – No limitations on queries -> may be heavy on server side • Other solutions under study, e.g. http://linkeddatafragments.org
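For data consumers, a SPARQL endpoint is queried over HTTP by passing the query text in a `query` parameter, as defined by the SPARQL protocol. The sketch below only builds such a request URL and does not contact any server. The endpoint address `http://example.org/sparql` is hypothetical, and the `LIMIT` clause is added as one simple way for a client to keep its queries light on the server.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "http://example.org/sparql"   # hypothetical endpoint address

def build_query_url(endpoint, limit=10):
    """Build a GET URL asking for book titles, capped at `limit` results."""
    query = (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?book ?title WHERE { ?book dct:title ?title } "
        f"LIMIT {int(limit)}"
    )
    # urlencode takes care of escaping spaces, braces and newlines.
    return endpoint + "?" + urlencode({"query": query})

url = build_query_url(ENDPOINT, limit=5)
print(url)

# Decoding the URL again shows the query text the server would receive.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(decoded)
```

The resulting URL can be fetched with any HTTP client; the response format (for example JSON or XML results) depends on the endpoint and on the `Accept` header sent.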
  • 61. Web Services • Pros: – Known technology, good performance – More control on data access, less strain on server – May be built on top of an RDF store • Cons: – Need to be implemented
  • 65. In practice… An institution with limited resources wants to move to Linked Data. What to do?
  • 66. You have at least two options 1. Consider your specific bottlenecks and go ahead on your own 2. Organize a collaboration – Effort on creating partnership, networks
  • 67. AGRIS An example of collaborative approach to LOD
  • 68. The AGRIS network [diagram: data coordination at the centre, connected to eight partners] Can be much smaller or bigger!
  • 71. …the same record in a mashup page http://agris.fao.org/agris-search/search.do?recordID=QM2007000047
  • 73. AGRIS dataflow and processing
  • 75. Metadata tools used by AGRIS Providers [pie chart. Legend: WebAgris, AMM, OJS, Mendeley, WebAGRIS, PubMed, InMagic, DOAJ, GFIS system, Dspace, AgriDrupal, RISC, Others. Shares: 26%, 22%, 14%, 11%, 9%, 4%, 3%, 2%, 2%, 2%, 1%, 1%, 3%]
  • 76. How linked data is produced
  • 80. …using the journal name or the ISSN
  • 87. 1. Understand your own constraints
  • 88. 2. Keep an eye on tech improvements
  • 89. 3. Be smart from the start
  • 90. In brief… • Start small: one dataset only (or few) • Start relevant: choose a key dataset, either because central to your application, or because widely used (visibility) • Start from somewhere: try to reuse experience as much as possible • Go in steps: open first, then link • Look for collaborations
  • 91. 4. In union there is strength
  • 92. Find your own union • Organize a consortium and maximize your resources • Look for experience and support from other organizations