Linking Open Data with Drupal
1. Linking Open Data with Drupal
Emmanuel Jamin
Drupal.cat, October 4th, 2012, Citilab, Cornellà
2. Who am I?
Emmanuel Jamin
– PhD
• At Paris XI university (LIMSI-CNRS, Orsay)
– Research and development (EU projects)
• At Edelweiss (INRIA, Sophia Antipolis)
• At the Knowledge Lab (ATOS, Barcelona)
– Now
• Semantic Web consultant in Barcelona
• www.OpenData-consulting.com
• @openDataC
3. Plan
Introduction to Open Data
Introduction to the Semantic Web
From Open Data to Linked Data
Drupal modules for Linking Open Data
LOD + Drupal hackathon in Barcelona?
5. A – OD - Definition
“Open data is data that can be freely used,
reused and redistributed by anyone – subject
only, at most, to the requirement to attribute
and sharealike.”
http://OpenDefinition.org
6. A – OD - Principles
> Availability and Access
> Reuse and Redistribution
> Universal Participation
7. A – OD – Small history
1957-1958: 1st concept
“open access to scientific data”
2001: 1st definition
“the web of data” (Tim Berners-Lee)
2004-05: 1st foundation
Open Knowledge Foundation (http://okfn.org/)
2009-05: 1st Open Government platform in US
http://data.gov
2012-09: 1st Open Knowledge Festival
http://okfestival.org
8. Image by Peter Ito (2009): http://www.flickr.com/photos/peterito/3054501076/lightbox/
9. A – OD - Platforms
Open Cities
Open Science
Open Government
Open Education
Open Culture
Open Health
…
Transparency
Participation
Collaboration
10. A – OD – Status of OD
Topics
From: http://okfn.org/opendata/
11. A – OD – Status of OD
Types of Data
Database
Structured Data
Documents
Raw Data
Linked Data
Geo Data
12. A – OD – Status of OD
Heterogeneous standards (Open Standard)
TXT, PDF, DOC, ODT, CSV, ZIP, XML, XSL, JSON, RDF, KML-KMZ
13. A – OD – Comparison
Barcelona / Catalunya / España
(barcelona.cat / Gen.cat / datos.gob.es)
Website
– Barcelona: http://w20.bcn.cat/opendata/
– Catalunya: http://www20.gencat.cat/portal/site/dadesobertes/
– España: http://datos.gob.es/datos/
Topics
– Barcelona: Economy, Cartography, Population, Environment, Administration
– Catalunya: Cartography and maps, Facilities, Statistics, Meteorology, Nomenclators, Public transport, Tourism
– España: Public sector, Culture and hobbies, Science and technologies, Environment, Health, Education, Transport
Formats
– Barcelona: CSV, PDF, XLS, XML, RDF, TXT, ZIP
– Catalunya: TMX, ZIP, PDF, CSV, KML-KMZ, DOC, XLS, XML, JSON, RDF, SHP, SPARQL
– España: XHTML, HTML, PDF, XLS, XML, ZIP
14. A – OD – Why opening up data?
Why open up the data?
15. A – OD – Why opening up data?
Graphic representation of dataset
to visualize it easily
http://civio.es
16. A – OD – Why opening up data?
Facet search and browsing
Data integration
to compare easily
http://civio.es
17. A – OD – Why opening up data?
http://manybills.researchlabs.ibm.com/
Data formalization
Facet search and browsing
to contextualize information easily
18. A – OD – Why opening up data?
http://www.unhabitat.org/
Data reuse and combination
Facet search and browsing
to contextualize information easily
19. A – OD – Why opening up data?
Big Data analysis
Statistics
Data visualization
– Graphic representation of dataset, to visualize it easily
Data integration
– Data reuse and combination, to compare easily
Data contextualization
– Facet search and browsing, to contextualize information easily
Data mapping
20. A – OD – Why opening up data?
Opening the data
Reuse it …
Mix it …
Analyse it …
Visualise it …
For a better comprehension!
21. OD – The big challenge
The OD movement has:
The energy
The Open Mind philosophy
The public resources
Etc.
But something is missing ...
34. B–
The answer is based on shared knowledge
Shared Ontology
We can understand
You can reason
35. B–
Document
Book
Roman / Novel
36. B–
“An ontology is a specification of a
conceptualization”
(i.e. the logical description of the concepts and
relationships that can exist for an agent or a
community of agents).
Tom Gruber (1993)
40. B – SW – Resources
Everything is a resource
– Person: Berners-Lee
– Organisation: W3C
– Document: paper.html
– Event: SW conference 2012
– … etc.
41. B – SW – Resources
Each resource is identified with a unique reference: a URI
www.w3c.org/people/timbl.html#this → Berners-Lee
www.w3c.org/index.html#this → W3C
www.w3c.org/papers/paper.html#this → paper.html
www.w3c.org/events/swcon12.html#this → SW con'12
42. B – SW – Resources
Namespace to simplify the URI
Namespace:
www.w3c.org/people/timbl.html#
Prefix
tbl: www.w3c.org/people/timbl.html#
CURIE
tbl:this
43. B – SW – Resources
CURIE to simplify the URI
w3c:timbl foaf:Person
w3c:this foaf:Organization
dblp:this foaf:Document
event:this foaf:Event
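The prefix mechanism can be sketched in a few lines of Python. This is only an illustration, not a full CURIE parser: `PREFIXES` and `expand_curie` are hypothetical names, the `tbl:` namespace is the example from the slides, and the `http://` scheme in front of it is an assumption.

```python
# Hypothetical prefix table. The tbl: namespace is the slide's example;
# the http:// scheme is assumed for illustration.
PREFIXES = {
    "tbl": "http://www.w3c.org/people/timbl.html#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand a CURIE such as 'tbl:this' into a full URI."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("tbl:this"))
print(expand_curie("foaf:Person"))
```

The CURIE is only a shorthand: two datasets may use different prefixes and still refer to the same resource, as long as the expanded URIs are identical.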
45. B – SW – Triples
RDF triples
web.html has author Tim Berners Lee
LinkedData.html has author Hausenblas
W3C has employee Tim Berners Lee
web.html is published at SW conference
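As a rough sketch, the four statements above can be held as (subject, predicate, object) tuples and searched with wildcards. The predicate names (`hasAuthor`, `hasEmployee`, `isPublishedAt`) and the `find` helper are made up for this example; a real dataset would use URIs from a shared vocabulary.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The four statements from the slide, with invented predicate names.
GRAPH = [
    Triple("web.html", "hasAuthor", "Tim Berners-Lee"),
    Triple("LinkedData.html", "hasAuthor", "Hausenblas"),
    Triple("W3C", "hasEmployee", "Tim Berners-Lee"),
    Triple("web.html", "isPublishedAt", "SW conference"),
]


def find(graph, subject: Optional[str] = None,
         predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return the triples matching the given terms; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Every statement whose object is Tim Berners-Lee.
for triple in find(GRAPH, obj="Tim Berners-Lee"):
    print(triple.subject, triple.predicate, triple.obj)
```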
46. B – SW – Ontologies
RDF-S → RDF-Schema
Definition of the
• Classes (concepts)
• and Properties (conceptual relations)
Hierarchical organisation with conceptual relations
47. B – SW – Ontologies
RDFS
– Book is sub-type of Document
– Novel is sub-type of Book
– Roman is sub-type of Book
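A small sketch of what the sub-type statements allow a machine to derive: following the links upward yields every super-type of a class. It assumes each class has a single parent, which RDFS itself does not require, and the `ancestors` helper is a hypothetical name.

```python
# Direct sub-type statements from the slide: child -> parent.
# Simplification: one parent per class (RDFS allows several).
SUBCLASS_OF = {
    "Book": "Document",
    "Novel": "Book",
    "Roman": "Book",
}


def ancestors(cls, subclass_of=SUBCLASS_OF):
    """Return all super-types of `cls`, nearest first."""
    found = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        if cls in found:  # guard against accidental cycles
            break
        found.append(cls)
    return found


print(ancestors("Novel"))
```

Because the relation is transitive, a Novel can be treated as a Document even though no statement says so directly.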
48. B – SW – RDF graph
RDF triples => Linked Data
– W3C.html has author Tim Berners Lee
– W3C.html is type of Document
– Tim Berners Lee is type of Person
– W3C.html is presented at Web Conference 2012
– Web Conference 2012 is type of Conference
– Conference is sub class of Event
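The statements above already let a program derive a fact that is not written down: Web Conference 2012 is an Event. The sketch below applies one RDFS-style rule (an instance of a class is also an instance of its super-class) until nothing new appears. The predicate names are simplified labels and `infer_types` is a hypothetical helper, not a full RDFS reasoner.

```python
# The six statements from the slide, with simplified predicate labels.
TRIPLES = {
    ("W3C.html", "hasAuthor", "Tim Berners-Lee"),
    ("W3C.html", "type", "Document"),
    ("Tim Berners-Lee", "type", "Person"),
    ("W3C.html", "presentedAt", "Web Conference 2012"),
    ("Web Conference 2012", "type", "Conference"),
    ("Conference", "subClassOf", "Event"),
}


def infer_types(triples):
    """Add (x, type, D) whenever (x, type, C) and (C, subClassOf, D) hold."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until a pass adds nothing new
        changed = False
        supers = {(s, o) for s, p, o in inferred if p == "subClassOf"}
        for s, p, o in list(inferred):
            if p != "type":
                continue
            for sub, sup in supers:
                new = (s, "type", sup)
                if sub == o and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


new_facts = infer_types(TRIPLES) - TRIPLES
print(new_facts)
```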
49. B – SW – RDF graph
RDF triples => RDF graph
Classes: Organisation, Event, Document, Person, Conference
Instances: W3C, web.html, Tim Berners-Lee, SW conference
50. B – SW – Federated Dataset
Federated dataset
Resources are all connected over the Web
LOD site 1: w3c:this, tim:this, doc1:this, doc2:this
LOD site 2: w3c:this, ivan:this, doc2:this, doc3:this
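Why shared identifiers matter can be sketched as follows: when two sites use the same identifier for the same resource, merging their data is a plain set union. The triples and predicate names below are invented for illustration; only the resource identifiers come from the slide.

```python
# Two hypothetical datasets published on different sites.
SITE_1 = {
    ("tim:this", "worksFor", "w3c:this"),
    ("doc1:this", "hasAuthor", "tim:this"),
    ("doc2:this", "hasAuthor", "tim:this"),
}
SITE_2 = {
    ("ivan:this", "worksFor", "w3c:this"),
    ("doc2:this", "hasAuthor", "ivan:this"),
    ("doc3:this", "hasAuthor", "ivan:this"),
}


def resources(triples):
    """Collect every subject and object identifier used in a set of triples."""
    return {term for s, _, o in triples for term in (s, o)}


merged = SITE_1 | SITE_2                         # merging graphs is a set union
shared = resources(SITE_1) & resources(SITE_2)   # identifiers both sites use
authors_of_doc2 = {o for s, p, o in merged
                   if s == "doc2:this" and p == "hasAuthor"}

print(sorted(shared))
print(sorted(authors_of_doc2))
```

Neither site alone knows both authors of `doc2:this`; the merged graph does, without any schema negotiation between the publishers.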
51. B – SW – SPARQL
Find and retrieve information from the graph with SPARQL
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?document ?authorName
WHERE {
  ?person rdf:type foaf:Person .
  ?person foaf:name ?authorName .
  ?person foaf:made ?document .
}
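How such a query is answered can be sketched as matching each triple pattern against the graph and joining the results on shared variables. This is a toy matcher, not a SPARQL engine: the `match` and `select` helpers and the `ex:` sample data are hypothetical, and the patterns assume the person (not the name) is the subject of `foaf:made`.

```python
# Invented sample graph using CURIE-like strings.
GRAPH = [
    ("ex:timbl", "rdf:type", "foaf:Person"),
    ("ex:timbl", "foaf:name", "Tim Berners-Lee"),
    ("ex:timbl", "foaf:made", "ex:web.html"),
    ("ex:w3c", "rdf:type", "foaf:Organization"),
    ("ex:w3c", "foaf:name", "W3C"),
]

# The three triple patterns of the WHERE clause; '?' marks a variable.
PATTERNS = [
    ("?person", "rdf:type", "foaf:Person"),
    ("?person", "foaf:name", "?authorName"),
    ("?person", "foaf:made", "?document"),
]


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def select(patterns, graph):
    """Return every variable binding that satisfies all patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


for row in select(PATTERNS, GRAPH):
    print(row["?document"], row["?authorName"])
```

W3C has a name but is not a `foaf:Person`, so the first pattern filters it out before the join.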
52. B – SW – Giant Global Graph
The web becomes one giant database
59. C – OD + LD
From Open Data to Linked Data
Formats: PDF, CSV, XML, JSON, RDF, RDFS
Levels: Open Data, Structured Data, Linked Data
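One step of this path, from CSV to RDF, can be sketched with the standard library alone. Everything specific here is an assumption made for the example: the sample rows, the `example.org` namespaces and the `csv_to_ntriples` helper. Each row becomes a resource and each remaining column becomes a property, written as N-Triples lines.

```python
import csv
import io

# Invented sample data standing in for a published CSV file.
CSV_TEXT = """id,name,district
1,Citilab,Cornellà
2,Sagrada Família,Eixample
"""

BASE = "http://example.org/facility/"  # hypothetical namespace for row URIs
VOCAB = "http://example.org/vocab#"    # hypothetical namespace for properties


def csv_to_ntriples(text, key="id"):
    """Turn each CSV row into N-Triples lines, one per non-key column."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}{row[key]}>"
        for column, value in row.items():
            if column == key:
                continue
            # Escape backslashes and double quotes inside the literal.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} <{VOCAB}{column}> "{escaped}" .')
    return lines


for line in csv_to_ntriples(CSV_TEXT):
    print(line)
```

A real conversion would map columns to an existing ontology rather than to an ad hoc vocabulary, which is where the knowledge engineering effort goes.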
60. C – OD + LD
From PDF to RDF
1. Document engineering
• Content extraction
• Content format
• Multimedia extraction
2. Knowledge engineering
• Term extraction (indexation)
• Recognition of Named Entities
• Ontology engineering
• Conceptual recognition and mapping
61. C – OD + LD
Synthesis of data formats (table)
Columns: To create | To exploit / reuse | To maintain / manage
Rows: Doc / PDF, CSV / XML, RDF / RDFS
62. C – to arrive in LOD
To succeed with Linking Open Data
1. Data formalization
• Create or reuse ontologies (RDF, RDFS, OWL)
2. Data annotation
• Associate semantic metadata (RDF, RDFa, Microdata)
3. Data publication
• Publish your semantic data (RDFa, Microdata)
4. Data consumption
• Reuse all available data (SPARQL endpoints)
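For the consumption step, a query is typically sent to a SPARQL endpoint as an HTTP GET request with the query text in a `query` parameter. The sketch below only builds that URL and sends nothing over the network. The endpoint address `http://dbpedia.org/sparql` and the sample query are assumptions for illustration, and `build_query_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint address

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?person a foaf:Person ; foaf:name ?name . } LIMIT 5
"""


def build_query_url(endpoint, query):
    """Build a GET URL carrying the query; no request is made here."""
    return f"{endpoint}?{urlencode({'query': query.strip()})}"


url = build_query_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```

The result format (XML, JSON, CSV) is normally negotiated through the HTTP `Accept` header when the request is actually sent.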
63. C – OD + LD
From Open Data to Linked Data
Data quality
64. B – SW – Big Giant Graph
Open Data + Data Interconnection
Linked Open Data
25 billion RDF triples over the web
65. B – SW – Big Giant Graph
Open Data + Data Interconnection
Linked Open Data
25 billion RDF triples over the web
From: http://www.w3.org/DesignIssues/diagrams/lod/2010-color.png
66. B – SW – Big Giant Graph
Open Data + Data Interconnection
Linked Open Data
25 billion RDF triples over the web
http://dbpedia.org
67. B – SW – Big Giant Graph
Open Data + Data Interconnection
Linked Open Data
25 billion RDF triples over the web
The Web 3.0
is already here ...
69. D – LODrupal - Drupal
LOD and Drupal
Entities ↔ Resources
RDF in Drupal Core
Semantic Web modules
70. D – LODrupal – Drupal Modules
Drupal modules
Main Semantic Web modules
RDFx
schema.org
Microdata
SPARQL
SPARQL Views
Import Linked Data
71. D – LODrupal – Mod1 ...
RDFx
From: http://drupal.org/project/rdfx
72. D – LODrupal – Mod1 ...
schemaorg
From: http://drupal.org/project/schemaorg
73. D – LODrupal – Mod1 ...
SPARQL Views
From: http://drupal.org/project/sparql_views
74. D – LODrupal – Mod1 ...
SPARQL
From: http://drupal.org/project/sparql
75. D – LODrupal – Drupal Prototype
Demonstration