Publishing Linked Data using Schema.org
1. Publishing Linked Data
using Schema.org
Development and management
of e-Repositories – OTA
IODE, Oostende, Belgium,
April 11th, 2013
An introduction to the project of
Mr. Aditya Kakodkar by
Christophe.Dupriez@destin-informatique.com
2. Linked Data: Why?
● External/Internal (Reference) Data
use and reuse
● (Meta) Data encoded and published according to
standardized, perennial and documented
measurement systems and categories
● Massive international efforts for tools and
interlinked repositories development
● Opportunity to become a General Reference on
the Web for a specific domain
● Your work becomes discoverable
and well positioned by Search Engines
3. Data to be linked?
● Metadata provides the context,
links to a MODEL
● Observed Data: source, measure/range, unit...
● Manually entered Data: validation rules
● Aggregated Data:
Which indicator for which decision?
● Published Data: exact? complete? perennial?
● Reference Data: comparability with other data?
● Open Data is (not) Public Data! http://opendatacommons.org
● Personal Data: protection? anonymisation?
● Big Data: dangers? opportunities?
4. Linking Data in order to...
● Denote a “real-life” object,
a concept, a transaction...
– not uniquely enough: sameAs.org
● Document (explain, contextualize)
the data to the user (HTML document page)
● Enrich, linking to other data ...
(RDF data page)
6. RDF: Resource Description
Framework
● A standard to provide (meta)data on the Web
● Based on a very simple model of triples:
subject – property – object
● Everything is a URI; the object can also be a
“constant value” (a text, a number, a date...),
a text being optionally suffixed by a language tag
● Example:
dbpedia:European_Herring_Gull rdfs:label “Goéland argenté”@fr
where “dbpedia:” stands for URI prefix:
http://dbpedia.org/resource/
and “rdfs:” stands for URI prefix:
http://www.w3.org/2000/01/rdf-schema#
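A minimal sketch of the prefix expansion described above, using only the two prefixes given on this slide. The helper names (expand, to_ntriples) are illustrative assumptions, not part of any RDF library; the output is one line in N-Triples syntax.

```python
# Prefix table taken from the slide; the helper names are illustrative only.
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

def expand(curie: str) -> str:
    """Expand a prefixed name such as 'rdfs:label' into a full URI."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local

def to_ntriples(subject: str, prop: str, text: str, lang: str) -> str:
    """Serialize one triple whose object is a language-tagged text."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{expand(subject)}> <{expand(prop)}> "{escaped}"@{lang} .'

line = to_ntriples(
    "dbpedia:European_Herring_Gull", "rdfs:label", "Goéland argenté", "fr"
)
print(line)
```

The subject and the property both become full URIs, while the object stays a constant value carrying its language tag.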
7. Being a Gull is not Dull!
● http://en.wikipedia.org/wiki/European_Herring_Gull
● http://dbpedia.org/resource/European_Herring_Gull
which redirects to the document
(HTML for human consumption):
http://dbpedia.org/page/European_Herring_Gull
● Data (for machine consumption) is generated separately in
different formats (N3, Turtle, XML, JSON...):
http://dbpedia.org/data/European_Herring_Gull.n3
● The browser (or any client) negotiates the suitable format
with the server...
● What is validated there? What are the rules?
● Can it serve as a reference for making decisions?
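Format negotiation can be sketched as a request that states the wanted media type in its HTTP Accept header. The block below only prepares such a request and sends nothing; the format-to-media-type table and the function name build_request are assumptions made for illustration, and which of these types a given server actually honours is not guaranteed.

```python
from urllib.request import Request

# Media types a client might ask for; this mapping is an illustrative assumption.
ACCEPT = {
    "html": "text/html",
    "turtle": "text/turtle",
    "n3": "text/n3",
    "rdfxml": "application/rdf+xml",
    "json": "application/json",
}

def build_request(resource_uri: str, fmt: str) -> Request:
    """Prepare (but do not send) a GET request asking for one representation."""
    return Request(resource_uri, headers={"Accept": ACCEPT[fmt]})

req = build_request("http://dbpedia.org/resource/European_Herring_Gull", "turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the prepared request to urllib.request.urlopen would perform the actual exchange; the server then chooses which document (page or data) to return for the same resource URI.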
8. Using a single page?
● RDFa and MicroData are two standards to MERGE
an HTML document (made for humans) and the data
a machine may wish to extract from it
● Example from a page in OceanExpert.net:
<h1>Details of<span itemprop="name">
<span itemprop="familyName">Dupriez</span>
,
<span itemprop="givenName">Christophe
</span>
</span></h1>
● ANY23.org, an open-source tool that collects data
embedded in a web page, will be demonstrated later
on OceanExpert.net...
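To show what a machine can pull out of the fragment above, here is a minimal sketch using only the Python standard library. It is not how ANY23 works; the class and function names are hypothetical, and the sketch ignores itemscope, itemtype and void elements such as <br>.

```python
from html.parser import HTMLParser

class ItempropCollector(HTMLParser):
    """Collect the text content of every element carrying an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # one entry per open element: itemprop name or None
        self.values = {}     # itemprop name -> accumulated text

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("itemprop")
        self.open_tags.append(name)
        if name is not None:
            self.values.setdefault(name, "")

    def handle_endtag(self, tag):
        if self.open_tags:
            self.open_tags.pop()

    def handle_data(self, data):
        # Text belongs to every enclosing itemprop element, not only the innermost.
        for name in self.open_tags:
            if name is not None:
                self.values[name] += data

def extract(html: str) -> dict:
    parser = ItempropCollector()
    parser.feed(html)
    parser.close()
    # Collapse the whitespace left over from the HTML layout.
    return {k: " ".join(v.split()) for k, v in parser.values.items()}

SNIPPET = """<h1>Details of<span itemprop="name">
<span itemprop="familyName">Dupriez</span>
,
<span itemprop="givenName">Christophe
</span>
</span></h1>"""

print(extract(SNIPPET))
```

The outer "name" property receives the text of both nested properties, while "familyName" and "givenName" each keep their own value.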
9. Data Model
● Which processes do we need to automate?
(use cases)
● Which entities (real objects, concepts,
transactions/events) have to be represented?
● How do those entities interrelate?
● What measures (properties) are made about
each type of entity?
● Reuse: who else will align on the same model?
What may Google do with my data?
10. Schema.org
● Schema.org is a modelling initiative of
Google / Microsoft / Yahoo to standardize URIs for RDF
properties
● Common model for data published as documents
harvestable on the web
● Their goal is to collect the data in our pages.
Those pages are then better indexed.
What else? (A.I.?)
● Schema.org models are far from exhaustive
(for instance, insufficient for CVs)
but a “/extension” mechanism exists
● Examples on the site http://schema.org
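A small sketch of how standardized property URIs end up in a page: it renders a person as MicroData, using the Schema.org Person type and the givenName and familyName properties already seen on the OceanExpert.net example. The function names are hypothetical and the markup is deliberately minimal.

```python
from html import escape

SCHEMA_ORG = "http://schema.org/"

def property_uri(name: str) -> str:
    """Full URI that a Schema.org property name stands for."""
    return SCHEMA_ORG + name

def person_microdata(given_name: str, family_name: str) -> str:
    """Render a person as an HTML fragment annotated with MicroData."""
    return (
        f'<div itemscope itemtype="{SCHEMA_ORG}Person">'
        f'<span itemprop="givenName">{escape(given_name)}</span> '
        f'<span itemprop="familyName">{escape(family_name)}</span>'
        "</div>"
    )

print(property_uri("familyName"))
print(person_microdata("Christophe", "Dupriez"))
```

The itemtype attribute tells a harvester which model the itemprop names belong to, so each short name resolves to one shared URI.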
11. Google RichSnippets
● Google Spider extracts data tagged using RDFa
or MicroData
● Pages with such data are promoted...
● Google Search Engine enriches results using
this data
● Example “Apollo Theatre”:
place, events, reviews...
● Google RichSnippets tool validates a web page:
http://www.google.com/webmasters/tools/richsnippets
12. Data Search Engine
● ANY23 is used to feed SINDICE,
the Search Engine for RDF data
● Example:
http://www.sindice.com/search?q=apollo+theatre