(1)
Standardizing for Open Data
Ivan Herman, W3C
Open Data Week
Marseille, France, June 26, 2013
Slides at: http://www.w3.org/2013/Talks/0626-Marseille-IH/
(2)
Data is everywhere on the Web!
• Public, private, behind enterprise firewalls
• Ranges from informal to highly curated
• Ranges from machine readable to human readable
  - HTML tables, Twitter feeds, local vocabularies, spreadsheets, …
• Expressed in diverse models
  - tree, graph, table, …
• Serialized in many ways
  - XML, CSV, RDF, PDF, HTML tables, microdata, …
(3)
(4)
(5)
(6)
(7)
(8)
W3C’s standardization focus was, traditionally, on Web-scale integration of data
• Some basic principles:
  - use of URIs everywhere (to uniquely identify things)
  - relate resources among one another (to connect things on the Web)
  - discover new relationships through inferences
• This is what the Semantic Web technologies are all about
	
  
(9)
We have a number of standards
[Diagram: RDF 1.1, SPARQL 1.1, URI; serializations: JSON-LD, Turtle, RDFa, RDF/XML]
RDF: data model, links, basic assertions; different serializations
SPARQL: querying data
A fairly stable set of technologies by now!
(10)
We have a number of standards
[Diagram: RDB2RDF, RDF 1.1, RDFS 1.1, SPARQL 1.1, OWL 2, URI; serializations: JSON-LD, Turtle, RDFa, RDF/XML]
RDF: data model, links, basic assertions; different serializations
SPARQL: querying data
RDFS: simple vocabularies
OWL: complex vocabularies, ontologies
RDB2RDF: databases to RDF
A fairly stable set of technologies by now!
(11)
We have Linked Data principles
(12)
Integration is done in different ways
• Very roughly:
  - data is accessed directly as RDF and turned into something useful
    - relies on data being “preprocessed” and published as RDF
  - data is collected from different sources, integrated internally
    - using, say, a triple store
(13)
(15)
However…
• There is a price to pay: a relatively heavy ecosystem
  - many developers shy away from using RDF and related tools
• Not all applications need this!
  - data may be used directly, no need for integration concerns
  - the emphasis may be on easy production and manipulation of data with simple tools
(16)
Typical situation on the Web
• Data published in CSV, JSON, XML
• An application uses only 1-2 datasets; integration done by direct programming is straightforward
  - e.g., in a Web Application
• Data is often very large; direct manipulation is more efficient
(17)
Non-RDF Data
• In some settings that data can be converted into RDF
• But, in many cases, it is not done
  - e.g., CSV data is way too big
  - RDF tooling may not be adequate for the task at hand
  - integration is not a major issue
(18)
(19)
What that application does…
• Gets the data published by NHS
• Processes the data (e.g., through Hadoop)
• Integrates the result of the analysis with geographical data
I.e., the raw data is used without integration
(20)
The reality of data on the Web…
• It is still a fairly messy space out there :-(
  - many different formats are used
  - data is difficult to find
  - published data are messy, erroneous
  - tools are complex, unfinished…
(21)
How do developers perceive this?
‘When transportation agencies consider data integration, one pervasive notion is that the analysis of existing information needs and infrastructure, much less the organization of data into viable channels for integration, requires a monumental initial commitment of resources and staff. Resource-scarce agencies identify this perceived major upfront overhaul as "unachievable" and "disruptive."’
-- Data Integration Primer: Challenges to Data Integration, US Dept. of Transportation
(22)
One may look at the problem through different goggles
• Two alternatives come to the fore:
  1. provide tools, environments, etc., to help outsiders to publish Linked Data (in RDF) easily
     - a typical example is the Datalift project
  2. forget about RDF, Linked Data, etc., and concentrate on the raw data instead
(24)
But religions and cultures can coexist… :-)
(25)
Open Data on the Web Workshop
• Had a successful workshop in London, in April:
  - around 100 participants
  - coming from different horizons: publishers and users of Linked Data, CSV, PDF, …
(26)
We also talked to our “stakeholders”
• Member organizations and companies
• Open Data Institute, Open Knowledge Foundation, Schema.org
• …
(27)
Some takeaways
• The Semantic Web community needs stability of the technology
  - do not add yet another technology block :-)
  - existing technologies should be maintained
(28)
Some takeaways
• Look at the more general space, too
  - importance of metadata
  - deal with non-RDF data formats
  - best practices are necessary to raise the quality of published data
(29)
We need to meet app developers where they are!
(30)
Metadata is of major importance
• Metadata describes the characteristics of the dataset
  - structure, datatypes used
  - access rights, licenses
  - provenance, authorship
  - etc.
• Vocabularies are also key for Linked Data
(31)
Vocabulary Management Action
• Standard vocabularies are necessary to describe data
  - there are already some initiatives: W3C’s data cube, data catalog, PROV, schema.org, DCMI, …
• At the moment, it is a fairly chaotic world…
  - many, possibly overlapping vocabularies
  - difficult to locate the one that is needed
  - vocabularies may not be properly managed, maintained, versioned, or kept persistent…
(32)
W3C’s plan:
• Provide a space whereby
  - communities can develop
  - host vocabularies at W3C if requested
  - annotate vocabularies with a proper set of metadata terms
  - establish a vocabulary directory
• The exact structure is still being discussed: http://www.w3.org/2013/04/vocabs/
(34)
CSV on the Web
• Planned work areas:
  - metadata vocabulary to describe CSV data
    - structure, reference to access rights, annotations, etc.
  - methods to find the metadata
    - part of an HTTP header, special rows and columns, packaging formats…
  - mapping content to RDF, JSON, XML
• Possibly at a later phase:
  - API standards to access CSV data
(36)
Open Data Best Practices
• Document best practices for data publishers
  - management of persistence, versioning, URI design
  - use of core vocabularies (provenance, access control, ownership, annotations, …)
  - business models
• Specialized Metadata vocabularies
  - quality description (quality of the data, update frequencies, correction policies, etc.)
  - description of data access APIs
  - …
(37)
Summary
• Data on the Web has many different facets
• We have concentrated on the integration aspects in the past years
• We have to take a more general view, look at other types of data published on the Web
(38)
In future…
• We should look at other formats, not only CSV
  - MARC, GIS, ABIF, …
• Better outreach to data publishing communities and organizations
  - WF, RDA, ODI, OKFN, …
Enjoy the event!

Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 

Kürzlich hochgeladen (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 

Standardizing for Open Data

  • 1. (1) Standardizing for Open Data
      Ivan Herman, W3C
      Open Data Week, Marseille, France, June 26 2013
      Slides at: http://www.w3.org/2013/Talks/0626-Marseille-IH/
  • 2. (2) Data is everywhere on the Web!
      - Public, private, behind enterprise firewalls
      - Ranges from informal to highly curated
      - Ranges from machine readable to human readable
          - HTML tables, twitter feeds, local vocabularies, spreadsheets, …
      - Expressed in diverse models
          - tree, graph, table, …
      - Serialized in many ways
          - XML, CSV, RDF, PDF, HTML Tables, microdata, …
  • 3. (3)
  • 4. (4)
  • 5. (5)
  • 6. (6)
  • 7. (7)
  • 8. (8) W3C’s standardization focus was, traditionally, on Web scale integration of data
      - Some basic principles:
          - use of URIs everywhere (to uniquely identify things)
          - relate resources among one another (to connect things on the Web)
          - discover new relationships through inferences
      - This is what the Semantic Web technologies are all about
  • 9. (9) We have a number of standards
      - Diagram labels: RDF 1.1, SPARQL 1.1, URI, JSON-LD, Turtle, RDFa, RDF/XML
      - RDF: data model, links, basic assertions; different serializations
      - SPARQL: querying data
      - A fairly stable set of technologies by now!
  • 10. (10) We have a number of standards
      - Diagram labels: RDB2RDF, RDF 1.1, RDFS 1.1, SPARQL 1.1, OWL 2, URI, JSON-LD, Turtle, RDFa, RDF/XML
      - RDF: data model, links, basic assertions; different serializations
      - SPARQL: querying data
      - RDFS: simple vocabularies
      - OWL: complex vocabularies, ontologies
      - RDB2RDF: databases to RDF
      - A fairly stable set of technologies by now!
  • 11. (11) We have Linked Data principles
  • 12. (12) Integration is done in different ways
      - Very roughly:
          - data is accessed directly as RDF and turned into something useful
              - relies on data being “preprocessed” and published as RDF
          - data is collected from different sources, integrated internally
              - using, say, a triple store
  • 13. (13)
  • 14.
  • 15. (15) However…
      - There is a price to pay: a relatively heavy ecosystem
          - many developers shy away from using RDF and related tools
      - Not all applications need this!
          - data may be used directly, no need for integration concerns
          - the emphasis may be on easy production and manipulation of data with simple tools
  • 16. (16) Typical situation on the Web
      - Data published in CSV, JSON, XML
      - An application uses only 1-2 datasets; integration done by direct programming is straightforward
          - e.g., in a Web Application
      - Data is often very large; direct manipulation is more efficient
  • 17. (17) Non-RDF Data
      - In some settings the data can be converted into RDF
      - But, in many cases, it is not done
          - e.g., CSV data is way too big
          - RDF tooling may not be adequate for the task at hand
          - integration is not a major issue
  • 18. (18)
  • 19. (19) What that application does…
      - Gets the data published by NHS
      - Processes the data (e.g., through Hadoop)
      - Integrates the result of the analysis with geographical data
      - I.e., the raw data is used without integration
  • 20. (20) The reality of data on the Web…
      - It is still a fairly messy space out there :-(
          - many different formats are used
          - data is difficult to find
          - published data are messy, erroneous
          - tools are complex, unfinished…
  • 21. (21) How do developers perceive this?
      “When transportation agencies consider data integration, one pervasive notion is that the analysis of existing information needs and infrastructure, much less the organization of data into viable channels for integration, requires a monumental initial commitment of resources and staff. Resource-scarce agencies identify this perceived major upfront overhaul as ‘unachievable’ and ‘disruptive.’”
      -- Data Integration Primer: Challenges to Data Integration, US Dept. of Transportation
  • 22. (22) One may look at the problem through different goggles
      - Two alternatives come to the fore:
          1. provide tools, environments, etc., to help outsiders to publish Linked Data (in RDF) easily
              - a typical example is the Datalift project
          2. forget about RDF, Linked Data, etc., and concentrate on the raw data instead
  • 23.
  • 24. (24) But religions and cultures can coexist… :-)
  • 25. (25) Open Data on the Web Workshop
      - Had a successful workshop in London, in April:
          - around 100 participants
          - coming from different horizons: publishers and users of Linked Data, CSV, PDF, …
  • 26. (26) We also talked to our “stakeholders”
      - Member organizations and companies
      - Open Data Institute, Open Knowledge Foundation, Schema.org
      - …
  • 27. (27) Some takeaways
      - The Semantic Web community needs stability of the technology
          - do not add yet another technology block :-)
          - existing technologies should be maintained
  • 28. (28) Some takeaways
      - Look at the more general space, too
          - importance of metadata
          - deal with non-RDF data formats
          - best practices are necessary to raise the quality of published data
  • 29. (29) We need to meet app developers where they are!
  • 30. (30) Metadata is of major importance
      - Metadata describes the characteristics of the dataset
          - structure, datatypes used
          - access rights, licenses
          - provenance, authorship
          - etc.
      - Vocabularies are also key for Linked Data
  • 31. (31) Vocabulary Management Action
      - Standard vocabularies are necessary to describe data
          - there are already some initiatives: W3C’s data cube, data catalog, PROV, schema.org, DCMI, …
      - At the moment, it is a fairly chaotic world…
          - many, possibly overlapping vocabularies
          - difficult to locate the one that is needed
          - vocabularies may not be properly managed, maintained, versioned, given persistence…
  • 32. (32) W3C’s plan:
      - Provide a space whereby
          - communities can develop
          - host vocabularies at W3C if requested
          - annotate vocabularies with a proper set of metadata terms
          - establish a vocabulary directory
      - The exact structure is still being discussed: http://www.w3.org/2013/04/vocabs/
  • 33.
  • 34. (34) CSV on the Web
      - Planned work areas:
          - metadata vocabulary to describe CSV data
              - structure, reference to access rights, annotations, etc.
          - methods to find the metadata
              - part of an HTTP header, special rows and columns, packaging formats…
          - mapping content to RDF, JSON, XML
      - Possibly at a later phase:
          - API standards to access CSV data
  • 35.
  • 36. (36) Open Data Best Practices
      - Document best practices for data publishers
          - management of persistence, versioning, URI design
          - use of core vocabularies (provenance, access control, ownership, annotations, …)
          - business models
      - Specialized Metadata vocabularies
          - quality description (quality of the data, update frequencies, correction policies, etc.)
          - description of data access APIs
          - …
  • 37. (37) Summary
      - Data on the Web has many different facets
      - We have concentrated on the integration aspects in the past years
      - We have to take a more general view, look at other types of data published on the Web
  • 38. (38) In future…
      - We should look at other formats, not only CSV
          - MARC, GIS, ABIF, …
      - Better outreach to data publishing communities and organizations
          - WF, RDA, ODI, OKFN, …
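
The "CSV on the Web" slide (34) lists a metadata description of CSV data and a mapping of the content to RDF and JSON as planned work areas. The following is only a rough sketch of that idea in Python, not the W3C design: the metadata layout (`base`, `key`, `columns`, `datatype`, `property`), the example.org URIs, and the sample data are all assumptions invented for illustration. It shows how a small, separately published description could drive both a typed JSON view and a list of subject/property/value triples.

```python
import csv
import io
import json

# Hypothetical CSV content; column names and values are invented for illustration.
CSV_TEXT = """station,date,temperature
marseille-01,2013-06-26,27.5
marseille-02,2013-06-26,26.0
"""

# Hypothetical metadata description of the CSV file (not a standard vocabulary):
# it records a license, the datatype of each column, and, for the mapping to
# triples, a base URI, a key column, and a property URI per column.
METADATA = {
    "license": "http://example.org/licenses/open",
    "base": "http://example.org/stations/",
    "key": "station",
    "columns": {
        "station": {"datatype": "string"},
        "date": {"datatype": "string",
                 "property": "http://example.org/vocab#date"},
        "temperature": {"datatype": "float",
                        "property": "http://example.org/vocab#temperature"},
    },
}

# Converters for the datatype names used in the metadata above.
CASTS = {"string": str, "float": float, "integer": int}


def typed_rows(csv_text, metadata):
    """Parse the CSV text and convert each cell using the declared datatype."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {}
        for name, value in raw.items():
            datatype = metadata["columns"][name]["datatype"]
            row[name] = CASTS[datatype](value)
        rows.append(row)
    return rows


def to_triples(rows, metadata):
    """Map each row to (subject URI, property URI, value) tuples."""
    triples = []
    for row in rows:
        subject = metadata["base"] + row[metadata["key"]]
        for name, spec in metadata["columns"].items():
            if "property" in spec:  # columns without a property are skipped
                triples.append((subject, spec["property"], row[name]))
    return triples


rows = typed_rows(CSV_TEXT, METADATA)
as_json = json.dumps(rows, indent=2)   # JSON view of the same data
triples = to_triples(rows, METADATA)   # RDF-like view of the same data
```

The point of the sketch is that the CSV file itself stays untouched: an application that only needs the raw rows can ignore the metadata, while one that cares about integration can use it to produce typed JSON or triples.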