Linked (Open) Data


  1. 1. Linked (Open) Data INFO 4302 - April 18, 2011 Bernhard Haslhofer - Cornell University
  2. 2. Who am I? • Postdoc at Cornell Information Science • Research areas • linked data • user-contributed data (annotations) • (meta-)data interoperability • Contact: • bernhard.haslhofer@cornell.edu
  3. 3. Today we talk about... http://www.youtube.com/watch?v=5Cb3ik6zP2I
  4. 4. Today we talk about... • Movies, actors and other real-world entities • How to make data about these entities available on the Web (Linked Data) • Enabling technologies, best practices, and useful tools that help us do so • Other Linked Data projects (BBC, LoC)
  5. 5. Web Architecture Recap
  6. 6. The World Wide Web (WWW) • Internet != WWW != Google != Facebook • Fundamental technologies • URI - a simple and generic syntax for identifiers • HTML - a markup language without formal schema binding • HTTP - a simple protocol to access and manipulate resources and resource representations in a distributed environment • World Wide Web Consortium, W3C (http://www.w3.org)
  7. 7. URIs • Identification of resources via Uniform Resource Identifiers (URIs) • The generic syntax consists of a hierarchical sequence of components: scheme, authority, path, query, and fragment • URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ] • Scheme and path are required, though the path may be empty • Example URI with components: foo://example.com:8042/over/there?name=ferret#nose • scheme = foo • authority = example.com:8042 • path = /over/there • query = name=ferret • fragment = nose
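The component breakdown on this slide can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the dictionary keys are chosen here to match the slide's component names and are not part of any API.

```python
from urllib.parse import urlsplit

# Example URI taken from the slide.
uri = "foo://example.com:8042/over/there?name=ferret#nose"
parts = urlsplit(uri)

# Map the standard-library field names onto the slide's terminology.
# Note that urlsplit calls the authority component "netloc".
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:10}{value}")
```

The authority can be split further if needed: `parts.hostname` and `parts.port` give the host and port separately.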
  8. 8. URIs / Resources • Information Resource • web pages, images, product catalogs, etc • all their essential characteristics can be conveyed in a message • e.g., http://www.flickr.com/user2/photos/image.jpg • Non-Information Resource • other things such as dogs, people, this classroom, concepts • their essence is not information • e.g., http://www.example.com/ontology/meter
  9. 9. HTTP • A stateless request-response protocol in the client-server computing model • HTTP methods: GET, POST, PUT, DELETE, ... • Agents may use a URI to access the referenced resource; this is called dereferencing the URI
  10. 10. HTTP Content Negotiation • A URI is not (necessarily) a filename • Conneg = making available multiple resource representations via the same URI • Example: the resource at the URI http://example.com/The_Shining has three representations: plain text (text/plain), HTML in English (text/html), and HTML in Japanese (text/html)
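How a server might choose among representations can be sketched as follows. This is a simplified illustration, not a complete implementation of HTTP content negotiation: it looks only at media types and q-values in the Accept header, ignores language negotiation (Accept-Language) and specificity rules, and the function names and the AVAILABLE list are hypothetical.

```python
def parse_accept(header):
    """Return (media_range, q) pairs from an Accept header, highest q first."""
    prefs = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def matches(media_range, media_type):
    """True if a media range such as text/* or */* covers media_type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept_header, available):
    """Pick the first available media type acceptable to the client, or None."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        for media_type in available:
            if matches(media_range, media_type):
                return media_type
    return None


# Hypothetical representations offered for http://example.com/The_Shining
AVAILABLE = ["text/html", "text/plain"]

print(negotiate("text/plain", AVAILABLE))
print(negotiate("application/rdf+xml;q=0.9, text/html;q=0.8", AVAILABLE))
```

When nothing matches, `negotiate` returns `None`; a real server would typically answer with 406 Not Acceptable or fall back to a default representation.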
  11. 11. (X)HTML(5) • A resource representation data format... • ... for presentation markup • rendered by user agents (typically browsers) • focus on readability • less formal, user-friendly syntax and semantics
  12. 12. Web Services • Application-to-application communication based on the Web architecture • simple and open standards (HTTP, XML, JSON, ...) • send data from Application A to Application B through the Web • usually define some API
  13. 13. Linked Data
  14. 14. Why Linked Data?
  15. 15. Why Linked Data?
  16. 16. Why Linked Data?
  17. 17. Why Linked Data? • There is lots of information on the Web • ...valuable information that can be (re-)used • Problem • information is usually expressed in the form of HTML documents • the underlying raw data are locked in closed data silos (mostly DBMS)
  18. 18. (c) http://www.flickr.com/photos/docsearls/5500714140
  19. 19. Why Linked Data? • The Web is successful because it provides • Uniform encoding (HTML) • Uniform addressing (URI) • Uniform transportation (HTTP) for the exchange of documents. • Why not apply the same mechanism to the underlying data?
  20. 20. What is Linked Data? • A method to build a Web of Data • Architectural style, set of standards
  21. 21. What is Linked Data? • A set of four principles • use URIs as names for things • use HTTP URIs so that people can look up those names • when someone looks up a URI, provide useful information, using the standards (RDF, SPARQL) • include links to other URIs, so that they can discover more things
  22. 22. Enabling Technologies
  23. 23. Uniform Resource Identifiers (URI) • Name and identify things (resources) • Dereferenceable HTTP URIs • http://dbpedia.org/resource/The_Shining_(film) • http://data.linkedmdb.org/resource/film/2014 • http://rdf.freebase.com/ns/m/04fjzv
  24. 24. Resource Description Framework (RDF) • A model for representing data on the Web • Several statements (triples) form a graph • Example graph: • http://dbpedia.org/resource/The_Shining_(film) rdf:type http://dbpedia.org/ontology/Film • http://dbpedia.org/resource/Jack_Nicholson rdf:type http://xmlns.com/foaf/0.1/Person • http://dbpedia.org/resource/The_Shining_(film) dbpprop:starring http://dbpedia.org/resource/Jack_Nicholson • The_Shining_(film) rdfs:label "The Shining (film)" • Jack_Nicholson foaf:name "Jack Nicholson" • Jack_Nicholson dbpedia-owl:birthDate "1937-04-22"
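The idea that triples form a graph can be illustrated with plain Python tuples. This is a minimal sketch, not an RDF library: literals are stored as ordinary strings without datatypes, the helper `objects` is hypothetical, and the assignment of literals to resources follows one reading of the slide's diagram.

```python
# Namespace prefixes used on the slide.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
DBP = "http://dbpedia.org/property/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

film = DBR + "The_Shining_(film)"
actor = DBR + "Jack_Nicholson"

# Each statement is a (subject, predicate, object) triple.
triples = {
    (film, RDF + "type", DBO + "Film"),
    (actor, RDF + "type", FOAF + "Person"),
    (film, DBP + "starring", actor),
    (film, RDFS + "label", "The Shining (film)"),
    (actor, FOAF + "name", "Jack Nicholson"),
    (actor, DBO + "birthDate", "1937-04-22"),
}


def objects(graph, subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# Follow an edge in the graph: who stars in the film, and what is their name?
for person in objects(triples, film, DBP + "starring"):
    print(objects(triples, person, FOAF + "name"))
```

Because the object of one triple (the actor) is the subject of others, following `dbpprop:starring` and then `foaf:name` walks two edges of the graph.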
  25. 25. RDF serialization (RDF/XML, N3, Turtle, etc.) • Data formats for RDF resource representations: RDF/XML, N3, Turtle, N-Triples, etc. • Used to transfer RDF data from application to application • N3/Turtle example: @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix dbpedia-owl: <http://dbpedia.org/ontology/> . <http://dbpedia.org/resource/The_Shining_%28film%29> rdf:type dbpedia-owl:Work , dbpedia-owl:Film . @prefix dbpprop: <http://dbpedia.org/property/> . @prefix ns9: <http://dbpedia.org/datatype/> . <http://dbpedia.org/resource/The_Shining_%28film%29> dbpprop:runtime "146.0"^^ns9:minute ; • (Example slide © Prof. Dr. Wolfgang Klas and Dr. Bernhard Haslhofer, WS 2010/11 - Multimediale Systeme)
  26. 26. RDF Vocabulary Description Language (RDFS) • A language for describing the syntax and semantics of vocabularies in a machine-understandable way • Example: http://dbpedia.org/ontology/Film rdfs:subClassOf http://dbpedia.org/ontology/Work
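One practical consequence of rdfs:subClassOf is that class membership propagates upward: anything typed as a Film is also a Work. The following is a minimal sketch of that inference over a hand-written class hierarchy; the function names and the dictionary layout are hypothetical and not taken from any RDF toolkit.

```python
DBO = "http://dbpedia.org/ontology/"
FILM = DBO + "Film"
WORK = DBO + "Work"

# rdfs:subClassOf statements, stored as class -> set of direct superclasses.
SUB_CLASS_OF = {FILM: {WORK}}


def superclasses(cls, sub_class_of):
    """All classes reachable from cls by following rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted, sub_class_of):
    """Asserted rdf:type values plus every superclass they imply."""
    types = set(asserted)
    for cls in asserted:
        types |= superclasses(cls, sub_class_of)
    return types


# A resource asserted only as a Film is inferred to be a Work as well.
print(sorted(inferred_types({FILM}, SUB_CLASS_OF)))
```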
  27. 27. OWL - Web Ontology Language • A more expressive (formal) language for defining the syntax and semantics of vocabularies • Solves RDFS shortcomings but introduces considerable complexity • Example: • http://dbpedia.org/ontology/starring rdf:type http://www.w3.org/2002/07/owl#ObjectProperty • http://dbpedia.org/ontology/starring rdfs:domain http://dbpedia.org/ontology/Work • http://dbpedia.org/ontology/starring rdfs:range http://dbpedia.org/ontology/Person • http://dbpedia.org/ontology/starring rdfs:label "starring"
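  28. 28. Simple Knowledge Organization System (SKOS) • A language for describing controlled vocabularies (taxonomies, thesauri, classification schemes) • Example: • http://dbpedia.org/resource/The_Shining_(film) skos:subject http://dbpedia.org/resource/Category:1980s_horror_films • http://dbpedia.org/resource/Category:1980s_horror_films rdf:type http://www.w3.org/2004/02/skos/core#Concept • http://dbpedia.org/resource/Category:1980s_horror_films skos:broader http://dbpedia.org/resource/Category:1980s_films • http://dbpedia.org/resource/Category:1980s_films rdf:type http://www.w3.org/2004/02/skos/core#Concept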
  29. 29. Links between Resources • OWL defines properties for linking resources • Example: http://dbpedia.org/resource/The_Shining_(film) dbpprop:starring http://dbpedia.org/resource/Jack_Nicholson • owl:sameAs links connect these DBpedia resources to equivalent resources in other data sets: • http://data.linkedmdb.org/resource/film/2014 • http://data.nytimes.com/N5761411277431266513 • http://rdf.freebase.com/ns/m/04fjzv
  30. 30. SPARQL • A query language and protocol for accessing RDF data on the Web • Example: SELECT DISTINCT ?x WHERE { ?x skos:subject <http://dbpedia.org/resource/Category:1980s_horror_films> } LIMIT 10
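SPARQL is a protocol as well as a query language: a query can be sent to an endpoint as the `query` parameter of an HTTP GET request. The sketch below only builds such a request URL and does not contact any server. The endpoint address http://dbpedia.org/sparql is an assumption not stated on the slide, the PREFIX line is added so the query is self-contained, and whether the query still returns results on a live endpoint is not verified here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint address; the slide does not name one.
ENDPOINT = "http://dbpedia.org/sparql"

# The slide's query, with an explicit declaration of the skos: prefix.
QUERY = """\
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?x WHERE {
  ?x skos:subject <http://dbpedia.org/resource/Category:1980s_horror_films>
} LIMIT 10
"""


def sparql_get_url(endpoint, query):
    """Build the GET URL that carries a SPARQL query to an endpoint."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

The result format would normally be selected with an Accept header on the request, in the same way as content negotiation for any other Web resource.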
  31. 31. Vocabulary / Data Publishing Best Practices
  32. 32. Publishing Vocabularies • Hash-based URIs • e.g., http://example.com/example1#ClassA • Suited to group the description of a moderate number of related terms into one RDF document • Agent can retrieve terms with a single request • Slash-based URIs • e.g., http://example.com/example1/ClassB • Suited to split terms in large vocabularies into one document per term • No need to download a massive document
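The difference between the two URI styles comes from how HTTP treats fragments: the part after `#` is never sent to the server, so all hash-based terms of a vocabulary resolve to the same document. A minimal sketch using the standard library, where `document_uri` is a hypothetical helper and `ClassC` is an invented extra term for illustration:

```python
from urllib.parse import urldefrag


def document_uri(term_uri):
    """Return the URI an HTTP client actually requests for a vocabulary term."""
    return urldefrag(term_uri).url


terms = [
    "http://example.com/example1#ClassA",  # hash-based, from the slide
    "http://example.com/example1#ClassC",  # hash-based, invented for illustration
    "http://example.com/example1/ClassB",  # slash-based, from the slide
]

for term in terms:
    print(term, "->", document_uri(term))
```

Both hash-based terms map to http://example.com/example1, which is why one request retrieves the whole vocabulary, while the slash-based term keeps its own document.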
  33. 33. Provide either: human-readable content from vocabulary URI
  34. 34. or: machine-readable content from vocabulary URI ... depending on what is requested.
  35. 35. Publishing Data • Distinguish between non-information resources and information resources • Sample non-information resource • http://dbpedia.org/resource/The_Shining_(film) • Sample information resources • http://dbpedia.org/page/The_Shining_(film) - HTML • http://dbpedia.org/data/The_Shining_(film) - RDF
  36. 36. Publishing Data • Request: GET http://dbpedia.org/resource/The_Shining_(film) with Accept: application/rdf+xml • Response: 303 See Other with Location: http://dbpedia.org/data/The_Shining_(film) • Request: GET http://dbpedia.org/data/The_Shining_(film) with Accept: application/rdf+xml • Response: 200 OK ... <?xml version="1.0" encoding="utf-8"?> <rdf:RDF ...
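The server-side decision behind the 303 redirect can be sketched as a small pure function. This is an illustration of the pattern only, not DBpedia's actual implementation: the function name `see_other` is hypothetical, the Accept header is checked with a simple substring test rather than full content negotiation, and the 404 fallback is an assumption.

```python
RESOURCE_PREFIX = "http://dbpedia.org/resource/"  # non-information resources
DATA_PREFIX = "http://dbpedia.org/data/"          # RDF documents
PAGE_PREFIX = "http://dbpedia.org/page/"          # HTML documents


def see_other(uri, accept):
    """Return (status, location) for a GET on a non-information resource."""
    if not uri.startswith(RESOURCE_PREFIX):
        return 404, None  # assumed fallback for URIs outside the pattern
    name = uri[len(RESOURCE_PREFIX):]
    if "application/rdf+xml" in accept:
        # Machine clients are sent to the RDF description.
        return 303, DATA_PREFIX + name
    # Everyone else is sent to the human-readable page.
    return 303, PAGE_PREFIX + name


print(see_other("http://dbpedia.org/resource/The_Shining_(film)", "application/rdf+xml"))
print(see_other("http://dbpedia.org/resource/The_Shining_(film)", "text/html"))
```

The 303 status matters because the requested URI names a film, not a document: the server cannot return the film itself, so it points the client to a document that describes it.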
  37. 37. The Linking Open Data Community Project
  38. 38. Linking? Open? Data Project? • Open Data: a philosophy, practice, or policy that data are freely available to everyone without restrictions from copyright, patents, and so on • Linked Data: method / best practices for exposing, sharing, and connecting data using URIs and RDF • Linking Open Data: a W3C community project with the goal to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting links between data items from different sources
  39. 39. Useful Tools
  40. 40. RDF APIs • Java • Jena Semantic Web Framework (http://openjena.org/) • Sesame RDF API (http://www.openrdf.org/) • PHP • ARC (http://arc.semsol.org/) • Ruby • RDF.rb: Linked Data for Ruby (http://rdf.rubyforge.org/) • Python • RDFLib (http://www.rdflib.net/) • C • Redland RDF Libraries (http://librdf.org/)
  41. 41. RDF Stores • OpenLink Virtuoso (http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/) • 4Store (http://4store.org/) • AllegroGraph (http://www.franz.com/agraph/allegrograph/) • Oracle 11g (http://www.oracle.com/technetwork/database/options/semantic-tech/index.html) • ...and many more: http://www.w3.org/2001/sw/wiki/Tools
  42. 42. RDF / Linked Data Wrappers • D2RQ - SPARQL / Linked Data for relational databases (http://www4.wiwiss.fu-berlin.de/bizer/d2rq/) • OAI2LOD Server - expose any OAI-PMH source as Linked Data • TripFS - filesystem as Linked Data • TripCel - XLS spreadsheets as Linked Data • ...
  43. 43. Linked Data debugging • Start up your console / terminal - native on Linux / Mac OS X - Windows: http://www.cygwin.com/ • Dereference resources with cURL (http://curl.haxx.se/) • curl -I -H "Accept: application/rdf+xml" http://dbpedia.org/resource/The_Shining_%28film%29 • curl -H "Accept: application/rdf+xml" http://dbpedia.org/data/The_Shining_%28film%29
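The same requests can be prepared from Python when cURL is not available. This is a rough sketch using only the standard library; the helper names `build_request` and `fetch` are hypothetical, and `fetch` is defined but not called here because it needs network access. Unlike `curl -I`, `urllib.request.urlopen` follows redirects automatically, so it reports the final document rather than the intermediate 303 response.

```python
import urllib.request

RESOURCE = "http://dbpedia.org/resource/The_Shining_%28film%29"


def build_request(uri, accept="application/rdf+xml", head_only=False):
    """Prepare a GET (or HEAD, like curl -I) request with an Accept header."""
    method = "HEAD" if head_only else "GET"
    return urllib.request.Request(uri, headers={"Accept": accept}, method=method)


def fetch(request):
    """Send the request and report status, content type, and final URL."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.headers.get("Content-Type"), response.geturl()


request = build_request(RESOURCE, head_only=True)
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch(build_request(RESOURCE))` on a machine with network access should show the final URL after redirection, which can be compared with the Location header that cURL prints.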
  44. 44. Linked Data debugging • Install the Raptor RDF Syntax Library (http://librdf.org/raptor/) - Mac: brew install raptor • Use the rapper utility to dereference URIs • rapper http://dbpedia.org/resource/The_Shining_%28film%29 • rapper -o rdfxml http://dbpedia.org/resource/The_Shining_%28film%29
  45. 45. Readings
  46. 46. Required Reading • T. Heath, C. Bizer. Linked Data: Evolving the Web into a Global Data Space, Chapters 1-5 http://linkeddatabook.com/editions/1.0/
  47. 47. Recommended Readings • Linked Data Web Site: http://linkeddata.org • Linked Data / Semantic Web Introduction: http://www.linkeddatatools.com/semantic-web-basics • Tim Berners-Lee. Linked Data Design Issues: http://www.w3.org/DesignIssues/LinkedData.html • Best Practice Recipes for Publishing RDF Vocabularies: http://www.w3.org/TR/swbp-vocab-pub/ • How to Publish Linked Data on the Web: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/