Linked Data. Juan F. Sequeda and Daniel P. Miranker, Capsenta. Semantic Tech & Business Conference 2012, June 4, 2012.
Part 4: Linked Enterprise Data
Use Linked Data Principles internally Con...
Linked Enterprise Data: Linked Data can be used as an architectural style for integrating data in the Enterprise. 1. Stand...
Linked Enterprise Data: Information creation, information sharing. Produce and consume data specific to your needs but a...
Benefits of RDF/Linked Data: RDF (graphs) is a least common denominator: Text, CSV, XML, XLS, RDB to RDF. Imagine mo...
Benefits of RDF/Linked Data: Power of the URI and links: universal identifier; create a "foreign key" to a table tha...
What's next? W3C Linked Data Platform Working Group (http://www.w3.org/2012/ldp/charter). Linked Data Basic Profile 1.0...
Summary
Linked Data Checklist: Does your data link to other data sets? Do you provide provenance metadata? Do you provide licens...
Acknowledgements: RiBS Lab (UT Austin); Olaf Hartig (Humboldt University Berlin); Patrick Sinclair (BBC); Jamie Taylor...
Thanks! Juan F. Sequeda (juan@capsenta.com), Daniel P. Miranker (miranker@capsenta.com) ...
Linked Data tutorial at Semtech 2012

My Linked Data tutorial, presented at Semtech 2012.

http://semtechbizsf2012.semanticweb.com/sessionPop.cfm?confid=65&proposalid=4724

Linked Data tutorial at Semtech 2012

  1. June 4, 2012. Linked Data. Juan F. Sequeda and Daniel P. Miranker, Capsenta. Semantic Tech & Business Conference 2012. www.capsenta.com
  2. Outline: Part 1: Introduction to Linked Data; Part 2: Linked Data Principles; Part 3: Linked Data Architectures; Part 4: Linked Enterprise Data
  3. Part 1: Introduction to Linked Data
  4. The Web is a Data Shredder. [Diagram: Structured Data, Unstructured Data.] Thanks, Martin Hepp.
  5. The Web of Documents. [Diagram: Search, Search Engine, Crawler.]
  6. What would we like? Make it easy for computers/software to find THINGS. Do you SEARCH or do you FIND?
  7. Search for football players who went to the University of Texas at Austin and played for the Dallas Cowboys as cornerback.
  11. Why can't we just FIND it…
  14. Guess how I FOUND out?
  15. On a Semantic Web: besides publishing documents on the web, which computers can't understand easily, let's publish on the web something that computers can understand: DATA.
  16. The Semantic Web is a web of data. The current web is a web of documents.
  17. But wait… doesn't the web already have data?
  18. Current Data on the Web: relational databases, APIs, XML, CSV, XLS, … Can't computers and applications already consume that data on the web?
  19. Yes! But it is all in different formats and data models!
  20. This makes it hard to integrate data.
  21. The data in different data sources aren't linked.
  22. For example, how do I state that the Juan Sequeda in Facebook is the same as the Juan Sequeda in Twitter?
  23. Or if I create a mashup from different services, I have to learn different APIs and I get different formats of data back.
  24. Data is siloed.
  25. Wouldn't it be great if we had a standard way of publishing data on the Web?
  26. We have a standardized way of publishing documents on the web, right? HTML.
  27. Then why can't we have a standard way of publishing data on the Web?
  28. Good question! And the answer is YES. There is! RDF.
  29. Resource Description Framework (RDF): a data model is a way to model data (e.g. relational databases use the relational data model). RDF is a graph data model.
  30. RDF is a Graph:
      <JuanSequeda> <firstName> "Juan"
      <JuanSequeda> <lastName> "Sequeda"
      <JuanSequeda> <livesIn> "Austin"
      <JuanSequeda> <knows> <DanielMiranker>
      ..
      <DanielMiranker> <firstName> "Daniel"
      <DanielMiranker> <lastName> "Miranker"
      <DanielMiranker> <livesIn> "Austin"
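To make "RDF is a graph" concrete, here is a minimal Python sketch that stores the triples from slide 30 as plain tuples and reads them as graph edges. The helper names (`outgoing_edges`, `nodes`) are made up for illustration, and the short names stand in for the full URIs real RDF would use.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# The short names mirror the slide; real RDF would use full URIs.
TRIPLES = [
    ("JuanSequeda", "firstName", "Juan"),
    ("JuanSequeda", "lastName", "Sequeda"),
    ("JuanSequeda", "livesIn", "Austin"),
    ("JuanSequeda", "knows", "DanielMiranker"),
    ("DanielMiranker", "firstName", "Daniel"),
    ("DanielMiranker", "lastName", "Miranker"),
    ("DanielMiranker", "livesIn", "Austin"),
]


def outgoing_edges(triples, node):
    """Return the (predicate, object) edges that leave the given node."""
    return [(p, o) for s, p, o in triples if s == node]


def nodes(triples):
    """Every subject and every object is a node in the graph."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}
```

Calling `outgoing_edges(TRIPLES, "JuanSequeda")` lists the four edges that leave the JuanSequeda node, including the `knows` edge that connects the two people.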
  31. RDF can be serialized in different ways: RDF/XML, RDFa (RDF in HTML), N3, Turtle, JSON.
  33. RDFa
  34. RDF/XML
  35. RDF/N-Triples
  36. RDF/Turtle
  37. So does that mean that I have to publish my data in RDF now?
  38. You don't have to… but we would like you to.
  39. An example
  40. Document on the Web
  41. Databases back up documents. THINGS have PROPERTIES: a book has a title, an author, … This is a THING: a book titled "Programming the Semantic Web" by Toby Segaran, …
      Isbn               Title                         Author        PublisherID  ReleasedDate
      978-0-596-15381-6  Programming the Semantic Web  Toby Segaran  1            July 2009
      …                  …                             …             …            …

      PublisherID  PublisherName
      1            O'Reilly Media
      …            …
  42. Let's represent the data in RDF. The row from the two tables on slide 41 becomes a graph:
      book title "Programming the Semantic Web"
      book author "Toby Segaran"
      book isbn "978-0-596-15381-6"
      book publisher Publisher
      Publisher name "O'Reilly"
  43. Remember that we are on the web. Everything on the web is identified by a URI.
  44. And now let's link the data to other data:
      <http://…/isbn978> title "Programming the Semantic Web"
      <http://…/isbn978> author "Toby Segaran"
      <http://…/isbn978> isbn "978-0-596-15381-6"
      <http://…/isbn978> publisher <http://…/publisher1>
      <http://…/publisher1> name "O'Reilly"
  45. And now consider the data from Revyu.com:
      <http://…/isbn978> hasReview <http://…/review1>
      <http://…/review1> description "Awesome Book"
      <http://…/review1> reviewer <http://…/reviewer>
      <http://…/reviewer> name "Juan Sequeda"
  46. Let's start to link data. The review graph and the book graph are joined by an owl:sameAs link between their two book URIs (both hosts are elided on the slide):
      <http://…/isbn978> hasReview <http://…/review1>
      <http://…/review1> description "Awesome Book"
      <http://…/review1> hasReviewer <http://…/reviewer>
      <http://…/reviewer> name "Juan Sequeda"
      <http://…/isbn978> owl:sameAs <http://…/isbn978>
      <http://…/isbn978> title "Programming the Semantic Web"
      <http://…/isbn978> author "Toby Segaran"
      <http://…/isbn978> isbn "978-0-596-15381-6"
      <http://…/isbn978> publisher <http://…/publisher1>
      <http://…/publisher1> name "O'Reilly"
  47. Juan Sequeda publishes data too:
      <http://juansequeda.com/id> livesIn <http://dbpedia.org/Austin>
      <http://juansequeda.com/id> name "Juan Sequeda"
  48. Let's link more data:
      <http://…/isbn978> hasReview <http://…/review1>
      <http://…/review1> description "Awesome Book"
      <http://…/review1> hasReviewer <http://…/reviewer>
      <http://…/reviewer> name "Juan Sequeda"
      <http://…/reviewer> sameAs <http://juansequeda.com/id>
      <http://juansequeda.com/id> livesIn <http://dbpedia.org/Austin>
      <http://juansequeda.com/id> name "Juan Sequeda"
  49. And more: the book graph, the review graph, and Juan Sequeda's own data, all connected through the two owl:sameAs links (book to book, reviewer to http://juansequeda.com/id).
  50. Data on the Web that is in RDF and is linked to other RDF data is LINKED DATA.
  51. Linked Data makes the web appear as ONE GIANT HUGE GLOBAL DATABASE!
  52. I can query a database with SQL. Is there a way to query Linked Data with a query language?
  53. Yes! There is actually a standardized language for that: SPARQL.
  54. FIND all the reviews on the book "Programming the Semantic Web" by people who live in Austin.
  55. SPARQL:
      SELECT ?review ?comment
      WHERE {
        isbn:978 ex:hasReview ?review .
        ?review ex:description ?comment .
        ?review ex:hasReviewer ?person .
        ?person ex:lives dbpedia:Austin .
      }
  56. The same query, shown next to the combined graph from slide 49.
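What a SPARQL engine does with the four triple patterns on slide 55 can be sketched in a few lines of Python. This is only a toy matcher over an in-memory list, not a SPARQL implementation. The data below is an assumption made for illustration: the reviewer is given Juan's URI directly (skipping the owl:sameAs hop from the slides), and the predicate is spelled `ex:lives` to match the query rather than the `livesIn` label used in the graph.

```python
def match(pattern, triple, binding):
    """Try to extend a variable binding so that the pattern equals the triple."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):            # a variable such as ?review
            if new.setdefault(term, value) != value:
                return None                  # already bound to something else
        elif term != value:                  # a constant must match exactly
            return None
    return new


def query(patterns, triples):
    """Evaluate a list of triple patterns (a basic graph pattern)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


# Hypothetical prefixed names standing in for the elided URIs on the slides.
DATA = [
    ("isbn:978", "ex:hasReview", "rev:review1"),
    ("rev:review1", "ex:description", "Awesome Book"),
    ("rev:review1", "ex:hasReviewer", "juan:id"),
    ("juan:id", "ex:lives", "dbpedia:Austin"),
]

PATTERNS = [
    ("isbn:978", "ex:hasReview", "?review"),
    ("?review", "ex:description", "?comment"),
    ("?review", "ex:hasReviewer", "?person"),
    ("?person", "ex:lives", "dbpedia:Austin"),
]
```

`query(PATTERNS, DATA)` returns one binding with `?review` bound to `rev:review1` and `?comment` bound to `Awesome Book`, which corresponds to the SELECT clause on the slide.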
  57. This looks cool, but let's be realistic. What is the incentive to publish Linked Data on the Web?
  58. What was your incentive to publish an HTML page in 1990?
  59. 1) Share data in documents. 2) Because your neighbor was doing it. … later on … 3) Marketing, advertising, …, SEO.
  60. So why should we publish Linked Data in 2012?
  61. 1) Share data as data. 2) Because your neighbor is doing it. … later on … 3) Marketing, advertising, …, SEO.
  62. Linked Data Publishers: US and UK Government, BBC, NY Times, Best Buy, Sears, Kmart, Overstock, … too many more to name.
  63. Linked Open Data
  64. http://www.w3.org/DesignIssues/LinkedData.html
  65. May 2007
  66. Oct 2007
  67. Nov 2007
  68. Feb 2008
  69. Mar 2008
  70. Sept 2008
  71. Mar 2009 (1)
  72. Mar 2009 (2)
  73. July 2009
  74. September 2010
  75. September 2011. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
  76. YOU GET THE PICTURE: IT'S BIG and getting BIGGER and BIGGER.
  77. Part 2: Linked Data Principles
  78. Linked Data is a set of best practices to publish and interlink data on the web.
  79. Linked Data Principles: 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up (dereference) those names. 3. When someone looks up a URI, provide useful information. 4. Include links to other URIs so that they can discover more things.
  80. 1. Use URIs as names for things.
  81. 1) Use URIs as names for things:
      http://dbpedia.org/resource/Austin,_Texas
      http://xmlns.com/foaf/0.1/based_near
      http://juansequeda.com/foaf.rdf#me
      http://www.w3.org/People/Berners-Lee/card#i
      http://xmlns.com/foaf/0.1/knows
  82. 2. Use HTTP URIs so that people can look up (dereference) those names.
  83. 2) Use HTTP URIs: an HTTP client can look up the URI using the HTTP protocol and retrieve a description, e.g. http://dbpedia.org/resource/Austin,_Texas
  87. What's with the redirection (303)?
  89. http://upload.wikimedia.org/wikipedia/commons/0/06/AustinSkylineLouNeffPoint-2010-03-29-b.JPG
  90. http://dbpedia.org/page/Austin,_Texas
  91. http://dbpedia.org/resource/Austin,_Texas identifies the abstract concept of "the city of Austin, Texas". A request with Accept: text/html leads to http://dbpedia.org/page/Austin,_Texas, which identifies an HTML document that describes "the city of Austin, Texas". A request with Accept: application/rdf+xml leads to http://dbpedia.org/data/Austin,_Texas.xml, which identifies an RDF document that describes "the city of Austin, Texas".
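The redirect decision from slides 87 to 91 can be sketched as a small Python function. This is a simplified illustration, not DBpedia's actual server logic: the function name `redirect_for` is hypothetical, and a real server would parse the Accept header properly, including q-values.

```python
RESOURCE = "http://dbpedia.org/resource/Austin,_Texas"
HTML_DOC = "http://dbpedia.org/page/Austin,_Texas"
RDF_DOC = "http://dbpedia.org/data/Austin,_Texas.xml"


def redirect_for(uri, accept_header):
    """Return the (status, location) a server might send for a resource URI."""
    if uri != RESOURCE:
        return 404, None
    # Simplification: look for the RDF media type anywhere in the header
    # and ignore q-values, which a real server would have to honour.
    if "application/rdf+xml" in accept_header:
        return 303, RDF_DOC
    # Browsers (Accept: text/html) and everything else get the HTML page.
    return 303, HTML_DOC
```

The 303 status matters because the resource URI names the city itself, which cannot be sent over HTTP; the server answers "see other" and points at a document about the city.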
  92. Minting HTTP URIs: if you own the domain name and run a web server at that location, mint URIs in this namespace. For example: I own the domain capsenta.com; I run the web server at http://capsenta.com; so I can mint URIs in this namespace, such as http://capsenta.com/person/Juan-Sequeda
  93. Cool URIs (http://www.w3.org/TR/cooluris/). Don't misuse a namespace that you don't own: http://www.imdb.com/title. Avoid implementation details: http://capsenta.com/person.php?id=123&format=rdf. Use natural keys: http://capsenta.com/person/123
  94. 3. When someone looks up a URI, provide useful information.
  95. 3) Provide useful information. How do we provide useful information in document form on the web? HTML. How do we provide useful information in data form on the web? RDF.
  96. What to publish?
      Literal triples:
        <http://dbpedia.org/resource/Austin,_Texas> <http://xmlns.com/foaf/0.1/name> "City of Austin"
      Outgoing link triples:
        <http://dbpedia.org/resource/Austin,_Texas> <http://www.w3.org/2002/07/owl#sameAs> <http://rdf.freebase.com/ns/m/0vzm>
      Incoming link triples:
        <http://dbpedia.org/resource/Dakota_Johnson> <http://dbpedia.org/ontology/birthPlace> <http://dbpedia.org/resource/Austin,_Texas>
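The three kinds of triples on slide 96 differ only in where the described resource appears and whether the object is a literal. A minimal Python sketch, assuming literals are marked with a small `Literal` string subclass (a convention invented here for illustration):

```python
class Literal(str):
    """Marks an object value as a literal rather than a URI."""


AUSTIN = "http://dbpedia.org/resource/Austin,_Texas"

TRIPLES = [
    (AUSTIN, "http://xmlns.com/foaf/0.1/name", Literal("City of Austin")),
    (AUSTIN, "http://www.w3.org/2002/07/owl#sameAs",
     "http://rdf.freebase.com/ns/m/0vzm"),
    ("http://dbpedia.org/resource/Dakota_Johnson",
     "http://dbpedia.org/ontology/birthPlace", AUSTIN),
]


def classify(triple, resource):
    """Say how a triple relates to the resource being described."""
    s, _, o = triple
    if s == resource:
        return "literal" if isinstance(o, Literal) else "outgoing link"
    if o == resource:
        return "incoming link"
    return "unrelated"
```

A description served for the Austin URI would typically contain all three kinds, so that a client can read the city's properties and also follow links in both directions.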
  97. What to publish? A description of the data set: Semantic Sitemaps, voiD (Vocabulary of Interlinked Datasets). Provenance metadata. License information.
  98. Vocabularies (or Schemas or Ontologies). Create your own using RDFS/OWL/SKOS, or reuse vocabularies: Dublin Core (metadata attributes); Friend of a Friend (FOAF) (persons and relationships); Semantically Interlinked Online Communities (SIOC) (describing users, posts, blogs, etc.); Description of a Project (DOAP); Music Ontology; Programmes Ontology (TV and radio programs); Good Relations (describing products and services); Review Vocabulary; Basic Geo (WGS84) Vocabulary.
  99. 4. Include links to other URIs so that they can discover more things.
  100. 4) Include links to other things. Set external RDF links into other data sources on the Web: the subject of the triple is a URI in the namespace of one data set, and the object of the triple is a URI in the namespace of another data set. Connect siloed data islands. Enable discovery.
  101. 4) Include links to other things.
      Relationship link triples:
        <http://juansequeda.com/foaf.rdf#me> <http://xmlns.com/foaf/0.1/based_near> <http://dbpedia.org/resource/Austin,_Texas>
      Identity link triples:
        <http://dbpedia.org/resource/Austin,_Texas> <http://www.w3.org/2002/07/owl#sameAs> <http://rdf.freebase.com/ns/m/0vzm>
      Vocabulary link triples:
        <http://capsenta.com/vocab/name> <http://www.w3.org/2002/07/owl#equivalentProperty> <http://xmlns.com/foaf/0.1/name>
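Slides 100 and 101 can be turned into two small checks: whether a triple is an external link, and which of the three kinds of link it is. The sketch below makes two assumptions that are not in the slides: it approximates "namespace" by the host name of the URI, and it treats a fixed set of OWL and RDFS predicates as identity or vocabulary links.

```python
from urllib.parse import urlsplit

OWL = "http://www.w3.org/2002/07/owl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

IDENTITY = {OWL + "sameAs"}
VOCABULARY = {
    OWL + "equivalentProperty",
    OWL + "equivalentClass",
    RDFS + "subClassOf",
    RDFS + "subPropertyOf",
}


def is_external_link(triple):
    """True when subject and object live on different hosts (assumed namespaces)."""
    s, _, o = triple
    return urlsplit(s).netloc != urlsplit(o).netloc


def link_kind(triple):
    """Classify a link triple by its predicate."""
    predicate = triple[1]
    if predicate in IDENTITY:
        return "identity"
    if predicate in VOCABULARY:
        return "vocabulary"
    return "relationship"


# The three example triples from the slide.
RELATIONSHIP = ("http://juansequeda.com/foaf.rdf#me",
                "http://xmlns.com/foaf/0.1/based_near",
                "http://dbpedia.org/resource/Austin,_Texas")
SAME_AS = ("http://dbpedia.org/resource/Austin,_Texas",
           OWL + "sameAs",
           "http://rdf.freebase.com/ns/m/0vzm")
VOCAB = ("http://capsenta.com/vocab/name",
         OWL + "equivalentProperty",
         "http://xmlns.com/foaf/0.1/name")
```

All three example triples count as external links under this host-based approximation, since each one connects two different sites.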
  102. Which predicate to choose for linking? It depends on your domain. Is it widely used? owl:sameAs, foaf:knows, foaf:based_near, … If you create your own, relate it to a widely used predicate.
  103. Part 3: Linked Data Architectures
  104. Static RDF Files. For a small amount of data (e.g. a personal FOAF file): use the RDF/XML serialization. Save as an .rdf file and upload it to your server: http://www.capsenta.com/company.rdf, http://www.capsenta.com/company.rdf#this. Configure MIME types: AddType application/rdf+xml .rdf. Make RDF discoverable from HTML: <link rel="alternate" type="application/rdf+xml" href="company.rdf">
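If the static file is served from a Python-based server instead of Apache, the same two steps from slide 104 might look like this. It is a sketch using only the standard library; `content_type_for` and `alternate_link` are hypothetical helper names.

```python
import mimetypes

# Same intent as the "AddType application/rdf+xml .rdf" line on the slide.
mimetypes.add_type("application/rdf+xml", ".rdf")


def content_type_for(path):
    """Pick the Content-Type header value for a file path."""
    media_type, _encoding = mimetypes.guess_type(path)
    return media_type or "application/octet-stream"


def alternate_link(href):
    """Build the HTML <link> element that advertises the RDF file."""
    return f'<link rel="alternate" type="application/rdf+xml" href="{href}">'
```

Python's built-in `http.server` request handler consults the same `mimetypes` table, so registering the extension once is usually enough for a quick local test.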
  105. RDF in HTML (RDFa): another syntax for RDF. Useful if you have template HTML pages. Drupal 7 will do this out of the box.
  106. Triplestores (aka RDF db, …). Commercial: Oracle, IBM, Ontotext (OWLIM), Franz (AllegroGraph), OpenLink (Virtuoso), C&P (Stardog), Ontoprise (OntoBroker), Meronymy. Open source: Jena, Sesame, Mulgara, 4store (Garlik), Bigdata (Systap).
  107. RDB2RDF. Upcoming W3C RDB2RDF standards: R2RML (mapping language) and Direct Mapping (default automatic mapping). Two approaches: dynamic (SPARQL to SQL) and ETL (dump RDB to RDF). Ultrawrap: supports the W3C standards and more; SPARQL as fast as SQL.
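The ETL approach with a default automatic mapping can be illustrated with the book row from slide 41. The sketch below only loosely follows the shape of the W3C Direct Mapping (row URI built from table name and key, predicate built from table name and column). It omits rdf:type triples, foreign-key references and datatypes, and the base URI and the table name `Book` are assumptions, since the slides do not name the table.

```python
from urllib.parse import quote

BASE = "http://example.com/"   # hypothetical base URI


def row_to_triples(table, primary_key, row):
    """Turn one relational row (a dict) into a list of triples."""
    subject = f"{BASE}{table}/{primary_key}={quote(str(row[primary_key]))}"
    triples = []
    for column, value in row.items():
        if value is None:          # NULL values produce no triple
            continue
        predicate = f"{BASE}{table}#{column}"
        triples.append((subject, predicate, str(value)))
    return triples


BOOK_ROW = {
    "Isbn": "978-0-596-15381-6",
    "Title": "Programming the Semantic Web",
    "Author": "Toby Segaran",
    "PublisherID": 1,
}
```

The dynamic approach skips this dump step and translates each SPARQL query to SQL at query time, so the relational database stays the single source of truth.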
  108. Unstructured to RDF. [Diagram: Unstructured data, Entity Extractor, Triplestore.]
  109. Semi-structured to RDF. [Diagram: Semi-structured data; XML2RDF, XLS2RDF, CSV2RDF; Triplestore.]
  110. RDB to RDF. [Diagram components: CMS with RDFa, Semantic Wiki; RDB2RDF (SPARQL to SQL); Triplestore; RDB2RDF ETL; Relational Database.]
  111. Creating Linked Data. [Diagram layers. Type of data: unstructured, semi-structured, structured. Data preparation: Entity Extractor; XML2RDF, XLS2RDF, CSV2RDF. Data storage: Triplestore, RDB, data source with API. Data publication: Linked Data web server; CMS with RDFa, Semantic Wiki; RDB2RDF Linked Data interface (i.e. Ultrawrap); custom Linked Data wrapper.] Thanks, Heath and Bizer.
  112. Consuming Linked Data. [Diagram layers: Application; Schema Mapping, Record Linkage, Provenance Tracking; Data Access; Linked Data; Creating Linked Data.]
  113. Schema Matching. Renaming: <ex:name> to <foaf:name>; owl:equivalentClass and owl:equivalentProperty; rdfs:subClassOf or rdfs:subPropertyOf. Structural transformation: <ex:Juan> <ex:lives> "Austin" becomes <ex:Juan> <foaf:based_near> <db:Austin> . <db:Austin> <rdfs:label> "Austin" . Tools: SPARQL CONSTRUCT, RIF, R2R.
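Both kinds of schema mapping on slide 113 can be shown on single triples. This is a hand-written Python sketch rather than SPARQL CONSTRUCT, RIF or R2R. The lookup table `PLACES`, which turns the literal "Austin" into a URI, is an assumption added for illustration.

```python
RENAMES = {"ex:name": "foaf:name"}    # simple predicate renaming
PLACES = {"Austin": "db:Austin"}      # hypothetical literal-to-URI lookup


def map_triple(triple):
    """Map one source triple to one or more target-vocabulary triples."""
    s, p, o = triple
    if p in RENAMES:
        # Renaming: same subject and object, different predicate.
        return [(s, RENAMES[p], o)]
    if p == "ex:lives" and o in PLACES:
        # Structural transformation: the literal becomes a resource
        # that carries the original text as its label.
        place = PLACES[o]
        return [(s, "foaf:based_near", place), (place, "rdfs:label", o)]
    return [triple]                    # anything else passes through unchanged
```

For example, `map_triple(("ex:Juan", "ex:lives", "Austin"))` yields the two target triples shown on the slide.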
  114. Record Linkage. Different URIs can identify the same thing; create owl:sameAs links between them. Manual lookup: Sindice. (Semi-)automatic: Silk.
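Once owl:sameAs links exist, a consumer typically wants to know which URIs end up naming the same thing. A minimal sketch, assuming the links are already available as pairs of URIs (the URIs below are invented), groups them with a union-find structure; it does not attempt to discover the links, which is the job of a tool such as Silk.

```python
def same_as_clusters(links):
    """Group URIs connected by owl:sameAs links into equivalence classes."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return sorted(sorted(members) for members in clusters.values())

links = [
    ("http://a.example/Austin", "http://b.example/Austin"),
    ("http://b.example/Austin", "http://c.example/city/1"),
    ("http://a.example/Dallas", "http://b.example/Dallas"),
]
for cluster in same_as_clusters(links):
    print(cluster)
```

Because owl:sameAs is symmetric and transitive, the first two links put all three Austin URIs into one class even though no link connects the first and the third directly.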
  115. Provenance. Keep track of where the data comes from (quality, trust). Mechanisms: Named Graphs, SPARQL GRAPH.
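Named graphs amount to adding a fourth element to every triple that says which graph, and so which source, it came from. The Python sketch below models that with plain tuples; the graph IRIs and data are hypothetical, and a real system would use a quad store and the SPARQL GRAPH keyword instead.

```python
# Each quad is (subject, predicate, object, graph); the graph names the source.
quads = [
    ("ex:Juan", "foaf:based_near", "ex:Austin", "http://example.org/graphs/crm"),
    ("ex:Juan", "foaf:based_near", "ex:Austin", "http://example.org/graphs/crawl"),
    ("ex:Juan", "foaf:name", "Juan", "http://example.org/graphs/crawl"),
]

def sources_of(quads, triple):
    """Which graphs assert this triple?"""
    return sorted({g for s, p, o, g in quads if (s, p, o) == triple})

def triples_from(quads, trusted_graphs):
    """Keep only triples asserted by at least one trusted graph."""
    return {(s, p, o) for s, p, o, g in quads if g in trusted_graphs}

print(sources_of(quads, ("ex:Juan", "foaf:based_near", "ex:Austin")))
print(triples_from(quads, {"http://example.org/graphs/crm"}))
```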
  116. Centralized (diagram, top to bottom): Application; SPARQL; Triplestore; Creating Linked Data.
  117. Centralized. Advantages: include the datasets that you need; complex queries and high performance; reasoning. Drawbacks: depends on RDF dumps or crawling; effort to set up the centralized triplestore; queried data may be out of date.
  118. Federated (diagram, top to bottom): Application; SPARQL; Federator; one SPARQL connection to each of four sources: two RDB2RDF wrappers over relational databases and two triplestores.
  119. Federated. Advantages: include the datasets that you need; queried data is up to date. Drawbacks: requires the existence of a SPARQL endpoint; effort to set up the federator.
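The federated idea can be shown in miniature: the application asks one question, and the federator forwards it to every source and merges the answers. The Python sketch below uses in-memory stand-ins for SPARQL endpoints and handles a single triple pattern only; an actual federator parses full SPARQL queries, decides which endpoint gets which part, and joins the partial results. All names and data are invented.

```python
def make_endpoint(triples):
    """Return a function that answers one triple pattern (None is a wildcard)."""
    def answer(pattern):
        return [t for t in triples
                if all(want is None or want == got for want, got in zip(pattern, t))]
    return answer

def federate(endpoints, pattern):
    """Send the same pattern to every endpoint and union the answers."""
    merged = []
    for endpoint in endpoints:
        for triple in endpoint(pattern):
            if triple not in merged:  # drop duplicates across sources
                merged.append(triple)
    return merged

# Two hypothetical sources that overlap on one fact.
crm = make_endpoint([("ex:acme", "ex:name", "Acme"), ("ex:acme", "ex:city", "Austin")])
erp = make_endpoint([("ex:acme", "ex:city", "Austin"), ("ex:acme", "ex:revenue", "10")])
print(federate([crm, erp], ("ex:acme", None, None)))
```

Since each call reaches the live sources, the answer reflects their current state, which is the "up to date" advantage; the cost is that every source must expose a queryable endpoint.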
  120. Linked Traversal (diagram, top to bottom): Application; SPARQL; Linked Traversal Query Engine; Linked Data; sources: RDB2RDF over a relational database, Triplestore.
  121. Linked Traversal. Advantages: no need to know the data sources in advance; does not depend on the existence of SPARQL endpoints or RDF dumps; queried data is up to date. Drawbacks: query execution is slow; unsuitable for some queries; results may be incomplete; still in research.
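A toy version of link traversal makes the trade-offs concrete. In the Python sketch below a dictionary stands in for the Web (each URI maps to the triples its document would return), and the engine starts from one seed URI and follows whatever URIs it discovers. The URIs, predicates and the `max_docs` cut-off are assumptions for illustration; a real engine dereferences URIs over HTTP and evaluates a SPARQL query while it crawls.

```python
from collections import deque

def traverse(web, seed, max_docs=10):
    """Collect triples by dereferencing the seed URI and following links."""
    queue, visited, found = deque([seed]), set(), []
    while queue and len(visited) < max_docs:
        uri = queue.popleft()
        if uri in visited or uri not in web:
            continue  # already fetched, or nothing to dereference
        visited.add(uri)
        for triple in web[uri]:
            found.append(triple)
            # Any URI in subject or object position is a link to follow.
            for term in (triple[0], triple[2]):
                if term.startswith("http://") and term not in visited:
                    queue.append(term)
    return found

web = {
    "http://a.example/juan": [
        ("http://a.example/juan", "knows", "http://b.example/daniel")],
    "http://b.example/daniel": [
        ("http://b.example/daniel", "name", "Daniel"),
        ("http://b.example/daniel", "worksAt", "http://c.example/capsenta")],
    "http://c.example/capsenta": [
        ("http://c.example/capsenta", "name", "Capsenta")],
}
print(len(traverse(web, "http://a.example/juan")))
print(len(traverse(web, "http://a.example/juan", max_docs=2)))
```

No source list is needed up front, but every hop is a separate fetch, and stopping early (here with `max_docs=2`) returns fewer triples than the full crawl, which mirrors the "slow" and "incomplete" drawbacks.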
  122. Applications. Linked Data browsers (http://browse.semanticweb.org/). Linked Data (Semantic Web) search engines: Falcons, SWSE, VisiNav, Sindice, Sigma, Swoogle, Watson. Search engines: Google, Bing, Yahoo! Faceted browsers (http://dbpedia.neofonie.de/browse/).
  123. Domain-Specific Applications: BBC World Cup, Seevl.net, Linked Life Data, government apps.
  124. Part 4: Linked Enterprise Data
  125. Use Linked Data principles internally. Consume Linked (Open) Data. Publish Linked (Open) Data.
  126. Linked Enterprise Data. Linked Data can be used as an architectural style for integrating data in the enterprise: 1. Standard data access mechanism: HTTP. 2. Standard address and identifier scheme: URI. 3. Standard data model: RDF.
  127. Linked Enterprise Data. Information creation → information sharing. Produce and consume data specific to your needs, but also produce it in a way that lets it be connected to other data in the enterprise. Distributed but connected! Data that you create may benefit others! Share it!
  128. Benefits of RDF/Linked Data. RDF (graphs) is a least common denominator: text, CSV, XML, XLS, RDB to RDF; imagine modeling a social network in XML. Dynamic and flexible: adding a column to a table in my RDBMS takes 6 months to authorize! With RDF, simply add the triple! Incremental.
  129. Benefits of RDF/Linked Data. Power of the URI and links: universal identifier; create a “foreign key” to a table that I have no control of. Scalability in months, not only seconds: “More can be done with less and faster”; “Cooperation without coordination”.
  130. What’s next? W3C Linked Data Platform Working Group (http://www.w3.org/2012/ldp/charter). Linked Data Basic Profile 1.0 (http://www.w3.org/Submission/ldbp/).
  131. Summary
  132. Linked Data Checklist. Does your data link to other data sets? Do you provide provenance metadata? Do you provide licensing metadata? Do you reuse common vocabularies? Do you map proprietary vocabulary terms to common vocabularies? Do you provide other access methods? (Thanks to Heath & Bizer.)
  133. Acknowledgements: RiBS Lab, UT Austin; Olaf Hartig, Humboldt University Berlin; Patrick Sinclair, BBC; Jamie Taylor, Google; Tom Heath & Chris Bizer, Linked Data: Evolving the Web into a Global Data Space; David Wood (Ed.), Linking Enterprise Data.
  134. Thanks! Juan F. Sequeda (juan@capsenta.com, @juansequeda) and Daniel P. Miranker (miranker@capsenta.com). www.capsenta.com
