7. What is the Web?
“… the Web, is a system of
interlinked hypertext documents
accessed via the Internet. With a
web browser, one can view web
pages that may contain text,
images […] and navigate between
them via hyperlinks”
http://en.wikipedia.org/wiki/World_Wide_Web
9. History of the Web
• Created by Tim Berners-Lee at CERN in 1989
• Mosaic browser in 1993
• W3C created in 1994
• Exponential growth mid 90s
• Amazon, eBay – 1995
• Search engines – Google 1998
• Dot-com boom 1997 – 2001
• Web 2.0 – blogs, Facebook, Twitter, etc
11. WHAT’S THE
WEATHER IN
AUSTIN TODAY?
http://www.flickr.com/photos/jamieca/31631256/
15. What is the problem?
• The web is full of documents
• We aren’t always interested in documents
– We are interested in THINGS
– These THINGS might be in documents
• We can read an HTML document rendered in a
browser and find what we are searching for
– This is hard for computers.
– Computers have to guess (even though they are
pretty good at it)
16. The Web of Documents
(Diagram labels: Search, Search Engine, Crawler)
17. The Web is a Data Shredder
(Diagram labels: Structured Data, Unstructured Data)
Thanks to Martin Hepp
18. What would we like?
• Make it easy for computers/software to find
THINGS
Do you SEARCH or do you FIND?
19. Search for
Football Players who went to the
University of Texas at Austin, played for
the Dallas Cowboys as Cornerback
27. On a Semantic Web
• Besides publishing documents on the web
– which computers can’t understand easily
• Let’s publish on the web something that
computers can understand
DATA
28. The Semantic Web is a
web of data
The current web is a
web of documents
30. Current Data on the Web
• Relational Databases
• APIs
• XML
• CSV
• XLS
• …
• Can’t computers and applications already
consume that data on the web?
31. Yes! But it is all in different
formats and data models!
41. Resource Description Framework
(RDF)
• Data Model = a way to model data
– e.g. relational databases use the relational data model
• RDF is a graph data model
42. Key Value vs Graph
• Key Values
– firstName Juan
– lastName Sequeda
– livesIn Austin
– knows Stephane Corlosquet
• But what are these key/values describing?
– ME!
43. RDF is a Graph
• Let’s group the Key/Values together
– <JuanSequeda> <firstName> “Juan”
– <JuanSequeda> <lastName> “Sequeda”
– <JuanSequeda> <livesIn> “Austin”
– <JuanSequeda> <knows> <StephaneCorlosquet>
– ..
– <StephaneCorlosquet> <firstName> “Stephane”
– <StephaneCorlosquet> <lastName> “Corlosquet”
– <StephaneCorlosquet> <livesIn> “Boston”
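The grouping above can be sketched in a few lines of Python. This is only an illustration: the identifiers are plain strings taken from the slide, whereas real RDF would use URIs, and the helper name group_by_subject is made up for the example.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple.
# Identifiers are plain strings here; real RDF would use URIs.
triples = [
    ("JuanSequeda", "firstName", "Juan"),
    ("JuanSequeda", "lastName", "Sequeda"),
    ("JuanSequeda", "livesIn", "Austin"),
    ("JuanSequeda", "knows", "StephaneCorlosquet"),
    ("StephaneCorlosquet", "firstName", "Stephane"),
    ("StephaneCorlosquet", "lastName", "Corlosquet"),
    ("StephaneCorlosquet", "livesIn", "Boston"),
]

def group_by_subject(triples):
    """Return {subject: {predicate: [objects]}} so each "group" of key/values is visible."""
    groups = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        groups[s][p].append(o)
    return groups

groups = group_by_subject(triples)

# The object of "knows" is itself a subject, which is what makes this a graph:
# we can hop from one group of key/values to another.
friend = groups["JuanSequeda"]["knows"][0]
print(groups[friend]["livesIn"])  # ['Boston']
```

The point of the design is that a value can be another identifier, so following "knows" leads to a second set of key/values.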
44. RDF is a Graph
• Let’s group the Key/Values together
– <JuanSequeda> <firstName> “Juan”
• <JuanSequeda> is the identifier for the “group”
• <firstName> “Juan” is the Key/Value
45. RDF can be serialized in different ways
• RDF/XML
• RDFa (RDF in HTML)
• N3
• Turtle
• JSON
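To make "serialized in different ways" concrete, here is a rough sketch that writes the same two statements in two syntaxes: N-Triples (a line-based subset of Turtle) and Turtle with a prefix. The namespace http://example.org/ and the function names are assumptions for illustration. A real project would normally use an RDF library rather than string formatting.

```python
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Objects are tagged ("uri", name) or ("literal", text) so the
# serializers can tell a link from a plain value.
triples = [
    ("JuanSequeda", "firstName", ("literal", "Juan")),
    ("JuanSequeda", "knows", ("uri", "StephaneCorlosquet")),
]

def escape(text):
    """Escape backslashes and double quotes inside a literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def to_ntriples(triples):
    """One full statement per line, every URI written out in angle brackets."""
    lines = []
    for s, p, (kind, value) in triples:
        obj = f"<{EX}{value}>" if kind == "uri" else f'"{escape(value)}"'
        lines.append(f"<{EX}{s}> <{EX}{p}> {obj} .")
    return "\n".join(lines)

def to_turtle(triples):
    """Same statements, shortened with an ex: prefix declaration."""
    lines = [f"@prefix ex: <{EX}> .", ""]
    for s, p, (kind, value) in triples:
        obj = f"ex:{value}" if kind == "uri" else f'"{escape(value)}"'
        lines.append(f"ex:{s} ex:{p} {obj} .")
    return "\n".join(lines)

print(to_ntriples(triples))
print(to_turtle(triples))
```

Both outputs describe the same graph; only the syntax differs.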
55. Databases back up documents
THINGS have PROPERTIES:
A Book has a Title, an Author, …
This is a THING:
A book titled “Programming the Semantic Web” by Toby Segaran, …

Isbn               Title                         Author        PublisherID  ReleaseDate
978-0-596-15381-6  Programming the Semantic Web  Toby Segaran  1            July 2009
…                  …                             …             …            …

PublisherID  PublisherName
1            O’Reilly Media
…            …
56. Let’s represent the data in RDF
(The book and publisher rows from the previous slide, drawn as a graph)
– <book> <title> “Programming the Semantic Web”
– <book> <author> “Toby Segaran”
– <book> <isbn> “978-0-596-15381-6”
– <book> <publisher> <Publisher>
– <Publisher> <name> “O’Reilly”
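The step from tables to a graph can be sketched as follows. This is a minimal illustration, not the method used by any particular tool: the node names (isbn…, publisher…) and the function rows_to_triples are invented for the example, and only the columns needed for the graph on the slide are included.

```python
# Two "tables" as lists of rows, using the column names from the slide.
books = [{
    "Isbn": "978-0-596-15381-6",
    "Title": "Programming the Semantic Web",
    "Author": "Toby Segaran",
    "PublisherID": 1,
}]
publishers = [{"PublisherID": 1, "PublisherName": "O’Reilly Media"}]

def rows_to_triples(books, publishers):
    """Turn each row into a node and each column value into an edge from that node."""
    triples = []
    for pub in publishers:
        # One node per row; the node name is derived from the primary key.
        triples.append((f"publisher{pub['PublisherID']}", "name", pub["PublisherName"]))
    for book in books:
        node = f"isbn{book['Isbn']}"
        triples.append((node, "title", book["Title"]))
        triples.append((node, "author", book["Author"]))
        triples.append((node, "isbn", book["Isbn"]))
        # The foreign key becomes an edge to the publisher node, not a literal.
        triples.append((node, "publisher", f"publisher{book['PublisherID']}"))
    return triples

for triple in rows_to_triples(books, publishers):
    print(triple)
```

The interesting line is the last append: a foreign key in the relational model turns into a link between two nodes in the graph.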
57. Remember that we are on the
web
Everything on the web is identified
by a URI
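58. And now let’s link the data to other data
– <http://…/isbn978> <title> “Programming the Semantic Web”
– <http://…/isbn978> <author> “Toby Segaran”
– <http://…/isbn978> <isbn> “978-0-596-15381-6”
– <http://…/isbn978> <publisher> <http://…/publisher1>
– <http://…/publisher1> <name> “O’Reilly”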
59. And now consider the data from Revyu.com
– <http://…/isbn978> <hasReview> <http://…/review1>
– <http://…/review1> <description> “Awesome Book”
– <http://…/review1> <reviewer> <http://…/reviewer>
– <http://…/reviewer> <name> “Juan Sequeda”
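60. Let’s start to link data
Revyu.com data:
– <http://…/isbn978> <hasReview> <http://…/review1>
– <http://…/review1> <description> “Awesome Book”
– <http://…/review1> <hasReviewer> <http://…/reviewer>
– <http://…/reviewer> <name> “Juan Sequeda”
Link (Revyu’s URI for the book to the book’s own URI):
– <http://…/isbn978> <owl:sameAs> <http://…/isbn978>
Book data:
– <http://…/isbn978> <title> “Programming the Semantic Web”
– <http://…/isbn978> <author> “Toby Segaran”
– <http://…/isbn978> <isbn> “978-0-596-15381-6”
– <http://…/isbn978> <publisher> <http://…/publisher1>
– <http://…/publisher1> <name> “O’Reilly”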
61. Juan Sequeda publishes data too
– <http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>
– <http://juansequeda.com/id> <name> “Juan Sequeda”
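62. Let’s link more data
Revyu.com data:
– <http://…/isbn978> <hasReview> <http://…/review1>
– <http://…/review1> <description> “Awesome Book”
– <http://…/review1> <hasReviewer> <http://…/reviewer>
– <http://…/reviewer> <name> “Juan Sequeda”
Link:
– <http://…/reviewer> <sameAs> <http://juansequeda.com/id>
Juan Sequeda’s data:
– <http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>
– <http://juansequeda.com/id> <name> “Juan Sequeda”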
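63. And more
Revyu.com data:
– <http://…/isbn978> <hasReview> <http://…/review1>
– <http://…/review1> <description> “Awesome Book”
– <http://…/review1> <hasReviewer> <http://…/reviewer>
– <http://…/reviewer> <name> “Juan Sequeda”
Links:
– <http://…/isbn978> <owl:sameAs> <http://…/isbn978>
– <http://…/reviewer> <owl:sameAs> <http://juansequeda.com/id>
Book data:
– <http://…/isbn978> <title> “Programming the Semantic Web”
– <http://…/isbn978> <author> “Toby Segaran”
– <http://…/isbn978> <isbn> “978-0-596-15381-6”
– <http://…/isbn978> <publisher> <http://…/publisher1>
– <http://…/publisher1> <name> “O’Reilly”
Juan Sequeda’s data:
– <http://juansequeda.com/id> <livesIn> <http://dbpedia.org/Austin>
– <http://juansequeda.com/id> <name> “Juan Sequeda”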
64. Data on the Web that is in RDF and
is linked to other RDF data is
LINKED DATA
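One way to see what an owl:sameAs link buys you is to merge the linked nodes and look at the combined graph. The sketch below does this with a small union-find. It is an illustration under assumptions: the prefixes revyu:, juan: and dbpedia: are stand-ins for the URIs elided on the slides, and treating sameAs as "collapse into one node" is a simplification of what a reasoner does.

```python
SAME_AS = "owl:sameAs"

# Triples from two sources plus one sameAs link between them.
triples = [
    ("revyu:isbn978", "hasReview", "revyu:review1"),
    ("revyu:review1", "description", "Awesome Book"),
    ("revyu:review1", "hasReviewer", "revyu:reviewer"),
    ("revyu:reviewer", "name", "Juan Sequeda"),
    ("revyu:reviewer", SAME_AS, "juan:id"),
    ("juan:id", "livesIn", "dbpedia:Austin"),
]

def canonicalize(triples):
    """Merge nodes linked by owl:sameAs and rewrite every triple to use one representative."""
    parent = {}

    def find(x):
        # Follow parent pointers to the representative, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # First pass: union the two ends of every sameAs statement.
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Second pass: rewrite the remaining statements onto the representatives.
    return {(find(s), p, find(o)) for s, p, o in triples if p != SAME_AS}

merged = canonicalize(triples)
for triple in sorted(merged):
    print(triple)
```

After merging, the reviewer of the book and the person who lives in Austin are the same node, so a single query can cross both data sources.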
65. Linked Data Principles
1. Use URIs as names for
things
2. Use HTTP URIs so that
people can look up
(dereference) those
names.
3. When someone looks up
a URI, provide useful
information.
4. Include links to other
URIs so that they can
discover more things.
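Principles 2 and 3 amount to an ordinary HTTP GET that asks for data instead of HTML. A minimal sketch with the Python standard library is shown here. The URI http://example.org/id/thing is a placeholder, and whether a given server answers with Turtle (media type text/turtle) depends on that server supporting content negotiation.

```python
import urllib.request

def build_rdf_request(uri, media_type="text/turtle"):
    """Prepare an HTTP GET for a URI that asks for RDF instead of HTML."""
    return urllib.request.Request(uri, headers={"Accept": media_type})

# Placeholder URI, for illustration only.
request = build_rdf_request("http://example.org/id/thing")
print(request.get_method(), request.full_url, request.get_header("Accept"))

# Sending the request needs network access and a server that honours the
# Accept header, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(response.headers.get("Content-Type"))
#     print(response.read().decode("utf-8"))
```

The returned data would then contain further URIs (principle 4) that can be looked up the same way.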
71. SELECT ?review ?comment
WHERE {
  isbn:978 ex:hasReview ?review .
  ?review ex:description ?comment .
  ?review ex:hasReviewer ?person .
  ?person ex:lives dbpedia:Austin .
}
(Diagram: the linked graph of review, book, publisher, and reviewer data built up on the previous slides)
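To show roughly what a SPARQL engine does with the four patterns in this query, here is a toy matcher over an in-memory list of triples. It is a sketch only: real engines use indexes and query planning, the node names ex:review1 and ex:juan are made up, and the data assumes the sameAs links have already been resolved so that the reviewer and the Austin resident are one node.

```python
def match(pattern, triple, bindings):
    """Try to unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def query(patterns, triples):
    """Evaluate a list of triple patterns by extending the solutions pattern by pattern."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [b for sol in solutions for t in triples
                     if (b := match(pattern, t, sol)) is not None]
    return solutions

# Hypothetical data, already merged across sources.
triples = [
    ("isbn:978", "ex:hasReview", "ex:review1"),
    ("ex:review1", "ex:description", "Awesome Book"),
    ("ex:review1", "ex:hasReviewer", "ex:juan"),
    ("ex:juan", "ex:lives", "dbpedia:Austin"),
]

# The WHERE clause from the slide, one tuple per pattern.
patterns = [
    ("isbn:978", "ex:hasReview", "?review"),
    ("?review", "ex:description", "?comment"),
    ("?review", "ex:hasReviewer", "?person"),
    ("?person", "ex:lives", "dbpedia:Austin"),
]

# SELECT ?review ?comment
results = [(b["?review"], b["?comment"]) for b in query(patterns, triples)]
print(results)  # [('ex:review1', 'Awesome Book')]
```

Shared variables such as ?review and ?person are what join the patterns together, much like join columns in SQL.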
72. OWL
• Here is where the real semantics shows up
• Web Ontology Language
• Define schema/vocabulary
• Classes, Properties, Inheritance, etc
• Subclasses, Subproperties
• …
• You can get more complicated with rules…
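As a small taste of what subclasses make possible, the sketch below derives extra type facts from a class hierarchy. It covers only subclass inference (written with rdfs:subClassOf, which OWL builds on), not OWL as a whole, and the class and instance names are invented for the example.

```python
def infer_types(triples):
    """Add rdf:type facts implied by rdfs:subClassOf, repeating until nothing new appears."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for s, p, o in facts if p == "rdfs:subClassOf"]
        for s, p, o in list(facts):
            if p != "rdf:type":
                continue
            for child, parent in subclass:
                # If s is a <child> and <child> is a kind of <parent>, s is a <parent> too.
                if o == child and (s, "rdf:type", parent) not in facts:
                    facts.add((s, "rdf:type", parent))
                    changed = True
    return facts

# Hypothetical vocabulary and data.
triples = [
    ("ex:Cornerback", "rdfs:subClassOf", "ex:FootballPlayer"),
    ("ex:FootballPlayer", "rdfs:subClassOf", "ex:Athlete"),
    ("ex:somePlayer", "rdf:type", "ex:Cornerback"),
]

facts = infer_types(triples)
print(("ex:somePlayer", "rdf:type", "ex:Athlete") in facts)  # True
```

A search for athletes would now find ex:somePlayer even though the data only ever said "Cornerback".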
79. 1) Share data as data
2) Because your neighbor is doing it
…
3) Marketing, Advertising, SEO ++
80. Linked Data Publishers
• UK Government
• US Government
• BBC
• Open Calais – Thomson Reuters
• Freebase/Google
• NY Times
• Best Buy
• Sears
• Kmart
• Overstock.com
• CNET
• DBpedia
• O’Reilly Media
• …
97. Linked Data Browsers
• Not actually separate browsers. Run inside of
HTML browsers
• View the data that is returned after looking up
a URI in tabular form
• User can navigate between data sources by
following RDF Links
• (IMO) No usability
101. Linked Data (Semantic Web)
Search Engines
• Just like conventional search engines (Google, Bing, Yahoo),
crawl RDF documents and follow RDF links.
– Current search engines don’t crawl data, unless it’s RDFa
• Human-focused search
– Falcons – Keyword
– SWSE – Keyword
– VisiNav – Complex Queries
• Machine-focused search
– Sindice – data instances
– Swoogle - ontologies
– Watson - ontologies
– Uberblic – curated integrated data instances
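The "crawl RDF documents and follow RDF links" idea can be sketched as a breadth-first traversal. To keep the example runnable offline, the web is faked with a dictionary. All URIs and data in it are made up, and a real crawler would fetch each URI over HTTP and parse the RDF it gets back.

```python
from collections import deque

# A stand-in for the web: URI -> triples returned when that URI is looked up.
# All URIs and data are made up for the example.
FAKE_WEB = {
    "http://example.org/review1": [
        ("http://example.org/review1", "hasReviewer", "http://example.org/reviewer"),
    ],
    "http://example.org/reviewer": [
        ("http://example.org/reviewer", "owl:sameAs", "http://example.com/juan"),
    ],
    "http://example.com/juan": [
        ("http://example.com/juan", "name", "Juan Sequeda"),
    ],
}

def crawl(seed, lookup):
    """Breadth-first crawl: look up a URI, keep its triples, queue every URI it links to."""
    seen = {seed}
    queue = deque([seed])
    collected = []
    while queue:
        uri = queue.popleft()
        for s, p, o in lookup(uri):
            collected.append((s, p, o))
            # An object that looks like a URI is an RDF link worth following.
            if o.startswith("http") and o not in seen:
                seen.add(o)
                queue.append(o)
    return collected

collected = crawl("http://example.org/review1", lambda uri: FAKE_WEB.get(uri, []))
print(len(collected))  # 3
```

Starting from one review, the crawler reaches data on a second site purely by following links in the data.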
102. (Semantic) SEO ++
• Markup your HTML with RDFa
• Use standard vocabularies (ontologies)
– Google Vocabulary
– GoodRelations
– Dublin Core
• Google and Yahoo will crawl this data and use
it for better rendering
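As a rough idea of what "markup your HTML with RDFa" looks like, the sketch below generates an HTML fragment with RDFa 1.1 attributes and Dublin Core terms. The function name and the URI http://example.org/book/1 are placeholders, and which vocabulary a given search engine actually uses for rendering is not something this example can promise.

```python
from html import escape

def book_to_rdfa(uri, title, creator):
    """Render a book as an HTML fragment annotated with RDFa 1.1 attributes."""
    return (
        f'<div prefix="dc: http://purl.org/dc/terms/" about="{escape(uri)}">\n'
        f'  <span property="dc:title">{escape(title)}</span> by\n'
        f'  <span property="dc:creator">{escape(creator)}</span>\n'
        f'</div>'
    )

# Placeholder URI, for illustration only.
fragment = book_to_rdfa(
    "http://example.org/book/1",
    "Programming the Semantic Web",
    "Toby Segaran",
)
print(fragment)
```

A browser shows the same text as before; a crawler that understands RDFa can additionally read the title and creator as data about the book's URI.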
107. Domain Specific Applications
• Government
– Data.gov
– Data.gov.uk
– http://data-gov.tw.rpi.edu/wiki/Demos
• Music
– Seevl.net
• DBpedia Mobile
• Life Science
– LinkedLifeData
• Sports
– BBC World Cup
115. Linked Data is Data Integration
(Diagram labels: SPARQL Query; Diamond; Ultrawrap (three instances); Specify; Morphster; Morphbank)
116. Example 1 (Specify – DBpedia)
• Get full name and guid from taxon with id
http://tata.csres.utexas.edu:8080/specify/data/taxon51807#thing
• AND find any subjects it may have (“skos:subject”)
117. Result Example 1
• Note that
http://dbpedia.org/resource/Category:Fish_of_Australia
comes from a different data source (dbpedia.org)
118. Example 2 (Specify-Morphbank)
• Get full name and guid from taxon with id
http://tata.csres.utexas.edu:8080/specify/data/taxon42947#thing
• AND the rank and kingdom from Morphbank
119. Result Example 2
• Note that full name and guid come from Specify
http://tata.csres.utexas.edu:8080/specify/data/taxon42947
• AND rank and kingdom come from Morphbank
http://tata.csres.utexas.edu:8080/morphbank/data/taxa398354
120. “The killer app for Semantic Technology is YOUR life (online)” – Tom Gruber
“A little semantics goes a long way” – Jim Hendler
“Knowledge is Power” – Jim Hendler
“Occupy Your Data” – Tim Finin
“Linked Data is the (Semantic) Web done right” – Tim Berners-Lee
“The novel part of the Semantic Web is not the Semantics, but the Web” – Frank van Harmelen
“RAW DATA NOW” – Tim Berners-Lee