Linked Data Principles and RDF: University of Florida Libraries, BIBFRAME Working Group, Tech Talk, 2015-08-18
1. Tech Talk
BIBFRAME Working Group
18 August 2015
Allison Jai O’Dell, Metadata Librarian | AJODELL@ufl.edu | (352) 273-2667 | 404 Library East
2. Linked Data
Because once upon a time folks realized that finding stuff on the interwebs was
difficult without authority control.
3. Linked Data Principles
• Use URIs as names for things
• Use HTTP URIs so that people can look up those names.
• When someone looks up a URI, provide useful information, using the
standards (RDF, SPARQL)
• Include links to other URIs, so that they can discover more things.
http://www.w3.org/DesignIssues/LinkedData.html
4. Use URIs as names for things
• Uniform Resource Identifiers (URIs) are codes that uniquely identify
(that is, name and help locate) resources
• URIs can be of two types:
• Uniform Resource Names (URNs), which uniquely name a resource
• Uniform Resource Locators (URLs), which help locate a resource
• URIs may serve either or both of these functions
• The duality of naming and locating via URIs is important to how
Linked Data works
(and yes, fellow librarians, it’s basically authority control)
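A minimal sketch of the naming/locating distinction, using only Python's standard library. The `describe` helper and the sample ISBN URN are illustrative assumptions, not part of the talk, and checking the scheme is only a rough heuristic.

```python
from urllib.parse import urlparse


def describe(uri: str) -> str:
    """Roughly classify a URI by its scheme (a heuristic, not a full check)."""
    scheme = urlparse(uri).scheme
    if scheme == "urn":
        return "URN: names a resource"
    if scheme in ("http", "https"):
        return "URL: names a resource and says where to look it up"
    return "other URI scheme"


# Sample identifiers: the URN is an assumed example; the URL is from the slides.
isbn_urn = "urn:isbn:0451450523"
person_url = "http://www.example.org/vocab#Allison_ODell"

print(describe(isbn_urn))
print(describe(person_url))
```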
5. Use HTTP URIs so that people can look up
those names
• The Hypertext Transfer Protocol (HTTP) is the standard by which
data is communicated on the World Wide Web
• Using HTTP URIs makes an identifier scheme accessible and
communicable on the Web
• An HTTP URI such as: http://www.example.org/vocab#Allison_ODell
can be used as a unique character string to identify Allison O’Dell, and
also, as a Web address for locating information about Allison O’Dell
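The dual role of that HTTP URI can be sketched with Python's standard library: the full string acts as the name, while the part before the `#` is the address a Web client would actually request. The variable names are assumptions for illustration, and no network request is made.

```python
from urllib.parse import urldefrag, urlparse

uri = "http://www.example.org/vocab#Allison_ODell"

# As a name: the whole string is the identifier and is compared as-is.
name = uri

# As a locator: an HTTP client requests the document part (without the fragment).
document_url, fragment = urldefrag(uri)
parts = urlparse(uri)

print(name)
print(document_url)   # the address to fetch
print(fragment)       # the thing named within that document
print(parts.scheme, parts.netloc, parts.path)
```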
6. When someone looks up a URI, provide useful
information, using the standards (RDF, SPARQL)
• Because HTTP URIs are on the Web, they take advantage of Web
technologies. One can:
• Point a browser at an HTTP URI and read the information that is there.
• Run a query against the data, and obtain the information that is there.
• An HTTP URI serves three functions:
• to uniquely name things
• to help locate things
• to gain information about things
7. This is where it gets cool.
authority control + taxonomy + encyclopedic information + access = awesome
8. Include links to other URIs, so that they can
discover more things
9. 5-star Linked Data
★ make your stuff available on the Web under an open license
★★ make it available as structured data
(e.g., Excel instead of image scan of a table)
★★★ use non-proprietary formats
(e.g., CSV instead of Excel)
★★★★ use URIs to denote things, so that people can point at
your stuff
★★★★★ link your data to other data to provide context
10. Resource Description Framework (RDF)
• The standard model for creating Linked Data
• Expresses data simply, by naming two things and the relationship
between them – a structure known as a “triple,” because it is a three-
part statement.
“Allison O’Dell” “lives in” “Gainesville, Florida”
• Each part of the triple can be identified by a URI
<http://www.example.org/vocab#Allison_ODell>
<http://www.example.org/vocab#lives_in>
<http://www.example.org/vocab#Gainesville>
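As a rough illustration, the same triple can be held in a small Python structure and written out as one line of N-Triples, a plain-text RDF syntax. The `Triple` class and `to_ntriples` helper are hypothetical names for this sketch; real projects would normally use an RDF library instead.

```python
from typing import NamedTuple

VOCAB = "http://www.example.org/vocab#"


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object (all URIs here)."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(t: Triple) -> str:
    """Serialize a triple whose three parts are all URIs as one N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


statement = Triple(VOCAB + "Allison_ODell", VOCAB + "lives_in", VOCAB + "Gainesville")
line = to_ntriples(statement)
print(line)
```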
12. This is where it gets really cool.
triples + graph databases + SPARQL = data merger and inferencing power
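A toy sketch of what "data merger and inferencing" can mean, with triples as plain Python tuples. The second dataset, the `located_in` predicate, the `Florida` URI, and the rule itself are all assumptions made up for this example; a real triple store would apply rules declared in an ontology.

```python
V = "http://www.example.org/vocab#"

# Dataset A: the triple from the slides.
people = {(V + "Allison_ODell", V + "lives_in", V + "Gainesville")}

# Dataset B: a hypothetical gazetteer published by someone else.
places = {(V + "Gainesville", V + "located_in", V + "Florida")}

# Merging is a set union: because both sources use the same URI for
# Gainesville, their statements connect without any schema mapping.
graph = people | places


def infer_lives_in(triples):
    """Apply one rule once: X lives_in Y and Y located_in Z => X lives_in Z."""
    inferred = set()
    for s, p, o in triples:
        if p != V + "lives_in":
            continue
        for s2, p2, o2 in triples:
            if s2 == o and p2 == V + "located_in":
                inferred.add((s, V + "lives_in", o2))
    return inferred - triples  # keep only statements that are new


new_facts = infer_lives_in(graph)
print(new_facts)
```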
13. Relational Databases
(awesome for straightforward stuff)
• Relationships between tables
• Advantages:
• Easier data updates
• Faster processes
• Less storage space
• Disadvantages:
• Tables = rigid structure
• Tables = annoying to query
14. Graph Databases
(awesome for complex stuff)
• Relationships between everything
• Advantages:
• Graph = easy to query
• Allows variation in data and relationships
• Extensible: scales more easily
• Inferencing power
• Disadvantages:
• Slow processing
• More storage space
15. SPARQL
• Query language for RDF data
• More intuitive than SQL:
• Queries based on relationships, rather than on the structure of tables
• Queries can use the underlying taxonomy or data model
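To make "queries based on relationships" concrete, here is a small sketch: a SPARQL query held as a string, followed by a toy pure-Python matcher that imitates its single triple pattern. The matcher is not a SPARQL engine, and the `match` function and the `located_in` triple are assumptions for illustration only.

```python
V = "http://www.example.org/vocab#"

graph = {
    (V + "Allison_ODell", V + "lives_in", V + "Gainesville"),
    (V + "Gainesville", V + "located_in", V + "Florida"),  # hypothetical triple
}

# The SPARQL query that the matcher below imitates (shown, not executed).
QUERY = """
PREFIX v: <http://www.example.org/vocab#>
SELECT ?person WHERE { ?person v:lives_in v:Gainesville . }
"""


def match(pattern, triples):
    """Return one {variable: value} binding per triple that fits the pattern.

    Pattern terms starting with "?" are variables; other terms must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.setdefault(want, got) != got:
                    break
            elif want != got:
                break
        else:
            results.append(binding)
    return results


rows = match(("?person", V + "lives_in", V + "Gainesville"), graph)
print(rows)
```

The pattern names a relationship (who lives in Gainesville) rather than a table and column, which is the contrast with SQL that the slide draws.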