Presentation to set the scene and stimulate discussion in the Workshop "The Power of Sharing Linked Data" at ELAG 2014 - Bath University, UK June 10/11 2014
The Power of Sharing Linked Data - ELAG 2014 Workshop
1. The world’s libraries. Connected.
ELAG 2014 – Bath, UK
The Power of Sharing Linked Data:
Giving the Web What it Wants
Richard Wallis
OCLC Technology
Evangelist
@rjw
6. The world’s libraries. Connected.
Today’s online information seekers have many choices
• Select
• Acquire
• Describe
• Preserve
• Expose
7. The world’s libraries. Connected.
The problem with access to library
collections:
People don’t start research in the library catalog?
(No… that’s just a fact.)
The real problem is that we don’t expose our
collections very well on the web.
Question: How to connect users to library collections on the web?
8. The world’s libraries. Connected.
Evolution of Metadata Management and Library Catalogs
Scribe → Card Catalog → OPAC → Web → Web of Data
9. The world’s libraries. Connected.
What the Web wants
What is required to join
the web of data?
10. The world’s libraries. Connected.
What the Web wants
Some things
the web wants:
1. Size
2. Familiar structures
3. A network of links
4. Entity identifiers
11. The world’s libraries. Connected.
library data: stored as records
Record fields: edition, author, location, holding, date of publication, classification, publisher, title, source, ISBN
Entity types: person, place, object, concept, organization, work
12. The world’s libraries. Connected.
library data stored as entities
Diagram labels: author, person, place, object, concept, organization, work, subject, item, availability
13. The world’s libraries. Connected.
person place
object concept
organization work
library data stored as entities
library knowledge graph
A graph of relationships
14. The world’s libraries. Connected.
Knowledge cards for libraries
Günter Grass
Born: 16 October 1927, Gdańsk, Poland
German novelist, poet, playwright, illustrator, graphic artist, sculptor and recipient of the 1999 Nobel Prize in Literature.
Works: Find Günter Grass works at: Libraries near me | Online Retailers
Subjects: Germany | German literature | Historical fiction | War stories | Black humor | Fantasy
Quotes: “Even bad books are books and therefore sacred.” (The Tin Drum)
Google Knowledge Graph
15. The world’s libraries. Connected.
Field in a record vs. entity in knowledge graph
library data stored as entities
person: Günter Grass
place: Germany
concept: Historical Fiction
organization: library
object: this copy of “The Tin Drum”
work: “Die Blechtrommel”
expression: “The Tin Drum”
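The contrast between a field in a record and an entity in a graph can be sketched in a few lines of Python. This is only an illustration using the labels on the slide; the `Entity` class, the `FIELD_TO_ENTITY` mapping, the relationship names and the `entify` helper are hypothetical and are not OCLC's data model or process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    kind: str   # one of the slide's entity types, e.g. "person" or "work"
    label: str


# A flat record: every value is just a string sitting in a named field.
record = {
    "title": "Die Blechtrommel",
    "author": "Günter Grass",
    "subject": "Historical Fiction",
    "place": "Germany",
}

# Hypothetical mapping: record field -> (entity kind, relationship name).
FIELD_TO_ENTITY = {
    "author": ("person", "author"),
    "subject": ("concept", "about"),
    "place": ("place", "place"),
}


def entify(record):
    """Turn one record into a work entity plus (work, relation, entity) edges."""
    work = Entity("work", record["title"])
    edges = []
    for field, (kind, relation) in FIELD_TO_ENTITY.items():
        if field in record:
            edges.append((work, relation, Entity(kind, record[field])))
    return work, edges


work, edges = entify(record)
for subject, relation, obj in edges:
    print(f"{subject.label} --{relation}--> {obj.kind}: {obj.label}")
```

Because entities are values in their own right, two records that name the same person end up pointing at one shared node rather than repeating a string.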
16. The world’s libraries. Connected.
Evolution of Metadata Management and Library Catalogs
Scribe → Card Catalog → OPAC → Web → Web of Data
Web of Data entities: person, place, object, concept, organization, work
17. The world’s libraries. Connected.
person place
object concept
organization work
library data stored as entities
library knowledge graph
Works
FRBR: Work
FRBR: Manifestation
18. The world’s libraries. Connected.
Benefits for All Library Workflows
The Data Strategy: WorldCat Works
person place
object
organization work
Cataloging
Integration with the web
Cascading updates
More options
Intuitive searching
19. The world’s libraries. Connected.
What the Web wants
We are already doing a lot of this…
1. Size = Aggregation
2. Familiar structures = Linked Data
3. A network of links = Referrals
4. Entity identifiers = Identifiers
schema.org
VIAF
22. The world’s libraries. Connected.
How does a library contribute to all of this?
1. Register
2. Aggregate
Add your holdings to the network
Manage identifiers:
Authorities
Institutions
3. Expose
person place
object concept
organization work
28. The world’s libraries. Connected.
Tell them about our resources…
…using their language and methods
http://www.flickr.com/photos/boston_public_library/6220572487
29. The world’s libraries. Connected.
WorldCat Linked Data
Linked Data
• 311+ million data resources
• Schema.org
• Embedded RDFa
• Links to Dewey, LCSH, LCNAF,
DOI, VIAF, FAST
• ODC-BY license
• Released June 2012
• Continuing development:
• Vocabulary, Content-negotiation, More Links
• Works …
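As a rough illustration of what "Schema.org" plus "Embedded RDFa" means in practice, the sketch below generates an HTML fragment that carries Schema.org terms as RDFa Lite attributes (`vocab`, `typeof`, `property`, `resource`). The `book_rdfa` helper and the choice of the `Book` type are assumptions for the example; this is not the markup WorldCat actually emits. The title, author name and VIAF URI are the ones shown later in the deck.

```python
from html import escape


def book_rdfa(name, author_name, author_uri):
    """Return an HTML fragment describing a book with Schema.org RDFa Lite.

    Hypothetical helper for illustration only.
    """
    return (
        '<div vocab="http://schema.org/" typeof="Book">\n'
        f'  <h1 property="name">{escape(name)}</h1>\n'
        f'  <p>By <span property="author" typeof="Person" '
        f'resource="{escape(author_uri)}">'
        f'<span property="name">{escape(author_name)}</span></span></p>\n'
        '</div>'
    )


fragment = book_rdfa(
    "The Hidden Face of Eve",
    "Nawal El Saadawi",
    "http://viaf.org/viaf/84254254/",
)
print(fragment)
```

A person reading the page sees a title and an author; a crawler reading the same markup sees a Book whose author is a Person identified by a URI.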
31. The world’s libraries. Connected.
How we are sharing with the web
What the web
gets:
• WorldCat 311M+
• Schema.org
• VIAF, LCSH, Dewey, …
• WorldCat persistent
identifiers (URIs)
Some things
the web wants:
1. Size
2. Familiar structures
3. A network of links
4. Entity identifiers
34. The world’s libraries. Connected.
Part of the Web of Data
Worldcat.org/oclc/81453459
The Hidden Face of Eve
http://viaf.org/viaf/84254254/
Nawal El Saadawi
http://www.wikidata.org/wiki/Q238514
Nawal El Saadawi
http://isni-url.oclc.nl/isni/0000000120296695
Nawal El Saadawi
author
sameAs
sameAs
sameAs
VIAF
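The links on this slide can be followed mechanically. The sketch below stores them as subject, predicate, object triples and collects every identifier reachable from the author through `sameAs`. The URIs are the ones on the slide (the `http://` scheme on the WorldCat URI is added here); the direction of each `sameAs` link and the helper names are assumptions, and `sameAs` is treated as symmetric.

```python
from collections import deque

WORK = "http://worldcat.org/oclc/81453459"  # slide shows "Worldcat.org/oclc/81453459"
VIAF = "http://viaf.org/viaf/84254254/"
WIKIDATA = "http://www.wikidata.org/wiki/Q238514"
ISNI = "http://isni-url.oclc.nl/isni/0000000120296695"

# Link directions are assumed for the example.
triples = [
    (WORK, "author", VIAF),
    (VIAF, "sameAs", WIKIDATA),
    (VIAF, "sameAs", ISNI),
]


def authors(work, triples):
    """Return the author URIs attached to a work."""
    return [o for s, p, o in triples if s == work and p == "author"]


def same_as_closure(start, triples):
    """Return every URI reachable from start via sameAs, in either direction."""
    neighbours = {}
    for s, p, o in triples:
        if p == "sameAs":
            neighbours.setdefault(s, set()).add(o)
            neighbours.setdefault(o, set()).add(s)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


for author in authors(WORK, triples):
    for uri in sorted(same_as_closure(author, triples)):
        print(uri)
```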
37. The world’s libraries. Connected.
BIBFRAME
Bibliographic Framework as a
Web of Data:
It is the foundation for the future of
bibliographic description that happens on, in,
and as part of the web and the networked
world we live in.
http://www.bibframe.org
39. The world’s libraries. Connected.
≈ Complementary ≈
bibliographic description as part of the web
? Conflict ?
@Fascinatingpics http://www.flickr.com/photos/54136840@N00/4921290518/
41. The world’s libraries. Connected.
WorldCat Works Linked Data
Works
• 197+ million Work descriptions and URIs
• Schema.org
• RDF Data formats – RDF/XML, Turtle, Triples, JSON-LD
• Links to WorldCat manifestations
• Links to Dewey, LCSH, LCNAF, VIAF, FAST
• Open Data license
• Released April 2014
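One common way to offer several RDF serializations of the same description is HTTP content negotiation, where the client names the format it wants in an `Accept` header. The sketch below only builds such a request and does not send it. The media types are the standard registered ones; reading the slide's "Triples" as N-Triples, the `rdf_request` helper and the `example.org` URI are assumptions, and whether a given server honors the header is not something the slide states.

```python
from urllib.request import Request

# Standard media types for the formats listed on the slide.
# "ntriples" assumes the slide's "Triples" means N-Triples.
ACCEPT = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "ntriples": "application/n-triples",
    "jsonld": "application/ld+json",
}


def rdf_request(uri, fmt):
    """Build (but do not send) a request asking for one RDF serialization."""
    return Request(uri, headers={"Accept": ACCEPT[fmt]})


# Placeholder URI, not a real work identifier.
req = rdf_request("http://example.org/work/1", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the fetch; that step is left out so the example runs without network access.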
45. The world’s libraries. Connected.
WorldCat Works Linked Data
Single
Manifestation
Multiple
Manifestations
197 Million Work Descriptions
Linking to 311 Million Manifestations
55. The world’s libraries. Connected.
Production Release of WorldCat Works
April 2014
Kudos to:
Amy Elliott
Bert Cousins
Bob Murphy
Bob Schulz
Brian Wingerter
Brook Pauquette
Bruce Washburn
Daniel van Spanje
Diane Vizine-Goetz
Ed Macklin
Gail Thornburg
Gay Miller
Georgii Viznyuk
Hugh Jamieson
Jason Ash
Jean Godby
Jeff Mixter
Jeff Young
Jenny Toves
Jim Michalko
Joanne Cantrell
Jon Fausey
Julie Gay
Kanchitpol Ratanapan
Kelly Womble
Leonardo Simon
Lisa Cox
Lora Chappelear-Pearson
Martin van Muyen
Marty Loveless
Mike Teets
Paul Moss
Rich Greene
Richard Wallis
Roy Tennant
Scott Orr
Shelley Hostetler
Stephan Schindehette
Steve Meyer
Ted Fons
Thom Hickey
Tod Matola
Xiaoming Liu
Editor's Notes
Thank you very much for that generous introduction.
Welcome, ladies and gentlemen, to this plenary session on the power of shared data. I’m pleased to be here in Cape Town with my colleague Richard Wallis to talk to you today about the changing landscape of exposing our valuable library collections to library users and the power of cooperation and shared data in solving some of the hardest problems of connecting library users to the resources they need….
We, at OCLC, start with the user and how they interact with the library
And that recognizes how the library as a place has changed
<CLICK>
We want the user to be successful in the research process. And we want library collections to be part of that success. We know you want that too.
You have responded to the user’s behavior by changing the library from a collections warehouse to a collaborative learning space.
You have re-balanced the budgets to reflect the need to purchase electronic over print. – Especially in academic libraries.
The balance in budgets has passed the tipping point where the majority of the budget is spent on electronic material over print. That is not to say that print is not relevant, just that in budgetary terms electronic material is taking the lion’s share.
You have re-engineered the technical and public services departments to manage and process electronic collections. But those collections are not always available to users where the users are….
The users are here. They are on the web search engines. They are often not in the library’s physical space, or they are doing research from any physical location that allows network access and Internet searching.
We know that often the library may have, or provides access to the resource they need but..
We also know that they may not start their search in the library, and that search process does not always provide a clear path from search to library collection.
Often the library collections are not clearly visible through search engine results.
I thought we might start with a story. The story of Anita.
She is an undergraduate university student who has a short-term assignment to understand the cultural output of the Second World War. Her instructor has made the job easy by asking her to focus on a <CLICK> single author and a work that has had significant cultural impact on one of the countries in Europe. But Anita has been busy with other activities and today she doesn’t have a lot of time for research. <CLICK>
OCLC’s research shows that she will start with sources familiar to her—sources that have provided her positive results in the past. <CLICK> But will those results provide the highest quality resources to include in her essay? Will they include any resources provided by her library? <CLICK>
Again, OCLC’s research shows that users have many choices in the research process. In fact, recent research shows that searchers will return again and again to a small set of sources that have provided them with positive results in the past. The library may or may not be part of that set of sources.
Library’s core professional obligations to the user:
Select…
The first 4 are well understood – have established communities of practice – in control of the professional librarian
Expose is the one that has radically changed - when we were dealing with physical things the path for the researcher to the library door was in the scope/control of the library …
Now that the user is out on the web that influence/control is being lost – the user goes where they want to go…
So the problem that is often stated as being ….
<CLICK>
That’s not the real problem – that is just a fact
Looking back on how libraries have handled collections before…
Scribe change was a rewrite – not flexible
Card catalogs – enormous step forward – flexible, etc. but cost still high – then printed cards came along…
50 years ago mainframes supported printing – led to cooperative creation of records that solved the inefficiencies of card printing
Advent of networks meant use of machine readable card records to build OPACs
Web – put web servers on front of that same ‘machine readable card’ based system
We are now moving towards a web of data
Where the web of data becomes visible.
But, what if we started to think about the information in a different way.
We, at OCLC, with our major data ingest and processing techniques – Big Data tech
Matching incoming data with what we have
Identifying the entities and associating their role attributes
Works – not so far very visible in libraries – important on the web
The web is starting to represent these entities in knowledge cards/graphs
The search engines are now all on board – providing information on the entities
How would this look if we, the libraries, did something similar with the information we have been curating?
So the idea of a knowledge card for libraries is a kind of fashion forward view of how libraries will need to market themselves. I’ll come back to this point a little later in the presentation.
----- Meeting Notes (07/10/2013 09:39) -----
Aggregation description
Registration: Make sure library data is stored somewhere (WorldCat) that connects up.
Aggregation: Gathering all library collections together at the network level so the whole becomes more attractive on the web.
Entification: Describe our data in ways that extract what is most interesting to consumers and partners.
Syndication: Make data available on the web by demonstrating value.
Thanks Ted for that insight into our entities..
I’d like to now explore more what the web wants
Back to Anita and her project...
Take a look at the device she is using - where does that start – Google
As Ted mentioned, it is easy to forget how much Google is used as THE route into the web.
The search engines are not just a way to sites – they are becoming the de facto way to find things within a site.
For example site analytics from the BnF show that 80% of hits on its detail pages come direct from external search engines
Simply this means - if you want your resources to be discovered, the search engines need to know about them
As Ted pointed out Knowledge Cards are appearing everywhere
These supplement search results with information (data) about things
They are driven by data – linked data. This is different from the traditional approach of just listing pages that seem to mention the topic
Note how information in the knowledge graph links you directly to more information
They refer to this as Things not Strings. In our terms the Things are our entities and our resources.
So how do we get our DATA into the Googles of the world?
We need to tell them – in the way they are looking for it
Which is the approach we took when adding Linked Data capability into WorldCat
This is what it looks like…
We have just visited 4 separate sites – just like the web we know – data distributed across the web, linking together
Remember Ted showing you this list of things the web wants…
How do our WorldCat Linked Data developments satisfy these desires….
WorldCat URI – respected robust source
Why did we choose Schema.org
Successful general vocab – stress cooperation in creating it and W3C affiliation for enhancing it
Which is why Richard created another W3C group of bibliographic-focused people to recommend extensions to Schema.org.
Now 80+ members
Submitted 2 proposals, 1 of which has been adopted – more in the pipeline
Schema.org is not the only show in town
Emphasize the web-ness of their mission
Are they in competition? No, they complement each other
Schema for general sharing with the web – BIBFRAME for detailed library focused data capture & exchange
Linked Data lets you mix vocabs – expect in the future descriptions will contain both
Let the consumer decide
Back to Anita again….
Her search not only finds her the book she wants, but by following the links, she gets signposts to where she can get access to it
As Ted reminded us – all these developments are part of a continuum
Using the technology of the day to best achieve our mission
We are using today’s technology – to build a web of data..
The web of data, like any network, creates more benefit the more sources are part of it.
If your library is not part of it, your users cannot be linked to your resources!
We all need to participate
Bringing those resources
– your resources – in the appropriate form
– to our users wherever they (in this case Anita) are