2. Mimas
•Nationally designated
data centre
•We host a significant
number of the UK's
research information
assets…
•and build applications to
help people make the
most of this resource
3. Copac
•Aggregation of 50+
research & specialist
libraries
•50 million records +
•1 million search sessions
per month
4. JRUL
John Rylands University Library:
•1.3 million bib records
•600,000 search sessions
per month
•23% of records unique
(cross checked against
WorldCat)
•40,000 students
•10 years of circulation
data (8 million records)
5. SALT project hypothesis…
Library circulation activity data
can be used to support
humanities research by
surfacing underused ‘long tail’
library materials through
search.
6. Could we develop an
API-based national shared
service?
RLUK, M25 Consortium, Leeds University,
Cambridge University, Sussex University.
7.
8. Building on JISC MOSAIC
Be pragmatic in keeping options open for new adopters by
promoting local and shared strategies for the exploitation of
user activity data
Build participation by engaging the service and
development communities by…encouraging small local
steps as well as radical innovation
Provide a platform (infrastructure, interfaces and services)
that will …reduce local implementation burdens
Address perceptions of value and risk for institutions and
services by facilitating dialogue involving senior managers
9. Serendipity
Anxiety
Trust concerns
Cynical about ratings and reviews
10. But they could see the immediate
benefit of recommender functionality….
11. So can activity data?
• Increase the visibility (& usage) of hidden
collections
• Provide new routes to discovery based on
use and disciplinary contexts (not traditional
classification)
• Enable serendipitous discovery
13. Loan transaction data extracted
Data anonymised and given to Mimas
Mimas processes data
API implemented in Capita Prism sandbox using JUICE framework
Additional processing performed on demand by API
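The "data anonymised" step above could look something like the following. This is only a minimal sketch, not the project's actual procedure: it assumes each loan record carries a borrower ID and an item ID, and it swaps the borrower ID for a keyed hash so that loans by the same person still group together. The names `SECRET_KEY`, `pseudonymise` and `anonymise_loans`, and the sample records, are hypothetical. Strictly speaking this is pseudonymisation; whether it is sufficient for sharing is a policy question for each library.

```python
import hashlib
import hmac

# Hypothetical secret held by the library only and never passed to the aggregator.
SECRET_KEY = b"replace-with-a-locally-held-secret"


def pseudonymise(borrower_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a borrower ID with a keyed hash (same borrower -> same token)."""
    digest = hmac.new(key, borrower_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymise_loans(loans):
    """Keep only (borrower token, item ID); drop dates and any other fields."""
    return [(pseudonymise(borrower), item) for borrower, item, *_ in loans]


# Made-up sample records: (borrower ID, item ID, loan date).
raw_loans = [
    ("u1001", "item-A", "2010-03-01"),
    ("u1001", "item-B", "2010-03-09"),
    ("u2002", "item-A", "2010-04-12"),
]
anonymised = anonymise_loans(raw_loans)
```

The output keeps just enough to say "the same person borrowed these items" without carrying the original ID.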
14.
15.
16.
17. User evaluation
Round One
• 18 researchers
• 42 searches
Round Two
• 6 researchers
• 25 searches
20. “Can we have it now?”
100% would welcome
a recommender function
based on circulation
records
21. Surfacing the long tail…
• What is the long tail in this context?
•Will surfacing these items mean
they’ll be borrowed?
22. What about that shared service?
• National aggregation of data
• Based on usage activity
• From a representative sample of libraries
‘Why should I make this a priority?’
23. More SALT for JRUL…
•Testing the recommender with subject
librarians
•Going live with the local or national
service
•Making SALT available in Primo alongside
Bx recommendations
And think about…
•Allowing users to adjust thresholds in a
meaningful way
•Providing more targeted recommendations
through Portal
24. More SALT for Mimas…
•Aggregate more data
•Evaluate the longer-term impact on
borrowing patterns at JRUL
•Gather requirements/costs for a shared
service
•Investigate how activity data aggregations
could be used to support collection
development
•Communicate the benefits to library
decision makers
Huddersfield’s work in this area has already shown the possibilities. But we need more circulation data. Saw Dave present on this work at ILI a couple of years ago.
There are difficulties: licensing of data, privacy and anonymisation, and the fact that this work is at the bottom of a very long to-do list. Is the case compelling enough? Has the case been made? This kind of activity still isn’t widespread. It’s a ‘nice to have,’ but is it core? Why shouldn’t libraries just plug in LibraryThing or Amazon? One of the ways you can demonstrate that case and need is through market research, and we’ve been talking to a lot of humanities researchers over the last couple of years. Market research, as evidence of why your developments are needed, plays a pivotal role. We need to be able to demonstrate in concrete terms why we’re devoting resource to this area (and taking it away from elsewhere). We can use this to strengthen our proposals: our requests for more money, or our rationale for how we’re devoting resource.
Things such as user ratings and reviews are viewed with suspicion, linked to researchers’ need to trust the source of their information, but they do see value in recommenders. Most suggested that they were not in the habit of ‘rating and commenting’ (for example, on Amazon) and, to an extent, were suspicious of those who were. They might have a look, but would mostly disregard ratings and reviews due to:
• Suspicions about commercial interest
• Reputation & identity (who’s telling me this, and what do *they* know?)
• Concerns of trust and quality (is the information any good?)
Mention that the logic used follows the recommendation of Dave Pattern: http://www.daveyp.com/blog/archives/1453
Threshold = 15
Relatively plain sailing ‘til now: we followed the guidelines of Dave Pattern to extract the data and create the appropriate algorithms for the recommender. This is pulling from the API (which is not public yet, but will be).
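To make the idea concrete, here is a minimal sketch of a "people who borrowed this also borrowed" recommender driven by circulation data. It is not the SALT code or Dave Pattern's implementation. It assumes the threshold is the minimum number of distinct borrowers two items must share before one is recommended for the other, which is one plausible reading of "Threshold = 15". The function name `build_recommendations` and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_recommendations(loans, threshold=15):
    """loans: iterable of (borrower_token, item_id) pairs.

    Returns {item: [(other_item, shared_borrowers), ...]}, strongest first.
    ASSUMPTION: `threshold` is the minimum number of shared borrowers.
    """
    # Collect the distinct items each (anonymised) borrower has taken out.
    items_by_borrower = defaultdict(set)
    for borrower, item in loans:
        items_by_borrower[borrower].add(item)

    # For each unordered pair of items, count borrowers who took out both.
    pair_counts = defaultdict(int)
    for items in items_by_borrower.values():
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1

    # Keep only pairs at or above the threshold, in both directions.
    recommendations = defaultdict(list)
    for (a, b), shared in pair_counts.items():
        if shared >= threshold:
            recommendations[a].append((b, shared))
            recommendations[b].append((a, shared))

    # Strongest co-borrowing first; ties broken by item ID for stable output.
    for item in recommendations:
        recommendations[item].sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(recommendations)


# Toy data with a low threshold so the effect is visible.
toy_loans = [
    ("p1", "A"), ("p1", "B"),
    ("p2", "A"), ("p2", "B"),
    ("p3", "A"), ("p3", "C"),
]
strict = build_recommendations(toy_loans, threshold=2)
loose = build_recommendations(toy_loans, threshold=1)
```

In the toy data, lowering the threshold from 2 to 1 brings item C into the recommendations for A: more of the long tail surfaces, but on thinner evidence.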
What might this mean in a library context? Huddersfield is already showing more borrowing across the shelf. Researchers certainly were keen to find those items they were sure were out there but were missing, but this only works if the items are seen to have quality. And there were some concerns about activity data skewing results towards certain courses and reading lists.
Point to demonstrator: what would happen if we did this on a national level? We spoke to representatives from Cambridge, Leeds, Sussex, and the M25 group and there is certainly interest in taking this forward. But the message came through loud and clear that we need to make a strong case and answer this question. Our future development work needs to be accompanied by clear messages and dialogue with libraries.
Next steps:
• Make this function in JRUL OPAC
• User-testing: how useable?
• At what point do recommendations become highly irrelevant? (lowering the threshold)
Questions:
• Is this approach sustainable?
• Does this really need to be a shared service on top of aggregated data? (JRUL & Hudds demonstrate it doesn’t have to be)
• Data Out (API): Lightweight and agile
• Data In (Data processing): Not so much…
• Licensing and data privacy: which way?
• Attribution and ownership of what’s in the pot might prove to be a concern
• Increasing use of the long tail of underused collections? Our findings will be very much a scratch at the surface; real results will be yielded longer term (but this will go into service for JRUL, so there’s opportunity to test this; in addition, there’s evidence from Hudds to suggest there are positive trends)