Collection development is big business and how academic libraries decide to invest in content is radically changing. This is being driven as much by new approaches to organisational design, relationship management, and data insight in universities as by changes to business models and technology in scholarly publishing and the supply chain. Based on recent experience at Edinburgh, Manchester and Northumbria, this participatory session will explore new strategies for collection development, and specifically address challenges and opportunities faced by libraries that have moved or are transitioning from traditional subject librarian roles.
UKSG Conference 2016 Breakout Session - Collection development in a world without subject librarians, Rachel Kirkwood
1. Collection development in a world without subject librarians
Rachel Kirkwood,
Collection Development Manager,
University of Manchester Library
UKSG 2016 @racheljkirkwood
6. Content budget diversity
1. OCLC FirstSearch Subscription (less than £1K)
2. ProQuest Ulrich's Serials Analysis System
3. Participation in the CLOCKSS Archive
4. ORCID Membership
5. Royal Society Open Access
6. Access to Altmetric for Institutions & API
7. ELSEVIER Scopus Fee
8. ELSEVIER SciVal All Modules (nearly £50K)
@racheljkirkwood for #UKSG16 6
7. Drivers for change
Student Experience
Course Collections Programme
Value of investment
Materials budget allocation changes
New modes of access
E-book packages
New services
Competing pressures for Academic Support Librarians (ASLs)
8. Why did we do it?
• No more jack of all trades librarians
• Specialist teams to meet new challenges, e.g. for research services:
– Research Data Management
– Scholarly Comms and OA
– Citation analysis
• Academic Engagement Librarians as ‘key account managers’
• Teaching & Learning Librarians to transform info literacy teaching with My Learning Essentials
Practical imperative to ‘automate’ CD
10. Automation/self-service
• PDA (patron driven acquisition) for e AND print: Books on Demand
• EBA (evidence based acquisition) of JSTOR monographs
• Approval plans and back runs
– ‘blanket’ approach but can be tailored through a detailed profile
– More librarian input (subject knowledge?)
11. Getting clever with data
• Let’s try: search profiles based on research keywords
1. Extracted from Scopus abstracts (Maths)
2. Harvested from School web pages (EEE)
– More clever, but tricky; statistical skewing OR labour-intensive
• Blended approach: standing orders based on academic liaison
12. Unlocking the back office
• Using Alma Analytics
• Building dashboards
– Age profiles
– Print and e-books by Dewey century/decade
– Loan rates
– Items added to stock per year
– User types at site libraries …
13. Looking to the future
• Collection profiles
– Leeds categorisation of collections
• Discipline profiles
– Statistics combined with relationship knowledge
• Dynamically linked, by mapping the taxonomy of one domain to that of another
– Demonstrating relevance and value
• Combining multiple sources of data
14. Outstanding issues
• Poor quality/absent (meta)data
• Lack of data fluency
• Need for statistical analysis skills
• Complexity and inconsistency of usage data
• Linking differently structured data
15. OVER TO NICK!
Any questions?
rachel.kirkwood@manchester.ac.uk
@racheljkirkwood
Editor's Notes
If you are live tweeting, there’s my handle.
Why did we say CD is big business? Because at institutions the size of Edinburgh or Manchester we spend millions of pounds on content each year, and collectively HE libraries spend billions of course. The amount we can spend on print items is becoming limited by the bald fact that many of our libraries are just full, and in general content budgets are being tightened like many others; but even in the case of spend on electronic resources – which don’t take up shelf space – there is a shift in attitude. In a previous era at Manchester we aimed for comprehensive coverage, big deals, and prided ourselves on sheer numbers of e-journals and e-books. Recently one of the conference’s generous sponsors has put a laudable amount of effort into trying to sell us what they described as “a comprehensive offer”, but I have had to reiterate that even though it doesn’t take up shelf space I am not interested in buying all the stuff, just the right stuff. And for me that means relevant stuff which will get used – we’re looking for value for money, if not RoI, which would be difficult to measure … in terms of ‘successful’ research (REF *s maybe) which cited our content?
Best value seems to mean buying the most relevant, easily accessible and usable resources, and making sure people know how to find them and use them. The collection development team at Manchester (that’s basically me, for now) has an emerging close relationship with our Academic Engagement Librarians, who are positioned in our Strategic Marketing & Comms division as ‘key account holders’ and principal conduit between the library and faculty. One of the work packages in last year’s library strategy project on CD & Profiling looked particularly at smoothing the workflows for e-books (both individual titles and packages), looking at integrating the selection, purchasing, cataloguing and activation of ebooks along with their marketing. We’ve now had two promotional events which have seen collection management folk working alongside academic liaison people, with direct input from the content providers themselves. And they also mediate requests from academics for new subscriptions, working alongside faculty to build ‘business cases’ for new requests. [Criteria include … ]
We wrote the abstract for this session before Christmas, but the question of ‘how we decide to invest’ has just recently become the focus of a flurry of social media interaction with a fascinating post on the “In the Open” blog by MIT’s Ellen Finnie, Head, Scholarly Communications & Collection Strategy, MIT Libraries. In a move to support the open movement and ‘vote with their collections dollars’ MIT library has subsumed their content budget under their scholarly communications budget.
Manchester (and I’m sure many others represented here) is doing this in a small way by supporting the Knowledge Unlatched model, where libraries can directly support making research monographs open access, and librarians participate in the choosing of those monographs. I think this is a great way to support the ‘long form’ in scholarship.
At the moment our budget is still primarily a content budget, but we have gone a step further than Edinburgh in that there are no more micro-divisions at subject level at all. The divisions are along the lines of ‘research’ (all subjects), ‘teaching and learning’, PDA, and BOD.
However, the number of ‘non-content’ items paid on this budget has increased so we are now monitoring them more closely. This is probably not a comprehensive list, but … In ascending order of cost these are:
(There may be something missed off, e.g. WoS.) This gives us a snapshot view of how the concept of ‘content’ is diversifying.
1 supports cataloguing to make the content discoverable.
2 helps analyse our e-journals – did you know that Ulrich’s provides Dewey classification for journals?
3 helps preserve content.
4 helps uniquely identify the individual producers of content.
5 helps our researchers make the content they have created freely available.
6 is about measuring how people are reacting to that content on social media.
7 indexes published content (and could help to drive further selection, wait for a later slide!).
The last one is perhaps furthest from content, and comes at a price that could swallow all the others combined. SciVal offers a set of very powerful analytical tools which allow you to do things like examine journal rankings, researcher h-indexes, project future research collaborations etc. [check with Stephen].
1. Regular compilation of publication lists for each of the University's Schools and Institutes.
2. Regular evaluation of the citation impact of the aforementioned publications, particularly in terms of whether each article is among the top 10% most cited in its field.
This allows me to step back to the question of drivers for change. All the drivers noted by Laura also applied in our case. There’s a significant difference in emphasis, perhaps, in that for us, the final point was decisive.
The lure of those new services was such that the library leadership made a strategic decision to restructure into functional teams, in order to position the library better to respond to the challenges of Research Data Management and Open Access Publishing, for example, which are complex and fast-changing areas. Plus we wanted to improve the standard and consistency of our more traditional activities, such as skills training and reading list provision. So lots of similarities there, but the most significant drivers were external to content development. Did CD get left behind slightly? We are currently awaiting a further, stalled restructure which would see more staffing resource allocated to CD. With only a single person having a remit for CD, there was a quite practical imperative to take automation to its limits and develop new data-driven approaches.
To look at some REACTIVE approaches first …
Teaching & learning resources are pretty much looked after in a fairly reactive but still dynamic way. We’ve centralised the way we deal with reading list material, there are direct order requests by academics, requests for our CLA licensed copying service, (all online of course) and PDA for e-books AND NOW print too. We’re in the third year of using PDA now, and this year we will be using a different supplier (EBL). One important reason for switching supplier was that EBL can give us more granular information about which users have triggered purchases – this is the kind of data that interests me, because it represents a direct link between our content and our users.
But what about research level resources? Can we get clever with data about research?
Keyword-driven approaches.
A more ambitious, innovative approach we tried involved drawing up profiles for purchase based on “research keyword” methodologies; keywords descriptive of Manchester’s published research outputs. Two methodologies attempted:
1) Keywords from Scopus – this is most suitable for subjects well indexed by Scopus, such as Mathematics
The plan was to extract keywords from indexed UoM publications, then sort, count and rank them in a list.
These would be filtered against the supplier’s subject bands to ensure relevance of terms, and we’d compare matched suggestions for purchase with the Coutts approval plan.
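A minimal sketch of the count-and-rank step in Python, for illustration only. The record structure, the `keywords` field and the `subject_terms` list are hypothetical stand-ins; real Scopus exports and supplier subject bands are structured differently, and this is not the workflow actually used.

```python
from collections import Counter

def rank_keywords(publications, subject_terms=None):
    """Return (keyword, count) pairs, most frequent first.

    `publications` is a list of dicts with a hypothetical "keywords" field.
    `subject_terms` is an optional list standing in for supplier subject bands.
    """
    counts = Counter()
    for pub in publications:
        # Normalise case and whitespace; count each keyword once per publication.
        counts.update({kw.strip().lower() for kw in pub["keywords"]})
    if subject_terms is not None:
        # Keep only keywords that also appear in the supplier-side term list.
        allowed = {term.strip().lower() for term in subject_terms}
        counts = Counter({kw: n for kw, n in counts.items() if kw in allowed})
    # Sort by descending count, then alphabetically so ties are stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Invented sample data, not real publication records.
publications = [
    {"keywords": ["Graph theory", "Combinatorics"]},
    {"keywords": ["graph theory", "Number theory"]},
    {"keywords": ["Combinatorics", "graph theory "]},
]
ranked_all = rank_keywords(publications)
ranked_filtered = rank_keywords(publications, subject_terms=["Graph theory", "Combinatorics"])
```

Counting each keyword once per publication is a design choice here; a raw count of every occurrence would give more weight to prolific authors.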
Problems! When we took our ranked list to a friendly academic (head of research for maths) for a relevance check he was … well, fairly horrified! There was statistical skewing introduced by an overproductive but undersignificant member of the department, of which we were completely unaware.
2) Harvest keywords manually from web pages. The source is the School’s self-presentation/description, which we assume has been through some process of editorial/marketing review! This we tried with Electrical and Electronic Engineering. The application was a simple supplier-side search plus a few parameters, with embedded orders created by the supplier. This is significant: we’re trying to transfer as much processing load to the supplier as possible. We’ve got some very relevant-looking material, but some more filtering needs to be applied, and you need to do it in the right order. For example, readership level: not all titles will have had a readership level assigned by Coutts, so you don’t want to rule out potentially good titles by specifying “must be upper UG or research level”. But then you have to filter post-search, removing anything that HAS got the WRONG levels assigned.
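The ordering point about readership level can be sketched in Python as a post-search filter: keep titles with no level assigned, and drop only those carrying a level outside the wanted set. The field name `readership_level` and the level labels are assumptions made for illustration, not the supplier's actual data format.

```python
# Hypothetical labels for the readership levels we want to keep.
ACCEPTED_LEVELS = {"upper undergraduate", "research"}

def filter_by_readership(titles, accepted=ACCEPTED_LEVELS):
    """Drop titles with a wrong readership level; keep unlabelled ones."""
    kept = []
    for title in titles:
        level = title.get("readership_level")  # None when no level was assigned
        if level is None or level in accepted:
            kept.append(title)
    return kept

# Invented search results, for illustration only.
search_results = [
    {"title": "Title A", "readership_level": "research"},
    {"title": "Title B", "readership_level": None},
    {"title": "Title C", "readership_level": "general"},
    {"title": "Title D"},
]
shortlist = filter_by_readership(search_results)
```

Filtering this way after the search, rather than requiring a level in the search itself, is what stops unlabelled but potentially good titles from being ruled out.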
As if we didn’t know it already, this underlines how important the QUALITY of metadata is when moving to data-driven methodologies.
We are also trying a kind of blended approach, a sort of semi-automated CD.
The relevant ‘academic engagement’ librarian works with a focus group of academics to specify a ‘profile’ based on the most relevant monographic series from the most reputable publishers. For the discipline of Classics & Ancient History this turned out to be OUP and CUP. (Interestingly, it was the same for Physics.) Again, we try to shift the work to the publishers. We stopped struggling to work out what titles have appeared in a monographic series, which can be challenging, depending on the quality of your cataloguing, and indeed your data structure (as these series have an annoying habit of changing their title in small but significant ways). Instead we just write to the publishers and ask for a comprehensive list with all-format ISBNs and prices. The data arrives in a nicely structured spreadsheet which can be manipulated in all kinds of ways and related to your library data.
As our supplier has up-to-date holdings information from us, they can do the work of checking against existing holdings.
If we get the right reporting codes into the order/bibliographic records we can track usage in terms of borrowing at least. Tracking usage in terms of citations would be a different research project! We’ve not got very far yet in establishing what might constitute good usage of research monographs, but we’re fairly sure that discoverability and marketing have a role to play, and we’re working more closely with our Academic Engagement colleagues from strategic marketing as a result of this project.
Here we start to really engage with the data, and make it work for us. Lots of tools are there ready to be used, stories to be told – but are staff confident enough to use them?
Collection profiles. It’s about knowing what we’ve got, comparing it to others, and deciding how good / extensive / strategically important it is, and how we want to treat it. Alma Analytics for the internal picture; the CCM tool for benchmarking.
What percentage of this collection is unique to Manchester? What percentage is almost unique (3 or fewer holding libraries)? And why so few: is the material very expensive, very specialised, or very bad (poor scholarship, out of date)? How much of this collection is held by lots of other libraries? Why should this be so? What story is the data telling us about this collection of ours, in relation to others?
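The first two of these questions come down to simple arithmetic once holdings counts are available. A minimal Python sketch, assuming a list giving, for each title, the number of libraries holding it (our own included), so that a count of 1 means unique to us. The input format is hypothetical and is not the output of any particular benchmarking tool.

```python
def uniqueness_profile(holding_counts, near_unique_max=3):
    """Return percentages of unique, almost unique and more widely held titles.

    `holding_counts` holds one integer per title: the number of holding
    libraries, counting our own. Interpretation assumed for this sketch:
    1 = unique, 2..near_unique_max = almost unique.
    """
    total = len(holding_counts)
    if total == 0:
        return {"unique": 0.0, "almost_unique": 0.0, "more_widely_held": 0.0}
    unique = sum(1 for n in holding_counts if n == 1)
    almost = sum(1 for n in holding_counts if 1 < n <= near_unique_max)
    widely = total - unique - almost
    return {
        "unique": 100 * unique / total,
        "almost_unique": 100 * almost / total,
        "more_widely_held": 100 * widely / total,
    }

# Invented holdings counts for eight titles.
profile = uniqueness_profile([1, 1, 2, 3, 10, 25, 4, 8])
```

The percentages only answer "how much"; the "why so few" questions still need subject and relationship knowledge.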
We can’t decide how strategically important a collection is and how we want to treat it, without information about our operating context, the strategic priorities of the University in its research endeavours, for example. Stats from the data warehouse on size of disciplines; info and data from the CRIS system.
Academic subject codes mapped to Library classification codes.
Sentry stats to tell us who uses the physical library.
Data fluency. I think this is the big issue for librarians now. Crucial to the success of this project has been the data fluency of particular individuals involved. By this I don’t just mean the skills of being able to build or use a relational database (such as Access) – although this is a key part of it. What I mean is an awareness of how data can be used and manipulated, and how it needs to be structured in order to facilitate computational processing.
And an enquiring, imaginative mind that can ask questions of data, allow one question to prompt another, and imagine what stories the data might help us to tell about our collections and the people who use them. A curiosity about what data can do for collections and their users, and a willingness to engage with it – in short, a research librarian needs to have the mind of a researcher. Cataloguers need to do more than just re-label themselves ‘metadata teams’ – they need to recreate themselves and embrace new roles.
We need a policy on usage statistics – how to understand and interpret them, how to apply them. In short, it’s a nightmare.
Data needs to be structured, and when you can combine it, really powerful stories emerge!
Thank you very much for listening so far – the next section is over to Nick.