With demands on library space increasing, while research collections continue to grow, off-site storage is becoming a reality for academic research libraries across North America. In 2005, the University of Toronto Libraries (UTL) established a high-density storage and preservation facility to preserve and maintain print serials and low use monographic resources. Aptly named Keep@Downsview, this collection now contains over 3 million volumes and has evolved into a collaborative partnership with four other Ontario universities. As our off-site collections continue to expand, this presents unique challenges for UTL in facilitating resource discovery and access. Since library users cannot physically browse the Keep@Downsview collections, the only way to discover these resources is through the metadata contained in the library’s discovery systems. To ensure that these resources remain accessible to the scholarly community, it is crucial that the metadata for these collections is optimized for search and discovery. With the goal of improving access to our off-site collections, we conducted an investigation into the state of our metadata for the print serial collection held at the Keep@Downsview facility. In this assessment, we analyzed metadata elements based on the following metrics: completeness, accuracy, consistency and coherence, conformance to expectations, timeliness and accessibility. Additionally, we conducted a qualitative investigation into how our metadata is perceived through the eyes of researchers, librarians, and our Keep@Downsview partners. By approaching metadata assessment from both an ‘Inside-Out’ and ‘Outside-In’ perspective, our aim was to obtain a holistic view of the quality and effectiveness of our metadata and explore strategies for improving the discovery of our remotely held collections.
Inside Out and Outside In: A Holistic Approach to Metadata Assessment for an Off-site Storage Collection
1. Inside Out
and
Outside In
A Holistic Approach to Metadata Assessment
for an Off-site Storage Collection
Marlene van Ballegooie and Juliya Borie
University of Toronto Libraries
NASIG Conference 2019
2. Agenda
● Our Context
● Why metadata assessment for off-site storage?
● Study Methodology
● Assessment Metrics
● Observations
● Next Steps
3. Our Context
● University of Toronto - Canada’s largest university
○ 70,890 undergraduate students
○ 19,187 graduate students
● Decentralized library system with
44 libraries on three campuses
● Over 13 million physical items and
approximately 2.8 million electronic
resources
4. What is Downsview?
● A purpose-built, high-density preservation facility designed to
provide a secure environmentally controlled space that is
optimal for long term preservation
● Built in 2005
● Approximately 3 million volumes
with capacity for further expansion
to five million items
5. About the Downsview Collection
Current collection composition:
○ 2,287,810 Monographs
○ 533,632 Serial volumes (23,834 titles)
○ 102,467 Other (including music, audio, video, maps, etc.)
Collection grew organically out of space needs:
○ Duplicate monographs
○ Low use resources from Robarts stacks
○ JSTOR journals and other cancelled journal runs
○ Pre-1923 holdings digitized by Internet Archive
6. Keep@Downsview as a Partnership
● Keep@Downsview is a shared print initiative between five Ontario
universities
● At the Downsview facility, partners can retain low-demand materials at
lower cost and without crowding storage spaces in the region with multiple
copies of lesser used items
● Partner libraries retain ownership of resources regardless of whether or not
a physical transfer actually takes place.
8. Challenges for Remote Collections
● No open shelves or public access to
collections
● No ability to physically browse collections
● Difficult to replicate serendipitous
discovery in online environment
● Total reliance on metadata for discovery
9. Improve Services
● Better metadata = improved discoverability = better service
● Recognition among staff that Downsview services could be
better
● High quality services require high quality metadata
● Metadata cleanup is an investment and will enable future
possibilities
10. Facilitate Comparison Across Collections
● As Downsview becomes a shared off-site storage facility, there is
increasing need to be able to compare across library collections
● For the Keep@Downsview Partnership
○ Quality metadata is critical to meeting project goals
● For national and international print preservation initiatives
○ Canadian Collective Print Preservation Strategy
○ HathiTrust
11. Upcoming Plans for System Migration
● UTL is planning for a transition to a new library services
platform (LSP)
● Prospective system migration is prompting a review of all
metadata
● Opportunity to clean up and address
legacy data issues
● Necessity to re-think established
workflows
13. Methodology
Utilize multiple techniques to obtain different views of our serials
metadata
● Local records vs. community managed records
● Local records vs. CONSER records
● Perceptions of staff and library partners
● Perception of users
Different perspectives provide a holistic
view of the quality and effectiveness of our serials metadata
14. Local Records vs. Community Managed Records
● Downsview collection contains 23,560 print serial records
● Matched 10,161 to OCLC master records on key identifiers such
as ISSN and OCLC number
● Combined dataset was loaded into MySQL for analysis
● SQL queries to determine convergence/divergence
between local data elements and OCLC records
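The comparison step above can be sketched as follows. This is a minimal illustration, not the study's actual schema or queries: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names (local_rec, oclc_rec), the column names, and the sample rows are all assumptions made for the example. It joins local and OCLC rows on ISSN and counts whether one element (here an abbreviated title) matches, is missing locally, or diverges.

```python
import sqlite3


def compare_element(local_rows, oclc_rows):
    """Count match / missing / mismatch for one element across matched records.

    Each row is an (issn, value) pair. Table and column names are hypothetical.
    """
    con = sqlite3.connect(":memory:")  # in-memory stand-in for a MySQL database
    con.execute("CREATE TABLE local_rec (issn TEXT PRIMARY KEY, abbrev TEXT)")
    con.execute("CREATE TABLE oclc_rec (issn TEXT PRIMARY KEY, abbrev TEXT)")
    con.executemany("INSERT INTO local_rec VALUES (?, ?)", local_rows)
    con.executemany("INSERT INTO oclc_rec VALUES (?, ?)", oclc_rows)
    query = """
        SELECT CASE
                 WHEN l.abbrev IS NULL OR l.abbrev = '' THEN 'missing'
                 WHEN l.abbrev = o.abbrev THEN 'match'
                 ELSE 'mismatch'
               END AS status,
               COUNT(*) AS n
        FROM local_rec AS l
        JOIN oclc_rec AS o ON o.issn = l.issn   -- only records matched on ISSN
        WHERE o.abbrev IS NOT NULL              -- only where OCLC has a value
        GROUP BY status
    """
    counts = dict(con.execute(query).fetchall())
    con.close()
    return counts


# Invented sample data, for illustration only.
local_rows = [
    ("0001-0001", "J. Test."),
    ("0002-0002", None),
    ("0003-0003", "Ann. X"),
    ("0004-0004", "Foo"),       # has no OCLC match, so it is not counted
]
oclc_rows = [
    ("0001-0001", "J. Test."),
    ("0002-0002", "Bull. Y"),
    ("0003-0003", "Ann. X."),
]
print(compare_element(local_rows, oclc_rows))
```

The same pattern (join on the shared identifier, then classify each pair) would apply to any other element being compared.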
15. Local Records vs. CONSER Standard
● Representative sample of 400 journal records that could not be
matched with OCLC
● Assessed the quality of our records vs. CONSER standard records
● Assessed holdings records
● Focus on foreign language titles as 40%
of the collection is in languages
other than English
16. Perceptions of Staff and Library Partners
Aimed to gain insight on metadata quality through the eyes of staff
members engaged in user services and Keep@Downsview partners
● Conducted three rounds of focus groups with user services
librarians
● Survey sent to Keep@Downsview partners
17. Perception of Library Users
Aimed to gain insight on the library user's experience in navigating
serials metadata
● Survey sent to faculty and graduate students in 3 Departments
● Conducted focus group with
graduate students
19. Bruce and Hillmann on Metadata Quality
● Defines general characteristics of metadata quality
● Quality measurements and metrics
○ Completeness
○ Accuracy
○ Provenance
○ Conformance to expectations
○ Logical consistency and coherence
○ Timeliness
○ Accessibility
21. Completeness
● According to Bruce and Hillmann:
"The element set used should describe the target objects as
completely as economically feasible...The element set should
be applied to the target object population as completely as
possible."
22. Does the element set completely describe the objects?
Serials are complex constructs that combine whole/part relationships and
aggregation relationships:
● they have a whole/part relationship to individual issues
published over time
● each individual issue is an aggregate of articles
(IFLA LRM)
23. Data Elements Used to Identify a Serial by Participants
● Title
● Place of publication
● Previous and subsequent title
● Holdings statements
● ISSN
● Editor (in cases of prominent editors)
● Series
● Ceased date
● Author/issuing body
● Subject headings (sometimes)
Image: "Forest" by Jean Jullien,
Flickr
24. CONSER Levels of Description:
● Full level records contain a full complement of elements that are applicable to the serial and all
elements contained are fully authoritative.
● Core level records contain those elements essential to the description and access of the serial
and all elements contained are fully authoritative.
● Minimal level records contain the essential (i.e., core) elements for description but subject
elements may not be present and one or more headings may not be authoritative.
Quality of UTL records according to CONSER Levels of Description
25. Faculty see title changes as part of the same journal
family
From Faculty:
"If I am not mistaken, records are usually separate for each title, so it may take some
investigation efforts to ensure this is the same journal. Sometimes it looked like I
found the correct record for the succeeding title, but that record had little information
to ensure it was the correct one (e.g., ISSN, place of publication or similar, perhaps
also an indication that the previous title was such and such) - so can't always be sure it
is the correct one."
27. Accuracy
According to Bruce and Hillmann:
"Metadata should be accurate in the way it describes objects...Minimally, the
information provided in the values needs to be correct and factual."
• Elimination of typographical errors
• Use of standard abbreviations
• Conform to standard expression of personal names and place names
28. Use of Standard Abbreviations
In the local record/OCLC comparison, standard abbreviations for journal
titles were analyzed:
• 4,647 journal title abbreviations in OCLC records
• 2,308 local records had matching abbreviations
• 2,213 local records did not contain a journal title abbreviation
• 126 local records had abbreviations that did not match OCLC
29. Are controlled vocabularies updated when relevant?
20% of records in the sample had Unauthorized Authorities
Cross reference used as main heading
○ American Society for Testing and Materials$bCommittee E-9 on Fatigue
instead of ASTM Committee E-9 on Fatigue
Earlier established forms of name used
○ Wales. National Library, Aberystwyth instead of current heading
National Library of Wales
30. Foreign Language Materials
From a Downsview partner:
"Diacritics are always an issue, especially with older records. If matching on ISBN or ISSN
it shouldn't be an issue too much though."
From a researcher: "In my field, it’s important to have parallel fields in non-Latin scripts
because I think most students in our Department even are not aware of Library of
Congress transliteration. Otherwise, there are many other ways to transliterate, for
example, for Cyrillic titles, there are National tables of transliteration. So, when I search,
it’s good to search in the original language and not to think what transliteration they
used to transcribe."
31. Differences in Transliteration can affect discoverability
From a faculty member: "Transliteration, spelling are a challenge, so it is
important to be attentive."
Sample problem:
UTL Record: $6880-01$aTaipingyang dao guo yan jiu =$bResearch on Pacific Island
countries /$cChen Dezheng zhu bian; Lü Guixia, Qu Shengfu zhu bian
OCLC Record: $6880-01$aTai ping yang dao guo yan jiu =$bResearch on Pacific Island
countries /$cChen Dezheng zhu bian ; Lü Guixia, Qu Shengfu zhu bian
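One way to see that the two romanized titles above describe the same resource is to compare them on a normalized key. The sketch below is an assumption-laden illustration, not a method described in the study: the function name match_key is hypothetical, and the key is deliberately lossy (it drops spacing, case, punctuation and diacritics), so it suits record matching only, never display or indexing.

```python
import unicodedata


def match_key(title: str) -> str:
    """Build a lossy comparison key from a romanized title (hypothetical helper)."""
    # Decompose accented letters so that diacritics become separate combining marks.
    decomposed = unicodedata.normalize("NFD", title)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Ignore case, spacing and punctuation: keep letters and digits only.
    return "".join(ch for ch in without_marks.casefold() if ch.isalnum())


utl_title = "Taipingyang dao guo yan jiu"
oclc_title = "Tai ping yang dao guo yan jiu"
print(match_key(utl_title) == match_key(oclc_title))
```

A key like this can overmatch (distinct titles may collapse to the same string), so any hits would still need review alongside identifiers such as the ISSN.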
33. Consistency and Coherence
According to Bruce and Hillmann:
"Elements are conceived in a way that is consistent with standard
definitions and concepts used in the subject or related domains and
presented to the user in consistent ways."
● Use of standard data structures (i.e. MARC)
● Ability to search collections of similar objects using similar criteria
34. Sparse Records
From a user services librarian:
"Sometimes it’s a lot of information, and sometimes it’s nothing. I mean sometimes it’s
so sparse that I’m like, how am I even supposed to figure out what this is, there’s no
information. So the contrast is a little stark sometimes."
35. Successive vs. Latest Entry
"Title changes were on the same record, but now that the volumes are in
Downsview they're on separate records. Having different titles on one record
made it very confusing to match volumes, and to see what, if anything, was
already in Downsview."
36. Variance in use of parallel titles
OCLC record:
$aAdministrative law reports.$nFourth series =$bRecueil de
jurisprudence en droit administratif. Quatrième série
UTL record:
$aAdministrative law reports.$nFourth series
37. Is data in elements consistent throughout?
From a user services librarian:
"I find it frustrating … that the volumes are described inconsistently. But I know that that
comes from ... someone entering the information, and, maybe we don’t really have a
clear or communicated practice, or criteria, for how individual items are reflected
through the creation of individual [item records]."
"With Downsview, the Keep@Downsview with other institutions adding stuff in, and also
with places like OISE and the Inforum [UTL libraries that use Dewey Decimal
classification] ... the call numbers don’t always match so therefore things are not in order
anymore. So it makes it harder to find."
38. Is data in elements consistent throughout?
From a user services librarian:
"It’s sometimes confusing because in some records
the electronic is mixed in with the print and in
others they are pulled apart as separate entities. I
think consistency around that would be very
useful."
40. Conformance to expectations
According to Bruce and Hillmann:
"Element sets...contain those elements that the community would
reasonably expect to find."
"It is important that community expectations be solicited, considered,
and managed realistically."
41. Is the metadata in line with community expectations?
Librarian and researcher expectations of how serial title changes should
be presented most effectively
From a user services librarian:
"I wish there was a way that you can ... bring them all
together … every serial or every volume or every
edition that with all the title variations, it can be seen
in one complete place. I would love to see
that. Wouldn’t it be great to be able to just flip a
switch and see it all come together and then send it
back out to its [original records]".
42. Is the metadata in line with community expectations?
Researchers expected that serial
metadata would mimic the
combined display that is
common on many vendor
websites
"[I prefer] one display that guides
through title changes in the
journal’s history. If the metadata
allows for such accommodation."
43. Metadata Inconsistencies Between Systems
Lack of metadata consistency between local catalogues and WorldCat
was baffling to researchers
"I’ve also had the experience of searching
something in RACER, and RACER didn’t show up ...
but on Worldcat it came up, which is confusing for
me. How is it possible for something to be on
Worldcat and be part of what we should be able to
find in RACER, but it was only after finding it in
Worldcat and then copying and pasting words in
RACER that it showed up in RACER."
44. Collaborative Catalogues
Researchers expressed interest in contributing their knowledge of a
serial resource to the library's catalogue records
Can we balance quality control with user expectations for a more open
and collaborative catalogue?
"Is there a possibility of people – kind of like in a
Wikipedia style – adding notes to the journal?"
46. Timeliness
Bruce and Hillmann describe two different aspects of metadata
timeliness: currency and lag.
• Currency - "When the target object changes but the metadata
does not"
• Lag - "When the target object is disseminated before some or all
metadata is knowable or available."
47. Is the metadata regularly updated as the resources
change?
Examples of problems with currency:
● Control field 008 06 – Type of date/Publication status
○ 606 local records are coded 'c' (Continuing resource currently published) while
the matching OCLC records are coded 'd' (Continuing resource ceased
publication)
● Control field 008 07-14 – Date 1 and Date 2
○ 1,329 local records have Date 2 set as 9999 while OCLC records have a terminal
date
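The two currency checks above can be expressed as a small comparison over the 008 control field, where position 06 holds the publication status and positions 11-14 hold Date 2. This is a sketch under stated assumptions: the function name currency_flags and the sample 008 strings are invented for illustration, and the records are assumed to be already paired with their OCLC counterparts.

```python
def currency_flags(local_008: str, oclc_008: str) -> list:
    """Flag currency problems by comparing a local 008 field with the OCLC one."""
    if len(local_008) < 15 or len(oclc_008) < 15:
        raise ValueError("008 field is too short to hold status and dates")
    flags = []
    # Position 06: type of date / publication status.
    if local_008[6] == "c" and oclc_008[6] == "d":
        flags.append("status: local 'c' (current), OCLC 'd' (ceased)")
    # Positions 11-14: Date 2; 9999 means the serial has no recorded end date.
    local_date2, oclc_date2 = local_008[11:15], oclc_008[11:15]
    if local_date2 == "9999" and oclc_date2 != "9999":
        flags.append(f"date 2: local 9999, OCLC {oclc_date2}")
    return flags


# Hypothetical 008 values, padded with blanks to the usual 40 characters.
local_008 = "850101c19469999onc".ljust(40)
oclc_008 = "850101d19462014onc".ljust(40)
for flag in currency_flags(local_008, oclc_008):
    print(flag)
```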
48. Obsolete MARC Coding Practices
Records not updated as MARC coding rules have changed
● Obsolete or incorrect language codes in fixed fields
○ Croatian – use of 'scr' instead of current code 'hrv'
○ Serbian – use of 'scc' instead of current code 'srp'
○ Some codes were simply incorrect – 'cro' and 'ser'
● Obsolete practice of recording multiple language codes in 041 field
○ $afregerita instead of $afre$ager$aita
○ $aengger$bczerumrus instead of $aeng$ager$bcze$brum$brus
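The 041 cleanup above lends itself to a programmatic fix: split each concatenated value into three-letter codes, repeat the subfield for each code, and replace the obsolete codes named on this slide. The sketch below is illustrative only. The function names are hypothetical, and the replacement table contains just the two obsolete codes listed above; the invalid codes 'cro' and 'ser' are left untouched, on the assumption that they need manual review.

```python
# Obsolete MARC language codes and their current replacements (from the slide).
OBSOLETE_CODES = {"scr": "hrv", "scc": "srp"}


def split_codes(value: str) -> list:
    """Split a concatenated string such as 'fregerita' into three-letter codes."""
    if len(value) % 3 != 0:
        raise ValueError(f"not a multiple of three characters: {value!r}")
    return [value[i:i + 3] for i in range(0, len(value), 3)]


def modernize_041(subfields):
    """Return 041 subfields with one current language code per subfield.

    `subfields` is a list of (subfield_code, value) pairs, e.g. [("a", "engger")].
    """
    updated = []
    for code, value in subfields:
        for lang in split_codes(value):
            updated.append((code, OBSOLETE_CODES.get(lang, lang)))
    return updated


print(modernize_041([("a", "engger"), ("b", "czerumrus")]))
```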
49. Changes in Publisher Not Updated
UTL record:
=245 00$aInternational journal.
=260 01$aToronto,$bCanadian Institute of International Affairs.
OCLC record:
=245 10$aInternational journal.
=246 1$iVolumes for <spring 2005-> also have title:$aIJ
=260 $a[Toronto] :$bCanadian Institute of International Affairs,$c[1946?]-
=260 3$3<Winter 2008/09-> :$aToronto :$bCanadian International Council
=264 31$3<Mar. 2014-> :$aLondon ;$aThousand Oaks, CA :$bSage Publications
50. Outdated Links
● Evidence of "link rot" in some local
records
● Target webpage no longer exists or no
longer represents original resource
=245 04$aThe lichenologist.
=856 41$zAlso available online:$3Table of contents and abstracts
$uhttp://www.hbuk.co.uk/www/ideal/journals/li.htm$2http
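A first pass at finding link rot could pull every $u value out of the 856 fields and test whether each URL still responds. The sketch below is an assumption, not the procedure used in the study: the function names are hypothetical, the field is assumed to be available as a $-delimited string like the example above, and the check uses a plain HTTP HEAD request from Python's standard library. It can only detect pages that no longer exist; a page that still loads but no longer represents the original resource would need human review.

```python
import re
import urllib.error
import urllib.request


def extract_urls(field_856: str) -> list:
    """Return every $u value from an 856 field written in $-delimited form."""
    return [url.strip() for url in re.findall(r"\$u([^$]+)", field_856)]


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers a HEAD request without an error status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        return False


def find_dead_links(field_856: str, check=is_reachable) -> list:
    """List the URLs in the field that fail the supplied check."""
    return [url for url in extract_urls(field_856) if not check(url)]


field = (
    "$zAlso available online:$3Table of contents and abstracts"
    "$uhttp://www.hbuk.co.uk/www/ideal/journals/li.htm$2http"
)
print(extract_urls(field))
```

Passing the checker in as a parameter keeps the parsing testable without network access and would let a slower, more careful check be swapped in later.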
52. Accessibility
According to Bruce and Hillmann:
"Metadata that cannot be read or understood by users has no value."
"There is a need to offer different views
or arrangements of metadata to meet
the expectations and needs of diverse
audiences."
53. Holdings are confusing
"The main thing I have with holdings is that there’s too much, they are hard to follow
and they are not accurate."
"It's just so fiddly, you know … I'd rather see a long list and browse numerically than to
see the ranges … provided that it's in order."
54. Holdings are confusing
"I think it’s challenging for users that it will say Robarts, but then it’s at Downsview ...
Sometimes the holding statements are hard to read in the summary. There’s a lot of
stuff there. So then you’re trying to go through all the stuff that’s in Downsview to see if
we have the one. But the lists are insanely long. [Then] you go in the classic catalogue
and your mind spins…"
55. Displays are confusing
"In this record ... all the descriptive information is below the holdings ... [T]hat’s not great
either because you don’t know it’s there. But I feel from a user experience perspective if
something appears right at the top, this is crucial information. And I’m not sure if 25
centimeters [is something] users need to know now."
56. Publication Information vs. Holdings
"The other thing that’s
confusing for users though
is that they often look at
the publication date
range at the top of the
record and they get
confused because they
think that’s our holdings."
58. Serials Metadata is Dynamic
● Keeping up with serials metadata changes is challenging
● Fluidity of serials requires systematic review of metadata
● Standards change - difficult to go back and make changes to all
our records
59. Indexing is important
Prior to sending serials to offsite storage, it is helpful to know if and where they
are indexed
Less than 20% of the sampled journals were indexed
(13% with full indexing and 5.5% with partial indexing)
From a user services librarian:
"A lot of journals we’ve sent offsite are not indexed anywhere. So
there is no way to request an article, because it’s not indexed ...
Or we sent the indexes together with the journal, that happens."
60. Metadata and Systems are Intertwined
Even if the metadata conforms to quality metrics, users perceive it
through a system interface. System interface design can significantly
impact discovery.
"One of the things that I think would be very, very helpful
if the links to the previous and the subsequent [titles]
actually linked to the record for the previous and the
subsequent. Because what happens is you click on the
link thinking it’s going to take you to that record and it
just takes you to the same record that you are in."
61. Content Over Format
"I don’t care what the format of the
periodical is, whether it is in print or in
microfilm or whether it’s in the library
or at Downsview … But if I do a journal
title search and it’s a microform then it
won’t come up and that’s not ideal."
62. Flexible Metadata
Successive entry vs. Latest entry - Can it be both?
Rethinking the notion of the "record"
Is the MARC data structure appropriate for dynamic resources?
BIBFRAME has potential as a more appropriate data structure for serials.
"You have the metadata separate, but you have the system bring it all
together. Sounds like a dream!"
63. Workflow Affects Discoverability
● Decentralized system = decentralized workflow = lack of consistency
● Cost-effectiveness of addressing metadata issues before sending to
off-site storage (Laskowski, 2016)
"Those technical reports, no one will ever be able to use them. They are so
complicated!… On these large-processing projects, things like this are done to
save time, but it creates, what I think, is a very confusing [display]. So it’s a
solution for system problem … But maybe not the ideal solution."
65. Improving Metadata Quality
In consultation with campus libraries, devise strategy for improving
discoverability of serials
○ Investigate programmatic upgrades to bibliographic records
(full overlays and targeted enhancements)
○ Clean up and reformat call numbers for listing
○ Investigate methods for cleaning up holdings records
and improving display in discovery system
Once methods for clean up are established, expand to non-Downsview
titles
66. Building Assessment into the Process
To ensure that metadata improvements are not short-lived, it is essential
to build assessment into all metadata creation and maintenance
activities
Define key points of review for serials metadata
One-time events: projects in preparation for system
migration
On-going review: renewal/cancellation time, transition
to off-site storage
Incorporate user feedback to prioritize and guide clean-up efforts
67. Advocate for Metadata Quality
● Technical services need to play an active
role in advocating for metadata quality
● Since metadata is the only way to access
these resources, we need to ensure that it
is optimized for our users
● The involvement of technical services in
preparing metadata for offsite storage is
essential
68. Assessment is not just measurement, it's a conversation
Assessment is more than simply measuring against a set
criteria for quality
It's a conversation – getting to know our metadata
better
We ask questions of our metadata
We ask questions of our users
We may not always like the feedback we get...
but the conversation provides guidance and
direction on where we need to go to move forward
69. Last words...
"I’m really hoping that this can lead to some changes. Because, you know,
you get a sense that there’s a lot of problems, and we can recognize that.
I think this is an area where a little bit of work can drastically improve a lot
of things for everyone."
There are a lot of things that are good too.
But there’s always room for improvement.
Hopefully, we can work together to do that.
71. References
Bruce, Thomas R., and Diane I. Hillmann. 2004. "The Continuum of Metadata Quality:
Defining, Expressing, Exploiting." In Metadata in Practice, 238-256. Chicago: ALA Editions.
Laskowski, Mary S. 2016. "When Good Enough Is Not Good Enough: Resolving
Cataloging Issues for High Density Storage." Cataloging & Classification Quarterly 54
(3): 147-158.
Riva, P., Le Bœuf, P. and Žumer, M. 2017. "IFLA Library Reference Model: A Conceptual
Model for Bibliographic Information. Revised after World-Wide Review." Consolidation
Editorial Group of the IFLA FRBR Review Group, 94.
Editor's notes
Forest vs the trees
Consistency in level of detail
From a Keep@Downsview partner
It was legacy practice to have print and electronic together; records like this remain in our system
Punctuation in holdings helps with accessibility. Earlier records don’t follow ANSI/NISO Z39.71-1999
Talk about holdings data
Skimpy vs. Full records - Debbie's quote
Value-adds: table of contents – but does MARC support this for serials?
Where and how should we begin to foster Quality?
"It is essential that quality assurance is built into the metadata creation process at the outset, that its scope extends beyond the local context and that the resulting metadata is as 'good' as it can be within the inevitable limitation of time and cost" (Barton, Jane, Sarah Currier, and Jessie M. N. Hey. "Building Quality Assurance into Metadata Creation: an Analysis based on the Learning Objects and e-Prints Communities of Practice." Paper presented at DC-2003, Seattle, WA).
Passing constructive feedback into the system design
Investigate alternative methods for systematically reviewing our data
Strategy for improving discoverability of titles