Stephen J. Stose IST 677 Assignment 1
Feb 22, 2010
A Chat with Digital Librarian and Academic Christine Madsen
Christine Madsen is a librarian and academic entrenched in the
world of digitization across many fronts. Christine and I met
up to talk in a small café in Oxford, UK, where she is currently
finishing her Ph.D. dissertation research on digitization’s
impact on scholarship and practice in the Tibetan and
Himalayan region. In 2003 she joined the Harvard University
Library as the project manager of the newly minted Open
Collections Program (www.ocp.hul.harvard.edu). In our time
together, Christine discussed the intricacies of spearheading
and managing this digitization program throughout her tenure
up until 2008, when she decided to leave the program to join
the Oxford Internet Institute [1].
Christine Madsen in Oxford. Christine’s blog can be found at www.christinemadsen.com
Background
Christine’s interest in digitization grew from working with the medium from which the
first large-scale digitization projects emerged: slides, then the preferred medium of
photographers and art historians for classroom and professional presentations.
Christine completed her undergraduate degree at the University of California San Diego
(UCSD) in art history and photography in 1995, and upon graduation worked as a
slide-mounting assistant in UCSD’s library. Slides were the easiest medium to begin digitizing,
Christine told me, as the one-to-one transcription of image to metadata presented the fewest
technical challenges to an industry in its infancy. While working, Christine attended the San
Jose State University School of Library and Information Science to complete her
Master’s degree in Library Science. With no digitization program in place, Christine
precociously selected her classes around what many library schools still lack today: a
curriculum focused on how the digital revolution has rendered the library’s future
ambiguous and uncertain. Her work at San Diego involved collaboration on ArtStor
(www.artstor.org), a non-profit digital library of more than one million images in the arts,
architecture and the social sciences.
Project Manager at Harvard
Christine’s appetite for innovation was rewarded by a job opportunity that was itself at
the forefront of the field. In 2003, Harvard University Library (HUL) [2] received a
pilot grant from the Hewlett Foundation [3] to begin a digitization program. One must
remember that Harvard College and University have over 70 libraries. Briefly, HUL is an
administrative entity established to centralize many services that individual libraries
would otherwise risk duplicating, such as cataloging, preservation and information
systems. Thus, while digitization efforts were already underway in many of the individual
libraries, each focused on digitizing materials particular to its own collection, Christine
[1] http://www.oii.ox.ac.uk/
[2] http://lib.harvard.edu/
[3] http://www.hewlett.org
explained how the digitization efforts undertaken by the Open Collections
Program at HUL would attempt to radically alter the philosophy behind digitization
initiatives. Because HUL was a centralized organization, it had access to materials
across the various individual libraries. The new Open Collections Program
thus sought to digitize and open access to topic-based collections whose physical
originals may have resided across the 70+ libraries that make up HUL. In
Christine’s words, this project attempted to exploit librarians’ under-appreciated skills in
collection development by “designing digital collections rather than digitizing existing
collections.”
The documentation, sub-contracting, budgeting, and development of this new kind of
digital collection defined Christine’s job responsibilities as project manager. Because
the aim was not to digitize any one collection exhaustively, but to increase access to
and visibility of a range of materials falling under a chosen topic, Christine’s first
task was to develop a taxonomy that adequately represented the two initial topics chosen
for digital development: 1) Working Women from 1800 to 1930 [4], and 2) Immigration
from 1789 to 1930. Christine focused her discussion on the former, as she believed it the
more likely success story. The initial selection criteria required that an object be rare
and not easily available (either to the public or to university members), and that it be
out of copyright. Development involved collaboration with scholars and graduate
students across various disciplines (e.g., women’s studies and sociology, and medical and
labor history, to name a few) to determine the scope of the topic and to gather suggestions
for object inclusion and exclusion [5]. Given that the principal goal of designing the Working
Women digital collection was to integrate it into the curricula of less
advantaged university and high school programs [6], Christine contracted a person with
expertise in such matters from Harvard’s School of Education. In Christine’s view, the
project’s success in gaining an audience that makes use of these resources to this day
justified the initiative. Digitization for its own sake makes little sense. When a library holds
more than 15 million volumes, designing a small, non-exhaustive part of it that is actually used
must be a satisfying exercise in perspective.
The initial pilot grant from the Hewlett Foundation was for 200,000 dollars. After one year,
Madsen convinced the foundation to renew the grant. She wryly noted that
this was both the most necessary and the most time-consuming part of her job. By 2007, after
constant biannual reporting on the financial justification for the program’s continuation, that
figure had reached 2 million dollars annually. Reports had to be filed both with Hewlett and
with the grant administrators at HUL. They included data analyses on real and potential users,
benchmarking studies, and workflow analyses, all framed as justification of financial
sustainability.
[4] Madsen, Christine, Thomas J. Michalak, and Megan Hurst. “Harvard University Library Open Collections Program, Women Working, 1870-1930: Final Report.” March 2005. Available online at http://ocp.hul.harvard.edu/report/final/
[5] Michalak, Thomas J. and Christine Madsen. “Selecting Resources for the Women Working Digital Collection: Books, Documents, and Pamphlets.” May 25, 2004. Available online at http://hul.harvard.edu/ocp/internal/progress/developing_the_collection.pdf
[6] Madsen, Christine, and Megan Hurst. “The Role of the Library in Open Education.” September, 2005. The Center for Open and Sustainable Learning Conference Proceedings, 2005.
Much of this annual money was spent on contracted services. The materials digitized
included manuscripts, photographs, books and journals. Given their rarity and
inaccessibility, extreme caution was exercised in the workflow. Contracts were sought and
formalized with Harvard’s technical (i.e., cataloging) and imaging services, an internal
advantage available to institutions with Harvard’s resources. Still, transport was never
straightforward, insofar as objects originated in one of 70 distinct libraries. Each of these
libraries had its own internal administration, and Christine described negotiation with the
heads of these collections as the most problematic part of her job. Ultimate authority over a
particular collection remained with the individual library; it was not for the central HUL
administration to decide. So when I asked Christine how she got this job at the
age of 30, she credited her ability to talk to people at all levels of an institutional hierarchy.
Extreme diplomacy, and perhaps even, as Christine hinted, obsequiousness, was the key
to securing the objects selected from each institution.
It is well known that librarians, as caretakers, develop possessive feelings toward their
collections, and I wonder whether this concern stymies the sharing and preservation of
the individual items that make them up. In collection development, items are collated to tell a
story. The most frequent complaint Christine reported hearing was that this digitization
initiative required librarians to individuate an item from its larger context. That is,
librarians did not want to separate the item from the others that make up the collection’s
story and that motivated its original collation. This, to me, is the problem libraries face in a
nutshell. Do they love the story itself more than the items making it up? Many consider a
tomato a vegetable, but it is also a fruit, depending on the story. Digitization is one step toward
respecting an item’s descriptive polysemy: because a digital surrogate is not tied to a physical
location, it need not fit in only one box in only one depository. Christine’s job consisted in
diplomatically convincing curators that the digitized items she would separate from
their boxes could indeed belong to two separate but equally valid stories. Thus, a pamphlet
on the women’s labor movement taken from a collection of labor pamphlets would be
redescribed as also belonging to a collection on Working Women. Nevertheless, many
resisted, and Christine’s characterizations of these relationships resembled
descriptions of interactions between a therapist and a patient. Despite assurances that
the program would take pains to describe each item’s contextual provenance, libraries could
and, on rare occasions, did say no. There were other reasons for denial as well. One she
recalled was a condition requiring that the digitized object be watermarked, despite the
fact that the item in its current state was not even cataloged. All or nothing, some might say.
Once extracted, the item would be transported by the project’s team of “book wranglers” to
HUL’s technical services [7] (cataloging), where it was cataloged. This process was itself
innovative. Given the centralized nature of HOLLIS (Harvard’s Online Library
Information System [8]), digitized items were not redundantly cataloged. Instead, one of
Christine’s responsibilities was to create a digital library that would dynamically reflect,
rather than hold, the institutional metadata regarding the object. In other words, while the digital
image and the digital library required separate creation, the metadata for each digital item
was drawn from the centralized cataloging database that would also hold the record for the
[7] http://hcl.harvard.edu/technicalservices/
[8] http://lib.harvard.edu/catalogs/hollis.html
object itself. In this way, nothing was duplicated. The result: one object, one digital
object, and one metadata record pointing to each. Christine described cataloging as the
most expensive item in the budget. The same office was responsible for preservation
efforts if and when they were required. That is, the program took advantage of an item’s
release from its home library to apply preservation techniques where needed, thereby
minimizing handling costs. This was not common, however, and Christine discussed it little.
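The catalog-by-reference arrangement described above (one object, one digital object, one metadata record) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `CATALOG` dictionary merely stands in for a central catalog such as HOLLIS, and the record id, field names, and image path are hypothetical rather than taken from Harvard's systems.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a central catalog: record id -> metadata record.
CATALOG = {
    "rec-001": {
        "title": "Pamphlet on the women's labor movement",
        "holding_library": "Library A",
    },
}


@dataclass(frozen=True)
class DigitalObject:
    """A digitized item that points at the catalog instead of copying it."""

    record_id: str   # pointer to the single shared catalog record
    image_path: str  # illustrative location of the scanned image

    @property
    def metadata(self) -> dict:
        # Looked up at access time, so the digital library reflects the
        # catalog record rather than holding a duplicate of it.
        return CATALOG[self.record_id]


scan = DigitalObject(record_id="rec-001", image_path="images/rec-001.tif")
print(scan.metadata["title"])

# A correction made once in the catalog is visible through the digital object.
CATALOG["rec-001"]["title"] = "Pamphlet on the women's labor movement (corrected)"
print(scan.metadata["title"])
```

Because the digital object stores only a pointer, a cataloger's correction needs to be made in one place, which is the point of not cataloging digitized items redundantly.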
The second most expensive item was contracting the digital imaging itself to Harvard
Imaging Services [9]. This department, Christine averred, held the most innovative thinkers
and became, for her, the most influential part of the job. She named Harvard’s head
photography expert, David Remington, and the head of imaging services and preservation,
Bill Comstock, as two members of what she considered to be the best imaging
group working today, and the source of the most valuable training she received. In 2006,
together with Bill Comstock, Madsen published the article “Streamlining a Book Digitization
Workflow” in an imaging journal [10]. Working with this group, for Christine, constituted a
digitization education in itself. While her training at library school taught her about
digitization, she is firmly convinced it was rather unimportant compared to her work
experience and interactions with colleagues. She did attend the school for scanning run by
NEDCC (the Northeast Document Conservation Center [11]), but it assisted her more with the
intricate details of project management and the role of applying technology to user needs.
One last step completed the workflow of the digitization program Christine Madsen led
at Harvard University Library before items were returned to their home libraries: quality
control. Madsen taught her book wranglers to check image quality when they picked up
items after scanning. That included ensuring images were straight and in focus, and
validating that the digitized images were numbered correctly in relation to one another and
to the physical originals. I attach a copy of the decision tree and workflow Christine
created (selection, agreement with libraries, extraction, cataloging/preservation, imaging,
quality control, item’s return) to illustrate the complexity of this entire workflow.
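The numbering part of the quality control step can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure Christine's team used: the filename scheme (`item_0001.tif`) and the function name `check_page_sequence` are hypothetical, and the checks for straightness and focus would need image analysis that is not attempted here.

```python
import re


def check_page_sequence(filenames, expected_pages):
    """Return a list of numbering problems; an empty list means the check passed."""
    # Assumed naming scheme: a page number before the extension, e.g. item_0001.tif
    pattern = re.compile(r"_(\d+)\.tif$")
    problems = []
    numbers = []
    for name in filenames:
        match = pattern.search(name)
        if match is None:
            problems.append(f"unrecognized filename: {name}")
        else:
            numbers.append(int(match.group(1)))

    # Images numbered correctly in relation to one another: no duplicates.
    seen = set()
    for number in numbers:
        if number in seen:
            problems.append(f"duplicate page number: {number}")
        seen.add(number)

    # Images numbered correctly in relation to the physical original:
    # every page of the original is present, and nothing beyond it.
    for number in range(1, expected_pages + 1):
        if number not in seen:
            problems.append(f"missing page: {number}")
    for number in sorted(seen):
        if number > expected_pages:
            problems.append(f"unexpected extra page: {number}")
    return problems


print(check_page_sequence(["item_0001.tif", "item_0003.tif", "item_0003.tif"], 3))
```

Returning a list of problems rather than a single pass/fail value lets the person doing the pickup see everything that needs rescanning in one pass.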
Conclusion
As an academic in a Ph.D. program focused on digitization, Christine keeps herself informed in
myriad ways beyond her colleagues and NEDCC’s school for scanning.
For instance, she has attended Digital Library Federation [12] conferences every year since
2001. She also highlighted the online peer-reviewed journal “First Monday” [13], dedicated
to the study of the Internet, as being especially influential. Christine herself, however, is fast
becoming a leader in digitization. After she left her post at Harvard, her expertise was
quickly committed to several other ongoing research projects.
For instance, in 2008 JISC (Joint Information Systems Committee [14]) awarded her a grant
[9] http://hcl.harvard.edu/info/imaging/
[10] Madsen, Christine and Bill Comstock. “Streamlining a Book Digitization Workflow.” Microform and Imaging Review, Spring 2006.
[11] http://www.nedcc.org/home.php
[12] http://www.diglib.org/
[13] http://firstmonday.org/
[14] http://www.jisc.ac.uk/
to develop a “Toolkit for the Impact of Digitised Scholarly Resources” [15] in collaboration
with her colleagues at the Oxford Internet Institute. This stemmed from previous work
with CNRS (Centre national de la recherche scientifique [16]) in France, in which she
assessed the needs of actual and potential users as part of Philosource [17], a semantic web
of digitized philosophical papers. She is also currently funded by the Internet Society
to develop a “Toolkit of the Dissemination of Cultural Heritage Materials Online” [18],
which coincides with her dissertation work on the Tibetan region and which has the
admirable goal of, in the words of Christine, creating and providing…
“… open access to a rich and multi-lingual library of resources that will promote
the creation of meaningful, well-designed, and sustainable digitization programs
by institutions supporting cultures whose textual artifacts are under threat from
the political or natural environment.”
[15] http://microsites.oii.ox.ac.uk/tidsr/
[16] http://www.cnrs.fr/
[17] http://www.discovery-project.eu/philosource.html
[18] http://www.isoc.org/isoc/chapters/projects/awards.php?id=9