From Digitization to Discoverability: Accomplishments and New Challenges: a Case Study from the JDC Archives
Linda Levi, Director of the JDC Global Archives and Jeffrey Edelstein, Digitization Project Manager, JDC Archives
1. From Digitization to Discoverability: Accomplishments and New Challenges
A Case Study of the JDC Archives
Linda G. Levi, Director of JDC Global Archives
Jeffrey Edelstein, Digitization Project Manager
November 2015
2. The JDC Archives Online
Main site:
http://archives.jdc.org
Collections database:
http://search.archives.jdc.org
3. Digitization of Text Collections
• Nearly 2.75 million pages digitized to date
• All collections from 1914 through 1954, plus some 1955-1989 collections
• Digitization, by period:
World War I Era: 100,089 pages
Interwar Period: 155,973 pages
World War II Era and Aftermath: 2,048,783 pages
Israel Collection: 87,809 pages
More Contemporary Collections: 352,954 pages
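The per-period figures above can be checked against the "nearly 2.75 million" total with a short script. This is only a sketch: the dictionary name and layout are illustrative, and the counts are copied from the slide.

```python
# Page counts by period, copied from the slide above.
pages_by_period = {
    "World War I Era": 100_089,
    "Interwar Period": 155_973,
    "World War II Era and Aftermath": 2_048_783,
    "Israel Collection": 87_809,
    "More Contemporary Collections": 352_954,
}

# Sum the periods and report each period's share of the total.
total_pages = sum(pages_by_period.values())
print(f"Total: {total_pages:,} pages")  # Total: 2,745,608 pages
for period, pages in pages_by_period.items():
    print(f"{period}: {pages / total_pages:.1%}")
```

The total of 2,745,608 pages is consistent with the rounded figure, and the World War II Era and Aftermath collections account for roughly three-quarters of it.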
4. Projects
• Judaica Europeana: Shared file-level XML for 1914-1918 collection
• Yad Vashem: Sharing complete XML to item level and digital assets for Geneva 1945-1954 collection
• European Holocaust Research Infrastructure (EHRI): Shared descriptions (with finding aid links) of 7 Holocaust-era collections
• CENDARI: Shared file-level XML for 1914-1918 and 1919-1921 collections
• World Digital Library: Provided descriptive metadata (via spreadsheet) for 36 selected images
• Empire State Digital Network: Provided descriptive metadata (via XML output) for selected photos
• Digital Library of the Caribbean: Shared file-level XML for Dominican Republic Settlement Association (DORSA) collection
• Atlit: Shared lists of names of detainees for indexing and entry into Atlit’s database
• Beit Hatfutsot: Goal is to provide access to Names Index API so that names in our database are returned as search results in their interface
5. Dissemination of Digitized Text Collections
World War I Era
World War II Era and Aftermath
Israel Collections
6. Digitization & Dissemination of Other Collections
• Photo Collection: Over 65,000 photos digitized
• Names Index: 500,000 names from lists and index cards in the text collections
• Oral History Collection: AV recordings and transcripts of interviews with 155 JDC staff and lay leaders, 1961-2010
7. What We Have Learned
1. Digitization is step #1. Discoverability is step #2.
2. Steps to drive traffic to our site
– Quarterly JDC Archives eNewsletter
– Social media: Facebook, Instagram
– Linking to other sites
– Google search engine optimization
– JDC website to drive traffic
3. Curated material is popular
– Online exhibits
– Topic guides
– Photo Galleries
– Names Index
8. Data-Sharing Collaborations
Impetus:
• Completion of initial digitization grant: 1.8 million pages online
• Desire to increase awareness of online availability and use of the material/site traffic (donor mandate)
• Successful pilot project with Judaica Europeana
9. Types of Collaboration
1. Collection descriptions: Shared descriptive information from our finding aids, with link to full finding aid on the JDC Archives site (EHRI)
10. Types of Collaboration
2. File-level XML: Shared XML for file records for each collection (Europeana, CENDARI, Digital Library of the Caribbean)
11. Types of Collaboration
3. Groups of images: Shared selected images with complete metadata (World Digital Library, Empire State Digital Network/DPLA)
14. Technical Issues
• XML output
– Need to map our fields to partner’s schema
– Work with our database provider to modify export
• Vocabulary
– In-house subject terms may not be from a standard authority (e.g., Library of Congress subject headings)
– Vocabulary required by partner (e.g., DDC codes) may be difficult to apply and may not fully describe JDC items (WDL project)
– May need to add broader terms for general audience (WDL; ESDN)
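The field-mapping and broader-term steps above can be sketched in a few lines. This is a minimal illustration, not the export actually used for these projects: the field names, the partner element names, the broader-term table, and the sample record are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from in-house field names to a partner's element names.
FIELD_MAP = {
    "file_title": "title",
    "date_range": "date",
    "subjects": "subject",
    "finding_aid_url": "identifier",
}

# Hypothetical broader terms added for a general audience.
BROADER_TERMS = {
    "Emigration assistance": "Refugees",
}


def to_partner_record(record: dict) -> ET.Element:
    """Build a partner-style XML record from an in-house record."""
    root = ET.Element("record")
    for field, value in record.items():
        target = FIELD_MAP.get(field)
        if target is None:
            continue  # no equivalent in the partner schema, so it is not exported
        values = value if isinstance(value, list) else [value]
        if field == "subjects":
            # Append a broader term for each in-house term that has one.
            broader = [BROADER_TERMS[v] for v in values if v in BROADER_TERMS]
            values = values + [b for b in broader if b not in values]
        for v in values:
            ET.SubElement(root, target).text = str(v)
    return root


# Hypothetical sample record, for illustration only.
sample = {
    "file_title": "Example file title",
    "date_range": "1914-1918",
    "subjects": ["Emigration assistance"],
    "finding_aid_url": "http://example.org/finding-aid",
    "internal_note": "not for export",
}

xml_text = ET.tostring(to_partner_record(sample), encoding="unicode")
print(xml_text)
```

Keeping the mapping and the broader-term table as plain data, separate from the export function, means each partner can get its own table without changing the code.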
15. Technical Issues
• Display
– How will the records look? Image-based projects will display a thumbnail, but document-based projects may not accommodate a logo or icon at file level
– Even after your data has been published, there may be follow-up questions and issues
16. Technical Issues
• Usage
– Will the portal/partner be able to provide statistics on use of your material? If so, how frequent will the reporting be?
17. Staff Time/Resources
• Research to identify portals/projects, determine their suitability, and establish initial contact
• Image-sharing projects require individual selection of items
• Descriptions/captions need to be rewritten or expanded to reflect project context/audience
• As noted, descriptive metadata (subject terms) may need to be added/revised
• Submission format: project-supplied spreadsheets are time-consuming to complete
18. Legal Matters
• Data-sharing agreements require review/approval by legal staff
– Special collaborations may require drafting an individual agreement
• Proposed modifications to standard agreements are generally accepted without difficult negotiations, but response time may be slow
– Some projects require a formal application to participate; review and approval are performed only when the partner’s panel meets
• Copyright concerns
– Where and how will credits/acknowledgments appear?
– Will we lose control of our assets? How much should we share?
– Photographs: we have so far limited sharing to public-domain items
19. Project Management
• Response time at each step can be slow
• Complexity: some projects involve many data providers; some projects are developing new technical tools
• Some projects are better than others about issuing general updates to all participants
20. Findings
• Except where providing general descriptions only (e.g., EHRI), data-sharing projects will take longer than expected
• Preparing your output takes more than just “pushing a button”
• Although it is too soon to have solid evidence that traffic is coming to us from these sites, we believe that there is value to participating in data-sharing projects