The Standards Mosaic Opening the Way to New Technologies
Cleveland & Western Reserve Digital Text Collection Project - Suzhen Chen & Richard Wisneski
1. 2010 CALA MW Annual Conference
Cleveland, Ohio and Western Reserve
Digital Text Collection Project
Suzhen Chen
Richard Wisneski
Kelvin Smith Library, Case Western Reserve University
May 22, 2010
2. Institution
Case Western Reserve University (CWRU)
Founded in 1967 (federation of Case Institute of
Technology founded in 1881 and Western Reserve
University founded in 1826)
A private research university in northeast Ohio
~10,000 students
Kelvin Smith Library
Main library of CWRU
~ 1.7 million volumes
~ 60 library staff
3. Project (What)
Cleveland, Ohio and Western Reserve Digital Text
Collection project
A collection of digital resources on the history of
Cleveland, Ohio and the Western Reserve, dating from
the early 19th century to the early 20th century
The collection covers various topics, including
women of Cleveland, religion, housing, etc.
About 100 text files added to the collection so far; more
to be added, including some manuscripts
4. Project (Why)
Online representation of the collection
Provide resources for historians, scholars and
others who are interested in Cleveland, OH and
Western Reserve
Support learning and teaching
Promote scholarly communication
Long term preservation of regional history
5. Metadata standard
Metadata Standard (center of diagram), surrounded by:
Intended users
Project needs
Types of materials
Subjects, Genre
Preservation needs
…
6. TEI: Text Encoding Initiative
A consortium that develops the standard
for representing texts in digital form
Maintains encoding guidelines for texts
Often applied in the humanities, social
sciences and linguistics
7. Example:
Projects from other institutions:
Shakespeare Quartos Archive
Newton Manuscript Project (University of Sussex)
Early American Digital Archive (University of Maryland)
8. Example of TEI Header
<titleStmt>
<editionStmt>
<publicationStmt>
…
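To make the header elements above concrete, here is a minimal Python sketch that assembles them with the standard library. It assumes TEI P5 element names, where these statements sit inside `<fileDesc>` alongside `<sourceDesc>`. The function name `build_tei_header` and all field values are hypothetical placeholders, not the project's actual records, and the TEI namespace declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def build_tei_header(title, edition, publisher, source):
    """Return a <teiHeader> element holding a minimal <fileDesc>.

    Hypothetical helper for illustration; element names follow TEI P5.
    """
    header = ET.Element("teiHeader")
    file_desc = ET.SubElement(header, "fileDesc")

    # Title of the electronic text
    title_stmt = ET.SubElement(file_desc, "titleStmt")
    ET.SubElement(title_stmt, "title").text = title

    # Edition of the electronic text
    edition_stmt = ET.SubElement(file_desc, "editionStmt")
    ET.SubElement(edition_stmt, "edition").text = edition

    # Who publishes or distributes the electronic text
    publication_stmt = ET.SubElement(file_desc, "publicationStmt")
    ET.SubElement(publication_stmt, "publisher").text = publisher

    # Where the electronic text came from (e.g. the scanned print source)
    source_desc = ET.SubElement(file_desc, "sourceDesc")
    ET.SubElement(source_desc, "p").text = source
    return header


# Placeholder values only
header = build_tei_header(
    title="Placeholder title",
    edition="Placeholder edition",
    publisher="Placeholder publisher",
    source="Placeholder source note",
)
print(ET.tostring(header, encoding="unicode"))
```

A production header would normally carry more detail (authorship, dates, availability), but the nesting stays the same.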
10. TEI Metadata Standard
Mark up specific genres such as prose, verse,
drama
Mark content structure such as paragraphs,
divisions
Mark up features of a text such as quotations,
footnotes, etc.
Mark up texts for literary and linguistic analysis
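The kinds of markup listed above can be illustrated with a short Python sketch. It assumes common TEI element names: `<div>`, `<head>` and `<p>` for structure, `<q>` and `<note>` for quotations and footnotes, and `<lg>` and `<l>` for verse. The helper names and all sample text are hypothetical, and the TEI namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET


def encode_division(heading, paragraphs, div_type="chapter"):
    """Wrap a heading and plain paragraphs in a TEI-style <div>."""
    div = ET.Element("div", {"type": div_type})
    ET.SubElement(div, "head").text = heading
    for para in paragraphs:
        ET.SubElement(div, "p").text = para
    return div


def encode_verse(lines):
    """Wrap verse lines in a line group (<lg>) of <l> elements."""
    lg = ET.Element("lg")
    for line in lines:
        ET.SubElement(lg, "l").text = line
    return lg


body = ET.Element("body")
div = encode_division("Placeholder heading", ["First placeholder paragraph."])
body.append(div)

# A paragraph with mixed content: running text, a quotation, and a footnote
p = ET.SubElement(div, "p")
p.text = "The speaker said, "
q = ET.SubElement(p, "q")
q.text = "a placeholder quotation"
q.tail = "."
note = ET.SubElement(p, "note")
note.text = "A placeholder footnote."

# A short verse passage inside the same division
div.append(encode_verse(["Placeholder line one", "Placeholder line two"]))

print(ET.tostring(body, encoding="unicode"))
```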
13. TEI Metadata Standard
Provides various manifestations of a text or audio
Independent of applications
TEI is extensible
Accommodates encoding methods for data processing
and analysis needs
Enables better description, organization and classification
of information
15. Cleveland, Ohio and Western Reserve
Digital Text Collection project
Establish project policies and procedures
Finalize the workflows
Trainings are provided
Text files scanned
Run through optical character recognition
software (ABBYY FineReader)
16. Procedures
Spelling check for the texts
Create TEI headers
Bibliographic description,
revisions, source of text
Encode the text
Quality control
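Part of the quality control step could be automated. The sketch below is one possible check, not the project's documented procedure: it confirms that a file is well-formed XML and that a few expected header and body elements are present. The function name `check_tei`, the list of required paths, and the sample documents are assumptions for illustration; a real TEI P5 file would also declare the TEI namespace, which is omitted here.

```python
import xml.etree.ElementTree as ET

# Elements this hypothetical check expects, relative to the root <TEI>
REQUIRED_PATHS = [
    "teiHeader/fileDesc/titleStmt/title",
    "teiHeader/fileDesc/publicationStmt",
    "teiHeader/fileDesc/sourceDesc",
    "text/body",
]


def check_tei(xml_string):
    """Return a list of problems found; an empty list means the check passed."""
    try:
        root = ET.fromstring(xml_string)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]
    return [f"missing {path}" for path in REQUIRED_PATHS if root.find(path) is None]


# Placeholder documents: one complete, one lacking <sourceDesc>
good = (
    "<TEI><teiHeader><fileDesc>"
    "<titleStmt><title>Placeholder</title></titleStmt>"
    "<publicationStmt><p>Placeholder</p></publicationStmt>"
    "<sourceDesc><p>Placeholder</p></sourceDesc>"
    "</fileDesc></teiHeader>"
    "<text><body><p>Placeholder</p></body></text></TEI>"
)
missing_source = good.replace("<sourceDesc><p>Placeholder</p></sourceDesc>", "")

print(check_tei(good))
print(check_tei(missing_source))
```

A check like this only catches structural omissions; spelling and encoding accuracy would still need human review.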
17. Implementation
Workshops
In-house documentation of best practices
Standards
Online resources
Examples of completed work
Assistance from supervisor
Learn from each other
19. Cleveland, Ohio and Western Reserve
Digital Text Collection project
Support future metadata conversion and exchange
Facilitate metadata harvesting and federated search
Facilitate metadata sharing and cross-collection
searching
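As a rough illustration of metadata conversion, the Python sketch below pulls a few fields out of a TEI header and maps them to flat, Dublin Core style field names. The slides do not name a target schema, so the choice of Dublin Core, the mapping table `TEI_TO_DC`, the function name, and the sample header are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from Dublin Core style fields to TEI header paths
TEI_TO_DC = {
    "title": "fileDesc/titleStmt/title",
    "creator": "fileDesc/titleStmt/author",
    "publisher": "fileDesc/publicationStmt/publisher",
    "date": "fileDesc/publicationStmt/date",
}


def header_to_dublin_core(header):
    """Return a dict of the mapped fields that are present and non-empty."""
    record = {}
    for dc_field, path in TEI_TO_DC.items():
        text = header.findtext(path)
        if text and text.strip():
            record[dc_field] = text.strip()
    return record


# Placeholder header; no <date> is given, so that field is skipped
sample_header = ET.fromstring(
    "<teiHeader><fileDesc>"
    "<titleStmt><title>Placeholder title</title>"
    "<author>Placeholder author</author></titleStmt>"
    "<publicationStmt><publisher>Placeholder publisher</publisher></publicationStmt>"
    "</fileDesc></teiHeader>"
)
record = header_to_dublin_core(sample_header)
print(record)
```

Keeping the mapping in one table makes it easy to adjust if a different exchange schema is chosen later.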
21. Resources
WWP Guide to Scholarly Text Encoding:
http://www.wwp.brown.edu/encoding/guide/index.html
Teach Yourself TEI:
http://www.tei-c.org/Support/Learn/tutorials.xml
A Gentle Introduction to XML:
http://www.tei-c.org/release/doc/tei-p4-doc/html/SG.html
A Companion to Digital Literary Studies:
http://www.digitalhumanities.org/companion/DLS/
22. References
TEI: Text Encoding Initiative, “TEI: Text Encoding Initiative,” 2010,
http://www.tei-c.org/index.xml
International Federation of Library Associations and Institutions. Cataloging Section, “Functional Requirements for Bibliographic Records: Final Report,” 1998,
http://www.ifla.org.proxy2.library.uiuc.edu/VII/s13/frbr/frbr.htm
TEI By Example Project, “TEI By Example Project,” 2010,
http://tbe.kantl.be/TBE/TBE.htm?page=examples
…