The document summarizes a project by JSTOR Labs to create an online tool called Understanding Shakespeare that connects quotes from Shakespeare's plays found in scholarly articles on JSTOR to the passages in the Folger Digital Texts. It details how the tool was developed through a partnership between JSTOR and the Folger Shakespeare Library. It also describes the matching algorithm used and provides usage statistics and information on the open API. It concludes by suggesting potential applications of the matching technology to other texts and corpora.
2. JSTOR is a not-for-profit digital library of academic journals, books, and primary sources.
Ithaka S+R is a not-for-profit research and consulting service that helps academic, cultural, and publishing communities thrive in the digital environment.
Portico is a not-for-profit preservation service for digital publications, including electronic journals, books, and historical collections.
ITHAKA is a not-for-profit organization that helps the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways.
3. JSTOR Labs works with partner publishers, libraries and labs to create tools for researchers, teachers and students that are immediately useful – and a little bit magical.
4. PARTNERSHIP WITH FOLGER
• Open, collaborative partnership with Folger Shakespeare Library
• They had:
- Shakespeare Quarterly
- Folger Digital Texts
- Scholars and students
• We had:
- Full run of Shakespeare Quarterly (SQ)
- 2,000 more journals
- A new Labs team
• Flash Build - September 2014:
- From ideation to working site
- One week at the Folger in DC
6. Understanding Shakespeare…
“...is the most exciting project in digital Shakespeare in many years, and takes a major step forward in creating a ‘living variorum’ for Shakespeare studies on the web.”
- Peter Donaldson
Ford International Professor in the Humanities, MIT
7. MATCHMAKER ALGORITHM
1. Identify candidate set of articles from JSTOR
2. Extract quotations
- quotations, not allusions
- text within quotes or block-quotes
3. Run fuzzy text matching of quotations against primary text
4. Calibrate to minimize false positives and negatives
- quotation length
- % confidence
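The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Matchmaker implementation: the slides do not say which fuzzy-matching method was used, so the standard library's difflib.SequenceMatcher stands in for it here. The function names, the regular expression, and the two threshold values (MIN_QUOTE_WORDS, MIN_SIMILARITY) are all hypothetical; only the idea of calibrating on quotation length and confidence comes from the slide.

```python
import re
from difflib import SequenceMatcher

# Hypothetical thresholds: the slide names these two calibration knobs
# (quotation length and % confidence) but does not give their values.
MIN_QUOTE_WORDS = 4
MIN_SIMILARITY = 0.85

# Text between straight or curly double quotation marks (an assumed rule).
QUOTE_RE = re.compile('["\u201c]([^"\u201c\u201d]+)["\u201d]')


def normalize(text):
    """Lowercase, drop punctuation, and split into words."""
    return re.sub(r"[^a-z0-9\s]", "", text.lower()).split()


def extract_quotations(article_text, min_words=MIN_QUOTE_WORDS):
    """Step 2: keep quoted spans long enough to be worth matching."""
    return [q.strip() for q in QUOTE_RE.findall(article_text)
            if len(normalize(q)) >= min_words]


def best_match(quotation, play_lines):
    """Step 3: score the quotation against every line of the primary text."""
    target = " ".join(normalize(quotation))
    best_index, best_score = None, 0.0
    for i, line in enumerate(play_lines):
        candidate = " ".join(normalize(line))
        score = SequenceMatcher(None, target, candidate).ratio()
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


def match_article(article_text, play_lines, min_similarity=MIN_SIMILARITY):
    """Steps 2-4: extract, match, and keep only confident matches."""
    matches = []
    for quotation in extract_quotations(article_text):
        index, score = best_match(quotation, play_lines)
        if index is not None and score >= min_similarity:
            matches.append({
                "match_text": quotation,
                "play_text": play_lines[index],
                "similarity": round(score, 2),
            })
    return matches


play_lines = [
    "To be, or not to be, that is the question:",
    "Whether 'tis nobler in the mind to suffer",
]
article = ('Hamlet asks "To be, or not to be; that is the question" '
           'and the critic calls it "bleak".')
matches = match_article(article, play_lines)
```

In this sketch the one-word quotation "bleak" is dropped by the length filter, which is the kind of false positive the calibration step is meant to suppress. Raising MIN_SIMILARITY trades missed matches for fewer spurious ones.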
9. OPEN & PUBLIC API
labs.jstor.org/developers
Play Data (from Folger Digital Texts)
genre, play, act, scene, line, speaker, speaker_gender, on_stage
Scholarship Data (from articles on JSTOR)
title, authors, journal, pubdate, article_type, keyterms
Quotations!
play_text, match_text, similarity, match_size
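To show how the three groups of fields might be used together, here is a small Python sketch that models one quotation record and tallies quotations by speaker. The field names are taken from the slide, but everything else is an assumption: the types, the 0-to-1 scale for similarity, the QuotationMatch class, the most_quoted_speakers helper, and the sample records are hypothetical and do not reflect the actual API response format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class QuotationMatch:
    # Play data (names from the slide; types are assumed)
    play: str
    act: int
    scene: int
    speaker: str
    # Scholarship data
    title: str
    journal: str
    # Quotation data
    play_text: str
    match_text: str
    similarity: float  # assumed to range from 0 to 1
    match_size: int


def most_quoted_speakers(records, min_similarity=0.9):
    """Count confident matches per speaker, most quoted first."""
    counts = Counter(r.speaker for r in records
                     if r.similarity >= min_similarity)
    return counts.most_common()


# Invented sample records, for illustration only.
records = [
    QuotationMatch("Hamlet", 3, 1, "Hamlet", "Article A", "Journal X",
                   "To be, or not to be", "To be or not to be", 0.95, 6),
    QuotationMatch("Hamlet", 1, 2, "Hamlet", "Article B", "Journal Y",
                   "O, that this too, too sullied flesh",
                   "O that this too too solid flesh", 0.92, 7),
    QuotationMatch("Hamlet", 4, 5, "Ophelia", "Article C", "Journal X",
                   "There's rosemary", "there is rosemary", 0.50, 2),
]
ranking = most_quoted_speakers(records)
```

Because each record carries play, scholarship, and quotation fields side by side, questions such as "which speakers are quoted most often" reduce to a filter and a count.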
10. WHERE DO WE GO FROM HERE?
• Apply Matchmaker to other texts
- Understanding the US Constitution App
- Understanding Dante
- and more!
• Matchmaker API
- Run Matchmaker on any text you can upload or point to
- Incorporate Matchmaker links into other sites
- Apply Matchmaker to other corpora
• What do you suggest?