Humanities as Data: Projects, Visualizations, and Emerging Methods
1. Humanities as Data
Projects, Visualizations, and Emerging Methods
Kurt Fendt
Massachusetts Institute of Technology
fendt@mit.edu @fendt
hyperstudio.mit.edu
@MIThyperstudio
2. Outline
• Digital Humanities
• Short History
• Trends
• New Affordances
• HyperStudio - Digital Humanities at MIT
• Structure, Principles
• Selected Projects
• Berliner sehen
• Annotation Studio & Open Source
• Data Visualization - The Comédie-Française Registers Project
• Parallel Axis Graph
• Combinatorial and Generative Research Visualization Tools
• Network Graphs
• Educating Digital Humanists
• Project-Based Digital Humanities Course at MIT
• Q & A
3. Digital Humanities - A Definition
• The digital humanities, also known as humanities computing, is a field of
research, teaching, and invention concerned with the intersection of
computing and the disciplines of the humanities.
• It is methodological by nature and interdisciplinary in scope.
• It involves investigation, analysis, synthesis and presentation of information in
electronic form.
• It studies how these media affect the disciplines in which they are used, and
what these disciplines have to contribute to our knowledge of computing.
Wikipedia, s.v. “Digital Humanities,” last modified July 31, 2011, http://en.wikipedia.org/wiki/Digital_humanities
4. Digital Humanities - A Brief History
Father Busa
5. Digital Humanities - A Brief History
Vannevar Bush:
“As We May Think” (1945)
6. Digital Humanities - A Brief History
Theodor H. Nelson:
“Literary Machines”
(1965/1981)
8. Digital Humanities - Trends
Big Data: Mapping the
Enlightenment
The Electronic Enlightenment database
contains over 55,000 letters and
documents exchanged between 6,400
correspondents in the Republic of Letters.
How can humanities scholars trained in
close reading of individual documents
make sense of patterns in large sets of
data?
How can historians and other humanities
scholars use visualization tools to examine
large sets of heterogeneous historical data
with multiple dimensions?
http://www.stanford.edu/group/toolingup/rplviz/
9. Digital Humanities - Trends
Data Visualization
http://www.visual-literacy.org/periodic_table/periodic_table.html#
11. Digital Humanities - New Affordances
• Asking and answering new research questions that cannot be reduced to
a single genre, medium, discipline, or institution
• New research methods, representational and interpretive practices,
meaning-making strategies, complexities, and ambiguities
• Fluid communities of practice
• Trans-historical and transmedia approach to knowledge and meaning-
making
• Questions of design at the center (information design, graphics,
typography, formal and rhetorical patterning)
• Project as the core activity
“A project is a kind of scholarship that requires design, management,
negotiation, and collaboration.” Anne Burdick et al., Digital_Humanities (Cambridge, MA: MIT Press, 2012)
15. HyperStudio as part of Comparative Media Studies
• One of nine independent research groups within the Department of
Comparative Media Studies/Writing (CMS/W)
(School of Humanities, Arts, and Social Sciences)
Other CMS research groups include:
Center for Civic Media; Education Arcade; E-Lab; Imagination, Computation, and
Expression Lab; MIT Game Lab; Open
Documentary Lab; Mobile Experience Lab; Trope Tank
• Concept of Applied Humanities (Henry Jenkins)
• MIT Motto: Mens et Manus
• HyperStudio: 9 part-time and full-time staff (Graduate/undergraduate
students, software engineers, outside contractors, administrator)
16. HyperStudio - Principles
• Pedagogical and/or scholarly needs drive development
• Co-design with faculty, students, and other partners
• Agile development with integrated feedback
• Students as novice scholars
• Engage learners in process of discovery, interpretation, and collaboration
• Rethinking of pedagogical concepts and roles
17. Multimedia Text Annotation for Students
“I have never annotated before. But I think I am getting better. I
am actually writing down ideas while reading. By writing them
down, I am actually looking deeper into the text, not like when I
just read the book or something and said, ‘Oh it may mean
this.’ Now it is more like, ‘Oh what does THIS mean?’ Then I
keep asking questions because I am annotating. I am thinking
about the text more.”
Student in a Fall 2012 literature class
21. Pedagogical Approach
• Increase awareness of fluid processes of reading, writing, borrowing, and revision (John Bryant)
• Engage students as “editors” (Wyn Kelley)
• Develop traditional humanistic skills (close reading, textual analysis, persuasive writing, critical thinking)
• Allow students to practice “scholarly primitives”
“I’m using the term ‘primitives’
in a self-consciously analogical
way, to refer to some basic
functions common to scholarly
activity across disciplines, over
time, and independent of
theoretical orientation.”
John Unsworth
Discovering Annotating
Comparing Referring
Sampling Illustrating
25. Annotation Studio Multimedia Text Annotations for Students
Annotation
• Citation
(reference to base text plus metadata)
• Comment
• Tags (folksonomies)
• Links to other sources
• User information (name, group)
• Date/time
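The components listed above can be read as the fields of a single annotation record. The following is a minimal sketch under that reading; the class name, field names, and sample values are assumptions made for illustration and are not Annotation Studio's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Annotation:
    """One annotation, mirroring the components listed on the slide.

    All field names are illustrative assumptions, not a real schema.
    """
    quote: str      # citation: the passage selected in the base text
    start: int      # citation metadata: offset where the passage begins
    end: int        # citation metadata: offset where the passage ends
    comment: str    # the reader's comment
    user: str       # user information: name
    group: str      # user information: class or group
    tags: list[str] = field(default_factory=list)    # free-form tags (folksonomy)
    links: list[str] = field(default_factory=list)   # links to other sources
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def filter_by_tag(annotations: list[Annotation], tag: str) -> list[Annotation]:
    """Return the annotations that carry the given tag (case-insensitive)."""
    wanted = tag.lower()
    return [a for a in annotations if wanted in (t.lower() for t in a.tags)]


# Made-up sample annotations, for illustration only.
notes = [
    Annotation("Call me Ishmael.", 0, 16, "Why this name?",
               "student_a", "lit-101", tags=["Narrator"]),
    Annotation("Some years ago", 17, 31, "Vague time frame.",
               "student_b", "lit-101", tags=["time"]),
]
print([a.user for a in filter_by_tag(notes, "narrator")])  # ['student_a']
```

Because tags are free-form, the filter compares them case-insensitively; a stricter controlled vocabulary would be a different design choice.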
26. Annotation Across Classroom Practices
• Close Reading
• Generating Material: 800 comments on one text
• Developing an Argument
• Revision
• Research and Presentation
• Making Connections Across Texts
• Peer Review and Social Reading
• Reflecting on Processes of Reading, Writing, and
Sharing Work
27. Future Directions
• Incorporating student-generated texts for annotation
• Developing citation tools
• Filtering annotations by student, subject, date, etc.
• Exporting annotations into visual drafting space
• Supporting creative writing and translation courses
• Side-by-side display of texts/media documents
• Annotation across multiple documents
• Annotation of multimedia sources (image, video, audio)
• Customizable visual display of annotations
• Curated repository of media and text documents
• Export and archiving of annotations (Open Annotation Standard)
• Connection to other tools via open API
• Version for mobile devices
30. Open Source: New Opportunities
• Annotation Studio based on the Annotator by the Open Knowledge
Foundation
• Annotation Studio code open source as well (GPL 2)
• Rich community of developers
• Other groups can fork code, contribute, build upon
• Use of open APIs (application programming interfaces) allows new
forms of collaboration, e.g. visualization tools, filter mechanisms
• Annotation Studio can be freely installed or run as a service
• Basis for other projects, e.g. Lacuna Stories at Stanford U., Hofstra
University, and New York University
• Used by almost 150 institutions of higher education in the fall
• Open source is required by the National Endowment for the
Humanities (US federal funding agency)
32. The Comédie-Française Registers
Project
The Comédie-Française Registers Project
(CFRP) is a web resource for scholars of 17th
& 18th c. French theater to support an
exploratory research process.
Three Components
•Archive
•Search tool (faceted browser)
•Interactive data visualization tools
* The original version of the CFRP slides was created by Jason
Lipshin
34. The CFRP Online Archive
Significance for domain experts (French
theater historians):
• Digitized access to rare materials
• Cultural significance of the time period
(i.e. the French Revolution)
• Granular search through extensive
archives.
35. Data Visualization as Methodology
For digital humanists: data viz as part of an
exploratory research process.
• “Machine Reading” (Ramsay) – Macro-level
analysis and the affordances of computation
enabling new research questions.
• “Toggling” (Schnapp et al.) – Merging
quantitative and qualitative analysis of historical
data.
• Combinatorial Research – Dynamically
combining parameters as generative analysis.
36. Archive and Document Dimensions
The Comédie-Française Archive: 1680 – 1793
• Daily records of repertory and box office receipts
• Information on actors’ roles, payment, and playwrights
• Daily Expenses
Register Elements:
• Play title
• Author
• Actors
• Year
• Number of tickets sold
• Ticket price
• Location of seats in theater
• Premiere, first run, or revival
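One way to picture the register elements above is as a typed record per play and performance day. This is only a sketch: the class names, field names, and all sample values (prices, ticket counts, actor names) are hypothetical and do not come from the CFRP database.

```python
from dataclasses import dataclass
from enum import Enum


class RunType(Enum):
    PREMIERE = "premiere"
    FIRST_RUN = "first run"
    REVIVAL = "revival"


@dataclass(frozen=True)
class TicketSale:
    section: str        # location of seats, e.g. "parterre" or "loges"
    price: float        # ticket price for that section (unit left abstract)
    tickets_sold: int   # number of tickets sold in that section


@dataclass(frozen=True)
class RegisterEntry:
    title: str
    author: str
    actors: tuple[str, ...]
    year: int
    run_type: RunType
    sales: tuple[TicketSale, ...]

    def receipts(self) -> float:
        """Box office receipts: price times tickets, summed over sections."""
        return sum(s.price * s.tickets_sold for s in self.sales)


# Made-up values for illustration only.
entry = RegisterEntry(
    title="Mahomet",
    author="Voltaire",
    actors=("Actor A", "Actor B"),
    year=1781,
    run_type=RunType.REVIVAL,
    sales=(TicketSale("parterre", 1.0, 300), TicketSale("loges", 4.0, 50)),
)
print(entry.receipts())  # 500.0
```

Keeping sales per seating section, rather than a single total, preserves the link between seat location and ticket price that the registers record.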
37. Archive and Document Dimensions
The Comédie-Française Archive: 1680 – 1793
• 113 Seasons
• Approximately 320 records per season
• 2 plays per day
• 4 genre categories
Data challenges and difficulties:
• Troupe occupies 4 different theaters
• Each theater has between 5 and 7 sections
• These sections translate to between 13 and 21
ticket price categories
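The figures above give a rough sense of the archive's scale. The short calculation below multiplies them out; it assumes one register record per performance day and two plays on every recorded day, so the results are order-of-magnitude estimates, not counts from the database.

```python
# Figures taken from the slide; the products are rough estimates only.
SEASONS = 113
RECORDS_PER_SEASON = 320   # "approximately 320 records per season"
PLAYS_PER_DAY = 2

# Assumption: one register record per performance day.
daily_records = SEASONS * RECORDS_PER_SEASON
# Assumption: every recorded day pairs two plays.
play_performances = daily_records * PLAYS_PER_DAY

print(f"about {daily_records:,} daily records")          # about 36,160 daily records
print(f"about {play_performances:,} play performances")  # about 72,320 play performances
```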
38. Visualization: Case Studies
Parallel Axis Graph
• Dynamic relations between all categories recorded in the registers
Theater Mapping
• Diagram of theater layout acts as navigation to the database.
Line Graph (Voltaire’s Mahomet)
• Tracing the history of one play throughout its performance and reperformance.
Network Graphs
• Repertoire decisions, popularity of plays
39. Visualization: Case Study 1
Parallel Axis Graph
• Dynamic relations between all categories recorded in the registers and external events
40. Visualization: Case Study 2
Theater Mapping
• Diagram of physical space of theater acts as navigation to the database.
42. Visualization: Case Study 3
Mahomet ou le Fanatisme: Voltaire
•Tracing the history of one play throughout its performance
•72 instances of Mahomet within the 13-year period between 1780 and 1793.
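Tracing one play over time amounts to counting its performances per year and plotting the counts. A minimal sketch of that counting step follows; the function name and the sample years are made up for illustration and are not the actual Mahomet data.

```python
from collections import Counter


def performances_per_year(years, first, last):
    """Count performances per year, keeping years with zero performances
    so that a line graph has an unbroken time axis."""
    counts = Counter(years)
    return [(year, counts[year]) for year in range(first, last + 1)]


# Made-up performance years for a single play (not the real register data).
sample_years = [1780, 1780, 1781, 1781, 1781, 1783]
series = performances_per_year(sample_years, 1780, 1783)
print(series)  # [(1780, 2), (1781, 3), (1782, 0), (1783, 1)]
```

Including the empty years matters for interpretation: a gap in the repertoire is itself a finding, and dropping it would visually compress the timeline.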
43. Future Directions
The Iterative Design Process
•New visualizations based on research questions from domain experts.
•Collaborations with new scholars and institutions: Comédie-Italienne,
Broadway, Opéra de Paris.
•Generalizability of tools to other kinds of data.
•New browser tool for dynamic visualization creation (Chris Dessonville)
•Network Graphs to explore repertoire decisions
44. New Browser Tool
Combinatorial and Generative Research
•The user can select which parameters to compare and the system will automatically
generate a list of potential visualizations.
•The visualization will load in the same facet without the need for refreshing.
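The combinatorial idea described above (select parameters, receive a list of candidate visualizations) can be sketched as a lookup over parameter pairs. The parameter catalogue, the data kinds, and the chart rules below are all hypothetical; they illustrate the principle and are not the CFRP tool's actual logic.

```python
from itertools import combinations

# Hypothetical parameter catalogue: register field -> kind of data.
PARAMETERS = {
    "year": "temporal",
    "author": "categorical",
    "genre": "categorical",
    "tickets_sold": "numeric",
    "ticket_price": "numeric",
}

# Hypothetical rules: which chart types suit a pair of data kinds.
CHART_RULES = {
    frozenset(["temporal", "numeric"]): ["line graph"],
    frozenset(["categorical", "numeric"]): ["bar chart"],
    frozenset(["numeric"]): ["scatter plot"],
    frozenset(["categorical"]): ["network graph"],
    frozenset(["temporal", "categorical"]): ["timeline"],
}


def suggest_visualizations(selected):
    """List (parameter, parameter, chart type) for every pair of selections."""
    suggestions = []
    for a, b in combinations(selected, 2):
        kinds = frozenset([PARAMETERS[a], PARAMETERS[b]])
        for chart in CHART_RULES.get(kinds, []):
            suggestions.append((a, b, chart))
    return suggestions


for suggestion in suggest_visualizations(["year", "author", "tickets_sold"]):
    print(suggestion)
# ('year', 'author', 'timeline')
# ('year', 'tickets_sold', 'line graph')
# ('author', 'tickets_sold', 'bar chart')
```

Using a frozenset of data kinds as the rule key makes the lookup independent of the order in which the researcher picks the parameters.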
48. Project-Based Digital Humanities
Course
• Thirteen 4-hour course units, 17 Graduate and Undergraduate Students
from Computer Science, Art History, Architecture, CMS, Mechanical
Engineering (MIT, Harvard University, Wellesley College, Mass. College
of Art)
• Each unit included:
• discussion of readings and introduction to new topics/guest speakers
• small data/tool experiments
• discussion and work on larger group projects
• Four larger group projects primarily with outside partners:
Institute of Contemporary Art, Isabella Stewart Gardner Museum, MIT
Museum, Comédie-Française Registers Project
2004 ALLC/ACH Conference in Gothenburg, Sweden – Conference renamed to Digital Humanities Conference
Individual scholar, used computers for the first time, concordance, literature as data
Other beginnings that are equally important, if not more important, to what we now describe as digital humanities: Dean of Engineering at MIT, used the mind as a model to rethink the way we process information, namely by association, linking, juxtaposition; Google Glass
coined the term Hypertext, literature as the model (Talmud), palimpsest
now big tent, recent publications
One out of many examples, often connected to visualization; see also Lev Manovich’s work on Cultural Analytics, taking large datasets such as manga comic book covers and visualizing them.
Meta slide, visualization of the visualization methods, look at the URL: Visual Literacy
Annotation, DH 2012 in Hamburg
Focus on scholarship and research. So how does this all play out? The effect on learning and the training of novice scholars has not been the focus (more on this a bit later when I talk about the DH course)
That's where HyperStudio comes in
Talk about two projects (out of about 35 that we have done so far)
So if I were to describe the project in elevator pitch form, I would say that CFRP is a web resource for researchers of 18th-century French theater positioned at the intersection of quantitative and qualitative approaches. This resource is really a constellation of three tightly integrated components:
It consists of a very large online archive of hundreds of years of documents related to France’s national theater troupe, the Comédie-Française.
A faceted browser for granular search of individual documents within that archive.
And perhaps most excitingly, a series of interactive visualization tools, which we are currently prototyping and which I’ll discuss in more detail shortly.
But before I get into the nitty-gritty and specific details about what the project entails
Socioeconomic makeup of the audience. Which plays were popular with certain classes?
- Tickets in the loges were much more expensive than tickets in the parterre.
Faceted browser and search are tightly integrated with the visualization framework. Can see individual registers as well; again, this notion of toggling from the macro to the micro point of view.
So, for instance, if you were to pick to visualize Voltaire’s YEAR play Mahomet, as it was played from . You can really start to get at this macro-scale perspective.
Link: http://vimeo.com/53298366
Peak in popularity in 1781-1782 season. What does this peak mean? How does it relate to the themes portrayed in the play – do they resonate with the audience in a charged, new political-economic climate?
- Days of the week: how performances at the Comédie-Française interacted with other entertainment forms throughout the week. For example, opera was on Tuesdays. How did that affect attendance at the Comédie-Française?
- Whereas many of our previous visualizations were one-off experiments, tailored to answer specific research questions, this is a tool which allows for a greater degree of free play. The researcher can combine whatever parameters he or she chooses from those available on the registers, and the system will dynamically generate a set of potential visualizations (in the same browser window, without having to refresh). This tool comes closest to facilitating what we mean by a truly exploratory research process.
Using a subset of the data from the CFRP Registers (1769 - 1793), this prototype visualization investigates the relationship between authors and plays which are staged on the same day. Each play or author is represented by a node; the relationship of sharing the same day of performance is an undirected link. Together, this set of network graphs represents approximately 700 unique play titles and 233 authors, covering the 15,265 performances put on by the Comédie-Française over the course of 14 years.
The interactive network graphs have features that guide the user from overview to detailed close readings. The user is encouraged to zoom, search, filter, and click through to more information about specific records. These interface features, along with the spatial representation of records, work in conjunction to present a series of visualizations guided by 7 research questions:
1. Is there a pattern to plays being staged on the same day?
2. Are there groups of plays that are more often staged with each other?
3. Are the plays performed most often also the most profitable?
4. Is there a pattern to authors whose plays are staged on the same day?
5. Are there groups of authors that are more often staged with each other?
6. Are the most popular authors also the most profitable?
7. How are the plays of central authors such as Molière staged in relation to other authors?
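The graph construction described in these notes (each play a node, a shared performance day an undirected link) can be sketched with the standard library alone. The dates and pairings below are invented sample data, and the function names are assumptions; this is not the prototype's code.

```python
from collections import Counter
from itertools import combinations

# Invented sample: performance date -> plays staged on that day.
performances = {
    "1781-01-10": ["Mahomet", "Le Médecin malgré lui"],
    "1781-01-11": ["Tartuffe", "Le Médecin malgré lui"],
    "1781-01-12": ["Mahomet", "Le Médecin malgré lui"],
}


def co_performance_edges(by_day):
    """Count how often each pair of plays shares a performance day.

    Pairs are stored in sorted order, so (a, b) and (b, a) are the same
    undirected link; the count serves as the link weight.
    """
    edges = Counter()
    for plays in by_day.values():
        for a, b in combinations(sorted(set(plays)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the link weights attached to each play (node)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


edges = co_performance_edges(performances)
degree = weighted_degree(edges)
print(edges[("Le Médecin malgré lui", "Mahomet")])  # 2
print(degree.most_common(1))  # [('Le Médecin malgré lui', 3)]
```

The same function would produce the author graph if each day's list held author names instead of play titles. The weighted degree gives a first, rough answer to questions about which plays are most often paired with others.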