This document introduces a new tool for interactively analyzing literature that constructs temporally ordered networks of character occurrences. It uses natural language processing to automatically detect characters and relationships in texts. Researchers used the tool to produce annotated graphs of novels that identified important narrative episodes and sequences. Their annotations provided insights into how digital tools can cultivate new collaborative interpretive practices for literary analysis.
Analysing literature through the lens of information theory and network science
1. Twenty Thousand Leagues Above the Book:
An Interactive Visual Analytics Approach to Literature
[Figure 2 diagram. Source texts from Project Gutenberg are split into k slices of n words (e.g. n=1,000). Each slice is matched against a lexicon of characters (which can also be a list of other entities such as places or phrases) or processed with automatic/semi-automatic methods (entity detection, topic detection, sentiment analysis, part-of-speech tagging). The matched information per slice (e.g. Abel Magwitch; Joe Gargery, Mrs Joe Gargery, Pip) forms a directed network, from which the tool derives dynamic network visualisations, statistics and other measures. Stages: configure, analyze.]
In recent years, data-driven analysis has emerged as a growing methodology within literary
studies. These distant reading practices harness available technology to open new avenues for how
we understand literary texts. Whereas traditional literary scholarship is generally grounded in the
interpretation of the specific language of a text or body of texts, macroanalytic approaches present
new ways of seeing texts, both individually and in the aggregate. Here we introduce a
novel tool for collaborative interaction with literature that
transcends the boundaries of traditional close and data-driven
distant reading. Our approach constructs temporally ordered networks of information
occurrence, which we configured to use characters as the unit of information. The result is a
unique view of the narrative structures within novels that opens a variety of possibilities for visual
as well as information-theoretic analysis.
Figure 2: Overview of the data processing pipeline of our tool prototype. The user can choose between a supervised approach (which requires a lexicon of the information to match)
and an unsupervised approach (using state-of-the-art natural language processing).
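The supervised path of this pipeline can be illustrated with a minimal Python sketch. It is not the authors' implementation (that lives in the source code repository listed on this poster). The function names `slice_words`, `match_lexicon` and `build_edges` are hypothetical, and the linking rule (connect each slice to the next slice in which the same character occurs) is an assumption about how a temporally ordered directed network could be derived from the matches.

```python
import re

def slice_words(text, n=1000):
    """Split a text into consecutive slices of n words (the last may be shorter)."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]

def match_lexicon(slices, lexicon):
    """Return, for each slice, the set of lexicon entries that occur in it."""
    # Longest names first, so "Mrs Joe Gargery" is not also counted as "Joe Gargery".
    names = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(name) for name in names) + r")\b")
    return [set(pattern.findall(s)) for s in slices]

def build_edges(matches):
    """Assumed linking rule: connect each slice to the next slice
    in which the same character occurs. Slices are numbered from 1."""
    last_seen = {}   # character -> number of the slice where it last occurred
    edges = []       # (source slice, target slice, character)
    for idx, found in enumerate(matches, start=1):
        for name in sorted(found):
            if name in last_seen:
                edges.append((last_seen[name], idx, name))
            last_seen[name] = idx
    return edges

# Toy input with a tiny slice size; the poster's example uses n=1,000.
lexicon = ["Abel Magwitch", "Joe Gargery", "Mrs Joe Gargery", "Pip"]
text = ("Pip saw Abel Magwitch today then Mrs Joe Gargery scolded "
        "Pip while Joe Gargery smiled")
slices = slice_words(text, n=5)
matches = match_lexicon(slices, lexicon)
edges = build_edges(matches)
print(edges)
```

Because the edges always point from an earlier slice to a later one, the resulting network preserves the order of the narrative. A real lexicon would also need aliases and spelling variants for each character, which this sketch does not handle.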
Markus Luczak-Roesch (@mluczak), Adam Grener, Emma Fenton
As part of our initial user studies, humanities scholars
collaboratively produced a set of annotated
graphs based on the visualisations that were provided to them
(see Figures 3 and 4 for two examples). These annotated graphs
show in great detail what kind of insights humanities scholars are
looking for. Annotating groups of nodes to mark up larger episodes
or sequences in the novel was a common task they performed. The
interactions between the derived networks and the original texts
flowed both ways: the scholars began with moments of known
narrative significance and moved to the network to understand that
significance, but they also used elements of the network that
appeared significant to identify starting points for textual analysis.
These observations on self-designed annotations are evidence of
an emergent methodological practice, as the
humanities scholars were given a novel tool and then developed
their own analytical process around it. Our user studies also
demonstrate the potential of our tool to cultivate
collaborative interpretive practices for readers.
Project page: https://vuw-sim-stia.github.io/computational-literary-science/
Source code repository: https://github.com/vuw-sim-stia/lit-cascades
Figure 3: Manual annotation of the Bleak
House network that identifies moments of
convergence for important characters.
Figure 4: Manual annotations of the David
Copperfield network that identify moments of
narrative significance.
Figure 1: Deployment of our
prototype on a 49'' multi-touch
screen. Individuals and groups can
convene in front of this setup and use
the tool while working with one of
the analysed novels.
DOI of the paper describing this work:
https://doi.org/10.1145/3148330.3154507
Demo: https://goo.gl/yFoZ6U