This presentation consists of three sections: (1) a brief description of the work of Franco Moretti and Matthew Jockers to exemplify how past literature can be seen anew using text analysis tools, (2) a brief description of Voyant and TMT in the context of text analysis generally, (3) a science fiction extrapolation describing text creation tools as the obverse of text analysis tools. One might imagine a new future literature in which text artists build texts by combining words at a distance. One might imagine an obverse to Voyant: a “text creation system” which allows users to add numbers of words to a text and to define the proximity of those words to each other using network visualizations and other tools. The construction of such texts could be imagined to be every bit as complex as writing a novel, as artists might manipulate webs of words to differentiate and potentiate shades of meaning.
The New Past, and a Speculative Future, of Literature: A Brief Discussion of Two Text Analysis Tools
1. The New Past, and a Speculative Future, of Literature:
A Brief Discussion of Two Text Analysis Tools
Nat Gustafson-Sundell
Minnesota State University, Mankato
OpenResearch.weebly.com
2. Franco Moretti
“Writing about comparative social history, Marc Bloch once coined a lovely ‘slogan,’ as he himself
called it: ‘years of analysis for a day of synthesis’; and if you read Braudel or Wallerstein you
immediately see what Bloch had in mind. The text which is strictly Wallerstein’s, his ‘day of
synthesis’, occupies one-third of a page … the rest are quotations … Years of analysis; other
people’s analysis, which Wallerstein’s page synthesizes into a system.
Note, if we take this model seriously, the study of world literature will somehow have to
reproduce this ‘page’ – which is to say: this relationship between analysis and synthesis – for the
literary field. But in that case, literary history will quickly become very different from what it is
now: it will become ‘second hand’: a patchwork of other people’s research, without a single
direct textual reading. Still ambitious, and actually even more so than before (world literature!);
but the ambition is now directly proportional to the distance from the text: the more ambitious
the project, the greater must the distance be.” (Moretti 47-8, 2000)
“Distant reading: where distance … is a condition of knowledge: it
allows you to focus on units that are much smaller or much larger
than the text: devices, themes, tropes – or genres and systems.
And if, between the very small and the very large, the text itself
disappears, well, it is one of those cases when one can justifiably
say, Less is more. If we want to understand the system in its
entirety, we must accept losing something…” (Moretti 48-9, 2000)
3. Matthew Jockers
“The literary scholar of the twenty-first century can no longer be content with
anecdotal evidence, with random ‘things’ generated from a few, even
‘representative’ texts. We must strive to understand these things in the
context of everything else, including a mass of possibly ‘uninteresting’ texts.”
(Jockers 8)
“At the macro scale, we see evidence of time and gender influences on theme and
style. By superimposing these two network snapshots in our minds, we can begin
to imagine a larger context in which to read and study nineteenth-century
literature. What is clear is that the books we have traditionally studied are not
isolated books. The canonical greats are not even outliers: they are books that are
similar to other books…” (Jockers 168)
“It is the exact interplay between the macro and micro scale that promises a new, enhanced, and perhaps even
better understanding of the literary record. The two approaches work in tandem and inform each other.
Human interpretation of the ‘data,’ whether it be mined at the macro or micro level, remains essential … The
most fundamental and important difference in the two approaches is that the macroanalytic approach reveals
details about texts that are for all intents and purposes unavailable to close-readers of the texts.” (Jockers
online)
4. “The value of the computer-mediated exercises is that they enable readers to
readily perceive and appreciate features that are not obvious in a conventional
reading of a printed text.” (Irizarry 155, 1996)
“The computer is, among other things, an instrument uniquely suited to play
activities ...” (Irizarry 156, 1996)
“Assembling and disassembling a text, like playing with blocks
of Lego, may not necessarily contribute immediately to its
understanding, but it is likely to contribute to the aggregate
experience of the text in valuable ways. … I am suggesting
that play is an integral part of a humanist’s interpretive
activities…” (Sinclair 181, 2003)
“Playful experimentation is a pragmatic approach of trying something, seeing if you
obtain interesting results, and if you do, then trying to theorize why those results are
interesting rather than starting from articulated principles.” (Rockwell 214, 2003)
Play
12. For Topic 1: Top 25 Documents in Topic 1
In the arrangement of poems, what is the topic trend? What can we learn about arrangement in this book?
How often is this topic the “dominant” topic? What topics are most common across documents, or most rare?
What topics tend to dominate? What topics tend to be subordinate?
Does this topic relate to certain topics more than others?
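Questions like these can be explored once a topic model has produced a document-topic table. The following is a minimal sketch, not the procedure used by TMT or any other specific tool: it assumes a hypothetical list of rows, one per poem, where each row gives the share of each topic in that poem. The function names and the sample numbers are illustrative only, and topics are counted from 0 in the code.

```python
from collections import Counter

def dominant_topics(doc_topic):
    """Return, for each document, the index of its highest-weighted topic."""
    return [max(range(len(row)), key=row.__getitem__) for row in doc_topic]

def dominance_counts(doc_topic):
    """Count how often each topic is the dominant topic of a document."""
    return Counter(dominant_topics(doc_topic))

def top_documents(doc_topic, topic, n=25):
    """Return the indices of the n documents with the largest share of a topic."""
    ranked = sorted(range(len(doc_topic)),
                    key=lambda d: doc_topic[d][topic],
                    reverse=True)
    return ranked[:n]

# Hypothetical shares for three poems and three topics (each row sums to 1).
sample = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.4, 0.1],
]

print(dominant_topics(sample))
print(dominance_counts(sample))
print(top_documents(sample, topic=0, n=2))
```

Reading the dominant topics in book order gives a rough view of the topic trend across the arrangement of poems; the counts show which topics tend to dominate and which stay subordinate.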
16. Works Cited
Blei, David. "Probabilistic Topic Models." Communications of the ACM 55.4 (2012): 77-84. Web.
Brett, Megan. "Topic Modeling: A Basic Introduction." Journal of Digital Humanities 2.1 (2012): 12-16. Web.
Irizarry, Estelle. "Tampering with the Text to Increase Awareness of Poetry’s Art: Theory and Practice with a
Hispanic Perspective." Literary and Linguistic Computing 11 (1996): 155-162. Print.
Jockers, Matthew Lee. Macroanalysis: Digital Methods and Literary History. University of Illinois Press, 2013.
Print.
Moretti, Franco. Distant Reading. London: Verso, 2013. Print.
Rockwell, Geoffrey. "What is Text Analysis, really?" Literary and Linguistic Computing 18.2 (2003): 209-19.
Web.
Samuels, Lisa, and Jerome J. McGann. "Deformance and Interpretation." New Literary History 30.1 (1999):
25-56. Web.
Sinclair, Stefan. "Computer-Assisted Reading: Reconceiving Text Analysis." Literary and Linguistic Computing
18.2 (2003): 175-84. Web.
Editor's Notes
From a topic modeling (LDA) perspective, a text consists of some number of topics, each of which makes up some percentage of the text. A topic can be thought of as a “bag of words.” We can think of a text as resulting from a number of random drawings from those bags of words, based on the percentage allocation of topics (and the numbers of various words in those bags will depend on the percentage allocation of words within those topics).
“One way to think about how the process of topic modeling works is to imagine working through an article with a set of highlighters. As you read through the article, you use a different color for the key words of themes within the paper as you come across them. When you were done, you could copy out the words as grouped by the color you assigned them. That list of words is a topic, and each color represents a different topic. Note: this description is inspired by the following illustration from David Blei’s article, which is one of the best visual representations of a topic I’ve found.” (Brett 12)
My caveat: the computer does not know the meanings of the words. The algorithm finds topics based on the co-occurrence of the words: “They look like ‘topics’ because terms that frequently occur together tend to be about the same subject” (Blei 9).
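The "random drawings from bags of words" picture can be sketched in a few lines of Python. This is an illustration of the generative story only, under stated assumptions: the topic names, vocabularies, and percentages are invented for the example, and real topic modeling runs in the opposite direction, inferring the bags and the allocations from existing texts.

```python
import random

def generate_document(topic_mix, topics, n_words, seed=0):
    """Draw n_words by first picking a topic according to topic_mix,
    then picking a word from that topic's weighted bag of words."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    topic_names = list(topic_mix)
    topic_weights = [topic_mix[name] for name in topic_names]
    words = []
    for _ in range(n_words):
        topic = rng.choices(topic_names, weights=topic_weights, k=1)[0]
        vocab = list(topics[topic])
        word_weights = [topics[topic][word] for word in vocab]
        words.append(rng.choices(vocab, weights=word_weights, k=1)[0])
    return words

# Hypothetical topics: each bag maps words to their share within the topic.
topics = {
    "sea": {"wave": 0.5, "salt": 0.3, "gull": 0.2},
    "garden": {"rose": 0.6, "soil": 0.4},
}

# Hypothetical allocation: 70 percent "sea", 30 percent "garden".
topic_mix = {"sea": 0.7, "garden": 0.3}

print(generate_document(topic_mix, topics, n_words=12))
```

Nothing in the procedure consults the meanings of the words. The labels "sea" and "garden" are human readings of the bags, which is the point of the caveat above.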