1. Four corners of the big tent:
perspectives on the Digital
Humanities
(for HUCO 617: Big Data: The Web as Evidence)
John Bradley
Department of Digital Humanities
King’s College London
2. Why me?
• Been active in Digital Humanities since the 1970s (!!)
• Been a member of King’s Department of Digital Humanities since 1997
• Have done work in more than one “aspect” of the DH
• John Bradley and Julianne Nyhan, “Getting Computers into Humanists’
Thinking”. In Julianne Nyhan and Andrew Flinn, Computation and the
Humanities: Towards an Oral History of Digital Humanities, Springer
Publishers 2017. ISBN: 978-3-319-20169-6 (Print) 978-3-319-20170-2
(Online) http://link.springer.com/chapter/10.1007/978-3-319-20170-2_14
3. KCL: Department of Digital Humanities
http://www.kcl.ac.uk/artshums/depts/ddh/index.aspx
4. What are the Humanities?
• “Research stemming from a detailed understanding of human
behaviour, economies, cultures and societies can dramatically
redefine the crucial decisions we need to make. These decisions may
involve the future direction of our economy, ways of broadening and
strengthening education provision at all levels, or how we deal with
the effects of climate or constitutional change… The humanities and
social sciences teach us how people have created their world, and
how they in turn are created by it.”
• –The British Academy for Humanities & Social Sciences, “Press Pack”.
• From Alan Liu, 4Humanities: Advocating for the Humanities. Website
at http://4humanities.org/2014/12/what-are-the-humanities/
5. What are the Humanities?
• “The humanities are academic disciplines that study human culture.
The humanities use methods that are primarily critical, or speculative,
and have a significant historical element—as distinguished from the
mainly empirical approaches of the natural sciences. The humanities
include ancient and modern languages, literature, philosophy,
religion, and visual and performing arts such as music and theatre.
Areas that are sometimes regarded as social sciences and sometimes
as humanities include history, archaeology, anthropology, area
studies, communication studies, classical studies, law and linguistics.”
• –Wikipedia, “Humanities,” 2014.
• From http://4humanities.org/2014/12/what-are-the-humanities/
6. What is the Digital Humanities?
• "Along with the digital archives, quantitative analyses, and tool-building
projects that once characterized the field, DH now encompasses a wide
range of methods and practices: visualizations of large image sets, 3D
modeling of historical artifacts, “born digital” dissertations, hashtag
activism and the analysis thereof, alternate reality games, mobile
makerspaces, and more.
In what has been called “big tent” DH, it can at times be difficult to
determine with any specificity what, precisely, digital humanities work
entails."
• Klein and Gold (2016). “Digital Humanities: The Expanded Field”. In Debates
in the Digital Humanities. University of Minnesota Press. Online at
http://dhdebates.gc.cuny.edu/debates/2
7. What is the Digital Humanities?
• "Digital Humanities is not a unified field but an array of convergent
practices that explore a universe in which: a) print is no longer the
exclusive or the normative medium in which knowledge is produced
and/or disseminated; instead, print finds itself absorbed into new,
multimedia configurations; and b) digital tools, techniques, and media
have altered the production and dissemination of knowledge in the
arts, human and social sciences."
• The Digital Humanities Manifesto 2.0
http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf
• A “scholarly” context for the Digital Humanities.
8. Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
9. DH as Traditional Scholarship on Digital
Matters: Digital Cultural Studies
• The Centre for Digital Culture presents
• The People’s Memes: Populist Politics in a Digital Society
February 27th
King's College London
Nash Lecture Theatre–18:30
Tickets and more details here.
• The event will host a number of speakers who have been investigating the
nexus between populism and digital technology. It will discuss the reasons for
the current surge of populist politics, and the way in which it relates to a
number of processes, including the cultural and social shocks produced by
rapid technological innovation, the new mass outreach affordances of social
media, and the way they allow the mediation of mainstream news media to be
sidestepped, and the changes in the class structure and in social experience that
have been facilitated by the diffusion of social media and digital technology
more generally. Come for the memes, stay for the discussion.
• Confirmed speakers include Paolo Gerbaudo (KCL), Alex Williams (City), and
Emmy Ekhlund (KCL).
10. Humanities research as process, and its research output:
Text in, Text Out
Humanities Research
as process
Source
Second’ry
Sources
Source
Source
Primary
Sources
Research Process
Emerging Ideas
“the historical work as what it most manifestly is: a verbal
structure in the form of a narrative prose discourse” Hayden
White (1973), quoted in Jörn Rüsen (1987). “Historical
Narration: Foundation, Types, Reason” in History and Theory Vol
26 No 4
?
Where is
the digital?
Research Output
The People’s Memes
12. Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital
methods applied to
digital things
13. Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital
methods applied to
digital things
New digital methods
applied both to new
digital things, but also
against non-digital
subjects
14. Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital
methods applied to
digital things
New digital methods
applied both to new
digital things, but also
against non-digital
subjects.
-- the issue of digital
models and modelling
17. Digital device as an abstract machine
• Analogue devices:
• Television, radio, camera,
• Each needed to be a separate device
• Each needed separate media technologies for their information
• Digital devices
• If the data becomes digital then the same device can at different times appear
to be any of these devices.
• At the bottom data layer are the bits: ones and zeros.
• A layer of interpretation on top of the bits: It is the representation of the
needs of the different media through data structures and software that makes
this abstract machine able to do many different kinds of things. Software
(apps) provides this.
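The two layers described on this slide can be sketched in a few lines of Python. This is only an illustration: the four byte values are arbitrary and the three "interpretations" are assumptions chosen for the example, not something taken from the slides.

```python
# Bottom layer: the same four bytes (32 bits), with no meaning attached yet.
raw = bytes([72, 105, 33, 10])

# Interpretation 1: text. Each byte is read as an ASCII character code.
as_text = raw.decode("ascii")

# Interpretation 2: one unsigned integer (big-endian byte order).
as_number = int.from_bytes(raw, byteorder="big")

# Interpretation 3: four greyscale pixel intensities (0 = black, 255 = white).
as_pixels = list(raw)

print(repr(as_text))
print(as_number)
print(as_pixels)
```

The bits never change; only the software reading them does, which is what lets one device stand in for several.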
18. Need for “Formal Structure”
• Formal: computers being machines built on formalisms, formal
structures representing materials of interest are necessary if the
machine is to do anything with this material.
"An abstract structure is a formal object that is defined by a set of laws, properties, and
relationships …”
“The formal sciences are built up of symbols and theoretical rules. They can often be applied to
reality and they are often proved to be very useful.”
“In computer science, abstraction is a mechanism and practice to reduce and factor out details
so that one can focus on a few concepts at a time.”
20. Saying more about the image
• There is more that can be represented
about the image:
• It is of something:
• An image by Richard Gaywood (1644-
1668) : Concert of birds; including a
peacock and owl
• It contains images of things
• Subject classification
• Links between subjects and areas
on the image
• To say these things in a way the
computer can use requires
structuring as well.
The digital object as a surrogate for
the thing-in-the-world
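One possible structuring of the statements on this slide, as a hedged Python sketch. The class names, field names and pixel coordinates are all hypothetical; only the creator and title come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A rectangular area of the image, in pixels (values used below are invented)."""
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


@dataclass
class ImageRecord:
    """A digital surrogate: structured statements about one image."""
    creator: str
    title: str
    # Subject classification, each subject linked to the area that shows it.
    subjects: dict[str, Region] = field(default_factory=dict)

    def subjects_at(self, x: int, y: int) -> list[str]:
        """Which classified subjects appear at pixel (x, y)?"""
        return [name for name, region in self.subjects.items()
                if region.contains(x, y)]


record = ImageRecord(
    creator="Richard Gaywood",
    title="Concert of birds; including a peacock and owl",
    subjects={
        "peacock": Region(left=40, top=120, width=200, height=260),  # hypothetical area
        "owl": Region(left=300, top=60, width=90, height=110),       # hypothetical area
    },
)
print(record.subjects_at(100, 200))
```

The record is not the engraving; it is a surrogate that a program can query.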
21. Imaginary Museum
What objects of interest are
evidently represented in a catalogue
of photographs like the “Imaginary
Museum”?
22. Imaginary Museum:
A basic design
Examples of “Access points”:
Find me all pictures, and the museums
holding them, for Ihei Kimura
Find me all pictures held by the
Bibliothèque Nationale in Paris
Find me all pictures taken by
women photographers after the 2nd
World War
Took Holds
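The three access points could be answered from a small relational design. A minimal sketch using Python's standard `sqlite3` module, assuming three tables and reading "Took" and "Holds" as the two relationships in the design. That reading, the table and column names, and every row are assumptions for illustration; "Photographer A" and "Photographer B" are placeholders, not real people.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE photographer (id INTEGER PRIMARY KEY, name TEXT, gender TEXT);
CREATE TABLE museum (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE picture (
    id INTEGER PRIMARY KEY,
    title TEXT,
    year INTEGER,
    photographer_id INTEGER REFERENCES photographer(id),  -- "Took"
    museum_id INTEGER REFERENCES museum(id)               -- "Holds"
);
-- Placeholder rows, invented for this example.
INSERT INTO photographer VALUES (1, 'Photographer A', 'm'), (2, 'Photographer B', 'f');
INSERT INTO museum VALUES (1, 'Bibliothèque Nationale', 'Paris'), (2, 'Museum Y', 'City Z');
INSERT INTO picture VALUES
    (1, 'Picture 1', 1950, 1, 1),
    (2, 'Picture 2', 1960, 2, 1),
    (3, 'Picture 3', 1930, 2, 2);
""")

# Access point 1: pictures and holding museum for one photographer.
by_photographer = conn.execute("""
    SELECT picture.title, museum.name
    FROM picture
    JOIN photographer ON picture.photographer_id = photographer.id
    JOIN museum ON picture.museum_id = museum.id
    WHERE photographer.name = ?
    ORDER BY picture.id
""", ("Photographer A",)).fetchall()

# Access point 2: pictures held by one museum.
held_in_paris = conn.execute("""
    SELECT picture.title
    FROM picture JOIN museum ON picture.museum_id = museum.id
    WHERE museum.name = ? AND museum.city = ?
    ORDER BY picture.id
""", ("Bibliothèque Nationale", "Paris")).fetchall()

# Access point 3: pictures taken by women photographers after 1945.
by_women_postwar = conn.execute("""
    SELECT picture.title
    FROM picture JOIN photographer ON picture.photographer_id = photographer.id
    WHERE photographer.gender = 'f' AND picture.year > 1945
    ORDER BY picture.id
""").fetchall()

print(by_photographer, held_in_paris, by_women_postwar)
```

Each access point becomes a different path through the same two relationships, which is why the design has to be settled before the questions can be asked.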
23. Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools: “Big Data”
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital
methods applied to
digital things
New digital methods
applied both to new
digital things, but also
against non-digital
subjects.
-- the issue of models
and modelling
24. Humanities research with digital humanities techniques
(such as those from Big Data) in the process
Humanities Research
as process
Source
Second’ry
Sources
Source
Source
Primary
Sources
Research Output
Research Process
Emerging Ideas
25. http://www.diggingintodata.org/
"Now going into the third round of the competition, the
Digging into Data Challenge has funded a wide variety
of projects that explore how computationally intensive
research methods can be used to ask new questions
about and gain new insights into our world."
26. Digging into Data: "a new era"
• "the Digging into Data Challenge investigators have demarcated a new
era -- one with the promise of revelatory explorations of our cultural
heritage that will lead us to new insights and knowledge, and to a
more nuanced and expansive understanding of the human condition"
(Williford and Henry 2012, p 1)
27. "New methodological approaches"
• "Research at these scales, speeds, and levels of complexity
encourages new methodological approaches and intellectual
strategies." (Williford and Henry 2012, p. 2)
28. "Big data is an all-
encompassing term
for any collection of
data sets so large and
complex that it
becomes difficult to
process using
traditional data
processing
applications."
"Big Data is a moving target;
what is considered to be
"Big" today will not be so
years ahead. "For some
organizations, facing
hundreds of gigabytes of
data for the first time may
trigger a need to reconsider
data management options.
For others, it may take tens
or hundreds of terabytes
before data size becomes a
significant consideration."
29. Powers of 10. substantial changes in
perspective
• Powers of Ten video (Charles and Ray Eames, IBM,
1977)
• https://www.youtube.com/watch?v=0fKBhvDjuy0
“the effect of adding
another zero”
kilometer
megameter
gigameter
33. "Million Books" Initiative and "Text Mining"
(Unsworth 2008)
• where does the trope of “a million books” come from? It originates, as far as I know, with the Universal Library and
its Million Books Project, which began in 2001. The Universal Library is directed by Raj Reddy, professor and former
Dean of Computer Science at Carnegie Mellon University; the million books project (funded by NSF and others) was
a kind of very large pilot, aimed at digitizing a million books (“less than 1% of all books in all languages ever
published”), beginning with partners in India and later expanding to China and Egypt.
• Google Print (now known as Google Book Search), which had begun in secret in 2002 and was unveiled at the
Frankfurt Book Fair in October 2004, and which had Harvard's library as one of its initial partners. Google Books aims
to scan as many as 30 million books, a number equal to all the titles in WorldCat, and for all we know, they are
already about halfway there.
• “What do you do with a million books?”—a question first asked, I think, by Greg Crane, in D-Lib
Magazine, in March of 2006. (http://people.brandeis.edu/~unsworth/hownot2read.html)
• My answer to that question is that whatever you do, you don't read them, because you
can't.
• When millions of books are equally at your fingertips, all eagerly responding to your Google Book
Search: you can no longer as easily ignore the books you don't know, nor can you grasp the
collective systems they make up without some new strategy—a strategy for not reading.
• Tanya Clement, Sara Steger, John Unsworth, Kirsten Uszkalo (2008). "How Not to Read a Million Books"
http://people.brandeis.edu/~unsworth/hownot2read.html
Franco Moretti:
“distant reading”
(contrast with
“close reading”)
34. Culturomics
"Aiden and Michel are the founders of a field they call “culturomics,”
in which quantitative analysis is performed on digitized texts to
generate empirical data about historical, cultural, and linguistic
trends."
"Both Aiden and Michel have backgrounds in biology, and [their
writing] reveals the extent to which that disciplinary sensibility fed
into the creation of culturomics."
Mark O'Connell (2014). "Bright Lights, Big Data". In New Yorker (Mar 20, 2014).
Online at http://www.newyorker.com/books/page-turner/bright-lights-big-data
35.
Egnal, Marc (2013).
"Evolution of the Novel in
the United States: The
Statistical Evidence". In
Social Science History 37:2
pp. 231-254
The Google Ngram viewer:
https://books.google.com/ngrams
36. Model behind the NGram
• Words in texts: words are strings of letters, same string: same word
• Words are collected into documents (books), each published on a particular date
• Critique from Humanities reviewers:
• Different genres of books make a difference
• Different periods of time had different genres more prominent
• What books have been preserved up to today?
• Words change their meaning, e.g. “spiritual”
Was the Ngram
model too simple?
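The model on this slide can be made concrete with a toy sketch in Python. It is not Google's implementation: the three "books" are invented, and the function name and the choice to normalise by total words per year are assumptions for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: (publication year, text). Invented for illustration.
books = [
    (1850, "the spiritual life of the parish"),
    (1850, "a history of the parish"),
    (1990, "spiritual wellness and the self"),
]


def yearly_frequency(books, word):
    """Relative frequency of `word` per year: occurrences / all words that year."""
    hits = Counter()
    totals = Counter()
    for year, text in books:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        # Same string of letters = same word; genre and meaning are ignored.
        hits[year] += tokens.count(word)
    return {year: hits[year] / totals[year] for year in sorted(totals)}


freq = yearly_frequency(books, "spiritual")
print(freq)
```

Nothing in the sketch records genre, survival of the book, or a change in meaning, which is exactly where the reviewers' critique bites.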
37. Big Data and Unstructured text:
the example of Topic Modelling
• There is a vast amount of digital text, almost all of
it with minimal semantic structure
CHAPTER I. Down the Rabbit-Hole
Alice was beginning to get very tired of
sitting by her sister on the bank, and of
having nothing to do: once or twice she had
peeped into the book her sister was reading,
but it had no pictures or conversations in
it, 'and what is the use of a book,' thought
Alice 'without pictures or conversations?'
So she was considering in her own mind (as
well as she could, for the hot day made her
feel very sleepy and stupid), whether the
pleasure of making a daisy-chain would be
worth the trouble of getting up and picking
the daisies, when suddenly a White Rabbit
with pink eyes ran close by her.
38. "Topic Model" (Wikipedia)
• [A] topic model is a type of statistical model for discovering the abstract "topics"
that occur in a collection of documents. Intuitively, given that a document is
about a particular topic, one would expect particular words to appear in the
document more or less frequently: "dog" and "bone" will appear more often in
documents about dogs, "cat" and "meow" will appear in documents about cats,
and "the" and "is" will appear equally in both. A document typically concerns
multiple topics in different proportions; thus, in a document that is 10% about
cats and 90% about dogs, there would probably be about 9 times more dog
words than cat words. A topic model captures this intuition in a mathematical
framework, which allows examining a set of documents and discovering, based
on the statistics of the words in each, what the topics might be and what each
document's balance of topics is.
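The "90% dogs, 10% cats" intuition can be simulated with the Python standard library. This sketch only runs the assumed generative story forwards (topics produce words); it does not infer topics from documents, which is what a real topic model does. The word lists, function name and seed are hypothetical.

```python
import random
from collections import Counter

# Hypothetical topics: each is simply a list of words it tends to produce.
topics = {
    "dogs": ["dog", "bone", "bark"],
    "cats": ["cat", "meow", "purr"],
}


def generate_document(mixture, length, seed=0):
    """Draw each word by first picking a topic according to `mixture`."""
    rng = random.Random(seed)
    names = list(mixture)
    weights = [mixture[name] for name in names]
    words = []
    for _ in range(length):
        topic = rng.choices(names, weights=weights)[0]
        words.append(rng.choice(topics[topic]))
    return words


doc = generate_document({"dogs": 0.9, "cats": 0.1}, length=10_000)
counts = Counter(doc)
dog_words = sum(counts[w] for w in topics["dogs"])
cat_words = sum(counts[w] for w in topics["cats"])
print(dog_words / cat_words)  # roughly 9; the exact value depends on the seed
```

A topic model works in the opposite direction: given only the word counts, it estimates which word lists and which mixture would best explain them.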
39. Topic Modelling:
"Bag of Words"
• "In this model, a text (such as a sentence or a document) is
represented as the bag (multiset) of its words, disregarding grammar
and even word order but keeping multiplicity."
• The idea is that the selection of words in a document relates to what
the document is about.
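A bag of words is easy to show with Python's `collections.Counter`. A minimal sketch; the tokenising rule (lower-case runs of letters and apostrophes) and the two sentences are assumptions made for the example.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Return the multiset of lower-cased words in `text`."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


a = bag_of_words("The dog chased the cat.")
b = bag_of_words("The cat chased the dog.")

print(a)
print(a == b)  # word order is discarded, multiplicity is kept
```

The two sentences mean different things but produce the same bag, which shows both what the model keeps and what it throws away.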
40. Topic Modelling: Martha Ballard's Diary
• Cameron Blevins:
http://historying.org/2010/04/01/topic-modeling-
martha-ballards-diary/
• "In A Midwife’s Tale, Laurel Ulrich describes the challenge of analyzing
Martha Ballard’s exhaustive diary, which records daily entries over the
course of 27 years"
• “The problem is not that the diary is trivial but that it introduces more
stories than can be easily recovered and absorbed.”
• Each Diary Entry is somewhat independent, and often focuses on
something that happened that day: terrible weather, or a beautiful birth,
etc.
• "MALLET allows you to feed in a series of text files, which the machine
will then process and generate a user-specified number of word clusters it
thinks are related topics."
41. Topic Modelling Martha Ballard's Diary:
Mallet's Topics
• birth deld safe morn receivd calld left cleverly pm labour fine reward arivd
infant expected recd shee born patient
• meeting attended afternoon reverend worship foren mr famely performd vers
attend public supper st service lecture discoarst administred supt
• day yesterday informd morn years death ye hear expired expird weak dead las
past heard days drowned departed evinn
• gardin sett worked clear beens corn warm planted matters cucumbers
gatherd potatoes plants ou sowd door squash wed seeds
• lb made brot bot tea butter sugar carried oz chees pork candles wheat store
pr beef spirit churnd flower
• unwell mr sick gave dr rainy easier care head neighbor feet relief made throat
poorly takeing medisin ts stomach
"MALLET is completely unconcerned with
the meaning of a word (which is fortunate,
given the difficulty of teaching a computer
that, in this text, discoarst actually means
discoursed). Instead, the program is only
concerned with how the words are used in
the text, and specifically what words tend to
be used similarly."
42. Topic Modelling Martha
Ballard's Diary: Mallet's Topics
• MIDWIFERY: birth deld safe morn receivd calld left cleverly pm labour fine
reward arivd infant expected recd shee born patient
• CHURCH: meeting attended afternoon reverend worship foren mr famely
performd vers attend public supper st service lecture discoarst administred
supt
• DEATH: day yesterday informd morn years death ye hear expired expird weak
dead las past heard days drowned departed evinn
• GARDENING: gardin sett worked clear beens corn warm planted matters
cucumbers gatherd potatoes plants ou sowd door squash wed seeds
• SHOPPING: lb made brot bot tea butter sugar carried oz chees pork candles
wheat store pr beef spirit churnd flower
• ILLNESS: unwell mr sick gave dr rainy easier care head neighbor feet relief
made throat poorly takeing medisin ts stomach
43. Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital
methods applied to
digital things
New digital methods
applied both to new
digital things, but also
against non-digital
subjects.
-- the issue of models
and modelling
44. Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital
methods applied to
digital things
New digital methods
applied both to new
digital things, but also
against non-digital
subjects.
-- the issue of models
and modelling
Making
Digital
Objects
Using Digital Tools
45. Humanities research with digital humanities techniques
(such as those from Big Data) in the process
Humanities Research:
Modelling
Capture
Presentation
Source
Source
Second’ry
Sources
Research Process
Primary Sources
e.g.:
• Marked up Text (TEI)
• Database
48. “Knowledge Representation”
• A knowledge representation (KR) is most fundamentally a surrogate, a
substitute for the thing itself, used to enable an entity to determine
consequences by thinking rather than acting, i.e., by reasoning about the
world rather than taking action in it. […] It is a set of ontological
commitments, i.e., an answer to the question: In what terms should I think
about the world? (Davis, Shrobe, Szolovits 1993)
• "In terms of humanities computing, modelling is an iterative process of
constructing and developing something like a computational 'knowledge
representation' as this is defined in computer science. In fact we might say
that a model is a manipulable knowledge representation.”
• Willard McCarty 2002. “Humanities Computing: Essential Problems, Experimental Practice” in
Literary and Linguistic Computing Vol 17 No 1. pp.103-125
49. Imaginary Museum:
A basic design
Examples of “Access points”:
Find me all pictures, and the museums
holding them, for Ihei Kimura
Find me all pictures held by the
Bibliothèque Nationale in Paris
Find me all pictures taken by
women photographers after the 2nd
World War
Took Holds
50. Modelling understanding: “The Art of making
in Antiquity” project
“Engaging the archive’s
compiler [Rockwell] in this
way will give a unique
angle to the metadata,
making timely use of his
position as a leading
authority on stoneworking
and a sculptor of long
standing.” http://www.artofmaking.ac.uk/
55. “Prosopography”
• “[the word] was missing from the main text of the two-volume Shorter Oxford English
Dictionary, published in 1973, but there it was in the Addenda, between polythene and
profiteroles […]. Wait, though: this is not our prosopography but a neoclassical one, given a
first attestation in 1577 and a derivation from an early modern neologism ‘prosopographia’,
the description of an individual’s personality and career.”
• Janet L. Nelson, David Pelteret and Harold Short (2003). “Medieval Prosopographies and the
Prosopography of Anglo-Saxon England”. Fifty Years of Prosopography. In series Proceedings of the British
Academy, Vol 118, p 155-167.
56. Prosopography: definition from PASE
• What is a Prosopography?
• “A particular prosopography aims to amass and present clearly a quantity of information on all individuals
in a given category”
• (PASE website)
• An Historical project.
• A published prosopography becomes a reference for other historians to
use. It tells them:
• Who's who
• Something of what is known about them
• In what sources this individual appears.
57. How is prosopography carried out?
Don't forget that a prosopography is a kind of index:
• recording all that is known about people who fit the criteria of the project in the
prosopographical categories.
• all we know about historical figures is what has survived from their own period
• generally, for older periods, these are collections of manuscript documents
• information about a single person might be scattered across separate documents.
• A Bishop, for example, might have his life story told by more than one author
• his doings might also appear in various legal and church documents that
survive
• also, in letters he wrote, or others wrote to him.
58. The task of Prosopography
• The job, then, is to:
• read all the sources
• collect information from all categories of interest to the particular
prosopography about people
• record this information and publish it so that others can use it to:
• get a summary of what is known about the person
• find where the person is actually described in the extant sources.
59. Traditional prosopography as narrative
From J.R. Martindale, The
Prosopography of the Later
Roman Empire, 3: A.D. 527-
641. Cambridge: Cambridge
University Press. 1992.
“Text in” and “Text out”?
61.
“DDH’s” Structured Prosopographical Projects
• Prosopography of the Byzantine Empire
• With John Martindale, Editor. First published on CD, now available free online
• Prosopography of the Byzantine World
• With Michael Jeffreys (KCL), Averil Cameron (Professor of Late Antique and Byzantine History, U of Oxford), Charlotte Roueché (Dept of
Byzantine and Modern Greek Studies, KCL): latest update 2011
• Clergy of the Church of England Database
• With Arthur Burns (History, KCL), Kenneth Fincham (History, U of Kent at Canterbury), Stephen Taylor (History, U of Reading): latest update 2014
• Prosopography of Anglo-Saxon England
• With Janet Nelson and Stephen Baxter (History, KCL), Simon Keynes (Department of Anglo-Saxon, Norse, and Celtic, U of Cambridge): latest
update 2011
• Paradox/Peoples of Medieval Scotland
• With Prof Dauvit Broun (University of Glasgow), Prof Roibeard Ó Maolalaigh (Glasgow), Prof David Carpenter (KCL), Dr Matthew Hammond
(University of Edinburgh): last update Sept 2012
• Breaking of Britain
• With Prof Dauvit Broun (University of Glasgow), Prof David Carpenter (KCL), Dr Matthew Hammond (University of Edinburgh), Prof Keith
Stringer (Univ of Lancaster)
• Making of Charlemagne’s Europe
• With Alice Rio (KCL, History). Project finished 2015
• Digitising the Prosopographies of the Roman Republic
• With Henrik Mouritsen and Dominic Rathbone (KCL, Classics): beginning late 2013
62. Traditional Prosopography:
in print form
Sources
People
From J.R. Martindale, The
Prosopography of the
Later Roman Empire, 3:
A.D. 527-641. Cambridge:
Cambridge University
Press. 1992.
Places
63. Structured Data for prosopography: Many
entrances
Source
Person
Place
Office
Date
Source
Source
Event
Prosopographical Database
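"Many entrances" can be illustrated with a short Python sketch: the same structured records are indexed under each kind of entity, so a reader can enter by person, by source or by place. The record layout, the function name and every value are hypothetical placeholders, not the schema of any of the projects listed earlier.

```python
from collections import defaultdict

# Hypothetical assertions; every value is a placeholder, not real data.
assertions = [
    {"source": "Source A", "person": "Person 1", "office": "Bishop",
     "place": "Place X", "date": "Date 1", "event": "Event 1"},
    {"source": "Source B", "person": "Person 1", "office": "Bishop",
     "place": "Place Y", "date": "Date 2", "event": "Event 2"},
    {"source": "Source B", "person": "Person 2", "office": None,
     "place": "Place X", "date": "Date 2", "event": "Event 2"},
]


def entrance(records, key):
    """Group the same records under one kind of entity (one 'entrance')."""
    index = defaultdict(list)
    for record in records:
        if record[key] is not None:
            index[record[key]].append(record)
    return dict(index)


by_person = entrance(assertions, "person")
by_source = entrance(assertions, "source")
by_place = entrance(assertions, "place")

# In which sources does Person 1 appear? Who is recorded at Place X?
sources_for_person_1 = sorted({r["source"] for r in by_person["Person 1"]})
people_at_place_x = sorted({r["person"] for r in by_place["Place X"]})
print(sources_for_person_1, people_at_place_x)
```

A printed prosopography offers one fixed ordering; here every field is an equally good starting point.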
69. Modelling: Appropriate to the Humanities?
• “humanistic inquiry reveals itself as an activity
fundamentally dependent upon the location of pattern.”
• “Of all the technologies in use among computing
humanists, databases are perhaps the best suited to
facilitating and exploiting [pattern].”
• “To build a database one must be willing to move from the
forest to the trees and back again; to use a database is to
reap the benefits of the enhanced vision which the system
affords.”
• (from Ramsay, “Databases” in A Companion to Digital
Humanities)
70. Appropriate to the Humanities?
• the underlying ontology [that a database represents] has considerable intellectual value.
• A well-designed database that contains information about people, buildings, and events in New
York City contains not static information, but an entire set of ontological relations capable of
generating statements about a domain.
• A truly relational database, in other words, contains not merely "Central Park", "Frederick Law
Olmstead", and "1857", but a far more suggestive string of logical relationships (e.g., "Frederick
Law Olmstead submitted his design for Central Park in New York during 1857").
• (from Ramsay, “Databases” in A Companion to Digital Humanities)
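Ramsay's point can be tried out with Python's standard `sqlite3` module. A minimal sketch: the table and column names are assumptions, the sentence template is invented, and the name is spelled as it appears on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE place  (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
-- The relationship itself is stored as a row linking the other two tables.
CREATE TABLE design_submission (
    person_id INTEGER REFERENCES person(id),
    place_id  INTEGER REFERENCES place(id),
    year      INTEGER
);
INSERT INTO person VALUES (1, 'Frederick Law Olmstead');
INSERT INTO place  VALUES (1, 'Central Park', 'New York');
INSERT INTO design_submission VALUES (1, 1, 1857);
""")

rows = conn.execute("""
    SELECT person.name, place.name, place.city, design_submission.year
    FROM design_submission
    JOIN person ON person.id = design_submission.person_id
    JOIN place  ON place.id  = design_submission.place_id
""").fetchall()

# Turn each joined row into a statement about the domain.
statements = [
    f"{person} submitted a design for {place} in {city} during {year}"
    for person, place, city, year in rows
]
print(statements[0])
```

The three isolated values carry little; the join is what turns them into a statement.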
71. Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital
methods applied to
digital things
New digital methods
applied both to new
digital things, but also
against non-digital
subjects.
-- the issue of models
and modelling
Making
Digital
Objects
Using Digital Tools
73. The “world of code”
“We live in worlds increasingly interwoven
with code. Code puts into operation the apps
and communication channels we increasingly
depend on in everyday life; it twitters away
insistently in our pockets in our portable
devices; it channels the personal data,
interests, and relationships that we include
in our social network profiles; and it
aggregates us into vast database
architectures, powered by database
management techniques, as a million little
informational bits, in order to offer us new
services, recommendations, and
experiences. Without really thinking about it,
we are spending vast amounts of our time
doing stuff with code.”
http://dmlcentral.net/blog/ben-williamson/coded-curriculum-new-architectures-learning
74. App Inventor: “Anyone can build Apps that
impact the world” (!!)
http://appinventor.mit.edu/explore/
75. “Telescopes for the Mind”
• "Digital artifacts like tools could then be considered as "telescopes for
the mind" that show us something in a new light" (Ramsay and
Rockwell 2012, p 79)
• An “app” can be a new tool that, like telescopes, allows us to see
material we point it at in a new way that can transform our
understanding of the material, as when Galileo pointed the telescope
at the moon.
76. New insights through “making things”
• It may be that my personal history and training exert a prejudicial influence
that limits the appeal of how I think about digital humanities. Perhaps that
history and training explain why in reading Tim Ingold's illuminating book,
Making: Anthropology, archaeology, art and architecture (2013), I am
drawn eagerly to the pedagogical expressions of his anthropology in such
class exercises as weaving baskets and see in them (changing what needs
to be changed) a model for training digital humanists.
• The link between baskets and computing was made explicit to me this
morning by the announcement of the Rare Book Summer School
(Humanist 30.742), which quoted a former student as saying, "I will never
look at a book -- any book -- the same way again." (McCarty 2017,
Humanist Vol. 30, No. 746)
80. Pliny: what is it?
• it is a thought-piece:
• perhaps wrongheaded in various ways
• … although a couple of years was spent on research into what Humanities
scholarship was like before Pliny was built
• Pliny is meant to promote discussion within the DH about this area.
http://pliny.cch.kcl.ac.uk
The object of study for me in Pliny was
the traditional methods of Humanities
Research itself.
81. Pliny and software tool collaboration:
significance of annotation
From a show about damaged books (!)
at Cambridge University Library
82. The page is at the “nexus”
Publishing
Application
•Preparing text
•book design and
presentation
•Printing
•Distribution
•The printing press
The page is the nexus between publishing and annotation
Annotation
Application
•Support dynamic
text
•Support use of
annotations
•The pen
83. The screen as the “nexus”
PDF Viewer
•Reading PDF file
•Layout on the
screen
•Supporting page
turning, etc
Pliny
•Support display
of annotations
•Manage notes
and anchors
•Support work
with notes
84. Humanities research as process, and its research output
Humanities Research
as process
Source
Second’ry
Sources
Source
Source
Primary
Sources
Research Output
Research Process
Emerging Ideas
Holmer, Joan Ozark (1994). “‘Draw, if
you be Men’: Saviolo’s Significance
for Romeo and Juliet”. In
Shakespeare Quarterly, Vol. 45 No 2
(Summer 1994). pp. 163-189
Annotation
& notetaking
85.
Pliny objects as a connected graph: a “Mind
Map”
• An example of a mindmap
• Graham Burnett (2005)
• http://en.wikipedia.org/wiki/File:Mindmap.gif
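A connected graph of notes can be sketched in Python as follows. This is not Pliny's actual data model: the node names, the link list and the function are hypothetical, and simply show what "connected" means for a set of notes and anchors.

```python
from collections import defaultdict, deque

# Hypothetical links between notes and anchors on sources.
edges = [
    ("Note A", "Anchor in Source 1"),
    ("Note A", "Note B"),
    ("Note B", "Anchor in Source 2"),
    ("Note C", "Anchor in Source 2"),
    ("Note D", "Anchor in Source 3"),
]

# Undirected graph: each link can be followed in both directions.
graph = defaultdict(set)
for left, right in edges:
    graph[left].add(right)
    graph[right].add(left)


def connected_to(start):
    """Breadth-first search: everything reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {start}


print(sorted(connected_to("Note A")))
```

Note C is reached from Note A only because both touch the same anchor, the kind of indirect connection a mind map makes visible.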
86. • "To ask whether coding is a scholarly act is like asking whether writing
is a scholarly act." (Ramsay and Rockwell 2012, p 82)
• "A tool is, so to speak, an objectified idea, a theorem whose force is
imposed on its consequences. It is thought in action and must not be
conceived in terms of the crudities of early materialism." (p 332)
88. Digital Humanities as the “big tent”
Traditional Scholarship about digital things in society
Data Analysis such as big data using digital tools
Data Representation using digital tools
Making Digital Tools
Textual Markup
Wearable Digital Objects
Information studies for Humanities
92. “Our Data Ourselves”
Do you know how much data you generate? Do you know where it goes? Is it being sold by Google
to increase their market share, now valued at £250 billion? Is it being ‘scooped’ by the GCHQ or
NSA? The Snowden revelations show that security agencies are surreptitiously taking data from
our phones, cameras, apps, and anything else we use that leaves a digital trace. Especially if you
use a mobile device, you are actively contributing to the 2.5 billion gigabytes of data being
generated daily. To put that in perspective, this would be enough data to fill one hundred million
iPhones, every day. Yet, public understanding of our information-rich environment and our
quantified selves remains underdeveloped.
‘Our Data Ourselves’ is a research project we lead at King’s College London, examining the data we
generate on our mobile devices. We have brought together media and cultural theorists,
computer scientists, programmers and youth in a unique project exploring this ‘big social data’
(BSD) we generate. As arts and humanities researchers we are interested in the development and
transfer of the technical skills and knowledge necessary for the capture and analysis of the
different forms of BSD, and its transformation into community research data. This desire to create
community research data signals broader social relevance and potential. Our project, therefore,
asks whether BSD can be transformed into a public asset and become a creative resource for
cultural and economic community development.
93. Partitioning the Digital Humanities
Traditional Scholarship about digital things in society
Data Analysis using digital tools
Data Representation using digital tools
Making Digital Tools
Digital Technology as active partner in the work
“Modelling and Models”
94. The World's Technological Capacity to Store, Communicate,
and Compute Information
Martin Hilbert and Priscila López (2011). "The World’s Technological
Capacity to Store, Communicate, and Compute Information". In
Science 332:6025 (April 2011), pp. 60-65. DOI: 10.1126/science.1200970
Fig 2. World’s technological installed capacity to store information
10 times
95. Moretti: Graphs, maps, trees
Distant Reading
• "essay on literary history"
• "instead of concrete individual
works [..] the text undergoes a
process of deliberate reduction
and abstraction"
• Distant Reading: "a specific form
of knowledge: fewer elements,
hence a sharper sense of their
overall interconnection".
96. Moretti: Graphs, maps, trees
Distant Reading
• Moretti claims traditional history focused
on rare/curious objects and, therefore, on
prominent people and their doings. The
Annales school looked at the larger society
and, with more data, used statistical methods
("social history").
• "What would happen if literary historians,
too, decided to 'shift their gaze [...] from
the extraordinary to the everyday, from
exceptional events to the large mass of
facts'?"
• "A more rational literary history. That is the
idea." (p. 4)
The Annales School: "The school has been
highly influential in setting the agenda for
historiography in France and numerous other
countries, especially regarding the use of social
scientific methods by historians, emphasis on
social rather than political or diplomatic themes,
and for being generally hostile to the class
analysis of Marxist historiography."
97. Moretti: The rise of the novel
Moretti claims that work like his
is "truly cooperative" (based on
data from other projects) and could
be combined in more than one
way.
Moretti talks about graphs of the
rise of the novel – similar
shapes for five countries, three
continents, over two centuries
... the same old metaphor
98. Reaction to Distant Reading
• Kathryn Schulz (2011) What Is Distant Reading?
• New York Times Sunday Book Review June 24, 2011.
• http://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse-what-is-distant-reading.html
• "Let’s say you pick up a copy of “Jude the Obscure,” become obsessed with Victorian fiction and somehow
manage to make your way through all 200-odd books generally considered part of that canon. Moretti would
say: So what? As many as 60,000 other novels were published in 19th-century England — to mention
nothing of other times and places. You might know your George Eliot from your George Meredith, but you
won’t have learned anything meaningful about literature, because your sample size is absurdly small. Since
no feasible amount of reading can fix that, what’s called for is a change not in scale but in strategy. To
understand literature, Moretti argues, we must stop reading books."
99. The "Canon"
• "What makes the history of music, or of any art, particularly troublesome is that
what is most exceptional, not what is most usual, has often the greatest claim on
our interest."
• "... It is only in the works of Haydn, Mozart, and Beethoven that all the
contemporary elements of musical style -- rhythmic, harmonic, and melodic --
work coherently together, or that the ideals of the period are realized on a level
of any complexity." (pp 21-22)
• Charles Rosen, "The Classical Style"
100. Hype?
• 'For those who think in next-new-thing terms, "Big
Data" seems to fill the role not of an enormously
challenging research problem but of a solution, a
t-shirt slogan, a rah-rah rhetorical flourish, i.e. it
amounts to boondoggly hype.'
• HUMANIST Sun, Oct 26, 2014, Willard McCarty
101. Modelling, and Patterns
• "In representing the past, we seek perspective, the point of view that
allows us to discern patterns among the events that have occurred.
We are not so much trying to transmit accumulated knowledge –
culture and tradition do this, among other means – as to understand
the significance of our experience."
• Bodenhamer 2008 p. 222
102. Data, Structure, Interpretation
• "There can be no data without structure, and all structure is interface,
whether we view it as a screen appearance or not. [...] Even more
importantly, all interfaces—visible as well as invisible—are
interpretational forms."
• (McGann, Jerome (2010). "Sustainability: The Elephant in the Room". In Online Humanities
Scholarship: The Shape of Things to Come. Houston Texas: Rice University Press)
http://rup.rice.edu/cnx_content/shape/m34305.html
103. Analytical Modelling: the utility of failure
• "the digital model illumines analytically by isolating what would not
compute. In other words, the failures of analytic modelling are where
its success is to be found."
• Willard McCarty (2008). “What’s going on?” in Literary and Linguistic Computing, Vol 23 No 3.
p. 256
104. ENIAC
ENIAC (/ˈini.æk/ or
/ˈɛni.æk/; Electronic
Numerical Integrator And
Computer) was the first
electronic general-purpose
computer. It was [...]
capable of being
reprogrammed to solve "a
large class of numerical
problems". ENIAC was
initially designed to
calculate artillery firing
tables for the United States
Army's Ballistic Research
Laboratory.
105. ENIAC
Programming the
ENIAC
"ENIAC could be programmed to
perform complex sequences of
operations, including loops, branches,
and subroutines. The task of taking a
problem and mapping it onto the
machine was complex, and usually
took weeks. After the program was
figured out on paper, the process of
getting the program into ENIAC by
manipulating its switches and cables
could take days."
106. Von Neumann Machines
• Von Neumann Architecture
• Stored-Program System
• A stored-program digital computer is
one that keeps its program
instructions, as well as its data, in
read-write, random-access memory
(RAM)
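The stored-program idea can be illustrated with a toy machine in Python. This is a minimal sketch under stated assumptions, not a model of any real processor: the four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`), the two-cell instruction format, and the `run` function are invented for the example. The point is that instructions and data sit side by side in the same read-write memory.

```python
# Hypothetical opcodes for a toy machine; each instruction occupies two
# memory cells: the opcode followed by a memory address.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3


def run(memory):
    """Execute the program held in `memory`, starting at cell 0."""
    acc = 0  # accumulator register
    pc = 0   # program counter: address of the next instruction
    while True:
        opcode, operand = memory[pc], memory[pc + 1]  # fetch from memory
        pc += 2
        if opcode == HALT:
            return memory
        elif opcode == LOAD:
            acc = memory[operand]        # read data from memory
        elif opcode == ADD:
            acc += memory[operand]
        elif opcode == STORE:
            memory[operand] = acc        # write data back to memory
        else:
            raise ValueError(f"unknown opcode {opcode}")


# One memory holds both the program (cells 0-7) and its data (cells 8-10).
memory = [
    LOAD, 8,    # cells 0-1: acc = memory[8]
    ADD, 9,     # cells 2-3: acc = acc + memory[9]
    STORE, 10,  # cells 4-5: memory[10] = acc
    HALT, 0,    # cells 6-7: stop
    2, 3,       # cells 8-9: the two numbers to add
    0,          # cell 10: the result goes here
]

run(memory)
print(memory[10])  # 5
```

Changing the program means writing different numbers into memory, with no switches or cables to reset.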
107. Inside the phone (or computer...)
Long-term Storage
Screen
Touch Surface
Speaker
Microphone
Camera
Wireless
A Von Neumann Machine
108. Inside the phone (or computer...)
Long-term Storage
Screen
Touch Surface
Speaker
Microphone
Camera
Wireless
A Von Neumann Machine
Software
Software
Data
Data
• Software for devices (device drivers)
• Software for services (Operating System)
• Software for functionality (application software: apps, etc.)