Four corners of the big tent:
perspectives on the Digital
Humanities
(for HUCO 617: Big Data: The Web as Evidence)
John Bradley
Department of Digital Humanities
King’s College London
Why me?
• Been active in Digital Humanities since the 1970s (!!)
• Been a member of King’s Department of Digital Humanities since 1997
• Have done work in more than one “aspect” of the DH
• John Bradley and Julianne Nyhan, “Getting Computers into Humanists’
Thinking”. in Julianne Nyhan and Andrew Flinn, Computation and the
Humanities: Towards an Oral History of Digital Humanities, Springer
Publishers 2017. ISBN: 978-3-319-20169-6 (Print) 978-3-319-20170-2
(Online) http://link.springer.com/chapter/10.1007/978-3-319-20170-2_14
KCL: Department of Digital Humanities
http://www.kcl.ac.uk/artshums/depts/ddh/index.aspx
What are the Humanities?
• “Research stemming from a detailed understanding of human
behaviour, economies, cultures and societies can dramatically
redefine the crucial decisions we need to make. These decisions may
involve the future direction of our economy, ways of broadening and
strengthening education provision at all levels, or how we deal with
the effects of climate or constitutional change… The humanities and
social sciences teach us how people have created their world, and
how they in turn are created by it.”
• –The British Academy for Humanities & Social Sciences, “Press Pack”.
• From Alan Liu, 4Humanities: Advocating for the Humanities. Website
at http://4humanities.org/2014/12/what-are-the-humanities/
What are the Humanities?
• “The humanities are academic disciplines that study human culture.
The humanities use methods that are primarily critical, or speculative,
and have a significant historical element—as distinguished from the
mainly empirical approaches of the natural sciences. The humanities
include ancient and modern languages, literature, philosophy,
religion, and visual and performing arts such as music and theatre.
Areas that are sometimes regarded as social sciences and sometimes
as humanities include history, archaeology, anthropology, area
studies, communication studies, classical studies, law and linguistics.”
• –Wikipedia, “Humanities,” 2014.
• From http://4humanities.org/2014/12/what-are-the-humanities/
What is the Digital Humanities?
• "Along with the digital archives, quantitative analyses, and tool-building
projects that once characterized the field, DH now encompasses a wide
range of methods and practices: visualizations of large image sets, 3D
modeling of historical artifacts, “born digital” dissertations, hashtag
activism and the analysis thereof, alternate reality games, mobile
makerspaces, and more.
In what has been called “big tent” DH, it can at times be difficult to
determine with any specificity what, precisely, digital humanities work
entails."
• Klein and Gold (2016). “Digital Humanities: The Expanded Field”. In Debates
in the Digital Humanities. University of Minnesota Press. Online at
http://dhdebates.gc.cuny.edu/debates/2
What is the Digital Humanities?
• "Digital Humanities is not a unified field but an array of convergent
practices that explore a universe in which: a) print is no longer the
exclusive or the normative medium in which knowledge is produced
and/or disseminated; instead, print finds itself absorbed into new,
multimedia configurations; and b) digital tools, techniques, and media
have altered the production and dissemination of knowledge in the
arts, human and social sciences."
• The Digital Humanities Manifesto 2.0
http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf
• A “scholarly” context for the Digital Humanities.
Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
DH as Traditional Scholarship on Digital
Matters: Digital Cultural Studies
• The Centre for Digital Culture presents
• The People’s Memes: Populist Politics in a Digital Society
February 27th
King's College London
Nash Lecture Theatre–18:30
Tickets and more details here.
• The event will host a number of speakers who have been investigating the
nexus between populism and digital technology. It will discuss the reasons for
the current surge of populist politics, and the way in which it relates to a
number of processes, including the cultural and social shocks produced by
rapid technological innovation, the new mass outreach affordances of social
media, and the way they allow to sidestep the mediation of mainstream news
media, and the changes in the class structure and in social experience that
have been facilitated by the diffusion of social media and digital technology
more generally. Come for the memes, stay for the discussion.
• Confirmed speakers include Paolo Gerbaudo (KCL), Alex Williams (City), and
Emmy Ekhlund (KCL).
Humanities research as process, and its research output:
Text in, Text Out
[Diagram: Humanities research as process. Labels: Primary Sources, Secondary Sources, Research Process, Emerging Ideas]
“the historical work as what it most manifestly is: a verbal
structure in the form of a narrative prose discourse” Hayden
White (1973), quoted in Jörn Rüsen (1987). “Historical
Narration: Foundation, Types, Reason” in History and Theory Vol
26 No 4
Where is the digital?
Research Output: The People’s Memes
Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital methods applied to digital things
New digital methods applied both to new digital things and to non-digital subjects
-- the issue of digital models and modelling
Pre-digital Information Technology
Digital Information Technology
Digital device as an abstract machine
• Analogue devices:
• Television, radio, camera,
• Each needed to be a separate device
• Each needed separate media technologies for their information
• Digital devices
• If the data becomes digital then the same device can at different times appear
to be any of these devices.
• At the bottom data layer are the bits: ones and zeros.
• A layer of interpretation sits on top of the bits: it is the representation of the
needs of the different media through data structures and software that enables
this abstract machine to do many different kinds of things. Software
(apps) provides this.
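The idea of one set of bits under several layers of interpretation can be sketched in a few lines of Python. This is only an illustration: the eight byte values and the three interpretations are chosen for the example and do not come from the slides.

```python
import struct

# Eight bytes: at the bottom layer these are just ones and zeros.
raw = bytes([72, 101, 108, 108, 111, 33, 33, 33])

# Interpretation 1: the bytes as text (ASCII character codes).
as_text = raw.decode("ascii")

# Interpretation 2: the same bytes as two unsigned 32-bit integers.
as_numbers = struct.unpack(">2I", raw)

# Interpretation 3: the same bytes as a 2 x 4 grid of grey-scale pixels.
as_pixels = [list(raw[0:4]), list(raw[4:8])]

print(as_text)
print(as_numbers)
print(as_pixels)
```

Nothing in the bytes themselves says which reading is the right one; the software supplies that layer.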
Need for “Formal Structure”
• Formal: computers being machines built on formalisms, formal
structures representing materials of interest are necessary if the
machine is to do anything with this material.
"An abstract structure is a formal object that is defined by a set of laws, properties, and
relationships …”
“The formal sciences are built up of symbols and theoretical rules. They can often be applied to
reality and they are often proved to be very useful.”
“In computer science, abstraction is a mechanism and practice to reduce and factor out details
so that one can focus on a few concepts at a time.”
Structure and an Image
• To show an image the data
about it must be
structured:
• A grid of dots
• Dimension of the grid
• A colour for each dot
• The computer must know
how to map it to the screen
•Image from the BPI 1700 project
•© Copyright The Trustees of
The British Museum
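A minimal sketch of the structure listed above (a grid of dots, its dimensions, a colour for each dot, and a mapping to the screen) might look like the following. The class name `Image`, the helper `render`, and the tiny 3 x 2 example are assumptions made for illustration; real image formats are far more elaborate.

```python
from dataclasses import dataclass
from typing import List, Tuple

Colour = Tuple[int, int, int]  # (red, green, blue), each 0-255


@dataclass
class Image:
    width: int           # dimension of the grid: dots per row
    height: int          # dimension of the grid: number of rows
    dots: List[Colour]   # one colour per dot, stored row by row

    def colour_at(self, x: int, y: int) -> Colour:
        """Look up the colour of the dot at column x, row y."""
        return self.dots[y * self.width + x]


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
tiny = Image(width=3, height=2, dots=[WHITE, BLACK, WHITE,
                                      BLACK, WHITE, BLACK])


def render(img: Image) -> str:
    """A crude 'map to the screen': one character per dot."""
    rows = []
    for y in range(img.height):
        rows.append("".join("#" if img.colour_at(x, y) == BLACK else "."
                            for x in range(img.width)))
    return "\n".join(rows)


print(render(tiny))
```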
Saying more about the image
• There is more that can be represented
about the image:
• It is of something:
• An image by Richard Gaywood (1644-
1668) : Concert of birds; including a
peacock and owl
• It contains images of things
• Subject classification
• Links between subjects and areas
on the image
• To say these things in a way the
computer can use requires
structuring as well.
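One hedged way to picture this extra structuring is a record that sits alongside the pixel data. The creator and title below are taken from the slide; the field names and the region coordinates are invented for illustration only.

```python
# Descriptive structure layered on top of the image itself.
description = {
    "creator": "Richard Gaywood",
    "title": "Concert of birds; including a peacock and owl",
    "subjects": ["peacock", "owl"],
    # Links between subjects and areas on the image.
    # Boxes are (left, top, right, bottom) in pixels: hypothetical values.
    "regions": [
        {"subject": "peacock", "box": (40, 60, 220, 300)},
        {"subject": "owl", "box": (300, 80, 380, 190)},
    ],
}


def regions_showing(desc, subject):
    """Return the areas of the image linked to a given subject."""
    return [r["box"] for r in desc["regions"] if r["subject"] == subject]


print(regions_showing(description, "owl"))
```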
The digital object as a surrogate for
the thing-in-the-world
Imaginary Museum
What objects of interest are
evidently represented in a catalogue
of photographs like the “Imaginary
Museum”?
Imaginary Museum:
A basic design
Examples of “access points”:
• Find me all pictures, and their holding museums, for Ihei Kimura
• Find me all pictures held by the Bibliothèque Nationale in Paris
• Find me all pictures taken by women photographers after the 2nd World War
[Diagram relationships: “Took”, “Holds”]
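The basic design can be sketched as three linked tables, with the “Took” and “Holds” relationships expressed as references from each picture to its photographer and its museum. This is an assumption about how such a design could be realised, using Python's built-in `sqlite3`. All table names, column names, and rows are invented placeholders, not data from the Imaginary Museum.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE photographer (id INTEGER PRIMARY KEY, name TEXT, gender TEXT);
CREATE TABLE museum (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE picture (
    id INTEGER PRIMARY KEY,
    title TEXT,
    year INTEGER,
    photographer_id INTEGER REFERENCES photographer(id),  -- "Took"
    museum_id INTEGER REFERENCES museum(id)               -- "Holds"
);
-- All rows below are invented placeholders.
INSERT INTO photographer VALUES (1, 'Photographer A', 'm'),
                                (2, 'Photographer B', 'f');
INSERT INTO museum VALUES (1, 'Museum X'), (2, 'Museum Y');
INSERT INTO picture VALUES
    (1, 'Picture one', 1953, 1, 2),
    (2, 'Picture two', 1938, 2, 1),
    (3, 'Picture three', 1951, 2, 1);
""")

JOINED = """
    SELECT p.title, ph.name, m.name
    FROM picture p
    JOIN photographer ph ON ph.id = p.photographer_id
    JOIN museum m ON m.id = p.museum_id
"""

# Access point 1: all pictures (and holding museum) for one photographer.
by_photographer = con.execute(
    JOINED + " WHERE ph.name = ? ORDER BY p.id", ("Photographer A",)).fetchall()

# Access point 2: all pictures held by one museum.
held_by_museum = con.execute(
    JOINED + " WHERE m.name = ? ORDER BY p.id", ("Museum X",)).fetchall()

# Access point 3: pictures taken by women photographers after 1945.
women_post_war = con.execute(
    JOINED + " WHERE ph.gender = 'f' AND p.year > 1945 ORDER BY p.id").fetchall()

print(by_photographer, held_by_museum, women_post_war, sep="\n")
```

Each access point is simply a different way into the same three tables.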
Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools: “Big Data”
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital methods applied to digital things
New digital methods applied both to new digital things and to non-digital subjects
-- the issue of models and modelling
Humanities research with digital humanities techniques
(such as those from Big Data) in the process
[Diagram: Humanities research as process. Labels: Primary Sources, Secondary Sources, Research Process, Emerging Ideas, Research Output]
http://www.diggingintodata.org/
"Now going into the third round of the competition, the
Digging into Data Challenge has funded a wide variety
of projects that explore how computationally intensive
research methods can be used to ask new questions
about and gain new insights into our world."
Digging into Data: "a new era"
• "the Digging into Data Challenge investigators have demarcated a new
era -- one with the promise of revelatory explorations of our cultural
heritage that will lead us to new insights and knowledge, and to a
more nuanced and expansive understanding of the human condition"
(Williford and Henry 2012, p 1)
"New methodological approaches"
• "Research at these scales, speeds, and levels of complexity
encourages new methodological approaches and intellectual
strategies." (Williford and Henry 2012, p. 2)
"Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications."
"Big Data is a moving target; what is considered to be "Big" today will not be so years ahead. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."
Powers of 10: substantial changes in perspective
• Powers of Ten video (Charles and Ray Eames, IBM, 1977)
• https://www.youtube.com/watch?v=0fKBhvDjuy0
“the effect of adding another zero”
kilometer
megameter
gigameter
"Million Books" Initiative and "Text Mining"
(Unsworth 2008)
• where does the trope of “a million books” come from? It originates, as far as I know, with the Universal Library and
its Million Books Project, which began in 2001. The Universal Library is directed by Raj Reddy, professor and former
Dean of Computer Science at Carnegie Mellon University; the million books project (funded by NSF and others) was
a kind of very large pilot, aimed at digitizing a million books (“less than 1% of all books in all languages ever
published”1), beginning with partners in India and later expanding to China and Egypt.
• Google Print (now known as Google Book Search), which had begun in secret in 2002 and was unveiled at the
Frankfurt Book Fair in October 2004, and which had Harvard's library as one of its initial partners. Google Books aims
to scan as many as 30 million books, a number equal to all the titles in WorldCat, and for all we know, they are
already about halfway there.
• “What do you do with a million books?”—a question first asked, I think, by Greg Crane, in D-Lib
Magazine, in March of 2006. (http://people.brandeis.edu/~unsworth/hownot2read.html)
• My answer to that question is that whatever you do, you don't read them, because you
can't.
• When millions of books are equally at your fingertips, all eagerly responding to your Google Book
Search: you can no longer as easily ignore the books you don't know, nor can you grasp the
collective systems they make up without some new strategy—a strategy for not reading.
• Tanya Clement, Sara Steger, John Unsworth, Kirsten Uszkalo (2008). "How Not to Read a Million Books"
http://people.brandeis.edu/~unsworth/hownot2read.html
Franco Moretti: “distant reading” (contrast with “close reading”)
Culturomics
"Aiden and Michel are the founders of a field they call “culturomics,”
in which quantitative analysis is performed on digitized texts to
generate empirical data about historical, cultural, and linguistic
trends."
"Both Aiden and Michel have backgrounds in biology, and [their
writing] reveals the extent to which that disciplinary sensibility fed
into the creation of culturomics."
Mark O'Connell (2014). "Bright Lights, Big Data". In New Yorker (Mar 20, 2014).
Online at http://www.newyorker.com/books/page-turner/bright-lights-big-data
Egnal, Marc (2013). "Evolution of the Novel in the United States: The Statistical Evidence". In Social Science History 37:2, pp. 231-254
The Google Ngram viewer:
https://books.google.com/ngrams
Model behind the NGram
• Words in texts: words are strings of letters; same string, same word
• Words are collected into documents (books), each published on a particular date
• Critique from Humanities reviewers:
• Different genres of books make a difference
• Different periods had different genres more prominent
• What books have been preserved up to today?
• Words change their meaning, e.g. “spiritual”
Was the Ngram model too simple?
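The model just described is simple enough to sketch directly: a word is a string, a book is a bag of strings with a date, and the measure is how often the string occurs per year. The three "books" below are invented, and this is not how the Google Ngram Viewer is implemented; it only illustrates the underlying model.

```python
from collections import Counter

# (publication year, text) pairs: invented placeholder "books".
books = [
    (1800, "the spiritual life of the parish"),
    (1800, "the corn and the wheat"),
    (1900, "a spiritual awakening"),
]


def yearly_frequency(books, word):
    """Relative frequency of `word` among all words published each year."""
    hits, totals = Counter(), Counter()
    for year, text in books:
        tokens = text.lower().split()   # same string: same word
        totals[year] += len(tokens)
        hits[year] += tokens.count(word)
    return {year: hits[year] / totals[year] for year in sorted(totals)}


freq = yearly_frequency(books, "spiritual")
print(freq)
```

The reviewers' critique is visible in the code: genre, survival of books, and any change in what “spiritual” meant between 1800 and 1900 do not appear anywhere in it.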
Big Data and Unstructured text:
the example of Topic Modelling
• There is a vast amount of digital text, almost all of
it with minimal semantic structure
CHAPTER I. Down the Rabbit-Hole
Alice was beginning to get very tired of
sitting by her sister on the bank, and of
having nothing to do: once or twice she had
peeped into the book her sister was reading,
but it had no pictures or conversations in
it, 'and what is the use of a book,' thought
Alice 'without pictures or conversations?'
So she was considering in her own mind (as
well as she could, for the hot day made her
feel very sleepy and stupid), whether the
pleasure of making a daisy-chain would be
worth the trouble of getting up and picking
the daisies, when suddenly a White Rabbit
with pink eyes ran close by her.
"Topic Model" (Wikipedia)
• [A] topic model is a type of statistical model for discovering the abstract "topics"
that occur in a collection of documents. Intuitively, given that a document is
about a particular topic, one would expect particular words to appear in the
document more or less frequently: "dog" and "bone" will appear more often in
documents about dogs, "cat" and "meow" will appear in documents about cats,
and "the" and "is" will appear equally in both. A document typically concerns
multiple topics in different proportions; thus, in a document that is 10% about
cats and 90% about dogs, there would probably be about 9 times more dog
words than cat words. A topic model captures this intuition in a mathematical
framework, which allows examining a set of documents and discovering, based
on the statistics of the words in each, what the topics might be and what each
document's balance of topics is.
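The intuition in this definition can be sketched by running the assumed model forwards: pick a topic for each word according to the document's mixture, then pick a word from that topic. The word lists and the half-and-half split between topic words and function words are invented for the example. A real topic modelling tool works in the opposite direction, inferring topics and mixtures from the word counts alone.

```python
import random
from collections import Counter

# Invented toy topics: each topic is just a list of characteristic words.
topics = {
    "dogs": ["dog", "bone", "bark"],
    "cats": ["cat", "meow", "purr"],
}
function_words = ["the", "is"]  # appear equally whatever the topic


def generate(mixture, n_words, rng):
    """Generate a document whose topic words follow the given mixture."""
    names = list(mixture)
    weights = [mixture[name] for name in names]
    words = []
    for _ in range(n_words):
        if rng.random() < 0.5:
            words.append(rng.choice(function_words))
        else:
            topic = rng.choices(names, weights=weights)[0]
            words.append(rng.choice(topics[topic]))
    return words


rng = random.Random(0)
doc = generate({"dogs": 0.9, "cats": 0.1}, 10000, rng)
counts = Counter(doc)
dog_words = sum(counts[w] for w in topics["dogs"])
cat_words = sum(counts[w] for w in topics["cats"])
print(dog_words, cat_words, round(dog_words / cat_words, 1))
```

For a document that is 90% about dogs and 10% about cats, the printed ratio should come out at roughly nine dog words for every cat word.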
Topic Modelling:
"Bag of Words"
• "In this model, a text (such as a sentence or a document) is
represented as the bag (multiset) of its words, disregarding grammar
and even word order but keeping multiplicity."
• The idea is that the selection of words in a document relates to what
the document is about.
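A bag of words is easy to show with Python's `collections.Counter`, which is a multiset. The two sentences and the simple tokenising pattern below are assumptions made for the illustration.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Reduce a text to its words and how often each occurs."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


a = bag_of_words("The dog chased the cat.")
b = bag_of_words("The cat chased the dog.")

print(a)
print(a == b)  # word order is disregarded, multiplicity is kept
```

The two sentences mean different things but produce the same bag, which is exactly the information the model chooses to throw away.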
Topic Modelling: Martha Ballard's Diary
• Cameron Blevins:
http://historying.org/2010/04/01/topic-modeling-martha-ballards-diary/
• "In A Midwife’s Tale, Laurel Ulrich describes the challenge of analyzing
Martha Ballard’s exhaustive diary, which records daily entries over the
course of 27 years"
• “The problem is not that the diary is trivial but that it introduces more
stories than can be easily recovered and absorbed.”
• Each diary entry is somewhat independent, and often focuses on
something that happened that day: terrible weather, a beautiful birth,
etc.
• "MALLET allows you to feed in a series of text files, which the machine
will then process and generate a user-specified number of word clusters it
thinks are related topics."
Topic Modelling Martha Ballard's Diary:
Mallet's Topics
• birth deld safe morn receivd calld left cleverly pm labour fine reward arivd
infant expected recd shee born patient
• meeting attended afternoon reverend worship foren mr famely performd vers
attend public supper st service lecture discoarst administred supt
• day yesterday informd morn years death ye hear expired expird weak dead las
past heard days drowned departed evinn
• gardin sett worked clear beens corn warm planted matters cucumbers
gatherd potatoes plants ou sowd door squash wed seeds
• lb made brot bot tea butter sugar carried oz chees pork candles wheat store
pr beef spirit churnd flower
• unwell mr sick gave dr rainy easier care head neighbor feet relief made throat
poorly takeing medisin ts stomach
"MALLET is completely unconcerned with the meaning of a word (which is fortunate, given the difficulty of teaching a computer that, in this text, discoarst actually means discoursed). Instead, the program is only concerned with how the words are used in the text, and specifically what words tend to be used similarly."
Topic Modelling Martha
Ballard's Diary: Mallet's Topics
• MIDWIFERY: birth deld safe morn receivd calld left cleverly pm labour fine
reward arivd infant expected recd shee born patient
• CHURCH: meeting attended afternoon reverend worship foren mr famely
performd vers attend public supper st service lecture discoarst administred
supt
• DEATH: day yesterday informd morn years death ye hear expired expird weak
dead las past heard days drowned departed evinn
• GARDENING: gardin sett worked clear beens corn warm planted matters
cucumbers gatherd potatoes plants ou sowd door squash wed seeds
• SHOPPING: lb made brot bot tea butter sugar carried oz chees pork candles
wheat store pr beef spirit churnd flower
• ILLNESS: unwell mr sick gave dr rainy easier care head neighbor feet relief
made throat poorly takeing medisin ts stomach
Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital methods applied to digital things
New digital methods applied both to new digital things and to non-digital subjects
-- the issue of models and modelling
Making Digital Objects
Using Digital Tools
Humanities research with digital humanities techniques
(such as those from Big Data) in the process
[Diagram: Humanities research. Labels: Primary Sources, Secondary Sources, Modelling, Capture, Presentation, Research Process]
e.g.:
• Marked up Text (TEI)
• Database
Formal Mathematical Models from Physics
Our Formal Models are not like this!
“Knowledge Representation”
• A knowledge representation (KR) is most fundamentally a surrogate, a
substitute for the thing itself, used to enable an entity to determine
consequences by thinking rather than acting, i.e., by reasoning about the
world rather than taking action in it. […] It is a set of ontological
commitments, i.e., an answer to the question: In what terms should I think
about the world? (Davis, Shrobe, Szolovits 1993)
• "In terms of humanities computing, modelling is an iterative process of
constructing and developing something like a computational 'knowledge
representation' as this is defined in computer science. In fact we might say
that a model is a manipulable knowledge representation.”
• Willard McCarty 2002. “Humanities Computing: Essential Problems, Experimental Practice” in
Literary and Linguistic Computing Vol 17 No 1. pp.103-125
Modelling understanding: “The Art of Making in Antiquity” project
“Engaging the archive’s compiler [Rockwell] in this way will give a unique angle to the metadata, making timely use of his position as a leading authority on stoneworking and a sculptor of long standing.”
http://www.artofmaking.ac.uk/
A model of Rockwell’s
interpretation
Art of Making: Exploring the Tools
Art of Making: The “Tooth Chisel”
Art of Making: a particular photograph
“Prosopography”
• “[the word] was missing from the main text of the two-volume Shorter Oxford English
Dictionary, published in 1973, but there it was in the Addenda, between polythene and
profiteroles […]. Wait, though: this is not our prosopography but a neoclassical one, given a
first attestation in 1577 and a derivation from an early modern neologism ‘prosopographia’,
the description of an individual’s personality and career.”
• Janet L. Nelson, David Pelteret and Harold Short (2003). “Medieval Prosopographies and the
Prosopography of Anglo-Saxon England”. Fifty Years of Prosopography. In series Proceedings of the British
Academy, Vol 118, p 155-167.
Prosopography: definition from PASE
• What is a Prosopography?
• “A particular prosopography aims to amass and present clearly a quantity of information on all individuals
in a given category”
• (PASE website)
• An Historical project.
• A published prosopography becomes a reference for other historians to
use. It tells them:
• Who's who
• Something of what is known about them
• In what sources this individual appears.
How is prosopography carried out?
Don't forget that a prosopography is a kind of index:
• recording all that is known about people who fit the criteria of the project in the
prosopographical categories.
• all we know about historical figures is what has survived from their own period
• generally, for older periods, these are collections of manuscript documents
• information about a single person might be scattered across separate documents.
• A bishop, for example, might have his life story told by more than one author
• his doings might also appear in various legal and church documents that
survive
• and in letters he wrote, or that others wrote to him.
The task of Prosopography
• The job, then, is to:
• read all the sources
• collect information from all categories of interest to the particular
prosopography about people
• record this information and publish it so that others can use it to:
• get a summary of what is known about the person
• find where the person is actually described in the extant sources.
Traditional prosopography as narrative
From J.R. Martindale, The Prosopography of the Later Roman Empire, 3: A.D. 527-641. Cambridge: Cambridge University Press. 1992.
“Text in” and “Text out”?
Prosopography of Anglo-Saxon England (PASE)
http://www.pase.ac.uk
“DDH’s” Structured Prosopographical Projects
• Prosopography of the Byzantine Empire
• With John Martindale, Editor. First published on CD, now available free online
• Prosopography of the Byzantine World
• With Michael Jeffreys (KCL), Averil Cameron (Professor of Late Antique and Byzantine History, U of Oxford), Charlotte Roueché (Dept of
Byzantine and Modern Greek Studies, KCL): latest update 2011
• Clergy of the Church of England Database
• With Arthur Burns (History, KCL), Kenneth Fincham (History, U of Kent at Canterbury), Stephen Taylor (History, U of Reading): latest update 2014
• Prosopography of Anglo-Saxon England
• With Janet Nelson and Stephen Baxter (History, KCL), Simon Keynes (Department of Anglo-Saxon, Norse, and Celtic, U of Cambridge): latest
update 2011
• Paradox/Peoples of Medieval Scotland
• With Prof Dauvit Broun (University of Glasgow), Prof Roibeard Ó Maolalaigh (Glasgow), Prof David Carpenter (KCL), Dr Matthew Hammond
(University of Edinburgh): last update Sept 2012
• Breaking of Britain
• With Prof Dauvit Broun (University of Glasgow), Prof David Carpenter (KCL), Dr Matthew Hammond (University of Edinburgh), Prof Keith
Stringer (Univ of Lancaster)
• Making of Charlemagne’s Europe
• With Alice Rio (KCL, History). Project finished 2015
• Digitising the Prosopographies of the Roman Republic
• With Henrik Mouritsen and Dominic Rathbone (KCL, Classics): beginning late 2013
Traditional Prosopography: in print form
[Figure labels: Sources, People, Places]
From J.R. Martindale, The Prosopography of the Later Roman Empire, 3: A.D. 527-641. Cambridge: Cambridge University Press. 1992.
Structured Data for prosopography: Many
entrances
[Diagram: Prosopographical Database. Labels: Source, Person, Place, Office, Date, Event, Offices/Posts, Institutions]
What brings these together?
The ‘factoid model’
Factoid: a spot in a source that says something about a person or persons.
http://factoid-dighum.kcl.ac.uk/
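As a rough sketch of the idea, a factoid can be pictured as a small record that ties a spot in a source to one or more people and to what the source says there. The class, its field names, and every row below are invented for illustration; this is not the schema used by PASE or the other projects.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Factoid:
    source: str                # which source
    location: str              # the "spot" in that source
    persons: Tuple[str, ...]   # the person or persons it says something about
    kind: str                  # hypothetical category, e.g. "office", "event"
    assertion: str             # what the source says at that spot


# Invented placeholder factoids.
factoids: List[Factoid] = [
    Factoid("Source 1", "f. 3r", ("Person 1",), "office", "is named as bishop"),
    Factoid("Source 2", "ch. 12", ("Person 1", "Person 2"), "event",
            "witness a grant of land"),
    Factoid("Source 2", "ch. 14", ("Person 2",), "office", "is named as abbot"),
]


def about(person: str) -> List[Factoid]:
    """Entrance by person: everything the sources say about them."""
    return [f for f in factoids if person in f.persons]


def sources_for(person: str) -> List[str]:
    """Entrance by source: where the person appears."""
    return sorted({f.source for f in about(person)})


print([f.assertion for f in about("Person 1")])
print(sources_for("Person 1"))
```

Because each factoid links person, source, and assertion, the same small records can be entered by any of those routes.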
Some sizes for KCL’s structured
Prosopographies
• PASE: sources: 2,784 (including Domesday Book), people: 19,807
(including 978 women), factoids: 282,026
• PoMS: sources: 9,259 (mainly charters), people: 21,311,
factoids: 87,956
• CCE: people (clergy): 158,263, sources: 2,987 (admin sources),
factoids: 931,636 (resolved), over 2M in total
Hermits in PASE
PASE: Acts of Fasting/Resisting Temptation
Modelling: Appropriate to the Humanities?
• “humanistic inquiry reveals itself as an activity
fundamentally dependent upon the location of pattern.”
• “Of all the technologies in use among computing
humanists, databases are perhaps the best suited to
facilitating and exploiting [pattern].”
• “To build a database one must be willing to move from the
forest to the trees and back again; to use a database is to
reap the benefits of the enhanced vision which the system
affords.”
• (from Ramsay, “Databases”, in A Companion to Digital
Humanities)
Appropriate to the Humanities?
• the underlying ontology [that a database represents] has considerable intellectual value.
• A well-designed database that contains information about people, buildings, and events in New
York City contains not static information, but an entire set of ontological relations capable of
generating statements about a domain.
• A truly relational database, in other words, contains not merely "Central Park", "Frederick Law
Olmstead", and "1857", but a far more suggestive string of logical relationships (e.g., "Frederick
Law Olmstead submitted his design for Central Park in New York during 1857").
• (from Ramsay, “Databases”, in A Companion to Digital Humanities)
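Ramsay's point can be sketched with a tiny relational database in Python's built-in `sqlite3`: the three values sit in separate tables, and the relationships between them are what generate the statement. The table and column names are assumptions made for this example. The values, including the spelling of the name, are taken from the quotation.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE place  (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE event (
    person_id INTEGER REFERENCES person(id),
    action TEXT,
    place_id INTEGER REFERENCES place(id),
    year INTEGER
);
INSERT INTO person VALUES (1, 'Frederick Law Olmstead');
INSERT INTO place  VALUES (1, 'Central Park', 'New York');
INSERT INTO event  VALUES (1, 'submitted his design for', 1, 1857);
""")

rows = con.execute("""
    SELECT person.name, event.action, place.name, place.city, event.year
    FROM event
    JOIN person ON person.id = event.person_id
    JOIN place ON place.id = event.place_id
""").fetchall()

# The relations, not the isolated values, generate the statement.
statements = [f"{name} {action} {place} in {city} during {year}"
              for name, action, place, city, year in rows]
print(statements[0])
```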
Partitioning the Digital Humanities
• Traditional Scholarship about digital things in society
• Data Analysis using digital tools
• Data Representation using digital tools
• Making Digital Tools
Traditional non-digital methods applied to digital things
New digital methods applied both to new digital things and to non-digital subjects
-- the issue of models and modelling
Making Digital Objects
Using Digital Tools
The “world of code”
“We live in worlds increasingly interwoven with code. Code puts into operation the apps and communication channels we increasingly depend on in everyday life; it twitters away insistently in our pockets in our portable devices; it channels the personal data, interests, and relationships that we include in our social network profiles; and it aggregates us into vast database architectures, powered by database management techniques, as a million little informational bits, in order to offer us new services, recommendations, and experiences. Without really thinking about it, we are spending vast amounts of our time doing stuff with code.”
http://dmlcentral.net/blog/ben-williamson/coded-curriculum-new-architectures-learning
App Inventor: “Anyone can build Apps that
impact the world” (!!)
http://appinventor.mit.edu/explore/
“Telescopes for the Mind”
• "Digital artifacts like tools could then be considered as "telescopes for
the mind" that show us something in a new light" (Ramsay and
Rockwell 2012, p 79)
• An “app” can be a new tool that, like telescopes, allows us to see
material we point it at in a new way that can transform our
understanding of the material, as when Galileo pointed the telescope
at the moon.
New insights through “making things”
• It may be that my personal history and training exert a prejudicial influence
that limits the appeal of how I think about digital humanities. Perhaps that
history and training explain why in reading Tim Ingold's illuminating book,
Making: Anthropology, archaeology, art and architecture (2013), I am
drawn eagerly to the pedagogical expressions of his anthropology in such
class exercises as weaving baskets and see in them (changing what needs
to be changed) a model for training digital humanists.
• The link between baskets and computing was made explicit to me this
morning by the announcement of the Rare Book Summer School
(Humanist 30.742), which quoted a former student as saying, "I will never
look at a book -- any book -- the same way again.” (McCarty 2017,
Humanist Vol. 30, No. 746)
The TACT KWIC Display
TACT Distribution Display
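A KWIC (keyword in context) display lists every occurrence of a word together with a few words on either side. The sketch below shows the general technique only and is not TACT's implementation. The function name `kwic` and the `width` parameter are invented for the example, and the sample text is the opening of the Alice passage quoted earlier.

```python
def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = text.split()
    lines = []
    for i, token in enumerate(tokens):
        # Compare without surrounding punctuation and without regard to case.
        if token.strip(".,;:!?'\"").lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


sample = ("Alice was beginning to get very tired of sitting by her sister "
          "on the bank, and of having nothing to do: once or twice she had "
          "peeped into the book her sister was reading")

for left, word, right in kwic(sample, "sister"):
    print(f"{left:>20} | {word} | {right}")
```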
Text Analysis Tool: Voyant
Moby Dick in Voyant
Pliny: what is it?
• it is a thought-piece:
• perhaps wrongheaded in various ways
• … although a couple of years was spent on research into what Humanities
scholarship was like before Pliny was built
• Pliny is meant to promote discussion within the DH about this area.
http://pliny.cch.kcl.ac.uk
The object of study for me in Pliny was
the traditional methods of Humanities
Research itself.
Pliny and software tool collaboration:
significance of annotation
From a show about damaged books (!)
at Cambridge University Library
The page is at the “nexus”
Publishing
Application
•Preparing text
•book design and
presentation
•Printing
•Distribution
•The printing press
The page is the nexus between publishing and annotation
Annotation
Application
•Support dynamic
text
•Support use of
annotations
•The pen
The screen as the “nexus”
PDF Viewer
•Reading PDF file
•Layout on the
screen
•Supporting page
turning, etc
Pliny
•Support display
of annotations
•Manage notes
and anchors
•Support work
with notes
Humanities research as process, and its research output
[Diagram labels: Primary Sources; Secondary Sources; Annotation & notetaking; Emerging Ideas; Research Process; Research Output]
Holmer, Joan Ozark (1994). "'Draw, if you be Men': Saviolo's Significance for Romeo and Juliet". In Shakespeare Quarterly, Vol. 45 No 2 (Summer 1994). pp. 163-189
Pliny objects as a connected graph: a “Mind
Map”
• An example of a mindmap
• Graham Burnett (2005)
• http://en.wikipedia.org/wiki/File:Mindmap.gif
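The idea of notes, sources and ideas forming a connected graph can be sketched with a small data structure. This is only an illustration under stated assumptions: the class `NoteGraph`, its methods and the labels used below are hypothetical and do not reflect Pliny's real data model.

```python
from collections import defaultdict

class NoteGraph:
    """Toy undirected graph of named objects (notes, sources, ideas)."""

    def __init__(self):
        self.links = defaultdict(set)

    def connect(self, a, b):
        # A link is symmetric: each object records the other as a neighbour.
        self.links[a].add(b)
        self.links[b].add(a)

    def neighbours(self, name):
        return sorted(self.links.get(name, ()))

    def reachable(self, start):
        """Everything connected to start, directly or indirectly."""
        seen, stack = {start}, [start]
        while stack:
            for nxt in self.links.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

# Hypothetical objects, named only for the example.
g = NoteGraph()
g.connect("note 1", "source: Holmer 1994")
g.connect("note 1", "idea 1")
g.connect("note 2", "source: other")

print(g.neighbours("note 1"))
print(sorted(g.reachable("source: Holmer 1994")))
```

Following links outward from one object, as `reachable` does, is the graph equivalent of tracing the branches of a mind map.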
• "To ask whether coding is a scholarly act is like asking whether writing
is a scholarly act." (Ramsay and Rockwell 2012, p 82)
• "A tool is, so to speak, an objectified idea, a theorem whose force is
imposed on its consequences. It is thought in action and must not be
conceived in terms of the crudities of early materialism." (p 332)
Digital Humanities as the “big tent”
Traditional Scholarship about digital things in society
Data Analysis such as big data using digital tools
Data Representation using digital tools
Making Digital Tools
Textual Markup
Wearable Digital Objects
Information studies for Humanities
Digital Humanities: the “big tent” (?)
“Our Data Ourselves”
Do you know how much data you generate? Do you know where it goes? Is it being sold by Google
to increase their market share, now valued at £250 billion? Is it being ‘scooped’ by the GCHQ or
NSA? The Snowden revelations show that security agencies are surreptitiously taking data from
our phones, cameras, apps, and anything else we use that leaves a digital trace. Especially if you
use a mobile device, you are actively contributing to the 2.5 billion gigabytes of data being
generated daily. To put that in perspective, this would be enough data to fill one hundred million
iPhones, every day. Yet, public understanding of our information-rich environment and our
quantified selves remains underdeveloped.
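The comparison in the paragraph above implies a storage size per phone, which is easy to check. A quick sketch using only the two figures quoted there (the variable names are arbitrary):

```python
# Figures as quoted in the text above.
gigabytes_per_day = 2_500_000_000   # "2.5 billion gigabytes of data being generated daily"
iphones_per_day = 100_000_000       # "one hundred million iPhones, every day"

# Implied capacity of each phone in the comparison.
gb_per_iphone = gigabytes_per_day / iphones_per_day
print(gb_per_iphone)  # 25.0
```

So the comparison assumes roughly 25 GB of storage per phone.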
‘Our Data Ourselves’ is a research project we lead at King’s College London, examining the data we
generate on our mobile devices. We have brought together media and cultural theorists,
computer scientists, programmers and youth in a unique project exploring this ‘big social data’
(BSD) we generate. As arts and humanities researchers we are interested in the development and
transfer of the technical skills and knowledge necessary for the capture and analysis of the
different forms of BSD, and its transformation into community research data. This desire to create
community research data signals broader social relevance and potential. Our project, therefore,
asks whether BSD can be transformed into a public asset and become a creative resource for
cultural and economic community development.
Partitioning the Digital Humanities
Traditional
Scholarship
about digital
things in
society
Data Analysis
using digital
tools
Data
Representation
using digital
tools
Making Digital
Tools
Digital Technology as
active partner in the
work
“Modelling and Models”
The World's Technological Capacity to Store, Communicate,
and Compute Information
Martin Hilbert and Priscila López (2011). "The World’s
Technological Capacity to Store, Communicate, and
Compute Information", In
Science 332:6025 (April 2011) pp. 60-65. DOI:
10.1126/science.1200970
Fig 2. World’s technological installed capacity to store information
10 times
Moretti: Graphs, maps, trees
Distant Reading
• "essay on literary history"
• "instead of concrete individual
works [..] the text undergoes a
process of deliberate reduction
and abstraction"
• Distant Reading: "a specific form
of knowledge: fewer elements,
hence a sharper sense of their
overall interconnection".
Moretti: Graphs, maps, trees
Distant Reading
• Moretti claims traditional history focused
on rare or curious objects and, therefore,
on prominent people and their doings.
The Annales school looked at society at
large and, with more data, used
statistical methods ("social history").
• "What would happen if literary historians,
too, decided to 'shift their gaze [...] from
the extraordinary to the everyday, from
exceptional events to the large mass of
facts'?"
• "A more rational literary history. That is the
idea." (p. 4)
The Annales School: "The school has been
highly influential in setting the agenda for
historiography in France and numerous other
countries, especially regarding the use of social
scientific methods by historians, emphasis on
social rather than political or diplomatic themes,
and for being generally hostile to the class
analysis of Marxist historiography."
Moretti: The rise of the novel
Moretti claims that work like his
is "truly cooperative" (based on
data from other projects) and
could be combined in more than
one way.
Moretti talks about graphs of
rise of the novel – similar
shapes for five countries, three
continents, over two centuries
... the same old metaphor
Reaction to Distant Reading
• Kathryn Schulz (2011) What Is Distant Reading?
• New York Times Sunday Book Review June 24, 2011.
• http://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse-what-is-distant-reading.html
• "Let’s say you pick up a copy of “Jude the Obscure,” become obsessed with Victorian fiction and somehow
manage to make your way through all 200-odd books generally considered part of that canon. Moretti would
say: So what? As many as 60,000 other novels were published in 19th-century England — to mention
nothing of other times and places. You might know your George Eliot from your George Meredith, but you
won’t have learned anything meaningful about literature, because your sample size is absurdly small. Since
no feasible amount of reading can fix that, what’s called for is a change not in scale but in strategy. To
understand literature, Moretti argues, we must stop reading books."
The "Canon"
• "What makes the history of music, or of any art, particularly troublesome is that
what is most exceptional, not what is most usual, has often the greatest claim on
our interest."
• "... It is only in the works of Haydn, Mozart, and Beethoven that all the
contemporary elements of musical style -- rhythmic, harmonic, and melodic --
work coherently together, or that the ideals of the period are realized on a level
of any complexity." (pp 21-22)
• Charles Rosen, "The Classical Style"
Hype?
• 'For those who think in next-new-thing terms, "Big
Data" seems to fill the role not of an enormously
challenging research problem but of a solution, a
t-shirt slogan, a rah-rah rhetorical flourish, i.e. it
amounts to boondoggly hype.'
• HUMANIST Sun, Oct 26, 2014, Willard McCarty
Modelling, and Patterns
• "In representing the past, we seek perspective, the point of view that
allows us to discern patterns among the events that have occurred.
We are not so much trying to transmit accumulated knowledge –
culture and tradition do this, among other means – as to understand
the significance of our experience."
• Bodenhamer 2008 p. 222
Data, Structure, Interpretation
• "There can be no data without structure, and all structure is interface,
whether we view it as a screen appearance or not. [...] Even more
importantly, all interfaces—visible as well as invisible—are
interpretational forms."
• (McGann, Jerome (2010). "Sustainability: The Elephant in the Room". In Online Humanities
Scholarship: The Shape of Things to Come. Houston Texas: Rice University Press)
http://rup.rice.edu/cnx_content/shape/m34305.html
Analytical Modelling: the utility of failure
• "the digital model illumines analytically by isolating what would not
compute. In other words, the failures of analytic modelling are where
its success is to be found.”
• Willard McCarty (2008). “What’s going on?” in Literary and Linguistic Computing, Vol 23 No 3.
p. 256
ENIAC
ENIAC (/ˈini.æk/ or
/ˈɛni.æk/; Electronic
Numerical Integrator And
Computer) was the first
electronic general-purpose
computer. It was [...]
capable of being
reprogrammed to solve "a
large class of numerical
problems". ENIAC was
initially designed to
calculate artillery firing
tables for the United States
Army's Ballistic Research
Laboratory.
Programming the
ENIAC
"ENIAC could be programmed to
perform complex sequences of
operations, including loops, branches,
and subroutines. The task of taking a
problem and mapping it onto the
machine was complex, and usually
took weeks. After the program was
figured out on paper, the process of
getting the program into ENIAC by
manipulating its switches and cables
could take days."
Von Neumann Machines
• Von Neumann Architecture
• Stored-Program System
• A stored-program digital computer is
one that keeps its program
instructions, as well as its data, in
read-write, random-access memory
(RAM)
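The stored-program idea can be made concrete with a toy machine whose instructions and data sit in the same memory. This is a minimal sketch, not a model of any real processor: the four instruction names (`LOAD`, `ADD`, `STORE`, `HALT`), the single accumulator and the memory layout are all assumptions chosen for illustration.

```python
def run(memory):
    """Execute a toy program stored in the same list as its data."""
    pc = 0    # program counter: index of the next instruction to fetch
    acc = 0   # accumulator: the machine's single working register
    while True:
        op, arg = memory[pc]   # fetch the instruction from memory
        pc += 1
        if op == "LOAD":       # copy a memory cell into the accumulator
            acc = memory[arg]
        elif op == "ADD":      # add a memory cell to the accumulator
            acc += memory[arg]
        elif op == "STORE":    # write the accumulator back to memory
            memory[arg] = acc
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction {op!r}")

# One memory holds both the program (cells 0-3) and its data (cells 5-7).
memory = [
    ("LOAD", 5),    # 0: acc = memory[5]
    ("ADD", 6),     # 1: acc = acc + memory[6]
    ("STORE", 7),   # 2: memory[7] = acc
    ("HALT", 0),    # 3: stop
    None,           # 4: unused
    2,              # 5: data
    3,              # 6: data
    0,              # 7: result goes here
]

run(memory)
print(memory[7])  # 5
```

Because the program is just memory contents, changing what the machine does means writing different values into memory rather than rewiring switches and cables, which is the contrast with ENIAC drawn above.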
Inside the phone (or computer...)
Long-term
Storage
Screen
Touch Surface
Speaker
Microphone
Camera
Wireless
A Von
Neumann
Machine
Software
Data
•Software for devices (device
drivers)
•Software for services
(Operating System)
•Software for functionality
(application software: apps,
etc)
Operating System Services
answer = input("What is your name? ")  # Python 3; in Python 2 this call was raw_input()
Operating System Software
Widget Services
Text Services
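The one-line prompt above relies on services layered underneath the application. As a rough sketch of that layering, the helper below does by hand what a prompt call does: write text to an output stream, then read a line from an input stream. The function name `ask` and the in-memory "keyboard" and "screen" are assumptions made so the example runs without a user; on a real device these streams are ultimately backed by operating system services.

```python
import io
import sys

def ask(prompt, stdin=sys.stdin, stdout=sys.stdout):
    """Roughly what a prompt call does: show the prompt, read one line back."""
    stdout.write(prompt)   # hand the text to the output service
    stdout.flush()
    return stdin.readline().rstrip("\n")   # wait for a line from the input service

# Simulated keyboard and screen standing in for the real devices.
fake_keyboard = io.StringIO("Ada\n")
fake_screen = io.StringIO()

answer = ask("What is your name? ", fake_keyboard, fake_screen)
print(answer)
```

The application code never touches the keyboard or screen hardware directly; it only talks to the stream objects, which is the point of the service layers in the diagram.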

Four Corners of the Big Tent

  • 1. Four corners of the big tent: perspectives on the Digital Humanities (for HUCO 617: Big Data: The Web as Evidence) John Bradley Department of Digital Humanities King’s College London
  • 2. Why me? • Been active in Digital Humanities since the 1970s (!!) • Been a member of King’s Department of Digital Humanities since 1997 • Have done work in more than one “aspect” of the DH • John Bradley and Julianne Nyhan, “Getting Computers into Humanists’ Thinking”. in Julianne Nyhan and Andrew Flinn, Computation and the Humanities: Towards an Oral History of Digital Humanities, Springer Publishers 2017. ISBN: 978-3-319-20169-6 (Print) 978-3-319-20170-2 (Online) http://link.springer.com/chapter/10.1007/978-3-319-20170-2_14
  • 3. KCL: Department of Digital Humanities http://www.kcl.ac.uk/artshums/depts/ddh/index.aspx
  • 4. What are the Humanities? • “Research stemming from a detailed understanding of human behaviour, economies, cultures and societies can dramatically redefine the crucial decisions we need to make. These decisions may involve the future direction of our economy, ways of broadening and strengthening education provision at all levels, or how we deal with the effects of climate or constitutional change… The humanities and social sciences teach us how people have created their world, and how they in turn are created by it.” • –The British Academy for Humanities & Social Sciences, “Press Pack”. • From Alan Liu, 4Humanities: Advocating for the Humanities. Website at http://4humanities.org/2014/12/what-are-the-humanities/
  • 5. What are the Humanities? • “The humanities are academic disciplines that study human culture. The humanities use methods that are primarily critical, or speculative, and have a significant historical element—as distinguished from the mainly empirical approaches of the natural sciences. The humanities include ancient and modern languages, literature, philosophy, religion, and visual and performing arts such as music and theatre. Areas that are sometimes regarded as social sciences and sometimes as humanities include history, archaeology, anthropology, area studies, communication studies, classical studies, law and linguistics.” • –Wikipedia, “Humanities,” 2014. • From http://4humanities.org/2014/12/what-are-the-humanities/
  • 6. What is the Digital Humanities? • "Along with the digital archives, quantitative analyses, and tool-building projects that once characterized the field, DH now encompasses a wide range of methods and practices: visualizations of large image sets, 3D modeling of historical artifacts, “born digital” dissertations, hashtag activism and the analysis thereof, alternate reality games, mobile makerspaces, and more. In what has been called “big tent” DH, it can at times be difficult to determine with any specificity what, precisely, digital humanities work entails." • Klein and Gold (2016). “Digital Humanities: The Expanded Field”. In Debates in the Digital Humanities. University of Minnesota Press. Online at http://dhdebates.gc.cuny.edu/debates/2
  • 7. What is the Digital Humanities? • "Digital Humanities is not a unified field but an array of convergent practices that explore a universe in which: a) print is no longer the exclusive or the normative medium in which knowledge is produced and/or disseminated; instead, print finds itself absorbed into new, multimedia configurations; and b) digital tools, techniques, and media have altered the production and dissemination of knowledge in the arts, human and social sciences." • The Digital Humanities Manifesto 2.0 http://www.humanitiesblast.com/manifesto/Manifesto_V2.pdf • A “scholarly” context for the Digital Humanities.
  • 8. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools
  • 9. DH as Traditional Scholarship on Digital Matters: Digital Cultural Studies • The Centre for Digital Culture presents • The People’s Memes: Populist Politics in a Digital Society February 27th King's College London Nash Lecture Theatre–18:30 Tickets and more details here. • The event will host a number of speakers who have been investigating the nexus between populism and digital technology. It will discuss the reasons for the current surge of populist politics, and the way in which it relates to a number of processes, including the cultural and social shocks produced by rapid technological innovation, the new mass outreach affordances of social media, and the way they allow to sidestep the mediation of mainstream news media, and the changes in the class structure and in social experience that have been facilitated by the diffusion of social media and digital technology more generally. Come for the memes, stay for the discussion. • Confirmed speakers include Paolo Gerbaudo (KCL), Alex Williams (City), and Emmy Ekhlund (KCL).
  • 10. Humanities research as process, and its research output: Text in, Text Out • [Diagram: Humanities Research as process. Labels: Primary Sources, Secondary Sources, Research Process, Emerging Ideas, Research Output (The People’s Memes)] • “the historical work as what it most manifestly is: a verbal structure in the form of a narrative prose discourse” Hayden White (1973), quoted in Jörn Rüsen (1987). “Historical Narration: Foundation, Types, Reason” in History and Theory Vol 26 No 4 • Where is the digital?
  • 11. Humanities research as process, and its research output: Text in, Text Out • [Diagram: Humanities Research as process. Labels: Primary Sources, Secondary Sources, Research Process, Emerging Ideas, Research Output (The People’s Memes)] • “the historical work as what it most manifestly is: a verbal structure in the form of a narrative prose discourse” Hayden White (1973), quoted in Jörn Rüsen (1987). “Historical Narration: Foundation, Types, Reason” in History and Theory Vol 26 No 4
  • 12. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools Traditional non-digital methods applied to digital things
  • 13. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools Traditional non-digital methods applied to digital things New digital methods applied both to new digital things, but also against non-digital subjects
  • 14. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools Traditional non-digital methods applied to digital things New digital methods applied both to new digital things, but also against non-digital subjects. -- the issue of digital models and modelling
  • 17. Digital device as an abstract machine • Analogue devices: • Television, radio, camera • Each needed to be a separate device • Each needed separate media technologies for their information • Digital devices • If the data becomes digital, then the same device can at different times appear to be any of these devices. • At the bottom data layer are the bits: ones and zeros. • A layer of interpretation sits on top of the bits: it is the representation of the needs of the different media through data structures and software that makes this abstract machine able to do many different kinds of things. Software (apps) provides this.
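The point about a layer of interpretation on top of the bits can be made concrete with a small sketch. It is only an illustration, not something from the slides: the four byte values are arbitrary, and the three "readings" (text, integer, pixel) are assumptions chosen to show that the same bits mean different things depending on the software that interprets them.

```python
import struct

# Four bytes: at the bottom layer these are just 32 ones and zeros.
raw = bytes([72, 105, 33, 0])
bits = " ".join(f"{byte:08b}" for byte in raw)

# Interpretation 1: text (the first three bytes read as ASCII characters).
as_text = raw[:3].decode("ascii")

# Interpretation 2: a single unsigned 32-bit little-endian integer.
(as_integer,) = struct.unpack("<I", raw)

# Interpretation 3: one pixel as (red, green, blue, alpha) channel values.
as_pixel = tuple(raw)

print(bits)
print(as_text)
print(as_integer)
print(as_pixel)
```

Nothing in the bytes themselves says which reading is correct; that choice lives in the data structures and software layered on top.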
  • 18. Need for “Formal Structure” • Formal: computers being machines built on formalisms, formal structures representing materials of interest are necessary if the machine is to do anything with this material. "An abstract structure is a formal object that is defined by a set of laws, properties, and relationships …” “The formal sciences are built up of symbols and theoretical rules. They can often be applied to reality and they are often proved to be very useful.” “In computer science, abstraction is a mechanism and practice to reduce and factor out details so that one can focus on a few concepts at a time.”
  • 19. Structure and an Image • To show an image the data about it must be structured: • A grid of dots • Dimension of the grid • A colour for each dot • The computer must know how to map it to the screen • Image from the BPI 1700 project • © Copyright The Trustees of The British Museum
  • 20. Saying more about the image • There is more that can be represented about the image: • It is of something: • An image by Richard Gaywood (1644-1668): Concert of birds; including a peacock and owl • It contains images of things • Subject classification • Links between subjects and areas on the image • To say these things in a way the computer can use them requires structuring as well. The digital object as a surrogate for the thing-in-the-world
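The structure listed on this slide (a grid of dots, the grid's dimensions, a colour per dot, and a mapping to the screen) could be sketched as follows. This is a minimal illustration under assumptions: the `Image` class, its field names, and the text-based `render` function are hypothetical stand-ins for what real image formats and display software do.

```python
from dataclasses import dataclass

BLACK = (0, 0, 0)
WHITE = (255, 255, 255)


@dataclass
class Image:
    width: int    # dimension of the grid: number of columns
    height: int   # dimension of the grid: number of rows
    pixels: list  # one (red, green, blue) colour per dot, stored row by row

    def colour_at(self, x, y):
        # Map a grid position to its place in the flat list of dots.
        return self.pixels[y * self.width + x]


def render(image):
    # A stand-in for "mapping to the screen": dark dots become '#', others '.'.
    rows = []
    for y in range(image.height):
        rows.append("".join(
            "#" if image.colour_at(x, y) == BLACK else "."
            for x in range(image.width)
        ))
    return "\n".join(rows)


checker = Image(width=2, height=2, pixels=[BLACK, WHITE, WHITE, BLACK])
print(render(checker))
```

Without the width and height, the same list of colours could not be laid out as a picture at all, which is why the dimensions are part of the structure.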
  • 21. Imaginary Museum What objects of interest are evidently represented in a catalogue of photographs like the “Imaginary Museum”?
  • 22. Imaginary Museum: A basic design • Examples of “Access points”: • Find me all pictures and holding museums for Ihei Kimura • Find me all pictures held by the Bibliothèque Nationale in Paris • Find me all pictures taken by women photographers after the 2nd World War • [Diagram relations: Took, Holds]
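One way to read the basic design is as three kinds of entity (photographers, pictures, museums) joined by the two relations "Took" and "Holds". The sketch below expresses that reading in SQLite and answers the three kinds of access-point question. The table and column names are assumptions, and every row is an invented placeholder rather than real catalogue data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE photographer (id INTEGER PRIMARY KEY, name TEXT, gender TEXT);
CREATE TABLE museum (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE picture (
    id INTEGER PRIMARY KEY,
    title TEXT,
    year INTEGER,
    photographer_id INTEGER REFERENCES photographer(id),  -- the "Took" relation
    museum_id INTEGER REFERENCES museum(id)               -- the "Holds" relation
);
-- Placeholder rows, invented purely for illustration.
INSERT INTO photographer VALUES (1, 'Photographer A', 'f'), (2, 'Photographer B', 'm');
INSERT INTO museum VALUES (1, 'Museum X', 'Paris'), (2, 'Museum Y', 'London');
INSERT INTO picture VALUES
    (1, 'Picture 1', 1938, 1, 1),
    (2, 'Picture 2', 1952, 1, 2),
    (3, 'Picture 3', 1960, 2, 1);
""")


def pictures_and_museums_for(photographer_name):
    # Access point: all pictures, with holding museum, for one photographer.
    rows = conn.execute(
        """SELECT picture.title, museum.name
           FROM picture
           JOIN photographer ON photographer.id = picture.photographer_id
           JOIN museum ON museum.id = picture.museum_id
           WHERE photographer.name = ?
           ORDER BY picture.id""",
        (photographer_name,),
    )
    return rows.fetchall()


def pictures_held_by(museum_name):
    # Access point: all pictures held by one museum.
    rows = conn.execute(
        """SELECT picture.title
           FROM picture
           JOIN museum ON museum.id = picture.museum_id
           WHERE museum.name = ?
           ORDER BY picture.id""",
        (museum_name,),
    )
    return [title for (title,) in rows]


def pictures_by_women_after(year):
    # Access point: all pictures taken by women photographers after a given year.
    rows = conn.execute(
        """SELECT picture.title
           FROM picture
           JOIN photographer ON photographer.id = picture.photographer_id
           WHERE photographer.gender = 'f' AND picture.year > ?
           ORDER BY picture.id""",
        (year,),
    )
    return [title for (title,) in rows]


print(pictures_and_museums_for("Photographer A"))
print(pictures_held_by("Museum X"))
print(pictures_by_women_after(1945))
```

Each access point is only answerable because the design records the relevant property: the third query, for instance, depends on the model holding both a photographer's gender and a picture's date.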
  • 23. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools: “Big Data” • Data Representation using digital tools • Making Digital Tools Traditional non-digital methods applied to digital things New digital methods applied both to new digital things, but also against non-digital subjects. -- the issue of models and modelling
  • 24. Humanities research with digital humanities techniques (such as those from Big Data) in the process Humanities Research as process Source Second’ry Sources Source Source Primary Sources Research Output Research Process Emerging Ideas
  • 25. http://www.diggingintodata.org/ "Now going into the third round of the competition, the Digging into Data Challenge has funded a wide variety of projects that explore how computationally intensive research methods can be used to ask new questions about and gain new insights into our world."
  • 26. Digging into Data: "a new era" • "the Digging into Data Challenge investigators have demarcated a new era -- one with the promise of revelatory explorations of our cultural heritage that will lead us to new insights and knowledge, and to a more nuanced and expansive understanding of the human condition" (Williford and Henry 2012, p. 1)
  • 27. "New methodological approaches" • "Research at these scales, speeds, and levels of complexity encourages new methodological approaches and intellectual strategies." (Williford and Henry 2012, p. 2)
  • 28. "Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications." "Big Data is a moving target; what is considered to be "Big" today will not be so years ahead." "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."
  • 29. Powers of 10: substantial changes in perspective • Powers of Ten video (Charles and Ray Eames, IBM, 1977) • https://www.youtube.com/watch?v=0fKBhvDjuy0 • “the effect of adding another zero”
  • 30. Powers of 10: substantial changes in perspective • Powers of Ten video (Charles and Ray Eames, IBM, 1977) • https://www.youtube.com/watch?v=0fKBhvDjuy0 • kilometer
  • 31. Powers of 10: substantial changes in perspective • Powers of Ten video (Charles and Ray Eames, IBM, 1977) • https://www.youtube.com/watch?v=0fKBhvDjuy0 • megameter
  • 32. Powers of 10: substantial changes in perspective • Powers of Ten video (Charles and Ray Eames, IBM, 1977) • https://www.youtube.com/watch?v=0fKBhvDjuy0 • gigameter
  • 33. "Million Books" Initiative and "Text Mining" (Unsworth 2008) • where does the trope of “a million books” come from? It originates, as far as I know, with the Universal Library and its Million Books Project, which began in 2001. The Universal Library is directed by Raj Reddy, professor and former Dean of Computer Science at Carnegie Mellon University; the million books project (funded by NSF and others) was a kind of very large pilot, aimed at digitizing a million books (“less than 1% of all books in all languages ever published”), beginning with partners in India and later expanding to China and Egypt. • Google Print (now known as Google Book Search), which had begun in secret in 2002 and was unveiled at the Frankfurt Book Fair in October 2004, and which had Harvard's library as one of its initial partners. Google Books aims to scan as many as 30 million books, a number equal to all the titles in WorldCat, and for all we know, they are already about halfway there. • “What do you do with a million books?”, a question first asked, I think, by Greg Crane, in D-Lib Magazine, in March of 2006. (http://people.brandeis.edu/~unsworth/hownot2read.html) • My answer to that question is that whatever you do, you don't read them, because you can't. • When millions of books are equally at your fingertips, all eagerly responding to your Google Book Search: you can no longer as easily ignore the books you don't know, nor can you grasp the collective systems they make up without some new strategy: a strategy for not reading. • Tanya Clement, Sara Steger, John Unsworth, Kirsten Uszkalo (2008). "How Not to Read a Million Books" http://people.brandeis.edu/~unsworth/hownot2read.html • Franco Moretti: “distant reading” (contrast with “close reading”)
  • 34. Culturomics • "Aiden and Michel are the founders of a field they call “culturomics,” in which quantitative analysis is performed on digitized texts to generate empirical data about historical, cultural, and linguistic trends." "Both Aiden and Michel have backgrounds in biology, and [their writing] reveals the extent to which that disciplinary sensibility fed into the creation of culturomics." Mark O'Connell (2014). "Bright Lights, Big Data". In New Yorker (Mar 20, 2014). Online at http://www.newyorker.com/books/page-turner/bright-lights-big-data
  • 35. Egnal, Marc (2013). "Evolution of the Novel in the United States: The Statistical Evidence". In Social Science History 37:2 pp. 231-254 • The Google Ngram viewer: https://books.google.com/ngrams
  • 36. Model behind the NGram • Words in texts: words are strings of letters; same string, same word • Words are collected into documents (books), each published on a particular date • Critique from Humanities reviewers: • Different genres of books make a difference • Different periods had different genres more prominent • What books have been preserved up to today? • Words change their meaning, e.g. “spiritual” • Was the Ngram model too simple?
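The model described here (a word is a string of letters, identical strings count as the same word, and each document carries a publication date) is simple enough to sketch directly. The following is an illustration under stated assumptions, not Google's implementation: the three "documents" are invented, and `tokenize` and `yearly_frequency` are hypothetical helper names.

```python
import re
from collections import Counter, defaultdict

# Invented miniature corpus: (publication year, text) pairs.
corpus = [
    (1850, "a spiritual life and a spiritual duty"),
    (1850, "the duty of a citizen"),
    (1950, "the spiritual side of life"),
]


def tokenize(text):
    # The model's notion of a word: a lower-cased run of letters.
    return re.findall(r"[a-z]+", text.lower())


def yearly_frequency(corpus, word):
    # Share of all word tokens published in each year that match the string.
    counts = defaultdict(Counter)
    for year, text in corpus:
        counts[year].update(tokenize(text))
    return {
        year: counter[word] / sum(counter.values())
        for year, counter in sorted(counts.items())
    }


print(yearly_frequency(corpus, "spiritual"))
```

The sketch also makes the reviewers' critique visible: genre, survival of books, and shifts in what “spiritual” meant appear nowhere in the data structure, so the counts cannot reflect them.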
  • 37. Big Data and Unstructured text: the example of Topic Modelling • There is a vast amount of digital text, almost all of it with minimal semantic structure • CHAPTER I. Down the Rabbit-Hole Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, 'and what is the use of a book,' thought Alice 'without pictures or conversations?' So she was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid), whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.
  • 38. "Topic Model" (Wikipedia) • [A] topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. Intuitively, given that a document is about a particular topic, one would expect particular words to appear in the document more or less frequently: "dog" and "bone" will appear more often in documents about dogs, "cat" and "meow" will appear in documents about cats, and "the" and "is" will appear equally in both. A document typically concerns multiple topics in different proportions; thus, in a document that is 10% about cats and 90% about dogs, there would probably be about 9 times more dog words than cat words. A topic model captures this intuition in a mathematical framework, which allows examining a set of documents and discovering, based on the statistics of the words in each, what the topics might be and what each document's balance of topics is.
  • 39. Topic Modelling: "Bag of Words" • "In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and even word order but keeping multiplicity." • The idea is that the selection of words in a document relates to what the document is about.
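The bag-of-words representation can be sketched with Python's `collections.Counter`, which is a multiset. This is a minimal illustration: the two example sentences are invented, and `bag_of_words` is a hypothetical helper rather than part of any topic-modelling tool.

```python
import re
from collections import Counter


def bag_of_words(text):
    # Keep each word and how often it occurs; discard grammar and word order.
    return Counter(re.findall(r"[a-z']+", text.lower()))


first = bag_of_words("The dog chased the cat")
second = bag_of_words("The cat chased the dog")

print(first)
print(first == second)
```

The two sentences mean different things, yet their bags are identical: that is exactly the information the model chooses to throw away, while the count of 2 for "the" shows the multiplicity it keeps.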
  • 40. Topic Modelling: Martha Ballard's Diary • Cameron Blevins: http://historying.org/2010/04/01/topic-modeling-martha-ballards-diary/ • "In A Midwife’s Tale, Laurel Ulrich describes the challenge of analyzing Martha Ballard’s exhaustive diary, which records daily entries over the course of 27 years" • “The problem is not that the diary is trivial but that it introduces more stories than can be easily recovered and absorbed.” • Each diary entry is somewhat independent, and often focuses on something that happened that day: terrible weather, or a beautiful birth, etc. • "MALLET allows you to feed in a series of text files, which the machine will then process and generate a user-specified number of word clusters it thinks are related topics."
  • 41. Topic Modelling Martha Ballard's Diary: Mallet's Topics • birth deld safe morn receivd calld left cleverly pm labour fine reward arivd infant expected recd shee born patient • meeting attended afternoon reverend worship foren mr famely performd vers attend public supper st service lecture discoarst administred supt • day yesterday informd morn years death ye hear expired expird weak dead las past heard days drowned departed evinn • gardin sett worked clear beens corn warm planted matters cucumbers gatherd potatoes plants ou sowd door squash wed seeds • lb made brot bot tea butter sugar carried oz chees pork candles wheat store pr beef spirit churnd flower • unwell mr sick gave dr rainy easier care head neighbor feet relief made throat poorly takeing medisin ts stomach • "MALLET is completely unconcerned with the meaning of a word (which is fortunate, given the difficulty of teaching a computer that, in this text, discoarst actually means discoursed). Instead, the program is only concerned with how the words are used in the text, and specifically what words tend to be used similarly."
  • 42. Topic Modelling Martha Ballard's Diary: Mallet's Topics • MIDWIFERY: birth deld safe morn receivd calld left cleverly pm labour fine reward arivd infant expected recd shee born patient • CHURCH: meeting attended afternoon reverend worship foren mr famely performd vers attend public supper st service lecture discoarst administred supt • DEATH: day yesterday informd morn years death ye hear expired expird weak dead las past heard days drowned departed evinn • GARDENING: gardin sett worked clear beens corn warm planted matters cucumbers gatherd potatoes plants ou sowd door squash wed seeds • SHOPPING: lb made brot bot tea butter sugar carried oz chees pork candles wheat store pr beef spirit churnd flower • ILLNESS: unwell mr sick gave dr rainy easier care head neighbor feet relief made throat poorly takeing medisin ts stomach
  • 43. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools Traditional non-digital methods applied to digital things New digital methods applied both to new digital things, but also against non-digital subjects. -- the issue of models and modelling
  • 44. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools Traditional non-digital methods applied to digital things New digital methods applied both to new digital things, but also against non-digital subjects. -- the issue of models and modelling Making Digital Objects Using Digital Tools
  • 45. Humanities research with digital humanities techniques (such as those from Big Data) in the process Humanities Research: Modelling Capture Presentation Source Source Second’ry Sources Research Process Primary Sources e.g.: • Marked up Text (TEI) • Database
  • 47. Our Formal Models are not like this!
  • 48. “Knowledge Representation” • A knowledge representation (KR) is most fundamentally a surrogate, a substitute for the thing itself, used to enable an entity to determine consequences by thinking rather than acting, i.e., by reasoning about the world rather than taking action in it. […] It is a set of ontological commitments, i.e., an answer to the question: In what terms should I think about the world? (Davis, Shrobe, Szolovits 1993) • "In terms of humanities computing, modelling is an iterative process of constructing and developing something like a computational 'knowledge representation' as this is defined in computer science. In fact we might say that a model is a manipulable knowledge representation.” • Willard McCarty 2002. “Humanities Computing: Essential Problems, Experimental Practice” in Literary and Linguistic Computing Vol 17 No 1. pp.103-125
  • 49. Imaginary Museum: A basic design • Examples of “Access points”: • Find me all pictures and holding museums for Ihei Kimura • Find me all pictures held by the Bibliothèque Nationale in Paris • Find me all pictures taken by women photographers after the 2nd World War • [Diagram relations: Took, Holds]
  • 50. Modelling understanding: “The Art of making in Antiquity” project “Engaging the archive’s compiler [Rockwell] in this way will give a unique angle to the metadata, making timely use of his position as a leading authority on stoneworking and a sculptor of long standing.” http://www.artofmaking.ac.uk/
  • 51. A model of Rockwell’s interpretation
  • 52. Art of Making: Exploring the Tools
  • 53. Art of Making: The “Tooth Chisel”
  • 54. Art of Making: a particular photograph
  • 55. “Prosopography” • “[the word] was missing from the main text of the two-volume Shorter Oxford English Dictionary, published in 1973, but there it was in the Addenda, between polythene and profiteroles […]. Wait, though: this is not our prosopography but a neoclassical one, given a first attestation in 1577 and a derivation from an early modern neologism ‘prosopographia’, the description of an individual’s personality and career.” • Janet L. Nelson, David Pelteret and Harold Short (2003). “Medieval Prosopographies and the Prosopography of Anglo-Saxon England”. Fifty Years of Prosopography. In series Proceedings of the British Academy, Vol 118, pp. 155-167.
  • 56. Prosopography: definition from PASE • What is a Prosopography? • “A particular prosopography aims to amass and present clearly a quantity of information on all individuals in a given category” • (PASE website) • An Historical project. • A published prosopography becomes a reference for other historians to use. It tells them: • Who's who • Something of what is known about them • In what sources this individual appears.
  • 57. How is prosopography carried out? Don't forget that a prosopography is a kind of index: • recording all that is known about people who fit the criteria of the project in the prosopographical categories. • all we know about historical figures is what has survived from their own period • generally, for older periods, these are collections of manuscript documents • information about a single person might be scattered across separate documents. • A Bishop, for example, might have his life story told by more than one author • also, however, his doings might also appear in various legal and church documents that survive • also, in letters he wrote, or others wrote to him.
  • 58. The task of Prosopography • The job, then, is to: • read all the sources • collect information from all categories of interest to the particular prosopography about people • record this information and publish it so that others can use it to: • get a summary of what is known about the person • find where the person is actually described in the extant sources.
  • 59. Traditional prosopography as narrative From J.R. Martindale, The Prosopography of the Later Roman Empire, 3: A.D. 527- 641. Cambridge: Cambridge University Press. 1992. “Text in” and “Text out”?
  • 60. Prosopography of Anglo-Saxon England (PASE) http://www.pase.ac.uk
  • 61. “DDH’s” Structured Prosopographical Projects • Prosopography of the Byzantine Empire • With John Martindale, Editor. First published on CD, now available free online • Prosopography of the Byzantine World • With Michael Jeffreys (KCL), Averil Cameron (Professor of Late Antique and Byzantine History, U of Oxford), Charlotte Roueché (Dept of Byzantine and Modern Greek Studies, KCL): latest update 2011 • Clergy of the Church of England Database • With Arthur Burns (History, KCL), Kenneth Fincham (History, U of Kent at Canterbury), Stephen Taylor (History, U of Reading): latest update 2014 • Prosopography of Anglo-Saxon England • With Janet Nelson and Stephen Baxter (History, KCL), Simon Keynes (Department of Anglo-Saxon, Norse, and Celtic, U of Cambridge): latest update 2011 • Paradox/Peoples of Medieval Scotland • With Prof Dauvit Broun (University of Glasgow), Prof Roibeard Ó Maolalaigh (Glasgow), Prof David Carpenter (KCL), Dr Matthew Hammond (University of Edinburgh): last update Sept 2012 • Breaking of Britain • With Prof Dauvit Broun (University of Glasgow), Prof David Carpenter (KCL), Dr Matthew Hammond (University of Edinburgh), Prof Keith Stringer (Univ of Lancaster) • Making of Charlemagne’s Europe • With Alice Rio (KCL, History). Project finished 2015 • Digitising the Prosopographies of the Roman Republic • With Henrik Mouritsen and Dominic Rathbone (KCL, Classics): beginning late 2013
  • 62. Traditional Prosopography: in print form • [Figure labels: Sources, People, Places] • From J.R. Martindale, The Prosopography of the Later Roman Empire, 3: A.D. 527-641. Cambridge: Cambridge University Press. 1992.
  • 63. Structured Data for prosopography: Many entrances Source Person Place Office Date Source Source Event Prosopographical Database
  • 64. What brings these together? Offices, Posts Institutions ?
  • 65. The ‘factoid model’ Factoid: a spot in a source that says something about a person or persons. http://factoid-dighum.kcl.ac.uk/
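Taking the slide's definition at face value (a factoid is a spot in a source that says something about a person), one possible data structure is sketched below. It is an assumption-laden illustration, not the schema documented at factoid-dighum.kcl.ac.uk: the `Factoid` fields, the `index_by` helper, and all three records are hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Factoid:
    source: str     # the source in which the assertion is found
    locus: str      # the spot within that source
    person: str     # the person the assertion is about
    kind: str       # category of assertion, e.g. office or event
    statement: str  # what the source says


# Placeholder records, invented for illustration only.
factoids = [
    Factoid("Source A", "ch. 3", "Person 1", "office", "is described as bishop"),
    Factoid("Source B", "fol. 12", "Person 1", "event", "witnesses a charter"),
    Factoid("Source B", "fol. 12", "Person 2", "event", "witnesses a charter"),
]


def index_by(records, field):
    # Build one "entrance" into the same data: group factoids by any field.
    index = defaultdict(list)
    for record in records:
        index[getattr(record, field)].append(record)
    return dict(index)


by_person = index_by(factoids, "person")
by_source = index_by(factoids, "source")
print(sorted(by_person))
print(sorted(by_source))
```

Because every factoid names both its source and its person, the same records can be entered by person, by source, or by kind of assertion, which is one reading of the "many entrances" a structured prosopography offers.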
  • 66. Some sizes for KCL’s structured Prosopographies • PASE: sources: 2,784 (including Domesday Book), people: 19,807 (including 978 women), factoids: 282,026 • PoMS: sources: 9,259 (mainly charters), people: 21,311, factoids: 87,956 • CCE: people (clergy): 158,263, sources: 2,987 (admin sources), factoids: 931,636 (resolved), over 2M in total
  • 68. PASE: Acts of Fasting/Resisting Temptation
  • 69. Modelling: Appropriate to the Humanities? • “humanistic inquiry reveals itself as an activity fundamentally dependent upon the location of pattern.” • “Of all the technologies in use among computing humanists, databases are perhaps the best suited to facilitating and exploiting [pattern].” • “To build a database one must be willing to move from the forest to the trees and back again; to use a database is to reap the benefits of the enhanced vision which the system affords.” • (from Ramsay, “Databases” in A Companion to Digital Humanities)
  • 70. Appropriate to the Humanities? • the underlying ontology [that a database represents] has considerable intellectual value. • A well-designed database that contains information about people, buildings, and events in New York City contains not static information, but an entire set of ontological relations capable of generating statements about a domain. • A truly relational database, in other words, contains not merely "Central Park", "Frederick Law Olmstead", and "1857", but a far more suggestive string of logical relationships (e.g., "Frederick Law Olmstead submitted his design for Central Park in New York during 1857"). • (from Ramsay, “Databases” in A Companion to Digital Humanities)
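Ramsay's example of a database generating a statement rather than merely storing values can be sketched with SQLite. The schema below (tables `person`, `place`, and `event`, and the `statements` helper) is a hypothetical design chosen for illustration; only the three values and the resulting sentence come from the quotation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE event (
    person_id INTEGER REFERENCES person(id),
    place_id INTEGER REFERENCES place(id),
    action TEXT,
    year INTEGER
);
INSERT INTO person VALUES (1, 'Frederick Law Olmstead');
INSERT INTO place VALUES (1, 'Central Park', 'New York');
INSERT INTO event VALUES (1, 1, 'submitted his design for', 1857);
""")


def statements(connection):
    # Follow the relations between the tables and read each row as a sentence.
    rows = connection.execute(
        """SELECT person.name, event.action, place.name, place.city, event.year
           FROM event
           JOIN person ON person.id = event.person_id
           JOIN place ON place.id = event.place_id
           ORDER BY event.year"""
    )
    return [
        f"{person} {action} {place} in {city} during {year}"
        for person, action, place, city, year in rows
    ]


for sentence in statements(conn):
    print(sentence)
```

The name, the park, and the year sit in separate tables; it is the `event` row relating them that carries the assertion, which is the sense in which the relations, not the values, hold the intellectual content.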
  • 71. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools Traditional non-digital methods applied to digital things New digital methods applied both to new digital things, but also against non-digital subjects. -- the issue of models and modelling Making Digital Objects Using Digital Tools
  • 72. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools Traditional non-digital methods applied to digital things New digital methods applied both to new digital things, but also against non-digital subjects. -- the issue of models and modelling Making Digital Objects Using Digital Tools
  • 73. The “world of code” • “We live in worlds increasingly interwoven with code. Code puts into operation the apps and communication channels we increasingly depend on in everyday life; it twitters away insistently in our pockets in our portable devices; it channels the personal data, interests, and relationships that we include in our social network profiles; and it aggregates us into vast database architectures, powered by database management techniques, as a million little informational bits, in order to offer us new services, recommendations, and experiences. Without really thinking about it, we are spending vast amounts of our time doing stuff with code.” http://dmlcentral.net/blog/ben-williamson/coded-curriculum-new-architectures-learning
  • 74. App Inventor: “Anyone can build Apps that impact the world” (!!) http://appinventor.mit.edu/explore/
  • 75. “Telescopes for the Mind” • "Digital artifacts like tools could then be considered as "telescopes for the mind" that show us something in a new light" (Ramsay and Rockwell 2012, p 79) • An “app” can be a new tool that, like telescopes, allows us to see material we point it at in a new way that can transform our understanding of the material, as when Galileo pointed the telescope at the moon.
  • 76. New insights through “making things” • It may be that my personal history and training exert a prejudicial influence that limits the appeal of how I think about digital humanities. Perhaps that history and training explain why in reading Tim Ingold's illuminating book, Making: Anthropology, archaeology, art and architecture (2013), I am drawn eagerly to the pedagogical expressions of his anthropology in such class exercises as weaving baskets and see in them (changing what needs to be changed) a model for training digital humanists. • The link between baskets and computing was made explicit to me this morning by the announcement of the Rare Book Summer School (Humanist 30.742), which quoted a former student as saying, "I will never look at a book -- any book -- the same way again." (McCarty 2017, Humanist Vol. 30, No. 746)
  • 77. The TACT KWIC Display
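For readers who have not seen one, a KWIC ("keyword in context") display lists every occurrence of a word with a fixed amount of text on either side, aligned on the keyword. The sketch below illustrates the general idea only; it is not TACT's implementation, and the function name, the `width` parameter, and the sample sentence are all assumptions made for this example.

```python
import re

def kwic(text, keyword, width=30):
    """Return one aligned line per occurrence of `keyword`, with `width` characters of context."""
    text = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = r"\b" + re.escape(keyword) + r"\b"
    lines = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so every keyword starts in the same column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the rug"  # made-up sample text
    for line in kwic(sample, "sat", width=10):
        print(line)
```

Aligning the keyword in one column is what lets the eye scan down the display and compare the contexts in which a word occurs.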
  • 79. Text Analysis Tool: Voyant Moby Dick in Voyant
  • 80. Pliny: what is it? • it is a thought-piece: • perhaps wrongheaded in various ways • … although a couple of years was spent on research into what Humanities scholarship was like before Pliny was built • Pliny is meant to promote discussion within the DH about this area. • http://pliny.cch.kcl.ac.uk • The object of study for me in Pliny was the traditional methods of Humanities Research itself.
  • 81. Pliny and software tool collaboration: significance of annotation From show about damaged books (!) at Cambridge University Library
  • 82. The page is at the “nexus” Publishing Application •Preparing text •book design and presentation •Printing •Distribution •The printing press The page is the nexus between publishing and annotation Annotation Application •Support dynamic text •Support use of annotations •The pen
  • 83. The screen as the “nexus” PDF Viewer •Reading PDF file •Layout on the screen •Supporting page turning, etc Pliny •Support display of annotations •Manage notes and anchors •Support work with notes
  • 84. Humanities research as process, and its research output • [Diagram labels: Humanities Research as process; Primary Sources; Secondary Sources; Annotation & notetaking; Emerging Ideas; Research Process; Research Output] • Holmer, Joan Ozark (1994). “‘Draw, if you be Men’: Saviolo’s Significance for Romeo and Juliet”. In Shakespeare Quarterly, Vol. 45 No 2 (Summer 1994). pp. 163-189
  • 85. Pliny objects as a connected graph: a “Mind Map” • An example of a mindmap • Graham Burnett (2005) • http://en.wikipedia.org/wiki/File:Mindmap.gif
  • 86. • "To ask whether coding is a scholarly act is like asking whether writing is a scholarly act." (Ramsay and Rockwell 2012, p 82) • "A tool is, so to speak, an objectified idea, a theorem whose force is imposed on its consequences. It is thought in action and must not be conceived in terms of the crudities of early materialism." (p 332)
  • 87. Digital Humanities as the “big tent”
  • 88. Digital Humanities as the “big tent” • Traditional Scholarship about digital things in society • Data Analysis such as big data using digital tools • Data Representation using digital tools • Making Digital Tools • Textual Markup • Wearable Digital Objects • Information studies for Humanities
  • 89.
  • 90. Digital Humanities: the “big tent” (?)
  • 92. “Our Data Ourselves” Do you know how much data you generate? Do you know where it goes? Is it being sold by Google to increase their market share, now valued at £250 billion? Is it being ‘scooped’ by the GCHQ or NSA? The Snowden revelations show that security agencies are surreptitiously taking data from our phones, cameras, apps, and anything else we use that leaves a digital trace. Especially if you use a mobile device, you are actively contributing to the 2.5 billion gigabytes of data being generated daily. To put that in perspective, this would be enough data to fill one hundred million iPhones, every day. Yet, public understanding of our information-rich environment and our quantified selves remains underdeveloped. ‘Our Data Ourselves’ is a research project we lead at King’s College London, examining the data we generate on our mobile devices. We have brought together media and cultural theorists, computer scientists, programmers and youth in a unique project exploring this ‘big social data’ (BSD) we generate. As arts and humanities researchers we are interested in the development and transfer of the technical skills and knowledge necessary for the capture and analysis of the different forms of BSD, and its transformation into community research data. This desire to create community research data signals broader social relevance and potential. Our project, therefore, asks whether BSD can be transformed into a public asset and become a creative resource for cultural and economic community development.
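The two figures quoted on this slide can be checked against each other with a line of arithmetic. The sketch below uses only the numbers from the quotation; the function name is hypothetical, and the result is simply the per-phone storage that the comparison implies, not a claim about any particular iPhone model.

```python
DAILY_DATA_GB = 2_500_000_000  # "2.5 billion gigabytes" generated daily (from the quotation)
IPHONES_FILLED = 100_000_000   # "one hundred million iPhones" (from the quotation)

def implied_capacity_gb(total_gb, devices):
    """Storage per device implied by spreading total_gb evenly over `devices` devices."""
    return total_gb / devices

if __name__ == "__main__":
    print(implied_capacity_gb(DAILY_DATA_GB, IPHONES_FILLED))  # 25.0
```

So the comparison assumes roughly 25 gigabytes of storage per phone.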
  • 93. Partitioning the Digital Humanities • Traditional Scholarship about digital things in society • Data Analysis using digital tools • Data Representation using digital tools • Making Digital Tools • Digital Technology as active partner in the work • “Modelling and Models”
  • 94. The World's Technological Capacity to Store, Communicate, and Compute Information • Martin Hilbert and Priscila López (2011). "The World’s Technological Capacity to Store, Communicate, and Compute Information", in Science 332:6025 (April 2011) pp. 60-65. DOI: 10.1126/science.1200970 • Fig 2. World’s technological installed capacity to store information • 10 times
  • 95. Moretti: Graphs, maps, trees Distant Reading • "essay on literary history" • "instead of concrete individual works [..] the text undergoes a process of deliberate reduction and abstraction" • Distant Reading: "a specific form of knowledge: fewer elements, hence a sharper sense of their overall interconnection".
  • 96. Moretti: Graphs, maps, trees Distant Reading • Moretti claims that traditional history focused on rare or curious objects and, therefore, on prominent people and their doings. The Annales school looked at society at large and, with more data, used statistical methods ("social history"). • "What would happen if literary historians, too, decided to 'shift their gaze [...] from the extraordinary to the everyday, from exceptional events to the large mass of facts'?" • "A more rational literary history. That is the idea." (p. 4) • The Annales School: "The school has been highly influential in setting the agenda for historiography in France and numerous other countries, especially regarding the use of social scientific methods by historians, emphasis on social rather than political or diplomatic themes, and for being generally hostile to the class analysis of Marxist historiography."
  • 97. Moretti: The rise of the novel • Moretti claims that work like his is "truly cooperative" (drawing on work from other projects) and could be combined in more than one way. • Moretti talks about graphs of the rise of the novel: similar shapes for five countries, three continents, over two centuries ... the same old metaphor
  • 98. Reaction to Distant Reading • Kathryn Schulz (2011) What Is Distant Reading? • New York Times Sunday Book Review June 24, 2011. • http://www.nytimes.com/2011/06/26/books/review/the-mechanic-muse-what-is-distant-reading.html • "Let’s say you pick up a copy of “Jude the Obscure,” become obsessed with Victorian fiction and somehow manage to make your way through all 200-odd books generally considered part of that canon. Moretti would say: So what? As many as 60,000 other novels were published in 19th-century England — to mention nothing of other times and places. You might know your George Eliot from your George Meredith, but you won’t have learned anything meaningful about literature, because your sample size is absurdly small. Since no feasible amount of reading can fix that, what’s called for is a change not in scale but in strategy. To understand literature, Moretti argues, we must stop reading books."
  • 99. The "Canon" • "What makes the history of music, or of any art, particularly troublesome is that what is most exceptional, not what is most usual, has often the greatest claim on our interest." • "... It is only in the works of Haydn, Mozart, and Beethoven that all the contemporary elements of musical style -- rhythmic, harmonic, and melodic -- work coherently together, or that the ideals of the period are realized on a level of any complexity." (pp 21-22) • Charles Rosen, "The Classical Style"
  • 100. Hype? • 'For those who think in next-new-thing terms, "Big Data" seems to fill the role not of an enormously challenging research problem but of a solution, a t-shirt slogan, a rah-rah rhetorical flourish, i.e. it amounts to boondoggly hype.' • HUMANIST Sun, Oct 26, 2014, Willard McCarty
  • 101. Modelling, and Patterns • "In representing the past, we seek perspective, the point of view that allows us to discern patterns among the events that have occurred. We are not so much trying to transmit accumulated knowledge – culture and tradition do this, among other means – as to understand the significance of our experience." • Bodenhamer 2008 p. 222
  • 102. Data, Structure, Interpretation • "There can be no data without structure, and all structure is interface, whether we view it as a screen appearance or not. [...] Even more importantly, all interfaces—visible as well as invisible—are interpretational forms." • (McGann, Jerome (2010). "Sustainability: The Elephant in the Room". In Online Humanities Scholarship: The Shape of Things to Come. Houston Texas: Rice University Press) http://rup.rice.edu/cnx_content/shape/m34305.html
  • 103. Analytical Modelling: the utility of failure • "the digital model illumines analytically by isolating what would not compute. In other words, the failures of analytic modelling are where its success is to be found." • Willard McCarty (2008). “What’s going on?” in Literary and Linguistic Computing, Vol 23 No 3. p. 256
  • 104. ENIAC ENIAC (/ˈini.æk/ or /ˈɛni.æk/; Electronic Numerical Integrator And Computer) was the first electronic general-purpose computer. It was [...] capable of being reprogrammed to solve "a large class of numerical problems". ENIAC was initially designed to calculate artillery firing tables for the United States Army's Ballistic Research Laboratory.
  • 105. ENIAC Programming the ENIAC "ENIAC could be programmed to perform complex sequences of operations, including loops, branches, and subroutines. The task of taking a problem and mapping it onto the machine was complex, and usually took weeks. After the program was figured out on paper, the process of getting the program into ENIAC by manipulating its switches and cables could take days."
  • 106. Von Neumann Machines • Von Neumann Architecture • Stored-Program System • A stored-program digital computer is one that keeps its program instructions, as well as its data, in read-write, random-access memory (RAM)
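The stored-program idea can be made concrete with a toy machine in which one list serves as memory for both the instructions and the numbers they operate on. This is a minimal sketch for illustration only: the four-instruction set (LOAD, ADD, STORE, HALT) and the memory layout are invented here and do not correspond to any real processor.

```python
def run(memory):
    """Execute the program held in `memory`, a single list containing instructions and data."""
    acc = 0  # accumulator: the machine's one working register
    pc = 0   # program counter: address of the next instruction to fetch
    while True:
        op, arg = memory[pc]  # fetch the instruction from the same memory that holds the data
        pc += 1
        if op == "LOAD":      # copy a memory cell into the accumulator
            acc = memory[arg]
        elif op == "ADD":     # add a memory cell to the accumulator
            acc += memory[arg]
        elif op == "STORE":   # write the accumulator back to a memory cell
            memory[arg] = acc
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction: {op!r}")

if __name__ == "__main__":
    # Cells 0-3 hold the program; cells 4-6 hold its data.
    memory = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 2, 3, 0]
    print(run(memory)[6])  # 5
```

Changing what the machine does means writing different values into memory, in contrast to the days of manipulating switches and cables described for ENIAC on the previous slide.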
  • 107. Inside the phone (or computer...) • [Diagram labels: Long-term Storage; Screen; Touch Surface; Speaker; Microphone; Camera; Wireless; A Von Neumann Machine]
  • 108. Inside the phone (or computer...) • [Diagram labels: Long-term Storage; Screen; Touch Surface; Speaker; Microphone; Camera; Wireless; A Von Neumann Machine; Software; Data] •Software for devices (device drivers) •Software for services (Operating System) •Software for functionality (application software: apps, etc)
  • 109. Operating System Services • answer = raw_input("What is your name?") (Python 2; the equivalent call in Python 3 is input()) • [Diagram labels: Operating System Software; Widget Services; Text Services]
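What the one-line prompt on this slide relies on can be sketched by doing the same job with the lower-level read and write calls that the operating system exposes through Python's os module. This is an illustration of the layering only, not a description of how Python's own input function is implemented; the function name `ask` and its parameters are hypothetical.

```python
import os

def ask(prompt, in_fd=0, out_fd=1):
    """Show a prompt and read one line, using operating-system file-descriptor calls.

    By convention descriptor 0 is standard input (the keyboard) and
    descriptor 1 is standard output (the screen).
    """
    os.write(out_fd, prompt.encode("utf-8"))  # ask the OS to display the prompt text
    received = []
    while True:
        byte = os.read(in_fd, 1)              # ask the OS for the next byte typed
        if not byte or byte == b"\n":         # stop at end of input or end of line
            break
        received.append(byte)
    return b"".join(received).decode("utf-8").rstrip("\r")

# Interactive use, in Python 3:
# answer = ask("What is your name? ")
```

The application never touches the keyboard or screen hardware itself; it hands bytes to the operating system, which passes them on to the device drivers.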