Interrogating the Politics and Performativity of Web Archiving
1. Interrogating the Politics and
Performativity of Web Archiving
— Jessica Ogden —
@jessogden | jessica.ogden@soton.ac.uk
Joint Conference on Digital Libraries 2016: Doctoral Consortium | Rutgers University | 19 June 2016
2. Outline
• Background: my academic training and
where we are to date
• Research context
• Theoretical framework
• Problematising web archiving, some
questions
• Proposed methodology
• Future work
3. Preface: Perspectives
Web Archiving
Anthropology
• culturally constituted
meanings and socially
organised practices
• recording practice
stories
Archaeology
• materiality of tech artefacts
• data repository management
• digital tools for cultural
heritage recording
Web Science
• socio-technical systems
• repository software development
• critical information studies
Archives
• appraisal practice
• digital record management
4. Research Context
• Defining web archives… turns out this is pretty hard to do.
International Organisation for Standardisation (Sept. 2012)
5. Research Context
• Defining web archives… turns out this is pretty hard to do.
Dougherty et al. (2010)
6. Research Context
• Defining web archives… turns out this is pretty hard to do.
Dougherty et al. (2010)
7. Research Context
• Why do these definitions matter?
• Divide within the practice literature between web archives
and ‘social media archives’ - in terms of standards, ‘best
practice’ and the mitigating factors that affect collection
• Assumptions about what web archives are -> reflective of
understanding of what the Web is? (Web 1.0/2.0?)
• Intentionality behind collection seemingly an important
factor - explicitly gives weight to selection practices
8. Research Context
• Existing surveys of web archival practice - some key points
• Gomes et al. (2011) - global view on ‘who is archiving’,
migrated to Wikipedia page which is continuously updated.
Some discussion of ‘what’ is being archived.
• Bailey et al. (2012/2014) - presence/absence of collection +
access policies, flags up increased financial commitment to
web archiving, but continued concern about ‘scoping collections’
• Truman (2016) - community needs to ‘radically increase
communication and collaboration’ and ‘build local capacity’
9. Research Context - motivations for Web Archiving
Legislation
• Public record retention
• Non-legal deposit laws
• Regulatory compliance
Digital heritage
(+ Knowledge)
• Advocacy for ‘collective
web heritage’, historical
view of digital cultural
production through time
• Preservation/provision of
universal knowledge
• User rights/advocacy for
user-generated content
Academic
• Long term preservation
of scholarly citations/
web outputs
• Support web-based
research
Selection practices?
10. Research Context - challenges
Tools + Technologies
• Pace of web development
• Path dependencies in
harvesting/record
management tools
• ‘Data access regimes’ and
platform APIs
Defining the Object
• Understanding boundaries of
selection/collection objects
• Accounting for epistemological/
ontological perspectives
Legal + Ethical
• Copyright restrictions
• Ethical implications for
collection with/out
consent
• Representativeness
(geographic, language,
ethnic) of a global Web
Selection practices are clearly affected
by these, but let’s take a step back
11. Theoretical Framework: Postmodernism + Questioning Archives
• Archives are not just sites of knowledge retrieval, but deeply reflective
of/implicated in the production of knowledge
• Objective role of archivists VS valuing interpretive role of archivists
in the construction of social memory
• Selection practices are unavoidably exclusionary - certain narratives
are privileged, others marginalised
• Critical engagement with archival practice as a set of ‘professional
decisions’ enables reflection on the operation of power in archival
maintenance
12. Theoretical Framework: Performative (Web) Archives
• ‘Naturalisation of practice’ - Butler’s (1996) ‘social magic,’ where
repeated action -> patterns of behaviour and belief, shifts the
locus away from a subject that constitutes an action (the
performative) to a subject constituted by that action
• Performativity applied as critique of ‘archival science’ by Schwartz and
Cook (2002)
• Role of materiality - ‘generative capabilities’ of technologically-enabled
data, information, and knowledge which are in an ‘eternal
process of becoming’ (Pinch and Henry 1999; Waterton 2010)
13. Problem statement/Thesis
1. Web archiving is performative
Brügger, N. (2012)
14. Problem statement/Thesis
1. Web archiving is performative
• Web archives are created through ‘the doing’ of web
archiving. They are contingent on the multitude of
socio-technical, economic, legal and ethical factors that affect their
creation
15. Problem statement/Thesis
2. Web archiving as a (new?) form of knowledge production
• Important to understand the selection/appraisal practices
of web archiving to understand the ways in which web archives
enhance/constrain knowledge about the Web
• Insufficient literature exists detailing selection/appraisal
practices of web archivists - and specifically how these are
similar/differ across communities
16. Problem statement/Thesis
3. Web archiving/archives as socio-technical assemblages
• Consider how the materiality of technologies (platforms,
tools, interfaces, code, algorithms) is implicated in the
production of archives and its role in the development of
practice
• Role of social identity in web archival practice, as observed
through the routine activities and typical patterns of work
17. (Evolving) Research Questions
• How does web archival practice shape the Web that is
archived?
• What are selection/appraisal practices? In what ways does
practice differ (in rationale, content, methods) across
communities?
• What is the role of materiality - how is web archival practice
‘embedded in physical artefacts, technologies, and ways of
doing things’?
18. Proposed Methodology
• Ethnographic Methods - chosen because they place importance on ‘collecting
information about macro processes […] institutions, patterns, and norms as
well as about people’s feelings, thoughts and experiences’ (O’Reilly 2015)
• Semi-structured interviews, structured (online/offline) observations,
documentary analysis of historical sources - with the goal of developing ‘thick
description’ of web archival practice in different communities
• ‘Focused Ethnography’ - relatively short-term visits, periods of intense data
collection, emphasis on communicative activities
• Drawing on organisation studies, STS (observing socio-tech interactions)
19. Proposed Methodology: Some Objectives
• Establishing access to the ‘backstage’ (Goffman) - whilst creating incentives for
participation and not doing professional harm
• Observing and following technological agents - beyond ‘users using computers’
• Assessing the relationship between discursive practices and actual collection practices
• Documenting the ‘naturalisation of practice’ within/across web archiving
communities
• Translating findings into significance and contributing to the key findings of Truman
(2016) - e.g. leading to collaboration and building local capacity for web archiving
20. Sampling Communities of Practice
• Social + Professional
identities and roles
• Motivations, aims
• Content types, platforms
• Tools + Technologies
used, developed
(University archives,
and smaller, localised
archives)
21. Future Work
• Further fleshing out of methodology - both practically and theoretically
• Preliminary analysis of Archive Team logs and wiki - assessing who
the key players are (+ how these may be used as evidence of discursive
practices and resulting actions) -> approach for contextual interviews
• Setting up access/schedule for case studies, ethics approval and getting
appropriate consent for observations/interviews
• Key: thinking ahead about ways to assess outcomes between
communities of practice - in a structured way
22. References And Credits
Media:
Brewster Kahle: Educause https://youtu.be/fDGKfVJQRkk
Jason Scott: By pinguino k from North
Hollywood, USA
Noun Project:
Envelope by Alexandria Eddings
building by João Proença
Poetry by Cristiano Zoucas
colosseum by Adriano Gazzellini
code by Azis
Twitter by OliM
Compass by gayatri
mouse pointer by Julynn B.
Question by SuperAtic LABS
References:
Bailey, J. et al., 2014. Web Archiving in the United States: A 2013 Survey. The National Digital Stewardship Alliance. Available at: http://www.digitalpreservation.gov/documents/NDSA_USWebArchivingSurvey_2013.pdf [Accessed January 28, 2016].
Brügger, N., 2012. When the Present Web is Later the Past: Web Historiography, Digital History, and Internet Studies. Historical Social Research, 37(4), pp. 102–117.
Butler, J., 1996. Performativity’s Social Magic. In T. R. Schatzki and W. Natter, eds. The Social and Political Body. New York: The Guilford Press.
Dougherty, M., Meyer, E.T., Madsen, C., van den Heuvel, C., Thomas, A. and Wyatt, S., 2010. Researcher Engagement with Web Archives: State of the Art. Technical report. London: JISC.
Gomes, D., Miranda, J. & Costa, M., 2011. A Survey on Web Archiving Initiatives. In S. Gradmann et al., eds. Research and Advanced Technology for Digital Libraries. Lecture Notes in Computer Science. Springer Berlin Heidelberg, pp. 408–420.
O’Reilly, K., 2015. Ethnography: Telling Practice Stories. In Emerging Trends in the Social and Behavioral Sciences. John Wiley & Sons, Inc.
Schwartz, J. M. and Cook, T., 2002. Archives, Records, and Power: The Making of Modern Memory. Archival Science, 2, pp. 1–19.
Truman, G., 2016. Web Archiving Environmental Scan. Harvard Library Report, Harvard Law, Cambridge.
Waterton, C., 2010. Experimenting with the Archive: STS-ers As Analysts and Co-constructors of Databases and Other Archival Forms. Science, Technology, & Human Values, 35(5), pp. 645–676.
23. Acknowledgements
This Ph.D. is supervised by Professor Susan Halford and Professor Les Carr at the
University of Southampton.
This presentation was enabled by a generous student travel grant provided by the ACM
Special Interest Group on Information Retrieval. Additional funds were provided by
the University of Southampton’s Web Science Centre for Doctoral Training.
This Ph.D. research is funded by the EPSRC Grant Number EP/G036926/1.