Slides from a lecture entitled: 'Personal Digital Archiving: Storing, Organising and Protecting Your Digital Content for the Future', by Sara Day Thomson of the Digital Preservation Coalition. This event was held on 26th September 2019, 6pm, at the Royal Irish Academy. This was the inaugural lecture in a new collaborative series between the Digital Repository of Ireland and the National Archives, Ireland
Sara Day Thomson, 'Personal Digital Archiving: Storing, Organising and Protecting Your Digital Content for the Future'
1. tw: @sdaythomson
Sara Day Thomson
dpconline.org
Personal Digital Archiving
Storing, Organising and Protecting
Your Digital Content for the Future
2. Digital Preservation Coalition
• Membership
Organisation
• 90+ Members
Worldwide
• 6 Supporters
• Events, Training,
Research,
Consultancy,
Guidance,
Publications,
Awards, and
more!
… exists to secure our digital legacy
3. What is Personal Digital Archiving?
Digital Preservation?
Digital Curation?
4. Personal Digital Archiving
Look after records or other
items of importance in order
to preserve a meaningful
representation of an event, or
place, or person
6. Personal Digital Archiving
1s and 0s that need
technology to interpret and
render them, technologies –
hardware and software – that
are vulnerable to obsolescence
7. Personal Digital Archiving
“‘Personal digital archives’ is a
formal term for the ‘digital stuff’ we
create and save every day.”
From the DPC Technology Watch Report Personal Digital
Archiving by Gabriela Redwine
8. Your Digital Stuff
• Financial
spreadsheets
• Bank statements
• Tax records
• Professional CV or
Portfolio
• Emails
• WhatsApp
messages
• Family photos
• Photos on
Instagram or
Facebook
• Wedding video
• Digitized (scanned)
analogue photos
• …
12. Cont’d…
“The meaning of digital files can change
over time. The text message that initially
seems inconsequential may take on vital
significance if it ends up being the last
communication from a loved one.”
From the DPC Technology Watch Report Personal Digital
Archiving by Gabriela Redwine
13. Threats!
• Obsolescence
• Hardware
• Storage Media
• Software
• File Formats
• Service Providers
• Human Error
• Degradation and Bit Rot
• Natural Disaster
14. Basic Steps
• Find all your stuff on different devices
and platforms
• Move it to one place
• Review and select what to keep,
delete duplicates and unimportant
stuff
• Organise it with folders and file names
• Store multiple copies in multiple
places
• Repeat
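The "delete duplicates" step above can be partly automated once everything is in one place. The following is a minimal sketch, assuming that "duplicate" means byte-for-byte identical content; the function name `find_duplicates` is made up for illustration. It reads each file fully into memory, which is fine for documents and photos but not ideal for very large videos.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(folder: Path) -> list[list[str]]:
    """Group files under folder whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            # Identical content gives an identical SHA-256 digest,
            # whatever the file happens to be called.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.relative_to(folder).as_posix())
    # Keep only the digests shared by two or more files.
    return sorted(group for group in by_digest.values() if len(group) > 1)
```

The function only reports groups of identical files. Deciding which copy to keep is still a review step for a person.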
15. Some Guiding Principles
• Make an inventory of what you have
and where
• Archiving is not just keeping, it’s
also deleting
• Printing is not the solution
• ‘Archiving’ an email means ‘hiding’
– it’s not the same thing
• You really do have to do all those
pesky updates. It’s a security issue.
• Act now!
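For the inventory principle above, a spreadsheet can be started automatically for any folder or device you can browse. This is a rough sketch, not a complete tool: the function name `write_inventory` and the column names are assumptions chosen for illustration.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def write_inventory(folder: Path, csv_path: Path) -> int:
    """Write one CSV row per file under folder; return the number of rows."""
    rows = 0
    # Keep the CSV outside the scanned folder so it does not list itself.
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "size_bytes", "last_modified_utc"])
        for path in sorted(folder.rglob("*")):
            if not path.is_file():
                continue
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            writer.writerow([
                path.relative_to(folder).as_posix(),
                info.st_size,
                modified.strftime("%Y%m%d"),  # YYYYMMDD
            ])
            rows += 1
    return rows
```

The resulting CSV opens in any spreadsheet application, where a "location" column (laptop, phone, memory card) can be added by hand.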
16. Outline of Topics
• File Naming & Folder Structure
• Describing Things
• File Formats
• Digital Images & Video
• Email & Calendars
• Web & Social Media Content
• Back-up & Storage
• Copyright & Personal Data
• Further Resources
17. File Naming & Folder Structure
• Use unique and descriptive file names (but keep
it short)
• Avoid the use of spaces and special characters
• Dashes & Underscores instead
• Use the ISO standard format for dates:
YYYYMMDD
• Don’t change the file extension through
renaming
• Use a consistent method for showing versions,
such as: 20190926_ProfessionalCV_v1
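The naming rules above can be applied consistently with a small helper. This is a minimal sketch under stated assumptions: the function name `archive_name` and its parameters are hypothetical, and it handles only unaccented letters and digits.

```python
import re
from datetime import date


def archive_name(title: str, day: date, version: int, extension: str) -> str:
    """Build a name like 20190926_ProfessionalCV_v1.pdf."""
    # Split on anything that is not a letter or digit, so spaces and
    # special characters never reach the file name.
    words = re.findall(r"[A-Za-z0-9]+", title)
    # Capitalise the first letter of each word and keep the rest as typed
    # (so "CV" stays "CV").
    label = "".join(word[0].upper() + word[1:] for word in words)
    stamp = day.strftime("%Y%m%d")  # YYYYMMDD
    # The extension is passed in separately and appended unchanged.
    return f"{stamp}_{label}_v{version}.{extension.lstrip('.')}"


print(archive_name("professional CV", date(2019, 9, 26), 1, "pdf"))
# prints 20190926_ProfessionalCV_v1.pdf
```

Putting the date first means that sorting by name also sorts by date.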
18. Cont’d…
Tip: some devices
automatically assign a
file name, such as
cameras. Take the time
to change the file name
or individual photos may
become impossible to
find!
19. Cont’d…
• Create folders for meaningful categories
• Type: photos, videos, documents, etc
• Function: finances, family holidays, etc
• Create logical folder names
• Would someone else be able to find what
they’re looking for?
• Consider organising by year
20190926_PersonalDigitalArchiving_Dublin
20191004_DigitalPreservationForAll_London
20200214_LoveInTheTimeOfArchives_Verona
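If names start with a YYYYMMDD date as in the examples above, organising by year can be worked out from the name itself. The sketch below only computes where a file would go and does not move anything. The folder names "Archive" and "Undated" are assumptions for illustration.

```python
import re
from pathlib import PurePosixPath

# Matches names that begin with an eight-digit date and an underscore.
DATE_PREFIX = re.compile(r"^(\d{4})(\d{2})(\d{2})_")


def year_folder(file_name: str, root: str = "Archive") -> PurePosixPath:
    """Return root/YYYY/file_name for names starting with YYYYMMDD_."""
    match = DATE_PREFIX.match(file_name)
    # Names without a date prefix go to a holding folder for manual review.
    year = match.group(1) if match else "Undated"
    return PurePosixPath(root) / year / file_name


print(year_folder("20190926_PersonalDigitalArchiving_Dublin"))
# prints Archive/2019/20190926_PersonalDigitalArchiving_Dublin
```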
20. Describe Your Stuff
Metadata: information about
information
• Date of creation
• Creator
• Description identifying
people, places, or location
Some of this information will be
automatically embedded in files.
21. Cont’d…
• Add tags (“embedded metadata”)
• Properties (Windows)
• Get Info (Mac)
• Photo-editing applications usually also allow for
adding embedded information
Photos can’t be searched like text (yet), so to find
images they must be described with textual
metadata.
23. File Formats
• Open: more software programmes will be
able to open
• Non-proprietary: lower risk of format
owner going out of business or ending
support
• Lower risk of getting trapped in cycle of
purchasing updated software
• Uncompressed: bigger files, more
information (e.g. TIFF)
• Compressed: smaller files, less information
(e.g. JP2)
25. Cont’d…
Update files to newer file types
… or to a more stable format
26. Digital Images & Video
Where are your photos & videos?
• Laptop
• Family desktop
• Mobile phone
• Social media platform
• Photo sharing platform
• Memory cards
• CDs?!
• Old digital camera?!?!
• Work computer?!?!?!
27. Digital Images & Video
Digital images and video
take up a lot of space and
require careful planning for
storing and backing up with
extra copies.
28. Email & Calendars
Email
• Make use of labelling and
sorting
• Delete unimportant emails
• Download and save important
attachments
• Outlook and similar mail
services allow export
• Gmail and similar platforms are a bit
trickier…
29. Download Gmail message as .EML file
Step 1: Open relevant email message and
show more options by clicking three dots
beside reply and select ‘Show Original’
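Once a message has been saved as an .EML file, it is a plain-text file that standard tools can read without the original mail service. As a minimal sketch, Python's standard `email` package can pull out the basic details and list any attachments; the function name `summarise_eml` is hypothetical.

```python
from email import message_from_bytes, policy


def summarise_eml(raw: bytes) -> dict:
    """Return sender, subject, date and attachment names from .eml bytes."""
    # policy.default gives the modern EmailMessage interface.
    message = message_from_bytes(raw, policy=policy.default)
    return {
        "from": str(message["From"]),
        "subject": str(message["Subject"]),
        "date": str(message["Date"]),
        "attachments": [
            part.get_filename() for part in message.iter_attachments()
        ],
    }
```

To use it on a saved message, read the file as bytes first, for example with `Path("saved_message.eml").read_bytes()`, where the file name is whatever you chose when saving.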
32. Cont’d…
Calendars
• Use a consistent approach to
creating events and meetings
• Make use of colour-coding
• Outlook allows export
• Google Calendar also supports
export
34. Web & Social Media Content
• Websites
• Blogs
Tip: Take photos and videos with your phone
camera and share to social media rather than
relying on capturing and saving them in the
platform itself. Download photos and videos on
platforms to local storage.
• Social Media Platforms
• Sharing Platforms
Your Web Content
35. Cont’d…
Some approaches to capturing web content
• Webrecorder.io
• HTTrack
• Download data from platform
38. Platform ‘Self-Archiving’
Download your social media
• Service for account holder
– Behind log-in
• Differs from platform to platform
• Some external tools, not built
into the platform, also exist for
downloading social media
39. Google (Takeout): Download Your
Data
Twitter (Settings – Account – Your Twitter
Data): Download your Twitter Data
Facebook (Settings – Your Facebook
Information): Download Your Information
InstaPort (free download):
Download Your Data
WhatsApp: Back-up to Google Drive
40. Back-up & Storage
Storage
• Hard disk
drives
• Desktop
• Laptop
• Tablet
• Mobile
phone
Removable Storage
• External Hard
drives
• Flash drives
• CDs
• DVDs
• Floppy Disks
Cloud Storage
• Dropbox
• Microsoft
OneDrive
• Google Drive
• Amazon
Drive
44. Level-up Your Archiving Geek
Fixity Checking: use a tool to create
unique checksums for your files and
check them over time
Check out:
‘Fixity’ tool
by AVP
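To show what fixity checking involves, here is a minimal sketch using SHA-256 checksums from Python's standard library. It is not how the Fixity tool itself works internally, and the function names (`sha256_of`, `build_manifest`, `changed_files`) are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum one file, reading it in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or whose checksum no longer matches."""
    current = build_manifest(folder)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

The idea is to build the manifest once, store it somewhere safe, and run the comparison again every few months. Any file the comparison lists has been altered, corrupted, or lost, and should be restored from another copy.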
45. Recap
• Keep 2+ copies of all your files
• Keep a copy in a different location (e.g. cloud)
• Don’t use the same storage media for each copy
• Use password protection
For example:
1 copy on laptop
1 copy on external hard drive
1 copy on Dropbox
46. Copyright & Personal Data
Copyright & IPR
• Creator vs. Owner
(social media *cough*)
• Thoughtful re-use
• Use Creative Commons!
47. Cont’d…
GDPR & Personal Data
Personal Data?
• Any identifying information about you
• Much easier to combine data in digital form to
match or find individuals
Who must comply?
• EU Organisations
• Any service, even outside the EU, that holds
personal data of EU citizens
Where to raise issues?
• Complaints to Data Protection Commission
48. Further Resources
DPC Technology Watch
Report
Personal Digital
Archiving
by Gabriela Redwine
http://dx.doi.org/10.7207/twr15-01
51. ‘Personal Digital Archiving Strategies’ by Kari Smith and
Jessica Venlet from MIT Libraries,
https://wayback.archive-it.org/7963/20190802202912/https:/libraries.mit.edu/digital-archives/files/2015/10/2015_pda_handoutdissemination-v3.pdf
‘Personal Archiving’, Digital Preservation blog at the US
Library of Congress
http://digitalpreservation.gov/personalarchiving/
‘Save Your Social Media’ by Sara Day Thomson (2019), DPC
Blog, https://www.dpconline.org/blog/save-your-social-media
‘Personal Archives Accessible in Digital Media (PARADIGM)
Project’ carried out by the University of Oxford and the
University of Manchester, http://www.paradigm.ac.uk/
52. Thank You!
Digital Repository of Ireland and National
Archives!
And a giant thank you to Jenny O’Neill for
generously sharing her personal digital
archiving presentation materials!
54. This presentation is available under the CC
license:
Attribution-NonCommercial-ShareAlike 4.0
International
Editor’s Notes
Archiving Institutions – collect and protect the official historical record
Different types of archiving institutions look after different types of things. For example, the National Archives of Ireland look after the records of the Irish government (as stipulated by the National Archives Act, 1986). The Irish Film Institute Archive looks after Ireland’s moving image heritage, including newsreels, feature films, home movies, and other types of moving image. The Digital Repository of Ireland looks after historical and contemporary Irish cultural heritage materials. The Diageo archives look after the Guinness corporate and heritage archives (e.g. business records as well as vintage advertising campaigns). And so on.
Historically, from an institutional perspective, Archives looked after ‘the documents’ (images, moving image, sounds, ‘ephemera’), while artwork, artefacts (‘things’), or buildings are looked after by museums and historical trusts, and published items like books or journals have been typically looked after by libraries. The difference, generally, has been that the focus of the archive was on original, authentic materials, while the library looks after publications. However, the emergence of digital technologies and the use of digital devices for daily business and social interaction has begun to wear away at those distinctions. What is an original document in the digital age when one file can be viewed by many people in different locations at the same time? What is a publication when online platforms facilitate self-publication and dissemination without going through established distribution channels?
An archive keeps things because they – the individuals who work there, based on a set of criteria and ethical principles – have decided they have value. That they mean something to the community they serve.
Is this unproblematic? No. How do you represent everyone? But - keeping everything is not an option.
Some groups are interested in grassroots archiving: putting archiving tools and know-how into the hands of individuals within communities, specifically those marginalised communities who are under-represented or unrepresented in institutional archives. These are just two examples of groups who challenge institutions to acknowledge bias in the process of selecting content for the archive and to reform institutional practices. The aim is not to usurp the voices of marginalised communities on the fringes of society, but to invite and support them to provide their own records and their own histories.
‘Who you are, informs what you know.’
AAHRI: https://www.archivistsagainst.org/
Documenting the Now: https://www.docnow.io
All digital information is made of binary - 1s and 0s - that need technology to interpret and render it. These technologies – hardware and software – are vulnerable to obsolescence that will make digital information inaccessible.
Unlike analogue materials, such as print and film, which are:
Traditionally quite robust
Tangible, we can hold them in our hands
Generally independently understandable (if you speak the language they are written in…..)
Digital materials, on the other hand, are
Very susceptible to obsolescence as they are entirely dependent on the media they are stored on, the accessibility of their file format and often require more information – such as how they were created or what software to use to open them - to use and understand them
Challenging to rights management - protecting copyright is more difficult and ensuring personal data is protected in some cases is almost impossible
But digital content also brings a whole host of new benefits, in particular the ability to make content accessible to users, including remotely.
As part of a recent application for permanent settlement in the UK, I had to provide information about every time I had been absent from the UK since moving there 9 years before, including leaving date and returning date. I travel a lot, so this was no small task, and it required searching through my own personal digital archives.
Personal Digital archives I referred to:
Work Outlook calendar
Work Outlook Email
Personal Google Calendar
Personal Gmail
From about April 2014 backwards, entries on my calendar were scarce.
Some emails to friends stated one return date, while emails to my parents asking for a lift to the airport gave different dates.
This experience really reinforced the need to keep systematic, clear digital records.
A recent family trip to Italy. I had some photos on my phone, which I uploaded to my Dropbox account, but there were also photos I wanted on family members’ Facebook accounts. I downloaded these from Facebook and then did the same as with the photos on my phone: uploaded them to Dropbox.
*Personal photos removed from presentation for sharing and re-use.
Family Photo Image
Image by Peggy und Marco Lachmann-Anke from Pixabay
Obsolescence – many flavours
Hardware e.g. tablet
Storage Media e.g. CDs
Software e.g. WordStar, MS Works, MS Word
File Formats
Human error – accidentally delete something, your cat accidentally deletes something, your toddler accidentally deletes something, you accidentally leave your laptop on the bus, spill a cup of tea on your laptop by accident, etc...
The machines or even the 1s and 0s themselves deteriorate and make it impossible to open and view them even with the correct software and hardware.
Inventory - what do you have? Where is everything? On a phone? On an old camera’s memory card? Go nuts, make a spreadsheet!
Disposal – delete things you don’t need or superfluous copies of things. They just create clutter and make it difficult to find the things you actually need.
Act now! Loss of digital stuff can be immediate and irreversible. The consequences if you don’t act now include corrupted files and a lack of hardware to access old media types.
Printing is not the solution. Please do not go home and print all your digital stuff.
The ‘archiving’ I’m talking about is not the function on Gmail for ‘archiving’. In the IT/tech world, ‘archiving’ often simply means ‘hiding’, not managing and preserving for long-term keeping.
Table created by Jenny O'Neill, Data Manager, Research Services, UCD Library: http://libguides.ucd.ie/ld.php?content_id=32467254
There are two main forms of migration for digital preservation, and it is possible to use one or both.
The first is a method often referred to as ‘normalisation’. This is where all files of a particular type (for example, text documents) are ‘normalised’ to one file format.
The example on the slide shows Word documents being normalised to PDF. For images this could be JPEGs and GIFs normalised to TIFFs.
The choice of normalised file format will depend on the needs of the organisation and its users.
The second method involves migrating old file formats to newer versions when they are at risk of becoming obsolete.
This could be migrating an old .xls spreadsheet to a newer .xlsx format.
Both methods have their positives and negatives:
Normalisation creates homogeneous, easier-to-manage collections and means that users need to know how to use fewer file types.
Migrating to new versions means that files can be accessed in current computer environments.
Both processes can be automated but quality control is incredibly important and careful consideration must be given to migration pathways to avoid loss of data and functionality.
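Before converting anything, it can help to list which files would be affected and what they would become. The sketch below only plans a migration and converts nothing; the actual conversion has to be done in suitable software and then checked. The mapping reflects the examples in these notes (Word to PDF, JPEG and GIF to TIFF, .xls to .xlsx) and is an assumption to adapt, as is the function name `migration_plan`.

```python
from pathlib import PurePath

# Illustrative mapping based on the examples in these notes; choose
# targets that suit your own needs.
MIGRATION_TARGETS = {
    ".doc": ".pdf",   # normalisation: Word documents to PDF
    ".docx": ".pdf",
    ".jpg": ".tif",   # normalisation: JPEGs and GIFs to TIFF
    ".jpeg": ".tif",
    ".gif": ".tif",
    ".xls": ".xlsx",  # version migration: old spreadsheet to newer format
}


def migration_plan(file_names):
    """Pair each file that has a known target with its proposed new name."""
    plan = []
    for name in file_names:
        path = PurePath(name)
        target = MIGRATION_TARGETS.get(path.suffix.lower())
        if target:
            # This is the name of the converted copy, not a rename:
            # changing an extension by hand does not change the format.
            plan.append((name, str(path.with_suffix(target))))
    return plan
```

Keeping the original file alongside the converted copy until the result has been opened and checked is one simple form of the quality control described above.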
Save Gmail: https://www.thewindowsclub.com/save-gmail-emails-as-an-eml-file
Export Google Calendar: https://support.google.com/calendar/answer/37111?hl=en
Google Takeout: https://takeout.google.com/settings/takeout
Twitter Data Download: https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive
Facebook Information Download: https://www.facebook.com/help/1701730696756992?helpref=hc_global_nav
Download InstaPort for Instagram download: https://instaport.en.softonic.com/web-apps/download
WhatsApp Back-up to Google Drive: https://faq.whatsapp.com/en/android/28000019/?category=5245251
Look into the lifespan of your hard drive, your external hard drive, etc. Remember if you heavily use your machine, it won’t necessarily last as long. If you don’t perform regular OS updates, you may also run into problems.
Can you open your storage devices? Can you open the files you have saved?
Fixity by AVP: https://www.weareavp.com/products/fixity/
Creative Commons: https://creativecommons.org
Ireland’s Data Protection Commission: https://www.dataprotection.ie/en/individuals/raising-concern-commission