CNI Spring 2019 Membership Meeting, 2019-04-09,
@phonedude_mln, @WebSciDL
Web Archives at the Nexus of
Good Fakes and Flawed Originals
Michael L. Nelson
Old Dominion University
Web Science & Digital Libraries Research Group
@WebSciDL, @phonedude_mln
With:
ODU: Michele C. Weigle, John Berlin, Mohamed Aturban, Justin Whitlock
LANL: Martin Klein, DANS: Herbert Van de Sompel
Supported in part by The Andrew W. Mellon Foundation
and the National Science Foundation
"You’re in a desert walking along in the
sand when all of a sudden you look down,
and you see a tortoise..."
https://en.wikipedia.org/wiki/Blade_Runner
National Film Registry Induction, 1993: https://www.loc.gov/loc/lcib/94/9405/film.html
http://www.loc.gov/static/programs/national-film-preservation-board/documents/blade_runner.pdf
Film: 1982. Novel (Do Androids Dream of Electric Sheep?): 1968.
https://www.youtube.com/watch?v=LwDdP88Dr54
We’re not going to review RS’s/PKD’s predictions
https://www.cnn.com/2018/12/28/movies/blade-runner-predictions-2019-trnd/
https://twentytwowords.com/blade-runner-was-set-in-2019/
https://nwn.blogs.com/nwn/2019/01/blade-runner-los-angeles-2019.html
https://www.theregister.co.uk/2019/01/01/blade_runner_today/
Common themes in the works of Philip K. Dick
• identity
• self vs. the other
• memory
• humanity
• authenticity
• reality vs. simulacra
• unreliable narrator
Blade Runner in 239 characters
Voight-Kampff Test: distinguishing
authentic (humans) vs. fake (replicants)
https://www.youtube.com/watch?v=ic0PuvJbdu0
You’re in a desert walking along in the sand when all of a sudden you look down,
and you see a tortoise. You reach down, you flip the tortoise over on its back.
The tortoise lays on its back, its belly baking in the hot sun, beating its legs trying
to turn itself over, but it can’t, not without your help. But you’re not helping. Why is that?
Robots indistinguishable from humans,
off-world slaves, perpetually “dark and stormy”
Los Angeles – all good cyberpunk sci-fi tropes –
but that’s not our 2019, right?
The future is already here —
it's just not evenly distributed.
-- William Gibson (yes, I’m mixing sci-fi authors)
https://twitter.com/badnetworker/status/1093864777179430912
https://geekologie.com/2018/02/boston-dynamics-tests-door-opening-robot.php
“So when do we get to that part about
web archiving?”
Web archives are science fiction.
Web archives are enabling a reality, as
foreseen by PKD and other sci-fi authors,
where we can insert bespoke fakes
into our collective memory.
Web archives are like science fiction
because they’re a paradox:
We need a significant and continuous
technology investment today to be able to
say a page “used to look like this.”
Web archiving is not file backup.
Backup = prevent, detect, repair changes
Web archiving = continuous change to better simulate the past
Web archiving is a simulacrum of the past.
The essence of a web archive
is to modify its holdings
https://web.archive.org/web/19971211010502/https://www.cni.org/
• Rewrite links so they point back into the archive
• Provide an archival metadata banner (what, when, how many)
Relatively simple for the Web
of 1997. Today, it’s not so easy.
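The link rewriting described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not how the Wayback Machine or pywb actually do it: the regular expression handles only quoted `href`/`src` attributes, `rewrite_links` is a hypothetical helper name, and the `/events/` and `logo.gif` paths are made up. The replay prefix and 14-digit timestamp follow the pattern of the URL shown on the slide.

```python
import re
from urllib.parse import urljoin

# Replay prefix modeled on the URI-M pattern in the slide's example URL.
ARCHIVE_PREFIX = "https://web.archive.org/web/"


def rewrite_links(html: str, timestamp: str, base_url: str) -> str:
    """Point every quoted href/src in `html` at the archive, not the live web."""

    def to_archive(match: re.Match) -> str:
        attr, quote, url = match.group(1), match.group(2), match.group(3)
        absolute = urljoin(base_url, url)  # resolve relative links first
        return f"{attr}={quote}{ARCHIVE_PREFIX}{timestamp}/{absolute}{quote}"

    pattern = r'\b(href|src)=(["\'])(.*?)\2'
    return re.sub(pattern, to_archive, html, flags=re.IGNORECASE)


# Hypothetical page fragment; only the host name comes from the slide.
page = '<a href="/events/">Events</a> <img src="http://www.cni.org/logo.gif">'
rewritten = rewrite_links(page, "19971211010502", "https://www.cni.org/")
print(rewritten)
```

A static 1997 page needs little more than this. Links that Javascript builds at run time never pass through such a rewriter, which is one reason replay of today's pages is so much harder.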
Some modifications are to make yesterday’s
formats safe for / available to today’s browser
http://www.dlib.org/dlib/january05/rosenthal/01rosenthal.html
Cf. https://techcrunch.com/2017/07/25/get-ready-to-say-goodbye-to-flash-in-2020/
http://web.archive.org/web/20100605013233/http://www.youtube.com/watch?v=1aPPSIDr3Mc&feature=player_embedded/
Web archive software is continuously evolving, in part to
better realize a more authentic version of the past
https://github.com/internetarchive/wayback/releases
https://github.com/webrecorder/pywb/releases
"...the government presented testimony from the office
manager of the Internet Archive, who explained how the
Archive captures and preserves evidence of the contents
of the internet at a given time. The witness also compared
the screenshots sought to be admitted with true and
accurate copies of the same websites maintained in the
Internet Archive, and testified that the screenshots were
authentic and accurate copies of the Archive’s records.
Based on this testimony, the district court found that the
screenshots had been sufficiently authenticated."
https://law.justia.com/cases/federal/appellate-courts/ca2/17-2479/17-2479-2018-07-02.html
Evidentiary use of “screenshots” of archived pages
United States v. Gasperini, No. 17-2479 (2d Cir. 2018)
Screenshots matching IA’s records are not the
same thing as IA’s records matching the past…
So why is it so hard to recreate the past?
If we just had isolated, static pages
(jpegs, pdfs, mp3s, etc.)
then there’d be no problem.
Real HTML pages are complex
• links
• Javascript (modifying the page)
• embedded resources (possibly including other HTML pages via iframes)
Javascript
is why we can’t have nice (archival) things
Load the archived page, get an eagle
https://www.webharvest.gov/congress112th/20130119060624/http://www.fws.gov/
Hit “reload”, get a tiger
https://www.webharvest.gov/congress112th/20130119060624/http://www.fws.gov/
Hit “reload” again, get a mountain
https://www.webharvest.gov/congress112th/20130119060624/http://www.fws.gov/
“Look on my Javascript, ye Mighty, and despair!”
Actually, the fws.gov example was super easy;
most changes are much harder to trace
Mohamed Aturban, unpublished, memento:
http://web.archive.org/web/20130724144801/http://www.cnn.com/
Animated GIF: https://blog.dshr.org/2017/11/keynote-at-pacific-neighborhood.html
Embedded resources + Javascript =
Our simulation of what CNN.com looked
like then is flawed.
It will never be 2013 again, so in some
sense that page is lost.
Zombies: live web “leaking” into an archived page
http://ws-dl.blogspot.com/2012/10/2012-10-10-zombies-in-archives.html
this page is from 2008
this ad is from 2012 (when this screen shot was taken)
As of late 2017, zombies
mostly no longer occur
https://blog.dshr.org/2017/09/attacking-users-of-wayback-machine.html
Temporal violations: reconstructing legitimately
archived resources into a page that never existed
http://ws-dl.blogspot.com/2015/12/2015-12-08-evaluating-temporal.html
text (2004-12) says rain, image (2005-09) is clear
Incorrectly replaying the 2004 weather forecast for
Varina, Iowa is hardly the stuff of dystopian cyberpunk.
There are cases where temporal violations begin to look like tampering…
Remember the case of Joy Reid’s blog?
https://www.odu.edu/news/2018/5/michael_nelson
https://twitter.com/DrDanetteAllen/status/990228054952865793
https://twitter.com/phonedude_mln/status/990054945457147904
HTML archived on 2006-01-11
JS archived on 2006-02-07
Reid was a prolific blogger, so a gap of nearly a month
is catastrophic for temporal integrity.
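The size of that gap is simple date arithmetic. The sketch below uses the two capture dates from this slide; `capture_gap_days` and the one-day threshold `MAX_GAP_DAYS` are hypothetical names and values chosen for illustration, not a published standard for temporal coherence.

```python
from datetime import date


def capture_gap_days(root_capture: date, embedded_capture: date) -> int:
    """Days between the capture of a page and the capture of a resource it embeds."""
    return abs((embedded_capture - root_capture).days)


html_archived = date(2006, 1, 11)  # capture date of the HTML (from the slide)
js_archived = date(2006, 2, 7)     # capture date of the Javascript (from the slide)

# Hypothetical tolerance: how stale an embedded resource may be before we
# stop trusting the composite page. Pick a value that suits the site.
MAX_GAP_DAYS = 1

gap = capture_gap_days(html_archived, js_archived)
suspect = gap > MAX_GAP_DAYS
print(gap, suspect)
```

For a blog that was updated many times a day, even a tolerance of a few days would flag this replay as a page that may never have existed in that form.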
Not always Javascript – cookies cause the web archive to store
the Urdu language page at the URL for the English page
https://ws-dl.blogspot.com/2018/03/2018-03-21-cookies-are-why-your.html
https://ws-dl.blogspot.com/2019/03/2019-03-18-cookie-violations-cause.html
Cookies + Javascript =
A combo Urdu / Portuguese / English page that never existed
Web archives are unreliable narrators.
Unreliable narrators cause us to question
everything we’ve been told.
Let’s prove Lester Holt did not “fudge the tape”!
https://twitter.com/AaronBlake/status/1035124642456002565
https://twitter.com/realDonaldTrump/status/1035120511259500544
https://news.vice.com/en_us/article/ne5x3d/trump-lester-holt-james-comey-nbc
The May 2017 NBC interview
was not archived until August 2018
(and even then, the video itself was not archived)
https://www.nbcnews.com/nightly-news/video/pres-trump-s-extended-exclusive-interview-with-lester-holt-at-the-white-house-941854787582?v=raila
https://web.archive.org/web/*/https://www.nbcnews.com/nightly-news/video/pres-trump-s-extended-exclusive-interview-with-lester-holt-at-the-white-house-941854787582?v=raila
https://web.archive.org/web/20180825094239/https://www.nbcnews.com/nightly-news/video/pres-trump-s-extended-exclusive-interview-with-lester-holt-at-the-white-house-941854787582?v=raila
Clicking through to the video reveals a loop of a postal
carrier slipping on ice, not the Lester Holt interview.
Errors in crawling and playback are hard
to distinguish from tampering
https://twitter.com/katestarbird/status/911257133231910913
https://er.educause.edu/articles/2018/10/managing-the-cultural-record-in-the-information-warfare-era
I want to explicitly note here the difference between the
act of quietly rewriting the record and enjoying the results
of the rewrites that are accepted as truth and that of
deliberately destroying the confidence of the public
(including the scholarly community) by creating compromise,
confusion, and ambiguity to suggest that the record cannot
be trusted.
Disinformation applied to web archives
doesn’t necessarily mean you have to
insert a specific narrative into the archive.
You just need to cast doubt on the
archive as our collective memory.
We’re unaware of any cases where web
archive content has been hacked or faked
for any substantive goal.
However, web archives are not immune.
It’s just that the theater of conflict has yet to
expand to include web archives.
Twitter then and now
http://inventorspot.com/articles/top_ten_twitterati_tweet_above_rest_31806
https://www.vox.com/policy-and-politics/2017/10/19/16504510/ten-gop-twitter-russia
Facebook then and now
https://twitter.com/Pinboard/status/975013825010458624
https://web.archive.org/web/20090722095954/http://facebook.com/zuck
See also: https://www.businessinsider.com/facebook-old-posts-mark-zuckerberg-disappeared-2019-3
Gmail then and now
http://googlepress.blogspot.com/2004/04/google-gets-message-launches-gmail.html
https://www.avanan.com/resources/gmail-exploit-allows-dnc-email-attack
Web archives then and soon
https://web.archive.org/web/20020601134105/http://www.businessweek.com/technology/content/feb2002/tc20020228_1080.htm
?
Why do we expect things to be different
for web archives?
Our trust model for web archives is still
rooted in the 1980s / early 90s.
My chronology with Unix
Late 80s: 1 computer, many users
Used an X terminal to access Cray, Convex supercomputers
90s: 1 computer, 1 user
My Sun IPX workstation was the first www.larc.nasa.gov
now: many computers, 1 user
I’m not even sure how many computers I have access to
From brewster@wais.com Sun Apr 25 00:03:19 1993
Received: from express.larc.nasa.gov by blearg.larc.nasa.gov with SMTP
(5.65.2/server2.4) id AA28277; Sun, 25 Apr 93 00:00:26 -0400
Received: from wais.wais.com by express.larc.nasa.gov with SMTP id BA21157
(SMTP/Lite-1.15) for <m.l.nelson@larc.nasa.gov>; Sun, 25 Apr 93 00:00:20 -0400
Received: by wais.wais.com (4.1/SMI-4.1/Brent-911016)
id AA14369; Sat, 24 Apr 93 20:47:54 PDT
Date: Sat, 24 Apr 93 20:47:54 PDT
Message-Id: <9304250347.AA14369@wais.wais.com>
From: Brewster Kahle <brewster@wais.com>
To: abc@concert.net
To: admin@ds.internic.net
To: akers@fiddle.oit.unc.edu
To: anders@ifi.uio.no
To: anders@munin.ub2.lu.se
…
To: m.l.nelson@LaRC.NASA.GOV
…
To: root@ds.internic.net
To: root@ncgia.ucsb.edu
To: root@fiddle.oit.unc.edu
To: root@oac.hsc.uth.tmc.edu
To: root@samba.acs.unc.edu
To: root@spk41.usace.mil
To: root@stone.ucs.indiana.edu
To: root@sunsite.unc.edu
To: root@uniwa.uwa.oz.au
To: root@uva.ci.uv.es
To: root@nic.funet.fi
…
WAIS server maintainers,
As you probably know through wais-discussion, we are announcing
the commercial WAIS server this thursday. There is a big press
event and showcase at the WAIS Inc offices.
Thank you, everyone, for making it possible for us to pull off a
startup company.
We are considering running a special price for a limited time for
those that know and understand WAIS already. We would like to
discuss this with those that might be interested in it, and would
like to help us determine how it should work. Most people will
continue to use the freeware, and that is fine, this is for those
that might be interested in a commercial version. At this time,
we will not be discussing the differences between things or other
products.
Given that the press has started to call and ask for information
before hand (to scoop this story, you know the press...), we have
had to keep a very quiet profile.
On the other hand, we need the help from all of you. Generally,
this is done with a signed non-disclosure basis, but this wont
work on the Internet and not in time.
What I was thinking was to ask anyone that would like to discuss
this, to send an "email non-disclosure" to
non-disclosed-waisites-request@wais.com.
I wish this weren't so baroque, but you could not believe some of
the members of the press I have talked to. If one reporter
publishes early, it can spoil things (and get it wrong).
(please dont email to me. At this point, my cup floweth over. I
will dig out after the showcase!)
-brewster
TMC->WAIS Inc->AOL->Alexa->IA
https://twitter.com/phonedude_mln/status/1105160308866338816
When computers were $$$, an email to “root” could be expected
to be received by someone entrusted with the necessary $$$
to responsibly administer the machine.
IOW, “root” was almost always a white hat.
It hasn’t been like that for a long time.
Web archives are like the Unix mainframes of today.
How well do you know root@archive.org?
As in, could you call/email him right now and expect a response?
Our entire national digital preservation strategy is predicated on
Brewster Kahle “not being evil”™
If he is leading a 25+ year sleeper cell, we’re doomed.
How well do you know these roots?
Many more: https://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives
Up until now, we’ve only looked at failures
or edge cases in crawling and replay.
What about deliberate fakes?
Cut-n-paste / mashup “fakes” for humor
Victorian Photo Collage
https://www.metmuseum.org/exhibitions/listings/2010/victorian-photocollage
“The Flying Saucer” (1956)
https://en.wikipedia.org/wiki/The_Flying_Saucer_(song)
https://www.youtube.com/watch?v=XCrn6QXvHLg
Brian Williams Raps ‘Gin & Juice’
https://www.youtube.com/watch?v=XlGLhYFrv6w
More convincing fakes require significant skills,
knowledge, and access
https://en.wikipedia.org/wiki/Piltdown_Man
https://en.wikipedia.org/wiki/Shroud_of_Turin
https://www.npr.org/templates/story/story.php?storyId=94461486
“deep learning” + “fake” = deepfakes
https://motherboard.vice.com/en_us/article/7x799b/selling-ai-generated-fake-porn-is-probably-a-good-way-to-get-sued
https://motherboard.vice.com/en_us/article/ev5eba/ai-fake-porn-of-friends-deepfakes
Becoming more mainstream:
https://twitter.com/MikaelThalen/status/1090349932266094593 https://deepfakesapp.online/
A “safe for work” example:
No longer buried in
the dark corners of Reddit:
“Detecting” deepfakes will happen.
“Preventing” deepfakes won’t happen; they’re here to stay:
Mementos, even of a fake past, are core to the human condition.
“Did you get your precious photos?”
“Implants. Those aren't your memories, they're somebody else's. They're Tyrell's niece's.”
http://deepemotions.free.fr/theme_1.html
Real photos, fake memories: replicants attach significant value to photos,
even when they know the memories are fake.
Next Thanksgiving dinner,
liven up the discussion with your extended family
1. Extract just 0:23–0:26 of the Obama/Peele video
2. Embed in an HTML page
3. Use Javascript to rewrite the banner and browser URL
– Datetime: 2016-11-09
– URL: www.whitehouse.gov/totallyNotFake
4. Claim the deep state deleted the page from the live web
https://www.theverge.com/tldr/2018/4/17/17247334/ai-fake-news-video-barack-obama-jordan-peele-buzzfeed
https://www.youtube.com/watch?time_continue=43&v=cQ54GDm1eL0#t=0m23s
Not just hypothetical.
Inserting fakes into real archives
Here’s an actual page in the IA “proving”
Brian Williams released “Gin and Juice” in 1992, a full year before Snoop Dogg.
John Berlin, MS Thesis, 2018
https://www.youtube.com/watch?v=k3QTcJZdFfs
(actual URI-R & URI-M have also been obscured in the video to hide the technique)
The content is clearly fake, but it demonstrates that it’s possible
to write Javascript that attacks the archive’s playback capability.
It takes an archiving expert to tell the difference.
We’ve known about these & other attacks
for nearly two years
http://labs.rhizome.org/presentations/security.html#/
https://acmccs.github.io/papers/p1741-lernerAT3.pdf
https://blog.dshr.org/2017/06/wac2017-security-issues-for-web-archives.html
https://ws-dl.blogspot.com/2018/04/2018-05-01-high-fidelity-ms-thesis-to.html
There are other ways, presumably still
hypothetical, to attack the archives
https://twitter.com/internetarchive/status/596768668756774914
https://xkcd.com/538/
https://www.theguardian.com/uk-news/2018/sep/05/planes-trains-and-fake-names-the-trail-left-by-skripal-suspects
https://www.cnn.com/2018/10/22/middleeast/saudi-operative-jamal-khashoggi-clothes/index.html
“Planes, trains and fake names: the trail left by Skripal suspects”
“Surveillance footage shows Saudi 'body double' in Khashoggi's clothes after he was killed, Turkish source says”
Before you say “that will never happen!”
Reminder: agents, dissidents, journalists have all disappeared;
they won’t mind adding a librarian/sysadmin to the list
I’ve got good news and bad news:
Setting up a web archive is not as difficult
or expensive as it used to be.
OpenWayback, WAIL, pywb, et al. + cloud storage =
you can have a web archive running in about the same time
it took to generate the Steve Buscemi / Jennifer Lawrence deepfake.
https://github.com/iipc/openwayback
https://github.com/N0taN3rd/wail
https://machawk1.github.io/wail/
https://github.com/webrecorder/pywb
Inserting fakes into fake archives
breitbart.com/wayback/*/whitehouse.gov/totallyNotFake
infowars.com/web/*/whitehouse.gov/totallyNotFake
iluv.aynrand.org/*/whitehouse.gov/totallyNotFake
InternetResearchAgency.ru/whitehouse.gov/totallyNotFake
How well do you know root at these archives?
Are they really four different archives, or one root for all of them?
What if 99.9% of the time they faithfully replay pages?
http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html
What if we start off with > (n/2)+1
archives compromised?
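Why a compromised majority is fatal can be seen in a toy majority vote over content digests. This is a deliberately simplified sketch, not the polling protocol from the cited D-Lib article; `majority_digest` and the digest strings are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def majority_digest(votes: List[str]) -> Optional[str]:
    """Return the digest reported by a strict majority of archives, else None."""
    if not votes:
        return None
    digest, count = Counter(votes).most_common(1)[0]
    return digest if count > len(votes) / 2 else None


# Five hypothetical archives vote on the content of one page.
honest_majority = ["real", "real", "real", "fake", "fake"]
compromised_majority = ["real", "real", "fake", "fake", "fake"]

print(majority_digest(honest_majority))       # the honest copy wins
print(majority_digest(compromised_majority))  # the fake becomes the "truth"
```

Voting only helps while honest archives outnumber compromised ones. Once the attacker controls the majority, the same mechanism certifies the fake.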
What if the archives were targeted to amplify a
specific disinformation narrative?
And what if the archives had no choice but to
cooperate?
The University of Farmington is fake
DHS strong-armed a “.edu” registration; they could do the same to IA & others too
https://twitter.com/nwarikoo/status/1090726638034276352
https://web.archive.org/web/20161023170733/https://universityoffarmington.edu/
https://twitter.com/phonedude_mln/status/1092464939040755712
First capture: 2016-10-23
Blockchain to the rescue!!!
<lasers>
<sirens>
<disco-thumping-soundtrack>
nope.
https://www.multichain.com/blog/2015/11/avoiding-pointless-blockchain-project/
https://eprint.iacr.org/2017/375.pdf
https://blog.dshr.org/search/label/bitcoin
There is no shortage of
deepfake vs. blockchain stories
https://www.wired.com/story/the-blockchain-solution-to-our-deepfake-problems/
https://www.longhash.com/news/the-coming-war-between-deepfakes-and-blockchain
A Voight-Kampff Test for deepfakes
doesn’t seem that silly now
https://twitter.com/TechCrunch/status/1009556795965296642
https://www.technologyreview.com/s/611726/the-defense-department-has-produced-the-first-tools-for-catching-deepfakes/
Are we prepared for the
unintended consequences?
“Enforcing digital signatures for all
cameras and video devices would
offer the same capability in reverse.
Suddenly every photograph and video
shared online could be traced back to
its original owner. Security services in
a repressive regime could scour social
media for all videos depicting them in
a negative light and trace them back
to the precise individuals who captured
the video, arresting them en masse.”
https://www.forbes.com/sites/kalevleetaru/2018/09/09/why-digital-signatures-wont-prevent-deep-fakes-but-will-help-repressive-governments/
On the other hand, “blockchaining” our pets is a study in
incompatibility, so tracking photos may never happen
https://www.aspca.org/about-us/aspca-policy-and-position-statements/microchips
https://moviepaws.com/2017/10/22/owls-snakes-and-unicorns-the-animals-of-blade-runner/
In Blade Runner, synthetic pets
had serial numbers
(real pets are unavailable
to all but the richest).
“While most of the world has
accepted these standards,
North America has not. The
primary problem is a competitive,
technological one involving the
compatibility of the microchips
and the readers that are used
by shelters and veterinary clinics.”
As for blockchains and web archives…
This is not what you think it is…
https://petertodd.org/2017/carbon-dating-the-internet-archive-with-opentimestamps
“…right now you can get timestamps for every book,
movie, song, computer program, legal document,
etc. in the thousands of collections in the archive.
In the future we hope to be able to work with the
Internet Archive to extend this to timestamping
website snapshots…”
That’s never going to happen.
(at least not by a 3rd party through the playback interface)
Archive URI-Ms
-----------------------------
perma-archives.org 182
bibalex.org 199
webarchive.org.uk 349
bac-lac.gc.ca 351
proni.gov.uk 469
digar.ee 488
webharvest.gov 712
internetmemory.org 979
nationalarchives.gov.uk 994
stanford.edu 1222
archive-it.org 1383
archive.is 1396
web.archive.org 1566
arquivo.pt 1569
webcitation.org 1585
vefsafn.is 1589
loc.gov 1594
-----------------------------
Total 16627
Sample 16k+ Mementos from 17 Web Archives
Periodically Replay Each Archived Page
Above example: http://perma-archives.org/warc/20170101182813/http://umich.edu/
35 times, from Nov. 2017 – Oct. 2018
For each replay, we download both the rewritten version and the “raw” version (where possible).
Partial archive outage because
of security / maintenance upgrade
Post-upgrade, replay is variable.
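A minimal sketch of how the “raw” counterpart of a rewritten URI-M might be derived. It assumes the archive uses the Wayback-style layout (prefix, 14-digit timestamp, original URI) and honors the `id_` modifier for unrewritten playback; not every archive does, hence “where possible”. The function name `raw_urim` is hypothetical.

```python
import re
from typing import Optional

# Assumption: Wayback-style URI-M of the form
#   <prefix>/<14-digit timestamp>[<modifier>]/<original URI>
# and an archive that accepts "id_" to return the unrewritten capture.
_URIM = re.compile(
    r"^(?P<prefix>.*?/)(?P<ts>\d{14})(?P<mod>[a-z]{2}_)?/(?P<orig>.+)$"
)

def raw_urim(urim: str) -> Optional[str]:
    """Return the raw-mode URI-M, or None if the layout is not recognized."""
    m = _URIM.match(urim)
    if m is None:
        return None  # e.g. short-link archives with no timestamp in the path
    return f"{m['prefix']}{m['ts']}id_/{m['orig']}"

print(raw_urim("http://perma-archives.org/warc/20170101182813/http://umich.edu/"))
```

URI-Ms without a timestamp in the path (for example short links such as those from archive.is) return `None`, so only the rewritten version can be downloaded for them.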
More Archived Pages Changed Every Time
Than Never Changed
(yes, this experiment used “raw” mode)
Never changed:
2007 URI-Ms (1 in 8)
Always changed:
2773 URI-Ms (1 in 6)
Fixity-based approaches, including blockchain, will not work.
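The categories above can be sketched as a hash comparison across repeated replays of one URI-M. This is an illustration under stated assumptions, not the study's actual code: SHA-256 is an assumed hash function, “always changed” is taken to mean that every replay differs from the one before it, and the names `digest` and `classify` are hypothetical.

```python
import hashlib
from typing import Sequence

def digest(body: bytes) -> str:
    """SHA-256 of one downloaded replay (the hash choice is an assumption)."""
    return hashlib.sha256(body).hexdigest()

def classify(replays: Sequence[bytes]) -> str:
    """Classify one URI-M from the bodies of its successive replays."""
    if len(replays) < 2:
        raise ValueError("need at least two replays to compare")
    hashes = [digest(body) for body in replays]
    if len(set(hashes)) == 1:
        return "never changed"
    # Assumed reading of "always changed": each replay differs from the previous one.
    if all(a != b for a, b in zip(hashes, hashes[1:])):
        return "always changed"
    return "sometimes changed"

print(classify([b"<html>v1</html>", b"<html>v1</html>", b"<html>v2</html>"]))
```

A fixity check can only vouch for the “never changed” group, which is 2007 of the 16627 sampled URI-Ms.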
“Hash the screenshot, not the HTML!”
That doesn’t work either.
1 WARC file, 2 Wayback Machines, 3 Browsers
= 6 different replays
http://wayback.archive-it.org/all/20130106140348/http://www.harvard.edu/
http://web.archive.org/web/20130106140348/http://www.harvard.edu/
see also: https://ws-dl.blogspot.com/2016/12/2016-12-20-archiving-pages-with.html
Why not create a LOCKSS for web archives?
Web archives are not especially interoperable.
There are many issues regarding
interoperability, but generational loss is a good
demonstration of incompatible assumptions
about simulating the past.
https://web.archive.org/web/20180501125952/https:/twitter.com/phonedude_mln/status/990054945457147904
http://archive.is/PaKx6
https://perma.cc/3HMS-TB59
http://www.webcitation.org/77RhNeyoZ
https://web.archive.org/web/20190407024654/https://perma.cc/3HMS-TB59
https://web.archive.org/web/20190407031659/http://www.webcitation.org/77RhNeyoZ
Web archiving interoperability: a metaphor
(non-synthetic pets, possibly microchipped)
https://www.youtube.com/watch?v=SQudKvrwDAU
To summarize:
Existing, trusted archives can be compromised by:
1) crawling malicious pages,
2) attacking facilities / personnel, or
3) court orders
Lowered resource threshold for archives allows:
1) “long game” archives: faithful now, corrupt later,
2) “sock puppet” archives: surreptitiously cooperating archives
The nature of web archives is to change content;
current fixity-based approaches will not help.
Looking forward:
We need new models for web archiving
and verifying authenticity.
The Heritrix / Wayback Machine
technology stack, while successful, has
limited our thinking.
“Studies generally suggest that, year after year, less than
60 percent of web traffic is human; … For a period of time in
2013, the Times reported this year, a full half of YouTube
traffic was “bots masquerading as people,” a portion so high
that employees feared an inflection point after which
YouTube’s systems for detecting fraudulent traffic would
begin to regard bot traffic as real and human traffic as fake.
They called this hypothetical event “the Inversion.””
http://nymag.com/intelligencer/2018/12/how-much-of-the-internet-is-fake.html
Robots outnumber humans 10:1 in sessions and 5:4 in HTTP connections at the Internet Archive (IA), ca. 2012
http://arxiv.org/abs/1309.4016
https://giphy.com/gifs/harrison-ford-blade-runner-sean-young-yjB2fwqjv5rry/media
I suspect the core of the new model will have a
lot in common with click farms
https://twitter.com/mbrennanchina/status/1072114511212109824
Record what we saw at crawl time as a baseline;
then we need a distance measure between crawl time and replay time
http://dx.doi.org/10.5210/fm.v22i112.8097
https://ws-dl.blogspot.com/2013/05/2013-05-25-game-walkthroughs-as.html
Documenting instead of archiving…
1) Robotic witnesses
2) New Nielsen families
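One possible shape for such a distance measure, as a sketch only: the talk does not name a metric, so Jaccard distance over word sets is a stand-in chosen for illustration, and the function names are hypothetical. It compares the text recorded at crawl time (the baseline) with the text seen at replay time.

```python
import re

def tokens(text: str) -> set:
    """Lower-cased word set of a page's extracted text."""
    return set(re.findall(r"\w+", text.lower()))

def jaccard_distance(crawl_text: str, replay_text: str) -> float:
    """0.0 means identical word sets, 1.0 means no words in common."""
    a, b = tokens(crawl_text), tokens(replay_text)
    if not a and not b:
        return 0.0  # two empty pages are treated as identical
    return 1.0 - len(a & b) / len(a | b)

baseline = "University of Michigan home page"
replayed = "University of Michigan home page archived banner"
print(jaccard_distance(baseline, replayed))
```

Unlike a hash comparison, a measure like this degrades gradually, so a replay that only adds an archive banner scores close to the baseline instead of counting as a complete mismatch.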
Some of you might be thinking
“but I don’t like Blade Runner – what can I
take away from this talk?”
(my wife refers to the film as “serious white guys talking”)
Two methods for passing the Voight-Kampff Test for Blade Runner fandom
1) Is Deckard a replicant?
In the book, he’s definitely human. In the seven (!) versions of
the movie, it ranges from “ambiguous” to “replicant”.
https://moviepaws.com/2017/10/22/owls-snakes-and-unicorns-the-animals-of-blade-runner/
https://en.wikipedia.org/wiki/Themes_in_Blade_Runner
https://en.wikipedia.org/wiki/Blade_Runner#Versions
2) “Tears in Rain” – Greatest monologue in sci-fi?
Or greatest monologue of all time?
I've seen things you people wouldn't believe. Attack ships on fire
off the shoulder of Orion. I watched C-beams glitter in the dark
near the Tannhäuser Gate. All those moments will be lost in time,
like tears in rain. Time to die.
https://www.youtube.com/watch?v=9hDo80ddn4Q
https://en.wikipedia.org/wiki/Tears_in_rain_monologue
https://www.youtube.com/watch?v=BM54jXndyvQ
2) “Tears in Rain” – Greatest monologue in sci-fi?
Or greatest monologue of all time?
I've crawled things you people wouldn't believe. Clickjacking attacks
off the x-frame-options: sameorigin. I watched ajax requests redirect
at the aggregator TimeGate. All those pages will be lost in time,
like tears in rain. Time to lie.

 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 

Kürzlich hochgeladen (20)

Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

Web Archives at the Nexus of Good Fakes and Flawed Originals

  • 1. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL Web Archives at the Nexus of Good Fakes and Flawed Originals Michael L. Nelson Old Dominion University Web Science & Digital Libraries Research Group @WebSciDL, @phonedude_mln With: ODU: Michele C. Weigle, John Berlin, Mohamed Aturban, Justin Whitlock LANL: Martin Klein, DANS: Herbert Van de Sompel Supported in part by The Andrew Mellon Foundation and the National Science Foundation
  • 2. "You’re in a desert walking along in the sand when all of the sudden you look down, and you see a tortoise..."
  • 3. https://en.wikipedia.org/wiki/Blade_Runner National Film Registry Induction, 1993: https://www.loc.gov/loc/lcib/94/9405/film.html http://www.loc.gov/static/programs/national-film-preservation-board/documents/blade_runner.pdf 1982 1968
  • 4. https://www.youtube.com/watch?v=LwDdP88Dr54
  • 5. https://www.youtube.com/watch?v=LwDdP88Dr54
  • 6. We’re not going to review RS’s/PKD’s predictions https://www.cnn.com/2018/12/28/movies/blade-runner-predictions-2019-trnd/ https://twentytwowords.com/blade-runner-was-set-in-2019/ https://nwn.blogs.com/nwn/2019/01/blade-runner-los-angeles-2019.html https://www.theregister.co.uk/2019/01/01/blade_runner_today/
  • 7. Common themes in the works of Philip K. Dick: identity; self vs. the other; memory; humanity; authenticity; reality vs. simulacra; unreliable narrator
  • 8. Blade Runner in 239 characters
  • 9. Voight-Kampff Test: distinguishing authentic (humans) vs. fake (replicants) https://www.youtube.com/watch?v=ic0PuvJbdu0 "You’re in a desert walking along in the sand when all of the sudden you look down, and you see a tortoise. You reach down, you flip the tortoise over on its back. The tortoise lays on its back, its belly baking in the hot sun, beating its legs trying to turn itself over, but it can’t, not without your help. But you’re not helping. Why is that?"
  • 10. Robots indistinguishable from humans, off-world slaves, perpetually “dark and stormy” Los Angeles – all good cyberpunk sci-fi tropes – but that’s not our 2019, right?
  • 11. "The future is already here – it's just not evenly distributed." -- William Gibson (yes, I’m mixing sci-fi authors) https://twitter.com/badnetworker/status/1093864777179430912 https://geekologie.com/2018/02/boston-dynamics-tests-door-opening-robot.php
  • 12. “So when do we get to that part about web archiving?”
  • 13. Web archives are science fiction. Web archives are enabling a reality, as foreseen by PKD and other sci-fi authors, where we can insert bespoke fakes into our collective memory.
  • 14. Web archives are like science fiction because they’re a paradox: We need a significant and continuous technology investment today to be able to say a page “used to look like this.”
  • 15. Web archiving is not file backup. Backup = prevent, detect, repair changes. Web archiving = continuous change to better simulate the past. Web archiving is a simulacrum of the past.
  • 16. The essence of a web archive is to modify its holdings https://web.archive.org/web/19971211010502/https://www.cni.org/ Rewrite links so they point back in the archive. Provide archival metadata banner (what, when, how many). Relatively simple for the Web of 1997. Today, it’s not so easy.
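The link rewriting described on slide 16 can be sketched roughly as follows. This is a minimal illustration, not the Wayback Machine's actual implementation: the helper name `rewrite_links`, the regex-based approach, and the default archive prefix are assumptions made for the example. Only the URI-M shape (`https://web.archive.org/web/<14-digit datetime>/<original URL>`) is taken from the slide.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src="..." attribute values (double or single quoted).
ATTR_RE = re.compile(r'\b(href|src)=(["\'])(.*?)\2', re.IGNORECASE)

def rewrite_links(html, page_url, timestamp,
                  prefix="https://web.archive.org/web/"):
    """Point every href/src in `html` back into the archive (hypothetical helper)."""
    def repl(match):
        attr, quote, target = match.group(1), match.group(2), match.group(3)
        if target.startswith(("#", "mailto:", "javascript:", "data:")):
            return match.group(0)  # leave fragments and non-HTTP schemes alone
        absolute = urljoin(page_url, target)  # resolve relative links first
        return f"{attr}={quote}{prefix}{timestamp}/{absolute}{quote}"
    return ATTR_RE.sub(repl, html)

page = '<a href="/about/">About</a> <img src="logo.gif">'
print(rewrite_links(page, "https://www.cni.org/", "19971211010502"))
```

A regex over `href`/`src` attributes is enough for a static 1997-style page. It does nothing for URLs that Javascript builds at run time, which is where the later slides locate the trouble.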
  • 17. Some modifications are to make yesterday’s formats safe for / available to today’s browser http://www.dlib.org/dlib/january05/rosenthal/01rosenthal.html Cf. https://techcrunch.com/2017/07/25/get-ready-to-say-goodbye-to-flash-in-2020/ http://web.archive.org/web/20100605013233/http://www.youtube.com/watch?v=1aPPSIDr3Mc&feature=player_embedded/
  • 18. Web archive software is continuously evolving, in part to better realize a more authentic version of the past https://github.com/internetarchive/wayback/releases https://github.com/webrecorder/pywb/releases
  • 19. Evidentiary use of “screenshots” of archived pages: United States v. Gasperini, No. 17-2479 (2d Cir. 2018) "...the government presented testimony from the office manager of the Internet Archive, who explained how the Archive captures and preserves evidence of the contents of the internet at a given time. The witness also compared the screenshots sought to be admitted with true and accurate copies of the same websites maintained in the Internet Archive, and testified that the screenshots were authentic and accurate copies of the Archive’s records. Based on this testimony, the district court found that the screenshots had been sufficiently authenticated." https://law.justia.com/cases/federal/appellate-courts/ca2/17-2479/17-2479-2018-07-02.html
  • 20. (Gasperini excerpt from slide 19, repeated.) Screenshots matching IA’s records are not the same thing as IA’s records matching the past…
  • 21. So why is it so hard to recreate the past? If we just had isolated, static pages (jpegs, pdfs, mp3s, etc.) then there’d be no problem.
  • 22. Real HTML pages are complex: links, Javascript (modifying the page), embedded resources (possibly including other HTML pages via iframes)
  • 23. Javascript is why we can’t have nice (archival) things
  • 24. Load the archived page, get an eagle https://www.webharvest.gov/congress112th/20130119060624/http://www.fws.gov/
  • 25. Hit “reload”, get a tiger https://www.webharvest.gov/congress112th/20130119060624/http://www.fws.gov/
  • 26. Hit “reload” again, get a mountain https://www.webharvest.gov/congress112th/20130119060624/http://www.fws.gov/
  • 27. “Look on my Javascript, ye Mighty, and despair!”
  • 28. Actually, the fws.gov example was super easy; most changes are much harder to trace. Mohamed Aturban, unpublished, memento: http://web.archive.org/web/20130724144801/http://www.cnn.com/ Animated GIF: https://blog.dshr.org/2017/11/keynote-at-pacific-neighborhood.html
  • 29. Embedded resources + Javascript = Our simulation of what CNN.com looked like then is flawed. It will never be 2013 again, so in some sense that page is lost.
  • 30. Zombies: live web “leaking” into an archived page http://ws-dl.blogspot.com/2012/10/2012-10-10-zombies-in-archives.html This page is from 2008; this ad is from 2012 (when this screenshot was taken). As of late 2017, zombies mostly no longer occur https://blog.dshr.org/2017/09/attacking-users-of-wayback-machine.html
  • 31. Temporal violations: reconstructing legitimately archived resources into a page that never existed http://ws-dl.blogspot.com/2015/12/2015-12-08-evaluating-temporal.html Text (2004-12) says rain, image (2005-09) is clear.
  • 32. Incorrectly replaying the 2004 weather forecast for Varina, Iowa is hardly the stuff of dystopian cyberpunk. There are cases where temporal violations begin to look like tampering…
  • 33. Remember the case of Joy Reid’s blog? https://www.odu.edu/news/2018/5/michael_nelson https://twitter.com/DrDanetteAllen/status/990228054952865793
  • 34. https://twitter.com/phonedude_mln/status/990054945457147904 HTML archived on 2006-01-11; JS archived on 2006-02-07. Reid was a prolific blogger, so a gap of nearly a month is catastrophic for temporal integrity.
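The "nearly a month" on slide 34 is plain date arithmetic. A minimal sketch using only the two capture dates given on the slide; the function name `capture_gap_days` is made up for the example:

```python
from datetime import date

def capture_gap_days(root_capture, resource_capture):
    """Days between the capture of a page and of one of its embedded resources."""
    return abs((resource_capture - root_capture).days)

html_captured = date(2006, 1, 11)  # HTML archived (from the slide)
js_captured = date(2006, 2, 7)     # Javascript archived (from the slide)
print(capture_gap_days(html_captured, js_captured))  # 27
```

On a blog updated several times a week, a 27-day spread means the replayed page may combine an HTML shell and a script that never ran together on the live web.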
  • 35. Not always Javascript – cookies cause the web archive to store the Urdu language page at the URL for the English page https://ws-dl.blogspot.com/2018/03/2018-03-21-cookies-are-why-your.html
  • 36. https://ws-dl.blogspot.com/2019/03/2019-03-18-cookie-violations-cause.html Cookies + Javascript = A combo Urdu / Portuguese / English page that never existed
  • 37. Web archives are unreliable narrators. Unreliable narrators cause us to question everything we’ve been told.
  • 38. Let’s prove Lester Holt did not “fudge the tape”! https://twitter.com/AaronBlake/status/1035124642456002565 https://twitter.com/realDonaldTrump/status/1035120511259500544 https://news.vice.com/en_us/article/ne5x3d/trump-lester-holt-james-comey-nbc
  • 39. The May 2017 NBC interview is not archived until August 2018 (and even then, the video itself is not archived) https://www.nbcnews.com/nightly-news/video/pres-trump-s-extended-exclusive-interview-with-lester-holt-at-the-white-house-941854787582?v=raila https://web.archive.org/web/*/https://www.nbcnews.com/nightly-news/video/pres-trump-s-extended-exclusive-interview-with-lester-holt-at-the-white-house-941854787582?v=raila https://web.archive.org/web/20180825094239/https://www.nbcnews.com/nightly-news/video/pres-trump-s-extended-exclusive-interview-with-lester-holt-at-the-white-house-941854787582?v=raila Clicking through to the video reveals a loop of a postal carrier slipping on ice, not the Lester Holt interview.
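The capture datetime on slide 39 is encoded in the memento URL itself as a 14-digit `YYYYMMDDhhmmss` string. A minimal sketch of reading it back out; `memento_datetime` is a hypothetical helper, and the regex assumes the `/web/<14 digits>/` layout seen in the URLs above (it would not match variants that append a modifier to the digits):

```python
import re
from datetime import datetime

def memento_datetime(uri_m):
    """Extract the capture datetime from a Wayback-style URI-M (hypothetical helper)."""
    match = re.search(r"/web/(\d{14})/", uri_m)
    if match is None:
        raise ValueError("no 14-digit datetime found in URI-M")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")

uri_m = ("https://web.archive.org/web/20180825094239/"
         "https://www.nbcnews.com/nightly-news/video/"
         "pres-trump-s-extended-exclusive-interview-with-lester-holt-"
         "at-the-white-house-941854787582?v=raila")
captured = memento_datetime(uri_m)
print(captured.isoformat())  # 2018-08-25T09:42:39

# The slide gives only "May 2017" for the interview; the day is a placeholder.
interview_month = datetime(2017, 5, 1)
print(captured > interview_month)  # True: the first capture postdates the interview
```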
  • 40. Errors in crawling and playback are hard to distinguish from tampering https://twitter.com/katestarbird/status/911257133231910913 https://er.educause.edu/articles/2018/10/managing-the-cultural-record-in-the-information-warfare-era "I want to explicitly note here the difference between the act of quietly rewriting the record and enjoying the results of the rewrites that are accepted as truth and that of deliberately destroying the confidence of the public (including the scholarly community) by creating compromise, confusion, and ambiguity to suggest that the record cannot be trusted."
  • 41. Disinformation applied to web archives doesn’t necessarily mean you have to insert a specific narrative into the archive. You just need to cast doubt on the archive as our collective memory.
  • 42. We’re unaware of any cases where web archive content has been hacked or faked for any substantive goal. However, web archives are not immune. It’s just that the theater of conflict has yet to expand to include web archives.
  • 43. Twitter then and now http://inventorspot.com/articles/top_ten_twitterati_tweet_above_rest_31806 https://www.vox.com/policy-and-politics/2017/10/19/16504510/ten-gop-twitter-russia
  • 44. Facebook then and now https://twitter.com/Pinboard/status/975013825010458624 https://web.archive.org/web/20090722095954/http://facebook.com/zuck See also: https://www.businessinsider.com/facebook-old-posts-mark-zuckerberg-disappeared-2019-3
  • 45. Gmail then and now http://googlepress.blogspot.com/2004/04/google-gets-message-launches-gmail.html https://www.avanan.com/resources/gmail-exploit-allows-dnc-email-attack
  • 46. Web archives then and soon https://web.archive.org/web/20020601134105/http://www.businessweek.com/technology/content/feb2002/tc20020228_1080.htm ?
  • 47. Why do we expect things to be different for web archives? Our trust model for web archives is still rooted in the 1980s / early 90s.
  • 48. My chronology with Unix. Late 80s: 1 computer, many users (used an X terminal to access Cray, Convex supercomputers). 90s: 1 computer, 1 user (my Sun IPX workstation was the first www.larc.nasa.gov). Now: many computers, 1 user (I’m not even sure how many computers I have access to).
  • 49. From brewster@wais.com Sun Apr 25 00:03:19 1993 Received: from express.larc.nasa.gov by blearg.larc.nasa.gov with SMTP (5.65.2/server2.4) id AA28277; Sun, 25 Apr 93 00:00:26 -0400 Received: from wais.wais.com by express.larc.nasa.gov with SMTP id BA21157 (SMTP/Lite-1.15) for <m.l.nelson@larc.nasa.gov>; Sun, 25 Apr 93 00:00:20 -0400 Received: by wais.wais.com (4.1/SMI-4.1/Brent-911016) id AA14369; Sat, 24 Apr 93 20:47:54 PDT Date: Sat, 24 Apr 93 20:47:54 PDT Message-Id: <9304250347.AA14369@wais.wais.com> From: Brewster Kahle <brewster@wais.com> To: abc@concert.net To: admin@ds.internic.net To: akers@fiddle.oit.unc.edu To: anders@ifi.uio.no To: anders@munin.ub2.lu.se … To: m.l.nelson@LaRC.NASA.GOV … To: root@ds.internic.net To: root@ncgia.ucsb.edu To: root@fiddle.oit.unc.edu To: root@oac.hsc.uth.tmc.edu To: root@samba.acs.unc.edu To: root@spk41.usace.mil To: root@stone.ucs.indiana.edu To: root@sunsite.unc.edu To: root@uniwa.uwa.oz.au To: root@uva.ci.uv.es To: root@nic.funet.fi … WAIS server maintainers, As you probably know through wais-discussion, we are announcing the commercial WAIS server this thursday. There is a big press event and showcase at the WAIS Inc offices. Thank you, everyone, for making it possible for us to pull off a startup company. We are considering running a special price for a limited time for those that know and understand WAIS already. We would like to discuss this with those that might be interested in it, and would like to help us determine how it should work. Most people will continue to use the freeware, and that is fine, this is for those that might be interested in a commercial version. At this time, we will not be discussing the differences between things or other products. Given that the press has started to call and ask for information before hand (to scoop this story, you know the press...), we have had to keep a very quiet profile.
On the other hand, we need the help from all of you. Generally, this is done with a signed non-disclosure basis, but this wont work on the Internet and not in time. What I was thinking was to ask anyone that would like to discuss this, to send an "email non-disclosure" to non-disclosed-waisites-request@wais.com. I wish this weren't so baroque, but you could not believe some of the members of the press I have talked to. If one reporter publishes early, it can spoil things (and get it wrong). (please dont email to me. At this point, my cup floweth over. I will dig out after the showcase!) -brewster TMC->WAIS Inc->AOL->Alexa->IA https://twitter.com/phonedude_mln/status/1105160308866338816
  • 50. (Email from slide 49, repeated.) When computers were $$$, an email to “root” could be expected to be received by someone entrusted with the necessary $$$ to responsibly administer the machine. IOW, “root” was almost always a white hat. It hasn’t been like that for a long time. Web archives are like the Unix mainframes of today. TMC->WAIS Inc->AOL->Alexa->IA https://twitter.com/phonedude_mln/status/1105160308866338816
  • 51. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL How well do you know root@archive.org? As in, could you call/email him right now and expect a response? Our entire national digital preservation strategy is predicated on Brewster Kahle “not being evil”™ If he is leading a 25+ year sleeper cell, we’re doomed.
  • 52. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL How well do you know these roots? Many more: https://en.wikipedia.org/wiki/List_of_Web_archiving_initiatives
  • 53. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL Up until now, we’ve only looked at failures or edge cases in crawling and replay. What about deliberate fakes?
  • 54. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL Cut-n-paste / mashup “fakes” for humor Victorian Photo Collage https://www.metmuseum.org/exhibitions/listings/2010/victorian-photocollage “The Flying Saucer” (1956) https://en.wikipedia.org/wiki/The_Flying_Saucer_(song) https://www.youtube.com/watch?v=XCrn6QXvHLg Brian Williams Raps ‘Gin & Juice’ https://www.youtube.com/watch?v=XlGLhYFrv6w
  • 55. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL More convincing fakes require significant skills, knowledge, and access https://en.wikipedia.org/wiki/Piltdown_Man https://en.wikipedia.org/wiki/Shroud_of_Turin https://www.npr.org/templates/story/story.php?storyId=94461486
  • 56. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL “deep learning” + “fake” = deepfakes https://motherboard.vice.com/en_us/article/7x799b/selling-ai-generated-fake-porn-is-probably-a-good-way-to-get-sued https://motherboard.vice.com/en_us/article/ev5eba/ai-fake-porn-of-friends-deepfakes
  • 57. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL Becoming more mainstream: https://twitter.com/MikaelThalen/status/1090349932266094593 https://deepfakesapp.online/ A “safe for work” example: No longer buried in the dark corners of Reddit:
  • 58. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL “Detecting” deepfakes will happen. “Preventing” deepfakes won’t happen; they’re here to stay: Mementos, even of a fake past, are core to the human condition. “Did you get your precious photos?” “Implants. Those aren't your memories, they're somebody else's. They're Tyrell's niece's.” http://deepemotions.free.fr/theme_1.html Real photos, fake memories: replicants attach significant value to photos, even when they know the memories are fake.
  • 59. CNI Spring 2019 Membership Meeting, 2019-04-09, @phonedude_mln, @WebSciDL Next Thanksgiving dinner, liven up the discussion with your extended family 1. Extract just 0:23—0:26 of the Obama/Peele video 2. Embed in an HTML page 3. Use Javascript to rewrite the banner and browser URL – Datetime: 2016-11-09 – URL: www.whitehouse.gov/totally NotFake 1. Claim the deep state deleted the page from the live webhttps://www.theverge.com/tldr/2018/4/17/17247334/ai-fake-news-video-barack-obama-jordan-peele-buzzfeed https://www.youtube.com/watch?time_continue=43&v=cQ54GDm1eL0#t=0m23s
  • 60. Not just hypothetical.
  • 61. Inserting fakes into real archives. Here’s an actual page in the IA “proving” Brian Williams released “Gin and Juice” in 1992, a full year before Snoop Dogg. John Berlin, MS Thesis, 2018 https://www.youtube.com/watch?v=k3QTcJZdFfs (actual URI-R & URI-M have also been obscured in the video to hide the technique) The content is clearly fake, but it demonstrates that it’s possible to write JavaScript that attacks the archive’s playback capability. It takes an archiving expert to tell the difference.
  • 62. We’ve known about these & other attacks for nearly two years http://labs.rhizome.org/presentations/security.html#/ https://acmccs.github.io/papers/p1741-lernerAT3.pdf https://blog.dshr.org/2017/06/wac2017-security-issues-for-web-archives.html https://ws-dl.blogspot.com/2018/04/2018-05-01-high-fidelity-ms-thesis-to.html
  • 63. There are other ways, presumably still hypothetical, to attack the archives https://twitter.com/internetarchive/status/596768668756774914 https://xkcd.com/538/
  • 64. https://www.theguardian.com/uk-news/2018/sep/05/planes-trains-and-fake-names-the-trail-left-by-skripal-suspects https://www.cnn.com/2018/10/22/middleeast/saudi-operative-jamal-khashoggi-clothes/index.html “Planes, trains and fake names: the trail left by Skripal suspects” “Surveillance footage shows Saudi 'body double' in Khashoggi's clothes after he was killed, Turkish source says” Before you say “that will never happen!” Reminder: agents, dissidents, journalists have all disappeared; they won’t mind adding a librarian/sysadmin to the list
  • 65. I’ve got good news and bad news: Setting up a web archive is neither as difficult nor as expensive as it used to be. OpenWayback, WAIL, pywb, et al. + cloud storage = you can have a web archive running in about the same time it took to generate the Steve Buscemi / Jennifer Lawrence deepfake. https://github.com/iipc/openwayback https://github.com/N0taN3rd/wail https://machawk1.github.io/wail/ https://github.com/webrecorder/pywb
  • 66. Inserting fakes into fake archives breitbart.com/wayback/*/whitehouse.gov/totallyNotFake infowars.com/web/*/whitehouse.gov/totallyNotFake iluv.aynrand.org/*/whitehouse.gov/totallyNotFake InternetResearchAgency.ru/whitehouse.gov/totallyNotFake How well do you know root at these archives? Are they really four different archives, or one root for all of them? What if 99.9% of the time they faithfully replay pages?
  • 67. http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html What if we start off with > (n/2)+1 archives compromised?
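The majority question on slide 67 can be illustrated with a toy vote. This is a minimal sketch assuming a plain strict-majority rule over content digests; it is not the LOCKSS polling protocol, and the function name and digest labels are hypothetical.

```python
from collections import Counter

def majority_digest(digests):
    """Return the digest reported by a strict majority of archives, else None."""
    if not digests:
        return None
    digest, votes = Counter(digests).most_common(1)[0]
    return digest if votes > len(digests) / 2 else None

# Hypothetical digest labels standing in for real content hashes.
honest = "digest-of-original-page"
forged = "digest-of-forged-page"

# Five archives, two compromised: the honest copy still wins the vote.
minority_attack = [honest, honest, honest, forged, forged]
# Five archives, three compromised: the forged copy now wins the vote.
majority_attack = [honest, honest, forged, forged, forged]

print(majority_digest(minority_attack))
print(majority_digest(majority_attack))
```

Under this simplified rule, a vote only protects the original while honest archives outnumber compromised ones. If the compromised archives start in the majority, the same rule endorses the forgery.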
  • 68. What if the archives were targeted to amplify a specific disinformation narrative? And what if the archives had no choice but to cooperate?
  • 69. The University of Farmington is fake. DHS strong-armed a “.edu” registration; they could do the same to IA & others too https://twitter.com/nwarikoo/status/1090726638034276352 https://web.archive.org/web/20161023170733/https://universityoffarmington.edu/ https://twitter.com/phonedude_mln/status/1092464939040755712 First capture: 2016-10-23
  • 70. Blockchain to the rescue!!! <lasers> <sirens> <disco-thumping-soundtrack> nope. https://www.multichain.com/blog/2015/11/avoiding-pointless-blockchain-project/ https://eprint.iacr.org/2017/375.pdf https://blog.dshr.org/search/label/bitcoin
  • 71. There is no shortage of deepfake vs. blockchain stories https://www.wired.com/story/the-blockchain-solution-to-our-deepfake-problems/ https://www.longhash.com/news/the-coming-war-between-deepfakes-and-blockchain
  • 72. A Voight-Kampff Test for deepfakes doesn’t seem that silly now https://twitter.com/TechCrunch/status/1009556795965296642 https://www.technologyreview.com/s/611726/the-defense-department-has-produced-the-first-tools-for-catching-deepfakes/
  • 73. Are we prepared for the unintended consequences? “Enforcing digital signatures for all cameras and video devices would offer the same capability in reverse. Suddenly every photograph and video shared online could be traced back to its original owner. Security services in a repressive regime could scour social media for all videos depicting them in a negative light and trace them back to the precise individuals who captured the video, arresting them en masse.” https://www.forbes.com/sites/kalevleetaru/2018/09/09/why-digital-signatures-wont-prevent-deep-fakes-but-will-help-repressive-governments/
  • 74. On the other hand, “blockchaining” our pets is a study in incompatibility, so tracking photos may never happen https://www.aspca.org/about-us/aspca-policy-and-position-statements/microchips https://moviepaws.com/2017/10/22/owls-snakes-and-unicorns-the-animals-of-blade-runner/ In Blade Runner, synthetic pets had serial numbers (real pets are unavailable to all but the richest). “While most of the world has accepted these standards, North America has not. The primary problem is a competitive, technological one involving the compatibility of the microchips and the readers that are used by shelters and veterinary clinics.”
  • 75. As for blockchains and web archives…
  • 76. This is not what you think it is… https://petertodd.org/2017/carbon-dating-the-internet-archive-with-opentimestamps
  • 77. This is not what you think it is… https://petertodd.org/2017/carbon-dating-the-internet-archive-with-opentimestamps “…right now you can get timestamps for every book, movie, song, computer program, legal document, etc. in the thousands of collections in the archive. In the future we hope to be able to work with the Internet Archive to extend this to timestamping website snapshots…”
  • 78. That’s never going to happen. (at least not 3rd party through the playback interface)
  • 79. Sample 16k+ Mementos from 17 Web Archives
      Archive                    URI-Ms
      ---------------------------------
      perma-archives.org            182
      bibalex.org                   199
      webarchive.org.uk             349
      bac-lac.gc.ca                 351
      proni.gov.uk                  469
      digar.ee                      488
      webharvest.gov                712
      internetmemory.org            979
      nationalarchives.gov.uk       994
      stanford.edu                 1222
      archive-it.org               1383
      archive.is                   1396
      web.archive.org              1566
      arquivo.pt                   1569
      webcitation.org              1585
      vefsafn.is                   1589
      loc.gov                      1594
      ---------------------------------
      Total                       16627
  • 80. Periodically Replay Each Archived Page Above example: http://perma-archives.org/warc/20170101182813/http://umich.edu/ 35 times, from Nov. 2017 – Oct. 2018 For each replay, we download both the rewritten version and the “raw” version (where possible).
  • 81. Periodically Replay Each Archived Page Above example: http://perma-archives.org/warc/20170101182813/http://umich.edu/ 35 times, from Nov. 2017 – Oct. 2018 For each replay, we download both the rewritten version and the “raw” version (where possible). Partial archive outage because of security / maintenance upgrade
  • 82. Periodically Replay Each Archived Page Above example: http://perma-archives.org/warc/20170101182813/http://umich.edu/ 35 times, from Nov. 2017 – Oct. 2018 For each replay, we download both the rewritten version and the “raw” version (where possible). Post-upgrade, replay is variable.
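The rewritten/“raw” distinction on slides 80 to 82 can be sketched as a URI transformation. This assumes the archive follows the Wayback-style convention, used by the Internet Archive's Wayback Machine and pywb, of appending `id_` to the 14-digit timestamp to request the unrewritten capture. Whether a given archive honors it is an assumption (the slides say “where possible”), and `raw_urim` is a hypothetical helper name.

```python
import re

# A Wayback-style URI-M carries a 14-digit datetime as its own path segment.
_TIMESTAMP_SEGMENT = re.compile(r"/(\d{14})/")

def raw_urim(urim):
    """Return the URI-M with `id_` appended to its 14-digit timestamp.

    Assumption: the archive supports the Wayback `id_` modifier.
    """
    rewritten, count = _TIMESTAMP_SEGMENT.subn(r"/\1id_/", urim, count=1)
    if count == 0:
        raise ValueError("no 14-digit timestamp segment found in URI-M")
    return rewritten

example = "http://perma-archives.org/warc/20170101182813/http://umich.edu/"
print(raw_urim(example))
```

Archives that do not use timestamp-in-path URI-Ms (for example, short-identifier URIs such as those of archive.is or perma.cc) raise `ValueError` here, which matches the “where possible” caveat.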
  • 83. More Archived Pages Changed Every Time Than Never Changed (yes, this experiment used “raw” mode). Never changed: 2007 URI-Ms (1 in 8). Always changed: 2773 URI-Ms (1 in 6). Fixity-based approaches, including blockchain, will not work.
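A minimal sketch of the kind of fixity check behind slide 83: hash each replay of a memento and classify the sequence. The choice of SHA-256 and the reading of “always changed” as “every replay differs from the one before it” are assumptions, not details taken from the study. The sample bodies are made up.

```python
import hashlib

def classify_replays(replays):
    """Classify a sequence of replay bodies (bytes) by how their hashes vary."""
    if len(replays) < 2:
        raise ValueError("need at least two replays to compare")
    digests = [hashlib.sha256(body).hexdigest() for body in replays]
    if len(set(digests)) == 1:
        return "never changed"
    # Assumption: "always changed" means each replay differs from the previous one.
    if all(a != b for a, b in zip(digests, digests[1:])):
        return "always changed"
    return "sometimes changed"

# Made-up replay bodies for illustration only.
stable = [b"<html>umich</html>"] * 3
banner = [f"<html>umich <!-- replay {i} --></html>".encode() for i in range(3)]
mixed = [b"a", b"a", b"b"]

# Counts reported on the slide.
TOTAL, NEVER, ALWAYS = 16627, 2007, 2773
print(classify_replays(stable), classify_replays(banner), classify_replays(mixed))
print(f"never: {NEVER / TOTAL:.1%}, always: {ALWAYS / TOTAL:.1%}")
```

With the slide's counts, roughly 12% of the sampled URI-Ms never changed and roughly 17% changed on every replay. Any scheme that stores a single hash and expects later replays to match it would therefore fail for a large share of mementos.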
  • 84. “Hash the screen shot, not the HTML!” That doesn’t work either.
  • 85. 1 WARC file, 2 Wayback Machines, 3 Browsers = 6 different replays http://wayback.archive-it.org/all/20130106140348/http://www.harvard.edu/ http://web.archive.org/web/20130106140348/http://www.harvard.edu/ see also: https://ws-dl.blogspot.com/2016/12/2016-12-20-archiving-pages-with.html
  • 86. Why not create a LOCKSS for web archives?
  • 87. Web archives are not especially interoperable. There are many issues regarding interoperability, but generational loss is a good demonstration of incompatible assumptions about simulating the past.
  • 88. https://web.archive.org/web/20180501125952/https:/twitter.com/phonedude_mln/status/990054945457147904
  • 89. http://archive.is/PaKx6
  • 90. https://perma.cc/3HMS-TB59
  • 91. http://www.webcitation.org/77RhNeyoZ
  • 92. https://web.archive.org/web/20190407024654/https://perma.cc/3HMS-TB59
  • 93. https://web.archive.org/web/20190407031659/http://www.webcitation.org/77RhNeyoZ
  • 94. Web archiving interoperability: a metaphor (non-synthetic pets, possibly microchipped) https://www.youtube.com/watch?v=SQudKvrwDAU
  • 95. To summarize: Existing, trusted archives can be compromised by: 1) crawling malicious pages, 2) attacking facilities / personnel, or 3) court orders. Lowered resource threshold for archives allows: 1) “long game” archives: faithful now, corrupt later, 2) “sock puppet” archives: surreptitiously cooperating archives. The nature of web archives is to change content – current fixity-based approaches will not help.
  • 96. Looking forward: We need new models for web archiving and verifying authenticity. The Heritrix / Wayback Machine technology stack, while successful, has limited our thinking.
  • 97. “Studies generally suggest that, year after year, less than 60 percent of web traffic is human; … For a period of time in 2013, the Times reported this year, a full half of YouTube traffic was “bots masquerading as people,” a portion so high that employees feared an inflection point after which YouTube’s systems for detecting fraudulent traffic would begin to regard bot traffic as real and human traffic as fake. They called this hypothetical event “the Inversion.”” http://nymag.com/intelligencer/2018/12/how-much-of-the-internet-is-fake.html Robots outnumber humans 10:1 in sessions, 5:4 in HTTP connections in the IA, ca. 2012 http://arxiv.org/abs/1309.4016 https://giphy.com/gifs/harrison-ford-blade-runner-sean-young-yjB2fwqjv5rry/media
  • 98. I suspect the core of the new model will have a lot in common with click farms https://twitter.com/mbrennanchina/status/1072114511212109824
  • 99. Record what we saw at crawl time as a baseline; then we need a distance measure between crawl time and replay time http://dx.doi.org/10.5210/fm.v22i112.8097 https://ws-dl.blogspot.com/2013/05/2013-05-25-game-walkthroughs-as.html Documenting instead of archiving… 1) Robotic witnesses 2) New Nielsen families
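The slide calls for a distance measure between crawl time and replay time without naming one. As a purely illustrative stand-in, the sketch below uses Jaccard distance over word shingles of the extracted page text. The measure, the shingle size, and the sample strings are all assumptions, not the authors' proposal.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in text (one short shingle if text is brief)."""
    words = text.split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard_distance(baseline, replay, k=3):
    """0.0 means identical shingle sets; 1.0 means nothing in common."""
    a, b = shingles(baseline, k), shingles(replay, k)
    return 1.0 - len(a & b) / len(a | b)

# Made-up text standing in for what a crawler recorded and what was later replayed.
crawl_time = "the quick brown fox jumps over the lazy dog"
faithful_replay = "the quick brown fox jumps over the lazy dog"
altered_replay = "the quick brown fox jumps over the sleeping cat"

print(jaccard_distance(crawl_time, faithful_replay))
print(jaccard_distance(crawl_time, altered_replay))
```

Unlike a hash comparison, a distance of this kind degrades gradually. A replay that differs only in an archive banner scores near 0, while a substituted page scores near 1, and the threshold separating acceptable from suspicious would have to be chosen empirically.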
  • 100. Some of you might be thinking “but I don’t like Blade Runner – what can I take away from this talk?” (my wife refers to the film as “serious white guys talking”) Two methods for passing the Voight-Kampff Test for Blade Runner fandom
  • 101. 1) Is Deckard a replicant? In the book, he’s definitely human. In the seven (!) versions of the movie, it ranges from “ambiguous” to “replicant”. https://moviepaws.com/2017/10/22/owls-snakes-and-unicorns-the-animals-of-blade-runner/ https://en.wikipedia.org/wiki/Themes_in_Blade_Runner https://en.wikipedia.org/wiki/Blade_Runner#Versions
  • 102. 2) “Tears in Rain” – Greatest monologue in sci-fi? Or greatest monologue of all time? I've seen things you people wouldn't believe. Attack ships on fire off the shoulder of Orion. I watched C-beams glitter in the dark near the Tannhäuser Gate. All those moments will be lost in time, like tears in rain. Time to die. https://www.youtube.com/watch?v=9hDo80ddn4Q https://en.wikipedia.org/wiki/Tears_in_rain_monologue https://www.youtube.com/watch?v=BM54jXndyvQ
  • 103. 2) “Tears in Rain” – Greatest monologue in sci-fi? Or greatest monologue of all time? I've crawled things you people wouldn't believe. Clickjacking attacks off the x-frame-options: sameorigin. I watched ajax requests redirect at the aggregator TimeGate. All those pages will be lost in time, like tears in rain. Time to lie. https://www.youtube.com/watch?v=9hDo80ddn4Q https://en.wikipedia.org/wiki/Tears_in_rain_monologue https://www.youtube.com/watch?v=BM54jXndyvQ