Herbert Van de Sompel
Een web van webarchieven (A Web of Web Archives), Hilversum, the Netherlands, 17 Nov 2016
Infrastructure for Collaborating Web Archives
Herbert Van de Sompel
LANL & DANS
@hvdsomp
http://mementoweb.org/about/
http://timetravel.mementoweb.org
Outline
• Having Many Web Archives is a Good Thing ™
• Web Archive Interoperability
• Memento
• Towards Increased Interoperability
• Infrastructure for Web Archive Collaboration
• Aggregator
• Aggregator Services
• Aggregator APIs
• If You Build It Will They Come?
Having Many Web Archives is a Good Thing ™
Capture of http://webcitation.org dated July 17 2013
https://archive.today/eAETp
Having Many Web Archives is a Good Thing ™
Remnant of discontinued web archive http://mummify.it captured on February 14 2014
https://web.archive.org/web/20140214233752/https://www.mummify.it/
Having Many Web Archives is a Good Thing ™
Capture of http://webcitation.org dated August 6 2014
Having Many Web Archives is a Good Thing ™
http://arstechnica.com/business/2013/11/fire-at-internet-archive-destroys-equipment-and-materials-but-data-safe/
Having Many Web Archives is a Good Thing ™
http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-islamic-state-video/510074.html
Having Many Web Archives is a Good Thing ™
http://www.independent.co.uk/news/uk/politics/tories-deleted-past-broken-promises-from-party-website-8937435.html
Speeches not accessible in IA
Available in other web archives
Having Many Web Archives is a Good Thing ™
Captures of http://vk.com/strelkov_info
17 July 2014 15:22:22: http://web.archive.org/web/20140717152222/http://vk.com/strelkov_info
17 July 2014 17:06:51: https://archive.today/XFFAj
The claim of responsibility for downing what Strelkov thought to be a Ukrainian military transport plane, but was in fact MH17, was removed.
But an Even Better Thing if They Collaborate
Julien Masanes' vision of a global grid of web archives:
"Such a grid should link Web archives so that they together form one global navigation space like the live Web itself. This is only possible if they are structured in a way close enough to the original Web and if they are openly accessible."
J. Masanes. Web Archiving. Springer-Verlag, 2006
Memento Did Just That. And More.
2009
• Memento observation:
  • Web resources exist in the eternal now.
  • Prior versions of resources exist in web archives and resource versioning systems.
  • The current resource and its prior versions live disconnected lives.
• How to interconnect current and prior versions of resources across distributed web servers, web archives, and resource versioning systems?
Herbert Van de Sompel, Michael L. Nelson, and Robert Sanderson (2013) RFC 7089 Memento
http://mementoweb.org/guide/rfc/
Original Resource and Mementos
Bridge from Present to Past
Bridge from Past to Present
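In RFC 7089 both bridges are expressed as HTTP Link headers: an original resource can point forward to a TimeGate with rel="timegate", and a Memento points back to its original resource with rel="original". The following is a minimal sketch of reading those relations from a header value. The helper name `parse_link_header` and the example header are illustrative assumptions, and the parser assumes double-quoted parameter values rather than covering the full Link header grammar.

```python
import re

# One link-value: <target-uri> followed by its ";"-separated parameters.
# Simplification (assumption): every parameter value is double-quoted.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)')
_PARAM = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def parse_link_header(header: str) -> dict[str, list[str]]:
    """Map each relation type (e.g. 'timegate', 'original') to its target URIs."""
    relations: dict[str, list[str]] = {}
    for uri, raw_params in _LINK_VALUE.findall(header):
        params = dict(_PARAM.findall(raw_params))
        # A rel value may hold several space-separated relation types.
        for rel in params.get("rel", "").split():
            relations.setdefault(rel.lower(), []).append(uri)
    return relations


# Illustrative header, not copied from a real response.
example = (
    '<http://www.stanford.edu/>; rel="original", '
    '<http://timetravel.mementoweb.org/timegate/http://www.stanford.edu/>; rel="timegate"'
)
links = parse_link_header(example)
print(links["original"])
print(links["timegate"])
```

A client that holds a Memento can follow `links["original"]` back to the present, and a client that holds the original resource can follow `links["timegate"]` into the past.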
Memento: Access Versions via the Original URI and a Datetime
[Figure labels: Today; Select Date; Nov 17 2014; Apr 1 2014; archive.is]
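Under RFC 7089, the desired datetime travels in an `Accept-Datetime` request header, formatted as an HTTP-date in GMT, and is sent to a TimeGate for the original URI. A minimal sketch of preparing such a request follows. The function names are illustrative, the TimeGate URI in the usage lines is an assumed example, and nothing is sent over the network.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime(moment: datetime) -> str:
    """Format a datetime as an HTTP-date in GMT, as Accept-Datetime requires.

    Note: a naive datetime is interpreted as local time by astimezone().
    """
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def timegate_headers(moment: datetime) -> dict[str, str]:
    """Request headers for datetime negotiation with a TimeGate."""
    return {"Accept-Datetime": accept_datetime(moment)}


# Assumed example: ask for the version of a page around 1 April 2014.
timegate_uri = "http://timetravel.mementoweb.org/timegate/http://www.stanford.edu/"
headers = timegate_headers(datetime(2014, 4, 1, tzinfo=timezone.utc))
print(timegate_uri, headers)
```

Passing `timegate_uri` and `headers` to any HTTP client would perform the negotiation; the TimeGate is expected to answer with a redirect to the Memento closest to the requested datetime.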
Memento for Chrome
http://bit.ly/memento-for-chrome
Tools for Server-Side Memento Support
• OpenWayback
• pywb
• Memento TimeGate server
• Bridge between a homegrown versioning API and the Memento protocol
• MediaWiki Memento extensions
• Linked Data Fragments server
Memento Tools: http://mementoweb.org/tools/
Can’t Please Everyone
An anonymous reviewer of our submission for WWW 2010:
"Is there any statistics to show that many or a good number of Web users would like to get obsolete data or resources?"
Herbert Van de Sompel, Michael L. Nelson, et al. (2009) Memento: Time Travel for the Web
http://arxiv.org/abs/0911.1112
Raw Mementos
Shawn Jones (2016) Mementos in the Raw, Take Two
http://ws-dl.blogspot.nl/2016/08/2016-08-15-mementos-in-raw-take-two.html
Verifying Authenticity of Mementos
Ongoing research at Old Dominion University & LANL
Original Resource Provides timegate Link
• Resource version control systems
• Servers with a dedicated web archive
• Servers with a preference for a specific web archive
Original Resource Provides No timegate Link – Client Intelligence
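The two cases can be combined in one client-side decision: use the TimeGate advertised by the original resource when there is one, and otherwise fall back to a default TimeGate chosen by the client. The sketch below assumes the fallback is the Time Travel aggregator's `/timegate/` interface. The function name `choose_timegate`, the simplified regular expression, and the example.org URIs are illustrative assumptions.

```python
import re

# Assumed default: the Time Travel aggregator's TimeGate prefix.
AGGREGATOR_TIMEGATE = "http://timetravel.mementoweb.org/timegate/"

# Finds a link-value whose rel parameter includes "timegate".
# Simplification (assumption): rel is double-quoted and targets contain no "<".
_TIMEGATE_LINK = re.compile(
    r'<([^>]*)>[^<]*?\brel\s*=\s*"[^"]*\btimegate\b[^"]*"', re.IGNORECASE
)


def choose_timegate(original_uri, link_header=None, fallback_prefix=AGGREGATOR_TIMEGATE):
    """Prefer the TimeGate named by the original resource, else use the fallback."""
    if link_header:
        match = _TIMEGATE_LINK.search(link_header)
        if match:
            return match.group(1)
    # No timegate link was provided, so the client supplies its own choice.
    return fallback_prefix + original_uri


print(choose_timegate("http://www.stanford.edu/"))
print(choose_timegate(
    "http://example.org/page",
    '<http://example.org/page>; rel="original", <http://example.org/tg/page>; rel="timegate"',
))
```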
Memento Aggregator
LANL Memento Aggregator
• Official service of the LANL Research Library
• Currently covers 23 archives (web and linked data): archive.today, Archive-It, Bibliotheca Alexandrina Web Archive, DBpedia archive, DBpedia Triple Pattern Fragments archive, Canadian Government Web Archive, Croatian Web Archive, Estonian Web Archive, Icelandic web archive, Internet Archive, Library of Congress Web Archive, NARA Web Archive, National Library of Ireland Web Archive, perma.cc, Portuguese Web Archive, PRONI Web Archive, Slovenian Web Archive, Stanford Web Archive, UK Government Web Archive, UK Parliament's Web Archive, UK Web Archive, Web Archive Singapore, WebCite
• LANL Aggregator software is not available, but see MemGator
Archives covered by LANL Memento Aggregator: http://mementoweb.org/depot/
MemGator: https://github.com/oduwsdl/memgator
Memento Aggregator Challenges
• Polling of many distributed archives:
  • Slow
  • Load on aggregator and archives
• Approaches:
  • Batch collecting and caching of archival coverage of popular URIs in all archives
  • Summarization of archives (based on CDX files and/or search)
  • Machine learning of URI patterns for archives
Sawood Alam, Michael L. Nelson, et al. (2016) Web archive profiling through fulltext search
https://doi.org/10.1007/978-3-319-43997-6_10
Sawood Alam, Michael L. Nelson, et al. (2016) Web archive profiling through CDX summarization
https://doi.org/10.1007/s00799-016-0184-4
Nicholas Bornand, Herbert Van de Sompel, et al. (2016) Routing Memento Requests Using Binary Classifiers
https://doi.org/10.1145/2910896.2910899
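The summarization idea can be pictured as a small profile per archive that the aggregator consults before polling, so that a request only goes to archives likely to hold the URI. The sketch below is a toy illustration of that routing step and not the method of the cited papers: the profiles are hypothetical, and keying them on host suffixes is an assumption made for brevity.

```python
from urllib.parse import urlsplit

# Hypothetical profiles: host suffixes each archive is believed to hold.
# Real profiles would be derived from CDX files or search, per the cited work.
PROFILES = {
    "UK Web Archive": {"uk"},
    "Icelandic web archive": {"is"},
    "Internet Archive": {""},  # empty suffix: assume it may hold anything
}


def host_suffixes(uri: str) -> set[str]:
    """Return every dot-separated suffix of the URI's host, plus the empty suffix."""
    host = (urlsplit(uri).hostname or "").lower()
    labels = host.split(".") if host else []
    return {".".join(labels[i:]) for i in range(len(labels))} | {""}


def archives_to_poll(uri: str, profiles: dict[str, set[str]] = PROFILES) -> list[str]:
    """Select only the archives whose profile overlaps the URI's host suffixes."""
    suffixes = host_suffixes(uri)
    return sorted(name for name, keys in profiles.items() if keys & suffixes)


print(archives_to_poll("http://www.bl.uk/"))
print(archives_to_poll("http://www.stanford.edu/"))
```

The trade-off is the one named on the slide: a coarse profile keeps the aggregator fast and spares the archives, at the risk of skipping an archive that does hold a capture.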
Basic Aggregator Services
• Exposes TimeGates and TimeMaps that reach across all web archives covered by the Aggregator
Time Travel Services
http://timetravel.mementoweb.org/
Time Travel Find
http://timetravel.mementoweb.org/list/20120428045424/http://www.stanford.edu/
Time Travel Reconstruct
http://timetravel.mementoweb.org/reconstruct/20120428045424/http://www.stanford.edu/
Time Travel APIs
http://timetravel.mementoweb.org/guide/api/
URI that Redirects to a Memento
http://timetravel.mementoweb.org/memento/20120428045424/http://www.stanford.edu/
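The URIs shown for the memento, list, and reconstruct interfaces share one pattern: the service name, a 14-digit timestamp, and the original URI. A minimal sketch of building them follows. The function name `time_travel_uri` is illustrative, and treating the timestamp as UTC is an assumption to verify against the API guide.

```python
from datetime import datetime, timezone

BASE = "http://timetravel.mementoweb.org"
# Service path segments that appear in the slides' example URIs.
SERVICES = ("memento", "list", "reconstruct")


def time_travel_uri(service: str, moment: datetime, original_uri: str, base: str = BASE) -> str:
    """Build <base>/<service>/<YYYYMMDDhhmmss>/<original URI>."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    # Assumption: the 14-digit timestamp is expressed in UTC.
    stamp = moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{base}/{service}/{stamp}/{original_uri}"


moment = datetime(2012, 4, 28, 4, 54, 24, tzinfo=timezone.utc)
print(time_travel_uri("memento", moment, "http://www.stanford.edu/"))
```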
URI that Redirects to a JSON Description of a Memento
http://timetravel.mementoweb.org/api/json/20100428103432/http://stanford.edu
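A client would typically read the closest Memento out of the JSON description. The sketch below works on an embedded sample rather than a live response. The field names (`mementos`, `closest`, `datetime`, `uri`) are an assumption about the response layout that should be checked against the API guide, and the sample values, including the archive.example URI, are made-up placeholders.

```python
import json
from datetime import datetime, timezone

# Made-up sample; the layout is an assumption, not a recorded API response.
SAMPLE = """
{
  "original_uri": "http://stanford.edu",
  "mementos": {
    "closest": {
      "datetime": "2010-04-28T10:34:32Z",
      "uri": ["http://archive.example/20100428103432/http://stanford.edu"]
    }
  }
}
"""


def closest_memento(description: dict) -> tuple[datetime, str]:
    """Return (capture datetime in UTC, first URI) of the closest Memento."""
    closest = description["mementos"]["closest"]
    when = datetime.strptime(closest["datetime"], "%Y-%m-%dT%H:%M:%SZ")
    return when.replace(tzinfo=timezone.utc), closest["uri"][0]


when, uri = closest_memento(json.loads(SAMPLE))
print(when.isoformat(), uri)
```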
JSON Format for TimeMaps
http://mementoweb.org/guide/timemap-json/
DIY TimeMap - Index TimeMap Lists Potential TimeMap URIs
http://timetravel.mementoweb.org/timemap/json/http://stanford.edu
SPEED
WDI TimeMap – Index TimeMap with Full Coverage
http://labs.mementoweb.org/timemap/link/http://stanford.edu
COVERAGE
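A TimeMap in link format, as served under `/timemap/link/`, lists one entry per Memento in the form `<uri>; rel="memento"; datetime="..."` (RFC 7089). The following is a minimal sketch of extracting those entries. The function name is illustrative, the sample TimeMap is hand-written for illustration (its second entry uses a placeholder archive), and the parser assumes double-quoted parameter values.

```python
import re
from email.utils import parsedate_to_datetime

_ENTRY = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*"[^"]*")*)')
_PARAM = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def mementos_in_timemap(timemap: str) -> list:
    """Return (datetime, uri) pairs for every memento entry, oldest first."""
    found = []
    for uri, raw_params in _ENTRY.findall(timemap):
        params = dict(_PARAM.findall(raw_params))
        # rel may combine several types, e.g. "first memento".
        if "memento" in params.get("rel", "").split() and "datetime" in params:
            found.append((parsedate_to_datetime(params["datetime"]), uri))
    return sorted(found)


# Hand-written sample, not a recorded response.
sample = (
    '<http://stanford.edu/>; rel="original",\n'
    '<http://archive.example/20120428045424/http://stanford.edu/>; '
    'rel="memento"; datetime="Sat, 28 Apr 2012 04:54:24 GMT",\n'
    '<http://arquivo.pt/wayback/20120127040929/http://stanford.edu/>; '
    'rel="first memento"; datetime="Fri, 27 Jan 2012 04:09:29 GMT"'
)
for when, uri in mementos_in_timemap(sample):
    print(when.isoformat(), uri)
```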
Time Travel Archive Registry
http://labs.mementoweb.org/aggregator_config/archivelist.xml
Time Travel Infrastructure Use, October 2016
TimeTravel Interface    Use
/api/                   1,404,985
/timegate/              54,007
/list/                  744,484
/memento/               1,563,278
oldweb.today
http://oldweb.today/nsmac4/20001115150435/http://www.stanford.edu
arquivo.pt
http://arquivo.pt/wayback/20120127040929/http://stanford.edu/
Link to Reconstruct
TimeTravel Reconstruct
http://timetravel.mementoweb.org/reconstruct/20120127040929/http://stanford.edu/
British Library Memento Service
http://www.webarchive.org.uk/mementos/search/http://www.stanford.edu
#icanhazmemento
http://ws-dl.blogspot.nl/2015/07/2015-07-22-i-can-haz-memento.html
#icanhazmemento
http://timetravel.mementoweb.org/list/20161116101831/http://signposting.org/adopters
Robust Links
• Decorate links to allow retrieving Mementos close to the link date or from a specific archive
• In combination with the Time Travel API, this yields links, provided client or server side, that circumvent link rot and content drift
Robust Links Specification
http://robustlinks.mementoweb.org/spec/
DO:
<a href="http://archive.is/FAy6o"
   data-originalurl="http://www.stanford.edu"
   data-versiondate="2014-08-15">
DO:
<a href="http://www.stanford.edu"
   data-versiondate="2014-08-15">
DON'T:
<a href="http://archive.is/FAy6o">
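A page generator could emit either of the two recommended forms from the same inputs. The sketch below renders a complete anchor element with the attributes shown on the slide. The function name `robust_link` and the link text are illustrative, and the snippet does not cover the rest of the Robust Links specification.

```python
from html import escape


def robust_link(text, original_url, version_date, snapshot_url=None):
    """Render an <a> element decorated as in the two recommended forms.

    With a snapshot URL, href points at the snapshot and data-originalurl keeps
    the original; without one, href is the original URL itself.
    """
    attributes = [("href", snapshot_url or original_url)]
    if snapshot_url:
        attributes.append(("data-originalurl", original_url))
    attributes.append(("data-versiondate", version_date))
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attributes
    )
    return f"<a {rendered}>{escape(text)}</a>"


print(robust_link("Stanford", "http://www.stanford.edu", "2014-08-15",
                  snapshot_url="http://archive.is/FAy6o"))
print(robust_link("Stanford", "http://www.stanford.edu", "2014-08-15"))
```

Keeping the original URL and the version date in every link is what lets a client or server fall back to a Time Travel lookup when either the live page or the chosen snapshot is gone.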
Robust Links – robustify.js
Rene Voorburg (2014) robustify.js
https://github.com/renevoorburg/robustify.js
Robust Links – robustlinks.js
Herbert Van de Sompel and Michael L. Nelson (2015) Reminiscing about 15 years of interoperability efforts.
https://dx.doi.org/10.1045/november2015-vandesompel

Weitere ähnliche Inhalte

Was ist angesagt?

COST_Workshop_Louvain_la_Neuve_05_04_13_Presentation_Biblissima
COST_Workshop_Louvain_la_Neuve_05_04_13_Presentation_BiblissimaCOST_Workshop_Louvain_la_Neuve_05_04_13_Presentation_Biblissima
COST_Workshop_Louvain_la_Neuve_05_04_13_Presentation_Biblissima
SGehrke
 
Joining forces with Wikipedia reasons, experiences and impact - Sharing is Ca...
Joining forces with Wikipedia reasons, experiences and impact - Sharing is Ca...Joining forces with Wikipedia reasons, experiences and impact - Sharing is Ca...
Joining forces with Wikipedia reasons, experiences and impact - Sharing is Ca...
Olaf Janssen
 
IFLA 2012 OCLC Linked Data Roundtable
IFLA 2012 OCLC Linked Data RoundtableIFLA 2012 OCLC Linked Data Roundtable
IFLA 2012 OCLC Linked Data Roundtable
geckomarma
 

Was ist angesagt? (19)

Between Public Domain and Private Funding. Public Private Partnerships for C...
Between Public Domain and Private Funding. Public Private Partnerships for C...Between Public Domain and Private Funding. Public Private Partnerships for C...
Between Public Domain and Private Funding. Public Private Partnerships for C...
 
Open Cultural Heritage Data @ the Rijksmuseum
Open Cultural Heritage Data @ the RijksmuseumOpen Cultural Heritage Data @ the Rijksmuseum
Open Cultural Heritage Data @ the Rijksmuseum
 
Open Rijksmuseum Data : Challenges and Opportunities
Open Rijksmuseum Data : Challenges and OpportunitiesOpen Rijksmuseum Data : Challenges and Opportunities
Open Rijksmuseum Data : Challenges and Opportunities
 
COST_Workshop_Louvain_la_Neuve_05_04_13_Presentation_Biblissima
COST_Workshop_Louvain_la_Neuve_05_04_13_Presentation_BiblissimaCOST_Workshop_Louvain_la_Neuve_05_04_13_Presentation_Biblissima
COST_Workshop_Louvain_la_Neuve_05_04_13_Presentation_Biblissima
 
From Open Acces to Open Collections to Open Minds
From Open Acces to Open Collections to Open MindsFrom Open Acces to Open Collections to Open Minds
From Open Acces to Open Collections to Open Minds
 
OpenAIRE workshop: Beyond APCs - Klaudia Grabowska (IBL PAN, Poland)
OpenAIRE workshop: Beyond APCs - Klaudia Grabowska (IBL PAN, Poland)OpenAIRE workshop: Beyond APCs - Klaudia Grabowska (IBL PAN, Poland)
OpenAIRE workshop: Beyond APCs - Klaudia Grabowska (IBL PAN, Poland)
 
Challenges of the Rijksmuseum Research Library
Challenges of the Rijksmuseum Research LibraryChallenges of the Rijksmuseum Research Library
Challenges of the Rijksmuseum Research Library
 
Archiving The Social Media Presence of The River-side
Archiving The Social Media Presence of The River-sideArchiving The Social Media Presence of The River-side
Archiving The Social Media Presence of The River-side
 
GLAMorous LOD and ResearchSpace introduction
GLAMorous LOD and ResearchSpace introductionGLAMorous LOD and ResearchSpace introduction
GLAMorous LOD and ResearchSpace introduction
 
UKNOF update at SANOG24
UKNOF update at SANOG24UKNOF update at SANOG24
UKNOF update at SANOG24
 
HPC in the cloud comes of age - Red Oak HPC Seminar
HPC in the cloud comes of age - Red Oak HPC SeminarHPC in the cloud comes of age - Red Oak HPC Seminar
HPC in the cloud comes of age - Red Oak HPC Seminar
 
Joining forces with Wikipedia reasons, experiences and impact - Sharing is Ca...
Joining forces with Wikipedia reasons, experiences and impact - Sharing is Ca...Joining forces with Wikipedia reasons, experiences and impact - Sharing is Ca...
Joining forces with Wikipedia reasons, experiences and impact - Sharing is Ca...
 
Datahub for museums (poster)
Datahub for museums (poster)Datahub for museums (poster)
Datahub for museums (poster)
 
Cooperating with Google
Cooperating with GoogleCooperating with Google
Cooperating with Google
 
Austrian Books Online
Austrian Books OnlineAustrian Books Online
Austrian Books Online
 
Working with News Data across different media: A workshop
 Working with News Data across different media: A workshop Working with News Data across different media: A workshop
Working with News Data across different media: A workshop
 
John Garraway
John GarrawayJohn Garraway
John Garraway
 
Austrian National Library Vision 2025 and Austrian Books Online
Austrian National Library Vision 2025 and Austrian Books OnlineAustrian National Library Vision 2025 and Austrian Books Online
Austrian National Library Vision 2025 and Austrian Books Online
 
IFLA 2012 OCLC Linked Data Roundtable
IFLA 2012 OCLC Linked Data RoundtableIFLA 2012 OCLC Linked Data Roundtable
IFLA 2012 OCLC Linked Data Roundtable
 

Andere mochten auch (14)

Scholarly archive-of-the-future
Scholarly archive-of-the-futureScholarly archive-of-the-future
Scholarly archive-of-the-future
 
Did You Know - June 2016
Did You Know - June 2016Did You Know - June 2016
Did You Know - June 2016
 
Transformación del esfuerzo - problema
Transformación  del  esfuerzo - problemaTransformación  del  esfuerzo - problema
Transformación del esfuerzo - problema
 
Did you know newsletter eau suppl 2015
Did you know newsletter eau suppl 2015Did you know newsletter eau suppl 2015
Did you know newsletter eau suppl 2015
 
SBS Brochure
SBS BrochureSBS Brochure
SBS Brochure
 
Did you know N°1 2014
Did you know  N°1 2014 Did you know  N°1 2014
Did you know N°1 2014
 
Lgpl license
Lgpl licenseLgpl license
Lgpl license
 
Lgpl license
Lgpl licenseLgpl license
Lgpl license
 
Lgpl license
Lgpl licenseLgpl license
Lgpl license
 
Report e-Commerce
Report e-CommerceReport e-Commerce
Report e-Commerce
 
Manoj_4yearExperienced_SoftwareConfigurationManagement & CI & Build and release
Manoj_4yearExperienced_SoftwareConfigurationManagement & CI & Build and releaseManoj_4yearExperienced_SoftwareConfigurationManagement & CI & Build and release
Manoj_4yearExperienced_SoftwareConfigurationManagement & CI & Build and release
 
Ethicak hacking
Ethicak hackingEthicak hacking
Ethicak hacking
 
SBS Company Brochure
SBS Company BrochureSBS Company Brochure
SBS Company Brochure
 
Did you know Newsletter September 2015
Did you know Newsletter September 2015Did you know Newsletter September 2015
Did you know Newsletter September 2015
 

Ähnlich wie Collaborating web archives - Herbert van de Sompel

Archiving Web-Based #musetech for Institutional Memory
Archiving Web-Based #musetech for Institutional MemoryArchiving Web-Based #musetech for Institutional Memory
Archiving Web-Based #musetech for Institutional Memory
Samantha Norling
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
Herbert Van de Sompel
 
The development of web archiving 3
The development of web archiving 3The development of web archiving 3
The development of web archiving 3
Essam Obaid
 
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
Micah Altman
 

Ähnlich wie Collaborating web archives - Herbert van de Sompel (20)

Archiving Web-Based #musetech for Institutional Memory
Archiving Web-Based #musetech for Institutional MemoryArchiving Web-Based #musetech for Institutional Memory
Archiving Web-Based #musetech for Institutional Memory
 
PID Signposting Pattern
PID Signposting PatternPID Signposting Pattern
PID Signposting Pattern
 
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDTDBpedia Archive using Memento, Triple Pattern Fragments, and HDT
DBpedia Archive using Memento, Triple Pattern Fragments, and HDT
 
High and Lows of Library Linked Data
High and Lows of Library Linked DataHigh and Lows of Library Linked Data
High and Lows of Library Linked Data
 
Web archiving challenges and opportunities
Web archiving challenges and opportunitiesWeb archiving challenges and opportunities
Web archiving challenges and opportunities
 
Archive-It: Scaling Beyond a Billion Archival Webpages - Aaron Binns
Archive-It: Scaling Beyond a Billion Archival Webpages - Aaron BinnsArchive-It: Scaling Beyond a Billion Archival Webpages - Aaron Binns
Archive-It: Scaling Beyond a Billion Archival Webpages - Aaron Binns
 
The development of web archiving 3
The development of web archiving 3The development of web archiving 3
The development of web archiving 3
 
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
SAFETY NETS: RESCUE AND REVIVAL FOR ENDANGERED BORN-DIGITAL RECORDS- Program ...
 
Tools for Managing the Past Web
Tools for Managing the Past WebTools for Managing the Past Web
Tools for Managing the Past Web
 
November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Inter...
November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Inter...November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Inter...
November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Inter...
 
Slides for Web Archiving in the Heritage and Archive Sectors
Slides for Web Archiving in the Heritage and Archive SectorsSlides for Web Archiving in the Heritage and Archive Sectors
Slides for Web Archiving in the Heritage and Archive Sectors
 
Why libraries should embrace Linked Data
Why libraries should embrace Linked DataWhy libraries should embrace Linked Data
Why libraries should embrace Linked Data
 
Omeka s workshopdcmi
Omeka s workshopdcmiOmeka s workshopdcmi
Omeka s workshopdcmi
 
Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked Data
 
SiteStory 2013
SiteStory  2013SiteStory  2013
SiteStory 2013
 
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data VocabulariesIsaac - W3C Data on the Web Best Practices - Data Vocabularies
Isaac - W3C Data on the Web Best Practices - Data Vocabularies
 
Progress Made and Lessons Learned through Collaborative Web Archiving Proj...
Progress Made and Lessons Learned through Collaborative Web Archiving Proj...Progress Made and Lessons Learned through Collaborative Web Archiving Proj...
Progress Made and Lessons Learned through Collaborative Web Archiving Proj...
 
Spotify architecture - Pressing play
Spotify architecture - Pressing playSpotify architecture - Pressing play
Spotify architecture - Pressing play
 
Breaking Up with MARC 2016 LITD Conference (03.11.2016)
Breaking Up with MARC   2016 LITD Conference (03.11.2016)Breaking Up with MARC   2016 LITD Conference (03.11.2016)
Breaking Up with MARC 2016 LITD Conference (03.11.2016)
 
SWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic WebSWIB14 Weaving repository contents into the Semantic Web
SWIB14 Weaving repository contents into the Semantic Web
 

Mehr von Netwerk Digitaal Erfgoed

Mehr von Netwerk Digitaal Erfgoed (20)

Eppo van Nispen: Opening Words World Digital Preservation Day
Eppo van Nispen: Opening Words World Digital Preservation DayEppo van Nispen: Opening Words World Digital Preservation Day
Eppo van Nispen: Opening Words World Digital Preservation Day
 
Valerie Johnson: Supporting the Archives Sector via Collaboration
Valerie Johnson: Supporting the Archives Sector via CollaborationValerie Johnson: Supporting the Archives Sector via Collaboration
Valerie Johnson: Supporting the Archives Sector via Collaboration
 
Simon Tanner: Teaching Digital Preservation at King's College London
Simon Tanner: Teaching Digital Preservation at King's College LondonSimon Tanner: Teaching Digital Preservation at King's College London
Simon Tanner: Teaching Digital Preservation at King's College London
 
Sharon McMeekin: Are we Making Progress in Digital Preservation Training?
Sharon McMeekin: Are we Making Progress in Digital Preservation Training?Sharon McMeekin: Are we Making Progress in Digital Preservation Training?
Sharon McMeekin: Are we Making Progress in Digital Preservation Training?
 
Sarah Higgins: Challenges in Educating Digital Curation
Sarah Higgins: Challenges in Educating Digital CurationSarah Higgins: Challenges in Educating Digital Curation
Sarah Higgins: Challenges in Educating Digital Curation
 
Erika Hokke: Stichting Archief Publicaties Annual
Erika Hokke: Stichting Archief Publicaties AnnualErika Hokke: Stichting Archief Publicaties Annual
Erika Hokke: Stichting Archief Publicaties Annual
 
Rosemary Lynch: the DigCurv Curriculum Framework
Rosemary Lynch: the DigCurv Curriculum FrameworkRosemary Lynch: the DigCurv Curriculum Framework
Rosemary Lynch: the DigCurv Curriculum Framework
 
Puck Huitsing: Experiences Collaborative Learning
Puck Huitsing: Experiences Collaborative LearningPuck Huitsing: Experiences Collaborative Learning
Puck Huitsing: Experiences Collaborative Learning
 
Maureen Pennock: Digital Preservation Staffing and Skilss
Maureen Pennock: Digital Preservation Staffing and SkilssMaureen Pennock: Digital Preservation Staffing and Skilss
Maureen Pennock: Digital Preservation Staffing and Skilss
 
Jasper Snoeren: Collaborative Learning at Institute for Sound and Vision
Jasper Snoeren: Collaborative Learning at Institute for Sound and VisionJasper Snoeren: Collaborative Learning at Institute for Sound and Vision
Jasper Snoeren: Collaborative Learning at Institute for Sound and Vision
 
Frans Neggers: Learning Digital Preservation
Frans Neggers: Learning Digital PreservationFrans Neggers: Learning Digital Preservation
Frans Neggers: Learning Digital Preservation
 
Eef Masson: Digital Preservation Skills for AV Archivists
Eef Masson: Digital Preservation Skills for AV ArchivistsEef Masson: Digital Preservation Skills for AV Archivists
Eef Masson: Digital Preservation Skills for AV Archivists
 
Dorothy Waugh: The Archivist's Guide To KryoFlux
Dorothy Waugh: The Archivist's Guide To KryoFluxDorothy Waugh: The Archivist's Guide To KryoFlux
Dorothy Waugh: The Archivist's Guide To KryoFlux
 
Chantal Keijsper: Lifelong Learning How To Do That
Chantal Keijsper: Lifelong Learning How To Do ThatChantal Keijsper: Lifelong Learning How To Do That
Chantal Keijsper: Lifelong Learning How To Do That
 
Annet Dekker: Capturing Online Cultures Storytelling as a Method
Annet Dekker: Capturing Online Cultures Storytelling as a MethodAnnet Dekker: Capturing Online Cultures Storytelling as a Method
Annet Dekker: Capturing Online Cultures Storytelling as a Method
 
Amber Cushing: Digital Information Management Programmes
Amber Cushing: Digital Information Management ProgrammesAmber Cushing: Digital Information Management Programmes
Amber Cushing: Digital Information Management Programmes
 
3e Studiedag Webarchivering - Webarchivering van Chinees Nederland
3e Studiedag Webarchivering - Webarchivering van Chinees Nederland3e Studiedag Webarchivering - Webarchivering van Chinees Nederland
3e Studiedag Webarchivering - Webarchivering van Chinees Nederland
 
3e Studiedag Webarchivering - Taalvariatie op Twitter
3e Studiedag Webarchivering - Taalvariatie op Twitter3e Studiedag Webarchivering - Taalvariatie op Twitter
3e Studiedag Webarchivering - Taalvariatie op Twitter
 
3e Studiedag Webarchivering - Promise
3e Studiedag Webarchivering - Promise3e Studiedag Webarchivering - Promise
3e Studiedag Webarchivering - Promise
 
3e Studiedag Webarchivering - Vrienden voor het leven
3e Studiedag Webarchivering - Vrienden voor het leven3e Studiedag Webarchivering - Vrienden voor het leven
3e Studiedag Webarchivering - Vrienden voor het leven
 

Collaborating web archives - Herbert van de Sompel

  • 1. Herbert Van de Sompel Een web van webarchieven, Hilversum, Nederland, 17 Nov 2016 Herbert Van de Sompel LANL & DANS @hvdsomp http://mementoweb.org/about/ http://timetravel.mementoweb.org Infrastructure for Collaborating Web Archives
  • 2. Outline
      • Having Many Web Archives is a Good Thing ™
      • Web Archive Interoperability
      • Memento
      • Towards Increased Interoperability
      • Infrastructure for Web Archive Collaboration
      • Aggregator
      • Aggregator Services
      • Aggregator APIs
      • If You Build It Will They Come?
  • 3. Having Many Web Archives is a Good Thing ™. Capture of http://webcitation.org dated July 17 2013: https://archive.today/eAETp
  • 4. Having Many Web Archives is a Good Thing ™. Remnant of the discontinued web archive http://mummify.it, captured on February 14 2014: https://web.archive.org/web/20140214233752/https://www.mummify.it/
  • 5. Having Many Web Archives is a Good Thing ™. Capture of http://webcitation.org dated August 6 2014
  • 6. Having Many Web Archives is a Good Thing ™. http://arstechnica.com/business/2013/11/fire-at-internet-archive-destroys-equipment-and-materials-but-data-safe/
  • 7. Having Many Web Archives is a Good Thing ™. http://www.themoscowtimes.com/news/article/russia-bans-wayback-machine-internet-archive-over-islamic-state-video/510074.html
  • 8. Having Many Web Archives is a Good Thing ™. http://www.independent.co.uk/news/uk/politics/tories-deleted-past-broken-promises-from-party-website-8937435.html Speeches not accessible in IA. Available in other Web archives.
  • 9. Having Many Web Archives is a Good Thing ™. Captures of http://vk.com/strelkov_info: 17 July 2014 15:22:22 (http://web.archive.org/web/20140717152222/http://vk.com/strelkov_info) and 17 July 2014 17:06:51 (https://archive.today/XFFAj). The claim of responsibility for downing what Strelkov thought to be a Ukrainian military transport plane, but was MH17, was removed.
  • 10. But Even a Better Thing if They Collaborate. Julien Masanes' vision of a global grid of web archives: "Such a grid should link Web archives so that they together form one global navigation space like the live Web itself. This is only possible if they are structured in a way close enough to the original Web and if they are openly accessible." J. Masanes. Web Archiving. Springer-Verlag, 2006
  • 11. Outline (repeated; same items as slide 2)
  • 12. 2009
      • Memento observation:
          • Web resources exist in the eternal now.
          • Prior versions of resources exist in web archives and resource versioning systems.
          • The current resource and its prior versions live disconnected lives.
      • How to interconnect current and prior versions of resources across distributed web servers, web archives, resource versioning systems?
      Memento Did Just That. And More.
      Herbert Van de Sompel, Michael L. Nelson, and Robert Sanderson (2013) RFC 7089 Memento http://mementoweb.org/guide/rfc/
  • 13. Original Resource and Mementos
  • 14. Bridge from Present to Past
  • 15. Bridge from Present to Past
  • 16. Bridge from Past to Present
  • 17. Memento: Access Versions via the Original URI and a Datetime (screenshot labels: Today, Select Date, Nov 17 2014, Apr 1 2014, archive.is)
  • 18. Memento for Chrome http://bit.ly/memento-for-chrome
  • 19. Tools for Server-Side Memento Support
      • Open Wayback
      • pywb
      • Memento TimeGate server
      • Bridge between a homegrown versioning API and the Memento protocol
      • MediaWiki Memento extensions
      • Linked Data Fragments server
      Memento Tools http://mementoweb.org/tools/
  • 20. Can’t Please Everyone. An anonymous reviewer of our submission for WWW 2010: "Is there any statistics to show that many or a good number of Web users would like to get obsolete data or resources?" Herbert Van de Sompel, Michael L. Nelson, et al. (2009) Memento: Time Travel for the Web http://arxiv.org/abs/0911.1112
  • 21. Outline (repeated; same items as slide 2)
  • 22. Raw Mementos. Shawn Jones (2016) Mementos in the Raw, Take Two http://ws-dl.blogspot.nl/2016/08/2016-08-15-mementos-in-raw-take-two.html
  • 23. Raw Mementos. Shawn Jones (2016) Mementos in the Raw, Take Two http://ws-dl.blogspot.nl/2016/08/2016-08-15-mementos-in-raw-take-two.html
  • 24. Verifying Authenticity of Mementos. Ongoing research, Old Dominion University & LANL
  • 25. Outline (repeated; same items as slide 2)
  • 26. Original Resource Provides timegate Link
      • Resource Version Control Systems
      • Servers with dedicated web archive
      • Servers with a preference for a specific web archive
  • 27. Original Resource Provides No timegate Link – Client Intelligence
  • 28. Memento Aggregator
  • 29. LANL Memento Aggregator
      • Official service of the LANL Research Library
      • Currently covers 23 archives (web and linked data): archive.today, Archive-It, Bibliotheca Alexandrina Web Archive, DBpedia archive, DBpedia Triple Pattern Fragments archive, Canadian Government Web Archive, Croatian Web Archive, Estonian Web Archive, Icelandic web archive, Internet Archive, Library of Congress Web Archive, NARA Web Archive, National Library of Ireland Web Archive, perma.cc, Portuguese Web Archive, PRONI Web Archive, Slovenian Web Archive, Stanford Web Archive, UK Government Web Archive, UK Parliament's Web Archive, UK Web Archive, Web Archive Singapore, WebCite
      • LANL Aggregator software not available, but see MemGator
      Archives covered by LANL Memento Aggregator: http://mementoweb.org/depot/
      MemGator: https://github.com/oduwsdl/memgator
  • 30. Memento Aggregator Challenges
      • Polling of many distributed archives:
          • Slow
          • Load on aggregator and archives
      • Approaches:
          • Batch collecting and caching of archival coverage of popular URIs in all archives
          • Summarization of archives (based on CDX files and/or search)
          • Machine Learning of URI patterns for archives
      Sawood Alam, Michael L. Nelson, et al. (2016) Web archive profiling through fulltext search https://doi.org/10.1007/978-3-319-43997-6_10
      Sawood Alam, Michael L. Nelson, et al. (2016) Web archive profiling through CDX summarization https://doi.org/10.1007/s00799-016-0184-4
      Nicholas Bornand, Herbert Van de Sompel, et al. (2016) Routing Memento Requests Using Binary Classifiers https://doi.org/10.1145/2910896.2910899
  • 31. Outline (repeated; same items as slide 2)
  • 32. Basic Aggregator Services
      • Exposes:
          • TimeGates
          • TimeMaps that reach across all web archives covered by the Aggregator
  • 33. Time Travel Services http://timetravel.mementoweb.org/
  • 34. Time Travel Find http://timetravel.mementoweb.org/list/20120428045424/http://www.stanford.edu/
  • 35. Time Travel Find http://timetravel.mementoweb.org/list/20120428045424/http://www.stanford.edu/
  • 36. Time Travel Reconstruct http://timetravel.mementoweb.org/reconstruct/20120428045424/http://www.stanford.edu/
  • 37. Time Travel Reconstruct http://timetravel.mementoweb.org/reconstruct/20120428045424/http://www.stanford.edu/
  • 38. Outline (repeated; same items as slide 2)
  • 39. Time Travel APIs http://timetravel.mementoweb.org/guide/api/
  • 40. URI that Redirects to a Memento: http://timetravel.mementoweb.org/memento/20120428045424/http://www.stanford.edu/
  • 41. URI that Redirects to a JSON Description of a Memento: http://timetravel.mementoweb.org/api/json/20100428103432/http://stanford.edu
  • 42. JSON Format for TimeMaps http://mementoweb.org/guide/timemap-json/
  • 43. DIY TimeMap - Index TimeMap Lists Potential TimeMap URIs (SPEED): http://timetravel.mementoweb.org/timemap/json/http://stanford.edu
  • 44. WDI TimeMap – Index TimeMap with Full Coverage (COVERAGE): http://labs.mementoweb.org/timemap/link/http://stanford.edu
  • 45. Time Travel Archive Registry http://labs.mementoweb.org/aggregator_config/archivelist.xml
  • 46. Outline (repeated; same items as slide 2)
  • 47. Time Travel Infrastructure Use, October 2016 (TimeTravel interface: use)
      • /api/: 1,404,985
      • /timegate/: 54,007
      • /list/: 744,484
      • /memento/: 1,563,278
  • 48. oldweb.today http://oldweb.today/nsmac4/20001115150435/http://www.stanford.edu
  • 49. arquivo.pt http://arquivo.pt/wayback/20120127040929/http://stanford.edu/ Link to Reconstruct
  • 50. TimeTravel Reconstruct http://timetravel.mementoweb.org/reconstruct/20120127040929/http://stanford.edu/
  • 51. British Library Memento Service http://www.webarchive.org.uk/mementos/search/http://www.stanford.edu
  • 52. #icanhazmemento http://ws-dl.blogspot.nl/2015/07/2015-07-22-i-can-haz-memento.html
  • 53. #icanhazmemento http://timetravel.mementoweb.org/list/20161116101831/http://signposting.org/adopters
  • 54. Robust Links
      • Decorate links to allow retrieving Mementos subject to link date or from a specific archive
      • In combination with the Time Travel API, this yields links - provided client or server side - that circumvent link rot and content drift
      Robust Links Specification http://robustlinks.mementoweb.org/spec/
      DO: <a href="http://archive.is/FAy6o" data-originalurl="http://www.stanford.edu" data-versiondate="2014-08-15">
      DO: <a href="http://www.stanford.edu" data-versiondate="2014-08-15">
      DON’T: <a href="http://archive.is/FAy6o">
  • 55. Robust Links – robustify.js. Rene Voorburg (2014) robustify.js https://github.com/renevoorburg/robustify.js
  • 56. Robust Links – robustlinks.js. Herbert Van de Sompel and Michael L. Nelson (2015) Reminiscing about 15 years of interoperability efforts. https://dx.doi.org/10.1045/november2015-vandesompel
  • 57. Infrastructure for Collaborating Web Archives. Herbert Van de Sompel, LANL & DANS, @hvdsomp. http://mementoweb.org/about/ http://timetravel.mementoweb.org
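
The Time Travel URIs shown on slides 40, 41 and 43, and the Robust Links attributes on slide 54, all combine an original URI with a datetime. The sketch below only builds such URIs as strings, following the example URIs in the slides; it makes no network requests. The function names are hypothetical, and two points are assumptions rather than anything stated in the deck: that a date-only `data-versiondate` can be mapped to midnight, and that the 14-digit timestamp is always `YYYYMMDDhhmmss`.

```python
from datetime import datetime

# Base URI as shown in the slides; path patterns mirror the slide examples.
TIMETRAVEL_BASE = "http://timetravel.mementoweb.org"


def to_timestamp(when: datetime) -> str:
    """Format a datetime as the 14-digit string seen in the slide URIs."""
    return when.strftime("%Y%m%d%H%M%S")


def memento_redirect_uri(original_uri: str, when: datetime) -> str:
    """URI that redirects to a Memento (pattern from slide 40)."""
    return f"{TIMETRAVEL_BASE}/memento/{to_timestamp(when)}/{original_uri}"


def memento_json_uri(original_uri: str, when: datetime) -> str:
    """URI for a JSON description of a Memento (pattern from slide 41)."""
    return f"{TIMETRAVEL_BASE}/api/json/{to_timestamp(when)}/{original_uri}"


def timemap_json_uri(original_uri: str) -> str:
    """URI of the JSON index TimeMap (pattern from slide 43)."""
    return f"{TIMETRAVEL_BASE}/timemap/json/{original_uri}"


def robust_link_to_memento_uri(original_url: str, version_date: str) -> str:
    """Hypothetical helper: turn the data-originalurl and data-versiondate
    values of a robust link into a Time Travel redirect URI.
    Assumption: a date-only value (YYYY-MM-DD) is treated as midnight."""
    when = datetime.strptime(version_date, "%Y-%m-%d")
    return memento_redirect_uri(original_url, when)


if __name__ == "__main__":
    print(memento_redirect_uri("http://www.stanford.edu/",
                               datetime(2012, 4, 28, 4, 54, 24)))
    print(robust_link_to_memento_uri("http://www.stanford.edu", "2014-08-15"))
```

A client that finds a dead link carrying `data-versiondate` could follow the URI from `robust_link_to_memento_uri` instead of the original `href`; how the service picks among archives for that datetime is up to the aggregator and is not modelled here.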