1. Impact of URI Canonicalization on Memento Count
Mat Kelly¹, Lulwah M. Alkwai¹, Sawood Alam¹, Michael L. Nelson¹, Michele C. Weigle¹, and Herbert Van de Sompel²
¹ Web Science and Digital Libraries (WS-DL) Research Group
Old Dominion University, Norfolk, Virginia, USA
ws-dl.cs.odu.edu • @WebSciDL
² Los Alamos National Laboratory
Los Alamos, New Mexico, USA
@hvdsomp
Web Archiving and Digital Libraries (WADL) Workshop 2017
June 22-23, 2017
Toronto, Canada
https://arxiv.org/abs/1703.03302
13. % Redirects Over Time
● Revisits (no content change)
● Scheme switch
● Subdomain switch
● Slash-added
● others...
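The URI-level patterns listed above (scheme switch, subdomain switch, slash-added) can be told apart by comparing the source and target of a redirect. The following is a minimal sketch under stated assumptions, not the authors' method: the function name `classify_redirect` and the exact matching rules are hypothetical, and revisits are not covered because they depend on content rather than on the URI.

```python
from urllib.parse import urlsplit


def classify_redirect(source: str, target: str) -> str:
    """Label how a redirect target differs from its source URI.

    Hypothetical helper for illustration; the rules below are assumptions.
    """
    src, dst = urlsplit(source), urlsplit(target)
    src_host = (src.hostname or "").lower()
    dst_host = (dst.hostname or "").lower()
    same_rest = (src.path, src.query) == (dst.path, dst.query)

    # Scheme switch: only the scheme differs (e.g. http -> https).
    if src.scheme != dst.scheme and src_host == dst_host and same_rest:
        return "scheme switch"

    # Subdomain switch: one host is the other plus leading labels
    # (e.g. example.com -> www.example.com).
    if src.scheme == dst.scheme and src_host != dst_host and same_rest:
        if src_host.endswith("." + dst_host) or dst_host.endswith("." + src_host):
            return "subdomain switch"

    # Slash-added: the target path is the source path plus a trailing slash.
    if (src.scheme, src_host, src.query) == (dst.scheme, dst_host, dst.query):
        if dst.path == src.path + "/":
            return "slash-added"

    return "other"
```

For example, `classify_redirect("http://example.com/dir", "http://example.com/dir/")` yields `"slash-added"`. A redirect that changes more than one component at once falls into `"other"` in this sketch.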
14. HTTPS Adoption?
● Early, quick redirects attributed to slash-added pattern
● Crawl rate increase → Fewer changes → More revisits
● Δtime for HTTP → HTTPS redirect by year:
Datetime between two URI-Ms is ≤ 2 sec.
[Figure: google.com, collected May 2016; year axis 2012, 2014, 2016]
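The Δtime check can be sketched as a comparison of the datetimes of two URI-Ms. This is only an illustration: it assumes the URI-Ms embed a 14-digit `YYYYMMDDhhmmss` datetime as a path segment, which is an assumption about the archive's URI layout rather than something stated on the slide. The host `archive.example` and the function names are hypothetical; the 2-second threshold is the one given on the slide.

```python
import re
from datetime import datetime

# Assumed URI-M layout: .../<14-digit datetime>/<original URI>
_TIMESTAMP = re.compile(r"/(\d{14})/")


def memento_datetime(uri_m: str) -> datetime:
    """Extract the 14-digit datetime embedded in a URI-M (assumed layout)."""
    match = _TIMESTAMP.search(uri_m)
    if match is None:
        raise ValueError(f"no 14-digit datetime found in {uri_m!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


def delta_seconds(uri_m_a: str, uri_m_b: str) -> float:
    """Absolute time difference between two URI-Ms, in seconds."""
    delta = memento_datetime(uri_m_b) - memento_datetime(uri_m_a)
    return abs(delta.total_seconds())


def within_threshold(uri_m_a: str, uri_m_b: str, threshold: float = 2.0) -> bool:
    """True when the two URI-Ms were captured at most `threshold` seconds apart."""
    return delta_seconds(uri_m_a, uri_m_b) <= threshold


# Hypothetical pair: an HTTP capture and the HTTPS capture it redirects to.
http_capture = "http://archive.example/web/20160501120000/http://google.com/"
https_capture = "http://archive.example/web/20160501120002/https://google.com/"
print(within_threshold(http_capture, https_capture))
```

Grouping the resulting deltas by the year of the first capture would give a per-year view like the one the slide describes.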
15. Impact of URI Canonicalization on Memento Count
http://ws-dl.blogspot.com/2017/03/2017-03-24-impact-of-uri.html