Memento at the W3C



The W3C Wiki and the W3C specifications are now accessible using the Memento "Time Travel for the Web" protocol. This is the result of a collaboration between the W3C, the Prototyping Team of the Los Alamos National Laboratory, and the Web Science and Digital Library Research Group at Old Dominion University.

The Memento protocol is a straightforward extension of HTTP that adds a time dimension to the Web. It supports integrating live web resources, resources in versioning systems, and archived resources in web archives into an interoperable, distributed, machine-accessible versioning system for the entire web. The protocol is broadly supported by web archives, and the W3C Data on the Web Best Practices recently recommended its use where data versioning is concerned. Resource versioning systems, however, have been slow to adopt it. Hopefully, the investment made by the W3C will convince others to follow suit.

Memento is formally specified in RFC 7089; a brief overview is available from the Memento web site. In essence, the protocol associates two special types of resources with a web resource, both made discoverable using typed links in the HTTP Link header. A TimeGate is capable of datetime negotiation, a variant of content negotiation. It provides access to the version of the web resource as it existed around a preferred datetime, which the client expresses in the Accept-Datetime request header; the response for the version resource itself includes a Memento-Datetime header, which expresses the resource's actual version datetime. A TimeMap provides an overview of the versions of the web resource and their version datetimes. Tim Berners-Lee had already suggested the need for datetime negotiation in his W3C Note about Generic Resources, but it was not until 2009 that datetime negotiation was effectively introduced, in the arXiv.org preprint "Memento: Time Travel for the Web".
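The two client-side building blocks described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a full RFC 7089 client: the function names `accept_datetime_value` and `links_by_rel` are made up for this example, the `example.org` URIs are placeholders, and the Link header parser handles only the common `<uri>; name="value"` form.

```python
import re
from datetime import datetime, timezone
from email.utils import format_datetime

# One link-value: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^",;]+))*)')
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^",;]+))')


def accept_datetime_value(when: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP date in GMT,
    the form used by Accept-Datetime and Memento-Datetime."""
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


def links_by_rel(link_header: str) -> dict:
    """Map each relation type (e.g. "timegate", "timemap") to its target URIs."""
    found = {}
    for target, params in LINK_RE.findall(link_header):
        for name, quoted, bare in PARAM_RE.findall(params):
            if name.lower() == "rel":
                # A single rel parameter may list several relation types.
                for rel in (quoted or bare).split():
                    found.setdefault(rel.lower(), []).append(target)
    return found


# Hypothetical Link header, as a Memento-enabled server might return it.
sample_header = (
    '<http://example.org/timegate/page>; rel="timegate", '
    '<http://example.org/timemap/page>; rel="timemap"; '
    'type="application/link-format"'
)
links = links_by_rel(sample_header)
request_headers = {
    "Accept-Datetime": accept_datetime_value(
        datetime(2010, 4, 1, tzinfo=timezone.utc)
    )
}
```

A client would send `request_headers` to the URI found under `links["timegate"]` and then read the Memento-Datetime header of the response it is redirected to.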

Figure: Memento "follow your nose through time" architecture. Memento provides a bridge between the present and the past Web.

Adding Memento support to versioning systems allows a client to uniformly access the version of a resource that was active at a certain moment in time (TimeGate) and to obtain its version history (TimeMap). When a version page in a system that supports Memento links to a resource that resides in another system that supports Memento, a client can uniformly access the version of the linked resource that was active at the same moment in time. If the linked resource is in a system that does not support Memento (that is, it does not expose a TimeGate), the client can fall back to a default TimeGate that operates across web archives and retrieve an archived resource using the same uniform datetime negotiation approach. Alternatively, the client can resort to the TimeGate of a specific web archive, such as that of the Internet Archive or the Portuguese Web Archive. But while resource versioning systems hold on to their entire resource history, web archives merely store discrete observations of (some) web resources. As a result, for a page retrieved from a web archive there is no certainty that the archived version was active at the requested time, only that it was observed around that time.
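The fallback behaviour and the "around that time" caveat can be made concrete with a short sketch. It assumes the client has already parsed the Link header into a mapping from relation type to target URIs; the helper names are hypothetical, and the aggregator prefix below is an assumption about the Time Travel service's cross-archive TimeGate that should be checked against its documentation.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumed prefix of a default TimeGate that operates across web archives.
DEFAULT_TIMEGATE_PREFIX = "http://timetravel.mementoweb.org/timegate/"


def choose_timegate(original_uri: str, links_by_rel: dict,
                    default_prefix: str = DEFAULT_TIMEGATE_PREFIX) -> str:
    """Prefer the TimeGate the resource advertises; otherwise fall back
    to the default cross-archive TimeGate."""
    advertised = links_by_rel.get("timegate", [])
    if advertised:
        return advertised[0]
    return default_prefix + original_uri


def version_offset(requested: datetime, memento_datetime: str) -> timedelta:
    """Gap between the requested datetime and the Memento-Datetime header
    value returned with the version that was actually served."""
    return parsedate_to_datetime(memento_datetime) - requested


# A resource without Memento support advertises no TimeGate link.
fallback = choose_timegate("http://example.org/registry", {})

# An archived copy observed a few days after the requested datetime.
gap = version_offset(
    datetime(2002, 9, 1, tzinfo=timezone.utc),
    "Wed, 04 Sep 2002 12:00:00 GMT",
)
```

For a versioning system the gap only says how long the served version had already been current, whereas for a web archive it measures how far the nearest observation lies from the requested moment.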

A variety of tools is available for adding Memento support to systems that handle resource versions and expose associated APIs. Memento support was added to the W3C Wiki pages by deploying the Memento Extension for MediaWiki. Memento support for W3C specifications was realized by installing a Generic TimeGate Server, for which a handler was implemented that interfaces with the versioning capabilities offered by the W3C API.

Memento can be leveraged programmatically, for example by adding Accept-Datetime headers to curl commands or by using the Python Memento Client Library. The Time Travel portal exposes an API that covers web archives and resource versioning systems with Memento support. The API can, for instance, be used to construct a URI that redirects to the version of a resource as it existed around a given date.
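As a rough illustration of such a redirecting URI, the sketch below builds one from a resource URI and a date. The URI pattern (`/memento/` followed by a `YYYYMMDDhhmmss` timestamp and the original URI) is an assumption about the Time Travel API rather than something stated in this post, so verify it against the portal's API documentation; the function name and the example resource URI are likewise illustrative.

```python
from datetime import datetime, timezone

# Assumed base of the Time Travel portal's redirecting endpoint.
TIME_TRAVEL_MEMENTO_PREFIX = "http://timetravel.mementoweb.org/memento/"


def time_travel_uri(original_uri: str, when: datetime) -> str:
    """Build a URI expected to redirect to the version of original_uri
    that existed around the given timezone-aware datetime."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{TIME_TRAVEL_MEMENTO_PREFIX}{stamp}/{original_uri}"


uri = time_travel_uri(
    "http://www.w3.org/TR/webarch/",
    datetime(2004, 9, 1, tzinfo=timezone.utc),
)
```

Dereferencing the resulting URI in a browser or with an HTTP client that follows redirects should lead to a version from around 1 September 2004, provided the endpoint behaves as assumed here.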

Browsers do not yet natively support Memento, but its cross-time and cross-server capabilities can be experienced by installing the Memento extension for Chrome. Try it out for yourself. Browse over to the W3C AWWW and pick some dates in the extension's calendar between 1 September 2002 and 1 September 2004. Navigate to the version of the specification that was current at each selected date by right-clicking the page and choosing "Get near saved date ..." from the context menu. Notice how the centrality of REST in the specification diminishes over time. In each version, find the reference to the IANA URI Registry and right-click the link, this time choosing "Get near memento date ..." to see the Registry as it existed around the time of the version of the AWWW specification you are on. You will retrieve versions of the Registry from web archives and notice its evolution over time, for example around 1 September 2002 and around 1 September 2004. Compare the archived state of the Registry with its current state by right-clicking in an archived page and choosing "Get at current date".

