The Internet Archive and the socio-technical construction of historical facts
I had the pleasure of co-authoring another paper with Adam Amram, which is now published in Internet Histories.
The paper analyses the socio-technical epistemic processes behind the construction of historical facts by the Internet Archive Wayback Machine (IAWM). Grounded in theoretical debates in Science and Technology Studies about digital and algorithmic platforms as “black boxes”, the paper uses provenance information and other data traces provided by the IAWM to uncover specific epistemic processes embedded in its back-end, through a case study on the archiving of the North Korean web.
In 2016, an error in the configuration of one of North Korea's name servers revealed that the North Korean web had only 28 websites. However, the IAWM has snapshots of the majority of the .kp websites, which have been archived from as early as 2010. How did the IAWM accumulate knowledge about .kp websites that are generally hidden from the world?
Based on our findings, we argue that historical knowledge on the IAWM is generated by an entangled and iterative system composed of proactive human contributions, routinely operated crawls and a reification of external, crowd-sourced knowledge devices. These turn the IAWM into a repository whose knowing of the past is potentially surplus – harbouring information which was unknown to each of the contributing actors at the time and place of archiving.
An earlier version of this paper was presented at the RESAW conference in London, June 2017. Adam and I are very grateful to Jane Winters, Niels Brügger, Marta Severo, Valérie Schafer and the anonymous reviewers for their valuable comments and insights. Special thanks are extended to Tzipy Lazar-Shoef and Dan Bareket for research assistance.
On 31 May 2018, I gave an open lecture about this paper at the Centre for Internet Studies, Aarhus University. The slides from the lecture are available below.