The article proposes archival thinking as an analytical framework for studying Facebook. Following recent debates on data colonialism, it argues that Facebook dialectically assumes the role of a new archon of public records, while being unarchivable by design. It then puts forward counter-archiving – a practice developed to resist the epistemic hegemony of colonial archives – as a method that allows the critical study of the social media platform, after it shut down researchers’ access to public data through its application programming interface. After defining and justifying counter-archiving as a method for studying datafied platforms, two counter-archives are presented as proof of concept. The article concludes by discussing the shifting boundaries between the archivist, the activist and the scholar, as the imperative of research methods after datafication.
Ben-David, A. (2020). Counter-archiving Facebook. European Journal of Communication. https://doi.org/10.1177/0267323120922069
User Comments Across Platforms and Journalistic Genres
This study introduces a comparative approach to study user comments on the same news content across online platforms while distinguishing between soft and hard news genres. Empirical analysis focuses on Israel’s popular news website Ynet. Using automated tools, we scraped 17,347 comments to analyze differences in the quantity, length, and topics of comments that were posted through Ynet’s comments section, Facebook Comment Plugin, and Facebook page. Our findings reveal that commenting patterns vary greatly across platforms and news genres. Specifically, the number of comments posted on Ynet’s Facebook page is significantly higher than on the two other commenting platforms (for both hard and soft news), but these comments are shorter and more emotional. We discuss these findings in relation to the notion of ‘context collapse’ in social media, and argue that one of the outcomes of the convergence between news content and social media is the augmentation of consensual national sentiment.
Ben-David, A., & Soffer, O. (2018). User comments across platforms and journalistic genres. Information, Communication & Society, doi: 10.1080/1369118X.2018.1468919.
The Internet Archive and the socio-technical construction of historical facts
This article analyses the socio-technical epistemic processes behind the construction of historical facts by the Internet Archive Wayback Machine (IAWM). Grounded in theoretical debates in Science and Technology Studies about digital and algorithmic platforms as “black boxes”, this article uses provenance information and other data traces provided by the IAWM to uncover specific epistemic processes embedded at its back-end, through a case study on the archiving of the North Korean web. In 2016, an error in the configuration of one of North Korea's name servers revealed that it contains 28 websites. However, the IAWM has snapshots of the majority of the .kp websites, which have been archived from as early as 2010. How did the IAWM accumulate knowledge about the .kp websites that are generally hidden to the world? Through our findings we argue that historical knowledge on the IAWM is generated by an entangled and iterative system comprised of proactive human contributions, routinely operated crawls and a reification of external, crowd-sourced knowledge devices. These turn the IAWM into a repository whose knowing of the past is potentially surplus – harbouring information which was unknown to each of the contributing actors at the time and place of archiving.
Ben-David, Anat and Amram, Adam (2018) "The Internet Archive and the socio-technical construction of historical facts", Internet Histories, doi: 10.1080/24701475.2018.1455412.
Platform Inequality: Gender in the Gig-Economy
Laboring in the new economy has recently drawn tremendous social, legal, and political debate. The changes created by platform-facilitated labor are considered fundamental challenges to the future of work and are generating contestation regarding the proper classification of laborers as employees or independent contractors. Yet, despite this growing debate, attention to gender dimensions of such laboring is currently lacking. This Article considers the gendered promises and challenges that are associated with platform-facilitated labor, and provides an innovative empirical analysis of gender discrepancies in such labor; it conducts a case study of platform-facilitated labor using computational methods that capture some of the gendered interactions hosted by a digital platform. These empirical findings demonstrate that although women work for more hours on the platform, women’s average hourly rates are significantly lower than men’s, averaging about 2/3 (two-thirds) of men’s rates. Such gaps in hourly rates persist even after controlling for feedback score, experience, occupational category, hours of work, and educational attainment. These findings suggest we are witnessing the remaking of women into devalued workers. They point to the new ways in which sex inequality is occurring in platform-facilitated labor. They suggest that we are beholding a third generation of sex inequality, termed “Discrimination 3.0,” in which discrimination is no longer merely a function of formal barriers or even implicit biases. The Article sketches Equality-by-Design (EbD) as a possible direction for future redress, through the enlisting of platform technology to enhance gender parity. In sum, this Article provides an empirical base and analysis for understanding the new ways sex inequality is taking hold in platform-facilitated labor.
Barzilay, Arianne & Ben-David, Anat (2017) "Platform Inequality: Gender in the Gig-Economy," Seton Hall Law Review: Vol. 47, Issue 2.
The colors of the national Web: visual data analysis of the historical Yugoslav Web domain.
This study examines the use of visual data analytics as a method for historical investigation of national Webs, using Web archives. It empirically analyzes all graphically designed (non-photographic) images extracted from websites hosted in the historical .yu domain and archived by the Internet Archive between 1997 and 2000, in order to assess the utility and value of visual data analytics as a measure of nationality of a Web domain. First, we report that only 23.5% of websites hosted in the .yu domain over the studied years had their graphically designed images properly archived. Second, we detect significant differences between the color palettes of .yu sub-domains (commercial, organizational, academic, and governmental), as well as between Montenegrin and Serbian websites. Third, we show that the similarity of the domains’ colors to the colors of the Yugoslav national flag decreases over time. However, there are spikes in the use of Yugoslav national colors that correlate with major developments on the Kosovo frontier.
Ben-David, A., Amram, A., & Bekkerman, R. (2016). The colors of the national Web: visual data analysis of the historical Yugoslav Web domain. International Journal on Digital Libraries, DOI: 10.1007/s00799-016-0202-6.
What does the Web Remember of its Deleted Past? An Archival Reconstruction of the Former Yugoslav Top Level Domain.
This article argues that the use of the Web as a primary source for studying the history of nations is conditioned by the structural ties between sovereignty and the Internet protocol, and by a temporal proximity between live and archived websites. The argument is illustrated by an empirical reconstruction of the history of the top-level domain of Yugoslavia (.yu), which was deleted from the Internet in 2010. The archival discovery method used four lists of historical .yu URLs that were captured from the live Web before the domain was deleted, and an automated hyperlink discovery script that retrieved their snapshots from the Internet Archive and reconstructed their immediate hyperlinked environment in a network. Although a considerable portion of the historical .yu domain was found on the Internet Archive, the reconstructed space was predominantly Serbian.
Keywords: Web history, Web archives, Internet Archive, Wayback Machine, Yugoslavia, Serbia, Digital Heritage, ICANN, National Webs, ccTLD
Ben-David, A. (2016). What does the Web Remember of its Deleted Past? An Archival Reconstruction of the Former Yugoslav Top Level Domain. New Media & Society, 18(7), 1103–1119.
Hate Speech and Covert Discrimination on Social Media: Monitoring the Facebook Pages of Extreme-Right Political Parties in Spain
This study considers the ways that overt hate speech and covert discriminatory practices circulate on Facebook despite its official policy that prohibits hate speech. We argue that hate speech and discriminatory practices are not only explained by users’ motivations and actions, but are also formed by a network of ties between the platform’s policy, its technological affordances, and the communicative acts of its users. Our argument is supported with longitudinal multimodal content and network analyses of data extracted from official Facebook pages of seven extreme-right political parties in Spain between 2009 and 2013. We found that the Spanish extreme-right political parties primarily implicate discrimination, which is then taken up by their followers who use overt hate speech in the comment space.
Keywords: social media, hate speech, covert discrimination, extremism, extreme right, political parties, Spain, Facebook, digital methods
Ben-David, A., & Matamoros Fernandez, A. (2016). Hate Speech and Covert Discrimination on Social Media: Monitoring the Facebook Pages of Extreme-Right Political Parties in Spain. International Journal of Communication, 10(7).
Lost but not forgotten: finding pages on the unarchived web.
Web archives attempt to preserve the fast changing web, yet they will always be incomplete. Due to restrictions in crawling depth, crawling frequency, and restrictive selection policies, large parts of the Web are unarchived and, therefore, lost to posterity. In this paper, we propose an approach to uncover unarchived web pages and websites and to reconstruct different types of descriptions for these pages and sites, based on links and anchor text in the set of crawled pages. We experiment with this approach on the Dutch Web Archive and evaluate the usefulness of page and host-level representations of unarchived content. Our main findings are the following: First, the crawled web contains evidence of a remarkable number of unarchived pages and websites, potentially dramatically increasing the coverage of a Web archive. Second, the link and anchor text have a highly skewed distribution: popular pages such as home pages have more links pointing to them and more terms in the anchor text, but the richness tapers off quickly. Aggregating web page evidence to the host-level leads to significantly richer representations, but the distribution remains skewed. Third, the succinct representation is generally rich enough to uniquely identify pages on the unarchived web: in a known-item search setting we can retrieve unarchived web pages within the first ranks on average, with host-level representations leading to further improvement of the retrieval effectiveness for websites.
Keywords: Web archives, Web archiving, Web crawlers, Anchor text, Link evidence, Information retrieval
Huurdeman, H.C., Kamps, J., Samar, T., de Vries, A.P., Ben-David, A., & R.A. Rogers (2015). Lost but Not Forgotten: Finding Pages on the Unarchived Web. International Journal on Digital Libraries, 1–19.
Web archive search as research: Methodological and theoretical implications.
The field of web archiving is at a turning point. In the early years of web archiving, the single URL has been the dominant unit for preservation and access. Access tools such as the Internet Archive’s Wayback Machine reflect this notion as they allowed consultation, or browsing, of one URL at a time. In recent years, however, the single URL approach to accessing web archives is being gradually replaced by search interfaces. This paper addresses the theoretical and methodological implications of the transition to search on web archive research. It introduces ‘search as research’ methods, practices already applied in studies of the live web, which can be repurposed and implemented for critically studying archived web data. Such methods open up a variety of analytical practices that were so far precluded by the single URL entry point to the web archive, such as the re-assemblage of existing collections around a theme or an event, the study of archival artefacts and scaling the unit of analysis from the single URL to the full archive, by generating aggregate views and summaries. The paper introduces examples of ‘search as research’ scenarios, which have been developed by the Web ART project at the University of Amsterdam and the Centrum Wiskunde & Informatica, in collaboration with the National Library of the Netherlands. The paper concludes with a discussion of current and potential limitations of ‘search as research’ methods for studying web archives, and the ways in which they can be overcome in the near future.
Keywords: web archives, Internet Archive, Wayback Machine, search, national libraries
Ben-David, A., & Huurdeman, H. (2014). Web Archive Search as Research: Methodological and Theoretical Implications. Alexandria, 25(1), 93–111.
The Palestinian diaspora on the Web: Between de-territorialization and re-territorialization.
This article analyzes Web-based networks of Palestinian communities in Germany, France, Italy, Austria, Australia, the United States, Canada, Spain, Argentina, Chile and Uruguay. The findings show a thematic and demographic shift from organizations of Palestinian communities abroad to a transnational solidarity network focused on Palestinian rights and the Boycott movement. Although the Palestinian Territories function as the network’s strong center of gravity, analysis of the references reveals that diaspora and non-diaspora actors operate as two distinct but intertwined networks: while diaspora actors are unique in putting emphasis on community as activity type and on diaspora and the right of return as primary cause, non-diaspora actors are mainly dedicated to solidarity as activity and Palestinian rights and the Boycott movement as primary cause. Despite this, ties between diaspora and non-diaspora actors are stronger than among diaspora actors, which indicates that part of the dynamics of Palestinian communities is manifest not just between diaspora communities, but mostly between diaspora communities and civil society organizations in their host societies.
Keywords: boycott, diaspora, internet, Palestine, Web
Ben-David, A. (2012). The Palestinian diaspora on the Web: Between de-territorialization and re-territorialization. Social Science Information, 51(4), 459–474.
Coming to terms: a conflict analysis of the usage, in official and unofficial sources, of ‘security fence’, ‘apartheid wall’, and other terms for the structure between Israel and the Palestinian territories.
The official terms for the dividing wall are ‘security fence’ on the Israeli side and ‘apartheid wall’ on the Palestinian side. Both terms fuse two contextually charged notions to describe the construction project. Beyond the two official terms, the structure has been given other names by sources appearing in the media space (e.g. the International Court of Justice’s ‘West Bank wall’) or by news organizations covering the issue (e.g. ‘barrier wall’). Using data from Google News, which includes official NGO as well as news sources, this article offers a media monitoring method that also seeks to create conflict indicators from the shifting language employed by officials, journalists and others to describe the structure. The authors discovered that the Palestinians and Israelis choose their words differently: the Israelis are consistent (yet relatively alone) in the way they use their terms; the Palestinians adopt their terminology according to the setting, using different terms for the structure in diplomatic and international court settings than ‘at home’. Having identified ‘setting’ as an important variable in the study of language use as conflict indicator, the study also includes an analysis of diplomatic language in key debates on the obstacle at the UN Security Council. In all, it was found that, at particular moments in time, Israeli and Palestinian actors ‘come to terms’ most significantly around ‘separation wall’, coupling the Israeli left-of-centre adjective and the Palestinian noun, implying a peace-related arrangement distinctive from either side’s official position (as well as the current peace plans), and ultimately undesirable to those who share the term.
Keywords: Google News, Israeli–Palestinian conflict, media monitoring, new media
Rogers, R., & Ben-David, A. (2010). Coming to terms: a conflict analysis of the usage, in official and unofficial sources, of ‘security fence’, ‘apartheid wall’, and other terms for the structure between Israel and the Palestinian territories. Media, War & Conflict, 3(2), 202–229.
Palestine's virtual borders 2.0: From a non-place to a user-generated space.
In 2003 the Palestinian state received official recognition on the Web before it was established on the ground. The delegation of the .ps country code top-level domain (ccTLD) to the Palestinian Authority and its inclusion in the UN list of recognized countries and territories created an official Web-space in which a Palestinian state operated side-by-side with other sovereign states. Yet with the rise of Web 2.0 applications, the official representation of the Palestinian state partially disappeared. This study focuses on the shift in the spatial representation of the Palestinian state on the Web, from an officially acknowledged national Web space, followed by its partial disappearance in Web 2.0 spaces, to its reconstruction as a user-generated space. It examines Palestine’s virtual borders on various Web 2.0 mapping platforms, along with the listing (and non-listing) of Palestine as a country in the registration procedure of popular Web 2.0 applications. It shows that on most mapping platforms the Palestinian Territory is underrepresented, and that the country's official representation on the UN list of recognized countries and territories is often omitted or modified on social media sites’ registration forms. After analyzing the geo-politics of social media's drop-down country lists, this study argues that Web 2.0 spaces are unofficial Web-spaces, in which official representations of countries are not determined by diplomacy or approved by international institutions, but rather by interaction between commercial platforms and their users. Faced with the partial disappearance of their homeland, Palestinian users both in the Palestinian Territory and in the Diaspora thus become placeless participants of Web 2.0 spaces. They attempt to reclaim the virtual representation of their home country as a sovereign Palestinian state by protesting, uploading, tagging and generating content on Web 2.0 platforms.
On platforms such as Facebook, Blogger and Google Maps, user activism and user-generated content have led to a spatial transformation from the country's non-listing and non-placement, to its official inclusion. Finally, this article makes a contribution to the theorization of political Web spaces by arguing that the Palestinian case complicates current views on relationships between the Web and the ground. Unlike the common perception that the virtual is grounded in the real, the over-representation of a Palestinian state in official Web spaces, in parallel with its underrepresentation in unofficial Web spaces, and users' treatment of virtual spaces as real spaces, indicate that these realms actually tend to merge, at least in the case of contested Web terrains and unsettled struggles for self-determination.
Ben-David, A. (2010). Palestine's virtual borders 2.0: From a non-place to a user-generated space. Réseaux, (159), 151–79.
The Palestinian-Israeli peace process and transnational issue networks: the complicated place of the Israeli NGO.
Israeli non-governmental organizations (NGOs) resisting the security fence and other Israeli security measures are in ‘virtual isolation’ in networks dedicated to the Palestinian-Israeli conflict, and especially to the criticism of Israeli governmental policies and the construction of the security fence. The research reported is a hyperlink and term analysis of select issue networks on the Web assembled around the security fence and other conflict issues. It shows that attempts by left-leaning Israeli NGO network actors to frame the issue in their own critical terms are ignored by networked trans-national actors working in the Palestinian-Israeli issue space, even though it may be that both kinds of organizations campaign against it. The Israeli organizations, it was found, are largely in an issue space of their own making, distinct from the human rights frame that dominates the trans-national networks. In putting forward the notion of the separation fence, theirs is also a particular local ‘peace process’ approach to issue settlement, different not only from that of the dominant transnational issue networks on the Web, but also from official Israeli as well as certain Western governmental positions. The article concludes by finding that, according to the Web, the local peace process is not a trans-national issue network affair.
Keywords: apartheid wall, hyperlink analysis, non-governmental organizations (NGOs), Palestinian-Israeli conflict, security fence, trans-national advocacy networks
Rogers, R., & Ben-David, A. (2008). The Palestinian-Israeli peace process and transnational issue networks: the complicated place of the Israeli NGO. New Media & Society, 10(3), 497–528.