From Communications Studies to Captured Content: A Junior Fellow’s Contribution to the Mass Communications Web Archive

This is a guest post written by Kailyn Slater, a 2025 Junior Fellow in the Web Archiving Section.


How can the Library represent the presence of mass communications, as a broad field of theory and practice, on the web? How do we know what is worth preserving? What sources will be impactful for communications workers, scholars, and the public? These were some of the questions I had in mind when approaching the Documenting the Digital Age project. As the Junior Fellow for the project, I was embedded within the Web Archiving Section and focused on expanding and enhancing the Mass Communications Web Archive (MCWA). I assessed the quality of web captures, suggested new websites for collection leaders to nominate, and added descriptive metadata to new and existing records.

I came to librarianship after graduate study in Communication, curious about how to apply my knowledge of media studies to helping the public. I crafted my Master’s thesis around the study of misinformation and deepfake videos, and after graduation I was not quite sure about a direct path to teaching. A librarian at my university suggested I think about going into librarianship, based on my skills as a researcher and my interest in public service. A few years later, towards the completion of my MSLIS at the University of Illinois Urbana-Champaign, I heard about the Junior Fellows Program, which offered remote projects. The funding that the fellowship provided, as well as the opportunity to support web archives in a tangible way, drew me to get involved.

Amber Paranick of the Serials and Government Publications Division and Kelly Bennett of the Business Section lead the MCWA, and its scope is the “creation, distribution, and consumption of media.” As they’ve discussed in an interview for The Signal, determining the exact collecting boundaries for this web archive is challenging. So much of the last thirty years of mass communication has occurred on the web, but what types of websites fit within this collecting scope and can also be archived, given legal constraints and the technological challenges of web archiving? It’s important that the Web Archiving Section determine whether a website can be crawled at all, considering technical challenges and a permissions process that engages with site owners directly. This aspect of the work intrigued me as someone from outside the institution: the intent to notify and gather, rather than to scrape and display without notification.

Journal of 20th Century Media History website homepage
Journal of 20th Century Media History website, as it appears in a June 2024 capture in our web archives.

I approached the MCWA as a multifaceted resource, imagining how it could meet the needs of both public users and researchers like historians, sociologists, journalists, and communication scholars. As a discipline, mass communication is a large umbrella that contains journalism and broadcasting, media research, marketing, public relations, the study of social networks and social media, as well as the study of technology through a sociological lens. Websites already a part of the MCWA include Report for America, the Centre for Internet Studies, and the Journal of 20th Century Media History. Websites are vehicles for communicating about communication, and they represent evidence of changes over time within mass communications as a field. With that in mind, I chose the websites I suggested to collection leaders so that they acknowledge the compounded nature of web archive material concerned with mass communications.

I suggested 26 websites, focusing on organizations that produce educational material, and websites for professional associations that provide resources for journalists, reporters, and communications workers. I did not make formal appraisal decisions as a Junior Fellow, but provided suggestions toward appraisal and emphasized the need for collection items that demonstrate how the field of mass communications shifts over time, similar to how the Industry Associations Web Archive documents the evolution of industries. For example, some of the websites I suggested represent shifts in the industry of news media. In the last three decades, the migration of print publications to digital platforms, as well as the creation of digital-only publications and streaming media, has meant that the traditional news media products of broadcast and print are no longer the public’s sole source of information about current and unfolding events. Additionally, when sifting through the Library’s web archives, I noticed that other collections, such as the Public Policy Topics Web Archive, the Public Broadcasting Web Archive, and the United States News Web Archive, contained records that fit within the Mass Communications Web Archive’s collecting scope. Web archives at the Library benefit from being considered as part of multiple collections, which increases users’ access points and represents mass communications as both a subject and a field of practice.

Nieman Storyboard website with headline "The Post-it puzzle of a big writing project"
The Nieman Storyboard website, as it appears in a November 2023 capture in our web archives.

Over the course of the fellowship, I also completed over sixty capture assessments for the Mass Communications Web Archive and the International Political Events Web Archive. A “capture” is an archived copy of a website created at a specific moment in time, and Web Archiving staff utilize “capture assessments” to understand how similar those captures are to the live website. Websites are complex, containing many layers of code and content that change over time, and some websites are easier to preserve than others. To use the archives effectively, users need to be aware of major discrepancies between a capture and the live web. During my fellowship, I learned that the Library’s quality assurance practices are based on Brenda Reyes Ayala’s grounded theory of correspondence for web archives, which provides a framework for assessing website captures in terms of visual similarity, interactivity, and level of completeness. When I compared captures to the live web, I gave each a correspondence rating based on those three categories, and I diagnosed specific issues with captures, like missing images or features that work on the live web but are difficult to replicate in the archived version of the site. These capture assessments allow the Section to troubleshoot as soon as possible and adjust how the website is crawled to improve the quality of future captures.

Another way I contributed to the MCWA during my fellowship was by adding Library of Congress Subject Headings and Library of Congress Name Authorities to web archive records. Users can search the web archives on loc.gov by entering a site name or URL in the search bar, or by browsing through the thematic and event-based collections. The Library’s web archives contain nearly 35,000 records. To aid users in the process of searching, adding subjects and contributors provides a web-friendly infrastructure for linking records across the Library. Utilizing the Library of Congress Linked Data Service, I took stock of what terms fit the subject matter of the materials, as well as which classification systems were relevant for the material based on that subject matter. Some of these terms include “Journalism—Vocational guidance,” “Public interest—United States,” “mass media,” “Social media and journalism,” and “communication.” This strategy expands the presence of web archives as resources available for use within the Library’s collections. It also places the works in their historical, social, and political context, and reduces barriers to accessing their technologically complex medium.

Website capture for the National Press Club with the slogan "The World's Leading Professional Organization for Journalists"
The National Press Club website, as it appears in a June 2024 capture in our web archives.

I also learned how to utilize another schema to broaden the possibilities of discovery for users and insert the content into alternative contexts. The American Folklore Society Ethnographic Thesaurus, for example, is a schema with a material tie to the collecting scope of the MCWA: the web has been pivotal in providing a platform for communities to talk about folk tales and connect through shared histories. Including terms from the Thesaurus like “online communities,” “communications workers,” “freelance writers,” and “mass media” honors those connections.  

Web archives represent a visual and interactive link between online evidence and the people and communities who create that evidence. Depending on the content of the website, a web archive can help users understand the organizations, strategies, and roles that journalists, broadcasters, and other kinds of communications workers engage in. Web archives have also been important in the documentation of current events and social movements.

My fellowship with the Web Archiving Section has been fruitful. It was an affirming experience that felt attuned to my career growth as both an early-career librarian and a researcher at the intersection of media studies, communication, and information studies. I learned about the breadth of resources that the Library provides across formats, as well as the many people who produce those resources. I also gained insight into how web archives and other complex digital materials like databases at the Library of Congress are ingested, stored, and accessed. Being able to make a material impact on a web archive that is aligned with my subject expertise and massive in its potential reach has been a career-changing experience for me.
