Where Science Meets Storytelling: Twelve Years of the Science Blogs Web Archive
More than a decade after its launch, the Science Blogs Web Archive continues to grow and evolve. In this interview, Jennifer “JJ” Harbster reflects on building and maintaining the collection, while intern Yahir Brito brings a fresh perspective on updating and expanding it. Together, they share a few of their favorite blogs and discuss why it is important to preserve these unique examples of scientific communication.
First, let’s introduce both of you to the readers. Who are you and what was your role in creating this collection?
Yahir: My name is Yahir Brito, and I’m a 2025 Library of Congress intern (LOCI) through the Washington Center working in the Science Section of the Researcher Engagement and General Collections Division. I’m a rising senior at Vanderbilt University majoring in Medicine, Health, and Society, as well as History and Communication of Science and Technology. I come from Presidio, Texas, a small town on the US-Mexico border, and this internship has allowed me to explore my academic interests in science, communication, and real-world preservation work. This summer, I supported the Science Blogs Web Archive by updating descriptive elements and helping redevelop the collection page, which the Library calls the “framework.” My goal was to help improve how the archive can be discovered, understood, and used by a range of researchers.
JJ: I am Jennifer “JJ” Harbster, head of the Library’s Science Section. In this capacity I have the pleasure of mentoring interns such as Yahir. I have worked 21 non-continuous years in the Science Section at the Library of Congress, and 16 of those years included archiving the web. My first experience with web archiving was in 2005, on a shared project with the Library of Congress, Internet Archive, and University of California to preserve digital content related to Hurricane Katrina. This work influenced the way I approach web archiving, and I favor collaborative projects that often revolve around a theme, event, or focused subject area. Currently, I am the collection lead for three ongoing web archiving projects, including the Science Blogs Web Archive.

Can you briefly describe the collection?
Yahir: The Science Blogs Web Archive captures the many ways science is communicated online, encompassing individual voices, institutional posts, and collaborative platforms. It features blogs across various scientific fields, from astronomy and geology to the history of science. These blogs often combine education, commentary, and personal reflection, serving as a connection between scientific communities and the public. Their conversational and personal nature provides a direct insight into the approach, interpretation, and dialogue of science, making them valuable records beyond traditional academic and media outlets.
JJ: Working with Yahir to review 13 years of archived science blogs has been a walk down memory lane. The idea for the Science Blogs Web Archive collection began in 2012 to address the need to collect and preserve the proliferation of science communication and discourse happening in the blogosphere. The initial focus was on U.S. bloggers covering all the sciences and publishing on various platforms. What has amassed since is a potpourri of scientific voices, subjects, and themes that Yahir is helping me better understand and build upon.
How did you select material to archive? Did you have any specific goals or considerations in mind when nominating content?
JJ: When developing any collection, you need a purpose and a scope. Part of the science blog collecting purpose is to preserve ‘at-risk’ blogs that are ceasing publication or migrating to new online platforms, which often results in the loss of content. Another purpose is to enhance the Library’s existing science collections with digital works that document the variety of perspectives in the sciences, from NOAA’s Teacher at Sea blog to a mycologist living in Alabama blogging about fungi. The original scope I developed for selection was simple: the blog must produce original thought, be from any science discipline, and originate from the U.S. Initially, I took a survey approach to document the landscape of science blog topics, categories, and voices from the U.S. However, the collecting approach has evolved over time and has become more intentional. For example, we have been focusing on specific science themes or disciplines (e.g., women in STEM or neurosciences), and with Yahir’s help we are identifying subject gap areas and priorities to work on.
Yahir: Blog candidates were identified through various channels and independent exploration. The scope prioritized blogs centered on science communication, research commentary, and public engagement across disciplines. Ideal candidates demonstrated originality, topical relevance, and consistent updates, with publicly accessible content free of paywalls or membership restrictions. Independent blogs sharing various perspectives, as well as those hosted by institutions with active public outreach, were particularly prioritized. Furthermore, there is a growing interest in expanding the collection to encompass a wider range of multilingual and international perspectives.

What are some of the highlights?
Yahir: One of the most rewarding aspects of this project was rewriting the collection description and selecting featured items for the Science Blogs Web Archive framework, ensuring it showcased both scientific depth and a variety of voices. A few standout blogs truly embody this range: Isis the Scientist offers candid insights from a woman in academia, drawn from her personal experiences in STEM; the late Oliver Sacks’s blog combines science, memoir, and philosophy, providing intimate reflections for public understanding of neuroscience; and NeuroDojo, authored by Zen Faulkes, offers an engaging take on neurobiology and science policy through a blend of commentary and humor. Nominating blogs for the collection deepened my appreciation for how presentation and metadata choices influence which stories are reflected in the scientific record.
JJ: There are a variety of blogs represented in the collection, from popular media outlets to academic campuses, and from scientific organizations to personal accounts. I tend to gravitate to blogs with creative names and a message. Grandma Got STEM is a great example: a blog project that solicited submissions from the public featuring senior women, our grandmothers, who worked in STEM. This blog represents the tradition of scientific biographies, but in a crowdsourced blog format. Another highlight is Southern Fried Science, a marine science and conservation blog that I have been following since its inception 15 years ago! This blog is educational and informative, delving into all aspects of our planet’s ocean, and is written by a group of authors representing a variety of scientific fields.
What challenges have you faced in creating this collection?
Yahir: One of the biggest challenges I faced was working with older Library of Congress web archive captures. Some blogs are less archivable than others, and web crawling technology was not as sophisticated at the start of the collection as it is now, leading to some archived pages that didn’t load properly or were missing essential content. This made it difficult to fully evaluate the site and determine the necessary updates or enhancements to the Library’s record. Finding contact information or author attributions for many blogs also proved surprisingly challenging. Some blogs were anonymous or hosted on defunct platforms, which complicated efforts to verify authority files and basic metadata, such as the creator’s name or organizational affiliation.
JJ: As in any of our web archiving work, finding contact information to email permission requests or notifications to crawl content can be challenging, and figuring out the geographic locations of blogs can be a daunting task. The focus of the collection has been on U.S. blogs, but this gets blurry as bloggers grow their careers and move around. The transitory element of blogs also keeps me busy. I periodically need to check on activity or track down when a blog moves platforms or changes its web address. Admittedly, I get a little sad when a blog becomes inactive and I send it to postcrawl.
The evolving nature of blogging, the increased use of embedded multimedia, and the variety of supporting blog platforms can raise philosophical questions and pose technological challenges. For example, I have been trying to figure out what to do about a popular newsletter service that resembles a blogging format. Should I consider science-based newsletters blogs?

Why do you think web archiving is important for documenting this subject? How do you imagine researchers, now or in the future, might use this web archive?
Yahir: Science blogs capture how scientific ideas spread globally, how people respond to discoveries, communicate uncertainty, and engage with policy and public life. That’s not something we always see in peer-reviewed journals or press releases. Web archives like this one help preserve the very human side of science. I imagine researchers in the future using this archive to study how scientific communication has evolved, uncover insights from past experts, and find inspiration for new scientific discoveries. They might also use it to understand how a range of bloggers utilized digital space to share their perspectives and connect with others.
JJ: Blogs are like newspapers: they generally reflect the tone and tenor of the times, and so they become lenses through which to view scientific ideas and research at a moment in time from a distinct voice. They also are a vehicle to distribute and document information in a digital space, so looking at the use of blogs over time as a publishing format is important in the realm of scientific publishing. One way to approach this collection is to trace the careers of science writers and scientists. Many of the bloggers within this collection are now authors publishing award-winning books and appearing on science docuseries as well as film documentaries. Time is another way to look at the collection: what were folks blogging about in the sciences during the first part of the 21st century?
Yahir, I am especially interested in your experience as an intern. What did you learn about web archives while working on this project?
Yahir: This internship provided me with my first hands-on experience with web archives, which significantly influenced my perspective on digital preservation. Before my internship, I saw archiving as a fixed process, similar to simply saving a web page. I now know that it involves preserving context, evolution, and change over time. I’ve learned how metadata decisions impact access and use, showcasing the importance of representation in shaping digital memory. These insights will be invaluable in my future endeavors, whether in science communication, public health, or cultural institutions. This experience has shown me that archiving is not just about preserving the past; it is also about saving something that will better the future.