- Researcher
- Software Engineer
- Cat Parent
- Child of the '80s
- and more...

by Shawn M. Jones, Valentina Neblitt-Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson
In the Dark and Stormy Archives (DSA) project, we focus on storytelling techniques to summarize collections of archived web pages. Since collections can have hundreds or even thousands of seeds (initial URLs) and each seed can be recrawled many times, with each version separat...
by Shawn M. Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson
Humans can choose individual documents from a web archive collection, but doing so is difficult if they are unfamiliar with the collection. The issue is scale. Most web archive collections consist of thousands of documents. Hypercane is a tool that automates the selection of d...
by Shawn M. Jones, Valentina Neblitt-Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson
Humans can choose individual documents from a web archive collection, but doing so is difficult if they are unfamiliar with the collection. The issue is scale. Most web archive collections consist of thousands of documents. Hypercane is a tool that automates the selection of d...
by Shawn M. Jones, Martin Klein, Herbert Van de Sompel, Michael L. Nelson, and Michele C. Weigle
Used by a variety of researchers, web archive collections have become invaluable sources of evidence. If a researcher is presented with a web archive collection that they did not create, how do they know what is inside so that they can use it for their own research? Search eng...
To allow previewing a web page, social media platforms have developed social cards: visualizations consisting of vital information about the underlying resource. At a minimum, social cards often include features such as the web resource’s title, text summary, striking image, a...
by Shawn M. Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson
To allow previewing a web page, social media platforms have developed social cards: visualizations consisting of vital information about the underlying resource. At a minimum, social cards often include features such as the web resource’s title, text summary, striking image, a...
Since researchers and archivists are most often interested in the on-topic content of these collections, identifying the off-topic Mementos is a crucial first step before further analysis. For that reason, we created the Off-Topic Memento Toolkit (OTMT), which identifies (but ...
We developed MementoEmbed, an archive-aware service that can generate different types of visualizations for a given Memento. MementoEmbed can generate cards that appropriately attribute content to its original resource, including both the original domain and associated favico...
by Shawn M. Jones, Martin Klein, and Herbert Van de Sompel
Links to web resources frequently break, and linked content can change at unpredictable rates. These dynamics of the Web are detrimental when references to web resources provide evidence or supporting information. In this paper, we highlight the significance of reference rot, ...
Individual web archive collections can contain thousands of documents. If a researcher wants to use one of these collections, which one best meets their information need? How does the researcher differentiate them? Ideally, a user would be able to glance at a visualization and ...