- Researcher
- Software Engineer
- Cat Parent
- Child of the '80s
- and more...
Hypercane: Toolkit for Summarizing Large Collections of Archived Webpages
by Shawn M. Jones, Valentina Neblitt-Jones, Michele C. Weigle, Martin Klein, Michael L. Nelson
In the Dark and Stormy Archives (DSA) project, we focus on storytelling techniques to summarize collections of archived web pages. Since collections can have hundreds or even thousands of seeds (initial URLs) and each seed can be recrawled many times, with each version separat...
Hypercane: Intelligent Sampling for Web Archive Collections
by Shawn M. Jones, Michele C. Weigle, Martin Klein, Michael L. Nelson
Humans can choose individual documents from a web archive collection, but doing so is difficult if they are unfamiliar with the collection. The issue is scale. Most web archive collections consist of thousands of documents. Hypercane is a tool that automates the selection of d...
It's All About The Cards: Sharing on Social Media Probably Encouraged HTML Metadata Growth
by Shawn M. Jones, Valentina Neblitt-Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson
Interoperability for Accessing Versions of Web Resources with the Memento Protocol
by Shawn M. Jones, Martin Klein, Herbert Van de Sompel, Michael L. Nelson, and Michele C. Weigle
Used by a variety of researchers, web archive collections have become invaluable sources of evidence. If a researcher is presented with a web archive collection that they did not create, how do they know what is inside so that they can use it for their own research? Search eng...
Automatically Selecting Striking Images for Social Cards
by Shawn M. Jones, Michele C. Weigle, Martin Klein, Michael L. Nelson
To allow previewing a web page, social media platforms have developed social cards: visualizations consisting of vital information about the underlying resource. At a minimum, social cards often include features such as the web resource’s title, text summary, striking image, a...
Off-Topic Memento Toolkit to identify topical outliers in web archive collections
Since researchers and archivists are most often interested in the on-topic content of these collections, identifying the off-topic Mementos is a crucial first step before further analysis. For that reason, we created the Off-Topic Memento Toolkit (OTMT), which identifies (but ...
MementoEmbed and Raintale for Web Archive Storytelling
We developed MementoEmbed, an archive-aware service that can generate different types of visualizations for a given Memento. MementoEmbed can generate cards that appropriately attribute content to its original resource, including both the original domain and associated favico...
Robustifying Links To Combat Reference Rot
by Shawn M. Jones, Martin Klein, and Herbert Van de Sompel
Links to web resources frequently break, and linked content can change at unpredictable rates. These dynamics of the Web are detrimental when references to web resources provide evidence or supporting information. In this paper, we highlight the significance of reference rot, ...
Web mentions
The Dark and Stormy Archives Project: Summarizing Web Archives Through Social Media Storytelling
Individual web archive collections can contain thousands of documents. If a researcher wants to use one of these collections, which one best meets their information need? How does the researcher differentiate them? Ideally, a user would be able to glance at a visualization and ...