Me presenting at CEDWARC 2019
  • Researcher
  • Software Engineer
  • Cat Parent
  • Child of the '80s
  • and more...
Hypercane: Toolkit for Summarizing Large Collections of Archived Webpages

Accepted Future Publication

by Shawn M. Jones, Valentina Neblitt-Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson

In the Dark and Stormy Archives (DSA) project, we focus on storytelling techniques to summarize collections of archived web pages. Since collections can have hundreds or even thousands of seeds (initial URLs) and each seed can be recrawled many times, with each version separat...

Hypercane: Intelligent Sampling for Web Archive Collections

Accepted Future Publication

by Shawn M. Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson

Humans can choose individual documents from a web archive collection, but doing so is difficult if they are unfamiliar with the collection. The issue is scale. Most web archive collections consist of thousands of documents. Hypercane is a tool that automates the selection of d...

It's All About The Cards: Sharing on Social Media Probably Encouraged HTML Metadata Growth

Accepted Future Publication

by Shawn M. Jones, Valentina Neblitt-Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson

Humans can choose individual documents from a web archive collection, but doing so is difficult if they are unfamiliar with the collection. The issue is scale. Most web archive collections consist of thousands of documents. Hypercane is a tool that automates the selection of d...

Interoperability for Accessing Versions of Web Resources with the Memento Protocol

by Shawn M. Jones, Martin Klein, Herbert Van de Sompel, Michael L. Nelson, and Michele C. Weigle

Used by a variety of researchers, web archive collections have become invaluable sources of evidence. If a researcher is presented with a web archive collection that they did not create, how do they know what is inside so that they can use it for their own research? Search eng...

Automatically Selecting Striking Images for Social Cards

by Shawn M. Jones, Michele C. Weigle, Martin Klein, and Michael L. Nelson

To allow previewing a web page, social media platforms have developed social cards: visualizations consisting of vital information about the underlying resource. At a minimum, social cards often include features such as the web resource’s title, text summary, striking image, a...

Off-Topic Memento Toolkit to identify topical outliers in web archive collections

Since researchers and archivists are most often interested in the on-topic content of these collections, identifying the off-topic Mementos is a crucial first step before further analysis. For that reason, we created the Off-Topic Memento Toolkit (OTMT), which identifies (but ...

MementoEmbed and Raintale for Web Archive Storytelling

We developed MementoEmbed, an archive-aware service that can generate different types of visualizations for a given Memento. MementoEmbed can generate cards that appropriately attribute content to its original resource, including both the original domain and associated favico...

Robustifying Links To Combat Reference Rot

by Shawn M. Jones, Martin Klein, and Herbert Van de Sompel

Links to web resources frequently break, and linked content can change at unpredictable rates. These dynamics of the Web are detrimental when references to web resources provide evidence or supporting information. In this paper, we highlight the significance of reference rot, ...

The Dark and Stormy Archives Project: Summarizing Web Archives Through Social Media Storytelling

Individual web archive collections can contain thousands of documents. If a researcher wants to use one of these collections, which one best meets their information need? How does the researcher differentiate them? Ideally, a user would be able to glance at a visualization and ...
