In the era of digital humanities, the scale of archival and born-digital materials has expanded dramatically. Massive digitisation efforts, automated pipelines and the growth of born-digital collections have added layers of complexity that challenge traditional research practices. At the same time, large language models (LLMs) offer new pathways into archives: they enable us not only to search large corpora but also to reveal connections, patterns and overlooked materials in ways that were previously hard to imagine. In keeping with the theme of DHNB 2026 – Lost in Abundance: Encounters with the Non-Canonical – this workshop addresses how archival scholars can turn the abundance of data into meaningful insight into non-canonical, neglected or dispersed materials.
The focus of this half-day workshop is on how LLMs can supplement traditional archival scholarship. While researchers increasingly experiment with retrieval-augmented generation (RAG) systems, tools such as NotebookLM, or pipelines that integrate local collections with external models, there remains little systematic reflection on the methodological and ethical implications of these practices. We also see researchers engaging with the internal “knowledge” of LLMs in order to trace individuals, networks, cultural flows and under-studied materials that are dispersed across scattered sources. These innovations raise questions of robustness and accountability in research: how to distinguish useful leads from model hallucinations, how to document AI-driven workflows, and how to maintain transparency when the model’s provenance is opaque.
Call for papers
We invite contributions that present concrete cases of AI use in archival settings, particularly for exploring overlooked data and non-canonical materials. Possible topics include but are not limited to:
- RAG pipelines and vector-based search across large archival sets
- practices for verifying and documenting AI-generated suggestions
- experiments where LLMs help identify hidden texts, individuals or cultural formations across dispersed or fragmentary data
- the integration of model-based insights with archival metadata, authority files or domain-specific ontologies
- reflections on cases where AI created unexpected bias or error and how these were addressed
- strategies for hybrid workflows where human expertise remains central
- considerations of data sovereignty, long-term preservation of AI-assisted workflows and ethical implications of researching non-canonical or marginalised sources.
Submit an abstract of 200–400 words, excluding references, via the DHNB conference tool at https://www.conftool.org/dhnb2026/. If you have not yet created a user account for DHNB 2026, you will need to create one. After logging in, select the workshop “Workshop 2: AI-Assisted Archival Research in the Age of Vast Data and the Non-Canonical.”
Outcome
The workshop is designed for sharing methodologies and reflecting on emerging practices. Short presentations of concrete work will be combined with structured discussion sessions. Participants will be encouraged to share successes, failures and lessons learned from AI-assisted archival research. Special emphasis will be placed on outlining reproducible methods, transparent documentation and strategies for maintaining methodological integrity in the face of computational opacity. The aim is to co-create a set of guiding considerations for researchers applying AI tools in the archival context while keeping humanistic values and critical inquiry at the center.
Important dates
Submission deadline: 11 January 2026
Notification of acceptance: 31 January 2026
Workshop organisers
The workshop is organised by Mads Rosendahl Thomsen (madsrt@cc.au.dk) and Marc Barcelos (marcb@cc.au.dk) from Aarhus University. Both are affiliated with the research center TEXT: Center for Contemporary Cultures of Text. The center is dedicated to understanding the impact of Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) on writing cultures at this pivotal moment in history, in which – after more than 6,000 years of handcrafted text production – all aspects of text creation and use are being altered. The center is committed to the view that a research-based understanding of the role of text in a new technological environment is essential for maintaining human-centered control over the production and use of text. Read more about TEXT here: https://arts.au.dk/en/text
For questions regarding the workshop, contact Mads Rosendahl Thomsen (madsrt@cc.au.dk).