Workshop: Graph-based text and knowledge modelling using the ATAG Editor and Entity Manager

Introduction

This hands-on half-day workshop introduces two complementary tools for graph-based text and knowledge modelling: the ATAG Editor and the Entity Manager. Both are built on the modelling concepts Applied Text as Graph (ATAG) and Reusable Abstraction Model for Editorial Needs (RAMEN). They provide a generic, domain-agnostic core that projects can extend with domain-specific options, enabling editors and researchers to model textual phenomena beyond the constraints of XML hierarchies.

Modelling foundations: RAMEN and ATAG

TEI-XML has long been the established standard for modelling structured text data in digital editions (Cummings 2019). However, in complex use cases, the limitations of the hierarchical tree model become increasingly apparent: overlapping structures, ambiguous reading variants, fragmentary or non-linear texts, and heavily annotated materials can often only be represented in XML through cumbersome workarounds or lossy simplifications (TEI Consortium 2023; Dekker/Birnbaum 2017; Liedtke 2020; Cugliana et al. 2024).

These problems have been discussed extensively, as have non-XML-based modelling alternatives. In recent years, graph technologies have gained momentum within the Digital Humanities community, with editorial projects increasingly adopting graph-based modelling approaches to represent their data. Graph models have been used to connect complex text fragments with annotations, overcoming the constraints of tree structures (Zeterberg et al. 2025; Liedtke 2020). However, these implementations often remain in the prototype phase or are project-specific rather than generic.

RAMEN (Reusable Abstraction Model for Editorial Needs, Enns/Kuczera 2025) and ATAG (Applied Text as Graph, Kuczera 2024) offer a standardized approach for modelling digital editions. Both concepts use Labeled Property Graphs (LPG), in which information is stored in nodes that carry labels and attributes and are interconnected through typed edges. RAMEN provides a flexible, domain-agnostic framework encompassing all central elements of an editorial project – including texts, annotations, entities, collections, and their relationships (Fig. 1). Projects can add a layer with their domain-specific requirements on top of the abstract base model.

Figure 1: Two-layered modelling with RAMEN. An abstract metamodel with three parts: (Subset A) context modelling, (Subset B) knowledge modelling, and (Subset C) text / image modelling, followed by a domain-specific layer.
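The Labeled Property Graph idea described above can be illustrated with a minimal in-memory sketch. It is not the RAMEN implementation: the class names and the labels "Text", "Entity", "Person" as well as the edge type "MENTIONS" are assumptions chosen for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """A graph node with labels and key-value attributes."""
    node_id: str
    labels: set[str]
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed, typed edge that can carry its own attributes."""
    source: str
    target: str
    edge_type: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Graph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, edge: Edge) -> None:
        # Both endpoints must exist before they can be connected.
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append(edge)

    def neighbours(self, node_id: str, edge_type: str) -> list[Node]:
        """Return the targets of all outgoing edges of the given type."""
        return [
            self.nodes[e.target]
            for e in self.edges
            if e.source == node_id and e.edge_type == edge_type
        ]


# Hypothetical sample data: labels and edge type are placeholders.
graph = Graph()
graph.add_node(Node("t1", {"Text"}, {"title": "Sample letter"}))
graph.add_node(Node("e1", {"Entity", "Person"}, {"name": "Sample person"}))
graph.add_edge(Edge("t1", "e1", "MENTIONS", {"certainty": "high"}))

mentioned = graph.neighbours("t1", "MENTIONS")
print([n.properties["name"] for n in mentioned])
```

A domain-specific layer would, in this picture, amount to additional labels, edge types, and attributes on top of the same generic structures.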

ATAG, as part of RAMEN, focuses specifically on fine-grained text and annotation modelling: it breaks text down into its smallest units, the individual characters. Each character is stored in its own node, which can be individually addressed and linked to annotation nodes (Fig. 2). RAMEN and ATAG use Neo4j as the underlying graph database.

Figure 2: Start of letter Wr045-60ra in the project Das Buch der Briefe der Hildegard von Bingen represented in ATAG. Three annotations (orange) of different types (emphasised, expansion, head) are connected with single characters (pink) of the text.
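The character-level approach can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than ATAG's actual data model: the classes and function names are hypothetical, the sample text is invented, and only the annotation type names "head" and "emphasised" are taken from Fig. 2.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Character:
    index: int  # position of the character in the text
    text: str   # the single character stored in this node


@dataclass
class Annotation:
    annotation_type: str
    characters: list[Character] = field(default_factory=list)

    def covered_text(self) -> str:
        """Concatenate the characters this annotation is linked to."""
        return "".join(c.text for c in self.characters)


def to_characters(text: str) -> list[Character]:
    """Split a text into one node per character."""
    return [Character(i, ch) for i, ch in enumerate(text)]


def annotate(chars: list[Character], annotation_type: str,
             start: int, end: int) -> Annotation:
    """Link an annotation to the characters in the half-open range [start, end)."""
    return Annotation(annotation_type, chars[start:end])


# Invented sample text; the two annotations overlap without nesting.
chars = to_characters("Sample heading text")
head = annotate(chars, "head", 0, 14)
emphasised = annotate(chars, "emphasised", 7, 19)

# Characters linked to both annotations at once.
shared = [c for c in head.characters if c in emphasised.characters]

print(head.covered_text())
print(emphasised.covered_text())
print("".join(c.text for c in shared))
```

Because each annotation simply points to the characters it covers, two annotations can share some characters without one having to contain the other, which is the situation that requires workarounds in a strict tree.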

This workshop presents a comprehensive package addressing previous limitations: first, a reusable modelling approach adaptable to individual project needs; and second, two web-based editing tools – the ATAG Editor and Entity Manager – specialized in connecting text fragments with elements of a knowledge base. These tools are already being used in projects such as Das Buch der Briefe der Hildegard von Bingen (Dreyer et al. 2023), The Socinian Correspondence (Daugirdas/Kuczera 2017), and Regesta Imperii (Kuczera et al. 2025).

Tools

Entity Manager

Entity Manager is a web application for knowledge management in digital editions (Fig. 3). It is currently being developed for Regesta Imperii, but because it is based on the generic RAMEN model, it can be used by any project that adopts this model. It supports the creation and editing of entities, which in this context are, for example, persons, places, roles, or other project-specific things. As is typical for knowledge management, properties can be assigned to entities (e.g., date of death, alternative spellings, Wikidata ID). In addition, entities can be linked to each other. The types of links are open and can be defined on a project-specific basis, regardless of whether they express relationships, political networks, or the subordination of concepts. The Entity Manager can also be used for existing projects and offers helpful functions for cleaning up data, for example finding and merging duplicates.

Figure 3: An entity from the Regesta Imperii project in the Entity Manager. Details are displayed on the left, together with any related entities and related collections (in this case, one regesta). Similar entities for potential data normalisation processes are displayed on the right.
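How candidate duplicates might be found can be sketched with plain string similarity. The source does not describe how the Entity Manager determines similar entities, so the following is only one possible approach: the function names, the threshold of 0.85, and the sample names are assumptions. It uses `difflib.SequenceMatcher` from the Python standard library.

```python
from __future__ import annotations

from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def duplicate_candidates(names: list[str],
                         threshold: float = 0.85) -> list[tuple[str, str, float]]:
    """Return all pairs of names whose similarity reaches the threshold."""
    pairs = []
    for a, b in combinations(names, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


# Invented sample data for illustration.
names = ["Hildegard von Bingen", "Hildegard v. Bingen", "Regensburg"]
candidates = duplicate_candidates(names)
for a, b, score in candidates:
    print(f"{a!r} ~ {b!r} (similarity {score})")
```

Such a list would only propose candidates; whether two entries are merged remains an editorial decision.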

ATAG Editor

The ATAG Editor is a WYSIWYM (What You See Is What You Mean) tool designed for text editing and annotation (Fig. 4). While its interface draws on familiar text editor conventions, the tool leverages the full potential of graph-based modelling. Beyond simple text segment classification – such as distinguishing base texts from comments or marginal notes – the ATAG Editor connects textual structures to other elements in the model. Text management functions, for example organizing texts within corpora, are integrated into the editor and work similarly to a file manager.

Projects can configure annotation types to match their specific needs, enabling the same underlying technology to handle diverse text genres – from letters and regesta to diaries – each with distinct semantic properties. Annotations support attributes of different data types (numbers, strings, dates) and can be linked to other textual structures (for example commentary text) as well as to entries in the knowledge base (such as mentioned persons, places, or references; in short, everything that is managed by the Entity Manager). Annotations are therefore the gateway through which any character of any textual structure can address other elements of the edition.

Figure 4: Regesta from the project Regesta Imperii in the ATAG Editor. Annotated text is displayed in the center, annotation details in the right sidebar.
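A project-specific annotation type with typed attributes could be described along the following lines. This is a sketch only: the ATAG Editor's actual configuration format is not shown in this proposal, and the class, the function, and the attribute names ("date", "certainty", "note") are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Any


@dataclass
class AnnotationType:
    name: str
    attributes: dict[str, type]  # attribute name -> expected Python type


def validate(annotation_type: AnnotationType,
             values: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the values are valid."""
    errors = []
    for key, value in values.items():
        expected = annotation_type.attributes.get(key)
        if expected is None:
            errors.append(f"unknown attribute: {key}")
        elif not isinstance(value, expected):
            errors.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors


# Hypothetical annotation type with a date, a number, and a string attribute.
dating = AnnotationType("dating", {"date": date, "certainty": int, "note": str})

ok = validate(dating, {"date": date(1150, 1, 1), "certainty": 2})
problems = validate(dating, {"date": date(1150, 1, 1), "certainty": "high"})

print(ok)
print(problems)
```

Declaring the expected data type per attribute allows invalid input to be caught when an annotation is created rather than when the data is later queried.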

Together, the Entity Manager and the ATAG Editor form a software stack for working with a flexible yet project-adaptable data model. The tools take advantage of Neo4j graph databases, which are designed to retrieve highly interconnected data rapidly. Using the Cypher query language, information can be extracted quickly from the editorial data starting from any entry point. This enables a tight integration of knowledge base and text fragments.
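What a Cypher query that starts from an entity and collects the text passages referring to it might look like is sketched below. The schema is hypothetical: the labels `Entity`, `Annotation`, and `Character`, the relationship types `REFERS_TO` and `HAS_ANNOTATION`, and all property names are placeholders and do not reflect the actual RAMEN/ATAG schema. The helper function is likewise an assumption; it expects a driver object that offers `execute_query`, as the official Neo4j Python driver (version 5 and later) does.

```python
from __future__ import annotations

# Hypothetical schema: labels, relationship types, and property names
# are placeholders, not the actual RAMEN/ATAG schema.
TEXT_FOR_ENTITY = """
MATCH (e:Entity {uuid: $entity_id})<-[:REFERS_TO]-(a:Annotation)
MATCH (c:Character)-[:HAS_ANNOTATION]->(a)
WITH a, c ORDER BY c.position
RETURN a.type AS annotation_type, collect(c.text) AS characters
"""


def annotated_passages(driver, entity_id: str) -> list[tuple[str, str]]:
    """Return (annotation type, annotated text) pairs for one entity.

    `driver` is any object with an `execute_query` method that accepts
    query parameters as keyword arguments and returns a
    (records, summary, keys) triple.
    """
    records, _summary, _keys = driver.execute_query(
        TEXT_FOR_ENTITY, entity_id=entity_id
    )
    return [
        (record["annotation_type"], "".join(record["characters"]))
        for record in records
    ]
```

With the official driver, a suitable object can be created with `GraphDatabase.driver(uri, auth=(user, password))`. The same traversal could equally start from a character or an annotation, since a graph query has no privileged root.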

Workshop Details

Aims

After the workshop, participants will be able to

● understand graph-based text modelling as an alternative to hierarchical structures
● develop their own project-specific flavor of RAMEN
● build knowledge structures with the Entity Manager
● structure, edit, annotate and interconnect texts with the ATAG Editor
● design concrete implementation strategies for individual research projects

Format

Half-day (4 hours)

Number of Participants

Max. 25

Intended Audience

● Editors of ongoing or planned digital editions who want to overcome XML limitations
● DH researchers interested in graph-based modelling and/or innovative uses of graph technologies
● Instructors seeking easy-to-use tools for teaching markup, modelling, and data curation
● Students curious about text annotation for research or project-based learning

Requirements

Required prior knowledge
● Familiarity with digital editions and their challenges is an advantage, but not mandatory
● No programming skills required
● No prior knowledge of graph technologies required

Bring to the workshop
● A laptop with a current stable version of a Chromium-based browser (Google Chrome, Microsoft Edge, etc.)
● Participants are encouraged to bring use cases from their own projects

Program

45 min: Welcome and fundamentals: Basics of graph modelling with RAMEN and ATAG
45 min: Practical session: Writing, annotating, and knowledge modelling with ATAG Editor and Entity Manager
15 min: Coffee break
45 min: Practical session (continuation)
30 min: Evaluation and closing discussion: Completion of feedback forms, followed by a discussion of application scenarios, feedback, and further development

Links

● RAMEN: https://github.com/RAMEN-Suite
● Demo ATAG Editor: https://thm-graphs.github.io/atag-editor/
● GitHub repository ATAG Editor: https://github.com/THM-Graphs/atag-editor

Coordinators

Maximilian Michel is a researcher at the Academy of Sciences and Literature, Mainz, with a focus on graph technologies, web development, and digital scholarly editions. He works on the Hildegard of Bingen digital edition project, and is currently developing a graph-based text editor.

Vincent Neeb is a master’s student in computer science at the Technische Hochschule Mittelhessen and works as a student assistant for the Regesta Imperii (RI) project. He specializes in full-stack web development and is currently developing a web-based knowledge management system for the RI.

Sebastian Enns is a researcher at TH Mittelhessen and the Academy of Sciences and Literature, Mainz. His research interests include digital scholarly editions, graph technologies, and model-driven development. He works on The Socinian Correspondence and the Antje Brons digital editions.

Andreas Kuczera is Professor of Applied Digital Methods in the Humanities at TH Mittelhessen. His research focuses on graph-based digital editions and scholarly editing. He leads projects on The Socinian Correspondence, Hildegard von Bingen, and Antje Brons.


Bibliography

Cugliana, Elisa, Sebastian Enns, and Andreas Kuczera. 2024. “›Sortes Dictae Sunt‹. Methods for Editing Mediaeval Books of Fortune.” Zeitschrift Für Digitale Geisteswissenschaften 9 (October). DOI: https://doi.org/10.17175/2024_005.

Cummings, James. 2019. “A World of Difference: Myths and Misconceptions about the TEI.” Digital Scholarship in the Humanities 34 (Supplement_1): i58–79. DOI: https://doi.org/10.1093/llc/fqy071.

Daugirdas, Kęstutis, and Andreas Kuczera. 2017. “Die sozinianischen Briefwechsel: Zwischen Theologie, frühmoderner Naturwissenschaft und politischer Korrespondenz.” DFG-funded research project (DFG project no. 324518514). URL: https://sozinianer.de.

Dekker, Ronald Haentjens, and David Birnbaum. 2017. “It’s more than just overlap: Text As Graph.” Presented at Balisage: The Markup Conference 2017, Washington, DC, August 1–4, 2017. Proceedings of Balisage: The Markup Conference 2017, Balisage Series on Markup Technologies, vol. 19 (August). DOI: https://doi.org/10.4242/BalisageVol19.Dekker01.

Dreyer, Mechthild, Andreas Kuczera, and Thomas Stäcker. 2023. “Das Buch der Briefe der Hildegard von Bingen. Genese – Struktur – Komposition.” DFG-funded research project (DFG project no. 530755431). URL: https://liberepistolarum.de.

Enns, Sebastian, and Andreas Kuczera. 2025. “From Abstraction to Annotation: Modelling Heterogeneous Structures in Digital Editions.” [Submitted manuscript].

Kuczera, Andreas, Yannick Pultar, Dominik Kasper, Anna Prusova, and Dieter Rübsamen. 2025. Regesta Imperii Online. Academy of Sciences and Literature, Mainz. URL: https://www.regesta-imperii.de.

Liedtke, Clemens. 2020. “DH’s Next Top-Model? Digitale Editionsentwicklung zwischen Best Practice und Innovation am Beispiel des ‘Corpus Masoreticum.’” DHd 2020 Spielräume: Digital Humanities zwischen Modellierung und Interpretation. 7. Tagung des Verbands “Digital Humanities im deutschsprachigen Raum” (DHd 2020). Zenodo, February 20. DOI: https://doi.org/10.5281/zenodo.4621902.

TEI Consortium. 2023. “Non-Hierarchical Structures.” Guidelines for Electronic Text Encoding and Interchange [4.7.0], November 16. URL: https://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html.

Zeterberg, Max-Ferdinand, Michelle Weidling, Clemens Steinberger, Tillmann Dönicke, and Noah Kröll. 2025. Vom Baum zum Netz. Digitale Editionen als Labeled Property Graphs. Zenodo, August 29. URL: https://zenodo.org/records/16994723.