Decoding the Past, Digitizing the Future: Transkribus and (Digitized) Cultural Heritage. An interactive tutorial.

Annemieke Romein

 

READ-COOP SCE/ Transkribus.
University of Twente, the Netherlands.
University of Berne, Switzerland.

 

Workshop Description

We cordially invite you to participate in an immersive investigation of Transkribus, a pioneering instrument at the nexus of digital humanities, archival science, and artificial intelligence. Transkribus is an ideal use case: it integrates traditional humanities scholarship with computational techniques, brings the tools of academia to users with different backgrounds in an easily understandable way, and applies AI so that users benefit without learning how to program.

The Transkribus system was developed through EU-funded projects and employs advanced AI to automate handwritten text recognition, thereby reducing the time and expertise required for palaeographic analysis. Transkribus currently has 225+ private and institutional members, 250,000 registered users, and 200+ public models. It has significantly enhanced users’ ability to tackle diverse handwriting styles adaptively and has become an invaluable asset for researchers, archivists, and students.

The workshop is structured in two parts, allowing participants to engage with Transkribus at their own pace and according to their interests and requirements. We encourage attendees to bring their own archival or library materials, whether handwritten or printed, reflecting the rich cultural heritage of the Baltic and Nordic regions. For those unable to bring their own materials, we will provide practice documents to ensure a comprehensive learning experience.

During the session, participants will receive practical experience using Transkribus’ user-friendly, browser-based interface. They will learn how to navigate the platform, understand its AI-driven recognition process, and explore its potential applications in their research workflows. By the end of the workshop, participants will have acquired the skills to use Transkribus in their projects, thereby contributing to the digital preservation and accessibility of cultural heritage.

 

Aim of the Workshop

The principal objective of this workshop is to equip participants with the knowledge and expertise to use Transkribus effectively in their digital humanities research and/or archival practices. By the end of the session, attendees will:

  1. Understand the principles behind AI-driven handwritten text recognition and its role in digital cultural heritage preservation.
  2. Gain practical experience using Transkribus’ interface, including uploading documents, training models, and extracting recognized text.
  3. Learn strategies for integrating Transkribus into various research workflows, from individual projects to large-scale digitization efforts.
  4. Explore Transkribus’ potential to enhance accessibility and analysis of historical documents across different disciplines within the humanities.
  5. Develop an appreciation for the intersection of traditional humanities scholarship, archival practices, and computational techniques in the digital age.

Tentative Programme

Part 1. Introductory Workshop (4 hours)

Audience: suitable for first-time users and those interested in using Automatic Text Recognition (ATR).

09:00 – 09:30 Welcome, introductions and READ-COOP

09:35 – 09:45 Upload, servers and sensitivities

09:45 – 10:00 Practical exercise: upload and organize

10:00 – 10:15 Simple workflow

10:15 – 10:30 Training ATR models within the workflow

10:30 – 11:00 Practical exercise

11:00 – 11:30 Coffee break

11:30 – 11:45 Layout recognition: the importance of layout analysis (LA)

11:45 – 12:15 Structure tags and training an LA model

12:15 – 12:45 Practical exercise

12:45 – 13:00 Wrap-up

 

Lunch break: 13:00 – 14:00

 

Part 2. Advanced Workshop (4 hours)

Audience: suitable for those who attended the introductory workshop and/or those who have already worked with Transkribus.

14:00 – 14:10 Outline of this workshop session/introductions

14:10 – 14:30 Textual tagging

14:30 – 14:45 Practical exercise

14:45 – 15:00 Table recognition models

15:00 – 15:30 Training layout-related models (LA/baseline/fields/tables)

15:30 – 16:00 Practical exercise

16:00 – 16:30 Coffee break

16:30 – 17:00 Text-Image-Matching (TIM)

17:00 – 17:10 Search

17:10 – 17:20 Practical exercise

17:20 – 17:45 Working with a team or volunteers/Transkribus Sites/new features

17:45 – 18:00 Wrap-up

 

 

Practical details

Format:

  • Interactive tutorial (or workshop) with practice moments for hands-on learning.

Target Audience:

  • Introductory Workshop: Suitable for those interested in learning about Automatic Text Recognition for the first time or first-time users of Transkribus.
  • Advanced Workshop: For those who have attended the introductory session or already have experience using Transkribus. Participants without prior experience who are keen to dive deeper are also welcome.

Anticipated Number of Participants:

  • Flexible: Between 10 and 90 participants.

Requirements for Participants:

  • Device:
    • Bring your own laptop (preferred).
    • Tablets are possible but may offer reduced functionality.
    • A mouse can be helpful but is optional.
  • Transkribus Account:
    • Create a free account at transkribus.org before the workshop. This can also be done at the workshop, but pre-registration is more convenient.
    • Bring your registration details, including your email and password.
    • If you use multi-factor authentication (MFA), bring the device you use for authentication.

About the tutor

Annemieke Romein is a teacher, an early modern (legal) historian by training, and a digital historian by choice. She combines in-depth historical research with the practical application of digital tools such as automatic text recognition, automatic meta-dating, and ontologies/linked data. In 2023, she was elected (honorary/volunteer) community director of READ-COOP to represent the members and users on the Board of Directors. She has been a volunteer tutor in many workshops, webinars, and training sessions, as she strongly believes in the Transkribus motto: unlock every doc.