The Challenges of HTR Model Training: Feedback from the Project Donner le gout de l'archive a l'ere numerique

  • 2023-03-16 18:17:37
  • Couture Beatrice, Verret Farah, Gohier Maxime, Deslandres Dominique

Abstract

The arrival of handwriting recognition technologies offers new possibilities for research in heritage studies. However, it is now necessary to reflect on the experiences and practices developed by research teams. Our use of the Transkribus platform since 2018 has led us to search for the most effective ways to improve the performance of our handwritten text recognition (HTR) models, which are built to transcribe French handwriting dating from the 17th century. This article therefore reports on the impact of creating transcription protocols, using the language model at full scale, and determining the best way to use base models in order to increase the performance of HTR models. Combined, these elements can increase the performance of a single model by more than 20% (reaching a Character Error Rate below 5%). This article also discusses some challenges regarding the collaborative nature of HTR platforms such as Transkribus and the ways researchers can share the data generated in the process of creating or training handwritten text recognition models.

 
