Investigating Transcription Normalization in the Faetar ASR Benchmark

  • 2025-08-20 06:19:36
  • Leo Peckham, Michael Ong, Naomi Nagy, Ewan Dunbar

Abstract

We examine the role of transcription inconsistencies in the Faetar Automatic Speech Recognition benchmark, a challenging low-resource ASR benchmark. With the help of a small, hand-constructed lexicon, we find that, while inconsistencies do exist in the transcriptions, they are not the main challenge in the task. We also demonstrate that bigram word-based language modelling is of no added benefit, but that constraining decoding to a finite lexicon can be beneficial. The task remains extremely difficult.

 
