How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario

  • 2025-01-05 13:07:32
  • Shih-Heng Wang, Zih-Ching Chen, Jiatong Shi, Ming-To Chuang, Guan-Ting Lin, Kuan-Po Huang, David Harwath, Shang-Wen Li, Hung-yi Lee

Abstract

The utilization of speech Self-Supervised Learning (SSL) models achieves impressive performance on Automatic Speech Recognition (ASR). However, in low-resource language ASR, they encounter the domain mismatch problem between pre-trained and low-resource languages. Typical solutions like fine-tuning the SSL model suffer from high computation costs, while using frozen SSL models as feature extractors comes with poor performance. To handle these issues, we extend a conventional efficient fine-tuning scheme based on the adapter. We add an extra intermediate adaptation to warm up the adapter and downstream model initialization. Remarkably, we update only 1-5% of the total model parameters to achieve the adaptation. Experimental results on the ML-SUPERB dataset show that our solution outperforms conventional efficient fine-tuning. It achieves up to a 28% relative improvement in the Character/Phoneme error rate when adapting to unseen languages.
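To give a rough sense of how adapter-based fine-tuning can keep the trainable share of a model in the low single-digit percent range, the sketch below counts the parameters of a generic residual bottleneck adapter (a down-projection and an up-projection, each with a bias) inserted into every layer of a frozen backbone. All sizes here (backbone parameter count, hidden dimension, bottleneck width, layer count) are hypothetical values chosen for illustration, and the adapter design is a common textbook form. Neither is taken from the paper.

```python
def adapter_param_count(hidden_dim: int, bottleneck_dim: int) -> int:
    """Parameters in one bottleneck adapter: down- and up-projection, each with a bias."""
    down = hidden_dim * bottleneck_dim + bottleneck_dim
    up = bottleneck_dim * hidden_dim + hidden_dim
    return down + up


def trainable_fraction(frozen_params: int, hidden_dim: int,
                       bottleneck_dim: int, num_layers: int) -> float:
    """Share of all parameters that are trainable when only the adapters are updated."""
    trainable = num_layers * adapter_param_count(hidden_dim, bottleneck_dim)
    return trainable / (frozen_params + trainable)


# Hypothetical sizes (assumptions, not the paper's configuration):
# a frozen backbone of about 95M parameters with 12 layers of width 768,
# and one adapter with bottleneck width 256 per layer.
frac = trainable_fraction(
    frozen_params=95_000_000, hidden_dim=768, bottleneck_dim=256, num_layers=12
)
print(f"trainable share: {frac:.2%}")
```

Under these assumed sizes the trainable share falls inside the 1-5% range quoted in the abstract. A narrower bottleneck lowers it further, and any trainable downstream model would add to it.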

 
