Abstract
This paper examines how linguistic similarity affects cross-lingual phonetic representation in speech processing for low-resource languages, emphasizing effective source language selection. Previous cross-lingual research has used various source languages to enhance performance for the target low-resource language without thorough consideration of selection. Our study stands out by providing an in-depth analysis of language selection, supported by a practical approach to assess phonetic proximity among multiple language families. We investigate how within-family similarity impacts performance in multilingual training, which aids in understanding language dynamics. We also evaluate the effect of using phonologically similar languages, regardless of family. For the phoneme recognition task, utilizing phonologically similar languages consistently achieves a relative improvement of 55.6% over monolingual training, even surpassing the performance of a large-scale self-supervised learning model. Multilingual training within the same language family demonstrates that higher phonological similarity enhances performance, while lower similarity results in degraded performance compared to monolingual training.