Multilingual Speech-to-Speech Translation into Multiple Target Languages

  • 2023-07-17 18:12:44
  • Hongyu Gong, Ning Dong, Sravya Popuri, Vedanuj Goswami, Ann Lee, Juan Pino

Abstract

Speech-to-speech translation (S2ST) enables spoken communication between people who speak different languages. The few existing studies on multilingual S2ST focus on multilinguality on the source side, i.e., translation from multiple source languages into one target language. We present the first work on multilingual S2ST supporting multiple target languages. Leveraging recent advances in direct S2ST with speech-to-unit (S2U) and vocoder components, we equip these key components with multilingual capability. Speech-to-masked-unit (S2MU) is the multilingual extension of S2U, which masks units that do not belong to the given target language to reduce language interference. We also propose a multilingual vocoder trained with language embedding and an auxiliary language identification loss. On benchmark translation test sets, our proposed multilingual model outperforms bilingual models in translation from English into 16 target languages.
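
The unit masking idea behind S2MU can be illustrated with a minimal sketch. The abstract does not specify where or how the mask is applied, so the code below assumes one plausible reading: the model predicts over a shared unit vocabulary, and probabilities of units outside the target language's inventory are forced to zero before normalization. The function name masked_softmax, the LANGUAGE_UNITS table, and the six-unit vocabulary are hypothetical and chosen only for illustration.

```python
import math


def masked_softmax(logits, allowed_units):
    """Softmax over a shared unit vocabulary, restricted to allowed unit IDs.

    Units outside `allowed_units` receive probability 0, so they can never
    be predicted for the given target language.
    """
    if not allowed_units:
        raise ValueError("allowed_units must not be empty")
    # Subtract the largest allowed logit for numerical stability.
    peak = max(logits[i] for i in allowed_units)
    exps = [
        math.exp(x - peak) if i in allowed_units else 0.0
        for i, x in enumerate(logits)
    ]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical inventories: which IDs of a 6-unit shared vocabulary
# occur in each target language (not taken from the paper).
LANGUAGE_UNITS = {
    "es": {0, 1, 2, 3},
    "fr": {2, 3, 4, 5},
}

# Toy decoder scores for one output step.
logits = [2.0, 0.5, 1.0, -1.0, 3.0, 0.0]

probs_es = masked_softmax(logits, LANGUAGE_UNITS["es"])
probs_fr = masked_softmax(logits, LANGUAGE_UNITS["fr"])
```

In this toy example, unit 4 has the highest raw score, but it is not in the hypothetical Spanish inventory, so the masked distribution for "es" assigns it zero probability and the most likely unit becomes unit 0. For "fr", unit 4 remains available and stays the top prediction.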

 
