HLTCOE JHU Submission to the Voice Privacy Challenge 2024

  • 2024-09-17 15:39:44
  • Henry Li Xinyuan, Zexin Cai, Ashi Garg, Kevin Duh, Leibny Paola García-Perera, Sanjeev Khudanpur, Nicholas Andrews, Matthew Wiesner

Abstract

We present a number of systems for the Voice Privacy Challenge, including voice conversion based systems such as the kNN-VC method and the WavLM voice conversion method, and text-to-speech (TTS) based systems including Whisper-VITS. We found that while voice conversion systems better preserve emotional content, they struggle to conceal speaker identity in semi-white-box attack scenarios; conversely, TTS methods perform better at anonymization and worse at emotion preservation. Finally, we propose a random admixture system which seeks to balance out the strengths and weaknesses of the two categories of systems, achieving a strong EER of over 40% while maintaining UAR at a respectable 47%.

 
