Boosting Star-GANs for Voice Conversion with Contrastive Discriminator

  • 2022-09-22 11:53:05
  • Shijing Si, Jianzong Wang, Xulong Zhang, Xiaoyang Qu, Ning Cheng, Jing Xiao
Nonparallel multi-domain voice conversion methods such as the StarGAN-VCs have been widely applied in many scenarios. However, training these models is usually challenging because of their complicated adversarial network architectures. To address this, we leverage state-of-the-art contrastive learning techniques and incorporate an efficient Siamese network structure into the StarGAN discriminator. Our method, called SimSiam-StarGAN-VC, improves training stability and effectively prevents discriminator overfitting during training. We conduct experiments on the Voice Conversion Challenge (VCC 2018) dataset, plus a user study, to validate the performance of our framework. Our experimental results show that SimSiam-StarGAN-VC significantly outperforms existing StarGAN-VC methods in terms of both objective and subjective metrics.
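
As a rough illustration of the Siamese (SimSiam-style) objective that the name SimSiam-StarGAN-VC refers to, the sketch below computes the symmetric negative cosine similarity used in SimSiam for two augmented views of the same input. The abstract does not specify how this term is wired into the StarGAN discriminator, so the function names (`negative_cosine`, `simsiam_loss`) and the plain-list vector representation are assumptions made only for illustration, not the authors' implementation.

```python
import math


def negative_cosine(p, z):
    """Negative cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)


def simsiam_loss(p1, z1, p2, z2):
    """Symmetric SimSiam-style loss for one pair of augmented views.

    p1, p2: predictor outputs for view 1 and view 2 (hypothetical names).
    z1, z2: encoder outputs for view 1 and view 2 (hypothetical names).
    In an autodiff framework z1 and z2 would be treated as constants
    (stop-gradient); plain Python has no gradients, so this is only noted.
    """
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)


if __name__ == "__main__":
    # Identical views give the minimum possible loss.
    print(simsiam_loss([1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]))
```

The loss ranges from -1 (the two views agree perfectly) to 1 (they point in opposite directions). In a setup like the one the abstract describes, a term of this kind would presumably be added to the discriminator's adversarial loss as a regularizer, but that weighting and placement are not stated here.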


