Abstract
Programs implemented in various programming languages form the foundation of software applications. To alleviate the burden of program migration and facilitate the development of software systems, automated program translation across languages has garnered significant attention. Previous approaches primarily focus on pairwise translation paradigms, learning translation between pairs of languages using bilingual parallel data. However, parallel data is difficult to collect for some language pairs, and the distribution of program semantics across languages can shift, posing challenges for pairwise program translation. In this paper, we argue that jointly learning a unified model to translate code across multiple programming languages is superior to separately learning from bilingual parallel data. We propose Variational Interaction for Multilingual Program Translation (VIM-PT), a disentanglement-based generative approach that jointly trains a unified model for multilingual program translation. VIM-PT disentangles code into language-shared and language-specific features using variational inference and interaction information with a novel lower bound, then achieves program translation through conditional generation. VIM-PT demonstrates four advantages: 1) it captures language-shared information more accurately from various implementations and improves the quality of multilingual program translation; 2) it mines and leverages the capability of non-parallel data; 3) it addresses the distribution shift of program semantics across languages; and 4) it serves as a unified model, reducing deployment complexity.