Machine Translation in Pronunciation Space

Abstract

Research in the machine translation community focuses on translation in text space. However, humans are in fact also good at direct translation in pronunciation space. For some existing translation systems, such as simultaneous machine translation, translating directly in pronunciation space is inherently more natural and thus potentially more robust. In this paper, we conduct large-scale experiments on a self-built dataset of about $20$M En-Zh pairs of text sentences and their corresponding pronunciation sentences. We propose three new categories of translation: $1)$ translating a pronunciation sentence in the source language into a pronunciation sentence in the target language (P2P-Tran), $2)$ translating a text sentence in the source language into a pronunciation sentence in the target language (T2P-Tran), and $3)$ translating a pronunciation sentence in the source language into a text sentence in the target language (P2T-Tran). We compare them with traditional text translation (T2T-Tran). Our experiments clearly show that all $4$ categories of translation achieve comparable performance, with small and sometimes negligible differences.
