Abstract
State-of-the-art machine translation (MT) systems are typically trained to generate the "standard" target language; however, many languages have multiple varieties (regional varieties, dialects, sociolects, non-native varieties) that are different from the standard language. Such varieties are often low-resource, and hence do not benefit from contemporary NLP solutions, MT included. We propose a general framework to rapidly adapt MT systems to generate language varieties that are close to, but different from, the standard target language, using no parallel (source–variety) data. This also includes adaptation of MT systems to low-resource typologically-related target languages. We experiment with adapting an English–Russian MT system to generate Ukrainian and Belarusian, an English–Norwegian Bokmål system to generate Nynorsk, and an English–Arabic system to generate four Arabic dialects, obtaining significant improvements over competitive baselines.