Few-shot Controllable Style Transfer for Low-Resource Settings: A Study in Indian Languages

  • 2021-10-14 14:16:39
  • Kalpesh Krishna, Deepak Nathani, Xavier Garcia, Bidisha Samanta, Partha Talukdar

Abstract

Style transfer is the task of rewriting an input sentence into a target style while approximately preserving its content. While most prior literature assumes access to large style-labelled corpora, recent work (Riley et al. 2021) has attempted "few-shot" style transfer using only 3-10 sentences at inference for extracting the target style. In this work we consider one such low resource setting where no datasets are available: style transfer for Indian languages. We find that existing few-shot methods perform this task poorly, with a strong tendency to copy inputs verbatim. We push the state-of-the-art for few-shot style transfer with a new method modeling the stylistic difference between paraphrases. When compared to prior work using automatic and human evaluations, our model achieves 2-3x better performance and output diversity in formality transfer and code-mixing addition across five Indian languages. Moreover, our method is better able to control the amount of style transfer using an input scalar knob. We report promising qualitative results for several attribute transfer directions, including sentiment transfer, text simplification, gender neutralization and text anonymization, all without retraining the model. Finally we found model evaluation to be difficult due to the lack of evaluation datasets and metrics for Indian languages. To facilitate further research in formality transfer for Indic languages, we crowdsource annotations for 4000 sentence pairs in four languages, and use this dataset to design our automatic evaluation suite.

 
