Cross-lingual Transfer for Automatic Question Generation by Learning Interrogative Structures in Target Languages

Abstract

Automatic question generation (QG) serves a wide range of purposes, such as augmenting question-answering (QA) corpora, enhancing chatbot systems, and developing educational materials. Despite its importance, most existing datasets predominantly focus on English, resulting in a considerable gap in data availability for other languages. Cross-lingual transfer for QG (XLT-QG) addresses this limitation by allowing models trained on high-resource language datasets to generate questions in low-resource languages. In this paper, we propose a simple and efficient XLT-QG method that operates without the need for monolingual, parallel, or labeled data in the target language, utilizing a small language model. Our model, trained solely on English QA datasets, learns interrogative structures from a limited set of question exemplars, which are then applied to generate questions in the target language. Experimental results show that our method outperforms several XLT-QG baselines and achieves performance comparable to GPT-3.5-turbo across different languages. Additionally, the synthetic data generated by our model proves beneficial for training multilingual QA models. With significantly fewer parameters than large language models and without requiring additional training for target languages, our approach offers an effective solution for QG and QA tasks across various languages.
