Minimax and Neyman-Pearson Meta-Learning for Outlier Languages

Abstract

Model-agnostic meta-learning (MAML) has recently been put forth as a strategy to learn resource-poor languages in a sample-efficient fashion. Nevertheless, the properties of these languages are often not well represented by those available during training. Hence, we argue that the i.i.d. assumption ingrained in MAML makes it ill-suited for cross-lingual NLP. In fact, under a decision-theoretic framework, MAML can be interpreted as minimising the expected risk across training languages (with a uniform prior), which is known as the Bayes criterion. To increase its robustness to outlier languages, we create two variants of MAML based on alternative criteria: Minimax MAML reduces the maximum risk across languages, while Neyman-Pearson MAML constrains the risk in each language to a maximum threshold. Both criteria constitute fully differentiable two-player games. In light of this, we propose a new adaptive optimiser solving for a local approximation to their Nash equilibrium. We evaluate both model variants on two popular NLP tasks, part-of-speech tagging and question answering. We report gains for their average and minimum performance across low-resource languages in zero- and few-shot settings, compared to joint multi-source transfer and vanilla MAML.
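
To make the three criteria concrete, the following is a minimal sketch that evaluates each of them on a fixed vector of per-language risks. It is not the paper's training procedure or optimiser: the function names, the risk values, and the threshold are hypothetical, chosen only to show how an outlier language affects each objective.

```python
from typing import List, Sequence


def bayes_risk(risks: Sequence[float]) -> float:
    """Bayes criterion: expected risk under a uniform prior over languages."""
    return sum(risks) / len(risks)


def minimax_risk(risks: Sequence[float]) -> float:
    """Minimax criterion: the risk of the worst-off language."""
    return max(risks)


def neyman_pearson_violations(risks: Sequence[float], threshold: float) -> List[float]:
    """Neyman-Pearson criterion: amount by which each language exceeds the threshold.

    A value of 0.0 means the constraint for that language is satisfied.
    """
    return [max(0.0, r - threshold) for r in risks]


# Hypothetical per-language risks; the last language is an outlier.
risks = [0.2, 0.3, 0.9]
threshold = 0.5  # hypothetical maximum tolerated risk per language

print("Bayes risk:  ", bayes_risk(risks))
print("Minimax risk:", minimax_risk(risks))
print("NP violations:", neyman_pearson_violations(risks, threshold))
```

In this toy setting the average risk looks moderate, whereas the minimax objective and the threshold violation are driven entirely by the outlier language, which is the behaviour the two variants are meant to target.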
