Counterfactual Learning for Machine Translation: Degeneracies and Solutions

Abstract

Counterfactual learning is a natural scenario for improving web-based machine translation services by learning offline from feedback logged during user interactions. To avoid the risk of showing inferior translations to users, such scenarios mostly rely on exploration-free, deterministic logging policies. We analyze possible degeneracies of inverse and reweighted propensity scoring estimators, in both stochastic and deterministic settings, and relate them to recently proposed techniques for counterfactual learning under deterministic logging.
