Differentiable lower bound for expected BLEU score

Abstract

In natural language processing tasks, model performance is often measured with a non-differentiable metric such as the BLEU score. To use efficient gradient-based optimization methods, a common workaround is to optimize a surrogate loss function instead. This approach is effective if optimizing the surrogate loss also improves the target metric. The corresponding problem is referred to as loss-evaluation mismatch. In the present work we propose a method for calculating a differentiable lower bound of the expected BLEU score that does not involve a computationally expensive sampling procedure, such as the one required when using the REINFORCE rule from the reinforcement learning (RL) framework.
