Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction

Abstract

A major obstacle in reinforcement learning-based sentence generation is the large action space whose size is equal to the vocabulary size of the target-side language. To improve the efficiency of reinforcement learning, we present a novel approach for reducing the action space based on dynamic vocabulary prediction. Our method first predicts a fixed-size small vocabulary for each input to generate its target sentence. The input-specific vocabularies are then used at supervised and reinforcement learning steps, and also at test time. In our experiments on six machine translation and two image captioning datasets, our method achieves faster reinforcement learning ($\sim$2.7x faster) with less GPU memory ($\sim$2.3x less) than the full-vocabulary counterpart. The reinforcement learning with our method consistently leads to significant improvement of BLEU scores, and the scores are equal to or better than those of baselines using the full vocabularies, with faster decoding time ($\sim$3x faster) on CPUs.
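
The idea of shrinking the action space can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the vocabulary is predicted, so the per-word `relevance_scores`, the helper names, and the toy sizes below are all assumptions made for illustration. The sketch selects a fixed-size set of K word ids for one input and then normalizes and samples over those K entries only, instead of over the full vocabulary.

```python
import math
import random


def predict_vocabulary(relevance_scores, k):
    """Hypothetical predictor: keep the ids of the k highest-scoring target words."""
    ranked = sorted(range(len(relevance_scores)),
                    key=lambda i: relevance_scores[i], reverse=True)
    return sorted(ranked[:k])


def restricted_softmax(logits, vocab_ids):
    """Softmax computed only over the logits of the selected word ids."""
    selected = [logits[i] for i in vocab_ids]
    peak = max(selected)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in selected]
    total = sum(exps)
    return {i: e / total for i, e in zip(vocab_ids, exps)}


def sample_action(probs, rng):
    """Sample one word id (an RL action) from the restricted distribution."""
    ids = list(probs)
    weights = [probs[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


rng = random.Random(0)
full_vocab_size = 10  # toy size; real target vocabularies are far larger

# Assumed input-specific scores, one per target word (made-up values).
relevance_scores = [0.1, 0.9, 0.05, 0.7, 0.2, 0.8, 0.0, 0.3, 0.6, 0.4]
small_vocab = predict_vocabulary(relevance_scores, k=4)

# Stand-in decoder logits for one time step.
logits = [rng.gauss(0, 1) for _ in range(full_vocab_size)]
probs = restricted_softmax(logits, small_vocab)
action = sample_action(probs, rng)
```

In this toy setting each decoding step normalizes over 4 candidate words rather than 10, which is the kind of saving in computation and memory that a smaller action space is meant to provide.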
