Despite strong performance on a variety of tasks, neural sequence modelstrained with maximum likelihood have been shown to exhibit issues such aslength bias and degenerate repetition. We study the related issue of receivinginfinite-length sequences from a recurrent language model when using commondecoding algorithms. To analyze this issue, we first define inconsistency of adecoding algorithm, meaning that the algorithm can yield an infinite-lengthsequence that has zero probability under the model. We prove that commonly usedincomplete decoding algorithms - greedy search, beam search, top-k sampling,and nucleus sampling - are inconsistent, despite the fact that recurrentlanguage models are trained to produce sequences of finite length. Based onthese insights, we propose two remedies which address inconsistency: consistentvariants of top-k and nucleus sampling, and a self-terminating recurrentlanguage model. Empirical results show that inconsistency occurs in practice,and that the proposed methods prevent inconsistency.