Investigation on N-gram Approximated RNNLMs for Recognition of Morphologically Rich Speech

  • 2019-07-15 10:07:07
  • Balázs Tarján, György Szaszák, Tibor Fegyó, Péter Mihajlik

Abstract

Recognition of Hungarian conversational telephone speech is challenging due to the informal style and morphological richness of the language. A Recurrent Neural Network Language Model (RNNLM) can remedy the high perplexity of the task; however, two-pass decoding introduces a considerable processing delay. To eliminate this delay, we investigate approaches that reduce the complexity of the RNNLM while preserving its accuracy. We compare the performance of conventional back-off n-gram language models (BNLM), BNLM approximations of RNNLMs (RNN-BNLM) and RNN n-grams in terms of perplexity and word error rate (WER). Morphological richness is often addressed by using statistically derived subwords (morphs) in the language models, hence our investigations are extended to morph-based models as well. We found that RNN-BNLMs recover 40% of the RNNLM perplexity reduction, which is roughly equal to the performance of an RNN 4-gram model. Combining morph-based modeling and approximation of the RNNLM, we were able to achieve an 8% relative WER reduction and preserve real-time operation of our conversational telephone speech recognition system.
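
The two headline figures above are ratios, and a minimal sketch may help show how such figures are typically computed. The function names and all numeric inputs below are hypothetical, chosen only to reproduce the 40% and 8% values; they are not results from the paper, and the abstract does not state the exact formulas the authors used.

```python
def recovered_fraction(ppl_bnlm: float, ppl_rnnlm: float, ppl_approx: float) -> float:
    """Share of the BNLM-to-RNNLM perplexity reduction kept by an approximation.

    Assumed definition (not confirmed by the abstract):
    (PPL_BNLM - PPL_approx) / (PPL_BNLM - PPL_RNNLM).
    """
    full_gain = ppl_bnlm - ppl_rnnlm
    if full_gain <= 0:
        raise ValueError("RNNLM perplexity must be lower than the BNLM baseline")
    return (ppl_bnlm - ppl_approx) / full_gain


def relative_wer_reduction(wer_baseline: float, wer_new: float) -> float:
    """Relative WER reduction: (WER_baseline - WER_new) / WER_baseline."""
    if wer_baseline <= 0:
        raise ValueError("baseline WER must be positive")
    return (wer_baseline - wer_new) / wer_baseline


# Hypothetical inputs, for illustration only.
recovered = recovered_fraction(ppl_bnlm=100.0, ppl_rnnlm=75.0, ppl_approx=90.0)
wer_gain = relative_wer_reduction(wer_baseline=25.0, wer_new=23.0)
print(f"recovered share of perplexity reduction: {recovered:.0%}")
print(f"relative WER reduction: {wer_gain:.0%}")
```

Under this reading, "recovering 40%" means the approximated model closes 40% of the perplexity gap between the baseline BNLM and the full RNNLM, while the 8% figure is relative to the baseline WER rather than an absolute difference.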

 
