A Continuous Space Neural Language Model for Bengali Language

  • 2020-01-11 14:50:57
  • Hemayet Ahmed Chowdhury, Md. Azizul Haque Imon, Anisur Rahman, Aisha Khatun, Md. Saiful Islam

Abstract

Language models are generally employed to estimate the probability distribution of various linguistic units, making them one of the fundamental parts of natural language processing. Applications of language models include a wide spectrum of tasks such as text summarization, translation and classification. For a low-resource language like Bengali, research in this area has so far been narrow, with only some traditional count-based models having been proposed. This paper attempts to address the issue and proposes a continuous-space neural language model, or more specifically an ASGD weight-dropped LSTM language model, along with techniques to train it efficiently for the Bengali language. The performance analysis against some existing count-based models, presented in this paper, also shows that the proposed architecture outperforms its counterparts, achieving an inference perplexity as low as 51.2 on the held-out data set for Bengali.
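
For readers unfamiliar with the metric, the following is a minimal sketch of how perplexity is conventionally computed from the probabilities a model assigns to each held-out token: the exponential of the average negative log-likelihood. The function name `perplexity` and the toy probabilities are illustrative assumptions and are not taken from the paper; the authors' evaluation code may differ in tokenization and averaging.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the model probability of each token.

    Hypothetical helper for illustration only, not the paper's code.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token (natural logarithm).
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Perplexity is the exponential of that average.
    return math.exp(avg_nll)


# Toy input (assumed values): a model that assigns probability 1/4 to every
# token is exactly as uncertain as a uniform choice among 4 words.
uniform_ppl = perplexity([0.25] * 8)
print(round(uniform_ppl, 6))
```

Read this way, a perplexity of 51.2 means the model is, on average, about as uncertain as if it were choosing uniformly among roughly 51 candidate tokens at each position; lower values indicate a better fit to the held-out text.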

 
