A Generalized Language Model in Tensor Space

Abstract

In the literature, tensors have been used effectively to capture context information in language models. However, existing methods usually adopt relatively low-order tensors, which have limited expressive power in modeling language. Developing a higher-order tensor representation is challenging, both in deriving an effective solution and in showing its generality. In this paper, we propose a language model named Tensor Space Language Model (TSLM), which utilizes tensor networks and tensor decomposition. In TSLM, we build a high-dimensional semantic space constructed by the tensor product of word vectors. Theoretically, we prove that this tensor representation is a generalization of the n-gram language model. We further show that this high-order tensor representation can be decomposed into a recursive calculation of conditional probability for language modeling. Experimental results on the Penn Tree Bank (PTB) dataset and the WikiText benchmark demonstrate the effectiveness of TSLM.
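As a toy illustration of why a tensor-product representation can subsume an n-gram model, the sketch below assumes one-hot word vectors, which is a simplifying assumption made here and not necessarily the setting used in TSLM. Under that assumption, the tensor product of the word vectors of a sequence is a basis tensor with a single nonzero entry, and its inner product with a table of joint probabilities reads off the n-gram probability of that sequence. The vocabulary, the probability values, and all function names are hypothetical.

```python
from itertools import product
from math import prod


def one_hot(index, size):
    """Return a one-hot list of length `size` with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def tensor_product(vectors):
    """Outer product of several vectors, stored as {index_tuple: value}."""
    ranges = [range(len(v)) for v in vectors]
    return {
        idx: prod(v[i] for v, i in zip(vectors, idx))
        for idx in product(*ranges)
    }


def inner(a, b):
    """Inner product of two tensors stored as index-tuple dictionaries."""
    return sum(value * b.get(idx, 0.0) for idx, value in a.items())


# Hypothetical toy vocabulary and bigram joint-probability table (assumed values).
vocab = ["the", "cat", "sat"]
bigram_prob = {
    (0, 1): 0.5,  # p("the", "cat")
    (1, 2): 0.3,  # p("cat", "sat")
    (2, 0): 0.2,  # p("sat", "the")
}


def sentence_score(words):
    """Contract the product tensor of a word sequence with the probability table."""
    ids = [vocab.index(w) for w in words]
    basis = tensor_product([one_hot(i, len(vocab)) for i in ids])
    return inner(basis, bigram_prob)


score = sentence_score(["the", "cat"])
```

Replacing the one-hot vectors with dense word vectors turns the single selected entry into a weighted combination of entries, which is the sense in which a tensor-space model can be broader than a lookup table of n-gram probabilities.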
