Multichannel CNN with Attention for Text Classification

  • 2020-06-29 16:37:51
  • Zhenyu Liu, Haiwei Huang, Chaohong Lu, Shengfei Lyu


In recent years, approaches based on neural networks have shown remarkable potential for sentence modeling. There are two main neural network structures: the recurrent neural network (RNN) and the convolutional neural network (CNN). An RNN can capture long-term dependencies and store the semantics of the preceding information in a fixed-size vector. However, an RNN is a biased model, and its ability to extract global semantics is restricted by the fixed-size vector. A CNN, by contrast, captures n-gram features of text with convolutional filters, but the width of those filters restricts its performance. To combine the strengths of the two kinds of networks and alleviate their shortcomings, this paper proposes the Attention-based Multichannel Convolutional Neural Network (AMCNN) for text classification. AMCNN uses a bi-directional long short-term memory network to encode the history and future information of each word into a high-dimensional representation, so that the information from both the front and the back of the sentence can be fully expressed. Scalar attention and vectorial attention are then applied to obtain multichannel representations: scalar attention calculates word-level importance, and vectorial attention calculates feature-level importance. In the classification task, instead of calculating weighted sums, AMCNN uses a CNN structure to capture word relations on the representations generated by the scalar and vectorial attention mechanisms, which lets it effectively extract the n-gram features of the text. Experimental results on benchmark datasets demonstrate that AMCNN achieves better performance than state-of-the-art methods. In addition, the visualization results verify the semantic richness of the multichannel representations.
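The difference between the two attention types, and the step of convolving over attended representations rather than summing them, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not give the scoring functions, so the linear scores below are an assumption, as are all names (`scalar_attention`, `vectorial_attention`, `conv_max_pool`), the random matrix `H` standing in for the bi-directional LSTM outputs, and the choice of one channel per attention type.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scalar_attention(H, w):
    """Word-level importance: one weight per word.

    Assumed scoring: score_t = w . h_t, normalized over positions.
    Returns the reweighted (not summed) states and the weights.
    """
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in H]
    alphas = softmax(scores)
    channel = [[a * hi for hi in h] for a, h in zip(alphas, H)]
    return channel, alphas


def vectorial_attention(H, W):
    """Feature-level importance: one weight per word and per feature.

    Assumed scoring: scores_t = W h_t, normalized over positions
    separately for each feature dimension.
    """
    T, d = len(H), len(W)
    scores = [[sum(wj * hj for wj, hj in zip(row, h)) for row in W] for h in H]
    cols = [softmax([scores[t][k] for t in range(T)]) for k in range(d)]
    weights = [[cols[k][t] for k in range(d)] for t in range(T)]
    channel = [[weights[t][k] * H[t][k] for k in range(d)] for t in range(T)]
    return channel, weights


def conv_max_pool(channels, filt):
    """One n-gram filter over stacked channels, ReLU, max over time.

    filt is indexed as filt[channel][offset][feature].
    """
    width = len(filt[0])
    T = len(channels[0])
    best = 0.0  # ReLU floor: the max of ReLU outputs is never below zero
    for start in range(T - width + 1):
        s = 0.0
        for c, chan in enumerate(channels):
            for j in range(width):
                s += sum(f * x for f, x in zip(filt[c][j], chan[start + j]))
        best = max(best, s)
    return best


# Hypothetical stand-in for BiLSTM outputs: T words, D features each.
random.seed(0)
T, D, WIDTH = 5, 4, 3
H = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(T)]
w = [random.uniform(-1, 1) for _ in range(D)]
W = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(D)]

scalar_channel, alphas = scalar_attention(H, w)
vector_channel, weights = vectorial_attention(H, W)

# A single trigram filter spanning both channels.
filt = [[[random.uniform(-1, 1) for _ in range(D)] for _ in range(WIDTH)]
        for _ in range(2)]
feature = conv_max_pool([scalar_channel, vector_channel], filt)
print(feature)
```

Each channel keeps the full sequence length, so the convolution still sees word order. A weighted sum would collapse the sequence to a single vector and leave no n-grams to extract.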


