Convolutional Attention Networks for Multimodal Emotion Recognition from Speech and Text Data

  • 2018-05-17 05:51:00
  • Chan Woo Lee, Kyu Ye Song, Jihoon Jeong, Woo Yong Choi

Abstract

Emotion recognition has become a popular topic of interest, especially in the field of human computer interaction. Previous works involve unimodal analysis of emotion, while recent efforts focus on multi-modal emotion recognition from vision and speech. In this paper, we propose a new method of learning about the hidden representations between just speech and text data using convolutional attention networks. Compared to the shallow model which employs simple concatenation of feature vectors, the proposed attention model performs much better in classifying emotion from speech and text data contained in the CMU-MOSEI dataset.
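
To make the contrast in the abstract concrete, here is a minimal sketch of the two fusion styles it mentions: plain concatenation of pooled feature vectors versus an attention step in which text positions attend over speech frames. This is not the paper's architecture, which the abstract does not specify. Scaled dot-product attention is swapped in as a generic stand-in, and the function names, feature dimension (16), sequence lengths, and random inputs are all assumptions chosen for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def concat_fusion(speech, text):
    # "Shallow" baseline: mean-pool each modality over time,
    # then concatenate the two pooled vectors.
    return np.concatenate([speech.mean(axis=0), text.mean(axis=0)])

def attention_fusion(speech, text):
    # Hypothetical cross-modal attention (assumption, not the paper's model):
    # each text position attends over all speech frames.
    d = speech.shape[1]
    scores = text @ speech.T / np.sqrt(d)   # (T_text, T_speech)
    weights = softmax(scores, axis=1)       # each row sums to 1
    attended = weights @ speech             # (T_text, d)
    fused = np.concatenate([text, attended], axis=1)  # (T_text, 2d)
    return fused.mean(axis=0), weights

# Random stand-ins for per-frame speech features and per-token text features.
rng = np.random.default_rng(0)
speech = rng.normal(size=(50, 16))  # 50 speech frames, 16-dim features
text = rng.normal(size=(12, 16))    # 12 text tokens, 16-dim features

shallow = concat_fusion(speech, text)
fused, weights = attention_fusion(speech, text)
```

Both functions return a vector of the same size, so either could feed the same downstream emotion classifier. The difference is that the attention variant lets each text token weight the speech frames it aligns with before pooling, whereas the baseline pools each modality independently.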

 
