A Case Study on the Independence of Speech Emotion Recognition in Bangla and English Languages using Language-Independent Prosodic Features

  • 2022-05-14 01:05:28
  • Fardin Saad, Hasan Mahmud, Mohammad Ridwan Kabir, Md. Alamin Shaheen, Paresha Farastu, Md. Kamrul Hasan

Abstract

A language-agnostic approach to recognizing emotions from speech remains an incomplete and challenging task. In this paper, we performed a step-by-step comparative analysis of Speech Emotion Recognition (SER) using Bangla and English languages to assess whether distinguishing emotions from speech is independent of language. Six emotions were categorized for this study: happy, angry, neutral, sad, disgust, and fear. We employed three Emotional Speech Sets (ESS), of which the first two were developed by native Bengali speakers in Bangla and English languages separately. The third was a subset of the Toronto Emotional Speech Set (TESS), which was developed by native English speakers from Canada. We carefully selected language-independent prosodic features, adopted a Support Vector Machine (SVM) model, and conducted three experiments to test our proposition. In the first experiment, we measured the performance of the three speech sets individually, followed by the second experiment, where different ESS pairs were integrated to analyze the impact on SER. Finally, in the third experiment, we measured the recognition rate by training and testing the model with different speech sets. Although this study reveals that SER in Bangla and English languages is mostly language-independent, some disparities were observed while recognizing emotional states like disgust and fear in these two languages. Moreover, our investigations revealed that non-native speakers convey emotions through speech much as they do when expressing themselves in their native tongue.

 
