GlobalTrait: Personality Alignment of Multilingual Word Embeddings

  • 2018-11-01 05:26:27
  • Farhad Bin Siddique, Dario Bertero, Pascale Fung

Abstract

We propose a multilingual model to recognize Big Five Personality traits from text data in four different languages: English, Spanish, Dutch and Italian. Our analysis shows that words having a similar semantic meaning in different languages do not necessarily correspond to the same personality traits. Therefore, we propose a personality alignment method, GlobalTrait, which has a mapping for each trait from the source language to the target language (English), such that words that correlate positively to each trait are close together in the multilingual vector space. Using these aligned embeddings for training, we can transfer personality related training features from high-resource languages such as English to other low-resource languages, and get better multilingual results, when compared to using simple monolingual and unaligned multilingual embeddings. We achieve an average F-score increase (across all three languages except English) from 65 to 73.4 (+8.4), when comparing our monolingual model to multilingual using CNN with personality aligned embeddings. We also show relatively good performance in the regression tasks, and better classification results when evaluating our model on a separate Chinese dataset.
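
The idea of one mapping per trait can be illustrated with a minimal sketch. The abstract does not state the form of the mapping or how it is trained, so everything below is an assumption made for illustration: the mapping is taken to be a linear matrix fitted by ordinary least squares, the random vectors stand in for embeddings of trait-correlated word pairs, and the names `fit_trait_mapping`, `TRAITS` and `mappings` are hypothetical.

```python
import numpy as np

# The Big Five traits; the sketch fits one mapping per trait.
TRAITS = ["extraversion", "agreeableness", "conscientiousness",
          "neuroticism", "openness"]


def fit_trait_mapping(source_vecs, target_vecs):
    """Fit a matrix W so that source_vecs @ W approximates target_vecs.

    ASSUMPTION: a linear map fitted by least squares. This is a stand-in,
    not the training objective used in the paper.
    """
    W, *_ = np.linalg.lstsq(source_vecs, target_vecs, rcond=None)
    return W


rng = np.random.default_rng(0)
dim = 4        # toy embedding size (hypothetical)
n_pairs = 20   # toy number of word pairs per trait (hypothetical)

mappings = {}
errors = {}
for trait in TRAITS:
    # Toy data: source-language vectors, and target-language (English)
    # vectors produced by a hidden linear map, so a perfect fit exists.
    hidden_map = rng.normal(size=(dim, dim))
    source = rng.normal(size=(n_pairs, dim))
    target = source @ hidden_map

    W = fit_trait_mapping(source, target)
    mappings[trait] = W
    # Largest absolute difference between mapped source and target vectors.
    errors[trait] = float(np.max(np.abs(source @ W - target)))
```

In this toy setting each fitted matrix moves the source-language vectors onto their English counterparts; with real embeddings the fit would only be approximate, and the choice of word pairs for each trait would matter far more than the solver.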

 
